Dolly-v2-3b Tuning

Inspiration

I wanted to create an AI capable of engaging in contextual dialogue with users. Imagine a video or audio search engine in which one stage of the software pipeline takes audio that has been transcribed to text and answers user queries within the context of that story.

What it does

This AI answers user questions based on a passage of text supplied to it as context.
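As a rough illustration of answering from supplied context, the sketch below assembles a prompt from a user question and a context passage. The template is modeled on the Dolly instruction format, but its exact wording, the function name build_prompt, and the sample transcript are assumptions made for this example, not details taken from the project.

```python
# Template modeled on the Dolly instruction format; the exact wording
# below is an assumption for illustration, not the project's actual prompt.
INTRO = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request."
)


def build_prompt(question: str, context: str) -> str:
    """Return a prompt asking the model to answer from the given context."""
    return (
        f"{INTRO}\n\n"
        f"### Instruction:\n{question.strip()}\n\n"
        f"Input:\n{context.strip()}\n\n"
        f"### Response:\n"
    )


if __name__ == "__main__":
    # Hypothetical transcript snippet and user question.
    transcript = "The fox crossed the river at dawn and reached the village by noon."
    print(build_prompt("When did the fox reach the village?", transcript))
```

The resulting string would be passed to the fine-tuned model for generation; the text the model produces after the final "### Response:" line is the answer.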

How I built it

To build this AI, I fine-tuned the dolly-v2-3b model on the databricks-dolly-15k dataset in Google Colab, using a Tesla T4 GPU instance. A single fine-tuning run took approximately one hour, but the project as a whole took around seven days because of multiple attempts and debugging.
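One preparation step that fine-tuning on an instruction dataset typically needs is turning each record into a single training string. The sketch below assumes records shaped like those in databricks-dolly-15k, with instruction, context, and response fields. The section markers, the END_KEY value, and the sample records are hypothetical, and this is not necessarily how the project formatted its data.

```python
from typing import Dict

END_KEY = "### End"  # assumed end-of-response marker


def to_training_text(record: Dict[str, str]) -> str:
    """Join one instruction record into a single training string."""
    parts = [f"### Instruction:\n{record['instruction'].strip()}"]
    context = record.get("context", "").strip()
    if context:  # records without context skip the input section
        parts.append(f"Input:\n{context}")
    parts.append(f"### Response:\n{record['response'].strip()}")
    parts.append(END_KEY)
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Hypothetical records, not actual rows from the dataset.
    samples = [
        {
            "instruction": "Who reached the village?",
            "context": "The fox reached the village by noon.",
            "response": "The fox.",
        },
        {"instruction": "Name a primary color.", "context": "", "response": "Red."},
    ]
    for sample in samples:
        print(to_training_text(sample))
        print("-" * 20)
```

Keeping the context section optional lets the same function handle both context-grounded records and plain instruction records.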

Challenges I ran into

The major challenge I encountered was the model's input limit of 2048 tokens. This constraint restricted how much text the model could process at once and required me to fine-tune the model in multiple iterations instead of completing it in a single run.
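One common way to work within a fixed token limit is to split long text into pieces that each fit the budget. The sketch below is a minimal illustration of that idea, not the approach used in this project. It counts whitespace-separated words as a stand-in for real tokens, which undercounts compared with a subword tokenizer. The function name split_to_budget and the reserved parameter are hypothetical.

```python
from typing import Callable, List


def split_to_budget(
    text: str,
    max_tokens: int = 2048,
    reserved: int = 256,
    tokenize: Callable[[str], List[str]] = str.split,
) -> List[str]:
    """Split text into chunks of at most (max_tokens - reserved) tokens.

    `reserved` leaves room for the question and the generated answer.
    `tokenize` defaults to whitespace splitting, a rough stand-in for
    the model's real tokenizer.
    """
    budget = max_tokens - reserved
    if budget <= 0:
        raise ValueError("reserved must be smaller than max_tokens")
    tokens = tokenize(text)
    return [
        " ".join(tokens[start:start + budget])
        for start in range(0, len(tokens), budget)
    ]


if __name__ == "__main__":
    long_text = " ".join(f"word{i}" for i in range(5000))
    chunks = split_to_budget(long_text)
    print(len(chunks), "chunks")
```

With a real tokenizer, the same loop would slice token IDs and decode each slice back to text, so the chunk boundaries match what the model actually sees.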

Built With

google-cloud
llm
transformers

Updates

Alex Pam started this project — Jun 16, 2023 03:51 AM EDT
