Inspiration
I built this since my friends would take hours trying to decide with image was social media worthy and the best caption to go along with it. Inspired by their struggles, I decided to make an app that would generate captions.
What it does
The app generates three captions for every image based on a category: funny, vibes, and inspiration.
How we built it
I built it by training 10 different AI models all branching off pretrained image to text models then training them with my own data set. Then after getting the model downloaded, I turned to using an API to call GPT 3.5 to process the details that were given by the image-to-text model and generate a caption.
Challenges we ran into
I ran into huge challenges training the model and fixing the parameters as some versions of one model were not compatible with the other and some arguments could not be passed to older versions of such modules.
Accomplishments that we're proud of
I am proud of being able to pull through and train a model that was barely functional to a good image-to-text model. I am also proud to be able to complete the training in a short time span with limited data.
What we learned
I learned how to work with certain AI models and how parameters can affect the batches of data that is used to train the model, and how I can better improve the results of my work.
What's next for AI Image Caption Generator
I aspire to do longer training sessions with my model as the current one is only pulling data from 500 our of 8000+ training images. I will be doing that in my downtime to help improve results and better suit my users needs.
Log in or sign up for Devpost to join the conversation.