Inspiration
This project was born during the MLH (Major League Hacking) GHW: DATA event, where participants from diverse backgrounds came together to learn, collaborate, and build innovative projects. The event provided a stimulating environment that encouraged exploration and experimentation with technologies like computer vision and natural language processing.
What it does
Our AI-Powered Image Captioning project uses a deep learning model that takes an image as input and generates a descriptive caption for its content.
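At a high level, the decoder builds the caption one word at a time, repeatedly asking the model which word should come next. A minimal sketch of that loop is below; `next_word_scores` is a hypothetical stand-in for the trained CNN+LSTM model (here it just returns a canned caption), and the vocabulary is invented for illustration.

```python
# Hypothetical stub for the trained CNN+LSTM model: given image
# features and the caption so far, score every word in the vocabulary.
VOCAB = ["<start>", "<end>", "a", "bowl", "of", "pistachios"]

def next_word_scores(image_features, caption_so_far):
    # Stub only: a real model would run the LSTM decoder here.
    canned = ["a", "bowl", "of", "pistachios", "<end>"]
    target = canned[len(caption_so_far) - 1]
    return {w: (1.0 if w == target else 0.0) for w in VOCAB}

def generate_caption(image_features, max_len=10):
    caption = ["<start>"]
    for _ in range(max_len):
        scores = next_word_scores(image_features, caption)
        word = max(scores, key=scores.get)  # greedy decoding
        if word == "<end>":
            break
        caption.append(word)
    return " ".join(caption[1:])  # drop the <start> token
```

With the stub above, `generate_caption(None)` returns `"a bowl of pistachios"`; swapping in the real model and a real vocabulary keeps the same loop structure.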
How we built it
We obtained the Pistachio Image Dataset, a large collection of images with corresponding captions, which gave us a suitable foundation for training our image captioning model.
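Before training, each image has to be brought to a fixed size and its pixel values scaled for the CNN. The sketch below shows one way to do this with plain NumPy (nearest-neighbour resize plus scaling to [0, 1]); a real pipeline would typically use a library resizer such as PIL or OpenCV, and the 224×224 target size is an assumption, not something fixed by our write-up.

```python
import numpy as np

def preprocess(image, size=224):
    """Nearest-neighbour resize to size x size, then scale pixels to [0, 1].

    Rough sketch of CNN input preprocessing; real pipelines may also
    subtract per-channel means depending on the backbone used.
    """
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    resized = image[rows][:, cols]
    return resized.astype(np.float32) / 255.0

# Example with a fake 300x400 RGB image of random bytes
fake = np.random.randint(0, 256, (300, 400, 3), dtype=np.uint8)
out = preprocess(fake)  # shape (224, 224, 3), values in [0, 1]
```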
Challenges we ran into
Finding good hyperparameters for the CNN and LSTM models was time-consuming, as we had to balance model complexity against performance.
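The search itself can be as simple as a grid over candidate settings, training one model per combination and keeping the best validation score. The sketch below shows that loop; the search space and the `validation_score` stub are hypothetical (in practice that function would train the CNN+LSTM and return a held-out metric).

```python
from itertools import product

# Hypothetical search space for illustration.
search_space = {
    "embedding_dim": [128, 256],
    "lstm_units": [256, 512],
    "dropout": [0.3, 0.5],
}

def validation_score(config):
    # Stub: stands in for "train the model, return validation score".
    # Here it simply favours larger models with dropout near 0.5.
    return (config["embedding_dim"] + config["lstm_units"]
            - 1000 * abs(config["dropout"] - 0.5))

def grid_search(space):
    keys = list(space)
    best_cfg, best_score = None, float("-inf")
    for values in product(*(space[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = validation_score(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg

best = grid_search(search_space)
```

Grid search is exhaustive, which is why tuning took so long for us; random search or early stopping on poor configurations are common ways to cut the cost.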
What we learned
🏫Image Processing: We gained insights into image preprocessing techniques and the importance of resizing and normalization for CNN input.
🏫Sequence Generation: Implementing the LSTM model for sequence generation helped us understand the challenges of generating coherent and contextually relevant captions.
🏫Evaluation Metrics: Understanding and working with evaluation metrics like BLEU, METEOR, and CIDEr enhanced our ability to assess the performance of NLP models.
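To make the metrics concrete: the core building block of BLEU is clipped n-gram precision, which counts how many of the candidate's n-grams appear in the reference (clipped so a repeated word can't be rewarded more times than it occurs in the reference). Below is a simplified unigram-only version written from scratch; real BLEU combines precisions for n = 1..4 with a brevity penalty, and in practice you would use an existing implementation such as NLTK's `sentence_bleu`.

```python
from collections import Counter

def unigram_precision(candidate, reference):
    """Clipped unigram precision, the building block of BLEU (simplified)."""
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(reference.split())
    # Clip each candidate word's count by its count in the reference.
    clipped = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    return clipped / sum(cand_counts.values())

# Every candidate word appears in the reference -> precision 1.0
unigram_precision("a bowl of pistachios", "a bowl of fresh pistachios")
# Clipping: "a" occurs once in the reference, so only 1 of 3 counts -> 1/3
unigram_precision("a a a", "a b")
```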