Inspiration:

We wanted to make images more accessible and interactive. Automatic alt-text on social media and apps like Google Lens inspired us to build our own lightweight Image Caption Generator where anyone can upload an image and instantly get an AI-generated caption.

What it does:

The app lets users drag-and-drop or select an image, sends it to the backend, runs it through a pre-trained AI model, and returns a short caption describing the content. Captions can then be copied or shared.

How we built it:

Frontend: HTML/CSS/JavaScript with drag-and-drop upload and instant preview.

Backend: Python Flask API to receive images and return captions.

AI Model: TensorFlow/Keras InceptionV3 (or, optionally, the Google Cloud Vision API) classifies the image, and we generate a descriptive caption from the top prediction.
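To illustrate how the backend and model pieces could fit together, here is a minimal sketch rather than the project's actual code. It assumes ImageNet weights for InceptionV3, a hypothetical `/caption` route, an upload field named `image`, and a made-up caption template with a 0.5 confidence cutoff; none of those names or values come from the project.

```python
import io


def caption_from_predictions(predictions):
    """Build a short caption from decode_predictions-style output.

    `predictions` is a list of (class_id, label, score) tuples sorted by
    score, which is the per-image format Keras' decode_predictions returns.
    """
    if not predictions:
        return "No caption available."
    _, label, score = predictions[0]
    name = label.replace("_", " ")
    article = "an" if name[0].lower() in "aeiou" else "a"
    # Assumed wording: hedge the caption when the model is unsure.
    if score < 0.5:
        return f"This might be {article} {name}."
    return f"This looks like {article} {name}."


def classify(image_bytes, model):
    """Run InceptionV3 on raw image bytes and return the top-3 predictions."""
    # Imported lazily so the caption helper works without TensorFlow installed.
    import numpy as np
    from PIL import Image
    from tensorflow.keras.applications.inception_v3 import (
        decode_predictions,
        preprocess_input,
    )

    # InceptionV3 expects 299x299 RGB input scaled by preprocess_input.
    image = Image.open(io.BytesIO(image_bytes)).convert("RGB").resize((299, 299))
    batch = preprocess_input(np.asarray(image, dtype="float32")[np.newaxis, ...])
    return decode_predictions(model.predict(batch), top=3)[0]


def create_app():
    """Assumed app layout: one POST route that returns the caption as JSON."""
    from flask import Flask, jsonify, request
    from tensorflow.keras.applications.inception_v3 import InceptionV3

    app = Flask(__name__)
    model = InceptionV3(weights="imagenet")  # loaded once at startup

    @app.route("/caption", methods=["POST"])
    def caption():
        upload = request.files.get("image")  # "image" is an assumed field name
        if upload is None:
            return jsonify(error="no image uploaded"), 400
        predictions = classify(upload.read(), model)
        return jsonify(caption=caption_from_predictions(predictions))

    return app
```

Loading the model once inside `create_app()` keeps per-request work down to preprocessing and a single forward pass. With Flask, TensorFlow, and Pillow installed, `create_app().run()` would serve the route on Flask's development server.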

Challenges we ran into:

Installing large libraries like TensorFlow on a slow connection.

Ensuring the virtual environment used by Flask, pip, and VS Code matched.

Optimizing image preprocessing for real-time caption generation.
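For the environment mismatch mentioned above, one way to check which interpreter a tool is actually using is a small standard-library script like the sketch below. The function names and the idea of running it from each tool are illustrative assumptions, not part of the project.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv and recent virtualenv releases point sys.prefix at the environment,
    # while sys.base_prefix still points at the base installation.
    base = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return hasattr(sys, "real_prefix") or sys.prefix != base


def describe_interpreter():
    """Summarize which Python is running, for comparing Flask, pip, and the editor."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtualenv": in_virtualenv(),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running this from the terminal that starts Flask and from the editor's integrated terminal should print the same `executable` path. `python -m pip --version` also reports which installation pip belongs to.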

Accomplishments that we’re proud of:

Building a full working pipeline from drag-and-drop UI → Flask backend → AI model.

Achieving instant captions without training our own network.

Making the app light enough to run locally or on a small server.

What we learned:

Serving ML models through a web API.

Handling file uploads securely in Flask.

Implementing a clean, responsive drag-and-drop interface in vanilla JavaScript.

Dealing with environment and dependency issues in Python projects.
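As an example of the upload-handling point above, the sketch below shows a few common precautions: an extension allow-list, a check of the file's leading bytes, a request size cap, and a sanitized filename. The allow-list, the 5 MiB limit, and the helper names are assumptions chosen for illustration. `MAX_CONTENT_LENGTH` and `werkzeug.utils.secure_filename` are the standard Flask/Werkzeug facilities for the last two.

```python
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "gif"}  # assumed allow-list
MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # assumed 5 MiB cap

# Leading bytes ("magic numbers") of the allowed image formats.
IMAGE_SIGNATURES = (b"\x89PNG\r\n\x1a\n", b"\xff\xd8\xff", b"GIF87a", b"GIF89a")


def allowed_file(filename):
    """Accept only filenames whose extension is on the allow-list."""
    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS


def looks_like_image(data):
    """Check the file's leading bytes instead of trusting its extension."""
    return data.startswith(IMAGE_SIGNATURES)


def configure_uploads(app):
    """Have Flask reject request bodies larger than the cap (HTTP 413)."""
    app.config["MAX_CONTENT_LENGTH"] = MAX_UPLOAD_BYTES


def stored_name(filename):
    """Strip path separators and other unsafe characters before saving."""
    from werkzeug.utils import secure_filename  # ships with Flask

    return secure_filename(filename)
```

A route could call `allowed_file(upload.filename)` and `looks_like_image(...)` on the first bytes of the upload before passing it to the model, returning a 400 response when either check fails.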

What’s next for Image Caption Generator:

Support for multi-sentence captions using an NLP decoder.

Multi-language caption generation.

Adding OCR so that text inside images is also extracted and described.

Deploying the app publicly with a simple share-button for captions.

Built With

  • Languages: Python, HTML, CSS, JavaScript
  • Frameworks: Flask (backend), TensorFlow/Keras (AI model)
  • Frontend libraries: vanilla JS for drag-and-drop upload & preview
  • APIs / services: (optional) Google Cloud Vision API for alternative caption generation
  • Platforms: runs locally or can be deployed on any cloud (Heroku, Amazon Web Services, GCP)
  • Tools: virtualenv for isolated Python environment, VS Code