Inspiration

The AI-powered search in Google Photos.

What it does

This application allows users to search for images using contextual, natural language descriptions. Instead of relying on filenames or tags, a user can simply describe what they are looking for to retrieve the most relevant images from their collection.

How I built it

I used Gemini, Google's multimodal model on Vertex AI, to generate a vector embedding for each image, and stored these embeddings in a Qdrant vector database. To find matching images, I implemented semantic search: the user's query is converted to an embedding and compared against the stored image embeddings, and the closest matches are returned.
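The comparison step can be sketched in plain Python. This is a minimal illustration, not the project's actual code: it assumes cosine similarity as the distance measure, uses hand-written three-dimensional vectors in place of real Gemini embeddings, and keeps them in a dictionary rather than in Qdrant. The names `cosine_similarity`, `top_k`, and `IMAGE_VECTORS` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, image_vectors, k=3):
    """Return the k (image_id, score) pairs most similar to the query, best first."""
    scored = [
        (image_id, cosine_similarity(query_vector, vector))
        for image_id, vector in image_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy stand-ins for stored image embeddings (assumed values, for illustration only).
IMAGE_VECTORS = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "city.jpg": [0.0, 0.2, 0.9],
}

# Toy stand-in for the embedding of a user's query.
results = top_k([1.0, 0.0, 0.0], IMAGE_VECTORS, k=2)
for image_id, score in results:
    print(f"{image_id}: {score:.3f}")
```

A vector database such as Qdrant performs the same kind of ranking, but with indexing that scales to far larger collections than a linear scan.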

Challenges I ran into

Initially, the model struggled with queries longer than a few words and failed to return relevant results for complex descriptions.

To overcome this, I implemented a two-level search system. As a preparatory step, I used an image captioning API to generate a detailed description for each image, which gives the search richer text to match against. At query time, the first level runs a top-k vector similarity search for broad contextual relevance, and the second level applies a strict keyword match against the captions to refine those results. This dual approach keeps results relevant even for long and specific queries.
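One way to picture the two levels is the sketch below. It rests on several assumptions that the write-up does not confirm: that "strict keyword match" means every non-trivial query word must appear in the caption, that a small stopword list is ignored, and that the keyword filter runs on the top-k candidates only. The vectors and captions are made-up toy data, and `two_level_search`, `keywords`, and `STOPWORDS` are hypothetical names.

```python
import math
import re

# Assumed list of words to ignore during keyword matching.
STOPWORDS = {"a", "an", "the", "on", "in", "at", "of", "with", "and"}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def keywords(text):
    """Lowercased words in the text, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def two_level_search(query_text, query_vector, images, k=3):
    # Level 1: keep the k images whose vectors are closest to the query vector.
    candidates = sorted(
        images, key=lambda img: cosine(query_vector, img["vector"]), reverse=True
    )[:k]
    # Level 2: keep only candidates whose caption contains every query keyword.
    wanted = keywords(query_text)
    return [img["id"] for img in candidates if wanted <= keywords(img["caption"])]


# Toy data: vectors and captions are invented for illustration.
IMAGES = [
    {"id": "dog_beach.jpg", "vector": [0.9, 0.1],
     "caption": "A brown dog running on a sandy beach"},
    {"id": "cat_beach.jpg", "vector": [0.8, 0.2],
     "caption": "A cat sitting on a beach at sunset"},
    {"id": "dog_park.jpg", "vector": [0.2, 0.9],
     "caption": "A dog playing in a park"},
]

matches = two_level_search("dog on the beach", [1.0, 0.0], IMAGES, k=2)
print(matches)
```

In this toy run the vector search shortlists both beach photos, and the keyword filter then drops the one whose caption does not mention a dog. Filtering after the vector search keeps the keyword check cheap, at the cost of missing matches that fall outside the top k.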

Accomplishments

I built a system that returns relevant results for long, descriptive user queries. This overcomes the model's initial limitation with queries longer than a few words and makes the search more robust and easier to use.

My learnings

This project was a great learning experience. I gained hands-on skills in:

  • Generating and utilizing vector embeddings for images.

  • Leveraging the Gemini API to extract both image vectors and text descriptions.

  • Implementing an effective semantic search pipeline using a vector database like Qdrant.

What's next for Contextual Image Album

  • Expanding the functionality to include video search.

  • Integrating user accounts to allow for personal, private image albums.

  • Improving the user interface (UI) for a more seamless experience.

Built With

Python

Gemini on Vertex AI

Qdrant
