Image Similarity Search Application

Streamlit App

About the Project

Inspiration

This project grew out of the need to manage and search large image collections efficiently. As the volume of digital content grows, finding similar images by hand becomes impractical. Automating the search with AI saves time in applications such as digital asset management, content creation, and research.

What We Learned

Building this project provided valuable insights into several key areas:

- AI and Machine Learning: Gained a deeper understanding of how image embeddings work and how to generate them with pre-trained models such as "microsoft/resnet-50".
- Streamlit for Web Applications: Learned to create interactive, user-friendly web applications using Streamlit.
- Qdrant Vector Database: Learned how to store and query high-dimensional vectors efficiently.
- Data Processing: Developed skills in handling and processing image data, including resizing, encoding, and decoding.

How We Built the Project

Setting Up the Environment:
- Cloned the repository and installed the required packages.
- Set up Qdrant for storing image embeddings.
```
git clone https://github.com/surbhi498/ImageSearchEngine.git
cd streamlit
pip install -r requirements.txt
```
Generating Image Embeddings:
- Used the "microsoft/resnet-50" model, loaded through Hugging Face Transformers on top of PyTorch, to generate an embedding for each image.
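The embedding step can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the helper names (`load_model`, `embed_image`, `l2_normalize`) are hypothetical, and L2-normalizing the vector is an assumption that is convenient when the vector store compares embeddings by cosine similarity.

```python
import math
from typing import List, Sequence

MODEL_NAME = "microsoft/resnet-50"  # model named in this write-up


def load_model(name: str = MODEL_NAME):
    """Load the image processor and ResNet backbone (downloads weights on first use)."""
    # Imported lazily so that importing this module stays cheap.
    from transformers import AutoImageProcessor, ResNetModel

    processor = AutoImageProcessor.from_pretrained(name)
    model = ResNetModel.from_pretrained(name)
    model.eval()  # inference mode: batch-norm layers use their stored statistics
    return processor, model


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (assumption: not confirmed by the write-up)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return [0.0 for _ in vector]
    return [x / norm for x in vector]


def embed_image(image, processor, model) -> List[float]:
    """Turn one PIL image into a flat embedding vector."""
    import torch

    inputs = processor(images=image.convert("RGB"), return_tensors="pt")
    with torch.no_grad():  # no gradients are needed for inference
        outputs = model(**inputs)
    # pooler_output is the globally pooled feature map with shape (1, C, 1, 1);
    # for ResNet-50, C is 2048. Flatten it into a plain list of floats.
    return l2_normalize(outputs.pooler_output.flatten().tolist())
```

A typical call would be `processor, model = load_model()` once, followed by `embed_image(Image.open(path), processor, model)` for each image.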
Building the Streamlit Application:
- Created a Streamlit app to fetch initial records from a Qdrant collection and display images.
- Added functionality for users to select a record and find similar images based on embeddings.
- Implemented caching of the QdrantClient instance for efficiency.
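The three points above might look something like the sketch below. It is an illustration under several assumptions: the collection name, the Qdrant URL, and the helper names are hypothetical, the collection is assumed to store one unnamed vector per point, and `query_points` assumes a reasonably recent `qdrant-client` (older releases exposed `search` instead).

```python
from typing import Any, List, Sequence

COLLECTION = "images"  # hypothetical collection name


def fetch_initial_records(client: Any, collection: str = COLLECTION, limit: int = 12) -> List[Any]:
    """Fetch the first few points, including vectors, to show on page load."""
    records, _next_offset = client.scroll(
        collection_name=collection,
        limit=limit,
        with_payload=True,
        with_vectors=True,
    )
    return records


def find_similar(client: Any, vector: Sequence[float], collection: str = COLLECTION, limit: int = 5) -> List[Any]:
    """Return the points whose embeddings are closest to `vector`."""
    response = client.query_points(
        collection_name=collection,
        query=list(vector),
        limit=limit,
        with_payload=True,
    )
    return response.points


def main() -> None:
    """Streamlit entry point; call main() at the bottom of the app script."""
    import streamlit as st
    from qdrant_client import QdrantClient

    @st.cache_resource  # reuse one client across reruns instead of reconnecting
    def get_client() -> QdrantClient:
        return QdrantClient(url="http://localhost:6333")  # assumed local instance

    client = get_client()
    records = fetch_initial_records(client)
    if not records:
        st.info("The collection is empty.")
        return
    choice = st.selectbox(
        "Pick a record",
        range(len(records)),
        format_func=lambda i: f"Record {records[i].id}",
    )
    if st.button("Find similar images"):
        for hit in find_similar(client, records[choice].vector):
            st.write(hit.id, hit.score)
```

Because the query vector belongs to a stored point, the selected record itself will normally come back as the top hit; an app may want to skip it when displaying results.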
Image Processing:
- Used PIL to dynamically resize and preprocess images.
- Converted images to base64 format for rendering in Streamlit.
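A minimal sketch of the resize and base64 round trip is shown below. The function names, the 256-pixel limit, and the PNG format are assumptions chosen for illustration; the project may use different values.

```python
import base64
from io import BytesIO


def resize_to_fit(image, max_side: int = 256):
    """Return a copy of a PIL image scaled down to fit in a max_side square."""
    resized = image.copy()
    # thumbnail() works in place and preserves the aspect ratio.
    resized.thumbnail((max_side, max_side))
    return resized


def image_to_base64(image, fmt: str = "PNG") -> str:
    """Serialize a PIL image to a base64 string."""
    buffer = BytesIO()
    image.save(buffer, format=fmt)
    return base64.b64encode(buffer.getvalue()).decode("ascii")


def base64_to_image(data: str):
    """Decode a base64 string back into a PIL image."""
    from PIL import Image  # imported lazily; only needed for decoding

    return Image.open(BytesIO(base64.b64decode(data)))


def to_data_uri(b64: str, mime: str = "image/png") -> str:
    """Wrap base64 data in a data URI that an <img> tag can display."""
    return f"data:{mime};base64,{b64}"
```

In Streamlit, the resulting URI could be rendered with something like `st.markdown(f'<img src="{uri}">', unsafe_allow_html=True)`; passing the PIL image directly to `st.image` is a simpler alternative when no custom HTML is needed.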

Challenges Faced

Embedding Generation: Understanding and implementing image embeddings using a pre-trained model required significant research and experimentation.
Efficiency: Ensuring that the application performs efficiently with large datasets was a major challenge, particularly in terms of querying and displaying images.
Data Handling: Handling and processing images, including encoding and decoding base64 strings, presented various technical challenges.

Future Enhancements

Scalability: Improve the scalability of the application to handle larger datasets more efficiently.
Additional Features: Add more functionalities such as filtering results based on different criteria and integrating more advanced image processing techniques.
User Interface: Enhance the user interface to provide a more seamless and intuitive experience.

Enhanced Search Capabilities

Text-to-Image Search

Description: Enable users to input text descriptions and generate matching images using advanced text-to-image generation models.
Example Use Case: Users can type "sunset over the mountains" and the application generates images that match this description.

Multimodal Search

Description: Combine text, image, and voice inputs to provide more comprehensive and accurate search results.
Example Use Case: Users can provide a photo and a text description like "beach with palm trees," and the system finds images that match both criteria.

These enhanced search capabilities leverage advanced Generative AI technologies to expand the functionality of the application, enabling users to search for and generate content based on various input modalities.

Visit Our Deployed Application

You can visit our website here.

Tutorial Video

Watch the tutorial video for a demonstration of this project on YouTube.

Feel free to explore and provide feedback!

Built With

base64
image
pil
processing
python-with-streamlit
pytorch
qdrant
transformers

Updates

surbhi sharma started this project — Jul 07, 2024 09:34 PM EDT
