Clara: Image Genie - search your photos with a keyword

1. home page to select photo folder to search through
2. search all pictures with cat in the image description
3. search result: cat pictures!
4. search all pictures that contain the word "time"
5. search result: pictures that contain the word "time", clearly visible in the rightmost picture
6. search all pictures that contain a boat
7. search result: pictures that contain a boat

Inspiration

Struggling to find that one picture in thousands of pictures

Ever wanted to dig up your ancient receipt photos? Long-lost memes? Your beloved cat pictures? Yes, we've all been there: we spend hours scrolling through the entire photo library to find one picture, and it takes even longer for those of us who have low vision or are blind but want to share images with sighted friends.
Our solution to this problem: what if we could search our photos with a keyword, just like Google Image Search?
So we created Clara. We made it accessible for those with vision disabilities as well, hoping that they'll also have the freedom to easily navigate through their photo albums.

What it does

Search through your entire photo album with a keyword, just like Google Image Search

Clara is your personal, local picture search and insight assistant. The name borrows from the Latin word clarus, meaning clear, bright, or famous, as this app aspires to make pictures searchable and clearly annotated for users.

User interface

Users type a keyword for what they want to find, for example "cat".
Users then select one of our three types of search:

Image Description: returns all images whose generated description mentions the keyword, for example "a picture of a cat sitting on grass land" or "a cat standing on a hill". In the future, we will also return the corresponding image description texts, which existing alt-text readers could then read aloud for blind users.
Object Detection: returns all images in which the keyword is detected as an object, for example all images containing a cat.
OCR (Optical Character Recognition): returns all images containing the keyword as written text, for example all images in which the word "cat" appears.

Clara streamlines the image search process using image-to-text algorithms, and makes images more accessible for everyone.
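
The three search types above can be sketched as a lookup over annotations that were computed ahead of time. This is a minimal illustration, not Clara's actual code: the `ImageRecord` class, its field names, the mode strings, and the sample file names are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field


# Hypothetical per-image record; the field names are illustrative only.
@dataclass
class ImageRecord:
    path: str
    caption: str = ""                          # text from the image captioning model
    objects: set = field(default_factory=set)  # labels from the object detector
    ocr_text: str = ""                         # text read from the image by OCR


def search(records, keyword, mode):
    """Return the paths of images whose chosen annotation matches the keyword."""
    kw = keyword.strip().lower()
    if not kw:
        return []
    hits = []
    for rec in records:
        if mode == "description":
            # Match whole words in the generated caption.
            matched = kw in re.findall(r"\w+", rec.caption.lower())
        elif mode == "object":
            # Match against the detector's class labels.
            matched = kw in {label.lower() for label in rec.objects}
        elif mode == "ocr":
            # Substring match against the text found in the image.
            matched = kw in rec.ocr_text.lower()
        else:
            raise ValueError(f"unknown search mode: {mode}")
        if matched:
            hits.append(rec.path)
    return hits


# Made-up sample data for illustration.
records = [
    ImageRecord("cat_hill.jpg", caption="a cat standing on a hill", objects={"cat"}),
    ImageRecord("clock.jpg", caption="a clock on a wall", objects={"clock"}, ocr_text="TIME flies"),
    ImageRecord("boat.jpg", caption="a boat on a lake", objects={"boat"}),
]

print(search(records, "cat", "description"))  # ['cat_hill.jpg']
print(search(records, "time", "ocr"))         # ['clock.jpg']
print(search(records, "boat", "object"))      # ['boat.jpg']
```

Keeping one record per image means each search type is only a different way of reading the same stored annotations, so the models do not need to run again at query time.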

Platform supported

PC (as a web app running on localhost): works well
Mobile: coming soon

How we built it

Machine learning models for image-to-text

We deployed open-source, pretrained machine learning models to provide our three core functions: a multimodal transformer for Image Description (image captioning), a YOLOv3-based object recognition model for Object Detection, and a ResNet-LSTM-autoencoder-based model for OCR (Optical Character Recognition). Together they process all images in a given folder.
We then display all matching images on the front end, a website that runs on localhost only. No internet connection is needed after downloading: all image processing happens offline, so user data never leaves the user's machine.
For details, see project roadmap.
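
As a rough sketch of the offline processing step described above, the loop below runs three models over every image in a folder and saves the results to a local JSON file. The model arguments (`captioner`, `detector`, `ocr`) are placeholder callables standing in for the real pretrained models, and the function names, index layout, and list of file extensions are assumptions for illustration.

```python
import json
from pathlib import Path

# Assumed set of supported image formats.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_index(folder, captioner, detector, ocr):
    """Run the three models once per image and collect their outputs.

    captioner, detector and ocr are hypothetical callables that each take
    an image path; they stand in for the real pretrained models.
    """
    index = {}
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip sub-folders and non-image files
        index[path.name] = {
            "caption": captioner(path),
            "objects": sorted(detector(path)),
            "ocr_text": ocr(path),
        }
    return index


def save_index(index, out_file):
    """Write the index to local disk so later searches need no model calls."""
    Path(out_file).write_text(json.dumps(index, indent=2), encoding="utf-8")
```

Because the index is an ordinary local file, nothing in this step needs a network connection.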

How to use

Challenges we ran into

All of our team members came in with varying degrees of experience, so one challenge was simply learning the skills necessary.
While we were able to successfully deploy the back-end models of image recognition and the front-end, we struggled with getting them to communicate with each other. It took some research and trial and error, but eventually we got both parts to integrate and work with each other.

Accomplishments that we're proud of

We are particularly proud that we were able to get all 3 models up and running and functioning with the pictures provided.
We are proud of our team for navigating conflicting time zones and installation issues and still collaborating successfully and consistently on the project. Go team!

What we learned

We initially had some trouble developing the front end, but we learned more about designing styles and creating fields for the back end to integrate with.
We also learned a lot about PyTorch models and how to deploy them in a fully functioning project.

What's next for Clara: Image Genie

Improve the UI to make it easier to read
Display an image description text for each image in the search results (the current version already has this information in the back end; it just needs to be surfaced on the front end), so a blind user can have their alt-text reader read it aloud
Integrate a text reader into the front end, so users without a pre-installed reader will still be able to make the most of our application
Pre-compile the app so it's easily installed
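
One way the planned description display could look is sketched below: each search result is rendered as an image whose `alt` attribute carries the generated description, which is the text a screen reader announces. The `render_results` function and the `(url, caption)` pair format are hypothetical, not part of the current app.

```python
from html import escape


def render_results(results):
    """Build an HTML list with one <img> per match, using the caption as alt text.

    results is an iterable of (image_url, caption) pairs; this shape is an
    assumption made for the example.
    """
    items = []
    for image_url, caption in results:
        # Escape both values so quotes in a caption cannot break the attribute.
        src = escape(image_url, quote=True)
        alt = escape(caption, quote=True)
        items.append(f'<li><img src="{src}" alt="{alt}"></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

In a Django template the same escaping happens automatically, so the equivalent there would be to output the caption inside the `alt` attribute of each result image.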

Built With

Submitted to

Black Wings Hacks 2023
- Winner Best Accessibility Hack sponsored by Fidelity

Created by

I helped build a website using Django by setting up the framework, connecting the front end and back end, and making sure information flowed smoothly between the two. I also assisted with using SQLite3 for the database and with successfully deploying the site.

Chimdindu Chukwuka
I worked on building the website framework using Django and deploying the site. I ensured that the front end exchanged information smoothly with the back end by setting up the correct fields for the back end to integrate with. I also worked on the front-end pages so they correctly received and displayed the picture matches from the back end.

Sai Sunku
I built the back end using Python, deploying models such as a vision transformer, an object recognition deep neural network, and OCR. I also worked on the front end to ensure it interfaces with the back end correctly.

As the team leader, I designed this project and coordinated teammates to work on both front and back end.

Lan Luo

Updates

Sai Sunku started this project — Feb 04, 2023 07:40 PM EST
