Inspiration

We were inspired by Google Lens, with a twist: in our app, the user provides a text snippet to be searched for in the physical world.

What it does

The web app we built invites the user to upload pictures of bookshelves and search the books for arbitrary strings. With the help of AI, the individual books are located and their text is extracted. Finally, the book whose text matches the query string is highlighted on the original image.
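Because OCR output from book spines is noisy, the matching step benefits from fuzzy comparison rather than exact substring search. Below is a minimal sketch of how such a matching step could look; the function name and threshold are our illustration, not the app's actual code, and only the standard library is used.

```python
from difflib import SequenceMatcher

def best_match(query, spine_texts, threshold=0.5):
    """Return (index, score) of the OCR'd spine text that best matches the
    query, or (None, 0.0) if nothing clears the threshold. Fuzzy matching
    tolerates OCR misreads (e.g. 'rn' recognized instead of 'm')."""
    query = query.lower()
    best_idx, best_score = None, 0.0
    for i, text in enumerate(spine_texts):
        score = SequenceMatcher(None, query, text.lower()).ratio()
        # An exact substring hit is a certain match even when ratio()
        # is dragged down by a long spine string.
        if query in text.lower():
            score = 1.0
        if score > best_score:
            best_idx, best_score = i, score
    return (best_idx, best_score) if best_score >= threshold else (None, 0.0)
```

The index returned here is what the app would use to look up the detected bounding box and highlight it on the original image.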

How we built it

We used a deep learning model (YOLOv3) together with OpenCV and Tesseract to perform object detection and text extraction. To get more meaningful results, we aligned the extracted text with the Google Books API. The web app was built with FastAPI.
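The detection-plus-OCR step can be sketched as follows: YOLOv3 is loaded through OpenCV's DNN module, its raw output rows (center x/y, width, height, objectness, class scores, all in relative coordinates) are converted to pixel boxes, and each box is cropped and handed to Tesseract. This is a simplified sketch under the standard YOLOv3 output layout; the weight/config file names are placeholders, and `parse_yolo_output` / `detect_books` are our names for illustration.

```python
import numpy as np

def parse_yolo_output(outputs, img_w, img_h, conf_threshold=0.5):
    """Convert raw YOLOv3 rows (cx, cy, w, h, objectness, class scores...)
    in relative coordinates into (x, y, w, h) pixel boxes."""
    boxes, confidences = [], []
    for output in outputs:
        for row in output:
            confidence = float(row[4]) * float(row[5:].max())
            if confidence < conf_threshold:
                continue
            cx, cy = row[0] * img_w, row[1] * img_h
            w, h = row[2] * img_w, row[3] * img_h
            boxes.append([int(cx - w / 2), int(cy - h / 2), int(w), int(h)])
            confidences.append(confidence)
    return boxes, confidences

def detect_books(image_path):
    """Run YOLOv3 via OpenCV's DNN module, then OCR each detected region
    with Tesseract. File names 'yolov3.cfg'/'yolov3.weights' are placeholders."""
    import cv2           # heavy dependencies imported lazily
    import pytesseract
    net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
    img = cv2.imread(image_path)
    h, w = img.shape[:2]
    blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())
    boxes, _ = parse_yolo_output(outputs, w, h)
    # OCR each crop; Tesseract expects RGB, OpenCV loads BGR
    texts = [pytesseract.image_to_string(
                 cv2.cvtColor(img[y:y + bh, x:x + bw], cv2.COLOR_BGR2RGB))
             for x, y, bw, bh in boxes]
    return boxes, texts
```

The extracted strings can then be cross-checked against the Google Books API to correct OCR errors before matching.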

Challenges we ran into

Creating a robust text-extraction algorithm was a challenge, especially given the varying lighting conditions, book orientations, and typefaces.
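Two of those problems have standard mitigations: adaptive binarization (e.g. Otsu's method) copes with uneven lighting, and trying 90-degree rotations handles spines whose text runs vertically. The NumPy-only sketch below illustrates both ideas in simplified form; our actual pipeline used OpenCV's implementations, and the function names here are ours.

```python
import numpy as np

def to_grayscale(rgb):
    """Luminance-weighted grayscale conversion (ITU-R BT.601 weights)."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def otsu_threshold(gray):
    """Otsu's method: choose the threshold maximizing between-class
    variance, which adapts binarization to the image's lighting."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    total = gray.size
    cum_count = np.cumsum(hist)
    cum_sum = np.cumsum(hist * np.arange(256))
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0 = cum_count[t - 1]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = cum_sum[t - 1] / w0
        mu1 = (cum_sum[-1] - cum_sum[t - 1]) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def spine_orientations(gray):
    """Book spine text is often vertical, so OCR can be attempted on all
    four 90-degree rotations and the best-scoring result kept."""
    return [np.rot90(gray, k) for k in range(4)]
```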

Accomplishments that we're proud of

We are happy that we managed to put the various pieces together and build an end-to-end application in such a limited amount of time.

What we learned

We developed our know-how in computer vision and gained experience with a number of technologies (OpenCV, the Google Books API, FastAPI).

What's next for Alexandria

The idea is to create a mobile app that detects text in a live stream and returns, in real time, the locations of objects matching what the user is searching for. Our mid-term goal is a general search tool for labels on physical objects, with applications ranging from libraries to shops and warehouses.
