Inspiration
Bookshelves are worse than fjords to navigate. There is too much choice, and indecision hits when trying to pick out a cool book at a library or bookstore. Why isn’t there an easy way to compare the ratings of different books from just the spine? That’s where BookBud comes in. Paper books are a staple of our lives: almost everyone has a bookshelf, yet finding a particular book on one is hard, and organising them is a very manual process.
What it does
BookBud is Shazam, but for books. BookBud lets users tap on the text relating to a book in a live video stream while they scan the shelves. Without going through the awkward process of googling long book titles or hunting for the right resource, readers can quickly find useful information about their books.
How we built it
We built it from the ground up using Swift. The first component takes in camera input and uses Apple’s Vision framework to retrieve the text recognised within the scene. This text is passed to the second component, which calls the Google Books API to retrieve the data to be displayed.
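As a rough illustration of the first component, the sketch below runs Vision's text recognition on a single frame and collects the top candidate string for each detected region. It is not our exact code: the function names `recognizeText(in:)` and `searchPhrase(from:)` are hypothetical, the frame is assumed to already be available as a `CGImage`, and the typed `results` property assumes a recent SDK (iOS 15 / macOS 12 or later).

```swift
import Foundation

#if canImport(Vision)
import Vision
import CoreGraphics

/// Hypothetical helper: runs OCR on one frame and returns the best
/// candidate string for each text region Vision finds.
func recognizeText(in image: CGImage) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate   // slower, but better on small spine text

    // perform(_:) is synchronous, so results are available once it returns.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    guard let observations = request.results else { return [] }
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
#endif

/// Hypothetical helper: turns the recognised lines the user selected
/// into a single search phrase for the book lookup.
func searchPhrase(from lines: [String]) -> String {
    lines
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
        .filter { !$0.isEmpty }
        .joined(separator: " ")
}
```

In a live stream the same request would typically be run per frame from the camera's sample buffers rather than from a still `CGImage`; the still-image form is used here only to keep the sketch short.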
Challenges we ran into
We ran into an unusual bug while combining the two halves of our project. The first half was the OCR piece, which takes in a photo of a bookshelf and recognises text such as title, author and publisher. The second half was the piece that talks directly to the Google Books client to retrieve details such as average rating, maturity rating and reviews for that text. More generally, we ran into compatibility issues between UIKit and SwiftUI, as Apple has been shifting its emphasis from the former to the latter, and it took many hours of tweaking to get the different components to play well together.
We also initially tried to separate each book’s spine from the rest of the bookshelf. This can be tackled fairly easily with OpenCV, but we had not written our code in Objective-C++, so OpenCV was not compatible with the rest of our code.
Accomplishments that we're proud of
We successfully learned how to use Apple’s Vision framework to run OCR on camera input and extract a book title. We also successfully called the Google Books API to retrieve the title and average rating for a book, and integrated the two into a single interface.
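For readers curious what the Google Books lookup involves, here is a minimal sketch under stated assumptions. It uses the public volumes search endpoint and the `volumeInfo` fields `title`, `authors`, `averageRating`, `ratingsCount` and `maturityRating`; the type and function names (`VolumeList`, `volumesSearchURL(title:)`, `decodeVolumes(from:)`) are hypothetical and not taken from our project.

```swift
import Foundation

// Only the fields we care about; everything except volumeInfo is optional
// because the API omits fields it has no data for.
struct VolumeList: Decodable {
    let items: [Volume]?
}

struct Volume: Decodable {
    let volumeInfo: VolumeInfo
}

struct VolumeInfo: Decodable {
    let title: String?
    let authors: [String]?
    let averageRating: Double?
    let ratingsCount: Int?
    let maturityRating: String?
}

/// Hypothetical helper: builds a title search against the volumes endpoint.
func volumesSearchURL(title: String) -> URL? {
    var components = URLComponents(string: "https://www.googleapis.com/books/v1/volumes")
    components?.queryItems = [
        URLQueryItem(name: "q", value: "intitle:\(title)"),
        URLQueryItem(name: "maxResults", value: "5"),
    ]
    return components?.url
}

/// Hypothetical helper: decodes a volumes response into its volumeInfo entries.
func decodeVolumes(from data: Data) throws -> [VolumeInfo] {
    let list = try JSONDecoder().decode(VolumeList.self, from: data)
    return (list.items ?? []).map { $0.volumeInfo }
}
```

The URL from `volumesSearchURL(title:)` can be fetched with `URLSession`, and the returned `Data` handed to `decodeVolumes(from:)`. Keeping the decoding separate from the network call makes it easy to check against a saved response.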
What we learned
For three of the four of us, this was the first time working with Swift or mobile app development. It proved to be a steep learning curve, but an extremely rewarding one. We drew extensively on simulation in our process, and we also learned about the objects and syntax that Swift uses compared to C.
What's next for BookBud
There are many technical details BookBud could improve on:
Improved UI
- Basic improvements and features, including prompting for the camera immediately.
- Book lovers need an endearing UI: simple and intuitive, but also stylish and chic.
- Create a book recommendation system based on the books a reader has looked at or wanted more information on in the past, or on their past reading history.
- Do this in AR instead of with a photo, overlaying each book with a color density that corresponds to its rating or even its “recommendation score”.
Image Segmentation through Bounding Boxes
- Automatically detect all books in the live stream and suggest which book has the highest recommendation score.
- Create a ‘find your book’ feature that lets you find a specific book amidst the sea of books on a bookshelf.
More ambitious applications…
- Transfer the AR overlay of the bookshelf into a metaversal library of people and their books. Avid readers can join international rooms to give book recommendations and talk about their interpretations of material in a friendly, communal fashion.
- We can imagine individuals wanting NFTs of the bookshelves of celebrities, their families, and friends. There is a distinct intellectual flavor to showing what is on your bookshelf. NFT books?
- Goodreads is far superior to Google Books, so hopefully they start issuing developer keys again!
Built With
- googlebooksclientapi
- swift
- visionml
