Nowadays, kids spend most of their time on a tablet or computer instead of reading books. This interactive application aims to make the whole reading experience more fun.
What it does
BookLens captures any text document and reads it aloud on click: the whole page, a single sentence, or just one word, in any language you choose. Clicking a word can also display a picture of it and its translation.
How we built it
The back end is written in Node.js and the front end in React. They use several Google Cloud APIs (Vision, Text-to-Speech, Translation) to process both images and text, as well as the Shutterstock API to retrieve images.
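To give a rough idea of what one of these calls can look like, here is a minimal sketch of building a request for the Text-to-Speech REST endpoint (text:synthesize). This is not the actual BookLens code: the helper name buildSynthesizeRequest is made up for illustration, MP3 output is an assumed choice, and authentication (API key or OAuth token) is left out on purpose.

```typescript
// Hypothetical helper: shapes a Text-to-Speech REST request for a word,
// sentence, or page of text. Auth is intentionally omitted.
interface SynthesizeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildSynthesizeRequest(text: string, languageCode: string): SynthesizeRequest {
  const trimmed = text.trim();
  if (trimmed.length === 0) {
    throw new Error("Nothing to read aloud: text is empty");
  }
  return {
    url: "https://texttospeech.googleapis.com/v1/text:synthesize",
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      input: { text: trimmed },
      voice: { languageCode },              // e.g. "en-US", "fr-FR"
      audioConfig: { audioEncoding: "MP3" } // assumed output format
    }),
  };
}
```

The response to such a request carries the audio as a base64 string in its audioContent field, which the front end could decode and play when a word or sentence is clicked.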
Challenges I ran into
Canvas and structuring data. The OCR data returned by the Google Vision API did not suit our needs as is, so it had to be parsed and restructured to follow the models we required. The algorithm was redone three times at 5 am, but it finally worked!
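For illustration, a restructuring step along these lines could look like the sketch below. It is not the algorithm we ended up with. It assumes the nested fullTextAnnotation shape that the Vision API returns (pages, blocks, paragraphs, words, symbols), and the WordModel and SentenceModel types as well as the punctuation-based sentence split are hypothetical simplifications.

```typescript
// Minimal subset of the Vision API fullTextAnnotation shape used here.
interface Vertex { x?: number; y?: number }
interface VisionSymbol { text: string; property?: { detectedBreak?: { type?: string } } }
interface VisionWord { symbols: VisionSymbol[]; boundingBox?: { vertices: Vertex[] } }
interface FullTextAnnotation {
  pages: { blocks: { paragraphs: { words: VisionWord[] }[] }[] }[];
}

// Hypothetical app-side models: clickable words grouped into sentences.
interface Box { x: number; y: number; width: number; height: number }
interface WordModel { text: string; box: Box }
interface SentenceModel { text: string; words: WordModel[] }

// Collapse the four corner vertices into an axis-aligned rectangle.
function toBox(vertices: Vertex[] = []): Box {
  if (vertices.length === 0) return { x: 0, y: 0, width: 0, height: 0 };
  const xs = vertices.map(v => v.x ?? 0);
  const ys = vertices.map(v => v.y ?? 0);
  const minX = Math.min(...xs);
  const minY = Math.min(...ys);
  return { x: minX, y: minY, width: Math.max(...xs) - minX, height: Math.max(...ys) - minY };
}

// Break types after which a space belongs in the reconstructed text.
const SPACE_BREAKS = ["SPACE", "SURE_SPACE", "EOL_SURE_SPACE", "LINE_BREAK"];

function toSentences(annotation: FullTextAnnotation): SentenceModel[] {
  const sentences: SentenceModel[] = [];
  let words: WordModel[] = [];
  let text = "";
  const flush = () => {
    if (words.length > 0) sentences.push({ text: text.trim(), words });
    words = [];
    text = "";
  };
  for (const page of annotation.pages) {
    for (const block of page.blocks) {
      for (const paragraph of block.paragraphs) {
        for (const word of paragraph.words) {
          const wordText = word.symbols.map(s => s.text).join("");
          const last = word.symbols[word.symbols.length - 1];
          const breakType = last?.property?.detectedBreak?.type ?? "";
          words.push({ text: wordText, box: toBox(word.boundingBox?.vertices) });
          text += wordText + (SPACE_BREAKS.indexOf(breakType) >= 0 ? " " : "");
          // Naive heuristic: closing punctuation ends the sentence.
          if (/[.!?]$/.test(wordText)) flush();
        }
        flush(); // a paragraph boundary also ends the current sentence
      }
    }
  }
  return sentences;
}
```

With a structure like this, a click on the canvas only has to be matched against each word's box to find the word, and through it the sentence, to read aloud.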
Accomplishments that I'm proud of
Most base features are completed and work as expected.
What I learned
How to integrate with Google Cloud, whether through their REST API, gRPC, or their client libraries. Also canvas, and how it will never be used again :').
What's next for BookLens
Planned features include translation of whole pages, reading translated text aloud, word visualization, and the ability to go back to the previous page, among many other possible ideas!