Inspiration
According to the World Health Organization, at least 2.2 billion people have a near or distance vision impairment. For many of them, reading small pieces of text is difficult. With more and more information appearing on small screens in recent years, vision impairment is rapidly becoming a major accessibility problem. At MetHacks, we wanted to build a solution to this ever-growing unseen epidemic.
What it does
Users can point their phone at text and press the capture button. The app uploads the image to a server, where it is processed by an optical character recognition (OCR) program and, if requested, by Cohere. An "advanced" mode can be toggled to expose more text-recognition options, including language detection and a keyword finder.
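The keyword finder in advanced mode could be approximated by a simple frequency ranking over the recognized text. A minimal sketch of that idea (the `find_keywords` name and the tiny stop-word list are illustrative, not the app's actual code):

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real implementation would use a larger one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it"}

def find_keywords(text: str, top_n: int = 3) -> list[str]:
    """Return the top_n most frequent non-stop-words in OCR'd text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]
```

For example, `find_keywords("OCR reads text. OCR text is noisy text.", top_n=2)` ranks "text" first and "ocr" second.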
How we built it
The mobile app is built with React Native using Expo. Expo handles the camera, file system, image manipulation, and text-to-speech. The backend, built with Django and Django REST Framework, handles the integration between the app and external APIs such as Cohere.
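At a high level, the backend endpoint receives an image, runs OCR on it, and optionally forwards the result for enrichment. A rough, stdlib-only sketch of that control flow, with the OCR and Cohere calls stubbed out as injected callables (`process_capture`, `run_ocr`, `summarize`, and the response shape are our illustrative names, not the project's actual implementation):

```python
from typing import Callable, Optional

def process_capture(
    image_bytes: bytes,
    run_ocr: Callable[[bytes], str],        # e.g. a wrapper around an OCR engine
    summarize: Optional[Callable[[str], str]] = None,  # e.g. a Cohere client call
    advanced: bool = False,
) -> dict:
    """Mirror the view logic: OCR first, then optional enrichment steps."""
    text = run_ocr(image_bytes)
    result = {"text": text}
    if summarize is not None:
        result["summary"] = summarize(text)
    if advanced:
        # Advanced mode would add extras such as detected language and keywords.
        result["language"] = "unknown"  # placeholder for a real detector
    return result
```

Injecting the OCR and summarization functions keeps the flow testable without a camera or an API key.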
Challenges we ran into
- Working with computer vision for the first time
- Getting Cohere to produce accurate output with fewer hallucinations
- Preprocessing the image so the OCR engine can read it reliably
- Cleaning and sorting the text
- Finding an effective text-to-speech service that can clearly read the given text
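For the text-cleaning challenge, one workable approach is to collapse stray whitespace and drop lines that are mostly non-alphanumeric OCR noise. A minimal sketch along those lines (illustrative only, with a hypothetical `clean_ocr_text` helper and an assumed 50% threshold):

```python
import re

def clean_ocr_text(raw: str, min_clean_ratio: float = 0.5) -> str:
    """Collapse whitespace and drop lines that are mostly OCR noise."""
    kept = []
    for line in raw.splitlines():
        line = re.sub(r"\s+", " ", line).strip()
        if not line:
            continue
        # Keep a line only if most characters are letters, digits, or spaces.
        clean = sum(ch.isalnum() or ch.isspace() for ch in line)
        if clean / len(line) >= min_clean_ratio:
            kept.append(line)
    return "\n".join(kept)
```

For example, `clean_ocr_text("Hello   world\n~~|..#%")` keeps "Hello world" and drops the noise line.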
Accomplishments that we're proud of
We worked well together and stayed focused. We divided the tasks so that every member had something new to learn, and each member completed their goals efficiently.
What we learned
This was everyone's first time working with computer vision and optical character recognition. We also deepened our knowledge of the tech stacks we used.
What's next for narratorRL
- Better accuracy (in both OCR and Cohere output)
- More accessibility features (e.g., voice commands)
- More intuitive UI/UX patterns