There has been a lot of research in the field of image and video feature extraction and recognition. We decided to leverage on some of these advancements into building an audio assistant for the visually challenged - enabling them to get a feel of what is happening around them.

What it does

The application, "netra"(Sanskrit: eyes) , allows the user to operate in two modes - one click mode in which an audio assistant describes the contents of the user's current screen (camera lens) and a second, real time mode, which constantly examines the user's surroundings and notifies the user with audio tags. The user can enable and disable the text captions appearing on the screen. The navigations also have gesture support enabled to make it easy for people who are visually challenged.

How we built it

We build this application as an android mobile application making use of the async tasks to run background services. We used the Algomus API, based on the developments in recurrent neural networks for the caption generation (one click mode) and the Clarifai API for the tag generation (real time mode).

Accomplishments that we're proud of

Securing the 1st place at USC's ACM Mobile Hackathon for this project.

What's next for netra

Release a beta version soon

Built With

+ 1 more
Share this project: