This story starts with someone in my campus residence almost hitting me with a cane. That someone was the only blind person I had met over the last few years. From there, I started wondering how blind people, who navigate the street with difficulty using a cane, could benefit from technology that improves how they perceive their environment. EyeVoice is an application that aims to help the blind in their everyday life.
What it does
The app connects to a camera whose feed is streamed over a server. EyeVoice then relies on machine learning to recognize what is happening around the user at any time and sends them audio hints about their surroundings, e.g. where the objects and people in the frame are.
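A minimal sketch of the last step, turning one frame's detections into a spoken hint. Everything here is hypothetical illustration, not our actual code: the `make_hint` helper and the `(label, position)` detection format are assumptions, with positions taken as the normalized x-center of a bounding box.

```python
# Hypothetical sketch: turn a frame's detections into a short audio hint.
# Each detection is (label, x), where x is the normalized horizontal
# center of the detection's bounding box in [0, 1].

def region(x):
    """Map a normalized x-position to a coarse direction word."""
    if x < 0.33:
        return "left"
    if x > 0.66:
        return "right"
    return "ahead"

def make_hint(detections):
    """Build one short sentence summarizing what is where,
    ready to be handed to a text-to-speech engine."""
    if not detections:
        return "Nothing detected."
    parts = [f"{label} {region(x)}" for label, x in detections]
    return ", ".join(parts) + "."

print(make_hint([("person", 0.2), ("chair", 0.8)]))
# "person left, chair right."
```

Keeping hints this terse matters: the user hears them continuously, so every extra word is noise.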
How we built it
We trained a model relying on FaceNet, among others, to find and report information about the user's surroundings. The input comes from a camera (embedded in glasses or a phone) and is transmitted via our app to be analyzed and run through inference. We then use the voice capabilities widely available on phones to deliver audio hints to the user.
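The overall loop can be sketched as capture → infer → speak. This is a skeleton under stated assumptions: `capture_frame`, `detect_objects`, and `speak` are stand-ins (in the real app the frames come from the phone's camera stream, detection uses the FaceNet-based model, and speech uses the phone's built-in TTS).

```python
# Skeleton of the capture -> infer -> speak pipeline. All three helpers
# are stand-ins for the real camera stream, model, and TTS engine.

def capture_frame(stream):
    # Stand-in: pop the next frame from a pre-recorded list of frames.
    return stream.pop(0) if stream else None

def detect_objects(frame):
    # Stand-in: in this sketch a "frame" is already a list of labels.
    return frame

def speak(text, spoken_log):
    # Stand-in for a TTS call; just record what would be spoken.
    spoken_log.append(text)

def run_pipeline(stream):
    """Drain the stream, speaking one hint per frame that has detections."""
    spoken = []
    while True:
        frame = capture_frame(stream)
        if frame is None:
            break
        labels = detect_objects(frame)
        if labels:
            speak("I see " + " and ".join(labels), spoken)
    return spoken

hints = run_pipeline([["person"], [], ["person", "door"]])
print(hints)  # ['I see person', 'I see person and door']
```

Empty frames are skipped so the user is not flooded with "nothing detected" messages.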
Challenges we ran into
The main challenge was trying to build the entire pipeline with glasses; we eventually pivoted to using a phone. Reliable object and people recognition was another challenge.
Accomplishments that we're proud of
- We were able to take live video and count how many people there are in every frame
- We were able to accomplish the basic use case of audio feedback about the user's surroundings
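Per-frame people counts tend to flicker when a detection is briefly missed. One simple way to stabilize the count before speaking it is a majority vote over a sliding window of recent frames. This is a sketch of that idea, not our exact method:

```python
from collections import Counter, deque

def smoothed_counts(per_frame_counts, window=5):
    """Stabilize a noisy per-frame people count with a sliding-window
    majority vote, so one-frame detection glitches don't reach the user."""
    recent = deque(maxlen=window)  # keeps only the last `window` counts
    out = []
    for count in per_frame_counts:
        recent.append(count)
        # most_common(1) returns the single most frequent recent count
        out.append(Counter(recent).most_common(1)[0][0])
    return out

raw = [2, 2, 3, 2, 2, 2, 1, 2]   # frame 3 and 7 are detection glitches
print(smoothed_counts(raw))       # [2, 2, 2, 2, 2, 2, 2, 2]
```

The window size trades responsiveness for stability: a larger window suppresses more glitches but reacts more slowly when someone actually enters or leaves the frame.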
What we learned
What's next for EyeVoice
It would be interesting to use glasses with embedded cameras connected to our server instead of a phone, which would be more convenient. We could also add features such as describing the place the user is in, or automatically calling their favorite contact if they get lost (which tends to happen quite a lot).