This summer, one of our team members met a blind man while interning at Microsoft. Through those conversations, we realized that software still has a long way to go in helping people with visual impairments live to their full potential.
What it does
Lumina is a mobile app that snaps photos and analyzes them to provide information about what might be in the image. Furthermore, it reads text out loud and identifies people and logos.
How we built it
We used the Bing Speech API for speech-to-text (STT) recognition to enable voice command interaction, and the Google Cloud Vision API to get serialized information about the image. Finally, we used the Android SDK text-to-speech (TTS) engine so the user can hear the information.
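To make the image analysis step more concrete, here is a minimal sketch of how a request body for the Cloud Vision `images:annotate` REST endpoint can be assembled in plain Java. It is an illustration under stated assumptions, not the app's actual code: the class name `VisionRequestSketch`, the method `buildRequest`, and the choice of four feature types are hypothetical, and the sketch leaves out authentication and the HTTP call itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that builds the JSON body for a Cloud Vision images:annotate request. */
final class VisionRequestSketch {

    // REST endpoint for image annotation; a real call also needs an API key or OAuth token.
    static final String ENDPOINT = "https://vision.googleapis.com/v1/images:annotate";

    // Assumed feature set: labels, text, logos, and faces.
    // Note that FACE_DETECTION locates faces; it does not name the person.
    static final List<String> FEATURES =
            List.of("LABEL_DETECTION", "TEXT_DETECTION", "LOGO_DETECTION", "FACE_DETECTION");

    /** Encodes the photo as base64 and lists one feature entry per detection type. */
    static String buildRequest(byte[] imageBytes, int maxResults) {
        String content = Base64.getEncoder().encodeToString(imageBytes);
        String features = FEATURES.stream()
                .map(type -> "{\"type\":\"" + type + "\",\"maxResults\":" + maxResults + "}")
                .collect(Collectors.joining(","));
        return "{\"requests\":[{\"image\":{\"content\":\"" + content + "\"},"
                + "\"features\":[" + features + "]}]}";
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for a JPEG captured by the camera.
        byte[] fakePhoto = "not a real photo".getBytes(StandardCharsets.UTF_8);
        System.out.println("POST " + ENDPOINT);
        System.out.println(buildRequest(fakePhoto, 5));
    }
}
```

The JSON response to such a request contains one annotation list per requested feature, which is the kind of serialized information that can then be turned into a sentence and passed to the TTS engine.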
Challenges we ran into
Understanding the APIs was the biggest challenge. The other was synchronizing the application so that voice commands could still be received correctly while the application was speaking information to the user.
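One possible way to handle this kind of coordination is to queue any command that arrives while the app is speaking and process it once speech finishes. The sketch below shows that idea in plain Java. It is an assumption for illustration, not a description of how Lumina actually solves the problem: the class `SpeechCoordinatorSketch` and all of its methods are hypothetical. On Android, the start and done callbacks could be driven by `UtteranceProgressListener.onStart` and `onDone`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical coordinator that defers voice commands while speech output is playing. */
final class SpeechCoordinatorSketch {
    private boolean speaking = false;
    private final Queue<String> pendingCommands = new ArrayDeque<>();
    private final List<String> handled = new ArrayList<>();

    // Call when speech output starts.
    synchronized void onSpeechStarted() {
        speaking = true;
    }

    // Call when speech output finishes; releases deferred commands in arrival order.
    synchronized void onSpeechDone() {
        speaking = false;
        while (!pendingCommands.isEmpty()) {
            handled.add(pendingCommands.remove());
        }
    }

    // Call when the speech recognizer returns a command.
    synchronized void onCommand(String command) {
        if (speaking) {
            pendingCommands.add(command); // defer until the app stops talking
        } else {
            handled.add(command);         // a real app would dispatch the command here
        }
    }

    // Returns a copy of the commands handled so far, in order.
    synchronized List<String> handledCommands() {
        return new ArrayList<>(handled);
    }

    public static void main(String[] args) {
        SpeechCoordinatorSketch coordinator = new SpeechCoordinatorSketch();
        coordinator.onCommand("describe");   // handled immediately
        coordinator.onSpeechStarted();
        coordinator.onCommand("read text");  // deferred while speaking
        coordinator.onSpeechDone();          // deferred command is released
        System.out.println(coordinator.handledCommands());
    }
}
```

Queuing is only one design choice. An alternative is to interrupt speech as soon as a new command arrives, which can feel more responsive but risks cutting off information the user still wants to hear.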
Accomplishments that we're proud of
We managed to learn a lot about STT and TTS APIs and their implementation.
What we learned
What's next for Lumina
Future versions will include more voice commands.