"Appliances all around us have digital displays. Most of the time, even though I can’t see the display, I can memorize where the buttons are or use tactile stickers as a reference. But now more of these displays are becoming context dependent: the display on my oven looks different when I’m setting the timer than it does when I’m starting to cook a meal.
These changing displays make it difficult for me to use my oven, stove and crock pot on my own, like I used to. I wish there was an alternative to having a sighted friend on hand to read the options out loud!"
What it does
We are building an app, intended for mobile use, that helps visually impaired users interact more efficiently with modern home appliances. It pairs a front-end web app that takes speech input with a back end that reads the display through OCR (image to text) and responds through text-to-speech. The app maps out and vocalizes where the buttons are on the display, along with the text in the readout.
How I built it
- A front-end web app with speech input calls the Python script.
- The script gets the image and detects the text in it using the Google Cloud OCR API.
- OpenCV and image processing are used to detect the color at the fingertip.
- The appliance's digital display is mapped to a grid, which serves as a reference coordinate system for the fingertip.
- IBM Watson provides the text-to-speech output to the user.
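The grid step above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the `Grid` class, the `direction_hint` helper, the 3x3 grid size, the box format, and the pixel coordinates are all assumptions made for the example. It assumes the fingertip position and a button's bounding box have already been found (for instance by the color detection and OCR steps) and only shows how both could be placed on a shared grid to produce a hint for the speech output.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in image pixels.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Grid:
    """Hypothetical reference grid laid over the camera image."""
    width: int      # image width in pixels
    height: int     # image height in pixels
    rows: int = 3   # assumed grid size, not taken from the project
    cols: int = 3

    def cell_of(self, x: float, y: float) -> Tuple[int, int]:
        """Return the (row, col) of the grid cell containing pixel (x, y)."""
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise ValueError("point lies outside the image")
        # Clamp to the last cell to guard against floating-point rounding.
        row = min(int(y * self.rows / self.height), self.rows - 1)
        col = min(int(x * self.cols / self.width), self.cols - 1)
        return row, col


def _steps(count: int, direction: str) -> str:
    """Format a step count such as '1 cell left' or '2 cells down'."""
    unit = "cell" if count == 1 else "cells"
    return f"{count} {unit} {direction}"


def direction_hint(grid: Grid, fingertip: Tuple[float, float], target: Box) -> str:
    """Describe where the target box is relative to the fingertip, in grid cells."""
    f_row, f_col = grid.cell_of(*fingertip)
    x_min, y_min, x_max, y_max = target
    t_row, t_col = grid.cell_of((x_min + x_max) / 2, (y_min + y_max) / 2)

    parts = []
    # Image coordinates grow downward, so a larger row index means "down".
    if t_row != f_row:
        parts.append(_steps(abs(t_row - f_row), "down" if t_row > f_row else "up"))
    if t_col != f_col:
        parts.append(_steps(abs(t_col - f_col), "right" if t_col > f_col else "left"))
    return ("move " + " and ".join(parts)) if parts else "on target"
```

For a 900x600 image, a fingertip at (100, 100) and a button box of (700, 450, 800, 550) would give the hint "move 2 cells down and 2 cells right", which could then be handed to the text-to-speech step.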
Challenges I ran into
- Mobile device orientation
- Method of mapping appliance button location
- Lighting: glare, reflection
- Connecting the front end and back end: design architecture
Accomplishments that I'm proud of
- Mapping and navigation
- Integrating various APIs and code
- Speech verification of relative location
- UI implementation
- Overall vision for the idea, even though we didn't get to accomplish all the tasks we set out to do
What I learned
- Architecture strategy is key for integrating front end and back end
- Design with compensating behavior in mind
What's next for OvenBot
- Machine learning algorithm for fingertip detection, to remove the need for an external aid
- Field of view
- Converting project into a mobile app
- More natural conversational speech output
- Speech input