More than 35 million Americans are blind or have some form of visual impairment. Thousands of dollars and thousands of caretaker hours are spent assisting them. From stumbling over everyday obstacles and navigating simple household or local routes to misplacing essential items, blind people need a friendly, automated, wireless assistant for everyday tasks.

What it does

Deep Sight uses a camera module attached to a Raspberry Pi, synced with an Amazon Echo, to act as wireless voice-assistant glassware for blind and visually impaired users. Users hold an object in front of the camera module attached to their glasses or goggles; Alexa responds with the result of the image recognition and can be asked for further information about the object. The device uses OpenCV for image recognition and Alexa Skills for the conversational UI.

How I built it

We used Alexa Skills to develop a friendly, conversational UI between the user and the Amazon Echo. Alexa passes each request through Amazon Web Services to the Raspberry Pi, which takes a picture with its attached camera module and sends it to a host PC for image recognition and analysis. We used OpenCV for image recognition, and the result is sent back to Alexa.
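As a rough illustration of the last step, here is a minimal sketch of how a recognized label could be wrapped in an Alexa custom-skill response. The function name and the spoken wording are hypothetical and not taken from our code; only the response envelope (`version`, `response.outputSpeech`, `shouldEndSession`) follows the Alexa custom-skill JSON format.

```python
import json


def build_alexa_response(label, end_session=False):
    """Wrap a recognized object label in an Alexa custom-skill response.

    `label` is assumed to be a plain string produced by the image
    recognition step, or None when nothing was recognized.
    """
    if label:
        speech = f"I think you are holding a {label}."
    else:
        speech = "Sorry, I could not recognize that object."
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            # Leaving the session open lets the user ask a follow-up question.
            "shouldEndSession": end_session,
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_alexa_response("coffee mug"), indent=2))
```

Keeping `shouldEndSession` false by default is one way to support the "may be conversed with for further information" behavior; a real skill would probably also set a reprompt.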

Challenges I ran into

We originally wanted simple, direct communication between Alexa and the Raspberry Pi. One of our biggest challenges was getting OpenCV, a large package with many pre-trained models, to run on the Raspberry Pi. To work around this, the Raspberry Pi sends each image it takes to a local computer that has the OpenCV library installed; the image is analyzed there and the result is sent back to Alexa to be spoken to the user.
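The write-up does not say how images travel from the Raspberry Pi to the local computer, so the following is only a sketch of one common approach: length-prefixed framing over a byte stream such as a TCP socket. The helper names `pack_image`, `read_exact`, and `read_image` are hypothetical.

```python
import io
import struct

# 4-byte big-endian unsigned length, sent before every image.
HEADER = struct.Struct("!I")


def pack_image(image_bytes):
    """Prefix encoded image bytes (e.g. a JPEG) with their length."""
    return HEADER.pack(len(image_bytes)) + image_bytes


def read_exact(stream, n):
    """Read exactly n bytes from a file-like object, or raise EOFError."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_image(stream):
    """Read one length-prefixed image from the stream."""
    (length,) = HEADER.unpack(read_exact(stream, HEADER.size))
    return read_exact(stream, length)


if __name__ == "__main__":
    # io.BytesIO stands in for a socket here; on a real connection the
    # receiver could pass sock.makefile("rb") as the stream.
    wire = io.BytesIO(pack_image(b"first image") + pack_image(b"second image"))
    print(read_image(wire), read_image(wire))
```

The length prefix lets the receiver know where one picture ends and the next begins, which a raw socket stream does not provide on its own.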

Accomplishments that I'm proud of

We're proud that we were able to automate the entire process in a smooth, efficient manner. Connecting Alexa, transmitting data through AWS with Lambda functions and several SQS queues, communicating with the Raspberry Pi, then with the local computer, and finally back to Alexa was a tedious process.
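To make the Lambda-to-SQS hop more concrete, here is a minimal sketch of a handler that turns an Alexa intent into a capture request on a queue. The queue URL, the message fields, and the function name are placeholders invented for illustration. The sender is passed in as a callable so the sketch runs without AWS credentials; in a deployed Lambda it could be the `send_message` method of a boto3 SQS client, which accepts `QueueUrl` and `MessageBody`.

```python
import json

# Placeholder URL for illustration only, not a real queue.
CAPTURE_QUEUE_URL = "https://sqs.example.invalid/capture-requests"


def handle_identify_intent(event, send_message):
    """Queue a capture request for the Raspberry Pi.

    `event` is the Alexa request JSON delivered to the Lambda function.
    `send_message` is any callable taking QueueUrl and MessageBody
    keyword arguments (assumed here to be an SQS client method).
    Returns the message body that was queued, or None if the request
    was not an intent.
    """
    request = event["request"]
    if request.get("type") != "IntentRequest":
        return None
    body = json.dumps({
        "action": "capture",
        "intent": request["intent"]["name"],
        "request_id": request["requestId"],
    })
    send_message(QueueUrl=CAPTURE_QUEUE_URL, MessageBody=body)
    return body


if __name__ == "__main__":
    sample = {"request": {"type": "IntentRequest", "requestId": "req-1",
                          "intent": {"name": "IdentifyObjectIntent"}}}
    # Print instead of sending, so the sketch runs anywhere.
    handle_identify_intent(sample, lambda **kwargs: print(kwargs))
```

On the other side, the Raspberry Pi would poll the same queue and take a picture whenever a capture message arrives.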

What I learned

We learned how to work with the Raspberry Pi, configure OpenCV (the Open Source Computer Vision Library), and develop a conversational UI with Alexa Skills.

What's next for Deep Sight

Possible extensions include training the device to send emergency alerts when it senses a dangerous situation at home or in public, using either Google Cloud or other safety APIs. For example, if the image recognition software recognizes a fire, it should immediately alert 911 or the fire department. Another next step is to add machine learning to our core analysis software, which could make the voice assistant better at detecting objects around the house and in public places. For example, instead of holding up an object and asking what it is, a user should be able to ask where in their frame a certain object of interest is.
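The "where in the frame" idea could be sketched as follows, assuming a future detector returns a bounding box as `(x, y, width, height)` in pixels measured from the top-left corner, which is the convention OpenCV uses for rectangles. Nothing here is implemented in Deep Sight yet; the function name and the phrasing of the answer are hypothetical.

```python
def describe_position(box, frame_width, frame_height):
    """Turn a bounding box into a spoken-style position.

    The frame is split into a 3x3 grid and the box is placed by the
    location of its center point.
    """
    x, y, w, h = box
    cx = x + w / 2
    cy = y + h / 2
    # min(..., 2) keeps a center on the far edge inside the last cell.
    column = ("left", "center", "right")[min(int(3 * cx / frame_width), 2)]
    row = ("top", "middle", "bottom")[min(int(3 * cy / frame_height), 2)]
    if column == "center" and row == "middle":
        return "in the center of your view"
    return f"at the {row} {column} of your view"


if __name__ == "__main__":
    # A 640x480 frame with a hypothetical detection near one corner.
    print("Your keys are " + describe_position((600, 400, 40, 80), 640, 480) + ".")
```

A coarse 3x3 grid is a deliberate simplification: short phrases like "top left" are easier to act on by ear than pixel coordinates.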

Built With
