Visually impaired people cannot sense the world around them as easily as sighted people can. With the help of IoT devices and computer vision, we built VisualAssist, an app that helps them explore their surroundings freely.
What it does
Users, especially visually impaired people, can use VisualAssist to learn about their surroundings:
- easily select an area in a webcam feed by moving their arm;
- by holding a fist, have the computer recognize the objects in that area;
- hear the computer describe what's in the area in natural language.
How we built it
- An image is captured with OpenCV from a webcam attached to a laptop.
- The orientation of the user's arm is then read from a Myo armband, allowing the user to select an area of interest in the image.
- Image recognition is handled by the Google Cloud Vision API. With a single API call, we can extract the most likely objects in a frame.
- After a list of likely objects is recognized from the image, we use Text-To-Speech to read the list aloud so the user can hear it clearly.
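The write-up doesn't spell out how arm orientation becomes a region of the image, so here is a minimal sketch of one plausible mapping from yaw/pitch to a pixel rectangle. The function name, angle ranges, and box size are all illustrative tuning constants, not values from the actual project:

```python
def roi_from_orientation(yaw, pitch, frame_w, frame_h,
                         yaw_range=0.6, pitch_range=0.4, box=0.3):
    """Map arm yaw/pitch (radians, 0 = pointing at the frame centre) to a
    pixel rectangle (x0, y0, x1, y1) inside a frame_w x frame_h image.

    yaw_range/pitch_range: assumed half-field-of-view covered by arm motion.
    box: side of the selection box as a fraction of the frame.
    """
    # Normalise the angles to [0, 1] across the assumed range of motion
    nx = min(1.0, max(0.0, 0.5 + yaw / (2 * yaw_range)))
    ny = min(1.0, max(0.0, 0.5 - pitch / (2 * pitch_range)))
    cx, cy = nx * frame_w, ny * frame_h
    half_w, half_h = box * frame_w / 2, box * frame_h / 2
    # Clamp the box to the frame boundaries
    x0 = int(max(0, cx - half_w))
    y0 = int(max(0, cy - half_h))
    x1 = int(min(frame_w, cx + half_w))
    y1 = int(min(frame_h, cy + half_h))
    return x0, y0, x1, y1
```

The returned rectangle can then be used to crop the OpenCV frame (e.g. `frame[y0:y1, x0:x1]`) before sending it for recognition.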
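The last two steps can be sketched roughly as follows. The `sentence_from_labels` helper is our own illustration of turning labels into natural language; the Vision API call assumes the `google-cloud-vision` client with credentials configured, and `pyttsx3` is just one assumed choice of Text-To-Speech library (the write-up doesn't name one):

```python
def sentence_from_labels(labels, max_items=3):
    """Turn a list of label descriptions into a short spoken sentence."""
    labels = labels[:max_items]
    if not labels:
        return "I don't see anything recognizable."
    if len(labels) == 1:
        return f"I see a {labels[0]}."
    return "I see " + ", ".join(labels[:-1]) + " and " + labels[-1] + "."

def describe(jpeg_bytes):
    """Hedged sketch: label an image with Cloud Vision, then speak the result."""
    from google.cloud import vision
    import pyttsx3
    client = vision.ImageAnnotatorClient()
    response = client.label_detection(image=vision.Image(content=jpeg_bytes))
    text = sentence_from_labels(
        [label.description for label in response.label_annotations])
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```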
Challenges we ran into
- There's a lot of math involved in the project: the orientation we receive from the Myo armband is a quaternion, so we had to convert it to roll, pitch, and yaw before we could use it in our program.
- We also had to triangulate where the user is pointing from the orientation of their arm.
- Finally, we had to fine-tune the sensitivity of the sensors to let the user move their arm comfortably.
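The quaternion-to-Euler conversion mentioned above can be sketched with the standard formulas below. The function name is ours for illustration; the Myo SDK delivers the quaternion components, and we derive roll, pitch, and yaw from them:

```python
import math

def quaternion_to_euler(w, x, y, z):
    """Return (roll, pitch, yaw) in radians from a unit quaternion (w, x, y, z)."""
    # Roll: rotation about the x-axis
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    # Pitch: rotation about the y-axis; clamp to avoid domain errors near +/-90 deg
    t = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    pitch = math.asin(t)
    # Yaw: rotation about the z-axis
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return roll, pitch, yaw
```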
Accomplishments that we're proud of
- During development, we were able to extract the user's pose and orientation from the Myo armband.
- We successfully used the Google Cloud Vision API, then in beta, to accurately recognize objects captured by the camera.
What we learned
Python, OpenCV, the Google Cloud Vision API, and the Linux command line.
What's next for VisualAssist
- We need to apply a Kalman filter to smooth the user's interaction with the system.
- Fist-gesture recognition is not yet 100% accurate and needs improvement.
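The smoothing we have in mind could be as simple as a one-dimensional Kalman filter applied per angle. This is a minimal sketch, with the noise variances chosen purely for illustration, not values tuned for the Myo:

```python
class Kalman1D:
    """Minimal constant-value Kalman filter for smoothing one noisy angle."""

    def __init__(self, q=1e-3, r=1e-1, x0=0.0, p0=1.0):
        self.q, self.r = q, r    # process and measurement noise variances
        self.x, self.p = x0, p0  # state estimate and its variance

    def update(self, z):
        self.p += self.q                 # predict: uncertainty grows
        k = self.p / (self.p + self.r)   # Kalman gain
        self.x += k * (z - self.x)       # correct toward the measurement z
        self.p *= (1.0 - k)
        return self.x
```

One filter per Euler angle, updated with each Myo reading, should damp the jitter without adding much lag.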