Inspiration
We wanted to help blind people by giving them a device that quickly scans their environment and gives them a sense of their surroundings.
What it does
Our program listens for the user's voice. When the computer hears "Vision", it takes a photo of the user's surroundings. It then recognizes the objects captured in the photo and describes them to the user by voice. The user can keep interacting with the program by asking for more information about the surroundings.
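The flow above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the function names (`heard_wake_word`, `describe`, `handle_utterance`) are made up for this example, and the camera, detector, and text-to-speech steps are passed in as plain callables so the sketch does not depend on any particular library.

```python
from typing import Callable, List

WAKE_WORD = "vision"  # the trigger word described above


def heard_wake_word(transcript: str, wake_word: str = WAKE_WORD) -> bool:
    """Return True if the wake word appears as a whole word in the transcript."""
    words = [w.strip(".,!?").lower() for w in transcript.split()]
    return wake_word in words


def describe(labels: List[str]) -> str:
    """Turn a list of detected object names into a sentence to speak."""
    if not labels:
        return "I could not recognize anything."
    return "I can see: " + ", ".join(labels) + "."


def handle_utterance(
    transcript: str,
    take_photo: Callable[[], object],               # hypothetical camera hook
    detect_objects: Callable[[object], List[str]],  # hypothetical detector hook
    speak: Callable[[str], None],                   # hypothetical text-to-speech hook
) -> bool:
    """Run one photo -> detect -> speak cycle if the wake word was heard."""
    if not heard_wake_word(transcript):
        return False
    image = take_photo()
    speak(describe(detect_objects(image)))
    return True
```

In a real build, `transcript` would come from the speech recognizer, and the three hooks would wrap the camera, the object detector, and a speech engine.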
How we built it
We used tensorflow, h5py, opencv, speech_recognition, and portaudio, with imageai for machine-learning object detection.
Challenges we ran into
Speech recognition was very restrictive about when it listened, so we had to make it listen repeatedly. We eventually tuned it until it recognized speech more accurately and quickly. The imageai detection also had to be adapted to our needs: we wanted to report the largest object that is closest to the user and most centered in the frame. We did this by writing a separate algorithm that finds the closest object in an image.
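One way such a selection step could look is sketched below. The scoring formula is an assumption made for illustration, not the exact algorithm we wrote: it treats a larger bounding box as a stand-in for "closer" and multiplies the box's share of the image by how near its center is to the image center. The detection format (a dict with `name` and `box_points` as `[x1, y1, x2, y2]`) is assumed to match what imageai returns, and `prominence` and `pick_main_object` are hypothetical names.

```python
from typing import Dict, List, Optional, Sequence


def prominence(box: Sequence[float], width: int, height: int) -> float:
    """Score a box by relative size and closeness to the image center."""
    x1, y1, x2, y2 = box
    # Fraction of the image covered by the box (larger is assumed to mean closer).
    area_frac = max(0.0, x2 - x1) * max(0.0, y2 - y1) / (width * height)
    # Offset of the box center from the image center, scaled to [-1, 1] per axis.
    dx = ((x1 + x2) / 2 - width / 2) / (width / 2)
    dy = ((y1 + y2) / 2 - height / 2) / (height / 2)
    # Normalized distance: 0 at the image center, 1 at a corner.
    dist = (dx * dx + dy * dy) ** 0.5 / 2 ** 0.5
    return area_frac * (1.0 - dist)


def pick_main_object(
    detections: List[Dict], width: int, height: int
) -> Optional[Dict]:
    """Return the detection with the highest prominence score, or None if empty."""
    if not detections:
        return None
    return max(detections, key=lambda d: prominence(d["box_points"], width, height))
```

For example, in a 640x480 image, a large box in the middle of the frame would be chosen over a small box in a corner.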
Accomplishments that we're proud of
We were working with many APIs, so we had to learn a bit of each and then put them all together. We organized the work well: one group handled speech recognition and input, and the other handled image recognition and output. Once both components were working, we combined them into a working product.
What we learned
While working on this product, we learned how to use object recognition and speech recognition, neither of which we had used before. In addition, we learned how to create an effective website with HTML and CSS.
What's next for Vision
Our goal for Vision is to incorporate hardware, eventually adding a micro-camera attached to a pair of glasses. This would let the user take the product on the go.