Inspiration

Our team is passionate about AI and about helping people. We used this opportunity to build something that helps visually impaired people overcome everyday challenges. The project gave us the chance to explore AI, robotics, and team building, and to create something beyond what we thought possible.

What it does

Sight Sense uses AI to scan a room and identify the obstacles a person is likely to face in day-to-day life. The glasses describe the user’s surroundings through spatial audio, so visually impaired users know both what is around them and where it is, without needing an interpreter.
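To make the spatial audio idea concrete, here is a minimal sketch of one way a detected object's horizontal position could be turned into left/right speaker volumes. The function names, the constant-power panning curve, and the `dead_zone` threshold are our illustrative assumptions, not a description of the exact code running on the glasses.

```python
import math


def pan_from_box(x_min, x_max, frame_width):
    """Map a bounding box's horizontal centre to a pan value in [-1.0, 1.0].

    -1.0 means fully left, 0.0 means centred, 1.0 means fully right.
    """
    center = (x_min + x_max) / 2.0
    pan = 2.0 * center / frame_width - 1.0
    return max(-1.0, min(1.0, pan))  # clamp boxes that spill past the frame


def stereo_gains(pan):
    """Constant-power panning (an assumed curve): returns (left_gain, right_gain)."""
    angle = (pan + 1.0) * math.pi / 4.0  # 0 (full left) .. pi/2 (full right)
    return math.cos(angle), math.sin(angle)


def side_word(pan, dead_zone=0.33):
    """Coarse spoken direction; dead_zone is a hypothetical tuning value."""
    if pan < -dead_zone:
        return "left"
    if pan > dead_zone:
        return "right"
    return "ahead"
```

For example, an object whose box spans the right-hand quarter of a 640-pixel-wide frame gets a pan of 0.75 and would be announced as "right". A pan value in this same range could also be handed to an audio library's panning function instead of computing gains by hand.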

How we built it

Sight Sense was built in Python with a set of libraries for real-time video processing and image labeling, centered on Ultralytics’ YOLOv8 machine learning model. The software runs on a Raspberry Pi attached to our Sight Sense Sunglasses. For audio we used the PyDub and PyAudio libraries, along with SpeechT5, an open-source text-to-speech model from Microsoft.
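The detection loop described above can be sketched roughly as follows. This is a simplified illustration, not our production code: the `yolov8n.pt` weights file, the use of OpenCV for camera capture, the 0.5 confidence threshold, and the helper names `describe` and `run` are all assumptions made for the example. The third-party imports sit inside `run` so the phrase-building helper can be tried without a camera or the model installed.

```python
def describe(labels):
    """Turn detected class names into a short phrase for text-to-speech."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    # Dicts keep insertion order, so objects are listed in the order first seen.
    return ", ".join(f"{n} {name}" for name, n in counts.items())


def run(camera_index=0, weights="yolov8n.pt", min_conf=0.5):
    """Read camera frames, detect objects, and print a phrase per frame."""
    import cv2  # assumed capture library
    from ultralytics import YOLO

    model = YOLO(weights)  # pre-trained YOLOv8 weights (assumed variant)
    cap = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            result = model(frame, verbose=False)[0]
            class_ids = result.boxes.cls.tolist()
            confidences = result.boxes.conf.tolist()
            labels = [
                result.names[int(c)]
                for c, conf in zip(class_ids, confidences)
                if conf >= min_conf
            ]
            if labels:
                print(describe(labels))  # the glasses would speak this instead
    finally:
        cap.release()
```

Calling `run()` on a device with a camera starts the loop; in the real system the printed phrase would be passed to the text-to-speech stage.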

Challenges we ran into

Our main challenge was time management. Slow download speeds for libraries and APIs held us up, but we used the waiting time to work on other important tasks and keep the project moving. In addition, training the AI model on more datasets to detect more hazards made it less efficient, since it had more categories to choose from when labeling an object. After spending a lot of time training our own models, we decided to use a pre-trained one instead for better performance.

Accomplishments that we're proud of

We developed a real-time image recognition AI and deployed the software to a Raspberry Pi to create the Sight Sense Glasses. We also built an AI text-to-speech feature that gives audio feedback about the objects in the user’s vicinity.

What we learned

We learned how to train existing AI models to tailor them to our specific application, and how to process and label our own datasets. This taught us that it takes a lot of time and data not only to train a fast and efficient model, but also to verify and label the large amounts of data the neural network needs. Training is also hardware-intensive, so we learned how to offload that computation to the cloud with Colab.
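As an illustration of the labeling work, here is a small sketch that converts a pixel bounding box into a YOLO-style label line: a class index followed by the box centre, width, and height, each normalized by the image size. The function name and the six-decimal formatting are our choices for this example, and we are assuming the standard YOLO text label format here.

```python
def to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel box to 'class x_center y_center width height' (normalized)."""
    if not (0 <= x_min < x_max <= img_w and 0 <= y_min < y_max <= img_h):
        raise ValueError("box must lie inside the image and have positive size")
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

A check like the `ValueError` above is one cheap way to catch bad annotations early, which matters when a lot of data has to be verified by hand.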

We also learned how to generate AI speech in real time. We cached the generated audio so that subsequent responses for the same phrase are faster.
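The caching idea can be sketched like this. It is a minimal example under stated assumptions: we show an on-disk cache keyed by a hash of the phrase, and `synthesize` is a hypothetical callable standing in for the text-to-speech model (it takes text and returns audio bytes). The class name and file layout are illustrative only.

```python
import hashlib
from pathlib import Path


class SpeechCache:
    """Store synthesized audio on disk so each phrase is generated only once."""

    def __init__(self, cache_dir, synthesize):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.synthesize = synthesize  # hypothetical: text -> audio bytes

    def _path_for(self, text):
        # Normalize so "Chair" and " chair " share one cache entry.
        key = hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()
        return self.cache_dir / f"{key}.wav"

    def get(self, text):
        """Return audio bytes for text, synthesizing only on a cache miss."""
        path = self._path_for(text)
        if not path.exists():
            path.write_bytes(self.synthesize(text))
        return path.read_bytes()
```

Because object names repeat constantly ("chair", "person", "door"), most requests hit the cache after the first few minutes of use, so the slow model call is skipped.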

What's next for Sight Sense

Sight Sense’s next steps are to expand our reach and add support for smart glasses such as Meta Smart Glasses. These devices already have the built-in hardware that Sight Sense needs, so integrating with them would be a key step into this market. Another priority for our team is optimizing the AI: faster recognition and a wider range of recognizable objects.
