Inspiration
Our project began with a simple yet impactful goal: to make the world more navigable for those with visual impairments. We wanted to create an assistive device that could provide real-time information about the environment in a way that was easy to use and accessible. This idea led us to combine some of the most advanced tools in computer vision, machine learning, and audio technology to create an intuitive system for helping blind and visually impaired individuals move more confidently through their surroundings.
What it does
The project is assistive software for people with visual impairments. It uses computer vision to detect nearby objects and estimate their distance and orientation, then provides real-time audio descriptions that help users understand their surroundings and navigate safely.
How we built it
We used YOLOv5 for real-time object detection to accurately identify objects in the environment. OpenCV managed the camera feed and image processing, integrating bounding boxes for visual feedback. For visually impaired users, we added Google Text-to-Speech (gTTS) and Pygame to convert visual data into clear audio prompts.
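As an illustration of how a detection might become an audio prompt, here is a minimal sketch that turns a label and a bounding box into a short phrase. The function names, the `(x_min, y_min, x_max, y_max)` box layout, and the split of the frame into three equal zones are assumptions made for this example, not necessarily what the project does.

```python
def horizontal_direction(box, frame_width):
    """Classify a box as left / ahead / right from its horizontal centre.

    box is assumed to be (x_min, y_min, x_max, y_max) in pixels.
    The three equal-width zones are an illustrative choice.
    """
    x_min, _, x_max, _ = box
    centre_x = (x_min + x_max) / 2.0
    third = frame_width / 3.0
    if centre_x < third:
        return "on your left"
    if centre_x > 2 * third:
        return "on your right"
    return "ahead"


def describe_detection(label, box, frame_width):
    """Build the sentence that would be handed to a text-to-speech engine."""
    return f"{label} {horizontal_direction(box, frame_width)}"


if __name__ == "__main__":
    # A box hugging the left edge of a 640-pixel-wide frame.
    print(describe_detection("person", (0, 0, 100, 200), 640))
```

The resulting string is what would be passed to a text-to-speech step such as gTTS and then played back.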
Distance estimation was calibrated for a single camera to make the measurements as accurate as possible. This combination of technologies gave users verbal cues about object distances and directions, helping them navigate safely and independently.
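One common way to estimate distance from a single camera is the pinhole (similar-triangles) approximation: an object of known real width appears narrower in pixels the farther away it is. The sketch below assumes that approach, which may differ from the method the project actually used. The table of typical object widths and the focal length value are hypothetical.

```python
# Hypothetical typical widths in metres, for illustration only.
KNOWN_WIDTHS_M = {"person": 0.45, "chair": 0.50}


def estimate_distance_m(label, box_width_px, focal_length_px):
    """Estimate distance with the pinhole model.

    distance = real_width * focal_length / width_in_pixels

    Returns None when the object's real width is unknown or the
    box width is not positive, since no estimate is possible then.
    """
    real_width_m = KNOWN_WIDTHS_M.get(label)
    if real_width_m is None or box_width_px <= 0:
        return None
    return real_width_m * focal_length_px / box_width_px


if __name__ == "__main__":
    # A person whose box is 90 px wide, with an assumed 600 px focal length.
    print(estimate_distance_m("person", 90, 600))
```

The estimate is only as good as the assumed real width, so objects whose size varies a lot will produce larger errors.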
Challenges we ran into
Using a single camera made it hard to calculate distances accurately, especially for nearby objects. Without the depth information that multiple cameras or dedicated sensors provide, it was difficult to gauge proximity reliably. To overcome this, we experimented with various distance estimation techniques and calibration methods.
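As a rough example of what one calibration method can look like, the sketch below derives an effective focal length in pixels from reference measurements: an object of known width is placed at a known distance and its width in the image is recorded. This is an assumed procedure under the pinhole model, not a description of the project's calibration, and the sample values are made up.

```python
def calibrate_focal_length_px(samples):
    """Average the focal length implied by each reference measurement.

    Each sample is (box_width_px, known_distance_m, real_width_m).
    Under the pinhole model: focal = box_width_px * distance / real_width.
    """
    if not samples:
        raise ValueError("at least one calibration sample is required")
    estimates = [
        box_width_px * distance_m / real_width_m
        for box_width_px, distance_m, real_width_m in samples
    ]
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Made-up measurements of a 0.5 m wide object at 2 m and 3 m.
    reference = [(150, 2.0, 0.5), (100, 3.0, 0.5)]
    print(calibrate_focal_length_px(reference))
```

Averaging over several distances smooths out measurement noise in any single reference shot.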
What we learned
We learned how to use YOLOv5 for real-time object detection, OpenCV for processing video input and drawing bounding boxes, and gTTS combined with Pygame to deliver audio feedback. The video input from a single camera was processed to detect and classify objects, estimate their distance, and provide spatial information to the user. We carefully calibrated the distance calculations and tuned the YOLOv5 model to identify objects accurately while making sure the information came across clearly through audio cues.
What's next for SeeAgain
We have a few ideas for expanding this project into fully functional, viable software. First, we would like to improve the current software, especially the accuracy of the distance calculations. We would also like to expand into indoor navigation and mapping.