Inspiration
We were inspired by the everyday challenges visually impaired individuals face when navigating unfamiliar environments. EchoSight was built to empower them with real-time, AI-generated narration of their surroundings — giving them a way to "see" through sound.
What it does
EchoSight is a headset-powered assistive tool that captures images of the environment every few seconds, uses AI to detect nearby objects, and provides short, spoken descriptions of what’s in front of the user. It helps blind users better understand the space around them through sound.
How we built it
We built the user-facing application for AR glasses in Unity; it captures a camera frame every few seconds and sends it to the backend. A FastAPI backend runs a YOLOv8 object detection model to identify key objects in each image. Google’s Gemini LLM then turns these detections into a simple, friendly sentence, which gTTS converts to speech. The audio file is sent back to Unity and played through the device.
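To make the detection-to-narration step more concrete, here is a minimal sketch of how detector output might be condensed into a prompt for the LLM. It is not our actual backend code: the `Detection` type, the `summarize` and `build_prompt` helpers, the confidence threshold, and the prompt wording are all assumptions made for illustration, and the real YOLOv8, Gemini, and gTTS calls are left out so the example runs on its own.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for one detector result."""
    label: str         # class name reported by the detector, e.g. "chair"
    confidence: float  # detector score in [0, 1]


def summarize(detections, min_confidence=0.5, max_objects=3):
    """Drop low-confidence detections and count the rest per label.

    Returns at most `max_objects` (label, count) pairs, most frequent first.
    The threshold and limit are illustrative defaults, not tuned values.
    """
    counts = Counter(d.label for d in detections if d.confidence >= min_confidence)
    return counts.most_common(max_objects)


def build_prompt(summary):
    """Build the text that would be sent to the LLM (wording is an assumption)."""
    if not summary:
        return "No objects were detected. Say so in one short, friendly sentence."
    listing = ", ".join(f"{count} x {label}" for label, count in summary)
    return (
        "Describe these objects to a blind user in one short, friendly sentence: "
        + listing
    )


# Example frame: two chairs, a dog, and a low-confidence cup that gets filtered out.
detections = [
    Detection("chair", 0.91),
    Detection("chair", 0.78),
    Detection("dog", 0.64),
    Detection("cup", 0.21),
]
summary = summarize(detections)
prompt = build_prompt(summary)
print(prompt)
```

Capping the number of objects per frame is one simple way to keep the narration short; the sentence returned by the LLM would then be passed to the text-to-speech step.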
Challenges we ran into
1- Getting AR camera feeds to work reliably across devices
2- Ensuring accurate object detection in low-light or blurry frames
3- Making AI narration accessible and non-overwhelming
4- Managing latency between capture and audio playback
Accomplishments that we're proud of
1- Creating a complete pipeline from image capture → detection → narration → playback
2- Integrating YOLO and Gemini in a way that feels natural and human
3- Designing a project that can genuinely help people and be extended into a real assistive tool
What we learned
1- How to use YOLOv8 for real-time image understanding
2- How to integrate LLMs (Google Gemini) into assistive applications
3- How to balance technical performance with user accessibility
4- That sometimes simplicity wins: the best tech is invisible
What's next for EchoSight
1- Add direction-aware narration: “A dog is to your left.”
2- Explore voice cloning for personalized narration
3- Add object memory so the system doesn’t repeat the same thing
4- Deploy on smart glasses like Magic Leap or Quest Pro
5- Partner with accessibility orgs to get real feedback from blind users
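As a rough idea of how direction-aware narration and object memory could work together, the sketch below maps the horizontal centre of a bounding box to a coarse direction and suppresses announcements that were made recently. Nothing here is implemented yet: the `direction_of`, `narrate`, and `ObjectMemory` names, the split of the image into thirds, and the 10-second cooldown are all assumptions chosen for illustration.

```python
import time


def direction_of(x_center, image_width):
    """Map a bounding-box centre (in pixels) to a coarse direction by image thirds."""
    if x_center < image_width / 3:
        return "to your left"
    if x_center > 2 * image_width / 3:
        return "to your right"
    return "ahead of you"


def narrate(label, x_center, image_width):
    """Build a short sentence such as 'A dog is to your left.' (label must be non-empty)."""
    article = "An" if label[0].lower() in "aeiou" else "A"
    return f"{article} {label} is {direction_of(x_center, image_width)}."


class ObjectMemory:
    """Remember recently announced (label, direction) pairs to avoid repeats."""

    def __init__(self, cooldown_seconds=10.0):
        self.cooldown_seconds = cooldown_seconds  # illustrative value, not tuned
        self._last_announced = {}

    def should_announce(self, label, direction, now=None):
        """Return True if this pair has not been announced within the cooldown."""
        now = time.monotonic() if now is None else now
        key = (label, direction)
        last = self._last_announced.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False
        self._last_announced[key] = now
        return True


memory = ObjectMemory(cooldown_seconds=10.0)
sentence = narrate("dog", x_center=100, image_width=640)
if memory.should_announce("dog", direction_of(100, 640)):
    print(sentence)
```

Keying the memory on both label and direction means the same object would be announced again if it moves to a different part of the frame, which seems like a reasonable starting point before adding real object tracking.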