Inspiration

We wanted to build something meaningful for people with low vision who need safer support while walking. Many existing tools are either too expensive, too complex, or impractical for daily use, so we explored a simpler voice-first approach.

What it does

Audio VisionAssist looks at a walking scene and turns important moments into short spoken guidance. It focuses on the most useful alerts, updates them as the situation changes, and avoids repeating the same message over and over.

How we built it

We built a cloud pipeline that reads video frames, understands what is happening, and generates voice alerts. Those alerts are stored and updated continuously, while a local player fetches and plays them to simulate a real walking assistant.
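As a rough sketch of that flow (the names below are illustrative, not our actual AWS code, and the model call is stubbed out), the pipeline can be modeled as a loop that samples frames, asks a scene model for a description, and publishes only changed alerts for the player to poll:

```python
from dataclasses import dataclass, field

@dataclass
class AlertStore:
    """Shared store the cloud flow writes to and the local player polls."""
    latest: str = ""
    history: list = field(default_factory=list)

    def publish(self, message: str) -> None:
        self.latest = message
        self.history.append(message)

def describe_scene(frame: str) -> str:
    """Placeholder for the scene-understanding model call (in our flow,
    a cloud vision model); here it just echoes a frame label."""
    return f"Caution: {frame} ahead"

def run_pipeline(frames, store: AlertStore) -> None:
    """Sample frames, generate guidance, and publish only new alerts."""
    for frame in frames:
        alert = describe_scene(frame)
        if alert != store.latest:   # skip repeats of the current alert
            store.publish(alert)

store = AlertStore()
run_pipeline(["curb", "curb", "cyclist", "cyclist", "curb"], store)
print(store.history)  # only scene *changes* become spoken alerts
```

In the real system the player side would fetch `store.latest` on a timer and hand it to text-to-speech; the comparison against the last published alert is what keeps the audio channel quiet while the scene is unchanged.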

Challenges we ran into

Getting smooth timing between scene updates and spoken alerts was challenging. During testing, API usage and rate limits forced us to tune how often frames were sampled. Another challenge was reducing repetitive voice output while still reacting quickly to new risks.
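One way to frame that repetition trade-off (a simplified sketch, not our exact implementation) is a gate that suppresses a repeated alert within a cooldown window but lets any genuinely new message through immediately:

```python
class AlertGate:
    """Suppress repeats of the same alert within a cooldown window,
    while always letting new messages through immediately."""

    def __init__(self, cooldown_s: float = 5.0):
        self.cooldown_s = cooldown_s
        self.last_message = None
        self.last_time = float("-inf")

    def should_speak(self, message: str, now: float) -> bool:
        is_repeat = message == self.last_message
        too_soon = (now - self.last_time) < self.cooldown_s
        if is_repeat and too_soon:
            return False               # same risk, spoken recently: stay quiet
        self.last_message = message
        self.last_time = now
        return True

gate = AlertGate(cooldown_s=5.0)
print(gate.should_speak("curb ahead", now=0.0))    # True: first mention
print(gate.should_speak("curb ahead", now=2.0))    # False: repeat, too soon
print(gate.should_speak("cyclist left", now=3.0))  # True: new risk, immediate
print(gate.should_speak("curb ahead", now=9.0))    # True: scene changed back
```

Tuning the cooldown is the same balance described above: too short and the assistant nags, too long and it stays silent about a hazard that is still there.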

Accomplishments that we're proud of

We turned the idea into a working end-to-end demo. The system provides clear spoken guidance and feels responsive even in our limited test setup. We also designed the architecture to grow toward real edge-device deployment.

What we learned

Clear, short voice guidance matters more than exhaustive detail, and smart filtering of alerts is essential for trust and usability. We also learned a great deal about the capabilities of the Nova models and the AWS generative AI tools ecosystem.

What's next for Audio VisionAssist

Move from the demo setup to a compact edge-device setup for faster, more private use. Improve guidance quality with richer scene understanding and better directional cues. Test with real users, gather feedback, and refine for everyday walking support.
