Inspiration
We wanted to build something meaningful for people with low vision who need safer support while walking. Many existing tools are either too expensive, too complex, or impractical for daily use, so we explored a simpler voice-first approach.
What it does
Audio VisionAssist analyzes a walking scene and turns important moments into short spoken guidance. It focuses on useful alerts, updates them as the situation changes, and avoids repeating the same message over and over.
How we built it
We built a cloud pipeline that reads video frames, interprets what is happening, and generates voice alerts. The alerts are stored and updated continuously, while a local player fetches and plays them to simulate a real walking assistant.
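The local player side of this flow is essentially a polling loop: fetch the newest alert, speak it once, and ignore it on subsequent polls until a new one appears. A rough sketch, where `fetch_latest_alert` and `speak` are hypothetical stand-ins for the real cloud fetch and text-to-speech calls:

```python
import time

POLL_INTERVAL_S = 1.0  # illustrative polling rate, not the project's setting


def run_player(fetch_latest_alert, speak, max_iterations=None):
    """Poll for the newest alert and speak each distinct alert once.

    fetch_latest_alert() is assumed to return a dict like
    {"id": ..., "text": ...} or None; speak(text) plays it aloud.
    max_iterations is only for bounded demo runs and tests.
    """
    last_id = None
    spoken = []
    i = 0
    while max_iterations is None or i < max_iterations:
        alert = fetch_latest_alert()
        if alert and alert["id"] != last_id:
            speak(alert["text"])
            spoken.append(alert["text"])
            last_id = alert["id"]
        i += 1
        if max_iterations is None:
            time.sleep(POLL_INTERVAL_S)  # pace the polling in a real run
    return spoken
```

Tracking the last alert ID keeps the player from re-speaking an alert that the cloud side has not changed between polls.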
Challenges we ran into
Getting smooth timing between scene updates and spoken alerts was challenging. During testing, usage and rate limits forced us to tune how often frames were checked. Another challenge was reducing repetitive voice output while still reacting quickly to new risks.
Accomplishments that we're proud of
We turned the idea into a working end-to-end demo. The system provides clear spoken guidance and feels seamless even in our limited test settings. We also designed the architecture to grow toward real edge-device deployment.
What we learned
Clear, short voice guidance matters more than exhaustive detail, and smart filtering of alerts is essential for trust and usability. We also learned about the capabilities of the Nova model and the AWS generative AI tools ecosystem.
What's next for Audio VisionAssist
Move from the demo to a compact edge-device deployment for faster, more private use. Improve guidance quality with richer scene understanding and better directional cues. Test with real users, gather feedback, and refine the system for everyday walking support.