Inspiration

We were inspired by the challenges visually impaired individuals face daily when using technology or navigating unfamiliar environments. Existing tools like screen readers and voice assistants often lack real-time awareness and intuitive design. We set out to build a smart, voice-first, multimodal assistant that empowers users through AI—not just accommodates them.

What it does

Our system enables hands-free, eyes-free interaction with the environment using a blend of AI-powered technologies. It provides:

1. Real-time object recognition and verbal identification
2. Scene understanding (e.g., “There is a chair to your left”; one possible mapping is sketched after this list)
3. OCR + text-to-speech for reading printed materials
4. Haptic feedback for alerts and directional guidance
5. A conversational interface powered by NLP for natural dialogue

It’s designed to work offline, ensuring fast and private interactions.
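
As a rough illustration of the scene-understanding step, here is a minimal sketch of how a detection could be turned into a spoken directional phrase. The function name, thresholds, and detection format (a label plus a bounding box in a frame of known width) are illustrative assumptions, not the project's exact code.

```python
# Minimal sketch: map an object detection to a directional phrase.
# Thresholds and the detection format are illustrative assumptions.

def direction_phrase(label, x_min, x_max, frame_width):
    """Map a bounding box's horizontal center to a spoken direction."""
    center = (x_min + x_max) / 2 / frame_width  # normalize to 0..1
    if center < 0.33:
        side = "to your left"
    elif center > 0.66:
        side = "to your right"
    else:
        side = "ahead of you"
    return f"There is a {label} {side}"

# Example: a chair detected in the left third of a 640-px-wide frame
print(direction_phrase("chair", x_min=40, x_max=180, frame_width=640))
# -> "There is a chair to your left"
```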

How we built it

- Computer Vision: YOLOv5 and OpenCV for object detection
- Speech & NLP: Google Speech-to-Text, with Whisper as an offline fallback (sketched below), and a GPT-style model for understanding user intent
- TTS: Responsive, expressive text-to-speech
- Haptics: Arduino + vibration motors for directional cues
- Mobile App: Built in Flutter with Android integration
- Edge AI Optimization: TensorFlow Lite for mobile performance
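
A minimal sketch of the cloud-first, offline-fallback ASR chain described above, assuming the google-cloud-speech and openai-whisper Python packages; the audio settings, model size, and broad exception handling are illustrative, not our exact implementation.

```python
# Minimal sketch: cloud ASR with an offline Whisper fallback.
# Audio settings and model size are illustrative assumptions.
import whisper  # openai-whisper (offline fallback)


def transcribe(path: str) -> str:
    try:
        # Primary: Google Cloud Speech-to-Text (needs network + credentials)
        from google.cloud import speech

        client = speech.SpeechClient()
        with open(path, "rb") as f:
            audio = speech.RecognitionAudio(content=f.read())
        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=16000,
            language_code="en-US",
        )
        response = client.recognize(config=config, audio=audio)
        return " ".join(r.alternatives[0].transcript for r in response.results)
    except Exception:
        # Fallback: local Whisper model, so the assistant keeps working offline
        model = whisper.load_model("base")
        return model.transcribe(path)["text"]
```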

Challenges we ran into

- Getting accurate ASR in noisy environments
- Optimizing large models to run on-device with low latency (see the quantization sketch after this list)
- Integrating multiple input/output modalities without overloading the user
- Designing haptic feedback that’s intuitive and meaningful
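
For the on-device latency challenge, one standard way to shrink a model with TensorFlow Lite is post-training quantization; the sketch below assumes a SavedModel export and uses placeholder paths.

```python
# Minimal sketch: shrink a trained model for on-device inference with
# TensorFlow Lite post-training quantization. Paths are placeholders.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("exported_model/")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # weight quantization
tflite_model = converter.convert()

with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)
```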

Accomplishments that we're proud of

- Delivered a working prototype that runs offline
- Successfully combined voice, vision, and touch in one interface
- Blindfolded test users could find objects and navigate a space independently
- Created a platform that balances technical performance with real-world accessibility

What we learned

- Multimodal design dramatically improves accessibility
- Real-world testing with edge cases is crucial
- AI needs to be adaptive to truly serve diverse users
- Accessibility design is not a constraint: it's an innovation driver

What’s next

- Integrate with wearables like smart glasses
- Add gesture control for non-verbal input
- Expand language support and user personalization
- Conduct field tests with visually impaired users
- Open-source the platform to grow community contributions

Built With

android, arduino, flutter, google-speech-to-text, opencv, tensorflow-lite, whisper, yolov5