Inspiration
Traditional mobility tools like walking sticks or basic GPS devices don't offer enough contextual awareness for blind or visually impaired individuals. We wanted to build something more intelligent: a system that not only guides users but also recognizes the people and objects around them, providing a richer, safer, and more independent experience.
What it does
WalkMate is a voice-activated smart assistant for the blind and visually impaired. It:
- Recognizes familiar faces and identifies relationships.
- Helps dementia patients by providing contextual cues about people nearby.
- Provides GPS-based navigation with real-time voice updates.
- Detects nearby obstacles using depth estimation and warns the user.
- Offers direction-based guidance to nearby objects like chairs, doors, or exits.
- Optionally chats with users using Google’s Gemini AI for extra support.
How we built it
WalkMate is a multi-threaded Python system that integrates several powerful technologies:
- Face Recognition: OpenCV + LBPH model trained on a local dataset (a sketch follows this list).
- Object Detection: YOLOv5 with real-time bounding boxes + Depth-Anything for depth mapping (see the detection sketch below).
- Voice Commands: speech_recognition + pyttsx3, with the OpenMind API for bidirectional audio.
- Navigation: Google Maps API + geolocation with geopy and geocoder (a distance sketch follows this list).
- Gemini AI (Optional): Text/voice chat integration with Google's generative language model (sketched below).
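A minimal sketch of the LBPH pipeline, assuming a dataset laid out as one folder per person (the `faces/` path and the distance cutoff are illustrative, not the exact values from our training script):

```python
import os

import cv2
import numpy as np

# Requires opencv-contrib-python for the cv2.face module.
recognizer = cv2.face.LBPHFaceRecognizer_create()
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

# Assumed layout: faces/<person_name>/*.jpg
images, labels, names = [], [], {}
for label, person in enumerate(sorted(os.listdir("faces"))):
    names[label] = person
    for filename in os.listdir(os.path.join("faces", person)):
        gray = cv2.imread(os.path.join("faces", person, filename),
                          cv2.IMREAD_GRAYSCALE)
        for (x, y, w, h) in detector.detectMultiScale(gray, 1.3, 5):
            images.append(gray[y:y + h, x:x + w])
            labels.append(label)

recognizer.train(images, np.array(labels))

# At runtime, a lower LBPH "confidence" value means a closer match.
label, distance = recognizer.predict(images[0])
if distance < 70:  # cutoff tuned empirically
    print(f"Recognized {names[label]} (distance {distance:.1f})")
```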
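On the detection side, a sketch of how YOLOv5 boxes become direction-based guidance, splitting the frame into thirds to say whether an object is left, ahead, or right (the input image path is a placeholder):

```python
import cv2
import torch

# Pretrained YOLOv5s via torch.hub (weights download on first run).
model = torch.hub.load("ultralytics/yolov5", "yolov5s")

def directions(frame):
    """Yield spoken-friendly phrases like 'chair ahead' from one frame."""
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    width = frame.shape[1]
    for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
        center = (x1 + x2) / 2
        if center < width / 3:
            side = "on your left"
        elif center > 2 * width / 3:
            side = "on your right"
        else:
            side = "ahead"
        yield f"{model.names[int(cls)]} {side}"

frame = cv2.imread("street.jpg")  # placeholder input frame
for phrase in directions(frame):
    print(phrase)  # in WalkMate these phrases feed the speech queue
```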
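For navigation, a hedged sketch of how geopy and geocoder fit together; the real app gets turn-by-turn steps from the Google Maps API, so this shows only rough positioning and distance to a destination:

```python
import geocoder
from geopy.distance import geodesic
from geopy.geocoders import Nominatim

# Rough current position from the machine's IP; the device GPS is
# used in practice for better accuracy.
here = geocoder.ip("me").latlng  # [lat, lng]

# Geocode a spoken destination (Nominatim used here for illustration).
destination = Nominatim(user_agent="walkmate").geocode("Central Park, New York")

meters = geodesic(here, (destination.latitude, destination.longitude)).meters
print(f"Destination is about {meters:.0f} meters away")
```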
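The optional Gemini chat is a thin wrapper over the google-generativeai client. A sketch assuming an API key in the environment; the model name is illustrative and may differ by release:

```python
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model name

def chat_reply(prompt: str) -> str:
    """One-shot text reply; WalkMate reads the result aloud."""
    return model.generate_content(prompt).text

print(chat_reply("In one sentence, how do I cross at a signalized crosswalk?"))
```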
We built and connected multiple modules (app.py, map.py, sub.py) with shared speech queues and voice interactions to deliver a unified, real-time assistive experience.
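The glue between those modules is a shared queue: any thread can enqueue a phrase, while a single worker owns the pyttsx3 engine so speech never overlaps. A minimal sketch of the pattern (the exact wiring across app.py, map.py, and sub.py differs):

```python
import queue
import threading

import pyttsx3
import speech_recognition as sr

speech_queue: "queue.Queue[str]" = queue.Queue()

def speaker():
    """Single consumer: serializes all text-to-speech output."""
    engine = pyttsx3.init()
    while True:
        engine.say(speech_queue.get())
        engine.runAndWait()

def listener():
    """Producer: turns microphone audio into commands."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as mic:
        recognizer.adjust_for_ambient_noise(mic)
        while True:
            audio = recognizer.listen(mic, phrase_time_limit=5)
            try:
                command = recognizer.recognize_google(audio)
                speech_queue.put(f"You said {command}")
            except sr.UnknownValueError:
                pass  # ignore unintelligible audio

threading.Thread(target=speaker, daemon=True).start()
listener()  # blocks; vision and GPS threads enqueue phrases the same way
```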
Challenges we ran into
- Synchronizing multiple real-time inputs (voice, video, and GPS) was tricky.
- Depth estimation needed per-object adaptive thresholds to avoid false warnings (a sketch follows this list).
- Voice recognition required filtering and timing logic to reduce misfires.
- Training the face recognizer for high accuracy while maintaining performance was a challenge.
- GPS precision limitations in indoor environments restricted full navigation testing.
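For the depth tuning above, the fix was to gate warnings on a per-class threshold over each object's box rather than one global cutoff. A sketch of the idea; the threshold values are hypothetical, and `depth_map` is assumed to be Depth-Anything's relative inverse-depth output, where larger values mean closer:

```python
import numpy as np

# Hypothetical per-class cutoffs on relative (inverse) depth.
NEAR_THRESHOLD = {"person": 0.55, "chair": 0.65, "door": 0.70}
DEFAULT_THRESHOLD = 0.60

def should_warn(label: str, box, depth_map: np.ndarray) -> bool:
    """Warn only when the median closeness inside the box crosses the
    class-specific cutoff; the median resists stray depth pixels."""
    x1, y1, x2, y2 = map(int, box)
    region = depth_map[y1:y2, x1:x2]
    if region.size == 0:
        return False
    return float(np.median(region)) > NEAR_THRESHOLD.get(label, DEFAULT_THRESHOLD)

# Example: a person filling the box at closeness 0.7 triggers a warning.
demo = np.full((100, 100), 0.7, dtype=np.float32)
print(should_warn("person", (10, 10, 90, 90), demo))  # True
```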
Accomplishments that we're proud of
- Built a real-time, voice-controlled, multi-sensory assistant from scratch.
- Successfully integrated depth perception with object recognition for smarter alerts.
- Created a facial memory system that supports dementia patients with context.
- Developed voice navigation with real-time map data and auditory instructions.
- Modular and extensible code structure ready for future integration with hardware or wearables.
What we learned
- How to integrate voice recognition, computer vision, and depth mapping in a single flow.
- How to build accessible technology that really considers user context and usability.
- The importance of modular design for scalability and team collaboration.
- Hands-on experience with multiple APIs and models, including Google Maps, OpenMind, YOLOv5, Depth-Anything, and Gemini.
What's next for WalkMate
- Integrating with wearables like smart glasses or phones for on-the-go usage.
- Adding live obstacle tracking with LiDAR or IR sensors for real-world testing.
- Supporting offline maps and localized navigation.
- Expanding face memory with more contextual cues (e.g., workplace, family).
- Offering real-time emergency alerts or fall detection features.
Built With
- flask
- gemini
- machine-learning
- opencv
- python
- pyttsx3
- yolov5