Sight Sync: An AI-Powered Assistive Spectacle
Empowering the Visually Impaired
Millions of visually impaired people around the world face daily challenges: reading printed text, locating misplaced belongings, and safely navigating unfamiliar places. Inspired by their struggle for independence, we built Sight Sync, a wearable smart assistant that acts like an extra pair of eyes, powered by **AI, MongoDB, and Google Cloud**.
Key Features
Real-Time Object Detection
Identifies everyday objects around the user using a stereo vision camera and YOLOv3.
Obstacle Awareness
Calculates depth maps with dual cameras to detect obstacles and provide voice alerts.
Text Reading
Converts printed or handwritten text to natural speech with Tesseract OCR and Google Text-to-Speech.
Voice-Guided Navigation
Uses Google Maps API for live, turn-by-turn directions and dynamic rerouting.
Smart Visual Memory
Stores object detections and embeddings in MongoDB Atlas, and retrieves similar scenes using Atlas Vector Search, answering questions like “Where did I last see my keys?”
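As a rough illustration of the Smart Visual Memory idea, the Python sketch below builds an Atlas Vector Search aggregation pipeline for a "Where did I last see my keys?" style query. The `$vectorSearch` stage and its fields are MongoDB Atlas's; the index name `scene_index`, the field names `embedding`, `label`, and `location`, and the tiny 3-dimensional sample vectors are hypothetical, since this write-up does not give the actual schema. The local cosine-similarity ranking is only a stand-in that shows conceptually what the search returns, without needing a cluster.

```python
import math

# Made-up sample documents; real embeddings would have hundreds of dimensions.
SIGHTINGS = [
    {"label": "keys", "location": "kitchen counter", "embedding": [0.9, 0.1, 0.0]},
    {"label": "keys", "location": "hallway shelf", "embedding": [0.1, 0.9, 0.0]},
    {"label": "phone", "location": "desk", "embedding": [0.9, 0.1, 0.1]},
]


def build_vector_search_pipeline(query_vector, label, limit=3):
    """Build an aggregation pipeline for Atlas Vector Search (assumed schema)."""
    return [
        {
            "$vectorSearch": {
                "index": "scene_index",        # hypothetical index name
                "path": "embedding",           # hypothetical vector field
                "queryVector": list(query_vector),
                "numCandidates": 50 * limit,   # neighbors considered before ranking
                "limit": limit,
                # Pre-filter; `label` must be declared as a filter field in the index.
                "filter": {"label": {"$eq": label}},
            }
        },
        {
            "$project": {
                "_id": 0,
                "label": 1,
                "location": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_locally(query_vector, label, documents, limit=3):
    """Local stand-in for the search: filter by label, sort by similarity."""
    matches = [d for d in documents if d["label"] == label]
    matches.sort(
        key=lambda d: cosine_similarity(query_vector, d["embedding"]),
        reverse=True,
    )
    return matches[:limit]


if __name__ == "__main__":
    print(build_vector_search_pipeline([1.0, 0.0, 0.0], "keys", limit=2))
    print(rank_locally([1.0, 0.0, 0.0], "keys", SIGHTINGS, limit=1))
```

With a driver such as PyMongo, a pipeline like this would typically be passed to `collection.aggregate(pipeline)`; the query vector would come from whatever embedding model produced the stored vectors.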
How It Works
Edge AI:
YOLOv3 is trained with the COCO 2014 dataset and converted to TensorFlow Lite for fast inference on a Raspberry Pi.
Depth Estimation:
Stereo cameras capture offset images to build depth maps and detect obstacles within safety limits.
Text-to-Speech:
Tesseract OCR extracts text; Google TTS reads it aloud in the user’s chosen language.
Navigation:
Google Maps API plans the safest routes and delivers clear, real-time voice instructions.
Data Management:
Detection logs, OCR text, GPS coordinates, commands, and vector embeddings are securely stored in MongoDB Atlas. Atlas Vector Search enables similarity queries, while Atlas Search handles text lookups.
Hands-Free Control:
Users interact via voice commands and speech recognition, with no buttons needed.
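To make the depth estimation step above more concrete, here is a minimal Python sketch of the standard stereo relation for a rectified camera pair: depth Z = f · B / d, where f is the focal length in pixels, B is the distance between the two cameras in meters, and d is the disparity in pixels. The focal length (700 px), baseline (0.06 m), and 1.5 m safety limit are illustrative assumptions, not values from the Sight Sync prototype, and the function names are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Return depth in meters via Z = f * B / d, or None if there is no valid match."""
    if disparity_px <= 0:
        return None  # zero or negative disparity means the matcher found nothing
    return focal_px * baseline_m / disparity_px


def nearest_obstacle(disparity_map, focal_px, baseline_m):
    """Return the smallest valid depth in a 2D disparity map, or None."""
    depths = [
        depth_from_disparity(d, focal_px, baseline_m)
        for row in disparity_map
        for d in row
    ]
    valid = [z for z in depths if z is not None]
    return min(valid) if valid else None


def obstacle_alert(disparity_map, focal_px=700.0, baseline_m=0.06,
                   safety_limit_m=1.5):
    """Return a spoken-alert string if anything is closer than the safety limit."""
    nearest = nearest_obstacle(disparity_map, focal_px, baseline_m)
    if nearest is not None and nearest < safety_limit_m:
        return f"Obstacle ahead, about {nearest:.1f} meters"
    return None


if __name__ == "__main__":
    # Tiny made-up disparity map (pixels); 0 marks an unmatched pixel.
    print(obstacle_alert([[10, 20], [0, 42]]))
```

In a real pipeline the disparity map would come from a stereo matcher, for example OpenCV's block-matching classes; note that those return fixed-point disparities scaled by 16, so the values would need dividing by 16 before applying the formula.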
Integrated Google AI Services
Google Text-to-Speech:
High-quality, multi-language speech synthesis for reading text aloud.
Google Maps API:
AI-enhanced route planning, traffic prediction, and accessible navigation.
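The sketch below shows one way turn-by-turn results could be turned into short phrases for speech output. It assumes the Google Maps Directions web service is the endpoint in use (this write-up only says "Google Maps API"); the response shape (`routes`, `legs`, `steps`, `html_instructions`, `distance.text`) follows that service, while the function names and the sample response are made up for illustration.

```python
import html
import re
from urllib.parse import urlencode

DIRECTIONS_URL = "https://maps.googleapis.com/maps/api/directions/json"

# Made-up response fragment in the Directions API's JSON shape.
SAMPLE_RESPONSE = {
    "status": "OK",
    "routes": [{"legs": [{"steps": [
        {"html_instructions": "Head <b>north</b> on <b>Main St</b>",
         "distance": {"text": "0.2 km"}},
        {"html_instructions": "Turn <b>left</b> onto <b>Oak Ave</b>",
         "distance": {"text": "50 m"}},
    ]}]}],
}


def build_directions_url(origin, destination, api_key):
    """Build a walking-directions request URL (the key is a placeholder)."""
    params = {
        "origin": origin,
        "destination": destination,
        "mode": "walking",
        "key": api_key,
    }
    return f"{DIRECTIONS_URL}?{urlencode(params)}"


def step_to_prompt(step):
    """Strip HTML markup from one step and append its distance."""
    text = re.sub(r"<[^>]+>", " ", step["html_instructions"])
    text = html.unescape(text)
    text = " ".join(text.split())  # collapse the whitespace left by removed tags
    return f"{text}, for {step['distance']['text']}"


def route_to_prompts(response):
    """Convert the first route of a response into a list of spoken prompts."""
    if response.get("status") != "OK" or not response.get("routes"):
        return ["Sorry, no route was found."]
    prompts = []
    for leg in response["routes"][0]["legs"]:
        prompts.extend(step_to_prompt(step) for step in leg["steps"])
    return prompts


if __name__ == "__main__":
    for prompt in route_to_prompts(SAMPLE_RESPONSE):
        print(prompt)
```

Each resulting string could then be handed to the text-to-speech step, one prompt at a time, as the user approaches the corresponding turn.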
Challenges and Solutions
Building Sight Sync involved solving several tough challenges:
- Running high-accuracy detection models on low-power edge devices.
- Achieving reliable depth perception under diverse lighting.
- Providing real-time voice alerts without lag.
- Merging local AI processing with cloud-based MongoDB Vector Search smoothly.
- Ensuring timely obstacle alerts and safe route guidance.
Through model optimization, efficient coding, and rigorous testing, these challenges were tackled to deliver a practical, robust solution.
Achievements and Impact
- Developed a working wearable prototype combining object detection, obstacle detection, OCR, TTS, and live navigation.
- Integrated MongoDB Vector Search for unique visual memory capabilities.
- Delivered a fully voice-controlled, accessible user experience.
- Designed a hybrid architecture blending edge AI with cloud-powered search and Google APIs.
Insights Gained
Through this journey, we gained valuable experience in:
- Deploying AI models on edge hardware.
- Integrating cloud databases with real-time inference.
- Using MongoDB Atlas for text and vector search.
- Designing user-friendly assistive technology for real-world mobility support.
Roadmap Ahead
What’s next for Sight Sync:
- Expand multi-language OCR and TTS coverage for global users.
- Log user feedback in MongoDB to improve models continuously.
- Build a companion mobile app for caregivers.
- Refine the hardware for lighter weight and longer battery life.
- Partner with organizations to test and refine Sight Sync with real users.
Technologies Used
| Component | Details |
|---|---|
| AI Models | YOLOv3, TensorFlow Lite, Tesseract OCR |
| Database | MongoDB Atlas, Atlas Vector Search |
| Google AI | Google Text-to-Speech, Google Maps API |
| Hardware | Raspberry Pi, stereo camera module, microphone, speakers |
Thank You
Thank you for exploring Sight Sync — a step forward in accessible, intelligent mobility for all.
Built With
- coco-dataset
- google-cloud-text-to-speech-api
- google-maps-api
- microphone
- mongodb-atlas
- mongodb-atlas-search
- mongodb-atlas-vector-search
- nlp
- opencv
- python
- raspberry-pi
- shell/bash
- speakers
- speechrecognition
- stereo-camera-module
- tensorflow-lite
- tesseract-ocr
- yolov3