
SightSync—An AI-Powered Assistive Spectacle

Empowering the Visually Impaired

Millions of visually impaired people around the world face daily challenges: reading printed text, locating misplaced belongings, and safely navigating unfamiliar places. Inspired by their struggle for independence, we built Sight Sync, a wearable smart assistant that acts like an extra pair of eyes, powered by AI, MongoDB, and Google Cloud.

Key Features

Real-Time Object Detection
Identifies everyday objects around the user using a stereo vision camera and YOLOv3.
Obstacle Awareness
Calculates depth maps with dual cameras to detect obstacles and provide voice alerts.
Text Reading
Converts printed or handwritten text to natural speech with Tesseract OCR and Google Text-to-Speech.
Voice-Guided Navigation
Uses Google Maps API for live, turn-by-turn directions and dynamic rerouting.
Smart Visual Memory
Stores object detections and embeddings in MongoDB Atlas, and retrieves similar scenes using Atlas Vector Search — answering questions like “Where did I last see my keys?”
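
A "Where did I last see my keys?" lookup could be expressed as an Atlas Vector Search aggregation. The sketch below only builds the pipeline; the index name `scene_index` and the field names `embedding`, `label`, `location`, and `seen_at` are assumptions for illustration, not the project's actual schema.

```python
from typing import Any, Dict, List, Optional


def build_memory_query(query_vector: List[float],
                       label: Optional[str] = None,
                       limit: int = 3) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that finds stored scenes similar to query_vector."""
    stage: Dict[str, Any] = {
        "index": "scene_index",      # hypothetical vector index name
        "path": "embedding",         # hypothetical field holding the stored vector
        "queryVector": query_vector,
        # Consider more candidates than we return, to improve recall.
        "numCandidates": max(limit * 20, 100),
        "limit": limit,
    }
    if label is not None:
        # Pre-filter by detected label; the field must be indexed as a filter field.
        stage["filter"] = {"label": {"$eq": label}}
    return [
        {"$vectorSearch": stage},
        {"$project": {"_id": 0, "label": 1, "location": 1, "seen_at": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]
```

With a PyMongo collection, the result would be passed to `collection.aggregate(...)`, and the top hit's stored location and timestamp could then be read back to the user.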

How It Works

Edge AI:
YOLOv3 is trained with the COCO 2014 dataset and converted to TensorFlow Lite for fast inference on a Raspberry Pi.
Depth Estimation:
Stereo cameras capture offset images to build depth maps and detect obstacles within safety limits.
Text-to-Speech:
Tesseract OCR extracts text; Google TTS reads it aloud in the user’s chosen language.
Navigation:
Google Maps API plans the safest routes and delivers clear, real-time voice instructions.
Data Management:
Detection logs, OCR text, GPS coordinates, commands, and vector embeddings are securely stored in MongoDB Atlas. Atlas Vector Search enables similarity queries, while Atlas Search handles text lookups.
Hands-Free Control:
Users interact via voice commands and speech recognition — no buttons needed.
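
The Depth Estimation step above rests on the standard stereo relation: for rectified cameras, depth Z = f * B / d, where f is the focal length in pixels, B is the distance between the two cameras, and d is the disparity in pixels. A minimal sketch follows; the function names and the numeric values in the usage note are illustrative assumptions, not measurements from the prototype.

```python
from typing import Iterable, Optional


def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Convert a disparity (pixels) to depth (meters) for a rectified stereo pair."""
    if disparity_px <= 0:
        # Zero or negative disparity means no valid match; treat as infinitely far.
        return float("inf")
    return focal_px * baseline_m / disparity_px


def nearest_obstacle(disparities: Iterable[float], focal_px: float,
                     baseline_m: float, limit_m: float) -> Optional[float]:
    """Return the distance to the nearest point if it lies within limit_m, else None."""
    depths = [depth_from_disparity(d, focal_px, baseline_m) for d in disparities]
    nearest = min(depths, default=float("inf"))
    return nearest if nearest <= limit_m else None
```

For example, with an assumed 500 px focal length and a 0.1 m baseline, a disparity of 50 px corresponds to roughly 1 m, which would trigger an alert under a 1.5 m safety limit.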

Integrated Google AI Services

Google Text-to-Speech:
High-quality, multi-language speech synthesis for reading text aloud.
Google Maps API:
AI-enhanced route planning, traffic prediction, and accessible navigation.
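
To connect the two services, each navigation step has to become plain text before it is spoken. The Google Maps Directions API returns step instructions as HTML in an `html_instructions` field, so one plausible approach is to strip the markup first. The helper name `spoken_instruction` and the output phrasing are assumptions for illustration.

```python
import html
import re

_TAG = re.compile(r"<[^>]+>")


def spoken_instruction(step: dict) -> str:
    """Turn one Directions API step into a plain sentence suitable for TTS."""
    # Replace tags with spaces so text in adjacent elements does not run together.
    text = _TAG.sub(" ", step["html_instructions"])
    text = html.unescape(text)            # e.g. "&amp;" becomes "&"
    text = " ".join(text.split())         # collapse repeated whitespace
    distance = step.get("distance", {}).get("text")
    if distance:
        return f"{text} and continue for {distance}."
    return text
```

The returned string can then be handed to whichever text-to-speech call the device uses.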

Challenges and Solutions

Building Sight Sync involved solving several tough challenges:

Running high-accuracy detection models on low-power edge devices.
Achieving reliable depth perception under diverse lighting.
Providing real-time voice alerts without lag.
Merging local AI processing with cloud-based MongoDB Vector Search smoothly.
Ensuring timely obstacle alerts and safe route guidance.

We tackled these challenges through model optimization, efficient code, and rigorous testing to deliver a practical, robust solution.
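
One common way to keep voice alerts timely is to avoid repeating the same alert on every frame, so that speech output never queues up behind stale messages. The sketch below shows a simple per-object cooldown; it is an illustrative technique with a hypothetical class name, not necessarily what the prototype does.

```python
class AlertThrottle:
    """Suppress repeats of the same alert within a cooldown window."""

    def __init__(self, cooldown_s: float = 2.0):
        self.cooldown_s = cooldown_s
        self._last_spoken = {}  # alert key -> time it was last spoken

    def should_speak(self, key: str, now: float) -> bool:
        """Return True if the alert for `key` should be spoken at time `now`."""
        last = self._last_spoken.get(key)
        if last is not None and now - last < self.cooldown_s:
            return False  # spoken too recently; skip this one
        self._last_spoken[key] = now
        return True
```

In a detection loop, `now` would typically come from `time.monotonic()`, and the key could combine the object label with a coarse direction such as "chair-left".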

Achievements and Impact

Developed a working wearable prototype combining object detection, obstacle detection, OCR, TTS, and live navigation.
Integrated MongoDB Vector Search for unique visual memory capabilities.
Delivered a fully voice-controlled, accessible user experience.
Designed a hybrid architecture blending edge AI with cloud-powered search and Google APIs.

Insights Gained

Through this journey, we gained valuable experience in:

Deploying AI models on edge hardware.
Integrating cloud databases with real-time inference.
Using MongoDB Atlas for text and vector search.
Designing user-friendly assistive technology for real-world mobility support.

Roadmap Ahead

What’s next for Sight Sync:

Add multi-language OCR and TTS for global users.
Log user feedback in MongoDB to improve models continuously.
Build a companion mobile app for caregivers.
Refine the hardware for lighter weight and longer battery life.
Partner with organizations to test and refine Sight Sync with real users.

Technologies Used

Component	Details
AI Models	YOLOv3, TensorFlow Lite, Tesseract OCR
Database	MongoDB Atlas, Atlas Vector Search
Google AI	Google Text-to-Speech, Google Maps API
Hardware	Raspberry Pi, stereo camera module, microphone, speakers

Thank You

Thank you for exploring Sight Sync — a step forward in accessible, intelligent mobility for all.

Built With

Languages: Python, Shell/Bash
Frameworks & Libraries: TensorFlow Lite, OpenCV, YOLOv3, Tesseract OCR, SpeechRecognition, NLP
Platforms & Hardware: Raspberry Pi, stereo camera module, microphone, speakers
Cloud Services: Google Cloud (Text-to-Speech API, Maps API), MongoDB Atlas (Search, Vector Search)
Databases: MongoDB Atlas
APIs: Google TTS API, Google Maps API
Dataset: COCO

Updates

Tejaswini K R ACT started this project — Jun 17, 2025 05:38 AM EDT
