Inspiration

We were inspired by the struggles visually impaired individuals face in navigating daily life independently. Existing hardware solutions are expensive, often inaccessible, or limited in functionality. We wanted to create a software-first, AI-powered tool that works on devices people already own, simulating "vision" through audio using object and text recognition. Our goal was to make assistive tech affordable, portable, and intelligent.

What it does

VisionAid uses a device's camera to detect objects, read text, recognize faces, and assist with navigation. It processes the live camera feed, identifies what is visible, and speaks it aloud to the user, while voice commands allow hands-free interaction. Navigation is assisted through Google Maps with audio cues. All features are integrated into a seamless, screen-free experience built around the loop sketched below.
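
At its core, the experience is a simple capture-describe-speak loop. The sketch below is a minimal illustration of that loop, not our production code: detect_objects and speak are hypothetical placeholders standing in for the real detector and text-to-speech calls described under "How we built it".

```python
import time

import cv2


def detect_objects(frame):
    # Hypothetical placeholder for the real detector (see "How we built it").
    return []


def speak(text):
    # Hypothetical placeholder for the real text-to-speech call.
    print(f"[speaking] {text}")


def main_loop(camera):
    last_spoken = ""
    while True:
        ok, frame = camera.read()       # grab the latest camera frame
        if not ok:
            break
        labels = detect_objects(frame)  # e.g. ["person", "door"]
        description = ", ".join(labels)
        if description and description != last_spoken:
            speak(description)          # only announce when the scene changes
            last_spoken = description
        time.sleep(0.5)                 # throttle so the audio stays intelligible


if __name__ == "__main__":
    main_loop(cv2.VideoCapture(0))      # 0 = default webcam
```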

How we built it

We used Python with OpenCV and PyTorch for real-time object detection. Tesseract OCR extracted text from images, and gTTS provided voice output. Face recognition was added using the face_recognition library. Google Maps API handled basic navigation logic and directions. The app was built as a desktop prototype, simulating hardware-based use cases.
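
To illustrate how these pieces fit together, here is a minimal sketch using the libraries named above. Treat it as an approximation under stated assumptions: the YOLOv5 checkpoint loaded via torch.hub is one plausible PyTorch detector (the write-up only says OpenCV and PyTorch), the Tesseract binary must be installed for pytesseract to work, and "known_face.jpg" is a hypothetical example file.

```python
import cv2
import face_recognition
import pytesseract
import torch
from gtts import gTTS

# Assumption: a YOLOv5 model via torch.hub; any PyTorch detector would do.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)


def describe_frame(frame):
    """Run object detection and return a spoken-friendly summary."""
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # YOLOv5 expects RGB
    labels = model(rgb).pandas().xyxy[0]["name"].unique().tolist()
    return ", ".join(labels) if labels else "nothing detected"


def read_text(frame):
    """Extract printed text with Tesseract (the tesseract binary must be installed)."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return pytesseract.image_to_string(gray).strip()


def speak(text, path="speech.mp3"):
    """Synthesize speech with gTTS; audio playback itself is platform-specific."""
    gTTS(text=text, lang="en").save(path)


# Face recognition: compare faces in the frame against one known encoding.
# "known_face.jpg" is a hypothetical example file.
known = face_recognition.face_encodings(
    face_recognition.load_image_file("known_face.jpg"))[0]


def recognize(frame):
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # face_recognition expects RGB
    for enc in face_recognition.face_encodings(rgb):
        if face_recognition.compare_faces([known], enc)[0]:
            return "known person ahead"
    return None
```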

Challenges we ran into

Optimizing object detection for speed and accuracy was difficult on limited hardware. OCR struggled with low-contrast or blurry text. Face recognition was sensitive to lighting and angles. Voice input accuracy dropped in noisy environments. Bringing all features into one clean, user-friendly interface was a key challenge.
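
For the low-contrast OCR problem specifically, a common mitigation (sketched below; we are not claiming this exact recipe shipped in the prototype) is to normalize contrast and upscale the image before handing it to Tesseract. "sign.jpg" is a hypothetical example input.

```python
import cv2
import pytesseract


def preprocess_for_ocr(frame):
    """Boost contrast and resolution so Tesseract copes better with weak text."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # CLAHE evens out lighting and lifts low-contrast strokes.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    gray = clahe.apply(gray)
    # Upscaling helps with small or blurry glyphs.
    gray = cv2.resize(gray, None, fx=2.0, fy=2.0, interpolation=cv2.INTER_CUBIC)
    # Adaptive thresholding separates text from uneven backgrounds.
    return cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY, 31, 15)


text = pytesseract.image_to_string(preprocess_for_ocr(cv2.imread("sign.jpg")))
```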

Accomplishments that we're proud of

We built a fully functional AI assistant for the visually impaired without any custom hardware. We successfully integrated real-time object detection, OCR, face recognition, and text-to-speech, and created a voice-controlled interface that's intuitive and inclusive. We packaged everything into a unified prototype within the hackathon time limit, addressing a real-world problem with a working, real-world solution.

What we learned

We learned how to combine multiple AI models into one smooth user workflow. Designing with accessibility in mind changed how we thought about user interaction. We gained experience with real-time processing, voice interfaces, and UX design, and understanding model performance trade-offs was crucial. Above all, we realized that simplicity and usability matter more than complexity.

What's next for VisionAid – AI Glasses for the Visually Impaired

We plan to convert this into a cross-platform mobile app for broader reach. Offline mode and faster processing will be enabled using edge AI models. Currency detection and voice-based help features will be added. We aim to collaborate with accessibility NGOs for testing with real users. Ultimately, we hope to turn VisionAid into a real-world assistive product.

Built With

python, opencv, pytorch, tesseract, gtts, face_recognition, google-maps
