Inspiration
Millions of visually impaired individuals face daily challenges in navigating their environment and accessing visual information. This inspired us to create EYEVISION, an AI-powered assistant that provides real-time image analysis and spoken feedback, enabling greater independence for blind and visually impaired users. Our goal is to bridge the accessibility gap using AI and intuitive design.
What It Does
EYEVISION captures images using a device’s camera, analyzes them with AI, and provides concise spoken descriptions. Users can ask follow-up questions about the image, receiving detailed and relevant answers through speech. The app also helps users understand colors, objects, and surroundings with minimal effort.
How We Built It
We developed EYEVISION using:
- Next.js & React for the front-end interface
- Google Gemini AI API for image analysis and question-answering
- Web Speech API for text-to-speech and speech recognition
- MediaStream API for real-time camera access
The system processes captured images, extracts key visual details, summarizes them for easy understanding, and responds to user queries in natural language.
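As a rough illustration of the image-analysis step, the sketch below turns a captured frame (a base64 data URL, such as the one `canvas.toDataURL("image/jpeg")` returns) plus any follow-up questions into a JSON body shaped like a Gemini `generateContent` request. The helper names (`parseDataUrl`, `buildRequestBody`), the `Turn` type, and the prompt wording are assumptions made for this example, not EYEVISION's actual code; the project may well have used Google's SDK instead of a raw request.

```typescript
// Sketch only: helper names and the prompt wording are hypothetical.

interface InlineImage {
  mimeType: string;
  data: string; // base64 payload without the "data:...;base64," prefix
}

// Split a base64 data URL into its MIME type and payload.
function parseDataUrl(dataUrl: string): InlineImage {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) throw new Error("Expected a base64 data URL");
  return { mimeType: match[1], data: match[2] };
}

// One earlier exchange in the conversation about the image.
interface Turn {
  role: "user" | "model";
  text: string;
}

// Assumed instruction asking for short, speech-friendly output.
const DESCRIBE_PROMPT =
  "Describe this image in two or three short sentences for a blind user. " +
  "Mention the most important objects, their colors, and where they are.";

// Build the request body: the image and prompt form the first user turn,
// and any follow-up questions and answers are appended as text-only turns.
function buildRequestBody(image: InlineImage, followUps: Turn[] = []) {
  return {
    contents: [
      {
        role: "user",
        parts: [
          { text: DESCRIBE_PROMPT },
          { inline_data: { mime_type: image.mimeType, data: image.data } },
        ],
      },
      ...followUps.map((t) => ({ role: t.role, parts: [{ text: t.text }] })),
    ],
  };
}
```

In a Next.js app, a body like this would typically be sent from a server-side route so that the API key never reaches the browser, and the returned text would then be handed to the speech layer.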
Challenges We Ran Into
- Ensuring accurate image-to-text conversion for complex environments
- Tuning AI responses to prioritize user-relevant details. For example, given a photo of a kitchen with ingredients on the counter, the AI should describe the scene and also suggest recipes based on the ingredients it identifies.
- Speech synthesis inconsistencies across different devices and browsers
- Handling real-time camera flipping and optimizing performance on mobile devices
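Two of the challenges above, camera flipping and inconsistent speech voices, can be approached with small pure helpers. The sketch below is one possible approach under stated assumptions: the function names, the `VoiceLike` interface, and the 1280x720 target resolution are hypothetical, and nothing here is taken from the EYEVISION code. The constraints object follows the shape accepted by `navigator.mediaDevices.getUserMedia()`, and `VoiceLike` mirrors the fields of the browser's `SpeechSynthesisVoice` that the helper needs.

```typescript
// Sketch only: all names below are illustrative.

type FacingMode = "user" | "environment";

// Toggle between the front ("user") and rear ("environment") camera.
function flipFacing(current: FacingMode): FacingMode {
  return current === "user" ? "environment" : "user";
}

// Constraints to pass to navigator.mediaDevices.getUserMedia().
// Using "ideal" lets the browser fall back on devices with a single camera.
function cameraConstraints(facing: FacingMode) {
  return {
    audio: false,
    video: {
      facingMode: { ideal: facing },
      width: { ideal: 1280 }, // assumed modest resolution for mobile
      height: { ideal: 720 },
    },
  };
}

// The subset of SpeechSynthesisVoice fields used below, declared locally
// so the helper can also run outside a browser.
interface VoiceLike {
  name: string;
  lang: string;
  default: boolean;
}

// Choose a voice with graceful fallbacks: exact language tag, then same
// base language, then the browser default, then whatever comes first.
// Both "-" and "_" separators in language tags are tolerated.
function pickVoice<V extends VoiceLike>(voices: V[], lang: string): V | undefined {
  const wanted = lang.toLowerCase();
  const base = wanted.split("-")[0];
  return (
    voices.find((v) => v.lang.toLowerCase().replace("_", "-") === wanted) ??
    voices.find((v) => v.lang.toLowerCase().split(/[-_]/)[0] === base) ??
    voices.find((v) => v.default) ??
    voices[0]
  );
}
```

In the browser, the existing stream's tracks should be stopped (`stream.getTracks().forEach((t) => t.stop())`) before requesting the flipped camera. Note also that `speechSynthesis.getVoices()` can return an empty list until the `voiceschanged` event has fired, so voice selection is best repeated when that event arrives.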
Accomplishments That We're Proud Of
- Integrated AI-powered image analysis with real-time speech feedback, in the team's first project working with AI
- Tuned responses to prioritize the most relevant image details, though there is still room to improve
- Implemented multi-device compatibility so the app is accessible across different platforms
What We Learned
- Deepened our understanding of AI-driven accessibility solutions
- Learned how to enhance AI’s ability to summarize visual content effectively
What's Next for EYEVISION
The EYEVISION team plans to:
- Enhance object recognition to help users locate objects more efficiently
- Incorporate navigation features for improved accessibility
- Customize voice options to cater to different user preferences
We aim to continue improving EYEVISION to make it a powerful accessibility tool for the visually impaired community.
Built With
- googlegeminiapi
- mediastreamapi
- nextjs
- react
- webspeechapi
