Inspiration
We were inspired by the daily challenges faced by visually impaired individuals. Our goal was to create a tool that uses the Gemini API to give them a better understanding of their surroundings.
What it does
Beyond Eyes captures images with a webcam, analyzes them with Google Gemini, and provides real-time auditory feedback describing the objects and scene in each image.
How we built it
We used:
- Python for the core application.
- OpenCV for capturing webcam images.
- The Google Gemini API for analyzing images.
- pyttsx3 for converting the analysis into speech.
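The pipeline above can be sketched roughly as follows. This is a minimal illustration of the capture → analyze → speak flow, not the actual Beyond Eyes source: the function names, the prompt wording, and the `gemini-1.5-flash` model choice are all assumptions for this sketch.

```python
# Sketch of the Beyond Eyes loop: webcam frame -> Gemini description -> speech.
# Heavy dependencies (cv2, google.generativeai, pyttsx3) are imported lazily
# inside each step so the module itself loads without them.

PROMPT = (
    "Describe the main objects and overall scene in this image "
    "in one or two short sentences for a visually impaired listener."
)

def capture_frame(camera_index=0):
    """Grab a single webcam frame with OpenCV and return it as JPEG bytes."""
    import cv2
    cap = cv2.VideoCapture(camera_index)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("Could not read a frame from the camera")
    ok, buf = cv2.imencode(".jpg", frame)
    if not ok:
        raise RuntimeError("JPEG encoding failed")
    return buf.tobytes()

def describe_image(jpeg_bytes, api_key):
    """Send the frame to Gemini and return the text description."""
    import google.generativeai as genai
    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-1.5-flash")  # model name is an assumption
    response = model.generate_content(
        [PROMPT, {"mime_type": "image/jpeg", "data": jpeg_bytes}]
    )
    return response.text

def speak(text):
    """Read the description aloud with pyttsx3 (blocks until done)."""
    import pyttsx3
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

def run_once(api_key):
    """One iteration of the loop; call repeatedly for real-time feedback."""
    speak(describe_image(capture_frame(), api_key))
```

In practice the real app would run `run_once` in a loop with a short delay, and keeping the prompt brief keeps the spoken output quick enough to feel real-time.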
Challenges we ran into
- Integrating the Gemini API for seamless analysis.
- Ensuring the app is easy to use for visually impaired individuals.
- Debugging camera-accessibility issues.
Accomplishments that we're proud of
We successfully built a functional tool that can:
- Analyze images in real-time.
- Provide accurate auditory descriptions.
- Empower users with accessibility-focused technology.
What we learned
We learned:
- How to integrate APIs like Google Gemini.
- The importance of accessibility in technology.
What's next for Beyond Eyes
- Developing a mobile app for portability.
- Integrating OCR for reading text in images.
- Exploring wearable device compatibility for hands-free usage.