Inspiration

We were inspired by the daily challenges faced by visually impaired individuals. Our goal was to create a tool that uses the Gemini API to give them a better understanding of their surroundings.

What it does

Beyond Eyes captures images with a webcam, analyzes them using the Google Gemini AI model, and provides real-time auditory feedback describing the objects and scene in each image.

How we built it

We used:

  • Python for the application logic.
  • OpenCV for image capture.
  • Google Gemini API for analyzing images.
  • pyttsx3 for converting the analysis into speech.

Challenges we ran into

  • Integrating the Gemini API for seamless analysis.
  • Ensuring the app is easy to use for visually impaired individuals.
  • Debugging issues with camera accessibility.

Accomplishments that we're proud of

We successfully built a functional tool that can:

  • Analyze images in real time.
  • Provide accurate auditory descriptions.
  • Empower users with accessibility-focused technology.

What we learned

We learned:

  • How to integrate APIs like Google Gemini.
  • The importance of accessibility in technology.

What's next for Beyond Eyes

  • Developing a mobile app for portability.
  • Integrating OCR for reading text in images.
  • Exploring wearable device compatibility for hands-free usage.

Built With

python, opencv, google-gemini-api, pyttsx3
