Inspiration
The inspiration for our Live Assistant for Blind People came from the need to improve accessibility for individuals with visual impairments. We wanted to create a tool that could provide real-time assistance in navigating their surroundings, identifying objects, and offering descriptions of their environment. Our goal was to empower blind users with greater independence and confidence in their daily lives through the use of cutting-edge AI and computer vision technologies.
What it does
The Live Assistant for Blind People is an AI-powered app that helps visually impaired individuals by capturing images of their surroundings and providing detailed audio descriptions. Using speech recognition, the app lets users issue voice commands asking it to describe objects or environments, or to read text aloud, for example from a book. Additionally, a live mode captures images at regular intervals to warn the user of potential hazards and provide assistance while on the move.
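The voice-command flow described above can be sketched as a simple keyword router that maps recognized speech to an app action. The command names and trigger phrases below are illustrative placeholders, not the app's actual vocabulary:

```python
from typing import Optional

# Map each app action to trigger phrases that may appear in the
# recognized speech. These phrases are hypothetical examples.
COMMANDS = {
    "describe": ("describe", "what is around", "surroundings"),
    "read": ("read", "text", "book"),
    "live": ("live mode", "start live", "navigate"),
}

def route_command(spoken: str) -> Optional[str]:
    """Return the first action whose trigger phrase appears in the speech,
    or None if the input matches no known command."""
    spoken = spoken.lower()
    for command, phrases in COMMANDS.items():
        if any(phrase in spoken for phrase in phrases):
            return command
    return None
```

Substring matching like this is deliberately forgiving, which helps when the speech recognizer returns slightly different wordings of the same request.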
How we built it
We built the app in Python, using the Google Gemini API to generate descriptive text from images, OpenCV to capture frames from the camera, and pyttsx3 to convert text to speech. The interface is built with Tkinter to keep it simple and user-friendly, and we incorporated the OpenWeather API to offer additional information such as weather updates. The Gemini API's fast, near-real-time processing allowed us to create a seamless and efficient experience for users.
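A simplified version of this capture-describe-speak pipeline might look like the sketch below. It assumes the `google-generativeai`, `opencv-python`, and `pyttsx3` packages; the model name, API key placeholder, and prompt wording are our assumptions, not the project's actual values, and the hardware-dependent imports are kept local to the function so the sketch reads standalone:

```python
def build_prompt(mode: str) -> str:
    """Compose the instruction sent to the model alongside the image.
    The wording here is illustrative."""
    if mode == "live":
        return "Briefly warn about any hazards visible in this image."
    return "Describe this scene concisely for a blind user."

def describe_and_speak(mode: str = "describe") -> None:
    """Capture one camera frame, ask Gemini to describe it, speak the result."""
    # Local imports: these need a camera, an API key, and audio output.
    import cv2
    import pyttsx3
    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")            # placeholder key
    model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name

    cam = cv2.VideoCapture(0)                          # default camera
    ok, frame = cam.read()
    cam.release()
    if not ok:
        return                                         # no frame captured

    ok, jpeg = cv2.imencode(".jpg", frame)             # frame -> JPEG bytes
    response = model.generate_content(
        [build_prompt(mode), {"mime_type": "image/jpeg", "data": jpeg.tobytes()}]
    )

    engine = pyttsx3.init()                            # offline text-to-speech
    engine.say(response.text)
    engine.runAndWait()
```

In live mode, a loop would call `describe_and_speak("live")` on a timer instead of waiting for a voice command.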
Challenges we ran into
One of the main challenges we faced was optimizing the accuracy of image descriptions while keeping response times low. Capturing relevant details about objects and ensuring the descriptions were concise yet informative took a lot of fine-tuning. Additionally, developing a voice command system that could reliably recognize and respond to user input was challenging, especially when dealing with noisy environments or different accents.
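One common mitigation for noisy environments, sketched below, is calibrating the recognizer against background noise before each listen and normalizing the transcript before matching commands. `adjust_for_ambient_noise` and `recognize_google` are real calls from the SpeechRecognition package; the helper names and timings are our own illustrative choices:

```python
from typing import Optional

def normalize(transcript: str) -> str:
    """Lowercase and collapse whitespace so command matching tolerates
    small recognition differences across accents and microphones."""
    return " ".join(transcript.lower().split())

def listen_once(timeout: float = 5.0) -> Optional[str]:
    """One listen attempt; returns a normalized transcript or None."""
    import speech_recognition as sr  # local import: needs a microphone

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample background noise briefly so the energy threshold
        # adapts to the current environment before listening.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source, timeout=timeout)
    try:
        return normalize(recognizer.recognize_google(audio))
    except sr.UnknownValueError:     # speech was unintelligible
        return None
    except sr.RequestError:          # recognition service unreachable
        return None
```

Returning None on failure lets the caller re-prompt the user instead of acting on a garbled command.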
Accomplishments that we're proud of
We’re incredibly proud of creating an application that truly makes a difference in the lives of blind individuals. Successfully integrating speech recognition and AI-powered descriptions was a huge milestone. The app's ability to detect potential hazards and offer real-time assistance is a feature we are especially proud of, as it addresses a critical need for user safety.
What we learned
During the development process, we learned a lot about the importance of accessibility in software design. We deepened our understanding of AI model integration and learned how to balance performance and accuracy in real-time systems. Additionally, working with user feedback from the blind community gave us valuable insights into designing solutions that prioritize user needs and experience.
What's next for Live Blind Assistant
In the future, we plan to enhance the live mode to include object tracking and more sophisticated hazard detection. We are also exploring ways to improve the accuracy of text recognition for reading printed materials and expand the language support for users worldwide. Ultimately, our goal is to make this app a staple tool for visually impaired individuals, helping them navigate and engage with their environment confidently and independently.
Built With
- gemini
- image-recognition
- opencv
- openweather-api
- python
- pyttsx3
- speech-recognition
- text-recognition
- tkinter