Inspiration

Technology is often praised for innovation, but at VisionAid AI, we believe its true purpose is to empower humanity. Inspired by the millions of visually impaired individuals worldwide who face daily barriers, I set out to bridge the digital divide. My mission was to create more than just software; I wanted to build a reliable companion: an offline, AI-powered assistant that acts as a digital guide, inspired by the dedication of guide dogs.

The Journey & Challenges

During development, we faced a critical dilemma: the trade-off between "raw accuracy" and "real-world usability." While some might focus solely on achieving the highest detection percentage, we chose the path of transparency and efficiency.

We realized that for a visually impaired user, an AI tool that drains the battery in an hour is not truly helpful, no matter how accurate it is. Therefore, my journey was defined by:

True Optimization: Balancing the precision of YOLOv8 and EasyOCR with the constraints of mobile hardware.

Environmental Awareness: We didn't just aim for a number; we optimized for the user's daily life, ensuring the app runs smoothly without constant charging, so that it remains a reliable tool for all-day use.

Integrity: We prioritized honest performance over inflated statistics, believing that a sustainable, usable tool is far more valuable than a fragile, high-percentage demo.

What I Learned

Building VisionAid AI taught me that "efficiency" is the highest form of innovation. I learned that as developers, our responsibility is not to build the most complex model, but the most helpful one. By focusing on the user's environment and daily needs, I discovered that true success lies in creating technology that works for the user, anywhere and anytime, with total privacy and reliability.

Updates

  1. AI Architecture Explanation:

Inputs: Real-time video stream from the user's camera.

AI Capability: Object detection (Computer Vision).

Processing: The system uses the YOLOv8 model to analyze video frames, identifying people and common objects in real time.

Outputs: The system provides immediate, clear audio feedback (using pyttsx3), announcing the presence of people or objects to improve situational awareness for visually impaired users.
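The flow above (camera frame in, YOLOv8 detection, spoken output) might look roughly like the following. This is a minimal sketch, not the project's actual code: the function names `build_announcement` and `run`, the `yolov8n.pt` weights file, camera index 0, and the choice to speak only when the announcement changes are all assumptions made for illustration.

```python
from collections import Counter


def build_announcement(labels: list[str]) -> str:
    """Turn detected class names into one short sentence to be spoken."""
    if not labels:
        return ""
    counts = Counter(labels)
    # Most frequent classes first; ties keep their first-seen order.
    parts = [f"{n} {name}" for name, n in counts.most_common()]
    return "Detected: " + ", ".join(parts)


def run(camera_index: int = 0, weights: str = "yolov8n.pt") -> None:
    """Hypothetical main loop: capture frames, detect objects, speak results."""
    # Heavy dependencies are imported here so the helper above can be used
    # and tested on a machine without a camera or these packages.
    import cv2
    import pyttsx3
    from ultralytics import YOLO

    model = YOLO(weights)            # pre-trained COCO weights (assumed variant)
    engine = pyttsx3.init()          # offline text-to-speech engine
    cap = cv2.VideoCapture(camera_index)
    last_spoken = ""
    try:
        while True:
            ok, frame = cap.read()
            if not ok:               # camera unavailable or stream ended
                break
            result = model(frame, verbose=False)[0]
            labels = [result.names[int(c)] for c in result.boxes.cls.tolist()]
            text = build_announcement(labels)
            # Assumption: repeat an announcement only when the scene changes.
            if text and text != last_spoken:
                engine.say(text)
                engine.runAndWait()
                last_spoken = text
    finally:
        cap.release()
```

Calling `run()` would start the loop on the default camera. `build_announcement` is kept separate from the loop so the wording of the spoken feedback can be checked without any hardware attached.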

  2. Human-in-the-Loop Decision:

The Decision: The AI identifies objects and people, but it does NOT decide on the user's navigation path or movement.

Why Human Involvement is Critical: The decision to move in a certain direction or navigate an environment is a life-critical choice. The AI acts only as an information assistant. The visually impaired user must retain full agency to evaluate the AI's feedback against the physical context, ensuring that all navigation decisions are made by the human, not the algorithm.

  3. Responsible AI Guardrail:

Risk: Over-reliance (the risk that a user might trust the AI output completely and ignore potential hazards that the AI might miss).

Mitigation: We implemented a confidence score threshold to reduce false positives. Furthermore, the system is designed to provide situational awareness rather than navigation instructions. We emphasize to users that this tool is a secondary aid, not a substitute for standard mobility aids like white canes or guide dogs.

AI Tools & Data

  1. AI Tools Used:

YOLOv8 (Ultralytics): Pre-trained model for real-time object detection (Free/Open Source).

pyttsx3: Offline text-to-speech engine (Free/Open Source).

OpenCV: For video stream capture and processing (Free).

Python: Core programming language (Free).

We used pre-trained YOLOv8 weights based on the standard COCO dataset, a large, publicly available image dataset covering 80 common object categories. No private or sensitive user data was collected or used.
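The confidence score threshold named under the Responsible AI Guardrail item could be sketched as below. The `Detection` type, the `filter_confident` and `to_message` helpers, and the 0.5 cutoff are hypothetical; the write-up does not state the value actually used. The message wording is kept informational, in line with the decision that the tool reports what it sees and never issues navigation instructions.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    """One detected object: its class name and the model's confidence (0 to 1)."""
    label: str
    confidence: float


# Assumed cutoff for illustration; the source does not give the real value.
CONFIDENCE_THRESHOLD = 0.5


def filter_confident(detections, threshold=CONFIDENCE_THRESHOLD):
    """Keep only detections at or above the threshold to reduce false positives."""
    return [d for d in detections if d.confidence >= threshold]


def to_message(detection):
    """Describe what was seen without telling the user where to move."""
    return f"{detection.label} detected"


if __name__ == "__main__":
    seen = [Detection("person", 0.91), Detection("chair", 0.32)]
    for d in filter_confident(seen):
        print(to_message(d))
```

With Ultralytics, a comparable cutoff can also be applied at prediction time through the `conf` argument, for example `model(frame, conf=0.5)`, so low-confidence boxes never reach the speech step. A higher threshold means fewer false announcements but more missed objects, which is one reason the tool is positioned as a secondary aid.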