Inspiration

Millions of visually impaired individuals face navigation challenges due to the high cost and limited accessibility of existing solutions. Guide dogs and specialized devices provide assistance, but they are expensive, require maintenance, or are unavailable in many regions. We wanted to create an accessible, affordable, and easy-to-use tool that helps people in underserved communities navigate their surroundings independently.

What it does

iAssist is an AI-powered vision assistant designed for real-time navigation. Using state-of-the-art object detection, our software identifies obstacles, detects safe pathways, and provides instant audio guidance. Unlike traditional assistive devices, iAssist runs on everyday smartphones, making navigation support widely accessible without expensive hardware.

How we built it

We developed iAssist using computer vision and deep learning, integrating YOLO11 for real-time object detection and spatial awareness. We integrated an Ollama-hosted LLM to highlight context-dependent information. The application processes live video from a smartphone camera, uses deque and priority queue structures for efficient tracking, and uses Kokoro text-to-speech for audio guidance. The front end was built with Next.js, React.js, and Chart.js.

Challenges we ran into

One of our biggest challenges was balancing the real-time output so that users received enough information to navigate safely without being overwhelmed by unnecessary detail. We needed to prioritize urgent alerts, such as approaching vehicles or sudden obstacles, while still providing helpful scene context. In addition, computer vision was challenging because of the high volume of data to process and our limited prior experience.
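The prioritization described here can be illustrated with a small buffer built from a deque and a priority queue. This is a minimal sketch and not the project's actual code: the class name `AlertBuffer`, the numeric priority scheme (lower number means more urgent), and the `max_routine` cap are all assumptions made for illustration.

```python
import heapq
from collections import deque
from itertools import count


class AlertBuffer:
    """Hypothetical buffer: urgent alerts in a heap, routine messages in a FIFO deque."""

    def __init__(self, max_routine=5):
        self._urgent = []                           # min-heap of (priority, seq, text)
        self._routine = deque(maxlen=max_routine)   # oldest routine message is dropped when full
        self._seq = count()                         # tie-breaker: equal priorities keep arrival order

    def push(self, text, priority=None):
        """priority=None marks a routine message; lower numbers are more urgent."""
        if priority is None:
            self._routine.append(text)
        else:
            heapq.heappush(self._urgent, (priority, next(self._seq), text))

    def pop(self):
        """Return the next message to speak, or None if nothing is pending."""
        if self._urgent:
            return heapq.heappop(self._urgent)[2]
        if self._routine:
            return self._routine.popleft()
        return None


buffer = AlertBuffer()
buffer.push("bench on your left")            # routine scene context
buffer.push("car approaching", priority=0)   # most urgent
buffer.push("step down ahead", priority=1)
spoken = [buffer.pop() for _ in range(4)]
# spoken == ["car approaching", "step down ahead", "bench on your left", None]
```

Capping the routine deque is one way to avoid flooding the user with stale scene descriptions; whether iAssist drops old messages like this is not stated in the writeup.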

Accomplishments that we're proud of

One of our biggest achievements was designing an efficient buffering system using a deque and a priority queue. Less urgent instructions are stored in the deque for sequential processing, while critical alerts, such as sudden obstacles, are pushed to the front of the queue for immediate response. To optimize computer vision speed and accuracy, we resized frames and implemented selective frame skipping. We are especially proud of improving the system's scene understanding by integrating an LLM that provides context beyond object detection.
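The frame resizing and frame skipping mentioned above might look roughly like the following. This is a sketch under stated assumptions: the helper names `scaled_size` and `sample_frames`, the 640-pixel cap, and the fixed stride of 3 are hypothetical, and the real project may select frames by a different rule. The helpers only compute target dimensions and choose which frames to keep; the actual image resize would be done by whatever imaging library the pipeline uses.

```python
def scaled_size(width, height, max_side=640):
    """Return (w, h) shrunk so the longer side is at most max_side, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height          # already small enough, leave unchanged
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def sample_frames(frames, stride=3):
    """Yield (index, frame) for every stride-th frame and skip the rest."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    for index, frame in enumerate(frames):
        if index % stride == 0:
            yield index, frame


# Stand-in "frames": any iterable works, for example frames read from a camera.
kept = [index for index, _ in sample_frames(range(10), stride=3)]
# kept == [0, 3, 6, 9]
```

With a stride of 3, the detector runs on one third of the frames, which trades some temporal resolution for lower processing load.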

What we learned

We gained experience with real-time computer vision processing and with designing efficient buffering systems. We also learned how to integrate LLM-powered scene summarization, text-to-speech, and speech-to-text technologies to create an intuitive and accessible user experience. Additionally, we learned the importance of clear communication and collaboration, which kept every team member aligned despite working on different components.

What's next for iAssist

Next, we aim to refine iAssist’s vision processing for faster, more accurate navigation tailored to geographical contexts. In urban environments, where higher population density affects mobility, users may need drastically different features from those needed in rural environments. To address this, we plan to integrate Google Maps’ population density API, enabling iAssist to adapt guidance based on crowd levels and surroundings for improved situational awareness.
