iAssist

Detecting objects
Detecting objects and processing results in real time
Blind trust

Inspiration

Millions of visually impaired individuals face navigation challenges due to the high cost and limited accessibility of existing solutions. While guide dogs and specialized devices provide assistance, they remain expensive, require maintenance, or are unavailable in many regions. We wanted to create an accessible, affordable, and easy-to-use tool that helps individuals in underserved communities navigate their surroundings independently.

What it does

iAssist is an AI-powered vision assistant designed for real-time navigation. Using state-of-the-art object detection, our software identifies obstacles, detects safe pathways, and provides instant audio guidance. Unlike traditional assistive devices, iAssist runs on everyday smartphones, making navigation support widely accessible without expensive hardware.

How we built it

We developed iAssist using computer vision and deep learning, integrating YOLO11 for real-time object detection and spatial awareness. An Ollama-hosted LLM highlights context-dependent information about the scene. The application processes live video from a smartphone camera, uses deque and priority queue structures for efficient tracking, and relies on Kokoro text-to-speech for audio guidance. The front end was built using Next.js, React.js, and Chart.js.
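The live-video step can be illustrated with a minimal sketch of frame resizing and selective frame skipping, the two optimizations this write-up mentions for keeping detection fast. It is not the project's actual code: the function names `downscale` and `select_frames`, the skip interval, and the scale factor are all assumptions, and plain NumPy slicing stands in for whatever resizing routine the app really uses.

```python
import numpy as np


def downscale(frame: np.ndarray, factor: int = 2) -> np.ndarray:
    """Shrink a frame by keeping every `factor`-th pixel (nearest neighbour).

    Hypothetical stand-in for the app's real resize step.
    """
    return frame[::factor, ::factor]


def select_frames(frames, every_n: int = 3, factor: int = 2):
    """Yield (index, downscaled frame) for every `every_n`-th frame only.

    Skipped frames never reach the detector, which cuts the per-second
    workload roughly by a factor of `every_n`.
    """
    for index, frame in enumerate(frames):
        if index % every_n == 0:
            yield index, downscale(frame, factor)


# Usage: ten blank 640x480 RGB frames stand in for a camera stream.
stream = (np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(10))
for index, small in select_frames(stream, every_n=3, factor=2):
    print(index, small.shape)
```

In a real pipeline each yielded frame would be passed to the object detector. The skip interval trades responsiveness for throughput, so it would need tuning on the target phone.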

Challenges we ran into

One of our biggest challenges was balancing how much information to deliver in real time: users need enough for safe navigation without being overwhelmed by unnecessary details. We needed to prioritize urgent alerts, such as approaching vehicles or sudden obstacles, while still providing helpful scene context. In addition, computer vision was challenging due to the high volume of data processing and our limited prior experience.

Accomplishments that we're proud of

One of our biggest achievements was designing an efficient buffering system using a deque and a priority queue. Less urgent instructions are stored in the deque for sequential processing, while critical alerts, such as sudden obstacles, are pushed to the front of the queue for immediate response. To optimize computer vision speed and accuracy, we resized frames and implemented selective frame skipping. We are especially proud of enhancing the system's scene understanding by integrating an LLM that provides context beyond object detection.

What we learned

We gained experience in real-time computer vision processing and in designing efficient buffering systems. We also learned how to integrate LLM-powered scene summarization, text-to-speech, and speech-to-text technologies to create an intuitive and accessible user experience. Additionally, we learned the importance of clear communication and collaboration, ensuring every team member stayed aligned despite working on different components.

What's next for iAssist

Next, we aim to refine iAssist’s vision processing for faster, more accurate navigation tailored to geographical contexts. In urban environments, where higher population density affects mobility, users may need drastically different features from those needed in rural environments. To address this, we plan to integrate Google Maps’ population density API, enabling iAssist to adapt guidance based on crowd levels and surroundings for improved situational awareness.

Built With

chart
css
groq
html
javascript
kokoro
next
numpy
ollama
python
react
shadcn
tailwind
yolo11

Submitted to

DevFest 2025
- Winner Best Use of Groq (Sponsored by Groq)

Created by

I contributed to the iAssist project by implementing real-time object detection and tracking with YOLOv8, adding movement detection for navigation guidance, and helping design a priority queue for urgent alerts. I integrated a low-latency TTS system using Kokoro and developed a Flask server with WebSocket support for real-time updates. Additionally, I coordinated the application flow, optimizing frame capture, object detection, scene summarization, and TTS playback with efficient frame rate control and buffer management.

Shium Mashud
Uni CS major focused on real-time processing, scalable systems, and full-stack development. Passionate about building impactful solutions.
Richard Li
data science & statistics @ ucla
Annie Dong
Soonwoo Kwon