Inspiration
Our inspiration comes from the everyday challenges faced by visually impaired individuals, who can struggle with simple tasks like locating objects around them. We looked at assistive technologies like Seeing AI, but noticed that many solutions are either complex or not easily accessible to everyone. This motivated us to build a simple, affordable system using just a smartphone. We wanted to create an AI assistant that acts like a human guide, giving step-by-step directions in real time. Our goal is to make technology more inclusive and practical for daily life.
What it does
Our solution is an AI-powered mobile assistant that helps visually impaired users locate objects using voice and the camera. The user simply asks, for example, “Where is my bottle?” and the system scans the surroundings in real time. It detects the object and provides directional guidance using voice and sound cues. The user follows step-by-step instructions and taps the screen to receive the next movement instruction. This makes finding objects simple, intuitive, and independent. The system works entirely on a smartphone, making it accessible and practical for everyday use. It can also answer users' basic questions.
How we built it
We built our solution using web-based technologies so that it is accessible on any smartphone. We used JavaScript and browser APIs for voice input and output, allowing users to interact through simple commands. For vision, we integrated TensorFlow.js with the COCO-SSD model to detect objects in real time using the phone’s camera. We then process the object’s position and size to estimate direction and distance. The system provides step-by-step guidance through voice and directional sound feedback.
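The flow from a spoken question to a direction can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the helper names (`parseTarget`, `findTarget`, `estimateDirection`), the 0.5 score cutoff, and the one-third left/right split are all assumptions. The only thing taken from COCO-SSD is the shape of a prediction, which is an object with `class`, `score`, and `bbox` as `[x, y, width, height]`.

```javascript
// Hypothetical helper: pick the object label mentioned in a voice command.
// knownLabels would be the class names the detector can recognise.
function parseTarget(command, knownLabels) {
  const text = command.toLowerCase();
  // Try longer labels first so "cell phone" is preferred over a shorter match.
  const sorted = [...knownLabels].sort((a, b) => b.length - a.length);
  return sorted.find((label) => text.includes(label)) ?? null;
}

// Hypothetical helper: choose the highest-scoring prediction for a label.
// minScore = 0.5 is an assumed cutoff, not a value from the project.
function findTarget(predictions, label, minScore = 0.5) {
  const matches = predictions.filter(
    (p) => p.class === label && p.score >= minScore
  );
  if (matches.length === 0) return null;
  return matches.reduce((best, p) => (p.score > best.score ? p : best));
}

// Hypothetical helper: map the horizontal centre of a bounding box to a
// coarse direction. Splitting the frame into thirds is an assumption.
function estimateDirection(bbox, frameWidth) {
  const [x, , width] = bbox;
  const ratio = (x + width / 2) / frameWidth;
  if (ratio < 1 / 3) return "left";
  if (ratio > 2 / 3) return "right";
  return "ahead";
}

// Example with hand-written predictions in the COCO-SSD output shape.
const samplePredictions = [
  { class: "bottle", score: 0.6, bbox: [0, 0, 100, 100] },
  { class: "bottle", score: 0.9, bbox: [500, 0, 100, 100] },
  { class: "cup", score: 0.95, bbox: [270, 0, 100, 100] },
];
const label = parseTarget("Where is my bottle?", ["cup", "bottle", "cell phone"]);
const target = findTarget(samplePredictions, label);
const direction = target ? estimateDirection(target.bbox, 640) : null;
console.log(label, direction);
```

In a browser, `samplePredictions` would instead come from the model's `detect()` call on the camera frame, and the resulting direction would be passed to the voice output.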
Challenges we ran into
During development, we faced challenges in achieving real-time object detection with smooth performance on mobile devices. Handling fast user movement while keeping the guidance accurate was difficult, so we designed a step-by-step tap interaction to improve reliability. Estimating exact distance with a single camera was not possible, so we used an approximate distance based on the object's size in the frame. We also worked on reducing voice overlap and ensuring clear, timely feedback. Another challenge was making the system simple enough for visually impaired users while keeping it responsive. Overcoming these helped us build a more practical and user-friendly solution.
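Two of these workarounds can be sketched in a few lines. The code below is an illustration under stated assumptions, not the project's implementation: `distanceBucket` and `speakLatest` are hypothetical names, and the 25% and 5% area thresholds are made-up values that would need tuning per object type, since a large far object and a small near object can fill the same share of the frame. The speech part uses the browser's standard `speechSynthesis.cancel()` and `speechSynthesis.speak()` calls and does nothing outside a browser.

```javascript
// Hypothetical helper: rough distance from how much of the frame the
// bounding box covers. bbox is [x, y, width, height] in pixels.
// The thresholds (0.25 and 0.05) are assumptions for illustration only.
function distanceBucket(bbox, frameWidth, frameHeight) {
  const [, , width, height] = bbox;
  const areaRatio = (width * height) / (frameWidth * frameHeight);
  if (areaRatio > 0.25) return "very close";
  if (areaRatio > 0.05) return "nearby";
  return "far";
}

// Hypothetical helper: avoid overlapping voice output by cancelling any
// queued or ongoing speech before speaking the newest instruction.
// Returns false when no speech synthesis is available (e.g. outside a browser).
function speakLatest(text, synth = globalThis.speechSynthesis) {
  if (!synth) return false;
  synth.cancel(); // drop anything still being spoken or waiting in the queue
  synth.speak(new SpeechSynthesisUtterance(text));
  return true;
}

// Example: a box covering about 10% of a 640x480 frame.
console.log(distanceBucket([0, 0, 200, 150], 640, 480));
```

Cancelling before speaking favours the most recent instruction over completeness, which fits a tap-to-advance interaction where older guidance is already out of date.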
Accomplishments that we're proud of
We are proud of building a fully functional prototype that solves a real-world problem in a simple and practical way. Our system successfully combines voice interaction, real-time object detection, and step-by-step navigation into one seamless experience. What makes this special is that it works using just a smartphone, without any expensive hardware. We were able to create an intuitive interface that a visually impaired user can easily understand and use. Turning a complex AI concept into a working, user-friendly solution is our biggest achievement.
What we learned
Through this project, we learned how to turn a real-life problem into a practical solution using technology. We gained hands-on experience with JavaScript, camera integration, and AI-based object detection using TensorFlow.js. We also learned the importance of user experience, especially when designing for visually impaired users. Working on this project improved our problem-solving and logical thinking skills. We understood that building something useful is more important than just writing code. For a 14-year-old developer, this project was a big boost in confidence for building real-world applications.
What's next for VISIONARAIRE-AI
In the future, we plan to improve the accuracy and speed of object detection for smoother real-time performance. We want to add support for more objects and enable continuous tracking so users don’t have to search repeatedly. We also aim to integrate advanced models like YOLO for better precision and stability. Adding vibration feedback and offline functionality will make the system more reliable in different environments. We plan to test the solution with real users and refine it based on their feedback. Our goal is to develop this into a complete mobile application that can be used widely. We also aim to increase the accuracy of VISIONAIRE-AI from 85% to 95-98%.
Testing & Feedback:
The system uses the COCO-SSD model and reached a confidence score of up to 85% during real-time object scanning (calculated by averaging the results of 10 demo runs).
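The averaging step can be illustrated as follows. This is only a sketch: `averageConfidence` is a hypothetical helper, and the sample scores are placeholder values chosen for the example, not the measurements from the actual demo runs.

```javascript
// Hypothetical helper: mean of per-run confidence scores (each in 0..1).
// Returns null for an empty list so "no data" is not reported as 0%.
function averageConfidence(scores) {
  if (scores.length === 0) return null;
  const total = scores.reduce((sum, s) => sum + s, 0);
  return total / scores.length;
}

// Placeholder scores for illustration only, not real project data.
const sampleScores = [0.8, 0.9];
const mean = averageConfidence(sampleScores);
console.log(`${(mean * 100).toFixed(1)}%`);
```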
Built With
- axios
- coco-ssd
- css3
- html5
- javascript
- node.js
- openai
- openrouteservice
- tailwind
- tensorflow.js