Inspiration
Visually impaired people rely on a "mental map" of their home. When a familiar path is blocked by a moved chair or a package left in the hallway, the indoors can become just as dangerous as the outdoors.
Statistics suggest that over half of serious falls among visually impaired individuals happen in their own homes.
This is largely because GPS doesn't work indoors, and a traditional cane can't "see" an obstacle ahead of time, only at the moment of collision.
We built InVision to give people safe, confident navigation in their own space.
What it does
InVision is a real-time AI navigation system paired with a physical cane attachment built around an Arduino.
The interface is completely voice-based.
How it works:
- A family member or the user records a video of the house layout once.
- The Gemini backend identifies anchors (fridge, bathroom, kitchen, etc.).
- The app uses the user's camera in real time to identify their location and guide them to their destination.
- The camera, along with the Arduino, detects any obstacle in the path.
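As an illustration, the anchor map produced by the one-time scan might be stored as something like the following. The anchor names, structure, and spoken directions here are our assumption for the sketch, not InVision's actual schema:

```javascript
// Hypothetical anchor map a Gemini-based scan could produce (illustrative only).
// Each anchor is a landmark recognized in the walkthrough video, listing the
// neighboring anchors reachable from it and a short spoken direction.
const anchorMap = {
  frontDoor: { neighbors: { hallway: "walk straight ahead" } },
  hallway: {
    neighbors: {
      frontDoor: "turn around and walk straight",
      kitchen: "turn left at the end of the hall",
      bathroom: "take the second door on your right",
    },
  },
  kitchen: { neighbors: { hallway: "walk back through the doorway" } },
  bathroom: { neighbors: { hallway: "step out and turn left" } },
};

// Look up the spoken instruction for one hop between two anchors.
function instructionFor(map, from, to) {
  const node = map[from];
  return node && node.neighbors[to] ? node.neighbors[to] : null;
}
```

A structure like this is what lets the app stay voice-only: once the scan has filled in the map, each navigation step reduces to reading out one stored instruction.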
How we built it
We approached the problem from two sides:
The Brain: We used a Node.js backend and a React frontend integrated with Gemini 2.5 Flash to analyze the walkthrough video and map out the location. The frontend is a web app that uses the camera for real-time location tracking.
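Once the anchors are mapped, guiding the user reduces to finding a route between the anchor the camera currently recognizes and the destination. As a sketch (the graph, function names, and room layout are our illustration, not the production code), a simple breadth-first search over the anchor graph would suffice:

```javascript
// Illustrative only: find a route between two anchors with breadth-first
// search over an adjacency list built from the one-time video scan.
function findRoute(graph, start, goal) {
  const queue = [[start]];
  const visited = new Set([start]);
  while (queue.length > 0) {
    const path = queue.shift();           // take the oldest partial path
    const current = path[path.length - 1];
    if (current === goal) return path;    // reached the destination anchor
    for (const next of graph[current] || []) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push([...path, next]);      // extend the path by one anchor
      }
    }
  }
  return null; // no route between the two anchors
}

// Tiny example graph: rooms the scan identified and which are adjacent.
const home = {
  bedroom: ["hallway"],
  hallway: ["bedroom", "kitchen", "bathroom"],
  kitchen: ["hallway"],
  bathroom: ["hallway"],
};

// findRoute(home, "bedroom", "kitchen") → ["bedroom", "hallway", "kitchen"]
```

BFS is a natural fit here because indoor anchor graphs are small and unweighted, so the first route found is also the one with the fewest hops to announce aloud.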
Obstacle detection: We built a custom hardware attachment using an Arduino and HC-SR04 ultrasonic sensors. When the user nears an obstacle, a passive buzzer beeps and the event is relayed to the backend, which generates a spoken alert for the user.
The hardware attachment provides instant, real-time detection, since the AI tends to respond slowly due to video computation.
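The HC-SR04 reports an echo pulse whose duration is proportional to the distance to the obstacle: sound travels roughly 0.0343 cm per microsecond, and the pulse covers the round trip. The math behind the beep looks like this (shown in JavaScript for illustration; the 25 cm threshold is our example value, not the one tuned on the device):

```javascript
// HC-SR04 math: the echo pulse lasts as long as sound takes to travel to the
// obstacle and back, so halve the one-way figure. ~0.0343 cm/µs at room temp.
function pulseToCentimeters(pulseMicros) {
  return (pulseMicros * 0.0343) / 2;
}

// Decide whether the buzzer should sound; the threshold is illustrative.
function obstacleAlert(pulseMicros, thresholdCm = 25) {
  return pulseToCentimeters(pulseMicros) < thresholdCm;
}

// Example: a 1000 µs echo ≈ 17.15 cm away, close enough to trigger the buzzer.
```

On the Arduino itself the same arithmetic runs on the duration returned by `pulseIn()` on the sensor's echo pin, which is why the alert fires with essentially no latency compared to the video pipeline.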
The Integration: We focused on a "set it and forget it" model where the AI memorizes the environment so the user never has to navigate a complex app interface.
Challenges we ran into
We kept exhausting our API quota because of the heavy video computation, so we had to switch to a paid tier.
Additionally, Gemini kept giving us wrong directions because of poorly recorded footage; after we improved the capture process, navigation became far more reliable.
The voice API also didn't work as intended at first, and we accidentally leaked an API key, which we revoked and replaced immediately.
In terms of hardware, we initially tried an STM32 board but ran into clock-speed problems between the board and the ultrasonic sensor, so we pivoted to Arduino.
Accomplishments that we're proud of
We are proud of our one-time setup feature. Making it so that a single video recording can transform a house into a narrated map is a huge step for accessibility.
We also successfully built a hardware prototype that detects obstacles before the user reaches them.
What we learned
We learned to integrate different APIs and use them to compare real-time camera snapshots against the mapping video. This approach differs from most indoor mapping systems, which rely on Bluetooth beacons.
What's next for InVision
We want to expand InVision beyond the home. Our goal is to make any indoor space (hospitals, hotels, or workplaces) "InVision-Ready" with a single scan.