Inspiration

My inspiration for Vision comes from a powerful story in the Bible where Jesus heals a blind man. His joy in regaining sight made me realize how precious vision is, not just for seeing beauty but for feeling safe and independent. That moment motivated me to build a tool that lets visually impaired users “see” their environment in a new way: through sound.

What it does

Vision is a mobile app that uses a phone’s rear camera and AI to detect nearby obstacles and convert them into intelligent audio cues. It:

  • Classifies object height: low, waist-high, tall, or flat (ignorable).
  • Estimates distance: closer objects sound louder, distant ones quieter.
  • Uses text-to-speech to identify the object by name.

This enables users to navigate spaces safely with sound-based spatial awareness.
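
Concretely, each detection is reduced to a small audio cue. The shape below is illustrative, not the app’s exact data model:

```javascript
// One detection, as the app treats it (illustrative shape):
const cue = {
  name: 'chair',        // spoken aloud via text-to-speech
  height: 'waist-high', // low | waist-high | tall | flat (flat is skipped)
  volume: 0.8,          // closer objects play louder
};
```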

How we built it

  • Built using React Native (Expo) for cross-platform compatibility.
  • Captured camera frames with the Expo Camera API.
  • Used the Google Vision API to detect and classify objects (the capture-and-detect loop is sketched below).
  • Estimated each object's height category from the bounding box's position in the frame.
  • Calculated approximate distance from bounding box size, then scaled volume accordingly (both heuristics are sketched after this list):
    $$ \text{Volume} \propto \frac{1}{\text{Distance}} $$
  • Played categorized sounds using Expo AV.
  • Displayed a dropdown UI of detected objects.
  • Computed an accuracy score (also sketched below) factoring in:
    • Detection correctness
    • Volume scaling
    • Position logic
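
To make those heuristics concrete, here is a minimal sketch. The thresholds and weights are illustrative stand-ins, and the helper names (`boxMetrics`, `classifyHeight`, `estimateVolume`, `accuracyScore`) are ours for this write-up, not necessarily the shipped code:

```javascript
// Illustrative sketch: thresholds, weights, and helper names are
// assumptions for this write-up, not the exact shipped values.
// Google Vision's OBJECT_LOCALIZATION returns normalizedVertices
// in [0, 1], with y = 0 at the top of the frame (coordinates equal
// to 0 are omitted by the API, hence the ?? 0 fallback).

function boxMetrics(normalizedVertices) {
  const xs = normalizedVertices.map((v) => v.x ?? 0);
  const ys = normalizedVertices.map((v) => v.y ?? 0);
  return {
    top: Math.min(...ys),
    bottom: Math.max(...ys),
    width: Math.max(...xs) - Math.min(...xs),
    height: Math.max(...ys) - Math.min(...ys),
  };
}

// Height category from where the box sits in the frame: a box reaching
// the upper quarter reads as "tall", one hugging the bottom edge as "low".
function classifyHeight({ top, bottom, height }) {
  if (height < 0.05) return 'flat'; // sliver of a box: ignorable
  if (top < 0.25) return 'tall';
  if (bottom > 0.85 && top > 0.55) return 'low';
  return 'waist-high';
}

// Apparent linear size scales with 1/distance, so sqrt(box area) is a
// proxy for closeness (Volume ∝ 1/Distance); clamping keeps cues audible.
function estimateVolume({ width, height }) {
  return Math.min(1, Math.max(0.1, Math.sqrt(width * height)));
}

// Accuracy score: a weighted mix of detection confidence, the volume
// scaling, and whether the position logic produced a usable category.
function accuracyScore(annotation, metrics) {
  const detection = annotation.score; // Vision's confidence, 0..1
  const volume = estimateVolume(metrics);
  const position = classifyHeight(metrics) === 'flat' ? 0 : 1;
  return 0.5 * detection + 0.3 * volume + 0.2 * position;
}
```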

All functionality runs inside Expo Go, with no native code or builds needed.
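
For illustration, the whole capture-and-detect loop fits in one plain JavaScript function. `GOOGLE_VISION_KEY` is a placeholder for however the key is actually supplied, and `cameraRef` is assumed to be a ref to an expo-camera component:

```javascript
const GOOGLE_VISION_KEY = 'YOUR-API-KEY'; // placeholder, not the real config

// Grab one frame from the rear camera and ask Cloud Vision for
// localized objects; everything here runs under Expo Go.
async function detectObjects(cameraRef) {
  const photo = await cameraRef.current.takePictureAsync({
    base64: true,
    quality: 0.5,
  });

  const response = await fetch(
    `https://vision.googleapis.com/v1/images:annotate?key=${GOOGLE_VISION_KEY}`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        requests: [
          {
            image: { content: photo.base64 },
            features: [{ type: 'OBJECT_LOCALIZATION', maxResults: 10 }],
          },
        ],
      }),
    }
  );

  const json = await response.json();
  // Each annotation carries a name, a confidence score, and a
  // boundingPoly with normalizedVertices.
  return json.responses?.[0]?.localizedObjectAnnotations ?? [];
}
```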

Challenges we ran into

  • Creating dynamic volume scaling based on object distance without extra hardware or native modules (see the sketch after this list).
  • Designing logic to estimate height categories from 2D bounding boxes.
  • Preventing the dropdown UI from breaking when multiple detections occurred.
  • Working entirely within Expo, since we had no MacBook or Xcode for iOS builds.
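
For the first challenge, distance-scaled volume can be done entirely through expo-av's JavaScript API, roughly as below; the `cueFiles` asset map is hypothetical, and `classifyHeight`/`estimateVolume` are the heuristics sketched earlier:

```javascript
import { Audio } from 'expo-av';
import * as Speech from 'expo-speech';

// Hypothetical asset map: one short sound per height category.
const cueFiles = {
  low: require('./assets/low.mp3'),
  'waist-high': require('./assets/waist.mp3'),
  tall: require('./assets/tall.mp3'),
};
const cues = {};

// Preload each cue once so playback is instant.
async function loadCues() {
  for (const [category, file] of Object.entries(cueFiles)) {
    const { sound } = await Audio.Sound.createAsync(file);
    cues[category] = sound;
  }
}

async function announce(annotation, metrics) {
  const category = classifyHeight(metrics);
  if (category === 'flat') return; // ignorable: stay silent

  const sound = cues[category];
  await sound.setVolumeAsync(estimateVolume(metrics)); // closer => louder
  await sound.replayAsync();

  Speech.speak(`${annotation.name}, ${category}`); // name the object
}
```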

Accomplishments that we're proud of

  • Integrated the Google Vision API for advanced object detection.
  • Built a multi-layer audio system for spatial feedback.
  • Designed a custom accuracy algorithm to assess feedback quality.
  • Delivered a fully functional prototype entirely in JavaScript/Expo.

What we learned

  • Designing for accessibility means thinking through every sound, delay, and UI choice.
  • Real-world detection requires fallback systems and smart filtering.
  • Expo is surprisingly capable when used creatively.
  • Even without native tools, assistive tech can be built effectively and meaningfully.

What's next for Vision

  • 🧭 Add left/right spatial awareness using horizontal bounding box positions (sketched below).
  • 📸 Implement auto-frame capture so users don’t have to tap.
  • 🔊 Improve audio feedback with more natural sounds.
  • 🔭 Enable longer-range detection and zoom features.
  • 💡 Explore offline ML models to speed up detection and remove API dependency.
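
For the first item, a rough first cut could bucket the box's horizontal center into zones; the thresholds here are guesses:

```javascript
// Rough cut at left/right awareness from the horizontal box center;
// the 1/3 and 2/3 thresholds are illustrative.
function horizontalZone(normalizedVertices) {
  const xs = normalizedVertices.map((v) => v.x ?? 0);
  const center = (Math.min(...xs) + Math.max(...xs)) / 2;
  if (center < 1 / 3) return 'left';
  if (center > 2 / 3) return 'right';
  return 'ahead';
}

// Folded into the spoken cue, e.g.:
// Speech.speak(`${annotation.name}, ${category}, ${horizontalZone(vertices)}`);
```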

Built With

  • React Native (Expo)
  • Expo Camera / Expo AV
  • Google Vision API
  • JavaScript
