Inspiration
We were inspired by the real challenges that visually impaired people face every day just trying to get around. The hackathon theme of cloning got us thinking: what if you could clone the experience of sight itself, replicating spatial awareness through entirely different senses? The idea of a phone acting as a full awareness system for someone who can't see felt like something worth spending a hackathon on.
What it does
DualSight clones spatial vision for the visually impaired, turning a standard iPhone into a full awareness system. It scans the environment in real time, identifies objects, and converts those detections into directional audio cues that tell the user what is around them and where, replicating the situational awareness that sighted people take for granted. A priority system filters out unimportant background objects and focuses on things that actually matter. A 100° spatial radar shows the field ahead visually, and haptic feedback communicates how close objects are through touch. Everything runs on-device, with no internet connection needed and under 100 ms of latency.
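To give a feel for the cue mapping, here is a minimal sketch (the helper names and the normalized-distance convention are illustrative, not our actual DualSight code): a detection's horizontal position drives stereo pan, and its estimated distance drives haptic intensity.

```swift
import CoreGraphics
import CoreHaptics

// Illustrative helpers only; not the actual DualSight implementation.

// Map a detection's horizontal position (0 = far left of frame, 1 = far right)
// to a stereo pan value in [-1, 1] for directional audio.
func panValue(forNormalizedX x: CGFloat) -> Float {
    Float(x) * 2.0 - 1.0
}

// Play a single haptic tap whose intensity grows as the object gets closer.
// `normalizedDistance` is 0 when the object is right at the phone, 1 at the edge of range.
func playProximityTap(engine: CHHapticEngine, normalizedDistance: Float) throws {
    let intensity = CHHapticEventParameter(parameterID: .hapticIntensity,
                                           value: 1.0 - normalizedDistance)
    let sharpness = CHHapticEventParameter(parameterID: .hapticSharpness, value: 0.5)
    let event = CHHapticEvent(eventType: .hapticTransient,
                              parameters: [intensity, sharpness],
                              relativeTime: 0)
    let pattern = try CHHapticPattern(events: [event], parameters: [])
    let player = try engine.makePlayer(with: pattern)
    try player.start(atTime: CHHapticTimeImmediate)
}
```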
How we built it
The app is built in Swift with SwiftUI in Xcode. The vision side runs YOLOv8m, a computer vision model trained on over 600 object classes, converted with Core ML to run natively on the iPhone's Neural Engine. We built a tracking system that keeps tabs on objects across frames so the audio cues don't flicker, and packaged the core engine into its own Swift package, BlindGuyKit, so the app stays clean and modular. The audio engine uses AVFoundation so safety announcements can duck under music or podcasts automatically. We used LLMs to generate boilerplate code that our team then manually debugged, refactored, and optimized.
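For reference, this is roughly the kind of AVAudioSession setup that lets spoken announcements duck other audio; a minimal sketch, and the exact category, mode, and options in BlindGuyKit may differ.

```swift
import AVFoundation

// Sketch: configure the shared audio session so spoken safety announcements
// automatically duck music or podcasts playing in other apps.
func configureAnnouncementAudio() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playback,
                            mode: .spokenAudio,
                            options: [.duckOthers])
    try session.setActive(true)
}

// Speak a single directional cue.
let synthesizer = AVSpeechSynthesizer()

func announce(_ text: String) {
    let utterance = AVSpeechUtterance(string: text)
    utterance.rate = AVSpeechUtteranceDefaultSpeechRate
    synthesizer.speak(utterance)
}
```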
Challenges we ran into
Getting the audio engine to behave correctly was a bigger challenge than we expected. We ran into a frustrating AVAudioSession error that took a long time to track down and fix. Tuning which objects get announced and which get ignored, essentially deciding what a cloned sense of sight should and shouldn't pay attention to, also required a lot of trial and error. On top of that, keeping the vision loop fast without freezing the UI took significant work to get right.
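On that last point, the detection work has to stay off the main thread so SwiftUI keeps rendering. A sketch of the pattern we converged on (type and queue names here are assumed for illustration, not our exact code):

```swift
import Foundation
import CoreVideo
import CoreML
import Vision

// Keep the vision loop on a background queue so the UI never freezes.
// `DetectionLoop` and its queue label are illustrative, not actual DualSight types.
final class DetectionLoop {
    private let visionQueue = DispatchQueue(label: "dualsight.detection", qos: .userInitiated)
    private let request: VNCoreMLRequest

    init(model: VNCoreMLModel) {
        request = VNCoreMLRequest(model: model)
        request.imageCropAndScaleOption = .scaleFill
    }

    // Runs one inference pass on a camera frame and hands results back on the main queue.
    func process(_ pixelBuffer: CVPixelBuffer,
                 completion: @escaping ([VNRecognizedObjectObservation]) -> Void) {
        visionQueue.async {
            let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right)
            let results: [VNRecognizedObjectObservation]
            do {
                try handler.perform([self.request])
                results = self.request.results as? [VNRecognizedObjectObservation] ?? []
            } catch {
                results = []
            }
            DispatchQueue.main.async { completion(results) }
        }
    }
}
```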
Accomplishments that we're proud of
We are proud that the whole thing runs on the phone with no server involved. Getting a 15 Hz detection rate with real object tracking and spatial audio working together on a phone is something we honestly weren't sure we could pull off, and seeing it actually clone that sense of spatial awareness in real time felt like a genuine breakthrough. We're also proud of how much of the app we were able to finish in the time we had.
What we learned
Trying to clone a human sense forced us to think harder about the actual user experience than we usually would. Every small decision had a real impact on someone who can't look at a screen, because for them, this audio and haptic output is their vision. We also learned a ton about how to properly optimize machine learning models for Apple hardware.
What's next for DualSight
We want to add a mode where the app gives a full spoken description of the whole scene on demand, a complete clone of the "glance around the room" experience. We were also working on LiDAR support for better depth sensing and an Apple Watch companion app for haptic feedback on the wrist, but those didn't make it in time. We also want to open source BlindGuyKit so other developers can build on top of it.
Built With
- ai
- avfoundation
- corehaptics
- coreml
- flask
- python
- swift
- swiftui
- xcode
- yolov8