Inspiration
We often take our awareness for granted. But for millions of people, whether visually impaired or living with cognitive and developmental disabilities, the world is full of silent threats. For a blind person, the danger is unseen. For someone with an intellectual disability, the danger may be unrecognized or difficult to process in real-time.
The statistics are alarming. According to the Bureau of Justice Statistics, people with disabilities are nearly 4x more likely to be victims of violent crime than the general population. Predators specifically target these groups because they are viewed as "easy targets" who may struggle to identify an attacker or articulate to police what happened.
We realized that while many assistive tools focus on navigation, almost none focus on defense. A simple walk home shouldn't require a choice between independence and safety. We built BlindSpot, an autonomous digital bodyguard that watches your back and speaks up when you can't, to restore that confidence.
What it does
BlindSpot acts as an autonomous AI agent that monitors the user's surroundings for dangers they cannot see or process.
The system continuously scans the rear view for quiet threats, ranging from oncoming traffic to stalkers (using Re-Identification logic to track persistent followers).
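The persistent-follower check can be sketched as a small pure function. This is an illustrative sketch, not our production code: it assumes the vision model already assigns a stable track ID to each person in the rear view, and the names (`Sighting`, `flagFollowers`) and thresholds are our own for this example.

```typescript
interface Sighting {
  trackId: string;   // stable ID the re-identification model assigns to one person
  timestamp: number; // seconds since monitoring started
}

// Flag any track ID whose sightings span at least `minDurationSec`
// and include at least `minSightings` detections -- i.e. someone who
// keeps reappearing behind the user over a long window.
function flagFollowers(
  sightings: Sighting[],
  minDurationSec: number,
  minSightings: number,
): string[] {
  const firstSeen = new Map<string, number>();
  const lastSeen = new Map<string, number>();
  const counts = new Map<string, number>();

  for (const s of sightings) {
    if (!firstSeen.has(s.trackId)) firstSeen.set(s.trackId, s.timestamp);
    lastSeen.set(s.trackId, s.timestamp);
    counts.set(s.trackId, (counts.get(s.trackId) ?? 0) + 1);
  }

  const followers: string[] = [];
  for (const [id, first] of firstSeen) {
    const span = (lastSeen.get(id) ?? first) - first;
    if (span >= minDurationSec && (counts.get(id) ?? 0) >= minSightings) {
      followers.push(id);
    }
  }
  return followers;
}
```

A passerby who appears once is ignored; only a track that persists across the window is surfaced to the user as a possible stalker.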
It communicates stealthily with the user, using haptic vibrations and concise audio narration to warn them of immediate danger without drawing attention.
If a threat escalates to the point where the user becomes incapacitated, BlindSpot automatically calls 911 and any emergency contacts they choose. Its voice agent speaks to dispatchers with real-time context from the labeled video feed, information the user cannot provide themselves.
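The escalation policy above can be summarized as a tiny decision function. This is a minimal sketch under our own assumptions: the three-level severity scale, the "is the user responsive" signal, and all names here are illustrative, not BlindSpot's actual API.

```typescript
type ThreatLevel = "none" | "low" | "high";

interface ResponseActions {
  haptic: boolean;        // discreet vibration warning
  narrate: boolean;       // concise spoken description of the threat
  callEmergency: boolean; // hand off to the voice agent, which dials 911
}

// Low-severity threats get a silent haptic pulse; high-severity threats add
// audio narration; a high-severity threat with an unresponsive user triggers
// the emergency call.
function chooseResponse(level: ThreatLevel, userResponsive: boolean): ResponseActions {
  if (level === "high" && !userResponsive) {
    return { haptic: false, narrate: false, callEmergency: true };
  }
  if (level === "high") {
    return { haptic: true, narrate: true, callEmergency: false };
  }
  if (level === "low") {
    return { haptic: true, narrate: false, callEmergency: false };
  }
  return { haptic: false, narrate: false, callEmergency: false };
}
```

Keeping the policy in one pure function makes the stealth requirement explicit: nothing audible happens unless the threat is already severe.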
How we built it
- React Native (Frontend UI)
- Next.js (Backend server)
- Metro Bundler (Compiles JS code into optimized files for mobile deployment)
- Overshoot (VLM that flags potential threats, altercations, signs of violence, injuries, etc.)
- Google Gemini (Converts Overshoot scene data into descriptive text)
- LiveKit (Voice agent with situational context from Gemini that calls emergency services)
Challenges we ran into
- Without a Raspberry Pi-to-webcam connection, we resorted to using an iPhone as the camera and our laptop as the backend host. However, sandboxes like Expo Go (commonly used for iOS development) didn't support the libraries needed to work with LiveKit and Overshoot. Instead, we used Metro Bundler directly (which is what Expo Go uses under the hood to compile and bundle JS code) and ran the Metro dev server manually via the CLI.
- We were unable to stream video directly from the phone to the backend server (on the laptop) because firewall restrictions on both the iPhone hotspot and CMU public WiFi block peer-to-peer (P2P) traffic. Instead, we used Supabase as a proxy, uploading video snippets from the iPhone to a storage bucket. A WebSocket listener on the backend then pulls those clips and passes them to Overshoot as a continuous stream. To prevent storage bloat, older clips are deleted from the bucket as new ones flow in.
- Because the latency overhead of sending and retrieving data through Supabase was higher than expected, we shortened the clips passed to Overshoot (1-2 seconds) to compensate, increasing the number of clips per minute of streamed video. However, this gives Overshoot far less context to judge whether a given clip warrants an emergency call. So instead of having Overshoot make that decision, we use it to generate scene descriptions. These descriptions are concatenated and passed to Gemini 2.5 Flash in batches, which decides whether to initiate an emergency call and passes the relevant context to LiveKit. This design reduces overall latency while maintaining accurate emergency detection.
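The rolling-retention step of the Supabase workaround can be sketched as a pure function that decides which clips to evict. This is an assumption-laden illustration: the clip-naming scheme and retention count are hypothetical, and the actual deletion would go through the Supabase storage client rather than happening here.

```typescript
// Given clip names whose numeric component encodes upload order
// (e.g. "clip-0007.mp4"), return the names older than the newest `keep`
// clips, so the caller can delete them from the storage bucket.
function clipsToDelete(clipNames: string[], keep: number): string[] {
  const uploadOrder = (name: string): number =>
    parseInt(name.match(/\d+/)?.[0] ?? "0", 10);

  const sorted = [...clipNames].sort((a, b) => uploadOrder(a) - uploadOrder(b));
  // Everything except the last `keep` entries is stale.
  return sorted.slice(0, Math.max(0, sorted.length - keep));
}
```

Running this after each upload keeps the bucket bounded no matter how long the camera streams.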
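The batching step that hands Overshoot's per-clip descriptions to Gemini can be sketched as two small functions. Again a hedged sketch: the batch size, prompt wording, and type names are our illustrative choices, not the exact prompt BlindSpot sends.

```typescript
interface SceneDescription {
  clipId: number;
  text: string; // Overshoot's description of one 1-2 second clip
}

// Group descriptions into fixed-size batches, preserving temporal order,
// so each Gemini call sees a contiguous window of the scene.
function batchDescriptions(
  descriptions: SceneDescription[],
  batchSize: number,
): SceneDescription[][] {
  const batches: SceneDescription[][] = [];
  for (let i = 0; i < descriptions.length; i += batchSize) {
    batches.push(descriptions.slice(i, i + batchSize));
  }
  return batches;
}

// Concatenate one batch into a single decision prompt for Gemini.
function buildPrompt(batch: SceneDescription[]): string {
  const scenes = batch.map((d) => `Clip ${d.clipId}: ${d.text}`).join("\n");
  return (
    `The following are consecutive rear-camera scene descriptions.\n` +
    `${scenes}\n` +
    `Does this sequence indicate an emergency requiring a 911 call? ` +
    `Answer yes or no with a one-sentence justification.`
  );
}
```

Batching recovers the temporal context that the short clips take away from Overshoot: Gemini judges the sequence, not any single 1-2 second slice.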
Accomplishments that we're proud of
- Using Supabase as an intermediary server to get around P2P blocking.
- Accurate first-person threat detection, a setting where models traditionally perform significantly worse than on third-person footage.
What we learned
This project, which we assumed would primarily give us experience with AI agents, turned out to be an even bigger lesson in networking, particularly communication protocols like UDP.
What's next for BlindSpot
Due to limited access to hardware, we plan to migrate the camera to a dedicated hardware system (a Raspberry Pi and camera housed in a backpack, and/or Meta Ray-Ban glasses) convenient enough that users may even forget they're wearing BlindSpot. Our vision system also has higher latency because we could not livestream data directly from the phone camera to the backend server and had to route it through Supabase. With a Raspberry Pi, the streaming connection should be seamless, giving much faster feedback and response times and eliminating the need for Supabase as an intermediary server (it would still store video clips of emergencies as legal evidence).
Built With
- livekit
- metro-bundler
- next.js
- overshoot
- react-native
- supabase
