Shadow Sound

Inspiration

Over 430 million people worldwide live with disabling hearing loss. For deaf and hard-of-hearing individuals, something as simple as walking down a street carries real risk — an approaching ambulance, a car horn, someone shouting a warning. Existing solutions either require expensive specialized hardware or are too slow to matter in the moment.

We wanted to build something different: a safety tool that lives on the phone people already carry, costs nothing extra, and works in real time. Shadow Sound was born from a simple question — what if your phone could be your ears?

What it does

Shadow Sound listens to your environment through your phone's microphone and alerts you to safety-critical sounds through haptic vibration patterns — no sound required on your end.

When the app detects a sound, it:

  • Classifies it into 14 safety-relevant categories (emergency sirens, car horns, shouting, dog barking, glass breaking, and more)
  • Triggers a distinct haptic pattern matched to the sound type — a rapid triple pulse for a siren feels nothing like a gentle tap for footsteps
  • Flashes the screen in urgency-coded colors (red for critical, orange for high, yellow for medium) so the alert is impossible to miss
  • Logs a history of every detection so users can review what happened around them

The whole pipeline runs in under 2 seconds end-to-end.

How we built it

Frontend: React Native (Expo) for iOS and Android. Audio is captured in 1.5-second chunks via the device microphone, base64-encoded, and streamed to the backend over a persistent WebSocket connection. Haptic feedback is delivered through expo-haptics with custom vibration patterns per sound category. The UI is built around a dark, high-contrast design so alerts are visible in any lighting condition.
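The per-category vibration patterns described above are easiest to see as data. A minimal sketch in Python (the real patterns drive expo-haptics from the React Native side; the exact timings below are illustrative, not the app's actual values):

```python
# Hypothetical haptic pattern table: each pattern is a sequence of
# (vibrate_ms, pause_ms) pairs played in order. The timings here are
# assumptions for illustration, not the shipped values.
HAPTIC_PATTERNS = {
    "siren":     [(80, 60), (80, 60), (80, 400)],  # rapid triple pulse
    "car_horn":  [(200, 100), (200, 400)],         # two firm buzzes
    "footsteps": [(40, 600)],                      # single gentle tap
}

def pattern_duration_ms(category: str) -> int:
    """Total length of one cycle of a category's haptic pattern."""
    return sum(vibrate + pause for vibrate, pause in HAPTIC_PATTERNS[category])
```

Keeping patterns as data rather than code makes it cheap to iterate on feel, which mattered (see "Haptic design is a real discipline" below).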

Backend: Python + FastAPI serving a WebSocket endpoint. Audio chunks are decoded and resampled to 16kHz mono using pydub and librosa, then passed through Google's YAMNet model loaded from TensorFlow Hub. YAMNet classifies audio into 521 AudioSet event classes — we map the relevant ones to our 14 app categories and return the result with urgency level and haptic pattern in under 500ms. The backend runs in Docker with a hardened, non-root container configuration.
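The first backend step, decoding the base64 payload and bringing it down to 16kHz, can be sketched as follows. Assumptions: the chunk is raw 16-bit little-endian mono PCM and the source rate divides evenly, so naive decimation stands in for the anti-aliased resampling that pydub and librosa actually perform:

```python
import base64

def decode_and_resample(b64_pcm: str, src_rate: int = 48_000,
                        dst_rate: int = 16_000) -> list[int]:
    """Illustrative stand-in for the pydub + librosa step: decode the
    base64 chunk into signed 16-bit samples, then decimate to the target
    rate. (Production audio is AAC/M4A and resampling should be filtered;
    this shows only the shape of the operation.)"""
    raw = base64.b64decode(b64_pcm)
    samples = [int.from_bytes(raw[i:i + 2], "little", signed=True)
               for i in range(0, len(raw), 2)]
    step = src_rate // dst_rate  # 3 for 48 kHz -> 16 kHz
    return samples[::step]
```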

ML: YAMNet is a pre-trained deep neural network trained on Google's AudioSet, a corpus of roughly 2 million human-labeled 10-second YouTube clips. We didn't train anything ourselves: rather than fine-tuning, we simply mapped YAMNet's 521 output classes onto our safety-focused category set, which let us skip weeks of data collection and get to a working classifier in hours.
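The class mapping is essentially a lookup table. A hypothetical slice of it, using genuine AudioSet label names but with category and urgency assignments that are our illustration rather than the app's exact table:

```python
# Hypothetical subset of the YAMNet-class -> app-category mapping.
# YAMNet emits 521 AudioSet classes; most are ignored, and the
# safety-relevant ones collapse onto the app's 14 categories.
YAMNET_TO_CATEGORY = {
    "Siren":                           ("emergency_siren", "critical"),
    "Civil defense siren":             ("emergency_siren", "critical"),
    "Vehicle horn, car horn, honking": ("car_horn", "high"),
    "Shout":                           ("shouting", "high"),
    "Bark":                            ("dog_barking", "medium"),
}

def map_prediction(yamnet_label: str):
    """Return (app_category, urgency), or None for the ~500 classes
    the app deliberately ignores."""
    return YAMNET_TO_CATEGORY.get(yamnet_label)
```

Because several YAMNet classes fold into one app category, a single haptic pattern can cover an entire family of related sounds.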

Challenges we ran into

Silent audio files. The single most painful bug of the hackathon. The backend was receiving audio, decoding it successfully, and YAMNet was confidently returning "Silence: 1.0" for every single clip. The WAV files being produced were valid but only 3KB instead of the expected 47KB — meaning the microphone wasn't actually capturing audio even though the recording session appeared to start. The fix was switching from raw LinearPCM WAV to AAC/M4A output format and ensuring the iOS audio session mode was set with allowsRecordingIOS: true before the recording object was created, with a small delay to let the session activate.
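A guard we could have had from the start is a server-side silence check: flag any chunk whose RMS energy is near zero before handing it to the model. A minimal sketch (the threshold value is an assumption):

```python
def looks_silent(samples: list[int], peak: int = 32_768,
                 rms_threshold: float = 0.001) -> bool:
    """Flag 'valid but empty' chunks, like the 3KB WAVs the broken iOS
    audio session produced. samples are signed 16-bit PCM values;
    rms_threshold is an assumed cutoff, tuned per device in practice."""
    if not samples:
        return True
    rms = (sum(s * s for s in samples) / len(samples)) ** 0.5
    return rms / peak < rms_threshold
```

Had this run on the first chunk, the bug would have surfaced as "client is sending silence" instead of hours of staring at "Silence: 1.0".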

YAMNet confidence distribution. YAMNet spreads probability across 521 classes, so even a clearly loud siren might only score 0.4 on the "Siren" class. Setting a naive 0.85 threshold (as our design doc originally specified) meant almost nothing got through. We tuned this down to 0.4 after testing and added top-3 logging to understand what the model was actually seeing.
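The tuned threshold plus top-3 logging amounts to a few lines. A sketch over a generic {class: score} dict (the function names are ours, not the actual codebase's):

```python
def top_k(scores: dict[str, float], k: int = 3):
    """Top-k classes by score: logging these showed what the model
    was actually seeing on each chunk."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

def detect(scores: dict[str, float], threshold: float = 0.4):
    """Return the best class if it clears the tuned threshold, else None.
    Under the design doc's original 0.85 threshold, a realistic siren
    score of ~0.4 would have been silently dropped."""
    best_class, best_score = top_k(scores, k=1)[0]
    return best_class if best_score >= threshold else None
```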

WebSocket state on mobile. Mobile apps aggressively suspend background processes. Managing reconnect logic, queuing audio chunks during re-auth, and keeping the connection alive across React Native's render cycles required careful use of refs vs state to avoid stale closures.
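The chunk-queuing part of that logic is language-agnostic. The real code lives in React Native, but the idea, a bounded buffer that drops the oldest chunks so alerts stay fresh and flushes in order on reconnect, can be sketched in Python:

```python
from collections import deque

class ChunkBuffer:
    """Bounded queue for audio chunks while the socket is down.
    A safety alert is only useful if it is recent, so when the buffer
    is full the oldest chunk is evicted rather than the newest dropped."""

    def __init__(self, max_chunks: int = 8):
        self.pending = deque(maxlen=max_chunks)  # evicts oldest when full

    def enqueue(self, chunk: bytes) -> None:
        self.pending.append(chunk)

    def flush(self, send) -> int:
        """Send all buffered chunks in arrival order; returns the count."""
        sent = 0
        while self.pending:
            send(self.pending.popleft())
            sent += 1
        return sent
```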

Accomplishments that we're proud of

  • End-to-end pipeline in under 24 hours — from microphone to haptic feedback, fully working on a real device
  • Sub-2-second latency from sound to haptic alert, which is fast enough to be genuinely useful in real-world safety scenarios
  • 14 sound categories with distinct haptic patterns, each designed to feel meaningfully different from the others — a siren should feel like a siren, not a doorbell
  • Zero extra hardware — the entire system runs on a phone everyone already owns
  • Demo mode — we built an on-stage demo mode that fires any alert scenario instantly, because we learned the hard way that you can't guarantee a dog will bark on cue

What we learned

  • Pre-trained models are extraordinarily powerful when used correctly — YAMNet gave us a production-quality audio classifier without a single training run
  • The gap between "audio is being recorded" and "audio contains real data" is wider than you'd expect, especially on iOS
  • Haptic design is a real discipline — early versions of our patterns all felt the same, and it took deliberate iteration to make them meaningfully distinct
  • For accessibility tools, the feedback loop with real users matters more than technical sophistication. The question isn't "does it classify correctly" — it's "does the person feel safe"

What's next for Shadow Sound

  • Wear OS / Apple Watch companion app — send haptic alerts directly to a smartwatch so users feel them even when the phone is in a bag
  • Directional detection — use multiple microphones or spatial audio processing to indicate which direction a sound is coming from, displayed as a compass on screen
  • Personalized sensitivity — let users tune which sound categories matter to them and at what confidence threshold
  • Background / always-on mode — run the pipeline as a persistent background service so the app doesn't need to be open
  • On-device inference — move YAMNet to run locally using TensorFlow Lite, eliminating the backend dependency and enabling fully offline use
  • Community sound profiles — crowdsourced environment profiles (busy intersection vs quiet suburb vs construction site) that adjust sensitivity automatically