# Inspiration
What if you could play a full instrument without touching anything? We wanted to build
something that feels like magic — where your body is the controller. No MIDI keyboards,
no touchscreens, just a webcam and whatever energy you bring.
# What it does
Theremin turns your webcam into a multi-layered musical instrument:
- Raise & lower your hands to play melody across scales (Egyptian, Hijaz, Pentatonic, and more)
- Hand size (distance to camera) controls volume
- Bang your head to trigger 808 kick drums — harder bangs hit harder
- Hold a fist ✊ and head bang to tap a tempo, then open your hand to lock a steady beat loop
- Single 🤟 activates a tanpura-style drone that follows your hand height
- Double 🤟 triggers an epic bass drop with screen shake and particle explosion
- Point fingers together 👉👈 to hush everything — sing or talk, then release to bring it all back
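
As a rough illustration of the hand-to-melody mapping described above, here is a minimal sketch of how a normalized hand height could be quantized to a scale and how apparent hand size could be turned into a gain. This is not the project's actual code: the `SCALES` table, the function names, the root note, the octave range, and the size thresholds are all assumptions made for the example. The Hijaz entry is a 12-tone approximation.

```javascript
// Hypothetical scale table: semitone offsets from the root note.
const SCALES = {
  pentatonic: [0, 2, 4, 7, 9],     // major pentatonic
  hijaz: [0, 1, 4, 5, 7, 8, 10],   // 12-tone approximation of Hijaz
};

// Convert a MIDI note number to a frequency in Hz (MIDI 69 = A4 = 440 Hz).
function midiToFreq(midi) {
  return 440 * Math.pow(2, (midi - 69) / 12);
}

// Map a normalized hand height (0 = bottom of frame, 1 = top) to a note
// of the given scale. rootMidi and octaves are assumed defaults.
function heightToFreq(height, scale, rootMidi = 57, octaves = 2) {
  const h = Math.min(1, Math.max(0, height));      // clamp to [0, 1]
  const steps = scale.length * octaves;            // total playable notes
  const index = Math.min(steps - 1, Math.floor(h * steps));
  const octave = Math.floor(index / scale.length);
  const degree = index % scale.length;
  return midiToFreq(rootMidi + 12 * octave + scale[degree]);
}

// Map apparent hand size (as a fraction of frame width) to a gain in [0, 1].
// minSize and maxSize are assumed calibration values.
function handSizeToGain(size, minSize = 0.05, maxSize = 0.4) {
  const t = (size - minSize) / (maxSize - minSize);
  return Math.min(1, Math.max(0, t));
}
```

In a browser, the returned frequency and gain could then be applied to an `OscillatorNode` and a `GainNode`. Note that image-space landmark coordinates typically grow downward, so the height passed in would likely be `1 - y`.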
# Challenges we ran into
- Running two ML models (hands + face mesh) on every camera frame while synthesizing audio in real-time required careful sequencing to avoid frame drops
- Head bang detection was tricky: frame rate variations made velocity-based approaches unstable. We switched to a position-window baseline comparison that doesn't depend on timing
- Gesture detection flickers when your hand moves during a head bang. We added frame-count debouncing so a fist doesn't accidentally "release" mid-recording
- Getting the 808 kick to sound right with pure synthesis took several iterations of pitch curves, saturation, and layering
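
To make the head bang and debouncing ideas above more concrete, here is one possible sketch. It is an interpretation, not the project's implementation: the class names, window size, thresholds, and the strength formula are all hypothetical. The detector compares the current head position with a baseline taken from the last few frames instead of computing a velocity, and the debouncer only accepts a gesture change after it has been seen for several consecutive frames.

```javascript
// Detects a downward head "bang" by comparing the current head position to a
// baseline from recent frames, so no per-frame velocity (dy/dt) is computed.
class HeadBangDetector {
  constructor({ windowSize = 8, threshold = 0.06 } = {}) {
    this.windowSize = windowSize; // frames kept for the baseline (assumed value)
    this.threshold = threshold;   // drop that counts as a bang (assumed value)
    this.history = [];
    this.armed = true;            // blocks retriggering until the head recovers
  }

  // y: normalized vertical head position, larger = lower in the frame.
  // Returns a strength in (0, 1] when a bang fires, otherwise 0.
  update(y) {
    let strength = 0;
    if (this.history.length === this.windowSize) {
      const baseline = Math.min(...this.history); // highest recent position
      const drop = y - baseline;
      if (this.armed && drop > this.threshold) {
        strength = Math.min(1, drop / (this.threshold * 3)); // bigger drop, harder hit
        this.armed = false;
      } else if (!this.armed && drop < this.threshold / 2) {
        this.armed = true; // head is back near the baseline
      }
    }
    this.history.push(y);
    if (this.history.length > this.windowSize) this.history.shift();
    return strength;
  }
}

// Reports a new gesture only after it has been seen for `frames` consecutive
// frames, so a one-frame flicker does not change the stable gesture.
class GestureDebouncer {
  constructor(frames = 5, initial = "none") {
    this.frames = frames;
    this.stable = initial;
    this.candidate = initial;
    this.count = 0;
  }

  update(raw) {
    if (raw === this.stable) {
      this.candidate = raw;
      this.count = 0;
      return this.stable;
    }
    if (raw === this.candidate) {
      this.count += 1;
    } else {
      this.candidate = raw;
      this.count = 1;
    }
    if (this.count >= this.frames) {
      this.stable = raw;
      this.count = 0;
    }
    return this.stable;
  }
}
```

Both classes would be called once per processed camera frame, for example with the nose landmark's `y` value and the raw gesture label from the hand model. Because they count frames rather than milliseconds, the window lengths may still need tuning for very low frame rates.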
# What we learned
Your body is surprisingly expressive as a musical controller. The gap between "this is silly" and "wait, this actually sounds cool" is about 30 seconds of playing with it.
# Built With
- javascript
- mediapipe
- webaudio



