# Inspiration

What if you could play a full instrument without touching anything? We wanted to build something that feels like magic, where your body is the controller: no MIDI keyboards, no touchscreens, just a webcam and whatever energy you bring.

# What it does

Theremin turns your webcam into a multi-layered musical instrument:

  • Raise and lower your hands to play melodies across scales (Egyptian, Hijaz, Pentatonic, and more)
  • Hand size (distance to camera) controls volume
  • Bang your head to trigger 808 kick drums — harder bangs hit harder
  • Hold a fist ✊ and bang your head to tap a tempo, then open your hand to lock in a steady beat loop
  • Single 🤟 activates a tanpura-style drone that follows your hand height
  • Double 🤟 triggers an epic bass drop with screen shake and particle explosion
  • Point fingers together 👉👈 to hush everything — sing or talk, then release to bring it all back
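
The hand-to-sound mapping above can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the function names, the scale tables (12-tone approximations), the root note, and the near/far hand-size bounds are all assumptions, as is the choice that a larger (closer) hand means louder.

```python
# Hypothetical scale tables: semitone offsets from the root note.
SCALES = {
    "pentatonic": [0, 2, 4, 7, 9],       # major pentatonic
    "hijaz": [0, 1, 4, 5, 7, 8, 10],     # 12-tone approximation of Hijaz
}


def height_to_frequency(y, scale="pentatonic", root_midi=57, octaves=2):
    """Map a normalized hand height to a frequency in Hz, snapped to a scale.

    y is assumed to be in image coordinates: 0.0 = top of frame, 1.0 = bottom.
    """
    steps = SCALES[scale]
    height = min(max(1.0 - y, 0.0), 1.0)   # flip so a raised hand gives a higher pitch
    total = len(steps) * octaves           # number of playable notes
    index = min(int(height * total), total - 1)
    octave, degree = divmod(index, len(steps))
    midi = root_midi + 12 * octave + steps[degree]
    return 440.0 * 2 ** ((midi - 69) / 12)  # MIDI note number to Hz (A4 = 69 = 440 Hz)


def hand_size_to_volume(size, near=0.4, far=0.1):
    """Map apparent hand size (a proxy for distance to camera) to a gain in [0, 1]."""
    gain = (size - far) / (near - far)
    return min(max(gain, 0.0), 1.0)
```

Snapping to scale degrees rather than mapping height to a continuous pitch is what keeps every hand position in key, at the cost of the glide a real theremin has.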

# Challenges we ran into

  • Running two ML models (hands + face mesh) on every camera frame while synthesizing audio in real time required careful sequencing to avoid frame drops
  • Head bang detection was tricky: frame rate variations made velocity-based approaches unstable. We switched to a position-window baseline comparison that doesn't depend on timing
  • Gesture detection flickers when your hand moves during a head bang. We added frame-count debouncing so a fist doesn't accidentally "release" mid-recording
  • Getting the 808 kick to sound right with pure synthesis took several iterations of pitch curves, saturation, and layering
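
One way the head bang and debouncing fixes above could look is sketched below. It is an illustration under assumptions, not our exact implementation: the class names, window size, threshold, and hold count are hypothetical, and head height is assumed to be a normalized image coordinate where larger values are lower in the frame.

```python
from collections import deque


class HeadBangDetector:
    """Fires when the head drops far enough below a rolling positional baseline."""

    def __init__(self, window=10, threshold=0.05):
        self.history = deque(maxlen=window)  # recent head positions
        self.threshold = threshold
        self.armed = True                    # prevents one bang firing repeatedly

    def update(self, y):
        """Feed one frame's head height; return bang strength (0.0 if no bang)."""
        strength = 0.0
        if len(self.history) == self.history.maxlen:
            baseline = sum(self.history) / len(self.history)
            drop = y - baseline              # positive when the head is below baseline
            if self.armed and drop > self.threshold:
                strength = drop              # a deeper drop gives a harder hit
                self.armed = False
            elif drop < self.threshold / 2:
                self.armed = True            # head has come back up, re-arm
        self.history.append(y)
        return strength


class FrameDebouncer:
    """Accepts a new gesture only after `hold` consecutive frames of it."""

    def __init__(self, initial, hold=5):
        self.state = initial
        self.hold = hold
        self._candidate = initial
        self._count = 0

    def update(self, observed):
        if observed == self.state:
            self._candidate, self._count = observed, 0   # flicker ended, reset
            return self.state
        if observed != self._candidate:
            self._candidate, self._count = observed, 0   # new candidate gesture
        self._count += 1
        if self._count >= self.hold:
            self.state, self._count = observed, 0        # stable long enough, switch
        return self.state
```

The detector compares positions only, so no velocity (distance over elapsed time) is ever computed. The debouncer means a fist that briefly reads as an open hand for a frame or two is still treated as a fist.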

# What we learned

Your body is surprisingly expressive as a musical controller. The gap between "this is silly" and "wait, this actually sounds cool" is about 30 seconds of playing with it.
