Inspiration

Imogen Heap is one of my favorite producers because of her unique sound and the innovative way she approaches music. I recently fell down a rabbit hole learning about the MiMU Gloves, a wearable technology she developed that allows musicians to control sound and effects through hand gestures. As a musician myself, I was inspired to build a more accessible version that uses a laptop webcam to achieve a similar result.

What it does

Using the laptop’s webcam, I implemented real-time hand tracking to detect specific gestures and map them to musical effects. Each gesture acts as a control signal, letting me manipulate sound dynamically as I perform. For example, holding an open palm begins recording a loop, which continues until I close my fist to stop it. Raising one finger adds reverb, while raising two fingers shifts the pitch up by a couple of semitones. These hand movements transform the audio in real time, with no physical hardware needed.
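To make the mapping concrete, here is a minimal sketch of how a recognized gesture could select a Pedalboard effect chain. The gesture labels, function name, and effect parameters are illustrative placeholders rather than my exact implementation.

```python
# Sketch: mapping gesture labels to Pedalboard effect chains.
# The labels ("one_finger", "two_fingers") are hypothetical placeholders.
import numpy as np
from pedalboard import Pedalboard, PitchShift, Reverb

SAMPLE_RATE = 44100

# Each recognized gesture selects an effect chain for the recorded loop.
GESTURE_EFFECTS = {
    "one_finger": Pedalboard([Reverb(room_size=0.6, wet_level=0.4)]),
    "two_fingers": Pedalboard([PitchShift(semitones=2)]),
}

def apply_gesture_effect(gesture: str, audio: np.ndarray) -> np.ndarray:
    """Run the loop through the effect chain mapped to this gesture."""
    board = GESTURE_EFFECTS.get(gesture)
    if board is None:
        return audio  # unmapped gesture: pass the audio through unchanged
    return board(audio, SAMPLE_RATE)
```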

How I built it

The system uses a laptop webcam and OpenCV to capture video frames, which are analyzed with MediaPipe's hand landmark model. MediaPipe provides 21 3D landmarks per detected hand, and I extract their 2D positions to identify the configuration of the fingers and detect custom gestures. On the audio side, I used the sounddevice library to record loops and play them back, and Pedalboard for effects such as gain, chorus, reverb, and pitch shift. To keep the visual and audio processing running smoothly together, I moved all the audio functions onto their own threads.
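Here is a condensed sketch of the detection loop built on MediaPipe's Hands API. The finger-counting heuristic (a fingertip landmark sitting above the PIP joint two indices below it) is a common simplification; treat the thresholds and gesture rules as assumptions rather than my exact code.

```python
# Simplified sketch of the webcam + hand-landmark loop.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def count_raised_fingers(hand_landmarks) -> int:
    """Count non-thumb fingers whose tip is above its PIP joint.

    MediaPipe's 21 landmarks place fingertips at indices 8, 12, 16, and 20,
    with the PIP joint two indices below each tip. Image y grows downward,
    so a raised finger has tip.y < pip.y.
    """
    tips = [8, 12, 16, 20]
    return sum(
        1 for tip in tips
        if hand_landmarks.landmark[tip].y < hand_landmarks.landmark[tip - 2].y
    )

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV captures BGR.
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            fingers = count_raised_fingers(results.multi_hand_landmarks[0])
            # 0 raised fingers ~ closed fist, 4 ~ open palm; 1 and 2
            # would trigger the reverb and pitch-shift gestures.
        cv2.imshow("Gesture Controller", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```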

Challenges I ran into

I originally coded the audio processing and the video processing in separate files for easier debugging, and both worked well individually. When I integrated them into a single script, however, the system no longer ran smoothly because everything executed on one thread, and blocking audio calls stalled the video loop. By implementing multithreading, I got the looping functionality working reliably. The reverb and pitch shift features are still in progress: only the recording and playback gestures are currently functional, though the framework is fully in place to support the other effects.
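The fix reduces to one pattern: every blocking sounddevice call runs on its own daemon thread, so the OpenCV loop never stalls waiting on audio. A minimal sketch of that pattern, with hypothetical gesture labels and a fixed maximum loop length as simplifying assumptions:

```python
# Sketch: keeping blocking audio calls off the video-processing thread.
import threading
import sounddevice as sd

SAMPLE_RATE = 44100
MAX_LOOP_SECONDS = 8

loop_audio = None  # written by the recorder thread, read at playback

def record_loop():
    """Record up to MAX_LOOP_SECONDS; sd.wait() blocks, which is exactly
    why this runs on its own thread instead of inside the video loop."""
    global loop_audio
    loop_audio = sd.rec(MAX_LOOP_SECONDS * SAMPLE_RATE,
                        samplerate=SAMPLE_RATE, channels=1)
    sd.wait()

def on_gesture(gesture: str):
    """Called from the video loop; returns immediately in every case."""
    if gesture == "open_palm":
        threading.Thread(target=record_loop, daemon=True).start()
    elif gesture == "fist":
        sd.stop()  # end the recording early
        if loop_audio is not None:
            # A real looper would trim the unrecorded tail of the buffer.
            sd.play(loop_audio, SAMPLE_RATE, loop=True)  # non-blocking
```

Because sd.play is non-blocking and the recorder runs as a daemon thread, the webcam loop keeps pulling frames at full rate while the audio work happens in the background.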

Accomplishments that I'm proud of

I am proud of the overall cohesiveness of the project. At the start, I had a lot of scattered implementation ideas and wasn’t sure how they would all come together. Seeing both the gesture recognition and the audio processing aspects start to work individually felt rewarding. Although they aren't fully working together yet, it was still exciting to see my ideas come to life.

What I learned

Through this project, I was able to take a deep dive into audio processing with Python. While I had previously worked with tools focused primarily on video processing, exploring audio was a refreshing challenge. I learned about various libraries and techniques specific to sound manipulation, which expanded my skill set and opened my eyes to the complexities of simultaneous real-time audio and video processing.

What's next for Gesture Based Music Controller

Once I overcome the current challenges, I’m excited to explore the range of potential use cases. While this tool offers a unique way for musicians to express themselves, it also holds promise for individuals in special education or those with physical disabilities. By enabling users to create music without the need for precise or strenuous movements, it could provide a new, accessible avenue for creative expression and learning.

Built With

Python, OpenCV, MediaPipe, sounddevice, Pedalboard