That’s My Jam
Winner of SensAI Hack SF 2025 - Best AI with Camera Access
💡 Inspiration
We are obsessed with the intersection of music and Augmented Reality. However, most musical AR apps isolate you in a digital interface, ignoring the physical world around you. We wanted to flip that script by using the headset’s contextual intelligence—letting AI understand your environment and turn everyday objects into expressive tools.
Out of that desire came a simple question: What if anything you pick up could become an instrument?
That idea evolved into That’s My Jam: a playful, immersive way to explore your reality through sound. We didn't just want to make a synthesizer; we wanted to give a "voice" to the inanimate objects filling our lives.
🎸 What it does
That's My Jam transforms the physical world into a jam session. Pick up any object—a banana, a coffee cup, a plastic bottle—and it instantly becomes a playable musical instrument.
The system performs three key actions:
- Identifies the object in your hand using computer vision.
- Generates a "sound personality" that matches the object's physical properties.
- Enables performance using gesture-based controls mapped to a musical scale.
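To make that last step concrete, here is a minimal TypeScript sketch of the scale-quantizing idea: snap a normalized hand height onto a pentatonic scale and return a MIDI note. Names and values are illustrative only; the real mapping lives in our Unity code.

```typescript
// Hypothetical sketch: quantize a normalized hand height (0 = low, 1 = high)
// onto a C-minor pentatonic scale, returning a MIDI note number.
const PENTATONIC_INTERVALS = [0, 3, 5, 7, 10]; // semitone offsets within one octave

function handHeightToMidiNote(
  normalizedHeight: number, // 0..1, e.g. hand height inside the play area
  rootMidiNote = 48,        // C3
  octaves = 2
): number {
  const clamped = Math.min(Math.max(normalizedHeight, 0), 1);
  const totalDegrees = PENTATONIC_INTERVALS.length * octaves;
  const degree = Math.min(Math.floor(clamped * totalDegrees), totalDegrees - 1);
  const octave = Math.floor(degree / PENTATONIC_INTERVALS.length);
  const interval = PENTATONIC_INTERVALS[degree % PENTATONIC_INTERVALS.length];
  return rootMidiNote + octave * 12 + interval;
}

// Example: a hand held halfway up the play area lands mid-scale.
console.log(handHeightToMidiNote(0.5)); // -> 60 (C4)
```

Quantizing to a scale (instead of a continuous pitch) is what keeps the jam sounding musical even with shaky tracking.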
The results are magical: shake a banana to trigger a funky slap bass line, tap a ceramic mug to hear warm lo-fi Rhodes keys, or pluck a clothespin to generate pizzicato strings. Every object offers a unique auditory texture, turning your living room into an orchestra.
⚙️ How we built it
The core experience is a Unity application deployed on the Meta Quest, leveraging its passthrough capabilities for AR.
For the intelligence layer, we engineered a custom Node.js backend to handle the heavy lifting. Here is the technical data flow:
- The Quest captures the user's field of view and isolates the object currently being held.
- The image data is sent to our backend endpoint.
- A multimodal LLM analyzes the visual input not just for identification (e.g., "cup") but also for vibe descriptors (e.g., "hollow, rigid, ceramic").
- The system maps these descriptors to specific sound banks and returns a JSON payload to the headset.
- Using high-frequency hand-tracking data, the Unity app detects specific gestures to trigger the assigned audio samples in real-time.
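To give a feel for the backend half of that flow, here is a trimmed-down Express sketch of the analysis endpoint. The route name, request fields, model ID, and sound-bank table are simplified placeholders rather than our exact code; it assumes an `OPENROUTER_API_KEY` environment variable and Node 18+ for the built-in `fetch`.

```typescript
import express from "express";

const app = express();
app.use(express.json({ limit: "10mb" })); // headset sends a base64-encoded frame

// Toy mapping from LLM descriptors to sound banks; the real table is larger.
const SOUND_BANKS: Record<string, string> = {
  ceramic: "lofi_rhodes",
  metallic: "steel_drum",
  plastic: "slap_bass",
  wooden: "pizzicato_strings",
};

app.post("/analyze", async (req, res) => {
  const { imageBase64 } = req.body;

  // Ask a multimodal model (via OpenRouter's OpenAI-compatible API) for the
  // object's name and a few material/timbre descriptors, as strict JSON.
  const llmResponse = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "openai/gpt-4o", // placeholder model ID
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: 'Identify the held object. Reply as JSON: {"object": string, "descriptors": string[]}' },
            { type: "image_url", image_url: { url: `data:image/jpeg;base64,${imageBase64}` } },
          ],
        },
      ],
    }),
  });

  const data = await llmResponse.json();
  const { object, descriptors } = JSON.parse(data.choices[0].message.content);

  // Pick the first descriptor we have a bank for; fall back to a default.
  const match = descriptors.find((d: string) => SOUND_BANKS[d]);
  res.json({
    object,
    descriptors,
    soundBank: match ? SOUND_BANKS[match] : "default_synth",
  });
});

app.listen(3000);
```

The headset only ever sees the final JSON payload (object, descriptors, sound bank), so all the model-wrangling stays server-side and the Quest app stays lean.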
🤯 Challenges we ran into
- The Midnight Ghost: At midnight, moments after connecting the frontend to the backend, our machine stopped sending POST requests. There were no error logs and no logical reason for the failure. After two hours of panic debugging, we switched laptops, and everything magically worked.
- Occlusion & Tracking: Hand tracking is notoriously difficult when the hand is grasping an arbitrary object. The object often occludes the fingers, confusing the Quest’s sensors. We had to write logic to balance accurate gesture detection with "forgiving" usability so the instrument wouldn't cut out during a performance.
🏆 Accomplishments that we're proud of
- The "IKEA Factor": It’s genuinely fun. After filming our demo, we spent an hour wandering around IKEA, turning random furniture and lamps into music just for the joy of it.
- Magical UX: We built a loop (See Object → Analyze → Play Music) that feels responsive and intuitive.
- Completeness: Despite the 24-hour time crunch, we shipped a functional full-stack AR experience.
🧠 What we learned
- Naming matters: Pun-based placeholder names can (and should) turn into real product names.
- Flow state is real: When we lock in as a team, we can build incredibly fast.
🚀 What's next for That’s My Jam
Ship it. We plan to polish the tracking algorithm for more expressive playing motions. We also want to add a loop station mode: record a beat with a pen, loop it, then layer a melody with a banana. Finally, we want to expand the AI's sound customization so it can craft a genuinely unique, intriguing sound for every object.
Built With
- chatgpt5.1
- digitalocean
- express.js
- metapca
- node.js
- openrouter
- unity
