Inspiration

The fear of the "lost melody." Every musician has stumbled on a "happy accident" at the keyboard that they could never recreate. We wanted to build an intelligent ear that doesn't just record sound, but understands the musical intent behind it.

What it does

Aura Music Studio is a 64-key polyphonic workstation. It lets users perform via laptop keyboard or touch and record 15-second sessions with precise timing, and it uses Gemini 1.5 Flash to analyze harmonics, suggest trendy asset names, and generate unique 10-second MIDI motifs from natural-language prompts.

How we built it

Audio Engine: Built from scratch on the Web Audio API for low-latency synthesis.
Intelligence: Integrated Gemini 1.5 Flash via the v1beta endpoint for Semantic Harmonic Analysis.
Persistence: Used LocalStorage and JSON serialization so that all user mixes and AI reports persist across sessions.
Platform: Developed as a Progressive Web App (PWA) for offline, cross-device installation.

Challenges we ran into

Audio Overlap: Eliminating "glaring" polyphonic noise by implementing a strict Note-Off tracking Map.
AI Data Structuring: Engineering prompts that force Gemini to return raw JSON arrays without conversational filler, so our engine can parse the music data immediately.
Hardware Mapping: Syncing 48 physical laptop keys to the virtual engine across 4 octaves.

Accomplishments that we're proud of

Multimodal Bridge: Successfully turning a text-based LLM into a functional MIDI composer.
Professional UX: Building a high-fidelity "Studio Blue" interface that feels like a production-ready software product.
Offline Capability: Ensuring a high-end music app can still function without a constant internet connection.

What we learned

We mastered the lifecycle of the Web Audio API and learned that prompt engineering is the key to using AI for non-text tasks such as musical composition. We also deepened our understanding of PWA Service Workers and state management.
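
To illustrate the Note-Off tracking Map mentioned under the challenges, here is a minimal sketch of the idea, not our actual engine code. The `Voice` interface, the `NoteTracker` class, and its method names are hypothetical; in the browser a `Voice` would presumably wrap Web Audio API nodes such as an oscillator and a gain node. The point is that every sounding note is stored in a `Map` keyed by note number, so a repeated key-down cannot stack a second voice and every note-off releases exactly the voice it started.

```typescript
// Hypothetical minimal voice interface. In a browser this would wrap
// Web Audio API nodes; here it only needs to know how to stop itself.
interface Voice {
  stop(): void;
}

class NoteTracker {
  // One active voice per note number (assumed here to be a MIDI note number).
  private readonly active = new Map<number, Voice>();
  private readonly createVoice: (note: number) => Voice;

  constructor(createVoice: (note: number) => Voice) {
    this.createVoice = createVoice;
  }

  noteOn(note: number): void {
    // Ignore repeats (for example keyboard auto-repeat) so a held key
    // never stacks several voices on the same pitch.
    if (this.active.has(note)) return;
    this.active.set(note, this.createVoice(note));
  }

  noteOff(note: number): void {
    const voice = this.active.get(note);
    if (voice === undefined) return; // note-off without a matching note-on
    voice.stop();
    this.active.delete(note);
  }

  // Safety valve: release everything, e.g. when the window loses focus.
  allNotesOff(): void {
    this.active.forEach((voice) => voice.stop());
    this.active.clear();
  }

  activeCount(): number {
    return this.active.size;
  }
}
```

A keyboard handler would call `noteOn` on key-down and `noteOff` on key-up, and could call `allNotesOff` whenever a key-up event might have been missed.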
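
Even with prompts that ask for raw JSON arrays, it seems prudent to parse the model's reply defensively. The sketch below shows one way that could look, under stated assumptions: the `NoteEvent` fields (`note`, `start`, `duration`) and the `parseMotif` function are hypothetical, since the real schema is not documented here, and the 10-second cutoff simply mirrors the motif length described above.

```typescript
// Hypothetical note schema; the field names are assumptions for illustration.
interface NoteEvent {
  note: number; // MIDI note number, 0 to 127
  start: number; // onset in seconds from the start of the motif
  duration: number; // length in seconds
}

// Runtime check, because JSON.parse gives no type guarantees.
function isNoteEvent(value: unknown): value is NoteEvent {
  if (typeof value !== "object" || value === null) return false;
  const { note, start, duration } = value as {
    note?: unknown;
    start?: unknown;
    duration?: unknown;
  };
  return (
    typeof note === "number" &&
    Number.isInteger(note) &&
    note >= 0 &&
    note <= 127 &&
    typeof start === "number" &&
    start >= 0 &&
    typeof duration === "number" &&
    duration > 0
  );
}

function parseMotif(raw: string, maxSeconds = 10): NoteEvent[] {
  // Keep only the outermost [...] span, which tolerates stray filler or
  // code fences around the array if the model ignores the prompt.
  const open = raw.indexOf("[");
  const close = raw.lastIndexOf("]");
  if (open === -1 || close < open) return [];

  let data: unknown;
  try {
    data = JSON.parse(raw.slice(open, close + 1));
  } catch {
    return []; // malformed JSON: treat as an empty motif
  }
  if (!Array.isArray(data)) return [];

  return data
    .filter(isNoteEvent) // drop entries that do not match the schema
    .filter((event) => event.start < maxSeconds) // drop notes past the motif
    .sort((a, b) => a.start - b.start); // play back in time order
}
```

Returning an empty array on any failure keeps the playback path simple: the caller can treat "no notes" as a signal to retry the prompt rather than handling exceptions.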

What's next for Aura Music Studio

Collaborative Jamming: Using WebSockets for multi-user remote recording sessions.
Export to WAV/MIDI: Allowing users to download their assets directly for use in professional DAWs like Ableton or FL Studio.
Visual Synesthesia: Using AI mood analysis to generate real-time generative art backgrounds for every performance.