Inspiration
ever had a melody stuck in your head that you wished could become a real song? we've all hummed ideas we thought were fire, only to forget them minutes later. pulse was born from the frustration of losing musical inspiration and the dream of turning any creative spark into an immersive audiovisual experience. it also looks pretty sick ngl.
What it does
pulse transforms your random musical thoughts into real-time audiovisual experiences. users hum into our interface, and pulse instantly generates a continuous stream of music that responds to whatever direction they want. as users type prompts like "add jazz piano" or "make it more chill," both the music and the visuals evolve in real time. the generated music is kinda buns tho.
How we built it
humming processing pipeline: when users hum into our interface, we transcribe their audio to midi using replicate's basic pitch model, then seed musicgen to create an initial audio foundation. we analyze the generated audio and use this to intelligently start our lyria session with bar-aligned handoff, keeping the real-time experience true to their original melody while enabling infinite creative evolution through live steering.
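to illustrate the "bar-aligned handoff" idea, here's a rough sketch of the timing math. the function name and the 4/4 default are ours for illustration, not from the actual codebase: given the tempo detected from the musicgen seed, you can compute how long to wait so the lyria stream takes over exactly on the next bar boundary.

```typescript
// hypothetical sketch: schedule the lyria handoff on a bar boundary.
// bpm comes from analyzing the musicgen seed audio; beatsPerBar is
// assumed 4/4 unless the analysis says otherwise.
function nextBarOffset(elapsedSec: number, bpm: number, beatsPerBar = 4): number {
  const barSec = (60 / bpm) * beatsPerBar; // duration of one bar in seconds
  const intoBar = elapsedSec % barSec;     // how far into the current bar we are
  return intoBar === 0 ? 0 : barSec - intoBar; // wait time until the next bar line
}
```

scheduling the swap at `nextBarOffset` seconds in the future keeps the transition musically seamless instead of cutting mid-bar.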
frontend architecture & ui/ux: pure javascript/typescript with direct websocket connections to google's gemini api for real-time communication with lyria. built a glassmorphic interface using three.js and webgl for hardware-accelerated 3d graphics and audio visualization, helped by framer motion and some liquid glass components that look hella clean.
ai steering agent & real-time control: our main innovation is this intelligent agent powered by google gemini that takes user prompts and converts them into precise steering vectors and parameters for the lyria model. the agent analyzes what users type and generates weighted prompt combinations that smoothly transition the music in real-time.
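the smooth transitions could work something like the sketch below. this is our own illustration, not pulse's actual code: the agent returns target weights per prompt, and the client interpolates the live weights toward them on every tick so the music drifts toward the new direction instead of jumping.

```typescript
// hypothetical sketch of smoothing the steering agent's output.
// each prompt ("jazz piano", "chill") carries a weight; stepWeights
// nudges the live weights toward the agent's targets by `rate` per tick.
type PromptWeights = Record<string, number>;

function stepWeights(current: PromptWeights, target: PromptWeights, rate = 0.2): PromptWeights {
  const keys = new Set([...Object.keys(current), ...Object.keys(target)]);
  const next: PromptWeights = {};
  for (const k of keys) {
    const c = current[k] ?? 0;
    const t = target[k] ?? 0;
    const v = c + (t - c) * rate; // linear interpolation toward the target
    if (v > 1e-3) next[k] = v;    // drop prompts that have faded out entirely
  }
  return next;
}
```

calling `stepWeights` repeatedly converges on the target mix, which is what makes a prompt like "make it more chill" fade in rather than hard-cut.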
Challenges we ran into
the biggest pain was optimizing our multi-stage inference pipeline around slow external apis. replicate's basic pitch and musicgen models had terrible latency that would've killed the user experience if run sequentially. our breakthrough was using musicgen seeding strategically: we seed with the raw hum audio while basic pitch transcribes in parallel, then use the midi data to configure lyria's parameters. this got perceived latency down from 15+ seconds to under 5 seconds, which is actually usable.
managing real-time websocket connections while coordinating slower replicate apis required careful state management and robust error handling for when models failed or gave incomplete transcriptions. synchronizing our three.js interface with this complex pipeline pushed browser performance limits, requiring optimized rendering to prevent frame drops during model transitions while keeping animations smooth.
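the error handling could be sketched as a generic retry-with-backoff wrapper around the flaky replicate calls. this helper is our own illustration of the pattern, not pulse's actual code.

```typescript
// hypothetical retry helper: re-run a flaky api call with exponential
// backoff, giving up after maxAttempts so a dead model doesn't hang
// the whole pipeline.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (attempt < maxAttempts) {
        // back off: 100ms, 200ms, 400ms, ... before the next try
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
      }
    }
  }
  throw lastErr;
}
```

wrapping each replicate call in `withRetry` means a transient failure or incomplete transcription triggers a quiet retry instead of breaking the live session.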
Accomplishments that we're proud of
we successfully built a complex multi-ai pipeline entirely in the frontend that transforms hummed melodies into real-time musical experiences. our seeding approach solved major latency issues by running musicgen and basic pitch in parallel, cutting wait time from 15+ seconds to under 5 seconds. integrating three ai models with our liquid glass and three.js interface creates a pretty unprecedented real-time audiovisual experience, and our gemini-powered steering agent enables natural language control over complex musical parameters entirely through browser-based websocket connections.
What we learned
this project taught us advanced techniques for coordinating multiple ai models with different latencies and optimizing complex inference pipelines for real-time user experiences. we discovered that strategic compromises often create better user experiences than perfect technical solutions. for example, our seeding approach wasn't technically ideal but delivered the smoothest workflow. working with cutting-edge ai apis taught us to build resilient systems with graceful fallbacks and the importance of performance optimization when combining webgl rendering with intensive ai processing.
What's next for pulse
our immediate focus is evolving from a demo platform into a production-ready music creation tool with user authentication, stripe integration for premium features, oauth for easy login, and a comprehensive project management system for saving and organizing musical creations. we're also planning to add mongodb for better data persistence and more advanced user features.
long-term, we want pulse to become the go-to platform for ai-assisted music production, with collaborative real-time sessions, professional daw export capabilities, and direct integration with music distribution platforms to complete the journey from hum to published track. basically, turning random humming into spotify hits. or, if you just want a dj to jam with, that works too.
Built With
- framer-motion
- gemini
- gemini-api
- google-cloud
- google-gemini-api
- google-lyria
- musicgen
- react
- replicate-api
- three.js
- typescript
- vercel
- vite
- webgl
- websockets