Inspiration
ever had a melody stuck in your head that you wished could become a real song? we've all hummed ideas we thought were fire, only to forget them minutes later. pulse was born from the frustration of losing musical inspiration and the dream of turning any creative spark into an immersive audiovisual experience. it also looks pretty sick ngl.
What it does
pulse transforms your random musical thoughts into real-time audiovisual experiences. users hum into our interface, and pulse instantly generates a continuous stream of music that responds to whatever direction they want. as users type prompts like "add jazz piano" or "make it more chill," both the music and the visuals evolve in real time. the generated music is kinda buns tho.
How we built it
humming processing pipeline: when users hum into our interface, we transcribe their audio to midi using replicate's basic pitch model, then seed musicgen to create an initial audio foundation. we analyze the generated audio and use this to intelligently start our lyria session with bar-aligned handoff, keeping the real-time experience true to their original melody while enabling infinite creative evolution through live steering.
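to illustrate the "bar-aligned handoff" idea, here's a rough sketch of the timing math. the function name and the 4/4 default are ours for illustration, not from the actual codebase: given the tempo detected from the musicgen seed, you can compute how long to wait so the lyria stream takes over exactly on the next bar boundary.

```typescript
// hypothetical sketch: schedule the lyria handoff on a bar boundary.
// bpm comes from analyzing the musicgen seed audio; beatsPerBar is
// assumed 4/4 unless the analysis says otherwise.
function nextBarOffset(elapsedSec: number, bpm: number, beatsPerBar = 4): number {
  const barSec = (60 / bpm) * beatsPerBar; // duration of one bar in seconds
  const intoBar = elapsedSec % barSec;     // how far into the current bar we are
  return intoBar === 0 ? 0 : barSec - intoBar; // wait time until the next bar line
}
```

scheduling the swap at `nextBarOffset` seconds in the future keeps the transition musically seamless instead of cutting mid-bar.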
frontend architecture & ui/ux: pure javascript/typescript with direct websocket connections to google's gemini api for real-time communication with lyria. built a glassmorphic interface using three.js and webgl for hardware-accelerated 3d graphics and audio visualization, helped by framer motion and some liquid glass components that look hella clean.
ai steering agent & real-time control: our main innovation is this intelligent agent powered by google gemini that takes user prompts and converts them into precise steering vectors and parameters for the lyria model. the agent analyzes what users type and generates weighted prompt combinations that smoothly transition the music in real-time.
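the smooth transitions could work something like the sketch below. this is our own illustration, not pulse's actual code: the agent returns target weights per prompt, and the client interpolates the live weights toward them on every tick so the music drifts toward the new direction instead of jumping.

```typescript
// hypothetical sketch of smoothing the steering agent's output.
// each prompt ("jazz piano", "chill") carries a weight; stepWeights
// nudges the live weights toward the agent's targets by `rate` per tick.
type PromptWeights = Record<string, number>;

function stepWeights(current: PromptWeights, target: PromptWeights, rate = 0.2): PromptWeights {
  const keys = new Set([...Object.keys(current), ...Object.keys(target)]);
  const next: PromptWeights = {};
  for (const k of keys) {
    const c = current[k] ?? 0;
    const t = target[k] ?? 0;
    const v = c + (t - c) * rate; // linear interpolation toward the target
    if (v > 1e-3) next[k] = v;    // drop prompts that have faded out entirely
  }
  return next;
}
```

calling `stepWeights` repeatedly converges on the target mix, which is what makes a prompt like "make it more chill" fade in rather than hard-cut.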
Challenges we ran into
the biggest pain was optimizing our multi-stage inference pipeline around slow external apis. replicate's basic pitch and musicgen models had terrible latency that would've killed the user experience if run sequentially. our breakthrough was using musicgen seeding strategically: we seed with the raw hum audio while basic pitch transcribes in parallel, then use the midi data to configure lyria's parameters. this got perceived latency down from 15+ seconds to under 5 seconds, which is actually usable.
managing real-time websocket connections while coordinating slower replicate apis required careful state management and robust error handling for when models failed or gave incomplete transcriptions. synchronizing our three.js interface with this complex pipeline pushed browser performance limits, requiring optimized rendering to prevent frame drops during model transitions while keeping animations smooth.
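the error handling could be sketched as a generic retry-with-backoff wrapper around the flaky replicate calls. this helper is our own illustration of the pattern, not pulse's actual code.

```typescript
// hypothetical retry helper: re-run a flaky api call with exponential
// backoff, giving up after maxAttempts so a dead model doesn't hang
// the whole pipeline.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (attempt < maxAttempts) {
        // back off: 100ms, 200ms, 400ms, ... before the next try
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
      }
    }
  }
  throw lastErr;
}
```

wrapping each replicate call in `withRetry` means a transient failure or incomplete transcription triggers a quiet retry instead of breaking the live session.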
Accomplishments that we're proud of
we successfully built a complex multi-ai pipeline entirely in the frontend that transforms hummed melodies into real-time musical experiences. our seeding approach solved major latency issues by running musicgen and basic pitch in parallel, cutting wait time from 15+ seconds to under 5 seconds. integrating three ai models with our liquid glass and three.js interface creates a pretty unprecedented real-time audiovisual experience, and our gemini-powered steering agent enables natural language control over complex musical parameters entirely through browser-based websocket connections.
What we learned
this project taught us advanced techniques for coordinating multiple ai models with different latencies and optimizing complex inference pipelines for real-time user experiences. we discovered that strategic compromises often create better user experiences than perfect technical solutions. for example, our seeding approach wasn't technically ideal but delivered the smoothest workflow. working with cutting-edge ai apis taught us to build resilient systems with graceful fallbacks and the importance of performance optimization when combining webgl rendering with intensive ai processing.
What's next for pulse
our immediate focus is evolving from a demo platform into a production-ready music creation tool with user authentication, stripe integration for premium features, oauth for easy login, and a comprehensive project management system for saving and organizing musical creations. we're also planning to add mongodb for better data persistence and more advanced user features.
long-term, we want pulse to become the go-to platform for ai-assisted music production, with collaborative real-time sessions, professional daw export capabilities, and direct integration with music distribution platforms to complete the journey from hum to published track. basically, turning random humming into spotify hits. or, if you just want a dj to jam with, that works too.
Built With
- framer-motion
- gemini
- gemini-api
- google-cloud
- google-gemini-api
- google-lyria
- musicgen
- react
- replicate-api
- three.js
- typescript
- vercel
- vite
- webgl
- websockets