Inspiration
We were inspired by the gap between musical creativity and the barriers of music production. Many people have melodies in their heads but lack the tools, knowledge, or time to turn them into finished songs. We wanted to democratize music creation: anyone should be able to hum a tune and instantly get back a professionally produced track.
What it does
UnShazam lets users:
- Record 5-8 seconds of humming, singing, or beatboxing
- Have the app analyze musical features (tempo, key, rhythm, energy, melodic contour)
- Generate genre-matched lyrics based on a theme they provide
- Create a full 30-second song with AI-generated vocals and instrumentation
- Generate Neo-Retro style album artwork to accompany the track
- Play back, preview, and download their creation
How we built it
We built UnShazam as a React + Vite frontend application with an API orchestration layer that chains several AI services:
- Audio Capture & Analysis: Used the Web Audio API's AnalyserNode to extract musical features from the recorded audio (first sketch below)
- Lyrics Generation: Sent the audio analysis, genre, and theme to the Claude API to generate contextual lyrics (second sketch below)
- Music Generation: Combined all audio features and lyrics into a detailed prompt for ElevenLabs Music API
- Artwork Generation: Used Claude to write an image prompt from the lyrics, then sent it to Stable Diffusion for Neo-Retro style album covers
- State Management: Built a state machine that drives the generation pipeline (RECORDING → ANALYZING → GENERATING_LYRICS → GENERATING_MUSIC → GENERATING_ARTWORK → COMPLETE) (third sketch below)
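A minimal sketch of the capture-and-analysis step: record from the microphone and collect an RMS energy envelope from the AnalyserNode's time-domain data. The FFT size, polling interval, and energy metric are illustrative choices, not our exact production values.

```javascript
// Sketch: capture mic audio and estimate an energy envelope from time-domain
// samples. fftSize, the 50 ms polling interval, and RMS as the energy metric
// are illustrative choices.
async function analyzeHum(durationMs = 6000) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const ctx = new AudioContext();
  const analyser = ctx.createAnalyser();
  analyser.fftSize = 2048;
  ctx.createMediaStreamSource(stream).connect(analyser);

  const buf = new Float32Array(analyser.fftSize);
  const energies = [];
  const start = performance.now();

  // Poll the analyser while the user hums, building an RMS energy envelope.
  while (performance.now() - start < durationMs) {
    analyser.getFloatTimeDomainData(buf);
    const rms = Math.sqrt(buf.reduce((s, x) => s + x * x, 0) / buf.length);
    energies.push(rms);
    await new Promise((r) => setTimeout(r, 50));
  }

  stream.getTracks().forEach((t) => t.stop());
  return { energies, sampleRate: ctx.sampleRate };
}
```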
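The lyrics step, sketched with the official @anthropic-ai/sdk; the model id and prompt wording here are placeholders, not our exact production prompt.

```javascript
// Sketch of the lyrics step. The model id and prompt wording are placeholders.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

async function generateLyrics(features, genre, theme) {
  const msg = await client.messages.create({
    model: "claude-3-5-sonnet-latest", // placeholder model id
    max_tokens: 512,
    messages: [{
      role: "user",
      content:
        `Write ${genre} lyrics for a 30-second song about "${theme}". ` +
        `Match these musical features: tempo ${features.tempo} BPM, ` +
        `key ${features.key}, energy ${features.energy}.`,
    }],
  });
  return msg.content[0].text; // first content block carries the lyrics
}
```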
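And a sketch of the pipeline state machine: each phase must resolve before the next begins, and the UI is driven from a single source of truth. The step functions passed in (record, analyze, and so on) are hypothetical stand-ins.

```javascript
// Sketch of the generation pipeline. State names match the pipeline above;
// the functions in `steps` (record, analyze, writeLyrics, ...) are hypothetical.
const PIPELINE = [
  "RECORDING", "ANALYZING", "GENERATING_LYRICS",
  "GENERATING_MUSIC", "GENERATING_ARTWORK",
];

async function runPipeline(steps, onStateChange) {
  for (let i = 0; i < steps.length; i++) {
    onStateChange(PIPELINE[i]); // tell the UI which phase is running
    await steps[i]();           // each phase finishes before the next starts
  }
  onStateChange("COMPLETE");
}
```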
Challenges we ran into
- Real-time Audio Analysis: Accurately extracting tempo and key from short, noisy recordings required implementing beat detection and pitch analysis algorithms (a pitch-detection sketch follows this list)
- API Coordination: Managing sequential and parallel API calls with proper error handling and retry logic across multiple services was complex (retry sketch below)
- Quality Control: Ensuring generated lyrics matched the intended genre style and that the music generation reflected the audio analysis required careful prompt engineering
- Browser Audio Limitations: Working within Web Audio API constraints while maintaining responsive UX during generation steps
- State Synchronization: Keeping the UI in sync with the async generation pipeline without race conditions (stale-result guard sketched below)
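For pitch, a common starting point is time-domain autocorrelation over the analyser buffer; a minimal sketch below. The search range is illustrative, and production code also needs windowing, interpolation, and octave-error handling.

```javascript
// Minimal autocorrelation pitch estimate over one buffer of time-domain
// samples. The 80-1000 Hz search range is an illustrative vocal range.
function estimatePitch(buf, sampleRate) {
  const minLag = Math.floor(sampleRate / 1000); // upper bound: 1000 Hz
  const maxLag = Math.floor(sampleRate / 80);   // lower bound: 80 Hz
  let bestLag = -1;
  let bestCorr = 0;

  for (let lag = minLag; lag <= maxLag; lag++) {
    let corr = 0;
    for (let i = 0; i + lag < buf.length; i++) corr += buf[i] * buf[i + lag];
    if (corr > bestCorr) { bestCorr = corr; bestLag = lag; }
  }
  return bestLag > 0 ? sampleRate / bestLag : null; // fundamental in Hz
}
```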
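Much of the API coordination reduces to a generic retry wrapper with exponential backoff; a sketch, with illustrative attempt counts and delays:

```javascript
// Generic retry wrapper with exponential backoff. The attempt count and base
// delay are illustrative values.
async function withRetry(fn, { attempts = 3, baseMs = 500 } = {}) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      if (i === attempts - 1) throw err; // out of retries: surface the error
      await new Promise((r) => setTimeout(r, baseMs * 2 ** i));
    }
  }
}

// Usage: wrap each service call so one flaky response doesn't kill the run.
// const lyrics = await withRetry(() => generateLyrics(features, genre, theme));
```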
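And for the race conditions, the pattern is a run token checked before each UI update, so results from an abandoned run are dropped rather than clobbering newer state; the names here are hypothetical.

```javascript
// Stale-result guard: each new generation bumps a token, and async results
// from an older run are ignored. Names are hypothetical.
let currentRun = 0;

async function startGeneration(steps, onUpdate) {
  const run = ++currentRun;         // token for this invocation
  for (const step of steps) {
    const result = await step();
    if (run !== currentRun) return; // a newer run started; drop this result
    onUpdate(result);
  }
}
```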
Accomplishments that we're proud of
- End-to-End Automation: Created a seamless pipeline from voice input to professional-sounding output without user intervention between steps
- Intelligent Feature Extraction: Implemented music theory-aware audio analysis that translates raw audio into meaningful musical descriptors
- Smart Prompt Engineering: Designed prompts that effectively communicate nuanced musical and visual requirements to AI models (prompt-builder sketch after this list)
- Responsive User Experience: Built intuitive UI with clear visual feedback for each generation step
- Versatile Output: The system works across multiple genres (Lo-fi Hip Hop, EDM, Jazz, Rock, etc.) and adapts to different musical inputs
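As a concrete example of that prompt engineering, the music prompt is assembled from the analysis and the lyrics; the template below is illustrative, not our exact production wording.

```javascript
// Illustrative template for the music-generation prompt; wording is a placeholder.
function buildMusicPrompt(features, genre, lyrics) {
  return [
    `Genre: ${genre}`,
    `Tempo: ${features.tempo} BPM, key: ${features.key}`,
    `Energy: ${features.energy}, melodic contour: ${features.contour}`,
    `Write a 30-second song with vocals singing these lyrics:`,
    lyrics,
  ].join("\n");
}
```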
What we learned
- Audio processing in the browser is surprisingly powerful with the Web Audio API, but accurate feature extraction still requires real signal-processing work
- Effective AI generation depends heavily on prompt quality and specificity
- Building a multi-service orchestration system requires careful state management and error handling
- The "reverse Shazam" concept opens up interesting possibilities for creative tools powered by AI
- User experience matters as much as technical capability; clear feedback during long-running operations is crucial
What's next for UnShazam
- Cloud Processing: Implement a backend service for more advanced audio analysis and generation, enabling higher-quality output
- Collaborative Features: Allow users to save, share, and remix songs within a community platform
- Customization Controls: Let users fine-tune generation parameters (BPM, energy level, vocal style, artwork theme)
- Real-time Genre Adaptation: Expand the genre library and implement intelligent genre recommendations based on the hummed melody
- Social Integration: Add sharing to social media with one-click export and embedded playback
- Model Fine-tuning: Train custom models on user preferences for personalized music generation
- Multi-language Lyrics: Support lyric generation in multiple languages
- Live Performance Mode: Enable real-time music generation during live sessions or improvisations
Built With
- claudeapi
- elevenlabs
- javascript
- mongodb
- node.js
- stablediffusion
- vite
- webaudioapi