Vibe.FM - Project Story
Inspiration
There are moments we can't explain and feelings we can't quite put into words. But music always seems to understand.
We've all been there—scrolling endlessly through playlists, unable to find music that matches our current emotional state. We wanted to build a bridge to that understanding. Not just another recommendation algorithm, but an intelligent system that could interpret the nuance of human emotions and translate them into the perfect soundtrack.
The idea came from a simple question: What if you could just tell an AI how you're feeling, and it would understand you?
We saw an opportunity to combine Google's cutting-edge Agent Development Kit (ADK) with the vast world of music streaming to create something truly special—a multi-agent system that doesn't just recommend music, but understands you.
What it does
Vibe.FM is an intelligent music agent that translates your emotional state into perfectly curated playlists.
Here's the magic:
You describe your vibe through a beautiful, interactive interface with dynamic color-shifting backgrounds. Just type how you're feeling—"a rainy afternoon," "energetic workout," "something tender that feels like starting over."
Our multi-agent system springs into action:
- The OrchestratorAgent analyzes your mood using Gemini's natural language understanding
- The ScoutAgent discovers new music from a database of 8M+ songs
- The PersonalizedAgent curates tracks from your own Spotify history
- The MergerAgent intelligently balances both lists for the perfect mix
In seconds, your playlist is ready and automatically added to your Spotify queue. You get full playback controls right in the app—play, pause, skip, all integrated seamlessly.
The result? A unique soundtrack that blends fresh discovery with the comfort of your personal favorites. Not just a playlist—a perfect capture of your current moment.
How we built it
Architecture Overview
We built a decoupled, cloud-native application with a clear separation of concerns:
Frontend (Next.js + React)
- Interactive UI with Framer Motion animations
- Dynamic color-shifting backgrounds that respond to user input
- Real-time communication with the backend via Axios
- Tailwind CSS for responsive, beautiful design
Backend (FastAPI + Python)
- RESTful API with async support for high performance
- Secure OAuth2 authentication flow with Spotify
- Session-based state management for user profiles
- Server-side caching to optimize repeated requests
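The server-side caching above can be sketched as a small in-memory TTL cache. This is an illustrative stand-in, not the production code; the names `TTLCache` and the example cache key are hypothetical, and a real deployment might reach for Redis or `functools.lru_cache` instead.

```python
import time

class TTLCache:
    """Minimal in-memory cache with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # evict the stale entry lazily on read
            return None
        return value

    def set(self, key: str, value: object) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)

# Hypothetical usage: cache a user's Spotify top tracks for 10 minutes
cache = TTLCache(ttl_seconds=600)
cache.set("user:123:top_tracks", ["track_a", "track_b"])
print(cache.get("user:123:top_tracks"))  # ['track_a', 'track_b']
```

Lazy eviction on read keeps the sketch stateless-friendly: no background thread is needed, which suits a serverless environment.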
Multi-Agent System (Google ADK + Gemini)
This is where the magic happens. We implemented a true multi-agent architecture:
# Simplified flow
OrchestratorAgent
├─> ScoutAgent (searches 8M+ songs in DuckDB)
├─> PersonalizedAgent (queries user's Spotify data)
└─> MergerAgent (balances and validates final playlist)
Each agent is a specialist:
- OrchestratorAgent: Uses Gemini to interpret natural language mood descriptions and coordinate the workflow
- ScoutAgent: Performs lightning-fast searches across millions of tracks using DuckDB's analytics capabilities
- PersonalizedAgent: Accesses the user's Spotify data (top tracks, artists, playlists) to find familiar favorites
- MergerAgent: Intelligently combines both lists, balancing discovery with comfort
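The orchestration pattern above can be sketched in plain Python. These are hypothetical stand-ins for the real ADK/Gemini agents (the function names and merge heuristic are illustrative, not the production logic); the point is the shape: Scout and Personalized run in parallel, and Merger makes the final call.

```python
import asyncio

# Stand-ins for the real specialists; the actual system uses Google ADK
# agents backed by Gemini.
async def scout_agent(mood: str) -> list[str]:
    await asyncio.sleep(0)  # placeholder for a DuckDB catalog search
    return [f"new-track-for-{mood}-1", f"new-track-for-{mood}-2"]

async def personalized_agent(mood: str) -> list[str]:
    await asyncio.sleep(0)  # placeholder for Spotify history queries
    return [f"favorite-{mood}-1"]

def merger_agent(discovered: list[str], familiar: list[str]) -> list[str]:
    # Alternate familiar comfort with fresh discovery, then append leftovers.
    merged: list[str] = []
    for fam, new in zip(familiar, discovered):
        merged.extend([fam, new])
    merged.extend(discovered[len(familiar):])
    return merged

async def orchestrate(mood: str) -> list[str]:
    # Both specialists run concurrently; the MergerAgent resolves the mix.
    discovered, familiar = await asyncio.gather(
        scout_agent(mood), personalized_agent(mood)
    )
    return merger_agent(discovered, familiar)

playlist = asyncio.run(orchestrate("rainy-afternoon"))
```

Running the two retrieval agents under `asyncio.gather` mirrors the tree above and avoids the race conditions discussed later: the orchestrator is the only writer of the final list.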
Data Layer
- DuckDB: High-performance analytics database for 8M+ songs with audio features
- Spotify API: Real-time data fetching and playback control via Spotipy
- User-specific tables: We create dedicated tables for each user's liked songs on first use
Deployment
- Google Cloud Run: Serverless, auto-scaling deployment
- Docker: Containerized for consistency across environments
- Stateless design: Each request is independent, perfect for serverless
Key Technical Decisions
Why DuckDB? We needed sub-second queries on millions of songs. DuckDB's columnar storage and analytics optimization made it perfect for our use case.
Why multi-agent? A single agent couldn't balance discovery vs. personalization effectively. By specializing agents, we achieve better results and can optimize each independently.
Why Cloud Run? Auto-scaling, pay-per-use, and fast cold starts made it ideal for a hackathon project that could scale to real users.
Challenges we ran into
1. Agent Coordination Complexity
Getting three agents to work together seamlessly was harder than expected. Our first attempts had:
- Race conditions between parallel agents
- Inconsistent output formats
- Difficulty balancing "new" vs "familiar" music
Solution: We implemented a clear orchestration pattern with structured tool outputs and a dedicated MergerAgent to handle the final decision-making.
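The "structured tool outputs" part of that fix can be sketched as a shared result schema every agent must emit. The `AgentResult` class is a hypothetical simplification of the real format, but it shows the idea: the MergerAgent consumes one validated shape instead of guessing each agent's output.

```python
from dataclasses import dataclass, field

@dataclass
class AgentResult:
    """Hypothetical common output schema for all three agents."""
    agent: str                       # which specialist produced this list
    tracks: list[str] = field(default_factory=list)

    def validate(self) -> "AgentResult":
        # Drop empty IDs and duplicates (preserving order) so the
        # MergerAgent never has to re-clean upstream output.
        seen: set[str] = set()
        cleaned = []
        for t in self.tracks:
            if t and t not in seen:
                seen.add(t)
                cleaned.append(t)
        self.tracks = cleaned
        return self

scout = AgentResult("ScoutAgent", ["a", "b", "a", ""]).validate()
print(scout.tracks)  # ['a', 'b']
```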
2. Spotify Rate Limits
Solution: Fetching Spotify's entire catalog through the API was a non-starter given the rate limits, so we turned to public song catalogs on Kaggle instead. These aren't fully up to date, but they cover our needs well.
3. Prompt Engineering for Mood Understanding
Getting Gemini to consistently understand nuanced emotional descriptions was tricky.
Solution: Extensive prompt iteration with examples and structured outputs. We also added context from the user's music history to improve accuracy.
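One slice of that prompt iteration can be sketched as a few-shot template that forces a structured JSON reply. The examples and key names here are illustrative, not the production prompts (which are longer and were tuned against real Gemini outputs).

```python
import json

# Hypothetical few-shot examples pairing free-text moods with intents
EXAMPLES = [
    {"mood": "a rainy afternoon",
     "intent": {"emotions": ["melancholy", "calm"], "energy": "low"}},
    {"mood": "energetic workout",
     "intent": {"emotions": ["excited"], "energy": "high"}},
]

def build_mood_prompt(user_mood: str) -> str:
    """Assemble instruction + few-shot examples + the user's mood."""
    shots = "\n\n".join(
        f'Mood: "{ex["mood"]}"\nIntent: {json.dumps(ex["intent"])}'
        for ex in EXAMPLES
    )
    return (
        "Interpret the mood description and reply with ONLY a JSON object "
        'with keys "emotions" (list of strings) and "energy" '
        '("low" | "medium" | "high").\n\n'
        f'{shots}\n\nMood: "{user_mood}"\nIntent:'
    )

prompt = build_mood_prompt("something tender that feels like starting over")
```

Ending the prompt at `Intent:` nudges the model to complete the pattern the examples establish, which made the JSON outputs far more consistent to parse.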
Accomplishments that we're proud of
🎯 True Multi-Agent Architecture: We didn't just call it multi-agent—we built a real orchestrated system with specialized agents working in parallel.
🎨 Beautiful UX: The animated, responsive interface makes the AI's work feel magical rather than mechanical.
🔍 8M+ Song Database: We integrated a massive dataset and made it queryable in real-time.
🎵 Perfect Balance: Our agents successfully blend discovery with personalization—users get new music they love, not just what they already know.
☁️ Cloud-Native Design: Fully serverless, auto-scaling, and production-ready on Google Cloud Run.
What we learned
Technical Skills
- Google ADK: Deep dive into building production multi-agent systems
- Gemini API: Effective prompt engineering for natural language understanding
- DuckDB: High-performance analytics on large datasets
- Cloud Run: Serverless deployment patterns and optimization
- Parallel Processing: Coordinating multiple agents for optimal performance
Product Insights
- Users want context-aware recommendations, not just algorithmic ones
- The balance between discovery and familiarity is critical for music satisfaction
- Natural language interfaces for music selection feel more intuitive than filters and toggles
- Visual feedback during AI processing builds trust and engagement
Team Collaboration
- Clear API contracts between frontend and backend enabled parallel development
- Agent specialization made testing and iteration much easier
- Docker made our local environments consistent and deployment smooth
What's next for Vibe.FM
Short-term (Next Month)
Enhanced Multi-Agent Recommendation System
We're evolving from our current 3-agent system to a sophisticated 7-agent architecture for dramatically better playlists:
- A1. Query Understanding Agent (NLU): Convert natural language into structured intent with emotion extraction, genre detection, and constraint parsing
- A2. Emotion-to-Audio Mapper: Translate emotions into precise Spotify audio feature ranges (valence, energy, tempo, danceability)
- A3. Candidate Retriever: Enhanced retrieval with 200-500 candidates using seeds, vector embeddings, and multi-source fetching
- A4. Reranker/Set Builder: Optimize track selection with hard constraints (explicit content, region availability) and soft optimization (diversity, novelty balance)
- A5. Sequencer Agent: Create natural flow with energy curves, BPM transitions, and strategic placement for maximum emotional impact
- A6. Critic/Validator: Final QA pass ensuring regional availability, no duplicates, and intent consistency
- A7. Learning/Profile Agent: Capture user feedback (skips, likes, replays) to personalize future recommendations
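The A2 Emotion-to-Audio Mapper could look something like the sketch below. The mapping table and the intersection rule are hypothetical (the planned agent would derive ranges with Gemini rather than a static lookup), but it shows how a compound mood like "nostalgic but hopeful" collapses to one target window per audio feature.

```python
# Hypothetical static ranges; the real A2 agent would generate these
EMOTION_FEATURES = {
    "nostalgic": {"valence": (0.3, 0.6), "energy": (0.2, 0.5)},
    "hopeful":   {"valence": (0.6, 0.9), "energy": (0.4, 0.7)},
}

def blend_ranges(emotions: list[str]) -> dict[str, tuple[float, float]]:
    """Intersect per-emotion feature ranges into one window per feature."""
    blended: dict[str, tuple[float, float]] = {}
    for emotion in emotions:
        for feature, (lo, hi) in EMOTION_FEATURES[emotion].items():
            cur_lo, cur_hi = blended.get(feature, (0.0, 1.0))
            blended[feature] = (max(cur_lo, lo), min(cur_hi, hi))
    return blended

target = blend_ranges(["nostalgic", "hopeful"])
print(target)  # {'valence': (0.6, 0.6), 'energy': (0.4, 0.5)}
```

The resulting windows would then feed the A3 retriever as hard filters on the candidate pool.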
This architecture will give us:
- Better emotion understanding: "nostalgic but hopeful" accurately mapped to audio features
- Smoother flow: Playlists that build, peak, and resolve like a curated mixtape
- Smarter diversity: MMR (Maximal Marginal Relevance) for optimal novelty vs. familiarity
- Regional intelligence: Proper market filtering and explicit content handling
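The MMR idea mentioned above can be sketched as a greedy loop: each pick maximizes relevance minus its worst-case similarity to tracks already chosen. The relevance scores and artist-based similarity below are toy placeholders, not the planned scoring functions.

```python
def mmr_select(candidates, relevance, similarity, k=5, lam=0.7):
    """Greedy Maximal Marginal Relevance: each step picks the track
    maximizing lam * relevance - (1 - lam) * max similarity to picks."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        def score(track):
            redundancy = max(
                (similarity(track, s) for s in selected), default=0.0
            )
            return lam * relevance[track] - (1 - lam) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected

# Toy example: 'a' and 'b' share an artist, so MMR defers 'b'
artists = {"a": "X", "b": "X", "c": "Y"}
sim = lambda t, s: 1.0 if artists[t] == artists[s] else 0.0
rel = {"a": 0.9, "b": 0.85, "c": 0.6}
print(mmr_select(["a", "b", "c"], rel, sim, k=3))  # ['a', 'c', 'b']
```

Tuning `lam` toward 1.0 favors familiarity; lowering it pushes the playlist toward novelty, which is exactly the discovery-vs-comfort dial the MergerAgent needs.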
Additional Features
- Voice Input: Speak your mood instead of typing
- Playlist History: Save and revisit your past moods and their soundtracks
- Collaborative Sessions: Let friends contribute to mood-based playlists
Medium-term (3-6 Months)
Contextual Intelligence
- Activity Detection: Integrate with fitness trackers, calendars, weather APIs for automatic mood inference
- Temporal Patterns: Learn your emotional rhythms throughout the day/week
- Social Features: Share your vibe and discover friends' moods
Multi-Platform Expansion
- Apple Music integration
- YouTube Music support
- Cross-platform synchronization
Advanced Analytics
- Emotional music pattern visualization
- Mood journey tracking over time
- Personalized insights dashboard
Long-term Vision
Predictive & Adaptive
- Anticipatory Playlists: Pre-generate playlists based on time, context, and behavioral patterns
- Live Mood Mixing: Real-time playlist adjustment as your emotional state shifts during listening
- AI Mood Coach: Suggest music to help you reach desired emotional states (e.g., "music to help you focus" or "transition from stress to calm")
Wellness Integration
- Mood Journaling: Combine music with emotional wellness tracking
- Therapeutic Playlists: Collaborate with music therapists for evidence-based emotional support
- Biometric Integration: Heart rate, stress levels, sleep quality to inform recommendations
Enhanced Personalization
- Few-shot Learning: Adapt to individual music taste with minimal feedback
- Explainable AI: Show users exactly why each track was selected
- Cultural Context: Region-specific emotional-music mappings for global accuracy
We believe Vibe.FM is just the beginning. Music has always been humanity's emotional language—we're building the most sophisticated translator, one agent at a time.
Built With
Languages & Frameworks
- Python 3.10+
- JavaScript (ES6+)
- Next.js 14
- React 18
- FastAPI
AI & Machine Learning
- Google ADK (Agent Development Kit)
- Google Gemini API
- Natural Language Processing
Databases & Storage
- DuckDB (8M+ songs with audio features)
- SQLite (user sessions)
- Server-side caching
Cloud & DevOps
- Google Cloud Run
- Docker
- Uvicorn (ASGI server)
APIs & Integrations
- Spotify Web API
- Spotify Web Playback SDK
- Spotify OAuth 2.0
- Spotipy (Python client)
Frontend Libraries
- Framer Motion (animations)
- Tailwind CSS
- Axios (HTTP client)
- React Hooks
- Shadcn (UI components)
Development Tools
- Poetry (Python dependency management)
- npm (Node package management)
- Git & GitHub
- Postman (API testing)
Built with ❤️ for the Cloud Run Hackathon 2025