Plailist Architecture

PLAILIST - About the Project

Inspiration

Ever been to a party where the music just didn't match the vibe? The crowd is hyped but the DJ plays slow songs, or the energy dies and nobody knows what to play next?

PLAILIST solves this by combining AI with real-time crowd analysis. We asked: What if a system could:

Know everyone's music taste before the party starts
Listen to crowd reactions in real-time
Adapt the playlist dynamically
Never let the energy die

Traditional DJs rely on intuition. We wanted to combine human musical preferences with AI-powered real-time adaptation.

What It Does

PLAILIST is an AI-powered DJ assistant that creates and adapts party playlists in real-time:

1. Spotify Fingerprint Analysis

Analyzes attendees' Spotify history to understand the group's musical DNA:

Top genres, artists, and language preferences
Musical diversity score using Shannon entropy
Generates optimized seed playlist
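The diversity score can be illustrated with a short sketch. It assumes the fingerprint is reduced to a flat list of genre labels, and it normalizes the entropy to the range 0 to 1; both the function name and the normalization are assumptions for illustration, not details taken from the project.

```python
import math
from collections import Counter

def diversity_score(genres):
    """Normalized Shannon entropy of a list of genre labels (hypothetical helper).

    Returns 0.0 when the group listens to a single genre and 1.0 when
    every genre present is equally represented.
    """
    counts = Counter(genres)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    # H = -sum(p * log2(p)) over the genre distribution
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    # Divide by the maximum possible entropy so groups of different sizes compare fairly
    return entropy / math.log2(len(counts))
```

A group whose history is all one genre scores 0.0, while an even split across genres scores 1.0.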

2. Live Crowd Reaction Analysis

Every 30 seconds, analyzes party atmosphere using audio classification:

HuggingFace Audio Spectrogram Transformer model
Calculates enthusiasm score: E = cheering + applause - chatter - booing
Detects energy trends (rising, falling, stable)
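The enthusiasm formula and trend detection might look roughly like the sketch below. It assumes the classifier output has already been collapsed into a dictionary of probabilities keyed by the four labels in the formula. The trend rule (comparing the mean of the latest window with the mean of the window before it), the window size, and the threshold are illustrative assumptions.

```python
def enthusiasm_score(probs):
    """E = cheering + applause - chatter - booing, with missing labels counted as 0."""
    return (probs.get("cheering", 0.0) + probs.get("applause", 0.0)
            - probs.get("chatter", 0.0) - probs.get("booing", 0.0))

def energy_trend(scores, window=3, threshold=0.05):
    """Classify the score history as rising, falling, or stable (hypothetical rule)."""
    if len(scores) < 2 * window:
        return "stable"  # not enough history to compare two windows
    recent = sum(scores[-window:]) / window
    earlier = sum(scores[-2 * window:-window]) / window
    delta = recent - earlier
    if delta > threshold:
        return "rising"
    if delta < -threshold:
        return "falling"
    return "stable"
```

With one score every 30 seconds, a window of 3 compares the last 90 seconds against the 90 seconds before that.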

3. Gemini AI Recommendations

Every 90 seconds, Google's Gemini AI analyzes:

Crowd energy trends and score history
Party stage and current vibe
Group music preferences

Returns 2-3 songs that fit the current context, each with its reasoning.

How We Built It

Tech Stack

Backend:

Python Flask server
PyTorch + HuggingFace Transformers for audio classification
Google Gemini 2.0 Flash API for recommendations
NumPy for signal processing

Frontend:

Vanilla JavaScript with real-time updates
CSS3 animations for vibe meter
30-second polling for crowd analysis

Audio Processing:

MIT Audio Spectrogram Transformer model
DSP fallback for low-resource environments
16kHz resampling and mono conversion

What We Learned

Real-Time Systems Are Hard

Memory management is critical for ML models (the transformer is 346 MB)
Fallback systems are essential for reliability
API throttling avoids hitting rate limits and running up costs

AI Prompting is an Art

Structured JSON output prevents parsing errors
Multiple time windows help AI distinguish short-term vs long-term trends
Including reasoning fields improves debugging and transparency
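As an illustration of why structured output helps, here is a minimal sketch of validating a model response before it reaches the queue. The schema (a top-level "songs" list whose items carry "title", "artist", and "reasoning") is an assumption made for this example; the project's actual prompt and schema are not shown here.

```python
import json

# Hypothetical schema: each recommended song must carry these string fields
REQUIRED_FIELDS = ("title", "artist", "reasoning")

def parse_recommendations(raw_text):
    """Return up to 3 well-formed song dicts, or [] if the response is unusable."""
    try:
        payload = json.loads(raw_text)
    except json.JSONDecodeError:
        return []
    songs = payload.get("songs") if isinstance(payload, dict) else None
    if not isinstance(songs, list):
        return []
    valid = []
    for song in songs:
        if not isinstance(song, dict):
            continue
        # Keep only entries where every required field is a non-empty string
        if all(isinstance(song.get(f), str) and song[f].strip() for f in REQUIRED_FIELDS):
            valid.append({f: song[f].strip() for f in REQUIRED_FIELDS})
    return valid[:3]
```

Returning an empty list on any malformed response lets the caller keep the current queue instead of crashing mid-party.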

State Management Matters

Frontend-backend synchronization requires explicit refetching
Async operations can create race conditions
Always validate state after mutations

Demo vs Production

Predictable behavior beats flexibility for presentations
Clear narrative arcs make technical features understandable
Visible feedback shows AI "thinking"

Challenges We Faced

Model Memory Issues

HuggingFace transformer crashed on low-RAM machines
Built DSP fallback using zero-crossing rate and RMS energy
System now also runs on machines that cannot load the transformer
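The two features named above are simple to compute. The sketch below is written in plain Python for readability (a NumPy version would be vectorized) and assumes the audio arrives as a sequence of mono float samples. How the project combines these features into a crowd score is not described, so that step is left out.

```python
import math

def rms_energy(samples):
    """Root-mean-square amplitude: a rough proxy for loudness."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose sign differs.

    Noisy, broadband sounds tend to have a higher rate than low, tonal ones.
    """
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)
```

Both functions need only arithmetic on the raw samples, which is why this path has no large model to load.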

Async State Synchronization

Queue only updated on first Gemini call
Frontend cached stale playlist data
Fixed by explicitly refetching state after mutations

Audio Format Compatibility

MP3/M4A files failed to load initially
Implemented multi-fallback pipeline (soundfile → torchaudio)
Added resampling and stereo-to-mono conversion
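A fallback loading chain of this kind can be sketched generically. The helper names below are hypothetical, and the decoders are passed in as callables so the sketch stays independent of any particular library. In practice the entries could wrap `soundfile.read` and `torchaudio.load`, which return the audio data together with its sample rate. Resampling to 16 kHz is omitted here.

```python
def load_with_fallbacks(path, loaders):
    """Try each (name, loader) pair in order and return the first success.

    Each loader is a callable taking a path; any exception moves on to the next one.
    """
    errors = []
    for name, loader in loaders:
        try:
            return loader(path)
        except Exception as exc:  # decoders raise many different error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError(f"could not decode {path}: " + "; ".join(errors))

def to_mono(frames):
    """Average the channels of each frame (a frame is a tuple of channel samples)."""
    return [sum(frame) / len(frame) for frame in frames]
```

Collecting every loader's error message makes it easier to see why a given MP3 or M4A file failed in all backends.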

API Rate Limiting

Frontend called Gemini every 30 seconds (too frequent)
Added server-side 90-second throttle
Frontend gracefully handles 429 status codes
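A server-side throttle along these lines can be sketched as a small class. The class and method names are hypothetical, and the clock is injectable so the behavior can be checked without waiting. In a Flask route, a rejected call would translate into a 429 response, optionally with a Retry-After header.

```python
import time

class Throttle:
    """Allow at most one call per min_interval seconds (hypothetical helper)."""

    def __init__(self, min_interval=90.0, clock=time.monotonic):
        self.min_interval = min_interval
        self._clock = clock
        self._last_call = None

    def try_acquire(self):
        """Return True and record the call if enough time has passed, else False."""
        now = self._clock()
        if self._last_call is not None and now - self._last_call < self.min_interval:
            return False  # caller should answer with HTTP 429
        self._last_call = now
        return True

    def retry_after(self):
        """Seconds until the next call would be accepted."""
        if self._last_call is None:
            return 0.0
        return max(0.0, self.min_interval - (self._clock() - self._last_call))
```

Because rejected calls do not reset the timer, a frontend that keeps polling every 30 seconds still gets through once per 90-second interval.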

Merging Flask Apps

Had separate frontend and backend servers
Merged into unified Flask app
Resolved CORS issues and improved performance

What's Next

Near-term:

Real microphone input for live parties
Spotify API integration for actual playback
Attendee voting and song requests
Mobile app for party-goers

Long-term:

Computer vision for crowd size and dancing detection
Multi-modal AI (audio + video + sentiment)
DJ dashboard with historical analytics
Integration with professional DJ hardware

Built at UB Hacking 2025

Updates

Arpit Mittal started this project — Nov 09, 2025 12:31 AM EST
