Inspiration
Music can shift a moment in seconds. We wanted a fun, instant way to turn your current mood into a perfect track—no typing, just vibes.
Hackathons are about rapid prototyping with real-world APIs, so we combined computer vision + LLMs + live YouTube links into a tiny, delightful experience.
Bonus: it’s a great demo of privacy-aware client UX, API orchestration, and handling CORS/auth in the wild.
What it does
Captures a quick webcam snapshot (user consent required).
Detects the dominant facial emotion (e.g., CALM, HAPPY, SAD, SURPRISED) using AWS Rekognition.
Asks OpenAI (GPT-4o mini) for several song candidates tailored to that mood.
Calls the YouTube Data API to fetch a fresh, valid video and embeds it on the page—no leaving the app.
Adds variety by sampling from multiple model suggestions and the top YouTube results.
How we built it
Frontend: React + Vite. HTML5 Media APIs (getUserMedia) to access the webcam; a canvas to grab a PNG snapshot (capture sketch below).
Emotion detection: AWS Rekognition DetectFaces, signed with SigV4 from the client for the hackathon demo (not recommended for production; Rekognition sketch below).
Recommendation: OpenAI Responses API (GPT-4o mini) to propose multiple mood-matched song titles, returned as JSON (LLM sketch below).
Live link: YouTube Data API v3 search to retrieve current video IDs; embedded via https://www.youtube.com/embed/ (YouTube sketch below).
Variety: Randomized pick among candidates and top search results, plus slight prompt/query variation (“official video”, “music video”, etc.); folded into the YouTube sketch below.
(Optional) Proxy: Minimal Node/Express proxy to keep secrets server-side and sidestep browser CORS restrictions in a production-ready setup (proxy sketch below).
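
Capture sketch: a minimal, illustrative version of the webcam-to-PNG step (the function names and standalone canvas are placeholders, not the exact app code):

```typescript
// Start the webcam preview; the browser shows its own consent prompt.
async function startCamera(video: HTMLVideoElement): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: false });
  video.srcObject = stream;
  await video.play();
}

// Grab a single PNG frame from the live <video> element via an offscreen canvas.
function takeSnapshot(video: HTMLVideoElement): Promise<Blob> {
  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  canvas.getContext("2d")!.drawImage(video, 0, 0);
  return new Promise((resolve, reject) =>
    canvas.toBlob((b) => (b ? resolve(b) : reject(new Error("snapshot failed"))), "image/png")
  );
}
```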
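Rekognition sketch: the emotion call via the AWS SDK for JavaScript v3, which handles SigV4 signing. The VITE_* env var names are placeholders, and browser-side credentials are the demo-only shortcut called out above; production traffic should go through the proxy instead.

```typescript
import { RekognitionClient, DetectFacesCommand } from "@aws-sdk/client-rekognition";

// Demo-only: credentials in the browser are exactly what the proxy setup avoids.
const rekognition = new RekognitionClient({
  region: "us-east-1",
  credentials: {
    accessKeyId: import.meta.env.VITE_AWS_ACCESS_KEY_ID,
    secretAccessKey: import.meta.env.VITE_AWS_SECRET_ACCESS_KEY,
  },
});

// Return the dominant emotion label (e.g. "HAPPY") for the first detected face.
async function detectEmotion(snapshot: Blob): Promise<string | undefined> {
  const bytes = new Uint8Array(await snapshot.arrayBuffer());
  const { FaceDetails } = await rekognition.send(
    new DetectFacesCommand({ Image: { Bytes: bytes }, Attributes: ["ALL"] })
  );
  const emotions = FaceDetails?.[0]?.Emotions ?? [];
  // Rekognition returns several emotions with confidence scores; keep the top one.
  return [...emotions].sort((a, b) => (b.Confidence ?? 0) - (a.Confidence ?? 0))[0]?.Type;
}
```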
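LLM sketch: asking the Responses API for several candidates at once. The prompt wording and env var name are placeholders, and `dangerouslyAllowBrowser` is only there because the demo calls OpenAI straight from the client. The naive `JSON.parse` here is what the recovery sketch in the Challenges section hardens.

```typescript
import OpenAI from "openai";

// Demo-only: in production the key moves behind the proxy sketched below.
const openai = new OpenAI({
  apiKey: import.meta.env.VITE_OPENAI_API_KEY,
  dangerouslyAllowBrowser: true,
});

// Ask for several candidate songs as a JSON array so we can randomize later.
async function suggestSongs(mood: string): Promise<string[]> {
  const response = await openai.responses.create({
    model: "gpt-4o-mini",
    input:
      `List 5 well-known songs that fit a ${mood} mood. ` +
      `Reply with only a JSON array of "Artist - Title" strings.`,
  });
  return JSON.parse(response.output_text) as string[];
}
```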
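YouTube sketch: resolving one candidate to a live video ID, with the randomization that keeps repeat runs fresh. The function name and maxResults value are illustrative.

```typescript
// Pick one candidate at random, vary the query slightly, then pick among the
// top search results so back-to-back runs don't always land on the same video.
async function findVideoId(candidates: string[], apiKey: string): Promise<string> {
  const pick = candidates[Math.floor(Math.random() * candidates.length)];
  const suffixes = ["official video", "music video", "official audio"];
  const suffix = suffixes[Math.floor(Math.random() * suffixes.length)];

  const url = new URL("https://www.googleapis.com/youtube/v3/search");
  url.search = new URLSearchParams({
    part: "snippet",
    type: "video",
    maxResults: "5",
    q: `${pick} ${suffix}`,
    key: apiKey,
  }).toString();

  const { items } = await (await fetch(url)).json();
  const top = items.slice(0, 3);
  return top[Math.floor(Math.random() * top.length)].id.videoId;

  // The returned ID drops straight into an in-page player:
  // <iframe src={`https://www.youtube.com/embed/${videoId}`} allow="autoplay; encrypted-media" />
}
```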
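Proxy sketch: the production-ready shape, with the route name and port as placeholders. The browser only ever calls same-origin /api/* routes, so keys stay in server env vars and CORS stops being a problem.

```typescript
import express from "express";

const app = express();
app.use(express.json({ limit: "5mb" })); // headroom in case the base64 snapshot is proxied too

// Same-origin route the frontend calls instead of hitting OpenAI directly.
app.post("/api/recommend", async (req, res) => {
  const { mood } = req.body;
  const upstream = await fetch("https://api.openai.com/v1/responses", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      input: `List 5 songs for a ${mood} mood as a JSON array of "Artist - Title" strings.`,
    }),
  });
  res.status(upstream.status).json(await upstream.json());
});

app.listen(3001);
```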
Challenges we ran into
CORS & keys: Browser calls to cloud services need careful headers and often a backend proxy; handling SigV4 in the browser is error-prone.
Auth failures: 401s from OpenAI and 403s from Google if keys/quotas/restrictions weren’t configured exactly right.
Data flow: Normalizing the LLM’s output to strict JSON and gracefully recovering when responses weren’t perfectly formatted (parsing sketch after this list).
Fresh links: Ensuring links weren’t stale by deferring link selection to YouTube’s live search rather than model-generated URLs.
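
Parsing sketch: the kind of layered fallback we mean, strict JSON first, then best-effort extraction (an illustrative helper, not verbatim app code):

```typescript
// Models sometimes wrap JSON in prose or markdown fences. Try strict parsing,
// then extract the first [...] block, then fall back to one candidate per line.
function parseSongList(raw: string): string[] {
  try {
    const parsed = JSON.parse(raw);
    if (Array.isArray(parsed)) return parsed.map(String);
  } catch {
    // fall through to looser recovery
  }
  const bracketed = raw.match(/\[[\s\S]*\]/);
  if (bracketed) {
    try {
      return (JSON.parse(bracketed[0]) as unknown[]).map(String);
    } catch {
      // still malformed; keep falling back
    }
  }
  return raw
    .split("\n")
    .map((line) => line.replace(/^[\s\d.*-]+/, "").trim())
    .filter(Boolean);
}
```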
Accomplishments that we're proud of
A fully working, end-to-end pipeline from camera → emotion → LLM → live video embed in a few hundred lines of code.
Clean UX: one click to “Take Photo,” one click to “Analyze,” and the music starts playing right on the page.
Robustness touches: multiple candidate generation, randomization, and fallbacks that keep the app feeling fresh.
What we learned
The fastest demos still need good API hygiene: key management, quotas, and request shaping matter—even at a hackathon.
LLMs shine as taste engines when paired with a real-time data source (YouTube) instead of being left to invent links.
Small product details (like embedding the player and adding variety) massively improve perceived quality.
What's next for AI-Powered Song Recommender with Mood Detection
Production hardening: Move all secrets server-side; add App Check / Cognito / OAuth; cache results to reduce API costs.
Richer signals: Blend mood with context (time of day, tempo preferences, recent history) and add user feedback loops.
Multi-modal recs: Offer playlists, similar tracks, or cross-platform links (Spotify/Apple) with deep links.
Accessibility & privacy: On-device face analysis option, blur backgrounds, and explicit data retention controls.
