Inspiration

Music can shift a moment in seconds. We wanted a fun, instant way to turn your current mood into a perfect track—no typing, just vibes.

Hackathons are about rapid prototyping with real-world APIs, so we combined computer vision + LLMs + live YouTube links into a tiny, delightful experience.

Bonus: it’s a great demo of privacy-aware client UX, API orchestration, and handling CORS/auth in the wild.

What it does

Captures a quick webcam snapshot (user consent required).

Detects the dominant facial emotion (e.g., CALM, HAPPY, SAD, SURPRISED) using AWS Rekognition.

Asks OpenAI (GPT-4o mini) for several song candidates tailored to that mood.

Calls the YouTube Data API to fetch a fresh, valid video and embeds it on the page—no leaving the app.

Adds variety by sampling from multiple model suggestions and the top YouTube results.
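Stitched together, the whole flow is a short async chain. A minimal sketch, where capturePhoto, detectEmotion, suggestSongs, and findVideo are stand-in names for the steps above, not our exact implementation:

```javascript
// Hypothetical end-to-end flow; each injected helper stands in for one step above.
async function recommendForMood({ capturePhoto, detectEmotion, suggestSongs, findVideo }) {
  const snapshot = await capturePhoto();       // webcam PNG (with consent)
  const mood = await detectEmotion(snapshot);  // e.g. 'HAPPY'
  const candidates = await suggestSongs(mood); // several titles from the LLM
  const choice = candidates[Math.floor(Math.random() * candidates.length)];
  return findVideo(choice);                    // fresh, embeddable YouTube URL
}
```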

How we built it

Frontend: React + Vite. The getUserMedia API to access the webcam; a &lt;canvas&gt; to grab a PNG snapshot.
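The snapshot step, roughly (browser-only APIs, simplified from our component; error handling omitted):

```javascript
// Browser-side sketch: start the webcam after the user clicks a consent button,
// draw the current frame onto a canvas, and export it as a PNG data URL.
async function capturePhoto(videoEl) {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  videoEl.srcObject = stream;
  await videoEl.play();

  const canvas = document.createElement('canvas');
  canvas.width = videoEl.videoWidth;
  canvas.height = videoEl.videoHeight;
  canvas.getContext('2d').drawImage(videoEl, 0, 0);

  stream.getTracks().forEach((t) => t.stop()); // release the camera
  return canvas.toDataURL('image/png');        // 'data:image/png;base64,...'
}
```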

Emotion detection: AWS Rekognition DetectFaces with SigV4 signing from the client for the hackathon demo (not recommended for prod).
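On the response side, DetectFaces returns FaceDetails with an Emotions array of { Type, Confidence } pairs, so picking the dominant emotion is a small sort (response shape per the Rekognition docs; the sample data is illustrative):

```javascript
// Pick the dominant emotion from a Rekognition DetectFaces response.
// Shape: FaceDetails[].Emotions[] with { Type, Confidence } entries.
function dominantEmotion(detectFacesResponse) {
  const face = detectFacesResponse.FaceDetails?.[0];
  if (!face || !face.Emotions?.length) return null;
  return face.Emotions
    .slice()
    .sort((a, b) => b.Confidence - a.Confidence)[0].Type; // e.g. 'HAPPY'
}
```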

Recommendation: OpenAI Responses API (GPT-4o mini) to propose multiple mood-matched song titles, returned as strict JSON.
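The request body is tiny; the important part is the prompt demanding strict JSON. A sketch (the prompt wording and candidate count are ours; model and input field names follow the Responses API):

```javascript
// Build the request body for the OpenAI Responses API, asking the model
// to answer with nothing but a JSON array of song titles.
function buildSongPrompt(mood, count = 5) {
  return {
    model: 'gpt-4o-mini',
    input:
      `Suggest ${count} songs that fit a ${mood} mood. ` +
      'Respond with ONLY a JSON array of strings, e.g. ["Artist - Title", ...].',
  };
}
```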

Live link: YouTube Data API v3 search to retrieve current video IDs; embed via https://www.youtube.com/embed/.
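Both URLs are simple string builds (maxResults=5 is our choice; in production the key should never be assembled client-side):

```javascript
// Build the YouTube Data API v3 search URL and the in-page embed URL.
function youtubeSearchUrl(query, apiKey) {
  const params = new URLSearchParams({
    part: 'snippet',
    q: query,
    type: 'video',
    maxResults: '5',
    key: apiKey,
  });
  return `https://www.googleapis.com/youtube/v3/search?${params}`;
}

function embedUrl(videoId) {
  return `https://www.youtube.com/embed/${videoId}`;
}
```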

Variety: Randomized pick among candidates and top search results; slight prompt/query variation (“official video”, “music video”, etc.).
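The variety logic itself is just a random pick plus a rotating query suffix (suffix list and helper names are ours):

```javascript
// Randomize the query suffix and the final pick so the same mood
// doesn't always land on the same video.
const QUERY_SUFFIXES = ['official video', 'music video', 'official audio'];

function randomPick(items) {
  return items[Math.floor(Math.random() * items.length)];
}

function buildQuery(songTitle) {
  return `${songTitle} ${randomPick(QUERY_SUFFIXES)}`;
}
```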

(Optional) Proxy: Minimal Node/Express proxy to hide secrets and bypass CORS in a production-ready setup.

Challenges we ran into

CORS & keys: Browser calls to cloud services need careful headers and often a backend proxy; handling SigV4 in the browser is error-prone.

Auth failures: 401s from OpenAI and 403s from Google if keys/quotas/restrictions weren’t configured exactly right.

Data flow: Normalizing the LLM’s output to strict JSON and gracefully recovering when responses weren’t perfectly formatted.
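Our recovery logic boiled down to: strip code fences, try JSON.parse, and fall back to line-splitting. A simplified sketch:

```javascript
// Normalize LLM output to an array of song titles: strip markdown code
// fences, attempt strict JSON parsing, and fall back to splitting lines
// (dropping list markers) when the response isn't valid JSON.
function parseCandidates(raw) {
  const cleaned = raw.replace(/```(?:json)?/g, '').trim();
  try {
    const parsed = JSON.parse(cleaned);
    if (Array.isArray(parsed)) return parsed.map(String);
  } catch {
    // fall through to the line-based fallback
  }
  return cleaned
    .split('\n')
    .map((line) => line.replace(/^[-*\d.\s]+/, '').trim())
    .filter(Boolean);
}
```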

Fresh links: Ensuring links weren’t stale by deferring link selection to YouTube’s live search rather than model-generated URLs.

Accomplishments that we're proud of

A fully working, end-to-end pipeline from camera → emotion → LLM → live video embed in a few hundred lines of code.

Clean UX: one click to “Take Photo,” one click to “Analyze,” and instant in-page music.

Robustness touches: multiple candidate generation, randomization, and fallbacks that keep the app feeling fresh.

What we learned

The fastest demos still need good API hygiene: key management, quotas, and request shaping matter—even at a hackathon.

LLMs shine as taste engines when paired with a realtime data source (YouTube) instead of inventing links.

Small product details (like embedding the player and adding variety) massively improve perceived quality.

What's next for AI-Powered Song Recommender with Mood Detection

Production hardening: Move all secrets server-side; add App Check / Cognito / OAuth; cache results to reduce API costs.

Richer signals: Blend mood with context (time of day, tempo preferences, recent history) and add user feedback loops.

Multi-modal recs: Offer playlists, similar tracks, or cross-platform links (Spotify/Apple) with deep links.

Accessibility & privacy: On-device face analysis option, blur backgrounds, and explicit data retention controls.
