Inspiration
We were inspired by how closely music and emotions are connected in everyday life. People often choose songs based on how they feel, but most music players rely on manual selection rather than real-time understanding of the listener's mood. We wanted to explore how AI could automatically detect emotions and create a more interactive, personalized experience.
What it does
VibeAI is a real-time emotion-aware system that uses a webcam to detect facial expressions and translate them into music and avatar responses. It identifies emotions such as happiness, sadness, anger, and neutrality, then selects music that matches or responds to the user’s mood. A Unity-based interface displays live emotion data, confidence levels, and a VRoid avatar that reacts dynamically, creating an engaging and personalized experience.
How we built it
We built VibeAI by combining computer vision, AI, and real-time interaction. The backend uses OpenCV to capture webcam input and DeepFace to classify emotions. The dominant emotion is passed to the music-selection logic, where a mapping table links each emotion to a specific playlist. The system then calls the Spotify API with an authentication token to request playback; Spotify returns a track from the mapped playlist and begins playback on the connected account or device. Meanwhile, the Unity frontend displays the detected emotion, confidence level, and current song, and drives the avatar's expressions and animations in real time.
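As a rough illustration of that pipeline, here is a minimal Python sketch of a single detect-and-play iteration. It assumes the `deepface` and `spotipy` libraries; the playlist IDs are placeholders, and the real backend's structure, error handling, and credential management differ:

```python
# Minimal sketch of one backend iteration: webcam frame -> DeepFace emotion
# -> playlist lookup -> Spotify playback. Playlist IDs are placeholders.
import cv2
import spotipy
from deepface import DeepFace
from spotipy.oauth2 import SpotifyOAuth

# Hypothetical emotion-to-playlist mapping; real URIs come from Spotify.
EMOTION_PLAYLISTS = {
    "happy":   "spotify:playlist:HAPPY_PLAYLIST_ID",
    "sad":     "spotify:playlist:SAD_PLAYLIST_ID",
    "angry":   "spotify:playlist:ANGRY_PLAYLIST_ID",
    "neutral": "spotify:playlist:NEUTRAL_PLAYLIST_ID",
}

# SpotifyOAuth reads credentials from the SPOTIPY_CLIENT_ID,
# SPOTIPY_CLIENT_SECRET, and SPOTIPY_REDIRECT_URI environment variables.
sp = spotipy.Spotify(auth_manager=SpotifyOAuth(scope="user-modify-playback-state"))

cap = cv2.VideoCapture(0)  # default webcam
ok, frame = cap.read()
if ok:
    # Recent DeepFace versions return a list of per-face result dicts.
    results = DeepFace.analyze(frame, actions=["emotion"], enforce_detection=False)
    emotion = results[0]["dominant_emotion"]
    confidence = results[0]["emotion"][emotion]

    playlist = EMOTION_PLAYLISTS.get(emotion, EMOTION_PLAYLISTS["neutral"])
    # Starts playback on the user's active device (requires Spotify Premium).
    sp.start_playback(context_uri=playlist)

cap.release()
```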
Challenges we ran into
One of the biggest challenges was maintaining smooth real-time communication between the Python backend and Unity. Emotion predictions can fluctuate quickly, so we had to implement smoothing techniques to prevent jittery behavior. Integrating the Spotify API also required handling authentication and ensuring reliable playback control. Additionally, mapping emotions to consistent avatar expressions and animations required careful tuning.
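To give a sense of the kind of smoothing we mean, the sketch below takes a majority vote over a sliding window of recent predictions, so a single mispredicted frame cannot flip the avatar's expression. The window size, UDP transport, and port are illustrative assumptions, not our exact implementation:

```python
# Illustrative smoothing: majority vote over the last N raw predictions,
# then push the stabilized state to the Unity frontend over UDP.
import json
import socket
from collections import Counter, deque

WINDOW = 15  # ~0.5 s of history at an assumed 30 fps
recent = deque(maxlen=WINDOW)
unity = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def smoothed_emotion(raw_emotion: str) -> str:
    """Return the most frequent emotion in the recent window."""
    recent.append(raw_emotion)
    return Counter(recent).most_common(1)[0][0]

def send_to_unity(emotion: str, confidence: float) -> None:
    """Send the smoothed state to Unity (localhost:5005 is an assumption)."""
    payload = json.dumps({"emotion": emotion, "confidence": confidence})
    unity.sendto(payload.encode("utf-8"), ("127.0.0.1", 5005))
```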
Accomplishments that we're proud of
We are proud of building a complete end-to-end system that connects emotion detection, music playback, and a responsive 3D avatar. The real-time pipeline, from webcam input to music output and avatar reaction, works cohesively. We also successfully integrated external APIs and created an intuitive interface that clearly demonstrates the system’s capabilities.
What we learned
We learned how to integrate machine learning models into a real-time application and connect multiple systems through APIs. This included working with computer vision, handling noisy prediction data, and managing communication between backend and frontend systems. We also gained experience working with external services like Spotify and building interactive applications in Unity.
What's next for VibeAI
In the future, we want to make VibeAI more personalized by learning which songs improve a user’s mood over time. We also plan to expand emotion detection by incorporating voice analysis and improving accuracy. Additional improvements could include better animation syncing with music, more advanced avatar interactions, and enhanced analytics to track emotional trends.