Inspiration
We really wanted to make something that was fun, unique, and something we'd want to use ourselves. A desktop buddy that makes studying more interesting and involves computer vision, machine learning, and music was a strong fit for these goals. More specifically, the idea of adaptive song selection based on the user's current emotions/state was something we wanted from the very beginning.
What it does
It's a cute little robot buddy that sits on your screen and can be dragged wherever you want. It regularly runs webcam frames through a fine-tuned YOLO model to estimate drowsiness/focus on a 0-1 scale from facial features, then periodically asks a song-ranking algorithm to find a song matching the current level (via audio features like energy, valence, and tempo) and add it to the queue.
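The matching step can be sketched as a simple nearest-neighbor search over audio features. This is a hypothetical illustration, not our exact code: the feature names follow Spotify's audio-feature terminology, but the mapping direction (drowsier → more energetic music to re-engage the user) and the weights are made up for the example.

```python
# Hypothetical sketch: map a drowsiness score in [0, 1] to target audio
# features, then pick the candidate song closest to that target.

def target_features(drowsiness: float) -> dict:
    """Illustrative mapping: more drowsiness -> higher-energy, faster music."""
    return {
        "energy": 0.4 + 0.6 * drowsiness,   # 0.4 (alert) .. 1.0 (drowsy)
        "valence": 0.5 + 0.3 * drowsiness,
        "tempo": 90 + 70 * drowsiness,      # BPM, roughly normalized below
    }

def score(song: dict, target: dict) -> float:
    """Lower is better: weighted distance between song and target features."""
    d_energy = abs(song["energy"] - target["energy"])
    d_valence = abs(song["valence"] - target["valence"])
    d_tempo = abs(song["tempo"] - target["tempo"]) / 160  # rough BPM scale
    return d_energy + d_valence + d_tempo

def pick_song(candidates: list[dict], drowsiness: float) -> dict:
    """Return the candidate whose features best match the current state."""
    target = target_features(drowsiness)
    return min(candidates, key=lambda s: score(s, target))
```

With this shape, an alert user (score near 0) gets calmer, slower tracks, and a drowsy user (score near 1) gets higher-energy ones.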
How we built it
Daniel was primarily responsible for the app functionality and Spotify API integration, while Nathan handled model training, evaluation, and real-time inference. We used a PySide (Qt) frontend for the desktop buddy window, OpenCV for webcam capture, a YOLO model fine-tuned on a drowsiness detection dataset on the RCAC Gautschi clusters, and the Spotify API for real-time song selection and queuing based on the user's drowsiness.
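The glue between those pieces is a polling loop: infer drowsiness from the webcam every few seconds, queue a new song much less often. Here is a minimal sketch of that structure with the capture, inference, and queuing steps injected as callables, since the real versions depend on OpenCV, the YOLO model, and an authenticated Spotify client; the intervals and function names are illustrative.

```python
import time

def run_buddy(grab_frame, infer_drowsiness, queue_song,
              infer_every=2.0, queue_every=30.0, run_for=60.0):
    """Infer drowsiness every few seconds; queue a matching song less often."""
    next_queue = 0.0
    start = time.monotonic()
    while (elapsed := time.monotonic() - start) < run_for:
        frame = grab_frame()              # e.g. an OpenCV webcam frame
        level = infer_drowsiness(frame)   # 0.0 (alert) .. 1.0 (drowsy)
        if elapsed >= next_queue:
            queue_song(level)             # pick + queue a track via Spotify
            next_queue = elapsed + queue_every
        time.sleep(infer_every)
```

Decoupling the two intervals matters: inference can run frequently to keep the score fresh, while queuing stays infrequent enough not to spam the playback queue (or the API).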
Challenges we ran into
Both Nathan and I ran into significant challenges. The most significant one for me was realizing less than 15 hours before the submission deadline that the Spotify API endpoint for recommendations was deprecated. This forced me to design an entirely custom recommendation system using the limited Spotify API endpoints still available, and though it isn't as polished as Spotify's algorithm, it worked well enough to deliver this MVP.
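The replacement pipeline can be sketched roughly as: search for candidate tracks, fetch their audio features, rank by distance to a target, and queue the winner. This is a hedged illustration rather than our exact code; it assumes an authenticated spotipy client (`sp`), uses spotipy's `search`, `audio_features`, and `add_to_queue` methods, and ranks on energy alone for brevity.

```python
def recommend_and_queue(sp, query: str, target_energy: float) -> str:
    """Search Spotify, rank candidates by energy distance, queue the best one.
    `sp` is assumed to be an authenticated spotipy.Spotify client."""
    results = sp.search(q=query, type="track", limit=10)
    ids = [t["id"] for t in results["tracks"]["items"]]
    feats = sp.audio_features(ids)  # one feature dict per track id
    # Pick the candidate whose energy is closest to the target.
    best_id, _ = min(
        ((f["id"], abs(f["energy"] - target_energy)) for f in feats if f),
        key=lambda pair: pair[1],
    )
    uri = f"spotify:track:{best_id}"
    sp.add_to_queue(uri)  # adds to the user's active playback queue
    return uri
```

In practice the real ranking would combine several features (energy, valence, tempo) rather than energy alone.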
Another thing that was kinda funny: our initial Spotify developer portal app got locked out for a full 24 hours because we hit the API endpoints way too many times during development without proper retry logic. That took a while to recover from, but it pushed us to build a caching system that significantly reduced the number of requests we were making.
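The fix boils down to two ideas: serve repeated requests from a cache, and back off exponentially when a request fails. A minimal sketch, where `fetch` stands in for any Spotify API call and a plain `RuntimeError` stands in for a rate-limit error (both hypothetical placeholders):

```python
import time

def cached_with_retry(fetch, cache, key, max_retries=3, base_delay=1.0):
    """Return a cached response if present; otherwise fetch with backoff."""
    if key in cache:
        return cache[key]
    for attempt in range(max_retries):
        try:
            result = fetch()
        except RuntimeError:                       # stand-in for an HTTP 429
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff
        else:
            cache[key] = result
            return result
    raise RuntimeError(f"gave up after {max_retries} retries for {key!r}")
```

Even a plain in-memory dict as the cache cuts out every duplicate request within a session, which is most of what got us rate-limited in the first place.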
The initial dataset Nathan chose, as well as the model architecture, did not yield great results. This may have been because the task was framed as multiclass classification and the model was trained from scratch. He decided to downscope the ML side, opting for a fine-tuning approach and a simpler output value, which ended up working very well: our model reliably detects alertness and drowsiness, even when running purely on local machines.
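That "simpler output value" means collapsing per-frame detections into a single 0-1 drowsiness score. A sketch of one way to do it, with a rolling average to smooth frame-to-frame jitter; the class names (`"drowsy"`, `"awake"`) and window size are illustrative, not our exact labels.

```python
def drowsiness_score(detections, window, window_size=10):
    """detections: list of (class_name, confidence) pairs for one frame.
    Returns a smoothed score in [0, 1]; higher means drowsier."""
    drowsy = max((c for name, c in detections if name == "drowsy"), default=0.0)
    awake = max((c for name, c in detections if name == "awake"), default=0.0)
    # Relative confidence for this frame; 0.5 when neither class is detected.
    frame_score = drowsy / (drowsy + awake) if (drowsy + awake) else 0.5
    window.append(frame_score)          # rolling window smooths jitter
    if len(window) > window_size:
        window.pop(0)
    return sum(window) / len(window)
```

Smoothing matters here because a single blink shouldn't swing the song selection; only a sustained trend should.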
Accomplishments that we're proud of
We're really proud that we shipped a final product, met the goals we set at the beginning of the hackathon, and overcame the seriously difficult challenges we faced throughout the weekend.
What we learned
Start early, do your research on the tools you are going to use, and don’t be afraid to pivot and downscope a bit in order to deliver a final product.
What's next for AI Study Buddy
There are many additional features that could be added to AI Study Buddy. A line graph visualizing the focus value over time would be very interesting to see as a user. We'd also like to improve the song recommendation algorithm, or swap to a music service whose API is more open to developers. Last but not least, we had a model trained to detect emotions as well, so we could integrate emotion predictions into our song recommendation parameters for even more personalized music.