Link to repo

https://github.com/eugenethreat/hackpsuspring21

Inspiration

Our inspiration came from our group’s shared love of music and our pursuit of the best possible playlists.

What it does

Yet Another Playlist Generator is a machine-learning-enhanced web app that analyzes your Spotify library and returns playlists organized by audio features such as danceability, energy, and tempo.

How we built it

Yet Another Playlist Generator has two primary components: the web app, which talks to the Spotify API, and our machine learning models, which perform the analysis and song recommendations. The web app was built primarily with HTML, CSS, and JavaScript, and its requests are handled by a Node.js server running Express. After the user logs in with their Spotify account, the web app requests the user’s created and saved playlists to build a list of songs. Each song’s audio features are then collected into a CSV, which is sent to the machine learning models.

The machine learning model was written in Python using the scikit-learn and pandas libraries. It is a K-Means clustering algorithm that takes a Spotify user’s library as a CSV and groups songs with similar attributes into clusters. To build each playlist, the model scores every song in a cluster by summing the absolute values of its standardized residuals across all attributes, sorts the cluster by that score, and keeps the desired number of songs with the lowest scores. The number of playlists and the length of each playlist are set by user input.
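The ranking step described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: it assumes the songs of one cluster have already been grouped by the K-Means step, it interprets "standardized residual" as a song's distance from the cluster mean divided by the cluster's standard deviation for that attribute, and the column names and helper functions (`rank_cluster`, `build_playlist`) are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical audio-feature columns; the real CSV may use different names.
FEATURES = ("danceability", "energy", "tempo")


def rank_cluster(songs, features=FEATURES):
    """Sort one cluster's songs from most typical to least typical.

    Each song is scored by the sum over features of |value - mean| / std,
    where mean and std are computed within the cluster.
    """
    stats = {}
    for f in features:
        values = [song[f] for song in songs]
        # Fall back to 1.0 when a feature has no spread, to avoid dividing by zero.
        stats[f] = (mean(values), pstdev(values) or 1.0)

    def score(song):
        return sum(abs(song[f] - stats[f][0]) / stats[f][1] for f in features)

    return sorted(songs, key=score)


def build_playlist(songs, length, features=FEATURES):
    """Keep the `length` songs with the lowest summed residuals."""
    return rank_cluster(songs, features)[:length]


# Made-up data for one cluster, purely for illustration.
cluster = [
    {"name": "A", "danceability": 0.80, "energy": 0.70, "tempo": 120.0},
    {"name": "B", "danceability": 0.82, "energy": 0.72, "tempo": 122.0},
    {"name": "C", "danceability": 0.30, "energy": 0.95, "tempo": 180.0},
    {"name": "D", "danceability": 0.78, "energy": 0.68, "tempo": 118.0},
]
playlist = build_playlist(cluster, length=2)
```

Under this interpretation, song C, the outlier in every attribute, ranks last, so a short playlist drawn from this cluster leaves it out.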

Challenges we ran into

Over the course of the weekend, we encountered several challenges:

Google Cloud Platform

A big challenge was integrating each part of the project on Google Cloud Platform. Originally, the web app ran inside App Engine. However, it turns out App Engine does not allow local file writing, which broke our CSV-writing functionality. Google’s suggestion was to store files in Google Cloud Storage, but we decided the learning curve was too steep, so we moved the entire program to a local computer instead.

I (Eugene) ran into many CORS-related errors when developing locally, so I decided to run the server on App Engine and make requests to it from my local machine. The downside was that every change to the server had to be pushed to App Engine. Each push took only about 30 seconds on average, but over the weekend it added up. This was remedied later when we moved the entire web app to a local machine.

Server problems

HTTP request sizes - Initially, after the request to get the user’s songs completed, I wanted to send the list in another HTTP request. However, the array was larger than the maximum request size the server would accept, resulting in an HTTP 413 error (which I had never encountered before).

JSON problems - There were also several JSON formatting problems in the results from the Spotify API, which caused the server to parse responses incorrectly.
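One common way to stay under a server's request size limit is to split a large list into smaller batches and send each batch in its own request. The writeup does not record how the 413 error was actually resolved, so the sketch below only illustrates the general idea. The helper name `chunked`, the placeholder IDs, and the batch size of 100 are all assumptions.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder IDs standing in for a user's songs.
song_ids = [f"track{i}" for i in range(250)]

# Each batch would be sent as its own, smaller HTTP request.
batches = list(chunked(song_ids, 100))
```

The right batch size depends on the server's configured body limit and on how large each entry is, so it would need to be tuned rather than copied from here.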

Spotify API problems

This was the first time any of us had worked with the Spotify API, so it took a while to read the documentation and get a foothold. The largest problems we ran into were related to OAuth authentication and rate limits.

OAuth - I (Eugene) spent the majority of the first day working out the kinks in OAuth, which was necessary to connect to the API.

Rate limits - Our sample sizes for the model were heavily constrained by Spotify’s API rate limits. The average user’s library probably contains hundreds of songs, but the rate limits usually capped our query at between 100 and 300 songs.
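A rate-limited API typically answers with HTTP 429 and, in Spotify's case, a `Retry-After` header giving the number of seconds to wait. The sketch below shows one possible way to retry around that, which might let a client fetch more songs than a single burst allows. It is not what the project did: `call_with_backoff` and the `fetch` callable are hypothetical, and the fake responses exist only to show the control flow.

```python
import time


def call_with_backoff(fetch, max_attempts=5, sleep=time.sleep):
    """Call `fetch` until it stops returning HTTP 429 or attempts run out.

    `fetch` is a hypothetical zero-argument callable that returns a
    (status_code, headers, body) tuple.
    """
    for attempt in range(max_attempts):
        status, headers, body = fetch()
        if status != 429:
            return status, body
        # Prefer the server's hint; otherwise fall back to exponential backoff.
        wait = float(headers.get("Retry-After", 2 ** attempt))
        sleep(wait)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")


# Fake responses: one rate-limit reply followed by a success.
responses = iter([
    (429, {"Retry-After": "1"}, None),
    (200, {}, {"ok": True}),
])
waits = []  # records requested waits instead of actually sleeping
status, body = call_with_backoff(lambda: next(responses), sleep=waits.append)
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays. In real use the default `time.sleep` would apply.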

Originally, we wanted our model to recommend songs you might like based on your library. This proved to be quite the technical challenge, so we pivoted to library organization.

Accomplishments that we're proud of

Although we weren’t able to reach our project’s initial scope, we successfully integrated the project’s two components and produced playlists from users’ libraries!

What we learned

Working with the Spotify API taught me quite a bit about OAuth authentication flows. None of the APIs I had used before required authentication, so working through it was quite the challenge. I also learned more than I care to know about JavaScript and asynchronous programming. While I was already somewhat familiar with JavaScript, this project showed me many of the quirks and idiosyncrasies that make JavaScript JavaScript.

We learned about the true power of machine learning models through our clustering algorithm. We were surprised at the quality of the playlists the model produced.

What's next for Yet Another Playlist Generator

In the future, we would like to deepen the integration with Spotify’s API so the app can automatically create the playlists it generates in your Spotify library, and move our web app to GCP so it runs more smoothly and is easier for the general public to use.

Workflow - Currently, the CSV generated by the web app must be manually downloaded and fed into our model -- in the future, we would like to integrate the two by moving the models to their own server, which would handle processing and return results to the web app.

API Integration - For a smoother user experience, we would like to increase integration with Spotify’s API so the app can automatically create generated playlists within the user’s Spotify account instead of simply listing recommended songs.

Infrastructure - We would like to move the entire project to GCP to take advantage of its scalability and infrastructure.
