Deep Learning Final Project

Title: Hit Song Prediction
Who: Camran Lateef and Jaden Chew

Intro- We are interested in how Deep Learning can be applied in the music industry. We have read that streaming services and social media apps like TikTok have drastically changed what constitutes a ‘hit’ song, and have made ‘virality’ a factor in whether a song is successful. Historically, Billboard chart data was enough to determine whether a song was a hit, but listeners now tune in on a variety of platforms and radio plays matter less. We aim to define a ‘hit’ in a way that accounts for streaming and virality as well as chart performance. We will treat the problem as a binary classification problem: given a song, predict whether it is a hit.

Related Work- ‘music2vec: Generating Vector Embeddings for Genre-Classification Task’ (https://medium.com/@rajatheb/music2vec-generating-vector-embedding-for-genre-classification-task-411187a20820) explores ways to represent music as vectors, which would be useful for our preprocessing. ‘Revisiting the problem of audio-based hit song prediction using convolutional neural networks’ (https://ieeexplore.ieee.org/document/7952230) is a paper that tackles this problem for Taiwanese songs using a CNN.

Data- We want to look at recent hits from different genres, so we will manually pick out songs from YouTube based on our ‘hit’ performance metric. We plan to use pytube to download the audio from YouTube videos of the songs we want to train and test on, and convert it into mp3 files. We will then need to encode the mp3 data as vectors, which will be the biggest preprocessing task. We may also combine each song’s encoded audio with an encoded vector of its lyrics, in which case we would need to scrape the lyrics from genius.com. This would be a significant preprocessing task as well.
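As a rough illustration of what "encoding audio as vectors" could look like, the sketch below splits an already-decoded mono waveform into overlapping frames and computes two simple features per frame (RMS energy and zero-crossing rate). This is only a stand-in under stated assumptions: the function name frame_features, the frame and hop sizes, and the two features are hypothetical choices, not our final encoding, and decoding the mp3 into samples is assumed to have happened elsewhere.

```python
import math

def frame_features(samples, frame_size=1024, hop=512):
    """Return one [rms_energy, zero_crossing_rate] vector per frame.

    `samples` is assumed to be a mono waveform as a list of floats in [-1, 1].
    Frame size, hop size and the feature choice are illustrative assumptions.
    """
    features = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Root-mean-square energy of the frame.
        rms = math.sqrt(sum(x * x for x in frame) / frame_size)
        # Fraction of adjacent sample pairs whose sign differs.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        )
        zcr = crossings / (frame_size - 1)
        features.append([rms, zcr])
    return features

# Synthetic stand-in for decoded audio: one second of a 440 Hz tone at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
vectors = frame_features(tone)
```

Each song then becomes a sequence of small vectors, which is the shape of input a sequence model expects. A learned embedding such as the one described in the music2vec write-up could replace these hand-picked features.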

We plan to train the model by passing in an encoded vector of a song as the input, along with a binary label that indicates whether the song meets our definition of a ‘hit’. We will train our model on an equal number of songs from several genres (hip-hop, R&B, rock, pop, jazz, country). Like sentences, songs are sequential, so it makes the most sense for us to use RNNs (likely LSTMs) as the layers in our neural network (similar to our ‘Language Models’ project). On the other hand, it is often a song’s overall qualities that make it a hit (more than its sequential nature), so it might be worth using Transformers as the layers as well, and comparing the performance of the two networks.
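The genre balancing can be sketched as follows. This is a minimal sketch, assuming each song is stored as a (song_id, genre, is_hit) tuple; the helper name balanced_split, the per-genre counts, and the placeholder catalog are all hypothetical. It draws the same number of songs from every genre for training and a smaller, equally balanced set for held-out evaluation.

```python
import random
from collections import defaultdict

def balanced_split(songs, per_genre_train, per_genre_test, seed=0):
    """Split (song_id, genre, is_hit) tuples into train and test sets
    that each contain the same number of songs from every genre."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_genre = defaultdict(list)
    for song in songs:
        by_genre[song[1]].append(song)

    train, test = [], []
    for genre in sorted(by_genre):
        pool = by_genre[genre][:]
        if len(pool) < per_genre_train + per_genre_test:
            raise ValueError(f"not enough songs for genre {genre!r}")
        rng.shuffle(pool)
        train.extend(pool[:per_genre_train])
        test.extend(pool[per_genre_train:per_genre_train + per_genre_test])
    return train, test

# Placeholder catalog: 10 made-up entries per genre, alternating labels.
GENRES = ["hip-hop", "R&B", "rock", "pop", "jazz", "country"]
catalog = [(f"{g}-{i}", g, i % 2) for g in GENRES for i in range(10)]
train_set, test_set = balanced_split(catalog, per_genre_train=8, per_genre_test=2)
```

Splitting per genre rather than over the whole catalog keeps one well-represented genre from dominating either set.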

Metric- An important part of this project will be constructing a metric that determines whether a song is a hit or not. We have not finalized a specific metric yet, but it will likely be a combination of Spotify streams, time on the Billboard charts, and YouTube views. We plan to test our model on fewer songs than we train on, but with the same split of genres. Since this is a binary classification problem, we will be able to directly compute the accuracy of our model after training, that is, the fraction of test songs whose predicted label matches their true label.
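One possible shape for such a metric, together with the accuracy computation, is sketched below. Since the metric is not finalized, every number here is a placeholder assumption: the weights, the normalization caps, the 0.5 threshold, and the function names hit_score, is_hit, and accuracy are hypothetical.

```python
def hit_score(streams, weeks_on_chart, views,
              weights=(0.4, 0.3, 0.3), caps=(1e9, 52, 1e9)):
    """Weighted sum of the three signals, each scaled to [0, 1].

    Weights and caps are placeholder assumptions, not tuned values.
    """
    signals = (streams, weeks_on_chart, views)
    return sum(w * min(s / c, 1.0) for w, s, c in zip(weights, signals, caps))

def is_hit(streams, weeks_on_chart, views, threshold=0.5):
    """Binary label: True when the combined score reaches the threshold."""
    return hit_score(streams, weeks_on_chart, views) >= threshold

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal length")
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)

# Example: three of four predictions agree with the labels.
example_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

Capping each signal before weighting keeps a single very large count, such as YouTube views, from deciding the label on its own.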

Ethics- The commercialization of music is a cultural problem that has become significant recently, as music labels push their artists to make viral ‘TikTok’ songs instead of pushing the boundaries of art. Our trained model will ideally be able to predict a song’s ‘hit’ potential, as it will have learned the features of a ‘hit’ song during training. Machine Learning would only be a good approach to this problem if we could quantify the different qualities of a song, which would be a separate problem to solve in its own right. Music labels and their artists would be major stakeholders in this problem. If the model makes mistakes, one consequence may be that a label pushes its artists to make songs that the model classifies as hits but that would not actually become hits.

Division- Camran: preprocessing, initial architecture design. Jaden: finding songs, constructing the metric, building the model.

Reflection- https://docs.google.com/document/d/1j9Fo1JiXlESYQDE1sG838yhBdpSU8a_JEOqa7kpFyiw/edit?usp=sharing