Checkin #2 (re-posting so that it is broken up into paragraphs):
We plan to implement the paper Music Genre Classification using RNN-LSTM. The objective of this paper was to use an LSTM to classify the genre of audio files containing music better than the standard Convolutional Neural Network approach. We both enjoy the satisfaction that comes along with classification problems and are avid consumers of music. Thus, implementing a network that solves a classification problem for music was a well-suited choice for us. Furthermore, the idea of using a new approach to tackle the problem was very appealing.
So far, the process has presented various challenges and questions. Preprocessing in particular has required more curation than we initially anticipated. The dataset we chose appeared to be well documented and organized, but in practice it turned out to be cluttered and hard to access. We tried several approaches to resolve the issue. Cole dug into the documentation and figured out how to download and organize the data, while Jordan analyzed the database and determined how to extract the relevant attributes.
Ultimately, we followed this process. First, we index into tracks.csv and retrieve the data on each track. The file has a multi-level header because tracks also carry information on their artist and album; for our model, however, we only need the information on the track itself. To obtain the audio features for each track, we iterate through features.csv and join on track_id from tracks.csv. We created a track class to consolidate a track’s data into a single object. Our inputs array is therefore composed of tracks whose attributes (id, title, genres, and features) are obtained by looping through the curated tracks.csv and features.csv files. Our labels array is an array of lists, where each list contains the genres associated with a track (a single track can have multiple genres).
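To make the join concrete, here is a minimal sketch of that preprocessing step using only the Python standard library. It is not our exact code. The two-row header layout, the column names (album_title, artist_name, mfcc_1, mfcc_2), the storage of genres as a list literal, and the Track and load_tracks names are all assumptions made for illustration, and the sample rows are made up.

```python
import ast
import csv
import io
from dataclasses import dataclass

# Made-up sample data. The layout (two header rows, track_id in the
# first column) is an assumption, not the real files.
TRACKS_CSV = '''\
,album,artist,track,track
track_id,album_title,artist_name,title,genres
2,Album A,Artist A,Song A,"[21]"
5,Album A,Artist A,Song B,"[21, 10]"
9,Album B,Artist B,Song C,"[4]"
'''

FEATURES_CSV = '''\
track_id,mfcc_1,mfcc_2
2,0.1,0.2
5,0.3,0.4
'''


@dataclass
class Track:
    """Hypothetical stand-in for the track class described above."""
    track_id: int
    title: str
    genres: list
    features: list


def load_tracks(tracks_file, features_file):
    rows = csv.reader(tracks_file)
    top, sub = next(rows), next(rows)  # the two header rows
    # Keep only the columns whose top-level header is "track",
    # mapping each field name to its column position.
    keep = {s: i for i, (t, s) in enumerate(zip(top, sub)) if t == "track"}
    tracks = {int(row[0]): row for row in rows}

    inputs, labels = [], []
    for feat in csv.DictReader(features_file):
        track_id = int(feat.pop("track_id"))
        row = tracks.get(track_id)
        if row is None:
            continue  # feature row with no matching track
        # Assumes genres are stored as a list literal such as "[21, 10]".
        genres = ast.literal_eval(row[keep["genres"]])
        features = [float(value) for value in feat.values()]
        inputs.append(Track(track_id, row[keep["title"]], genres, features))
        labels.append(genres)
    return inputs, labels


inputs, labels = load_tracks(io.StringIO(TRACKS_CSV), io.StringIO(FEATURES_CSV))
```

In this sketch, a track that has no row in the features file (track 9 in the sample) is simply left out of inputs and labels, which mirrors an inner join on track_id.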
Some questions still linger. Can attributes such as title, artist, album, and duration help classify a track's genre, or are they extraneous data? Perhaps a track’s audio features and spectrogram are the most informative inputs. Introducing unnecessary data may blur the underlying patterns or, at the other extreme, cause overfitting, as our model picks up on patterns that are not actually relevant. This is one of many factors that we will tinker with as we implement our model.