
Introduction: We are tackling the problem of generating art -- specifically music -- using deep learning. We are implementing the “Deep Learning for Music” paper by Allen Huang and Raymond Wu, which trained an RNN on a musical dataset to see whether melody and harmony can be generated the same way one would generate language.

Challenges: Transforming MIDI files into a piano roll representation has been the most challenging aspect of the project so far. Notes are indicated by “note-on” and “note-off” messages, each annotated with a note number (1-88) and a time stamp. Time is measured in “ticks”, and meta-data messages give the number of ticks per beat. These times are relative: each one is the delta time since the previous message. Many conversions therefore need to be made to interpret the MIDI files correctly. We also need a data structure that holds all the notes sounding at any given timestep, so that we can align multiple tracks and sample at every eighth note. Another challenge is reading the MIDI files in as normalized a form as possible so that the data is more consistent: we are trying to put all files in the same key with similar tempos, and the lack of consistency in the data is making this difficult.
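The two conversions described above -- accumulating delta times into absolute tick times, then sampling the set of held notes at every eighth note -- can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline; the `(delta_ticks, note, is_on)` event format and all function names are hypothetical stand-ins for what a MIDI parser would produce.

```python
# Hypothetical sketch: turn delta-timed note events into a piano roll
# sampled at every eighth note. Event tuples are (delta_ticks, note, is_on).

def to_absolute(events):
    """Accumulate relative delta times into absolute tick times."""
    out, now = [], 0
    for delta, note, is_on in events:
        now += delta
        out.append((now, note, is_on))
    return out

def piano_roll(events, ticks_per_beat):
    """List the notes held at every eighth note (half a beat)."""
    abs_events = to_absolute(events)
    step = ticks_per_beat // 2          # ticks per eighth note
    total = abs_events[-1][0] if abs_events else 0
    roll, held, i = [], set(), 0
    for t in range(0, total + 1, step):
        # apply every note-on/note-off up to and including this timestep
        while i < len(abs_events) and abs_events[i][0] <= t:
            _, note, is_on = abs_events[i]
            (held.add if is_on else held.discard)(note)
            i += 1
        roll.append(sorted(held))
    return roll
```

For example, with 480 ticks per beat, a quarter-note middle C followed by a quarter-note E comes out as five eighth-note slices: two holding C, two holding E, and an empty one at the final note-off.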

At this time, pre-processing looks like it will be the most challenging aspect of the project.
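One piece of the key normalization mentioned above can be sketched as a simple transposition: once a piece's tonic is known (e.g. from a key-signature meta message), every note is shifted so the tonic lands on C. This is an illustrative assumption about how such normalization could work, not our confirmed approach; the function name and interface are hypothetical.

```python
# Hypothetical sketch: normalize key by transposing so the tonic maps to C.
# tonic_pc is the tonic's pitch class, 0-11 with C = 0, assumed to have been
# read from the file's key-signature metadata.

def transpose_to_c(notes, tonic_pc):
    """Shift all note numbers by the smallest interval that moves
    the tonic pitch class to C (down for C#-F#, up for G-B)."""
    shift = -tonic_pc if tonic_pc <= 6 else 12 - tonic_pc
    return [n + shift for n in notes]
```

For instance, a D-major triad (tonic pitch class 2) shifts down two semitones to C major, while a piece in A (pitch class 9) shifts up three semitones rather than down nine.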

Insights: We do not have any concrete results at the moment, as we are still finishing up pre-processing.

Plan: We are reasonably on track with the project, but we will need to dedicate more time to implementing the model over the next week or so. Ideally we should have ample time to tweak hyper-parameters and fiddle with the architecture so that we can optimize the generated music. Evaluating the model's performance will be significant work, and may involve generating audio samples from the model's output; all in all, we need at least a few days to evaluate the model before the project is due. So far, we are not planning to change anything, though it may not be feasible for us to normalize the tempos and time signatures of the pieces as we had previously planned.
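For context on why tempo normalization is awkward: in the MIDI format, tempo is carried by set_tempo meta messages expressed in microseconds per quarter note (defaulting to 500,000, i.e. 120 BPM), so tick durations only map to wall-clock time after combining them with both the tempo and the ticks-per-beat value. A minimal sketch of that conversion, with hypothetical names:

```python
# Sketch: convert a tick duration to seconds using the standard MIDI
# tempo convention (set_tempo gives microseconds per quarter note).
# 500,000 us/beat is the format's default tempo, i.e. 120 BPM.

def ticks_to_seconds(ticks, ticks_per_beat, us_per_beat=500_000):
    return ticks / ticks_per_beat * us_per_beat / 1_000_000
```

Because a piece can change tempo mid-stream, normalizing it means tracking every tempo message and rescaling each region separately, which is part of why we may drop that step.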
