Inspiration

We all listen to classical music and thought it would be pretty cool to hear what a computer thinks classical music sounds like.

What it does

After being fed classical music of a certain genre, period, and style, the system trains a generator to create increasingly realistic classical music that can trick a learning discriminator into believing the music is legitimate.
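The adversarial setup described above can be sketched in a few lines of PyTorch. This is a minimal illustration only: the toy fully connected networks, sizes, and learning rates are assumptions standing in for the real spectrogram models.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the real generator and discriminator; all sizes
# and hyperparameters here are illustrative assumptions.
noise_dim, sample_dim = 16, 64
G = nn.Sequential(nn.Linear(noise_dim, sample_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(sample_dim, 1), nn.Sigmoid())
loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.randn(8, sample_dim)  # stand-in for real music samples
for step in range(10):
    # Discriminator: label real samples 1, generated samples 0.
    fake = G(torch.randn(8, noise_dim)).detach()
    d_loss = loss_fn(D(real), torch.ones(8, 1)) + \
             loss_fn(D(fake), torch.zeros(8, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: try to make the discriminator label fakes as real (1).
    fake = G(torch.randn(8, noise_dim))
    g_loss = loss_fn(D(fake), torch.ones(8, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(float(g_loss))
```

The key move is the alternation: the discriminator is updated on detached generator output, then the generator is updated against the discriminator's current judgment.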

How we built it

We used PyTorch and torchaudio to create the two networks and the data-loading pipeline.

PyTorch provided the infrastructure to create the two networks relatively easily. The general idea is that a single "super layer" of the CNN consists of a convolution, a normalization, and an activation layer, with pooling applied once the sample size is large enough.
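The "super layer" idea above can be sketched as a reusable PyTorch module. The channel counts, kernel size, and choice of LeakyReLU here are illustrative assumptions, not the project's actual values.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """One 'super layer': convolution -> normalization -> activation,
    with optional pooling. Sizes are illustrative assumptions."""
    def __init__(self, in_ch, out_ch, pool=False):
        super().__init__()
        layers = [
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(0.2),
        ]
        if pool:
            layers.append(nn.MaxPool2d(2))  # halve time/frequency resolution
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        return self.block(x)

# Example: a spectrogram batch shaped (batch, channels, freq_bins, time_bins)
x = torch.randn(4, 1, 128, 256)
y = ConvBlock(1, 16, pool=True)(x)
print(y.shape)  # torch.Size([4, 16, 64, 128])
```

Stacking several such blocks is the standard way to grow a spectrogram CNN while keeping each stage readable.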

torchaudio provided the ability to quickly create spectrograms using its MelSpectrogram transform. We converted our MIDI files to WAVs, which could then be loaded and plotted as a single waveform with distinct features. Ideally the network would then take the three plotted dimensions (time bins, frequency, and power) to identify the type of classical music and generalize a form of it for generation.

The data was extensively pre-processed. We went through a significant number of pieces to find recordings of similar quality, length, and format (sections of the piece) from nominally similar sources, to at least semi-standardize the data inputs.

Challenges we ran into

Pre-processing took far longer than expected. We didn't start the challenge until last Saturday evening, and the vast majority of that Saturday, Sunday, and this past Saturday and Sunday was spent going through as much data as possible to find quality sources.

In addition, learning how to use PyTorch was a challenge. Even at the time of submission, we were running into segfaults that are either the result of memory allocation issues or of problems we have encountered with our gradients. This is one of the main items we hope to continue working on.

In the future, for quick turnaround on ML projects, I think it will be valuable to identify data sources faster and to apply the processing lessons we learned so we can work with the data more efficiently.

In addition, we didn't truly consider the hardware ramifications of the GAN until late in the game. In an ideal world we would have had multiple GPUs with dedicated memory, and maybe even optimized TPUs, to better process the data. In lieu of that, we had to cut back on some of our convolutional layers and increase our chunk size to make better use of our resources.
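The chunking trade-off above amounts to splitting a long spectrogram into fixed-size pieces along the time axis so each batch fits in limited GPU memory. A minimal sketch, where the chunk size and spectrogram shape are illustrative assumptions:

```python
import torch

def chunk_time_axis(spec: torch.Tensor, chunk_size: int) -> torch.Tensor:
    """Split (n_mels, time_bins) into (n_chunks, n_mels, chunk_size),
    dropping the ragged tail that doesn't fill a whole chunk."""
    n_mels, time_bins = spec.shape
    n_chunks = time_bins // chunk_size
    trimmed = spec[:, : n_chunks * chunk_size]
    return trimmed.reshape(n_mels, n_chunks, chunk_size).transpose(0, 1)

spec = torch.randn(128, 1000)            # illustrative spectrogram
chunks = chunk_time_axis(spec, chunk_size=256)
print(chunks.shape)  # torch.Size([3, 128, 256])
```

Larger chunks mean fewer, bigger batches; smaller chunks trade context for memory headroom, which is the knob we ended up turning.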

Accomplishments that we're proud of

Building the networks successfully and actually processing the data within the 72-hour window was a real accomplishment.

I think none of us were familiar with GANs before, so both learning the value of preprocessing and architecting the networks was really cool.

There were a lot of highs and lows in the initial hours of the project, when we were attempting to narrow the scope by being particular with the data and learning more about audio signal processing rather than standard DSP, but ultimately those were the best parts of the project by far.

What we learned

The value of data. Simply put, we learned a lot about finding good data and about constantly adjusting to fit not the ideal specs, but the real specs of the resources you have available. This was a great lesson in managing time as well.

What's next for Classical Music GAN

We plan on ironing out some of the kinks with respect to dimensionality and resource issues that seem to cause sporadic segfaults. We hope to do this by engaging with the PyTorch community to see if they have any advice or lessons we can use.

In addition, we have a TPU coming in the mail this week to ideally offload a few more calculations to, and we hope to have a more robust model in the coming weeks. The project gave us a great reason to learn about PyTorch and its community; now we will be taking the project to a more robust completion.

Built With

pytorch, torchaudio
