The Gist
We will train a generative adversarial network (GAN) to produce classical music. A generator will learn from transcriptions in a database of classical music, and a discriminator will compare its output against real works. Training both should make the generator better at producing music and the discriminator better at distinguishing generated music from real music. Note: this was our initial plan; we eventually pivoted to GRUs instead.
Our Team
- dlauerma - David Lauerman
- mburke15 - Mason Burke
- ssungun - Serdar Sungun
Introduction
The creation of music is not so much a problem to be solved as a creative expedition. At the end of the day, having a machine produce music of any quality is a success for this project. Our reference is a paper from Stanford whose authors used an end-to-end learning model to generate classical music. We’re not experienced with end-to-end learning, as it involves significantly more difficult and complicated mathematical formulation than the traditional neural networks we’ve learned about and implemented. Instead, we will use a generative adversarial network to pursue the same goal, which will hopefully produce results comparable to those reported in the paper. The three of us are all quite musically inclined, with a particular interest in piano and guitar music.
Related Work
We found a great article about an attempted implementation of a GAN to generate monophonic music, which includes some useful insight from the authors’ experience with GANs. Essentially, they first tried an n-gram model, with little success, and then moved on to a SeqGAN with much better results. By their own measures they ultimately failed to produce real music, although the output contained frequent patches of ordered semi-musicality. They see great potential in a more finely tuned implementation with more layers, and specifically recommend trying various filter sizes to optimize results and cleaning the dataset so that it contains only songs with the same tempo and time signature.
Data
The MusicNet dataset is fairly large, and it will likely not need much preprocessing. If we can figure out a way to do it, we may try to transpose everything to the same key to simplify the training process.
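The transposition idea above can be sketched as follows. This is a minimal illustration, not our actual preprocessing: it assumes each piece is a list of MIDI pitch numbers and that the tonic of each piece is already known (the function names and the tonic input are hypothetical; detecting the key is a separate problem that this sketch does not address).

```python
def semitone_shift_to_c(tonic_pitch_class):
    """Return the smallest shift (in semitones) that moves the tonic to C.

    tonic_pitch_class: 0 = C, 1 = C#, ..., 11 = B (assumed known per piece).
    The result lies in [-6, 5], so no piece moves more than a tritone.
    """
    shift = (-tonic_pitch_class) % 12  # upward shift in 0..11
    if shift > 5:
        shift -= 12  # prefer the shorter downward shift
    return shift


def transpose_to_c(pitches, tonic_pitch_class):
    """Shift every MIDI pitch so that the piece's tonic becomes C.

    No range clamping is done; this assumes the shifted pitches stay
    inside the valid MIDI range 0..127.
    """
    shift = semitone_shift_to_c(tonic_pitch_class)
    return [p + shift for p in pitches]


# A D major triad (tonic D = pitch class 2) becomes a C major triad.
d_major_triad = [62, 66, 69]
c_major_triad = transpose_to_c(d_major_triad, 2)
```

Choosing the smallest shift keeps each piece close to its original register, which may matter for instruments with a limited range.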
Methodology
As mentioned previously, we’ll be using a generative adversarial network to build a generator for classical music. We will train our model on the MusicNet data, a database of 330 classical works. These works have been written into a CSV file, with each note carrying information about pitch, duration, and other important characteristics, such as the instrument used. This information is encoded chronologically and preprocessed so that it is ready to be fed into a neural network. We will train using alternating epochs of generator training and discriminator training, in order to mitigate the unstable equilibrium-style game that arises when a generator and a discriminator are trained continually against each other. The paper we reference does not use a GAN, but we are hoping that the generative capabilities of GANs (combined with the inscrutability of their internal representations) will suit a model that is evaluated subjectively.
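The alternating-epoch schedule described above can be sketched in plain Python. This is only an outline under stated assumptions: `train_discriminator_epoch` and `train_generator_epoch` are hypothetical placeholders for functions that each update one network for a full epoch while the other stays frozen, and starting with the discriminator is an arbitrary choice.

```python
def alternating_schedule(num_epochs, first="discriminator"):
    """Yield (epoch, network_to_train) pairs, switching network every epoch."""
    order = ["discriminator", "generator"]
    if first == "generator":
        order.reverse()
    for epoch in range(num_epochs):
        yield epoch, order[epoch % 2]


def train(num_epochs, train_discriminator_epoch, train_generator_epoch):
    """Run alternating epochs and record the loss reported by each one.

    Both callables are hypothetical: each should train one network for
    one epoch (keeping the other fixed) and return a loss value.
    """
    history = []
    for epoch, role in alternating_schedule(num_epochs):
        if role == "discriminator":
            loss = train_discriminator_epoch()
        else:
            loss = train_generator_epoch()
        history.append((epoch, role, loss))
    return history


# Dummy epoch functions stand in for real training code.
history = train(3, lambda: 0.5, lambda: 1.5)
```

Keeping the schedule separate from the training functions makes it easy to try other ratios later, for example several discriminator epochs per generator epoch.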
Metrics
It is difficult to evaluate a music-creating model from a quantitative or classification-based standpoint, since the value of a piece of classical music is inherently subjective. So, we propose a system of evaluation based on several observational and subjective criteria:
- Is there a coherent melody?
- Are there motifs used throughout the work that reinforce a melody?
- Is there an underlying chord structure that complements the melody, and does it follow an intelligible pattern?
- Is there a clear use of harmony?
- Does the piece have development? Is there a clear beginning, middle, and end?
- Is the music pleasant to listen to?
- Does the generated sample diverge enough from the training dataset?
For our model to be considered “successful,” it should meet all of these criteria. Most importantly, the music should be nice to listen to.
Our base goal is to make music that is coherent and bears some resemblance to classical music. The target goal is to make something that is listenable, preferably with some identifiable attention to structure. In the best-case scenario, we would make music that is really pleasant to listen to. Additionally, we could compare our reactions to our generated music with our reactions to classical music that others have generated in a similar fashion.
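Most of the criteria above are subjective, but the question of whether a generated sample diverges enough from the training data can be checked roughly in code. The sketch below is one possible proxy, not a metric we have committed to: it assumes pieces are lists of MIDI pitch numbers and measures what fraction of the distinct pitch n-grams in a generated sample also appear somewhere in the training set. The function names and the default `n=4` are illustrative assumptions.

```python
def pitch_ngrams(pitches, n):
    """Return the set of all length-n pitch subsequences in a piece."""
    return {tuple(pitches[i:i + n]) for i in range(len(pitches) - n + 1)}


def copied_fraction(generated, training_pieces, n=4):
    """Fraction of the generated sample's distinct n-grams found in training data.

    A value near 1.0 suggests the sample mostly reuses training material;
    a value near 0.0 suggests it diverges from it. Returns 0.0 if the
    sample is shorter than n notes.
    """
    generated_ngrams = pitch_ngrams(generated, n)
    if not generated_ngrams:
        return 0.0
    training_ngrams = set()
    for piece in training_pieces:
        training_ngrams |= pitch_ngrams(piece, n)
    return len(generated_ngrams & training_ngrams) / len(generated_ngrams)


# One of the two 4-grams in the sample also occurs in the training piece.
sample = [60, 62, 64, 65, 67]
training = [[60, 62, 64, 65, 69]]
overlap = copied_fraction(sample, training, n=4)
```

This ignores rhythm and transposition, so a low score does not guarantee originality; it is only meant to flag near-verbatim copying.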
Ethics
What broader societal issues are relevant to your chosen problem space?
Music and art are products of creativity and inspiration, and there is a thin line between drawing inspiration from a work and failing to produce an original one. The possible ethical problem here is marketing another musician’s work as your own after making minor changes to it. Our project relates to this problem because we will be training our model on existing tunes to generate new ones, and there will most probably be similarities. We don’t think we should worry about this too much, as a lot of music shares chord progressions, note sequences, or melodic harmonies, and because we’ll be drawing on many pieces, the generated tunes will have their own touch.
Why is Deep Learning a good approach to this problem?
We will be focusing specifically on classical music. We have access to data for 330 classical pieces, including tens of underlying metrics such as musical patterns, melody-completing chord progressions, and the notes themselves. This gives us hundreds of thousands of data points. Considering the size of our data, deep learning is a good method to employ in this project.
Division of Labor
We are suitemates and will mainly share the work equally, if not work on every step together.