Random music generators have always seemed lacking, often producing robotic, repetitive, and simplistic tunes that quickly turn dull. Music has long escaped the grasp of computers because of the unconventional and often illogical ways it tends to form. That is, until now!
What it does
Viano is a music generation system that produces complex and intricate piano scores. Taking advantage of everything from Deep Neural Networks to a vast vocabulary of musical phonemes, Viano is able to produce organic scores that seem both realistic and dynamic.
How I built it
Viano runs on three high-level but extremely powerful variables that influence the end result: Complexity, Dissonance, and "Scaliness".
Complexity: This controls the detail and layering of the rhythms and note jumps, and adds flair across the piece. High complexity yields intricate results that add excitement to everything from scales to the melody.
Dissonance: Dissonance is the measure of how far the piece will stray from the traditional bounds of keys, chords, and measures. Increasing dissonance increases the chaos, allowing off-tune notes to slip through and surprising rhythms to dazzle, which makes the result more unique.
Scaliness: This is the term that refers to how closely the tunes will adhere to scales. High scaliness results in bass lines and melodies that follow the natural scale progression of the key signature. This adds a smooth, logical feel that complements the chaos of dissonance.
Viano runs on a custom interpretation of MIDI that allows for discrete time placement of notes. This system breaks a song down into structures called "segments". A segment can represent anything from a short phrase to the main chorus of a song. The result is a format that is easy to work with for both generation and learning. It allowed me to project existing note sequences from famous classical artists into an "image". I then passed this "image" of notes into a deep neural network.
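As an illustration of the segment-to-"image" idea, here is a minimal sketch of one common way to do it, a piano roll. It is not Viano's actual code: the names Note, Segment, and to_piano_roll, the step-based timing, and the 88-key pitch range are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    pitch: int     # MIDI pitch number (60 is middle C)
    start: int     # onset, in discrete steps from the start of the segment
    duration: int  # length, in steps


@dataclass
class Segment:
    length: int                               # total number of steps
    notes: list = field(default_factory=list)


def to_piano_roll(segment, low=21, high=108):
    """Project a segment onto a 2D grid: one row per pitch, one column per step."""
    roll = [[0] * segment.length for _ in range(high - low + 1)]
    for note in segment.notes:
        if not low <= note.pitch <= high:
            continue  # skip pitches outside the assumed piano range
        end = min(note.start + note.duration, segment.length)
        for step in range(max(note.start, 0), end):
            roll[note.pitch - low][step] = 1
    return roll


# A C major arpeggio: C for 2 steps, E for 2 steps, G for 4 steps.
segment = Segment(length=8, notes=[Note(60, 0, 2), Note(64, 2, 2), Note(67, 4, 4)])
roll = to_piano_roll(segment)
```

Each row of roll is one pitch and each column one time step, so the grid can be fed to an image-style network as a single-channel picture.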
The output of the deep learning network was then parsed for the top guesses for the next note and combined with new candidate notes pulled from custom algorithms. This formed a probability mapping that was then filtered multiple times to increase its affinity for musical structures such as scales, arpeggios, and harmonics. From here, Viano picks a note at random based on the calculated probabilities.
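The pipeline above (top guesses, extra candidates, filtering, weighted random pick) could look roughly like the sketch below. Everything here is an assumption for illustration: the function names, the top_k and extra_weight values, and the single scale filter that stands in for Viano's several filtering passes. Using scaliness as the filter strength is also a guess at how the parameter might be applied.

```python
import random

C_MAJOR = {0, 2, 4, 5, 7, 9, 11}  # pitch classes of the C major scale


def build_distribution(network_scores, extra_candidates, top_k=5, extra_weight=0.05):
    """Keep the top_k network guesses and mix in extra candidate pitches."""
    ranked = sorted(network_scores.items(), key=lambda kv: kv[1], reverse=True)
    weights = dict(ranked[:top_k])
    for pitch in extra_candidates:
        weights[pitch] = weights.get(pitch, 0.0) + extra_weight
    return weights


def scale_filter(weights, key_pitch_classes, scaliness):
    """Damp out-of-key pitches: scaliness=1 removes them, 0 leaves them alone."""
    return {
        pitch: w * (1.0 if pitch % 12 in key_pitch_classes else 1.0 - scaliness)
        for pitch, w in weights.items()
    }


def normalize(weights):
    """Rescale the weights so they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("every candidate was filtered out")
    return {pitch: w / total for pitch, w in weights.items()}


def pick_note(weights, rng):
    """Draw one pitch at random, in proportion to its weight."""
    pitches = list(weights)
    return rng.choices(pitches, weights=[weights[p] for p in pitches], k=1)[0]


# Made-up network scores for the next note; 61 (C sharp) is out of key.
scores = {60: 0.4, 61: 0.3, 62: 0.2, 64: 0.1}
weights = build_distribution(scores, extra_candidates=[67])
filtered = normalize(scale_filter(weights, C_MAJOR, scaliness=1.0))
note = pick_note(filtered, random.Random(0))
```

With scaliness at 1.0 the out-of-key candidate ends up with zero probability, so the picked note always lies in C major. Lower values would let it slip through occasionally.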
Rhythms are developed in relative independence for each generated segment. Each segment picks a subset of rhythm "phonemes" from a custom library; these are the building blocks of any rhythm. The phonemes are then assigned notes and put together. This allows for dynamic and near-infinite combinations.
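A minimal sketch of the phoneme idea, assuming each phoneme is a short pattern of durations that fills exactly one beat. The PHONEMES library, the sixteenth-note step size, and the build_rhythm helper are hypothetical stand-ins, not Viano's real library.

```python
import random

# Hypothetical phoneme library: each entry is a tuple of durations in
# sixteenth-note steps, and each one fills exactly one beat (4 steps).
PHONEMES = [
    (4,),          # quarter note
    (2, 2),        # two eighths
    (1, 1, 2),     # two sixteenths, then an eighth
    (2, 1, 1),     # an eighth, then two sixteenths
    (3, 1),        # dotted eighth plus a sixteenth
    (1, 1, 1, 1),  # four sixteenths
]


def build_rhythm(beats, palette_size, rng):
    """Pick a small palette of phonemes for the segment, then fill each beat from it."""
    palette = rng.sample(PHONEMES, palette_size)
    rhythm = []  # list of (onset, duration) pairs, in steps
    time = 0
    for _ in range(beats):
        for duration in rng.choice(palette):
            rhythm.append((time, duration))
            time += duration
    return rhythm


rhythm = build_rhythm(beats=4, palette_size=3, rng=random.Random(1))
```

Restricting each segment to a small palette keeps the rhythm recognizable within that segment, while a different palette in the next segment provides contrast.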
Since running scales do not follow the building-block convention, scales were procedurally generated by selecting target notes within a key, spreading them across multiple octaves, and then riding the key or the chromatic scale to each target. High complexity allows wiggles to be added to the scale.
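The "ride the scale to each target" step might be sketched as follows. The target notes are chosen by hand here, and the helper names (key_pitches, run_between, scale_run) and the major-key assumption are illustrative only. Target selection, octave spreading, and complexity wiggles are left out.

```python
MAJOR_STEPS = (0, 2, 4, 5, 7, 9, 11)  # semitone offsets of a major scale


def key_pitches(tonic, low=21, high=108):
    """All MIDI pitches in the major key of `tonic` within the given range."""
    return [p for p in range(low, high + 1) if (p - tonic) % 12 in MAJOR_STEPS]


def run_between(start, target, pitches):
    """Walk along `pitches` from start to target (start excluded, target included)."""
    i, j = pitches.index(start), pitches.index(target)
    step = 1 if j >= i else -1
    return [pitches[k] for k in range(i + step, j + step, step)]


def scale_run(targets, pitches):
    """Connect consecutive target notes by stepping through `pitches`."""
    line = [targets[0]]
    for target in targets[1:]:
        line.extend(run_between(line[-1], target, pitches))
    return line


c_major = key_pitches(60)
diatonic = scale_run([60, 67, 64], c_major)               # up to G, back down to E
chromatic = scale_run([60, 63], list(range(21, 109)))     # ride every semitone
```

Passing the key's pitches gives a diatonic run, while passing every semitone gives a chromatic one, so the same routine covers both cases.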
Each segment contains two melodies: the treble (traditionally played by the right hand) and the bass (played by the left hand). The two are kept separate: the custom data structures allow for completely independent bass and treble melodies and timings.
A song is assembled by attaching multiple segments together, producing a piece with a variety of similar but distinct parts.
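A minimal sketch of a two-voice segment and of chaining segments into a song, assuming notes are stored as (start, duration, pitch) tuples timed from the start of their segment. The Segment class and the assemble function are hypothetical names, not Viano's data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    length: int                                  # total length in steps
    treble: list = field(default_factory=list)   # (start, duration, pitch)
    bass: list = field(default_factory=list)     # (start, duration, pitch)


def assemble(segments):
    """Concatenate segments by shifting each one's notes by the running offset."""
    treble, bass, offset = [], [], 0
    for seg in segments:
        treble += [(start + offset, dur, pitch) for start, dur, pitch in seg.treble]
        bass += [(start + offset, dur, pitch) for start, dur, pitch in seg.bass]
        offset += seg.length
    return Segment(length=offset, treble=treble, bass=bass)


verse = Segment(8, treble=[(0, 4, 72), (4, 4, 74)], bass=[(0, 8, 48)])
chorus = Segment(8, treble=[(0, 2, 76), (2, 6, 79)], bass=[(0, 4, 43), (4, 4, 48)])
song = assemble([verse, chorus, verse])
```

Because each voice keeps its own note list, the bass can hold one long note while the treble moves, and reusing the verse segment gives the repeated-but-varied structure described above.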
One of the primary challenges was converting MIDI data into a format I would be able to work with. I came up with the segment-based framework. In this system, each note is stored at a specific time relative to the beginning of the segment. This allows for much easier manipulation and composition of notes.
Without this system, MIDI notes are defined only relative to the previous event, which makes managing and working with MIDI data extremely challenging and often impractical for certain applications.
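To make the difference concrete, here is a small sketch that turns delta-timed events into notes with absolute start times. The (delta_ticks, kind, pitch) event format is a simplified stand-in for real MIDI messages, and to_absolute is a hypothetical helper rather than part of any MIDI library.

```python
def to_absolute(events):
    """Convert (delta_ticks, kind, pitch) events into (start, duration, pitch) notes.

    `kind` is "on" or "off". Each delta is the time since the previous event.
    """
    time = 0
    open_notes = {}  # pitch -> absolute start time of the sounding note
    notes = []
    for delta, kind, pitch in events:
        time += delta  # accumulate deltas to get absolute time
        if kind == "on":
            open_notes[pitch] = time
        elif kind == "off" and pitch in open_notes:
            start = open_notes.pop(pitch)
            notes.append((start, time - start, pitch))
    return sorted(notes)


events = [
    (0, "on", 60), (0, "on", 64),   # C and E start together
    (480, "off", 60), (0, "on", 62),  # C ends, D starts
    (480, "off", 64), (0, "off", 62), # E and D end together
]
notes = to_absolute(events)
```

In the delta form, moving one note means rewriting the deltas of its neighbors. In the absolute form each note carries its own start time and can be edited on its own.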
To create relatively human-sounding music, I had to heavily modify and constrain the original machine learning system. This involved a huge number of custom algorithms, rules, and weighting functions. After adding these, the music produced became drastically better, and further tuning should improve it even more.