Figure 1: Sentence datapoints plotted with initial centroids
Figure 2: K-means clusters after function convergence
We knew we wanted our project to check off a few boxes:
- We wanted to incorporate machine learning in some form.
- We wanted the project to be something we may use or revisit in the future.
- The project had to incorporate the interests of all of us.
- And of course, it needed to be fun!!!
We did some brainstorming and decided to base our project around music--specifically, creating music through machine learning! SentenceSymphony is the result!
What it does
Given a large text file (like a book), SentenceSymphony analyzes the input text sentence by sentence and uses machine learning to find common sentence structures throughout the text. It then uses these structures to generate an output music file by combining chords from a common chord progression that correspond to the structures. The result? A song created uniquely from common structural trends found in the input text -- a true sentence symphony!
How we built it
SentenceSymphony was built in Python 3 with several libraries, including numpy, matplotlib, scikit-learn, and midiutil. We first read in a large input text file and sanitize it to remove anything that would interfere with the analysis. Each sentence of the input is then described by three datapoints: the length of the sentence (number of words), the average word length in the sentence, and the number of syllables in the sentence. These datapoints are used to plot each sentence in a three-dimensional space (figure 1).
K-means clustering is then used to group sentences with similar structure. To do this, _n_ randomly initialized centroids are placed in the three-dimensional plot of the sentence datapoints (the colored points in figure 1), where _n_ is the total number of clusters, each of which is later mapped to a specific chord. Each datapoint is assigned to the centroid closest to it. Then we replace each centroid with the mean of the datapoints in its cluster and repeat the assignment step until the centroids stabilize (each centroid equals the mean of its cluster). The result is shown in figure 2.
Next, the Python library midiutil is used to map a chord from a specific chord progression to each cluster. Finally, we iterate through each sentence in the input, look up the cluster that the sentence belongs to, and determine the chord corresponding to that cluster. Midiutil writes the chord to the output file (a .mid file), with the duration of the chord determined by a function of the length of that sentence. The output is a unique composition of music created from the specific structures found in the input text.
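The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses only the Python standard library instead of scikit-learn and midiutil, and the helper names, the vowel-group syllable heuristic, the `CHORDS` table (a I-V-vi-IV progression in C major), the choice to seed centroids from random datapoints, and the "one beat per four words" duration rule are all assumptions made for the example.

```python
import random
import re

VOWEL_GROUPS = re.compile(r"[aeiouy]+")

# Hypothetical chord table: I-V-vi-IV in C major, as MIDI note numbers.
CHORDS = [(60, 64, 67), (67, 71, 74), (69, 72, 76), (65, 69, 72)]


def count_syllables(word):
    # Rough heuristic (assumption): one syllable per group of vowels.
    return max(1, len(VOWEL_GROUPS.findall(word.lower())))


def split_sentences(text):
    # Naive split on sentence-ending punctuation.
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def sentence_features(sentence):
    # (word count, average word length, syllable count) for one sentence.
    words = re.findall(r"[A-Za-z']+", sentence)
    if not words:
        return None
    n_words = len(words)
    avg_len = sum(len(w) for w in words) / n_words
    syllables = sum(count_syllables(w) for w in words)
    return (n_words, avg_len, syllables)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, n, seed=0, max_iter=100):
    # Plain k-means: assign each point to its nearest centroid, move each
    # centroid to the mean of its cluster, repeat until nothing changes.
    if len(points) < n:
        raise ValueError("need at least n datapoints")
    rng = random.Random(seed)
    # Assumption: initial centroids are n randomly chosen datapoints.
    centroids = rng.sample(points, n)
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(n), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        new_centroids = []
        for k in range(n):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(mean_point(members) if members else centroids[k])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


def sentences_to_chords(text, n=4):
    # Returns one (chord, duration_in_beats) pair per sentence.
    feats = [f for f in map(sentence_features, split_sentences(text)) if f]
    labels, _ = kmeans(feats, n)
    # Assumption: one beat per four words, with a minimum of one beat.
    return [
        (CHORDS[lab % len(CHORDS)], max(1, f[0] // 4))
        for lab, f in zip(labels, feats)
    ]
```

Each resulting `(chord, duration)` pair could then be written to a .mid file one note at a time, for example with midiutil's `MIDIFile.addNote`, advancing the start time by the duration after every sentence.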
Challenges we ran into
One of the main challenges we faced was creating a composition that sounds halfway decent and yet is generated uniquely from an arbitrary input text file. It took quite a bit of tweaking and testing of various chord progressions to come up with something we felt satisfied with (though, admittedly, with more time we could definitely improve our output). We also had some difficulty at first choosing datapoints that would serve as a good measure of sentence structure. Finally, none of us had worked with k-means clustering before, and all of us had minimal experience with machine learning in general, so it was a learning experience from the beginning.
Accomplishments that we're proud of
We're proud of the fact that we were able to successfully use machine learning to generate a piece of music that doesn't sound like complete gibberish thrown together. We came into Hack UMass V unsure of what we even really wanted to do, other than the fact that we wanted to use machine learning. We successfully reached and exceeded that goal, and had a ton of fun during the process.
What we learned
- It is harder than we originally imagined to compose a logical piece of music from any (large enough) input text file.
- Python has a multitude of very effective libraries for implementing machine learning algorithms.
- On a large enough text file, the sentence structure datapoints we measured seem to follow a normal distribution.
- Sometimes a little sleep deprivation leads to the best ideas.
What's next for SentenceSymphony
- Ideally, we'd like the user to be able to select from a range of genres, so that the output music could be recognized as being of that genre (e.g. pop, rock, jazz). Currently, the chords we use simply follow common chord progressions rather than a specific genre. We would also like to involve sentiment analysis in the process of analyzing the input text, so that the composition would follow mood trends throughout the text. Right now, the music is created based on the structure of the text rather than its context, and while it's pretty awesome, it would be kinda neat to see how the context would affect the music.
- Additionally, we think that the best way for users to interface with our application would be through a website. We created a test page to show what we would like it to look like, but as of now the upload button doesn't work. The goal would be a straightforward interface where the user can upload a .txt document and download a .mid file containing the song we've generated for them!