Lyricalculus

Landing Page
Explanation Page
Asking the Hip-Hop Robot about "Gucci Gang" by Lil Pump
Asking the Hip-Hop Robot about "Money Trees" by Kendrick Lamar
Lyricalculus iOS Application (Swift & SwiftUI)
Simplified Lyricalculus Data Flow Chart

Demo

Check out our demo at www.lyricalculus.tech!

Inspiration

What makes music sound good? Music theory typically discusses how different melodies, harmonies, timbres, and textures become pleasant to the human ear. However, many music lovers, particularly Hip-Hop heads, care a great deal about the lyrics and poetry behind songs. Lyricalculus aims to give an objective judgment of your favorite rappers, songwriters, and emcees based solely on the quality of their lyrics. Using lexicographic, semantic, and repetition analysis, the application determines with greater than 90% accuracy whether a song is lyrically lit or not.

What it does

Lyricalculus lets users paste song lyrics from their favorite artists into the "Hip-Hop Robot", a master music theorist powered by scikit-learn and NLP pre-processing. The robot compares rhymes, semantic references, and repeating themes in the song with a set of thousands of curated training examples. From this analysis, the robot returns a score out of 10 and a breakdown of how the user's song of choice compares against the industry's best.

How we built it

First, we used johnwmillr's LyricsGenius API Wrapper (an API wrapper for Genius, the popular music analysis website) to scrape the lyrics of our favorite Hip-Hop music. We used music critic Iain the Great's list of best hip hop lyrics and Power 106 Radio's list of worst rap songs as a starting point for the Hip-Hop Robot's training data. After preprocessing the data for repetition, semantic, and lexicographic analysis, we stored this binary-labelled training data in our MongoDB instance.

For repetition analysis, we generated TF-IDF scores for each line in a song, then combined them with a weighted average to give the song a "curated repetition score". For example, a song like Lil Pump's "Gucci Gang" would have a very high repetition score (which is bad), while a lyrical masterpiece like Kendrick Lamar's "Money Trees" would have a low repetition score.
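As a rough illustration of the idea, here is a minimal pure-Python sketch. It is not the project's actual scoring code: the function name `repetition_score`, the whitespace tokenization, the choice to treat each line as a TF-IDF "document", the normalization of IDF by log(N), and the use of line length as the weight are all assumptions made for this example. The result is 1 minus the length-weighted average of per-line TF-IDF, so a song whose lines all repeat scores 1 and a song that never reuses a word scores 0.

```python
import math
from collections import Counter


def repetition_score(lyrics: str) -> float:
    """Hypothetical repetition score in [0, 1]; higher means more repetitive."""
    # Each non-empty line is treated as one "document" (an assumption).
    lines = [line.lower().split() for line in lyrics.splitlines() if line.strip()]
    n = len(lines)
    if n < 2:
        return 0.0  # too little text to measure repetition across lines

    # Document frequency: in how many lines does each term appear?
    df = Counter(term for line in lines for term in set(line))

    weighted_sum = 0.0
    total_weight = 0
    for line in lines:
        tf = Counter(line)
        # Sum of tf * normalized idf over the terms of this line.
        line_tfidf = sum(
            (count / len(line)) * (math.log(n / df[term]) / math.log(n))
            for term, count in tf.items()
        )
        # Weight each line by its length (an assumed weighting scheme).
        weighted_sum += len(line) * line_tfidf
        total_weight += len(line)

    return 1.0 - weighted_sum / total_weight


if __name__ == "__main__":
    print(repetition_score("la la la\nla la la\nla la la"))
    print(repetition_score("one two\nthree four\nfive six"))
```

Terms that show up in every line get an IDF of zero, which is what pushes a chorus-heavy song toward the high end of the scale.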

To assess semantic similarity, we used GloVe on different phrase endings in the song, which gives a semantic distance score that captures how creatively the artist connects parts of each verse. To put it simply, two words have a low semantic distance (near zero) between them if they could be interchanged in a sentence without changing the overall meaning and context of that sentence (pairs like toad & frog, house & hut, or pencil & pen are good examples). Our analysis found that if a lyricist consistently provides pairs with low semantic distance, the song is far better liked. Songs with many double entendres, like "Bonfire" by Childish Gambino, would have extremely high scores in this category.
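To make "semantic distance" concrete, the sketch below computes cosine distance between word vectors and averages it over consecutive line endings. The three-dimensional vectors in `TOY_VECTORS` are made up for illustration (real GloVe vectors have tens to hundreds of dimensions and would be loaded from a pretrained file), and the helper names and the choice to compare consecutive endings are assumptions, not the project's confirmed method.

```python
import math

# Made-up toy embeddings standing in for pretrained GloVe vectors.
TOY_VECTORS = {
    "frog": [0.9, 0.1, 0.0],
    "toad": [0.8, 0.2, 0.0],
    "pen": [0.0, 0.1, 0.9],
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity; near 0 for vectors pointing the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_ending_distance(lines, vectors):
    """Average distance between the last words of consecutive lines.

    Words missing from `vectors` are skipped; returns None if fewer than
    two known endings remain.
    """
    endings = [line.split()[-1].lower() for line in lines if line.split()]
    known = [word for word in endings if word in vectors]
    pairs = list(zip(known, known[1:]))
    if not pairs:
        return None
    total = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    print(cosine_distance(TOY_VECTORS["frog"], TOY_VECTORS["toad"]))
    print(cosine_distance(TOY_VECTORS["frog"], TOY_VECTORS["pen"]))
```

With these toy vectors, "frog" and "toad" land much closer together than "frog" and "pen", mirroring the interchangeable-word intuition described above. A real pipeline would also need to strip punctuation from line endings before the lookup.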

Finally, lexicographic analysis looks for actual rhyme complexity within the song. We convert valid words to phonetic form and then insert them into a map with a move-to-front heuristic to see which rhymes appear the most. If a song has a high volume of dense rhymes, it is more likely to be classified as "good" on our testing set. "Rap God" by Eminem is one of the most complex pieces of consistent rhyming, so it (unsurprisingly) earned one of the highest possible scores in this category.
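The following sketch shows one way the phonetic-form and move-to-front steps could fit together. Everything in it is illustrative: `TOY_PHONES` is a small hand-written table of ARPAbet-style transcriptions (a real system would use a full pronouncing dictionary), the rhyme key is assumed to be "everything from the last stressed vowel onward", and `MoveToFrontCounter` is a hypothetical helper rather than the project's data structure.

```python
# Hand-written, approximate ARPAbet-style transcriptions (illustrative only).
TOY_PHONES = {
    "trough": ["T", "R", "AO1", "F"],
    "cough": ["K", "AO1", "F"],
    "through": ["TH", "R", "UW1"],
    "blue": ["B", "L", "UW1"],
    "true": ["T", "R", "UW1"],
}


def rhyme_key(word):
    """Return the phonemes from the last stressed vowel onward, or None."""
    phones = TOY_PHONES.get(word.lower())
    if phones is None:
        return None  # word is not "valid" for this toy dictionary
    stressed = [i for i, p in enumerate(phones) if p.endswith("1")]
    if not stressed:
        return tuple(phones)
    return tuple(phones[stressed[-1]:])


class MoveToFrontCounter:
    """Counts keys in a list that moves each accessed key to the front."""

    def __init__(self):
        self._entries = []  # [key, count] pairs, most recently seen first

    def add(self, key):
        for i, entry in enumerate(self._entries):
            if entry[0] == key:
                entry[1] += 1
                # Move the matched entry to the front of the list.
                self._entries.insert(0, self._entries.pop(i))
                return
        self._entries.insert(0, [key, 1])

    def keys(self):
        return [entry[0] for entry in self._entries]

    def most_common(self):
        """Return (key, count) for the most frequent key, or None if empty."""
        if not self._entries:
            return None
        return tuple(max(self._entries, key=lambda entry: entry[1]))


if __name__ == "__main__":
    counter = MoveToFrontCounter()
    for word in ["through", "trough", "blue", "true"]:
        key = rhyme_key(word)
        if key is not None:
            counter.add(key)
    print(counter.most_common())
```

Because rhyme sounds tend to cluster within a verse, moving the latest match to the front keeps the currently active rhyme cheap to find on the next lookup. Note that "trough" and "through" get different keys here even though their spellings end the same way.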

Once enough training data was aggregated (about 2000 songs, each labelled either "lyrically good" or "lyrically bad"), we used scikit-learn to train a decision tree model on our songs. Our initial round of testing with the training data yielded:

~78% accuracy with only repetition analysis
~85% predicted accuracy with only semantic analysis
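For readers unfamiliar with scikit-learn, the training step might look roughly like the sketch below. The feature layout (one repetition, one semantic, and one rhyme score per song), the six rows of data, and the `max_depth` setting are all invented for illustration; they are not the project's dataset or hyperparameters.

```python
from sklearn.tree import DecisionTreeClassifier

# Invented feature rows: [repetition score, semantic score, rhyme score].
X_train = [
    [0.90, 0.20, 0.10],
    [0.80, 0.30, 0.20],
    [0.85, 0.25, 0.15],
    [0.20, 0.80, 0.90],
    [0.10, 0.70, 0.80],
    [0.15, 0.90, 0.85],
]
# Binary labels: 0 = "lyrically bad", 1 = "lyrically good".
y_train = [0, 0, 0, 1, 1, 1]

# A shallow tree; random_state fixes tie-breaking between equally good splits.
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)


def is_lyrically_good(features):
    """Return True if the toy model labels the feature row as good."""
    return int(model.predict([features])[0]) == 1


if __name__ == "__main__":
    print(is_lyrically_good([0.12, 0.85, 0.88]))
    print(is_lyrically_good([0.88, 0.22, 0.12]))
```

Training on a single column of `X_train` at a time would be one way to produce per-feature accuracy figures like those listed above.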

The model was then deployed behind a Flask API that our Vue.js and Swift applications connect to. We used Heroku to host the API, while our user-facing web app is hosted on Netlify.

Challenges we ran into

The most challenging aspect of the project was finding the least biased sources of training data. We ultimately chose to go with Iain the Great and Power 106 because their lists draw on crowdsourced opinions from some of the biggest fans and critics in the music industry.

Lexicographic analysis also proved to be an immense challenge - how does one determine that words like "trough" and "through" don't rhyme? We brainstormed ideas like converting words to their phonetic spellings in order to match rhyming patterns and schemes.