Coconut Karaoke

App in short: An artificial intelligence music generator.

Creating new songs using n-grams

Creating songs is difficult and requires creativity, knowledge, and precision. We are trying to make it easier by removing all of those “feelings” and “work” from the process. By retrieving and indexing a corpus of music, a basic song can be generated using n-grams (usually bigrams or trigrams) that pull from the existing lexicon to create sentences that either match or resemble sentences from the original corpus.
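
To make the n-gram idea concrete, here is a minimal sketch of pulling bigrams and trigrams out of a piece of text. The function name `ngrams` and the one-line corpus are hypothetical and chosen only for illustration; a real corpus would be far larger.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text, as tuples."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


# Hypothetical one-line corpus, used only for illustration.
corpus = "row row row your boat gently down the stream"

# Count how often each bigram and trigram occurs in the corpus.
bigram_counts = Counter(ngrams(corpus, 2))
trigram_counts = Counter(ngrams(corpus, 3))
```

The counts record which word sequences actually occur in the corpus, which is the raw material a generator needs in order to produce sentences that resemble it.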

General summary

Our plan is to generate songs, both lyrics and music, in a generic way that the user can configure and that is tailored to the individual user’s needs. This takes several parts, which makes it perfect for an extended hack, and every one of them is needed in some form to make the whole thing work.

Searching for the right songs

In order to generate music that sounds like what the user is expecting, a substantial amount of information must be collected and analyzed ahead of time. This includes the background music for the songs and the lyrics that go along with them. It appears that this will require multiple APIs, each providing one piece of the puzzle, so we may eventually hit a limit on how much information can be retrieved and analyzed.

Creating new lyrics

The quickest and easiest way to create new lyrics from an existing corpus of text is to combine n-grams with Markov chains. This requires collecting a substantial number of lyrics, so an external API should be used to retrieve them. A text-to-speech program will also be needed to generate the audio file that contains the spoken lyrics.
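
As a rough illustration of the n-gram and Markov chain approach, the sketch below builds a word-level chain from a list of lyric lines and random-walks it to produce a new line. It is not the project's actual generator: the names `build_chain` and `generate_line`, the `order` and `max_words` parameters, and the sample lyrics are all assumptions made for this example.

```python
import random
from collections import defaultdict


def build_chain(lines, order=2):
    """Map each run of `order` words to the words seen right after it."""
    chain = defaultdict(list)
    for line in lines:
        words = line.split()
        for i in range(len(words) - order):
            chain[tuple(words[i:i + order])].append(words[i + order])
    return chain


def generate_line(chain, max_words=12, rng=random):
    """Random-walk the chain from a random starting state.

    Assumes the chain is not empty.
    """
    state = rng.choice(sorted(chain))
    output = list(state)
    while len(output) < max_words:
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state never had a follower in the corpus
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Hypothetical lyrics standing in for the scraped corpus.
lyrics = [
    "twinkle twinkle little star",
    "how I wonder what you are",
    "twinkle twinkle all the night",
]
chain = build_chain(lyrics, order=2)
new_line = generate_line(chain)
```

A higher `order` keeps the output closer to the source lyrics, while a lower one gives more surprising (and less coherent) lines.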

Creating new music

Similar techniques can be used to generate the music that will accompany the lyrics, though the tools have not yet been created. The general idea is to do rhythm processing on MIDI files in order to determine patterns that can be recreated.
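
Since the music tools have not been built yet, the following is only a sketch of what finding rhythm patterns could look like. It assumes the note durations have already been extracted from a MIDI file into a plain list of beat lengths (the MIDI parsing itself is out of scope here), and the name `common_rhythms` and the sample track are hypothetical.

```python
from collections import Counter


def common_rhythms(durations, length=4, top=3):
    """Count every run of `length` consecutive durations; return the most frequent."""
    windows = [
        tuple(durations[i:i + length])
        for i in range(len(durations) - length + 1)
    ]
    return Counter(windows).most_common(top)


# Hypothetical note lengths in beats, as if already read from a MIDI track.
track = [1, 1, 0.5, 0.5, 1, 1, 1, 0.5, 0.5, 1]
patterns = common_rhythms(track, length=4)
```

This is the same windowing trick used for word n-grams, applied to note lengths instead of words, so the recurring patterns could feed a similar chain-based generator.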

Making it sound realistic

The existing songs must be analyzed to determine how the newly generated songs should sound. As artists and genres tend to have specific rhythms and tones associated with them, we should try our best to recreate them as closely as possible. Extracting the typical format of the songs, as well as the rhyme scheme used within them, should give us a starting point to work from.
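
One very naive way to approximate a rhyme scheme is to compare the last few letters of each line's final word. The sketch below does only that; the name `rhyme_scheme` and the `suffix_len` parameter are assumptions for this example. Because it works on spelling rather than sound, it will miss rhymes such as "star" and "are", so a real implementation would likely need a pronunciation dictionary.

```python
import string


def rhyme_scheme(lines, suffix_len=2):
    """Label lines A, B, C... by the last letters of their final word."""
    labels = {}
    scheme = []
    for line in lines:
        words = line.lower().split()
        last = words[-1].strip(string.punctuation) if words else ""
        key = last[-suffix_len:]
        if key not in labels:
            # Reuse letters if a song has more than 26 distinct endings.
            labels[key] = string.ascii_uppercase[len(labels) % 26]
        scheme.append(labels[key])
    return "".join(scheme)


# Hypothetical verse, used only for illustration.
verse = [
    "I saw a cat",
    "It wore a hat",
    "It ran all day",
    "And went away.",
]
scheme = rhyme_scheme(verse)
```

For the sample verse this labels the first two lines alike and the last two alike, which is the kind of pattern a generator could then try to reproduce.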

Playing the lyrics

Once we have the lyrics generated, we are going to need to play and record them. There are plenty of text-to-speech APIs out there that we can use.

Playing the music

We will need to generate and play the musical tones once we have determined the patterns.

Merging the lyrics and the music

The lyrics and the music will be recorded independently, so the last thing we need to do is merge the two files together to create a finalized song. This may involve converting both files to a common format, or synchronizing the playback so that they both start at the same time, but the end result should be a consistent playback.
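
A minimal sketch of the merge step, assuming both recordings have already been decoded into lists of 16-bit PCM samples with the same sample rate and channel count, and that both should start at sample 0. The name `mix_tracks` and the `music_gain` parameter are hypothetical.

```python
def mix_tracks(vocals, music, music_gain=0.5):
    """Mix two lists of 16-bit PCM samples into one list of samples."""
    length = max(len(vocals), len(music))
    mixed = []
    for i in range(length):
        # Treat the shorter track as silence once it runs out.
        v = vocals[i] if i < len(vocals) else 0
        m = music[i] if i < len(music) else 0
        sample = int(v + music_gain * m)
        # Clamp to the 16-bit range so loud passages clip instead of wrapping.
        mixed.append(max(-32768, min(32767, sample)))
    return mixed


# Hypothetical sample values, used only for illustration.
song = mix_tracks([100, 200], [50, 50, 50], music_gain=1.0)
```

Lowering `music_gain` keeps the backing track from drowning out the spoken lyrics. Converting the two files to a common format would have to happen before this step.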

Submitted to

PennApps XV

Created by

I worked on scraping lyric databases and building the Markov chain generator that was used to create the final song.

Kevin Brown
I build things.
Tony Tran
Major: CS
Michael Hawes
Computer Science Major
Gunther Cox
Robotics and web application developer.

Updates

Tony Tran started this project — Jan 22, 2017 03:08 AM EST
