A sample song
Top 10 chart
Recent developments in deep learning have given rise to new forms of generative art. Computers can create novel paintings and write ambient music. Still, generating entire songs remains out of reach, so I wanted to give it a try!
What it does
The project is a website that generates random song lyrics and lets visitors vote for the best ones. It features a chart of top songs and a collection of favorites for each visitor.
How I built it
The project is built on top of a pretrained GPT-2 language model, fine-tuned on a song lyrics database. It uses a seed-based technique to provide reproducible generation for over 100,000,000 unique texts. The most favored ones are cached in a database, which also tracks user votes.
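The seed-based idea can be illustrated with a minimal sketch. This is not the project's actual code: a toy vocabulary stands in for the fine-tuned GPT-2 model, and the names `generate_lyrics` and `SongStore` are hypothetical. The point is only that a song's text is a pure function of its integer seed, so a song that nobody voted for never needs to be stored, while voted songs are kept in a cache together with their vote counts.

```python
import random

# Toy vocabulary standing in for the fine-tuned GPT-2 model (assumption:
# the real project samples tokens from the model instead).
VOCAB = ["love", "night", "fire", "rain", "heart", "road", "dream", "sky"]


def generate_lyrics(seed: int, lines: int = 4, words_per_line: int = 5) -> str:
    """Deterministically derive a text from an integer seed."""
    rng = random.Random(seed)  # private RNG, no shared global state
    return "\n".join(
        " ".join(rng.choice(VOCAB) for _ in range(words_per_line))
        for _ in range(lines)
    )


class SongStore:
    """Hypothetical cache: stores text only for songs that received votes."""

    def __init__(self):
        self.texts = {}  # seed -> cached lyrics
        self.votes = {}  # seed -> vote total

    def get(self, seed: int) -> str:
        # Cached songs are returned as stored; others are regenerated on demand.
        if seed in self.texts:
            return self.texts[seed]
        return generate_lyrics(seed)

    def vote(self, seed: int, delta: int = 1) -> int:
        # The first vote pins the text in the cache, later votes update the total.
        self.texts.setdefault(seed, generate_lyrics(seed))
        self.votes[seed] = self.votes.get(seed, 0) + delta
        return self.votes[seed]
```

With this layout a song URL only needs to carry the seed: calling `generate_lyrics` twice with the same seed yields the same text, provided the generator itself has no hidden state.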
The website is built on ASP.NET Core and can run on a variety of platforms.
The source code has been published here (it's my company's page, I am a solopreneur): https://github.com/losttech/BillionSongs
Challenges I ran into
Had to separate lyrics generation from the web server to increase stability and handle parallel execution better.
Had to build a pool of pregenerated lyrics, so that visitors don't have to wait 3 minutes for a song to be generated.
Had to figure out that lyrics in the training set have to be assembled into larger chunks to successfully fine-tune the model.
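The chunk assembly mentioned above could look roughly like the following sketch. The size threshold, the function name `assemble_chunks`, and the choice of GPT-2's `<|endoftext|>` token as a separator are assumptions made for illustration; the write-up does not say how the chunks were actually built.

```python
def assemble_chunks(songs, min_chars=1024, separator="\n\n<|endoftext|>\n\n"):
    """Concatenate short lyrics into training chunks of at least min_chars.

    Both min_chars and the separator are illustrative assumptions.
    """
    chunks, current, size = [], [], 0
    for song in songs:
        text = song.strip()
        current.append(text)
        size += len(text)
        if size >= min_chars:
            # Enough material collected: emit one chunk and start a new one.
            chunks.append(separator.join(current))
            current, size = [], 0
    if current:
        # Keep the trailing chunk even if it is shorter than min_chars.
        chunks.append(separator.join(current))
    return chunks
```

For example, three 10-character songs with `min_chars=20` produce two chunks: the first two songs joined by the separator, and the third song on its own.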
Accomplishments that I'm proud of
The whole thing (excluding training) runs on an i3-7100U, consuming about 3 GB of RAM, and is quite snappy after warm-up.
What I learned
- You can fit a pretty powerful website with deep learning in a small box, and still provide an excellent experience.
- TensorFlow still lacks a bit in terms of reproducibility: many ops have internal state, which can't be reset easily without restarting the whole session and rebuilding the graph.
- The GPU training is pretty easy to set up.
What's next for Billion Songs
Next stop - parallel lyrics + music generation!
EDIT: a day ago OpenAI actually validated that this is possible in exactly the way I planned: by training GPT-2 on MIDI! But they did not take the next step to actual KARAOKE, so there's still some room for research!