Inspiration

We wanted to write a bot that could take some kind of data and create something 'creative' from it.

What it does

  • Takes royalty-free MP3s
  • Converts each MP3 to WAV, then to a numpy array
  • Trains an LSTM RNN on the numpy arrays, 'unsupervised'
  • NOT FINISHED: converts the generated numpy array back into a WAV file

It's pretty much stuck at 'proof of concept'.
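
The WAV-to-numpy step and the unfinished numpy-to-WAV step can be sketched roughly as below. This is not the project's actual code: the function names `wav_to_array` and `array_to_wav` are made up for illustration, and the sketch assumes 16-bit PCM WAV files. Decoding the MP3s to WAV in the first place needs an external tool and is not shown.

```python
import wave

import numpy as np


def wav_to_array(path):
    """Read a 16-bit PCM WAV file into a float32 array scaled to [-1, 1)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wf.getframerate()
        channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    # Samples are interleaved by channel; reshape to (frames, channels).
    samples = np.frombuffer(raw, dtype="<i2").reshape(-1, channels)
    return samples.astype(np.float32) / 32768.0, rate


def array_to_wav(path, samples, rate):
    """Write a float array shaped (frames,) or (frames, channels) as 16-bit PCM."""
    samples = np.asarray(samples, dtype=np.float32)
    if samples.ndim == 1:
        samples = samples[:, None]  # treat a 1-D array as mono
    # Clip so that out-of-range model output cannot wrap around.
    pcm = np.round(np.clip(samples, -1.0, 1.0) * 32767.0).astype("<i2")
    with wave.open(path, "wb") as wf:
        wf.setnchannels(pcm.shape[1])
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(pcm.tobytes())
```

Whatever array the model generates would have to be brought back to the same sample rate and value range before `array_to_wav` produces anything listenable.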

How I built it

First, we used a curl script to download 600 songs, then processed them with Python, converting each song to WAV and then to a numpy array. Afterwards, we used TensorFlow and Keras to build an LSTM model and train it on the data "itself", with no labels, as a way of teaching it unsupervised.
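
One common way to train a sequence model on the data "itself" is to have it predict the next sample from the samples before it. The sketch below assumes that framing, which the write-up does not confirm. `make_windows`, `window`, and `step` are hypothetical names and parameters, not taken from the project.

```python
import numpy as np


def make_windows(signal, window, step=1):
    """Slice a 1-D signal into (input window, next sample) training pairs."""
    signal = np.asarray(signal, dtype=np.float32)
    if len(signal) <= window:
        raise ValueError("signal must be longer than the window")
    starts = range(0, len(signal) - window, step)
    # Each input is `window` consecutive samples...
    x = np.stack([signal[s:s + window] for s in starts])
    # ...and its target is the sample that comes right after them.
    y = np.array([signal[s + window] for s in starts], dtype=np.float32)
    # Recurrent layers in Keras expect (batch, timesteps, features).
    return x[..., None], y
```

The resulting `x` and `y` arrays have the shape a Keras LSTM layer takes as input and target, so no hand-written labels are needed.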

Challenges I ran into

  • LOTS OF DATA - ~30GB -> 200MB. Audio is extremely dense. We tried short-time Fourier transforms, then spectrograms, etc.
  • VERY SLOW - ETA went from 7 hours (locally on a computer) to 5 minutes (maxed-out cloud computer)
  • Optimizations
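
For reference, a short-time Fourier transform like the one mentioned above can be sketched with numpy alone. The function name `stft_magnitude` and the `frame_len` and `hop` values are illustrative assumptions, not the settings the project used.

```python
import numpy as np


def stft_magnitude(signal, frame_len=1024, hop=512):
    """Return a magnitude spectrogram shaped (frames, frame_len // 2 + 1)."""
    signal = np.asarray(signal, dtype=np.float32)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    # A Hann window tapers each frame to reduce spectral leakage.
    window = np.hanning(frame_len)
    frames = np.stack([
        signal[i * hop:i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real signal.
    return np.abs(np.fft.rfft(frames, axis=1))
```

Each row is one time slice and each column one frequency bin, which is the grid a spectrogram image is drawn from.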

Accomplishments that I'm proud of

  • A model that successfully compiles and trains
  • A cloud computer that successfully runs it
  • Preprocessing 70 GB of data

What I learned

  • How to create an LSTM model
  • How audio files work

What's next for Audio Deep Learning

  • More optimized preprocessing
  • More optimized model
  • Generating audio from the model's output.

Built With

  • cloud-computing
  • keras
  • python
  • theano