Inspiration
We wanted to write a bot that could take some sort of data and create something 'creative' from it.
What it does
- Takes royalty-free MP3s
- Converts each MP3 to WAV, then to a numpy array
- Trains an LSTM RNN on the numpy array 'unsupervised'
- NEVER FINISHED: Converts the generated numpy array back into a WAV file
It's pretty much stuck at 'proof of concept'.
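A rough sketch of the WAV to numpy step and the (unfinished) numpy to WAV step, assuming 16-bit PCM WAV files and using only Python's standard `wave` module plus numpy. The function names `wav_to_array` and `array_to_wav` are made up for illustration and are not the project's actual code; decoding the MP3s to WAV is assumed to happen beforehand with a separate tool.

```python
import wave

import numpy as np


def wav_to_array(path):
    """Read a 16-bit PCM WAV file into a float32 array scaled to [-1, 1).

    Returns (samples, sample_rate); samples has shape (frames, channels).
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wf.getframerate()
        channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    # WAV stores little-endian signed 16-bit integers, interleaved by channel.
    samples = np.frombuffer(raw, dtype="<i2").reshape(-1, channels)
    return samples.astype(np.float32) / 32768.0, rate


def array_to_wav(path, samples, rate):
    """Write a float array in [-1, 1] as a 16-bit PCM WAV file.

    samples may be 1-D (mono) or shaped (frames, channels).
    """
    samples = np.asarray(samples, dtype=np.float32)
    if samples.ndim == 1:
        samples = samples[:, None]
    # Clip first so out-of-range model output cannot wrap around.
    pcm = (np.clip(samples, -1.0, 1.0) * 32767.0).astype("<i2")
    with wave.open(path, "wb") as wf:
        wf.setnchannels(pcm.shape[1])
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(pcm.tobytes())
```

Something like `array_to_wav("generated.wav", model_output, 44100)` would then turn a generated array back into a playable file, where `model_output` and the 44100 Hz rate are placeholders.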
How I built it
First off, we used a curl script to download 600 songs, then processed them with Python, converting each song to WAV and then to a numpy array. Afterwards, we used TensorFlow and Keras to create an LSTM model and trained it on the data "itself", so to speak, as a way of teaching it unsupervised.
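One common way to train a sequence model on the data "itself" is next-sample prediction: each input is a short window of audio and the target is the sample that follows it. The sketch below shows that slicing with numpy only. It is an assumption about what such a setup could look like, not the project's actual training code, and `make_training_pairs`, `window`, and `hop` are illustrative names.

```python
import numpy as np


def make_training_pairs(signal, window, hop=1):
    """Slice a 1-D signal into (input window, next sample) pairs.

    Returns x with shape (pairs, window, 1) and y with shape (pairs,).
    """
    signal = np.asarray(signal, dtype=np.float32)
    if signal.ndim != 1 or len(signal) <= window:
        raise ValueError("need a 1-D signal longer than the window")
    # Every start index must leave room for the window plus one target sample.
    starts = np.arange(0, len(signal) - window, hop)
    # The trailing axis of size 1 gives the (samples, timesteps, features)
    # layout that Keras LSTM layers expect.
    x = np.stack([signal[s:s + window] for s in starts])[..., None]
    y = signal[starts + window]
    return x, y
```

The targets come from the audio itself, so no labels are needed. A larger `hop` produces fewer pairs, which is one knob for trading dataset size against training time.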
Challenges I ran into
- LOTS OF DATA - ~30GB -> 200MB. Audio is extremely dense. We tried short-time Fourier transforms, then spectrograms, etc.
- VERY SLOW - ETA went from 7 hours (locally on our computer) to 5 minutes (maxed-out cloud computer)
- Optimizations
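For reference, a short-time Fourier transform cuts the signal into overlapping frames, applies a window to each, and takes an FFT per frame. The magnitudes form a spectrogram. This is a minimal numpy sketch of that idea, not the preprocessing we actually ran; `stft_magnitude`, `frame_len`, and `hop` are illustrative names and values.

```python
import numpy as np


def stft_magnitude(signal, frame_len=1024, hop=512):
    """Magnitude spectrogram of a 1-D signal.

    Returns an array of shape (n_frames, frame_len // 2 + 1).
    """
    signal = np.asarray(signal, dtype=np.float32)
    if signal.ndim != 1 or len(signal) < frame_len:
        raise ValueError("need a 1-D signal at least one frame long")
    # Trailing samples that do not fill a whole frame are dropped.
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] for i in range(n_frames)]
    )
    # A Hann window tapers each frame to reduce spectral leakage.
    window = np.hanning(frame_len)
    return np.abs(np.fft.rfft(frames * window, axis=1))
```

Each row is one time slice and each column one frequency bin, with bin `k` centered at `k * sample_rate / frame_len` Hz.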
Accomplishments that I'm proud of
- A model that successfully compiles and trains
- A cloud computer that successfully runs it
- Preprocessing 70GB
What I learned
- How to create an LSTM model
- How audio files work
What's next for Audio Deep Learning
- More optimized preprocessing
- More optimized model
- Generating audio from the model's output.
Built With
- cloud-computing
- keras
- python
- theano
