Inspiration

The Image Style Transfer paper (Gatys et al.) makes it possible to render the content of one picture in the style of another. So I thought: can we do the same for music? I tried applying the same CNN technique to spectrograms.

What it does

Take two songs, one providing content and the other style, and merge them into one song :)

How I built it

- Used librosa for audio processing: converting audio signals into spectrograms and reconstructing audio from spectrograms
- Used a pre-trained CNN to produce a spectrogram that blends the content of one input song with the style of the other

Challenges I ran into

- Setting up the environment for the audio processing: lots of missing libraries
- Tuning the parameters: there are quite a lot of them

Accomplishments that I'm proud of

The mixed song is somewhat recognizable :P

What I learned

- How different layers in a CNN extract features
- Fourier transforms and their inverses are fun
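The first point is the core of the Gatys et al. approach: deeper CNN layers encode content, while style is captured by Gram matrices of layer activations. Below is a minimal sketch of the style loss with small random arrays standing in for real CNN feature maps; the shapes and values are purely illustrative assumptions.

```python
import numpy as np

def gram_matrix(features):
    """Style representation of one CNN layer.
    features: (channels, height * width) activations.
    The Gram matrix records which channels co-activate (texture/style)
    while discarding spatial layout (the content)."""
    return features @ features.T / features.shape[1]

# Toy feature maps standing in for CNN activations (hypothetical values).
rng = np.random.default_rng(0)
style_feats = rng.standard_normal((64, 256))   # from the style song's spectrogram
mixed_feats = rng.standard_normal((64, 256))   # from the spectrogram being optimized

# Style loss: mean squared difference between the two Gram matrices.
style_loss = np.mean((gram_matrix(mixed_feats) - gram_matrix(style_feats)) ** 2)
```

In the full algorithm this style loss is summed over several layers and combined with a content loss (a direct feature-map difference at a deeper layer), and the mixed spectrogram is optimized by gradient descent.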

What's next for muse

- Polish the algorithm to better capture features that evolve over time
- Better audio processing to remove noise
- Integrate into a web service

Built With

librosa, python