Inspiration
The Image Style Transfer paper makes it possible to render the content of one picture in the style of another. So I wondered: can we do the same for music? I tried applying a CNN to spectrograms to find out.
What it does
Take two songs, one providing content and the other style, and merge them into one song :)
How I built it
- Used librosa for the audio processing: converting audio signals into spectrograms and reconstructing audio from spectrograms
- Used a pre-trained CNN to produce the spectrogram of the mixed result of the two input songs
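The spectrogram round-trip is the core of the pipeline. Below is a minimal numpy sketch of an STFT and its overlap-add inverse; the project itself uses librosa, and real use would also need to handle log-magnitude scaling and phase reconstruction (e.g. Griffin-Lim), which this sketch skips. Window and hop sizes are illustrative assumptions.

```python
import numpy as np

def stft(signal, win=512, hop=256):
    """Short-time Fourier transform with a Hann window.
    Minimal sketch; the project uses librosa.stft."""
    window = np.hanning(win)
    frames = [np.fft.rfft(signal[start:start + win] * window)
              for start in range(0, len(signal) - win + 1, hop)]
    return np.array(frames)

def istft(spec, win=512, hop=256):
    """Inverse STFT via windowed overlap-add with window-squared
    normalization; exact wherever the windows fully cover the signal."""
    window = np.hanning(win)
    out = np.zeros(hop * (len(spec) - 1) + win)
    norm = np.zeros_like(out)
    for i, frame in enumerate(spec):
        start = i * hop
        out[start:start + win] += np.fft.irfft(frame, n=win) * window
        norm[start:start + win] += window ** 2
    norm[norm < 1e-8] = 1.0  # avoid dividing by zero at the edges
    return out / norm
```

Because the spectrum is not modified between analysis and synthesis, the interior of the signal is reconstructed exactly; once a CNN alters the magnitudes, phase must be estimated instead.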
Challenges I ran into
- Setting up the environment for the audio processing: lots of missing libraries
- Tuning the parameters: there are quite a lot of them
Accomplishments that I'm proud of
The mixed song is somewhat listenable :P
What I learned
- How different layers in a CNN extract features
- Fourier transforms and their inverses are fun
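A concrete way to see what "style" means at a given CNN layer, following the Image Style Transfer paper, is the Gram matrix of that layer's activations: the correlations between feature maps, discarding where features occur. A minimal numpy sketch (the feature shapes here are hypothetical, not the project's actual layer sizes):

```python
import numpy as np

def gram_matrix(features):
    """Style representation of one layer: channel-by-channel feature
    correlations. `features` has shape (height, width, channels)."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)       # one row per spatial position
    return flat.T @ flat / (h * w)          # (channels, channels)

def style_loss(gram_style, gram_result):
    # Mean squared difference between Gram matrices of the style
    # reference and the generated result
    return np.mean((gram_style - gram_result) ** 2)
```

Matching Gram matrices across several layers transfers texture-like statistics, while a separate content loss on raw activations keeps the structure of the content song.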
What's next for muse
- Polish the algorithm to better capture features that evolve over time
- Better audio processing to remove noise
- Integrate into a web service
Built With
- librosa
- python
- tensorflow