Inspiration

As people who love music, we are always seeking out new songs to enjoy, but that's sometimes difficult to do. Listening to a song and trying to pinpoint its genre to find more like it isn't easy. And what if you want completely new music to listen to after you've exhausted your Spotify recommendations? Our love for music and our passion for machine learning and web development helped birth Musically.AI!

What it does

Musically.AI uses machine learning to classify the genre of recorded audio uploaded by the user, and also to generate music from snippets of real songs! Our web application additionally features a versatile and highly accurate audio track splitter that splits any song into its vocals, background accompaniment, bass, drums, and so on.

How we built it

To build our application we relied heavily on the Django web framework and used JavaScript, AJAX, HTML, and CSS to design and implement the functionality of our pages.

On the backend, the genre-detection model, which we trained to roughly 75 percent accuracy, takes as input features extracted from the spectrogram of each audio file. For music generation, we incorporated a specialized pretrained LSTM network from Google Magenta's Performance RNN, trained on the Yamaha e-Piano Competition dataset and many hours of YouTube piano recordings.

Finally, track splitting is done with Spleeter by Deezer, a set of pretrained TensorFlow models with a Python API. The model separates vocals from individual instruments such as bass guitar and drums after training on musdb18, a dataset of 150 full-length music tracks (about 10 hours of audio) spanning different genres, each with isolated drum, bass, vocal, and other stems; a sketch of the Spleeter call appears below. The audio track splitting is fully functional in the final project included in our GitHub repository; however, we ran out of time to show it in our video. Please follow this link to see the split songs and their accompaniment/vocal tracks.
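For reference, this is roughly what the Spleeter call looks like in Python. The file paths here are placeholders, not our actual project layout:

```python
# A minimal sketch of splitting a track with Spleeter's Python API.
from spleeter.separator import Separator

# Load the pretrained 4-stem model: vocals / drums / bass / other.
separator = Separator('spleeter:4stems')

# Writes one WAV per stem under output/song/ (vocals.wav, drums.wav, ...).
separator.separate_to_file('uploads/song.mp3', 'output/')
```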

Challenges we ran into

There were many challenges we ran into while building Musically.AI, both on the frontend and backend. Since we were using the Django framework, setting up basic frontend designs was fairly easy, but when it came to fetching audio clip data and embedding it in the website for playback, we ran into many issues where the audio file got corrupted or simply wiped. On the backend, we had many issues training models, most stemming from the models' GPU requirements exceeding our available hardware. In the code itself, we had issues porting Google Colab notebooks into a local repository, and music generation sometimes produced chaotic, random output.
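One common way to guard against this kind of upload corruption in Django is to write the file to disk in binary chunks rather than reading it all at once. A minimal sketch, with hypothetical names (the `upload_audio` view, the `audio` form field, and the save path are illustrative, not our exact code):

```python
# views.py -- sketch of safely persisting an uploaded audio clip.
import os

from django.conf import settings
from django.http import JsonResponse
from django.views.decorators.http import require_POST

@require_POST
def upload_audio(request):
    # The AJAX POST must include Django's CSRF token for this to succeed.
    audio = request.FILES.get('audio')
    if audio is None:
        return JsonResponse({'error': 'no file provided'}, status=400)

    dest = os.path.join(settings.MEDIA_ROOT, audio.name)
    # Write in binary chunks; loading the whole file (or writing in text
    # mode) is an easy way to end up with a truncated or corrupted clip.
    with open(dest, 'wb') as f:
        for chunk in audio.chunks():
            f.write(chunk)

    return JsonResponse({'url': settings.MEDIA_URL + audio.name})
```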

Accomplishments that we're proud of

Although we did not achieve all of our intended functionality, we are happy with Musically.AI as a proof of concept. We were able to generate music that sometimes even captured the feel of the original song. Even in the worst cases, the output was still pleasant to listen to and bore some resemblance to the song that was fed in. We are also proud of our web app's sleek, user-friendly design (wiring up the Django backend was a struggle, but once done, it made the app much easier to extend).

What we learned

Each of us learned a tremendous amount about different aspects of audio processing and web development. None of us had worked on an audio processing project before, and this experience expanded our knowledge of recurrent neural networks, audio encoding and decoding, the Django framework, and other machine learning and web development techniques. Because we each had our own areas of expertise, we were also able to learn these techniques from each other.

What's next for Musically.AI

In the future, we hope to make Musically.AI more sophisticated by generating more complex continuations of songs. We hope to accomplish this by fine-tuning the LSTM network and training it on additional data. We also hope to add more features to the website, such as letting users submit their own songs to generate continuations from.

Sources we used

https://colab.research.google.com/notebooks/magenta/piano_transformer/piano_transformer.ipynb#scrollTo=QI5g-x4foZls
https://github.com/deezer/spleeter

Updates

We did not include this in our project description, but our genre classification pipeline uses a Kaggle dataset containing spectrogram features: https://www.kaggle.com/andradaolteanu/gtzan-dataset-music-genre-classification. We also used https://gist.github.com/parulnith/7f8c174e6ac099e86f0495d3d9a4c01e#file-music_genre_classification-ipynb for genre classification, which helped us choose a model architecture and preprocess the data.
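For the curious, here is a minimal sketch of that kind of pipeline, assuming time-averaged MFCC features (which are derived from the spectrogram) and an illustrative layer layout rather than our exact architecture:

```python
# Sketch: extract features from a clip with librosa, classify with Keras.
import librosa
import numpy as np
from tensorflow import keras

def extract_features(path, n_mfcc=40):
    # Load ~30 s of audio at librosa's default 22,050 Hz sample rate.
    y, sr = librosa.load(path, duration=30)
    # Average each MFCC coefficient over time -> one fixed-length vector.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.mean(mfcc, axis=1)

model = keras.Sequential([
    keras.layers.Input(shape=(40,)),
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(10, activation='softmax'),  # 10 GTZAN genres
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(X_train, y_train, epochs=50, validation_split=0.2)
```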
