Inspiration
As lovers of machine learning and data analysis, we wanted to test our abilities.
Thinking about data, we didn't want to analyze the usual CSV file and do binary classification; we wanted to really take on a challenge.
In the end, as music lovers, we decided that analyzing audio files would satisfy our desire for a challenge, and that we would use our knowledge of machine learning to do it.
What it does
ChordSense takes an audio file containing a single chord, preprocesses it (noise cleaning, trimming), and extracts its chroma features (mapping its frequency data to pitches). Our model, which was trained on piano triads, then predicts the chord.
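To make "mapping frequency data to pitches" concrete, here is a minimal sketch of the idea behind chroma features. It is not our actual pipeline, where librosa does this work on the real spectrum; the helper names and the hand-picked spectral peaks below are assumptions made for illustration only.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency in Hz to a pitch class 0..11 (0 = C), assuming A4 = 440 Hz."""
    # Nearest MIDI note number, then fold all octaves onto one.
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return midi % 12


def chroma_vector(peaks):
    """Fold (frequency, magnitude) pairs into a 12-bin vector scaled to a max of 1."""
    chroma = [0.0] * 12
    for freq_hz, magnitude in peaks:
        chroma[pitch_class(freq_hz)] += magnitude
    peak = max(chroma)
    return [c / peak for c in chroma] if peak > 0 else chroma


# Hypothetical peaks for a C major triad (C4, E4, G4) plus a C5 overtone.
c_major_peaks = [(261.63, 1.0), (329.63, 0.8), (392.00, 0.9), (523.25, 0.4)]
chroma = chroma_vector(c_major_peaks)
active = [NOTE_NAMES[i] for i, c in enumerate(chroma) if c > 0]
print(active)
```

Because every octave of a note lands in the same bin, the C4 and C5 energy add up, which is why chroma suits chord recognition: the classifier sees which pitch classes are present, not which octave they were played in.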
How we built it
Front-end: React
Back-end: Flask
Useful libraries: librosa, NumPy, Matplotlib, scikit-learn
We acquired a large training set from this Kaggle dataset: https://www.kaggle.com/datasets/davidbroberts/piano-triads-wavset?resource=download
Challenges we ran into
Figuring out how to analyze the audio data was the most difficult part. We spent a lot of time attempting to detect the many chords within a song. This involved creating windows of frames containing chroma data, which led to 3D matrices. Making sure we could analyze audio without downsampling too much was also an issue.
Overall, the hardest part was data analysis.
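The windowing described above can be sketched roughly as follows. This is an illustration, not our exact code: the function name, window size, hop, and random stand-in data are all assumptions. It shows where the 3D matrices come from, and one possible way (averaging over time, also an assumption) to get back to the 2D input that scikit-learn estimators expect.

```python
import numpy as np


def window_chroma(chroma, window_size, hop):
    """Slice a (12, n_frames) chroma matrix into overlapping windows.

    Returns an array of shape (n_windows, 12, window_size).
    """
    n_bins, n_frames = chroma.shape
    if n_frames < window_size:
        # Not enough frames for even one window.
        return np.empty((0, n_bins, window_size))
    starts = range(0, n_frames - window_size + 1, hop)
    return np.stack([chroma[:, s:s + window_size] for s in starts])


# Stand-in for real chroma data: 12 pitch classes by 100 frames of random values.
rng = np.random.default_rng(0)
chroma = rng.random((12, 100))

windows = window_chroma(chroma, window_size=20, hop=10)
print(windows.shape)  # 3D: (windows, pitch classes, frames per window)

# Collapse the time axis so each window becomes one 12-value feature row.
features = windows.mean(axis=2)
print(features.shape)  # 2D: (windows, pitch classes)
```

Averaging discards the timing inside each window, so it only makes sense if a window is short enough to contain a single chord.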
Accomplishments that we're proud of
We're proud that we were able to try different feature sets before deciding on pitch class profiles (chroma). This improved our understanding of what kind of data would work best for our SVM model.
We're also proud that we showed up and did not give up. We wanted to have fun and to be challenged, and we got both. And now we've improved as programmers!
What we learned
We learned about different methods of analyzing audio data, including creating mel-spectrograms, how sampling rates affect the accuracy of our model, and other features that can be extracted from audio.
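One reason the sampling rate matters is the Nyquist limit: audio sampled at a given rate can only represent frequencies below half that rate. The sketch below shows how aggressive downsampling starts to cut off the top piano keys. The helper names are hypothetical, A4 = 440 Hz tuning is assumed, and the sketch looks only at fundamentals, ignoring overtones, which are lost even sooner.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at a given sampling rate."""
    return sample_rate_hz / 2


def note_frequency(midi_note, a4_hz=440.0):
    """Equal-temperament frequency of a MIDI note number (69 = A4)."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)


def highest_representable_midi(sample_rate_hz, lowest=21, highest=108):
    """Highest piano key (MIDI 21..108) whose fundamental is below Nyquist."""
    limit = nyquist_hz(sample_rate_hz)
    playable = [m for m in range(lowest, highest + 1) if note_frequency(m) < limit]
    return playable[-1] if playable else None


# Compare a few sampling rates; 108 is the top key of an 88-key piano.
for sr in (44100, 22050, 8000, 4000):
    print(sr, nyquist_hz(sr), highest_representable_midi(sr))
```

At 22050 Hz, which is the default rate librosa.load resamples to, every piano fundamental still fits, while at 4000 Hz the top octave is gone.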
What's next for ChordSense
We hope to move on to recognizing chords played on more instruments, then arpeggiated (broken) chords, and then chord changes within a song. In the future, this could perhaps be generalized to detecting voices within a noisy environment.