One of the biggest risks to the mental well-being of young people is social media. Numerous studies show that social media exacerbates exclusion, anxiety, and depression. We sought to create an inclusive, positive-reinforcement-based social app for something that everybody loves: singing! Singing has many benefits: it boosts creativity, creates a sense of community, and helps with social integration, especially among children. All four of us have always been enthusiastic about singing and music, although we have never been very good at it. We decided to use our expertise in signal processing and machine learning to solve this problem. Our app helps people connect with their friends and get better at singing.
What it does
Our app allows people to become better singers. Users can access our song collection and sing over a song of their choice. Our signal processing algorithm calculates and returns a score showing the pitch difference between the original recording and the user's recording. Users can also view a leaderboard for each song showing the highest-scoring individuals. There is also a profile tab where users can play back all their recordings and view their pitch improvement over time. Finally, users can use our vocal classification deep learning model to find out their phonation mode. The phonation mode classifier categorizes someone's singing into four classes: breathy, neutral, pressed, and flow. This tells singers how their vocal cords and larynx are behaving when they sing, and what adjustments in technique are needed to sing in a certain style.
How we built it
We built our client app with React Native and Expo. We use Firebase for user accounts, leaderboard data, and storage of all song recordings. The algorithms and ML model run in a Flask app on a PythonAnywhere production server, and the client communicates with the server through a custom RESTful API. The vocal comparison algorithm was built with NumPy and librosa. First, the algorithm takes two audio files and generates a series of chroma vectors: these are computed by taking short-time Fourier transforms of the audio and binning the transformed data into the 12 Western pitch classes, which accurately describe the pitch content of each file. The chroma features are averaged over time for each file, and the averages are normalized into probability distributions to account for differences in total energy between the two recordings. Finally, the Bhattacharyya distance, a statistical measure of similarity between probability distributions, is calculated to find how close the two recordings are in pitch.

The vocal classification model is a convolutional neural network built with TensorFlow and Keras. It was trained on mel-frequency cepstral coefficient (MFCC) features from 910 audio files spanning the different phonation modes, and achieved 93% training accuracy and 89% test accuracy.
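The chroma-averaging and Bhattacharyya steps above can be sketched roughly as follows. This is an illustrative reconstruction, not our actual server code; the function names are ours, and we assume librosa's standard `chroma_stft` feature extractor.

```python
import numpy as np


def avg_chroma_distribution(path: str) -> np.ndarray:
    """Load an audio file and return its time-averaged chroma vector,
    normalized into a 12-bin probability distribution."""
    import librosa  # assumed available on the server

    y, sr = librosa.load(path)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)  # shape: (12, n_frames)
    avg = chroma.mean(axis=1)                         # average over time
    return avg / avg.sum()                            # normalize: sums to 1


def bhattacharyya_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Bhattacharyya distance between two probability distributions.
    0 means identical; larger values mean more dissimilar pitch content."""
    bc = np.sum(np.sqrt(p * q))  # Bhattacharyya coefficient, in (0, 1]
    return float(-np.log(bc))
```

A score for the user can then be derived by mapping `bhattacharyya_distance(avg_chroma_distribution(original), avg_chroma_distribution(recording))` onto a friendlier 0–100 scale.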
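For the phonation-mode classifier, a minimal Keras CNN over MFCC inputs might look like the sketch below. The input shape, layer sizes, and hyperparameters here are assumptions for illustration, not the trained model's actual configuration; only the four-class softmax output (breathy, neutral, pressed, flow) comes from the project itself.

```python
import tensorflow as tf


def build_phonation_classifier(n_mfcc: int = 13, n_frames: int = 128) -> tf.keras.Model:
    """Toy CNN that maps an MFCC 'image' to 4 phonation-mode probabilities."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_mfcc, n_frames, 1)),   # MFCCs as a 2D map
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(4, activation="softmax"),  # breathy/neutral/pressed/flow
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

MFCC features for training can be computed per file with `librosa.feature.mfcc`, padded or trimmed to a fixed number of frames.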
Challenges I ran into
There were a lot of challenges in synchronizing the Firebase database, our PythonAnywhere production server, and our client app. Specifically, we had a lot of difficulty getting asynchronous functions to work the way we wanted in React Native. Often the PythonAnywhere server would look for files that had not yet been uploaded to Firebase because an async upload had not finished.
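One server-side mitigation for this race is to briefly poll for the recording before processing it, rather than assuming the client's upload has finished. The sketch below is a generic retry pattern, not our actual code; `fetch_recording` is a hypothetical callable that returns the file bytes or `None` if the file is not in storage yet.

```python
import time


def wait_for_upload(fetch_recording, recording_id, retries=5, delay=1.0):
    """Retry fetching a recording until it appears or retries run out.

    fetch_recording: callable taking an id and returning bytes, or None
                     if the file has not landed in storage yet.
    """
    for _ in range(retries):
        data = fetch_recording(recording_id)
        if data is not None:
            return data
        time.sleep(delay)  # give the client-side upload time to complete
    raise TimeoutError(f"Recording {recording_id} was never uploaded")
```

The equivalent client-side fix is simply to `await` the upload promise before calling the scoring endpoint.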
Accomplishments that I'm proud of
We successfully created an accounts feature on our first attempt with React Native and Expo. After learning React Native, we developed a navigation bar leading to each page. We also developed an extensive signal-processing algorithm in Python, based on modern music information retrieval (MIR) techniques. We had to figure out how to adjust for different vocal timbres and different amplitudes across files; through our research, we found that chroma features and a probabilistic distance measure were the best way to quantify differences in pitch. Finally, we also managed to train a CNN with relatively high accuracy for vocal classification.
What I learned
This was our first time developing a mobile app with React Native. We also learned how to integrate Firebase into a React Native application.
What's next for Sing-ly
We believe we can take Sing-ly public by publishing it on the App Store and Google Play Store. Before then, we want to expand our Firebase file system to accommodate more users and more files. We also plan to secure song licensing agreements so we can offer more songs, and to add a friends/followers system to foster more connection between users.