Misophonia, also known as selective sound sensitivity syndrome, is a disorder in which certain sounds, called "triggers," elicit abnormally negative emotional reactions. Common trigger sounds include chewing, slurping, and sniffling, although each case is unique. Notably, "misophones" most often feel extreme anger, anxiety, or disgust upon hearing trigger sounds, inciting a fight-or-flight reaction. Unfortunately, because the term "misophonia" was only coined in the early 2000s, little research has been done on the condition. Audio classification for misophonia is critical because it can lead to the development of technological treatments for those affected, providing much-needed relief. The recent rise of deep learning algorithms has shown high-accuracy performance on feature extraction and modeling, prompting deeper exploration of their use in sound classification. This project proposes three neural network algorithms—1D and 2D Convolutional Neural Networks (CNNs), along with Artificial Neural Networks (ANNs)—to classify trigger sounds, using audio files of approximately 2,200 total triggers and non-triggers. The results demonstrate that all three neural networks classified trigger sounds with high accuracy, with the 1D CNN achieving the highest. This research serves as a crucial step toward developing instruments that block trigger sounds, helping misophonia patients cope with everyday triggering situations.
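To illustrate the kind of model involved (not the project's actual code), the following is a minimal sketch of a 1D CNN forward pass in NumPy: an audio frame is convolved with a bank of learned filters, passed through ReLU and max pooling, then mapped to two class probabilities (trigger vs. non-trigger). All layer sizes and weights here are illustrative placeholders, not the trained parameters from this project.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels):
    """Valid 1D convolution (cross-correlation) of signal x with each kernel row."""
    k = kernels.shape[1]
    windows = np.lib.stride_tricks.sliding_window_view(x, k)  # (len - k + 1, k)
    return windows @ kernels.T                                # (len - k + 1, n_kernels)

def relu(z):
    return np.maximum(z, 0.0)

def max_pool(z, size=4):
    """Non-overlapping max pooling along the time axis."""
    n = (z.shape[0] // size) * size
    return z[:n].reshape(-1, size, z.shape[1]).max(axis=1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical input: one 1,024-sample frame of audio (or extracted features)
x = rng.standard_normal(1024)

kernels = rng.standard_normal((8, 16)) * 0.1        # 8 filters of width 16
feat = max_pool(relu(conv1d(x, kernels)))            # pooled feature map
w_out = rng.standard_normal((2, feat.size)) * 0.01   # dense layer: 2 classes
probs = softmax(w_out @ feat.ravel())                # [P(non-trigger), P(trigger)]
```

In a real pipeline, the kernels and dense weights would be learned by backpropagation over the labeled trigger/non-trigger clips, and the 2D CNN variant would operate on a spectrogram instead of the raw frame.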
