Emotion Detection in Human Speech

Computers, in their various forms, are ubiquitous. From tabletops to pockets, and nowadays even cars, computing elements are everywhere. This evolution in form calls for an evolution in input methods. Input methods such as mice, touch screens, and buttons are suited to specific form factors, but speech input works equally well on desktops, phones, tablets, cars, or any other computing system. Speech input becomes even more important when accessibility is limited, for example while driving a car, when the user has impaired vision, or whenever hands-free use is needed.

Use cases of Emotion Detection

Detection of emotion in a speech input can serve several purposes. Some of them are:

  • Suggest songs that may suit or complement the mood
  • Set speed limits when driving (if the user sounds angry, set the limit low)
  • Detect signs of depression in a patient


Emotional Prosody Speech and Transcripts was developed by the Linguistic Data Consortium and contains audio recordings and corresponding transcripts, collected over an eight month period in 2000-2001 and designed to support research in emotional prosody. There are 30 data files: 15 recordings in sphere format and their transcripts.

Description taken from here.

Name: Emotional Prosody Speech and Transcripts
Authors: Mark Liberman, Kelly Davis, Murray Grossman, Nii Martey, John Bell
Features extracted: 13 MFCC coefficients
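
To make the 13 MFCC features more concrete, here is a minimal pure-Python sketch of how 13 coefficients could be computed for one audio frame: Hamming window, power spectrum, triangular mel filterbank, log energies, then a DCT-II. This is not the project's actual extraction code. The function names, the 26-filter bank, the 256-sample frame, and the 8000 Hz sample rate are all assumptions chosen for illustration, and the frame is a synthetic tone rather than a recording from the dataset.

```python
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame, via a naive DFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(num_filters, frame_len, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    top_mel = hz_to_mel(sample_rate / 2.0)
    mel_points = [top_mel * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int((frame_len + 1) * mel_to_hz(m) / sample_rate) for m in mel_points]
    filters = []
    for j in range(1, num_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (frame_len // 2 + 1)
        for k in range(left, center):       # rising edge of the triangle
            row[k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge of the triangle
            row[k] = (right - k) / (right - center)
        filters.append(row)
    return filters

def mfcc_frame(frame, sample_rate, num_coeffs=13, num_filters=26):
    """Return num_coeffs MFCCs for one frame (hypothetical helper, assumed parameters)."""
    spectrum = power_spectrum(frame)
    log_energies = []
    for row in mel_filterbank(num_filters, len(frame), sample_rate):
        energy = sum(w * p for w, p in zip(row, spectrum))
        log_energies.append(math.log(max(energy, 1e-10)))  # floor avoids log(0)
    n = len(log_energies)
    # Unnormalized DCT-II of the log mel energies; keep the first num_coeffs terms.
    return [sum(e * math.cos(math.pi * c * (i + 0.5) / n)
                for i, e in enumerate(log_energies))
            for c in range(num_coeffs)]

# Synthetic 440 Hz tone standing in for one frame of speech (assumed values).
sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs))
```

In practice a recording would be split into many overlapping frames, and the per-frame coefficients would be aggregated (for example, averaged) into one feature vector per utterance. A dedicated audio library would normally replace the naive DFT used here.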


  • KNN Classifier
  • SVM Classifier
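
As a rough illustration of the two classifier types listed above, the following pure-Python sketch implements a k-nearest-neighbours vote and a binary linear SVM trained by subgradient descent on the hinge loss. It is a sketch under stated assumptions, not the project's implementation: the function names, the "happy"/"angry" labels, the hyperparameters, and the toy 2-D vectors (standing in for 13-dimensional MFCC vectors) are all hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training vectors closest to the query."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def train_linear_svm(train_x, signs, epochs=50, lr=0.01, reg=0.01):
    """Binary linear SVM: L2-regularized hinge loss, per-sample subgradient steps.

    signs must contain +1 or -1 for each training vector.
    """
    w = [0.0] * len(train_x[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(train_x, signs):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample violates the margin: move the boundary toward fixing it.
                w = [wi - lr * (reg * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Margin satisfied: only the regularizer shrinks the weights.
                w = [wi - lr * reg * wi for wi in w]
    return w, b

def svm_predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Toy, clearly separable feature vectors (assumed data, not from the dataset).
train_x = [[2.0, 2.0], [3.0, 3.0], [2.5, 3.0],
           [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]]
train_y = ["happy", "happy", "happy", "angry", "angry", "angry"]

knn_label = knn_predict(train_x, train_y, [2.2, 2.4], k=3)

signs = [1 if label == "happy" else -1 for label in train_y]
w, b = train_linear_svm(train_x, signs)
svm_label = "happy" if svm_predict(w, b, [3.0, 3.0]) == 1 else "angry"

print(knn_label, svm_label)
```

KNN needs no training step and handles any number of emotion classes directly. The SVM shown here separates only two classes, so covering more emotions would require a multi-class scheme such as one-vs-rest.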
