# 🎙️ EmotionSense

**AI-Powered Multilingual Speech Emotion Recognition for Emergency Response**
EmotionSense is an AI-based speech emotion recognition platform that analyzes emotions from voice recordings or live speech. The system detects emotional signals such as panic, fear, anger, sadness, and calmness to assist emergency responders in identifying high-priority situations faster.
## Features

- Speech Emotion Detection from recorded or live audio
- Multilingual Support using acoustic voice analysis
- Real-Time Distress Alerts for panic or emergency situations
- Emotion Timeline Visualization during calls
- Machine Learning-Based Classification
- Interactive Dashboard
## Inspiration

Emergency helplines and disaster response centers receive thousands of calls every day. Operators must quickly judge the urgency of each situation based only on the caller's voice.

In stressful situations, it can be difficult for humans to accurately detect emotions like panic, fear, or distress, especially when calls are short or chaotic.

EmotionSense was created to assist emergency responders by using Artificial Intelligence to analyze emotions in speech, helping them prioritize high-risk calls and respond faster.
## How It Works

The system follows a machine learning pipeline for speech emotion recognition:

Audio Input → Audio Processing (Librosa) → Feature Extraction (MFCC, Pitch, Energy) → Machine Learning Model → Emotion Prediction → Dashboard + Alerts
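As a rough illustration, the modelling stages of that pipeline can be expressed as a scikit-learn `Pipeline`. The `StandardScaler` and SVM classifier below are assumptions made for the sketch, not necessarily the components the project actually uses:

```python
# Illustrative only: the scaler and classifier are assumed, not confirmed
# by the project. Feature extraction (Librosa) happens before this step.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

emotion_pipeline = Pipeline([
    ("scale", StandardScaler()),                   # normalize MFCC/pitch/energy features
    ("clf", SVC(kernel="rbf", probability=True)),  # emotion classifier
])
```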
### Feature Extraction
EmotionSense extracts key speech features including:
- MFCC (Mel Frequency Cepstral Coefficients)
- Pitch and tone variations
- Energy levels in speech
These features capture the emotional characteristics of a speaker's voice.
An example of the final MFCC computation step, the discrete cosine transform applied to the log energies $E_n$ of the mel filter banks:

$$ \mathrm{MFCC}_k = \sum_{n=0}^{N-1} \log(E_n) \cdot \cos\left(\frac{\pi k (2n+1)}{2N}\right), \qquad k = 0, 1, \ldots, K-1 $$
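In practice, librosa computes MFCCs directly. A minimal sketch of this feature-extraction step, assuming per-clip mean/std summaries of MFCC, pitch, and energy (the exact feature set and summary statistics are assumptions):

```python
import librosa
import numpy as np

def extract_features(path: str) -> np.ndarray:
    """Return one fixed-length feature vector (MFCC + pitch + energy) per clip."""
    y, sr = librosa.load(path, sr=None)

    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # spectral shape / timbre
    rms = librosa.feature.rms(y=y)                      # frame-level energy
    f0 = librosa.yin(y, fmin=librosa.note_to_hz("C2"),  # pitch contour
                     fmax=librosa.note_to_hz("C7"), sr=sr)

    # Summarize each time series with its mean and standard deviation
    return np.concatenate([
        mfcc.mean(axis=1), mfcc.std(axis=1),
        [rms.mean(), rms.std()],
        [f0.mean(), f0.std()],
    ])
```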
### Machine Learning Model
The model is trained on labeled speech emotion datasets to classify emotions such as:
- Panic
- Fear
- Anger
- Sadness
- Neutral
Example training snippet:

```python
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```
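A more complete, hedged version of that snippet, assuming a feature matrix `X` and label array `y` built from the dataset; the `RandomForestClassifier` and the 80/20 split are illustrative choices, not the project's confirmed setup:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Assumed inputs:
# X: per-clip feature vectors (e.g. from extract_features above)
# y: emotion labels ("panic", "fear", "anger", "sadness", "neutral")
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print(classification_report(y_test, predictions))  # per-emotion precision/recall
```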
## Built With

**Programming Language**
- Python

**Machine Learning**
- Scikit-learn
- TensorFlow

**Audio Processing**
- Librosa

**Data Processing**
- NumPy
- Pandas

**Frontend**
- Streamlit / Flask

**Dataset**
- RAVDESS Speech Emotion Dataset

**Development Tools**
- Git
- GitHub
- Replit
## Multilingual Support
EmotionSense detects emotions based on acoustic voice patterns rather than specific spoken words.
This allows the system to work across:
- Different languages
- Regional accents
- Various speaking styles
## Challenges
Some challenges we faced during development include:
- Detecting emotions accurately across different languages
- Handling noisy audio recordings
- Processing speech data in real time
- Extracting meaningful emotional features from short call recordings
## What We Learned
Through this project we learned:
- Speech signal processing
- Audio feature extraction
- Emotion classification using machine learning
- Building multilingual AI systems
- Applying AI to real-world social impact problems
## Impact

EmotionSense can support:

- Emergency response systems
- Crisis helplines
- Mental health support centers
- Disaster response teams
By detecting emotional distress early, the system helps prioritize urgent cases and potentially save lives.
## Future Improvements
- Real-time live call streaming analysis
- Deep learning models (CNN / LSTM)
- More emotion categories
- Improved multilingual support
- Integration with emergency call center systems
## Demo

How the system works (a sketch follows the list):

1. Upload or record call audio
2. The system processes the speech signal
3. Emotional audio features are extracted
4. The machine learning model predicts the emotion
5. The dashboard displays results and alerts
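A minimal end-to-end sketch of those steps, reusing the `extract_features` helper from above; the model file name and the sample audio path are hypothetical:

```python
import joblib

EMERGENCY_LABELS = {"panic", "fear"}  # labels assumed to trigger alerts

model = joblib.load("model/emotion_clf.joblib")            # hypothetical model artifact
features = extract_features("audio_samples/call_001.wav")  # hypothetical sample clip
emotion = model.predict(features.reshape(1, -1))[0]

print(f"Predicted emotion: {emotion}")
if emotion in EMERGENCY_LABELS:
    print("ALERT: possible distress, escalate this call")
```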
## Project Structure

```
EmotionAi/
│
├── app.py
├── model/
├── dataset/
├── audio_samples/
├── requirements.txt
└── README.md
```
## Author

**Augustha Jeya Mukilya M**
AI & Data Science Student
Passionate about Machine Learning, AI, and Social Impact Technology
## Support

If you like this project, consider giving it a star ⭐ on GitHub.
## License
This project is licensed under the MIT License.