Inspiration

Emotion recognition from speech matters for mental health monitoring, accessibility, and human-computer interaction. Traditional neural networks keep a fixed architecture, whereas biological neurons adapt through their dendrites. I was inspired by PerforatedAI's approach of adding artificial dendrites to neural networks, loosely mimicking how the brain learns.

What it does

SpeakEmotion classifies 8 emotions (neutral, calm, happy, sad, angry, fearful, disgust, surprised) from speech audio using the RAVDESS dataset:

  1. Converts audio to Mel spectrograms
  2. Processes through a CNN with PerforatedAI dendrites
  3. Dynamically grows new dendrites during training
  4. Reduces the remaining error by 22.2% relative to the baseline CNN (accuracy rises from 53.45% to 63.79%)
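To make step 1 more concrete, here is a minimal sketch of the Mel scale that gives Mel spectrograms their name. It assumes the common HTK-style formula; audio libraries offer several variants, and which one this project used is not stated. The function names and the 64-band, 0 to 8000 Hz setup are illustrative only.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Map a frequency in Hz to the Mel scale (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels: int, f_min: float, f_max: float) -> List[float]:
    """Center frequencies (Hz) of n_mels bands spaced evenly on the Mel scale."""
    mel_lo, mel_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    # n_mels + 2 evenly spaced points: the interior points are the band centers.
    step = (mel_hi - mel_lo) / (n_mels + 1)
    return [mel_to_hz(mel_lo + step * i) for i in range(1, n_mels + 1)]


# Hypothetical settings: 64 Mel bands between 0 Hz and 8 kHz.
centers = mel_center_frequencies(64, 0.0, 8000.0)
print([round(c) for c in centers[:3]], "...", [round(c) for c in centers[-3:]])
```

The bands sit close together at low frequencies and spread out at high ones, which roughly matches how human hearing resolves pitch.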

How I built it

  • PyTorch CNN for spectrogram classification
  • PerforatedAI for dendritic optimization with add_validation_score()
  • Weights & Biases for experiment tracking with Arch/Final logging
  • RAVDESS dataset (1,440 audio files from 24 actors)
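For illustration, a small PyTorch CNN of the kind described above might look like the sketch below. The layer sizes, the class name `SpectrogramCNN`, and the 64 x 128 input shape are assumptions made for this example, not the architecture actually trained, and the PerforatedAI conversion step is left out because its exact calls are not reproduced here.

```python
import torch
from torch import nn


class SpectrogramCNN(nn.Module):
    """Tiny CNN mapping a 1-channel Mel spectrogram to 8 emotion logits."""

    def __init__(self, n_classes: int = 8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),
            # Global average pooling makes the classifier input size
            # independent of the spectrogram's height and width.
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)  # (batch, 32, 1, 1)
        return self.classifier(x.flatten(1))  # (batch, n_classes)


model = SpectrogramCNN()
model.eval()
with torch.no_grad():
    # Hypothetical batch: 2 spectrograms, 64 Mel bands, 128 time frames.
    logits = model(torch.randn(2, 1, 64, 128))
print(logits.shape)
```

The `BatchNorm2d` layers are included because the writeup names them as a point that needed care during module conversion.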

Results

| Model           | Accuracy |
|-----------------|----------|
| Traditional CNN | 53.45%   |
| +1 Dendrite     | 59.48%   |
| +2 Dendrites    | 63.79%   |

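Remaining error reduction (RER) is the share of the baseline's error (100 minus its accuracy) that the best dendritic model removes:

$$RER = \frac{63.79 - 53.45}{100 - 53.45} \times 100 = \textbf{22.2\%}$$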
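The same arithmetic can be checked in a few lines of Python. This is a sketch that uses only the accuracies reported above; the function name `remaining_error_reduction` is made up for this example.

```python
def remaining_error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Percentage of the baseline's error (100 - accuracy) removed by the new model."""
    baseline_error = 100.0 - baseline_acc
    if baseline_error <= 0:
        raise ValueError("baseline accuracy must be below 100%")
    return (new_acc - baseline_acc) / baseline_error * 100.0


baseline = 53.45  # Traditional CNN accuracy from the results above
for label, acc in [("+1 Dendrite", 59.48), ("+2 Dendrites", 63.79)]:
    print(f"{label}: {remaining_error_reduction(baseline, acc):.1f}% RER")
```

The second line of output reproduces the 22.2% figure. The first shows that a single dendrite already removes about 13% of the baseline error.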

Challenges

  • Handling BatchNorm layers with PerforatedAI module conversion
  • Ensuring consistent spectrogram dimensions across variable-length audio
  • Implementing Arch/Final W&B logging correctly, following the official example
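One common way to handle the variable-length audio problem is to crop or zero-pad every spectrogram to a fixed number of time frames. The sketch below uses plain Python lists so that it runs without dependencies. It is an assumption about how this could be done, not the project's actual preprocessing, and `fix_num_frames` is a hypothetical helper name.

```python
from typing import List

Spectrogram = List[List[float]]  # rows = Mel bands, columns = time frames


def fix_num_frames(
    spec: Spectrogram, target_frames: int, pad_value: float = 0.0
) -> Spectrogram:
    """Crop or right-pad each Mel band so it has exactly target_frames columns."""
    fixed = []
    for row in spec:
        if len(row) >= target_frames:
            fixed.append(row[:target_frames])  # too long: keep the first frames
        else:
            padding = [pad_value] * (target_frames - len(row))
            fixed.append(row + padding)  # too short: pad at the end
    return fixed


short_clip = [[1.0, 2.0], [3.0, 4.0]]           # 2 bands, 2 frames
long_clip = [[1.0, 2.0, 3.0, 4.0, 5.0]] * 2     # 2 bands, 5 frames
print(fix_num_frames(short_clip, 4))
print(fix_num_frames(long_clip, 4))
```

For log-scaled spectrograms, a pad value of 0.0 may not correspond to silence, so the minimum value of the spectrogram can be a better choice for `pad_value`.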

What I learned

  • How dendritic optimization mimics biological neural plasticity
  • How dynamic architecture growth can outperform a fixed network: two added dendrites raised accuracy from 53.45% to 63.79%
  • Proper PerforatedAI integration patterns

What's next

  • Real-time emotion detection in video calls
  • Multi-modal emotion recognition (voice + facial expressions)
  • Extension to other audio tasks (speech recognition, music classification)
