MindfulAI - Our Hackathon Journey

Inspiration

In the race to build smarter AI, we've forgotten something fundamental: Emotional Intelligence.

Modern AI systems optimize for accuracy, speed, and capability - but they speak in the same flat, monotone voice whether you're celebrating a promotion or grieving a loss. They analyze your words but miss the tremor in your voice. They generate responses but don't feel the weight of what you're sharing.

The AI revolution neglected the Emotional Quotient (EQ).

We asked ourselves: What if AI could actually listen - not just to words, but to tone? What if it could respond not just with the right information, but with the right feeling? What if talking to AI felt less like querying a database and more like confiding in a friend?

The Gap We Saw

| Traditional AI | What Humans Need |
|---|---|
| Flat, consistent voice | Tone that adapts to emotion |
| Text analysis only | Voice tone recognition |
| Same response style always | Empathetic variation |
| "I understand" (but doesn't) | Actually demonstrates understanding |

Our Novel Approach

MindfulAI bridges this gap with two key innovations:

  1. Emotion-Adaptive Voice Output - When you're anxious, MindfulAI speaks slower, softer, calmer. When you're excited, it matches your energy. The AI doesn't just say it understands - it shows it through how it speaks.

  2. User Tone Detection - Beyond analyzing what you say, we analyze how you say it. The intensity in your voice, the hesitation, the relief - these signals inform the AI's understanding and response.

This isn't just another chatbot with a voice skin. This is AI that finally has emotional intelligence.

What We Learned

Technical Insights

1. Memory Decay Mathematics

One of our key innovations was implementing memory decay for emotional state tracking. Emotions shouldn't persist at full intensity forever - they naturally fade:

$$w_{t+1} = w_t \cdot e^{-\lambda \Delta t}$$

Where:

  • $w_t$ = emotion weight at time $t$
  • $\lambda$ = decay constant (we used 0.15)
  • $\Delta t$ = time since last mention (measured in conversational exchanges)

This creates more natural conversations where the AI doesn't fixate on emotions mentioned 10 exchanges ago.
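
As a minimal sketch (assuming emotion weights live in a plain dict keyed by emotion name, with $\Delta t$ counted in exchanges as above; the 0.05 noise floor is an illustrative choice):

import math

DECAY_LAMBDA = 0.15  # decay constant from the formula above

def decay_emotion_weights(weights: dict[str, float], delta_t: float) -> dict[str, float]:
    """Apply w_{t+1} = w_t * e^(-lambda * dt) to every tracked emotion."""
    factor = math.exp(-DECAY_LAMBDA * delta_t)
    # Prune weights that fall below a small noise floor so stale emotions vanish
    return {emotion: w * factor for emotion, w in weights.items()
            if w * factor > 0.05}

With $\lambda = 0.15$, an emotion last mentioned ten exchanges ago keeps only $e^{-1.5} \approx 22\%$ of its weight - exactly the "don't fixate" behavior we wanted.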

2. Emotion Blending

Instead of overwriting emotions, we blend them for continuity:

$$w_{new} = \max(w_{prev} \cdot 0.6 + I_{new} \cdot 0.4, I_{new} \cdot 0.8)$$

Where $I_{new}$ is the new intensity. This prevents jarring emotional jumps.
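
For intuition, take a previous anxiety weight of 0.8 and a new reading of intensity 0.3 (numbers chosen purely for illustration):

$$w_{new} = \max(0.8 \cdot 0.6 + 0.3 \cdot 0.4,\; 0.3 \cdot 0.8) = \max(0.60,\; 0.24) = 0.60$$

The weight eases down to 0.60 instead of snapping to 0.3. The second term works in the opposite direction: a strong new emotion (intensity 0.9 against a previous 0.2) lands at 0.72 rather than being averaged away.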

3. The Power of "Why NOT"

Explainability isn't just about why the AI did something - it's equally important to explain why it didn't do something else. Our technique selection now includes rejection reasons:

{
  "technique": "validation",
  "reason": "User needs to feel heard",
  "why_not": {
    "cognitive_reframing": "Too early - user not ready",
    "grounding_exercise": "No anxiety symptoms present"
  }
}
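
A minimal sketch of the selection step that produces this structure (the candidates shape, scores, and helper name are illustrative, not our exact pipeline):

def select_technique(candidates: dict[str, dict]) -> dict:
    """Pick the top-scoring technique and record why each runner-up lost."""
    chosen = max(candidates, key=lambda name: candidates[name]["score"])
    return {
        "technique": chosen,
        "reason": candidates[chosen]["reason"],
        "why_not": {name: info["rejection"]
                    for name, info in candidates.items() if name != chosen},
    }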

Human Insights

  • Voice changes everything - Text feels transactional; voice feels like connection
  • Transparency builds trust - Users engage more when they can see the AI's reasoning
  • Silence is powerful - Sometimes the best response is a pause, not more words

How We Built It

Architecture Overview

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   React + TS    │────▶│  FastAPI + WS   │────▶│   Gemini 2.5    │
│   (Frontend)    │◀────│   (Backend)     │◀────│   (Vertex AI)   │
└─────────────────┘     └────────┬────────┘     └─────────────────┘
                                 │
                    ┌────────────┼────────────┐
                    ▼            ▼            ▼
              ┌──────────┐ ┌──────────┐ ┌──────────┐
              │ Kafka    │ │ Postgres │ │ElevenLabs│
              │ Events   │ │ Sessions │ │  Voice   │
              └──────────┘ └──────────┘ └──────────┘

Tech Stack

| Layer | Technology | Purpose |
|---|---|---|
| Frontend | React 18 + TypeScript + Vite | Voice-first UI with real-time updates |
| Backend | FastAPI + WebSockets | Async processing, streaming responses |
| AI | Gemini 2.5 Flash (Vertex AI) | Emotion analysis, technique selection, response generation |
| Voice | ElevenLabs | Emotion-adaptive voice synthesis |
| Events | Confluent Kafka | Observable AI cognition streaming |
| Database | PostgreSQL | Session persistence, audit trails |

Key Components

1. Conversation State Machine

  • 6 phases: Opening → Exploration → Deepening → Technique → Integration → Closing
  • Automatic phase transitions based on user signals (sketched after this list)
  • Memory decay applied every exchange
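
A condensed sketch of the machine (the ready flag and two-exchange minimum are illustrative; the real transition rules key off richer user signals):

from enum import Enum, auto

class Phase(Enum):
    OPENING = auto()
    EXPLORATION = auto()
    DEEPENING = auto()
    TECHNIQUE = auto()
    INTEGRATION = auto()
    CLOSING = auto()

PHASE_ORDER = list(Phase)

def maybe_advance(phase: Phase, exchanges_in_phase: int, ready: bool) -> Phase:
    """Move one phase forward when the user signals readiness; CLOSING is terminal."""
    if ready and exchanges_in_phase >= 2 and phase is not Phase.CLOSING:
        return PHASE_ORDER[PHASE_ORDER.index(phase) + 1]
    return phase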

2. Emotion-Adaptive Voice

  • Voice parameters adjust based on detected emotion (example mapping below)
  • Anxious user → slower, calmer voice
  • Happy user → warmer, more energetic voice
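
A sketch of the mapping idea. The field names echo ElevenLabs-style voice settings, but every value (and the speed field itself) is an illustrative assumption, not our tuned configuration:

# Emotion -> voice-parameter table (values are illustrative assumptions)
VOICE_PROFILES = {
    "anxious": {"stability": 0.85, "style": 0.2, "speed": 0.90},  # slower, calmer
    "sad":     {"stability": 0.80, "style": 0.3, "speed": 0.95},  # soft and steady
    "happy":   {"stability": 0.55, "style": 0.7, "speed": 1.05},  # warm, energetic
    "neutral": {"stability": 0.70, "style": 0.5, "speed": 1.00},
}

def voice_settings_for(emotion: str) -> dict:
    # Unknown emotions fall back to the neutral profile
    return VOICE_PROFILES.get(emotion, VOICE_PROFILES["neutral"])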

3. Observable AI Cognition Dashboard

  • 14+ event types streamed via Kafka
  • Real-time visualization of AI decision-making
  • Full audit trail with correlation IDs (emitter sketched below)
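
A sketch of the emitter built on the standard confluent-kafka Producer; the topic name, broker address, and event envelope are illustrative choices:

import json

from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})  # placeholder broker

def emit_event(event_type: str, correlation_id: str, reason: str, **payload) -> None:
    """Publish one cognition event; keying by correlation ID keeps each session's
    events ordered, and the reason field keeps the stream human-readable."""
    event = {"type": event_type, "correlation_id": correlation_id,
             "reason": reason, **payload}
    producer.produce("mindfulai.cognition", key=correlation_id,
                     value=json.dumps(event).encode())
    producer.poll(0)  # serve delivery callbacks without blocking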

4. Safety System

  • Crisis detection with keyword + context analysis (keyword layer sketched below)
  • Immediate resource provision (988, crisis lines)
  • Never provides medical advice
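
A deliberately simplified sketch of the keyword layer (the phrases are examples only; as noted above, this runs alongside context analysis, and a match routes straight to resources):

CRISIS_PHRASES = {"want to end it all", "hurt myself", "no reason to live"}  # examples

CRISIS_MESSAGE = (
    "I'm really glad you told me. Please reach someone who can help right now: "
    "call or text 988 (Suicide & Crisis Lifeline)."
)

def crisis_check(message: str) -> str | None:
    """Return crisis resources on any match - never advice, never therapy."""
    lowered = message.lower()
    if any(phrase in lowered for phrase in CRISIS_PHRASES):
        return CRISIS_MESSAGE
    return None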

Challenges We Faced

Challenge 1: The "Sarcasm Problem"

Problem: Our breakthrough detection was triggering on sarcastic statements like "Oh sure, that makes total sense" (said dismissively).

Solution: We added emotion context requirements - breakthroughs only count when accompanied by positive emotions (relief, calm, gratitude):

positive_emotions = ["relief", "calm", "gratitude", "hopeful", "curious"]
if emotion not in positive_emotions:
    return None  # Not a real breakthrough

Challenge 2: The "Repetition Trap"

Problem: The AI would repeat similar responses, especially when users gave short acknowledgments like "yeah" or "okay".

Solution: We implemented multiple anti-repetition mechanisms:

  • Track used response patterns with fingerprinting (sketch below)
  • Track questions asked to avoid re-asking
  • Explicit anti-repetition guidance in prompts
  • Exercise discussion blocking after completion
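
A minimal sketch of the fingerprinting mechanism (normalization here is just case and whitespace, chosen for illustration):

import hashlib

class RepetitionGuard:
    """Remember fingerprints of sent responses and flag near-duplicates."""

    def __init__(self) -> None:
        self.seen: set[str] = set()

    def fingerprint(self, response: str) -> str:
        normalized = " ".join(response.lower().split())  # collapse case/whitespace
        return hashlib.sha1(normalized.encode()).hexdigest()[:12]

    def is_repeat(self, response: str) -> bool:
        fp = self.fingerprint(response)
        if fp in self.seen:
            return True
        self.seen.add(fp)
        return False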

Challenge 3: Phase Transition Counter Bug

Problem: Our phase transition events were reporting exchanges_in_previous_phase: 0 instead of the actual count.

Root Cause: The counter was reset before the event was emitted:

# BUG: Counter reset happens inside _check_phase_transition
self._check_phase_transition(memory, user_message)  # Resets counter to 0
emit_event(exchanges_in_previous_phase=memory.exchanges_in_phase)  # Always 0!

Solution: Store the count before calling the transition check:

exchanges_before = memory.exchanges_in_phase  # Save it first!
self._check_phase_transition(memory, user_message)
emit_event(exchanges_in_previous_phase=exchanges_before)  # Correct value

Challenge 4: Emotional "Jumpiness"

Problem: Emotions would jump from 0.8 anxiety to 0.3 calm in a single exchange, which felt unnatural.

Solution: Implemented emotion blending instead of overwriting:

# Before: Jarring overwrites
memory.emotion_weights[emotion] = intensity

# After: Smooth blending
blended = max(previous * 0.6 + intensity * 0.4, intensity * 0.8)
memory.emotion_weights[emotion] = blended

Challenge 5: Kafka Event Explosion

Problem: We were emitting so many events that Kafka costs would scale poorly.

Solution: Strategic event emission - only emit when something meaningful changes, not every micro-update. Added reason fields to every event for human readability.
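
A sketch of the "meaningful change" gate (the 0.1 threshold is an illustrative choice):

def should_emit(prev: dict[str, float], curr: dict[str, float],
                threshold: float = 0.1) -> bool:
    """Emit only when some emotion weight moved by at least the threshold."""
    return any(abs(curr.get(e, 0.0) - prev.get(e, 0.0)) >= threshold
               for e in set(prev) | set(curr))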

What Makes MindfulAI Different

| Feature | Traditional Chatbots | MindfulAI |
|---|---|---|
| Interaction | Text-only | Voice-first |
| Transparency | Black box | Observable cognition |
| Memory | Stateless or full history | Decaying memory (natural) |
| Explainability | None | Why + Why NOT |
| Auditability | Logs only | Real-time Kafka stream |
| Emotion handling | Simple sentiment | Multi-emotion with intensity |

Future Vision

  • Multi-language support via Gemini's multilingual capabilities
  • Therapist dashboard for professionals to review AI sessions
  • Personalized voice cloning for consistency across sessions
  • Federated learning for privacy-preserving model improvements
  • Integration with wearables for physiological context (heart rate, stress levels)

Built With

React, TypeScript, Vite, FastAPI, WebSockets, Gemini 2.5 Flash (Vertex AI), ElevenLabs, Confluent Kafka, PostgreSQL
