T-Mobile Customer Happiness Index

Inspiration

While working at T-Mobile as a retail associate, I saw firsthand how customer frustration could build quietly before turning into complaints. Those moments made me wonder: what if store teams could detect dissatisfaction before it escalates?

To solve this, I envisioned a system that uses AI-driven emotion detection and natural language processing to analyze both facial expressions and spoken feedback in real time. The goal: transform subtle cues into actionable insights so managers can proactively improve service, boost satisfaction, and create a smoother customer experience for everyone.


Overview

T-Mobile Customer Happiness Index is a real-time sentiment dashboard that:

  • Tracks multiple customer faces simultaneously using the TensorFlow.js BlazeFace model, with individual bounding boxes.
  • Analyzes facial expressions every 5 seconds using the Google Gemini 2.0 Flash vision API to detect emotions (Happy, Neutral, Frustrated, Angry).
  • Transcribes spoken feedback using the browser-based Web Speech API (no external API needed).
  • Analyzes text sentiment using Gemini's advanced language understanding.
  • Visualizes trends with live charts showing happiness score, emotion distribution, and 24-hour sentiment patterns.
  • Generates real-time alerts when negative sentiment spikes are detected.
  • Provides AI-powered insights with actionable recommendations for store managers.
  • Updates live via WebSocket streaming for instant dashboard refreshes.

Store managers get a comprehensive view of customer satisfaction without interrupting service or requiring customer surveys.


Tech Stack

  • Frontend: React + TypeScript + Tailwind CSS with shadcn/ui components
  • Backend: Express.js + Node.js with WebSocket support
  • AI Engine: Google Gemini 2.0 Flash (multi-modal vision + text analysis)
  • Computer Vision: TensorFlow.js with BlazeFace model for real-time face detection
  • Speech Recognition: Browser Web Speech API for free voice transcription
  • Real-time: WebSockets for live emotion updates
  • Storage: In-memory
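
A minimal sketch of how the Express server and WebSocket layer could be wired together, assuming the ws package; the /ws path and broadcast helper are illustrative names, not necessarily the project's actual implementation:

    // server.ts — minimal Express + WebSocket wiring (illustrative sketch)
    import express from "express";
    import { createServer } from "http";
    import { WebSocketServer, WebSocket } from "ws";

    const app = express();
    const server = createServer(app);

    // Attach the WebSocket server to the same HTTP server on a dedicated path.
    const wss = new WebSocketServer({ server, path: "/ws" });

    // Push a sentiment update to every connected dashboard client.
    export function broadcast(update: object): void {
      const payload = JSON.stringify(update);
      for (const client of wss.clients) {
        if (client.readyState === WebSocket.OPEN) client.send(payload);
      }
    }

    server.listen(5000, () => console.log("Listening on :5000"));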

Architecture

  1. Webcam captures video frames.
  2. BlazeFace detects multiple faces and tracks their positions with bounding boxes (see the face-detection sketch after this list).
  3. Gemini Vision analyzes emotions in real time (Happy, Neutral, Frustrated, Angry); a sketch follows this list.
  4. Speech-to-text: the Web Speech API converts spoken feedback to text.
  5. Gemini Text analyzes sentiment from the transcribed speech (sketched after this list).
  6. All data streams through WebSocket for live updates.
  7. The dashboard refreshes in real time to display the sentiment analysis.
  8. The AI engine periodically generates actionable insights based on sentiment patterns.
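
A sketch of steps 1 and 2, using the published @tensorflow-models/blazeface API; the overlay-drawing details are simplified and trackFaces is an illustrative name:

    // Steps 1–2: continuous multi-face detection with bounding-box overlays.
    import "@tensorflow/tfjs";
    import * as blazeface from "@tensorflow-models/blazeface";

    async function trackFaces(video: HTMLVideoElement, overlay: HTMLCanvasElement) {
      const model = await blazeface.load();
      const ctx = overlay.getContext("2d")!;

      const detect = async () => {
        // returnTensors = false yields plain [x, y] arrays instead of tensors.
        const faces = await model.estimateFaces(video, false);
        ctx.clearRect(0, 0, overlay.width, overlay.height);
        for (const face of faces) {
          const [x1, y1] = face.topLeft as [number, number];
          const [x2, y2] = face.bottomRight as [number, number];
          ctx.strokeStyle = "#e20074"; // T-Mobile magenta (illustrative choice)
          ctx.strokeRect(x1, y1, x2 - x1, y2 - y1);
        }
        requestAnimationFrame(detect);
      };
      requestAnimationFrame(detect);
    }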
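
For step 3, a sketch of the emotion call, assuming frames are analyzed on the backend with the official @google/generative-ai SDK; the prompt wording and the classifyEmotion name are illustrative:

    // Step 3: classify the dominant emotion in a captured frame.
    import { GoogleGenerativeAI } from "@google/generative-ai";

    type Emotion = "Happy" | "Neutral" | "Frustrated" | "Angry";

    const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
    const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash" });

    async function classifyEmotion(jpegBase64: string): Promise<Emotion> {
      const result = await model.generateContent([
        "Classify the dominant facial emotion in this image. " +
          "Answer with exactly one word: Happy, Neutral, Frustrated, or Angry.",
        { inlineData: { data: jpegBase64, mimeType: "image/jpeg" } },
      ]);
      return result.response.text().trim() as Emotion;
    }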
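
Step 5 can reuse the same model handle for text-only sentiment. The -1 to 1 scale below is an assumption for illustration, not necessarily the project's actual scoring:

    // Step 5: score transcribed speech (reuses `model` from the sketch above).
    async function scoreSentiment(transcript: string): Promise<number> {
      const result = await model.generateContent(
        "Rate the customer sentiment of the following feedback on a scale from " +
          "-1 (very negative) to 1 (very positive). Reply with only the number.\n\n" +
          transcript,
      );
      return Number(result.response.text().trim());
    }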

Challenges

  • Coordinate transformation for face tracking: Getting bounding boxes to overlay accurately on scaled, letterboxed video required transforming coordinates between the video's intrinsic resolution and its on-screen size to account for differing aspect ratios (a sketch of the transform follows this list).

  • Speech-to-text accuracy: The Web Speech API has quirks with interim vs. final transcripts, so I had to implement careful state management to prevent duplicate text from appearing (see the sketch after this list).

  • Real-time performance: Balancing continuous face detection with emotion analysis every 5 seconds required careful throttling to avoid overwhelming the API while maintaining a real-time feel (a scheduling sketch follows this list).
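
For the coordinate-transformation challenge, a sketch of the underlying math, assuming the video is rendered with object-fit: contain; mapBox and the box shape are illustrative:

    // Map a detection box from the video's intrinsic coordinates to the
    // on-screen coordinates of a letterboxed (object-fit: contain) element.
    function mapBox(
      video: HTMLVideoElement,
      box: { x: number; y: number; w: number; h: number },
    ) {
      const { videoWidth, videoHeight, clientWidth, clientHeight } = video;
      // "contain" scales by the smaller ratio, leaving bars on the other axis.
      const scale = Math.min(clientWidth / videoWidth, clientHeight / videoHeight);
      const offsetX = (clientWidth - videoWidth * scale) / 2;  // pillarbox bars
      const offsetY = (clientHeight - videoHeight * scale) / 2; // letterbox bars
      return {
        x: box.x * scale + offsetX,
        y: box.y * scale + offsetY,
        w: box.w * scale,
        h: box.h * scale,
      };
    }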
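
For the interim-vs-final quirk, a sketch of the deduplicating state management; the render hook is hypothetical:

    // Show interim results live, but only commit transcripts the recognizer
    // has marked final, so restated interim text never duplicates.
    const SpeechRecognitionImpl =
      (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;
    const recognition = new SpeechRecognitionImpl();
    recognition.continuous = true;
    recognition.interimResults = true;

    let committed = ""; // finalized text only

    recognition.onresult = (event: any) => {
      let interim = "";
      // event.resultIndex skips results already handled by earlier events.
      for (let i = event.resultIndex; i < event.results.length; i++) {
        const result = event.results[i];
        if (result.isFinal) committed += result[0].transcript + " ";
        else interim += result[0].transcript;
      }
      render(committed, interim);
    };

    // Hypothetical UI hook: update the transcript panel.
    function render(finalText: string, interimText: string): void {
      console.log(finalText, interimText);
    }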
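
And for the performance challenge, a sketch of decoupling continuous detection from the expensive API calls on a 5-second cadence; classifyEmotion refers to the Architecture sketch above, and publish is a hypothetical hook into the WebSocket layer:

    // Sample the newest frame every 5 seconds instead of analyzing every
    // detection, and never start a call while one is still in flight.
    declare function classifyEmotion(frame: string): Promise<string>;
    declare function publish(emotion: string): void;

    const ANALYSIS_INTERVAL_MS = 5_000;
    let latestFrame: string | null = null; // updated by the detection loop
    let inFlight = false;

    setInterval(async () => {
      if (!latestFrame || inFlight) return; // idle camera or pending API call
      inFlight = true;
      try {
        publish(await classifyEmotion(latestFrame));
      } finally {
        inFlight = false;
      }
    }, ANALYSIS_INTERVAL_MS);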
