Resonance - AI Voice Coach

Inspiration

We've all been there — sweaty palms before a big sales call, heart racing before confronting an angry customer, or mind going blank during a crucial negotiation. Traditional communication training relies on expensive coaches, awkward role-plays with colleagues, or simply "winging it" and hoping for the best.

We asked ourselves: What if you could practice any difficult conversation, anytime, with an AI that talks back naturally — and even interrupts you like a real person would?

The inspiration came from observing how pilots use flight simulators to handle emergencies they'll hopefully never face. Sales reps, customer service agents, and managers deserve the same — a safe space to fail, learn, and build confidence before the stakes are real.

What it does

Resonance is an AI-powered mobile app that simulates realistic voice conversations for high-stakes communication training.

Core Features:

Real-time AI Voice Simulation — Talk naturally with AI personas powered by Gemini 2.5 Flash and ElevenLabs voice synthesis
Natural Interruption (Barge-In) — Interrupt the AI mid-sentence just like real conversations, with <150ms response time
Chaos Engine — Simulate real-world disruptions: background noise, connection drops, voice variations, and hardware failures
Stress Mode — Handle multiple callers in queue with a stamina system that tests your endurance
Voice Lab — Clone voices or choose from a library to practice with specific personality types
Performance Analytics — Track pace (WPM), filler words, clarity, confidence scores, and emotional patterns
Context Upload — Upload PDF/DOCX documents (product specs, scripts) for scenario-specific training
Offline-First — All data stays local on your device with SQLite storage
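
As an illustration of the pace metric listed under Performance Analytics, words per minute can be derived from a transcript and the time spent speaking. This is a minimal sketch, not the app's actual code: the function name `wordsPerMinute` and the whitespace-based word split are assumptions made for the example.

```typescript
// Hypothetical helper: speaking pace in words per minute (WPM).
// Assumes words are separated by whitespace and that durationMs covers
// only the time the user was actually speaking.
function wordsPerMinute(transcript: string, durationMs: number): number {
  if (durationMs <= 0) return 0;
  const wordCount = transcript
    .trim()
    .split(/\s+/)
    .filter((word) => word.length > 0).length;
  // words / minutes, written as one division to avoid rounding drift
  return (wordCount * 60000) / durationMs;
}
```

For example, six words spoken in three seconds works out to 120 WPM.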

How we built it

Tech Stack:

Expo SDK 50 + React Native — Cross-platform mobile development with managed workflow
Expo Router — File-based navigation for clean architecture
NativeWind (Tailwind CSS) — Rapid UI styling with utility classes
Zustand — Lightweight state management
Gemini 2.5 Flash — AI conversation engine for dynamic, context-aware responses
ElevenLabs WebSocket API — Ultra-low latency text-to-speech streaming
Custom VAD (Voice Activity Detection) — Signal Energy RMS-based detection with ambient noise calibration
expo-sqlite — Local database for sessions, transcripts, and analytics
expo-secure-store — Encrypted storage for user API keys
Moti + Reanimated — Smooth animations including the signature "Sun" orb visualizer

Architecture Highlights:

Layered architecture separating presentation, business logic, services, and data access
WebSocket streaming for sub-800ms voice response latency
Haptic feedback on successful interruptions for tactile confirmation
Mock mode for full functionality without API calls (demo/testing)

Challenges we ran into

1. Voice Activity Detection Accuracy: Our initial AI-based VAD was too slow and resource-intensive. We pivoted to a signal-energy (RMS) approach that calibrates the ambient noise floor during the splash screen, achieving <150ms detection while staying battery-efficient.
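
The RMS idea can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the code that ships in Resonance: the class name `EnergyVad`, the threshold ratio of 3, the two-frame debounce, and the minimum threshold of 0.01 are all hypothetical values chosen for the example, and frames are assumed to be PCM samples normalized to [-1, 1].

```typescript
// Root-mean-square energy of one audio frame (samples assumed in [-1, 1]).
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    sumSquares += frame[i] * frame[i];
  }
  return Math.sqrt(sumSquares / frame.length);
}

// Hypothetical energy-based VAD with ambient noise floor calibration.
class EnergyVad {
  private noiseFloor = 0;
  private speechFrames = 0;
  private thresholdRatio: number;
  private minSpeechFrames: number;
  private minThreshold: number;

  // All defaults are illustrative assumptions, not tuned values.
  constructor(thresholdRatio = 3, minSpeechFrames = 2, minThreshold = 0.01) {
    this.thresholdRatio = thresholdRatio;
    this.minSpeechFrames = minSpeechFrames;
    this.minThreshold = minThreshold;
  }

  // Average the RMS of frames captured while the user is silent,
  // for example while a splash screen is showing.
  calibrate(ambientFrames: Float32Array[]): void {
    if (ambientFrames.length === 0) return;
    let total = 0;
    for (const frame of ambientFrames) total += rms(frame);
    this.noiseFloor = total / ambientFrames.length;
  }

  // Returns true once enough consecutive frames exceed the threshold.
  process(frame: Float32Array): boolean {
    const threshold = Math.max(
      this.noiseFloor * this.thresholdRatio,
      this.minThreshold,
    );
    this.speechFrames = rms(frame) > threshold ? this.speechFrames + 1 : 0;
    return this.speechFrames >= this.minSpeechFrames;
  }
}
```

Requiring a couple of consecutive loud frames is one simple way to keep a single click or pop from counting as speech; the right frame size and ratio would depend on the device and microphone.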

2. Barge-In Timing: Making interruptions feel natural required careful audio pipeline management. We had to cancel ongoing TTS playback, stop the AI mid-thought, and transition seamlessly to listening mode, all within milliseconds.
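
One way to picture the barge-in sequence is a small turn-state controller. The sketch below is an assumption-laden illustration rather than the project's implementation: `BargeInController`, the `AudioPlayer` interface, and the generation counter are hypothetical names, and the only platform API used is the standard `AbortController`.

```typescript
type TurnState = "listening" | "thinking" | "speaking";

// Hypothetical playback abstraction; a real app would wrap its audio player.
interface AudioPlayer {
  stop(): void;
}

class BargeInController {
  state: TurnState = "listening";
  private generation = 0;
  private abort: AbortController | null = null;
  private player: AudioPlayer;
  private onInterrupt: () => void;

  // onInterrupt is a hook where haptic feedback could be triggered.
  constructor(player: AudioPlayer, onInterrupt: () => void = () => {}) {
    this.player = player;
    this.onInterrupt = onInterrupt;
  }

  // Start an AI turn; the signal can be passed to in-flight requests.
  beginResponse(): { signal: AbortSignal; generation: number } {
    this.abort = new AbortController();
    this.generation += 1;
    this.state = "thinking";
    return { signal: this.abort.signal, generation: this.generation };
  }

  // Returns false for audio chunks that belong to an interrupted turn,
  // so late-arriving audio is dropped instead of played.
  acceptChunk(generation: number): boolean {
    if (generation !== this.generation || this.state === "listening") {
      return false;
    }
    this.state = "speaking";
    return true;
  }

  // Called when the VAD detects user speech. Returns true if it interrupted.
  onUserSpeech(): boolean {
    if (this.state === "listening") return false;
    this.abort?.abort(); // cancel the pending AI/TTS work
    this.abort = null;
    this.player.stop(); // cut off audio that is already playing
    this.generation += 1; // invalidate chunks still in flight
    this.state = "listening";
    this.onInterrupt();
    return true;
  }
}
```

The generation counter is a design choice for this sketch: it guards against the race where a TTS chunk arrives just after the user has started talking.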

3. Latency Optimization: Achieving conversational flow meant every millisecond counted. We used WebSocket streaming instead of REST calls, pre-buffered audio chunks, and optimized the Gemini prompt structure for faster responses.
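
The pre-buffering step can be illustrated with a tiny queue that holds back the first few streamed chunks before playback starts. This is only a sketch of the general technique: the `PreBuffer` class and the default of three chunks are assumptions, and the source does not say how many chunks Resonance buffers.

```typescript
// Hypothetical pre-buffer: waits for minChunks before releasing audio,
// then passes chunks through as they arrive.
class PreBuffer<T> {
  private queue: T[] = [];
  private started = false;
  private minChunks: number;

  constructor(minChunks = 3) {
    this.minChunks = minChunks;
  }

  // Add a chunk; returns the chunks that are ready to play (possibly none).
  push(chunk: T): T[] {
    this.queue.push(chunk);
    if (!this.started && this.queue.length < this.minChunks) return [];
    this.started = true;
    return this.queue.splice(0);
  }

  // End of stream: release whatever is left and reset for the next turn.
  flush(): T[] {
    this.started = false;
    return this.queue.splice(0);
  }
}
```

A larger `minChunks` trades a little extra startup delay for fewer audible gaps when the network stutters.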

4. Offline-First with Cloud AI: Balancing offline functionality with cloud-dependent AI services required graceful degradation, smart caching of context documents, and clear user feedback when features are limited.

5. Indonesian Language Support: Detecting Indonesian filler words ("eung", "anu", "uhm") required custom detection logic since most NLP tools focus on English.
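
A simple form of such custom detection is a per-language word list matched against the transcript. The sketch below is illustrative only: the Indonesian entries come from the examples above, while the English list, the `FILLER_WORDS` table, and the `countFillers` function are assumptions added for the example. It matches whole words after lowercasing and splits on anything outside a-z, which is an ASCII-only simplification.

```typescript
// Filler word lists keyed by language code.
// "id" entries are the examples named above; "en" entries are assumed.
const FILLER_WORDS: Record<string, string[]> = {
  id: ["eung", "anu", "uhm"],
  en: ["um", "uh"],
};

// Count how often each filler word appears in a transcript.
function countFillers(
  transcript: string,
  lang: string,
): Record<string, number> {
  const counts: Record<string, number> = {};
  const fillers = FILLER_WORDS[lang];
  if (!fillers) return counts; // unknown language: nothing to detect

  // Lowercase, then split on any run of non-letter characters.
  const words = transcript
    .toLowerCase()
    .split(/[^a-z]+/)
    .filter((word) => word.length > 0);

  for (const word of words) {
    if (fillers.indexOf(word) !== -1) {
      counts[word] = (counts[word] || 0) + 1;
    }
  }
  return counts;
}
```

Note that a plain word list cannot tell when "anu" is used as a real word rather than a filler, so a production detector would likely need context as well.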

Accomplishments that we're proud of

Sub-800ms end-to-end latency — From user speech to AI voice response, making conversations feel genuinely natural
Working Chaos Engine — Successfully simulating real-world disruptions that actually stress-test users
Voice cloning integration — Users can practice with cloned voices of specific personality types
Comprehensive analytics — Real-time metrics that provide actionable coaching feedback
Privacy-first design — All user data stays on-device, API keys encrypted, no cloud sync required
Bilingual support — Full Indonesian and English localization including filler word detection
Beautiful UI — Cyber-professional dark theme with the signature golden "Sun" orb visualizer

What we learned

VAD is harder than it looks — Ambient noise, microphone quality, and speaking styles vary wildly across devices and users
Latency is everything for voice apps — Even 200ms extra delay breaks the conversational illusion
State management complexity — Real-time audio + AI + UI animations required careful orchestration to avoid race conditions
Mobile constraints are real — Memory management, battery optimization, and background audio handling need constant attention
User feedback loops matter — Haptic feedback and visual indicators are crucial for users to understand system state during voice interactions

What's next for Resonance

Short-term:

iOS release via TestFlight
More scenario templates (job interviews, medical consultations, conflict resolution)
Team/enterprise features for corporate training programs
Integration with calendar apps for pre-meeting practice sessions

Long-term:

Multi-party conversations (conference call simulations)
AR mode with virtual avatar for body language training
Emotion recognition from voice to provide empathy coaching
API for third-party training content creators
Gamification expansion with leagues, challenges, and social features

Vision: We believe everyone deserves access to world-class communication coaching. Resonance aims to democratize what was previously only available through expensive executive coaches — making confident communication accessible to anyone with a smartphone.

Built with ☕ and determination for the future of communication training.

Built With

android
eas
elevenlabs-api
expo-router
expo.io
google-gemini-api
hermes
javascript
lottie
moti
nativewind
react-native
reanimated
sqlite
tailwind-css
websocket
zustand

Updates

Private user started this project — Dec 24, 2025 02:45 AM EST