EQ Gym 🥋 – The Social Flight Simulator
Inspiration
Why do we have gyms for our bodies, but nothing for our communication skills?
We all know that sinking feeling in our stomach before a tough conversation. Whether it's asking for a raise, breaking up with someone, or confronting a difficult coworker, most of us just... freeze. We replay the conversation in our heads a million times, but when the moment comes, anxiety takes over.
We realized that while pilots have flight simulators to practice high-stakes scenarios without crashing real planes, humans had no safe place to practice high-stakes conversations without ruining real relationships.
That's why we built EQ Gym. It's a safe space to fail, sweat, and get better at the hardest conversations of your life—before they actually happen.
What it does
EQ Gym is a real-time, voice-first training platform that simulates high-stress social scenarios. It's not a text chat; it's a living, breathing conversation.
- Hyper-Realistic AI Roleplay: You speak naturally to an AI persona (e.g., "Alex, the skeptical boss" or "Jordan, the distant partner"). The AI reacts not just to what you say, but how you say it. If you sound unsure, the "boss" might interrupt you. If you're empathetic, the "partner" softens up.
- Custom Scenarios: Have a specific problem? You can describe your exact situation (e.g., "My landlord is refusing to fix the leak"), and our system instantly generates a custom roleplay partner for that specific context.
- Gender-Aware Dynamics: The system intelligently adapts roleplay scenarios based on your gender to create the most realistic and immersive relationship dynamics possible.
- Real-Time Feedback: The AI responds with near-instant, conversational timing. It feels like talking to a human on the phone.
- Post-Game Analysis (The Gemini 3 Difference): This is where Gemini 3 shines. After the session, the advanced model analyzes the entire interaction to build a "Game Tape" breakdown—scores on empathy, clarity, and confidence, plus actionable advice on what you could have done better.
(A video demonstration of a live session will be added here shortly to show the real-time capabilities in action.)
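Under the hood, the post-game analysis boils down to prompting the analysis model for a structured scorecard and parsing its JSON reply. Here is a minimal sketch in Python; the field names and sample output are illustrative assumptions, not our actual prompt contract:

```python
import json
from dataclasses import dataclass


@dataclass
class GameTape:
    """Post-session scorecard. Field names are hypothetical examples."""
    empathy: int      # 0-100
    clarity: int      # 0-100
    confidence: int   # 0-100
    advice: list      # actionable tips extracted by the model


def parse_game_tape(raw: str) -> GameTape:
    """Parse the JSON scorecard the analysis model is prompted to return."""
    data = json.loads(raw)
    scores = data["scores"]
    return GameTape(
        empathy=scores["empathy"],
        clarity=scores["clarity"],
        confidence=scores["confidence"],
        advice=data["advice"],
    )


# Hypothetical model output, for illustration only:
sample = (
    '{"scores": {"empathy": 72, "clarity": 85, "confidence": 60},'
    ' "advice": ["Pause before responding to pushback."]}'
)
tape = parse_game_tape(sample)
```

Forcing the model into a fixed schema like this is what lets the "Game Tape" render as consistent scores rather than free-form prose.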
How we built it
We approached this with a "speed-first" mindset because voice conversations die if there's lag.
- Frontend: Next.js 16 (App Router) for a snappy, modern UI. We use the Web Audio API to capture high-fidelity audio directly from the browser at the device's native sample rate.
- Backend: FastAPI (Python) acting as a high-performance WebSocket bridge.
- The Brains (Gemini 2.5 & Gemini 3): To achieve the perfect balance of speed and intelligence, we use a dual-model architecture via Vertex AI:
- Live Conversations: Powered by Gemini 2.5 Flash Native Audio. This model is optimized for ultra-low latency, handling the real-time voice stream and emotional tone.
- EQ Coaching: Powered by Gemini 3 Flash Preview. After the session, the heavy lifting of deep psychological analysis and personalized advice is handled by the more advanced Gemini 3 model.
- Development Partner: The entire application was architected and built with AntiGravity, an advanced AI coding assistant that helped us iterate rapidly on complex real-time features.
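The dual-model architecture above reduces to a simple task-based dispatch: latency-critical voice traffic goes to the fast model, deep analysis goes to the smart one. A minimal sketch, using placeholder model IDs (the real Vertex AI identifiers may differ):

```python
# Route each request to the right Gemini model by task type.
# Model IDs below are placeholders, not verified Vertex AI identifiers.
MODEL_BY_TASK = {
    "live_voice": "gemini-2.5-flash-native-audio",  # real-time audio-to-audio
    "coaching": "gemini-3-flash-preview",           # post-session deep analysis
}


def pick_model(task: str) -> str:
    """Return the model ID for a task, failing loudly on unknown tasks."""
    if task not in MODEL_BY_TASK:
        raise ValueError(f"unknown task: {task!r}")
    return MODEL_BY_TASK[task]
```

Keeping the routing explicit like this made it easy to swap model versions during the hackathon without touching the streaming code.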
Challenges we ran into
- The "Robot Voice" Problem: Early on, standard text-to-speech models stripped away all the emotion. A breakup scenario sounded like a GPS navigation instruction. Switching to Gemini's native audio-to-audio capabilities was the breakthrough that made it feel "real."
- Contextual Roleplay: One of the hardest parts was "prompt engineering for personality." It wasn't enough to tell the AI "be a boss." We had to fine-tune the system prompts to make the AI interrupt users, be stubborn, or get annoyed when appropriate. We had to teach the AI to stop being a helpful assistant and start being a realistic human obstacle.
- Audio Pipeline Complexity: Browsers are messy. Microphone sample rates vary wildly (44.1kHz vs 48kHz). We had to build a robust resampling pipeline to ensure the audio stream was perfectly compatible with Gemini's strict input requirements without introducing latency.
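The resampling step can be sketched as simple linear interpolation. This is a rough sketch for illustration; a production pipeline (ours included) needs proper low-pass filtering to avoid aliasing when downsampling:

```python
def resample_linear(samples: list, src_rate: int, dst_rate: int) -> list:
    """Resample a mono float signal via linear interpolation.

    Illustrative only: real pipelines should use a polyphase/low-pass
    resampler to avoid aliasing artifacts.
    """
    if src_rate == dst_rate:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # fractional index in the source
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

For example, a 48 kHz browser capture shrinks by a factor of three when converted to a 16 kHz stream, so chunk sizes have to be tracked carefully on both sides of the WebSocket.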
Accomplishments that we're proud of
- Sub-500ms Latency: The conversation flows naturally. You can laugh, interrupt, and talk over each other, and it feels like a real phone call.
- It Actually Works: We've tested it with real users who reported genuine sweaty palms during the "Salary Negotiation" scenario. That physical stress response means the simulation is working.
- Integrated EQ Scoring: We didn't just build a chatbot; we built a coach. The feedback system (powered by Gemini 3) gives genuinely useful advice that users can apply immediately.
What we learned
- Tone is Data: We learned that in difficult conversations, how you say something matters 10x more than the words you use. Gemini's multimodal capabilities allowed us to treat "tone of voice" as a first-class data input.
- Voice interfaces are the future: Moving away from typing/reading to speaking/listening creates a completely different level of emotional engagement.
- Prompting is Programming: Creating a realistic "jerk boss" AI required as much engineering rigor as writing the backend code.
What's next for EQ Gym
- Visual Analysis: Using the camera to analyze facial expressions and body language during the call.
- Multiplayer Mode: Practicing group mediation or couples therapy scenarios with multiple humans and AI agents.
- Long-term Progress Tracking: A dashboard showing your "EQ Gains" over weeks of training.
Built with ❤️ (and a lot of coffee) for the Gemini 3 Hackathon 2026.
Built With
- fastapi
- gemini
- nextjs
- python
- react
- typescript