Inspiration
Improv builds confidence, spontaneity, and communication skills, but it's hard to practice alone. We saw thousands of people who wanted to rehearse scenes, explore characters, and grow as performers, but who lacked a patient partner available 24/7. By combining ElevenLabs Conversational AI with Gemini's coaching intelligence, we could finally give improvisers the practice space they've been missing.
What it does
YesAnd is your AI improv scene partner, entirely voice-driven and always ready to play. Speak your lines, and your AI partner responds naturally in character. While you perform, Gemini tracks the key story elements (who/what/where) live; after each scene, it analyzes your performance, highlights what worked, and delivers specific, actionable coaching. It's like having a personal improv coach in your pocket.
How we built it
We built YesAnd on Google Cloud with ElevenLabs Conversational AI handling real-time voice interaction and natural scene partnership. Scene transcripts stream to a backend where Gemini 2.5 Flash tracks scene elements, identifies breakthrough moments, and generates personalized coaching feedback. The frontend displays live scene state and post-scene analysis in an immersive, clean interface that keeps users focused on performance, not technology.
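To make the scene-tracking step concrete, here is a minimal sketch of how streamed transcript lines could be folded into a live scene state. It is an illustration under stated assumptions, not our production code: `SceneState`, `track_scene`, and `keyword_extract` are hypothetical names, and the keyword matcher is a toy stand-in for the Gemini call so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional


@dataclass
class SceneState:
    """Running picture of the scene; field names are illustrative."""
    who: Optional[str] = None
    what: Optional[str] = None
    where: Optional[str] = None
    unusual: list[str] = field(default_factory=list)


# Hypothetical extractor type: takes one transcript line and returns any
# scene elements found in it, e.g. {"where": "a submarine"}.
Extractor = Callable[[str], dict]


def track_scene(lines: Iterable[str], extract: Extractor) -> SceneState:
    """Fold per-line extraction results into a single scene state."""
    state = SceneState()
    for line in lines:
        update = extract(line)
        # Assumption: the first established who/what/where wins, because
        # later lines are expected to build on it rather than replace it.
        for key in ("who", "what", "where"):
            if getattr(state, key) is None and update.get(key):
                setattr(state, key, update[key])
        # Unusual elements accumulate, skipping duplicates.
        for item in update.get("unusual", []):
            if item not in state.unusual:
                state.unusual.append(item)
    return state


def keyword_extract(line: str) -> dict:
    """Toy stand-in for the model call, only here to keep the sketch runnable."""
    found: dict = {}
    lowered = line.lower()
    if "submarine" in lowered:
        found["where"] = "a submarine"
    if "captain" in lowered:
        found["who"] = "captain and crew member"
    if "leak" in lowered:
        found["what"] = "fixing a leak"
    if "goldfish" in lowered:
        found["unusual"] = ["goldfish navigator"]
    return found
```

Keeping the extractor as a plain callable means the same folding logic can be exercised with a stub in tests and with a real model call in the backend.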
Challenges we ran into
Balancing spontaneity with consistent character behavior required extensive prompt engineering. Making coaching feel actionable (not generic) meant designing structured output formats and iteratively refining Gemini's analysis prompts. The biggest technical hurdle was minimizing latency across speech recognition, AI reasoning, and audio playback to maintain the natural flow essential to improv.
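One simple way to see where a conversational turn spends its time is to wrap each stage in a timer. The sketch below is an assumption about how such measurement could look, not a description of our actual instrumentation: `TurnTimer` and the stage names are hypothetical, and the `time.sleep` calls stand in for real work.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


class TurnTimer:
    """Collects wall-clock durations for the named stages of one turn."""

    def __init__(self) -> None:
        self.stages: Dict[str, float] = {}

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Accumulate so a stage entered twice is counted in full.
            elapsed = time.perf_counter() - start
            self.stages[name] = self.stages.get(name, 0.0) + elapsed

    def total(self) -> float:
        """Sum of all recorded stage durations, in seconds."""
        return sum(self.stages.values())

    def slowest(self) -> str:
        """Name of the stage with the largest duration (needs >= 1 stage)."""
        return max(self.stages, key=lambda name: self.stages[name])


if __name__ == "__main__":
    timer = TurnTimer()
    with timer.stage("speech_recognition"):
        time.sleep(0.001)  # placeholder for real work
    with timer.stage("ai_reasoning"):
        time.sleep(0.05)   # placeholder for real work
    with timer.stage("audio_playback"):
        time.sleep(0.001)  # placeholder for real work
    print(f"slowest stage: {timer.slowest()}, total: {timer.total():.3f}s")
```

Measuring per stage rather than end to end shows which of the three hops is worth optimizing first.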
Accomplishments that we're proud of
We created a voice-first improv practice system that feels genuinely natural. Scene partners respond with personality and spontaneity in real time, while the coaching provides specific, memorable insights instead of vague encouragement. Our live scene tracking (who, what, where, unusual elements) works reliably and actively helps users learn the fundamentals of scene construction.
What we learned
Orchestrating ElevenLabs and Gemini together creates powerfully human interactions when done right. We discovered that effective improv feedback needs concrete structure: users grow most from examples, explanations of why choices worked, and small focused suggestions they can apply immediately. AI coaching works best when it mirrors how great human coaches actually teach.
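The feedback structure described above (a concrete example, why it worked, one small suggestion) maps naturally onto a typed record. Here is a minimal sketch, assuming the model is asked to return a JSON list of notes; `CoachingNote`, `parse_coaching`, the field names, and the cap of three notes are all hypothetical choices for illustration.

```python
import json
from dataclasses import dataclass

FIELDS = ("moment", "why_it_worked", "suggestion")
MAX_NOTES = 3  # assumed cap, so feedback stays small and focused


@dataclass(frozen=True)
class CoachingNote:
    moment: str         # a concrete line or beat from the scene
    why_it_worked: str  # the explanation behind the choice
    suggestion: str     # one small thing to try next time


def parse_coaching(raw: str) -> list[CoachingNote]:
    """Validate model output that is expected to be a JSON list of notes."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list of coaching notes")
    notes = []
    for item in data[:MAX_NOTES]:
        if not isinstance(item, dict):
            raise ValueError("each coaching note must be a JSON object")
        values = {}
        for key in FIELDS:
            value = item.get(key)
            # Reject vague or partial notes instead of showing them to users.
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"note field {key!r} must be a non-empty string")
            values[key] = value.strip()
        notes.append(CoachingNote(**values))
    return notes
```

Rejecting a note with a missing field, rather than rendering it half-empty, is one way to keep generic encouragement from slipping through.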
What's next for YesAnd
We're adding scene replays with emotional arc visualization, deeper partner personalities you can customize, skill-specific drills, and progress tracking. Next, we'll explore multi-character scenes, collaborative practice modes where multiple users can join, and mobile apps to make improv training truly accessible anywhere. Our goal: democratize performance training for everyone, everywhere.

