🚀 Inspiration

High-stakes communication like job interviews, investor pitches, or even thesis defenses is notoriously difficult to prepare for.

🔹 Traditional methods like reading notes, rehearsing alone, or asking friends for feedback often fail because:

  • They don’t simulate real pressure or dynamic questioning.
  • Feedback is subjective, shallow, or inconsistent.
  • There’s no way to measure progress over time.

😰 As a result, many candidates walk into these moments underprepared, deliver weak responses, and miss career-changing opportunities.

We were inspired to build Sozo Pitch Helper, a tool that acts as a personal, AI-powered sparring partner. It creates a realistic practice environment where users don’t just rehearse; they get objective, data-driven coaching that compounds over time.


🤖 What it does

Sozo Pitch Helper is an AI training platform that transforms preparation into a science.

  1. 📄 Context Ingestion: Users upload a job description, pitch deck, or research paper.
  2. 🧠 Contextual Awareness: The backend uses Gemini to classify the task (interview, pitch, or defense), extract key points, and generate a short description. This context is prepared so the AI panel asks questions directly relevant to the user’s scenario.
  3. 🎙️ Dynamic Simulation: Users engage in a real-time session with a multi-voice AI panel (Eric, Daniel, and Rachel). Each persona represents a distinct style of questioning, adapting to answers and pressing for detail, just like a real hiring manager, venture capitalist, or examiner would.
  4. 📊 Data-Driven Feedback: After each session, users receive a detailed performance analysis scored on four metrics:
    • 🗣️ Communication: Clarity, confidence, and conciseness of speech, because how you say something is as important as what you say.
    • 🧠 Content Mastery: Subject knowledge and the ability to support claims logically with evidence, showing that you know your material under pressure.
    • 🤝 Engagement & Delivery: Tone, pacing, and audience awareness, measuring whether you persuade rather than just recite facts.
    • 💪 Resilience: Composure when challenged and the ability to think clearly under pressure, often the deciding factor in high-stakes settings.
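
To make step 2 more concrete, here is a minimal sketch of how a backend might validate the classification result before briefing the panel. It assumes the model is asked to reply in JSON; the field names (task, key_points, description) and the parse_context helper are hypothetical and are not taken from the project's code.

```python
import json
from dataclasses import dataclass

# The three scenario types named in the write-up.
ALLOWED_TASKS = {"interview", "pitch", "defense"}


@dataclass
class SessionContext:
    task: str
    key_points: list[str]
    description: str


def parse_context(raw_json: str) -> SessionContext:
    """Validate a model reply (assumed to be JSON) into a session context."""
    data = json.loads(raw_json)
    task = str(data.get("task", "")).strip().lower()
    if task not in ALLOWED_TASKS:
        # Reject anything outside the three known scenarios.
        raise ValueError(f"unexpected task type: {task!r}")
    # Drop empty or whitespace-only key points.
    key_points = [str(p).strip() for p in data.get("key_points", []) if str(p).strip()]
    description = str(data.get("description", "")).strip()
    return SessionContext(task, key_points, description)
```

Validating the reply at this boundary means a malformed or off-label classification fails early instead of producing an irrelevant panel briefing.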

✅ The result is not just practice; it’s targeted, intelligent coaching that builds confidence, clarity, and measurable improvement.


🛠️ How we built it

We chose a modern, decoupled architecture for security and scalability.

  • Frontend: A responsive SPA built with React, Vite, and Tailwind CSS.
  • Backend: A stateless API server in Python (Flask) hosted on Hugging Face Spaces.
  • Database & Auth: Firebase Authentication for secure sign-ins and Firebase Realtime Database for user data, credits, and history.
  • Conversational AI: Powered by the ElevenLabs Agents API, enabling natural multi-voice conversations with Eric, Daniel, and Rachel.
  • Analytical AI: The backend uses the Google Gemini API for summarization, task classification, feedback scoring, and memory.

🏗️ Architecture

[Architecture diagram: Sozo Pitch Helper]

🤯 Challenges we ran into

  • Fair AI Scoring: Early prompts gave perfect scores for trivial answers, then became too strict. We solved this with a "Grader on a Curve" rubric that rewards effort while keeping feedback accurate.
  • User Identification: The AI confused names in role-play (like “Eric” or “Daniel”) with the actual user. Explicitly passing the user’s name into the prompt fixed this.
  • Gemini SDK Integration: Incorrect assumptions about the google-genai library caused repeated crashes. The fix was to strictly follow official documentation.
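
The user-identification fix can be illustrated with a small prompt builder. This is a sketch under assumptions: the build_analysis_prompt function and the exact wording are hypothetical, and only the idea of passing the user's name explicitly (and naming the panel personas) comes from the fix described above.

```python
# Persona names from the write-up; everything else here is illustrative.
PANEL_NAMES = ("Eric", "Daniel", "Rachel")


def build_analysis_prompt(user_name: str, transcript: str) -> str:
    """Build a scoring prompt that states who the user is and who is not."""
    user_name = user_name.strip()
    if not user_name:
        raise ValueError("user_name is required")
    panel = ", ".join(PANEL_NAMES)
    return (
        f"The person being evaluated is {user_name}.\n"
        f"{panel} are AI panel members, not the user; do not score them.\n"
        "Score only the user's answers.\n\n"
        f"Transcript:\n{transcript}"
    )
```

Stating the roles at the top of the prompt removes the ambiguity that arises when persona names appear in the transcript.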

🎙️ Conversational AI Integration

  • Getting AI agents to respond naturally in real-time was harder than expected.
  • Handling interruptions, context-switching, and smooth back-and-forth required careful orchestration.
  • Multi-voice turn-taking had to be handled through the system prompt in ElevenLabs.

🌐 Browser & Mic Permissions

  • Different browsers handle microphone access in slightly different ways.
  • Ensuring a smooth, one-click setup without scaring users with security popups was a delicate balance.

⏹️ Session Termination & Control

  • Users needed the ability to end sessions instantly.
  • Managing cleanup of active streams, freeing resources, and properly logging transcripts was more complex than anticipated.
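
One way to think about the cleanup problem is as an idempotent "end" operation that runs every teardown step even if one of them fails. The sketch below shows that pattern in Python for consistency with the backend; the Session class and the step names are hypothetical, and the real stream handling may well live in the browser.

```python
from typing import Callable


class Session:
    """Holds teardown steps and guarantees they run at most once."""

    def __init__(self, cleanups: list[Callable[[], None]]):
        self._cleanups = cleanups
        self.ended = False
        self.errors: list[Exception] = []

    def end(self) -> bool:
        """Run every cleanup step once; return False if already ended."""
        if self.ended:
            return False
        self.ended = True
        for step in self._cleanups:
            try:
                step()
            except Exception as exc:
                # Record the failure and keep going so later steps still run.
                self.errors.append(exc)
        return True
```

With this shape, a double click on "End session" or a failed stream close does not prevent the transcript from being logged.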

⚙️ Scalability & Tracking

  • Tuning the “Grader on a Curve” scoring system took many iterations.
  • Every call session required credit tracking, transcripts, and performance logging in Firebase.
  • Keeping this accurate while minimizing server load introduced tricky edge cases.
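
One of the edge cases in credit tracking is two requests spending the same credit. A minimal sketch of a guard against that is a pure update function that either returns the new balance or refuses. The spend_credit name and the one-credit-per-session cost are assumptions; the write-up does not say which Firebase SDK the backend uses, but if it were the Firebase Admin SDK for Python, a function of this shape could be passed to a Realtime Database transaction so the read and write happen atomically.

```python
from typing import Optional


class InsufficientCredits(Exception):
    """Raised when the user cannot afford the session."""


def spend_credit(current: Optional[int], cost: int = 1) -> int:
    """Return the balance after spending `cost`, or raise if it is too low.

    `current` may be None when no balance has been stored yet.
    """
    balance = current or 0
    if balance < cost:
        raise InsufficientCredits(f"need {cost}, have {balance}")
    return balance - cost
```

Keeping the rule in one small function makes it easy to test without a database and to reuse wherever credits are deducted.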

Despite these hurdles, each challenge shaped the product into something more robust and user-friendly, ensuring a smoother, more realistic experience for users preparing for their big moments. 🚀


🏆 Accomplishments that we're proud of

  • The AI Memory Engine: Our breakthrough feature. Before each session, Gemini analyzes the user’s past performance, identifies weaknesses, and generates a one-sentence directive for the AI panel. This ensures continuity and coaching that actively targets weaknesses. If you struggled with financial projections last time, you’ll be pressed on them again.
  • Multi-Step AI Orchestration: One AI model (Gemini) powers a full pipeline of tasks: document summarization, scenario classification, dynamic AI panel briefing, and performance analysis.
  • Building a Secure & Modern Stack: Delivered a full, production-ready application with real-time data, authentication, and robust AI integrations.
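
The memory engine idea can be sketched without any model call: summarize past scores, pick the weakest area, and build the request for a one-sentence directive. This is an illustration under assumptions, not the project's implementation. The metric keys, the use of a simple average, and both function names are hypothetical; in the real system Gemini does the analysis.

```python
# Keys for the four metrics described in the write-up (names are illustrative).
METRICS = ("communication", "content_mastery", "engagement_delivery", "resilience")


def weakest_metric(history: list[dict[str, float]]) -> str:
    """Return the metric with the lowest average score across past sessions."""
    if not history:
        raise ValueError("history is empty")
    averages = {m: sum(s[m] for s in history) / len(history) for m in METRICS}
    # On a tie, the metric listed first in METRICS wins.
    return min(METRICS, key=lambda m: averages[m])


def build_memory_prompt(history: list[dict[str, float]]) -> str:
    """Build the request for a one-sentence directive for the AI panel."""
    metric = weakest_metric(history)
    return (
        f"Across past sessions, the user's lowest average score is in {metric}. "
        "Write one sentence instructing the panel to press on this area."
    )
```

The resulting prompt would be sent to the model before the session starts, and the one-sentence reply passed on to the panel.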

🧠 What we learned

  • Prompt Engineering is Iterative: Scoring and coaching quality improved only through multiple cycles of testing and refinement.
  • Documentation is King: Our toughest bugs were solved by carefully revisiting official SDK documentation.
  • Decoupling is Power: Keeping Gemini on the backend means we can upgrade prompts, scoring logic, and the memory engine without touching the frontend.

🔮 What's next for Sozo Pitch Helper

  • Contextual Research Feature: Use the project’s short description to fetch web context, enriching AI panel knowledge.
  • Visual Progress Tracking: Build a dashboard to graph performance across the four metrics over time.
  • Custom AI Video Personas: Let users choose interviewer styles (e.g., Friendly & Encouraging vs Skeptical & Direct) to broaden practice scenarios.
