Sozo Pitch Helper

🚀 Inspiration

High-stakes communication like job interviews, investor pitches, or even thesis defenses is notoriously difficult to prepare for.

🔹 Traditional methods like reading notes, rehearsing alone, or asking friends for feedback often fail because:

They don’t simulate real pressure or dynamic questioning.
Feedback is subjective, shallow, or inconsistent.
There’s no way to measure progress over time.

😰 As a result, many candidates walk into these moments underprepared, deliver weak responses, and miss career-changing opportunities.

We were inspired to build Sozo Pitch Helper, a tool that acts as a personal, AI-powered sparring partner. It creates a realistic practice environment where users don’t just rehearse; they get objective, data-driven coaching that compounds over time.

🤖 What it does

Sozo Pitch Helper is an AI training platform that transforms preparation into a science.

📄 Context Ingestion: Users upload a job description, pitch deck, or research paper.
🧠 Contextual Awareness: The backend uses Gemini to classify the task (interview, pitch, or defense), extract key points, and generate a short description. This context is prepared so the AI panel asks questions directly relevant to the user’s scenario.
🎙️ Dynamic Simulation: Users engage in a real-time session with a multi-voice AI panel (Eric, Daniel, and Rachel). Each persona represents a distinct style of questioning, adapting to answers and pressing for detail, just like a real hiring manager, venture capitalist, or examiner would.
📊 Data-Driven Feedback: After each session, users receive a detailed performance analysis scored on four metrics:
- 🗣️ Communication: Clarity, confidence, and conciseness of speech. Because how you say something is as important as what you say.
- 🧠 Content Mastery: Subject knowledge and the ability to logically support claims with evidence — proof that you truly know your material under pressure.
- 🤝 Engagement & Delivery: Tone, pacing, and audience awareness — measuring if you persuade rather than just recite facts.
- 💪 Resilience: Composure when challenged, tracking how well you think clearly under pressure. Often the deciding factor in high-stakes settings.

✅ The result is not just practice but targeted, intelligent coaching that builds confidence, clarity, and measurable improvement.
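To make the Contextual Awareness step more concrete, here is a minimal sketch of how the classification request and its reply could be handled. It is not our production code: the JSON keys (task_type, key_points, description), the SessionContext class, and both function names are assumptions made for illustration, and the actual Gemini call is left out so the sketch stays self-contained.

```python
import json
from dataclasses import dataclass

# The three scenarios named in the write-up.
TASK_TYPES = ("interview", "pitch", "defense")


@dataclass
class SessionContext:
    """Hypothetical container for what the panel needs to know."""
    task_type: str
    key_points: list[str]
    description: str


def build_classification_prompt(document_text: str) -> str:
    """Ask the model for a JSON object describing the uploaded document."""
    return (
        "Classify the document below as one of: "
        + ", ".join(TASK_TYPES)
        + ".\nReturn only JSON with the keys task_type, key_points "
        "(a list of strings) and description (one or two sentences).\n\n"
        "Document:\n" + document_text
    )


def parse_classification(raw_reply: str) -> SessionContext:
    """Validate the model's reply before it reaches the AI panel."""
    data = json.loads(raw_reply)
    task_type = str(data.get("task_type", "")).strip().lower()
    if task_type not in TASK_TYPES:
        raise ValueError(f"unexpected task type: {task_type!r}")
    # Drop empty or whitespace-only key points.
    key_points = [str(p).strip() for p in data.get("key_points", []) if str(p).strip()]
    description = str(data.get("description", "")).strip()
    return SessionContext(task_type, key_points, description)
```

Validating the reply on the backend means a malformed or off-topic classification fails early instead of producing an irrelevant panel briefing.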

🛠️ How we built it

We chose a modern, decoupled architecture for security and scalability.

Frontend: A responsive SPA built with React, Vite, and Tailwind CSS.
Backend: A stateless API server in Python (Flask) hosted on Hugging Face Spaces.
Database & Auth: Firebase Authentication for secure sign-ins and Firebase Realtime Database for user data, credits, and history.
Conversational AI: Powered by the ElevenLabs Agents API, enabling natural multi-voice conversations with Eric, Daniel, and Rachel.
Analytical AI: The backend uses the Google Gemini API for summarization, task classification, feedback scoring, and memory.

🏗️ Architecture

🤯 Challenges we ran into

Fair AI Scoring: Early prompts gave perfect scores for trivial answers, then became too strict. We solved this with a "Grader on a Curve" rubric that rewards effort while keeping feedback accurate.
User Identification: The AI confused names in role-play (like “Eric” or “Daniel”) with the actual user. Explicitly passing the user’s name into the prompt fixed this.
Gemini SDK Integration: Incorrect assumptions about the google-genai library caused repeated crashes. The fix was to strictly follow official documentation.
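The first two fixes above both come down to prompt construction. The sketch below shows one way a scoring prompt could name the user explicitly and separate them from the panel. The wording of the prompt and the function name build_scoring_prompt are illustrative assumptions, not the prompt we shipped.

```python
# Persona and metric names come from the write-up; everything else is illustrative.
PANEL_NAMES = ("Eric", "Daniel", "Rachel")
METRICS = ("Communication", "Content Mastery", "Engagement & Delivery", "Resilience")


def build_scoring_prompt(user_name: str, transcript: str) -> str:
    """Build a grading prompt that tells the model who is being graded."""
    user_name = user_name.strip()
    if not user_name:
        raise ValueError("user_name is required so the model can tell the user from the panel")
    return "\n".join([
        f"You are grading a practice session for {user_name}.",
        f"The AI panelists are {', '.join(PANEL_NAMES)}; never treat them as the person being graded.",
        f"Score only {user_name}'s answers on: {', '.join(METRICS)}.",
        # Spirit of the "Grader on a Curve" idea: reward effort, keep scores honest.
        "Reward genuine effort, but do not give full marks to trivial or one-line answers.",
        "",
        "Transcript:",
        transcript,
    ])
```

Refusing to build a prompt without a user name turns the identification bug into an immediate error rather than a confusing score.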

🎙️ Conversational AI Integration

Getting AI agents to respond naturally in real-time was harder than expected.
Handling interruptions, context-switching, and smooth back-and-forth required careful orchestration.
Coordinating turn-taking across the three voices also required a dedicated system prompt in ElevenLabs.

🌐 Browser & Mic Permissions

Different browsers handle microphone access in slightly different ways.
Ensuring a smooth, one-click setup without scaring users with security popups was a delicate balance.

⏹️ Session Termination & Control

Users needed the ability to end sessions instantly.
Managing cleanup of active streams, freeing resources, and properly logging transcripts was more complex than anticipated.
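As a rough illustration of the cleanup problem, the sketch below uses Python's contextlib.ExitStack so that every registered resource is released exactly once, even when saving the transcript fails. The PracticeSession class and its methods are hypothetical; the real stream and storage objects are not shown here.

```python
from contextlib import ExitStack


class PracticeSession:
    """Hypothetical session wrapper used only to illustrate cleanup ordering."""

    def __init__(self, save_transcript):
        self._save_transcript = save_transcript  # callable that persists the transcript
        self._cleanup = ExitStack()
        self.transcript = []
        self.active = True

    def register(self, close_fn):
        """Register a resource's close function (audio stream, socket, ...)."""
        self._cleanup.callback(close_fn)

    def add_turn(self, speaker, text):
        """Record a turn; turns arriving after the session ended are ignored."""
        if self.active:
            self.transcript.append((speaker, text))

    def end(self):
        """Idempotent: safe to call from a button click, a timeout, and a disconnect."""
        if not self.active:
            return
        self.active = False
        try:
            self._save_transcript(list(self.transcript))
        finally:
            # Runs the registered callbacks in reverse order, even if saving failed.
            self._cleanup.close()
```

Making end() idempotent matters because an "end session" click, a network drop, and a timeout can all fire for the same session.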

⚙️ Scalability & Tracking

Arriving at the right “Grader on a Curve” scoring system took many iterations.
Every call session required credit tracking, transcripts, and performance logging in Firebase.
Keeping this accurate while minimizing server load introduced tricky edge cases.
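One of those edge cases is two requests trying to spend the same credit. A common way to handle this is to express the deduction as a pure function of the current balance and hand it to the database client's transactional update. The sketch below shows only that pure function; its name, the default cost of one credit, and the treatment of a missing balance as zero are assumptions for illustration.

```python
def deduct_credit(current_credits, cost=1):
    """Return the new balance, or raise if the session cannot be paid for.

    Written as a pure function of the current value so it can be retried
    safely by a transactional update.
    """
    if cost <= 0:
        raise ValueError("cost must be positive")
    balance = int(current_credits or 0)  # a missing value reads as None
    if balance < cost:
        raise ValueError("insufficient credits")
    return balance - cost
```

Because the function has no side effects, retrying it after a conflicting write cannot double-charge the user.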

Despite these hurdles, each challenge shaped the product into something more robust and user-friendly, ensuring a smoother, more realistic experience for users preparing for their big moments. 🚀

🏆 Accomplishments that we're proud of

The AI Memory Engine: Our breakthrough feature. Before each session, Gemini analyzes the user’s past performance, identifies weaknesses, and generates a one-sentence directive for the AI panel. This ensures continuity and coaching that actively targets weaknesses: if you struggled with financial projections last time, you’ll be pressed on them again.
Multi-Step AI Orchestration: One AI model (Gemini) powers a full pipeline of tasks: document summarization, scenario classification, dynamic AI panel briefing, and performance analysis.
Building a Secure & Modern Stack: Delivered a full, production-ready application with real-time data, authentication, and robust AI integrations.
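The Memory Engine idea can be sketched in a few lines. Note that this version is a simplification: it finds the weakest metric locally by averaging past scores and only then builds the request for a one-sentence directive, whereas in our pipeline Gemini performs the analysis itself. The metric keys, function names, and prompt wording are assumptions for illustration.

```python
from statistics import mean

# Snake-case keys for the four metrics; the key names are an assumption.
METRICS = ("communication", "content_mastery", "engagement_delivery", "resilience")


def weakest_metric(history):
    """history: list of dicts mapping each metric name to a numeric score."""
    if not history:
        return None
    averages = {m: mean(session[m] for session in history) for m in METRICS}
    return min(averages, key=averages.get)


def build_directive_prompt(user_name, history):
    """Build the request for a one-sentence panel directive, or None on a first session."""
    weak = weakest_metric(history)
    if weak is None:
        return None  # nothing to remember yet
    return (
        f"{user_name} has completed {len(history)} practice session(s). "
        f"Their lowest average score is in {weak.replace('_', ' ')}. "
        "Write one sentence instructing the panel to press on this weakness."
    )
```

The returned prompt would be sent to the model before the session starts, and the single sentence it produces is what gets passed to the panel.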

🧠 What we learned

Prompt Engineering is Iterative: Scoring and coaching quality improved only through multiple cycles of testing and refinement.
Documentation is King: Our toughest bugs were solved by carefully revisiting official SDK documentation.
Decoupling is Power: Keeping Gemini on the backend means we can upgrade prompts, scoring logic, and the memory engine without touching the frontend.

🔮 What's next for Sozo Pitch Helper

Contextual Research Feature: Use the project’s short description to fetch web context, enriching AI panel knowledge.
Visual Progress Tracking: Build a dashboard to graph performance across the four metrics over time.
Custom AI Video Personas: Let users choose interviewer styles (e.g., Friendly & Encouraging vs Skeptical & Direct) to broaden practice scenarios.

Built With

elevenlabs
firebase
flask
gemini
huggingface
lovable
netlify
react

Updates

Rairo Mukamuri started this project — Sep 06, 2025 11:16 AM EDT
