Inspiration
Public speaking is one of the most critical skills for students and professionals, yet it’s rarely practiced in a structured, feedback-driven way.
At the same time, we’re seeing a shift where students increasingly interact with AI through text rather than developing real-world communication skills. This creates a gap: people can generate ideas with AI, but struggle to present them confidently.
We built PitchCoach.AI to bridge that gap—bringing real-time, AI-powered coaching into everyday pitch practice.
Problem Statement - Lack of Speech Skills Training for University Students
University students today rely heavily on AI for writing and ideation, but this has unintentionally reduced opportunities to develop verbal communication and presentation skills.
This gap has serious real-world consequences:
❌ Lower job placement rates due to poor interview performance
❌ Difficulty clearly communicating ideas during technical or behavioral interviews
❌ Weak networking ability and reduced confidence in professional settings
❌ Struggles in building strong human business relationships
❌ Reduced effectiveness in teamwork, leadership, and collaboration
In many cases, students may have strong technical skills but fail to express them effectively, leading to missed opportunities.
At the same time:
❌ There is limited access to structured pitch or speaking training
❌ Students practicing alone receive little to no actionable feedback
❌ Over-reliance on text-based AI reduces real speaking practice
There is a clear need for a system that helps students practice speaking, improve confidence, and communicate effectively in real-world scenarios—not just generate content.
What it does
PitchCoach.AI is an AI-powered pitch training platform that analyzes how you speak and present—not just what you say.
Users can:
- Record a pitch or presentation
- Receive AI-generated feedback on clarity, pacing, and structure
- Get posture and body language analysis using pose detection
- Hear feedback through text-to-speech for a realistic coaching experience
The platform transforms solo practice into a guided, interactive coaching session—like having a personal pitch coach available anytime.
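To make "pacing" concrete, here is a minimal sketch of how a pacing score could be derived from a transcript and the recording length. The function name `pacing_metrics`, the filler-word list, and the returned fields are illustrative assumptions, not the project's actual code.

```python
import re

# Assumed filler-word list for illustration; a real coach would tune this.
FILLER_WORDS = {"um", "uh", "like", "basically", "actually"}


def pacing_metrics(transcript: str, duration_seconds: float) -> dict:
    """Compute simple pacing statistics from a transcript and its duration."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Lowercase and split into word tokens, ignoring punctuation.
    words = re.findall(r"[a-z']+", transcript.lower())
    filler_count = sum(1 for word in words if word in FILLER_WORDS)
    words_per_minute = len(words) / (duration_seconds / 60)
    return {
        "word_count": len(words),
        "words_per_minute": round(words_per_minute, 1),
        "filler_count": filler_count,
    }


metrics = pacing_metrics("Um, our app helps students, like, practice pitches.", 6.0)
print(metrics)
```

Numbers like these could be passed to the LLM alongside the transcript so its feedback on pacing is grounded in a measurement rather than in the text alone.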
How we built it
We combined multiple AI and web technologies to create a full feedback loop:
Frontend: HTML + CSS
Backend: FastAPI for processing and analysis
Speech Analysis: Transcription + LLM-based feedback generation
Pose Detection: MediaPipe to track posture and movement
Voice Metrics Data: Browser APIs (getUserMedia, MediaRecorder) to capture the recording, with the Python backend converting the video to WAV for voice-metric analysis
AI Feedback Engine: DigitalOcean Gradient AI running the Mistral LLM (mistral-nemo-instruct-2407)
Voice Feedback: Text-to-Speech (TTS) integration
Data Handling: Optimized pose frame sampling (reduced frequency for efficiency)
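The video-to-WAV step in the list above could look roughly like the sketch below. It assumes the `ffmpeg` command-line tool is installed on the backend, which the write-up does not state; the function names, file names, and the 16 kHz mono target are likewise illustrative choices.

```python
import subprocess
from pathlib import Path


def build_wav_command(video_path: Path, wav_path: Path, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that extracts mono 16-bit PCM audio from a video."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(video_path),    # input recording (e.g. WebM from MediaRecorder)
        "-vn",                    # drop the video stream
        "-ac", "1",               # mix down to one audio channel
        "-ar", str(sample_rate),  # resample to the target rate
        "-c:a", "pcm_s16le",      # uncompressed 16-bit little-endian PCM
        str(wav_path),
    ]


def convert_to_wav(video_path: Path, wav_path: Path) -> Path:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_wav_command(video_path, wav_path), check=True, capture_output=True)
    return wav_path


print(" ".join(build_wav_command(Path("pitch.webm"), Path("pitch.wav"))))
```

Keeping the command builder separate from the call that runs it makes the conversion settings easy to inspect and test without ffmpeg present.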
We also implemented:
🔄 Real-time processing indicators for better UX
📉 Data optimization to reduce backend load from pose streams
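As an illustration of the pose-stream optimization, the sketch below keeps at most one pose frame per time interval and groups the survivors into batches before they are sent to the backend. The frame shape (`{"t_ms": ...}`), the 500 ms interval, and the batch size are assumptions chosen for the example, not the values used in the project.

```python
def downsample_by_time(frames: list[dict], min_interval_ms: int) -> list[dict]:
    """Keep a frame only if at least min_interval_ms passed since the last kept frame."""
    kept = []
    last_kept_ms = None
    for frame in frames:
        if last_kept_ms is None or frame["t_ms"] - last_kept_ms >= min_interval_ms:
            kept.append(frame)
            last_kept_ms = frame["t_ms"]
    return kept


def batch(items: list, size: int) -> list[list]:
    """Split items into consecutive chunks of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


# Roughly 3 seconds of pose frames at about 30 fps (one frame every 33 ms).
raw_frames = [{"t_ms": i * 33} for i in range(90)]
sampled = downsample_by_time(raw_frames, min_interval_ms=500)
batches = batch(sampled, size=4)
print(len(raw_frames), len(sampled), len(batches))
```

Sampling by elapsed time rather than by frame count keeps the kept frames evenly spaced even when the camera frame rate fluctuates.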
Challenges we ran into
⚡ Heavy data load from pose tracking → Solved by reducing frame frequency and batching data
🔊 TTS API issues (voice errors, subscription limits) → Required fallback strategies and model updates
⏳ User experience during processing delays → Added loading animations to keep users engaged
🎯 Balancing multi-modal feedback (speech + posture) → Needed careful prompt engineering to generate meaningful insights
🎥 Video/audio synchronization challenges → Ensured consistent timing between transcript and pose data
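One way to keep transcript and pose data on the same timeline is to timestamp both relative to the start of the recording and then look up the pose samples that fall inside each transcript segment. The sketch below assumes segments carry `start_ms` and `end_ms`, pose samples carry `t_ms` and arrive sorted by time, and `align_segments_with_poses` is a hypothetical helper name.

```python
from bisect import bisect_left


def align_segments_with_poses(segments: list[dict], poses: list[dict]) -> list[dict]:
    """Attach to each transcript segment the pose samples in [start_ms, end_ms)."""
    pose_times = [pose["t_ms"] for pose in poses]  # must be sorted ascending
    aligned = []
    for segment in segments:
        first = bisect_left(pose_times, segment["start_ms"])
        last = bisect_left(pose_times, segment["end_ms"])  # end is exclusive
        aligned.append({"text": segment["text"], "poses": poses[first:last]})
    return aligned


poses = [{"t_ms": t} for t in (0, 500, 1000, 1500, 2000)]
segments = [
    {"text": "Hi, we are PitchCoach.", "start_ms": 0, "end_ms": 1000},
    {"text": "We help students practice.", "start_ms": 1000, "end_ms": 2500},
]
for item in align_segments_with_poses(segments, poses):
    print(item["text"], [pose["t_ms"] for pose in item["poses"]])
```

With segments and poses paired this way, a prompt can refer to posture during a specific sentence instead of describing speech and posture separately.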
Accomplishments that we're proud of
- Built a real-time AI pitch coach combining voice + body language
- Successfully integrated multi-modal feedback (speech + pose)
- Optimized performance to handle heavy input data efficiently
- Created a focused use case (elevator pitch training) that is practical and impactful
- Delivered an experience that feels like a personal AI coach, not just analytics
What we learned
- Multi-modal AI (speech + vision) creates much richer feedback than text alone
- UX matters a lot: users need feedback quickly and clearly
- Real-time systems require careful optimization and trade-offs
- AI is most powerful when it enhances human skills, not replaces them
What's next for PitchCoach.AI
- Add progress tracking and improvement analytics over time
- Enable peer review + collaborative practice sessions
- Expand to interview coaching and presentation training
- Build a mobile app for on-the-go practice
- Improve AI feedback with personalized coaching styles
- Deploy scalable infrastructure for real-world student adoption
Built With
- css
- digitalocean
- elevenlabs
- html
- javascript
- mediapipe
- mistral
- mongodb
- python
