Inspiration

Most people practice presentations alone with zero feedback. Professional coaches are expensive and inaccessible. We wanted to change that.

What it does

Upload or record a presentation, attach your slides and notes, and get a full AI coaching report. We analyze filler words, pacing, eye contact, posture, and gestures. Everything shows up as a clickable annotated timeline synced to your video, with a structured coach panel and a chatbot for follow-up questions.
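To make the filler-word and pacing analysis concrete, here is a minimal Python sketch. It assumes word-level timestamps are already available from the transcript; the `Word` type, the `FILLERS` list, and both function names are hypothetical and not taken from the project.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical filler list; the write-up does not say which words the project tracks.
FILLERS = {"um", "uh", "like", "basically", "actually"}


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the video
    end: float


def filler_events(words: List[Word]) -> List[Tuple[float, str]]:
    """Return (timestamp, word) pairs that could be pinned to a timeline."""
    return [
        (w.start, w.text)
        for w in words
        if w.text.lower().strip(".,!?") in FILLERS
    ]


def words_per_minute(words: List[Word]) -> float:
    """Average speaking rate over the span covered by the transcript."""
    if not words:
        return 0.0
    duration = words[-1].end - words[0].start
    return len(words) / duration * 60 if duration > 0 else 0.0
```

Each filler event carries its own timestamp, which is what a click-to-seek annotation needs.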

Powered By Google

  • Google Cloud Speech-to-Text
  • Google Gemini 2.5 Flash
  • Google Gemini 2.5 Pro
  • Google AI Edge - MediaPipe
  • Google Cloud Platform
  • Google AI Studio

How we built it

The frontend is React, TypeScript, and MUI; the backend is Django REST Framework. Video analysis runs through a two-phase Celery background queue. Phase one analyzes audio with Google Cloud Speech-to-Text and OpenSMILE, and facial expressiveness and body language with MediaPipe. Phase two runs a LangGraph coach that synthesizes everything into structured feedback. Chat streams to the browser over SSE.
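As an illustration of the SSE chat stream, the sketch below frames chat tokens in the `text/event-stream` wire format. The `delta` payload key and the final `done` event are assumptions made for this example, not the project's actual protocol.

```python
import json
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one SSE frame per chat token, then a closing frame.

    Each frame is a "data:" line followed by a blank line, which is how
    the browser's EventSource separates messages.
    """
    for token in tokens:
        # JSON-encode so newlines inside a token cannot break the framing.
        yield f"data: {json.dumps({'delta': token})}\n\n"
    # Hypothetical end-of-stream marker so the client knows when to stop.
    yield "event: done\ndata: {}\n\n"
```

In a Django view, a generator like this could be passed to `StreamingHttpResponse` with `content_type="text/event-stream"`.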

Challenges we ran into

Syncing two async phases with a frontend that had to reflect every intermediate state without blocking the user was the hardest problem. Getting HTTP range requests working for large video files so scrubbing actually worked took longer than expected. Coordinating three people across frontend, backend, and ML under time pressure meant making fast, hard calls about what to stub and what to ship.
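For context on the range-request problem: when the user scrubs, the browser sends a `Range: bytes=...` header and expects a `206 Partial Content` response with a matching `Content-Range`. The sketch below shows one way to parse a single-range header. It is not the project's code, the function names are hypothetical, and multi-range requests are deliberately not handled.

```python
import re
from typing import Dict, Optional, Tuple

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) byte offsets, or None if the header
    is malformed or cannot be satisfied for a file of `size` bytes."""
    match = _RANGE_RE.match(header.strip())
    if not match or size <= 0:
        return None
    first, last = match.groups()
    if not first and not last:
        return None
    if not first:
        # Suffix form "bytes=-N": the last N bytes of the file.
        n = int(last)
        return (max(size - n, 0), size - 1) if n > 0 else None
    start = int(first)
    if start >= size:
        return None
    # Open-ended form "bytes=N-" runs to the end of the file.
    end = min(int(last), size - 1) if last else size - 1
    return (start, end) if end >= start else None


def range_headers(start: int, end: int, size: int) -> Dict[str, str]:
    """Headers to send alongside a 206 Partial Content response."""
    return {
        "Content-Range": f"bytes {start}-{end}/{size}",
        "Content-Length": str(end - start + 1),
        "Accept-Ranges": "bytes",
    }
```

A `None` result would typically map to a `416 Range Not Satisfiable` response.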

Accomplishments that we're proud of

The pipeline actually works end to end. A real video goes in, structured timestamped coaching comes out. The annotated timeline with click-to-seek feels genuinely useful, not just a demo gimmick.

What we learned

Designing around an eight-state async lifecycle means every UI component has to be deliberate about what it shows and when. Building a clean API layer with mock/real toggles early was the best decision we made. It meant connecting real endpoints later required zero component changes.
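The mock/real toggle can be sketched as two clients behind one interface. The project's API layer lives in the TypeScript frontend; this version is written in Python only to show the pattern, and every name in it (`CoachApi`, `get_status`, the endpoint path, the state strings) is hypothetical.

```python
from typing import Callable, Optional, Protocol


class CoachApi(Protocol):
    """The only surface that UI code is allowed to depend on."""

    def get_status(self, video_id: str) -> dict: ...


class MockCoachApi:
    """Returns canned data so the UI can be built before the backend exists."""

    def get_status(self, video_id: str) -> dict:
        return {"video_id": video_id, "state": "complete"}


class RealCoachApi:
    """Delegates to an injected HTTP getter (hypothetical endpoint path)."""

    def __init__(self, http_get: Callable[[str], dict]) -> None:
        self._http_get = http_get

    def get_status(self, video_id: str) -> dict:
        return self._http_get(f"/api/videos/{video_id}/status")


def make_api(use_mock: bool,
             http_get: Optional[Callable[[str], dict]] = None) -> CoachApi:
    """The single place where the mock/real decision is made."""
    if use_mock:
        return MockCoachApi()
    if http_get is None:
        raise ValueError("http_get is required when use_mock is False")
    return RealCoachApi(http_get)


def status_label(api: CoachApi, video_id: str) -> str:
    """Stand-in for a UI component: it never knows which client it got."""
    return f"{video_id}: {api.get_status(video_id)['state']}"
```

Because `status_label` only sees the `CoachApi` interface, flipping `use_mock` changes the data source without touching the caller.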

What's next for Speech Coach

We see a lot of potential in Speech Coach and want to realize it. Next on our list: live feedback during recording, a metrics dashboard showing improvement over time, multi-session comparison, mobile support, and coach personas tailored to specific contexts like investor pitches, job interviews, and conference talks.
