Inspiration
Preparing for interviews is stressful because most candidates never get clear feedback on how they actually come across. You can record yourself answering practice questions, but rewatching your own videos is awkward, time-consuming, and easy to misjudge. You might not notice that you are rambling, using filler words, speaking too quickly, avoiding eye contact, or giving answers that sound less structured than they felt in the moment.
ReadTheRoom was inspired by the idea that interview coaching should be more accessible, more immediate, and less intimidating. Instead of waiting for a mentor, career coach, or mock interviewer, candidates can get fast AI-powered feedback after every practice answer.
What it does
ReadTheRoom is an AI-powered interview coach that helps users understand both what they say and how they say it during practice interviews.
Users choose an interview question, record their answer in the browser, and receive a feedback report in under two minutes. The report includes:
- Communication scores: confidence, clarity, and conciseness
- Speech analytics: filler words, hedging phrases, pacing, duration, and long pauses
- Answer quality: relevance, STAR-method structure, strengths, and improvement areas
- Video insights: engagement and facial expression signals from sampled frames
- Actionable coaching: specific next steps to improve before the real interview
The goal is not just to score users, but to help them practice, improve, and build confidence faster.
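To make the shape of the report concrete, here is a minimal sketch of how the sections listed above could be modeled in Python. The class names, field names, and types are assumptions for illustration, not the project's actual schema, and the scale of the scores is not specified in this write-up.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class CommunicationScores:
    # Hypothetical integer scores; the real scale is not stated in the write-up.
    confidence: int
    clarity: int
    conciseness: int


@dataclass
class SpeechAnalytics:
    filler_words: dict[str, int]      # phrase -> number of occurrences
    hedging_phrases: dict[str, int]   # phrase -> number of occurrences
    words_per_minute: float
    duration_seconds: float
    long_pause_count: int


@dataclass
class FeedbackReport:
    question: str
    scores: CommunicationScores
    speech: SpeechAnalytics
    strengths: list[str] = field(default_factory=list)
    improvements: list[str] = field(default_factory=list)
    video_insights: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)


example_report = FeedbackReport(
    question="Tell me about a time you led a team.",
    scores=CommunicationScores(confidence=7, clarity=8, conciseness=6),
    speech=SpeechAnalytics(
        filler_words={"um": 3},
        hedging_phrases={"i think": 2},
        words_per_minute=142.0,
        duration_seconds=95.0,
        long_pause_count=1,
    ),
    next_steps=["Open with the result, then explain how you got there."],
)

# asdict() turns the nested dataclasses into plain dicts, ready for JSON.
report_payload = asdict(example_report)
```

Keeping the measured speech numbers separate from the generated coaching text mirrors the split the report makes between analytics and advice.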
How we built it
Frontend: Vue 3, Vite, and Tailwind CSS
- Browser-based video recording with the MediaRecorder API
- Interview question audio playback powered by ElevenLabs text-to-speech
- Smooth 3-step flow with animated modals and progress indicators
- Professional landing page and feedback visualization
Backend: FastAPI and Python
- FFmpeg extracts audio from uploaded interview videos
- ElevenLabs transcribes the extracted audio into text
- Custom audio analysis detects filler words, hedging phrases, pacing, duration, and long pauses
- Video frames are sampled for expression and engagement analysis
- Google Gemini generates structured interview coaching from the transcript
- Results are combined into one feedback report for the frontend
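As an illustration of the first backend step, the sketch below shows one way to call FFmpeg from Python to pull a mono audio track out of an uploaded recording. The function names, file names, and the 16 kHz sample rate are assumptions, not the project's actual code; the flags themselves (`-y`, `-i`, `-vn`, `-ac`, `-ar`) are standard FFmpeg options.

```python
import subprocess


def build_audio_extract_cmd(video_path: str, audio_path: str, sample_rate: int = 16000) -> list[str]:
    """Build an FFmpeg command that writes the audio track of video_path to audio_path."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it already exists
        "-i", video_path,        # input recording uploaded from the browser
        "-vn",                   # drop the video stream
        "-ac", "1",              # downmix to a single (mono) channel
        "-ar", str(sample_rate), # resample; 16 kHz is an assumed value
        audio_path,
    ]


def extract_audio(video_path: str, audio_path: str) -> None:
    """Run FFmpeg; raises CalledProcessError if FFmpeg exits with an error."""
    cmd = build_audio_extract_cmd(video_path, audio_path)
    subprocess.run(cmd, check=True, capture_output=True)


# Hypothetical file names, shown only to illustrate the resulting command.
example_cmd = build_audio_extract_cmd("answer.webm", "answer.wav")
```

Passing the arguments as a list rather than a shell string avoids quoting problems with uploaded file names. Calling `extract_audio` requires FFmpeg to be installed and on the `PATH`.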
Challenges we ran into
- AI orchestration: combining video recording, FFmpeg processing, ElevenLabs transcription, custom metrics, vision analysis, and Gemini feedback into one smooth flow
- Speech analysis accuracy: detecting filler words and hedging phrases without overcounting normal speech
- Prompt engineering: making Gemini respond like a supportive coach instead of a generic evaluator
- Performance: keeping the record-to-feedback flow fast enough for users to practice repeatedly
- User experience: presenting feedback in a way that feels useful and confidence-building, not discouraging
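The overcounting problem in the speech analysis item can be illustrated with a small sketch. It matches phrases on word boundaries, so the "um" inside "summer" or "humbling" is not counted as a filler. The phrase lists and function name are hypothetical; the project's real lists and matching rules are not given in this write-up.

```python
import re

# Hypothetical phrase lists, for illustration only.
FILLERS = ["um", "uh", "you know", "i mean"]
HEDGES = ["i think", "i guess", "maybe", "probably"]


def count_phrases(transcript: str, phrases: list[str]) -> dict[str, int]:
    """Count whole-word, case-insensitive occurrences of each phrase."""
    text = transcript.lower()
    counts: dict[str, int] = {}
    for phrase in phrases:
        # \b anchors the match to word boundaries, so substrings of
        # longer words are ignored.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        hits = len(re.findall(pattern, text))
        if hits:
            counts[phrase] = hits
    return counts


sample = "Um, I think the summer was, you know, humbling. Uh, maybe um."
filler_counts = count_phrases(sample, FILLERS)
hedge_counts = count_phrases(sample, HEDGES)
```

Word boundaries only solve part of the problem. Words such as "like" are sometimes fillers and sometimes ordinary vocabulary, so they need context-aware rules that this sketch does not attempt.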
Accomplishments that we're proud of
ReadTheRoom turns a raw interview recording into a structured coaching report in under two minutes. By combining browser-native video recording, FFmpeg audio extraction, ElevenLabs transcription, custom speech analytics, and Gemini-powered feedback, we built an end-to-end pipeline that transforms unstructured video into measurable interview insights.
We are especially proud of our custom audio metrics engine. Instead of relying only on an LLM, we extract concrete communication signals such as filler word frequency, hedging phrases, speaking pace, long pauses, buzzwords, and STAR-method indicators. These metrics are then presented alongside AI-generated coaching so users can see both objective speech patterns and qualitative feedback in one dashboard.
The result is a fast, repeatable interview practice loop: record an answer, get detailed feedback, improve, and try again.
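Two of the signals mentioned above, speaking pace and long pauses, can be sketched as follows. This assumes the transcription step provides word-level start and end times in seconds; the `Word` type, the function names, and the 2-second pause threshold are assumptions for illustration, not the project's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float


def words_per_minute(words: list[Word]) -> float:
    """Average pace from the first word's start to the last word's end."""
    if not words:
        return 0.0
    duration = words[-1].end - words[0].start
    if duration <= 0:
        return 0.0
    return len(words) / duration * 60.0


def long_pauses(words: list[Word], min_gap: float = 2.0) -> list[tuple[float, float]]:
    """Return (pause_start, pause_end) for every gap of at least min_gap seconds."""
    pauses = []
    for prev, cur in zip(words, words[1:]):
        if cur.start - prev.end >= min_gap:
            pauses.append((prev.end, cur.start))
    return pauses


timed_words = [
    Word("I", 0.0, 0.2),
    Word("led", 0.3, 0.6),
    Word("the", 3.0, 3.2),   # 2.4 s of silence before this word
    Word("team", 3.3, 4.0),
]
pace = words_per_minute(timed_words)
pauses = long_pauses(timed_words)
```

Because each pause comes back with its start and end time, the same data could later drive timeline markers in a playback view.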
What we learned
- How to orchestrate multiple AI and media-processing tools in one product
- How to extract useful communication signals from transcripts and audio
- How to design LLM prompts for constructive coaching
- How to build a full-stack Vue + FastAPI application under hackathon time pressure
- How to turn raw AI output into a dashboard users can actually understand and act on
What's next for ReadTheRoom
- Video playback with feedback overlays showing exactly where filler words, pauses, or weak moments happened
- Interview history and progress tracking over time
- Multi-language interview practice
- Harder mock interview scenarios with follow-up questions
Built With
- computervision
- elevenlabsapi
- expressionanalysis
- fastapi
- ffmpeg
- googlegeminiapi
- mediarecorderapi
- python
- tailwindcss
- vite
- vue