InterviewOS
AI-powered multi-panel mock interviews with real-time coaching, industry-specific depth, and actionable post-interview insights.
Inspiration
Traditional mock interviews are hard to scale, inconsistent in quality, and rarely capture the full picture of how a candidate performs under pressure. Most tools focus only on question-and-answer text, ignoring delivery, timing, and industry context.
With the Gemini 3 family and Gemini 2.5 Live, we saw an opportunity to build something closer to a real hiring panel: multiple distinct interviewers, continuous context, real-time audio and video, and structured feedback that feels like a debrief from a seasoned hiring committee.
InterviewOS is our attempt to turn “practice interviews” into high-fidelity simulations that measure not just what you say, but how you say it and how you evolve over the course of a full session.
What it does
InterviewOS simulates a full panel-style interview and produces a rich, structured evaluation:
Multi-panel AI interviewers
- Generates three distinct panelists with Indian and global names, each with their own role, focus area, and questioning style.
- Assigns gender-matched voices and avatar colors for clarity in the UI.
Real-time live interview
- Uses Gemini 2.5 Flash Live in the browser for ultra-low-latency audio.
- Streams your microphone and camera to drive live conversation with the panel.
- Shows a live transcript with clear “You vs. Panelist” speaker attributions.
Adaptive orchestration
- A dedicated InterviewOrchestrator tracks topics, depth (1–5), and panel balance.
- A server-side orchestration WebSocket sends hints back to the client: which topic to explore next, how deep to go, and which panelist should lead.
Emotion and body language snapshots
- Periodically captures short video segments during the call.
- Sends them to backend endpoints for body language and emotion analysis, with rate limiting and safe fallbacks.
- Feeds a dashboard-like view of posture, eye contact, and general composure across the interview.
Industry-specialized evaluation
- Supports profiles for FAANG, Finance, Consulting, Medical, Legal, Startup, and General via IndustrySpecialist.
- Can generate industry-specific questions and evaluations (scores, strengths, weaknesses, recommendations).
Final multi-dimensional report
- Uses Gemini 3 Pro to synthesize:
- Technical, Communication, and Culture Fit scores.
- Panelist-level comments and improvement suggestions.
- Augments the report with sample-based body/voice/temporal analytics, clearly labeled as demonstration data when APIs are rate-limited.
How we built it
Frontend (React + TypeScript + Vite)
- A single-page React app written in TypeScript, styled with Tailwind CSS and animated with Framer Motion.
- Core UI pieces:
- LiveInterview.tsx for the full interview experience (camera feed, controls, live transcript, timers).
- Dashboard.tsx, PanelConfiguration.tsx, and ResumeUploader.tsx for setup and post-interview views.
- Uses:
- @google/genai in the browser to open a Gemini Live session and stream audio/video.
- A custom audio worklet and ScriptProcessor fallback to send 16 kHz PCM packets to Gemini in near real time.
- useVAD hook for client-side voice activity detection, triggering end-of-speech events.
- useVideoAnalysis hook to periodically record VP8 clips, convert them to base64, and call /api/analyze-body-language.
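The 16 kHz PCM step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual worklet code: the function names, the packet shape, and the mimeType string are assumptions made for the example.

```typescript
// Convert Float32 samples in [-1, 1] (the Web Audio API's native format)
// into 16-bit little-endian PCM bytes.
function floatTo16BitPCM(samples: Float32Array): Uint8Array {
  const bytes = new Uint8Array(samples.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative samples scale by 0x8000, positive ones by 0x7fff.
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return bytes;
}

// Base64-encode a byte array. btoa is a global in browsers and in Node 16+.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Hypothetical packet shape; the mimeType value is an assumption.
function toPcmPacket(samples: Float32Array, sampleRate = 16000) {
  return {
    data: toBase64(floatTo16BitPCM(samples)),
    mimeType: `audio/pcm;rate=${sampleRate}`,
  };
}
```

The same conversion would apply whether the samples arrive from the audio worklet or from the ScriptProcessor fallback, which keeps the two capture paths interchangeable.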
Backend (Node.js + Express + WebSocket)
- Express API in server/src/index.ts exposing:
- /api/health, /api/parse-resume, /api/generate-panelists, /api/generate-report.
- Advanced endpoints: /api/analyze-emotion, /api/analyze-body-language, /api/analyze-speech.
- Industry endpoints: /api/industry/:industry, /api/industry-questions, /api/industry-evaluate.
- WebSocket orchestration server at /ws/interview:
- Uses LiveInterviewHandler to receive transcript updates and speech_end events.
- Calls InterviewOrchestrator to compute hints (topic, depth, panelist) and time/phase updates.
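One way to picture the traffic on /ws/interview is as a pair of typed message unions. The sketch below is an assumption about the wire format: only the speech_end event name and the hint fields (topic, depth, panelist) come from the description above, while every other field and type name is hypothetical.

```typescript
// Messages the browser might send to the orchestration server.
type ClientMessage =
  | { type: "transcript_update"; speaker: "user" | "panelist"; text: string }
  | { type: "speech_end"; timestampMs: number };

// Messages the server might send back (shown for illustration only).
type ServerMessage =
  | { type: "hint"; topic: string; depth: number; panelistId: string }
  | {
      type: "phase_update";
      phase: "opening" | "active" | "closing" | "completed";
      remainingSeconds: number;
    };

// Parse and validate a raw WebSocket frame; returns null for anything
// malformed so the handler can ignore it instead of throwing.
function parseClientMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { type, speaker, text, timestampMs } = data as Record<string, unknown>;

  if (
    type === "transcript_update" &&
    (speaker === "user" || speaker === "panelist") &&
    typeof text === "string"
  ) {
    return { type, speaker, text };
  }
  if (type === "speech_end" && typeof timestampMs === "number") {
    return { type, timestampMs };
  }
  return null;
}
```

Validating at the boundary like this means the orchestration logic behind it only ever sees well-formed messages.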
Core services
GeminiService:
- Wraps Gemini 3 Flash and Gemini 3 Pro with typed responseSchema for structured JSON.
- Implements retry with exponential backoff, skipping 4xx client errors.
- Generates panelists and final reports and augments reports with sample analytics when needed.
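A retry wrapper of the kind described for GeminiService might look like the sketch below. The helper names, the default attempt count and delay, and the assumption that errors carry a numeric status field are all illustrative, not taken from the project.

```typescript
interface RetryOptions {
  maxAttempts?: number;
  baseDelayMs?: number;
  // Injectable so tests can avoid real timers.
  sleep?: (ms: number) => Promise<void>;
}

// Assumes failed calls surface an error object with a numeric `status`.
function isClientError(err: unknown): boolean {
  const status = (err as { status?: unknown } | null)?.status;
  return typeof status === "number" && status >= 400 && status < 500;
}

async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions = {}
): Promise<T> {
  const maxAttempts = opts.maxAttempts ?? 3;
  const baseDelayMs = opts.baseDelayMs ?? 500;
  const sleep =
    opts.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // 4xx errors will not succeed on retry, so give up immediately.
      if (isClientError(err) || attempt === maxAttempts - 1) break;
      // Delay doubles each attempt: base, 2x base, 4x base, ...
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

A production version would likely add jitter to the delay so that many clients retrying at once do not synchronize.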
InterviewOrchestrator:
- Tracks interview phase (opening/active/closing/completed), topics covered, depth, and panelist workloads.
- Uses simple but effective heuristics to decide when to follow up, when to switch topics, and when to rotate panelists.
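To make the heuristics concrete, here is a small sketch of how a next-step decision could be derived from that state. The state shape, the word-count threshold, and the action names are hypothetical. The actual InterviewOrchestrator rules are not documented here, so treat this as one plausible reading.

```typescript
type Phase = "opening" | "active" | "closing" | "completed";

interface OrchestratorState {
  phase: Phase;
  currentTopic: string;
  depthByTopic: Record<string, number>; // depth 1–5 per topic
  pendingTopics: string[];
  questionsByPanelist: Record<string, number>;
}

interface Hint {
  action: "follow_up" | "switch_topic" | "wrap_up";
  topic: string;
  depth: number;
  panelistId: string;
}

const MAX_DEPTH = 5;
const MIN_WORDS_FOR_FOLLOW_UP = 20; // assumed threshold for a substantive answer

// Rotate panelists by always picking whoever has asked the fewest questions.
function leastUsedPanelist(counts: Record<string, number>): string {
  const entries = Object.entries(counts);
  if (entries.length === 0) throw new Error("no panelists configured");
  return entries.reduce((min, cur) => (cur[1] < min[1] ? cur : min))[0];
}

function nextHint(state: OrchestratorState, answerWordCount: number): Hint {
  const panelistId = leastUsedPanelist(state.questionsByPanelist);
  const depth = state.depthByTopic[state.currentTopic] ?? 1;
  const wrapUp: Hint = { action: "wrap_up", topic: state.currentTopic, depth, panelistId };

  if (state.phase === "closing" || state.phase === "completed") return wrapUp;

  // A substantive answer earns a deeper follow-up on the same topic.
  if (answerWordCount >= MIN_WORDS_FOR_FOLLOW_UP && depth < MAX_DEPTH) {
    return { action: "follow_up", topic: state.currentTopic, depth: depth + 1, panelistId };
  }

  // Otherwise move to the next uncovered topic, starting shallow.
  const nextTopic = state.pendingTopics[0];
  if (nextTopic === undefined) return wrapUp;
  return { action: "switch_topic", topic: nextTopic, depth: 1, panelistId };
}
```

Keeping the decision a pure function of state makes it easy to unit-test and to swap in richer signals later without touching the WebSocket layer.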
EmotionAnalyzer and PresentationCoach:
- Handle text/audio/video analysis for emotion and body language, wrapped with rate limiting and fallback defaults.
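The "rate limiting and fallback defaults" wrapper could be sketched along these lines. The class and function names are hypothetical, and a simple minimum-interval limiter stands in for whatever policy the services actually use.

```typescript
// Allows at most one call per `minIntervalMs`; the clock is injectable
// so the behavior can be tested without waiting.
class MinIntervalLimiter {
  private lastCall = -Infinity;
  private readonly minIntervalMs: number;
  private readonly now: () => number;

  constructor(minIntervalMs: number, now: () => number = () => Date.now()) {
    this.minIntervalMs = minIntervalMs;
    this.now = now;
  }

  tryAcquire(): boolean {
    const t = this.now();
    if (t - this.lastCall < this.minIntervalMs) return false;
    this.lastCall = t;
    return true;
  }
}

interface AnalysisResult<T> {
  data: T;
  // True when `data` is the default rather than a real analysis, so the
  // UI can label it as sample/demo data.
  isFallback: boolean;
}

async function analyzeWithFallback<T>(
  limiter: MinIntervalLimiter,
  analyze: () => Promise<T>,
  fallback: T
): Promise<AnalysisResult<T>> {
  // Skip the remote call entirely when we are over the local rate limit.
  if (!limiter.tryAcquire()) return { data: fallback, isFallback: true };
  try {
    return { data: await analyze(), isFallback: false };
  } catch {
    // Upstream errors (including remote rate limits) degrade to defaults.
    return { data: fallback, isFallback: true };
  }
}
```

Returning the isFallback flag alongside the data is what lets the report distinguish real measurements from demonstration values.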
IndustrySpecialist:
- Encodes reusable profiles for industries and drives question generation and answer evaluation on top of Gemini 3 Pro.
Challenges we ran into
Low-latency audio streaming
- Ensuring smooth, gap-free playback while decoding base64 audio from Gemini in the browser required a queue + pre-decoding strategy.
- Handling edge cases when the audio worklet fails and falling back to ScriptProcessor without breaking the user experience.
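The scheduling half of a queue-based playback strategy can be sketched as below, assuming each chunk has already been decoded and its duration is known. The class name and the 50 ms lead time are illustrative assumptions.

```typescript
// Computes back-to-back start times for decoded audio chunks so that
// playback has no gaps or overlaps, even when chunks arrive in bursts.
class PlaybackScheduler {
  private nextStartTime = 0;
  private readonly leadTimeSec: number;

  // leadTimeSec: small safety margin used when the queue has run dry.
  constructor(leadTimeSec = 0.05) {
    this.leadTimeSec = leadTimeSec;
  }

  // `now` is the audio clock (for example AudioContext.currentTime).
  // Returns the time at which this chunk should start playing.
  schedule(now: number, durationSec: number): number {
    // If earlier chunks are still queued, start right after them;
    // otherwise start slightly in the future to absorb jitter.
    const startAt = Math.max(this.nextStartTime, now + this.leadTimeSec);
    this.nextStartTime = startAt + durationSec;
    return startAt;
  }
}
```

In the browser, the returned value would be passed to AudioBufferSourceNode.start(startAt), with now taken from the AudioContext's currentTime.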
Transcript consistency
- Gemini Live often sends cumulative partial transcripts; we had to carefully reconcile input transcriptions (user speech) and output transcriptions (panel speech) into a single, readable chat-style log without duplicates or missing chunks.
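One possible merge rule for cumulative partials is sketched below. It is a simplified illustration with hypothetical names, not the reconciliation logic the app actually ships, and it treats any update that is a prefix of the current text as stale.

```typescript
type Speaker = "user" | "panelist";

interface TranscriptEntry {
  speaker: Speaker;
  text: string;
}

// Fold one partial transcript into the log without mutating the input.
function applyPartial(
  log: TranscriptEntry[],
  speaker: Speaker,
  partial: string
): TranscriptEntry[] {
  const text = partial.trim();
  if (text === "") return log;

  const last = log[log.length - 1];
  // New speaker (or empty log): start a new chat bubble.
  if (last === undefined || last.speaker !== speaker) {
    return [...log, { speaker, text }];
  }
  // Cumulative update: the new text extends what we have, so replace it.
  if (text.startsWith(last.text)) {
    return [...log.slice(0, -1), { speaker, text }];
  }
  // Stale update: an older, shorter version of the current text.
  if (last.text.startsWith(text)) {
    return log;
  }
  // Incremental chunk: append to the current bubble.
  return [...log.slice(0, -1), { speaker, text: `${last.text} ${text}` }];
}
```

For example, feeding "I have" followed by "I have five years" for the same speaker yields a single entry with the longer text rather than two overlapping lines.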
Panel orchestration without over-complication
- Designing InterviewOrchestrator to be smart enough (topics, depth, panel rotation, timing) without making it brittle or overfit to a specific conversation pattern.
Accomplishments that we’re proud of
Multi-panel interview that feels coherent
- Three AI interviewers with distinct personas, voices, and focus areas, tied together by a shared orchestrator, produce a session that feels more like a real panel than a single model “persona”.
End-to-end real-time experience
- From microphone input to Gemini 2.5 Live to audio playback and transcript, the pipeline is tuned for low latency and stable behavior, instrumented with timing logs for continuous tuning.
Thoughtful orchestration layer
- The orchestration WebSocket and InterviewOrchestrator give us a place to experiment with “Marathon Agent”-style logic:
- Tracking depth,
- Managing phases and timing,
- And enabling dynamic panel handoffs.
Industry-aware evaluation
- The IndustrySpecialist service lets the same core engine feel tailored to FAANG vs Finance vs Consulting vs Medical, without forking the rest of the system.
What we learned
Gemini Live is powerful, but demands careful UX
- The technology is capable of near-conversational latency, but the user’s perceived smoothness depends on:
- How transcripts are updated,
- How audio is buffered,
- And how clear the UI is about who is speaking.
Orchestration is where “application-level intelligence” lives
- The biggest leap in realism came not from tweaking prompts, but from explicit orchestration:
- Tracking state,
- Planning next actions,
- And feeding hints back into the Live session.
Rate limiting and fallbacks are part of product design
- Designing a good experience meant assuming APIs will occasionally say “no,” and making sure the app still:
- Responds quickly,
- Shows something meaningful,
- And clearly labels any sampled/demo data.
Typed schemas reduce friction
- Using Gemini’s structured output via responseSchema removed a lot of fragile parsing logic and made the system more robust to prompt drift.
What’s next for InterviewOS
Deeper “Marathon Agent” behavior
- Expand InterviewOrchestrator with richer Thought Signatures and explicit self-correction loops so the panel can critique and refine its own questions across long sessions.
Richer post-interview analytics
- Turn temporal trends (confidence, nervousness, engagement) into:
- Comparative views across multiple sessions,
- And personalized recommendation plans over time.
More granular industry & role templates
- Add role-specific panels (e.g. “Staff Backend at FAANG”, “Product Manager in FinTech”) with targeted question banks and scoring rubrics.
Interactive replay
- Allow candidates to replay key moments:
- Jump to points with high stress or low clarity,
- See recommendations tied to specific transcript segments.
Team & recruiter dashboards
- Extend InterviewOS from a solo practice tool into a team training and evaluation platform, where mentors or recruiters can review reports and annotate sessions.
Tech Stack (at a glance)
Frontend
- React 19, TypeScript 5.8, Vite
- Tailwind CSS, Framer Motion, Recharts
- @google/genai (browser), custom audio worklet, useVAD, useVideoAnalysis
Backend
- Node.js 20+, Express, WebSocket (ws)
- @google/genai, Multer, dotenv
AI Models
- Gemini 3 Flash – resume parsing, panelist generation
- Gemini 3 Pro – final evaluation, industry-specific reasoning
- Gemini 2.5 Flash Live – real-time audio and transcript
Built With
- express.js
- framer-motion
- gemini-2.5-flash-live
- google/genai-(gemini-3-flash/pro)
- html2canvas
- jspdf
- mediarecorder
- multer
- node.js-20+
- react-19
- react-router
- recharts
- tailwind-css
- typescript
- vite
- web-audio-api
- webrtc/getusermedia
- websocket-(ws)