InterviewOS

AI-powered multi-panel mock interviews with real-time coaching, industry-specific depth, and actionable post-interview insights.


Inspiration

Traditional mock interviews are hard to scale, inconsistent in quality, and rarely capture the full picture of how a candidate performs under pressure. Most tools focus only on question-and-answer text, ignoring delivery, timing, and industry context.

With the Gemini 3 family and Gemini 2.5 Live, we saw an opportunity to build something closer to a real hiring panel: multiple distinct interviewers, continuous context, real-time audio and video, and structured feedback that feels like a debrief from a seasoned hiring committee.

InterviewOS is our attempt to turn “practice interviews” into high-fidelity simulations that measure not just what you say, but how you say it and how you evolve over the course of a full session.


What it does

InterviewOS simulates a full panel-style interview and produces a rich, structured evaluation:

  • Multi-panel AI interviewers

    • Generates three distinct panelists with a mix of Indian and global names, each with their own role, focus area, and questioning style.
    • Assigns gender-matched voices and avatar colors for clarity in the UI.
  • Real-time live interview

    • Uses Gemini 2.5 Flash Live in the browser for ultra-low-latency audio.
    • Streams your microphone and camera to drive live conversation with the panel.
    • Shows a live transcript with clear “You vs. Panelist” speaker attributions.
  • Adaptive orchestration

    • A dedicated InterviewOrchestrator tracks topics, depth (1–5), and panel balance.
    • A server-side orchestration WebSocket sends hints back to the client: which topic to explore next, how deep to go, and which panelist should lead.
  • Emotion and body language snapshots

    • Periodically captures short video segments during the call.
    • Sends them to backend endpoints for body language and emotion analysis, with rate limiting and safe fallbacks.
    • Feeds a dashboard-like view of posture, eye contact, and general composure across the interview.
  • Industry-specialized evaluation

    • Supports profiles for FAANG, Finance, Consulting, Medical, Legal, Startup, and General via IndustrySpecialist.
    • Can generate industry-specific questions and evaluations (scores, strengths, weaknesses, recommendations).
  • Final multi-dimensional report

    • Uses Gemini 3 Pro to synthesize:
      • Technical, Communication, and Culture Fit scores.
      • Panelist-level comments and improvement suggestions.
    • Augments the report with sample-based body/voice/temporal analytics, clearly labeled as demonstration data when APIs are rate-limited.
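
The adaptive orchestration hints can be sketched as a small pure function. This is an illustrative simplification: the names (`nextHint`, `PanelistLoad`) and the thresholds are invented for the example, not the actual InterviewOrchestrator internals.

```typescript
// Sketch of the orchestration hint logic described above.
// Names and thresholds are illustrative, not the real service code.
interface PanelistLoad { id: string; questionsAsked: number; }

interface Hint {
  action: "follow_up" | "switch_topic";
  depth: number;        // the 1-5 depth scale from the write-up
  leadPanelist: string; // which panelist should ask the next question
}

function nextHint(
  currentDepth: number,
  exchangesOnTopic: number,
  panel: PanelistLoad[],
): Hint {
  // Rotate toward the least-loaded panelist so no interviewer dominates.
  const lead = [...panel].sort((a, b) => a.questionsAsked - b.questionsAsked)[0];

  // Heuristic: keep drilling until depth 5 or the topic has run long.
  if (currentDepth < 5 && exchangesOnTopic < 4) {
    return { action: "follow_up", depth: currentDepth + 1, leadPanelist: lead.id };
  }
  // Otherwise reset depth and move on to a fresh topic.
  return { action: "switch_topic", depth: 1, leadPanelist: lead.id };
}
```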

How we built it

  • Frontend (React + TypeScript + Vite)

    • A single-page React app written in TypeScript, styled with Tailwind CSS and animated with Framer Motion.
    • Core UI pieces:
      • LiveInterview.tsx for the full interview experience (camera feed, controls, live transcript, timers).
      • Dashboard.tsx, PanelConfiguration.tsx, and ResumeUploader.tsx for setup and post-interview views.
    • Uses:
      • @google/genai in the browser to open a Gemini Live session and stream audio/video.
      • A custom audio worklet with a ScriptProcessor fallback to send 16 kHz PCM packets to Gemini in near real time.
      • A useVAD hook for client-side voice activity detection, triggering end-of-speech events.
      • A useVideoAnalysis hook to periodically record VP8 clips, convert them to base64, and call /api/analyze-body-language.
  • Backend (Node.js + Express + WebSocket)

    • Express API in server/src/index.ts exposing:
      • /api/health, /api/parse-resume, /api/generate-panelists, /api/generate-report.
      • Advanced endpoints: /api/analyze-emotion, /api/analyze-body-language, /api/analyze-speech.
      • Industry endpoints: /api/industry/:industry, /api/industry-questions, /api/industry-evaluate.
    • WebSocket orchestration server at /ws/interview:
      • Uses LiveInterviewHandler to receive transcript updates and speech_end events.
      • Calls InterviewOrchestrator to compute hints (topic, depth, panelist) and time/phase updates.
  • Core services

    • GeminiService:
      • Wraps Gemini 3 Flash and Gemini 3 Pro with typed responseSchema for structured JSON.
      • Implements retry with exponential backoff, skipping 4xx client errors.
      • Generates panelists and final reports, and augments reports with sample analytics when needed.
    • InterviewOrchestrator:
      • Tracks interview phase (opening/active/closing/completed), topics covered, depth, and panelist workloads.
      • Uses simple but effective heuristics to decide when to follow up, when to switch topics, and when to rotate panelists.
    • EmotionAnalyzer and PresentationCoach:
      • Handle text/audio/video analysis for emotion and body language, wrapped with rate limiting and fallback defaults.
    • IndustrySpecialist:
      • Encodes reusable profiles for industries and drives question generation and answer evaluation on top of Gemini Pro.
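
To give a flavor of the retry-with-backoff behavior in GeminiService, here is a minimal sketch. The function name `withRetry` and the error shape (a `status` field on the thrown error) are assumptions for illustration, not the actual service code.

```typescript
// Illustrative retry with exponential backoff that skips 4xx client errors,
// mirroring the GeminiService behavior described above.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      const status = err?.status ?? 0;
      // 4xx means the request itself is bad: retrying won't help, so rethrow.
      if (status >= 400 && status < 500) throw err;
      if (attempt + 1 >= maxAttempts) throw err;
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}
```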

Challenges we ran into

  • Low-latency audio streaming

    • Ensuring smooth, gap-free playback while decoding base64 audio from Gemini in the browser required a queue + pre-decoding strategy.
    • Handling edge cases when the audio worklet fails and falling back to ScriptProcessor without breaking the user experience.
  • Transcript consistency

    • Gemini Live often sends cumulative partial transcripts; we had to carefully reconcile input transcriptions (user speech) and output transcriptions (panel speech) into a single, readable chat-style log without duplicates or missing chunks.
  • Panel orchestration without over-complication

    • Designing InterviewOrchestrator to be smart enough (topics, depth, panel rotation, timing) without making it brittle or overfit to a specific conversation pattern.
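
A minimal sketch of the reconciliation idea, assuming each update arrives as a cumulative transcript string (the real logic also handles speaker attribution and revising text already shown in the log):

```typescript
// Given the previously seen cumulative transcript and the latest one,
// return only the new text to append, so the chat log never duplicates.
function newChunk(previous: string, cumulative: string): string {
  // Typical case: the new transcript extends the old one; emit the suffix.
  if (cumulative.startsWith(previous)) {
    return cumulative.slice(previous.length);
  }
  // The model revised earlier words: fall back to the longest common
  // prefix and emit only the text past it, avoiding duplicated output.
  let i = 0;
  while (i < previous.length && i < cumulative.length && previous[i] === cumulative[i]) i++;
  return cumulative.slice(i);
}
```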

Accomplishments that we’re proud of

  • Multi-panel interview that feels coherent

    • Three AI interviewers with distinct personas, voices, and focus areas, tied together by a shared orchestrator, produce a session that feels more like a real panel than a single model “persona”.
  • End-to-end real-time experience

    • From microphone input to Gemini 2.5 Live to audio playback and transcript, the pipeline is tuned for low latency and stable behavior, instrumented with timing logs for continuous tuning.
  • Thoughtful orchestration layer

    • The orchestration WebSocket and InterviewOrchestrator give us a place to experiment with “Marathon Agent”-style logic: tracking depth, managing phases and timing, and enabling dynamic panel handoffs.
  • Industry-aware evaluation

    • The IndustrySpecialist service lets the same core engine feel tailored to FAANG vs Finance vs Consulting vs Medical, without forking the rest of the system.
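
The gap-free playback mentioned above boils down to one piece of scheduling arithmetic: each decoded chunk starts at whichever is later, “now” or the end of the previously scheduled chunk. A sketch of that arithmetic (names are illustrative; in the browser this runs against AudioContext.currentTime, and the real player also pre-decodes base64 PCM off the hot path):

```typescript
// Gap-free audio queue scheduling: a chunk plays back-to-back with the
// previous one, unless the queue has drained, in which case it starts now.
function scheduleChunk(
  currentTime: number,   // playback clock at enqueue time
  nextStartTime: number, // end time of the previously scheduled chunk
  durationSec: number,   // duration of this chunk
): { startAt: number; newNextStartTime: number } {
  const startAt = Math.max(currentTime, nextStartTime);
  return { startAt, newNextStartTime: startAt + durationSec };
}
```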

What we learned

  • Gemini Live is powerful, but demands careful UX

    • The technology is capable of near-conversational latency, but the user’s perceived smoothness depends on how transcripts are updated, how audio is buffered, and how clear the UI is about who is speaking.
  • Orchestration is where “application-level intelligence” lives

    • The biggest leap in realism came not from tweaking prompts, but from explicit orchestration: tracking state, planning next actions, and feeding hints back into the Live session.
  • Rate limiting and fallbacks are part of product design

    • Designing a good experience meant assuming APIs will occasionally say “no,” and making sure the app still responds quickly, shows something meaningful, and clearly labels any sampled/demo data.
  • Typed schemas reduce friction

    • Using Gemini’s structured output via responseSchema removed a lot of fragile parsing logic and made the system more robust to prompt drift.
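
The rate-limit-and-fallback lesson can be captured in one small wrapper. Names like `analyzeWithFallback` and the `isSampleData` flag are illustrative stand-ins for how our analyzers label demo data, not the exact production code:

```typescript
// Sketch of the "fallback, clearly labeled" pattern described above:
// if the analysis API is rate-limited or fails, return labeled demo data
// instead of surfacing an error to the candidate mid-interview.
interface Analysis { composure: string; isSampleData: boolean; }

const FALLBACK: Analysis = { composure: "steady", isSampleData: true };

async function analyzeWithFallback(
  call: () => Promise<Omit<Analysis, "isSampleData">>,
): Promise<Analysis> {
  try {
    const real = await call();
    return { ...real, isSampleData: false };
  } catch {
    // Rate-limited or failed: degrade gracefully with labeled sample data.
    return FALLBACK;
  }
}
```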

What’s next for InterviewOS

  • Deeper “Marathon Agent” behavior

    • Expand InterviewOrchestrator with richer Thought Signatures and explicit self-correction loops so the panel can critique and refine its own questions across long sessions.
  • Richer post-interview analytics

    • Turn temporal trends (confidence, nervousness, engagement) into comparative views across multiple sessions and personalized recommendation plans over time.
  • More granular industry & role templates

    • Add role-specific panels (e.g. “Staff Backend at FAANG”, “Product Manager in FinTech”) with targeted question banks and scoring rubrics.
  • Interactive replay

    • Allow candidates to replay key moments: jump to points with high stress or low clarity, and see recommendations tied to specific transcript segments.
  • Team & recruiter dashboards

    • Extend InterviewOS from a solo practice tool into a team training and evaluation platform, where mentors or recruiters can review reports and annotate sessions.

Tech Stack (at a glance)

  • Frontend

    • React 19, TypeScript 5.8, Vite
    • Tailwind CSS, Framer Motion, Recharts
    • @google/genai (browser), custom audio worklet, useVAD, useVideoAnalysis
  • Backend

    • Node.js 20+, Express, WebSocket (ws)
    • @google/genai, Multer, dotenv
  • AI Models

    • Gemini 3 Flash – resume parsing, panelist generation
    • Gemini 3 Pro – final evaluation, industry-specific reasoning
    • Gemini 2.5 Flash Live – real-time audio and transcript

Built With

  • express.js
  • framer-motion
  • gemini-2.5-flash-live
  • google/genai (gemini-3-flash/pro)
  • html2canvas
  • jspdf
  • mediarecorder
  • multer
  • node.js-20+
  • react-19
  • react-router
  • recharts
  • tailwind-css
  • typescript
  • vite
  • web-audio-api
  • webrtc/getusermedia
  • websocket-(ws)