InterviewAI: The AI-Powered Mock Interview Coach
About the Project
InterviewAI is an intelligent video-based platform that gives job-seekers personalized, actionable feedback on their interview skills—right from their desktop browser.
Inspired by the challenge of standing out in competitive job markets and the surge of remote interviewing, we set out to help users practice, analyze, and improve by leveraging the latest in AI. Our vision: democratize access to high-quality interview coaching, making elite feedback accessible to anyone, anytime.
Inspiration
The idea started with personal experience—many on our team struggled with confidence, pacing, and communication during virtual interviews. Traditional coaching was expensive, inaccessible, or lacked specific, actionable feedback. We realized recent advances in AI could transform video, speech, and text analysis into a 24/7 personal coach that’s always available.
What We Learned
- Speech and Video Analysis: Combining modern ASR (Automatic Speech Recognition), NLP (Natural Language Processing), and Computer Vision enabled us to deliver feedback on both what users said and how they said it.
- LLM-Driven Insights: Leveraging Google Gemini provided richer, more holistic analysis—spotting confidence, structure, and emotion beyond simple metrics.
- Full-Stack Engineering: Building a seamless user experience required tight integration across a React frontend, Node.js data/API glue, Python AI services, and robust deployment/DevOps pipelines.
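As a flavor of the "what was said" side of the analysis, here is a minimal sketch of the kind of transcript metrics (filler words, speaking pace) an NLP pass can extract from an ASR result. The filler-word list and function names are illustrative assumptions, not InterviewAI's actual implementation:

```python
import re

# Hypothetical filler-word set; the real NLP module may use a larger
# lexicon or a learned model instead of a fixed list.
FILLER_WORDS = {"um", "uh", "like", "basically", "actually", "literally"}

def analyze_transcript(transcript: str, duration_seconds: float) -> dict:
    """Return simple delivery metrics computed from an ASR transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    fillers = [w for w in words if w in FILLER_WORDS]
    wpm = len(words) / (duration_seconds / 60) if duration_seconds else 0.0
    return {
        "word_count": len(words),
        "filler_count": len(fillers),
        "filler_ratio": len(fillers) / max(len(words), 1),
        "words_per_minute": round(wpm, 1),
    }
```

Metrics like these are cheap to compute server-side and give the LLM concrete numbers to ground its qualitative feedback in.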
How We Built It
- Frontend: Developed with React (TypeScript), Vite, and Tailwind for a responsive, modern UI. Key features include video upload, interactive dashboards, and progress tracking.
- Backend: Hybrid architecture:
  - Python (FastAPI): Performs all AI/ML processing—speech-to-text (Deepgram), video frame analysis, NLP, and LLM integrations.
  - Node.js (Express): Manages REST APIs, authentication, user management, and MongoDB.
- AI Integrations: Google Gemini for advanced multimodal (text, speech, video) analysis; Deepgram for speech transcription; Vision and NLP modules for cues like filler words, body language, and tone.
- Deployment: Dockerized microservices; Vercel for frontend, Render for backend APIs; CI/CD for rapid iteration.
- Prompt Engineering: Custom prompt configs (.bolt/) to tune LLM outputs for clarity and transparency.
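To illustrate the prompt-engineering step, here is a rough sketch of how raw analysis metrics might be assembled into an LLM prompt that requests structured, transparent feedback. The field names and instruction wording are assumptions; the actual templates live in the project's .bolt/ configs:

```python
import json

def build_feedback_prompt(metrics: dict, transcript_excerpt: str) -> str:
    """Assemble an LLM prompt asking for structured interview feedback.

    Serializing the metrics as JSON keeps the prompt unambiguous and makes
    the model's input reproducible for debugging.
    """
    return (
        "You are an interview coach. Given the delivery metrics and "
        "transcript excerpt below, return JSON with keys 'strengths', "
        "'improvements', and 'confidence_score' (0-100).\n\n"
        f"Metrics: {json.dumps(metrics, sort_keys=True)}\n"
        f"Transcript: {transcript_excerpt}"
    )
```

Requesting a fixed JSON schema from the model is one common way to keep LLM output parseable by the dashboard layer.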
Challenges We Faced
- Multimodal Integration: Coordinating real-time analysis across speech, text, and video was complex—especially aligning transcript timing to video frames for accurate feedback.
- Efficient AI Utilization: Balancing responsiveness and cost, while making the most of external AI APIs.
- Security & Privacy: Handling sensitive user uploads with secure storage, redaction, and data handling policies.
- Usability: Translating raw AI outputs into simple, motivating feedback (and beautiful charts!) that users could act on.
- Codebase Coordination: Merging Python, Node.js, and React teams and avoiding code duplication, especially during rapid prototyping.
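The transcript-to-video alignment problem above boils down to mapping word-level ASR timestamps onto frame indices. A minimal sketch, assuming word dicts with a `start` time in seconds (Deepgram's actual response schema may differ):

```python
def word_to_frame(word_start_s: float, fps: float) -> int:
    """Map a word's start time (seconds) to the nearest video frame index."""
    return round(word_start_s * fps)

def align_words_to_frames(words: list[dict], fps: float) -> list[dict]:
    """Attach a frame index to each timestamped word so speech cues can be
    cross-referenced with vision cues (expression, posture) at that moment."""
    return [{**w, "frame": word_to_frame(w["start"], fps)} for w in words]
```

Once each word carries a frame index, feedback like "you looked away while answering the salary question" becomes a simple join between the speech and vision analysis streams.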
What's Next
We’re excited to open InterviewAI to the public, helping candidates everywhere land their dream roles. Next, we hope to add peer benchmarking, question banks for more targeted practice, and deeper explainability for all AI-generated feedback.
“Your best interview coach is always available—automated, unbiased, and powered by leading AI. Practice, analyze, improve. That’s InterviewAI.”
We’d love to hear your feedback, ideas, or partnership interests!
Built With
- css
- deepgram
- docker
- express.js
- git
- github
- html5
- javascript
- json
- mongodb
- mongoose
- node.js
- openai
- postcss
- pytest
- python
- react
- typescript
- vite
- yaml
- zustand