Inspiration
We all know someone incredibly talented—a brilliant developer or a visionary manager—who gets rejected simply because they freeze up during interviews. They have the hard skills, but they lack the practice to communicate them effectively under pressure.
I realized that most interview prep tools are passive: they give you text to read or generic tips. I wanted to build something active. I wanted to create a "flight simulator" for interviews that doesn't just check if your answer is correct, but checks if you sound confident, honest, and clear. The inspiration for QnAi was to bridge the gap between "qualified" and "hired" by using AI to mirror the high-pressure environment of a real interview.
What it does
QnAi is an advanced, voice-first interview coach. It allows users to:
- Practice Realistically: Users speak their answers to probable interview questions generated specifically for their target job, college, or scholarship application.
- Mock Interviews: Users engage in a full, conversational loop where the AI listens, understands context, and asks follow-up questions like a real hiring manager.
- Get Behavioral Feedback: I used ElevenLabs Scribe v2 to transcribe speech with high fidelity—capturing every "um," "ah," nervous pause, and stutter. This allows the backend to score the user's Confidence and Clarity based on actual delivery, not just the words.
- Resume Fact-Checking: QnAi is "Resume Aware." It cross-references the user's spoken answers against their uploaded PDF CV. If a user claims 10 years of experience when their CV says they were a student, the system flags it as a "Professional Integrity" risk.
- Granular Analysis: Users get a detailed scorecard with "Better Approach" suggestions to help them rephrase their answers professionally.
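The delivery scoring described above can be sketched in a few lines: count fillers in the verbatim transcript and penalize them relative to answer length. This is a minimal illustration of the idea; the filler list, weights, and function name are my assumptions, not the production algorithm.

```javascript
// Illustrative delivery scorer: counts filler words in a verbatim
// (disfluency-preserving) transcript and maps the filler rate to a
// 0-100 confidence score. Weights are hypothetical.
const FILLERS = ["um", "uh", "ah", "er", "hmm"];

function scoreDelivery(transcript) {
  const words = transcript.toLowerCase().match(/[a-z']+/g) || [];
  const fillerCount = words.filter((w) => FILLERS.includes(w)).length;
  // Penalize fillers relative to answer length, then clamp to 0-100.
  const fillerRate = words.length ? fillerCount / words.length : 0;
  const confidence = Math.max(0, Math.round(100 - fillerRate * 400));
  return { fillerCount, confidence };
}
```

Note that this only works if the transcription layer keeps the fillers in the first place, which is exactly why a disfluency-preserving model matters here.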
How I built it
I built QnAi using the MERN Stack (MongoDB, Express, React, Node.js) powered by two cutting-edge AI models:
- ElevenLabs Scribe v2: This is the sensory layer. I stream user audio to the Scribe v2 model. Crucially, I utilized its ability to capture disfluencies and nuances. I didn't want clean text; I wanted accurate text that reflected the user's anxiety levels (stammers, fillers).
- Google Gemini 2.5 Flash: This is the reasoning layer. I feed the raw Scribe transcription + the user's PDF Resume + the Job Description into Gemini. I engineered complex prompts to make Gemini act as a strict hiring committee, analyzing the transcript for contradictions against the resume and scoring the "Confidence" based on the Scribe data.
- Frontend: Built with React and Vite, featuring a modern Glassmorphism UI and a custom audio recorder for capturing user input.
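The "strict hiring committee" reasoning layer boils down to how the context window is assembled before the Gemini call. Here is a rough sketch of that assembly step; the exact wording, field names, and output schema are assumptions for illustration, not the real prompt.

```javascript
// Hypothetical sketch of assembling the auditor prompt: the spoken
// transcript, the parsed resume, and the job description are combined
// into one context block with strict anti-hallucination instructions.
function buildAuditorPrompt({ resumeText, jobDescription, transcript }) {
  return [
    "You are a strict hiring committee auditor.",
    "Only flag a contradiction when the transcript and resume clearly",
    "disagree (e.g. mismatched dates or titles). Do not invent inconsistencies.",
    "",
    "JOB DESCRIPTION:\n" + jobDescription,
    "RESUME:\n" + resumeText,
    "CANDIDATE TRANSCRIPT (verbatim, fillers preserved):\n" + transcript,
    "",
    "Return JSON with: confidence, clarity, integrityFlags, betterApproach.",
  ].join("\n");
}
```

The key design choice is injecting all three sources into a single prompt, so the model can cross-reference the answer against the resume instead of judging it in isolation.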
Challenges I ran into
- Prompt Engineering for "Lies": Getting the AI to accurately detect factual inconsistencies between a spoken answer and a PDF resume was tricky. I had to refine the system instructions to make it a "strict auditor" so it wouldn't hallucinate contradictions but would catch obvious ones (like date mismatches).
- Handling Audio Formats: Ensuring the audio recorded in the browser (often WebM) was correctly processed and sent to the Scribe v2 API without losing quality or causing encoding errors required some deep diving into Recorder.js and audio blobs.
- Balancing Feedback: Early versions were too harsh! I had to tune the scoring algorithm to ensure the feedback was constructive ("Here is a better way to say this") rather than just critical ("You said 'um' 50 times").
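One small pattern that helps with the audio-format challenge is probing for a supported recording MIME type before creating the recorder, rather than hard-coding WebM. This is a hedged sketch, not the app's actual recorder code: the support check is injected as a callback so it can run outside a browser, but in the real frontend it would be `MediaRecorder.isTypeSupported`.

```javascript
// Sketch of a recording-format fallback: try preferred MIME types in
// order and return the first one the environment supports. `isSupported`
// is injected (in a browser, pass MediaRecorder.isTypeSupported).
const CANDIDATE_TYPES = [
  "audio/webm;codecs=opus", // preferred: well supported in Chromium/Firefox
  "audio/webm",
  "audio/mp4",              // Safari fallback
];

function pickRecordingMimeType(isSupported) {
  // Empty string means "let the browser pick its default".
  return CANDIDATE_TYPES.find(isSupported) || "";
}
```

The chosen type would then be passed as `{ mimeType }` to the `MediaRecorder` constructor, and sent along with the blob so the backend knows what encoding to expect.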
Accomplishments that I'm proud of
- The "Lie Detector": I am incredibly proud of the "Resume Awareness" feature. Seeing the AI correctly flag a user for claiming fake experience at Google (because it contradicted their uploaded CV) was a huge "Aha!" moment.
- Using Scribe for Behavior, Not Just Text: I successfully turned a transcription tool into a behavioral analysis tool. By preserving the "ums" and "ahs" via Scribe, I created a metric for confidence that standard Speech-to-Text APIs simply filter out.
- Glassmorphism UI: The app looks and feels professional, making the stressful activity of interview prep feel a bit more sleek and manageable.
What I learned
- Nuance Matters: I learned that for coaching, clean transcription is actually bad transcription. The flaws in human speech are where the most valuable coaching data lies. Scribe v2's ability to keep those flaws was essential.
- Context is King: Generic interview questions are useless. Injecting the specific Job Description and Resume into the context window completely transformed the quality of the mock interview, making it feel specific and high-stakes.
What's next for QnAi
- Tone Analysis: I plan to integrate deeper audio analysis to detect tone (monotone vs. dynamic) to give feedback on vocal energy.
- Mobile App: Building a React Native version so users can practice their elevator pitch while on a walk or commute.
- Company Integrations: Allowing companies to upload their own interview scripts so candidates can pre-screen themselves before talking to a human recruiter.
Built With
- elevenlabs
- express.js
- gemini
- mongodb
- node.js
- react