About the Project

Video

https://drive.google.com/file/d/1haSRF50eij493tvS-TPtPefR-FHV02bo/view?usp=sharing

Inspiration

Technical interview preparation today is fragmented. Candidates typically practice coding problems on one platform and behavioral questions on another, and rarely get the opportunity to simulate a full, realistic interview environment. What is missing is the experience of an actual interview: thinking out loud, structuring answers under pressure, and receiving meaningful feedback on both communication and technical ability.

This project was inspired by that gap. The goal was to build an AI-driven interview simulator that replicates the dynamics of a real interview, not just isolated question practice. We wanted users to experience the full loop: being asked questions, responding naturally (including by voice), solving coding problems, and receiving structured feedback.


What We Built

We developed an AI Interview Simulator that supports:

  • Role-based interview sessions (e.g., backend engineer, frontend developer)
  • Mixed question types:
    • Behavioral
    • Technical discussion
    • Situational/problem-solving
    • Coding questions
  • Voice-based responses with transcription
  • A coding environment integrated into the interview flow
  • AI-generated evaluation with structured scoring and feedback
  • Session summaries and historical tracking

A key differentiator is the ability to speak while solving coding problems, allowing the system to evaluate not only the final solution but also the candidate’s reasoning process, much like a real-world interview.


How We Built It

The system follows a modular architecture with clear separation between frontend interaction and backend intelligence.

Frontend

Built using React, the frontend handles the full interview experience including session flow, coding interaction, and speech capture. A Monaco-based editor is used for coding questions, while browser-based speech recognition captures user responses in real time. The interface maintains state for the current question, transcript, code, and session progress, ensuring a smooth and continuous experience.
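The per-question state the frontend tracks can be sketched as a plain reducer, the shape React's `useReducer` expects. This is an illustrative sketch, not the project's actual code; all action and field names are assumptions.

```javascript
// Hypothetical shape of the interview state: current question, live speech
// transcript, code buffer, and overall session progress.
const initialState = {
  questionIndex: 0,
  currentQuestion: null,
  transcript: "",
  code: "",
};

function sessionReducer(state, action) {
  switch (action.type) {
    case "QUESTION_RECEIVED":
      // A new question resets the per-question transcript and code buffer.
      return { ...state, currentQuestion: action.question, transcript: "", code: "" };
    case "TRANSCRIPT_APPENDED":
      // Speech recognition delivers results incrementally; append as they arrive.
      return { ...state, transcript: state.transcript + action.text };
    case "CODE_CHANGED":
      return { ...state, code: action.code };
    case "QUESTION_SUBMITTED":
      return { ...state, questionIndex: state.questionIndex + 1 };
    default:
      return state;
  }
}
```

Keeping all of this in a single reducer is one way to avoid the transcript, editor, and question flow drifting out of sync.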

Backend

The backend is built using Node.js and acts as the orchestration layer for the entire interview lifecycle. It manages session state, coordinates question generation, evaluates answers, and produces final summaries. It also ensures structured data flow between the frontend and the AI layer.
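The session lifecycle the orchestration layer manages could look roughly like the following. This is a simplified sketch under assumed names (`InterviewSession`, `recordAnswer`, a numeric `score` field), not the actual backend implementation.

```javascript
// Minimal session object: holds the planned question mix, records each
// evaluated answer, and produces the final summary.
class InterviewSession {
  constructor(role, questionPlan) {
    this.role = role;          // e.g. "backend engineer"
    this.plan = questionPlan;  // ordered mix of question types
    this.answers = [];
    this.index = 0;
  }

  currentQuestionType() {
    return this.index < this.plan.length ? this.plan[this.index] : null;
  }

  recordAnswer(answer, evaluation) {
    this.answers.push({ type: this.plan[this.index], answer, evaluation });
    this.index += 1;
  }

  isComplete() {
    return this.index >= this.plan.length;
  }

  summary() {
    // The final summary aggregates the per-question evaluations.
    const total = this.answers.reduce((sum, a) => sum + a.evaluation.score, 0);
    return {
      role: this.role,
      questionsAnswered: this.answers.length,
      averageScore: this.answers.length ? total / this.answers.length : 0,
    };
  }
}
```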

AI Layer

A Large Language Model (LLM) powers the core intelligence of the system. It is responsible for generating interview questions, evaluating responses, and producing structured feedback. The model receives rich context, including the current question, user response, coding input, and transcript of spoken reasoning.

Evaluation is designed to go beyond simple correctness. It considers multiple dimensions such as technical accuracy, clarity of communication, and problem-solving approach.
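Grounding the evaluation means the model sees everything the candidate produced. A hedged sketch of how that context could be assembled into a single prompt (field names and wording are illustrative, not the project's actual prompts):

```javascript
// Assemble the full evaluation context: question, written answer, code,
// and the transcript of spoken reasoning.
function buildEvaluationPrompt({ question, answer, code, transcript }) {
  return [
    "You are an interview evaluator. Score the candidate on technical",
    "accuracy, clarity of communication, and problem-solving approach (0-10 each).",
    "",
    `Question: ${question}`,
    `Written answer: ${answer || "(none)"}`,
    `Code submitted:\n${code || "(none)"}`,
    `Spoken reasoning (transcript): ${transcript || "(none)"}`,
    "",
    'Respond as JSON: {"technical": n, "communication": n, "problemSolving": n, "feedback": "..."}',
  ].join("\n");
}
```

Asking for structured JSON output makes the downstream scoring and session summary straightforward to compute.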

A simplified conceptual scoring model can be represented as:

Total Score = w1 · S_technical + w2 · S_communication + w3 · S_problem-solving

where w1, w2, and w3 are weighting factors that depend on the question type.
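In code, the weighted model above is a small lookup-and-sum. The weight values below are illustrative assumptions, not the project's tuned values:

```javascript
// Illustrative per-question-type weights; each row sums to 1.0.
const WEIGHTS = {
  coding:     { technical: 0.6, communication: 0.1, problemSolving: 0.3 },
  behavioral: { technical: 0.1, communication: 0.6, problemSolving: 0.3 },
  technical:  { technical: 0.4, communication: 0.3, problemSolving: 0.3 },
};

// Total Score = w1 * S_technical + w2 * S_communication + w3 * S_problem-solving
function totalScore(questionType, scores) {
  const w = WEIGHTS[questionType];
  return (
    w.technical * scores.technical +
    w.communication * scores.communication +
    w.problemSolving * scores.problemSolving
  );
}
```

For example, with coding weights, a candidate scoring 10 on technical accuracy but 0 elsewhere lands at 6 out of 10.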

Speech Integration

Speech-to-text is used to capture user responses in real time, while text-to-speech enables the system to deliver questions and feedback audibly. Transcripts are stored and used as part of the evaluation process, particularly for coding questions where reasoning plays a critical role.


Challenges We Faced

Repetition and Question Quality

One of the earliest issues was repetitive question generation, where the system would produce semantically similar questions with slightly different wording. This required implementing question tracking and introducing lightweight semantic similarity checks to ensure variety.
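One lightweight way to implement such a similarity check (a sketch under assumptions, not necessarily the exact check the project uses) is token-overlap similarity over previously asked questions:

```javascript
// Jaccard similarity over lowercase word sets: a cheap proxy for
// "semantically similar question with slightly different wording".
function jaccardSimilarity(a, b) {
  const tokens = (s) => new Set(s.toLowerCase().match(/[a-z0-9']+/g) || []);
  const setA = tokens(a);
  const setB = tokens(b);
  const intersection = [...setA].filter((t) => setB.has(t)).length;
  const union = new Set([...setA, ...setB]).size;
  return union === 0 ? 0 : intersection / union;
}

// Reject a newly generated question if it overlaps too much with any prior one.
function isTooSimilar(candidate, askedQuestions, threshold = 0.6) {
  return askedQuestions.some((q) => jaccardSimilarity(candidate, q) >= threshold);
}
```

Embedding-based cosine similarity would catch paraphrases that share few words, at the cost of an extra model call per generated question.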

Coding vs Interview Balance

The system initially leaned too heavily toward coding questions, which made the experience feel unrealistic. We had to redesign the flow to maintain a balance between behavioral, technical, and coding questions.

Speech Integration in Coding Flow

Integrating speech during coding introduced unexpected complexity. Managing continuous transcription alongside an active code editor required careful state handling. We also encountered issues where speech input would truncate prematurely, which required debugging component lifecycle and event handling.

Meaningful Evaluation

Early versions of the system produced generic feedback that did not reflect the user’s actual input. Improving this required restructuring prompts and ensuring that evaluation was grounded in the provided code, transcript, and answer.

System Reliability

The system depends on multiple external services, including language models and speech APIs. Ensuring consistent performance during demos required simplifying the pipeline and reducing points of failure.


What We Learned

  • Prompt design is as important as system design when working with LLMs
  • Realistic user experience depends heavily on flow, not just features
  • Combining speech and coding introduces complex state management challenges
  • AI systems require strong guardrails to avoid generic or repetitive outputs
  • Simplicity and reliability are often more valuable than adding more features

Final Thoughts

This project goes beyond a typical practice tool. It attempts to simulate the experience of interviewing by combining communication, problem-solving, and technical execution into a single system.

While still an MVP, it provides a strong foundation for building a more advanced, production-ready interview training platform.

Built With

  • React
  • Node.js
  • Monaco Editor
  • Browser speech recognition (speech-to-text and text-to-speech)
  • Large Language Model (LLM) APIs
