Inspiration
It all started with a personal challenge: I had applied for an apprenticeship at my dream company. While grinding LeetCode problems, I realized that solving algorithm problems alone wasn't enough. I was missing the core experience: the real-time interaction, the pressure of a live interview, and having someone actually follow my coding logic. I decided to build the preparation tool I wished I had.
What it does
mockInterview.ai is a high-performance, multimodal AI coaching agent. Unlike traditional simulators, it:
- Speaks and listens with near-zero latency using the Gemini Live API.
- Sees your screen: It follows your code and comments on your architecture in real-time.
- Grounds everything in YOUR experience: For behavioral interviews, the agent analyzes your specific Resume/CV and target Job Description to ask personalized, deep-dive questions about your actual past projects.
- Coaches in depth: It maintains consistent conversations for 30+ minutes and generates a structured performance report based on full video analysis.
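To make the "structured performance report" idea concrete, here is a minimal sketch of what such a report could look like as plain data. The class names, the 1-5 scale, and the field names are all assumptions for illustration, not the project's actual schema.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class SectionScore:
    """One scored dimension of the interview (names are illustrative)."""
    name: str
    score: int       # assumed 1-5 scale
    evidence: str    # short observation backing the score


@dataclass
class PerformanceReport:
    """Hypothetical report shape; the real schema is not described here."""
    candidate: str
    duration_minutes: int
    sections: List[SectionScore] = field(default_factory=list)

    def overall(self) -> float:
        """Mean of section scores, or 0.0 when nothing was scored."""
        if not self.sections:
            return 0.0
        return sum(s.score for s in self.sections) / len(self.sections)

    def to_json(self) -> str:
        """Serialize the report, including the derived overall score."""
        payload = asdict(self)
        payload["overall"] = round(self.overall(), 2)
        return json.dumps(payload, indent=2)
```

A fixed shape like this is one way to keep post-session feedback comparable across interviews, since every session is scored on the same named dimensions.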
How we built it
We leveraged the "Google AI Stack" to its fullest:
- Google ADK (Agent Development Kit) to orchestrate the agent’s lifecycle and multi-turn logic.
- Gemini 2.5 Flash (Live API): The reactive core for native audio dialogue and live vision.
- Gemini File Search (RAG): To ground behavioral sessions in the user's CV/Resume for ultra-personalized questioning.
- Gemini 3.1 Flash-Lite (preview): Powers the post-session structured feedback and video analysis.
- Backend: FastAPI deployed on Cloud Run with optimized scaling (zero cold starts).
- Frontend: React & TypeScript with tldraw and Monaco Editor.
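The stack above has to carry several modalities (audio, screen frames, editor contents) over one connection to the model. The sketch below shows one possible way to multiplex them with tagged envelopes, using only Python's standard asyncio. The `Envelope`, `pump`, and `multiplex` names are hypothetical, and the sketch does not use the Gemini Live API or the ADK; it only illustrates the fan-in pattern.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Envelope:
    """Tagged chunk so one upstream channel can carry several modalities."""
    kind: str        # e.g. "audio", "video", "code" (illustrative labels)
    payload: bytes


async def pump(kind: str, source: asyncio.Queue, upstream: asyncio.Queue) -> None:
    """Forward chunks from one modality queue to the shared upstream queue.

    A None item marks the end of that modality's stream.
    """
    while True:
        chunk = await source.get()
        if chunk is None:
            break
        await upstream.put(Envelope(kind, chunk))


async def multiplex(sources: Dict[str, asyncio.Queue]) -> List[Envelope]:
    """Run one pump per modality and collect everything sent upstream."""
    upstream: asyncio.Queue = asyncio.Queue()
    await asyncio.gather(*(pump(k, q, upstream) for k, q in sources.items()))
    sent = []
    while not upstream.empty():
        sent.append(upstream.get_nowait())
    return sent
```

In a real service the upstream queue would be drained continuously by the task that writes to the model connection, rather than collected at the end as it is here.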
Challenges we ran into
The biggest technical hurdles were session durability and grounding. Sustaining a 30-minute interview while keeping the agent strictly grounded in the candidate's specific CV and a complex Job Description required carefully tuned prompt engineering and robust RAG integration via the ADK.
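As a rough illustration of the grounding side of this problem, the sketch below assembles a system instruction from retrieved CV snippets and the job description under a character budget. The function name, the prompt wording, and the `max_chars` budget are assumptions for illustration; retrieval itself (File Search) is assumed to have already produced the snippets.

```python
def build_system_instruction(cv_snippets, job_description, max_chars=4000):
    """Assemble a grounded interviewer prompt from retrieved CV snippets.

    Snippets are added in order until the character budget would be exceeded.
    """
    header = (
        "You are a technical interviewer. Ask only about projects and skills "
        "that appear in the CV excerpts below. If the candidate mentions "
        "something not in the excerpts, ask them to elaborate rather than "
        "assuming details."
    )
    parts = [header, "JOB DESCRIPTION:", job_description.strip(), "CV EXCERPTS:"]
    # Each part costs its length plus one joining newline.
    budget = max_chars - sum(len(p) + 1 for p in parts)
    for i, snippet in enumerate(cv_snippets, start=1):
        line = f"[{i}] {snippet.strip()}"
        if len(line) + 1 > budget:
            break  # stop before exceeding the character budget
        parts.append(line)
        budget -= len(line) + 1
    return "\n".join(parts)
```

Numbering the excerpts gives the agent something explicit to refer back to, which is one simple way to keep a long conversation anchored to the same source material.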
Accomplishments that we're proud of
- True Personalization: The agent doesn't just ask generic questions; it challenges you on the specific technical choices mentioned in your uploaded CV.
- Live Vision: The agent truly understands what you draw or write, creating a mind-blowing "pair-programming" feel.
- Architectural Stability: Successfully combining bidirectional native audio, a video stream, and a live code editor into one fluid experience.
What we learned
We pushed the boundaries of multimodal AI. We learned that grounding is the key to moving from a "chatbot" to a "coach": an agent that knows your history and the job you're targeting provides far more value than a generic LLM. We also learned that a great agent isn't just an LLM that talks; it's a combination of state management, proactive behavior, and modality choice. Integrating Affective Dialog (voice tone sensitivity) also showed us how AI can help calm a nervous candidate.
What's next for mockInterview.ai
- Industry-Agnostic Scaling: Moving beyond tech to support Medical (patient simulations), Legal (moot courts), and Sales (high-stakes negotiations) by leveraging Gemini's ability to analyze specialized documents and charts in real-time.
- Full Camera Integration: Allowing the agent to observe non-verbal communication, eye contact, and body language during behavioral rounds for a more holistic coaching experience.
- Voice Cloning & Empathy: Letting users practice with specific personas or clone their favorite mentor's voice for a more comfortable training environment.
- Multi-agent Panel: Simulating a board of three distinct AI interviewers with different "personalities" (The Skeptic, The Supportive Lead, The Manager) to test the candidate's ability to handle group dynamics.
Special Thanks 🙏
A huge shout-out to the Agent Starter Pack team! Their modular architectural patterns and foundational ADK blueprints were instrumental in helping us structure this project correctly from day one. By building on top of their optimal starting points, we were able to focus all our energy on pushing the multimodal boundaries with Gemini Live.
Built With
- adk
- cloud-run
- gcloud
- gcloud-storage
- gemini-2.5-flash-native-audio
- gemini-3.1-flash-lite-preview
- python
- react
- sqlite
- terraform
- typescript
- websocket