🚀 Project Story: AI Technical Interviewer
Inspiration
As second-year software engineering students at Polytechnique Montréal, we have experienced firsthand the high-pressure environment of internship hunting. We realized that while many students are brilliant coders, the real challenge lies in verbalizing thought processes during technical interviews. We were inspired to build a tool that bridges the gap between solving a problem on a screen and explaining it to a human, creating a safe space for students to practice their communication and technical skills simultaneously.
What We Learned
This project was a deep dive into the world of real-time AI and high-performance web architecture:
- Stateful AI Interactions: We learned that simple request-response cycles aren't enough for a natural conversation. Mastering the `client.chats` session management in the Google Gen AI SDK was crucial for maintaining context across multiple turns.
- Real-time Voice Orchestration: Integrating ElevenLabs Scribe taught us the nuances of WebSocket streams, specifically how to handle "partial" vs. "committed" transcripts to ensure the AI doesn't interrupt the user mid-sentence.
- System Prompt Engineering: We discovered the power of dynamic system instructions—learning how to inject live problem data into the AI’s "personality" so it acts as a specific subject matter expert for every unique challenge.
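The idea of injecting live problem data into the system instruction can be sketched roughly as follows. This is an illustration only: the function name `build_system_instruction`, the dictionary keys (`title`, `description`, `constraints`), and the prompt wording are all assumptions, not the project's actual prompt.

```python
from typing import Any, Dict, List


def build_system_instruction(problem: Dict[str, Any]) -> str:
    """Render one problem's data into an interviewer system prompt.

    The keys used here (title, description, constraints) are hypothetical;
    adapt them to whatever shape the problem data really has.
    """
    lines: List[str] = [
        "You are a technical interviewer running a mock coding interview.",
        f"The candidate is working on: {problem['title']}.",
        f"Problem statement: {problem['description']}",
    ]

    # Constraints are optional, so only add the section when there are some.
    constraints = problem.get("constraints", [])
    if constraints:
        lines.append("Constraints:")
        lines.extend(f"- {c}" for c in constraints)

    lines.append(
        "Ask the candidate to explain their reasoning out loud, "
        "and do not reveal the full solution."
    )
    return "\n".join(lines)
```

Because the prompt is rebuilt from data on every call, switching problems only requires passing a different dictionary; nothing about the interviewer's persona is hard-coded per problem.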
How We Built It
Our stack was chosen for speed, reliability, and modern developer experience:
- Frontend: Built with React and Vite. We used custom hooks to manage the complex states of the voice recorder and the multi-turn chat interface.
- Backend: A Flask (Python) server serves as the "brain," orchestrating the flow of data between the user, the LLM, and the Speech-to-Text engine.
- AI Engine: We utilized Google Gemini 1.5 Flash for its speed and massive context window. We implemented a dynamic session handler that reconstructs the system prompt whenever a new problem is loaded.
- Voice STT: ElevenLabs Scribe provided near-instant transcription, which we secured using server-side single-use token generation.
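A session handler that rebuilds the system prompt when a new problem is loaded might look something like the sketch below. The class name `InterviewSessionHandler`, the `id` key, and both injected callables are hypothetical. In a real backend, `create_chat` would presumably wrap the SDK's `client.chats.create(...)` call, but a plain callable is used here so the sketch runs without credentials.

```python
from typing import Any, Callable, Dict, Optional


class InterviewSessionHandler:
    """Keeps one chat per loaded problem and rebuilds the prompt on change."""

    def __init__(
        self,
        build_prompt: Callable[[Dict[str, Any]], str],
        create_chat: Callable[[str], Any],
    ) -> None:
        # Both callables are injected so the handler stays independent of
        # any particular LLM SDK (an assumption made for this sketch).
        self._build_prompt = build_prompt
        self._create_chat = create_chat
        self._problem_id: Optional[str] = None
        self._chat: Any = None

    def load_problem(self, problem: Dict[str, Any]) -> Any:
        """Return the chat for `problem`, starting a fresh one if it changed."""
        if self._chat is None or problem["id"] != self._problem_id:
            # New problem: rebuild the system prompt and start a new session,
            # discarding the previous conversation history.
            system_instruction = self._build_prompt(problem)
            self._chat = self._create_chat(system_instruction)
            self._problem_id = problem["id"]
        return self._chat
```

Reloading the same problem returns the existing session, so multi-turn context survives; loading a different problem starts from a clean history with a new prompt.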
Challenges We Faced
The road to a working MVP was full of technical puzzles:
- The "Circular Import" Trap: As our backend grew, we hit a wall with Python circular dependencies between our main server and our AI integration logic. We had to refactor our architecture to decouple data from logic, ensuring a clean and scalable codebase.
- Strict Type Validation: The latest Google Gen AI SDK is built on Pydantic and is extremely strict. We spent hours debugging validation errors until we perfected the exact dictionary structure required for `Content` and `Part` types.
- Latency vs. Accuracy: Balancing the speed of the voice-to-text conversion while ensuring the AI waited for the "Final Transcript" (the `scribe.commit()` call) before responding was a major UX hurdle that we solved with careful asynchronous timing.
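The "wait for the committed transcript" rule can be illustrated with a small `asyncio` sketch. The event shape used here (dictionaries with `type` set to `"partial"` or `"committed"` and a `text` field) is an assumption made for illustration and is not the Scribe wire format; `replay` and `collect` are hypothetical helpers that stand in for a live WebSocket stream.

```python
import asyncio
from typing import AsyncIterator, Dict, Iterable, List

Event = Dict[str, str]


async def committed_utterances(events: AsyncIterator[Event]) -> AsyncIterator[str]:
    """Yield only finalized utterances; partial transcripts are skipped."""
    async for event in events:
        # Partials may still change, so they must never trigger an AI reply.
        if event.get("type") != "committed":
            continue
        text = event.get("text", "").strip()
        if text:
            yield text


async def replay(events: Iterable[Event]) -> AsyncIterator[Event]:
    """Stand-in for a live stream: replays a fixed list of events."""
    for event in events:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield event


def collect(events: Iterable[Event]) -> List[str]:
    """Run the filter over a recorded event list and return the utterances."""

    async def _run() -> List[str]:
        return [u async for u in committed_utterances(replay(events))]

    return asyncio.run(_run())
```

With this split, partial transcripts can still drive a live caption in the UI, while only the values yielded by `committed_utterances` are forwarded to the LLM.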
Technical Conclusion
To provide high-quality feedback, our AI analyzes the efficiency of the user's code. For example, when practicing the Two Sum problem, the interviewer is programmed to steer the user away from a brute-force $O(n^2)$ solution toward an optimized hash map approach:
$$\text{Time Complexity}: O(n)$$
$$\text{Space Complexity}: O(n)$$
This ensures that the "Technical" in Technical Interviewer is always mathematically grounded.
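For reference, the hash map approach to Two Sum that the interviewer steers toward can be sketched as below. This is a standard textbook formulation and not code taken from the project; the function name and return convention are assumptions.

```python
from typing import Dict, List, Optional, Tuple


def two_sum(nums: List[int], target: int) -> Optional[Tuple[int, int]]:
    """Return indices (i, j) with i < j and nums[i] + nums[j] == target.

    Runs in O(n) time and O(n) extra space: each element is visited once,
    and the dictionary stores at most one entry per element.
    """
    seen: Dict[int, int] = {}  # value -> index where it was first seen
    for i, x in enumerate(nums):
        j = seen.get(target - x)
        if j is not None:
            return (j, i)
        # setdefault keeps the earliest index when values repeat.
        seen.setdefault(x, i)
    return None  # no pair adds up to target
```

The brute-force version compares every pair of elements, which is where the $O(n^2)$ cost comes from; the dictionary replaces the inner loop with a constant-time lookup on average.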