About the project

Early-stage founders often get vague encouragement instead of honest pressure-testing. Friends are biased, peers may hold back, and standard chatbots tend to become helpful assistants instead of real challengers. We built Devil’s Advocate to solve that problem: a live voice AI opponent that forces founders to defend their assumptions out loud.

This project was built for the Google Gemini Live Agent Challenge and also serves as part of our UIUC CS 568 (User-Centered Machine Learning) Spring 2026 final project. Our broader research question is whether adversarial, real-time AI feedback can help people refine startup ideas more effectively than traditional peer review or a normal LLM conversation.

At the center of the system is a live spoken debate agent powered by the Gemini Live API. A user states their idea verbally or uploads supporting materials like a pitch deck or business plan, and the agent pushes back in real time. The experience is intentionally conversational and interruptible, so it feels less like filling out a form and more like being challenged by a skeptical investor, co-founder, or judge.
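
To make the debate loop concrete, here is a minimal sketch of how a backend could open an adversarial session with the google-genai SDK's Live API. The model id, system prompt, and the single text turn are illustrative rather than our exact production values, and in the real app audio streams in from the browser over Socket.IO instead of a hard-coded message.

```python
import asyncio
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

DEBATE_PROMPT = (
    "You are a skeptical investor debating a founder. Challenge every claim, "
    "ask for evidence, and never drift into being a helpful assistant."
)

async def run_debate_turn() -> None:
    config = types.LiveConnectConfig(
        response_modalities=["AUDIO"],
        system_instruction=DEBATE_PROMPT,
    )
    async with client.aio.live.connect(
        model="gemini-2.0-flash-live-001", config=config
    ) as session:
        # One text turn stands in for the founder's streamed speech.
        await session.send_client_content(
            turns=types.Content(
                role="user",
                parts=[types.Part(text="My startup is an AI tutor for med students.")],
            ),
            turn_complete=True,
        )
        async for message in session.receive():
            content = message.server_content
            if content and content.model_turn:
                for part in content.model_turn.parts:
                    if part.inline_data:
                        pass  # audio bytes; in the app these are forwarded to the browser

asyncio.run(run_debate_turn())
```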

We designed the system to be grounded rather than purely generative. The backend retrieves context from a curated startup knowledge base, incorporates user-uploaded documents, and uses Gemini’s Google Search tool support to reference real competitors, funding data, market signals, and business benchmarks. During the debate, a separate lightweight classifier tracks whether the founder defended, conceded, deflected, or introduced a new claim. At the end of the session, the system generates both a judge scorecard and a post-debate report with strengths, weaknesses, the biggest unanswered gap, and recommended next steps.
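
The move classifier does not need to live inside the Live session; a small, separate model call per founder turn is enough. Below is a simplified sketch of that idea. The prompt wording, model id, and fallback label are illustrative, not our exact configuration.

```python
from google import genai

client = genai.Client()

MOVES = ["defended", "conceded", "deflected", "new_claim"]

def classify_founder_move(agent_challenge: str, founder_reply: str) -> str:
    """Label the founder's latest reply with one debate move."""
    prompt = (
        "Classify the founder's reply to the challenge as exactly one of: "
        f"{', '.join(MOVES)}.\n\n"
        f"Challenge: {agent_challenge}\n"
        f"Reply: {founder_reply}\n"
        "Answer with a single label."
    )
    response = client.models.generate_content(
        model="gemini-2.0-flash-lite",
        contents=prompt,
    )
    label = (response.text or "").strip().lower()
    return label if label in MOVES else "new_claim"  # conservative fallback
```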

We built the frontend with React + Vite and the backend with FastAPI + Socket.IO, then deployed the system using Firebase Hosting for the frontend and Google Cloud / Cloud Run for the backend. We used Firebase Auth, Firebase Storage, and Firestore to support authentication, document uploads, and consented session logging. For retrieval, we used ChromaDB with a lightweight abstraction layer so the architecture can evolve later if needed.
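
The retrieval abstraction is intentionally thin: the rest of the backend talks to a small interface, and ChromaDB is simply the current implementation behind it. A minimal sketch of that pattern follows; the class and method names are illustrative.

```python
from typing import Protocol
import chromadb

class VectorStore(Protocol):
    def add(self, ids: list[str], documents: list[str]) -> None: ...
    def query(self, text: str, k: int = 5) -> list[str]: ...

class ChromaStore:
    """ChromaDB-backed implementation of the VectorStore interface."""

    def __init__(self, collection_name: str = "startup_kb"):
        self._client = chromadb.PersistentClient(path="./chroma")
        self._collection = self._client.get_or_create_collection(collection_name)

    def add(self, ids: list[str], documents: list[str]) -> None:
        # Chroma embeds documents with its default embedding function
        # unless a custom one is supplied.
        self._collection.add(ids=ids, documents=documents)

    def query(self, text: str, k: int = 5) -> list[str]:
        result = self._collection.query(query_texts=[text], n_results=k)
        return result["documents"][0]
```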

One of the biggest challenges was balancing the agent’s tone. If it is too soft, the experience loses its value; if it is too aggressive, it becomes unhelpful. We also had to solve a number of real-time interaction challenges: handling interruption cleanly, keeping audio and transcript state synchronized, grounding the agent without introducing too much latency, and making the overall experience robust enough for a live demo. On top of that, we had to think carefully about privacy, consent, and how to log research data responsibly.
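
As one example of the interruption work, here is a hypothetical sketch of how a barge-in could be handled on the Socket.IO side: when the browser detects the founder speaking over the agent, the server stops forwarding queued agent audio. The event names and per-connection state are illustrative; the real handler also has to reconcile the transcript with the audio that actually played before the cut-off.

```python
import socketio
from fastapi import FastAPI

app = FastAPI()
sio = socketio.AsyncServer(async_mode="asgi", cors_allowed_origins="*")
asgi_app = socketio.ASGIApp(sio, other_asgi_app=app)

# Per-connection flag: when True, stop forwarding the agent's queued audio chunks.
interrupted: dict[str, bool] = {}

@sio.event
async def connect(sid, environ):
    interrupted[sid] = False

@sio.on("barge_in")
async def on_barge_in(sid, data):
    # The founder started talking over the agent: drop pending audio so the
    # agent yields the floor instead of speaking over the user.
    interrupted[sid] = True
    await sio.emit("agent_audio_cleared", to=sid)

@sio.event
async def disconnect(sid):
    interrupted.pop(sid, None)
```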

Our biggest lesson was that building a strong live agent is not just a model problem; it is a systems and UX problem. The quality of the experience depends on prompt design, grounding quality, latency, interruption handling, authentication, data flow, and clear user feedback all working together. This project pushed us to think about multimodal AI not as a chatbot with extra features, but as an interactive system with real product and research implications.

Built With

  • chromadb
  • fastapi
  • firebase-auth
  • firebase-hosting
  • firebase-storage
  • firestore
  • gemini-flash-lite
  • gemini-live-api
  • google-cloud-run
  • google-genai-sdk
  • google-search-grounding
  • html2canvas
  • javascript
  • python
  • react
  • socket.io
  • uvicorn
  • vite