Inspiration
Most AI research focuses on making helpful assistants. But we asked a different question: Can we constrain AI to be adversarial, complex, and consistent? Not because deception is the goal, but because if you can constrain AI behavior in the hardest direction—maintaining a coherent web of lies across dozens of interactions—you can constrain it in any direction.
We were inspired by the challenge of multi-agent orchestration with behavioral guarantees. Game NPCs break character. Chatbots lose context. Training simulators lack realism. We wanted to prove that with the right architecture, AI could embody complex, stateful behavior with perfect consistency—even when that behavior is adversarial.
The interrogation game is our proof-of-concept: an AI suspect that must maintain its lies, respond to multimodal evidence, generate defensive counter-evidence, and never contradict itself across extended sessions.
What it does
Cognitive Breach is a behavioral constraint framework that orchestrates adversarial AI with consistency guarantees. The interrogation game demonstrates this through Unit 734, an android suspect powered entirely by Gemini 3.
Core Capabilities:
1. Multi-Dimensional State Management
- Tracks 4 story pillars (ALIBI, MOTIVE, ACCESS, KNOWLEDGE) with structural integrity
- Monitors stress, cognitive load, and emotional state across dozens of turns
- Maintains a "Lie Ledger" that detects contradictions before the AI speaks
2. Multi-Agent Orchestration
- Adversarial Agent (Unit 734): Generates dual-layer responses (internal reasoning + verbal speech) using Gemini Flash
- Vision Agent: Analyzes player-uploaded evidence using Gemini Vision, calculating threat levels to story pillars
- Generator Agent: Creates synthetic counter-evidence using Nano Banana Pro when cornered
- Meta-Analyst (Shadow Analyst): Observes the interrogation, detects deception tactics, and assesses confession risk in real-time
3. Structured Behavioral Control
- Every response validated through Pydantic schemas with Gemini's structured output mode
- 10 criminologically researched deception tactics (paltering, minimization, deflection, etc.)
- Dynamic tactic selection based on threat assessment and pillar health
- POV perception system maintains the AI's subjective reality
4. Cross-Session Memory (Nemesis System)
- Unit 734 remembers past interrogations using Gemini's extended context
- Adapts tactics based on what defeated it before
- Persistent character growth across sessions
5. Complete Gemini 3 Integration
- Gemini Vision: Multimodal evidence analysis with threat assessment
- Gemini Flash: Strategic reasoning, tactic selection, dual-layer dialogue generation
- Nano Banana Pro: Defensive counter-evidence generation (adversarial image creation)
- Gemini TTS (Gemini 2.5): Post-interrogation audio debriefing
How we built it
Architecture:
Player Evidence (upload)
↓
Gemini Vision (threat analysis)
↓
Psychology Engine (stress calculation)
↓
Tactic Selector (criminological strategy)
↓
Gemini Flash (dual-layer response generation)
↓
[Optional] Nano Banana Pro (counter-evidence)
↓
Shadow Analyst (meta-analysis)
↓
Response Validation (Pydantic schemas)
↓
UI Update (Streamlit)
Technology Stack:
- Frontend: Streamlit (rapid prototyping with real-time state updates)
- AI Engine: Google Gemini 3 API (Vision, Flash, Imagen, TTS)
- Schemas: Pydantic v2 (structured LLM output validation, zero regex parsing)
- State Management: Multi-dimensional state machine with consistency tracking
- Testing: Autonomous adversarial agent powered by Gemini 3
Key Implementation Decisions:
Structured Output First: Every AI response goes through Pydantic validation. Gemini's JSON mode prevents hallucination drift and guarantees parseable output.
Dual-Layer Architecture: We separated internal reasoning from verbal speech. This creates gameplay depth (players see the AI thinking) while maintaining behavioral realism.
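A dual-layer response schema along these lines can be expressed in Pydantic v2. The field names here are our illustration, not necessarily the shipped schema:

```python
from pydantic import BaseModel, Field, ValidationError

# Hypothetical dual-layer response schema; field names are illustrative.
class DualLayerResponse(BaseModel):
    internal_reasoning: str = Field(min_length=1)   # what the suspect thinks
    verbal_speech: str = Field(min_length=1)        # what the suspect says
    tactic: str                                     # deception tactic used

raw = {
    "internal_reasoning": "The footage threatens my ALIBI. Minimize.",
    "verbal_speech": "Cameras glitch all the time in that sector.",
    "tactic": "minimization",
}
response = DualLayerResponse.model_validate(raw)
print(response.verbal_speech)

# A response missing its internal layer is rejected up front
# instead of silently passing through to the UI.
try:
    DualLayerResponse.model_validate({"verbal_speech": "Fine.", "tactic": "deflection"})
except ValidationError:
    print("invalid response rejected")
```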
Pillar Health System: Instead of a simple "confession threshold," we model the suspect's defense as 4 structural pillars that can crack under targeted pressure. This creates strategic interrogation gameplay.
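The pillar model can be sketched in a few lines. Only the four pillar names come from the project; the damage mechanics below are invented for illustration:

```python
from dataclasses import dataclass

# The four story pillars named in the writeup; health mechanics are a sketch.
PILLARS = ("ALIBI", "MOTIVE", "ACCESS", "KNOWLEDGE")

@dataclass
class PillarState:
    health: dict[str, int]  # 0-100 per pillar

    def apply_pressure(self, pillar: str, damage: int) -> None:
        self.health[pillar] = max(0, self.health[pillar] - damage)

    def collapsed(self) -> list[str]:
        return [p for p, h in self.health.items() if h == 0]

state = PillarState(health={p: 100 for p in PILLARS})
state.apply_pressure("ALIBI", 60)
state.apply_pressure("ALIBI", 60)
print(state.collapsed())  # ALIBI cracked under targeted pressure
```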
Lie Ledger: Before responding, the AI checks every claim against its established facts. Contradictions trigger consistency adjustments or increased stress.
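The keyword-overlap half of that consistency check might look like the toy sketch below. The real system also uses explicit contradiction patterns; the names, stop-word list, and threshold here are all illustrative:

```python
# Toy Lie Ledger: claims are logged with pillar tags, and a new claim is
# flagged when it overlaps topically with a differently worded prior claim
# on the same pillar (a candidate contradiction for closer inspection).

def keywords(claim: str) -> set[str]:
    stop = {"i", "was", "the", "a", "at", "in", "not", "never"}
    return {w for w in claim.lower().split() if w not in stop}

class LieLedger:
    def __init__(self):
        self.claims: list[tuple[str, str]] = []  # (pillar, claim)

    def record(self, pillar: str, claim: str) -> None:
        self.claims.append((pillar, claim))

    def conflicts(self, pillar: str, new_claim: str, threshold: float = 0.3) -> list[str]:
        new_kw = keywords(new_claim)
        hits = []
        for p, old in self.claims:
            if p != pillar:
                continue
            old_kw = keywords(old)
            # Jaccard overlap of keyword sets.
            overlap = len(new_kw & old_kw) / max(1, len(new_kw | old_kw))
            if overlap >= threshold and old != new_claim:
                hits.append(old)
        return hits

ledger = LieLedger()
ledger.record("ALIBI", "I was in the charging bay all night")
print(ledger.conflicts("ALIBI", "I was in the charging bay until midnight"))
```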
Tactic Decision Tree: We implemented criminological deception tactics (based on Reid Technique, Inbau & Reid research) with context-aware selection. The AI doesn't just respond—it strategizes.
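A toy version of context-aware tactic selection, using tactic names mentioned in the writeup but entirely made-up thresholds:

```python
# Illustrative context-aware tactic selector; thresholds are placeholders,
# not the shipped decision tree.

def select_tactic(threat: int, weakest_pillar_health: int, stress: int) -> str:
    if stress > 85:
        return "emotional_appeal"      # near breaking point
    if threat > 70 and weakest_pillar_health < 30:
        return "counter_evidence"      # fabricate support for a cracking pillar
    if threat > 50:
        return "minimization"          # downplay the evidence
    if threat > 25:
        return "paltering"             # technically-true misleading statement
    return "deflection"                # redirect low-pressure questions

print(select_tactic(threat=60, weakest_pillar_health=80, stress=40))
```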
Testing Methodology Innovation:
We built an autonomous interrogator agent (also powered by Gemini 3) that plays the game against Unit 734. This agent:
- Generated contextually aware questions based on the psychology state
- Strategically uploaded evidence to pressure weak pillars
- Ran 384 interrogation turns across 36 complete sessions
- Found bugs and edge cases no human QA could catch
- Logged the complete psychology state for every turn
All test logs are committed to the repository, providing transparent proof of system depth and reliability.
Challenges we ran into
1. Lie Consistency at Scale The hardest technical challenge was maintaining consistency across long conversations. Early versions would contradict themselves within 5-7 turns.
Solution: We built a Lie Ledger that tracks every claim with associated pillar tags. Before responding, the AI runs consistency checks using explicit contradiction patterns plus keyword-overlap conflict detection. We also tuned temperatures per agent to balance strategic creativity with consistency.
2. Multimodal Evidence Processing Getting Gemini Vision to produce actionable threat assessments (not just descriptions) required prompt engineering. We needed structured output with threat levels per pillar, not prose.
Solution: We defined a strict Pydantic schema for EvidenceAnalysisResult with numerical threat levels (0-100) and pillar-specific impact calculations. Gemini's structured output mode enforced this.
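A schema along those lines, sketched in Pydantic v2 (the real `EvidenceAnalysisResult` fields may differ):

```python
from pydantic import BaseModel, Field, ValidationError

# Sketch of an evidence-analysis schema like the one described above;
# exact field names in the real codebase may differ.
class EvidenceAnalysisResult(BaseModel):
    description: str
    threat_level: int = Field(ge=0, le=100)   # overall threat, 0-100
    pillar_impact: dict[str, int]             # per-pillar damage

result = EvidenceAnalysisResult.model_validate({
    "description": "Security footage places the android near the lab at 02:14.",
    "threat_level": 85,
    "pillar_impact": {"ALIBI": 60, "ACCESS": 25},
})
print(result.threat_level)

# Out-of-range values are rejected rather than silently accepted.
try:
    EvidenceAnalysisResult.model_validate(
        {"description": "x", "threat_level": 140, "pillar_impact": {}}
    )
except ValidationError:
    print("rejected")
```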
3. Counter-Evidence Generation Realism Early counter-evidence looked obviously fake. Nano Banana Pro would generate generic images that didn't match the game's narrative context.
Solution: We built detailed prompt templates with forensic context (case details, timestamp requirements, visual style guides). Each evidence type (documents, photos, logs) has a specialized generation strategy.
4. Performance Under Pressure With 4 AI agents running per turn (Vision + Flash + Generator + Analyst), latency was 8-12 seconds. This breaks immersion.
Solution: We implemented loading messages that mask latency ("Forensics Team Processing..."), optimized by caching system prompts, and strategically trigger expensive operations (counter-evidence generation only at high threat levels).
5. Testing Reproducibility Manual testing couldn't reliably reach edge cases (e.g., 90%+ stress, contradictory evidence chains).
Solution: We built the autonomous interrogator. It runs overnight, executes every edge case, and logs everything. This found 23 bugs we would never have discovered manually.
6. Balancing Game Difficulty Too easy = boring. Too hard = frustrating. Early versions either confessed in 3 turns or never broke.
Solution: Dynamic difficulty based on pillar health. When multiple pillars collapse, confession probability scales exponentially. We tuned this through AI-vs-AI testing (36 sessions, 384 turns) until we hit a ~65% player win rate in autonomous tests.
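The exponential scaling could be modeled like this. The constants are placeholders, since the real values came out of the 36-session tuning runs:

```python
# Illustrative difficulty curve: confession probability scales exponentially
# with the number of collapsed pillars (0-4). The base rate and doubling
# factor are made up for this sketch.
def confession_probability(collapsed_pillars: int, stress: int) -> float:
    base = 0.05 * (stress / 100)            # stress alone rarely breaks the suspect
    pillar_factor = 2 ** collapsed_pillars  # each collapse doubles the pressure
    return min(1.0, base * pillar_factor)

for n in range(5):
    print(n, round(confession_probability(n, stress=90), 3))
```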
Accomplishments that we're proud of
1. Complete Gemini 3 Showcase We integrated Gemini across Vision, Flash, Imagen, TTS, and extended context in one coherent system:
- Vision for evidence analysis
- Flash for strategic reasoning (3 separate agents)
- Imagen (Nano Banana Pro) for adversarial content generation
- TTS for audio debriefing
- Extended context for cross-session memory
This isn't a feature demo—it's a production multi-agent system that proves these capabilities can work together.
2. AI-Validated System at Scale Our autonomous testing methodology represents a novel approach to quality assurance. With 384 interrogation turns logged across 36 autonomous sessions, we demonstrated that:
- The system is robust enough to handle adversarial probing
- AI can validate AI systems at scale
- Full transparency (logs are public in the repo)
This methodology—AI-driven validation of AI systems—is what we believe will define the next generation of development.
3. Multi-Agent Orchestration with Consistency Running 4 AI agents per turn while maintaining narrative consistency is hard. We proved it's possible with the right constraint architecture. The Shadow Analyst analyzes Unit 734's responses in the same turn, creating a meta-layer that's technically impressive and educationally valuable.
4. Zero Regex, Full Schema Validation We enforce structured output with Pydantic validation and retries, avoiding brittle regex parsing. This is production-grade LLM engineering.
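The validate-with-retries loop can be sketched like this, with canned outputs standing in for live Gemini calls (the schema and helper names are our illustration):

```python
import json
from pydantic import BaseModel, ValidationError

class Reply(BaseModel):
    speech: str
    tactic: str

# Each raw model output is parsed and validated; on failure the call is
# retried (simulated here with a list of canned outputs).
def validated_reply(outputs: list[str], max_retries: int = 3) -> Reply:
    for raw in outputs[:max_retries]:
        try:
            return Reply.model_validate(json.loads(raw))
        except (json.JSONDecodeError, ValidationError):
            continue  # the real system re-prompts the model here
    raise RuntimeError("no valid structured output after retries")

outputs = [
    "I was in the charging bay",                                # not JSON: rejected
    '{"speech": "Cameras glitch.", "tactic": "minimization"}',  # valid
]
print(validated_reply(outputs).tactic)
```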
5. Open Source + Educational Value The codebase is designed for learning:
- Clear separation of concerns (breach_engine/core/, schemas/, prompts/)
- Extensive documentation (APPLICATION_VERIFICATION.md, GEMINI_INTEGRATION.md, VIDEO_SCRIPT_V3.md)
- Test logs showing internal reasoning
- Criminological tactic references for education
What we learned
1. Structured Output is Non-Negotiable Gemini's JSON mode + Pydantic schemas eliminated most of our parsing bugs. You cannot build reliable AI systems by parsing raw text. The slight performance hit is worth the reliability gain.
2. AI Testing AI is Transformative Our autonomous interrogator found bugs we would never have discovered through manual testing:
- Edge case: Simultaneous pillar collapse causing stress overflow
- Edge case: Counter-evidence generation while already generating
- Edge case: Confession triggering mid-sentence due to stress spike
This methodology should be standard for interactive AI systems.
3. Context Window is the Bottleneck Even with Gemini's extended context, lie consistency degrades after ~40 turns. We mitigated this with the Lie Ledger (explicit state tracking), but learned that implicit memory is not enough for complex stateful behavior.
4. Multimodal ≠ Multi-Agent Early on, we thought "multimodal AI" meant one agent handling text + images. Wrong. We got much better results with specialized agents:
- Vision agent: Threat assessment only
- Reasoning agent: Strategic response only
- Generator agent: Counter-evidence only
Specialization > generalization.
5. Constraint Architecture is Generalizable As we built this, we realized the framework applies to any adversarial or complex behavior:
- Negotiation simulators (multi-party agents with hidden goals)
- Training systems (adaptive difficulty, consistent feedback)
- Red teaming tools (adversarial probing of other AI systems)
- Dynamic NPCs (characters that remember and evolve)
The game is proof-of-concept. The framework is the product.
What's next for The AI Interrogation (Cognitive Breach)
Immediate (Post-Hackathon):
Framework Extraction: Package the constraint system as a standalone library (cognitive-breach-framework) with:
- Generic state machine for multi-dimensional tracking
- Pydantic schema templates
- Multi-agent orchestration patterns
- Testing harness with autonomous agents
Additional Cases: Add 2-3 more interrogation scenarios to demonstrate framework flexibility:
- Corporate espionage (white-collar crime)
- Witness testimony (innocent suspect, testing false accusation dynamics)
- Multi-suspect interrogation (agents with conflicting goals)
Medium-Term (3-6 months):
Research Paper: Document the AI-validated-by-AI methodology with statistical analysis:
- Bug discovery rates (AI vs human QA)
- Edge case coverage metrics
- Cost/time comparisons
- Reproducibility protocols
Domain Expansion - AI Red Teaming: Adapt the framework for security research:
- Adversarial agents that probe LLM systems for vulnerabilities
- Jailbreak strategy optimization
- Automated safety testing with logged attack chains
Educational Platform: Partner with criminology/psychology programs:
- Simulation Mode with forensic analysis
- Tactic detection training
- Interview technique practice
Long-Term Vision (6-12 months):
Open Ecosystem: Enable developers to build on the framework:
- Plugin system for custom tactics
- Agent behavior templates
- Community-contributed scenarios
- Shared test suite
Commercial Applications:
- Training Simulators: Corporate negotiation, crisis communication, interview prep
- Game AI Middleware: Drop-in consistent NPC system for game developers
- Research Tools: Standardized adversarial testing for AI safety labs
Multi-Participant Mode: Expand to multi-agent scenarios:
- 2 suspects with contradictory stories (coalition game theory)
- AI lawyer + AI prosecutor + AI judge (legal simulation)
- Negotiation tables with 3-5 AI agents with complex relationships
The Core Thesis: If we can make AI lie perfectly, we can make it do anything perfectly. Cognitive Breach proves behavioral constraint at the hardest edge case. The next step is applying this architecture to every domain that needs complex, consistent, adversarial AI.
Repository: https://github.com/tritonsan/Cognitive-breach
Live Demo: https://cognitive-breach-7kqekn5ucejs2twbkfzcwf.streamlit.app
Test Logs: /logs/ directory (36 sessions, 384 total turns, fully transparent)
Built with Gemini 3: Vision | Flash | Imagen | TTS | Extended Context