Polisim – AI Agents for Constructive Political Debate
Inspiration
Political polarization continues to make compromise increasingly difficult. Emotional reactions often overpower data-driven reasoning, whether in media, social platforms, or policy discussions.
I built Polisim to explore whether AI agents could debate political issues in a structured, research-driven way — with the explicit goal of finding practical compromises instead of escalating conflict.
What It Does
Polisim allows users to:
- Describe a political issue.
- Create two AI personas with custom ideologies.
- Run a structured debate between them.
The system proceeds in four stages:
🧩 Problem Generation
An AI agent converts the user’s prompt into a clearly scoped, debate-ready policy question.
🔎 Research
Each persona independently conducts research using web search and content extraction tools, building a thesis supported by evidence.
🔥 Crossfire
Agents generate and answer structured questions challenging each other’s assumptions.
🤝 Negotiation
Agents compare theses and attempt to reach a compromise while remaining ideologically consistent.
The output is a research-backed debate focused on structured reasoning and solution-building.
How I Built It
I implemented Polisim using Vercel’s AI SDK (generateText, structured Output.object, and tool-calling).
Context Management
The main technical challenge was preventing context window explosion during long research + debate loops.
Instead of appending raw content to message history, I created a context store (a UUID-keyed hashmap). Each tool result is:
- Summarized via a small LLM call
- Stored with both summary + full content
- Referenced by ID
Agents access information through two tools:
- get available context → list IDs + summaries
- get context → fetch full content by ID
This keeps prompts bounded while preserving full research access.
Research Loop
Exposed tools:
- Web search (Tavily API)
- URL extraction (Tavily full text)
- get thesis field (read incremental thesis components)
- get searched links (prevent duplicates)
The thesis is built incrementally in memory. Each research step returns a structured ResearchResponse:
messagethesis_field(1–5, or -1 when complete)
The loop runs until thesis_field = -1, capped at 10 tool rounds (stepCountIs(10)).
Crossfire
Crossfire uses tool-free generateText calls with strict JSON schema (answer: string). This avoids structured-output failures caused by mid-turn tool calls.
All Q&A pairs are written back into the context store for negotiation reference.
Negotiation Loop
Negotiation re-enables tools:
- get thesis part (self/opponent)
- get crossfire pair
- Context retrieval tools
Each turn must output exactly one structured action:
messageproposed_solutionconfirm_solutiondeny_solution
A running debate summary is maintained and refreshed after each turn to avoid transcript bloat.
Turn limits:
- 12 steps (tools enabled)
- 6 steps (retry mode)
Reliability & Fallback
Structured output often failed during multi-tool turns. To stabilize:
- Increased reasoning step limits.
- Added a retry path with tool-reduced configuration.
- Explicitly encoded workflow rules in the system prompt.
- Applied safe defaults if parsing still failed.
This ensures the debate never stalls on malformed output.
Orchestration
The full pipeline:
Problem generation
→ Thesis 1
→ Thesis 2
→ Crossfire (both sides)
→ Negotiation loop (until agreement or max rounds)
Shared state (thesis object, context cache, searched links) resets between runs to prevent leakage.
Challenges
The hardest problem was maintaining strict structured outputs during heavy tool usage. Without fallback logic and bounded loops, the system would break under complex reasoning.
Designing guardrails without over-constraining autonomy was the central engineering tradeoff.
Accomplishments
Under hackathon constraints, I successfully built:
- Multi-agent orchestration
- Tool-driven deep research
- Context retrieval beyond naive summarization
- Structured negotiation enforcement
- Fault-tolerant debate loops
Most importantly, I shipped a working system.
What I Learned
Polisim taught me:
- Practical context management strategies for agent systems
- Structured output enforcement under tool-calling
- Recovery design for LLM parsing failures
- How to prioritize impactful features under time pressure
Long-running agent workflows require explicit architectural control — not just prompting.
What’s Next
- Add context-trace debugging (why did the agent believe this?).
- Improve source verification and credibility scoring.
- Build a grounded policy/legal knowledge base.
- Deploy to real users and policymakers.
Polisim is my attempt to build AI systems that don’t just argue — but reason, negotiate, and search for common ground.
Built With
- openai
- tavily
- typescript
- vercel
- vercel-ai-sdk
Log in or sign up for Devpost to join the conversation.