Polisim – AI Agents for Constructive Political Debate

Inspiration

Political polarization makes compromise increasingly difficult. In media, on social platforms, and in policy discussions, emotional reactions often overpower data-driven reasoning.

I built Polisim to explore whether AI agents could debate political issues in a structured, research-driven way, with the explicit goal of finding practical compromises instead of escalating conflict.


What It Does

Polisim allows users to:

  1. Describe a political issue.
  2. Create two AI personas with custom ideologies.
  3. Run a structured debate between them.

The system proceeds in four stages:

🧩 Problem Generation

An AI agent converts the user’s prompt into a clearly scoped, debate-ready policy question.

🔎 Research

Each persona independently conducts research using web search and content extraction tools, building a thesis supported by evidence.

🔥 Crossfire

Agents generate and answer structured questions challenging each other’s assumptions.

🤝 Negotiation

Agents compare theses and attempt to reach a compromise while remaining ideologically consistent.

The output is a research-backed debate focused on structured reasoning and solution-building.


How I Built It

I implemented Polisim with Vercel’s AI SDK, using generateText, structured output via Output.object, and tool calling.

Context Management

The main technical challenge was keeping the context window from overflowing during long research and debate loops.

Instead of appending raw content to the message history, I created a context store (a UUID-keyed hashmap). Each tool result is:

  1. Summarized via a small LLM call
  2. Stored with both its summary and its full content
  3. Referenced by ID

Agents access information through two tools:

  • get available context → list IDs + summaries
  • get context → fetch full content by ID

This keeps prompts bounded while preserving full research access.
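
As a rough illustration (not the project's actual code), a store like this could be sketched in TypeScript as follows. The class name, method names, and the choice to pass the summary in as an argument are all assumptions made for the example; in the real system the summary would come from the small LLM call described above.

```typescript
import { randomUUID } from "node:crypto";

// One stored tool result: a short summary for prompts, plus the full text.
interface ContextEntry {
  summary: string;
  content: string;
}

// Hypothetical store; class and method names are illustrative only.
class ContextStore {
  private entries = new Map<string, ContextEntry>();

  // Store a tool result and return the ID the agent will reference.
  // `summary` is assumed to be produced elsewhere by a small LLM call.
  put(content: string, summary: string): string {
    const id = randomUUID();
    this.entries.set(id, { summary, content });
    return id;
  }

  // Backs the "get available context" tool: IDs and summaries only.
  listAvailable(): { id: string; summary: string }[] {
    return Array.from(this.entries.entries()).map(([id, entry]) => ({
      id,
      summary: entry.summary,
    }));
  }

  // Backs the "get context" tool: full content, or undefined for unknown IDs.
  get(id: string): string | undefined {
    return this.entries.get(id)?.content;
  }

  // Clear everything, for example between debate runs.
  reset(): void {
    this.entries.clear();
  }
}

// Usage: only the ID and summary need to appear in the prompt.
const store = new ContextStore();
const id = store.put("Full extracted article text ...", "Article on carbon pricing");
console.log(store.listAvailable());
console.log(store.get(id));
```

The point of the split is that the listing stays small no matter how long the stored documents are, while the full text remains one tool call away.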


Research Loop

Exposed tools:

  • Web search (Tavily API)
  • URL extraction (Tavily full text)
  • get thesis field (read incremental thesis components)
  • get searched links (prevent duplicates)

The thesis is built incrementally in memory. Each research step returns a structured ResearchResponse:

  • message
  • thesis_field (1–5, or -1 when complete)

The loop runs until thesis_field = -1, capped at 10 tool rounds (stepCountIs(10)).
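
The control flow of this loop can be sketched without the SDK. In the sketch below, `step` is a hypothetical stand-in for one model call, the cap is modeled as a plain outer loop bound rather than the SDK's stepCountIs stop condition, and treating `message` as the text of the selected thesis field is an assumption.

```typescript
// Shape of each research step's structured output, as described above.
interface ResearchResponse {
  message: string;
  thesis_field: number; // 1-5 selects a thesis field; -1 signals completion
}

// Hypothetical step function: in practice this would wrap one model call.
type ResearchStep = (thesis: Map<number, string>) => Promise<ResearchResponse>;

const DONE = -1;

// Run research steps until the agent reports completion or maxRounds is hit.
async function buildThesis(
  step: ResearchStep,
  maxRounds = 10,
): Promise<Map<number, string>> {
  const thesis = new Map<number, string>();
  for (let round = 0; round < maxRounds; round++) {
    const res = await step(thesis);
    if (res.thesis_field === DONE) break;
    // Ignore out-of-range fields rather than corrupting the thesis.
    const field = res.thesis_field;
    if (Number.isInteger(field) && field >= 1 && field <= 5) {
      thesis.set(field, res.message);
    }
  }
  return thesis;
}
```

Keeping the bound in code, rather than trusting the agent to emit -1, means a confused agent costs at most `maxRounds` calls.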


Crossfire

Crossfire uses tool-free generateText calls with a strict JSON schema (answer: string). This avoids structured-output failures caused by mid-turn tool calls.

All Q&A pairs are written back into the context store for negotiation reference.


Negotiation Loop

Negotiation re-enables tools:

  • get thesis part (self/opponent)
  • get crossfire pair
  • Context retrieval tools

Each turn must output exactly one structured action:

  • message
  • proposed_solution
  • confirm_solution
  • deny_solution
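
One way to enforce "exactly one action" is to validate the model's raw text against a closed set of action kinds. The sketch below is an illustration under assumptions: the field names `action` and `text` and the function `parseAction` are hypothetical, and only the four action names come from the list above.

```typescript
// The four allowed actions, taken from the list above.
const ACTIONS = [
  "message",
  "proposed_solution",
  "confirm_solution",
  "deny_solution",
] as const;

type ActionKind = (typeof ACTIONS)[number];

// Hypothetical wire format: one action kind plus its accompanying text.
interface NegotiationAction {
  action: ActionKind;
  text: string;
}

// Return a validated action, or null if the raw output is malformed.
function parseAction(raw: string): NegotiationAction | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof value !== "object" || value === null) return null;

  const { action, text } = value as Record<string, unknown>;
  if (typeof action !== "string" || typeof text !== "string") return null;

  // Narrow the free-form string to one of the known action kinds.
  const kind = ACTIONS.find((a) => a === action);
  return kind ? { action: kind, text } : null;
}
```

Returning null instead of throwing lets the caller decide whether to retry the turn or fall back to a default.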

A running debate summary is maintained and refreshed after each turn to avoid transcript bloat.

Step limits per turn:

  • 12 steps with tools enabled
  • 6 steps in retry mode

Reliability & Fallback

Structured output often failed during multi-tool turns. To stabilize it, I:

  1. Increased reasoning step limits.
  2. Added a retry path with tool-reduced configuration.
  3. Explicitly encoded workflow rules in the system prompt.
  4. Applied safe defaults if parsing still failed.

This ensures the debate never stalls on malformed output.
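
The retry-then-default pattern can be sketched independently of any model API. Everything named here (`TurnOutput`, `runTurnWithFallback`, the attempt and parser types) is hypothetical; the attempts would typically be a full-tool call followed by a tool-reduced retry.

```typescript
// Hypothetical turn result; the field names are illustrative.
interface TurnOutput {
  action: string;
  text: string;
}

type Attempt = () => Promise<string>; // returns raw model text
type Parser = (raw: string) => TurnOutput | null; // null means malformed

// Try each attempt in order; fall back to a safe default if all fail.
async function runTurnWithFallback(
  attempts: Attempt[],
  parse: Parser,
  safeDefault: TurnOutput,
): Promise<TurnOutput> {
  for (const attempt of attempts) {
    try {
      const parsed = parse(await attempt());
      if (parsed) return parsed;
    } catch {
      // A thrown error is treated like a parse failure: try the next attempt.
    }
  }
  return safeDefault;
}
```

Because the function always resolves to a `TurnOutput`, the surrounding debate loop never has to handle a missing turn.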


Orchestration

The full pipeline:

Problem generation
→ Thesis 1
→ Thesis 2
→ Crossfire (both sides)
→ Negotiation loop (until agreement or max rounds)

Shared state (thesis object, context cache, searched links) resets between runs to prevent leakage.
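
A simple way to guarantee that reset is to allocate the shared state inside the run function instead of clearing a global. The sketch below assumes hypothetical names (`RunState`, `Stage`, `runPipeline`) and deliberately leaves the stage bodies abstract.

```typescript
// Hypothetical per-run state mirroring the shared state listed above.
interface RunState {
  thesis: Map<string, string>;
  contextCache: Map<string, string>;
  searchedLinks: Set<string>;
}

function freshState(): RunState {
  return {
    thesis: new Map(),
    contextCache: new Map(),
    searchedLinks: new Set(),
  };
}

// A stage is a named async step that reads and writes the run's state.
interface Stage {
  name: string;
  run: (state: RunState) => Promise<void>;
}

// Run stages strictly in order; each call gets its own state object,
// so nothing from a previous run can leak into the next one.
async function runPipeline(stages: Stage[]): Promise<string[]> {
  const state = freshState();
  const completed: string[] = [];
  for (const stage of stages) {
    await stage.run(state);
    completed.push(stage.name);
  }
  return completed;
}
```

In this shape, the stages would be problem generation, the two thesis builds, crossfire, and the negotiation loop, passed in that order.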


Challenges

The hardest problem was maintaining strict structured outputs during heavy tool usage. Without fallback logic and bounded loops, the system would break under complex reasoning.

Designing guardrails without over-constraining autonomy was the central engineering tradeoff.


Accomplishments

Under hackathon constraints, I successfully built:

  • Multi-agent orchestration
  • Tool-driven deep research
  • Context retrieval beyond naive summarization
  • Structured negotiation enforcement
  • Fault-tolerant debate loops

Most importantly, I shipped a working system.


What I Learned

Polisim taught me:

  • Practical context management strategies for agent systems
  • Structured output enforcement under tool-calling
  • Recovery design for LLM parsing failures
  • How to prioritize impactful features under time pressure

Long-running agent workflows require explicit architectural control, not just prompting.


What’s Next

  1. Add context-trace debugging (why did the agent believe this?).
  2. Improve source verification and credibility scoring.
  3. Build a grounded policy/legal knowledge base.
  4. Deploy to real users and policymakers.

Polisim is my attempt to build AI systems that don’t just argue but reason, negotiate, and search for common ground.
