PublicWire: Devpost Submission Draft

An autonomous civic intelligence platform that monitors your neighborhood with a self-improving 6-agent pipeline and shows you the receipts.

Inspiration

Local government affects daily life more than any other layer of government bus routes change, streets close, permits get approved, parking rules shift but the information is buried across dozens of PDFs, portals, meeting minutes, and buried city notices that no one has time to read.

We realized that the problem isn't a lack of data. Cities publish this information. The problem is that no one is watching, and when they are, there's no way to trust what an AI summarizes.

We asked: what if an autonomous agent team could monitor the civic web the way a local newsroom used to but with full transparency into every editorial decision it makes?

That's PublicWire.

What it does

PublicWire is an autonomous civic change monitor. You give it a city (e.g., New Brunswick, NJ), and it:

Discovers official and public civic web sources (city notices, transit alerts, construction updates, school closures, permits, council agendas)
Extracts structured civic events from raw, messy web content
Evaluates each event through an editorial pipeline — rejecting routine noise, flagging unsupported claims, and routing weak items back for second-source verification
Publishes short, source-cited civic micro-briefs that residents can actually read
Audits itself every agent decision is logged to an immutable ledger that residents can inspect
Self-improves a Reflector meta-agent reads rejection logs, detects failure patterns, and dynamically rewrites downstream agent prompts to prevent the same mistakes

The key insight: PublicWire doesn't just generate content. It generates trust. Every published brief ships with a full agent trace who checked what, what got rejected and why, and what sources backed the final claim.

How we built it

Architecture: The 6-Agent Pipeline

PublicWire is not a single LLM call. It's a multi-agent orchestration pipeline where each agent has a distinct role, and agents can autonomously loop back to earlier stages:

┌─────────┐    ┌──────────┐    ┌────────┐    ┌────────┐    ┌────────┐
│  Scout  │───▶│ Analyst  │───▶│ Editor │───▶│ Writer │───▶│ Mentor │
└─────────┘    └──────────┘    └────┬───┘    └────────┘    └───┬────┘
     ▲                              │                          │
     │         ↩ "needs 2nd source" │    ↩ "revise — drift"    │
     └──────────────────────────────┘──────────────────────────┘

                        ┌────────────┐
                        │ Reflector  │  ← reads rejection logs,
                        │ (meta)     │    rewrites Editor prompt
                        └────────────┘

Agent	Role	Key Behavior
Scout	Source discovery	Uses Nimble to find official (.gov/.edu) and public civic sources for a target area
Analyst	Extraction	Converts raw web content into structured civic events with evidence chains
Editor	Editorial gate	Classifies events as `resident-relevant`, `routine`, `duplicate`, or `unsupported`. Can loop back to Scout if only one source backs a high-impact claim
Writer	Brief generation	Converts verified events into short, cited micro-briefs (strict rules: no inventing facts, <60 word summaries)
Mentor	Quality review	Reviews Writer output against original evidence. Can loop back to Editor if brief drifts from evidence
Reflector	Self-improvement	Reads rejection logs from ClickHouse, detects patterns (e.g., "weak-sourcing" appeared 8 times), and dynamically rewrites the Editor's prompt to prevent repeat failures. Versions every prompt change in the ledger.

The Loopback System

This is what separates PublicWire from a basic RAG pipeline. Agents don't just pass data forward — they challenge each other:

Editor → Scout loopback: If the Editor sees a resident-relevant claim backed by only one source, it sends a directive back to Scout specifying exactly what second source to find. The Analyst re-extracts, merges evidence, and the Editor re-evaluates.
Mentor → Editor loopback: If the Mentor detects that a Writer's brief drifts from the evidence, it routes the event back to the Editor for re-classification. A new brief is written and re-reviewed.

These loops are autonomous no human intervention. Every loop is logged to the immutable ledger.

The Reflector: Self-Improving Prompts

The Reflector is PublicWire's most technically interesting component. It:

Queries ClickHouse for recent Editor and Mentor rejection decisions
Runs pattern detection across rejection reasons (regex-based buckets: weak-sourcing, vague-claim, drift, routine-flood, unsupported-claim, etc.)
Selects the highest-frequency failure pattern
Generates a tightening clause and appends it to the Editor's system prompt
Versions the new prompt in ClickHouse with a trigger_pattern field
Logs its own decision to the agent_decisions table

This means PublicWire literally reprograms its own editorial standards based on what it's learning from its mistakes and every change is versioned and auditable.

Sponsor Integrations

Nimble Civic web extraction layer

Scout uses Nimble's Search API to discover official and public civic sources for a target area
Analyst uses Nimble to extract structured civic changes from discovered sources
Graceful fallback to seeded demo data if the API is unavailable

ClickHouse Immutable audit ledger

6 tables: sources, snapshots, civic_events, agent_decisions, published_briefs, prompt_versions
Every agent decision is recorded with decision_id, event_id, agent_name, decision, reason, and prompt_version_id
The Reflector reads from and writes to ClickHouse it's both the memory and the feedback loop
MergeTree engine for append-only, time-ordered audit trails

Senso / cited.md Grounded publishing layer

Published briefs are sent to Senso's search API for organizational knowledge grounding
Each brief includes the full agent trace and source citations
Returns a public URL and citation ID for the published artifact

Google Gemini Editorial intelligence

Editor and Mentor agents use Gemini (via OpenAI-compatible endpoint) for civic relevance classification
Structured JSON output with importance, confidence, needsSecondSource, and reason fields
Google editorial decision layer provides a second-pass publishability check with resident-relevant / routine / unsupported / urgent classification

Datadog (Lapdog) Reliability review & tracing

Every pipeline step is wrapped in dd-trace spans with custom tags (sponsor, area, session_id)
Lapdog reliability review runs 4 checks before publishing: source grounding, claim specificity, resident impact, trace completeness
Produces a scored verdict (pass/hold) with per-check comments

Tech Stack

Framework: Next.js 16 (App Router, React 19)
Language: TypeScript (strict)
LLM: Google Gemini 2.5 Flash (via OpenAI-compatible endpoint) — also supports OpenAI as a drop-in swap
Data validation: Zod
Tracing: Datadog dd-trace
Deployment: Vercel-ready

Challenges we ran into

Agent loop termination: When the Editor routes back to Scout, and the new evidence still doesn't satisfy the threshold, we needed to ensure the loop terminates rather than cycling forever. We solved this by limiting loopbacks to one round and falling back to rejection.
Prompt versioning without drift: The Reflector rewrites prompts dynamically, which creates a risk of prompt bloat or contradictory rules accumulating over time. We mitigated this by keeping each tightening clause scoped to a single detected failure pattern, and by versioning every change in ClickHouse so we can always roll back.
Graceful degradation across 5 sponsor APIs: Any of the sponsor services (Nimble, ClickHouse, Senso, Gemini, Datadog) could be unavailable. We built a 3-tier fallback system (real-api → api-error-fallback → seeded-demo) so the pipeline always completes and the demo always works, while clearly showing which services are live vs. fallback.
Structured JSON from LLMs: Getting Gemini to reliably return strict JSON for editorial decisions required careful prompt engineering, response_format: json_object, low temperature (0.2), and a safe-parse fallback layer.

Accomplishments that we're proud of

The Reflector actually works: It reads real rejection data from ClickHouse, finds the most common failure pattern, and writes a concrete rule into the Editor prompt. Watching the prompt diff appear in the UI; where the system literally rewrites its own instructions — is the most technically impressive moment in the demo.
Full auditability: Every single agent decision, from Scout discovery through Mentor approval to Reflector prompt rewrites, is logged to ClickHouse with a traceable event_id chain. A resident can open a published brief and see exactly why it was published.
The loopback architecture: Agents don't just pass data forward. The Editor can challenge the Scout. The Mentor can challenge the Editor. This creates a self-correcting pipeline that's closer to how a real newsroom works than any single-call LLM approach.
5 sponsor technologies, one coherent product: Nimble → ClickHouse → Gemini → Senso → Datadog all serve distinct, meaningful roles in the pipeline. None of them are bolted on — each one would degrade the product if removed.

What we learned

Multi-agent systems are harder to debug than monolithic LLM calls, but the reliability payoff is real. When you can see exactly which agent made which decision, you can fix failures surgically instead of re-prompting the whole system.
Immutable logging changes the game. Once we started logging every decision to ClickHouse, we could build the Reflector: a component that would be impossible without a queryable decision history. The ledger isn't just for trust; it's the substrate for self-improvement.
Civic data is messier than we expected. City websites don't follow standards. The same information appears in PDFs, HTML tables, press releases, and meeting minutes. Nimble's extraction layer was critical for normalizing this into something our agents could reason about.

What's next for PublicWire

Multi-city expansion: The architecture is area-agnostic. We want to support any US city by configuring Scout with jurisdiction-specific source lists.
Resident subscriptions: Push notifications when PublicWire detects a civic change in your neighborhood category (transit, construction, schools, etc.).
Reflector v2: cross-agent prompt evolution: Currently the Reflector only rewrites the Editor prompt. We plan to expand it to Mentor and Writer prompts as well, creating a fully self-evolving editorial pipeline.
Community trust scoring: Let residents upvote/downvote published briefs, feeding human feedback back into the Reflector's pattern detection.

https://github.com/advik-bhatt/mouthpiece