PolicyPilot: Decision Governance for Reddit Moderators

Inspiration

AutoModerator is the backbone of Reddit moderation — but it has a fundamental limitation: it's stateless. Every post and comment is evaluated in isolation. AutoMod can't know whether this is a user's 1st violation or their 15th. It can't remember that it warned someone last week. It can't track that the same user keeps breaking the same rule.

Research data on moderator pain points

Research confirms the scale of this problem. The CHI 2026 "In the Queue" study (Bajpai & Chandrasekharan) surveyed 110 moderators and found that 74.5% experience workflow collisions, while studies on moderation consistency show that 1 in 7 decisions is disputed within the same team. Meanwhile, AutoMod cannot track user state across interactions at all, because it is stateless by design.

This forces moderators into a painful daily reality: no memory of past actions, no standardized enforcement policies, and no visibility into team workload. We built PolicyPilot to fill these gaps.

Before vs After PolicyPilot workflow

What it does

PolicyPilot is a Devvit mod tool that adds three capabilities Reddit moderation currently lacks: memory, consistency, and visibility.

1. The Reputation Ledger — Moderation Memory

Every mod action (remove, warn, ban, approve) is automatically logged to a per-user history via the onModAction trigger. No manual input needed — the ledger builds itself as moderators work normally.
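A self-building ledger of this kind can be sketched as a per-user sorted set scored by timestamp. This is a minimal illustration, not the app's actual code: `InMemorySortedSet` is a stand-in for Devvit's Redis client, and the entry fields, the `ledger:{username}` key pattern, and the function names are all assumptions. Reads fetch in ascending order and reverse in JavaScript to get newest-first results.

```typescript
// Hypothetical ledger entry shape; field names are assumptions for illustration.
type ModActionType = "remove" | "warn" | "ban" | "approve";

interface LedgerEntry {
  action: ModActionType;
  moderator: string;
  targetId: string;
  timestamp: number; // epoch milliseconds, also used as the sorted-set score
}

// Tiny in-memory stand-in for a Redis sorted set (not the Devvit client).
class InMemorySortedSet {
  private sets = new Map<string, { member: string; score: number }[]>();

  zAdd(key: string, member: string, score: number): void {
    // Re-adding an existing member replaces its score, as in Redis.
    const set = (this.sets.get(key) ?? []).filter((e) => e.member !== member);
    set.push({ member, score });
    set.sort((a, b) => a.score - b.score); // keep ascending by score
    this.sets.set(key, set);
  }

  // Ascending range by score, inclusive on both ends.
  zRangeByScore(key: string, min: number, max: number): string[] {
    return (this.sets.get(key) ?? [])
      .filter((e) => e.score >= min && e.score <= max)
      .map((e) => e.member);
  }
}

// Hypothetical key pattern for the per-user history.
const ledgerKey = (username: string): string => `ledger:${username}`;

// What a mod-action trigger handler might do for each incoming event.
function logModAction(store: InMemorySortedSet, username: string, entry: LedgerEntry): void {
  store.zAdd(ledgerKey(username), JSON.stringify(entry), entry.timestamp);
}

// Fetch ascending, then reverse in JavaScript to get newest-first order.
function recentHistory(store: InMemorySortedSet, username: string): LedgerEntry[] {
  return store
    .zRangeByScore(ledgerKey(username), -Infinity, Infinity)
    .map((raw) => JSON.parse(raw) as LedgerEntry)
    .reverse();
}
```

Scoring by timestamp keeps time-window queries (for example, the last 7 days) a simple range lookup.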

When a mod encounters a flagged post, they click "View User History" and instantly see a color-coded risk badge:

Risk Check showing user offense history

🟢 Clean — zero offenses
🟡 Watched — 1-2 offenses, monitor closely
🔴 Escalation Zone — 3+ offenses, escalation candidate

No more opening new tabs to check user profiles. The context is right there.
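The badge thresholds above map directly to a small pure function. The sketch below assumes the offense count has already been read from the ledger; the type and function names are illustrative rather than taken from the app.

```typescript
type RiskLevel = "clean" | "watched" | "escalation";

// Thresholds follow the badge list: 0 offenses, 1-2 offenses, 3 or more.
function riskLevel(offenseCount: number): RiskLevel {
  if (offenseCount <= 0) return "clean";
  if (offenseCount <= 2) return "watched";
  return "escalation";
}

const BADGES: Record<RiskLevel, string> = {
  clean: "🟢 Clean",
  watched: "🟡 Watched",
  escalation: "🔴 Escalation Zone",
};

// Text a toast or form could display for a given offense count.
function riskBadge(offenseCount: number): string {
  return BADGES[riskLevel(offenseCount)];
}
```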

2. The Playbook Engine — Consistent Decisions

Senior moderators define playbooks — decision trees that encode their community's enforcement policies. When any moderator encounters a flagged item, they click "Run Playbook." The app checks the user's ledger history, evaluates the playbook conditions, and recommends the correct escalation tier.

Here's the same user progressing through all three tiers as violations accumulate:

Tier 1 — First offense (0 prior offenses) → Remove content:

Tier 1: First offense, recommended action is Remove

Tier 2 — Second offense (1 prior offense) → Warn user via modmail:

Tier 2: Second offense, recommended action is Warn

Tier 3 — Third offense (2 prior offenses) → Temp ban 7 days:

Tier 3: Third offense, recommended action is Temp Ban

The reasoning is shown step-by-step (priorOffenses lt 1: no → priorOffenses lt 2: no → Temp ban), so every moderator can see exactly why the app recommends what it does. The moderator reviews and confirms with one click — the action executes, a distinguished removal comment is posted, and the event is logged to the ledger.
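The tier evaluation and its reasoning chain can be sketched as an ordered list of "less than" conditions with a fallback. The `Playbook` shape and the `evaluatePlaybook` name are assumptions made for illustration; only the three-tier thresholds and the reasoning format come from the description above.

```typescript
interface PlaybookTier {
  lessThan: number; // tier matches when priorOffenses < lessThan
  action: string;
}

interface Playbook {
  name: string;
  tiers: PlaybookTier[]; // evaluated in order, first match wins
  fallbackAction: string; // used when no tier matches
}

interface Recommendation {
  action: string;
  reasoning: string[]; // one entry per evaluated condition, then the action
}

// Pure function: the same playbook and offense count always give the same result.
function evaluatePlaybook(playbook: Playbook, priorOffenses: number): Recommendation {
  const reasoning: string[] = [];
  for (const tier of playbook.tiers) {
    const matched = priorOffenses < tier.lessThan;
    reasoning.push(`priorOffenses lt ${tier.lessThan}: ${matched ? "yes" : "no"}`);
    if (matched) {
      reasoning.push(tier.action);
      return { action: tier.action, reasoning };
    }
  }
  reasoning.push(playbook.fallbackAction);
  return { action: playbook.fallbackAction, reasoning };
}

// The three tiers described above, expressed as data.
const examplePlaybook: Playbook = {
  name: "three-tier escalation",
  tiers: [
    { lessThan: 1, action: "Remove content" },
    { lessThan: 2, action: "Warn user via modmail" },
  ],
  fallbackAction: "Temp ban 7 days",
};
```

For a user with 2 prior offenses, `evaluatePlaybook(examplePlaybook, 2).reasoning.join(" → ")` yields `priorOffenses lt 1: no → priorOffenses lt 2: no → Temp ban 7 days`. The function only produces a recommendation; executing it stays a separate step that the moderator confirms.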

Playbooks also support:

New-account gates that apply stricter rules to accounts below a configurable age threshold
Dry-run previews that simulate the playbook against recent users without taking any actions — so mods can verify their logic before going live
Management of existing playbooks, including deletion with a confirmation step
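A dry-run preview can be made side-effect free by construction: the preview function receives only a recommendation function and never an executor. The following is a sketch under that assumption; `UserSnapshot`, `previewPlaybook`, and the action labels are hypothetical names, not the app's API.

```typescript
interface UserSnapshot {
  username: string;
  priorOffenses: number;
}

interface PreviewRow {
  username: string;
  recommendedAction: string;
}

interface PreviewResult {
  rows: PreviewRow[];
  counts: Map<string, number>; // how many users would land on each action
}

// Computes recommendations only. Nothing here can remove, warn, or ban.
function previewPlaybook(
  users: readonly UserSnapshot[],
  recommend: (user: UserSnapshot) => string,
): PreviewResult {
  const rows: PreviewRow[] = [];
  const counts = new Map<string, number>();
  for (const user of users) {
    const recommendedAction = recommend(user);
    rows.push({ username: user.username, recommendedAction });
    counts.set(recommendedAction, (counts.get(recommendedAction) ?? 0) + 1);
  }
  return { rows, counts };
}

// Example rule mirroring the three-tier escalation (labels are illustrative).
const threeTierRule = (user: UserSnapshot): string =>
  user.priorOffenses < 1 ? "remove" : user.priorOffenses < 2 ? "warn" : "temp-ban-7d";
```

Reviewing the `counts` summary before activation shows at a glance how many recent users would reach the ban tier.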

3. The Ops Dashboard — Team Visibility

The Ops Dashboard is a real-time analytics view built as a custom post with a React web view, showing 7 days of aggregated data:

PolicyPilot Dashboard with stats, charts, and activity log

PolicyPilot Splash Screen

Dashboard recent activity log with color-coded actions

The dashboard includes animated count-up stat tiles, spring-animated bar charts, mod workload distribution, top offenders ranked by offense count, and a color-coded 24-hour activity log showing which actions were playbook-assisted (marked PB).

A one-click Generate Mod Report creates a formatted 7-day summary post — perfect for async team handoffs across time zones.
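The 7-day aggregation behind the stat tiles and the report might look like the sketch below. It assumes ledger entries carry a moderator name and a playbook-assisted flag, and it counts every action except `approve` as an offense; those choices and all names are assumptions for illustration.

```typescript
type ActionType = "remove" | "warn" | "ban" | "approve";

interface ActivityEntry {
  username: string;
  moderator: string;
  action: ActionType;
  timestamp: number; // epoch milliseconds
  playbookAssisted: boolean;
}

interface DashboardStats {
  totalActions: number;
  playbookAssisted: number;
  modWorkload: Map<string, number>; // actions per moderator
  topOffenders: { username: string; offenses: number }[];
}

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

function aggregateLastSevenDays(
  entries: readonly ActivityEntry[],
  now: number,
  topN = 5,
): DashboardStats {
  const recent = entries.filter(
    (e) => e.timestamp > now - SEVEN_DAYS_MS && e.timestamp <= now,
  );
  const modWorkload = new Map<string, number>();
  const offenses = new Map<string, number>();
  let playbookAssisted = 0;

  for (const e of recent) {
    modWorkload.set(e.moderator, (modWorkload.get(e.moderator) ?? 0) + 1);
    if (e.playbookAssisted) playbookAssisted += 1;
    // Assumption: approvals are not offenses, everything else is.
    if (e.action !== "approve") {
      offenses.set(e.username, (offenses.get(e.username) ?? 0) + 1);
    }
  }

  // Rank by offense count, breaking ties alphabetically for stable output.
  const topOffenders = Array.from(offenses, ([username, count]) => ({ username, offenses: count }))
    .sort((a, b) => b.offenses - a.offenses || a.username.localeCompare(b.username))
    .slice(0, topN);

  return { totalActions: recent.length, playbookAssisted, modWorkload, topOffenders };
}
```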

How we built it

PolicyPilot system architecture

PolicyPilot runs entirely on Devvit's infrastructure with zero external dependencies — no AI APIs, no external databases, no hosted servers.

Stack:

Devvit (@devvit/web v0.12.24) — Reddit's Developer Platform
Hono — lightweight server routing
React + Tailwind CSS 4 — dashboard web view with animations
Redis (Devvit built-in) — all persistence, subreddit-scoped
toolbox-devvit — integration with the Mod Toolbox usernotes ecosystem

Key design decisions:

Deterministic core. No AI, no probabilistic decisions. Playbooks evaluate conditions with pure logic — same input always produces the same recommendation. Moderators trust deterministic tools.
Community sovereignty. All data is local to the subreddit. We don't import bans, warnings, or reputation from other communities. Each subreddit's moderation standards remain independent.
Complement, don't replace. PolicyPilot works alongside AutoMod. AutoMod catches violations at submission time. PolicyPilot decides what to do about them based on context and history.
Human in the loop. PolicyPilot recommends actions but never executes autonomously. The moderator always reviews and confirms.
Dedup-safe triggers. Playbook-executed actions set a short-lived Redis key so the onModAction trigger skips the duplicate, preventing double-counting in the offense ledger.
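The dedup-safe trigger pattern can be illustrated with a small key store that supports expiry. This is a sketch, not the app's code: `TtlStore` is an in-memory, synchronous stand-in for Redis with an explicit clock (real Redis calls are asynchronous), and the function names and ledger format are hypothetical. The `pb-dedup:{targetId}` key and 30-second TTL follow the values this write-up reports.

```typescript
// In-memory stand-in for Redis keys with expiry; `now` is passed in explicitly.
class TtlStore {
  private expiry = new Map<string, number>();

  set(key: string, ttlMs: number, now: number): void {
    this.expiry.set(key, now + ttlMs);
  }

  has(key: string, now: number): boolean {
    const expiresAt = this.expiry.get(key);
    if (expiresAt === undefined) return false;
    if (now >= expiresAt) {
      this.expiry.delete(key); // lazily drop expired keys
      return false;
    }
    return true;
  }
}

const DEDUP_TTL_MS = 30_000;
const dedupKey = (targetId: string): string => `pb-dedup:${targetId}`;

// Playbook path: mark the target first, then log and act.
function executeFromPlaybook(store: TtlStore, ledger: string[], targetId: string, now: number): void {
  store.set(dedupKey(targetId), DEDUP_TTL_MS, now);
  ledger.push(`playbook:${targetId}`);
  // The real removal call would go here and cause the mod-action trigger to fire.
}

// Trigger path: skip logging if the playbook already recorded this action.
// Returns true when a ledger entry was written.
function handleModActionTrigger(store: TtlStore, ledger: string[], targetId: string, now: number): boolean {
  if (store.has(dedupKey(targetId), now)) return false;
  ledger.push(`trigger:${targetId}`);
  return true;
}
```

Setting the key before the action matters: if the trigger fires before the key exists, the duplicate entry is written anyway.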

By the numbers: 9 menu items, 9 form flows, 2 hourly scheduler jobs, 1 custom post type, 4 configurable app settings, and 10 Redis key patterns — all running inside Devvit's serverless runtime.

Features

| Feature | Description |
| --- | --- |
| Auto-logging Ledger | Every mod action auto-writes to per-user Redis sorted set |
| Risk Badge | 🟢🟡🔴 instant toast with risk level, account age, and karma |
| View Full History | Detailed action log with per-rule offense breakdown |
| Configure Playbooks | Form-driven 3-tier escalation builder with dynamic subreddit rule names |
| Run Playbook | Step-by-step wizard with reasoning chain and one-click execution |
| Distinguished Removal Comments | Auto-posts mod comment explaining the violation |
| Modmail Warnings | Sends warnings to users via modmail |
| Preview Playbook | Dry-run simulation against real users, with zero side effects |
| Manage Playbooks | List and delete playbooks with confirmation |
| Ops Dashboard | Animated React dashboard with 7-day stats, charts, and activity log |
| Auto-Escalation Alerts | Hourly threshold checker sends modmail when offense limits are crossed |
| Generate Mod Report | One-click 7-day summary as a formatted post |
| Toolbox Integration | Syncs ledger entries as Mod Toolbox usernotes |
| Dynamic Rule Names | Fetches actual subreddit rules via API; works on any subreddit |
| Resilient Error Handling | Three-layer error handling; triggers always return 200 |

Challenges we ran into

Double-counting offenses. When a playbook executes reddit.remove(), the onModAction trigger fires for the same removal, creating a duplicate ledger entry. We solved this with a dedup key pattern: the playbook sets a short-lived Redis key (pb-dedup:{targetId}, TTL 30s) before executing, and the trigger checks for it to skip the duplicate.
Redis transient failures. ECONNRESET errors from Devvit's Redis caused our trigger to return 500, which made Reddit retry or drop the event. We implemented three-layer error handling: try-catch at the trigger level, sequential Redis writes for clear partial-failure identification, and a global Hono onError handler as a safety net.
Devvit form rendering. Form description fields collapse newlines into single paragraphs. We redesigned all output to use emoji section headers (👤📋⚠️) and dot separators that remain readable whether newlines render or collapse.
Deprecated Reddit APIs. reddit.sendPrivateMessageAsSubreddit() was deprecated mid-development. We migrated to reddit.modMail.createConversation() for warnings and reddit.modMail.createModDiscussionConversation() for threshold alerts.
Devvit Redis zRange behavior. zRange(key, '+inf', '-inf', { by: 'score', reverse: true }) returns empty results in the Devvit Redis client. We switched to ascending fetches with JavaScript-side reversal.
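The first two error-handling layers described above (a trigger-level try-catch and sequential writes that identify the failing step) can be sketched as a generic wrapper. The names and result shape are assumptions for illustration, and the third layer, a global Hono `onError` handler, is omitted here to keep the example self-contained.

```typescript
interface TriggerResult {
  status: 200; // the trigger always reports success to the platform
  ok: boolean;
  failedStep?: string; // set when a step threw
}

interface Step {
  name: string;
  run: () => Promise<void>;
}

// Runs steps one at a time so a failure can be attributed to a single write.
async function runTriggerSafely(
  steps: readonly Step[],
  log: (message: string) => void = () => {},
): Promise<TriggerResult> {
  let current = "(none)";
  try {
    for (const step of steps) {
      current = step.name;
      await step.run(); // sequential on purpose, not Promise.all
    }
    return { status: 200, ok: true };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    log(`step "${current}" failed: ${message}`);
    // Swallow the error so the caller never sees a 500.
    return { status: 200, ok: false, failedStep: current };
  }
}
```

Running the writes sequentially trades a little latency for a clear answer to which write failed, which is the partial-failure visibility the write-up describes.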

Accomplishments that we're proud of

Zero external dependencies. The entire app runs inside Devvit's runtime — no API keys to manage, no servers to host, no third-party services to trust with user data.
Toolbox ecosystem integration. Instead of competing with the beloved Mod Toolbox extension, PolicyPilot integrates with it — every ledger entry is mirrored as a Toolbox usernote. Existing Toolbox users get unified data without choosing between tools.
The escalation sequence. The same user progresses from Remove → Warn → Temp Ban across three violations, with the app correctly reading history and recommending the right tier each time. This is exactly the workflow mods currently run manually in their heads.
Playbook dry-run preview. Mods can simulate a playbook against real user data before activating it. This eliminates the #1 fear with automation: "will this accidentally ban innocent people?"

What we learned

The moderator pain points that matter most aren't the flashy ones (AI classification, cross-subreddit networks) — they're the mundane ones: "I removed this user's post yesterday but I can't remember that today."
Building for moderators means building for trust. Deterministic tools that show their reasoning are trusted more than black-box AI that just says "ban this user."
Devvit's @devvit/web framework is powerful but has sharp edges. Form rendering, Redis client differences from standard Redis, and trigger lifecycle nuances require careful testing and defensive coding.

What's next for PolicyPilot

Edit playbooks in place — currently playbooks can be created and deleted but not modified
Manual ledger notes — allow mods to add context notes to a user's history without a specific post
Cross-rule analytics — identify users who violate multiple different rules (behavioral pattern detection)
Optional AI enhancement layer — async API integration for violation summaries and warning message drafting, designed as a progressive enhancement that never blocks the deterministic core