The problem

A spam botnet hits a subreddit. Twenty-five comments arrive in eleven minutes — different accounts, same domain, same template. The mod queue shows them as twenty-five separate decisions: click, read, remove. Click, read, remove. Twenty-five times. By the time the moderator catches up, the spam has already been seen by thousands.

Or: a contentious post attracts a harassment pile-on. Thirty-five comments, ten of them toxic, six reports. The mod queue treats it as forty-one items. A moderator opens the queue and sees chaos.

It's not chaos. It's one incident.

The Reddit mod queue is a list of items. But moderators don't think in items — they think in incidents. Burnout, slow response, and inconsistent enforcement come from forcing humans to do the same triage thirty times in a row.

Inspiration

I wanted a tool that thinks the way mods actually think: in incidents, not items. Group related signals into a single, explainable, reviewable thing — and never take a destructive action without the moderator clicking through.

What it does

ModShield: Incident Desk watches every post submit, comment submit, post/comment report, and AutoModerator filter event in a subreddit. It runs each item through five deterministic scorers (Spam Wave, Harassment Pile-on, Rule Violation Cluster, Duplicate / Repost Burst, Report Storm), picks the highest-confidence type, and merges the item with other recent items that share a primary post, domain, or text fingerprint.
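The "pick the highest-confidence type" step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the type names, the ScoreResult shape, and the tie-breaking rule (first result wins) are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real ModShield types are not shown in this write-up.
type IncidentType =
  | "spam_wave"
  | "harassment_pile_on"
  | "rule_violation_cluster"
  | "duplicate_burst"
  | "report_storm";

interface ScoreResult {
  type: IncidentType;
  score: number; // 0-100
  evidence: string[]; // human-readable strings shown to the moderator
}

// Returns the highest-scoring result that matched at least one signal.
// Ties keep the earlier result, so the same input always yields the same winner.
function pickTopScore(results: ScoreResult[]): ScoreResult | undefined {
  let best: ScoreResult | undefined;
  for (const r of results) {
    if (r.evidence.length === 0) continue; // scorer matched nothing
    if (best === undefined || r.score > best.score) best = r;
  }
  return best;
}
```

Keeping this step a pure function of the scorer outputs is one way to keep the pipeline deterministic: no randomness, no clock reads, no network calls.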

The result is a dashboard of incidents. Each one shows:

  • Confidence (low/medium/high/critical) and 0–100 score
  • The exact signals that matched, with evidence strings ("Repeated domain in window (20)", "Matched 'idiot'", "3 reports on item")
  • Item count, affected users, primary post
  • An estimate of moderator time saved
  • A list of every item with per-row suggested actions

Mods click an incident → review the items table → run a Dry-run resolve to see exactly what would happen on Reddit → click Apply behind a confirmation modal. Every step lands in the audit log.
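A rough sketch of how a dry-run and an apply step can share one plan, with per-item failures isolated. The names planResolve and applyResolve, the action set, and the injected execute callback are hypothetical; the callback stands in for whatever Reddit API call the real app makes.

```typescript
// Hypothetical action set and shapes, for illustration only.
type SuggestedAction = "remove" | "approve" | "ignore";

interface PlannedItem {
  itemId: string;
  action: SuggestedAction;
}

interface ApplyOutcome extends PlannedItem {
  ok: boolean;
  error?: string;
}

// Dry run: a pure function that lists what would happen and touches nothing.
function planResolve(items: PlannedItem[]): PlannedItem[] {
  return items.filter((i) => i.action !== "ignore");
}

// Apply: runs the same plan through an injected executor.
// A failure on one item is recorded and does not stop the remaining items.
async function applyResolve(
  plan: PlannedItem[],
  execute: (item: PlannedItem) => Promise<void>,
): Promise<ApplyOutcome[]> {
  const outcomes: ApplyOutcome[] = [];
  for (const item of plan) {
    try {
      await execute(item);
      outcomes.push({ ...item, ok: true });
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      outcomes.push({ ...item, ok: false, error });
    }
  }
  return outcomes;
}
```

Because the dry-run output is exactly the input to the apply step, what the moderator previews is what gets executed, and each outcome row can be written to the audit log as-is.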

How I built it

  • Devvit Web for the platform — Redis, Reddit API, triggers, scheduler, menu actions, custom posts.
  • Hono + @hono/node-server + Devvit createServer for the HTTP server.
  • TypeScript everywhere. Five scorers under src/server/scoring/, an incident engine under src/server/engine/, storage repos under src/server/storage/ with explicit sorted-set indexes (Devvit Redis doesn't allow KEYS *).
  • In-memory RedisLike adapter so the entire scoring + storage layer runs under Vitest with a MemoryRedis mock. The real Devvit redis client is injected at the boundary in src/server/index.ts.
  • React 18 + TypeScript client with hash-free in-memory routing (the iframe sandbox dropped hashchange events in the early build), mobile-first responsive CSS, and dark-mode support.
  • Vite 7 + @devvit/start/vite plugin to build client and server in one pipeline. Server bundles to dist/server/index.cjs as a single self-contained CommonJS file.
  • 48 Vitest tests across normalize, fingerprint, domains, scoring (5 types), engine merge, storage, audit, metrics, router, resolve dry-run, resolve apply with isolated failures, demo seed.
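The RedisLike adapter idea can be sketched like this. The interface below is a deliberately small, hypothetical subset modeled loosely on sorted-set commands; it is not the Devvit client's actual type, and the project's real MemoryRedis likely covers more commands. Only the key name incident:index:open comes from the write-up.

```typescript
// Hypothetical minimal interface: just enough sorted-set surface for an index.
interface ZMember {
  member: string;
  score: number;
}

interface RedisLike {
  zAdd(key: string, ...members: ZMember[]): Promise<number>;
  zRange(key: string, start: number, stop: number): Promise<ZMember[]>;
}

// In-memory stand-in used by tests; the real client is injected in production.
class MemoryRedis implements RedisLike {
  private sets = new Map<string, Map<string, number>>();

  async zAdd(key: string, ...members: ZMember[]): Promise<number> {
    const set = this.sets.get(key) ?? new Map<string, number>();
    this.sets.set(key, set);
    let added = 0;
    for (const { member, score } of members) {
      if (!set.has(member)) added++;
      set.set(member, score); // re-adding an existing member updates its score
    }
    return added;
  }

  // Rank-based, inclusive range in ascending score order; stop = -1 means "to the end".
  // Assumes start >= 0 to keep the sketch short.
  async zRange(key: string, start: number, stop: number): Promise<ZMember[]> {
    const set = this.sets.get(key);
    if (!set) return [];
    const sorted = Array.from(set, ([member, score]) => ({ member, score })).sort(
      (a, b) => a.score - b.score || (a.member < b.member ? -1 : a.member > b.member ? 1 : 0),
    );
    const end = stop < 0 ? sorted.length + stop + 1 : stop + 1;
    return sorted.slice(start, end);
  }
}

// Storage code depends only on RedisLike, so it runs unchanged against either backend.
async function listOpenIncidentIds(redis: RedisLike): Promise<string[]> {
  const rows = await redis.zRange("incident:index:open", 0, -1);
  return rows.map((r) => r.member);
}
```

Using the creation timestamp as the score, as the test-style usage would, gives a time-ordered index without ever scanning keys.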

Challenges I ran into

  1. Bundling the server correctly. Devvit expects a self-contained CJS bundle at dist/server/index.cjs. My first attempt with esbuild externalized @devvit/* packages, which made the bundle look small and clean but failed at install time with Cannot find module '@devvit/web/server'. Switching to Vite SSR with @devvit/start/vite and noExternal: true fixed it — Devvit's runtime only provides Node builtins; everything else has to be inlined.

  2. UiResponse shape validation. The first menu handler returned { navigateTo, message } and Devvit's runtime rejected it: "unknown key 'message'". UiResponse only accepts navigateTo, showToast, and showForm. Same for trigger handlers — they must return {} (TriggerResponse). Wired all handlers to canonical shapes after that.

  3. Custom-post creation flow. "Open ModShield" originally returned { navigateTo: '/' } — meaningless inside the Reddit client. Switched to reddit.submitCustomPost({ title }) and navigated to the resulting post; that's how the React app gets a host inside Reddit.

  4. In-iframe routing. The first router used window.location.hash. Inside Devvit's webview the hashchange events were dropped, so Settings and History tabs showed "Page not found". Switched to pure React-state routing, which behaves identically in any host context.

  5. No-key-scanning Redis. Devvit Redis explicitly doesn't support KEYS *. Everything is backed by an explicit sorted-set index — incident:index:open, incident:index:resolved, incident:index:type:*, incident:by-domain:*, incident:by-fingerprint:*, incident:by-post:*. Cleaner architecture in the end; it forced the engine to be deliberate about how items are findable.
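As an illustration of challenge 4, state-based routing needs very little machinery. The sketch below is framework-free and hypothetical (the Route union and the MemoryRouter class are invented for the example); in a React client the same idea could be a single useState holding the current route.

```typescript
// Hypothetical route set; Settings and History are the tabs named above.
type Route =
  | { page: "dashboard" }
  | { page: "incident"; id: string }
  | { page: "settings" }
  | { page: "history" };

type RouteListener = (route: Route) => void;

// Holds the current route in memory. Nothing reads or writes window.location,
// so it behaves the same whether or not the host delivers hashchange events.
class MemoryRouter {
  private current: Route = { page: "dashboard" };
  private listeners = new Set<RouteListener>();

  get route(): Route {
    return this.current;
  }

  navigate(next: Route): void {
    this.current = next;
    this.listeners.forEach((listener) => listener(next));
  }

  // Returns an unsubscribe function.
  subscribe(listener: RouteListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }
}
```

The trade-off is that routes are not deep-linkable or preserved across reloads, which matters less for a dashboard embedded in a custom post.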

Accomplishments I'm proud of

  • A scoring system that's explainable, not magic. Every signal carries an evidence string the moderator can read out loud. No "AI score 0.87" — instead, "Repeated domain in window (20)" or "Matched 'idiot'" or "Missing required field (12)".
  • Human-in-the-loop by design. Auto-actions are off by default; every destructive call requires a confirmation modal; a full dry-run preview precedes any Reddit-side action; every step is audited.
  • No external LLMs, no external fetches. Ships under standard Devvit app review.
  • 48/48 unit tests passing, including resolve flow with isolated per-item failures.
  • Polished UI that holds up on mobile and in dark mode.
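To make the "explainable, not magic" point concrete, here is a minimal sketch of a signal that carries its own evidence string. The function name, the input shape, and the window and threshold parameters are assumptions for illustration; only the evidence wording mirrors the examples above.

```typescript
// Hypothetical input shape: one entry per recent item that linked to a domain.
interface SeenLink {
  domain: string;
  createdAtMs: number;
}

interface EvidenceSignal {
  name: string;
  evidence: string; // the exact text a moderator would see
}

// Emits a signal when a domain appears at least minCount times within the window.
// The evidence string embeds the observed count, so the reason is self-describing.
function repeatedDomainSignal(
  items: SeenLink[],
  domain: string,
  nowMs: number,
  windowMs: number,
  minCount: number,
): EvidenceSignal | null {
  const count = items.filter((i) => {
    const age = nowMs - i.createdAtMs;
    return i.domain === domain && age >= 0 && age <= windowMs;
  }).length;
  if (count < minCount) return null;
  return { name: "repeated_domain", evidence: `Repeated domain in window (${count})` };
}
```

Every number in the output can be traced back to a count over concrete items, which is what lets a moderator defend the resulting decision.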

What I learned

  • Devvit Web is a complete platform — Redis, Reddit API, triggers, scheduler, menu, custom posts, mod permissions — and the runtime is strict about response shapes (UiResponse, TriggerResponse, SchedulerResponse). Conform early.
  • For sandboxed iframes, default to in-memory state for routing. URL-based state is fragile.
  • Heuristics + clear evidence > ML black box for a moderation tool. Mods need to defend their decisions; "the model said so" doesn't cut it.

What's next for ModShield: Incident Desk

  • Cross-incident repeat-offender memory with configurable retention.
  • Per-flair rule presets — different required fields per flair.
  • Modmail / Slack digest of new high-confidence incidents for off-platform on-call.
  • Optional, opt-in AI explainer mode (after Devvit app review for premium capabilities) — keep the heuristic engine as the source of truth, layer summarization on top.
  • Community-shared spam/toxic word lists between participating subs.
