Inspiration

Every subreddit moderator knows the feeling. Open the mod queue. 47 items pending. You remove one post, then another, then a third. On item eight you realize that five of the posts you just reviewed came from the same domain, posted within the same two hours by accounts created this week. You've been reviewing a coordinated spam wave one post at a time. There was no way to know.

Reddit's mod queue is a flat chronological list. It shows you what needs review but can't tell you why posts are related or which ones matter most. For a moderator covering a 500k-member subreddit during an election week or a breaking news event, that gap costs a lot of time.

What it does

Sentinel reads your actual mod queue and clusters posts by threat pattern, then lets you batch-act on an entire cluster in one click.

Instead of 47 individual review decisions, you see groups like:

  • ๐ŸŒ Domain Spam Wave โ€” "9 posts from newsfeed-ru.net, accounts all <3 days old ยท ๐Ÿ”ด ACT NOW"
  • ๐Ÿ‘ฅ Account Wave โ€” "5 accounts averaging 2.1 days old, all posted within 2 hours"
  • ๐ŸŽฏ Targeted Harassment โ€” "3 posts naming u/specific_mod โ€” escalate to admins"
  • ๐Ÿ“ข Serial Poster โ€” "4 posts from u/promo_account in the last hour"
  • ๐Ÿ” Near-Duplicate Flood โ€” "6 near-identical titles (similarity โ‰ฅ 0.71)"

Tap Remove All and every post in the cluster is gone. Sentinel also surfaces an AutoMod YAML snippet after domain removals so you can block the next wave before it starts. For false positives (BBC during breaking news, a trusted contributor who posts a lot), Allow ✓ adds them to a permanent per-subreddit allowlist.
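As a rough illustration of the AutoMod handoff, the snippet Sentinel surfaces could be generated along these lines. This is a minimal sketch, not Sentinel's actual output: the function name autoModDomainRule and the action_reason text are assumptions made for the example, while type, domain, action and action_reason are standard AutoModerator rule fields.

```typescript
// Hypothetical helper: builds an AutoModerator rule that removes link posts
// from the given domains. The name and the reason text are illustrative only.
function autoModDomainRule(domains: string[]): string {
  // Normalize so "NewsFeed-ru.net " and "newsfeed-ru.net" produce the same rule.
  const list = domains
    .map((d) => d.trim().toLowerCase())
    .filter((d) => d.length > 0);

  return [
    "---",
    "type: submission",
    `domain: [${list.join(", ")}]`,
    "action: remove",
    'action_reason: "Domain flagged after a Sentinel cluster removal"',
  ].join("\n");
}

// Example: the text a moderator would paste into the AutoModerator config.
const exampleRule = autoModDomainRule(["newsfeed-ru.net"]);
```

The rule is returned as plain text rather than applied automatically, which keeps the moderator in control of what goes into the subreddit's AutoMod configuration.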

What used to take 35 minutes of individual review takes under 3 minutes.

How we built it

Sentinel runs entirely inside Reddit's developer platform. No external servers, no databases, no LLM, no operating cost.

Five-pass clustering pipeline (pure TypeScript, Devvit V8 sandbox):

  1. Domain exact match: ≥3 posts from the same external domain → domain_spam
  2. Account wave detection: ≥4 accounts under 7 days old posting within a 3-hour sliding window → account_wave
  3. MinHash LSH near-duplicate detection: character 3-gram shingles hashed with FNV-32a, 64 MinHash functions, pairwise Jaccard similarity ≥ 0.45 → near_duplicate. O(n²) but n ≤ 300 for typical mod queues, runs in <200ms.
  4. Entity harassment: same u/username mention appearing in 3+ separate posts → targeted_harassment (escalate, not auto-remove)
  5. Serial poster: same author with 3+ unclustered posts → serial_poster
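The first two passes can be sketched as follows. This is an illustrative reading of the thresholds listed above, not Sentinel's source: the QueuePost shape and the function names domainClusters and accountWave are assumptions made for the example.

```typescript
// Hypothetical, simplified view of a queued post.
interface QueuePost {
  id: string;
  author: string;
  domain: string | null; // null for self posts
  createdAt: number; // milliseconds since epoch
  accountAgeDays: number;
}

// Pass 1: group posts by exact domain and keep groups of at least minPosts.
function domainClusters(posts: QueuePost[], minPosts = 3): Map<string, QueuePost[]> {
  const byDomain = new Map<string, QueuePost[]>();
  for (const p of posts) {
    if (p.domain === null) continue;
    const key = p.domain.toLowerCase();
    const group = byDomain.get(key) ?? [];
    group.push(p);
    byDomain.set(key, group);
  }
  const result = new Map<string, QueuePost[]>();
  byDomain.forEach((group, key) => {
    if (group.length >= minPosts) result.set(key, group);
  });
  return result;
}

// Pass 2: slide a time window over posts from young accounts and return the
// largest window containing at least minAccounts distinct authors.
function accountWave(
  posts: QueuePost[],
  minAccounts = 4,
  maxAgeDays = 7,
  windowMs = 3 * 60 * 60 * 1000,
): QueuePost[] {
  const young = posts
    .filter((p) => p.accountAgeDays < maxAgeDays)
    .sort((a, b) => a.createdAt - b.createdAt);

  let best: QueuePost[] = [];
  let start = 0;
  for (let end = 0; end < young.length; end++) {
    // Shrink from the left until the window spans at most windowMs.
    while (young[end].createdAt - young[start].createdAt > windowMs) start++;
    const slice = young.slice(start, end + 1);
    const authors = new Set(slice.map((p) => p.author));
    if (authors.size >= minAccounts && slice.length > best.length) best = slice;
  }
  return best; // empty array when no wave is found
}
```

Sorting once and advancing a left pointer keeps the window pass close to linear after the sort, which fits comfortably in a queue of a few hundred items.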

Priority scoring ranks clusters from 0 to 1 by combining three terms: a violation probability, P(violation) = account_risk × 0.5 + content_risk × 0.5; an impact term, Impact = visibility × 0.7 + recency × 0.3; and an urgency multiplier that ranges from 0.8× (near-duplicate) to 1.6× (harassment).
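One plausible way to put these terms together is sketched below. The weights and the 0.8× and 1.6× multipliers come from the description above; everything else is an assumption made for illustration, in particular multiplying the three terms and clamping to 1, the intermediate multiplier values, and the names ClusterSignals and priority.

```typescript
type ClusterKind =
  | "domain_spam"
  | "account_wave"
  | "near_duplicate"
  | "targeted_harassment"
  | "serial_poster";

// All inputs are assumed to be pre-normalized to the range 0..1.
interface ClusterSignals {
  accountRisk: number;
  contentRisk: number;
  visibility: number;
  recency: number;
}

// Only 0.8 (near-duplicate) and 1.6 (harassment) are stated in the write-up;
// the other three values are placeholders chosen for this sketch.
const URGENCY: Record<ClusterKind, number> = {
  near_duplicate: 0.8,
  serial_poster: 1.0, // placeholder
  account_wave: 1.2, // placeholder
  domain_spam: 1.3, // placeholder
  targeted_harassment: 1.6,
};

function priority(kind: ClusterKind, s: ClusterSignals): number {
  const pViolation = s.accountRisk * 0.5 + s.contentRisk * 0.5;
  const impact = s.visibility * 0.7 + s.recency * 0.3;
  // Assumption: the terms are multiplied and the result clamped to 0..1.
  const raw = pViolation * impact * URGENCY[kind];
  return Math.min(1, Math.max(0, raw));
}
```

Under this reading, a harassment cluster with maximal signals saturates at 1, while a near-duplicate cluster with the same signals tops out at 0.8.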

Report counts from the actual mod queue boost content risk scores, so posts the community has already flagged surface higher.

Data sources:

  1. getModQueue: posts actually awaiting mod decision
  2. getSpam: AutoMod catches and Reddit's ML filter
  3. getNewPosts: proactive, catches waves before reports accumulate

All state lives in Devvit Redis with appropriate TTLs. The war room post is created, stickied, and distinguished on install. Scans run every 5 minutes automatically, with modmail alerts for clusters scoring ≥ 0.65.

Challenges we ran into

We were scanning the wrong data to start. We built the first version against getNewPosts and getControversialPosts, the public feed. Midway through we found that Devvit exposes the actual mod queue API (getModQueue, getSpam) directly. That swap changed what the product is: proactive feed monitoring vs. intelligent triage of things mods were already going to review.

Devvit's JSX constraints are strict. useState only accepts JSONValue, so every complex type had to be serialized to a JSON string and parsed back. Conditional && rendering did not behave for us the way it does in React (a failed condition leaves a non-element value in the tree instead of rendering nothing), so every conditional needed a ternary. There is no gap="xsmall"; the minimum is gap="small". None of these are things the error messages catch cleanly.

Stable cluster IDs. Early versions used timestamp-based IDs. If a mod dismissed a domain spam cluster, re-scanned, and the same domain was still posting, the cluster reappeared with a new ID. Switched to content-based IDs (domain:rt.com, harassment:username) so dismissed clusters stay dismissed across scans.
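The idea behind content-based IDs can be shown in a few lines. This is a minimal sketch assuming the ID is simply a kind prefix plus a normalized key, matching the domain:rt.com and harassment:username examples above; the helper names clusterId and visibleClusters are hypothetical.

```typescript
type IdKind = "domain" | "harassment";

// The ID depends only on what the cluster is about, never on when it was found,
// so the same domain produces the same ID on every scan.
function clusterId(kind: IdKind, key: string): string {
  return `${kind}:${key.trim().toLowerCase()}`;
}

// Hypothetical filter: hide clusters a moderator has already dismissed.
function visibleClusters(ids: string[], dismissed: Set<string>): string[] {
  return ids.filter((id) => !dismissed.has(id));
}

// A dismissal recorded in one scan still applies in the next.
const dismissedIds = new Set<string>([clusterId("domain", "rt.com")]);
const nextScan = [clusterId("domain", "RT.com"), clusterId("harassment", "username")];
const shown = visibleClusters(nextScan, dismissedIds);
```

A timestamp-based ID would fail the same check, since the second scan would mint a new ID for a cluster the moderator had already dismissed.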

False positives are a real problem. BBC during breaking news generates the same signature as a state media spam wave: 7 posts from bbc.com in 2 hours. The allowlist came directly from that realization: permanent per-subreddit trusted-domain storage, so Sentinel doesn't keep flagging sources the mod team has already cleared.
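A sketch of how an allowlist might be applied before clustering is shown below. It assumes exact matching on lowercased domains and uses an in-memory Set in place of the per-subreddit storage; the LinkPost shape and the dropAllowlisted name are made up for the example.

```typescript
// Hypothetical minimal shape; only the domain matters for this filter.
interface LinkPost {
  id: string;
  domain: string | null; // null for self posts
}

// Remove posts whose domain the mod team has already cleared, so they never
// reach the clustering passes. Allowlist entries are assumed to be lowercase.
function dropAllowlisted<T extends LinkPost>(posts: T[], allowlist: Set<string>): T[] {
  return posts.filter(
    (p) => p.domain === null || !allowlist.has(p.domain.toLowerCase()),
  );
}
```

Filtering up front, rather than hiding clusters afterwards, means an allowlisted source also cannot inflate the counts of other clusters.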

Accomplishments that we're proud of

MinHash LSH for near-duplicate detection was the hardest piece to get right. FNV-32a on character 3-gram shingles, 64 hash functions, pairwise Jaccard comparison, all running in a V8 sandbox with no native modules and no npm dependencies beyond the Devvit SDK. A coordinated repost campaign where each account slightly rewords the title is genuinely hard to catch manually. Sentinel catches it statistically.
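The near-duplicate pass described above can be approximated in dependency-free TypeScript roughly as follows. This is a sketch under stated assumptions, not Sentinel's code: the way the 64 hash functions are derived (seeding the FNV offset basis), the title normalization, and all function names are choices made for the example. The 3-gram shingles, 64 functions and 0.45 threshold come from the write-up.

```typescript
// 32-bit FNV-1a over UTF-16 code units. The seed perturbs the offset basis,
// which is one simple (assumed) way to derive a family of hash functions.
function fnv1a32(text: string, seed = 0): number {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Character k-gram shingles of a lowercased, whitespace-collapsed title.
function shingles(title: string, k = 3): Set<string> {
  const s = title.toLowerCase().replace(/\s+/g, " ").trim();
  const out = new Set<string>();
  for (let i = 0; i + k <= s.length; i++) out.add(s.slice(i, i + k));
  return out;
}

// MinHash signature: for each hash function, the minimum hash over all shingles.
function minhash(set: Set<string>, numHashes = 64): number[] {
  const sig = new Array<number>(numHashes).fill(0xffffffff);
  set.forEach((sh) => {
    for (let i = 0; i < numHashes; i++) {
      const h = fnv1a32(sh, Math.imul(i + 1, 0x9e3779b1));
      if (h < sig[i]) sig[i] = h;
    }
  });
  return sig;
}

// The fraction of matching signature slots estimates Jaccard similarity.
function estimateJaccard(a: number[], b: number[]): number {
  let same = 0;
  for (let i = 0; i < a.length; i++) if (a[i] === b[i]) same++;
  return same / a.length;
}

// Pairwise O(n^2) comparison; titles too short to shingle are skipped.
function nearDuplicatePairs(titles: string[], threshold = 0.45): Array<[number, number]> {
  const sigs = titles.map((t) => {
    const set = shingles(t);
    return set.size > 0 ? minhash(set) : null;
  });
  const pairs: Array<[number, number]> = [];
  for (let i = 0; i < sigs.length; i++) {
    for (let j = i + 1; j < sigs.length; j++) {
      const a = sigs[i];
      const b = sigs[j];
      if (a !== null && b !== null && estimateJaccard(a, b) >= threshold) pairs.push([i, j]);
    }
  }
  return pairs;
}
```

With 64 hash functions the estimate moves in steps of 1/64, so scores close to the threshold are noisy; the pairwise loop is what makes the pass quadratic in queue size.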

No configuration, no operating cost, no external infrastructure. The goal was something a mod team could install and forget about, so all the complexity had to live in the algorithm.

What we learned

AutoMod catches problems at submission time, one post at a time. Sentinel needs the full queue to find patterns. They're doing different jobs, and the AutoMod rule suggestion Sentinel surfaces after each removal is the handoff between them: Sentinel spots the pattern, AutoMod prevents the next one.

Devvit is a younger platform, and building on it is a different kind of challenge. The TypeScript constraints are strict, and some things you'd reach for instinctively (native modules, external state, flexible JSX patterns) aren't available. But those constraints push you toward simpler solutions. The entire clustering pipeline had to be self-contained, which meant every tradeoff got more scrutiny than it would have on a stack with more escape hatches.

The Reddit developer community is also smaller and more collaborative than most. When something wasn't documented, there was usually a forum thread or a Discord message from someone who'd hit the same wall. That culture made it easier to move fast on a short deadline.

What's next for Review Helper, aka Sentinel

  • Allowlist management UI: view and remove allowlisted domains/authors without touching Redis directly
  • Comment clustering: the mod queue contains posts and comments; the same patterns apply to both
  • Cross-subreddit signals: for mod teams managing multiple communities, correlating patterns across subreddits
  • Temporal trends: show cluster frequency over time so mods can tell whether a campaign is growing or burning out
