ModShield

Screenshots: All Clear Dashboard, Settings, Active Alert (3 reports), Action to be taken

The Problem We Set Out to Solve

Reddit moderators are under constant siege from a threat that native tools are completely blind to: coordinated brigading. This is when organized groups of accounts — sometimes bots, sometimes real users acting in coordination — flood a single post with reports to abuse Reddit's moderation system and get legitimate content falsely removed.

The mod queue just shows a pile of reports with zero pattern analysis. A mod sees "5 reports" and has no idea if that's five genuine community members or five accounts created yesterday by the same bad actor.

This isn't speculation. A 2026 ACM/CHI research paper on Reddit moderation specifically identified coordinated report abuse as a documented, unsolved gap — moderators called for tools that surface report groupings and reporter metadata to quickly identify and act on coordinated attacks. We built the direct answer to that finding.

What We Built

As far as we know, ModShield is the first Devvit app to treat Reddit's report system as a security signal rather than a simple queue. It runs entirely in the background, processes each report as soon as its trigger fires, and applies a two-factor threat analysis to determine whether the reports reflect organic community concern or a coordinated attack.

The Detection Engine

Every report triggers ModShield via Devvit's onPostReport and onCommentReport background triggers. The engine tracks two factors:

Attack Velocity: the rate at which reports arrive. Five reports in 14 seconds looks very different from five reports spread over two hours.

Reason Similarity: coordinated accounts tend to copy-paste the same report reason. ModShield calculates how similar the reasons are across all reports on the same item.
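
A minimal sketch of how these two measurements could be computed, assuming each report is stored with a timestamp and a reason string. The `Report` shape, the function names, and the choice to define similarity as the share of reports carrying the most common normalized reason are illustrative assumptions, not ModShield's confirmed implementation.

```typescript
// Hypothetical report record; field names are assumptions for illustration.
interface Report {
  timestampMs: number; // when the report trigger fired
  reason: string;      // report reason text
}

// Seconds between the earliest and latest report in the cluster.
function reportSpanSeconds(reports: Report[]): number {
  if (reports.length < 2) return 0;
  const times = reports.map((r) => r.timestampMs);
  return (Math.max(...times) - Math.min(...times)) / 1000;
}

// Share of reports (0 to 1) that carry the single most common reason,
// compared case-insensitively with surrounding whitespace trimmed.
function reasonSimilarity(reports: Report[]): number {
  if (reports.length === 0) return 0;
  const counts = new Map<string, number>();
  let best = 0;
  for (const r of reports) {
    const key = r.reason.trim().toLowerCase();
    const next = (counts.get(key) ?? 0) + 1;
    counts.set(key, next);
    best = Math.max(best, next);
  }
  return best / reports.length;
}

// Five identical reports arriving over 14 seconds.
const burst: Report[] = [0, 3, 7, 10, 14].map((s) => ({
  timestampMs: 1700000000000 + s * 1000,
  reason: "Spam",
}));

console.log(reportSpanSeconds(burst), reasonSimilarity(burst));
```

For the `burst` sample, the span is 14 seconds and every report shares one reason, so the similarity ratio is 1.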

These two factors combine into a Brigade Risk Score from 0 to 100:

Risk Score (0–100) = Velocity Score (0–60) + Reason Similarity Score (0–40)

🔴 HIGH RISK (75–100) — High confidence coordinated attack
🟠 MEDIUM RISK (50–74) — Possible brigade, monitor closely
🟡 LOW RISK (25–49) — Low signals, watch list
🟢 LIKELY LEGITIMATE (0–24) — Organic community reports
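
The formula and tiers above can be sketched as follows. The tier boundaries and the 60/40 split come from the description; the way raw velocity is mapped onto 0 to 60 (linear in reports per minute, saturating at a hypothetical 10 reports per minute) is an assumption made only so the example is complete.

```typescript
type RiskTier = "HIGH" | "MEDIUM" | "LOW" | "LIKELY_LEGITIMATE";

// Assumption: the velocity component maxes out at this report rate.
const SATURATION_REPORTS_PER_MINUTE = 10;

// Velocity component, 0 to 60.
function velocityScore(reportCount: number, spanSeconds: number): number {
  if (reportCount < 2) return 0;
  const minutes = Math.max(spanSeconds, 1) / 60; // avoid dividing by zero
  const perMinute = reportCount / minutes;
  const scaled = (perMinute / SATURATION_REPORTS_PER_MINUTE) * 60;
  return Math.round(Math.min(60, scaled));
}

// Reason similarity component, 0 to 40, from a ratio between 0 and 1.
function similarityScore(ratio: number): number {
  const clamped = Math.min(1, Math.max(0, ratio));
  return Math.round(clamped * 40);
}

// Brigade Risk Score, 0 to 100.
function riskScore(reportCount: number, spanSeconds: number, similarityRatio: number): number {
  return velocityScore(reportCount, spanSeconds) + similarityScore(similarityRatio);
}

// Tier boundaries as listed above.
function riskTier(score: number): RiskTier {
  if (score >= 75) return "HIGH";
  if (score >= 50) return "MEDIUM";
  if (score >= 25) return "LOW";
  return "LIKELY_LEGITIMATE";
}

console.log(riskTier(riskScore(5, 14, 1)));
console.log(riskTier(riskScore(5, 7200, 0.2)));
```

Under these assumptions, five identical reports in 14 seconds score 60 + 40 = 100 (HIGH RISK), while five mostly different reports spread over two hours score 0 + 8 = 8 (LIKELY LEGITIMATE).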

The Intelligence Layer

ModShield doesn't just score attacks — it explains them. Every alert comes with:

Threat Intelligence: "Why this was flagged: 5 reports in 14 seconds • 100% identical report reasons • Pattern resembles coordinated reporting"
Threat Fingerprint: Tags like "Burst reporting", "Duplicate reasons", "Coordinated wave pattern"
Auto-Generated Recommendation: Specific action advice based on the confidence level — remove, monitor, or mark false positive
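
One way the explanation text and fingerprint tags might be assembled from an alert's measurements is sketched below. The tag names and the message format are taken from the examples above; the `AlertMetrics` shape and the cut-offs that decide when each tag applies are hypothetical.

```typescript
// Hypothetical summary of one alert; field names are assumptions.
interface AlertMetrics {
  reportCount: number;
  spanSeconds: number;
  similarityRatio: number; // 0 to 1
}

// Assumed cut-offs for attaching each fingerprint tag.
const BURST_MIN_REPORTS = 3;
const BURST_MAX_SECONDS = 60;
const DUPLICATE_MIN_RATIO = 0.8;

function fingerprintTags(m: AlertMetrics): string[] {
  const tags: string[] = [];
  if (m.reportCount >= BURST_MIN_REPORTS && m.spanSeconds <= BURST_MAX_SECONDS) {
    tags.push("Burst reporting");
  }
  if (m.similarityRatio >= DUPLICATE_MIN_RATIO) {
    tags.push("Duplicate reasons");
  }
  // Both signals together are treated as a coordinated pattern.
  if (tags.length === 2) {
    tags.push("Coordinated wave pattern");
  }
  return tags;
}

function threatIntelligence(m: AlertMetrics): string {
  const parts = [
    `${m.reportCount} reports in ${Math.round(m.spanSeconds)} seconds`,
    `${Math.round(m.similarityRatio * 100)}% identical report reasons`,
  ];
  if (fingerprintTags(m).indexOf("Coordinated wave pattern") !== -1) {
    parts.push("Pattern resembles coordinated reporting");
  }
  return "Why this was flagged: " + parts.join(" • ");
}

console.log(threatIntelligence({ reportCount: 5, spanSeconds: 14, similarityRatio: 1 }));
```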

Proactive & Predictive Features

Pre-emptive Watch List: When a post hits 70% of the report threshold, ModShield flags it before the full alert fires. Mods see the storm building.
Brigade Wave Detection: If 3+ posts are attacked within 10 minutes, a wave banner fires — "POSSIBLE BRIGADE WAVE DETECTED — 17 suspicious reports across 3 posts"
7-Day False Positive Lock: When a mod marks an alert as false positive, ModShield locks that content ID in Redis for 7 days. Future reports are silently dropped, preventing endless alert loops for controversial but legitimate content.
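
The three rules above reduce to small checks, sketched here as pure functions. The numeric constants come from the descriptions; the function names and input shapes are hypothetical, and the real app keeps this state in Redis rather than in plain values.

```typescript
const WATCH_PERCENT = 70;                        // 70% of the report threshold
const WAVE_MIN_POSTS = 3;                        // 3+ attacked posts
const WAVE_WINDOW_MS = 10 * 60 * 1000;           // within 10 minutes
const LOCK_MS = 7 * 24 * 60 * 60 * 1000;         // 7-day false positive lock

// True once reports reach 70% of the threshold but before the full alert fires.
// Integer arithmetic avoids floating-point surprises at the boundary.
function isOnWatchList(reportCount: number, threshold: number): boolean {
  return reportCount * 100 >= threshold * WATCH_PERCENT && reportCount < threshold;
}

// attackTimesMs holds one timestamp per attacked post (assumed input shape).
function isBrigadeWave(attackTimesMs: number[], nowMs: number): boolean {
  const recent = attackTimesMs.filter((t) => {
    const age = nowMs - t;
    return age >= 0 && age <= WAVE_WINDOW_MS;
  });
  return recent.length >= WAVE_MIN_POSTS;
}

// When a lock created at markedAtMs stops applying.
function lockExpiry(markedAtMs: number): number {
  return markedAtMs + LOCK_MS;
}

// Reports on locked content are dropped until the lock expires.
function shouldDropReport(lockExpiresAtMs: number | undefined, nowMs: number): boolean {
  return lockExpiresAtMs !== undefined && nowMs < lockExpiresAtMs;
}

const MINUTE = 60 * 1000;
console.log(isOnWatchList(7, 10));
console.log(isBrigadeWave([0, 4 * MINUTE, 9 * MINUTE], 10 * MINUTE));
```

In a Redis-backed version, the lock could plausibly be a key written with a 7-day expiration, so that a simple existence check replaces the timestamp comparison.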

The Mod Command Center

The interactive dashboard (Mod Tools → Open ModShield Dashboard) shows:

Active threat count, watch list, total clusters caught, estimated moderator time saved in hours
📡 Live Security Events Feed — a timestamped log of every ModShield action
Per-alert display with risk score, velocity, fingerprint, threat intelligence, recommendation
🧹 Recently Resolved — the last 5 resolved alerts with how they were handled
One-click Remove Post — instantly removes via Reddit API and dismisses alert
One-click False Positive — dismisses and applies the 7-day lock
📬 Send Test Weekly Report — instantly sends the full threat intelligence report to modmail

Automation

Zero-Config Install: On install, automatically sends a formatted Welcome Guide to modmail explaining risk scores and how to use every feature
Configurable Thresholds: Mods set their own report threshold (2–20) per subreddit
Weekly Brigade Report: A cron job sends full threat intelligence every Monday at 9am UTC — total clusters intercepted, high-risk attacks, false positives prevented, active fingerprints
Emergency Critical Modmail: Any alert hitting Risk Score ≥ 90 automatically fires an emergency modmail with full threat details
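
The numeric rules in this list can be captured in a few constants and helpers, sketched below. The threshold range, the score cut-off of 90, and the Monday 9am schedule come from the list; the names, the fallback threshold of 5, and the rounding of fractional input are assumptions. The schedule is written as a standard five-field cron expression, assuming the scheduler evaluates it in UTC.

```typescript
const MIN_THRESHOLD = 2;
const MAX_THRESHOLD = 20;
const CRITICAL_SCORE = 90;

// Standard five-field cron: minute 0, hour 9, any day of month, any month, Monday.
const WEEKLY_REPORT_CRON = "0 9 * * 1";

// Keep a mod-supplied threshold inside the supported 2 to 20 range.
// The fallback value is a hypothetical default used for invalid input.
function normalizeThreshold(input: number, fallback: number = 5): number {
  if (!isFinite(input)) return fallback;
  return Math.min(MAX_THRESHOLD, Math.max(MIN_THRESHOLD, Math.round(input)));
}

// Alerts at or above the critical score trigger the emergency modmail.
function needsEmergencyModmail(riskScore: number): boolean {
  return riskScore >= CRITICAL_SCORE;
}

console.log(normalizeThreshold(25), needsEmergencyModmail(92), WEEKLY_REPORT_CRON);
```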

How We Built It

Tech Stack:

Devvit (TypeScript) + Hono web framework for all server routes
Devvit Redis for all state — report counters, alert storage, false positive locks, security event feed, resolved archive, settings
Devvit Triggers — onPostReport, onCommentReport, onAppInstall
Devvit Menus + Forms — the entire interactive dashboard
Reddit API — post removal, modmail dispatch

The architecture is entirely event-driven. There is no polling, no external APIs, no third-party services. ModShield is a pure Devvit-native app that runs entirely within Reddit's infrastructure.

Challenges We Faced

The numReports trap: Reddit's PostReport event sends numReports: 0 in the event payload — it doesn't include the updated count. We discovered this mid-build when our threshold detection wasn't firing. The fix was to implement our own Redis-based counter that increments on every trigger event, making the count reliable and independent of Reddit's event data.
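
A minimal sketch of such a counter follows. It is written against a small `CounterStore` interface that stands in for a Redis-style increment call, with an in-memory implementation so the example runs on its own. The interface, the key layout, and the function names are assumptions rather than the app's actual code.

```typescript
// Hypothetical slice of a Redis-like client: atomically add to a counter
// and return the new value.
interface CounterStore {
  incrBy(key: string, amount: number): Promise<number>;
}

// In-memory stand-in used here in place of a real Redis connection.
class MemoryStore implements CounterStore {
  private data = new Map<string, number>();

  async incrBy(key: string, amount: number): Promise<number> {
    const next = (this.data.get(key) ?? 0) + amount;
    this.data.set(key, next);
    return next;
  }
}

// Assumed key layout: one counter per reported post or comment ID.
function reportCountKey(targetId: string): string {
  return `reports:${targetId}`;
}

// Called once per report trigger. The count comes from our own counter,
// never from the event payload.
async function recordReport(
  store: CounterStore,
  targetId: string,
  threshold: number
): Promise<{ count: number; thresholdReached: boolean }> {
  const count = await store.incrBy(reportCountKey(targetId), 1);
  return { count, thresholdReached: count >= threshold };
}

async function demo(): Promise<void> {
  const store = new MemoryStore();
  for (let i = 0; i < 3; i++) {
    const result = await recordReport(store, "t3_example", 3);
    console.log(result.count, result.thresholdReached);
  }
}

demo();
```

Because the increment returns the new value in a single call, two reports arriving at nearly the same time cannot both read the same stale count.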

Form-based UI constraints: Devvit's form system doesn't support custom HTML or rendering control. We used the disabled: true boolean field technique to create read-only display blocks, and carefully structured the field order to create a dashboard that feels intentional despite platform constraints.

Body parsing: The Devvit menu trigger sends { targetId, location } rather than a subreddit name. We had to look up the subreddit by ID via reddit.getSubredditById() and store it in Redis for subsequent form handlers to access.

Subreddit name propagation: Form handlers run in a separate request context from the menu handler. We solved this by storing the subreddit name in Redis when the dashboard opens, making it available to the form submission handler for features like test weekly reports.

What We Learned

Building ModShield taught us that the most powerful moderation tools aren't the ones that do the most; they're the ones that give mods the right signal at the right moment. The difference between "5 reports" and "5 reports in 14 seconds, all using identical reasons" is the difference between a moderator spending 15 minutes investigating manually and making a confident decision in 5 seconds.

We also learned that Devvit's Redis-first architecture is genuinely well-suited for real-time threat detection — the event triggers, persistent storage, and form system compose cleanly into a security layer that feels native to Reddit rather than bolted on.

What's Next

ModShield lays the groundwork for a cross-subreddit threat intelligence network — where attack signatures detected in one community could (with appropriate privacy safeguards) inform threat scoring in related communities. The brigade that hits r/gaming today is likely to hit r/pcgaming tomorrow. That's the future we're building toward.

Built With

devvit
devvit-redis
hono
node.js
reddit
typescript

Updates

Chetan Swaroop started this project — May 24, 2026 01:15 PM EDT
