Inspiration

Reddit moderators are burning out. A mid-size subreddit with 50,000 weekly active users generates 400–500 posts and comments every day. Moderators review them in reverse-chronological order — meaning they read 95% clean, harmless content just to find the 5% that actually violates the rules. There is no native way to see which items need urgent attention first.

I've watched mod teams on large subreddits shrink year after year because the volume becomes unsustainable. The tooling hasn't kept up. AutoModerator is powerful but brittle — keyword rules create false positives, miss novel violations, and require constant maintenance. There was nothing that could read a post, understand its context against a subreddit's specific rules, and tell a moderator "this one matters."

Gemini 2.0 Flash changed that. With 1 million free tokens per month and sub-second response times, it became possible to score every single post and comment the moment it was created — before any human moderator ever sees it.

What it does

ModSentinel is an AI-powered moderation queue built entirely on Devvit. It installs on any subreddit in three steps and runs silently in the background from that point forward.

The moment any post or comment is created, ModSentinel:

  1. Sends the content to Gemini 2.0 Flash alongside the subreddit's own rules
  2. Returns three independent scores — Spam (0–100), Rule Violation (0–100), and Toxicity (0–100)
  3. Computes a weighted Overall priority score: (Violation × 0.5) + (Spam × 0.3) + (Toxicity × 0.2)
  4. Stores the result in Devvit KV Store
  5. Auto-removes content scoring above the configured threshold (default: 95)
  6. Auto-flairs content scoring above a lower threshold (default: 80) with "⚠️ Needs Review"
  7. Sends an instant mod-mail alert for high-confidence violations (default: 90+)
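
The weighting and threshold steps above can be sketched as two small pure functions. This is a minimal illustration, not the project's actual code: the type and function names are hypothetical, and it assumes all three thresholds are compared against the Overall score and that each rule fires independently.

```typescript
// Sketch only: type and function names are hypothetical. The weights and
// default thresholds are the ones listed above.
interface Scores {
  spam: number;      // 0-100
  violation: number; // 0-100
  toxicity: number;  // 0-100
}

interface Thresholds {
  remove: number; // auto-remove above this value
  alert: number;  // modmail alert at or above this value
  flair: number;  // "Needs Review" flair above this value
}

const DEFAULT_THRESHOLDS: Thresholds = { remove: 95, alert: 90, flair: 80 };

type TriageAction = 'remove' | 'alert' | 'flair';

// Weighted priority: (Violation x 0.5) + (Spam x 0.3) + (Toxicity x 0.2).
function overallScore(s: Scores): number {
  return s.violation * 0.5 + s.spam * 0.3 + s.toxicity * 0.2;
}

// Assumption: every threshold is compared against the overall score, and
// each rule fires independently of the others.
function actionsFor(overall: number, t: Thresholds = DEFAULT_THRESHOLDS): TriageAction[] {
  const actions: TriageAction[] = [];
  if (overall > t.remove) actions.push('remove');
  if (overall >= t.alert) actions.push('alert');
  if (overall > t.flair) actions.push('flair');
  return actions;
}
```

For the sample reported later in this writeup (Spam 95, Rule Violation 90, Toxicity 0), this formula gives about 73.5, which is below all three default thresholds. The 76 quoted there is the overall value Gemini itself returned.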

Moderators open the ModSentinel dashboard — a pinned custom post — to see the entire queue sorted by priority score. Critical violations sit at the top. Clean content sits at the bottom. Mods review in order of urgency, not chronology.

From the dashboard, every item shows:

  • Color-coded score badge (🔴 red 80+, 🟡 amber 60–79, 🔵 blue 40–59, 🟢 green under 40)
  • One-sentence AI reasoning: "This post is blatant spam promoting cryptocurrency"
  • Three score bars: Spam · Rule Violation · Toxicity
  • One-tap actions: ✓ Approve · 🗑 Remove · 🚫 Spam
  • Author, timestamp, and current status
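
The badge bands listed above amount to a simple lookup on the Overall score. A minimal sketch, with a hypothetical function name and assuming fractional scores fall into the band below the next cutoff:

```typescript
type BadgeColor = 'red' | 'amber' | 'blue' | 'green';

// Maps an overall score to the dashboard badge colour:
// red 80+, amber 60-79, blue 40-59, green under 40.
function badgeFor(overall: number): BadgeColor {
  if (overall >= 80) return 'red';
  if (overall >= 60) return 'amber';
  if (overall >= 40) return 'blue';
  return 'green';
}
```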

Measured impact on a 50K WAU subreddit:

                        Without ModSentinel   With ModSentinel
Daily items to review   425                   ~85 (80% auto-triaged)
Time per mod team       4.7 hours             57 minutes
Time saved              -                     3.8 hours/day

Per year, for a 3-person mod team: 1,380 hours returned — the equivalent of a full-time moderator.

How we built it

ModSentinel is built entirely on the Devvit platform with TypeScript, using three Devvit primitives: triggers, KV Store, and custom posts.

Triggers (PostCreate and CommentCreate) fire the moment any content is created. The handler calls the Gemini 2.0 Flash API with the content and the subreddit's configured rules, parses the structured JSON response, and writes the scored item to KV Store — all within seconds of the original post.
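
The shape of that handler might look roughly like the sketch below. To keep it self-contained, the event, scorer, and KV Store are modelled with small stand-in interfaces (PostCreateLike, Scorer, KvLike) rather than Devvit's real types; those names, the `item:` key prefix, and the stored fields are all assumptions. In the actual app this logic would be registered through Devvit's trigger API.

```typescript
// Stand-in shapes (assumptions), loosely mirroring the fields this writeup
// mentions: post.selftext and event.author.name.
interface PostCreateLike {
  post?: { id: string; title: string; selftext: string };
  author?: { name: string };
}

interface KvLike {
  put(key: string, value: string): Promise<void>;
}

interface ScoredItem {
  id: string;
  author: string;
  overall: number;
  reasoning: string;
  createdAt: number; // epoch milliseconds
}

// Hypothetical scorer: wraps the Gemini call and returns a parsed result.
type Scorer = (text: string, rules: string) => Promise<{ overall: number; reasoning: string }>;

async function handlePostCreate(
  event: PostCreateLike,
  rules: string,
  score: Scorer,
  kv: KvLike,
): Promise<ScoredItem | undefined> {
  if (!event.post) return undefined; // nothing to score

  // Score title and body together, in the context of the subreddit's rules.
  const text = `${event.post.title}\n\n${event.post.selftext}`;
  const { overall, reasoning } = await score(text, rules);

  const item: ScoredItem = {
    id: event.post.id,
    author: event.author?.name ?? '[unknown]',
    overall,
    reasoning,
    createdAt: Date.now(),
  };

  // Persist the scored item so the dashboard can load it later.
  await kv.put(`item:${item.id}`, JSON.stringify(item));
  return item;
}
```

Passing the scorer and store in as parameters is a design choice made here for illustration: it lets the flow be exercised with fakes, without a live Gemini key or a Devvit runtime.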

Gemini integration uses a tightly scoped prompt that returns only a JSON object with five fields: spam score, violation score, toxicity score, overall score, and a one-sentence reasoning string. Temperature is set to 0.1 for consistent, near-deterministic scoring. API failures fall back to safe defaults (score 0) so content is never silently blocked.
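
The fallback behaviour can be illustrated with a defensive parser. This is a sketch under assumptions: the JSON key names (spam, violation, toxicity, overall, reasoning) and the placeholder reasoning text are guesses at the response shape, not the project's confirmed schema.

```typescript
// Assumed response shape; the real key names may differ.
interface GeminiScores {
  spam: number;
  violation: number;
  toxicity: number;
  overall: number;
  reasoning: string;
}

// Safe defaults: a score of 0 means nothing is flagged or removed.
const SAFE_DEFAULTS: GeminiScores = {
  spam: 0,
  violation: 0,
  toxicity: 0,
  overall: 0,
  reasoning: 'Scoring unavailable',
};

// Coerces anything that is not a finite number to 0, then clamps to 0-100.
function clampScore(value: unknown): number {
  const n = typeof value === 'number' && Number.isFinite(value) ? value : 0;
  return Math.min(100, Math.max(0, n));
}

// Parses the model's raw text; any malformed response yields the defaults.
function parseGeminiScores(raw: string): GeminiScores {
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data !== 'object' || data === null || Array.isArray(data)) {
      return { ...SAFE_DEFAULTS };
    }
    const obj = data as Record<string, unknown>;
    return {
      spam: clampScore(obj.spam),
      violation: clampScore(obj.violation),
      toxicity: clampScore(obj.toxicity),
      overall: clampScore(obj.overall),
      reasoning: typeof obj.reasoning === 'string' ? obj.reasoning : SAFE_DEFAULTS.reasoning,
    };
  } catch {
    return { ...SAFE_DEFAULTS };
  }
}
```

Failing open to a score of 0 keeps a broken API call from removing content, at the cost of letting that item through unscored.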

The dashboard is a Devvit custom post registered with addCustomPostType. It uses useAsync with a finally callback (Devvit's required pattern for async state) to load the queue from KV Store on render. The queue is serialised as a JSON string in state to satisfy Devvit's JSONValue constraint on useState.
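
The string-in-state workaround can be sketched as a pair of helpers. The names and the QueueItem fields are hypothetical; the idea is that the component holds only the serialised string in state and decodes it at render time.

```typescript
// Hypothetical queue entry; the optional field is the kind of shape that
// motivated storing a string instead of the typed array.
interface QueueItem {
  id: string;
  overall: number;
  reasoning?: string;
}

// Sorts by priority (highest first) and encodes the queue as a plain string,
// which is always a valid JSON-compatible state value.
function serialiseQueue(items: QueueItem[]): string {
  const sorted = [...items].sort((a, b) => b.overall - a.overall);
  return JSON.stringify(sorted);
}

// Decodes the state string; anything malformed becomes an empty queue.
function deserialiseQueue(state: string): QueueItem[] {
  try {
    const parsed: unknown = JSON.parse(state);
    return Array.isArray(parsed) ? (parsed as QueueItem[]) : [];
  } catch {
    return [];
  }
}
```

Sorting before serialising means the dashboard can render items in the stored order without re-sorting on every render.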

Settings use Devvit's installation settings system. The Gemini API key is stored at scope: 'app' (app-level secret). Thresholds and subreddit rules are at scope: 'installation' so each subreddit configures independently.

The full source is on GitHub with zero external runtime dependencies beyond @devvit/public-api.

Challenges we ran into

Devvit's type system is strict in unexpected ways. useState requires JSONValue-compatible types, which means interfaces with optional fields don't satisfy the constraint: an optional field can be undefined, and undefined is not a JSON value. We solved this by serialising the queue as a JSON string in state.

Proto field names differ from the high-level Reddit API names. The PostCreate trigger uses post.selftext (not post.body) and the author username is on event.author.name from UserV2 — not on the post object itself. These mismatches required reading the generated proto type definitions directly.

isSecret settings require scope: 'app', not scope: 'installation'. This isn't prominent in the docs and caused our first devvit upload to fail with a cryptic evaluation error.

useAsync doesn't allow setState in the async function body — it must be called in the finally callback. This caused the dashboard to silently fail to populate until we traced through the hook's source.

Accomplishments that we're proud of

The end-to-end flow works exactly as designed. A post is created → Gemini scores it in under 3 seconds → the score appears in the dashboard with colour-coded bars, AI reasoning, and one-tap actions.

The scoring accuracy is striking. When we posted "BUY CRYPTO NOW — DM FOR GAINS" on our test subreddit, Gemini returned: Spam 95 · Rule Violation 90 · Toxicity 0 · Overall 76 with reasoning: "This post is blatant spam promoting cryptocurrency and soliciting direct messages for financial gain." That's a human-quality moderation decision made automatically in under 3 seconds.

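We're also proud that the tool is genuinely free to operate at modest volumes. The Gemini free tier covers 1 million tokens per month, enough for 3,000–5,000 scored items, or roughly 100–165 items per day. Smaller subreddits stay within that limit; a subreddit at the 400–500 items per day described above would need a larger quota.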
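
The 3,000–5,000 figure corresponds to roughly 200–333 tokens per scored item. A quick way to check headroom for a given subreddit, using hypothetical helper names and assuming a 30-day month and a per-item cost that covers prompt, rules, and response together:

```typescript
// How many items a monthly token budget can score at a given per-item cost.
function itemsPerMonth(monthlyTokenBudget: number, tokensPerItem: number): number {
  return Math.floor(monthlyTokenBudget / tokensPerItem);
}

// Tokens a subreddit would consume in a month at a given daily volume.
function monthlyTokensNeeded(
  itemsPerDay: number,
  tokensPerItem: number,
  daysPerMonth: number = 30,
): number {
  return itemsPerDay * tokensPerItem * daysPerMonth;
}
```

Comparing monthlyTokensNeeded for a subreddit's daily volume against the budget shows whether the free tier is enough before installing.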

What we learned

Devvit is a serious production platform. The trigger system is reliable, the KV Store is fast, and the custom post renderer handles complex UIs well. The constraints (JSONValue state, async patterns, secret scoping) exist for good reasons — security and serialisation consistency across Reddit's infrastructure.

Gemini 2.0 Flash is remarkably good at structured moderation tasks when given clear scoring rubrics and subreddit-specific context. A post scores very differently on a cryptocurrency subreddit versus a personal finance subreddit — which is exactly the right behaviour.

The hardest part of building mod tools isn't the technology. It's understanding the actual workflow. Moderators don't want AI to replace their judgment. They want AI to sort their queue so they can apply their judgment to the items that actually matter.

What's next for ModSentinel

User reputation tracking. Per-author violation history showing "3 prior violations" next to a username. Repeat offenders get flagged higher regardless of individual post score.

Watchlist. Right-click any post → add user to watchlist. Future content from watched users is auto-flagged at 100, no Gemini call needed.

Shadow queue. A dedicated tab for auto-removed items — mods review here for false positives and restore with one tap. Critical for community trust at aggressive thresholds.

Community health score. A live "94% healthy" metric in the dashboard header, calculated as the percentage of content that scored clean or was mod-approved. Useful for tracking subreddit trends over time.

Daily summary mod-mail. Automated 9 AM UTC report: items scored, auto-removed count, average risk score, top risk users.
