🛡️ ZENmode AI — Project Story

Inspiration

Every day, Reddit moderators fight an invisible war.

They wake up to hundreds of reports, spam floods, coordinated raids, toxic comment chains, and scam links — all while volunteering their time for free. There are 2.8 million subreddit moderators on Reddit, and most of them are burning out.

I've personally seen subreddit communities collapse — not because the topic became irrelevant, but because moderation couldn't keep up. A single raid can destroy years of community culture in under 10 minutes. A spam wave can bury genuine discussions under garbage. Toxic comment chains can drive good-faith users away permanently.

The existing tools were built for a different era. AutoModerator is powerful but requires you to think like a programmer. The report queue is a flat, unorganized list. There's no AI layer, no priority system, no predictive protection.

The question that drove this project: What if moderation could be as intelligent as the threats it faces?

That question became ZENmode AI — an autonomous, AI-powered moderation suite built natively on Devvit.

What We Built

ZENmode AI is a comprehensive Reddit moderation platform with 27 integrated tools, organized into 4 core pillars:

Pillar 1 — 🚨 Threat Detection

Real-time, automated defense layer that operates 24/7 without human input.

Module	Function
Spam Detector	Hash-matching + behavioral frequency analysis
Toxic Comment Filter	BERT-based NLP hostility detection
Scam/Phishing Detector	URL threat intelligence + OCR image scanning
Bot/Fake Account Detector	Behavioral biometrics + account metadata analysis
Raid Attack Detector	Traffic velocity monitoring + emergency lockdown
Cross-Subreddit Threat Detector	Platform-wide bad actor tracking
Hate Speech Escalation Tracker	Long-term radicalization pattern monitoring

Pillar 2 — 🧠 AI Intelligence Layer

Dynamic, context-aware AI that goes beyond rigid keyword filters.

Module	Function
Context-Aware Moderation AI	Semantic understanding of sarcasm, slang, intent
AI Moderator Assistant	GPT-4-powered recommendations for ambiguous cases
AI Summary for Long Toxic Threads	LLM-based conflict narrative summarization
AI Ban/Warn Suggestion System	Multi-variable penalty recommendation engine
AI Debate Mediator	Real-time de-escalation interventions
Sentiment-Based Moderation	Emotional tone analysis for mental health communities

Pillar 3 — ⚙️ Automation & Workflow

Rule-based automation that handles the repetitive 80% so mods can focus on the nuanced 20%.

Module	Function
Auto Rule Checker	Structural post format enforcement
Auto Moderation Bot	If/Then programmable rule engine
Auto Flair Assigner	NLP-based content categorization
Auto Archive Manager	Time-based thread lifecycle management
Auto FAQ Responder	Instant answers to repetitive questions
Queue Manager	Priority-sorted moderation work center
Comment Cleanup Tool	Bulk action macro for catastrophic threads
Rule Violation Auto Tagging	Color-coded violation labeling

Pillar 4 — 📊 Analytics & Governance

Data-driven insights and transparent community governance tools.

Module	Function
Community Health Dashboard	Bird's-eye community health scoring
Toxicity Heatmap	When/where toxicity clusters occur
Moderator Workload Tracker	Burnout prevention + fair task distribution
Engagement Trend Analyzer	Growth metrics and content strategy insights
Discussion Bots	Automated community engagement events
Governance Tools	Voting, mod-log ledgers, permission management

How We Built It

Architecture Overview

ZENmode AI uses a 3-layer architecture:

┌─────────────────────────────────────────┐
│         Reddit Platform (Devvit)        │
│   Blocks UI + Triggers + KV Store       │
└──────────────────┬──────────────────────┘
                   │ HTTP API calls
┌──────────────────▼──────────────────────┐
│         FastAPI Backend (Python)        │
│   ML Models + Rule Engine + Redis       │
└──────────────────┬──────────────────────┘
                   │
┌──────────────────▼──────────────────────┐
│         PostgreSQL + Redis              │
│   Persistent storage + Real-time cache  │
└─────────────────────────────────────────┘

Layer 1 — Devvit App (TypeScript)

The Reddit-native layer handles all platform integration:

Custom Post Type renders the ZENmode dashboard directly inside Reddit using Devvit Blocks UI
PostSubmit + CommentSubmit Triggers intercept every new piece of content in real time
KV Store persists moderation statistics across sessions
Devvit Scheduler runs Auto Archive Manager and Discussion Bots on time-based triggers
Menu Actions allow moderators to launch ZENmode with a single click from the subreddit menu

Layer 2 — FastAPI Backend (Python)

The intelligence engine powers all AI and ML features:

ML Stack:

Hugging Face Transformers — BERT for toxic comment classification
scikit-learn — Spam detection, bot probability scoring
spaCy — Named entity recognition, URL analysis
VADER + TextBlob — Sentiment analysis
OpenAI GPT-4 API — AI Moderator Assistant, thread summarization, debate mediation

API Routes:

POST /api/moderation/check-post      → Spam + Rule check
POST /api/analysis/check-comment     → Toxicity + Scam check
GET  /api/moderation/queue           → Fetch prioritized queue
POST /api/moderation/approve         → Approve content
POST /api/moderation/remove          → Remove content
POST /api/suggestions/ban-warn       → AI penalty recommendation
GET  /api/analytics/health           → Community health data

Background Workers (Celery):

queue_worker.py — Processes moderation tasks asynchronously
raid_detector.py: Monitors traffic spikes in real time via Redis streams

Layer 3 — Database

PostgreSQL stores all persistent data:

Moderation actions audit log
User warning history
Community health metrics
Governance votes and mod logs

Redis handles real-time operations:

Sub-millisecond queue priority sorting
Pub/Sub for real-time moderator notifications
Rate limiting for API calls
Session management

The Math Behind Priority Scoring

The Queue Manager uses a weighted priority score to rank items:

$$ P_{score} = w_1 \cdot R_{count} + w_2 \cdot S_{severity} + w_3 \cdot V_{velocity} + w_4 \cdot A_{age} $$

Where:

$R_{count}$ = number of user reports
$S_{severity}$ = violation severity score in $[0, 1]$
$V_{velocity}$ = rate of incoming reports per minute
$A_{age}$ = inverse of content age (newer = higher weight)
$w_1, w_2, w_3, w_4$ = tunable weight coefficients

Priority thresholds:

$$ \text{Priority} = \begin{cases} \text{HIGH} & \text{if } P_{score} \geq 0.75 \\ \text{MEDIUM} & \text{if } 0.40 \leq P_{score} < 0.75 \\ \text{LOW} & \text{if } P_{score} < 0.40 \end{cases} $$
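
As a rough illustration, the scoring and thresholds above could be sketched in Python as follows. The weight values, the normalization caps (`MAX_REPORTS`, `MAX_VELOCITY`), and the `QueueItem` type are assumptions made for this example. The write-up only states that the weights are tunable, and the thresholds suggest that each term is scaled to $[0, 1]$ with weights summing to 1.

```python
from dataclasses import dataclass

# Hypothetical weights (w1..w4); the write-up only says they are tunable.
WEIGHTS = {"reports": 0.3, "severity": 0.4, "velocity": 0.2, "age": 0.1}

# Hypothetical caps used to squash raw values into [0, 1].
MAX_REPORTS = 10
MAX_VELOCITY = 5.0  # reports per minute


@dataclass
class QueueItem:
    report_count: int
    severity: float            # expected to be in [0, 1]
    reports_per_minute: float
    age_minutes: float


def priority_score(item: QueueItem) -> float:
    """Weighted sum of the four normalized signals."""
    r = min(item.report_count / MAX_REPORTS, 1.0)
    s = min(max(item.severity, 0.0), 1.0)
    v = min(item.reports_per_minute / MAX_VELOCITY, 1.0)
    a = 1.0 / (1.0 + item.age_minutes)  # inverse of age: newer content scores higher
    return (WEIGHTS["reports"] * r + WEIGHTS["severity"] * s
            + WEIGHTS["velocity"] * v + WEIGHTS["age"] * a)


def priority_label(score: float) -> str:
    """Map a score to the HIGH / MEDIUM / LOW bands defined above."""
    if score >= 0.75:
        return "HIGH"
    if score >= 0.40:
        return "MEDIUM"
    return "LOW"
```

With these assumed weights, a fresh item with the maximum report count, severity, and velocity scores close to 1.0 and lands in the HIGH band, while an hour-old item with no reports and low severity falls into LOW.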

The Math Behind Ban/Warn Suggestion

The AI Ban/Warn system computes a User Risk Score:

$$ R_{user} = \alpha \cdot H_{violations} + \beta \cdot \left(1 - \frac{T_{account}}{T_{max}}\right) + \gamma \cdot \frac{N_{negative}}{N_{total}} - \delta \cdot C_{positive} $$

Where:

$H_{violations}$ = historical violation count (normalized)
$T_{account}$ = account age in days
$T_{max}$ = maximum considered account age (730 days)
$N_{negative}$ = total negative interactions
$N_{total}$ = total interactions
$C_{positive}$ = positive contribution score
$\alpha, \beta, \gamma, \delta$ = weight parameters

Penalty mapping:

$$ \text{Action} = \begin{cases} \text{Permanent Ban} & \text{if } R_{user} \geq 0.85 \\ \text{Temporary Ban (7d)} & \text{if } 0.65 \leq R_{user} < 0.85 \\ \text{Mute (24h)} & \text{if } 0.40 \leq R_{user} < 0.65 \\ \text{Warning} & \text{if } R_{user} < 0.40 \end{cases} $$
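
The risk score and penalty mapping above might look like this in Python. This is a minimal sketch under stated assumptions: the values of `ALPHA`, `BETA`, `GAMMA`, and `DELTA` are invented for illustration, and clamping the result to $[0, 1]$ is an added assumption so that the subtraction of the positive contribution term cannot push the score below zero.

```python
T_MAX_DAYS = 730  # maximum considered account age, from the definition above

# Hypothetical weights (alpha, beta, gamma, delta); not given in the write-up.
ALPHA, BETA, GAMMA, DELTA = 0.4, 0.2, 0.4, 0.2


def user_risk_score(violations_norm: float, account_age_days: float,
                    negative: int, total: int, positive_norm: float) -> float:
    """Compute R_user as defined above, clamped to [0, 1] (clamping is an assumption)."""
    # Younger accounts contribute more risk; ages beyond T_MAX_DAYS contribute none.
    youth = 1.0 - min(account_age_days, T_MAX_DAYS) / T_MAX_DAYS
    # Guard against users with no recorded interactions.
    negative_ratio = negative / total if total > 0 else 0.0
    raw = (ALPHA * violations_norm + BETA * youth
           + GAMMA * negative_ratio - DELTA * positive_norm)
    return min(max(raw, 0.0), 1.0)


def suggested_action(risk: float) -> str:
    """Map a risk score to the penalty bands defined above."""
    if risk >= 0.85:
        return "Permanent Ban"
    if risk >= 0.65:
        return "Temporary Ban (7d)"
    if risk >= 0.40:
        return "Mute (24h)"
    return "Warning"
```

The output is a suggestion only. In keeping with the rest of this write-up, a moderator would still confirm any ban before it is applied.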

Challenges We Faced

1. Devvit's Blocks UI Constraints

Devvit Blocks is not React. It is a declarative UI system with strict layout rules: no CSS, no onClick prop drilling, and limited state management. We had to rethink our entire UI architecture, moving all state to main.tsx and passing it down cleanly.

Solution: Centralized state in the entry point, pure display components in blocks.

2. Real-time Updates Inside Reddit

Devvit apps don't have WebSocket support natively. Getting the queue to feel "live" required creative use of useAsync combined with KV Store polling.

Solution: Optimistic UI updates — update local state immediately on action, sync with backend asynchronously.

3. Running ML Models at Scale

BERT inference is expensive. Running a full transformer model on every single comment submitted to a large subreddit would create massive latency.

Solution: Two-tier filtering pipeline:

Fast tier — Regex + keyword heuristics (sub-millisecond, catches 70% of cases)
Slow tier — BERT inference only for borderline cases flagged by fast tier

This reduces GPU compute by approximately $80\%$ while maintaining accuracy.

$$ \text{Compute Saved} \approx 1 - \frac{N_{borderline}}{N_{total}} \approx 0.80 $$
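
The two-tier idea can be sketched in a few lines of Python. The regex patterns, the `moderate` function, and the `slow_tier` callable are hypothetical stand-ins: in the real system the slow tier would be the BERT classifier, which is represented here by any function that returns a toxicity probability.

```python
import re
from typing import Callable

# Hypothetical fast-tier patterns; a real deployment would maintain its own lists.
BLOCK_PATTERN = re.compile(r"\b(free crypto|click here to win)\b", re.IGNORECASE)
SUSPICIOUS_PATTERN = re.compile(r"\b(idiot|stupid|hate)\b", re.IGNORECASE)


def fast_tier(text: str) -> str:
    """Cheap heuristics: returns 'remove', 'borderline', or 'allow'."""
    if BLOCK_PATTERN.search(text):
        return "remove"
    if SUSPICIOUS_PATTERN.search(text):
        return "borderline"
    return "allow"


def moderate(text: str, slow_tier: Callable[[str], float],
             threshold: float = 0.5) -> str:
    """Run the fast tier first; call the expensive model only for borderline text."""
    verdict = fast_tier(text)
    if verdict != "borderline":
        return verdict
    return "remove" if slow_tier(text) >= threshold else "allow"
```

Because `slow_tier` is only invoked on the borderline branch, the share of comments that reach the model is exactly the $N_{borderline} / N_{total}$ ratio in the formula above.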

4. Context-Aware Moderation — The Sarcasm Problem

Basic toxicity models flag benign sentences like "You absolutely killed that performance!" Training a model to understand context required fine-tuning on Reddit-specific conversational data.

Solution: Fine-tuned RoBERTa on a Reddit-scraped dataset with human-labeled context annotations. Added surrounding comment thread as input context window.

5. Coordinated Raid Detection — False Positives

Legitimate viral posts also cause traffic spikes. Distinguishing a genuine viral moment from a coordinated raid required more than just velocity monitoring.

Solution: Multi-signal detection combining:

Account age distribution of new posters
Subreddit membership duration
Cross-subreddit origin tracking
Semantic similarity of incoming content

$$ \text{Raid Score} = \frac{\sum_{i=1}^{n} \mathbb{1}[\text{age}_i < 7\text{ days}]}{n} \cdot V_{velocity} \cdot S_{semantic\_similarity} $$
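
A direct Python translation of the raid score might look like the sketch below. The function name and parameters are illustrative, and the scales of the velocity and similarity inputs are assumptions, since the write-up does not specify how they are normalized or what score triggers a lockdown.

```python
from typing import Sequence


def raid_score(account_ages_days: Sequence[float], velocity: float,
               semantic_similarity: float, new_account_days: float = 7.0) -> float:
    """Fraction of posters with young accounts, scaled by velocity and similarity."""
    if not account_ages_days:
        return 0.0  # no posters in the window, so nothing to score
    young = sum(1 for age in account_ages_days if age < new_account_days)
    return (young / len(account_ages_days)) * velocity * semantic_similarity
```

Since the three factors are multiplied, a viral post driven by established accounts scores near zero even at high velocity, which is the false-positive case this challenge describes.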

What We Learned

Moderation is deeply human: AI can handle 80% of cases, but the remaining 20% of edge cases require human judgment. The best tool augments humans rather than replacing them.
Devvit is genuinely powerful: Building natively inside Reddit means zero friction for moderators. No external login, no separate dashboard, no context switching.
Speed matters more than perfection: A fast, 90%-accurate filter that responds in 50ms protects the community better than a perfect model that takes 2 seconds.
Moderator burnout is a real crisis: Building the Workload Tracker revealed how unevenly distributed moderation labor is. Tools that protect moderators are as important as tools that protect communities.
Privacy-first design is non-negotiable: All user data is anonymized before ML processing. No personally identifiable information is stored in the analytics pipeline.

What's Next

[ ] Mobile-optimized Blocks UI for on-the-go moderation
[ ] Multi-subreddit dashboard for large mod teams managing multiple communities
[ ] Federated threat intelligence sharing between opt-in subreddits
[ ] Custom ML model fine-tuning per subreddit (community-specific language patterns)
[ ] Integration with Reddit Developer Funds for sustained development

Built With

Devvit, TypeScript, Python, FastAPI, PostgreSQL, Redis, Hugging Face Transformers, BERT, RoBERTa, scikit-learn, spaCy, OpenAI GPT-4, Celery, Docker, React, Tailwind CSS

Built with ❤️ for the Reddit moderation community — the unsung heroes of the internet.


Updates

Ankit Pandit started this project — May 27, 2026 12:34 PM EDT
