🛡️ ZENmode AI — Project Story
Inspiration
Every day, Reddit moderators fight an invisible war.
They wake up to hundreds of reports, spam floods, coordinated raids, toxic comment chains, and scam links — all while volunteering their time for free. There are 2.8 million subreddit moderators on Reddit, and most of them are burning out.
I've personally seen subreddit communities collapse — not because the topic became irrelevant, but because moderation couldn't keep up. A single raid can destroy years of community culture in under 10 minutes. A spam wave can bury genuine discussions under garbage. Toxic comment chains can drive good-faith users away permanently.
The existing tools were built for a different era. AutoModerator is powerful but requires you to think like a programmer. The report queue is a flat, unorganized list. There's no AI layer, no priority system, no predictive protection.
The question that drove this project: What if moderation could be as intelligent as the threats it faces?
That question became ZENmode AI — an autonomous, AI-powered moderation suite built natively on Devvit.
What We Built
ZENmode AI is a comprehensive Reddit moderation platform with 27 integrated tools, organized into 4 core pillars:
Pillar 1 — 🚨 Threat Detection
Real-time, automated defense layer that operates 24/7 without human input.
| Module | Function |
|---|---|
| Spam Detector | Hash-matching + behavioral frequency analysis |
| Toxic Comment Filter | BERT-based NLP hostility detection |
| Scam/Phishing Detector | URL threat intelligence + OCR image scanning |
| Bot/Fake Account Detector | Behavioral biometrics + account metadata analysis |
| Raid Attack Detector | Traffic velocity monitoring + emergency lockdown |
| Cross-Subreddit Threat Detector | Platform-wide bad actor tracking |
| Hate Speech Escalation Tracker | Long-term radicalization pattern monitoring |
Pillar 2 — 🧠 AI Intelligence Layer
Dynamic, context-aware AI that goes beyond rigid keyword filters.
| Module | Function |
|---|---|
| Context-Aware Moderation AI | Semantic understanding of sarcasm, slang, intent |
| AI Moderator Assistant | GPT-4 powered recommendations for ambiguous cases |
| AI Summary for Long Toxic Threads | LLM-based conflict narrative summarization |
| AI Ban/Warn Suggestion System | Multi-variable penalty recommendation engine |
| AI Debate Mediator | Real-time de-escalation interventions |
| Sentiment-Based Moderation | Emotional tone analysis for mental health communities |
Pillar 3 — ⚙️ Automation & Workflow
Rule-based automation that handles the repetitive 80% so mods can focus on the nuanced 20%.
| Module | Function |
|---|---|
| Auto Rule Checker | Structural post format enforcement |
| Auto Moderation Bot | If/Then programmable rule engine |
| Auto Flair Assigner | NLP-based content categorization |
| Auto Archive Manager | Time-based thread lifecycle management |
| Auto FAQ Responder | Instant answers to repetitive questions |
| Queue Manager | Priority-sorted moderation work center |
| Comment Cleanup Tool | Bulk action macro for catastrophic threads |
| Rule Violation Auto Tagging | Color-coded violation labeling |
Pillar 4 — 📊 Analytics & Governance
Data-driven insights and transparent community governance tools.
| Module | Function |
|---|---|
| Community Health Dashboard | Bird's-eye community health scoring |
| Toxicity Heatmap | When/where toxicity clusters occur |
| Moderator Workload Tracker | Burnout prevention + fair task distribution |
| Engagement Trend Analyzer | Growth metrics and content strategy insights |
| Discussion Bots | Automated community engagement events |
| Governance Tools | Voting, mod-log ledgers, permission management |
How We Built It
Architecture Overview
ZENmode AI uses a 3-layer architecture:
┌─────────────────────────────────────────┐
│        Reddit Platform (Devvit)         │
│     Blocks UI + Triggers + KV Store     │
└──────────────────┬──────────────────────┘
                   │ HTTP API calls
┌──────────────────▼──────────────────────┐
│        FastAPI Backend (Python)         │
│     ML Models + Rule Engine + Redis     │
└──────────────────┬──────────────────────┘
                   │
┌──────────────────▼──────────────────────┐
│           PostgreSQL + Redis            │
│  Persistent storage + Real-time cache   │
└─────────────────────────────────────────┘
Layer 1 — Devvit App (TypeScript)
The Reddit-native layer handles all platform integration:
- Custom Post Type renders the ZENmode dashboard directly inside Reddit using Devvit Blocks UI
- PostSubmit + CommentSubmit Triggers intercept every new piece of content in real-time
- KV Store persists moderation statistics across sessions
- Devvit Scheduler runs Auto Archive Manager and Discussion Bots on time-based triggers
- Menu Actions allow moderators to launch ZENmode with a single click from the subreddit menu
Layer 2 — FastAPI Backend (Python)
The intelligence engine powers all AI and ML features:
ML Stack:
- Hugging Face Transformers: BERT for toxic comment classification
- scikit-learn: Spam detection, bot probability scoring
- spaCy: Named entity recognition, URL analysis
- VADER + TextBlob: Sentiment analysis
- OpenAI GPT-4 API: AI Moderator Assistant, thread summarization, debate mediation
API Routes:
POST /api/moderation/check-post → Spam + Rule check
POST /api/analysis/check-comment → Toxicity + Scam check
GET /api/moderation/queue → Fetch prioritized queue
POST /api/moderation/approve → Approve content
POST /api/moderation/remove → Remove content
POST /api/suggestions/ban-warn → AI penalty recommendation
GET /api/analytics/health → Community health data
Background Workers (Celery):
- queue_worker.py: Processes moderation tasks asynchronously
- raid_detector.py: Monitors traffic spikes in real-time via Redis streams
Layer 3 — Database
PostgreSQL stores all persistent data:
- Moderation actions audit log
- User warning history
- Community health metrics
- Governance votes and mod logs
Redis handles real-time operations:
- Sub-millisecond queue priority sorting
- Pub/Sub for real-time moderator notifications
- Rate limiting for API calls
- Session management
The Math Behind Priority Scoring
The Queue Manager uses a weighted priority score to rank items:
$$ P_{score} = w_1 \cdot R_{count} + w_2 \cdot S_{severity} + w_3 \cdot V_{velocity} + w_4 \cdot A_{age} $$
Where:
- $R_{count}$ = number of user reports
- $S_{severity}$ = violation severity score $(0-1)$
- $V_{velocity}$ = rate of incoming reports per minute
- $A_{age}$ = inverse of content age (newer = higher weight)
- $w_1, w_2, w_3, w_4$ = tunable weight coefficients
Priority thresholds:
$$ \text{Priority} = \begin{cases} \text{HIGH} & \text{if } P_{score} \geq 0.75 \\ \text{MEDIUM} & \text{if } 0.40 \leq P_{score} < 0.75 \\ \text{LOW} & \text{if } P_{score} < 0.40 \end{cases} $$
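As a rough illustration, the scoring and bucketing above could be sketched in Python as follows. This is a minimal sketch, not the project's actual code: the weight values, the `QueueItem` type, and the function names are hypothetical, and it assumes every input has already been normalized to the range 0 to 1 (with weights summing to 1) so that the thresholds 0.75 and 0.40 are meaningful.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical weights w1..w4 (the write-up only says they are tunable).
# They sum to 1 so that normalized inputs keep the score within [0, 1].
WEIGHTS = {"reports": 0.3, "severity": 0.4, "velocity": 0.2, "age": 0.1}


@dataclass
class QueueItem:
    item_id: str
    reports: float   # R_count, assumed normalized to [0, 1]
    severity: float  # S_severity, in [0, 1]
    velocity: float  # V_velocity, assumed normalized to [0, 1]
    age: float       # A_age, inverse content age, assumed scaled so newest = 1


def priority_score(item: QueueItem) -> float:
    """Weighted sum P_score = w1*R + w2*S + w3*V + w4*A."""
    return (
        WEIGHTS["reports"] * item.reports
        + WEIGHTS["severity"] * item.severity
        + WEIGHTS["velocity"] * item.velocity
        + WEIGHTS["age"] * item.age
    )


def priority_label(score: float) -> str:
    """Map a score to the HIGH / MEDIUM / LOW buckets from the thresholds."""
    if score >= 0.75:
        return "HIGH"
    if score >= 0.40:
        return "MEDIUM"
    return "LOW"


def sort_queue(items: List[QueueItem]) -> List[QueueItem]:
    """Return items ordered from highest to lowest priority score."""
    return sorted(items, key=priority_score, reverse=True)
```

With these illustrative weights, severity dominates the ranking, so a heavily reported but mild item would sort below a severe item with few reports; a real deployment would tune the weights per community.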
The Math Behind Ban/Warn Suggestion
The AI Ban/Warn system computes a User Risk Score:
$$ R_{user} = \alpha \cdot H_{violations} + \beta \cdot \left(1 - \frac{T_{account}}{T_{max}}\right) + \gamma \cdot \frac{N_{negative}}{N_{total}} - \delta \cdot C_{positive} $$
Where:
- $H_{violations}$ = historical violation count (normalized)
- $T_{account}$ = account age in days
- $T_{max}$ = maximum considered account age (730 days)
- $N_{negative}$ = total negative interactions
- $N_{total}$ = total interactions
- $C_{positive}$ = positive contribution score
- $\alpha, \beta, \gamma, \delta$ = weight parameters
Penalty mapping:
$$ \text{Action} = \begin{cases} \text{Permanent Ban} & \text{if } R_{user} \geq 0.85 \\ \text{Temporary Ban (7d)} & \text{if } 0.65 \leq R_{user} < 0.85 \\ \text{Mute (24h)} & \text{if } 0.40 \leq R_{user} < 0.65 \\ \text{Warning} & \text{if } R_{user} < 0.40 \end{cases} $$
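The risk score and penalty mapping can be sketched the same way. The weight values below are hypothetical placeholders for the weight parameters, and three details are assumptions rather than anything stated above: account age is capped at $T_{max}$, a user with no interactions gets a negative ratio of 0, and the final score is clamped to the range 0 to 1.

```python
T_MAX_DAYS = 730  # maximum considered account age, from the definition above

# Hypothetical values for alpha, beta, gamma, delta (not specified in the text).
ALPHA, BETA, GAMMA, DELTA = 0.5, 0.2, 0.3, 0.2


def user_risk_score(
    violations: float,       # H_violations, normalized to [0, 1]
    account_age_days: float,  # T_account
    negative: int,            # N_negative
    total: int,               # N_total
    positive: float,          # C_positive, assumed to be in [0, 1]
) -> float:
    """Compute R_user from the four weighted terms."""
    # Assumption: ages beyond T_max count the same as T_max.
    age_term = 1.0 - min(account_age_days, T_MAX_DAYS) / T_MAX_DAYS
    # Assumption: no interactions means no negative signal.
    negative_ratio = negative / total if total > 0 else 0.0
    score = (
        ALPHA * violations
        + BETA * age_term
        + GAMMA * negative_ratio
        - DELTA * positive
    )
    # Assumption: clamp so the thresholds below always apply cleanly.
    return max(0.0, min(1.0, score))


def suggested_action(score: float) -> str:
    """Map R_user to the penalty tiers listed above."""
    if score >= 0.85:
        return "Permanent Ban"
    if score >= 0.65:
        return "Temporary Ban (7d)"
    if score >= 0.40:
        return "Mute (24h)"
    return "Warning"
```

Because the output is a suggestion, a moderator would still confirm the action; the sketch only returns a label and takes no action itself.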
Challenges We Faced
1. Devvit's Blocks UI Constraints
Devvit Blocks is not React; it's a declarative UI system with strict layout rules: no CSS, no onClick prop drilling, and limited state management. We had to rethink our entire UI architecture, moving all state to main.tsx and passing it down cleanly.
Solution: Centralized state in the entry point, pure display components in blocks.
2. Real-time Updates Inside Reddit
Devvit apps don't have WebSocket support natively. Getting the queue to feel "live" required creative use of useAsync combined with KV Store polling.
Solution: Optimistic UI updates — update local state immediately on action, sync with backend asynchronously.
3. Running ML Models at Scale
BERT inference is expensive. Running a full transformer model on every single comment submitted to a large subreddit would create massive latency.
Solution: Two-tier filtering pipeline:
- Fast tier — Regex + keyword heuristics (sub-millisecond, catches 70% of cases)
- Slow tier — BERT inference only for borderline cases flagged by fast tier
This reduces GPU compute by approximately $80\%$ while maintaining accuracy.
$$ \text{Compute Saved} \approx 1 - \frac{N_{borderline}}{N_{total}} \approx 0.80 $$
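A minimal sketch of the two-tier idea, assuming the slow tier is any callable that returns a toxicity probability. The regex patterns, the 0.5 threshold, and the function names are hypothetical; in the real system the slow classifier would wrap the BERT model, but here it is passed in so the sketch runs without any ML dependency.

```python
import re
from typing import Callable

# Hypothetical patterns for illustration only.
# Obvious violations: removed by the fast tier with no model call.
BLOCKLIST = re.compile(r"\b(free crypto|click here now)\b", re.IGNORECASE)
# Ambiguous signals: escalated to the slow tier for a closer look.
SUSPICIOUS = re.compile(r"\b(idiot|stupid|scam)\b", re.IGNORECASE)


def fast_tier(text: str) -> str:
    """Cheap heuristic pass: returns 'remove', 'borderline', or 'allow'."""
    if BLOCKLIST.search(text):
        return "remove"
    if SUSPICIOUS.search(text):
        return "borderline"
    return "allow"


def moderate(
    text: str,
    slow_classifier: Callable[[str], float],
    threshold: float = 0.5,
) -> str:
    """Run the fast tier first; call the expensive model only when needed."""
    verdict = fast_tier(text)
    if verdict != "borderline":
        return verdict
    # Only borderline content pays the cost of model inference.
    return "remove" if slow_classifier(text) >= threshold else "allow"
```

The compute saving comes from how rarely `slow_classifier` is invoked: if only a fifth of comments reach the borderline branch, the model runs on roughly 20% of traffic.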
4. Context-Aware Moderation — The Sarcasm Problem
Basic toxicity models flag benign sentences like "You absolutely killed that performance!" Training a model to understand context required fine-tuning on Reddit-specific conversational data.
Solution: Fine-tuned RoBERTa on a Reddit-scraped dataset with human-labeled context annotations. Added surrounding comment thread as input context window.
5. Coordinated Raid Detection — False Positives
Legitimate viral posts also cause traffic spikes. Distinguishing a genuine viral moment from a coordinated raid required more than just velocity monitoring.
Solution: Multi-signal detection combining:
- Account age distribution of new posters
- Subreddit membership duration
- Cross-subreddit origin tracking
- Semantic similarity of incoming content
$$ \text{Raid Score} = \frac{\sum_{i=1}^{n} \mathbb{1}[\text{age}_i < 7\text{ days}]}{n} \cdot V_{velocity} \cdot S_{semantic\_similarity} $$
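The raid score is the fraction of posters with accounts younger than 7 days, multiplied by the velocity and semantic similarity signals. A minimal sketch, assuming velocity and similarity are both already normalized to the range 0 to 1; the function name and the empty-input behavior are hypothetical choices, not part of the original design.

```python
from typing import Sequence


def raid_score(
    account_ages_days: Sequence[float],
    velocity: float,             # V_velocity, assumed normalized to [0, 1]
    semantic_similarity: float,  # S_semantic_similarity, assumed in [0, 1]
    new_account_days: float = 7,
) -> float:
    """Fraction of new accounts times velocity times content similarity."""
    n = len(account_ages_days)
    if n == 0:
        # Assumption: no recent posters means no raid signal.
        return 0.0
    new_fraction = sum(1 for age in account_ages_days if age < new_account_days) / n
    return new_fraction * velocity * semantic_similarity
```

Because the three factors are multiplied, a viral post with high velocity but mostly established accounts and varied content scores low, which is the false-positive case the multi-signal approach is meant to avoid.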
What We Learned
Moderation is deeply human: AI can handle 80% of cases, but the remaining 20% are edge cases that require human judgment. The best tool augments humans rather than replacing them.
Devvit is genuinely powerful: Building natively inside Reddit means zero friction for moderators. No external login, no separate dashboard, no context switching.
Speed matters more than perfection: A fast, 90%-accurate filter that responds in 50ms protects the community better than a perfect model that takes 2 seconds.
Moderator burnout is a real crisis: Building the Workload Tracker revealed how unevenly distributed moderation labor is. Tools that protect moderators are as important as tools that protect communities.
Privacy-first design is non-negotiable: All user data is anonymized before ML processing. No personally identifiable information is stored in the analytics pipeline.
What's Next
- [ ] Mobile-optimized Blocks UI for on-the-go moderation
- [ ] Multi-subreddit dashboard for large mod teams managing multiple communities
- [ ] Federated threat intelligence sharing between opt-in subreddits
- [ ] Custom ML model fine-tuning per subreddit (community-specific language patterns)
- [ ] Integration with Reddit Developer Funds for sustained development
Built With
Devvit TypeScript Python FastAPI PostgreSQL Redis Hugging Face Transformers BERT RoBERTa scikit-learn spaCy OpenAI GPT-4 Celery Docker React Tailwind CSS
Built with ❤️ for the Reddit moderation community — the unsung heroes of the internet.
Built With
- built
- devvit
- docker
- fastify
- jest
- node.js
- postgresql
- react
- typescript
- with
- zod

