RedDefence AI - Devpost Submission
Category
Best New Mod Tool
App Listing
https://developers.reddit.com/apps/reddefences
Reddit Usernames
- u/InterestingRate6008
Tagline
Defend your subreddit before chaos spreads.
One-Liner
RedDefence AI is a Devvit-native, LLM-first moderation command center that helps Reddit moderators detect toxic content, duplicate swarms, risky users, rule confusion, and thread-level chaos, then respond with evidence, audit logs, and rollback.
Elevator Pitch
RedDefence AI gives moderators an AI-first defence layer inside Reddit. Instead of forcing mod teams to scan raw queues during heated moments, RedDefence organizes risk by thread, explains why content was flagged, highlights duplicate swarms and risky user patterns, answers repeated rule questions, and keeps every action reversible through audit logs and rollback.
The product philosophy is simple:
Autonomous when confidence is high. Assistive when context is unclear. Transparent everywhere.
RedDefence is not designed to be an opaque punishment machine. It is designed to be a moderator command center: fast, explainable, configurable, and safe.
About the Project
Inspiration
Reddit moderators rarely deal with just one bad comment at a time. Real moderation problems often arrive as incidents: a match thread gets heated, several users pile on, duplicate comments spread, rule questions repeat, and the mod queue becomes hard to read quickly.
Many moderation tools are useful, but they usually focus on one narrow problem: keyword filters, report queues, spam checks, or simple removal automation. I wanted to build something closer to the way moderators actually think during live chaos:
- What is happening right now?
- Which thread is getting worse?
- Which users need attention, and why?
- Is this one bad comment or a duplicate swarm?
- Can this rule question be answered automatically?
- What action was taken, and can it be reversed?
RedDefence AI was built around that incident-level view. The goal is to help moderators move from raw noise to clear decisions.
What It Does
RedDefence AI gives moderators a native Reddit command center with:
- Defence Command Center for live subreddit risk.
- Live Risk Feed for posts/comments requiring attention.
- Thread War Room for post-wise and thread-wise moderation.
- Toxicity Shield for AI-based risk scoring and recommended actions.
- Duplicate Swarm Detector for copied comments, repeated links, and coordinated spam patterns.
- Risky User Profiles with evidence, positive signals, and suggested actions.
- Rule Question Responder for repeated moderation/rules questions.
- Policy Lab for thresholds, operating modes, and shadow previews.
- Emergency Shield for temporary stricter protection during raids or live chaos.
- Daily Defence Briefing for summarized moderator intelligence.
- Incident Timeline for understanding how a situation unfolded.
- Audit Log and Rollback for transparency and reversibility.
- Ask RedDefence for plain-language answers about community risk.
How I Built It
RedDefence AI is built as a Devvit-native web app using:
- Reddit Devvit
- TypeScript
- React
- Vite
- Tailwind CSS
- Devvit Redis for subreddit-scoped storage
- Devvit triggers for new post/comment analysis
- Server-side API routes for moderation workflows
- Server-side LLM calls for classification and answer generation
- Policy-controlled action engine
- Audit logging and rollback tracking
The client only talks to RedDefence's own backend endpoints. AI provider calls happen server-side so API keys are not exposed to the browser.
The app is LLM-first: every post and comment is meant to be analyzed by an AI model against a strict moderation schema. The system asks the model for a risk score, confidence, category, matched rule, evidence, recommended action, and whether human review is needed. When provider access is unavailable, RedDefence fails closed into moderator review instead of treating a low-quality fallback as equivalent to AI judgment.
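As a rough illustration of the "structured, validated, fail closed" idea, here is a minimal TypeScript sketch. The field names, value ranges, action labels, and the `parseDecision` and `failClosed` helpers are assumptions made for this example, not the app's actual schema or code.

```typescript
// Hypothetical decision shape; field names and allowed values are assumptions.
type RecommendedAction = "none" | "review" | "warn" | "remove";

interface ModerationDecision {
  riskScore: number; // assumed range 0..100
  confidence: number; // assumed range 0..1
  category: string;
  matchedRule: string | null;
  evidence: string;
  recommendedAction: RecommendedAction;
  needsHumanReview: boolean;
}

const ALLOWED_ACTIONS: string[] = ["none", "review", "warn", "remove"];

// Used whenever the provider is unavailable or its output fails validation.
function failClosed(reason: string): ModerationDecision {
  return {
    riskScore: 0,
    confidence: 0,
    category: "unclassified",
    matchedRule: null,
    evidence: reason,
    recommendedAction: "review",
    needsHumanReview: true,
  };
}

function isNumberInRange(value: unknown, min: number, max: number): boolean {
  return typeof value === "number" && isFinite(value) && value >= min && value <= max;
}

// Parse raw model output; pass null when the provider call itself failed.
function parseDecision(raw: string | null): ModerationDecision {
  if (raw === null) return failClosed("AI provider unavailable");
  let data: any;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return failClosed("Model output was not valid JSON");
  }
  if (typeof data !== "object" || data === null) {
    return failClosed("Model output was not a JSON object");
  }
  const valid =
    isNumberInRange(data.riskScore, 0, 100) &&
    isNumberInRange(data.confidence, 0, 1) &&
    typeof data.category === "string" &&
    (typeof data.matchedRule === "string" || data.matchedRule === null) &&
    typeof data.evidence === "string" &&
    ALLOWED_ACTIONS.indexOf(data.recommendedAction) !== -1 &&
    typeof data.needsHumanReview === "boolean";
  if (!valid) return failClosed("Model output did not match the schema");
  return {
    riskScore: data.riskScore,
    confidence: data.confidence,
    category: data.category,
    matchedRule: data.matchedRule,
    evidence: data.evidence,
    recommendedAction: data.recommendedAction,
    needsHumanReview: data.needsHumanReview,
  };
}
```

In this sketch, any failure path yields a decision that recommends review and sets the human review flag, so downstream automation never acts on output it could not validate.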
What I Learned
The biggest lesson was that moderation tools need trust more than spectacle. It is easy to build a flashy classifier. It is much harder to build a system a moderator would actually rely on during a stressful live thread.
That shaped several design decisions:
- Every decision needs evidence.
- User risk must be subreddit-scoped, not a permanent reputation score.
- Positive signals matter, because moderation should not only remember the worst moment.
- Automation should be configurable and reversible.
- LLM output should be structured, validated, and explainable.
- A mod should understand the current community state in seconds.
I also learned a lot about practical Devvit constraints, especially around external HTTP access, app settings, moderator-only access, and building a reliable webview experience inside Reddit.
Challenges
The hardest challenge was balancing an AI-first product with safe moderation. RedDefence is meant to make strong recommendations and support automation, but it should never become reckless. The solution was to build the product around policy modes, thresholds, audit logs, rollback, and human review for uncertain cases.
Another challenge was provider access. Some AI providers are blocked by Devvit's external request rules, so the AI layer targets providers that Devvit allows, such as Gemini or OpenAI-style endpoints called from the server, while the rest of the app stays independent of any single model.
Finally, it was challenging to demonstrate the product convincingly with only a small test subreddit. To solve that, RedDefence supports real live analysis while also providing a scenario replay path that shows larger incidents, such as duplicate swarms and heated threads, at realistic scale.
What I Am Proud Of
RedDefence AI is more than a toxicity demo. It is a complete moderation defence system. It combines AI classification, thread-level intelligence, duplicate swarm detection, risky user context, rule-answering, emergency response, summaries, audit logs, and rollback in one moderator-first product.
The most important design choice is that RedDefence does not ask moderators to blindly trust AI. It gives them evidence, control, and reversibility.
Tool Overview
RedDefence AI is a moderator-only Devvit app for detecting and responding to community risk before it spreads. It is intended to be installed by subreddit moderators and used from inside Reddit as a command center for live moderation.
Moderator-Only Access
The app is designed for moderators, not general users. Dashboard access is restricted so normal community members do not see internal risk scores, user profiles, audit logs, or moderation recommendations.
Defence Command Center
The dashboard gives moderators a quick operational view of the subreddit:
- Threat level: Stable, Heated, or Under Attack.
- Number of posts/comments analyzed.
- Risky users currently active.
- Duplicate swarms detected.
- Auto actions and moderator actions.
- Estimated moderator time saved.
- Current policy mode.
- Current AI/system status.
The purpose is to answer the moderator's first question: "Is the community okay right now?"
Live Risk Feed
The Live Risk Feed shows risky posts and comments with:
- Content type.
- Author.
- Thread/post context.
- Timestamp.
- Content excerpt.
- Risk score.
- Confidence.
- Category.
- Matched subreddit rule.
- Evidence summary.
- Recommended action.
- Available moderator actions.
Moderators can review, remove, warn, answer, mark safe, mark false positive, or roll back actions depending on the item state and policy.
LLM-First Toxicity Shield
Every new post/comment is analyzed through an LLM-first moderation pipeline. The model returns a structured decision:
- Risk score.
- Confidence.
- Category.
- Matched rule.
- Evidence.
- Recommended action.
- Safe warning text.
- Human review flag.
The system is designed to detect:
- Personal attacks.
- Harassment.
- Insults.
- Hate or severe abuse.
- Threatening language.
- Toxic escalation.
- Repeated aggressive behavior.
Automation is controlled by Policy Lab thresholds. RedDefence can run in Observe or Assist mode for review-only workflows, or in stricter modes once moderators have explicitly configured and verified the policy.
Duplicate Swarm Detector
RedDefence detects duplicate or near-duplicate behavior across recent comments:
- Exact copied comments.
- Normalized duplicates.
- Similar repeated phrases.
- Repeated links.
- Copypasta swarms.
- Multiple users posting similar text.
Clusters show sample text, cluster size, users involved, affected thread, risk score, and status. This helps moderators identify swarm behavior instead of treating every copied comment as an isolated incident.
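The simplest of the cases above, normalized duplicates posted by multiple users, can be sketched as follows. This is an illustrative sketch only: the `CommentRecord` and `SwarmCluster` shapes, the `normalize` rules, and the `minSize` and `minAuthors` parameters are assumptions, and near-duplicate matching would need a similarity measure that is not shown here.

```typescript
// Hypothetical record and cluster shapes, introduced for illustration.
interface CommentRecord {
  id: string;
  author: string;
  threadId: string;
  body: string;
}

interface SwarmCluster {
  key: string; // normalized text shared by the cluster
  sample: string; // first original comment body seen
  commentIds: string[];
  authors: string[]; // distinct authors
  threadIds: string[]; // distinct affected threads
}

// Lowercase, strip punctuation, collapse whitespace.
// Limitation: this ASCII-only rule discards non-Latin text entirely.
function normalize(body: string): string {
  return body
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

function pushUnique(list: string[], value: string): void {
  if (list.indexOf(value) === -1) list.push(value);
}

// Group comments by normalized text and keep groups that look like swarms.
function findSwarms(
  comments: CommentRecord[],
  minSize: number,
  minAuthors: number
): SwarmCluster[] {
  const byKey = new Map<string, SwarmCluster>();
  for (const c of comments) {
    const key = normalize(c.body);
    if (key === "") continue; // nothing left to compare (for example, emoji only)
    let cluster = byKey.get(key);
    if (!cluster) {
      cluster = { key: key, sample: c.body, commentIds: [], authors: [], threadIds: [] };
      byKey.set(key, cluster);
    }
    cluster.commentIds.push(c.id);
    pushUnique(cluster.authors, c.author);
    pushUnique(cluster.threadIds, c.threadId);
  }
  const swarms: SwarmCluster[] = [];
  byKey.forEach((cluster) => {
    if (cluster.commentIds.length >= minSize && cluster.authors.length >= minAuthors) {
      swarms.push(cluster);
    }
  });
  return swarms;
}
```

Requiring a minimum number of distinct authors is one way to separate a coordinated swarm from a single user repeating themselves, which is arguably a different moderation problem.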
Risky User Profiles
RedDefence creates subreddit-scoped user risk profiles. These are not permanent reputation scores and are not shared across Reddit. They are moderation context for the current subreddit only.
Profiles show:
- Risk score and risk level.
- Main reasons for risk.
- Toxic flags.
- Duplicate flags.
- Warnings.
- Removals.
- Ignored warnings.
- Positive signals.
- Suggested moderator action.
Positive signals are important. A user with a long history of normal, approved comments should be treated differently from a brand-new account that participates only in a heated thread. RedDefence surfaces both risk and context.
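One way to picture how positive signals could offset risk is the sketch below. The weights, the cap on positive credit, the level cutoffs, and the `UserSignals` shape are all invented for illustration; the source does not describe how RedDefence actually computes its score.

```typescript
// Hypothetical per-subreddit counters; names and weights are assumptions.
interface UserSignals {
  toxicFlags: number;
  duplicateFlags: number;
  warnings: number;
  removals: number;
  ignoredWarnings: number;
  approvedComments: number; // positive signal
}

type RiskLevel = "low" | "medium" | "high";

// Weighted sum of negative signals, reduced by a capped credit for good history.
function userRiskScore(s: UserSignals): number {
  const raw =
    s.toxicFlags * 15 +
    s.duplicateFlags * 10 +
    s.warnings * 5 +
    s.removals * 20 +
    s.ignoredWarnings * 15;
  // The cap keeps a long history from fully erasing recent serious behavior.
  const credit = Math.min(30, s.approvedComments * 0.5);
  return Math.max(0, Math.min(100, Math.round(raw - credit)));
}

function riskLevel(score: number): RiskLevel {
  if (score < 30) return "low";
  if (score < 60) return "medium";
  return "high";
}
```

With these illustrative weights, two users with identical flags land in different levels when one of them has a substantial history of approved comments, which is the behavior the paragraph above argues for.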
Thread War Room
The Thread War Room groups moderation risk by post/thread instead of showing only a flat queue. Each thread can show:
- Status: Stable, Heated, or Under Attack.
- Thread risk score.
- Toxicity trend.
- Duplicate clusters.
- Risky users active.
- New-user or high-activity spikes when available.
- Suggested action.
This is especially useful for live events, debates, breaking news, sports threads, and other high-velocity situations.
Rule Question Responder
RedDefence detects moderation-related questions such as:
- "What are the top rules of this community?"
- "Why was my post removed?"
- "Which flair should I use?"
- "Can I post links here?"
- "What rule did I break?"
- "Why did I get warned?"
When confidence is high and policy allows it, RedDefence can generate a helpful answer from subreddit rules/configuration. If confidence is lower, it drafts an answer for moderator review.
This reduces repetitive mod work while giving users clearer guidance.
Policy Lab and Shadow Mode
Policy Lab is where moderators configure RedDefence:
- Current mode: Observe, Assist, Warn, Autopilot, or Emergency Shield.
- Toxicity auto-remove threshold.
- Warning threshold.
- Duplicate auto-remove threshold.
- Rule question auto-reply threshold.
- Warning cooldown.
- Emergency Shield duration.
- Community preset.
- Tone: Strict, Balanced, or Relaxed.
Shadow preview shows what would happen under a policy before moderators fully trust automation. This helps teams tune thresholds without surprising the community.
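A minimal sketch of how modes, thresholds, and a shadow preview could fit together is shown below. The per-mode semantics, the `minConfidence` setting, and the action names are assumptions made for the example. Emergency Shield is left out here because it acts as a temporary override rather than a steady-state mode.

```typescript
// Hypothetical mode and action labels; semantics per mode are assumptions.
type Mode = "observe" | "assist" | "warn" | "autopilot";
type Action = "log_only" | "queue_for_review" | "warn" | "remove";

interface Policy {
  mode: Mode;
  warnThreshold: number; // risk score at which content gets attention
  removeThreshold: number; // risk score at which auto-removal is allowed
  minConfidence: number; // assumed extra setting: below this, a human decides
}

interface ScoredItem {
  riskScore: number;
  confidence: number;
}

// Map one scored item to the action this policy would allow.
function decideAction(policy: Policy, item: ScoredItem): Action {
  if (item.riskScore < policy.warnThreshold) return "log_only";
  if (policy.mode === "observe") return "log_only";
  if (policy.mode === "assist" || item.confidence < policy.minConfidence) {
    return "queue_for_review";
  }
  if (policy.mode === "autopilot" && item.riskScore >= policy.removeThreshold) {
    return "remove";
  }
  return "warn";
}

// Shadow preview: count what a candidate policy WOULD do, without doing it.
function shadowPreview(policy: Policy, items: ScoredItem[]): Record<Action, number> {
  const counts: Record<Action, number> = {
    log_only: 0,
    queue_for_review: 0,
    warn: 0,
    remove: 0,
  };
  for (const item of items) {
    counts[decideAction(policy, item)] += 1;
  }
  return counts;
}
```

Running `shadowPreview` over recently analyzed items with two candidate policies gives moderators a side-by-side count of warnings and removals before either policy goes live.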
Emergency Shield
Emergency Shield is a temporary strict mode for raids, live chaos, or extremely heated threads. When activated, RedDefence can:
- Lower warning thresholds.
- Increase duplicate sensitivity.
- Prioritize active heated threads.
- Create an incident timeline.
- Recommend or draft a civility reminder.
- Auto-expire after the configured duration.
Emergency Shield is designed as a temporary response, not a permanent punishment mode.
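The temporary, self-expiring nature of Emergency Shield can be sketched as a pure function over the base thresholds and the current time. The `ShieldState` shape, the threshold names, and the size of the adjustments are assumptions for illustration only.

```typescript
// Hypothetical threshold and shield shapes; adjustment sizes are arbitrary.
interface Thresholds {
  warnThreshold: number; // risk score at which a warning is considered
  duplicateMinSize: number; // comments needed before a cluster counts as a swarm
}

interface ShieldState {
  activatedAtMs: number; // epoch milliseconds
  durationMs: number; // configured Emergency Shield duration
}

// The shield expires on its own: no cleanup job is needed to turn it off.
function isShieldActive(shield: ShieldState | null, nowMs: number): boolean {
  return (
    shield !== null &&
    nowMs >= shield.activatedAtMs &&
    nowMs < shield.activatedAtMs + shield.durationMs
  );
}

// Return stricter thresholds while the shield is active, the base ones otherwise.
function effectiveThresholds(
  base: Thresholds,
  shield: ShieldState | null,
  nowMs: number
): Thresholds {
  if (!isShieldActive(shield, nowMs)) return base;
  return {
    warnThreshold: Math.max(0, base.warnThreshold - 15),
    duplicateMinSize: Math.max(2, base.duplicateMinSize - 1),
  };
}
```

Because the stricter values are derived on each call instead of being written over the saved policy, the moderators' configured thresholds are untouched once the duration elapses.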
Daily Defence Briefing
The Daily Defence Briefing summarizes what happened:
- Community status.
- Comments/posts analyzed.
- High-risk items.
- Warnings.
- Duplicate clusters.
- Rule questions.
- Estimated moderator time saved.
- Top issue.
- Recommended next action.
This gives moderators summary intelligence instead of raw noise.
Incident Timeline
The Incident Timeline shows how a situation developed:
- Thread created.
- Toxicity spike detected.
- Duplicate swarm detected.
- Risky users escalated.
- Actions taken.
- Thread stabilized.
This is useful for handoffs between moderators and for understanding whether the response worked.
Ask RedDefence
Ask RedDefence is a lightweight moderator assistant. Mods can ask:
- Why is threat level high?
- What happened in the last hour?
- Which users ignored warnings?
- Which rule is causing the most removals?
- What should I do now?
- Summarize duplicate swarms.
The assistant answers from stored summaries, audit logs, thread state, duplicate clusters, and user profiles.
Audit Log and Rollback
Every action is logged with:
- Timestamp.
- Actor.
- Content ID.
- Author.
- Action type.
- Risk score.
- Confidence.
- Category.
- Matched rule.
- Evidence.
- Status.
- Rollback reason when applicable.
Moderators can mark false positives, mark actions too harsh or too soft, and roll back reversible actions. The product is built around transparency and accountability.
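The logged fields and the rollback step could be modeled roughly as below. This is an in-memory sketch under stated assumptions: the `AuditEntry` field names mirror the list above but are not taken from the app's code, the `AuditLog` class is hypothetical, and the real app would persist entries in Devvit Redis and also reverse the action on Reddit, neither of which is shown.

```typescript
type AuditStatus = "applied" | "rolled_back";

// Hypothetical entry shape mirroring the logged fields listed above.
interface AuditEntry {
  id: number;
  timestamp: string; // ISO 8601
  actor: string; // moderator username or "reddefence" for automated actions
  contentId: string;
  author: string;
  actionType: string;
  riskScore: number;
  confidence: number;
  category: string;
  matchedRule: string | null;
  evidence: string;
  status: AuditStatus;
  rollbackReason?: string;
}

type NewAuditEntry = Omit<AuditEntry, "id" | "status" | "rollbackReason">;

class AuditLog {
  private entries: AuditEntry[] = [];

  // Every action is recorded before it counts as applied.
  record(fields: NewAuditEntry): AuditEntry {
    const entry: AuditEntry = { ...fields, id: this.entries.length + 1, status: "applied" };
    this.entries.push(entry);
    return entry;
  }

  // Mark an action as rolled back and keep the reason for later review.
  rollback(id: number, reason: string): AuditEntry {
    const entry = this.entries.filter((e) => e.id === id)[0];
    if (!entry) throw new Error("Unknown audit entry: " + id);
    if (entry.status === "rolled_back") throw new Error("Already rolled back: " + id);
    entry.status = "rolled_back";
    entry.rollbackReason = reason;
    return entry;
  }

  list(): AuditEntry[] {
    return this.entries.slice();
  }
}
```

Keeping the rolled-back entry in the log, rather than deleting it, preserves the record of what the system originally decided and why a moderator overrode it.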
Intended Moderator Workflow
- Install RedDefence AI on a subreddit.
- Configure AI provider and policy thresholds.
- Start in Observe or Assist mode.
- Watch live posts/comments flow into the command center.
- Review flagged content with evidence and confidence.
- Use thread view to focus on the most heated discussions.
- Detect duplicate swarms and risky user patterns.
- Let RedDefence draft or send rule answers when policy allows.
- Activate Emergency Shield only during active chaos.
- Use Audit Log and Rollback to correct mistakes.
Intended User Experience
Normal Reddit users do not interact with the dashboard. They may receive:
- A polite warning when their comment crosses the configured threshold.
- A helpful answer to a rule question.
- A removal reason when content clearly violates community policy.
The app avoids hidden punishment, permanent automatic bans, cross-subreddit tracking, and private identity inference.
Project Impact
1. Live Sports Communities
Example communities:
- r/soccer
- r/nba
- r/Cricket
Live sports threads can move extremely fast. A controversial referee call, trade rumor, or match result can create personal attacks, repeated slogans, duplicate spam, and heated pile-ons within minutes.
RedDefence helps by:
- Grouping risk by match thread.
- Detecting toxicity spikes.
- Detecting duplicate comment swarms.
- Highlighting users repeatedly escalating the same thread.
- Giving mods an Emergency Shield option.
- Summarizing the incident instead of forcing mods to read thousands of comments.
Impact: faster response during live chaos, less queue overload, fewer missed escalation patterns, and more consistent rule enforcement.
2. Support and Help Communities
Example communities:
- r/help
- r/AskModerators
- product/support subreddits
Support communities often receive repeated rule questions, removal questions, flair questions, and "why was this removed?" comments. Mods spend time answering similar questions again and again.
RedDefence helps by:
- Detecting rule/moderation questions.
- Drafting or sending clear rule-based answers.
- Logging every answer.
- Showing which rules create the most confusion.
- Reducing repetitive mod workload while improving user understanding.
Impact: users get faster guidance, moderators spend less time on repetitive explanations, and communities feel more transparent.
3. Career, Advice, and High-Volume Discussion Communities
Example communities:
- r/cscareerquestions
- r/jobs
- r/personalfinance
These communities often face high-volume posts, repeated questions, spam links, heated arguments, and exhausted volunteer moderators.
RedDefence helps by:
- Prioritizing the queue by actual risk.
- Detecting repeated/duplicate content patterns.
- Summarizing daily moderation trends.
- Showing user context with positive signals.
- Helping small mod teams act consistently without over-automating.
Impact: smaller teams can handle larger communities with better context, fewer repetitive actions, and clearer moderation decisions.
Why This Matters
RedDefence AI saves moderator time by converting a chaotic stream of posts and comments into prioritized, explainable, thread-aware decisions.
It improves community health by catching escalation early, especially before one heated thread turns into a subreddit-wide moderation incident.
It improves trust because every recommendation includes evidence, confidence, and the matched rule, and every logged action can be reviewed and rolled back. Moderators are not asked to trust a black box. They are given a decision support system they can verify.
What Makes It Unique
- It is not just a toxicity classifier.
- It is not just a spam filter.
- It is not just a dashboard.
- It is a complete AI-first moderation defence system.
RedDefence combines:
- Thread-level chaos detection.
- Duplicate swarm detection.
- Risky user profiles with positive signals.
- Rule question automation.
- Emergency response.
- Daily summaries.
- Audit logs and rollback.
- Moderator-configurable policy.
This gives it broad appeal across the Devvit ecosystem because almost every active subreddit deals with some mix of toxic escalation, repeated questions, duplicate spam, and queue overload.
Safety Philosophy
RedDefence AI follows these safety principles:
- No automatic permanent bans.
- No cross-subreddit user tracking.
- No private identity inference.
- No sensitive attribute inference.
- No hidden punishment.
- No irreversible actions.
- No action without audit logging.
- No automatic removal below configured thresholds.
- No repeated warning spam.
- Human review for uncertain cases.
Risk score is a moderation priority signal, not a judgment of a person.
All AI decisions are designed to be explainable and reversible.
Ported Project Fields
Original Bot Username
N/A - RedDefence AI is submitted as a new Devvit-native moderation tool, not a ported project.
Port Completion
N/A - This submission is for the Best New Mod Tool category. There is no original bot being replaced.
Optional Developer Platform Feedback
Devvit made it possible to build a native moderation product that feels integrated into Reddit rather than bolted on from the outside. The combination of Devvit Web, Redis, triggers, server routes, and app settings is powerful for this kind of moderator tool.
The biggest area for improvement is clarity around external HTTP access and AI providers. During development, some providers were blocked by Devvit domain restrictions. A clearly documented, easy-to-find list of allowed domains and recommended AI provider setup paths would save builders significant time.
It would also help to have:
- More official examples of moderator-only Devvit Web apps.
- More examples of server-side AI integrations.
- Clearer guidance for app settings/secrets in local, playtest, and uploaded environments.
- Better error messages when external HTTP requests are blocked.
- A recommended realtime pattern for webviews, since EventSource/auth behavior can be confusing.
- More examples of Redis schemas for production-style apps.
Overall, Devvit is a strong foundation for moderator tools. The platform would become even stronger with more guidance around production deployment, external services, and mod-only UX patterns.
Short Demo Script
- Open RedDefence AI from the installed subreddit.
- Show that the dashboard is moderator-only.
- Start in Assist/Observe mode and show current policy settings.
- Post a normal comment and show that it is not treated as a threat.
- Post a rule question such as "What are the top rules of this community?" and show the rule-question workflow.
- Post a clearly toxic comment and show the LLM decision: risk score, confidence, matched rule, evidence, and recommended action.
- Post repeated similar comments from multiple accounts and show the Duplicate Swarm Detector.
- Open the Thread War Room to show risk grouped by post/thread.
- Open a Risky User Profile and show both risk reasons and positive signals.
- Activate Emergency Shield and show the stricter temporary state.
- Open the Incident Timeline to show what happened in order.
- Open the Audit Log and demonstrate rollback/false-positive marking.
- Ask RedDefence: "What should moderators do now?"
- End on the Daily Defence Briefing summary.
