🛡️ Guardian AI — Reddit-Native AI Moderator Copilot

🔗 App Listing

App Portal URL: guardian-app on Reddit Apps

👥 Reddit Usernames

u/TeaSeparate589

🛠️ Tool Overview

Guardian AI is an intelligent, Reddit-native moderation assistant that runs directly in your subreddit via Devvit Blocks and connects to a FastAPI AI backend powered by Google Gemini. It acts as an inline copilot for human moderators, helping them prioritize tasks, view explanations of flagged content, and perform moderation actions without leaving the Reddit app.

Key Capabilities

Explainable Toxicity Detection: Evaluates submissions and comments against hate speech, harassment, threats, and abuse using gemini-1.5-flash. The dashboard displays a confidence percentage and a natural language explanation of why the content was flagged.
Thread Escalation Detection (Killer Feature): When a comment is checked, Guardian scans the surrounding comment thread history. If toxicity is rising rapidly (e.g. mutual bickering or a flame war), it increases the priority level so moderators can lock or moderate the thread before it disrupts the community.
Embedding-Based Repost/Duplicate Detection: Uses Gemini text-embedding-004 to generate vector representations of all posts. When a new post is submitted, it performs a cosine similarity search against posts in a 30-day sliding window, flagging duplicate threads and copy-paste spam.
Smart Spam Heuristics: Leverages regex patterns and lexical diversity indicators to detect scam links, bot-like formatting abnormalities, and promotional patterns (e.g., giveaway or crypto scams).
Interactive Reddit-Native Mod Feed: Features an embedded queue view with cards showing why items were flagged, the AI copilot's recommendation, and buttons to ✅ Approve, ❌ Remove, or Quiet Ignore. Non-moderator users are presented with a secure "active protection" screen.
Human-in-the-Loop Feedback: Logs moderator actions and accuracy ratings (👍/👎) to continuously improve threshold accuracy.
Community Health Insights: Displays real-time charts indicating toxicity, spam, duplicate, and escalation ratios.
Connection Diagnostics: A status badge (🟢 ONLINE or ● REDIS FALLBACK) shows whether the app is connected to the FastAPI backend or has fallen back to local Devvit Redis storage because outbound requests are blocked.
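
The duplicate check described above (cosine similarity against posts in a 30-day window) could look roughly like the sketch below. It is only an illustration under stated assumptions: the embeddings are plain lists of floats standing in for text-embedding-004 vectors, the 0.9 threshold is a guess, and the names StoredPost and find_duplicates are hypothetical rather than taken from the Guardian codebase.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=30)      # sliding window named in the capability list
SIMILARITY_THRESHOLD = 0.9       # assumed cutoff, not a value from the project


@dataclass
class StoredPost:
    """Hypothetical record for a post whose embedding was stored earlier."""
    post_id: str
    created_at: datetime
    embedding: list


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an empty/zero embedding as matching nothing
    return dot / (norm_a * norm_b)


def find_duplicates(new_embedding, now, stored, threshold=SIMILARITY_THRESHOLD):
    """Return (post_id, score) pairs inside the window, best match first."""
    cutoff = now - WINDOW
    matches = []
    for post in stored:
        if post.created_at < cutoff:
            continue  # older than 30 days, outside the sliding window
        score = cosine_similarity(new_embedding, post.embedding)
        if score >= threshold:
            matches.append((post.post_id, score))
    return sorted(matches, key=lambda m: m[1], reverse=True)


if __name__ == "__main__":
    now = datetime(2026, 5, 26, tzinfo=timezone.utc)
    stored = [
        StoredPost("recent_dup", now - timedelta(days=2), [1.0, 0.0, 0.0]),
        StoredPost("old_dup", now - timedelta(days=45), [1.0, 0.0, 0.0]),
        StoredPost("recent_other", now - timedelta(days=1), [0.0, 1.0, 0.0]),
    ]
    print(find_duplicates([0.9, 0.1, 0.0], now, stored))
```

A linear scan like this is fine for a small subreddit; a larger deployment would likely want a vector index instead, which this sketch does not attempt.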

📈 Project Impact

Target Communities

r/AskReddit (and large text-heavy communities): High comment volumes make it impractical to manually catch toxic escalations. Guardian saves moderators hours by filtering obvious violations and highlighting escalating threads before they get out of hand.
r/CryptoCurrency (and finance/giveaway-heavy subreddits): High frequency of bot-driven spam and copy-paste duplicate threads. Guardian's embedding-based similarity matching and spam heuristics instantly flag duplicates and giveaway posts.
r/politics (or debate-heavy communities): Susceptible to quick escalations (flame wars). Guardian's thread history scanner alerts moderators to heated arguments, preserving civil discourse.
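
As a rough illustration of how a thread history scanner might decide that a discussion is heating up, the sketch below compares the average toxicity of the later half of a thread with the earlier half and bumps the priority when the rise is large. Everything here is an assumption made for the example: the per-comment scores are taken as already computed, and the priority labels, the 0.2 rise threshold, the four-comment minimum, and the function names are hypothetical.

```python
from statistics import mean

PRIORITIES = ["low", "medium", "high"]  # assumed labels, lowest to highest
RISE_THRESHOLD = 0.2                    # assumed minimum rise that counts as escalation
MIN_COMMENTS = 4                        # assumed minimum history before judging a trend


def toxicity_rise(scores):
    """Mean toxicity of the later half of the thread minus the earlier half.

    `scores` are per-comment toxicity values in [0, 1], oldest first.
    """
    if len(scores) < MIN_COMMENTS:
        return 0.0  # too little history to call it a trend
    mid = len(scores) // 2
    return mean(scores[mid:]) - mean(scores[:mid])


def escalate_priority(base_priority, thread_scores, threshold=RISE_THRESHOLD):
    """Raise the priority by one level when toxicity is rising fast enough."""
    index = PRIORITIES.index(base_priority)
    if toxicity_rise(thread_scores) >= threshold:
        index = min(index + 1, len(PRIORITIES) - 1)  # cap at the top level
    return PRIORITIES[index]


if __name__ == "__main__":
    heated = [0.1, 0.15, 0.5, 0.7]   # scores climbing over the thread
    calm = [0.3, 0.3, 0.3, 0.3]      # flat scores, no escalation
    print(escalate_priority("low", heated))
    print(escalate_priority("medium", calm))
```

Comparing halves keeps the signal about the trend rather than the absolute level, so a thread that has been uniformly tense from the start is not escalated again on every new comment.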

Moderator Benefits & Time Savings

Reduced Context Switching: Queue sorting and moderation actions happen inline in the subreddit feed through a custom post.
Proactive Mitigation: Finding flame-war escalations early saves hours of clean-up and prevents community disruption.
Explainable Actions: Natural language AI explanations speed up review times, especially for newer mod team members.

🔄 Ported Projects

This project is submitted to the New Mod Tool Category and is built from scratch for the hackathon.

Original Bot username: N/A
Port Completion: N/A

💬 Developer Platform Feedback

What We Loved: Local Redis storage (context.redis) and event triggers (PostSubmit, CommentSubmit) are incredibly straightforward and powerful. The compilation speeds and deployment of custom post types via Blocks are exceptionally fast.
Areas of Improvement:
- Domain Allowlisting constraints: Having to submit custom HTTP domains for admin approval even during local playtesting makes it hard to build and iterate on apps with external backends. An automated sandboxed tunnel (like localtunnel/ngrok approval for playtest-only builds) would improve DX.
- Component Styling: Button appearances are restricted (e.g., "neutral" causes compilation errors), and the lack of generic CSS styling and canvas sizing limits high-fidelity dashboard designs.

Built With

devvit
fastapi
gemini
huggingface
python

Updates

kshitij kumrawat started this project — May 26, 2026 06:12 AM EDT
