Inspiration
Reddit communities generate massive amounts of content every minute, but moderators still rely heavily on reactive workflows like reports and manual queue review. During major spam waves, harassment incidents, or coordinated toxic behavior, the problem is usually sheer volume, not a lack of moderation tools.
We were inspired by real moderation failures and large-scale Reddit controversies involving harassment, coordinated abuse, spam campaigns, and harmful communities. Many existing systems focus on detecting individual pieces of bad content, but fewer tools help moderators prioritize risk across users, threads, and sudden activity spikes in real time.
ThreadSentry was built to act as a proactive moderation intelligence layer on top of Reddit’s existing workflows rather than replacing them.
What it does
ThreadSentry is an AI-powered moderation platform built with Devvit for Reddit communities.
The app analyzes live posts and comments as they are submitted using a multi-layer moderation pipeline:
- Rule-based blacklist and spam detection
- Hugging Face toxicity classification
- Gemini explanations for uncertain cases
- User risk tracking
- Coordinated activity detection
- Activity spike monitoring
High-risk content is automatically prioritized in a moderation dashboard where moderators can:
- Review flagged posts and comments
- View toxicity scores and explanations
- Detect repeat offenders
- Track suspicious spikes in subreddit activity
- Approve, remove, or ignore content
- Maintain a moderation audit trail
The system also supports comment analysis, real Reddit moderation actions, and thread-level coordinated activity alerts.
How we built it
ThreadSentry was built using the Reddit Devvit platform with a custom moderation dashboard and backend analysis pipeline.
Core architecture
- Devvit Triggers for real-time post/comment ingestion
- Custom Devvit UI for the moderation dashboard
- TypeScript + React for frontend and backend logic
- Hono API routes for moderation endpoints
- Devvit KV + Redis for persistent moderation intelligence
- Hugging Face Toxic-BERT for toxicity scoring
- Google Gemini API for contextual moderator explanations
Moderation pipeline
1. New posts/comments trigger automatic analysis
2. Blacklist/spam rules run first
3. Toxicity scoring runs through Hugging Face
4. Borderline cases generate Gemini explanations
5. High-risk items are stored and prioritized
6. Dashboard displays moderation intelligence in real time
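The staged order of the moderation pipeline can be illustrated with a minimal sketch. Everything named here (`analyze`, `PipelineDeps`, the two thresholds) is hypothetical and not taken from the ThreadSentry codebase. The model calls are injected as plain synchronous functions so the example stays self-contained; in a real app they would be asynchronous network requests.

```typescript
// Hypothetical types and thresholds, for illustration only.
type Verdict =
  | { stage: "rules"; risk: "high"; reason: string }
  | { stage: "toxicity"; risk: "high" | "borderline" | "low"; score: number; explanation?: string };

interface PipelineDeps {
  blacklist: string[];                      // lowercase terms that flag an item outright
  scoreToxicity: (text: string) => number;  // stand-in for the toxicity model, returns 0..1
  explain: (text: string) => string;        // stand-in for the explanation model
}

const HIGH_RISK = 0.8;   // assumed cutoff for "high risk"
const BORDERLINE = 0.5;  // assumed lower bound of the uncertain range

function analyze(text: string, deps: PipelineDeps): Verdict {
  const lower = text.toLowerCase();

  // Stage 1: cheap rules run first and short-circuit the model calls.
  const hit = deps.blacklist.find((term) => lower.includes(term));
  if (hit !== undefined) {
    return { stage: "rules", risk: "high", reason: `blacklisted term: ${hit}` };
  }

  // Stage 2: toxicity scoring.
  const score = deps.scoreToxicity(text);
  if (score >= HIGH_RISK) {
    return { stage: "toxicity", risk: "high", score };
  }

  // Stage 3: only borderline scores pay for an explanation.
  if (score >= BORDERLINE) {
    return { stage: "toxicity", risk: "borderline", score, explanation: deps.explain(text) };
  }
  return { stage: "toxicity", risk: "low", score };
}
```

Ordering the stages from cheapest to most expensive means that a rule hit never triggers a model call, and the explanation step runs only for the uncertain middle range.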
We also implemented:
- Cross-community risk aggregation within app installs
- Coordinated-thread detection
- Spike-event monitoring
- Audit logging
- Real Reddit approve/remove actions
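As an illustration of spike-event monitoring, here is one simple way to compare recent activity against a rolling baseline. This is a sketch under assumptions, not the detection logic ThreadSentry ships: `detectSpike`, `SpikeOptions`, and all parameter values are hypothetical.

```typescript
interface SpikeOptions {
  windowMs: number;        // length of the "current" window
  baselineWindows: number; // how many earlier windows form the baseline
  factor: number;          // how many times the baseline counts as a spike
  minCount: number;        // ignore tiny bursts in quiet communities
}

// Count timestamps in the half-open interval (startExclusive, endInclusive].
function countBetween(timestampsMs: number[], startExclusive: number, endInclusive: number): number {
  return timestampsMs.filter((t) => t > startExclusive && t <= endInclusive).length;
}

function detectSpike(
  timestampsMs: number[],
  nowMs: number,
  opts: SpikeOptions,
): { spike: boolean; current: number; baselineAvg: number } {
  const windowStart = nowMs - opts.windowMs;
  const baselineStart = windowStart - opts.windowMs * opts.baselineWindows;

  const current = countBetween(timestampsMs, windowStart, nowMs);
  const baselineAvg =
    countBetween(timestampsMs, baselineStart, windowStart) / opts.baselineWindows;

  // Floor the baseline at 1 so an empty history does not make every event a spike.
  const spike = current >= opts.minCount && current > opts.factor * Math.max(baselineAvg, 1);
  return { spike, current, baselineAvg };
}
```

The `minCount` floor is a design choice for small subreddits, where going from one post to three is a large ratio but rarely worth an alert.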
Challenges we ran into
One of the biggest challenges was balancing moderation automation with moderator trust. We did not want ThreadSentry to blindly auto-remove content based only on AI scores, since moderation decisions often require context.
Another challenge was integrating real-time moderation flows into Reddit’s Devvit environment while maintaining fast response times and persistent moderation state across posts, comments, and users.
Handling inconsistent AI model responses and fallback logic was also difficult. We had to harden the toxicity-analysis pipeline against malformed responses, slow model loading, and uncertain classification ranges.
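A defensive parser along these lines is one way to handle such responses. The sketch assumes the classifier returns a list of `{ label, score }` entries (optionally nested one level) and signals a still-loading model with an object containing `error` and `estimated_time`. Those shapes, the `"toxic"` label, and the names `parseToxicityResponse` and `ToxicityResult` are assumptions for illustration, not confirmed details of the project.

```typescript
type ToxicityResult =
  | { kind: "ok"; score: number }
  | { kind: "retry"; waitSeconds: number } // model still loading, try again later
  | { kind: "invalid" };                   // anything we cannot trust

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function parseToxicityResponse(raw: unknown, label: string = "toxic"): ToxicityResult {
  // Error payload: retry only if it carries a usable wait estimate.
  if (isRecord(raw) && typeof raw["error"] === "string") {
    const wait = raw["estimated_time"];
    return typeof wait === "number" && Number.isFinite(wait)
      ? { kind: "retry", waitSeconds: wait }
      : { kind: "invalid" };
  }

  if (!Array.isArray(raw)) return { kind: "invalid" };

  // Accept both [[{...}]] and [{...}] layouts.
  const entries: unknown[] = Array.isArray(raw[0]) ? raw[0] : raw;
  for (const entry of entries) {
    if (!isRecord(entry) || entry["label"] !== label) continue;
    const score = entry["score"];
    // Reject non-numeric, NaN, and out-of-range scores.
    if (typeof score === "number" && score >= 0 && score <= 1) {
      return { kind: "ok", score };
    }
  }
  return { kind: "invalid" };
}
```

Returning a tagged result instead of throwing lets the caller decide what each case means, for example falling back to rule-based results when the response is `invalid`.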
Designing coordinated-activity detection in a hackathon timeframe was another major challenge. Instead of attempting a fully global Reddit-wide graph system, we focused on building an extensible moderation intelligence architecture that works reliably within installed communities.
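To make the community-local approach concrete, the sketch below flags a thread when several distinct authors post flagged comments within a short time window. It is a simplified heuristic offered for illustration, not the project's detection method; `FlaggedComment`, `findCoordinatedThreads`, and the parameters are hypothetical.

```typescript
interface FlaggedComment {
  threadId: string;
  author: string;
  createdMs: number;
}

// Returns the IDs of threads where at least `minAuthors` distinct authors
// posted flagged comments within any span of `windowMs`.
function findCoordinatedThreads(
  comments: FlaggedComment[],
  windowMs: number,
  minAuthors: number,
): string[] {
  // Group flagged comments by thread.
  const byThread = new Map<string, FlaggedComment[]>();
  for (const c of comments) {
    const list = byThread.get(c.threadId) ?? [];
    list.push(c);
    byThread.set(c.threadId, list);
  }

  const alerts: string[] = [];
  byThread.forEach((list, threadId) => {
    list.sort((a, b) => a.createdMs - b.createdMs);
    const authorCounts = new Map<string, number>();
    let start = 0;

    // Sliding window over time-ordered comments.
    for (let end = 0; end < list.length; end++) {
      const incoming = list[end].author;
      authorCounts.set(incoming, (authorCounts.get(incoming) ?? 0) + 1);

      // Drop comments that fell out of the window.
      while (list[end].createdMs - list[start].createdMs > windowMs) {
        const outgoing = list[start].author;
        const remaining = (authorCounts.get(outgoing) ?? 1) - 1;
        if (remaining === 0) authorCounts.delete(outgoing);
        else authorCounts.set(outgoing, remaining);
        start++;
      }

      if (authorCounts.size >= minAuthors) {
        alerts.push(threadId);
        return; // one alert per thread is enough
      }
    }
  });
  return alerts;
}
```

Counting distinct authors rather than raw comments keeps a single repeat offender from being mistaken for a coordinated group; that case is better handled by per-user risk tracking.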
Accomplishments that we're proud of
What we learned
We learned how to build scalable moderation workflows on top of Reddit using Devvit, triggers, and custom moderation interfaces.
We also gained hands-on experience with:
- Real-time event-driven application design
- AI moderation pipeline orchestration
- Risk aggregation across user activity
- Building moderation dashboards that stay usable under high-volume conditions
- Coordinating frontend and backend moderation systems inside Reddit apps
Most importantly, we learned that effective moderation is not just about detecting harmful content; it is about prioritizing moderator attention efficiently.
What's next for ThreadSentry
Future improvements include:
- Full Reddit-wide shared moderation intelligence
- Stronger coordinated-behavior graph analysis
- Historical analytics and trend visualization
- Advanced comment-thread analysis
- Improved spam and illegal-content detection
- Moderator collaboration tooling
- Real-time dashboard updates
- Expanded machine learning models for behavioral risk prediction