About SOS Moderators
Inspiration
Most Reddit moderation tools sit at one of two extremes: no automation at all, or blunt keyword rules with no explanation and no recourse for legitimate users. Watching a subreddit I cared about get flooded during a raid made the gap obvious. Mods spent two hours manually removing posts one by one with no early warning system. The signals for a coordinated spike are detectable. What was missing was a system watching for them.
Beyond enforcement, I noticed almost nothing existed to help mods understand their community: who their trusted contributors were, what topics their members actually cared about, whether their thresholds were calibrated correctly. SOS Moderators was built to do both: enforce automatically where it can, and surface insight where it can't.
Building on Devvit (Reddit's official developer platform) was a natural choice. Devvit apps run inside Reddit itself with no external server, no OAuth setup, and no infrastructure to maintain. Everything is scoped to a single subreddit and accessible through a custom post that lives in the sub.
How I Built It
Stack: TypeScript 5.4 on Devvit 0.12.23. Single npm dependency: @devvit/public-api. Redis for all persistence. Fully typed end-to-end.
Data model first. Before writing a single trigger, I designed src/redis/schema.ts, a single source of truth for every Redis key in the system. No raw string keys exist anywhere else. The schema covers user trust records, post spam scores, daily and hourly counters, sorted-set leaderboards, config, audit log, flair proposals, shadow audit state, Coach context, and rate limit queues.
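A minimal sketch of what a centralized key schema along these lines could look like. The `sos` prefix, the key names, and the `Keys` object are illustrative assumptions, not the actual contents of src/redis/schema.ts.

```typescript
// Hypothetical key layout; the real src/redis/schema.ts may differ.
const PREFIX = "sos";

const Keys = {
  // Per-user trust record, keyed by lowercased username.
  userTrust: (username: string) => `${PREFIX}:trust:${username.toLowerCase()}`,
  // Spam score stored per post.
  postSpamScore: (postId: string) => `${PREFIX}:spam:${postId}`,
  // Daily counter bucket, e.g. "2024-03-05" (UTC).
  dailyCounter: (date: Date) =>
    `${PREFIX}:count:daily:${date.toISOString().slice(0, 10)}`,
  // Hourly counter bucket, e.g. "2024-03-05T14" (UTC).
  hourlyCounter: (date: Date) =>
    `${PREFIX}:count:hourly:${date.toISOString().slice(0, 13)}`,
  // Singleton keys are functions too, so every call site looks the same.
  weeklyLeaderboard: () => `${PREFIX}:leaderboard:weekly`,
  auditLog: () => `${PREFIX}:audit`,
  config: () => `${PREFIX}:config`,
} as const;
```

Routing every read and write through builders like these means a renamed key is a one-line change, and a typo in a key name becomes a compile error instead of a silent cache miss.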
Spam detection pipeline. Every post runs through src/triggers/postSubmit.ts: nine additive signal checks (duplicate title, duplicate body, banned domain, banned keyword, URL in title, high URL density, all-caps, excessive punctuation, very new account), summed to a score between 0 and 1, then discounted by the author's trust tier before comparing against configurable flag and auto-remove thresholds.
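The scoring step can be sketched roughly as follows. The signal weights, the tier names, and the discount multipliers are placeholder assumptions for illustration; only the overall shape (sum, clamp to 0-1, discount by trust, compare to two thresholds) comes from the description above.

```typescript
type TrustTier = "new" | "regular" | "trusted";
type Verdict = "pass" | "flag" | "remove";

interface Signal {
  name: string;
  weight: number; // contribution added to the score when the signal fires
}

// Hypothetical discounts: higher trust shrinks the effective score.
const TRUST_DISCOUNT: Record<TrustTier, number> = {
  new: 1.0,
  regular: 0.75,
  trusted: 0.5,
};

function scorePost(
  firedSignals: Signal[],
  tier: TrustTier,
  flagAt: number,
  removeAt: number,
): { score: number; verdict: Verdict } {
  // Additive: each fired signal contributes its weight.
  const raw = firedSignals.reduce((sum, s) => sum + s.weight, 0);
  // Keep the summed score inside the 0-1 range.
  const clamped = Math.min(1, Math.max(0, raw));
  // Trust acts as a multiplier, not a gate.
  const score = clamped * TRUST_DISCOUNT[tier];
  const verdict: Verdict =
    score >= removeAt ? "remove" : score >= flagAt ? "flag" : "pass";
  return { score, verdict };
}
```

With these placeholder numbers, the same two signals that remove a brand-new account's post only flag a regular's post and let a trusted contributor's post through.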
Trust scoring. src/moderation/trustScore.ts maintains a 0-1000 score per user built from account age, subreddit karma (logarithmic), approval rate, positive signals, and penalties. Trust feeds into spam scoring as a multiplier, not a gate, and determines report weighting. A weekly decay job in src/jobs/trustDecay.ts applies 5% decay to users inactive 90+ days, with a floor of 300.
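A sketch of the decay rule, using the numbers stated above (5%, 90 days, floor of 300). The `TrustRecord` shape and the choice to leave scores already at or below the floor untouched are assumptions; the real src/jobs/trustDecay.ts may handle those cases differently.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
const INACTIVE_DAYS = 90;
const DECAY_RATE = 0.05;
const DECAY_FLOOR = 300;

// Hypothetical record shape for illustration.
interface TrustRecord {
  score: number; // 0-1000
  lastActiveMs: number; // epoch milliseconds of the user's last activity
}

function applyWeeklyDecay(record: TrustRecord, nowMs: number): number {
  const inactiveDays = (nowMs - record.lastActiveMs) / DAY_MS;
  // Active users are never decayed.
  if (inactiveDays < INACTIVE_DAYS) return record.score;
  // Assumption: decay never raises a score that is already at or below the floor.
  if (record.score <= DECAY_FLOOR) return record.score;
  // Apply 5% decay, but never drop below the floor.
  return Math.max(DECAY_FLOOR, Math.round(record.score * (1 - DECAY_RATE)));
}
```

The floor means a long-absent contributor returns as roughly neutral rather than being treated like a brand-new account.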
AI layer. Three subsystems, all heuristic-first:
- src/ai/heuristic.ts: flair classifier across 15 content categories using weighted keywords, bigrams, negation patterns, and density bonuses. Runs with zero API calls.
- src/ai/coachHeuristics.ts + src/ai/coachContext.ts: 12 intent handlers that answer mod questions using live Redis data (trust records, audit log, daily stats, config). No API key required.
- src/ai/llm.ts: thin LLM wrapper with a 6-second timeout and automatic heuristic fallback. Purely additive; core features work without it.
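The timeout-plus-fallback pattern behind the LLM wrapper can be sketched like this. `withFallback`, `llmCall`, and `heuristic` are hypothetical names; the actual src/ai/llm.ts is not shown here and may be structured differently.

```typescript
type Answer = { text: string; source: "llm" | "heuristic" };

async function withFallback(
  llmCall: () => Promise<string>, // hypothetical: whatever performs the LLM request
  heuristic: () => string, // hypothetical: the zero-API heuristic answer
  timeoutMs = 6000, // 6-second budget, as described above
): Promise<Answer> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("LLM timeout")), timeoutMs);
  });
  try {
    // Whichever settles first wins: the LLM response or the timeout.
    const text = await Promise.race([llmCall(), timeout]);
    return { text, source: "llm" };
  } catch {
    // Timeout, network error, or stubbed failure: fall back to the heuristic.
    return { text: heuristic(), source: "heuristic" };
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying request. Because every failure mode ends in the same heuristic branch, callers never need to know whether an LLM was available.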
Scheduled jobs. Nine background jobs registered in src/main.ts:
- aggregate_builder: topic extraction + stemming + rolling stats, every 15 min
- spike_detector: post volume vs. 4-week baseline, every 15 min
- flair_vote_tallier: hourly
- trust_decay: weekly
- weekly_highlight: Sunday 20:00 UTC, creates and stickies the top contributor shoutout post
- raid_mode_expiry: every 6 hours
- redis_monitor: Redis usage alerts at 80% and 90%, every 6 hours
- coach_context_warmup: pre-caches Coach context on startup
- retry_queue_drain: processes queued moderation actions when the rate limit clears
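As an illustration of the spike check, here is one simple way to compare current volume against a 4-week baseline. The multiplier, the minimum post count, and the use of a plain mean are all assumptions; the actual spike_detector logic is not described beyond "post volume vs. 4-week baseline".

```typescript
// current: posts seen in the current window.
// baseline: post counts for the same window in each of the previous 4 weeks.
function isSpike(
  current: number,
  baseline: number[],
  multiplier = 3, // hypothetical: how far above baseline counts as a spike
  minPosts = 10, // hypothetical: ignore tiny absolute volumes
): boolean {
  if (baseline.length === 0) return false; // no history yet, nothing to compare
  const mean = baseline.reduce((a, b) => a + b, 0) / baseline.length;
  return current >= minPosts && current > mean * multiplier;
}
```

The minimum-volume guard matters for quiet subreddits, where going from one post to five is a large ratio but not a raid.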
UI surfaces. Four custom posts, all self-recreating if deleted. The Dashboard and Config UI are moderator-only; the other two are visible to the whole community:
- Dashboard: 5 tabs (Health, Topics, Leaderboard, Posts with Pending Review + Case File, Coach)
- Config UI: 4 tabs (Presets, Features, Thresholds, Danger)
- Community Leaderboard: 2 tabs (This Week by contribution points, All-Time Trust); viewable by non-mods
- Flair Vote post: created per vote, shows live results; viewable by non-mods
Post-level mod menu actions (Approve & Notify Author, Remove & Notify Author, Generate Appeal Summary) let moderators act and notify in a single click without opening the dashboard.
Modmail commands. src/triggers/modmail.ts parses incoming modmail for !recheck, !raid, !shadow-activate, !keyword add/remove/list, and !vote config, dispatching each to its handler after checking mod authorization where required.
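A rough sketch of the parse-then-authorize step. The `ParsedCommand` shape and the split between mod-only and user-facing commands are assumptions made for illustration; the source only says authorization is checked "where required".

```typescript
interface ParsedCommand {
  name: string; // e.g. "recheck", "keyword"
  args: string[]; // e.g. ["add", "casino"]
}

// Assumption: !recheck is open to post authors, everything else is mod-only.
const MOD_ONLY = new Set(["raid", "shadow-activate", "keyword", "vote"]);

function parseCommand(body: string): ParsedCommand | null {
  // Use the first line that starts with "!"; ignore surrounding prose.
  const line = body
    .split("\n")
    .map((l) => l.trim())
    .find((l) => l.startsWith("!"));
  if (!line) return null;
  const [head, ...args] = line.slice(1).split(/\s+/);
  if (!head) return null; // a bare "!" is not a command
  return { name: head.toLowerCase(), args };
}

function isAuthorized(cmd: ParsedCommand, senderIsMod: boolean): boolean {
  return senderIsMod || !MOD_ONLY.has(cmd.name);
}
```

For example, `parseCommand("!keyword add casino")` yields the name `keyword` with args `["add", "casino"]`, which a dispatcher could then route to the keyword handler after `isAuthorized` passes.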
What I Learned
Redis is a real database. Sorted sets for leaderboards, TTL-based expiry as a first-class feature, rolling aggregates instead of full history. The 500 MB cap forced discipline that made the system cleaner.
Platform constraints improve design. No inline text input meant Coach had to use quick-answer chips for common queries, which turned out to be a better UX than a text field anyway. Fixed-height blocks forced information density decisions that made the UI more deliberate.
Heuristics beat LLMs for structured tasks. The flair detector and Coach answer the vast majority of cases cleanly without any external model. LLM integration became enhancement, not foundation. That's the right place for it.
Transparency is non-negotiable in moderation. Early builds had more features and less logging. Every time I couldn't answer "why did that post get removed?" the system felt untrustworthy. The audit log, signal-by-signal PMs, and Case File all came from that lesson.
Recovery paths keep users. Automated moderation without a recovery path alienates legitimate users. The !recheck loop (flag, explain, fix, recheck, auto-approve) was an afterthought that became one of the most important features.
Challenges
Hook ordering bug. Devvit's useState hooks are positional. Calling component functions conditionally inside JSX corrupts hook state silently. Coach would stop responding with no error. Fix: always call all component functions unconditionally before the return, store output in variables, use variables in JSX.
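The failure mode is easier to see in a toy model. The runtime below is not Devvit's implementation; it is a deliberately tiny stand-in (with hypothetical names like `useToyState` and `render`) that only captures the one property that matters here: state is looked up by call order.

```typescript
// Toy positional-hook runtime: state lives in slots indexed by call order.
const slots: unknown[] = [];
let cursor = 0;

function useToyState<T>(initial: T): T {
  const index = cursor++;
  if (index >= slots.length) slots.push(initial); // first use fills the slot
  return slots[index] as T;
}

function render(component: () => string): string {
  cursor = 0; // every render starts reading from slot 0
  return component();
}

function resetSlots(): void {
  slots.length = 0;
}

type Tab = "health" | "coach";

function HealthPanel(): string {
  return `Health: ${useToyState("ok")}`;
}

function CoachPanel(): string {
  return `Coach: ${useToyState("ready")}`;
}

// Buggy: which hook runs first depends on the tab, so slot 0 changes meaning.
function buggyDashboard(tab: Tab): string {
  return tab === "health" ? HealthPanel() : CoachPanel();
}

// Fixed: call every component unconditionally, then pick the output.
function fixedDashboard(tab: Tab): string {
  const health = HealthPanel();
  const coach = CoachPanel();
  return tab === "health" ? health : coach;
}
```

In this model, rendering `buggyDashboard("health")` and then `buggyDashboard("coach")` makes the Coach panel read the Health panel's slot, with no error raised. `fixedDashboard` keeps each panel on a stable slot regardless of which tab is shown.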
No inline text input. Every free-text interaction (Coach questions, keyword management, threshold tuning) requires a form modal instead of an inline field. Worked around with chip-based quick answers covering the most common Coach queries.
Selective topic pruning. Rebuilding the topic sorted set from scratch every 15 minutes destroyed accumulated counts. Solution: read all members, stem each, merge scores by canonical form, only rebuild if at least one invalid entry was found.
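The merge-then-conditionally-rebuild idea, sketched over plain arrays instead of a live sorted set. The stemmer here is a deliberately naive stand-in, and treating "invalid" as "not already in canonical stemmed form" is my reading of the description, not a confirmed detail.

```typescript
interface Member {
  member: string;
  score: number;
}

// Naive stand-in stemmer for illustration only.
function stem(word: string): string {
  const w = word.toLowerCase();
  if (w.length > 5 && w.endsWith("ing")) return w.slice(0, -3);
  if (w.length > 3 && w.endsWith("s")) return w.slice(0, -1);
  return w;
}

function pruneTopics(members: Member[]): { rebuild: boolean; topics: Member[] } {
  const merged = new Map<string, number>();
  let rebuild = false;
  for (const { member, score } of members) {
    const canonical = stem(member);
    // Any entry that is not already canonical means the set needs a rebuild.
    if (canonical !== member) rebuild = true;
    // Merge scores under the canonical form so counts are preserved.
    merged.set(canonical, (merged.get(canonical) ?? 0) + score);
  }
  if (!rebuild) return { rebuild, topics: members }; // leave the set untouched
  const topics = Array.from(merged.entries()).map(([member, score]) => ({
    member,
    score,
  }));
  return { rebuild, topics };
}
```

When every member is already canonical the function reports `rebuild: false`, so the common case costs one read and no writes, and accumulated counts survive.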
Modmail race conditions. Reddit's event system can retry handlers, causing the same !recheck to fire twice. A 60-second TTL deduplication key per event ID on every trigger catches and drops duplicates.
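A sketch of the deduplication check, with an in-memory `TtlStore` class standing in for Redis so the example is self-contained. The class, the `dedup:` key prefix, and `shouldHandle` are hypothetical; in Redis terms this corresponds to a set-if-not-exists write with an expiry.

```typescript
// In-memory stand-in for a Redis key with a TTL.
class TtlStore {
  private expiries = new Map<string, number>();

  // Returns true only if the key was absent or already expired.
  setIfAbsent(key: string, ttlMs: number, nowMs: number): boolean {
    const expiresAt = this.expiries.get(key);
    if (expiresAt !== undefined && expiresAt > nowMs) return false;
    this.expiries.set(key, nowMs + ttlMs);
    return true;
  }
}

const DEDUP_TTL_MS = 60_000; // 60-second window, as described above

// Call at the top of every trigger; skip the handler when this returns false.
function shouldHandle(store: TtlStore, eventId: string, nowMs: number): boolean {
  return store.setIfAbsent(`dedup:${eventId}`, DEDUP_TTL_MS, nowMs);
}
```

The check and the write need to happen as one atomic operation in the real store; a separate read followed by a write would reopen the race between two concurrent retries.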
Shadow audit state machine. Shadow mode state spans three files (src/redis/config.ts, src/jobs/aggregateBuilder.ts, src/triggers/postSubmit.ts) with sentinel checks to prevent double-digest, stale accumulation, and cleanup on promotion. Spread state made debugging harder than it needed to be.
LLM without App Review. Outbound HTTP requires Reddit App Review approval. During development the LLM path stubs to a structured failure, forcing automatic heuristic fallback, which meant the heuristic had to be fully production-quality from the start.
Accomplishments That I'm Proud Of
Heuristic AI with no API key. Coach and the flair detector run entirely on pattern matching and Redis data. The flair classifier handles 15 content categories with bigrams, negation, and density bonuses. Coach handles 12 mod intent categories with grounded, subreddit-specific answers. No external model, no latency, no cost.
The trust system's depth. A continuous 0-1000 score built from five independent components, updated after every event, decaying weekly for inactive users, feeding into spam scoring as a multiplier rather than a gate. The system gets more accurate over time as it learns who the community's reliable contributors are.
The community recognition loop. The Leaderboard post and weekly automated shoutout give the community a visible, recurring acknowledgment system. Good contributors are seen. That feedback loop matters for long-term community health.
A complete moderation platform, solo. Spam detection, first-time recovery, trust scoring, ban evasion detection, coordinated inauthentic behavior (CIB) detection, anti-raid presets, shadow audit, flair automation, community voting, appeal analysis, leaderboards, weekly shoutouts, an AI co-pilot, and a five-tab dashboard, all native to Reddit with no external server.
What's Next for SOS Moderators
Reddit App Review. Once approved, the LLM path unlocks fully open-ended Coach questions via a mod-configured API endpoint. Heuristics stay as the fallback.
Cross-subreddit trust scores. As more subreddits adopt SOS Moderators, a shared trust layer becomes possible, letting a user's reputation across the network inform how new subreddits treat their first posts.
Expanded autoflair categories. 15 preconfigured categories cover the most common subreddit types. The next phase expands the library so more communities get accurate autoflair out of the box.
Deeper onboarding configurability. Trust signal weights and spam signal weights should be configurable at setup, not buried in constants. Different communities have very different spam profiles.
Mod team analytics. The dashboard currently surfaces community data. The next version should surface mod team data: response time on flagged posts, queue depth trends, and coverage gaps across the team.
Built With
- devvit
- llm
- redis
- typescript