Inspiration

Reddit moderation is genuinely hard work and most of it goes unnoticed. I started paying attention to a pattern that kept showing up in posts across different subreddits. Someone would get their post removed and have absolutely no idea why. They would comment asking, get ignored, repost the same thing, get removed again, and then eventually just leave the community frustrated. And the mods were not doing anything wrong. They were just busy.

AutoModerator helps a lot with the mechanical side of things, but it has one gap that nobody really talks about. It can remove content, but it cannot explain itself. There is no built-in way for AutoMod to tell a user, in plain human language that addresses their specific post, why something was taken down. So users are left guessing.

I also noticed that consistency is a quieter problem. Two mods on the same team will sometimes make opposite calls on nearly identical content, not because anyone is being unfair but because moderation is judgment-based and humans are inconsistent. Users notice this, and it erodes trust over time.

I wanted to build something that addressed these real day-to-day problems without asking mods to change how they work.

What it does

Open Moderator has four parts that work independently.

The first one handles what happens after a removal. The moment a mod removes a post or comment, the affected user gets a personalized message explaining exactly which rule was broken and why their specific content crossed that line. The message is written by AI and adapts to whatever language the user posted in. Mods do not have to write anything or even see it. It just goes out.

The second one watches for inconsistency. Every approved piece of content gets fingerprinted. If a mod later removes something that looks similar to something that was previously allowed, the mod team gets a private alert with links to both pieces of content and a similarity score. Nobody is stopped from making their call. They just get a heads up before users have to point it out.

The third one monitors comments. It tracks escalating arguments between the same two users in a thread and calls in AI when the exchange starts to look less like debate and more like harassment. It also handles instant removal of custom banned words with no AI cost, and separately runs an AI filter for the kind of tone and obfuscated language that word lists alone always miss.
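The two cheap checks described above, the word list and the back-and-forth counter, can be sketched roughly like this. It is a minimal illustration and not the app's actual code: `containsBannedWord`, `EscalationTracker`, and the reply threshold are hypothetical names and values, and the word splitting is an ASCII-only simplification.

```typescript
// Hypothetical sketch: none of these names come from the real app.

// Cheap path: whole-word match against a mod-supplied list, no AI call.
// ASCII-only tokenizing, kept simple for illustration.
function containsBannedWord(text: string, banned: string[]): boolean {
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9']+/)
    .filter((w) => w.length > 0);
  return banned.some((b) => words.indexOf(b.toLowerCase()) !== -1);
}

// Key a pair of users the same way regardless of who replied to whom.
function pairKey(a: string, b: string): string {
  return [a, b].sort().join("|");
}

class EscalationTracker {
  private counts: Record<string, number> = {};
  private threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // Record one reply. Returns true once the same two users have exchanged
  // at least `threshold` replies in one thread, which is the point where
  // the more expensive AI review would be requested.
  recordReply(threadId: string, author: string, parentAuthor: string): boolean {
    if (author === parentAuthor) return false; // self-replies are not an exchange
    const key = threadId + ":" + pairKey(author, parentAuthor);
    const next = (this.counts[key] || 0) + 1;
    this.counts[key] = next;
    return next >= this.threshold;
  }
}
```

Counting per thread and per unordered pair keeps the AI out of the loop until there is an actual exchange to look at, which is what keeps the common case free.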

The fourth one is a monthly analysis. Once every 30 days it reviews all the removals from the past month, separates clear rule violations from the gray areas where no existing rule really applied, and posts a mod-only thread with specific suggestions for improving the rules, including the exact wording to add.

How we built it

Everything runs inside Devvit, so there is no external server to run, and the only data that leaves Reddit's own infrastructure is what gets sent to the AI model. The AI is handled by Gemini 3.1 Flash Lite through the Generative Language REST API, which turned out to be a great fit because it is fast, cheap, and smart enough for the kind of judgment calls moderation requires.

The trickiest engineering problem was outbound messaging. Reddit has rate limits on how many messages can be sent per minute. So instead of sending DMs and modmails directly from triggers, everything goes into a Redis sorted set queue that drains in batches of five every 30 seconds. The drain job reschedules itself at the end of every cycle, and a self-healing check detects when the job has stopped and revives it.
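The queue-and-drain idea can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not the app's actual code: `SortedSetQueue` and `drainOnce` are hypothetical names, and the array stands in for the Redis sorted set, with the enqueue time used as the score.

```typescript
// In-memory stand-in for a sorted set: members ordered by numeric score.
// Hypothetical names; a real version would call the platform's Redis client.
interface Scored {
  member: string;
  score: number;
}

class SortedSetQueue {
  private items: Scored[] = [];

  // Insert a member with its enqueue time as the score.
  // Re-adding an existing member just updates its score.
  add(member: string, score: number): void {
    this.items = this.items.filter((i) => i.member !== member);
    this.items.push({ member: member, score: score });
    this.items.sort((a, b) => a.score - b.score);
  }

  // Remove and return up to `count` members with the lowest scores (oldest first).
  takeOldest(count: number): string[] {
    return this.items.splice(0, count).map((i) => i.member);
  }

  size(): number {
    return this.items.length;
  }
}

const BATCH_SIZE = 5; // from the text: five messages per drain cycle

// One drain cycle: send at most BATCH_SIZE queued payloads, oldest first.
// Returns how many were sent so the caller can log or adjust.
function drainOnce(queue: SortedSetQueue, send: (payload: string) => void): number {
  const batch = queue.takeOldest(BATCH_SIZE);
  for (const payload of batch) {
    send(payload);
  }
  return batch.length;
}
```

Because the score is the enqueue time, draining from the low end gives first-in, first-out order without needing list push and pop.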

The semantic similarity feature for detecting inconsistent removals uses a Gemini embedding model to convert content into high-dimensional vectors and then computes cosine similarity between them. This is what allows it to catch two posts that mean the same thing even when they share almost no words.
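For reference, cosine similarity itself is short to write down. The sketch below assumes the embeddings have already been fetched as plain number arrays; `ALERT_THRESHOLD` and `shouldAlert` are hypothetical, and the threshold value is a placeholder, not the one the app uses.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as "not similar".
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Placeholder threshold, chosen only for illustration.
const ALERT_THRESHOLD = 0.85;

// True when a removed item is close enough to a previously approved one
// that the mod team should get a heads up.
function shouldAlert(removed: number[], approved: number[]): boolean {
  return cosineSimilarity(removed, approved) >= ALERT_THRESHOLD;
}
```

Cosine similarity ignores vector length and compares direction only, which is why two posts with different wording but the same meaning can still score close to 1.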

Challenges we ran into

Devvit has a lot of undocumented behavior that you only discover by running into it. The Redis client does not support list operations like push and pop, which I only found out after building the first version of the queue around them. I had to rebuild the whole thing around sorted sets. The expire function is unreliable under real load, so I ended up storing timestamps inside the values and checking them manually. ModAction fires four times simultaneously for a single moderation event, with no mention of this anywhere in the docs, so deduplication had to be built from scratch.
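The two workarounds at the end of that list, timestamps stored inside values and deduplication of repeated events, fit together naturally. Here is a minimal in-memory sketch, assuming each event can be reduced to a stable id string; `DedupStore` and `firstSeen` are hypothetical names, and the plain object stands in for the key-value store.

```typescript
// The stored value carries its own timestamp, so expiry can be checked
// by the reader instead of relying on the store's expire feature.
interface Stamped {
  seenAt: number;
}

class DedupStore {
  private entries: Record<string, string> = {};
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns true only the first time an event id is seen within the TTL
  // window; repeats inside the window return false and should be skipped.
  firstSeen(eventId: string, now: number): boolean {
    const raw = this.entries[eventId];
    if (raw !== undefined) {
      const stamped = JSON.parse(raw) as Stamped;
      if (now - stamped.seenAt < this.ttlMs) {
        return false; // duplicate delivery of the same event
      }
      // Otherwise the entry is stale: fall through and overwrite it.
    }
    this.entries[eventId] = JSON.stringify({ seenAt: now });
    return true;
  }
}
```

One caveat: this check-then-write is not atomic, so against a real shared store with truly simultaneous deliveries it would need an atomic set-if-absent operation, if the store offers one.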

The Gemini integration had its own surprise. Using Gemini 2.5 models without a specific configuration parameter causes the model to spend almost its entire token budget on internal reasoning before responding, which leaves about 46 characters of actual output. It took a while to find that one.
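The writeup does not name the parameter, so the following is a guess at what it refers to: on Gemini 2.5 models, the `generateContent` request accepts a `thinkingConfig` with a `thinkingBudget` inside `generationConfig`, and a small or zero budget leaves more of the output token limit for the visible answer. A sketch of building such a request body, with `buildRequest` as a hypothetical helper:

```typescript
// Shape of the parts of a generateContent request body used here.
interface GenerateRequest {
  contents: { parts: { text: string }[] }[];
  generationConfig: {
    maxOutputTokens: number;
    thinkingConfig: { thinkingBudget: number };
  };
}

// Hypothetical helper. Assumption: the "specific configuration parameter"
// in the text is the thinking budget; the source does not confirm this.
function buildRequest(prompt: string, maxOutputTokens: number): GenerateRequest {
  return {
    contents: [{ parts: [{ text: prompt }] }],
    generationConfig: {
      maxOutputTokens: maxOutputTokens,
      // 0 asks the model not to spend tokens on internal reasoning.
      // Not every model accepts 0, so check the model's documentation.
      thinkingConfig: { thinkingBudget: 0 },
    },
  };
}
```

The body would be sent as JSON in a POST to the model's `generateContent` endpoint. Whether a zero budget is appropriate depends on the model and on how much reasoning the task needs.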

Each of these broke in a way that taught me something real about the platform.

Accomplishments that we're proud of

The thing I am most proud of is that the whole system is self-healing. Every scheduled job reschedules itself inside a finally block. A watchdog on every trigger checks whether the drain job is still alive and kicks it back awake if it has gone quiet. The app is designed to keep working even when individual pieces of Reddit's infrastructure have brief outages, which happens more often than you would expect.
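The pattern of rescheduling in a finally block plus a watchdog can be sketched as follows. Everything here is an assumption made for illustration: the `Scheduler` interface, the heartbeat object, and the staleness window of three missed cycles are hypothetical and not taken from the app or from Devvit's API.

```typescript
// Hypothetical scheduler: "run this job after a delay".
interface Scheduler {
  runIn(delayMs: number, job: () => void): void;
}

// Last time the drain job started a cycle.
interface Heartbeat {
  lastRunAt: number;
}

const DRAIN_INTERVAL_MS = 30000; // from the text: every 30 seconds
const STALE_AFTER_MS = 3 * DRAIN_INTERVAL_MS; // assumed: three missed cycles

// One cycle of the drain job. The finally block runs whether or not
// work() throws, so a single bad cycle does not break the chain.
function runDrainCycle(
  scheduler: Scheduler,
  heartbeat: Heartbeat,
  clock: () => number,
  work: () => void
): void {
  try {
    heartbeat.lastRunAt = clock();
    work();
  } finally {
    scheduler.runIn(DRAIN_INTERVAL_MS, () =>
      runDrainCycle(scheduler, heartbeat, clock, work)
    );
  }
}

// Called from every trigger. If the heartbeat is older than the staleness
// window, the chain is assumed dead and is restarted immediately.
// Returns true when a restart was scheduled.
function watchdog(
  scheduler: Scheduler,
  heartbeat: Heartbeat,
  clock: () => number,
  work: () => void
): boolean {
  if (clock() - heartbeat.lastRunAt <= STALE_AFTER_MS) return false;
  scheduler.runIn(0, () => runDrainCycle(scheduler, heartbeat, clock, work));
  return true;
}
```

The finally block covers failures inside a cycle, and the watchdog covers the case where the scheduled run itself never fires, which a job cannot detect from the inside.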

The multilingual support also turned out better than I expected. Open Moderator writes removal messages in the same language the user posted in without any extra configuration. It just works.

What we learned

The biggest thing I took from this is that you have to treat every assumption as something to verify when you are building on top of someone else's platform. Documentation is always incomplete. The runtime always has opinions that the docs do not mention. The only way to really learn what a platform does is to build something real on it and watch what breaks.

I also learned a lot about what mods actually need. The temptation when building moderation tools is to give mods more dashboards and more controls. What most of them actually want is for repetitive work to just disappear. The right feature is usually the one that requires zero interaction.

What's next for Open Moderator

The next thing I want to build is a way for mods to set up custom moderation policies in plain language that Open Moderator enforces. Instead of writing AutoMod YAML rules, you would just write something like "remove any post that tries to sell something without disclosure" and the AI would handle the interpretation. Effectively AutoMod but configured in plain English.

Longer term, I want to explore cross-community learning so that communities dealing with similar problems can benefit from each other's removal patterns without sharing any actual content.
