CollabGuard

Why we built this

Moderation at scale is a team sport, but most tooling still treats it like solo work. Mods review the same report twice, disagree in DMs, lose context between shifts, and struggle with content that is technically within rules but harmful in context—sarcasm, dog whistles, passive aggression, missing context.

Research on volunteer moderation stress and coordination gaps helped frame our problem. See this paper on moderator workload and community health for broader context on why this matters.

We asked a simple question:

What if Reddit mods had a shared case workspace inside the platform—plus in-page semantic hints with evidence—without building another auto-ban bot?

CollabGuard is our answer: a moderator-first real-time collaboration system for Reddit communities. It combines a Devvit app for shared case review with a Chrome extension that surfaces evidence-backed semantic risk signals on Reddit. Mod teams can coordinate queue work, review difficult cases, send gentle reminders, escalate for consensus, and maintain audit trails.

The product is a human-controlled moderation workspace. It is not a strike system, auto-ban tool, autonomous enforcement bot, or replacement for moderator judgment. Its purpose is to make moderation teamwork safer, more transparent, and easier to coordinate.

Join us in our build-and-test community to see CollabGuard in action:

https://www.reddit.com/r/ModQueueLab/

This subreddit is our restricted test lab for playtests, demos, and moderator workflow validation.

What inspired us

Duplicate work and invisible coordination — Two mods often investigate the same report with no shared claim state, notes, or resolution history.
Context that lives outside Reddit — Decisions get debated in Discord or modmail while the official queue stays thin.
“Soft” harm that rules miss — Harassment and bad faith often need quotes and thread context, not a single toxicity score.
Punishment-first tooling culture — We wanted reminders and consensus before removals, especially for fixable mistakes.
Devvit as a first-class home — Reddit’s mod tools category deserved a workspace that feels native, not a bolt-on spreadsheet.

We designed every feature around one constraint:

Human confirms ➣ Action recorded ➣ Optional public outcome

No step skips the moderator.
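
The constraint above can be sketched as a small gate that refuses to record anything without a moderator's confirmation and reason. This is an illustrative sketch only: `ProposedAction`, `Confirmation`, `AuditEvent`, and `confirmAndRecord` are hypothetical names, not taken from the CollabGuard codebase.

```typescript
// Hypothetical sketch: every proposed action must pass through a moderator
// confirmation before it is recorded. None of these names come from CollabGuard.

type ProposedAction = {
  caseId: string;
  kind: "resolve" | "nudge" | "escalate";
  source: "menu" | "extension" | "ai_assist";
};

type Confirmation = { moderator: string; reason: string };

type AuditEvent = ProposedAction &
  Confirmation & { recordedAt: number; publicOutcome?: string };

const auditLog: AuditEvent[] = [];

// Step 1: a human confirms. Step 2: the action is recorded.
// Step 3: a public outcome is attached only if one was provided.
function confirmAndRecord(
  action: ProposedAction,
  confirmation: Confirmation | null,
  publicOutcome?: string,
  now: number = Date.now(),
): AuditEvent {
  if (confirmation === null) {
    throw new Error("No moderator confirmation: action not applied");
  }
  if (confirmation.reason.trim() === "") {
    throw new Error("A reason is required");
  }
  const event: AuditEvent = { ...action, ...confirmation, recordedAt: now };
  if (publicOutcome !== undefined) {
    event.publicOutcome = publicOutcome;
  }
  auditLog.push(event);
  return event;
}
```

In this sketch the source of the proposal (menu, extension, or AI assist) is only metadata. It never substitutes for the confirmation argument.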

How we built it

CollabGuard + Semantic Sentry is three components wired together, not one monolith: a Devvit app, a Chrome extension, and a Supabase bridge.

CollabGuard (Devvit Web app)

Stack: Devvit Web, React 19, TypeScript, Vite, Hono, Devvit Redis, realtime, scheduler, menus/forms/triggers
Core flows: Shared mod queue, claim/release with stale expiry, case detail (notes, discussion, audit, voting), escalation tags, Resolve & Log, Gentle Nudge (Shadow + Active modes), Team Insights, AutoMod YAML builder (draft + browser simulator), optional AI mod assist (advisory only, keys via subreddit menu)
Reddit integration: Post/comment/subreddit menu actions with deep links into the dashboard; playtest on r/ModQueueLab
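
To illustrate claim/release with stale expiry, here is a minimal in-memory sketch. The real app keeps state in Devvit Redis. The `Map`, the `ClaimStore` class, and the 15-minute `CLAIM_TTL_MS` window are assumptions made for the example.

```typescript
// Hypothetical in-memory sketch of claim/release with stale expiry.
// A Map stands in for the subreddit-scoped Redis used by the real app.

type Claim = { moderator: string; claimedAt: number };

const CLAIM_TTL_MS = 15 * 60 * 1000; // assumed expiry window, not from the source

class ClaimStore {
  private claims = new Map<string, Claim>();

  // Returns the active claim, treating expired claims as absent.
  current(caseId: string, now: number): Claim | undefined {
    const claim = this.claims.get(caseId);
    if (claim && now - claim.claimedAt >= CLAIM_TTL_MS) {
      this.claims.delete(caseId);
      return undefined;
    }
    return claim;
  }

  // Succeeds if the case is unclaimed, stale, or already held by this moderator.
  claim(caseId: string, moderator: string, now: number): boolean {
    const existing = this.current(caseId, now);
    if (existing && existing.moderator !== moderator) return false;
    this.claims.set(caseId, { moderator, claimedAt: now });
    return true;
  }

  // Only the current holder can release a claim.
  release(caseId: string, moderator: string, now: number): boolean {
    const existing = this.current(caseId, now);
    if (!existing || existing.moderator !== moderator) return false;
    this.claims.delete(caseId);
    return true;
  }
}
```

In a shared store, the check and the write in `claim` would need to happen atomically, otherwise two moderators could both win the same case.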

Semantic Sentry (Chrome MV3 extension)

Stack: React, TypeScript, Vite, Zustand, Zod, Shadow DOM UI on Reddit pages
Behavior: Scan posts/comments, return labels only with quoted evidence + rationale, rate-limited calls through Supabase Edge Functions (secrets never in the extension)
Handoff: Moderator taps Send to queue → pending row in Postgres → CollabGuard scheduler polls and applies workflow + audit on the Devvit side
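
The "labels only with quoted evidence" rule can be sketched as a filter over model output. The extension uses Zod for validation. The version below swaps in a hand-written guard to stay dependency-free, and the verbatim-substring check is an assumption about what "quoted evidence" means. `Signal` and `keepEvidenceBacked` are hypothetical names.

```typescript
// Hypothetical sketch of "evidence or nothing": a label is kept only when it
// carries a rationale and a non-empty quote that appears in the scanned text.
// A hand-written guard replaces Zod here so the example has no dependencies.

type Signal = { label: string; evidence: string; rationale: string };

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

function keepEvidenceBacked(raw: unknown, scannedText: string): Signal[] {
  if (!Array.isArray(raw)) return [];
  const kept: Signal[] = [];
  for (const item of raw) {
    if (typeof item !== "object" || item === null) continue;
    const { label, evidence, rationale } = item as Record<string, unknown>;
    if (
      !isNonEmptyString(label) ||
      !isNonEmptyString(evidence) ||
      !isNonEmptyString(rationale)
    ) {
      continue;
    }
    // Assumed rule: drop quotes that are not verbatim substrings of the text.
    const quote = evidence.trim();
    if (!scannedText.includes(quote)) continue;
    kept.push({ label, evidence: quote, rationale });
  }
  return kept;
}
```

Malformed output yields an empty list rather than an error, so the UI in this sketch shows nothing instead of an unsupported label.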

Supabase bridge

Edge Functions for scan, enqueue, undo, status
Postgres for verdict cache, device settings, `pending_queue_actions`
Devvit HTTP allowlist + server-side service role—browser never holds privileged keys
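
One tick of the scheduler poll might look like the sketch below. Only the table name `pending_queue_actions` comes from this writeup. The row shape, the `failed` status, and the `pollOnce` and `openCase` names are assumptions, and plain arrays stand in for Postgres and the Devvit side.

```typescript
// Hypothetical sketch of one scheduler tick over pending bridge rows.
// Row shape and function names are assumptions made for illustration.

type PendingRow = {
  id: number;
  postId: string;
  status: "pending" | "applied" | "failed";
};

type QueueCase = { postId: string; origin: "extension"; needsHumanReview: true };

// Opens a case for human review for each pending row, exactly once.
// A row that throws is marked failed so it cannot block the rest of the batch.
function pollOnce(rows: PendingRow[], openCase: (c: QueueCase) => void): number {
  let applied = 0;
  for (const row of rows) {
    if (row.status !== "pending") continue; // handled on an earlier tick
    try {
      openCase({ postId: row.postId, origin: "extension", needsHumanReview: true });
      row.status = "applied";
      applied += 1;
    } catch {
      row.status = "failed";
    }
  }
  return applied;
}
```

Note that "applied" here means a case was opened in the shared queue. The sketch never removes content on its own.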

Architecture (high level)

flowchart LR
  Reddit[Reddit pages] --> Ext[Semantic Sentry]
  Ext --> Supa[Supabase]
  Supa --> Poll[Devvit scheduler poll]
  Poll --> CG[CollabGuard]
  Menus[Devvit menus] --> CG
  CG --> Redis[(Subreddit Redis)]
  CG --> Mods[Moderator dashboard]

We iterated in parallel: extension evidence pipeline, dashboard queue UX, bridge idempotency, Gentle Nudge safety rails, and launch polish (branding, mobile layout, publish flow).

Challenges we faced

Challenge	What happened	How we addressed it
Human-in-the-loop everywhere	Easy to accidentally “feel autonomous” with AI or bridge actions	Explicit audit events, confirmation forms on menus, Resolve & Log requires reason; AI off by default
Evidence or nothing	LLMs love labels without quotes	Prompt + schema + sanitization + Zod + DB constraints + UI strip if evidence missing
Two runtimes, one workflow	Chrome extension ≠ Devvit iframe	Pending queue table + minute scheduler + realtime fan-out to dashboard
Devvit HTTP allowlist	Supabase calls blocked until domains declared	Per-domain requests in `devvit.json`; documented bridge justification for review
Gentle Nudge safety	Public replies can harm trust if mis-sent	Shadow Mode, cooldowns, thread caps, duplicate prevention, false-positive feedback → local config only (no training on Reddit data)
Mobile + iframe UX	Sidebar, notifications, splash clipped on small viewports	Responsive drawer, fixed panels, gradient splash, logo assets tuned for playtest
Publish & ops	Terms/privacy required for HTTP; app name collisions; cache on old bundles	GitHub-hosted policies, `collab-guard-lab` slug, fresh posts for playtest after upload
Repo / git pain	Corrupted pack files during crunch	Fresh clone path, rsync of `collab-guard/`, continue on clean branch
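
The Gentle Nudge rails from the table (Shadow Mode, cooldowns, thread caps, duplicate prevention) can be sketched as a single gate. The thresholds, the order of the checks, and all names (`NudgeGate`, `NudgeRequest`, `COOLDOWN_MS`, `THREAD_CAP`) are assumptions. The writeup names the rails but not their values.

```typescript
// Hypothetical sketch of Gentle Nudge safety rails as one decision function.
// Thresholds and names are assumptions; the source only names the rails.

type NudgeMode = "shadow" | "active";
type NudgeRequest = { threadId: string; userId: string; templateId: string; now: number };
type NudgeDecision = {
  post: boolean;
  reason: "ok" | "shadow_mode" | "duplicate" | "cooldown" | "thread_cap";
};

const COOLDOWN_MS = 24 * 60 * 60 * 1000; // assumed per-user cooldown
const THREAD_CAP = 2; // assumed maximum nudges per thread

class NudgeGate {
  private mode: NudgeMode;
  private lastNudgeAt = new Map<string, number>();
  private perThread = new Map<string, number>();
  private sent = new Set<string>();

  constructor(mode: NudgeMode) {
    this.mode = mode;
  }

  decide(req: NudgeRequest): NudgeDecision {
    const key = `${req.threadId}:${req.userId}:${req.templateId}`;
    if (this.sent.has(key)) return { post: false, reason: "duplicate" };
    const last = this.lastNudgeAt.get(req.userId);
    if (last !== undefined && req.now - last < COOLDOWN_MS) {
      return { post: false, reason: "cooldown" };
    }
    const threadCount = this.perThread.get(req.threadId) ?? 0;
    if (threadCount >= THREAD_CAP) return { post: false, reason: "thread_cap" };
    // Shadow Mode reports what would have been sent without replying publicly.
    if (this.mode === "shadow") return { post: false, reason: "shadow_mode" };
    this.sent.add(key);
    this.lastNudgeAt.set(req.userId, req.now);
    this.perThread.set(req.threadId, threadCount + 1);
    return { post: true, reason: "ok" };
  }
}
```

Returning a reason with every refusal gives the audit trail something concrete to record, which fits the false-positive feedback loop described in the table.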

The hardest product decision: refusing to ship “one-click remove” from the extension. Every escalation path goes through human review in CollabGuard—even when the model is confident.

What we learned

Product

Mods trust evidence and audit more than scores. Showing why beats showing how risky.
Claims are a social contract, not just a lock. Expiry and release matter as much as claim.
Gentle Nudge belongs in Shadow Mode first. Teams need to see false positives before going public.
Insights should celebrate contribution without turning moderation into a leaderboard.

Technical

Devvit Web wants server-owned secrets and subreddit-scoped Redis—design contracts early.
Bridge patterns need idempotency (`client_id`, status machine) or you double-apply actions.
Playtest cache is real: bump asset paths or use new posts after `devvit upload`.
Shared TypeScript contracts between extension and Devvit (`extension-bridge`) prevented silent schema drift.
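
As a sketch of the idempotency point, the example below keys each bridge action on a client-generated id and guards status changes with a small transition table. The `client_id` idea comes from the note above. The specific statuses, the allowed transitions, and the `ActionTable` class are assumptions for illustration.

```typescript
// Hypothetical sketch of idempotent enqueue plus a guarded status machine.
// Statuses and allowed transitions are assumptions, not the real schema.

type Status = "pending" | "applied" | "undone";
type BridgeAction = { clientId: string; postId: string; status: Status };

const allowedTransitions: Record<Status, Status[]> = {
  pending: ["applied", "undone"],
  applied: [],
  undone: [],
};

class ActionTable {
  private byClientId = new Map<string, BridgeAction>();

  // A retried request with the same clientId returns the existing row
  // instead of creating a second one.
  enqueue(clientId: string, postId: string): { action: BridgeAction; created: boolean } {
    const existing = this.byClientId.get(clientId);
    if (existing) return { action: existing, created: false };
    const action: BridgeAction = { clientId, postId, status: "pending" };
    this.byClientId.set(clientId, action);
    return { action, created: true };
  }

  // Rejects any transition the status machine does not allow, so a repeated
  // "applied" cannot double-apply an action.
  transition(clientId: string, next: Status): boolean {
    const action = this.byClientId.get(clientId);
    if (!action || !allowedTransitions[action.status].includes(next)) return false;
    action.status = next;
    return true;
  }
}
```

In a database the same effect would typically come from a unique constraint on the client id together with a conditional update on the current status.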

Process

Writing Tool Overview, privacy, and terms early made publish review smoother.
Playtesting as multiple mod accounts on r/ModQueueLab surfaced claim races and menu deep-link issues we missed in solo dev.

What’s next

Stronger verified moderator attribution on bridge actions (OAuth path)
More consensus review templates for appeals
Optional AI features remain advisory and subreddit-controlled
Continued hardening of content-deletion and account-deletion cleanup for compliance

Try it

Open r/ModQueueLab as a moderator
Playtest: https://www.reddit.com/r/ModQueueLab/?playtest=collab-guard-lab
Install Semantic Sentry (load unpacked from `extension/`) for in-context review
Report a test post, claim it in CollabGuard, add a note, and trace the audit log

We built CollabGuard for mods who already care about their communities—we just wanted their tools to care about each other, too.

Built With

chrome
claude
codex
devvit
gemini
github-copilot
kiro
node.js
qwen
react
redis
supabase
typescript
vite

Submitted to

Reddit Mod Tools and Migrated Apps Hackathon

Created by

Review Reply Render Renovate Reopen Reddit

Stephen Nguyen
Data Magician x Future AI Musician x Deep Diver
Minh Chau Vu
Lam Anh Truong
Code too much, so I turned into a cat
thuy trang cao
Tran Minh Tue
Hồng Cúc Lê Thị
Hung Truong
Viet Thanh Nguyen
Bao Tran
ai + startup builder | hemut (yc x25), toddagriscience