HiveCatch — Reddit Scam Ring Detector
HiveCatch detects Reddit scam rings by mapping user connections across threads over time, giving moderators a live threat dashboard and one-click tools to ban entire coordinated hives instantly.
Inspiration
We built HiveCatch to help moderators surface coordinated scams, brigading, and manipulative campaigns that are hard to spot from single reports or posts. Moderation teams want a lightweight, auditable tool that clusters suspicious activity and keeps humans in control — not an opaque automation that acts without explicit review.
What It Does
HiveCatch ingests subreddit events and surfaces suspicious clusters for moderator review:
- Listens to post/comment submissions, native post/comment reports, and moderator-action webhooks
- Persists incidents and raw payloads for auditability
- Builds a compact suspect graph in Redis: per-thread participants and weighted account-to-account edges
- Computes clusters and ranks incidents so moderators can triage high-risk activity
- Provides a dashboard UI with:
  - Cluster Roots metric and incident feed
  - Time-range filters and clickable incident cards
  - Ban preview (dry-run) showing impacted accounts and cluster overlap
  - Selective ban controls (pick accounts or clusters) and explicit confirm flows
  - Delete-signal and cluster-cleanup actions
- Safe moderator actions: dry-run preview, retries for bans, and fallbacks (lock post, report to admins) when a direct ban fails
- Debug endpoints to inspect raw and parsed mod-action/report payloads
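The retry-then-fallback behaviour described in the list above might look roughly like the sketch below. It is an illustration under stated assumptions, not HiveCatch's actual code: `banWithRetry`, `banSelected`, the `BanFn`/`FallbackFn` signatures, and the default of three attempts are all hypothetical, and the real ban and fallback calls would go through the platform's moderation API.

```typescript
// Outcome reported for each targeted account.
type BanResult = {
  user: string;
  status: "banned" | "fallback" | "failed";
  attempts: number;
};

// Hypothetical action signatures (assumptions for this sketch).
type BanFn = (user: string) => Promise<void>;
type FallbackFn = (user: string) => Promise<void>; // e.g. lock post or report to admins

// Try the ban up to maxAttempts times, then fall back to a safer action.
async function banWithRetry(
  user: string,
  ban: BanFn,
  fallback: FallbackFn,
  maxAttempts = 3
): Promise<BanResult> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await ban(user);
      return { user, status: "banned", attempts: attempt };
    } catch {
      // Swallow the error and retry; a real version would likely log it and back off.
    }
  }
  try {
    await fallback(user);
    return { user, status: "fallback", attempts: maxAttempts };
  } catch {
    return { user, status: "failed", attempts: maxAttempts };
  }
}

// Process the moderator's selection sequentially and return per-user results.
async function banSelected(
  users: string[],
  ban: BanFn,
  fallback: FallbackFn
): Promise<BanResult[]> {
  const results: BanResult[] = [];
  for (const user of users) {
    results.push(await banWithRetry(user, ban, fallback));
  }
  return results;
}
```

Returning one result per account, rather than a single success flag, lets the dashboard show moderators exactly which bans succeeded, which fell back, and which need manual follow-up.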
How We Built It
Stack & Structure
| Layer | Technology |
|---|---|
| Frontend | React + Vite (src/client/game.tsx) |
| Server | Devvit Web (Hono), TypeScript |
| Storage | Redis (hc:incidents, hc:edges:<user>, hc:thread:<postId>:participants) |
| Core Logic | src/server/core/hive.ts |
| Parsing | src/server/core/modActionParser.ts |
Pipeline Overview
- Trigger receives event (post / comment / report / mod-action)
- `recordContentEvent` scrapes participants, updates per-thread snapshots and user-user edge weights, and writes an incident to a time-indexed Redis zset
- Dashboard snapshot computes clusters and `clusterBreakdown` for UI preview
- Ban-preview API (dry-run) computes affected users/clusters without side effects
- Confirmed ban endpoint accepts `selectedUsers`/`selectedClusters`, executes bans with retries, and returns per-user results
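The edge-weight and clustering steps of the pipeline can be sketched in memory as follows. This is a minimal illustration, not the project's implementation: a plain `Map` stands in for the Redis edge keys, the names `EdgeMap`, `addThreadEdges`, `findClusters`, and the `minWeight` threshold are hypothetical, and connected components via union-find is only one plausible way to compute clusters (the write-up does not say which method is used).

```typescript
// In-memory stand-in for the Redis edge store (assumption: the real app persists these in Redis).
type EdgeMap = Map<string, number>;

// Canonical key for an undirected edge, so (a, b) and (b, a) share one weight.
// Assumes usernames never contain "|".
function edgeKey(a: string, b: string): string {
  return a < b ? `${a}|${b}` : `${b}|${a}`;
}

// Add 1 to the weight of every pair of distinct accounts seen in the same thread.
function addThreadEdges(edges: EdgeMap, participants: string[]): void {
  const users = Array.from(new Set(participants)).sort();
  for (let i = 0; i < users.length; i++) {
    for (let j = i + 1; j < users.length; j++) {
      const key = edgeKey(users[i], users[j]);
      edges.set(key, (edges.get(key) ?? 0) + 1);
    }
  }
}

// Group accounts into connected components, using only edges with weight >= minWeight.
function findClusters(edges: EdgeMap, minWeight: number): string[][] {
  const parent = new Map<string, string>();
  const find = (u: string): string => {
    if (!parent.has(u)) parent.set(u, u);
    let root = u;
    while (parent.get(root) !== root) root = parent.get(root)!;
    parent.set(u, root); // point u directly at its root to shorten later lookups
    return root;
  };
  edges.forEach((weight, key) => {
    if (weight < minWeight) return; // weak edges do not link accounts
    const [a, b] = key.split("|");
    parent.set(find(a), find(b)); // union the two components
  });
  const groups = new Map<string, string[]>();
  Array.from(parent.keys()).forEach((u) => {
    const root = find(u);
    const members = groups.get(root) ?? [];
    members.push(u);
    groups.set(root, members);
  });
  return Array.from(groups.values()).map((g) => g.sort());
}
```

Raising `minWeight` makes clusters stricter: accounts that shared only a single thread stay unlinked, which is one way to keep ordinary co-commenters out of a suspect cluster.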
Core Scoring
Incident strength is an explainable weighted sum of simple signals:
$$S = \alpha R + \beta A + \gamma C + \delta T$$
where R = native reports count, A = recent activity score, C = cluster cohesion, T = repeat incident count.
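As a rough sketch of how this weighted sum could be computed while staying explainable, the function below returns both the total and each term's contribution. The names `IncidentSignals`, `ScoreWeights`, `incidentStrength`, and the weight values are assumptions made for illustration; the write-up does not state the actual weights.

```typescript
// Signals feeding the score; fields correspond to R, A, C, T in the formula above.
interface IncidentSignals {
  reports: number;  // R: native reports count
  activity: number; // A: recent activity score
  cohesion: number; // C: cluster cohesion
  repeats: number;  // T: repeat incident count
}

// Weights alpha, beta, gamma, delta from the formula.
interface ScoreWeights {
  alpha: number;
  beta: number;
  gamma: number;
  delta: number;
}

// Placeholder weights for illustration only (not taken from the project).
const EXAMPLE_WEIGHTS: ScoreWeights = { alpha: 2, beta: 1, gamma: 3, delta: 1 };

// Compute S = alpha*R + beta*A + gamma*C + delta*T and keep the per-signal terms.
function incidentStrength(s: IncidentSignals, w: ScoreWeights) {
  const contributions = {
    reports: w.alpha * s.reports,
    activity: w.beta * s.activity,
    cohesion: w.gamma * s.cohesion,
    repeats: w.delta * s.repeats,
  };
  const total =
    contributions.reports +
    contributions.activity +
    contributions.cohesion +
    contributions.repeats;
  return { total, contributions };
}
```

With the placeholder weights, an incident with 3 reports, activity 2, cohesion 0.5, and 1 repeat scores 2·3 + 1·2 + 3·0.5 + 1·1 = 10.5, and the `contributions` object shows a moderator that reports account for 6 of those points.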
The cluster match percent, used for candidate filtering, is the share of a cluster's accounts that are also suspects:
$$\text{clusterMatchPercent} = 100 \times \frac{|\text{suspect} \cap \text{cluster}|}{|\text{cluster}|}$$
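A direct translation of this formula into code could look like the sketch below. The function signature and the choice to return 0 for an empty cluster are assumptions made for this example.

```typescript
// Percentage of a cluster's members that also appear in the suspect list.
function clusterMatchPercent(suspects: string[], cluster: string[]): number {
  const clusterSet = new Set(cluster);
  if (clusterSet.size === 0) return 0; // assumption: an empty cluster matches nothing
  const suspectSet = new Set(suspects);
  let overlap = 0;
  clusterSet.forEach((user) => {
    if (suspectSet.has(user)) overlap++;
  });
  return (100 * overlap) / clusterSet.size;
}
```

For example, with suspects `a`, `b`, `x` and a cluster of `a`, `b`, `c`, `d`, two of the four cluster members are suspects, so the match is 50 percent. Because the denominator is the cluster size, suspects outside the cluster (here `x`) do not affect the value.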
Challenges We Ran Into
- Varied webhook shapes — mod-action/report payloads differ across calls; we implemented tolerant parsing and persisted raw payloads for post-hoc fixes
- Placeholder accounts — reporter/unknown/redacted placeholders showed up as graph nodes and risked false positives; we exclude synthetics from graph edges and ban candidate lists
- Platform constraints — Devvit disallows network IO at install; we removed install-time side effects
- Safety vs automation — designing a conservative UX (dry-run → select → confirm) to avoid accidental mass moderation
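The placeholder-account fix mentioned above amounts to filtering synthetic names before they reach the graph or a ban list. A minimal sketch, assuming the placeholders arrive as the literal strings named in this write-up (the exact values in real payloads may differ, and `isPlaceholder` and `realParticipants` are hypothetical helper names):

```typescript
// Placeholder names taken from the write-up; the exact payload strings are an assumption.
const PLACEHOLDER_NAMES = new Set(["reporter", "unknown", "redacted"]);

// True for missing, empty, or synthetic account names.
function isPlaceholder(name: string | null | undefined): boolean {
  if (!name) return true;
  // Normalise case, whitespace, and an optional "u/" or "/u/" prefix before comparing.
  const normalized = name.trim().toLowerCase().replace(/^\/?u\//, "");
  return normalized === "" || PLACEHOLDER_NAMES.has(normalized);
}

// Keep only real, de-duplicated account names, preserving first-seen order.
function realParticipants(names: Array<string | null | undefined>): string[] {
  const seen = new Set<string>();
  const out: string[] = [];
  for (const raw of names) {
    if (raw == null || isPlaceholder(raw)) continue;
    const name = raw.trim();
    if (!seen.has(name)) {
      seen.add(name);
      out.push(name);
    }
  }
  return out;
}
```

Applying the same filter both when building edges and when assembling ban candidates gives two independent chances to catch a placeholder before any moderator action is taken.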
Accomplishments We're Proud Of
- End-to-end pipeline from triggers → graph → dashboard → safe moderator actions with preview
- Auditability: persisted raw inputs and participant snapshots for reliable debugging and forensics
- Conservative, explainable automation: preview mode, selective targeting, retries, and safe fallbacks
- Robust placeholder filtering to avoid banning reporters or unknown placeholders
- Usable dashboard that reduces cognitive load with clear previews and explicit confirmation flows
What We Learned
- Moderators trust transparent, explainable signals more than opaque scores
- Persisting raw webhook/report data accelerates parser fixes and incident forensics
- Human-in-the-loop workflows (preview → selection → confirm) are essential for getting automation adopted in moderation settings
- Building on hosted platforms needs careful lifecycle and sandbox-aware design
What's Next for HiveCatch
- Add privacy-conscious ML features behind preview/approval (small graph-embedding tests)
- Add replay test harness to validate parser changes using captured payloads
- Add role-based audit logs and exportable incident evidence packages
- Improve scoring and clustering with statistical tests while keeping explanations simple