LinguaGuard

Inspiration

As a Latin American developer and AI engineer, I've experienced firsthand how language barriers fragment online communities. Subreddits like r/france, r/de, and r/chile enforce single-language rules, but moderators spend 3–5 minutes per post manually reading, translating, and deciding on off-language content. I wanted to build a tool that cuts that to under 30 seconds, with full transparency into why a post was flagged and how the decision was made.

What it does

LinguaGuard is a multilingual moderation tool for Reddit that detects, translates, and helps moderators act on off-language posts and comments. When a mod right-clicks any post or comment:

Signal detection — A 3-layer hybrid engine analyzes the content: heuristic language detection first, AI verification second (Gemini 2.0 Flash), graceful fallback third
Interpretation — A rich Devvit Blocks panel shows risk level, semantic signals (e.g. "Spanish greeting detected", "Spanish punctuation (¿)"), confidence score, and matched rule
Action — The mod sees a recommended action (Remove / Approve / Review) with full context, then chooses — and if they disagree with the AI, the override is logged for quality tracking
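
The three layers above can be sketched roughly as follows. This is a minimal illustration and not the code in ai.ts: the names `analyze`, `Analysis`, `Detector`, and `Verifier`, and the 0.8 confidence cut-off, are assumptions made for the example. It also assumes the AI is only consulted when the heuristic is unsure, which is one possible reading of "AI verification second".

```typescript
type Action = "remove" | "approve" | "review";

interface Analysis {
  language: string; // detected language code, e.g. "es"
  confidence: number; // 0..1
  source: "heuristic" | "ai" | "fallback";
  recommended: Action;
}

// Hypothetical signatures for the two detection layers.
type Detector = (text: string) => Analysis;
type Verifier = (text: string, hint: Analysis) => Promise<Analysis>;

async function analyze(
  text: string,
  detectHeuristic: Detector,
  verifyWithAI: Verifier,
  confidentAt = 0.8, // assumed threshold, not taken from the project
): Promise<Analysis> {
  // Layer 1: cheap local heuristic.
  const first = detectHeuristic(text);
  if (first.confidence >= confidentAt) return first;

  try {
    // Layer 2: ask the model to verify the ambiguous case.
    return await verifyWithAI(text, first);
  } catch {
    // Layer 3: AI unavailable. Keep the heuristic guess, defer to a human.
    return { ...first, source: "fallback", recommended: "review" };
  }
}
```

Passing the two layers in as functions keeps the fallback path easy to exercise: a verifier that always throws simulates an outage.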

Key features:

Explainable AI: Semantic signals, decision trace pipeline, confidence %, and ambiguity indicators — not just a black-box recommendation
Human-in-the-loop: Moderator overrides are tracked and logged in mod notes for auditability
Author history: Tracks repeat offenders across sessions with auto-escalation (3+ removals → HIGH RISK)
Progressive disclosure: Primary actions (Remove/Approve) prominent, secondary actions (Remove+DM, Review, Dismiss) tucked below
Resilient architecture: Keeps working without the AI layer. Heuristic detection covers 10+ languages even when the AI service is unavailable
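
The auto-escalation rule in the author-history feature can be sketched like this. It is an illustration under assumptions: the `Risk` levels other than HIGH, the `Decision` shape, and the function name `escalateRisk` are hypothetical, and the sketch assumes only removals inside the 90-day retention window count toward the threshold of 3.

```typescript
type Risk = "LOW" | "MEDIUM" | "HIGH";

// Hypothetical record of one past moderator decision about an author.
interface Decision {
  action: "remove" | "approve" | "review";
  at: number; // epoch milliseconds
}

const RETENTION_MS = 90 * 24 * 60 * 60 * 1000; // 90-day retention
const ESCALATE_AT = 3; // 3+ removals escalate to HIGH

function escalateRisk(base: Risk, history: Decision[], now: number): Risk {
  // Count removals that are still inside the retention window.
  const removals = history.filter(
    (d) => d.action === "remove" && now - d.at <= RETENTION_MS,
  ).length;
  return removals >= ESCALATE_AT ? "HIGH" : base;
}
```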

How I built it

Platform: Devvit (Reddit's developer platform) with TypeScript and Devvit Blocks for native Reddit UI
AI layer: Gemini 2.0 Flash, called over direct HTTP fetch to Gemini's REST API, with a heuristic fallback engine that scores word frequency across Spanish, Portuguese, French, German, and Italian and uses script detection for Japanese, Korean, Chinese, Arabic, Russian, Thai, and Hindi
Architecture: 5-module TypeScript codebase — ai.ts (3-layer analysis engine), actions.ts (mod action execution + override tracking), redis.ts (state management with 15-min TTL), history.ts (author decision history with 90-day retention), types.ts (shared interfaces)
UX design: Trust & safety console aesthetic with Signal → Interpretation → Action flow, inspired by operational dashboards used in content moderation platforms
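
To make the heuristic engine concrete, here is a minimal sketch of the two ideas named above: script detection by Unicode range, then stop-word counting for Latin-script languages. The word lists are tiny placeholders, the 0.9 confidence for a script match is an arbitrary assumption, and `detectLanguage` is a hypothetical name, not the function in ai.ts.

```typescript
interface Detection {
  language: string; // language code, or "unknown"
  confidence: number; // 0..1
}

// Unicode ranges for scripts that identify a language (or family) on sight.
// Order matters: kana is checked before CJK ideographs, so kanji-only
// Japanese text would still be reported as "zh" by this sketch.
const SCRIPTS: Array<[string, RegExp]> = [
  ["ja", /[\u3040-\u30ff]/], // Hiragana and Katakana
  ["ko", /[\uac00-\ud7af]/], // Hangul syllables
  ["zh", /[\u4e00-\u9fff]/], // CJK unified ideographs
  ["ar", /[\u0600-\u06ff]/], // Arabic
  ["ru", /[\u0400-\u04ff]/], // Cyrillic (not exclusive to Russian)
  ["th", /[\u0e00-\u0e7f]/], // Thai
  ["hi", /[\u0900-\u097f]/], // Devanagari
];

// Illustrative stop-word samples; a usable list would be much longer.
const STOPWORDS: Array<[string, string[]]> = [
  ["es", ["el", "la", "que", "los", "hola", "gracias", "una"]],
  ["pt", ["não", "uma", "você", "obrigado", "muito", "isso"]],
  ["fr", ["le", "les", "est", "une", "bonjour", "merci", "pas"]],
  ["de", ["der", "die", "und", "ist", "nicht", "danke"]],
  ["it", ["il", "che", "non", "sono", "ciao", "grazie"]],
];

function detectLanguage(text: string): Detection {
  // Pass 1: a single character from a distinctive script settles it.
  for (const [language, pattern] of SCRIPTS) {
    if (pattern.test(text)) return { language, confidence: 0.9 };
  }

  // Pass 2: count stop-word hits per Latin-script language.
  const words = text.toLowerCase().match(/[a-z\u00e0-\u00ff]+/g) ?? [];
  let best = "unknown";
  let bestHits = 0;
  for (const [language, list] of STOPWORDS) {
    const hits = words.filter((w) => list.indexOf(w) !== -1).length;
    if (hits > bestHits) {
      best = language;
      bestHits = hits;
    }
  }
  // Confidence is the share of words that matched the winning list.
  return { language: best, confidence: words.length ? bestHits / words.length : 0 };
}
```

Short posts give the word-frequency pass very little to count, which is one reason a low-confidence result is better handed to a second layer than acted on directly.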

Challenges I faced

Domain allowlisting: Devvit only permits api.openai.com and generativelanguage.googleapis.com for external HTTP — Claude API (api.anthropic.com) was blocked. I pivoted from Claude Haiku to Gemini, then built the hybrid heuristic engine as a resilient fallback
Devvit Blocks constraints: No HTML/CSS — all UI is built with vstack, hstack, text, and button primitives. Achieving a trust & safety console look within these constraints required creative use of background colors, padding, and text hierarchy
API version compatibility: context.ai.chat() wasn't available in the Devvit SDK version that supported Blocks UI (useState, useAsync). Resolved by using direct HTTP fetch to Gemini's REST API
Playtest vs production: Settings, HTTP fetch, and Redis behaved differently between playtest and uploaded versions — required extensive debugging with Redis-based debug markers
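
The direct-fetch workaround can be sketched as below. The URL follows the public `generateContent` REST endpoint shape for Gemini as I understand it; the prompt wording, the helper names (`buildRequest`, `extractText`, `askGemini`), and the injected `FetchLike` type are assumptions for illustration. Injecting the fetch function is a choice made here so the sketch can be exercised without network access; it is not necessarily how the project calls it.

```typescript
const GEMINI_URL =
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent";

interface RequestInitLike {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Minimal slice of the fetch API that this sketch relies on.
type FetchLike = (
  url: string,
  init: RequestInitLike,
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

interface GeminiResponse {
  candidates?: Array<{ content?: { parts?: Array<{ text?: string }> } }>;
}

function buildRequest(text: string, apiKey: string): { url: string; init: RequestInitLike } {
  // Hypothetical prompt; the real prompt is not shown in the write-up.
  const prompt =
    "Identify the language of this Reddit post. Reply with the ISO 639-1 code only.\n\n" + text;
  return {
    url: `${GEMINI_URL}?key=${encodeURIComponent(apiKey)}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
    },
  };
}

function extractText(res: GeminiResponse): string | null {
  const text = res.candidates?.[0]?.content?.parts?.[0]?.text;
  return text ? text.trim() : null;
}

async function askGemini(
  text: string,
  apiKey: string,
  fetchFn: FetchLike,
): Promise<string | null> {
  const { url, init } = buildRequest(text, apiKey);
  try {
    const response = await fetchFn(url, init);
    if (!response.ok) return null; // caller falls back to the heuristic result
    return extractText((await response.json()) as GeminiResponse);
  } catch {
    return null; // network failure is treated the same as an API error
  }
}
```

Returning `null` on any failure, instead of throwing, lets the caller treat "no AI answer" as a normal case and continue with the heuristic result.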

What I learned

Resilience > perfection: The 3-layer architecture (heuristic → AI → fallback) makes the tool useful even when external APIs fail. This is critical for moderation tools that need to work reliably
Explainability matters: Showing why the AI flagged something (semantic signals, confidence %, decision trace) builds moderator trust far more than just showing a recommendation
Human-in-the-loop is a feature: Tracking when moderators override AI recommendations produces a feedback signal that can be used to improve the system over time
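
As a small illustration of what override tracking makes possible, the sketch below computes how often moderators went with the recommendation. The `DecisionLog` shape and the function names are hypothetical; the write-up does not show how overrides are actually stored.

```typescript
type Action = "remove" | "approve" | "review";

// Hypothetical log entry: what was recommended and what the mod chose.
interface DecisionLog {
  postId: string;
  recommended: Action;
  chosen: Action;
}

function isOverride(log: DecisionLog): boolean {
  return log.recommended !== log.chosen;
}

// Share of decisions where the moderator followed the recommendation,
// or null when there is no data yet.
function agreementRate(logs: DecisionLog[]): number | null {
  if (logs.length === 0) return null;
  const agreed = logs.filter((l) => !isOverride(l)).length;
  return agreed / logs.length;
}
```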

What's next for LinguaGuard

Full Gemini integration: Once domain approval is confirmed, real-time AI translation and analysis will replace heuristic detection for ambiguous cases
Feedback dashboard: Aggregate override data to show moderators how often the AI agrees with their decisions
Auto-moderation mode: For high-confidence, repeat-offender cases, allow auto-removal with mod notification (opt-in)
Multi-rule support: Extend beyond language rules to detect other policy violations using the same Signal → Interpretation → Action framework
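
The planned auto-moderation mode is essentially a gate in front of the remove action. A possible shape for that gate is sketched below; since the feature does not exist yet, every name and threshold here (`AutoModSettings`, `shouldAutoRemove`, 0.95, 3 prior removals) is an assumption chosen for illustration.

```typescript
// Hypothetical per-subreddit settings; disabled by default because the mode is opt-in.
interface AutoModSettings {
  enabled: boolean;
  minConfidence: number; // 0..1
  minPriorRemovals: number;
}

interface Candidate {
  confidence: number; // detection confidence for this post
  priorRemovals: number; // author's earlier removals on record
}

const DEFAULT_SETTINGS: AutoModSettings = {
  enabled: false,
  minConfidence: 0.95,
  minPriorRemovals: 3,
};

function shouldAutoRemove(c: Candidate, s: AutoModSettings = DEFAULT_SETTINGS): boolean {
  // All three conditions must hold; anything else goes to a human.
  return s.enabled && c.confidence >= s.minConfidence && c.priorRemovals >= s.minPriorRemovals;
}
```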