๐ŸŒ Clarity AI โ€” Real-Time Social Media Moderation

## 💡 Inspiration

Social media connects billions of people, but it also amplifies harmful behaviors: misinformation, aggressive communication, and privacy oversharing. What stood out to us is that most moderation systems are reactive: they act after harmful content has already been posted and is spreading. We wanted to flip that model. Our goal was to build a system that intervenes before a post is published, helping users understand the impact of their content in real time and guiding them toward safer, more responsible communication.

## 🚀 What It Does

Clarity AI is a real-time moderation assistant that integrates directly into the social media experience through a browser extension. As a user types, the system analyzes their content across four dimensions:

- 🔒 Privacy risk
- ⚠️ Misinformation potential
- 💬 Communication tone
- 🧠 Emotional impact

It then:

- Assigns moderation categories (e.g., aggression, misinformation, privacy risk)
- Calculates a weighted risk score
- Tracks user behavior over time
- Generates a safer rewritten version of the content

This transforms moderation from punishment into proactive guidance.

## 🧠 How We Built It

Our system is composed of three main components:

### 1. Frontend (Browser Extension)

- Built as a Chrome extension
- Injects directly into social media platforms (e.g., Instagram)
- Detects user input in real time
- Displays:
  - risk scores
  - moderation categories
  - rewrite suggestions
  - cumulative user behavior

### 2. Backend (FastAPI on Render)

- Handles all moderation logic
- Processes incoming text from the extension
- Performs:
  - AI-based analysis using Gemini
  - rule-based privacy detection
  - scoring and categorization
- Stores user history (in-memory for the prototype)

We first developed the backend locally on localhost, then deployed it to Render to make it publicly accessible.

### 3. AI Layer (Google Gemini API)

We used Gemini to:

- Understand tone and intent beyond keywords
- Detect subtle misinformation patterns
- Generate human-like rewritten responses

This allowed us to move beyond simple keyword filters and build a system that understands language in context.

## 📊 Risk Scoring Model

We designed a scoring system to quantify harmful content:

$$
\text{Post Risk Score} = \sum_{i=1}^{n} (\text{Severity}_i \times \text{Weight}_i)
$$

Each category (e.g., aggression, misinformation) has:

- a severity score (0–10)
- a weight based on impact

Scores are combined across categories to produce a total risk value. This allows the system to be:

- explainable
- consistent
- scalable
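As a minimal sketch, the weighted sum above can be implemented in a few lines. The category weights below are illustrative placeholders, not our tuned production values:

```python
# Illustrative risk scoring: each detected category carries a severity
# (0-10), which is multiplied by a per-category impact weight and summed.
CATEGORY_WEIGHTS = {          # hypothetical weights, for illustration only
    "aggression": 1.5,
    "misinformation": 2.0,
    "privacy_risk": 1.8,
}

def post_risk_score(severities: dict[str, float]) -> float:
    """Sum of Severity_i * Weight_i over every detected category."""
    return sum(
        severity * CATEGORY_WEIGHTS.get(category, 1.0)
        for category, severity in severities.items()
    )

score = post_risk_score({"aggression": 4, "privacy_risk": 7})
# 4 * 1.5 + 7 * 1.8 = 18.6
```

Keeping the weights in a single table is what makes the score explainable: every point in the total can be traced back to one category and one weight.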

## 🔁 Behavior Over Time

Instead of evaluating just one post, we track patterns of behavior:

- Each post contributes weighted points
- Users accumulate risk over time
- The system assigns moderation levels (e.g., warning, flagged, review)

This makes the system more realistic and closer to real-world moderation pipelines.
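The accumulation step can be sketched as a small per-user ledger. The thresholds below are hypothetical values chosen for illustration, not the ones the prototype uses:

```python
# Sketch of cumulative behavior tracking: each post's risk score is added
# to a running total, and the total maps to a moderation level.
# Thresholds are hypothetical and would be tuned in a real deployment.
THRESHOLDS = [(30.0, "review"), (15.0, "flagged"), (5.0, "warning")]

class UserHistory:
    def __init__(self) -> None:
        self.total_risk = 0.0

    def record_post(self, risk_score: float) -> str:
        """Accumulate one post's weighted risk; return the moderation level."""
        self.total_risk += risk_score
        for threshold, level in THRESHOLDS:   # highest threshold first
            if self.total_risk >= threshold:
                return level
        return "ok"

history = UserHistory()
history.record_post(3.0)   # total 3.0  -> "ok"
history.record_post(8.0)   # total 11.0 -> "warning"
```

The prototype keeps this state in memory; a persistent store would replace the `UserHistory` object per user.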

๐ŸŒ Deployment We built the project in stages: Local Development (localhost) Fast iteration on scoring logic and Gemini integration Backend Deployment (Render) Exposed our API publicly Enabled real-time communication with the extension Frontend Deployment (Vercel) Hosted frontend components Demonstrated scalability beyond local development Browser Extension Integration Brought the system directly into the user workflow

โš ๏ธ Challenges We Faced

### 1. Real-Time Integration with Social Media

Social media platforms don't provide clean APIs for input detection, so we had to:

- dynamically detect input fields
- handle changing DOM structures
- ensure consistent behavior across interactions

### 2. Balancing AI and Rule-Based Logic

We needed to combine:

- AI understanding (Gemini)
- deterministic checks (privacy detection)

Ensuring both worked together smoothly required careful design.
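One way to merge the two signals is to run the deterministic checks first and let rule hits override the model's output. The sketch below is illustrative: the regex patterns and the stubbed `ai_severities` input are placeholders, not our actual detection rules:

```python
import re

# Hypothetical rule-based privacy patterns (illustrative, not exhaustive).
PRIVACY_PATTERNS = [
    re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),        # US-style phone number
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
]

def detect_privacy_risk(text: str) -> bool:
    """Deterministic check: does the text leak contact details?"""
    return any(p.search(text) for p in PRIVACY_PATTERNS)

def moderate(text: str, ai_severities: dict[str, float]) -> dict[str, float]:
    """Merge AI-derived category severities with rule-based privacy detection."""
    severities = dict(ai_severities)
    if detect_privacy_risk(text):
        # A rule hit is certain, so it floors the privacy severity at a
        # high value regardless of what the model reported.
        severities["privacy_risk"] = max(severities.get("privacy_risk", 0), 9.0)
    return severities

result = moderate("call me at 555-123-4567", {"aggression": 2.0})
# {'aggression': 2.0, 'privacy_risk': 9.0}
```

Treating rule hits as a floor rather than a replacement keeps the deterministic and AI layers from fighting each other: the model can only raise a severity the rules have established, never lower it.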

### 3. API and Deployment Issues

We encountered:

- model compatibility issues with Gemini
- dependency conflicts during deployment
- CORS and frontend-backend communication errors
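For the CORS errors specifically, a FastAPI backend needs the CORS middleware configured before a browser extension or frontend on another origin can call it. A minimal configuration sketch (the allowed origin is a placeholder, not our deployed URL):

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Allow the frontend/extension origins to reach the API.
# The origin below is a placeholder, not the project's real URL.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://example-frontend.vercel.app"],
    allow_methods=["*"],
    allow_headers=["*"],
)
```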

### 4. Defining a Fair Scoring System

Designing weights and severity levels required:

- balancing different types of harm
- ensuring explainability
- avoiding over-penalization

## 📚 What We Learned

- How to build and deploy a full-stack AI application
- How to integrate LLMs (Gemini) into real-time systems
- The importance of explainability in AI moderation
- How to design scalable scoring systems
- How to move from prototype to deployed product

๐ŸŒ Impact Clarity AI directly addresses the Base44 challenge by: Reducing misinformation before it spreads Improving mental wellbeing through tone awareness Preventing privacy oversharing Encouraging more meaningful and respectful communication Instead of reacting after harm occurs, we intervene at the moment of creation.

## 🔮 Future Work

We plan to expand Clarity AI by:

- Integrating directly with platforms like Instagram and TikTok
- Supporting image and video analysis
- Adding persistent databases for long-term tracking
- Building dashboards for moderators
- Personalizing AI responses based on user behavior

๐Ÿ Final Thought Clarity AI transforms moderation into a real-time, user-centered experience. From reactive moderation โ†’ proactive guidance.
