# Clarity AI: Real-Time Social Media Moderation
## Inspiration

Social media connects billions of people, but it also amplifies harmful behavior: misinformation, aggressive communication, and privacy oversharing. What stood out to us is that most moderation systems are reactive: they act after harmful content is already posted and spreading. We wanted to flip that model. Our goal was to build a system that intervenes before a post is published, helping users understand the impact of their content in real time and guiding them toward safer, more responsible communication.
## What It Does

Clarity AI is a real-time moderation assistant that integrates directly into the social media experience through a browser extension. As a user types, the system analyzes their content across four dimensions:

- Privacy risk
- Misinformation potential
- Communication tone
- Emotional impact
It then:

- Assigns moderation categories (e.g., aggression, misinformation, privacy risk)
- Calculates a weighted risk score
- Tracks user behavior over time
- Generates a safer rewritten version of the content

This transforms moderation from punishment into proactive guidance.
## How We Built It

Our system is composed of three main components:
- **Frontend (Browser Extension)**
  - Built as a Chrome extension
  - Injects directly into social media platforms (e.g., Instagram)
  - Detects user input in real time
  - Displays risk scores, moderation categories, rewrite suggestions, and cumulative user behavior
- **Backend (FastAPI on Render)**
  - Handles all moderation logic
  - Processes incoming text from the extension
  - Performs AI-based analysis using Gemini, rule-based privacy detection, and scoring + categorization
  - Stores user history (in-memory for the prototype)
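The rule-based side of the backend can be sketched as below. This is a minimal illustration, not our actual implementation: the pattern set, severities, and the `detect_privacy_risks` name are hypothetical, standing in for the deterministic privacy checks that run alongside Gemini's analysis.

```python
import re

# Hypothetical regex patterns for common privacy oversharing.
# The real backend pairs checks like these with Gemini's AI analysis.
PRIVACY_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def detect_privacy_risks(text: str) -> list[str]:
    """Return the names of privacy categories triggered by the text."""
    return [name for name, pattern in PRIVACY_PATTERNS.items()
            if pattern.search(text)]
```

Because these checks are deterministic, the same input always produces the same privacy flags, which complements the more flexible (but less predictable) AI analysis.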
We first developed this locally using localhost, then deployed it to Render to make it publicly accessible.
- **AI Layer (Google Gemini API)**
  We used Gemini to:
  - Understand tone and intent beyond keywords
  - Detect subtle misinformation patterns
  - Generate human-like rewritten responses
This allowed us to move beyond simple filters and build a system that actually understands language context.
## Risk Scoring Model

We designed a scoring system to quantify harmful content:

$$\text{Post Risk Score} = \sum_{i=1}^{n} \left( \text{Severity}_i \times \text{Weight}_i \right)$$

Each category (e.g., aggression, misinformation) has:

- a severity score (0–10)
- a weight based on impact

Scores are combined to produce a total risk value. This allows the system to be:

- explainable
- consistent
- scalable
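The formula above can be sketched in a few lines. The category names and weight values here are illustrative placeholders, not our tuned production weights:

```python
# Illustrative impact weights per moderation category (not the real values).
WEIGHTS = {"aggression": 1.5, "misinformation": 2.0, "privacy": 1.0}

def post_risk_score(severities: dict[str, float]) -> float:
    """Sum of each category's severity (0-10) times its impact weight."""
    return sum(sev * WEIGHTS[cat] for cat, sev in severities.items())
```

For example, a post scored at severity 4 for aggression and 6 for privacy would yield 4 × 1.5 + 6 × 1.0 = 12.0, and the same inputs always produce the same score, which is what makes the model explainable.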
## Behavior Over Time

Instead of evaluating just one post, we track patterns of behavior:

- Each post contributes weighted points
- Users accumulate risk over time
- The system assigns moderation levels (e.g., warning, flagged, review)

This makes the system more realistic and closer to real-world moderation pipelines.
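A minimal sketch of the accumulation idea, assuming made-up thresholds (the real levels and cutoffs live in the backend's in-memory history):

```python
# Hypothetical thresholds mapping cumulative risk to a moderation level,
# checked from most to least severe.
LEVELS = [(30.0, "review"), (15.0, "flagged"), (5.0, "warning")]

class UserHistory:
    """Tracks one user's accumulated risk across posts."""

    def __init__(self) -> None:
        self.total = 0.0

    def add_post(self, risk_score: float) -> str:
        """Add a post's weighted risk and return the current moderation level."""
        self.total += risk_score
        for threshold, level in LEVELS:
            if self.total >= threshold:
                return level
        return "ok"
```

A user posting several mildly risky posts thus drifts from "ok" toward "warning" and beyond, even if no single post would trigger moderation on its own.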
## Deployment

We built the project in stages:

1. **Local development (localhost):** fast iteration on scoring logic and Gemini integration
2. **Backend deployment (Render):** exposed our API publicly and enabled real-time communication with the extension
3. **Frontend deployment (Vercel):** hosted frontend components and demonstrated scalability beyond local development
4. **Browser extension integration:** brought the system directly into the user workflow
## Challenges We Faced
### Real-Time Integration with Social Media

Social media platforms don't provide clean APIs for input detection, so we had to:

- dynamically detect input fields
- handle changing DOM structures
- ensure consistent behavior across interactions
### Balancing AI + Rule-Based Logic

We needed to combine:

- AI understanding (Gemini)
- deterministic checks (privacy detection)

Ensuring both worked together smoothly required careful design.
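One way to combine the two sources is to take the union of their findings and keep the higher severity when both flag the same category. This sketch is our illustration of that idea, with a made-up fixed severity for rule hits:

```python
# Hypothetical severity assigned to deterministic rule hits; AI severities
# come from the model's own 0-10 judgment.
RULE_SEVERITY = 8.0

def merge_findings(ai_scores: dict[str, float],
                   rule_hits: list[str]) -> dict[str, float]:
    """Union of AI and rule-based categories, keeping the higher severity."""
    merged = dict(ai_scores)
    for category in rule_hits:
        merged[category] = max(merged.get(category, 0.0), RULE_SEVERITY)
    return merged
```

This keeps the deterministic checks authoritative (a regex hit can raise a severity but the AI can never lower it below the rule's floor), which made the combined behavior easier to reason about.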
### API & Deployment Issues

We encountered:

- model compatibility issues with Gemini
- dependency conflicts during deployment
- CORS and frontend-backend communication errors
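For the CORS errors, FastAPI's built-in `CORSMiddleware` is one way to let the extension call the Render-hosted API; the origin string below is a placeholder, not our real extension ID:

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Allow the browser extension's origin to reach the backend.
# "<extension-id>" is a placeholder, not an actual extension ID.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["chrome-extension://<extension-id>"],
    allow_methods=["*"],
    allow_headers=["*"],
)
```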
### Defining a Fair Scoring System

Designing weights and severity levels required:

- balancing different types of harm
- ensuring explainability
- avoiding over-penalization
## What We Learned

- How to build and deploy a full-stack AI application
- How to integrate LLMs (Gemini) into real-time systems
- The importance of explainability in AI moderation
- How to design scalable scoring systems
- How to transition from prototype to deployed product
## Impact

Clarity AI directly addresses the Base44 challenge by:

- Reducing misinformation before it spreads
- Improving mental wellbeing through tone awareness
- Preventing privacy oversharing
- Encouraging more meaningful and respectful communication

Instead of reacting after harm occurs, we intervene at the moment of creation.
## Future Work

We plan to expand Clarity AI by:

- Integrating directly with platforms like Instagram and TikTok
- Supporting image and video analysis
- Adding persistent databases for long-term tracking
- Building dashboards for moderators
- Personalizing AI responses based on user behavior
## Final Thought

Clarity AI transforms moderation into a real-time, user-centered experience. From reactive moderation to proactive guidance.