Inspiration

I spend a lot of time on social media — Reddit, TikTok, Instagram, and Chinese Douyin. Over time, I noticed that algorithms heavily reward emotional engagement, especially anger. The more time I spent in comment sections, the more toxic content I was shown.

On Douyin especially, the recommendation system aggressively optimizes for watch time and interaction. The posts that make people angry generate the highest engagement. For me, that meant feminism-related videos filled with misogyny, hate speech, and hostile generalizations about women.

I wanted to stay informed and engage thoughtfully. Instead, I kept getting pulled into ragebait loops that left me exhausted and less informed than before.

Social media platforms profit from outrage. Hate spreads because hate drives engagement. There is currently no protective layer between users and the most toxic parts of online discourse. EquiFlow was built to create that layer.

What EquiFlow Does

EquiFlow is an autonomous ethical AI agent that monitors live Reddit trends, detects harmful bias and hate speech, and rewrites toxic discussions into neutral, factual summaries.

The goal is not censorship. The goal is to help users understand what people are talking about — without being exposed to the hate driving the conversation.

Instead of amplifying outrage, EquiFlow extracts the actual signal from the noise.

How We Built It

EquiFlow uses a multi-agent architecture in which seven autonomous agents coordinate:

  • Trend Agent — Collects live Reddit trends in real time
  • Ethics Agent — Detects toxicity, sexism, racism, and harmful bias using NVIDIA Nemotron
  • Action Agent — Decides whether content should be shown, warned, summarized, rewritten, or skipped
  • Rewrite Agent — Converts harmful posts into neutral summaries while preserving informational context
  • Preference Agent — Learns from user feedback and adjusts moderation sensitivity over time
  • Trend Summary Agent — Detects larger harmful narrative patterns across multiple discussions
  • Report Agent — Generates dashboard insights and moderation analytics
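At a high level, these agents form a pipeline: the Trend Agent fetches posts, the Ethics Agent scores them, and the Action Agent routes each post to one of the five outcomes. A minimal sketch of that routing logic (the class names, score fields, and thresholds here are illustrative assumptions, not EquiFlow's actual code):

```python
from dataclasses import dataclass
from enum import Enum

class Action(Enum):
    SHOW = "show"
    WARN = "warn"
    SUMMARIZE = "summarize"
    REWRITE = "rewrite"
    SKIP = "skip"

@dataclass
class EthicsReport:
    toxicity: float       # 0.0 (clean) .. 1.0 (severe), from the Ethics Agent
    informational: float  # how much factual signal the post carries

def decide(report: EthicsReport) -> Action:
    """Illustrative Action Agent policy: route a post based on its ethics scores."""
    if report.toxicity < 0.2:
        return Action.SHOW
    if report.toxicity < 0.4:
        return Action.WARN
    # Toxic but informative content gets rewritten or summarized, not dropped,
    # so the informational signal survives while the hate does not.
    if report.informational >= 0.5:
        return Action.REWRITE if report.toxicity < 0.7 else Action.SUMMARIZE
    return Action.SKIP
```

In the real system, the Preference Agent would adjust these thresholds per user based on feedback rather than hard-coding them.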

Every LLM request passes through NeMo Guardrails, with both input and output rails enforced for safety. The entire system runs inside a NemoClaw OpenShell sandbox on NVIDIA Brev, with restricted network permissions allowing access only to Reddit and NVIDIA APIs.

This means EquiFlow is protected at two levels:

  • At the AI prompt level via NeMo Guardrails
  • At the operating-system level via OpenShell network and filesystem policies
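A NeMo Guardrails setup with both input and output rails enforced is typically declared in a `config.yml` along these lines. This is a hedged sketch: the engine name and the exact flows in EquiFlow's configuration may differ, though `self check input` and `self check output` are standard built-in flows.

```yaml
models:
  - type: main
    engine: nvidia_ai_endpoints
    model: meta/llama-3.1-8b-instruct

rails:
  input:
    flows:
      - self check input   # screen user/agent prompts before they reach the LLM
  output:
    flows:
      - self check output  # screen LLM responses before they reach the user
```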

Challenges We Faced

The hardest challenge was not simply detecting hate speech — it was handling nuance.

Some language is not objectively harmful, but highly audience-dependent. For example:

"Chinese New Year" vs "Lunar New Year"

Neither phrase is inherently wrong, but each carries different cultural implications depending on context and audience. EquiFlow distinguishes between genuine harmful rhetoric, emotionally charged disagreement, and culturally nuanced language — explaining the nuance rather than just flagging the content.

Other major challenges included:

  • Configuring NeMo Guardrails correctly with multi-agent workflows
  • Balancing moderation thresholds without over-filtering legitimate discussion
  • Deploying a full-stack AI system on NVIDIA Brev with CORS handling, port forwarding, and isolated environments
  • Writing neutral rewrites that preserve informational value without amplifying harmful messaging
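The threshold-balancing challenge maps naturally onto the Preference Agent: each piece of user feedback nudges moderation sensitivity up or down within bounds. A minimal illustration of that idea (hypothetical names and step size, not EquiFlow's implementation):

```python
class PreferenceAgent:
    """Adjusts moderation sensitivity from user feedback via a bounded step rule."""

    def __init__(self, threshold: float = 0.4, step: float = 0.05):
        self.threshold = threshold  # toxicity score above which content is filtered
        self.step = step

    def feedback(self, over_filtered: bool) -> float:
        """over_filtered=True means the user wanted content we hid:
        raise the threshold so similar content passes next time.
        over_filtered=False means toxic content slipped through: tighten."""
        if over_filtered:
            self.threshold = min(1.0, self.threshold + self.step)
        else:
            self.threshold = max(0.0, self.threshold - self.step)
        return self.threshold
```

Clamping to [0.0, 1.0] keeps repeated feedback from driving the system into filtering everything or nothing.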

What We Learned

Building EquiFlow changed how I think about AI safety and online platforms.

Ethical AI is not just about filtering outputs — it's about designing systems that reduce harm without removing context or silencing discussion.

AI security also goes beyond prompting and moderation layers. True safety includes restricting what autonomous agents can physically access at the infrastructure level. Using NemoClaw's OpenShell sandbox reinforced how important operating-system-level constraints are for building trustworthy AI systems.

The Bigger Picture

EquiFlow started as a hackathon project, but the problem it addresses is much larger. Students, researchers, and everyday users deserve a way to stay informed about social issues, world events, and online trends — without being manipulated by algorithms designed to maximize outrage.

EquiFlow offers an alternative: a system that helps users understand online conversations without drowning in hate, ragebait, and toxic engagement loops.


EquiFlow — Understand what's trending. Without the hate.

Built With

  • aiosqlite
  • fastapi
  • feedparser
  • nemo-guardrails
  • nemoclaw
  • nvidia-brev (cloud deployment)
  • openclaw
  • openshell
  • python
  • python-dotenv
  • react
  • reddit-rss-api
  • sqlalchemy
  • nvidia-nemotron (meta/llama-3.1-8b-instruct via NVIDIA API)
  • sqlite
  • vite