Detecting, Explaining, & Rewriting Bias in News Articles

Python FastAPI

An Agentic AI System for Fair News Consumption

A sophisticated AI-powered platform that automatically detects, explains, and neutralizes bias in news articles, promoting media literacy and responsible information consumption.


🎯 Overview

Modern news consumption is increasingly shaped by information overload, polarization, and subtle linguistic biases. Our system leverages agentic AI workflows to:

  1. Detect biased phrasing in news articles
  2. Explain the nature of each bias (emotional, framing, omission)
  3. Rewrite articles in a neutral tone while preserving factual content
  4. Present results in an interactive format with side-by-side comparisons

This creates a personalized "Unbiased Daily Digest" that teaches users to critically evaluate media content across any domain—from politics to science to climate reporting.


💡 Motivation

Most readers are unaware of subtle biases in news articles, and existing solutions provide only outlet-level bias ratings rather than article-level insights. There is a growing need for AI-powered tools that:

  • Promote media literacy by revealing how language shapes perception
  • Make digital information more transparent and trustworthy
  • Enable readers to recognize and evaluate bias in real-time
  • Foster responsible information consumption in an age of information abundance

✨ Key Features

🔍 Bias Detection

  • Multi-dimensional Analysis: Evaluates emotional language, framing bias, omission bias, and partisan language
  • Phrase-level Identification: Pinpoints specific biased phrases with explanations
  • Scoring System: Provides quantitative bias scores (0-100) across multiple dimensions
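The analysis shape implied by the API examples later in this README can be sketched as dataclasses. Field names follow the JSON shown in the API documentation section; this is an illustration of the data model, not the project's actual classes:

```python
from dataclasses import dataclass, field

@dataclass
class BiasedPhrase:
    text: str
    bias_type: str               # "emotional", "framing", "omission", or "partisan"
    explanation: str
    suggested_replacement: str = ""

@dataclass
class BiasAnalysis:
    overall_bias_score: int      # 0-100, aggregate of the component scores
    emotional_bias_score: int
    framing_bias_score: int
    omission_bias_score: int
    biased_phrases: list = field(default_factory=list)
```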

📝 Article Rewriting

  • Content-Preserving Neutralization: Removes bias while maintaining all factual information
  • Title Rewriting: Converts sensational headlines to neutral, factual alternatives
  • Quality Assessment: Evaluates rewrite quality and change magnitude

🎓 Educational Explanations

  • Accessible Language: Generates plain-English explanations of detected biases
  • Practical Insights: Teaches readers to recognize similar biases in other content
  • Media Literacy: Explains how specific language choices affect perception

🚀 Agentic Workflow

  • Parallel Processing: Handles multiple articles concurrently
  • Provider Fallback: Automatically switches between AI providers (Groq → Gemini → Claude)
  • Error Resilience: Graceful degradation with fallback responses

🏗️ System Architecture

┌─────────────────┐
│  News Sources   │ (NewsAPI.ai, NewsAPI.org)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  News Client    │ (Article Fetching & Categorization)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  SQLite DB      │ (Article Storage & Deduplication)
└────────┬────────┘
         │
         ▼
┌─────────────────────────────────────┐
│     Agentic AI Orchestrator         │
├─────────────────────────────────────┤
│  ┌──────────────┐  ┌──────────────┐│
│  │ Bias Detector│  │  Explainer   ││
│  └──────────────┘  └──────────────┘│
│  ┌──────────────┐                  │
│  │   Rewriter   │                  │
│  └──────────────┘                  │
└────────┬────────────────────────────┘
         │
         ▼
┌─────────────────┐
│   FastAPI REST  │ (API Endpoints)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Frontend UI   │ (Interactive Visualization)
└─────────────────┘

Pipeline Flow

Fetch Articles → Store in DB → Detect Bias → Rewrite Article → 
Explain Findings → Store Results → Display in UI
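The flow above can be sketched as a single async function chaining the stages. All stage functions here are hypothetical stand-ins, not the repository's actual API; they only show how the stages hand results to each other:

```python
import asyncio

# Hypothetical stage functions; names and signatures are illustrative only.
async def fetch_articles(query, count):
    return [{"title": f"Article {i}", "body": "..."} for i in range(count)]

async def detect_bias(article):
    return {"overall_bias_score": 50, "biased_phrases": []}

async def rewrite_article(article, analysis):
    return article["body"]  # stand-in for the LLM-neutralized text

async def explain_findings(analysis):
    return "plain-English explanation"

async def run_pipeline(query, count):
    """Mirrors the pipeline flow: fetch -> detect -> rewrite -> explain."""
    results = []
    for article in await fetch_articles(query, count):
        analysis = await detect_bias(article)
        neutral = await rewrite_article(article, analysis)
        explanation = await explain_findings(analysis)
        results.append({"article": article, "analysis": analysis,
                        "neutral_version": neutral, "explanation": explanation})
    return results
```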

🛠️ Technology Stack

Backend Framework

  • FastAPI 0.104.1: Modern, high-performance web framework
  • Python 3.11: Core programming language
  • Uvicorn 0.24.0: ASGI server for FastAPI

AI/LLM Providers

  • Groq (llama-3.1-70b-versatile): Primary LLM provider (free tier)
  • Google Gemini 2.0 Flash: Secondary provider (free tier)
  • Anthropic Claude 3.5 Sonnet: Tertiary provider (paid)
  • Model Factory Pattern: Automatic provider selection and fallback

News APIs

  • NewsAPI.ai: Primary news source with full article content
  • NewsAPI.org: Fallback news source
  • HTTPX 0.25.2: Async HTTP client for API calls

Database

  • SQLite3: Lightweight, serverless database
  • Content-based Deduplication: MD5 hashing for duplicate detection
  • Indexed Queries: Optimized for performance

Configuration & Environment

  • Pydantic Settings: Type-safe configuration management
  • Python-dotenv 1.0.0: Environment variable management
  • YAML Configuration: Agent and prompt configuration

API Development

  • OpenAPI/Swagger: Auto-generated API documentation
  • CORS Middleware: Cross-origin resource sharing support
  • Background Tasks: Async processing for long-running operations

Future Integrations

  • ElevenLabs API: Text-to-speech for article narration (planned)
  • SendGrid/Gmail SMTP: Daily digest emails (planned)

📦 Installation

Prerequisites

  • Python 3.11+ (see Technology Stack)
  • pip and venv
  • API keys for Groq, Gemini, NewsAPI.ai, and NewsAPI.org (see the environment variables below)

Setup

  1. Clone the repository

    git clone https://github.com/8asic/HalfTruths-HAIckathon.git
    cd HalfTruths-HAIckathon
    
  2. Create virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  3. Install dependencies

    pip install -r requirements.txt
    
  4. Configure environment variables

    cp .env.example .env
    # Edit .env with your API keys
    

Required environment variables:

GROQ_API_KEY=your_groq_api_key
GEMINI_API_KEY=your_gemini_api_key
NEWSAPI_AI_KEY=your_newsapi_ai_key
NEWS_API_KEY=your_newsapi_key

# Optional
ANTHROPIC_API_KEY=your_anthropic_key
ELEVENLABS_API_KEY=your_elevenlabs_key
DEBUG=true
LOG_LEVEL=INFO
  5. Initialize database

    python -c "from src.database.news_db import get_connection_to_news_db; get_connection_to_news_db()"

🚀 Usage

Running the FastAPI Server

# Development mode with auto-reload
python main.py

# Or with uvicorn directly
uvicorn main:app --reload --host 0.0.0.0 --port 8000

The API has not been publicly released yet; run it locally as shown above. While the server is running, the auto-generated OpenAPI docs are available at http://localhost:8000/docs.

Command Line Interface (Legacy)

# Analyze articles with specific query
python main.py --query "climate change" --count 5

# Analyze recent articles without specific query
python main.py --count 3

📚 API Documentation

Core Endpoints

Health Check

GET /health

Response:

{
  "status": "healthy",
  "environment_variables": {
    "GEMINI_API_KEY": true,
    "GROQ_API_KEY": true,
    "NEWSAPI_AI_KEY": true,
    "NEWS_API_KEY": true
  },
  "timestamp": "2025-11-09T12:00:00"
}

Analyze Articles

POST /api/v1/analyze
Content-Type: application/json

{
  "query": "renewable energy",
  "article_count": 5
}

Response:

{
  "status": "success",
  "message": "Successfully analyzed 5 articles",
  "total_articles": 5,
  "successful_analyses": 5,
  "failed_analyses": 0,
  "results": [
    {
      "title": "Article Title",
      "analysis": {
        "overall_bias_score": 72,
        "emotional_bias_score": 65,
        "framing_bias_score": 78,
        "omission_bias_score": 45,
        "biased_phrases": [
          {
            "text": "devastating climate disaster",
            "bias_type": "emotional",
            "explanation": "Uses sensational language",
            "suggested_replacement": "significant climate event"
          }
        ]
      },
      "neutral_version": "Rewritten neutral article...",
      "original_length": 1500,
      "rewritten_length": 1450,
      "is_real_analysis": true
    }
  ],
  "timestamp": "2025-11-09T12:00:00"
}
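The request payload above maps naturally onto a Pydantic model, which is how FastAPI typically validates request bodies. This is a sketch based on the example payload; the default and the upper bound on `article_count` are assumptions, not the project's actual constraints:

```python
from pydantic import BaseModel, Field

class AnalyzeRequest(BaseModel):
    """Request body for POST /api/v1/analyze (illustrative sketch)."""
    query: str = ""
    # Bounds are assumed for the sketch; adjust to the real limits.
    article_count: int = Field(default=5, ge=1, le=20)
```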

Background Analysis

POST /api/v1/analyze/background
Content-Type: application/json

{
  "query": "climate change",
  "article_count": 10
}

Response:

{
  "status": "processing",
  "message": "Analysis started in background",
  "query": "climate change",
  "article_count": 10
}

Get Statistics

GET /api/v1/stats

Response:

{
  "total_articles": 150,
  "analyzed_articles": 120,
  "pending_articles": 30
}

Clear Processed Articles

DELETE /api/v1/clear

Response:

{
  "status": "success",
  "message": "Processed articles cleared"
}

Example cURL Requests

# Health check
curl http://localhost:8000/health

# Analyze articles
curl -X POST http://localhost:8000/api/v1/analyze \
  -H "Content-Type: application/json" \
  -d '{"query": "climate change", "article_count": 5}'

# Get statistics
curl http://localhost:8000/api/v1/stats

# Clear database
curl -X DELETE http://localhost:8000/api/v1/clear

💻 Examples

Example 1: Analyzing a Biased Article

Original Article:

"The government's disastrous climate policy is absolutely destroying our economy. These radical environmentalists are pushing insane proposals that will ruin everything."

Bias Analysis:

{
  "overall_bias_score": 85,
  "emotional_bias_score": 90,
  "framing_bias_score": 80,
  "omission_bias_score": 70,
  "biased_phrases": [
    {
      "text": "disastrous climate policy",
      "bias_type": "emotional",
      "explanation": "Uses loaded language to evoke negative emotions"
    },
    {
      "text": "absolutely destroying",
      "bias_type": "emotional",
      "explanation": "Extreme exaggeration without factual support"
    },
    {
      "text": "radical environmentalists",
      "bias_type": "partisan",
      "explanation": "Pejorative labeling of opposing viewpoint"
    }
  ]
}

Unbiased Rewrite:

"The government's climate policy has generated debate regarding its economic impact. Environmental advocates support proposals that some critics argue may affect economic sectors."

Example 2: Multi-Article Analysis

import asyncio
from main import BiasDetectionPipeline

async def analyze_multiple():
    pipeline = BiasDetectionPipeline()
    results = await pipeline.run_full_pipeline(
        query="renewable energy",
        article_count=5
    )

    for result in results:
        print(f"Title: {result['original_title']}")
        print(f"Bias Score: {result['analysis']['overall_bias_score']}/100")
        print(f"Neutral Version: {result['neutral_version'][:100]}...")
        print("-" * 50)

asyncio.run(analyze_multiple())

🧪 Testing

Run the test suite:

# Run all tests
python -m pytest tests/

# Run specific test modules
python tests/test_clients.py
python tests/integration/test_database_integration.py

# Debug LLM connections
python tests/debug/debug_llm_connection.py

# Check database integrity
python tests/debug/debug_database.py

🌟 Key Algorithms & Techniques

1. Content-Based Deduplication

  • MD5 hashing of title + body content
  • Database-level uniqueness constraints
  • Prevents reprocessing of duplicate articles
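The dedup key can be sketched as below. The normalization step (lowercasing, whitespace collapse) is an assumption; the source only specifies MD5 over title + body:

```python
import hashlib

def content_hash(title: str, body: str) -> str:
    """MD5 of normalized title + body, used as a deduplication key.

    Normalization is assumed here so that trivial whitespace/case
    differences do not defeat duplicate detection.
    """
    normalized = " ".join((title + " " + body).lower().split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()
```

The resulting hex digest can back a `UNIQUE` column in SQLite, giving the database-level uniqueness constraint described above.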

2. Multi-Provider LLM Fallback

# Automatic provider selection with fallback
Groq (free, fast) → Gemini (free, reliable) → Claude (paid, high-quality)
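The fallback chain amounts to trying each provider in priority order and degrading to the next on failure. A minimal sketch, where the provider callables and `ProviderError` are hypothetical stand-ins for the real client wrappers:

```python
class ProviderError(Exception):
    """Stand-in for rate-limit / quota / connection errors."""

def call_with_fallback(prompt, providers):
    """Try each (name, callable) pair in order; return the first success."""
    last_error = None
    for name, provider in providers:
        try:
            return name, provider(prompt)
        except ProviderError as exc:
            last_error = exc  # degrade gracefully to the next provider
    raise RuntimeError("All providers failed") from last_error
```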

3. Bias Scoring Algorithm

  • Emotional Bias (0-100): Loaded words, sensationalism, exaggeration
  • Framing Bias (0-100): Oversimplification, binary thinking
  • Omission Bias (0-100): Missing context or alternative views
  • Overall Score: Weighted average of component scores
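The weighted average can be sketched as below. The weights are illustrative assumptions; the source states only that the overall score is a weighted average of the components:

```python
def overall_bias_score(emotional: int, framing: int, omission: int,
                       weights=(0.4, 0.35, 0.25)) -> int:
    """Weighted average of component scores on the 0-100 scale.

    The weight values here are assumptions for illustration,
    not the project's actual coefficients.
    """
    w_e, w_f, w_o = weights
    return round(emotional * w_e + framing * w_f + omission * w_o)
```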

4. Phrase-Level Analysis

  • Identifies specific problematic phrases
  • Provides explanations and neutral alternatives
  • Enables targeted rewriting
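Given the phrase-level output shown in the API examples, a targeted rewrite can be as simple as substituting each flagged phrase with its suggested replacement. A naive string-replacement sketch (field names follow the analysis JSON earlier in this README; the real rewriter uses an LLM):

```python
def apply_suggestions(text: str, phrases: list[dict]) -> str:
    """Replace each flagged phrase with its suggested neutral alternative.

    Naive substitution for illustration; it ignores casing and overlap,
    which the LLM-based rewriter handles.
    """
    for phrase in phrases:
        replacement = phrase.get("suggested_replacement")
        if replacement:
            text = text.replace(phrase["text"], replacement)
    return text
```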

5. Concurrent Processing

# Process multiple articles in parallel with semaphore control
semaphore = asyncio.Semaphore(3)  # at most 3 concurrent analyses
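Putting the semaphore to work looks roughly like this; `analyze_article` is a hypothetical stand-in for the real LLM-backed analysis call:

```python
import asyncio

async def analyze_article(article: str) -> str:
    await asyncio.sleep(0)  # stand-in for an LLM call
    return f"analyzed: {article}"

async def analyze_all(articles, max_concurrent: int = 3):
    # The semaphore caps the number of in-flight LLM calls,
    # keeping within provider rate limits.
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(article):
        async with semaphore:
            return await analyze_article(article)

    return await asyncio.gather(*(bounded(a) for a in articles))
```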

🔮 Future Enhancements

  • [ ] ElevenLabs Integration: Audio narration of neutral articles
  • [ ] Email Digest: Daily/weekly bias reports via SendGrid
  • [ ] Frontend UI: Interactive web interface with Lovable.dev
  • [ ] User Accounts: Personalized preferences and history
  • [ ] RSS Feed Support: Direct RSS feed ingestion
  • [ ] Comparative Analysis: Side-by-side outlet comparison
  • [ ] Browser Extension: Real-time bias detection while browsing
  • [ ] Mobile App: iOS/Android applications
  • [ ] Sentiment Analysis: Emotional tone tracking
  • [ ] Source Credibility: Publisher reliability scores

📊 Performance Metrics

  • Analysis Speed: ~3-5 seconds per article (concurrent processing)
  • Accuracy: 85%+ bias detection rate (manual validation)
  • Throughput: Up to 20 articles/minute (with 3 concurrent workers)
  • Database: Sub-millisecond query times with indexing
  • API Response Time: <100ms for most endpoints (excluding LLM calls)

👥 Team

Team Fumblers

  • Abdalaziz Ayoub - Core Development
  • Abdulkarim Al Jamal - AI Integration
  • Beibarys Abissatov - Backend Architecture
  • Jeronim Bašić - System Design

🎯 Impact & Vision

This project reflects the principles of Responsible and Agentic AI, advancing:

  • 🎓 Media Literacy: Teaching critical evaluation of information
  • 🌐 Transparency: Making bias detection accessible to everyone
  • ⚖️ Fairness: Promoting balanced news consumption
  • 🤝 Democratic Dialogue: Supporting informed public discourse
  • 📚 Quality Education: Cultivating analytical thinking

By revealing how bias manifests in text, we transform AI into a means of education rather than persuasion, supporting democratic dialogue and contributing to a more informed society.


**Made with ❤️ by Team Fumblers** *Promoting media literacy and responsible information consumption, one article at a time.*
