Detecting, Explaining, & Rewriting Bias in News Articles
An Agentic AI System for Fair News Consumption
An AI-powered platform that detects, explains, and neutralizes bias in individual news articles, promoting media literacy and responsible information consumption.
🎯 Overview
Modern news consumption is increasingly shaped by information overload, polarization, and subtle linguistic biases. Our system leverages agentic AI workflows to:
- Detect biased phrasing in news articles
- Explain the nature of each bias (emotional, framing, omission)
- Rewrite articles in a neutral tone while preserving factual content
- Present results in an interactive format with side-by-side comparisons
This creates a personalized "Unbiased Daily Digest" that teaches users to critically evaluate media content across any domain—from politics to science to climate reporting.
💡 Motivation
Most readers are unaware of subtle biases in news articles, and existing solutions provide only outlet-level bias ratings rather than article-level insights. There is a growing need for AI-powered tools that:
- Promote media literacy by revealing how language shapes perception
- Make digital information more transparent and trustworthy
- Enable readers to recognize and evaluate bias in real time
- Foster responsible information consumption in an age of information abundance
✨ Key Features
🔍 Bias Detection
- Multi-dimensional Analysis: Evaluates emotional language, framing bias, omission bias, and partisan language
- Phrase-level Identification: Pinpoints specific biased phrases with explanations
- Scoring System: Provides quantitative bias scores (0-100) across multiple dimensions
📝 Article Rewriting
- Content-Preserving Neutralization: Removes bias while maintaining all factual information
- Title Rewriting: Converts sensational headlines to neutral, factual alternatives
- Quality Assessment: Evaluates rewrite quality and change magnitude
🎓 Educational Explanations
- Accessible Language: Generates plain-English explanations of detected biases
- Practical Insights: Teaches readers to recognize similar biases in other content
- Media Literacy: Explains how specific language choices affect perception
🚀 Agentic Workflow
- Parallel Processing: Handles multiple articles concurrently
- Provider Fallback: Automatically switches between AI providers (Groq → Gemini → Claude)
- Error Resilience: Graceful degradation with fallback responses
🏗️ System Architecture
┌─────────────────┐
│  News Sources   │  (NewsAPI.ai, NewsAPI.org)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   News Client   │  (Article Fetching & Categorization)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    SQLite DB    │  (Article Storage & Deduplication)
└────────┬────────┘
         │
         ▼
┌─────────────────────────────────────┐
│       Agentic AI Orchestrator       │
├─────────────────────────────────────┤
│  ┌──────────────┐  ┌──────────────┐ │
│  │ Bias Detector│  │  Explainer   │ │
│  └──────────────┘  └──────────────┘ │
│  ┌──────────────┐                   │
│  │   Rewriter   │                   │
│  └──────────────┘                   │
└────────┬────────────────────────────┘
         │
         ▼
┌─────────────────┐
│  FastAPI REST   │  (API Endpoints)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Frontend UI   │  (Interactive Visualization)
└─────────────────┘
Pipeline Flow
Fetch Articles → Store in DB → Detect Bias → Rewrite Article →
Explain Findings → Store Results → Display in UI
🛠️ Technology Stack
Backend Framework
- FastAPI 0.104.1: Modern, high-performance web framework
- Python 3.11: Core programming language
- Uvicorn 0.24.0: ASGI server for FastAPI
AI/LLM Providers
- Groq (llama-3.1-70b-versatile): Primary LLM provider (free tier)
- Google Gemini 2.0 Flash: Secondary provider (free tier)
- Anthropic Claude 3.5 Sonnet: Tertiary provider (paid)
- Model Factory Pattern: Automatic provider selection and fallback
News APIs
- NewsAPI.ai: Primary news source with full article content
- NewsAPI.org: Fallback news source
- HTTPX 0.25.2: Async HTTP client for API calls
Database
- SQLite3: Lightweight, serverless database
- Content-based Deduplication: MD5 hashing for duplicate detection
- Indexed Queries: Optimized for performance
Configuration & Environment
- Pydantic Settings: Type-safe configuration management
- Python-dotenv 1.0.0: Environment variable management
- YAML Configuration: Agent and prompt configuration
API Development
- OpenAPI/Swagger: Auto-generated API documentation
- CORS Middleware: Cross-origin resource sharing support
- Background Tasks: Async processing for long-running operations
Future Integrations
- ElevenLabs API: Text-to-speech for article narration (planned)
- SendGrid/Gmail SMTP: Daily digest emails (planned)
📦 Installation
Prerequisites
- Python 3.11 or higher
- API keys for:
  - Groq (free at https://console.groq.com)
  - Google Gemini (free at https://makersuite.google.com)
  - NewsAPI.ai (free tier at https://eventregistry.org)
  - NewsAPI.org (free tier at https://newsapi.org)
Setup
- Clone the repository
git clone https://github.com/8asic/HalfTruths-HAIckathon.git
cd HalfTruths-HAIckathon
- Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt
- Configure environment variables
cp .env.example .env
# Edit .env with your API keys
Required environment variables:
GROQ_API_KEY=your_groq_api_key
GEMINI_API_KEY=your_gemini_api_key
NEWSAPI_AI_KEY=your_newsapi_ai_key
NEWS_API_KEY=your_newsapi_key
# Optional
ANTHROPIC_API_KEY=your_anthropic_key
ELEVENLABS_API_KEY=your_elevenlabs_key
DEBUG=true
LOG_LEVEL=INFO
- Initialize database
python -c "from src.database.news_db import get_connection_to_news_db; get_connection_to_news_db()"
🚀 Usage
Running the FastAPI Server
# Development mode with auto-reload
python main.py
# Or with uvicorn directly
uvicorn main:app --reload --host 0.0.0.0 --port 8000
The API has not been released yet.
Command Line Interface (Legacy)
# Analyze articles with specific query
python main.py --query "climate change" --count 5
# Analyze recent articles without specific query
python main.py --count 3
📚 API Documentation
Core Endpoints
Health Check
GET /health
Response:
{
  "status": "healthy",
  "environment_variables": {
    "GEMINI_API_KEY": true,
    "GROQ_API_KEY": true,
    "NEWSAPI_AI_KEY": true,
    "NEWS_API_KEY": true
  },
  "timestamp": "2025-11-09T12:00:00"
}
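As a rough illustration of how a response like this could be assembled, the sketch below reports which API keys are present in the environment. The function name `build_health_payload`, the `REQUIRED_KEYS` tuple, and the "degraded" status are assumptions made for this example, not the project's actual code.

```python
import os
from collections.abc import Mapping
from datetime import datetime

# Keys listed in the health response above.
REQUIRED_KEYS = ("GEMINI_API_KEY", "GROQ_API_KEY", "NEWSAPI_AI_KEY", "NEWS_API_KEY")


def build_health_payload(env: Mapping[str, str] = os.environ) -> dict:
    """Report which required keys are set, without exposing their values."""
    flags = {key: bool(env.get(key)) for key in REQUIRED_KEYS}
    return {
        # "degraded" is a hypothetical status for the missing-key case.
        "status": "healthy" if all(flags.values()) else "degraded",
        "environment_variables": flags,
        "timestamp": datetime.now().isoformat(timespec="seconds"),
    }
```

Returning booleans instead of the key values keeps secrets out of the response while still showing what is misconfigured.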
Analyze Articles
POST /api/v1/analyze
Content-Type: application/json
{
  "query": "renewable energy",
  "article_count": 5
}
Response:
{
  "status": "success",
  "message": "Successfully analyzed 5 articles",
  "total_articles": 5,
  "successful_analyses": 5,
  "failed_analyses": 0,
  "results": [
    {
      "title": "Article Title",
      "analysis": {
        "overall_bias_score": 72,
        "emotional_bias_score": 65,
        "framing_bias_score": 78,
        "omission_bias_score": 45,
        "biased_phrases": [
          {
            "text": "devastating climate disaster",
            "bias_type": "emotional",
            "explanation": "Uses sensational language",
            "suggested_replacement": "significant climate event"
          }
        ]
      },
      "neutral_version": "Rewritten neutral article...",
      "original_length": 1500,
      "rewritten_length": 1450,
      "is_real_analysis": true
    }
  ],
  "timestamp": "2025-11-09T12:00:00"
}
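A client might condense a response like this into a short digest. The sketch below is one possible way to do that; the `summarize` helper and the shape of its return value are assumptions for illustration, while the field names come from the response shown above.

```python
from collections import Counter
from statistics import mean


def summarize(response: dict) -> dict:
    """Condense an /api/v1/analyze response into a few headline numbers."""
    results = response.get("results", [])
    scores = [r["analysis"]["overall_bias_score"] for r in results]
    # Count flagged phrases per bias type across all articles.
    type_counts = Counter(
        phrase["bias_type"]
        for r in results
        for phrase in r["analysis"]["biased_phrases"]
    )
    return {
        "articles": len(results),
        "mean_overall_bias": mean(scores) if scores else None,
        "phrases_by_type": dict(type_counts),
    }


# Trimmed copy of the example response above.
sample_response = {
    "results": [
        {
            "title": "Article Title",
            "analysis": {
                "overall_bias_score": 72,
                "biased_phrases": [
                    {"text": "devastating climate disaster", "bias_type": "emotional"}
                ],
            },
        }
    ]
}
summary = summarize(sample_response)
```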
Background Analysis
POST /api/v1/analyze/background
Content-Type: application/json
{
  "query": "climate change",
  "article_count": 10
}
Response:
{
  "status": "processing",
  "message": "Analysis started in background",
  "query": "climate change",
  "article_count": 10
}
Get Statistics
GET /api/v1/stats
Response:
{
  "total_articles": 150,
  "analyzed_articles": 120,
  "pending_articles": 30
}
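These counts could come from a single aggregate query over the article table. The sketch below assumes a hypothetical `articles` table with a `processed` flag (0 or 1); the project's real schema is not shown in this document and may differ.

```python
import sqlite3


def get_stats(conn: sqlite3.Connection) -> dict:
    """Return total, analyzed, and pending article counts."""
    total, analyzed = conn.execute(
        # COALESCE covers the empty-table case, where SUM returns NULL.
        "SELECT COUNT(*), COALESCE(SUM(processed), 0) FROM articles"
    ).fetchone()
    return {
        "total_articles": total,
        "analyzed_articles": analyzed,
        "pending_articles": total - analyzed,
    }


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, processed INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO articles (title, processed) VALUES (?, ?)",
    [("a", 1), ("b", 1), ("c", 0)],
)
stats = get_stats(conn)
```

Deriving `pending_articles` from the other two values guarantees the three numbers always add up, as they do in the response above (120 + 30 = 150).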
Clear Processed Articles
DELETE /api/v1/clear
Response:
{
  "status": "success",
  "message": "Processed articles cleared"
}
Example cURL Requests
# Health check
curl http://localhost:8000/health
# Analyze articles
curl -X POST http://localhost:8000/api/v1/analyze \
  -H "Content-Type: application/json" \
  -d '{"query": "climate change", "article_count": 5}'
# Get statistics
curl http://localhost:8000/api/v1/stats
# Clear database
curl -X DELETE http://localhost:8000/api/v1/clear
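The analyze request can also be issued from Python. The sketch below uses only the standard library and assumes the server is running locally on port 8000, as in the cURL examples; `build_analyze_request` and `analyze` are hypothetical helper names, not part of the project.

```python
import json
from urllib.request import Request, urlopen


def build_analyze_request(
    query: str, article_count: int, base_url: str = "http://localhost:8000"
) -> Request:
    """Build the POST request for /api/v1/analyze without sending it."""
    body = json.dumps({"query": query, "article_count": article_count}).encode("utf-8")
    return Request(
        f"{base_url}/api/v1/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(query: str, article_count: int) -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    # Generous timeout, since each article involves LLM calls.
    with urlopen(build_analyze_request(query, article_count), timeout=120) as resp:
        return json.load(resp)


# Example (only works while the server is up):
# print(analyze("climate change", 5)["message"])
```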
💻 Examples
Example 1: Analyzing a Biased Article
Original Article:
"The government's disastrous climate policy is absolutely destroying our economy. These radical environmentalists are pushing insane proposals that will ruin everything."
Bias Analysis:
{
  "overall_bias_score": 85,
  "emotional_bias_score": 90,
  "framing_bias_score": 80,
  "omission_bias_score": 70,
  "biased_phrases": [
    {
      "text": "disastrous climate policy",
      "bias_type": "emotional",
      "explanation": "Uses loaded language to evoke negative emotions"
    },
    {
      "text": "absolutely destroying",
      "bias_type": "emotional",
      "explanation": "Extreme exaggeration without factual support"
    },
    {
      "text": "radical environmentalists",
      "bias_type": "partisan",
      "explanation": "Pejorative labeling of opposing viewpoint"
    }
  ]
}
Unbiased Rewrite:
"The government's climate policy has generated debate regarding its economic impact. Environmental advocates support proposals that some critics argue may affect economic sectors."
Example 2: Multi-Article Analysis
import asyncio
from main import BiasDetectionPipeline

async def analyze_multiple():
    pipeline = BiasDetectionPipeline()
    results = await pipeline.run_full_pipeline(
        query="renewable energy",
        article_count=5
    )
    for result in results:
        print(f"Title: {result['original_title']}")
        print(f"Bias Score: {result['analysis']['overall_bias_score']}/100")
        print(f"Neutral Version: {result['neutral_version'][:100]}...")
        print("-" * 50)

asyncio.run(analyze_multiple())
🧪 Testing
Run the test suite:
# Run all tests
python -m pytest tests/
# Run specific test modules
python tests/test_clients.py
python tests/integration/test_database_integration.py
# Debug LLM connections
python tests/debug/debug_llm_connection.py
# Check database integrity
python tests/debug/debug_database.py
🌟 Key Algorithms & Techniques
1. Content-Based Deduplication
- MD5 hashing of title + body content
- Database-level uniqueness constraints
- Prevents reprocessing of duplicate articles
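The three points above can be sketched with the standard library alone. The table layout, the function names, and the whitespace and case normalization are assumptions for this example; only the MD5-of-title-plus-body idea and the uniqueness constraint come from the description above.

```python
import hashlib
import sqlite3


def content_hash(title: str, body: str) -> str:
    """MD5 digest of title + body, used as a fingerprint (not for security)."""
    # Normalizing case and surrounding whitespace is an assumption of this sketch.
    normalized = f"{title.strip().lower()}\n{body.strip().lower()}"
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def store_article(conn: sqlite3.Connection, title: str, body: str) -> bool:
    """Insert the article; return False if an identical one is already stored."""
    cursor = conn.execute(
        # The UNIQUE constraint makes the database reject duplicates;
        # OR IGNORE turns that rejection into a silent no-op.
        "INSERT OR IGNORE INTO articles (content_hash, title, body) VALUES (?, ?, ?)",
        (content_hash(title, body), title, body),
    )
    conn.commit()
    return cursor.rowcount == 1


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    "id INTEGER PRIMARY KEY, content_hash TEXT UNIQUE NOT NULL, title TEXT, body TEXT)"
)
```

Enforcing uniqueness in the database rather than in application code means concurrent workers cannot both insert the same article.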
2. Multi-Provider LLM Fallback
# Automatic provider selection with fallback
Groq (free, fast) → Gemini (free, reliable) → Claude (paid, high-quality)
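A minimal sketch of such a fallback chain follows. The provider callables are stand-ins that simulate success or failure; they are not real Groq, Gemini, or Claude client calls, and `complete_with_fallback` and `AllProvidersFailed` are hypothetical names.

```python
from typing import Callable, Sequence

# Each provider is a (name, callable) pair taking a prompt and returning text.
Provider = tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in order; return (provider_name, response) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def groq_stub(prompt: str) -> str:
    raise TimeoutError("rate limited")  # simulate the primary provider failing


def gemini_stub(prompt: str) -> str:
    return f"neutral summary of: {prompt}"


def claude_stub(prompt: str) -> str:
    return "unused in this run"


chain = [("groq", groq_stub), ("gemini", gemini_stub), ("claude", claude_stub)]
used, text = complete_with_fallback("sample article", chain)
```

Collecting the per-provider errors makes the final failure message useful for debugging when the whole chain is down.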
3. Bias Scoring Algorithm
- Emotional Bias (0-100): Loaded words, sensationalism, exaggeration
- Framing Bias (0-100): Oversimplification, binary thinking
- Omission Bias (0-100): Missing context or alternative views
- Overall Score: Weighted average of component scores
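Written out, a weighted average is overall = sum(w_i * s_i) / sum(w_i) over the component scores s_i. The sketch below uses hypothetical weights, since the actual values are not stated here, so it will not necessarily reproduce the overall scores shown in the examples.

```python
# Hypothetical integer weights; the real weighting is not documented here.
WEIGHTS = {"emotional": 40, "framing": 35, "omission": 25}


def overall_bias_score(scores: dict[str, float]) -> int:
    """Weighted average of the component scores, rounded to an integer."""
    for name in WEIGHTS:
        if not 0 <= scores[name] <= 100:
            raise ValueError(f"{name} score must be within 0-100")
    weighted_sum = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    # Dividing by the weight total keeps the result on the same 0-100 scale.
    return round(weighted_sum / sum(WEIGHTS.values()))
```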
4. Phrase-Level Analysis
- Identifies specific problematic phrases
- Provides explanations and neutral alternatives
- Enables targeted rewriting
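To show what targeted rewriting could look like in its simplest form, the sketch below substitutes each flagged phrase with its suggested replacement. This is a plain string-replacement stand-in, not the LLM-based rewriter; `apply_replacements` is a hypothetical helper, and the `text` and `suggested_replacement` fields follow the API response format.

```python
def apply_replacements(text: str, biased_phrases: list[dict]) -> str:
    """Replace each flagged phrase that carries a suggested replacement."""
    # Longest phrases first, so a shorter phrase cannot break up a longer
    # one before the longer one has been matched.
    ordered = sorted(biased_phrases, key=lambda p: len(p["text"]), reverse=True)
    for phrase in ordered:
        replacement = phrase.get("suggested_replacement")
        if replacement:  # phrases without a suggestion are left untouched
            text = text.replace(phrase["text"], replacement)
    return text


phrases = [
    {
        "text": "devastating climate disaster",
        "bias_type": "emotional",
        "suggested_replacement": "significant climate event",
    },
    {"text": "radical", "bias_type": "partisan"},  # no suggestion provided
]
rewritten = apply_replacements("A devastating climate disaster struck.", phrases)
```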
5. Concurrent Processing
# Process multiple articles in parallel with semaphore control
import asyncio
max_concurrent = 3
semaphore = asyncio.Semaphore(max_concurrent)  # the limit is passed as the initial value
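A fuller sketch of semaphore-limited concurrency is given below. `analyze_article` is a stand-in that sleeps instead of calling an LLM, and the `active`/`peak` counters exist only to show that the limit holds; none of these names come from the project.

```python
import asyncio

active = 0  # analyses currently running
peak = 0    # highest number seen running at once


async def analyze_article(article: str) -> dict:
    """Stand-in for the real LLM-backed analysis of one article."""
    global active, peak
    active += 1
    peak = max(peak, active)
    await asyncio.sleep(0.01)  # simulates network latency
    active -= 1
    return {"title": article}


async def analyze_all(articles: list[str], max_concurrent: int = 3) -> list[dict]:
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(article: str) -> dict:
        # At most max_concurrent coroutines can be inside this block at once.
        async with semaphore:
            return await analyze_article(article)

    # gather returns results in the same order as the input articles.
    return await asyncio.gather(*(bounded(a) for a in articles))


results = asyncio.run(analyze_all([f"article {i}" for i in range(7)]))
```

Capping concurrency this way keeps throughput up while staying within provider rate limits, which matters most on free tiers.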
🔮 Future Enhancements
- [ ] ElevenLabs Integration: Audio narration of neutral articles
- [ ] Email Digest: Daily/weekly bias reports via SendGrid
- [ ] Frontend UI: Interactive web interface with Lovable.dev
- [ ] User Accounts: Personalized preferences and history
- [ ] RSS Feed Support: Direct RSS feed ingestion
- [ ] Comparative Analysis: Side-by-side outlet comparison
- [ ] Browser Extension: Real-time bias detection while browsing
- [ ] Mobile App: iOS/Android applications
- [ ] Sentiment Analysis: Emotional tone tracking
- [ ] Source Credibility: Publisher reliability scores
📊 Performance Metrics
- Analysis Speed: ~3-5 seconds per article (concurrent processing)
- Accuracy: 85%+ bias detection rate (manual validation)
- Throughput: Up to 20 articles/minute (with 3 concurrent workers)
- Database: Sub-millisecond query times with indexing
- API Response Time: <100ms for most endpoints (excluding LLM calls)
👥 Team
Team Fumblers
- Abdalaziz Ayoub - Core Development
- Abdulkarim Al Jamal - AI Integration
- Beibarys Abissatov - Backend Architecture
- Jeronim Bašić - System Design
🎯 Impact & Vision
This project reflects the principles of Responsible and Agentic AI, advancing:
- 🎓 Media Literacy: Teaching critical evaluation of information
- 🌐 Transparency: Making bias detection accessible to everyone
- ⚖️ Fairness: Promoting balanced news consumption
- 🤝 Democratic Dialogue: Supporting informed public discourse
- 📚 Quality Education: Cultivating analytical thinking
By revealing how bias manifests in text, we transform AI into a means of education rather than persuasion, supporting democratic dialogue and contributing to a more informed society.
**Made with ❤️ by Team Fumblers** *Promoting media literacy and responsible information consumption, one article at a time.*