DataHalo

Inspiration

The modern information ecosystem is flooded with biased narratives, fragmented reporting, and hidden influence networks. Traditional tools focus on content consumption, not on understanding how narratives are shaped, who drives them, and how they evolve. We wanted to build a system that goes beyond surface-level news and instead answers:

- Who is influencing the narrative?
- How does bias vary across sources?
- Can we trace patterns in storytelling across media?

DataHalo was inspired by the need for transparent, explainable media intelligence powered by AI.

What it does

DataHalo is an AI-powered media intelligence platform that analyzes news content to uncover narrative patterns, bias, and source transparency. It:

- Compares narratives across multiple media sources
- Assigns transparency and bias scores using LLM-based reasoning
- Uses Retrieval-Augmented Generation (RAG) to provide source-backed insights
- Structures unorganized news data into analyzable formats
- Helps users understand how a story is told, not just what is told
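To make the RAG step concrete, here is a minimal sketch of retrieval followed by prompt construction. It is not DataHalo's actual implementation: the `Passage` shape, the function names, and the prompt wording are all hypothetical, and plain word-count cosine similarity stands in for whatever embedding model a real deployment would use.

```typescript
// Hypothetical passage shape; field names are illustrative, not DataHalo's schema.
interface Passage {
  id: string;
  source: string;
  text: string;
}

// Lowercase word counts for a piece of text.
function termCounts(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  const words: string[] = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const word of words) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return counts;
}

// Cosine similarity between two term-count vectors.
// Assumption: a stand-in for embedding similarity, chosen to keep the sketch dependency-free.
function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((count, term) => {
    dot += count * (b.get(term) ?? 0);
    normA += count * count;
  });
  b.forEach((count) => {
    normB += count * count;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the k passages most similar to the query, best first.
function retrieve(query: string, passages: Passage[], k: number): Passage[] {
  const q = termCounts(query);
  return passages
    .map((p) => ({ p, score: cosine(q, termCounts(p.text)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.p);
}

// Build a prompt that asks the model to answer only from the retrieved passages.
function buildPrompt(question: string, retrieved: Passage[]): string {
  const context = retrieved
    .map((p) => `[${p.id}] (${p.source}) ${p.text}`)
    .join("\n");
  return `Answer using only the passages below and cite their ids.\n\n${context}\n\nQuestion: ${question}`;
}
```

Keeping passage ids in the prompt is what lets the model's answer point back to specific sources, which is the "source-backed" part of the feature list.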

How we built it

We built DataHalo as a full-stack AI system combining data pipelines and LLM reasoning:

- Data Collection: Web scraping pipelines to gather media content
- Backend: Node.js for data ingestion and processing
- AI Layer: LLM-based reasoning with RAG pipelines for contextual analysis
- Frontend: React-based interface for visualization and interaction
- Database: Structured storage for articles and metadata

The core idea was to convert unstructured media content → structured intelligence → explainable insights.
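As an illustration of the first arrow in that pipeline (unstructured content to structured records), the sketch below turns a scraped page into a record ready for storage. The `RawPage` and `ArticleRecord` shapes, the function names, and the regex-based cleaning are assumptions made for this example, not the project's real schema or scraper.

```typescript
// Hypothetical shapes: neither the field names nor the cleaning rules come from DataHalo itself.
interface RawPage {
  url: string;
  html: string;
  fetchedAt: string; // ISO 8601 timestamp
}

interface ArticleRecord {
  url: string;
  source: string; // hostname, used here as a stand-in for the outlet name
  title: string;
  body: string;
  wordCount: number;
  fetchedAt: string;
}

// Drop script/style blocks, strip remaining tags, and collapse whitespace.
// A regex is enough for a sketch; a production scraper would use an HTML parser.
function stripTags(html: string): string {
  return html
    .replace(/<(script|style)[\s\S]*?<\/\1>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Extract the hostname without a leading "www.", or "unknown" if the URL is malformed.
function hostname(url: string): string {
  const match = url.match(/^[a-z]+:\/\/(?:www\.)?([^/:?#]+)/i);
  return match?.[1]?.toLowerCase() ?? "unknown";
}

// Convert one scraped page into a structured record.
function toArticleRecord(page: RawPage): ArticleRecord {
  const titleMatch = page.html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  const bodyMatch = page.html.match(/<body[^>]*>([\s\S]*?)<\/body>/i);
  const title = stripTags(titleMatch?.[1] ?? "");
  // Fall back to the whole document when no <body> element is present.
  const body = stripTags(bodyMatch?.[1] ?? page.html);
  return {
    url: page.url,
    source: hostname(page.url),
    title,
    body,
    wordCount: body === "" ? 0 : body.split(" ").length,
    fetchedAt: page.fetchedAt,
  };
}
```

Normalizing every source into one record shape early means the later AI layer can treat articles from different outlets uniformly.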

Challenges we ran into

- Handling unstructured and inconsistent media data from different sources
- Designing meaningful bias and transparency scoring logic
- Preventing LLM hallucinations while maintaining insight quality
- Building a system that is both accurate and explainable
- Managing real-time data pipelines with limited resources
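One common way to approach the hallucination challenge is to reject any model response whose cited evidence cannot be found verbatim in the retrieved passages. The sketch below shows that idea under stated assumptions: the JSON contract (`biasScore`, `transparencyScore`, `evidence`), the score ranges, and the function names are hypothetical and are not taken from DataHalo's code.

```typescript
// Hypothetical output contract for the model; not DataHalo's actual schema.
interface Evidence {
  passageId: string;
  quote: string;
}

interface BiasAssessment {
  biasScore: number; // assumed range: -1 to 1
  transparencyScore: number; // assumed range: 0 to 1
  evidence: Evidence[];
}

type Validation =
  | { ok: true; value: BiasAssessment }
  | { ok: false; errors: string[] };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

function inRange(v: unknown, min: number, max: number): v is number {
  return typeof v === "number" && Number.isFinite(v) && v >= min && v <= max;
}

// Validate a raw model response against the passages it was given.
// `passages` maps passage id to passage text.
function validateAssessment(raw: string, passages: Map<string, string>): Validation {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, errors: ["response is not valid JSON"] };
  }
  if (!isRecord(parsed)) {
    return { ok: false, errors: ["response is not a JSON object"] };
  }
  const biasScore = parsed["biasScore"];
  const transparencyScore = parsed["transparencyScore"];
  const evidence = parsed["evidence"];
  if (!inRange(biasScore, -1, 1) || !inRange(transparencyScore, 0, 1)) {
    return { ok: false, errors: ["scores missing or out of range"] };
  }
  if (!Array.isArray(evidence) || evidence.length === 0) {
    return { ok: false, errors: ["no evidence cited"] };
  }

  const errors: string[] = [];
  const checked: Evidence[] = [];
  for (const item of evidence) {
    const id: unknown = isRecord(item) ? item["passageId"] : undefined;
    const quote: unknown = isRecord(item) ? item["quote"] : undefined;
    if (typeof id !== "string" || typeof quote !== "string") {
      errors.push("malformed evidence entry");
      continue;
    }
    const text = passages.get(id);
    if (text === undefined) {
      errors.push(`unknown passage id: ${id}`);
    } else if (!text.includes(quote)) {
      // The quote must appear verbatim in the cited passage.
      errors.push(`quote not found in passage ${id}`);
    } else {
      checked.push({ passageId: id, quote });
    }
  }
  if (errors.length > 0) {
    return { ok: false, errors };
  }
  return { ok: true, value: { biasScore, transparencyScore, evidence: checked } };
}
```

Exact substring matching is deliberately strict: it rejects paraphrased quotes, so a failed validation would typically trigger a retry rather than a silent fallback. That trade-off between strictness and insight quality is a design choice, not a fixed rule.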

Accomplishments that we're proud of

- Built a working end-to-end AI pipeline from scraping to insight generation
- Successfully implemented RAG-based narrative comparison
- Created a system that explains why a narrative differs, not just that it does
- Achieved strong recognition in hackathons and competitions
- Developed a scalable foundation for future media intelligence tools

What we learned

- AI is powerful only when combined with structured data pipelines
- LLM outputs must be guided, validated, and grounded in sources
- Simplicity in UI matters even for complex backend systems
- Building explainable AI systems is harder, but far more valuable
- Real-world problems require combining multiple domains: AI + data + systems

What's next for DataHalo

- Real-time media monitoring and alerts
- Expanding datasets across global media networks
- Advanced bias detection using hybrid ML + LLM models
- Network visualization of journalist and media influence
- API access for researchers, journalists, and analysts
- Scaling into a full AI-powered media intelligence platform