Inspiration for "Seeing the Lie"

We are entering an era where seeing is no longer believing.

For centuries, visual evidence has been one of humanity’s strongest anchors of truth. A photograph could stop an argument. A video could expose injustice. Today, that anchor is slipping. Deepfakes are no longer curiosities; they are persuasive, scalable, and increasingly indistinguishable from reality. They threaten trust in journalism, justice, science, and democratic discourse itself.

Beyond Detection: Restoring Trust

Most existing deepfake detectors focus on answering a single question: Is this fake or real?
But in high-stakes situations, such as courtrooms, newsrooms, intelligence analysis, or content moderation, that answer is not enough.

People need to know:

  • Why is something classified as fake?
  • What cues led to that decision?
  • How does context change the meaning of what we see?

A black-box verdict does not rebuild trust. Understanding does.

Context Is the Missing Signal

Deepfakes rarely exist in isolation. They appear within narratives, social dynamics, historical patterns, and behavioral inconsistencies. Humans intuitively use context to detect deception, but most AI systems ignore it.

We were inspired to build a system that does what humans do best:

  • Compare claims against known reality
  • Detect inconsistencies across time, identity, and intent
  • Interpret visuals in context, not in a vacuum

By embedding contextual reasoning directly into detection, we move from surface-level artifact hunting to semantic and situational truth verification.

Explainability as a Moral Requirement

In a world where AI decisions increasingly shape public opinion and personal outcomes, explainability is not a feature; it is an ethical obligation.

Our project is inspired by a simple but powerful belief:

If an AI system accuses something of being fake, it must be able to explain itself.

Explainable deepfake detection empowers journalists to verify sources, policymakers to justify decisions, and everyday users to regain confidence in what they see.

A Future Where Truth Is Defensible

We did not build this system just to catch fakes.
We built it to defend reality.

“Seeing the Lie” is inspired by the conviction that AI should not replace human judgment, but strengthen it. By combining deep learning, contextual awareness, and transparent reasoning, we aim to give society a tool that does more than detect deception.

What it does

Seeing the Lie is an agentic, context-aware, and explainable deepfake detection platform designed for high-stakes decision making. Instead of relying on a single signal or a black-box classifier, the platform orchestrates multiple specialized reasoning modules and synthesizes their outputs into a transparent, human-readable verdict.

At its core, the system answers not only “Is this real or fake?” but also “Why?”, “Based on what evidence?”, and “How confident should we be?”.

System Architecture: Six Complementary Modules

1. Source & Context Verification

The platform begins with reverse image search combined with a Gemini-based analysis.
This module traces where the content has appeared before, compares variants across the web, and evaluates the credibility of sources, timelines, and reuse patterns. The goal is to establish contextual plausibility before any low-level signal analysis takes place.

2. Invisible Watermark Detection (SynthID)

Next, the system checks for Google’s invisible SynthID watermark.
This module detects whether the content carries imperceptible embedded signals indicating AI generation, even when no visible watermark is present, and provides a strong provenance signal when available.

3. Structured Indicator Extraction

Using carefully structured Gemini 3 prompts, the platform extracts explicit real/fake indicators, such as:

  • Visual inconsistencies
  • Global/Semantic contradictions
  • Unnatural patterns in faces, lighting, or motion
  • Mismatches between content and claimed context
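
As a rough illustration of what "structured" extraction can mean in practice, the sketch below parses a model reply that is assumed to be a JSON list of indicators and discards anything malformed. The JSON field names (`category`, `points_to`, `description`) and the `parse_indicators` helper are hypothetical; the category names are borrowed from the five-indicator framework mentioned later in this write-up, and the real prompts and schema may look different.

```python
import json
from dataclasses import dataclass
from typing import List

# Categories taken from the five-indicator framework named in this write-up.
CATEGORIES = ("physics", "lighting", "textures", "anatomy", "semantics")


@dataclass
class Indicator:
    category: str
    points_to: str      # "real" or "fake"
    description: str


def parse_indicators(raw_reply: str) -> List[Indicator]:
    """Parse a JSON list of indicators, silently dropping malformed entries."""
    try:
        items = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    parsed = []
    for item in items:
        if not isinstance(item, dict):
            continue
        category = item.get("category")
        points_to = item.get("points_to")
        description = item.get("description")
        if (category in CATEGORIES
                and points_to in ("real", "fake")
                and isinstance(description, str)):
            parsed.append(Indicator(category, points_to, description))
    return parsed


# Hypothetical model reply; the second entry uses an unknown category.
sample_reply = json.dumps([
    {"category": "anatomy", "points_to": "fake", "description": "Six fingers on left hand"},
    {"category": "vibes", "points_to": "fake", "description": "Feels off"},
    {"category": "lighting", "points_to": "real", "description": "Shadows match the sun angle"},
])
indicators = parse_indicators(sample_reply)
for ind in indicators:
    print(f"{ind.category}: {ind.points_to} ({ind.description})")
```

Rejecting entries that fall outside the agreed schema keeps free-form model output from leaking into later steps unchecked.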

4. LLM Court Case Simulation

To avoid confirmation bias, the system employs a debating architecture:

  • One LLM argues that the uploaded image is “real”
  • A second LLM argues the case for “fake”
  • A judge LLM evaluates both sides, decides whether additional argumentation is required before the debate is stopped, and issues a reasoned interim and final judgment

Motivation: This mirrors human critical reasoning and exposes weak or unsupported claims.
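
The debate loop described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Advocate` and `Judge` callables, the `JudgeRuling` fields, and the `max_rounds` cap are all assumptions, and the stub functions stand in for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interface: an advocate reads the transcript and returns an argument.
Advocate = Callable[[List[str]], str]


@dataclass
class JudgeRuling:
    verdict: str       # "real", "fake", or "undecided"
    needs_more: bool   # True if the judge asks for another round
    reasoning: str


Judge = Callable[[List[str]], JudgeRuling]


def run_debate(pro_real: Advocate, pro_fake: Advocate, judge: Judge,
               max_rounds: int = 3) -> JudgeRuling:
    """Alternate both advocates; the judge rules after each round and may stop early."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    transcript: List[str] = []
    ruling = None
    for round_no in range(1, max_rounds + 1):
        transcript.append(f"[round {round_no}] PRO-REAL: {pro_real(transcript)}")
        transcript.append(f"[round {round_no}] PRO-FAKE: {pro_fake(transcript)}")
        ruling = judge(transcript)   # interim judgment after every round
        if not ruling.needs_more:
            break                    # the judge has heard enough
    return ruling                    # the last ruling is the final judgment


# Stub agents standing in for LLM calls (illustration only).
def stub_pro_real(transcript: List[str]) -> str:
    return "Lighting is consistent with the claimed scene."


def stub_pro_fake(transcript: List[str]) -> str:
    return "The left hand shows six fingers."


def stub_judge(transcript: List[str]) -> JudgeRuling:
    rounds_heard = len(transcript) // 2
    return JudgeRuling(
        verdict="fake" if rounds_heard >= 2 else "undecided",
        needs_more=rounds_heard < 2,
        reasoning=f"Ruling after {rounds_heard} round(s).",
    )


final = run_debate(stub_pro_real, stub_pro_fake, stub_judge)
print(final.verdict, "-", final.reasoning)
```

Capping the number of rounds is one simple way to bound inference cost when the judge keeps requesting more argumentation.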

5. Metadata & Provenance Analysis

The platform extracts and analyzes technical metadata, including file structure, encoding traces, timestamps, and generation artifacts, to surface anomalies or hints about synthetic origins. This module provides low-level evidence that complements higher-level semantic reasoning.
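
As one illustration of this kind of metadata check, the sketch below reads the text chunks of a PNG file using only the Python standard library. It assumes the common convention that some Stable Diffusion front ends store their generation settings in a `tEXt` chunk with the keyword `parameters`; the function names and the sample prompt are hypothetical, and the platform's own parsers may work differently.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 8 + length + 4
    return chunks


def has_generation_parameters(chunks: dict) -> bool:
    """Assumed heuristic: a 'parameters' keyword hints at a generator front end."""
    return "parameters" in chunks


def _chunk(ctype: bytes, body: bytes) -> bytes:
    """Build one PNG chunk with a correct CRC (used only for the demo file)."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# Skeletal demo file: header, one text chunk, end marker (no pixel data).
sample_png = (
    PNG_SIGNATURE
    + _chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + _chunk(b"tEXt", b"parameters\x00a cat, Steps: 20, Sampler: Euler")
    + _chunk(b"IEND", b"")
)

found = read_png_text_chunks(sample_png)
print(found, has_generation_parameters(found))
```

Metadata is easy to strip or forge, so a missing `parameters` entry says nothing about authenticity; a hit is only one more piece of evidence for the later aggregation step.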

6. Final Observer & Evidence Aggregation

Finally, a dedicated observer module aggregates all findings into:

  • A unified evidence overview
  • A clear explanation of supporting and contradicting signals
  • A probability score reflecting overall confidence

This output is designed to be interpretable by humans, auditable by experts, and usable in real-world decision workflows.

How we built it

We built Seeing the Lie as a full-stack platform that combines a modern web interface with a deeply reasoned, multi-agent AI backend. The system is designed around a sequential, explainable detection pipeline, where each step contributes explicit evidence rather than opaque scores. The backend is implemented in Python using Flask, exposing a REST API enhanced with Server-Sent Events (SSE) to stream real-time progress updates. The frontend is built with vanilla JavaScript, HTML, and CSS, and features drag-and-drop uploads and live status feedback so users can follow each analytical step as it happens.
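
To illustrate the streaming part, here is a minimal sketch of how per-step progress could be serialized as Server-Sent Events. The event names, the step names, and the `progress_stream` helper are assumptions made for this example; in Flask, a generator like this would typically be returned as `Response(progress_stream(steps), mimetype="text/event-stream")`.

```python
import json
from typing import Callable, Iterable, Iterator, Tuple


def sse_event(event: str, payload: dict) -> str:
    """Format one Server-Sent Event: event name, JSON data line, blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def progress_stream(
    steps: Iterable[Tuple[str, Callable[[], dict]]]
) -> Iterator[str]:
    """Run each step in order and yield SSE messages before and after it."""
    finished = []
    for name, run in steps:
        yield sse_event("step_started", {"step": name})
        result = run()
        finished.append(name)
        yield sse_event("step_finished", {"step": name, "result": result})
    yield sse_event("done", {"steps": finished})


# Hypothetical steps returning canned results instead of calling real services.
demo_steps = [
    ("reverse_search", lambda: {"matches": 3}),
    ("metadata", lambda: {"anomalies": 0}),
]

events = list(progress_stream(demo_steps))
for message in events:
    print(message, end="")
```

Because the generator yields before each step starts, the browser can show which step is currently running rather than only which ones have finished.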

At the core of the platform is a Gemini-powered AI/ML stack, using Gemini 3 vision-language models for analysis, reasoning, and final judgment. The detection pipeline follows a structured five-step flow:

  1. A reverse image search (via SerpAPI) combined with Gemini-based source analysis
  2. SynthID invisible watermark detection using Google’s AI tooling
  3. A visual forensics agent applying a five-indicator framework (physics, lighting, textures, anatomy, semantics)
  4. A multi-round debate system where Pro-Real and Pro-Fake agents argue under a judge model
  5. An AI metadata analyzer leveraging Stable Diffusion parsers to extract generation traces

All results flow through a unified data model (TaskInput -> BaseStep -> AggregatedContext) and are finally synthesized by Gemini into a clear explanation and a 0-100 probability score, making the system both technically robust and human-interpretable.
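
The data flow named above can be pictured with a small sketch. Only the names `TaskInput`, `BaseStep`, and `AggregatedContext` come from the description; every field, the `StepResult` type, the `FixedStep` stand-in, and the weights are assumptions. In particular, the final score here is a plain weighted average used as a placeholder for the Gemini-based synthesis.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TaskInput:
    image_path: str
    claimed_context: str = ""


@dataclass
class StepResult:
    step: str
    fake_score: float                      # 0.0 = looks real, 1.0 = looks fake
    evidence: List[str] = field(default_factory=list)


class BaseStep:
    name = "base"

    def run(self, task: TaskInput) -> StepResult:
        raise NotImplementedError


@dataclass
class AggregatedContext:
    task: TaskInput
    results: List[StepResult] = field(default_factory=list)

    def probability_fake(self, weights: Dict[str, float]) -> int:
        """Weighted mean of step scores on a 0-100 scale (unlisted steps weigh 1.0)."""
        total = sum(weights.get(r.step, 1.0) for r in self.results)
        if total == 0:
            raise ValueError("no weighted results to aggregate")
        weighted = sum(weights.get(r.step, 1.0) * r.fake_score for r in self.results)
        return round(100 * weighted / total)


class FixedStep(BaseStep):
    """Stand-in step that returns a canned score instead of doing real analysis."""

    def __init__(self, name: str, fake_score: float, note: str):
        self.name = name
        self.fake_score = fake_score
        self.note = note

    def run(self, task: TaskInput) -> StepResult:
        return StepResult(self.name, self.fake_score, [self.note])


def run_pipeline(task: TaskInput, steps: List[BaseStep]) -> AggregatedContext:
    """Run the steps sequentially and collect every result in one context."""
    context = AggregatedContext(task)
    for step in steps:
        context.results.append(step.run(task))
    return context


ctx = run_pipeline(
    TaskInput("upload.png", "claimed press photo"),
    [FixedStep("watermark", 1.0, "watermark detected"),
     FixedStep("forensics", 0.6, "inconsistent shadows")],
)
score = ctx.probability_fake({"watermark": 3.0, "forensics": 1.0})
print(score)
for r in ctx.results:
    print(r.step, r.evidence)
```

Keeping each step's evidence strings next to its score is what lets the final explanation cite supporting and contradicting signals instead of reporting a bare number.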

Challenges we ran into

Building such a complex platform required orchestrating multiple reasoning-heavy modules into a single, coherent agentic system. One major challenge was to ensure consistency and stability across interacting LLMs, especially in the court case simulation setup, where small prompt changes could cascade into divergent outcomes. Additionally, large-scale testing and calibration proved computationally very expensive: validating reliability across diverse content types, edge cases, and adversarial examples required repeated multi-agent runs, significantly increasing inference cost and evaluation time. Balancing depth of reasoning, explainability, and practical scalability was therefore a central engineering challenge throughout the hackathon.

Accomplishments that we're proud of

We are particularly proud of having built a fully operational, explainable, multi-agent deepfake detection system that goes far beyond single-model classification. Despite its complexity, the platform produces coherent, transparent, and defensible verdicts that humans can actually reason with. Most importantly, we successfully applied the system to recent real-world cases, identifying several viral images as fake, including alleged leaks from the Epstein files and images claiming to show the capture of Nicolas Maduro. These cases demonstrated that our approach works not only in controlled settings, but also under the messy, adversarial conditions of live information warfare.

What we learned

Throughout this project, we learned that deepfake detection is fundamentally a reasoning problem, not just a pattern-recognition task. Isolated visual artifacts can be helpful but are often fragile signals. When combined with context, provenance, metadata, and structured argumentation, they become far more reliable. We also learned that explainability dramatically changes how detection results are perceived and trusted. Users engage more critically and confidently when they can see the evidence and trade-offs behind a verdict. Finally, building an agentic, multi-LLM system taught us that robustness emerges from disagreement and synthesis: allowing models to challenge each other, rather than enforcing premature consensus, leads to more stable and defensible outcomes in the face of ambiguity.

What's next for our Agentic, Context-Aware and Explainable Deepfake Detection

The next major step is a large-scale validation of the system on real-world deepfakes. While early results on viral and high-impact cases are promising, deploying this kind of agentic, multi-module architecture at scale requires systematic testing across diverse domains, languages, media formats, and adversarial manipulation strategies.

In parallel, we aim to move from the prototype level to impact through pilot projects and operational deployments with journalists, newsrooms, and professional fact-checking teams. These environments demand fast, defensible, and transparent assessments, which is exactly what our context-aware and explainable detection is designed to provide. By integrating the platform into real editorial and verification workflows, we want to help frontline defenders of truth make confident decisions, even as synthetic media becomes more persuasive and more widespread.

In addition, we may expand the platform to video and audio modalities.
