AI has made media manipulation effortless.

Today, anyone can generate a photorealistic image, clone a voice, or fabricate an entire video in seconds. At the same time, bad actors increasingly reuse real photos and videos with misleading captions or false claims.

The result isn’t just deepfakes. It’s something worse: media that looks completely real — whether synthetic or authentic — but tells a false story.

Seeing is no longer believing. VerifAI is a desktop “truth layer” that lets users verify any content on their screen in one click.

Instead of asking only “Is this AI-generated?”, we ask the bigger question: “Can this be trusted?”

The Problem

Modern misinformation is hybrid.

Sometimes it’s fully synthetic — AI-generated faces, voices, or videos that are nearly indistinguishable from reality. Sometimes it’s authentic media used deceptively — real footage paired with false timelines, misleading captions, or fabricated claims.

Traditional tools only address one side:

  • Deepfake detectors → only analyze pixels
  • Fact-checkers → only analyze text

Neither covers both sides. We needed something that combines signal analysis with semantic reasoning.

The Solution

VerifAI verifies both the media and the message.

Given a screen capture or a direct upload, our system:

  1. Detects whether content shows signs of AI generation or manipulation
  2. Extracts and analyzes text/claims
  3. Cross-checks credibility and context
  4. Returns a clear verdict with reasoning and sources

So users don’t just know if something looks fake — they understand why it’s misleading.

How We Built It — The Hybrid AI Pipeline

To solve real-world misinformation, we combined classical signal forensics with modern AI reasoning instead of relying on a single model.

Signal Integrity — Detect synthetic or edited media

We first built traditional forensic models using FFT features and SVM classifiers trained on open-source datasets in Google Colab and AI Studio. While promising, these approaches required strict image resizing and didn’t generalize well to real-world screenshots.

We pivoted to a more robust method using Principal Component Analysis (PCA) and eigenvector modeling.

By projecting images into eigenvector space, we learn the statistical structure of natural photographs — texture coherence, spatial correlations, and frequency decay. AI-generated or heavily edited images often deviate from this distribution, even when they look visually perfect.
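As a minimal sketch of how such a score could be computed, assume a reconstruction-error formulation: fit PCA on small patches taken from natural photographs, then measure how poorly the leading eigenvectors reconstruct the patches of a new image. The function names (`extract_patches`, `fit_pca`, `naturalness_score`), the 8x8 patch size, and the 16 components are illustrative assumptions, not VerifAI's actual code.

```python
import numpy as np

def extract_patches(gray, size=8):
    """Cut a 2-D grayscale array into non-overlapping size x size patches, flattened."""
    h, w = gray.shape
    h, w = h - h % size, w - w % size            # drop ragged edges
    blocks = gray[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.transpose(0, 2, 1, 3).reshape(-1, size * size).astype(np.float64)

def fit_pca(patches, n_components=16):
    """Return the patch mean and the top eigenvectors of the patch covariance."""
    mean = patches.mean(axis=0)
    cov = np.cov(patches - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues come back in ascending order
    top = eigvecs[:, ::-1][:, :n_components]     # keep the directions with largest variance
    return mean, top

def naturalness_score(gray, mean, components, size=8):
    """Mean squared reconstruction error; higher means further from the fitted model."""
    centered = extract_patches(gray, size) - mean
    recon = centered @ components @ components.T  # project onto the eigenvectors and back
    return float(np.mean((centered - recon) ** 2))
```

Working on fixed-size patches rather than whole images is one way to avoid forcing every input to a single resolution; a real system would need a large reference set and a calibrated threshold on the score.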

Our backend now performs:

  • PCA + eigenvector “naturalness” scoring
  • Fast Fourier Transform (FFT) frequency analysis
  • Error Level Analysis (ELA)
  • Compression artifact detection

This lets us flag likely deepfakes, synthetic images, and heavy edits, even when metadata is missing or stripped.
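To illustrate the FFT frequency analysis step, here is a hedged sketch that measures what share of an image's spectral power sits at high spatial frequencies. The function name `high_frequency_ratio` and the 0.5 cutoff are assumptions made for the example; the features the backend actually uses are not specified here.

```python
import numpy as np

def high_frequency_ratio(gray, cutoff=0.5):
    """Share of spectral power beyond `cutoff` of the maximum radius (0 to 1)."""
    # Remove the mean so the DC term does not dominate, then centre the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(gray - gray.mean()))
    power = np.abs(spectrum) ** 2
    h, w = gray.shape
    yy, xx = np.indices((h, w))
    radius = np.hypot(yy - h // 2, xx - w // 2)   # distance from the spectrum centre
    radius /= radius.max()                        # normalise to the range 0..1
    total = power.sum()
    if total == 0:
        return 0.0                                # flat image, no frequency content
    return float(power[radius > cutoff].sum() / total)
```

A single ratio like this is only one weak signal. It would typically be combined with the other checks in the list rather than thresholded on its own.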

Semantic Integrity — Reason about the claims (Gemini)

Pixels alone can’t determine truth.

So we pass:

  • the media,
  • extracted on-screen text,
  • and our forensic signals

into Google Gemini 2.5 Flash.
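The text portion of that request could be assembled along these lines. This is a sketch only: `build_prompt`, the wording of the instructions, and the signal names in the usage example are hypothetical, and attaching the image and calling the Gemini SDK are deliberately left out.

```python
import json

def build_prompt(extracted_text, forensic_signals):
    """Combine on-screen text and forensic scores into one text prompt."""
    # Serialise the signals deterministically so identical inputs give identical prompts.
    signals_json = json.dumps(forensic_signals, indent=2, sort_keys=True)
    return (
        "You are verifying a piece of media captured from a user's screen.\n"
        "On-screen text:\n"
        f"{extracted_text.strip()}\n\n"
        "Forensic signals (computed upstream, treat as evidence, not proof):\n"
        f"{signals_json}\n\n"
        "Assess whether the media and its claims can be trusted."
    )
```

For example, `build_prompt("Breaking: flood hits city", {"pca_naturalness": 0.02, "fft_high_freq_ratio": 0.61})` yields a prompt that carries both the claim and the numeric evidence, so the model can weigh them together.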

Gemini acts as a reasoning engine, not just a chatbot. It:

  • extracts text from screenshots
  • identifies people and organizations
  • audits source credibility
  • checks timelines and historical consistency
  • evaluates whether claims match reality

We use structured prompt engineering so Gemini returns clean JSON with:

  • trust score
  • verdict
  • reasoning
  • sources
  • indicators

This gives users transparent explanations instead of black-box predictions.
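On the receiving side, the backend has to check that the reply really is the expected JSON before showing it to a user. The sketch below assumes snake_case key names (`trust_score`, `verdict`, `reasoning`, `sources`, `indicators`) mirroring the fields listed above. The exact keys, the `Verdict` class, and `parse_verdict` are hypothetical.

```python
import json
from dataclasses import dataclass

@dataclass
class Verdict:
    trust_score: float
    verdict: str
    reasoning: str
    sources: list
    indicators: list

REQUIRED = ("trust_score", "verdict", "reasoning", "sources", "indicators")

def parse_verdict(raw):
    """Parse the model's reply into a Verdict, raising ValueError on malformed output."""
    data = json.loads(raw)                        # JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    score = data["trust_score"]
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("trust_score must be a number")
    for key in ("sources", "indicators"):
        if not isinstance(data[key], list):
            raise ValueError(f"{key} must be a list")
    return Verdict(
        trust_score=float(score),
        verdict=str(data["verdict"]),
        reasoning=str(data["reasoning"]),
        sources=data["sources"],
        indicators=data["indicators"],
    )
```

Failing loudly on a malformed reply lets the caller retry or fall back, instead of displaying a half-filled verdict.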

Cloud Infrastructure — Scalable Processing (DigitalOcean)

Forensic image processing is computationally heavy, and running it locally would slow down the desktop client.

We deploy our Python/FastAPI backend on DigitalOcean App Platform, where containerized services handle OpenCV, NumPy, and ML workloads.

This lets us:

  • offload heavy compute from the client
  • scale automatically
  • keep the desktop overlay lightweight and fast
  • process media in near real-time

Challenges We Overcame

Metadata is unreliable
Most social platforms strip EXIF data. We shifted to signal-based statistics instead of metadata-based checks.

Classical ML didn’t scale well
FFT + SVM models struggled with varying resolutions. PCA eigenvector modeling proved more robust across resolutions.

Browser extensions were too limited
Privacy sandboxes prevented full-screen analysis. Moving to Electron allowed system-level capture across any app.

Explainability
Most detectors simply say “fake.” By leveraging Gemini’s reasoning, we return clear indicators and evidence for every verdict.

What Makes VerifAI Different

Most tools only detect synthetic pixels.

VerifAI verifies both:

  • signal integrity (is this media authentic?)
  • semantic integrity (is this claim true?)

By combining math-based forensics with AI reasoning, we detect both deepfakes and misleading context — something neither approach can solve alone.

What We Learned

  • Classical ML is still powerful when paired with LLM reasoning
  • Signal statistics often outperform metadata for authenticity checks
  • Cloud infrastructure is essential for real-time forensics
  • Explainability builds user trust more than raw accuracy

What’s Next

  • Frame-by-frame video forensics
  • Real-time browser extension companion
  • Source credibility scoring
  • Batch verification for journalists and researchers
  • Expanded multimodal detection (audio + voice cloning)

Why It Matters

AI has made fake media nearly indistinguishable from reality.

VerifAI helps restore trust by giving people instant, explainable verification — whether the threat is a deepfake or a deceptive narrative.

Because the future isn’t just fake images.

It’s believable lies.
