Is It Real

Inspiration

The rise of synthetic media and deepfakes poses serious risks to trust, security, and authenticity online. Existing detectors often act as "black boxes," failing to explain their decisions or adapt to new manipulation techniques. This project was inspired by the need for:

  • Transparency: Explainable AI that reveals why something is flagged.
  • Adaptability: A self-evolving system that learns from user feedback in real time.
  • Persistence: Cloud-based resilience ensuring the AI’s "brain" survives server restarts.

What it does

The AI Detector is a hybrid system combining deep learning with forensic computer vision to detect manipulated media. Key features include:

  • Core AI Engine

    • Hybrid architecture: Transformer/ResNet + rule-based forensics.
    • Ensemble Decision Forest: Aggregates signals from 10+ analysis vectors.
    • Semantic awareness: Uses CLIP for zero-shot detection of "contextually impossible" content.
  • Advanced Forensic Suite

    • Error Level Analysis (ELA): Finds compression artifacts.
    • Spectral Frequency (FFT): Detects GAN grid patterns.
    • Ocular Asymmetry: Measures mismatched eye features.
    • Corneal Reflection Analysis: Validates geometric consistency of reflections.
    • Skin Texture Topology: Laplacian variance to detect unnatural smoothness.
    • Lighting Consistency: Physics-based shadow analysis.
    • Sensor Noise Profiling: Checks ISO grain presence.
    • Structural Perspective: Detects non-Euclidean geometry in backgrounds.
  • Active Learning System

    • Human-in-the-loop reinforcement learning.
    • Instant fine-tuning (~3 seconds) on feedback.
    • Dual reinforcement:
      • "No, Wrong" → correction training.
      • "Yes, Correct" → confidence strengthening.
    • Experience Replay Buffer prevents catastrophic forgetting.
    • Dynamic override adapts logic when confident.
  • Cloud Persistence & MLOps

    • Infinite versioning with timestamped models.
    • Pointer-based hot-swapping (model_pointer.txt).
    • Auto-sync to Hugging Face Hub.
    • Resilient recovery after server restarts.
  • User Interface (Streamlit)

    • Drag-and-drop analysis for images/videos.
    • Video frame sampling.
    • Explainable AI (XAI) with detailed forensic reasoning.
    • Sidebar confidence + model version tracking.
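
The Experience Replay Buffer listed under Active Learning can be illustrated with a small sketch. This is not the project's code: the class name `ReplayBuffer`, the `capacity` and `replay_size` parameters, and the `(sample, label)` tuple layout are all assumptions made for the example. The idea shown is only that each feedback-driven update trains on the new example together with a random handful of older ones, so a single correction does not overwrite what the model already knows.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical buffer of past (sample, label) pairs with a fixed capacity."""

    def __init__(self, capacity=500, seed=None):
        self._items = deque(maxlen=capacity)  # oldest entries fall off first
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self._items)

    def add(self, example):
        self._items.append(example)

    def training_batch(self, new_example, replay_size=7):
        """Return the new example plus up to `replay_size` stored ones,
        then store the new example for future updates."""
        k = min(replay_size, len(self._items))
        batch = [new_example] + self._rng.sample(list(self._items), k)
        self._items.append(new_example)
        return batch


# Usage: four earlier examples, capacity three, so the oldest one is dropped.
buffer = ReplayBuffer(capacity=3, seed=0)
for name in ("img_a", "img_b", "img_c", "img_d"):
    buffer.add((name, 0))

batch = buffer.training_batch(("img_new", 1), replay_size=2)
print(len(batch), batch[0])
```

In a real fine-tuning step the returned batch would be converted to tensors and passed through the optimizer; that part depends on the model and is left out here.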

Built with

| Category | Technology | Usage/Function |
| --- | --- | --- |
| Language | Python 3.10+ | Core backend logic |
| Interface | Streamlit | Interactive UI |
| AI Framework | PyTorch | Training & inference (AdamW optimizer) |
| Transformer | Hugging Face (transformers) | Pre-trained ViT/ResNet models |
| Computer Vision | OpenCV (cv2) | Forensic analysis (FFT, SIFT, Canny) |
| Image Processing | PIL (Pillow) | Image I/O, color conversion |
| Zero-Shot AI | OpenAI CLIP | Semantic impossibility detection |
| Cloud API | huggingface_hub | Push/pull model checkpoints |
| DevOps | Hugging Face Spaces | Serverless Docker hosting |

How we built it

The project followed an Iterative Agile Workflow:

  1. Phase 1: Core Brain (Baseline Model)

    • Fine-tuned ViT/ResNet on Real vs. Fake dataset.
    • Achieved ~85% accuracy but missed obvious artifacts.
  2. Phase 2: Forensic Layer

    • Built ForensicAnalyzer with OpenCV.
    • Added physics/geometry-based rules.
    • Integrated weighted voting with neural scores.
  3. Phase 3: Active Learning

    • Added "Correct/Wrong" feedback buttons.
    • Real-time optimizer updates (~3 seconds).
    • Implemented Experience Replay Buffer to mix old/new data.
  4. Phase 4: Cloud Persistence

    • Solved the stateless-hosting problem: learned model updates were lost on every server restart.
    • Local pointer system + Hugging Face Hub sync.
    • Auto-bootstrap ensures survival after restarts.
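
The pointer-based approach from Phase 4 can be sketched as follows. Only the file name `model_pointer.txt` comes from the project; the function names `save_version` and `load_current`, the `model_<timestamp>.bin` naming scheme, and the use of raw bytes in place of real checkpoints are assumptions for illustration. Each save writes a new timestamped file and then repoints the pointer, so older versions stay on disk and a restart only needs to read the pointer. The Hugging Face Hub sync is deliberately left out of this sketch.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

POINTER_NAME = "model_pointer.txt"  # file name taken from the project description


def save_version(model_dir: Path, weights: bytes) -> Path:
    """Write a timestamped checkpoint, then point the pointer file at it."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    checkpoint = model_dir / f"model_{stamp}.bin"  # hypothetical naming scheme
    checkpoint.write_bytes(weights)
    # Write the pointer to a temp file and rename it, so a reader never
    # sees a half-written pointer.
    tmp = model_dir / (POINTER_NAME + ".tmp")
    tmp.write_text(checkpoint.name)
    tmp.replace(model_dir / POINTER_NAME)
    return checkpoint


def load_current(model_dir: Path) -> Optional[bytes]:
    """Return the checkpoint named by the pointer, or None on first boot."""
    pointer = model_dir / POINTER_NAME
    if not pointer.exists():
        return None
    return (model_dir / pointer.read_text().strip()).read_bytes()


# Usage: two saves, then a simulated restart that reads only the pointer.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_dir = Path(tmp_dir)
    first_boot = load_current(model_dir)
    save_version(model_dir, b"weights-v1")
    save_version(model_dir, b"weights-v2")
    restored = load_current(model_dir)

print(first_boot, restored)
```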

Math Support

Some forensic checks rely on mathematical operations:

  • Laplacian Variance (Skin Texture Topology):

$$ \sigma^{2} = \frac{1}{N} \sum_{i=1}^{N} (L_{i} - \mu)^{2} $$

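where \( L_{i} \) is the Laplacian response at pixel \( i \), \( \mu \) is the mean response, and \( N \) is the number of pixels. A low variance indicates unnaturally smooth skin.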
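
As a worked illustration of the variance formula above, here is a dependency-free sketch. It assumes a 4-neighbour Laplacian kernel and a grayscale image given as a list of rows; the function name `laplacian_variance` is made up for the example. With OpenCV the same quantity is commonly obtained via `cv2.Laplacian(gray, cv2.CV_64F).var()`, though the project's exact kernel and thresholds are not stated.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Population variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a 2-D list of intensities with at least 3 rows and 3 columns.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    return pvariance(responses)  # (1/N) * sum((L_i - mu)^2)


# A perfectly flat patch versus a high-detail checkerboard patch.
flat = [[128] * 6 for _ in range(6)]
checker = [[255 * ((x + y) % 2) for x in range(6)] for y in range(6)]

print(laplacian_variance(flat))     # flat patch: every response is 0
print(laplacian_variance(checker))  # textured patch: large variance
```

The flat patch scores zero while the checkerboard scores very high, which is the contrast the skin-texture check relies on.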

  • Fast Fourier Transform (Spectral Frequency):

$$ F(k) = \sum_{n=0}^{N-1} f(n) \cdot e^{-2\pi i kn / N} $$

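where \( f(n) \) is the input signal of length \( N \) and \( F(k) \) is its \( k \)-th frequency coefficient. The FFT computes this transform efficiently; periodic GAN grid artifacts show up as isolated peaks in \( |F(k)| \).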
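
The transform above can be evaluated directly to show how a periodic pattern stands out. The sketch below is a naive one-dimensional DFT written straight from the formula, not an FFT and not the project's code; the helper names and the synthetic "ripple" signal are assumptions. On real images one would typically use a 2-D FFT such as `numpy.fft.fft2`.

```python
import cmath
import math


def dft_magnitudes(signal):
    """|F(k)| for k = 0..N-1, computed directly from the DFT definition."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)))
        for k in range(n)
    ]


def dominant_frequency(signal):
    """Index k (1..N/2) of the strongest non-DC frequency component."""
    mags = dft_magnitudes(signal)
    half = len(signal) // 2
    return max(range(1, half + 1), key=lambda k: mags[k])


# One image row: constant brightness plus a faint ripple repeating 4 times.
n_samples = 32
ripple = [0.5 + 0.1 * math.cos(2 * math.pi * 4 * t / n_samples)
          for t in range(n_samples)]

print(dominant_frequency(ripple))
```

The ripple is barely visible in the raw values, yet it produces a single clear peak at its repetition frequency once the constant (k = 0) term is ignored.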

  • Decision Forest Aggregation:

$$ y = \text{sign}\left( \sum_{j=1}^{M} w_{j} \cdot h_{j}(x) \right) $$

where \( h_{j}(x) \) is the signed score from the \( j \)-th analysis vector, \( w_{j} \) is its weight, and \( M \) is the number of analysis vectors.
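
A minimal sketch of this weighted vote follows. The score names, the weights, and the convention that positive means "manipulated" and negative means "authentic" are assumptions chosen for the example; the project's actual weights are not given.

```python
def aggregate(scores, weights):
    """Return sign(sum_j w_j * h_j): +1, -1, or 0 on an exact tie."""
    total = sum(weights[name] * scores[name] for name in scores)
    return (total > 0) - (total < 0)


# Hypothetical per-check scores in [-1, 1]; positive leans "manipulated".
scores = {"ela": 0.8, "fft": 0.6, "skin_texture": -0.2}
weights = {"ela": 0.5, "fft": 0.3, "skin_texture": 0.2}

verdict = aggregate(scores, weights)
print("manipulated" if verdict > 0 else "authentic" if verdict < 0 else "undecided")
```

Here two strong positive signals outweigh one weak negative signal, so the combined verdict is positive.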
