🔐 Inspiration

Generative AI has made fake images nearly indistinguishable from real ones—threatening trust in journalism, digital identity, and art.

Most existing solutions act as black boxes, giving predictions without proof. I wanted to build something different: a system that doesn’t just detect AI-generated images, but proves it.

🧠 What it does

I built VIPER (Visual Intelligence Pipeline for Empirical Recognition)—a forensic AI engine that acts like a lie detector for images.

VIPER combines:

  • Deep learning (ConvNeXt) for visual understanding
  • Mathematical signal analysis (FFT, PRNU, color entropy) for hard evidence

This allows the system to achieve 96%+ accuracy while also explaining why an image is classified as AI or real.

I frame this as an Art Heist scenario, where users analyze artwork and decide whether it’s authentic or an AI-generated forgery—guided by forensic insights.

⚙️ How I built it

Data Pipeline:

  • Processed a large-scale dataset (~67k images), sub-sampled to 10,000 balanced samples
  • Enforced strict 70/15/15 train/validation/test splits to prevent data leakage
  • Applied data augmentation (flips, color jitter) for robustness
  • Optimized with PyTorch DataLoaders (batch size 64)
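
The leakage-free 70/15/15 split described above can be sketched with a two-stage stratified split; the indices and labels here are synthetic stand-ins for the 10,000-sample balanced subset:

```python
# Sketch of the 70/15/15 stratified split over 10,000 balanced samples.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
indices = np.arange(10_000)
labels = rng.integers(0, 2, size=10_000)  # 0 = real, 1 = AI-generated

# First carve off the 70% training portion, stratified by class...
train_idx, holdout_idx = train_test_split(
    indices, test_size=0.30, stratify=labels, random_state=42)

# ...then split the remaining 30% evenly into validation and test (15% each).
val_idx, test_idx = train_test_split(
    holdout_idx, test_size=0.50, stratify=labels[holdout_idx], random_state=42)

print(len(train_idx), len(val_idx), len(test_idx))  # 7000 1500 1500
```

Stratifying both stages keeps the AI/real balance identical across all three splits, which is what makes the test-set metrics trustworthy.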

Forensic Feature Engineering:

Before deep learning, I extracted 27 mathematical features across six domains:

  • FFT (frequency analysis) → detects unnatural smoothness
  • PRNU noise patterns → identifies missing camera sensor fingerprints
  • Color entropy (LAB space) → flags overly “perfect” color distributions
  • Texture (GLCM) and edge gradients → measure structural realism

These are standardized and grouped into human-readable signals:

  • FFT Irregularity
  • PRNU Variation
  • LAB Saturation
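
As one minimal sketch of the FFT family above: an "irregularity" score as the fraction of spectral energy outside a low-frequency disc. AI-generated images often have unnaturally smooth spectra, so a low ratio is one hint. The exact feature definitions used in VIPER are not shown here; this is an illustrative assumption:

```python
# High-frequency energy ratio from the 2D FFT of a grayscale image.
import numpy as np

def fft_high_freq_ratio(gray: np.ndarray, radius_frac: float = 0.25) -> float:
    """Fraction of spectral energy outside a central low-frequency disc."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    h, w = gray.shape
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.hypot(yy - h / 2, xx - w / 2)
    low = spectrum[dist <= radius_frac * min(h, w)].sum()
    total = spectrum.sum()
    return float((total - low) / total)

rng = np.random.default_rng(0)
noisy = rng.standard_normal((128, 128))        # sensor-like noise: lots of HF energy
smooth = np.outer(np.linspace(0, 1, 128),      # smooth gradient: almost none
                  np.linspace(0, 1, 128))
print(fft_high_freq_ratio(noisy) > fft_high_freq_ratio(smooth))  # True
```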

Baseline Model:

  • I trained a Logistic Regression model on these features (~71% accuracy) to establish a statistical baseline and validate feature importance.
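
A sketch of that baseline with scikit-learn, standardizing the 27 features before fitting. The data here is a synthetic stand-in; the reported ~71% accuracy comes from the real forensic features:

```python
# Standardized logistic regression over 27 handcrafted features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 27))
# Synthetic labels driven mostly by the first two features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(1000) > 0).astype(int)

baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline.fit(X[:700], y[:700])
print(round(baseline.score(X[700:], y[700:]), 2))

# Coefficient magnitudes double as a crude feature-importance check
coefs = baseline.named_steps["logisticregression"].coef_.ravel()
print(np.abs(coefs).round(2))
```

Because the inputs are standardized, the coefficient magnitudes give a rough ranking of which forensic signals carry the most discriminative weight.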

Deep Learning Engine:

  • Used ConvNeXt-Tiny (transfer learning) pretrained on ImageNet
  • Froze lower layers and added a custom classification head

Two modes:

  • Standard: image embeddings only
  • Hybrid: embeddings + forensic features

Training optimizations:

  • Adam optimizer
  • Cosine annealing learning rate
  • Validation after every epoch
  • Automatic checkpointing to prevent overfitting
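
The loop structure behind those bullets can be sketched as follows; the model and data are toy stand-ins, and the checkpoint filename is hypothetical:

```python
# Adam + cosine-annealed LR, validation every epoch, best-checkpoint saving.
import torch
from torch import nn

model = nn.Linear(10, 2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=5)
loss_fn = nn.CrossEntropyLoss()

X, y = torch.randn(64, 10), torch.randint(0, 2, (64,))
Xv, yv = torch.randn(32, 10), torch.randint(0, 2, (32,))

best_val = float("inf")
for epoch in range(5):
    model.train()
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()
    sched.step()                       # anneal the learning rate each epoch

    model.eval()                       # validate after every epoch
    with torch.no_grad():
        val = loss_fn(model(Xv), yv).item()
    if val < best_val:                 # checkpoint only on improvement
        best_val = val
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
        # torch.save(best_state, "viper_best.pt")  # hypothetical path
```

Keeping only the best-validation checkpoint means the final reported model is never one that has started to overfit.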

Evaluation:

  • Achieved 96%+ accuracy and F1 score
  • Evaluated on strictly unseen test data
  • Generated confusion matrix, precision/recall, and AUC
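
Those metrics come straight from scikit-learn's standard toolkit; a sketch with synthetic stand-in predictions:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             roc_auc_score)

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=500)
# Well-separated synthetic scores standing in for model probabilities
scores = np.clip(y_true * 0.8 + rng.normal(0.1, 0.25, size=500), 0, 1)
y_pred = (scores >= 0.5).astype(int)

print("accuracy:", round(accuracy_score(y_true, y_pred), 3))
print("F1:      ", round(f1_score(y_true, y_pred), 3))
print("AUC:     ", round(roc_auc_score(y_true, scores), 3))
print(confusion_matrix(y_true, y_pred))
```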

Interpretability:

To eliminate the “black box,” I added:

  • Grad-CAM heatmaps → show which pixels influenced predictions
  • UMAP projections → visualize separation between AI and real images
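
The Grad-CAM idea can be sketched on a toy CNN: capture the last conv layer's activations and gradients via hooks, then weight the channels by their pooled gradients. VIPER applies the same idea to ConvNeXt feature maps; the layer choice here is illustrative:

```python
import torch
from torch import nn
import torch.nn.functional as F

model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2))

acts, grads = {}, {}
target_layer = model[2]  # last conv layer
target_layer.register_forward_hook(lambda m, i, o: acts.update(v=o))
target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(v=go[0]))

x = torch.randn(1, 3, 32, 32)
logits = model(x)
logits[0, 1].backward()                   # gradient of the "AI" class score

weights = grads["v"].mean(dim=(2, 3), keepdim=True)  # pooled gradients per channel
cam = F.relu((weights * acts["v"]).sum(dim=1))       # weighted activation map
cam = cam / (cam.max() + 1e-8)                       # normalize to [0, 1]
print(cam.shape)  # torch.Size([1, 32, 32])
```

Upsampled and overlaid on the input image, this map shows which pixels pushed the prediction toward "AI-generated".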

This makes VIPER a transparent forensic system, not just a classifier.

Real-World Testing:

  • JPEG compression tests → evaluate robustness to degraded images
  • Zero-shot testing (WikiArt) → evaluate generalization to unseen data
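
The JPEG robustness check amounts to re-encoding each image at decreasing quality levels and re-scoring it; a sketch using Pillow, where the classifier call is a hypothetical placeholder:

```python
import io
import numpy as np
from PIL import Image

def jpeg_round_trip(img: Image.Image, quality: int) -> Image.Image:
    """Re-encode an image as JPEG at the given quality and decode it back."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

rng = np.random.default_rng(0)
img = Image.fromarray(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))

for q in (95, 75, 50, 25):
    degraded = jpeg_round_trip(img, q)
    # score = viper.predict(degraded)   # hypothetical model call
    drift = np.abs(np.asarray(degraded, float) - np.asarray(img, float)).mean()
    print(f"quality={q:3d}  mean pixel drift={drift:.1f}")
```

A robust detector's score should stay stable as quality drops, since real-world images shared online are almost always recompressed.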

🎯 Results

  • 96%+ F1 score
  • Strong generalization across datasets
  • Clear clustering between AI and real images
  • Interpretable outputs with visual and mathematical justification

🧩 Challenges I ran into

  • Scaling forensic feature extraction across thousands of images
  • Balancing deep learning with handcrafted signals
  • Preventing overfitting while maximizing performance
  • Making complex outputs understandable for users

🏆 Accomplishments that I'm proud of

  • Turning a black-box model into a transparent forensic system
  • Successfully combining signal processing + deep learning
  • Achieving high accuracy and being able to explain how and why
  • Building a complete, end-to-end pipeline from raw data to insights

🚀 What’s next for VIPER

  • Scale to full 67k+ dataset with optimized feature extraction
  • Deploy as a real-time API for journalists and investigators
  • Improve robustness against adversarial AI-generated content

💡 Final Thoughts

VIPER isn’t just a classifier—it’s a forensic intelligence system that brings trust and transparency back to digital imagery.
