🔐 Inspiration
Generative AI has made fake images nearly indistinguishable from real ones—threatening trust in journalism, digital identity, and art.
Most existing solutions act as black boxes, giving predictions without proof. I wanted to build something different: a system that doesn’t just detect AI-generated images, but proves it.
🧠 What it does
I built VIPER (Visual Intelligence Pipeline for Empirical Recognition)—a forensic AI engine that acts like a lie detector for images.
VIPER combines:
- Deep learning (ConvNeXt) for visual understanding
- Mathematical signal analysis (FFT, PRNU, color entropy) for hard evidence
This allows the system to achieve 96%+ accuracy while also explaining why an image is classified as AI or real.
I frame this as an Art Heist scenario, where users analyze artwork and decide whether it’s authentic or an AI-generated forgery—guided by forensic insights.
⚙️ How I built it
Data Pipeline:
- Processed a large-scale dataset (~67k images), sub-sampled to 10,000 balanced samples
- Enforced strict 70/15/15 train/validation/test splits to prevent data leakage
- Applied data augmentation (flips, color jitter) for robustness
- Optimized with PyTorch DataLoaders (batch size 64)
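The split-and-load step can be sketched as follows; the tiny dummy tensors stand in for the real images, and the seed value is an illustrative choice:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Dummy stand-in for the balanced image dataset (tiny tensors, not real images).
dataset = TensorDataset(torch.randn(1_000, 4), torch.randint(0, 2, (1_000,)))

# Strict 70/15/15 split; a fixed seed keeps the test set untouched across runs.
n = len(dataset)
n_train, n_val = int(0.70 * n), int(0.15 * n)
splits = [n_train, n_val, n - n_train - n_val]
train_ds, val_ds, test_ds = random_split(
    dataset, splits, generator=torch.Generator().manual_seed(42)
)

train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)
```

Keeping the split logic in one place (and seeding it) is what prevents test images from leaking into training.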
Forensic Feature Engineering:
Before deep learning, I extracted 27 mathematical features across six domains:
- FFT (frequency analysis) → detects unnatural smoothness
- PRNU noise patterns → identifies missing camera sensor fingerprints
- Color entropy (LAB space) → flags overly “perfect” color distributions
- Texture (GLCM) and edge gradients → measure structural realism
These are standardized and grouped into human-readable signals:
- FFT Irregularity
- PRNU Variation
- LAB Saturation
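As an illustration of the frequency domain, one such signal can be approximated by a high-to-low-frequency energy ratio. This is a simplified heuristic, not VIPER's exact FFT Irregularity formula:

```python
import numpy as np

def fft_irregularity(gray: np.ndarray) -> float:
    """High- vs low-frequency energy ratio of a grayscale image.

    AI-generated images often have unnaturally smooth spectra, so a
    low ratio can flag synthetic smoothness (illustrative heuristic).
    """
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray)))
    h, w = spectrum.shape
    cy, cx = h // 2, w // 2
    r = min(h, w) // 8                       # radius separating low/high bands
    yy, xx = np.ogrid[:h, :w]
    low_mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2
    low = spectrum[low_mask].sum()
    high = spectrum[~low_mask].sum()
    return float(high / (low + 1e-8))

# Pure noise carries far more high-frequency energy than a flat image.
noise = np.random.default_rng(0).random((64, 64))
flat = np.ones((64, 64))
print(fft_irregularity(noise) > fft_irregularity(flat))  # True
```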
Baseline Model:
- I trained a Logistic Regression model on these features (~71% accuracy) to establish a statistical baseline and validate feature importance.
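A minimal version of that baseline, with synthetic data standing in for the 27 real forensic features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Stand-in for the 27 standardized forensic features per image;
# only the first two carry (synthetic) signal here.
X = rng.normal(size=(600, 27))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=600) > 0).astype(int)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X[:480], y[:480])
acc = clf.score(X[480:], y[480:])

# The learned coefficients give per-feature importance for the baseline.
coefs = clf.named_steps["logisticregression"].coef_[0]
```

Inspecting `coefs` is what validates feature importance: features with near-zero weight contribute little to the statistical baseline.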
Deep Learning Engine:
- Used ConvNeXt-Tiny (transfer learning) pretrained on ImageNet
- Froze lower layers and added a custom classification head
Two modes:
- Standard: image embeddings only
- Hybrid: embeddings + forensic features
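A hybrid classification head along these lines might look like the sketch below. ConvNeXt-Tiny's pooled embedding is 768-dimensional; the hidden width and dropout rate are assumptions:

```python
import torch
import torch.nn as nn

class HybridHead(nn.Module):
    """Classification head fusing CNN embeddings with forensic features.

    Hypothetical sizes: a 768-d ConvNeXt-Tiny embedding is concatenated
    with the 27 forensic features before the final classifier.
    """
    def __init__(self, embed_dim: int = 768, n_forensic: int = 27):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(embed_dim + n_forensic, 256),
            nn.GELU(),
            nn.Dropout(0.2),
            nn.Linear(256, 2),               # real vs AI-generated
        )

    def forward(self, embedding: torch.Tensor, forensic: torch.Tensor) -> torch.Tensor:
        return self.fc(torch.cat([embedding, forensic], dim=1))

head = HybridHead()
logits = head(torch.randn(4, 768), torch.randn(4, 27))
print(logits.shape)  # torch.Size([4, 2])
```

In standard mode the same head would simply take the embedding alone; concatenation is the simplest fusion strategy, chosen here for clarity.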
Training optimizations:
- Adam optimizer
- Cosine annealing learning rate
- Validation after every epoch
- Automatic checkpointing to prevent overfitting
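The training loop combining these pieces can be sketched as follows, with a tiny linear model and synthetic data standing in for the real network and loaders:

```python
import torch
import torch.nn as nn

# Tiny stand-ins for the real model and data.
model = nn.Linear(8, 2)
X, y = torch.randn(64, 8), torch.randint(0, 2, (64,))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
epochs = 5
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

best_val = float("inf")
for epoch in range(epochs):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()
    scheduler.step()                         # cosine-annealed learning rate

    model.eval()
    with torch.no_grad():                    # validate after every epoch
        val_loss = criterion(model(X), y).item()
    if val_loss < best_val:                  # checkpoint only on improvement
        best_val = val_loss
        torch.save(model.state_dict(), "best.pt")
```

Checkpointing on the best validation loss means the saved weights are never the overfit tail of training.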
Evaluation:
- Achieved 96%+ accuracy and F1 score
- Evaluated on strictly unseen test data
- Generated confusion matrix, precision/recall, and AUC

Interpretability:
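With scikit-learn those metrics come out of a few calls; the tiny label vectors below are purely illustrative:

```python
import numpy as np
from sklearn.metrics import (confusion_matrix, precision_recall_fscore_support,
                             roc_auc_score)

# Illustrative ground truth and predicted probabilities for 8 images.
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_prob = np.array([0.1, 0.3, 0.8, 0.9, 0.6, 0.2, 0.4, 0.7])
y_pred = (y_prob >= 0.5).astype(int)

cm = confusion_matrix(y_true, y_pred)        # rows: true class, cols: predicted
prec, rec, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="binary")
auc = roc_auc_score(y_true, y_prob)          # threshold-free ranking quality

print(cm.tolist(), round(f1, 2), auc)        # [[3, 1], [1, 3]] 0.75 0.875
```

AUC is computed from the raw probabilities, so it complements the thresholded precision/recall view.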
To eliminate the “black box,” I added:
- Grad-CAM heatmaps → show which pixels influenced predictions
- UMAP projections → visualize separation between AI and real images
This makes VIPER a transparent forensic system, not just a classifier.
Real-World Testing:
- JPEG compression tests → evaluate robustness to degraded images
- Zero-shot testing (WikiArt) → evaluate generalization to unseen data
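The compression test can be sketched as a JPEG round-trip at decreasing quality levels (a hypothetical `jpeg_degrade` helper; the quality sweep values are illustrative):

```python
import io
import numpy as np
from PIL import Image

def jpeg_degrade(img: np.ndarray, quality: int) -> np.ndarray:
    """Round-trip an RGB uint8 image through JPEG at the given quality."""
    buf = io.BytesIO()
    Image.fromarray(img).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.asarray(Image.open(buf))

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
for q in (90, 50, 10):                       # sweep compression levels
    degraded = jpeg_degrade(img, q)
    # Feed `degraded` through the classifier and compare its prediction
    # against the one for the pristine image.
```

Accuracy that holds up as quality drops indicates the model is not relying on artifacts that compression destroys.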
🎯 Results
- 96%+ F1 score
- Strong generalization across datasets
- Clear clustering between AI and real images
- Interpretable outputs with visual and mathematical justification
🧩 Challenges I ran into
- Scaling forensic feature extraction across thousands of images
- Balancing deep learning with handcrafted signals
- Preventing overfitting while maximizing performance
- Making complex outputs understandable for users
🏆 Accomplishments that I'm proud of
- Turning a black-box model into a transparent forensic system
- Successfully combining signal processing + deep learning
- Achieving high accuracy and being able to explain how and why
- Building a complete, end-to-end pipeline from raw data to insights
🚀 What’s next for VIPER
- Scale to full 67k+ dataset with optimized feature extraction
- Deploy as a real-time API for journalists and investigators
- Improve robustness against adversarial AI-generated content
💡 Final Thoughts
VIPER isn’t just a classifier—it’s a forensic intelligence system that brings trust and transparency back to digital imagery.