Is It Real
Inspiration
The rise of synthetic media and deepfakes poses serious risks to trust, security, and authenticity online. Existing detectors often act as "black boxes," failing to explain their decisions or adapt to new manipulation techniques. This project was inspired by the need for:
- Transparency: Explainable AI that reveals why something is flagged.
- Adaptability: A self-evolving system that learns from user feedback in real time.
- Persistence: Cloud-based resilience ensuring the AI’s "brain" survives server restarts.
What it does
The AI Detector is a hybrid system combining deep learning with forensic computer vision to detect manipulated media. Key features include:
Core AI Engine
- Hybrid architecture: Transformer/ResNet + rule-based forensics.
- Ensemble Decision Forest: Aggregates signals from 10+ analysis vectors.
- Semantic awareness: Uses CLIP for zero-shot detection of "contextually impossible" content.
Advanced Forensic Suite
- Error Level Analysis (ELA): Highlights regions whose JPEG compression level differs from the rest of the image.
- Spectral Frequency (FFT): Detects GAN grid patterns.
- Ocular Asymmetry: Measures mismatched eye features.
- Corneal Reflection Analysis: Validates geometric consistency of reflections.
- Skin Texture Topology: Laplacian variance to detect unnatural smoothness.
- Lighting Consistency: Physics-based shadow analysis.
- Sensor Noise Profiling: Checks ISO grain presence.
- Structural Perspective: Detects non-Euclidean geometry in backgrounds.
Active Learning System
- Human-in-the-loop reinforcement learning.
- Instant fine-tuning (~3 seconds) on feedback.
- Dual reinforcement:
  - "No, Wrong" → correction training.
  - "Yes, Correct" → confidence strengthening.
- Experience Replay Buffer prevents catastrophic forgetting.
- Dynamic override adapts logic when confident.
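The replay idea in the list above can be sketched as follows. This is a minimal illustration, not the project's actual code: the class name `ReplayBuffer`, the method `mixed_batch`, and the default sizes are all assumptions. Each feedback click produces one new labelled example, which is mixed with randomly drawn older examples so that a single correction does not overwrite what the model already learned.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past (sample, label) feedback pairs (hypothetical helper)."""

    def __init__(self, capacity=500, seed=None):
        # deque(maxlen=...) silently drops the oldest entry once full.
        self.items = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, sample, label):
        """Remember one labelled example for future replay."""
        self.items.append((sample, label))

    def mixed_batch(self, new_sample, new_label, replay_size=7):
        """Return up to `replay_size` random old examples plus the new one."""
        k = min(replay_size, len(self.items))
        batch = self.rng.sample(list(self.items), k)
        batch.append((new_sample, new_label))
        return batch
```

A fine-tuning step would train on `mixed_batch(...)` and then call `add(...)` with the new example so it becomes part of later replays.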
Cloud Persistence & MLOps
- Infinite versioning with timestamped models.
- Pointer-based hot-swapping (model_pointer.txt).
- Auto-sync to Hugging Face Hub.
- Resilient recovery after server restarts.
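One way the pointer mechanism could work is sketched below, using only the local filesystem. The function names `save_version` and `load_current`, the `.bin` suffix, and the timestamp format are assumptions for illustration; only the file name model_pointer.txt comes from the project. Every save writes a new timestamped file and then atomically rewrites the pointer, so readers always see either the old or the new version, never a half-written one. Syncing the directory to the Hugging Face Hub is left out of the sketch.

```python
import os
from datetime import datetime, timezone
from pathlib import Path

POINTER_NAME = "model_pointer.txt"


def save_version(model_dir, payload: bytes) -> str:
    """Write a new timestamped checkpoint and point the pointer file at it."""
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    name = f"model_{stamp}.bin"  # hypothetical naming scheme
    (model_dir / name).write_bytes(payload)
    # Write to a temp file first, then swap it in with an atomic rename.
    tmp = model_dir / (POINTER_NAME + ".tmp")
    tmp.write_text(name)
    os.replace(tmp, model_dir / POINTER_NAME)
    return name


def load_current(model_dir, default=None):
    """Return the bytes of the checkpoint named by the pointer, or `default`."""
    model_dir = Path(model_dir)
    pointer = model_dir / POINTER_NAME
    if not pointer.exists():
        return default
    target = model_dir / pointer.read_text().strip()
    return target.read_bytes() if target.exists() else default
```

Old checkpoints are never deleted here, which matches the "infinite versioning" idea but means disk usage grows with every save.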
User Interface (Streamlit)
- Drag-and-drop analysis for images/videos.
- Video frame sampling.
- Explainable AI (XAI) with detailed forensic reasoning.
- Sidebar confidence + model version tracking.
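The write-up does not say how frames are chosen for video analysis. A simple possibility, shown here purely as an assumption, is to pick evenly spaced frame indices so that a long clip is covered with a fixed analysis budget. The function name `sample_frame_indices` is hypothetical.

```python
def sample_frame_indices(total_frames: int, max_samples: int) -> list[int]:
    """Pick up to `max_samples` evenly spaced frame indices from a clip."""
    if total_frames <= 0 or max_samples <= 0:
        return []
    if max_samples >= total_frames:
        # Short clip: analyse every frame.
        return list(range(total_frames))
    step = total_frames / max_samples
    # Take the middle frame of each of the `max_samples` equal segments.
    return [int(i * step + step / 2) for i in range(max_samples)]
```

The returned indices would then be passed to whatever video reader the app uses to decode only those frames.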
Built with
| Category | Technology | Usage/Function |
|---|---|---|
| Language | Python 3.10+ | Core backend logic |
| Interface | Streamlit | Interactive UI |
| AI Framework | PyTorch | Training & inference (AdamW optimizer) |
| Transformer | Hugging Face (transformers) | Pre-trained ViT/ResNet models |
| Computer Vision | OpenCV (cv2) | Forensic analysis (FFT, SIFT, Canny) |
| Image Processing | PIL (Pillow) | Image I/O, color conversion |
| Zero-Shot AI | OpenAI CLIP | Semantic impossibility detection |
| Cloud API | huggingface_hub | Push/pull model checkpoints |
| DevOps | Hugging Face Spaces | Serverless Docker hosting |
How we built it
The project followed an iterative, agile workflow in four phases:
Phase 1: Core Brain (Baseline Model)
- Fine-tuned ViT/ResNet on Real vs. Fake dataset.
- Achieved ~85% accuracy but missed obvious artifacts.
Phase 2: Forensic Layer
- Built ForensicAnalyzer with OpenCV.
- Added physics/geometry-based rules.
- Integrated weighted voting with neural scores.
Phase 3: Active Learning
- Added "Correct/Wrong" feedback buttons.
- Real-time optimizer updates (~3 seconds).
- Implemented Experience Replay Buffer to mix old/new data.
Phase 4: Cloud Persistence
- Solved the stateless-hosting problem: without persistence, learned updates were lost whenever the server restarted.
- Local pointer system + Hugging Face Hub sync.
- Auto-bootstrap ensures survival after restarts.
Math Support
Some forensic checks rely on mathematical operations:
- Laplacian Variance (Skin Texture Topology):
$$ \sigma^{2} = \frac{1}{N} \sum_{i=1}^{N} (L_{i} - \mu)^{2} $$
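where \( L_{i} \) is the Laplacian response at pixel \( i \), \( \mu \) is the mean of those responses, and \( N \) is the number of pixels. A low variance indicates unnaturally smooth texture.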
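The variance formula can be evaluated directly on a small grayscale grid. The sketch below uses plain Python lists and a 4-neighbour Laplacian kernel, both chosen for illustration; the project itself uses OpenCV, where the usual idiom is `cv2.Laplacian(gray, cv2.CV_64F).var()`. The function name `laplacian_variance` is an assumption.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Population variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a 2D list of intensities with at least 3 rows and 3 columns.
    """
    h, w = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    # pvariance divides by N, matching the 1/N in the formula.
    return pvariance(responses)
```

A perfectly flat patch scores 0, while a patch with fine detail scores high; a threshold between the two would have to be tuned on real data.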
- Fast Fourier Transform (Spectral Frequency):
$$ F(k) = \sum_{n=0}^{N-1} f(n) \cdot e^{-2\pi i kn / N} $$
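where \( f(n) \) is the input signal of length \( N \). Strong peaks in \( |F(k)| \) away from \( k = 0 \) are used to detect periodic GAN artifacts.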
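To make the transform concrete, the sketch below evaluates the sum exactly as written (an O(N²) direct DFT, not a fast algorithm) on a 1D signal and reports the period of the strongest non-constant component. This is an illustration under assumptions: the function names are hypothetical, and a real image check would apply a 2D FFT from OpenCV or NumPy rather than this loop.

```python
import cmath


def dft_magnitudes(signal):
    """Return |F(k)| for k = 0..N-1 by direct evaluation of the DFT sum."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def dominant_period(signal):
    """Period, in samples, of the strongest frequency bin in 1..N/2."""
    n = len(signal)
    mags = dft_magnitudes(signal)
    # Skip k = 0, which only measures the average brightness.
    k = max(range(1, n // 2 + 1), key=lambda i: mags[i])
    return n / k
```

For a row of pixels containing a faint pattern that repeats every 8 samples, `dominant_period` returns 8.0, which is the kind of regular spacing a grid artifact would produce.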
- Decision Forest Aggregation:
$$ y = \text{sign}\left( \sum_{j=1}^{M} w_{j} \cdot h_{j}(x) \right) $$
where \( h_{j}(x) \) is the output of the \( j \)-th analysis vector, \( w_{j} \) is its weight, and \( M \) is the number of analysis vectors.
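The aggregation rule is a weighted vote. In the sketch below each analysis returns +1 or -1, and the final label is the sign of the weighted sum. The mapping of +1 to "fake" and -1 to "real", the tie-breaking rule, and the function name `aggregate` are assumptions made for the example.

```python
def aggregate(votes, weights):
    """Combine per-analysis votes into one label.

    votes   -- h_j(x) values, each +1 (assumed: fake) or -1 (assumed: real)
    weights -- w_j values, one per vote
    Returns (label, score), where label is the sign of the weighted sum.
    """
    if len(votes) != len(weights):
        raise ValueError("votes and weights must have the same length")
    score = sum(w * h for h, w in zip(votes, weights))
    # A zero score counts as "real" here; that tie-break is arbitrary.
    label = 1 if score > 0 else -1
    return label, score
```

With votes `[1, -1, 1]` and weights `[0.5, 0.3, 0.2]`, the weighted sum is about 0.4, so the label is +1: two weaker signals plus one strong one outvote the single dissenting check.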