Inspiration
Deepfakes are becoming increasingly indistinguishable from reality. Synthetic media is already being weaponized globally for political manipulation, financial fraud, and targeted harassment.
Most deepfake detection systems rely on a single AI model to make a binary decision. I wanted to explore a different approach — a system that behaves more like a digital forensic investigation team, where each specialist analyzes media from a different perspective before reaching a conclusion.
This led me to build VASTAV Agent, a multi-agent artificial intelligence system created specifically for the Gemini Live Agent Challenge.
VASTAV Agent is completely separate from my other creation, VASTAV AI, which is my production deepfake detection platform that uses its own proprietary detection models.
In contrast, VASTAV Agent is an experimental multi-agent architecture built using Google's Agent Development Kit and Gemini models.
What it does
VASTAV Agent uses six independent AI agents powered by Google Gemini and orchestrated using the Google Agent Development Kit (ADK).
Each agent analyzes media from a different forensic perspective:
Agent 1 — Forensic & Biometric Specialist
Analyzes lighting, facial structure, and biometric inconsistencies.
Agent 2 — AI Artifacts & Neural Pattern Expert
Detects diffusion artifacts, GAN fingerprints, and synthetic textures.
Agent 3 — Contextual & Semantic Evaluator
Evaluates scene logic, object relationships, and contextual anomalies.
Agent 4 — Physics, Lighting & Materials Specialist
Analyzes shadows, reflections, material behavior, and physical realism.
Agent 5 — Chief Justice (Holistic Analysis)
Aggregates the reasoning of all agents to evaluate the overall authenticity of the media.
Agent 6 — SynthID & AI Origin Specialist
Searches for AI watermark signals and indicators of synthetic origin.
The system requires at least 4 out of 6 agents to agree before producing a final authenticity verdict.
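The 4-of-6 rule can be sketched in a few lines of TypeScript. This is a minimal illustration, not the production code: the `AgentVerdict` shape and field names are assumptions for the sake of the example.

```typescript
// Illustrative types; the real agent output format may differ.
type Verdict = "REAL" | "FAKE";

interface AgentVerdict {
  agent: string;
  verdict: Verdict;
  confidence: number; // 0..1
}

interface ConsensusResult {
  verdict: Verdict | "NO_CONSENSUS";
  agreeing: number;
  confidence: number;
}

function consensus(verdicts: AgentVerdict[], quorum = 4): ConsensusResult {
  const fake = verdicts.filter(v => v.verdict === "FAKE");
  const real = verdicts.filter(v => v.verdict === "REAL");
  // The larger camp must reach the quorum, otherwise no verdict is issued.
  const majority = fake.length >= real.length ? fake : real;
  if (majority.length < quorum) {
    return { verdict: "NO_CONSENSUS", agreeing: majority.length, confidence: 0 };
  }
  const confidence =
    majority.reduce((sum, v) => sum + v.confidence, 0) / majority.length;
  return { verdict: majority[0].verdict, agreeing: majority.length, confidence };
}
```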
In addition to the verdict, VASTAV Agent generates a detailed forensic PDF report containing confidence scores and reasoning from every agent.
VASTAV Agent also features voice verdict announcement. After all 6 judges reach consensus, the system speaks the final verdict aloud, announcing whether the media is REAL or FAKE, the confidence score, and how many judges agreed. This transforms the experience from a text-based tool into a truly multimodal AI agent that sees, analyzes, and speaks.
Architecture Overview
The system follows a multi-agent ensemble architecture where independent AI agents analyze the same media input and then converge through a consensus mechanism.
Pipeline
User Upload → React Frontend → Node.js Backend → Media Processing Layer (EXIF + FFmpeg) → Parallel ADK Agents → Consensus Engine → Final Verdict → PDF Forensic Report
Each agent runs independently and evaluates different forensic signals before contributing its reasoning to the consensus engine.
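The fan-out step can be sketched as a `Promise.all` over the six specialists. The `analyzeWith` function below is a hypothetical placeholder; in the real system each call would go through the ADK to Gemini with that agent's forensic prompt.

```typescript
// Illustrative result shape; the actual agent response format may differ.
interface Analysis {
  agent: string;
  verdict: "REAL" | "FAKE";
  reasoning: string;
}

// Hypothetical per-agent analyzer. A real implementation would send `media`
// plus the agent's specialist prompt to Gemini and parse the reply.
async function analyzeWith(agent: string, media: Uint8Array): Promise<Analysis> {
  return { agent, verdict: "REAL", reasoning: "stubbed analysis" };
}

const AGENTS = [
  "forensic-biometric",
  "ai-artifacts",
  "contextual-semantic",
  "physics-lighting",
  "chief-justice",
  "synthid-origin",
];

// Fan out to all six specialists at once; each analyzes the same input
// independently before the consensus engine sees any result.
async function runAllAgents(media: Uint8Array): Promise<Analysis[]> {
  return Promise.all(AGENTS.map(agent => analyzeWith(agent, media)));
}
```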
How I built it
- Agent Framework: Google Agent Development Kit (ADK) using the ParallelAgent orchestration pattern
- AI Engine: Google Gemini 2.0 Flash for multimodal reasoning and analysis
- Backend: Node.js, Express.js, and TypeScript
- Frontend: React with Tailwind CSS, Framer Motion, and shadcn/ui
- Hosting: Google Cloud Run
- Media Processing: FFmpeg for video frame extraction and analysis
- Metadata Analysis: EXIF metadata inspection
- Reporting: PDFKit for generating forensic intelligence reports
- Database Layer: Drizzle ORM
All six agents run in parallel, analyzing the same media input independently.
Their outputs are then aggregated through a consensus engine, which determines the final authenticity verdict.
Challenges
Keeping six agents truly independent without overlapping reasoning domains was one of the biggest challenges.
Another challenge was getting Gemini to return consistently structured JSON across six parallel agent calls while keeping latency low; a single malformed response could stall the consensus step.
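One common defensive pattern for this problem is to sanitize and validate each reply before it reaches the consensus engine. The sketch below assumes a reply shape of `{verdict, confidence, reasoning}`; the actual schema used in the project is not shown here.

```typescript
// Assumed reply schema for illustration.
interface AgentReply {
  verdict: "REAL" | "FAKE";
  confidence: number; // 0..1
  reasoning: string;
}

function parseAgentReply(raw: string): AgentReply {
  // Models sometimes wrap JSON in markdown fences; strip them before parsing.
  const cleaned = raw
    .replace(/^```(?:json)?\s*/i, "")
    .replace(/```\s*$/, "")
    .trim();
  const obj = JSON.parse(cleaned);
  // Validate every field so a malformed reply fails loudly, not silently.
  if (obj.verdict !== "REAL" && obj.verdict !== "FAKE") {
    throw new Error("invalid verdict: " + obj.verdict);
  }
  if (typeof obj.confidence !== "number" || obj.confidence < 0 || obj.confidence > 1) {
    throw new Error("confidence must be a number in [0, 1]");
  }
  if (typeof obj.reasoning !== "string") {
    throw new Error("missing reasoning");
  }
  return obj as AgentReply;
}
```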
For video analysis, I designed an optimized FFmpeg frame sampling pipeline so the system could analyze representative frames rather than every frame, significantly improving performance.
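The sampling idea can be illustrated with a small helper that picks evenly spaced timestamps, which an FFmpeg wrapper could then use to extract one frame per timestamp (for example via `ffmpeg -ss <t> -i input.mp4 -frames:v 1 out.jpg`). The function name and the centering choice are mine, not the project's.

```typescript
// Pick `frames` evenly spaced timestamps across a video so FFmpeg can
// extract representative frames instead of decoding every frame.
function sampleTimestamps(durationSec: number, frames: number): number[] {
  if (durationSec <= 0 || frames <= 0) return [];
  const step = durationSec / frames;
  // Center each sample in its slice to avoid black lead-in/lead-out frames.
  return Array.from({ length: frames }, (_, i) => +(step * (i + 0.5)).toFixed(3));
}
```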
What I learned
One important lesson from this project is that no single AI model should make a critical authenticity decision alone.
A multi-agent consensus architecture dramatically reduces false positives because multiple independent analyses must converge before a verdict is issued.
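The intuition can be made concrete with a back-of-the-envelope binomial calculation. Under an idealized assumption that each agent errs independently with the same rate `p` (real agents are never fully independent, so this is optimistic), the chance that at least 4 of 6 err together is:

```typescript
// Binomial coefficient C(n, k), computed iteratively to stay exact for small n.
function binom(n: number, k: number): number {
  let c = 1;
  for (let i = 0; i < k; i++) c = (c * (n - i)) / (i + 1);
  return c;
}

// Probability that at least `quorum` of `n` independent agents all err,
// given a per-agent error rate p (idealized independence assumption).
function ensembleErrorRate(n: number, quorum: number, p: number): number {
  let total = 0;
  for (let k = quorum; k <= n; k++) {
    total += binom(n, k) * p ** k * (1 - p) ** (n - k);
  }
  return total;
}
```

With a hypothetical per-agent error rate of 10%, the 4-of-6 quorum drives the ensemble error rate down to roughly 0.13% under this independence assumption, versus 10% for a single model.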
Working with the ADK ParallelAgent pattern also demonstrated how powerful agent-based AI systems can be for complex reasoning and verification tasks.
Accomplishments that I'm proud of
- Built a 6-agent forensic AI ensemble
- Implemented a consensus-based deepfake detection system
- Created automated forensic PDF intelligence reports
- Implemented parallel AI agent orchestration
- Built video frame extraction and analysis using FFmpeg
Built With
- drizzle-orm
- express.js
- ffmpeg
- framer-motion
- google-adk
- google-cloud-run
- google-gemini
- google-genai-sdk
- node.js
- pdfkit
- react
- shadcn-ui
- tailwindcss
- typescript
- vite