💡 Inspiration
In a world where AI-generated misinformation spreads faster than truth, we wanted to build something that could bring trust back to digital media. We realized that most “fake news detectors” stop at analyzing text — but misinformation today comes in every form: videos, images, audio clips, and even subtle deepfakes.
Our team wanted to go beyond traditional classifiers. We asked ourselves:
“What if we could teach an AI to think like a human fact-checker — to see, hear, and reason with the same logic, physics, and biological cues we subconsciously trust?”
That thought became CheckPoint — a multimodal misinformation intelligence system that doesn’t just detect fakes but understands why they’re fake.
⚙️ What It Does
CheckPoint is a unified platform that authenticates and validates information across four media types — Text, Video, Image, and Audio — using a combination of AI forensics, physics-based reasoning, and factual verification.
Here’s how it works:
Video Analysis
Detects whether a video is AI-generated or real using physiological signals like blood-flow (PPG), pupil dilation, and facial micro-movements.
Applies Law of Physics (LOP) reasoning to detect impossible motion or lighting inconsistencies (e.g., a shadow bending against light direction or a flying whale defying gravity).
Scans for AI watermarks or compression fingerprints.
Performs source verification — checking if a known person in the video (e.g., a celebrity) actually made that appearance by comparing with their latest verified media.
Image Analysis
Detects GAN fingerprints and diffusion model artifacts in images.
Identifies hidden watermarks and inconsistent EXIF metadata.
Analyzes physical cues like shadow direction and object lighting consistency.
Audio Analysis
Determines if a voice is human or AI-generated using spectrogram and prosody-based features.
Transcribes the audio into text (using Whisper ASR) and feeds it into the text verification pipeline.
Text Analysis
Extracts the main claim or event from any text or transcript.
Uses real-time news retrieval and ranking to cross-check the claim’s authenticity against verified sources.
Assigns a confidence score (0–100) based on factual agreement and sentiment consistency.
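The scoring step can be illustrated with a minimal sketch. The weights, field names, and the simple averaging rule here are illustrative assumptions, not CheckPoint's actual formula:

```python
# Hypothetical sketch of the claim-scoring step: combine per-source
# factual agreement and sentiment consistency into a 0-100 score.
# Weights are illustrative assumptions.

def confidence_score(agreements, sentiment_match, w_fact=0.7, w_sent=0.3):
    """agreements: list of 1 (source supports the claim) / 0 (contradicts);
    sentiment_match: 0..1 similarity of tone between claim and sources."""
    if not agreements:
        return 0.0
    factual = sum(agreements) / len(agreements)   # fraction of sources agreeing
    score = 100 * (w_fact * factual + w_sent * sentiment_match)
    return round(score, 1)
```

A claim supported by three of four retrieved sources with fairly consistent sentiment would land in the mid-70s under this toy rule.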
Mutation Graph (the brain of CheckPoint)
Acts as a truth ledger, linking all verified facts, videos, images, and audio clips.
If misinformation mutates — for example, the same event being shown differently across platforms — the graph detects the mutation and flags inconsistencies instantly.
🧠 How We Built It
We split our three-person team across parallel workstreams — Video/PPG, Text/Audio, and Image/Integration — and merged everything through a unified backend.
Video Deepfake Detection
We built a PPG feature extractor using OpenCV, face_recognition, and NumPy to capture micro color variations on key facial ROIs (forehead, cheeks, mouth, eyes).
These were converted into frequency-domain features and fed into an ensemble model (LightGBM + XGBoost + Logistic Regression meta classifier).
We fine-tuned hyperparameters with Optuna and achieved ~86% accuracy on our validation set.
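The frequency-domain step can be sketched as follows, assuming the per-frame ROI pixel means have already been extracted (face detection and the ensemble classifier are omitted; the band limits and feature names are illustrative):

```python
import numpy as np

# Minimal sketch of the PPG feature step: given the mean green-channel
# value per frame for one facial ROI, compute frequency-domain features
# in the plausible human pulse band. Band limits are assumptions.

def ppg_features(green_means, fps=30.0):
    """green_means: 1-D array of per-frame mean green values for one ROI."""
    signal = np.asarray(green_means, dtype=float)
    signal = signal - signal.mean()                  # remove DC offset
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    band = (freqs >= 0.7) & (freqs <= 4.0)           # ~42-240 BPM range
    peak_hz = freqs[band][np.argmax(spectrum[band])]
    return {
        "peak_bpm": 60.0 * peak_hz,                  # dominant pulse rate
        "band_energy": float(spectrum[band].sum()),
        "spectral_ratio": float(spectrum[band].sum() / (spectrum.sum() + 1e-9)),
    }
```

Features like these, stacked across multiple ROIs, are what the LightGBM/XGBoost ensemble consumes; real faces show a coherent pulse peak across ROIs, while generated faces typically do not.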
Law of Physics Verification
Integrated YOLOv8 for object tracking and computed acceleration and motion vectors.
Applied physics sanity checks — e.g., constant acceleration under gravity, realistic shadow angles, and collision plausibility.
Image Forensics
Used CNN-based GAN detectors and watermark scanners to catch diffusion-generated imagery.
Built an EXIF parser to validate source authenticity.
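The EXIF consistency logic can be sketched as below. The tag names follow the EXIF standard (as exposed by Pillow's `Image.getexif()`), but the suspicious-software list and the specific rules are illustrative assumptions:

```python
# Hypothetical EXIF consistency check: flag images whose metadata hints
# at AI generation or post-editing. The generator-name list is an
# illustrative assumption, not an exhaustive one.

SUSPICIOUS_SOFTWARE = ("stable diffusion", "midjourney", "dall-e", "firefly")

def exif_flags(exif):
    """exif: dict of EXIF tag name -> value. Returns warning strings."""
    flags = []
    software = str(exif.get("Software", "")).lower()
    if any(s in software for s in SUSPICIOUS_SOFTWARE):
        flags.append(f"generator software tag: {software}")
    if "Make" not in exif and "Model" not in exif:
        flags.append("no camera make/model recorded")
    if exif.get("DateTime") != exif.get("DateTimeOriginal", exif.get("DateTime")):
        flags.append("edit timestamp differs from capture timestamp")
    return flags
```

Missing or contradictory metadata is never proof by itself, so these flags only feed into the overall image confidence score.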
Audio + Text Verification
Transcribed speech using Whisper, then checked truthfulness using our text-based retriever-verifier pipeline.
Text verification uses BM25 + SBERT re-ranker with fact-check APIs and Google News search.
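The first retrieval stage can be illustrated with a small pure-Python BM25 (the real pipeline uses a library BM25 implementation plus an SBERT re-ranker over the top hits; `k1` and `b` below are the usual defaults):

```python
import math
from collections import Counter

# Toy first-stage retrieval for the claim-checking pipeline: score
# tokenized evidence documents against a tokenized claim with BM25.
# The SBERT re-ranking stage is omitted from this sketch.

def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices ranked by BM25 relevance to the query."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(term for d in docs for term in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
            s += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return sorted(range(N), key=lambda i: -scores[i])
```

BM25 gives a cheap lexical shortlist; the semantic re-ranker then decides which retrieved sources actually support or contradict the claim.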
Mutation Graph
Designed a graph database linking every verified claim, clip, or article.
When new content appears, it automatically cross-references it with existing verified nodes.
Frontend Integration
CheckPoint exposes a FastAPI REST interface where users can upload a file (video, image, or audio) or paste text.
The backend returns a comprehensive JSON verdict — with confidence scores per modality and links to verified evidence.
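The shape of that verdict can be sketched as a plain aggregation function (the FastAPI route wraps something like this; the field names and the simple averaging rule are illustrative assumptions):

```python
import json

# Sketch of the verdict aggregation behind the API response: combine
# per-modality confidence scores into one JSON-serializable body.
# Field names and the averaging rule are illustrative assumptions.

def build_verdict(modality_scores, evidence_links=()):
    """modality_scores: dict like {"video": 82.0, "text": 64.0},
    each a 0-100 confidence that the content is authentic."""
    overall = sum(modality_scores.values()) / len(modality_scores)
    return {
        "verdict": "likely_fake" if overall < 50 else "likely_real",
        "overall_confidence": round(overall, 1),
        "per_modality": modality_scores,
        "evidence": list(evidence_links),
    }
```

Keeping per-modality scores in the response is what makes the verdict interpretable: a user can see that, say, the audio passed but the PPG check failed.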
⚔️ Challenges We Ran Into
Feature Alignment Hell — Each model (LightGBM, PCA, Logistic Regression) expected slightly different feature counts. Getting the CSVs, pipelines, and feature scalers perfectly aligned was surprisingly complex.
Physiological Signal Extraction — Detecting real PPG in compressed videos was tough. Lighting noise and motion distortion often broke the BPM detection pipeline, forcing us to design ROI consistency checks and signal fusion methods.
Physics-Based Reasoning — Teaching a model to "understand" physical impossibility required a balance of heuristics and ML-based motion estimation.
Cross-Modal Integration — Unifying results from four different modalities into one coherent verdict (with interpretability) was a massive data-engineering challenge.
Time Constraints — Building four modalities within 17 days, while ensuring everything communicates through a common backend, was a real race against the clock.
🏆 Accomplishments That We’re Proud Of
Built one of the few working biologically inspired deepfake detectors leveraging PPG and micro-expression signals.
Designed a scalable mutation graph architecture that unifies video, text, image, and audio verification into one ecosystem.
Achieved ~86% accuracy in distinguishing real vs fake videos — using purely explainable biological and visual features (not black-box deep nets).
Completed integration across all modalities with a functioning API and demo-ready front-end.
Most importantly — proved that AI can defend against AI when built with transparency and ethics.
🎓 What We Learned
How to bridge computer vision, NLP, and signal processing into a unified reasoning system.
The importance of explainability in AI forensics — it’s not enough to say “fake”; we must show why.
The real-world complexity of biological signals (PPG) and how deepfake generators distort subtle but measurable cues.
That misinformation is not a single-modality problem — it’s an ecosystem challenge that demands interdisciplinary thinking.
And finally, how to work as a tight-knit team across completely different technical domains under extreme time pressure.
🔮 What’s Next for CheckPoint
We plan to evolve CheckPoint into a public misinformation verification platform — an accessible browser or API tool that can verify any media link in seconds.
Our roadmap:
🔁 Real-time API — scalable backend for real-time uploads and streaming verification.
🧩 Mutation Graph Expansion — connect with live fact-check databases and social media APIs to track misinformation spread.
💬 Explainable AI Dashboard — visualize which part of a video or text caused a fake flag (e.g., showing inconsistent facial rhythm heatmaps).
🌍 Open-Source Contribution — release a lightweight open model for educational and journalism use.
🔊 Multilingual Support — expand audio/text verification to non-English languages using multilingual ASR and NLP.
Built With
- ai
- api
- cloud
- data-processing
- dataset-preprocessing
- docker
- fastapi
- haystack
- image
- lightgbm
- neo4j
- news
- openai
- opencv
- optuna
- postgresql
- python
- pytorch
- react
- react-native
- retriever-reader-pipelines
- sbert
- scikit-learn
- tensorflow
- vertex
- whisper
- xgboost
- yolov8