
TruChain: Fighting Deepfakes with Blockchain and AI

Inspiration 💡

We live in an era where seeing is no longer believing. Deepfake videos of elected officials often go viral, showing them saying things they never said. By the time fact-checkers catch a fake, millions have already seen it. That realization hit us hard:

What happens when we can't trust videos of our own elected officials?

Traditional deepfake detection is like playing whack-a-mole — forensic tools analyze videos looking for glitches, but deepfakes keep getting better. We realized we needed a completely different approach:

Instead of asking “is this fake?”, we should be asking “can we prove this is real?”

That's how TruChain was born — a system where authenticity comes from the source, not from after-the-fact analysis.

What We Built 🏗️

TruChain is a three-layer video verification system:

Layer 1: Blockchain Provenance

When an official’s media team uploads a speech video, we hash it, store the file on IPFS, and register the hash and the IPFS CID on the Solana blockchain.
This creates an immutable, timestamped record:

“This video existed at this exact moment, uploaded by this exact authority.”
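A minimal sketch of the hashing step, for illustration only. It assumes SHA-256 (the write-up only says the hash is 32 bytes) and a hypothetical `ProvenanceRecord` shape; the IPFS upload and the Solana transaction are not performed here, and a real on-chain timestamp would come from the chain's clock rather than the client.

```python
import hashlib
import io
import time
from dataclasses import dataclass


def hash_video(stream, chunk_size=1 << 20):
    """Return the SHA-256 digest of a binary stream, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.digest()


@dataclass(frozen=True)
class ProvenanceRecord:
    # Hypothetical record shape; the real account layout may differ.
    video_hash: bytes   # 32-byte content hash
    ipfs_cid: str       # CID returned by the IPFS upload (not done here)
    authority: str      # public key of the uploading official
    registered_at: int  # Unix timestamp (client-side placeholder)


def build_record(stream, ipfs_cid, authority, now=None):
    """Bundle the fields that would be sent to the on-chain program."""
    timestamp = int(time.time()) if now is None else int(now)
    return ProvenanceRecord(hash_video(stream), ipfs_cid, authority, timestamp)


# Usage with an in-memory stand-in for a video file.
record = build_record(io.BytesIO(b"fake video bytes"), "example-cid", "example-authority")
print(record.video_hash.hex())
```

Reading in chunks keeps memory use flat even for long speeches, and any single-byte change to the file produces a different hash.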

Layer 2: AI Verification

When someone encounters a suspicious clip online, our AI service checks it against the registered originals in three steps:

Transcribes the clip using Whisper
Searches for matching text across all official videos
Uses Wav2Vec2 speaker verification to confirm the voice

This catches both content manipulation and voice deepfakes.

Layer 3: Social Consensus

A network of whitelisted endorsers (trusted journalists, party officials, etc.) votes on whether the videos they receive match what's on-chain.
Two out of three votes mark a video as Authentic.
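The threshold rule can be sketched as follows. This is an off-chain illustration, not the on-chain program; it assumes the threshold is measured against the endorsers assigned to a video, and the "Pending" label is a hypothetical name for the not-yet-reached state.

```python
from fractions import Fraction

# Two thirds, kept as an exact fraction to avoid floating-point edge cases.
THRESHOLD = Fraction(2, 3)


def consensus_status(votes, total_endorsers):
    """Return "Authentic" once approvals reach the two-thirds threshold.

    votes maps an endorser id to True (matches the on-chain record)
    or False (does not match).
    """
    if total_endorsers <= 0:
        raise ValueError("total_endorsers must be positive")
    approvals = sum(1 for approved in votes.values() if approved)
    if Fraction(approvals, total_endorsers) >= THRESHOLD:
        return "Authentic"
    return "Pending"  # hypothetical label for "threshold not reached"


print(consensus_status({"alice": True, "bob": True, "carol": False}, 3))
```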

Regular users can flag clips as misleading or out-of-context — because sometimes real clips can be used deceptively.

How We Built It 🛠️

Our tech stack came together like a puzzle:

Solana blockchain (Rust + Anchor)
React frontend with four role-based views: admin, official, endorser, public feed
Node.js backend for IPFS uploads + routing
Python FastAPI AI service running Whisper + Wav2Vec2

The most interesting challenge was the hybrid verification algorithm.

We needed to solve:

“Given a 30-second clip, find where it appears in hours of official speeches.”

Our solution:

Perform word-level timestamp matching with a sliding window
Extract that segment from the original video
Compare voice embeddings
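The first of these steps can be sketched as follows. This is a minimal illustration that assumes the transcripts are already available as lowercase-comparable words with start and end times; the `find_clip` name, the 0.8 similarity cutoff, and the use of `difflib` for fuzzy matching are assumptions, not details from our service.

```python
from difflib import SequenceMatcher


def find_clip(clip_words, official_words, min_ratio=0.8):
    """Slide a clip-sized window over an official transcript.

    clip_words: list of words from the suspicious clip.
    official_words: list of (word, start_sec, end_sec) tuples.
    Returns (start_sec, end_sec, ratio) for the best window,
    or None if no window reaches min_ratio.
    """
    n = len(clip_words)
    if n == 0 or n > len(official_words):
        return None
    target = [w.lower() for w in clip_words]
    best = None
    for i in range(len(official_words) - n + 1):
        window = official_words[i:i + n]
        candidate = [w.lower() for w, _, _ in window]
        ratio = SequenceMatcher(None, target, candidate).ratio()
        if best is None or ratio > best[2]:
            best = (window[0][1], window[-1][2], ratio)
    if best is not None and best[2] >= min_ratio:
        return best
    return None


official = [
    ("we", 0.0, 0.2), ("will", 0.2, 0.5), ("build", 0.5, 0.9),
    ("new", 0.9, 1.1), ("schools", 1.1, 1.6), ("this", 1.6, 1.8),
    ("year", 1.8, 2.2),
]
print(find_clip(["build", "new", "schools"], official))
```

The returned start and end times are what the next step would use to cut the matching segment out of the original video. The fuzzy ratio tolerates small transcription errors, at the cost of comparing every window.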

This results in three verification states:

✅ Full Verification — content matches and speaker matches → Authentic
⚠️ Content-Only — text matches but voice doesn’t → Possible deepfake
❌ Not Verified — no match in database → Unknown source
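A sketch of how the three states could be decided from a text-match result and two voice embeddings. The state names, the cosine-similarity comparison, and the 0.75 threshold are illustrative assumptions; a real threshold would need tuning on actual speaker data.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verification_state(text_match, clip_emb=None, source_emb=None,
                       voice_threshold=0.75):
    """Map the two checks onto the three verification states."""
    if not text_match:
        return "NOT_VERIFIED"          # no match in the database
    if clip_emb is not None and source_emb is not None:
        if cosine_similarity(clip_emb, source_emb) >= voice_threshold:
            return "FULL_VERIFICATION"  # content and speaker both match
    return "CONTENT_ONLY"              # text matches, voice does not


print(verification_state(True, [0.9, 0.1, 0.3], [0.88, 0.12, 0.31]))
```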

Challenges We Faced 🧗

1. The PDA Puzzle

Solana’s Program Derived Addresses (PDAs) were a brain-bender.
We needed deterministic account addresses for videos, using seeds like: [b"video", official_pubkey, video_hash]

After multiple “invalid seeds” errors at 2 AM, it finally clicked.
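For readers new to PDAs, here is a dependency-free sketch of how such an address is derived from seeds. It mirrors the documented scheme (SHA-256 over the seeds, a bump byte, the program id, and a fixed marker, retried until the result is not a valid ed25519 point), but it is for illustration only: real code should rely on Anchor's `seeds` constraint or a Solana client library, and the program id and keys below are placeholders.

```python
import hashlib

P = 2**255 - 19                              # ed25519 field prime
D = (-121665 * pow(121666, P - 2, P)) % P    # ed25519 curve constant d
MAX_SEED_LEN = 32                            # Solana limit per seed
MAX_SEEDS = 16                               # Solana limit, bump included
PDA_MARKER = b"ProgramDerivedAddress"


def is_on_curve(candidate):
    """True if the 32 bytes decompress to a point on the ed25519 curve."""
    y = (int.from_bytes(candidate, "little") & ((1 << 255) - 1)) % P
    u = (y * y - 1) % P
    v = (D * y * y + 1) % P
    # x^2 = u / v is solvable iff u * v is zero or a quadratic residue.
    return pow(u * v % P, (P - 1) // 2, P) != P - 1


def find_program_address(seeds, program_id):
    """Return (address, bump) for the first off-curve hash, bump from 255 down."""
    if len(seeds) >= MAX_SEEDS:
        raise ValueError("too many seeds (one slot is reserved for the bump)")
    if any(len(seed) > MAX_SEED_LEN for seed in seeds):
        raise ValueError("each seed must be at most 32 bytes")
    for bump in range(255, -1, -1):
        data = b"".join(seeds) + bytes([bump]) + program_id + PDA_MARKER
        address = hashlib.sha256(data).digest()
        if not is_on_curve(address):
            return address, bump
    raise RuntimeError("no valid bump found")


official_pubkey = bytes(range(32))                    # placeholder key
video_hash = hashlib.sha256(b"speech.mp4").digest()   # 32-byte seed
program_id = bytes([7]) * 32                          # placeholder program id
address, bump = find_program_address([b"video", official_pubkey, video_hash], program_id)
print(address.hex(), bump)
```

Note that a 32-byte hash is exactly the maximum seed length, so the raw hash fits as a seed but a hex-encoded one (64 bytes) would not.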

2. The Endorser Dilemma

Originally endorsers could vote by checking hashes.
But how do we know they actually have the authentic video?

We refactored the entire workflow: endorsers must upload the actual file before voting.

3. AI Performance

Running Whisper on full speeches was painfully slow.
We added aggressive caching — transcribe once, search many times.

Verification now runs in 4–7 seconds instead of minutes.
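The "transcribe once, search many times" idea can be sketched with a small content-addressed cache. The `TranscriptCache` class and the in-memory dictionary are illustrative assumptions; the transcriber is passed in as a plain function so the example runs without loading Whisper.

```python
import hashlib


class TranscriptCache:
    """Cache transcripts by content hash so each video is transcribed once."""

    def __init__(self, transcribe):
        self._transcribe = transcribe  # any function: bytes -> transcript
        self._store = {}               # in-memory; a real service might persist this

    def get(self, video_bytes):
        key = hashlib.sha256(video_bytes).hexdigest()
        if key not in self._store:
            self._store[key] = self._transcribe(video_bytes)
        return self._store[key]


calls = []


def slow_transcribe(video_bytes):
    """Stand-in for a speech-to-text call; records how often it runs."""
    calls.append(len(video_bytes))
    return "we will build new schools this year"


cache = TranscriptCache(slow_transcribe)
cache.get(b"official speech bytes")
cache.get(b"official speech bytes")  # served from the cache
print(len(calls))
```

Keying on the content hash rather than a filename means a re-uploaded copy of the same video also hits the cache.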

4. The Three-Layer Balance

We debated what belongs on-chain vs. off-chain.

Final decision:

On-chain: protocol-level authenticity
Off-chain: community flags + context

Blockchain is the source of truth; SQLite handles the social layer.

What We Learned

Blockchain Architecture

Program Derived Addresses (PDAs) are powerful primitives for deterministic account generation using seeds = [b"video", official_pubkey, video_hash]
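Account space optimization is critical on Solana because every byte costs rent, so the account size must be calculated precisely: $$\text{Space} = \underbrace{8}_{\text{discriminator}} + \underbrace{32}_{\text{pubkey}} + \underbrace{32}_{\text{hash}} + \underbrace{64}_{\text{CID}} + \underbrace{4}_{\text{vec}} + \underbrace{99}_{\text{votes}} + \underbrace{2}_{\text{metadata}} = 241 \text{ bytes}$$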
On-chain governance through threshold voting ($\geq \frac{2}{3}$ endorsers for consensus) provides decentralized trust without oracles
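To make the space arithmetic concrete, the sketch below sums the field sizes listed above and estimates the rent-exempt deposit. The field names are illustrative, and the rent constants (3,480 lamports per byte-year, a 2-year exemption window, 128 bytes of per-account overhead) are Solana's defaults at the time of writing, hardcoded here as an assumption; production code should query the cluster with `getMinimumBalanceForRentExemption` instead.

```python
# Field sizes in bytes, taken from the space breakdown above.
ACCOUNT_FIELDS = {
    "discriminator": 8,
    "official_pubkey": 32,
    "video_hash": 32,
    "ipfs_cid": 64,
    "vec_length_prefix": 4,
    "votes": 99,
    "metadata": 2,
}
ACCOUNT_SPACE = sum(ACCOUNT_FIELDS.values())

# Assumed default rent parameters; these can change with cluster settings.
LAMPORTS_PER_BYTE_YEAR = 3_480
EXEMPTION_YEARS = 2
ACCOUNT_STORAGE_OVERHEAD = 128
LAMPORTS_PER_SOL = 1_000_000_000


def rent_exempt_minimum(space):
    """Lamports needed to keep an account of `space` bytes rent-exempt."""
    return (ACCOUNT_STORAGE_OVERHEAD + space) * LAMPORTS_PER_BYTE_YEAR * EXEMPTION_YEARS


lamports = rent_exempt_minimum(ACCOUNT_SPACE)
print(ACCOUNT_SPACE, lamports, lamports / LAMPORTS_PER_SOL)
```

Under these assumptions, each extra byte in the account adds 6,960 lamports to the deposit, which is why trimming unused fields matters.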

System Design Tradeoffs

What belongs on-chain: Provenance data (hashes, timestamps, votes) → immutable
What stays off-chain: Social signals (user flags, comments) → mutable and cheap
Hybrid architecture balances decentralization with practicality

Cross-Stack Integration

Connecting Rust smart contracts → TypeScript frontend → Python AI services required careful serialization and error handling across language boundaries
IPFS content addressing ($\text{CID} = \text{Hash}(\text{content})$) provides decentralized storage that pairs naturally with blockchain's trustless model