TruChain: Fighting Deepfakes with Blockchain and AI

Inspiration 💡

We live in an era where seeing is no longer believing. Deepfake videos of elected officials can go viral, putting words in their mouths that they never said. By the time fact-checkers catch up, millions of people have already watched. One question hit us hard:

What happens when we can't trust videos of our own elected officials?

Traditional deepfake detection is like playing whack-a-mole — forensic tools analyze videos looking for glitches, but deepfakes keep getting better. We realized we needed a completely different approach:

Instead of asking “is this fake?”, we should be asking “can we prove this is real?”

That's how TruChain was born — a system where authenticity comes from the source, not from after-the-fact analysis.


What We Built 🏗️

TruChain is a three-layer video verification system:

Layer 1: Blockchain Provenance

When an official’s media team uploads a speech video, we hash it, store the file on IPFS, and register the hash and IPFS CID on the Solana blockchain.
This creates an immutable, timestamped record:

“This video existed at this exact moment, uploaded by this exact authority.”
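
A minimal sketch of the hashing step, assuming SHA-256 (the writeup does not name the hash function) and a hypothetical helper name `hash_video`. Reading in chunks keeps memory use flat for long speeches, and changing even one byte of the file produces a different digest:

```python
import hashlib
import io


def hash_video(stream, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a binary stream, read in 1 MiB chunks.

    SHA-256 is an assumption for illustration; any collision-resistant
    hash would serve the same role.
    """
    digest = hashlib.sha256()
    # iter() with a sentinel stops when read() returns empty bytes (EOF).
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# In-memory stand-ins for video files, so the sketch runs without real media.
original = hash_video(io.BytesIO(b"official speech bytes"))
tampered = hash_video(io.BytesIO(b"official speech bytez"))
print(original != tampered)  # True
```

With a real file, the same function would be called as `hash_video(open(path, "rb"))`.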

Layer 2: AI Verification

When someone encounters a suspicious clip online, our AI service runs three checks:

  • Transcribes the clip using Whisper
  • Searches for matching text across all official videos
  • Uses Wav2Vec2 speaker verification to confirm the voice

This catches both content manipulation and voice deepfakes.

Layer 3: Social Consensus

A network of whitelisted endorsers (trusted journalists, party officials, etc.) vote on whether videos they receive match what's on-chain.
Two out of three votes mark a video as Authentic.
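
The two-of-three rule can be sketched as a simple tally. This is an off-chain illustration in Python, not the on-chain program; the function name, the `"Pending"` label, and the vote mapping are hypothetical:

```python
def consensus_status(votes, endorser_count=3, threshold=2):
    """Return "Authentic" once enough endorsers have approved.

    votes maps an endorser id to True (the file matches the on-chain
    record) or False. The "Pending" label is a placeholder; the writeup
    only names the Authentic outcome.
    """
    if len(votes) > endorser_count:
        raise ValueError("more votes than whitelisted endorsers")
    approvals = sum(1 for approved in votes.values() if approved)
    return "Authentic" if approvals >= threshold else "Pending"


print(consensus_status({"journalist_a": True, "official_b": True}))   # Authentic
print(consensus_status({"journalist_a": True, "official_b": False}))  # Pending
```

Keying votes by endorser id means a repeated vote from the same endorser overwrites the earlier one instead of counting twice.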

Regular users can flag clips as misleading or out-of-context — because sometimes real clips can be used deceptively.


How We Built It 🛠️

Our tech stack came together like a puzzle:

  • Solana blockchain (Rust + Anchor)
  • React frontend with four role-based views: admin, official, endorser, public feed
  • Node.js backend for IPFS uploads + routing
  • Python FastAPI AI service running Whisper + Wav2Vec2

The most interesting challenge was the hybrid verification algorithm.

We needed to solve:

“Given a 30-second clip, find where it appears in hours of official speeches.”

Our solution:

  • Perform word-level timestamp matching with a sliding window
  • Extract that segment from the original video
  • Compare voice embeddings
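
The sliding-window step can be sketched as follows. This is an illustration under assumptions: the transcript format (a list of `(word, start_sec, end_sec)` tuples), the function name `find_clip`, and the 0.8 similarity cutoff are hypothetical, and `difflib.SequenceMatcher` stands in for whatever fuzzy matching the real service uses:

```python
from difflib import SequenceMatcher


def find_clip(clip_words, speech_words, min_ratio=0.8):
    """Locate a clip's words inside a longer word-level transcript.

    speech_words is a list of (word, start_sec, end_sec) tuples.
    Returns (start_sec, end_sec, ratio) for the best window, or None
    if no window reaches min_ratio.
    """
    n = len(clip_words)
    if n == 0 or n > len(speech_words):
        return None
    clip = [w.lower() for w in clip_words]
    best = None
    # Slide a window of the clip's length across the full transcript.
    for i in range(len(speech_words) - n + 1):
        window = speech_words[i:i + n]
        words = [w.lower() for w, _, _ in window]
        ratio = SequenceMatcher(None, clip, words).ratio()
        if best is None or ratio > best[2]:
            best = (window[0][1], window[-1][2], ratio)
    return best if best[2] >= min_ratio else None


speech = [
    ("we", 0.0, 0.2), ("will", 0.2, 0.5), ("build", 0.5, 0.9),
    ("new", 0.9, 1.1), ("schools", 1.1, 1.6), ("this", 1.6, 1.8),
    ("year", 1.8, 2.2),
]
print(find_clip(["build", "new", "schools"], speech))  # (0.5, 1.6, 1.0)
print(find_clip(["raise", "all", "taxes"], speech))    # None
```

The returned start and end times are what the next step would use to cut the matching audio segment out of the original video for voice comparison.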

This results in three verification states:

  • Full Verification — content matches and speaker matches → Authentic
  • ⚠️ Content-Only — text matches but voice doesn’t → Possible deepfake
  • Not Verified — no match in database → Unknown source
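
A sketch of how the three states could be derived from the two checks. Cosine similarity between embeddings is a common choice for speaker comparison, but the 0.75 threshold and the function names here are assumptions, not values from the project:

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verification_state(content_match, clip_embedding=None,
                       source_embedding=None, speaker_threshold=0.75):
    """Map the content and speaker checks onto the three states.

    speaker_threshold is a hypothetical cutoff; a real system would
    tune it on labelled data.
    """
    if not content_match:
        return "Not Verified"
    similarity = cosine_similarity(clip_embedding, source_embedding)
    if similarity >= speaker_threshold:
        return "Full Verification"
    return "Content-Only"


print(verification_state(True, [1.0, 0.0], [1.0, 0.0]))  # Full Verification
print(verification_state(True, [1.0, 0.0], [0.0, 1.0]))  # Content-Only
print(verification_state(False))                         # Not Verified
```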

Challenges We Faced 🧗

1. The PDA Puzzle

Solana’s PDAs were a brain-bender.
We needed deterministic account addresses for videos, using seeds like: [b"video", official_pubkey, video_hash]

After multiple “invalid seeds” errors at 2 AM, it finally clicked.
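
One pitfall with seeds like these is the size limit: Solana allows at most 16 seeds of at most 32 bytes each. A raw SHA-256 digest is exactly 32 bytes, but its hex string is 64 bytes and gets rejected. The sketch below only checks the seed layout in Python; it does not derive the address, and the helper name `video_seeds` and the choice of SHA-256 are assumptions. Whether this was the cause of the errors described above is not stated in the writeup:

```python
import hashlib

MAX_SEED_LEN = 32  # Solana's per-seed limit, in bytes
MAX_SEEDS = 16     # Solana's limit on the number of seeds


def video_seeds(official_pubkey, video_hash):
    """Build the seed list [b"video", official_pubkey, video_hash].

    Raises ValueError if any seed would exceed Solana's size limits.
    This validates the layout only; it does not compute the PDA itself.
    """
    seeds = [b"video", official_pubkey, video_hash]
    if len(seeds) > MAX_SEEDS:
        raise ValueError("too many seeds")
    for seed in seeds:
        if len(seed) > MAX_SEED_LEN:
            raise ValueError(f"seed of {len(seed)} bytes exceeds {MAX_SEED_LEN}")
    return seeds


pubkey = bytes(32)  # placeholder for a 32-byte public key
raw_digest = hashlib.sha256(b"video bytes").digest()            # 32 bytes: fits
hex_digest = hashlib.sha256(b"video bytes").hexdigest().encode()  # 64 bytes: too long

print([len(s) for s in video_seeds(pubkey, raw_digest)])  # [5, 32, 32]
```

Passing `hex_digest` instead of `raw_digest` raises `ValueError`, which is why the hash has to be passed as raw bytes on both the client and the program side.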

2. The Endorser Dilemma

Originally, endorsers could vote just by checking hashes.
But how do we know they actually have the authentic video?

We refactored the entire workflow: endorsers must upload the actual file before voting.

3. AI Performance

Running Whisper on full speeches was painfully slow.
We added aggressive caching — transcribe once, search many times.

Verification now runs in 4–7 seconds instead of minutes.
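
The "transcribe once, search many times" idea can be sketched as a cache keyed by the video's content hash. The class name, the in-memory dict, and the stub transcriber are all hypothetical; a real service would call Whisper and would likely persist transcripts to disk or a database:

```python
import hashlib


class TranscriptCache:
    """Cache transcripts by content hash so each video is transcribed once."""

    def __init__(self, transcribe):
        self._transcribe = transcribe  # the slow function, e.g. a Whisper call
        self._store = {}
        self.misses = 0  # how many times the slow function actually ran

    def get(self, video_bytes):
        key = hashlib.sha256(video_bytes).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._transcribe(video_bytes)
        return self._store[key]


def fake_transcribe(video_bytes):
    # Stand-in for a speech-to-text call, so the sketch runs without a model.
    return f"transcript of {len(video_bytes)} bytes"


cache = TranscriptCache(fake_transcribe)
first = cache.get(b"full speech video")
second = cache.get(b"full speech video")  # served from the cache
print(cache.misses)  # 1
```

Keying on the content hash rather than a filename means a re-uploaded copy of the same video reuses the existing transcript.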

4. The Three-Layer Balance

We debated what belongs on-chain vs. off-chain.

Final decision:

  • On-chain: protocol-level authenticity
  • Off-chain: community flags + context

Blockchain is the source of truth; SQLite handles the social layer.


What We Learned

Blockchain Architecture

  • Program Derived Addresses (PDAs) are powerful primitives for deterministic account generation using seeds = [b"video", official_pubkey, video_hash]
  • Account space optimization is critical on Solana—every byte costs rent, requiring precise calculations: $$\text{Space} = 8_{\text{discriminator}} + 32_{\text{pubkey}} + 32_{\text{hash}} + 64_{\text{CID}} + 4_{\text{vec}} + 99_{\text{votes}} + 2_{\text{metadata}}$$
  • On-chain governance through threshold voting ($\geq \frac{2}{3}$ endorsers for consensus) provides decentralized trust without oracles
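
As a worked check of the space formula above, the terms sum to 241 bytes. The field names in this sketch are illustrative labels for those terms. The reading of the 99-byte votes term as three entries of 33 bytes each (a 32-byte endorser key plus a 1-byte vote) is a guess that happens to fit the number, not something the writeup states:

```python
# Byte sizes taken from the terms of the space formula.
ACCOUNT_FIELDS = {
    "discriminator": 8,
    "official_pubkey": 32,
    "video_hash": 32,
    "ipfs_cid": 64,
    "vec_length_prefix": 4,
    "votes": 99,
    "metadata": 2,
}

ACCOUNT_SPACE = sum(ACCOUNT_FIELDS.values())
print(ACCOUNT_SPACE)  # 241

# Assumed breakdown of the votes term: 3 votes x (32-byte pubkey + 1-byte flag).
ASSUMED_MAX_VOTES = 3
ASSUMED_VOTE_SIZE = 32 + 1
print(ASSUMED_MAX_VOTES * ASSUMED_VOTE_SIZE == ACCOUNT_FIELDS["votes"])  # True
```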

System Design Tradeoffs

  • What belongs on-chain: Provenance data (hashes, timestamps, votes) → immutable
  • What stays off-chain: Social signals (user flags, comments) → mutable and cheap
  • Hybrid architecture balances decentralization with practicality

Cross-Stack Integration

  • Connecting Rust smart contracts → TypeScript frontend → Python AI services required careful serialization and error handling across language boundaries
  • IPFS content addressing (each CID encodes a hash of the content, so conceptually $\text{CID} = \text{Hash}(\text{content})$) provides decentralized storage that pairs naturally with blockchain's trustless model

Closing Thoughts

We built TruChain in a few intense hours, fueled by the belief that democracy needs trustworthy information. It’s not perfect — but it’s a start.

In the age of AI-generated everything, proving authenticity at the source might be our best defense.

