Inspiration

The past few years made one thing painfully clear:

Information trust is now an infrastructure problem.

Deepfakes, miscaptioned videos, AI-generated clickbait, and sensationalized reporting blur truth faster than humans can process it. Traditional fact-checking is:

Slow

Centralized

Dependent on trust in third-party servers

Often too late

Meanwhile, devices are suddenly powerful enough to run models locally.

Epistemiq was born from a simple question:

What if anyone could fact-check media privately, instantly, and offline-first?

Not as a cloud service, but as a browser app you can take anywhere — like a pocket truth lab.

What it does

Core experience

Paste text / upload media / link a video

App cleans content locally

Extracts claims

Validates claims against cached model outputs

Retrieves academic sources

Generates a readable verdict + report

Exports results to PDF

Privacy design: speech data never leaves the device. Claim extraction and verification only touch the cloud when results aren't already cached.

How we built it

Browser AI Layer

window.ai.prompt() → Gemini Nano summarization & prep

Whisper tiny.en via transformers.js (WebAssembly)

Manual 16kHz resampling to avoid silent 48k upsampling (Chrome/Safari quirk)

Graceful fallback: Whisper-only mode when Nano unavailable
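
The manual 16 kHz resampling is easier to see in code. In the app this happens in JavaScript against a forced AudioContext, but since 48 kHz → 16 kHz is an exact 3:1 ratio, a rough pure-Python sketch of the decimation shows the idea (the function name is ours, and real resamplers like soxr use proper polyphase filters rather than this crude averaging):

```python
def resample_48k_to_16k(samples):
    """Crude 3:1 decimation from 48 kHz to 16 kHz.

    Averages each group of three samples, which acts as a very
    rough anti-alias filter. Production pipelines (soxr, ffmpeg)
    use proper filters; this is illustrative only.
    """
    out = []
    for i in range(0, len(samples) - len(samples) % 3, 3):
        out.append((samples[i] + samples[i + 1] + samples[i + 2]) / 3.0)
    return out

# One second of 48 kHz audio becomes one second at 16 kHz:
one_second = [0.0] * 48000
assert len(resample_48k_to_16k(one_second)) == 16000
```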

Backend (minimal, PythonAnywhere)

Flask

yt-dlp + ffmpeg for safe media extraction

SQLite cache (SHA-256 text hashing)
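
The media-extraction step boils down to two commands: yt-dlp pulls the audio, then ffmpeg resamples it to 16 kHz mono for Whisper. A sketch that builds those command lines (the flags are typical yt-dlp/ffmpeg usage, not necessarily our exact invocation):

```python
def build_extract_commands(url, out_base="media"):
    """Build the two-step extraction pipeline as argv lists,
    ready for subprocess.run(). Illustrative flags only."""
    # -x: extract audio only; --audio-format wav: decode to WAV
    download = ["yt-dlp", "-x", "--audio-format", "wav",
                "-o", f"{out_base}.%(ext)s", url]
    # -ar 16000: 16 kHz sample rate; -ac 1: mono (Whisper's input format)
    resample = ["ffmpeg", "-y", "-i", f"{out_base}.wav",
                "-ar", "16000", "-ac", "1", f"{out_base}_16k.wav"]
    return download, resample
```

Passing argv lists (rather than shell strings) to subprocess is also what makes the extraction "safe": the URL can never be interpreted as shell syntax.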

Fact-checking pipeline

User text → SHA-256 hash → check database → if cached, return cached claims → else call cloud LLM → store & reuse
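
That flow, as a minimal Python sketch. The table name, column names, and the extract_with_llm hook are illustrative rather than our actual schema:

```python
import hashlib
import sqlite3

def sha256(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

class ClaimCache:
    """Hash → check database → fallback-to-cloud flow."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS claims (hash TEXT PRIMARY KEY, claims TEXT)")

    def get_claims(self, text, extract_with_llm):
        key = sha256(text)
        row = self.db.execute(
            "SELECT claims FROM claims WHERE hash = ?", (key,)).fetchone()
        if row:                              # cache hit: no cloud call
            return row[0]
        claims = extract_with_llm(text)      # cache miss: call cloud LLM
        self.db.execute("INSERT INTO claims VALUES (?, ?)", (key, claims))
        self.db.commit()                     # store & reuse next time
        return claims
```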

Caching Objects

Article text hash → claims

Claim hash → verdict & research questions

Claim + question hash → full report
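
The three cache objects above come down to deterministic key derivation, one namespace per object type. A hedged sketch (helper names are ours; the separator byte just prevents accidental collisions between concatenated inputs):

```python
import hashlib

def _h(*parts):
    """Deterministic SHA-256 over the parts, NUL-separated so
    ("ab", "c") and ("a", "bc") never produce the same key."""
    m = hashlib.sha256()
    for p in parts:
        m.update(p.encode("utf-8"))
        m.update(b"\x00")
    return m.hexdigest()

# One namespace per cached object type (names are illustrative):
def article_key(article_text):   return _h("article", article_text)   # → claims
def claim_key(claim):            return _h("claim", claim)            # → verdict & questions
def report_key(claim, question): return _h("report", claim, question) # → full report
```

Namespacing the hashes this way is what makes cross-session reuse safe: the same text always maps to the same key, and keys from different stages can never shadow each other.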

Design ethos

Local > cache > cloud (in that order).

Challenges we ran into

Browser model headaches

WebAudio silently upsampling to 48kHz → Whisper failure → solved with a forced AudioContext sample rate + soxr resample

WASM memory limits during long audio

Nano API UX inconsistencies across Chrome channels

Deployment gymnastics

Vercel static frontend + PythonAnywhere backend

CORS + streaming SSE + media blob piping

Fallback behavior when Nano is missing
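
Most of the SSE wrestling was getting the wire format right before Flask ever sees it. A minimal framing helper (the function name is ours; a Flask route would wrap a generator of these frames in a text/event-stream Response):

```python
def sse_event(data, event=None):
    """Frame one server-sent event. Multi-line data becomes
    multiple 'data:' lines, and the blank line terminates the
    event, per the SSE wire format."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    for chunk in str(data).splitlines() or [""]:
        lines.append(f"data: {chunk}")
    return "\n".join(lines) + "\n\n"

# A Flask route would stream these frames, e.g.:
# return Response((sse_event(tok) for tok in llm_stream),
#                 mimetype="text/event-stream")
```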

Database integrity

Preserving consistent hashing logic while refactoring

Ensuring cross-session reuse under rate limits

Human challenge

Hackathon + dev fatigue + model rate limits + deadlines = 💥 (Made it anyway.)

Accomplishments that we're proud of

End-to-end local speech-to-text in browser

Chrome Nano pipeline for text prep working reliably

Deterministic hashing + smart cache layer

YouTube → local STT pipeline without cloud STT

Usable in low-connectivity situations

Maintained privacy guarantees

Full interface polished enough for real users

And honestly — staying sane while debugging 48kHz audio issues at 3AM.

What we learned

Technical

Nano + WASM can power hybrid reasoning pipelines

Device-first AI requires careful UX fallback

ffmpeg resampling quirks are not for the weak

Browser SSE + LLM streaming is surprisingly elegant

Human

Timeboxing matters

Simplicity > “correctness on paper”

Big insight

People don’t just need answers — they need transparent reasoning they can export, trace, and trust.

What's next for Epistemiq

Phase 2 — journalist/security features

Source provenance scoring

PDF cite-trace (auto-link evidence to sources)

Offline evidence cards

Chrome extension mode

Mobile full-offline WASM model pack

Federated cache sharing between team devices

Plugin API for newsroom workflows

Long term vision:

An offline-capable epistemic OS for journalists, researchers & educators.

Epistemiq should feel like a personal research assistant that never tattles on you.

Built With

  • chrome-built-in-ai (gemini-nano via window.ai.prompt)
  • flask
  • html/css
  • javascript
  • openrouter
  • python
  • pythonanywhere
  • sqlite
  • vercel

Note: speech and summaries never go to the cloud; only structured text claims hit the LLM (via OpenRouter) for scientific cross-verification, research questions, and final reports.