The Problem That Started It All
Every day, millions of Indians receive forwarded WhatsApp messages — grainy screenshots, dramatic Hindi headlines, viral videos — and have no quick way to verify if they're true. Fact-checking websites exist, but nobody opens a browser mid-conversation to paste a claim. The misinformation spreads before anyone questions it.
We wanted to meet people where they already are: WhatsApp.
What Verto AI Does
Verto AI is a WhatsApp-based fake news detection system. You forward any suspicious news — a text message, a screenshot, a video clip, a PDF article — to our number, and within seconds you get back a structured verdict:
- ✅ REAL: confirmed by trusted sources, with links
- ❌ FAKE: no credible backing, claim debunked
- ⚠️ MISLEADING: partially true but distorted (wrong location, exaggerated claims, missing context)
Every verdict comes with a confidence score, a plain-language explanation, and actual article links from The Hindu, Times of India, NDTV, or BBC — not summaries, real URLs.
Replies arrive in both English and the user's preferred regional language (Hindi, Marathi, or Gujarati).
How We Built It
The architecture is intentionally layered. Twilio receives the incoming WhatsApp message and fires a POST request to our FastAPI webhook. The webhook identifies the message type — text, image, video, or document — downloads any media, and hands it to a LangChain DeepAgent.
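The message-type check can be sketched roughly as follows. This is a minimal illustration, not our exact handler: it assumes the webhook has already parsed Twilio's form fields into a dict, it reads only the first attachment via the `NumMedia` and `MediaContentType0` fields, and the function name `classify_message` is hypothetical.

```python
from typing import Mapping


def classify_message(form: Mapping[str, str]) -> str:
    """Return 'text', 'image', 'video', or 'document' for one incoming message.

    `form` is assumed to be the parsed form body of the Twilio webhook POST.
    Only the first attachment is inspected in this sketch.
    """
    num_media = int(form.get("NumMedia", "0") or 0)
    if num_media == 0:
        # No attachment: treat the message body as a plain-text claim.
        return "text"

    content_type = form.get("MediaContentType0", "").lower()
    if content_type.startswith("image/"):
        return "image"
    if content_type.startswith("video/"):
        return "video"
    # Anything else (for example application/pdf) is routed as a document.
    return "document"
```

Routing on the MIME type rather than the file extension keeps the check independent of how the sender named the file.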
The agent connects to a custom FastMCP server we built that exposes six tools:
- extract_text_from_image: Gemini Vision reads screenshots and image-based news
- extract_text_from_video: OpenCV samples key frames, Gemini analyzes them
- extract_text_from_document: PyMuPDF extracts text from PDF articles
- web_search_news: DuckDuckGo searches for what actually happened
- search_trusted_sources: restricts search to verified Indian and international outlets
- translate_text: Gemini translates the verdict for regional language users
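For the video tool, one plausible way to decide which frames to hand to the vision model is to pick a small number of evenly spaced frame indices. The sketch below shows only that index arithmetic. Even spacing, the `max_samples` default, and the function name are assumptions for illustration; the real tool may select frames differently, and the OpenCV decoding step is left out.

```python
def sample_frame_indices(total_frames: int, max_samples: int = 8) -> list[int]:
    """Pick up to `max_samples` evenly spaced frame indices from a video.

    Each index is taken from the middle of its segment so that the very
    first frame (often a black or title frame) is not always selected.
    """
    if total_frames <= 0 or max_samples <= 0:
        return []
    count = min(total_frames, max_samples)
    step = total_frames / count
    return [int(i * step + step / 2) for i in range(count)]
```

Capping the number of frames keeps the number of vision calls per video bounded regardless of clip length.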
The agent reasons through the claim step by step: extract the core assertion, search for what actually happened, cross-reference trusted sources, identify discrepancies, and produce a structured JSON verdict. The FastAPI layer parses this, formats it into a readable WhatsApp message, and fires it back through Twilio — all within a single request lifecycle.
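The parse-and-format step could look something like the sketch below. The JSON field names (`verdict`, `confidence`, `explanation`, `sources`) and the 0-to-1 confidence scale are assumptions made for this example; the write-up does not pin down the exact schema.

```python
import json

# Labels mirror the three verdict types described earlier.
LABELS = {
    "REAL": "✅ REAL",
    "FAKE": "❌ FAKE",
    "MISLEADING": "⚠️ MISLEADING",
}


def format_verdict(raw: str) -> str:
    """Turn the agent's JSON verdict into a plain-text WhatsApp reply."""
    data = json.loads(raw)
    verdict = str(data.get("verdict", "")).upper()
    if verdict not in LABELS:
        raise ValueError(f"unexpected verdict: {verdict!r}")

    confidence = float(data.get("confidence", 0.0))  # assumed to be 0.0 to 1.0
    lines = [
        LABELS[verdict],
        f"Confidence: {confidence:.0%}",
        str(data.get("explanation", "")).strip(),
    ]
    sources = data.get("sources", [])
    if sources:
        lines.append("Sources:")
        lines.extend(f"- {url}" for url in sources)
    # Drop empty lines, e.g. when the agent returned no explanation.
    return "\n".join(line for line in lines if line)
```

Rejecting unknown verdict labels early means a malformed agent response fails loudly instead of reaching the user as a half-formatted message.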
The Challenge That Took the Longest
Getting the MCP server to work reliably inside a FastAPI async context was the hardest part. The langchain-mcp-adapters library spawns stdio subprocesses, and the stricter anyio task group enforcement we hit on Python 3.14 kept throwing "RuntimeError: Attempted to exit cancel scope in a different task". We went through three architectural approaches (a persistent lifespan client, thread pool isolation, and per-request spawning) before landing on a clean per-request pattern, with PYTHONDONTWRITEBYTECODE=1 set to prevent WatchFiles from detecting MCP cache writes and triggering reload loops.
The second hard problem was search quality. DuckDuckGo occasionally returns LLM-generated snippets mixed with real results, especially for recent events. We solved this by adding year-context to all queries and a time-limit filter, then falling back to a broader search if the limited one returns nothing.
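The query strategy can be sketched independently of the search backend. In the example below, `search` is a hypothetical callable that takes a query and an optional time-limit string and returns a list of results; the `"w"` (past week) default is a placeholder value, not necessarily the limit we use.

```python
from datetime import date
from typing import Callable, Optional

# Hypothetical backend signature: search(query, timelimit) -> list of results.
SearchFn = Callable[[str, Optional[str]], list]


def search_with_fallback(
    claim: str,
    search: SearchFn,
    timelimit: str = "w",
    today: Optional[date] = None,
) -> list:
    """Search with year context and a time limit, then retry without the limit."""
    year = (today or date.today()).year
    query = f"{claim} {year}"  # year context is added to every query

    results = search(query, timelimit)
    if results:
        return results
    # The time-limited search found nothing: broaden by dropping the limit.
    return search(query, None)
```

Passing the backend in as a callable makes the fallback logic easy to test with a stub, without touching the network.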
What We Learned
Building on WhatsApp via Twilio taught us that webhook reliability is everything. The gap between "app works locally" and "app works when Twilio calls it at 2am with a video attachment" is enormous. Proper logging at every step — not just error logging — is what made debugging possible.
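As an illustration of step-level logging, a small context manager can record the start, end, duration, and any failure of each stage in the request. The logger name and the `log_step` helper are hypothetical; this shows the pattern, not our production code.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("verto.webhook")  # hypothetical logger name


@contextmanager
def log_step(name: str, **context):
    """Log the start, outcome, and duration of one pipeline step."""
    start = time.perf_counter()
    logger.info("start %s %s", name, context)
    try:
        yield
    except Exception:
        # logger.exception also records the traceback.
        logger.exception("failed %s after %.2fs", name, time.perf_counter() - start)
        raise
    else:
        logger.info("done %s in %.2fs", name, time.perf_counter() - start)
```

A step would then be wrapped as `with log_step("download_media", message_sid=sid): ...`, so the log shows which stage a failing request reached, not only that it failed.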
We also learned that multimodal pipelines are only as good as their extraction layer. A verdict is worthless if the vision model misreads the screenshot. Getting the Gemini prompts precise enough to extract only the claim — not surrounding UI chrome or watermarks — took more iteration than the agent logic itself.
What's Next
Voice note analysis, regional language input (not just output), a web dashboard for monitoring flagged claims by geography, and a public API for newsrooms to integrate Verto into their editorial workflows.