Inspiration

Misinformation is rampant on social media. News breaks so quickly that few readers have time to fact-check what they see. We decided to change that by building SpyGlass for one of the largest news platforms today: X.

What It Does

SpyGlass is a Chrome extension that adds inline fact-check verdicts to every tweet on X. When you scroll your timeline, each tweet gets a small colored pill — Likely True, Likely False, Misleading, Unverifiable, or Opinion. More importantly, each individual clause inside the tweet is underlined in the color of its own verdict.

Hovering over any underlined phrase shows a tooltip with the specific claim that was extracted and the reasoning behind its verdict. So instead of seeing a single label for a tweet full of mixed claims, you can see exactly which sentence was flagged and why.

Under the hood, every tweet is decomposed into atomic claims, each claim is independently verified against Google Search, and the results are cached so the second visitor to a tweet gets an instant answer.

How I Built It

  • Chrome extension (MV3) built with Plasmo, TypeScript, and React 18
    • Content script detects tweets via article[data-testid="tweet"]
    • Extracts text and metadata
    • Renders an inline VerdictBadge after each tweet
  • FastAPI backend exposing /tweets/check
  • Three-stage pipeline per tweet:
    1. Neutralize: rewrite tone-heavy tweets into neutral statements
    2. Extract claims: decompose into atomic, independently checkable assertions, each tagged with a verbatim source_span from the original tweet
    3. Verify: fan out one grounded Gemini call per claim, using Google Search as a tool, to reach a verdict and collect citations
  • Google Vertex AI / Gemini 2.5 Flash for all three stages
    • Google Search grounding is used on the verify step
  • Supabase (Postgres + RLS) as the cache and data store
    • Repeat visitors and cross-user traffic are served with zero LLM cost
  • Per-clause highlighting on the client
    • A custom TreeWalker-based DOM module finds each source_span
    • Wraps the matching range in a styled <span>
    • Re-applies highlights via MutationObserver when X virtualizes the timeline
  • Tooltip system
    • A singleton React tooltip is portaled into document.body
    • Follows the hover state of highlighted claims
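
The cache-first, three-stage flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the stage functions are placeholders for the Gemini calls, the in-memory dict stands in for the Supabase table, and the names check_tweet, Claim, and Verdict, as well as keying the cache on a hash of the tweet text, are all assumptions made for the example.

```python
import asyncio
import hashlib
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str          # self-contained claim sent to the verifier
    source_span: str   # verbatim substring of the original tweet


@dataclass
class Verdict:
    claim: Claim
    label: str                                  # e.g. "Likely True", "Unverifiable"
    citations: list[str] = field(default_factory=list)


# In-memory stand-in for the Supabase cache table (assumption).
_cache: dict[str, list[Verdict]] = {}


async def neutralize(text: str) -> str:
    # Placeholder for the LLM rewrite step.
    return text.strip()


async def extract_claims(neutral: str, original: str) -> list[Claim]:
    # Placeholder: treat each sentence of the original as one claim.
    parts = [p.strip() for p in original.split(".") if p.strip()]
    return [Claim(text=p, source_span=p) for p in parts]


async def verify(claim: Claim) -> Verdict:
    # Placeholder for the grounded verification call.
    return Verdict(claim=claim, label="Unverifiable")


async def check_tweet(text: str) -> list[Verdict]:
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key in _cache:
        # Cache hit: no LLM calls at all.
        return _cache[key]
    neutral = await neutralize(text)
    claims = await extract_claims(neutral, text)
    # Fan out: one verification task per claim, run concurrently.
    verdicts = list(await asyncio.gather(*(verify(c) for c in claims)))
    _cache[key] = verdicts
    return verdicts
```

Calling asyncio.run(check_tweet("The sky is green. Water is wet.")) twice runs the stages once; the second call is answered from the cache.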

Challenges I Ran Into

Mapping paraphrased claims back to the tweet

Claims are extracted from a neutralized version of the tweet, so they are often paraphrased and do not appear verbatim in the DOM.

Solution: Added a source_span field that the LLM fills with a verbatim substring of the original post, validated server-side (source_span in original_text) before it is ever sent to the client.
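
A minimal sketch of that server-side check, assuming a hypothetical ExtractedClaim model and a validate_source_spans helper (neither name comes from the project). Clearing an invalid span rather than dropping the claim is also an assumption, chosen so the client could still show a tweet-level badge without a highlight.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ExtractedClaim:
    text: str                   # claim text, possibly paraphrased
    source_span: Optional[str]  # verbatim substring of the tweet, or None


def validate_source_spans(
    original_text: str, claims: list[ExtractedClaim]
) -> list[ExtractedClaim]:
    """Keep source_span only when it is a verbatim substring of the tweet."""
    checked = []
    for claim in claims:
        span = claim.source_span
        if span and span in original_text:
            checked.append(claim)
        else:
            # Span was paraphrased or hallucinated: clear it so the client
            # never tries to highlight text that is not in the DOM.
            checked.append(replace(claim, source_span=None))
    return checked
```

The substring test is exact and case-sensitive, so any rewording by the model fails validation instead of producing a misplaced highlight.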

Injecting highlights into X's DOM without fighting its re-renders

Tweet text lives inside nested spans such as hashtags, mentions, and emoji wrappers, and X re-renders aggressively during scroll.

Solution: Built a text-node walker that splits ranges across nodes, tracks wrappers, and re-applies highlights on mutation with a debounce so our own mutations do not loop.
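
The client code is TypeScript working on real DOM text nodes, but the offset arithmetic behind splitting one span across several nodes is language-independent. The sketch below shows only that arithmetic in Python, with a plain list of strings standing in for the text nodes a walker would visit; locate_span is a hypothetical name, and this is not the extension's implementation.

```python
def locate_span(node_texts: list[str], span: str) -> list[tuple[int, int, int]]:
    """Map a span onto (node_index, start, end) slices of consecutive text nodes.

    node_texts holds the text of each node in document order. The span is
    searched for in their concatenation, then cut at node boundaries.
    """
    if not span:
        return []
    full = "".join(node_texts)
    start = full.find(span)
    if start == -1:
        return []
    end = start + len(span)

    segments = []
    offset = 0
    for index, text in enumerate(node_texts):
        node_start, node_end = offset, offset + len(text)
        # Overlap between the span and this node, in global coordinates.
        lo, hi = max(start, node_start), min(end, node_end)
        if lo < hi:
            segments.append((index, lo - node_start, hi - node_start))
        offset = node_end
    return segments
```

For the nodes ["Prices rose ", "#inflation", " by 9% last year"] and the span "#inflation by 9%", the result covers all of the second node and the first six characters of the third. Wrapping the returned segments from last to first keeps the earlier offsets valid while nodes are being split.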

Shadow DOM vs page DOM

Plasmo mounts components in a shadow root, but our highlights had to live inside X's page DOM.

Solution: Injected a second, page-level <style> tag so the verdict color classes could actually reach the underlined spans.

Latency vs cost vs quality

Each tweet made at least two serial LLM calls (neutralize, then extract claims) before verification could fan out.

Solution: Built a fast path that skips neutralization for short, clean tweets with no emoji, ALL-CAPS, repeated punctuation, or sarcasm markers, and raised verify concurrency so multi-claim tweets do not serialize.
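
A rough sketch of both halves of that fix. The regular expressions, the 140-character cutoff, the sarcasm markers, and the concurrency limit are illustrative guesses, not the project's actual values, and needs_neutralization and verify_all are hypothetical names.

```python
import asyncio
import re

# All thresholds and patterns below are assumptions for illustration.
MAX_FAST_PATH_CHARS = 140
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
REPEATED_PUNCT_RE = re.compile(r"[!?]{2,}")
ALL_CAPS_RE = re.compile(r"\b[A-Z]{3,}\b")   # also matches acronyms such as NASA
SARCASM_RE = re.compile(r"(?:^|\s)/s\b|yeah right", re.IGNORECASE)


def needs_neutralization(text: str) -> bool:
    """Return False only for short, clean tweets that can skip stage one."""
    if len(text) > MAX_FAST_PATH_CHARS:
        return True
    patterns = (EMOJI_RE, REPEATED_PUNCT_RE, ALL_CAPS_RE, SARCASM_RE)
    return any(p.search(text) for p in patterns)


async def verify_all(claims, verify, limit: int = 8):
    """Run verify(claim) for every claim, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(claim):
        async with semaphore:
            return await verify(claim)

    # gather preserves input order, so results line up with claims.
    return await asyncio.gather(*(bounded(c) for c in claims))
```

The heuristic errs toward neutralizing: a false positive, such as an acronym tripping the ALL-CAPS check, costs one extra LLM call, while a false negative would send a tone-heavy tweet straight to claim extraction.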

Context loss in verification

A fragment like "$25 a month" checked in isolation is unverifiable.

Solution: Reworked the extract-claims prompt to force self-contained, co-reference-resolved claim text while keeping source_span pointing back to the original fragment for highlighting.

Temporal anchoring

The verifier treated the date of the check as "today", so a tweet posted on July 4 saying "today is July 4" came back false when checked on any later day.

Solution: Captured the datetime attribute of the tweet's <time> element from the DOM and passed it to the verifier as the reference date.
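
On the backend, turning that timestamp into a reference date for the verifier might look like the sketch below. The function names and the prompt wording are assumptions, and the example assumes the attribute is an ISO 8601 string such as 2024-07-04T15:30:00.000Z.

```python
from datetime import datetime, timezone


def parse_tweet_time(datetime_attr: str) -> datetime:
    """Parse an ISO 8601 timestamp into an aware UTC datetime."""
    value = datetime_attr.strip()
    if value.endswith("Z"):
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: a naive timestamp is treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def reference_date_line(datetime_attr: str) -> str:
    """Build the line that anchors relative dates in the verifier prompt."""
    posted = parse_tweet_time(datetime_attr)
    return (
        f"Reference date: {posted.date().isoformat()} (UTC). "
        "Interpret 'today', 'yesterday', and similar words relative to this date."
    )
```

The date is in UTC, so a tweet posted late in the evening in the author's own time zone can land on the following calendar day.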

Accomplishments That I'm Proud Of

  • Sub-second verdicts on cached tweets thanks to Supabase hits on repeat views
  • Clause-level transparency — users can see which phrase is false, not just that "this tweet has something wrong in it"
  • Graceful degradation everywhere — if an LLM fails, if a source_span cannot be located, if a tweet has no extractable claims, or if the DOM does not match expected selectors, the extension falls back cleanly to a simpler badge instead of breaking
  • Robust text highlighting that survives X's scroll virtualization and nested-span DOM without flicker

What I Learned

  • How to manipulate DOM text nodes safely across nested elements using TreeWalker, splitText, and reverse-order wrapping to keep offsets valid
  • Chrome MV3 content-script patterns with Plasmo — anchor lists, CSUI hosts, shadow roots, and page-level style injection
  • Prompt engineering for structured outputs: asking the LLM to return a verbatim substring and validating it server-side (source_span in original_text) is far more reliable than asking for character offsets
  • The economics of multi-stage LLM pipelines — caching and routing cheaper vs. more expensive models per stage matters more than picking the "best" model for everything
  • How to integrate Google Search as a grounding tool inside Gemini for real-time citations

What's Next for SpyGlass?

  • Question detection — tweets that are pure questions should not be fact-checked; they should short-circuit to an Unverifiable / No Claim Found state with no LLM spend
  • Self-contained claim rewriting — finish the work so every extracted claim stands alone, so "$25 a month" becomes "Claude costs $25 a month" before it hits the verifier
  • Temporal anchoring shipped end-to-end — use the tweet's post time as "today" for every temporal claim
  • Public dashboard — a browsable feed of recently checked tweets, aggregate stats by verdict, and the ability to dispute a verdict with a citation
  • More platforms — the pipeline is platform-agnostic; next targets are Reddit and LinkedIn
  • Quote-tweet and thread awareness so context flows across posts, not just within one tweet
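
As one example from this list, the question-detection short-circuit could start as a cheap pre-check that runs before any LLM call. The sketch below is a guess at how that might look; the sentence splitter, the function names, and the response shape are all hypothetical.

```python
import re
from typing import Optional

# Naive splitter: break after ., ! or ? when followed by whitespace.
_SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")


def is_pure_question(text: str) -> bool:
    """True when every sentence in the tweet ends with a question mark."""
    sentences = [s.strip() for s in _SENTENCE_SPLIT_RE.split(text.strip())]
    sentences = [s for s in sentences if s]
    return bool(sentences) and all(s.endswith("?") for s in sentences)


def precheck(text: str) -> Optional[dict]:
    """Return a ready-made response for pure questions, else None."""
    if is_pure_question(text):
        # No claims to verify, so no LLM spend.
        return {"label": "Unverifiable", "reason": "No Claim Found", "claims": []}
    return None
```

A purely syntactic check like this lets loaded questions through as questions even when they smuggle in a claim, so it would likely need a second pass before shipping.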
