Misinformation Detective-V2

Security Detection Demostration Photo

💡 Inspiration

With the sheer volume of information flooding the web daily, distinguishing factual grounding from sophisticated misinformation has become an uphill battle. While version 1 of Misinformation Detective established a vital proof-of-concept for real-time passive browsing verification, we quickly realized that modern decentralized research environments demand more than a simple text-parsing shield.

Researchers and daily users regularly analyze dense, localized, multi-page data structures (like academic papers, briefs, and offline documents). However, ingesting untrusted local file binaries directly into an LLM pipeline opens up a massive surface for malicious actors—specifically via complex prompt injections, tone-shifting adversarial patterns, and system instruction overrides.

Inspired by the challenge of transforming a reactive web utility into a hardened, proactive security platform, we engineered Misinformation Detective V2 for Next Byte Hacks V2. Our vision was clear: expand data analysis capabilities while reinforcing the entire framework with a strict, defense-in-depth security architecture.

🛠️ How We Built It

Misinformation Detective V2 is a decentralized, privacy-first Chrome Extension backed by an independent, zero-trust execution sandbox.

Frontend UI & State Management: Built with an elegant, glassmorphic UI framework integrated directly into the browser viewport using the Chrome Extensions API. Local state tracking avoids centralized server monitoring by operating entirely via secure client-side storage boundaries (chrome.storage.local).
Dynamic Tab Lifecycle Integration: A background script monitors tab lifecycle instances. The system tracks content loading hooks and executes text extraction based on real-time browser states.
The Ingestion Pipeline: We implemented a dedicated file dropzone component capable of raw string extraction from multi-page document binaries (e.g., local PDFs).
OWASP-Compliant Security Shield: Before ingested data parameters can reach our inference nodes, payloads pass through a local regex and token-classification layer designed to catch indirect prompt injections trying to override system rules.
Decentralized BYOK Inference: Truth-metric parsing is powered by a decentralized, Bring-Your-Own-Key (BYOK) paradigm. The client securely transmits payload embeddings directly to provider-isolated developer endpoints via OpenRouter and Serper APIs, generating predictable, strictly structured JSON outputs.

🧮 Mathematical Validation Framework

To evaluate incoming textual strings objectively, the underlying verification engine handles content telemetry by translating semantic truth values into empirical indices. The core analysis matrix calculates a weighted grounding score based on citation confidence, source reliability, and context deviation.

The contextual grounding metric $G_c$ is determined by evaluating the extracted claim vectors against external validation parameters:

$$G_c = \sum_{i=1}^{n} w_i \cdot \frac{C_i}{R_i + \delta}$$

Where:

$w_i$ represents the assigned static weight of an individual source type (e.g., peer-reviewed index vs. public forum).
$C_i$ represents the verified citation density score returned by the background validation engine.
$R_i$ represents the localized risk rating of the evaluation domain.
$\delta$ acts as an infinitesimal stability constant to prevent division-by-zero errors in unindexed web environments.

If an adversarial payload attempts to scale input length exponentially to break the parser, our Character Safety Limiter steps in. It enforces a strict upper boundary condition on text allocation $A_{max}$ using a processing token defense ceiling:

$$A_{max} = \min \left( \Phi_{limit}, \alpha \cdot T_{available} \right)$$

This mathematical validation loop ensures system resource predictability and prevents downstream token depletion attacks.

🚧 Challenges We Faced

The most intense engineering hurdle was resolving the security vulnerabilities created by the new file dropzone feature. While traditional text blocks are simple to isolate, local files frequently conceal adversarial payloads hidden inside metadata or structural phrasing.

Initially, when we ran test files containing injection templates, the downstream language model would inadvertently inherit the hidden files' malicious instructions (such as altering the validation tone or completely ignoring verification directives). To fix this, we had to systematically build a custom parsing shield.

Designing the engine to intercept raw string data, identify tone-shifting anomalies, and trigger an immediate Threat Detected lockdown before the payload ever left the client viewport required strict, precise sequence handling. Balancing high security with low interface latency was incredibly tricky, but it ultimately forced us to write much cleaner, highly synchronized asynchronous JavaScript code.

🎓 What We Learned

Building Version 2 taught us that security is not a separate feature you patch on top of an app—it is an architectural foundation. We gained deep insights into building defense-in-depth pipelines within the unique constraints of browser sandboxes.

Additionally, we learned how to optimize client-side performance, master advanced text-extraction mechanics, and effectively manage asynchronous background tasks across diverse browser tab lifecycles. Most importantly, we discovered how fulfilling it is to watch an explicit exploit get instantly neutralized by a defensive system built from scratch!

Built With

Updates

Treatable MAK started this project — May 25, 2026 10:21 AM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.