🕵️ The Story of The Scrutinizer

Inspiration

As a Master’s student in Data & AI at INPT Rabat, I’ve watched the rapid evolution of "Triple-Threat" attacks, scams that combine deepfake video, fraudulent documents, and social engineering. Traditional security tools often fail because they lack the "connect-the-dots" reasoning required to spot inconsistencies across different media types. I wanted to build a tool that doesn't just scan for viruses, but audits logic serving as a forensic shield for the digital age.

How I Built It

The Scrutinizer is an agentic forensic system built on Gemini 3 Pro Preview. The architecture focuses on three core pillars:

  1. Deep Reasoning: I utilized Gemini 3's thinking_level="high" configuration. This allows the model to engage in multi-step planning and self-correction, enabling it to "think" through a scammer's psychological tactics before rendering a verdict.
  2. Multimodal Context: Leveraging Gemini 3's 1M+ token context window, the app ingests videos, screenshots, and PDFs simultaneously. It cross-references visual artifacts (like HeyGen watermarks or lip-sync lag) with text-based claims in legal documents.
  3. Agentic Tooling:
    • Google Search: To verify the existence of claimed "CEOs" or "Investment Firms."
    • Code Execution: To mathematically verify financial promises.

Technical Breakdown: The Math of Deception

Scammers often rely on the fact that humans struggle to calculate exponential growth intuitively. The Scrutinizer uses the Code Execution tool to debunk "guaranteed" return claims using the compound interest formula:

$$V = P \left(1 + \frac{r}{n}\right)^{nt}$$

Where:

  • $V$ is the final value.
  • $P$ is the initial investment.
  • $r$ is the annual interest rate.
  • $n$ is the number of times interest is compounded per year.
  • $t$ is the time in years.

When a "Guru" promises a $4.2\%$ daily return, the model executes Python code to show that a $\$1,000$ investment would theoretically grow to over $\$3.3$ Billion in one year; proving the claim is a mathematical impossibility.

Challenges I Faced

The primary challenge was latency management. High-level reasoning and tool-calling take time. To solve this, I designed a "Forensic Status Log" in the UI. By streaming the model's Thought Signatures and status messages like "Scanning metadata for deepfake artifacts...", I transformed the wait time into a transparent "investigation" experience for the user.

What I Learned

Building this project taught me that context is king. The true power of Gemini 3 isn't just its size, but its ability to understand the relationship between a spoken word in a video and a hidden clause in a 20-page PDF. This "Multimodal Reasoning" is the future of digital safety.

Built With

  • audio-analysis
  • code-execution-tool
  • forensic-ux
  • fraud-detection
  • gemini-3
  • gemini-3-pro-preview
  • gemini-api
  • google-genai
  • google-search-tool
  • image-analysis
  • json-structured-output
  • markdown-ui
  • multimodal-analysis
  • python
  • rest-api
  • scam-detection
  • streamlit
  • thinking-config
  • video-analysis
Share this project:

Updates