Fraudwatch: We supply the facts, you make the call.


The Problem We're Tackling

The digital landscape is saturated with hyper-realistic phishing attempts, fabricated research, and predatory legal agreements. The attacks have become so sophisticated that traditional "blockers" are no longer enough, and simply telling a user "this is bad" doesn't teach them how to navigate the web safely. People need a digital early-warning system. Just as delayed medical care turns small wounds into emergencies, delaying the verification of a sketchy email or a bold claim can lead to a stolen identity or the spread of dangerous misinformation. We wanted to tackle this by empowering users, keeping them informed, and teaching the next generation to tell scams from facts.

What Inspired Us

Some of us were coming off the momentum of presenting advanced solutions at the Dan & Robyn Ives AI Innovation Day, where we realized that the best AI applications don't replace human judgment; they enhance it. We kept seeing peers and older relatives fall for sophisticated scams because they didn't know what to look for. We wanted to build a tool that doesn't judge the user or preach to them, but simply arms them with raw, objective data. We don't rescue, we reveal.

What It Does

Fraudwatch is an AI-powered analysis tool designed to educate and protect users. When a user uploads a file, link, or text snippet, our engine runs a transparent forensic check. We designed a clear, 100-point confidence scoring system based on four main criteria and a wildcard factor. We calculate the total Fraudwatch Risk Score using a weighted sum model:

$$\text{Risk Score} = (0.30 \cdot A) + (0.25 \cdot M) + (0.25 \cdot V) + (0.15 \cdot L) + (0.05 \cdot W)$$

Here A is Source Authenticity, M is Manipulative Language, V is Claim Verifiability, L is Logical Consistency, and W is the Wildcard factor. The weights sum to 1.0, so the total stays on the same scale as the individual criteria. These serve as our parameters when guiding our OpenAPI model to produce a score that is not hallucinated, but grounded in real semantic and cognitive analysis.
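A minimal sketch of how this weighted sum could be computed in TypeScript. The weights come from the formula above; the `CriterionScores` type, the `riskScore` function name, the 0 to 100 input scale, and the clamping and rounding steps are assumptions made for illustration, not a description of the actual Fraudwatch code. It also assumes each criterion is already expressed so that a higher value means higher risk.

```typescript
// Criterion scores, each assumed to be on a 0-100 scale (an assumption for this sketch).
interface CriterionScores {
  A: number; // Source Authenticity
  M: number; // Manipulative Language
  V: number; // Claim Verifiability
  L: number; // Logical Consistency
  W: number; // Wildcard factor
}

// Weights taken from the formula in the text; they sum to 1.0.
const WEIGHTS: CriterionScores = { A: 0.30, M: 0.25, V: 0.25, L: 0.15, W: 0.05 };

function riskScore(scores: CriterionScores): number {
  const keys = Object.keys(WEIGHTS) as (keyof CriterionScores)[];
  let total = 0;
  for (const k of keys) {
    // Clamp each input so a malformed value cannot push the total out of range.
    const clamped = Math.min(100, Math.max(0, scores[k]));
    total += WEIGHTS[k] * clamped;
  }
  // Round to one decimal place for display.
  return Math.round(total * 10) / 10;
}

// 24 + 15 + 10 + 3 + 5 = 57
const example = riskScore({ A: 80, M: 60, V: 40, L: 20, W: 100 });
console.log(example); // 57
```

Showing the per-criterion terms next to the total is what makes the score explainable: each of the five products can be displayed as its own line in the breakdown.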

Why we are more than just a wrapper

Instead of just printing a generic AI response, Fraudwatch deconstructs the threat. We highlight exactly why information is suspicious and explain the effect of that tactic.

  • If a URL is mismatched, we extract the hidden link and explain how attackers use this to steal credentials.
  • If a text uses manipulative urgency, we highlight the specific phrasing and educate the user on how false time constraints bypass critical thinking.
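The mismatched-URL check described above could look roughly like the sketch below. It compares the hostname a link displays with the hostname it actually points to. The names `hostOf`, `checkLink`, and `LinkFinding` are hypothetical, and exact hostname comparison is a simplification chosen for this example.

```typescript
interface LinkFinding {
  displayedHost: string; // host the user sees in the link text
  actualHost: string;    // host the link really points to
  mismatched: boolean;
}

// Returns the lowercase hostname of a URL-like string, or null if it cannot be parsed.
function hostOf(raw: string): string | null {
  const trimmed = raw.trim();
  // Add a scheme when the text is a bare domain such as "www.example.com".
  const candidate = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`;
  try {
    return new URL(candidate).hostname.toLowerCase().replace(/^www\./, "");
  } catch {
    return null;
  }
}

// Returns null when the visible text is not URL-like, so there is nothing to compare.
function checkLink(displayText: string, href: string): LinkFinding | null {
  const displayedHost = hostOf(displayText);
  const actualHost = hostOf(href);
  if (displayedHost === null || actualHost === null) return null;
  if (displayedHost.indexOf(".") === -1) return null; // e.g. plain words like "Login"
  return { displayedHost, actualHost, mismatched: displayedHost !== actualHost };
}

const finding = checkLink("www.paypal.com", "http://paypa1-secure.example.net/login");
console.log(finding); // mismatched: true
```

Note that exact comparison would also flag a legitimate subdomain (for example `login.example.com` shown as `example.com`), so a real check would likely compare registrable domains instead.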

How We Built It

We engineered a robust layered defense system, utilizing a modern React frontend and a Node.js backend to handle complex file parsing and logic checking.

  • Interface: A sleek, high-contrast React UI focused on readability, allowing users to effortlessly drag and drop PDFs, images, or text.
  • The Dual-Engine Backend: Our Node.js server acts as the orchestrator. Before any AI is involved, traditional scripts extract hidden URLs and metadata from uploaded files to run hard-coded authenticity checks.
  • Semantic Analysis: We pass the parsed text to our Base44-backed engine using highly structured system prompts, forcing the AI to output targeted JSON data evaluating our specific criteria, rather than generating conversational text.
  • Educational Rendering: The frontend maps the AI's JSON output to the original document, visually highlighting the red flags and displaying the educational context side-by-side.
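To illustrate the last two steps, here is a hedged sketch of validating a structured JSON response and mapping each flagged quote back to a character range in the original text. The JSON shape (`flags`, `criterion`, `quote`, `explanation`) and the function names are assumptions invented for this example; the real prompt schema is not shown in this write-up.

```typescript
// Hypothetical shape of one flagged item in the model's JSON output.
interface RedFlag {
  criterion: "A" | "M" | "V" | "L" | "W";
  quote: string;       // phrase the model claims to have copied from the document
  explanation: string; // educational context shown beside the highlight
}

interface LocatedFlag extends RedFlag {
  start: number; // inclusive character offset in the document
  end: number;   // exclusive character offset
}

// Parses the raw model output and keeps only well-formed flags.
function parseFlags(rawJson: string): RedFlag[] {
  let data: unknown;
  try {
    data = JSON.parse(rawJson);
  } catch {
    return []; // not JSON at all: treat as "no findings"
  }
  if (typeof data !== "object" || data === null) return [];
  const flags = (data as { flags?: unknown }).flags;
  if (!Array.isArray(flags)) return [];
  const allowed = ["A", "M", "V", "L", "W"];
  const out: RedFlag[] = [];
  for (const f of flags) {
    if (typeof f !== "object" || f === null) continue;
    const r = f as Record<string, unknown>;
    if (
      typeof r.criterion === "string" && allowed.indexOf(r.criterion) !== -1 &&
      typeof r.quote === "string" && r.quote.length > 0 &&
      typeof r.explanation === "string"
    ) {
      out.push({ criterion: r.criterion as RedFlag["criterion"], quote: r.quote, explanation: r.explanation });
    }
  }
  return out;
}

// Finds each quote in the document; quotes that do not occur verbatim are dropped.
function locateFlags(documentText: string, flags: RedFlag[]): LocatedFlag[] {
  const found: LocatedFlag[] = [];
  for (const flag of flags) {
    const start = documentText.indexOf(flag.quote);
    if (start !== -1) found.push({ ...flag, start, end: start + flag.quote.length });
  }
  return found;
}
```

Dropping quotes that cannot be found verbatim is a cheap guard against hallucinated evidence: a highlight is only rendered when the flagged phrase really exists in the uploaded document.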

Challenges We Ran Into

  • Data Extraction: Accurately pulling embedded links and text out of complex PDFs without losing formatting required several iterations of our backend parsing logic.
  • Prompt Engineering for Objectivity: It was difficult to keep the AI from sounding preachy or making definitive judgments. We had to strictly constrain the prompts to ensure the output remained clinical, objective, and purely factual.
  • Balancing Latency: Running multi-point checks on large files caused initial timeouts, which we mitigated by optimizing our API calls and adding loading states to the React frontend.
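The write-up does not say which API optimizations were used. One common option, shown here purely as an assumption, is to split large extracted text into bounded chunks so that no single request grows large enough to time out. The `chunkText` helper and its `maxChars` parameter are hypothetical.

```typescript
// Splits text into chunks of at most maxChars characters,
// preferring to cut at a space so words stay intact.
function chunkText(text: string, maxChars: number): string[] {
  if (maxChars <= 0) throw new Error("maxChars must be positive");
  const chunks: string[] = [];
  let rest = text.trim();
  while (rest.length > maxChars) {
    // Look for the last space at or before the size limit.
    let cut = rest.lastIndexOf(" ", maxChars);
    if (cut <= 0) cut = maxChars; // no space found: fall back to a hard cut
    chunks.push(rest.slice(0, cut).trim());
    rest = rest.slice(cut).trim();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

console.log(chunkText("alpha beta gamma delta", 11)); // ["alpha beta", "gamma delta"]
```

Each chunk can then be analyzed in its own request, and the frontend's loading state can report progress per chunk instead of waiting on one long call.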

What We Learned

  • Transparency builds trust. Users responded much better to seeing the mathematical breakdown of their risk score rather than just a red "Danger" banner.
  • Hard logic + AI is the winning combo. Relying solely on an LLM is a trap. Pairing traditional programmatic checks (like domain matching) with semantic AI analysis creates a vastly superior and more accurate product.

What's Next for Fraudwatch

We view this web app as phase one of a much larger vision to create an accessible digital shield.

  • The Fraudwatch Browser Extension: We plan to port our React logic into a Chrome extension for real-time, in-context analysis. Users will be able to highlight text or right-click links directly in their browser to run a background forensic check without disrupting their workflow.
  • Training a Specialized LLM: To reduce latency and reliance on generalized APIs, our next major technical milestone is fine-tuning an open-source model (like Llama 3) specifically on a curated dataset of known phishing vectors, deceptive UI patterns, and verified fact-checks. This dedicated "Threat-Intel Model" will offer faster edge inference and deeper domain-specific accuracy.

How can we expand on this?

  • Train a dedicated threat-intel model: Move beyond generalized APIs by fine-tuning an open-weight model (like Llama 3) specifically on a curated, continuously updated dataset of zero-day phishing heuristics, social engineering scripts, and verified fact-checks.
  • Edge inference & Browser Extension: Transition from a web app to a Chrome extension running lightweight NLP models locally in the browser (via WebAssembly). This enables real-time, privacy-preserving scanning of live websites and emails without ever sending the user's personal browsing data to the cloud.
  • Calibrate risk via dynamic thresholding: Implement feedback loops and integrate with live threat intelligence feeds to adjust scoring weights on the fly, reducing the false positives that lead to "alert fatigue."
  • Automated countermeasures: Introduce one-tap automated reporting to registrars, domain hosts, or a user's IT department to actively take down verified phishing links.

We imagine a dynamic risk probability \( P(R) \) that blends our heuristic weights \( w_i \) and the AI's confidence scores \( c_i \) for each specific criterion: \( P(R) = \sum_{i=1}^{n} w_i \cdot c_i \).

To keep the final Fraudwatch Score bounded between 0 and 100 regardless of extreme edge cases, we plan to normalize the raw weighted outputs using a logistic function, whose output lies between 0 and 1 and can then be scaled by 100:

$$P(S|X) = \frac{1}{1 + e^{-(\beta_0 + \sum_{i=1}^{n} \beta_i x_i)}}$$

Here \( x_i \) represents the continuous scores for Source Authenticity, Manipulative Language, Verifiability, and Consistency, and \( \beta_i \) are our learned parameters.
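A small sketch of the logistic normalization, assuming the probability is multiplied by 100 to land on the 0 to 100 scale. The function name `logisticScore` is hypothetical, and the coefficient values in the usage line are placeholders, not learned parameters.

```typescript
// Maps criterion scores x_i to a bounded 0-100 score through a logistic function.
// beta0 is the intercept and beta holds one coefficient per criterion.
function logisticScore(x: number[], beta0: number, beta: number[]): number {
  if (x.length !== beta.length) throw new Error("x and beta must have the same length");
  let z = beta0;
  for (let i = 0; i < x.length; i++) {
    z += beta[i] * x[i];
  }
  // P(S|X) lies between 0 and 1; in floating point it saturates at the ends.
  const p = 1 / (1 + Math.exp(-z));
  return 100 * p;
}

// Placeholder coefficients for illustration only.
console.log(logisticScore([0, 0, 0, 0], 0, [1, 1, 1, 1])); // 50
```

Because the logistic curve flattens at both ends, extreme inputs move the score toward 0 or 100 without ever leaving that range.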

Our ethos

We’re building for the vulnerable and the overwhelmed—because the internet is incredibly noisy, modern scams are hyper-realistic, and misinformation is weaponized. Fraudwatch’s job is simple: pause the panic, strip away the manipulation, and provide the objective facts so fewer people fall victim to digital exploitation. We don't make the call; we just arm you to make a better one.

Credits & disclaimer

We’re a student team building Fraudwatch at Hack @ Penn State to demonstrate how transparent, carefully designed AI can support digital literacy. Fraudwatch is an educational tool and does not constitute definitive legal, financial, or cybersecurity advice. If our app flags a highly sensitive document—such as a banking notice or a legal contract—please verify it directly with a human professional or the official institution.
