FraudClaw
Inspiration
Fraud detection has a dirty secret: most systems are good at catching obvious cases. Stolen card used in a foreign country two minutes after a domestic purchase? Easy. But the fraud that actually costs banks money is subtler than that.
A single transaction that looks slightly large means nothing. A password reset followed by a new device login followed by a cross-border purchase three hours later means something. Five different customers all checking out from the same IP address in the same afternoon means something. Low-value test charges that get dismissed one by one mean something when you see the larger purchase that follows.
None of those patterns live in a single row of transaction data. They only appear when you connect customer history, timing, and relationships between accounts, devices, and merchants. That's the gap FraudClaw is built for.
What it does
FraudClaw streams live banking events and evaluates each one through a hybrid fraud engine that combines four complementary types of analysis.
First, it compares every transaction against that specific customer's history. A $4,000 purchase isn't inherently suspicious. For a customer whose largest prior transaction was $200, it is. Second, it runs velocity and retry checks: too many transactions in a short window, repeated declines followed by another attempt, same merchant and same amount hitting twice. Third, it detects sequential patterns like card testing (small online probes followed by a larger authorization), account takeover (new device login, profile change, then a high-value purchase), and impossible travel. Fourth, it maps linked entities using an in-memory graph so it can catch burst fraud where multiple cards or customers are quietly sharing the same device or IP cluster.
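The per-customer baseline check can be sketched roughly like this. This is a hypothetical illustration, not FraudClaw's actual scoring code: the function name, the z-score squashing, and the cold-start fallback are all assumptions.

```python
from statistics import mean, pstdev

def baseline_anomaly_score(amount: float, history: list[float]) -> float:
    """Score how far a transaction amount sits outside one customer's history.

    Returns 0.0 for typical amounts, approaching 1.0 for extreme outliers.
    Illustrative sketch -- FraudClaw's real baseline logic may differ.
    """
    if len(history) < 3:
        return 0.5  # too little history: treat as moderately uncertain
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        sigma = max(mu * 0.1, 1.0)  # avoid division by zero on flat history
    z = (amount - mu) / sigma
    # squash the z-score into a bounded [0, 1] risk contribution
    return max(0.0, min(1.0, z / 10))

# A $4,000 purchase for a customer whose transactions cluster near $100
print(baseline_anomaly_score(4000, [80, 120, 95, 110, 200]))
```

The point is that the same dollar amount produces a completely different score depending on whose history it is compared against.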
Every flagged transaction gets a plain-English explanation and a recommended action: approve, step up to MFA, send to analyst review, or decline. The risk level determines the response, not just a binary flag.
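The tiered response can be sketched as a simple threshold ladder over the bounded risk score. The cutoffs below are illustrative placeholders, not FraudClaw's calibrated values:

```python
def recommend_action(risk_score: float) -> str:
    """Map a bounded [0, 1] risk score to one of the four response tiers.

    Thresholds are illustrative -- the calibrated values may differ.
    """
    if risk_score < 0.3:
        return "approve"
    if risk_score < 0.6:
        return "step_up_mfa"   # moderate anomaly: add friction, don't block
    if risk_score < 0.85:
        return "analyst_review"
    return "decline"

print(recommend_action(0.45))  # moderate anomaly -> step-up challenge
```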
The dashboard shows a live transaction feed, risk scores, explanation panel, linked-entity relationship view, and business metrics tied to the synthetic ground truth.
How we built it
- Backend: FastAPI with a synthetic event generator that produces realistic banking activity including injected fraud scenarios
- Fraud engine: Rule-based checks for velocity, retries, and account-change sequences combined with customer baseline comparisons and sequential pattern detection
- Graph layer: NetworkX in-memory graph for linked-entity analysis across devices, IPs, cards, accounts, and merchants
- Explainability: Each signal contributes to a bounded risk score; the top reasons surface in plain English alongside the recommended action
- Frontend: React with TypeScript, live-polling feed, analyst detail panel, entity relationship view, and metrics dashboard
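A minimal sketch of the linked-entity idea in NetworkX, assuming a bipartite customer/entity graph and an illustrative degree threshold (node naming and the helper are hypothetical, not FraudClaw's actual schema):

```python
import networkx as nx

# Build an undirected graph linking customers to the devices/IPs they touch.
G = nx.Graph()
events = [
    ("cust_1", "ip_203.0.113.7"), ("cust_2", "ip_203.0.113.7"),
    ("cust_3", "ip_203.0.113.7"), ("cust_4", "device_abc"),
    ("cust_5", "ip_203.0.113.7"), ("cust_4", "ip_198.51.100.2"),
]
for customer, entity in events:
    G.add_edge(customer, entity)

def shared_entity_alerts(graph: nx.Graph, min_customers: int = 3) -> list[str]:
    """Flag any device/IP node touched by suspiciously many customers."""
    return [
        node for node in graph.nodes
        if not node.startswith("cust_")
        and graph.degree(node) >= min_customers
    ]

print(shared_entity_alerts(G))  # -> ['ip_203.0.113.7']
```

Each transaction above looks clean on its own; the shared IP only becomes visible once the edges are in one graph.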
Challenges we ran into
The hardest design decision was the action layer. It would have been easy to flag anything suspicious and call it a decline. But that's not how fraud systems actually work in practice, and judges would see through it immediately.
A real bank decision engine has to weigh friction against risk. A moderate anomaly should trigger a step-up challenge, not a hard decline. Getting the thresholds calibrated so that the four actions (approve, step up, review, decline) felt believable across different fraud scenarios took a lot of iteration. Too aggressive and the false positive rate looks absurd. Too lenient and the demo scenarios don't surface properly.
Accomplishments that we're proud of
- The linked-entity graph catches burst fraud that looks completely clean at the individual transaction level. Seeing five separate "normal" transactions suddenly connect into an obvious cluster in the relationship view is genuinely satisfying.
- The sequential pattern detection for account takeover works end to end: login on new device, password reset, profile update, then the fraudulent purchase all trigger as a chain rather than isolated events.
- Four distinct fraud scenarios (card testing, account takeover, impossible travel, linked burst fraud) all run in the same live stream alongside mostly legitimate activity, which makes the demo feel real rather than cherry-picked.
- The whole thing runs locally with two commands.
What we learned
The most important insight was that explainability is not a feature you add at the end. We started building the scoring engine and quickly realized that a number between 0 and 1 tells an analyst nothing actionable. The explanation had to be designed into the scoring logic from the start, not retrofitted afterward.
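One way to make this concrete: if every signal carries a name alongside its weight, the top reasons fall out of the scoring pass itself instead of being reconstructed afterward. A sketch with invented signal names and weights:

```python
def score_with_reasons(signals: dict[str, float]) -> tuple[float, list[str]]:
    """Accumulate named signal contributions into a bounded score.

    Because each contribution is named, the explanation is a by-product of
    scoring, not a retrofit. Signal names and weights are illustrative.
    """
    total = min(1.0, sum(signals.values()))
    top_reasons = [
        name for name, weight in
        sorted(signals.items(), key=lambda kv: kv[1], reverse=True)
        if weight > 0
    ][:3]
    return total, top_reasons

score, reasons = score_with_reasons({
    "new_device_login": 0.3,
    "amount_far_above_baseline": 0.5,
    "cross_border": 0.2,
    "velocity": 0.0,
})
print(score, reasons)
```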
We also learned that graph-based analysis changes what fraud is even detectable. Transaction-level models have a ceiling. The moment you start asking "who else touched this device today," a completely different category of fraud becomes visible.
What's next for FraudClaw
- Persistent storage to replace the in-memory state and support longer-running sessions and historical analysis
- Real transaction data integration with actual banking APIs to validate the scoring logic against live streams
- ML layer on top of the rule engine so the customer baselines update continuously rather than relying on static thresholds
- Analyst feedback loop so review decisions feed back into the model and improve future scoring
- Expanded graph signals to cover merchant networks and cross-institution patterns, not just device and IP clusters
Built With
- codex
- fastapi
- machine-learning
- pydantic
- python
- react
- typescript
- vite