Inspiration: Fraud reviews are either too manual (slow, inconsistent) or too opaque (“the model said so”). We built a drop-in, explainable layer you can bolt onto an existing microservices app—no schema changes or risky rewrites—that shows the clear “why” behind each decision.

What it does:

  • Watches new Bank of Anthos transactions via a read-only gateway.
  • Scores risk with Gemini using compact signals (amount spikes, geo velocity, time-of-day).
  • Explains the rationale in plain language and stores an audit record.
  • Calls existing BoA endpoints for actions (noop/notify/step-up/temp hold).
  • Dashboard: live table with time | amount | score | action | why.
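As a rough illustration of the "compact signals" idea, the sketch below reduces a transaction and its recent history to three small numbers. The `Txn` fields, the function names, and the exact definitions of each signal are assumptions made for this example; they are not taken from the Bank of Anthos schema or from the FraudGuard code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from math import asin, cos, radians, sin, sqrt
from statistics import mean


@dataclass
class Txn:
    """Hypothetical transaction shape; not the Bank of Anthos schema."""
    amount: float
    timestamp: datetime
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def compact_signals(txn, history):
    """Reduce a transaction plus recent history (oldest first) to three signals."""
    # Amount spike: ratio of this amount to the recent average.
    avg = mean(t.amount for t in history) if history else txn.amount
    amount_ratio = txn.amount / avg if avg else 1.0

    # Geo velocity: implied travel speed since the previous transaction.
    speed_kmh = 0.0
    if history:
        prev = history[-1]
        hours = (txn.timestamp - prev.timestamp).total_seconds() / 3600
        if hours > 0:
            speed_kmh = haversine_km(prev.lat, prev.lon, txn.lat, txn.lon) / hours

    # Time-of-day: the hour is passed through as-is.
    return {
        "amount_ratio": round(amount_ratio, 2),
        "geo_speed_kmh": round(speed_kmh, 1),
        "hour_of_day": txn.timestamp.hour,
    }
```

Keeping the output to a handful of rounded numbers is one way to keep the prompt sent to the model short, which is the trade-off the writeup describes.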

How we built it:

Bank of Anthos remains untouched (namespace boa); FraudGuard runs in its own namespace, fraudguard, as five services (5 pods):

  • mcp-gateway: FastAPI, read-only BoA façade, MCP tools
  • risk-scorer: Gemini
  • explain-agent: rationale + audit
  • action-orchestrator: score → action
  • dashboard: read-only

All five deploy from one reusable Helm workload chart (Deployment/Service/HPA/RBAC/NetworkPolicy/PSA) with per-service values.
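The score → action step in action-orchestrator can be pictured as a simple threshold lookup. This is a minimal sketch only: the 0 to 1 score range, the cut-off values, and the action slugs are assumptions for illustration, since the writeup names the four actions but not how scores map to them.

```python
from bisect import bisect_right

# Hypothetical cut-offs on an assumed 0-1 risk score; the project's real
# thresholds are not stated in the writeup.
THRESHOLDS = [0.3, 0.6, 0.85]
# One action per band, ordered from least to most intrusive.
ACTIONS = ["noop", "notify", "step-up", "temp-hold"]


def choose_action(score: float) -> str:
    """Map a risk score to one of the four actions by threshold band."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    # bisect_right counts how many thresholds are <= score,
    # which is exactly the index of the band the score falls into.
    return ACTIONS[bisect_right(THRESHOLDS, score)]
```

Keeping the mapping in two plain lists makes it easy to move into per-service Helm values or, later, a policy-as-code file.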

Infra as code. Terraform modules in infra-gcp-gke for GKE Autopilot, Artifact Registry, Workload Identity Federation (WIF), Secret Manager, and Budgets.

CI/CD. GitHub Actions builds images → Trivy scan → SBOM → push to Artifact Registry → helm upgrade per service (matrix) using WIF (no long-lived keys).

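Secrets. The Gemini API key (Vertex AI may replace it later) is stored in Secret Manager and mounted into pods through the Secret Manager CSI integration on GKE; each Kubernetes service account (KSA) is granted only the least-privilege secretAccessor role.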
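With a CSI mount, the secret shows up inside the container as a file. A service might read it along the lines below. The mount path, the environment variable name, and the fallback behaviour are all hypothetical choices for this sketch; the real values would come from the chart's per-service values.

```python
import os
from pathlib import Path

# Hypothetical mount path; use whatever the chart's volumeMount defines.
DEFAULT_SECRET_PATH = "/var/secrets/gemini-api-key"


def load_api_key(path: str = DEFAULT_SECRET_PATH, env_var: str = "GEMINI_API_KEY") -> str:
    """Read the API key from the mounted file, falling back to an env var.

    The env-var fallback is an assumption meant for local development only.
    """
    secret_file = Path(path)
    if secret_file.is_file():
        key = secret_file.read_text(encoding="utf-8").strip()
    else:
        key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"no API key found at {path} or in ${env_var}")
    return key
```

Reading the file at call time rather than at import time means a rotated secret can be picked up without rebuilding the image.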

Security & ops. NetworkPolicies (deny-by-default), Pod Security (non-root, read-only FS), /healthz & /readyz, HPA, structured JSON logs with txn_id/trace_id.
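For the structured logs, one possible approach with only the Python standard library is a custom formatter that emits one JSON object per line and picks up txn_id and trace_id from the `extra` argument. The field names other than txn_id and trace_id, and the helper name `get_logger`, are assumptions for this sketch.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record):
        payload = {
            "severity": record.levelname,
            "message": record.getMessage(),
            "logger": record.name,
            # Values passed via extra={...} become attributes on the record.
            "txn_id": getattr(record, "txn_id", None),
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)


def get_logger(name):
    """Return a logger that writes JSON lines to stdout (hypothetical helper)."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False
    return logger
```

A call such as `get_logger("risk-scorer").info("scored", extra={"txn_id": "t-123", "trace_id": "abc"})` then produces one JSON line that can be filtered by txn_id or trace_id across services.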

Challenges we ran into

  • Rule: “Do not modify BoA” → adapter layer, with actions only via existing endpoints.
  • Secrets at scale: Secret Manager CSI + Workload Identity (no keys).
  • DRY deploys: one chart vs per-service differences (ports/env/RBAC/NP).
  • Latency vs explainability: compact prompts, human-readable rationale.
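The latency versus explainability trade-off can be sketched as a short prompt that asks for a strict JSON reply, plus a defensive parser. The prompt wording, the reply shape (`score`, `why`), and the neutral fallback are assumptions for illustration; the call to Gemini itself is left out on purpose.

```python
import json


def build_prompt(signals: dict) -> str:
    """Serialise the signals compactly and ask for a JSON-only reply."""
    return (
        "Rate fraud risk for this transaction. "
        'Reply with JSON only: {"score": <0-1>, "why": "<one sentence>"}\n'
        "signals=" + json.dumps(signals, separators=(",", ":"), sort_keys=True)
    )


def parse_reply(text: str) -> dict:
    """Validate the model's reply; return a neutral result if it is malformed."""
    try:
        data = json.loads(text)
        score = float(data["score"])
        why = str(data["why"]).strip()
    except (ValueError, KeyError, TypeError):
        # Hypothetical fallback: a mid-range score with an explicit reason.
        return {"score": 0.5, "why": "model reply could not be parsed"}
    # Clamp to the assumed 0-1 range so downstream thresholds stay valid.
    score = min(1.0, max(0.0, score))
    return {"score": score, "why": why}
```

The `why` string is what would be shown on the dashboard and stored in the audit record, so keeping it to one sentence keeps both the response and the table readable.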

Accomplishments that we're proud of

  • End-to-end on GKE Autopilot with zero BoA changes.
  • Explainable decisions (score + why) stored for audit, easy to demo.
  • Repro in ~10 minutes: single chart + per-service values + Make/Helm.
  • Keyless CI deploys using WIF, plus Trivy + SBOM in the pipeline.

What we learned

  • Safe agentic patterns on Kubernetes (MCP façade + action-orchestrator).
  • Trade-offs: Gemini API key vs Vertex AI + Workload Identity.
  • A single workload chart is a sweet spot for microservice scaffolding.
  • Explainability improves trust for judges and users.

What's next for FraudGuard for Bank of Anthos

  • Make Vertex path default; richer features; BigQuery/Feature Store; streaming (Pub/Sub)
  • Alerting/notifications; policy-as-code for actions; audit viewer; expanded A2A/ADK/MCP tooling

Built With

  • Bash
  • CI/CD & security: GitHub Actions (OIDC, Workload Identity Federation), Trivy
  • Google Cloud: Artifact Registry, Secret Manager (+ CSI), Cloud Logging/Monitoring
  • Kubernetes: GKE Autopilot, Helm (single workload chart + per-service values), HPA, NetworkPolicy, Pod Security
  • Languages/frameworks: Python (FastAPI); UI: Next.js or Flask
  • Optional: Vertex AI, Pub/Sub