Agentic Anthos: AI Advisor for Bank of Anthos

Private user posted an update — Oct 09, 2025 12:27 PM EDT

Highlights

AI backend migrated from Flask + deprecated vertexai.generative_models → FastAPI + google-genai SDK (Gemini 2.5 Pro).
Reliability/Throughput: per-pod RPM throttle, semaphore concurrency, truncated exponential backoff with jitter + Retry-After.
K8s hardening: clean base/overlays, proper probes, Workload Identity (WI) bootstrap for Vertex, and FinOps autoscaling.
UI fixed: rebuilt/published budget-coach-ui v0.1.1, corrected in-cluster service ports, and aligned with new /api/* backend.
Smoke tests green end-to-end: the core, data, e2e, fraud, spending, and coach suites all pass.
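
For readers unfamiliar with the retry policy named in the highlights, here is a minimal sketch of truncated exponential backoff with full jitter that also honors a Retry-After value. The function names, the base and cap defaults, and the choice of "full jitter" are assumptions for illustration; the post does not show the actual implementation in insight-agent.

```python
import random
from typing import Optional


def backoff_delay(attempt: int,
                  base: float = 1.0,
                  cap: float = 32.0,
                  retry_after: Optional[float] = None,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to sleep before retry number `attempt` (0-based)."""
    source = rng or random
    # Exponential growth, truncated at `cap` (hypothetical defaults).
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly in [0, ceiling] so pods do not retry in lockstep.
    delay = source.uniform(0.0, ceiling)
    # A server-provided Retry-After (in seconds) acts as a lower bound.
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay


def parse_retry_after(value: Optional[str]) -> Optional[float]:
    """Parse the delta-seconds form of Retry-After; ignore anything else."""
    if value is None:
        return None
    try:
        seconds = float(value.strip())
    except ValueError:
        return None  # the HTTP-date form is not handled in this sketch
    return seconds if seconds >= 0 else None
```

A caller would sleep for backoff_delay(attempt, retry_after=parse_retry_after(header)) between attempts and give up after a fixed number of retries.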

What changed

Backend (insight-agent)

New FastAPI app (src/ai/insight-agent/main_vertex.py) with endpoints:
- POST /api/budget/coach
- POST /api/spending/analyze
- POST /api/fraud/detect
- GET /api/healthz
Switched to google-genai client (Vertex mode), model: gemini-2.5-pro.
JSON-schema responses + ThinkingConfig budgets; deterministic (temperature=0.0).
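
As a rough illustration of the settings above, the sketch below builds a google-genai request config as a plain dict (the SDK accepts a dict in place of GenerateContentConfig) and passes it to generate_content in Vertex mode. The schema, the helper names build_config and coach, and the numeric defaults are hypothetical; only the model name, temperature=0.0, and the GENAI_MAX_TOKENS / GENAI_THINK_TOKENS env names come from this update.

```python
import os
from typing import Any, Dict

# Hypothetical response schema; the real one used by insight-agent is not shown here.
COACH_SCHEMA: Dict[str, Any] = {
    "type": "OBJECT",
    "properties": {
        "summary": {"type": "STRING"},
        "tips": {"type": "ARRAY", "items": {"type": "STRING"}},
    },
    "required": ["summary", "tips"],
}


def build_config(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Build a generate_content config dict from the env knobs."""
    return {
        "temperature": 0.0,  # deterministic output, as stated above
        # Fallback values below are assumptions, not the project's defaults.
        "max_output_tokens": int(os.environ.get("GENAI_MAX_TOKENS", "2048")),
        "response_mime_type": "application/json",
        "response_schema": schema,
        "thinking_config": {
            "thinking_budget": int(os.environ.get("GENAI_THINK_TOKENS", "1024")),
        },
    }


def coach(prompt: str, project: str, location: str) -> str:
    """Call Gemini through Vertex AI and return the JSON text of the reply."""
    from google import genai  # imported lazily so build_config works without the SDK

    client = genai.Client(vertexai=True, project=project, location=location)
    response = client.models.generate_content(
        model="gemini-2.5-pro",
        contents=prompt,
        config=build_config(COACH_SCHEMA),
    )
    return response.text
```

Keeping the config in one builder makes it easy to see which values are fixed (temperature, MIME type) and which are tunable per deployment.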
Dynamic Shared Quota (DSQ)-friendly controls via env:
- GENAI_CONCURRENCY, GENAI_RPM, GENAI_MAX_TOKENS, GENAI_THINK_TOKENS.
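
One way the first two knobs could be enforced per pod is sketched below: an asyncio semaphore caps in-flight calls and a rolling window caps request starts per minute. The Limiter class, its method names, and the fallback values are assumptions for illustration, not the code in main_vertex.py, and the sketch assumes rpm is at least 1.

```python
import asyncio
import os
import time
from collections import deque
from typing import Awaitable, Callable, Deque, TypeVar

T = TypeVar("T")


class Limiter:
    """Per-process cap on in-flight calls plus a rolling requests-per-window cap."""

    def __init__(self, concurrency: int, rpm: int, window: float = 60.0) -> None:
        self.concurrency = concurrency
        self.rpm = rpm
        self.window = window
        self._slots = asyncio.Semaphore(concurrency)
        self._lock = asyncio.Lock()
        self._starts: Deque[float] = deque()  # start times of recent calls

    async def _wait_for_rate(self) -> None:
        async with self._lock:
            while True:
                now = time.monotonic()
                # Drop start times that have aged out of the window.
                while self._starts and now - self._starts[0] >= self.window:
                    self._starts.popleft()
                if len(self._starts) < self.rpm:
                    self._starts.append(now)
                    return
                # Sleep until the oldest recorded call leaves the window.
                await asyncio.sleep(self.window - (now - self._starts[0]))

    async def run(self, call: Callable[[], Awaitable[T]]) -> T:
        async with self._slots:          # concurrency cap
            await self._wait_for_rate()  # RPM cap
            return await call()


def limiter_from_env() -> Limiter:
    # Fallback values are hypothetical; the env names come from the list above.
    return Limiter(
        concurrency=int(os.environ.get("GENAI_CONCURRENCY", "4")),
        rpm=int(os.environ.get("GENAI_RPM", "60")),
    )
```

Because the limits are per process, the effective cluster-wide rate scales with the number of replicas, which is worth keeping in mind when the HPA adds pods.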

Kubernetes

Service ports standardized: cluster port 80 → container 8080 (both mcp-server and insight-agent).
Dev overlay sets Vertex/env knobs; Vertex Dockerfile runs uvicorn.
Added HPA/VPA (Autopilot) manifests under kubernetes-manifests/finops/.

FinOps

Cloud Logging cost cut via exclusion filter on _Default sink.
Enabled Vertical Pod Autoscaling on cluster.
Added HPA/VPA for all app agents (userservice, transactionhistory, frontend, mcp-server, agent-gateway, insight-agent, etc.).
Insight-agent HPA set to minReplicas: 1 (keeps latency predictable for demos).

UI (Streamlit)

Image rebuilt & pushed: .../budget-coach-ui:v0.1.1.
Fixed stale deployment (judges overlay) and set envs:
- INSIGHT=http://insight-agent/api
- USERSVC=http://userservice:8080
- MCPSVC=http://mcp-server (port fix: the service listens on 80, so no :8080)
UI now transforms BoA txns to {date,label,amount} before POSTing to the new FastAPI APIs.
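
The transform might look roughly like the sketch below. Only the output shape {date,label,amount} comes from this update; the input field names (fromAccountNum, toAccountNum, amount, timestamp), the assumption that amounts arrive in integer cents, the sign convention, and the label wording are all guesses for illustration.

```python
from datetime import date
from typing import Any, Dict


def to_api_txn(txn: Dict[str, Any], account_id: str) -> Dict[str, Any]:
    """Map one Bank of Anthos transaction to the {date, label, amount} shape."""
    outgoing = txn["fromAccountNum"] == account_id
    counterparty = txn["toAccountNum"] if outgoing else txn["fromAccountNum"]
    # Assumes an ISO-8601 timestamp string; keep only the calendar date.
    day = date.fromisoformat(str(txn["timestamp"])[:10]).isoformat()
    # Assumes integer cents; convert to currency units.
    amount = txn["amount"] / 100.0
    return {
        "date": day,
        "label": f"{'to' if outgoing else 'from'} {counterparty}",
        # Hypothetical convention: money leaving the account is negative.
        "amount": -amount if outgoing else amount,
    }
```

Sending a plain date string rather than the raw timestamp keeps the payload aligned with what the new endpoints validate.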

Validation

make smoke-fast and make smoke-e2e passed:
- Fraud: high-quality structured findings with SAR recommendations.
- Spending: top categories + unusual count.
- Coach: budget summary + buckets + tips.
Manual UI checks confirm end-to-end flow after env + port fix.

Notables / Footguns avoided

The earlier 422 Unprocessable Entity errors came from the stale UI posting the old timestamp shape; deploying the v0.1.1 UI fixed them.
The ConnectTimeout came from calling http://mcp-server:8080, while the service listens on port 80. The env var has been corrected.
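
A small pre-flight check can catch this kind of port mismatch before it shows up as a timeout. The sketch below compares the port in each service URL against the service ports described in this update; the SERVICE_PORTS table and the port_mismatches helper are hypothetical and would need to be kept in sync with the actual manifests.

```python
from typing import Dict, List
from urllib.parse import urlsplit

# Service ports as described in this update; adjust to match your manifests.
SERVICE_PORTS: Dict[str, int] = {
    "insight-agent": 80,
    "mcp-server": 80,
    "userservice": 8080,
}


def port_mismatches(env: Dict[str, str]) -> List[str]:
    """Return one message per env URL whose port differs from the service port."""
    problems: List[str] = []
    for name, url in env.items():
        parts = urlsplit(url)
        expected = SERVICE_PORTS.get(parts.hostname or "")
        if expected is None:
            continue  # unknown host: nothing to compare against
        # No explicit port means the scheme default.
        actual = parts.port or (443 if parts.scheme == "https" else 80)
        if actual != expected:
            problems.append(
                f"{name}={url} targets port {actual}, service listens on {expected}"
            )
    return problems
```

Running this against the UI's env at startup (or in a smoke target) would have flagged MCPSVC=http://mcp-server:8080 immediately.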

Paths you’ll see in the repo

Backend: src/ai/insight-agent/* (FastAPI, prompts, Dockerfiles, k8s overlays)
UI: ui/* (Streamlit app, Dockerfile, overlays)
FinOps: kubernetes-manifests/finops/* (HPA/VPA)
Make targets: deploy, smoke, WI bootstrap, image pinning.

Quick commands (for reference)

# Update UI env to correct ports/paths
kubectl -n default set env deploy/budget-coach-ui \
  INSIGHT=http://insight-agent/api \
  USERSVC=http://userservice:8080 \
  MCPSVC=http://mcp-server

# Re-deploy judges UI overlay
make ui-judges-apply
