INSPIRATION

A Bay Area shopping-mall General Manager oversees hundreds of tenants across millions of square feet — yet most still run the business off weekly Excel reports. By the time a problem surfaces in a spreadsheet, it's already too late to act. GoldenGate was built to answer one question: what if a GM could just ask?

⚠️ All data is completely synthetic and generated for demonstration only. Real Bay Area brand names are used purely for realism and are not associated with any real business data.


WHAT IT DOES

GoldenGate Retail AI is a multi-agent co-pilot that gives mall GMs instant, data-backed answers — and shows the SQL behind every number:

  • Revenue & transactions — "How much revenue did Westfield Valley Fair generate last month?"
  • Tenant rankings — "Who are the top 5 tenants at Stanford Shopping Center by revenue?"
  • Lease risk — "Which tenants at Santana Row have leases expiring in the next 6 months?"
  • Cross-mall brand view — "Compare lululemon's performance across all Bay Area malls"
  • Weather impact — "What was the weather impact on foot traffic at Bay Street Emeryville last quarter?"
  • 30-day forecasts — "Forecast next 30 days revenue for Stanford, including the 30-day total"
  • Prioritized actions — "What are the top 3 actions I should take this week at Valley Fair?"
  • Pipeline health — "Is the Fivetran data pipeline healthy?"

Every answer is grounded in the warehouse — the agent never invents numbers, and an expander shows the exact SQL that produced each result.


HOW WE BUILT IT

Required stack (all invoked at runtime)

  • Gemini 3 Flash Preview on Vertex AI (global endpoint) — every agent's reasoning model.
  • Google Agent Development Kit (ADK 1.34) — part of Vertex AI Agent Builder; powers the root orchestrator and three specialist sub-agents. Deployed on Cloud Run, a Google-documented ADK runtime (ADK agents run on Cloud Run, GKE, Agent Engine, or locally).
  • Fivetran MCP server — wired into the agent via ADK's McpToolset for live pipeline monitoring (this is the Fivetran-track integration).

Data pipeline

A Cloud SQL Postgres source is synced to BigQuery via Fivetran. The agent's analytical warehouse, goldengate_core, is generated synthetically (simulate_data.pyload_bigquery.py) and kept current by a daily incremental refresh — a Cloud Run Job triggered by Cloud Scheduler that appends new days, rebuilds the aggregates, and retrains the ML model.

BigQuery warehouse (goldengate_core)

dim_mall, dim_tenant (SCD Type 2), dim_lease, dim_date, dim_customer, fact_transactions, fact_foot_traffic, fact_weather, agg_mall_daily, agg_tenant_daily, plus a revenue_forecast ARIMA_PLUS model and a forecast_cache. ~1.5M synthetic transactions across 13 Bay Area malls, Jan 2020 – present.

Multi-agent system (Google ADK)

A root orchestrator classifies each question and delegates to specialists via AgentTool:

  1. Data Unifier — pulls BigQuery data and checks the Fivetran MCP server for live connector status, then cross-checks the warehouse's actual latest date so a healthy connector is never mistaken for fresh data.
  2. Tenant Diagnoser — classifies tenants 🔴 Critical / 🟡 Watch / 🟢 Healthy using Bay Area rent-to-sales benchmarks by store format.
  3. Action Recommender — produces a prioritized GM action list backed by ARIMA_PLUS forecasts, with a grounded 30-day total computed in the tool.

Accuracy guarantees

  • Show SQL on every answer; the real executed query is captured at the tool layer.
  • Correct aggregate semantics — COUNT(DISTINCT customer_id) for uniques, SUM(revenue)/SUM(transactions) for basket.
  • A no-fabrication guard: no data for a period → it says so and stops.
  • A pytest regression suite locks these as runtime invariants.

UI & deployment

Streamlit chat UI (lavender theme, live tool-call status, Show SQL expander, mall selector) with an embedded Google Data Studio dashboard. Deployed to Cloud Run via Cloud Build.


CHALLENGES WE RAN INTO

  1. Silent statistical errors — early on the agent computed "unique customers" by summing daily uniques (double-counting) and "average basket" as an average-of-averages. Fixed with explicit prompt rules and a regression test suite that asserts the correct vs. naive results diverge.
  2. SQL transparency across the agent boundary — the SQL runs inside sub-agents, so it never appears in the root event stream. Solved by capturing every executed query at the tool layer and surfacing it in a "Show SQL" expander — which also stopped the agent from fabricating a query when asked to "show SQL."
  3. Fivetran MCP on Cloud Run — the MCP server runs as a stdio subprocess; the local dev path doesn't exist in a container. Solved by vendoring it into vendors/fivetran_mcp_server.py and spawning it via StdioConnectionParams.
  4. Keeping data current — a daily Cloud Run Job + Cloud Scheduler incrementally extends the warehouse to yesterday and retrains the ARIMA model, avoiding full-regeneration timeouts.

ACCOMPLISHMENTS WE'RE PROUD OF

  • A GM goes from question to data-backed answer in seconds, with zero SQL — and can verify every number via Show SQL.
  • The Fivetran MCP integration gives real-time pipeline visibility inside the conversation, and reports freshness honestly instead of rubber-stamping a green light.
  • A clean three-agent separation of data retrieval, diagnosis, and recommendation.
  • Accuracy is enforced, not hoped for — regression tests guard the statistical correctness.

WHAT WE LEARNED

  • ADK's AgentTool makes multi-agent composition clean — but nested sub-agent tool calls are invisible to the root, so observability has to be captured at the tool layer.
  • McpToolset + StdioConnectionParams make MCP servers a natural fit for agentic pipelines talking to external APIs.
  • BigQuery ML ARIMA_PLUS is powerful but only as good as the data prep; grounding the forecast total in the tool (not the LLM) eliminates summation drift.
  • "Healthy connector" ≠ "fresh data" — always cross-check MAX(date).

WHAT'S NEXT for GoldenGate Retail AI

  • Real-time alerts (Pub/Sub) when a tenant's revenue drops week-over-week.
  • An agent-level evaluation harness, not just deterministic SQL invariants.
  • A lease-renegotiation memo drafted automatically when rent-to-sales runs hot.
  • Drop in any real retail warehouse — the data model is mall-agnostic.

Built With

  • artifact-registry
  • bigquery
  • bigquery-ml-arima-plus
  • cloud-build
  • cloud-run
  • cloud-run-jobs
  • cloud-scheduler
  • cloud-sql
  • fivetran-mcp
  • gemini-3-flash-preview
  • google-adk
  • google-data-studio
  • mcp
  • python
  • streamlit
  • vertex-ai-agent-builder
Share this project:

Updates