Pipeline Rescue Agent

An evidence-backed data incident agent that decides whether stale dashboards can be trusted before stakeholders use them.


Inspiration

Dashboards can look available while the data underneath is stale.

That creates a dangerous gap before leadership meetings: stakeholders may open a report, see familiar charts, and assume the numbers are safe to use. Meanwhile, the analyst still has to jump between pipeline tools, warehouse metadata, and status updates to answer the real question:

Can this dashboard be trusted right now?

Pipeline Rescue Agent was built around that trust decision. Instead of acting like a generic chatbot or a passive monitoring page, it gathers evidence, reasons over the incident, records its decision path, and stops for human approval before stakeholder communication.


What it does

Pipeline Rescue Agent is an executive incident cockpit for stale reporting pipelines.

In the demo scenario, the Monday Sales Dashboard is stale before a leadership meeting. The agent investigates the incident and determines whether the Executive Sales Overview dashboard should be cleared for stakeholder use.

The workflow is intentionally short:

Run Rescue Investigation
   |
   v
Review trust decision
   |
   v
Approve Recovery Brief

The agent:

  • loads the active reporting incident
  • checks Fivetran connection evidence
  • checks live BigQuery freshness evidence
  • uses Gemini on Google Cloud to generate a recovery plan
  • records an auditable Agent Run Ledger
  • requires human approval before producing a stakeholder-ready recovery brief

The key demo moment is:

Fivetran evidence is healthy, but BigQuery data is stale, so the dashboard is not cleared for leadership use.
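To make that decision rule concrete, here is a minimal sketch in TypeScript. It is not the app's actual code: the function name `decideTrust`, the `TrustDecision` shape, and the labels "Broken", "Fresh", and "Connector failure" are assumptions for illustration. Only "Healthy", "Stale", and "Upstream source freshness" come from the example decision in this write-up.

```typescript
// Illustrative types; only "Healthy", "Stale", and "Upstream source freshness"
// appear in the write-up. Everything else here is an assumed name.
type PipelineStatus = "Healthy" | "Broken";
type DataStatus = "Fresh" | "Stale";

interface TrustDecision {
  clearedForLeadership: boolean;
  likelyIssue: string | null;
  approvalRequired: boolean;
}

function decideTrust(pipelineStatus: PipelineStatus, dataStatus: DataStatus): TrustDecision {
  // A broken connector blocks the dashboard regardless of what the table looks like.
  if (pipelineStatus === "Broken") {
    return { clearedForLeadership: false, likelyIssue: "Connector failure", approvalRequired: true };
  }
  // Healthy connector but stale table: the problem is likely upstream of the connector.
  if (dataStatus === "Stale") {
    return { clearedForLeadership: false, likelyIssue: "Upstream source freshness", approvalRequired: true };
  }
  // Both checks pass; a human still approves any stakeholder communication.
  return { clearedForLeadership: true, likelyIssue: null, approvalRequired: true };
}
```

Calling `decideTrust("Healthy", "Stale")` in this sketch yields a decision that is not cleared, which is the key demo moment: connector health alone never clears the dashboard.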


How it uses Google Cloud

Pipeline Rescue Agent uses Google Cloud throughout the deployed workflow:

  • Cloud Run hosts the Next.js web app and API routes.
  • Gemini on Google Cloud / Vertex AI generates recovery recommendations from incident, Fivetran, and BigQuery evidence.
  • BigQuery stores the synced sales order data and provides freshness evidence.
  • Secret Manager stores Fivetran credentials for the deployed service.
  • Cloud Build and Artifact Registry support the Cloud Run source deployment path.

The hosted demo is available here:

https://pipeline-rescue-agent-226881366082.us-central1.run.app

How it uses Fivetran

Fivetran is the pipeline evidence layer.

The demo pipeline is:

Google Sheets source
   |
   v
Fivetran connection
   |
   v
BigQuery table: pipeline_rescue.sales_orders

The app checks Fivetran connection evidence including setup state, sync state, update state, warnings, and tasks. That evidence is compared against BigQuery freshness evidence so the agent can distinguish between:

  • a broken connector, and
  • a healthy pipeline path where the reporting data is still stale.

So that judges can always run it, the hosted demo can fall back to cached Fivetran connection evidence if the live Fivetran API is unavailable after the trial expires. The connection was validated during development through the live Fivetran API, and the official Fivetran MCP server was validated locally in read-only mode against the same connection.

This keeps the public demo runnable while preserving the intended Fivetran evidence path.
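The fallback described above might look roughly like the sketch below. It assumes the caller has already attempted the live Fivetran request and passes `null` when that attempt failed. The `FivetranEvidence` shape, its field names, and `selectFivetranEvidence` are hypothetical, not the deployed app's schema.

```typescript
// Hypothetical evidence shape; field names are assumptions for illustration.
interface ConnectionSnapshot {
  setupState: string;
  syncState: string;
  capturedAt: string; // ISO timestamp of when the evidence was captured
}

interface FivetranEvidence extends ConnectionSnapshot {
  source: "live" | "cached";
}

// liveResult is null when the live API call failed or was unavailable.
function selectFivetranEvidence(
  liveResult: ConnectionSnapshot | null,
  cachedSnapshot: ConnectionSnapshot,
): FivetranEvidence {
  if (liveResult !== null) {
    return { ...liveResult, source: "live" };
  }
  // Label cached evidence explicitly so the ledger never presents it as live.
  return { ...cachedSnapshot, source: "cached" };
}
```

Tagging each result with its `source` is one way to keep the cached path honest: whoever reads the evidence can see whether it came from the live API or from the captured snapshot.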


Fivetran MCP validation

The official Fivetran MCP server was validated locally in read-only mode with:

FIVETRAN_ALLOW_WRITES=false

During development, a local MCP client successfully:

  • initialized the official Fivetran MCP server
  • listed 77 available tools
  • confirmed connection-inspection tools including list_connections, get_connection_details, and get_connection_state
  • called get_connection_details for the same Fivetran connection used in the deployed demo
  • retrieved live connection details for the Google Sheets to BigQuery pipeline

The deployed Cloud Run app uses stable backend routes for judge-facing reliability, while the MCP validation confirms compatibility with the official Fivetran MCP tool surface.


What makes it an agent

Pipeline Rescue Agent does not simply summarize an error.

It follows a goal, gathers tool evidence, records observations, makes a trust decision, generates a recovery path, and stops for human approval before stakeholder communication.

The main investigation route returns an agentRun object that records:

  • the agent goal
  • mission context
  • investigation plan
  • tools used
  • observations gathered from Fivetran, BigQuery, and Gemini
  • decision summary
  • confidence level
  • guardrails
  • final artifact status

Example decision:

{
  "pipelineStatus": "Healthy",
  "dataStatus": "Stale",
  "likelyIssue": "Upstream source freshness",
  "confidence": "high",
  "approvalRequired": true
}

The model recommends. The human approves.


How I built it

The frontend is a Next.js, React, TypeScript, and Tailwind CSS app deployed on Cloud Run. The interface is designed as an executive incident cockpit rather than a generic chat page. Judges can run the investigation, review the trust decision, open optional proof panels, and approve the recovery brief.

The backend uses Next.js API routes on Cloud Run:

  • POST /api/investigate orchestrates the full agent investigation
  • GET /api/incidents loads the active demo incident
  • GET /api/fivetran/status checks Fivetran connection evidence
  • GET /api/data/freshness checks live BigQuery freshness
  • POST /api/agent/recovery-plan calls Gemini for a recovery plan
  • POST /api/approval/generate-brief generates the approved recovery brief
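The freshness check behind GET /api/data/freshness could be sketched as below, as a pure function so it can be exercised without a BigQuery connection. The SQL text, the `order_date` column, the 24-hour default threshold, and every name in the code are assumptions. Only the table name `pipeline_rescue.sales_orders` comes from this write-up.

```typescript
// Assumed query; the real column used to measure freshness is not documented here.
const FRESHNESS_SQL =
  "SELECT MAX(order_date) AS latest_order_at, COUNT(*) AS row_count " +
  "FROM pipeline_rescue.sales_orders";

interface FreshnessResult {
  latestOrderAt: string;
  ageHours: number;
  rowCount: number;
  status: "Fresh" | "Stale";
  query: string; // kept so the evidence records which query produced it
}

function evaluateFreshness(
  latestOrderAt: string, // ISO timestamp returned by the query
  rowCount: number,
  now: Date,
  maxAgeHours: number = 24, // assumed threshold
): FreshnessResult {
  const ageMs = now.getTime() - new Date(latestOrderAt).getTime();
  const ageHours = Math.round((ageMs / (60 * 60 * 1000)) * 10) / 10;
  // An empty table is treated as stale no matter what timestamp is reported.
  const status = rowCount > 0 && ageHours <= maxAgeHours ? "Fresh" : "Stale";
  return { latestOrderAt, ageHours, rowCount, status, query: FRESHNESS_SQL };
}
```

In a deployed route, the two inputs would come from running the query with a BigQuery client. Keeping the evaluation separate from the query makes the threshold logic easy to test.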

Gemini receives structured evidence from the incident, Fivetran connection state, BigQuery freshness result, row count signals, and business impact. It returns likely cause, business risk, recommended action, evidence, next steps, and stakeholder-safe messaging.
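Because the model's output feeds a stakeholder-facing brief, it is worth validating before use. The sketch below assumes Gemini is asked to return JSON with six fields matching the list above. The field names, the `RecoveryPlan` interface, and `parseRecoveryPlan` are hypothetical, and the model call itself is omitted.

```typescript
// Hypothetical response shape; the app's real field names may differ.
interface RecoveryPlan {
  likelyCause: string;
  businessRisk: string;
  recommendedAction: string;
  evidence: string[];
  nextSteps: string[];
  stakeholderMessage: string;
}

const isNonEmptyString = (v: unknown): v is string => typeof v === "string" && v.length > 0;
const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((item) => typeof item === "string");

// Returns null for anything that is not a complete, well-typed plan.
function parseRecoveryPlan(raw: string): RecoveryPlan | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const { likelyCause, businessRisk, recommendedAction, evidence, nextSteps, stakeholderMessage } =
    data as Record<string, unknown>;
  if (!isNonEmptyString(likelyCause) || !isNonEmptyString(businessRisk)) return null;
  if (!isNonEmptyString(recommendedAction) || !isNonEmptyString(stakeholderMessage)) return null;
  if (!isStringArray(evidence) || !isStringArray(nextSteps)) return null;
  return { likelyCause, businessRisk, recommendedAction, evidence, nextSteps, stakeholderMessage };
}
```

Returning `null` rather than a partially filled plan lets the caller decide how to degrade, for example by showing the raw evidence without a recommendation.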


Challenges and decisions

Trial-safe Fivetran judging

The Fivetran trial used during development expired before final judging. During the trial, the app validated the Google Sheets to BigQuery connection through the live Fivetran API, and the official Fivetran MCP server was validated locally in read-only mode against the same connection.

To keep the hosted demo reliable for judges after trial expiration, the deployed app uses cached Fivetran connection evidence captured from the validated demo pipeline. The agent still combines that Fivetran evidence with live BigQuery freshness and Gemini reasoning.

Safety over automatic repair

The app intentionally does not auto-send stakeholder communication, trigger destructive pipeline operations, or force a Fivetran resync. The MVP focuses on the trust decision and approval workflow: collect evidence, decide whether the dashboard is safe to use, and produce a recovery brief only after human approval.
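The approval gate can be expressed as a small guard: no brief is produced unless an approver is recorded. This is a sketch under assumed names (`BriefRequest`, `BriefResult`, `generateBrief`) and does not reproduce the app's actual /api/approval/generate-brief handler.

```typescript
// Hypothetical request and result shapes for illustration.
interface BriefRequest {
  decisionSummary: string;
  approvedBy: string | null; // null until a human approves
}

type BriefResult =
  | { status: "blocked"; reason: string }
  | { status: "ready"; brief: string };

function generateBrief(request: BriefRequest): BriefResult {
  const approver = request.approvedBy === null ? "" : request.approvedBy.trim();
  // Guardrail: refuse to produce stakeholder-facing text without a named approver.
  if (approver.length === 0) {
    return { status: "blocked", reason: "Human approval is required before a brief is generated." };
  }
  return {
    status: "ready",
    brief: `Recovery brief (approved by ${approver}): ${request.decisionSummary}`,
  };
}
```

Modeling the result as a discriminated union means callers must check `status` before they can read `brief`, so the guardrail is enforced by the type checker as well as at runtime.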

Vertical slice over feature sprawl

The project focuses on one high-severity stale-dashboard incident instead of many shallow scenarios. This keeps the demo judge-testable while still showing the full agent loop: incident, evidence, reasoning, audit, approval, and brief.


Accomplishments that I am proud of

  • Built a deployed executive incident cockpit on Google Cloud Run
  • Connected Fivetran pipeline evidence with live BigQuery freshness evidence
  • Used Gemini on Google Cloud to generate a structured recovery plan from evidence
  • Added an Agent Run Ledger so the agent's goal, tools, observations, decision, and guardrails are auditable
  • Enforced human approval before stakeholder communication
  • Added a trial-safe cached Fivetran evidence path so judges can still run the demo after partner trial expiration
  • Designed the UI around a clear trust decision instead of a generic chatbot interface

What I learned

The most important agentic step is not generating a paragraph. It is gathering evidence, making a decision, recording the reasoning path, and knowing when to stop for human approval.

I also learned that pipeline health and dashboard trust are not the same thing. A connector can look healthy while the destination table is stale. That distinction is exactly where an evidence-backed agent can help data teams move faster and communicate more safely.


What's next

  • Add Fivetran sync-history inspection
  • Add an optional MCP-driven production tool path
  • Add dashboard lineage mapping
  • Add exportable recovery briefs
  • Add stakeholder email drafts after approval
  • Add incident history and trend analysis
  • Add ticket creation after human approval

Built with

  • Google Cloud Run
  • Gemini on Google Cloud / Vertex AI
  • BigQuery
  • Secret Manager
  • Fivetran
  • Fivetran MCP server (validated locally in read-only mode)
  • Next.js
  • React
  • TypeScript
  • Tailwind CSS
  • Google Gen AI SDK
  • GitHub Codespaces
