Inspiration

EU financial firms drown in DORA, NIS2, GDPR and the EU AI Act. A compliance analyst spends ~1.5 hours every day scanning regulatory bulletins, and the truly dangerous moment is a threshold change — when a delegated act quietly redefines what counts as a "major incident." Miss it, and every incident you reported last quarter may now be mis-classified. Fines run to the millions. We wanted an agent that doesn't just summarize the news, but tells you exactly which of your past filings just became wrong.

What it does

Every morning RegPipeline:

  1. Monitors Fivetran connector health (delayed/broken connectors, schema changes) via the Fivetran MCP server.
  2. Reads the regulatory documents synced in the last 24h from BigQuery.
  3. Scores each document's compliance impact (HIGH/MEDIUM/LOW) with Gemini, naming the affected DORA/NIS2/GDPR articles plus the required action and its deadline.
  4. Retroactively re-classifies history — when a DORA threshold moves (e.g. 10%→8%, 2.0h→1.5h), it re-runs every past incident and shows which would now be MAJOR.
  5. Proposes the fix (resync the connector, send the digest, save remediation tasks) and waits for human approval before any consequential action.
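
To make step 4 concrete, here is a minimal sketch of a retroactive re-classification, not the actual src/diff.js. The field names (clientsAffectedPct, downtimeHours), the incident records, and the rule that either criterion alone makes an incident MAJOR are all assumptions for illustration; only the 10%→8% and 2.0h→1.5h values come from the example above.

```javascript
// Hypothetical threshold sets. The numbers mirror the 10% -> 8% and 2.0h -> 1.5h example.
const OLD_THRESHOLDS = { clientsAffectedPct: 10, downtimeHours: 2.0 };
const NEW_THRESHOLDS = { clientsAffectedPct: 8, downtimeHours: 1.5 };

// Assumption: an incident is MAJOR when either criterion meets or exceeds its threshold.
function classify(incident, thresholds) {
  const major =
    incident.clientsAffectedPct >= thresholds.clientsAffectedPct ||
    incident.downtimeHours >= thresholds.downtimeHours;
  return major ? "MAJOR" : "NON_MAJOR";
}

// Re-run every past incident under both threshold sets and keep only the ones whose label changes.
function reclassify(incidents, oldThresholds, newThresholds) {
  return incidents
    .map((incident) => ({
      id: incident.id,
      before: classify(incident, oldThresholds),
      after: classify(incident, newThresholds),
    }))
    .filter((row) => row.before !== row.after);
}

// Made-up incident history, for illustration only.
const history = [
  { id: "INC-001", clientsAffectedPct: 9, downtimeHours: 1.0 },  // crosses the new 8% line
  { id: "INC-002", clientsAffectedPct: 3, downtimeHours: 1.75 }, // crosses the new 1.5h line
  { id: "INC-003", clientsAffectedPct: 12, downtimeHours: 0.5 }, // MAJOR under both sets
  { id: "INC-004", clientsAffectedPct: 2, downtimeHours: 0.25 }, // NON_MAJOR under both sets
];

const changed = reclassify(history, OLD_THRESHOLDS, NEW_THRESHOLDS);
console.log(changed);
```

Because the comparison is plain arithmetic with no model call, the same inputs always produce the same list of affected filings, which is what makes the result auditable.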

Try it live → click "Judge Tour" (top-right) for a 60-second guided walkthrough.

How we built it

  • Gemini: @google/genai, called in src/agent.js to score impact and draft the digest as strict JSON (Gemini 3 via the Developer API when an API key is set, otherwise gemini-2.5-flash on Vertex).
  • Google Cloud Agent Builder: the judged agent is defined in agent-builder/agent.json (Gemini + the Fivetran MCP tool, state changes gated on human approval).
  • Fivetran MCP: the real @getnao/fivetran-mcp-server, spawned and called over stdio at runtime in src/fivetran-mcp.js. Proof: npm run mcp:selftest.
  • BigQuery as the warehouse Fivetran syncs into; Cloud Run + Cloud Scheduler for hosting and the daily trigger.
  • A deterministic, eval-tested engine (src/diff.js, 7/7 evals passing) for the retroactive re-classification.
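
The switch between the Developer API and Vertex could look roughly like the sketch below. It is not the code in src/agent.js: the helper name resolveGeminiClient, the GEMINI_API_KEY and GOOGLE_CLOUD_LOCATION variable names, and the us-central1 default are assumptions. The returned options object is shaped to be passed to the GoogleGenAI constructor from @google/genai, but the helper itself is pure so it runs without the SDK.

```javascript
// Pure helper: decide which Gemini backend to use from environment variables.
// Assumed names: GEMINI_API_KEY, GOOGLE_CLOUD_LOCATION. GOOGLE_CLOUD_PROJECT is from the write-up.
function resolveGeminiClient(env) {
  if (env.GEMINI_API_KEY) {
    // Developer API path. The model id is left to the caller because the write-up does not name it.
    return { backend: "developer-api", options: { apiKey: env.GEMINI_API_KEY } };
  }
  if (!env.GOOGLE_CLOUD_PROJECT) {
    // Fail fast instead of letting a request go out against an undefined project.
    throw new Error("GOOGLE_CLOUD_PROJECT must be set explicitly for the Vertex path");
  }
  return {
    backend: "vertex",
    model: "gemini-2.5-flash",
    options: {
      vertexai: true,
      project: env.GOOGLE_CLOUD_PROJECT,
      location: env.GOOGLE_CLOUD_LOCATION || "us-central1", // assumed default region
    },
  };
}

// Example with made-up values; in the app this would be process.env.
console.log(resolveGeminiClient({ GOOGLE_CLOUD_PROJECT: "example-project" }));
```

Keeping the decision in one pure function means the fallback order can be unit-tested without credentials or network access.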

Challenges we ran into

  • Gemini JSON truncation: gemini-2.5-flash "thinking" tokens were consuming the output budget and truncating the digest JSON. Fixed by disabling thinking (thinkingBudget: 0), raising maxOutputTokens, and adding a tolerant parser.
  • Vertex on Cloud Run: @google/genai needs GOOGLE_CLOUD_PROJECT set explicitly (BigQuery auto-detects the project; Gemini doesn't), and the missing value surfaced as projects/undefined.
  • Wiring the real Fivetran MCP: spawning @getnao/fivetran-mcp-server over stdio and discovering its real tool names from tools/list instead of guessing.
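
The truncation fix can be sketched in two parts: a request config and a tolerant parser. This is an illustration rather than the project's actual code. The config keys follow the @google/genai generation config as we understand it, the 4096 token budget is an arbitrary example value, and parseTolerantJson is a hypothetical helper name.

```javascript
// Request config mirroring the fix: JSON output, no thinking tokens, larger output budget.
// 4096 is an illustrative value, not the one used in the project.
const generationConfig = {
  responseMimeType: "application/json",
  maxOutputTokens: 4096,
  thinkingConfig: { thinkingBudget: 0 },
};

// Tolerant parser: ignores code fences or prose around the object and, if the
// payload was cut off, closes any open string, array, or object before parsing.
// Returns null when the text cannot be repaired (for example, cut off mid-key).
function parseTolerantJson(text) {
  const start = text.indexOf("{");
  if (start === -1) return null;
  const body = text.slice(start);

  // First attempt: parse from the first "{" to the last "}".
  const end = body.lastIndexOf("}");
  if (end !== -1) {
    try {
      return JSON.parse(body.slice(0, end + 1));
    } catch (_) {
      // Fall through to the repair path.
    }
  }

  // Second attempt: track what is still open and append the missing closers.
  const closers = [];
  let inString = false;
  let escaped = false;
  for (const ch of body) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }
  let repaired = inString ? body + '"' : body;
  repaired = repaired.replace(/,\s*$/, "") + closers.reverse().join("");
  try {
    return JSON.parse(repaired);
  } catch (_) {
    return null;
  }
}

// A digest cut off mid-string still yields its complete fields.
console.log(parseTolerantJson('{"impact":"HIGH","articles":["DORA Art. 18"],"action":"Re-file inc'));
```

A repaired payload can contain a shortened final field, so anything consequential derived from it should still pass through the approval step.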

Accomplishments that we're proud of

A genuinely live app where all three required technologies are provably invoked at runtime (npm run smoke → 10/10), a human-in-the-loop approval gate on every consequential action, a deterministic eval-tested core (7/7), and a built-in Judge Tour so reviewers experience the whole flow in 60 seconds.

What we learned

How to make an LLM agent trustworthy in a regulated context — deterministic math for the parts that must be defensible, Gemini for judgement, and a hard human gate on anything that touches production.
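
The human gate can be sketched as a small queue that parks consequential actions until someone approves them. This is a simplified illustration under our own assumptions, not the app's implementation: the ApprovalGate class, the action kinds, and the example connector id are hypothetical.

```javascript
// Hypothetical action kinds treated as consequential; everything else is read-only.
const CONSEQUENTIAL = new Set(["resync_connector", "send_digest", "save_tasks"]);

class ApprovalGate {
  constructor() {
    this.pending = new Map();
    this.nextId = 1;
  }

  // Read-only actions run immediately; consequential ones are parked until approved.
  propose(action, run) {
    if (!CONSEQUENTIAL.has(action.kind)) {
      return { status: "executed", result: run() };
    }
    const id = `proposal-${this.nextId++}`;
    this.pending.set(id, { action, run });
    return { status: "awaiting_approval", id };
  }

  // Runs a parked action exactly once, recording who approved it.
  approve(id, approver) {
    const entry = this.pending.get(id);
    if (!entry) throw new Error(`Unknown or already resolved proposal: ${id}`);
    this.pending.delete(id);
    return { status: "executed", approvedBy: approver, result: entry.run() };
  }

  // Drops a parked action without running it.
  reject(id) {
    if (!this.pending.delete(id)) throw new Error(`Unknown or already resolved proposal: ${id}`);
    return { status: "rejected" };
  }
}

// Usage: the resync is proposed by the agent but only runs after approval.
const gate = new ApprovalGate();
let resyncs = 0;
const proposal = gate.propose(
  { kind: "resync_connector", connectorId: "example_connector" },
  () => ++resyncs
);
console.log(proposal.status, resyncs);
const outcome = gate.approve(proposal.id, "analyst@example.com");
console.log(outcome.status, resyncs);
```

The point of the design is that the agent never holds a code path that executes a consequential action directly; it can only create a proposal.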

What's next for RegPipeline

Push changed articles downstream to a RAG index for re-embedding; expand beyond DORA to a full NIS2 / GDPR / EU AI Act obligation ledger; and connect more live Fivetran connectors.
