Inspiration

AI code review has a trust problem: ask the same LLM about the same diff twice and you get two different reviews. You can't gate a merge on an opinion. But GitLab now ships something no other platform has — the Orbit knowledge graph, a queryable call graph of your whole repository. That makes a different architecture possible: compute the risk deterministically from the graph, and use the LLM only for what it's good at — explaining.

What it does

For any merge request, five fixed Orbit DSL queries compute:

  1. Changed definitions — diff hunks intersected with definition line ranges (old-side, changed lines only — context lines never drag untouched neighbors in).
  2. Blast radius — every production definition that transitively calls a changed one (reverse CALLS walk, ≤3 hops), minus tests, minus the change itself.
  3. Uncovered impact — radius minus everything the test suite can reach (forward CALLS walk from tests/).
  4. Verdict — PASS / WARN / FAIL as a pure function of those sets; a scoped label blast-radius::low|medium|high applied to the MR.
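  5. Exit path (greedy set cover): a near-minimal set of tests that closes every gap. Greedy selection is an approximation, so the set is small but not guaranteed to be the smallest possible. The gate fails you and hands you the way out.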
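
The set logic in steps 2, 3, and 5 can be sketched in a few lines of Python. This is a minimal illustration, not the gate's actual code: the function names, the in-memory edge list, and the `test_` name prefix (standing in for "lives under tests/") are all assumptions, and the candidate tests passed to `exit_path` are hypothetical.

```python
from collections import defaultdict, deque


def blast_radius(callers, changed, is_test, max_hops=3):
    """Reverse CALLS walk: non-test definitions within max_hops callers of a change."""
    seen = set(changed)
    queue = deque((name, 0) for name in sorted(changed))
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # hop budget spent; do not expand this node
        for caller in sorted(callers.get(node, ())):
            if caller not in seen:
                seen.add(caller)
                queue.append((caller, depth + 1))
    return {d for d in seen if d not in changed and not is_test(d)}


def reachable_from_tests(callees, tests):
    """Forward CALLS walk: every definition that some test can reach."""
    seen, stack = set(), sorted(tests)
    while stack:
        node = stack.pop()
        for callee in callees.get(node, ()):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen


def exit_path(gaps, candidates):
    """Greedy set cover: repeatedly pick the candidate test closing the most gaps."""
    remaining, chosen = set(gaps), []
    while remaining:
        # Sorting first makes ties break on name, so the result is deterministic.
        best = max(sorted(candidates),
                   key=lambda t: len(candidates[t] & remaining), default=None)
        if best is None or not candidates[best] & remaining:
            break  # nothing left can close the remaining gaps
        chosen.append(best)
        remaining -= candidates[best]
    return chosen, remaining


# Hypothetical call graph as (caller, callee) edges.
edges = [("test_parse", "parse"), ("test_load", "load"), ("load", "parse"),
         ("main", "load"), ("cli", "load"), ("run", "main"), ("boot", "run")]
callers, callees = defaultdict(set), defaultdict(set)
for src, dst in edges:
    callers[dst].add(src)
    callees[src].add(dst)

is_test = lambda name: name.startswith("test_")  # stand-in for a tests/ path check
radius = blast_radius(callers, {"parse"}, is_test)
gaps = radius - reachable_from_tests(callees, {"test_parse", "test_load"})
tests, unclosed = exit_path(
    gaps, {"test_cli": {"cli"}, "test_main": {"main"}, "test_run": {"run", "main"}})
print(sorted(radius), sorted(gaps), tests)
```

Here `boot` sits four hops from `parse` and falls outside the radius, `load` is already reachable from a test, and two hypothetical tests close the three remaining gaps.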

A Duo Agent Platform flow runs the gate automatically: mention the gate's service account on any MR and within a second a workflow spins up, runs the script in its sandbox, and posts the report as an MR note — verbatim. The prompt's hard rule: never compute, estimate, or adjust a verdict. A CI job makes the verdict a real pipeline gate (exit code = verdict), and an on-demand AI Catalog agent provides the chat path.

The canonical verdict comes from offline replay, not from the live index. Every Orbit response is recorded under a sha256 fingerprint, and the recorded result (FAIL, radius 6) is reproduced byte for byte in CI with zero network access. This matters because GitLab's Orbit index transiently de-indexes after a merge to main: a live run can momentarily see an empty graph, and a naive live-index gate would flip to PASS and wave a risky merge through. This gate still returns FAIL, pinned to a committed evidence SHA. Determinism never depends on the index being warm.

How we built it

  • Pure-stdlib Python (urllib only) against the Orbit query API + GitLab REST.
  • Every Orbit query is sha256-fingerprinted; a record/replay transport captures live responses as fixtures, and the test suite replays the full pipeline asserting the verdict byte for byte with zero network access (36 tests, all green).
  • The blast-radius diagram is rendered on two surfaces from the same graph SHA: a GitLab-native mermaid call graph in every MR note, and a self-contained inline SVG on a live GitLab Pages report — no build, no external assets, byte-identical to the CLI/CI output.
  • Agent created and registered entirely via the AI Catalog GraphQL API (aiCatalogAgentCreate, consumer registration, versioned release), ported to a Flow registry definition so a mention trigger auto-provisions a service account.
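
The record/replay idea can be sketched as follows. This is an assumption-laden illustration, not the project's transport: the class name, the fixture layout (one file per query fingerprint), and the `live` callable are hypothetical, and the real Orbit query API is not called here.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def fingerprint(query: dict) -> str:
    """sha256 over canonical JSON, so key order never changes the hash."""
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ReplayTransport:
    """Record live responses as fixtures, or replay them with no network."""

    def __init__(self, fixture_dir, live=None):
        self.fixture_dir = Path(fixture_dir)
        self.live = live  # callable(query) -> bytes; None means offline replay

    def send(self, query: dict) -> bytes:
        path = self.fixture_dir / f"{fingerprint(query)}.json"
        if self.live is None:
            # Replay: a missing fixture raises FileNotFoundError instead of
            # silently returning an empty graph.
            return path.read_bytes()
        body = self.live(query)
        path.write_bytes(body)  # record the exact bytes for later replay
        return body


with tempfile.TemporaryDirectory() as fixtures:
    query = {"edge": "CALLS", "direction": "reverse", "max_hops": 3}
    recorder = ReplayTransport(fixtures, live=lambda q: b'{"nodes":["load"]}')
    recorded = recorder.send(query)

    offline = ReplayTransport(fixtures)  # no live callable: replay only
    replayed = offline.send(dict(reversed(list(query.items()))))
    print(replayed == recorded)
```

Keying fixtures by the query fingerprint means a changed query cannot accidentally replay a stale response: it simply has no fixture and the run fails loudly.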

Challenges we ran into

  • Undocumented Orbit re-index behavior: after any merge to main, the project graph transiently shrinks to only the files touched by the merge commit. We diagnosed it with ad-hoc graph queries and built a recovery procedure (a "nudge" MR). A gate must survive its own platform's indexing lifecycle — which is exactly why the verdict is pinned to a committed snapshot, not the live index.
  • Hunk math: mapping diff hunks to definition line ranges is full of off-by-one traps (insertions anchor to the preceding old line; context lines must advance the counter but never count as changed). We unit-tested the exact MR that fooled the naive version.
  • Triggers are flow-only, and the path there was undocumented: agent items can't have triggers, so we ported the gate to a Flow registry v1 definition. API errors taught us that prompts require name and unit_primitives fields that no docs mention, and that a group-level experiment_features_enabled flag silently blocks all flow creation.
  • The flow sandbox has no custom CI/CD variables: our ORBIT_TOKEN never arrived. The only credential inside a Duo workflow is GITLAB_TOKEN, the flow's composite-identity OAuth token — so the gate authenticates as the flow itself, which is the better security story anyway.
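
The hunk math described above can be illustrated with a small unified-diff walker. This is a sketch under assumptions, not the gate's parser: it handles a single-file diff, the function names are hypothetical, and the definition ranges are supplied by hand rather than fetched from Orbit.

```python
import re

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def changed_old_lines(diff_text):
    """Old-side line numbers touched by a unified diff (context lines excluded)."""
    changed = set()
    old_line = old_left = new_left = 0
    for line in diff_text.splitlines():
        if old_left == 0 and new_left == 0:
            m = HUNK_HEADER.match(line)
            if m:
                old_left = int(m.group(2) or 1)
                new_left = int(m.group(4) or 1)
                # With an old count of 0 the start is the line *before* the
                # insertion, so the next old line is start + 1.
                old_line = int(m.group(1)) + (1 if old_left == 0 else 0)
            continue  # file headers and anything else outside a hunk body
        tag = line[:1]
        if tag == "-":
            changed.add(old_line)
            old_line += 1
            old_left -= 1
        elif tag == "+":
            if old_line > 1:
                changed.add(old_line - 1)  # anchor to the preceding old line
            new_left -= 1
        elif tag == "\\":
            pass  # "\ No newline at end of file"
        else:
            # Context line: advance the counter, never mark it as changed.
            old_line += 1
            old_left -= 1
            new_left -= 1
    return changed


def changed_definitions(changed, definitions):
    """definitions maps name -> (start, end), inclusive old-side line range."""
    return sorted(name for name, (start, end) in definitions.items()
                  if any(start <= n <= end for n in changed))


DIFF = "\n".join([
    "--- a/app.py",
    "+++ b/app.py",
    "@@ -10,6 +10,7 @@",
    " def helper():",
    "     return 1",
    " ",
    " def target():",
    "-    return 2",
    "+    x = 2",
    "+    return x",
    " ",
])
touched = changed_old_lines(DIFF)
print(touched, changed_definitions(touched, {"helper": (10, 11), "target": (13, 14)}))
```

In this example `helper` appears only in the hunk's context lines, so it is not reported as changed; only `target` is.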

Accomplishments that we're proud of

  • A verdict that is provably reproducible — fixture replay in CI, sha256 fingerprints in every report, pinned by an evidence SHA.
  • A deterministic blast-radius diagram whose node and edge order is a pure function of the graph SHA — same MR, same picture, byte for byte. Not an LLM sketch: a computed view of the same sets the verdict comes from.
  • Line-range precision verified live: a neighbor function inside the hunk's context lines is correctly excluded.
  • The exit path turns a gate from a wall into a map: "write these 2 tests, close all 4 gaps."
  • End-to-end automation, live: @mention → trigger fires the flow → sandboxed gate run → the gate's own service account posts the report as an MR note, verbatim from the script.
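
One way to get a diagram whose text is a pure function of the graph is to sort everything before emitting it. The sketch below is an assumption about how such a renderer could look, not the project's code: the function name, node ids, and styling are hypothetical, and labels are assumed to contain no double quotes.

```python
import hashlib


def render_mermaid(edges, changed=()):
    """Render (caller, callee) edges as mermaid text in a fixed, sorted order."""
    nodes = sorted({n for edge in edges for n in edge} | set(changed))
    ids = {name: f"n{i}" for i, name in enumerate(nodes)}  # ids follow sort order
    lines = ["graph LR"]
    for name in nodes:
        lines.append(f'    {ids[name]}["{name}"]')
    for caller, callee in sorted(set(edges)):
        lines.append(f"    {ids[caller]} --> {ids[callee]}")
    for name in sorted(set(changed)):
        lines.append(f"    style {ids[name]} stroke-width:3px")  # mark the change
    return "\n".join(lines) + "\n"


edges = [("main", "load"), ("load", "parse")]
first = render_mermaid(edges, changed=["parse"])
second = render_mermaid(list(reversed(edges)), changed=["parse"])

# Input order does not matter: both renderings hash to the same digest.
print(hashlib.sha256(first.encode()).hexdigest()
      == hashlib.sha256(second.encode()).hexdigest())
```

Because node ids are assigned after sorting, the same edge set always yields the same text regardless of the order in which the graph was traversed.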

What's next

  • EXTENDS is already in the engine (opt in with --edges calls,extends); next is a demo MR that carries real inheritance so EXTENDS is exercised live, plus IMPORTS and cross-language edges.
  • Per-repo WARN/FAIL calibration from git history is shipped (gate/calibrate.py, deriving boundaries from commit churn); next is folding in pipeline-incident history as a second signal.
  • More trigger surfaces — MERGE_REQUEST_READY for zero-touch gating on every MR.

Built With

  • gitlab-ci
  • gitlab-duo
  • graphql
  • orbit-knowledge-graph
  • python