Inspiration
GitLab Orbit turns a repository into a queryable knowledge graph so agents finally have real context. But Orbit Local indexes code — and SQL is not one of its languages. Index any project and the graph proves it:
$ orbit sql "SELECT count(*) FROM gl_definition"
0
Zero definitions. Orbit sees the .sql files exist but cannot tell you what
they read, what columns flow where, or what downstream tables break on a change.
For data and analytics teams that is exactly where the risk lives — and an LLM
that "reads" SQL as text will happily hallucinate the answer. We wanted to give
Orbit-powered agents real, grounded SQL comprehension.
What it does
SQL X-Ray is an agent and a flow on the GitLab Duo Agent Platform that gives Orbit something it does not have on its own: an understanding of SQL.
- Agent (on-demand): ask "what does this query do", "where does column `net_revenue` come from", or "what breaks if I change `revenue_daily`", and it answers with a grounded analysis instead of a guess.
- Flow (automated): assign it as a reviewer on a merge request and it analyzes every changed `.sql` file, then posts a structured review comment: data-flow summary, column lineage, downstream blast radius, and risk findings with severities, ending in a merge / request-changes recommendation.
Every SQL claim is produced by sqlucent,
a deterministic SQL analyzer, and fused with the repository context Orbit
provides.
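To make the last step of the review concrete, here is a minimal sketch of how findings with severities could roll up into a merge / request-changes recommendation. The severity scale, the rule names, and the "block on any error" threshold are assumptions for illustration; they are not sqlucent's actual rule set or the flow's real policy.

```python
from dataclasses import dataclass

# Hypothetical severity scale; sqlucent's real levels may differ.
SEVERITY_RANK = {"info": 0, "warning": 1, "error": 2}


@dataclass
class Finding:
    rule: str      # illustrative rule identifier
    severity: str  # one of the SEVERITY_RANK keys
    message: str


def recommendation(findings, block_at="error"):
    """Return "request-changes" if any finding reaches `block_at`, else "merge"."""
    threshold = SEVERITY_RANK[block_at]
    worst = max((SEVERITY_RANK[f.severity] for f in findings), default=-1)
    return "request-changes" if worst >= threshold else "merge"


findings = [
    Finding("select-star", "warning", "SELECT * hides upstream column changes"),
    Finding("missing-where", "error", "DELETE without a WHERE clause"),
]
print(recommendation(findings))      # request-changes
print(recommendation(findings[:1]))  # merge
```

Keeping the decision a pure function of the analyzer's findings means the recommendation stays as deterministic as the findings themselves.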
How we built it
The whole design is bridge, not rebuild:
| Layer | Source | Answers |
|---|---|---|
| Codebase / SDLC graph | GitLab Orbit (CLI) | which files changed, how they relate, who owns them |
| SQL semantics | sqlucent (run_command) | walkthrough, column lineage, cross-file impact, risk lint |
- sqlucent is a pure-Python engine on `sqlglot` (already shipped to PyPI). No warehouse connection, no credentials, fully deterministic. It emits a step-by-step walkthrough, column-level lineage, a cross-file table DAG with `--impact` blast radius, and rule-based lint findings with a CI-gate exit code.
- The agent (`agents/sql-xray.yml`) and flow (`flows/sql-xray-review.yml`) carry a system prompt with one golden rule: never describe SQL from your own reading; run `sqlucent` and quote it. They use Orbit via `run_command` for repository context and `create_merge_request_note` to post the review.
- Both are published to the GitLab AI Catalog as public, versioned items, and `.gitlab-ci.yml` includes the AI Catalog Sync component to validate the YAML on every pipeline.
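The blast-radius idea can be illustrated with a small, dependency-free sketch: treat each write statement as an edge from its source tables to the table it writes, then walk downstream from the changed table. This is not sqlucent's implementation. The `(target, sources)` input shape and every table name except `revenue_daily` are assumptions made for the example.

```python
from collections import defaultdict, deque


def build_dag(writes):
    """Map each source table to the set of tables written from it.

    `writes` is an iterable of (target_table, [source_tables]) pairs,
    one per write statement (hypothetical input shape).
    """
    downstream = defaultdict(set)
    for target, sources in writes:
        for src in sources:
            downstream[src].add(target)
    return downstream


def blast_radius(downstream, changed):
    """Return every table reachable downstream of `changed`, breadth-first."""
    seen, order = {changed}, []
    queue = deque([changed])
    while queue:
        table = queue.popleft()
        for child in sorted(downstream.get(table, ())):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


writes = [
    ("revenue_daily", ["orders", "refunds"]),
    ("revenue_monthly", ["revenue_daily"]),
    ("exec_dashboard", ["revenue_monthly", "customers"]),
]
dag = build_dag(writes)
print(blast_radius(dag, "revenue_daily"))  # ['revenue_monthly', 'exec_dashboard']
print(blast_radius(build_dag([]), "revenue_daily"))  # []
```

The second call shows the empty case: with no write statements there are no edges, so the honest answer is an empty list rather than a guess.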
Challenges we ran into
- `--impact` needs write statements. sqlucent builds its cross-file DAG from `CREATE … AS` / `INSERT` edges, so a folder of pure `SELECT`s yields no graph. We reworked the sample models to write tables so the blast-radius demo is real, and taught the agent to say "the DAG is empty" rather than guess.
- Flow template validation. The flow prompt referenced `{{goal}}`, but the platform rejects template variables that aren't declared as component inputs. We had to map `context:goal` explicitly before the flow would publish.
- Keeping the agent honest. The entire value is no hallucination, so the system prompt is built around a single golden rule and guardrails that force every SQL fact to come from sqlucent's verbatim output.
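The template-variable problem is easy to catch locally before publishing. The sketch below is a hypothetical pre-check, not the platform's validator: it scans a prompt for `{{name}}` placeholders and reports any that are missing from a list of declared inputs. The function name and the plain-list representation of inputs are assumptions; the real flow schema is not reproduced here.

```python
import re

# Matches {{ name }} placeholders with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def undeclared_variables(prompt, declared_inputs):
    """Return placeholder names used in `prompt` but absent from `declared_inputs`."""
    used = set(PLACEHOLDER.findall(prompt))
    return sorted(used - set(declared_inputs))


prompt = "Review the merge request described here: {{goal}}"
print(undeclared_variables(prompt, []))        # ['goal']
print(undeclared_variables(prompt, ["goal"]))  # []
```

Running a check like this in CI would surface an unmapped placeholder as a list of names instead of a failed publish.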
What we learned
- Orbit's knowledge graph is powerful precisely because it knows its lane — the right move is to bridge a deterministic engine onto it, not to ask an LLM to fake SQL understanding.
- Grounding beats fluency: a parser-backed answer a reviewer can trust is worth more than a confident paragraph that might be wrong.
- The Duo Agent Platform's split of Agent (interactive) and Flow (automated, MR-native) maps cleanly onto "ask me" vs "review every MR".
What's next
- Richer Orbit fusion: join sqlucent's table/column lineage with Orbit's ownership and recent-MR signals to rank reviewers by blast radius.
- More dialects and lint rules; schema-aware `SELECT *` expansion by default.
