SQL X-Ray — give GitLab Orbit real SQL comprehension

Inspiration

GitLab Orbit turns a repository into a queryable knowledge graph so agents finally have real context. But Orbit Local indexes code — and SQL is not one of its languages. Index any project and the graph proves it:

$ orbit sql "SELECT count(*) FROM gl_definition"
0

Zero definitions. Orbit sees that the .sql files exist but cannot tell you what they read, which columns flow where, or which downstream tables break on a change. For data and analytics teams that is exactly where the risk lives, and an LLM that "reads" SQL as text will happily hallucinate the answer. We wanted to give Orbit-powered agents real, grounded SQL comprehension.

What it does

SQL X-Ray is an agent and a flow on the GitLab Duo Agent Platform that gives Orbit something it does not have on its own: an understanding of SQL.

Agent (on-demand): ask "what does this query do", "where does column net_revenue come from", "what breaks if I change revenue_daily", and it answers with a grounded analysis instead of a guess.
Flow (automated): assign it as a reviewer on a merge request and it analyzes every changed .sql file, then posts a structured review comment — data-flow summary, column lineage, downstream blast radius, and risk findings with severities, ending in a merge / request-changes recommendation.

Every SQL claim is produced by sqlucent, a deterministic SQL analyzer, and fused with the repository context Orbit provides.
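
To make the shape of the review comment concrete, here is a minimal sketch of how risk findings with severities could be turned into a merge / request-changes recommendation. Everything in it is an assumption for illustration: the `Finding` fields, the severity names ("info", "warning", "error"), the rule name `select-star`, and the comment layout are hypothetical and are not taken from sqlucent or the published flow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule: str       # hypothetical rule identifier, e.g. "select-star"
    severity: str   # assumed levels: "info", "warning", "error"
    message: str


# Assumption: only "error" findings block a merge.
BLOCKING = {"error"}


def recommend(findings):
    """Return "request-changes" if any finding is blocking, else "merge"."""
    if any(f.severity in BLOCKING for f in findings):
        return "request-changes"
    return "merge"


def render_review(path, findings):
    """Build a plain-text review comment for one changed .sql file."""
    lines = [f"SQL X-Ray review: {path}", ""]
    if not findings:
        lines.append("No risk findings.")
    for f in findings:
        lines.append(f"- [{f.severity}] {f.rule}: {f.message}")
    lines += ["", f"Recommendation: {recommend(findings)}"]
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        Finding("select-star", "warning", "SELECT * hides column changes"),
        Finding("missing-where", "error", "DELETE without a WHERE clause"),
    ]
    print(render_review("models/revenue_daily.sql", demo))
```

In the real flow the findings would come from sqlucent's output rather than being constructed by hand; the point of the sketch is only that the recommendation is a deterministic function of the findings.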

How we built it

The whole design is bridge, not rebuild:

Layer	Source	Answers
Codebase / SDLC graph	GitLab Orbit (CLI)	which files changed, how they relate, who owns them
SQL semantics	sqlucent (`run_command`)	walkthrough, column lineage, cross-file impact, risk lint

sqlucent is a pure-Python engine built on sqlglot and already published on PyPI. It needs no warehouse connection and no credentials, and it is fully deterministic. It emits a step-by-step walkthrough, column-level lineage, a cross-file table DAG with --impact blast radius, and rule-based lint findings with a CI-gate exit code.
The agent (agents/sql-xray.yml) and flow (flows/sql-xray-review.yml) carry a system prompt with one golden rule: never describe SQL from your own reading — run sqlucent and quote it. They use Orbit via run_command for repository context and create_merge_request_note to post the review.
Both are published to the GitLab AI Catalog as public, versioned items, and .gitlab-ci.yml includes the AI Catalog Sync component to validate the YAML on every pipeline.
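
The idea behind a blast radius over a table DAG can be sketched in a few lines. This is not sqlucent's implementation, only an illustration of the technique: it assumes the read/write edges have already been extracted from the SQL files, and every table name except revenue_daily is hypothetical.

```python
from collections import defaultdict, deque


def downstream_impact(edges, changed):
    """Return, sorted, every table that transitively reads from `changed`.

    `edges` is an iterable of (source_table, target_table) pairs, one per
    statement that reads `source_table` and writes `target_table`.
    """
    readers = defaultdict(set)
    for source, target in edges:
        readers[source].add(target)

    impacted = set()
    queue = deque([changed])
    while queue:
        table = queue.popleft()
        # .get avoids creating empty entries for leaf tables.
        for dependent in readers.get(table, ()):
            if dependent != changed and dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return sorted(impacted)


# Hypothetical edges for illustration only.
EDGES = [
    ("raw_orders", "revenue_daily"),
    ("revenue_daily", "revenue_monthly"),
    ("revenue_daily", "exec_dashboard"),
    ("revenue_monthly", "exec_dashboard"),
]

if __name__ == "__main__":
    print(downstream_impact(EDGES, "revenue_daily"))
    # ['exec_dashboard', 'revenue_monthly']
```

With no edges at all the function returns an empty list, which is the case where the agent should report an empty DAG instead of guessing.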

Challenges we ran into

--impact needs write statements. sqlucent builds its cross-file DAG from CREATE … AS / INSERT edges, so a folder of pure SELECTs yields no graph. We reworked the sample models to write tables so the blast-radius demo is real, and taught the agent to say "the DAG is empty" rather than guess.
Flow template validation. The flow prompt referenced {{goal}}, but the platform rejects template variables that aren't declared as component inputs. We had to map context:goal explicitly before the flow would publish.
Keeping the agent honest. The entire value of the tool is the absence of hallucination, so the system prompt is built around a single golden rule plus guardrails that force every SQL fact to come from sqlucent's verbatim output.
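
The template-validation problem above is easy to catch locally before publishing. The sketch below is a hypothetical pre-check, not the platform's validator: it assumes placeholders use the `{{name}}` form seen in our prompt and that the declared component inputs are available as a list of names.

```python
import re

# Matches {{name}} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def undeclared_variables(prompt, declared_inputs):
    """Return placeholder names used in `prompt` but missing from the inputs.

    Names are reported once each, in order of first appearance.
    """
    declared = set(declared_inputs)
    missing = []
    for name in PLACEHOLDER.findall(prompt):
        if name not in declared and name not in missing:
            missing.append(name)
    return missing


if __name__ == "__main__":
    prompt = "Review the merge request. Goal: {{goal}}"
    print(undeclared_variables(prompt, []))        # ['goal']
    print(undeclared_variables(prompt, ["goal"]))  # []
```

A check like this could run in CI alongside the AI Catalog Sync validation so that an undeclared variable fails fast.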

What we learned

Orbit's knowledge graph is powerful precisely because it knows its lane — the right move is to bridge a deterministic engine onto it, not to ask an LLM to fake SQL understanding.
Grounding beats fluency: a parser-backed answer a reviewer can trust is worth more than a confident paragraph that might be wrong.
The Duo Agent Platform's split of Agent (interactive) and Flow (automated, MR-native) maps cleanly onto "ask me" vs "review every MR".

What's next

Richer Orbit fusion: join sqlucent's table/column lineage with Orbit's ownership and recent-MR signals to rank reviewers by blast radius.
More dialects and lint rules; schema-aware SELECT * expansion by default.

Built With

ai-catalog
duckdb
gitlab
gitlab-duo-agent-platform
gitlab-orbit
knowledge-graph
mermaid
python
sql
sqlglot
sqlucent

Updates

willam wang started this project — Jun 20, 2026 09:00 PM EDT
