Short Description

A 4-agent orchestrated Flow on the GitLab Duo Agent Platform that performs automated premortem analysis on every merge request: detecting AI-generated code antipatterns, mapping blast radius across dependencies, delivering merge verdicts, and estimating CI sustainability impact.


Inspiration

AI coding assistants are generating code faster than ever, but they're also producing hallucinated imports, hardcoded secrets, and missing error handling at unprecedented volume. The humans reviewing this code are becoming the bottleneck: more MRs, shorter review times, and more rubber-stamped approvals. GitLab calls this the AI Paradox: AI accelerates code authoring, but reviews, security, and compliance can't keep up.

We asked: what if your merge request pipeline had its own immune system? Not a chatbot that answers questions, but a team of specialized agents that autonomously analyze every MR before a human ever looks at it.

That's diffRact: like a prism refracting light into its spectrum, it breaks down every merge request into its constituent risk signals.

What it does

When triggered on any merge request (via @-mention, assignment, or reviewer assignment), diffRact runs four specialized agents in sequence:

🔬 The Skeptic (AI Antipattern Detection) scans the diff for patterns that AI coding assistants commonly introduce. It searches the actual codebase to verify findings, checking whether imported packages exist, whether credentials are hardcoded, and whether error handling is missing on new external calls.

🗺️ The Cartographer (Blast Radius Mapping) traces the dependency graph of every changed file. It uses grep and code search to find all downstream files, functions, and services that could be affected. It also checks for test coverage gaps and recent incidents on affected paths.
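
To make the blast-radius idea concrete, here is a minimal sketch of a reverse-dependency traversal. It is not diffRact's implementation (the Cartographer is a prompted agent working through grep and code search); the `imports` map, the function name, and the file paths are all hypothetical.

```python
from collections import deque


def blast_radius(imports: dict[str, set[str]], changed: set[str]) -> set[str]:
    """Return every file that directly or transitively depends on a changed file.

    `imports` maps each file to the set of files it imports. This is a
    hypothetical input, e.g. something assembled from grep results.
    """
    # Invert the graph: for each file, record which files import it.
    dependents: dict[str, set[str]] = {}
    for src, targets in imports.items():
        for target in targets:
            dependents.setdefault(target, set()).add(src)

    # Breadth-first walk outward from the changed files.
    affected: set[str] = set()
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dep in dependents.get(current, ()):
            if dep not in affected and dep not in changed:
                affected.add(dep)
                queue.append(dep)
    return affected


# Hypothetical project layout, for illustration only.
example_imports = {
    "api/routes.py": {"services/auth.py"},
    "services/auth.py": {"lib/tokens.py"},
    "jobs/cleanup.py": {"lib/tokens.py"},
    "lib/tokens.py": set(),
}
print(sorted(blast_radius(example_imports, {"lib/tokens.py"})))
# → ['api/routes.py', 'jobs/cleanup.py', 'services/auth.py']
```

A change to a leaf utility reaches the API layer two hops away, which is the kind of downstream exposure the Cartographer is meant to surface.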

📋 The Advisor (Synthesis & Verdict) reads the findings from both previous agents, cross-references antipatterns with blast radius to assess compound risk, and delivers a clear merge verdict: Safe to Merge, Merge with Caution, Changes Requested, or Do Not Merge. It assigns a numeric risk score, generates prioritized action items, and automatically labels the MR with the risk level.
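
One way to picture compound risk is as a score that grows with both finding severity and blast radius. The sketch below is illustrative only: the Advisor reaches its verdict through a prompt, not a formula, and the weights, thresholds, and function names here are assumptions chosen for the example.

```python
# Illustrative severity weights; not taken from diffRact.
SEVERITY = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def compound_risk(worst_severity: str, affected_files: int, on_critical_path: bool) -> int:
    """Combine finding severity with blast radius into a 1-10 score."""
    score = SEVERITY[worst_severity] * 2   # 2..8 from severity alone
    if affected_files > 5:                 # assumed cutoff for a "wide" change
        score += 1
    if on_critical_path:                   # critical-path modules weigh more
        score += 2
    return max(1, min(10, score))


def verdict(score: int) -> str:
    """Map a 1-10 score to one of the four verdicts (assumed thresholds)."""
    if score >= 9:
        return "Do Not Merge"
    if score >= 6:
        return "Changes Requested"
    if score >= 4:
        return "Merge with Caution"
    return "Safe to Merge"


risky = compound_risk("critical", affected_files=12, on_critical_path=True)
clean = compound_risk("low", affected_files=0, on_critical_path=False)
print(risky, verdict(risky))  # → 10 Do Not Merge
print(clean, verdict(clean))  # → 2 Safe to Merge
```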

♻️ The Accountant (CI Efficiency & Sustainability) estimates the compute waste from problematic changes: pipeline failure probability, wasted CI minutes from debug cycles, and hotfix costs. It recommends concrete actions for greener CI practices like import linting, dependency scanning, and scoped testing.
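
A heuristic of this kind can be as simple as an expected-value calculation. The formula below is an assumed illustration, not the Accountant's actual method, and the input numbers are made up.

```python
def estimated_wasted_ci_minutes(
    failure_probability: float,
    pipeline_minutes: float,
    debug_cycles: int,
) -> float:
    """Expected CI minutes lost: P(failure) x minutes per run x reruns needed.

    All three inputs are estimates; this is a hypothetical heuristic.
    """
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    if pipeline_minutes < 0 or debug_cycles < 0:
        raise ValueError("pipeline_minutes and debug_cycles must be non-negative")
    return failure_probability * pipeline_minutes * debug_cycles


# Made-up inputs: 75% chance of failure, 12-minute pipeline, 3 fix-and-rerun cycles.
print(estimated_wasted_ci_minutes(0.75, 12, 3))  # → 27.0
```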

Each agent posts its findings as a structured comment on the MR. The Advisor also applies a risk label (diffract:critical, diffract:high, diffract:medium, or diffract:low) to the MR automatically.

We also built a standalone diffRact Skeptic agent available in GitLab Duo Chat for on-demand code reviews outside of the MR workflow.

How we built it

diffRact is built entirely on the GitLab Duo Agent Platform as a custom Flow with four AgentComponent nodes chained via sequential routers. The flow is defined in a single YAML configuration file that specifies each agent's system prompt, available tools, inputs, and routing.

Key technical decisions:

  • Sequential agent routing (Skeptic → Cartographer → Advisor → Accountant) so each agent can build on previous findings. The Advisor reads MR comments from the earlier agents using the list_all_merge_request_notes tool.

  • 11 platform tools across the agents: get_merge_request, list_merge_request_diffs, get_repository_file, list_repository_tree, find_files, grep, gitlab_blob_search, gitlab_issue_search, list_all_merge_request_notes, update_merge_request, and create_merge_request_note.

  • Verification-first approach: The Skeptic doesn't just pattern-match on the diff; it uses grep and gitlab_blob_search to verify whether flagged imports actually exist in the project before reporting them. This reduces false positives.

  • Cross-agent synthesis: The Advisor doesn't repeat earlier findings. It reads them via list_all_merge_request_notes and cross-references them: a critical bug in an isolated file is treated differently from a medium bug in a critical-path module.

  • Automated labeling: The Advisor uses update_merge_request to apply risk labels, making diffRact's assessment visible at a glance in MR lists.
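
The verification-first idea can be sketched outside the agent as a plain import check. This is a minimal illustration, not the Skeptic's implementation (which is a prompted agent calling platform tools); `unverified_imports` and the `known_modules` set are hypothetical, with `known_modules` standing in for whatever the repository tree and dependency manifest confirm.

```python
import ast


def unverified_imports(source: str, known_modules: set[str]) -> list[str]:
    """Return top-level modules imported by `source` that are not in `known_modules`."""
    missing: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Relative imports (level > 0) are skipped: they refer to the project itself.
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level not in known_modules and top_level not in missing:
                missing.append(top_level)
    return missing


# Hypothetical new code from a diff, plus the modules the project is known to provide.
new_code = "import os\nimport cryptoauth\nfrom requests import get\n"
known = {"os", "requests"}
print(unverified_imports(new_code, known))  # → ['cryptoauth']
```

Only imports that cannot be confirmed are reported, which is what keeps the false-positive rate down compared with flagging every unfamiliar name in the diff.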

The model powering all agents is Anthropic Claude Sonnet 4, accessed through GitLab-managed credentials with zero additional configuration.

Challenges we ran into

  • Tool name schema validation: The flow YAML schema changed between our initial 2-agent prototype and the full 4-agent version. We had to debug whether toolset entries needed string format or object format, and discovered some tool names from the agent tools list aren't available in the flow tool mapping.

  • Project ID context: Our first flow execution failed silently because the agent couldn't resolve the project. We learned that context:project_id must be explicitly declared as an input and referenced in the prompt template for tools to work correctly.

  • Prompt engineering for conciseness: Early versions of the agents produced walls of text. We iterated on the prompts to enforce concise, structured output that's scannable in an MR comment thread, since judges and developers both have limited attention.

  • Multi-agent data passing: The platform doesn't have a direct memory pipe between agents in a flow. We solved this by having each agent post to MR comments and having downstream agents read those comments via list_all_merge_request_notes. This is actually better than a hidden pipe because the intermediate results are visible to humans too.
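
The comment-based handoff described above can be sketched as a filter over the MR's notes. This is an illustration under assumptions: the notes are taken to be dicts with a `body` field in oldest-first order, and the `## diffRact: <agent>` header is a hypothetical marker, not necessarily the format diffRact's agents emit.

```python
def findings_by_agent(notes: list[dict], agents: tuple[str, ...]) -> dict[str, str]:
    """Pick the most recent note from each upstream agent, identified by a header marker."""
    latest: dict[str, str] = {}
    for note in notes:  # assumed oldest-first, so later notes overwrite earlier ones
        body = note.get("body", "")
        for agent in agents:
            # Hypothetical marker that each agent is assumed to put on its first line.
            if body.startswith(f"## diffRact: {agent}"):
                latest[agent] = body
    return latest


# Made-up notes on a merge request, including one from a human reviewer.
notes = [
    {"body": "## diffRact: Skeptic\nFound 1 unverified import."},
    {"body": "LGTM from a human reviewer"},
    {"body": "## diffRact: Cartographer\n3 downstream files affected."},
]
upstream = findings_by_agent(notes, ("Skeptic", "Cartographer"))
print(sorted(upstream))  # → ['Cartographer', 'Skeptic']
```

Human comments are ignored by the filter but remain in the same thread, so reviewers see exactly what the downstream agent saw.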

Accomplishments that we're proud of

  • 4-agent sequential orchestration working end-to-end with cross-agent synthesis: not just parallel agents doing independent things, but a genuine pipeline where each agent's output informs the next.

  • Verification-based detection: The Skeptic doesn't hallucinate findings. It searches the actual codebase to confirm whether flagged imports exist before reporting them. In our demo, it correctly identified that cryptoauth doesn't exist anywhere in the project.

  • Calibrated risk assessment: diffRact correctly assessed a bad MR as DO NOT MERGE (10/10 risk) and a clean MR as SAFE TO MERGE (2/10 risk). It's not a false-alarm machine; it calibrates to actual risk.

  • Automated MR labeling: The Advisor automatically applies risk labels, making diffRact's assessment visible in MR lists without opening each MR.

  • Green Agent integration: The Accountant provides concrete sustainability metrics: estimated wasted CI minutes, pipeline failure probability, and actionable recommendations for reducing compute waste.

What we learned

  • The GitLab Duo Agent Platform is surprisingly powerful for building multi-agent workflows. The combination of YAML-defined flows, built-in tools, and Anthropic Claude creates a rapid development loop.

  • Agent orchestration through MR comments (rather than hidden state) is actually a feature, not a limitation. It makes the entire reasoning chain transparent and auditable.

  • Prompt engineering matters more than architecture in agent systems. The difference between a useful agent and a noisy one comes down to how precisely you instruct it to verify findings and format output.

  • The "AI Paradox" is real. As we used AI tools to help build diffRact itself, we experienced firsthand the review bottleneck that diffRact is designed to solve.

What's next for diffRact

  • Additional demo scenarios: MRs with mixed good/bad changes, multi-file refactors, and dependency version bumps.

  • Configurable sensitivity: Allow teams to tune what The Skeptic looks for based on their codebase and risk tolerance.

  • Pipeline integration: Auto-trigger diffRact on every MR via CI/CD pipeline hooks instead of manual @-mentions.

  • Historical learning: Track which diffRact findings led to actual fixes vs. dismissals, and use that signal to improve detection accuracy over time.

  • Expanded Accountant: Integrate with actual CI pipeline metrics for more accurate compute waste estimates rather than heuristic-based estimation.

Built With

  • GitLab Duo Agent Platform (Custom Flows)
  • Anthropic Claude Sonnet 4 (via GitLab-managed credentials)
  • Python (demo application)
  • YAML (flow and agent configuration)
