Inspiration

Every developer knows the feeling: you open an issue, read the description, and then spend more time just getting oriented.

What it does

Issue-to-MR Auto-Triage analyzes newly created GitLab issues and automatically opens a merge request with a targeted fix or production-ready scaffold, so no developer ever starts from a blank slate.

How we built it

We started by identifying the core problem with single-prompt approaches to code generation: asking one agent to read an issue, understand a codebase, make architectural decisions, and write production-quality code in a single pass produces inconsistent results. The reasoning and the generation interfere with each other.

The solution was a two-stage flow on the GitLab Duo Agent Platform. The first stage — the Issue Analyzer — is given only read tools and one job: produce a structured report. It reads the issue, detects the tech stack by reading every manifest file present in the repository, discovers the project's style rules by reading linter and formatter configs (covering over 20 formats across all major languages), searches the codebase for affected files using class names and route paths from the issue as grep terms, and outputs a structured document with explicit sections: ISSUE_SUMMARY, STACK, STYLE_CONFIG, ISSUE_TYPE, COMPLEXITY, AFFECTED_FILES, APPROACH, and ASSUMPTIONS.
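The stack-detection step can be illustrated with a minimal sketch. It assumes a hypothetical mapping from manifest file names to stacks and looks only at file names, whereas the analyzer described above also reads the manifest contents. `MANIFEST_STACKS` and `detect_stacks` are names invented for this example.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from manifest file name to the stack it implies.
# The real analyzer reads each manifest's contents; this sketch only
# matches on the file name.
MANIFEST_STACKS = {
    "package.json": "Node.js",
    "pyproject.toml": "Python",
    "requirements.txt": "Python",
    "Gemfile": "Ruby",
    "go.mod": "Go",
    "composer.json": "PHP",
}


def detect_stacks(repo_paths):
    """Return the sorted list of stacks implied by manifest files in repo_paths."""
    found = set()
    for path in repo_paths:
        name = PurePosixPath(path).name
        if name in MANIFEST_STACKS:
            found.add(MANIFEST_STACKS[name])
    return sorted(found)


paths = ["go.mod", "cmd/server/main.go", "web/package.json", "README.md"]
print(detect_stacks(paths))  # ['Go', 'Node.js']
```

Checking every manifest rather than stopping at the first match is what lets a repository with both a Go service and a Node.js front end report both stacks.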

The second stage — the MR Creator — receives both the original issue context and the analyzer's report. It has no search ambiguity to resolve and no stack to detect. It reads the affected files, applies the fix-vs-scaffold decision based on the complexity classification, and writes code that matches the project's style exactly — following the detected config as the authority, falling back to existing file conventions, and only applying language defaults as a last resort.
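The style priority described above (detected config first, then existing file conventions, then language defaults) can be sketched as a simple layered lookup. This is an illustration only: in the project the ordering lives in the prompt, and the function name, setting names, and values below are all hypothetical.

```python
def resolve_style(setting, project_config, file_conventions, language_defaults):
    """Return (value, source) for a style setting, honoring the priority order.

    Priority: detected project config, then conventions observed in existing
    files, then language defaults as a last resort.
    """
    layers = [
        ("project_config", project_config),
        ("file_conventions", file_conventions),
        ("language_defaults", language_defaults),
    ]
    for source, layer in layers:
        if setting in layer:
            return layer[setting], source
    raise KeyError(setting)


# Hypothetical inputs for a TypeScript project.
config = {"quotes": "single"}
conventions = {"quotes": "double", "indent": 2}
defaults = {"quotes": "double", "indent": 4, "semicolons": True}

print(resolve_style("quotes", config, conventions, defaults))      # ('single', 'project_config')
print(resolve_style("indent", config, conventions, defaults))      # (2, 'file_conventions')
print(resolve_style("semicolons", config, conventions, defaults))  # (True, 'language_defaults')
```

Returning the source alongside the value makes it easy to see which layer decided each setting.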

Each stage has its own focused toolset. The analyzer has read-only tools. The MR Creator has write tools. This separation prevents the generator from taking shortcuts by skipping the analysis, and it prevents the analyzer from drifting into implementation.
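As a rough illustration of this separation, the sketch below models each stage's toolset as an allowlist. The stage keys and tool names are hypothetical placeholders, not the GitLab Duo Agent Platform's actual tool identifiers, and the platform enforces toolsets through the flow definition rather than through code like this.

```python
# Hypothetical tool names; the platform's real identifiers may differ.
STAGE_TOOLS = {
    "issue_analyzer": frozenset({"get_issue", "read_file", "list_files", "search_code"}),
    "mr_creator": frozenset({"read_file", "write_file", "create_branch", "create_merge_request"}),
}

WRITE_TOOLS = frozenset({"write_file", "create_branch", "create_merge_request"})


def check_tool(stage, tool):
    """Return True if the stage may call the tool, otherwise raise PermissionError."""
    if tool not in STAGE_TOOLS[stage]:
        raise PermissionError(f"{stage} may not call {tool}")
    return True


# The analyzer's toolset contains no write tools at all.
print(STAGE_TOOLS["issue_analyzer"].isdisjoint(WRITE_TOOLS))  # True
print(check_tool("mr_creator", "create_merge_request"))       # True
```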

The agent prompt architecture was designed around Claude's strength in structured output generation. The fixed section headers in the analyzer's output mean the MR Creator always receives well-formed, predictable context rather than free-form reasoning that may or may not contain the information it needs.
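To show why fixed headers help, here is a minimal sketch of how a consumer could split and validate the report. The section names come from the analyzer's output format; the exact layout (each header on its own line followed by a colon) is an assumption made for this example, as is the `parse_report` helper.

```python
SECTIONS = [
    "ISSUE_SUMMARY", "STACK", "STYLE_CONFIG", "ISSUE_TYPE",
    "COMPLEXITY", "AFFECTED_FILES", "APPROACH", "ASSUMPTIONS",
]


def parse_report(text):
    """Split a report into {section: body}; raise ValueError if any section is missing.

    Assumes each section starts with a line of the form "HEADER:".
    """
    report = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        header = stripped.rstrip(":")
        if stripped.endswith(":") and header in SECTIONS:
            current = header
            report[current] = []
        elif current is not None:
            report[current].append(line)
    missing = [name for name in SECTIONS if name not in report]
    if missing:
        raise ValueError("missing sections: " + ", ".join(missing))
    return {name: "\n".join(body).strip() for name, body in report.items()}


# Placeholder report body, for illustration only.
sample = "\n".join(f"{name}:\nplaceholder for {name.lower()}" for name in SECTIONS)
report = parse_report(sample)
print(report["COMPLEXITY"])  # placeholder for complexity
```

A free-form analysis could not be checked this way. With fixed headers, a missing section is detectable before any code is generated.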

To validate the approach, we built five demonstration examples across TypeScript, Python, Ruby, Go, and PHP — each containing real broken code and a realistic issue. These were pushed to the repository so the agent could be triggered against live issues and produce verifiable merge requests.

Challenges we ran into

The biggest challenge was the flow's inter-stage communication. The YAML schema for passing output from one component to the next was underdocumented, and the only reference was the starter template. Getting the variable binding between the analyzer and the MR Creator working correctly required several push-and-test cycles since there was no way to run the flow locally.

The second challenge was prompt design for the style detection system. Getting the MR Creator to consistently prioritize the project's own config over its training priors, especially for less common languages, required explicit priority ordering in the prompt rather than a general instruction to "match the project's style."

Finally, calibrating the fix-vs-scaffold threshold took iteration. Too aggressive and the agent scaffolds things it should just fix. Too conservative and it attempts complex multi-file refactors that exceed what a first-pass MR should do. The final version uses a three-point complexity scale tied to file count and solution ambiguity, which produced the most consistent results across the five demonstration examples.
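A three-point scale of this kind could look like the sketch below. The thresholds and the rule that only the highest level triggers a scaffold are illustrative assumptions, not the calibrated values from the project, and in the project the classification is made by the analyzer's prompt rather than by code.

```python
def complexity(file_count, ambiguous):
    """Return 1 (simple), 2 (moderate), or 3 (complex).

    Thresholds are illustrative: an ambiguous solution or more than three
    affected files counts as complex; more than one file counts as moderate.
    """
    if ambiguous or file_count > 3:
        return 3
    if file_count > 1:
        return 2
    return 1


def choose_mode(level):
    """Map a complexity level to 'fix' or 'scaffold' (assumed mapping)."""
    return "scaffold" if level >= 3 else "fix"


for files, ambiguous in [(1, False), (2, False), (1, True), (5, False)]:
    level = complexity(files, ambiguous)
    print(files, ambiguous, level, choose_mode(level))
```

Moving the thresholds up makes the agent attempt more direct fixes; moving them down makes it scaffold more often, which is the trade-off described above.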

Accomplishments that we're proud of

The thing we're most proud of is the quality bar on the generated code. It was easy to build an agent that creates a merge request — it's much harder to build one that creates a merge request you'd actually want to merge. Every output matches the project's existing style, handles errors explicitly, avoids N+1 queries, and validates inputs. That consistency across five different languages and frameworks, without any per-language configuration, is a direct result of the two-stage architecture and the style detection system.

We're also proud of the scaffold mode. The TODO format — where every placeholder explains the decision needed, the options available, and the risk of each — treats the developer as a decision-maker rather than an implementor. That distinction matters. A good scaffold accelerates a developer; a bad one creates more work than starting fresh.
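One way to picture the TODO format is a small renderer that emits the decision, each option, and the risk of each as comment lines. The layout, the `render_todo` helper, and the example decision are hypothetical; the agent's actual wording may differ.

```python
def render_todo(decision, options, comment_prefix="#"):
    """Render a scaffold TODO as comment lines.

    options is a list of (option, risk) pairs; every option carries its risk
    so the developer can decide without re-deriving the trade-offs.
    """
    lines = [f"{comment_prefix} TODO: {decision}"]
    for index, (option, risk) in enumerate(options, start=1):
        lines.append(f"{comment_prefix}   Option {index}: {option}")
        lines.append(f"{comment_prefix}     Risk: {risk}")
    return "\n".join(lines)


# Hypothetical placeholder content.
print(render_todo(
    "Choose how to invalidate cached reports",
    [
        ("Expire entries after a fixed TTL", "stale data until the TTL elapses"),
        ("Invalidate on every write", "extra load on write-heavy endpoints"),
    ],
))
```

The `comment_prefix` parameter lets the same structure be emitted in languages that use `//` comments.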

The live demonstration proved the concept works on real code. The agent detected the IDOR vulnerability in the Laravel controller, created a new policy class that didn't exist before, wired it correctly, and produced a testing checklist that covered the edge case we hadn't explicitly mentioned in the issue.

What we learned

Separation of concerns matters as much in AI pipelines as it does in application code. A single agent doing everything produces mediocre results across the board. Two agents doing one thing each, and doing it well, produce results that are consistently better than either could achieve alone.

We also learned that grounding is everything. The quality gap between "write a fix for this issue" and "here is the exact file, the exact style rules, and a structured analysis of the problem — now write the fix" is enormous. The analyzer's structured report is not overhead; it is the reason the MR Creator produces code worth reviewing.

Finally, prompt stability under distribution shift matters for a tool like this. The agent needs to behave predictably on a Next.js project, a Django monolith, and a Go microservice. Achieving that required explicit priority rules rather than relying on general instructions, and testing against diverse real codebases rather than synthetic examples.

What's next for Issue-to-MR Auto-Triage

The immediate next step is a webhook trigger so the flow activates automatically when an issue is opened — removing the manual Duo Chat invocation entirely.

Beyond that, we want to add a third stage: a Review Agent that runs after the developer pushes changes to the branch, reads the diff, and checks whether the implementation matches the original approach from the analysis report. This closes the loop from issue to implementation to verification.

Longer term, the agent could learn from accepted and rejected merge requests within a project to improve its fix-vs-scaffold calibration and better match the team's preferences over time. The structured output format from the analyzer makes this tractable — every decision is already documented and attributable.

Built With

  • gitlab-ai-catalog
  • gitlab-duo-agent-platform
  • gitlab-duo-chat
  • gitlab-issues-api
  • gitlab-merge-request-api
  • yaml