The Inspiration

Every developer knows the feeling: a new issue lands in the backlog, and before a single line of code is written, there is a codebase to explore, a plan to draft, a branch to create, security concerns to flag, and a Merge Request to open. None of that is engineering. All of it slows teams down.

We wanted to build something that doesn't just assist developers but acts on their behalf, handling the coordination overhead so engineers can stay in flow.

🏗️ How We Built It

Arix Pilot is built entirely on the GitLab Duo Agent Platform using a custom agent.yml and flow.yml configuration. The agent is powered by Anthropic Claude via the GitLab AI Gateway and leverages 16 tools from the GitLab AI Catalog including get_issue, edit_file, create_commit, run_tests, and create_merge_request.

The Two-Step Architecture

The core innovation is a strict Plan → Approve → Execute framework:

Step 1: Planning

When a developer mentions @ai-arix-pilot-gitlab-ai-hackathon on any GitLab issue, Arix Pilot activates. It recursively scans the repository using list_dir, grep, and read_files, then posts a structured five-section implementation plan directly as an issue comment:

  • 🧭 Understanding — What the issue is actually asking
  • 📁 Files to Touch — Exact files identified from codebase exploration
  • 🗂 Subtasks — Numbered execution steps
  • ⚠️ Security & Compliance Flags — Vulnerabilities flagged before any code is written
  • Recommended First Step — Immediate momentum for the developer

The agent then explicitly halts and waits for human approval.
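
As a rough illustration of the five-section plan, the sketch below renders one as Markdown text for an issue comment. The `Plan` class, its field names, the heading markup, and the sample values are all hypothetical; the actual agent produces its plan from the system prompt, and its exact output format is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """Hypothetical container for the five plan sections."""
    understanding: str
    files_to_touch: list[str]
    subtasks: list[str]
    security_flags: list[str]
    first_step: str

    def to_comment(self) -> str:
        """Render the plan as Markdown text suitable for an issue comment."""
        parts = [
            "## Understanding",
            self.understanding,
            "## Files to Touch",
            "\n".join(f"- `{path}`" for path in self.files_to_touch),
            "## Subtasks",
            "\n".join(f"{i}. {task}" for i, task in enumerate(self.subtasks, 1)),
            "## Security & Compliance Flags",
            # Fall back to an explicit "none" line when nothing was flagged.
            "\n".join(f"- {flag}" for flag in self.security_flags)
            or "- None identified",
            "## Recommended First Step",
            self.first_step,
        ]
        return "\n\n".join(parts)


# Illustrative values only; not taken from a real issue.
plan = Plan(
    understanding="Add input validation to the signup form.",
    files_to_touch=["app/forms.py", "tests/test_forms.py"],
    subtasks=["Add an email validator", "Cover it with a unit test"],
    security_flags=[],
    first_step="Read app/forms.py to find the existing validators.",
)
print(plan.to_comment())
```

Keeping the sections in a fixed order makes the comment easy for a reviewer to skim before deciding whether to approve.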

Step 2: Execution

Once the developer replies with approve, Arix Pilot wakes up and executes autonomously: editing files, committing to a named branch, opening a Merge Request with a full description, and posting a completion comment on the original issue.
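
The Plan → Approve → Execute gate can be pictured as a small state machine. The sketch below is a simplified model, not the platform's flow engine: the `Phase` and `IssueSession` names are hypothetical, and it assumes approval is detected as a comment whose entire text is `approve`, which may differ from how the real agent matches replies.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

MENTION = "@ai-arix-pilot-gitlab-ai-hackathon"


class Phase(Enum):
    IDLE = auto()               # no plan requested yet
    AWAITING_APPROVAL = auto()  # plan posted, agent halted
    DONE = auto()               # changes committed and MR opened


@dataclass
class IssueSession:
    """Hypothetical per-issue state; records which steps ran, in order."""
    phase: Phase = Phase.IDLE
    log: list[str] = field(default_factory=list)

    def on_comment(self, body: str) -> None:
        text = body.strip().lower()
        if self.phase is Phase.IDLE and MENTION in text:
            # Step 1: explore the repository and post the plan, then halt.
            self.log.append("plan_posted")
            self.phase = Phase.AWAITING_APPROVAL
        elif self.phase is Phase.AWAITING_APPROVAL and text == "approve":
            # Step 2: only an explicit approval unlocks execution.
            self.log.append("executed")
            self.phase = Phase.DONE
        # Any other comment leaves the state unchanged.


session = IssueSession()
session.on_comment("approve")                    # ignored: no plan exists yet
session.on_comment(f"{MENTION} please take this")
session.on_comment("looks interesting")          # ignored: not an approval
session.on_comment("approve")
print(session.phase.name, session.log)
```

The point of the model is that execution is unreachable from the idle state: the only path to it runs through a posted plan and an explicit human reply.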

🌿 The Green Agent Design

Autonomous agents can fall into expensive "infinite validation loops", burning GPU compute cycles and context windows while trying to self-correct hallucinated code. Arix Pilot is designed to avoid this.

By forcing a hard stop at the planning phase, the human validates the architecture before expensive coding tools are ever invoked. A flawed plan is rejected before any compute is spent implementing it, which reduces compute consumption per resolved issue and makes AI-assisted development more financially and ecologically sustainable at enterprise scale.

🧠 What We Learned

  • Building on the GitLab Duo Agent Platform requires thinking in events and tools, not chat interactions
  • System prompt precision is everything — vague instructions cause infinite tool-call loops
  • The human-in-the-loop checkpoint isn't a limitation; it's a feature that makes the agent trustworthy in production environments
  • Claude's reasoning capability is what makes multi-file codebase navigation reliable — lesser models hallucinate file paths and variable names

⚡ Challenges We Faced

  • Silent pipeline failures — GitLab's flow engine would fail without error messages, requiring systematic tool name verification against the official tool_mapping.json
  • Project context confusion — The agent would loop trying to resolve the correct project ID until we hardcoded the context directly into the system prompt
  • Tool name mismatches — list_files does not exist; the correct tool is list_dir. Small errors like this caused complete session failures
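
The tool-name verification described above can be automated with a small pre-flight check. This is a minimal sketch under stated assumptions: it treats tool_mapping.json as a JSON object whose keys are the valid tool names, which may not match the real file's layout, and the embedded excerpt and the `find_unknown_tools` helper are hypothetical.

```python
import json

# Hypothetical excerpt; the real tool_mapping.json layout may differ.
TOOL_MAPPING_JSON = """
{
  "list_dir": {},
  "grep": {},
  "read_files": {},
  "get_issue": {},
  "edit_file": {},
  "create_commit": {},
  "run_tests": {},
  "create_merge_request": {}
}
"""


def find_unknown_tools(configured: list[str], mapping_json: str) -> list[str]:
    """Return configured tool names that are absent from the mapping."""
    known = set(json.loads(mapping_json))  # keys of the top-level object
    return [name for name in configured if name not in known]


# Tool list as it might appear in an agent config, including the typo
# mentioned above (list_files instead of list_dir).
configured_tools = ["get_issue", "list_files", "grep", "edit_file"]
unknown = find_unknown_tools(configured_tools, TOOL_MAPPING_JSON)
if unknown:
    print("Unknown tool names:", ", ".join(unknown))
```

Running a check like this before starting a session turns a silent failure into an explicit message naming the offending tool.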
