DeployReady

5 agents, 4 files analyzed, tests auto-committed, RCE vulnerability detected, risk score 85/100 — DO NOT MERGE verdict posted to MR !3.
All 5 agents ran in 3m39s. The agent autonomously committed tests to the feature branch, triggering its own passing CI pipeline.

Inspiration

I've seen too many production incidents that started with a "small change." A developer tweaks a core discount calculation function, the unit tests pass, CI goes green, and the MR gets merged. Three days later, invoices are wrong, carts are broken, and the post-mortem reveals that nobody checked which downstream files actually relied on that function.

The problem isn't that engineers are careless — it's that tracing downstream impact manually is tedious and error-prone, especially under review pressure. I wanted to build something that does that invisible work automatically: map every file that could break, check if those files have tests for the changed behavior, write the missing tests, and flag security issues — all before a human reviewer even opens the MR.

What it does

DeployReady is a five-agent AI orchestration flow that acts as an autonomous DevSecOps gatekeeper. Triggered by mentioning @ai-deployready-gitlab-ai-hackathon in any Merge Request, it launches a sequential pipeline of specialized agents:

Blast Radius Agent: Maps all downstream files affected by the code diff.
Coverage Intelligence Agent: Statically analyzes the repository to find missing tests for the affected files.
Test Generation Agent: Autonomously writes and commits missing pytest integration tests directly to the feature branch.
Risk Scoring Agent: Analyzes the code for lethal security patterns (like eval()) and calculates a deployment risk score.
Verdict Agent: Synthesizes the data into a master markdown report and blocks or approves the Merge Request.
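
To make the blast radius idea concrete, here is a minimal sketch of a reverse-dependency walk. This is a conceptual illustration only: the Blast Radius Agent does this work through LLM reasoning and repository search tools, not through this code. The file names and the `imports` map are hypothetical.

```python
from collections import deque


def blast_radius(changed, imports):
    """Return every file that directly or transitively depends on a changed file.

    `imports` is a hypothetical map of file -> files it imports.
    """
    # Invert the import graph so we can ask "who depends on this file?"
    dependents = {}
    for src, targets in imports.items():
        for target in targets:
            dependents.setdefault(target, set()).add(src)

    # Breadth-first walk outward from the changed files.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dep in dependents.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)

    # Report only the downstream files, not the changed files themselves.
    return sorted(seen - set(changed))


# Hypothetical repository layout for illustration.
imports = {
    "cart.py": ["pricing.py"],
    "invoice.py": ["pricing.py"],
    "checkout.py": ["cart.py"],
    "pricing.py": [],
    "utils.py": [],
}
affected = blast_radius(["pricing.py"], imports)
print(affected)
```

In this toy layout, a change to `pricing.py` reaches `checkout.py` even though `checkout.py` never imports it directly, which is exactly the kind of indirect dependency that is easy to miss in manual review.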

How I built it

DeployReady is built on the GitLab Duo Agent Platform, structured as a Directed Acyclic Graph (DAG) of agents powered by Anthropic's Claude Sonnet.

To make the Risk Scoring Agent objective, I engineered it to evaluate vulnerabilities and blast radius using a weighted formula. The underlying risk calculation conceptually follows this model:

$$Risk = \min\left(100, \alpha \cdot B + \sum_{i=1}^{n} (w_i \cdot s_i)\right)$$

Where:

$\alpha$ is the baseline blast radius multiplier.
$B$ is the total BLAST_RADIUS_COUNT (files affected).
$n$ is the number of vulnerabilities detected.
$w_i$ is the contextual weight of the $i$-th vulnerability (e.g., higher for financial logic).
$s_i$ is the base severity score of the $i$-th vulnerability (e.g., eval() = Critical).
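
As a worked illustration, the formula translates directly into a few lines of Python. The numeric values below are hypothetical and chosen only to show the arithmetic and the cap at 100; they are not the weights or severities DeployReady actually uses.

```python
def risk_score(alpha, blast_radius_count, findings):
    """Risk = min(100, alpha * B + sum(w_i * s_i)).

    `findings` is a list of (weight, severity) pairs, one per vulnerability.
    """
    weighted = sum(w * s for w, s in findings)
    return min(100.0, alpha * blast_radius_count + weighted)


# Hypothetical inputs: alpha = 5, B = 4 files, one finding with w = 1.5, s = 40.
score = risk_score(alpha=5.0, blast_radius_count=4, findings=[(1.5, 40.0)])
print(score)  # 5*4 + 1.5*40 = 80.0

# A second finding pushes the raw sum to 110, so the cap applies.
capped = risk_score(5.0, 4, [(1.5, 40.0), (1.0, 30.0)])
print(capped)  # 100.0
```

The `min(100, ...)` cap keeps the score on a fixed 0 to 100 scale no matter how many findings accumulate.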

Challenges I faced

The biggest challenge was a platform-level debugging problem that consumed most of my hackathon. Every session showed workflowStatus: CREATED from start to finish, with no agents executing and no error messages.

After days of debugging with GitLab engineers in the hackathon Discord, I found the root cause: the placeholder: history key was missing from my prompt templates. This gave the ReAct agents "amnesia," preventing them from executing. Because this didn't fail schema validation, it only revealed itself through deep server-side log inspection.
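
Since schema validation did not catch this, a small pre-flight check could. The sketch below assumes the flow definition has already been parsed into Python dictionaries; the `prompt_id` and `prompt_template` layout and the prompt names are assumptions for illustration, and only the `placeholder: history` key itself comes from the debugging story above.

```python
def prompts_missing_history(prompts):
    """Return the ids of prompts whose template lacks `placeholder: history`.

    `prompts` is a list of dicts in an assumed (hypothetical) layout.
    """
    missing = []
    for prompt in prompts:
        template = prompt.get("prompt_template", {})
        if template.get("placeholder") != "history":
            missing.append(prompt.get("prompt_id", "<unnamed>"))
    return missing


# Hypothetical parsed prompt definitions.
prompts = [
    {
        "prompt_id": "blast_radius",
        "prompt_template": {"system": "...", "user": "...", "placeholder": "history"},
    },
    {
        "prompt_id": "risk_scoring",
        "prompt_template": {"system": "...", "user": "..."},
    },
]
missing = prompts_missing_history(prompts)
print(missing)
```

Running a check like this in CI before registering a flow would turn a silent stall into an immediate, named failure.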

The second major challenge was preventing agent infinite loops. Once the agents could execute, those with access to read tools (gitlab_blob_search, get_repository_file) would make redundant calls, retry endlessly on empty results, and eventually time out. The fix was injecting ruthless, explicit tool-call budgets into every system prompt (e.g., "CRITICAL: MAXIMUM OF 3 TOOL CALLS ALLOWED"), which stopped the runaway retries and kept each agent's behavior bounded and predictable.
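
The budget above lives in the prompt, so the model is trusted to respect it. For comparison, the same idea can be expressed as a hard guard in code, which is a different technique from the one I used. This is a minimal sketch: `BudgetedTools`, `ToolBudgetExceeded`, and the stubbed search function are all hypothetical and not part of the GitLab Duo Agent Platform.

```python
class ToolBudgetExceeded(RuntimeError):
    """Raised when an agent tries to exceed its tool-call budget."""


class BudgetedTools:
    """Wrap a dict of tool callables and refuse calls past a fixed budget."""

    def __init__(self, tools, max_calls=3):
        self._tools = tools
        self._max_calls = max_calls
        self.calls_made = 0

    def call(self, name, *args, **kwargs):
        if self.calls_made >= self._max_calls:
            raise ToolBudgetExceeded(
                f"budget of {self._max_calls} tool calls exhausted"
            )
        self.calls_made += 1
        return self._tools[name](*args, **kwargs)


def fake_blob_search(query):
    """Stand-in for a search tool that keeps returning empty results."""
    return []


tools = BudgetedTools({"gitlab_blob_search": fake_blob_search}, max_calls=3)
for _ in range(3):
    tools.call("gitlab_blob_search", "apply_discount")

# A fourth call is rejected instead of looping forever on empty results.
try:
    tools.call("gitlab_blob_search", "apply_discount")
    budget_hit = False
except ToolBudgetExceeded:
    budget_hit = True
print(tools.calls_made, budget_hit)
```

A code-level guard fails loudly where a prompt-level budget can only ask, but it requires control over the tool dispatch layer, which a hosted platform may not expose.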

Accomplishments that I'm proud of

Defeating the infinite loops and getting all five agents to hand off context flawlessly was a massive win. In the final demo run, the flow completed in 3m39s: it accurately mapped a 4-file blast radius, autonomously committed 96 lines of integration tests to the branch, detected a hidden eval() RCE vulnerability, and blocked the merge with a DO NOT MERGE verdict.

What I learned

This project taught me more about agentic orchestration than any tutorial could. The GitLab Duo Agent Platform is incredibly powerful but unforgiving, and the gap between "the schema validates" and "the flow executes" can be enormous. I also learned that prompt engineering for multi-agent pipelines is a discipline of its own: you must treat agents like strict functions with hard constraints, not open-ended conversational bots.

What's next for DeployReady

The next evolution of DeployReady will include:

Multi-Language Support: Expanding static coverage analysis beyond Python/pytest to Node.js and Go.
Pipeline Integration: Allowing the Risk Scoring agent to ingest actual artifacts from GitLab SAST/DAST jobs rather than relying solely on static LLM analysis.
Auto-Remediation Agent: Adding a 6th agent that doesn't just flag security vulnerabilities like eval(), but actively commits the sanitized fix alongside the generated tests.

Built With

claude-sonnet-4.6
gitlab-agent-platform-yaml
gitlab-ci/cd
gitlab-duo-agent-platform
pytest
python

Updates

Chingkhei Yumkhaibam started this project — Mar 24, 2026 11:22 PM EDT
