SecPatch

Commit message on the branch SecPatch created, containing all the fixes
SecPatch's review once the run finishes, with a summary of findings and a table of vulnerabilities
Attack chain analysis by SecPatch, explaining in more depth how it exploited the vulnerabilities
All findings, with details on how SecPatch found and exploited each one
All code changes made by SecPatch

What it does

SecPatch is an AI pentesting agent that lives inside GitLab. Mention it on any merge request and it runs a full security review. It traces data flows through your code, finds vulnerabilities, writes proof-of-concept exploits, maps findings to CWE/CVSS, and pushes fix commits. All automatic, all within GitLab.

It's built as a 2-agent pipeline on the GitLab Duo Agent Platform, powered by Anthropic Claude:

Scanner (read-only) — reads the MR diff, traces user input from source to sink, classifies by CWE, scores CVSS 3.1, generates curl-based PoCs, and detects multi-step attack chains. Zero write tools. Even if there's prompt injection in the code being reviewed, this agent can't modify the repo.
Fixer (write access) — takes the Scanner's structured JSON output, posts a pentest report as an MR comment, pushes remediation commits with language-idiomatic fixes, and labels the MR by severity.
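To make the "scores CVSS 3.1" and "labels the MR by severity" steps concrete, here is a minimal sketch of the CVSS v3.1 base score arithmetic and the qualitative rating scale from the public specification. In SecPatch the scoring is done by the Scanner agent itself; the function names below are illustrative and are not part of the project.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
# Privileges Required weights depend on whether Scope is unchanged (U) or changed (C).
PR = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a vector like 'CVSS:3.1/AV:N/AC:L/...'."""
    # Skip the 'CVSS:3.1' prefix, then split each 'KEY:VALUE' pair.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    scope = metrics["S"]
    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (
        8.22 * AV[metrics["AV"]] * AC[metrics["AC"]]
        * PR[scope][metrics["PR"]] * UI[metrics["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


def severity(score: float) -> str:
    """Map a base score to the v3.1 qualitative severity rating."""
    if score == 0:
        return "none"
    if score < 4.0:
        return "low"
    if score < 7.0:
        return "medium"
    if score < 9.0:
        return "high"
    return "critical"
```

For example, an unauthenticated network-reachable bug with high impact on confidentiality, integrity, and availability (`CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H`) scores 9.8, which rates as critical.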

Inspiration

SAST tools pattern-match. They flag possible issues but can't prove any of them are exploitable. They miss business logic bugs entirely: IDOR, race conditions, mass assignment, broken access control. None of that shows up in a regex scan.

And they never fix anything. Developers get a wall of warnings, ignore most of them, and real vulnerabilities ship to production.

I wanted something that thinks like an attacker. Something that traces how user input actually flows through code, proves vulnerabilities are real with working exploits, and fixes them without waiting for a human to get around to it.

How I built it

Two custom agents orchestrated through a GitLab Duo Flow:

The Scanner agent gets read-only tools (get_merge_request, list_merge_request_diffs, read_file). It performs recon first, then deep analysis. It traces input-to-sink flows, classifies by CWE, scores CVSS 3.1, writes PoC curl commands, and detects attack chains where individual vulns combine into bigger problems. Outputs structured JSON.
The Fixer agent receives that JSON and gets write tools (create_merge_request_note, edit_file, create_commit, update_merge_request). It formats a pentest report, posts it on the MR, pushes fix commits, and labels the MR by severity.
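The handoff between the two agents can be sketched as a small validation step. The exact JSON schema isn't shown in this writeup, so the field names below (`findings`, `id`, `cwe`, `cvss_score`, `file`, `poc`) are assumptions chosen for illustration, not the schema SecPatch actually uses.

```python
import json

# Hypothetical field names; the real Scanner output may differ.
REQUIRED_FIELDS = {"id", "cwe", "cvss_score", "file", "poc"}

# Example of what one Scanner finding might look like under these assumptions.
SAMPLE = """
{
  "findings": [
    {
      "id": "F-1",
      "cwe": "CWE-639",
      "cvss_score": 6.5,
      "file": "app/routes.py",
      "poc": "curl http://localhost:5000/api/orders/2"
    }
  ]
}
"""


def parse_handoff(raw: str) -> list[dict]:
    """Parse the Scanner's JSON and reject findings with missing fields."""
    data = json.loads(raw)
    findings = data["findings"]
    for index, finding in enumerate(findings):
        missing = REQUIRED_FIELDS - set(finding)
        if missing:
            raise ValueError(f"finding {index} is missing fields: {sorted(missing)}")
    return findings
```

Checking the payload before the Fixer acts on it means a malformed Scanner response fails loudly at the boundary instead of producing a half-written report.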

Splitting into two agents enforces least privilege. The agent reading potentially malicious code can't write to the repo. That's not just good practice; it also makes debugging easier, because you can inspect the JSON passed between agents.
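As a rough illustration of the least-privilege split, the sketch below gates every tool call on a per-agent allowlist. The tool names are the ones listed above; the `call_tool` dispatcher and the `registry` argument are hypothetical. On the GitLab Duo Agent Platform the tool lists live in each agent's configuration, not in Python code like this.

```python
# Tool names come from the agent descriptions above.
SCANNER_TOOLS = frozenset({
    "get_merge_request",
    "list_merge_request_diffs",
    "read_file",
})
FIXER_TOOLS = frozenset({
    "create_merge_request_note",
    "edit_file",
    "create_commit",
    "update_merge_request",
})

AGENT_TOOLS = {"scanner": SCANNER_TOOLS, "fixer": FIXER_TOOLS}


def call_tool(agent: str, tool: str, registry: dict, **kwargs):
    """Run a tool only if it is on the calling agent's allowlist.

    `registry` is a hypothetical mapping from tool name to a callable.
    """
    if tool not in AGENT_TOOLS[agent]:
        raise PermissionError(f"{agent} may not call {tool}")
    return registry[tool](**kwargs)
```

With this shape, a prompt injection that convinces the Scanner to request `edit_file` is rejected before any tool code runs, because the check depends only on the agent's identity and not on anything the model says.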

What it finds

Tested across 4 intentionally vulnerable apps in Python (Flask), JavaScript (Express), and Java (Spring Boot):

27 vulnerabilities found across IDOR, race conditions, command injection, SQLi, SSRF, path traversal, JWT confusion, mass assignment, and more
8 multi-step attack chains detected
100% of findings auto-remediated with committed fixes
Under 5 minutes per MR review

Challenges

Missing project context in triggers. The GitLab Duo Agent Platform doesn't pass project_id through the trigger context, only the MR IID. The Scanner had no way to know which project to read from and would loop. Solved by hardcoding a default project_id in the flow prompt.
Branch name hallucination. The Fixer agent sometimes invented branch names instead of using the MR's source branch, which made commits fail. Solved by relaxing the constraint: if committing to the source branch fails, the Fixer creates a new branch, so the fix still lands.
Comment length limits. Keeping the MR comment under GitLab's character limit while including all findings, PoCs, and compliance mapping required careful template design.
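The three workarounds above can be sketched in plain Python. In SecPatch they are implemented through the flow prompt and the report template, so everything here is illustrative: the function names, the `commit` callable, and the `max_chars` value are assumptions, and the real character limit should be taken from GitLab's documentation.

```python
def resolve_project_id(trigger_context: dict, default_project_id: int) -> int:
    """Use the project ID from the trigger if present, else a configured default."""
    return trigger_context.get("project_id") or default_project_id


def commit_with_fallback(commit, source_branch: str, fallback_branch: str) -> str:
    """Try the MR's source branch first; if that fails, commit to a new branch.

    `commit` is a hypothetical callable that raises RuntimeError on failure.
    Returns the branch that received the commit.
    """
    try:
        commit(source_branch)
        return source_branch
    except RuntimeError:
        commit(fallback_branch)
        return fallback_branch


def fit_comment(summary: str, sections: list[str], max_chars: int) -> str:
    """Append whole finding sections in order until the character budget runs out.

    Assumes the summary alone fits comfortably within max_chars.
    """
    note = "\n\n({} more finding(s) omitted for length)"
    # Reserve room for the longest note that could ever be appended.
    reserve = len(note.format(len(sections)))
    body = summary
    for index, section in enumerate(sections):
        candidate = body + "\n\n" + section
        is_last = index == len(sections) - 1
        budget = max_chars if is_last else max_chars - reserve
        if len(candidate) > budget:
            return body + note.format(len(sections) - index)
        body = candidate
    return body
```

Dropping whole sections instead of cutting text mid-finding keeps every PoC that does appear in the comment complete and copy-pasteable.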

What I learned

Separation of concerns matters for AI agents just like it does for code. Giving the Scanner zero write tools isn't just about security. It makes the whole system easier to reason about because you can inspect the handoff between agents.

LLMs are also surprisingly good at thinking like attackers. The attack chain detection, where it connects individual vulnerabilities into realistic multi-step exploit scenarios, was the feature I expected to fail. Turned out to be the most impressive part.

Built with

GitLab Duo Agent Platform
Anthropic Claude
GitLab API
Python
JavaScript
Java
YAML

Updates

Karl Seryani started this project — Mar 20, 2026 07:43 PM EDT
