Inspiration
Security scanners find vulnerabilities every day. Developers ignore them for weeks. SAST reports pile up, dependency alerts go stale, and critical fixes sit in backlogs because context-switching to remediate
them is tedious. The scanner found the problem — but nobody fixes it.
In many teams, the response time for a flagged vulnerability is measured in weeks, not minutes. We wanted to build something that closes the gap between detection and remediation:
not another dashboard, but an agent that actually does the work.
What it does
SecureFlow is a multi-agent flow built natively on the GitLab Duo Agent Platform that automates the full security remediation lifecycle:
- Triage Agent: Parses SAST, DAST, and dependency scan results. Assesses real-world exploitability using CVSS/EPSS scores, checks code reachability, filters false positives, and produces a prioritized vulnerability report.
- Fix Agent: Reads the vulnerable source code, understands root cause, and generates the smallest possible patch following OWASP best practices and the project's existing code style.
- Test Agent: Identifies the project's test framework, then generates both negative tests (exploit payload rejected) and positive tests (functionality preserved) for each fix.
- Validate Agent: Commits the fix and tests to a secureflow/remediate-* branch, creates a merge request with full documentation (vulnerability summary, root cause, OWASP reference, risk assessment), and links it to the original vulnerability.
- Notification Module: Sends consolidated alerts to Slack and/or Discord with color-coded diffs and interactive Approve/Reject buttons.
No code lands on main without human approval. What used to take hours of context-switching now takes a reviewer about 30 seconds.
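To make the triage step concrete, here is a minimal sketch of how findings might be filtered and ranked. The `Finding` fields, the scoring formula, and the 0.1 discount for unreachable code are all assumptions made for illustration; the writeup does not describe the actual scoring that the Triage Agent uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    identifier: str
    cvss: float                   # CVSS base score, 0.0 to 10.0
    epss: float                   # EPSS exploit probability, 0.0 to 1.0
    reachable: bool               # is the vulnerable code reachable?
    false_positive: bool = False  # flagged as noise during triage


def priority(finding: Finding) -> float:
    """Hypothetical score: severity weighted by exploit likelihood."""
    score = (finding.cvss / 10.0) * finding.epss
    # Assumed discount: unreachable code is far less urgent.
    return score if finding.reachable else score * 0.1


def triage(findings: List[Finding]) -> List[Finding]:
    """Drop false positives, then sort the rest by descending priority."""
    kept = [f for f in findings if not f.false_positive]
    return sorted(kept, key=priority, reverse=True)
```

With a scheme like this, a reachable critical finding outranks the same finding in dead code, and anything marked as a false positive never reaches the Fix Agent.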
How we built it
SecureFlow is built entirely on the GitLab Duo Agent Platform using Anthropic Claude as the underlying LLM. The architecture consists of:
- 4 custom agents (secureflow-triage.yml, secureflow-fix.yml, secureflow-test.yml, secureflow-validate.yml) registered in the GitLab AI Catalog
- 1 orchestration flow (secureflow-remediation.yml) that chains the agents in a sequential pipeline: Triage → Fix → Test → Validate
- A notification service for Slack/Discord alerts with interactive buttons
Each agent has a dedicated system prompt, a scoped toolset (only the GitLab API tools it needs), and timeout configuration. The flow uses GitLab's native component routing to pass structured output between
stages — the Triage Agent's prioritized report feeds into the Fix Agent, the Fix Agent's patches feed into the Test Agent, and everything flows into the Validate Agent for final commit and MR creation.
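The sequential hand-off described above can be sketched in plain Python. This is only a model of the data flow, not the GitLab Duo flow definition: the stage functions, dictionary keys, and placeholder patch text are hypothetical, and the real agents are LLM-backed rather than simple functions.

```python
from typing import Callable, Dict, List

Stage = Callable[[Dict], Dict]


def triage(payload: Dict) -> Dict:
    # Keep actionable findings only, most severe first.
    findings = [f for f in payload["findings"] if not f.get("false_positive")]
    findings.sort(key=lambda f: f["severity"], reverse=True)
    return {"report": findings}


def fix(payload: Dict) -> Dict:
    # One placeholder patch per finding in the triage report.
    patches = [{"finding_id": f["id"], "diff": "<patch>"} for f in payload["report"]]
    return {**payload, "patches": patches}


def write_tests(payload: Dict) -> Dict:
    # Each patch gets a negative and a positive test.
    tests = [
        {"finding_id": p["finding_id"], "kinds": ["negative", "positive"]}
        for p in payload["patches"]
    ]
    return {**payload, "tests": tests}


def validate(payload: Dict) -> Dict:
    # Bundle everything into a merge request that still needs a human.
    return {
        "merge_request": {
            "findings": [p["finding_id"] for p in payload["patches"]],
            "test_count": len(payload["tests"]),
            "needs_human_approval": True,
        }
    }


def run_flow(stages: List[Stage], payload: Dict) -> Dict:
    # Each stage consumes the structured output of the previous one.
    for stage in stages:
        payload = stage(payload)
    return payload


result = run_flow(
    [triage, fix, write_tests, validate],
    {
        "findings": [
            {"id": "sqli-1", "severity": 9.8},
            {"id": "xss-1", "severity": 6.1},
            {"id": "noise-1", "severity": 3.0, "false_positive": True},
        ]
    },
)
```

Because every stage takes and returns a dictionary, each one can be checked in isolation before it is wired into the chain.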
By running through GitLab Duo rather than external API calls, SecureFlow benefits from composite identity (scoped permissions per agent), sandboxed execution in CI/CD environments, a full audit trail, and
built-in prompt injection protection. Code never leaves the GitLab platform boundary.
We included a demo vulnerable Flask application with four intentional vulnerabilities (SQL injection, XSS, path traversal, command injection) to showcase the full remediation cycle end-to-end.
Challenges we ran into
Designing the inter-agent communication was the trickiest part. Each agent needs structured output that the next agent can consume, but Claude's responses are inherently free-form. We solved this by being very
specific in each agent's system prompt about the expected output format, and by using the flow's inputs mapping to pipe each stage's output to the next.
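Prompting for a fixed format can be paired with a defensive check before one stage's output is handed to the next. The sketch below is one possible approach, not something the writeup says SecureFlow does. The required keys and the function name are hypothetical, and it assumes the agent was asked to reply with a single JSON object.

```python
import json

# Hypothetical fields that a downstream agent would depend on.
REQUIRED_KEYS = {"finding_id", "severity", "explanation"}


def parse_agent_output(raw: str) -> dict:
    """Extract the JSON object from a free-form reply and check its shape."""
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in agent output")
    # Tolerate prose before and after the object, but nothing malformed inside.
    data = json.loads(raw[start:end + 1])
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"agent output missing keys: {sorted(missing)}")
    return data
```

Failing loudly at the stage boundary keeps a malformed report from turning into a confusing error two agents later.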
Scoping agent permissions correctly required careful thought. The Triage Agent needs read-only access to vulnerability data and source files. The Fix Agent only needs to read code. But the Validate Agent needs
write access to create commits, branches, and merge requests. Getting the toolset right for each agent — minimal privilege per stage — took iteration.
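The minimal-privilege idea can be expressed as an allowlist per agent. The tool names below are hypothetical stand-ins rather than actual GitLab Duo tool identifiers, and the Test Agent is left out because its permissions are not described above.

```python
# Hypothetical tool names; each agent gets only what its stage needs.
AGENT_TOOLS = {
    "triage": frozenset({"read_vulnerabilities", "read_file"}),
    "fix": frozenset({"read_file"}),
    "validate": frozenset(
        {"read_file", "create_branch", "create_commit", "create_merge_request"}
    ),
}


def authorize(agent: str, tool: str) -> None:
    """Raise PermissionError unless the tool is on the agent's allowlist."""
    allowed = AGENT_TOOLS.get(agent, frozenset())
    if tool not in allowed:
        raise PermissionError(f"{agent} agent may not call {tool}")
```

Under this layout only the Validate Agent can write to the repository, and an agent with no entry in the table is denied every tool by default.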
Balancing fix quality with minimality was another challenge. Claude is capable of rewriting entire functions, but the right fix for a SQL injection is often a single-line change from string concatenation to a
parameterized query. Prompting the agent to resist the urge to refactor was key.
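As an illustration of how small such a fix can be, here is a sketch using Python's standard sqlite3 module. The table, function names, and payload are invented for the example and are not taken from the demo Flask application.

```python
import sqlite3


def find_user_vulnerable(conn, username):
    # Vulnerable: user input is concatenated straight into the SQL string.
    query = "SELECT id, username FROM users WHERE username = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_fixed(conn, username):
    # Fixed: same query, but the input is passed as a bound parameter.
    query = "SELECT id, username FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])

payload = "' OR '1'='1"

# The concatenated query is tricked into returning every row.
assert len(find_user_vulnerable(conn, payload)) == 2
# Negative test: the exploit payload matches nothing after the fix.
assert find_user_fixed(conn, payload) == []
# Positive test: a legitimate lookup still works.
assert find_user_fixed(conn, "alice") == [(1, "alice")]
```

The two final assertions mirror the negative and positive tests that the Test Agent is described as generating for each fix.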
Accomplishments that we're proud of
The four-agent pipeline architecture is clean: each agent does exactly one job. Triage doesn't try to fix. Fix doesn't try to test. This separation keeps each agent's prompt focused, which in turn keeps the quality of each stage high.
The merge requests SecureFlow generates are genuinely reviewable — they include vulnerability context, OWASP references, root cause analysis, and both positive and negative test cases. A developer can understand the "why" in 30 seconds and make an informed approve/reject decision.
The notification module with interactive buttons closes the feedback loop. Developers don't need to go hunting for MRs — the fix comes to them in Slack or Discord with a one-click approval flow.
What we learned
Context is everything in security reasoning. The same line of code can be a critical vulnerability or a harmless test fixture depending on where it lives. LLMs excel at this kind of contextual judgment — better than regex-based scanners that treat every pattern match as equally alarming.
We also learned that the GitLab Duo Agent Platform's flow abstraction is powerful for multi-step workflows. Being able to chain agents with typed inputs/outputs and scoped toolsets makes the architecture clean
without needing external orchestration infrastructure.
The most valuable thing an AI security agent can do is explain itself. Developers don't just want to know something is wrong — they want to understand why, and they want a path forward. Structuring output as
explanation + severity + fix made findings dramatically more actionable.
What's next for SecureFlow Agent
- Configurable severity gates — let teams define which findings block a merge vs. post a warning
- Repository memory — track which files historically introduce vulnerabilities and prioritize analysis accordingly
- SARIF output — integrate with GitLab's native security dashboard alongside other scanners
- Open-source prompt templates — share the triage scoring and remediation prompts with the community