Inspiration

GitLab's own research highlights the AI Paradox — AI tools have made code authoring 10x faster, but developers only spend ~20% of their time writing code. The other 80% is reviews, testing, security, and operations. Faster code creation means more security scan findings, larger vulnerability backlogs, and slower delivery. We built SORK to break this cycle.

What it does

SORK (Security Orchestration, Remediation & Keeping) automates the entire vulnerability lifecycle using three AI agents on the GitLab Duo Agent Platform:

  • SORK Triage — Analyzes every vulnerability, reads the source code, assesses reachability, dismisses false positives with documented reasoning, and confirms real threats with severity ratings and CWE references.
  • SORK Remediation — Generates minimal, targeted code fixes following the project's existing style, creates a branch, commits the patch, and opens a merge request with all vulnerabilities linked.
  • SORK Keeper — Monitors the fix MR's pipeline, verifies original vulnerabilities are resolved, checks for regressions, and posts a verification report with a SAFE TO MERGE recommendation.
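
As an illustration of what a documented triage verdict might look like, here is a minimal sketch in Python. The `Finding` and `TriageResult` fields and the `record_triage` helper are hypothetical names chosen for this example, not part of SORK or the GitLab Duo Agent Platform. It also reduces the decision to a single reachability flag, which is a simplification of the analysis the Triage agent performs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    DISMISSED = "dismissed"   # false positive, kept with its reasoning
    CONFIRMED = "confirmed"   # real threat, handed on to Remediation


@dataclass(frozen=True)
class Finding:
    # Hypothetical fields; a real scanner report carries more detail.
    finding_id: str
    file_path: str
    cwe: str                  # e.g. "CWE-89" for SQL injection
    scanner_severity: str


@dataclass(frozen=True)
class TriageResult:
    finding: Finding
    verdict: Verdict
    reasoning: str
    severity: Optional[str]   # set only for confirmed findings


def record_triage(finding: Finding, reachable: bool, reasoning: str) -> TriageResult:
    """Turn a reachability judgement into a documented triage record."""
    if not reasoning.strip():
        raise ValueError("every verdict needs documented reasoning")
    if reachable:
        return TriageResult(finding, Verdict.CONFIRMED, reasoning,
                            finding.scanner_severity)
    return TriageResult(finding, Verdict.DISMISSED, reasoning, None)
```

Requiring non-empty reasoning for both outcomes mirrors the idea that a dismissal should be as auditable as a confirmation.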

How we built it

We built SORK entirely on the GitLab Duo Agent Platform using Anthropic Claude as the AI model. Each agent is a custom YAML configuration with a carefully engineered system prompt and a curated set of GitLab agent tools (25+ tools across the three agents). The agents are orchestrated using GitLab Flows in a sequential pipeline: Triage → Remediation → Keeper.
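
The sequential hand-off can be pictured with a small Python sketch. This is not the GitLab Flows configuration itself; the stage functions, the dictionary keys, and the `sork/fix` branch name are stand-ins invented for illustration. The only point it makes is that each stage consumes the previous stage's output.

```python
from typing import Callable, Dict, List

# Each stage takes the previous stage's output and returns its own.
Stage = Callable[[Dict], Dict]


def triage(ctx: Dict) -> Dict:
    # Stand-in: keep only the findings marked reachable.
    confirmed = [f for f in ctx["findings"] if f["reachable"]]
    return {**ctx, "confirmed": confirmed}


def remediation(ctx: Dict) -> Dict:
    # Stand-in: pretend one merge request covers every confirmed finding.
    mr = {"branch": "sork/fix", "fixes": [f["id"] for f in ctx["confirmed"]]}
    return {**ctx, "merge_request": mr}


def keeper(ctx: Dict) -> Dict:
    # Stand-in: report on the merge request produced by the previous stage.
    count = len(ctx["merge_request"]["fixes"])
    return {**ctx, "report": f"verified {count} fix(es)"}


def run_pipeline(stages: List[Stage], ctx: Dict) -> Dict:
    """Run the stages in order, threading the context through each one."""
    for stage in stages:
        ctx = stage(ctx)
    return ctx


result = run_pipeline(
    [triage, remediation, keeper],
    {"findings": [{"id": "a", "reachable": True},
                  {"id": "b", "reachable": False}]},
)
```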

We created a deliberate test project with five types of real vulnerabilities — outdated dependencies with known CVEs, SQL injection, hardcoded secrets, cross-site scripting, and path traversal — to validate the full end-to-end flow.
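
To make one of these vulnerability classes concrete, the following Python sketch shows a SQL injection flaw next to the kind of minimal fix a targeted patch would aim for. The table, the function names, and the choice of Python with `sqlite3` are assumptions for illustration; the write-up does not state the test project's language or schema.

```python
import sqlite3


def find_user_vulnerable(conn, name):
    # Vulnerable: user input is concatenated into the SQL text.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_fixed(conn, name):
    # Minimal fix: same query, but the value is bound as a parameter.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
leaked = find_user_vulnerable(conn, payload)   # matches every row
safe = find_user_fixed(conn, payload)          # matches nothing
```

The fix changes only how the value reaches the query, which is the sort of small, style-preserving patch the Remediation agent is meant to produce.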

No external APIs, no custom UI, no database. Everything runs natively inside GitLab with zero external dependencies.

Challenges we ran into

  • Prompt engineering for precision — Getting agents to dismiss false positives accurately without missing real threats required extensive iteration on system prompts and tool selection.
  • Minimal fix generation — Teaching the Remediation agent to generate the smallest possible patch without refactoring unrelated code was harder than expected.
  • Agent coordination — Handoffs between agents had to be clean: Triage's output becomes Remediation's input, and Remediation's MR becomes Keeper's monitoring target.
  • Verification accuracy — The Keeper agent needed to reliably distinguish between "vulnerability resolved" and "vulnerability no longer detected due to scan misconfiguration."
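
The verification distinction in the last bullet can be sketched as a three-way classification. This is a hypothetical Python illustration rather than the Keeper agent's actual logic; `verify_fix`, its parameters, and the `Status` names are invented for the example. The idea is that a missing finding counts as resolved only if the scanner that originally reported it actually completed on the fix MR's pipeline.

```python
from enum import Enum
from typing import Set


class Status(Enum):
    RESOLVED = "resolved"
    STILL_PRESENT = "still present"
    UNVERIFIED = "unverified"   # scanner did not run, so absence proves nothing


def verify_fix(finding_id: str, scanner: str,
               scanners_completed: Set[str],
               findings_after: Set[str]) -> Status:
    """Classify one original finding against the fix MR's pipeline results."""
    if scanner not in scanners_completed:
        return Status.UNVERIFIED
    if finding_id in findings_after:
        return Status.STILL_PRESENT
    return Status.RESOLVED
```

Checking scanner completion before looking at the findings is what keeps a misconfigured or skipped scan from being reported as a successful fix.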

What we learned

  • The GitLab Duo Agent Platform's 55+ tools are powerful enough to automate complex multi-step workflows without external dependencies.
  • Three focused agents outperform one monolithic agent — each agent has a clear responsibility and a curated toolset.
  • Security automation isn't about replacing security engineers — it's about giving them AI teammates that handle repetitive triage so they can focus on strategic decisions.

What's next for SORK

  • Multi-project scanning across entire GitLab groups
  • Severity-based routing (critical = immediate, medium = batched daily)
  • Pattern learning from historical dismissals to improve triage accuracy
  • Compliance report generation for SOC 2 and ISO 27001 audits
  • MCP integrations for Slack and PagerDuty notifications
