Inspiration
GitLab's own research highlights the AI Paradox: AI tools have made code authoring 10x faster, but developers spend only ~20% of their time writing code. The other 80% goes to reviews, testing, security, and operations. Faster code creation means more security scan findings, larger vulnerability backlogs, and slower delivery. We built SORK to break this cycle.
What it does
SORK (Security Orchestration, Remediation & Keeping) automates the entire vulnerability lifecycle using three AI agents on the GitLab Duo Agent Platform:
- SORK Triage — Analyzes every vulnerability, reads the source code, assesses reachability, dismisses false positives with documented reasoning, and confirms real threats with severity ratings and CWE references.
- SORK Remediation — Generates minimal, targeted code fixes following the project's existing style, creates a branch, commits the patch, and opens a merge request with all vulnerabilities linked.
- SORK Keeper — Monitors the fix MR's pipeline, verifies original vulnerabilities are resolved, checks for regressions, and posts a verification report with a SAFE TO MERGE recommendation.
How we built it
We built SORK entirely on the GitLab Duo Agent Platform using Anthropic Claude as the AI model. Each agent is a custom YAML configuration with a carefully engineered system prompt and a curated set of GitLab agent tools (25+ tools across the three agents). The agents are orchestrated using GitLab Flows in a sequential pipeline: Triage → Remediation → Keeper.
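The sequential handoff can be pictured with a minimal Python sketch. This is not SORK's implementation and does not use the GitLab Duo Agent Platform or its YAML schema: the `Finding` shape, the function names, the `reachable` flag standing in for Triage's analysis, and the branch name are all hypothetical, chosen only to show how each stage's output becomes the next stage's input.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Finding:
    # Hypothetical shape of a scanner finding; not GitLab's API schema.
    id: str
    cwe: str
    reachable: bool  # stand-in for the Triage agent's reachability assessment


@dataclass
class TriageResult:
    confirmed: list[Finding] = field(default_factory=list)
    dismissed: list[Finding] = field(default_factory=list)


def triage(findings: list[Finding]) -> TriageResult:
    """Stage 1: separate confirmed threats from dismissed false positives."""
    result = TriageResult()
    for finding in findings:
        if finding.reachable:
            result.confirmed.append(finding)
        else:
            result.dismissed.append(finding)
    return result


def remediate(triaged: TriageResult) -> Optional[dict]:
    """Stage 2: stand-in for opening one fix MR that links every confirmed finding."""
    if not triaged.confirmed:
        return None  # nothing to fix, so no MR is opened
    return {
        "branch": "sork/fix-example",  # hypothetical branch name
        "linked": [f.id for f in triaged.confirmed],
    }


def keep(mr: dict, still_detected: set[str]) -> dict:
    """Stage 3: check which linked findings still appear after the fix."""
    remaining = [i for i in mr["linked"] if i in still_detected]
    resolved = [i for i in mr["linked"] if i not in still_detected]
    return {"resolved": resolved, "remaining": remaining, "safe_to_merge": not remaining}


def run_pipeline(findings: list[Finding], still_detected: set[str]) -> Optional[dict]:
    """Triage -> Remediation -> Keeper, each stage consuming the previous output."""
    mr = remediate(triage(findings))
    return None if mr is None else keep(mr, still_detected)
```

Calling `run_pipeline` with one reachable and one unreachable finding, and an empty `still_detected` set, yields a report that lists only the reachable finding as resolved. In the real system each stage is an LLM-driven agent with its own tools, so the deterministic functions here only illustrate the data flow.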
We created a deliberately vulnerable test project with five types of real vulnerabilities (outdated dependencies with known CVEs, SQL injection, hardcoded secrets, cross-site scripting, and path traversal) to validate the full end-to-end flow.
No external APIs, no custom UI, no database. Everything runs natively inside GitLab with zero external dependencies.
Challenges we ran into
- Prompt engineering for precision — Getting agents to dismiss false positives accurately without missing real threats required extensive iteration on system prompts and tool selection.
- Minimal fix generation — Teaching the Remediation agent to generate the smallest possible patch without refactoring unrelated code was harder than expected.
- Agent coordination — Ensuring a clean handoff between agents took care: Triage's output becomes Remediation's input, and Remediation's MR becomes Keeper's monitoring target.
- Verification accuracy — The Keeper agent needed to reliably distinguish between "vulnerability resolved" and "vulnerability no longer detected due to scan misconfiguration."
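The verification-accuracy challenge can be made concrete with a small sketch. It is one possible rule, not the Keeper agent's actual logic: a finding that disappears counts as resolved only if the scanner that originally reported it also completed in the fix pipeline. The `Verdict` names, the function signatures, and the scanner labels are assumptions made for illustration.

```python
from enum import Enum


class Verdict(Enum):
    RESOLVED = "resolved"
    STILL_PRESENT = "still_present"
    INCONCLUSIVE = "inconclusive"  # finding vanished, but its scanner did not run


def verify_fix(
    targets: dict[str, str],
    detected_after: set[str],
    scanners_completed: set[str],
) -> dict[str, Verdict]:
    """Classify each targeted finding after the fix pipeline has run.

    targets maps a finding id to the scanner that originally reported it.
    detected_after holds the finding ids reported by the fix pipeline.
    scanners_completed holds the scanners that finished successfully there.
    """
    verdicts = {}
    for finding_id, scanner in targets.items():
        if finding_id in detected_after:
            verdicts[finding_id] = Verdict.STILL_PRESENT
        elif scanner not in scanners_completed:
            # Absence of evidence: the scanner was skipped or failed.
            verdicts[finding_id] = Verdict.INCONCLUSIVE
        else:
            verdicts[finding_id] = Verdict.RESOLVED
    return verdicts


def safe_to_merge(verdicts: dict[str, Verdict]) -> bool:
    """Recommend merging only when every targeted finding is truly resolved."""
    return bool(verdicts) and all(v is Verdict.RESOLVED for v in verdicts.values())
```

Under this rule, a pipeline in which the secret detection job was skipped would leave a hardcoded-secret finding marked inconclusive rather than resolved, which blocks the merge recommendation.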
What we learned
- The GitLab Duo Agent Platform's 55+ tools are powerful enough to automate complex multi-step workflows without external dependencies.
- Three focused agents outperform one monolithic agent — each agent has a clear responsibility and a curated toolset.
- Security automation isn't about replacing security engineers — it's about giving them AI teammates that handle repetitive triage so they can focus on strategic decisions.
What's next for SORK
- Multi-project scanning across entire GitLab groups
- Severity-based routing (critical = immediate, medium = batched daily)
- Pattern learning from historical dismissals to improve triage accuracy
- Compliance report generation for SOC 2 and ISO 27001 audits
- MCP integrations for Slack and PagerDuty notifications
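As a rough illustration of the severity-based routing item, the sketch below groups findings into queues. Only the two rules stated in the list (critical is immediate, medium is batched daily) come from the plan; the queue names, the input format, and the `unrouted` fallback for severities the plan does not mention are assumptions.

```python
from collections import defaultdict

# Only these two rules come from the roadmap; queue names are hypothetical.
ROUTES = {"critical": "immediate", "medium": "daily_batch"}


def route(findings: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (finding_id, severity) pairs into processing queues."""
    queues = defaultdict(list)
    for finding_id, severity in findings:
        # Severities without a stated rule land in a holding queue (assumption).
        queue = ROUTES.get(severity.lower(), "unrouted")
        queues[queue].append(finding_id)
    return dict(queues)
```

Keeping the rules in a plain mapping means a new severity tier can be added without touching the routing function.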