Security Guardian: Autonomous DevSecOps Flow

Inspiration

Every 39 seconds a company gets hacked. Most startups ship vulnerable code because they simply cannot afford a dedicated security engineer. We watched teams merge SQL injection, hardcoded AWS keys, and broken authentication straight to production — not because they were careless, but because security review is expensive and slow.

We asked: What if every developer had an AI security engineer that never sleeps, never misses a PR, and catches vulnerabilities before they reach production?

That became Security Guardian.

What it does

Security Guardian is a fully autonomous DevSecOps agent powered by Claude (Anthropic) via the GitLab Duo Agent Platform. It automatically triggers on every Merge Request and executes an 8-phase security review:

Reconnaissance — Reads all changed files, scans the entire repo for secret patterns
Taint Analysis — Traces user input from Source → Sink to find injection paths like a senior security researcher
Vulnerability Detection — Detects 14+ vulnerability types including SQL Injection, XSS, hardcoded secrets, broken auth, insecure deserialization, IDOR, command injection, path traversal, weak crypto, and dependency CVEs
Auto-Remediation — Rewrites vulnerable code with before/after explanation so developers learn
Issue Tracking — Creates detailed GitLab issues with real breach examples and dollar costs
Compliance Mapping — Maps every finding to GDPR, PCI-DSS, HIPAA, and SOC2
Audit Report — Posts a Security Score (0-100) with an APPROVE / REQUEST CHANGES / BLOCK verdict
Merge Blocking — Adds the security-review-required label to block unsafe merges automatically
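
To make the Reconnaissance phase concrete, here is a minimal sketch of a secret-pattern scan. In Security Guardian itself the agent does this through its prompt and GitLab tools; the function name, rule names, and regular expressions below are assumptions chosen for illustration, not the project's actual rules.

```python
import re

# Hypothetical rules for illustration only; a real scanner needs a much larger, tuned set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "hardcoded_password": re.compile(r"password\s*=\s*[\"'][^\"']{4,}[\"']", re.IGNORECASE),
}


def scan_for_secrets(path, text):
    """Return (path, line_number, rule_name) for every line that matches a rule."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((path, line_number, rule_name))
    return findings
```

Calling scan_for_secrets("App.java", source_text) on each changed file yields a list of findings that later phases could turn into issues or fixes. Pattern matching like this produces false positives, so it is best treated as a first pass.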

Demo results on real code:

Found 17 vulnerabilities, representing an estimated $2.7M–$410M in potential breach exposure
Auto-fixed all critical vulnerabilities
Security Score improved from 0/100 → 95/100
Tested on real OWASP WebGoat code — found 16 vulnerabilities across SQL Injection, XSS, XXE, SSRF, CSRF, Path Traversal, and Insecure Deserialization

How we built it

Custom Agent (agents/agent.yml) — 8-phase system prompt with taint analysis logic, 45 GitLab tools configured, compliance mapping for GDPR/PCI-DSS/HIPAA/SOC2
Custom Flow (flows/flow.yml) — Auto-triggers on MR assignment or @mention, extracts MR IID from context, executes full 8-step audit autonomously
Claude (Anthropic) — Powers all reasoning, taint analysis, and vulnerability detection via GitLab Duo Agent Platform
GitLab Duo Agent Platform — Provides tool access: get_merge_request, edit_file, create_issue, create_merge_request_note, update_merge_request and 40+ more
Green Code — Detects O(n²) loops, N+1 queries, estimates CPU and energy savings per fix
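
As an illustration of the kind of pattern the Green Code check targets, the sketch below contrasts a quadratic duplicate search with a single-pass version. Both functions are hypothetical examples, not code from the project, and the actual CPU or energy savings depend on input size and workload.

```python
def find_duplicates_quadratic(items):
    """O(n^2): compares every pair of items (the kind of loop the check would flag)."""
    duplicates = []
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j] and items[i] not in duplicates:
                duplicates.append(items[i])
    return duplicates


def find_duplicates_linear(items):
    """O(n) on average: a single pass that tracks seen items in hash sets."""
    seen = set()
    reported = set()
    duplicates = []
    for item in items:
        if item in seen and item not in reported:
            duplicates.append(item)
            reported.add(item)
        seen.add(item)
    return duplicates
```

Both versions report the same duplicate values, although the order of the returned list can differ. The linear version requires hashable items, which is the trade-off for avoiding the nested loop.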

Challenges we ran into

Goal too long error — GitLab passes the full MR diff as goal context; large repos like WebGoat hit the 16,384-character limit. Fixed by reducing the size of the system prompt.
Windows CRLF line endings — The edit_file tool uses Unix LF for matching; Windows files caused all edits to fail silently. Fixed by instructing the agent to fall back to create_file_with_contents.
GraphQL 422 errors — Em-dashes (—) and pipe characters (|) in the YAML caused GraphQL failures. Fixed by replacing all special characters.
MR context extraction — The flow initially received just a number as goal. Fixed by parsing MergeRequest IID: X from the context string.
Agent trying to delete files — Without explicit rules, the agent tried deleting and recreating files when edits failed. Fixed with strict golden rules in the prompt.
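
Three of the workarounds above can be sketched as small helpers. This is only an illustration under stated assumptions: the project implements these fixes through prompt instructions and flow configuration, the helper names are hypothetical, and the specific ASCII substitutes are a guess since the write-up does not say which replacements were used.

```python
import re

# Matches the "MergeRequest IID: X" fragment described above.
MR_IID_PATTERN = re.compile(r"MergeRequest IID:\s*(\d+)")


def extract_mr_iid(context):
    """Return the merge request IID found in the context string, or None."""
    match = MR_IID_PATTERN.search(context)
    return int(match.group(1)) if match else None


def uses_crlf(text):
    """True when the text contains Windows CRLF line endings."""
    return "\r\n" in text


def strip_special_characters(text):
    """Replace em-dashes (U+2014) and pipes with plain ASCII substitutes (assumed choices)."""
    return text.replace("\u2014", "-").replace("|", "/")
```

A check like uses_crlf could decide when to skip edit_file and fall back to create_file_with_contents, which is the fallback the team describes.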

Accomplishments that we're proud of

Fully autonomous — Zero human involvement from MR creation to BLOCK/APPROVE verdict
Real OWASP testing — Successfully scanned real WebGoat vulnerable code, not just toy examples
Up to $410M in potential breach exposure detected in a single scan
Complete compliance coverage — GDPR, PCI-DSS, HIPAA, SOC2 mapped automatically
Green Code — Detected O(n²) loops, with an estimated 99% CPU reduction and ~1,800 kWh/year in energy savings
Security Score progression — 0/100 → 95/100 in a single MR review cycle
14 GitLab issues created automatically with CWE references, breach examples, and remediation steps

What we learned

AI agents can replace entire security workflows when given the right tools, context, and chain-of-thought reasoning
Prompt engineering is engineering — every character counts when you hit API limits
Taint analysis in natural language works surprisingly well when structured as Source → Sink → Sanitizer
GitLab Duo Agent Platform is genuinely powerful — the combination of 45+ tools with Claude's reasoning creates something that feels like a real team member
Every dev team regardless of size or budget deserves an autonomous security engineer
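
The Source → Sink → Sanitizer structure mentioned above can be illustrated with a toy taint tracker. Security Guardian performs this reasoning in natural language through Claude, so the code below is not its implementation. It is a simplified sketch over straight-line statements, and every function name in it is hypothetical.

```python
SOURCES = {"get_parameter"}    # functions that return user-controlled data
SANITIZERS = {"escape_sql"}    # functions whose output is treated as safe
SINKS = {"execute_query"}      # functions that must not receive tainted data


def find_tainted_sinks(statements):
    """Track taint through a list of (target, function, args) statements.

    target is the assigned variable name, or None for a bare call.
    Returns (statement_index, function, tainted_arg) for each sink fed tainted data.
    """
    tainted = set()
    findings = []
    for index, (target, function, args) in enumerate(statements):
        tainted_args = [arg for arg in args if arg in tainted]
        if function in SINKS:
            findings.extend((index, function, arg) for arg in tainted_args)
        if target is None:
            continue
        # Taint enters at a source and propagates unless a sanitizer intervenes.
        if function in SOURCES or (tainted_args and function not in SANITIZERS):
            tainted.add(target)
        else:
            tainted.discard(target)
    return findings
```

For example, a sequence that reads a parameter, concatenates it into a query string, and passes it to execute_query yields one finding, while the same sequence with an escape_sql step in between yields none. Real taint analysis also has to handle branches, loops, and calls across files, which this sketch ignores.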

What's next for Security Guardian: Autonomous DevSecOps Flow

Google Cloud Pub/Sub integration — Publish every vulnerability as a real-time event so subscribed teams get instant alerts
Auto-approve loop — Guardian re-reviews after fixes and auto-approves when score hits 100/100
Multi-language support — Extend beyond Java to Python, Node.js, Go, Ruby
Historical trending — Track security score across MRs over time
Slack/Teams notifications — Alert security team when Critical vulnerabilities are found
Custom rule engine — Let teams define their own vulnerability patterns
Publish to GitLab AI Catalog — Make Security Guardian available to all GitLab users worldwide

Built With

agent
anthropic
ci/cd
claude
duo
gitlab
java
owasp
platform
yaml

Updates

anupborker BORKER started this project — Mar 23, 2026 12:19 AM EDT
