Inspiration

By one widely cited estimate, a cyberattack is attempted every 39 seconds. Most startups ship vulnerable code because they simply cannot afford a dedicated security engineer. We watched teams merge SQL injection, hardcoded AWS keys, and broken authentication straight to production, not because they were careless but because security review is expensive and slow.

We asked: What if every developer had an AI security engineer that never sleeps, never misses a PR, and catches vulnerabilities before they reach production?

That became Security Guardian.

What it does

Security Guardian is a fully autonomous DevSecOps agent powered by Claude (Anthropic) via the GitLab Duo Agent Platform. It triggers automatically when a Merge Request is assigned to it or when it is @mentioned, then executes an 8-phase security review:

  1. Reconnaissance — Reads all changed files, scans the entire repo for secret patterns
  2. Taint Analysis — Traces user input from Source → Sink to find injection paths like a senior security researcher
  3. Vulnerability Detection — Detects 14+ vulnerability types including SQL Injection, XSS, hardcoded secrets, broken auth, insecure deserialization, IDOR, command injection, path traversal, weak crypto, and dependency CVEs
  4. Auto-Remediation — Rewrites vulnerable code with before/after explanation so developers learn
  5. Issue Tracking — Creates detailed GitLab issues with real breach examples and dollar costs
  6. Compliance Mapping — Maps every finding to GDPR, PCI-DSS, HIPAA, and SOC2
  7. Audit Report — Posts Security Score 0-100 with APPROVE / REQUEST CHANGES / BLOCK verdict
  8. Merge Blocking — Adds security-review-required label to block unsafe merges automatically
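
To make the secret-pattern scan in phase 1 concrete, here is a minimal Python sketch. It is not the agent's actual logic, which lives in the system prompt and is carried out by Claude. The rule names, the `scan_for_secrets` helper, and the generic credential pattern are all assumptions made for illustration; only the `AKIA` prefix convention for AWS access key IDs is a known format.

```python
import re

# Illustrative patterns only; a real scanner would use a much larger rule set.
SECRET_PATTERNS = {
    # AWS access key IDs are 20 characters and commonly start with "AKIA".
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Hypothetical catch-all for credential-like string literals assigned in code.
    "hardcoded_credential": re.compile(
        r"(?i)\b(password|secret|api_key|token)\b\s*[:=]\s*[\"'][^\"']{8,}[\"']"
    ),
}


def scan_for_secrets(path, text):
    """Return (path, line_number, rule_name) for every line matching a pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((path, line_number, rule_name))
    return findings
```

Regex rules like these are cheap but noisy, which is one reason to pair them with a reviewer (human or model) that can judge whether a match is a real credential or a test fixture.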

Demo results on real code:

  • Found 17 vulnerabilities, representing an estimated $2.7M–$410M in potential breach exposure
  • Auto-fixed all critical vulnerabilities
  • Security Score improved from 0/100 → 95/100
  • Tested on real OWASP WebGoat code — found 16 vulnerabilities across SQL Injection, XSS, XXE, SSRF, CSRF, Path Traversal, and Insecure Deserialization
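
For readers unfamiliar with what a before/after SQL injection fix looks like, the sketch below shows the general shape. It is an illustration in Python with an in-memory SQLite database, not code from WebGoat (which is Java) and not output produced by Security Guardian. The table layout and function names are assumptions.

```python
import sqlite3


def find_user_vulnerable(conn, username):
    # BEFORE: user input is concatenated into the SQL string (injectable).
    query = "SELECT id, name FROM users WHERE name = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_fixed(conn, username):
    # AFTER: a bound parameter keeps the input as data, never as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (username,)
    ).fetchall()


def make_demo_db():
    # Small in-memory database so the example runs without any setup.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn
```

Passing the payload `x' OR '1'='1` to the vulnerable version returns every row in the table, while the parameterized version treats the same string as a literal name and returns nothing.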

How we built it

  • Custom Agent (agents/agent.yml) — 8-phase system prompt with taint analysis logic, 45 GitLab tools configured, compliance mapping for GDPR/PCI-DSS/HIPAA/SOC2
  • Custom Flow (flows/flow.yml) — Auto-triggers on MR assignment or @mention, extracts MR IID from context, executes full 8-step audit autonomously
  • Claude (Anthropic) — Powers all reasoning, taint analysis, and vulnerability detection via GitLab Duo Agent Platform
  • GitLab Duo Agent Platform — Provides tool access: get_merge_request, edit_file, create_issue, create_merge_request_note, update_merge_request and 40+ more
  • Green Code — Detects O(n²) loops, N+1 queries, estimates CPU and energy savings per fix
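
As an illustration of the kind of O(n²) loop the Green Code check is meant to flag, the following Python sketch contrasts a pairwise duplicate search with a single-pass version. Both functions are hypothetical examples written for this write-up; they are not taken from the scanned code, and no CPU or energy figures are implied by them.

```python
def find_duplicates_quadratic(items):
    # O(n^2): compares every pair of elements.
    duplicates = set()
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                duplicates.add(items[i])
    return duplicates


def find_duplicates_linear(items):
    # O(n) on average: one pass with hash-set lookups (items must be hashable).
    seen = set()
    duplicates = set()
    for item in items:
        if item in seen:
            duplicates.add(item)
        seen.add(item)
    return duplicates
```

Both functions return the same set of duplicated values; only the amount of work grows differently as the input gets larger.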

Challenges we ran into

  • Goal too long error — GitLab passes the full MR diff as goal context; large repos like WebGoat hit the 16,384-character limit. Fixed by shrinking the system prompt.
  • Windows CRLF line endings — The edit_file tool uses Unix LF for matching; Windows files caused all edits to fail silently. Fixed by instructing the agent to fall back to create_file_with_contents.
  • GraphQL 422 errors — Em-dashes and pipe characters | in YAML caused GraphQL failures. Fixed by replacing all special characters.
  • MR context extraction — The flow initially received just a number as goal. Fixed by parsing MergeRequest IID: X from the context string.
  • Agent trying to delete files — Without explicit rules, the agent tried deleting and recreating files when edits failed. Fixed with strict golden rules in the prompt.
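
Three of the issues above are easy to express in code. The Python sketch below is a rough illustration, not the project's implementation: the real fixes were made in the prompt and flow YAML. The helper names are hypothetical, the exact shape of the context string around `MergeRequest IID: X` is assumed, and normalizing line endings is shown as an alternative idea rather than the fallback to create_file_with_contents that was actually used.

```python
import re

# Character limit for the goal, as reported in the error described above.
GOAL_CHAR_LIMIT = 16384

# Assumes the context string contains the literal text "MergeRequest IID: <number>".
MR_IID_PATTERN = re.compile(r"MergeRequest IID:\s*(\d+)")


def extract_mr_iid(context):
    """Return the merge request IID as an int, or None if it is absent."""
    match = MR_IID_PATTERN.search(context)
    return int(match.group(1)) if match else None


def to_lf(text):
    """Convert Windows (CRLF) and bare CR line endings to Unix LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def fits_goal_limit(goal):
    """Check whether a goal string stays within the character limit."""
    return len(goal) <= GOAL_CHAR_LIMIT
```

Returning `None` instead of raising when the IID is missing lets the caller decide whether to fall back to another source or stop the run.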

Accomplishments that we're proud of

  • Fully autonomous — Zero human involvement from MR creation to BLOCK/APPROVE verdict
  • Real OWASP testing — Successfully scanned real WebGoat vulnerable code, not just toy examples
  • Up to $410M in potential breach exposure identified in a single scan
  • Complete compliance coverage — GDPR, PCI-DSS, HIPAA, SOC2 mapped automatically
  • Green Code — Detected O(n²) loops, with an estimated 99% CPU reduction and ~1,800 kWh/year in energy savings from the fixes
  • Security Score progression — 0/100 → 95/100 in a single MR review cycle
  • 14 GitLab issues created automatically with CWE references, breach examples, and remediation steps
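
One plausible way to turn a list of findings into a 0-100 Security Score and an APPROVE / REQUEST CHANGES / BLOCK verdict is sketched below. The severity weights and thresholds are invented for illustration; the write-up does not state how Security Guardian actually computes its score.

```python
# Hypothetical severity weights and thresholds, not taken from the project.
SEVERITY_PENALTY = {"critical": 40, "high": 20, "medium": 10, "low": 5}


def security_score(severities):
    """Start at 100 and subtract a penalty per finding, never going below 0."""
    penalty = sum(SEVERITY_PENALTY[s] for s in severities)
    return max(0, 100 - penalty)


def verdict(score, has_critical):
    """Map a score to a review verdict; any critical finding blocks the merge."""
    if has_critical or score < 50:
        return "BLOCK"
    if score < 90:
        return "REQUEST CHANGES"
    return "APPROVE"
```

Treating any critical finding as an automatic BLOCK, regardless of the numeric score, keeps a single severe issue from being averaged away by an otherwise clean merge request.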

What we learned

  • AI agents can replace entire security workflows when given the right tools, context, and chain-of-thought reasoning
  • Prompt engineering is engineering — every character counts when you hit API limits
  • Taint analysis in natural language works surprisingly well when structured as Source → Sink → Sanitizer
  • GitLab Duo Agent Platform is genuinely powerful — the combination of 45+ tools with Claude's reasoning creates something that feels like a real team member
  • Every dev team, regardless of size or budget, deserves an autonomous security engineer
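
The Source → Sink → Sanitizer structure mentioned above can be shown with a toy taint tracker. Security Guardian performs this reasoning in natural language through Claude, so the Python below is only a simplified sketch of the idea. The step format and the `find_tainted_sinks` function are assumptions made for this example.

```python
def find_tainted_sinks(steps):
    """Walk (op, target, arg) steps in order and report sinks fed by tainted data.

    op is one of "source", "assign", "sanitize", or "sink".
    """
    tainted = set()
    findings = []
    for op, target, arg in steps:
        if op == "source":
            tainted.add(target)          # target now holds untrusted input
        elif op == "assign":
            if arg in tainted:
                tainted.add(target)      # taint propagates through copies
            else:
                tainted.discard(target)  # overwritten with clean data
        elif op == "sanitize":
            tainted.discard(target)      # sanitizer output is treated as clean
        elif op == "sink":
            if arg in tainted:
                findings.append((target, arg))
    return findings


# Untrusted parameter flows straight into a query: reported.
vulnerable_flow = [
    ("source", "name", None),
    ("assign", "query", "name"),
    ("sink", "executeQuery", "query"),
]

# Same flow with a sanitizer between source and sink: not reported.
sanitized_flow = [
    ("source", "name", None),
    ("sanitize", "clean", "name"),
    ("assign", "query", "clean"),
    ("sink", "executeQuery", "query"),
]
```

Calling `find_tainted_sinks(vulnerable_flow)` reports the `executeQuery` sink, while `find_tainted_sinks(sanitized_flow)` reports nothing because the sanitizer breaks the path from source to sink.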

What's next for Security Guardian: Autonomous DevSecOps Flow

  • Google Cloud Pub/Sub integration — Publish every vulnerability as a real-time event so subscribed teams get instant alerts
  • Auto-approve loop — Guardian re-reviews after fixes and auto-approves when score hits 100/100
  • Multi-language support — Extend beyond Java to Python, Node.js, Go, Ruby
  • Historical trending — Track security score across MRs over time
  • Slack/Teams notifications — Alert security team when Critical vulnerabilities are found
  • Custom rule engine — Let teams define their own vulnerability patterns
  • Publish to GitLab AI Catalog — Make Security Guardian available to all GitLab users worldwide

Built With

  • agent
  • anthropic
  • ci/cd
  • claude
  • duo
  • gitlab
  • java
  • owasp
  • platform
  • yaml