Inspiration
AI agents are shipping into production faster than security tooling can keep up. Traditional SAST scanners check for SQL injection and XSS, but they don't know what prompt injection looks like. They can't flag when an agent executes LLM output directly, or when a developer hardcodes an API key in a rush to ship. We wanted to close that gap inside GitLab, where the code already lives.
What it does
AI Security Agent reviews merge requests for four categories of vulnerability that traditional scanners miss:
- Prompt injection: user input interpolated into LLM prompts
- Supply chain attacks: eval(), exec(), curl | bash with dynamic input
- Hardcoded secrets: API keys, database credentials, private keys in source
- Unsafe AI agent patterns: direct execution of LLM output, unauthenticated agent endpoints, missing token limits
Assign it as a reviewer on any MR. It posts findings with severity, exact line numbers, and copy-pasteable fixes.
How we built it
Two detection layers:
- Deterministic regex scanner (src/scanner.py): Parses 25 patterns from markdown-based pattern files and runs them against code diffs. Fast, free, no API calls. Catches the obvious stuff reliably.
- LLM contextual analysis (.gitlab/ai/flows/security-scan.yml): A GitLab Duo flow that sends the diff to the configured LLM for deeper review. Understands intent, catches subtle issues the regex layer misses.
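As a rough illustration of the regex layer, here is a minimal sketch of scanning only the added lines of a unified diff. The two patterns, the Finding fields, and the scan_diff name are hypothetical stand-ins, not the contents of src/scanner.py, and the diff handling is simplified (it assumes added lines never begin with "+++ ").

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for illustration; the real project loads its own
# set from markdown pattern files.
PATTERNS = {
    "dynamic-eval": re.compile(r"\beval\s*\("),
    "hardcoded-secret": re.compile(
        r"(?i)\b(api_key|secret|password)\s*=\s*['\"][^'\"]+['\"]"
    ),
}

HUNK_RE = re.compile(r"@@ -\d+(?:,\d+)? \+(\d+)")


@dataclass
class Finding:
    rule: str
    path: str
    line: int
    text: str


def scan_diff(diff: str) -> list[Finding]:
    """Run every pattern against the added lines of a unified diff."""
    findings = []
    path = ""
    new_line = 0  # line number in the new version of the file
    for raw in diff.splitlines():
        if raw.startswith("+++ "):
            path = raw[4:].removeprefix("b/")
        elif raw.startswith("--- "):
            continue
        elif raw.startswith("@@"):
            match = HUNK_RE.match(raw)
            if match:
                new_line = int(match.group(1))
        elif raw.startswith("+"):
            text = raw[1:]
            for rule, pattern in PATTERNS.items():
                if pattern.search(text):
                    findings.append(Finding(rule, path, new_line, text.strip()))
            new_line += 1
        elif raw.startswith("-"):
            continue  # removed lines do not exist in the new file
        else:
            new_line += 1  # context line
    return findings


SAMPLE_DIFF = """\
--- a/app.py
+++ b/app.py
@@ -1,2 +1,3 @@
 import os
+result = eval(user_input)
 print(result)
"""

for finding in scan_diff(SAMPLE_DIFF):
    print(finding)
```

Tracking the new-file line number from the hunk header is what lets a finding point at an exact line in the merge request.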
Output validation (src/schema.py) checks every LLM response against a strict schema before posting. No malformed reports reach developers.
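A minimal sketch of what strict validation of an LLM response could look like, using only the standard library. The field names, the severity levels, and the validate_report function are assumptions for illustration; the actual schema in src/schema.py may differ.

```python
import json

# Assumed report shape: {"findings": [{severity, file, line, description, fix}]}
ALLOWED_SEVERITIES = {"critical", "high", "medium", "low"}
REQUIRED_FIELDS = {
    "severity": str,
    "file": str,
    "line": int,
    "description": str,
    "fix": str,
}


def validate_report(raw: str) -> list[dict]:
    """Return the findings if the response is well formed, else raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc

    if not isinstance(data, dict) or not isinstance(data.get("findings"), list):
        raise ValueError("expected an object with a 'findings' list")

    for index, finding in enumerate(data["findings"]):
        if not isinstance(finding, dict):
            raise ValueError(f"finding {index} is not an object")
        extra = set(finding) - set(REQUIRED_FIELDS)
        if extra:
            raise ValueError(f"finding {index} has unexpected fields: {sorted(extra)}")
        for name, expected in REQUIRED_FIELDS.items():
            # type() rather than isinstance() so that True is not accepted as an int
            if name not in finding or type(finding[name]) is not expected:
                raise ValueError(f"finding {index} has a missing or mistyped '{name}'")
        if finding["severity"] not in ALLOWED_SEVERITIES:
            raise ValueError(f"finding {index} has an unknown severity")
        if finding["line"] < 1:
            raise ValueError(f"finding {index} has a non-positive line number")

    return data["findings"]
```

Rejecting unexpected fields as well as missing ones keeps free-form model output from leaking into the posted report.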
43 pytest tests cover pattern loading, true positives, true negatives, diff parsing, schema validation, and end-to-end integration.
Challenges
The patterns in the markdown files use regex syntax that breaks YAML parsing (brackets inside quoted strings). We had to write a custom parser that extracts patterns with regex-on-regex instead of relying on yaml.safe_load.
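The regex-on-regex idea can be sketched as follows. The pattern-file layout shown here (an "id:" line followed by a quoted "pattern:" line) is a guess for illustration, as are the names ID_RE, PATTERN_RE, and extract_patterns; the project's real files and parser may look different.

```python
import re

# Assumed entry layout inside the markdown file:
#   - id: some-rule
#     pattern: "<regex>"
ID_RE = re.compile(r"^\s*-\s*id:\s*(?P<id>[\w-]+)\s*$")
# Greedy .* runs to the last quote on the line, so quotes and brackets
# inside the regex body are taken verbatim with no unescaping.
PATTERN_RE = re.compile(r'^\s*pattern:\s*"(?P<body>.*)"\s*$')


def extract_patterns(markdown: str) -> dict[str, re.Pattern]:
    """Pull id/pattern pairs out of a markdown pattern file, line by line."""
    patterns = {}
    current_id = None
    for line in markdown.splitlines():
        if m := ID_RE.match(line):
            current_id = m.group("id")
        elif (m := PATTERN_RE.match(line)) and current_id:
            patterns[current_id] = re.compile(m.group("body"))
            current_id = None
    return patterns


SAMPLE = r'''
## Supply chain
- id: dynamic-exec
  pattern: "\b(?:eval|exec)\s*\(\s*[^)\s]"
'''

loaded = extract_patterns(SAMPLE)
print(sorted(loaded))
```

Because the pattern body is never passed through a YAML loader, backslashes and brackets reach re.compile exactly as written in the file.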
Balancing false positives was tricky. ast.literal_eval(user_input) contains "eval" and "user", triggering the eval detection pattern, but it's actually safe. The regex layer still flags cases like this as false positives, and the LLM layer compensates with contextual understanding. We document the known blind spots in the README rather than pretending they don't exist.
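To make the false positive concrete, the sketch below uses a hypothetical naive pattern (NAIVE) that behaves as described, plus one possible regex-level mitigation (STRICTER) using a negative lookbehind. Neither is taken from the project, which relies on the LLM layer here rather than on a tighter regex.

```python
import re

# Hypothetical naive rule: "eval(" followed later by "user".
NAIVE = re.compile(r"eval\s*\(.*user")
# Possible mitigation: require that "eval" is not preceded by a word
# character or a dot, which excludes literal_eval and method calls.
STRICTER = re.compile(r"(?<![\w.])eval\s*\(.*user")

SAFE = "config = ast.literal_eval(user_input)"
UNSAFE = "result = eval(user_input)"


def flags(pattern: re.Pattern, line: str) -> bool:
    """True if the pattern would raise a finding for this line."""
    return pattern.search(line) is not None


for name, pattern in (("naive", NAIVE), ("stricter", STRICTER)):
    print(name, flags(pattern, SAFE), flags(pattern, UNSAFE))
```

The tradeoff is a new blind spot: the stricter pattern no longer matches qualified calls such as builtins.eval(user_input), which is the kind of case a contextual layer is better placed to catch.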
What we learned
- Adversarial review before submission matters more than polishing the pitch. We ran a judge-perspective teardown of our own project and found inflated claims, fake tests, and unused code. Fixing those made the submission honest and the code real.
- Model-agnostic framing is both more honest and more broadly useful than tying to a specific LLM.
- Regex pre-filtering and LLM analysis complement each other well: deterministic patterns for known-bad code, contextual analysis for everything else.
What's next
- AST-based analysis for data flow tracking (not just surface patterns)
- Feedback loop: learn from developer accept/reject decisions to reduce false positives over time
- Integration with CVE and OWASP databases
- Cost optimization with tiered model routing