AI Security Agent

Inspiration

AI agents are shipping into production faster than security tooling can keep up. Traditional SAST scanners check for SQL injection and XSS, but they don't know what prompt injection looks like. They can't flag when an agent executes LLM output directly, or when a developer hardcodes an API key in a rush to ship. We wanted to close that gap inside GitLab, where the code already lives.

What it does

AI Security Agent reviews merge requests for four categories of vulnerability that traditional scanners miss:

Prompt injection: user input interpolated into LLM prompts
Supply chain attacks: eval(), exec(), curl | bash with dynamic input
Hardcoded secrets: API keys, database credentials, private keys in source
Unsafe AI agent patterns: direct execution of LLM output, unauthenticated agent endpoints, missing token limits

Assign it as a reviewer on any MR. It posts findings with severity, exact line numbers, and copy-pasteable fixes.

How we built it

Two detection layers:

Deterministic regex scanner (src/scanner.py): Parses 25 patterns from markdown-based pattern files and runs them against code diffs. Fast, free, no API calls. Catches the obvious stuff reliably.
LLM contextual analysis (.gitlab/ai/flows/security-scan.yml): A GitLab Duo flow that sends the diff to the configured LLM for deeper review. Understands intent, catches subtle issues the regex layer misses.

Output validation (src/schema.py) checks every LLM response against a strict schema before posting. No malformed reports reach developers.

43 pytest tests cover pattern loading, true positives, true negatives, diff parsing, schema validation, and end-to-end integration.

Challenges

The patterns in the markdown files use regex syntax that breaks YAML parsing (brackets inside quoted strings). We had to write a custom parser that extracts patterns with regex-on-regex instead of relying on yaml.safe_load.

Balancing false positives was tricky. ast.literal_eval(user_input) contains "eval" and "user", triggering the eval detection pattern, but it's actually safe. The regex layer catches these, and the LLM layer compensates with contextual understanding. We document the known blind spots in the README rather than pretending they don't exist.

What we learned

Adversarial review before submission matters more than polishing the pitch. We ran a judge-perspective teardown of our own project and found inflated claims, fake tests, and unused code. Fixing those made the submission honest and the code real.
Model-agnostic framing is both more honest and more broadly useful than tying to a specific LLM.
Regex pre-filtering and LLM analysis complement each other well: deterministic patterns for known-bad code, contextual analysis for everything else.

What's next

AST-based analysis for data flow tracking (not just surface patterns)
Feedback loop: learn from developer accept/reject decisions to reduce false positives over time
Integration with CVE and OWASP databases
Cost optimization with tiered model routing

Built With

claude
gitlab-duo-agent-platform
json
pytest
python
regex
schema
yaml

Updates

Jeka P started this project — Mar 25, 2026 12:58 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.