Inspiration
Every team I've worked with has the same problem: the security scanner finishes, 40+ findings appear, and nobody knows where to start. Most of those findings are theoretical — low reachability, strong mitigations already in place, transitive dependencies that never get called. But a raw CVSS score doesn't tell you that. So either a senior engineer spends hours triaging, or the findings get ignored entirely.
I wanted to build the agent that does that triage automatically — not a rules-based filter, but something that actually reasons about whether a vulnerability matters in context.
What it does
SecuritySentinel is a GitLab Duo agent that triggers on pipeline completion and triages your security scan output using Claude.
For each finding it:
- Assesses real exploitability based on code path reachability, not just CVSS score
- Checks for mitigating controls (auth layers, input validation) already present in the codebase
- Assigns a priority tier:
CRITICAL,HIGH,MEDIUM, orLOW - Opens a GitLab issue for critical findings with full reasoning and remediation guidance
- Drafts a fix MR for anything with a trivial remediation (version bump or small code change)
- Annotates the pipeline for lower-severity findings
The result: 47 findings become 15 actionable items in under 10 seconds.
How we built it
The agent is built in Python and structured as a GitLab Duo custom agent. The pipeline triggers a webhook on completion, which the agent receives and processes through three stages:
- Parse —
scanner_parser.pynormalizes GitLab SAST and dependency scan JSON into a clean findings schema - Triage —
triage.pysends findings to Claude via the Anthropic API through GitLab Duo, using a carefully engineered system prompt that instructs Claude to reason about reachability, context, and mitigations before assigning priority - Act —
gitlab_client.pycreates issues, drafts MRs, and annotates the pipeline based on Claude's structured JSON output
The Claude system prompt was the most important design decision. Rather than asking Claude to rate severity, I prompt it to reason step by step: is the code path reachable? Are there mitigations? Has this CVE been actively exploited? That chain of reasoning is what produces useful output rather than just reformatted scanner data.
Stack:
- Python 3.11
- Anthropic API (Claude Sonnet) via GitLab Duo
- GitLab REST API (issues, MRs, pipeline annotations)
- GitLab CI/CD webhook trigger
Challenges we ran into
Getting the prompt right. Early versions of the triage prompt produced verbose, hedged output that wasn't useful for automation. The breakthrough was requiring structured JSON output and being explicit that Claude should disagree with the raw CVSS score when context warrants it. That unlocked the reasoning quality I was looking for.
Parsing scanner output reliably. GitLab's SAST and dependency scan formats differ subtly across scanner versions. Building a normalizer that handles both cleanly took more iteration than expected.
Scoping the demo. The temptation was to handle every scanner type. I forced myself to ship SAST + dependency scanning only, and do those two well, rather than five things poorly.
What we learned
Claude is most useful when you give it permission to disagree with upstream data. The best output came when the prompt explicitly said: "Never mark something CRITICAL solely because its base CVSS is high — reachability matters." That single instruction changed the quality of reasoning dramatically.
I also learned that the hardest part of building agents isn't the AI call — it's the plumbing. Reliable webhook handling, normalized input schemas, and idempotent GitLab API calls took more time than the Claude integration itself.
What's next for SecuritySentinel – AI Security Triage Agent for GitLab
- Support for GitLab DAST and secret detection scanners
- Per-repo policy files so teams can configure their own severity thresholds
- A weekly digest mode that summarizes unresolved findings across all projects
- MR-level triage: run the agent on the diff only, not the full codebase
Built With
- agent
- api
- ci/cd
- claude
- duo
- gitlab
- platform
- python
- rest

Log in or sign up for Devpost to join the conversation.