SecuritySentinel – AI Security Triage Agent for GitLab

Inspiration

Every team I've worked with has the same problem: the security scanner finishes, 40+ findings appear, and nobody knows where to start. Most of those findings are theoretical — low reachability, strong mitigations already in place, transitive dependencies that never get called. But a raw CVSS score doesn't tell you that. So either a senior engineer spends hours triaging, or the findings get ignored entirely.

I wanted to build the agent that does that triage automatically — not a rules-based filter, but something that actually reasons about whether a vulnerability matters in context.

What it does

SecuritySentinel is a GitLab Duo agent that triggers on pipeline completion and triages your security scan output using Claude.

For each finding it:

Assesses real exploitability based on code path reachability, not just CVSS score
Checks for mitigating controls (auth layers, input validation) already present in the codebase
Assigns a priority tier: CRITICAL, HIGH, MEDIUM, or LOW
Opens a GitLab issue for critical findings with full reasoning and remediation guidance
Drafts a fix MR for anything with a trivial remediation (version bump or small code change)
Annotates the pipeline for lower-severity findings

The result: 47 findings become 15 actionable items in under 10 seconds.

How we built it

The agent is built in Python and structured as a GitLab Duo custom agent. The pipeline triggers a webhook on completion, which the agent receives and processes through three stages:

Parse — scanner_parser.py normalizes GitLab SAST and dependency scan JSON into a clean findings schema
Triage — triage.py sends findings to Claude via the Anthropic API through GitLab Duo, using a carefully engineered system prompt that instructs Claude to reason about reachability, context, and mitigations before assigning priority
Act — gitlab_client.py creates issues, drafts MRs, and annotates the pipeline based on Claude's structured JSON output

The Claude system prompt was the most important design decision. Rather than asking Claude to rate severity, I prompt it to reason step by step: is the code path reachable? Are there mitigations? Has this CVE been actively exploited? That chain of reasoning is what produces useful output rather than just reformatted scanner data.

Stack:

Python 3.11
Anthropic API (Claude Sonnet) via GitLab Duo
GitLab REST API (issues, MRs, pipeline annotations)
GitLab CI/CD webhook trigger

Challenges we ran into

Getting the prompt right. Early versions of the triage prompt produced verbose, hedged output that wasn't useful for automation. The breakthrough was requiring structured JSON output and being explicit that Claude should disagree with the raw CVSS score when context warrants it. That unlocked the reasoning quality I was looking for.

Parsing scanner output reliably. GitLab's SAST and dependency scan formats differ subtly across scanner versions. Building a normalizer that handles both cleanly took more iteration than expected.

Scoping the demo. The temptation was to handle every scanner type. I forced myself to ship SAST + dependency scanning only, and do those two well, rather than five things poorly.

What we learned

Claude is most useful when you give it permission to disagree with upstream data. The best output came when the prompt explicitly said: "Never mark something CRITICAL solely because its base CVSS is high — reachability matters." That single instruction changed the quality of reasoning dramatically.

I also learned that the hardest part of building agents isn't the AI call — it's the plumbing. Reliable webhook handling, normalized input schemas, and idempotent GitLab API calls took more time than the Claude integration itself.

What's next for SecuritySentinel – AI Security Triage Agent for GitLab

Support for GitLab DAST and secret detection scanners
Per-repo policy files so teams can configure their own severity thresholds
A weekly digest mode that summarizes unresolved findings across all projects
MR-level triage: run the agent on the diff only, not the full codebase

Built With

agent
api
ci/cd
claude
duo
gitlab
platform
python
rest

Submitted to

GitLab AI Hackathon

Created by

I led the end-to-end development of SecuritySentinel, including concept design, architecture, and implementation.

Specifically, I:
- Designed the core approach for prioritizing vulnerabilities based on real exploitability (code path reachability vs. static CVSS scoring)
- Built the agent workflow that ingests scan outputs, performs contextual reasoning, and generates actionable recommendations
- Developed the automation layer to create prioritized issues and draft merge requests
- Integrated LLM-based reasoning to explain why specific vulnerabilities matter in context
- Created the demo environment, test fixtures, and execution flow showcased in the submission
This project reflects my focus on turning noisy security data into clear, developer-actionable outcomes.

chris smith

Updates

chris smith started this project — Mar 22, 2026 05:29 AM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.