- Session list (the latest session is the one shown in the video)
- Sentinel full analysis report
- Commenting on a merge request to start the Sentinel session
- Vulnerability put up for manual review
- Sentinel Fixer in action
- Report
- Scorecard on the merge request
- Full report in session
- Diff comparing the main branch against the latest branch with all the fixes
Inspiration
I kept reading about "vibe coding" — developers prompting AI to generate entire codebases without really reviewing what came out. Then I saw the numbers: AI-generated code is 2.7x more likely to have XSS vulnerabilities, and most orgs take 270 days to patch security issues. Attackers exploit them in 15 days. That gap bothered me. I wanted something that would catch these problems automatically, fix what it could, and at least flag what it couldn't — before the code ever hits production.
What it does
Security Sentinel is a 3-component flow on the GitLab Duo Agent Platform. You comment on a merge request, and it does the rest:
- Scanner reads every source file plus GitLab's SAST findings. It classifies vulnerabilities across 9 categories and maps them to the OWASP Top 10.
- Fixer auto-fixes 5 categories (outdated dependencies, hardcoded secrets, SQL injection, missing security headers, exposed .env files) by creating real code commits with educational comments explaining why each fix matters. For things it can't safely auto-fix — IDOR, missing rate limiting, unhashed passwords — it creates triage issues with remediation guides and SLA deadlines.
- Reporter posts a Security Scorecard on the merge request with OWASP mapping, compliance status (SOC2, PCI-DSS, GDPR), and links to everything.
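The split between auto-fixable and triage-only findings can be sketched as a small routing step. This is an illustration, not the project's actual code: the category identifiers, the finding shape, and the function names are hypothetical, and only the eight categories named above are listed (the ninth is not named here, so unknown categories default to triage).

```javascript
// Categories the write-up says the Fixer can patch automatically.
const AUTO_FIX = new Set([
  "outdated-dependency",
  "hardcoded-secret",
  "sql-injection",
  "missing-security-headers",
  "exposed-env-file",
]);

// Categories the write-up names as needing a human decision.
const TRIAGE_ONLY = new Set([
  "idor",
  "missing-rate-limiting",
  "unhashed-password",
]);

// Hypothetical finding shape: { category: string, file: string }.
function routeFinding(finding) {
  const known =
    AUTO_FIX.has(finding.category) || TRIAGE_ONLY.has(finding.category);
  // Anything that is not explicitly auto-fixable goes to a human,
  // including categories this sketch does not recognize.
  const action = AUTO_FIX.has(finding.category) ? "auto-fix" : "triage";
  return { ...finding, action, known };
}

// Groups routed findings by the action chosen for them.
function routeAll(findings) {
  const plan = { "auto-fix": [], triage: [] };
  for (const routed of findings.map(routeFinding)) {
    plan[routed.action].push(routed);
  }
  return plan;
}
```

Defaulting unrecognized categories to triage keeps the sketch conservative: a finding is only patched automatically when it is on the explicit allow-list.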
There's also a GCP Cloud Function that stores detected secrets in Secret Manager for rotation.
How I built it
The flow runs on GitLab's Duo Agent Platform: three components (scanner, fixer, reporter) defined in a single YAML file, each with its own prompt and toolset. All the AI reasoning is handled by Anthropic Claude through the Duo platform. The scan target is a Node.js app on Express that I built with intentional, known flaws so the agent has real code to work on.
The GCP integration is a Cloud Function deployed to Cloud Run that accepts secret payloads from a CI pipeline job and stores them in Secret Manager. The CI pipeline runs SAST, Secret Detection, and the vault-secrets job on every merge request.
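The core of such a function, accepting a payload and handing it to a secret store, might look roughly like the sketch below. It is not the project's actual function: the payload shape `{ name, value }`, the function names, and the injected `store` object are assumptions for illustration. A real deployment would back `store` with the Secret Manager client library.

```javascript
// Secret Manager secret IDs may contain letters, digits, hyphens, and
// underscores, up to 255 characters.
const SECRET_ID_PATTERN = /^[A-Za-z0-9_-]{1,255}$/;

// Hypothetical payload shape sent by the CI job: { name, value }.
// Returns a list of validation errors (empty when the payload is valid).
function validateSecretPayload(payload) {
  if (payload === null || typeof payload !== "object") {
    return ["payload must be a JSON object"];
  }
  const errors = [];
  if (typeof payload.name !== "string" || !SECRET_ID_PATTERN.test(payload.name)) {
    errors.push("name must be a valid secret ID");
  }
  if (typeof payload.value !== "string" || payload.value.length === 0) {
    errors.push("value must be a non-empty string");
  }
  return errors;
}

// Secret Manager resource names follow projects/<project>/secrets/<id>.
function secretResourceName(projectId, secretId) {
  return `projects/${projectId}/secrets/${secretId}`;
}

// `store` is a stand-in with an async put(name, value) method (assumption).
async function handleSecretPayload(payload, store, projectId) {
  const errors = validateSecretPayload(payload);
  if (errors.length > 0) {
    return { status: 400, body: { errors } };
  }
  const name = secretResourceName(projectId, payload.name);
  await store.put(name, payload.value);
  // Respond with the resource name only; never echo the secret value.
  return { status: 201, body: { stored: name } };
}
```

Injecting `store` keeps the validation and response logic testable without cloud credentials, and the response deliberately omits the secret value so it cannot leak into CI job logs.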
I iterated through v1 (basic scan + fix), v2 (added reporter + triage issues), and v3 (confidence scoring, educational comments, SLA tracking, compliance mapping). Each version was tested end-to-end with a fresh merge request.
Challenges I ran into
The biggest headache was getting the Reporter to post its scorecard on the correct merge request. The flow's service account kept targeting its own namespace instead of the project where the code lives. I spent a few hours debugging the prompt to explicitly extract the MR IID from context and force the right API calls. Turns out prompt engineering for tool-use is a different beast than prompt engineering for text generation — the agent needs very explicit instructions about which parameters to pass to which tool.
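The fix itself lived in the prompt, but the information the agent has to pin down can be shown in code. The sketch below assumes the context includes a standard GitLab merge request URL and derives the two values that matter, the project path and the MR IID, then builds the REST path for posting a note. The function names are hypothetical, and the agent's own tools may take these parameters in a different form.

```javascript
// Parses URLs such as
//   https://gitlab.example.com/group/subgroup/project/-/merge_requests/42
// Returns { projectPath, iid }, or null if the URL is not an MR URL.
function parseMergeRequestUrl(url) {
  let pathname;
  try {
    ({ pathname } = new URL(url));
  } catch {
    return null; // not a valid absolute URL
  }
  const match = pathname.match(/^\/(.+?)\/-\/merge_requests\/(\d+)(?:\/|$)/);
  if (!match) return null;
  return { projectPath: match[1], iid: Number(match[2]) };
}

// GitLab's REST API accepts a URL-encoded project path as the project ID:
//   POST /projects/:id/merge_requests/:merge_request_iid/notes
function notesEndpoint({ projectPath, iid }) {
  return `/projects/${encodeURIComponent(projectPath)}/merge_requests/${iid}/notes`;
}
```

Deriving both values from the same URL is the point: the comment can only land on the wrong merge request if the project and the IID come from different sources.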
The other challenge was working with Developer-level access (not Maintainer). I couldn't set CI/CD variables through the UI, which meant the auto-trigger pipeline job couldn't fire. Manual triggering via MR comments was the workaround, and honestly it demos better anyway.
Accomplishments that we're proud of
As far as I could tell, this is the only project in the hackathon (out of 1,692) that creates actual code commits and fix merge requests. Most of the other agents I saw report or score; Security Sentinel reports and fixes. The educational comments in every commit are something I'm particularly happy about. The agent doesn't just patch your code; it teaches you why your code was broken and links you to the OWASP docs so you understand the fix.
Getting the full pipeline working end-to-end — from one comment to fixed code, triage issues with SLAs, and a compliance scorecard — felt like watching the whole thing click into place at 2 AM.
What I learned
GitLab's Duo Agent Platform is surprisingly capable once you understand the component model. The flow YAML is straightforward, but getting agents to use tools correctly requires very precise prompting, especially around which project to target and which MR to comment on. I also learned that confidence scoring on automated fixes is more useful than I expected. Knowing that a dependency bump is HIGH confidence (minimal risk of changing behavior) while a SQL parameterization is MEDIUM (semantically equivalent but a different code path) changes how you review the output.
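One way confidence labels can shape a review is by ordering the fixes so the least certain ones get attention first. The sketch below is an assumption about how that could be done, not the project's implementation: HIGH and MEDIUM come from the write-up, while the LOW level, the fix shape, and the function name are hypothetical.

```javascript
// Lower rank means "review first". HIGH and MEDIUM are the levels mentioned
// in the write-up; LOW is an assumed third level.
const REVIEW_RANK = { LOW: 0, MEDIUM: 1, HIGH: 2 };

// Hypothetical fix shape: { title: string, confidence: string }.
// Returns a new array; fixes with an unrecognized label sort to the front
// so they are never skipped silently.
function reviewOrder(fixes) {
  const rank = (fix) => REVIEW_RANK[fix.confidence] ?? -1;
  return [...fixes].sort((a, b) => rank(a) - rank(b));
}

const ordered = reviewOrder([
  { title: "Bump outdated dependency", confidence: "HIGH" },
  { title: "Parameterize SQL query", confidence: "MEDIUM" },
]);
console.log(ordered.map((fix) => fix.title));
```

With the two examples from the text, the SQL parameterization is listed ahead of the dependency bump, which matches the idea that a MEDIUM fix deserves a closer look than a HIGH one.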
What's next for Security Sentinel
Support for more languages beyond Node.js — Python, Go, Java. Smarter recurrence detection so if a vulnerability the agent already fixed reappears, it gets flagged immediately. A dashboard showing security trends across multiple merge requests over time. And making the GCP integration configurable so teams can plug in their own cloud provider or use HashiCorp Vault instead.
Built With
- anthropic-claude
- express.js
- gcp-cloud-functions
- gcp-secret-manager
- gitlab-duo-agent-platform
- node.js
- yaml