Inspiration
Every DevOps team has experienced the frustration of a broken pipeline after merge, a security vulnerability discovered in production, or a risky deployment that should have been caught in review. We noticed that every past GitLab AI Hackathon winner built reactive tools — they fix problems after they happen. We asked: what if AI could predict failures before they occur?
That's GitLab Sentinel — shifting DevOps from reactive firefighting to proactive prevention.
What it does
When a developer submits a Merge Request, Sentinel automatically analyzes the code changes and predicts three types of risk:
- Pipeline Risk — Will CI/CD break? Detects dependency version conflicts, broken CI configs, missing test coverage
- Security Risk — Are there vulnerabilities? Catches hardcoded secrets (CWE-798), SQL injection (CWE-89), RCE risks (CWE-94)
- Delivery Risk — Is this MR too risky to merge? Flags large blast radius changes, missing tests, breaking API changes
Sentinel posts a structured analysis report directly as an MR comment with risk scores (0-10), specific findings with CWE references, and actionable prevention recommendations.
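As a rough illustration of the report described above, the Python sketch below models one possible structure for it. The class, field, and method names (`Finding`, `RiskReport`, `to_comment`) are hypothetical and not taken from Sentinel's implementation; only the three risk categories, the 0-10 score range, and the CWE references come from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Finding:
    """One issue detected in the MR (hypothetical structure)."""
    category: str               # "pipeline", "security", or "delivery"
    summary: str
    recommendation: str
    cwe: Optional[str] = None   # e.g. "CWE-798"; typically set for security findings


@dataclass
class RiskReport:
    """Risk scores per category plus the findings that justify them."""
    scores: Dict[str, int]                      # category -> score on the 0-10 scale
    findings: List[Finding] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject scores outside the documented 0-10 range.
        for category, score in self.scores.items():
            if not 0 <= score <= 10:
                raise ValueError(f"{category} score {score} is outside 0-10")

    def to_comment(self) -> str:
        """Render the report as plain text suitable for an MR comment."""
        lines = ["Sentinel Risk Report", ""]
        for category, score in self.scores.items():
            lines.append(f"{category.title()} risk: {score}/10")
        for f in self.findings:
            ref = f" ({f.cwe})" if f.cwe else ""
            lines.append(f"- [{f.category}] {f.summary}{ref}. Fix: {f.recommendation}")
        return "\n".join(lines)
```

Validating the score range at construction time means a malformed analyzer output fails early instead of reaching the MR comment.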
How we built it
Sentinel is built entirely on the GitLab Duo Agent Platform using Custom Agents and Custom Flows:
- 2-Agent Lightning Architecture: A unified Sentinel Analyzer (combines triage + pipeline prediction + security scanning in one pass) feeds into a Sentinel Reporter that posts the structured MR comment
- Strict Tool Budget: Maximum 12 tool calls per analysis (typical: 5-8), preventing agent over-exploration and ensuring fast response times
- MR-Diff-First Analysis: The analyzer reads only the MR diff files, never scanning the entire repository — this is both faster and more accurate
- Industry-Standard Evaluation: 4 benchmark scenarios with metrics including pass@k, tool trajectory, detection completeness, and false positive rate
- 74 Offline Tests: pytest suite validating YAML constraints, output schemas, flow integrity, and prompt quality in 0.13s
Tech stack: GitLab Duo Agent Platform, Anthropic Claude (via GitLab sandbox), Google Cloud Platform (BigQuery), Python (pytest)
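Of the evaluation metrics listed above, pass@k has a widely used closed form: with n independent runs of a scenario, of which c pass, the unbiased estimate is 1 - C(n-c, k) / C(n, k). The sketch below implements that standard estimator. It is an assumption that Sentinel's benchmark computes pass@k this way; the function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k sampled runs passes.

    n: total runs of the scenario, c: runs that passed, k: sample size.
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # comb(n - c, k) counts the k-subsets made up only of failing runs;
    # it is 0 when fewer than k runs failed, which yields 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 10 runs of a benchmark scenario and 3 passes, `pass_at_k(10, 3, 1)` is the plain pass rate, while larger k rewards scenarios where at least one of several attempts succeeds.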
Challenges we ran into
- WebSocket Timeout: Our initial 4-agent serial chain took ~10 minutes and caused WebSocket disconnects (code 1006). We redesigned to a 2-agent architecture that completes in ~4 minutes
- Agent Over-Exploration: Early versions made 153+ tool calls, scanning entire repos. We added strict tool budgets and "answer immediately" directives to keep calls under 15
- Security Scanner Scope: The scanner initially read files from main branch instead of MR diff, causing false results. Fixed by making the analyzer start from list_merge_request_diffs
- Platform Constraints: Discovered undocumented rules (no DeterministicStep, no model field, string inputs cause WebSocket disconnect) through trial and error
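The diff-scoped scanning fix described above can be illustrated with a small Python sketch that inspects only the lines an MR adds. It assumes the input is unified-diff text for a single file. The helper names (`added_lines`, `find_hardcoded_secrets`) and the regular expression are hypothetical and deliberately simple; they are not Sentinel's detection logic and do not call `list_merge_request_diffs`.

```python
import re
from typing import Iterator, List, Tuple

# Matches hunk headers such as "@@ -10,4 +12,6 @@" and captures the new-file start line.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")

# Illustrative pattern for CWE-798 style findings: a secret-like name assigned a quoted literal.
SECRET_PATTERN = re.compile(
    r"(?i)\b(password|passwd|secret|api_key|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
)


def added_lines(diff_text: str) -> Iterator[Tuple[int, str]]:
    """Yield (new_line_number, text) for every line the diff adds."""
    new_no = 0
    in_hunk = False
    for line in diff_text.splitlines():
        header = HUNK_HEADER.match(line)
        if header:
            new_no = int(header.group(1))
            in_hunk = True
        elif not in_hunk or line.startswith("\\"):
            continue  # file headers, or the "\ No newline at end of file" marker
        elif line.startswith("+"):
            yield new_no, line[1:]
            new_no += 1
        elif line.startswith("-"):
            continue  # removed lines do not exist in the new file
        else:
            new_no += 1  # context line, present in both versions


def find_hardcoded_secrets(diff_text: str) -> List[Tuple[int, str]]:
    """Return added lines that look like hardcoded credentials."""
    return [(no, text) for no, text in added_lines(diff_text)
            if SECRET_PATTERN.search(text)]
```

Because removed and unchanged lines are skipped, a secret that already exists on the main branch is not attributed to the MR under review, which is the class of false result the fix addresses.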
Accomplishments that we're proud of
- Predictive, not Reactive: First GitLab AI Hackathon entry to predict failures before they happen
- 2-Agent Lightning Design: Solved the timeout problem by consolidating 4 agents into 2 without losing analysis depth
- Comprehensive Testing: 74 offline tests + 4 benchmark evaluation scenarios with industry-standard metrics
- Green Agent Design: Strict token and tool call budgets for efficiency
What we learned
- The GitLab Duo Agent Platform is powerful but has many undocumented constraints that require careful testing
- Fewer, smarter agents beat more, specialized agents when platform timeouts are a factor
- Prompt engineering with strict tool budgets and procedural instructions dramatically improves agent reliability
- MR-diff-first analysis is both faster and more accurate than full-repo scanning
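The tool-budget lesson can be made concrete with a generic Python sketch of a hard call limit. According to the write-up, Sentinel enforces its budget through prompt directives on the GitLab Duo Agent Platform, so this wrapper is only an analogy; `ToolBudget` and `ToolBudgetExceeded` are hypothetical names, and the default of 12 mirrors the stated maximum.

```python
from typing import Any, Callable


class ToolBudgetExceeded(RuntimeError):
    """Raised when an agent tries to exceed its tool call budget."""


class ToolBudget:
    """Counts tool invocations and refuses any call past the limit."""

    def __init__(self, max_calls: int = 12) -> None:
        self.max_calls = max_calls
        self.used = 0

    def call(self, tool: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # Check before invoking so the tool never runs once the budget is spent.
        if self.used >= self.max_calls:
            raise ToolBudgetExceeded(
                f"budget of {self.max_calls} tool calls exhausted"
            )
        self.used += 1
        return tool(*args, **kwargs)
```

A caller would route every tool invocation through `budget.call(...)` and treat `ToolBudgetExceeded` as the signal to stop exploring and answer with the evidence gathered so far.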
What's next for GitLab Sentinel
- Auto-Fix Suggestions: Generate MR suggestions that fix detected issues automatically
- Historical Learning: Use BigQuery to learn from past pipeline failures and improve predictions over time
- CI/CD Integration: Trigger Sentinel automatically on every MR via GitLab webhooks
- Custom Rule Engine: Let teams define project-specific risk rules and thresholds
Built With
- gitlab-duo-agent-platform