Inspiration
Every engineering team I've worked in has the same blind spot: the CI/CD pipeline. We obsess over code quality and architecture reviews - but the moment a pipeline fails, someone has to manually dig through logs, file a post-mortem, estimate the blast radius, and figure out why it keeps happening.
That's not a people problem. That's a missing agent problem.
I wanted to build something that watches the pipeline so you don't have to - agents that fire automatically the moment you assign them as reviewers on a Merge Request, each doing one job with surgical precision.
What it does
Pipeline Guardian is a suite of six AI agents on the GitLab Duo Agent Platform that protect your entire software delivery lifecycle:
- Compliance Sentinel - scans MR diffs for hardcoded secrets, PII, CVE risks, and dangerous misconfigurations
- Efficiency Scorer - compares job durations against a 30-run historical baseline with trend confidence scoring
- Post-Mortem Writer - reads failing job logs and posts a structured incident report with severity, root cause, timeline, and action items
- GCP Cost Estimator - calculates compute cost per pipeline run with monthly projections using live GCP Cloud Billing Catalog pricing
- Carbon Auditor - calculates CO2e footprint per job using IEA 2023 regional grid intensity factors and recommends greener regions
- Anomaly Detective - detects duration spikes, persistent failures, and unexpected jobs against a 30-run baseline
All six run Claude Sonnet 4.5 via Vertex AI. Assign any bot as a reviewer - it fires within 60–90 seconds and posts a structured Markdown comment to the MR. No webhooks. No external infrastructure. No scripts to run.
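The 30-run baseline comparison behind the Efficiency Scorer and Anomaly Detective can be sketched as a simple z-score check. This is an illustrative minimal version, not the agents' actual logic; the threshold and return shape are assumptions.

```python
import statistics

def flag_duration_spike(history: list[float], current: float,
                        z_threshold: float = 3.0) -> dict:
    """Compare the current job duration (seconds) against a rolling
    baseline of up to 30 previous runs. Illustrative sketch only."""
    if len(history) < 2:
        # Cold start: not enough runs to compute a spread yet.
        return {"status": "insufficient_data", "runs": len(history)}
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    z = (current - mean) / stdev if stdev else 0.0
    return {
        "status": "spike" if z > z_threshold else "normal",
        "baseline_mean_s": round(mean, 1),
        "z_score": round(z, 2),
    }
```

A real implementation would also track persistent failures and unexpected job names, but the duration check captures the core idea: compare against history, and degrade gracefully when history is thin.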
How we built it
Each agent is a GitLab Duo Agent Platform flow with two chained components: a reader (AgentComponent) that fetches pipeline data through GitLab API tools, and a writer (OneOffComponent) that posts a formatted report back to the MR. The reader's JSON output becomes the writer's only input - keeping each component focused, testable, and independently replaceable.
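The reader → writer handoff can be sketched as follows. The `reader` and `writer` callables are hypothetical stand-ins for the platform's AgentComponent and OneOffComponent, and the required keys are illustrative, not the real schema:

```python
import json

# Illustrative contract: every reader must emit at least these keys.
REQUIRED_KEYS = {"severity", "summary"}

def run_flow(reader, writer, mr_context: dict) -> str:
    """Chain a reader and a writer: the reader's JSON string is parsed,
    validated, and becomes the writer's only input."""
    raw = reader(mr_context)          # fetch pipeline data, analyse it
    findings = json.loads(raw)        # reject non-JSON output early
    missing = REQUIRED_KEYS - findings.keys()
    if missing:
        raise ValueError(f"reader output missing keys: {sorted(missing)}")
    return writer(findings)           # format and post the MR comment
```

Because the writer sees nothing but the validated JSON, either side can be swapped out or tested in isolation.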
A scheduled CI job seeds historical pipeline run data twice daily (06:00 and 18:00 UTC), giving the Efficiency Scorer and Anomaly Detective high-confidence trend data within 10 days. GCP pricing is refreshed daily from the public Cloud Billing Catalog API with a JSON fallback and staleness warning if the API is unavailable. Carbon calculations use IEA 2023 regional grid intensity averages mapped per runner region.
All prompts include explicit prompt injection defence: any instructions found inside CI job logs are ignored, and secrets discovered in logs are redacted as [REDACTED] before posting.
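The redaction step can be sketched with a few patterns. These regexes are illustrative examples of common credential shapes; the agents' actual detection is prompt-driven rather than a fixed regex list:

```python
import re

# Illustrative credential patterns, not an exhaustive or official list.
SECRET_PATTERNS = [
    re.compile(r"glpat-[0-9A-Za-z_\-]{20,}"),          # GitLab access token shape
    re.compile(r"AKIA[0-9A-Z]{16}"),                   # AWS access key ID shape
    re.compile(r"(?i)(password|token|secret)\s*=\s*\S+"),
]

def redact(log_text: str) -> str:
    """Replace anything that looks like a credential with [REDACTED]
    before a log excerpt is ever quoted in an MR comment."""
    for pattern in SECRET_PATTERNS:
        log_text = pattern.sub("[REDACTED]", log_text)
    return log_text
```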
Challenges we ran into
Experiment-status triggers. GitLab Duo's reviewer-assignment triggers are still experiment-status and occasionally fire late or not at all. A demo-fallback.py script posts cached, realistic responses so the demo never blocks on infrastructure flakiness.
Prompt injection from CI logs. A malicious developer could embed adversarial instructions inside job echo statements. Every reader prompt explicitly treats all log content as untrusted data - the same mental model as user input in a web app.
Consistent structured output. Getting Claude to return only a raw JSON object across six different agent contexts - no preamble, no markdown fences - required careful zero-shot prompt engineering with explicit output contracts in every system prompt.
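On the consuming side, the output contract can be enforced defensively. This is a minimal sketch: the contract wording and the fence-stripping repair are assumptions, not the agents' actual system prompts:

```python
import json

# Illustrative output contract embedded in every system prompt.
OUTPUT_CONTRACT = (
    "Respond with a single raw JSON object and nothing else: "
    "no prose, no markdown fences, no preamble."
)

def parse_strict_json(reply: str) -> dict:
    """Parse a model reply that should be a bare JSON object,
    stripping stray markdown fences as a last-resort repair."""
    text = reply.strip()
    if text.startswith("```"):
        text = text.strip("`")
        if text.startswith("json"):
            text = text[4:]  # drop the fence's language tag
    return json.loads(text)
```

Validating rather than trusting the contract means an occasional fenced reply degrades into a repair instead of a broken MR comment.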
Baseline cold start. On day one there's no historical data. The tiered confidence system handles this gracefully, but it meant seeding had to start on day one of the hackathon to have high-confidence outputs ready for submission.
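The tiered confidence handling can be sketched as a simple mapping from baseline size to a confidence label. The thresholds here are illustrative assumptions:

```python
def baseline_confidence(run_count: int) -> str:
    """Map the number of seeded historical runs to a confidence tier.
    Thresholds are illustrative, not the agents' actual cutoffs."""
    if run_count < 5:
        return "low"     # day one: report anyway, but flag the thin baseline
    if run_count < 20:
        return "medium"
    return "high"        # roughly 10 days of twice-daily seeding
```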
Accomplishments that we're proud of
Six fully deployed agents running on live GitLab infrastructure, all triggered by a single reviewer assignment. The Compliance Sentinel detects CRITICAL-severity violations across secrets, PII, and misconfigurations in one pass. The Post-Mortem Writer produces structured incident reports - severity, root cause category, timeline, and action items - from raw CI job logs in under 90 seconds. The Carbon Auditor ties pipeline emissions to real IEA regional grid data, making the environmental cost of every CI run visible and comparable.
What we learned
GitLab Duo Agent Platform flows are a powerful primitive for chaining analysis agents with action components. The reader → writer pattern keeps each piece small and focused. Ambient environment flows fire reliably on reviewer assignment with zero external infrastructure - no Lambda, no webhook server, nothing to operate.
Prompt injection defence is not optional when agents read CI job logs. Treating all log content as untrusted is the correct default. Six agents sharing the same MR context also forced a discipline around separation of concerns at the agent level - each agent owns exactly one domain and never duplicates another's tool calls.
What's next for Pipeline-Guardian
- Data Contract Validator - blocks merges that introduce breaking ETL schema changes against registered data contracts
- Merge gates - Compliance Sentinel blocks the MR on CRITICAL findings, not just warns
- Notification routing - HIGH+ severity post-mortems trigger Slack/Teams alerts
- Cross-MR trend analysis - detect recurring failure classes across multiple MRs, not just single pipeline runs
Built With
- anthropic
- claude-sonnet-4.5
- gcp-cloud-billing-api
- gitlab-ci/cd
- gitlab-duo-agent-platform
- python
- vertex-ai
- yaml