Inspiration
Every CI/CD pipeline has an invisible cost: wasted compute, redundant downloads, unnecessary runs. A pipeline that wastes 5 minutes per run across 50 daily runs burns over 1,500 hours of compute per year (5 min × 50 runs × 365 days ≈ 1,520 hours) doing nothing useful. Most teams never measure this because the waste is hidden inside job logs and pipeline durations that nobody reviews.
We asked: what if an AI agent could audit your pipeline's sustainability the same way a senior DevOps engineer would — reading the config, checking real run data, profiling the infrastructure — and then fix what it finds?
What it does
GreenCI is a two-tier CI/CD sustainability platform built on the GitLab Duo Agent Platform:
Tier 1 (Project Auditor): A custom flow triggered by @-mentioning it on any issue or MR. It reads the .gitlab-ci.yml, fetches real pipeline run data via the GitLab API, and maps runners to GCP instance types using Cloud Carbon Footprint coefficients. It then scores the pipeline across 6 categories (Cache Efficiency, Job Parallelization, Smart Filtering, Image Optimization, Artifact Management, Pipeline Reliability), posts a structured Green Score (A–F) audit report as a comment, and applies a matching scoped label (for example, green-score::F) for tracking. Finally, it creates a merge request with an optimized .gitlab-ci.yml, with each fix annotated by a # ♻️ Green: comment explaining the sustainability benefit.
Tier 2 (Org Dashboard): A second flow that audits every project in a GitLab group and produces a ranked leaderboard. It uses calibrated scoring anchors for consistent cross-project comparison, identifies the best and worst projects, surfaces the most common waste patterns, and recommends the highest-impact quick wins across the entire organization.
Bonus (Interactive Analyzer): A standalone agent for GitLab Duo Chat with the same scoring methodology and infrastructure-aware profiling, for on-demand conversational audits.
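To make the Green Score step concrete, here is a minimal Python sketch of how six category scores could roll up into a 100-point total, a letter grade, and a scoped label. The category names come from the description above. The per-category maximums, the grade cut-offs, and the helper names `green_score` and `score_label` are assumptions for illustration, not GreenCI's actual values.

```python
# Category names are from the write-up. The maximum points per category
# are ASSUMED for illustration and simply chosen to sum to 100.
CATEGORY_MAX = {
    "Cache Efficiency": 20,
    "Job Parallelization": 20,
    "Smart Filtering": 15,
    "Image Optimization": 15,
    "Artifact Management": 15,
    "Pipeline Reliability": 15,
}

# Hypothetical grade cut-offs (minimum total, letter), highest first.
GRADE_CUTOFFS = [(90, "A"), (75, "B"), (60, "C"), (40, "D"), (0, "F")]


def green_score(category_points):
    """Sum per-category points, clamped to each maximum, and pick a grade."""
    total = 0
    for name, maximum in CATEGORY_MAX.items():
        points = category_points.get(name, 0)
        total += max(0, min(points, maximum))
    for cutoff, grade in GRADE_CUTOFFS:
        if total >= cutoff:
            return total, grade
    return total, "F"


def score_label(grade):
    """Build a scoped label in the green-score::<grade> form."""
    return f"green-score::{grade}"
```

Clamping each category to its maximum keeps a single over-scored category from inflating the total, which matters when the per-category numbers come from a model rather than a parser.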
How we built it
- Platform: GitLab Duo Agent Platform — custom flows and custom agents
- Model: Anthropic Claude Sonnet (default GitLab Duo model)
- Tools: 17 built-in GitLab tools, including gitlab_api_get, gitlab_graphql, create_commit, create_merge_request, and update_issue
- Energy methodology: Runner metadata from the GitLab API mapped to GCP instance profiles (n1-standard-1/2/4), with energy calculated using Cloud Carbon Footprint v1.15 coefficients and EPA eGRID PJM region grid intensity (379 gCO₂/kWh)
- Scoring: 100-point Pipeline Efficiency Score across 6 categories, aligned with Green Software Foundation principles (energy efficiency, hardware efficiency, carbon awareness). The org dashboard uses calibrated scoring anchors — specific score values mapped to specific technical observations — for consistent cross-project ranking.
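The energy methodology above can be sketched as a small calculation. The following Python example assumes the common Cloud Carbon Footprint style of compute estimate (average watts interpolated between idle and full-load power, scaled by vCPU hours and PUE). The grid intensity and machine types are taken from the list above. The wattage figures, utilization, and PUE are placeholder assumptions, and the function names are hypothetical.

```python
# vCPU counts for the n1-standard machine types named in the write-up.
N1_VCPUS = {"n1-standard-1": 1, "n1-standard-2": 2, "n1-standard-4": 4}

# Grid intensity quoted in the write-up (gCO2 per kWh).
GRID_INTENSITY_G_PER_KWH = 379.0

# ASSUMED placeholders: substitute the coefficients you actually use.
MIN_WATTS_PER_VCPU = 0.71   # idle power per vCPU
MAX_WATTS_PER_VCPU = 4.26   # full-load power per vCPU
AVG_UTILIZATION = 0.5       # assumed average CPU utilization
PUE = 1.1                   # assumed data-centre power usage effectiveness


def job_energy_kwh(machine_type, duration_seconds):
    """Estimate the energy of one CI job in kWh."""
    vcpus = N1_VCPUS[machine_type]
    # Linear interpolation between idle and full-load power.
    watts_per_vcpu = MIN_WATTS_PER_VCPU + AVG_UTILIZATION * (
        MAX_WATTS_PER_VCPU - MIN_WATTS_PER_VCPU
    )
    hours = duration_seconds / 3600.0
    return watts_per_vcpu * vcpus * hours * PUE / 1000.0


def job_emissions_g(machine_type, duration_seconds):
    """Convert the energy estimate to grams of CO2 using grid intensity."""
    return job_energy_kwh(machine_type, duration_seconds) * GRID_INTENSITY_G_PER_KWH
```

With these placeholder values, a one-hour job on n1-standard-2 comes out at roughly 0.0055 kWh, or about 2 g of CO₂. The result is an estimate from published coefficients, not a measurement.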
Challenges we ran into
- Multi-agent flow chaining: Our original two-agent architecture (Analyzer → Reporter) failed with an instant WebSocket close and approved_tools=[] in the executor logs. Custom multi-agent flows appear to have a tool authorization issue in the current platform. We solved this by consolidating into a single-agent flow that executes all three phases (analyze → report → fix MR) sequentially.
- Hackathon sandbox permissions: The flow's service account in the hackathon group had insufficient_scope errors, preventing flow execution. We developed and tested in a private GitLab Ultimate group, then pushed the final YAML to the hackathon project.
- LLM non-determinism: The same pipeline would score 18/100 one run and 23/100 the next. We addressed this with calibrated scoring anchors in the org dashboard, mapping specific technical patterns to specific score values rather than letting the model freestyle.
- Duplicate comment posting: The model would sometimes post the audit report twice with different scores. Fixed by adding explicit "EXACTLY ONCE" and "STOP IMMEDIATELY" instructions at key phase boundaries.
- Branch collision: The fix MR creates a green-audit/optimize-pipeline branch. On subsequent runs, the commit fails because the branch already exists. This needs a timestamp suffix or branch cleanup in future iterations.
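The timestamp-suffix idea for the branch collision could look something like the Python sketch below. The base branch name is the one mentioned above. The suffix format and the helper name `unique_branch_name` are assumptions, and in a prompt-driven flow the same naming rule would have to be expressed in the flow's instructions rather than in code.

```python
from datetime import datetime, timezone
from typing import Optional

# Base branch name from the write-up.
BASE_BRANCH = "green-audit/optimize-pipeline"


def unique_branch_name(now: Optional[datetime] = None) -> str:
    """Append a UTC timestamp so repeated audits do not reuse a branch name."""
    moment = now or datetime.now(timezone.utc)
    # Hypothetical suffix format: YYYYMMDD-HHMMSS.
    return f"{BASE_BRANCH}-{moment.strftime('%Y%m%d-%H%M%S')}"
```

A second-resolution timestamp is enough to separate manually triggered audits. Two runs started within the same second would still collide, so branch cleanup remains a useful complement.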
Accomplishments that we're proud of
- Infrastructure-aware energy profiling that reads actual runner metadata and maps to published GCP energy coefficients — not hardcoded fake numbers
- End-to-end autonomous execution: one @-mention triggers analysis, scoring, report posting, label application, and fix MR creation with zero manual steps
- The org dashboard leaderboard: to our knowledge, no other submission in the hackathon does cross-project sustainability comparison
- Honest methodology: We measure observable configuration quality and runtime behavior, cite our data sources (EPA eGRID, CCF), and disclose limitations. No fake carbon math.
What we learned
- The GitLab Duo Agent Platform is powerful but the flow execution engine has rough edges — multi-agent chaining, tool authorization, and timeout management all required workarounds
- Prompt engineering for autonomous flows is fundamentally different from chat — the model's instinct to ask for confirmation ("Would you like me to proceed?") must be explicitly overridden
- Scoring consistency in LLM-based tools requires anchoring — vague rubrics produce vague scores
- The Green Software Foundation's SCI specification is the right framework conceptually, but honest implementation requires energy telemetry data that GitLab shared runners don't expose
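The anchoring lesson can be illustrated with a short Python sketch: each observable pattern maps to one fixed score, so the same configuration always gets the same number. In GreenCI the anchors live in the flow's prompt, so this is only a code analogue. The observation fields, the anchor values, and the name `cache_efficiency_score` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheObservation:
    """Observable facts about caching in a pipeline config (hypothetical fields)."""
    has_cache: bool
    keyed_on_lockfile: bool
    uses_pull_policy: bool


def cache_efficiency_score(obs: CacheObservation) -> int:
    """Return a fixed anchor score for each observed pattern.

    The anchor values (0, 8, 16, 20) are illustrative assumptions.
    """
    if not obs.has_cache:
        return 0    # anchor: no cache configured at all
    if not obs.keyed_on_lockfile:
        return 8    # anchor: cache present but not keyed on a lockfile
    if not obs.uses_pull_policy:
        return 16   # anchor: keyed cache, every job still pushes it
    return 20       # anchor: keyed cache with pull-only consumers
```

Because the score depends only on a handful of yes/no observations, two runs can disagree only if they disagree about an observation, which is much easier to spot and correct than a drifting number.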
What's next for GreenCI
- Before/after scoring: Merge the fix MR, re-run the auditor, and show the actual score improvement
- Historical tracking: Store scores over time (via issue labels or a wiki page) to show sustainability trends
- Pipeline event triggers: When GitLab enables pipeline-completion triggers for custom flows (currently experimental), GreenCI could audit every pipeline automatically
- Kepler/Scaphandre integration: For self-managed runners with energy telemetry, calculate actual SCI scores instead of estimates
- Scoring anchor refinement: Expand the anchor system to the project auditor for fully deterministic scoring
Built With
- anthropic
- ci/cd
- cloudcarbonfootprint
- epaegrid
- gitlab
- gitlabduoagent
- yaml