GreenCI

Inspiration

Every CI/CD pipeline has an invisible cost — wasted compute, redundant downloads, unnecessary runs. A pipeline that wastes 5 minutes per run across 50 daily runs burns over 1,500 hours of compute per year doing nothing useful. Most teams never measure this because the waste is hidden inside job logs and pipeline durations that nobody reviews.
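The figure above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the helper name is hypothetical, and it assumes pipelines run all 365 days of the year (counting roughly 250 working days instead would give about 1,040 hours).

```python
def wasted_hours_per_year(minutes_per_run: float, runs_per_day: int,
                          days_per_year: int = 365) -> float:
    """Hours of compute wasted per year (365-day year is an assumption)."""
    return minutes_per_run * runs_per_day * days_per_year / 60


# 5 wasted minutes per run, 50 runs per day
print(round(wasted_hours_per_year(5, 50)))  # → 1521
```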

We asked: what if an AI agent could audit your pipeline's sustainability the same way a senior DevOps engineer would — reading the config, checking real run data, profiling the infrastructure — and then fix what it finds?

What it does

GreenCI is a two-tier CI/CD sustainability platform built on the GitLab Duo Agent Platform:

Tier 1 — Project Auditor: A custom flow triggered by an @-mention on any issue or MR. It reads the .gitlab-ci.yml, fetches real pipeline run data via the GitLab API, and maps runners to GCP instance types using Cloud Carbon Footprint coefficients. It then scores the pipeline across 6 categories (Cache Efficiency, Job Parallelization, Smart Filtering, Image Optimization, Artifact Management, Pipeline Reliability) and posts a structured Green Score (A–F) audit report as a comment. Finally, it applies a grade label such as green-score::F for tracking and creates a merge request with an optimized .gitlab-ci.yml, with each fix annotated by a # ♻️ Green: comment explaining the sustainability benefit.
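The step from a numeric score to a letter grade and a tracking label can be pictured as a small lookup. The sketch below is illustrative only: the grade cutoffs (90/80/70/60) are an assumption, since this write-up does not state where each grade begins, and `grade_for` and `label_for` are hypothetical helper names.

```python
# Assumed cutoffs; the actual GreenCI thresholds are not published here.
GRADE_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def grade_for(score: int) -> str:
    """Map a 0-100 score to a letter grade using the assumed cutoffs."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for cutoff, grade in GRADE_CUTOFFS:
        if score >= cutoff:
            return grade
    return "F"


def label_for(score: int) -> str:
    """Build the scoped label name in the green-score::<grade> form."""
    return f"green-score::{grade_for(score)}"


print(label_for(23))  # → green-score::F
```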

Tier 2 — Org Dashboard: A second flow that audits every project in a GitLab group and produces a ranked leaderboard. It uses calibrated scoring anchors for consistent cross-project comparison, identifies the best and worst projects, surfaces the most common waste patterns, and recommends the highest-impact quick wins across the entire organization.
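The leaderboard logic described above amounts to sorting per-project results and counting recurring waste patterns. A minimal sketch, assuming each audit has already produced a score and a list of pattern names; the `ProjectAudit` type, the `build_leaderboard` helper, and the sample data are all made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ProjectAudit:
    name: str
    score: int  # 0-100 Pipeline Efficiency Score
    waste_patterns: list = field(default_factory=list)


def build_leaderboard(audits, top_patterns=3):
    """Rank projects by score and surface the most common waste patterns."""
    if not audits:
        raise ValueError("no audits to rank")
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(audits, key=lambda a: (-a.score, a.name))
    # Count each pattern at most once per project.
    counts = Counter(p for a in audits for p in sorted(set(a.waste_patterns)))
    return {
        "ranked": [(a.name, a.score) for a in ranked],
        "best": ranked[0].name,
        "worst": ranked[-1].name,
        "common_patterns": counts.most_common(top_patterns),
    }


# Made-up sample data, not real audit results.
sample = [
    ProjectAudit("api", 72, ["no-cache"]),
    ProjectAudit("web", 23, ["no-cache", "latest-image-tag"]),
    ProjectAudit("docs", 55, ["no-rules-filter"]),
]
board = build_leaderboard(sample)
print(board["best"], board["worst"])  # → api web
```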

Bonus — Interactive Analyzer: A standalone agent for GitLab Duo Chat with the same scoring methodology and infrastructure-aware profiling, for on-demand conversational audits.

How we built it

Platform: GitLab Duo Agent Platform — custom flows and custom agents
Model: Anthropic Claude Sonnet (default GitLab Duo model)
Tools: 17 built-in GitLab tools including gitlab_api_get, gitlab_graphql, create_commit, create_merge_request, update_issue
Energy methodology: Runner metadata from the GitLab API mapped to GCP instance profiles (n1-standard-1/2/4), with energy calculated using Cloud Carbon Footprint v1.15 coefficients and EPA eGRID PJM region grid intensity (379 gCO₂/kWh)
Scoring: 100-point Pipeline Efficiency Score across 6 categories, aligned with Green Software Foundation principles (energy efficiency, hardware efficiency, carbon awareness). The org dashboard uses calibrated scoring anchors — specific score values mapped to specific technical observations — for consistent cross-project ranking.
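As a rough illustration of the energy methodology listed above, the sketch below estimates energy and emissions for a single job. Only the instance types and the 379 gCO₂/kWh grid intensity come from this write-up. The per-vCPU wattage, PUE, and 50% utilization figures are assumptions chosen for the example in the style of Cloud Carbon Footprint's GCP coefficients, and should be checked against the CCF release in use. `estimate_job` is a hypothetical helper name.

```python
# vCPU counts for the GCP n1-standard machine types named above.
VCPUS = {"n1-standard-1": 1, "n1-standard-2": 2, "n1-standard-4": 4}

# Assumed values for illustration; verify against the CCF coefficients you use.
MIN_WATTS_PER_VCPU = 0.71
MAX_WATTS_PER_VCPU = 4.26
PUE = 1.1
ASSUMED_UTILIZATION = 0.5

GRID_INTENSITY_G_PER_KWH = 379  # grid intensity quoted in this write-up


def estimate_job(instance_type: str, duration_seconds: float,
                 utilization: float = ASSUMED_UTILIZATION):
    """Return (energy_kwh, co2_grams) for one job on the given instance type."""
    vcpus = VCPUS[instance_type]
    # Linear interpolation between idle and full-load power draw.
    watts_per_vcpu = MIN_WATTS_PER_VCPU + utilization * (
        MAX_WATTS_PER_VCPU - MIN_WATTS_PER_VCPU)
    hours = duration_seconds / 3600
    energy_kwh = watts_per_vcpu * vcpus * hours * PUE / 1000
    return energy_kwh, energy_kwh * GRID_INTENSITY_G_PER_KWH


kwh, grams = estimate_job("n1-standard-2", 3600)
print(f"{kwh:.6f} kWh, {grams:.2f} gCO2")
```

Under these assumptions a one-hour job on n1-standard-2 comes out at about 0.0055 kWh, or roughly 2 g of CO₂, which shows why the estimate only becomes meaningful when summed over many runs.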

Challenges we ran into

Multi-agent flow chaining: Our original two-agent architecture (Analyzer → Reporter) failed with an instant WebSocket close and approved_tools=[] in the executor logs. Custom multi-agent flows appear to have a tool authorization issue in the current platform. We solved this by consolidating into a single-agent flow that executes all three phases (analyze → report → fix MR) sequentially.
Hackathon sandbox permissions: The flow's service account in the hackathon group returned insufficient_scope errors, which prevented the flow from executing. We developed and tested in a private GitLab Ultimate group, then pushed the final YAML to the hackathon project.
LLM non-determinism: The same pipeline would score 18/100 one run and 23/100 the next. We addressed this with calibrated scoring anchors in the org dashboard — mapping specific technical patterns to specific score values rather than letting the model freestyle.
Duplicate comment posting: The model would sometimes post the audit report twice with different scores. Fixed by adding explicit "EXACTLY ONCE" and "STOP IMMEDIATELY" instructions at key phase boundaries.
Branch collision: The fix MR creates a green-audit/optimize-pipeline branch. On subsequent runs, the commit fails because the branch already exists. This needs a timestamp suffix or branch cleanup in future iterations.
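One way to implement the timestamp-suffix idea for the branch collision is sketched below. `unique_branch_name` is a hypothetical helper, not part of GreenCI, and the suffix format is an assumption. Two runs within the same second would still collide, so a cleanup step or a random suffix may also be needed.

```python
from datetime import datetime, timezone

BASE_BRANCH = "green-audit/optimize-pipeline"


def unique_branch_name(now=None, base=BASE_BRANCH):
    """Append a UTC timestamp so repeated runs do not reuse a branch name."""
    moment = now if now is not None else datetime.now(timezone.utc)
    return f"{base}-{moment.strftime('%Y%m%d-%H%M%S')}"


fixed = datetime(2026, 3, 25, 15, 43, 0, tzinfo=timezone.utc)
print(unique_branch_name(fixed))  # → green-audit/optimize-pipeline-20260325-154300
```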

Accomplishments that we're proud of

Infrastructure-aware energy profiling that reads actual runner metadata and maps to published GCP energy coefficients — not hardcoded fake numbers
End-to-end autonomous execution: one @-mention triggers analysis, scoring, report posting, label application, and fix MR creation with zero manual steps
The org dashboard leaderboard: to our knowledge, no other submission in the hackathon does cross-project sustainability comparison
Honest methodology: We measure observable configuration quality and runtime behavior, cite our data sources (EPA eGRID, CCF), and disclose limitations. No fake carbon math.

What we learned

The GitLab Duo Agent Platform is powerful, but the flow execution engine has rough edges: multi-agent chaining, tool authorization, and timeout management all required workarounds
Prompt engineering for autonomous flows is fundamentally different from chat — the model's instinct to ask for confirmation ("Would you like me to proceed?") must be explicitly overridden
Scoring consistency in LLM-based tools requires anchoring — vague rubrics produce vague scores
The Green Software Foundation's SCI specification is the right framework conceptually, but honest implementation requires energy telemetry data that GitLab shared runners don't expose
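The anchoring lesson above can be pictured as a lookup from a specific technical observation to a fixed score, so the same observation always yields the same points. The sketch below is purely illustrative: the anchor names, the point values, and the single category shown are invented, since this write-up does not list GreenCI's actual anchors.

```python
# Hypothetical anchors for one category (Cache Efficiency).
CACHE_ANCHORS = {
    "no_cache_configured": 0,
    "cache_without_key": 5,
    "cache_keyed_on_lockfile": 15,
}


def score_category(observation: str, anchors: dict) -> int:
    """Return the fixed score for an observation; reject unknown observations."""
    try:
        return anchors[observation]
    except KeyError:
        raise ValueError(f"no anchor for observation {observation!r}") from None


def total_score(category_scores: dict) -> int:
    """Sum per-category scores and check the 100-point ceiling."""
    total = sum(category_scores.values())
    if not 0 <= total <= 100:
        raise ValueError(f"total out of range: {total}")
    return total


print(score_category("cache_without_key", CACHE_ANCHORS))  # → 5
```

Rejecting unknown observations, rather than guessing a score, is the design choice that keeps repeated runs from drifting.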

What's next for GreenCI

Before/after scoring: Merge the fix MR, re-run the auditor, and show the actual score improvement
Historical tracking: Store scores over time (via issue labels or a wiki page) to show sustainability trends
Pipeline event triggers: When GitLab enables pipeline-completion triggers for custom flows (currently experimental), GreenCI could audit every pipeline automatically
Kepler/Scaphandre integration: For self-managed runners with energy telemetry, calculate actual SCI scores instead of estimates
Scoring anchor refinement: Expand the anchor system to the project auditor for fully deterministic scoring

Built With

anthropic
ci/cd
cloudcarbonfootprint
epaegrid
gitlab
gitlabduoagent
yaml

Updates

Private user started this project — Mar 25, 2026 11:43 AM EDT
