Inspiration
In early 2026, the conversation around compute emissions stopped being abstract.
On March 5, 2026, NOAA updated its global atmospheric CO2 record and reported December 2025 at 427.35 ppm, up from 425.21 ppm a year earlier. In January 2026, the International Energy Agency published its Energy and AI report showing that data centres already used about 415 TWh of electricity in 2024, roughly 1.5% of global electricity demand, and could more than double to around 945 TWh by 2030. The same report warned that electricity-related CO2 emissions from data centres could rise sharply this decade.
That matters because most software teams still do not see compute waste where it actually starts: in everyday engineering decisions. Poor Docker layer ordering, weak caching, and inefficient build paths quietly burn GitLab CI/CD minutes over and over again, but that waste is rarely translated into something actionable inside the merge request.
We built GitLab CI/CD Savings Intelligence: Precise Carbon Audit to close that gap. Instead of offering a vague sustainability score, @carbon-audit pinpoints the Docker build path changed in a merge request, measures the CI time being wasted, estimates the carbon impact, and then verifies the savings after the optimization lands. We wanted sustainability to show up not as branding, but as engineering evidence.
What it does
When a merge request changes a Dockerfile, @carbon-audit analyzes whether that Docker build path is likely wasting GitLab CI/CD compute because of poor caching and inefficient layer structure.
The flow first identifies the merge request source and target branches, then inspects the merge request diff to find touched Dockerfiles. From there, it reads the CI configuration, resolves local includes, and conservatively maps the Dockerfile change to the relevant Docker build job. If the mapping is ambiguous, it skips instead of guessing.
Once it has a reliable match, the flow compares recent successful pipelines for the target branch and source branch and classifies the result into one of two outcomes.
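The Dockerfile-detection step can be sketched in a few lines. The `touched_dockerfiles` helper below is illustrative, not the project's actual code; the sample entries mirror the `old_path`/`new_path` fields that GitLab's merge request diffs API returns.

```python
import re

# Match "Dockerfile" anywhere in the tree, plus variants like
# "Dockerfile.prod" or "Dockerfile-base".
DOCKERFILE_RE = re.compile(r"(^|/)Dockerfile([._-].+)?$")

def touched_dockerfiles(diff_entries):
    """Collect Dockerfiles touched by a merge request, given diff
    entries shaped like GitLab's merge request diffs API response."""
    paths = set()
    for entry in diff_entries:
        for key in ("old_path", "new_path"):
            path = entry.get(key)
            if path and DOCKERFILE_RE.search(path):
                paths.add(path)
    return sorted(paths)

# Example diff entries, shaped like the API response:
diffs = [
    {"old_path": "Dockerfile", "new_path": "Dockerfile"},
    {"old_path": "src/app.py", "new_path": "src/app.py"},
    {"old_path": "services/api/Dockerfile.prod",
     "new_path": "services/api/Dockerfile.prod"},
]
print(touched_dockerfiles(diffs))  # ['Dockerfile', 'services/api/Dockerfile.prod']
```

Renamed Dockerfiles are caught as well, because both the old and new paths are checked.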
POTENTIAL_SAVINGS shows the avoidable waste before the optimization lands. It estimates how much CI time is being left on the table, extrapolates a likely yearly run frequency from recent pipeline spacing, and converts the wasted build minutes on a standard GitLab runner into a concrete CO2 estimate.
SAVINGS shows measured improvement after the optimization has been applied. It uses the same calculation path, but now reports how much CI time and CO2 were actually avoided by the improved Docker caching strategy.
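As a rough illustration of the shared conversion step, the sketch below annualizes wasted build minutes and applies a simple power-and-grid-intensity model. All three constants (runner power draw, PUE, grid carbon intensity) are our assumptions for this example, not values taken from the project.

```python
# Illustrative minutes-to-CO2 conversion. The constants are assumed
# figures for a small cloud runner VM, not measured values:
RUNNER_POWER_KW = 0.012      # assumed average draw of a small runner
PUE = 1.5                    # assumed data-centre power usage effectiveness
GRID_G_CO2_PER_KWH = 475.0   # assumed average grid carbon intensity

def co2_grams(wasted_minutes_per_run, runs_per_year):
    """Convert annualized wasted build minutes into grams of CO2."""
    hours = wasted_minutes_per_run * runs_per_year / 60.0
    kwh = hours * RUNNER_POWER_KW * PUE
    return kwh * GRID_G_CO2_PER_KWH

# e.g. 3 wasted minutes per build at ~500 builds a year:
print(co2_grams(3, 500))  # ≈ 214 g CO2 per year
```

Keeping the model this small is deliberate: every factor in the estimate can be stated, challenged, and tuned.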
In both modes, the output is written for real decision-making: concise, evidence-backed, and posted directly where developers already work.
How we built it
We built this as a GitLab-native flow designed around real merge request context rather than a detached reporting system.
The flow begins by extracting merge request metadata and identifying the source and target branches. It then detects Dockerfile changes from the merge request diff and reads the source branch CI configuration, including local includes, to understand the actual build path in use. That lets the flow narrow its analysis to the Docker build job that matters for the Dockerfile under review.
We intentionally designed the matching logic to be conservative. The flow only proceeds when it can infer one consistent Docker build job across the compared pipelines. This was important because we wanted the carbon claim to feel credible, not decorative.
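That skip-instead-of-guess rule reduces to a small invariant: proceed only when the compared pipelines agree on exactly one candidate build job. A minimal sketch, with a hypothetical `resolve_build_job` helper standing in for the real CI-config matching:

```python
def resolve_build_job(candidates_per_pipeline):
    """Return the single Docker build job name consistent across all
    compared pipelines, or None to signal 'skip, too ambiguous'.
    (Hypothetical helper; the real flow walks the CI configuration.)"""
    if not candidates_per_pipeline:
        return None
    common = set(candidates_per_pipeline[0])
    for candidates in candidates_per_pipeline[1:]:
        common &= set(candidates)
    return common.pop() if len(common) == 1 else None

print(resolve_build_job([["build-image"], ["build-image"]]))  # build-image
# Two plausible jobs in every pipeline -> ambiguous -> skip:
print(resolve_build_job([["build-image", "build-docs"],
                         ["build-image", "build-docs"]]))     # None
```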
After that, the flow gathers recent successful pipelines for the baseline branch and the merge request source branch. It computes timing differences, estimates build-minute waste or savings, annualizes the impact based on recent run intervals, and converts the result into CO2 using a standard runner-size model. The final result is published back into the merge request as either a potential-savings or measured-savings comment.
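The comparison and annualization steps might look like this in outline; the helper names and the median-based estimators are illustrative choices under stated assumptions, not a documented implementation:

```python
from datetime import datetime
from statistics import median

def wasted_minutes(baseline_durations_s, source_durations_s):
    """Median duration difference between baseline and source-branch
    pipelines, in minutes. Medians damp noisy runtime variance."""
    return (median(baseline_durations_s) - median(source_durations_s)) / 60.0

def runs_per_year(timestamps):
    """Extrapolate a yearly run frequency from recent pipeline spacing
    (mean gap between consecutive ISO-8601 pipeline timestamps)."""
    ts = sorted(datetime.fromisoformat(t) for t in timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]
    mean_gap_days = (sum(gaps) / len(gaps)) / 86400.0
    return 365.0 / mean_gap_days

# Baseline builds ~9 min, optimized builds ~6 min, roughly daily runs:
print(wasted_minutes([540, 555, 530], [360, 370, 350]))  # 3.0 minutes per run
print(round(runs_per_year(["2026-03-01T10:00:00",
                           "2026-03-02T10:00:00",
                           "2026-03-03T10:00:00"])))     # 365
```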
The product experience is simple, but the implementation is deliberately disciplined: merge request context, CI config parsing, job-path matching, historical pipeline comparison, conservative estimation, and a user-facing output that stays within GitLab flow size constraints.
Challenges we ran into
The hardest challenge was credibility.
A lot of sustainability tooling makes broad claims with weak grounding. We did not want to say “your CI/CD is greener” unless we could tie the statement to a specific Dockerfile change, a specific build path, and a defensible comparison. That forced us to solve a much harder problem than generic pipeline scoring: merge-request-scoped attribution.
Another challenge was avoiding overclaiming. A repository can have multiple Dockerfiles, multiple build jobs, includes, branch-specific behavior, and noisy runtime variance. We chose to be strict. If the flow cannot confidently map the changed Dockerfile to one consistent CI job, it skips instead of inventing certainty. That improves trust, even though it reduces how often we can comment.
We also had to make the carbon math useful without turning the merge request comment into a wall of numbers. The output needed to be short enough for GitLab flow constraints, simple enough for judges to understand quickly, and still rigorous enough to feel like engineering evidence rather than branding.
Accomplishments that we're proud of
We are proud that this project makes sustainability concrete inside an everyday engineering workflow. Instead of asking teams to adopt another dashboard, we surface waste directly in the merge request where the Docker build is being changed.
We are also proud of the two-mode design. Most tools stop at “here is waste.” Our flow goes further by supporting both estimated avoidable waste and measured realized savings. That creates a much stronger story for developers, maintainers, and judges because it closes the loop from diagnosis to proof.
Another accomplishment is the decision to prioritize trustworthy output over maximum coverage. By requiring a conservative Dockerfile-to-job match and skipping ambiguous cases, we made the system more defensible. For a Green Agent submission, that matters. The value is not just that we mention carbon; it is that we quantify it in a way that feels earned.
Finally, we believe the project stands out because it is both specific and broadly relevant. Docker builds are everywhere. Build caching mistakes are common. GitLab CI/CD minutes are expensive. Carbon is increasingly part of engineering accountability. This flow brings all of that together in a narrow, understandable, and demo-friendly experience.
What we learned
We learned that the most compelling sustainability automation is not a separate sustainability product. It is operational tooling that helps developers make a better engineering decision in the moment.
We also learned that precision beats coverage. A narrow claim with strong evidence is more valuable than a broad claim with weak attribution. In practice, that meant focusing on merge-request-level Docker build paths instead of pretending to audit all CI/CD behavior equally well.
Another key lesson was that developers respond better to time and efficiency than to abstract environmental messaging. The strongest framing is faster builds, fewer wasted runner minutes, and better caching. Carbon becomes more persuasive when it is presented as the measurable consequence of a concrete optimization, not as the headline by itself.
What's next for GitLab CI/CD Savings Intelligence: Precise Carbon Audit
The next step is to move from precise reporting to precise remediation.
Today, @carbon-audit identifies Docker caching waste and quantifies its impact. The next version will generate safe, reviewable optimization proposals for common Dockerfile issues such as poor layer ordering, oversized build contexts, missing cache-friendly dependency steps, and related CI build configuration improvements. That would turn the flow from a high-signal audit into a full trigger-to-fix sustainability workflow.
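One of the simplest issues a remediation step could target is the classic cache-breaking pattern of copying the whole build context before installing dependencies. The heuristic below is a hypothetical sketch of such a check, not a planned implementation:

```python
def flags_cache_unfriendly_copy(dockerfile_text):
    """Flag the common caching mistake: 'COPY . ...' before the
    dependency install, which invalidates the dependency layer on
    every source change. (Illustrative heuristic only.)"""
    copied_all = False
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if line.upper().startswith("COPY . "):
            copied_all = True
        if copied_all and "pip install" in line:
            return True
    return False

bad = ("FROM python:3.12-slim\n"
       "COPY . /app\n"
       "RUN pip install -r /app/requirements.txt\n")
good = ("FROM python:3.12-slim\n"
        "COPY requirements.txt /app/\n"
        "RUN pip install -r /app/requirements.txt\n"
        "COPY . /app\n")
print(flags_cache_unfriendly_copy(bad))   # True
print(flags_cache_unfriendly_copy(good))  # False
```

The fix that `good` demonstrates (copy only the dependency manifest first) is exactly the kind of safe, reviewable proposal the next version could attach to its audit comment.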
We also want to improve historical modeling so teams can tune the estimation assumptions for their own runner types, frequency patterns, and carbon-intensity preferences while preserving the same conservative reporting style.
Longer term, we see this becoming a practical sustainability layer for GitLab CI/CD: not a generic green score, but a focused intelligence system that helps engineering teams prove where compute is being wasted, fix it at the merge request level, and track savings over time.
Built With
- claude-on-gcp
- duo
- gcp
- gitlab
- gitlab-duo