GreenPipe: AI CI/CD Carbon Analyzer

Inspiration

CI/CD pipelines run millions of times daily. A 2025 study (arxiv:2510.26413) estimated that GitHub Actions alone produced 456.9 metric tons of CO₂e in 2024, equivalent to the carbon that roughly 7,600 trees capture in a year. Our analysis of 200 real GitLab pipelines revealed that 76% have no caching, 88% lack interruptible flags, and the average pipeline wastes over 35% of its energy on redundant work.

A systematic mapping study of 92 papers (MDPI 2025) found zero research on AI-driven carbon optimization of CI/CD. Tools exist to measure (Eco-CI) or shift regions (CarbonRunner), but nothing automatically measures AND fixes the problem.

GreenPipe was born from this gap: what if an AI agent could analyze any pipeline, calculate its carbon footprint using a real ISO standard, and create a merge request with the fix — all from a single comment?

What it does

GreenPipe is a GitLab Duo AI agent that performs the complete green optimization loop:

Measure → Analyze → Fix → Validate → Commit → MR

Triggered by mentioning @greenpipe on any issue
Discovers project context dynamically (works on any GitLab project)
Reads .gitlab-ci.yml, carbon budget config, and benchmark data
Calculates SCI score using ISO/IEC 21031:2024: \( \text{SCI} = \frac{(E \times I) + M}{R} \)
Detects 13 waste patterns (no cache, oversized images, redundant installs, etc.)
Benchmarks against our dataset of 200 real GitLab pipelines
Generates optimized .gitlab-ci.yml and validates with ci_linter
Creates a branch, commits the fix, and opens a merge request with the full SCI report
Accounts for its own carbon cost: 1.5 gCO₂e per analysis vs 390 gCO₂e/month in savings, a 260× return (390 / 1.5)
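
To make the arithmetic concrete, here is a minimal sketch of the SCI formula and the self-cost ratio from the list above. The function names and the sample values for E, I, M, and R are illustrative assumptions, not GreenPipe's actual code or measurements. Only the 1.5 gCO₂e and 390 gCO₂e/month figures come from this writeup.

```python
def sci_score(energy_kwh: float, intensity_g_per_kwh: float,
              embodied_g: float, functional_units: float) -> float:
    """SCI = ((E * I) + M) / R, in gCO2e per functional unit."""
    if functional_units <= 0:
        raise ValueError("R (functional units) must be positive")
    return (energy_kwh * intensity_g_per_kwh + embodied_g) / functional_units


def savings_ratio(monthly_savings_g: float, analysis_cost_g: float) -> float:
    """How many times the agent's own footprint is repaid in one month."""
    return monthly_savings_g / analysis_cost_g


# Hypothetical inputs: 0.5 kWh at 400 gCO2e/kWh plus 50 gCO2e embodied,
# with one pipeline run as the functional unit R.
example_sci = sci_score(0.5, 400.0, 50.0, 1)

# Figures quoted in this writeup: 390 gCO2e/month saved, 1.5 gCO2e per analysis.
roi = savings_ratio(390.0, 1.5)

print(f"SCI: {example_sci:.1f} gCO2e/run, return: {roi:.0f}x")
```

Here R is taken to be one pipeline run, which matches the gCO₂e/run unit used in the benchmark figures.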

Key Features

ISO/IEC 21031:2024 SCI Standard — not arbitrary scores
Per-job carbon attribution — shows exactly which job wastes the most
200-pipeline benchmark dataset — rates your pipeline vs real-world data
Carbon budget enforcement — configurable via .gitlab/greenpipe.yml
Carbon-aware scheduling — suggests low-carbon regions (e.g., "Move to europe-north1 for 97% reduction")
Agent self-cost transparency — reports its own inference carbon alongside savings
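
As a rough illustration of the benchmark rating and budget enforcement features, the sketch below compares an SCI score with the median and P90 values reported for the 200-pipeline dataset. The band labels, function names, and the budget parameter are assumptions for illustration. The actual schema of .gitlab/greenpipe.yml is not shown in this writeup.

```python
# Benchmark statistics reported in this writeup (gCO2e per run).
BENCHMARK_MEDIAN = 623.0
BENCHMARK_P90 = 2144.0


def rate_against_benchmark(sci_g_per_run: float) -> str:
    """Place an SCI score into a coarse band relative to the dataset."""
    if sci_g_per_run <= BENCHMARK_MEDIAN:
        return "at or below median"
    if sci_g_per_run <= BENCHMARK_P90:
        return "above median"
    return "above P90"


def within_budget(sci_g_per_run: float, budget_g_per_run: float) -> bool:
    """Hypothetical budget check: True when the score fits the budget."""
    return sci_g_per_run <= budget_g_per_run


# Hypothetical pipeline score and budget, chosen only for the example.
score = 900.0
print(rate_against_benchmark(score), within_budget(score, budget_g_per_run=800.0))
```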

How I built it

Platform: GitLab Duo Agent Platform with a custom flow definition

AI Model: Claude (Anthropic) via GitLab Duo — powers the analysis, optimization, and report generation

Scientific Foundation:

ISO/IEC 21031:2024 SCI specification
SPECpower database for runner power coefficients (AMD EPYC 7B12: 7 W of TDP attributed per job)
Cloud Carbon Footprint methodology for embodied carbon (34.25 gCO₂e/hour)
Google Cloud region carbon intensity data
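
The coefficients above can be combined into a per-job estimate roughly as follows. This is a sketch under stated assumptions: the job draws a constant 7 W for its whole duration, the embodied rate is charged in full for every hour the job runs, and the grid intensity passed in is a placeholder rather than real Google Cloud region data.

```python
POWER_W_PER_JOB = 7.0          # runner power coefficient quoted above
EMBODIED_G_PER_HOUR = 34.25    # embodied carbon rate quoted above


def job_emissions_g(duration_s: float, intensity_g_per_kwh: float) -> dict:
    """Estimate operational and embodied emissions for one CI job."""
    hours = duration_s / 3600.0
    energy_kwh = POWER_W_PER_JOB * hours / 1000.0   # E, in kWh
    operational_g = energy_kwh * intensity_g_per_kwh  # E * I
    embodied_g = EMBODIED_G_PER_HOUR * hours          # M
    return {
        "energy_kwh": energy_kwh,
        "operational_g": operational_g,
        "embodied_g": embodied_g,
        "total_g": operational_g + embodied_g,
    }


# Hypothetical job: 10 minutes on a grid at 400 gCO2e/kWh (placeholder value).
estimate = job_emissions_g(duration_s=600, intensity_g_per_kwh=400.0)
print({key: round(value, 4) for key, value in estimate.items()})
```

Summing `total_g` over the jobs in a pipeline and dividing by the number of runs gives the numerator and R of the SCI formula.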

Dataset: Built a collection of 200 real .gitlab-ci.yml files from public GitLab projects using the GitLab API. Analyzed each for SCI scores, waste patterns, and optimization potential. Key findings:

Metric	Value
Average waste score	5.5 / 9
SCI Median	623 gCO₂e/run
SCI P90	2,144 gCO₂e/run
Average reduction potential	35.6%
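
For a sense of what a waste-pattern check might look like, here is a simplified sketch covering two of the patterns mentioned in this writeup: missing cache and missing interruptible flags. It assumes the .gitlab-ci.yml has already been parsed into a Python dict, and it ignores include: and extends: resolution, so it is not the analysis GreenPipe actually runs.

```python
# Top-level keys that configure the pipeline rather than define a job.
RESERVED_KEYS = {
    "stages", "variables", "default", "include", "workflow",
    "cache", "image", "services", "before_script", "after_script",
}


def detect_waste(pipeline: dict) -> list:
    """Return findings for jobs with no cache or no interruptible flag."""
    defaults = pipeline.get("default") or {}
    has_global_cache = "cache" in pipeline or "cache" in defaults
    default_interruptible = bool(defaults.get("interruptible", False))

    findings = []
    for name, job in pipeline.items():
        # Skip global keywords, hidden jobs (".name"), and non-job values.
        if name in RESERVED_KEYS or name.startswith(".") or not isinstance(job, dict):
            continue
        if not has_global_cache and "cache" not in job:
            findings.append(f"{name}: no cache")
        if not job.get("interruptible", default_interruptible):
            findings.append(f"{name}: not interruptible")
    return findings


# Hypothetical two-job pipeline, already parsed from YAML.
sample = {
    "stages": ["build", "test"],
    "build": {"script": ["npm ci"], "cache": {"paths": ["node_modules/"]}},
    "test": {"script": ["npm test"], "interruptible": True},
}
print(detect_waste(sample))
```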

Agent Tools Used:

get_project — dynamic project discovery
read_files — pipeline config + carbon budget + benchmarks
ci_linter — validates optimized YAML before committing
create_commit — commits fix on a timestamped branch
create_merge_request — opens MR with SCI analysis report
create_issue — creates tracking issue with full methodology

Challenges I ran into

Flow timeout: The initial prompt generated reports too large for the agent's output limits. I iteratively compressed the report template while maintaining scientific rigor — the optimized YAML goes into the commit, not the report text.

Service account permissions: The GitLab Duo service account couldn't post issue comments (create_issue_note). Pivoted to using create_issue for the report and create_commit + create_merge_request for the fix.

AI Catalog versioning: Discovered that flow updates require new Git tags to be picked up by the AI Catalog. Each iteration needed a tag bump (went through v1 to v1.7).

Dynamic project support: Removing hardcoded project IDs required using context:project_id as input and calling get_project to discover the default branch — couldn't assume "main".

Branch naming: Fixed branch names (greenpipe/optimize-pipeline) broke on the second run. Switched to timestamped branches (greenpipe/optimize-20260324-1200).
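
A timestamped branch name in the format shown above can be generated along these lines. The function name and the choice of UTC are assumptions for illustration. Names have minute resolution, so two runs within the same minute would still collide.

```python
from datetime import datetime, timezone
from typing import Optional


def optimization_branch(now: Optional[datetime] = None) -> str:
    """Build a branch name like greenpipe/optimize-YYYYMMDD-HHMM."""
    # Assumption: UTC is used so names do not depend on the runner's timezone.
    now = now or datetime.now(timezone.utc)
    return now.strftime("greenpipe/optimize-%Y%m%d-%H%M")


print(optimization_branch())
```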

Accomplishments that I'm proud of

Working end-to-end auto-fix MR with validated, passing CI pipeline
ISO/IEC 21031:2024 compliance: to my knowledge, the only green CI/CD tool using the actual international standard
200-pipeline benchmark dataset — real data, not theoretical
Agent self-cost accounting: honest carbon accounting that, to my knowledge, no other tool provides
Tested on real-world pipelines — not toy examples
Works on any GitLab project — no hardcoded configuration needed

What I learned

76% of real CI/CD pipelines have no caching — the waste is enormous
AI agents can perform complete Measure → Fix → Commit loops autonomously
The GitLab Duo Agent Platform is powerful but requires careful prompt engineering
Using an ISO standard instead of arbitrary scores gives real credibility
Accounting for the AI's own carbon cost is essential for honest green tools

What's next for GreenPipe

Multi-agent architecture — dedicated Collector, Analyzer, and Publisher agents
Real pipeline duration data — integrate with GitLab Pipeline API for actual measurements
CI/CD Component — one-line include: installation for any project
Carbon trend dashboard — track SCI scores over time via GitLab Pages
Water footprint — add water usage estimation alongside carbon

Built With

claude-(anthropic)
cloud-carbon-footprint
gitlab-api
gitlab-duo-agent-platform
iso/iec-21031:2024-sci-standard
python
specpower-database

Updates

Noura Hosny started this project — Mar 24, 2026 01:23 PM EDT
