About the Project
This project is a decarbonization transition scoreboard: a country-level analytical system designed to identify where economic growth appears to be separating from carbon emissions, and whether that progress looks structurally durable when viewed through the lens of energy infrastructure.
The core question behind the project was simple:
Which countries seem to be making real progress on decarbonization, not just in headline emissions trends, but in a way that is supported by the underlying power system?
A country can show falling emissions for many reasons. Sometimes that reflects genuine structural transition. Other times it may be driven by temporary shocks, incomplete data, outsourcing of emissions, or short time windows that make the result look better than it really is. I wanted to build something that moved beyond a single chart of GDP and CO₂, and instead combined trajectory, infrastructure, and confidence into one interpretable framework.
At the center of the project is a basic decoupling idea:
$$ \text{Decoupling} \Longrightarrow \Delta \text{GDP} > 0 \quad \text{and} \quad \Delta \text{CO}_2 < 0 $$
In practice, the project makes this more robust by measuring these changes across multiple time windows, comparing countries against one another, and incorporating power-plant information to see whether a country’s emissions trajectory is consistent with its physical energy system.
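As a rough illustration of the multi-window idea, here is a minimal sketch rather than the notebook's actual code. The function names, the window lengths, and the toy numbers are all assumptions made for this example.

```python
def decoupled(gdp, co2, start, end):
    """Return True if GDP rose and CO2 fell between start and end.

    gdp and co2 map year -> value. Returns None when either endpoint
    is missing, so an unmeasurable window is not counted as a failure.
    """
    if any(y not in series for series in (gdp, co2) for y in (start, end)):
        return None
    return gdp[end] > gdp[start] and co2[end] < co2[start]


def multi_window_screen(gdp, co2, end_year, windows=(5, 10, 15)):
    """Apply the decoupling test over several look-back windows."""
    return {w: decoupled(gdp, co2, end_year - w, end_year) for w in windows}


# Toy numbers, invented purely for illustration.
gdp = {2005: 100, 2010: 110, 2015: 125, 2020: 130}
co2 = {2005: 44, 2010: 52, 2015: 49, 2020: 45}

result = multi_window_screen(gdp, co2, 2020)
print(result)  # the 5- and 10-year windows pass, the 15-year window does not
```

In this toy case the country looks decoupled over the last five and ten years, but emissions are still above their 2005 level, which is exactly the kind of endpoint sensitivity a single window would hide.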
What Inspired Me
The project was inspired by a frustration with how climate progress is often discussed. A lot of public conversations focus on isolated indicators: a country’s emissions went down, renewables went up, coal went down, GDP went up. Each of those is useful, but on its own it can be misleading.
I wanted to build something that linked these pieces together.
What interested me most was the tension between surface-level progress and structural progress. A country might appear to be decarbonizing, but if its power fleet is still heavily fossil-based, recently expanded in gas or coal, or poorly measured, then the story is less convincing. On the other hand, a country with modest recent performance might have infrastructure that suggests stronger long-run transition potential.
That motivated the project: not just ranking countries by whether emissions fell, but trying to answer whether the pattern looked credible, repeatable, and supported by real-world infrastructure.
How I Built the Project
I built the project as a notebook-based analytical pipeline that merges macroeconomic, emissions, demographic, and power-sector data into a country-level scoreboard.
Data Foundation
The project combines three main types of information:
- Emissions data, primarily national CO₂ metrics
- Macroeconomic data, especially GDP and population
- Power plant data, including installed capacity, fuel type, and commissioning information
From there, the project builds a country-year panel and derives features that try to answer three broad questions:
- Trajectory: Is the country showing evidence of GDP growth with declining emissions?
- Structure: Does its power-plant fleet support that story?
- Confidence: How much should we trust the result given missingness, imputation, and sensitivity to assumptions?
Main Methods Used
The analysis starts with a basic decoupling screen using changes in GDP and CO₂ over time. But instead of relying on a single window, I added multiple robustness layers:
- decoupling checks across several time windows
- rolling-window consistency measures
- country classifications such as stable, emerging, mixed, or weak decouplers
- infrastructure aggregates like fossil share, renewable share, and low-carbon share
- plant-age and transition features such as recent low-carbon additions, fossil lock-in pressure, and legacy coal exposure
- source-quality and coverage scores to track how much of the result rests on observed vs inferred data
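The rolling-window consistency measure and the country labels can be sketched as below. This is an illustration under assumptions: the cut-offs of 0.8, 0.5, and 0.2 are hypothetical, and the project's real thresholds and label logic may differ.

```python
def rolling_consistency(gdp, co2, window=5):
    """Share of rolling windows in which GDP rose and CO2 fell.

    gdp and co2 are equal-length lists of annual values in year order.
    Returns None when the series is too short for even one window.
    """
    hits, total = 0, 0
    for i in range(len(gdp) - window):
        j = i + window
        total += 1
        if gdp[j] > gdp[i] and co2[j] < co2[i]:
            hits += 1
    return hits / total if total else None


def classify(share, stable=0.8, emerging=0.5, mixed=0.2):
    """Map a consistency share to a label (cut-offs are assumptions)."""
    if share is None:
        return "insufficient data"
    if share >= stable:
        return "stable"
    if share >= emerging:
        return "emerging"
    if share >= mixed:
        return "mixed"
    return "weak"


# Toy annual series, invented for illustration.
gdp = [100, 102, 104, 106, 108, 110, 112, 114]
co2 = [50, 51, 52, 53, 52, 49, 48, 47]

share = rolling_consistency(gdp, co2, window=3)
label = classify(share)
print(share, label)
```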
A major addition was treating the power fleet not just as a static fuel mix, but as a transition object. That meant looking at things like:
- whether recent capacity additions were clean or fossil-heavy
- whether the country still relies on old fossil infrastructure
- whether commissioning-year information is sparse and needs careful imputation
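To make the "transition object" idea concrete, the sketch below derives a few fleet features from a list of plants. The field names (`capacity_mw`, `fuel`, `year`), the fuel groupings, and the ten-year definition of "recent" are assumptions for this example, not the schema of the actual power plant dataset.

```python
FOSSIL = {"Coal", "Gas", "Oil"}
LOW_CARBON = {"Solar", "Wind", "Hydro", "Nuclear", "Geothermal"}


def fleet_features(plants, as_of_year, recent_years=10):
    """Summarize a national fleet as capacity-weighted shares."""
    total = sum(p["capacity_mw"] for p in plants)
    if total == 0:
        return None
    fossil = sum(p["capacity_mw"] for p in plants if p["fuel"] in FOSSIL)
    low_carbon = sum(p["capacity_mw"] for p in plants if p["fuel"] in LOW_CARBON)

    # Only plants with a known commissioning year can count as "recent".
    dated = [p for p in plants if p.get("year") is not None]
    dated_cap = sum(p["capacity_mw"] for p in dated)
    recent = [p for p in dated if p["year"] > as_of_year - recent_years]
    recent_cap = sum(p["capacity_mw"] for p in recent)
    recent_clean = sum(p["capacity_mw"] for p in recent if p["fuel"] in LOW_CARBON)

    return {
        "fossil_share": fossil / total,
        "low_carbon_share": low_carbon / total,
        "recent_low_carbon_share": recent_clean / recent_cap if recent_cap else None,
        # Share of capacity with a known year: a simple coverage signal.
        "year_coverage": dated_cap / total,
    }


# Toy fleet, invented for illustration.
plants = [
    {"fuel": "Coal", "capacity_mw": 600, "year": 1985},
    {"fuel": "Gas", "capacity_mw": 200, "year": 2016},
    {"fuel": "Wind", "capacity_mw": 150, "year": 2018},
    {"fuel": "Solar", "capacity_mw": 50, "year": None},
]
features = fleet_features(plants, as_of_year=2020)
print(features)
```

Reporting `year_coverage` alongside the shares keeps the sparse commissioning-year problem visible instead of silently dropping undated plants from the "recent additions" picture.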
Because the data is imperfect, I also built confidence-aware logic into the pipeline. Missing years, incomplete sectoral coverage, and uneven infrastructure data can all distort rankings. So instead of treating every country as equally measurable, the project tracks data coverage and penalizes low-confidence cases.
Scoring Approach
Rather than using a predictive machine learning model, I used a composite ranking framework.
The final scoreboard blends:
- macro decoupling features
- infrastructure and transition features
- stability across windows
- data-confidence measures
Most components are normalized with percentile-style ranking so that variables with very different units can still be combined on a common scale. Extended infrastructure features are allowed to influence the ranking more strongly only when the underlying data is good enough to support them.
This made the project feel less like a one-off notebook and more like a decision tool: something that can say not just who ranks highly, but also why, and with what degree of trust.
What I Learned
One of the biggest things I learned is that climate analytics becomes much more difficult the moment you try to move from a single metric to a decision-grade interpretation.
It is easy to say:
- GDP rose
- emissions fell
- therefore this country is decoupling
It is much harder to ask:
- did this hold across multiple time windows?
- is the result sensitive to endpoint choice?
- is the power system consistent with the emissions story?
- how much of this depends on imputed or incomplete data?
- are we comparing countries fairly when their data quality differs?
I also learned how quickly “simple” climate indicators become a data engineering problem. A large part of the project was not glamorous modeling. It was:
- harmonizing country identifiers
- aligning year coverage
- handling source conflicts
- creating fallback logic
- tracking provenance
- making sure the ranking system did not silently reward countries with better data rather than better performance
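Much of that plumbing comes down to small, unglamorous helpers. The sketch below shows one way to harmonize country names and resolve a value across sources while recording where it came from. The alias table, source names, and numbers are invented for illustration.

```python
# Hypothetical alias table; a real one would be far larger.
ALIASES = {
    "USA": "USA",
    "US": "USA",
    "United States": "USA",
    "Germany": "DEU",
    "Deutschland": "DEU",
}


def harmonize(name):
    """Map a raw country label to a single code, or None if unknown."""
    return ALIASES.get(name.strip())


def resolve(key, sources):
    """Return (value, source_name) from the first source that has the key.

    sources is a list of (name, table) pairs in priority order, so the
    fallback logic and the provenance are explicit in one place.
    """
    for name, table in sources:
        value = table.get(key)
        if value is not None:
            return value, name
    return None, "missing"


# Toy tables keyed by (country code, year); values are invented.
primary = {("USA", 2019): 5.1}
fallback = {("USA", 2019): 5.0, ("DEU", 2019): 0.7}
sources = [("primary", primary), ("fallback", fallback)]

print(resolve((harmonize("United States"), 2019), sources))
print(resolve((harmonize("Deutschland"), 2019), sources))
```

Returning the source name with every value is what later makes it possible to check whether a ranking leans on fallback data.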
Methodologically, I learned the value of robustness over elegance. A slightly messier pipeline that tests multiple windows, tracks data confidence, and stress-tests assumptions is much more useful than a cleaner score that looks precise but hides fragility.
Challenges I Faced
1. Uneven Data Quality
A major challenge was that the different datasets do not line up cleanly. Coverage varies by country and year, and some variables are much better measured than others. GDP, emissions, population, and plant data all have different kinds of missingness and revision behavior.
That meant a large share of the work went into deciding:
- when to merge
- when to interpolate
- when to use fallbacks
- when to downgrade confidence instead of forcing a value
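One way to encode the "interpolate or downgrade" decision is to fill only short interior gaps and flag everything else. This is a sketch under assumptions: the two-year gap limit and the flag names are hypothetical, and the series is expected to carry an explicit `None` for each missing year.

```python
def fill_gaps(series, max_gap=2):
    """Linearly interpolate interior gaps of at most max_gap years.

    series maps year -> value or None. Returns (filled, flags), where
    flags labels each year "observed", "interpolated", or "missing".
    Longer gaps and leading or trailing gaps are left as missing.
    """
    years = sorted(series)
    filled = dict(series)
    flags = {y: "observed" if series[y] is not None else "missing" for y in years}
    known = [y for y in years if series[y] is not None]

    for left, right in zip(known, known[1:]):
        gap = right - left - 1
        if 0 < gap <= max_gap:
            for y in range(left + 1, right):
                frac = (y - left) / (right - left)
                filled[y] = series[left] + frac * (series[right] - series[left])
                flags[y] = "interpolated"
    return filled, flags


def observed_share(flags):
    """Fraction of years backed by an actual observation."""
    return sum(1 for f in flags.values() if f == "observed") / len(flags)


# Toy series, invented for illustration.
series = {2010: 10.0, 2011: None, 2012: 14.0, 2013: None,
          2014: None, 2015: None, 2016: 20.0}
filled, flags = fill_gaps(series)
print(filled[2011], flags[2011], flags[2014], observed_share(flags))
```

Here the one-year gap in 2011 is filled, while the three-year gap from 2013 to 2015 stays missing and simply lowers the observed share instead of being papered over.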
2. Power-Plant Data Is Useful but Incomplete
The power plant database adds real structural depth, but it is not a perfect picture of a country’s power system. Some plants are missing commissioning years, some capacity information is sparse, and plant-level data does not automatically map cleanly to national emissions outcomes.
This created a balancing act: the infrastructure layer is extremely useful, but it has to be used cautiously so that it enriches the analysis instead of pretending to offer more certainty than it actually does.
3. Rankings Can Create False Precision
Once a scoreboard exists, it becomes tempting to trust the ordering too much. But a country ranked 7th is not necessarily meaningfully better than one ranked 12th if both are sensitive to time-window choice or data coverage.
To address that, I added:
- stability checks
- scenario perturbations
- overlap tests
- confidence tiers
That added complexity, but it made the output much more honest.
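A scenario perturbation with an overlap test might look like the sketch below: jitter the component weights, re-rank, and measure how much of the top group survives. The component names, the 20% jitter, and the trial count are assumptions for this example.

```python
import random


def rank_countries(scores, weights):
    """Order countries by weighted sum of component scores, best first."""
    def total(country):
        return sum(weights[k] * scores[country][k] for k in weights)
    return sorted(scores, key=total, reverse=True)


def top_k_overlap(baseline, alternative, k):
    """Fraction of the baseline top-k that remains in the alternative top-k."""
    return len(set(baseline[:k]) & set(alternative[:k])) / k


def stability(scores, weights, k=2, trials=200, jitter=0.2, seed=0):
    """Average top-k overlap under random multiplicative weight jitter."""
    rng = random.Random(seed)
    base = rank_countries(scores, weights)
    overlaps = []
    for _ in range(trials):
        perturbed = {name: w * (1 + rng.uniform(-jitter, jitter))
                     for name, w in weights.items()}
        overlaps.append(top_k_overlap(base, rank_countries(scores, perturbed), k))
    return sum(overlaps) / trials


# Toy component scores for four hypothetical countries.
scores = {
    "A": {"trajectory": 0.9, "infrastructure": 0.9},
    "B": {"trajectory": 0.5, "infrastructure": 0.5},
    "C": {"trajectory": 0.1, "infrastructure": 0.1},
    "D": {"trajectory": 0.0, "infrastructure": 0.0},
}
weights = {"trajectory": 0.6, "infrastructure": 0.4}
print(stability(scores, weights))
```

An overlap near 1 suggests the top group does not depend on the exact weights; a low overlap is a signal to report tiers rather than exact ranks.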
4. Climate Performance Is Not One-Dimensional
A country can perform well on one dimension and poorly on another. For example:
- emissions may be falling, but the fossil fleet may still be large
- renewables may be growing, but GDP may be stagnant
- power-sector trends may look better than economy-wide CO₂
This made it difficult to define a single score that stayed interpretable. I had to design the system so that the composite score was useful, but still decomposable into understandable components.
Key Limitations
1. The Project Is Descriptive, Not Causal
The scoreboard identifies patterns; it does not prove why those patterns happened. If a country appears to decouple, the project does not establish whether that was caused by policy, industrial change, offshoring, weather, fuel prices, recession dynamics, or technology adoption.
So the project is best understood as a screening and comparison tool, not a causal inference study.
2. Territorial CO₂ Is Only One Slice of the Problem
The analysis relies heavily on production-based or territorial emissions metrics. That means it does not fully capture:
- consumption-based emissions
- imported carbon intensity
- embodied emissions in trade
- broader non-CO₂ climate impacts
A country can appear to improve territorially while still consuming carbon-intensive imported goods.
3. Power-Plant Infrastructure Is an Imperfect Proxy
The infrastructure layer is powerful, but incomplete. It mainly reflects grid-side generation assets and does not fully represent:
- industrial emissions outside power
- transport sector transition
- building electrification
- demand-side efficiency
- transmission and storage bottlenecks
- policy design and implementation quality
So even a strong infrastructure score does not fully summarize the transition.
4. Missingness and Imputation Still Matter
Even with careful confidence scoring, some country results still rely on imputed or partially reconstructed data. Commissioning years, sector splits, and cross-source alignment introduce uncertainty that cannot be fully eliminated.
The project tries to be honest about that uncertainty, but it cannot remove it.
5. Composite Scoring Embeds Judgment Calls
Any weighted ranking system reflects methodological choices:
- which time windows matter most
- how strict the decoupling threshold should be
- how much to reward infrastructure alignment
- how to penalize missing data
- how much extended features should influence the final score
I added sensitivity analysis to reduce overconfidence, but these choices still shape the outcome.
Final Reflection
What I am most proud of is that the project evolved from a simple “GDP up, emissions down” idea into a more realistic framework for evaluating decarbonization progress.
Instead of treating climate progress as a single indicator, the project tries to hold three truths at once:
- trajectory matters
- infrastructure matters
- uncertainty matters
That combination made the work more challenging, but also much more meaningful. The final result is not just a ranking of countries. It is an attempt to build a more honest way of asking whether decarbonization progress looks real, durable, and measurable.