About the Project

This project is a decarbonization transition scoreboard: a country-level analytical system designed to identify where economic growth appears to be separating from carbon emissions, and whether that progress looks structurally durable when viewed through the lens of energy infrastructure.

The core question behind the project was simple:

Which countries seem to be making real progress on decarbonization, not just in headline emissions trends, but in a way that is supported by the underlying power system?

A country can show falling emissions for many reasons. Sometimes that reflects genuine structural transition. Other times it may be driven by temporary shocks, incomplete data, outsourcing of emissions, or short time windows that make the result look better than it really is. I wanted to build something that moved beyond a single chart of GDP and CO₂, and instead combined trajectory, infrastructure, and confidence into one interpretable framework.

At the center of the project is a basic decoupling idea:

$$ \text{Decoupling} \iff \Delta \mathrm{GDP} > 0 \quad \text{and} \quad \Delta \mathrm{CO}_2 < 0 $$

In practice, the project makes this more robust by measuring these changes across multiple time windows, comparing countries against one another, and incorporating power-plant information to see whether a country’s emissions trajectory is consistent with its physical energy system.
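
As a rough illustration of what a multi-window screen can look like, here is a minimal sketch in plain Python. It is not the notebook's actual code: the function names, the window lengths, and the sample numbers are all assumptions made up for the example.

```python
from typing import Dict, Optional, Sequence

def decoupled(gdp: Dict[int, float], co2: Dict[int, float],
              start: int, end: int) -> Optional[bool]:
    """True if GDP rose and CO2 fell between start and end; None if a year is missing."""
    for series in (gdp, co2):
        if start not in series or end not in series:
            return None
    return gdp[end] > gdp[start] and co2[end] < co2[start]

def decoupling_by_window(gdp: Dict[int, float], co2: Dict[int, float],
                         end_year: int,
                         windows: Sequence[int] = (5, 10, 15, 20)) -> Dict[int, Optional[bool]]:
    """Apply the same screen over several look-back windows (in years)."""
    return {w: decoupled(gdp, co2, end_year - w, end_year) for w in windows}

# Made-up series for one hypothetical country (units are arbitrary).
gdp = {2005: 100.0, 2010: 110.0, 2015: 125.0, 2020: 140.0}
co2 = {2005: 45.0, 2010: 55.0, 2015: 52.0, 2020: 48.0}

result = decoupling_by_window(gdp, co2, end_year=2020)
print(result)  # {5: True, 10: True, 15: False, 20: None}
```

In this toy case the 5- and 10-year windows show decoupling, the 15-year window does not, and the 20-year window cannot be evaluated, which is exactly the kind of disagreement a single-window chart would hide.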

What Inspired Me

The project was inspired by a frustration with how climate progress is often discussed. A lot of public conversations focus on isolated indicators: a country’s emissions went down, renewables went up, coal went down, GDP went up. Each of those is useful, but on its own it can be misleading.

I wanted to build something that linked these pieces together.

What interested me most was the tension between surface-level progress and structural progress. A country might appear to be decarbonizing, but if its power fleet is still heavily fossil-based, recently expanded in gas or coal, or poorly measured, then the story is less convincing. On the other hand, a country with modest recent performance might have infrastructure that suggests stronger long-run transition potential.

That motivated the project: not just ranking countries by whether emissions fell, but trying to answer whether the pattern looked credible, repeatable, and supported by real-world infrastructure.

How I Built the Project

I built the project as a notebook-based analytical pipeline that merges macroeconomic, emissions, demographic, and power-sector data into a country-level scoreboard.

Data Foundation

The project combines three main types of information:

Emissions data, primarily national CO₂ metrics
Macroeconomic data, especially GDP and population
Power plant data, including installed capacity, fuel type, and commissioning information

From there, the project builds a country-year panel and derives features that try to answer three broad questions:

Trajectory: Is the country showing evidence of GDP growth with declining emissions?
Structure: Does its power-plant fleet support that story?
Confidence: How much should we trust the result given missingness, imputation, and sensitivity to assumptions?

Main Methods Used

The analysis starts with a basic decoupling screen using changes in GDP and CO₂ over time. But instead of relying on a single window, I added multiple robustness layers:

decoupling checks across several time windows
rolling-window consistency measures
country classifications such as stable, emerging, mixed, or weak decouplers
infrastructure aggregates like fossil share, renewable share, and low-carbon share
plant-age and transition features such as recent low-carbon additions, fossil lock-in pressure, and legacy coal exposure
source-quality and coverage scores to track how much of the result rests on observed vs inferred data
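
To make the rolling-window and classification layers concrete, here is a hedged sketch. The four labels come from the list above, but the thresholds (a 0.75 and 0.25 share, and a "last three windows" rule) are illustrative assumptions, not the cut-offs used in the project.

```python
from typing import Dict, List

def rolling_decoupling_flags(gdp: Dict[int, float], co2: Dict[int, float],
                             window: int) -> List[bool]:
    """One flag per end year for which both endpoints of the window exist."""
    years = sorted(set(gdp) & set(co2))
    flags = []
    for end in years:
        start = end - window
        if start in gdp and start in co2:
            flags.append(gdp[end] > gdp[start] and co2[end] < co2[start])
    return flags

def classify(flags: List[bool], recent: int = 3) -> str:
    """Map rolling flags to a label. Thresholds here are hypothetical."""
    if not flags:
        return "insufficient data"
    share = sum(flags) / len(flags)      # fraction of windows that decouple
    recent_ok = all(flags[-recent:])     # do the latest windows all decouple?
    if share >= 0.75 and recent_ok:
        return "stable"
    if recent_ok:
        return "emerging"
    if share >= 0.25:
        return "mixed"
    return "weak"

# Synthetic country: GDP rises every year, CO2 peaks in 2005 and then falls.
gdp = {y: 100.0 + 2 * (y - 2000) for y in range(2000, 2011)}
co2 = {y: (50.0 + (y - 2000) if y <= 2005 else 55.0 - 2 * (y - 2005))
       for y in range(2000, 2011)}

flags = rolling_decoupling_flags(gdp, co2, window=5)
label = classify(flags)
print(flags, label)
```

Here the early windows fail and the later ones pass, so the synthetic country comes out as "emerging" rather than "stable".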

A major addition was treating the power fleet not just as a static fuel mix, but as a transition object. That meant looking at things like:

whether recent capacity additions were clean or fossil-heavy
whether the country still relies on old fossil infrastructure
whether commissioning-year information is sparse and needs careful imputation
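
A small sketch of how fleet-level transition features like these could be derived from plant records. The field names (`capacity_mw`, `fuel`, `commissioning_year`), the fuel groupings, and the 10-year and 30-year cut-offs are assumptions for illustration; the real plant dataset and the notebook may define them differently.

```python
from typing import Dict, List, Optional

FOSSIL = {"Coal", "Gas", "Oil"}                                   # assumed grouping
LOW_CARBON = {"Solar", "Wind", "Hydro", "Nuclear", "Geothermal"}  # assumed grouping

def fleet_features(plants: List[dict], ref_year: int,
                   recent_years: int = 10,
                   legacy_age: int = 30) -> Optional[Dict[str, Optional[float]]]:
    """Capacity-weighted shares for one country's fleet."""
    total = sum(p["capacity_mw"] for p in plants)
    if total <= 0:
        return None
    # Only plants with a known commissioning year can inform age-based features.
    dated = [p for p in plants if p.get("commissioning_year") is not None]
    recent = [p for p in dated if ref_year - p["commissioning_year"] <= recent_years]
    recent_cap = sum(p["capacity_mw"] for p in recent)
    recent_low = sum(p["capacity_mw"] for p in recent if p["fuel"] in LOW_CARBON)
    legacy_coal = sum(p["capacity_mw"] for p in dated
                      if p["fuel"] == "Coal"
                      and ref_year - p["commissioning_year"] >= legacy_age)
    return {
        "fossil_share": sum(p["capacity_mw"] for p in plants if p["fuel"] in FOSSIL) / total,
        "recent_low_carbon_share": recent_low / recent_cap if recent_cap else None,
        "legacy_coal_share": legacy_coal / total,
        "year_coverage": sum(p["capacity_mw"] for p in dated) / total,
    }

# Made-up fleet for a hypothetical country.
plants = [
    {"fuel": "Coal", "capacity_mw": 1000.0, "commissioning_year": 1985},
    {"fuel": "Gas", "capacity_mw": 500.0, "commissioning_year": 2018},
    {"fuel": "Solar", "capacity_mw": 300.0, "commissioning_year": 2019},
    {"fuel": "Wind", "capacity_mw": 200.0, "commissioning_year": None},
]
features = fleet_features(plants, ref_year=2020)
print(features)
```

Reporting `year_coverage` next to the age-based shares keeps it visible how much of the fleet those shares actually describe, instead of silently treating undated plants as if they did not exist.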

Because the data is imperfect, I also built confidence-aware logic into the pipeline. Missing years, incomplete sectoral coverage, and uneven infrastructure data can all distort rankings. So instead of treating every country as equally measurable, the project tracks data coverage and penalizes low-confidence cases.
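
One simple way to express "track coverage, then penalize low confidence" is sketched below. The linear penalty and the `floor` parameter are assumptions chosen for readability, not the project's actual formula.

```python
from typing import Dict, Optional

def coverage(series: Dict[int, Optional[float]], start: int, end: int) -> float:
    """Share of years in [start, end] with an observed (non-None) value."""
    expected = range(start, end + 1)
    observed = sum(1 for y in expected if series.get(y) is not None)
    return observed / len(expected)

def penalized(score: float, cov: float, floor: float = 0.5) -> float:
    """Scale a score down as coverage falls.

    With full coverage the score is unchanged; with zero coverage it is
    multiplied by `floor`. Both the shape and the floor are hypothetical.
    """
    return score * (floor + (1.0 - floor) * cov)

# Made-up emissions series with two missing years in 2010-2014.
series = {2010: 1.0, 2011: None, 2012: 3.0, 2013: 2.0}
cov = coverage(series, 2010, 2014)
adjusted = penalized(0.8, cov)
print(cov, round(adjusted, 2))  # 0.6 0.64
```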

Scoring Approach

Rather than using a predictive machine learning model, I used a composite ranking framework.

The final scoreboard blends:

macro decoupling features
infrastructure and transition features
stability across windows
data-confidence measures

Most components are normalized with percentile-style ranking so that variables with very different units can still be combined in a meaningful way. Extended infrastructure features are allowed to influence the ranking more strongly only when the underlying data is good enough.
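
The sketch below shows one way percentile-style ranking and a data-quality gate can fit together. It simplifies the idea above: the infrastructure component is dropped entirely (and the remaining weights renormalized) when coverage is below a threshold, rather than merely down-weighted. The weights, the 0.7 threshold, and the three-country data are invented for the example.

```python
from typing import Dict

def percentile_rank(values: Dict[str, float],
                    higher_is_better: bool = True) -> Dict[str, float]:
    """Share of countries that each country ties or beats, in (0, 1]."""
    n = len(values)
    ranks = {}
    for name, v in values.items():
        if higher_is_better:
            count = sum(1 for other in values.values() if other <= v)
        else:
            count = sum(1 for other in values.values() if other >= v)
        ranks[name] = count / n
    return ranks

def composite(components: Dict[str, Dict[str, float]],
              weights: Dict[str, float],
              infra_coverage: Dict[str, float],
              min_coverage: float = 0.7) -> Dict[str, float]:
    """Weighted mean of ranked components; 'infra' counts only with enough coverage."""
    countries = next(iter(components.values())).keys()
    scores = {}
    for c in countries:
        total, weight_sum = 0.0, 0.0
        for key, ranks in components.items():
            if key == "infra" and infra_coverage[c] < min_coverage:
                continue  # gate: skip poorly measured infrastructure data
            total += weights[key] * ranks[c]
            weight_sum += weights[key]
        scores[c] = total / weight_sum
    return scores

# Hypothetical inputs: fractional changes and a low-carbon capacity share.
components = {
    "gdp": percentile_rank({"A": 0.30, "B": 0.10, "C": 0.20}),
    "co2": percentile_rank({"A": -0.10, "B": 0.05, "C": -0.20}, higher_is_better=False),
    "infra": percentile_rank({"A": 0.6, "B": 0.2, "C": 0.3}),
}
weights = {"gdp": 0.4, "co2": 0.4, "infra": 0.2}
infra_coverage = {"A": 0.9, "B": 0.9, "C": 0.4}

scores = composite(components, weights, infra_coverage)
print(sorted(scores, key=scores.get, reverse=True))  # ['A', 'C', 'B']
```

Because every component is converted to a rank before weighting, a GDP change and a capacity share can be combined without either one dominating just because of its units.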

This made the project feel less like a one-off notebook and more like a decision tool: something that can say not just who ranks highly, but also why, and with what degree of trust.

What I Learned

One of the biggest things I learned is that climate analytics becomes much more difficult the moment you try to move from a single metric to a decision-grade interpretation.

It is easy to say:

GDP rose
emissions fell
therefore this country is decoupling

It is much harder to ask:

did this hold across multiple time windows?
is the result sensitive to endpoint choice?
is the power system consistent with the emissions story?
how much of this depends on imputed or incomplete data?
are we comparing countries fairly when their data quality differs?

I also learned how quickly “simple” climate indicators become a data engineering problem. A large part of the project was not glamorous modeling. It was:

harmonizing country identifiers
aligning year coverage
handling source conflicts
creating fallback logic
tracking provenance
making sure the ranking system did not silently reward countries with better data rather than better performance

Methodologically, I learned the value of robustness over elegance. A slightly messier pipeline that tests multiple windows, tracks data confidence, and stress-tests assumptions is much more useful than a cleaner score that looks precise but hides fragility.

Challenges I Faced

1. Uneven Data Quality

A major challenge was that the different datasets do not line up cleanly. Coverage varies by country and year, and some variables are much better measured than others. GDP, emissions, population, and plant data all have different kinds of missingness and revision behavior.

That meant a large share of the work went into deciding:

when to merge
when to interpolate
when to use fallbacks
when to downgrade confidence instead of forcing a value
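
Those decisions can be made explicit by attaching a provenance tag to every value. The sketch below is one possible shape for that logic, under stated assumptions: `years` is a contiguous ascending range, only interior gaps of at most `max_gap` years are linearly interpolated, and the tag names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Tagged = Dict[int, Tuple[Optional[float], str]]

def fill_with_provenance(primary: Dict[int, Optional[float]],
                         fallback: Dict[int, Optional[float]],
                         years: range, max_gap: int = 2) -> Tagged:
    """Prefer the primary source, then a fallback, then short-gap interpolation."""
    out: Tagged = {}
    for y in years:
        if primary.get(y) is not None:
            out[y] = (primary[y], "observed")
        elif fallback.get(y) is not None:
            out[y] = (fallback[y], "fallback")
        else:
            out[y] = (None, "missing")
    known = [y for y in years if out[y][0] is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left - 1
        if 0 < gap <= max_gap:  # longer gaps stay missing instead of being forced
            v0, v1 = out[left][0], out[right][0]
            for y in range(left + 1, right):
                frac = (y - left) / (right - left)
                out[y] = (v0 + frac * (v1 - v0), "interpolated")
    return out

# Made-up series: one short gap (2011) and one long gap (2013-2015).
primary = {2010: 10.0, 2011: None, 2012: 14.0, 2016: 20.0, 2018: 22.0}
fallback = {2017: 21.5}
filled = fill_with_provenance(primary, fallback, range(2010, 2019))

share_observed = sum(1 for _, tag in filled.values() if tag == "observed") / len(filled)
print(filled[2011], filled[2014], filled[2017])
```

The long gap is left as missing, so a downstream confidence score can be lowered for that country rather than resting on a value that was never measured.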

2. Power-Plant Data Is Useful but Incomplete

The power plant database adds real structural depth, but it is not a perfect picture of a country’s power system. Some plants are missing commissioning years, some capacity information is sparse, and plant-level data does not map cleanly onto national emissions outcomes.

This created a balancing act: the infrastructure layer is extremely useful, but it has to be used cautiously so that it enriches the analysis instead of pretending to offer more certainty than it actually does.

3. Rankings Can Create False Precision

Once a scoreboard exists, it becomes tempting to trust the ordering too much. But a country ranked 7th is not necessarily meaningfully better than one ranked 12th if both are sensitive to time-window choice or data coverage.

To address that, I added:

stability checks
scenario perturbations
overlap tests
confidence tiers

That added complexity, but it made the output much more honest.
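
As an illustration of how scenario perturbations and an overlap test can work together, the sketch below re-ranks countries under a small grid of shifted weights and records each country's best and worst rank. Treating "overlap" as overlapping rank ranges is my simplification for this example, and the weights, the 0.2 shift, and the component scores are invented.

```python
from typing import Dict, Iterator, Tuple

def rank_countries(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {c: i + 1 for i, c in enumerate(ordered)}

def weight_scenarios(base: Dict[str, float], delta: float = 0.2) -> Iterator[Dict[str, float]]:
    """Yield the base weights, then each weight shifted down and up, renormalized."""
    yield dict(base)
    for key in base:
        for shift in (-delta, delta):
            w = dict(base)
            w[key] = max(0.0, w[key] + shift)
            total = sum(w.values())
            yield {k: v / total for k, v in w.items()}

def rank_ranges(components: Dict[str, Dict[str, float]],
                base_weights: Dict[str, float],
                delta: float = 0.2) -> Dict[str, Tuple[int, int]]:
    """Best and worst rank of each country across all weight scenarios."""
    ranges: Dict[str, Tuple[int, int]] = {}
    for weights in weight_scenarios(base_weights, delta):
        scores = {c: sum(weights[k] * vals[k] for k in weights)
                  for c, vals in components.items()}
        for c, r in rank_countries(scores).items():
            lo, hi = ranges.get(c, (r, r))
            ranges[c] = (min(lo, r), max(hi, r))
    return ranges

def ranges_overlap(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if two rank ranges share at least one rank."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical normalized component scores.
components = {
    "X": {"macro": 0.9, "infra": 0.9},
    "Y": {"macro": 0.8, "infra": 0.5},
    "Z": {"macro": 0.4, "infra": 0.8},
}
ranges = rank_ranges(components, {"macro": 0.5, "infra": 0.5})
print(ranges)  # {'X': (1, 1), 'Y': (2, 3), 'Z': (2, 3)}
```

In this toy example X stays first under every scenario, while Y and Z swap places depending on the weights, so their ordering should not be read as a meaningful difference.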

4. Climate Performance Is Not One-Dimensional

A country can perform well on one dimension and poorly on another. For example:

emissions may be falling, but the fossil fleet may still be large
renewables may be growing, but GDP may be stagnant
power-sector trends may look better than economy-wide CO₂

This made it difficult to define a single score that stayed interpretable. I had to design the system so that the composite score was useful, but still decomposable into understandable components.

Key Limitations

1. The Project Is Descriptive, Not Causal

The scoreboard identifies patterns; it does not prove why those patterns happened. If a country appears to decouple, the project does not establish whether that was caused by policy, industrial change, offshoring, weather, fuel prices, recession dynamics, or technology adoption.

So the project is best understood as a screening and comparison tool, not a causal inference study.

2. Territorial CO₂ Is Only One Slice of the Problem

The analysis relies heavily on production-based or territorial emissions metrics. That means it does not fully capture:

consumption-based emissions
imported carbon intensity
embodied emissions in trade
broader non-CO₂ climate impacts

A country can appear to improve territorially while still consuming carbon-intensive imported goods.

3. Power-Plant Infrastructure Is an Imperfect Proxy

The infrastructure layer is powerful, but incomplete. It mainly reflects grid-side generation assets and does not fully represent:

industrial emissions outside power
transport sector transition
building electrification
demand-side efficiency
transmission and storage bottlenecks
policy design and implementation quality

So even a strong infrastructure score does not fully summarize the transition.

4. Missingness and Imputation Still Matter

Even with careful confidence scoring, some country results still rely on imputed or partially reconstructed data. Commissioning years, sector splits, and cross-source alignment introduce uncertainty that cannot be fully eliminated.

The project tries to be honest about that uncertainty, but it cannot remove it.

5. Composite Scoring Embeds Judgment Calls

Any weighted ranking system reflects methodological choices:

which time windows matter most
how strict the decoupling threshold should be
how much to reward infrastructure alignment
how to penalize missing data
how strongly the extended infrastructure features should influence the final score

I added sensitivity analysis to reduce overconfidence, but these choices still shape the outcome.

Final Reflection

What I am most proud of is that the project evolved from a simple “GDP up, emissions down” idea into a more realistic framework for evaluating decarbonization progress.

Instead of treating climate progress as a single indicator, the project tries to hold three truths at once:

trajectory matters
infrastructure matters
uncertainty matters

That combination made the work more challenging, but also much more meaningful. The final result is not just a ranking of countries. It is an attempt to build a more honest way of asking whether decarbonization progress looks real, durable, and measurable.

Updates

Matthew Sheng started this project — Apr 11, 2026 03:46 PM EDT
