Inspiration

Energy systems are getting harder to operate. Climate volatility, renewable intermittency, delayed measurements, and missing sensor data all collide in real-world grid operations.

Most data science workflows focus on prediction accuracy under clean datasets. But in practice, grid operators don’t have perfect data — they have incomplete, noisy, and delayed signals, yet still need to make high-stakes decisions.

We wanted to answer a deeper question:

Can we understand what actually drives grid stress, and make better decisions even when the data is imperfect?


What it does

CausalGrid is a decision intelligence system for climate-energy systems that goes beyond prediction.

It:

  • Identifies what drives grid stress (demand, weather, renewables)
  • Distinguishes stress amplifiers vs mitigators
  • Simulates imperfect data conditions (missing values, noise, delays)
  • Evaluates how reliable conclusions are under degraded data
  • Runs counterfactual simulations to test interventions
  • Deploys an interactive Streamlit app to explore decisions in real time

Instead of just answering:

“What will happen?”

It answers:

“Why did it happen, can we trust it, and what should we do?”


How we built it

We built CausalGrid end-to-end using an AI-native workflow:

Data Layer

  • NOAA climate data (temperature signals)
  • EIA electricity demand (US Lower 48 grid)
  • Renewable generation simulation calibrated to energy behavior

Modeling Layer

  • Grid stress index built across 730 daily observations (2022–2023)
  • Feature engineering with demand, temperature, CDD/HDD, renewables, and behavioral patterns
  • Driver analysis using correlations and model-based importance

Imperfect Data Engine

  • Injected missingness, noise, and combined degradation
  • Measured model stability under real-world data conditions
  • Found that missing data is ~25× more damaging than noise

Decision Layer

  • Counterfactual simulations:
    • +2°C warming
    • +20% renewable penetration
    • demand response scenarios
  • Quantified impact on:
    • mean stress
    • high-stress days
    • system risk

Deployment

  • Built in Zerve’s agentic data science environment
  • Deployed as a Streamlit app
  • Interactive sliders allow real-time scenario testing

Challenges we ran into

  • Real-world APIs are messy
    NOAA and EIA data required multiple iterations to get reliable ingestion without silent failures.

  • Imperfect data modeling is non-trivial
    Simulating missingness without breaking model assumptions required careful design.

  • Bridging analysis → decisions
    Most pipelines stop at insights. Converting results into actionable recommendations took multiple iterations.

  • Balancing realism vs hackathon scope
    We needed real data, but also controlled simulation to test causality under uncertainty.


Accomplishments that we're proud of

  • Built a full end-to-end decision system, not just a notebook
  • Successfully integrated real NOAA + EIA datasets
  • Demonstrated that:
    • demand is the dominant stress driver (~0.96 correlation)
    • renewable share is the strongest mitigation lever
  • Quantified a key insight: > Data completeness matters far more than sensor precision
  • Delivered a deployable app with live scenario simulation
  • Created a system that connects: data → causality → robustness → decisions → deployment

What we learned

  • Real-world data systems are fundamentally incomplete, not just noisy
  • Causal reasoning breaks faster with missing data than with measurement error
  • Prediction alone is insufficient — decision context is critical
  • Combining interventions (demand response + renewables) is far more effective than isolated actions
  • AI-assisted workflows (Zerve) dramatically accelerate iteration from question → deployable system

What's next for CasualGrid

We see CausalGrid evolving into a broader Causal Decision Platform:

  • Replace simulated renewables with real grid generation data
  • Add regional grid modeling (ISO-level analysis)
  • Incorporate real-time streaming data pipelines
  • Extend to policy optimization and reinforcement learning
  • Generalize the framework to:
    • healthcare systems
    • financial risk systems
    • climate resilience planning

Ultimately, our goal is to move from:

analyzing systems to: building systems that can reason, adapt, and recommend actions under real-world uncertainty

Architecture

flowchart LR
    A[NOAA Climate Data] --> D[Zerve Data Pipeline]
    B[EIA Grid Demand] --> D
    C[Renewable Simulation] --> D
    D --> E[Grid Stress Dataset]
    E --> F[Driver Analysis]
    E --> G[Imperfect Data Simulation]
    F --> H[Decision Engine]
    G --> H
    H --> I[Counterfactual Scenarios]
    H --> J[Recommendations]
    J --> K[Streamlit App]

Built With

Share this project:

Updates