Inspiration
We kept coming back to the same question during our research: why do infrastructure systems fail so catastrophically when individual components are relatively well-engineered? The 2003 Northeast blackout hit us hardest. A single software bug silenced one alarm. One substation tripped. Fifty-five million people lost power within minutes once the cascade took hold. The engineering was not the problem. The strategy was. Nobody had asked which nodes mattered most before the cascade started.
That became our question. Given a limited budget, which nodes do you protect?
What it does
CascadeShield is an optimization framework that models infrastructure networks as directed graphs and finds the best set of nodes to harden before a cascade failure begins. It simulates how load redistributes when nodes fail, measures cascade damage across realistic failure scenarios, and benchmarks six algorithms against each other to find which one delivers the best protection per unit of compute. The framework supports power grids, hospital server networks, internet backbones, and transportation hubs.
How we built it
We started with the network model. We used NetworkX to build directed graphs with realistic node attributes: capacity, current load initialized to 60 to 80 percent of capacity, intrinsic failure probability, and a centrality-based importance score. We implemented three topology types, including scale-free networks, which follow the power-law degree distributions observed in real infrastructure.
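The model above can be sketched in a few lines of NetworkX. The function name, attribute ranges, and the choice of betweenness centrality for the importance score are our illustrative assumptions, not the project's exact API:

```python
import random
import networkx as nx

def build_network(n=50, topology="scale_free", seed=0):
    """Build a directed infrastructure graph with capacity, load,
    failure probability, and importance attributes (illustrative sketch)."""
    rng = random.Random(seed)
    if topology == "scale_free":
        # Power-law degree distribution, as seen in real infrastructure.
        g = nx.DiGraph(nx.scale_free_graph(n, seed=seed))
    elif topology == "small_world":
        g = nx.watts_strogatz_graph(n, 4, 0.1, seed=seed).to_directed()
    else:
        g = nx.gnp_random_graph(n, 0.1, seed=seed, directed=True)
    g.remove_edges_from(nx.selfloop_edges(g))
    centrality = nx.betweenness_centrality(g)
    for v in g.nodes:
        cap = rng.uniform(50.0, 100.0)
        g.nodes[v]["capacity"] = cap
        # Current load starts at 60-80 percent of capacity.
        g.nodes[v]["load"] = cap * rng.uniform(0.6, 0.8)
        g.nodes[v]["fail_prob"] = rng.uniform(0.01, 0.05)
        g.nodes[v]["importance"] = centrality[v]
    return g
```

A centrality-based importance score is precomputed here so optimizers can rank nodes without re-running graph analysis inside their inner loops.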
The cascade simulator was the hardest piece to get right. Load redistribution with a congestion penalty factor sounds simple until you realize the feedback loop can behave in unexpected ways depending on initial conditions. We ran hundreds of test scenarios to validate that the simulator matched known cascade patterns from the literature.
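The core redistribution loop can be sketched as follows. The `penalty` factor and the equal-split rule among surviving successors are our simplifying assumptions; the feedback loop the text describes is visible in how an overloaded neighbor joins the failure frontier and sheds its own (now larger) load:

```python
import networkx as nx

def simulate_cascade(g, initial_failures, penalty=1.2):
    """Load-shedding cascade sketch: a failed node's load is pushed onto
    surviving successors, inflated by a congestion penalty (> 1 models
    rerouting inefficiency; the value 1.2 is hypothetical)."""
    failed = set(initial_failures)
    load = {v: g.nodes[v]["load"] for v in g.nodes}
    frontier = list(initial_failures)
    while frontier:
        v = frontier.pop()
        neighbors = [u for u in g.successors(v) if u not in failed]
        if not neighbors:
            continue  # load is simply lost at dead ends
        share = load[v] * penalty / len(neighbors)
        for u in neighbors:
            load[u] += share
            # A neighbor pushed past capacity fails and joins the cascade.
            if load[u] > g.nodes[u]["capacity"]:
                failed.add(u)
                frontier.append(u)
    return failed

# Tiny demo: an overload propagates down a three-node chain.
g = nx.DiGraph([(0, 1), (1, 2)])
for v, cap, ld in [(0, 10, 8), (1, 10, 7), (2, 10, 5)]:
    g.nodes[v]["capacity"] = cap
    g.nodes[v]["load"] = ld
failed = simulate_cascade(g, [0])
```

Even this toy version shows the sensitivity to initial conditions mentioned above: lowering node 1's starting load below the overload threshold stops the cascade at one node.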
For optimization we implemented greedy search, a genetic algorithm with elitism, integer linear programming solved by brute-force enumeration and branch-and-bound, simulated annealing, a QAOA-inspired simulation built on a QUBO formulation, and a hybrid genetic algorithm with local search. Getting the benchmark infrastructure right, so that every algorithm received an identical compute budget and evaluated identical failure scenarios, took longer than building any individual algorithm.
Challenges we ran into
The ILP solver scales badly. What takes 0.34 seconds on a 10-node network takes 42 seconds on a 20-node network and times out entirely at 30 nodes. Designing the benchmark to handle this gracefully without giving ILP an unfair advantage or disadvantage required careful thought about how to report results across different network sizes.
The QAOA formulation was genuinely difficult. Encoding the cascade protection objective as a QUBO matrix required translating a simulation-based fitness function into a quadratic form, which is not straightforward when the objective involves a recursive load redistribution process. We ended up linearizing the cascade dynamics around typical operating conditions to make the encoding tractable.
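The shape of that encoding can be sketched as a QUBO builder. The per-node `values` and `pair_synergy` terms stand in for the linearized cascade dynamics, and the budget constraint sum(x) = k is enforced with the standard quadratic penalty expansion; all names and the penalty weight are our illustrative assumptions:

```python
import numpy as np

def cascade_qubo(values, pair_synergy, k, penalty=10.0):
    """Build an upper-triangular QUBO matrix Q so that minimizing
    x^T Q x over binary x selects ~k nodes maximizing linearized
    protection value.  Expanding penalty * (sum_i x_i - k)^2 with
    x_i^2 = x_i gives penalty*(1-2k) on the diagonal and 2*penalty
    on each off-diagonal pair (constant term dropped)."""
    n = len(values)
    Q = np.zeros((n, n))
    for i in range(n):
        Q[i, i] += penalty * (1 - 2 * k) - values[i]
        for j in range(i + 1, n):
            Q[i, j] += 2 * penalty - pair_synergy[i][j]
    return Q

def brute_force_min(Q):
    """Exhaustive QUBO minimizer, feasible only for small n."""
    n = Q.shape[0]
    best_x, best_e = None, float("inf")
    for bits in range(1 << n):
        x = np.array([(bits >> i) & 1 for i in range(n)], dtype=float)
        e = x @ Q @ x
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e
```

Once the objective is in this form, the same matrix can feed a classical QAOA simulation or, eventually, quantum hardware.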
We also had to be honest with ourselves about what the results meant. Early runs showed greedy matching the genetic algorithm on small instances, and we were tempted to conclude that greedy was good enough. Deeper analysis showed the gap opens significantly at larger network sizes, where greedy gets trapped in local optima that population-based methods escape.
Accomplishments that we're proud of
The benchmark framework is rigorous in a way most hackathon projects are not. Every comparison uses equal compute budgets, ten independent random seeds, and sensitivity analysis over a twenty percent parameter range. We can say with confidence that our results are not cherry-picked.
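The harness behind those claims can be sketched in a few lines. The `alg(scenarios, budget_evals, seed)` interface is an assumption of ours: each algorithm receives the same pre-generated failure scenarios and the same evaluation budget, and results are aggregated over ten seeds:

```python
import statistics

def benchmark(algorithms, scenarios, budget_evals, seeds=range(10)):
    """Run each algorithm on identical scenarios with an identical
    compute budget, over independent seeds; report mean and stdev of
    the damage metric each returns (interface is illustrative)."""
    results = {}
    for name, alg in algorithms.items():
        damages = [alg(scenarios, budget_evals, seed) for seed in seeds]
        results[name] = (statistics.mean(damages), statistics.stdev(damages))
    return results
```

Reporting the spread across seeds, not just the best run, is what lets a comparison like this survive scrutiny.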
The QAOA module works. Getting a quantum-inspired optimization to produce meaningful results on a cascade protection problem in a hackathon timeframe is something we did not expect to pull off.
What we learned
Cascade failure problems have structure that most generic optimization benchmarks miss. The fitness landscape is highly non-convex, and the high-importance nodes are not the same as the high-degree nodes, which means intuitive protection strategies leave significant resilience unrealized. We learned that algorithm selection is not a minor implementation detail on this problem class: it is the difference between the greedy baseline and 24 percent better protection for the same compute.
We also learned that honest reporting is harder than optimistic reporting. Acknowledging that greedy is competitive on small instances, that ILP is impractical at scale, and that our QAOA results are classical simulations cost us some headline-grabbing claims but made the work more credible.
What's next for Critical Infrastructure Cascade Defense Optimizer
We want to run the QAOA formulation on actual quantum hardware. The problem encoding is ready. We need access to a 50-plus qubit device to see whether hardware execution delivers the advantage that theory predicts. We also want to integrate real SCADA power grid datasets to move from synthetic benchmarks to validated real-world performance, and to explore online adaptive protection strategies for networks whose topology changes dynamically during a disruption event.