Inspiration

Every major pandemic of the last century, from the 1918 Spanish Flu to COVID-19, has exposed the same gap: the general public watches numbers climb on the news and follows policies with no way to actually see why those policies matter. Research-grade simulators like Covasim were used by governments during COVID to model policy outcomes, but they require epidemiological expertise to operate and take minutes per run. We built Panacea because we believe pandemic literacy is a public good. If someone could actually see what happens when a community waits 10 extra days to intervene, or watch infections ripple through a neighborhood when mask compliance drops from 60% to 30%, that intuition sticks in a way a chart never could. We wanted to give anyone, not just researchers, the ability to think like an epidemiologist. Panacea is a tool to help the general public understand the decisions being made, and start making better ones themselves.

What it does

Panacea is an interactive pandemic policy simulator that lets anyone experiment with real intervention decisions and immediately see the spread of the virus play out across a living community map. Users select from six disease presets: COVID-19 (Wuhan strain), H1N1 Swine Flu, Hantavirus (Andes strain), Human Metapneumovirus, Influenza A (H3N2), and the 1918 Spanish Flu. Each preset uses epidemiologically grounded values for R0, incubation period, infectious period, mortality rate, and asymptomatic fraction. Users control four policy sliders: when the government intervenes, mask compliance, vaccination rate, and lockdown intensity.
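
As a rough sketch of how presets and sliders are structured (the values below are illustrative approximations, not our exact calibrated parameters):

    # Illustrative preset table; values approximate published estimates,
    # not our exact calibrated parameters.
    DISEASE_PRESETS = {
        "covid19_wuhan": {
            "r0": 2.8,                # basic reproduction number
            "incubation_days": 5.2,   # mean time from exposure to infectiousness
            "infectious_days": 7.0,   # mean infectious period
            "mortality_rate": 0.01,   # infection fatality rate
            "asymptomatic_frac": 0.35,
        },
        "spanish_flu_1918": {
            "r0": 2.0,
            "incubation_days": 2.0,
            "infectious_days": 4.0,
            "mortality_rate": 0.025,
            "asymptomatic_frac": 0.15,
        },
        # ... remaining presets follow the same schema
    }

    # The four policy sliders exposed to the user
    POLICY_DEFAULTS = {
        "intervention_day": 30,     # day the government intervenes
        "mask_compliance": 0.6,     # fraction of agents wearing masks
        "vaccination_rate": 0.5,    # fraction vaccinated over the rollout
        "lockdown_intensity": 0.4,  # contact reduction after intervention
    }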

Each time they run the simulation, a community of dot agents is animated across a map, showing the outbreak unfold over 365 simulated days. The spread is not a pre-recorded animation but is generated in real time from a neural surrogate trained on 5,000 agent-based simulations. Every run produces:

  • Monte Carlo uncertainty analysis: p5/p50/p95 confidence intervals on peak cases, peak day, total deaths, days over hospital capacity, and attack rate
  • Sobol global sensitivity analysis: a ranking of which policy settings affected the outcome the most
  • Policy cost accounting: estimated from hospitalization volume and vaccination costs
  • An AI-generated plain-language explanation of the results
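
For concreteness, the shape of a single run's output is roughly the following (field names and values are a sketch, not our exact schema):

    # Rough shape of one run's output (illustrative names and placeholder values)
    seird = [[0, 0, 0, 0, 0]] * 365        # placeholder daily S/E/I/R/D counts
    result = {
        "trajectory": seird,               # drives the dot-map animation
        "monte_carlo": {                   # p5/p50/p95 per metric
            "peak_cases": {"p5": 900, "p50": 1400, "p95": 2300},
            "peak_day": {"p5": 41, "p50": 55, "p95": 78},
            # ... total_deaths, days_over_capacity, attack_rate
        },
        "sobol": {"S1": {}, "ST": {}},     # per-slider sensitivity indices
        "policy_cost": 0.0,                # hospitalization + vaccination costs
        "explanation": "",                 # Gemini narrative, voiced by ElevenLabs
    }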

How we built it

Agent-Based Model (Mesa): We built a custom SEIRD agent-based model in Python using the Mesa framework. Each agent carries demographic properties (age group), behavioral properties (mask wearing, vaccination status, contacts per day), and a health state. Disease dynamics are governed by contact-based transmission (with infectiousness differing between asymptomatic and symptomatic agents), age-dependent mortality, vaccination rollout, and intervention-triggered contact reduction. We ran 5,000 simulations across a swept parameter space covering all disease presets and intervention combinations, collecting daily S/E/I/R/D trajectories and scalar summary statistics per run.
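
A minimal sketch of this structure, written against the classic Mesa 1.x/2.x API (multipliers, age-risk weights, and helper names are illustrative, not our calibrated values):

    import random
    from mesa import Agent, Model
    from mesa.time import RandomActivation

    class Person(Agent):
        """One community member in the SEIRD model."""
        def __init__(self, uid, model, age_group):
            super().__init__(uid, model)
            self.state = "S"                 # S/E/I/R/D
            self.clock = 0                   # days spent in current state
            self.age_group = age_group
            self.masked = random.random() < model.mask_compliance
            self.vaccinated = random.random() < model.vaccination_rate
            self.contacts_per_day = 10

        def step(self):
            self.clock += 1
            p = self.model.params
            if self.state == "E" and self.clock >= p["incubation_days"]:
                self.state, self.clock = "I", 0
            elif self.state == "I":
                if self.clock >= p["infectious_days"]:
                    # Age-dependent mortality decided on exit from I
                    died = random.random() < p["mortality_rate"] * self.model.age_risk[self.age_group]
                    self.state, self.clock = ("D" if died else "R"), 0
                else:
                    self.infect_contacts()

        def infect_contacts(self):
            n = max(1, int(self.contacts_per_day * self.model.contact_scale))
            for other in random.sample(self.model.schedule.agents, n):
                if other.state == "S" and random.random() < self.model.transmission_prob(self, other):
                    other.state, other.clock = "E", 0

    class Community(Model):
        def __init__(self, n, params, mask_compliance, vaccination_rate,
                     intervention_day, lockdown_intensity):
            super().__init__()
            self.params = params
            self.mask_compliance = mask_compliance
            self.vaccination_rate = vaccination_rate
            self.intervention_day = intervention_day
            self.lockdown_intensity = lockdown_intensity
            self.contact_scale = 1.0
            self.age_risk = {"child": 0.2, "adult": 1.0, "senior": 5.0}  # illustrative
            self.schedule = RandomActivation(self)
            for i in range(n):  # (seeding of initial infections omitted for brevity)
                self.schedule.add(Person(i, self, random.choice(list(self.age_risk))))

        def transmission_prob(self, src, dst):
            beta = self.params["beta"]   # per-contact probability, calibrated from R0
            if src.masked or dst.masked:
                beta *= 0.5              # illustrative mask effect
            if dst.vaccinated:
                beta *= 0.3              # illustrative vaccine effect
            return beta

        def step(self):
            if self.schedule.steps == self.intervention_day:
                self.contact_scale = 1.0 - self.lockdown_intensity
            self.schedule.step()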

Neural Surrogate (PyTorch): We trained a multilayer perceptron in PyTorch to approximate the Mesa ABM, taking disease parameters, community context, and intervention settings as inputs and returning both scalar summaries and full 365-day S/E/I/R/D trajectories as outputs. This reduces a 30-second Mesa run to a millisecond forward pass, which is what makes Monte Carlo and Sobol analysis interactive: each requires thousands of runs.
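
The two-headed shape looks roughly like this (layer sizes and input dimension are illustrative, not our trained architecture):

    import torch
    import torch.nn as nn

    class Surrogate(nn.Module):
        """MLP mapping (disease params, context, policy) -> summaries + trajectories."""
        def __init__(self, n_inputs=12, n_summaries=5, horizon=365, compartments=5):
            super().__init__()
            self.horizon, self.compartments = horizon, compartments
            self.backbone = nn.Sequential(
                nn.Linear(n_inputs, 256), nn.ReLU(),
                nn.Linear(256, 256), nn.ReLU(),
            )
            self.summary_head = nn.Linear(256, n_summaries)
            self.trajectory_head = nn.Linear(256, horizon * compartments)

        def forward(self, x):
            h = self.backbone(x)
            summaries = self.summary_head(h)
            traj = self.trajectory_head(h).view(-1, self.horizon, self.compartments)
            return summaries, traj

    # One forward pass replaces a ~30 s Mesa run:
    model = Surrogate()
    x = torch.randn(1, 12)                 # scaled inputs (illustrative)
    with torch.no_grad():
        summaries, trajectory = model(x)   # trajectory: (1, 365, 5) S/E/I/R/D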

Statistical Analysis (SALib, NumPy): We implemented two layers of uncertainty quantification. The Monte Carlo layer runs 10,000 surrogate forward passes, sampling over disease parameter distributions, and returns confidence intervals on all output metrics. The Sobol layer uses SALib's Saltelli sampler to compute first-order (S1) and total-effect (ST) sensitivity indices, telling users which of their decisions had the greatest impact.
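
A condensed sketch of both layers (a toy function stands in for the surrogate; bounds and sample counts are illustrative):

    import numpy as np
    from SALib.sample import saltelli
    from SALib.analyze import sobol

    problem = {
        "num_vars": 4,
        "names": ["intervention_day", "mask_compliance",
                  "vaccination_rate", "lockdown_intensity"],
        "bounds": [[0, 60], [0, 1], [0, 1], [0, 1]],
    }

    def peak_cases(X):
        # Toy stand-in for the surrogate's batched forward pass
        return np.maximum(0, 10_000 + 40 * X[:, 0]
                          - 3_000 * X[:, 1] - 4_000 * X[:, 2] - 5_000 * X[:, 3])

    # Monte Carlo layer: sample, run, read off percentiles
    lo = [b[0] for b in problem["bounds"]]
    hi = [b[1] for b in problem["bounds"]]
    draws = np.random.uniform(lo, hi, size=(10_000, problem["num_vars"]))
    p5, p50, p95 = np.percentile(peak_cases(draws), [5, 50, 95])

    # Sobol layer: Saltelli sampling + variance decomposition
    X = saltelli.sample(problem, 1024)   # 1024 * (2*4 + 2) parameter vectors
    Si = sobol.analyze(problem, peak_cases(X))
    ranking = sorted(zip(problem["names"], Si["ST"]), key=lambda t: -t[1])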

AI Explanation Layer (Gemini, ElevenLabs): After each run, Gemini receives the full simulation output and the Sobol rankings and generates a natural-language narrative explaining the statistical analyses and what the user's policy choices produced. ElevenLabs text-to-speech then narrates the explanation, making the tool accessible to visually impaired users.
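
The prompt assembly is the interesting part. A minimal sketch (the model name and prompt wording are assumptions, not our exact ones):

    import google.generativeai as genai

    genai.configure(api_key="...")                     # key handling elided
    model = genai.GenerativeModel("gemini-1.5-flash")  # model name assumed

    def explain(mc_intervals, sobol_ranking):
        prompt = (
            "You are explaining a pandemic policy simulation to a general audience.\n"
            f"Outcome percentiles (p5/p50/p95): {mc_intervals}\n"
            f"Policy sensitivity ranking (Sobol ST): {sobol_ranking}\n"
            "In plain language, explain what the user's policy choices produced "
            "and which single choice mattered most."
        )
        return model.generate_content(prompt).text

    # The returned narrative is then passed to ElevenLabs text-to-speech for narration.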

Geospatial Visualization (Deck.gl, MapLibre): The dot visualization is rendered using Deck.gl's ScatterplotLayer overlaid on a MapLibre map.
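
Our frontend builds this layer in JavaScript; the same setup in Python via pydeck (deck.gl's Python binding) sketches the idea, with coordinates and colors illustrative:

    import pydeck as pdk

    STATE_COLORS = {"S": [80, 160, 255], "E": [255, 200, 0], "I": [255, 60, 60],
                    "R": [120, 220, 120], "D": [60, 60, 60]}

    def dot_layer(dots):
        # dots: [{"lon": ..., "lat": ..., "state": "I"}, ...] for one simulated day
        data = [{**d, "color": STATE_COLORS[d["state"]]} for d in dots]
        return pdk.Layer(
            "ScatterplotLayer",
            data=data,
            get_position="[lon, lat]",
            get_fill_color="color",
            get_radius=25,
        )

    deck = pdk.Deck(layers=[dot_layer([{"lon": -89.0, "lat": 40.0, "state": "I"}])],
                    initial_view_state=pdk.ViewState(latitude=40.0, longitude=-89.0,
                                                     zoom=11))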

Challenges we ran into

One challenge was building day-by-day trajectory tracking that could serve both the neural surrogate's training pipeline and the frontend's real-time animation. Another was getting the full pipeline to connect, from the agent-based model all the way through to the map visualization.
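
One way to serve both consumers is a single daily record that each one reads, sketched here with Mesa's DataCollector (a sketch of the approach, not our exact code):

    from mesa.datacollection import DataCollector

    def count(state):
        return lambda m: sum(a.state == state for a in m.schedule.agents)

    # Inside the model's __init__: one collector feeds both pipelines.
    collector = DataCollector(model_reporters={
        "S": count("S"), "E": count("E"), "I": count("I"),
        "R": count("R"), "D": count("D"),
    })
    # Each simulated day: collector.collect(model)
    # Training pipeline: collector.get_model_vars_dataframe() -> (365, 5) table per run
    # Frontend animation: the same daily rows drive the per-day dot recoloring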

Accomplishments that we're proud of

One accomplishment we're proud of is bridging the gap between our neural surrogate and the live visualization. The surrogate predicts outcomes in milliseconds, but it doesn't produce the per-timestep agent data the visualization needs. Reconciling the two, keeping the visualization synchronized with surrogate predictions while staying responsive enough for interactive use, was a real engineering problem, and getting it to feel seamless took several iterations.

The other accomplishment we're proud of is recognizing when our ABM was overreaching and scoping it down. Once the simulator was built, we tested it against real-world outbreak scenarios and found it produced wildly inaccurate results. Rather than chasing universal accuracy, which would have meant exploding computational cost and modeling complexity, neither realistic for a hackathon nor honest about what we could deliver, we made a deliberate choice to narrow scope to the average suburban American city. This let us calibrate parameters meaningfully and produce results we could actually defend, while acknowledging upfront the constraint every model in this space faces: you can't represent all possible scenarios without paying for it somewhere.
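
The core of that bridge is mapping aggregate compartment counts onto a fixed population of dots each frame; a sketch of the idea (not our exact code):

    import numpy as np

    STATES = np.array(list("SEIRD"))

    def dot_states_for_day(seird_counts, n_dots=2000):
        """Allocate visualization dots to S/E/I/R/D in proportion to the
        surrogate's aggregate compartment counts for one simulated day."""
        fractions = np.asarray(seird_counts, dtype=float)
        fractions /= fractions.sum()
        counts = np.floor(fractions * n_dots).astype(int)
        counts[0] += n_dots - counts.sum()   # absorb rounding remainder into S
        return np.repeat(STATES, counts)     # n_dots state labels for this frame

Any scheme like this also has to keep individual dots' states consistent from one frame to the next, so a recovered dot doesn't flicker back to susceptible; that consistency is where the "seamless" part gets hard.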

What we learned

Building Panacea taught us how much of pandemic modeling lives in the gap between what's technically possible and what's epistemically honest. Early on, we kept reaching for more sophistication — bigger models, more parameters, longer simulations — assuming complexity equaled credibility. The opposite turned out to be true. Every time we narrowed scope and stated our assumptions explicitly, the project got more defensible, not less. Scoping the ABM to suburban America wasn't a retreat; it was the move that made the rest of the work meaningful. We also learned how much architectural decisions ripple outward. Choosing to predict trajectories from a neural surrogate rather than running the ABM live shaped everything downstream — what the visualization could show, how Monte Carlo had to be structured, even what the AI explanation layer could reason about. There's no clean separation between "the model" and "the interface" in a system like this. Each layer constrains the others.

What's next for Panacea

The most important next step is calibration to real outbreaks: using Approximate Bayesian Computation against historical COVID, H1N1, and 1918 case curves to infer parameter posteriors, making predictions quantitatively grounded rather than just structurally correct. We also want to expand the contact network. The current single well-mixed population could become a multi-layer model with households, workplaces, schools, and communities, closer to what Covasim does, letting users explore layer-specific policies (school closures, work-from-home, gathering limits) instead of a single lumped "lockdown intensity." Finally, regional generalization: suburban America is one community type, and extending to dense urban cores, rural areas, and international contexts would dramatically expand reach. What makes outbreaks behave differently in Tokyo vs. Lagos vs. rural Nebraska is itself a question worth surfacing.
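
In its simplest rejection form, the ABC step would look something like this (a sketch assuming the surrogate stands in for the simulator; function names are hypothetical):

    import numpy as np

    def abc_rejection(observed_curve, sample_prior, simulate, n=100_000, q=0.001):
        """Rejection ABC: keep the prior draws whose simulated case curves
        land closest to the observed outbreak."""
        thetas = np.array([sample_prior() for _ in range(n)])
        dists = np.array([np.linalg.norm(simulate(t) - observed_curve)
                          for t in thetas])
        keep = dists <= np.quantile(dists, q)   # accept the closest q-fraction
        return thetas[keep]                     # approximate posterior samples

    # The millisecond surrogate is what makes 100k simulate() calls tractable.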

Built With

deck.gl · elevenlabs · gemini · maplibre · mesa · numpy · python · pytorch · salib