AfterShock — Workforce Disruption Intelligence Platform

When a natural disaster hits, who loses their jobs, how many, and how long does recovery take?


Inspiration

Natural disasters don't just destroy infrastructure — they destroy livelihoods. Yet workers, employers, and policymakers have almost no data-driven tools to anticipate employment disruption before it happens. We built AfterShock to change that.


What It Does

AfterShock is a full-stack AI platform that predicts, explains, and advises on the employment impact of natural disasters across the United States. It answers three questions simultaneously:

  • When will a disaster likely strike? (frequency forecasting)
  • Who loses jobs and how many? (sector-level employment impact)
  • What should you do about it? (AI-powered guidance tailored to your role)

Users can explore an interactive disaster map, run sector-by-sector job loss predictions for any state and disaster type, and chat with an AI advisor that draws on all three data layers at once.


How We Built It

The Data Pipeline

We processed two large public datasets offline:

  • 45,000+ FEMA Major Disaster declarations (2000–2026) — cleaned, filtered to DR-type declarations, and joined to county FIPS codes
  • 125,000+ employment separation records — every job ending (quit, layoff, retirement) with industry and county data

We merged them using a 3-window post-disaster structure: measuring job endings in the 0–6, 6–12, and 12–18 month windows after each disaster, compared against a 2–3 year baseline. This captures the full shock → recovery → normalization arc that a single window misses entirely.
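In pandas, the window labeling can be sketched roughly like this. The function, column values, and calendar-month arithmetic are illustrative stand-ins for the actual pipeline, which works from the full FEMA and separations tables:

```python
import pandas as pd

def label_post_disaster_window(sep_date, disaster_date):
    """Assign a separation record to a post-disaster window.

    Returns "0-6", "6-12", or "12-18" (months after the disaster),
    or None if the record falls outside all three windows.
    """
    months = (sep_date.year - disaster_date.year) * 12 + (sep_date.month - disaster_date.month)
    if 0 <= months < 6:
        return "0-6"
    if 6 <= months < 12:
        return "6-12"
    if 12 <= months < 18:
        return "12-18"
    return None

# Toy example: one disaster declaration date, four separation dates
disaster = pd.Timestamp("2021-08-29")
seps = pd.Series(pd.to_datetime(["2021-09-15", "2022-03-01", "2022-11-20", "2023-05-01"]))
windows = seps.apply(label_post_disaster_window, disaster_date=disaster)
print(windows.tolist())
```

Separations labeled this way can then be grouped per disaster × sector × window and compared against the 2–3 year baseline rate to get excess exits.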

After feature engineering across 85 dimensions (disaster type, sector, temporal patterns, interaction features), we trained our models on ~12,000 rows.

Model 1 — XGBoost Employment Impact Predictor

We trained an XGBoost regressor (500 trees) to predict excess job separations by sector, disaster type, and county. We used GroupKFold cross-validation to prevent data leakage — all rows from a given disaster go into the same fold, forcing true generalization to unseen events.

Results: MAE of 0.636, R² of 0.887 — an 83% improvement over the naive baseline. We also trained and rejected an LSTM: with only 6 time steps and ~3,400 rows, it couldn't learn meaningful patterns (MAE 2.583 vs. XGBoost's 0.636). XGBoost was the clear winner on tabular data.

Model 2 — Prophet Disaster Frequency Forecasting

We used Facebook Prophet to forecast monthly disaster frequency per state × disaster type, 6 years forward (2026–2032). Prophet's additive decomposition handles seasonality (Atlantic hurricane season peaks in September), missing data, and outliers cleanly. We compared it against ARIMA, Negative Binomial, and LSTM variants — Prophet won on interpretability and cross-validated accuracy.

Coverage: 142 state × disaster type combinations with 26 years of training history.

The RAG Intelligence Pipeline

The AI advisor runs a 7-step pipeline on every chat message:

  1. Audience detection — classifies the user as worker, employer, policymaker, investor, or insurer from their job title and question
  2. Prophet injection — pulls the disaster frequency forecast for the relevant state × disaster type
  3. XGBoost injection — pulls pre-computed sector job loss predictions
  4. SQL aggregation — runs structured queries (sector rankings, portfolio risk, demand surge) via an in-memory SQLite engine loaded from both model outputs
  5. ChromaDB retrieval — semantic search across 441 chunks: state unemployment guides, FEMA program docs, benefits resources, and model narratives
  6. Prompt assembly — all four data sources merged into a single augmented system prompt
  7. LLM streaming — response streamed token-by-token via Claude API (or Ollama locally)
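Step 1 can be sketched as a simple keyword classifier. The five categories match the platform's audiences, but the keyword lists, scoring, and default below are illustrative assumptions, not the project's actual rules:

```python
# Illustrative keyword lists per audience (assumed, not the real config)
AUDIENCE_KEYWORDS = {
    "worker":      ["my job", "laid off", "unemployment benefits", "employee"],
    "employer":    ["my business", "my staff", "payroll", "hiring"],
    "policymaker": ["policy", "legislation", "agency", "constituents"],
    "investor":    ["portfolio", "holdings", "exposure", "returns"],
    "insurer":     ["claims", "premiums", "underwriting", "policyholders"],
}

def detect_audience(message: str, default: str = "worker") -> str:
    """Score each audience by keyword hits; fall back to a default."""
    text = message.lower()
    scores = {
        aud: sum(kw in text for kw in kws)
        for aud, kws in AUDIENCE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default

print(detect_audience("How exposed is my portfolio to Gulf Coast hurricanes?"))
```

The detected audience then shapes which guidance documents are retrieved and how the system prompt frames the answer.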

The SQL layer is essential: ChromaDB can find relevant documents but can't aggregate across 148 rows to answer "which 10 state × sector combinations have the highest combined frequency × job loss risk?" The cross-model SQL join (Prophet frequency × XGBoost job loss) produces a composite risk score no single model can generate alone.

Key Finding

Hurricanes show negative average excess exits (−0.17). People hold onto jobs during crises, reconstruction absorbs displaced workers, and federal aid props up businesses. The employment impact of natural disasters is far more complex than simple job loss — and our 3-window model captures that nuance.


Tech Stack

Backend: FastAPI · XGBoost · Prophet · ChromaDB · SentenceTransformers (all-MiniLM-L6-v2) · SQLite · Anthropic Claude API · Ollama

Frontend: React 19 · TypeScript · Vite · Tailwind CSS · Recharts · Leaflet

Data: FEMA OpenFEMA API · State workforce separation records · Pandas · NumPy


Challenges

  • Data leakage — required GroupKFold cross-validation to ensure the model truly generalizes to unseen disasters, not just unseen rows from known events
  • Sparse county-level data — only 75 FIPS codes had enough job data, limiting rural coverage
  • Multi-source RAG — pure vector search can't aggregate; pure SQL can't retrieve narrative context. Building a system that routes between both required a custom keyword-based query router
  • LSTM rejection — we invested time building a hybrid LSTM architecture before concluding it was the wrong tool for this dataset size and sequence length
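The keyword-based router mentioned above can be sketched like this; the hint list is an assumption for illustration, not the project's actual routing table:

```python
# Assumed aggregation cues that should route a question to the SQL layer
AGGREGATION_HINTS = ("top", "highest", "rank", "most", "compare", "total", "combined")

def route_query(question: str) -> str:
    """Return 'sql' for aggregation-style questions, 'vector' otherwise."""
    q = question.lower()
    return "sql" if any(hint in q for hint in AGGREGATION_HINTS) else "vector"

print(route_query("Which 10 state-sector combinations have the highest risk?"))  # prints "sql"
print(route_query("How do I apply for disaster unemployment assistance?"))       # prints "vector"
```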

What We Learned

The 3-window post-disaster structure was our biggest methodological insight. Extending from a single 6-month window to three consecutive windows cut XGBoost MAE in half (1.275 → 0.636) and revealed that sector recovery follows distinct non-linear trajectories — Retail collapses and recovers slowly, Construction often gains workers during rebuilding, Healthcare drops then rebounds fastest.

We also learned that combining forecasting and impact models into a single SQL risk score (frequency × severity) surfaces insights neither model produces alone.


What's Next

  • Expand job data beyond 75 major metros to rural counties
  • Add exogenous regressors to Prophet (ENSO indices, sea surface temperatures) for better hurricane outlier handling
  • Variable-width disaster windows (wildfires hit faster than floods)
  • Causal inference methods to separate disaster-driven exits from background economic churn
