# Aegis
Know the storm before your bits flip. Aegis is a system that turns live space weather and device telemetry into actionable radiation risk for edge compute—so teams can throttle writes, checkpoint work, and ride out solar events instead of discovering corruption after the fact.
We combine NOAA L1 feeds (solar wind, magnetometer, differential protons, X-rays), a LightGBM forecast trained on years of GOES-18 + ACE/DSCOVR history, and a transparent risk layer that maps environment + wear into severity tiers and recommended mitigations. A Next.js dashboard shows fleet health at a glance; device drill-downs surface live telemetry, factor breakdowns, and forecast curves. The dashboard polls an ESP32 gateway for live Geiger counts and per-chip NVS wear data, persisting snapshots into Postgres on every forecast cycle.
Scope: Aegis forecasts environmental radiation severity and operational risk—it does not claim to predict individual bit flips. That boundary keeps the science honest and the product defensible.
## Why it matters
Solar storms and elevated particle flux raise the odds of silent data errors, especially on constrained flash and unshielded nodes. Operators rarely get a single “radiation dial” tied to now and the next few hours. Aegis closes that gap with:
- Minute-resolved context from the same differential channels the model was trained on (not a mismatched integral feed).
- Multi-horizon outlook (e.g. 2h curve from the Flask service, 6h/12h scalars from merged NOAA + optional hosted ML).
- Risk + actions the UI can explain—tiers, factors, and copy-paste-style mitigations for demos.
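The tier-and-factor idea can be sketched as a simple threshold map. The tier names, flux cut points, and wear bump below are illustrative assumptions for the demo narrative, not Aegis's actual risk engine (NOAA's S-scale does start minor proton storms at 10 pfu, which motivates the lowest cut).

```python
def risk_tier(proton_flux_pfu: float, wear_pct: float) -> str:
    """Map >10 MeV proton flux (pfu) and flash wear (0-100%) to a severity tier.

    Thresholds and tier names are hypothetical; only the 10 pfu minor-storm
    boundary mirrors NOAA's S-scale.
    """
    # Environmental severity from flux alone (illustrative cut points)
    if proton_flux_pfu >= 1000:
        env = 3
    elif proton_flux_pfu >= 100:
        env = 2
    elif proton_flux_pfu >= 10:
        env = 1
    else:
        env = 0
    # Assume heavily worn flash is more susceptible: bump one tier past 80% wear
    if wear_pct > 80 and env > 0:
        env += 1
    return ["nominal", "elevated", "high", "severe", "critical"][min(env, 4)]

print(risk_tier(5.0, 10.0))    # quiet conditions, fresh flash
print(risk_tier(250.0, 85.0))  # storm conditions on worn flash
```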
## Architecture (high level)

```mermaid
flowchart LR
  subgraph ingest [Ingest]
    NOAA[NOAA SWPC JSON]
    ESP[ESP32 gateway]
  end
  subgraph compute [Compute]
    Flask[Flask forecast :3002]
    Next[Next.js dashboard :3000]
  end
  subgraph optional [Optional]
    DBX[Databricks serving]
    Remote[REMOTE_FORECAST_URL]
  end
  NOAA --> Next
  Flask --> Next
  Remote --> Next
  DBX --> Next
  ESP --> Next
```
| Piece | Role |
|---|---|
| `dashboard/` | Next.js 15 app: fleet grid, device pages, `GET /api/forecast`, `GET /api/risk`, tRPC + Drizzle scaffold. Polls the ESP32 gateway on each forecast cycle to persist Geiger + NVS wear data. |
| `forecast_service/` | Flask `POST /forecast`: 120-step proton trajectory from the baseline model bundle in `artifacts/`. |
| `training/` | LightGBM baseline + onset-classifier scaffolding; Databricks copies under `databricks/`. |
| `data/cleaned_data/` | Pipelines for `ml_training_data_v2` / `v3` (Parquet); large artifacts via Git LFS. |
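A minimal call to the Flask service might look like the following. The empty request body and the `trajectory` response key are assumptions based only on the "120-step proton trajectory" description above; check `forecast_service/README.md` for the actual contract.

```python
import json
import urllib.request

def fetch_forecast(base_url: str = "http://localhost:3002") -> list:
    """POST to the Flask forecast endpoint and return the trajectory.

    Body shape and the 'trajectory' key are hypothetical; the service is
    documented only as returning a 120-step proton forecast.
    """
    req = urllib.request.Request(
        f"{base_url}/forecast",
        data=json.dumps({}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        payload = json.load(resp)
    # Fall back to the raw payload if the key name differs
    return payload.get("trajectory", payload)
```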
## Quick start

### 1. Clone with Git LFS

Large datasets (`ml_training_data_v2.csv`, Parquet) are stored with Git LFS. After cloning:

```sh
git lfs install
git lfs pull
```
### 2. Install everything

From the repo root:

```sh
sh install-all.sh
```

This installs the dashboard's npm dependencies and the Python dependencies for `forecast_service`.
### 3. Environment files

Create `.env` files from the examples (values are local-dev defaults; adjust as needed):

| Package | Copy from |
|---|---|
| Dashboard | `dashboard/.env.example` → `dashboard/.env` |
#### Dashboard essentials

- `DATABASE_URL`: Postgres URL (Drizzle / legacy demos). The forecast routes can run without heavy DB usage, but the template expects this to be set.
- `FLASK_FORECAST_URL`: defaults to `http://localhost:3002` so the app can merge the 2h model output with live NOAA data.
#### Optional env (see `dashboard/.env.example`)

- `ESP32_GATEWAY_BASE_URL`: URL of the ESP32 gateway; polled each forecast cycle for Geiger CPM and per-chip NVS wear.
- `REMOTE_FORECAST_URL`: hosted `GET /api/forecast`-compatible JSON for 6h/12h blocks.
- `DATABRICKS_FORECAST_URL` + `DATABRICKS_TOKEN`: wire in model serving that accepts `dataframe_records`.
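Polling the gateway each forecast cycle amounts to an HTTP GET plus normalization before the snapshot is persisted. The payload shape below (`cpm` plus a per-chip `chips` list with `nvs_wear_pct`) is a hypothetical example, not the gateway's documented schema.

```python
def parse_gateway_snapshot(payload: dict) -> dict:
    """Normalize an ESP32 gateway response into a flat snapshot row.

    The 'cpm', 'chips', and 'nvs_wear_pct' field names are assumptions
    made for illustration.
    """
    chips = payload.get("chips", [])
    wear = [c.get("nvs_wear_pct", 0.0) for c in chips]
    return {
        "geiger_cpm": float(payload.get("cpm", 0.0)),
        "max_nvs_wear_pct": max(wear, default=0.0),
        "chip_count": len(chips),
    }

sample = {"cpm": 22, "chips": [{"nvs_wear_pct": 12.5}, {"nvs_wear_pct": 40.0}]}
print(parse_gateway_snapshot(sample))
```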
### 4. Run the stack

One terminal (both services):

```sh
sh start-all.sh
```
This starts:
- Next.js dev server (http://localhost:3000)
- Flask forecast service (http://localhost:3002)
Press Ctrl+C to stop all.
#### Or run à la carte

```sh
# Terminal A - dashboard
cd dashboard && npm run dev

# Terminal B - forecast API
python -m forecast_service.app
```
## How to use the app (demo flow)

### Fleet overview
- Open http://localhost:3000.
- You’ll see a fleet grid of cards (demo nodes), an alert banner, and a top bar with sync hints.
- Each card shows risk tier, flip probability, wear, and service status (telemetry / forecast / wear).
### Device detail
- Click any card (or go directly to `/device/demo-node-01`, `demo-node-02`, or `demo-node-03`).
- Left column: risk summary, live telemetry (radiation / magnetic), wear detail.
- Right column: live forecast and live risk panels (fed from the merged forecast + risk APIs when the backend is up), factor breakdown, forecast chart, risk history, recommended actions.
- Bottom: computation demo strip for the “workload under stress” narrative.
Note: fleet and device body content still flows from `dashboard/src/lib/mock-data.ts` for fast UI iteration. The forecast/risk panels use the live `/api/forecast` and `/api/risk` integration path described in `dashboard/README.md`. Swap the mock imports for tRPC/DB when you wire up persistent devices.
## API smoke tests (for judges / integration)
With the dashboard running:
- http://localhost:3000/api/forecast — Merged JSON: NOAA-backed features, optional 6h/12h remote/Databricks, optional 2h curve from Flask.
- http://localhost:3000/api/risk — Risk engine output using the same merged forecast inputs.
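With the stack running, you would expect both endpoints to answer 200 with JSON bodies. The helper below only checks status and parseability, since the exact response schemas live in `dashboard/README.md`; it is a sketch, not part of the repo's test suite.

```python
import json
import urllib.request

def smoke(url: str) -> bool:
    """Return True if the endpoint answers 200 with parseable JSON."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            json.load(resp)  # raises ValueError on non-JSON bodies
            return resp.status == 200
    except (OSError, ValueError):
        return False

for path in ("/api/forecast", "/api/risk"):
    print(path, "ok" if smoke(f"http://localhost:3000{path}") else "unreachable")
```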
More detail: dashboard/README.md, forecast_service/README.md.
## npm from repo root

There is no single root `node_modules`; each package is separate. After `install-all.sh`:

| Command | What it runs |
|---|---|
| `npm run dev` | Next.js dev server (`dashboard/`) |
| `npm run build` | Production build |
| `npm test` | Dashboard Vitest suite |
| `npm run lint` | ESLint (dashboard) |
## Data & ML (subsystem overview)

- Raw sources (local, gitignored): ACE/DSCOVR solar wind, GOES-18 SGPS proton CSVs, GOES-18 XRS NetCDF; see `CLAUDE.md` for paths and the v3 feature table.
- Canonical training file: `data/cleaned_data/ml_training_data_v3.parquet` (v2 + XRS derivatives).
- Scripts: `clean_dataset_v2.py`, `fetch_xray.py`, `merge_xray_v3.py`, etc., with a suggested venv under `data/.venv/` (pandas, numpy, pyarrow, xarray, netCDF4, tqdm).
- Training: `training/aegis_baseline.py` runs per-horizon regression on forward-max `log10(J_gt_10MeV)` with chronological splits (validation includes the May 2024 Gannon storm window).
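The forward-max target construction can be sketched with pandas: for each timestamp, take the maximum of `log10(flux)` over the next `h` samples, which is what the per-horizon regressors learn to predict. The exclusive-of-now convention and the column handling below are illustrative assumptions, not the exact logic of `aegis_baseline.py`.

```python
import numpy as np
import pandas as pd

def forward_max_target(flux: pd.Series, horizon_steps: int) -> pd.Series:
    """Max of log10(flux) over the next `horizon_steps` samples (excluding now).

    Implemented by reversing the series so a trailing rolling max becomes a
    forward-looking one; the final samples have no full window and end as NaN.
    """
    log_flux = np.log10(flux.clip(lower=1e-3))  # floor avoids log10(0)
    fwd = log_flux[::-1].rolling(horizon_steps, min_periods=1).max()[::-1]
    return fwd.shift(-1)  # drop the current sample from the window

flux = pd.Series([1.0, 10.0, 100.0, 1000.0, 1.0])
print(forward_max_target(flux, horizon_steps=2).tolist())
```

Chronological splitting then reduces to slicing the frame by timestamp (train before a cutoff date, validate after), so storm windows like May 2024 stay intact in validation.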
## Team
He ++ — Sean · Deep · Evin · Nico · Ethan
(Subsystem notes and API contracts live in `claude/CLAUDE.md` and `CLAUDE.md`.)
## License / hackathon
Built for a hackathon demo. Verify assumptions before any production deployment; space-weather products and model outputs are no substitute for mission-critical radiation-hardening analysis.