Inspiration

Every year the FDA recalls hundreds of drugs — contaminated batches, mislabeled bottles, sterility failures. But when a recall is issued, there's a dangerous gap: no fast, automatic way to reach the patients actually holding that medication. The alert sits in a government database while people keep taking a recalled drug, sometimes for days. We wanted to close that gap — to turn a passive government notice into an active, nationwide safety response that reaches real people in seconds.

What it does

FDA SafetyNet is an autonomous nationwide drug-recall defense system. It:

  1. Ingests live recalls from the FDA's openFDA enforcement feed.
  2. Scores severity: an ML model reads the unstructured recall text and classifies it as Lethal, Moderate, or Minor (Class I/II/III).
  3. Finds the cohort: matches the recalled drug against 1,000,000 synthetic patient prescriptions across 5,000 pharmacies nationwide, in real time.
  4. Dispatches a severity-aware alert to every affected patient and generates a patient-facing alert card live at runtime.

Everything is shown on a live, explainable dashboard: from the raw recall, to the model's severity reasoning, to the nationwide fan-out animating across an interactive US map — all while keeping patient data fully de-identified.
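
As a rough illustration of the first two steps, here is a minimal sketch of how a date-windowed openFDA query could be built and how a recall record could be turned into a labeled training example. The endpoint and the `classification`, `reason_for_recall`, and `report_date` fields come from the public openFDA enforcement API; the helper names and the Class-to-label mapping table are assumptions for illustration, not the project's actual code.

```python
from urllib.parse import urlencode

# Public openFDA drug enforcement endpoint.
OPENFDA_URL = "https://api.fda.gov/drug/enforcement.json"

# Assumed mapping from the FDA recall class to the labels shown on the dashboard.
SEVERITY_BY_CLASS = {
    "Class I": "Lethal",
    "Class II": "Moderate",
    "Class III": "Minor",
}


def build_query(since: str, until: str, limit: int = 100) -> str:
    """Return a URL for recalls reported between two YYYYMMDD dates (hypothetical helper)."""
    params = {
        "search": f"report_date:[{since} TO {until}]",
        "limit": limit,
    }
    return f"{OPENFDA_URL}?{urlencode(params)}"


def to_labeled_example(record: dict) -> tuple[str, str]:
    """Pair the free-text recall reason with the label derived from the FDA class."""
    text = record.get("reason_for_recall", "")
    label = SEVERITY_BY_CLASS[record["classification"]]
    return text, label
```

Keeping the query builder separate from the HTTP call makes the incremental window easy to test without touching the network.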

How we built it

  • Ingestion: Airbyte (PyAirbyte declarative source) pulling the openFDA drug enforcement API incrementally.
  • Real-time lakehouse: ClickHouse with materialized views that match recalls to patients and roll up alerts by geography the instant data lands.
  • ML severity: a scikit-learn pipeline (TF-IDF → SVD → Logistic Regression), experiment-tracked with Guild.ai, using openFDA's recall classification as ground truth.
  • Orchestration & API: a FastAPI backend streaming live pipeline events over WebSockets to a React dashboard with a D3-driven US map.
  • Runtime UI generation: OpenUI/OpenAI generates the patient alert card on the fly, themed to the recall's severity.
  • Synthetic data at scale: ~5,000 pharmacies and ~1,000,000 customer prescriptions generated with Faker/NumPy/Pandas, with deliberate overlap on real recalled NDCs so matches are realistic.
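
The match-and-roll-up logic that the materialized views perform can be pictured with a small in-memory analog. This is plain Python standing in for ClickHouse, not the project's actual view definitions; the field names (`patient_id`, `ndc`, `state`) and the NDC values are placeholders.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def match_cohort(recalled_ndcs: Set[str], prescriptions: Iterable[Dict]) -> List[Dict]:
    """Keep only prescriptions whose NDC appears in the recalled set."""
    return [rx for rx in prescriptions if rx["ndc"] in recalled_ndcs]


def rollup_by_state(cohort: Iterable[Dict]) -> Counter:
    """Count affected prescriptions per state."""
    return Counter(rx["state"] for rx in cohort)


# Made-up rows; the NDC values are placeholders, not real recalled products.
prescriptions = [
    {"patient_id": "p1", "ndc": "00000-0001-01", "state": "CA"},
    {"patient_id": "p2", "ndc": "00000-0002-01", "state": "TX"},
    {"patient_id": "p3", "ndc": "00000-0001-01", "state": "CA"},
    {"patient_id": "p4", "ndc": "00000-0001-01", "state": "NY"},
]

cohort = match_cohort({"00000-0001-01"}, prescriptions)
alerts_by_state = rollup_by_state(cohort)
```

In the database version, the same two operations run inside the engine when rows are inserted, so the application never has to scan the prescription table itself.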

Challenges we ran into

  • Dependency hell: Python 3.14 was too new for key packages, so we pinned to 3.12, set up isolated virtual environments, and wrote compatibility shims (e.g. for Guild.ai's imp/pkg_resources conflicts).
  • Real-time matching: getting ClickHouse materialized views to trigger on insert and roll up alerts by geography nationwide in under a second at million-row scale.
  • Class imbalance: the live recall feed skews toward less-severe classes, so we trained on a larger, balanced historical dataset to keep the model honest.
  • Privacy without losing the story: designing so PII never enters the LLM or the event stream, while still demonstrating real nationwide reach.
  • Real alert delivery: Twilio trial accounts block custom message bodies on both SMS and WhatsApp, so we built a channel abstraction (Twilio, WhatsApp, CallMeBot, Telegram) that degrades gracefully.
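
One way to picture the channel abstraction from the last bullet is an ordered list of senders that is tried until one succeeds. The sketch below uses stub senders only; `ChannelError`, `send_with_fallback`, and the channel names are hypothetical, and no real Twilio, WhatsApp, CallMeBot, or Telegram API is called.

```python
from typing import Callable, List, Optional, Tuple


class ChannelError(Exception):
    """Raised by a sender that cannot deliver the message."""


# A sender takes (recipient, body) and raises ChannelError on failure.
Sender = Callable[[str, str], None]


def send_with_fallback(
    channels: List[Tuple[str, Sender]], recipient: str, body: str
) -> Optional[str]:
    """Try each channel in order; return the name of the first that succeeds."""
    for name, send in channels:
        try:
            send(recipient, body)
            return name
        except ChannelError:
            continue  # degrade to the next channel
    return None  # every channel failed


# Stub senders for illustration only; no real provider API is called.
delivered: List[Tuple[str, str]] = []


def blocked_sender(recipient: str, body: str) -> None:
    """Simulates a provider that rejects custom message bodies."""
    raise ChannelError("custom message bodies are not allowed")


def recording_sender(recipient: str, body: str) -> None:
    """Simulates a working channel by recording the message."""
    delivered.append((recipient, body))


used = send_with_fallback(
    [("twilio_sms", blocked_sender), ("telegram", recording_sender)],
    recipient="demo-recipient",
    body="Recall alert: contact your pharmacy.",
)
```

Returning the channel name, rather than a bare boolean, lets the dashboard show which route each alert actually took.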

Accomplishments that we're proud of

  • An end-to-end pipeline that goes from a live FDA recall to a nationwide, de-identified patient alert in seconds.
  • A genuinely explainable dashboard — you can narrate exactly what every step is doing, including how the model reached its severity decision.
  • Privacy by design: patient names and numbers never leave the secure data layer; the AI only ever operates on de-identified cohorts.
  • Demonstrating national scale: matching one million prescriptions across all 50 states in real time on a local machine.
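
The privacy-by-design point can be sketched as an allow-list projection applied before any row leaves the data layer. The field names and the `deidentify` helper are assumptions for illustration; the project's actual boundary may be enforced differently (for example, inside the database).

```python
from typing import Dict, Tuple

# Hypothetical allow-list: only these fields may leave the data layer.
ALLOWED_FIELDS: Tuple[str, ...] = ("ndc", "state")


def deidentify(row: Dict[str, str]) -> Dict[str, str]:
    """Project a prescription row onto the allow-list, dropping everything else."""
    return {key: row[key] for key in ALLOWED_FIELDS if key in row}


# Made-up row with placeholder values.
row = {
    "name": "Jane Example",
    "phone": "555-0100",
    "ndc": "00000-0001-01",
    "state": "CA",
}
safe_row = deidentify(row)
```

An allow-list is used here instead of a deny-list so that any new column added later is excluded by default until someone explicitly permits it.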

What we learned

  • Materialized views turn a database into a real-time event engine — pushing matching logic into ClickHouse was far faster than doing it in app code.
  • Using openFDA's own recall classification as ML ground truth made the severity model both accurate and defensible.
  • "Explainability" is a product feature, not an afterthought — isolating each pipeline stage into its own panel made the system instantly understandable.
  • A clean environment-driven config makes a local-first build cloud-portable with zero code changes.
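
A minimal sketch of what environment-driven config can look like, assuming hypothetical variable names (`CLICKHOUSE_HOST`, `CLICKHOUSE_PORT`, `OPENFDA_URL`) and local defaults. The same code then runs locally or in the cloud depending only on what the environment provides.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    clickhouse_host: str
    clickhouse_port: int
    openfda_url: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment, falling back to local defaults."""
    env = os.environ if env is None else env
    return Settings(
        clickhouse_host=env.get("CLICKHOUSE_HOST", "localhost"),
        # 8123 is ClickHouse's default HTTP port.
        clickhouse_port=int(env.get("CLICKHOUSE_PORT", "8123")),
        openfda_url=env.get(
            "OPENFDA_URL", "https://api.fda.gov/drug/enforcement.json"
        ),
    )


# Local run: nothing set, so defaults apply.
local = load_settings({})

# Hosted run: the values below are examples, not real endpoints.
hosted = load_settings(
    {"CLICKHOUSE_HOST": "clickhouse.example.com", "CLICKHOUSE_PORT": "8443"}
)
```

Passing the mapping in explicitly, as done above, also makes the loader easy to test without mutating the real process environment.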

What's next for FDA SafetyNet

  • Cloud deployment on ClickHouse Cloud, Airbyte Cloud, and Render for a live public demo and true production scale.
  • Real carrier-grade delivery via an upgraded SMS/WhatsApp provider, and integration with pharmacy systems for verified opt-in.
  • A multi-agent layer (triage, outreach, presenter) for adaptive, reasoning-driven responses beyond the deterministic pipeline.
  • Broader coverage — extending beyond drugs to food and device recalls, and adding multilingual, accessibility-aware patient alerts.
