From Signal To Label

I worked as a data scientist at RTI International, where I spent time on health-related research: social determinants of health, public health data systems, the kinds of projects where you stare at adverse outcomes long enough to start asking why some of them were preventable. Even after that work ended, the questions stayed. Then earlier this year I watched Painkiller on Netflix, the show about the Sacklers and OxyContin. The thing that stuck with me wasn't the corporate villainy; it was the lag. Reports of harm sat in the FDA's adverse event database for years. Doctors saw patients destroyed. Families filed complaints. And the warnings came late. The system was supposed to catch this. It didn't, or it caught it slowly, and people died in the gap.

I started wondering whether OxyContin was anomalous or representative. Was the lag a one-time failure, or the system's normal mode? Nobody seemed to have measured it at scale. So I built the thing that would answer the question.

What I built: an analysis pipeline for 30 widely prescribed drugs that received major safety warnings, including Xeljanz, Zantac, the CAR-T cell therapies, and GLP-1 agonists. For each drug, I measured three things: the lag from first reported harm to label change, what triggered the FDA to act, and whether the warning moved the drug's sales. The output is a public-facing page that walks any reader through the cases.

What I learned: the lag is real. The median is about 5 years and the maximum is 37 years. But the headline finding wasn't what I expected. FAERS, the post-market reporting system designed to catch problems, usually isn't what triggers a warning. In about half of cases, sponsor-mandated post-approval trials surface the issue. The FDA built the right system; the lag is the structural cost of doing safety science properly. And when warnings finally arrive, they barely move drug sales unless a clean competitor is ready to take share. The accountability loop people imagine doesn't really exist.
The story the press tells ("the system is broken, FAERS missed it") is mostly wrong. The system works the way it was designed. The cost is structural, and it falls on patients during the years-long gap between first signal and confirmed warning. That's a more uncomfortable truth than the broken-system narrative, and it's the one the data supports.

How findings were verified: every claim on the page is traceable to a primary source. Lag dates were verified by reading actual SPL label diffs across DailyMed version histories; the canonical case (Xeljanz, 8.5 years) matches the FDA's own audit trail exactly. Trigger classifications used majority voting across three independent runs of an LLM agent reading the actual FDA Drug Safety Communication text, with a 7-drug ground-truth subset hand-verified against the source documents. Revenue figures come from SEC 10-K and 20-F filings, cross-checked against company annual reports. Limitations are documented openly: drugs withdrawn before 2012 have sparse FAERS coverage, and some warnings predate the data systems I have access to. Those drugs are flagged on the page rather than silently excluded.

How I built it: four notebooks in Zerve. FAERS counts come from openFDA, label histories from DailyMed, trigger classifications from FDA Drug Safety Communications, and revenue from SEC filings. LLMs read regulatory prose at scale to extract structured judgments; before LLMs, the same work would have taken months of manual document review.

Challenges: SPL label parsing is messy, and the format changed multiple times. Many drugs have multiple labelers, each with its own version history. Matching FAERS drug names across decades of inconsistent free text required real care. Some warnings predate openFDA's coverage entirely.
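
To make two of the steps above concrete, here is a minimal sketch of how the lag measurement and the three-run majority vote could look in Python. It is an illustration under stated assumptions, not the project's actual notebook code: the function names (`lag_years`, `majority_label`), the trigger label strings, and the dates are all hypothetical placeholders.

```python
from collections import Counter
from datetime import date


def lag_years(first_report: date, label_change: date) -> float:
    """Years between the first reported harm and the label change.

    Uses 365.25 days per year as an approximation; the real pipeline
    may define the lag differently.
    """
    return round((label_change - first_report).days / 365.25, 1)


def majority_label(run_labels):
    """Return the label chosen by a strict majority of runs.

    Returns None when no label wins more than half the votes, so the
    case can be sent to manual review instead of being guessed.
    """
    if not run_labels:
        return None
    label, votes = Counter(run_labels).most_common(1)[0]
    return label if votes > len(run_labels) / 2 else None


# Synthetic example: dates and labels are placeholders, not project data.
example_lag = lag_years(date(2010, 1, 1), date(2015, 1, 1))
example_trigger = majority_label(
    ["post_approval_trial", "post_approval_trial", "faers_signal"]
)
no_consensus = majority_label(
    ["post_approval_trial", "faers_signal", "literature"]
)
```

Returning None on a three-way split, rather than picking an arbitrary winner, keeps disagreements visible so they can be checked by hand in the same way the ground-truth subset was.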

Built With

anthropic
api
claude
communications
dailymed
drug
edgar
fda
groq
html/css/svg
httpx
jupyter
lxml
openfda
python
safety
sec
spl
streamlit
zerve

Updates

Aditya Vadalkar started this project — Apr 29, 2026 07:39 AM EDT
