Inspiration
Adverse event reporting is one of the most under-resourced workflows in clinical pharmacy. A pharmacist who suspects a drug reaction must manually cross-reference the patient chart, score a causality questionnaire, apply FDA seriousness criteria, and transcribe everything into a multi-page MedWatch form. The process can take hours and is error-prone, and it is burdensome enough that the vast majority of reportable adverse drug events (ADEs) never get filed. We wanted to see how much of that burden a conversational AI agent could take on without sacrificing clinical rigor.
What it does
The AE Investigator is a conversational pharmacovigilance assistant that takes a clinical pharmacist from patient FHIR data to a submission-ready FDA MedWatch Form 3500 — entirely in chat.
The agent operates in three modes:
- Investigation: pulls patient demographics, medication history, past medical history, and family history directly from the FHIR record
- Reasoning: scores Naranjo causality deterministically from structured data, highlights red flags, compares the event against known adverse events in the FDA label, and surfaces what still needs clinician input
- Drafting: generates a complete, field-accurate MedWatch Form 3500 as a live HTML render, incorporating all investigation findings and any clinician overrides
Naranjo questions that cannot be answered from the data are explicitly marked unknown rather than guessed. When a clinician provides a missing answer — such as confirming the reaction recurred on re-challenge — the agent redrafts the report instantly with the updated score.
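To make the unknown-versus-guessed distinction concrete, here is a minimal sketch of Naranjo scoring using the published weights of the scale. The function and variable names are illustrative assumptions, not the project's actual tool code. An answer of `None` means unknown and contributes 0 points.

```python
from typing import Dict, Optional

# Points per Naranjo question as (yes, no). An unknown answer always scores 0.
NARANJO_WEIGHTS = {
    1: (1, 0),    # previous conclusive reports on this reaction
    2: (2, -1),   # event appeared after the suspected drug was given
    3: (1, 0),    # improved on discontinuation or a specific antagonist
    4: (2, -1),   # reappeared on readministration (re-challenge)
    5: (-1, 2),   # alternative causes could explain the reaction
    6: (-1, 1),   # reappeared when a placebo was given
    7: (1, 0),    # drug detected at concentrations known to be toxic
    8: (1, 0),    # severity tracked dose increases or decreases
    9: (1, 0),    # similar reaction on a previous exposure
    10: (1, 0),   # confirmed by objective evidence
}


def naranjo_score(answers: Dict[int, Optional[bool]]) -> int:
    """Sum the weights; missing or None answers are treated as unknown (0)."""
    total = 0
    for question, (yes_pts, no_pts) in NARANJO_WEIGHTS.items():
        answer = answers.get(question)
        if answer is True:
            total += yes_pts
        elif answer is False:
            total += no_pts
    return total


def naranjo_category(score: int) -> str:
    """Map a total score to the standard Naranjo probability category."""
    if score >= 9:
        return "definite"
    if score >= 5:
        return "probable"
    if score >= 1:
        return "possible"
    return "doubtful"


# Hypothetical case: five questions answered from the chart, the rest unknown.
from_chart = {1: True, 2: True, 3: True, 5: False, 10: True}
before = naranjo_score(from_chart)                    # 7
# The clinician confirms recurrence on re-challenge (question 4).
after = naranjo_score({**from_chart, 4: True})        # 9
print(before, naranjo_category(before), after, naranjo_category(after))
```

In this made-up case the single clinician answer moves the assessment from probable to definite, which is why a redraft after each override matters.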
How we built it
- MCP server (Python + FastMCP, hosted on Railway) — 14 tools organized across investigate, reason, and draft namespaces. Patient data is fetched live from the Prompt Opinion FHIR proxy. Naranjo scoring and FDA seriousness determination are deterministic Python algorithms — no LLM calls.
- BYO agent (Prompt Opinion) — system-prompted AE Investigator with the MCP server attached. Handles the conversational layer and tool orchestration.
- Demo FHIR bundles — three hand-authored synthetic patient cases (ibuprofen-GI bleed, Daytrana-leukoderma, azathioprine-hepatotoxicity) uploaded to the PO FHIR server, each with realistic lab progressions, medication timelines, and family history.
- openFDA + NLM RxNorm — integrated for known AE lookups and drug normalization.
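As an illustration of what a deterministic seriousness check can look like, the sketch below encodes the serious-outcome categories FDA lists for MedWatch reporting as plain booleans. The class, field, and function names are assumptions made for this example, and the real server's tool signatures may differ.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass(frozen=True)
class Outcomes:
    """One boolean per serious-outcome category (field names are hypothetical)."""
    death: bool = False
    life_threatening: bool = False
    hospitalization: bool = False                 # initial or prolonged
    disability_or_permanent_damage: bool = False
    congenital_anomaly: bool = False
    required_intervention: bool = False           # to prevent permanent impairment
    other_serious: bool = False                   # important medical event


def assess_seriousness(outcomes: Outcomes) -> Tuple[bool, List[str]]:
    """Return (is_serious, criteria_met). No model call, only field checks."""
    met = [f.name for f in fields(outcomes) if getattr(outcomes, f.name)]
    return bool(met), met


serious, criteria = assess_seriousness(Outcomes(hospitalization=True))
print(serious, criteria)  # True ['hospitalization']
```

Returning the list of criteria met alongside the boolean keeps the result auditable: the report can state which outcome made the event serious.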
Challenges we ran into
Getting Naranjo scoring right without any hallucination required careful separation of concerns: each question is either answerable from structured FHIR data or explicitly deferred to clinician override — there is no middle ground. Designing that boundary, and enforcing it consistently across tool calls, was the hardest design problem.
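One way to picture that boundary is an extractor that returns `None` whenever the structured data cannot support an answer, paired with a resolver that accepts only explicit clinician input as the fallback. This is a sketch under assumptions: the function names are hypothetical, and letting the clinician override take precedence is our reading of "override" rather than a documented rule.

```python
from datetime import date
from typing import Optional


def answer_temporal_question(
    drug_start: Optional[date], event_onset: Optional[date]
) -> Optional[bool]:
    """Naranjo question 2: did the event appear after the drug was given?

    Returns None (unknown) if either date is missing from the record,
    instead of inferring one.
    """
    if drug_start is None or event_onset is None:
        return None
    return event_onset >= drug_start


def resolve(structured: Optional[bool], override: Optional[bool]) -> Optional[bool]:
    """Clinician override wins; otherwise use structured data; otherwise unknown."""
    if override is not None:
        return override
    return structured


# Onset date missing from the chart: the question stays unknown until a clinician answers.
from_data = answer_temporal_question(date(2024, 3, 1), None)
print(resolve(from_data, None))   # None
print(resolve(from_data, True))   # True
```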
The PO platform's FHIR proxy also has non-obvious constraints — bundles require RFC 4122 UUIDs in fullUrl fields, and token scopes in consult-flow differ from workspace scopes in ways that affect which FHIR resources are accessible.
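For the UUID constraint, a minimal sketch of building a bundle whose entries carry `urn:uuid:` fullUrls generated with Python's standard `uuid` module. The transaction bundle type, the helper names, and the resource contents are assumptions for illustration; the platform's exact upload requirements are not reproduced here.

```python
import json
import uuid


def entry_for(resource: dict) -> dict:
    """Wrap a resource in a bundle entry with an RFC 4122 (version 4) UUID fullUrl."""
    return {
        "fullUrl": f"urn:uuid:{uuid.uuid4()}",
        "resource": resource,
        # Assumed: a transaction bundle that creates each resource.
        "request": {"method": "POST", "url": resource["resourceType"]},
    }


def bundle_of(entries: list) -> dict:
    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}


# Synthetic example data; other entries reference the patient by its fullUrl.
patient = entry_for({"resourceType": "Patient", "gender": "female"})
medication = entry_for({
    "resourceType": "MedicationRequest",
    "status": "active",
    "intent": "order",
    "medicationCodeableConcept": {"text": "ibuprofen"},
    "subject": {"reference": patient["fullUrl"]},
})

print(json.dumps(bundle_of([patient, medication]), indent=2))
```

Generating the patient entry first and reusing its fullUrl keeps references inside the bundle consistent without hand-written identifiers.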
Accomplishments that we're proud of
The Naranjo scoring is fully deterministic: given the same patient data and the same clinician inputs, the score is always the same. That's a meaningful bar for a clinical tool. The agent also surfaces why each question was answered the way it was — not just the score, but the reasoning — which is what a pharmacist reviewing the output actually needs.
The MedWatch Form 3500 output maps to the canonical September 2025 FDA form fields, rendered as live HTML from structured data — not a template fill.
What we learned
LLMs are excellent at orchestration and natural language interaction, but clinical algorithms should not be LLM calls. The combination — LLM for conversation, deterministic Python for scoring — is the right architecture for a tool that a clinician needs to trust.
Session state across multi-turn tool calls is also more nuanced than it looks on paper. Clinician overrides need to persist across the investigation; FHIR context needs to be scoped to a single patient; and the causality assessment needs to see both the structured data and the clinician's answers in the same call.
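The three requirements above can be sketched as a small in-process store. All names here are hypothetical and the shape is an assumption, not the server's actual session code: a session is bound to one patient, overrides accumulate across turns, and the causality step receives one merged view.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class InvestigationSession:
    patient_id: str
    overrides: Dict[int, bool] = field(default_factory=dict)  # question -> answer


class SessionStore:
    """In-memory store keyed by session id (illustrative only)."""

    def __init__(self) -> None:
        self._sessions: Dict[str, InvestigationSession] = {}

    def get(self, session_id: str, patient_id: str) -> InvestigationSession:
        session = self._sessions.get(session_id)
        if session is None:
            session = InvestigationSession(patient_id)
            self._sessions[session_id] = session
        elif session.patient_id != patient_id:
            # Refuse to mix FHIR context from two patients in one investigation.
            raise ValueError("session is already bound to a different patient")
        return session


def assessment_inputs(
    session: InvestigationSession, structured: Dict[int, Optional[bool]]
) -> Dict[int, Optional[bool]]:
    """Merge structured answers with clinician overrides; overrides win."""
    merged = dict(structured)
    merged.update(session.overrides)
    return merged


store = SessionStore()
store.get("chat-1", "patient-a").overrides[4] = True      # turn 1: clinician answers
later = store.get("chat-1", "patient-a")                  # turn 2: same session
print(assessment_inputs(later, {2: True, 4: None}))       # {2: True, 4: True}
```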
What's next for Adverse Event Investigator
- External A2A agent — migrate the AE Investigator out of Prompt Opinion's BYO shell into a standalone agent service, making it invocable from any A2A-compatible platform, not just PO
- Supabase session & multi-tenancy — externalize in-process session state to Supabase for persistence, audit logging, and support for multiple concurrent users across institutions
- Signal Detection module — PRR/ROR disproportionality analysis against FAERS, completing the full Pharmacovigilance Suite: investigation → causality → report drafting → signal detection
- ICH E2B(R3) XML export — structured electronic submission alongside the MedWatch HTML form
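For the planned Signal Detection item above, PRR and ROR both come from a 2x2 table of report counts. The sketch below uses the standard definitions; the function names and the example counts are made up for illustration and are not FAERS data.

```python
from typing import Optional

# 2x2 table of spontaneous-report counts:
#   a = drug of interest, event of interest
#   b = drug of interest, all other events
#   c = all other drugs,  event of interest
#   d = all other drugs,  all other events


def prr(a: int, b: int, c: int, d: int) -> Optional[float]:
    """Proportional reporting ratio: [a / (a + b)] / [c / (c + d)]."""
    if a + b == 0 or c == 0:
        return None  # undefined for this table
    return (a / (a + b)) / (c / (c + d))


def ror(a: int, b: int, c: int, d: int) -> Optional[float]:
    """Reporting odds ratio: (a / b) / (c / d) = (a * d) / (b * c)."""
    if b == 0 or c == 0:
        return None  # undefined for this table
    return (a * d) / (b * c)


# Made-up counts for illustration only.
print(round(prr(20, 80, 100, 9800), 2))  # 19.8
print(round(ror(20, 80, 100, 9800), 2))  # 24.5
```

A real module would also need confidence intervals and a minimum case count before calling anything a signal; those are left out of this sketch.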