The Last Mile in Clinical Trials
There are roughly 37,000 actively recruiting clinical trials on ClinicalTrials.gov right now. Most patients who would qualify for one will never find out it exists. Not because the trials are hidden — they're sitting in plain sight — but because matching a real patient to a real trial means reading pages of dense eligibility criteria, cross-referencing each one against an unstructured medical chart, and accounting for things the chart may not document at all.
A clinical research nurse can do this well. They are also among the most expensive and scarcest professionals in oncology and cardiology, and they are not the ones answering patients' 3 a.m. Google searches.
TrialMatch & Triage is a healthcare AI agent that does that work. Built inside the Prompt Opinion workspace on the MCP, A2A, and SHARP FHIR-context standards, it reads a patient's live FHIR record, classifies how urgently they should be considered for trial enrollment, searches ClinicalTrials.gov, and returns a ranked list of trials with chart-grounded ✓ / ⚠️ checklists explaining exactly why each one fits — citing the specific chart datapoint that proves it.
What it does
Given a patient in context, the agent runs a six-step workflow in a single conversation:
- Pulls the patient's FHIR record (demographics, conditions, observations, medications, allergies) using Prompt Opinion's SHARP context propagation
- Triages enrollment urgency — `URGENT`, `MODERATE`, or `EXPLORATORY` — based on disease activity, prior lines of therapy, and remaining standard-of-care options (a minimal decision-rule sketch follows this list)
- Searches ClinicalTrials.gov with the patient's primary condition and location
- Ranks candidate trials by hard-filtering age, sex, and condition compatibility
- Pulls full inclusion/exclusion criteria for the top three matches
- Returns a ranked list with chart-grounded ✓ inclusions, ⚠️ items to verify with the treating physician, and named exclusions
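To make the triage step concrete, here is a minimal sketch of what such a tier rule could look like. The function name matches the `triage_patient_urgency` tool described below, but the input fields and thresholds are illustrative assumptions, not the shipped logic.

```python
# Hypothetical sketch of the triage decision rule. Field names and
# thresholds are illustrative assumptions, not the production logic.
from dataclasses import dataclass

@dataclass
class TriageInput:
    disease_active: bool           # e.g., progression on current therapy
    prior_lines: int               # prior lines of therapy
    soc_options_remaining: int     # remaining standard-of-care options
    recent_hospitalization: bool   # disease-related admission in the last year

def triage_patient_urgency(p: TriageInput) -> str:
    """Classify enrollment urgency as URGENT, MODERATE, or EXPLORATORY."""
    if p.disease_active and p.soc_options_remaining == 0:
        return "URGENT"        # disease active and no standard options left
    if p.prior_lines >= 2 or p.recent_hospitalization:
        return "MODERATE"      # heavily pretreated or recently hospitalized
    return "EXPLORATORY"       # stable on standard of care
```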
Real example: Robert Anderson, 69M with HFpEF
We tested with a synthetic patient designed to mirror a real cardiology clinic referral:
- 69-year-old male, San Francisco
- HFpEF, LVEF 62%, NYHA Class III, NT-proBNP 1,840 pg/mL
- Recent HF hospitalization 6 months ago
- 6-minute walk distance: 285m despite optimized 4-drug GDMT (empagliflozin, spironolactone, lisinopril, metoprolol)
- BMI 34.2, T2DM with A1C 6.8%
The agent's top match was NCT07037459 — MARITIME-HF, a Phase 3 trial of maridebart cafraglutide (a GLP-1/GIP-class agent) in HFpEF patients with obesity. Here's what the response looked like — every checkmark is anchored to the actual chart:

```
🎯 Top match — Maridebart Cafraglutide in HFpEF + Obesity (MARITIME-HF)
NCT: NCT07037459 (Phase 3, Recruiting)

Why it fits: GLP-1/GIP-class agent for HFpEF patients with obesity.
Robert has confirmed HFpEF (LVEF 62%), BMI 34.2, NYHA III, elevated
NT-proBNP, and a recent HF hospitalization. Hits every primary
inclusion bucket.
Strength: Strong

Inclusion criteria you appear to meet:
✓ Age ≥18                                → 69 years
✓ BMI ≥30                                → BMI 34.2
✓ NYHA II–IV, HF ≥30 days                → NYHA III, dx 2022
✓ On HF standard of care                 → empagliflozin + MRA + ACE-i + beta-blocker
✓ LVEF >40% within 12 mo                 → LVEF 62% (echo 2 mo ago)
✓ Elevated NT-proBNP                     → 1,840 pg/mL
✓ HF hospitalization >30d & <12mo prior  → hospitalization 6 months ago with IV diuretics

Verify with your doctor:
⚠️ T2DM controls: A1C must be ≤10% (yours is 6.8% ✓) and no
   proliferative/severe diabetic retinopathy — confirm a recent eye
   exam is on file.
⚠️ Bay Area site availability — trial is multi-site Phase 3 but the
   public listing's first sites are out of state. Have your physician
   query the sponsor for the nearest activated CA site (UCSF / Stanford).
```
That's seven inclusion criteria, each traced to a specific chart datapoint. Two honest verification flags — one clinical (retinopathy screening), one logistical (which CA site is recruiting). No invented facts. No hallucinated NCT IDs. No overconfident "you qualify" verdict.
Why this matters
Most healthcare AI demos hit a wall at this step. They retrieve trials by keyword and present them. A patient acting on that output wastes time on trials they don't qualify for, raises hopes that get crushed at screening, and loses trust in clinical research altogether.
The bar to clear is no longer "can my agent retrieve relevant information." It's "would a clinician trust the answer enough to act on it."
Every ✓ in TrialMatch & Triage's output cites the chart datapoint that proves it. Every ⚠️ flags something the chart genuinely cannot answer. The agent is allowed to be wrong, but it is not allowed to be confidently wrong without flagging the uncertainty. That's the difference between a search engine and a research nurse.
How we built it
Architecture
```
User (clinician or patient)
        │
        ▼
TrialMatch & Triage Agent (BYO agent in Prompt Opinion)
        │  MCP (Streamable HTTP) + SHARP FHIR context
        ▼
TrialMatch & Triage MCP Server (Python, Starlette)
        │
        ├── ClinicalTrials.gov API v2
        ├── PubMed E-utilities
        └── Workspace FHIR server (live patient data)
```
The agent is invoked directly in the Prompt Opinion launchpad. A2A is enabled on the agent, so any other compliant agent in the ecosystem can compose with it without writing custom glue code.
MCP server with full Prompt Opinion FHIR extension support
We wrote a custom MCP server in Python (Starlette + Uvicorn) that:
- Speaks MCP over Streamable HTTP at `POST /mcp`
- Declares the `ai.promptopinion/fhir-context` extension in the `initialize` response, with SMART scopes (`patient/Patient.rs`, `Condition.rs`, `Observation.rs`, `MedicationRequest.rs`, `AllergyIntolerance.rs`)
- Reads `X-FHIR-Server-URL`, `X-FHIR-Access-Token`, and `X-Patient-ID` headers per request and makes authenticated calls back to the workspace FHIR server
Once the extension was declared correctly, Prompt Opinion immediately surfaced the FHIR trust toggle and scope checkboxes the docs describe — proving end-to-end SHARP integration.
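To illustrate the per-request plumbing, here is a minimal sketch of a header-driven FHIR callback, assuming the three `X-FHIR-*` headers above and a standard FHIR R4 search. The helper name and error handling are ours, not the production server code.

```python
# Hypothetical sketch: read per-request FHIR context headers and query
# the workspace FHIR server. Helper name and error handling are ours.
import httpx
from starlette.requests import Request

async def fetch_patient_conditions(request: Request) -> dict:
    """Pull Condition resources for the patient in context."""
    fhir_base = request.headers["X-FHIR-Server-URL"]
    token = request.headers["X-FHIR-Access-Token"]
    patient_id = request.headers["X-Patient-ID"]

    async with httpx.AsyncClient() as client:
        resp = await client.get(
            f"{fhir_base}/Condition",
            params={"patient": patient_id},
            headers={"Authorization": f"Bearer {token}"},
        )
        resp.raise_for_status()
        return resp.json()  # FHIR R4 Bundle of Condition resources
```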
Six tools, one focused workflow
| Tool | Purpose |
|---|---|
| `read_patient_fhir` | Live patient profile from workspace FHIR |
| `triage_patient_urgency` | Decision rule for enrollment urgency tier |
| `search_clinical_trials` | ClinicalTrials.gov v2 API search |
| `match_trials_to_patient` | Hard-filter + score top N candidates |
| `extract_eligibility_signals` | Parse one trial's inclusion/exclusion bullets |
| `search_pubmed_research` | Optional supporting research enrichment |
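To show the shape of the search step, here is a minimal sketch of a ClinicalTrials.gov API v2 call, using the documented `query.cond`, `query.locn`, `filter.overallStatus`, and `pageSize` parameters; the function name and defaults are illustrative, not our exact tool code.

```python
# Hypothetical sketch of the ClinicalTrials.gov API v2 search call.
# Parameter names follow the public v2 API; the function shape is ours.
import httpx

def search_clinical_trials(condition: str, location: str, limit: int = 10) -> list[dict]:
    """Return recruiting studies matching a condition near a location."""
    resp = httpx.get(
        "https://clinicaltrials.gov/api/v2/studies",
        params={
            "query.cond": condition,             # e.g. "heart failure with preserved ejection fraction"
            "query.locn": location,              # e.g. "San Francisco, California"
            "filter.overallStatus": "RECRUITING",
            "pageSize": limit,
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("studies", [])
```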
A2A-ready BYO agent in Prompt Opinion
The agent is configured with patient context, A2A enabled, FHIR context propagation on, and our six tools attached. The system prompt enforces a strict workflow with critical reasoning rules — most importantly: a high keyword-match score is not eligibility; always verify against the actual eligibility text before recommending. This is what produces the chart-grounded inclusion checklists and the honest verification flags.
Challenges we ran into
Declaring the FHIR extension in MCP. The standard mcp Python SDK (FastMCP) doesn't support custom capability declarations in the initialize response. We dropped to writing the JSON-RPC handler from scratch on Starlette, giving us full control over the capabilities payload. Worth it — once the extension was declared correctly, Prompt Opinion immediately surfaced the FHIR trust toggle and the integration "just worked."
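For flavor, here is a minimal sketch of the kind of hand-rolled `initialize` handler this required. The exact extension schema is defined by Prompt Opinion's docs; the placement under `experimental`, the expanded `patient/` prefix on every scope, and the handler wiring are our assumptions.

```python
# Hypothetical sketch of a hand-rolled JSON-RPC "initialize" response on
# Starlette, declaring the FHIR-context extension in the capabilities
# payload. Extension placement and handler wiring are illustrative.
from starlette.applications import Starlette
from starlette.requests import Request
from starlette.responses import JSONResponse
from starlette.routing import Route

async def mcp_endpoint(request: Request) -> JSONResponse:
    msg = await request.json()
    if msg.get("method") == "initialize":
        return JSONResponse({
            "jsonrpc": "2.0",
            "id": msg["id"],
            "result": {
                "protocolVersion": "2025-03-26",
                "serverInfo": {"name": "trialmatch-triage", "version": "0.1.0"},
                "capabilities": {
                    "tools": {},
                    "experimental": {
                        "ai.promptopinion/fhir-context": {
                            "scopes": [
                                "patient/Patient.rs",
                                "patient/Condition.rs",
                                "patient/Observation.rs",
                                "patient/MedicationRequest.rs",
                                "patient/AllergyIntolerance.rs",
                            ]
                        }
                    },
                },
            },
        })
    # ... dispatch tools/list, tools/call, etc.
    return JSONResponse({"jsonrpc": "2.0", "id": msg.get("id"), "result": {}})

app = Starlette(routes=[Route("/mcp", mcp_endpoint, methods=["POST"])])
```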
Distinguishing keyword-match from real eligibility. Our first matcher returned high scores for trials whose conditions field contained the patient's primary condition — even when the actual inclusion text disqualified them. We added explicit reasoning rules in the system prompt forcing the agent to verify candidates against full eligibility text before recommending them, and to ground each ✓ in a specific chart datapoint. This turned a failure mode into the demo highlight.
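A minimal sketch of the resulting two-stage shape, with hypothetical field names: a cheap hard filter first, then per-criterion grounding against the chart before anything is surfaced.

```python
# Hypothetical sketch of the two-stage matcher. Field names and the
# dict-lookup "reasoning" are illustrative, not the shipped code.

def hard_filter(trial: dict, patient: dict) -> bool:
    """Cheap demographic screen before any eligibility reasoning."""
    age_ok = trial["min_age"] <= patient["age"] <= trial.get("max_age", 200)
    sex_ok = trial["sex"] in ("ALL", patient["sex"])
    cond_ok = patient["primary_condition"] in trial["conditions"]
    return age_ok and sex_ok and cond_ok

def ground_criterion(criterion: str, chart: dict) -> tuple[str, str | None]:
    """Mark a criterion ✓ only if a chart datapoint proves it; else ⚠️."""
    evidence = chart.get(criterion)   # lookup stands in for real reasoning
    if evidence is None:
        return ("verify", None)       # chart is silent: flag, don't guess
    return ("met", str(evidence))     # ✓ cites this specific datapoint
```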
Output token budgets. With multiple tool calls feeding ~30K tokens of context back to the model, responses occasionally hit pagination limits. We tightened tool payloads (shorter summaries, top-N filtering, fewer redundant fields), enforced a strict word budget in the prompt, and added a "next match" pagination flow as a fallback. The compression also made the responses more readable.
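The fix is mechanical but worth showing. A minimal sketch of the kind of payload trimming we mean, with illustrative field names and limits:

```python
# Hypothetical sketch of tool-payload tightening: keep only the fields
# the model uses and cap list lengths. Names and limits are illustrative.
def trim_trial_payload(trials: list[dict], top_n: int = 5) -> list[dict]:
    """Shrink verbose tool output before it re-enters the model context."""
    keep = ("nct_id", "title", "phase", "status", "conditions")
    return [
        {k: t[k] for k in keep if k in t}
        | {"summary": t.get("summary", "")[:400]}   # cap free-text length
        for t in trials[:top_n]                     # top-N filtering
    ]
```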
Sparse FHIR records vs. rich uploaded notes. Some patients in the workspace have rich clinical notes uploaded as documents but minimal FHIR resources. We instructed the agent to fall back to reading uploaded clinical notes when FHIR is sparse — combining structured and unstructured data into a coherent profile.
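Sketched as a simple guard, under the assumption of a resource-count heuristic and a notes list supplied by the workspace; both are illustrative, not our actual code.

```python
# Hypothetical sketch of the sparse-record fallback. The threshold and
# the notes-merging strategy are illustrative assumptions.
def build_patient_profile(fhir_bundle: dict, uploaded_notes: list[str]) -> dict:
    """Merge structured FHIR data with uploaded notes when FHIR is sparse."""
    resources = fhir_bundle.get("entry", [])
    profile = {"structured": resources, "notes": []}
    if len(resources) < 5:              # heuristic: record looks sparse
        profile["notes"] = uploaded_notes  # fall back to unstructured notes
    return profile
```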
What we learned
- MCP extensions are powerful but require dropping below the SDK abstractions. The Prompt Opinion SHARP integration is well-designed, but using it required custom server code rather than just decorating tools.
- Keyword search is not eligibility matching. A trial whose `conditions` field contains "heart failure" tells you almost nothing about whether your patient qualifies. Real matching requires reading the full inclusion/exclusion text against a structured patient profile, criterion by criterion.
- Honest uncertainty beats confident hallucination. The agent's most valuable single behavior is saying "yes on these 7 criteria, with the chart facts that prove each one, plus 2 items only your doctor can verify." That kind of grounding is what makes this a tool a clinician would trust.
- Chart-grounded ✓ marks change the trust equation. A bare recommendation says "trust me." A recommendation where every ✓ cites the exact chart datapoint it relies on says "verify me." Clinicians want the second one.
What's next
- Multi-site routing — route patients to the closest enrolling site (e.g., automatically resolve the "Bay Area site availability" verification flag from above)
- Recurring monitoring — re-run matching on a schedule and alert when new trials open that fit the patient's profile
- Co-pilot mode — flip the workflow so a clinician can paste a draft trial-summary note and the agent fills in the chart-grounded checklist
- Multi-condition matching — handle patients with overlapping diseases (e.g., HFpEF + diabetes + obesity) by querying multiple condition-specific searches in parallel and ranking trials that address overlap
Built with
Python, Starlette, Uvicorn, httpx, MCP (Model Context Protocol over Streamable HTTP), Anthropic Claude Opus 4.7, Prompt Opinion (BYO Agent + A2A + SHARP FHIR context), ClinicalTrials.gov API v2, PubMed E-utilities, ngrok, FHIR R4
