ChartBrief
Inspiration
The USA has approximately 313 active physicians per 100,000 people. On a typical day, a doctor might see 20 patients, some new, some carrying decades of fragmented medical history. They cannot spend 30 minutes reading a single chart when 19 more patients are waiting. And even if they tried, the human brain cannot reliably process years of scattered data under time pressure.
Now picture an ER patient who arrives unconscious. Doctors need to act immediately, but they also need to understand the patient's history. Allergies? Active conditions? Past surgeries? Current medications? That information is scattered across dozens of FHIR resources: Encounter, Condition, Medication, AllergyIntolerance, Immunization, Observation. Stitching it together manually, under pressure, with a life on the line: that is the problem.
But it goes deeper than just the first-encounter snapshot.
Doctors need to see patterns. A patient with a 700-day care gap who now arrives in crisis is not the same risk profile as a patient with stable annual visits. When a person who was fully employed for eight years recently shifted to part-time, and that change temporally co-occurs with a new substance misuse diagnosis, that is a signal no single chart entry would reveal. These patterns predict risk better than any isolated data point, and today they go almost entirely undetected.
ChartBrief was built to fix that. It transforms fragmented FHIR data into a 2-minute, zero-hallucination clinical brief, structured for the moment a physician most needs it.
What It Does
ChartBrief is an AI clinical chart summarization agent backed by the ChartBrief MCP Server. Using the MCP server's build_clinical_narrative() tool, it fetches and links scattered FHIR resources, summarizes the clinical story, and analyzes care engagement patterns and social determinant shifts over time. The output is a structured six-section brief designed for a two-minute physician read:
- Clinical Summary - 1 to 2 line overview and top priorities
- Active and Historical Conditions - clearly separated and deduplicated
- Medication and Therapeutic History - current and high-impact past therapies
- Care Engagement and Continuity Pattern - gaps, episodes, and care velocity
- Social and Life Context (SDOH) - with temporal transitions flagged
- Continuity Risks and Clinical Concerns - future risk and unresolved findings
ChartBrief is built around a single design principle: the agent can only surface what is documented. No inferred diagnoses, no chronological storytelling, no output that is not grounded in the FHIR record. In a clinical setting, a confident hallucination is more dangerous than no answer at all, so the architecture makes hallucination structurally difficult, not just discouraged.
How I Built It
ChartBrief is a Bring Your Own Agent (BYOA) integration built on the Prompt Opinion platform. The agent uses the build_clinical_narrative() tool from the ChartBrief MCP Server as its mandatory first-step call. Every response must begin by calling this tool, with no free-form improvisation and no hallucinated inferences.
Step 1: Parallel FHIR Fetching
The tool fetches all relevant FHIR resources in parallel: Patient, Encounters, Conditions, Observations, Procedures, Medication Requests, Medication Administrations, Medication Statements, Care Plans, Document References, Immunizations, and Allergy Intolerances. No sequential bottleneck.
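A minimal sketch of the concurrent-fetch pattern, assuming an asyncio-based client. The `fetch_resource` function below is a hypothetical stand-in that simulates a FHIR search, and the resource list is a shortened, illustrative subset; neither is taken from the ChartBrief code.

```python
import asyncio

# Illustrative subset of the resource types named above.
RESOURCE_TYPES = [
    "Encounter", "Condition", "Observation", "Procedure",
    "MedicationRequest", "AllergyIntolerance", "Immunization",
]

async def fetch_resource(resource_type: str, patient_id: str) -> list[dict]:
    """Hypothetical stand-in for a FHIR search such as GET /Condition?patient=<id>."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [{"resourceType": resource_type,
             "subject": {"reference": f"Patient/{patient_id}"}}]

async def fetch_all(patient_id: str) -> dict[str, list[dict]]:
    """Start every search at once and collect results keyed by resource type."""
    tasks = [fetch_resource(rt, patient_id) for rt in RESOURCE_TYPES]
    # return_exceptions=True keeps one failed search from discarding the others.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    bundle: dict[str, list[dict]] = {}
    for rt, result in zip(RESOURCE_TYPES, results):
        bundle[rt] = [] if isinstance(result, Exception) else result
    return bundle

bundle = asyncio.run(fetch_all("example-patient"))
```

With this shape, total latency is roughly that of the slowest single search rather than the sum of all of them.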
Step 2: Noise Removal and Context Cleaning
This is the core innovation. Here is what a raw FHIR Condition resource looks like:
{
  "resourceType": "Condition",
  "id": "cond-837a92f1-6e4b-4a2c-9f1d-8e3b7a2c1d4e",
  "clinicalStatus": {
    "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-clinical", "code": "active"}]
  },
  "code": {
    "coding": [{"system": "http://snomed.info/sct", "code": "44054006", "display": "Type 2 diabetes mellitus"}]
  },
  "subject": {"reference": "Patient/81ad0eb0-0413-4e5b-a6dd-5c156ee997c3"},
  "onsetDateTime": "2025-01-20T00:00:00Z",
  "recordedDate": "2025-01-20T14:32:00Z"
}
ChartBrief transforms that into:
Condition: Type 2 diabetes mellitus | clinical_status: active | onset: 2025-01-20
~84% fewer tokens. 100% clinically relevant.
Every FHIR resource is mapped to a Pydantic model class that implements a to_llm_string() method, retaining only what matters: name, status, and relevant date. UUIDs, system URLs, reference pointers, redundant timestamps, and null values are all discarded. Duplicate conditions are normalized and deduplicated. Non-clinical entries are filtered through a curated skip list.
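The flattening step can be sketched as follows. ChartBrief uses Pydantic models; this sketch swaps in a standard-library dataclass to stay dependency-free, and the class name `ConditionSummary` and the `from_fhir` helper are hypothetical. Only `to_llm_string()` is a name from the project.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ConditionSummary:
    """Keeps only the fields worth sending to the LLM (hypothetical class name)."""
    name: str
    clinical_status: Optional[str]
    onset: Optional[str]

    @classmethod
    def from_fhir(cls, resource: dict) -> "ConditionSummary":
        codings = resource.get("code", {}).get("coding", [])
        status = resource.get("clinicalStatus", {}).get("coding", [])
        onset = resource.get("onsetDateTime")
        return cls(
            name=codings[0].get("display", "unknown") if codings else "unknown",
            clinical_status=status[0].get("code") if status else None,
            onset=onset[:10] if onset else None,  # keep the date, drop the time
        )

    def to_llm_string(self) -> str:
        parts = [f"Condition: {self.name}"]
        if self.clinical_status:  # missing values are omitted, not printed as null
            parts.append(f"clinical_status: {self.clinical_status}")
        if self.onset:
            parts.append(f"onset: {self.onset}")
        return " | ".join(parts)

raw = {
    "resourceType": "Condition",
    "id": "cond-837a92f1-6e4b-4a2c-9f1d-8e3b7a2c1d4e",
    "clinicalStatus": {"coding": [{"code": "active"}]},
    "code": {"coding": [{"system": "http://snomed.info/sct",
                         "code": "44054006",
                         "display": "Type 2 diabetes mellitus"}]},
    "onsetDateTime": "2025-01-20T00:00:00Z",
    "recordedDate": "2025-01-20T14:32:00Z",
}
line = ConditionSummary.from_fhir(raw).to_llm_string()
# line == "Condition: Type 2 diabetes mellitus | clinical_status: active | onset: 2025-01-20"
```

The id, system URLs, and recordedDate never reach the output because the model has no field for them.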
Step 3: Care Timeline and SDOH Timeline
The tool constructs a CareTimeline that detects episodes (clusters of encounters under 60 days apart), gaps (disengagement periods over 60 days), cadence baseline, and care velocity over the last 6 months. Alongside this, the SDOHTimeline tracks how social determinants shift over time, making transitions detectable: employment changes, housing instability, and social isolation, along with their temporal associations with clinical events.
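One way the episode and gap detection could work is sketched below. The function name and return shape are hypothetical, and the sketch assumes a spacing of exactly 60 days stays within an episode, since the description leaves that boundary open. Cadence baseline and care velocity are not covered here.

```python
from datetime import date

GAP_THRESHOLD_DAYS = 60  # threshold taken from the description above

def build_care_timeline(encounter_dates: list[date]) -> dict:
    """Group encounters into episodes and record gaps longer than the threshold."""
    dates = sorted(encounter_dates)
    episodes: list[tuple[date, date, int]] = []  # (first visit, last visit, visit count)
    gaps: list[tuple[date, date, int]] = []      # (last visit before, first visit after, days)
    if not dates:
        return {"episodes": episodes, "gaps": gaps}
    current = [dates[0]]
    for prev, nxt in zip(dates, dates[1:]):
        delta = (nxt - prev).days
        if delta > GAP_THRESHOLD_DAYS:
            # Close the running episode and record the disengagement period.
            episodes.append((current[0], current[-1], len(current)))
            gaps.append((prev, nxt, delta))
            current = [nxt]
        else:
            current.append(nxt)
    episodes.append((current[0], current[-1], len(current)))
    return {"episodes": episodes, "gaps": gaps}

timeline = build_care_timeline([
    date(2022, 1, 10), date(2022, 2, 1), date(2022, 2, 20),
    date(2024, 1, 21), date(2024, 2, 5),
])
# Two episodes (3 visits, then 2 visits) separated by one 700-day gap.
```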
Step 4: Agent Constraints
Three hard rules govern every response: no visit-by-visit timeline reconstruction, no causal language (only "co-occurred with," "preceded by," "followed by"), and no output not grounded in documented FHIR data.
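The write-up does not say how these rules are enforced. As an illustration only, the no-causal-language rule could be backed by a simple post-generation check like the one below; the pattern list and function name are hypothetical, and a regex check of this kind would miss many phrasings.

```python
import re

# Hypothetical, deliberately small list of causal phrasings to flag.
CAUSAL_PATTERNS = [
    r"\bcaused by\b",
    r"\bdue to\b",
    r"\bbecause of\b",
    r"\bled to\b",
    r"\bresult(?:ed|s)? (?:in|from)\b",
]
_CAUSAL_RE = re.compile("|".join(CAUSAL_PATTERNS), re.IGNORECASE)

def find_causal_language(text: str) -> list[str]:
    """Return every causal phrase found in the text, lowercased, in order."""
    return [match.group(0).lower() for match in _CAUSAL_RE.finditer(text)]

flagged = find_causal_language("Substance misuse was caused by the job change.")
allowed = find_causal_language("The employment change co-occurred with a new diagnosis.")
# flagged == ["caused by"], allowed == []
```

A non-empty result could trigger a regeneration or a reviewer flag rather than a silent rewrite.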
Challenges I Ran Into
Data fragmentation. FHIR resources do not link cleanly across time. A medication order from 2020 may have no documented connection to a 2024 encounter where the patient stopped taking it. ChartBrief reports a discontinuation only when it is explicitly documented and never infers one from the absence of data.
Separating SDOH findings from clinical conditions. FHIR Condition resources mix diagnoses and SDOH observations (e.g., diabetes vs. employment status) in the same structure. Proper separation requires traversing the SNOMED concept hierarchy, but accessing that hierarchy at runtime requires a licensed terminology server or full SNOMED release files, neither of which is practical here. ChartBrief uses a curated skip list as a temporary workaround.
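A minimal sketch of the skip-list workaround combined with deduplication, under stated assumptions: the skip-list entries, the stripped semantic tags, and the function names are all hypothetical, and the real list is presumably much longer.

```python
# Hypothetical skip-list entries; matched against normalized display names.
SDOH_SKIP_TERMS = {
    "full-time employment",
    "part-time employment",
    "social isolation",
}

def normalize(display: str) -> str:
    """Lowercase, drop semantic tags such as '(finding)', collapse whitespace."""
    text = display.lower().replace("(finding)", "").replace("(disorder)", "")
    return " ".join(text.split())

def split_conditions(displays: list[str]) -> tuple[list[str], list[str]]:
    """Deduplicate display names, then route each to the clinical or SDOH list."""
    clinical: list[str] = []
    sdoh: list[str] = []
    seen: set[str] = set()
    for display in displays:
        key = normalize(display)
        if key in seen:
            continue  # duplicate after normalization
        seen.add(key)
        (sdoh if key in SDOH_SKIP_TERMS else clinical).append(key)
    return clinical, sdoh

clinical, sdoh = split_conditions([
    "Type 2 diabetes mellitus",
    "type 2 diabetes mellitus ",
    "Full-time employment (finding)",
])
# clinical == ["type 2 diabetes mellitus"], sdoh == ["full-time employment"]
```

The limitation is the one already noted: any SDOH finding missing from the list is treated as a clinical condition, which a hierarchy lookup would avoid.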
The two-minute constraint. Summarizing ten-plus years of history into two minutes of reading is brutally hard. Every iteration required removing more, cutting filler, prioritizing actionability over completeness, and trusting the physician to ask follow-up questions.
SDOH temporal logic. Building the SDOHTimeline required inferring start and end dates from encounter notes where dates were often missing or inconsistent. The solution: use the earliest documented date as the start and only mark an end date if explicitly documented.
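The date rule can be sketched as below. The `SDOHInterval` class, the `date` and `end_date` keys, and the choice to take the latest explicit end date when several exist are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SDOHInterval:
    label: str
    start: Optional[str]  # ISO date (YYYY-MM-DD), or None if nothing is dated
    end: Optional[str]    # set only when an end date is explicitly documented

def build_interval(label: str, observations: list[dict]) -> SDOHInterval:
    """Start = earliest documented date; end = explicit end date only, never inferred."""
    dated = [o["date"] for o in observations if o.get("date")]
    ends = [o["end_date"] for o in observations if o.get("end_date")]
    return SDOHInterval(
        label=label,
        start=min(dated) if dated else None,  # ISO dates sort correctly as strings
        end=max(ends) if ends else None,      # assumption: latest explicit end wins
    )

interval = build_interval("Full-time employment", [
    {"date": "2018-03-01"},
    {"date": None},           # undated note: ignored, not guessed
    {"date": "2016-05-12"},
])
# interval.start == "2016-05-12", interval.end is None
```

Leaving `end` as None keeps the brief from implying that a status ended just because it stopped being mentioned.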
Noise removal at scale. A single FHIR patient record can contain hundreds of fields across dozens of resources, most of them useless for clinical reasoning. Building to_llm_string() for every model class and validating the output against synthetic records was painstaking work. The payoff was worth it: roughly 90% context reduction with measurably higher LLM accuracy.
Accomplishments That I'm Proud Of
Understanding FHIR at depth. FHIR's resource model is not designed for rapid comprehension. Learning to navigate its nested coding structures, reference patterns, and resource relationships, and then flattening all of it into clean, LLM-readable strings, was a significant technical undertaking.
90% context reduction. Implementing to_llm_string() across every model class cut token usage by roughly 90% while improving LLM output quality. Less noise means fewer hallucinations and faster inference.
A genuinely useful 2-minute brief. The final output is not a data dump. It is a structured, actionable summary a physician can read in two minutes and act on immediately. Getting the density and structure right required many iterations.
Zero-hallucination architecture. Hallucination in a clinical setting is not an annoyance; it is a patient safety risk. The architecture makes hallucination structurally difficult: the agent cannot produce output without first calling the MCP tool, and the tool only surfaces documented facts.
What I Learned
The biggest surprise: clinical data is not designed for humans. FHIR resources are excellent for interoperability and billing. They are terrible for rapid physician cognition. A patient's story is broken into dozens of separate records, each requiring manual assembly, and physicians simply do not have time for that.
I learned that physicians do not need more data. They need less, better data. The right signals, filtered and structured for the moment of decision.
I also learned that raw FHIR JSON is unreadable for both humans and LLMs. Every resource is packed with internal UUIDs, system URLs, metadata timestamps, and reference pointers that carry zero clinical value. The solution is not a smarter prompt. It is noise removal before synthesis begins.
The deepest lesson was about time. Temporal patterns matter more than isolated facts. A single elevated blood pressure reading is noise. A pattern of missed appointments followed by an ER visit is a signal. Social determinants shift over time, and those shifts often precede clinical deterioration long before any diagnosis is documented.
What's Next for ChartBrief
External agent support. The Prompt Opinion orchestrator currently does not support external agents. Once that capability is available, ChartBrief would move from a BYOA integration to a fully external agent for more control and flexibility.
Observability and tracing. Adding distributed tracing to both the MCP server and the agent would give full visibility into every tool call, context size, and response, which is essential for debugging and auditing in a clinical context.
Automated hallucination detection. Periodically, a specialized evaluation agent would pull production traces (the build_clinical_narrative() output alongside the agent's final response) and flag any content in the response that is not supported by the tool output. Flagged traces would surface to a human reviewer who validates the findings.
Human-in-the-loop prompt improvement. Validated findings from the evaluation agent, combined with human review of flagged traces, would feed directly into prompt updates. This closes the loop: production behavior drives continuous improvement, with a human always in the decision seat.
SNOMED hierarchy integration. The current skip list approach for separating SDOH findings from clinical conditions is a workaround. The proper solution is runtime access to the SNOMED concept hierarchy via a licensed terminology server, making SDOH classification accurate, scalable, and maintainable.
Built With
- fastmcp
- python