Clinical notes enricher

Inspiration

Ambient AI has genuinely transformed clinical documentation so doctors spend less time typing and more time with patients
But over the past few years we observed a gap: the note was accurate, but the coding was incomplete
A patient with diabetes and leg swelling would have a perfect transcript and a missing kidney disease code worth thousands in risk-adjusted revenue
We realized ambient AI solves the transcription problem but leaves the semantic problem untouched
No tool was asking: what do these symptoms mean together, and is that meaning captured in the billing record?
That question is what inspired SymptoMap

SymptoMap is a post-transcript enrichment layer that plugs into any ambient AI clinical documentation tool
It receives a structured clinical note after the ambient AI generates it
It identifies the medical concepts in the note and maps them to standardized clinical identifiers
It then traverses a clinical knowledge graph to find relationships between those concepts that no single symptom lookup would reveal
It applies a clinical terminology standardization layer, the kind used by professional medical coders and auditors, to ensure the right language is used for each concept (think: the difference between "high blood sugar" and "Type 2 Diabetes Mellitus with hyperglycemia, uncontrolled")
It returns a structured result containing:
- ✅ Validated billing codes already in the note
- 🔴 Flagged missing codes with the knowledge graph reasoning path that found them
- 📋 A complete audit trail showing why each suggestion was made
The clinician sees nothing change: enrichment happens invisibly before the note enters the health record
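
The flow above can be sketched in a few lines. This is a minimal illustration, not SymptoMap's actual code: a plain dict stands in for the knowledge graph so the sketch has no dependencies, and the names `CLUSTER_EDGES`, `CONCEPT_TO_CODE`, `MissingCode`, `EnrichmentResult`, and `enrich` are hypothetical. The ICD-10-CM codes are illustrative only and are not coding guidance.

```python
from dataclasses import dataclass, field

# Hypothetical toy graph: a set of concepts that, found together, point to a
# related concept. A plain dict stands in for the real knowledge graph.
CLUSTER_EDGES = {
    frozenset({"type 2 diabetes", "leg swelling"}): "chronic kidney disease",
}

# Illustrative concept -> ICD-10-CM code mapping (assumption, not coding guidance).
CONCEPT_TO_CODE = {
    "type 2 diabetes": "E11.9",
    "leg swelling": "R60.0",
    "chronic kidney disease": "N18.9",
}


@dataclass
class MissingCode:
    code: str
    reasoning_path: list[str]  # concepts walked to reach the suggestion


@dataclass
class EnrichmentResult:
    validated: list[str] = field(default_factory=list)
    missing: list[MissingCode] = field(default_factory=list)
    audit_trail: list[str] = field(default_factory=list)


def enrich(concepts: set[str], billed_codes: set[str]) -> EnrichmentResult:
    """Validate billed codes and flag codes implied by concept clusters."""
    result = EnrichmentResult()

    # Step 1: confirm codes that match a concept found in the note.
    for concept in sorted(concepts):
        code = CONCEPT_TO_CODE.get(concept)
        if code in billed_codes:
            result.validated.append(code)
            result.audit_trail.append(f"{code}: validated, '{concept}' is in the note")

    # Step 2: look for clusters whose members are all present in the note.
    for cluster, implied in CLUSTER_EDGES.items():
        if cluster <= concepts:
            code = CONCEPT_TO_CODE[implied]
            if code not in billed_codes:
                path = sorted(cluster) + [implied]
                result.missing.append(MissingCode(code, path))
                result.audit_trail.append(
                    f"{code}: flagged for review via {' -> '.join(path)}"
                )
    return result
```

Calling `enrich({"type 2 diabetes", "leg swelling"}, {"E11.9", "R60.0"})` validates the two billed codes and flags `N18.9` for review, with the path from both symptoms to the related concept recorded in the audit trail.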

API layer: FastAPI — lightweight, fast, self-documenting; the Swagger UI at /docs is the demo interface
Knowledge graph: NetworkX in-memory graph loaded at startup — approximately 150 nodes and 400 edges across three clinical domains (diabetes, cardiovascular, metabolic), curated from open datasets including PrimeKG (Harvard), SNOMED CT subsets, and ICD-10-CM
Entity extraction: A curated medical dictionary mapped to UMLS concept identifiers — no heavy machine learning model downloads, keeping demo latency under 300ms
Terminology layer: A standardized clinical terminology database (the kind maintained by professional medical coding organizations) mocked for the demo and designed to activate from a real API with a single environment variable — zero code changes required
Testing: 32 red-green TDD test cases written before any implementation — pytest across models, NER, graph traversal, enrichment pipeline, and API endpoints
Demo scenarios: Three pre-built fixture notes covering a clean note (no gaps), a gap note (missing HCC codes), and a partial note (mixed result)
Architecture decision: Deliberately avoided Neo4j, scispaCy, and any heavy infrastructure — the demo runs on a laptop with no external dependencies
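
As a rough illustration of the dictionary-based entity extraction described above, here is a minimal sketch using only the standard library. The dictionary entries, the `extract_entities` function, and the concept IDs are all hypothetical; the IDs are placeholders, not real UMLS concept identifiers.

```python
import re

# Hypothetical dictionary: surface phrase -> (placeholder concept ID, preferred name).
# Synonyms share one concept ID. These IDs are placeholders, not real UMLS CUIs.
MEDICAL_DICTIONARY = {
    "type 2 diabetes": ("CUI_PLACEHOLDER_T2DM", "Type 2 Diabetes Mellitus"),
    "high blood sugar": ("CUI_PLACEHOLDER_HYPERGLYCEMIA", "Hyperglycemia"),
    "leg swelling": ("CUI_PLACEHOLDER_EDEMA", "Lower extremity edema"),
    "swollen legs": ("CUI_PLACEHOLDER_EDEMA", "Lower extremity edema"),
}

# Longest phrases first, so a longer phrase wins over a shorter overlapping one.
_PHRASES = sorted(MEDICAL_DICTIONARY, key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(p) for p in _PHRASES) + r")\b",
    re.IGNORECASE,
)


def extract_entities(note_text: str) -> list[dict]:
    """Return one entity per distinct concept, in order of first mention."""
    entities = []
    seen = set()
    for match in _PATTERN.finditer(note_text):
        concept_id, name = MEDICAL_DICTIONARY[match.group(1).lower()]
        if concept_id in seen:
            continue  # a synonym of a concept we already captured
        seen.add(concept_id)
        entities.append(
            {
                "text": match.group(1),
                "concept_id": concept_id,
                "name": name,
                "start": match.start(),
            }
        )
    return entities
```

A lookup like this involves no model loading, which is consistent with the low-latency goal, though it only finds phrases that are already in the dictionary.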

The "1.47% problem": Early in research we conflated transcription hallucination rates with the semantic gap we were solving; these are completely different problems, measured differently, and getting the framing right took real work
Knowledge graph scope vs. latency: Full PrimeKG has 4 million edges, and loading it into memory would have made the demo unusable; the discipline of curating 150 nodes that still told a compelling clinical story was harder than expected
CDI vs. CDSS positioning: Clinical Documentation Improvement and Clinical Decision Support fall into very different legal and regulatory categories; we had to be precise that SymptoMap enriches documentation of what was already discussed and does not recommend treatments or diagnoses
Terminology standardization without a live API: Building a mock that was realistic enough to demonstrate the concept while being architected for real API activation required careful interface design
Demo data: The synthetic clinical notes had to be realistic enough to be credible while containing no real patient information; finding the right sources and constructing three notes that each told a different story took iteration
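
The mock-versus-live terminology design mentioned above could look roughly like the sketch below. Everything here is an assumption for illustration: the variable name `TERMINOLOGY_API_URL`, the `TerminologyClient` interface, and both client classes are hypothetical, and the live client is deliberately left unimplemented.

```python
import os
from typing import Protocol


class TerminologyClient(Protocol):
    """Interface shared by the mock and the live backend (hypothetical)."""

    def standardize(self, phrase: str) -> str: ...


class MockTerminologyClient:
    # Tiny in-memory table standing in for a real terminology database.
    _TERMS = {
        "high blood sugar": "Type 2 Diabetes Mellitus with hyperglycemia, uncontrolled",
    }

    def standardize(self, phrase: str) -> str:
        # Unknown phrases pass through unchanged.
        return self._TERMS.get(phrase.lower(), phrase)


class LiveTerminologyClient:
    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def standardize(self, phrase: str) -> str:
        # The real lookup depends on the vendor API, which this sketch does not model.
        raise NotImplementedError("live terminology lookup is not part of this sketch")


def get_terminology_client() -> TerminologyClient:
    """Pick the backend from the environment; callers never change."""
    base_url = os.environ.get("TERMINOLOGY_API_URL")  # hypothetical variable name
    if base_url:
        return LiveTerminologyClient(base_url)
    return MockTerminologyClient()
```

Because callers depend only on `standardize`, switching from the mock to a live service becomes a configuration change rather than a code change.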

The entity-cluster insight: The core idea, that two symptoms together imply a third concept and that this relationship is traversable in a knowledge graph, turned out to be both technically sound and immediately understandable to non-technical judges
32 tests written before a single line of production code: Full red-green TDD discipline on a hackathon timeline is genuinely difficult; we held the line
Research grounding: Every number in the presentation traces to a peer-reviewed source or primary data survey, with no interpolated figures and no vendor claims
Sub-300ms enrichment on a laptop: The performance constraint felt arbitrary at first; it turned out to be the right forcing function to make architectural choices that will scale
The patient safety framing: Realizing that an undocumented kidney disease diagnosis is not just a revenue gap but also a medication contraindication safety issue, and articulating that clearly, felt like the moment the project became genuinely meaningful

Transcription accuracy and semantic completeness are orthogonal problems — solving one does not move the needle on the other; this distinction reshapes how you think about the entire ambient AI stack
Knowledge graphs are most powerful at the cluster level — individual entity lookup is just search; the value emerges when you traverse relationships between entities together
CDI is a CFO problem, not a CMO problem — the sales cycle, the ROI language, and the procurement path are completely different from clinical decision support; positioning matters more than we expected
Curated data beats comprehensive data for demos — 150 hand-verified nodes produced more reliable and explainable results than any automated extraction from a full ontology would have
The audit trail is also the product: we initially treated the KG traversal path as metadata, but the reasoning transparency is what makes the enrichment trustworthy

Connect to a real ambient AI webhook: The architecture is already built around the standard notification pattern used by major ambient AI platforms — the next step is early access partnership to run against real note output
Replace the terminology mock with a live API: The environment variable pattern means this is one configuration change; the interface is already designed and tested
Expand the knowledge graph to five clinical domains: Current coverage is diabetes, cardiovascular, and metabolic — respiratory and mental health are the next highest-impact additions for HCC risk adjustment
Build the population-level layer: A health system's own patient population data as a graph overlay — "your diabetic patients over 65 historically miss CKD documentation 34% of the time" — this is the enterprise upsell that no open ontology can provide
Pursue Neo4j migration: NetworkX in-memory works for the demo; a 5,000-node graph with full PrimeKG coverage needs a proper graph database; the query patterns are already written in a way that makes migration straightforward
