
## Inspiration

Last summer I interned on a biomedical ML team at D-Prime, working on blood-pressure waveform models trained on thousands of PPG segments. I got used to seeing patient signals as clean tensors — preprocessed, labeled, evaluated. Then I'd hear from family and friends what their actual experience of the healthcare system looked like: a stack of after-visit summaries, lab printouts, and discharge notes that nobody had ever read together. The patient was the integration layer. And the patient was usually the person least equipped to do that job — sick, tired, and reading words like "lymphadenopathy" without a translation.

The case I kept coming back to: a person with chronic back pain who'd seen five specialists in six months. Thirty documents in their portal. Not one of those doctors had read what the others wrote, and the patient couldn't read most of what was in their own file. That's the gap Chronical fills.

## What it does

Chronical takes a patient's scattered medical records — lab reports, visit notes, imaging summaries, discharge papers — and turns them into one chronological timeline they can actually read. Every event on the timeline is:

- Plain-language translated, so a non-medical reader can understand what happened. "HbA1c 7.2%" becomes "Blood sugar over the last 3 months — slightly above the target range." The original medical term stays available in a tooltip.
- Severity color-coded at four levels — info / monitor / concerning / urgent — so patients can see at a glance which events deserve a follow-up.
- Click-back-to-source. Every event links to the exact line in the original document, with the verbatim snippet highlighted. The patient never has to take a translation on faith — they can verify every claim against their actual record.

## How we built it

Built solo over the HackDavis weekend.

- Frontend: Next.js + React, Tailwind for the timeline UI.
- Backend: FastAPI, with a thin orchestration layer over the Anthropic API.
- Extraction: Claude with a strict JSON schema. Every extracted event carries a verbatim `source.snippet` field — the exact string Claude pulled the event from — plus document ID and page number:

```json
{
  "id": "uuid",
  "date": "2025-09-14",
  "event_type": "lab",
  "title": "HbA1c: 7.2%",
  "summary": "Blood sugar over the last 3 months, slightly above target.",
  "severity": "monitor",
  "source": {
    "document_id": "doc_3",
    "page": 2,
    "snippet": "HbA1c 7.2% (ref 4.0-5.6)"
  }
}
```

That schema is what makes click-back-to-source trivial: it becomes a Ctrl-F over the page text once we have the snippet.

- Translation: A second Claude pass rewrites each clinical event in plain language, keeping the medical term in a tooltip for users who want it.
- Severity classification: Done at extraction time as part of the schema, not as a separate call. Claude assigns one of four severity levels with a brief justification we log for evaluation but don't show the user.
- Eval harness: Three hand-written patient cases — newly-diagnosed Type 2 Diabetes, suspicious mammogram → biopsy → benign, and chronic back pain across 5 specialists — labeled by event type. Precision and recall by entity type render in a small dev panel so we can see when a prompt change regresses accuracy.
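A minimal sketch of the per-entity-type precision/recall computation a dev panel like that could render. Matching events on a `(date, event_type, title)` key is an assumption for illustration; the real harness may match more loosely.

```python
from collections import defaultdict

def precision_recall_by_type(predicted: list[dict], gold: list[dict]) -> dict:
    """Compute (precision, recall) per event_type against hand-labeled gold events.

    Events match when their (date, event_type, title) keys are identical --
    an assumed matching rule, stricter than a production harness would use.
    """
    key = lambda e: (e["date"], e["event_type"], e["title"])
    pred_by_type, gold_by_type = defaultdict(set), defaultdict(set)
    for e in predicted:
        pred_by_type[e["event_type"]].add(key(e))
    for e in gold:
        gold_by_type[e["event_type"]].add(key(e))
    scores = {}
    for etype in set(pred_by_type) | set(gold_by_type):
        p, g = pred_by_type[etype], gold_by_type[etype]
        tp = len(p & g)
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        scores[etype] = (precision, recall)
    return scores
```

Running this after every prompt change is what makes a regression (say, recall on lab events dropping) visible immediately instead of at demo time.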

## Challenges we ran into

The biggest temptation early on was to fine-tune a model — BioBERT with a LoRA on entity extraction. It would have been ML-flashy, and with my D-Prime background it was within reach. But twelve hours of solo build and I'd have ended up with a model whose evals I couldn't trust, and no time to build the patient-facing UI that is the product. The hardest call of the weekend was deciding the engineering leverage was in system design and source-grounding, not in training a model.

Source highlighting was harder than expected. Getting Claude to return the exact verbatim string — not a paraphrase, not a near-match, not a slightly-cleaned-up version — took a few prompt iterations and an explicit "return the substring as it appears character-for-character, including any typos" instruction. Without that, Ctrl-F over the page text fails silently and the entire trust feature breaks.

Severity calibration was the third tricky one. The default LLM behavior is to over-flag — everything looks "concerning" if you ask it to assess medical events. We had to be specific in the prompt about what each severity tier means and give few-shot examples per tier so the timeline didn't become a wall of red.

## Accomplishments that we're proud of

- Every claim on the timeline links back to its source. No hallucinated medical events. The patient can verify everything.
- Built solo, end-to-end, in twelve hours: extraction pipeline, plain-language translation, severity coloring, source-linking, eval harness, and the patient-facing UI.
- The eval harness actually caught regressions during the build — twice we changed the extraction prompt and saw recall on lab events drop, and rolled back. That feedback loop is rare in a hackathon project.
- The hero demo case (chronic back pain across 5 specialists) reads cleanly: 30 input documents collapse to a 14-event timeline, and the path of escalating treatment is visible at a glance for the first time.
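The verbatim-snippet requirement described under Challenges can be enforced mechanically, which is what makes the no-hallucinated-events claim checkable. A small guard, sketched here under my own assumptions (the function name and the whitespace-only fallback are illustrative, not the project's code):

```python
import re

def check_snippet(page_text: str, snippet: str) -> str:
    """Classify a model-returned snippet against the source page.

    Returns 'exact' for a character-for-character match, 'whitespace-only'
    when only spacing differs (line wraps, double spaces), or 'missing' --
    the silent-failure case where the model paraphrased instead of quoting,
    which should be surfaced to the user rather than ignored.
    """
    if snippet in page_text:
        return "exact"
    # Fallback: tolerate whitespace differences only, nothing semantic.
    norm = lambda s: re.sub(r"\s+", " ", s).strip()
    if norm(snippet) in norm(page_text):
        return "whitespace-only"
    return "missing"
```

Anything classified `missing` never gets a severity color or a place on the timeline without a visible warning — that is the contract behind "the patient can verify everything."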

## What we learned

Source-grounding beats model-tuning at the timescale of a hackathon. Twelve hours is enough to design a JSON schema that makes a feature trivial; it is not enough to fine-tune a model whose evals you can trust.

Patient-facing health tools live or die on trust, and trust comes from verifiability. Severity color-coding is useless if the patient suspects the colors are made up. The click-back feature isn't a polish item — it's the trust layer that makes the rest of the product believable.

Plain-language translation is not the same problem as summarization. A patient does not want shorter text. They want the same information in words they understand, with the medical term still available if they want it.

## What's next for Chronical

- Direct portal integration. Right now Chronical takes uploaded PDFs. Next is FHIR / Epic MyChart pull so the timeline builds itself from the patient's existing records.
- Caregiver mode. A patient often delegates medical reading to a family member — a parent, a partner, an adult child. Chronical needs a shared-access mode that respects HIPAA and consent.
- Specialist briefing export. A one-page "what you need to know about this patient" view, generated from the timeline, that a patient can hand a new doctor to start a visit already up to speed.
- Better eval coverage. Three hand-labeled cases are enough for a hackathon but not enough for production. The next version needs a real evaluation set across condition types, document types, and document quality (clean PDFs vs. faxed scans vs. handwritten notes).
