Inspiration
Over 300 million people worldwide live with a rare disease, yet the average patient waits 4 to 7 years before receiving a correct diagnosis. This "diagnostic odyssey" causes irreversible harm — organ damage progresses, treatments are delayed, and families are left without answers for years. The core problem isn't lack of medical knowledge: it's that rare disease knowledge is scattered. No single clinician can hold 10,000 rare diseases, their overlapping phenotypes, causative genes, and evolving literature in their head simultaneously.
We asked: what if a clinician could get a rare-disease-aware second opinion in the time it takes to write a referral note?
What it does
OrphaMind is a full-stack clinical decision-support system. A clinician pastes (or photographs) a patient's clinical note and receives:
- Ranked differential diagnosis of rare diseases with confidence scores (0–100%)
- Reasoning per diagnosis — which symptoms support or argue against each disease
- Lab intelligence — automatic extraction of lab values from the note, critical flagging, and fold-change above normal
- Recommended workup — specific genetic, biochemical, and imaging tests to confirm or rule out each candidate
- Urgency assessment — ROUTINE / URGENT / EMERGENCY with clinical justification
- Literature citations — grounded in 87,848 indexed chunks from GeneReviews, OMIM, and clinical references
- OCR support — upload a photo of a handwritten or printed clinical note and OrphaMind reads it
- Patient history — every case is saved and searchable by patient ID
- Disease explorer — full-text + semantic search across all 11,456 Orphanet rare diseases
How we built it
The dual-Gemini pipeline is the architectural core. We use two Gemini models with different roles:
- gemini-2.0-flash — extracts structured features (symptoms, genes, demographics, lab values) from the raw clinical note in ~1.5s. Speed matters here; we don't need deep reasoning yet.
- gemini-2.5-flash — takes the extracted features, the top candidate diseases from our inverted index, and retrieved literature passages, then generates the full multi-step differential diagnosis. This is where reasoning depth matters.
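The two-stage orchestration can be sketched as a plain function that wires the stages together. The model calls and lookup functions are injected as callables here so the flow is visible without an API key; the actual OrphaMind function and parameter names are not from the writeup and are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ExtractedFeatures:
    """Structured output of the fast extraction stage (gemini-2.0-flash)."""
    symptoms: list[str] = field(default_factory=list)
    genes: list[str] = field(default_factory=list)
    lab_values: dict[str, float] = field(default_factory=dict)

def run_pipeline(
    note: str,
    extract: Callable[[str], ExtractedFeatures],       # backed by gemini-2.0-flash
    candidates_for: Callable[[list[str]], list[str]],  # inverted-index lookup
    retrieve: Callable[[list[str]], list[str]],        # ChromaDB passage retrieval
    reason: Callable[..., dict],                       # backed by gemini-2.5-flash
) -> dict:
    """Fast extraction first, then deep reasoning over grounded context."""
    features = extract(note)
    candidates = candidates_for(features.symptoms)
    passages = retrieve(candidates)
    return reason(features=features, candidates=candidates, passages=passages)
```

Keeping the stages as injected callables also makes the pipeline testable with stub models, which is how a flow like this can be exercised offline.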
The knowledge layer combines two sources:
- Orphanet (11,456 rare diseases with structured symptoms, genes, inheritance, prevalence) — we built a custom in-memory inverted index so candidate lookup is O(k) not O(n)
- ChromaDB vector store (87,848 document chunks, ONNX embeddings) — semantic retrieval of relevant GeneReviews, OMIM, and textbook passages to ground Gemini's reasoning
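The inverted-index idea above (candidate lookup proportional to matches, not to all 11,456 diseases) can be sketched as a symptom-to-disease map. Function names and the toy data are illustrative, not the actual OrphaMind code.

```python
from collections import defaultdict

def build_symptom_index(diseases: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each symptom string to the set of diseases that list it."""
    index: dict[str, set[str]] = defaultdict(set)
    for disease, symptoms in diseases.items():
        for s in symptoms:
            index[s.lower()].add(disease)
    return index

def candidate_diseases(index: dict[str, set[str]],
                       patient_symptoms: list[str]) -> list[tuple[str, int]]:
    """Count index hits per disease; cost scales with matched entries (O(k)),
    never with the full disease catalogue (O(n))."""
    hits: dict[str, int] = defaultdict(int)
    for s in patient_symptoms:
        for disease in index.get(s.lower(), ()):
            hits[disease] += 1
    return sorted(hits.items(), key=lambda kv: -kv[1])
```

Building the index once at startup turns every per-request lookup into a handful of dictionary probes.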
A 4-layer hallucination guard validates every AI-suggested disease:
Orphanet DB lookup → symptom index match → literature semantic search → confidence penalty (−20%) if evidence is absent
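The guard chain above reduces to a small validation function: hard-reject anything outside Orphanet, then dock confidence when neither the symptom index nor the literature search provides evidence. This is a minimal sketch; the signature and layer inputs are assumptions, and only the −20% penalty and layer order come from the writeup.

```python
def validate_diagnosis(
    disease: str,
    confidence: float,
    orphanet_names: set[str],   # layer 1: verified disease database
    symptom_index_hit: bool,    # layer 2: symptom index match
    literature_hits: int,       # layer 3: semantic literature search
    penalty: float = 20.0,      # layer 4: confidence penalty
) -> tuple[float, bool]:
    """Return (adjusted confidence, keep?) for an AI-suggested disease."""
    if disease not in orphanet_names:
        return 0.0, False  # not a real Orphanet entry: reject outright
    if not symptom_index_hit and literature_hits == 0:
        confidence = max(0.0, confidence - penalty)  # unsupported: penalize
    return confidence, True
```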
The FastAPI backend exposes clean REST endpoints. The React/Vite frontend uses CSS display:none tab persistence so an in-progress diagnosis is never erased when switching tabs.
For OCR, Tesseract 5.5 handles typed notes in a single pass (with a handwritten fallback for short outputs), keeping upload-to-text latency under 3 seconds.
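The single-pass-with-fallback flow might look like the sketch below, using pytesseract as the Tesseract binding. The length threshold and the fallback `--psm 6` config are illustrative guesses, not OrphaMind's actual settings.

```python
def needs_handwriting_pass(text: str, min_chars: int = 40) -> bool:
    """Heuristic: a very short first-pass result suggests the default
    typed-text OCR config failed, so retry with a friendlier config."""
    return len(text.strip()) < min_chars

def ocr_note(image_path: str) -> str:
    # Deferred imports: requires pytesseract, Pillow, and a local Tesseract install
    import pytesseract
    from PIL import Image

    img = Image.open(image_path)
    text = pytesseract.image_to_string(img)  # single pass for typed notes
    if needs_handwriting_pass(text):
        # fallback for sparse or handwritten notes (config value is illustrative)
        text = pytesseract.image_to_string(img, config="--psm 6")
    return text
```

Gating the second pass on output length keeps the common typed-note case to one Tesseract invocation, which is what keeps upload-to-text latency low.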
Challenges we ran into
LLM hallucination was the hardest problem. Gemini is trained on vast medical text and can confidently describe plausible-but-fictitious rare diseases. Our 4-layer guard + grounding pipeline specifically addresses this — every diagnosis must be traceable to an Orphanet entry and literature evidence.
Latency vs. accuracy tradeoff — the powerful model was too slow for feature extraction (unnecessary depth) and the fast model was insufficient for final diagnostic reasoning. The dual-model strategy — fast extraction, powerful reasoning — solved this; total pipeline time dropped from ~18s to under 10s.
Vector store consistency — ChromaDB on Windows with ONNX embeddings had subtle corruption issues on partial index builds. We added an explicit rebuild-from-scratch path and a verification step (check_kb.py) to ensure index integrity before startup.
Garbage input detection — naive clinical note validators rejected legitimate abbreviations. The final InputValidator distinguishes between medical shorthand (acceptable) and random keysmash (reject) using character entropy, word recognizability, and minimum meaningful token count.
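The shorthand-vs-keysmash distinction above can be sketched with the three signals named in the writeup: character entropy, word recognizability, and a minimum meaningful token count. The thresholds, the tiny shorthand set, and the vowel-ratio recognizability test are all illustrative stand-ins for whatever the real InputValidator uses.

```python
import math
from collections import Counter

MEDICAL_SHORTHAND = {"hx", "sob", "bid", "prn", "wbc", "hgb"}  # illustrative subset
VOWELS = set("aeiouy")

def char_entropy(text: str) -> float:
    """Shannon entropy (bits) of the character distribution."""
    counts, n = Counter(text), len(text)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def recognizable(token: str) -> bool:
    """Known shorthand is accepted; other words need a plausible vowel mix."""
    if token in MEDICAL_SHORTHAND:
        return True
    if not token.isalpha() or len(token) < 2:
        return False
    ratio = sum(ch in VOWELS for ch in token) / len(token)
    return 0.15 <= ratio <= 0.7

def looks_like_clinical_note(text: str, min_tokens: int = 3) -> bool:
    tokens = [t.strip(".,;:").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if len(tokens) < min_tokens:
        return False  # minimum meaningful token count
    good = sum(recognizable(t) for t in tokens)
    # Keysmash fails both checks: few recognizable words and a
    # high-entropy character soup (both thresholds are illustrative)
    return good / len(tokens) >= 0.5 and char_entropy("".join(tokens)) < 4.3
```

The key design point is that each signal alone over-rejects (abbreviations look like noise to a dictionary check), but their conjunction lets "Pt c/o SOB" through while still blocking random keyboard input.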
Accomplishments that we're proud of
- End-to-end rare disease reasoning pipeline that actually returns clinically sensible differentials — verified against known textbook cases (DMD, Wilson disease, Gaucher, Fabry) with correct top-1 or top-2 rankings
- 87,848-document RAG corpus built, indexed, and serving — semantic retrieval grounds every diagnosis in real literature
- Sub-10-second wall time from note submission to full differential, on a consumer laptop with no GPU
- 4-layer hallucination guard — the first layer alone (Orphanet DB check) eliminates ~30% of initially proposed diseases in our tests
- Zero data leakage — the .env API key is never committed, all patient data stays local in SQLite, no external logging
- Polished production UI — animated 4-step loading, confidence bars, expandable literature panels, lab value color coding — a tool clinicians could actually use in a consult
What we learned
- Two specialized models beat one general model — orchestrating Gemini 2.0 Flash and 2.5 Flash in sequence outperforms using either alone for the full task
- Structured knowledge + vector retrieval + LLM reasoning is more reliable than LLM alone — each layer catches failure modes the other layers miss
- Hallucination is a grounding problem, not a prompt problem — better prompts help slightly; grounding in a verified disease database helps dramatically
- Clinical UX requires persistence — tab-switching, slow responses, and error states need careful handling; a diagnosis tool must never silently discard user data
- OCR is the noisiest input — the biggest source of downstream reasoning errors is poor OCR quality; single-pass with quality fallback was essential
What's next for OrphaMind
- FHIR integration — ingest structured EHR data directly instead of free-text notes
- Phenotype similarity scoring — per-patient HPO (Human Phenotype Ontology) term matching against the full Orphanet graph
- Genetic report parsing — upload raw VCF or panel reports; OrphaMind cross-references variants against known disease-causing genes
- Multilingual notes — Gemini's multilingual capability means rare disease support isn't limited to English-speaking health systems
- Clinician feedback loop — confirmed/rejected diagnoses feed back into confidence calibration over time
- Mobile OCR — native camera capture on mobile for point-of-care use in under-resourced settings