Zebra

Symptom analysis and matching against commonly misdiagnosed diseases
Rare Disease Match
Describe the symptoms you are facing

About the Project

The Problem

Rare diseases affect 1 in 10 people globally, yet patients wait an average of 7-10 years for a correct diagnosis. During this "diagnostic odyssey," they are often misdiagnosed with common conditions such as fibromyalgia, anxiety or depression, chronic fatigue syndrome, and IBS. Meanwhile, conditions like Endometriosis (which affects 1 in 10 women), PCOS, Ehlers-Danlos Syndrome, and Lupus go undetected, causing unnecessary suffering and delayed treatment.

What Inspired Me

I was inspired by countless patient stories of being dismissed by doctors with phrases like "it's just stress" or "it's all in your head" when they had serious underlying conditions. The medical system is optimized to find common diseases, but rare diseases require pattern recognition that AI can help with. Real example: Endometriosis takes 7+ years to diagnose on average, despite affecting 200 million women worldwide. Its symptoms are dismissed as "bad periods" when they actually point to a debilitating disease.

How I Built It

Tech Stack:

Frontend: React.js with modern CSS
AI/ML: Hugging Face Inference API (Mistral-7B-Instruct)
Algorithm: Custom multi-stage symptom matching system

Architecture:

Symptom Extraction Layer
- AI-powered extraction using Mistral-7B with medical terminology prompts
- Enhanced fallback with 40+ regex patterns for colloquial symptom descriptions
- Captures severity, duration, and context
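
A minimal sketch of what the regex fallback could look like. The three patterns, the severity and duration keyword lists, and the function name `extractSymptoms` are all illustrative assumptions, not the project's actual 40+ patterns. Context capture is left out to keep the example short.

```javascript
// Hypothetical fallback patterns: each maps colloquial phrasing to a medical term.
// These three are illustrative only; the project reports 40+ patterns.
const SYMPTOM_PATTERNS = [
  { term: "dyspareunia", pattern: /\b(sex hurts|pain(ful)? (during|with) sex)\b/i },
  { term: "fatigue", pattern: /\b(exhausted|always tired|no energy|fatigued?)\b/i },
  { term: "joint hypermobility", pattern: /\b(double[- ]jointed|hypermobile)\b/i },
];

// Assumed keyword lists for severity and duration.
const SEVERE_PATTERN = /\b(severe|unbearable|debilitating|excruciating)\b/i;
const CHRONIC_PATTERN = /\b(for (months|years)|chronic|constant(ly)?|every day)\b/i;

function extractSymptoms(text) {
  // Split into rough clauses so severity/duration words attach to the nearby symptom.
  const clauses = text.split(/[.;,!?\n]+/).map((c) => c.trim()).filter(Boolean);
  const found = new Map();
  for (const clause of clauses) {
    for (const { term, pattern } of SYMPTOM_PATTERNS) {
      if (!pattern.test(clause)) continue;
      const previous = found.get(term) ?? { term, severe: false, chronic: false };
      found.set(term, {
        term,
        severe: previous.severe || SEVERE_PATTERN.test(clause),
        chronic: previous.chronic || CHRONIC_PATTERN.test(clause),
      });
    }
  }
  return [...found.values()];
}
```

Splitting on punctuation before matching is one simple way to keep "unbearable" from being attached to every symptom in a long description; a real implementation would likely need something less crude.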

Smart Matching Algorithm

$$\text{Score} = \sum (\text{MatchConfidence} \times \text{SymptomWeight} \times \text{SeverityMultiplier} \times \text{DurationMultiplier})$$

Where:

$\text{MatchConfidence} \in [0, 1]$ (fuzzy string matching + medical term mapping)
$\text{SymptomWeight} \in [1, 10]$ (diagnostic significance)
$\text{SeverityMultiplier} = 1.8$ if the symptom is severe
$\text{DurationMultiplier} = 1.4$ if the symptom is chronic
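
The score formula can be sketched in a few lines. Two points are assumptions here: the 1.4 "chronic" factor is read as the DurationMultiplier, and both multipliers default to 1 when they do not apply. The function name, field names, and the weights in the usage note are hypothetical.

```javascript
const SEVERITY_MULTIPLIER = 1.8; // applied when a symptom is reported as severe
const DURATION_MULTIPLIER = 1.4; // assumption: the "chronic" factor is the duration multiplier

// Each match: { matchConfidence: 0..1, symptomWeight: 1..10, severe: boolean, chronic: boolean }
function scoreDisease(matches) {
  return matches.reduce((total, m) => {
    const severity = m.severe ? SEVERITY_MULTIPLIER : 1; // assumption: neutral value is 1
    const duration = m.chronic ? DURATION_MULTIPLIER : 1; // assumption: neutral value is 1
    return total + m.matchConfidence * m.symptomWeight * severity * duration;
  }, 0);
}
```

With illustrative weights, an exact match on a generic symptom (confidence 1.0, weight 2, neither severe nor chronic) contributes 2, while a near match on a specific symptom (confidence 0.9, weight 9, severe and chronic) contributes 0.9 × 9 × 1.8 × 1.4, or roughly 20.4. One specific symptom therefore outweighs many generic ones.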

Confidence Calculation

$$\text{Confidence} = \min(45, \text{Coverage} \times 0.45) + \min\left(30, \frac{\text{MatchedSymptoms}}{4} \times 30\right) + \min(25, \text{HighValueMatches} \times 12)$$
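
A direct transcription of the confidence formula, as a sketch. It assumes Coverage is expressed as a percentage from 0 to 100 (which is what makes the 0.45 factor reach the cap of 45); the function and parameter names are hypothetical.

```javascript
// coveragePercent: assumed to be a percentage in 0..100
// matchedSymptoms: number of matched symptoms
// highValueMatches: number of matched high-weight symptoms
function confidence({ coveragePercent, matchedSymptoms, highValueMatches }) {
  const coveragePart = Math.min(45, coveragePercent * 0.45); // up to 45 points
  const countPart = Math.min(30, (matchedSymptoms / 4) * 30); // saturates at 4 matches
  const highValuePart = Math.min(25, highValueMatches * 12); // saturates at 3 matches
  return coveragePart + countPart + highValuePart; // at most 100
}

// Example: 60% coverage, 3 matched symptoms, 1 high-value match
// gives 27 + 22.5 + 12 = 61.5.
const example = confidence({ coveragePercent: 60, matchedSymptoms: 3, highValueMatches: 1 });
```

At equal coverage, two high-value matches earn 15 + 24 = 39 points from the last two terms, while ten weak matches earn 30 + 0 = 30, so the caps keep sheer symptom count from dominating.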

Explainability Layer
- AI-generated reasoning for each match
- Comparison with common diagnoses
- Specific next steps and testing recommendations

Database:

Curated 30+ diseases (15 rare, 15 common) with 200+ symptoms
Each symptom weighted by diagnostic significance (1-10)
Includes diagnostic criteria, misdiagnoses, and medical sources
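
One possible shape for a database entry, shown as a sketch. Every field name, weight, and placeholder string below is a hypothetical illustration rather than the project's actual data; the weight threshold of 8 for "high-value" symptoms is taken from the Challenges section.

```javascript
// Hypothetical entry: names, weights, and text are illustrative only.
const exampleDisease = {
  name: "Ehlers-Danlos Syndrome",
  category: "rare", // "rare" or "common"
  symptoms: [
    { name: "joint hypermobility", weight: 9 },
    { name: "skin hyperextensibility", weight: 8 },
    { name: "chronic joint pain", weight: 5 },
    { name: "fatigue", weight: 2 },
  ],
  commonMisdiagnoses: ["fibromyalgia", "anxiety"],
  diagnosticCriteria: "placeholder: summary of the published criteria",
  sources: [], // citations would be listed here
};

// Returns a list of problems so bad entries are caught before they affect scoring.
function validateDisease(disease) {
  const errors = [];
  if (!["rare", "common"].includes(disease.category)) {
    errors.push(`unknown category: ${disease.category}`);
  }
  for (const s of disease.symptoms) {
    if (typeof s.weight !== "number" || s.weight < 1 || s.weight > 10) {
      errors.push(`weight out of range for: ${s.name}`);
    }
  }
  return errors;
}

// Symptoms with weight >= 8 are treated as high-value matches.
function highValueSymptoms(disease) {
  return disease.symptoms.filter((s) => s.weight >= 8).map((s) => s.name);
}
```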

What I Learned

Technical:

How to build effective medical AI that doesn't require massive datasets
Importance of fallback systems (AI fails ~10% of the time)
Fuzzy matching algorithms for medical terminology
Balancing precision vs recall in symptom matching
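
To make the fuzzy matching and precision/recall points concrete, here is a sketch using normalized Levenshtein distance. The project does not say which similarity measure it uses, so the choice of Levenshtein, the function names, and the 0.8 threshold are all assumptions.

```javascript
// Classic dynamic-programming edit distance, keeping only two rows in memory.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity in [0, 1]; 1 means identical after trimming and lowercasing.
function similarity(a, b) {
  const x = a.trim().toLowerCase();
  const y = b.trim().toLowerCase();
  const longest = Math.max(x.length, y.length);
  return longest === 0 ? 1 : 1 - levenshtein(x, y) / longest;
}

// Returns the best term at or above the threshold, or null if none qualifies.
function bestMatch(input, knownTerms, threshold = 0.8) {
  let best = null;
  for (const term of knownTerms) {
    const score = similarity(input, term);
    if (score >= threshold && (best === null || score > best.score)) {
      best = { term, score };
    }
  }
  return best;
}
```

The threshold is where precision and recall trade off: lowering it catches more misspellings such as "dysparunia" but also starts matching unrelated terms, while raising it does the opposite.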

Medical:

Rare disease diagnostic criteria and patterns
Why misdiagnosis happens (symptom overlap, cognitive biases)
The importance of patient advocacy in diagnosis

UX/Ethics:

How to present AI health information responsibly
Importance of disclaimers without undermining utility
Designing for empowerment vs. causing health anxiety

Challenges I Faced

Symptom Extraction Accuracy
- Problem: Patients describe symptoms colloquially ("sex hurts") but databases use medical terms ("dyspareunia")
- Solution: Built a two-tier system with medical term mapping and synonym dictionaries

False Positives
- Problem: Fatigue matches almost every disease
- Solution: Implemented weighted scoring where rare/specific symptoms (weight ≥8) count more than common ones

AI Reliability
- Problem: Hugging Face API throttling and inconsistent outputs
- Solution: Robust fallback extraction with 40+ regex patterns that works without AI

Ethical Concerns
- Problem: Risk of patients self-diagnosing incorrectly
- Solution: Prominent disclaimers; common diagnoses shown for comparison; wording that emphasizes "discuss with your doctor" rather than "you have this"; framing as an advocacy tool, not a diagnostic tool

Scoring Calibration
- Problem: How to balance "2 perfect symptoms" vs "10 weak symptoms"?
- Solution: Multi-factor confidence score weighing coverage, specificity, and high-value matches
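
The AI Reliability fix described above (use regex extraction whenever the model call is throttled or returns unusable output) might be wired up roughly like this. `extractWithAI` and `extractWithRegex` are hypothetical functions supplied by the caller, and the 8-second timeout is an arbitrary assumption; no real Hugging Face call is shown.

```javascript
// extractWithAI: async (text) => array of symptoms (hypothetical)
// extractWithRegex: (text) => array of symptoms (hypothetical)
async function extractWithFallback(text, { extractWithAI, extractWithRegex, timeoutMs = 8000 }) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("AI extraction timed out")), timeoutMs);
  });
  try {
    const symptoms = await Promise.race([extractWithAI(text), timeout]);
    // Treat malformed or empty model output as a failure too.
    if (!Array.isArray(symptoms) || symptoms.length === 0) {
      throw new Error("AI returned no usable symptoms");
    }
    return { source: "ai", symptoms };
  } catch (error) {
    // Any failure (throttling, timeout, bad output) falls through to the regex path.
    return { source: "fallback", symptoms: extractWithRegex(text), reason: error.message };
  } finally {
    clearTimeout(timer); // avoid leaving a pending timer behind
  }
}
```

Returning the `source` alongside the symptoms lets the UI or logs show how often the fallback path is actually taken.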

Impact & Future Work

Current Impact:

Helps patients identify rare diseases to discuss with their doctors
Aims to shorten the time to a correct diagnosis
Empowers patient advocacy with specific testing recommendations

Future Improvements:

Add 50+ more rare diseases
Implement demographic filtering (age, gender, ethnicity)
Add symptom timeline visualization
Multi-language support
Integration with medical literature APIs for real-time citation

Built With

css3
html5
huggingface/inference
huggingfaceinferenceapi
javascript
mistral-7b-instruct-v0.2
node.js
npm
react.js

Updates

Amruta Velamuri started this project — Oct 14, 2025 09:33 PM EDT
