About the Project
The Problem
Rare diseases affect 1 in 10 people globally, yet patients wait an average of 7-10 years for a correct diagnosis. During this "diagnostic odyssey," they are often misdiagnosed with common conditions such as fibromyalgia, anxiety/depression, chronic fatigue syndrome, and IBS. Meanwhile, conditions like endometriosis (which affects 1 in 10 women), PCOS, Ehlers-Danlos Syndrome, and lupus go undetected, causing unnecessary suffering and delayed treatment.
What Inspired Me
I was inspired by countless stories of patients who were dismissed by doctors with phrases like "it's just stress" or "it's all in your head" when they had serious underlying conditions. The medical system is optimized to find common diseases, but rare diseases require a kind of pattern recognition that AI can help with. A real example: endometriosis takes 7+ years to diagnose on average, despite affecting 200 million women worldwide. Its symptoms are dismissed as "bad periods" when they are actually signs of a debilitating disease.
How I Built It
Tech Stack:
- Frontend: React.js with modern CSS
- AI/ML: Hugging Face Inference API (Mistral-7B-Instruct-v0.2)
- Algorithm: Custom multi-stage symptom matching system
Architecture:
Symptom Extraction Layer
- AI-powered extraction using Mistral-7B with medical terminology prompts
- Enhanced fallback with 40+ regex patterns for colloquial symptom descriptions
- Captures severity, duration, and context
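A minimal sketch of what the regex fallback could look like. The pattern table, the term names, and the `extractSymptoms` helper are hypothetical stand-ins written for illustration, not the project's actual 40+ patterns; the clause splitting is one simple way to keep severity and duration attached to the symptom they describe.

```javascript
// Hypothetical pattern table: each entry maps colloquial phrasing to a medical term.
const SYMPTOM_PATTERNS = [
  { term: "dyspareunia", regex: /\b(sex|intercourse)\s+(hurts|is\s+painful)\b/i },
  { term: "fatigue", regex: /\b(exhausted|tired all the time|no energy|fatigue)\b/i },
  { term: "joint hypermobility", regex: /\bdouble[- ]jointed\b/i },
  { term: "pelvic pain", regex: /\b(pelvic|lower\s+(belly|abdominal))\s+pain\b/i },
];

// Illustrative cues for severity and duration (assumed, not from the project).
const SEVERE_REGEX = /\b(severe|unbearable|excruciating|debilitating)\b/i;
const CHRONIC_REGEX = /\b(for|over|more than)\s+\d+\s+(months?|years?)\b|\bchronic\b/i;

function extractSymptoms(text) {
  // Split into rough clauses so modifiers only apply to nearby symptoms.
  const clauses = text
    .split(/[.;,!?]|\band\b/i)
    .map((clause) => clause.trim())
    .filter(Boolean);

  const found = new Map();
  for (const clause of clauses) {
    for (const { term, regex } of SYMPTOM_PATTERNS) {
      if (regex.test(clause) && !found.has(term)) {
        found.set(term, {
          term,
          severe: SEVERE_REGEX.test(clause),
          chronic: CHRONIC_REGEX.test(clause),
          context: clause, // keep the original wording for explainability
        });
      }
    }
  }
  return [...found.values()];
}
```

Calling `extractSymptoms("Sex hurts. I have had severe pelvic pain for 3 years, and I'm exhausted.")` yields three entries (dyspareunia, pelvic pain, fatigue), with only the pelvic pain entry flagged as both severe and chronic.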
Smart Matching Algorithm
$$\text{Score} = \sum (\text{MatchConfidence} \times \text{SymptomWeight} \times \text{SeverityMultiplier} \times \text{DurationMultiplier})$$
Where:
- $\text{MatchConfidence} \in [0, 1]$ (fuzzy string matching + medical term mapping)
- $\text{SymptomWeight} \in [1, 10]$ (diagnostic significance)
- $\text{SeverityMultiplier} = 1.8$ if the symptom is severe
- $\text{DurationMultiplier} = 1.4$ if the symptom is chronic
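The score formula can be sketched as a single reduction over the matched symptoms. This is an illustration under stated assumptions: the field names are hypothetical, the 1.4 chronic factor is read as the DurationMultiplier, and both multipliers are assumed to default to 1 when the condition does not apply (the write-up does not state the default).

```javascript
const SEVERE_MULTIPLIER = 1.8; // value given in the write-up
const CHRONIC_MULTIPLIER = 1.4; // value given in the write-up

// matches: [{ matchConfidence: 0..1, symptomWeight: 1..10, severe: bool, chronic: bool }]
function scoreDisease(matches) {
  return matches.reduce((total, match) => {
    const severity = match.severe ? SEVERE_MULTIPLIER : 1; // assumed neutral default
    const duration = match.chronic ? CHRONIC_MULTIPLIER : 1; // assumed neutral default
    return total + match.matchConfidence * match.symptomWeight * severity * duration;
  }, 0);
}

const exampleScore = scoreDisease([
  { matchConfidence: 1.0, symptomWeight: 9, severe: true, chronic: true },
  { matchConfidence: 0.5, symptomWeight: 2, severe: false, chronic: false },
]);
```

Here the specific, severe, chronic symptom contributes 9 × 1.8 × 1.4 ≈ 22.68, while the weak generic match contributes only 1, so `exampleScore` is about 23.68.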
Confidence Calculation
$$\text{Confidence} = \min(45, \text{Coverage} \times 0.45) + \min\left(30, \frac{\text{MatchedSymptoms}}{4} \times 30\right) + \min(25, \text{HighValueMatches} \times 12)$$
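A direct transcription of the confidence formula, as a sketch. It assumes Coverage is expressed as a percentage from 0 to 100 (which is what the cap of 45 on Coverage × 0.45 suggests) and that MatchedSymptoms and HighValueMatches are plain counts; the function and parameter names are hypothetical.

```javascript
// coverage: percent of the disease's symptoms matched (assumed 0..100)
// matchedSymptoms, highValueMatches: counts (assumed non-negative integers)
function confidence({ coverage, matchedSymptoms, highValueMatches }) {
  const coveragePart = Math.min(45, coverage * 0.45);
  const countPart = Math.min(30, (matchedSymptoms / 4) * 30);
  const highValuePart = Math.min(25, highValueMatches * 12);
  return coveragePart + countPart + highValuePart;
}

const partial = confidence({ coverage: 60, matchedSymptoms: 3, highValueMatches: 1 });
const saturated = confidence({ coverage: 100, matchedSymptoms: 8, highValueMatches: 3 });
```

With these inputs `partial` is 27 + 22.5 + 12 = 61.5, and `saturated` hits every cap for a total of 100. The three caps sum to 100, so the result can be read as a percentage.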
Explainability Layer
- AI-generated reasoning for each match
- Comparison with common diagnoses
- Specific next steps and testing recommendations
Database:
- A curated set of 30+ diseases (15 rare, 15 common) with 200+ symptoms
- Each symptom weighted by diagnostic significance (1-10)
- Includes diagnostic criteria, misdiagnoses, and medical sources
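One possible shape for a database entry, with a small validator for the 1-10 weight range. Every field name and every value below is a hypothetical placeholder chosen for illustration; none of it is taken from the project's actual data.

```javascript
// Placeholder entry: names, weights, and criteria are illustrative only.
const exampleEntry = {
  name: "Example Condition",
  category: "rare", // "rare" or "common"
  symptoms: [
    { term: "dyspareunia", weight: 8 }, // specific symptom, high weight
    { term: "fatigue", weight: 2 }, // generic symptom, low weight
  ],
  diagnosticCriteria: ["placeholder criterion"],
  commonMisdiagnoses: ["IBS"],
  sources: [], // citations would go here
};

// Returns a list of human-readable problems; an empty list means the entry is valid.
function validateEntry(entry) {
  const errors = [];
  if (!entry.name) errors.push("missing name");
  if (!["rare", "common"].includes(entry.category)) errors.push("invalid category");
  for (const symptom of entry.symptoms ?? []) {
    const inRange = Number.isFinite(symptom.weight) && symptom.weight >= 1 && symptom.weight <= 10;
    if (!inRange) errors.push(`weight out of range for ${symptom.term}`);
  }
  return errors;
}
```

Validating entries at load time is a cheap way to catch a mistyped weight before it skews the scoring.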
What I Learned
Technical:
- How to build effective medical AI that doesn't require massive datasets
- Importance of fallback systems (the AI extraction step fails roughly 10% of the time)
- Fuzzy matching algorithms for medical terminology
- Balancing precision vs recall in symptom matching
Medical:
- Rare disease diagnostic criteria and patterns
- Why misdiagnosis happens (symptom overlap, cognitive biases)
- The importance of patient advocacy in diagnosis
UX/Ethics:
- How to present AI health information responsibly
- Importance of disclaimers without undermining utility
- Designing for empowerment vs. causing health anxiety
Challenges I Faced
Symptom Extraction Accuracy
- Problem: Patients describe symptoms colloquially ("sex hurts") but databases use medical terms ("dyspareunia")
- Solution: Built a two-tier system with medical term mapping and synonym dictionaries
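A sketch of how a two-tier lookup might produce a match confidence in [0, 1]. The synonym table is a tiny hypothetical sample, and normalized Levenshtein distance is used here as a stand-in for whatever fuzzy matching the project actually uses.

```javascript
// Tier 1: hypothetical synonym dictionary (colloquial phrase -> medical term).
const SYNONYMS = new Map([
  ["sex hurts", "dyspareunia"],
  ["painful sex", "dyspareunia"],
  ["always tired", "fatigue"],
]);

// Classic edit distance using a single rolling row.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      row[j] = Math.min(
        above + 1, // deletion
        row[j - 1] + 1, // insertion
        diagonal + (a[i - 1] === b[j - 1] ? 0 : 1) // substitution
      );
      diagonal = above;
    }
  }
  return row[b.length];
}

function matchConfidence(phrase, medicalTerm) {
  const normalized = phrase.toLowerCase().trim();
  const target = medicalTerm.toLowerCase();
  if (SYNONYMS.get(normalized) === target) return 1; // tier 1: exact synonym hit
  // Tier 2: fuzzy similarity, 1 for identical strings and 0 for nothing shared.
  const longest = Math.max(normalized.length, target.length);
  return longest === 0 ? 0 : 1 - levenshtein(normalized, target) / longest;
}
```

With this sketch, "Sex hurts" maps to dyspareunia with confidence 1 through the dictionary, while the misspelling "dyspareuna" still scores above 0.9 through the fuzzy tier.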
False Positives
- Problem: Fatigue matches almost every disease
- Solution: Implemented weighted scoring where rare/specific symptoms (weight ≥8) count more than common ones
AI Reliability
- Problem: Hugging Face API throttling and inconsistent outputs
- Solution: Robust fallback extraction with 40+ regex patterns that works without AI
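The fallback behaviour could be wired up roughly as follows. `aiExtract` and `regexExtract` are hypothetical injected functions (the first standing in for the Hugging Face call, the second for the regex extractor), and the 5-second timeout is an arbitrary assumption.

```javascript
// Tries the AI extractor first; falls back to regex on error, timeout, or empty output.
async function extractWithFallback(text, aiExtract, regexExtract, timeoutMs = 5000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("AI extraction timed out")), timeoutMs);
  });
  try {
    const symptoms = await Promise.race([aiExtract(text), timeout]);
    if (!Array.isArray(symptoms) || symptoms.length === 0) {
      throw new Error("unusable AI output");
    }
    return { source: "ai", symptoms };
  } catch (error) {
    // Throttling (e.g. HTTP 429), malformed output, and timeouts all end up here.
    return { source: "fallback", symptoms: regexExtract(text) };
  } finally {
    clearTimeout(timer); // avoid a dangling timer once either path has finished
  }
}
```

Returning the `source` alongside the symptoms makes it easy to log how often the fallback path is taken.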
Ethical Concerns
- Problem: Risk of patients self-diagnosing incorrectly
- Solution: The tool displays prominent disclaimers, shows common diagnoses for comparison, says "discuss with your doctor" rather than "you have this", and frames itself as an advocacy tool rather than a diagnostic tool
Scoring Calibration
- Problem: How to balance "2 perfect symptoms" vs "10 weak symptoms"?
- Solution: Multi-factor confidence score weighing coverage, specificity, and high-value matches
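To make the trade-off concrete, here is a worked comparison using the confidence formula from the architecture section. The input values are invented for illustration, and Coverage is assumed to be a percentage.

Profile A, two highly specific matches (Coverage = 25, MatchedSymptoms = 2, HighValueMatches = 2):

$$\text{Confidence}_A = \min(45, 25 \times 0.45) + \min\left(30, \frac{2}{4} \times 30\right) + \min(25, 2 \times 12) = 11.25 + 15 + 24 = 50.25$$

Profile B, ten weak matches (Coverage = 60, MatchedSymptoms = 10, HighValueMatches = 0):

$$\text{Confidence}_B = \min(45, 60 \times 0.45) + \min\left(30, \frac{10}{4} \times 30\right) + \min(25, 0 \times 12) = 27 + 30 + 0 = 57$$

Under these assumptions the count term saturates at four matches, so ten weak matches earn no more from it than four would, while each high-value match adds 12 points. The two profiles end up within 7 points of each other despite a fivefold difference in match count.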
Impact & Future Work
Current Impact:
- Helps patients identify rare diseases to discuss with doctors
- Aims to shorten the time to a correct diagnosis
- Empowers patient advocacy with specific testing recommendations
Future Improvements:
- Add 50+ more rare diseases
- Implement demographic filtering (age, gender, ethnicity)
- Add symptom timeline visualization
- Multi-language support
- Integration with medical literature APIs for real-time citation
Built With
- css3
- html5
- huggingface/inference
- huggingfaceinferenceapi
- javascript
- mistral-7b-instruct-v0.2
- node.js
- npm
- react.js