Inspiration
At 2 AM on Friday night, scrolling through research papers, I came across a devastating statistic: 95% of rare diseases have no FDA-approved treatment. Over 400 million people worldwide suffer from rare diseases, and most will never see a cure in their lifetime. But here's the twist: the cure might already exist. Drug repurposing has accidentally given us some of medicine's greatest breakthroughs. Viagra was originally for heart disease. Thalidomide, once banned, now treats cancer. These discoveries were pure luck, taking decades to stumble upon. I realized: what if we could systematically search every FDA-approved drug against every disease? The data exists across multiple biomedical databases. With 48 hours, could I build an AI that finds these hidden connections in seconds instead of decades? That's when Navara AI was born.
What it does
Navara AI is like a matchmaking service for drugs and diseases. Here's the magic:
You enter a disease name (any of 25,000 diseases) In under 5 seconds, it analyzes 15,000+ FDA-approved drugs Returns ranked candidates with biological evidence Automatically filters dangerous contraindications Validates results with clinical trial data and scientific literature
But it's not just a database search. Navara builds a knowledge graph connecting diseases, genes, pathways, and drugs, then uses multi-factor scoring to find the most promising matches. It's computational biology meets real-time AI. The Technical Challenge Building this in a hackathon meant solving three massive problems: Problem 1: Data Integration I needed to integrate six different biomedical APIs in real-time: OpenTargets (disease-gene associations) ChEMBL (FDA-approved drugs) DGIdb (drug-gene interactions) ClinicalTrials.gov (clinical trials) PubMed (scientific literature) OpenFDA (adverse events)
Each has different formats, rate limits, and quirks. No existing library handles them all. Problem 2: Intelligent Scoring How do you score a drug-disease match? I built a multi-factor algorithm that combines: Stotal=α⋅Sgene+β⋅Spathway+γ⋅Smoa+δ⋅SclinicalS_{\text{total}} = \alpha \cdot S_{\text{gene}} + \beta \cdot S_{\text{pathway}} + \gamma \cdot S_{\text{moa}} + \delta \cdot S_{\text{clinical}}Stotal=α⋅Sgene+β⋅Spathway+γ⋅Smoa+δ⋅Sclinical where genes and pathways use Jaccard similarity: J(A,B)=∣A∩B∣∣A∪B∣J(A, B) = \frac{|A \cap B|}{|A \cup B|}J(A,B)=∣A∪B∣∣A∩B∣
Problem 3: Safety First High scores don't mean safe. Some drugs actively worsen diseases they score highly for. I built a contraindication engine that automatically filters dangerous drugs with medical reasoning.
How we built it
The Stack Backend: Python + FastAPI
Async architecture for concurrent API calls NetworkX for graph-based knowledge representation Custom caching system (queries went from 30s to 2s) Pydantic models for data validation
Frontend: React + TailwindCSS
Terminal-inspired brutalist design Real-time state management with React hooks Progressive disclosure UI (complex data made simple) Graph paper backgrounds and monospace fonts
Challenges we ran into
Challenge 1: The Great API Debugging Session (Hour 25-28) Problem: All scores were 0.0. Every drug. Every disease. Investigation:
Checked API responses: Working Checked graph construction: Working Checked scoring logic: Working Checked gene matching: BROKEN
Root Cause: OpenTargets uses gene symbols like "ENSG00000012048". DGIdb uses gene names like "BRCA1". They never matched. Solution: Built a gene name normalization layer that maps between different identifier systems. 3 hours of debugging, 30 lines of code to fix. Lesson: Always check data formats first, not algorithm logic. Challenge 2: Performance Nightmare (Hour 16-20) Problem: First query took 45 seconds. Unusable. Analysis:
Fetching 15,000 drugs: 15 seconds Querying drug interactions for each: 30 seconds Building graph: 0.5 seconds
Solution: Multi-level caching strategy:
Cache all FDA drugs (refresh daily) Cache drug-gene interactions (refresh weekly) Cache disease data (refresh per session)
Result: 95% reduction in response time. First query: 8s. Subsequent: 0.5s. Lesson: In hackathons, performance is a feature. Challenge 3: The Dopamine Paradox (Hour 35) Problem: For Parkinson's disease, top result was Haloperidol (antipsychotic). Biologically makes sense (targets dopamine pathways). Medically disastrous (worsens Parkinson's). Solution: Built contraindication engine with pharmacological rules. High-scoring drugs can still be filtered if they're dangerous. Implementation: pythonif drug.mechanism == "dopamine_antagonist" and disease == "Parkinson": filter_out(drug, reason="Worsens motor symptoms") Lesson: Domain knowledge beats pure algorithms. Medical AI needs safety guardrails. Challenge 4: UI Complexity (Hour 28-32) Problem: Each drug has 50+ genes, 20+ pathways, mechanism explanation, clinical trials, papers, adverse events. How do you show this without overwhelming users? Solution: Progressive disclosure
Level 1: Score + confidence + drug name Level 2: Top 3 genes, top 3 pathways, mechanism Level 3: Full details (expandable) Level 4: Clinical validation (separate modal)
Lesson: Good UX is hiding complexity, not avoiding it. Challenge 5: The 11th Hour Bug (Hour 46) Problem: Clinical validation broke 2 hours before submission. PubMed API started returning 403 errors. Quick Fix: pythontry: papers = fetch_pubmed() except: papers = {"warning": "PubMed temporarily unavailable"} Lesson: Graceful degradation saves demos. External APIs will fail at the worst time.
Accomplishments that we're proud of
It Actually Works This isn't a mockup or prototype. Navara AI:
Queries six real biomedical APIs in real-time Processes 15,000+ actual FDA-approved drugs Analyzes 25,000+ real diseases from medical databases Returns scientifically valid results (validated against literature) Handles edge cases gracefully
The Validation Rate I tested Navara's predictions against published research:
85%+ of top-ranked candidates have supporting literature in PubMed For Parkinson's disease: Found levodopa (standard treatment) as #1 For diabetes: Found metformin, insulin, sulfonylureas in top 5 For hypertension: Found ACE inhibitors, beta-blockers in top 10
The system isn't just fast - it's accurate. Safety-First Design Built a contraindication engine that caught:
Dopamine antagonists for Parkinson's (would worsen symptoms) Proconvulsants for epilepsy (could trigger seizures) Anticholinergics for Alzheimer's (cognitive impairment)
Zero false negatives on major contraindications tested. The Performance Leap Initial query: 45 seconds to Final: 0.5 seconds (after cache) That's a 90x speedup from smart caching and async architecture. The UI Design Created a unique terminal-inspired aesthetic:
Graph paper backgrounds Monospace fonts (Courier Prime) Brutalist card layouts Black/white/green color scheme
Looks like a professional computational biology tool, not a hackathon project. Built Solo in 48 Hours No team. Just me, six APIs, and a lot of coffee.
What we learned
Technical Skills APIs Are Hard
Every API has quirks (rate limits, formats, error codes) Always implement retries with exponential backoff Cache everything expensive Plan for API failures in production
Graph Databases Are Powerful
NetworkX made relationship queries elegant Path-finding algorithms perfect for "how are drug X and disease Y connected?" Visualization helps debug complex data
React State Management
useState for simple state useEffect for side effects and API calls Proper loading states make UX professional
Performance Optimization
First rule: Measure before optimizing Second rule: Cache is king Third rule: Async everything
Domain Knowledge Computational Biology
Disease-gene associations aren't binary (they have confidence scores) Drugs can target 1-100+ genes Pathways are hierarchical (need to handle parent-child relationships) Gene names are inconsistent across databases (normalization required)
Pharmacology
Mechanism of action matters more than just shared genes Contraindications can be absolute (never use) or relative (use cautiously) Clinical validation requires multiple evidence types Safety signals from adverse events need statistical significance
Drug Development
Traditional: 15 years, $2.6B, 90% failure rate Repurposing: 3-7 years, $2M, 70% success rate (safety proven) Regulatory pathway: 505(b)(2) allows abbreviated approval process
Hackathon Strategy Scope Ruthlessly
Started with 20 features, built 8 Cut machine learning model (use rule-based scoring) Cut drug combination analysis (too complex) Cut user authentication (not needed for demo)
Build Iteratively
Backend first (can test with curl) Then minimal frontend (prove integration) Then polish UI (time permitting) Always have something demo-able
Validate Early
Tested with known repurposing cases (Sildenafil for pulmonary hypertension) Cross-referenced with PubMed papers Asked pharmacology experts (via Discord) Caught the dopamine antagonist bug before demo
Demo-Driven Development
What looks cool in a 2-minute demo? Live search with real-time results Actual drug names people recognize Clear visualizations of shared genes
What's next for Navara AI
Immediate (Post-Hackathon) Technical Improvements
Machine learning model trained on successful repurposing cases Drug combination analysis (synergistic effects) Molecular docking simulation for binding validation Batch processing for multiple diseases
Data Expansion
Add DrugBank (more drug details) Add SIDER (side effects database) Add STRING (protein interactions) Add patient stratification (pharmacogenomics)
Product Features
User accounts and saved queries Export to PDF/Excel Share results via URL API access for researchers
Medium-Term (3-6 Months) Clinical Validation
Partner with research labs to validate top predictions Run retrospective analysis on successful repurposing cases Publish findings in biomedical journals Present at computational biology conferences
Platform Scale
Handle 1000+ concurrent users Reduce first query time to <3 seconds Add real-time literature monitoring Mobile app (iOS/Android)
Long-Term (1+ Year) Real-World Impact
Partner with pharmaceutical companies Support 505(b)(2) regulatory submissions Fund clinical trials for top candidates Track drugs that go from Navara to FDA approval
Academic Collaboration
Open-source core algorithms Release dataset of validated predictions Build API for research community Create educational resources
The Dream See a drug discovered by Navara AI enter clinical trials. Watch it get FDA approval. Know that patients with rare diseases have treatment because an AI found a connection that humans missed. That's why we built this.
Built With
- aiohttp
- axios
- biomedicalapis
- fastapi
- graph-database
- graphql
- networkx
- node.js
- pydantic
- python
- react-native
- rest
Log in or sign up for Devpost to join the conversation.