Inspiration
This idea didn't come from a conference room — it came from a desk in a primary healthcare unit in a rural, underserved area in Egypt. As a physician working directly within Egypt's Disease Early Warning System (DES Egypt), I file notifiable disease reports the same way clinicians here have for years: on paper, by hand, into a register that then has to physically make its way up the chain before anyone outside the clinic even knows a case exists.
Working with limited resources makes every inefficiency visible. A 35/44-disease notifiable list, Group A/B classification rules that determine how urgently a case must be reported, case definitions that require real clinical judgment — all of it managed through handwriting and registers, with no real-time way to see whether the three cases I just saw this week are isolated, or the first signs of something bigger. In a well-resourced hospital, that gap might cost a delay. In a rural underserved area, it can mean an outbreak is well underway before anyone with the power to respond even learns about it.
I didn't need to read about the global consequences of weak surveillance systems — I was living the local version of it every day. EpiLink came from wanting to close that exact gap: giving a clinician like me, anywhere, the ability to turn what I'm already documenting into a structured, instant, geolocated signal — without changing how rural healthcare actually works on the ground.
What it does
EpiLink is an AI-powered infectious disease surveillance platform designed for Egypt's Disease Early Warning System, turning a clinician's observation into a structured, geolocated outbreak signal in seconds.
- Three ways to report a case: a structured form with disease, governorate, age group, and outcome fields; free-text/SMS input for natural clinical notes; and image/OCR upload for handwritten paper registers (the format most PHC units already use today).
- Instant AI classification: every report is mapped against Egypt's notifiable disease list and ICD-10 codes, with a confidence score attached, powered by Groq's LLaMA 3.
- Statistical anomaly detection: each alert is scored with a z-score against historical baselines, so the system can recognize when repeated, independent reports of the same disease in the same area start forming a real pattern — not just a single isolated case.
- Human-in-the-loop review: AI never has the final say. Every alert sits as "Pending" until a human epidemiologist confirms or dismisses it from the Alert Review console — "AI suggests, humans decide."
- Live outbreak map: a real-time, geolocated heatmap of Egypt showing where signals are clustering, auto-zoomed to the relevant governorates.
- Surveillance dashboard: weekly trends, top diseases, alert rates, and a drift monitor tracking model confidence and human confirmation rates over time.
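The z-score scoring mentioned above can be sketched roughly as follows. This is a minimal illustration, not EpiLink's actual code: the function names, the weekly-count baseline, and the `Z_ALERT_THRESHOLD` cutoff are all assumptions made for the example.

```python
from statistics import mean, stdev

# Hypothetical cutoff; the write-up does not state which z-score triggers an alert.
Z_ALERT_THRESHOLD = 2.0


def z_score(current_count: int, baseline_counts: list[int]) -> float:
    """Standard score of the current weekly count against historical weekly counts."""
    if len(baseline_counts) < 2:
        return 0.0  # too little history to estimate a spread
    mu = mean(baseline_counts)
    sigma = stdev(baseline_counts)  # sample standard deviation
    if sigma == 0:
        # Flat baseline: any count above it is treated as maximally unusual.
        return float("inf") if current_count > mu else 0.0
    return (current_count - mu) / sigma


def is_anomalous(current_count: int, baseline_counts: list[int]) -> bool:
    """True when the current count sits at or above the assumed z-score cutoff."""
    return z_score(current_count, baseline_counts) >= Z_ALERT_THRESHOLD


# Example: weekly counts for one disease in one governorate (made-up numbers).
baseline = [1, 2, 3]
print(z_score(5, baseline), is_anomalous(5, baseline))
```

Scoring each disease/governorate pair against its own history is what lets a handful of reports stand out in a normally quiet area while the same count in a busy area does not.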
How we built it
- Backend: FastAPI with async SQLAlchemy, PostgreSQL + PostGIS for geospatial data, Alembic for migrations, and APScheduler for scheduled drift-monitoring jobs.
- Frontend: React (built and iterated rapidly with Lovable), with Leaflet for the interactive outbreak map.
- AI layer: Groq's LLaMA 3 for clinical text classification and disease/ICD-10 extraction, plus Groq Vision for OCR on handwritten and photographed clinical documents — including bilingual Arabic/English text.
- Data foundation: a full mapping of Egypt's notifiable disease list (the 35/44-disease list described above) to ICD-10 codes, with Group A (immediate report) and Group B (weekly report) classification built directly into the alert dispatch logic, sourced from DES Egypt's official surveillance protocols.
- Resilience by design: offline-ready structured intake so a clinician with limited connectivity can still file a report the moment they're back online.
We split the work by strength: backend architecture and the classification/alerting pipeline, frontend and UX, and clinical accuracy — disease definitions, classification rules, and validating every AI output against real DES Egypt protocols.
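As a rough sketch of how a disease-to-ICD-10 mapping with Group A/B classification could drive dispatch urgency, consider the following. The class names, the `reporting_group` helper, and the two table entries are illustrative assumptions; in particular, the group assignments shown are placeholders and are not taken from the DES Egypt list.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ReportGroup(Enum):
    A = "immediate"  # Group A: report immediately
    B = "weekly"     # Group B: report weekly


@dataclass(frozen=True)
class NotifiableDisease:
    name: str
    icd10: str
    group: ReportGroup


# Placeholder rows for illustration only. The ICD-10 codes are real,
# but the Group A/B assignments here are NOT from DES Egypt protocols.
DISEASES = {
    "cholera": NotifiableDisease("cholera", "A00", ReportGroup.A),
    "measles": NotifiableDisease("measles", "B05", ReportGroup.B),
}


def reporting_group(disease_name: str) -> Optional[ReportGroup]:
    """Look up the reporting group, or None if the disease is not in the mapping."""
    disease = DISEASES.get(disease_name.strip().lower())
    return disease.group if disease else None


print(reporting_group("Cholera"))
print(reporting_group("unknown fever"))
```

Keeping the group on the disease record itself means the dispatch step only has to read one field, rather than re-deriving urgency from the model's output.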
Challenges we ran into
The hardest challenges showed up not in writing the code, but in testing it like a skeptical end user — and a few were significant:
- The "human-in-the-loop" gap: early in testing, we discovered alerts were being auto-dispatched immediately on creation, completely skipping the review step our own pitch was built around. We traced and fixed the lifecycle logic so alerts now correctly sit as "Pending" until a human decision is made.
- Statistical scoring not wired up: our z-score field was silently returning 0.00 across every alert, even when we deliberately submitted repeated reports designed to trigger a strong cluster signal. We confirmed this with controlled tests before it was fixed.
- Confidence threshold consistency: we had documented an 85% confidence escalation threshold in our qualifier answers, but found the running app wasn't yet enforcing it — a good reminder that documentation and implementation can drift apart under time pressure.
- Diagnostic edge cases: testing the AI Analysis feature with a malaria-suspicious case (fever, jaundice, travel history to an endemic area) revealed the model sometimes under-weighted travel/exposure history relative to overlapping symptom patterns — a clinically meaningful nuance we flagged for prompt refinement.
- OCR-to-alert handoff: the Image/OCR submission path correctly reads and classifies handwritten notes but doesn't yet complete the final step of creating an alert record — the one entry point that isn't fully wired end-to-end.
Finding these issues required deliberately adversarial testing — submitting the same case through every input method, repeating disease/governorate combinations to probe statistical logic, and checking live API responses rather than trusting the UI alone.
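The corrected alert lifecycle from the first challenge can be illustrated with a small state model. This is a sketch under assumptions, not the project's real schema: `Alert`, `review`, and `can_dispatch` are hypothetical names, and only the "Pending" status label comes from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AlertStatus(Enum):
    PENDING = "Pending"
    CONFIRMED = "Confirmed"
    DISMISSED = "Dismissed"


@dataclass
class Alert:
    disease: str
    governorate: str
    status: AlertStatus = AlertStatus.PENDING  # every new alert starts as Pending
    reviewed_by: Optional[str] = None


def review(alert: Alert, reviewer: str, confirm: bool) -> Alert:
    """Record a human decision; only a Pending alert can be reviewed."""
    if alert.status is not AlertStatus.PENDING:
        raise ValueError(f"alert already reviewed: {alert.status.value}")
    alert.status = AlertStatus.CONFIRMED if confirm else AlertStatus.DISMISSED
    alert.reviewed_by = reviewer
    return alert


def can_dispatch(alert: Alert) -> bool:
    """Dispatch is allowed only after a human has confirmed the alert."""
    return alert.status is AlertStatus.CONFIRMED


alert = Alert(disease="measles", governorate="Aswan")
print(can_dispatch(alert))  # nothing is dispatched on creation
review(alert, reviewer="epidemiologist_01", confirm=True)
print(can_dispatch(alert))
```

Putting the check in `can_dispatch` rather than in the UI makes the "AI suggests, humans decide" rule something a test can assert directly against the backend.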
Accomplishments that we're proud of
- A working end-to-end human-in-the-loop pipeline: report → AI classification → pending alert → human confirmation → dashboard update, verified live for the structured-form and free-text/SMS reporting methods (the Image/OCR path still stops short of creating an alert record).
- Accurate per-governorate geocoding on the live outbreak map, correctly distinguishing a patient's actual reporting location from unrelated travel/exposure history mentioned in free text.
- Real statistical escalation: submitting the same disease signal through multiple independent channels visibly raises the alert's confidence, z-score, and severity level, mirroring how real outbreaks reveal themselves through converging evidence rather than a single report.
- Built directly from real DES Egypt clinical materials — the actual 35/44-disease notifiable list, Group A/B definitions, and case definition protocols — not a generic disease database.
What we learned
We learned that the gap between "the AI suggests, humans decide" as a slogan and as an actually enforced behavior is easy to underestimate — it has to be tested explicitly, not assumed from the UI text alone. We also learned how much classification accuracy depends on clinically meaningful detail (like travel history) that a general-purpose model won't automatically prioritize unless the prompt is shaped by people who understand the disease context — which is exactly why pairing frontline clinical knowledge with AI/ML engineering produced a stronger result than either could alone.
What's next for EpiLink
- Scaling beyond borders: extending EpiLink to integrate with healthcare systems in other countries, so that national surveillance systems can feed a shared network.
- Interconnected surveillance networks: developing data-sharing protocols between those systems, because effective outbreak detection depends on surveillance systems being able to exchange data.
- Universal human protection: evolving EpiLink into a cross-border early-warning system that tracks infectious diseases across continents in real time.
- Closing the Image/OCR gap: finishing the intake workflow so handwritten paper reports create alerts end-to-end, just like the structured-form and free-text/SMS inputs.
- Refining AI risk analysis: reworking the differential diagnosis prompts so that epidemiological risk factors such as travel history, occupational exposure, and local outbreak data carry more weight in the classification.
Built With
- alembic
- css
- docker
- fastapi
- groq
- html
- javascript
- langchain
- langgraph
- ocr
- openstreetmap
- postgresql
- python
- react
- render
- sqlalchemy
- supabase
- typescript
- vercel
- vite