Veredictos Multimodal Health Router

Inspiration

The hard part of healthcare AI is integration, not the model. Brazil has 16 million diabetics and 6,000 ophthalmologists; diabetic retinopathy is the leading cause of preventable blindness worldwide, and fewer than 50% of diabetics actually get screened. Veredictos already has a specialised retinal model at 96% sensitivity for DR, glaucoma, and hypertensive retinopathy. We came to this hackathon to answer one question: how do we make any health-AI model invokable, federation-ready, and auditable from day one, inside a real EHR workflow?

What it does

Veredictos Multimodal Health Router is a model-agnostic MCP capability layer plus an ADK A2A orchestrator that picks the right health-AI model for each clinical task and writes auditable FHIR for every step.

MCP server exposes:

ListAvailableModels - registry discovery for the orchestrator
InvokeHealthModel - single-model routing
EnsembleDiagnose - weighted multi-model agreement
ConsultDiagnosticReport - natural-language explainer for an existing report
ScreenTriage - batch screening priority queue
3 clinical prompt templates + 5 discoverable resources
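
To illustrate what weighted multi-model agreement can look like, the sketch below sums weight times confidence for each predicted label and reports the winning label's share of the total. The function name, the (label, confidence, weight) tuple shape, and the scoring rule are assumptions made for illustration; this is not the EnsembleDiagnose implementation.

```python
from collections import defaultdict


def weighted_agreement(predictions):
    """Combine (label, confidence, weight) votes into one decision.

    Hypothetical helper: each model votes for its label with
    weight * confidence; agreement is the winner's share of all votes.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    scores = defaultdict(float)
    for label, confidence, weight in predictions:
        scores[label] += weight * confidence
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("weights and confidences must sum to a positive value")
    # Sorting first makes ties resolve alphabetically, so the result is stable.
    winner = max(sorted(scores), key=lambda lbl: scores[lbl])
    return {"label": winner, "agreement": scores[winner] / total}
```

A low agreement value is the kind of signal an orchestrator could use to flag a borderline case.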

A2A orchestrator (Google ADK Python + Gemini) reads patient FHIR context via SHARP, classifies the task (diagnose / consult_report / ensemble / screen_triage), and dispatches the correct MCP tool. Includes a deterministic /local rule-based fallback for offline demos.
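
A deterministic fallback of this kind can be as small as an ordered keyword table. The sketch below is a guess at the shape, not the project's /local code: the keyword lists are invented, and the task-to-tool pairing is inferred from the tool descriptions above.

```python
# Task label -> MCP tool name. The pairing is inferred from the tool
# descriptions in this write-up, not copied from the project code.
TASK_TO_TOOL = {
    "diagnose": "InvokeHealthModel",
    "consult_report": "ConsultDiagnosticReport",
    "ensemble": "EnsembleDiagnose",
    "screen_triage": "ScreenTriage",
}

# Hypothetical keyword rules, checked in order; the first match wins.
RULES = [
    ("screen_triage", ("triage", "queue", "batch", "prioritize")),
    ("consult_report", ("explain", "report", "what does")),
    ("ensemble", ("ensemble", "second opinion", "all models")),
]


def classify_task(message: str) -> str:
    """Return a task label; fall back to single-model diagnosis."""
    text = message.lower()
    for task, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return task
    return "diagnose"


def route(message: str) -> dict:
    """Map a free-text request to the task label and MCP tool to call."""
    task = classify_task(message)
    return {"task": task, "tool": TASK_TO_TOOL[task]}
```

Because the rules are a fixed, ordered table, the same message always yields the same route, which is what makes an offline demo repeatable.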

HIPAA/LGPD audit trail - every invocation writes a FHIR AuditEvent + Provenance resource plus an NDJSON line with SHA-256 hashes of input/output payloads.
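
The NDJSON side of this can be sketched with the standard library alone. The field names and the canonical-JSON choice (sorted keys, no whitespace) below are assumptions; the source only states that each line carries SHA-256 hashes of the input and output payloads.

```python
import hashlib
import json
from datetime import datetime, timezone


def payload_hash(payload) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, no whitespace)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_line(tool, input_payload, output_payload) -> str:
    """Build one NDJSON line; the record layout is illustrative."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "input_sha256": payload_hash(input_payload),
        "output_sha256": payload_hash(output_payload),
    }
    # json.dumps emits no raw newlines, so the record stays on one line.
    return json.dumps(record, sort_keys=True)
```

Hashing a canonical encoding means two payloads with the same content but different key order produce the same digest, and the log never has to store the clinical payload itself.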

Federation - FhirRegistry + target_workspace in the routing decision schema enables cross-workspace routing without changing tool code.
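
One way to picture this: the routing decision names a workspace, and a registry turns that name into a FHIR base URL, so the tool body never needs to know which workspace it is talking to. In the sketch below, StaticRegistry is a hypothetical stand-in for FhirRegistry, the workspace ids and URLs are invented, and dataclasses replace the project's Pydantic models to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoutingDecision:
    # Field names other than target_workspace are illustrative.
    task: str
    tool: str
    target_workspace: Optional[str] = None


class StaticRegistry:
    """Maps workspace ids to FHIR base URLs (stand-in for FhirRegistry)."""

    def __init__(self, bindings, default_workspace):
        if default_workspace not in bindings:
            raise KeyError(f"unknown default workspace: {default_workspace}")
        self._bindings = dict(bindings)
        self._default = default_workspace

    def resolve(self, decision: RoutingDecision) -> str:
        """Return the FHIR base URL the decision should be executed against."""
        workspace = decision.target_workspace or self._default
        try:
            return self._bindings[workspace]
        except KeyError:
            raise KeyError(f"no binding for workspace: {workspace}") from None


registry = StaticRegistry(
    {
        "clinic-a": "https://fhir.clinic-a.example/r4",  # invented binding
        "clinic-b": "https://fhir.clinic-b.example/r4",  # invented binding
    },
    default_workspace="clinic-a",
)
```

Adding a third workspace then means adding one binding, with no change to the tool code.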

4 swappable adapters behind the MCP - Veredictos Vision (retina specialist), RetFound (Moorfields foundation), Synthetic classifier (Veredictos-Gen distribution), ReportConsult (LLM explainer). Replace _predict() in any of them to plug in real weights in about 50 lines, without touching the FHIR, MCP, or A2A code.
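
The adapter contract can be sketched as a base class that owns validation and packaging, with _predict() as the only method a backend overrides. Apart from the names _predict, AdapterInput, and AdapterOutput, everything below (fields, labels, the hash-based mock) is an assumption, and dataclasses stand in for the project's Pydantic schemas.

```python
import hashlib
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class AdapterInput:
    image_ref: str  # illustrative field


@dataclass(frozen=True)
class AdapterOutput:
    label: str
    confidence: float
    model_id: str


class BaseAdapter(ABC):
    model_id = "base"

    def run(self, data: AdapterInput) -> AdapterOutput:
        """Shared wrapper: call the backend, validate, package the result."""
        label, confidence = self._predict(data)
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        return AdapterOutput(label, confidence, self.model_id)

    @abstractmethod
    def _predict(self, data: AdapterInput):
        """Return (label, confidence); the only method a backend replaces."""


class DeterministicMockAdapter(BaseAdapter):
    model_id = "mock-retina"  # hypothetical id
    LABELS = ("normal", "diabetic_retinopathy", "glaucoma")

    def _predict(self, data: AdapterInput):
        # Derive the answer from a hash of the input, so the same input
        # always yields the same output without any model weights.
        digest = hashlib.sha256(data.image_ref.encode("utf-8")).digest()
        label = self.LABELS[digest[0] % len(self.LABELS)]
        confidence = 0.5 + digest[1] / 510  # lands in [0.5, 1.0]
        return label, confidence
```

Swapping in real weights would then mean writing a new subclass whose _predict() runs inference, while run() and everything downstream of it stay untouched.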

How we built it

MCP server - Python, FastMCP (po-fastmcp), Pydantic v2, fhir.resources R4 validation, structlog, httpx
A2A agent - Google ADK Python (po-adk-python orchestrator template), Gemini 3.1 flash-lite via LiteLLM
Shared schemas - RoutingDecision, AdapterInput, AdapterOutput, ModelMeta, AuditRecord in one folder used by both services
Audit context manager - wraps every tool body; on exit it emits AuditEvent + Provenance + NDJSON in one transaction
Local end-to-end stack - mock FHIR R4 server (FastAPI), boot script, smoke battery with 9 scenarios covering every MCP path
Tunnels - Cloudflare quick tunnels expose MCP + A2A to the workspace
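
The audit context manager pattern in the list above can be sketched with contextlib. This is a simplified illustration rather than the project's code: the function name, the sink argument, and the record fields are assumptions, and the dicts are placeholders, not valid FHIR resources.

```python
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def audited(tool_name, sink):
    """Run a tool body and append its audit records to `sink` on exit."""
    state = {"outcome": "success"}
    try:
        yield state
    except Exception:
        state["outcome"] = "failure"
        raise  # the tool error still propagates to the caller
    finally:
        recorded = datetime.now(timezone.utc).isoformat()
        # Placeholder dicts: real AuditEvent and Provenance resources
        # need more required fields than are shown here.
        batch = [
            {"resourceType": "AuditEvent", "recorded": recorded,
             "tool": tool_name, "outcome": state["outcome"]},
            {"resourceType": "Provenance", "recorded": recorded,
             "tool": tool_name},
        ]
        sink.extend(batch)  # a single extend, so both records land together
```

Putting the emit step in finally means a failing tool call is audited as well as a successful one.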

Challenges

The Kaggle MedGemma lesson drove the architecture: integration into the platform matters more than model accuracy. We deliberately ship deterministic mocks behind the adapter contract so judges can verify the integration rather than be distracted by claims about a model.
A2A protocol v1 required supportedInterfaces instead of the deprecated top-level url. We caught this in the po-adk-python README at the last minute.
FastMCP ≥3.2 moved stateless_http from the constructor to run_http_async. Adapted on the fly.
Gemini free-tier quotas + 503s during live demos - we built the deterministic /local rule-based brain on the same A2A endpoint as the failover.
Tool schema cache - Gemini hesitated to invoke EnsembleDiagnose because it asked for fields the server already defaulted. We refactored tools to expose individual named kwargs (no JSON blob arg) and the server fills every default from the SHARP headers.

Accomplishments

4 health-AI backends behind a single MCP, each swappable in ~50 LOC
Real ADK multi-turn orchestrator with structured output and session state - not a prompt-only mock
HIPAA/LGPD-grade audit pipeline (AuditEvent + Provenance + payload hashes) wired into every tool by default, not bolted on
Federation surface (target_workspace) that already works with two static bindings
9/9 smoke battery green; CI green on every push

What we learned

MCP capability servers belong upstream; A2A agents belong downstream. The JSON RoutingDecision between them is the contract that makes the system observable.
Even mock backends benefit from real determinism - the demo becomes a reproducible test, not a stunt.
SHARP context-by-headers is a clean pattern: authentication never enters the prompt; tokens and FHIR base URLs ride on x-fhir-* headers.
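
A minimal sketch of the header pattern follows, combined with the default-filling behaviour described under Challenges. The exact header names (x-fhir-server-url, x-fhir-access-token, x-patient-id) and both function names are assumptions; the source only says that tokens and FHIR base URLs ride on x-fhir-* headers.

```python
def fhir_context_from_headers(headers):
    """Extract FHIR connection details from request headers.

    Header names are assumed for illustration; lookups are
    case-insensitive because HTTP header names are.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    base_url = normalized.get("x-fhir-server-url")
    if not base_url:
        raise ValueError("missing x-fhir-server-url header")
    return {
        "fhir_base_url": base_url.rstrip("/"),
        "access_token": normalized.get("x-fhir-access-token"),
        "patient_id": normalized.get("x-patient-id"),
    }


def resolve_tool_args(explicit_args, headers):
    """Fill tool kwargs the caller left as None from the header context."""
    context = fhir_context_from_headers(headers)
    resolved = {"patient_id": context["patient_id"]}
    # Explicit, non-None arguments from the model override header defaults.
    resolved.update({k: v for k, v in explicit_args.items() if v is not None})
    return resolved, context
```

The model only ever supplies the clinical arguments; the token stays in the returned context and is never placed in a prompt or a tool argument.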

What's next

Plug the real Veredictos V24 (332 MB, 94% sensitivity) and the public RetFound checkpoint behind their adapters. The contract is unchanged.
Federation across regional EHRs in the SUS network with workspace-level Provenance.
"Second-opinion" sub-agents when confidence is borderline, mediated by the orchestrator.

Built With

a2a-protocol
cloudflared
fastapi
fastmcp
fhir-r4
fhir-resources
gemini
github-actions
google-adk
httpx
litellm
mcp
openai-codex
pydantic
python
sharp
uvicorn

Updates

Pedro Afons started this project — May 11, 2026 09:31 PM EDT
