Inspiration
Healthcare AI agents are transforming patient care, but one hallucination can cost a life. We've seen AI recommend medications that conflict with patient allergies, misdiagnose due to biased data, or make overconfident predictions without evidence. We asked: What if we could build a guardrail that catches these errors before they reach patients? Cerberus was born from the need for transparency and safety in healthcare AI.
What it does
Cerberus is a Healthcare AI Audit & Safety Layer: a meta-MCP server that audits healthcare AI decisions comprehensively and in real time. It forces transparency through explainability extraction, verifies clinical claims against FHIR data, detects demographic bias, quantifies uncertainty, audits multi-agent conversations, and generates FHIR-compliant audit trails. It doesn't treat patients; it audits the agents that do.
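To make "verifies clinical claims against FHIR data" concrete, here is a minimal sketch of one such check: comparing a proposed medication against a patient's active allergies. The types are simplified subsets of FHIR R4, and `findAllergyConflicts`, `AllergyFinding`, and the example code system are hypothetical names for illustration, not Cerberus's actual API. It matches exact codings only; a real check would likely also need terminology mapping from ingredients to drug classes.

```typescript
// Simplified subsets of FHIR R4 types: only the fields this sketch reads.
interface Coding { system?: string; code?: string; display?: string }
interface CodeableConcept { coding?: Coding[]; text?: string }
interface AllergyIntolerance {
  resourceType: "AllergyIntolerance";
  code?: CodeableConcept;           // the substance the patient reacts to
  clinicalStatus?: CodeableConcept; // active | inactive | resolved
}

// Hypothetical finding shape, not Cerberus's actual output schema.
interface AllergyFinding {
  severity: "critical";
  message: string;
}

// Build "system|code" keys so codes from different terminologies never collide.
function codingKeys(concept: CodeableConcept | undefined): string[] {
  return (concept?.coding ?? [])
    .filter((c) => c.system !== undefined && c.code !== undefined)
    .map((c) => `${c.system}|${c.code}`);
}

// Assumption: a missing clinicalStatus is treated as active, erring on the side of caution.
function isActive(allergy: AllergyIntolerance): boolean {
  const statuses = allergy.clinicalStatus?.coding ?? [];
  return statuses.length === 0 || statuses.some((c) => c.code === "active");
}

function findAllergyConflicts(
  proposedMedication: CodeableConcept,
  allergies: AllergyIntolerance[],
): AllergyFinding[] {
  const proposedKeys = new Set(codingKeys(proposedMedication));
  const findings: AllergyFinding[] = [];
  for (const allergy of allergies) {
    if (!isActive(allergy)) continue;
    const hit = codingKeys(allergy.code).some((key) => proposedKeys.has(key));
    if (hit) {
      const label = allergy.code?.text ?? "unnamed substance";
      findings.push({
        severity: "critical",
        message: `Proposed medication matches an active allergy: ${label}`,
      });
    }
  }
  return findings;
}

// Usage with placeholder codes (the system URL and code are made up).
const exampleSystem = "https://example.org/substances";
const exampleFindings = findAllergyConflicts(
  { coding: [{ system: exampleSystem, code: "SUBSTANCE-1" }] },
  [{
    resourceType: "AllergyIntolerance",
    code: { text: "Example substance", coding: [{ system: exampleSystem, code: "SUBSTANCE-1" }] },
    clinicalStatus: { coding: [{ code: "active" }] },
  }],
);
console.log(exampleFindings);
```

Keying on system plus code, rather than display text, avoids false matches between terminologies that happen to reuse the same code string.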
How we built it
We implemented a Defense-in-Depth Triple-Gate Architecture using Node.js 20+, TypeScript 5.4+, and the MCP SDK. The system includes an MCP server with stdio transport, a full SHARP extension implementation, a FHIR R4 client with SMART on FHIR, a plugin-based audit engine with 5 specialized plugins, A2A protocol support, and FHIR AuditEvent generation. We developed it in 4 phases: Foundation, Core Engine, A2A Integration, and Safety & Polish.
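A plugin-based audit engine of the kind described above could be shaped roughly as follows. This is a sketch under stated assumptions: `AuditPlugin`, `AuditEngine`, `AgentDecision`, and both example plugins are hypothetical names, the plugins are kept synchronous for brevity (plugins that query a FHIR server would be asynchronous), and the fail-closed handling of a crashing plugin is one possible design choice rather than a confirmed part of Cerberus.

```typescript
type Severity = "info" | "warning" | "critical";

interface AuditFinding {
  plugin: string;
  severity: Severity;
  message: string;
}

// Hypothetical shape of an AI decision under audit.
interface AgentDecision {
  recommendation: string;
  confidence: number; // model-reported confidence in [0, 1]
  evidence: string[]; // citations or FHIR references backing the recommendation
}

interface AuditPlugin {
  name: string;
  audit(decision: AgentDecision): AuditFinding[];
}

class AuditEngine {
  private readonly plugins: AuditPlugin[] = [];

  register(plugin: AuditPlugin): this {
    this.plugins.push(plugin);
    return this;
  }

  // Run every registered plugin and collect all findings in registration order.
  run(decision: AgentDecision): AuditFinding[] {
    const findings: AuditFinding[] = [];
    for (const plugin of this.plugins) {
      try {
        findings.push(...plugin.audit(decision));
      } catch (err) {
        // Assumption: fail closed. A plugin that crashes is reported as a
        // critical finding instead of being silently skipped.
        const reason = err instanceof Error ? err.message : String(err);
        findings.push({
          plugin: plugin.name,
          severity: "critical",
          message: `Plugin failed: ${reason}`,
        });
      }
    }
    return findings;
  }
}

// Two illustrative plugins with made-up rules and thresholds.
const evidencePlugin: AuditPlugin = {
  name: "evidence-check",
  audit: (d) =>
    d.evidence.length === 0
      ? [{ plugin: "evidence-check", severity: "warning", message: "Recommendation cites no evidence" }]
      : [],
};

const overconfidencePlugin: AuditPlugin = {
  name: "overconfidence-check",
  audit: (d) =>
    d.confidence > 0.95 && d.evidence.length < 2
      ? [{ plugin: "overconfidence-check", severity: "warning", message: "High confidence with thin evidence" }]
      : [],
};

const engine = new AuditEngine().register(evidencePlugin).register(overconfidencePlugin);
console.log(engine.run({ recommendation: "Start drug X", confidence: 0.99, evidence: [] }));
```

Isolating each plugin behind a `try`/`catch` keeps one faulty check from hiding the results of the others, which matters when the engine is the last line of defense.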
Challenges we ran into
Healthcare data complexity required a plugin-based engine to handle nested FHIR resources efficiently. SHARP specification ambiguity demanded a flexible parser. Adapting explainability to clinical decision-making needed healthcare-specific feature extraction. Performance at scale required L1/L2 caching. Bias detection without representative data needed statistical significance testing. A2A protocol complexity demanded robust task management and SSE streaming.
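As an illustration of the statistical significance testing mentioned above, the sketch below runs a two-proportion z-test on how often an agent makes a given recommendation for two demographic groups. The test is our choice for the example and is not confirmed as the one Cerberus uses; `twoProportionZTest` and its types are hypothetical, and the normal CDF is computed with the Abramowitz-Stegun approximation of `erf` so the block has no dependencies. The z-test assumes reasonably large samples; small groups call for an exact test instead.

```typescript
interface GroupOutcome {
  positives: number; // e.g. patients for whom the agent recommended treatment
  total: number;     // all patients in the group
}

interface ZTestResult {
  z: number;
  pValue: number;       // two-sided
  significant: boolean; // pValue < alpha
}

// Abramowitz-Stegun approximation 7.1.26, absolute error below about 1.5e-7.
function erf(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t + 0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

function normalCdf(z: number): number {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

function twoProportionZTest(a: GroupOutcome, b: GroupOutcome, alpha = 0.05): ZTestResult {
  if (a.total <= 0 || b.total <= 0) {
    throw new RangeError("each group needs at least one observation");
  }
  const rateA = a.positives / a.total;
  const rateB = b.positives / b.total;
  // Pooled rate under the null hypothesis that both groups share one true rate.
  const pooled = (a.positives + b.positives) / (a.total + b.total);
  const standardError = Math.sqrt(pooled * (1 - pooled) * (1 / a.total + 1 / b.total));
  if (standardError === 0) {
    // Both groups are at exactly 0% or 100%: there is no difference to test.
    return { z: 0, pValue: 1, significant: false };
  }
  const z = (rateA - rateB) / standardError;
  const pValue = 2 * (1 - normalCdf(Math.abs(z)));
  return { z, pValue, significant: pValue < alpha };
}

// Usage with made-up counts: 80 of 100 versus 50 of 100 recommendations.
const biasCheck = twoProportionZTest({ positives: 80, total: 100 }, { positives: 50, total: 100 });
console.log(biasCheck.z.toFixed(2), biasCheck.significant);
```

Requiring significance before flagging bias guards against raising alarms over differences that small or unrepresentative samples would produce by chance.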
Accomplishments that we're proud of
5 production-ready MCP tools with full schemas and validation. Full SHARP compliance with context propagation. Comprehensive FHIR integration supporting 8 resource types. Plugin-based architecture with 5 audit plugins. A2A protocol support with Agent Card and task management. FHIR AuditEvent generation for compliance. Performance optimized with sub-second response times.
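For readers unfamiliar with FHIR AuditEvent generation, the following sketch builds a minimal R4 AuditEvent from an audit result. The required elements (`type`, `recorded`, `agent` with `requestor`, and `source.observer`), the `E` (execute) action, and the outcome codes 0, 4, 8, and 12 come from the FHIR R4 specification. Everything else is an assumption made for illustration: the `buildAuditEvent` helper, its input shape, the severity-to-outcome mapping, and the `https://example.org/...` code system.

```typescript
// FHIR R4 AuditEvent.outcome codes: 0 success, 4 minor, 8 serious, 12 major failure.
type AuditOutcome = "0" | "4" | "8" | "12";

// Simplified subset of the FHIR R4 AuditEvent resource.
interface AuditEventResource {
  resourceType: "AuditEvent";
  type: { system: string; code: string; display?: string };
  action: "E"; // E = execute
  recorded: string; // FHIR instant
  outcome: AuditOutcome;
  outcomeDesc?: string;
  agent: Array<{ who: { display: string }; requestor: boolean }>;
  source: { observer: { display: string } };
  entity?: Array<{ what: { reference: string } }>;
}

// Hypothetical input describing one completed audit.
interface AuditSummary {
  auditedAgent: string;
  patientId?: string;
  worstSeverity: "none" | "warning" | "critical";
  description: string;
  recordedAt?: Date;
}

// Assumed mapping from audit severity to FHIR outcome codes.
const OUTCOME_BY_SEVERITY: Record<AuditSummary["worstSeverity"], AuditOutcome> = {
  none: "0",
  warning: "4",
  critical: "8",
};

function buildAuditEvent(summary: AuditSummary): AuditEventResource {
  const event: AuditEventResource = {
    resourceType: "AuditEvent",
    // Placeholder code system; a real deployment would pick a published one.
    type: { system: "https://example.org/cerberus/audit-type", code: "ai-decision-audit" },
    action: "E",
    recorded: (summary.recordedAt ?? new Date()).toISOString(),
    outcome: OUTCOME_BY_SEVERITY[summary.worstSeverity],
    outcomeDesc: summary.description,
    // The audited AI agent is recorded as the actor that initiated the event.
    agent: [{ who: { display: summary.auditedAgent }, requestor: true }],
    source: { observer: { display: "Cerberus" } },
  };
  if (summary.patientId !== undefined) {
    event.entity = [{ what: { reference: `Patient/${summary.patientId}` } }];
  }
  return event;
}

console.log(JSON.stringify(buildAuditEvent({
  auditedAgent: "example-prescribing-agent",
  patientId: "example-123",
  worstSeverity: "critical",
  description: "Proposed medication conflicts with an active allergy",
}), null, 2));
```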
What we learned
Standards (MCP, A2A, FHIR, SHARP) are force multipliers; by our estimate, they reduced integration effort by ~70%. Healthcare data requires prioritizing safety over speed. Explainability needs feature importance, evidence citations, confidence calibration, and known unknowns. Bias is subtle and requires statistical significance testing across multiple dimensions. The meta-agent pattern is powerful for building infrastructure. Performance matters as much as accuracy in healthcare workflows.
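Confidence calibration, one of the explainability ingredients listed above, can be measured with expected calibration error (ECE): predictions are grouped into confidence bins, and each bin's average confidence is compared with its observed accuracy. The sketch below is a generic implementation of that metric, not necessarily how Cerberus measures calibration; `expectedCalibrationError` and `ScoredPrediction` are hypothetical names, and it assumes past predictions have been labeled correct or incorrect.

```typescript
interface ScoredPrediction {
  confidence: number; // model-reported probability in [0, 1]
  correct: boolean;   // whether the prediction turned out to be right
}

function expectedCalibrationError(predictions: ScoredPrediction[], binCount = 10): number {
  if (predictions.length === 0) {
    throw new RangeError("need at least one prediction");
  }
  const bins = Array.from({ length: binCount }, () => ({ count: 0, confidenceSum: 0, correct: 0 }));
  for (const p of predictions) {
    if (p.confidence < 0 || p.confidence > 1) {
      throw new RangeError(`confidence out of range: ${p.confidence}`);
    }
    // Equal-width bins; a confidence of exactly 1 falls in the last bin.
    const index = Math.min(binCount - 1, Math.floor(p.confidence * binCount));
    const bin = bins[index]!;
    bin.count += 1;
    bin.confidenceSum += p.confidence;
    if (p.correct) bin.correct += 1;
  }
  let ece = 0;
  for (const bin of bins) {
    if (bin.count === 0) continue;
    const accuracy = bin.correct / bin.count;
    const meanConfidence = bin.confidenceSum / bin.count;
    // Weight each bin's confidence-accuracy gap by its share of all predictions.
    ece += (bin.count / predictions.length) * Math.abs(accuracy - meanConfidence);
  }
  return ece;
}

// Usage with made-up data: an agent that claims certainty but is right half the time.
const overconfident: ScoredPrediction[] = [
  { confidence: 1, correct: true },
  { confidence: 1, correct: true },
  { confidence: 1, correct: false },
  { confidence: 1, correct: false },
];
console.log(expectedCalibrationError(overconfident));
```

An ECE near 0 means stated confidence tracks real accuracy; a large value flags the kind of overconfident prediction an audit layer should surface.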
What's next for Cerberus
Additional audit plugins (Lab Result Verification, Imaging Study Appropriateness, Clinical Guideline Compliance), enhanced explainability with visualization, advanced bias detection with population health integration, production hardening with Kubernetes, and the ultimate goal: making Cerberus the standard safety layer for all healthcare AI agents.
Built With
- a2a-protocol
- axios
- cors
- docker
- docker-compose
- express.js
- fhir
- fhir-r4-api
- hapi
- ioredis
- mcp-sdk
- node.js
- pino
- redis
- shap-js
- sharp-extension
- simple-statistics
- typescript
- uuid
- zod