Inspiration

Most medical AI tools today are glorified chatbots: they require doctors to break their workflow, open a new tab, and type out symptoms. But medicine isn't just text; it is deeply spatial and visual. A doctor needs to know exactly where on the body a patient's pain sits and what the wound looks like.

We were inspired to build Medisight: not a chatbot, but an autonomous, multi-agent medical brain that integrates directly into existing EHR workflows (like Prompt Opinion) using the Model Context Protocol (MCP). We wanted to prove that AI can handle 3D spatial coordinates, native multimodal vision, and real-time literature retrieval, all while outputting standardized clinical data.

What it does

Medisight is a stateless diagnostic microservice. When a doctor requests a consultation inside the Prompt Opinion EHR, Medisight triggers a LangGraph workflow that:

  1. Maps Spatial Pain: Takes 3D (X, Y, Z) click coordinates from the patient's anatomy chart and grades spatial confidence by Euclidean distance against fixed thresholds (see the first sketch after this list).
  2. Retrieves Anatomical Context: Queries a locally baked ChromaDB vector store to inject relevant clinical conditions (like SNOMED codes and genetic risks) via semantic RAG.
  3. Orchestrates Specialists: Spawns specialized AI agents (Neurology, Orthopedics) to analyze the data in parallel.
  4. Retrieves Literature: Dynamically builds symptom-aware query variants to search the live NCBI PubMed API (with fallbacks to Europe PMC and CrossRef) to ground the diagnosis in clinical guidelines.
  5. Synthesizes via a Chief Critic: A final agent reviews the specialists' drafts and standardizes the output.
  6. Returns FHIR: The final response isn't just text; it's a fully compliant HL7 FHIR R4 Transaction Bundle ready to be written to the hospital's database (see the second sketch after this list).
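
Here is a minimal sketch of the distance check behind step 1; the landmark coordinates and thresholds are illustrative stand-ins, not our production values:

```python
import math

# Hypothetical anatomical landmarks in the chart's normalized 3D space.
LANDMARKS = {
    "lumbar_spine": (0.50, 0.35, 0.10),
    "left_knee": (0.42, 0.12, 0.05),
}

def spatial_confidence(click: tuple[float, float, float],
                       landmark: tuple[float, float, float],
                       high: float = 0.05, low: float = 0.15) -> str:
    """Grade confidence by the Euclidean distance from the click to a landmark."""
    dist = math.dist(click, landmark)
    if dist <= high:
        return "high"    # click lands essentially on the landmark
    if dist <= low:
        return "medium"  # nearby: widen the differential
    return "low"         # too far: fall back to a semantic baseline search

print(spatial_confidence((0.51, 0.36, 0.11), LANDMARKS["lumbar_spine"]))  # high
```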
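
And the shape of step 6's output, trimmed to a skeleton (the real Bundle is much richer; the resource contents here are abbreviated and illustrative):

```python
# Skeleton of an HL7 FHIR R4 transaction Bundle.
bundle = {
    "resourceType": "Bundle",
    "type": "transaction",
    "entry": [
        {
            "fullUrl": "urn:uuid:61ebe359-bfdc-4613-8bf2-c5e300945f0a",
            "resource": {
                "resourceType": "Condition",
                "code": {"text": "Suspected lumbar radiculopathy"},  # illustrative
                "subject": {"reference": "Patient/example"},
            },
            "request": {"method": "POST", "url": "Condition"},
        },
    ],
}
```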

Additionally, Medisight exposes a native Multimodal Vision Endpoint that ingests photographs of injuries (sent as base64 payloads) and extracts visual symptoms such as erythema and swelling.
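
A minimal sketch of that endpoint's shape; the route name, request model, and the stubbed model call are illustrative:

```python
import base64
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class VisionRequest(BaseModel):
    image_b64: str  # base64-encoded JPEG/PNG of the injury
    note: str = ""  # optional free-text context from the clinician

def run_vision_model(image_bytes: bytes, note: str) -> list[str]:
    # Stub: in Medisight this is a Gemini multimodal call; here we only
    # illustrate the contract (image bytes + text in, visual findings out).
    return ["erythema", "swelling"]

@app.post("/vision/analyze")  # route name is illustrative
def analyze(req: VisionRequest) -> dict:
    image_bytes = base64.b64decode(req.image_b64)
    return {"findings": run_vision_model(image_bytes, req.note)}
```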

How we built it

We built Medisight with an enterprise-grade backend architecture; minimal sketches of each layer follow the list:

  • The Intelligence Engine: We utilized Google Gemini 3.0 Flash Preview and 2.5 Flash for their massive context windows and native multimodal reasoning, building a cascading fallback system to handle rate limits.
  • The Orchestration: We used LangGraph to route our agents. Instead of relying on a single complex prompt, we built a deterministic state machine where agents check each other's work and route through a triage_agent_node.
  • The Memory: We baked a ChromaDB Vector Database directly into our Docker image for zero-latency Retrieval-Augmented Generation (RAG).
  • The Integration: We wrapped the entire system in FastMCP, explicitly injecting the PROMPTOPINION_FHIR_EXTENSION to ensure seamless context passing from the host EHR.
  • The Infrastructure: The app is containerized and deployed on Google Cloud Run for stateless, auto-scaling execution, protected by a custom HTTP Bearer Token SecurityMiddleware.
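
First, the cascading fallback from the Intelligence Engine bullet. The model IDs, retry budget, and stubbed SDK call are illustrative:

```python
import time

class RateLimitError(Exception):
    """Stand-in for the SDK's 429/503 errors."""

def generate(model: str, prompt: str) -> str:
    raise RateLimitError  # stub: imagine a real Gemini SDK call here

def rule_based_answer(prompt: str) -> str:
    return "UNAVAILABLE: deterministic fallback"  # last-resort string path

MODEL_CASCADE = ["gemini-3-flash-preview", "gemini-2.5-flash"]  # illustrative IDs

def call_with_fallback(prompt: str, max_retries: int = 3) -> str:
    """Try each model in order; back off on rate limits, then fall through."""
    for model in MODEL_CASCADE:
        for attempt in range(max_retries):
            try:
                return generate(model, prompt)
            except RateLimitError:
                time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s
    return rule_based_answer(prompt)
```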
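
Next, the orchestration. This is a single-route sketch of the triage routing; the node bodies are stand-ins, and the real graph fans specialists out in parallel before ending at the Critic:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class ConsultState(TypedDict):
    symptoms: str
    specialty: str
    drafts: list[str]

def triage_agent_node(state: ConsultState) -> dict:
    # Illustrative rule; the real node is an LLM call with strict output typing.
    is_neuro = "headache" in state["symptoms"]
    return {"specialty": "neurology" if is_neuro else "orthopedics"}

def neurology(state: ConsultState) -> dict:
    return {"drafts": state["drafts"] + ["neurology draft"]}

def orthopedics(state: ConsultState) -> dict:
    return {"drafts": state["drafts"] + ["orthopedics draft"]}

graph = StateGraph(ConsultState)
graph.add_node("triage_agent_node", triage_agent_node)
graph.add_node("neurology", neurology)
graph.add_node("orthopedics", orthopedics)
graph.set_entry_point("triage_agent_node")
graph.add_conditional_edges(
    "triage_agent_node",
    lambda s: s["specialty"],  # deterministic routing key, no free-form text
    {"neurology": "neurology", "orthopedics": "orthopedics"},
)
graph.add_edge("neurology", END)
graph.add_edge("orthopedics", END)
app = graph.compile()

print(app.invoke({"symptoms": "throbbing headache", "specialty": "", "drafts": []}))
```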
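
The memory layer reduces to a persistent client pointed at the store baked into the image; the path and collection name here are illustrative:

```python
import chromadb

# The Docker image ships with a pre-built store at this path, so queries
# hit local disk instead of a network service.
client = chromadb.PersistentClient(path="/app/chroma_store")
conditions = client.get_or_create_collection("anatomical_conditions")

hits = conditions.query(
    query_texts=["lower back pain radiating to the left leg"],
    n_results=3,  # top-k clinical context injected into the specialist prompts
)
print(hits["documents"])
```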
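
For the integration, the FastMCP wrapper boils down to a tool the host EHR can invoke. The tool signature is illustrative, and the PROMPTOPINION_FHIR_EXTENSION wiring is omitted:

```python
from fastmcp import FastMCP

mcp = FastMCP("medisight")

def run_diagnostic_workflow(symptoms: str, coords: list[float] | None) -> dict:
    # Stub standing in for the LangGraph pipeline described above.
    return {"resourceType": "Bundle", "type": "transaction", "entry": []}

@mcp.tool()
def consult(symptoms: str, coordinates: list[float] | None = None) -> dict:
    """Diagnostic entry point exposed to the host EHR over MCP."""
    return run_diagnostic_workflow(symptoms, coordinates)

if __name__ == "__main__":
    mcp.run()
```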
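
Finally, the security layer amounts to one strict check on every request (the env var name is illustrative):

```python
import os
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
API_TOKEN = os.environ.get("MEDISIGHT_TOKEN", "")

@app.middleware("http")
async def security_middleware(request: Request, call_next):
    # Reject anything without the exact bearer token before it reaches a route.
    if request.headers.get("authorization") != f"Bearer {API_TOKEN}":
        return JSONResponse({"detail": "unauthorized"}, status_code=401)
    return await call_next(request)
```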

Challenges we ran into

  • The "Brittle Data" Problem: In real clinical settings, data is messy. If spatial data is missing, the system doesn't crash; our spatial node dynamically defaults to a semantic baseline search.
  • API Rate Limiting & Hallucinations: Multi-agent architectures are heavy on API calls. We built a robust fallback wrapper that automatically handles 429/503 errors and downgrades to secondary models or rule-based string fallbacks if the LLMs fail entirely.
  • The Markdown Bug: LLMs naturally want to wrap their output in Markdown backticks, which breaks JSON parsing. We had to build strict parsing and sanitization layers to ensure our Critic Agent always returned machine-readable JSON for our FHIR bundles (see the sketch after this list).
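
The core of that sanitization layer is small; this sketch shows the fence-stripping idea:

```python
import json
import re

# Remove a leading ```/```json fence and a trailing ``` fence, if present.
FENCE = re.compile(r"^```(?:json)?\s*|\s*```$", re.MULTILINE)

def parse_llm_json(raw: str) -> dict:
    """Strip Markdown code fences the model adds, then parse strictly."""
    return json.loads(FENCE.sub("", raw).strip())

print(parse_llm_json('```json\n{"resourceType": "Bundle"}\n```'))
```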

Accomplishments that we're proud of

We are incredibly proud of our PubMed Helper engine. Instead of just making a single API call, we engineered a system that generates symptom-aware topic variants, searches NCBI, utilizes exponential backoff, and falls back to Europe PMC and CrossRef if no hits are found.
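
Here is the first stage of that cascade, sketched against the real NCBI E-utilities endpoint; the query variants are simplified, and the Europe PMC / CrossRef fallbacks are omitted:

```python
import time
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_ids(symptom: str, retries: int = 3) -> list[str]:
    """Search PubMed with symptom-aware query variants, backing off on errors."""
    variants = [symptom, f"{symptom} diagnosis", f"{symptom} clinical guideline"]
    for term in variants:
        for attempt in range(retries):
            try:
                r = requests.get(ESEARCH, params={
                    "db": "pubmed", "term": term,
                    "retmode": "json", "retmax": 5,
                }, timeout=10)
                r.raise_for_status()
                ids = r.json()["esearchresult"]["idlist"]
                if ids:
                    return ids
                break  # no hits for this variant: try the next one
            except requests.RequestException:
                time.sleep(2 ** attempt)  # exponential backoff
    return []  # caller then falls back to Europe PMC, then CrossRef

print(pubmed_ids("lumbar radiculopathy"))
```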

We are also proud of implementing enterprise-level security and a fully containerized Vector DB within a tight hackathon window.

What we learned

We learned that building AI for healthcare is 10% prompt engineering and 90% systems architecture. Orchestrating LLMs requires strict data typing, deterministic routing, and anticipating edge-case crashes.

What's next for Medisight

The immediate next step is expanding our roster of specialist agents to include Cardiology and Rheumatology. After that, we plan to integrate real-time SNOMED CT and ICD-10 coding modules so Medisight can automate hospital billing alongside the clinical diagnosis.
