MedExplain

Dual-Model Convergence for Explainable Clinical AI


The Problem

In healthcare, trust is paramount. Modern AI systems can diagnose disease with remarkable accuracy, yet they rarely explain their reasoning. That opacity breeds skepticism among clinicians, who need clear, traceable reasoning to make informed decisions.

  • Saliency maps (Grad-CAM, SHAP) show where a model looked.
  • They do not show why a diagnosis was made.
  • In clinical medicine, black-box decisions are unacceptable.

Radiologists need traceable reasoning, not just predictions.


Our Solution

We built a clinical AI system powered by a Dual-Model Convergence Loop.

Instead of trusting one model:

  • We run two independent reasoning strategies in parallel.
  • A deterministic comparison model evaluates agreement.
  • A diagnosis is accepted only if both strategies independently converge.

If they disagree, the system flags the case for human review.

Explainability is not added afterward — it is built into the reasoning process itself.
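
A minimal sketch of the loop as it might look in our Node backend (TypeScript; every helper and type here is an illustrative stand-in for the real prompt calls, and compare and hasConverged are sketched in later sections):

  // Illustrative orchestration of the convergence loop.
  type Region = { x: number; y: number; width: number; height: number };
  type CaseResult =
    | { status: "converged"; findings: unknown[] }
    | { status: "human_review"; flaggedRegions: Region[] };

  declare function runConclusions(c: unknown, f: Region[]): Promise<unknown>; // temperature 0.7
  declare function runIterative(c: unknown, f: Region[]): Promise<string>;    // temperature 0.1
  declare function compare(
    a: unknown,
    b: string,
  ): Promise<{ agreed: unknown[]; mismatchedRegions: Region[]; similarity: number }>;
  declare function hasConverged(
    r: { mismatchedRegions: Region[]; similarity: number },
    d: string,
  ): boolean;

  const MAX_CYCLES = 5;

  async function diagnose(caseData: unknown): Promise<CaseResult> {
    let flaggedRegions: Region[] = [];
    for (let cycle = 1; cycle <= MAX_CYCLES; cycle++) {
      // Two independent reasoning strategies, run in parallel.
      const [conclusions, deduction] = await Promise.all([
        runConclusions(caseData, flaggedRegions),
        runIterative(caseData, flaggedRegions),
      ]);
      // Deterministic validator (temperature 0.0).
      const report = await compare(conclusions, deduction);
      if (hasConverged(report, deduction)) {
        return { status: "converged", findings: report.agreed };
      }
      // Disagreements are fed back into the next cycle.
      flaggedRegions = report.mismatchedRegions;
    }
    // Honest doubt is safer than false confidence: escalate.
    return { status: "human_review", flaggedRegions };
  }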


The Dual-Model Convergence Loop

Conclusions Model (Aggressive Strategy)

  • Temperature: 0.7
  • Casts a wide diagnostic net
  • Prefers false positives over false negatives
  • Immediately proposes all possible findings

Outputs for each finding:

  • Diagnosis
  • Severity level
  • Confidence score
  • Anatomical region
  • Bounding box coordinates

This model behaves like an assertive specialist proposing differential diagnoses.
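
For concreteness, one finding can be represented as structured JSON along these lines (a sketch; the field names are our own, not a standard):

  // Hypothetical shape of one finding from the conclusions model.
  interface Finding {
    diagnosis: string;                      // e.g. "pulmonary nodule"
    severity: "low" | "moderate" | "high";  // drives bounding-box color
    confidence: number;                     // 0.0 - 1.0
    region: string;                         // e.g. "right upper lobe"
    boundingBox: { x: number; y: number; width: number; height: number };
  }

  // Illustrative values only.
  const example: Finding = {
    diagnosis: "pulmonary nodule",
    severity: "moderate",
    confidence: 0.82,
    region: "right upper lobe",
    boundingBox: { x: 312, y: 148, width: 46, height: 41 },
  };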


Iterative Model (Cautious Strategy)

  • Temperature: 0.1
  • Makes exactly one logical deduction per cycle
  • Cannot jump to conclusions
  • Cannot declare FINAL without full prerequisite reasoning

Formal reasoning constraint:

Dₙ = f(D₀ ∪ {d₁, d₂, …, dₙ₋₁} ∪ Fₙ₋₁)

Where:

  • D₀ = patient records + imaging + clinical references
  • dᵢ = prior deductions
  • Fₙ₋₁ = flagged disagreement regions

Each step must:

  • Reference patient data
  • Reference imaging findings
  • Reference clinical guidelines

The reasoning chain itself becomes the explanation.
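
In code terms, each cycle's prompt context is exactly the union the formula describes; a sketch (callIterativeModel is a hypothetical wrapper around the LLM call):

  // One cautious deduction per cycle: context is D0 plus all prior
  // deductions plus the flagged regions from the last comparison.
  declare function callIterativeModel(context: string): Promise<string>; // temperature 0.1

  async function nextDeduction(
    d0: string,                // patient records + imaging + references
    priorDeductions: string[], // d_1 ... d_(n-1)
    flaggedRegions: object[],  // F_(n-1)
  ): Promise<string> {
    const context = [
      d0,
      ...priorDeductions.map((d, i) => `Deduction ${i + 1}: ${d}`),
      `Flagged disagreements: ${JSON.stringify(flaggedRegions)}`,
    ].join("\n");
    // The system prompt forbids skipping steps and forbids declaring
    // FINAL before the prerequisite chain is complete.
    return callIterativeModel(context);
  }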


Comparison Model (Deterministic Validator)

  • Temperature: 0.0
  • Performs semantic matching between model outputs
  • Identifies agreement vs disagreement
  • Extracts bounding boxes of mismatched findings

If disagreement exists:

  • Flagged regions are fed back into the next reasoning cycle
  • The system forces re-evaluation

The loop runs for up to 5 cycles.
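
A sketch of the validator call, assuming a hypothetical askClaudeJson helper that sends a prompt at temperature 0.0 and parses the structured JSON reply:

  // Deterministic comparison of the two strategies' outputs (sketch).
  interface ComparisonReport {
    agreed: unknown[];           // findings both strategies support
    mismatchedRegions: object[]; // bounding boxes needing re-evaluation
    similarity: number;          // semantic similarity, 0.0 - 1.0
  }

  declare function askClaudeJson(opts: {
    temperature: number;
    prompt: string;
  }): Promise<ComparisonReport>;

  async function compare(conclusions: unknown, deduction: string): Promise<ComparisonReport> {
    return askClaudeJson({
      temperature: 0.0,
      prompt:
        "Compare the findings below. Report agreed findings, semantic similarity, " +
        "and the bounding boxes of any mismatches, as JSON.\n" +
        `Conclusions: ${JSON.stringify(conclusions)}\nDeduction: ${deduction}`,
    });
  }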


Convergence Rule

The system converges when:

  • Every aggressive conclusion has independent confirmation
  • Semantic similarity exceeds threshold
  • Iterative model declares FINAL

If convergence fails after 5 cycles:

status = human_review

Uncertainty is explicitly surfaced.

Honest doubt is safer than false confidence.
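
The check itself is a small predicate over the comparison report (a sketch using the ComparisonReport shape above; the threshold value is illustrative, not calibrated):

  // Convergence predicate (sketch).
  const SIMILARITY_THRESHOLD = 0.9; // illustrative, not a calibrated constant

  function hasConverged(report: ComparisonReport, deduction: string): boolean {
    return (
      report.mismatchedRegions.length === 0 &&     // every conclusion confirmed
      report.similarity >= SIMILARITY_THRESHOLD && // semantic agreement
      deduction.includes("FINAL")                  // iterative model has closed
    );
  }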


What the System Provides

  • Multi-timepoint scan viewer
  • Color-coded bounding boxes (severity-based)
  • Structured diagnosis & prognosis panel
  • Suggested next steps
  • Full reasoning chain (step-by-step)
  • Downloadable CSV audit trail (regulatory-ready)

How We Built It

Backend

  • Node.js + Express
  • SQLite
  • Anthropic SDK (Claude Sonnet)
  • Structured JSON outputs
  • Temperature-controlled role prompts
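
Each role is the same API with a different system prompt and temperature; a minimal sketch with the Anthropic Node SDK (the model ID, prompt text, and caseContext are placeholders):

  import Anthropic from "@anthropic-ai/sdk";

  const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
  const caseContext = "...";      // patient record + imaging findings (placeholder)

  // Temperature sets the role: 0.7 conclusions, 0.1 iterative, 0.0 comparison.
  const response = await client.messages.create({
    model: "claude-sonnet-4-20250514", // placeholder model ID
    max_tokens: 2048,
    temperature: 0.7,
    system: "You are an assertive radiology specialist. Respond only with JSON.",
    messages: [{ role: "user", content: caseContext }],
  });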

Data Layer

  • FHIR R4 synthetic patient records (Synthea)
  • DICOM imaging from TCIA
  • Python pipeline for:
    • Hounsfield Unit rescaling
    • Window/level normalization
    • DICOM-to-PNG conversion
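
The two key transforms are standard DICOM math, shown here in TypeScript for illustration (our actual pipeline implements them in Python):

  // Stored pixel value -> Hounsfield Units, via the DICOM tags
  // RescaleSlope (0028,1053) and RescaleIntercept (0028,1052).
  const toHU = (pixel: number, slope: number, intercept: number): number =>
    pixel * slope + intercept;

  // Window/level: clamp an HU range and map it to 0-255 for PNG export.
  function windowLevel(hu: number, center: number, width: number): number {
    const lo = center - width / 2;
    const hi = center + width / 2;
    const clamped = Math.min(Math.max(hu, lo), hi);
    return Math.round(((clamped - lo) / (hi - lo)) * 255);
  }

  // Example: a typical lung window (center -600 HU, width 1500 HU).
  const gray = windowLevel(toHU(1424, 1, -1024), -600, 1500);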

Database Schema

  • Patient
  • Cases (pending → running → converged / human_review)
  • Scans (multi-timepoint support)
  • Iteration_steps
  • Conclusion_runs

Every reasoning step is stored.

Explainability is structurally embedded into the database.
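
A condensed sketch of the audit-relevant tables (better-sqlite3 shown for illustration; the real schema carries more columns):

  import Database from "better-sqlite3";

  const db = new Database("medexplain.db");
  db.exec(`
    CREATE TABLE IF NOT EXISTS cases (
      id         INTEGER PRIMARY KEY,
      patient_id INTEGER NOT NULL,
      status     TEXT CHECK (status IN ('pending','running','converged','human_review'))
    );
    CREATE TABLE IF NOT EXISTS iteration_steps (
      id          INTEGER PRIMARY KEY,
      case_id     INTEGER REFERENCES cases(id),
      cycle       INTEGER,
      deduction   TEXT,     -- the reasoning step, stored verbatim
      tokens_used INTEGER
    );
  `);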


Why This Is Different

Most explainable AI:

  • Adds heatmaps
  • Adds SHAP values
  • Explains where the model looked

MedExplain:

  • Explains how the model reasoned
  • Forces independent agreement
  • Surfaces disagreement transparently
  • Stores a full audit trail

We do not trust one model to be right.

We require independent convergence between the conclusions model (breadth) and the iterative model (depth).


From Explainable AI to Autonomous Clinical Agent

MedExplain is not just a reasoning framework.

It is an autonomous radiology reasoning agent.

Given:

  • A FHIR patient record
  • A DICOM scan

It independently:

  1. Performs structured diagnostic reasoning
  2. Validates itself through convergence
  3. Flags high-risk anatomical regions
  4. Generates a structured preliminary report
  5. Escalates only when uncertainty remains

This completes a meaningful clinical task end-to-end.


Proving Agent Value

Because the system runs as a loop with structured state, we can measure:

  • Time to convergence per case
  • Number of reasoning cycles required
  • Escalation rate to human review
  • High-severity findings flagged
  • Token usage per diagnostic run
  • Cost per case

This enables:

  • Cost-per-diagnostic modeling
  • Time-saved estimates for preliminary reporting
  • Reduction in unnecessary escalations
  • Clear ROI per deployment
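
Since every stored reasoning step carries its token count, these numbers fall out of simple aggregates over the audit trail; a sketch reusing the iteration_steps table and db handle above (the per-token price is an assumption, not a quote):

  // Cost and cycle count per case, from the stored reasoning steps.
  const USD_PER_1K_TOKENS = 0.003; // illustrative price
  const caseId = 1;                // example case id

  const row = db
    .prepare(
      `SELECT COUNT(*) AS cycles, COALESCE(SUM(tokens_used), 0) AS tokens
         FROM iteration_steps WHERE case_id = ?`,
    )
    .get(caseId) as { cycles: number; tokens: number };

  const costPerCase = (row.tokens / 1000) * USD_PER_1K_TOKENS;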

MedExplain is not a seat-based tool.

It is a task-based autonomous agent.

Value is measured per diagnostic case.


Economic Model

Hospitals do not buy seats for agents.

They deploy agents to complete workflows.

MedExplain can operate under:

  • Per-case billing
  • Severity-tier billing
  • Escalation-based pricing
  • Convergence-efficiency pricing

Because every reasoning cycle and token is logged, agent economics are transparent and auditable.

The system can integrate with agent cost observability platforms to track:

  • Latency
  • Cost
  • Accuracy
  • Escalation rate
  • Convergence efficiency

This aligns directly with agent-based pricing infrastructure.


What We Learned

  • Prompt engineering defines agent behavior.
  • Explainability must be architectural.
  • Independent convergence improves reliability.
  • AI agents must expose uncertainty.
  • Agent value must be measurable per task, not per user.

What’s Next

  • Clinical validation studies
  • Integration with hospital EHR systems
  • Multi-modality imaging expansion
  • Specialized medical vision models
  • Deployment-ready observability layer
  • Alignment with AI regulatory frameworks

Closing Statement

MedExplain transforms medical AI from a black-box predictor into an autonomous, accountable reasoning agent.

Two models. Independent strategies. Converged conclusions.

Explainable by architecture.
Valuable by measurement.
