Inspiration
In the United States, physicians and their staff lose a large share of every week to prior authorization paperwork; AMA physician surveys have put the figure at roughly two business days per physician. For each prescription requiring insurance approval, a clinician has to manually extract patient data from clinical notes, look up the insurance company's specific approval criteria, evaluate whether the patient meets every criterion, and write a formal justification letter. Physicians have reported patients being seriously harmed, and in some cases dying, while waiting for prior authorization decisions. The work is mechanical, evidence-driven, and pattern-heavy, which makes it a strong fit for an AI agent. We built PA Prior Authorization Assistant to take a paperwork ritual that can run to hours per request and collapse it into about thirty seconds.
What it does
PA Prior Authorization Assistant is a FHIR-native MCP server that automates the entire prior authorization workflow. Given a patient and a proposed treatment, it performs four steps end to end:
- Fetches the patient's clinical data from a FHIR R4 server using Prompt Opinion's FHIR Context extension headers, then uses an LLM to synthesize a structured clinical profile (diagnoses with ICD-10 codes, current medications, prior failed treatments with durations and reasons, lab results, and prescriber details).
- Looks up the requested treatment's prior authorization criteria for the patient's insurance plan from a curated criteria database.
- Asks an LLM to evaluate each criterion against the patient's clinical evidence, producing per-criterion reasoning, supporting quotes from the chart, an APPROVE / DENY / NEEDS_MORE_INFO recommendation, and a confidence score.
- Drafts a complete, professional, anonymized PA request letter ready for submission to the insurance company.
For a typical case, the full pipeline runs in under thirty seconds and produces a 350-to-500-word, evidence-based letter that cites specific labs, prior medication trials, and clinical findings.
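To make the hand-off between the criteria lookup and the evaluation step concrete, here is a minimal sketch of what the data could look like. The type names, the `CRITERIA_DB` entries, and the functions `lookupCriteria` and `parseEvaluation` are all hypothetical and chosen for illustration; the project's actual schema is not shown in this write-up, and the 0-to-1 confidence range is an assumption.

```typescript
// Hypothetical shapes: the write-up does not show the project's real schema.
type Recommendation = "APPROVE" | "DENY" | "NEEDS_MORE_INFO";

interface CriterionResult {
  criterion: string;          // text of the payer criterion
  reasoning: string;          // the model's per-criterion reasoning
  supportingQuotes: string[]; // quotes lifted from the chart
}

interface Evaluation {
  results: CriterionResult[];
  recommendation: Recommendation;
  confidence: number; // assumed range: 0 to 1
}

// Placeholder criteria keyed by "plan|treatment"; not real payer policy.
const CRITERIA_DB: Record<string, string[]> = {
  "BlueCross Standard|example-biologic": [
    "Documented diagnosis of rheumatoid arthritis",
    "Documented failure of at least one prior treatment",
  ],
};

function lookupCriteria(plan: string, treatment: string): string[] | undefined {
  return CRITERIA_DB[`${plan}|${treatment.toLowerCase()}`];
}

// Validate untrusted LLM output before the letter-drafting step consumes it.
function parseEvaluation(raw: string, criteria: string[]): Evaluation {
  const data = JSON.parse(raw) as Partial<Evaluation>;
  const allowed: string[] = ["APPROVE", "DENY", "NEEDS_MORE_INFO"];
  if (typeof data.recommendation !== "string" || allowed.indexOf(data.recommendation) === -1) {
    throw new Error("Unknown recommendation");
  }
  if (typeof data.confidence !== "number" || data.confidence < 0 || data.confidence > 1) {
    throw new Error("Confidence must be a number between 0 and 1");
  }
  if (!Array.isArray(data.results) || data.results.length !== criteria.length) {
    throw new Error("Expected exactly one result per criterion");
  }
  return data as Evaluation;
}
```

Rejecting malformed output at this boundary means a bad model response fails loudly instead of producing a letter built on a missing or invented criterion.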
How we built it
The MCP server is built in TypeScript on top of Express and the Model Context Protocol SDK. It speaks the MCP JSON-RPC protocol over HTTP and declares Prompt Opinion's ai.promptopinion/fhir-context capability extension during the initialize handshake, advertising the FHIR scopes it needs (patient/Patient.rs, patient/Condition.rs, patient/MedicationStatement.rs, patient/Observation.rs, patient/MedicationRequest.rs, patient/Coverage.rs).
For LLM inference we default to Groq's free Llama 3.3 70B model with an automatic Gemini fallback for resilience. This keeps the whole stack accessible to under-resourced clinics that cannot afford paid AI infrastructure.
For demo data, we created five synthetic patients across five conditions (rheumatoid arthritis, type 2 diabetes, severe ADHD, severe asthma, treatment-resistant depression) and three synthetic insurance plans (BlueCross Standard, Aetna Premium, United Silver). Each patient is represented as a proper FHIR Bundle (Patient, Coverage, Condition, MedicationStatement, Observation, MedicationRequest) and uploaded to the public HAPI FHIR R4 sandbox. Zero real PHI is used anywhere.
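As an illustration of how a synthetic patient might be packaged for upload, the sketch below wraps a set of resources in a FHIR transaction Bundle. It assumes a transaction-style upload, which the write-up does not confirm; the function name `toTransactionBundle` and the interfaces are hypothetical. Callers supply `urn:uuid:` placeholders as `fullUrl` values so that resources can reference one another before the server assigns real IDs.

```typescript
// Loose resource type: only resourceType is required for this sketch.
interface FhirResource {
  resourceType: string;
  id?: string;
  [key: string]: unknown;
}

interface DraftEntry {
  fullUrl: string; // e.g. a urn:uuid: placeholder chosen by the caller
  resource: FhirResource;
}

interface TransactionBundle {
  resourceType: "Bundle";
  type: "transaction";
  entry: Array<DraftEntry & { request: { method: "POST"; url: string } }>;
}

// Wrap each resource in an entry that asks the server to create it (POST).
function toTransactionBundle(drafts: DraftEntry[]): TransactionBundle {
  return {
    resourceType: "Bundle",
    type: "transaction",
    entry: drafts.map((d) => ({
      fullUrl: d.fullUrl,
      resource: d.resource,
      request: { method: "POST" as const, url: d.resource.resourceType },
    })),
  };
}

// Usage with entirely synthetic content.
const patientUrn = "urn:uuid:11111111-1111-4111-8111-111111111111";
const demoBundle = toTransactionBundle([
  { fullUrl: patientUrn, resource: { resourceType: "Patient" } },
  {
    fullUrl: "urn:uuid:22222222-2222-4222-8222-222222222222",
    resource: { resourceType: "Condition", subject: { reference: patientUrn } },
  },
]);
```

On a transaction, a conforming server rewrites the `urn:uuid:` references to the IDs it assigns, so the Condition ends up pointing at the newly created Patient.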
The server is deployed on Render with a GitHub Actions cron that pings the health endpoint every ten minutes to prevent cold-start delays. The MCP server is published in the Prompt Opinion Marketplace.
Challenges we ran into
The biggest challenge was correctly implementing Prompt Opinion's FHIR Context extension. Their docs are sparse, and the spec is newer than most public examples of MCP servers. We had to study their documentation carefully to understand how to declare the extension during the MCP initialize handshake and how to read the X-FHIR-Server-URL, X-Patient-ID, and X-FHIR-Access-Token headers on every tool call.
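For readers wiring up something similar, here is a minimal sketch of pulling the three headers off an incoming request. The function `readFhirContext` and the `FhirContext` type are hypothetical names, not part of any SDK, and treating the access token as optional is an assumption (the public HAPI sandbox does not require one). Node lowercases incoming header names, so the lookup normalizes case.

```typescript
interface FhirContext {
  serverUrl: string;
  patientId: string;
  accessToken?: string; // assumed optional for open sandboxes
}

// Matches the shape of Node's req.headers closely enough for this sketch.
type HeaderBag = Record<string, string | string[] | undefined>;

function readFhirContext(headers: HeaderBag): FhirContext {
  const get = (name: string): string | undefined => {
    const value = headers[name.toLowerCase()];
    return Array.isArray(value) ? value[0] : value;
  };

  const serverUrl = get("X-FHIR-Server-URL");
  const patientId = get("X-Patient-ID");
  if (!serverUrl || !patientId) {
    throw new Error("Missing FHIR context headers: X-FHIR-Server-URL and X-Patient-ID are required");
  }

  return {
    serverUrl: serverUrl.replace(/\/+$/, ""), // drop trailing slashes
    patientId,
    accessToken: get("X-FHIR-Access-Token"),
  };
}
```

Failing fast with a clear message when a header is missing makes misconfigured callers much easier to diagnose than a downstream 404 from the FHIR server.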
Another challenge was the Render free tier's tendency to spin services down after fifteen minutes of inactivity. We solved this with a free GitHub Actions workflow that pings the health endpoint every ten minutes, keeping the server warm without paying for an always-on plan.
Rate limit management on free LLM tiers was a real consideration too. We added retry-with-exponential-backoff for transient rate limits and an automatic provider fallback so the service degrades gracefully under load.
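The retry-and-fallback idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the names `backoffDelayMs`, `isRateLimit`, `withRetry`, and `completeWithFallback` are hypothetical, the 500 ms base and 8 s cap are arbitrary, and rate limits are assumed to surface as errors carrying an HTTP 429 status.

```typescript
// A provider call: prompt in, completion text out.
type Completion = (prompt: string) => Promise<string>;

// Exponential backoff: base * 2^attempt, capped so waits stay bounded.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Assumption: rate-limit errors expose an HTTP status of 429.
function isRateLimit(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { status?: unknown; response?: { status?: unknown } };
  return e.status === 429 || (e.response !== undefined && e.response.status === 429);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry only on rate limits; any other error is rethrown immediately.
async function withRetry(call: Completion, prompt: string, maxAttempts = 3): Promise<string> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call(prompt);
    } catch (err) {
      if (!isRateLimit(err) || attempt + 1 >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}

// Try the primary provider first; if it still fails, try the fallback.
async function completeWithFallback(
  primary: Completion,
  fallback: Completion,
  prompt: string,
): Promise<string> {
  try {
    return await withRetry(primary, prompt);
  } catch (primaryErr) {
    return withRetry(fallback, prompt);
  }
}
```

In this arrangement `primary` would wrap the Groq call and `fallback` the Gemini call. Adding random jitter to the delay is a common refinement when many requests hit the same limit at once.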
Accomplishments that we're proud of
The system works end to end against a live FHIR server, not mocked responses. When you call build_patient_profile with the patient ID of Alice, one of our synthetic patients, our server actually performs HTTP GET requests against the public HAPI FHIR sandbox, pulls her Patient, Condition, MedicationStatement, and Observation resources, sends them through Llama 3.3 70B for synthesis, and returns a structured profile that the next tool can consume.
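The GET requests described here follow standard FHIR R4 conventions: a read for the Patient and a search by the `patient` parameter for the other resource types. A minimal sketch of building those URLs is below. The function name `patientResourceUrls` and the example base URL are placeholders, and the actual HTTP call (made with axios in our stack) is left out so the snippet stays self-contained.

```typescript
// Resource types fetched by search; Patient is fetched by direct read.
const SEARCHED_TYPES = ["Condition", "MedicationStatement", "Observation"];

function patientResourceUrls(baseUrl: string, patientId: string): string[] {
  const base = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  const id = encodeURIComponent(patientId);
  return [
    `${base}/Patient/${id}`, // FHIR read: [base]/[type]/[id]
    ...SEARCHED_TYPES.map((type) => `${base}/${type}?patient=${id}`), // FHIR search
  ];
}

// Placeholder server and ID, for illustration only.
const urls = patientResourceUrls("https://example.org/fhir/", "alice-1");
```

Each search returns a `searchset` Bundle, so the caller would unwrap `entry[].resource` before handing the resources to the synthesis prompt.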
The generated PA letters read like real PA letters. They cite specific lab values with dates, name the prior failed medications with durations and dosing, address each insurance criterion explicitly, and use proper medical terminology. We believe a practicing rheumatologist could submit one of these letters with minor edits.
The architecture is clean and reusable. The same four tools handle five very different conditions across four specialties (rheumatology, endocrinology, psychiatry, pulmonology) and three different insurance plans without any per-condition special-casing.
What we learned
We learned a great deal about the practical realities of healthcare interoperability. Designing for FHIR is very different from designing for arbitrary REST APIs because the data model carries strong clinical semantics. We also learned how MCP fits into a healthcare AI stack: it provides a standard way for any AI agent on any platform to discover and invoke specialized clinical tools without bespoke integration work.
On the LLM side, we learned that good prompt design plus structured FHIR input can produce clinical text that holds up to professional scrutiny, and that open-weight models available on free tiers, like Llama 3.3 70B, are capable of this work when the prompts give them the right structure.
What's next for PA Prior Authorization Assistant
The natural next steps are expanding the criteria database to cover more drugs and insurance plans, integrating with real EHR FHIR endpoints (Epic, Cerner, MEDITECH), adding an appeals tool that drafts denial response letters, and building a per-call audit trail so clinics can show how each PA recommendation was generated. Longer term, we want to add a feedback loop where approved and denied outcomes inform future prompt revisions, with the aim of raising acceptance rates further.
Built With
- axios
- express.js
- fhir
- gemini
- github-actions
- groq
- hapi-fhir
- llama-3.3
- mcp
- model-context-protocol
- node.js
- render
- typescript
- zod