AI prior authorization agent

Inspiration

Every year, prior authorization delays contribute to thousands of preventable deaths. A 2023 AMA survey found that 93% of physicians report PA causes delays in care, and 1 in 4 patients abandon treatment entirely because of the wait. We watched a family member's cancer medication get held up for 11 days while an overworked office manager manually filled out a 14-page form — and we knew there had to be a better way.

What it does

Our app automates the entire prior authorization workflow using a multi-agent AI system. A physician selects a patient, describes the request (or uses the built-in AI medical scribe to transcribe the appointment), and launches four specialized AI agents that run in parallel: one identifies documentation gaps, one analyzes patient records, one compiles step therapy history, and one builds the clinical justification. The agents then collaborate to produce a fully completed, submission-ready PA form in under 90 seconds.

How we built it

We built the app on Next.js 15 with the Anthropic API powering all agents using Claude Sonnet. Each agent streams its output in real time via Server-Sent Events so physicians can watch the analysis happen live. The medical scribe uses Deepgram's nova-2-medical model over a live WebSocket for real-time transcription. Patient records are loaded from a structured EHR file system, and the completed form can be rendered as a printable PDF via a Nunjucks HTML template.
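As an illustration of the streaming layer described above, here is a minimal sketch of how one agent chunk could be serialized into a Server-Sent Events frame and parsed back on the client. The agent names, the `AgentChunk` shape, and both helper functions are hypothetical; the write-up does not show the project's actual wire format.

```typescript
// Hypothetical agent identifiers; the project's real names are not given.
type AgentName = "gaps" | "records" | "stepTherapy" | "justification";

// Hypothetical payload: one piece of streamed text tagged with its agent.
interface AgentChunk {
  agent: AgentName;
  text: string;
}

// Serialize one payload as a single SSE frame.
// JSON.stringify escapes newlines, so the payload always fits on one `data:` line,
// and the trailing blank line terminates the frame.
function formatSseEvent(event: string, payload: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Parse a single frame back into its event name and JSON payload.
function parseSseEvent(frame: string): { event: string; data: unknown } {
  let event = "message"; // default event type when no `event:` line is present
  const dataLines: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event: ")) {
      event = line.slice("event: ".length);
    } else if (line.startsWith("data: ")) {
      dataLines.push(line.slice("data: ".length));
    }
  }
  return { event, data: JSON.parse(dataLines.join("\n")) };
}

const chunk: AgentChunk = { agent: "gaps", text: "Missing: prior imaging report" };
const frame = formatSseEvent("chunk", chunk);
console.log(frame);
```

Tagging every frame with the agent name is one way to let a single response carry all four agents while the UI routes each chunk to the right panel.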

Challenges we ran into

Getting four agents to stream concurrently into a single SSE response without blocking each other took significant architectural work. We also hit subtle issues with the form agent re-running when it shouldn't — solved by caching parallel agent outputs and routing Generate Form through a dedicated /api/form endpoint. Keeping the Deepgram WebSocket and MediaRecorder in sync with React state was trickier than expected, especially around cleanup on unmount.
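One way to approach the concurrency problem described above is to merge several async iterables into one tagged stream and accumulate each agent's text as it arrives, so that a later form-generation step can reuse the cached text instead of re-running the agents. The sketch below is an assumption about how this could be done, not the project's code: `mergeStreams`, `collectOutputs`, and `fakeAgent` are hypothetical names, and `fakeAgent` merely stands in for a real model stream.

```typescript
interface Tagged<T> {
  source: string;
  value: T;
}

type Pending<T> = Promise<{ source: string; result: IteratorResult<T> }>;

// Interleave values from several async iterables as they become available.
// Each source always has at most one outstanding next() call, so a slow
// source never blocks the others.
async function* mergeStreams<T>(
  sources: Record<string, AsyncIterable<T>>,
): AsyncGenerator<Tagged<T>> {
  const iterators = new Map<string, AsyncIterator<T>>();
  const pending = new Map<string, Pending<T>>();

  const pull = (source: string, it: AsyncIterator<T>): void => {
    pending.set(source, it.next().then((result) => ({ source, result })));
  };

  for (const [source, iterable] of Object.entries(sources)) {
    const it = iterable[Symbol.asyncIterator]();
    iterators.set(source, it);
    pull(source, it);
  }

  while (pending.size > 0) {
    const { source, result } = await Promise.race(Array.from(pending.values()));
    if (result.done) {
      pending.delete(source); // this source is exhausted
    } else {
      pull(source, iterators.get(source)!); // request the next chunk right away
      yield { source, value: result.value };
    }
  }
}

// Accumulate the full text per source; the returned map can serve as a cache.
async function collectOutputs(
  sources: Record<string, AsyncIterable<string>>,
): Promise<Map<string, string>> {
  const cache = new Map<string, string>();
  for await (const { source, value } of mergeStreams(sources)) {
    cache.set(source, (cache.get(source) ?? "") + value);
  }
  return cache;
}

// Stand-in for a streaming agent: yields fixed chunks with a delay.
async function* fakeAgent(chunks: string[], delayMs: number): AsyncGenerator<string> {
  for (const c of chunks) {
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    yield c;
  }
}
```

In this sketch an error thrown by any one source rejects the whole merged stream; a production version would likely catch per-source failures and emit them as error events instead.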

Accomplishments that we're proud of

We're proud that the full pipeline — from raw patient files to a completed, structured PA form — works end-to-end in under two minutes. The medical scribe integration feels genuinely useful: a physician can have a natural conversation with a patient, stop recording, and have the transcript automatically enrich the AI's context without polluting their own written request. The streaming UI also makes the agent reasoning transparent and trustworthy rather than a black box.

What we learned

Prompt engineering for structured JSON output is surprisingly hard at scale — small ambiguities in the schema description cause the model to hallucinate fields or nest incorrectly. We also learned that separating concerns between agents (each owning one domain) produces dramatically better outputs than a single monolithic prompt. And real-time audio in the browser is far more fragile than the Web APIs make it look.
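The failure modes mentioned above (invented fields, wrong nesting) can be caught before the output reaches the form by validating the model's JSON strictly. The sketch below is a hand-rolled check with no dependencies; the `JustificationOutput` shape and its field names are hypothetical and only illustrate the idea, since the write-up does not show the real schema.

```typescript
// Hypothetical output shape for one agent.
interface JustificationOutput {
  diagnosisCode: string;
  medication: string;
  priorTherapies: string[];
}

type ValidationResult =
  | { ok: true; value: JustificationOutput }
  | { ok: false; errors: string[] };

const ALLOWED_KEYS: readonly string[] = ["diagnosisCode", "medication", "priorTherapies"];

// Parse the raw model text and reject anything that deviates from the schema,
// including extra fields the model may have invented.
function validateJustification(raw: string): ValidationResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, errors: ["output is not valid JSON"] };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, errors: ["top level must be a JSON object"] };
  }

  const obj = parsed as Record<string, unknown>;
  const errors: string[] = [];

  for (const key of Object.keys(obj)) {
    if (!ALLOWED_KEYS.includes(key)) errors.push(`unexpected field: ${key}`);
  }
  if (typeof obj.diagnosisCode !== "string") errors.push("diagnosisCode must be a string");
  if (typeof obj.medication !== "string") errors.push("medication must be a string");
  const therapies = obj.priorTherapies;
  if (!Array.isArray(therapies) || !therapies.every((t) => typeof t === "string")) {
    errors.push("priorTherapies must be an array of strings");
  }

  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: obj as unknown as JustificationOutput };
}
```

The collected error strings could be sent back to the model in a retry prompt, which tends to be more useful than a bare "invalid JSON" message.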

What's next for AI prior authorization agent

Next, we plan to add direct EHR integration via FHIR APIs so patient records are pulled automatically, along with payer-specific form templates, since every insurance company uses a different format. We also want a feedback loop in which denial reasons are fed back to the agents to strengthen future submissions, and a physician review and e-signature step before submission, so the human stays in the loop for final sign-off.

How we used our sponsors

To make this system production-ready, scalable, and secure, we integrated several sponsor tools directly into our architecture:

AWS We deployed our full-stack app on AWS, using its cloud infrastructure to handle real-time streaming, agent orchestration, and file storage for patient records and generated PDFs. AWS ensures the system can scale reliably across clinics and large healthcare networks.

Kiro (AWS agentic IDE) We used Kiro to plan and structure our multi-agent architecture before implementation. It helped us reason through concurrency challenges—like coordinating four streaming agents—before writing production code, saving significant debugging time.

Auth0 Healthcare data demands strict security. Auth0 powers authentication and role-based access control, ensuring only authorized physicians can access patient records and generate prior authorizations.

Bland AI While we used Deepgram for transcription, Bland AI enables future expansion into fully self-hosted voice agents, allowing clinics to run HIPAA-compliant voice pipelines without relying on external APIs.

Airbyte Airbyte powers our data ingestion pipelines, syncing structured patient data from EHR systems into our internal format. This will be especially critical as we expand to real-time FHIR integrations.

Aerospike We use Aerospike as a low-latency database to cache intermediate agent outputs and patient data, enabling sub-90-second generation times and preventing redundant computation across agent runs.

TrueFoundry TrueFoundry helps us deploy, monitor, and scale our AI agents in production. It provides observability into each agent’s performance, latency, and failure modes—critical for a healthcare-facing system.

Overmind Overmind allows us to continuously optimize agent behavior by analyzing outputs and iterating on prompts. This is especially valuable for improving structured JSON reliability and reducing hallucinations in clinical documentation.

Macroscope Macroscope helps us understand and navigate our growing codebase, especially as we scale the multi-agent architecture. It’s been useful for debugging complex interactions between streaming, caching, and API routing.