AI-Powered Healthcare FWA Detection

Dashboard by autoresearch

Inspiration

Healthcare fraud costs the US an estimated $100–300 billion annually. Despite widespread automation, most detection is still rule-based and static, and it lags months behind evolving schemes. We wanted an AI system that not only flags suspicious claims in real time but also continuously improves its own detection rules without human intervention.

What It Does

HCendGame detects three categories of healthcare billing anomalies:
• Fraud: Intentional deception, e.g., Keytruda billed without a cancer diagnosis
• Waste: Overutilization, e.g., Ozempic prescribed for hypertension-only patients
• Abuse: Inconsistent practices, e.g., HCC upcoding to inflate CMS risk scores

Detection runs through a two-stage pipeline:
• Clinical rule engine: ICD-10/NDC/HCC cross-validation, therapeutic mismatch, quantity limits
• Amazon Nova Pro (Bedrock): natural-language reasoning for edge cases, pattern synthesis, AutoResearch rule proposals
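As an illustration of the first stage, here is a minimal sketch of a therapeutic-mismatch check. The rule table, the claim fields (`drug`, `diagnoses`), and the simplified ICD-10 prefixes are assumptions made for this example; they are not the project's actual rule set.

```javascript
// Hypothetical rule table: each drug maps to the ICD-10 prefixes that justify it.
// Prefixes are simplified for illustration and are not a clinical reference.
const THERAPEUTIC_RULES = new Map([
  ["keytruda", { category: "FRAUD", requiredPrefixes: ["C"] }],  // malignant neoplasms
  ["ozempic", { category: "WASTE", requiredPrefixes: ["E11"] }], // type 2 diabetes
]);

// Returns a flag object when no diagnosis on the claim supports the drug,
// or null when the claim passes (or the drug has no rule).
function checkTherapeuticMismatch(claim) {
  const rule = THERAPEUTIC_RULES.get(claim.drug.toLowerCase());
  if (!rule) return null;

  const supported = claim.diagnoses.some((code) =>
    rule.requiredPrefixes.some((prefix) => code.toUpperCase().startsWith(prefix))
  );
  if (supported) return null;

  return {
    rule: "THERAPEUTIC_MISMATCH",
    category: rule.category,
    drug: claim.drug,
    diagnoses: claim.diagnoses,
  };
}
```

For example, `checkTherapeuticMismatch({ drug: "Ozempic", diagnoses: ["I10"] })` returns a flag in the `WASTE` category, while the same drug with an `E11.9` diagnosis returns `null`. Claims that a rule like this cannot settle would be the natural candidates to pass on to the second stage.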

6 Modules

| Module | Description |
| --- | --- |
| Single Claim | Real-time validation across 5 fraud scenarios |
| Batch Analysis | 500 synthetic claims, 15% planted anomaly rate |
| Network Graph | Kickback rings, hub providers, doctor-shopping detection |
| Temporal Analysis | Monthly billing spike detection (SVG chart, zero dependencies) |
| AI Investigator | Natural-language queries → structured evidence briefs via Nova Pro |
| AutoResearch | Autonomous rule improvement loop (best F1: 0.878) |
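The Temporal Analysis module is described only as monthly billing spike detection. One simple way to do that, sketched here under assumptions, is to compare each month against the median of all months. The `month`/`total` field names and the default ratio of 2 are hypothetical choices for this example, not values taken from the project.

```javascript
// Median of a list of numbers (does not mutate the input).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

// Flags months whose billed total exceeds `ratio` times the median month.
// Returns an empty list when there is no data or the baseline is zero.
function findBillingSpikes(monthlyTotals, ratio = 2) {
  if (monthlyTotals.length === 0) return [];
  const baseline = median(monthlyTotals.map((m) => m.total));
  if (baseline <= 0) return [];
  return monthlyTotals
    .filter((m) => m.total > ratio * baseline)
    .map((m) => ({ ...m, timesBaseline: m.total / baseline }));
}
```

The median is used instead of the mean so that the spike itself does not drag the baseline upward.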

AutoResearch: The Core Innovation

"One day, insurance fraud used to be caught by meat computers reviewing stacks of claims between coffee breaks and department meetings. That era is long gone." (adapted from @karpathy, March 2026)

Inspired by Karpathy's autoresearch LOOP FOREVER methodology, our AutoResearch tab runs autonomously:
• Agent proposes a new detection rule
• Validates on 500 synthetic claims
• If F1 improves → keep the commit
• If equal or worse → git reset --hard HEAD~1 (discard)
• Repeat indefinitely
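The keep-or-discard decision rests on the F1 score of each candidate rule set. As a reminder of what that metric measures, here is a small sketch that computes F1 from predicted flags and the planted anomaly labels. The parallel boolean arrays are an assumed input shape for this example.

```javascript
// F1 = 2*TP / (2*TP + FP + FN), computed over parallel boolean arrays.
// predicted[i] is true when the rule engine flagged claim i;
// actual[i] is true when claim i is a planted anomaly.
function f1Score(predicted, actual) {
  if (predicted.length !== actual.length) {
    throw new Error("predicted and actual must have the same length");
  }
  let tp = 0;
  let fp = 0;
  let fn = 0;
  for (let i = 0; i < predicted.length; i += 1) {
    if (predicted[i] && actual[i]) tp += 1;
    else if (predicted[i] && !actual[i]) fp += 1;
    else if (!predicted[i] && actual[i]) fn += 1;
  }
  const denominator = 2 * tp + fp + fn;
  return denominator === 0 ? 0 : (2 * tp) / denominator;
}
```

Because F1 balances precision and recall, a new rule that catches more anomalies but also flags many clean claims can still lower the score and be discarded.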

F1 progression across 10 experiments:
0.821 → 0.829 → 0.837 → 0.843 → 0.836 (discard) → 0.851 → 0.858 → 0.862 → 0.855 (discard) → 0.878

Best F1: 0.878 | Rules kept: 8/10 | Rule added: TEMPORAL_CLUSTERING
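The keep/discard bookkeeping can be checked by replaying the reported scores through the same acceptance rule. This is a sketch only: it assumes the first experiment counts as a kept result and that a score is kept strictly when it beats the best so far. The real loop also commits or resets the repository, which is omitted here.

```javascript
// Replays a sequence of F1 scores through the "keep only if strictly better" rule.
function replayExperiments(scores) {
  let best = -Infinity;
  const log = [];
  for (const f1 of scores) {
    const keep = f1 > best; // equal or worse is discarded
    if (keep) best = f1;
    log.push({ f1, decision: keep ? "keep" : "discard" });
  }
  const kept = log.filter((entry) => entry.decision === "keep").length;
  return { best, kept, total: scores.length, log };
}

// The ten scores reported above.
const reportedScores = [0.821, 0.829, 0.837, 0.843, 0.836, 0.851, 0.858, 0.862, 0.855, 0.878];
const summary = replayExperiments(reportedScores);
console.log(`Best F1: ${summary.best} | Rules kept: ${summary.kept}/${summary.total}`);
```

Under those assumptions the replay discards 0.836 and 0.855 and keeps the other eight, which agrees with the 8/10 figure.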

How We Built It

• React 19 + Vite 7 + Tailwind CSS 3: single-file architecture (RXHCCnva.jsx, ~2,200 lines)
• Amazon Nova Pro via Bedrock Converse API (temperature: 0.3, structured JSON output)
• ICD-10/NDC/HCC clinical rule engine: coded inline, zero external dependencies
• AutoResearch loop: useCallback + useEffect + arAutoLoopRef for clean async control
• Inline SVG charts: no charting library required
• Deployed to Vercel from local build (rootDirectory management via Vercel API)
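To make the Nova Pro integration more concrete, here is a hedged sketch of how a Converse API request with temperature 0.3 and a JSON-only instruction might be assembled, plus a defensive parser for the reply. The model ID, the system prompt wording, the `maxTokens` value, and the `verdict`/`reasoning` schema are all assumptions for illustration; the project's actual prompt and schema are not shown in this write-up.

```javascript
// Assumed model ID for Nova Pro; some regions require an inference profile ID instead.
const NOVA_PRO_MODEL_ID = "amazon.nova-pro-v1:0";

// Builds the input object for a Bedrock Converse call for one claim.
function buildConverseRequest(claim) {
  return {
    modelId: NOVA_PRO_MODEL_ID,
    system: [
      {
        text:
          "You are a healthcare FWA reviewer. Reply with JSON only, shaped as " +
          '{"verdict": "FRAUD" | "WASTE" | "ABUSE" | "CLEAN", "reasoning": string}. ' +
          "Cite only codes that appear in the claim.",
      },
    ],
    messages: [
      { role: "user", content: [{ text: `Review this claim:\n${JSON.stringify(claim)}` }] },
    ],
    inferenceConfig: { temperature: 0.3, maxTokens: 512 },
  };
}

// Parses the model's text reply; returns null instead of throwing on malformed output.
function parseVerdict(responseText) {
  try {
    const parsed = JSON.parse(responseText);
    return typeof parsed.verdict === "string" ? parsed : null;
  } catch {
    return null;
  }
}

// With the AWS SDK for JavaScript v3, sending the request would look roughly like:
//   const { BedrockRuntimeClient, ConverseCommand } = require("@aws-sdk/client-bedrock-runtime");
//   const client = new BedrockRuntimeClient({ region: "us-east-1" });
//   const response = await client.send(new ConverseCommand(buildConverseRequest(claim)));
//   const verdict = parseVerdict(response.output.message.content[0].text);
```

Treating an unparseable reply as `null` rather than an exception lets the caller fall back to the rule engine's result when the model strays from the requested format.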

Challenges

• Stopping the loop cleanly: React's stale closure problem required a useRef (arAutoLoopRef) alongside useState
• Nova Pro prompt engineering: getting structured JSON with clinical reasoning in a single call without hallucinating ICD codes
• Vercel deployment: GitHub root contains mixed Python/React files; solved via local deploy with project.json swap + API rootDirectory reset
• Batch progress UX: synchronous claim processing blocked the React render loop; fixed by chunking with await new Promise(r => setTimeout(r, 0)) to yield between batches
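The chunking fix for batch progress can be sketched in plain JavaScript. The function name, the default chunk size of 25, and the `onProgress` callback are assumptions for this example; only the `setTimeout(r, 0)` yield comes from the description above.

```javascript
// Processes claims in fixed-size chunks and yields to the event loop between chunks,
// so a UI framework gets a chance to render progress updates.
async function processInChunks(claims, processClaim, { chunkSize = 25, onProgress = () => {} } = {}) {
  const results = [];
  for (let start = 0; start < claims.length; start += chunkSize) {
    for (const claim of claims.slice(start, start + chunkSize)) {
      results.push(processClaim(claim));
    }
    onProgress(results.length, claims.length);
    // Yield: lets pending renders and input events run before the next chunk.
    await new Promise((resolve) => setTimeout(resolve, 0));
  }
  return results;
}
```

In a React component, `onProgress` would typically call a state setter such as a progress percentage. Smaller chunks give smoother progress updates at the cost of more timer round trips.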

Accomplishments that we're proud of

• AutoResearch tab: a fully working demo of Karpathy's autonomous research loop applied to FWA, not a mockup
• Zero external chart/table libraries: everything renders in pure SVG and JSX
• Nova Pro integration that produces clinically grounded, evidence-structured responses
• Conversational AI Investigator with multi-turn memory: ask a follow-up and it remembers what you asked before
• CSV audit export: flagged claims download with date-stamped filename, ready for SIU hand-off
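The CSV audit export can be illustrated with a short sketch. The column names, the filename prefix, and the use of the UTC date are hypothetical; the write-up states only that the download has a date-stamped filename.

```javascript
// Quotes a field when it contains a comma, quote, or line break (RFC 4180 style).
function escapeCsvField(value) {
  const text = String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Builds CSV text with a header row from a list of flagged claims.
function buildAuditCsv(flaggedClaims, columns) {
  const header = columns.map(escapeCsvField).join(",");
  const rows = flaggedClaims.map((claim) =>
    columns.map((column) => escapeCsvField(claim[column])).join(",")
  );
  return [header, ...rows].join("\r\n");
}

// Date-stamped filename using the UTC calendar date (YYYY-MM-DD).
function auditFilename(date = new Date()) {
  return `fwa_flagged_claims_${date.toISOString().slice(0, 10)}.csv`;
}
```

In the browser, the resulting text would usually be wrapped in a `Blob` and offered through a temporary download link. Escaping matters here because free-text reasons often contain commas.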

What We Learned

• Autonomous AI improvement loops are viable even for domain-specific rule engines
• React ref + state duality is the right pattern for controlling async loops
• Nova Pro handles multi-step clinical reasoning well when given a tightly scoped system prompt
• Color-coded risk gauges (not just percentages) dramatically reduce time-to-decision for reviewers
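The ref-plus-state lesson can be shown without React. In a component, a long-running async loop closes over the state value from the render in which it started, so it never sees a later "stop" update. A ref is a mutable object read at call time, so the loop sees the current value. The sketch below uses a plain `{ current }` object as a stand-in for `useRef`; the controller shape and names are assumptions, not the project's code.

```javascript
// A loop controller whose running flag lives in a ref-like object.
// In React, runningRef would come from useRef, with a useState copy kept for rendering.
function createLoopController() {
  const runningRef = { current: false };

  // Runs `step` repeatedly until stop() is called or maxIterations is reached.
  // Resolves with the number of completed iterations.
  async function start(step, maxIterations = Infinity) {
    runningRef.current = true;
    let iterations = 0;
    while (runningRef.current && iterations < maxIterations) {
      await step(iterations);
      iterations += 1;
      // Yield so that a stop request (e.g. a button click) can be handled.
      await new Promise((resolve) => setTimeout(resolve, 0));
    }
    runningRef.current = false;
    return iterations;
  }

  function stop() {
    runningRef.current = false; // visible to the loop on its next check
  }

  return { start, stop, runningRef };
}
```

The state copy still matters in React: changing a ref does not trigger a re-render, so the button label or status badge needs the `useState` value, while the loop itself reads the ref.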

What's Next

• Connect to real CMS claims data via AWS HealthLake
• Multi-agent AutoResearch: parallel rule proposals with council voting
• Export flagged claims to PDF audit reports
• HCC risk-score drift monitoring dashboard
• Streaming token output from Nova Pro for real-time investigator responses

Built With

amazon-bedrock-converse-api
amazon-web-services
amazonnova
javascript
node.js
react

Updates

HK Chun started this project — Apr 16, 2026 09:35 PM EDT
