Inspiration

Medicaid serves 70 to 80 million people every year, roughly 1 in 4 Americans. With that many enrollees, investigative capacity is stretched thin. Parsing millions, if not billions, of data entries and assessing billing patterns that look suspicious is slow, manual, and full of false positives. Medicaid Sherlock is an open-source multi-agent AI system that helps investigators triage leads faster and catch perpetrators more effectively.

What it does

Medicaid Sherlock is a multi-agent AI system that analyzes 227 million Medicaid claim rows from 2018 through 2024 and flags providers with unusual billing patterns using five anomaly detectors (spending outliers, cost-per-claim outliers, billing spikes, billing/service mismatch, and procedure concentration). It then uses a five-agent workflow to turn any flagged provider into a nuanced case file that includes peer comparisons, time-series evidence, billing network relationships, HCPCS plain-English translation, and public context, separating explained structural patterns from unexplained signals that warrant review.
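To make the detector idea concrete, here is a minimal sketch of what a cost-per-claim outlier check could look like. It is not the project's actual code: the function names, the `min_ratio` cutoff, the rule of "above peer p99 and at least 3x the peer median", and the synthetic sample rows are all assumptions made for illustration, and plain Python stands in for the real query layer.

```python
from collections import defaultdict
from statistics import median, quantiles

# Synthetic rows for illustration only (not real claims data):
# (provider_npi, hcpcs_code, year, total_paid, claim_count)
SAMPLE_ROWS = [(f"PEER{i}", "99213", 2023, 100 * (48 + i), 100) for i in range(10)]
SAMPLE_ROWS.append(("OUTLIER", "99213", 2023, 50_000, 100))


def build_benchmarks(rows):
    """Compute median, p95, and p99 cost per claim for each (HCPCS, year) peer group."""
    groups = defaultdict(list)
    for _npi, hcpcs, year, paid, claims in rows:
        if claims > 0:
            groups[(hcpcs, year)].append(paid / claims)
    benchmarks = {}
    for key, values in groups.items():
        if len(values) < 2:
            continue  # statistics.quantiles needs at least two data points
        cuts = quantiles(values, n=100, method="inclusive")  # 99 percentile cut points
        benchmarks[key] = {"median": median(values), "p95": cuts[94], "p99": cuts[98]}
    return benchmarks


def flag_cost_outliers(rows, benchmarks, min_ratio=3.0):
    """Flag rows above the peer p99 that are also >= min_ratio times the peer median.

    The two-part rule is a hypothetical choice: the percentile test alone would
    always flag the top provider in a small peer group.
    """
    flags = []
    for npi, hcpcs, year, paid, claims in rows:
        bench = benchmarks.get((hcpcs, year))
        if bench is None or claims <= 0 or bench["median"] <= 0:
            continue
        cost_per_claim = paid / claims
        ratio = cost_per_claim / bench["median"]
        if cost_per_claim > bench["p99"] and ratio >= min_ratio:
            flags.append({
                "npi": npi,
                "category": "cost_per_claim_outlier",
                "score": ratio,  # how many times the peer median
                "rationale": f"{cost_per_claim:.2f} per claim vs peer median {bench['median']:.2f}",
            })
    return flags


BENCHMARKS = build_benchmarks(SAMPLE_ROWS)
FLAGS = flag_cost_outliers(SAMPLE_ROWS, BENCHMARKS)
for flag in FLAGS:
    print(flag["npi"], flag["rationale"])
```

Each flag carries its own rationale string so that a downstream report can show why the provider was surfaced, not just that it was.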

How we built it

Data pipeline (DuckDB + Parquet): Pre-aggregated benchmarks (HCPCS×year medians/p95/p99), provider summaries, and monthly time series for fast analytics over a huge dataset.
Anomaly detection layer: Implemented five complementary detectors and unified all flags into a single anomalies table with scores and categories.
Cross-referencing engine: Intersected risk dimensions to surface high-signal leads (e.g., multi-flag providers, extreme cost outliers, new high spenders, flagged networks).
Enrichment: NPPES NPI Registry API to classify providers (organization vs individual, specialty/taxonomy, location) for context-aware scoring; HCPCS lookup (official CMS file + common CPTs) to translate procedure codes; Perplexity Sonar to pull relevant public records and enforcement context.
Multi-agent system (Claude Sonnet): Investigator orchestrates Analyst + Network Mapper + Researcher + Report Generator to produce structured reports with evidence and caveats.
Frontend (demo): An interactive UI for exploring anomalies, drilling into providers, visualizing networks, and generating a report for screen-recorded demos.
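The unify-then-cross-reference step can be sketched roughly as follows. This is an illustrative stand-in in plain Python, not the DuckDB implementation: the `Anomaly` record, the `multi_flag_providers` helper, the `min_categories` parameter, the ranking rule, and the sample NPIs are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Anomaly:
    """One row of a unified anomalies table (hypothetical schema)."""
    npi: str
    category: str  # which detector raised the flag
    score: float


# Made-up flags as they might arrive from the separate detectors.
SAMPLE_ANOMALIES = [
    Anomaly("1111111111", "spending_outlier", 4.2),
    Anomaly("1111111111", "billing_spike", 2.5),
    Anomaly("1111111111", "procedure_concentration", 1.8),
    Anomaly("2222222222", "billing_spike", 9.0),
    Anomaly("3333333333", "cost_per_claim_outlier", 3.1),
    Anomaly("3333333333", "cost_per_claim_outlier", 5.0),  # same detector, later period
    Anomaly("3333333333", "billing_service_mismatch", 2.0),
]


def multi_flag_providers(anomalies, min_categories=2):
    """Return providers flagged by at least min_categories distinct detectors.

    Each lead is (npi, sorted category names, summed best score per category),
    ranked by number of categories, then by total score.
    """
    by_provider = defaultdict(dict)
    for a in anomalies:
        cats = by_provider[a.npi]
        # Keep only the highest score seen for each detector category.
        cats[a.category] = max(a.score, cats.get(a.category, a.score))
    leads = [
        (npi, sorted(cats), sum(cats.values()))
        for npi, cats in by_provider.items()
        if len(cats) >= min_categories
    ]
    leads.sort(key=lambda lead: (-len(lead[1]), -lead[2], lead[0]))
    return leads


LEADS = multi_flag_providers(SAMPLE_ANOMALIES)
for npi, categories, total in LEADS:
    print(npi, categories, round(total, 2))
```

Counting distinct detector categories, rather than raw flag rows, keeps a provider that trips the same detector repeatedly from outranking one that looks unusual along several independent dimensions.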

Challenges we ran into

Scale: Working with a massive dataset required careful pre-aggregation, indexing strategies, and avoiding “full table scans.”
False positives / structural noise: Hospitals, FQHCs, and integrated systems naturally look anomalous; we had to design the system to avoid reckless conclusions and add context-aware interpretation.
OSINT reliability: Web context can be messy; we structured outputs to separate confirmed enforcement actions from unverified allegations and to surface uncertainty clearly.
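One way context-aware interpretation might be wired in is shown below. This is a hedged sketch only: the `Lead` record, the `STRUCTURAL_TYPES` labels, the `review_tier` values, and the caveat wording are assumptions for illustration, and the provider type is assumed to come from an upstream enrichment step.

```python
from dataclasses import dataclass, replace

# Hypothetical labels for provider types that often look anomalous for structural reasons.
STRUCTURAL_TYPES = {"hospital", "fqhc", "integrated_system"}


@dataclass(frozen=True)
class Lead:
    npi: str
    provider_type: str  # assumed to be filled in by provider enrichment
    flags: tuple
    review_tier: str = "review"
    caveats: tuple = ()


def add_context(lead):
    """Attach a caveat for structurally high-volume provider types.

    The lead is never dismissed here; it is only routed to a tier that asks
    for same-type peer comparison before any conclusion is drawn.
    """
    if lead.provider_type.lower() in STRUCTURAL_TYPES:
        caveat = (
            f"Provider type '{lead.provider_type}' often shows high volume for "
            "structural reasons; compare against same-type peers before escalating."
        )
        return replace(
            lead,
            review_tier="review_with_context",
            caveats=lead.caveats + (caveat,),
        )
    return lead


HOSPITAL = add_context(Lead("1111111111", "hospital", ("spending_outlier",)))
INDIVIDUAL = add_context(Lead("2222222222", "individual", ("billing_spike",)))
print(HOSPITAL.review_tier, len(HOSPITAL.caveats))
print(INDIVIDUAL.review_tier, len(INDIVIDUAL.caveats))
```

Keeping the caveat as data on the lead, instead of silently lowering a score, leaves the final judgment with the investigator and makes the reasoning visible in the report.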

Accomplishments that we're proud of

Built a complete end-to-end investigation pipeline from raw claims to anomalies to cross-referenced leads to enriched case files.
Created a system that’s explainable by design: every flag has a rationale, peer comparison, and a “legitimate explanation” section.
Successfully integrated a multi-agent investigation workflow that produces structured, investigator-friendly reports on demand.
