AyushAudit AI — The Story
Inspiration
India's PM-JAY insures 500M+ people yet loses an estimated ₹2,000 Cr/year to fraud. A rule-based auditor felt solvable with RAG.
How it's built
$$\text{Audit} = \text{RAG}(\text{Guidelines}) \xrightarrow{\text{LLM}} P(\text{fraud}) \gtrless \tau$$
Guidelines → chunked → FAISS index → top-$k$ rules retrieved per claim → LLM judges.
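The pipeline above can be sketched in a few lines. This is a minimal illustration, not the project's code: a toy bag-of-words similarity with brute-force ranking stands in for the embedding model and FAISS index, and `judge` is a hypothetical callable standing in for the LLM. The function names and the default `tau=0.5` are assumptions.

```python
import math
import re
from collections import Counter

def chunk(text, size=40):
    """Split guideline text into chunks of roughly `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Toy bag-of-words vector; a real system would use a dense embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(claim, chunks, k=2):
    """Return the k chunks most similar to the claim (brute-force stand-in for FAISS)."""
    q = embed(claim)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def audit(claim, chunks, judge, tau=0.5, k=2):
    """Flag the claim when the judge's fraud probability exceeds tau."""
    rules = retrieve(claim, chunks, k)
    p_fraud = judge(claim, rules)  # hypothetical LLM call returning a value in [0, 1]
    return {"p_fraud": p_fraud, "flagged": p_fraud > tau, "rules": rules}
```

Passing the judge in as a callable keeps the retrieval step testable without any model or API access.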
What I learned
- Unity Catalog replaced DBFS silently — every path assumption broke
- hive_metastore is deprecated; /tmp is ephemeral; Gradio beats Streamlit inside notebooks
- Heuristics are a must-have fallback when APIs fail
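One way to wire in a heuristic fallback is sketched below. The claim fields (`amount`, `package_rate`, `length_of_stay_days`, `documents`) and the weights are hypothetical, chosen only to show the shape of the pattern; `llm_judge` is an assumed callable that may raise when the API is down.

```python
def heuristic_score(claim):
    """Rule-of-thumb fraud score in [0, 1]; fields and weights are illustrative only."""
    score = 0.0
    if claim.get("amount", 0) > claim.get("package_rate", float("inf")):
        score += 0.5   # billed above the package rate
    if claim.get("length_of_stay_days", 1) == 0:
        score += 0.25  # inpatient claim with no recorded stay
    if not claim.get("documents"):
        score += 0.25  # no supporting documents attached
    return min(score, 1.0)

def score_claim(claim, llm_judge):
    """Prefer the LLM; fall back to heuristics if the call fails for any reason."""
    try:
        return llm_judge(claim), "llm"
    except Exception:
        # Deliberately broad: timeouts, quota errors, and bad responses all
        # route to the same fallback so the audit never stalls.
        return heuristic_score(claim), "heuristic"
```

Returning the source label alongside the score makes it possible to see later which claims were judged without the model.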
Challenges faced
| Problem | Fix |
|---|---|
| DBFS disabled | UC Volumes /Volumes/... |
| main catalog missing | Discovered workspace via SHOW CATALOGS |
| LLM unavailable on free tier | Sarvam AI + heuristic fallback |
| Scanned PDFs | Tesseract OCR at 300 DPI |
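The first two fixes can be combined into a small path helper. The sketch below assumes the catalog names have already been collected (in a Databricks notebook, for example, from the rows returned by `spark.sql("SHOW CATALOGS")`); the helper names, the excluded catalogs, and the schema and volume names in the usage are all hypothetical.

```python
import posixpath

def pick_catalog(available, preferred=("main",)):
    """Return the first preferred catalog that exists, else the first usable one."""
    for name in preferred:
        if name in available:
            return name
    # Assumption: these catalogs are not suitable for writing project data.
    usable = [c for c in available if c not in ("system", "samples", "hive_metastore")]
    if not usable:
        raise LookupError("no usable catalog found")
    return usable[0]

def volume_path(catalog, schema, volume, *parts):
    """Build a Unity Catalog Volumes path: /Volumes/<catalog>/<schema>/<volume>/..."""
    return posixpath.join("/Volumes", catalog, schema, volume, *parts)
```

For example, `volume_path(pick_catalog(["samples", "system", "workspace"]), "default", "audit", "guidelines.pdf")` resolves to a path under `/Volumes/workspace/` instead of assuming that `main` exists.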
The real lesson
$$\text{Production} = \text{Good idea} + \underbrace{\text{10x debugging}}_{\text{the actual work}}$$
Built for the 55M PM-JAY claims processed annually — catching fraud before reimbursement, not after.
Built With
- datalake
- python
- replit