In India, the hardest part of care is often just finding the right place to get it. Facility info is scattered, stale, and frequently plotted at the wrong coordinates — so patients travel hours to a hospital that can't help them, or pay for a private clinic when a free PMJAY one is closer. For a dialysis patient, a wrong turn can be fatal. And most health tools only speak English, shutting out anyone who doesn't. Jeevan Rekha ("lifeline") fixes both.

What it does

Turns "I need X care near me" into a trustworthy, explainable shortlist of real facilities.

- Any Indian language, in and out: type or speak Hindi/Bengali/Tamil/…; answers come back in the same language.
- Semantic match: "renal replacement therapy" finds a "dialysis unit."
- Precise "near me": GPS → PIN → city, real distances.
- Evidence + verification on every card: cites why it matched, with a PMJAY/HFR/NABH Verified badge.
- Healthcare guardrails: emergency escalation (112/108), relevance floor, medical disclaimer.
- Privacy by design and a live cost dashboard.

How we built it

Entirely on Databricks: no external infra, no API keys.

- Databricks Apps (AppKit / React + Node), deployed as an Asset Bundle
- Lakebase (Postgres + pgvector) as both OLTP store and vector index
- Model Serving: gte-large-en for embeddings, llama-3.3-70b for translation/intent
- Unity Catalog: ~10k-facility bronze source + conversation archive

Multilingual support wraps an English core (translate in → search in English → translate out), keeping one vector space and adding LLM calls only when input isn't English. Ranking blends meaning, proximity, and capacity:

$$\text{score} = 0.60 \cdot \text{sim} + 0.25 \cdot \text{proximity} + 0.10 \cdot \text{beds} + 0.05 \cdot \text{doctors}$$
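The weighted blend can be sketched in Python as follows. Only the four weights come from the writeup; how each term is scaled is not stated, so this sketch assumes every term is normalised to [0, 1]. The `max_km`, `bed_cap`, and `doctor_cap` parameters and the linear distance decay are hypothetical choices for illustration, and the haversine helper is just one common way to get "real distances" from coordinates.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def clamp01(x):
    """Clip a value into the [0, 1] range."""
    return max(0.0, min(1.0, x))


def blended_score(sim, distance_km, beds, doctors,
                  max_km=50.0, bed_cap=500, doctor_cap=100):
    """Blend similarity, proximity, and capacity with the writeup's weights.

    Assumptions (not from the source): proximity decays linearly to 0 at
    max_km, and beds/doctors saturate at bed_cap/doctor_cap.
    """
    proximity = clamp01(1.0 - distance_km / max_km)
    beds_n = clamp01(beds / bed_cap)
    doctors_n = clamp01(doctors / doctor_cap)
    return (0.60 * clamp01(sim)
            + 0.25 * proximity
            + 0.10 * beds_n
            + 0.05 * doctors_n)
```

Because the weights sum to 1.0 and every term is clipped, the score itself stays in [0, 1], which makes a relevance floor easy to express as a single threshold.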

Challenges we ran into

- ~8% of coordinates were wrong: a "Noida" hospital 500 km away. We built a bronze→silver validation pipeline (geo_valid filter) so bad points never show.
- The official pincode directory lied (PIN 201301 → "Agartala"), so we dropped it and geocode from the median of validated facilities.
- Suspicious evidence: an early fallback cited "cleft lip repairs" for a dialysis query; fixed with a suspect-evidence filter + LLM grounding.
- A 512 MB Lakebase cap and a disabled SQL warehouse forced a lean schema and fully Lakebase-native ETL.

Accomplishments that we're proud of

- A full health agent on one platform: LLM, embeddings, OLTP, vectors, ETL, and web app all inside Databricks.
- Trust built into the dataflow: evidence, verification, validated coordinates, and emergency escalation as gates, not afterthoughts.
- Genuinely multilingual, including voice, over an English-only vector core.
- Fast and cheap: sub-100 ms search, a few rupees per 1,000 queries.

What we learned

- A vector index belongs next to your OLTP: pgvector in Lakebase did geo + semantic + evidence in one SQL round-trip.
- For health data, trust is the product: citations and validation mattered more than ranking tweaks.
- Translation as a thin wrapper gives many languages for almost no extra cost or storage.
- Wrong data hurts more than missing data: a confidently wrong "3.9 km" is worse than nothing.

What's next for Jeevan Rekha - LifeLine

- Full OpenStreetMap validation across all ~10k facilities
- Bulk PMJAY/HFR ingestion + procedure-level verification
- MLflow tracing, rate limiting, cost caps
- WhatsApp referral hand-off for ASHA / community health workers
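As a rough illustration of the two coordinate fixes described under "Challenges we ran into", the Python sketch below shows one way a geo_valid check and a median-based PIN geocoder could work. The name geo_valid comes from the writeup; everything else is an assumption for illustration: the facility dictionary layout, the `city_refs` lookup of reference points, the 100 km `max_km` tolerance, and the approximate coordinates in the usage example. The actual pipeline runs inside Lakebase and may use different rules.

```python
from math import asin, cos, radians, sin, sqrt
from statistics import median


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def geo_valid(facility, city_refs, max_km=100.0):
    """Return True if the facility's point is plausible for its stated city.

    Hypothetical rule: reject missing or out-of-range coordinates, and reject
    points farther than max_km from a reference point for the stated city.
    """
    lat, lon = facility.get("lat"), facility.get("lon")
    if lat is None or lon is None:
        return False
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return False
    ref = city_refs.get(facility.get("city"))
    if ref is None:
        # Assumption: with no reference point, keep the facility.
        return True
    return haversine_km(lat, lon, ref[0], ref[1]) <= max_km


def pin_centroid(facilities, pin, city_refs):
    """Geocode a PIN as the median lat/lon of its validated facilities."""
    points = [
        (f["lat"], f["lon"])
        for f in facilities
        if f.get("pin") == pin and geo_valid(f, city_refs)
    ]
    if not points:
        return None
    return (median(p[0] for p in points), median(p[1] for p in points))


# Usage with illustrative, approximate coordinates.
city_refs = {"Noida": (28.54, 77.39)}
facilities = [
    {"city": "Noida", "pin": "201301", "lat": 28.57, "lon": 77.32},
    {"city": "Noida", "pin": "201301", "lat": 28.58, "lon": 77.33},
    {"city": "Noida", "pin": "201301", "lat": 28.59, "lon": 77.34},
    {"city": "Noida", "pin": "201301", "lat": 23.83, "lon": 91.28},  # misplaced point
]
centroid = pin_centroid(facilities, "201301", city_refs)
```

Using the median rather than the mean keeps a single stray point from dragging the PIN's location, and filtering through geo_valid first means a misplaced facility never contributes at all.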
