The Pudding Agent (Graph Harness Agent on Databricks Apps!)

Inspiration

The Virtue Foundation dataset poses a deceptively simple question: where are India's "medical deserts," the districts where people cannot reach care? The hard part is not running SQL. It is that the data is a ~10,000-facility web-scraped sample (not a census), the capability fields are self-reported, and there is no population denominator, so the easiest "insights" are exactly the ones most likely to mislead a planner. We wanted an agent a non-technical health planner could simply talk to, one that answers in stories, decks, notebooks, and maps, and that is relentlessly honest about what the data cannot prove.

Give the GIFs a moment to load. It will be worth it! Also check out the YouTube video!

Note for judges: access to the live app

This app uses on-behalf-of (OBO) authentication on Databricks. Every action is personalised and runs as the signed-in user, so each viewer's account must be granted access before they can sign in.

To get hands-on access, please email me your Databricks (or login) email and I'll add you right away: rathes.waran@resonance-analytics.com

What it does

The Pudding Agent is a full-stack analytics agent running as a Databricks App. Ask it a plain-English question about medical deserts and it:

Builds scroll-driven narrative essays (in The Pudding style) over the dataset.
Exports real PowerPoint decks for offline communication to stakeholders.
Writes reproducible Databricks notebooks that run natively on serverless for deeper analysis.
Serves a live interactive map of district access gaps.

Ask "where is anaemia worst, and which facilities claim advanced care they cannot back up?" and it routes to the right SQL pattern, pulls the figures through governed SQL, and answers with a figure-injected story, every number taken from the data and never hand-typed. The headline it surfaces: health-insurance coverage ranges from 1.2% in South Andaman to 97.8% in Barmer, an 81.5x spread across 706 districts, and 305 of those districts (43.2%) have zero facilities in the sample.
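
As a quick sanity check on the headline arithmetic, here is a minimal sketch that recomputes the two derived figures (the spread and the zero-facility share) from the four numbers quoted above. The function names are ours for illustration and are not part of the agent.

```python
def spread_ratio(low_pct: float, high_pct: float) -> float:
    """How many times larger the highest coverage is than the lowest."""
    if low_pct <= 0:
        raise ValueError("low_pct must be positive to form a ratio")
    return high_pct / low_pct


def share_pct(part: int, total: int) -> float:
    """Percentage of `total` represented by `part`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Inputs are the figures quoted in the text above.
coverage_spread = spread_ratio(1.2, 97.8)   # South Andaman vs Barmer
zero_facility_share = share_pct(305, 706)   # districts with no sampled facility

print(f"spread: {coverage_spread:.1f}x")
print(f"zero-facility districts: {zero_facility_share:.1f}%")
```

Rounded to one decimal place, these reproduce the 81.5x and 43.2% quoted above.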

How we built it

Harness: Databricks has no native agent sandbox, so we built one with LangChain's Deep Agents SDK, simulating a virtual filesystem over Volumes (artifacts) and Workspaces (generated notebooks), all OBO-enabled so every action is personalised to the signed-in user.
The brain is a knowledge graph. Instead of stuffing every skill into the prompt or hard-coding instructions, domain knowledge lives in a 448-node, 3,590-relation Neo4j graph (Findings, Metrics, SQL patterns, Questions, chart and deck recipes, design rules, and the Genie space). A find_skill tool retrieves only what each question needs, by intent, walking Finding to Metric to SQL pattern to Genie space for a joined-up, explainable plan. In our A/B tests this used roughly 58% fewer input tokens than reading skill files directly, and you can watch the traversal live in an in-app graph explorer.
Reasoning and retrieval: a Deep Agents orchestrator on Databricks Model Serving (gpt-5.5) plans the work, and a Databricks Agent Bricks Genie Space (the India Healthcare Access Space) turns natural language into governed SQL over the Virtue Foundation Delta Share.
Memory: Lakebase (managed Postgres) is the backbone. Every turn is checkpointed, so the agent is multi-turn, resumable, and stateful, and chat history plus feedback are the single system of record shared by the app, orchestrator, and subagents.
Observability: every request is traced end to end with MLflow Tracing, so each reasoning step, graph traversal, and Genie call is an auditable span.
Scale by data, not code: the graph is produced by an offline pipeline (ingest, EDA, gap-analysis, embed). Retargeting a new dataset means rebuilding the graph, not rewriting the agent or the prompts.
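
To make the graph walk described above concrete (Finding to Metric to SQL pattern to Genie space), here is a minimal sketch using a plain in-memory dictionary rather than Neo4j. Everything in it is an assumption for illustration: the node ids, the relation names, and the `walk_plan` helper are hypothetical, and the real find_skill tool retrieves by intent over the full graph, which this toy does not attempt.

```python
from collections import deque

# Hypothetical toy graph: node id -> (label, [(relation, target id), ...]).
# None of these ids or relation names come from the real 448-node graph.
GRAPH = {
    "finding:anaemia_gap": ("Finding", [("MEASURED_BY", "metric:anaemia_rate")]),
    "metric:anaemia_rate": ("Metric", [("COMPUTED_BY", "sql:district_rank")]),
    "sql:district_rank": ("SQLPattern", [("RUNS_ON", "genie:healthcare_access")]),
    "genie:healthcare_access": ("GenieSpace", []),
    "recipe:deck_layout": ("DeckRecipe", []),  # unrelated node, never visited
}


def walk_plan(start, graph=GRAPH, max_hops=3):
    """Breadth-first walk from `start`, returning (node_id, label) pairs.

    Only nodes reachable within `max_hops` relations are returned, so
    unrelated skills never enter the plan (or the prompt).
    """
    if start not in graph:
        raise KeyError(f"unknown node: {start}")
    seen = {start}
    queue = deque([(start, 0)])
    plan = []
    while queue:
        node, depth = queue.popleft()
        plan.append((node, graph[node][0]))
        if depth == max_hops:
            continue  # hop budget spent on this branch
        for _relation, target in graph[node][1]:
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return plan


plan = walk_plan("finding:anaemia_gap")
for node_id, label in plan:
    print(f"{label:<11} {node_id}")
```

The point of the sketch is the design choice: only the nodes on the path from the matched Finding reach the model, which is the mechanism behind reading less than a full set of skill files.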

Challenges we ran into

An honest dataset is a hostile dataset. The facilities table is a ~10k sample, against 47,000+ facilities in external registries; it is 88.4% private and has no population column. We designed an "honesty contract" that travels with every artifact: coverage not census, no per-capita figures, self-reported capability, and suppressed NFHS cells that encode rarity rather than absence.
Confounding. "More need means fewer facilities" looks real until you control for urbanisation, at which point it largely disappears. We made the Python Analyst subagent pressure-test relationships and report confidence rather than assert causation.
Messy geography. Facilities bridge to districts through address, PIN, and district-name normalisation, so unmatched or inconsistently written locations create apparent gaps we had to surface honestly.
No native sandbox. Databricks has no built-in agent sandbox, so we used the Deep Agents virtual filesystem over Volumes and Workspaces.
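
The confounding point above can be illustrated with a first-order partial correlation, one standard way to ask whether a relationship survives once a third variable is held fixed. This is a minimal sketch, not necessarily the check the Python Analyst subagent runs, and the three input correlations are made-up numbers chosen for illustration, not values from the Virtue Foundation data.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with z held fixed (first-order partial)."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom


# Made-up illustrative inputs, NOT measured on the dataset:
r_need_facilities = -0.50   # more need, fewer facilities (raw)
r_need_urban = -0.70        # urban districts show less need
r_facilities_urban = 0.70   # urban districts have more sampled facilities

adjusted = partial_correlation(
    r_need_facilities, r_need_urban, r_facilities_urban
)
print(f"raw={r_need_facilities:+.2f} adjusted={adjusted:+.2f}")
```

With these toy inputs the adjusted value lands close to zero. That is the pattern described above: an apparent link that urbanisation mostly accounts for.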

Built With

agent-bricks
databricks
databricks-apps
deepagents
genie
lakebase
langchain
mlflow
neo4j
openai
python
react

Updates

Rathes Waran started this project — Jun 16, 2026 03:25 PM EDT
