Dwell

Inspiration

By the time a building failure is visible to a resident, it has already been predictable in public data for months: 311 complaints that went unanswered, permits that expired without inspection, seismic risk that nobody acted on. That data is technically public, but it is scattered across a dozen disconnected government databases in formats few people can read. A renter has no way of knowing that next year they could be dealing with water damage, foundation problems, or major repair disruptions, because the warning signs sit in millions of rows of permit, inspection, and infrastructure records that no one ever checks. The neighborhoods that pay the price for this gap are almost always the ones with the fewest resources to recover. Dwell exists to close it.

What it does

Enter an address and Dwell pulls real data from USGS, NWS/FEMA, USDA SSURGO, and DataSF, covering building permits, 311 history, soil quality, and seismic and flood risk. It then reasons across signals: a building with no soft-story retrofit, a cluster of water damage permits, and two open sewer complaints together tell a different story than any one of those alone. The output is a plain-language safety report with color-coded risk findings, a prioritized roadmap, and realistic cost ranges. For renters, every finding maps to an action, such as what a landlord is legally required to fix or how to file a 311 complaint that creates a paper trail.

How we built it

A Python data layer wraps four public APIs into clean, cacheable functions. A Claude-powered ReAct agent decides which tools to call, then passes the results to a deterministic scoring rubric. The rubric is kept separate from the LLM, so every grade is auditable rather than a black box. A lightweight TF-IDF system grounds cost estimates in real public works data. Every API response is cached on first run, so the demo loads instantly and keeps working if the Wi-Fi drops.
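To make the "auditable, not a black box" idea concrete, here is a minimal sketch of what a deterministic rubric could look like. It is not Dwell's actual rubric: the signal names, point values, and grade thresholds are all hypothetical, chosen only to mirror the soft-story, water damage, and sewer complaint example from the description. The point is that every grade comes with the list of rules that produced it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule: str      # which rule fired
    points: int    # risk points this rule contributed
    evidence: str  # the raw signal that triggered the rule


def score_building(signals: dict) -> tuple[str, list[Finding]]:
    """Grade a building from already-fetched signals.

    Hypothetical rubric: keys, weights, and thresholds are illustrative.
    Keys are read with [] on purpose, so a missing signal raises KeyError
    instead of silently counting as "no risk".
    """
    findings: list[Finding] = []

    if not signals["soft_story_retrofit_done"]:
        findings.append(Finding("no_soft_story_retrofit", 3,
                                "no retrofit on record"))

    water = signals["water_damage_permits"]
    if water >= 3:
        findings.append(Finding("water_damage_cluster", 2,
                                f"{water} water damage permits"))

    sewer = signals["open_sewer_complaints"]
    if sewer >= 2:
        findings.append(Finding("open_sewer_complaints", 2,
                                f"{sewer} open sewer complaints"))

    # Compound rule: the two signals together count for more than their sum.
    if water >= 3 and sewer >= 2:
        findings.append(Finding("water_and_sewer_together", 2,
                                "water damage cluster plus open sewer complaints"))

    total = sum(f.points for f in findings)
    if total >= 7:
        grade = "red"
    elif total >= 3:
        grade = "yellow"
    else:
        grade = "green"
    return grade, findings
```

Because the function is pure and takes plain data, the same inputs always produce the same grade, and the returned findings can be shown to the resident as the reasoning behind it. In a setup like this, the LLM would only gather the signals and explain the findings in plain language; it would not assign the grade.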

Challenges we ran into

We originally planned a trained ML failure predictor, but the available public data consists of current-state snapshots, not labeled failure events, so we pivoted to an auditable reasoning system instead. That turned out to be the stronger product: residents making real decisions need to trust the output, and a system that shows its work beats one that doesn't. We also hit a JSON truncation bug caused by an undersized token budget, and we built graceful fallback handling after the SF 311 feed dropped mid-test, because a silent failure that produces a falsely clean report is worse than an error.
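The two failure modes above can be illustrated with a short sketch. This is an assumption about how such handling might be structured, not the code we shipped: `SourceResult`, `fetch_source`, `parse_model_json`, and `summarize` are hypothetical names. The idea is that a dropped feed is recorded as missing rather than as "no complaints found", and that cut-off model output fails loudly instead of being partially parsed.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class SourceResult:
    name: str
    data: Optional[Any]   # payload when the fetch succeeded
    error: Optional[str]  # reason for failure, or None on success


def fetch_source(name: str, fetch: Callable[[], Any]) -> SourceResult:
    """Run one data-source fetch and record failure explicitly."""
    try:
        return SourceResult(name, fetch(), None)
    except Exception as exc:  # broad on purpose: any failure means "unknown"
        return SourceResult(name, None, f"{type(exc).__name__}: {exc}")


def parse_model_json(text: str) -> Any:
    """Parse model output, refusing anything that is not complete JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(
            "model output is not complete JSON; "
            "it may have been cut off by the token limit"
        ) from exc


def summarize(results: list[SourceResult]) -> dict:
    """A report is only 'complete' if every source actually answered."""
    missing = [r.name for r in results if r.error is not None]
    return {
        "status": "incomplete" if missing else "complete",
        "missing_sources": missing,
    }
```

With this shape, a report built while the 311 feed is down would carry `"status": "incomplete"` and name the missing source, so the reader sees that the absence of complaints is unknown rather than confirmed.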

What we learned

The hard part of public infrastructure data isn't access; it's synthesis across sources that were never designed to talk to each other. A transparent rubric paired with an LLM that explains its reasoning is more trustworthy than a black-box score, especially when the output affects where someone lives. No one should need to be a data engineer to find out if their building is safe.