Inspiration

One of our teammates grew up in Mumbai. Every monsoon, the same streets flood. Every monsoon, people say the same thing: someone should have fixed this years ago. That sentence kept following us around as we picked a problem for this hackathon.

Then we looked at Houston. Six federally declared floods since 2016. Then Los Angeles. The Palisades and Eaton fires hit in January 2025, the same hillsides that burned in 2018. Different continents, same story on a loop. A disaster hits, the news cycle moves on, the city rebuilds the exact same way, and everyone is surprised when it happens again.

None of this is actually unpredictable. Houston floods on a roughly known cycle. Los Angeles burns on a roughly known cycle. The patterns are sitting right there in the historical record. We kept asking ourselves one question. The United States federal government, through FEMA, the Federal Emergency Management Agency, already collects detailed records every time it pays a household after a disaster. If the pattern is predictable and the data already exists, why has it not already been turned into something a city can act on before the next disaster, not after?

That was the project.

What it does

DisasterCast answers two questions most city budget committees do not currently have a clear, sourced answer to.

What will the next flood or wildfire cost if nothing changes? We pull real FEMA payment records, the actual dollars paid to actual households after actual disasters, and project forward.

What should the city fix first to change that number? We rank prevention projects by return on investment, not by which one sounds biggest, using real infrastructure data from places like the Harris County Flood Control District, not generic advice a search engine could have given.
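The ranking idea can be sketched in a few lines. This is a minimal illustration, not the DisasterCast engine: the `Project` class, the `roi` and `rank_by_roi` helpers, and every figure below are hypothetical, and return on investment is assumed here to mean avoided loss divided by upfront cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str            # hypothetical project label
    cost: float          # upfront cost, in any consistent unit
    avoided_loss: float  # projected damage avoided, same unit as cost


def roi(project: Project) -> float:
    """Return avoided loss per unit of cost (an assumed ROI definition)."""
    if project.cost <= 0:
        raise ValueError("project cost must be positive")
    return project.avoided_loss / project.cost


def rank_by_roi(projects: list[Project]) -> list[Project]:
    """Sort projects from highest to lowest return per unit spent."""
    return sorted(projects, key=roi, reverse=True)


# Made-up figures for illustration only, not DisasterCast output.
candidates = [
    Project("large channel widening", cost=500.0, avoided_loss=1500.0),
    Project("culvert upgrades", cost=50.0, avoided_loss=400.0),
    Project("detention basin", cost=200.0, avoided_loss=1000.0),
]
ranked = rank_by_roi(candidates)
```

With these made-up inputs the smallest project ranks first and the largest ranks last, which is the difference between ranking by return and ranking by which project sounds biggest.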

Pick Houston or Los Angeles, and you get a live cost projection, a map of sixty-eight thousand real damaged properties from Hurricane Harvey, a homelessness trend chart built from eighteen years of real HUD data, and five ranked, costed, explained recommendations a city council member could actually bring into a meeting.

Here is the part that made us want to keep building this even after the deadline pressure set in. This is not just a Houston problem or a Los Angeles problem. It shows up in the national numbers. According to USAFacts, looking at FEMA's own approved funding over the last five years, only one point nine percent went to hazard mitigation, the category that actually prevents future damage, while eighty-eight percent went to recovery and public assistance after disasters had already happened. A report to Congress states the same imbalance even more bluntly: FEMA spends seven dollars or more on recovery for every one dollar spent on mitigation. The strange part is that the mitigation dollar works extremely well when it is spent. Federal research shows it saves six dollars or more in avoided future damage for every dollar invested. The return is real. The funding still mostly does not follow it.

Here is what actually changes for the person using it. Before DisasterCast, a Houston budget committee knows Hurricane Harvey caused around one hundred twenty-five billion dollars in damage back in 2017. That is the only number they have, and it points backward. They do not know what the next flood will cost given today's infrastructure, how much of that cost is preventable, or which specific project returns the most per dollar spent. Prevention loses the budget argument to recovery spending almost every year, because the cost of waiting has not had a number attached to it before the disaster actually happens.

After DisasterCast, that same committee sees a projected cost for the next event, broken into categories they can question and check, a ranked list of specific projects with real return on investment figures, and a short policy brief they could genuinely bring into a council meeting. The conversation moves from a guess about the past to a sourced argument about the future. That shift, from hindsight to foresight, is the entire point of the project.

How we built it

We started in Streamlit because we wanted something running with real numbers in it on day one, not a beautiful mockup with fake data inside. Once the core logic worked, one of us rebuilt it as a full Next.js frontend with a FastAPI backend, which let us add real interactive features like the property damage map and the live scenario sliders.

The data pipeline pulls from four FEMA datasets totaling over three hundred thousand records, HUD's Point-in-Time homelessness count going back to 2007, and United States Census population figures. A log-linear regression model projects cost escalation, because disaster costs compound rather than add up, and a retrieval-augmented generation pipeline using Groq's Llama 3.3 model turns real infrastructure evidence into ranked, plain-language recommendations.
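To show why a log-linear fit suits compounding costs, here is a minimal sketch that regresses the natural log of cost on the year, so the slope becomes a constant growth rate. It is an illustration under assumptions: the single year predictor, the function names `fit_log_linear` and `project_cost`, and the synthetic ten percent series are all hypothetical, and the real model may use different features and a library such as scikit-learn.

```python
import math


def fit_log_linear(years, costs):
    """Fit ln(cost) = intercept + slope * year by ordinary least squares."""
    if len(years) != len(costs) or len(years) < 2:
        raise ValueError("need at least two matching (year, cost) pairs")
    logs = [math.log(c) for c in costs]  # costs must be positive
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def project_cost(intercept, slope, year):
    """Map the fitted line back from log space to a cost."""
    return math.exp(intercept + slope * year)


# Synthetic series that grows exactly ten percent a year (not real FEMA data).
years = [2017, 2018, 2019, 2020, 2021]
costs = [100.0 * 1.10 ** (y - 2017) for y in years]

intercept, slope = fit_log_linear(years, costs)
annual_growth = math.exp(slope) - 1       # implied compound growth rate
next_year = project_cost(intercept, slope, 2022)
```

A straight line fitted to the raw costs would add the same amount every year. Fitting in log space multiplies by the same factor every year, which is the compounding behavior described above.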

We wrote nineteen automated tests along the way to check that our calculations behaved correctly: larger damage should always project a larger cost, and a discount should always produce a smaller number than the full price. Those tests caught structural bugs, but not the deeper mistakes. The real errors, a scaling factor applied to the wrong column, a population growth figure that turned out to be invented, an infrastructure project marked as unfinished when it had actually been completed years ago, were only caught by manually checking our own numbers against the original public sources. The tests told us the math was internally consistent. They could not tell us the math was actually true.
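The two properties named above can be written as small tests in the style pytest collects (functions whose names start with `test_`). This is a sketch, not the project's test suite: `projected_cost`, `apply_discount`, and the scale of 1.5 are hypothetical stand-ins for the real calculations.

```python
def projected_cost(assessed_damage: float, scale: float = 1.5) -> float:
    """Hypothetical stand-in for the cost model: scale assessed damage."""
    if assessed_damage < 0:
        raise ValueError("assessed damage cannot be negative")
    return assessed_damage * scale


def apply_discount(cost: float, rate: float) -> float:
    """Reduce a cost by a fractional rate strictly between 0 and 1."""
    if not 0 < rate < 1:
        raise ValueError("rate must be between 0 and 1, exclusive")
    return cost * (1 - rate)


def test_larger_damage_projects_larger_cost():
    damages = [1_000.0, 50_000.0, 2_000_000.0]
    costs = [projected_cost(d) for d in damages]
    assert costs == sorted(costs)
    assert len(set(costs)) == len(costs)  # strictly increasing, no ties


def test_discount_is_below_full_price():
    full = projected_cost(50_000.0)
    for rate in (0.03, 0.07, 0.5):
        assert apply_discount(full, rate) < full
```

Both tests would still pass if `projected_cost` were fed the wrong input column, because the function stays monotonic either way. That is the gap between internally consistent and actually true.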

Challenges we ran into

We got our own numbers wrong, more than once, and had to go back and fix them.

Early on, our cost model showed Hurricane Harvey at forty-four billion dollars. The real, widely documented figure is around one hundred twenty-five billion. We had applied our scaling factor to the wrong column in the FEMA data: the amount actually paid out instead of the amount of damage assessed. Once we found it, fixing it changed almost every number downstream.
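The manual check that caught this can also be expressed as a guard that compares a model output with a documented reference figure. The sketch below uses the two numbers from this section. The helper names and the twenty-five percent tolerance are assumptions for illustration, not part of the project.

```python
def relative_error(estimate: float, reference: float) -> float:
    """Return the absolute gap as a fraction of the reference value."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return abs(estimate - reference) / abs(reference)


def within_tolerance(estimate: float, reference: float,
                     tolerance: float = 0.25) -> bool:
    """True when the estimate is close enough to the reference.

    The 0.25 default is an arbitrary, assumed threshold.
    """
    return relative_error(estimate, reference) <= tolerance


HARVEY_REFERENCE_BILLIONS = 125.0  # documented figure cited in the text
early_estimate_billions = 44.0     # the early, wrong model output

gap = relative_error(early_estimate_billions, HARVEY_REFERENCE_BILLIONS)
passes = within_tolerance(early_estimate_billions, HARVEY_REFERENCE_BILLIONS)
```

Here the early estimate is about sixty-five percent below the reference, so the check fails. A guard like this only works for events that already have a trusted public figure to compare against.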

Later, we had a chart claiming Houston's vulnerable population grew ninety-six percent. It sounded precise and convincing. It was completely made up, a smooth synthetic growth curve with no real Census data behind it. We replaced it with the actual number, forty-eight percent, which is still a real and significant finding, just an honest one instead of an invented one.

The closest call came when we found an infrastructure card claiming a major Houston flood project was only forty percent complete. We searched for it to add more detail, and found Harris County Flood Control District's own announcement that the project had been completed years ago. We had been about to present a finished project as unfinished. We deleted the whole card rather than patch it, because a wrong specific number is worse than an honest gap.

We also leaked an API key into a public commit at one point, late at night, tired, moving fast. GitHub's push protection caught it before it went further. We revoked the key within minutes and learned to slow down exactly at the moments we feel most rushed.

Accomplishments that we're proud of

Every number a user sees in DisasterCast traces back to a real, named, public dataset. Not most numbers. Every one we currently show. When we found a number we could not defend, we removed it instead of keeping it because it looked good on screen. That discipline is the thing we are most proud of, more than any chart or any line of code.

We are also proud that DisasterCast genuinely generalizes. The same engine that explains a Houston bayou flooding explains a Los Angeles hillside burning, because the underlying problem, infrastructure built for yesterday's population serving today's, is the same problem everywhere.

What we learned

We learned that being honest about your own model's limitations is not a weakness in a project like this. It is the entire point. A tool meant to help officials make high-stakes funding decisions has to earn trust by showing its work, not by sounding confident.

We also learned, very directly, that a number that looks impressive and a number that is true are not always the same thing, and the only way to tell the difference is to go check.

What's next for DisasterCast

We want to add a third city, most likely Mumbai, to prove the model holds up outside the United States and to bring the project back to where the idea started. We also want to connect the property-level damage map to more cities once equivalent datasets exist, and to let a city official upload their own local infrastructure project list so the recommendation engine reasons over their real, current capital plan instead of only our researched examples.

Most of all, we want to put this in front of an actual city emergency management office and ask them one question. Does this number change what you would fund next year? If the answer is yes, the project did its job.

Built With

  • fast-api
  • fema
  • fema-disaster-declaration
  • groq
  • harris-country-and-houston-city-population
  • harvey-damage-dataset
  • housingassistancerenters
  • hud-annual-homeless-assistance-report-point-in-time-count-by-continum-of-care-2007-to-2024
  • leaflet.js
  • next.js
  • openfema-housingassistanceowners
  • publicassistanceapplicantprogramdeliveries
  • pytest
  • python
  • react
  • scikit-learn
  • united-states-census-bureau