Inspiration

What it does

Uses a random forest to predict whether the AIDS infection rate for a given zip code will be in the top 25% of AIDS rates.
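As a rough illustration of that prediction target, here is a minimal sketch of how a "top 25%" label could be built with pandas. The column names (`zip_code`, `aids_rate`, `top_quartile`) and all the values are made up for the example; they are not taken from the actual data set.

```python
import pandas as pd

# Hypothetical per-zip-code rates; the numbers are invented for illustration.
rates = pd.DataFrame({
    "zip_code": ["28202", "28203", "28204", "28205",
                 "28206", "28207", "28208", "28209"],
    "aids_rate": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})

# The 75th percentile is the cutoff for the top 25% of rates.
cutoff = rates["aids_rate"].quantile(0.75)

# Binary target: 1 if the zip code's rate is above the cutoff, else 0.
rates["top_quartile"] = (rates["aids_rate"] > cutoff).astype(int)

print(rates[["zip_code", "aids_rate", "top_quartile"]])
```

Framing the problem as a yes/no label like this is what lets a classifier be used instead of a regressor.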

How we built it

We used scikit-learn to train a random forest model on the AIDS data set. We also found a data set of tax data for zip codes in North Carolina and used it to estimate the average income for each zip code.
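The steps above might look something like the sketch below. It is not our actual code: the data is synthetic, the column names (`num_returns`, `total_income`, `avg_income`, `aids_rate`) are hypothetical, and the "average income = total income / number of returns" estimate is an assumption about how tax data could be used.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_zips = 200

# Synthetic stand-in for the tax data: one row per (made-up) zip code.
tax = pd.DataFrame({
    "zip_code": [f"{27000 + i:05d}" for i in range(n_zips)],
    "num_returns": rng.integers(500, 20000, n_zips),
})
tax["total_income"] = tax["num_returns"] * rng.uniform(20000, 120000, n_zips)

# Assumed estimate: average income = total reported income / number of returns.
tax["avg_income"] = tax["total_income"] / tax["num_returns"]

# Synthetic rates, loosely tied to income plus noise, only so the
# model has some signal to learn in this demo.
aids = pd.DataFrame({
    "zip_code": tax["zip_code"],
    "aids_rate": 1e6 / tax["avg_income"] + rng.normal(0, 2, n_zips),
})

# Join the two data sets on zip code and build the top-25% label.
data = aids.merge(tax, on="zip_code")
cutoff = data["aids_rate"].quantile(0.75)
data["top_quartile"] = (data["aids_rate"] > cutoff).astype(int)

X = data[["avg_income", "num_returns"]]
y = data["top_quartile"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2f}")
```

Because only a quarter of the zip codes are positive, `stratify=y` keeps that ratio the same in the training and test splits.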

Challenges we ran into

Getting the data was the largest challenge. Another frustration was that we focused on the AIDS data, which is aggregated by zip code, while much of the other demographic data we wanted to use (age, income, race, etc.) is aggregated by census tract in other data sets. And, again, getting the data.

Accomplishments that we're proud of

We didn't do too badly for being fairly new to scikit-learn.

What we learned

How to convert a BSV file to CSV in pandas.
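For anyone curious, a minimal sketch of that conversion is below. It assumes "BSV" means bar-separated values (fields delimited by `|`), and the sample rows are invented for the example.

```python
import io

import pandas as pd

# Assumption: BSV = bar-separated values, i.e. fields delimited by "|".
# The rows below are invented sample data.
bsv_text = "zip_code|aids_rate\n28202|4.2\n28203|1.7\n"

# Read zip codes as strings so they are not turned into integers.
df = pd.read_csv(io.StringIO(bsv_text), sep="|", dtype={"zip_code": str})

# With no path, to_csv returns the CSV text; pass a filename to write a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

With real files, the same two calls work on paths: `pd.read_csv("input.bsv", sep="|")` followed by `df.to_csv("output.csv", index=False)`, where both file names are placeholders.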

What's next for HackathonCLT MMXIX

a nap
