The rising cost of healthcare prompted us to examine how average procedure costs have varied geographically in recent years. While looking for data on how states handle different procedures, and on Medicare costs versus total costs for those procedures, our team found the Inpatient Prospective Payment System (IPPS) dataset online.


Working with the IPPS dataset, we analyzed the number of procedure instances per state, as well as the relative cost of those procedures in terms of Medicare costs and total costs. We used the Pandas library to manipulate the data, find correlations, and plot preliminary graphs of the raw data. We displayed the results with the Bokeh library as a heat map of these variables across US states, and developed a Django web application to host the visualizations.
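As a rough illustration of this kind of per-state aggregation, here is a minimal Pandas sketch. It is not our hackathon code: the column names (`state`, `discharges`, `avg_total_payment`, `avg_medicare_payment`) and the sample rows are made up for the example, and weighting the averages by discharge count is an assumption about how one might combine procedures within a state.

```python
import pandas as pd

# Tiny made-up sample. Column names are hypothetical, not the real IPPS headers.
records = pd.DataFrame({
    "state": ["TX", "TX", "CA", "CA", "NY"],
    "procedure": ["A", "B", "A", "B", "A"],
    "discharges": [10, 30, 20, 20, 50],
    "avg_total_payment": [1000.0, 2000.0, 1500.0, 2500.0, 3000.0],
    "avg_medicare_payment": [800.0, 1500.0, 1200.0, 2000.0, 2400.0],
})


def summarize_by_state(df):
    """Discharge-weighted average payments per state, plus the Medicare share."""
    # Convert per-discharge averages back into total dollars for each row.
    weighted = df.assign(
        total_cost=df["avg_total_payment"] * df["discharges"],
        medicare_cost=df["avg_medicare_payment"] * df["discharges"],
    )
    sums = weighted.groupby("state")[
        ["discharges", "total_cost", "medicare_cost"]
    ].sum()
    summary = pd.DataFrame({
        "discharges": sums["discharges"],
        "avg_total_payment": sums["total_cost"] / sums["discharges"],
        "avg_medicare_payment": sums["medicare_cost"] / sums["discharges"],
    })
    # Fraction of the average total payment that Medicare covers.
    summary["medicare_share"] = (
        summary["avg_medicare_payment"] / summary["avg_total_payment"]
    )
    return summary


summary = summarize_by_state(records)
print(summary)
```

A table like `summary`, indexed by state, is the sort of input a state-level heat map needs: one value per state for each variable being colored.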


Two members were introduced to Python at this hackathon, and the entire team was unfamiliar with NumPy, Pandas, Bokeh, and professional data analytics before the event. One member who knew Python had configuration trouble with NumPy and Pandas in their Anaconda distribution (resolved by reinstalling Anaconda) and was not able to fully participate in the project until 6-7 hours into the hackathon. We also changed our minds several times about which data visualization library was best suited to the data: we originally chose JavaScript and D3, but later shifted to Python and Bokeh.


Given how much progress half of our team made in learning Python, this project was a grand success. Exposure to, and experience with, popular Python data analysis libraries was an important accomplishment in itself. Most of all, with no formal teaching in this area, we feel that we performed very well on our first big data analysis.

If Given More Time

Analyzing the data across smaller areas (counties rather than states) would have been ideal. Mapping zip codes to counties within states was far more difficult than anticipated, so we dropped this idea early on. Restricting the set of locations by metro/local status and by cost evaluations was also an area of interest, but sadly it was not fully explored.
