Inspiration

Although a few of us were motivated by the prizes, we also chose the Health track because it offered a level of challenge where both experienced and inexperienced coders and data scientists could cooperate. Also, by using datasets that we had experienced firsthand or seen before, we were able to compare the trends we found with those of the real world.

What it does

Our project analyzes the datasets of weekly death counts for influenza, pneumonia, and COVID-19, reported by different regions and age groups. By correlating these health data with external factors like travel statistics, we can highlight trends that show how disease outbreaks influence travel trends in various parts of the country.
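As a rough illustration of the kind of correlation described above, here is a minimal sketch in pandas. The column names (`week`, `region`, `deaths`, `trips`) and the toy numbers are assumptions made up for this example; they are not taken from our actual datasets.

```python
import pandas as pd

# Hypothetical toy data: weekly death counts per region (illustrative values only).
deaths = pd.DataFrame({
    "week": [1, 2, 3, 4, 1, 2, 3, 4],
    "region": ["A"] * 4 + ["B"] * 4,
    "deaths": [10, 20, 30, 40, 5, 5, 10, 20],
})

# Hypothetical toy data: weekly travel volume per region (illustrative values only).
travel = pd.DataFrame({
    "week": [1, 2, 3, 4, 1, 2, 3, 4],
    "region": ["A"] * 4 + ["B"] * 4,
    "trips": [400, 300, 200, 100, 90, 95, 80, 60],
})


def correlate_by_region(deaths_df, travel_df):
    """Return the Pearson correlation of deaths vs. trips for each region."""
    # Align the two sources on the keys they share.
    merged = deaths_df.merge(travel_df, on=["week", "region"], how="inner")
    # Series.corr defaults to the Pearson correlation coefficient.
    return {
        region: group["deaths"].corr(group["trips"])
        for region, group in merged.groupby("region")
    }


result = correlate_by_region(deaths, travel)
for region, r in result.items():
    print(f"Region {region}: r = {r:.2f}")
```

A negative coefficient here would suggest that travel tends to fall as deaths rise, although a correlation on its own does not show that one causes the other.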

How we built it

The first step in turning any dataset into a functional system is cleaning it, so we began by removing any irrelevant or incomplete data. After preprocessing, we used Python for statistical analysis and visualization. We then created models in Plotly and Tableau to track fluctuations in travel trends across different regions and age groups, overlaying this data with health statistics to identify key relationships.
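The cleaning step could look something like the sketch below. It is only an approximation under assumed column names (`week_ending`, `region`, `age_group`, `deaths`) and a made-up `footnote` column standing in for irrelevant data; the real dataset's schema may differ.

```python
import pandas as pd


def clean_weekly_deaths(raw):
    """Keep the needed columns, fix types, and drop incomplete or duplicate rows."""
    # Assumed set of relevant columns; everything else is treated as irrelevant.
    required = ["week_ending", "region", "age_group", "deaths"]
    df = raw[required].copy()

    # Values that cannot be parsed become NaT/NaN so they can be dropped below.
    df["week_ending"] = pd.to_datetime(df["week_ending"], errors="coerce")
    df["deaths"] = pd.to_numeric(df["deaths"], errors="coerce")

    df = df.dropna(subset=required)  # remove incomplete rows
    df = df.drop_duplicates()        # remove exact repeats
    df["deaths"] = df["deaths"].astype(int)
    return df.reset_index(drop=True)


# Hypothetical raw rows: one bad date, one duplicate, and one missing region.
raw = pd.DataFrame({
    "week_ending": ["2020-01-04", "2020-01-11", "not a date", "2020-01-11", "2020-01-18"],
    "region": ["Region 1", "Region 2", "Region 1", "Region 2", None],
    "age_group": ["All", "All", "All", "All", "All"],
    "deaths": ["12", "7", "3", "7", "9"],
    "footnote": ["", "", "", "", ""],
})

cleaned = clean_weekly_deaths(raw)
print(cleaned)
```

Coercing bad values to missing and dropping them in one pass keeps the rules in a single function, which is easier to rerun than fixing rows by hand.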

Challenges we ran into

From grasping the basics of Google Colab to mastering the various data analysis tools we used, and finally visualizing our results, we faced numerous challenges throughout the process. One of the most difficult and time-consuming tasks was developing an efficient method for cleaning and correcting the dataset. With over 50,000 rows of data, sorting it accurately would have been nearly impossible without a streamlined approach.

Accomplishments that we're proud of

We're proud of successfully integrating multiple complex data sources to derive insights, particularly our ability to visualize the relationship between death counts and travel trends in a clear, compelling way. We're also very proud of how well the group was able to communicate and work together despite only meeting one another on the day of C.D.C. Having a group that was both great to work with and enjoyable to be around has made us more than proud of our creation!

What we learned

For most of us, this was the first time working with datasets as large as this one, particularly in public health. There was also a learning curve for some of us, as we each had expertise in different aspects of the project. For instance, Andy learned more about Pandas, and Asa learned more about Python overall.

What's next for CDC-2024: Health Science Track

As for our program, we're all excited to see it perform and to see how it places against our competitors in this challenge. However, after long hours of working with the same datasets, we would like to try other ones, maybe something lighter such as the Natural Science track, or push ourselves to see how we would handle the Pop Culture track.
