Group Members

Jacob Peterson, Blake Maxwell, Matt Chang

Inspiration

Cancer has become increasingly prevalent worldwide over recent decades. It deserves research both in general and into the specific risk factors that may contribute to it across age groups.

What it does

The project imports a large amount of cancer, study, and population data from CDC sources, combines it into several dataframes, and formats it. It then computes cancer trends over time and the relevance of alcohol and tobacco as risk factors, and produces visualizations for each.
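As a rough illustration of the "trends over time" step, here is a minimal sketch that turns case counts into a rate per 100,000 people and fits a straight line to it. The column names (`year`, `cases`, `population`), the helper functions, and the numbers are all hypothetical and made up for this example. They are not our actual CDC data or our exact code.

```python
import numpy as np
import pandas as pd

# Made-up numbers purely for illustration (not CDC data).
df = pd.DataFrame({
    "year": [2000, 2001, 2002, 2003],
    "cases": [100, 110, 120, 130],
    "population": [100_000, 100_000, 100_000, 100_000],
})


def add_rate_per_100k(frame):
    """Return a copy with a cases-per-100,000-people column added."""
    out = frame.copy()
    out["rate_per_100k"] = out["cases"] / out["population"] * 100_000
    return out


def linear_trend(frame, x="year", y="rate_per_100k"):
    """Slope of a least-squares line: change in rate per year."""
    slope, _intercept = np.polyfit(frame[x], frame[y], 1)
    return float(slope)


rates = add_rate_per_100k(df)
slope = linear_trend(rates)
print(rates)
print("change in rate per year:", round(slope, 2))
```

A positive slope means the rate is rising over the years covered. A straight-line fit is only one simple way to summarize a trend, and other methods may suit the data better.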

How we built it

We built it in Google Colab using CSV data downloaded from the CDC and loaded into pandas DataFrames, then moved the code into VSCode before submission.
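The general pattern of loading CSV files into pandas and combining them can be sketched as follows. The CSV contents, column names (`year`, `age_group`, `cases`, `population`), and values are invented stand-ins, and `io.StringIO` takes the place of the downloaded files so the example runs on its own. Our real files had different layouts.

```python
import io

import pandas as pd

# Stand-ins for downloaded CSV files; the contents are invented.
cases_csv = io.StringIO(
    "year,age_group,cases\n"
    "2000,40-49,120\n"
    "2000,50-59,300\n"
    "2001,40-49,130\n"
)
population_csv = io.StringIO(
    "year,age_group,population\n"
    "2000,40-49,50000\n"
    "2000,50-59,40000\n"
    "2001,40-49,51000\n"
)

cases = pd.read_csv(cases_csv)
population = pd.read_csv(population_csv)

# Join on the shared keys; validate raises an error if a key repeats,
# which helps catch duplicated rows before computing anything.
merged = cases.merge(
    population,
    on=["year", "age_group"],
    how="inner",
    validate="one_to_one",
)
print(merged)
```

With real files, `pd.read_csv` would be given a file path instead of a `StringIO` object.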

Challenges we ran into

Time investment (the project took about three times as long as we initially expected), importing cancer count data into a CSV, formatting the data correctly, internet issues, working in Colab simultaneously, learning a new library, computing frequencies, graphing with different methods and libraries, reformatting everything for VSCode (including installing six libraries), and cutting the presentation video down to be short while still covering everything.

Accomplishments that we're proud of

Completing the project and getting it to run, the visualizations, learning a new library, ending up with a project more than four times the length expected of us, and getting relevant data to present.

What we learned

New libraries, code formatting, time management, dealing with massive amounts of data (one of our dataframes was 25,000 × 742!), and reading and translating scientific study data.

What's next for Cancer Prevalence Over Time

We're all STEM students, so we'll continue doing research across various scientific applications using the data programming skills we've learned.
