We were inspired by the large amounts of biological data being produced in the 21st century, and by how we, as data scientists, could help the world better understand it. Given that breast cancer is a significant health concern in our population, visualizing and understanding data about it could help better direct patient treatment, public policy, and therapeutic development.
What it does
We visualized a dataset containing information on patient outcomes, age, and types of surgery performed. We then conducted a research study to see how these data fit within the current understanding of breast cancer, which we established through a literature review.
How we built it
We used Python libraries such as pandas, Matplotlib, and seaborn in a Jupyter notebook to clean and prepare our data and to create our statistical figures.
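To give a sense of that workflow, here is a minimal sketch rather than our actual notebook code. The toy records, the column names (`age`, `surgery_type`, `outcome`), and the helper functions `clean_patients` and `plot_age_by_surgery` are all hypothetical and chosen only for illustration; the real dataset's schema may differ.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; this line can be dropped inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy records; column names are assumptions, not the real dataset's schema.
raw = pd.DataFrame({
    "age": [45, 62, None, 38, 71, 62],
    "surgery_type": ["lumpectomy", "mastectomy ", "Lumpectomy", None, "Mastectomy", "mastectomy"],
    "outcome": ["Alive", "Alive", "Dead", "Alive", "Dead", "Alive"],
})


def clean_patients(df):
    """Drop rows missing age or surgery type, then normalise the surgery labels."""
    cleaned = df.dropna(subset=["age", "surgery_type"]).copy()
    # Strip stray whitespace and unify capitalisation so labels group correctly.
    cleaned["surgery_type"] = cleaned["surgery_type"].str.strip().str.title()
    cleaned["age"] = cleaned["age"].astype(int)
    return cleaned.reset_index(drop=True)


def plot_age_by_surgery(df):
    """Draw a box plot of patient age for each surgery type."""
    fig, ax = plt.subplots(figsize=(6, 4))
    sns.boxplot(data=df, x="surgery_type", y="age", ax=ax)
    ax.set_xlabel("Surgery type")
    ax.set_ylabel("Age (years)")
    ax.set_title("Patient age by surgery type (toy data)")
    fig.tight_layout()
    return fig


patients = clean_patients(raw)
fig = plot_age_by_surgery(patients)
```

In a notebook the figure would render inline; elsewhere, `fig.savefig("age_by_surgery.png")` would write it to disk.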
Challenges we ran into
We had difficulty producing some of our statistical figures in Python, and spent a good deal of our time searching Google to find and fix our errors.
Accomplishments that we are proud of
Producing a scientific poster was a big achievement, as we are now well positioned to present it at scientific conferences down the line should we wish to do so.
What we learned
We learned how effective it can be to combine in-person and virtual team members on a project, and how important it is not to let yourself get stuck on an issue or idea.
What's next for breast cancer data study
We plan to apply our statistical models to more comprehensive datasets and to refine our understanding of breast cancer in general, so that a better analysis can be performed in the future.