Inspiration

Breast cancer is the most common cancer in women. About 90% of breast cancer patients live for at least 5 years after the cancer is diagnosed. When breast cancer is detected early, the chance of survival is much higher. Beginning at age 40, screening every one or two years is one of the most commonly recommended preventive measures for women. However, a large population of women does not have access to such services despite having a high chance of developing breast cancer. A risk score that identifies individuals at high risk of developing the disease could encourage them to seek professional help in time. Such a tool is easy and cost-effective to use. Healthcare organizations can also use such a model to identify high-risk individuals.

What it does

Pink Ribbon is a tool for estimating the chance of developing breast cancer. The user enters information including age, weight, height, history of breast cancer in a first-degree relative, age at first birth (or nulliparity), menopausal status, history of previous breast biopsy, and history of hormone replacement therapy. The app then returns the estimated chance of developing breast cancer. Along with the result, the tool provides additional insights about breast cancer drawn from the dataset.
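As a rough illustration of how the inputs listed above could be turned into a feature record for a model, here is a minimal sketch. The `RiskInputs` class, the `to_features` function, the field names, and the units (kilograms and centimetres) are all assumptions made for this example and are not taken from the project's code. Deriving body mass index from weight and height is likewise only one plausible preprocessing choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RiskInputs:
    """Hypothetical container for the form fields described above."""
    age: int
    weight_kg: float                    # assumed unit: kilograms
    height_cm: float                    # assumed unit: centimetres
    first_degree_relative: bool         # breast cancer in a first-degree relative
    age_at_first_birth: Optional[int]   # None means nulliparous
    postmenopausal: bool
    previous_biopsy: bool
    hormone_therapy: bool


def to_features(inputs: RiskInputs) -> dict:
    """Convert raw form values into a flat numeric feature record."""
    if inputs.weight_kg <= 0 or inputs.height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = inputs.height_cm / 100.0
    bmi = inputs.weight_kg / (height_m ** 2)  # standard BMI formula
    return {
        "age": inputs.age,
        "bmi": round(bmi, 1),
        "first_degree_relative": int(inputs.first_degree_relative),
        "nulliparous": int(inputs.age_at_first_birth is None),
        "age_at_first_birth": inputs.age_at_first_birth or 0,
        "postmenopausal": int(inputs.postmenopausal),
        "previous_biopsy": int(inputs.previous_biopsy),
        "hormone_therapy": int(inputs.hormone_therapy),
    }


example = RiskInputs(
    age=52, weight_kg=70.0, height_cm=170.0,
    first_degree_relative=True, age_at_first_birth=None,
    postmenopausal=True, previous_biopsy=False, hormone_therapy=False,
)
features = to_features(example)
print(features)
```

Booleans are encoded as 0 or 1 so the record can be passed directly to a numeric model.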

How I built it

The data was obtained from the Breast Cancer Surveillance Consortium (BCSC) website. After data wrangling and feature engineering, a random forest model was trained as a classifier. Precision and recall were used for fine-tuning and performance evaluation. The app was built with Python, Flask, D3.js, and HTML.
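To make the two evaluation metrics concrete, here is a minimal sketch of how precision and recall are computed from true and predicted labels. The `precision_recall` function and the toy label lists are made up for illustration. They are not the project's evaluation code or its results.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels for illustration only.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Precision is the fraction of predicted positives that are correct, and recall is the fraction of actual positives the model finds. Both are more informative than accuracy when positive cases are rare.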

Challenges I ran into

Positive cases made up a much smaller share of the data than negative cases. Additional methods and further fine-tuning are needed to achieve the desired performance. There are many follow-up ideas for using the estimated risk that were not achievable in a short time frame.
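One common way to handle this kind of imbalance is to weight each class inversely to its frequency, so that errors on the rare positive class count for more during training. The sketch below computes such weights. The write-up does not say which method was tried, so this is an assumption, and the `balanced_class_weights` function and the 95/5 split are hypothetical. The formula is the same heuristic scikit-learn applies for `class_weight="balanced"`.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class as n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical label distribution: 95 negatives and 5 positives.
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)
print(weights)
```

With this split, a positive example receives a weight 19 times larger than a negative one, which is the ratio of the class counts.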

Accomplishments that I'm proud of

I created a data-driven solution that allows individuals to estimate their chance of developing breast cancer. The tool is affordable, easy to access, and flexible. The result also includes additional insights from the dataset that help users make better sense of the data. Several components of this project were successfully developed and integrated in a short time span.

What I learned

Datasets are not always clean. It is necessary to preprocess the data before feeding it into the analytics pipeline. Integrating multiple software frameworks involves many components that require attention to detail in fine-tuning and tweaking. Vision, technical knowledge, a business plan, and communication skills are equally important to the success of a project.

What's next for Pink Ribbon

The proof-of-concept prototype submitted in this hackathon is a web-based application. In the next phase of the project, a mobile app will be developed for both Android and iOS and made available as a free download. Once the mobile app has a sufficient user base, revenue can be generated through advertising. This revenue will be used to support additional features.
