Diagnosing Breast Cancer with MA-CNNs

Github: https://github.com/JasonSWu/CS1470-Final-Project

Final Writeup: https://drive.google.com/file/d/1LK68wWDRB2UZ8gaHFbL-vLgwa6H1f67H/view?usp=sharing

Final Video: https://drive.google.com/file/d/1OtDjr4npeHaquc4TCKG8v6lgZ4E29F9o/view?usp=sharing

Diagnosing Breast Cancer with Convolutional Neural Networks on Mammograms

Aaron Igra (aigra), Carter Moyer (cgmoyer), Jason Wu (jwu191)

Introduction: What problem are you trying to solve and why?

We aim to diagnose breast cancer from mammograms with higher accuracy than physicians. This work matters because it could make diagnoses more consistent and accurate for each patient and detect cancers earlier than physicians might, giving patients more time to get the care they need.

Related Work: Are you aware of any, or is there any prior work that you drew on to do your project?

We are reimplementing the paper “Classification of Mammogram Images Using Multiscale all Convolutional Neural Network (MA-CNN).” The paper uses an MA-CNN model to classify mammography scans as normal, benign, or malignant. It uses the mini-MIAS dataset and reports an accuracy of 0.99 (that is, 99%). Kaggle hosts a list of code posted by individuals using this dataset: https://www.kaggle.com/datasets/kmader/mias-mammography/code

Data: What data are you using (if any)?

First, we aim to match the paper’s accuracy and AUC on the mini-MIAS dataset. Then we will try to get good results on the VinDr-Mammo dataset. VinDr-Mammo is a large-scale dataset of full-field digital mammography (FFDM), the modern clinical standard for mammograms, whereas mini-MIAS is a much smaller, widely used set of digitized film mammograms. mini-MIAS is 443 MB, whereas VinDr-Mammo is 338 GB uncompressed. Both datasets will need significant preprocessing before our model can work with them.

Methodology: What is the architecture of your model?

We will train our model with a 60-20-20 train-dev-test split. The MA-CNN model consists entirely of convolutional layers, with a flatten, dense, and softmax layer at the end. The model begins by passing inputs through multiple dilated convolutional layers of differing scale in parallel to learn “multi-level abstract information.” Any max-pooling layers are replaced by convolutional layers with a stride greater than one to preserve important information.

The hardest part of implementing the paper will likely be the preprocessing. The paper details nuanced techniques for preprocessing images and cites two other papers for further explanation.

Metrics: What constitutes “success?”

We plan to repeat the paper’s experiment: build an MA-CNN model and test it on the mini-MIAS dataset. We will look at the false positive rate, false negative rate, sensitivity, and specificity. Ideally, our model is accurate whether or not someone has breast cancer; however, a low false negative rate is more important than a low false positive rate.
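As a rough illustration of how these four metrics relate, here is a minimal sketch in plain Python. It assumes the three-class output has been collapsed to a binary label (1 for cancer, 0 for no cancer), which is a simplification made for this example; the function name `binary_rates` is hypothetical.

```python
def binary_rates(y_true, y_pred):
    """Compute screening metrics from binary labels.

    Assumption for this sketch: labels are 1 (cancer) or 0 (no cancer).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cancers caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cancers missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged

    # Guard against division by zero when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_negative_rate": 1.0 - sensitivity,
        "false_positive_rate": 1.0 - specificity,
    }


rates = binary_rates([1, 1, 1, 1, 0, 0, 0, 0],
                     [1, 1, 1, 0, 0, 0, 1, 1])
print(rates)
```

Because the false negative rate is one minus sensitivity, prioritizing a low false negative rate is the same as prioritizing high sensitivity.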

Our base goal is to reimplement the model in the paper, and we hope to achieve accuracy similar to the authors’.
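A minimal Keras sketch of the kind of model described under Methodology might look like the following. It only illustrates the three ideas named there: parallel dilated convolutions, strided convolutions in place of max-pooling, and a flatten, dense, and softmax head. The input size, filter counts, dilation rates, and number of layers are assumptions made for this example and are not taken from the paper; `build_ma_cnn` is a hypothetical name.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_ma_cnn(input_shape=(128, 128, 1), num_classes=3,
                 dilation_rates=(1, 2, 3), filters=16):
    """MA-CNN-style sketch; all sizes here are illustrative assumptions."""
    inputs = tf.keras.Input(shape=input_shape)

    # Multiscale stage: the same input goes through parallel dilated
    # convolutions, each with a different receptive field.
    branches = [
        layers.Conv2D(filters, 3, padding="same", dilation_rate=rate,
                      activation="relu")(inputs)
        for rate in dilation_rates
    ]
    x = layers.Concatenate()(branches)

    # "All convolutional": downsample with stride-2 convolutions
    # where a conventional CNN would use max-pooling.
    for n in (32, 64):
        x = layers.Conv2D(n, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(n, 3, strides=2, padding="same",
                          activation="relu")(x)

    # Classification head: flatten, then dense with softmax over
    # normal / benign / malignant.
    x = layers.Flatten()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs, name="ma_cnn_sketch")


model = build_ma_cnn()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

The sparse categorical cross-entropy loss assumes integer class labels (for example 0, 1, 2); one-hot labels would use `categorical_crossentropy` instead.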

Our target is to achieve decent accuracy on a subset of a larger dataset, either CBIS-DDSM or VinDr-Mammo. These datasets are both over 100 GB, so we will be using a small subset to test on our personal machines.
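One possible way to draw such a subset and then apply the 60-20-20 train-dev-test split from the Methodology section is sketched below. It samples the same fraction from each class so the subset keeps the original class balance. The function name `subset_and_split`, the label strings, and the choice of stratified sampling are assumptions for illustration, not something the paper or either dataset prescribes.

```python
import random
from collections import defaultdict


def subset_and_split(labels, subset_fraction, seed=0):
    """Return (train, dev, test) lists of indices into `labels`.

    Each class is subsampled by `subset_fraction`, then split 60-20-20.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible

    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, dev, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        # Keep at least one example per class in the subset.
        keep = idxs[: max(1, round(len(idxs) * subset_fraction))]
        n_train = round(0.6 * len(keep))
        n_dev = round(0.2 * len(keep))
        train += keep[:n_train]
        dev += keep[n_train:n_train + n_dev]
        test += keep[n_train + n_dev:]  # remainder goes to test
    return train, dev, test


labels = ["normal"] * 100 + ["benign"] * 50 + ["malignant"] * 50
train, dev, test = subset_and_split(labels, subset_fraction=0.5)
print(len(train), len(dev), len(test))
```

For mammography data it may be safer to split by patient rather than by image, so that images from one patient do not end up in both training and test sets; this sketch does not handle that.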

Our stretch goal is to test the entirety of the large datasets with our model, using either Brown’s computing cluster or another online tool. We hope that we can achieve higher accuracy with a larger training set, and show whether an MA-CNN model is effective for diagnosing breast cancer.

https://www.kaggle.com/datasets/awsaf49/cbis-ddsm-breast-cancer-image-dataset

https://physionet.org/content/vindr-mammo/1.0.0/

Ethics:

There are several societal issues related to this project. First is the matter of people’s, predominantly women’s, health. Breast cancer kills 42,000 women each year in the United States alone, with many more women dying from breast cancer around the world. As such, this is very important work, but it also needs to be done with a great deal of care because it involves people’s personal health.

We also have to consider the risk of our model discriminating on the basis of race/ethnicity, age, and other factors because the dataset is composed primarily of certain groups. Due to systemic racism and other forms of discrimination in the health care system, some groups of people might be prioritized over others when receiving mammograms, leading to a bias in our model’s predictions. If possible, we should work to reduce this issue, or at a minimum be cognizant of it.

The major stakeholders in properly diagnosing breast cancer are the patients, their families, the clinical staff who take the mammograms, and physicians. Two types of mistakes can be made: a false positive and a false negative. A false positive is highly undesirable because it can cause the patient to undergo incredibly traumatic (and likely expensive) treatment with lasting consequences, none of which was necessary and which could prove very harmful. A false negative is also incredibly dangerous because it could give a patient and their physician a false sense of security, only for the cancer to emerge later and prove much more dangerous. This reduces the time the patient has to get treated, if they get treatment at all, and could similarly have very negative health effects, possibly even leading to their death.

Division of labor:

All members will be equally responsible for contributing to both preprocessing and model building.

Built With

tensorflow