Title

NeuroDet: Deep learning models for brain tumor classification

Team members

Hyeyeon Hwang (hhwang16)
Alexandra Wong (amw11)
Victor Youdom Kemmoe (vyoudomk)

Introduction & Inspiration

What problem are we trying to solve and why? If you are doing something new, detail how you arrived at this topic and what motivated you.

We are interested in applying deep learning models to answer biological questions. Exploring the types of biological questions we could address, we decided to focus on neuroscience and, in particular, on detecting types of brain tumors. Typically, when an abnormality is seen on an MRI or CT scan, the only way to confirm what type of tumor is present is to perform a brain biopsy, a very invasive diagnostic test. Different brain tumor subtypes have very different levels of aggressiveness and treatment options. If we could create a method to diagnose patients from their imaging, we could spare patients the pain and risk of complications associated with obtaining a brain biopsy.

What kind of problem is this? Classification? Regression? Structured prediction? Reinforcement Learning? Unsupervised Learning? Etc.

Our project is a classification task. We are classifying MRI images by whether a tumor is present and, if so, whether it is a malignant or benign subtype. We are also interested in classifying the images into four classes (no tumor, benign tumor, malignant tumor, and pituitary tumor).

Related Work

Related papers

Much of the prior work on brain tumor classification utilizes models based on deep convolutional neural networks. For example, “Differential Deep Convolutional Neural Network Model for Brain Tumor Classification” by Abd El Kader I et al. (https://pubmed.ncbi.nlm.nih.gov/33801994/) introduces a differential deep convolutional neural network (deep-CNN) trained and tested on 25,000 brain magnetic resonance imaging (MRI) images, achieving an accuracy of 99.25%. “Brain Tumor Detection and Classification on MR Images by a Deep Wavelet Auto-Encoder Model” by Isselmou Abd El Kader et al. (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8471235/) implements a deep wavelet autoencoder for tumor detection and classification with high accuracy. “CNN-RNN: A Unified Framework for Multi-label Image Classification” by Wang et al. (https://arxiv.org/pdf/1604.04573.pdf) introduces a combined CNN-RNN model that outperforms state-of-the-art multi-label classification models.

Living implementations

https://github.com/aksh-ai/neuralBlack
https://github.com/Nupurgopali/Brain-tumor-classification-using-CNN
https://github.com/AryanFelix/Brain-Tumor-Classification
https://github.com/MohamedAliHabib/Brain-Tumor-Detection

Data

Accessing data

The brain MRI image dataset we plan to use is located here: https://www.kaggle.com/sartajbhuvaji/brain-tumor-classification-mri?select=Testing

How big is it? Will you need to do significant preprocessing?

There are 3,264 MRI images in total, divided into the following classes, each already split into training and testing folders. Because the images come pre-organized into labeled folders, we expect to need only light preprocessing (e.g., resizing and normalization).
Glioma (malignant tumor): 826 Train Files, 100 Test Files
Meningioma (benign tumor): 822 Train Files, 115 Test Files
No Tumor (healthy control): 395 Train Files, 105 Test Files
Pituitary Tumor (mostly benign tumor): 827 Train Files, 74 Test Files

Methodology

Our goal is to detect whether a tumor is present and whether a tumor is benign or malignant. This involves two binary classification tasks: (1) tumor present versus no tumor and (2) benign versus malignant tumor. We also want to classify brain tumor MRI images into four classes (no tumor, benign tumor, malignant tumor, and pituitary tumor) and perform multi-label classification of pituitary tumors, which may be benign or malignant. For these tasks, we plan to implement CNNs, RNNs, CNN-RNNs, and VAEs.

As CNNs are widely used for image classification and previous work on brain tumor classification has used this architecture successfully, we decided to build a CNN as our baseline model.
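A minimal sketch of what such a baseline could look like in PyTorch. The layer widths, kernel sizes, and input resolution are illustrative assumptions, not the final architecture:

```python
import torch
import torch.nn as nn

class BaselineCNN(nn.Module):
    """Illustrative baseline CNN for 4-way tumor classification.

    Expects single-channel 224x224 MRI slices; all layer sizes here
    are assumptions chosen for the sketch, not tuned values.
    """
    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),             # 224 -> 112
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),             # 112 -> 56
            nn.AdaptiveAvgPool2d(1),     # global average pool -> (batch, 32, 1, 1)
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x).flatten(1)  # (batch, 32)
        return self.classifier(h)        # raw logits; pair with CrossEntropyLoss

model = BaselineCNN()
logits = model(torch.randn(2, 1, 224, 224))  # batch of 2 dummy slices
```

The same skeleton covers the binary tasks by setting `num_classes` to 2 (or to 1 with a sigmoid output).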

Although most pituitary tumors are benign, they can be malignant, so we are also interested in whether a tumor is a pituitary tumor and whether it is benign or malignant. As previous work on multi-label image classification has used RNNs and hybrid CNN-RNNs successfully, we plan to implement RNNs and CNN-RNNs for single-label and multi-label classification of the brain tumors.
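The full CNN-RNN framework of Wang et al. decodes labels sequentially with an RNN; as a simpler starting point, the multi-label setup can be sketched with independent sigmoid outputs per label. The label set and feature size below are hypothetical:

```python
import torch
import torch.nn as nn

# Hypothetical multi-label head: each image gets independent probabilities
# for labels such as {tumor_present, malignant, pituitary}. The 32-dim
# feature size and the 3-label scheme are assumptions for this sketch.
num_labels = 3
head = nn.Linear(32, num_labels)    # maps CNN features to per-label logits
criterion = nn.BCEWithLogitsLoss()  # independent sigmoid + BCE per label

features = torch.randn(4, 32)            # dummy CNN features, batch of 4
targets = torch.tensor([[1., 1., 0.],    # malignant, non-pituitary tumor
                        [1., 0., 1.],    # benign pituitary tumor
                        [0., 0., 0.],    # no tumor
                        [1., 1., 1.]])   # malignant pituitary tumor
loss = criterion(head(features), targets)
```

Unlike softmax over four exclusive classes, this treats "pituitary" and "malignant" as labels that can co-occur, which matches the benign-or-malignant ambiguity of pituitary tumors.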

As studies have shown that VAEs achieve good performance on image classification, we may also implement and assess a VAE model for binary and multi-class classification of the different tumor types.
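A toy sketch of the VAE component, with the reparameterization trick and the KL term of the ELBO spelled out. The flattened input size and latent dimension are placeholder assumptions; in practice the latent code (or encoder features) could feed a small classifier:

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """Toy VAE sketch; all dimensions are illustrative assumptions."""
    def __init__(self, in_dim: int = 784, latent_dim: int = 8):
        super().__init__()
        self.enc = nn.Linear(in_dim, 32)
        self.mu = nn.Linear(32, latent_dim)
        self.logvar = nn.Linear(32, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, in_dim), nn.Sigmoid())

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z = mu + sigma * eps
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.dec(z), mu, logvar

vae = TinyVAE()
recon, mu, logvar = vae(torch.rand(2, 784))  # batch of 2 flattened dummy images
# KL term of the ELBO: divergence of q(z|x) from the unit Gaussian prior
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()
```

The training loss would add a reconstruction term (e.g., binary cross-entropy between `recon` and the input) to `kl`.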

Metrics

We will assess model performance using accuracy, area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUC-PR). Note that for binary classification of benign versus malignant tumors, a false positive can lead to unnecessary treatment, but a false negative is far more serious, since a malignant tumor that requires urgent treatment goes unnoticed. Therefore, minimizing false negatives is the higher priority.
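These metrics can be computed with scikit-learn; a small sketch on made-up labels and scores, which also illustrates how lowering the decision threshold trades extra false positives for fewer false negatives:

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score, average_precision_score

# Toy example: 1 = malignant, 0 = benign; labels and scores are made up.
y_true = np.array([0, 0, 1, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2])  # model probabilities

# A threshold below 0.5 flags more cases as malignant, reducing false
# negatives (the costlier error here) at the price of false positives.
y_pred = (y_score >= 0.3).astype(int)

acc = accuracy_score(y_true, y_pred)
auroc = roc_auc_score(y_true, y_score)           # area under the ROC curve
aupr = average_precision_score(y_true, y_score)  # area under the PR curve
```

AUROC and AUC-PR are computed from the raw scores rather than thresholded predictions, so they summarize performance across all possible thresholds.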

Our base goal is to implement the baseline CNN and an RNN model.
Our target goal is to implement the baseline CNN, an RNN, and a CNN-RNN model.
Our stretch goal is to implement the baseline CNN, an RNN, a CNN-RNN, and a VAE model.

Ethics

What is your dataset? Are there any concerns about how it was collected, or labeled? Is it representative? What kind of underlying historical or societal biases might it contain?

The dataset is a collection of MRI images that have been deidentified to protect patients’ privacy. Although the lack of information about the patients makes it difficult to tell how biased the dataset is, several factors could bias the collection of images. For instance, were the sexes represented evenly in the dataset? Were patients of different ancestries represented well? Were the patients all from the same geographical regions, and what types of environmental factors were they exposed to? Was the radiologist who labeled the images actually correct, given that radiologists are human and can make errors? MRI scans also come in many slices, so the particular axis and slice an image came from could bias the model toward classifying a tumor only when the slice comes from the same area as the training set. Each of these factors could skew classification toward a particular subpopulation in a way that does not generalize to the broader public, making the model less useful as a general diagnostic test.

Who are the major “stakeholders” in this problem, and what are the consequences of mistakes made by your algorithm?

There are a variety of stakeholders in any deep learning application in the medical field. The most important stakeholder is the patient, whose diagnosis could depend on such an algorithmic tool. This matters hugely because a wrong diagnosis could mean receiving the wrong treatment, with consequences for survival, health, quality of life, and financial stability. Additionally, unnecessary treatment consumes healthcare resources, which matters for both providers (such as physicians) and payers (such as health insurance companies) who need to distribute those resources to other patients in need. Misdiagnosing certain groups of people because of biases in the dataset could also further entrench health inequities.

Division of labor

We all equally contributed to the coding, write-ups, poster, and oral presentation of this project.

Final Write-up

Please find our final write-up on Google Drive: https://drive.google.com/file/d/16tdn1w8CPKrI3p5Wzky9_j1GvSmzTF5Z/view?usp=sharing

Final Presentation Slides (must be a Brown University user to view)

https://docs.google.com/presentation/d/1aGnEJ0mCa4BpfIS2HKOCjQkjNjrEhgmXVF9Mmq2RGI4/edit?usp=sharing
