Title: Tumor Segmentation

Who: Abigail Marks (amarks3), Ria Panjwani (rpanjwa2), Priyanka Solanky (psolanky)

Introduction: What problem are you trying to solve and why?

According to the CDC, each year in the United States, 24,500 men and 10,000 women get liver cancer, and 18,600 men and 9,000 women die from the disease. As physicians diagnose patients, they often use Magnetic Resonance Imaging (MRI) to visualize liver tumors from tomography images. Analyzing structural MRI scans is time-consuming, so automatic and robust liver tumor segmentation can help radiologists glean information about tumor size and shape, enhancing personalized treatment plans. Deep learning models, such as the one we want to implement, have had great success with medical segmentation tasks, allowing physicians to visualize tumors, weigh treatment options, and ultimately improve patient outcomes.

The paper we are implementing describes automatic brain tumor detection and segmentation using U-Net-based fully convolutional networks. The paper's objective is to provide brain tumor segmentation that is accurate and automatic. High accuracy allows for precise classification of tumor subtypes and border delineation, both useful in surgical planning and in tracking tumor growth or shrinkage.

We chose this paper because we find this application exciting and extremely pertinent to real-life outcomes. Replicating the architecture this paper outlines has the potential to save lives, money, and time. We have also found from personal experience that finding large and robust datasets is a challenge, so we looked at many different projects and felt comfortable working with medical imaging data. As group members, we all found the project attainable given our understanding of deep learning, relevant to the real world with a tangible practical application, and a challenge for us to work through together.
This segmentation task turns the representation of the tumor within an image into something more meaningful and easier to analyze. In this way, segmentation allows for automatic identification of the location of liver tumors within the liver, with clear boundaries around the edges of each tumor. This task is a problem of its own: it is not quite image classification as done with MNIST, nor is it an ordinary prediction task, since the output is a label for every pixel.
Related Work: Are you aware of any, or is there any prior work that you drew on to do your project?

The paper we are following looks at tumor segmentation in the brain. We are interested in seeing whether the same methods and model can be applied to a different part of the body. That said, the data we are using comes from a segmentation competition, so there are many resources available.

One source we read describes the general architecture of a U-Net. It is fully convolutional and consists of an encoder and a decoder. The encoder convolves the images and downsamples to reduce their size. The decoder uses transpose convolutions to work back up to the original image size, concatenating in earlier states along the way. The authors note that convolution is generally used for classification, where a dense layer sits at the end; however, convolution can also be used for image segmentation, where the image is divided into localized areas.

The other paper focuses heavily on data augmentation, since there is generally not much data available for biomedical tasks. That paper looked specifically at HeLa cells and augmented the data with random elastic transformations; using these transformations, the authors achieved high accuracy with very few data samples.

We haven't found an implementation of this exact paper; however, there are papers that do similar tasks, including some written for liver segmentation and some that use 3D data. We considered going down the 3D route, but struggled to find datasets large enough to confidently accomplish the task. We have found the following:
https://github.com/naldeborgh7575/brain_segmentation
https://github.com/assassint2017/MICCAI-LITS2017
Others will be added as we stumble across them.
Data: What data are you using (if any)?

For our project, we will use a dataset obtained from Kaggle that contains liver tumor scans stored in 3D image format. The paper we are following uses 2D images, so we plan to slice the 3D volumes into 2D images along the third axis; this is much less time-consuming than gathering new 2D data and will provide many more images for training. Once the images are sliced, much of the preprocessing, as well as the segmentation itself, can remain the same. This dataset contains approximately 260 images with a liver tumor, and we will be able to add healthy images in order to further train our model.

We can also try to use the BraTS brain tumor dataset, which is likewise in 3D format. It has approximately 300 images that contain a tumor, so it can be used to evaluate whether tumors can be detected. The BraTS dataset is much larger and will need significant preprocessing before we can use it.

In both cases the data is 3D, and we will use the same model to segment the images and identify the tumor. For preprocessing, we will split the dataset into training and validation sets, use k-fold cross-validation, and normalize the data over the whole-brain region. This procedure follows the methods outlined in the paper. We will continue to search for datasets that could be used with the model we are interested in.
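To make the slicing-and-splitting plan concrete, here is a minimal sketch in numpy. The random array stands in for a loaded .nii volume (in practice we would load the real scan, e.g. with nibabel), and the shapes are placeholders, not properties of the actual dataset:

```python
import numpy as np

# Hypothetical 3D scan volume standing in for a loaded .nii file
# (a real scan would be loaded from disk instead of generated randomly).
volume = np.random.rand(240, 240, 155)  # height x width x depth

# Slice along the third axis to obtain 2D training images.
slices = [volume[:, :, k] for k in range(volume.shape[2])]

# Simple 80/20 train/validation split of the resulting slices.
rng = np.random.default_rng(0)
idx = rng.permutation(len(slices))
cut = int(0.8 * len(slices))
train = [slices[i] for i in idx[:cut]]
val = [slices[i] for i in idx[cut:]]
```

For k-fold cross-validation, the same shuffled index array can simply be partitioned into k chunks instead of two.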
Methodology: What is the architecture of your model? How are you training the model?

First we will need to preprocess our data. We are using .nii files that come from MRIs. The MRIs cover the whole body, not just the area around the liver, so we will need to get our images into the right form. They are also in three dimensions, so we will slice the data along one axis to get 2D images. The slices will also have to be a reasonable size, given that convolution is not a computationally light task. If needed, we will also perform some data augmentation to help our accuracy.

We will be training a U-Net model for segmentation. A U-Net is a fully convolutional model that first extracts the features of the image with a series of convolution and max-pooling layers. Once it has downsampled to a certain point, it begins to upsample again, using skip connections to enhance the context of the upsampling. This is a form of supervised learning, so we will train the model over a series of epochs and in batches.
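As a rough illustration of the kind of augmentation we have in mind, the sketch below (our own, using simple flips and 90-degree rotations rather than the elastic deformations described in the augmentation paper) applies the same random transform to an image slice and its segmentation mask so the two stay aligned:

```python
import numpy as np

def augment(image, mask, rng):
    """Apply the same random flip/rotation to a slice and its segmentation mask."""
    if rng.random() < 0.5:          # random horizontal flip
        image, mask = np.fliplr(image), np.fliplr(mask)
    k = int(rng.integers(0, 4))     # random multiple of 90 degrees
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    return image, mask
```

Applying the identical transform to image and mask is the important part; augmenting only the image would corrupt the pixel-level labels.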
The structure of the model will be as follows:

Encoder:
Layer 1 (image size 240x240): Conv2D 3x3, 64 filters; Conv2D 3x3, 64 filters; save output for later; max pooling 2x2
Layer 2 (image size 120x120): Conv2D 3x3, 128 filters; Conv2D 3x3, 128 filters; save output for later; max pooling 2x2
Layer 3 (image size 60x60): Conv2D 3x3, 256 filters; Conv2D 3x3, 256 filters; save output for later; max pooling 2x2
Layer 4 (image size 30x30): Conv2D 3x3, 512 filters; Conv2D 3x3, 512 filters; save output for later; max pooling 2x2

Bottleneck (bottom of the U): Conv2D 3x3, 1024 filters; Conv2D 3x3, 1024 filters; deconv 3x3

Decoder:
Layer 5: copy and concatenate saved output from layer 4; Conv2D 3x3, 512 filters; Conv2D 3x3, 512 filters; deconv 3x3
Layer 6: copy and concatenate saved output from layer 3; Conv2D 3x3, 256 filters; Conv2D 3x3, 256 filters; deconv 3x3
Layer 7: copy and concatenate saved output from layer 2; Conv2D 3x3, 128 filters; Conv2D 3x3, 128 filters; deconv 3x3
Layer 8: copy and concatenate saved output from layer 1; Conv2D 3x3, 64 filters; Conv2D 3x3, 64 filters; Conv2D 1x1, 2 filters

The paper uses an Adam optimizer with a learning rate of 0.0001. The weights will be initialized from a random normal distribution with mean 0 and standard deviation 0.01, and all biases will be initialized to 0.

If you are implementing an existing paper, detail what you think will be the hardest part about implementing the model here. Making the shapes of the encoder outputs match the shapes in the decoder will be tricky, as the encoder feature maps may have to be cropped in order to create a skip connection into the decoder. Additionally, fine-tuning the hyperparameters to work for the liver dataset might be tricky given how long training takes.
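As a sanity check on the image sizes listed above, a small helper (our own sketch, assuming 'same'-padded 3x3 convolutions that preserve spatial size, unlike the unpadded convolutions of the original U-Net paper) can trace the feature-map side length through each 2x2 max-pooling step of the encoder:

```python
def encoder_sizes(input_size=240, num_pools=4):
    """Side length of the feature map after each 2x2 max pool in the encoder."""
    sizes = [input_size]
    for _ in range(num_pools):
        sizes.append(sizes[-1] // 2)  # each 2x2 max pool halves the side length
    return sizes

# With a 240x240 input and 4 pooling steps, the bottleneck sits at 15x15.
```

Under unpadded convolutions each side also shrinks by 2 pixels per 3x3 convolution, which is exactly why the saved encoder outputs may need cropping before concatenation.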
Metrics: What constitutes “success?”

For our project, success will be measured by how accurately our model predicts whether and where a tumor is present. Accuracy will be measured with the Soft Dice Similarity Coefficient, a modified version of the Dice Similarity Coefficient, which for sets X and Y is:

DSC = 2|X ∩ Y| / (|X| + |Y|)

To calculate the soft Dice coefficient over a predicted probability map, we can use:

soft DSC = 2 · Σ(y_true · y_pred) / (Σ y_true² + Σ y_pred²)

For the denominator, research has shown that squaring the y_true and y_pred values works better for some loss evaluations, while taking the simple sum (Σ y_true + Σ y_pred) works better for others; we will try both variants and see what works best. We also plan to use similarity and intersection-over-union tests in order to evaluate accuracy. The notion of “accuracy” applies to our project because we are measuring how similar the predicted tumor location is to the actual location of the tumor. When computing the soft Dice coefficient, we can use a target mask to zero out any values that are not activated in the prediction array.

Our base goal for this project is to achieve an accuracy of 65% over the dataset for the detection of tumors within MRI images. Our target goal is 80%, and our stretch goal is approximately 91%, as measured by the Dice similarity coefficient.
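A minimal numpy sketch of the metric, covering both denominator variants discussed above (the `eps` smoothing term is our own addition to avoid division by zero on empty masks):

```python
import numpy as np

def soft_dice(y_true, y_pred, squared=True, eps=1e-7):
    """Soft Dice similarity between a binary target mask and a predicted map.

    squared=True uses sum(y_true^2) + sum(y_pred^2) in the denominator;
    squared=False uses the simple sums instead.
    """
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    if squared:
        denom = np.sum(y_true ** 2) + np.sum(y_pred ** 2)
    else:
        denom = np.sum(y_true) + np.sum(y_pred)
    return (2.0 * intersection + eps) / (denom + eps)
```

A perfect prediction scores 1, a completely disjoint one scores near 0, and `1 - soft_dice(...)` can serve directly as a training loss.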
Ethics: Why is Deep Learning a good approach to this problem?

Segmentation involves partitioning images into multiple segments, essentially a pixel-level classification. Deep learning models are very good at solving this problem, and do so across computer vision applications from road-sign detection to video surveillance. We have seen that deep learning models are excellent at classification tasks, but segmentation extends the scope by detecting and delineating all features of interest in an image. From a paper surveying image segmentation using deep learning, the authors noted that “over the past few years,... deep learning (DL) models have yielded a new generation of image segmentation models with remarkable performance improvements—often achieving the highest accuracy rates on popular benchmarks—resulting in a paradigm shift in the field.”

Deep learning can be used in conjunction with a physician’s own analysis of MRI scans and other visualization techniques. This automatic analysis provides a helpful first viewing that can deliver more immediate, actionable insights. We believe that this technology, partnered with individual analysis, can reach a high degree of accuracy on a problem that requires the highest accuracy level possible.
Who are the major “stakeholders” in this problem, and what are the consequences of mistakes made by your algorithm?

The major stakeholders in this problem are the developers involved in creating the model and aggregating relevant data, the physicians who interface with the results and trust their accuracy, and the patients whose outcomes are affected by treatment decisions. The developers need to be responsive to physicians’ complaints, as accuracy is life-or-death. They also need to ensure the data they use to build the model is representative of all patient and tumor types. The CDC makes a point of noting differing statistics of liver disease for men vs. women: if there is a difference in visualizing liver tumors in male vs. female bodies, the dataset needs to incorporate these deviations. While developers care about accuracy, they also need to consider speed. The value of deep learning models is their speed relative to humans, but what tradeoffs are being made with regard to accuracy or memory, and how are those concerns addressed?

The consequences of mistakes made by the algorithm are extremely high stakes: inaccurate delineation of tumor borders can result in missteps by physicians during surgery, and therefore a myriad of potential harms for patients, often life-or-death. The importance of high accuracy cannot be overstated, so patients should be empowered to know that deep learning models are being used to determine their treatment plan, in addition to their oncologist’s analysis.
Division of labor: Briefly outline who will be responsible for which part(s) of the project. For this project, we will divide up the work so that two people are involved in every part of the project, with one member acting as the primary lead and the other assisting. This ensures that there is someone to consult on every part of the project, while each group member can still work independently on their own parts without two members duplicating work on the same part at the same time.
Our initial goal is to use the following scheme:
- Preprocessing: Ria (main) + Priyanka (assist)
- Creating the model: Abigail (main) + Priyanka (assist)
- Training: Priyanka (main) + Abigail (assist)
- Testing: Priyanka (main) + Ria (assist)
- Loss/Accuracy Functions: Ria (main) + Abigail (assist)
We will also hold weekly meetings to review our progress and discuss any problems we are facing as a group.
Writeup: https://docs.google.com/document/d/1K0PU47YJYTQ-W7_S8KNeIQ49CsQwG_LlOv6QxgcouEQ/edit?usp=sharing
Built With
- tensorflow
