
Introduction

Our model will be used to predict the outcome of mechanical thrombectomy on a stroke patient before the operation even occurs. The project we are tackling is a supervised image classification problem: we take in CTA images (CT angiography scans, essentially X-rays of the skull) along with other relevant features about the patient (such as gender, age, and type of injury), and from these we predict the TICI score and the number of passes needed to complete a successful operation. The TICI score measures the amount of blood flow through a blood vessel after the operation has been completed. TICI scores fall into discrete classes (0, 1, 2a, 2b, 3), which will be output by our model.
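Because the TICI grades are categorical labels rather than numbers, the classifier predicts one of five class indices. A minimal sketch of that mapping (the grade list comes straight from the scale above; the function names are our own):

```python
# Map the discrete TICI grades to integer class indices for a classifier.
# Grades taken from the TICI scale described above.
TICI_GRADES = ["0", "1", "2a", "2b", "3"]
GRADE_TO_INDEX = {grade: i for i, grade in enumerate(TICI_GRADES)}
INDEX_TO_GRADE = {i: grade for grade, i in GRADE_TO_INDEX.items()}

def encode_tici(grade: str) -> int:
    """Convert a TICI grade label (e.g. "2b") to a class index."""
    return GRADE_TO_INDEX[grade]

def decode_tici(index: int) -> str:
    """Convert a predicted class index back to its TICI grade."""
    return INDEX_TO_GRADE[index]
```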

Challenges

One of the largest challenges we’ve faced so far is preprocessing the data. As mentioned earlier, the data is located on a hospital machine, and we are only allowed to access it via remote desktop software. Because we are working with real patient data, we have to be cognizant of HIPAA guidelines and make sure we handle the data ethically. Additionally, only a small portion of the available dataset is applicable to our project, so we needed a way to select just the data we needed from the complete MIPS dataset we were given. To do this, we had to write our own script just to identify the relevant examples (without doing any preprocessing on them yet).
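A sketch of that kind of selection script, assuming a hypothetical metadata CSV with `patient_id` and `procedure` columns (the real MIPS schema differs, and the actual script runs only on the hospital machine):

```python
import csv

def select_relevant_rows(metadata_path: str, procedure_name: str) -> list:
    """Return metadata rows matching the procedure we care about.

    Hypothetical sketch: assumes a 'procedure' column in the metadata
    CSV; the real MIPS dataset's columns differ. No image preprocessing
    happens here -- we only identify which examples are relevant.
    """
    relevant = []
    with open(metadata_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["procedure"] == procedure_name:
                relevant.append(row)
    return relevant
```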

While experimenting with preprocessing, we have seen that each example consists of an image as well as other features (age, gender, etc.) that all play a role in our classification problem. Another challenge we are currently facing is how to combine the image and the other features into a single feature space. One solution we have brainstormed is to run the image by itself through a 3D convolutional network, produce an output, and then combine the output of the convolutional layers with the additional features to produce the final output of the model.
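This late-fusion idea can be sketched in plain numpy, with random placeholders standing in for the real network: the convolutional branch reduces the 3D scan to a feature vector, which is then concatenated with the tabular features before a final classification layer. All shapes here are illustrative assumptions, not our final architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for real data: one 64-slice CTA volume and 3 tabular features
# (e.g. age, gender, injury type) -- shapes are illustrative only.
volume = rng.standard_normal((64, 128, 128))
tabular = np.array([67.0, 1.0, 2.0])

# Placeholder for the convolutional branch: the real model would use a
# 3D CNN to produce a learned embedding; here we just take a 256-dim slice.
conv_features = volume.reshape(-1)[:256]

# Late fusion: concatenate image features with tabular features, then a
# single linear layer maps the fused vector to the 5 TICI classes.
fused = np.concatenate([conv_features, tabular])   # shape (259,)
weights = rng.standard_normal((5, fused.shape[0]))
logits = weights @ fused                            # shape (5,)
predicted_class = int(np.argmax(logits))
```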

Although we are only starting to implement the model, the largest challenge we foresee is performing 3D convolution on our examples. We discovered that each CT scan doesn’t take the same format as the images we saw in the CNN assignment: each example is a stack of 64 different CT images, taken from different angles and synthesized into a 3D model. Because we cannot perform traditional 2D convolution on this input, we will need to look into how 3D convolution is implemented.
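To make the 3D case concrete, here is a naive loop-based single-channel 3D convolution in numpy. The 64-slice depth mirrors the stack size above; a real model would use an optimized framework implementation rather than this sketch:

```python
import numpy as np

def conv3d_naive(volume: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid-mode, single-channel 3D convolution (technically
    cross-correlation, as in most deep-learning frameworks).
    Purely illustrative -- O(output * kernel) nested loops."""
    kd, kh, kw = kernel.shape
    d, h, w = volume.shape
    out = np.zeros((d - kd + 1, h - kh + 1, w - kw + 1))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                # Sum the elementwise product of the kernel with the
                # 3D window it currently covers.
                out[z, y, x] = np.sum(volume[z:z+kd, y:y+kh, x:x+kw] * kernel)
    return out

# A 64-slice volume like our CT stacks, with a 3x3x3 kernel.
volume = np.ones((64, 8, 8))
kernel = np.ones((3, 3, 3))
result = conv3d_naive(volume, kernel)  # shape (62, 6, 6), every value 27.0
```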

Insights

We have focused on data preprocessing up to this point; as such, we do not yet have any concrete results to show with regard to model output. However, we are almost done with preprocessing: we have parsed the relevant examples, converted the images into numpy arrays that we can feed into our model, and can read in the non-image features from a CSV. All that remains is to separate the data into training/validation/test splits and possibly one-hot encode some of the features.
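Those two remaining steps can be sketched with numpy alone; the 70/15/15 ratio and the placeholder labels below are assumptions for illustration, not our final choices:

```python
import numpy as np

rng = np.random.default_rng(42)

# Placeholder dataset: 100 examples with integer-coded categorical labels
# (e.g. the five TICI classes coded 0..4).
n = 100
labels = rng.integers(0, 5, size=n)

# Shuffled 70/15/15 train/validation/test split over example indices.
indices = rng.permutation(n)
train_idx = indices[:70]
val_idx = indices[70:85]
test_idx = indices[85:]

# One-hot encode an integer-coded categorical feature with np.eye:
# row i of the result has a 1 in column labels[i] and 0 elsewhere.
one_hot = np.eye(5)[labels]  # shape (100, 5)
```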

Plan

We are in a good state with our project! We should be able to finish preprocessing shortly and proceed to implementing the model’s infrastructure. As stated above, the main challenges we will be dedicating our time to are 3D convolution and combining image and non-image features to produce an output. Preprocessing has gone smoothly, and we do not anticipate changes to that portion. Because we are still in the process of implementing the model, our plans remain flexible with regard to infrastructure.
