Uneven access to healthcare across the country and a shortage of qualified healthcare professionals are major barriers to both preventive and curative health services. There is also a glaring disparity between rural and urban India in access to medical diagnosis and other healthcare facilities. About 55 million people in India suffer from chronic obstructive pulmonary disease, according to a global study, which noted that people in less developed states are more prone to the disease than those living in developed ones. Chronic respiratory diseases were responsible for 10.9% of total deaths and 6.4% of disability-adjusted life years (DALYs) in India in 2016, compared with 9.5% and 4.5% respectively in 1990. Of the total global DALYs due to chronic respiratory diseases in 2016, 32% were in India.
Fast, inexpensive computer-aided diagnosis would be extremely beneficial for people in rural India, where there is an acute shortage of trained radiologists and doctors per capita. An automated AI system that can reliably identify different disorders from chest X-rays would be invaluable in addressing reporting backlogs and the lack of radiologists in low-resource settings. Our study aims to demonstrate that AI-based computer-aided diagnosis, in the form of a deep learning algorithm trained on a large quantity of well-labelled data, can accurately detect multiple abnormalities on chest X-rays. As these systems become more accurate, applying deep learning to widen the reach of automatic chest X-ray interpretation and improve reporting efficiency could add tremendous value to radiology workflows and public health screening in the Indian healthcare sector, particularly in rural India.
What it does
Our end-to-end AI-based solution addresses the problem of multi-label thorax disease classification on chest X-ray images. The model outputs a predicted score for each of the 14 abnormalities associated with thoracic disease, along with a class activation map that highlights the regions of the X-ray characteristic of each abnormality.
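To make the two outputs concrete, here is a minimal, framework-agnostic NumPy sketch of how 14 per-label scores and a class activation map can be derived from the same feature maps. It assumes the network ends in global average pooling followed by a single linear layer (the standard class activation mapping setup); the shapes, random values, and function names are hypothetical and this is not the project's actual code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def class_activation_map(features, class_weights):
    """Weighted sum of feature maps for one class.

    features:      (K, H, W) feature maps from the last convolutional layer
    class_weights: (K,) weights of that class in the final linear layer
    """
    cam = np.tensordot(class_weights, features, axes=([0], [0]))  # (H, W)
    cam = np.maximum(cam, 0.0)  # display choice: keep positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam  # scale to [0, 1] for overlaying

# Hypothetical stand-ins for a trained network's activations and weights.
rng = np.random.default_rng(0)
features = rng.random((8, 7, 7))       # 8 channels on a 7x7 grid
fc_weights = rng.normal(size=(14, 8))  # one row per pathology
fc_bias = np.zeros(14)

pooled = features.mean(axis=(1, 2))              # global average pooling -> (8,)
scores = sigmoid(fc_weights @ pooled + fc_bias)  # 14 independent probabilities
top_class = int(scores.argmax())
cam = class_activation_map(features, fc_weights[top_class])
```

In practice the low-resolution map would be upsampled to the input size and overlaid on the X-ray as a heatmap.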
How I built it
We used PyTorch and the fastai library to design our model. The data came from the NIH ChestX-ray14 dataset. We trained our model on a GTX 1080 Ti.
Our approach uses an ensemble of fine-tuned (transfer learning) Xception and DenseNet-121 architectures. Fine-tuning uses state-of-the-art techniques such as cyclical learning rates. To address the class imbalance issue we employed the recently proposed _class-balanced loss_ by Yin Cui, Menglin Jia et al., 2019.
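As a rough illustration of the class-balanced idea, the sketch below computes per-class weights from the "effective number of samples", (1 - beta^n) / (1 - beta), and normalizes them so they sum to the number of classes. The class counts and the value of beta are hypothetical, and how the weights were plugged into the training loss here is not shown; this is only the weighting step under those assumptions.

```python
def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class loss weights, inversely proportional to the effective
    number of samples of each class. Counts must be positive."""
    effective_num = [(1.0 - beta ** n) / (1.0 - beta) for n in samples_per_class]
    raw = [1.0 / e for e in effective_num]
    scale = len(raw) / sum(raw)  # normalize so the weights sum to the class count
    return [w * scale for w in raw]

# Hypothetical counts: one common, one moderately rare, one rare label.
counts = [10000, 1000, 100]
weights = class_balanced_weights(counts)
```

Rare classes receive larger weights. With beta = 0 every class gets weight 1 (no re-weighting), and as beta approaches 1 the weights approach plain inverse-frequency weighting.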
Further, the output of our fine-tuned transfer learning model is connected to a residual attention network to further improve accuracy.
In the previous work by Rajpurkar et al. (CheXNet), all pathologies are treated equally in classifier learning, i.e., when predicting the labels of each image, all pathologies are given the same weight. However, the labels are correlated: for example, the presence of cardiomegaly is usually accompanied by a high risk of pulmonary edema. Exploring the dependency or correlation among labels can therefore help reinforce the intrinsic relationships between some categories. At the same time, for an individual image, uncorrelated labels may introduce unnecessary noise and hinder the classifier from learning powerful features. We wanted to mitigate the interference of uncorrelated classes while preserving correlations among the relevant classes.
To accomplish this, we combine residual attention learning with a Spatial Regularization Net. Our model uses a category-wise residual attention mechanism to assign different weights to different spatial regions of the feature maps. It automatically predicts attentive weights that enhance the relevant features and suppress the irrelevant features for a specific pathology.
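One possible shape for a category-wise residual attention head is sketched below in PyTorch. It is an illustrative assumption, not the architecture actually used in this project: a 1x1 convolution predicts one attention map per pathology, each map re-weights the shared features as (1 + attention) * features so the original signal is preserved, and each pathology gets its own classifier on its pooled features. The Spatial Regularization Net is not shown, and the class name and layer sizes are hypothetical.

```python
import torch
import torch.nn as nn

class CategoryResidualAttention(nn.Module):
    """Hypothetical per-class residual attention head over backbone features."""

    def __init__(self, in_channels, num_classes=14):
        super().__init__()
        # One spatial attention map per pathology.
        self.attention = nn.Conv2d(in_channels, num_classes, kernel_size=1)
        # One linear classifier per pathology, applied to its attended features.
        self.class_weight = nn.Parameter(torch.randn(num_classes, in_channels) * 0.01)
        self.class_bias = nn.Parameter(torch.zeros(num_classes))

    def forward(self, features):
        # features: (B, C, H, W) from the backbone
        attn = torch.sigmoid(self.attention(features))  # (B, K, H, W), values in [0, 1]
        # Residual attention: sum (1 + A_k) * F over space, for every class k.
        weighted = torch.einsum("bkhw,bchw->bkc", 1.0 + attn, features)
        pooled = weighted / (features.shape[2] * features.shape[3])  # spatial mean, (B, K, C)
        logits = (pooled * self.class_weight).sum(dim=-1) + self.class_bias  # (B, K)
        return logits, attn

head = CategoryResidualAttention(in_channels=32)
logits, attn = head(torch.randn(2, 32, 7, 7))
```

The residual form means a pathology whose attention map is all zeros still sees the plain average-pooled features, so attention can only emphasize regions rather than erase them.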
Unlike traditional classification problems, where a prediction is either correct or wrong, multi-label classification is a more challenging task and requires more specialized evaluation measures, since performance over all labels must be considered.
In a multi-label classification problem, a prediction can be fully correct (all predicted labels are correct), partially correct (some of the predicted labels are correct) or fully wrong (all predicted labels are wrong). We monitored F-beta scores with different values of beta during training and measured AUROC on the validation set at the end of each epoch.
F-beta offers a good trade-off between precision and recall. For our problem, F-beta offers more flexibility than the plain F1 score because beta can be adjusted to weight recall more or less heavily than precision, depending on how much weight we give each of the 14 disease labels. We also measured Hamming loss and tried several other loss functions during experimentation.
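For reference, a small pure-Python sketch of the two threshold-based measures mentioned above. The label vectors are made-up examples, and in a real training loop a library implementation would normally be used instead.

```python
def fbeta_score(y_true, y_pred, beta=1.0):
    """F-beta for one binary label; beta > 1 favours recall, beta < 1 precision."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0

def hamming_loss(true_rows, pred_rows):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in true_rows)
    wrong = sum(
        t != p
        for t_row, p_row in zip(true_rows, pred_rows)
        for t, p in zip(t_row, p_row)
    )
    return wrong / total

# One label across five images: precision = 1/2, recall = 1/3.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0]
f1 = fbeta_score(y_true, y_pred, beta=1.0)
f2 = fbeta_score(y_true, y_pred, beta=2.0)      # penalizes the missed positives more
f_half = fbeta_score(y_true, y_pred, beta=0.5)  # leans toward precision

# Two images with three labels each; two of the six slots are wrong.
loss = hamming_loss([[1, 0, 1], [0, 0, 1]], [[1, 1, 1], [0, 0, 0]])
```

Because recall is lower than precision in this example, the recall-weighted F2 comes out below F1 and the precision-weighted F0.5 above it, which is the kind of adjustment beta allows.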
Challenges I ran into
Perhaps the major bottleneck was the noisy labels in the dataset. Here are a few of the key challenges:
In the ChestX-ray8 dataset and its extension ChestX-ray14, which was used for training the algorithm, pathology labels were extracted automatically from the radiology reports by text mining. This means the ground truth is itself an individual radiologist’s judgment, which in some cases is further degraded by the inherent inaccuracies of automated text mining.
A frontal chest X-ray is only the beginning: around the time the NIH dataset was released, there were already concerns that only frontal views were included. In many cases frontal views are used on their own, but very often radiologists rely on a simple yet immensely useful ancillary technique: the lateral view.
The lateral chest X-ray alone is rarely diagnostic, but in conjunction with the frontal view its effect is supra-additive: it helps clear up not only the localization but also the etiology of abnormalities first identified on the frontal radiograph.
The labels don’t match the images very well. On visual inspection of the X-ray images, it becomes apparent that many images in this multi-label dataset are inconsistent with their labels.
The labels are internally consistent, so systems trained on them seem to learn to reproduce them (along with their errors).
Entity extraction using NLP is not perfect: the labeling process tries to maximize the recall of accurate disease findings by eliminating all possible negations and uncertainties in disease mentions. Phrases like ‘It is hard to exclude …’ are treated as uncertainty cases, and the image is then labeled ‘No finding’. ‘No finding’ is not equal to ‘normal’. Images labeled ‘No finding’ could contain disease patterns other than the listed 14, or uncertain findings within the 14 categories.
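A toy example helps show why ‘No finding’ is not the same as ‘normal’. The sketch below is not the NIH labeling pipeline; it is a deliberately crude, hypothetical rule-based labeler (the cue lists, the three findings, and the sentence-level scope are all assumptions) that discards any negated or uncertain mention, in the spirit of the behaviour described above.

```python
import re

# Hypothetical cue lists; a real pipeline would be far more elaborate.
NEGATION = re.compile(r"\b(no|without|free of)\b")
UNCERTAINTY = re.compile(r"\b(hard to exclude|cannot exclude|possible|may represent)\b")
FINDINGS = ("effusion", "pneumonia", "cardiomegaly")

def label_report(report):
    """Return the findings asserted in a report, or ['No finding']."""
    labels = set()
    for sentence in re.split(r"[.;]\s*", report.lower()):
        for finding in FINDINGS:
            if finding not in sentence:
                continue
            # Crude scope rule: any cue in the sentence cancels the mention.
            if NEGATION.search(sentence) or UNCERTAINTY.search(sentence):
                continue
            labels.add(finding)
    return sorted(labels) or ["No finding"]

uncertain = label_report("It is hard to exclude pneumonia.")
mixed = label_report("Small left pleural effusion. No cardiomegaly.")
```

Here the first report ends up labeled ‘No finding’ even though the radiologist explicitly left pneumonia open, which is exactly the kind of image a classifier then sees as a negative example.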
Accomplishments that I'm proud of
Firstly, I have to admit that I started quite late, almost a fortnight before the deadline. In that short amount of time I learnt a lot about how deep learning can be employed in the healthcare sector to solve many problems faced by people with limited access to healthcare. I was able to design a simple UI through which anyone can use my model to perform chest X-ray diagnosis, with an AUC score of 0.84.
What I learned
This was my first computer vision project. I undertook it with the intention of learning more about how deep learning is playing a pivotal role in the healthcare sector. I used PyTorch and fastai for the first time for this project, and I must concede that I am amazed by the ease of use and flexibility these frameworks offer. Earlier, I used Keras/TensorFlow and was fairly satisfied, but after exploring PyTorch, it will be hard to go back. I think that, given the short time frame, I learnt a lot. Before this project, I barely knew anything about computer vision. I used to work on structured data, and this project proved to be a major driving force behind my will to learn about applications of DL in healthcare and computer vision in general. Moreover, the small UI app that I created is also one of my first experiences with designing a UI. I also learnt how to use the GPU effectively for deep learning. For example, PyTorch offers a plethora of options, such as pinned memory and asynchronous non_blocking transfer of data from host to GPU, which can greatly reduce training time.
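A minimal sketch of the pinned-memory and non_blocking options mentioned above, using random tensors as a hypothetical stand-in for the X-ray data. Pinning is only enabled when a GPU is present, so the snippet also runs on a CPU-only machine, where non_blocking simply has no effect.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in data: 16 single-channel images with 14 binary labels each.
images = torch.randn(16, 1, 32, 32)
labels = torch.randint(0, 2, (16, 14)).float()

loader = DataLoader(
    TensorDataset(images, labels),
    batch_size=8,
    # Page-locked host memory allows faster, asynchronous host-to-GPU copies.
    pin_memory=torch.cuda.is_available(),
)

for x, y in loader:
    # With pinned source memory, these copies can overlap with GPU computation.
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    # ... forward and backward pass would go here ...
```

Any speed-up depends on the workload; it tends to matter most when data loading and transfer, rather than the model itself, are the bottleneck.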
What's next for Chest X-Ray Computer Aided Diagnosis
Currently, I think I need to acquire bigger and cleaner datasets such as CheXpert and MIMIC-CXR, which together contain over half a million images. This should help increase the accuracy of my model. Furthermore, I'm interested in studying the effect of combining chest and heart segmentation with my model. The motivation is that the current model takes whole chest X-ray images, with no bounding regions associated with organs such as the lungs and heart. With segmentation from models like U-Net, my model should be able to improve further.