Inspiration

Pneumonia is one of the leading causes of death worldwide, and it is especially dangerous for children and the elderly. The COVID-19 pandemic has made the need for systems that can detect lung diseases accurately even clearer. This project is a first step toward solving one of the major problems in healthcare, and it can be scaled up considerably in the future.

What it does

This project is a deep learning model that takes X-ray images of the lungs as input and predicts whether the lungs are affected by pneumonia. If it finds signs of pneumonia, it draws bounding boxes around the affected regions.
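The decision logic described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `Detection` type, the `summarize` helper, and the 0.5 score threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One predicted box (hypothetical structure): corners plus a confidence score."""
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
    score: float


def summarize(detections: List[Detection], threshold: float = 0.5):
    """Keep confident boxes and flag the image as pneumonia if any box survives.

    The 0.5 threshold is an assumed value, not one taken from the project.
    """
    kept = [d for d in detections if d.score >= threshold]
    return len(kept) > 0, kept
```

With this shape, the image-level answer (pneumonia or not) and the boxes to draw come from the same filtered list, so the two outputs cannot disagree.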

How I built it

The model is a Faster R-CNN with a ResNet50 backbone, pre-trained on the COCO dataset. I fine-tuned it on lung X-ray images from the RSNA Pneumonia Detection Challenge, which was held on Kaggle in 2018. The dataset contains enough images to build a good baseline model.

Challenges I ran into

Training time and compute power were the biggest hurdles. Faster R-CNN resizes images to 800x800 pixels by default; I changed this to 1024x1024, which increased training time considerably. I trained the model in a Kaggle kernel for 21000 iterations with a batch size of 8 (30 epochs). Training any longer was not possible because of the kernel's 9-hour run-time limit.
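The training figures above imply a few other numbers, which the short calculation below works out. It takes the stated values at face value and assumes the whole 9-hour budget went to training, which is a simplification. Both helper names are made up for the example.

```python
def images_per_epoch(iterations: int, batch_size: int, epochs: int) -> float:
    """Images seen in one epoch, from total iterations, batch size, and epoch count."""
    return iterations * batch_size / epochs


def seconds_per_iteration(budget_hours: float, iterations: int) -> float:
    """Upper bound on time per iteration if training fills the whole time budget."""
    return budget_hours * 3600 / iterations


print(images_per_epoch(21000, 8, 30))              # 5600.0 images per epoch
print(round(seconds_per_iteration(9, 21000), 2))   # 1.54 seconds per iteration at most
```

So each epoch covers about 5600 training images, and each batch of 8 had to finish in roughly a second and a half to fit inside the kernel limit.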

Accomplishments that I'm proud of

I achieved an Average Precision of 0.251 on my private validation set. For comparison, the best leaderboard score on Kaggle was 0.25475, so I consider this a big achievement. The model can be improved much further with more training and more data, but it is a good starting point.
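Average-precision metrics for detection are built on the intersection over union (IoU) between a predicted box and a ground-truth box. This write-up does not state which AP variant or IoU thresholds were used, so the sketch below shows only that building block. The `iou` helper and the corner-based box format are assumptions for the example.

```python
def iou(box_a, box_b) -> float:
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

A prediction typically counts as a hit when its IoU with a ground-truth box exceeds a chosen threshold, and precision is then computed over those hits.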

What I learned

  1. Building deep learning models for medical imaging is an important task and will become even more so in the future. But it is difficult, because medical images are very different from everyday image data.
  2. Data and training time are the two most important factors when building deep learning solutions for medical imaging. More data and more training generally help.
  3. Working alone in a hackathon is difficult. Teaming up with like-minded people gets things done much better. I could not find anyone who was interested in this project, but I felt this was a serious problem that needed to be solved, so I went ahead with it.

What's next for Pneumonia Detection using Deep Learning

I want to improve the model further with more data and training. That is not possible right now because I do not have access to enough compute power; winning this hackathon would help with that. At present, the model runs on localhost behind a Flask API. I want to deploy it as a website and, ideally, team up with medical professionals who can provide their insights.
