Inspiration

This summer we’ve been volunteering at FIMRC, a nonprofit working to bridge healthcare gaps in low-resource areas. We have been working specifically with their chapter in Ghana, where we did a case study on a patient being diagnosed with tuberculosis. The doctor performed the diagnosis by listening to 30 lung sounds by ear, since no equipment was available. We also learned that the doctor came only once per week, even though 50 patients with a respiratory disease would visit the clinic each day. With further research, we found that the doctor-to-patient ratio was 1:15,000, an unsettlingly low figure.

At the beginning of summer, we came across an MIT article on an AI model that could detect COVID-19 via forced cough recordings, and we spent that time working on replicating the model. This sparked our idea for this hackathon: to leverage AI to build a respiratory disease screening algorithm that makes respiratory disease diagnostics more accessible in low-resource areas (like Ghana).

What it does

Essentially, the algorithm distinguishes between several respiratory categories (COPD, COVID-19, asthma, and healthy) from forced cough recordings. It differs from existing algorithms (like the MIT one) in that it is multi-class, covering more than two categories, which makes it more practical in a clinical setting.

We envision it as a smartphone app used by healthcare workers with limited experience: they would record a patient's cough on a mobile phone, and the algorithm would predict which disease the patient may have.

How we built it

First, we searched for public datasets and gathered cough recordings from asthma, COPD, COVID-19, and healthy subjects (Coswara dataset). Since the data was not uniform, we then performed some audio preprocessing: normalizing bit rates and amplitudes, and applying a low-pass filter to remove much of the background noise. We also segmented the coughs into individual recordings of equal length.
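The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration using NumPy, not our actual hackathon code: the function names, the windowed-sinc filter design, the 101-tap length, and the zero-padding of the last segment are all assumptions made for the example, and bit rate normalization is not shown.

```python
import numpy as np

def peak_normalize(signal, peak=1.0):
    """Scale the waveform so its largest absolute sample equals `peak`."""
    signal = np.asarray(signal, dtype=float)
    max_abs = np.max(np.abs(signal))
    if max_abs == 0:
        return signal  # silent clip: nothing to scale
    return signal * (peak / max_abs)

def low_pass(signal, sample_rate, cutoff_hz, num_taps=101):
    """Windowed-sinc FIR low-pass filter (Hamming window). Illustrative choice."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    fc = cutoff_hz / sample_rate  # cutoff in cycles per sample
    taps = 2 * fc * np.sinc(2 * fc * n) * np.hamming(num_taps)
    taps /= taps.sum()  # unity gain at 0 Hz
    return np.convolve(signal, taps, mode="same")

def fixed_length_segments(signal, segment_len):
    """Split into non-overlapping, equal-length segments; zero-pad the last one."""
    n_segments = int(np.ceil(len(signal) / segment_len))
    padded = np.zeros(n_segments * segment_len)
    padded[:len(signal)] = signal
    return padded.reshape(n_segments, segment_len)
```

A recording would pass through these in order, for example `fixed_length_segments(low_pass(peak_normalize(x), 16000, 1000), 6000)`, where the sample rate, cutoff, and segment length are placeholder values rather than the ones we used.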

Next, we converted each cough recording into a spectrogram and assembled the spectrograms into datasets for model training and testing. We applied a Convolutional Neural Network to perform image analysis on the spectrograms, and employed transfer learning to build on an existing model and increase our accuracy. We used VGG16 as the transfer learning model, since we found it ran faster than VGG19 and ResNet50, two other popular architectures.
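To make the spectrogram step concrete, here is a small sketch of turning a waveform into a three-channel image array of the kind an image model such as VGG16 takes as input. The frame length, hop size, Hann window, and decibel scaling are assumptions chosen for illustration, not the settings we used, and resizing the image to the model's input size is left out.

```python
import numpy as np

def log_spectrogram(signal, frame_len=512, hop=256, eps=1e-10):
    """Log-magnitude spectrogram of shape (freq_bins, frames).

    Assumes len(signal) >= frame_len.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return 20 * np.log10(magnitude + eps).T

def to_image(spec_db):
    """Rescale to 0..255 and copy into three identical channels."""
    lo, hi = spec_db.min(), spec_db.max()
    scaled = (spec_db - lo) / (hi - lo) if hi > lo else np.zeros_like(spec_db)
    gray = (scaled * 255).astype(np.uint8)
    return np.stack([gray, gray, gray], axis=-1)
```

Repeating the grayscale spectrogram across three channels is one simple way to match the RGB input that pretrained image networks expect; saving a colormapped plot as an image is another.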

In terms of the general model architecture, we added four dense layers with ReLU activation and three dropout layers to reduce overfitting, and used softmax as the final activation since we were doing multi-class classification. With more time, we could have tried tweaking these parameters to see how they affected accuracy. We trained with a batch size of 15 for 25 epochs, since we found that larger values produced overfitted results and forced us to stop training. To evaluate model performance, we printed the area under the curve (AUC) values and plotted the receiver operating characteristic (ROC) curve as a visual representation.
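As a rough illustration of the final softmax step and of how AUC can be reported for more than two classes, the sketch below computes one AUC per class by treating that class as positive and all others as negative. This one-vs-rest scheme is an assumption for the example (the text above does not say how the multi-class AUC was computed), and the helper names are hypothetical. It uses only the Python standard library.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def binary_auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of samples,
    which is fine for a small test set.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def one_vs_rest_auc(y_true, probabilities, class_names):
    """One AUC per class: that class vs. all the others."""
    result = {}
    for k, name in enumerate(class_names):
        labels = [1 if y == k else 0 for y in y_true]
        scores = [p[k] for p in probabilities]
        result[name] = binary_auc(labels, scores)
    return result
```

Here `y_true` holds integer class indices and `probabilities` holds one softmax output per recording, in the same class order as `class_names` (for instance asthma, COPD, COVID-19, healthy).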

Finally, we designed a rough UI, since we want the final product to be an app and wanted to show what we think it would look like.

Challenges we ran into

The greatest challenge was training. It took a long time, and sometimes an hour into training the computer would stop functioning or the code would throw an error (often to do with the dataset, which we had to work backwards to fix). It was a tedious process, and once we realized each model took 2 hours to train, we were very worried that we wouldn’t get a result at all. Moreover, when we first began training, the model didn’t appear to be learning, so there was a lot of trial and error in tweaking various parts of the model and figuring out how to employ transfer learning, which required tons of research and patience.

Audio preprocessing was also difficult. We didn’t even know what to process, and things like low-pass filters, amplitude and bit rate normalization, and amplitude envelopes were brand new to us. Since we were working with a large number of recordings, we also had to write a lot of code that looped through all the data.

The UI design wasn’t nearly as daunting as building the algorithm itself, but our time crunch was really rough, and we had to allot our time extremely well.

Accomplishments that we're proud of

It was certainly a massive challenge to train an algorithm in such a short period of time. Although we had previous code to repurpose (from building the replicated COVID-19 vs. healthy model), there were many new aspects, including audio preprocessing, assembling new datasets, and using transfer learning. Adding two new categories also caused quite a few errors we had to debug (for instance, while ResNet50 worked for our COVID algorithm, it didn’t for this multi-class version) and lengthened our training time by quite a bit (our computers were on the verge of crashing). We’re surprised that we even got something working at the end of the ~48 hours, which is an accomplishment we are endlessly proud of.

What we learned

While we’ve had past machine learning experience with shallow learning and NLP, this project gave us quite a bit more insight into working with deep learning and cough sounds. We used transfer learning for the first time (and learned about it for the first time, too), and we learned quite a bit about audio preprocessing. It was also our first time working with multi-class detection, and we got a sense of how it differs from binary methods, including the different activation functions required, what training was like, and how to interpret results.

On the non-technical side, we learned a lot about time management and work delegation, since we were working on such a tight timeline. We learned how to play to each of our strengths, since there was so much code to power through, and we had to constantly adapt our goals to fit what was realistic.

What's next for LungTech

We’re hoping to carry this project forward, so our next step is to look into getting better-quality data to improve our algorithm. After all, we only hastily searched the internet. We would also like to explore the detection of tuberculosis and pneumonia, and to find some data for those diseases as well. Looking further ahead, our big goal is to build a working smartphone version of the algorithm that can be used in a clinical setting in Ghana.
