We are inspired by recent efforts to use AI in medical imaging, and we believe it can help overwhelmed and understaffed healthcare systems, especially in developing countries and pandemic hotspots.
Multiple projects online claim that their highly accurate AI, trained on X-rays, can be used to diagnose COVID-19. Most of the time these claims are bogus. Patients with mild symptoms show only minimal traces of the disease. Furthermore, the currently available COVID-19 images mostly come from severe cases, so models trained exclusively on them won't perform accurately in real-world conditions.
Both mild and severe cases of COVID-19 exhibit lung opacity in images (which is also an indicator of pneumonia). Since X-ray data with lung opacity is abundant, we used it to augment the COVID-19 data, keeping it under a separate label. This mitigates the risk of false negatives: with a binary classifier, a truly COVID-19-positive individual could be flagged as not showing symptoms of it.
What it does
The model takes both PA and AP X-ray images (in DICOM format) as input and outputs a prediction for each image from one of three labels (covid, opacity, nofinding). Images predicted as "covid" or "opacity" are flagged by the model for priority action (e.g. testing, re-testing, ventilator), taking into account additional risk factors (age, X-ray view position) read from the metadata. The results can all be saved to a CSV file.
For example, a 54-year-old patient predicted to have lung opacity is flagged as higher priority for action than a 20-year-old patient predicted to have symptoms of COVID-19. Both are still flagged as higher priority for COVID-19 testing than patients predicted to have no findings.
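The ranking in this example can be sketched in a few lines of Python. This is only an illustration, not the project's actual scoring logic: the `Prediction` class, the age cutoff of 50, the choice to treat the AP view as the higher-risk position, and the point values are all assumptions picked so that the ordering described above comes out.

```python
from dataclasses import dataclass

# Hypothetical values, not taken from the project.
AGE_CUTOFF = 50
FLAGGED_LABELS = {"covid", "opacity"}


@dataclass
class Prediction:
    patient_id: str
    label: str          # one of "covid", "opacity", "nofinding"
    age: int            # read from the DICOM metadata
    view_position: str  # "PA" or "AP", read from the DICOM metadata


def priority_score(p: Prediction) -> int:
    """Return a higher score for patients who should be acted on sooner."""
    score = 0
    if p.label in FLAGGED_LABELS:
        # Worth more than both risk factors combined, so any finding
        # always outranks "nofinding".
        score += 3
    if p.age >= AGE_CUTOFF:
        score += 1  # assumed age risk factor
    if p.view_position == "AP":
        score += 1  # assumption: AP view counted as a risk factor
    return score


def triage(predictions):
    """Sort predictions from highest to lowest priority (stable for ties)."""
    return sorted(predictions, key=priority_score, reverse=True)
```

With these assumed weights, a 54-year-old with predicted opacity scores 4, a 20-year-old with predicted COVID-19 scores 3, and any patient with no findings scores at most 2.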
Our model does not have any diagnostic capabilities and is only meant to automate currently existing medical triage workflows.
How we built it
The model was trained via transfer learning with a ResNet-34 architecture on images taken from the Cohen Dataset and the Radiological Society of North America (RSNA) Pneumonia Dataset. Approximately 26,000 images were used, with weighted resampling to account for the scarcity of COVID-19 data.
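As a rough sketch of what weighted resampling means here, the snippet below draws training indices with probability inversely proportional to class frequency, so that rare classes are seen about as often as common ones. It uses only the Python standard library; the function names and the toy class counts in the usage note are illustrative assumptions, not the project's actual training code or data distribution.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Give each sample a weight of 1 / (number of samples in its class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def resample_indices(labels, num_samples, seed=0):
    """Draw sample indices with replacement, balancing classes on average."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)
```

For a toy label list with 10 "covid", 500 "opacity", and 490 "nofinding" entries, each class ends up with a total weight of 1, so each is drawn roughly one third of the time. The rare "covid" images are repeated many times, which is why this kind of resampling is usually combined with image augmentation.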
Area under the Receiver Operating Characteristic curve (AUROC) was the chief metric used to determine model performance. It was calculated with a one-vs-all approach. The AUROC values for "covid", "opacity", and "nofinding" were 99.97%, 92.64%, and 92.73%, respectively.
The model was then packaged with Docker into an intuitive application that clinicians can use offline, within their own hospitals and testing centers.
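To make the one-vs-all calculation concrete, here is a minimal sketch in plain Python. For each class, that class is treated as positive and the other two as negative, and AUROC is computed as the probability that a randomly chosen positive image gets a higher score than a randomly chosen negative one (ties count half). The function names and input layout are assumptions for illustration; the pairwise loop is quadratic and a library routine would be a better fit for a full dataset.

```python
def binary_auroc(scores, is_positive):
    """AUROC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


def one_vs_all_auroc(probs, labels, classes):
    """Compute one AUROC per class.

    probs:  list of dicts mapping class name -> predicted probability
    labels: list of true class names, same length as probs
    """
    result = {}
    for c in classes:
        scores = [p[c] for p in probs]
        is_positive = [y == c for y in labels]
        result[c] = binary_auroc(scores, is_positive)
    return result
```

Each class gets its own score, which is why three separate AUROC values are reported rather than a single number.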
What's next for COVID-19 X-Ray Scanner for Diagnosis Triage
We hope that our model can be evaluated in a formal study of its effectiveness and tested in a hospital, especially as most of the world comes out of lockdown and hospitals see a resurgence of cases.
Since it is open-source, we also hope that other researchers can further improve on the model. Arterys, an American medical imaging company with 6 FDA clearances, plans to deploy a version of the model on its platform soon for American researchers to develop further.