Enhancing Disease Diagnoses with Machine Learning

[Figure: Training and validation accuracy on a local PennyLane simulator]
[Figure: Training and validation loss on a local PennyLane simulator]
[Figure: Tuberculosis test predictions made by the model]

Tuberculosis Disease Diagnosis: A Brief Background

Tuberculosis (TB) is a potentially serious infectious bacterial disease that mainly affects the lungs. The bacteria that cause TB spread when an infected person coughs or sneezes. Most people infected with the bacteria have no symptoms. When symptoms do occur, they usually include cough, weight loss, and fever. Patients with active symptoms require a long course of treatment involving multiple antibiotics. Although TB is generally rare in the United States (fewer than 200,000 cases per year), it is far more prevalent and destructive in developing countries and is one of the most common infectious diseases worldwide. Diagnostic tools for TB have improved in recent decades, but many cases are still diagnosed on clinical suspicion alone, without a positive culture, because of time constraints. TB has a misdiagnosis rate of 14.6%, which poses a direct risk to a patient's treatment path.

Prototype

This project aims to improve the accuracy and reduce the time of tuberculosis diagnosis using a neural network. To test the speed and efficiency of machine learning, as well as its usefulness in tuberculosis diagnosis, I built a prototype in Python. Python's concise, expressive, and dynamic nature makes it well suited to prototyping. The model uses PyTorch for training and validation.

Image Recognition

I created a base neural network model, built from stacked machine learning models, that can be trained on any dataset in ImageNet format. This base model, a pre-trained resnet18, is the starting point for transfer learning on an image classification task. The last layer of the pre-trained model (the fully connected, or FC, layer) is then replaced using PyTorch, producing a new model. The model is trained for 8 epochs.

Results: Accuracy and Loss Score of Trained Model

The model reached a validation accuracy of approximately 92.5% and a loss of approximately 0.20, indicating that it was able to generalize and differentiate the categories correctly on the dataset images. Once the model is loaded, it takes less than 2 seconds to make a prediction.
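To make the two metrics concrete, the sketch below computes accuracy and mean cross-entropy loss from predicted class probabilities and times a single call. It assumes cross-entropy is the loss being reported, which the text does not state. The probabilities, labels, class meanings, and helper names are hypothetical and are not the project's data or code.

```python
import math
import time


def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class matches the label."""
    correct = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) == y)
    return correct / len(labels)


def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical predictions for four images (class 0 = normal, class 1 = TB).
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
labels = [0, 1, 1, 1]

acc = accuracy(probs, labels)  # 3 of 4 correct
loss, seconds = timed(cross_entropy, probs, labels)
print(f"accuracy={acc:.2%} loss={loss:.3f} elapsed={seconds:.6f}s")
```

Accuracy counts only whether the top class is right, while the loss also penalizes low confidence in the correct class, so the two numbers can move independently.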