Mustafa is from Afghanistan and Juan José is from Guatemala. Upon meeting at the registration table on March 2nd, 2019, we quickly realized that we were both interested in developing a Social Good project applicable to some of the major challenges faced in our respective home countries. While brainstorming, we stumbled upon the NIH Dataset link. The number of diseases labeled in the dataset made it extremely enticing to work with. After doing some online research, we learned that lung and respiratory diseases are a leading cause of death for people around the world. We also found that there is a worldwide shortage of X-ray technicians, especially in developing countries.
In light of these facts, we decided to take action toward bridging the X-ray technician gap by using Artificial Neural Networks to help diagnose lung diseases. The main application of such a predictive model would be in places with few or no X-ray technicians. Therefore, in order for the product to be accessible even in the most remote areas of the world, we decided to deliver the predictions of an Artificial Neural Network via SMS.
What it does
The overarching infrastructure works as follows: a phone number listens for images sent over SMS. On the server side, we have a TensorFlow DenseNet model pre-trained on the NIH data. Once we receive an image, we feed it to the model and get predictions for the diseases. We text the respective predictions back to the message sender and prepare the model for the next incoming image.
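The reply step described above can be sketched in plain Python. This is an illustrative sketch, not the project's actual code: the function name `format_reply`, the label list, and the 0.5 confidence threshold are all assumptions; in practice the probabilities would come from the DenseNet model's output layer.

```python
# Hypothetical sketch of turning per-disease model probabilities into the
# SMS reply text. Labels, function name, and threshold are illustrative,
# not taken from the actual project code.

DISEASE_LABELS = ["Atelectasis", "Cardiomegaly", "Effusion", "Infiltration",
                  "Mass", "Nodule", "Pneumonia", "Pneumothorax"]

def format_reply(probs, threshold=0.5):
    """Given one probability per label, build the text to send back."""
    findings = [(label, p) for label, p in zip(DISEASE_LABELS, probs)
                if p >= threshold]
    if not findings:
        return "No findings above the confidence threshold."
    # Report the most likely diseases first.
    findings.sort(key=lambda lp: lp[1], reverse=True)
    return "; ".join(f"{label}: {p:.0%}" for label, p in findings)
```

An SMS webhook handler would call something like `format_reply(model.predict(image)[0])` and send the returned string back to the sender.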
Challenges I ran into
- Learning how to build and train an Artificial Neural Network model from scratch in less than 24 hours: developing neural networks is by nature experimental and takes a long time, which put additional pressure on the process of coming up with an accurate model
- Finding an optimal way to train the model without access to a GPU
- Cleaning the dataset and selecting representative samples from the 30k+ images
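The sampling challenge above can be sketched as a class-balanced subsample drawn from the image metadata. This is a hypothetical illustration of one common approach, not the project's actual preprocessing: the function name, the `(image_id, label)` record format, and the per-label cap are all assumptions.

```python
# Hypothetical sketch of drawing a class-balanced subsample from a large
# labeled image set, capping the number of images kept per disease label.
import random
from collections import defaultdict

def balanced_sample(records, per_label=500, seed=0):
    """records: iterable of (image_id, label) pairs.
    Returns a subsample with at most `per_label` images per label."""
    by_label = defaultdict(list)
    for image_id, label in records:
        by_label[label].append(image_id)
    rng = random.Random(seed)  # fixed seed for a reproducible split
    sample = []
    for label, ids in by_label.items():
        k = min(per_label, len(ids))
        sample.extend((image_id, label) for image_id in rng.sample(ids, k))
    return sample
```

Capping each class keeps a few very common findings (such as "No Finding" in the NIH data) from dominating training when compute is limited.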