Medical questions are often tied to specific images, and most medical question-answering systems cannot process that type of input. A tool like this could be extremely useful to people at home who want to know more about their own health, and to doctors who need to extract information from images.
What it does
My application takes an image and a question about that image as input and returns an answer to the question. The model is trained mainly on a medical dataset, so it cannot be expected to answer accurately outside of medical questions.
How we built it
I used PyTorch as the main deep learning framework to implement the image-and-question answering system, and Streamlit to wrap the model in an application. The MICCAI19-MedVQA GitHub repository helped me implement the deep learning model. In addition to these libraries and frameworks, I used NumPy, Pandas, Matplotlib, and Torchvision to build the final product.
Challenges we ran into
There were many challenges to overcome during this project. One of the main ones was getting the model to work at all, given how complex it is. In the end I chose a BiLinear LSTM network. Another challenge was optimizing the Streamlit UI, because the deep learning models it loads are heavy. I managed to speed up prediction significantly by caching the models.
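As a rough illustration of the caching idea (not the project's actual code), Streamlit's `st.cache_resource` decorator keeps one copy of an expensive object, such as a model, alive across script reruns. The `load_model` function and the dictionary it returns are hypothetical stand-ins for loading real PyTorch weights, and the fallback to `functools.lru_cache` is only there so the sketch also runs where Streamlit is not installed.

```python
import time
from functools import lru_cache

try:
    import streamlit as st
    # Streamlit's cache for global resources such as ML models.
    cache_resource = st.cache_resource
except (ImportError, AttributeError):
    # Fallback (assumption): plain in-process memoization when Streamlit
    # or st.cache_resource is unavailable.
    def cache_resource(func):
        return lru_cache(maxsize=None)(func)


@cache_resource
def load_model():
    """Hypothetical loader; the sleep stands in for slow weight loading."""
    time.sleep(0.05)
    return {"name": "vqa-model"}


# The first call pays the loading cost; later calls reuse the same object.
first = load_model()
second = load_model()
```

Without the decorator, a Streamlit app would reload the model every time the user interacts with a widget, because the whole script reruns on each interaction.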
Accomplishments that we're proud of
I am extremely proud of the accuracy the model achieved. Relatively complex models like this often take days to train fully, but with transfer learning I was able to cut that down significantly and still train an extremely robust model. I'm also proud of the clean, easy-to-use UI I built with Streamlit, which includes dark and light modes, simple file upload, and a rerun feature for new images.
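A minimal sketch of how transfer learning shortens training in PyTorch, under stated assumptions: the tiny `backbone` below is a stand-in for a real pretrained image encoder, and the five answer classes, batch shapes, and learning rate are made up for illustration. The point is only the pattern: freeze the reused layers and optimize just the new head.

```python
import torch
from torch import nn

# Stand-in for a pretrained image encoder (assumption: in a real project
# this would be loaded with pretrained weights).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze the backbone so its weights are not updated during training.
for param in backbone.parameters():
    param.requires_grad = False

# New task-specific head; 5 is a hypothetical number of answer classes.
head = nn.Linear(8, 5)
model = nn.Sequential(backbone, head)

# Only parameters that still require gradients are handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)

# One training step on random data, just to show the mechanics.
images = torch.randn(4, 3, 32, 32)
labels = torch.randint(0, 5, (4,))
loss = nn.functional.cross_entropy(model(images), labels)
loss.backward()
optimizer.step()
```

Because gradients are computed and applied only for the head, each step is cheaper and far fewer parameters need to converge.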
What we learned
- Learned how to implement a BiLinear LSTM for image and text inputs
- Learned how to create a nice-looking UI with Streamlit
- Learned how to implement an attention mechanism that works with an LSTM
- Learned how to search, collect, and process medical image data
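To make the LSTM and attention items above concrete, here is a small sketch of question-guided attention over image regions. It is not the architecture from this project or from the MICCAI19-MedVQA repository: the class name `QuestionGuidedAttention`, all layer sizes, and the use of `nn.Bilinear` to score each question/region pair are assumptions chosen for illustration.

```python
import torch
from torch import nn


class QuestionGuidedAttention(nn.Module):
    """Hypothetical VQA sketch: an LSTM encodes the question, then attends over image regions."""

    def __init__(self, vocab_size=100, embed_dim=32, hidden_dim=64,
                 region_dim=48, num_answers=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Bilinear layer scores how relevant each region is to the question.
        self.score = nn.Bilinear(hidden_dim, region_dim, 1)
        self.classify = nn.Linear(hidden_dim + region_dim, num_answers)

    def forward(self, question_tokens, regions):
        # question_tokens: (batch, seq_len)
        # regions: (batch, num_regions, region_dim)
        _, (h_n, _) = self.lstm(self.embed(question_tokens))
        question = h_n[-1]  # final hidden state, (batch, hidden_dim)

        # Repeat the question vector once per region so pairs can be scored.
        expanded = question.unsqueeze(1).expand(-1, regions.size(1), -1)
        scores = self.score(expanded.contiguous(), regions).squeeze(-1)
        weights = torch.softmax(scores, dim=1)  # (batch, num_regions)

        # Attention-weighted sum of region features.
        attended = (weights.unsqueeze(-1) * regions).sum(dim=1)
        logits = self.classify(torch.cat([question, attended], dim=1))
        return logits, weights


model = QuestionGuidedAttention()
tokens = torch.randint(0, 100, (2, 7))   # 2 questions, 7 token ids each
regions = torch.randn(2, 5, 48)          # 5 region feature vectors per image
logits, weights = model(tokens, regions)
```

The returned `weights` sum to 1 across regions for each example, so they can be read as how much each image region contributed to the answer.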
What's next for Q&MedicAid
Expand the model's capabilities so it can solve more than just medical problems!