Recently, I came across a research paper on medGAN and realized how useful GANs (Generative Adversarial Networks) could be in the healthcare sector. So far, GANs have been used for generating synthetic EHR (electronic health record) data in the US. I decided to try the same approach on images, and it turned out to be a success! 🦾

  1. With rising privacy concerns, the safety of patients whose medical images are published online remains a real issue. Although organizations such as the NIH apply up-to-date internet safety protocols to their data, hackers have still been able to trace patients from imaging data that is freely available on the internet. As a result, patient privacy is a genuine concern. 🧐

  2. Besides this, a lack of data is a frequent problem for healthcare researchers worldwide. When one collects a dataset of, say, COVID chest X-rays 🫁, one also needs a balanced set of healthy-lung images to feed to the AI models. That shortage of data is the second problem this project addresses.

What it does

This project utilizes the power of Generative Adversarial Networks (GANs) to generate synthetic medical images from an available dataset. It serves two main purposes:

1️⃣ Data augmentation for building deep-learning AI/ML models that diagnose and detect various diseases

2️⃣ Protecting patient privacy: synthetic images stored alongside the original dataset on the internet make it nearly impossible to trace a patient

How we built it

I built a DCGAN (Deep Convolutional GAN) network on Google Colab and first tried it on a pneumonia medical image dataset.

A GAN is made up of two different neural networks: the generator and the discriminator. The generator takes a random seed vector as input and produces an image from it. The discriminator takes an image as input and returns a number: the probability that the image is real rather than generated. By feeding the generator additional seeds, we can generate an unlimited number of new synthetic images.
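To make the two roles concrete, here is a minimal DCGAN sketch. It is not the exact network from this project: PyTorch, the 64x64 grayscale image size, the 100-dimensional seed, and the layer widths are all assumptions chosen for illustration, and the networks below are untrained.

```python
import torch
import torch.nn as nn

LATENT_DIM = 100  # assumed size of the random seed vector


class Generator(nn.Module):
    """Maps a seed of shape (N, LATENT_DIM, 1, 1) to an image of shape (N, 1, 64, 64)."""

    def __init__(self):
        super().__init__()

        def up(c_in, c_out, stride, pad):
            # Transposed convolution that grows the feature map
            return [
                nn.ConvTranspose2d(c_in, c_out, 4, stride, pad, bias=False),
                nn.BatchNorm2d(c_out),
                nn.ReLU(True),
            ]

        self.net = nn.Sequential(
            *up(LATENT_DIM, 256, 1, 0),  # 1x1   -> 4x4
            *up(256, 128, 2, 1),         # 4x4   -> 8x8
            *up(128, 64, 2, 1),          # 8x8   -> 16x16
            *up(64, 32, 2, 1),           # 16x16 -> 32x32
            nn.ConvTranspose2d(32, 1, 4, 2, 1, bias=False),  # 32x32 -> 64x64
            nn.Tanh(),                   # pixel values in [-1, 1]
        )

    def forward(self, z):
        return self.net(z)


class Discriminator(nn.Module):
    """Maps an image of shape (N, 1, 64, 64) to a probability of being real, shape (N,)."""

    def __init__(self):
        super().__init__()

        def down(c_in, c_out):
            # Strided convolution that halves the feature map
            return [
                nn.Conv2d(c_in, c_out, 4, 2, 1, bias=False),
                nn.BatchNorm2d(c_out),
                nn.LeakyReLU(0.2, inplace=True),
            ]

        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, 2, 1, bias=False),  # 64x64 -> 32x32
            nn.LeakyReLU(0.2, inplace=True),
            *down(32, 64),                          # 32x32 -> 16x16
            *down(64, 128),                         # 16x16 -> 8x8
            *down(128, 256),                        # 8x8   -> 4x4
            nn.Conv2d(256, 1, 4, 1, 0, bias=False), # 4x4   -> 1x1
            nn.Sigmoid(),                           # probability in [0, 1]
        )

    def forward(self, x):
        return self.net(x).view(-1)


generator = Generator()
discriminator = Discriminator()

# Every new batch of random seeds yields a new batch of synthetic images.
seeds = torch.randn(8, LATENT_DIM, 1, 1)
with torch.no_grad():
    fake_images = generator(seeds)
    probs = discriminator(fake_images)

print(fake_images.shape, probs.shape)
```

During training the two networks are updated in turns: the discriminator learns to score real images high and generated ones low, while the generator learns to produce images the discriminator scores high.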

Language used - Python 🐍

Challenges we ran into

  1. My Jupyter Notebook sessions kept expiring every time I tried running my generator. I rebuilt the model three times, but the same issue persisted. In the end, I decided to shift to Colab to avoid wasting more time 😣
  2. There were some file-type issues in the dataset which were not very visible. I had to run several terminal commands to get rid of those bugs 🐞
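The exact file-type problems and terminal commands are not recorded here, so the following is only a hypothetical sketch of how such stray files could be surfaced in Python. The accepted extensions and the `find_unexpected_files` helper are assumptions for illustration, not what was actually run.

```python
from pathlib import Path

# Assumed set of extensions the image loader is expected to handle
IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png"}


def find_unexpected_files(dataset_dir):
    """Return every file under dataset_dir whose extension is not an expected image type."""
    root = Path(dataset_dir)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() not in IMAGE_EXTENSIONS
    )
```

Calling `find_unexpected_files("path/to/dataset")` (a placeholder path) lists hidden system files and other non-image entries so they can be reviewed before training.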

Accomplishments that we're proud of

We see this as a novel application of GANs to medical image synthesis and privacy protection. GANs are infamous for their use in deepfakes and have led to heated debates about privacy and theft. I am glad that we could use the same technology for better purposes and thus show that it is we humans who are responsible for the proper use of technology, and that technology in itself is not inherently dangerous.

What we learned

GANs and the mathematics behind them

Application of this type of AI in the healthcare domain ⚕️

Using Google Colab notebooks

Medical research and analysis

What's next for

  1. Building full-fledged open-source software so that researchers and practitioners across the world can generate synthetic data for their own projects 🤩
  2. Collaborating with research institutes and online repositories so they can use this AI to generate additional data for their datasets and protect patients' privacy online 🙌🏼
