SyntheticMRI

Synthetic MRI generation using StableDiffusion & VQVAE

Proposal

This project aims to advance synthetic MRI generation by developing latent diffusion models, drawing on the foundational work on Denoising Diffusion Probabilistic Models (DDPM) and Stable Diffusion. Our goal is to condition these models on a diverse set of parameters and generate large, high-fidelity synthetic datasets. These datasets are intended for pre-training data-hungry transformer models for downstream tasks. That would avoid two common compromises: training on small medical datasets collected from different machines with varying acquisition parameters, and relying on ImageNet pre-trained models, which suffer from domain shift when applied to medical images. The advantages of diffusion models have only recently been demonstrated in medical imaging, which leaves considerable scope for exploration.

Introduction

Diffusion models consist of a forward process and a reverse process, both modeled as Markov chains. In the forward process, noise is added to the input image step by step; each step is a Gaussian whose mean and variance are set by a fixed schedule, and after enough steps the image is approximately a sample from a standard Gaussian distribution. The reverse process runs the other way, gradually denoising a perturbed image back toward its original state.
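
The forward phase has a convenient closed form: a noisy sample at any timestep can be drawn directly from the clean input, without simulating every intermediate step. The sketch below illustrates this with the Python standard library only. The linear schedule, its endpoints, and all function names are assumptions for illustration and are not taken from the project code.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return per-step noise variances spaced linearly (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I).

    x0 is a flat list of pixel values and t is a 0-based timestep index.
    Returns the noisy sample and the noise that was added.
    """
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal_scale * x + noise_scale * e for x, e in zip(x0, noise)]
    return x_t, noise


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
x0 = [0.5, -0.25, 1.0, 0.0]  # hypothetical flattened image patch
x_early, _ = q_sample(x0, 0, alpha_bars, rng)    # barely perturbed
x_late, _ = q_sample(x0, 999, alpha_bars, rng)   # close to pure noise
```

With this schedule the signal scale at the last step is close to zero, so `x_late` carries almost no information about `x0`.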

Diffusion Processes

A U-Net learns the parameters of the reverse process. The training objective is to maximize the likelihood of the data under the model, which in practice reduces to minimizing the L2 loss between the true and the predicted noise at each timestep, a principle known as denoising score matching.
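
As a rough illustration of that objective, one training step draws a random timestep, noises the input, and scores the predicted noise against the true noise with a mean squared error. In the sketch below, `zero_predictor` is a hypothetical stand-in for the U-Net and the toy schedule is an assumption; only the Python standard library is used, so it runs without a deep learning framework.

```python
import math
import random


def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def noise_prediction_loss(x0, alpha_bars, predict_noise, rng):
    """One Monte Carlo sample of the L2 noise-prediction loss."""
    t = rng.randrange(len(alpha_bars))           # uniformly sampled timestep
    noise = [rng.gauss(0.0, 1.0) for _ in x0]    # target the model must recover
    a = alpha_bars[t]
    x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, noise)]
    return mse(predict_noise(x_t, t), noise)


def zero_predictor(x_t, t):
    """Hypothetical untrained model that always predicts zero noise."""
    return [0.0] * len(x_t)


# Assumed toy schedule: alpha_bar shrinks geometrically over 10 steps.
alpha_bars = [0.9 ** (i + 1) for i in range(10)]
x0 = [0.5, -0.25, 1.0, 0.0]  # hypothetical flattened image patch
loss = noise_prediction_loss(x0, alpha_bars, zero_predictor, random.Random(0))
```

A real training loop would average this quantity over a batch and backpropagate it through the network; the structure of the loss stays the same.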

Training in a latent space, as proposed by Stable Diffusion, makes training and sampling far cheaper than operating directly on full-resolution images. The approach separates perceptual compression, handled by an autoencoder such as KL-VAE or VQ-VAE, from the denoising process, which then runs on the compressed latents. Our initial experiments with VQ-VAE revealed limitations, which prompted a switch to VQ-GAN. The VQ-GAN model adds three losses (perceptual loss, GAN feature matching loss, and L1 loss) that improve image reconstruction and generation.
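
The quantization step that VQ-VAE and VQ-GAN share can be sketched as a nearest-neighbour lookup: each latent vector produced by the encoder is replaced by the closest entry in a learned codebook. The codebook values and function names below are hypothetical, and the example uses plain Python lists rather than the project's 3D tensors.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quantize(latents, codebook):
    """Replace each latent vector with its nearest codebook entry.

    Returns the chosen code indices and the corresponding code vectors.
    """
    indices = []
    for z in latents:
        distances = [squared_distance(z, code) for code in codebook]
        indices.append(distances.index(min(distances)))
    quantized = [list(codebook[i]) for i in indices]
    return indices, quantized


codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]   # hypothetical learned codes
latents = [[0.9, 1.2], [-0.1, 0.2], [-0.8, 0.7]]   # hypothetical encoder outputs
indices, quantized = quantize(latents, codebook)
print(indices)  # [1, 0, 2]
```

Because the lookup is not differentiable, implementations usually pass gradients through it with a straight-through estimator and add a commitment term that keeps encoder outputs close to their assigned codes.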

Maximizing the likelihood function corresponds to minimizing the L2 loss of the noise prediction.
That is why the training methodology is also called denoising score matching.
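
For reference, the relationship can be written in the notation of the DDPM paper, which is assumed here because the text does not fix its own symbols: x_0 is the clean image, ε the added Gaussian noise, β_t the noise variance at step t, and ε_θ the U-Net. Strictly speaking, the L2 objective is a reweighted form of the variational bound on the negative log-likelihood rather than an exact identity.

```latex
\begin{aligned}
q(x_t \mid x_0) &= \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,I\right),
\qquad \bar{\alpha}_t = \prod_{s=1}^{t}(1-\beta_s) \\
L_{\text{simple}}(\theta) &= \mathbb{E}_{t,\,x_0,\,\epsilon}\left[\left\lVert \epsilon - \epsilon_\theta\!\left(\sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t\right)\right\rVert^2\right]
\end{aligned}
```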

Models

We first built a 3D VQ-VAE model and later replaced it with a VQ-GAN framework to address its limitations. VQ-GAN adds a discriminator, which introduces an adversarial dynamic to training and improves the fidelity of the generated images.

The U-Net in our diffusion model is composed of downsampling, middle, and upsampling blocks, each built from residual and attention blocks. These blocks condition the model on the diffusion timestep, so the network knows how much noise to expect at each step.
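
One common way to condition a U-Net on the timestep is a sinusoidal embedding, as used in DDPM-style models. The text does not say which scheme this project uses, so the sketch below is an assumption, and the function name and `max_period` default are illustrative.

```python
import math


def timestep_embedding(t, dim, max_period=10000.0):
    """Sinusoidal embedding of a diffusion timestep.

    Returns a list of length `dim` (assumed even): the first half holds
    sines and the second half cosines at geometrically spaced frequencies.
    """
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]


emb = timestep_embedding(t=10, dim=8)
```

In such models the embedding is typically passed through a small MLP and added to the feature maps inside each residual block.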

Training balances the individual loss components: perceptual loss, GAN feature matching loss, and L1 loss. Combining them improves the quality and realism of the generated synthetic MRI images.
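
The three named terms can be combined as a weighted sum. The text does not give their exact form or weighting, so the sketch below is illustrative only: the weights, the use of mean squared error for the perceptual term, and all names are assumptions, and the feature lists stand in for activations from a pretrained network and from the discriminator. The adversarial term itself is left out.

```python
def l1(a, b):
    """Mean absolute error between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def reconstruction_objective(recon, target, recon_feats, target_feats,
                             disc_feats_fake, disc_feats_real,
                             w_l1=1.0, w_perceptual=1.0, w_feature_matching=1.0):
    """Weighted sum of L1, perceptual, and GAN feature matching terms.

    recon_feats / target_feats: activations of a fixed pretrained network.
    disc_feats_fake / disc_feats_real: per-layer discriminator activations.
    """
    pixel = l1(recon, target)
    perceptual = mse(recon_feats, target_feats)
    per_layer = [l1(f, r) for f, r in zip(disc_feats_fake, disc_feats_real)]
    feature_matching = sum(per_layer) / len(per_layer)
    total = (w_l1 * pixel
             + w_perceptual * perceptual
             + w_feature_matching * feature_matching)
    return total, {"l1": pixel, "perceptual": perceptual,
                   "feature_matching": feature_matching}


# Hypothetical toy values chosen so each term is easy to check by hand.
total, parts = reconstruction_objective(
    recon=[0.0, 1.0], target=[0.5, 1.0],
    recon_feats=[1.0, 2.0], target_feats=[1.0, 0.0],
    disc_feats_fake=[[0.0, 0.0], [1.0]], disc_feats_real=[[1.0, 1.0], [1.0]],
    w_l1=1.0, w_perceptual=0.5, w_feature_matching=2.0,
)
print(total)  # 2.25
```

Returning the individual terms alongside the total makes it easier to see which component dominates when the weights are tuned.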

Experiments

We ran experiments to evaluate the VQ-GAN model, focusing on the impact of the added loss functions. The generated images improved markedly in quality compared with those from the earlier VQ-VAE model.

The move to VQ-GAN, together with the combined loss, is the main advance in this project. A summary of all the experiments and outputs, which show the improved quality and realism of the generated synthetic MRI images, can be accessed here.

In conclusion, our project represents a significant contribution to the domain of synthetic MRI image generation, leveraging the latest advancements in diffusion models and GAN technology to produce high-quality, realistic images. These developments hold great promise for the enhancement of medical imaging analysis and the broader field of medical AI research.

Built With

jupyter-notebook
python
shell

Updates

Aayush Jaiswal started this project — May 17, 2024 06:38 PM EDT
