Mid-Project Update

Introduction

We are implementing the paper “Image-to-Image Translation with Conditional Adversarial Networks” by Isola et al., which investigates a general-purpose solution for generating output images from input images. The model described in the paper has shown promising results at reconstructing and synthesizing images from label maps and edge maps. Our goal is to construct images of clouds from an input black-and-white mask, a task we hope will benefit from the model’s success at synthesizing images from segmentation data. If this model works as expected, we will be able to synthesize realistic cloud images of any shape.

Progress

We’re feeling good about our progress so far. We’ve completed our preprocessing code, which reads in the image files and outputs resized, normalized NumPy arrays representing the clouds and masks. We’ve also begun work on the generator model, defining the Keras layers of the encoder-decoder and completing the forward pass. The generator we are currently implementing uses a standard encoder-decoder architecture, but the paper found improved results with a U-Net architecture, which adds skip connections between mirrored encoder and decoder layers. Since our input masks have few high-level features, we expect the plain encoder-decoder to work adequately, but we still plan to add skip connections soon and upgrade to the U-Net architecture.
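To make the architecture concrete, here is a minimal sketch of the generator in Keras. It assumes 256×256 single-channel masks as input and 3-channel cloud images as output; the filter sizes, layer counts, and the `use_skips` flag are illustrative choices rather than our final configuration. With `use_skips=True`, each decoder layer is concatenated with the matching encoder layer, which is the U-Net upgrade described above.

```python
import tensorflow as tf
from tensorflow.keras import layers

def downsample(filters, size=4):
    """Conv block that halves spatial resolution (one encoder step)."""
    return tf.keras.Sequential([
        layers.Conv2D(filters, size, strides=2, padding="same", use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
    ])

def upsample(filters, size=4):
    """Transposed-conv block that doubles spatial resolution (one decoder step)."""
    return tf.keras.Sequential([
        layers.Conv2DTranspose(filters, size, strides=2, padding="same", use_bias=False),
        layers.BatchNormalization(),
        layers.ReLU(),
    ])

def build_generator(input_shape=(256, 256, 1), use_skips=True):
    """Encoder-decoder generator; use_skips=True gives the U-Net variant."""
    inputs = layers.Input(shape=input_shape)

    # Encoder: progressively downsample the mask, keeping each feature map.
    x = inputs
    skips = []
    for f in [64, 128, 256, 512]:
        x = downsample(f)(x)
        skips.append(x)

    # Decoder: upsample back toward the input resolution.
    for f, skip in zip([256, 128, 64], reversed(skips[:-1])):
        x = upsample(f)(x)
        if use_skips:
            # U-Net skip connection: concatenate the matching encoder feature map.
            x = layers.Concatenate()([x, skip])

    # Final layer maps back to a 3-channel cloud image in [-1, 1].
    outputs = layers.Conv2DTranspose(3, 4, strides=2, padding="same",
                                     activation="tanh")(x)
    return tf.keras.Model(inputs, outputs, name="generator")
```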

This image-to-image translation project differs from our past assignments in that it consists of two entirely separate models training in parallel. It was a challenge getting our heads around this architecture, especially when it came to structuring the two models in Python. After attending the GAN lectures and looking at the GAN lab, we have a much better understanding of these concepts. It seems that, because the generator and discriminator are fully separate, it should be straightforward to define them as separate Keras models whose only overlap is in the train function.
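As a rough sketch of how that separation might look in code, the generator and discriminator below are assumed to be two independent tf.keras.Model instances, and the only place they interact is a single train step. The names (train_step, masks, real_clouds) are placeholders, and the discriminator is assumed to take the (mask, image) pair as in the paper’s conditional setup; the loss follows the paper’s cGAN objective plus an L1 term weighted by λ = 100.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
gen_opt = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
disc_opt = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)

@tf.function
def train_step(generator, discriminator, masks, real_clouds):
    """One update of both models; the only place they overlap."""
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        fake_clouds = generator(masks, training=True)

        # Assumed conditional discriminator: scores (mask, image) pairs.
        real_logits = discriminator([masks, real_clouds], training=True)
        fake_logits = discriminator([masks, fake_clouds], training=True)

        # Discriminator: push real pairs toward 1, generated pairs toward 0.
        disc_loss = (bce(tf.ones_like(real_logits), real_logits) +
                     bce(tf.zeros_like(fake_logits), fake_logits))

        # Generator: fool the discriminator, plus an L1 term pulling outputs
        # toward the ground-truth clouds (the paper weights it by 100).
        gen_loss = (bce(tf.ones_like(fake_logits), fake_logits) +
                    100.0 * tf.reduce_mean(tf.abs(real_clouds - fake_clouds)))

    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    gen_opt.apply_gradients(zip(gen_grads, generator.trainable_variables))
    disc_opt.apply_gradients(zip(disc_grads, discriminator.trainable_variables))
    return gen_loss, disc_loss
```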

Next Steps

Our next steps will be to code the discriminator model and to complete the generator. Then we’ll write code to batch our inputs and train our models. Finally, we’ll begin producing concrete results by testing our model on real and fabricated masks and writing code to visualize the results.
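As a rough idea of the batching step, we will likely wrap the preprocessed NumPy arrays in a tf.data pipeline along these lines (the function name and batch size are placeholders):

```python
import numpy as np
import tensorflow as tf

def make_dataset(masks: np.ndarray, clouds: np.ndarray, batch_size: int = 16):
    """Shuffle and batch the preprocessed (mask, cloud) array pairs."""
    ds = tf.data.Dataset.from_tensor_slices((masks, clouds))
    ds = ds.shuffle(buffer_size=len(masks)).batch(batch_size)
    return ds.prefetch(tf.data.AUTOTUNE)
```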
