Final Writeup: https://docs.google.com/document/d/1YYBO_dxzi9H8zOZ97is-OcrT5Qyr0Ct5nPuJFRZ-Cqs/edit?usp=sharing
Introduction
The colorization of grayscale images is an ill-posed problem with multiple correct solutions. In this project, we propose an adversarial learning approach to colorization coupled with semantic information. A generative network infers the chromaticity of a given grayscale image, conditioned on semantic cues. This network is framed in an adversarial model that learns to colorize by incorporating perceptual and semantic understanding of color and class distributions. The model is trained via a fully self-supervised strategy. Qualitative and quantitative results show the capacity of the proposed method to colorize images realistically, achieving state-of-the-art results.
Related Work
“Colorful Image Colorization” takes a grayscale photograph as input and attacks the problem of hallucinating a plausible color version of it. The system is implemented as a feed-forward pass in a CNN at test time and is trained on over a million color images. The authors evaluate it with a “colorization Turing test.” Code: https://github.com/richzhang/colorization
Data
We will use the ILSVRC2012 dataset from Kaggle: https://www.kaggle.com/c/imagenet-object-localization-challenge/overview/description
Methodology
The network consists of 8 blocks, each with 3 convolution layers, 1 ReLU layer, and 1 batch normalization layer. From the paper: “The net has no pool layers. All changes in resolution are achieved through spatial downsampling or upsampling between conv blocks.” We will train on Google Cloud, which should be faster than training locally. We expect the hardest parts of implementing the paper to be porting the code from PyTorch to TensorFlow and coping with the long training times that come with many convolution layers.
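The layer layout described above can be sketched as follows. This is a minimal illustration, not the paper's exact network: the names `block_layout`, `network_layout`, and `build_model` are hypothetical, the filter count, kernel size, and single-channel input shape are placeholder assumptions, and the downsampling and upsampling between blocks is left out.

```python
# Layout taken literally from the description above: 8 blocks, each with
# 3 convolutions, 1 ReLU, and 1 batch normalization, and no pooling layers.
NUM_BLOCKS = 8
CONVS_PER_BLOCK = 3


def block_layout():
    """Return the layer kinds of one block, in order."""
    return ["conv"] * CONVS_PER_BLOCK + ["relu", "batchnorm"]


def network_layout():
    """Return the layer kinds of the whole stack, block after block."""
    return [kind for _ in range(NUM_BLOCKS) for kind in block_layout()]


def build_model(filters=64, kernel_size=3):
    """Build the stack in Keras.

    `filters` and `kernel_size` are placeholder values (assumptions), and
    the resolution changes between blocks are omitted.
    """
    import tensorflow as tf  # imported lazily; only needed to build the model

    make_layer = {
        "conv": lambda: tf.keras.layers.Conv2D(filters, kernel_size, padding="same"),
        "relu": lambda: tf.keras.layers.ReLU(),
        "batchnorm": lambda: tf.keras.layers.BatchNormalization(),
    }
    # Assumed input: a grayscale image of any size with one channel.
    layers = [tf.keras.Input(shape=(None, None, 1))]
    layers += [make_layer[kind]() for kind in network_layout()]
    return tf.keras.Sequential(layers)
```

Calling `network_layout()` needs no dependencies and is a quick way to check the layer count before building the model with `build_model()`.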
Metrics
Since image colorization has multiple acceptable answers, accuracy is not a metric that we can use to evaluate our model. Instead, we will ask human participants to choose between a generated image and the ground-truth color image. Our base goal is to fool humans on 5% of the trials, our target is 10%, and our stretch goal is 15%. Additionally, we will compute the PSNR of the generated images with respect to the ground truth and compare it with the values obtained by other fully automatic methods.
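Both metrics can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names `fooling_rate` and `psnr` are hypothetical, images are treated as flat sequences of pixel values, and the peak value of 255 assumes 8-bit images. PSNR here is the standard definition, 10 * log10(MAX^2 / MSE).

```python
import math


def fooling_rate(choices):
    """Fraction of trials in which the participant picked the generated image.

    `choices` is a sequence of booleans, True when the generated image was
    chosen over the ground truth.
    """
    if not choices:
        raise ValueError("no trials recorded")
    return sum(choices) / len(choices)


def psnr(reference, generated, max_value=255.0):
    """PSNR in dB between two equal-length flat sequences of pixel values."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixel values.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A fooling rate of 0.05, 0.10, or 0.15 would correspond to the base, target, and stretch goals above.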
Ethics
Colorizing a grayscale image requires a deep understanding of how colors and textures are associated with different objects and features in the image. Deep learning models such as CNNs can learn the complex relationships by automatically extracting features from the input image at different scales and combining them to generate a color image.
If the dataset disproportionately represents a certain racial group or demographic, or depicts certain scenarios with an underlying bias, the model will not perform accurately when presented with grayscale images of underrepresented groups. For example, if the dataset consists mostly of faces of one racial group, the model may not perform well on faces of other racial groups.
We are concerned that some images were collected illegally or without permission. Since there are millions of images in the database, those illegal images are hard to identify.
Division of labor
Our main division of labor will be the following:
Preprocessing: Ben, Yaoqi
Implementation: Ben, Seyit
Evaluation: Seyit, Zahra
Write up and Poster: Yaoqi, Zahra
Additionally, we will all work together when necessary.
Built With
- tensorflow