
Built With

  • tensorflow

Updates


Private user posted an update

The following is a copy of our initial project proposal. You can also read it on Google Drive here

Introduction: In recent years, Generative Adversarial Networks (GANs) have become an increasingly popular technique for generating and manipulating images. Our goal is to implement the model used in this paper, whose authors train a model to detect differences between real and GAN-generated images by calculating co-occurrence matrices on images and passing them into a CNN. This boils down to a standard binary classification problem: distinguishing real images from generated ones.

Related Work: This paper is closely related to the one we plan to implement. Its authors also train a classifier to distinguish real images from GAN-generated ones, training on images produced by a single GAN (ProGAN) and then testing on image sets generated by various other GANs. They also experiment with data augmentations such as blur and JPEG compression to see whether these improve the classifier's performance.

https://openaccess.thecvf.com/content_CVPR_2020/papers/Wang_CNN-Generated_Images_Are_Surprisingly_Easy_to_Spot..._for_Now_CVPR_2020_paper.pdf

Data: The authors used the CycleGAN and StarGAN image sets for training and testing. Both are available open source on GitHub.

Methodology: As in the original paper, our plan is to calculate co-occurrence matrices on the three RGB color channels. This is done directly on each image's pixels and outputs a 3 x 256 x 256 tensor.
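The paper does not spell out the exact construction, so the following is a minimal sketch under our own assumptions: a single horizontal neighbour offset, one joint histogram per channel, and function names (`cooccurrence`, `image_cooccurrence`) that are ours, not the paper's.

```python
import numpy as np

def cooccurrence(channel):
    """256 x 256 joint histogram of horizontally adjacent pixel values
    for one uint8 channel: entry (i, j) counts how often value i sits
    immediately left of value j."""
    left = channel[:, :-1].ravel().astype(np.int64)
    right = channel[:, 1:].ravel().astype(np.int64)
    # Encode each (left, right) pair as a single index, then histogram.
    return np.bincount(left * 256 + right, minlength=256 * 256).reshape(256, 256)

def image_cooccurrence(img):
    """H x W x 3 uint8 image -> 3 x 256 x 256 tensor, one matrix per channel."""
    return np.stack([cooccurrence(img[:, :, c]) for c in range(3)])
```

Other offsets (vertical, diagonal) or a sum over several offsets would also fit the paper's description; this is just one plausible reading.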

These tensors will then be passed to our convolutional neural network. Our current plan for the CNN's architecture is six convolutional layers, with max-pooling and ReLU layers alternating between them, followed by one to three dense layers at the output. The overall architecture is similar to CNNs we have implemented in this class, but with more layers to align more closely with the paper we intend to replicate, which means training far more parameters. The exact layer sizes are subject to change, but as a starting point the convolutional layers will use 32 or 64 filters of size 3x3 or 5x5, and the dense layers will start at 128 units and potentially go up to 512. We believe (or perhaps hope, though we are somewhat confident) that this will be within the computational capacity of the department machines; if not, we plan to ask our mentor whether cloud credits are a possibility. We will train for a number of epochs that depends on the runtime of each epoch, using the Adam optimizer and experimenting with different learning rates to find what works best.
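As a concrete starting point, the plan above might be sketched in Keras as follows. All filter counts, kernel sizes, and dense widths are our provisional guesses, not the paper's exact settings, and the co-occurrence tensors are assumed to be transposed to channels-last before being fed in.

```python
import tensorflow as tf

def build_model():
    # Six conv layers with ReLU and max-pooling, then dense layers,
    # as described above; every size here is subject to change.
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(256, 256, 3)),  # co-occurrence tensor, channels-last
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 5, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 5, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # real (0) vs. generated (1)
    ])

model = build_model()
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="binary_crossentropy",
              metrics=["accuracy"])
```

A sigmoid output with binary cross-entropy matches the binary real-vs-generated framing; the 1e-3 Adam learning rate is just the usual default we will tune from.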

Finally, since our data comes pre-split into labelled training and testing sets, after training we will measure the accuracy of our model and compare it with the original paper's.

Metrics: Accuracy is an appropriate measure for our project. We will judge our success by comparing our implementation's accuracy against the original paper's, which its authors report and plot. Our base goal is a working model whose accuracy falls short of the original paper's; our target goal is to match the paper's accuracy; and our stretch goal is to exceed it and also experiment with additional data sources, such as GANs not mentioned in the paper or real-world face image sets.
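For clarity, the accuracy we will report is simply the fraction of images whose thresholded prediction matches the label; a small hypothetical helper (names and the 0.5 threshold are ours):

```python
import numpy as np

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded probability matches the 0/1 label."""
    preds = np.asarray(y_prob) >= threshold
    return float(np.mean(preds == np.asarray(y_true).astype(bool)))
```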

Ethics: There are a multitude of broader societal issues relevant to detecting fake images.

The existence of GAN-generated and GAN-altered images poses a clear societal issue: GANs can be used to create images that deceive people. Currently it may be possible for humans to tell GAN images and real images apart, but as GAN technologies develop further, it is likely that humans will no longer be able to detect them. In fact, the New York Times notes that these technologies are already being used in harmful and deceptive ways: “[GAN generated images] are starting to show up around the internet, used as masks by real people with nefarious intent: spies who don an attractive face in an effort to infiltrate the intelligence community; right-wing propagandists who hide behind fake profiles, photo and all; online harassers who troll their targets with a friendly visage.” Thus, the work in this project offers a tool to help mitigate some of these harms.

It is likely that one day deep learning tools will be used in the real world to tell whether an image is fake or real. Clearly this can help combat some of the harmful uses of GAN images discussed above. However, it is important to consider the stakeholders in the real-world use of such a model. First, if such a model is ever used on a picture of a real person, we would want it to have a high degree of accuracy. Further, it is important to check such a model for biases and to ensure that any real-world faces it is trained on are representative, as it would be dangerous if the model viewed images of one group of people as more “real” than others. Finally, it is also important to ensure that, if such a model is used in the real world, it does not cause more harm than good.

Division of Labor: We have not fully fleshed this out, but we hope to split up the work of the implementation and presentation portions of the project (so that no one person does all the work), while also working together to make sure everything integrates smoothly.

After a week, we will likely all move to working on the CNN, tuning its parameters, and gauging our success.



Private user posted an update

Detecting GAN generated Fake Images using Co-occurrence Matrices: Checkpoint 2 Thomas Kim (tkim61), Henry Sowerby (hsowerby), Jacob Makar-Limanov (jmakarli)

Introduction: This is copied from the proposal.

In recent years, Generative Adversarial Networks (GANs) have become an increasingly popular technique for generating and manipulating images. Our goal is to implement the model used in this paper, whose authors train a model to detect differences between real and GAN-generated images by calculating co-occurrence matrices on images and passing them into a CNN. This boils down to a standard binary classification problem: distinguishing real images from generated ones.

Challenges: What has been the hardest part of the project you’ve encountered so far?

In implementing an existing paper, it is challenging to pin down the exact implementation, especially when the paper does not include code; a number of things that seem straightforward are actually difficult to implement yourself. With respect to the CNN, the paper lists the layers used but does not specify the parameters of its max-pooling layers, so figuring out the right ones is an added complexity. The paper was also vague about co-occurrence matrices: the authors mention using the method but do not fully explain what a co-occurrence matrix is or how to calculate it. To implement this, further reading from the paper's citations and elsewhere on the internet was needed. Similarly to the unclear max-pooling parameters, it is also unclear whether the authors used a specific co-occurrence filter or had the model learn one.

Insights: Are there any concrete results you can show at this point? How is your model performing compared with expectations?

Because we have not fully trained the model, we don’t have any complete results, nor do we have plots to display. However, our CNN model and our method for finding co-occurrence matrices are implemented, and tentative results suggest that these implementations are working.

Plan: Are you on track with your project? What do you need to dedicate more time to? What are you thinking of changing, if anything?

Overall, we are on track with our project: our CNN model and co-occurrence methods are implemented and seem to perform reasonably with very small amounts of data. Now we need to dedicate our time to fully training the model.

Data: We successfully downloaded StarGAN, and BigGAN can be accessed directly from TensorFlow's website.
