Lab 4

Introduction

In this lab, we will generate new images based on the images in the MNIST data set using a Generative Adversarial Network. You should reuse the code from lab 3 to read the MNIST data set, to write image files, and as the basis for the classifier. We recommend that you limit yourself to images of a single digit first, because the network will be easier to train this way. Once you have the training code up and running, you can try other digits as well. If you get lost, this article contains a description of how to train a GAN on the MNIST data set. Do not just copy the source code from the article: our structure is slightly different, and if you don’t know what you are doing, it will not work. If you understand what you are copying, clearly mark which source code was not written by you. Plagiarism will not be tolerated and will result in 0 points. Here you can also find some tips and tricks to make your GANs work better.

Report

You are required to document your work in a report that you should write while you work on the lab. Include all requested images, and any other graphs you deem interesting, and describe what you observe. The lab text will prompt you for specific information at times, but you are expected to fill in other text to produce a coherent document. At the end of the lab, send an email with the names and carnés of the students in the group, as well as a zip file containing the lab report as a pdf and all code you wrote, to the two professors (markus.eger.ucr@gmail.com, marcela.alfarocordoba@ucr.ac.cr) with the subject “[PF-3115]Lab 4, carné 1, carné 2” before the start of class on 16/6. Do not include the data set in this zip file or email.

Generative Adversarial Networks

As discussed in class, a GAN consists of two separate neural networks: a generator and a discriminator. These two networks “play a game” against each other: the generator has the goal of producing images that look as realistic as possible, and the discriminator has to determine which images were produced by the generator and which are real. Define two classes Generator and Discriminator as subclasses of torch.nn.Module and populate them with layers. As a start, use three hidden layers with LeakyReLU activation functions and an output layer with a Tanh or Sigmoid activation function (take care of the scaling: if you use Tanh for the generator, all your images should be scaled to the interval (-1,1); with Sigmoid, to (0,1)). Typically, you will want the number of units to increase sequentially in the hidden layers of the generator (e.g. 128, 256, 512) and to follow the reverse sequence in the hidden layers of the discriminator.

The input to the generator should be a vector of 100 (random) numbers, and the output will be an image with the dimensions of an MNIST image (28x28). The input for the discriminator will be an image (28x28), and the output will be a single classification: fake or not fake.
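The two networks described above could be sketched as follows. This is a minimal sketch, not a required implementation: the layer sizes follow the suggestions above, the 28x28 images are treated as flattened 784-dimensional vectors (reshape as needed when writing image files), and the Tanh output means the MNIST images must be scaled to (-1,1).

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, noise_dim=100):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(noise_dim, 128),
            nn.LeakyReLU(0.2),
            nn.Linear(128, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 28 * 28),
            nn.Tanh(),  # outputs in (-1, 1); scale your MNIST images to match
        )

    def forward(self, x):
        return self.layers(x)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(28 * 28, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 128),
            nn.LeakyReLU(0.2),
            nn.Linear(128, 1),
            nn.Sigmoid(),  # probability that the input image is real
        )

    def forward(self, x):
        return self.layers(x)
```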

Training

To train your GAN, you will need two functions: one to train the generator and one to train the discriminator. First, write a function train_discriminator, to which you can pass a list of real images (choose one digit: your birthday, the last digit of your carné, your number of cats, or similar, and use only the images for that digit from the MNIST data set) and a list of fake images. Let your discriminator predict the labels for both sets of data, calculate the loss with nn.BCELoss(), calculate the gradients of the parameters, and perform an optimization step. Test this function with some random images/tensors (generated with torch.rand) and some images from the MNIST data set.
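One possible structure for train_discriminator is sketched below, assuming the images are flattened tensors and the discriminator outputs one probability per image. The parameter order matches the training loop outline later in this lab; note how the two backward() calls accumulate gradients before a single optimization step.

```python
import torch

def train_discriminator(optimizer, loss_fn, discriminator, real_data, fake_data):
    optimizer.zero_grad()
    # Real images should be classified as real (label 1)
    pred_real = discriminator(real_data)
    error_real = loss_fn(pred_real, torch.ones(real_data.shape[0], 1))
    error_real.backward()
    # Fake images should be classified as fake (label 0)
    pred_fake = discriminator(fake_data)
    error_fake = loss_fn(pred_fake, torch.zeros(fake_data.shape[0], 1))
    error_fake.backward()
    # Gradients from both backward() calls have accumulated; one step uses the total
    optimizer.step()
    return (error_real + error_fake).item()
```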

Then write a function train_generator that first samples random noise, passes it to the generator, passes the generated images on to the discriminator, and calculates the loss afterwards (caution: for the generator, your goal is for the discriminator to predict these fake images as real), then calculates the gradient and performs an optimization step. For training the discriminator, you will need a set of “real” images and a set of “fake” images. You start with randomly generated images, but then add the images generated by the generator in each iteration. You will want to keep some, but not all, “old” images around, in order not to “forget” what fake images looked like in the past. One reasonable approach would be to keep 100 old images around: in each iteration, you generate 300 fake images with the generator and add them to the 100 stored images (resulting in 400 fake images total). After the training step for the discriminator, you keep a random selection of 100 of these 400 images for the next iteration. Another approach would be to take a small number (around 20) of images from each iteration and add them to an ever-growing collection of fake images.
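A matching sketch for train_generator is shown below. The batch size and noise dimension of 100 follow the training loop outline later in this lab; note that the generated images are not detached here, because the whole point is to get gradients for the generator’s parameters.

```python
import torch

def train_generator(optimizer, loss_fn, discriminator, generator,
                    batch_size=100, noise_dim=100):
    optimizer.zero_grad()
    noise = torch.randn(batch_size, noise_dim)
    fake_images = generator(noise)  # no .detach(): we need the generator's gradients
    pred = discriminator(fake_images)
    # The generator wants the discriminator to label its images as real (1)
    error = loss_fn(pred, torch.ones(batch_size, 1))
    error.backward()
    optimizer.step()
    return error.item()
```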

Your overall training will consist of an outer loop, which you should repeat for several iterations (try around 200). In each iteration, you train the discriminator for several iterations (choose e.g. 100 random real and fake images every iteration), and then the generator for several iterations.

The general outline for your training loop should be:


for i in range(n):

    # Train discriminator
    for j in range(n1):
        fake_data = generator(torch.randn(100, 100)).detach()
        real_data = sample(real_images, 100)
        d_error = train_discriminator(d_optimizer, loss_fn, discriminator, real_data, fake_data)
        print(j, d_error)

    # Train generator
    for j in range(n2):
        g_error = train_generator(g_optimizer, loss_fn, discriminator, generator)
        print(j, g_error)

    # Sample some fake images at random
    fake_data = generator(torch.randn(100, 100)).detach()
    for j in range(fake_data.shape[0]):
        if random.random() < 0.1:
            show_image(fake_data[j], 'img_%d_%d'%(i,j), SCALE_01)
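The fake-image pool described earlier (keep 100 old images, train on 400) could be maintained with a small helper like the following. The function name update_fake_pool is hypothetical, not part of any required interface; it assumes the fakes are stacked in a single tensor.

```python
import random
import torch

def update_fake_pool(stored, new_fakes, keep=100):
    # Mix freshly generated fakes with the stored ones; return the full
    # batch for this discriminator step, plus a random subset of `keep`
    # images to store for the next iteration.
    all_fakes = torch.cat([stored, new_fakes]) if stored is not None else new_fakes
    idx = random.sample(range(all_fakes.shape[0]), min(keep, all_fakes.shape[0]))
    return all_fakes, all_fakes[idx]
```

On the first iteration you would pass stored=None; afterwards, feed the full batch to the discriminator’s training step and carry the returned subset over to the next iteration.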

Notes:

  • Use Adam as the optimizer for both networks
  • You may have to use a very low learning rate (0.0001 or less), especially if you notice that your generator error is 0, and the discriminator error is 1.
  • You should .detach() after you create fake images for training the discriminator, so that you don’t unnecessarily calculate gradients for the generator as well
  • Don’t forget to .zero_grad()
  • The fact that pytorch accumulates gradients is actually useful when training the discriminator: You can pass both image sets (fake and real) separately, calculate the loss on each, call backward() each time, and you’ll have the total gradient.
  • One of the most common problems with GANs is that all produced images are identical or nearly identical (called “mode collapse”), because the generator has found the “perfect” image to fool the discriminator. Unfortunately, the only thing you can do when that happens is to restart training (perhaps with a lower learning rate) and hope to get a different/better initialization. Dropout layers (with probabilities of 0.5 even) in the generator may also help.

Perform this training and inspect the resulting images. Use a low n1 and n2 (around 20) at the beginning until you are sure that your code works and the loss actually decreases. Then increase these values to 200-400. For the overall n, you will only need 100 iterations or so. Don’t forget to include some sample images in your report, but also provide an estimate of what percentage of images “looks decent”, in your opinion. Once you have everything working, go back, and change the selection of which digit to generate (e.g. if you started with generating 2s, generate 4s now). You should be able to do this by changing the digit in a single place, and everything else should work the same way.

Useful Resources