class: center, middle

# Artificial Intelligence

## Generative Adversarial Networks

---

# Artificial Neural Networks

.left-column[
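A minimal PyTorch sketch of the network defined by the formulas on the right (layer sizes and activation functions are illustrative assumptions):

```python
import torch
import torch.nn as nn

# h = f1(W1·x), y = f2(w2·h)
net = nn.Sequential(
    nn.Linear(784, 128),  # W1: input vector x -> hidden vector h
    nn.ReLU(),            # f1: hidden activation
    nn.Linear(128, 1),    # w2: hidden vector h -> output y
    nn.Sigmoid(),         # f2: output activation
)

y = net(torch.randn(1, 784))  # one forward pass on a random input
```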
]

.right-column[
We introduced one or more "hidden layers", which hold the intermediate values h:

$$
\vec{h} = f_1(W_1 \cdot \vec{x})\\\\
y = f_2(\vec{w_2} \cdot \vec{h})\\\\
y = f_2(\vec{w_2} \cdot f_1(W_1 \cdot \vec{x}))
$$
]

---

# Adversarial Training

* Say someone has a neural network that can distinguish between cats and non-cats

* We want to "smuggle" a cat past the network

* This means: We want an image of a cat that the network identifies as a non-cat

* Why? To improve the network, of course! (There are more sinister applications, too)

---

# Adversarial Training

* Take an existing image of a cat

* Change it "a little bit"

* Check if it is now classified as a non-cat

* Repeat

---

# Adversarial Training

* Pass your existing image through the network

* Note which pixels have the greatest impact on the result (the gradient of the output with respect to each pixel)

* Change (only) those pixels

* One way to implement this is sketched on the next slide
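---

class: medium

# Adversarial Perturbation: A Sketch

A minimal PyTorch sketch of the recipe on the previous slide (an FGSM-style perturbation restricted to the most influential pixels), assuming a trained classifier `model` that takes flattened images; `k` and `step` are illustrative values:

```python
import torch

def perturb(model, image, label, loss_fn, k=40, step=0.2):
    """Nudge the k most influential pixels to push `model` off `label`."""
    image = image.clone().requires_grad_(True)
    loss_fn(model(image), label).backward()   # gradient w.r.t. the pixels
    grad = image.grad.flatten()
    top = grad.abs().topk(k).indices          # pixels with the greatest impact
    adv = image.detach().flatten().clone()
    adv[top] += step * grad[top].sign()       # change (only) those pixels
    return adv.clamp(0, 1).view_as(image)     # keep valid pixel values

# "Repeat": apply this until the classifier changes its prediction
```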
---

# Adversarial Example

--

This is a bird!

---

# Generative Adversarial Networks

* So far we have used Neural Networks to classify images, or predict some value

* Could we **generate** things with a Neural Network?

* Crazy idea: We pass the Neural Network some random numbers and it produces a new Picasso-like painting

--

* That's exactly what we'll do!

---

# First: Classification

* To produce a Picasso-like painting, we first need to know which paintings *are* Picasso-like

* We could train a Neural Network that detects "real" Picassos (the "Discriminator")

* Input: An image

* Output: "True" Picasso, or "fake"

* So we'll need some real and fake Picassos to start with ...

---

# Art Connoisseur Network

* After some training, our network will be able to distinguish real and fake Picassos

* This means we can give this network a new painting, and it will tell us whether it is real or not

* Now we can define the task for our generator more clearly: Fool the discriminator network, i.e. generate paintings that the discriminator recognizes as "real" Picassos

---

class: medium

# The Generator Network

* The Generator Network takes, as we wanted, a vector of random numbers as input, and produces a picture as output

* The **loss function** for this network consists of passing the produced image through the discriminator and checking whether it believes the painting to be real

* We can then use backpropagation and gradient descent, as usual, to update the weights in our generator

* Over time, our generator will learn to fool the discriminator! (this update is sketched a few slides ahead)

---

# Not quite enough ...

* If our discriminator were "perfect", this would already be enough

* However, to start, we needed some "fake" Picassos, which we just generated randomly

* Once the Generator produces some images, we actually have "better fakes"!

* So we can improve the Discriminator with them

* And then we need to improve the Generator again, etc.

---

# Generative Adversarial Networks

* Generative: We **generate** images

* Adversarial: The Generator and the Discriminator play a "game" against each other
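---

class: medium

# The Generator's Update: A Sketch

A minimal sketch of how "fool the discriminator" becomes a loss function; the tiny architectures for `G` and `D` are illustrative stand-ins:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 784), nn.Sigmoid())  # noise -> image
D = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid())    # image -> "is it real?"

loss_fn = nn.BCELoss()
opt = torch.optim.Adam(G.parameters())  # only the generator's weights

noise = torch.randn(64, 100)            # a batch of random input vectors
verdict = D(G(noise))                   # generate, then let D judge
# The generator "wants" a verdict of 1 ("real"), so 1 is its target
loss = loss_fn(verdict, torch.ones_like(verdict))
loss.backward()                         # gradients flow through D into G...
opt.step()                              # ...but only G's weights are updated
```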
---

class: medium

# The Generative Game

* Discriminator learns to detect fake images (optimization with gradient descent)

* Generator learns to produce fake images that look real to the discriminator (optimization with gradient descent)

* Discriminator learns to detect these new fake images

* Generator learns to fool the updated discriminator

* ...

---

# Mode Collapse

* The goal of the generator is to **minimize** the error (= how many images the discriminator recognizes as fake)

* The input of the generator is random noise

* Imagine there is a **perfect** fake image

* The generator could learn to ignore the input and produce only this image

---

# Mode Collapse

* Once the generator produces **only** the perfect image, the loss, and therefore the gradient, will be the same for every image

* In the next iteration, the generator will again produce only **one** image

* The generation process has "collapsed" to a single example

* Generally, we don't want that

---

# Mode Collapse: Randomization

* What can we do? Once we get to that point, nothing :(

* To prevent getting there: Introduce more randomness

* Dropout layers: After the activation function, randomly set values to 0 (with probability p)

* Randomize labels: When training the generator, randomly set some of the labels to 0

---

# Mode Collapse: Diversify Generation

* Another option is to explicitly encourage the generation of different images

* For each batch of generated images, calculate the average per-pixel variance

* Use this variance as an additional input for the discriminator

* If the variance is (close to) 0 very often, the discriminator will learn to use that to identify fake images (sketched on the next slide)
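---

class: medium

# Diversify Generation: A Sketch

A minimal sketch of the variance feature from the previous slide (the discriminator's input size grows from 784 to 785; names are illustrative):

```python
import torch

def with_variance_feature(images):
    """Append the batch's average per-pixel variance to every image."""
    # images: (batch, 784); variance of each pixel across the batch
    per_pixel_var = images.var(dim=0)              # shape: (784,)
    avg_var = per_pixel_var.mean()                 # one number per batch
    feature = avg_var.expand(images.size(0), 1)    # same value for each image
    return torch.cat([images, feature], dim=1)     # shape: (batch, 785)

# A batch of identical (collapsed) fakes has variance 0, which the
# discriminator can easily learn to spot, pushing the generator to vary.
```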
---

# GAN Variants

* Generating faces or photos from existing ones

* Additionally providing a class to generate specific pictures

* Generating an image from a textual description

* Applying a style to an existing image ("Style transfer")

---

# Samples

(sample images generated by GANs)

---

# Cycle GAN

(example images: translating images between two visual domains)
---

class: center, middle

# Lab 4

---

# Lab 4

* Let's build a GAN to generate more MNIST-like images

* Focus on only one digit (pick one: your birthday, the last digit of your student ID, your favorite number, ...)

* As mentioned above, we need two neural networks: a discriminator and a generator

---

class: mmedium

# Lab 4: Discriminator

* We only need one output to distinguish between fake (0) and real (1)

* Use `torch.nn.BCELoss` (binary cross-entropy loss), which expects the output from the network and the expected output (0 or 1) **in the same shape**

* Take the images from the MNIST data set for one digit as real data

* Generate completely random images as the initial fake data

* Train the network to distinguish the two

---

class: mmedium

# Lab 4: Generator

* For the Generator, we want a very different network

* It takes 100 inputs and produces 784 outputs (one for each pixel)

* Sample these 100 inputs randomly from a normal distribution (`torch.randn`)

* The loss function is a bit more involved: First pass the output through the (trained) discriminator network, then pass the result to `BCELoss`, where we **want our outputs to be 1** (i.e. the generator wants the discriminator to say that the images are real), then do a `backward` call and an optimizer step as normal

* Your generator optimizer will only have the generator parameters as its target and will therefore not change the discriminator!

---

# Lab 4: Training

* In a loop, alternate between training the generator and the discriminator

* Keep an "episode memory" of fake images, to which you add newly generated fakes in every iteration

* Train the generator and the discriminator each for a couple of iterations

* Put both into a larger loop

* The lab description has skeleton code for this training loop setup (a rough sketch also follows after the references)

---

class: medium

# Lab 4: Summary

* You need two Neural Networks, two optimizers, and two training functions

* There is one overall loop that contains two sub-parts:

  - Train the discriminator

  - Train the generator

* Store images produced by the generator as fake images to train the discriminator

* Make sure to preserve some old images!

* You don't have to address mode collapse, but you can

---

# References

* [Torch Tensor Operations Overview](https://jhui.github.io/2018/02/09/PyTorch-Basic-operations/)

* [GAN Introduction](https://machinelearningmastery.com/how-to-develop-a-generative-adversarial-network-for-an-mnist-handwritten-digits-from-scratch-in-keras/)

* [GAN Hacks](https://github.com/soumith/ganhacks)

* [GAN Variations](https://developers.google.com/machine-learning/gan/applications)

* [Large Scale GAN Training for High Fidelity Natural Image Synthesis](https://arxiv.org/abs/1809.11096)
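---

class: mmedium

# Appendix: Lab 4 Starter Sketch

A rough sketch of the pieces described above, assuming flattened 28×28 images; all layer sizes, learning rates, and loop counts are illustrative, and `real_batches` stands in for your MNIST loading code. The skeleton in the lab description takes precedence:

```python
import random
import torch
import torch.nn as nn

# Discriminator: image (784 pixels) -> probability that it is real
D = nn.Sequential(nn.Linear(784, 128), nn.ReLU(),
                  nn.Linear(128, 1), nn.Sigmoid())

# Generator: 100 random numbers -> image (784 pixels in [0, 1])
G = nn.Sequential(nn.Linear(100, 128), nn.ReLU(),
                  nn.Linear(128, 784), nn.Sigmoid())

loss_fn = nn.BCELoss()
d_opt = torch.optim.Adam(D.parameters(), lr=2e-4)
g_opt = torch.optim.Adam(G.parameters(), lr=2e-4)  # only G's parameters!

def train_discriminator(real, fakes):
    d_opt.zero_grad()
    # Real images should score 1, stored fakes should score 0
    loss = (loss_fn(D(real), torch.ones(real.size(0), 1)) +
            loss_fn(D(fakes), torch.zeros(fakes.size(0), 1)))
    loss.backward()
    d_opt.step()

def train_generator(batch_size=64):
    g_opt.zero_grad()
    fakes = G(torch.randn(batch_size, 100))
    # The generator wants D to say "real" (1) about its fakes
    loss = loss_fn(D(fakes), torch.ones(batch_size, 1))
    loss.backward()      # gradients pass through D, but d_opt never sees them
    g_opt.step()
    return fakes.detach()

# The alternating game, with an "episode memory" of old fakes
memory = [torch.rand(64, 784)]     # initial fakes: completely random images
for real in real_batches():        # hypothetical: batches of your one digit
    for _ in range(3):
        train_discriminator(real, random.choice(memory))
    for _ in range(3):
        memory.append(train_generator())
```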