class: center, middle

# Artificial Intelligence

## Generative Adversarial Networks

---

# Artificial Neural Networks

* Last week, we discussed Artificial Neural Networks (ANNs)
* Today we will look at some interesting things we can do with them
* First, let's look at the interior of a neural network

---

# Artificial Neural Networks

.left-column[
]

.right-column[
Our neural networks contained "hidden layers", which hold intermediate values h:

$$
\vec{h} = f_1(W_1 \cdot \vec{x})\\\\
y = f_2(\vec{w_2} \cdot \vec{h})\\\\
y = f_2(\vec{w_2} \cdot f_1(W_1 \cdot \vec{x}))
$$
]
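---

# Artificial Neural Networks

To make these equations concrete, here is a minimal sketch of the two-layer forward pass in plain Python/numpy (the sigmoid activation and the layer sizes are illustrative choices, not fixed by the math above):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes: 4 inputs, 3 hidden neurons, 1 output
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))  # weight matrix of the hidden layer
w2 = rng.normal(size=3)       # weight vector of the output neuron

x = rng.normal(size=4)        # some input vector
h = sigmoid(W1 @ x)           # h = f1(W1 . x)
y = sigmoid(w2 @ h)           # y = f2(w2 . h)
```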
---
class: medium

# Information Content

* Our Neural Networks are deterministic functions
* In fact, each individual layer is a deterministic function
* This means each layer's output depends **only** on its input
* We can view the output at each layer as an "encoding" of the network's input that is used by the next layer

---
class: medium

# Auto-Encoders

* One application of this idea is the Auto-Encoder
* Auto-Encoders are neural networks with several layers that become narrower and narrower (fewer neurons) before widening again
* The number of inputs is the same as the number of outputs, and the training examples use the *same* values for input and output
* The goal is to learn a smaller *representation* of the input data
* In essence, the ANN has to reconstruct the input from fewer values

---
class: medium

# Auto-Encoders
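A minimal sketch of such a bottleneck architecture, here using PyTorch (the framework and all layer sizes are illustrative, e.g. flattened 28x28 images compressed down to 32 values):

```python
import torch.nn as nn

# The layers narrow to a bottleneck, then widen again;
# input and output have the same dimension (784 here)
autoencoder = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),    # encoder
    nn.Linear(128, 32), nn.ReLU(),     # bottleneck: the learned representation
    nn.Linear(32, 128), nn.ReLU(),     # decoder
    nn.Linear(128, 784), nn.Sigmoid()  # reconstruction of the input
)
```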
---
class: medium

# Auto-Encoders
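Training then uses the *same* values as input and as target. A minimal sketch, assuming the `autoencoder` from the previous slide and a stand-in batch of data:

```python
import torch

opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()

x = torch.rand(64, 784)  # stand-in batch; real images would go here
for step in range(100):
    reconstruction = autoencoder(x)
    loss = loss_fn(reconstruction, x)  # the target is the input itself
    opt.zero_grad()
    loss.backward()
    opt.step()
```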
---

# Vector Embeddings

* Auto-Encoders allow us to represent data with fewer values
* We can view this representation as **vectors**
* With the proper training, these vectors can be used instead of the original data in our actual application
* Next week we will talk about a related approach that represents words as vectors!

---

# ANNs: An Alternative View

* A Neural Network is a function that takes a vector as input and produces a vector as output
* We can tweak this function to produce outputs closer to the ones we already have
* As long as we can express what we want as something differentiable, e.g. a differentiable comparison with the training data, we can train the network with gradient descent

---

# Adversarial Training

* Say someone has a neural network that can distinguish between cats and non-cats
* We want to "smuggle" a cat past the network
* This means: We want an image of a cat that the network identifies as a non-cat
* Why? To improve the network, of course! (There are more sinister applications, too)

---

# Adversarial Training

* Take an existing image of a cat
* Change it "a little bit"
* Check if it is now classified as a non-cat
* Repeat

---

# Adversarial Training

* Pass your existing image through the network
* Note which pixels have the greatest impact on the result (the gradient of the output with respect to the input tells us)
* Change (only) those pixels, as in the sketch on the next slide
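---

# Adversarial Training

A minimal sketch of this idea, close in spirit to the "fast gradient sign method" (the classifier `model`, the cat image, and the step size `eps` are assumptions for illustration):

```python
import torch

def perturb(model, image, eps=0.01):
    """Nudge the pixels that most affect the model's cat score."""
    x = image.clone().requires_grad_(True)
    cat_score = model(x)  # assumed: a single scalar "this is a cat" score
    cat_score.backward()  # gradient of the score w.r.t. every pixel
    # Step each pixel slightly in the direction that lowers the score
    return (x - eps * x.grad.sign()).detach()
```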
---

# Adversarial Example

--

This is a bird!

---

# Adversarial Training

* Maybe we could automate this process?
* Basically, we want to learn how to "fool" a classifier
* But what do we use as our representation and learning objective?
* Ideally, our process would produce new images

---

# Generative Adversarial Networks

* So far we have used Neural Networks to classify images, or to predict some value
* Could we **generate** things with a Neural Network?
* Crazy idea: We pass the Neural Network some random numbers and it produces a new Picasso-like painting

--

* That's exactly what we'll do!

---

# First: Classification

* To produce a Picasso-like painting, we first need to know which paintings *are* Picasso-like
* We could train a Neural Network that detects "real" Picassos (the "Discriminator")
* Input: An image
* Output: "True" Picasso, or "fake"
* So we'll need some real and fake Picassos to start with ...

---

# Art Appreciation

* Real Picassos are easy to come by [citation needed]
* Where do we get our fakes?
* Picasso basically painted randomly, so let's use randomly generated images!
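---
class: medium

# The Discriminator Network

A discriminator along these lines could look like the following sketch (a deliberately small fully-connected network; a real discriminator for paintings would be convolutional, and all sizes are illustrative):

```python
import torch.nn as nn

# Maps a flattened image to a single "probability of being real"
discriminator = nn.Sequential(
    nn.Linear(784, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid()  # 1 = "real" Picasso, 0 = fake
)
```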
---

# Art Connoisseur Network

* After some training, our network will be able to distinguish real and fake Picassos
* This means we can give this network a new painting, and it will tell us whether it is real or not
* Now we can define the task for our generator more clearly: Fool the discriminator network, i.e. generate paintings that the discriminator recognizes as "real" Picassos

---

# A Word on Loss Functions

* How did we train our neural networks?
* We calculated the gradient of the loss function with respect to the model parameters
* We said that we need our loss function to be differentiable
* What else is differentiable? Our discriminator network!

---
class: medium

# The Generator Network

* The Generator Network takes, as we wanted, a vector of random numbers as input, and produces a picture as output
* The **loss function** for this network consists of passing the produced image through the discriminator and checking whether it believes the painting to be real
* We can then use backpropagation and gradient descent, as usual, to update the weights in our generator
* Over time, our generator will learn to fool the discriminator!

---

# Not quite enough ...

* If our discriminator were "perfect", this would already be enough
* However, to start, we needed some "fake" Picassos, which we just generated randomly
* Once the Generator produces some images, we actually have "better fakes"!
* So we can improve the Discriminator with those
* And then we need to improve the Generator again, etc.

---

# Generative Adversarial Networks

* Generative: We **generate** images
* Adversarial: The Generator and the Discriminator play a "game" against each other
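---
class: medium

# Training the Generator

Putting the pieces together, a minimal sketch of one generator update, assuming the `discriminator` from before (sizes and hyperparameters are illustrative):

```python
import torch
import torch.nn as nn

generator = nn.Sequential(  # random vector in, (flattened) image out
    nn.Linear(100, 256), nn.ReLU(),
    nn.Linear(256, 784), nn.Sigmoid()
)
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
bce = nn.BCELoss()

z = torch.randn(64, 100)  # a batch of random input vectors
fakes = generator(z)
# Generator loss: how far the discriminator is from calling the fakes "real"
g_loss = bce(discriminator(fakes), torch.ones(64, 1))
g_opt.zero_grad()
g_loss.backward()
g_opt.step()  # only the generator's weights are updated here
```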
---
class: medium

# The Generative Game

* The Discriminator learns to detect fake images (optimization with gradient descent)
* The Generator learns to produce fake images that look real to the discriminator (optimization with gradient descent)
* The Discriminator learns to detect these new fake images
* The Generator learns to fool the updated discriminator
* ...

---
class: medium

# Stability

* So you run this training for some iterations
* In one iteration, your generator produces 100 images (A), and you train the discriminator to recognize them
* Then the generator learns to produce 100 new images (B) that fool the discriminator
* The discriminator now learns to recognize those
* Then the generator learns to produce the 100 images in (A) **again**, because now those fool the discriminator
* etc.

---
class: medium

# A Replay Buffer

* To avoid such cycles, it can be worthwhile to keep a "repository" of old images
* But if we keep **all** old images around, training will slow down pretty quickly
* Instead, we could have a repository of, say, 200 old images, and select 100 of those at random
* Then we add 100 new images and have a new repository
* We always use all 200 of these images to train the discriminator (some old, some new)
* A short code sketch of this scheme follows after the mode collapse slides

---

# Mode Collapse

* The goal of the generator is to **minimize** the error (= how many images the discriminator recognizes as fake)
* The input of the generator is random noise
* Imagine there is a **perfect** fake image
* The generator could learn to ignore the input and produce only this image

---

# Mode Collapse

* Once the generator produces **only** the perfect image, the loss, and therefore the gradient, will be the same for every image
* In the next iteration, the generator will again produce only **one** image
* The generation process has "collapsed" to a single example
* Generally, we don't want that

---

# Mode Collapse: Randomization

* What can we do? When we get to that point, nothing :(
* To prevent getting there: Introduce more randomness
* Dropout layers: After the activation function, randomly set values to 0 (with a probability p)
* Randomize labels: When training the discriminator, randomly flip some of the labels

---

# Mode Collapse: Diversify Generation

* Another option is to explicitly encourage the generation of different images
* For each set of generated images, calculate the average per-pixel variance
* Use this variance as an additional input for the discriminator
* If the variance is very often 0, the discriminator will learn to use that to identify fake images

---

# GAN Variants

* Generating faces or photos from existing ones
* Additionally providing a class to generate specific pictures
* Generate an image from a textual description
* Apply a style to an existing image ("Style transfer")
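---
class: medium

# A Replay Buffer

Returning to the replay buffer from a few slides ago, here is a minimal sketch (the 100/200 sizes follow the example given there; `random.sample` draws without replacement):

```python
import random

buffer = []  # repository of previously generated images

def update_buffer(new_images, keep=100):
    """Keep a random sample of old images plus the new ones."""
    global buffer
    old = random.sample(buffer, min(keep, len(buffer)))
    buffer = old + list(new_images)  # e.g. 100 old + 100 new = 200 images
    return buffer  # train the discriminator on all of these
```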
---

# Samples

[Large Scale GAN Training for High Fidelity Natural Image Synthesis](https://arxiv.org/abs/1809.11096)

Also check out: [thiscatdoesnotexist.com](https://thiscatdoesnotexist.com/)

---

# Cycle GAN
[Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks](https://junyanz.github.io/CycleGAN/)

---

# Cycle GAN
[Turning Fortnite into PUBG with Deep Learning (CycleGAN)](https://towardsdatascience.com/turning-fortnite-into-pubg-with-deep-learning-cyclegan-2f9d339dcdb0)

---

# GANcraft
[NVidia GANcraft](https://nvlabs.github.io/GANcraft/)

---
class: medium

# References

* [Synthesizing Robust Adversarial Examples](https://arxiv.org/pdf/1707.07397.pdf)
* [Auto-Encoder: What Is It? And What Is It Used For?](https://towardsdatascience.com/auto-encoder-what-is-it-and-what-is-it-used-for-part-1-3e5c6f017726)
* [GAN Introduction](https://machinelearningmastery.com/how-to-develop-a-generative-adversarial-network-for-an-mnist-handwritten-digits-from-scratch-in-keras/)
* [GAN hacks](https://github.com/soumith/ganhacks)
* [GAN Variations](https://developers.google.com/machine-learning/gan/applications)
* [Large Scale GAN Training for High Fidelity Natural Image Synthesis](https://arxiv.org/abs/1809.11096)