class: center, middle

# Creación de Videojuegos

### Advanced AI

---

# Unknown information

* Last time we talked about adversarial planning
* But what if we don't know the state of the world?
* What if there is a large number of possibilities?
* For example: a shuffled deck of cards
* We can't really try all possibilities

---

# Monte Carlo Tree Search

* Idea: just try random moves (and/or sample from the shuffled deck)
* Record the outcomes of these random playouts
* Repeat a large number of times
* Each time, pick moves that did better previously with a higher probability
* At the end, pick the move with the highest expected value

---

# MCTS
*(Figure: one MCTS iteration on a tree of win/visit counts. Selection descends from the root (11/21) to a promising leaf; Expansion adds a new child (0/0); Simulation runs a random playout from it (0/1); Backpropagation updates every node on the path back to the root, e.g. 11/21 becomes 11/22.)*
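---
class: small

# MCTS: Sketch

The four phases can be sketched in code. This is a minimal, hypothetical example: the game (Nim, where players alternate taking 1 or 2 stones and whoever takes the last stone wins) and all constants are invented here just to keep the snippet self-contained; UCB1 is one common way to "pick moves that did better previously with a higher probability".

```python
import math
import random

class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones                    # stones remaining
        self.player = player                    # player to move (0 or 1)
        self.parent = parent
        self.move = move                        # move that led to this node
        self.children = []
        self.untried = [m for m in (1, 2) if m <= stones]
        self.visits = 0
        self.wins = 0     # wins for the player who moved INTO this node

    def ucb1(self, c=1.4):
        # Balance exploitation (win rate) against exploration (rarely visited)
        return (self.wins / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts(stones, iterations=2000):
    root = Node(stones, player=0)
    for _ in range(iterations):
        node = root
        # Selection: descend via UCB1 while the node is fully expanded
        while not node.untried and node.children:
            node = max(node.children, key=Node.ucb1)
        # Expansion: add one child for a random untried move
        if node.untried:
            m = node.untried.pop(random.randrange(len(node.untried)))
            node.children.append(Node(node.stones - m, 1 - node.player,
                                      parent=node, move=m))
            node = node.children[-1]
        # Simulation: random playout from the new node
        left, player = node.stones, node.player
        winner = 1 - player if left == 0 else None
        while winner is None:
            take = random.choice([x for x in (1, 2) if x <= left])
            left -= take
            if left == 0:
                winner = player               # took the last stone
            player = 1 - player
        # Backpropagation: update win/visit counts along the path
        while node is not None:
            node.visits += 1
            if node.parent is not None and node.parent.player == winner:
                node.wins += 1
            node = node.parent
    # Finally, pick the most-visited move at the root
    return max(root.children, key=lambda c: c.visits).move

random.seed(1)
best = mcts(4)   # from 4 stones, taking 1 leaves the opponent a losing 3
```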
---
class: center, middle

# Machine Learning

---
class: small

# Machine Learning

* The idea behind Machine Learning is to give some data/experiences to an algorithm, which then derives how to solve the problem on other, unseen instances of the same kind of data
* Supervised Learning: the algorithm gets some inputs and outputs, and learns the relationship between them
* Unsupervised Learning: the algorithm gets only inputs, and learns information about their structure
* Reinforcement Learning: the algorithm is given a world, some actions, and a reward function, and learns how to obtain rewards in that world

---
class: small

# Optimization

* The basis for many Machine Learning algorithms is optimization
* Optimization: given a function `\(f\)`, find the value `\(\vec{x}\)` such that `\(f(\vec{x})\)` is minimal (or maximal)
* For example: in a strategy game there are many units and buildings to build. Which order and combination of these units/buildings gives you the strongest army?
* Often, the function cannot be (cheaply) drawn, or calculated at every point, to find a minimum

---

# Optimization: Gradient Descent

* Remember the derivative of a function?
* The derivative gives you the "change" of the function
* Idea: if we want the minimum, we move in the direction in which the function values decrease
* Where do we start? Let's just pick a point at random

---

# Optimization: Gradient Descent
---

# Optimization: Gradient Descent
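---
class: small

# Gradient Descent: Sketch

The idea fits in a few lines. This is a minimal sketch on a made-up 1-D example; the function, starting point, and learning rate are arbitrary choices for the illustration.

```python
# Repeatedly step against the gradient until (hopefully) we reach a minimum
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)    # move in the downhill direction
    return x

# f(x) = (x - 3)^2 has its minimum at x = 3; its derivative is 2 (x - 3)
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=-5.0)
print(x_min)   # close to 3.0
```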
---

# Optimization: Gradient Descent

Gradient Descent has many "problems":

* Local minima
* Overshooting
* Slow movement on plateaus
* etc.

---
class: small

# Function approximation

* What if we want to replicate something else?
* For example: the player drives car races, and we want an AI player with a similar driving style ("Drivatar")
* Let's say our player is a function `\(p(\vec{y})\)`
* We want a function `\(\hat{p}(\vec{y})\)` that is "a good approximation" of `\(p(\vec{y})\)`
* Idea: let's minimize the difference `\(|\hat{p}(\vec{y}) - p(\vec{y})|\)`
* But ... we don't actually know `\(p(\vec{y})\)`!

---
class: small

# Training examples

* Say we have observed *some* values of `\(p\)`
* That means we have a list `\(p(\vec{y_1}) = z_1, p(\vec{y_2}) = z_2, p(\vec{y_3}) = z_3, \ldots\)`
* We "just" want to know the values of `\(p\)` between these observations, e.g. `\(p(\vec{y_?}) = ?\)`
* Easiest solution: linear interpolation!
* But what if the function is non-linear?

---
class: small

# Function approximation

* Imagine we use a quadratic function `\(\hat{p}(y) = a y^2 + b\)`
* We can choose `\(a\)` and `\(b\)` to change the values of the function
* We can add cubic terms, etc.: `\(\hat{p}(y) = \sum_i a_i y^i\)`
* Then we can use gradient descent *on the coefficients* `\(a_i\)` to find a version of this function that minimizes the error
* But: our input is actually a vector, and we might want to combine its components "somehow", too

---
class: small

# Neural Networks

* Instead of polynomials, we can also compose our function from other units
* For example, somewhere inside our function we have a term like `\( h(\sum_i a_i y_i) \)`
* These terms correspond to "neurons", and `\(h\)` is the neuron's activation function
* If `\(h\)` has a derivative, we can calculate the gradient of the entire network
* And if we can calculate the gradient, we can do gradient descent!

---

# Neural Networks
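---
class: small

# Neural Networks: A Single Neuron

A single neuron is just the term `\( h(\sum_i a_i y_i) \)` from the previous slide. A minimal sketch, with the sigmoid as one common choice for `\(h\)` and made-up weights and inputs:

```python
import math

def sigmoid(x):
    # A common activation function h; smooth, so it has a derivative everywhere
    return 1.0 / (1.0 + math.exp(-x))

def neuron(weights, inputs):
    # h(sum_i a_i * y_i): weighted sum passed through the activation function
    return sigmoid(sum(a * y for a, y in zip(weights, inputs)))

out = neuron([0.5, -0.25], [1.0, 2.0])   # sigmoid(0.5 - 0.5) = 0.5
```

A network is many of these neurons, composed in layers; because `sigmoid` is differentiable, the whole composition is too.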
---

# Neural Networks in Games

* Neural Networks can be useful if you want to approximate an unknown function
* You can also use them e.g. to "blend together" levels (using the existing levels as the target function)
* Training might be computationally expensive, though
* There is also the issue of control

---

# Clustering

* What if you have your players, and you just want to know whether there are certain classes of playstyles?
* In other words, you want to group together players that play similarly
* This is called *clustering*, because it detects clusters of similar inputs
* One approach: choose k cluster centers, assign each element to the closest center, then update the cluster centers, and iterate

---

# K-Means
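---
class: small

# K-Means: Sketch

The assign/update loop is short enough to sketch directly. A minimal version on 1-D points (think: one playstyle statistic per player); the data and `k = 2` are made up for the illustration.

```python
import random

def kmeans(points, k, iterations=20):
    centers = random.sample(points, k)            # random initial centers
    for _ in range(iterations):
        # Assignment step: each point joins its closest center
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

random.seed(0)
data = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
found = kmeans(data, 2)   # two clear clusters, around 1.0 and around 9.5
```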
---

# Reinforcement Learning
*(Figure: the reinforcement-learning loop. The Agent sends an Action to the Environment; an Interpreter observes the Environment and returns a State and a Reward to the Agent. Maze illustration: "A Simple Maze Puzzle" by Adam Stanislav.)*
---

# Reinforcement Learning

* There is a world/task, for example a game
* The agent performs some actions
* The observer gives the agent a reward (e.g. score) for its performance
* The agent tries to maximize the reward

---

# Q-Learning

* The agent stores the expected reward for each action in each state in a table
* When the agent takes a step, it records the expected reward of the following state
* For example: if an agent tries to jump, it records which reward it expects to get in the future, after jumping
* A discount factor makes rewards obtained sooner more valuable than rewards obtained later
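---
class: small

# Q-Learning: Sketch

The table, the update, and the discount factor can be sketched on a tiny made-up world: a 1-D corridor of states 0 to 4 where reaching state 4 gives reward 1 and ends the episode. The world and all hyperparameters are invented for this illustration.

```python
import random

N_STATES = 5
ACTIONS = [-1, +1]                       # left, right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2    # learning rate, discount, exploration

# The Q-table: expected future reward for each (state, action) pair
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

random.seed(0)
for _ in range(500):                     # training episodes
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy: mostly exploit the table, sometimes explore
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a: Q[(s, a)])
        s2, r = step(s, a)
        # Update toward reward plus the *discounted* best future value
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# After training, moving right has the higher Q-value in every state
```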
---
class: small

# References

* [All You Need to Know about Gradient Descent](https://hackernoon.com/gradient-descent-aynk-7cbe95a778da)
* [Drivatars in Forza Motorsport](https://www.youtube.com/watch?v=twI0RSVwnR0)
* [Artificial Intelligence and Games](http://gameaibook.org/) by Georgios N. Yannakakis and Julian Togelius, chapters 2.5-2.7. Free PDF available on the website!
* [Experimental AI in Games Workshop](http://www.exag.org/)