class: center, middle

# Creación de Videojuegos

### Advanced AI

---

# Unknown information

* Last time we talked about adversarial planning
* But what if we don't know the state of the world?
* What if there is a large number of possibilities?
* For example: a shuffled deck of cards
* We can't really try all possibilities

---

# Monte Carlo Tree Search

* Idea: just try random moves (and/or sample from the shuffled deck)
* Record the outcomes of these random playouts
* Repeat a large number of times
* Each time, pick moves that did better previously with a higher probability
* At the end, pick the move with the highest expected value

---

# MCTS
*(Figure: one MCTS iteration on a tree of win/visit counts. Selection descends from the root (11/21) to a promising leaf; Expansion adds a new child (0/0); Simulation runs a random playout from it (0/1); Backpropagation updates every node on the path back to the root, e.g. 11/21 becomes 11/22.)*
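---
class: small

# MCTS: Sketch

The four phases can be sketched in code. This is a minimal, hypothetical example: the game (Nim, where players alternate taking 1 or 2 stones and whoever takes the last stone wins) and all constants are invented here just to keep the snippet self-contained; UCB1 is one common way to "pick moves that did better previously with a higher probability".

```python
import math
import random

class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones                    # stones remaining
        self.player = player                    # player to move (0 or 1)
        self.parent = parent
        self.move = move                        # move that led to this node
        self.children = []
        self.untried = [m for m in (1, 2) if m <= stones]
        self.visits = 0
        self.wins = 0     # wins for the player who moved INTO this node

    def ucb1(self, c=1.4):
        # Balance exploitation (win rate) against exploration (rarely visited)
        return (self.wins / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts(stones, iterations=2000):
    root = Node(stones, player=0)
    for _ in range(iterations):
        node = root
        # Selection: descend via UCB1 while the node is fully expanded
        while not node.untried and node.children:
            node = max(node.children, key=Node.ucb1)
        # Expansion: add one child for a random untried move
        if node.untried:
            m = node.untried.pop(random.randrange(len(node.untried)))
            node.children.append(Node(node.stones - m, 1 - node.player,
                                      parent=node, move=m))
            node = node.children[-1]
        # Simulation: random playout from the new node
        left, player = node.stones, node.player
        winner = 1 - player if left == 0 else None
        while winner is None:
            take = random.choice([x for x in (1, 2) if x <= left])
            left -= take
            if left == 0:
                winner = player               # took the last stone
            player = 1 - player
        # Backpropagation: update win/visit counts along the path
        while node is not None:
            node.visits += 1
            if node.parent is not None and node.parent.player == winner:
                node.wins += 1
            node = node.parent
    # Finally, pick the most-visited move at the root
    return max(root.children, key=lambda c: c.visits).move

random.seed(1)
best = mcts(4)   # from 4 stones, taking 1 leaves the opponent a losing 3
```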
---
class: center, middle

# Machine Learning

---
class: small

# Machine Learning

* The idea behind Machine Learning is to give some data/experiences to an algorithm, which then derives how to solve the problem on other, unseen instances of the same kind of data
* Supervised Learning: the algorithm gets some inputs and outputs, and learns the relationship between them
* Unsupervised Learning: the algorithm gets only inputs, and learns information about their structure
* Reinforcement Learning: the algorithm is given a world, some actions, and a reward function, and learns how to obtain rewards in that world

---
class: small

# Optimization

* The basis for many Machine Learning algorithms is optimization
* Optimization: given a function `\(f\)`, find the value `\(\vec{x}\)` such that `\(f(\vec{x})\)` is minimal (or maximal)
* For example: in a strategy game there are many units and buildings to build. Which order and combination of these units/buildings gives you the strongest army?
* Often, the function cannot be (cheaply) drawn, or calculated at every point, to find a minimum

---

# Optimization: Gradient Descent

* Remember the derivative of a function?
* The derivative gives you the "change" of the function
* Idea: if we want the minimum, we move in the direction in which the function values decrease
* Where do we start? Let's just pick a point at random

---

# Optimization: Gradient Descent
---

# Optimization: Gradient Descent
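---
class: small

# Gradient Descent: Sketch

The idea fits in a few lines. This is a minimal sketch on a made-up 1-D example; the function, starting point, and learning rate are arbitrary choices for the illustration.

```python
# Repeatedly step against the gradient until (hopefully) we reach a minimum
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)    # move in the downhill direction
    return x

# f(x) = (x - 3)^2 has its minimum at x = 3; its derivative is 2 (x - 3)
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=-5.0)
print(x_min)   # close to 3.0
```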
---

# Optimization: Gradient Descent

Gradient Descent has many "problems":

* Local minima
* Overshooting
* Slow movement on plateaus
* etc.

---
class: small

# Function approximation

* What if we want to replicate something else?
* For example: the player drives car races, and we want an AI player with a similar driving style ("Drivatar")
* Let's say our player is a function `\(p(\vec{y})\)`
* We want a function `\(\hat{p}(\vec{y})\)` that is "a good approximation" of `\(p(\vec{y})\)`
* Idea: let's minimize the difference `\(|\hat{p}(\vec{y}) - p(\vec{y})|\)`
* But ... we don't actually know `\(p(\vec{y})\)`!

---
class: small

# Training examples

* Say we have observed *some* values of `\(p\)`
* That means we have a list `\(p(\vec{y_1}) = z_1, p(\vec{y_2}) = z_2, p(\vec{y_3}) = z_3, \ldots\)`
* We "just" want to know the values of `\(p\)` between these observations, e.g. `\(p(\vec{y_?}) = ?\)`
* Easiest solution: linear interpolation!
* But what if the function is non-linear?

---
class: small

# Function approximation

* Imagine we use a quadratic function `\(\hat{p}(y) = a y^2 + b\)`
* We can choose `\(a\)` and `\(b\)` to change the values of the function
* We can add cubic terms, etc.: `\(\hat{p}(y) = \sum_i a_i y^i\)`
* Then we can use gradient descent *on the coefficients* `\(a_i\)` to find a version of this function that minimizes the error
* But: our input is actually a vector, and we might want to combine its components "somehow", too

---
class: small

# Neural Networks

* Instead of polynomials, we can also compose our function from other units
* For example, somewhere inside our function we have a term like `\( h(\sum_i a_i y_i) \)`
* These terms correspond to "neurons", and `\(h\)` is the neuron's activation function
* If `\(h\)` has a derivative, we can calculate the gradient of the entire network
* And if we can calculate the gradient, we can do gradient descent!

---

# Neural Networks
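---
class: small

# Neural Networks: A Single Neuron

A single neuron is just the term `\( h(\sum_i a_i y_i) \)` from the previous slide. A minimal sketch, with the sigmoid as one common choice for `\(h\)` and made-up weights and inputs:

```python
import math

def sigmoid(x):
    # A common activation function h; smooth, so it has a derivative everywhere
    return 1.0 / (1.0 + math.exp(-x))

def neuron(weights, inputs):
    # h(sum_i a_i * y_i): weighted sum passed through the activation function
    return sigmoid(sum(a * y for a, y in zip(weights, inputs)))

out = neuron([0.5, -0.25], [1.0, 2.0])   # sigmoid(0.5 - 0.5) = 0.5
```

A network is many of these neurons, composed in layers; because `sigmoid` is differentiable, the whole composition is too.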
---

# Neural Networks in Games

* Neural Networks can be useful if you want to approximate an unknown function
* You can also use them e.g. to "blend together" levels (using the existing levels as the target function)
* Training might be computationally expensive, though
* There is also the issue of control

---

# Clustering

* What if you have your players, and you just want to know whether there are certain classes of playstyles?
* In other words, you want to group together players that play similarly
* This is called *clustering*, because it detects clusters of similar inputs
* One approach: choose k cluster centers, assign each element to the closest center, then update the cluster centers, and iterate

---

# K-Means
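---
class: small

# K-Means: Sketch

The assign/update loop is short enough to sketch directly. A minimal version on 1-D points (think: one playstyle statistic per player); the data and `k = 2` are made up for the illustration.

```python
import random

def kmeans(points, k, iterations=20):
    centers = random.sample(points, k)            # random initial centers
    for _ in range(iterations):
        # Assignment step: each point joins its closest center
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

random.seed(0)
data = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
found = kmeans(data, 2)   # two clear clusters, around 1.0 and around 9.5
```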
---

# Reinforcement Learning
*(Figure: the reinforcement-learning loop. The Agent sends an Action to the Environment; an Interpreter observes the Environment and returns a State and a Reward to the Agent. Maze illustration: "A Simple Maze Puzzle" by Adam Stanislav.)*
---

# Reinforcement Learning

* There is a world/task, for example a game
* The agent performs some actions
* The observer gives the agent a reward (e.g. score) for its performance
* The agent tries to maximize the reward

---

# Q-Learning

* The agent stores the expected reward for each action in each state in a table
* When the agent takes a step, it records the expected reward of the following state
* For example: if an agent tries to jump, it records which reward it expects to get in the future, after jumping
* A discount factor makes rewards obtained sooner more valuable than rewards obtained later
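---
class: small

# Q-Learning: Sketch

The table, the update, and the discount factor can be sketched on a tiny made-up world: a 1-D corridor of states 0 to 4 where reaching state 4 gives reward 1 and ends the episode. The world and all hyperparameters are invented for this illustration.

```python
import random

N_STATES = 5
ACTIONS = [-1, +1]                       # left, right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2    # learning rate, discount, exploration

# The Q-table: expected future reward for each (state, action) pair
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

random.seed(0)
for _ in range(500):                     # training episodes
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy: mostly exploit the table, sometimes explore
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a: Q[(s, a)])
        s2, r = step(s, a)
        # Update toward reward plus the *discounted* best future value
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# After training, moving right has the higher Q-value in every state
```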
---
class: small

# References

* [All You Need to Know about Gradient Descent](https://hackernoon.com/gradient-descent-aynk-7cbe95a778da)
* [Drivatars in Forza Motorsport](https://www.youtube.com/watch?v=twI0RSVwnR0)
* [Artificial Intelligence and Games](http://gameaibook.org/) by Georgios N. Yannakakis and Julian Togelius, chapters 2.5-2.7. Free PDF available on the website!
* [Experimental AI in Games Workshop](http://www.exag.org/)