
AI in Digital Entertainment

Optimization

1 / 46

Artificial Intelligence

Remember: Everything in AI is either representation or search

  • Behavior Trees, State Machines, etc.: Representation

  • Planning: Search (Plan-Space planning uses a different representation)

  • Belief modeling: Representation improvements

  • Monte Carlo Tree Search: Search

2 / 46

Optimization

  • Often in search we want to find something called a "solution"

  • However, what if we have multiple possible solutions, with different levels of quality?

  • We want to find the "best" solution among all possible ones

  • In other words, instead of constructing partial solutions until we find a complete one, we look at the set of complete solutions and try to find the best among them

  • For example: Doing the most damage, earning the most money, getting the highest score, driving the race in the lowest possible time, etc.

3 / 46

The Optimization Problem

Given a function (usually called "fitness function"):

$$f : \mathbb{R}^n \to \mathbb{R}$$

find the $x \in \mathbb{R}^n$ for which $f(x)$ is minimal (or maximal).

4 / 46

The Optimization Problem

Given a function (usually called "fitness function"):

$$f : \mathbb{R}^n \to \mathbb{R}$$

find the $x \in \mathbb{R}^n$ for which $f(x)$ is minimal (or maximal).

First idea:

$$\frac{d}{dx} f(x) = 0$$
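
For example, for $f(x) = (x-3)^2$ we get $\frac{d}{dx}f(x) = 2(x-3) = 0$, so $x = 3$; since this particular function is convex, that is also its global minimum.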

5 / 46

Challenges

  • Setting the derivative to zero only finds a stationary point (a local minimum, maximum, or saddle point), but not necessarily the global minimum/maximum

  • The dimensionality of the vector may be really high, giving us many options to consider

  • Who says the derivative can even be calculated?

  • We haven't even specified what our vectors represent, and what f looks like

6 / 46

Minimization vs. Maximization

  • For the rest of this talk, we will assume that we want to minimize our functions

  • For some values, such as score, this does not really make sense, of course

  • We can turn a maximization problem into a minimization one by simply negating the value (or subtracting it from the theoretical maximum)

  • For example: If the maximum possible score is 100000, to maximize the score we want to minimize the distance to that value
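
A minimal sketch of this conversion in Python, where `score` is a made-up stand-in for a real game's scoring function:

```python
MAX_SCORE = 100_000

def score(x):
    # Hypothetical stand-in for a real game's scoring function.
    return MAX_SCORE - (x - 3.0) ** 2

def fitness(x):
    # Minimizing this is the same as maximizing score(x);
    # simply negating score(x) would work just as well.
    return MAX_SCORE - score(x)
```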

7 / 46

Representation

8 / 46

Representation

  • We said our function takes vectors of real numbers as input, but what do these vectors represent?

  • They can be geometrical, such as the angle and force used to shoot a bird in Angry Birds

  • It could be an assignment of values to resources, such as how many of each unit to build in Starcraft

  • Another possibility is to view the vector as a sequence of actions

9 / 46

Geometric Interpretation

10 / 46

Differentiability

To be differentiable, a function needs to be continuous:

$$\forall \varepsilon > 0 \; \exists \delta > 0 \; \forall a : \left( |a - x| < \delta \Rightarrow |f(a) - f(x)| < \varepsilon \right)$$

Or, in words:

If you change the input a little bit, the output also only changes a little bit.

11 / 46

Differentiability?

Is the function that maps from angle and strength to score differentiable?

12 / 46

Gradient Descent

13 / 46

Gradient Descent

  • For now, let's say our function is differentiable

  • We know that the derivative is zero at the extrema

  • We also know that it defines the slope

  • So if we start at any point, we can just go "downhill"

14 / 46

Gradient Descent

  • Start with an initial guess x0

  • Repeat until "convergence":

$$x_{i+1} = x_i - \alpha \frac{d}{dx} f(x_i)$$

  • α is the learning rate
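
A minimal sketch of this loop in Python, assuming we can supply the derivative ourselves (the quadratic in the example is a stand-in fitness function):

```python
def gradient_descent(df, x0, alpha=0.1, steps=1000, tol=1e-8):
    """Minimize a one-dimensional function, given its derivative df."""
    x = x0
    for _ in range(steps):
        step = alpha * df(x)
        x -= step                  # walk downhill along the slope
        if abs(step) < tol:        # "convergence": updates have become tiny
            break
    return x

# Example: f(x) = (x - 3)^2 has derivative 2(x - 3) and its minimum at x = 3
print(gradient_descent(lambda x: 2 * (x - 3.0), x0=0.0))
```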
15 / 46

Gradient Descent

16 / 46

Gradient Descent

17 / 46

Gradient Descent: Some improvements

  • Scale the learning rate over time to move quickly at first, and slow down as a solution is found

  • Start at multiple different randomly chosen starting locations to avoid being stuck in a local minimum

  • Add momentum to the updates, so the search can roll out of small local minima

  • Add a small random number to the gradient to explore other parts of the function (see the sketch below)
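
A sketch combining three of these ideas: a decaying learning rate, momentum, and a little gradient noise. All the constants here are illustrative choices, not prescribed values:

```python
import random

def noisy_momentum_descent(df, x0, alpha=0.5, decay=0.99,
                           beta=0.9, noise=0.01, steps=2000):
    x, v = x0, 0.0
    for _ in range(steps):
        g = df(x) + random.gauss(0.0, noise)  # small noise: explore a little
        v = beta * v + alpha * g              # momentum carries us past bumps
        x -= v
        alpha *= decay                        # fast at first, careful later
    return x

print(noisy_momentum_descent(lambda x: 2 * (x - 3.0), x0=0.0))
```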

18 / 46

Evolutionary Algorithms

19 / 46

Evolutionary Algorithms

  • What if our function is not differentiable?

  • Or our vectors don't have a good geometric interpretation, but are more just "collections of numbers"

  • Let's look to nature: Genes and evolution

  • Each individual is made up of genes

  • New individuals inherit a mixture of genes from their parents, strongly preferring "better" genes

  • "Survival of the Fittest"

20 / 46

Evolutionary Algorithms

  • We can interpret a vector as the "genes" of an individual

  • The fitness function then tells us how "good" these genes are

  • By modifying the genes we can create new individuals, to find the "best" genes as measured by our fitness function

21 / 46

Genetic Algorithm

22 / 46

Genetic Algorithm

  • We generate a (random) set of individuals as a population

  • Every step we create some new individuals as combinations and/or mutations of existing ones

  • Then we keep only the "most fit" individuals, i.e. the ones for which our scoring function returns the lowest values

  • Repeat for "a number of steps"
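
A minimal sketch of this loop for real-valued gene vectors; the population size, mutation strength, and initial range are arbitrary choices:

```python
import random

def evolve(fitness, dim, pop_size=50, generations=200):
    # Initial population: random gene vectors
    pop = [[random.uniform(-10, 10) for _ in range(dim)]
           for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            a, b = random.sample(pop, 2)
            child = [random.choice(g) for g in zip(a, b)]   # uniform crossover
            i = random.randrange(dim)
            child[i] += random.gauss(0.0, 0.1)              # point mutation
            children.append(child)
        # Keep only the most fit individuals (lowest fitness values)
        pop = sorted(pop + children, key=fitness)[:pop_size]
    return pop[0]

# Example: find the vector closest to (1, 2, 3)
print(evolve(lambda g: sum((x - t) ** 2 for x, t in zip(g, (1, 2, 3))), 3))
```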

23 / 46

Population

  • Our population is a set of vectors

  • Each of these vectors is associated with its fitness

  • We limit our population to a certain size by only keeping the n "best" individuals

24 / 46

Combination and Mutation

  • Two individuals can be combined by selecting elements from each of their two vectors (or averaging, adding, etc.), called "crossover"

  • An individual can also be changed by modifying single values in its vector, called "mutation"

  • If we view our vector elements as containing a binary representation of numbers, we could also use binary operations such as an xor to combine two values
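
Minimal sketches of one-point crossover and single-value mutation on plain Python lists; the cut point and mutation strength are arbitrary choices:

```python
import random

def crossover(a, b):
    # One-point crossover: take a prefix of a and a suffix of b.
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(genes, strength=0.1):
    # Perturb one randomly chosen element.
    child = list(genes)
    i = random.randrange(len(child))
    child[i] += random.gauss(0.0, strength)
    return child
```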

25 / 46

Why does this work?

  • By keeping only the best individuals from one iteration to the next we guarantee that the solution will never get worse over time

  • Using combinations of two individuals basically tries to combine the advantages of the two, while removing the drawbacks

  • By having a number of different individuals we avoid getting stuck in local minima

26 / 46

Repair

  • One problem is that in many domains not every possible vector is also feasible

  • For example: If our vector represents actions, jumping while already in the air might not be possible

  • Repair is the mechanism by which an infeasible individual is converted into a feasible one

  • How this repair works is domain dependent
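
As a sketch of the action-sequence example, assume a hypothetical action vocabulary where JUMP is only feasible on the ground and WAIT always ends with the character landed:

```python
def repair(actions, starts_on_ground=True):
    # Replace infeasible mid-air jumps with a neutral WAIT action.
    # JUMP/WAIT and the landing rule are illustrative assumptions.
    on_ground = starts_on_ground
    fixed = []
    for a in actions:
        if a == "JUMP" and not on_ground:
            a = "WAIT"            # infeasible: already airborne
        if a == "JUMP":
            on_ground = False
        elif a == "WAIT":
            on_ground = True      # assume we land again while waiting
        fixed.append(a)
    return fixed

print(repair(["JUMP", "JUMP", "WAIT", "JUMP"]))
# ['JUMP', 'WAIT', 'WAIT', 'JUMP']
```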

27 / 46

Swarm-Based Optimization

28 / 46

Swarm-Based Optimization

  • Another approach is to keep a set of individuals and "move" them around in space

  • The individuals can use information about the other individuals, or the "swarm" as a whole, to determine where to go

  • There are several approaches to how this information exchange works

  • Typically the individuals are initially spread out over the search space and then follow the path to the best known values

29 / 46

Particle-Swarm Optimization

  • In Particle Swarm Optimization, each individual is called a particle, and has a position and velocity

  • Every step, the position is updated with the velocity, and the velocity is changed depending on (using semi-random ratios):

    • The best position found by the swarm as a whole
    • The best position found by the particle itself
  • If the best position found by the swarm as a whole is far away from the particle in question, it will move more quickly
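
A minimal one-dimensional sketch of this update rule; the inertia (0.7) and pull strengths (1.5) are conventional defaults rather than values from the slides:

```python
import random

def pso(fitness, n_particles=30, steps=200, lo=-10.0, hi=10.0):
    xs = [random.uniform(lo, hi) for _ in range(n_particles)]
    vs = [0.0] * n_particles
    best = list(xs)                     # best position each particle has seen
    g = min(best, key=fitness)          # best position the swarm has seen
    for _ in range(steps):
        for i in range(n_particles):
            r1, r2 = random.random(), random.random()  # semi-random ratios
            vs[i] = (0.7 * vs[i]
                     + 1.5 * r1 * (best[i] - xs[i])    # pull toward own best
                     + 1.5 * r2 * (g - xs[i]))         # pull toward swarm best
            xs[i] += vs[i]
            if fitness(xs[i]) < fitness(best[i]):
                best[i] = xs[i]
                if fitness(best[i]) < fitness(g):
                    g = best[i]
    return g

print(pso(lambda x: (x - 3.0) ** 2))   # should print a value near 3
```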

30 / 46

Particle-Swarm Optimization

31 / 46

Ant-Colony Optimization

  • When ants look for food, they start by walking around randomly

  • Wherever they walk, they leave pheromones behind

  • If an ant encounters a pheromone trail by another ant, it has a chance to follow it

  • Pheromones evaporate over time, meaning that shorter paths will have a higher concentration of pheromones, because they are traversed faster

32 / 46

Ant-Colony Optimization

33 / 46

Ant-Colony Optimization

  • Often described as operating on graphs, with pheromone trails assigned to the edges

  • However, it can also be used for general optimization problems

  • Generate a set of ants randomly placed in the search space

  • Move ants around and update their trails with the best fitness value they encounter

  • If an ant encounters another pheromone trail, it has a chance of following that trail, depending on the strength of the pheromone trail
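
A sketch of the classic graph formulation (often called Ant System), here building short paths through a small distance matrix; the evaporation rate and deposit constant are illustrative:

```python
import random

def ant_colony(dist, n_ants=20, steps=100, rho=0.5, q=1.0):
    """Find a short path visiting every node; dist[i][j] > 0 for i != j."""
    n = len(dist)
    tau = [[1.0] * n for _ in range(n)]       # pheromone level on each edge
    best_tour, best_len = None, float("inf")
    for _ in range(steps):
        tours = []
        for _ in range(n_ants):
            tour = [random.randrange(n)]
            while len(tour) < n:
                i = tour[-1]
                choices = [j for j in range(n) if j not in tour]
                # Stronger trails (and shorter edges) are followed more often
                weights = [tau[i][j] / dist[i][j] for j in choices]
                tour.append(random.choices(choices, weights)[0])
            length = sum(dist[tour[k]][tour[k + 1]] for k in range(n - 1))
            tours.append((tour, length))
            if length < best_len:
                best_tour, best_len = tour, length
        # Evaporate, then deposit: shorter paths end up with more pheromone
        tau = [[(1 - rho) * t for t in row] for row in tau]
        for tour, length in tours:
            for k in range(n - 1):
                i, j = tour[k], tour[k + 1]
                tau[i][j] += q / length
                tau[j][i] += q / length
    return best_tour, best_len

d = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [10, 4, 3, 0]]
print(ant_colony(d))
```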

34 / 46

Function Approximation

35 / 46

Function Approximation

  • For optimization we need a search space of vectors and a fitness function

  • However, calculating the fitness for arbitrary input vectors may be expensive or even impossible

  • For example: We may be able to calculate the strength of a collection of troops in Starcraft, but determining its expected win rate requires extensive simulations

  • Idea: Estimate the fitness values

  • If we have fitness values for some inputs, we can calculate a guess for the intermediate values

36 / 46

Interpolation

  • Say we know the win rate for having 40 Space Marines and 10 Fire Bats is 80%, and the win rate for having 10 Space Marines and 40 Fire Bats is 20%

  • What is the win rate for 25 Space Marines and 25 Fire Bats?

  • We can guess 50%, using Linear Interpolation

  • Our fitness functions are not usually linear, though
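
The 50% guess above is just linear interpolation between the two known mixes; as a one-liner:

```python
def lerp(f_a, f_b, t):
    # t = 0 returns the first known value, t = 1 the second.
    return (1 - t) * f_a + t * f_b

print(lerp(0.8, 0.2, 0.5))  # (25,25) lies halfway between the mixes -> 0.5
```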

37 / 46

Interpolation

  • It turns out that the win rate for 25 Space Marines and 25 Fire Bats is actually 90%

  • Instead of trying to explain this linearly, we can use some polynomial

  • For n observations we would need a polynomial of degree n-1

  • Better idea: Estimate each point as a linear mixture of the observed values, weighted by some non-linear function of the distance to those observations

  • Radial Basis Functions (RBF)!

38 / 46

Radial Basis Function Example

  • f(40,10) = 0.8, f(10,40) = 0.2, f(25,25) = 0.9

  • We use a Gauss function as the RBF

  • To determine an estimate for f(20,30) we calculate

$$f(20,30) \approx f(40,10)\,\varphi(\lVert (40,10)-(20,30) \rVert) + f(10,40)\,\varphi(\lVert (10,40)-(20,30) \rVert) + f(25,25)\,\varphi(\lVert (25,25)-(20,30) \rVert)$$
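
A sketch of this sum in Python, using the Gaussian $\varphi(r) = e^{-(\varepsilon r)^2}$; the width $\varepsilon$ is the "flatness" parameter mentioned on the next slide, and its value here is an arbitrary choice:

```python
import math

samples = {(40, 10): 0.8, (10, 40): 0.2, (25, 25): 0.9}

def phi(r, eps=0.05):
    # Gaussian RBF: influence falls off smoothly with distance.
    return math.exp(-(eps * r) ** 2)

def estimate(p):
    # Weight each observed value by the RBF of its distance to p.
    return sum(v * phi(math.dist(p, q)) for q, v in samples.items())

print(estimate((20, 30)))
```

Note that this directly weights the observed values, as on the slide; a full RBF interpolation would instead solve a small linear system for the weights so that the estimate reproduces the known samples exactly.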

39 / 46

Function Approximation with Neural Networks

  • Radial Basis Functions are neat, because they work even with a large number of known values, and don't require the known information to be laid out in any particular form

  • However, the parameter of the function that determines its "flatness" needs tuning

  • Next week we will talk about Neural Networks, which provide an alternative way to approximate functions, where these parameters are learned from input data

40 / 46

Multi-Objective Optimization and Pareto Optimal Solutions

41 / 46

Multi-Objective Optimization

  • Sometimes we don't want to just optimize one value, but multiple

  • Or the optimization of one value comes at the expense of another

  • For example: Building units in a strategy game that do the most damage

  • The most expensive units do the most damage

  • That does not mean that building the most expensive units is the best strategy. There is a trade-off between cost and army strength

42 / 46

Multi-Objective Optimization: Approaches

  • Maybe we can try to optimize the (weighted) sum of both values?

  • Or how about a ratio? Let's optimize the army strength per resource spent

  • Both of these may run into degenerate solutions. For example, spending 0 resources results in infinite strength per resource

  • Better idea: Let the AI agent decide about the trade-off

43 / 46

Pareto-Optimal Solutions

  • Say we have two values, x and y, to maximize

  • If we have a solution with fitness (x,y), any other solution whose fitness is lower in x and also lower in y is an objectively worse solution. The first solution is said to dominate the other

  • However, solutions for which only x is higher or only y is higher can be considered of the "same" utility: neither dominates the other

  • The result will be a frontier of fitness values
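
A minimal sketch of extracting that frontier from a list of (x, y) fitness pairs, with both values to be maximized:

```python
def dominates(a, b):
    # a dominates b if it is at least as good in both values
    # and not identical (i.e. strictly better somewhere).
    return a[0] >= b[0] and a[1] >= b[1] and a != b

def pareto_front(solutions):
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]

points = [(1, 5), (2, 4), (3, 3), (2, 2), (4, 1)]
print(pareto_front(points))  # [(1, 5), (2, 4), (3, 3), (4, 1)]
```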

44 / 46

Pareto-Optimal Solutions

(Figure: Pareto frontier in the x/y plane with solutions A, B, and C; x(A) < x(B) but y(A) > y(B), so neither A nor B dominates the other.)
45 / 46
