
Artificial Intelligence: Planning

Planning As Pathfinding

1 / 48

Review: The Planning Problem

A planning problem consists of three parts:

  • A definition of the current state of the world

  • A definition of a desired state of the world

  • A definition of the actions the agent can take

2 / 48

Review: PDDL Domains and Problems

We can define a planning problem in PDDL domain and problem files:

  • The domain file defines which actions (operators) exist, and (optionally) the types of all objects

  • The problem file defines the initial state of the world (and any additional objects that may be required), and the goal condition

  • A planner then reads these two files and outputs a plan

Today we'll talk about that last part

3 / 48

Pathfinding

Recall the pathfinding problem from the second lecture

  • We are given a graph, consisting of nodes and edges

  • In our implicit graph representation, we could ask one node to generate its neighbors

  • We would start at the start node, and generate neighbors "intelligently" until we reached a goal node

4 / 48

Planning As Pathfinding

To use a pathfinding algorithm for planning, we need to formulate the planning problem as a graph. Let's start with the "obvious" choice:

  • Each state is a logical state

  • Each edge corresponds to an action

  • In any state, we can take any action whose preconditions are satisfied to generate a new state

5 / 48

The State-Space

  • When we start at the node corresponding to the start state and start expanding actions, we generate the so-called State-Space

  • We can then use a standard A* algorithm to do pathfinding

  • In fact, many planners do exactly that, with different heuristics

6 / 48

State Space Example

7 / 48

Planning Problem in State Space

8 / 48

For the Project

  • Parse PDDL files as discussed last week

  • Extract all types, objects, operators, the initial state and the goal condition

  • Ground actions

  • Plan

9 / 48

Grounding Actions

(:action up
:parameters (?f1 - floor ?f2 - floor)
:precondition (and (lift-at ?f1) (above ?f1 ?f2))
:effect (and (lift-at ?f2) (not (lift-at ?f1))))

For each parameter, replace with all possible values.

Save all ground actions in a list.

You may end up with many ground actions.
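
A minimal grounding sketch in Python; the operator attributes (.parameters as a list of (name, type) pairs, .instantiate) and objects_by_type are assumptions about your parser's data structures, not a fixed API:

import itertools

def ground_actions(operators, objects_by_type):
    """Instantiate every operator with every type-correct combination
    of objects, e.g. up(f1, f2) for every pair of floors."""
    ground = []
    for op in operators:
        # One candidate list per parameter, based on its declared type
        candidates = [objects_by_type[ptype] for (_, ptype) in op.parameters]
        for values in itertools.product(*candidates):
            names = [pname for (pname, _) in op.parameters]
            ground.append(op.instantiate(dict(zip(names, values))))
    return ground

With n floors, the up operator alone yields n² ground actions (including useless ones like up(f1, f1)), which is why this list can get large.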

10 / 48

State Representation

You can create the world for the initial state by passing all atoms and the type dictionary to make_world. However, for pathfinding you will need a Node subclass, e.g. PlanningState.

Each PlanningState contains a reference to a logical world, as well as to the list of all actions

Note: You don't have to use make_world, and could instead store the state information directly in PlanningState
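
A minimal sketch of what PlanningState might look like; Node here is a stand-in for the base class from the pathfinding lectures, and the attribute names are illustrative:

class Node:
    """Stand-in for the Node base class from the pathfinding framework."""
    def get_neighbors(self):
        raise NotImplementedError

class PlanningState(Node):
    def __init__(self, world, actions):
        self.world = world      # logical world describing this state
        self.actions = actions  # shared list of all ground actions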

11 / 48

get_neighbors

  • Checks which actions can be applied in the logical world, i.e. which actions' preconditions are satisfied (using models)

  • Generates a new logical world for each of these actions by applying its effects using apply

  • Returns Edge objects containing new PlanningState objects for each generated logical world, with a cost of 1, and a string representing the action that was taken
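
A sketch of get_neighbors under the same assumptions; the argument order of models and apply, and the Edge constructor, are guesses at the framework's API:

    def get_neighbors(self):
        neighbors = []
        for action in self.actions:
            # Applicable iff the current world satisfies the precondition
            if models(self.world, action.precondition):
                new_world = apply(action.effect, self.world)
                successor = PlanningState(new_world, self.actions)
                neighbors.append(Edge(successor, 1, str(action)))
        return neighbors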

12 / 48

Pathfinding

  • We can now use astar as our planning algorithm!

  • Pass it a PlanningState corresponding to the initial state (which can generate more states by applying actions)

  • For the goal condition, write an isgoal function that uses models on the state and the goal condition

  • Call pathfinding.astar(start, h, isgoal)
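
Putting the pieces together, a usage sketch; initial_atoms, type_dict, goal_condition, and actions are assumed to come from your PDDL parser and the grounding step:

def isgoal(state):
    # Goal test: does this state's world satisfy the goal condition?
    return models(state.world, goal_condition)

def h(state):
    return 0  # placeholder heuristic (turns A* into uniform-cost search)

start = PlanningState(make_world(initial_atoms, type_dict), actions)
plan = pathfinding.astar(start, h, isgoal)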

13 / 48

Planning Heuristics

14 / 48

A short timeline

  • 1995: Graphplan (next lecture)

  • 1997/1998: HSP (Heuristic Search Planner)

  • 2000/2001: FastForward

  • 2005/2006: FastDownward

15 / 48

Heuristics

  • We heard that heuristics can speed up A*

  • We used heuristics to tell the algorithm which nodes on the search frontier were "closer" to the goal

  • For search in Romania or on UCR campus we used the straight line distance as a "guess" of how close a node could be to the goal

  • We tried to have our heuristic underestimate the cost and be in the same "units" as the actual cost (e.g. km)

16 / 48

Planning Heuristics

  • In planning, our nodes are (logical) states, and our edges are actions

  • When we expand a state by applying all actions, we compute a heuristic value for each of the successor states

  • This heuristic value should estimate how long a plan from that state to a goal state is (at most)

  • In other words, we want to estimate how "difficult" it is to satisfy the goal from each state in our search frontier

17 / 48

Heuristics

  • So how do we define a heuristic for such an abstract process?

  • Here is an idea: Solve a simplified/approximate (relaxed) version of the problem, and use the cost of the solution as the heuristic

  • How can we simplify the problem?

18 / 48

Heuristic Search Planning (HSP)

  • Idea: Ignore delete lists

  • Why is this simpler? Because the state always grows

  • With a larger state, more actions can be applied, and eventually we will have added all atoms that can possibly be added

Note: This does not work without modification for non-STRIPS domains that have negative preconditions.

19 / 48

HSP

  • Oops: This is still NP-hard (action ordering)...

  • Instead, use an estimation:

    • Assign a heuristic value i=0 to each atom in the current state
    • Increase i by 1
    • Apply all possible relaxed actions (without delete lists), and set the heuristic value of each produced atom to i
    • Repeat the last two steps until no values change anymore
  • The heuristic value is then the sum of the heuristic values of all atoms in the goal (see the sketch below)

  • This is no longer admissible, since all goal atoms are treated as independent
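
A sketch of this layered estimation in Python, assuming atoms are hashable and each relaxed action exposes .precondition and .add_list as sets (names are illustrative):

def hsp_heuristic(state_atoms, relaxed_actions, goal_atoms):
    """Layered HSP estimate: an atom's value is the iteration at
    which it is first produced by the relaxed actions."""
    value = {atom: 0 for atom in state_atoms}
    i, changed = 0, True
    while changed:
        changed, i = False, i + 1
        for action in relaxed_actions:
            if all(p in value for p in action.precondition):
                for atom in action.add_list:
                    if atom not in value:   # first time this atom appears
                        value[atom] = i
                        changed = True
    # Treat goal atoms as independent and sum their values
    return sum(value.get(atom, float("inf")) for atom in goal_atoms)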

20 / 48

HSP Heuristic Example: Pickup

Positive Preconditions: {free(), clear(X)}

Negative Preconditions: None

Add List: {holds(X)}

Delete List: {free(), clear(X)}

21 / 48

HSP Heuristic Example

Relaxed blocksworld Pickup:

{free(), clear(X)} → {holds(X)}

Put down:

{holds(X), clear(Y)} → {on(X,Y), clear(X), free()}

Current State:

{on(A,B), on(B,Table), on(C,Table), clear(A), clear(C), free()}
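
Working through the first iterations of the HSP procedure: every atom in the current state starts with value 0. At i = 1, relaxed Pickup applies to A and C (both are clear and the gripper is free), so holds(A) and holds(C) receive value 1. At i = 2, Put down can then produce atoms such as on(A,C) and on(C,A) with value 2, and the process repeats until no new atoms appear.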

22 / 48

HSP

  • HSP was the first widely successful planner to use heuristic search on STRIPS problems

  • It won the 1998 planning competition

  • There are updated versions which work with similar strategies

  • FastForward was inspired by HSP

23 / 48

FastForward

  • We said earlier that solving the relaxed problem is NP-hard

  • But that depends on which relaxed problem we mean

  • Specifically, finding an optimal relaxed plan is NP-hard, but finding some relaxed plan is in P

  • Of course, this may overestimate

24 / 48

FastForward Heuristic

  • Start with a "layer" consisting of the set of all atoms in the current state

  • Apply all applicable relaxed actions, and add all their effects to the set of atoms generating the next layer

  • Continue until all atoms in the goal can be found in a layer

  • Then backtrack through the layers to build a relaxed plan

  • The length of this plan is the value of the heuristic
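
A sketch of this procedure under the same assumptions as the HSP sketch (.precondition and .add_list as sets of hashable atoms):

def ff_heuristic(state_atoms, relaxed_actions, goal_atoms):
    """FastForward-style estimate: build relaxed fact layers until the
    goal appears, then extract a relaxed plan backwards."""
    layers = [set(state_atoms)]     # fact layers
    action_layers = []              # actions applicable at each layer
    while not set(goal_atoms) <= layers[-1]:
        applicable = [a for a in relaxed_actions
                      if a.precondition <= layers[-1]]
        new_facts = layers[-1].union(*(a.add_list for a in applicable))
        if new_facts == layers[-1]:
            return float("inf")     # goal unreachable even when relaxed
        action_layers.append(applicable)
        layers.append(new_facts)

    # Backtrack: an atom first appearing in layer i+1 must be achieved by
    # an action from action_layers[i]; its preconditions become subgoals.
    plan, goals = set(), set(goal_atoms)
    for i in range(len(action_layers) - 1, -1, -1):
        next_goals = set()
        for atom in goals:
            if atom in layers[i]:
                next_goals.add(atom)    # already true in an earlier layer
                continue
            achiever = next(a for a in action_layers[i] if atom in a.add_list)
            plan.add((i, achiever))
            next_goals |= achiever.precondition
        goals = next_goals
    return len(plan)                # relaxed plan length = heuristic value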

25 / 48

FastForward: Hillclimbing

  • FastForward also uses a modified search procedure

  • Since the heuristic may overestimate, it can be beneficial to ignore it for parts of the search

  • For any state, if all neighbors have a higher estimated cost, FastForward will expand all of those states' neighbors, until it finds a state with a lower heuristic value

  • In other words, as long as the heuristic values don't decrease, FastForward uses breadth-first search
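
A sketch of this search strategy (known as enforced hill-climbing); edge.target and edge.action are assumed attribute names for the Edge objects from earlier, and states are assumed hashable:

from collections import deque

def enforced_hill_climbing(start, h, isgoal):
    """Hill-climb on h, but when no neighbor improves, breadth-first
    search until a state with a strictly smaller h-value is found."""
    current, best, plan = start, h(start), []
    while not isgoal(current):
        frontier = deque([(current, plan)])
        seen, found = {current}, None
        while frontier and found is None:
            state, path = frontier.popleft()
            for edge in state.get_neighbors():
                if edge.target in seen:
                    continue
                seen.add(edge.target)
                step = path + [edge.action]
                if isgoal(edge.target) or h(edge.target) < best:
                    found = (edge.target, step)
                    break
                frontier.append((edge.target, step))
        if found is None:
            return None   # dead end; a complete planner falls back to full search
        current, plan = found
        best = h(current)
    return plan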

26 / 48

FastDownward

  • FastDownward is a "planning system" that implements several different search procedures and heuristics

  • Additionally, FastDownward compiles the planning problem to a different representation

  • Multi-Valued Planning Task: Instead of binary predicates we have value assignments

27 / 48

Multi-Valued Planning Task

  • Predicates are often not a very natural way to represent real-world tasks

  • For example, it would be nice if we could easily say which block the gripper is holding in blocksworld, or which block is above another

  • Multi-Valued Planning Tasks (MPT) allow you to do just that

  • An MPT contains variables, each of which can be assigned one value from a finite domain

  • For example: "holds" may be a state variable with the domain ranging over all available blocks
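
As a purely illustrative Python encoding (not FastDownward's actual format), the blocksworld state from the earlier example could be written with one multi-valued variable per "slot":

# One variable per slot, each with a finite domain, instead of
# many binary predicates
domains = {
    "holds": {"A", "B", "C", "nothing"},
    "on_A":  {"B", "C", "Table", "undefined"},
    "on_B":  {"A", "C", "Table", "undefined"},
    "on_C":  {"A", "B", "Table", "undefined"},
}

# The state {on(A,B), on(B,Table), on(C,Table), free()} becomes:
state = {"holds": "nothing", "on_A": "B", "on_B": "Table", "on_C": "Table"}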

28 / 48

Multi-Valued Planning Task

Why?

  • MPTs are still PSPACE-hard (propositional planning is just a special case where all state variables have values "true" or "false")

  • However, MPTs allow the natural expression of mutually exclusive states, which may cut down the search space

  • For example, we know that on(A,B) and on(A,C) can never be true at the same time

  • A propositional planner first has to figure that out

  • An MPT planner knows that on_A can only have the value B or C, but never both

29 / 48

Search Heuristic

  • Using the state variables, we can analyze how they can change

  • For example, on_A = B may only change into on_A = undefined

  • We can also determine under which conditions such changes may occur (i.e. which actions cause them, and what preconditions they have)

  • This gives us some guidance in what to do

  • Of course, this analysis could have been done on the original problem, the MPT approach just reduces the space of possibilities

30 / 48

FastDownward

  • As mentioned, FastDownward actually implements several different approaches

  • The translation from a propositional planning problem (in PDDL) to an MPT is actually a separate component

  • The algorithms we are discussing for propositional planning usually still work on MPTs with some slight modifications

31 / 48

Types of Heuristics

32 / 48

Different Types of Heuristics

  • Relaxation

  • Abstraction

  • Critical Paths

  • Landmarks

  • Network Flow

33 / 48

Relaxation

  • So far all heuristics we have talked about have been relaxation heuristics

  • They relax the planning problem in some way

  • Removing negative effects is the most common way to do this

  • You could also remove other parts of actions, or the goal, or split up actions

34 / 48

Abstraction

"Estimate cost by projecting the state space to a smaller space (applying a graph homomorphism)" (Helmert and Röger)

  • Make the planning problem simpler, by removing options

  • For example, remove one of the blocks from actions and goals

  • For some domains you may also have a more abstract representation (e.g. for a strategy game: Have a library of high level strategic actions which can help solve the planning problem on low-level actions)

35 / 48

Critical Paths

  • We can pick a part of the goal, and construct a "critical path" backwards

  • For example, to get B on the table, we have to put-on-table(B), for that we first have to pickup(B), etc.

  • The number of actions on this critical path can serve as a heuristic

  • We may want to try different parts of the goal to get a better estimate

36 / 48

Landmarks

  • In many cases, we can find actions that have to happen (Landmarks)

  • For example: If we want B to be on the table, we have to have an action put-on-table(B)

  • Our heuristic value can be constructed from counting all of these landmark actions

  • The more accurate our count is, the more accurate is our heuristic
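
A minimal counting sketch; landmarks_for is a hypothetical helper that returns the set of actions known to be necessary for achieving a given atom:

def landmark_count(state_atoms, goal_atoms, landmarks_for):
    """Count landmark actions still required from this state.
    E.g. landmarks_for(on(B, Table)) would include put-on-table(B)."""
    required = set()
    for atom in goal_atoms:
        if atom not in state_atoms:        # goal atom not achieved yet
            required |= landmarks_for(atom)
    return len(required)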

37 / 48

Network Flow

  • We can look at actions as producing and consuming facts

  • The number of productions and consumptions has to be the same

  • That means that every time we pick up a block, we also have to put one down

  • We may be able to use this fact to determine how many "switches" have to be made at least

38 / 48

For the Project

  • Task 4 is the most "creative" part of the project

  • You are supposed to come up with your own heuristic

  • Think about the different types, and how you would implement them

  • The FastForward Heuristic may be a good starting point

39 / 48

Other Optimizations

40 / 48

Other Optimizations

  • Combining Multiple Heuristics

  • State-Space Pruning

  • Invariant Synthesis

  • Action preferences

  • Other search strategies

41 / 48

Combining Multiple Heuristics

  • No single heuristic is good for all planning problems

  • Just calculate multiple different ones, and combine the results

  • Pick the highest, lowest, average, sum, etc.

  • Of course, this is just another heuristic and will not be "perfect" either

  • But it may overcome limitations of others
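
A small sketch of such a combination; note that taking the maximum preserves admissibility if every component heuristic is admissible:

def combine(heuristics, mode=max):
    """Build a new heuristic from several others; `mode` can be
    max, min, sum, or any function over an iterable of numbers."""
    def h(state):
        return mode(hi(state) for hi in heuristics)
    return h

# Example: take the larger of the FF and landmark-count estimates
# h = combine([ff_h, landmark_h], mode=max)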

42 / 48

State-Space Pruning

  • Many problems contain states that are "almost" the same

  • For example, if we want to transport packages around (logistics domain), and have multiple identical trucks, we can remove states that only differ by which truck is where

  • It may also be that the order of two actions does not matter, so we don't have to consider both orderings

  • Some of these calculations are relatively simple, others may require external knowledge

43 / 48

Invariant Synthesis

  • In most planning problems the operators maintain certain invariants, i.e. facts that hold in all reachable states

  • For example, in blocksworld there can only ever be one block that is held by the gripper

  • This forms the basis of the translation to an MPT, but can also be used when solving the relaxed problem to strengthen the heuristic

  • Another form of invariants are predicates that can never change, which we can use to exclude certain actions (like putting a block on itself)

44 / 48

Action Preference

  • When we calculate a heuristic by solving a relaxed problem, we actually get a "plan"

  • This plan can be used to give us a clue what a "good" next action might be

  • Rather than considering all actions the same, we prefer these actions

  • This can be used to augment the heuristic, or we can simply try this action first and fall back to "full" search if things don't work out

45 / 48

Other Search Strategies

Remember: Everything in AI is either representation or search

  • Searching through state space is just one possible approach to planning

  • Other planners may search through a different kind of space (different representation)

  • Yet others may use a very different search procedure when expanding the states

  • We will look at two such approaches in the next two weeks

46 / 48

Homework

  • Homework 5 has been posted on the class website

  • There are 5 problems using the planning domains from the repository of http://planning.domains

  • You will be calculating heuristic values and solving relaxed problems

  • When you do this, take note of how the different heuristics perform their estimations

47 / 48
