A planning problem consists of three parts:
A definition of the current state of the world
A definition of a desired state of the world
A definition of the actions the agent can take
We can define a planning problem in PDDL domain and problem files:
The domain file defines which actions (operators) exist, and (optionally) the types of all objects
The problem file defines the initial state of the world (and any additional objects that may be required), and the goal condition
A planner then reads these two files and outputs a plan
Today we'll talk about that last part
Recall the pathfinding problem from the second lecture
We are given a graph, consisting of nodes and edges
In our implicit graph representation, we could ask one node to generate its neighbors
We would start at the start node, and generate neighbors "intelligently" until we reached a goal node
To use a pathfinding algorithm for planning, we need to formulate the planning problem as a graph. Let's start with the "obvious" choice:
Each state is a logical state
Each edge corresponds to an action
In any state, we can take any action whose preconditions are satisfied to generate a new state
When we start at the node corresponding to the start state and start expanding actions, we generate the so-called State-Space
We can then use a standard A* algorithm to do pathfinding
In fact, many planners do exactly that, with different heuristics
Parse PDDL files like discussed last week
Extract all types, objects, operators, the initial state and the goal condition
Ground actions
Plan
(:action up
  :parameters (?f1 - floor ?f2 - floor)
  :precondition (and (lift-at ?f1) (above ?f1 ?f2))
  :effect (and (lift-at ?f2) (not (lift-at ?f1))))
For each parameter, replace with all possible values.
Save all ground actions in a list.
You may end up with many ground actions.
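As a concrete illustration, here is a minimal grounding sketch in Python. The operator fields (parameters as name/type pairs, an instantiate method) are assumptions about what your parser produces, not a fixed API:

```python
import itertools

def ground_actions(operators, objects_by_type):
    """Replace every operator parameter with every type-compatible object."""
    ground = []
    for op in operators:
        # op.parameters is assumed to be a list of (name, type) pairs
        domains = [objects_by_type[typ] for (_, typ) in op.parameters]
        for combo in itertools.product(*domains):
            subst = {name: obj for (name, _), obj in zip(op.parameters, combo)}
            # op.instantiate is assumed to substitute the parameters into the
            # precondition and effect, returning one ground action
            ground.append(op.instantiate(subst))
    return ground
```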
You can create the world for the initial state by passing all atoms and the type dictionary to make_world. However, for pathfinding you will need a Node subclass, e.g. PlanningState.
Each PlanningState contains a reference to a logical world, as well as to the list of all actions
Note: You don't have to use make_world, and could instead store the state information directly in PlanningState
get_neighbors checks which actions can be applied in the logical world, i.e. which actions' preconditions are satisfied using models
Generates a new logical world for each of these actions by applying its effects using apply
Returns Edge objects containing new PlanningState objects for each generated logical world, with a cost of 1, and a string representing the action that was taken
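A minimal sketch of such a PlanningState is shown below. Node, Edge, models and apply are the framework names mentioned above; their exact signatures (and the argument order of Edge) are assumptions here:

```python
class PlanningState(Node):
    """Search node wrapping a logical world plus the list of ground actions."""
    def __init__(self, world, actions):
        self.world = world        # logical world describing this state
        self.actions = actions    # all ground actions (shared between states)

    def get_neighbors(self):
        neighbors = []
        for act in self.actions:
            # Only actions whose precondition holds in this world are applicable
            if models(self.world, act.precondition):
                new_world = apply(self.world, act.effect)
                successor = PlanningState(new_world, self.actions)
                # Edge(target, cost, label) is an assumed order; cost 1 per action
                neighbors.append(Edge(successor, 1, str(act)))
        return neighbors
```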
We can now use astar as our planning algorithm!
Pass it a PlanningState corresponding to the initial state (which can generate more states by applying actions)
For the goal condition, write an isgoal function that uses models on the state and the goal condition
Call pathfinding.astar(start, h, isgoal)
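Putting it together, a hedged sketch of the top-level call; make_world, models and pathfinding.astar are the names from these slides, while initial_atoms, type_dict, ground_acts, goal_condition and h stand in for your own variables:

```python
def isgoal(state):
    # The goal is reached when the goal condition holds in the state's world
    return models(state.world, goal_condition)

start = PlanningState(make_world(initial_atoms, type_dict), ground_acts)
plan = pathfinding.astar(start, h, isgoal)   # h: heuristic function on states
```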
1995: Graphplan (next lecture)
1997/1998: HSP (Heuristic Search Planner)
2000/2001: FastForward
2005/2006: FastDownward
We heard that heuristics can speed up A*
We used heuristics to tell the algorithm which nodes on the search frontier were "closer" to the goal
For search in Romania or on UCR campus we used the straight line distance as a "guess" of how close a node could be to the goal
We tried to have our heuristic underestimate the cost and be in the same "units" as the actual cost (e.g. km)
In planning, our nodes are (logical) states, and our edges are actions
When we expand a state by applying all actions, we compute a heuristic value for each of the successor states
This heuristic value should estimate how long a plan from that state to a goal state is
In other words, we want to estimate how "difficult" it is to satisfy the goal from each state in our search frontier
So how do we define a heuristic for such an abstract process?
Here is an idea: Solve a simplified/approximate (relaxed) version of the problem, and use the cost of the solution as the heuristic
How can we simplify the problem?
Idea: Ignore delete lists
Why is this simpler? Because the state always grows
With a larger state, more actions can be applied, and eventually we will have added all atoms that can possibly be added
Note: This does not work without modification for non-STRIPS domains that have negative preconditions.
Oops: This is still NP-hard (action ordering)...
Instead, use an estimation: give each atom a heuristic value of 0 if it already holds in the current state, and otherwise 1 plus the (summed) values of the preconditions of the cheapest action that adds it
The heuristic value of a state is then the sum of the heuristic values of all atoms in the goal
This is no longer admissible, since all goal atoms are treated as independent
Blocksworld Pickup:
Positive Preconditions: {free(),clear(X)} Negative Preconditions: None
Add List: {holds(X)}
Delete List: {free(),clear(X)}
Relaxed blocksworld Pickup:
{free(),clear(X)}⇒{holds(X)}
Put down:
{holds(X),clear(Y)}⇒{on(X,Y),clear(X),free()}
Current State:
{on(A,B),on(B,Table),on(C,Table),clear(A),clear(C),free()}
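Using relaxed actions like the ones above, here is a minimal sketch of the additive estimate just described (essentially HSP's h_add). It assumes ground actions expose precondition_atoms and add_atoms as sets and uses unit action costs:

```python
import math

def h_add(state_atoms, goal_atoms, actions):
    cost = {a: 0 for a in state_atoms}             # atoms already true cost 0
    changed = True
    while changed:                                 # fixed point over all ground actions
        changed = False
        for act in actions:
            if all(p in cost for p in act.precondition_atoms):
                c = 1 + sum(cost[p] for p in act.precondition_atoms)
                for q in act.add_atoms:
                    if c < cost.get(q, math.inf):
                        cost[q] = c
                        changed = True
    # Sum of atom costs; infinite if some goal atom is unreachable
    return sum(cost.get(g, math.inf) for g in goal_atoms)
```

Atoms already true in the current state contribute 0; each missing goal atom contributes the cost of the cheapest relaxed chain of actions that adds it.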
HSP was the first widely successful planner to use heuristic search on STRIPS problems
It won the 1998 planning competition
There are updated versions which work with similar strategies
FastForward was inspired by HSP
We said earlier that solving the relaxed problem is NP-Hard
But that depends on which relaxed problem we mean
Specifically, finding the optimal relaxed plan is NP-hard, but finding some relaxed plan is in P
Of course, this may overestimate
Start with a "layer" consisting of the set of all atoms in the current state
Apply all applicable relaxed actions, and add all their effects to the set of atoms generating the next layer
Continue until all atoms in the goal can be found in a layer
Then backtrack through the layers to build a relaxed plan
The length of this plan is the value of the heuristic
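A sketch of this relaxed-plan computation is below. The achiever bookkeeping and the simple backchaining are one possible way to extract a plan, not exactly FastForward's extraction procedure; actions are assumed to be hashable and to expose precondition_atoms and add_atoms as sets:

```python
def h_ff(state_atoms, goal_atoms, actions):
    goal_atoms = set(goal_atoms)
    reached = set(state_atoms)
    achiever = {}                      # atom -> relaxed action that first added it
    while not goal_atoms <= reached:
        added_something = False
        for act in actions:
            if set(act.precondition_atoms) <= reached:
                for q in set(act.add_atoms) - reached:
                    achiever[q] = act
                    reached.add(q)
                    added_something = True
        if not added_something:
            return float("inf")        # goal unreachable even in the relaxation
    # Backchain from the goal: collect achievers and their preconditions
    plan, queue = set(), list(goal_atoms)
    while queue:
        atom = queue.pop()
        act = achiever.get(atom)       # atoms true in the state have no achiever
        if act is not None and act not in plan:
            plan.add(act)
            queue.extend(act.precondition_atoms)
    return len(plan)
```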
FastForward also uses a modified search procedure
Since the heuristic may overestimate, it can be beneficial to ignore it for parts of the search
For any state, if all neighbors have a higher estimated cost, FastForward will expand all of those states' neighbors, until it finds a state with a lower heuristic value
In other words, as long as the heuristic values don't decrease, FastForward uses breadth-first search
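A sketch of this "enforced hill-climbing" idea, assuming hashable states and the edge attributes (target, name) from the earlier PlanningState sketch:

```python
from collections import deque

def enforced_hill_climbing(start, h, isgoal):
    current, plan = start, []
    while not isgoal(current):
        best_h = h(current)
        queue = deque([(current, [])])
        seen = {current}               # assumes states are hashable
        improved = None
        while queue and improved is None:
            state, path = queue.popleft()
            for edge in state.get_neighbors():
                if edge.target in seen:
                    continue
                seen.add(edge.target)
                new_path = path + [edge.name]
                if isgoal(edge.target) or h(edge.target) < best_h:
                    improved = (edge.target, new_path)
                    break
                queue.append((edge.target, new_path))
        if improved is None:
            return None                # breadth-first lookahead failed; fall back to full search
        current, steps = improved
        plan.extend(steps)
    return plan
```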
FastDownward is a "planning system" that implements several different search procedures and heuristics
Additionally, FastDownward compiles the planning problem to a different representation
Multi-Valued Planning Task: Instead of predicates that are only true or false, we have variables with value assignments
Predicates are often not a very natural way to represent real-world tasks
For example, it would be nice if we could easily say which block the gripper is holding in blocksworld, or which block is above another
Multi-Valued Planning Tasks (MPT) allow you to do just that
An MPT contains variables, each of which can be assigned one value from a finite domain
For example: "holds" may be a state variable with the domain ranging over all available blocks
Why?
MPTs are still PSPACE-hard (propositional planning is just a special case where all state variables have values "true" or "false")
However, MPTs allow the natural expression of mutually exclusive states, which may cut down the search space
For example, we know that on(A,B) and on(A,C) can never be true at the same time
A propositional planner first has to figure that out
An MPT planner knows that on_A can only have the value B or C, but never both
Using the state variables, we can analyze how they can change
For example, on_A = B may only change into on_A = undefined
We can also determine under which conditions such changes may occur (i.e. which actions cause them, and what preconditions they have)
This gives us some guidance in what to do
Of course, this analysis could have been done on the original problem, the MPT approach just reduces the space of possibilities
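For illustration, one way such a multi-valued state could be written down in Python. This encoding is an assumption made for this example; FastDownward's internal representation differs in detail:

```python
# Propositional view: on(A,B), on(A,C), on(A,Table), holding(A), ... are all
# separate true/false atoms, and their mutual exclusion is implicit.
# Multi-valued view: one variable per question, with exactly one value.
state = {
    "on_A": "B",           # exactly one of: "B", "C", "Table", "held"
    "on_B": "Table",
    "holding": "nothing",  # the gripper holds at most one block
}

# A tiny domain transition graph for on_A: which values can follow which,
# and (informally) which action causes the change.
transitions_on_A = {
    "B": ["held"],                  # pickup(A), requires clear(A) and holding == "nothing"
    "held": ["B", "C", "Table"],    # putting A down somewhere
}
```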
As mentioned, FastDownward actually implements several different approaches
The translation from a propositional planning problem (in PDDL) to an MPT is actually a separate component
The algorithms we are discussing for propositional planning usually still work on MPTs with some slight modifications
Relaxation
Abstraction
Critical Paths
Landmarks
Network Flow
So far all heuristics we have talked about have been relaxation heuristics
They relax the planning problem in some way
Removing negative effects is the most common way to do this
You could also remove other parts of actions, or the goal, or split up actions
"Estimate cost by projecting the state space to a smaller space (applying a graph homomorphism)" (Helmert and Röger)
Make the planning problem simpler, by removing options
For example, remove one of the blocks from actions and goals
For some domains you may also have a more abstract representation (e.g. for a strategy game: Have a library of high level strategic actions which can help solve the planning problem on low-level actions)
We can take part of the goal, and construct a "critical path" backwards
For example, to get B on the table, we have to put-on-table(B); for that, we first have to pickup(B), etc.
The number of actions on this critical path can serve as a heuristic
We may want to try different parts of the goal to get a better estimate
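A sketch of such a backward chain for a single goal atom; precondition_atoms and add_atoms are assumed action attributes, and without memoization this is illustrative rather than efficient:

```python
def critical_path_length(atom, state_atoms, actions, seen=frozenset()):
    """Length of the shortest chain of actions ending in one that adds `atom`."""
    if atom in state_atoms:
        return 0
    if atom in seen:
        return float("inf")            # cyclic requirement: give up on this branch
    best = float("inf")
    for act in actions:
        if atom in act.add_atoms:
            # The chain must first achieve the hardest precondition of this achiever
            need = max((critical_path_length(p, state_atoms, actions, seen | {atom})
                        for p in act.precondition_atoms), default=0)
            best = min(best, 1 + need)
    return best
```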
In many cases, we can find actions that have to happen (Landmarks)
For example: If we want B to be on the table, we have to have an action put-on-table(B)
Our heuristic value can be constructed from counting all of these landmark actions
The more accurate our count is, the more accurate our heuristic will be
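A deliberately crude sketch along these lines: count the goal atoms that are not yet true, since each of them needs at least one action that adds it. This treats every goal atom as its own landmark and ignores actions that achieve several atoms at once (add_atoms is an assumed action attribute):

```python
def h_goal_landmarks(state_atoms, goal_atoms, actions):
    remaining = [g for g in goal_atoms if g not in state_atoms]
    # If no action at all can add a missing goal atom, the state is a dead end
    for g in remaining:
        if not any(g in act.add_atoms for act in actions):
            return float("inf")
    return len(remaining)
```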
We can look at actions as producing and consuming facts
The number of productions and consumptions has to be the same
That means, for every time we pick up a block, we have to put one down as well
We may be able to use this fact to determine how many "switches" have to be made at least
Task 4 is the most "creative" part of the project
You are supposed to come up with your own heuristic
Think about the different types, and how you would implement them
The FastForward Heuristic may be a good starting point
Combining Multiple Heuristics
State-Space Pruning
Invariant Synthesis
Action preferences
Other search strategies
No single heuristic is good for all planning problems
Just calculate multiple different ones, and combine the results
Pick the highest, lowest, average, sum, etc.
Of course, this is just another heuristic and will not be "perfect" either
But it may overcome limitations of others
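For example, a simple combinator (a sketch; which combination works best depends on the domain and on whether you care about admissibility):

```python
def combined_h(state, heuristics):
    values = [h(state) for h in heuristics]
    # max preserves admissibility if every input heuristic is admissible;
    # sum or an average may guide the search better but can overestimate
    return max(values)
```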
Many problems contain states that are "almost" the same
For example, if we want to transport packages around (logistics domain), and have multiple identical trucks, we can remove states that only differ by which truck is where
It may also be that the order of two actions does not matter, so we don't have to consider both orderings
Some of these calculations are relatively simple, others may require external knowledge
In most planning problems the operators maintain certain invariants, i.e. facts that hold in all reachable states
For example, in blocksworld there can only ever be one block that is held by the gripper
This forms the basis of the translation to an MPT, but can also be used when solving the relaxed problem to strengthen the heuristic
Another form of invariants are predicates that can never change, which we can use to exclude certain actions (like putting a block on itself)
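A small sketch of that second use: drop ground actions that violate such an invariant, e.g. actions that mention the same block twice. This is only valid in domains where repeated arguments can never make sense, and act.parameters is an assumed attribute:

```python
def prune_ground_actions(ground_actions):
    kept = []
    for act in ground_actions:
        # e.g. skip put-on(A, A): the same object bound to two parameters
        if len(set(act.parameters)) < len(act.parameters):
            continue
        kept.append(act)
    return kept
```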
When we calculate a heuristic by solving a relaxed problem, we actually get a "plan"
This plan can be used to give us a clue what a "good" next action might be
Rather than considering all actions the same, we prefer these actions
Can be used to augment the heuristic, or to simply try these actions first and fall back to "full" search if things don't work out
Remember: Everything in AI is either representation or search
Searching through state space is just one possible approach to planning
Other planners may search through a different kind of space (different representation)
Yet others may use a very different search procedure when expanding the states
We will look at two such approaches in the next two weeks
Homework 5 has been posted on the class website
There are 5 problems using the planning domains from the repository of http://planning.domains
You will be calculating heuristic values and solving relaxed problems
When you do this, take note of how the different heuristics perform their estimations