A planning problem consists of three parts:
A definition of the current state of the world
A definition of a desired state of the world
A definition of the actions the agent can take
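These three parts can be sketched in a STRIPS-like style. The representation below (states as sets of facts, actions with preconditions and add/delete lists) is a common textbook formulation; all names are illustrative, not tied to any particular planner:

```python
from typing import NamedTuple

class Action(NamedTuple):
    name: str
    pre: frozenset     # facts that must hold to apply the action
    add: frozenset     # facts the action makes true
    delete: frozenset  # facts the action makes false

def apply_action(state, action):
    """Apply an action if its preconditions hold; return the successor state."""
    assert action.pre <= state, "preconditions not satisfied"
    return (state - action.delete) | action.add

# Current state, desired state, and one available action:
current = frozenset({'at(truck, A)', 'in(pkg, truck)'})
goal = frozenset({'at(truck, B)'})
drive = Action('drive(A, B)',
               pre=frozenset({'at(truck, A)'}),
               add=frozenset({'at(truck, B)'}),
               delete=frozenset({'at(truck, A)'}))
```

Planning then means finding a sequence of such actions that transforms the current state into one satisfying the goal.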
How would a human solve a complex problem?
For example, say we have 5 packages, 2 trucks and 40 cities
A "natural" way would be to take the large problem and divide it into smaller problems
One part may be "move package 1 from its origin to its destination"
But each of these parts is still a "complex problem"
Once we have solved all the "complex problems", we have a plan!
However, the solutions for each part may interact or overlap
For example, if we drive a package from Limón to Liberia, and another one from San José to Alajuela, we may combine these two plans into one
On the other hand, if we drive a package from Limón to San José, and another from San Ramón to Puntarenas, we need to come up with a plan to fill the gap
This approach of "take a complex problem and split it into smaller subproblems" is not just natural to humans, it is also common in algorithms
Let's see if we can apply it to planning
We probably have to augment our formalism with some way to represent decomposition
Last week we talked about partial-order causal link planning
Our graph consisted of nodes representing plans
To transition from one plan to another we would use "refinement operations"
Let's use the same idea for decompositions
We always assumed that actions were what the agent could execute in the end
What if we have something more abstract, like "go to the airport"
There may be many ways to achieve such an abstract action, like going by car, by taxi, by train
Some ways only work in certain situations (not all airports have train connections)
Just as with our regular actions, let's assume that an abstract action has preconditions and effects
That means our POCL planning process can just insert abstract actions to satisfy other actions' preconditions
However, we need a new type of flaw, because abstract actions cannot be executed directly
To define how an abstract action can actually be performed we define "decompositions"
Basically, we say that to perform an abstract action we instead perform a set of several other actions
Of course, these other actions may also be abstract, leading to a hierarchy of decompositions
So what is a decomposition?
You can imagine a decomposition to be like a "mini-plan"
The abstract action has some preconditions and effects
The decomposition has these preconditions as an initial state, and the effects as the goal
Additionally, the decomposition adds "pseudo-steps" to the plan
We require that all preconditions and effects are used by causal links connected to these pseudo-steps
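One way to picture this "mini-plan" as data (a sketch; the class and field names are invented for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    action: str            # action name, possibly abstract
    abstract: bool = False

@dataclass
class CausalLink:
    producer: Step   # step whose effect is used
    condition: str   # the protected literal
    consumer: Step   # step whose precondition is satisfied

@dataclass
class Decomposition:
    """A 'mini-plan' that realizes one abstract action."""
    initial: Step    # pseudo-step: the abstract action's preconditions, as effects
    goal: Step       # pseudo-step: the abstract action's effects, as preconditions
    steps: list = field(default_factory=list)
    links: list = field(default_factory=list)  # every precondition/effect of the
                                               # pseudo-steps must appear in a link
```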
We could just replace an abstract action by its decomposition
However, in many cases we want to "reuse" some steps
For example, we have an abstract action "go on vacation", and another "buy a gift"
One may involve going to the airport, the other going to a store, but there are often stores at airports
If we don't have any flaws, return the current plan
Non-deterministically choose a flaw
Refine the plan to fix the flaw (note: this may generate new flaws!)
Call POP on the newly refined plan
We can reuse POP by adding a new flaw type and new refinements
New flaw: Abstract Action without decomposition
New refinement: Apply decomposition
New refinement: Merge pseudo-steps
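Putting the loop and the new flaw type together, a sketch of the extended POP recursion (the plan interface here is hypothetical, and `random.choice` stands in for non-deterministic choice):

```python
import random

def pop(plan):
    """Partial-order planning loop, extended with the decomposition flaw.
    `plan` is assumed to expose flaws() and a refinements(flaw) generator."""
    flaws = plan.flaws()            # open preconditions, threats, AND
                                    # abstract actions without a decomposition
    if not flaws:
        return plan                 # no flaws: the plan is a solution
    flaw = random.choice(flaws)
    for refined in plan.refinements(flaw):  # apply a decomposition,
                                            # merge pseudo-steps, ...
        result = pop(refined)       # refining may introduce new flaws
        if result is not None:
            return result
    return None                     # dead end: backtrack

class SolvedPlan:
    """Trivial example plan with no flaws, for illustration only."""
    def flaws(self): return []
    def refinements(self, flaw): return []
```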
One particularly well-suited domain for decompositional planning is conveying information
Take a scientific paper: The paper consists of sections, which each have their own preconditions and effects, each section consists of a series of paragraphs, and each paragraph consists of sentences
The "preconditions" and "effects" are of the form "the reader knows x"
Planning based on task decomposition
The "domain" in this case is a set of decomposition rules, or "methods"
The "goal" is to achieve a task
Each method describes how one task can be decomposed into smaller tasks
There may be multiple methods for each task
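A minimal sketch of such a method set as plain Python data (task and subtask names are invented):

```python
# Each task maps to one or more methods; each method is a list of subtasks.
methods = {
    'deliver(pkg)': [
        ['load(pkg, truck)', 'drive(truck, dest)', 'unload(pkg, truck)'],  # by road
        ['load(pkg, plane)', 'fly(plane, dest)', 'unload(pkg, plane)'],    # by air
    ],
}

def decompose(task):
    """Return the candidate subtask lists for a task,
    or the task itself if it is primitive (has no methods)."""
    return methods.get(task, [[task]])
```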
"(Compared with classical planners,) the primary advantage of HTN planners is their sophisticated knowledge representation (and reasoning capabilities) (1)"
Early PDDL tried to include syntax to describe task networks
That effort was discarded, and no one really used it
To this day there is no "standard" HTN formalism
(1) Ghallab et al. Automated Planning: Theory & Practice
HTN planning is (depending on the exact formulation) Turing-complete
Some reduced HTN formalisms can be compiled to PDDL
Others (as we will see) allow arbitrary code to be mixed with tasks
NOAH (1975): Nets of Action Hierarchies, the first HTN planner
Nonlin (1976)
SHOP (1999), SHOP2 (2003): Simple Hierarchical Ordered Planner
SIADEX (2005)
Some people have argued that HTNs require the domain to encode a lot of almost algorithmic knowledge
SHOP was even disqualified from the International Planning Competition in 2000
On the other hand, this additional information often means that HTN planners perform really well on real-world tasks
"HTN planning is promoted as the most applied automated planning technique of real-world problem" (1)
(1) Nau et al. Applications of SHOP and SHOP2
PyHOP is a newer, simplified version of SHOP
Instead of logic, it uses Python objects and functions to represent states and methods
The planning algorithm itself needs fewer than 150 lines of Python code
A state in PyHOP is a simple Python object with some attributes assigned to it.
For example, an initial state for blocksworld could look like this:
```python
state1 = State('state1')
state1.pos = {'a': 'b', 'b': 'table', 'c': 'table'}
state1.clear = {'a': True, 'b': False, 'c': True}
state1.holding = False
```
To pick up or put down a block, we write Python functions
```python
def pickup(state, b):
    if state.pos[b] == 'table' and state.clear[b] == True and state.holding == False:
        state.pos[b] = 'hand'
        state.clear[b] = False
        state.holding = b
        return state
    else:
        return False
```
```python
def moveb_m(state, goal):
    for b1 in all_blocks(state):
        s = status(b1, state, goal)
        if s == 'move-to-table':
            return [('move_one', b1, 'table'), ('move_blocks', goal)]
        elif s == 'move-to-block':
            return [('move_one', b1, goal.pos[b1]), ('move_blocks', goal)]
    b1 = pyhop.find_if(lambda x: status(x, state, goal) == 'wait',
                       all_blocks(state))
    if b1 != None:
        return [('move_one', b1, 'table'), ('move_blocks', goal)]
    return []

pyhop.declare_methods('move_blocks', moveb_m)
```
```python
def is_done(b1, state, goal):
    if b1 == 'table':
        return True
    if b1 in goal.pos and goal.pos[b1] != state.pos[b1]:
        return False
    if state.pos[b1] == 'table':
        return True
    return is_done(state.pos[b1], state, goal)

def status(b1, state, goal):
    if is_done(b1, state, goal):
        return 'done'
    elif not state.clear[b1]:
        return 'inaccessible'
    elif not (b1 in goal.pos) or goal.pos[b1] == 'table':
        return 'move-to-table'
    elif is_done(goal.pos[b1], state, goal) and state.clear[goal.pos[b1]]:
        return 'move-to-block'
    else:
        return 'wait'
```
To use PyHOP we pass it a task to solve, with an (optional) goal.
```python
goal1a = Goal('goal1a')
goal1a.pos = {'c': 'b', 'b': 'a'}
pyhop(state1, [('move_blocks', goal1a)], verbose=1)
```
We now have multiple agents
They want to collaborate
But each agent is performing their planning individually
Communication?
Imagine we have unlimited bandwidth for communication
Then we could write a distributed algorithm that solves the planning problem for all agents simultaneously
The plans are then distributed to the agents and executed by them
In practice, communication is severely limited
Additionally, combining multiple planning problems into one increases computational complexity
We may therefore need every agent to perform their own planning and only coordinate when necessary
Goal Selection
Planning
Synchronization
Execution
But wait! Before we plan anything we need to have a goal
If we collaborate on a larger task, such as soccer, we need to distribute the goals among the agents
Some of these goals may require shared resources (there is usually only one ball)
There are many ways to do this goal allocation, for example in the form of an "auction" (where each agent "bids" their expected cost of achieving it, lowest one gets the goal)
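A minimal sketch of such an auction (agent names, positions, and the cost function are invented for illustration):

```python
def auction(goals, agents, cost):
    """Assign each goal to the agent that bids the lowest expected cost.
    `cost(agent, goal)` is each agent's own estimate of achieving the goal."""
    assignment = {}
    for goal in goals:
        winner = min(agents, key=lambda a: cost(a, goal))  # lowest bid wins
        assignment[goal] = winner
    return assignment

# Toy example: two players bidding on two goals by distance on a 1-D field.
positions = {'p1': 0, 'p2': 10}
goals = {'get_ball': 2, 'defend': 9}
cost = lambda agent, goal: abs(positions[agent] - goals[goal])
```

Note that this greedy per-goal auction ignores shared resources; handling those may require richer (e.g. combinatorial) bidding schemes.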
Once each agent has their goals, they come up with a plan to achieve them
This can be done using any of the techniques we have discussed so far
The agents could even use different algorithms (for example, if they have different computational capabilities)
Since the agents often operate in real-time environments, the duration of each action (and plan) is important!
Agents communicate their plans to each other
They then may need to synchronize certain actions (e.g. a pass between players)
They also need to verify that they do not use the same resource at the same time (including space on the field)
If this synchronization fails, they may have to start over
Once the plans have been validated against each other, the agents can start executing them
Each action will take a certain amount of time, e.g. to move across the field
Oops: Things change quickly in soccer
The agents basically have to constantly replan to account for the changed circumstances
We assumed that the plans would "play nicely" together, and resource conflicts would be relatively rare
For many domains, including soccer, this is surprisingly effective
Consider this: The players start distributed across the field, with the ball in one known location
If we define reasonable goals, not every player will run towards the ball, but instead position themselves intelligently
These assignments result in a low chance for collisions
Additionally, defined "roles" for the players result in natural collaboration such as forward passes
The generalized version of these soccer "conventions" is called social laws
Basically, each agent has a set of rules they are expected to follow, that define which plans are "socially" acceptable
One approach is to define an increasingly strict set of such laws, and start planning with the strictest version, relaxing the laws until a plan can be found
Because these laws basically define a hand-crafted restriction on the plans, the solution may not be optimal, but this approach has been shown to work well in practice
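The relax-until-solvable loop can be sketched as follows (the law sets and the planning function are placeholders):

```python
def plan_with_laws(problem, law_sets, plan):
    """Try planning under the strictest law set first; relax until a plan exists.
    `law_sets` is ordered from strictest to most permissive;
    `plan(problem, laws)` returns a plan or None."""
    for laws in law_sets:               # strictest first
        result = plan(problem, laws)
        if result is not None:
            return result, laws         # also report which laws were needed
    return None, None                   # unsolvable even without restrictions
```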
What if the agents are not (fully) cooperative?
For example, logistics trucks may need to distribute packages for delivery between them, but each truck operator receives payment for deliveries
What if some actions can only be executed by two agents?
For example, lifting a couch requires two agents to be present. A single agent cannot just "plan to move the couch"
What if each agent only has limited information about the world?
For example, each agent has cameras and there are walls blocking vision
Homework 8 has been posted on the class website
There are 5 problems
One problem is about DPOCL
Three use PyHOP: Download the code and play around with it
The last one is about multiagent planning
Young, R. M. DPOCL: A Principled Approach to Discourse Planning
Georgievski, I. and Aiello, M. An Overview of Hierarchical Task Network Planning