A planning problem consists of three parts:
A definition of the current state of the world
A definition of a desired state of the world
A definition of the actions the agent can take
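These three parts can be sketched in a STRIPS-like style. The representation below (states as sets of facts, actions with preconditions and add/delete lists) is a common textbook formulation; all names are illustrative, not tied to any particular planner:

```python
from typing import NamedTuple

class Action(NamedTuple):
    name: str
    pre: frozenset     # facts that must hold to apply the action
    add: frozenset     # facts the action makes true
    delete: frozenset  # facts the action makes false

def apply_action(state, action):
    """Apply an action if its preconditions hold; return the successor state."""
    assert action.pre <= state, "preconditions not satisfied"
    return (state - action.delete) | action.add

# Current state, desired state, and one available action:
current = frozenset({'at(truck, A)', 'in(pkg, truck)'})
goal = frozenset({'at(truck, B)'})
drive = Action('drive(A, B)',
               pre=frozenset({'at(truck, A)'}),
               add=frozenset({'at(truck, B)'}),
               delete=frozenset({'at(truck, A)'}))
```

Planning then means finding a sequence of such actions that transforms the current state into one satisfying the goal.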
How would a human solve a complex problem?
For example, say we have 5 packages, 2 trucks and 40 cities
A "natural" way would be to take the large problem and divide it into smaller problems
One part may be "move package 1 from its origin to its destination"
But each of these parts is still a "complex problem"
Once we have solved all the "complex problems", we have a plan!
However, the solutions for each part may interact or overlap
For example, if we drive a package from Limón to Liberia, and another one from San José to Alajuela, we may combine these two plans into one
On the other hand, if we drive a package from Limón to San José, and another from San Ramón to Puntarenas, we need to come up with a plan to fill the gap
This approach of "take a complex problem and split it into smaller subproblems" is not just natural to humans, it is also common in algorithms
Let's see if we can apply it to planning
We probably have to augment our formalism with some way to represent decomposition
Last week we talked about partial-order causal link planning
Our graph consisted of nodes representing plans
To transition from one plan to another we would use "refinement operations"
Let's use the same idea for decompositions
We always assumed that actions were what the agent could execute in the end
What if we have something more abstract, like "go to the airport"
There may be many ways to achieve such an abstract action, like going by car, by taxi, by train
Some ways only work in certain situations (not all airports have train connections)
Just as with our regular actions, let's assume that an abstract action has preconditions and effects
That means our POCL planning process can just insert abstract actions to satisfy other actions' preconditions
However, we need a new type of flaw, because abstract actions cannot be executed directly
To define how an abstract action can actually be performed we define "decompositions"
Basically, we say that to perform an abstract action we instead perform a set of several other actions
Of course, these other actions may also be abstract, leading to a hierarchy of decompositions
So what is a decomposition?
You can imagine a decomposition to be like a "mini-plan"
The abstract action has some preconditions and effects
The decomposition has these preconditions as an initial state, and the effects as the goal
Additionally, the decomposition adds "pseudo-steps" to the plan
We require that all preconditions and effects are used by causal links connected to these pseudo-steps
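One way to picture this "mini-plan" as data (a sketch; the class and field names are invented for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    action: str            # action name, possibly abstract
    abstract: bool = False

@dataclass
class CausalLink:
    producer: Step   # step whose effect is used
    condition: str   # the protected literal
    consumer: Step   # step whose precondition is satisfied

@dataclass
class Decomposition:
    """A 'mini-plan' that realizes one abstract action."""
    initial: Step    # pseudo-step: the abstract action's preconditions, as effects
    goal: Step       # pseudo-step: the abstract action's effects, as preconditions
    steps: list = field(default_factory=list)
    links: list = field(default_factory=list)  # every precondition/effect of the
                                               # pseudo-steps must appear in a link
```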
We could just replace an abstract action by its decomposition
However, in many cases we want to "reuse" some steps
For example, we have an abstract action "go on vacation", and another "buy a gift"
One may involve going to the airport, the other going to a store, but there are often stores at airports
If we don't have any flaws, return the current plan
Non-deterministically choose a flaw
Refine the plan to fix the flaw (note: this may generate new flaws!)
Call POP on the newly refined plan
We can reuse POP by adding a new flaw type and new refinements
New flaw: Abstract Action without decomposition
New refinement: Apply decomposition
New refinement: Merge pseudo-steps
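Putting the loop and the new flaw type together, a sketch of the extended POP recursion (the plan interface here is hypothetical, and `random.choice` stands in for non-deterministic choice):

```python
import random

def pop(plan):
    """Partial-order planning loop, extended with the decomposition flaw.
    `plan` is assumed to expose flaws() and a refinements(flaw) generator."""
    flaws = plan.flaws()            # open preconditions, threats, AND
                                    # abstract actions without a decomposition
    if not flaws:
        return plan                 # no flaws: the plan is a solution
    flaw = random.choice(flaws)
    for refined in plan.refinements(flaw):  # apply a decomposition,
                                            # merge pseudo-steps, ...
        result = pop(refined)       # refining may introduce new flaws
        if result is not None:
            return result
    return None                     # dead end: backtrack

class SolvedPlan:
    """Trivial example plan with no flaws, for illustration only."""
    def flaws(self): return []
    def refinements(self, flaw): return []
```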
One particularly well-suited domain for decompositional planning is conveying information
Take a scientific paper: The paper consists of sections, which each have their own preconditions and effects, each section consists of a series of paragraphs, and each paragraph consists of sentences
The "preconditions" and "effects" are of the form "the reader knows x"
Planning based on task decomposition
The "domain" in this case is a set of decomposition rules, or "methods"
The "goal" is to achieve a task
Each method describes how one task can be decomposed into smaller tasks
There may be multiple methods for each task
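A minimal sketch of such a method set as plain Python data (task and subtask names are invented):

```python
# Each task maps to one or more methods; each method is a list of subtasks.
methods = {
    'deliver(pkg)': [
        ['load(pkg, truck)', 'drive(truck, dest)', 'unload(pkg, truck)'],  # by road
        ['load(pkg, plane)', 'fly(plane, dest)', 'unload(pkg, plane)'],    # by air
    ],
}

def decompose(task):
    """Return the candidate subtask lists for a task,
    or the task itself if it is primitive (has no methods)."""
    return methods.get(task, [[task]])
```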
"(Compared with classical planners,) the primary advantage of HTN planners is their sophisticated knowledge representation (and reasoning capabilities) (1)"
Early PDDL tried to include syntax to describe task networks
That effort was discarded, and no one really used it
To this day there is no "standard" HTN formalism
(1) Ghallab et al. Automated Planning: Theory & Practice
HTN planning is (depending on the exact formulation) Turing-complete
Some reduced HTN formalisms can be compiled to PDDL
Others (as we will see) allow arbitrary code to be mixed with tasks
NOAH (1975): Nets of Action Hierarchies, the first HTN planner
Nonlin (1976)
SHOP (1999), SHOP2 (2003): Simple Hierarchical Ordered Planner
SIADEX (2005)
Some people have argued that HTNs require the domain to encode a lot of almost algorithmic knowledge
SHOP was even disqualified from the International Planning Competition in 2000
On the other hand, this additional information often means that HTN planners perform really well on real-world tasks
"HTN planning is promoted as the most applied automated planning technique of real-world problem" (1)
(1) Nau et al. Applications of SHOP and SHOP2
PyHOP is a newer, simplified version of SHOP
Instead of logic, it uses Python objects and functions to represent states and methods
The planning algorithm itself needs fewer than 150 lines of Python code
A state in PyHOP is a simple Python object with some attributes assigned to it.
For example, an initial state for blocksworld could look like this:
```python
state1 = State('state1')
state1.pos = {'a': 'b', 'b': 'table', 'c': 'table'}
state1.clear = {'a': True, 'b': False, 'c': True}
state1.holding = False
```
To pick up or put down a block, we write Python functions
```python
def pickup(state, b):
    if state.pos[b] == 'table' and state.clear[b] == True and state.holding == False:
        state.pos[b] = 'hand'
        state.clear[b] = False
        state.holding = b
        return state
    else:
        return False
```
```python
def moveb_m(state, goal):
    for b1 in all_blocks(state):
        s = status(b1, state, goal)
        if s == 'move-to-table':
            return [('move_one', b1, 'table'), ('move_blocks', goal)]
        elif s == 'move-to-block':
            return [('move_one', b1, goal.pos[b1]), ('move_blocks', goal)]
    b1 = pyhop.find_if(lambda x: status(x, state, goal) == 'wait',
                       all_blocks(state))
    if b1 != None:
        return [('move_one', b1, 'table'), ('move_blocks', goal)]
    return []

pyhop.declare_methods('move_blocks', moveb_m)
```
```python
def is_done(b1, state, goal):
    if b1 == 'table':
        return True
    if b1 in goal.pos and goal.pos[b1] != state.pos[b1]:
        return False
    if state.pos[b1] == 'table':
        return True
    return is_done(state.pos[b1], state, goal)

def status(b1, state, goal):
    if is_done(b1, state, goal):
        return 'done'
    elif not state.clear[b1]:
        return 'inaccessible'
    elif not (b1 in goal.pos) or goal.pos[b1] == 'table':
        return 'move-to-table'
    elif is_done(goal.pos[b1], state, goal) and state.clear[goal.pos[b1]]:
        return 'move-to-block'
    else:
        return 'wait'
```
To use PyHOP we pass it a task to solve, with an (optional) goal.
```python
goal1a = Goal('goal1a')
goal1a.pos = {'c': 'b', 'b': 'a'}
pyhop(state1, [('move_blocks', goal1a)], verbose=1)
```
We now have multiple agents
They want to collaborate
But each agent is performing their planning individually
Communication?
Imagine we have unlimited bandwidth for communication
Then we could write a distributed algorithm that solves the planning problem for all agents simultaneously
The plans are then distributed to the agents and executed by them
In practice, communication is severely limited
Additionally, combining multiple planning problems into one increases computational complexity
We may therefore need every agent to perform their own planning and only coordinate when necessary
Goal Selection
Planning
Synchronization
Execution
But wait! Before we plan anything we need to have a goal
If we collaborate on a larger task, such as soccer, we need to distribute the goals among the agents
Some of these goals may require shared resources (there is usually only one ball)
There are many ways to do this goal allocation, for example in the form of an "auction" (where each agent "bids" their expected cost of achieving it, lowest one gets the goal)
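A minimal sketch of such an auction (agent names, positions, and the cost function are invented for illustration):

```python
def auction(goals, agents, cost):
    """Assign each goal to the agent that bids the lowest expected cost.
    `cost(agent, goal)` is each agent's own estimate of achieving the goal."""
    assignment = {}
    for goal in goals:
        winner = min(agents, key=lambda a: cost(a, goal))  # lowest bid wins
        assignment[goal] = winner
    return assignment

# Toy example: two players bidding on two goals by distance on a 1-D field.
positions = {'p1': 0, 'p2': 10}
goals = {'get_ball': 2, 'defend': 9}
cost = lambda agent, goal: abs(positions[agent] - goals[goal])
```

Note that this greedy per-goal auction ignores shared resources; handling those may require richer (e.g. combinatorial) bidding schemes.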
Once each agent has their goals, they come up with a plan to achieve them
This can be done using any of the techniques we have discussed so far
The agents could even use different algorithms (for example, if they have different computational capabilities)
Since the agents often operate in real-time environments, the duration of each action (and plan) is important!
Agents communicate their plans to each other
They then may need to synchronize certain actions (e.g. a pass between players)
They also need to verify that they do not use the same resource at the same time (including space on the field)
If this synchronization fails, they may have to start over
Once the plans have been validated against each other, the agents can start executing them
Each action will take a certain amount of time, e.g. to move across the field
Oops: Things change quickly in soccer
The agents basically have to constantly replan to account for the changed circumstances
We assumed that the plans would "play nicely" together, and resource conflicts would be relatively rare
For many domains, including soccer, this is surprisingly effective
Consider this: The players start distributed across the field, with the ball in one known location
If we define reasonable goals, not every player will run towards the ball, but instead position themselves intelligently
These assignments result in a low chance for collisions
Additionally, defined "roles" for the players result in natural collaboration such as forward passes
The generalized version of these soccer "conventions" is called social laws
Basically, each agent has a set of rules they are expected to follow, that define which plans are "socially" acceptable
One approach is to define an increasingly strict set of such laws, and start planning with the strictest version, relaxing the laws until a plan can be found
Because these laws basically define a hand-crafted restriction on the plans, the solution may not be optimal, but this approach has been shown to work well in practice
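The relax-until-solvable loop can be sketched as follows (the law sets and the planning function are placeholders):

```python
def plan_with_laws(problem, law_sets, plan):
    """Try planning under the strictest law set first; relax until a plan exists.
    `law_sets` is ordered from strictest to most permissive;
    `plan(problem, laws)` returns a plan or None."""
    for laws in law_sets:               # strictest first
        result = plan(problem, laws)
        if result is not None:
            return result, laws         # also report which laws were needed
    return None, None                   # unsolvable even without restrictions
```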
What if the agents are not (fully) cooperative?
For example, logistics trucks may need to distribute packages for delivery between them, but each truck operator receives payment for deliveries
What if some actions can only be executed by two agents?
For example, lifting a couch requires two agents to be present. A single agent cannot just "plan to move the couch"
What if each agent only has limited information about the world?
For example, each agent has cameras and there are walls blocking vision
Homework 8 has been posted on the class website
There are 5 problems
One problem is about DPOCL
Three use PyHOP: Download the code and play around with it
The last one is about multiagent planning
Young, R. M. DPOCL: A Principled Approach to Discourse Planning
Georgievski, I. and Aiello, M. An Overview of Hierarchical Task Network Planning