class: center, middle

# Artificial Intelligence: Planning

### Conclusion

---
class: center, middle

# Current Developments

---

# Current Developments

- Planning is still a highly active field of research
- The International Conference on Automated Planning and Scheduling (ICAPS) is the premier venue for planning research
- They also hold various competitions every year
- For example, the First Unsolvability International Planning Competition in 2016

---

# Recent Research

- Novel Heuristics
- Novel Applications
- Integration with Machine Learning
- Extensions to the formalism, like temporal and probabilistic planning

---

# Some Applications

- Logistics (RoboCup Logistics League)
- Ride sharing
- Video Games
- Narrative Generation

---

# Goal-Oriented Action Planning

* F.E.A.R. (and several other games) used a variant of planning for its enemy AI
* Instead of arbitrary goals and states, it greatly restricts what can be represented
* This restriction allowed it to be used in a real-time environment
* The developers stated that using planning allowed them to easily add new action types

---
class: center, middle

# Narrative Generation as Planning

---

# Narratives

* A narrative is often considered to consist of two parts:
  - Story: What is happening in the world (the "plot")
  - Discourse: What the audience is told (the "telling")
* We can use a planner to generate each of these two parts

---

# Story Generation

- We have an initial state of the world
- We have actions the different characters can perform
- The author defines a goal for the story
- For example: In a detective story, the goal is that there is a murder and that the detective identifies the murderer

---

# Story Example

* Sherlock Holmes goes to Irene Adler's house
* Sherlock Holmes stabs Irene Adler
* Sherlock Holmes identifies himself as the killer
* The End.

---

# Story Generation as Planning

- Planners are usually good at finding efficient solutions
- Stories are not efficient!
- However, each *character* should act rationally, i.e. efficiently
- We could use one planner per character ...

---

# Intentional Planning

- With one planner per character, we may never find a path to the story goal
- Instead, we use a "trick": We tell the planner what each character wants to achieve
- The planner may then only use actions for a character that help them achieve their goal
- But the overall plan still has to solve the story goal!
- Implementations: IPOCL, CPOCL

---

# Discourse Planning

- Once we have a story (plot), we want to tell it
- There are several different media, such as film, text, comics, or combinations thereof
- As we discussed during our lecture on Hierarchical Planning, we can use planning to generate this discourse

---

# Camera Planning

- Suppose we want to make a movie
- Each scene is supposed to convey some particular aspect of the story
- There are different camera shots, angles, lenses, etc. that we can use
- We can frame this as a planning problem by defining the different shots as operators

---

# The Battle for the Rune

[Link](https://drive.google.com/open?id=1VweROPFFeFyPv4M1rf_FPnZR9NDIuD23)

---
class: center, middle

# Compilation

---

# Feature Creep

- New applications have led to the need for new features in planners
- For example, intentional planning for stories, or constraints on the plan
- But classical planning is already very expressive!
- We can "compile away" certain things

---

# Compiling Intentions

- In Intentional Planning, the planner may only add actions that help a character achieve their goal
- But that's just another precondition!
- For each predicate, such as `has`, we add an additional predicate `intends-has`
- Then, for each such intention, we add a new action variant that has that intention as an additional precondition (sketched on the next slide)
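---
class: small

# Compiling Intentions: A Sketch

A minimal sketch of this compilation, assuming a toy representation of action schemas as named tuples. The `Action` type and the example `take` action are illustrative, not taken from any particular planner:

```python
from typing import NamedTuple

class Action(NamedTuple):
    name: str
    preconditions: tuple  # predicates, e.g. ("at ?c ?place",)
    effects: tuple

def compile_intentions(action: Action) -> list[Action]:
    """For every effect predicate p, emit a variant of the action that
    additionally requires the acting character to intend p."""
    variants = []
    for effect in action.effects:
        pred = effect.split()[0]  # e.g. "has" from "has ?c ?item"
        intention = effect.replace(pred, f"intends-{pred}", 1)
        variants.append(Action(
            name=f"{action.name}-intends-{pred}",
            preconditions=action.preconditions + (intention,),
            effects=action.effects,
        ))
    return variants

# A character may only "take" an item if they intend to have it:
take = Action("take", ("at ?c ?place", "at ?item ?place"), ("has ?c ?item",))
for variant in compile_intentions(take):
    print(variant.name, variant.preconditions)
```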
---
class: medium

# Why Compilation?

- IPOCL can solve intentional planning problems!
- To plan a variation of the Aladdin story, it needs 12 hours
- The compiled version can be run using *any* state-of-the-art planner
- Even with significantly more predicates due to the compilation, a solution was found in under a minute!
- Using compilation, we can make use of all recent advances in classical planning!

---
class: center, middle

# Planner Selection

---

# Planner Selection

- No heuristic works well for all problems
- In fact, no planner works well for all problems
- What if we have "every" possible planner available?
- We would want to pick the best planner for each problem!

---

# Planner Selection

- We could run all planners in parallel
- As soon as one finishes, we return its solution and terminate all others
- Problem: We "need" one CPU per planner (or equivalent computational resources)

---

# Problem Structure

- Let's run each planner on a couple of benchmark problems and time it
- We then figure out which planner is the fastest for each of these problems
- Ideally, when we look at the problems each planner is good at, we can identify some common "structure"
- Then we use each planner on problems that have the structure of problems it is good at

---
class: medium

# Machine Learning?

- Identifying a common structure may be tricky to do manually
- Machine Learning is good at finding patterns!
- We could train a neural network on a representation of each problem and the time needed by a planner
- Then we can use the neural network to predict how long a particular planner may need on a new problem (sketched on the next slide)!
- Of course, this requires a proper encoding, and comes with all the other challenges of using Machine Learning
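---
class: small

# Predicting Planner Runtimes: A Sketch

A minimal sketch of this idea using scikit-learn. The problem features and training runtimes below are made up for illustration; in practice we would train one such model per planner in our portfolio:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical fixed-size encoding of each problem:
# (number of objects, ground predicates, ground actions, goal conditions)
X_train = np.array([
    [12,  40,  6, 2],
    [80, 300, 15, 9],
    [25, 120,  8, 4],
    [60, 250, 12, 7],
])
# Measured runtimes (in seconds) of one planner on these training problems
y_train = np.array([0.4, 55.0, 2.1, 31.0])

model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=5000, random_state=0)
model.fit(X_train, np.log(y_train))  # log scale: runtimes span magnitudes

# Predict how long this planner may need on a new problem; with one model
# per planner, we would then run the planner with the lowest prediction
new_problem = np.array([[30, 150, 9, 5]])
print(f"predicted runtime: {np.exp(model.predict(new_problem))[0]:.1f}s")
```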
---
class: center, middle

# Machine Learning

---

# Machine Learning and Planning

- There are many ways Machine Learning and Planning can be used together
- Generally, planning is good when the domain is known, but the solution structure may not be
- On the other hand, Machine Learning is good when the solution structure is known, but the domain may not be
- We can use each one to complement the other

---

# Reinforcement Learning

A Reinforcement Learning Problem consists of:

- A state of the world
- Actions the agent can take
- A reward function for the agent

--

Sounds a lot like planning?

---

# Reinforcement Learning

- In Reinforcement Learning, we tell the agent how many "points" it gets for reaching a final state
- The idea is that the agent "explores" the environment using some sequence of actions until it reaches a final state
- Then it uses the points it got to assign a value to each action it took to reach that state
- By repeating this "often", the agent learns the value of "each" action in "each" state (one concrete version is sketched on the next slide)
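---
class: small

# Reinforcement Learning: A Sketch

One standard way to make this concrete is tabular Q-learning. This is a generic sketch; the environment interface (`env.reset()`, `env.step()`, `env.actions()`) is an assumption, not a specific library:

```python
import random
from collections import defaultdict

def q_learning(env, episodes=1000, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Learn Q[(state, action)]: the value of each action in each state."""
    Q = defaultdict(float)
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Explore randomly with probability epsilon, otherwise exploit
            if random.random() < epsilon:
                action = random.choice(env.actions(state))
            else:
                action = max(env.actions(state), key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # Propagate the observed points back into the action's value
            best_next = max((Q[(next_state, a)] for a in env.actions(next_state)),
                            default=0.0)
            Q[(state, action)] += alpha * (reward + gamma * best_next
                                           - Q[(state, action)])
            state = next_state
    return Q
```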
---

# Learning a Domain

- From observing how an agent interacts with a world, we may be able to learn domain actions
- For example, if we observe the world before and after some actions, we can construct a model of what these actions may have done
- We may then use a planner to find a solution for some new problem

---

# Guiding the Agent

- In order to learn, the agent needs to explore the world
- This exploration follows some exploration policy
- We may use a plan as this exploration policy (with some random noise, perhaps)
- Instead of exploring the environment completely randomly (initially), the agent uses this plan as a guide (sketched on the next slide)
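---
class: small

# Guiding the Agent: A Sketch

A minimal sketch of such a plan-guided exploration policy. The plan format (a list of action names) and the policy interface are assumptions for illustration:

```python
import random

def plan_guided_policy(plan, noise=0.2):
    """Follow a precomputed plan step by step, but deviate with
    probability `noise` so the agent still explores off-plan states."""
    step = 0
    def policy(state, available_actions):
        nonlocal step
        if step < len(plan) and random.random() > noise:
            action = plan[step]  # use the planner's suggestion as a guide
        else:
            action = random.choice(available_actions)  # random exploration
        step += 1
        return action
    return policy

# Usage: the learner queries the policy each step and updates its values as
# usual; once the plan is exhausted, the policy falls back to pure random
# exploration (a real implementation would also re-check applicability)
policy = plan_guided_policy(["move-north", "move-north", "grab-gold"])
```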
---
class: small

# Wumpus World

- The Wumpus World domain is actually one of the standard textbook examples for Reinforcement Learning!
- The agent visits the world thousands of times and walks about randomly to learn which moves are good and which are bad
- Then it can visit a new world and knows not to run into the Wumpus (for example)
- Instead of moving about randomly, we can use a planner to provide a "good" path
- But maybe the world is too big? Our planner can operate on a simplified version, because it just needs to produce a guideline

---
class: center, middle

# Conclusion

---

# Conclusion

- Planning is very central to my research interests
- I really enjoyed teaching this class
- I hope I did not scare y'all away from this fascinating topic
- If you want to know more about my research, and/or do a project with me, let me know

---
class: medium

# Next Semester

- Please participate in the class evaluation!
- It helps me improve the course and my teaching in general
- It also helps the administration decide which courses to offer
- There is also the Machine Learning and Statistical Learning class next semester, if you liked my teaching style and/or are interested in the topic!

---
class: center, middle

# Planning Competition

---
class: medium

# Planning Competition

- For the planning competition, I selected several different problems
- I used the time my planner needed for the problems as a baseline to select some from different difficulty levels
- I ran each planner on each of these problems
- The fastest planner for each problem got 5 points, the second fastest 3, the third fastest 1
- If a planner crashed, timed out, or produced an invalid solution, it got -1 points

---
class: medium

# Planning Competition: The Problems

- Airport p04, p05, p06
- Airport ADL p01, p02
- Elevator s2-2, s3-4, s4-3, s5-2, s6-1
- Elevator ADL s2-4, s3-2, s4-1, s4-4, s5-3
- Logistics p01, p03, p04
- movie prob03, prob10, prob13, prob17, prob24, prob29
- psr-small p01, p03, p04, p05
- tpp p03, p04

---
class: small

# My Planner

I selected these problems to represent several different domains, and a range of different difficulties. My planner solved each of them in less than 1 minute.

- airport\p04-airport2-p1.pddl: 0.35 seconds
- airport\p05-airport2-p1.pddl: 0.42 seconds
- airport\p06-airport2-p2.pddl: 2.95 seconds
- airport-adl\p01-airport1-p1.pddl: 33.08 seconds
- airport-adl\p02-airport1-p1.pddl: 50.72 seconds
- elevators-00-adl\s2-4.pddl: 0.14 seconds
- elevators-00-adl\s3-2.pddl: 0.27 seconds
- elevators-00-adl\s4-1.pddl: 0.73 seconds
- elevators-00-adl\s4-4.pddl: 2.51 seconds
- elevators-00-adl\s5-3.pddl: 6.72 seconds
- elevators-00-strips\s2-2.pddl: 0.16 seconds
- elevators-00-strips\s3-4.pddl: 0.88 seconds
- elevators-00-strips\s4-3.pddl: 1.29 seconds
- elevators-00-strips\s5-2.pddl: 3.38 seconds
- elevators-00-strips\s6-1.pddl: 51.81 seconds

---

# My Planner

- logistics\p01.pddl: 13.09 seconds
- logistics\p03.pddl: 1.32 seconds
- logistics\p04.pddl: 52.17 seconds
- movie\prob03.pddl: 1.36 seconds
- movie\prob10.pddl: 1.47 seconds
- movie\prob13.pddl: 1.59 seconds
- movie\prob17.pddl: 4.31 seconds
- movie\prob24.pddl: 8.09 seconds
- movie\prob29.pddl: 7.05 seconds
- psr-small\p01-s2-n1-l2-f50.pddl: 0.13 seconds
- psr-small\p03-s7-n1-l3-f70.pddl: 0.21 seconds
- psr-small\p04-s8-n1-l4-f10.pddl: 0.22 seconds
- psr-small\p05-s9-n1-l4-f30.pddl: 1.31 seconds
- tpp\p03.pddl: 0.61 seconds
- tpp\p04.pddl: 4.37 seconds

---

# Planning Competition: Third Place

Mauricio!

- 31 points: 2 first places, 6 second places, 3 third places
- Very well rounded, never times out!
- Strengths: Apparently does well on more complex domains, like the logistics one
- Weaknesses: Slow on smaller domains, in particular movies

---

# Planning Competition: Second Place

Daniel!

- 69 points: 10 first places, 5 second places, 4 third places
- This is a very fast planner (for this class)!
- Strengths: Completely destroyed the movies domain, but also did very well on logistics
- Weaknesses: Slight overhead on smaller domains like psr-small and tpp

---

# Planning Competition: First Place

Kenneth!

- 95 points: 13 first places, 9 second places, 5 third places
- Actually timed out twice on the logistics domain
- The harder STRIPS elevator problems were also a struggle
- Strengths: Almost everything else

---

# Planning Competition: Graph

---

# Planning Competition Honorable Mention: Edgar

- Edgar's planner did amazingly well on the logistics domain
- Only three planners managed to solve the hardest logistics problem; the other two needed 91.4s and 218.1s (mine needs 52s)
- Edgar's planner solved it in 23.4s!
- Unfortunately, this did not generalize to many other domains

---

# Planning Competition: All Results

[Link](https://docs.google.com/spreadsheets/d/1FZpEAYxbeWP5FBlJs5N9Ep9ENL5YNvk44qVKOKz-OoM/edit?usp=sharing)

---
class: small

# References

* [ICAPS 2019 Proceedings](https://aaai.org/ojs/index.php/ICAPS)
* [Goal-Oriented Action Planning](http://alumni.media.mit.edu/~jorkin/goap.html)
* [Narrative Planning: Compilations to Classical Planning](https://arxiv.org/abs/1401.5856)
* [Bardic: Generating Multimedia Narrative Reports for Game Logs](https://www.semanticscholar.org/paper/Bardic-%3A-Generating-Multimedia-Narrative-Reports-Barot-Branon/3087daa75239c7990bfc993a5919d14e69373b42)