-------------------------------------------------
Planning Seminar Class Notes on April 06 2000
By XuanLong Nguyen, xuanlong@asu.edu

Two main topics in today's class are:
1. Conditional Planning
2. Issues involved in Execution

Relevant readings are:
1. S. Russell's textbook, chapter 13 on Planning and Acting
2. K. Golden's AIPS-98 paper "Leap before you look".

Exclusive Interview: An exclusive conversation between our boss and Keith Golden (Puccini's author) is at the end of this note.
______________________________________________________________________

We addressed a couple of questions on the tire example in Russell's book. To begin with, let's briefly summarize the main ideas of the textbook's algorithm for generating a conditional plan. In this algorithm, the conditional planner is a generalization of the partial-order planner POP. Initially, the plan includes just the start and goal steps (as in POP). We repeatedly select an open condition and a step (either a new one or a step already in the partial plan) that achieves it. Since we have access to only an incomplete description of the world, we eventually run into an open condition (Intact(tire1)) that cannot be achieved by any available causal action, but can be known through a sensing action, or conditional step (check(tire1)). There are, say, 2 different outcomes of the sensing action, corresponding to two different *contexts*. Each disjoint context is then propagated forward to the branch of the partial plan where it is needed.

[ Thus, it's now clear that the contexts are used to partition the possible world states based on the set of unknown state variables. Furthermore, since we are doing backward-chaining search, the contexts are considered on a demand-driven basis. In other words, only unknown state variables that are relevant to achieving our goals are considered and sensed by sensing actions.
For example, if there are 2 unknown state variables, intact(tire1) and spare(tire1), each of which can be true or false, we may have 4 different contexts in total. But as we do backward chaining, it turns out that we only need to care about two contexts, corresponding to intact(tire1) and ~intact(tire1). ]

Back to the main stream of the algorithm. Since we have introduced a new conditional step, which branches our partial plan into 2 contexts (C1 and ~C1), we have to find a plan corresponding to the second context C2 = ~C1 as well. This is done by making a copy of the goal step. Then we do backward chaining again, working on the open conditions of this new goal step and the partial plan growing from it. If during this backward process you run into another conditional step, you need further context branching, and then you work on each of these context branches.

If you know your current context C2, how can you take it into account when you do the backward chaining? Rao said that the contexts are merely labels used to differentiate between different branches of the conditional plan. But the algorithm says (and we can see) that when we do backward chaining under a current context C2, whenever we choose a step S that adds a precondition p, S has to be "compatible" with C2. I personally don't understand what compatible means, unless we take C2 not as a mere label, but as some logical sentence. On the other hand, if C2 also has a logical value, then we run into the problem that the context can be changed by some causal action within its own branch -- something that Rao dismisses.

Threat resolution is simple. Suppose we have a step S that threatens a link s1 ->(p)-> s2. We can resolve it by promotion or demotion, as in POP, but there is also one additional choice: conditioning the step S so that its context becomes incompatible with the context of the steps s1 and s2.

[ Long: Again, what exactly does "incompatible" mean here?
]

At this point, it's easy to answer one question raised in class: why do we have to worry about interactions (specifically, threats) between apparently different branches corresponding to different contingencies of the conditional plan? The problem is that if we have more than one context under consideration, we are not sure which context a given step (derived by backward chaining) belongs to.

Another question: what if two different steps, belonging to two different contexts, also need the same step? Then what happens if that step is eventually backward-chained to a step that is inconsistent with one of the contexts? Notice that this question presumes that we may work on different branches, corresponding to different (disjoint) contexts, concurrently. This is not the case with the particular algorithm CPOP (page 399): not until all plans for existing goal steps for contexts C1, ..., Cn are complete do we consider the plan for the new context ~(C1 ^ ... ^ Cn). It is a legitimate question, though, because we really want to avoid overlap between branches: we want to reduce the number of branches as much as we can, while allowing different branches to merge if possible.

We can see that the whole algorithm works in principle, but it's clear that this algorithm can NOT produce a single conditional plan in practice. It's too inefficient! There are so many non-deterministic choice points that it is hard to find an efficient search strategy. For example, it's not clear whether finding plans for all contexts one by one, or concurrently, is better. And how do we choose among the 3 different threat-resolution options? And suppose you choose conditioning; which strategy would you use to choose a conditional link for the threatening step? This is not to mention that the algorithm inherits the same burdens as UCPOP, in also having to deal with the problems of selecting open conditions, supporting actions, etc.
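The three threat-resolution options (promotion, demotion, conditioning) can be sketched in code. This is a hypothetical illustration only: the representation of orderings as pairs, contexts as dicts of unknown state variables, and the function names are my own assumptions, not CPOP's actual data structures.

```python
# Hypothetical sketch of CPOP-style threat resolution. Orderings are a set
# of (before, after) pairs; a context maps unknown state variables to truth
# values; two contexts are "incompatible" if they disagree on some variable,
# which is one concrete reading of the textbook's term.

def incompatible(ctx1, ctx2):
    """True if the contexts assign opposite values to some variable."""
    return any(v in ctx2 and ctx2[v] != val for v, val in ctx1.items())

def resolve_threat(orderings, contexts, threat, link):
    """Enumerate candidate resolutions for `threat` clobbering the causal
    link (producer, condition, consumer). Each candidate is a (kind,
    refinement) pair: a new orderings set, or new context assignments."""
    producer, cond, consumer = link
    candidates = []
    # Promotion: order the threatening step after the link's consumer.
    candidates.append(("promotion", orderings | {(consumer, threat)}))
    # Demotion: order the threatening step before the link's producer.
    candidates.append(("demotion", orderings | {(threat, producer)}))
    # Conditioning: give the threat a context incompatible with the
    # link's context, pushing it onto a different branch of the plan.
    link_ctx = contexts.get(producer, {})
    for var, val in link_ctx.items():
        new_ctx = dict(contexts.get(threat, {}))
        new_ctx[var] = not val
        candidates.append(("conditioning", {**contexts, threat: new_ctx}))
    return candidates
```

Note how conditioning only becomes available once the link's steps carry a context at all, which matches the observation above that contexts are introduced on demand by sensing actions.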
And UCPOP, in its current state of the art, is already a dead planner that can solve nothing. One can easily wonder what kind of plans this CPOP could turn out.

Rao briefly mentioned the problem with the generations of planners that were extended from the UCPOP framework. In extending UCPOP in different ways to accommodate problems of different natures, the notions of causal link, threat, and threat resolution are often modified and broadened to mean different things, and that further complicates the understanding of, as well as efforts to improve the efficiency of, these planners.

A brief mention of SGP, which also deals with incomplete information. SGP's underlying idea isn't that smart, in that it essentially attempts to find plans for every possible world, while we know that many of these different plans may actually overlap.

Now we turn to the execution issue in planning. Execution is understandably most pressing in the setting of planning with incomplete information. If we plan under the classical assumptions, there is little about execution to care about -- we just need to synthesize a single plan that, when executed, will surely lead to the goal.

Issues in execution are (1) what to execute, and (2) how to execute.

Question (2) specifically asks: how do you proceed with the plan after you execute an action? Different scenarios may arise:

+ Are the effects of the action you executed what you intended? If not, what would you do? Suppose the effects are not what you intended; would you kill the subbranch starting from that step, or not? There is no easy answer to this, as the branch you just killed may be useful later on.

+ When you execute an action, the set of precedence constraints changes as well. For example, if you choose to execute an action S, then the precedence constraint (InitialState, S) now becomes a contiguity constraint, and S is now supposed to be before any other steps to be considered in the plan.
Propagating this kind of precedence constraint may lead a candidate plan into a dead end, or narrow down the search choices.

+ As Rao said: execution is a pretty messy thing!

Question (1) is equally problematic. In general, in the incomplete-information setting you are never sure whether executing an action will lead you to the goal or to a dead end, *unless* the action chosen is reversible. Thus there are several rules of thumb for choosing which action to execute:

+ It should be an earliest action in a topological sort of the plan (i.e., it satisfies all precedence constraints).

+ Wait until the action (step) doesn't take part in any interaction with some branch (which doesn't mean it won't later on, though).

One way to ensure that executing an action doesn't kill you in the end is to add extra safe-measure preconditions, so that you will be able to reverse it (e.g., you want to climb up the tree to bring your cat down; but you are afraid to die, so you want an extra condition: a life rope). While this method helps you avoid deadlock, it may compromise completeness (suppose you don't have any rope; then you can't find any solution to save the cat, while the cat could in fact be saved). Extra preconditioning can also be costly, because it requires extra planning.

Now, one note on Golden's Puccini. It turns out that Puccini is not a conditional planner; it deals with incomplete information in a different way. Below are Golden's answers to Rao's questions. It's interesting to note that in dealing with question (1), he considers "when to execute an action" and "which actions to execute" as different flaws. Furthermore, these flaws are given higher priority than other types of flaws. Rao emphasized that while this is interesting, as the notion of flaw is again overloaded, there is really a big efficiency question here: it is not clear whether such a search-control strategy (across different types of flaws) is efficient or not.

Here is the Golden-Rao conversation.
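As an aside, the execution machinery discussed above (pick a candidate step that is earliest in the topological order and free of cross-branch interactions, then execute it, turning its ordering into a contiguity constraint) can be sketched as follows. All names and data structures here are my own illustrative assumptions, not taken from any of the planners discussed.

```python
# Hypothetical sketch of interleaved execution bookkeeping. Steps are
# strings, orderings are (before, after) precedence pairs, `executed` is
# the contiguous prefix of steps already carried out, and `interacting`
# flags steps still involved in interactions with another branch.

def candidates(steps, orderings, executed, interacting):
    """Steps whose predecessors have all been executed (i.e., earliest in
    the topological order) and that are not flagged as interacting."""
    return [s for s in steps
            if s not in executed
            and all(a in executed for (a, b) in orderings if b == s)
            and s not in interacting]

def execute(steps, orderings, executed, step):
    """Execute `step`: it joins the contiguous executed prefix, and a new
    precedence constraint forces it before every remaining step."""
    executed = executed + [step]
    new_orderings = set(orderings)
    for other in steps:
        if other not in executed:
            new_orderings.add((step, other))
    return executed, new_orderings
```

One can see the dead-end risk mentioned above directly in this sketch: each call to `execute` only ever adds constraints, so a bad early choice permanently narrows what `candidates` can return later.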
I personally think the discussion on the "cost" of actions is interesting and controversial. To me, Golden favors low-cost over high-cost actions because he's more concerned with not running into deadlock, but it's not clear whether this kind of strategy best serves both time (to reach the goal) and optimality.

Rao: Another question... Does puccini ever generate conditional plans to handle incompleteness, or does it essentially depend on interleaved execution to figure out what the incompleteness is?

Golden: the latter. for the softbot domain, contingency planning is costly and not very effective, so although i toyed around with it a little, i didn't end up using it.

Rao: I understand that UWL develops conditional plans, while Gini/Olowsky depend on execution to avoid contingency planning.

Golden: sensp, which supported uwl, did contingency planning. neal lesh wrote a (nameless) planner for uwl that interleaved planning and execution.

Rao: Also, what are the control strategies used for flaw resolution -- for example, when does one pick an execution flaw over an open condition flaw? (If you just refer me to the correct section in your dissertation -- which I am looking at, that would suffice.)

Golden: there is no fixed strategy -- it is subject to search control. but in general, execution is given higher priority than open conditions. the most effective strategy i found was assigning high priority to the execution of low cost actions and very low priority to the execution of high cost actions. the rationale is that low-cost actions don't represent much of a commitment but can lead to increased knowledge, which can be used to prune the search tree. however, high cost actions represent a considerable commitment, so it is better to do more planning to make sure that commitment is justified. by "cost", i mean both the time required to execute the action and the impact it has (e.g. the time required to undo the action).
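Golden's search-control idea above (execute cheap actions eagerly for information, defer expensive ones until more planning justifies the commitment) can be sketched roughly as follows. The numeric priority values and the cost threshold are entirely my own illustrative assumptions; Puccini's actual search control is subject to configuration, as he says.

```python
# Hypothetical sketch of cost-sensitive flaw prioritization. A flaw is a
# (kind, action_cost) pair; lower priority value means handled first.
# The specific numbers and threshold are assumptions for illustration.

def flaw_priority(flaw_kind, action_cost=0.0, cost_threshold=5.0):
    """Rank flaws: cheap execution first, expensive execution last."""
    if flaw_kind == "execution":
        # Low-cost actions are a small commitment but may yield knowledge
        # that prunes the search tree, so execute them eagerly; high-cost
        # actions are a big commitment, so defer them behind planning.
        return 0 if action_cost < cost_threshold else 3
    if flaw_kind == "open_condition":
        return 1
    return 2  # threats and other flaw types

def next_flaw(flaws):
    """Pick the flaw to work on next from (kind, cost) pairs."""
    return min(flaws, key=lambda f: flaw_priority(f[0], f[1]))
```

Under this ranking, a cheap sensing action beats an open condition, but an expensive irreversible action waits behind all planning flaws -- which is exactly the asymmetry Golden describes.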
Rao: sorry for bugging you about things you probably thought you were over and done with...

Golden: no problem. actually, i'm doing softbot stuff again, though of a different flavor and not using any of my old code. and replying to your email is far more pleasant than checking the proofs in a paper i'm reviewing, so it's a welcome break.

--
XuanLong Nguyen
CSE, Arizona State Univ.