Notes on :: Execution
---------------------
By: BIPLAV SRIVASTAVA, CSE 591 Planning Seminar
Significantly revised by Rao [Nov 16, 1995]
___________________________________________________________________________

* Plans may be made to check the feasibility of attaining a goal state
  without any intention of executing them. For this discussion, however, it
  is assumed that a planning agent also wants to execute its plan.

* Execution is trivial if the world never changes during the
  planning-execution episode: the agent just executes the actions according
  to the plan and it is done. However, classical planning techniques may be
  used even in situations where the classical planning assumptions are not
  completely satisfied -- where the world is partially observable,
  quasi-dynamic and quasi-deterministic. A realistic agent in a realistic
  world faces the problem that actions may not turn out as expected.
  Actions which the agent expects to accomplish successfully most of the
  time are taken as "primitive" actions for planning purposes, and ones
  which the agent goofs up often are made into goals for which planning is
  done. When the agent announces that it has a plan, it actually means that
  if its "primitive" actions succeed while executing the plan, then the
  agent will achieve the goal. But with the world as it is, its "primitive"
  actions may fail, and this is handled during execution.

* Execution approaches

  0) Just execute the plan and assume that the world reaches the goal
     state (assuming that the classical planning assumptions hold).
  1) Check, after executing all actions in the plan, whether the goal is
     achieved -- very optimistic!
  2) Action Monitoring: check whether the next action is applicable in the
     current state. (You can catch bad executions early.)
  3) Execution Monitoring: check whether the agent still has a plan to
     reach a goal state from the current state (before executing the next
     action).

* The Execution Monitoring model is the most general approach.
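The contrast between approaches 2 and 3 can be sketched as two execution
loops. Everything below is a hypothetical encoding, not part of any
specific planner: states are sets of propositions, and actions are
STRIPS-style dicts with precondition, add, and delete sets.

```python
# Hypothetical sketch contrasting action monitoring (approach 2) with
# execution monitoring (approach 3). States are sets of propositions;
# actions are STRIPS-style dicts with "pre", "add", and "del" sets.

def applicable(action, state):
    """An action applies when all of its preconditions hold."""
    return action["pre"] <= state

def apply_action(action, state):
    """STRIPS transition: remove the delete list, add the add list."""
    return (state - action["del"]) | action["add"]

def action_monitoring(plan, state, goal):
    """Approach 2: before each step, check only that it is applicable."""
    for a in plan:
        if not applicable(a, state):
            return None          # caught a bad execution early
        state = apply_action(a, state)
    return state if goal <= state else None

def execution_monitoring(state_action_pairs, state, goal):
    """Approach 3: before each step, check that some state/action rule of
    the plan still covers the current state (pairs listed in plan order)."""
    while not (goal <= state):
        rules = [(s, a) for (s, a) in state_action_pairs if s <= state]
        if not rules:
            return None          # state not covered: replanning needed
        _, a = rules[-1]         # prefer the rule latest in the plan
        state = apply_action(a, state)
    return state
```

Note that execution monitoring keeps going from any covered state, while
action monitoring is tied to the literal action sequence.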
In the model discussed in class, a plan is converted into a set of
state/action pairs such that if any state/action rule is applicable in the
current state, it is applied. These state/action pairs can be computed
using the causal link structure of the plan. For a totally ordered plan ---

  "Execute any action whose cutset conditions hold in the current state.
  If there is more than one action whose cutset conditions hold, pick the
  action that occurs latest in the plan."

Here, the cutset of a step s is the set of causal links that either support
a condition at s, or whose source steps are necessarily before s and whose
destination steps are necessarily after s:

  cutset(s) = { c | s1 --c--> s2, s2 = s OR s1 < s < s2 }

Notice that the state/action pairs constructed from the plan describe a
finite-state automaton, called the execution automaton, in which the goal
state is the terminating state. Any evolution of the automaton corresponds
to one possible behavior exhibited by the plan. As long as the current
state of the world is covered by the automaton, the agent knows what to do.

This already allows some flexibility during execution. For example, after
executing action A, it is not necessary that the world be in the state that
is supposed to hold after A in the plan. As long as the world is in one of
the covered states, we still know how to get to the goal state.

The notion of the execution automaton can be extended quite easily to
partially ordered plans. The straightforward way of doing this is to treat
the partially ordered plan as equivalent to the set of totally ordered
plans that correspond to its topological sorts, compute the execution
automata of all those plans, and union them to get the automaton of the PO
plan.
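The cutset rule for totally ordered plans can be sketched directly. The
representation here is an assumption of the sketch: steps are identified by
their position in the plan, and a causal link is a triple (s1, c, s2)
saying that step s1 supplies condition c to step s2.

```python
# Sketch: cutsets of a totally ordered plan from its causal links.
# Steps are their positions in the plan (0, 1, 2, ...); a causal link
# (s1, c, s2) records that s1 supplies condition c to s2 (hypothetical
# encoding; the start state can be modeled as a pseudo-step -1).

def cutset(step, links):
    """Causal links that support a condition at `step`, or that span it
    (source strictly before, destination strictly after)."""
    return {(s1, c, s2) for (s1, c, s2) in links
            if s2 == step or s1 < step < s2}

def executable_step(plan, links, state):
    """Pick the latest step of the plan whose cutset conditions all hold
    in the current state; None means the state is not covered."""
    candidates = [s for s in range(len(plan))
                  if {c for (_, c, _) in cutset(s, links)} <= state]
    return max(candidates) if candidates else None
```

A None result from `executable_step` is exactly the "current state not
covered" situation that triggers replanning below.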
Another way is to extend the definition of the cutset of a step as follows.

A binary partition of a plan P with respect to a step s is a pair of sets
of steps S1 and S2 such that

  -- S2 contains s
  -- S1 contains all steps that necessarily precede s
  -- S2 contains all steps that necessarily follow s
  -- All steps s' that are unordered w.r.t. s are distributed among S1 and
     S2 such that
       if s' is in S1, all of its necessary predecessors are also in S1
       if s' is in S2, all of its necessary successors are also in S2

Graphically, such a partition corresponds to drawing a cross section
through the PO plan graph.

A cutset of a step s in a partially ordered plan P is defined with respect
to each binary partition with respect to that step:

  cutset(s, S1, S2) = { c | s1 --c--> s2, s1 in S1 and s2 in S2 }

It is easy to see that this definition of cutset for partially ordered
plans is a strict generalization of the definition of cutsets for totally
ordered plans. In particular, a totally ordered plan has a single binary
partition with respect to each step, and thus a single cutset for that
step. A partially ordered plan may have an exponential number of cutsets
for a step, and thus an exponential number of situations in which it is
okay to execute that step. This should be considered a feature rather than
a bug: a partially ordered plan gives us more than one solution to the
problem, and thus provides a larger execution automaton that covers a
larger number of states of the world.

Replanning
----------

* If from the current state S none of the actions in the plan can be
  executed (i.e., the current state is not covered by the execution
  automaton corresponding to our plan), replanning is necessitated.

  One way of doing replanning is to go from the current state to the
  nearest state that is covered by the execution automaton. Suppose that
  state is S'.
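As a sketch, the search for the nearest covered state S' could be a
breadth-first search over the state space; the STRIPS-style action
encoding and all names here are assumptions of the sketch, not part of the
model discussed above.

```python
from collections import deque

# Sketch: breadth-first search from an uncovered current state to the
# nearest state covered by the execution automaton. States are frozensets
# of propositions; actions are STRIPS-style dicts (hypothetical encoding).

def nearest_covered_state(state, actions, covered):
    """Return (plan, S') where `plan` drives `state` to the nearest
    covered state S', or None if no covered state is reachable."""
    frontier = deque([(state, [])])
    seen = {state}
    while frontier:
        s, plan = frontier.popleft()
        if s in covered:
            return plan, s
        for a in actions:
            if a["pre"] <= s:
                s2 = frozenset((s - a["del"]) | a["add"])
                if s2 not in seen:
                    seen.add(s2)
                    frontier.append((s2, plan + [a["name"]]))
    return None
```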
Then replanning involves solving the problem [S, S'] and unioning the
execution automaton corresponding to that plan with the current one. One
feature of this approach is that as the agent continues replanning, it
increases the coverage of its execution automaton to account for more and
more states of the world. Thus the plan slowly becomes a policy.

It is also worth noting that to the extent the agent thinks the world is
not really all that static, and also realizes that it has some time left
before execution is to start, it can start extending the current policy
itself by considering states near the covered states that are _likely_ to
occur during execution. This is similar to the idea of envelope extension
in Markov Decision Processes (which we will talk about when discussing
probabilistic planning).

* Replanning can also be posed as a reuse problem. If plan P with initial
  condition I and goal G is not executable from the current state I', a
  new planning problem P' can be defined with initial condition I' and
  goal G, taking P as the reusable plan. This involves first fitting P to
  the new problem and then extending the skeletal plan.

Interleaving execution and planning
------------------------------------

Sometimes the agent may want to interleave planning and execution. This
may be necessary for a variety of reasons:

  -- If actions take a long time to execute, starting off an action can
     buy you time to complete the rest of your planning activity.

  -- Starting off execution may also help you deal with the predictive
     uncertainty of the world. Suppose the actions are so non-deterministic
     that predicting what happens after a sequence of actions is too risky
     (i.e., the predictions have low probability). In such a case, you may
     want to cut down the risk by executing the actions that are ready,
     and thereby get higher-quality information about the evolution of the
     world that can be used by the later parts of the plan.
  -- Finally, execution can also be useful for gathering information
     during planning. This we will discuss when we get to sensing.

The question is: when is it right to execute an action? Suppose we have an
action s:A that is part of the current partial plan P. We can only execute
A if its preconditions are satisfied in the current state. But this in
itself is not enough. In executing A, our hope is that A is the first step
of a solution that can be reached by refining P. If this is not the case,
then the next best thing is that A is part of _some_ solution (even if it
cannot be reached by refining P). The worst scenario is that A is part of
no solution whatsoever. In this case, by executing A the agent has painted
itself into a corner.

Formally, the following scenarios decide whether the agent will have to
backtrack having executed action A, and whether it will still retain
completeness having executed action A.

*** Avoid backtracking from P (in order of preferability, and in order of
    decreasing strength):

  -- A is the first action of a minimal solution candidate of P
  -- A is the first action of a solution candidate of P (minimal or
     non-minimal)
  -- A is the first action of some _candidate_ of P

  [The first two are harder to ensure. The third one is possible to
  ensure, and this is what we do.]

*** Avoid losing completeness (in order of preferability, and in order of
    decreasing strength):

  -- A is the first action of some minimal solution for the problem
     (whether it is a candidate of P or not)
  -- A is the first action of some solution (minimal or non-minimal),
     whether it is a candidate of P or not

Reversibility and Completeness
------------------------------

A necessary condition for A not being the first action of any solution,
minimal or non-minimal, is that A is irreversible. That is, some condition
changed (added/deleted) by A cannot be re-achieved (deleted/added back).
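One cheap sufficient (but not necessary) test follows from the definition:
if A deletes a condition that no action in the domain adds, nothing can
restore it. A minimal sketch, assuming STRIPS-style add/delete lists (the
encoding is hypothetical):

```python
# Sketch: sufficient syntactic test for irreversibility, assuming
# STRIPS-style actions. If A deletes a condition that no action in the
# domain can add back, A is surely irreversible. A False result proves
# nothing: deciding reversibility in general is as hard as planning.

def surely_irreversible(action, domain_actions):
    addable = set().union(*(a["add"] for a in domain_actions))
    return bool(action["del"] - addable)
```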
Note that irreversibility is only necessary, not sufficient, for loss of
completeness, since the conditions irreversibly deleted by an action may
not be relevant to the goals of this particular plan.

Sometimes it is easy to show that an action is irreversible. Suppose an
action A deletes condition C, and no other action in the domain has C in
its add list; clearly, A is an irreversible action. Of course, showing
that an action is irreversible is in general as hard as planning itself.
For example, suppose the action A has deleted a condition C, and suppose
there is an action B which does give back condition C, but B's
preconditions cannot be achieved in the current problem. Then A is
essentially irreversible as far as this problem is concerned. The only way
of finding this out, however, is to try to plan a path from the state
after A's execution to a state where the deleted condition is added back.
This is as costly as planning itself.

Nevertheless, in most realistic domains we do know that some actions are
definitely irreversible (e.g., bomb the place, destroy the block, etc.),
and the agent can avoid executing these actions.

Backtracking
------------

Just because an action is reversible doesn't necessarily mean that it is
worth executing. In particular, apart from completeness, we also want
that once we execute an action from plan P, we be able to refine P into a
solution. As described above, a necessary condition for this is that A be
the first action of a candidate of the plan. We can ensure this by showing
that there exists a safe ground linearization of the plan such that A is
the first step.

We can do this without explicitly enumerating safe ground linearizations.
Specifically, we make sure that A is not necessarily preceded by any other
action (since if it is, it can't be the first action of any safe ground
linearization).
We also make sure that A is not involved in any unresolved conflict (since
if it is, then it is possible that A is not the first action in any SGL).
Finally, we also make sure that all preconditions of A have causal links
supporting them. This is to ensure that A is actually applicable in the
current state. Once these conditions are satisfied, we know that A can be
the first step of a candidate of the plan, and that A is applicable in the
current state.

Having executed the action, we now get to see the new world state I' that
holds after A's execution. We should now "fit" the rest of the partial
plan P so that it is consistent with the changed initial state. This can
be seen as a fitting operation that fits P to the new problem [I', G]
(where G is the same as the old goal). The fitting operation involves
removing unsupported causal links (converting their conditions into open
conditions) or unnecessary causal links, extending causal links to exploit
any serendipitous effects of execution, and removing any purposeless steps
(i.e., those that don't support any causal links) from the plan. At this
point, the fitted plan is further refined. [If the refinement fails, we
will eventually backtrack.]

When a step from a plan P is executed, extra commitment has been made to
the portion of the search space below P. It typically makes no sense to
search other portions of the search space, since you have already
committed to executing this plan (undoing the executions below this plan
may actually be costly). We should go to other parts only when this plan
starts looking very unpromising.

* The bottom line of the whole discussion on planning and execution is:
  if failure in a world is benign, the agent might want to wait for
  failures to occur rather than plan for failures at the outset.

---------------------------------------------------------------
Points to ponder:

1. How is executing an action from a partial plan, and then fitting the
   rest of the plan, different from just doing FSS planning all the time?

2. How is it different from UCP, which does PSS and FSS refinements
   together?