---------
Planning: How do we generate a sequence of actions, given an initial state and a goal state?

The most obvious thing to do is to use forward state space planning:

Search Queue: {Initial state}
Loop
  -- Pick a state from the search queue.
  -- If it is a goal state, terminate and return the path from the initial state to this state.
  -- Apply all applicable actions in the state.
  -- "Progress" the state through each operator application.
  -- Put each of the resulting states in the search queue.
End.

-----
Problems with the forward search:

IN ANY REAL-WORLD SITUATION, THERE ARE WAY TOO MANY _DOABLE_ ACTIONS COMPARED TO THE ACTIONS THAT ACTUALLY GET US ANYWHERE TOWARDS THE GOAL.

-- We can try to use some heuristic to decide which of the states is closer to the goal.
   -- Not enough, since you still have to generate all the states before evaluating them...
-- We can do the search in the backward direction -- i.e., start from the goal state and apply operators in the backward direction.

----BACKWARD SEARCH

An operator O is applicable to a state S in the backward direction if:
  -- The postcondition formula of O is consistent with S.
  -- There exists at least one variable that is named both in the postcondition formula of O and in the state S.

The state resulting from applying O in the reverse direction (also called the result of regressing S through O) is:
  -- The precondition formula of O, plus the variable-value assignments of every state variable that is not mentioned in the postconditions of O but is in the state S.

Termination criterion: the current backward state's partial assignment is consistent with the variable assignment in the initial state.

The backward search has a much smaller branching factor than the forward search in most realistic domains.

THERE ARE STILL REASONS TO FAVOR FORWARD SEARCH OVER BACKWARD SEARCH. One such reason is that the states produced in the forward search are complete, while those produced in the backward search are partial assignments (this is because the goal state is a partial assignment while the initial state is a complete assignment).

The fact that the states are complete in the forward state space search has some advantages. In particular, it becomes easier to evaluate the truth of complex logical formulas (comprising sentences on the variable values) in complete states. Checking the value of a logical sentence given a complete state -- or INTERPRETATION -- reduces to doing boolean algebra and is linear in the size of the sentence. On the other hand, checking the value of a sentence in a partially specified state is equivalent to doing theorem proving (e.g., given a database of sentences that are known to be true, does a new sentence S have the value true? false? unknown?) and is semi-decidable.

Bacchus and Kabanza, in their paper to be presented at EWSP-95, titled "Using temporal logic to control search in a forward chaining planner", argue that the fact that one can do model checking instead of theorem proving to compute the truth of large formulas makes it easier to control forward searching state space planners than backward searching or plan-space planners.
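[To make the progression/regression distinction above concrete, here is a minimal Python sketch. It is not from the notes: the Operator class, the dict-of-variable-assignments representation of states, and the breadth-first queue are illustrative assumptions.]

    from collections import deque

    class Operator:
        def __init__(self, name, pre, post):
            # pre/post are dicts of variable -> value (precondition and postcondition formulas)
            self.name, self.pre, self.post = name, pre, post

    def applicable(op, state):
        # Forward applicability: every precondition holds in the (complete) state
        return all(state.get(v) == val for v, val in op.pre.items())

    def progress(state, op):
        # Apply op in the forward direction: copy the state and overwrite op's effects
        new = dict(state)
        new.update(op.post)
        return new

    def satisfies(state, goal):
        # The goal is a partial assignment; check it against the complete state
        return all(state.get(v) == val for v, val in goal.items())

    def forward_search(init, goal, ops):
        # Plain breadth-first forward state space search
        queue = deque([(init, [])])
        while queue:
            state, path = queue.popleft()
            if satisfies(state, goal):
                return path                    # terminate: return the action sequence
            for op in ops:
                if applicable(op, state):
                    queue.append((progress(state, op), path + [op.name]))
        return None

    def regress(goal, op):
        # Apply op in the backward direction (regress the goal through op):
        # op must be consistent with the goal and give at least one goal variable;
        # the result is op's preconditions plus the goal variables op does not mention.
        if any(goal.get(v) not in (None, val) for v, val in op.post.items()):
            return None
        if not any(v in goal for v in op.post):
            return None
        new = {v: val for v, val in goal.items() if v not in op.post}
        new.update(op.pre)
        return new

The same helpers are reused in the later sketches in these notes.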
Now, given that forward search has advantages, the question is how we control its branching factor. We do the following:
  -- Attempt to isolate those actions that are somehow known to be relevant to the goals.
  -- An operator O is relevant if it either gives one of the goals, or if it can give one of the preconditions of one of the relevant operators (this requires a recursive analysis).

Thus we can have a separate procedure, Compute-Relevant-Ops(), which takes the set of goals of the current problem and returns a list of operators that are both _relevant_ to the goals and _applicable_ in the current state. This computation can be done by the following recursive procedure (it may be buggy, but you get the general idea):

Global variable: *applicable-ops*

ComputeRelevant(Goals, CurrentState)
  Foreach goal g in Goals do
    Foreach operator Op that can give goal g do
      Begin
        If Op is applicable in CurrentState,
           push Op onto the *applicable-ops* list
        Otherwise (i.e., Op is not applicable),
           ComputeRelevant(Preconditions(Op), CurrentState)
      End

This builds a structure like the following:

          g1               g2        g3
        /    \           / | \       /\
      O1      O2*       etc..      etc.
    /  |  \
   p   g1   r

   (the starred operators are the ones applicable in the current state)

Now, it is possible to make the ComputeRelevant procedure more sophisticated in many ways -- for example, we can detect situations where an operator O1 gives g and requires P, while O2 gives P and requires g. To avoid looping in this case, we can prune any branch on which the same subgoal repeats. Thus, the final graph structure constructed by the ComputeRelevant procedure will have either terminated branches, or branches that end in operators which are applicable in the current state. Any one of these operators is relevant to the top-level goals, and in general there will be fewer of them than the number of operators that are directly applicable in the initial state. For completeness, all of these relevant operators must be considered in generating the next possible states. If you are doing depth-first search and want to decide which of these operators is more likely to lead to a solution quickly, it is typically useful to use the length of the path from the root of the tree to that operator as a heuristic measure of the operator's merit.

(The ComputeRelevant procedure is an abstracted version of one developed by Drew McDermott in his recent paper. McDermott calls these graph structures "greedy regression graphs", and shows that using the greedy regression graphs to isolate relevant actions, and ranking them in terms of path length, leads to VERY efficient planning -- better than most existing planners -- on certain domains.)
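[A minimal Python sketch of the ComputeRelevant idea, reusing the Operator and applicable helpers from the earlier sketch. It is not the notes' code: the gives helper, the depth bookkeeping, and the repeated-subgoal pruning set are illustrative assumptions.]

    def gives(op, var, val):
        # True if applying op assigns the goal variable its desired value
        return op.post.get(var) == val

    def compute_relevant(goals, state, ops, seen=frozenset(), depth=0, found=None):
        # Recursively collect operators that are relevant to the goals and applicable
        # in the current state; record the depth at which each one was reached.
        if found is None:
            found = []                                # list of (operator, depth) pairs
        for var, val in goals.items():
            if (var, val) in seen:                    # same subgoal repeats on this branch: prune
                continue
            for op in ops:
                if not gives(op, var, val):
                    continue
                if applicable(op, state):
                    found.append((op, depth))         # applicable leaf of the regression graph
                else:                                 # recurse on the operator's preconditions
                    compute_relevant(op.pre, state, ops,
                                     seen | {(var, val)}, depth + 1, found)
        return found

The depth recorded with each applicable operator plays the role of the path-length heuristic mentioned above: operators reached through shorter chains of subgoals are tried first.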
-----INTERLEAVING THE COMPUTERELEVANT PROCEDURE WITH FORWARD OP APPLICATION
The Means Ends Analysis planning.

Now, since the ComputeRelevant procedure is costly -- it has to construct this regression graph -- you may get the idea that it may be better to "interleave" the regression graph computation with the "operator application". Planners which do this can be called "means ends analysis" planners. Historically, the first state space planners -- including STRIPS -- were means ends analysis planners (and didn't exactly realize that they were interleaving the ComputeRelevant and forward application procedures).

-- If your ComputeRelevant procedure is complete (i.e., it finds all recursively relevant operators), then running it as a co-routine with the forward application does not actually affect the completeness of the planner. One headache, of course, is that since we are advancing the state forward, an operator that was considered applicable in the previous state may no longer be applicable in the new state. However, this is not a real problem once we note that, in this type of interleaving, the graph structure being computed by the ComputeRelevant procedure is not exactly a part of the plan -- the plan is still the sequence of operators that have been applied in the forward direction so far. Thus, we can carry the inapplicable operators along, continuing the recursion on their preconditions, UNTIL the planning stops --- which it does when the current state is a superset of the goal state (equivalently, the goals are true in the current state, or the assignment of the current state is consistent with the assignment of the goal state). An example of a planning system that works like this is the FLECS/PRODIGY 4.0 system from Carnegie Mellon University.

-- Surprisingly enough, the first planner to interleave the compute-relevant procedure with the operator application -- the STRIPS planner -- prematurely terminates certain branches of the compute-relevant procedure, thus leading to incompleteness.

Although the above paragraph makes the STRIPS planner sound like it was using a convoluted approach, the approach actually sounds quite natural. Specifically, the approach involves selecting a goal, recursing on getting to a state where that goal is true, and starting from that state to go towards the goal state.

Init-state ------------------------------------------------------- Goal State

Init-state ---------------- State where g1 holds ----------------- Goal State
             (do this first)                   (recurse on this)

-- Use a stack structure to keep track of the state of the ComputeRelevant procedure. The stack originally contains the goals of the problem as a conjunction. The graph is constructed by picking the top of the stack.

Step 1: Pick the top of the stack.

Step 2: If the top of the stack is a goal condition:
    If the goal condition is true in the current state,
        pop the condition off the stack.   [**Source of incompleteness 1]
    Else, consider some operator O that can give that goal:
        Put O on top of the stack.
        Put O's preconditions, as a conjunction, on top of O.
        Put O's preconditions individually on top.
        Go back to Step 1.

    [If O has preconditions P, Q, R, the stack will have

        P
        Q
        R
        P&Q&R
        O

     The conjunctive condition is needed to ensure that after making P, Q and R true
     separately, they are still true _together_.]

Step 2': If the top of the stack is a conjunction of conditions:
    If the conjunction holds in the current state, pop it from the stack.
    If the conjunction doesn't hold in the current state, add the unsatisfied parts of the
    conjunction back onto the stack, on top of the conjunction.

    [[If, after working on P, Q and R, we find that only R is true in the current state --
      presumably because making R true made P and Q untrue -- we add P and Q back onto the
      stack, giving:

        P
        Q
        P&Q&R
        O    ]]

Step 3: If the top of the stack is an operator, it will be applicable in the current state.
    (This is because the only way an operator can reach the top of the stack is if all its
    preconditions, which sat above it on the stack, have been worked on and are together true
    in the current state.)
    Apply the operator to the current state (thus advancing the state), and pop the stack.
    [[*Another source of incompleteness* -- need to consider delaying the operator application.]]
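[A compact Python sketch of the goal-stack control loop just described, reusing the earlier Operator/applicable/progress helpers. It is not STRIPS itself: the choice of "some operator that can give the goal" is simply the first one found, with no backtracking and no delayed applications -- which is exactly where the two sources of incompleteness marked above show up -- and it assumes some operator exists for every subgoal.]

    def strips_goal_stack(state, goals, ops):
        # Goal-stack planning: stack entries are single goals (var, val), conjunctions
        # (dicts of var -> val), or operators.
        plan = []
        stack = [dict(goals)]                               # problem goals as a conjunction
        stack += [(v, val) for v, val in goals.items()]     # and individually on top
        while stack:
            top = stack.pop()
            if isinstance(top, Operator):                   # Step 3: apply it; it is applicable now
                state = progress(state, top)
                plan.append(top.name)
            elif isinstance(top, dict):                     # Step 2': a conjunction of conditions
                unsat = {v: val for v, val in top.items() if state.get(v) != val}
                if unsat:                                   # push it back, unsatisfied parts on top
                    stack.append(top)
                    stack += list(unsat.items())
            else:                                           # Step 2: a single goal condition
                var, val = top
                if state.get(var) == val:
                    continue                                # [source of incompleteness 1]
                op = next(o for o in ops if o.post.get(var) == val)   # "some" operator, no backtracking
                stack.append(op)
                stack.append(dict(op.pre))                  # preconditions as a conjunction
                stack += list(op.pre.items())               # and individually on top
        return plan, state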
NOTICE that this computation is similar to the ComputeRelevant procedure -- but for two important differences:
  -- It decides not to consider operators for a goal that is already true in the current state.
  -- It decides to apply an operator as soon as it becomes applicable.

It has the nice flavor of picking one goal at a time, working on achieving that goal completely, and then working on the next goal. However, this hardcoded interface between the ComputeRelevant and forward application procedures makes the overall algorithm incomplete. As we discussed in class, this algorithm -- known as the STRIPS PLANNING ALGORITHM -- fails to find a solution for the register swapping problem, and finds only a suboptimal solution for the Sussman anomaly problem.

---STRIPS and SERIALIZABILITY

Given that STRIPS can't solve some problems but can solve others, an interesting question is what type of problems can be solved by STRIPS. This is the class of problems with _serializable subgoals_, i.e., a set of subgoals such that there exists some permutation of the subgoals such that we can work on the individual subgoals in that order and concatenate the subplans, giving rise to a complete solution for the overall problem. Unfortunately, finding whether a given set of subgoals is serializable turns out to be just as hard as letting STRIPS try to solve the problem and seeing whether it gets solved.

------More on STRIPS--------

Conceptually:

   I ------------------------------------------------ G       Original planning problem

   Decide to use a relevant operator O; the problem reduces to two recursive planning problems:

   I ------------ prec(O)   [apply O]   State after O ------------ G

   {The state after O is given by first simulating the plan to go from I to prec(O), and then
    applying O to the resulting state.}

Function MEA(I, G, Plan) returns Plan
    If G is a subset of I,
        terminate and return Plan.
    Find an action O that is relevant to G - I.     {backtrack point; need to consider all actions}
    If O is applicable in I then
        I <- O(I)
        Plan <- Plan + O
        MEA(I, G, Plan)                              {recursive invocation}
    Else {O is not applicable in I}
        {Plan to make it applicable}
        Plan' <- MEA(I, prec(O), null)
        Plan <- Plan + Plan'
        I <- result of applying Plan' to I
        I <- result of applying O to I               {the last two steps compute the state after O}
        Plan <- Plan + O
        {Now complete the rest of the goals}
        MEA(I, G, Plan)

This is still forward planning because the operators are being applied in the forward direction. The process can be seen as recursively reducing the differences between the current initial state and the goal state.

One of the useful properties of the MEA approach is that it can exploit any independence between subgoals. In particular, suppose G = {g1, g2, ..., gn}, and the goals are all independent. Then STRIPS can exploit this by reducing the differences one by one, without ever backtracking. Thus, if it does m units of work to achieve one goal, the total work it takes to achieve all n goals will simply be m * n, instead of m ^ n (as would be the case with normal forward search). In the real world, many goals are reasonably independent of each other, while others are dependent. Exploiting subgoal independence where it is available is a very important requirement for efficient planning.
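[A minimal Python sketch of the MEA recursion above, again reusing the earlier helpers. It is not the notes' code: the relevant action is simply the first operator whose effects reduce the difference, with no backtracking, so -- like STRIPS -- this greedy variant is incomplete and can loop forever on non-serializable goals such as register swapping.]

    def mea(state, goals, ops, plan=None):
        # Means-ends analysis: recursively reduce the difference between the
        # current state and the goal state (greedy, no backtracking).
        if plan is None:
            plan = []
        if satisfies(state, goals):                  # G is a subset of I: terminate, return plan
            return state, plan
        diff = {v: val for v, val in goals.items() if state.get(v) != val}
        # "Find an action O relevant to G - I": first operator giving some unsatisfied goal
        op = next(o for o in ops if any(o.post.get(v) == val for v, val in diff.items()))
        if not applicable(op, state):
            state, plan = mea(state, op.pre, ops, plan)   # plan to make O applicable first
        state = progress(state, op)                  # I <- O(I)
        plan.append(op.name)                         # Plan <- Plan + O
        return mea(state, goals, ops, plan)          # now complete the rest of the goals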
-------SLIDE----------------------------------------------------------
Idea: Assume that subgoals are independent. But watch out for problems.
      Make a plan assuming independence.
      Check the plan.
      If the plan is wrong, then do some corrective action.
-------------------------------------------------------------------------
The algorithm won't do much worse than the one which does not try to exploit independence.

Example: The two goals A-on-B and B-on-C are not independent -- we have to put B on C first, and then A on B.

    Initial State                 Goal State
                                    [A]
                                    [B]
    [A]  [B]  [C]                   [C]
====================================
Solution (working on A-on-B first, then correcting):

  1)  [A]  [B]  [C]          (initial state)

  2)  [A]
      [B]  [C]               (A on B achieved)

  3)  [A]  [B]  [C]          (A moved back to the table, to free B)

  4)       [B]
      [A]  [C]               (B on C achieved)

  5)  [A]
      [B]
      [C]                    (A on B re-achieved; both goals hold)

In order to have completeness, we may have to work on a goal more than once. We can use either of the following approaches:
  1) Backtrack on the order in which differences are removed, or
  2) Keep removing differences as long as they exist.

ALTHOUGH STRIPS/MEA IS A NICE ALGORITHM, IT HAS SOME STRONG LIMITATIONS. IN PARTICULAR, SINCE IT REDUCES DIFFERENCES ONE AT A TIME, IT WON'T WORK ON PROBLEMS WHERE THERE IS NO SINGLE ORDER IN WHICH THE GOALS CAN BE ACHIEVED WITHOUT VIOLATING THE PREVIOUSLY ACHIEVED GOALS. [SUCH PROBLEMS ARE SAID TO HAVE NON-SERIALIZABLE SUBGOALS.] GIVEN A PROBLEM WITH NON-SERIALIZABLE SUBGOALS, STRIPS WILL EITHER FAIL TO SOLVE THE PROBLEM, OR END WITH A SUBOPTIMAL SOLUTION.

Consider the following example, which cannot be solved by this algorithm: the register swapping problem.

    Initial State        Goal State
      X = A                X = B
      Y = B                Y = A
      Z = C

It is easy to see that we can neither work on X=B first and Y=A next, nor work on Y=A first and X=B next (since we will overwrite a value in either case). Subgoal interleaving solves this problem.
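[To see the interleaved solution concretely, here is a tiny demonstration using the forward_search sketch from the beginning of these notes. Encoding the register-copy actions as fixed-value operators (Z := X, X := Y, Y := Z) is an illustrative assumption, made only so the simple state-variable machinery above can run this instance.]

    # The three copy actions that matter for this instance, as fixed-value operators
    copy_x_to_z = Operator("Z := X", {"X": "A"}, {"Z": "A"})
    copy_y_to_x = Operator("X := Y", {"Y": "B"}, {"X": "B"})
    copy_z_to_y = Operator("Y := Z", {"Z": "A"}, {"Y": "A"})

    init = {"X": "A", "Y": "B", "Z": "C"}
    goal = {"X": "B", "Y": "A"}

    print(forward_search(init, goal, [copy_x_to_z, copy_y_to_x, copy_z_to_y]))
    # -> ['Z := X', 'X := Y', 'Y := Z']
    # The subgoals X=B and Y=A are achieved only by this interleaved plan; no plan that
    # completes one subgoal before doing any work towards the other can succeed here.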