RECONSTRUCTING THE GRAPHPLAN ALGORITHM FROM FORWARD PROJECTION
Subbarao Kambhampati rao@asu.edu
(Draft working notes ASU CSE xxx) [Oct 15, 1995]

Graphplan is the first realistic contender that has emerged for the refinement planning paradigm. An implementation of Graphplan in C seems to be the most efficient classical planner around now -- it solves in seconds problems that are almost impossible for the traditional refinement planners without search control. The Graphplan algorithm looks very different from the planners we have studied until now. Part of this is because the authors, Blum and Furst, are either not very conversant with the planning literature (which turned out to be good, since they were not biased by it), or didn't have sufficient space in the IJCAI paper to elaborate on the connections with previous work. However, since we did spend a fair bit of time understanding normal refinement planners, and are probably already biased by them, it does make sense for us to try to derive the Graphplan ideas from the existing ones.

In these notes, I (Rao) will show that Graphplan can best be understood as doing a forward projection in which the search space size is reduced by keeping the projected states in a compact "unioned" format. The ensuing computational tradeoff seems to be a good one, as is evidenced by the results of Graphplan. The realization of the connections between forward projection and Graphplan also gives us a better perspective on the types of extensions that can be made to Graphplan.

Motivating Graphplan from Forward Projection:

At the heart of Graphplan is a forward projection stage where actions are applied to a proposition list to generate new proposition lists.
These proposition lists are not world states (in that not every proposition in the prop list may hold in the world at any one time), but are best seen as the union of the propositions from all possible states that come up after a fixed number of projections from the initial state. Since the prop list doesn't correspond to real states, even if the goals of the problem are present in the prop list at some stage i, it doesn't necessarily mean that there is actually a plan of i steps -- we need to search backwards to see if one does exist. This _backward_ projection step can be made cheaper by keeping information about which subsets of the proposition list can actually co-exist. This is done by keeping "mutual exclusion" relations and propagating them.

Let us now elaborate this idea by re-visiting forward projection. Consider the following example where the initial state contains the propositions R and W. There are six actions, O1, O2, ..., O6, with preconditions and effects as follows:

(R)     O1  (P, !R)      [preconditions on the left, effects on the right]
(W)     O2  (Q, !W)
(W)     O3  (M, !R, !W)
(P,Q)   O4  (Z)
(Q,W)   O5  (J)
(Z,W,R) O6  (K)

Now consider the process of forward projecting these actions from the initial state. Recall that projection involves applying every applicable action to the initial state to compute new states, and applying operators to the resultant states iteratively. Planning ends when one of these states contains all the goals. (This is the usual FSS.)

We can make this projection process more efficient by considering together the sets of actions that are independent of each other (i.e., they don't delete each other's preconditions or added effects). In particular, although both O1 and O2 are applicable in the initial state (R,W), rather than consider them in separate projection branches we can consider them _together_. In particular, we can apply the action set (O1,O2) together to the state (R,W) to get the state (P,Q).
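For concreteness, the example domain and the basic projection step (one applicable action per branch, as in plain FSS) can be sketched as follows. This is my sketch, not the authors' code; the set-triple encoding of actions is an assumption of the sketch:

```python
# Each action is (preconditions, adds, deletes) -- an encoding assumed
# for this sketch, not Graphplan's own representation.
ACTIONS = {
    "O1": ({"R"}, {"P"}, {"R"}),
    "O2": ({"W"}, {"Q"}, {"W"}),
    "O3": ({"W"}, {"M"}, {"R", "W"}),
    "O4": ({"P", "Q"}, {"Z"}, set()),
    "O5": ({"Q", "W"}, {"J"}, set()),
    "O6": ({"Z", "W", "R"}, {"K"}, set()),
}

def apply_action(state, action):
    """Progress a state through one action (classical STRIPS semantics)."""
    pre, add, dele = action
    assert pre <= state, "action not applicable"
    return (state - dele) | add

def project(state):
    """One level of plain forward projection: one branch per applicable action."""
    return {name: apply_action(state, act)
            for name, act in ACTIONS.items() if act[0] <= state}
```

Projecting from the initial state (R,W) this way gives three separate branches: O1 yields (P,W), O2 yields (R,Q), and O3 yields (M).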
The semantics here is that irrespective of the actual order in which O1 and O2 are applied to the state (R,W), the final state is guaranteed to be (P,Q).

Considering independent actions together rather than separately has some computational advantages from the forward projection point of view. First off, it reduces the branching factor of the projection space -- instead of two branches, one for O1 and one for O2, we have one branch. Another advantage, which may not be obvious at first glance, is that by applying sets of independent actions together we can effectively reduce the _depth_ of the projection tree. To see the last point, suppose our goal was Z. Only O4 is capable of giving Z, and it requires both P and Q to be true. If we apply O1 and O2 together to the state (R,W), then we can directly apply O4 in the resultant state, thus effectively stopping projection in two levels. If we applied O1 and O2 separately to (R,W), then we would have had to apply O2 and O1 respectively to the resultant states before we would get to a state where O4 is applicable (thus increasing the depth of projection). To summarize, projecting independent actions together is just a darned good idea. This idea has actually been around in the planning literature for a while. Drummond, in his KR-89 paper on learning situated control rules, describes his projection process in these terms.

Now, consider the process of doing projection as described above, but with one additional augmentation -- apart from the six actions described previously, we will add a bunch of no-ops, one for each state, which take the contents of the state as their preconditions, and give the contents of the state back as their effects. Thus, there is one no-op applicable in any state.
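The independence test and the joint application of an independent action set can be sketched as follows (same hypothetical set-triple encoding as before: an action is (preconditions, adds, deletes)):

```python
O1 = ({"R"}, {"P"}, {"R"})
O2 = ({"W"}, {"Q"}, {"W"})
O3 = ({"W"}, {"M"}, {"R", "W"})

def independent(a, b):
    """Two actions are independent if neither deletes the other's
    preconditions or added effects (the condition in the text)."""
    (pre_a, add_a, del_a), (pre_b, add_b, del_b) = a, b
    return not (del_a & (pre_b | add_b)) and not (del_b & (pre_a | add_a))

def apply_set(state, actions):
    """Apply a set of pairwise-independent, applicable actions; the
    resulting state is the same for every interleaving order."""
    for pre, _, _ in actions:
        assert pre <= state
    for pre, add, dele in actions:
        state = (state - dele) | add
    return state
```

Here independent(O1, O2) holds, so apply_set({"R","W"}, [O1, O2]) gives {"P","Q"}; O2 and O3, by contrast, interfere (each deletes W, a precondition of the other), so they must stay in separate branches.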
[[The addition of these no-ops, while not contributing to the search efficiency of forward projection, is useful in understanding the derivation of the Graphplan algorithm later.]]

Here is a projection tree resulting from this kind of projection:

                      (R,W)                          -- level 0 state (init)
         /              |               \
      o1,o2             o3               no-op       -- actions applied at level 1
        |               |                 |
      (P,Q)            (M)              (R,W)        -- level 1 states
     /     \            |             /   |    \
    o4     nop         nop        o1,o2   o3    nop  -- level 2 actions
    |       |           |           |     |      |
 (Z,P,Q)  (P,Q)        (M)        (P,Q)  (M)   (R,W) -- level 2 states

 ... continues ...

This projection tree continues until we encounter a state which contains all our goals. As soon as such a state is found, the plan is directly output as the path from the initial state to this state. If our goal is Z, we can terminate with the leftmost state at level 2, and the path will be [o1,o2] -- o4. The semantics of this is that in the first step we can execute o1 and o2 in any arbitrary order, and in the second step we execute o4.

Note that the plans that are output have a special form of parallelism -- at any given time step many independent actions may be executed in parallel. This is, however, very different from the kind of partial order found in _plan space_ plans. In particular, within a given branch of the projection tree, the positioning of the operators with respect to time steps is strictly fixed (recall from UCP that state-space refinements have to do this to derive the state).

Another important aspect of searching for plans using this forward projection tree is that any search process which does a _level-by-level_ enumeration of the projection tree is guaranteed to find the SHORTEST plan in terms of the number of steps (this is quite straightforward to see). Of course, finding the shortest plan in terms of the number of time steps is not the same as finding an optimal plan with respect to some arbitrary objective function (which may, for example, assign higher cost to o1 and lower cost to o2, etc.).
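The level-by-level enumeration can be sketched as a breadth-first search over states, branching on every nonempty set of pairwise-independent applicable actions (again a Python sketch with the assumed set-triple action encoding; the no-ops are omitted here since they only add redundant branches to plain projection):

```python
from itertools import chain, combinations

ACTIONS = {
    "O1": ({"R"}, {"P"}, {"R"}),
    "O2": ({"W"}, {"Q"}, {"W"}),
    "O3": ({"W"}, {"M"}, {"R", "W"}),
    "O4": ({"P", "Q"}, {"Z"}, set()),
    "O5": ({"Q", "W"}, {"J"}, set()),
    "O6": ({"Z", "W", "R"}, {"K"}, set()),
}

def independent(a, b):
    return not (a[2] & (b[0] | b[1])) and not (b[2] & (a[0] | a[1]))

def bfs_project(init, goals):
    """Level-by-level forward projection; returns the first (hence
    shortest, in time steps) plan found, as a list of action sets."""
    frontier = [(frozenset(init), [])]
    seen = {frozenset(init)}
    while frontier:
        nxt = []
        for state, plan in frontier:
            if goals <= state:
                return plan
            app = [n for n in ACTIONS if ACTIONS[n][0] <= state]
            for sub in chain.from_iterable(
                    combinations(app, k) for k in range(1, len(app) + 1)):
                acts = [ACTIONS[n] for n in sub]
                if not all(independent(a, b)
                           for a, b in combinations(acts, 2)):
                    continue  # only pairwise-independent sets go together
                new = state
                for pre, add, dele in acts:
                    new = (new - dele) | add
                if new not in seen:
                    seen.add(new)
                    nxt.append((new, plan + [set(sub)]))
        frontier = nxt
    return None
```

On the example, bfs_project({"R","W"}, {"Z"}) returns the two-step plan [{"O1","O2"}, {"O4"}], matching the leftmost path of the tree above.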
Now, the BIGGEST problem with forward projection as the basis for planning is that the projection tree can grow exponentially quite easily. (From our discussion of forward state space search and MEA, note that some of this growth can be cut down by using the compute-relevant function to isolate only those actions that are recursively relevant to the top-level goals. However, even this will not be enough.) There are way too many states at each level of the projection tree, and each level is exponentially larger than the previous level -- the combinatorics can easily overwhelm any search strategy.

One way of getting a handle on the forward branching would be to somehow consider all the states in a given level _TOGETHER_, rather than as separate nodes in the search tree. In particular, seeing the states as sets of propositions, we can try to UNION the sets together to get the collection of propositions from which all the states in that level are made up. Such a union corresponds exactly to the _proposition list_ maintained by Graphplan's plan-graph structure at that level. Going back to our example, the prop list at level 0 is the union of just the init state, and is thus (R,W). At level 1, it is (P,Q,M,W,R), and at level 2, it is (Z,P,Q,M,W,R). If you run Graphplan and construct the plan graph for this example, these are exactly the proposition lists you will find. (Note: we will see below that in general, however, the plan-graph proposition lists may contain more propositions than the union of the states at the i-th level, because the plan graph may do some illegal projections.)

Given this observation, it is easy to see why a plan graph of i levels contains only i proposition lists, while a projection graph of i levels contains b^i states. The b^i states of the projection graph are unioned together to give rise to the i-th level proposition list.
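Growing the proposition lists directly (rather than unioning enumerated states) can be sketched as below. Note that without any mutex information this over-approximates the union: O5 fires off the illegal pair (Q,W) at level 1 and smuggles J into the level-2 list -- exactly the kind of illegal projection the parenthetical note above warns about. (Python sketch; set-triple action encoding assumed.)

```python
ACTIONS = [
    ({"R"}, {"P"}, {"R"}),           # O1
    ({"W"}, {"Q"}, {"W"}),           # O2
    ({"W"}, {"M"}, {"R", "W"}),      # O3
    ({"P", "Q"}, {"Z"}, set()),      # O4
    ({"Q", "W"}, {"J"}, set()),      # O5
    ({"Z", "W", "R"}, {"K"}, set()), # O6
]

def next_prop_list(props):
    """Next level's proposition list: no-ops carry every proposition
    forward, and every action whose preconditions all appear in the
    current list contributes its added effects.  With no mutex
    information this over-approximates the union of reachable states."""
    out = set(props)
    for pre, add, dele in ACTIONS:
        if pre <= props:
            out |= add
    return out
```

next_prop_list({"R","W"}) gives the level-1 list (P,Q,M,R,W); applying it again yields a level-2 list that contains Z but also the spurious J.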
Once we make this correspondence between the forward projection graph and Graphplan's plan graph, all the other details of the Graphplan algorithm can more or less be reconstructed as good algorithm design (but recall the twin platitudes -- god is in the details, and hindsight is 20-20). Let us now proceed to unravel these details.

Loosely speaking, the unioned proposition list at level i can be seen as a compact "disjunctive" representation of the b^i states at the i-th level of the corresponding projection graph. Doing this unioning is clearly a good idea in terms of curtailing the growth of the projection tree. However, the proposition list, being a union of the individual state propositions, doesn't have the concrete physical semantics of a state. In particular, an arbitrary subset of the proposition list at the i-th level may not be part of any realizable state at the i-th level. For example, (Q,W) is a subset of the prop list at the first level (P,Q,M,R,W). However, there is no state at the first level in the projection graph which contains Q and W together!

The fact that not every subset of the proposition list belongs to a valid state implies, of course, that we need to be careful both in growing the plan graph using proposition lists, and in terminating using the proposition lists. In particular, if (Q,W) is our goal for the problem, just because it is part of the prop list at level 1 doesn't mean that we can exhibit a plan of one step length. Similarly, since (Q,W) is not part of any legal state, we should not really be projecting the action O5 (which has preconditions Q,W) from this proposition list.

Of course, one dumb way of keeping the proposition list, while still realizing that (Q,W) cannot belong to any legal state at the corresponding level in the projection graph, would be to keep information about all legal states.
This is dumb because we will be keeping an exponential amount of additional information on top of the proposition list, and will essentially be doing forward projection anyway. Another idea is to remember the opposite -- i.e., remember all the subsets of a proposition list that DON'T belong to any legal state in the projection graph. This seems better since we actually need to keep only the minimal illegal subsets. In particular, if we know that (Q,W) is an illegal subset, then we know that any superset of (Q,W) is also an illegal subset -- thus we can reduce redundant storage. However, computing all minimal illegal subsets turns out to be just as hard as keeping all the legal states. In particular, if we have m propositions in the prop list, we have to consider all the 1-subsets, 2-subsets, ..., m-subsets that could be illegal. We may thus have to examine and store up to (m choose 1) + (m choose 2) + ... + (m choose m) = 2^m - 1 illegal subsets.

One way out of this dilemma is to not demand all illegal subsets, but just get by with a subset of the illegal subsets. In particular, it is useful to consider all the 2-sized illegal subsets alone. This is essentially a pair-wise incompatibility or mutex relation between propositions. There are several advantages in computing these:

1. There are at most m(m-1)/2 pairwise mutex relations in an m-sized proposition list (as against 2^m subsets in total).

2. The pairwise mutex relation can be computed in an incremental fashion -- starting with the actions at the initial level, we can consider any pair of actions that are not independent to be mutex. The mutex relation can then be propagated to the effects of those actions in the next proposition list. In particular, in the example we have been following, O2 and O3 are non-independent. Thus, M and Q are considered mutually exclusive in the first-level prop list.
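One level of this incremental mutex computation can be sketched as follows. The sketch follows Graphplan's own convention of one no-op per proposition (rather than one per state); the action names and set-triple encoding are assumptions of the sketch:

```python
def mutex_step(props, prop_mutex, actions):
    """One level of mutex propagation.  `actions` maps name -> (pre, add,
    delete) and should include one no-op ({p}, {p}, {}) per proposition
    in `props`.  Returns the next prop list and its pairwise mutexes."""
    # an action is applicable if its preconditions are all in the list
    # and no two of them are mutex with each other
    app = {n: a for n, a in actions.items()
           if a[0] <= props and not any(
               frozenset((p, q)) in prop_mutex
               for p in a[0] for q in a[0] if p != q)}
    # two actions are mutex if they interfere (one deletes the other's
    # preconditions or adds) or some pair of their preconditions is mutex
    act_mutex = set()
    names = sorted(app)
    for i, n1 in enumerate(names):
        for n2 in names[i + 1:]:
            a, b = app[n1], app[n2]
            interfere = (a[2] & (b[0] | b[1])) or (b[2] & (a[0] | a[1]))
            clash = any(frozenset((p, q)) in prop_mutex
                        for p in a[0] for q in b[0])
            if interfere or clash:
                act_mutex.add(frozenset((n1, n2)))
    # two propositions are mutex if every pair of their producers is mutex
    producers = {}
    for n, (pre, add, dele) in app.items():
        for p in add:
            producers.setdefault(p, set()).add(n)
    new_props = set(producers)
    new_mutex = {frozenset((p, q))
                 for p in new_props for q in new_props
                 if p < q and all(x != y and frozenset((x, y)) in act_mutex
                                  for x in producers[p] for y in producers[q])}
    return new_props, new_mutex

ACTS = {
    "O1": ({"R"}, {"P"}, {"R"}),
    "O2": ({"W"}, {"Q"}, {"W"}),
    "O3": ({"W"}, {"M"}, {"R", "W"}),
    "O4": ({"P", "Q"}, {"Z"}, set()),
    "O5": ({"Q", "W"}, {"J"}, set()),
    "O6": ({"Z", "W", "R"}, {"K"}, set()),
    "noop-R": ({"R"}, {"R"}, set()),
    "noop-W": ({"W"}, {"W"}, set()),
}
```

One step from ({R,W}, no mutexes) yields the level-1 list (P,Q,M,R,W), with (M,Q) and (Q,W) among the mutex pairs -- the two pairs derived in the text.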
This makes sense since O2 and O3 are in different branches, they lead separately to Q and M, and there is no single state that contains M and Q in the forward projection graph. Similarly, (Q,W) are mutex, since the no-op at the first level is not independent of O1 and O2 (which delete R and W), and this mutex propagates to the effects Q and W.

The pairwise mutex relations are able to answer the plan-existence question, as well as curtail useless projections in the plan graph, to some extent. In our example, since (M,Q) are known to be mutex, any problem with goals that include (M,Q) cannot have a solution at the first level of the plan graph (despite the fact that M and Q are part of the first-level prop list). Similarly, since (Q,W) are found to be mutex, there is no point in projecting O5 from the first-level prop list, and the Graphplan algorithm realizes this.

The tradeoffs in using pair-wise mutex
--------------------------------------

Although pair-wise mutex relations (or 2-sized illegal subsets) are easy to compute incrementally, and take up less space, assuming conservation of complexity we must be trading off something by using them instead of the full set of illegal subsets. The price we pay for this compactification is twofold:

1. We may still do some illegal projections. Even though we recognize all pairwise mutex propositions, we may not recognize arbitrary subsets of propositions (of size greater than 2) that are illegal. In particular, it is possible that P and Q are not mutex and Q and W are not mutex (in that there are legal states that contain each pair), but P,Q,W are together illegal (in that none of the states that contain P,Q or Q,W contain all three together). Since Graphplan won't realize this from the mutex relations, it will admit these illegal projections into the plan graph.
In our example, this happens when the Graphplan algorithm includes O6 among the level-3 actions, and the proposition K in the level-3 props, although it is easy to see that the forward projection graph will not be able to apply O6 anywhere from the level-2 states.

2. For roughly the same reasons as above, we may not be able to answer the plan-existence question when we have more than two subgoals.

From the point of view of planning, 2 is the central problem. Although 1 clutters the plan graph with illegal projections, as far as planning is concerned, it just makes the problem of 2 worse. We can handle 2 by doing a backward chaining on the proposition lists at different levels, through the actions that lead to those propositions. In particular, given a set of goals (g1...gn) that are present in the i-th level prop list, with none of the goals pairwise mutex in that prop list, the only way we can be sure that there is actually a plan of i steps that gives us g1...gn is to exhibit a path through the corresponding projection graph that terminates in a state containing g1...gn. (Note that such a backward-chaining phase is not necessary either in the forward projection graph, or in a plan graph that keeps _all_ illegal subsets of each prop list. The backward chaining is thus a fair price we should be willing to pay for curtailing the space blow-up of forward projection and the time blow-up of the full illegal-subset enumeration.)

We can do this by keeping track of the connections between the actions at stage i, their effects in the prop list at stage i, and their preconditions in the prop list at stage i-1. For this purpose, Graphplan keeps the actions and propositions at each level as individual vertices of a graph, and puts edges between an action and its effects and its preconditions. This graph structure can be used to efficiently traverse the plan graph backwards to answer the plan-existence question.
We can do this efficiently by using a recursive dynamic programming algorithm which exploits the mutex relations stored at each level. In particular, given the goals g1...gn to be satisfied at level i, we pick one action for each of g1 to gn from the actions at level i, such that no two of the chosen actions are pairwise mutex. Once this is done, the preconditions of these actions become the recursive subgoals at the previous proposition level. If everything goes well, this recursion peters out at the initial prop list (corresponding to the initial state). By the time the recursion ends, we will have identified a subgraph of the plan graph that corresponds to a valid path in the corresponding forward projection graph, and is thus a valid plan for our problem. Any time we are unable to find actions to cover all the subgoals of the current level without violating mutex relations, we know that the subgoals at the current level cannot be made true. We then backtrack to the previous level, and check to see if the subgoals at that level can be achieved through a different set of actions. If they can't, the subgoals at that level fail too, and the backtracking continues.

The failures of the recursion can be cached to improve performance further. In particular, when we find that the recursion fails on some set of subgoals g1...gm at level i, and we are about to backtrack to level i+1, we know that g1...gm is an illegal subset of the prop list at level i. Normally, we store only the 2-sized illegal subsets to reduce computation. But since this particular subset thrust itself upon us as a side effect of back-chaining, we may as well store it (on the off chance that some set of goals g1...gn that contains g1...gm needs to be satisfied at this level in future recursions, in which case we can shortcut the computation by recognizing that the set is unsatisfiable). Graphplan thus memoizes these failing goal subsets.
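The recursive backward search with memoized failures can be sketched as follows. The `producers`/`act_mutex` tables standing in for the plan-graph vertices and edges are assumptions of the sketch, here hand-built for the running example:

```python
def extract(goals, level, producers, act_mutex, init, memo):
    """Backward-chain from `goals` at `level`.  producers[i][p] lists the
    (action, preconditions) pairs that give p at level i; act_mutex[i] is
    the set of mutex action pairs at level i.  Returns a list of action
    sets (one per level) or None; failing goal sets are memoized."""
    if level == 0:
        return [] if goals <= init else None
    key = (level, frozenset(goals))
    if key in memo:                      # known-failing subgoal set
        return None
    glist = sorted(goals)

    def cover(i, chosen):                # chosen: action name -> preconds
        if i == len(glist):
            subgoals = set().union(*chosen.values()) if chosen else set()
            sub = extract(subgoals, level - 1, producers, act_mutex,
                          init, memo)
            return None if sub is None else sub + [set(chosen)]
        for name, pre in producers[level].get(glist[i], ()):
            if any(frozenset((name, c)) in act_mutex[level]
                   for c in chosen):
                continue                 # clashes with an earlier choice
            fresh = name not in chosen
            chosen[name] = pre
            plan = cover(i + 1, chosen)
            if plan is not None:
                return plan
            if fresh:
                del chosen[name]
        return None

    plan = cover(0, {})
    if plan is None:
        memo.add(key)                    # Graphplan's memoization of failures
    return plan

# Hand-built tables for levels 1 and 2 of the running example; in
# Graphplan itself these come from the plan-graph edges.
PRODUCERS = {
    1: {"P": [("O1", {"R"})], "Q": [("O2", {"W"})], "M": [("O3", {"W"})],
        "R": [("noop-R", {"R"})], "W": [("noop-W", {"W"})]},
    2: {"Z": [("O4", {"P", "Q"})]},
}
ACT_MUTEX = {
    1: {frozenset(("O1", "O3")), frozenset(("O2", "O3")),
        frozenset(("O1", "noop-R")), frozenset(("O2", "noop-W"))},
    2: set(),
}
```

Extracting {Z} at level 2 recovers the plan [{O1,O2}, {O4}] from earlier; extracting the mutex pair {M,Q} at level 1 fails, and the failing subgoal set lands in the memo table.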
Note of course that (a) the memoized illegal subsets won't help in cutting down the illegal projections of the plan graph (since the projection has already been done by the time the backward chaining starts), and (b) in the worst case we may still be storing O(2^m) lists -- the saving grace being that we store them only as they become available, not because we were eagerly enumerating all illegal subsets.

The reconstruction of Graphplan in terms of forward projection has, I hope, reduced some of the mystery as to the antecedents of the algorithm. At the very least, it should clarify to you that the fact that Graphplan plans can contain multiple actions within each step, and the fact that Graphplan terminates with the smallest plan (in terms of time steps), are both consequences of the properties of the corresponding forward projection graph. However, by intelligently distributing complexity between the search space size, the mutex relation computation, and a non-trivial backward chaining phase, Graphplan turns out to be very efficient empirically. Of course, this understanding of the antecedents of Graphplan in terms of complexity distribution makes us expect that there should be "killer examples" for Graphplan too. More interestingly, while pair-wise mutex turns out to be a pretty good choice, there may be other tradeoffs between pair-wise mutex and the full set of all illegal subsets that do better than pairwise-mutex-based Graphplan implementations in some cases.

Another thing we note from the connection between Graphplan and forward projection is that Graphplan should, at some level, inherit some of the limitations of planning based on forward projection. In particular, I conjecture that it may be harder for Graphplan to handle augmented planning problems where the planning starts not just with an init state and goals, but also with sets of IPCs to be maintained and sets of actions that must be part of the final plan.
Similarly, I conjecture that Graphplan will have some difficulty reusing/modifying specific plans. Of course, none of this is meant to be a black mark on Graphplan. Given that it outdoes many existing algorithms developed over a 20-year period, it deserves a lot of critical examination and attempts at extension.

Finally, the termination results of Graphplan can also be seen as a consequence of the termination of forward projection. But I will leave this as a non-trivial exercise for the curious.

Rao Kambhampati [Oct 15, 1995]