Notes for Thursday, March 27, 2003 Forgot Mention in Last Class You might know when certain things will happen during the execution of the plan. For example, I know that at 6 o'clock pm on Friday Goldwater Center closes it's doors. I can plan around it because I know it will happen. There are many predictable external events that we would like to plan around. In classical planning it is difficult to talk about durative actions because you need a marker that is independent of a plan. In classical planning, how do you specify that something will happen at a particular time? Time is a marker that is independent of a plan. You can model an external event and plan around it. Example problem: You have a rover that's trying to plan. It knows that at certain times there is darkness and at certain times there will be light. At particular points in times it can be looking at objects and others it cannot. When you are taking pictures of an object it must be dark. You must take external events into account. We now will focus on conjunctive planning approaches. Conjunctive planners - progression, regression, partial order planning. Disjunctive planners - graphplan, SAT encoding, naïve SAT encoding, CSP encoding The words "conjunctive" and "disjunctive" invoke this idea: In graphplan whenever you pick a particular encoding you pick one action or another action. So, we use disjunctive constraints. Progression, regression and partial order planning can be written as conjunctive constraints. Conjunctive planning approaches have worked the best with temporal planning. There are also some disadvantages of working with graphplan and other disjunctive planning approaches. First we talk about how to model the actions and how to model the search (progression, regression & partial order planning). It is unnecessary to extend partial order planning to deal with durative actions. All you end up doing is make constraints between time points. For example, if it turns out that an effect of A1 has to support a precondition of A2 then we can make a constraint that te < ts' (i.e. te is strictly less than ts'). This is a particular type of constraint satisfaction problem called temporal CSP. ts te +--------+ | A1 | +--------+ | | +-----+ | ts' te' | +--------+ '-->| A2 | +--------+ Recall in partial order planners something was managing our order constraints. Now we will have something manage the temporal constraint. In the discussion, we will talk about durative actions first and then we'll discuss how a continuous quantity can be handled. Action Representation In general you could have a precondition that is needed over an interval. Also, effects can occur at any point during an action. If you know exactly the amount of time an action will persist you can say that. If you happen to know "I will make it true exactly at this point" you can say that. Modeling is a very interesting issue - it is reasonable to say that if you don't know how to model an action then you can forget about having a search algorithm. We'll be talking about a particular standard, PDDL 2.1. Some of the parts of this standard are beyond current planners. There's a paper that will appear in the Journal of AI Research that shows the PDDL 2.1 standard. An example of an action from the lecture's slides: ~(in-city ?airplane ?city1) (in-city ?airplane ?city2) ^ ^ | | | consume(fuel ?airplane) | |------------------------------>| | | +-------------------------------+ | Flying | +-------------------------------+ ^ | (in-city ?airplane ?city1) ------------------------------> (fuel ?airplane) > 0 In the beginning we are at a particular city and the fuel has to be greater than zero for a particular flight. After the flying action starts you are no longer in the city. By the end of the flight action you are in another city. Also, there is an amount of fuel at the beginning and an amount at the end. Consumption occurs during the whole action. If you have an action it is obvious there are cases where you'd want a certain thing to be true at the end of an action. Can you think of an action where you would want something true at the beginning - and that is the reason why you write the action? For example, some people don't care where they go - they just want to get out of a particular city. Here is now PDDL models an action: (:durative-action burn_match :parameters () :duration (= ?duration 15) :condition: (and (at start have_match) (at start have_strikepad)) :effect (and (at start have_light) (at end (not have_light)) ) ) One obvious thing we see is a duration. Duration is be a number or an expression that is evaluated to a number. The expression is evaluated right at the top point of the action. Expressions cause the action to have a dynamic duration. A duration can be inequality, too. You might not know exactly how much time something will take. For example, when heating water - you don't know how much time that will take. You can say that it will definitely not take more than 10 minutes. How long it actually takes depends upon other things. If the duration is an inequality then you do not decide it. All you say is that this action will take place. As long as you stay within the inequalities certain effects will happen. Another thing that we have are things in the preconditions like (at start (>= energy ?x)) meaning at the start of the action, the energy must be greater than or equal to ?x. Another precondition we can have is like (over all (visible ?y ?x)), so throughout the action's duration it must be true that (visible ?y ?x). We cannot have preconditions over an interval in PDDL 2.1. To handle this, you can make a sequence of actions with different "over all" constraints. If you have "at start" as an effect of an action, then the action encumbers a particular resource to ensure that that amount exists throughout the duration of the action. That is, we speak for an amount of a particular resource so that it cannot be used otherwise. This prevents plans that will not work. We can ensure that a particular amount of a resource will exist before performing an action. We can also say an effect occurs when an action ends using "at end". If you don't want to be conservative you need to be flexible (not reckless). You have to say something like "my resource will be going down at the rate of..." Given the PDDL 2.1 action above and the cross_cellar action: (:durative-action cross_cellar :parameters () :duration (= ?duration 10) :condition (and (at start have_light) (over all have_light) (at start at_steps)) :effect (and (at start (not at_steps)) (at start crossing)(at end at_fuse_box) ) We make a plan to do both actions concurrently: +-burn_match---------------------------------------------------------+ | have_match, have strikepad | | <----------------------------------------------> dur: 15 | | have_light ~have_light | +--------------------------------------------------------------------+ +-cross_cellar-------------------------------------------------------+ | have_light, at_steps | | <-------------------------------------> dur: 10 | | ~at_steps, crossing ~at_fuse_box | +--------------------------------------------------------------------+ burn_match takes 15 units of time and cross_cellar takes 10 units of time. At the beginning of the burn_match action we get have_light. So now we can start cross_cellar. Here is a scenario where without concurrency we fail to do the problem. What does it mean to support a durative precondition? Notice over all points we must have light. In classical planning, if we have a precondition how many actions are required to support it? Just one. Here, if you want light from 8 pm to 6 am in your cellar you just keep striking matches one after the other. Think about your parents asking you to take care of your younger sibling. There are five older siblings. The requirement is that someone is with the youngest. You don't have to be there the whole time. If you have a precondition you have to consider supporting the action throughout the whole duration. So, one person can take care of the kid for a short time, and then the next for a short time, etc. This is a valid plan. This is the reduction of durative goals. Summary of PDDL 2.1 Durations can be static or dynamic and may have inequalities. Preconditions can be "at start" of an action or "over all" (through whole duration), but we cannot have interval preconditions or have a precondition in the middle of the action. We can do this by modeling sequences of actions. If you know you can make effects true at the start and you can hold the effects for 5 minutes after that, there is no way of saying that. Numeric quantities can be written in preconditions or effects. If you are consuming you tend to model @ the beginning but if you are producing you tend to model @ the end. Rates can go up and down at the same time. We add the rates in this case. Modeling linear rates is easier than non-linear rates. If you have linear rates of change that you use the min value theorem. Look at the example from the paper Fig. 12 and Fig. 14. # = rate. The current state of the art: Level 1: STRIPS/ADL Level 2: Durative Actions (FF, LPG, Sapa) Level 3: Numeric quantities (Sapa, LPG) Level 4: Continuous change What do plans look like? For each action we have a start point. We can calculate dynamic durations. There is no uncertainty during execution. Sometimes the plan might not just say the start time. We can constrain the position of an action. Ultimately what we need is the exact time points of actions. These are called position constraints. We can also give ordering constraints. Remember a partial order planner gives you a partially ordered sequence of actions. Similarily, a set of actions and a set of temporal constraints between the actions can be given. So, any assignment of start times to the actions that is consistent with the temporal constraints is a solution. If you give plans that way then you have order constraint plans. You could go from position constraint plans to order constraint plans. Remember: Just like in classical progression planning you can get a sequential plan and decide what ordering is really needed by writing a causal explanation. Having order constraint plans might be better than having position constraint plans because we get flexibility How do I represent the constraints between time points? t1 < t2 t1' t1 t2 +--------+ +--------+ | A1 | --> | A2 | +--------+ +--------+ t3 +--------+ ~p | A3 | +--------+ We want to talk about the distance between two time points. So we want to be able to write a constraint like 3 <= (t2 - t1') <= 5. That is, we want the distance to be between 3 and 5. What if all we want is t1 < t2. It doesn't really matter by how much smaller t1 is. Even qualitative definitions can be written. We can say 0 < t2 - t1 < infinity. These are called temporal constraint networks. You have a bunch of time points and you have constraints among the time points. You keep accumulating these constraints. When all constraints specify a single interval it's called a simple temporal network and is computationally as easy as a partial order plan. Checking if it is a consistent plan or not involves a topological sort which takes polynomial time. t1 < t2 ^ t2 < t3 simple partial order t1 < t2 v t3 < t4 disjunctive ordering constraint. When did we talk about disjunctive ordering constraints? RePOP. If you have disjunctive ordering constraints then testing for consistency is NP-Complete. This is called a disjunctive temporal network. t1 [0,3] / \ [1,5][7,9] / \ t0 ---- t2 [7,20] So, in the above figure [1,5][7,9] means either bounded between 1 and 5 or 7 and 9. In general, even if you're not interested in planning you may still be interested in temporal constraint satisfaction. For example, you may be interested in scheduling. When saying temporal network we normally assume a simple temporal network. We've talked about actions - problems - plans. Makespan - execution time of a plan (difference between the start time and the end time of a plan). Slack - the difference between the deadline for a goal and the time by which the plan achieves it. Slack can be positive where we finish before the goal or negative where we do not finish the goal in time. We want slack in the plan because it tends to improve the robustness of a plan. Think of an example where two plans have the same makespan but one has more slack than the other. You might be doing certain things at the last possible point. |+++++++++++++A1++++++++++++++| g1 has less slack |+++A2+++++| -----------------> g2 has more slack We could do A2 as above and give ourselves more slack or start g2 later and give ourselves less slack. Cost: We can have multiple dimensions, each corresponding to a resource.