------------------------------------------------------------------------
Note on Mar 27 Class. Note taker: Zaiqing Nie
Topics: Temporal Constraint Satisfaction Problems
Papers involved:
P1. E. Schwalb, L. Vila: "Temporal Constraints: A Survey"
P2. R. Dechter, I. Meiri, J. Pearl: "Temporal Constraint Networks"
P3. E. Schwalb, R. Dechter: "Processing Disjunctions in Temporal
Constraint Networks"
------------------------------------------------------------------------
1. Introduction to these three papers
This week is the TCSP (Temporal Constraint Satisfaction Problems) week.
First of all, you should have received a bunch of papers. One of these
papers is "Temporal Constraints: A Survey". It is a survey paper, but the
problem with this survey paper is that it assumes you already know enough
about TCSPs, so it will not be the best paper to read first. There are
two other papers. One is "Temporal Constraint Networks", the old 1991
paper. It is actually the first TCSP paper, the very first paper that
tried to formalize temporal reasoning as a constraint satisfaction
problem; of particular importance is the notion of the metric temporal
problem. For example, t1-t2<=25: how do you handle such temporal
constraints? They talked about these, and they differentiate between
something called simple temporal problems vs. full temporal problems,
which contain disjunctions. What the hell are disjunctions? If you have
disjunctions then you are dead, because CSP people don't like having
NP-complete procedures just for solving the CSP. If your NP-complete
problem is a subset of another NP-complete problem, it is not clear to me
whether you gain anything by avoiding the one formulation in favor of the
other one that is still NP-complete. But it is worth figuring out where
the complexity comes in, how the problem becomes harder. That is what
this paper talked about; it is the original paper. Then you can jump into
the paper "Processing Disjunctions in Temporal Constraint Networks", the
97 AI paper, which explains very nicely how to implement temporal
constraint satisfaction algorithms. If you read this paper first, then
whenever you get confused about something discussed in the older paper,
go back and read paper P2. In fact, once things become clear, you can go
back to the survey paper P1 and you will say "Yes, of course, that is
exactly what it is". Unfortunately, it is very hard for people who work
inside a field to write a survey paper; it is very hard to figure out
what people don't know. This survey paper is still interesting, because
the other two papers together don't cover all the information. I am going
to use the slides from the tutorial most of the time. A couple of things
you need to know: what is the meaning of composition and intersection?
They keep talking about it, and why is it important? Somehow it is
connected to the notion of path-consistency, so why is path-consistency
connected to composition and intersection?
2. Approaches to representing constraints
I am starting from constraint satisfaction, which is also in the
tutorial, because once you understand these notions for normal CSPs, you
can then go to temporal CSPs, which are a special type of CSP. Anyway,
this is basically a constraint satisfaction problem: you have variables,
you have domains, you have constraints, etc. One thing you want to
remember is that constraints can be represented either as feasible or as
infeasible compound labels. You have to decide which one up-front: you
either think of constraints as telling you "don't do these things" (this
is called the no-good approach; the constraints are no-goods), or as
telling you "you are only allowed to do these things".
An example to explain how to represent constraints,
...
x belongs to {1,2,3}
y belongs to {2,4}
...
What I really want to say is that x and y should never have the same
value. What are the constraints between x and y?
no-good way:
(x, y) not belong to {(1,1),(2,2),(3,3)}
feasible value way:
(x, y) belong to {(1,2),(1,4),(2,4),(3,2),(3,4)}
Suppose I am given a full assignment, such as
(..., x=a, ..., y=b, ...)
and I want to see whether the assignment, complete or partial, is already
inconsistent with this constraint. What I do is project the x and y
values out of the assignment. If the constraint is represented as
no-goods, I check whether the projected value is a member of the no-good
set; in the feasible compound label view, I check whether the projected
value is a member of the allowed set. In different situations, for
practical reasons, you might say one way is better than the other, but
theoretically they are exactly equivalent. For the rest of our discussion
we will think of constraints as feasible values, but you should be able
to convert from one representation to the other.
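As a concrete sketch (my own Python, not from the papers), here is the
x/y example in both representations, with the projection-based
consistency check:

```python
# Two equivalent representations of the constraint "x != y" for
# x in {1,2,3}, y in {2,4}, plus the projection-based consistency check.

x_domain = {1, 2, 3}
y_domain = {2, 4}

# No-good view: forbidden compound labels (as listed in the notes).
no_goods = {(1, 1), (2, 2), (3, 3)}

# Feasible view: allowed compound labels.
feasible = {(a, b) for a in x_domain for b in y_domain if a != b}

def consistent_nogood(assignment):
    """Project (x, y) out of a possibly partial assignment and check
    that the projected pair is NOT in the no-good set."""
    if "x" not in assignment or "y" not in assignment:
        return True  # constraint cannot be violated yet
    return (assignment["x"], assignment["y"]) not in no_goods

def consistent_feasible(assignment):
    """Same check, but against the allowed set instead."""
    if "x" not in assignment or "y" not in assignment:
        return True
    return (assignment["x"], assignment["y"]) in feasible

# The two views agree on every assignment over the domains.
assert consistent_nogood({"x": 2, "y": 2}) is False
assert consistent_feasible({"x": 2, "y": 4}) is True
```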
What happens when you do arc-consistency? If you take the feasible
compound label view, the constraints shrink: after constraint
propagation, you find that some of the supposedly feasible values are not
actually feasible, so you remove them. So the constraints shrink, and as
you do consistency enforcement you keep reducing the size of the
constraints. If you take the infeasible view, then after you do
arc-consistency you find other things that are implicitly infeasible too,
so the constraints grow.
In practice, people may use a procedure that takes a partial assignment
as input, together with a declaration of which variables it cares about.
For example, a procedure C5(x,y) cares about x and y: it takes the
partial assignment as input, does some magic, and outputs Yes/No. If it
is Yes, that means that, as far as this constraint is concerned, the
partial assignment is fine. If it is No, the partial assignment is not
fine. This is just another representation. If you want to talk about
composition and intersection, you cannot do it directly on procedures; if
you have procedural constraints and you want to compute the composition,
you basically have to call the procedure many times.
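A minimal sketch of such a procedural constraint; the name C5 comes from
the lecture, but the x != y body and the enumeration are my own
illustration:

```python
def C5(assignment):
    """Procedural constraint over x and y: returns True ("Yes") if, as
    far as this constraint is concerned, the partial assignment is fine,
    and False ("No") otherwise."""
    if "x" in assignment and "y" in assignment:
        return assignment["x"] != assignment["y"]
    return True  # not enough bindings to violate the constraint

# With a procedural constraint you cannot read off the relation
# directly; to recover it (e.g. for composition/intersection) you
# have to call the procedure once per candidate pair:
x_domain, y_domain = {1, 2, 3}, {2, 4}
relation = {(a, b) for a in x_domain for b in y_domain
            if C5({"x": a, "y": b})}
assert relation == {(1, 2), (1, 4), (2, 4), (3, 2), (3, 4)}
```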
3. Converting n-ary to binary
We thought a constraint is essentially about a set of variables, but
these guys are saying a constraint is always between two variables. So
the question is what is given up by this binary temporal constraint view:
is it without loss of generality or with loss of generality? The normal
party line is that it is without loss of generality. Just as
concentrating on SAT tells you a lot of what you need to know about CSP,
because every CSP can be rewritten as SAT, you could say it is without
loss of generality. Unfortunately, there is much more encoding you need
to do when you write a CSP as SAT; you know that the SAT encoding can
actually blow up the size of the CSP. So the question is: without loss of
which generality, the generality of expressiveness or the generality of
efficiency? In fact, the CSP community always thought it doesn't really
matter in terms of expressiveness: you can take an n-ary CSP, make extra
variables, and convert every n-ary constraint into a set of binary
constraints plus extra variables. It is not obvious up front, because
there is the notion of making new variables, but if you have an n-ary CSP
and you convert it into a binary one, you get more constraints and more
variables, and then you can use the constraint network representation. And
one of the interesting things is that nobody ever checked whether that is
a reasonable thing to do. One of my mentors said something that is a nice
thing to remember: "All research is a solution looking for a problem". So
after you generate a technique, you hope the whole world will really only
care about your problem and nothing else. Since binary CSPs have this
nice network representation, you really hope the whole world contains
only binary constraints; then you are free. But there are lots of
non-binary constraints (pretty much everything that you write in a SAT
encoding of plans is a non-binary constraint), so whenever you have
non-binary constraints you have to convert them into a binary version if
you want to stick to binary constraints. There is actually a paper as
late as 98 by ? and Van Beek, big people in the CSP community, who
actually said: "OK, let's sit down and talk about what is the utility of
taking an n-ary CSP and converting it into a binary CSP, and how bad can
it get". The results were quite surprising for the CSP people, because
they show that it is not always a good idea to convert an n-ary CSP into
a binary CSP. It is theoretically feasible but not always a good idea.
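One standard conversion is the hidden-variable encoding: introduce a new
variable whose domain is the set of allowed tuples of the n-ary
constraint, plus one binary agreement constraint per original variable. A
minimal sketch (the function and variable names are mine):

```python
from itertools import product

def hidden_variable_encoding(variables, allowed_tuples):
    """Replace an n-ary constraint over `variables` (allowed tuples
    given extensionally) by a hidden variable h whose domain is the set
    of allowed tuples, plus binary constraints tying h to each original
    variable."""
    hidden_domain = set(allowed_tuples)
    binary = {}
    for i, v in enumerate(variables):
        # h = t is compatible with v = t[i] (agreement on position i)
        binary[("h", v)] = {(t, t[i]) for t in hidden_domain}
    return hidden_domain, binary

# Ternary constraint x + y = z over {0,1,2}:
allowed = [(x, y, x + y) for x, y in product(range(3), range(3))]
h_dom, bins = hidden_variable_encoding(["x", "y", "z"], allowed)
# Note the blowup: the hidden variable's domain has one value per
# allowed tuple, which is part of why this conversion is not always
# a win in practice.
assert len(h_dom) == 9
```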
4. Constraint networks
So if you basically have variables and binary constraints between them,
then you have a graph in which each edge is a constraint: the edge has a
label, which is the representation of the constraint. Again, as I said,
the representation here will always talk about feasible values.
Page 26:
So in this case (p26), given c12 as a constraint, it says that only
(2,2) is a valid assignment between x1 and x2, and so on; those are the
constraints. So think of each constraint as essentially a relation; it is
not surprising that you end up doing an algebra of constraints that looks
a lot like what you do in databases. So it is kind of interesting: on one
hand you can think of constraints as logic sentences, and that is the
normal view; on the other hand you can think of constraints as relations.
This is basically the constraint network view. The nice thing is that a
huge number of techniques have been developed to apply graph-theoretical
analysis to these networks, and notions like arc and path consistency,
which are basically two- and three-consistency for normal CSPs, also show
up there. Arc consistency deals with two variables at a time and reduces
the domains. For example:
x1:{1,2} c12:{(2,2)} x2:{2,3}
O--------------------------o
After we do arc-consistency propagation, we can reduce the domain of
x1 to {2}, and that of x2 to {2}.
Path consistency is something we will talk about next, but essentially
it considers three variables at a time and asks: given the constraints
between x1 and x2, and between x2 and x3, what is the constraint between
x1 and x3?
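The arc-consistency reduction on this small example can be sketched as
follows (my own helper, treating c12 as a set of feasible pairs):

```python
def revise(dom_a, dom_b, allowed):
    """Shrink dom_a to the values that have some support in dom_b,
    where `allowed` is the set of feasible (a, b) pairs."""
    return {a for a in dom_a if any((a, b) in allowed for b in dom_b)}

x1, x2 = {1, 2}, {2, 3}
c12 = {(2, 2)}                          # feasible pairs for (x1, x2)

x1 = revise(x1, x2, c12)                # x1 shrinks to {2}
x2 = revise(x2, x1, {(b, a) for (a, b) in c12})  # x2 shrinks to {2}
assert x1 == {2} and x2 == {2}
```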
5. Solutions, minimum networks and constraint propagation
Page 27
So these (p27) are the normal problems that people are interested in when
solving CSPs. We only care about the first two, but the CSP community
looks at all four. So first of all you want to know whether the CSP is
consistent, and if it is consistent, give me a solution. Notice that
consistency is an existence question: it basically says a solution
exists, and normally an existence question might be cheaper than a
construction question. So: deciding consistency, computing a solution,
and computing a minimum network. These actually make good sense only if
you take the view of constraints as feasible values. What is a minimum
network? It essentially tells you how to compute all the solutions: once
I know the minimum network, I know all the solutions in polynomial time.
Since normally finding a single solution is NP-complete, how much would
it cost to find the minimum network? NP-complete. So computing a minimum
network is actually harder than finding a single solution. The fourth
notion is computing all solutions; if you have computed the minimum
network, you know how to compute all the solutions. Now why does one need
to talk about the minimum network? The idea is constraint propagation.
What is constraint propagation doing? Constraint propagation processes
all the constraints at the same time, moving the network towards the
minimum network. So it is kind of nice now: we always talked about what
consistency enforcement does, and here, from the binary CSP point of
view, you can actually talk about a partial ordering between networks.
You begin with the initial network; somewhere down below is the minimum
network, and in between are a bunch of networks that are more or less
minimal. Below the minimum network there is nothing, and you can think of
consistency enforcement as traversing this lattice; different constraint
propagation algorithms traverse it in different ways, but in essence they
all move toward the minimum network. With the minimum network there is no
backtracking, and as you get closer and closer to the minimum network,
you hopefully do less and less backtracking. That is the whole notion of
constraint propagation from the binary CSP point of view.
6. Reducing the complexity
Page 28
So, of course, everything is NP-complete: existence is NP-complete,
finding a solution is NP-complete, finding all solutions is also
NP-complete. Now the question is coping with intractability. It looks
like a silly thing for us, because in AI everything is NP-complete; you
would like to see whether there are ways in which you could reduce the
complexity of processing.
For example, suppose you are given a CSP and I tell you that your life
depends on telling me the right answer as to whether or not it contains a
solution, and on giving me a solution, but you only want to spend ten
minutes. There are two polynomial procedures, p1 and p2, both taking ten
minutes, and the exact procedure p3:

p1: Yes = maybe consistent;       No = definitely inconsistent
p2: Yes = definitely consistent;  No = maybe still consistent
p3: Yes = definitely consistent;  No = definitely inconsistent

The procedure p3 is too costly; the problem is NP-complete. So given the
polynomial procedures p1 and p2, which do you want to use? Of course you
will pick p1, because when p2 says a plan is inconsistent, it may
actually still be consistent, and if you kill the plan, you may kill the
only solution. So when people talk about
approximation methods for CSP, they always look for approximation
methods of the first type, like p1: methods that only prune what is
really inconsistent. Again, the reason this makes sense is that the CSP
is a subroutine for the planner; you don't really need to solve it
outright. If you are stuck with a single CSP, you have to solve it, but
if your CSP is a moving target, the planner might for other reasons
change its mind and go to a different branch, which gives you a
completely different CSP. So who cares if you spend all your life
figuring out whether it is completely consistent? That is the reason why
it makes sense to think about approximate solutions; approximation in
this case means that if the procedure says "inconsistent", the problem is
really inconsistent, and if it doesn't, it may still be inconsistent.
Consistency enforcement actually detects inconsistencies and makes the
constraints tighter. As we do more and more constraint propagation, we
get closer and closer to the minimum network, but until you reach the
minimum network you cannot say that the problem is consistent. Now here
is the interesting question: how would you do things so that Yes means
definitely consistent and No means "I don't know"? Then you essentially
have to come from below toward the minimum network: you start from a
really small network, a subset of the minimum network, and then grow it.
7. Path consistency
Pages 29, 30, 31
Path consistency is essentially three-consistency, and for many problems,
by the time you do path consistency you are done: whenever the bad
disjunctions don't exist in the temporal constraints, path consistency
ensures the minimum network. The good problems (the easiest problems) are
those for which path consistency ensures global consistency, so that
minimum network = global consistency. So the idea is: what is the level
of consistency you have to enforce before you get the minimum network? If
you just have to do arc-consistency, that means two-consistency gets you
to global consistency. If you have to do three-consistency before you get
global consistency, that is likely a harder problem. So the best way to
compare the complexity of two CSPs is not in terms of the number of
variables and constraints, but rather in terms of the level of
consistency that you have to enforce before the problem becomes globally
consistent. Simple temporal problems are at level three. Once you have
disjunctions, the worst case is level n.
Suppose I have a constraint network where two variables have no edge
between them. What is the constraint between them; what are the feasible
bindings? It can be everything in the cross product of d1 and d2 (d1 is
the domain of the first variable and d2 that of the second). So the basic
idea is that you avoid drawing an edge when the constraint that does
exist is the useless constraint saying that everything is fine. That is
worth remembering, because it is not that path consistency requires a
constraint between every two variables: even if there is no edge, you do
actually have a constraint, namely d1 cross d2. So think of every pair of
variables as having a constraint; if there is no explicit constraint, you
essentially have a dummy constraint which says everything is feasible.
We can use an example to understand the idea of path consistency:
4 t2 5
--------------O---------------
| |
t1 O--------------------------------O t3
3
t1 should precede t2 by at least four minutes, t2 should precede t3
by at least five minutes, and t1 should precede t3 by three minutes.
Now, is it really reasonable to believe that t1 need only precede t3
by three minutes? It is not, because you want to assign values to
everybody, not only t1 and t3, so t1 must precede t3 by at least nine
minutes; deriving that is basically enforcing path consistency. Full
path consistency says: take any long path x1 to xn and in essence
figure out the implicit constraint between x1 and xn, given the
constraints x1 to x2, x2 to x3, ..., xn-1 to xn. It turns out that for
simple binary networks, if you ensure three-consistency you get path
consistency.
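For these lower-bound precedence constraints, the tightening step is just
adding the bounds along the path and intersecting with the explicit
bound; a tiny sketch (my own, assuming constraints of the form
tj - ti >= c):

```python
# Metric constraints of the form t_j - t_i >= c: composing two such
# constraints along a path adds the bounds, and intersecting two
# bounds on the same pair keeps the larger (tighter) one.

lb_12 = 4           # t2 - t1 >= 4
lb_23 = 5           # t3 - t2 >= 5
lb_13_explicit = 3  # t3 - t1 >= 3 (the given, too-loose constraint)

lb_13_implied = lb_12 + lb_23               # composition: t3 - t1 >= 9
lb_13 = max(lb_13_explicit, lb_13_implied)  # intersection of bounds
assert lb_13 == 9
```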
Now consider all three-element subsets of the n variables; for each such
subset you compute the new constraint between each pair of variables.
When you do this, you have hopefully computed the tightest constraint
between every two variables.
For example, suppose t1, t2, t3 each have domain (1,10), with

Constraints(t1,t2) = {(1,5),(1,6),(1,10),(2,6),(2,10)}
Constraints(t2,t3) = {(1,6),(1,7),(2,7),(5,10),(4,9),(4,10)}

O--------------------------O-------------------------O
t1                         t2                        t3

and there is no explicit constraint between t1 and t3, so the constraint
(t1,t3) is effectively {(x1',x2') | x1' in (1,10), x2' in (1,10)}. After
I do whatever I do, the new constraint will be a subset of this.
New-Constraints(t1,t3)
= Constraints(t1,t2) composition Constraints(t2,t3)
= {(1,5) join (5,10)}
= {(1,10)}
Constraints(t1,t3)
= Constraints(t1,t3) intersection New-Constraints(t1,t3)
= {(1,10)}
Here the composition operation is nothing but a join followed by a
projection. So the only feasible assignment here is t1=1 and t3=10.
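In the relational view, the same computation is a join on the shared
variable followed by a projection, then a set intersection; a sketch of
the example above:

```python
c12 = {(1, 5), (1, 6), (1, 10), (2, 6), (2, 10)}          # (t1, t2) pairs
c23 = {(1, 6), (1, 7), (2, 7), (5, 10), (4, 9), (4, 10)}  # (t2, t3) pairs

def compose(r_ab, r_bc):
    """Composition: join on the shared middle variable, then project
    onto the outer pair (a, c)."""
    return {(a, c) for (a, b1) in r_ab for (b2, c) in r_bc if b1 == b2}

new_c13 = compose(c12, c23)      # only (1,5) joins (5,10)

# No explicit constraint between t1 and t3, so start from the
# universal relation d1 x d3 and intersect:
universal = {(a, c) for a in range(1, 11) for c in range(1, 11)}
c13 = universal & new_c13
assert c13 == {(1, 10)}          # only feasible: t1 = 1, t3 = 10
```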
Finally, Dr. Rao explained the Enforcing Path-Consistency algorithm and
the Directional Path-Consistency algorithm on pages 30 and 31.