------------------------------------------------------------------------
Note on Mar 27 Class. Note taker: Zaiqing Nie
Topics: Temporal Constraint Satisfaction Problems
Papers involved:
P1. E. Schwalb, L. Vila: "Temporal Constraints: A Survey"
P2. R. Dechter, I. Meiri, J. Pearl: "Temporal Constraint Networks"
P3. E. Schwalb, R. Dechter: "Processing Disjunctions in Temporal
Constraint Networks"
------------------------------------------------------------------------
1. Introduction to these three papers
This week is the TCSP (Temporal Constraint Satisfaction Problems) week.
First of all, you should have received a bunch of papers. One of these
papers is "Temporal Constraints: A Survey". It is a survey paper, but the
problem with this survey paper is that it assumes you already know enough
about TCSPs, so it will not be the best paper to read first. There are
two other papers. One is "Temporal Constraint Networks", the old 1991
paper. It is actually the first TCSP paper, the very first paper that
tried to formalize temporal reasoning as a constraint satisfaction
problem; of particular importance is the notion of the metric temporal
problem. For example, t1-t2<=25: how do you handle such temporal
constraints? They talked about these, and they differentiate between
something called simple temporal problems vs. full temporal problems,
which contain disjunctions. What the hell are disjunctions? If you have
disjunctions then you are dead, because CSP people don't like having
NP-complete procedures just for solving the CSP. If your NP-complete
problem is a subset of another NP-complete problem, it is not clear to me
whether you gain anything by avoiding the one formulation in favor of the
other one that is still NP-complete. But it is worth figuring out where
the complexity comes in, how the problem becomes harder. That is what
this paper talked about; it is the original paper. Then you can jump into
the paper "Processing Disjunctions in Temporal Constraint Networks", the
97 AI paper, which explains very nicely how to implement temporal
constraint satisfaction algorithms. If you read this paper first, then
whenever you get confused about something discussed in the older paper,
go back and read paper P2. In fact, once things become clear, you can go
back to the survey paper P1 and you will say "Yes, of course, that is
exactly what it is". Unfortunately, it is very hard for people who work
inside a field to write a survey paper; it is very hard to figure out
what people don't know. This survey paper is still interesting, because
the other two papers together don't cover all the information. I am going
to use the slides from the tutorial most of the time. A couple of things
you need to know: what is the meaning of composition and intersection?
They keep talking about it, and why is it important? Somehow it is
connected to the notion of path-consistency, so why is path-consistency
connected to composition and intersection?
2. Approaches to representing constraints
I am starting from constraint satisfaction, which is also in the
tutorial, because once you understand these notions for normal CSPs, you
can then go to temporal CSPs, which are a special type of CSP. Anyway,
this is basically a constraint satisfaction problem: you have variables,
you have domains, you have constraints, etc. One thing you want to
remember is that constraints can be represented either as feasible or as
infeasible compound labels. You have to decide which one up-front: you
either think of constraints as telling you "don't do these things" (this
is called the no-good approach; the constraints are no-goods), or as
telling you "you are only allowed to do these things".
An example to explain how to represent constraints,
...
x belongs to {1,2,3}
y belongs to {2,4}
...
What I really want to say is that x and y should never have the same
value. What are the constraints between x and y?
no-good way:
(x, y) not belong to {(1,1),(2,2),(3,3)}
feasible value way:
(x, y) belong to {(1,2),(1,4),(2,4),(3,2),(3,4)}
Suppose I am given a full assignment, such as
(..., x=a, ..., y=b, ...)
and I want to see whether the assignment, complete or partial, is already
inconsistent with this constraint. What I do is project the x and y
values out of the assignment. If the constraint is represented as
no-goods, I check whether the projected value is a member of the no-good
set; in the feasible compound label view, I check whether the projected
value is a member of the allowed set. In different situations, for
practical reasons, you might say one way is better than the other, but
theoretically they are exactly equivalent. For the rest of our discussion
we will think of constraints as feasible values, but you should be able
to convert from one representation to the other.
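As a concrete sketch (my own Python, not from the papers), here is the
x/y example in both representations, with the projection-based
consistency check:

```python
# Two equivalent representations of the constraint "x != y" for
# x in {1,2,3}, y in {2,4}, plus the projection-based consistency check.

x_domain = {1, 2, 3}
y_domain = {2, 4}

# No-good view: forbidden compound labels (as listed in the notes).
no_goods = {(1, 1), (2, 2), (3, 3)}

# Feasible view: allowed compound labels.
feasible = {(a, b) for a in x_domain for b in y_domain if a != b}

def consistent_nogood(assignment):
    """Project (x, y) out of a possibly partial assignment and check
    that the projected pair is NOT in the no-good set."""
    if "x" not in assignment or "y" not in assignment:
        return True  # constraint cannot be violated yet
    return (assignment["x"], assignment["y"]) not in no_goods

def consistent_feasible(assignment):
    """Same check, but against the allowed set instead."""
    if "x" not in assignment or "y" not in assignment:
        return True
    return (assignment["x"], assignment["y"]) in feasible

# The two views agree on every assignment over the domains.
assert consistent_nogood({"x": 2, "y": 2}) is False
assert consistent_feasible({"x": 2, "y": 4}) is True
```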
What happens when you do arc-consistency? If you take the feasible
compound label view, the constraints shrink: after constraint
propagation, you find that some of the supposedly feasible values are not
actually feasible, so you remove them. So the constraints shrink, and as
you do consistency enforcement you keep reducing the size of the
constraints. If you take the infeasible view, then after you do
arc-consistency you find other things that are implicitly infeasible too,
so the constraints grow.
In practice, people may use a procedure that takes a partial assignment
as input, together with a declaration of which variables it cares about.
For example, a procedure C5(x,y) cares about x and y: it takes the
partial assignment as input, does some magic, and outputs Yes/No. If it
is Yes, that means that, as far as this constraint is concerned, the
partial assignment is fine. If it is No, the partial assignment is not
fine. This is just another representation. If you want to talk about
composition and intersection, you cannot do it directly on procedures; if
you have procedural constraints and you want to compute the composition,
you basically have to call the procedure many times.
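A minimal sketch of such a procedural constraint; the name C5 comes from
the lecture, but the x != y body and the enumeration are my own
illustration:

```python
def C5(assignment):
    """Procedural constraint over x and y: returns True ("Yes") if, as
    far as this constraint is concerned, the partial assignment is fine,
    and False ("No") otherwise."""
    if "x" in assignment and "y" in assignment:
        return assignment["x"] != assignment["y"]
    return True  # not enough bindings to violate the constraint

# With a procedural constraint you cannot read off the relation
# directly; to recover it (e.g. for composition/intersection) you
# have to call the procedure once per candidate pair:
x_domain, y_domain = {1, 2, 3}, {2, 4}
relation = {(a, b) for a in x_domain for b in y_domain
            if C5({"x": a, "y": b})}
assert relation == {(1, 2), (1, 4), (2, 4), (3, 2), (3, 4)}
```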
3. Converting n-ary to binary
We thought a constraint is essentially about a set of variables, but
these guys are saying a constraint is always between two variables. So
the question is what is given up by this binary temporal constraint view:
is it without loss of generality or with loss of generality? The normal
party line is that it is without loss of generality. Just as
concentrating on SAT tells you a lot of what you need to know about CSP,
because every CSP can be rewritten as SAT, you could say it is without
loss of generality. Unfortunately, there is much more encoding you need
to do when you write a CSP as SAT; you know that the SAT encoding can
actually blow up the size of the CSP. So the question is: without loss of
which generality, the generality of expressiveness or the generality of
efficiency? In fact, the CSP community always thought it doesn't really
matter in terms of expressiveness: you can take an n-ary CSP, make extra
variables, and convert every n-ary constraint into a set of binary
constraints plus extra variables. It is not obvious up front, because
there is the notion of making new variables, but if you have an n-ary CSP
and you convert it into a binary one, you get more constraints and more
variables, and then you can use the constraint network representation. And
one of the interesting things is that nobody ever checked whether that is
a reasonable thing to do. One of my mentors said something that is a nice
thing to remember: "All research is a solution looking for a problem". So
after you generate a technique, you hope the whole world will really only
care about your problem and nothing else. Since binary CSPs have this
nice network representation, you really hope the whole world contains
only binary constraints; then you are free. But there are lots of
non-binary constraints (pretty much everything that you write in a SAT
encoding of plans is a non-binary constraint), so whenever you have
non-binary constraints you have to convert them into a binary version if
you want to stick to binary constraints. There is actually a paper as
late as 98 by ? and Van Beek, big people in the CSP community, who
actually said: "OK, let's sit down and talk about what is the utility of
taking an n-ary CSP and converting it into a binary CSP, and how bad can
it get". The results were quite surprising for the CSP people, because
they show that it is not always a good idea to convert an n-ary CSP into
a binary CSP. It is theoretically feasible but not always a good idea.
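One standard conversion is the hidden-variable encoding: introduce a new
variable whose domain is the set of allowed tuples of the n-ary
constraint, plus one binary agreement constraint per original variable. A
minimal sketch (the function and variable names are mine):

```python
from itertools import product

def hidden_variable_encoding(variables, allowed_tuples):
    """Replace an n-ary constraint over `variables` (allowed tuples
    given extensionally) by a hidden variable h whose domain is the set
    of allowed tuples, plus binary constraints tying h to each original
    variable."""
    hidden_domain = set(allowed_tuples)
    binary = {}
    for i, v in enumerate(variables):
        # h = t is compatible with v = t[i] (agreement on position i)
        binary[("h", v)] = {(t, t[i]) for t in hidden_domain}
    return hidden_domain, binary

# Ternary constraint x + y = z over {0,1,2}:
allowed = [(x, y, x + y) for x, y in product(range(3), range(3))]
h_dom, bins = hidden_variable_encoding(["x", "y", "z"], allowed)
# Note the blowup: the hidden variable's domain has one value per
# allowed tuple, which is part of why this conversion is not always
# a win in practice.
assert len(h_dom) == 9
```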
4. Constraint networks
So if you basically have variables and binary constraints between them,
then you have a graph in which each edge is a constraint: the edge has a
label, which is the representation of the constraint. Again, as I said,
the representation here will always talk about feasible values.
Page 26:
So in this case (p26), given c12 as a constraint, it says that only
(2,2) is a valid assignment between x1 and x2, and so on; those are the
constraints. So think of each constraint as essentially a relation; it is
not surprising that you end up doing an algebra of constraints that looks
a lot like what you do in databases. So it is kind of interesting: on one
hand you can think of constraints as logic sentences, and that is the
normal view; on the other hand you can think of constraints as relations.
This is basically the constraint network view. The nice thing is that a
huge number of techniques have been developed to apply graph-theoretical
analysis to these networks, and notions like arc and path consistency,
which are basically two- and three-consistency for normal CSPs, also show
up there. Arc consistency deals with two variables at a time and reduces
the domains. For example:
x1:{1,2} c12:{(2,2)} x2:{2,3}
O--------------------------o
After we do arc-consistency propagation, we can reduce the domain of
x1 to {2}, and that of x2 to {2}.
Path consistency is something we will talk about next, but essentially
it considers three variables at a time and asks: given the constraints
between x1 and x2, and between x2 and x3, what is the constraint between
x1 and x3?
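The arc-consistency reduction on this small example can be sketched as
follows (my own helper, treating c12 as a set of feasible pairs):

```python
def revise(dom_a, dom_b, allowed):
    """Shrink dom_a to the values that have some support in dom_b,
    where `allowed` is the set of feasible (a, b) pairs."""
    return {a for a in dom_a if any((a, b) in allowed for b in dom_b)}

x1, x2 = {1, 2}, {2, 3}
c12 = {(2, 2)}                          # feasible pairs for (x1, x2)

x1 = revise(x1, x2, c12)                # x1 shrinks to {2}
x2 = revise(x2, x1, {(b, a) for (a, b) in c12})  # x2 shrinks to {2}
assert x1 == {2} and x2 == {2}
```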
5. Solutions, minimum networks and constraint propagation
Page 27
So these (p27) are the normal problems that people are interested in when
solving CSPs. We only care about the first two, but the CSP community
looks at all four. So first of all you want to know whether the CSP is
consistent, and if it is consistent, give me a solution. Notice that
consistency is an existence question: it basically says a solution
exists, and normally an existence question might be cheaper than a
construction question. So: deciding consistency, computing a solution,
and computing a minimum network. These actually make good sense only if
you take the view of constraints as feasible values. What is a minimum
network? It essentially tells you how to compute all the solutions: once
I know the minimum network, I know all the solutions in polynomial time.
Since normally finding a single solution is NP-complete, how much would
it cost to find the minimum network? NP-complete. So computing a minimum
network is actually harder than finding a single solution. The fourth
notion is computing all solutions; if you have computed the minimum
network, you know how to compute all the solutions. Now why does one need
to talk about the minimum network? The idea is constraint propagation.
What is constraint propagation doing? Constraint propagation processes
all the constraints at the same time, moving the network towards the
minimum network. So it is kind of nice now: we always talked about what
consistency enforcement does, and here, from the binary CSP point of
view, you can actually talk about a partial ordering between networks.
You begin with the initial network; somewhere down below is the minimum
network, and in between are a bunch of networks that are more or less
minimal. Below the minimum network there is nothing, and you can think of
consistency enforcement as traversing this lattice; different constraint
propagation algorithms traverse it in different ways, but in essence they
all move toward the minimum network. With the minimum network there is no
backtracking, and as you get closer and closer to the minimum network,
you hopefully do less and less backtracking. That is the whole notion of
constraint propagation from the binary CSP point of view.
6. Reducing the complexity
Page 28
So, of course, everything is NP-complete: existence is NP-complete,
finding a solution is NP-complete, finding all solutions is also
NP-complete. Now the question is coping with intractability. It looks
like a silly thing for us, because in AI everything is NP-complete; you
would like to see whether there are ways in which you could reduce the
complexity of processing.
For example, suppose you are given a CSP and I tell you that your life
depends on telling me the right answer as to whether or not it contains a
solution, and on giving me a solution, but you only want to spend ten
minutes. There are two polynomial procedures, p1 and p2, both taking ten
minutes, and the exact procedure p3:

p1: Yes = maybe consistent;       No = definitely inconsistent
p2: Yes = definitely consistent;  No = maybe still consistent
p3: Yes = definitely consistent;  No = definitely inconsistent

The procedure p3 is too costly; the problem is NP-complete. So given the
polynomial procedures p1 and p2, which do you want to use? Of course you
will pick p1, because when p2 says a plan is inconsistent, it may
actually still be consistent, and if you kill the plan, you may kill the
only solution. So when people talk about
approximation methods for CSP, they always look for approximation
methods of the first type, like p1: methods that only prune what is
really inconsistent. Again, the reason this makes sense is that the CSP
is a subroutine for the planner; you don't really need to solve it
outright. If you are stuck with a single CSP, you have to solve it, but
if your CSP is a moving target, the planner might for other reasons
change its mind and go to a different branch, which gives you a
completely different CSP. So who cares if you spend all your life
figuring out whether it is completely consistent? That is the reason why
it makes sense to think about approximate solutions; approximation in
this case means that if the procedure says "inconsistent", the problem is
really inconsistent, and if it doesn't, it may still be inconsistent.
Consistency enforcement actually detects inconsistencies and makes the
constraints tighter. As we do more and more constraint propagation, we
get closer and closer to the minimum network, but until you reach the
minimum network you cannot say that the problem is consistent. Now here
is the interesting question: how would you do things so that Yes means
definitely consistent and No means "I don't know"? Then you essentially
have to come from below toward the minimum network: you start from a
really small network, a subset of the minimum network, and then grow it.
7. Path consistency
Pages 29, 30, 31
Path consistency is essentially three-consistency, and for many problems,
by the time you do path consistency you are done: whenever the bad
disjunctions don't exist in the temporal constraints, path consistency
ensures the minimum network. The good problems (the easiest problems) are
those for which path consistency ensures global consistency, so that
minimum network = global consistency. So the idea is: what is the level
of consistency you have to enforce before you get the minimum network? If
you just have to do arc-consistency, that means two-consistency gets you
to global consistency. If you have to do three-consistency before you get
global consistency, that is likely a harder problem. So the best way to
compare the complexity of two CSPs is not in terms of the number of
variables and constraints, but rather in terms of the level of
consistency that you have to enforce before the problem becomes globally
consistent. Simple temporal problems are at level three. Once you have
disjunctions, the worst case is level n.
Suppose I have a constraint network where two variables have no edge
between them. What is the constraint between them; what are the feasible
bindings? It can be everything in the cross product of d1 and d2 (d1 is
the domain of the first variable and d2 that of the second). So the basic
idea is that you avoid drawing an edge when the constraint that does
exist is the useless constraint saying that everything is fine. That is
worth remembering, because it is not that path consistency requires a
constraint between every two variables: even if there is no edge, you do
actually have a constraint, namely d1 cross d2. So think of every pair of
variables as having a constraint; if there is no explicit constraint, you
essentially have a dummy constraint which says everything is feasible.
We can use an example to understand the idea of path consistency:
4 t2 5
--------------O---------------
| |
t1 O--------------------------------O t3
3
t1 should precede t2 by at least four minutes, t2 should precede t3
by at least five minutes, and t1 should precede t3 by three minutes.
Now, is it really reasonable to believe that t1 need only precede t3
by three minutes? It is not, because you want to assign values to
everybody, not only t1 and t3, so t1 must precede t3 by at least nine
minutes; deriving that is basically enforcing path consistency. Full
path consistency says: take any long path x1 to xn and in essence
figure out the implicit constraint between x1 and xn, given the
constraints x1 to x2, x2 to x3, ..., xn-1 to xn. It turns out that for
simple binary networks, if you ensure three-consistency you get path
consistency.
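For these lower-bound precedence constraints, the tightening step is just
adding the bounds along the path and intersecting with the explicit
bound; a tiny sketch (my own, assuming constraints of the form
tj - ti >= c):

```python
# Metric constraints of the form t_j - t_i >= c: composing two such
# constraints along a path adds the bounds, and intersecting two
# bounds on the same pair keeps the larger (tighter) one.

lb_12 = 4           # t2 - t1 >= 4
lb_23 = 5           # t3 - t2 >= 5
lb_13_explicit = 3  # t3 - t1 >= 3 (the given, too-loose constraint)

lb_13_implied = lb_12 + lb_23               # composition: t3 - t1 >= 9
lb_13 = max(lb_13_explicit, lb_13_implied)  # intersection of bounds
assert lb_13 == 9
```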
Now consider all three-element subsets of the n variables; for each such
subset you compute the new constraint between each pair of variables.
When you do this, you have hopefully computed the tightest constraint
between every two variables.
For example, suppose t1, t2, t3 each have domain (1,10), with

Constraints(t1,t2) = {(1,5),(1,6),(1,10),(2,6),(2,10)}
Constraints(t2,t3) = {(1,6),(1,7),(2,7),(5,10),(4,9),(4,10)}

O--------------------------O-------------------------O
t1                         t2                        t3

and there is no explicit constraint between t1 and t3, so the constraint
(t1,t3) is effectively {(x1',x2') | x1' in (1,10), x2' in (1,10)}. After
I do whatever I do, the new constraint will be a subset of this.
New-Constraints(t1,t3)
= Constraints(t1,t2) composition Constraints(t2,t3)
= {(1,5) join (5,10)}
= {(1,10)}
Constraints(t1,t3)
= Constraints(t1,t3) intersection New-Constraints(t1,t3)
= {(1,10)}
Here the composition operation is nothing but a join followed by a
projection. So the only feasible assignment here is t1=1 and t3=10.
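In the relational view, the same computation is a join on the shared
variable followed by a projection, then a set intersection; a sketch of
the example above:

```python
c12 = {(1, 5), (1, 6), (1, 10), (2, 6), (2, 10)}          # (t1, t2) pairs
c23 = {(1, 6), (1, 7), (2, 7), (5, 10), (4, 9), (4, 10)}  # (t2, t3) pairs

def compose(r_ab, r_bc):
    """Composition: join on the shared middle variable, then project
    onto the outer pair (a, c)."""
    return {(a, c) for (a, b1) in r_ab for (b2, c) in r_bc if b1 == b2}

new_c13 = compose(c12, c23)      # only (1,5) joins (5,10)

# No explicit constraint between t1 and t3, so start from the
# universal relation d1 x d3 and intersect:
universal = {(a, c) for a in range(1, 11) for c in range(1, 11)}
c13 = universal & new_c13
assert c13 == {(1, 10)}          # only feasible: t1 = 1, t3 = 10
```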
Finally, Dr. Rao explained the Enforcing Path-Consistency algorithm and
the Directional Path-Consistency algorithm on pages 30 and 31.