_______________________________________________________
Note taker: Terry Zimmerman (zim@asu.edu)
Date: April 19, 2000
Subject: Reviewed midterm exam (with relevant digressions) until ~4:25 PM
         Decision-Theoretic Planning (cont): stationary MDPs, problem horizons,
         discounted value functions
In exchange with: Binh Minh
...................................

Midterm exams were handed back along with a Sample Solutions handout, and these were discussed for most of class. Major "exam points of interest" are briefly covered here, especially those Dr. Rao seized on as opportunities to extend into mini-lectures...

The discussion fairly closely tracked the solutions handout until Short Answer question D, where Rao spent ~15 minutes discussing the relationship between the SAT encodings of clauses and the "equivalent" ILP clauses (where variables are constrained to the values 0/1). Whereas most students fully grasped the first two parts of the question, which involved expressing the encodings in the two forms, there was more subtlety to the third part -a more compact ILP representation. Rao explained what lay behind the solution sheet answer:

   A1 + A2 + A3 <= 1

This form can only be arrived at by using "passive constraints" of the problem. That is, recognizing that A1, A2, and A3 must each be assigned either 0 or 1 allows you to create this equivalent constraint. This is called a "cut" in the ILP literature (more specifically, a "clique cut"). Note that the following combination of the linear clauses is NOT a cut:

   A1 + A2 <= 1
   A2 + A3 <= 1
   A3 + A1 <= 1
   ---------------------
   2A1 + 2A2 + 2A3 <= 3

This is NOT a cut! Unlike A1 + A2 + A3 <= 1, it provides no further information or refinement over the original component constraints.

BINHMINH: More correctly, it is also a cut -any constraint is a cut. However, it's not a "separating" cut, meaning that it doesn't provide any new information (it doesn't cut off any part of the relaxation region).
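A quick numeric check of the "separating" distinction (the fractional point here is my own illustration, not from the notes): the clique cut excludes a point of the LP relaxation that the pairwise constraints, and any plain sum of them, still allow.

```python
# Illustration only: a fractional point in the LP relaxation.
a1, a2, a3 = 0.5, 0.5, 0.5

# The original pairwise constraints are all satisfied...
pairwise_ok = (a1 + a2 <= 1) and (a2 + a3 <= 1) and (a3 + a1 <= 1)

# ...and so is their sum -- the linear combination adds no information...
summed_ok = 2*a1 + 2*a2 + 2*a3 <= 3

# ...but the clique cut excludes this point: it is a separating cut.
clique_ok = a1 + a2 + a3 <= 1

print(pairwise_ok, summed_ok, clique_ok)  # True True False
```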
Rao drew a comparison between this use of "passive information" and the POCL planning process. Suppose we have two operators: A1 with precondition P, and A2 with precondition Q. The initial partial plan

   { 0 --[P]--> infty }

gets refined into

   { 0 * A1 < infty }

which describes a greatly reduced candidate set. Here we've used passive information about the structure of planning actions to prune the candidate set -namely, the information that for an action to be applicable, all of its preconditions must hold before the action.

Note that in CSP, the constraint propagation process does NOT reduce the number of "candidates"; it just makes the representation more explicit. To reduce the size of the candidate set you need to combine constraint propagation with non-explicit constraints about the problem or the CSP process (constraints that are not "on the table").

BINHMINH: Umm, the question here is what exactly we mean by "candidate". I think that by adding a new constraint we actually are cutting down the candidate set; the thing that doesn't change is the set of solutions. For example, if we have: X

From rao@asu.edu Mon Apr 24 23:48:00 2000
> Date: Thu, 20 Apr 2000 07:13:25 -0700
> From: Subbarao Kambhampati
> To: plan-class@parichaalak.eas.asu.edu
> Subject: (a)musings on the SAT/ILP connection
>
> Here is a bit more on the SAT/ILP connection (in the context of the
> cutset discussion in today's class).
>
> We know that every SAT clause has a 0/1 ILP equivalent.
>
> So, what is the equivalent of "resolution" and other inference
> procedures of SAT in ILP?
>
> The answer is -simple linear combination of constraints.
>
> For example, see the nice little connection between the resolution of
> two SAT clauses and the equivalent ILP derivation:
>
>    x v y             x + y >= 1
>    ~y v z            1 - y + z >= 1
>    ==============    =======================
>    x v z             x + z >= 1
>
> So we note that if we just keep computing all linear combinations of
> the ILP constraints, we are deriving all the consequences of the
> original SAT clauses.
>
> Since constraint propagation involves essentially deriving all the
> resolvents of the given set of clauses, doing linear combinations on
> the ILP side can be seen as doing constraint propagation.

BINHMINH: It's true that ILP and CSP subsume SAT, but some linear combinations do not have the same effect in SAT. What I mean is that any SAT resolvent can be expressed as some linear combination, but not the other way around. This is easy to understand because SAT is just a special case of ILP (binary ILP) and a special case of CSP (boolean CSP); therefore, anything we can do with SAT we can do with ILP and CSP, but not the other way around.

> Suppose the SAT clauses are unsatisfiable. How does it show up in ILP? Maybe
> you derive an inconsistent arithmetic constraint through linear combinations...
> let's see:
>
>    x v y         x + y >= 1
>    ~y            1 - y >= 1
>    ~x            1 - x >= 1
>    ==========    =================
>    |= False      2 >= 3
>
> Voila! Notice that the false on the left is derived after two separate
> resolutions. The degenerate inequality on the right is derived after
> just summing everything up.

BINHMINH: However, deciding which set of ILP constraints to sum up to get a "right" cut is rather more difficult than in SAT (because the ILP structure is more general). But Hooker's book has a nice section (that I like very much) on choosing which SAT clauses to resolve based on the solution of the corresponding LP relaxation.
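A small mechanical sketch of the resolution-as-summation point above (the clause representation is my own choice): encode each clause as a map from variable to literal sign, translate ~y to (1 - y), and sum the two inequalities; the y terms cancel, leaving the resolvent.

```python
# Sketch: a clause is {var: sign}, sign +1 for a positive literal, -1 for
# a negated one. Its 0/1 ILP form is sum(sign*var) + (#negated) >= 1,
# since each ~v contributes (1 - v).
def ilp_form(clause):
    const = sum(1 for s in clause.values() if s == -1)
    return dict(clause), const  # (coefficients, constant on the LHS)

c1 = {'x': +1, 'y': +1}   # x v y   ->  x + y >= 1
c2 = {'y': -1, 'z': +1}   # ~y v z  ->  1 - y + z >= 1

# Sum the two inequalities: the y coefficients (+1 and -1) cancel,
# giving x + z + 1 >= 2, i.e. x + z >= 1 -- the resolvent x v z.
coeffs, const = {}, 0
for clause in (c1, c2):
    cs, k = ilp_form(clause)
    const += k
    for v, s in cs.items():
        coeffs[v] = coeffs.get(v, 0) + s
coeffs = {v: s for v, s in coeffs.items() if s != 0}
rhs = 2 - const  # each clause contributed ">= 1"; move constants right
print(coeffs, '>=', rhs)  # {'x': 1, 'z': 1} >= 1
```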
>
> Another way of checking whether a set of linear constraints will allow us
> to derive inconsistent arithmetic constraints is to just run the LP
> algorithm on the constraints and see if it returns degeneracy.
>
> But wait.. how come we are able to check unsatisfiability of
> propositional clauses--a co-NP complete problem--using LP, which is a
> polynomial procedure? Something must be wrong here...
>
> Let's see if we can always check unsatisfiability using LP... consider
> the following example:
>
>    ~x v y        1 - x + y >= 1
>    x v y         x + y >= 1
>    ~x v ~y       2 - x - y >= 1
>    x v ~y        1 + x - y >= 1
>    =========
>    |= False
>
> Clearly, the left hand side entails an empty clause.
> However, there is no linear combination of the clauses on the RHS that
> leads to an inconsistent arithmetic constraint. In fact, it is easy to
> see that the assignment x,y = (.5,.5) is a satisfying assignment in
> that it satisfies all the arithmetic constraints.
>
> So, what gives? What gives is that the integrality constraints--x,y
> must be 0 or 1--which are implicit in the RHS constraints, are
> violated.
>
> Specifically, let's see if any of the assignments x,y = (00)(01)(10)(11)
> make sense for the RHS. Consider (00)--this assignment violates the
> second constraint. Similarly, (01) violates the fourth constraint,
> (10) violates the first constraint, and (11) violates the
> third one. So, none of the 0/1 assignments are consistent for
> x,y. This means that the 0/1 integer program is "degenerate".
>
> So, why isn't the LP degenerate? Well, that is because there are
> fractional solutions (x,y) = (.5,.5) that satisfy all the constraints.
> This is why the LP relaxation doesn't detect the inconsistency of the SAT
> clauses.
>
> Given that the LP relaxation did detect the inconsistency in the x v y,
> ~x, ~y case above, but failed in this case, the question is what sub-class
> of inconsistencies can be detected by LP relaxation alone?
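The enumeration in the e-mail above is easy to check mechanically; a minimal sketch (the predicate encoding is mine): no 0/1 assignment satisfies all four constraints, yet the fractional point (.5, .5) satisfies them all, so the LP relaxation misses the inconsistency.

```python
from itertools import product

# The four constraints from the example above, as Python predicates:
# ~x v y, x v y, ~x v ~y, x v ~y.
constraints = [
    lambda x, y: 1 - x + y >= 1,   # ~x v y
    lambda x, y: x + y >= 1,       #  x v y
    lambda x, y: 2 - x - y >= 1,   # ~x v ~y
    lambda x, y: 1 + x - y >= 1,   #  x v ~y
]

def feasible(x, y):
    return all(c(x, y) for c in constraints)

# The 0/1 integer program is infeasible...
integral_solutions = [p for p in product((0, 1), repeat=2) if feasible(*p)]
print(integral_solutions)      # []
# ...but the LP relaxation has a fractional feasible point.
print(feasible(0.5, 0.5))      # True
```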
>
> We note that in the case of x v y, ~x, ~y, unit propagation on the SAT
> clauses derives the empty clause, while in the second example above, the
> empty clause cannot be derived by unit propagation alone. It turns out that
> this is in general always the case.

ZIM: You introduce the term "unit propagation" here, by which I assume you mean 1x1 constraint propagation on the SAT clauses. So it seems the phrase "while in the second example above, the empty clause cannot be derived..." should indicate that you're referring to the equivalent propagation on the ILP constraints (since the empty clause IS derived for the SAT clauses in the 2nd case also -just not by unit propagation).

> That is, the LP relaxation of the 0/1 ILP program of a set of SAT clauses
> will be degenerate IF and ONLY IF unit propagation applied to the SAT
> clauses derives an empty clause. [[Since running unit propagation to
> completion on a set of SAT clauses is only polytime, while checking
> the degeneracy of an ILP program--even if it is just a 0/1 ILP
> program--is NP-complete, the complexities do match]].
>
> The final note is about resolvents that can be derived by linear
> combination. Clearly, for our second example, the empty clause is a
> resolvent of the SAT clauses, and yet it cannot be derived by
> linear combination of the arithmetic constraints (or else the LP
> relaxation would have detected the inconsistency).

BINHMINH: But if we take into account the fact that all the ILP variables must take the value 0/1, then we can derive the same effect as the "empty clause" in SAT.

> What this means is the following ONE DIRECTIONAL implication:
>
> Every linear combination of arithmetic constraints corresponding to a
> set of SAT clauses is a resolvent of the SAT clauses
> (but not vice versa).

ZIM: I'm confused. The linear combination in the 2nd example did NOT resolve the SAT clauses -at least not with the same answer as SAT resolution.

BINHMINH: I don't know how you can prove it; I don't see it clearly here.
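The unit-propagation side of the iff claim above can be checked directly; here is a minimal sketch (my own implementation, with literals encoded as (variable, sign) pairs): unit propagation derives the empty clause for {x v y, ~x, ~y} but gets stuck on the four 2-clauses, mirroring where the LP relaxation succeeds and fails.

```python
def unit_propagate(clauses):
    """Return True iff unit propagation derives the empty clause.
    A clause is a set of literals; a literal is (var, sign), sign = +/-1."""
    clauses = [set(c) for c in clauses]
    while True:
        if any(len(c) == 0 for c in clauses):
            return True                    # empty clause: inconsistency found
        units = [next(iter(c)) for c in clauses if len(c) == 1]
        if not units:
            return False                   # no unit clause left to propagate
        var, sign = units[0]
        new = []
        for c in clauses:
            if (var, sign) in c:
                continue                   # clause satisfied: drop it
            new.append(c - {(var, -sign)}) # remove the falsified literal
        clauses = new

ex1 = [{('x', 1), ('y', 1)}, {('x', -1)}, {('y', -1)}]   # x v y, ~x, ~y
ex2 = [{('x', -1), ('y', 1)}, {('x', 1), ('y', 1)},      # ~x v y,  x v y,
       {('x', -1), ('y', -1)}, {('x', 1), ('y', -1)}]    # ~x v ~y, x v ~y

print(unit_propagate(ex1))  # True  -- and the LP relaxation is degenerate
print(unit_propagate(ex2))  # False -- UP fails, and so does the LP relaxation
```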
>
> Cutting constraints start with linear combinations, but then also use
> the integrality restriction to derive constraints that are stronger
> than any of the linear combinations.

BINHMINH: And they can yield a cut corresponding to any "possible" resolvent of the original SAT clause set.

BINHMINH: It may help to state that in SAT it's proven that resolution is complete, in the sense that it can derive ALL possible implied clauses. In ILP, the Chvátal cut (I'm not so sure about the attribution; it seems some other guy first mentioned it and Chvátal proved it), which basically takes a *weighted* combination of arithmetic constraints and does *integral rounding* on all the coefficients, can derive ALL possible cuts. However, both resolution and Chvátal's cuts are general and costly, so it's better to concentrate on more specific types of constraint propagation.

> (One thing worth thinking about at this point is what is the nature of
> the cutting constraints when seen from the SAT angle. Specifically, a
> cutting constraint should still correspond to a valid SAT

BINHMINH: I think that's not true. Take the example below: x+y+z <= 1 doesn't correspond to any SAT resolvent. Its only utility is that it helps the LP relaxation; it DOESN'T help the ILP formulation of SAT. Thus it gives "extra" information to the LP relaxation, but it doesn't give extra information to the ILP (it doesn't cut off any part of the integral hull). And since the binary ILP is the mirror of the SAT, we don't gain anything for it, and also nothing for the SAT (no new resolvent).

> resolvent--but there may not be any compact way of expressing it. For
> example, in the case of 3 pairwise mutex clauses between x, y, and z,
> we know that at most one of the three variables x, y, and z can be
> true. This can be expressed quite elegantly and compactly as
>
>    x + y + z <= 1
>
> in ILP. However, in SAT it needs the expression on the LHS below
> (because SAT is lousy at COUNTING).
>
>    (x & ~y & ~z) v
>    (~x & y & ~z) v
>    (~x & ~y & z) v
>    (~x & ~y & ~z)

BINHMINH: However, while x+y+z <= 1 (1) in ILP gives us something *extra* in the LP relaxation, the lousy SAT form of (1) doesn't give anything to SAT, because it's exactly (~x v ~y), (~y v ~z), (~z v ~x), which is the starting point for deriving (1) in ILP. Note that adding (1) to the ILP formulation may help the LP relaxation detect infeasibility, but adding the lousy SAT form to the original SAT doesn't help unit propagation (if it couldn't detect the infeasibility in the first place). (Oops, I'm kind of repeating the point here.)

Rao [Apr 20, 2000]
................................................

Question G (integrating planning & scheduling) additional notes:

A major problem with conducting planning & scheduling together is that we end up rediscovering the many types of failures over and over. For example, if an action cannot be assigned a given resource, the planner will futilely try all other resources of the same type. This problem is really all about "abstraction" -that is, can we efficiently abstract scheduling from planning? It's likely that a compromise will be best. One approach: use a 2-phase plan/schedule system but keep an upper/lower-bound resource constraint in the planning phase.

The comments on the remaining questions closely followed the Sample Solutions.

--------------------------------------------------------
BACK TO: "Decision-Theoretic Planning: Structural Assumptions and Computational Leverage"

Rao reviewed Section 2 as covered last time, focusing on:

Stationary MDPs: the probability distribution predicting the next state depends only on the previous state, NOT on the stage (or time).

Section 2.7 Horizons and Success Criteria
Finite-horizon problems vs. infinite-horizon problems...
(Infinite-horizon problems necessitate use of a discount factor.)

Rao noted that there's a special case of an infinite-horizon problem that does not require discounting: only the goal states have a positive reward, the goal states are "absorbing", and actions have a positive cost (i.e., their cost gets subtracted).

We then moved to Sections 3 & 3.1 with this observation: counter-intuitively, there does NOT necessarily exist an optimal *stationary* policy for a finite-horizon MDP. But for an infinite-horizon stationary MDP there DOES exist an optimal stationary policy, and it does as well as the best non-stationary policy (i.e. time doesn't matter). (This was proven by R.A. Howard, 1960.)

=============================================================
NEXT TIME (from Rao e-mail):
1. Fielding any questions on section 2.10
2. Discuss Section 3
   We definitely will cover section 3.1
   ** Make sure that you follow the policy construction in Figure 11.
   We will most likely cover Section 3.2-3.3
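As a footnote to the "no discount needed" special case discussed above, here is a tiny numeric sketch (the MDP itself is my own toy example, not from the paper): with an absorbing goal and a positive cost per action, undiscounted value iteration converges, and V(s) is just minus the expected number of steps to absorb.

```python
# Toy example: states 0 and 1, state 2 is the absorbing goal. The single
# action "move" advances with probability 0.9 and stays put with
# probability 0.1; each attempt costs 1; the goal accrues nothing further.
states = [0, 1, 2]
GOAL = 2

def transitions(s):
    """P(s' | s, move); the goal is absorbing."""
    if s == GOAL:
        return {GOAL: 1.0}
    return {s + 1: 0.9, s: 0.1}

V = {s: 0.0 for s in states}
for _ in range(200):               # iterate the Bellman backup to convergence
    V = {s: 0.0 if s == GOAL
         else -1.0 + sum(p * V[t] for t, p in transitions(s).items())
         for s in states}

# Expected attempts to leave a state is 1/0.9, so V(1) -> -1/0.9 and
# V(0) -> -2/0.9 -- no discount factor needed for convergence here.
print(round(V[1], 4), round(V[0], 4))  # -1.1111 -2.2222
```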