_______________________________________________________
Note taker: Terry Zimmerman (zim@asu.edu)
Date: April 19 2000
Subject: Reviewed Midterm exam (with relevant digressions) until ~4:25PM
Decision-Theoretic Planning (cont)
Stationary MDPs, problem horizons, discounted value functions
In exchange with: Binh Minh
...................................
Midterm exams handed back along with a Sample Solutions handout. These were
discussed for most of class. Major "exam points of interest" are briefly
covered here, especially those Dr. Rao seized on as opportunities to extend
into mini-lectures...
The discussion fairly closely tracked the solns handout until Short Answer
question D, where Rao spent ~15 mins discussing the relationship between the
SAT encodings of clauses and the "equivalent" ILP clauses (where variables are
constrained to the values 0/1). Whereas most students fully grasped the
first two parts of the question involving expressing the encodings in the two
forms, there was more subtlety to the third part - a more compact ILP
representation.
Rao explained what lay behind the soln sheet answer:
A1 + A2 + A3 <= 1
This form can only be arrived at by using "passive constraints" for the
problem. That is, recognizing that A1, A2, and A3 must each be assigned
either 0 or 1 allows you to create this equivalent inequality. This is called
a "cut" in the ILP literature (and more specifically a "clique cut"). Note
that the following combination of the linear clauses is NOT a cut:
A1 + A2 <= 1
A2 + A3 <= 1
A3 + A1 <= 1
--------------
2A1 + 2A2 + 2A3 <= 3
This is NOT a cut! Unlike A1 + A2 + A3 <= 1, it provides no further
information or refinement over the original component eqns.
BINHMINH: More correctly, it's also a cut. Any constraint is a cut.
However, it's not a "separating" cut, meaning that it doesn't provide any new
information (doesn't cut down any part from the relaxation region).
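To make the "separating" distinction concrete, here is a small check of my own (a sketch, not from class): on 0/1 points the clique cut agrees exactly with the pairwise constraints, but in the LP relaxation the clique cut excludes the fractional point (1/2, 1/2, 1/2), which the summed constraint 2A1 + 2A2 + 2A3 <= 3 does not.

```python
# Sketch: compare the clique cut A1+A2+A3 <= 1 with the plain sum
# 2A1+2A2+2A3 <= 3 of the three pairwise constraints.
from itertools import product

def pairwise_ok(a1, a2, a3):
    # the three original linear clauses
    return a1 + a2 <= 1 and a2 + a3 <= 1 and a3 + a1 <= 1

# On 0/1 points, the clique cut and the pairwise constraints agree...
for a1, a2, a3 in product([0, 1], repeat=3):
    assert pairwise_ok(a1, a2, a3) == (a1 + a2 + a3 <= 1)

# ...but on the LP relaxation they differ: the fractional point
# (1/2, 1/2, 1/2) satisfies every pairwise constraint (and hence their
# sum, since 3*0.5 = 1.5 gives 3 <= 3), yet violates the clique cut.
p = (0.5, 0.5, 0.5)
assert pairwise_ok(*p)        # the sum provides no new information here
assert not (sum(p) <= 1)      # the clique cut separates this point
```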
Rao drew a comparison of this use of "passive information" to the POCL
planning process. Suppose we have 2 operators:
A1 with precond P, A2 with precond Q
The initial partial plan { 0 --[P]--> infty }
gets refined into { 0 * A1 < infty }
which describes a greatly reduced candidate set.
Here we've used passive information about the planning action structure in
order to prune the candidate set. We use the information that for an action
to be applicable, all of its preconditions must hold before the action.
Note that in CSP, the constraint propagation process does NOT reduce the
number of "candidates". It just makes the representation more explicit. To
reduce the size of the candidate set you need to combine constraint
propagation with non-explicit constraints about the problem or CSP process.
(constraints that are not "on the table").
BINHMINH: Umm, the question here is what exactly we mean by "candidate". I
think that by adding a new constraint we actually are cutting down the
candidate set; the one that doesn't change is the set of solutions. For
example, if we have: X ...

From rao@asu.edu Mon Apr 24 23:48:00 2000
> Date: Thu, 20 Apr 2000 07:13:25 -0700
> From: Subbarao Kambhampati
> To: plan-class@parichaalak.eas.asu.edu
> Subject: (a)musings on the SAT/ILP connection
>
> Here is a bit more on the SAT/ILP connection (in the context of the
> cutset discussion in today's class).
>
> We know that every SAT clause has a 0/1 ILP equivalent.
>
> So, what is the equivalent of "resolution" and other inference
> procedures of SAT in ILP?
>
> The answer is -- simple linear combination of constraints.
>
> For example see the nice little connection between the resolution of
> two SAT clauses and the equivalent resolution:
>
> x v y            x + y >= 1
> ~y v z           1 - y + z >= 1
> ==============   ===============
> x v z            x + z >= 1
>
>
> So, we note that if we just keep computing all linear combinations of
> the ILP constraints, we are deriving all the consequences of the
> original SAT clauses.
>
> Since constraint propagation involves essentially deriving all the
> resolvents of the given set of clauses, doing linear combinations on the
> ILP side can be seen as doing constraint propagation.
BINHMINH: It's true that ILP and CSP overshadow SAT, but some linear
combinations do not have the same effect in SAT. What I mean is that any SAT
resolvent can be expressed as some linear combination, but it's not true the
other way around. It's easy to understand because SAT is just a special case
of ILP (binary ILP) and a special case of CSP (boolean CSP); therefore,
anything that we can do with SAT we can do with ILP and CSP, but not the
other way around.
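The resolution-as-linear-combination correspondence in the email can be checked by brute force. This is my own sketch, not from the exchange: summing the ILP forms of (x v y) and (~y v z) gives x + 1 + z >= 2, i.e. x + z >= 1, which is exactly the ILP form of the resolvent (x v z).

```python
# Sketch: whenever both premise constraints hold on a 0/1 point,
# their sum (the resolvent's ILP form) holds too.
from itertools import product

def resolvent_holds(x, y, z):
    c1 = x + y >= 1            # x v y
    c2 = (1 - y) + z >= 1      # ~y v z
    # summing c1 and c2: x + 1 + z >= 2, i.e. x + z >= 1 (the resolvent x v z)
    return (not (c1 and c2)) or (x + z >= 1)

# verify on every 0/1 assignment
for x, y, z in product([0, 1], repeat=3):
    assert resolvent_holds(x, y, z)
```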
>
>
> Suppose the SAT clauses are unsatisfiable. How does it show up in ILP? Maybe
> you derive an inconsistent arithmetic constraint through linear
> combinations...
> let's see:
>
> x v y        x + y >= 1
> ~y           1 - y >= 1
> ~x           1 - x >= 1
> ==========   ==========
> |= False     2 >= 3
>
> Voila! Notice that the false on the left is derived after two separate
> resolutions. The degenerate inequality on the right is derived after
> just summing everything up.
BINHMINH: However, deciding which set of ILP constraints to sum up to get a
"right" cut is rather more difficult than in SAT (because its structure is
more general). But, in Hooker's book, he has a nice section (that I like very
much) that talks about which SAT clauses are chosen to get a new resolvent
based on the solution from the corresponding LP relaxation.
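The "just summing everything up" step above is easy to verify mechanically. A sketch of mine (not from the email): adding the three constraints cancels every variable, leaving the impossible inequality 2 >= 3.

```python
# Sketch: sum the three ILP constraints for (x v y), (~y), (~x).
# Represent each as (coef_x, coef_y, constant), with RHS 1.
constraints = [
    (1, 1, 0),     # x + y  >= 1   (x v y)
    (0, -1, 1),    # 1 - y  >= 1   (~y)
    (-1, 0, 1),    # 1 - x  >= 1   (~x)
]
cx = sum(c[0] for c in constraints)
cy = sum(c[1] for c in constraints)
const = sum(c[2] for c in constraints)
rhs = len(constraints)        # each RHS is 1, so the summed RHS is 3

assert (cx, cy) == (0, 0)     # the variables cancel out entirely
assert const < rhs            # 2 >= 3 is false: the clauses are unsatisfiable
```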
>
> Another way of checking if a set of linear constraints will allow us
> to derive inconsistent arithmatic constraints is to just run the LP
> algorithm on the constraints and see if it returns degeneracy.
>
> But wait.. how come we are able to check unsatisfiability of
> propositional clauses--a co-NP-complete problem--using LP, which is a
> polynomial procedure? Something must be wrong here...
>
> Let's see if we can always check unsatisfiability using LP...consider
> the following example:
>
> ~x V y       1 - x + y >= 1
> x V y        x + y >= 1
> ~x V ~y      2 - x - y >= 1
> x V ~y       1 + x - y >= 1
> =========
> |= False
>
> Clearly, the left hand side entails an empty clause.
> However, there is no linear combination of the clauses on the RHS that
> leads to an inconsistent arithmetic constraint. In fact, it is easy to
> see that the assignment x,y = (.5,.5) is a satisfying assignment in
> that it satisfies all the arithmetic constraints.
>
> So, what gives? What gives is that the integrality constraints--x,y
> must be 0 or 1--which are implicit in the RHS constraints, are
> violated.
>
> Specifically, let's see if any of the assignments x,y =(00)(01)(10)(11)
> make sense for the RHS. Consider (00)--this assignment violates the
> second constraint. Similarly, (01) violates the fourth constraint,
> (10) violates the first constraint and (11) violates the
> third one. So, none of the 0/1 assignments are consistent for
> x,y. This means that the 0/1 integer program is "degenerate".
>
> So, why isn't the LP degenerate? Well, that is because there is a
> fractional solution (x,y) = (.5,.5) that satisfies all constraints.
> This is why LP relaxation doesn't detect the inconsistency of the SAT
> clauses.
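Both facts above (every 0/1 assignment violates some constraint, while the fractional point satisfies all four) can be verified by enumeration. A sketch of mine, not from the email:

```python
# Sketch: the four ILP constraints for (~x v y), (x v y), (~x v ~y),
# (x v ~y).  No 0/1 point is feasible, yet (0.5, 0.5) satisfies all
# four, so the LP relaxation alone cannot detect the inconsistency.
from itertools import product

def constraints(x, y):
    return [
        1 - x + y >= 1,   # ~x v y
        x + y     >= 1,   #  x v y
        2 - x - y >= 1,   # ~x v ~y
        1 + x - y >= 1,   #  x v ~y
    ]

for x, y in product([0, 1], repeat=2):
    assert not all(constraints(x, y))   # every 0/1 point violates something

assert all(constraints(0.5, 0.5))       # but the LP relaxation is feasible
```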
>
> Given that the LP relaxation did detect the inconsistency in the x V y,
> ~x, ~y case above, but failed in this case, the question is what sub-class
> of inconsistencies can be detected by LP relaxation alone?
>
> We note that in the case of x V y, ~x, ~y, unit propagation on the SAT
> clause derives the empty clause, while in the second example above, the
> empty clause cannot be derived by unit propagation alone. Turns out that
> this is in general always the case.
ZIM: You introduce the term "unit propagation" here, by which I assume you
mean 1x1 constraint prop on the SAT clauses. So it seems the phrase "while in
the second example above, the empty clause cannot be derived..." should
indicate that you're referring to the equivalent propagation on the ILP eqns.
(since the empty clause IS derived for the SAT clauses for the 2nd case also).
> That is, LP relaxation of the 0/1 ILP program of a set of SAT clauses
> will be degenerate IF and ONLY IF unit propagation applied to the SAT
> clauses derives an empty clause. [[Since running unit propagation to
> completion on a set of SAT clauses is only polytime, while checking
> the degeneracy of an ILP program--even if it is just a 0/1 ILP
> program--is NP-complete, the complexities do match]].
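For concreteness, here is a minimal unit-propagation routine run on both examples from the email. This is my own sketch; the integer-literal clause representation (+v for v, -v for ~v) is an assumption of mine, not from the notes.

```python
# Sketch: unit propagation on clause sets.  Literals are ints: +v for
# the variable v, -v for its negation.  Returns True iff the empty
# clause is derived (i.e. unit propagation detects unsatisfiability).
def unit_propagate(clauses):
    clauses = [set(c) for c in clauses]
    while True:
        units = [next(iter(c)) for c in clauses if len(c) == 1]
        if not units:
            return False            # nothing left to propagate
        lit = units[0]
        new = []
        for c in clauses:
            if lit in c:
                continue            # clause satisfied, drop it
            c = c - {-lit}          # remove the falsified literal
            if not c:
                return True         # empty clause derived
            new.append(c)
        clauses = new

# x v y, ~x, ~y: unit propagation derives the empty clause, matching
# the case where the LP relaxation was degenerate (2 >= 3).
assert unit_propagate([{1, 2}, {-1}, {-2}])

# ~x v y, x v y, ~x v ~y, x v ~y: there are no unit clauses, so unit
# propagation derives nothing -- matching the case where LP failed.
assert not unit_propagate([{-1, 2}, {1, 2}, {-1, -2}, {1, -2}])
```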
>
> The final note is about resolvents that can be derived by linear
> combination. Clearly, for our second example, the empty clause is a
> resolvent of the SAT clauses, and yet it cannot be derived by the
> linear combination of the arithmetic constraints (or else, LP
> relaxation would have detected the inconsistency).
BINHMINH: But if we take into account the fact that all ILP variables must
take the value 0/1, then we can derive the same effect as the "empty clause"
in SAT.
> What this means is the following ONE DIRECTIONAL implication:
>
> Every linear combination of arithmatic constraints corresponding to a
> set of SAT clauses is a resolvent of the SAT clauses.
> (but not vice versa).
ZIM: I'm confused. The lin comb in the 2nd example did NOT resolve the SAT
clauses - at least not with the same answer as SAT resolution.
BINHMINH: I don't know how you can prove it; I don't see it clearly here.
>
> Cutting constraints start with linear combinations, but then also use
> the integrality restriction to derive constraints that are stronger
> than any of the linear combinations.
BINHMINH: And one can get a cut corresponding to any "possible" resolvent
from the original SAT clause set.
Binhminh: It may help to state that in SAT, it's proven that resolution is
complete in the sense that it can derive ALL possible clauses. In ILP, the
Chvátal cut (I'm not so sure about the name; it seems some other researcher
first mentioned it, and Chvátal proved it), which basically takes a *weighted*
combination of arithmetic constraints and does *integral rounding* on all
coefficients, can derive ALL possible cuts. However, both resolution and
Chvátal's cut are general and costly, so it's better to concentrate on more
specific types of constraint propagation.
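As a concrete instance of the weighted-combination-plus-rounding idea, here is a sketch of mine (not from the notes) showing one rounding step recovering the clique cut from the first part of class: sum the pairwise constraints, scale by 1/2, then round the right-hand side down, which is valid because the left-hand side is integral at every 0/1 point.

```python
# Sketch: a Chvatal-Gomory-style rounding step.
# Sum A1+A2 <= 1, A2+A3 <= 1, A3+A1 <= 1 to get 2A1+2A2+2A3 <= 3,
# scale by 1/2, then round the RHS down: A1+A2+A3 <= floor(3/2) = 1.
import math
from itertools import product

coeffs = [2, 2, 2]                 # summed LHS: 2A1 + 2A2 + 2A3
rhs = 3
scaled = [c / 2 for c in coeffs]   # [1.0, 1.0, 1.0]
cut_rhs = math.floor(rhs / 2)      # floor(1.5) = 1: the clique cut's RHS

# the rounded cut is valid at every 0/1 point satisfying the originals
for a in product([0, 1], repeat=3):
    pairs = [(a[0], a[1]), (a[1], a[2]), (a[2], a[0])]
    if all(x + y <= 1 for x, y in pairs):
        assert sum(s * v for s, v in zip(scaled, a)) <= cut_rhs
```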
>
> (One thing worth thinking about at this point is what is the nature of
> the cutting constraints when seen from the SAT angle. Specifically, a
> cutting constraint should still correspond to a valid SAT
BINHMINH: I think that it's not true. Let's take the example below: x+y+z <= 1
doesn't correspond to any SAT resolvent. The only utility here is that it
helps the LP relaxation, but it DOESN'T help the ILP formulation of SAT. Thus,
it gives "extra" information to the LP relaxation, but it doesn't give extra
information to the ILP (doesn't cut down any part of the integral hull). And
since the binary ILP is the mirror of the SAT, we don't get anything for it,
and also nothing for the SAT (no new resolvent).
> resolvent--but there may not be any compact way of expressing it. For
> example, in the case of 3 pairwise mutex clauses between x,y, and z,
> we know that at most one of the three variables x,y and z can be
> true. This can be expressed quite elegantly and compactly as
>
> x+y+z <= 1
>
> in ILP. However in SAT, it needs the expression on the LHS below
> (because SAT is lousy at COUNTING).
>
> (x & ~y & ~z) V
> (~x & y & ~z) V
> (~x & ~y & z) V
> (~x & ~y & ~z)
BINHMINH: However, while x+y+z <= 1 (1) in ILP gives us something *extra* in
the LP relaxation, the lousy SAT form of (1) doesn't give anything to SAT,
because it's exactly (~x v ~y), (~y v ~z), (~z v ~x), which is the starting
point in deriving (1) in ILP. Note that adding (1) to the ILP formulation may
help the LP relaxation detect the infeasibility, but adding the lousy SAT
form to the original SAT doesn't help unit propagation (if it couldn't detect
the infeasibility in the first place).
(Oops, I'm kind of repeating the point here.)
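The claimed equivalences are easy to confirm by enumeration. A sketch of mine (not from the exchange): on 0/1 points, x+y+z <= 1, the three pairwise mutex clauses, and the four-disjunct SAT expression from the email all agree.

```python
# Sketch: check that the ILP form, the pairwise mutexes, and the
# "at most one true" DNF coincide on every 0/1 assignment.
from itertools import product

def at_most_one(x, y, z):
    return x + y + z <= 1              # the ILP form

def pairwise_mutex(x, y, z):
    # (~x v ~y) & (~y v ~z) & (~z v ~x)
    return not (x and y) and not (y and z) and not (z and x)

def sat_dnf(x, y, z):
    # the four-disjunct SAT expression from the email
    return ((x and not y and not z) or (not x and y and not z)
            or (not x and not y and z) or (not x and not y and not z))

for x, y, z in product([0, 1], repeat=3):
    assert at_most_one(x, y, z) == pairwise_mutex(x, y, z) == sat_dnf(x, y, z)
```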
Rao
[Apr 20, 2000]
................................................
Question G (integrating planning & scheduling) additional notes:
A major problem with conducting planning & scheduling together is that we end
up rediscovering the many types of failures over and over. For example, if an
action cannot be assigned a given resource, the planner will futilely try all
other resources of the same type. This problem is really all about
"abstraction": that is, can we efficiently abstract scheduling from planning?
It's likely that a compromise will be best. One approach: use a 2-phase
plan/schedule system but keep an upper/lower-bound resource constraint in the
planning phase.
The comments on the remaining questions closely followed the Sample
Solutions.
--------------------------------------------------------
BACK TO:
"Decision-Theoretic Planning: Structural Assumptions and Computational
Leverage"
Rao reviewed Section 2 as covered last time focusing on:
Stationary MDPs: -the probability distribution predicting the next state
depends only on the previous state, NOT the stage (or time)
Section 2.7 Horizons and Success Criteria
Finite horizon problems vs. infinite-horizon problems...
(infinite horizon problems necessitate use of a discount factor)
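A quick numeric illustration of why the discount factor is needed (my own sketch, not from the paper): an undiscounted infinite stream of reward 1 per stage diverges, while with gamma < 1 the discounted total converges to the geometric sum 1/(1-gamma).

```python
# Sketch: a constant reward of 1 per stage, discounted by gamma = 0.9.
# The discounted total converges to 1/(1-gamma) = 10; without
# discounting the same sum grows without bound.
gamma = 0.9
partial = sum(gamma**t * 1.0 for t in range(1000))
assert abs(partial - 1.0 / (1.0 - gamma)) < 1e-9
```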
Rao noted that there's a special case of an infinite-horizon problem that
does not require discounting: only the goal state has a positive reward,
goal states are "absorbing", and actions have a positive cost (i.e., it gets
subtracted).
We then moved to Section 3 & 3.1 with this observation:
Counter-intuitively, a finite-horizon MDP need NOT have an optimal stationary
policy: the best action can depend on how many stages remain. But an
infinite-horizon stationary MDP DOES have an optimal stationary policy, and
it does as well as any non-stationary policy (i.e. time doesn't matter).
(This was proven by R.A. Howard, 1960.)
=============================================================
NEXT TIME (from Rao e-mail):
1. Fielding any questions on section 2.10
2. Discuss Section 3
We definitely will cover section 3.1
** Make sure that you follow the policy construction in Figure 11.
We will most likely cover Section 3.2-3.3