********************************************************
Date:
Note-taker: Minh Binh Do
Topic: Scheduling
********************************************************

Finally, we finish Rao's material on classical planning and turn to scheduling. Wow, we are dealing with a totally new problem here. Rao will start with chapter 3 of D. Smith's paper, and he suggests we also read a paper by Beck and Fox about variable and value orderings in scheduling. Besides all that good news, Rao also mentioned some homework that he will send out soon. He said the problem will be small, but if you are familiar with Rao's classes, you know that none of his homework takes less than one full day to finish. [[Rao: note-takers are getting too cocky...]]

According to Rao, one problem with scheduling is the lack of a standard, general formulation. In general, it is the problem of assigning resources to tasks that have durations. However, everyone comes up with a different representation (not all of them CSP- or ILP-based) for their own specific problem, and proposes their own method to solve it. Rao said that nobody even cares much about what other people are doing; if you read about 100 scheduling papers, you will get totally lost. So we can appreciate the fact that the classical planning problem is clearly defined, and that we have a cute STRIPS representation for it. Because there are so many different scheduling problems, we will concentrate on the job-shop scheduling problem as we go along.

In the Job-Shop Scheduling (JSS) problem, we have a set of tasks known in advance. Each task executes over some duration and requires a machine of some specific type. There are requirements on the order in which tasks can be executed (for example, you can't "take class notes" before "going to the class"). Each machine can only hold a limited number of tasks, and many tasks may compete for the same type of machine over the same time period.
The scheduler should return the starting time of every task, along with the machine each task will execute on. In the traditional method, each task is represented by its starting time, and there are normally constraints on the range of possible starting times. If we call ESTi the Earliest (possible) Starting Time and LFTi the Latest (possible) Finishing Time of task Ti, then the starting time can range from ESTi to LFTi - Di (where Di is the duration of Ti). The scheduling problem is then fully specified by the following variables, constants, and constraints:

1. For each task Ti, the variables are Si (starting time) and Mi (machine assigned to Ti), and the constants are Di, ESTi, and LFTi.
2. Precedence constraints between tasks: Ti < Tj (meaning Si + Di <= Sj).
3. Constraints on which machine type each task uses, how many machines of that type exist, and the capacity of each machine.

Note that there are two types of scheduling problems: non-preemptable and preemptable. According to Rao, the non-preemptable version is much harder to solve (even though many people think the opposite). One of the most popular variants of JSS is single-capacity JSS, where each machine can support only one task at a time. It turns out that single-capacity JSS is as hard as limited-capacity JSS (and both are harder than infinite-capacity JSS).

As we discussed in the last class, the difficulty of a CSP depends on the number of variables, the number of constraints, and the average size of the constraints (in that order). If we look at the CSP encoding of JSS, the variables are the starting time and the machine assignment of each task, so if n is the number of tasks to be scheduled, there are 2*n variables in total.
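The traditional encoding above can be sketched as a tiny CSP that enumerates start times. The instance, task names, durations, and windows below are all made up for illustration; for simplicity, the machine assignment is treated as fixed (one machine per type), so only the start times Si are searched, and a real solver would backtrack rather than enumerate exhaustively.

```python
# A minimal sketch of the start-time CSP encoding of single-capacity JSS.
# Instance is hypothetical; one machine per type, so M_i is not searched.
from itertools import product

# task name -> (duration D_i, EST_i, LFT_i, machine type required)
tasks = {
    "grind": (2, 0, 6, "mill"),
    "polish": (3, 0, 8, "mill"),   # competes with "grind" for the mill
}
precedence = [("grind", "polish")]  # grind must finish before polish starts

def domains(tasks):
    # S_i ranges over EST_i .. LFT_i - D_i (integral time units)
    return {t: range(est, lft - d + 1) for t, (d, est, lft, _) in tasks.items()}

def consistent(assign):
    for i, j in precedence:                      # S_i + D_i <= S_j
        if assign[i] + tasks[i][0] > assign[j]:
            return False
    names = list(assign)
    # single-capacity: two tasks on the same machine must not overlap
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            i, j = names[a], names[b]
            if tasks[i][3] == tasks[j][3]:
                if not (assign[i] + tasks[i][0] <= assign[j]
                        or assign[j] + tasks[j][0] <= assign[i]):
                    return False
    return True

def solve(tasks):
    doms = domains(tasks)
    names = list(tasks)
    for values in product(*(doms[t] for t in names)):
        assign = dict(zip(names, values))
        if consistent(assign):
            return assign
    return None

print(solve(tasks))  # first consistent schedule found
```

Even on this toy instance you can see the domain-size issue discussed below: measuring time in minutes instead of hours would multiply each `range` by 60.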
However, we can remove dependent variables to reduce the total number of variables, which results in a factored encoding. (Recall the state-based and action-based encodings for classical planning, where we compile away the action variables or the state variables. It has been shown that the action-based encoding is normally better than the state-based one. I think one reason is the "AND" relation in the action-to-state dependency (an action implies the "AND" of all of its effects and preconditions) versus the "OR" relation in the state-to-action direction (a fact implies the "OR" of the actions that give it). Therefore, when we compile away state variables using the dependency between actions and facts, we generally get simpler (and smaller??) constraints than when we compile away action variables. Moreover, papers like "Act, and the rest will follow" have recognized that if we know the truth values of all action variables, then we also know the truth values of all fact variables, but not vice versa.)

Going back to the JSS problem: we can throw out all the machine variables by adding constraints of the form (Si + Di <= Sj) V (Sj + Dj <= Si) for every pair (Ti, Tj) that competes for the same machine Mk. Note that this formula only applies to single-capacity JSS; for limited-capacity JSS we need a more complicated formulation (it's easy to state in words, but actually quite complicated to represent as a constraint; can you think of it??). Now we can get rid of the machines, and the resulting encoding has only the starting time of each task as a variable. The constraints involving EST and LFT are expressed as Si >= ESTi and Si + Di <= LFTi. From those two constraints, the range in which we can choose Si runs from ESTi to LFTi - Di, and the value (LFTi - ESTi - Di) is normally called the slack of Ti. Now that the JSS problem has been clearly specified, can you think of any problems with this CSP encoding??
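The two pieces just introduced, the pairwise disjunctive constraint and the slack value, can be written down directly. The numbers below are illustrative only:

```python
# Sketch: in the factored encoding, machine contention between two tasks
# that need the same machine becomes a disjunctive constraint on their
# start times; slack measures the freedom left in placing a task.

def disjunctive_ok(s_i, d_i, s_j, d_j):
    # (S_i + D_i <= S_j) OR (S_j + D_j <= S_i): one task fully before the other
    return s_i + d_i <= s_j or s_j + d_j <= s_i

def slack(est, lft, d):
    # slack_i = LFT_i - EST_i - D_i
    return lft - est - d

print(disjunctive_ok(0, 2, 2, 3))  # True: first task ends exactly as second begins
print(disjunctive_ok(1, 2, 2, 3))  # False: the two executions overlap
print(slack(0, 6, 2))              # 4: the start can fall anywhere in 0..4
```

Note that the disjunction makes the constraint non-convex, which is exactly what makes single-capacity JSS hard despite the small variable count.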
Here there is one obvious problem, rooted in the very choice of Si as the CSP variable. By modeling the problem as above, we are forced to treat time as discrete values, and the difficulty of the problem may depend on the time unit we commit to. Say a task has to start after 7 AM and finish before 1 PM. What is the difference between saying the task takes 1 hour and saying it takes 1 hour 00 minutes? In the second case we have chosen the minute as our time unit, which automatically widens the domain of Si by a factor of 60 and significantly hardens the problem.

This disadvantage led to the development of a second approach to encoding JSS, called the PCP model, where we do not care much about the exact starting time of each task, only the relative order between tasks. The details of the encoding are as follows.

Variables:
1. One boolean variable i_bef_j for each pair of tasks Ti and Tj. Basically, i_bef_j = TRUE iff task i is executed before task j (Si + Di <= Sj).
2. Two numerical variables SAi and EBi for each task, meaning "task i starts at or after" and "task i ends at or before". These variables take on time-point values.

Given this, we can represent the various JSS constraints as follows:
1. Precedence constraint between Ti and Tj ==> i_bef_j = True.
2. Resource contention between Ti and Tj (single-capacity) ==> i_bef_j V j_bef_i.
3. Release time constraint (Ti can't start before time t) ==> SAi >= t.
4. Deadline constraint (Ti can't end after time t) ==> EBi <= t.

Now we have fully specified two types of encodings for the JSS problem. It's worth noticing that they roughly correspond to the state-space and plan-space (causal) encodings in classical planning.
In the starting-time-based JSS encoding, we can think of each timepoint as a state, where we check the consistency of the load on each machine; this method also lets us know exactly what each machine's load is at every moment. In the second method there is no notion of timepoints anymore: we don't know until the very end how the machines are used at each moment, but we get a more flexible encoding.

Based on what we have learned, we might think the second approach would be harder to solve, since it has on the order of n^2 variables compared with n in the first approach. However, Smith essentially says it is actually easier to solve (unlike in classical planning). So why is it better?? Rao said that in a normal scheduling problem, the exact starting time of each task only becomes known very late in the scheduling process, so if we commit to starting times too soon (as in the first approach) for a partial schedule, there is a greater chance of running into an inconsistency later and having to backtrack. The ordering (precedence) relations, on the other hand, tend to settle down long before the actual starting times become decidable. For this reason, the second approach is normally better. The second reason is that popular CSP techniques like arc-consistency seem to do much better in the second approach than in the first; they can prune a large number of partial assignments. I personally think the poor performance of the first approach may also result from the fact that the starting-time domains are too big for many problems. Biplav's experiments show that there are many problems with only a very small number of variables where increasing the domain sizes several times makes the problem really hard to solve.
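One way to see why consistency enforcement pays off in the ordering-based encoding: every time an ordering i_bef_j is posted, both tasks' time windows can be tightened, and an emptied window signals a dead end long before any start time is fixed. The sketch below is my own minimal illustration of that idea (the `tighten` helper and the instance are made up, not an actual arc-bounds algorithm):

```python
# Sketch: posting the ordering i_bef_j tightens both [EST, LFT] windows.
# If i runs before j, then j cannot start before EST_i + D_i, and i must
# finish by LFT_j - D_j to leave room for j.

def tighten(win, dur, i, j):
    """win: {task: [EST, LFT]} (mutated in place); dur: {task: D}.
    Returns False if some window becomes infeasible (EST > LFT - D)."""
    est_i, lft_i = win[i]
    est_j, lft_j = win[j]
    win[j][0] = max(est_j, est_i + dur[i])   # raise j's earliest start
    win[i][1] = min(lft_i, lft_j - dur[j])   # lower i's latest finish
    return all(w[0] <= w[1] - dur[t] for t, w in win.items())

win = {"a": [0, 10], "b": [0, 10]}
dur = {"a": 4, "b": 5}
print(tighten(win, dur, "a", "b"), win)  # windows shrink but stay feasible
```

A second, conflicting posting (say b_bef_a on the same instance) would drive a window negative and prune the whole branch, which is the kind of early cut the start-time encoding cannot make.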
Finally, just 5 minutes before the class ended, Rao gave us some bonus points (his personal experience after reading hundreds of scheduling papers) about scheduling:

** ALL AI SCHEDULERS ARE: **
-> A bit of encoding hack.
-> A bit of CSP customization:
   -->> A bit of new consistency enforcement.
   -->> Specific variable/value orderings.

+ For the encoding hack, we tasted a little bit of it when discussing the two different types of encodings.
+ For the CSP customization, the scheduling community has some special techniques, such as:
  * Arc-bounds consistency: exploits the structure of the values in each variable's domain (the possible starting times of a task are not unrelated values).
  * Demand profiles + contention/slack times: exploit the fact that the demand on each machine depends on the starting times, ESTs, and LFTs of the tasks that use it. From this we can derive a cheap but good variable-ordering heuristic, which basically says to schedule the tasks on the machine that will likely be busiest first, so that if we have to backtrack, it happens near the top of the search tree.
  * Texture measures.

THE END !! Whew.......
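The demand-profile heuristic can be caricatured in a few lines. The crude measure below (a task with duration Di and window [ESTi, LFTi] contributes Di / (LFTi - ESTi) average demand to its machine) and the instance are my simplifications for illustration, not the actual profile computation from the literature:

```python
# Sketch of contention-based machine ordering: sum each task's average
# demand D_i / (LFT_i - EST_i) per machine, then branch on the most
# contended machine first so likely backtracks happen high in the tree.

def contention(tasks):
    """tasks: {name: (duration, EST, LFT, machine)} -> {machine: demand}."""
    demand = {}
    for d, est, lft, m in tasks.values():
        demand[m] = demand.get(m, 0.0) + d / (lft - est)
    return demand

tasks = {
    "t1": (4, 0, 8, "mill"),   # 0.5 average demand on the mill
    "t2": (4, 0, 8, "mill"),   # mill total: 1.0 -> fully contended
    "t3": (2, 0, 8, "lathe"),  # lathe total: 0.25 -> lots of slack
}
c = contention(tasks)
order = sorted(c, key=c.get, reverse=True)
print(order)  # most contended machine first
```

The point of the heuristic is exactly what Rao said: resolving the busiest machine's conflicts first keeps the expensive mistakes near the root of the search tree.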