Foundations of Automated Planning and Scheduling
Subbarao Kambhampati
Department of Computer Science and Engineering
Arizona State University
http://rakaposhi.eas.asu.edu/rao.html
A textbook to be published in the Prentice Hall Series in Artificial Intelligence
Outline of chapters
Chapter 1: Introduction and overview
· Timeliness of the textbook
· General characterization of planning
· Classifications of planning
· Sub-classes of planning problems based on environment types and agent capabilities
· Deterministic vs. stochastic
· Static vs. dynamic
· Multi-agent/distributed planning
· Accessible vs. partially accessible/inaccessible
· Discrete vs. continuous
· Types of plans
· Sequential plans, concurrent plans, conditional plans, policies
· Sub-classes of planning based on differing expectations of the planner
· Planning vs. scheduling
· Automated vs. mixed-initiative planning
· Sub-classes of planning based on inputs[C1]
· Action specification, initial and goal states
· Fully domain-independent
· + additional domain knowledge
· Domain-customized planners such as HTN planners, SHOP/TLPlan etc.
· + actions that must be part of the plan
· Scheduling
· A discussion of realized and potential applications of planning and scheduling
· Focus of the textbook
· Classical planning assumptions
· Deterministic, static and accessible world + instantaneous actions; no distinction between preconditions and resources
· Part I will be a comprehensive treatment of classical planning approaches:
· A comprehensive treatment of conjunctive and disjunctive refinement planning approaches
· A comprehensive treatment of heuristic control of conjunctive and disjunctive planners
· A discussion of domain-customization or knowledge-based approaches for controlling refinement planners
· Part II will discuss the extension of refinement planning techniques to problems with durative actions and metric resources.
· Part III will cover approaches for handling domains where the classical assumptions do not hold:
· Handling partially accessible environments (conditional and conformant planning)
· Handling stochastic environments
· Handling dynamic environments (reactive planning; multi-agent planning)
Part I. Classical Planning
Chapter 2. Modeling Classical Planning Problems
· Modeling classical planning
· Action representations
· Situation-calculus representation
· Frame and successor-state axioms
· ADL representation
· STRIPS (and positive-precondition STRIPS) representations as special cases
· Compactness of representations (using conditional-effect conversion as an example)
· Alternative representations
· The Add/Delete representation (and the “goals must be +ve” hangover)
· Backstrom’s representation; HSTS representation
· The PDDL standard
· Illustration of the language with an example domain
· Complexity of classical planning
· Discussion based on Erol/Nau, Bylander and Backstrom
· Special attention to which aspects do and do not make planning hard
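To make the set-based STRIPS view concrete, here is a minimal sketch of a ground STRIPS action and its application by progression. The representation and names (`Action`, `progress`, the blocks-world propositions) are illustrative, not taken from any particular planner:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    """A ground STRIPS action over sets of propositions (strings)."""
    name: str
    pre: frozenset     # preconditions that must hold before execution
    add: frozenset     # propositions the action makes true
    delete: frozenset  # propositions the action makes false

def progress(state, action):
    """Progression: apply an action to a complete state."""
    if not action.pre <= state:
        raise ValueError(f"{action.name} is not applicable")
    return (state - action.delete) | action.add

# A toy blocks-world-style action
stack_ab = Action("stack(A,B)",
                  pre=frozenset({"holding(A)", "clear(B)"}),
                  add=frozenset({"on(A,B)", "clear(A)", "handempty"}),
                  delete=frozenset({"holding(A)", "clear(B)"}))

s0 = frozenset({"holding(A)", "clear(B)"})
s1 = progress(s0, stack_ab)
# s1 == {"on(A,B)", "clear(A)", "handempty"}
```

The compactness discussion above can be read against this sketch: ADL features such as conditional effects have no direct counterpart here and must be compiled away, generally at a cost in representation size.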
Chapter 3. Refinement Planning Framework
· The many brands of classical planners, and the need for a unifying view
· Overview of refinement planning—candidate-set semantics
· Conjunctive vs. disjunctive refinement planners
· Conjunctive refinement planners
· Partial plans: semantics[C2]
· Proof/verification strategies
· Progression proof
· Regression proof
· Causal proof
· Refinement strategies
· Completeness, progressiveness, systematicity
· A general template for refinement planning
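The regression proof strategy listed above can be sketched for a simple set-based STRIPS model. This is a hypothetical minimal formulation (the dict-based action encoding and function names are illustrative): a plan is verified by regressing the goal set back through the actions and checking that the result holds in the initial state.

```python
def regress(goals, action):
    """Regress a goal set through an action: the weakest condition
    that, holding before the action, guarantees `goals` after it.
    Returns None if the action deletes (clobbers) one of the goals."""
    if goals & set(action["delete"]):
        return None
    return (goals - set(action["add"])) | set(action["pre"])

def verifies_by_regression(plan, init, goals):
    """Regression proof: the plan is valid iff the goals, regressed
    through the plan back-to-front, are contained in the initial state."""
    g = set(goals)
    for a in reversed(plan):
        g = regress(g, a)
        if g is None:
            return False
    return g <= set(init)

# A toy logistics-style example
load = {"pre": {"at(obj,A)", "at(truck,A)"},
        "add": {"in(obj,truck)"}, "delete": {"at(obj,A)"}}
drive = {"pre": {"at(truck,A)"},
         "add": {"at(truck,B)"}, "delete": {"at(truck,A)"}}

ok = verifies_by_regression([load, drive],
                            init={"at(obj,A)", "at(truck,A)"},
                            goals={"in(obj,truck)", "at(truck,B)"})
# ok == True; the reversed plan [drive, load] fails the proof
```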
Chapter 4. Conjunctive Refinement Planners
· Refinement strategies for conjunctive refinement planners
· State-space refinements
· Forward state-space refinement (progression)
· Goal-directed state-space refinements[C3] (regression)
· Plan-space refinement
· Selection, establishment, de-clobbering and book-keeping
· Tradeoffs among refinements
· Position, relevance and commitment
· Connection to proof strategies
· A general template for conjunctive refinement planning
· Design choices and issues in instantiating the template
· Single-refinement planners
· Interleaving multiple refinements
· Case studies of specific single-refinement planners
· Progression and regression planners
· UCPOP—a partial-order planner
· (We shall discuss how the specific UCPOP algorithm is but a simplification of a particular instantiation of conjunctive refinement planners.)
Chapter 5. Disjunctive Refinement Planners
· The idea of and motivation behind disjunctive planners
· “Lifted” search space; avoids rediscovering the same failures multiple times
· Issues in disjunctive refinement planning
· Representing disjunctive plans
· Generalizing conjunctive refinements to work on disjunctive plans without loss of progressiveness[r4]
· Extracting solutions from disjunctive plans
· Generalizing progression refinement to work with disjunctive plans
· (We shall present a 4-stage approach for “re-discovering” Graphplan’s plan-graph expansion phase. The idea is to characterize planning-graph expansion and mutex propagation as a form of consistency enforcement that improves the progressiveness of the progression refinement applied to disjunctive plans.)
· Open issues in disjunctive refinement
· Solution extraction from disjunctive plans
· Direct extraction strategies
· Graphplan’s backward search
· Improvements to Graphplan’s backward search (non-directional and non-systematic search strategies)
· Compilation strategies
· Compilation to CSP; issues
· Compilation to SAT; issues
· Compilation to integer programming; issues
· Compilation to BDDs; issues
· Comparison of compilation substrates
· Comparing direct and compilation strategies for solution extraction
· (*) Bounded-length plan generation as a model-finding problem
(I may decide to convert this into a separate chapter.)
· Basic idea: set up a bounded-length disjunctive structure; search for a substructure (model) that corresponds to a valid plan[C5]
· Proof strategies (chapter 3) as the basis for searching for valid plans
· Encodings based on progression and regression proofs (aka state-space encodings)
· Encodings based on causal proof (aka causal encodings)
· Techniques for simplifying encodings (e.g. term splitting etc.)
· Tradeoffs among encodings based on different proof strategies
· Comparison to integer-programming encodings
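As a stripped-down instance of Graphplan’s expansion phase, the sketch below grows a *relaxed* planning graph—ignoring delete effects and mutex propagation—until it levels off, recording the first level at which each proposition can appear. The dict-based action encoding and the logistics propositions are illustrative assumptions, not Graphplan’s actual data structures:

```python
def build_level_info(init, actions):
    """Relaxed planning-graph expansion. Actions are dicts with
    'pre' and 'add' sets; deletes and mutexes are ignored.
    Returns {proposition: first level at which it can hold}."""
    level = {p: 0 for p in init}
    props, k = set(init), 0
    while True:
        new = set()
        for a in actions:
            if set(a["pre"]) <= props:
                new |= set(a["add"]) - props
        if not new:            # the graph has leveled off
            return level
        k += 1
        for p in new:
            level[p] = k
        props |= new

acts = [
    {"pre": {"at(obj,A)", "at(truck,A)"}, "add": {"in(obj,truck)"}},   # load
    {"pre": {"at(truck,A)"}, "add": {"at(truck,B)"}},                  # drive
    {"pre": {"in(obj,truck)", "at(truck,B)"}, "add": {"at(obj,B)"}},   # unload
]
levels = build_level_info({"at(obj,A)", "at(truck,A)"}, acts)
# levels["at(truck,B)"] == 1, levels["at(obj,B)"] == 2
```

Adding mutex propagation on top of this loop is what turns the level information from a pure reachability estimate into the tighter consistency enforcement discussed above.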
Chapter 6. Handling conditional and quantified effects and preconditions[C6]
· Issues in handling conditional effects
· Regression planners
· Plan-space planners (confrontation etc.)
· Issues in handling quantified effects
· Issues in handling quantified (existential) goals/preconditions
Part II. Heuristics and Acceleration techniques for classical planners
Chapter 7. Heuristics for conjunctive planners
· Controlling conjunctive state-space planners
· Estimating the distance-to-go of a state
· Set-difference heuristic as a first idea
· Explaining the problem with the set-difference heuristic in terms of its ignorance of positive and negative interactions
· Discussion of sub-goal interactions
· Deriving heuristics from planning graphs
· Level information from simple serial planning graphs
· Level information from planning graphs with mutexes
· Improving informedness with higher-order mutexes
· Sacrificing admissibility for effectiveness
· Sum heuristics, partition heuristics, combination heuristics
· Heuristics based on extraction of relaxed plans
· Controlling the cost of heuristic computation
· Strategies for limiting planning-graph expansion
· Limiting planners’ search to action instances from the planning graph
· Comparing the effectiveness of planning-graph-based heuristics for progression vs. regression planners
· Regression planners—can use one planning graph for the entire planning episode; since the search is in partial and potentially inconsistent states, they need mutex and higher-order mutex analysis to handle negative interactions
· Progression planners—need to compute the planning graph once for each node expansion; however, the search is in the space of complete and consistent states, and thus relaxed planning graphs that ignore negative interactions provide sufficient guidance
· Remarks on serial vs. parallel planning graphs and optimal sequential vs. optimal parallel planning
· Other approaches for deriving heuristics
· Interpreting planning graphs as relaxed dynamic programming
· Connections to the work of Geffner
· Direct application of pattern-database techniques
· Heuristics for conjunctive plan-space planners
· Adapting distance-based heuristics to plan-space planners
· Detecting hidden inconsistencies using mutex relations
· Disjunctive conflict-resolution techniques
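The set-difference heuristic that opens this chapter amounts to counting the top-level goals not yet true in a state. A minimal sketch (states and goals are illustrative), with its interaction blindness noted:

```python
def set_difference_h(state, goals):
    """h(S) = number of top-level goals not true in S.
    Ignores positive interactions (one action may achieve several
    goals at once) and negative interactions (achieving one goal may
    undo another), so it can both over- and under-estimate."""
    return len(set(goals) - set(state))

goals = {"on(A,B)", "on(B,C)", "clear(D)"}
s1 = {"on(A,B)", "clear(D)"}
s2 = {"clear(D)"}
# set_difference_h(s1, goals) == 1, set_difference_h(s2, goals) == 2,
# yet the true distances may be anything: s1 may be farther from the
# goal than s2 if on(A,B) must be undone to achieve on(B,C).
```

The planning-graph level heuristics in this chapter can be read as successive repairs of exactly these blind spots.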
Chapter 8. Control strategies for disjunctive state-space planners
· Techniques for improving Graphplan’s direct search of the planning graph
· Compacting the planning graph (the bi-level representation)
· Variable-ordering heuristics based on distance heuristics
· Generalizing memoization into full-fledged nogood learning
· Relation to normal nogood-learning techniques in CSP
· Alternative search strategies for extracting valid plans from planning graphs
· Non-directional search
· Local search
· Techniques for improving compiled encodings
· Splitting, factoring etc. to reduce the number of variables
· Exploiting implicit constraints to reduce the size of CSP encodings
· Combining multiple constraints to derive tighter cutting-plane constraints in IP encodings
Chapter 9. Planner-independent Pre-processing techniques
· Domain simplification techniques
· Filtering irrelevant actions using planning graphs
· Filtering irrelevant actions using RIFO-style techniques
· Simplification based on type analysis and type inference
· Invariant detection techniques
· A description/survey of invariant detection techniques
· To be distilled from the work of Fox & Long; Rintanen; Gerevini & Schubert
· Discussion of how detected invariants can be used
· Detection of forced goal orderings[C7]
· Detection of islands
Chapter 10. Manual Domain Customization Techniques
· Domain customization as an “any-expertise” middle ground between domain-independent and domain-customized planners
· Types of knowledge/guidance that are worth adding
· Source of the knowledge (learned vs. manual input)
· Possible ways of representing the domain knowledge
· As “task reduction schemas”, as “temporal logic constraints on the state sequences”, as invariants etc.
· Evaluating the ease and naturalness of expressing domain expertise in the representation
· Possible ways of using the domain knowledge
· Passive (as a filter) vs. active (folding in the domain-specific knowledge)
· Issues in evaluating domain-customized planners
· The need to distinguish between evaluating the planner and evaluating the domain-expertise writer
· Case study: task decomposition (HTN) planning
· Syntax: non-primitive tasks, reduction schemas
· Semantics: grammar of desirable solutions
· Schema vs. planner completeness
· Partial hierarchicalization
· Schema validation
· Using task-reduction schemas
· Handling non-primitive tasks as part of plan refinement
· Discussion of issues such as phantomization etc.[C9]
· Bottom-up HTN planning
· The idea of generating parse trees for a partial plan in terms of the task-reduction schemas
· Use of task-reduction schemas in the context of state-space planners
· A brief discussion of SHOP
· Use of bottom-up HTN planning in disjunctive planners (Mali’s work)
· Case study: TLPlan
· Syntax: temporal-logic-based search control rules
· Efficient incremental interpretation of temporal logic rules[C10]
Chapter 11. Learning techniques for planning (*optional*)
I am not sure yet whether I will have this chapter—given that most of this work has been done in the context of older planners and has not[C11] yet been shown to be of sufficient value in current-generation planners.
· Background on learning techniques to speed up planners
· Learning abstractions
· Distillation of techniques such as ALPINE, PABLO and HiTime
· Learning search-control rules
· Distillation of techniques from Prodigy, UCPOP+EBL etc.
· Reusing plans
· Distillation of techniques from macro-operator learning, plan reuse (PRIAR) and replay (Prodigy/Analogy; DerSNLP)
Part III. Handling Time and Resources[r12]
Chapter 12. Action representations and Solution criteria
· General overview
· Issues in extending classical planning with durative actions
· How time is represented: point vs. interval
· How effects and preconditions of an action are distributed over its duration
· Deadline goals
· Multiple solution-quality criteria
· Issues in extending classical planning with resources
· Resources may be discrete as well as continuous → need for handling continuous quantities
· Actions may use, produce or consume resources
· Issues in combining resources and time
· How resource levels are modeled (at end points vs. over the action’s duration)
· How resource levels vary
· Linear vs. nonlinear variation
· Representational issues
· Point vs. interval representations of time[r13]
· The PDDL 2 standard (with an explanation of the levels)
· Multiple dimensions of solution quality
· Timeliness metrics
· Makespan
· Tardiness
· Flexibility metrics
· E.g. slack in the plan
· Cost metrics
· Cumulative cost of the actions
· Quantity of resources consumed
· Interactions between the quality metrics
· E.g. lower-cost plans may have higher makespan etc.
· Subclasses[r14] of metric temporal planning[r15]
· Based on domain model
· + durative actions
· + resources
· + resources + durative actions
· Based on input
· Standard planning
· Standard scheduling (action selection is part of the input)
· Between planning and scheduling
Chapter 13. Scheduling
I may wind up splitting this into two chapters.
· Differentiating scheduling from planning
· The jobshop scheduling problem (JSP)
· Classes of JSP problems (e.g. single- vs. multi-capacity resources)
· Jobshop scheduling as a CSP
· Start-point representation
· Precedence-constraint (PCP) representation
· Compare the start-point representation to state-space approaches, and the PCP representation to plan-space approaches
· Specialized constraint propagation techniques for JSP
· Arc-bounds consistency
· Edge finding
· Energy minimization
· Specialized variable-ordering heuristics
· Contention-based heuristics
· Slack-based heuristics
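The PCP view above can be made concrete with a small sketch: once the machine-ordering decisions are fixed as precedence edges, the earliest start times (and hence the makespan of the schedule) follow from a longest-path computation over the resulting DAG. The task names and instance below are illustrative:

```python
from collections import defaultdict

def earliest_starts(durations, precedences):
    """durations: {task: duration}; precedences: list of (before, after).
    Returns ({task: earliest start}, makespan), assuming the precedence
    graph is acyclic (i.e. a consistent PCP solution)."""
    succs = defaultdict(list)
    indeg = {t: 0 for t in durations}
    for a, b in precedences:
        succs[a].append(b)
        indeg[b] += 1
    est = {t: 0 for t in durations}
    queue = [t for t in durations if indeg[t] == 0]
    while queue:                       # topological longest-path sweep
        t = queue.pop()
        for s in succs[t]:
            est[s] = max(est[s], est[t] + durations[t])
            indeg[s] -= 1
            if indeg[s] == 0:
                queue.append(s)
    makespan = max(est[t] + durations[t] for t in durations)
    return est, makespan

# Two jobs on two shared machines, ordering decisions already taken:
dur = {"j1m1": 3, "j1m2": 2, "j2m1": 2, "j2m2": 4}
prec = [("j1m1", "j1m2"), ("j2m1", "j2m2"),   # job orders
        ("j1m1", "j2m1"), ("j1m2", "j2m2")]   # machine orders
est, ms = earliest_starts(dur, prec)
# est["j2m2"] == 5, ms == 9
```

In the start-point representation, by contrast, the search assigns concrete times directly; the PCP representation commits only to orderings and leaves the times to propagation, mirroring the plan-space/state-space contrast noted above.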
Chapter 14. Planning with Durative Resource-consuming Actions [C16]
· General issues in extending classical planning approaches to handle time and resources
· Revisiting[r17] plan-space/state-space tradeoffs in the context of metric/temporal planning
· State information helps resource reasoning; precedence relations help concurrency handling
· Forward-search planners[rao18]
· Discussion based on SAPA
· Techniques for improving temporal flexibility
· Plan-space planners
· Discussion based on ZENO/IXTET/HSTS
· Extending planning-graph-based heuristics for temporal planning
· Estimating cost and makespan functions separately
· Connections to multi-objective search
· Disjunctive approaches
· Temporal Graphplan
· MILP encodings
· De-coupled/loosely-coupled approaches
· Tradeoffs between coupled vs. monolithic approaches
· Postponing resource reasoning (LPSAT-style approach)
· Postponing scheduling considerations (RealPlan-style approach)
Part IV. Relaxing Classical Assumptions
Chapter 15. Handling Partially Accessible Environments[C19]
· Problem specification: state variables with unknown values. Variations:
· No actions available for sensing the missing state variables
· Conformant planning
· Sensing actions available for (at least some of) the missing variables
· Semantics of sensing actions
· Discussion based on SADL and the work of Levesque et al.
· Also remarks on purely causal actions (classical planning), purely sensing actions (database-style query planning) and causal/sensing actions
· Basic naïve approaches
· Plan → Sense → Execute
· Sense → Plan → Execute
· Integrating planning, sensing and execution
· Problems with the naïve approaches
· Complexity characterization (using Baral/Kreinovich)
· Implemented approaches
· Conjunctive planners
· Plan-space conditional planners
· Based on CNLP, CASSANDRA etc.
· Plan-space planners that interleave sensing and execution
· Discussion based on SADL/Puccini
· Disjunctive planners
· Conformant and Sensing Graphplan algorithms
· BDD planners
· QBF encodings (Rintanen)
Chapter 16. Handling Stochastic Environments
· Problem specification: actions have uncertain outcomes; objective: maximize (or guarantee a certain) probability of goal satisfaction
· Action representations
· Actions transforming state distributions to state distributions
· Syntax based on 2-TBNs etc.
· Complexity characterization (using Littman’s JAIR paper)
· Extending conjunctive planners
· Discussion using Buridan
· Extending disjunctive planners
· Maxplan approach (E-MAJSAT encodings)
· Pgraphplan approach
Chapter 17. Handling Dynamic (multi-agent) Environments
· Problem specification: the world evolves during the agent’s planning cycle
· Variations
· The other agents in the world cannot be modeled → dynamic planning
· Remark on how a dynamic environment may look like a stochastic environment from the agent’s perspective; thus some overlap in approaches
· The other agents in the world are modeled explicitly → multi-agent planning
· Approaches for dynamic planning
· Plan → Monitor → Replan
· Monitoring plans (triangle tables etc.)
· Replanning strategies
· Algorithms for replanning in conjunctive (plan-space) planners
· Discussion based on PRIAR, SPA, IPEM etc.
· Monitor → React → Learn
· Incremental policy construction
· (Discussion based on Drummond et al.’s ERE, Dean/Kaelbling’s envelope-extension techniques, and universal plans)[C20]
· Overview of multi-agent planning approaches (a fairly new area)
Chapter 18. Decision-theoretic planning as a general normative framework for planning
I am still not completely decided about the placement of this chapter in Part IV. One idea is to place it at the very beginning of Part IV—sort of the way I used refinement planning as the normative framework for classical planning in Parts I and II.
· Motivating decision-theoretic planning in the context of conflicting goals and stochastic environments
· Markov Decision Processes as a model for decision-theoretic planning
· Action representations
· Reward representations
· Value and policy functions
· Complexity of optimal policy construction
· Variations on MDPs (with complexity discussion)
· POMDPs
· Semi-MDPs
· MDPs with unknown action models (and reinforcement learning)
· Constructing optimal policies for FOMDPs
· Value and policy iteration algorithms
· Interpreting AI planning algorithms as approximate policy-construction techniques on factored MDPs (discussion based on the Boutilier/Dean/Hanks JAIR paper)
· Factored action representations
· Factored reward representations
· Discussion of how value and policy iteration algorithms change in the presence of factored representations
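The value-iteration algorithm listed above can be sketched for a flat (enumerated-state) MDP; the factored versions discussed in the chapter refine the same Bellman update. The encoding of `T` and `R` and the two-state example are illustrative assumptions:

```python
def value_iteration(states, actions, T, R, gamma=0.9, eps=1e-6):
    """T[s][a] -> list of (prob, s2); R[s][a] -> immediate reward.
    Iterates the Bellman update
        V(s) = max_a [ R(s,a) + gamma * sum_s' T(s,a,s') V(s') ]
    until convergence; returns (V, greedy policy)."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            q = {a: R[s][a] + gamma * sum(p * V[s2] for p, s2 in T[s][a])
                 for a in actions if a in T[s]}
            best = max(q.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            break
    policy = {s: max(T[s], key=lambda a: R[s][a] + gamma *
                     sum(p * V[s2] for p, s2 in T[s][a]))
              for s in states}
    return V, policy

# Two-state example: "go" reaches the rewarding state s1 with prob 0.8
states = ["s0", "s1"]
actions = ["go", "stay"]
T = {"s0": {"go": [(0.8, "s1"), (0.2, "s0")], "stay": [(1.0, "s0")]},
     "s1": {"stay": [(1.0, "s1")]}}
R = {"s0": {"go": 0.0, "stay": 0.0}, "s1": {"stay": 1.0}}
V, pi = value_iteration(states, actions, T, R)
# pi["s0"] == "go"; V["s1"] converges to 1/(1-gamma) == 10
```

Policy iteration differs only in alternating a full policy-evaluation solve with a greedy improvement step in place of the single max-backup per sweep.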
Appendices
1. CSP
1.1. Temporal CSP (TCSP)
2. IP
Topics I haven’t yet placed anywhere in the TOC: mixed-initiative planning, connections to the reasoning-about-actions/change literature, plan recognition; discussion of handling combinations of classical-planning assumption relaxations (e.g. stochastic and incomplete and durative actions etc.); plan-based interfaces.
[C1]Look at Selman’s paper
[C2]Do we keep the detailed handling of conditional effects etc separate??
[C3]Do we just define the refinement strategies or the actual planners also? Have to avoid the tendency to talk about unification too much—since it is a textbook, one concrete example planner should be given for each of the techniques before we go on to how many other planners fall under the same rubric.
[r4]How to work in the LCGP stuff??
[C5]This is going to be tricky… we need to consider how to point out that simplifications of the encodings may depend on the type of substrate—e.g. the kind of simplifications supported by IP encodings.
[C6]Do we discuss the conditional and quantified effects first in the representation section before coming here? Or do we talk about it for the first time here?
[C7]How do we use these in partial order planners?
[C8]****I would need to give some actual algorithm templates rather than do dry top-level discussion.
[C9]We need to consider SHOP thing…
[C10]May be a short digression into SAT-based domain-specific invariants etc.
[C11]Where do we put stuff about Quality considerations… then again why talk about quality considerations in classical planning
[r12]Just some quick comments on your Part III. I'm not sure if it would be better, but I would approach the problem like this:
In classical planning, we ignore time and assume that actions do not use resources. Time and resources are independent aspects involved in most real-world applications. There are planners that extend classical planning to handle only time, or only resources, or both. When we extend, problems that do not exist in classical planning arise:
- Time: we cannot assume a single time point for actions, so:
* Actions with different durations
* How to represent the logical (true/false) preconditions and effects of each durative action: time point & interval (protection/holding/changing effect value)
* Goal deadlines
* Concurrency problems of durative actions (temporal mutex)
* State of the world (there are states where the value of some proposition may be unknown—like the position of a robot in the middle of a travelling action)
- Resources:
* Resources are not logical (T/F) but can be discrete or continuous
* Actions can use/produce/consume resources
* How to represent resource-related preconditions (constraints) and effects (changes)
- Combining time & resources:
* How resources change over an interval → the changing process as a linear or more complicated process
* Concurrency regarding the resource aspects
→ The problems listed above lead to:
- Different levels of action representation: levels 1, 2, 3, 4, 5 in PDDL+, with increasing order of complexity in handling the problems listed above
- Different optimization criteria: time-related/resource-related/multi-objective
- Different ways of handling those constraints in each planning framework, and the difficulties with each (forward/backward/POP) (chapter 14)
I think chapter 14 should be fine. Not as elaborate as the chapters for classical planning, but we can go into more detail later.
I just wonder if Scheduling needs more than one chapter in your book ;) (because it stands side-by-side with planning in the title)
[r13]Talk about TCSP etc here??
[r15]Need more structure here…
[C16]HOW DO WE DECIDE WHERE TO PUT STUFF LIKE TGP (which only handles durative actions) and Wolfman’s stuff which only handles resources—Do we consider these extensions one by one—or just ignore them and consider the whole thing together?
[r17]*****GEFFNER’s UNIFIED FRAMEWORK HERE??
[rao18]Should I have separated handling time from handling resources?
[C19]Where do we put the complexity classifications?
[C20]Problem: Dynamic and Stochastic environments may look alike…