Foundations of Automated Planning and Scheduling

 

 

Subbarao Kambhampati

Department of Computer Science and Engineering

Arizona State University

http://rakaposhi.eas.asu.edu/rao.html

 

A Textbook to be published by

Prentice Hall Series in Artificial Intelligence

 

 

Outline of chapters

 

Chapter 1: Introduction and overview

·                    Timeliness of the textbook

·                    General characterization of planning

·                    Classifications of planning

·                    Sub-classes of planning problems based on environment types and agent capabilities

·                    Deterministic vs. Stochastic

·                    Static vs. Dynamic

·                    Multi-agent/Distributed planning

·                    Accessible vs. partially accessible/inaccessible

·                    Discrete vs. Continuous 

·                    Types of Plans

·                    Sequential plans, concurrent plans, conditional plans, policies

·                    Sub-classes of planning based on differing expectations on the planner

·                    Planning vs. scheduling

·                    Automated vs. Mixed-initiative planning

·                    Sub-classes of planning based on Inputs[C1] 

·        Action Specification, initial and goal states

·        Fully domain independent

·        + additional domain knowledge

·        Domain customized planners such as HTN planners, SHOP/TLPlan etc.

·        + actions that must be part of the plan

·        Scheduling

·                    A discussion of realized and potential applications of planning and scheduling

·        Focus of the textbook

·        Classical Planning Assumptions 

·        Deterministic, static and accessible world + instantaneous actions, no distinction between preconditions and resources

·        Part I will be a comprehensive treatment of classical planning approaches.

·                      A comprehensive treatment of conjunctive and disjunctive refinement planning approaches

·                      A comprehensive treatment of heuristic control of conjunctive and disjunctive planners

·                      A discussion of domain-customization or knowledge-based approaches for controlling refinement planners.

·                    Part II will discuss the extension of the refinement planning techniques to problems with durative actions and metric resources.

·                    Part III will cover approaches for handling domains where classical approaches do not hold

·                    Handling partially accessible environments (conditional and conformant planning)

·                    Handling stochastic environments

·                    Handling dynamic environments (reactive planning; multi-agent planning)

 

 

Part I. Classical Planning

 

Chapter 2. Modeling Classical Planning Problems

 

·                    Modeling Classical Planning

·                    Action representations

·                    Situation-calculus representation

·                    Frame and Successor-state axioms

·                    ADL representation

·                    STRIPS (and positive-precondition STRIPS) representations as special cases

·                    Compactness of representations (using conditional effect conversion as an example)

·        Alternative representations

·        The Add/Delete representation (and the “goals must be positive” hangover)

·        Backstrom’s representation; HSTS representation

·        The PDDL standard

·        Illustration of the language with an example domain.

·        Complexity of classical planning

·        Discussion based on Erol/Nau, Bylander and Backstrom

·        Special attention to what aspects do and do not make planning hard
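The STRIPS representation listed above admits a very small operational sketch. The following Python snippet is illustrative only (the blocks-world predicates and the `unstack` action are invented, not taken from this outline); it shows a positive-precondition STRIPS action with its applicability test and progression semantics:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    """A STRIPS action: preconditions, add list, delete list (sets of ground atoms)."""
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset

def applicable(action, state):
    # STRIPS applicability: every (positive) precondition holds in the state.
    return action.pre <= state

def apply(action, state):
    # Progression semantics: remove delete-list atoms, then add add-list atoms.
    return (state - action.delete) | action.add

# Illustrative blocks-world action (hypothetical predicates):
unstack_A_B = Action(
    name="unstack(A,B)",
    pre=frozenset({"on(A,B)", "clear(A)", "handempty"}),
    add=frozenset({"holding(A)", "clear(B)"}),
    delete=frozenset({"on(A,B)", "clear(A)", "handempty"}),
)

s0 = frozenset({"on(A,B)", "ontable(B)", "clear(A)", "handempty"})
s1 = apply(unstack_A_B, s0)
# s1 now contains holding(A) and clear(B), and no longer on(A,B)
```

The frozen-set encoding also makes the compactness discussion concrete: a conditional effect cannot be expressed in this representation without compiling the action into multiple unconditional variants.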

 

Chapter 3. Refinement Planning Framework

 

·                    The many brands of classical planners; and the need for a unifying view.

·                    Overview of Refinement Planning—Candidate set semantics

·                    Conjunctive vs. disjunctive refinement planners

·                    Conjunctive Refinement Planners

·                    Partial Plans: Syntax

·                    Partial Plans: Semantics[C2] 

·                    Proof /Verification Strategies

·                    Progression proof

·                    Regression proof

·                    Causal proof

·                    Refinement Strategies

·                    Completeness, progressiveness, systematicity

·                    A general template for Refinement Planning

 

 

Chapter 4. Conjunctive Refinement Planners

 

·                    Refinement strategies for conjunctive refinement planners

·                    State-space refinements

·                    Forward State-space Refinement (progression)

·                    Goal-directed State-space Refinements[C3] (regression) 

·        Plan space refinement

·        Selection, establishment, de-clobbering and book-keeping

·        Tradeoffs among refinements.

·        Position, Relevance and Commitment

·        Connection to proof strategies

·        A general template for conjunctive refinement planning

·        Design choices and issues  in instantiating the template

·        Single refinement planners

·        Interleaving multiple refinements

·        Case studies of specific single refinement planners

·        Progression and Regression planners

·        UCPOP—a Partial order planner

·          (We shall discuss how the specific UCPOP algorithm is but a simplification of a particular instantiation of conjunctive refinement planners).
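The forward and backward state-space refinements above can be contrasted in a few lines. This is a sketch under the simplest STRIPS assumptions (the `pickup` action and its predicates are hypothetical); note the relevance and consistency checks that regression adds:

```python
from collections import namedtuple

# Minimal STRIPS action for illustration: preconditions, add list, delete list.
Action = namedtuple("Action", "name pre add delete")

def progress(state, action):
    """Forward state-space (progression) step: successor state, or None if inapplicable."""
    if not action.pre <= state:
        return None
    return (state - action.delete) | action.add

def regress(goals, action):
    """Goal-directed (regression) step: the subgoals that must hold before the
    action so that `goals` hold after it; None if the action is irrelevant
    (adds no goal) or inconsistent (deletes a goal)."""
    if not (action.add & goals):
        return None
    if action.delete & goals:
        return None
    return (goals - action.add) | action.pre

pickup_A = Action("pickup(A)",
                  pre=frozenset({"ontable(A)", "clear(A)", "handempty"}),
                  add=frozenset({"holding(A)"}),
                  delete=frozenset({"ontable(A)", "handempty"}))

subgoals = regress(frozenset({"holding(A)"}), pickup_A)
# Regressing holding(A) through pickup(A) leaves exactly its preconditions.
```

Progression searches over complete, consistent states; regression searches over partial state descriptions, which is why the tradeoffs discussed above (position, relevance, commitment) differ between the two.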

 

Chapter 5. Disjunctive Refinement Planners

 

·        The idea of and motivation behind disjunctive planners

·        “lifted” search space; avoids rediscovering the same failures multiple times

·        Issues in disjunctive refinement planning

·        Representing disjunctive plans

·        Generalizing conjunctive refinements to work on disjunctive plans without loss of progressiveness[r4] 

·        Extracting solutions from disjunctive plans

·        Generalizing progression refinement to work with disjunctive plans

·        (we shall present a 4-stage approach for “re-discovering” Graphplan’s plan-graph expansion phase. The idea is to characterize planning graph expansion and mutex propagation as a form of consistency enforcement that improves the progressiveness of the progression refinement applied to disjunctive plans)

·        Open issues in disjunctive refinement

·        Solution extraction from disjunctive plans

·        Direct extraction strategies

·        Graphplan’s backward search

·        Improvements to Graphplan’s backward search (non-directional and non-systematic search strategies)

·        Compilation strategies

·        Compilation to CSP; Issues

·        Compilation to SAT; Issues

·        Compilation to Integer Programming; Issues

·        Compilation to BDD; Issues

·        Comparison of compilation substrates

·        Comparing direct and compilation strategies for solution extraction

·        (*) Bounded-length plan generation as a model-finding problem

              (I may decide to convert this into a separate chapter.)

·        Basic idea: Set up a bounded-length disjunctive structure; search for a substructure (model) that corresponds to a valid plan[C5]

·        Proof strategies (chapter 3)  as the basis for searching for valid plans

·        Encodings based on progression and regression proofs (aka State Space encodings)

·        Encodings based on causal proof (aka causal encoding)

·        Techniques for simplifying encodings (e.g. term splitting etc.)

·        Tradeoffs among encodings based on different proof strategies

·        Comparison to integer-programming encodings
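As a concrete anchor for the plan-graph discussion above, here is one much-simplified expansion step in Python. Only static interference mutexes between actions are computed (no proposition mutexes and no mutex propagation), and no-ops are modeled implicitly by carrying every proposition forward; the actions and predicates are made up for illustration:

```python
from collections import namedtuple
from itertools import combinations

Action = namedtuple("Action", "name pre add delete")

def interfere(a, b):
    """Static interference: one action deletes a precondition or add-effect of the other."""
    return bool(a.delete & (b.pre | b.add)) or bool(b.delete & (a.pre | a.add))

def expand_level(props, actions):
    """One plan-graph expansion step: the next proposition level plus the
    pairwise action mutexes (interference only) at this level."""
    applicable = [a for a in actions if a.pre <= props]
    next_props = set(props)                 # no-ops preserve every proposition
    for a in applicable:
        next_props |= a.add
    mutexes = {frozenset((a.name, b.name))
               for a, b in combinations(applicable, 2) if interfere(a, b)}
    return frozenset(next_props), mutexes

pick_A = Action("pickup(A)", frozenset({"clear(A)", "handempty"}),
                frozenset({"holding(A)"}), frozenset({"handempty"}))
pick_B = Action("pickup(B)", frozenset({"clear(B)", "handempty"}),
                frozenset({"holding(B)"}), frozenset({"handempty"}))

level0 = frozenset({"clear(A)", "clear(B)", "handempty"})
level1, mux = expand_level(level0, [pick_A, pick_B])
# Both pickups appear at level 1 but are mutex: each deletes the other's
# handempty precondition, so the disjunctive plan excludes their co-occurrence.
```

This is the sense in which mutex computation acts as consistency enforcement over the disjunctive plan: the level still represents many candidate plans, but fewer spurious ones.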

 

Chapter 6. Handling conditional and quantified effects and preconditions[C6] 

 

·        Issues in handling conditional effects

·        Regression planners

·        Plan-space planners (confrontation etc)

·        Issues in handling quantified effects

·        Issues in handling quantified (existential) goals/preconditions

 


 

Part II. Heuristics and Acceleration techniques for classical planners

 

Chapter 7. Heuristics for conjunctive planners

 

·        Controlling conjunctive state-space planners

·        Estimating the distance-to-go of a state

·        Set-difference heuristic as a first idea

·        Explaining the problems with the set-difference heuristic in terms of its ignorance of positive and negative interactions

·        Discussion of sub-goal interactions

·        Deriving heuristics from planning graphs

·        Level information from simple serial planning graphs

·        Level information from planning graphs with mutexes

·        Improving informedness with higher-order mutexes

·        Sacrificing admissibility for effectiveness

·        Sum heuristics, partition heuristics, combination heuristics

·        Heuristics based on extraction of relaxed plans

·        Controlling the cost of heuristic computation

·        Strategies for limiting the planning graph expansion

·        Limiting planners’ search to action instances from the planning graph

·        Comparing the effectiveness of planning graph based heuristics for progression vs. regression planners

·          Regression planners—can use one planning graph for the entire planning episode; since the search is in the space of partial and potentially inconsistent states, they need mutex and higher-order mutex analysis to handle negative interactions

·          Progression planners—need to compute the planning graph once for each node expansion. However, the search is in the space of complete and consistent states, and thus relaxed planning graphs that ignore negative interactions provide sufficient guidance.

·        Remarks on serial vs. parallel planning graphs and optimal sequential vs. optimal parallel planning

·        Other approaches for deriving heuristics

·        Interpreting planning graphs as relaxed dynamic programming

·        Connections to the work of Geffner

·        Direct application of pattern-database techniques

 

·        Heuristics for conjunctive plan-space planners

·      Adapting distance based heuristics to plan-space planners

·      Detecting hidden inconsistencies using mutex relations

·      Disjunctive conflict resolution techniques

 

 

Chapter 8. Control strategies for disjunctive state-space planners

 

·        Techniques for improving Graphplan’s direct search of planning graph

·      Compacting planning graph (the bi-level representation)

·      Variable-ordering heuristics based on distance heuristics

·      Generalizing memoization into a full-fledged nogood learning

·      Relation to normal nogood learning techniques in CSP

·      Alternative search strategies for extracting valid plans from planning graphs

·      Non-directional search

·      Local search

·        Techniques for improving compiled encodings

·        Splitting, factoring etc. to reduce the number of variables

·        Exploiting implicit constraints to reduce size of CSP encodings

·        Combining multiple constraints to derive tighter cutting-plane constraints in IP encodings

 

 

Chapter 9. Planner-independent Pre-processing techniques

 

·        Domain simplification techniques

·        Filtering irrelevant actions using planning graphs

·        Filtering irrelevant actions using RIFO-style techniques

·        Simplification based on type analysis and type inference

·        Invariant detection techniques

·        A description/survey of invariant detection techniques

·        To be distilled from the work of Fox & Long; Rintanen; Gerevini & Schubert

·        Discussion on how detected invariants can be used

·        Detection of forced goal orderings[C7] 

·        Detection of islands

 

 

Chapter 10. Manual Domain Customization Techniques

 

·        Domain customization as an “any-expertise” middle ground between domain-independent and domain-customized planners

·        Types of knowledge/guidance that are worth adding

·        Source of the knowledge (learned vs. manual input)

·        Possible ways of representing the domain knowledge

·        As “task reduction schemas”, as “temporal logic constraints on the state sequences”, as invariants etc.

·        Evaluating the ease and naturalness of expressing domain expertise in the representation

·        Possible ways of using the domain knowledge

·        Passive (as a filter) vs. Active (folding the domain-specific knowledge into the planner’s search)

·        Issues in evaluating domain-customized planners

·        The need to distinguish between evaluating the planner and evaluating the domain-expertise writer

 

·        Case Study: Task decomposition (HTN) Planning

·        Syntax: Non-primitive tasks, Reduction schemas

·        Semantics: Grammar of desirable solutions

·        Schema vs. Planner completeness

·        Partial hierarchicalization

·        Schema validation

·        Using task-reduction schemas

·        Top-down HTN Planning[C8] 

·        Handling non-primitive tasks as part of plan refinement

·        Discussion of issues such as phantomization etc. [C9] 

·        Bottom-up HTN Planning

·        The idea of generating parse trees for a partial plan in terms of the task reduction schemas

·        Use of task-reduction schemas in the context of state-space planners

·        A brief discussion of SHOP

·        Use of bottom-up HTN planning in disjunctive planners (Mali’s work)
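The "grammar of desirable solutions" semantics for reduction schemas can be illustrated with a toy top-down expansion. All task and schema names below are hypothetical, and real HTN planners interleave reduction with establishment and conflict resolution rather than enumerating every derivation as done here:

```python
# Reduction schemas: a non-primitive task maps to one or more ordered
# lists of subtasks (hypothetical travel domain).
schemas = {
    "travel": [["walk"],
               ["buy_ticket", "board", "ride"]],
}

def decompose(task, schemas):
    """Top-down expansion: replace non-primitive tasks until only primitives
    remain; returns every primitive task sequence derivable from `task`."""
    if task not in schemas:
        return [[task]]                     # primitive task: derives itself
    results = []
    for subtasks in schemas[task]:
        seqs = [[]]
        for t in subtasks:
            seqs = [s + tail for s in seqs for tail in decompose(t, schemas)]
        results.extend(seqs)
    return results

plans = decompose("travel", schemas)
# The schemas act as a grammar: only these two primitive sequences are
# "desirable" solutions, regardless of what else achieves the goal.
```

This also makes the schema-vs.-planner completeness distinction above visible: a primitive-level solution outside the grammar is simply not derivable, however valid it may be causally.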

·        Case Study: TL Plan

·        Syntax: temporal logic-based search control rules

·        Efficient incremental interpretation of Temporal logic rules[C10] 

 

Chapter 11. Learning techniques for planning (*optional*)

I am not sure yet whether I will have this chapter—given that most of this work has been done in the context of older planners and has not [C11] yet been shown to be of sufficient value in current-generation planners.

 

·        Background on learning techniques to speed-up planners

·        Learning abstractions

·        Distillation of techniques such as ALPINE, PABLO and HiTime

·        Learning search control rules

·        Distillation of techniques from Prodigy, UCPOP+EBL etc.

·        Reusing plans

·        Distillation of techniques from Macro-operator learning, plan reuse (PRIAR) and replay (Prodigy/Analogy; DerSNLP)

 


 

 

Part III. Handling Time and Resources[r12] 

 

Chapter 12.  Action representations and Solution criteria

 

·        General overview

·        Issues in extending classical planning with durative actions

·        How time is represented: point vs. interval

·        How effects and preconditions of an action are distributed over its duration

·        Deadline goals

·        Multiple solution quality criteria

·        Issues in extending classical planning with resources

·        Resources may be discrete as well as continuous → need for handling continuous quantities

·        Actions may use, produce or consume resources

·        Issues in combining resources and time

·        How resource levels are modeled (at end points vs. over the action’s duration)

·        How resource levels vary

·        Linear vs. nonlinear variation

 

·        Representational issues

·        Point vs. Interval representations of time [r13] 

·        The PDDL 2 standard (with an explanation about the levels)

 

·        Multiple dimensions of solution quality

·        Timeliness metrics

·        Make-span

·        Tardiness

·        Flexibility metrics

·        E.g. slack in the plan

·        Cost metrics

·        Cumulative cost of the actions

·        Quantity of resources consumed

·        Interactions between the quality metrics

·        E.g. lower cost plans may have higher make-span etc.
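The timeliness metrics above are straightforward to compute from a time-stamped plan; a sketch with a hypothetical three-action plan:

```python
# Toy temporal plan (hypothetical): action -> (start time, duration).
plan = {"load": (0, 2), "drive": (2, 5), "unload": (7, 1)}

def makespan(plan):
    """Make-span: latest finish time over all actions in the plan."""
    return max(start + dur for start, dur in plan.values())

def tardiness(finish, deadline):
    """Tardiness of a deadline goal: lateness clipped at zero."""
    return max(0, finish - deadline)

# makespan(plan) is 8; against a deadline of 6 the plan is 2 units tardy,
# while against a deadline of 10 it is not tardy at all.
```

Cost metrics (action costs, resource consumption) would be computed analogously over the same structure, which is what makes the quality-metric interactions above a genuine multi-objective problem.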

 

·        Subclasses[r14]  of metric temporal planning[r15] 

·        Based on domain model

·        +durative actions

·        +resources

·        +resources +durative actions

·        Based on input

·        Standard planning

·        Standard scheduling (action selection is part of the input)

·        Between planning and scheduling

 

 

 

Chapter 13.  Scheduling

    I may wind up splitting this into two chapters.

 

·        Differentiating scheduling from planning

·        The Jobshop scheduling problem (JSP)

·        Classes of JSP problems (e.g. Single vs. Multi-capacity resources)

·        Jobshop scheduling as a CSP

·        Start-point representation

·        Precedence constraint (PCP) representation

·          Compare start-point representation to state-space approaches and PCP representation to plan-space approaches

·        Specialized constraint propagation techniques for JSP

·        Arc-bounds consistency

·        Edge finding

·        Energy minimization

·        Specialized variable ordering heuristics

·        Contention-based heuristics

·        Slack-based heuristics
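The start-point representation above can be made concrete as a tiny CSP. This sketch enumerates start times by brute force, whereas a real scheduler would rely on the propagation techniques listed above (arc-bounds consistency, edge finding) and the contention/slack heuristics; the jobshop instance is invented:

```python
from itertools import product

# Toy jobshop instance (hypothetical): task -> (job, machine, duration).
tasks = {"t1": ("j1", "m1", 2), "t2": ("j1", "m2", 1), "t3": ("j2", "m1", 2)}
precedes = [("t1", "t2")]            # t1 before t2 (same job)
horizon = 6

def consistent(assign):
    """Start-point CSP check: precedence + no overlap on a unary-capacity machine."""
    for a, b in precedes:
        if assign[a] + tasks[a][2] > assign[b]:
            return False
    names = list(assign)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if tasks[a][1] == tasks[b][1]:          # same machine
                end_a = assign[a] + tasks[a][2]
                end_b = assign[b] + tasks[b][2]
                if not (end_a <= assign[b] or end_b <= assign[a]):
                    return False
    return True

def solve():
    # Naive enumeration over the start-point domains.
    names = list(tasks)
    for starts in product(range(horizon), repeat=len(names)):
        assign = dict(zip(names, starts))
        if consistent(assign):
            return assign
    return None

sol = solve()
```

The PCP representation would instead decide the *order* of the two m1 tasks and leave start times to a temporal network, which is the analogue of the plan-space vs. state-space contrast noted above.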

 

 

Chapter 14.  Planning with Durative Resource-consuming Actions [C16] 

 

·        General issues in extending classical planning approaches to handle time and resources

·        Revisiting [r17] Plan-space/State-space tradeoffs in the context of metric/temporal planning

·        State information helps resource reasoning. Precedence relations help concurrency handling

·        Conjunctive approaches

·        Forward search planners[rao18] 

·        Discussion based on SAPA

·        Techniques for improving temporal flexibility

·        Plan-space planners

·        Discussion based on ZENO/IXTET/HSTS

·        Extending planning-graph based heuristics for temporal planning

·        Estimating cost and make-span functions separately

·        Connections to multi-objective search

·        Disjunctive  approaches

·        Temporal Graphplan

·        MILP encodings

·        De-coupled/Loosely-coupled approaches

·        Tradeoffs between coupled vs. monolithic approaches

·        Postponing resource reasoning (LPSAT style approach)

·        Postponing scheduling considerations (RealPlan style approach)

 


 

 

Part IV. Relaxing Classical Assumptions

 

 

Chapter 15.  Handling Partially Accessible Environments[C19] 

 

·        Problem specification: State variables with unknown values. Variations:

·        No actions available for sensing the missing state variables

·        Conformant planning

·        Sensing actions available for (at least some of the) missing variables

·        Semantics of sensing actions

·        Discussion based on SADL and the work of Levesque et al.

·        Also remarks on purely causal actions (classical planning), purely sensing actions (database-style query planning) and causal/sensing actions

·        Basic naïve approaches

·        Plan → Sense → Execute

·        Sense → Plan → Execute

·        Integrate planning, sensing and execution

·        Problems with naïve approaches

·        Complexity characterization (using Baral/Kreinovich)

 

·        Implemented approaches

·        Conjunctive planners

·        Plan-space conditional planners

·        Based on CNLP, CASSANDRA, etc.

·        Plan-space planners that interleave sensing and execution

·        Discussion based on SADL/Puccini

·        Disjunctive planners

·        Conformant and Sensing Graphplan algorithms

·        BDD planners

·        QBF encodings (Rintanen)
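A minimal way to see conformant planning operationally: treat the belief state as a set of possible worlds and progress the whole set through deterministic actions, as the conformant planners listed above do in their various representations. The one-dimensional corridor example is invented for illustration:

```python
def progress_belief(belief, action_fn):
    """Conformant progression: apply a (deterministic) action to every world
    in the belief state; the result is the set of possible successor worlds."""
    return frozenset(action_fn(w) for w in belief)

# Illustrative: a robot unsure of its position moves right in a corridor
# with cells 0..3; moving into the right wall leaves it in place.
def move_right(x):
    return min(x + 1, 3)

b = frozenset({0, 1, 2, 3})          # position completely unknown
for _ in range(3):
    b = progress_belief(b, move_right)
# After three moves the belief collapses to a single world: a conformant
# localization plan that needs no sensing at all.
```

Sensing actions would instead *split* the belief state into observation-indexed branches, which is exactly the distinction between conformant and conditional planning drawn above.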

 

 

 

Chapter 16.  Handling Stochastic Environments

 

·        Problem specification: Actions have uncertain outcomes; Objective: maximize (or guarantee a certain) probability of satisfaction.

·        Action representations

·        Actions transforming state distributions to state distributions

·        Syntax based on 2-TBN etc.

·        Complexity characterization (using Littman’s JAIR paper)

·        Extending conjunctive planners

·        Discussion using Buridan

·        Extending disjunctive planners

·        Maxplan approach (EMAJSAT encodings)

·        Pgraphplan approach
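The "actions transform state distributions to state distributions" view above can be sketched directly; the flaky pickup action below is hypothetical:

```python
def apply_stochastic(dist, outcomes):
    """Progress a state distribution through a stochastic action.
    `dist` maps state -> probability; `outcomes(state)` yields
    (probability, next_state) pairs for that state."""
    nxt = {}
    for s, p in dist.items():
        for q, t in outcomes(s):
            nxt[t] = nxt.get(t, 0.0) + p * q
    return nxt

# Illustrative "flaky pickup": succeeds with 0.8, drops the block with 0.2.
def pickup(s):
    if s == "on_table":
        return [(0.8, "holding"), (0.2, "on_table")]
    return [(1.0, s)]

d0 = {"on_table": 1.0}
d1 = apply_stochastic(d0, pickup)
d2 = apply_stochastic(d1, pickup)    # retrying raises the success probability
# P(holding) after two tries = 0.8 + 0.2 * 0.8 = 0.96
```

This is the computation a Buridan-style planner reasons about when deciding whether a plan meets a required probability-of-satisfaction threshold.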

 

Chapter 17.  Handling Dynamic (multi-agent) Environments

 

·        Problem specification: World evolves during the agent’s planning cycle

·        Variations

·        The other agents in the world cannot be modeled → dynamic planning

·         Remark on how a dynamic environment may look like a stochastic environment from the agent’s perspective; thus some overlap in approaches.

·        The other agents in the world are modeled explicitly → multi-agent planning

·        Approaches for Dynamic Planning

·        Plan → Monitor → Replan

·        Monitoring plans (triangle tables, etc.)

·        Replanning strategies

·        Algorithms for replanning in conjunctive (plan-space) planners

·        Discussion based on PRIAR, SPA, IPEM etc.

·        Monitor → React → Learn

·        Incremental policy construction

·        (Discussion based on Drummond et al.'s ERE, Dean/Kaelbling's envelope-extension techniques, and Universal Plans)[C20]

·        Overview of multi-agent planning approaches (Fairly new area).

 

 

Chapter 18.  Decision-theoretic planning as a general normative framework for planning

I am still not completely decided about the placement of this chapter in Part IV. One idea is to place it at the very beginning of Part IV—sort of the way I used refinement planning as the normative framework for classical planning in Parts I and II.

 

·        Motivating decision-theoretic planning in the context of conflicting goals, stochastic environments

·        Markov Decision Processes as a model for  decision theoretic planning

·        Action representations

·        Reward representations

·        Value and policy functions

·        Complexity of optimal policy construction

·        Variations in MDPs (with complexity discussion)

·        POMDPs

·        Semi-MDPs

·        MDPs with unknown action models (and reinforcement learning)

·        Constructing optimal policies for FOMDPs

·        Value and policy  iteration algorithms

·        Interpreting AI planning algorithms as approximate policy construction techniques on factored MDPs (Discussion based on the Boutilier/Dean/Hanks JAIR paper)

·        Factored action representations

·        Factored reward representations

·        Discussion on how value and policy iteration algorithms change in the presence of factored representations
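As a concrete anchor for the MDP material above, here is standard value iteration (with in-place, Gauss-Seidel-style updates) on a flat, fully observable MDP; the two-state instance is invented:

```python
def value_iteration(states, actions, P, R, gamma=0.9, eps=1e-6):
    """Value iteration for a flat FOMDP.
    P[s][a] is a list of (probability, next_state); R[s][a] is the reward."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(R[s][a] + gamma * sum(p * V[t] for p, t in P[s][a])
                       for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best                 # in-place update
        if delta < eps:
            return V

# Two-state example: staying in s1 pays 1 per step; s0 can move to s1.
states, actions = ["s0", "s1"], ["stay", "go"]
P = {"s0": {"stay": [(1.0, "s0")], "go": [(1.0, "s1")]},
     "s1": {"stay": [(1.0, "s1")], "go": [(1.0, "s0")]}}
R = {"s0": {"stay": 0.0, "go": 0.0},
     "s1": {"stay": 1.0, "go": 0.0}}
V = value_iteration(states, actions, P, R)
# V(s1) converges toward 1/(1-0.9) = 10, and V(s0) toward 0.9 * V(s1) = 9.
```

Factored (2-TBN-style) representations change how the expectation inside the max is computed, not the shape of this loop, which is why the chapter can treat them as a representational refinement of the same algorithm.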

 


 

 

Appendices

 

1.      CSP

   1.1. Temporal CSP (TCSP)

2.      IP

 

Topics I haven’t yet placed anywhere in the TOC: mixed-initiative planning; connections to the reasoning-about-actions-and-change literature; plan recognition; a discussion on handling combinations of relaxed classical planning assumptions (e.g. stochastic, incompletely known, and durative-action settings, etc.); plan-based interfaces.

 


 [C1]Look at Selman’s paper

 [C2]Do we keep the detailed handling of conditional effects etc. separate?

 [C3]Do we just define the refinement strategies, or the actual planners also? We have to avoid the tendency to talk about unification too much—since it is a textbook, one concrete example planner should be given for each of the techniques before we go on to how many other planners fall under the same rubric.

 [r4]How to work in the LCGP stuff??

 [C5]This is going to be tricky… we need to consider how to point out that simplifications of the encodings may depend on the type of substrate—e.g. the kind of simplifications supported by IP encodings.

 [C6]Do we discuss the conditional and quantified effects first in the representation section before coming here? Or do we talk about it for the first time here?

 [C7]How do we use these in partial order planners?

 [C8]****I would need to give some actual algorithm templates rather than a dry top-level discussion.

 [C9]We need to consider SHOP thing…

 [C10]May be a short digression into SAT-based domain-specific invariants etc.

 [C11]Where do we put the stuff about quality considerations… then again, why talk about quality considerations in classical planning?

 [r12]Just some quick comments on your Part III. I'm not sure if it would be better, but I would approach the problem like this:

In classical planning, we ignore time and assume that actions do not use resources. Time and resources are independent aspects involved in most real-world applications. There are planners that extend classical planning to handle only time, only resources, or both. When we extend, problems arise that do not exist in classical planning:

- Time: We cannot assume each action occupies a single time point, so:

 * Actions with different durations.

 * How to represent the logical (true/false) preconditions and effects of each durative action: time point & interval (protection/holding/changing effect value).

 * Goal deadlines.

 * Concurrency of durative actions (temporal mutex).

 * State of the world (there are states where the value of some proposition may be unknown—like the position of a robot in the middle of a travelling action).

- Resources:

 * Resources are not logical (T/F) but can be discrete or continuous.

 * Actions can use/produce/consume resources.

 * How to represent resource-related preconditions (constraints) and effects (changes).

- Combining time & resources:

 * How a resource changes over an interval => the change process may be linear or more complicated.

 * Concurrency regarding the resource aspects.

==>> The problems listed above lead to:

- Different levels of action representation: levels 1-5 in PDDL+, in increasing order of complexity in handling the problems listed above.

- Different optimization criteria: time-related/resource-related/multi-objective.

- Different ways of handling those constraints in each planning framework, and the difficulties with each (forward/backward/POP). (Chapter 14)

I think Chapter 14 should be fine. Not as elaborate as the chapters for classical planning, but we can go into more detail later.

I just wonder if scheduling needs more than one chapter in your book ;) (because it stands side-by-side with planning in the title)

 [r13]Talk about TCSP etc here??

·         [r14]Goals with duration

·        Goals with deadlines

·        Resource restrictions

Planning vs. Scheduling

 [r15]Need more structure here…

 [C16]HOW DO WE DECIDE WHERE TO PUT STUFF LIKE TGP (which only handles durative actions) and Wolfman’s stuff which only handles resources—Do we consider these extensions one by one—or just ignore them and consider the whole thing together?

 

 [r17]*****GEFFNER’s UNIFIED FRAMEWORK HERE??

 [rao18]Should I have separated handling time from handling resources?

 [C19]Where do we put the complexity classifications?

 [C20]Problem: Dynamic and Stochastic environments may look alike…