Planning Applications (specifically - Software Agents)
------------------------------------------------------

The papers this summary is based on are:

  "The First Law of Robotics (a call to arms)" by Weld and Etzioni
  "Intelligent Agents on the Internet: Fact, Fiction, and Forecast" by Weld and Etzioni
  "A Softbot-based Interface to the Internet" by Weld and Etzioni
  "Planning, Executing, Sensing, and Replanning for Information Gathering" by Craig A. Knoblock

with some additional information gleaned by me from:

  "Integrating Planning, Execution and Monitoring" by Ambros-Ingerson and Steel
  "The Systematic Plan Adaptor: A Formal Foundation for Case-based Planning" by Hanks and Weld

The Big Picture
---------------

These papers discuss the application of planners to the problem of gathering, filtering, and analyzing information, and the use of planners as 'intelligent agents' that hide the underlying technology from the user. The extensions to basic planning algorithms (such as UCPOP) that make these applications possible are discussed, and further extensions (such as a means to implement Asimov's three 'laws of robotics') are suggested to make the use of planning algorithms more practical.

Planning, Executing, Sensing, and Replanning for Information Gathering
----------------------------------------------------------------------

This paper discusses the implementation of a planner called Sage that can "produce parallel execution plans, interleave planning and execution, support run-time variables, and perform replanning where appropriate". Sage is based on the UCPOP planner.

The first modification to UCPOP deals with resources that cannot be accessed in parallel. When executing operators such as 'query-database', the resource the operator depends on (the database) may not be accessible by more than one operator at a time. This cannot be captured in the basic UCPOP operator descriptions, because the resource is available before execution and again immediately after execution. Note how the constraint is NOT expressed in the following:

  (define (operator query-database)
    :parameters (?database)
    :precondition (available ?database)
    :effect (and (available ?database)
                 (know !somedata)))

If two operators of this type were unordered with respect to each other, they might be executed in parallel, because the basic UCPOP planner is unaware of the resource conflict. The solution was to add resource constraints. Here is the operator modified to specify which resources it depends on:

  (define (operator query-database)
    :parameters (?database)
    :resources ((resource ?database database))
    :precondition (available ?database)
    :effect (know !somedata))

After the addition of a new action to a plan, Sage checks for potential resource conflicts. If there are any, the conflicts are introduced as threats into the plan.
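The paper doesn't give code for this check; here is a minimal sketch of the idea (my own reconstruction - 'Step' and 'Plan' are hypothetical stand-ins for Sage's real plan structures):

  from dataclasses import dataclass, field

  @dataclass(frozen=True)
  class Step:
      name: str
      resources: frozenset   # e.g. frozenset({('db1', 'database')})

  @dataclass
  class Plan:
      steps: list = field(default_factory=list)
      orderings: set = field(default_factory=set)   # pairs (before, after)

      def ordered(self, a, b):
          # True if a is (transitively) constrained to precede b.
          return (a, b) in self.orderings or \
                 any((a, m) in self.orderings and self.ordered(m, b)
                     for m in self.steps)

  def resource_threats(new_step, plan):
      # Steps that share a resource with new_step but are unordered with
      # respect to it could run in parallel, so each shared resource
      # becomes a threat for the planner to resolve (e.g. by ordering).
      return [(s, r)
              for s in plan.steps if s is not new_step
              for r in (s.resources & new_step.resources)
              if not (plan.ordered(s, new_step) or plan.ordered(new_step, s))]

  q1 = Step('query-database-1', frozenset({('db1', 'database')}))
  q2 = Step('query-database-2', frozenset({('db1', 'database')}))
  assert resource_threats(q2, Plan(steps=[q1, q2])) == [(q1, ('db1', 'database'))]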
The Sage planner performs execution and information gathering while planning. Execution is handled by the addition of two new types of flaws: "unexecuted" and "executing". An "unexecuted" flaw is introduced whenever a new step is added to the plan, and it can only be worked on once the step has no open conditions and every step that achieves one of its preconditions has already been executed. Once a step has been executed, this is viewed as a commitment to the plan in which the step occurs; any further planning is applied to refinements of that plan.

When Sage executes a step, the step is spawned off as a new process, so that Sage can continue to work on the plan while the step is executing. An "executing" flaw is introduced into the plan while the step runs, and Sage can spawn off other steps while waiting for the first step to complete. Sage tries to resolve the "executing" flaws during each cycle of the planner: when a step completes, its "executing" flaw is removed and the step is marked as completed, so that later steps can be executed.

The Sage planner allows new goals to be added while it is working on past goals, in order to exploit shared work and keep track of resource conflicts. As soon as a goal is solved, its results are sent to the requestor. The planner can thus run and solve plans continuously.

When a step in a plan fails, it either returns an error condition or returns unexpected data. The Sage system has been designed to handle the first kind of failure; the second is still being worked on, because of the difficulty involved in detecting it. The paper explains that Sage can replan a failed action while other actions continue to execute, but it doesn't elaborate on how that is done; only a reference is given to the Systematic Plan Adaptor (SPA, described in the Hanks and Weld paper referenced at the top). SPA backtracks over steps in a plan by annotating the decisions made during earlier planning: given a state in plan space, SPA can produce the parent state along with its sibling states, so that past alternatives that were lost after commitment to a certain plan can be revisited. The SPA algorithm also keeps track of failed plans so that they aren't repeated.

Sage also has facilities for information gathering, through the use of run-time variables. These variables are denoted with a '!' and serve as placeholders during initial planning. After execution of a step with a run-time variable, the placeholder is replaced by the data gathered during execution.
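Putting these pieces together, here is a self-contained toy of the interleaving (entirely my own reconstruction - the paper doesn't describe Sage's data structures or process machinery at this level, and open conditions are omitted here, leaving only supplier ordering):

  from concurrent.futures import ThreadPoolExecutor

  class Step:
      def __init__(self, name, action, suppliers=()):
          self.name, self.action = name, action
          self.suppliers = list(suppliers)   # steps achieving our preconditions
          self.flaw = 'unexecuted'           # -> 'executing' -> None (done)
          self.future = None

  def run_plan(steps, pool, bindings):
      while any(s.flaw for s in steps):
          for s in steps:
              # Work on an "unexecuted" flaw once every supplier is done:
              # spawn the step off and mark it "executing".
              if s.flaw == 'unexecuted' and all(t.flaw is None for t in s.suppliers):
                  s.future = pool.submit(s.action)
                  s.flaw = 'executing'
              # Each cycle, clear "executing" flaws for finished steps and
              # bind their run-time variables to the data gathered.
              if s.flaw == 'executing' and s.future.done():
                  bindings.update(s.future.result())
                  s.flaw = None
      return bindings

  query  = Step('query-database', lambda: {'!somedata': 42})
  report = Step('report-results', lambda: {'!report': 'sent'}, suppliers=[query])
  with ThreadPoolExecutor() as pool:
      print(run_plan([query, report], pool, {}))
      # -> {'!somedata': 42, '!report': 'sent'}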
comments:
---------

Knoblock unfortunately doesn't go into any detail about replanning, which seems to me an extremely important part of this system.

The definitions of operators for Sage don't include references to the actual code that is run when an operator is executed - where is that stored?

How does the planner handle multiple goals from multiple users? How does it know who to send the output to, and when all the goals from a particular user are satisfied?

What if information could be added to or removed from the initial state by operators? Could this speed up some plans?

The First Law of Robotics (a call to arms)
------------------------------------------

This paper discusses a way to add the notion of 'harm' to a planning system (or 'agent'), and how an agent can avoid performing harmful actions while trying to achieve a goal. Weld and Etzioni define a way to formalize the notion of 'harm' (as it is informally defined by Asimov) within a planning system. They then discuss how the agent can tractably avoid harmful actions, and how an agent should resolve conflicts between its goals and the need to avoid harm.

The notion of harm is derived from Asimov's three laws of robotics:

1. A robot may not injure a human being, or, through inaction, allow a human being to come to harm.

2. A robot must obey orders given it by human beings except where such orders would conflict with the first law.

3. A robot must protect its own existence as long as such protection does not conflict with the first or second law.

Weld and Etzioni identify two kinds of harmful conditions: conditions that should never be brought about by the agent at any time, and conditions that the agent should try to have hold again by the end of planning.

Conditions that the agent should never bring about are labelled 'dont-disturb' conditions. These are like Asimov's first law - they must not be violated even under human orders. A 'dont-disturb' constraint takes a single, function-free, logical sentence as an argument, like the following:

  dont-disturb(written.to.tape(?f) or isa(?f, file))

All free variables are considered universally quantified. A sequence of actions generated by the planner satisfies dont-disturb(C) if none of the actions makes C false. If a condition C is already false at the start of planning, it is not up to the planner to make C true; rather, the planner must simply avoid falsifying C anywhere it still holds. This is necessary so that the constraints provided to the agent are mutually consistent, and hence tractable.

An algorithm called 'violation' is presented that, given the effects of an action and the dont-disturb conditions, returns true if the action necessarily violates a dont-disturb condition, returns false if no conflict is possible, or otherwise returns the list of variable bindings under which the action would cause a conflict. Depending on the result of 'violation', the planner has four options:

- Disavow: the planner can ignore a conflict if the offending effect is already true in the initial state. If a step asserts big-fluffy-hippo(barney), and that conflicts with dont-disturb(big-fluffy-hippo(?x)), but the initial state already holds big-fluffy-hippo(barney), then the condition hasn't been violated by the planner and the action can be added.

- Confront: if a conditional effect conflicts with a dont-disturb condition, the planner can add the action as long as it ensures the conditional effect can never fire. For example, if the conditional effect is S -> P, and P conflicts with the dont-disturb conditions, then the planner adds an open condition (not S) to ensure that P is never produced.

- Evade: the planner can evade the conflict through goal regression.

- Refuse: the planner can backtrack to find another plan if the effects conflict with dont-disturb.

Analysis of the algorithm shows that its performance is good enough that safe planning can be formulated as a standard planning problem.
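As a toy rendering of this dispatch (my own, not the paper's code): sentences are simplified to ground literals with negation spelled ('not', P), so 'violation' here just returns the clashes directly rather than the paper's true/false/bindings result, and 'evade' is folded into the refuse branch:

  def negate(p):
      return p[1] if p[0] == 'not' else ('not', p)

  def violation(effects, dont_disturb):
      # Pairs (effect, C) where the effect would make dont-disturb(C) false.
      return [(e, c) for e in effects for c in dont_disturb if e == negate(c)]

  def handle_action(action, dont_disturb, initial_state):
      for effect, c in violation(action['effects'], dont_disturb):
          if effect in initial_state:
              continue                          # disavow: C was already false
          antecedent = action['conditional'].get(effect)
          if antecedent is not None:
              # confront: subgoal on the negation of the effect's antecedent
              action['open_conditions'].append(negate(antecedent))
          else:
              return 'refuse'                   # (or evade via goal regression)
      return 'add-to-plan'

  erase = {'effects': [('not', ('isa', 'f1', 'file'))],
           'conditional': {}, 'open_conditions': []}
  print(handle_action(erase, [('isa', 'f1', 'file')], initial_state=set()))
  # -> 'refuse'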
Etzioni and Weld also define a weaker kind of constraint than dont-disturb, called the 'restore' constraint. Restore constraints are declared like dont-disturb constraints, with a single, function-free, logical sentence as an argument. The agent's obligation is to try to achieve restore(C) after first achieving its goals. Restore constraints differ from dont-disturb constraints in that the conditions need not hold at all points in the plan, only at the end of planning, if possible. Also, if the goal state conflicts with a restore constraint, the goal takes precedence.

To achieve the restore constraints, the authors suggest splitting the planning session into two phases: the first achieves the given goal, and the second achieves the restore constraints. During the second phase, the planner only attempts to make true those restore constraints that were true in the initial state and don't conflict with the conditions achieved by the first phase.

The restore algorithm isn't guaranteed to completely achieve the restore conditions. If the planner chooses action A over action B in the first phase, action A violates a restore constraint while action B does not, and no action can undo A's effects to re-achieve the restore conditions, then the algorithm will fail. Despite this problem, the algorithm works remarkably well, because actions in real problems are often reversible, and the authors conjecture that most cleanup goals are trivially serializable with respect to each other.
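A schematic of the two-phase approach (my own sketch, assuming a planner function plan(goals, state) that returns a (steps, final-state) pair, or None on failure):

  def negate(p):
      return p[1] if p[0] == 'not' else ('not', p)

  def plan_with_restore(goals, restore, initial_state, plan):
      # Phase 1: achieve the goals; these always take precedence.
      first = plan(goals, initial_state)
      if first is None:
          return None
      steps, state = first

      # Phase 2: best-effort cleanup.  Only constraints that held
      # initially and don't contradict the goals are attempted.
      for c in restore:
          if c in initial_state and negate(c) not in goals:
              second = plan([c], state)
              if second is not None:      # if c can't be restored, skip it
                  extra, state = second
                  steps += extra
      return steps, state

Restoring each constraint with its own little sub-plan leans directly on the conjecture above: if cleanup goals are trivially serializable, solving them one at a time loses nothing.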
The paper contemplates some further challenges for planners: preventing the squandering of resources, and preventing harm through inaction. The authors also suggest that agents could be given rules to ensure that other agents aren't causing harm - although there are many difficulties with this idea.

comments:
---------

Could all agents share one big database of what shouldn't be done? Should there be classes of harmful things (for OS commands, housework, ...)?

Intelligent Agents on the Internet
----------------------------------

This paper discusses the concept of intelligent agents - the motivation for them, their desirable qualities, and some examples of intelligent agents, most notably the Internet Softbot being developed at the University of Washington.

An agent is something that acts on your behalf. The reasons we might want intelligent agents are abstraction and distraction: an intelligent agent can abstract away the technology and the resources it accesses on behalf of the user, and it can employ a character or 'persona' to distract the user from what might otherwise be a complex task.

There are many desirable qualities for intelligent agents, but currently there is no agreement on the relative importance of these qualities, and no single agent has all of them. Possession of a majority of these qualities is what differentiates an agent from a regular program. The qualities of an agent can include:

- autonomous
- goal-oriented
- collaborative
- flexible
- self-starting
- temporal continuity
- character
- communicative
- adaptive
- mobile

(descriptions of each of these qualities can be found in the paper)

There are currently numerous intelligent agents being built for the internet. These internet agents act on behalf of the user and recommend or guide the user to information on the internet. Examples include WebWatcher, the WWW indexing agents (Lycos, WebCrawler, and InfoSeek), and the FaqFinder. WebWatcher guides the user through the web based on past user preferences. The indexing agents find information on the web based on keywords - but the searches are very unstructured, and as the web grows, the results contain fewer and fewer appropriate matches to the request. The FaqFinder does a better job of searching for the user because it restricts its domain to FAQs, and so has more domain information to guide the search.

The Internet Softbot is an attempt to make a smarter internet agent - one that accesses structured information services on the internet. The Softbot contains a model of each service it uses, so that it can answer focused queries with relatively high reliability. Access to a new information service can be added simply by providing a model of that service to the softbot.

The softbot is autonomous, goal-directed, flexible, and self-starting. The goal of the Internet Softbot project is to create a 'concierge': an agent that can synthesize command sequences and handle errors to achieve some goal for the user. The softbot can monitor events, enforce constraints, and locate and manipulate objects, among other things. Only 'what' is to be accomplished need be specified to the softbot - it determines 'how' and 'where' to accomplish the task.

Goals for the softbot can be expressed through a goal language that includes conjunction, disjunction, negation, and nested universal and existential quantification. The specification can also be 'hidden' through the use of graphical forms that convert to specifications. The softbot can also pose questions to the user, while running, to better control the planning.

The Softbot consists of four modules (a skeletal sketch of how these might fit together appears at the end of this summary):

- The Task Manager - controls the scheduling of activities by the softbot.
- The XII planner - an extension of UCPOP that handles planning with incomplete information, and interleaves planning and execution.
- The Model Manager - a database that stores everything the softbot has observed about the world, and can perform closed-world reasoning.
- The Internet Domain Models - these provide the softbot with information about the Internet, people, machines, and search control heuristics.

The challenges that still need to be overcome include:

- Algorithms for planning with incomplete information - the Internet Softbot can reason with incomplete information, but it still requires extensive, hand-coded search control rules.
- Multi-agent communication techniques - including the ability to convey new vocabularies to agents.
- Learning algorithms - to allow the agent to adapt to the user's desires.
- Cyberperception - the ability to extract facts from unstructured documents.
- Softbot safety systems - techniques to ensure that an agent will not harm its owner during operation.

comments:
---------

In the description of the Softbot architecture, it is stated that the domain models are 30% of all code, compared to 25% for the planner. That's quite eye-opening! Perhaps the planner is too general? Or just not powerful enough to reduce the amount of code needed for the domain models? What are some specific examples of what it can and cannot perform?
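Finally, the promised sketch of the four-module architecture (entirely my own caricature - none of these class shapes come from the paper):

  class ModelManager:
      """Database of everything the softbot has observed."""
      def __init__(self):
          self.facts = set()
      def record(self, fact):
          self.facts.add(fact)
      def holds(self, fact):
          # Closed-world flavor: what isn't known is treated as false
          # (the real Model Manager's reasoning is far more involved).
          return fact in self.facts

  class XIIPlanner:
      """Stand-in for the UCPOP extension: plans with incomplete
      information and interleaves planning with execution."""
      def __init__(self, domain_models, model_manager):
          self.domain_models = domain_models
          self.model_manager = model_manager
      def achieve(self, goal):
          ...   # plan, execute, sense, record observations, replan

  class TaskManager:
      """Schedules the softbot's activities."""
      def __init__(self, planner):
          self.planner, self.queue = planner, []
      def submit(self, goal):
          self.queue.append(goal)
      def run(self):
          while self.queue:
              self.planner.achieve(self.queue.pop(0))

  domain_models = {'ftp': None, 'finger': None}   # one model per service
  softbot = TaskManager(XIIPlanner(domain_models, ModelManager()))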