Planning Applications (specifically - Software Agents)
------------------------------------------------------

The papers this summary is based on are:

  "The First Law of Robotics (a call to arms)" by Weld and Etzioni
  "Intelligent Agents on the Internet: Fact, Fiction, and Forecast" by Weld and Etzioni
  "A Softbot-based Interface to the Internet" by Weld and Etzioni
  "Planning, Executing, Sensing, and Replanning for Information Gathering" by Craig A. Knoblock

with some additional information gleaned by me from:

  "Integrating Planning, Execution and Monitoring" by Ambros-Ingerson and Steel
  "The Systematic Plan Adaptor: A Formal Foundation for Case-based Planning" by Hanks and Weld

The Big Picture
---------------

These papers discuss the application of planners to the problem of gathering, filtering, and analyzing information, and the use of planners as 'intelligent agents' that hide the underlying technology from the user. The extensions to basic planning algorithms (such as UCPOP) that make these applications possible are discussed, and further extensions (such as a means to implement Asimov's three 'laws of robotics') are suggested to make the use of planning algorithms more practical.

Planning, Executing, Sensing, and Replanning for Information Gathering
----------------------------------------------------------------------

This paper discusses the implementation of a planner called Sage that can "produce parallel execution plans, interleave planning and execution, support run-time variables, and perform replanning where appropriate". Sage is based on the UCPOP planner.

The first modification to UCPOP deals with resources that cannot be accessed in parallel. When executing operators such as 'query-database', the resource the operator depends on (the database) may not be accessible by more than one operator at a time. This cannot be captured in the basic UCPOP operator descriptions, because the resource is available before execution and again immediately after execution. Note how the constraint is NOT expressed in the following:

  (define (operator query-database)
    :parameters (?database)
    :precondition (available ?database)
    :effect (and (available ?database)
                 (know !somedata)))

If two operators of this type were unordered with respect to each other, they might be executed in parallel, because the basic UCPOP planner is unaware of the resource conflict. The solution was to add resource constraints. Here is the operator modified to specify which resources it depends on:

  (define (operator query-database)
    :parameters (?database)
    :resources ((resource ?database database))
    :precondition (available ?database)
    :effect (know !somedata))

After the addition of a new action to a plan, Sage checks for potential resource conflicts. If there are any, the conflicts are introduced as threats into the plan.
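The paper doesn't give code for this check; here is a minimal sketch of the idea (my own reconstruction - 'Step' and 'Plan' are hypothetical stand-ins for Sage's real plan structures):

  from dataclasses import dataclass, field

  @dataclass(frozen=True)
  class Step:
      name: str
      resources: frozenset   # e.g. frozenset({('db1', 'database')})

  @dataclass
  class Plan:
      steps: list = field(default_factory=list)
      orderings: set = field(default_factory=set)   # pairs (before, after)

      def ordered(self, a, b):
          # True if a is (transitively) constrained to precede b.
          return (a, b) in self.orderings or \
                 any((a, m) in self.orderings and self.ordered(m, b)
                     for m in self.steps)

  def resource_threats(new_step, plan):
      # Steps that share a resource with new_step but are unordered with
      # respect to it could run in parallel, so each shared resource
      # becomes a threat for the planner to resolve (e.g. by ordering).
      return [(s, r)
              for s in plan.steps if s is not new_step
              for r in (s.resources & new_step.resources)
              if not (plan.ordered(s, new_step) or plan.ordered(new_step, s))]

  q1 = Step('query-database-1', frozenset({('db1', 'database')}))
  q2 = Step('query-database-2', frozenset({('db1', 'database')}))
  assert resource_threats(q2, Plan(steps=[q1, q2])) == [(q1, ('db1', 'database'))]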
The Sage planner performs execution and information gathering while planning. Execution is handled by the addition of two new types of flaws: "unexecuted" and "executing". An "unexecuted" flaw is introduced whenever a new step is added to the plan, and it can only be worked on once the step has no open conditions and every step that achieves one of its preconditions has already been executed. Once a step has been executed, this is viewed as a commitment to the plan in which the step occurs; any further planning is applied to refinements of that plan.

When Sage executes a step, the step is spawned off as a new process, so that Sage can continue to work on the plan while the step is executing. An "executing" flaw is introduced into the plan while the step runs, and Sage can spawn off other steps while waiting for the first step to complete. Sage tries to resolve the "executing" flaws during each cycle of the planner: when a step completes, its "executing" flaw is removed and the step is marked as completed, so that later steps can be executed.

The Sage planner allows new goals to be added while it is working on past goals, in order to exploit shared work and keep track of resource conflicts. As soon as a goal is solved, its results are sent to the requestor. The planner can thus run and solve plans continuously.

When a step in a plan fails, it either returns an error condition or returns unexpected data. The Sage system has been designed to handle the first kind of failure; the second is still being worked on, because of the difficulty involved in detecting it. The paper explains that Sage can replan a failed action while other actions continue to execute, but it doesn't elaborate on how that is done; only a reference is given to the Systematic Plan Adaptor (SPA, described in the Hanks and Weld paper referenced at the top). SPA backtracks over steps in a plan by annotating the decisions made during earlier planning: given a state in plan space, SPA can produce the parent state along with its sibling states, so that past alternatives that were lost after commitment to a certain plan can be revisited. The SPA algorithm also keeps track of failed plans so that they aren't repeated.

Sage also has facilities for information gathering, through the use of run-time variables. These variables are denoted with a '!' and serve as placeholders during initial planning. After execution of a step with a run-time variable, the placeholder is replaced by the data gathered during execution.
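Putting these pieces together, here is a self-contained toy of the interleaving (entirely my own reconstruction - the paper doesn't describe Sage's data structures or process machinery at this level, and open conditions are omitted here, leaving only supplier ordering):

  from concurrent.futures import ThreadPoolExecutor

  class Step:
      def __init__(self, name, action, suppliers=()):
          self.name, self.action = name, action
          self.suppliers = list(suppliers)   # steps achieving our preconditions
          self.flaw = 'unexecuted'           # -> 'executing' -> None (done)
          self.future = None

  def run_plan(steps, pool, bindings):
      while any(s.flaw for s in steps):
          for s in steps:
              # Work on an "unexecuted" flaw once every supplier is done:
              # spawn the step off and mark it "executing".
              if s.flaw == 'unexecuted' and all(t.flaw is None for t in s.suppliers):
                  s.future = pool.submit(s.action)
                  s.flaw = 'executing'
              # Each cycle, clear "executing" flaws for finished steps and
              # bind their run-time variables to the data gathered.
              if s.flaw == 'executing' and s.future.done():
                  bindings.update(s.future.result())
                  s.flaw = None
      return bindings

  query  = Step('query-database', lambda: {'!somedata': 42})
  report = Step('report-results', lambda: {'!report': 'sent'}, suppliers=[query])
  with ThreadPoolExecutor() as pool:
      print(run_plan([query, report], pool, {}))
      # -> {'!somedata': 42, '!report': 'sent'}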
comments:
---------

Knoblock unfortunately doesn't go into any detail about replanning, which seems to me an extremely important part of this system.

The definitions of operators for Sage don't include references to the actual code that is run when an operator is executed - where is that stored?

How does the planner handle multiple goals from multiple users? How does it know who to send the output to, and when all the goals from a particular user are satisfied?

What if information could be added to or removed from the initial state by operators? Could this speed up some plans?

The First Law of Robotics (a call to arms)
------------------------------------------

This paper discusses a way to add the notion of 'harm' to a planning system (or 'agent'), and how an agent can avoid performing harmful actions while trying to achieve a goal. Weld and Etzioni define a way to formalize the notion of 'harm' (as it is informally defined by Asimov) within a planning system. They then discuss how the agent can tractably avoid harmful actions, and how an agent should resolve conflicts between its goals and the need to avoid harm.

The notion of harm is derived from Asimov's three laws of robotics:

1. A robot may not injure a human being, or, through inaction, allow a human being to come to harm.

2. A robot must obey orders given it by human beings except where such orders would conflict with the first law.

3. A robot must protect its own existence as long as such protection does not conflict with the first or second law.

Weld and Etzioni identify two kinds of harmful conditions: conditions that should never be brought about by the agent at any time, and conditions that the agent should try to have hold again by the end of planning.

Conditions that the agent should never bring about are labelled 'dont-disturb' conditions. These are like Asimov's first law - they must not be violated even under human orders. A 'dont-disturb' constraint takes a single, function-free, logical sentence as an argument, like the following:

  dont-disturb(written.to.tape(?f) or isa(?f, file))

All free variables are considered universally quantified. A sequence of actions generated by the planner satisfies dont-disturb(C) if none of the actions makes C false. If a condition C is already false at the start of planning, it is not up to the planner to make C true; rather, the planner must simply avoid falsifying C anywhere it still holds. This is necessary so that the constraints provided to the agent are mutually consistent, and hence tractable.

An algorithm called 'violation' is presented that, given the effects of an action and the dont-disturb conditions, returns true if the action necessarily violates a dont-disturb condition, returns false if no conflict is possible, or otherwise returns the list of variable bindings under which the action would cause a conflict. Depending on the result of 'violation', the planner has four options:

- Disavow: the planner can ignore a conflict if the offending effect is already true in the initial state. If a step asserts big-fluffy-hippo(barney), and that conflicts with dont-disturb(big-fluffy-hippo(?x)), but the initial state already holds big-fluffy-hippo(barney), then the condition hasn't been violated by the planner and the action can be added.

- Confront: if a conditional effect conflicts with a dont-disturb condition, the planner can add the action as long as it ensures the conditional effect can never fire. For example, if the conditional effect is S -> P, and P conflicts with the dont-disturb conditions, then the planner adds an open condition (not S) to ensure that P is never produced.

- Evade: the planner can evade the conflict through goal regression.

- Refuse: the planner can backtrack to find another plan if the effects conflict with dont-disturb.

Analysis of the algorithm shows that its performance is good enough that safe planning can be formulated as a standard planning problem.
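As a toy rendering of this dispatch (my own, not the paper's code): sentences are simplified to ground literals with negation spelled ('not', P), so 'violation' here just returns the clashes directly rather than the paper's true/false/bindings result, and 'evade' is folded into the refuse branch:

  def negate(p):
      return p[1] if p[0] == 'not' else ('not', p)

  def violation(effects, dont_disturb):
      # Pairs (effect, C) where the effect would make dont-disturb(C) false.
      return [(e, c) for e in effects for c in dont_disturb if e == negate(c)]

  def handle_action(action, dont_disturb, initial_state):
      for effect, c in violation(action['effects'], dont_disturb):
          if effect in initial_state:
              continue                          # disavow: C was already false
          antecedent = action['conditional'].get(effect)
          if antecedent is not None:
              # confront: subgoal on the negation of the effect's antecedent
              action['open_conditions'].append(negate(antecedent))
          else:
              return 'refuse'                   # (or evade via goal regression)
      return 'add-to-plan'

  erase = {'effects': [('not', ('isa', 'f1', 'file'))],
           'conditional': {}, 'open_conditions': []}
  print(handle_action(erase, [('isa', 'f1', 'file')], initial_state=set()))
  # -> 'refuse'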
Etzioni and Weld also define a weaker kind of constraint than dont-disturb, called the 'restore' constraint. Restore constraints are declared like dont-disturb constraints, with a single, function-free, logical sentence as an argument. The agent's obligation is to try to achieve restore(C) after first achieving its goals. Restore constraints differ from dont-disturb constraints in that the conditions need not hold at all points in the plan, only at the end of planning, if possible. Also, if the goal state conflicts with a restore constraint, the goal takes precedence.

To achieve the restore constraints, the authors suggest splitting the planning session into two phases: the first achieves the given goal, and the second achieves the restore constraints. During the second phase, the planner only attempts to make true those restore constraints that were true in the initial state and don't conflict with the conditions achieved by the first phase.

The restore algorithm isn't guaranteed to completely achieve the restore conditions. If the planner chooses action A over action B in the first phase, action A violates a restore constraint while action B does not, and no action can undo A's effects to re-achieve the restore conditions, then the algorithm will fail. Despite this problem, the algorithm works remarkably well, because actions in real problems are often reversible, and the authors conjecture that most cleanup goals are trivially serializable with respect to each other.
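A schematic of the two-phase approach (my own sketch, assuming a planner function plan(goals, state) that returns a (steps, final-state) pair, or None on failure):

  def negate(p):
      return p[1] if p[0] == 'not' else ('not', p)

  def plan_with_restore(goals, restore, initial_state, plan):
      # Phase 1: achieve the goals; these always take precedence.
      first = plan(goals, initial_state)
      if first is None:
          return None
      steps, state = first

      # Phase 2: best-effort cleanup.  Only constraints that held
      # initially and don't contradict the goals are attempted.
      for c in restore:
          if c in initial_state and negate(c) not in goals:
              second = plan([c], state)
              if second is not None:      # if c can't be restored, skip it
                  extra, state = second
                  steps += extra
      return steps, state

Restoring each constraint with its own little sub-plan leans directly on the conjecture above: if cleanup goals are trivially serializable, solving them one at a time loses nothing.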
The paper contemplates some further challenges for planners: preventing the squandering of resources, and preventing harm through inaction. The authors also suggest that agents could be given rules to ensure that other agents aren't causing harm - although there are many difficulties with this idea.

comments:
---------

Could all agents share one big database of what shouldn't be done? Should there be classes of harmful things (for OS commands, housework, ...)?

Intelligent Agents on the Internet
----------------------------------

This paper discusses the concept of intelligent agents - the motivation for them, their desirable qualities, and some examples of intelligent agents, most notably the Internet Softbot being developed at the University of Washington.

An agent is something that acts on your behalf. The reasons we might want intelligent agents are abstraction and distraction: an intelligent agent can abstract away the technology and the resources it accesses on behalf of the user, and it can employ a character or 'persona' to distract the user from what might otherwise be a complex task.

There are many desirable qualities for intelligent agents, but currently there is no agreement on the relative importance of these qualities, and no single agent has all of them. Possession of a majority of these qualities is what differentiates an agent from a regular program. The qualities of an agent can include:

- autonomous
- goal-oriented
- collaborative
- flexible
- self-starting
- temporal continuity
- character
- communicative
- adaptive
- mobile

(descriptions of each of these qualities can be found in the paper)

There are currently numerous intelligent agents being built for the internet. These internet agents act on behalf of the user and recommend or guide the user to information on the internet. Examples include WebWatcher, the WWW indexing agents (Lycos, WebCrawler, and InfoSeek), and the FaqFinder. WebWatcher guides the user through the web based on past user preferences. The indexing agents find information on the web based on keywords - but the searches are very unstructured, and as the web grows, the results contain fewer and fewer appropriate matches to the request. The FaqFinder does a better job of searching for the user because it restricts its domain to FAQs, and so has more domain information to guide the search.

The Internet Softbot is an attempt to make a smarter internet agent - one that accesses structured information services on the internet. The Softbot contains a model of each service it uses, so that it can answer focused queries with relatively high reliability. Access to a new information service can be added simply by providing a model of that service to the softbot.

The softbot is autonomous, goal-directed, flexible, and self-starting. The goal of the Internet Softbot project is to create a 'concierge': an agent that can synthesize command sequences and handle errors to achieve some goal for the user. The softbot can monitor events, enforce constraints, and locate and manipulate objects, among other things. Only 'what' is to be accomplished need be specified to the softbot - it determines 'how' and 'where' to accomplish the task.

Goals for the softbot can be expressed through a goal language that includes conjunction, disjunction, negation, and nested universal and existential quantification. The specification can also be 'hidden' through the use of graphical forms that convert to specifications. The softbot can also pose questions to the user, while running, to better control the planning.

The Softbot consists of four modules (a skeletal sketch of how these might fit together appears at the end of this summary):

- The Task Manager - controls the scheduling of activities by the softbot.
- The XII planner - an extension of UCPOP that handles planning with incomplete information, and interleaves planning and execution.
- The Model Manager - a database that stores everything the softbot has observed about the world, and can perform closed-world reasoning.
- The Internet Domain Models - these provide the softbot with information about the Internet, people, machines, and search control heuristics.

The challenges that still need to be overcome include:

- Algorithms for planning with incomplete information - the Internet Softbot can reason with incomplete information, but it still requires extensive, hand-coded search control rules.
- Multi-agent communication techniques - including the ability to convey new vocabularies to agents.
- Learning algorithms - to allow the agent to adapt to the user's desires.
- Cyberperception - the ability to extract facts from unstructured documents.
- Softbot safety systems - techniques to ensure that an agent will not harm its owner during operation.

comments:
---------

In the description of the Softbot architecture, it is stated that the domain models are 30% of all code, compared to 25% for the planner. That's quite eye-opening! Perhaps the planner is too general? Or just not powerful enough to reduce the amount of code needed for the domain models? What are some specific examples of what it can and cannot perform?
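Finally, the promised sketch of the four-module architecture (entirely my own caricature - none of these class shapes come from the paper):

  class ModelManager:
      """Database of everything the softbot has observed."""
      def __init__(self):
          self.facts = set()
      def record(self, fact):
          self.facts.add(fact)
      def holds(self, fact):
          # Closed-world flavor: what isn't known is treated as false
          # (the real Model Manager's reasoning is far more involved).
          return fact in self.facts

  class XIIPlanner:
      """Stand-in for the UCPOP extension: plans with incomplete
      information and interleaves planning with execution."""
      def __init__(self, domain_models, model_manager):
          self.domain_models = domain_models
          self.model_manager = model_manager
      def achieve(self, goal):
          ...   # plan, execute, sense, record observations, replan

  class TaskManager:
      """Schedules the softbot's activities."""
      def __init__(self, planner):
          self.planner, self.queue = planner, []
      def submit(self, goal):
          self.queue.append(goal)
      def run(self):
          while self.queue:
              self.planner.achieve(self.queue.pop(0))

  domain_models = {'ftp': None, 'finger': None}   # one model per service
  softbot = TaskManager(XIIPlanner(domain_models, ModelManager()))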