@prefix vivo: . @prefix edm: . @prefix ns0: . @prefix dcterms: . @prefix skos: . vivo:departmentOrSchool "Science, Faculty of"@en, "Computer Science, Department of"@en ; edm:dataProvider "DSpace"@en ; ns0:degreeCampus "UBCV"@en ; dcterms:creator "Scott, Lawrence Gill"@en ; dcterms:issued "2010-09-27T23:47:36Z"@en, "1990"@en ; vivo:relatedDegree "Master of Science - MSc"@en ; ns0:degreeGrantor "University of British Columbia"@en ; dcterms:description """This thesis addresses the problem of providing explanations for expert systems implemented in a shell that supports a hybrid knowledge representation architecture. Hybrid representations combine rules and frames and are the predominant architecture in intermediate and high-end commercial expert system shells. The main point of the thesis is that frames can be endowed with explanation capabilities on a par with rules. The point is illustrated by a partial specification for an expert system shell and sample explanations which could be generated by an expert system coded to that specification. As background information, the thesis introduces expert systems and the standard knowledge representation schemes that support them: rule-only schemes, and hybrid schemes that combine rules with frames. Explanations for expert systems are introduced in the context of rules, since rules are the only representation for which explanations are supported, either in commercial tools or in the preponderance of research. The problem addressed by the thesis, how to produce explanations for hybrid architectures, is analyzed in two dimensions. Research was surveyed in three areas for guiding principles toward solving the problem: frame logic, metalevel architectures, and reflective architectures. With the few principles that were discovered in hand, the problem is then analyzed into a small number of subproblems, mainly concerning high-level architectural decisions. The solution proposed to the problem is described in two ways. 
First a partial specification for expert system shell functionality is offered, which describes, first, object structures and, then, behaviors at three points in time—object compilation time, execution time, and explanation generation time. The second component of the description is a set of extended examples which illustrate explanation generation in a hypothetical expert system. The solution adopts principles of reflective architectures, storing metainformation for explanations in metaobjects which are distinct from the object-level objects they explain. The most novel contribution of the solution is a scheme for relating all the ways that objects' slot values may be computed to the goal tree construct introduced by the seminal Mycin expert system. The final chapter explores potential problems with the solution and the possibility of producing better explanations for hybrid expert system shell architectures."""@en ; edm:aggregatedCHO "https://circle.library.ubc.ca/rest/handle/2429/28741?expand=metadata"@en ; skos:note "EXPLANATIONS IN HYBRID EXPERT SYSTEMS By LAWRENCE GILL SCOTT B.S., Michigan State University, 1972 M.U.P., Michigan State University, 1976 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (Computer Science Department) We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA March 1990 © Lawrence Gill Scott, 1990 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. 
Department of Computer Science The University of British Columbia Vancouver, Canada ABSTRACT This thesis addresses the problem of providing explanations for expert systems implemented in a shell that supports a hybrid knowledge representation architecture. Hybrid representations combine rules and frames and are the predominant architecture in intermediate and high-end commercial expert system shells. The main point of the thesis is that frames can be endowed with explanation capabilities on a par with rules. The point is illustrated by a partial specification for an expert system shell and sample explanations which could be generated by an expert system coded to that specification. As background information, the thesis introduces expert systems and the standard knowledge representation schemes that support them: rule-only schemes, and hybrid schemes that combine rules with frames. Explanations for expert systems are introduced in the context of rules, since rules are the only representation for which explanations are supported, either in commercial tools or in the preponderance of research. The problem addressed by the thesis, how to produce explanations for hybrid architectures, is analyzed in two dimensions. Research was surveyed in three areas for guiding principles toward solving the problem: frame logic, metalevel architectures, and reflective architectures. With the few principles that were discovered in hand, the problem is then analyzed into a small number of subproblems, mainly concerning high-level architectural decisions. The solution proposed to the problem is described in two ways. First a partial specification for expert system shell functionality is offered, which describes, first, object structures and, then, behaviors at three points in time—object compilation time, execution time, and explanation generation time. 
The second component of the description is a set of extended examples which illustrate explanation generation in a hypothetical expert system. The solution adopts principles of reflective architectures, storing metainformation for explanations in metaobjects which are distinct from the object-level objects they explain. The most novel contribution of the solution is a scheme for relating all the ways that objects' slot values may be computed to the goal tree construct introduced by the seminal Mycin expert system. The final chapter explores potential problems with the solution and the possibility of producing better explanations for hybrid expert system shell architectures. TABLE OF CONTENTS Abstract ii List of Figures vi Acknowledgement x 1 Introduction 1 2 A Survey: Knowledge Representation and Explanation 3 2.1 Expert Systems 3 2.2 Knowledge Representation Schemes 6 2.3 Rule-Based Knowledge Representation Schemes 8 2.4 Frame-Based Knowledge Representation Schemes 11 2.5 Hybrid Knowledge Representation Schemes 17 2.6 Explanation in Expert Systems 18 3 The Problem: Generating Explanations in a Hybrid Shell 23 3.1 Features of Hybrid Expert System Shells 23 3.2 Adding Explanation Functionality to a Hybrid Shell 25 3.3 The Logic of Frames 26 3.4 Metalevel Architectures 28 3.5 Reflective Architectures 31 3.6 The Problem Restated 34 4 A Solution: Explaining Hybrid Expert Systems Using A Goal Tree Object and Metaobjects 37 4.1 Overview: Explaining Hybrid Expert Systems 37 4.2 Object Structures for Explaining Hybrid Expert Systems 40 4.3 Behavior at Compile Time for Explaining Hybrid Expert Systems 45 4.4 Behavior at Run Time for Explaining Hybrid Expert Systems 53 4.5 Behavior at Explanation Time for Hybrid Expert Systems 58 4.6 Chapter Recap 63 5 Sample Explanations and Their Generation 65 5.1 HOW? Explanation for a Rule 66 5.2 WHY? Explanation for a Rule 77 5.3 HOW? Explanations for Methods (and Demons) 83 5.4 WHY? 
Explanations for Methods (and Demons) 93 5.5 Explanations for User Input, External Access, Inheritance, and Initialization Values 98 5.6 WHAT-IS-IT? Explanation 111 6 Analysis and Conclusions 117 6.1 Reflecting on the Solution 118 6.2 Explaining Other Queries 121 6.3 Expressing Explanations Better 127 6.4 Summary and Conclusions 131 Bibliography 133 LIST OF FIGURES Figure 1: Prototypical Expert System Architecture 4 Figure 2: Internal Structure of a Frame, or Object 13 Figure 3: Frame Hierarchy 14 Figure 4: Influence of Mycin Expert System 20 Figure 5: Portion of a Mycin Goal Tree 21 Figure 6: Key Objects in Shell to Explain Hybrid Expert Systems 38 Figure 7: Sample Object-Level Object and Its Metaobject 44 Figure 8: Object Compiler Behavior Supporting Explanations 46 Figure 9: A Rule Parsed into One Goal Unit 47 Figure 10: A Method Parsed into Three Goal Units 48 Figure 11: Agent Compilation Updates Impacted Slots Facets 50 Figure 12: A Rule's Goal Tree Template 51 Figure 13: A Method's Goal Tree Template 52 Figure 14: A Slot-Changing Agent Stores Information for HOW? Explanations 56 Figure 15: Portion of Goal Tree Instantiated by Execution of a Method 58 Figure 16: HOW? Explanation Schema for Slot-Changing Agents 60 Figure 17: WHY? Explanation Schema 61 Figure 18: WHAT-IS-IT? Explanation Schema 63 Figure 19: Sample HOW? Explanation for a Rule 66 Figure 20: A Rule Stores Information for HOW? Explanations 68 Figure 21: ADI Rule 69 Figure 22: Meta ADI Rule's Goal Tree Template 69 Figure 23: ADI Rule's Template of Information Stored in a Goal Tree Node 70 Figure 24: Inserting an Ordered Pair into the History List Facet for Slot Deferred Interest 71 Figure 25: HOW? Explanation for a Rule 72 Figure 26: The Interface Manager's HOW? Explanation Template for Slot-Changing Agents 73 Figure 27: Explanation Templates for ADI Rule for Goal Slot Deferred Interest 75 Figure 28: A User Query 78 Figure 29: Sample WHY? 
Explanation for a Rule 78 Figure 30: The PRP Coverage Rule 79 Figure 31: Context of a Specific WHY? Explanation 80 Figure 32: WHY? Explanation for a Rule 81 Figure 33: The Interface Manager's WHY? Explanation Template 82 Figure 34: Explanation Templates for PRP Coverage Rule for Goal Slot PRP Coverage 83 Figure 35: Sample HOW? Explanation for a Method 84 Figure 36: Compute Total Tax Method 85 Figure 37: A Slot-Changing Agent Stores Information for HOW? Explanations 86 Figure 38: CTT Method's Template of Information Stored in a Goal Tree Node 87 Figure 39: HOW? Explanation Schema for Slot-Changing Agents 88 Figure 40: Explanation Templates for CTT Method for Goal Slot Total Tax 89 Figure 41: Low Taxable Income Demon 90 Figure 42: LTI Demon's Template of Information Stored in a Goal Tree Node 91 Figure 43: Explanation Templates for LTI Demon 92 Figure 44: Sample HOW? Explanation for a Demon 93 Figure 45: WHY? Explanation Schema 94 Figure 46: Compute Total Tax Method, Augmented with User Query 94 Figure 47: Explanation Templates for CTT Method for Goal Slot Is Citizen 95 Figure 48: Sample WHY? Explanation for a Method 96 Figure 49: Low Gross Income Demon 96 Figure 50: Explanation Templates for Low Gross Income Demon for Goal Slot Is Citizen 97 Figure 51: Sample WHY? Explanation for a Demon 98 Figure 52: Interface Manager's HOW? Explanation Template for User Input 101 Figure 53: Ask User Method's Template of Information Stored in a Goal Tree Node 101 Figure 54: Sample HOW? Explanation for User Input 102 Figure 55: Tax DB External Access Object's Template of Information Stored in a Goal Tree Node 103 Figure 56: Explanation Templates for Tax DB External Access for Goal Slot Gross Income 103 Figure 57: Sample HOW? Explanation for External Access 104 Figure 58: Explanation Templates for Ex-Invest KB External Access for Goal Slot Investment Goals 105 Figure 59: Sample WHY? Explanation for External Access 106 Figure 60: HOW? 
Explanation Schema for Initialization Time Value 107 Figure 61: Interface Manager's HOW? Explanation Template for Initialization Values 108 Figure 62: Sample HOW? Explanation for Initialization Time Value 108 Figure 63: HOW? Explanation Schema for Inherited Value 109 Figure 64: Inherited Slot Value 109 Figure 65: Interface Manager's HOW? Explanation Template for Inherited Values 110 Figure 66: Sample HOW? Explanation for Inherited Value 110 Figure 67: Sample WHAT-IS-IT? Explanation 112 Figure 68: WHAT-IS-IT? Explanation Schema 113 Figure 69: The Interface Manager's WHAT-IS-IT? Explanation Template 114 Figure 70: Metaobject's Information for WHAT-IS-IT? Explanations 114 Figure 71: Object Compiler Stores Information for WHAT-IS-IT? Explanations 115 Figure 72: Joe's Explanation Queries 123 ACKNOWLEDGEMENT I appreciate the sacrifices of Virgina, Rainier, and Logan during my studies in general and during the preparation of this thesis in particular. I wish to thank my employer, U S WEST, and especially supervisors John Agnew and Steve Tarr for support of my studies at the University of British Columbia. I enjoy the friendships I made in Vancouver during my studies. 1 INTRODUCTION Difficulty in understanding computer systems' behavior is a serious problem for developers, maintainers, and end users. The standard solution, static documentation, is at best a partial remedy. When documentation exists at all, it may be inaccurate or outdated. Even current, correct documentation can be problematic in that it may be understandable only to programmers. For traditional software, the difficulty makes maintenance a hit-or-miss proposition. For new kinds of software, it can also retard users' acceptance. Endowing software with the ability to explain its behavior has been touted as a way to overcome the understandability barrier for a new class of software, expert systems. 
Some of the cited benefits of explanation include greater acceptance by users, faster learning, quicker recovery from errors, and easier debugging ([Buchanan84a], [Swartout83a], [Wallis84]). Despite visions that vendors' hyperbole may bring to mind—of intelligent programs conversing with users in an articulate fashion—the explanation facilities of most expert systems, and the commercial shells used to construct them, are rather limited, in two distinct ways: • most shells' explanation facilities only explain expert systems constructed with rules, and • they support a rather narrow view of what constitutes an explanation. Recent research has focused on the latter limitation from two perspectives: knowledge representation and explanation generation. Seeking to improve the information available to explanation mechanisms, researchers in knowledge representation have devised alternative constructs for building expert systems. Seeking to improve the final explanations, other researchers have sought to improve the explanation mechanisms, using linguistic and psychological models. These efforts have ignored the first limitation, that of explaining only rules. That neglect is unfortunate because the trend in expert system shells is toward hybrid architectures which combine rules with another knowledge representation paradigm: frames, or objects ([Gevarter87], [Harmon89a, b, and c]). This thesis takes first steps toward filling the gap between explanation research and expert system building practice. It provides a framework for providing explanations for the hybrid representation found in many intermediate and high-end commercial expert system shells. The main point of the thesis is that frames can be endowed with explanation capabilities on a par with rules. The point is illustrated by a partial specification for an expert system shell and sample explanations which could be generated by an expert system coded to that specification. 
The framework described in the thesis suffers from the limitation that its view of explanation is too narrow, and the final chapter of the thesis considers whether the framework can scale up to produce better explanations. The thesis is organized as follows. Standard explanations for rules—i.e. those supported by commercial shells—are described in chapter 2. How to produce standard explanations for hybrid architectures—the topic of this thesis—is pondered in chapter 3. A solution is proposed in chapter 4 and illustrated in chapter 5. Finally, producing better explanations for hybrid architectures—beyond the scope of this research—is explored in chapter 6. 2 A SURVEY: KNOWLEDGE REPRESENTATION AND EXPLANATION Though not yet as commonplace as database systems, expert systems—also known as knowledge-based systems—seem on their way to becoming so. Just ten years ago only a few examples existed in university laboratories. Buoyed by academic successes, professors turned entrepreneurs and founded firms to market expert system tools. In recent years corporate giants like Texas Instruments and International Business Machines have joined the fray. In that ten years, the tools have evolved along size and complexity dimensions. Their explanation capabilities, however, have not kept pace, as we see later in this chapter. This chapter supplies background information assumed by the remainder of the thesis. A brief introduction to expert systems describes their capabilities, limitations, and typical system architecture. Expert systems attempt to explicitly represent knowledge about the world, and we look at the two knowledge representation schemes in commercial tools: rules only, and hybrid systems which combine rules and frames. Expert systems extended the notion of on-line help to dynamic explanations of behavior; the final sections in this chapter consider the range of explanations currently possible. 
The inadequacies of canned text are contrasted to the standard approach commonly available in expert system shells. 2.1 EXPERT SYSTEMS An expert system is computer software that solves a problem thought to require intelligence or reasoning. Many practitioners prefer to call them knowledge-based systems, since that label does not appear to exclude software that functions as an intelligent assistant or intelligent colleague. However, other artificial intelligence systems can be knowledge-based, e.g. systems for natural language processing, vision, and machine learning. Thus this thesis will bend to the tide of usage and use the term expert system for any system that employs expertise and reasoning to solve problems, wherever that expertise lies on the novice-to-expert continuum. Expert systems have been successfully applied to a broad range of problem types: classification, diagnosis, monitoring, configuration, planning, and design are frequent examples. Within those problem types, individual systems are usually \"expert\" only in a narrow range of expertise. And some knowledge-based problems are not yet amenable to solution by expert systems, for example problems that require spatial, temporal, or analogical reasoning. The architecture of a prototypical expert system is illustrated in Figure 1. Its key features are the user/developer interface, inference engine, knowledge base, and some number (perhaps zero) of external access interfaces. [Figure 1: Prototypical Expert System Architecture] The inference engine is the heart of an expert system, since inference is a prime distinction between expert systems and other kinds of software. The nature of the inference engine determines how knowledge will be represented in the knowledge base by the developer. 
During a consultation, the inference engine dynamically drives the processing, whence its name. The dynamic processing reflects both static knowledge and dynamic, situation-specific facts in the knowledge base. The developer constructs the static portion of the knowledge base, representing knowledge about the real world domain that the expert system uses to solve problems. The static knowledge base represents facts, complex structures, and/or relationships between structures. The user and external access interfaces add dynamic, situation-specific facts into the knowledge base. Based on those facts, the inference engine infers others to solve its assigned problem. The user/developer interface is the point of contact between humans and an expert system. The developer uses the developer interface to build the static portion of the knowledge base, to tailor the user interface, and to establish external accesses. At run time, the user interface enables the user to interact with the system in whatever ways the developer has provided. Typically one of those ways is to ask for and receive explanations of the expert system's behavior. External access interfaces may link to sensors, databases, other conventional software, and/or other expert systems. Some expert systems are written in programming languages, e.g. Lisp, Prolog, and C. Others are written using tools called shells, e.g. KEE, ART, and Knowledge Craft. Shells typically consist of an inference engine, a developer interface, and the generic portion of a user interface. Some shells provide facilities for external access as well. A shell provides the syntax for representing knowledge during development, and a procedural semantics which effects inference during execution. Obviously, a shell's ability to explain its behavior is tied to how it represents knowledge. Thus some background in knowledge representation in expert systems is presented before we consider explanation in detail. 
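The prototypical architecture described above (user/developer interface, inference engine, knowledge base, optional external accesses) can be sketched as a minimal Python skeleton. Every class and method name below is an invented illustration, corresponding to no particular shell:

```python
class KnowledgeBase:
    """Holds static knowledge plus dynamic, situation-specific facts."""
    def __init__(self, static_knowledge):
        self.static = static_knowledge      # built by the developer
        self.facts = {}                     # added at run time

    def assert_fact(self, name, value):
        self.facts[name] = value


class InferenceEngine:
    """Drives a consultation by deriving answers from the knowledge base."""
    def __init__(self, kb):
        self.kb = kb

    def run(self, goal):
        # A real engine would chain over rules; this stub just looks up.
        return self.kb.facts.get(goal)


class ExpertSystem:
    """Ties together the components of Figure 1."""
    def __init__(self, static_knowledge, external_sources=()):
        self.kb = KnowledgeBase(static_knowledge)
        self.engine = InferenceEngine(self.kb)
        self.external_sources = external_sources   # sensors, databases, ...

    def consult(self, goal, user_facts):
        # The user interface adds situation-specific facts, then the
        # inference engine drives the processing.
        for name, value in user_facts.items():
            self.kb.assert_fact(name, value)
        return self.engine.run(goal)
```

For example, `ExpertSystem(static_knowledge=[]).consult("diagnosis", {"diagnosis": "flu"})` returns the asserted fact, showing the flow of facts from the user interface through the knowledge base to the engine.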
2.2 KNOWLEDGE REPRESENTATION SCHEMES Brachman and Levesque provide a concise, clear statement of the significance of knowledge representation for artificial intelligence research, in the introduction to [Brachman85b]: The notion of the representation of knowledge is at heart an easy one to understand. It simply has to do with writing down, in some language or communicative medium, descriptions or pictures that correspond in some salient way to the world or a state of the world. In Artificial Intelligence (AI), we are concerned with writing down descriptions of the world in such a way that an intelligent machine can come to new conclusions about its environment by formally manipulating these descriptions. In his Knowledge Representation Hypothesis, Smith states the notion more formally ([Smith85]): Any mechanically embodied intelligent process will be comprised of structural ingredients that ... represent a propositional account of the knowledge that the overall process exhibits and ... [that] play a formal but causal and essential role in engendering the behavior that manifests that knowledge. Smith's hypothesis introduces the two crucial ingredients of representation: structure and behavior. Schemes for representing knowledge, such as rules and frames, must supply both ingredients: structure and behavior that can act on the structure. The combination of structure and behavior must enable the scheme to convey meaning about a slice of the world. Minimally, a knowledge-based system encoded in any scheme must be able to determine what it \"knows\" about the slice of the world it models. The desirability of a representation scheme can be viewed along two dimensions: expressive adequacy and notational efficacy ([Woods83]). Expressive adequacy is what the representation allows to be said. 
Notational efficacy concerns several attributes of a representation scheme, such as its computational efficiency for different kinds of inference, how concise the scheme is, and how easy it is to modify. Levesque and Brachman have noted a fundamental tradeoff between the two dimensions: limitations on a representation scheme's expressiveness are necessary if its reasoning is not to become computationally intractable ([Levesque85]). Hayes notes that to be clear about exactly how a scheme represents knowledge about the world, the scheme must have an associated semantic theory. A semantic theory is an account of the way in which particular configurations of the scheme correspond to particular arrangements in the external world ([Hayes85a]). Some representation schemes have very precise semantic theories, e.g. logic-based rules; other schemes seem to have no formal semantic theory. Most knowledge representation schemes model the world as a collection of individuals and relationships that exist between them. States are the collection of all individuals and relationships at one point in time. Schemes can be differentiated by their viewpoint into this common model. For example procedural schemes (e.g. Lisp code) are based on the viewpoint of state transformations ([Mylopoulos84]). 2.3 RULE-BASED KNOWLEDGE REPRESENTATION SCHEMES Rules are a very natural way to represent some forms of knowledge. Newell and Simon report that experts often discuss their knowledge in language corresponding to rules ([Newell72]). Rules are straightforward and easy to understand, due to their simplicity of notation. Thus it is not surprising that rules were the dominant form of representation in the first generation of expert systems. The structure of a rule can be expressed in either of two equivalent forms: IF <premise> THEN <consequent>, or alternatively IF <premise>, <consequent>. The premise can consist of multiple clauses connected by logical connectors (AND, OR, NOT, etc.). 
In many representation schemes the consequent may also contain more than one statement (in which case it is interpreted as actions to be carried out in sequence). The behavior of rule-based knowledge representation schemes is uniform and straightforward ([Parsaye88]): (1) Knowledge exists in the form of rules and facts. (2) New facts are added. (3) Combining the new facts with existing facts and rules leads to the deduction of further facts. Two strategies are available to control inference in the third step. In the first, reasoning proceeds forward with a rule whose premise matches the facts; its firing adds one or more new facts to the knowledge base, and steps 2 and 3 repeat. This approach is called data driven, or forward chaining. Alternatively a goal can be established, and rules considered which conclude that goal; if some rule concludes the goal, but part of its premise is unknown, the premise becomes the new goal. This approach is goal driven, or backward chaining. The two control strategies behave differently and are appropriate for different kinds of expert systems. However a given rule can be used with either or both strategies. Rule-based representations rank high in expressive adequacy. They can be about specific objects (\"If the drill has a damaged cord,...\") or entire classes of objects (\"If any device has a damaged cord,...\"). Rules about other rules, called metarules, can provide more focused behavior than simple forward or backward chaining (see e.g. [Parsaye88]). Indeed rules can be used to express the same knowledge as other representation schemes (see e.g. [Walker87] and [Thayse88]). Rules may or may not rank so high in notational efficacy, depending on the size and nature of a rule-based expert system. 
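The three-step behavior and the two control strategies just described can be made concrete in a short Python sketch. The set-based rule format and the function names are illustrative assumptions, not drawn from the thesis:

```python
# A rule is a (premise, conclusion) pair: every premise fact must hold
# for the conclusion fact to be deduced.
RULES = [
    ({"has_fever", "has_cough"}, "has_flu"),
    ({"has_flu"}, "needs_rest"),
]

def forward_chain(facts, rules):
    """Data driven: fire any rule whose premise matches the facts, add
    its conclusion to the fact base, and repeat until nothing new is
    deduced (steps 2 and 3 of the behavior above)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_chain(goal, facts, rules):
    """Goal driven: a goal holds if it is a known fact, or if some rule
    concludes it and each premise clause (the new subgoals) holds."""
    if goal in facts:
        return True
    return any(conclusion == goal and
               all(backward_chain(p, facts, rules) for p in premise)
               for premise, conclusion in rules)
```

With the facts `{"has_fever", "has_cough"}`, forward chaining deduces `has_flu` and then `needs_rest`; backward chaining on the goal `needs_rest` succeeds by recursively establishing `has_flu`. The same rules serve both strategies, as the text notes.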
We consider the three aspects of notational efficacy individually. Rule-based inference requires attention if it is to be computationally efficient in large applications. On the one hand, the goal directed nature of backward chaining ensures that only the rules appropriate to situation-specific facts are considered. And Forgy has devised a way to significantly improve the performance of forward chaining processing, the Rete Algorithm ([Forgy82]). On the other hand, rule organization techniques such as Rete, metarules, and rule groups had to be developed to ameliorate control and performance difficulties that develop during inference in large rule bases, e.g. poor performance or unfocused inference ([Davis80], [Chandrasekaran84], [Parsaye88]). Rules are concise in some contexts, and not in others. Mylopoulos notes rules encourage \"conceptual economy\", in that a rule need be stated just once, even if used in different ways in a knowledge base ([Mylopoulos84]). Others note problems that can result from forcing knowledge not well suited to rules into a rule scheme. For example, a rule-based approach to procedural control makes flow of control implicit and context explicit—exactly opposite from the desired state for some problems ([Georgeff86]). Chandrasekaran notes a 20%/80% effect in some problem domains: while much of the domain is represented in relatively few rules, the remaining domain knowledge requires many more ([Chandrasekaran84]). Rules are generally viewed as easy to modify. A developer can often create rules without worrying in advance about the order in which actions should be taken ([Parsaye88]). New rules can be added to flesh out a problem domain with little or no impact on previously debugged rules. However Woods notes that adding significant new perspectives, e.g. 
time or situation variables or intermediate steps and agents, can require rewriting a rule base from the ground up ([Woods83]). The discussion thus far has obscured an important point about rules: there are actually two varieties in widespread use. Rule schemes close to first-order predicate logic, such as Prolog, have a clean, well understood, and accepted formal semantics. Production rules, found in most expert system shells, do not. Chandrasekaran notes that existence of a rigorous semantics does not necessarily make a scheme better for building systems [Chandrasekaran84]. We see in section 2.6 that lack of a formal semantics does not preclude a scheme from having a viable explanation facility. Logic rules and production rules may be further contrasted by their viewpoints into the world model consisting of individuals, relationships, and states. Production rules are generally considered to be procedural, i.e. based on a view of state transformations like Lisp code. Logic-based rules are based on a viewpoint of true assertions about states; however in endowing a logic-based scheme with a procedural semantics, their behavior becomes very much like production rules. Henceforth we ignore the difference between the two kinds of rules, and continue to refer to rules, meaning both varieties. In summary, rules are an effective scheme in which to represent many, but not all, kinds of knowledge in expert systems. While their expressive adequacy is high in theory, in practice their notational efficacy constrains their usefulness. 2.4 FRAME-BASED KNOWLEDGE REPRESENTATION SCHEMES This section discusses frames as a knowledge representation scheme. The intent here is not to suggest frames (only) as an alternative to rules for representing knowledge. 
Rather, frames are described as a prelude to discussing hybrid representation schemes in the following section. If contrasts are to be drawn at all, they should be between the rule only vs. hybrid (rule and frame) schemes. Frames are organizational devices for modeling explicitly the objects in a real world problem domain, relationships between those objects, and contexts in which different relationships apply. Like rules, there is evidence that frames are a natural way of organizing knowledge. Early research by Bartlett demonstrated that human memory contains knowledge structures that aid in interpreting new information ([Bartlett32]). In a seminal paper (reprinted as [Minsky85]), Minsky argues that frames are more desirable than traditional logic for representing knowledge to solve many realistic, complicated problems. First order logic fails in expressive adequacy in that it is poorly suited to representing approximate solutions, which Minsky sees as vital in human problem solving. Logic also fails in terms of notational efficacy. Minsky claims a human thinker reviews plans and goal lists, and while one can program these with theorem proving, he argues one really wants to represent them directly, in a natural (perhaps procedural) way. The notion of frames was conceptualized independently from object-oriented programming. Minsky notes the origins of the former in work by Bartlett and Kuhn. The latter did not originate with, but was popularized in, the language Smalltalk. As Parsaye and Chignell note, the distinction between the two concepts is rapidly disappearing ([Parsaye88]). Hence in this thesis we use the terms frame and object interchangeably. There are two aspects to the structure of frame systems: the internal structure of objects and the relationships between objects. Figures 2 and 3 illustrate the two aspects. An object is composed of a collection of attributes, commonly called slots. 
In simpler frame systems each slot has just an associated value. However, as shown in Figure 2, in many modern frame systems slots are themselves composed of collections of facets. One of the facets is typically the Value facet. [Footnote: Henceforth in body text, words in the object language appear in bold Helvetica Narrow font, while words in ordinary (meta-)language appear in Palatino font.] Thus a slot's value in the simple view equates to the value of the slot's Value facet in the more complex view. Other facets can define a slot's allowable values, its default value, how many values it may have, etc. Not all slots need have the same facets.

Object Name: Sample Object
Inheritance: Is Instance Of Sample Objects
Slot Name: Sample Slot
  Facets: Value true
          Legal Values (true false)
          Default Value false
          How Many Values 1
Slot Name: Another Sample Slot
  Facets: Value (red white blue)
          Legal Values (from-class 'Colors')
          Default Value (white)
          How Many Values unlimited
Slot Name: Third Sample Slot
  Facets: Value 3.1714

Figure 2: Internal Structure of a Frame, or Object

Figure 3 illustrates some sample objects and the hierarchical structure that relates them. The links between individual frames are of two types: superclass-subclass links and class-instance links, indicated by solid and dotted lines respectively. That is, (super)class Vehicle has two subclasses, Volvo and Mercedes. And class Volvo has two instances, My Old Volvo and Richard's New Volvo. A class object may have any number of subclasses and/or instances. Instance objects are prohibited from having descendents in some frame schemes; in other schemes there is no such prohibition. The two types of links are both referred to as IS-A links, as in "Volvo is a Vehicle" or "My Old Volvo is a Volvo".
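The structure just described, objects composed of slots, slots composed of facets, and objects connected by IS-A links, can be sketched in a modern language. This is a minimal illustration in Python (which postdates the shells discussed here); all class and slot names are hypothetical:

```python
# A minimal sketch of frame structure: an object holds named slots,
# each slot holds a dictionary of facets (Value, Default Value, etc.),
# and IS-A links connect objects into a hierarchy.

class Frame:
    def __init__(self, name, is_a=None):
        self.name = name
        self.is_a = is_a or []   # superclass-subclass or class-instance links
        self.slots = {}          # slot name -> facet dictionary

    def define_slot(self, slot, **facets):
        # Facets beyond Value describe the slot itself.
        self.slots[slot] = facets

vehicle = Frame("Vehicle")
volvo = Frame("Volvo", is_a=[vehicle])               # subclass link
my_old_volvo = Frame("My Old Volvo", is_a=[volvo])   # instance link
volvo.define_slot("Reliability", Value="high", LegalValues=("high", "low"))
```

Note that nothing distinguishes the two link types here; a fuller sketch would tag each link as subclass or instance, as the shells described above do.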
Vehicle
  Volvo (subclass)
    My Old Volvo (instance)
    Richard's New Volvo (instance)
  Mercedes (subclass)

Figure 3: Frame Hierarchy

Frame hierarchies are similar to semantic networks in many respects. Both schemes represent knowledge in conceptual units like frames with role descriptions like slots. And both schemes relate the conceptual units with a hierarchy expressing generalization relationships ([Brachman85a]). Although semantic nets were popular during the 1970s, Parsaye notes that they are used on their own in relatively few systems today ([Parsaye88]). Nor do any commercial expert system shells support them. Thus we do not discuss semantic nets further in this thesis. Frames' behavior extends the notion of static record structures in two ways. The two ways directly relate to objects' internal structure and the relationships between objects. Internally, a slot's value is not restricted to static data, but can be an active data element, i.e. procedural code. In early frame formalizations this capability was called procedural attachment. In the object-oriented programming tradition, the same capability is achieved via messages passed between objects that activate methods in the target object. This thesis uses the message and method terminology. A slot may be associated with a special kind of method, called a demon. Demons are methods augmented with an invocation condition. Demons are not activated by the explicit message-method mechanism; instead, a demon activates whenever its invocation condition is satisfied ([Parsaye88]). The frame representation provides a flexible mechanism for inheriting information between objects. That is, a class object normally passes some (in some schemes, all) of its slots down the IS-A links to its descendents, i.e. its subclasses and/or instances.
Inheritance occurs down the hierarchy to all descendents, regardless of the number of intervening objects; that is, My Old Volvo in Figure 3 would inherit from both the Volvo and Vehicle objects. Inherited slots can contain data or methods, and all facets of an inherited slot are inherited. An object may specify nothing about an inherited slot, in which case the slot behaves for the heir just as it does for the object where originally defined. Alternatively, an object can specify information about an inherited slot locally, overriding its inherited behavior to create tailored local behavior. Besides capturing structural information in their IS-A links, frames also allow the full range of expression of whatever language their methods are coded in—Lisp for example. Their great flexibility permits a developer to define and use any datatype that the combined method language and IS-A link formalisms allow. With such broad expressiveness, frames' notational efficacy might be expected to suffer (remember the tradeoff between expressiveness and tractability). Fortunately, the hierarchical organization useful for structuring real world knowledge also provides efficiencies in storage as well as inference ([Chandrasekaran84], [Parsaye88]). The inheritance capability of frames adds to their modularity and compactness of expression ([Parsaye88]). Finally, frames facilitate modification: since knowledge is structured in units that correspond to the real world, it is usually obvious which object needs to be modified to effect a desired change. Little early work on frames presented a formal semantic theory for them. In comments preceding their reprint of Minsky's proposal for frames, Brachman and Levesque complain that "vagueness and general lack of rigor has followed many who pursued the frame ideas, as if the topic itself demanded a certain informal style of research".
They continue that "frames as a representational framework seem to have much more to do with cognitive memory models than with mathematical logic and logical inference" ([Brachman85b]). In other work, Brachman supplies some answers about the semantics of frames, which we consider more closely in section 3.3. Frames are based on multiple viewpoints into the world model of individuals, relations, and states. The external, hierarchical structure of frames is explicitly based on the viewpoint of individuals and relationships. Frames' methods, consisting of procedural code, add the procedural viewpoint of state transformations. Frames' slots are based on the view of relations between objects, a la semantic networks. Frames represent an intuitive and flexible way to model a slice of the world. Objects in the real world can be placed in one-to-one correspondence with objects in the model. Inheritance is a powerful mechanism for efficient sharing of structure and behavior. Frames alone represent a powerful, emerging programming paradigm—object-oriented programming. In the next section, we introduce the marriage of rules and frames in hybrid expert system shells.

2.5 HYBRID KNOWLEDGE REPRESENTATION SCHEMES

Some researchers purposely eschew hybrid representations, and advocate representations based entirely on rules, specifically logic rules, because of their well-defined semantics. [Jackson89] makes a particularly strong argument for this view: Logical formalisms are rich enough to provide different concrete architectures, while on the other hand the use of logic provides a unifying framework for the system which saves it from the unstructured richness of hybrid systems.... Hybrid systems are not useful for building expert systems, precisely because they offer a bewildering array of possibilities and little if any guidance as to which are appropriate to what tasks....
Although logic's semantic and computational properties are somewhat negative, the important point is that these properties are known at all. [Footnote: It should be noted that Jackson et al are arguing for a more powerful logic than first order predicate logic.] Jackson et al's purist view stands in stark contrast to the pragmatic approach maintained by hybrid systems' proponents ([Bobrow85], [Parsaye88], [Chandrasekaran84]). Even relative purists Levesque and Brachman argue that "there is no single best language" ([Levesque85]). Thuraisingham calls tools in which the object-oriented model is augmented with logical deduction "generic representations" and praises their power to model structural entities and rules ([Thuraisingham89]). Jackson et al call them "high level programming environments" and admit their tools prove useful in appropriate applications ([Jackson89]). This thesis does not join the argument, but obviously sides with the pragmatists. One motivation for studying the hybrid tools is that they are the model which developers of powerful commercial expert system shells have adopted. The list of such shells includes Knowledge Engineering Environment (KEE), Automated Reasoning Tool (ART), Knowledge Craft, Nexpert Object, Aion Development System, Knowledge Base Management System, Goldworks, and others. These shells have different foci. KEE, for example, is at heart frame-based, with rules added on. ART's view is the converse: primarily rule-based, with frames added on. Knowledge Craft is a loosely coupled collection of tools, providing forward chaining rules (OPS), backward chaining rules (Prolog), and a frame language (CRL). In chapter 3 we clarify the thesis' view of a prototypical hybrid shell, and define the explanation facility we provide for it. There are advanced representations used in expert systems that we could, but do not, consider part of hybrid shells.
Some of these are multiple worlds and truth maintenance, scripts, and blackboard architectures. Having described hybrid expert system shells, we are now ready to turn to our main concern with them, namely their capability to explain their behavior.

2.6 EXPLANATION IN EXPERT SYSTEMS

An expert system's explanation facility provides explanations of its actions and conclusions to a variety of users, including developers and end users. Almost since the expert system era began, researchers have considered explanation to be one of AI's most valuable contributions ([Wick89a]). This has led to a difference in users' expectations between expert systems and other software: users may anticipate weeks or months learning other software, but expect to begin useful work with an expert system almost immediately ([Wexelblat89]). The simplest mechanism for generating explanations is canned text. The developer attempts to anticipate all situations requiring explanation, and creates and stores text strings to recall at an appropriate later time. Problems with applying the canned text approach to systems of any significant size or complexity are well known ([McKeown85a], [Swartout83b]). As with any documentation, it is hard to maintain consistency between the text strings and the actual functioning of the system as it changes over time. In a complex system, it may be impossible to anticipate questions and responses for all situations that may occur. Difficulties in achieving high quality explanations result directly from the fact that the system's explanations are not based on a conceptual model. Thus, while canned text may suffice for error messages and help systems, the approach is not considered to be robust enough for intelligent, dynamic explanations ([Wick89b]).
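The canned-text mechanism amounts to a static table keyed by anticipated situations; the maintenance problem arises because nothing ties the stored strings to the system's actual behavior. A minimal sketch, with hypothetical keys and text:

```python
# Canned-text explanation: a static lookup keyed by anticipated
# situations. The strings are written by hand, so nothing keeps them
# consistent with the system as it evolves, and unanticipated
# situations simply have no explanation.

CANNED = {
    "why-question-7": "The identity of the organism is needed to select a therapy.",
    "how-category":   "The category was inferred from the culture site.",
}

def explain(situation):
    return CANNED.get(situation, "No explanation available.")
```

The lookup itself is trivial; the point is that the explanation text is disconnected from the reasoning it purports to describe, which is exactly the weakness the template and goal-tree mechanisms below address.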
Mycin, an early rule-based expert system for diagnosing infectious blood diseases developed at Stanford University in the 1970s, has been enormously influential in the spread of expert systems. Figure 4 (modified from [Clancey86]) illustrates other research systems that Mycin spawned at Stanford. It also indicates the importance of one derivative, Emycin, on the commercial development of expert systems. Emycin—short for Essential or Empty Mycin—removed Mycin's domain specific knowledge and was applied to other problem domains. Emycin demonstrated the viability and potential of expert system shells. Many first generation shells were simply reimplementations of Emycin, e.g. M.1, Personal Consultant, and Expert Systems Environment.

[Figure 4: Influence of Mycin Expert System. The figure is a tree relating Mycin's descendants in five categories: research expert systems, research shells, tutoring systems, shell-based expert systems, and commercial shells.]

The Mycin system introduced the explanation mechanism that is the foundation of most commercial explanation facilities today: templates ([Scott84]). Templates are text phrases with slots that can be filled by different words for different situations; individual templates can be strung together to produce an explanation of a multi-step process. In Mycin, a template is associated with each rule, and the slots in the templates are filled by translations of variables instantiated in the rules. Templates are more flexible than simple canned text, but they can suffer from the same problems and they require effort to produce readable output ([McKeown85a]). Mycin's templates are used to generate explanations by interrogating two knowledge structures: a goal tree and Mycin's rules. The dynamic, consultation-specific goal tree consists of nodes representing goals and subgoals pursued during the consultation. Goals are simply the need to know the values of variables.
When a goal may be satisfied by application of one or more rules in Mycin's static knowledge base, the corresponding node in the goal tree is indexed to the relevant rule(s). See Figure 5 for a portion of a goal tree ([Scott84]).

goal: IDENTITY of ORGANISM-1
  ask: question 7
  rules: RULE009 (failed, clause 1)... RULE003 (succeeded)...
  goal: CATEGORY of ORGANISM-1
    rules: RULE037 (succeeded)...
    goal: HOSPITAL-ACQUIRED of ORGANISM-1
      ask: question 15  [no rules]
  goal: GRAM of ORGANISM-1
    ask: question 11  [no rules]

Figure 5: Portion of a Mycin Goal Tree

These two knowledge structures, i.e. static rules and dynamic goal tree, enable Mycin's template mechanism to produce two kinds of explanations: why a piece of information is being asked of the user and how a goal was achieved. Because these are not the only possible interpretations of the queries how and why, we distinguish the Mycin interpretations as HOW? and WHY? in this thesis. The explanation facilities in commercial expert system shells are limited to answering these HOW? and WHY? queries for rules. A few shells provide some minimal augmentation, such as a graphical view of the goal tree or simple explanations of the "objects" implicit in rules (which we refer to as WHAT-IS-IT? explanations). As Wick and Slagle note, almost all research into explanations relies on additional knowledge not available in existing shells ([Wick89a]), a topic we take up again in chapter 6. It seems ironic that hybrid shells, with all their richness of knowledge representation, provide explanations only for slot values that are inferred by rules. We are now ready to explore what it means to explain slot values computed by any of the variety of mechanisms found in hybrid shells.
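The interplay of Mycin's two structures, the dynamic goal tree and the rule-indexed templates, can be sketched as follows. This is a simplified illustration, not Mycin's actual implementation; the node and rule names echo Figure 5:

```python
# Sketch of Mycin-style explanation: each goal-tree node records the
# rules tried for its goal and a link to its parent goal. HOW? reads
# off the rules that succeeded; WHY? walks up toward the root goal.

class GoalNode:
    def __init__(self, goal, rules_tried=None, parent=None):
        self.goal = goal                      # "the need to know a variable's value"
        self.rules_tried = rules_tried or []  # (rule id, outcome) pairs
        self.parent = parent                  # WHY? follows this link upward

def explain_how(node):
    # HOW?: which rule(s) established this goal's value.
    succeeded = [r for r, outcome in node.rules_tried if outcome == "succeeded"]
    return f"{node.goal} was established by: {', '.join(succeeded)}"

def explain_why(node):
    # WHY?: the parent goal that the current question serves.
    return f"{node.goal} is needed to determine {node.parent.goal}"

identity = GoalNode("IDENTITY of ORGANISM-1",
                    [("RULE009", "failed"), ("RULE003", "succeeded")])
category = GoalNode("CATEGORY of ORGANISM-1",
                    [("RULE037", "succeeded")], parent=identity)
```

In Mycin the answer strings come from per-rule templates rather than the fixed phrasing used here, but the data the templates interrogate is exactly this pair of structures.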
3 THE PROBLEM: GENERATING EXPLANATIONS IN A HYBRID SHELL

This chapter makes explicit the problem addressed by the thesis. Its first section describes important features of the hybrid shells for which explanation functionality is specified in chapter 4. In the second section, the problem of adding explanation functionality is refined, by clarifying what structure within a hybrid architecture to explain, and what queries to answer. Three areas of research appear to bear on the problem of explaining frames, and a section in this chapter is devoted to each. Taking a cue from Georgeff and others, that the semantics of a scheme can be important for providing explanation capability ([Georgeff86]), the logic of frames is explored, with disappointing results. Explanation is a form of metareasoning, i.e. reasoning about reasoning; thus the section after that considers whether research into metalevel architectures provides any guiding principles or pragmatic suggestions for explaining frames. Explanation may also be viewed as a form of computational reflection, in that a system's explanation facility may involve structures representing aspects of the system itself; hence a third exploration section considers whether research into reflective architectures provides principles or suggestions. Of the three potential sources of ideas, the last is the most helpful. A summary and restatement of the problem concludes the chapter.

3.1 FEATURES OF HYBRID EXPERT SYSTEM SHELLS

The hybrid shells we target for explanation are those with essentially the mix of rule-based reasoning and object-oriented programming provided in commercial expert system building tools such as KEE, ART, Knowledge Craft, etc. These shells provide full object-oriented programming capabilities including IS-A hierarchies, inheritance, and message passing. They also provide robust rule-based inference with both forward and backward chaining.
The rules and frames in these systems are integrated. That is, rule-based inference can stimulate methods, and conversely, methods can stimulate rule-based inference. As a concrete example, consider KEE, which provides two different languages to effect rule-based and frame-based reasoning. KEE's rule language is called TellAndAsk and its method language is Common Lisp. Both the premise and consequent of TellAndAsk rules may contain TellAndAsk's well formed formulas (wff's) and/or Lisp expressions that invoke methods. Forward and backward chaining are invoked by the expressions (ASSERT ...) and (QUERY ...), respectively. Rule-based reasoning can be invoked by using these expressions within either rules or methods ([IntelliCorp87]). It is important to note that these frame systems typically include accessible generic classes for objects, rules, methods, and demons. Assuming these classes are first class objects, then adding structure and behavior to them for explanation purposes will make that structure and behavior apply to all instances of objects, rules, methods, and demons.

3.2 ADDING EXPLANATION FUNCTIONALITY TO A HYBRID SHELL

A hybrid tool's integration of rules and frames makes it possible to accomplish the same behavior in a variety of ways. This is the "unstructured richness" that Jackson et al lament. While it may indeed be advisable to heed the advice of Parsaye and Chignell to "maximize inference in the rules and retrieval in the frames" ([Parsaye88]), we can reasonably assume that in an imperfect world, expert systems have been and will be built that mix rules and frames in a less optimal fashion. This assumption provides the motivation to explain frames on a par with rules. Like Wick and Slagle, this thesis aims to do so using only the technology currently available in hybrid shells ([Wick89a]), i.e. not requiring the advanced explanation methodologies described in chapter 6.
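The kind of mixing just described, a rule whose premise calls back into the inference machinery, can be sketched abstractly. The interfaces below are hypothetical, not KEE's TellAndAsk API:

```python
# Sketch of rule/method integration in a hybrid shell: a rule's
# premise is arbitrary code (standing in for a method call), and
# that code may itself trigger backward chaining, just as methods
# in a hybrid shell may invoke rule-based inference.

class HybridShell:
    def __init__(self):
        self.facts = set()
        self.rules = []   # (premise function, conclusion) pairs

    def query(self, goal):
        # Backward chaining: a goal succeeds if it is a known fact,
        # or if some rule concluding it has a satisfiable premise.
        if goal in self.facts:
            return True
        for premise_fn, conclusion in self.rules:
            if conclusion == goal and premise_fn(self):
                self.facts.add(goal)
                return True
        return False

shell = HybridShell()
shell.facts.add("engine-starts")
# The premise calls back into the shell: "method" code invoking inference.
shell.rules.append((lambda s: s.query("engine-starts"), "car-runs"))
```

The circularity on display here, premises invoking inference which fires premises, is precisely what makes it possible to accomplish the same behavior in many different ways in a hybrid tool.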
Just what is it then that we wish to explain? The objects in frame-based systems have three levels of structure we might focus on: objects themselves, objects' slots, or slots' facets. The level that is usually asserted by rules and assigned values by methods is slots (more precisely the slots' Value facets, but this distinction is not important). Thus our problem is to explain a slot's value for any of the ways that value might be assigned. There are three general ways that slots' values can be determined. Some slots, representing static information, never change values during a consultation; we label this situation initialization values. Secondly, a slot's value may not be specified locally at all, but rather be determined by the frame hierarchy's inheritance mechanism. Finally, a slot's value may be determined dynamically during a consultation in any number of ways, which we refer to collectively as slot-changing agents. Slot-changing agents include rules; methods, including demons; user supplied values; and values returned from access to external processes. At a minimum then, we want to answer Mycin-style HOW? and WHY? queries for all the ways a slot's value can be determined in a hybrid shell. We prefer the explanation mechanism to be as uniform as possible, regardless of how a slot's value may be determined. Ideally we would also like our solution to scale up, that is, to accommodate other kinds of explanations. The topic of explaining frame-based reasoning has been largely ignored by other researchers of expert system explanations. The search for ideas of how to attack the problem led naturally to the topics presented in the next three sections.

3.3 THE LOGIC OF FRAMES

A semantics was not specified by the early proponents of frames, who were more interested in richness of representation and the possibility of expressing situations which were awkward in first order predicate logic.
Remember that logicians complained about the "informal style" of frame research. When logicians applied their own rigor to the question of frame logic, they uncovered a subtle but severe constraint on formally defining a frame semantics. If a frame notation "allows cancellation, but provides no mechanism for noting certain facts as uncancellable, then it simply cannot express universal truths, ... [or] more precisely, it can only represent universal truths extensionally (by explicitly indicating all cases)" ([Brachman85a]). It is interesting to note that software engineering perspectives on object-oriented programming have observed that object-oriented design for reusability of classes leads to a principle which would avoid Brachman's concern: the appropriate use of inheritance is to model a type hierarchy, in which every class should be a particular kind of its superclasses. "Subclasses should add responsibilities to their superclasses; they should not cancel inherited responsibilities, or override them to become errors, or no behavior at all" ([Wilkerson89]). Ignoring procedural attachment, i.e. methods, Brachman has carefully analyzed the semantics of IS-A links ([Brachman83]). He clarifies that IS-A links are not synonymous with inheritance, which he calls an implementation issue, rather than one of expressive power. Noting that the semantics of a general-purpose IS-A mechanism cannot be predicted in general, Brachman (and also Hayes ([Hayes85b])) has categorized the variety of inferences that the various uses of IS-A links suggest. Brachman then goes on to make a concrete proposal for what IS-A links should be. His suggestion involves enhancing IS-A links with information that looks ("suspiciously", he says) like formalisms that make up special cases of standard logical statements: assertional force, modality, and quantifiers.
If we leave aside both methods and the matter of overriding defaults, it is generally agreed that frames as data structures are essentially just bundles of properties. As a representational language, they are otherwise mostly equivalent to first order predicate logic ([Hayes85b]), although their form emphasizes certain compelling patterns in knowledge representation that do not emerge from predicate logic based representations ([Brachman83]). Study of frame-based reasoning was not particularly helpful in finding principles or techniques useful for explaining frames. Existing frame representations have a semantics that can only be described as ad hoc, if indeed it can be described at all. The most concrete proposal for endowing frames with a formal semantics lies in enhancing their IS-A links, by making explicit their logical characteristics. The notion of enhancement of links was considered, but discarded for this thesis. It is more complicated than the solution proposed in chapter 4. As concrete evidence, this thesis' proposal could be implemented in existing shells because of the existence of objects, rules, methods, and demons as first class objects. The IS-A links, on the other hand, are not first class objects in these shells. A second reason that the study of semantics was not helpful for explaining frames is that methods are not addressed at all. However, despite frames' lack of formal semantics, we need not despair that they are unexplainable. Remember that production rules do not have an established semantics either, but have been successfully explained nonetheless.

3.4 METALEVEL ARCHITECTURES

Section 2.3 introduced the notion of metarules, used to focus rule-based processing. As used in philosophy, linguistics, and knowledge representation, the prefix meta- is a kind of word schema: meta-x is interpreted as "x about x". What a meta-x is "about" is called the "object-level".
Thus a metarule is a rule about object-level rules. Metalanguage is language about some object-level language. A metalevel architecture includes some components which are about other components (those at the object-level). And, unfortunately but unavoidably, a metaobject is about some object-level object. Explanation in expert systems is an example of metareasoning, that is, reasoning about the object-level reasoning the expert system is doing (or has done) in solving its problem. Therefore the metareasoning literature is reviewed as a possible source of principles or design suggestions for explaining frames. We consider first the range of applications of metareasoning, then discuss whether the literature provides useful suggestions. Davis proposed metarules as a way to control inference in the Mycin system ([Davis80]), and control remains the most strongly and widely advocated application of metareasoning. The aim is to improve performance by pruning a search tree. This usage relies on an ability to represent properties of object-level knowledge and the state of the inference process. Most approaches are rule-based, and hence rely, like Davis, on metarules ([Aiello88]). Extending the application of metareasoning beyond control, some researchers have used metareasoning as an abstraction mechanism to increase expressive power. One example occurs in automated deduction systems, where inference rules can be defined at the metalevel. The search space at the metalevel is smaller than at the object-level, since a single metaproof represents multiple object-level proofs. Other problems that have been attacked in this manner include: awareness of what a system knows, awareness of beliefs, non-monotonic reasoning, reasoning about changing situations, reasoning about different (and possibly inconsistent) evolving theories, and reasoning about multiple views of objects ([Aiello88]).
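The control use of metareasoning described above, a metarule examining properties of object-level rules and the inference state in order to focus search, can be sketched as follows. The rules and state here are hypothetical:

```python
# Sketch of metarule-based control: the metarule does not solve the
# domain problem itself; it reorders (or prunes) the object-level
# rules to be tried, based on the current state of the consultation.

object_rules = [
    {"id": "RULE-A", "topic": "therapy",  "conclusion": "give-penicillin"},
    {"id": "RULE-B", "topic": "identity", "conclusion": "organism-is-strep"},
    {"id": "RULE-C", "topic": "therapy",  "conclusion": "give-ampicillin"},
]

def metarule_focus(rules, state):
    # Metarule: "If the organism's identity is still unknown, consider
    # identity rules before therapy rules."  (sorted is stable, so
    # ties keep their original order.)
    if state.get("identity") is None:
        return sorted(rules, key=lambda r: r["topic"] != "identity")
    return rules

ordered = metarule_focus(object_rules, {"identity": None})
```

Note that the metarule needs the object-level rules to be inspectable data (here, dictionaries with a topic property), which is exactly the representational demand the text attributes to this usage.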
Metalevel architectures also appear in some object-oriented languages. Cointe contrasts a number of such languages, which vary greatly in the access to the metalevel permitted by the language to users. For example, in Smalltalk-80 the metalevel is not at all accessible to the user of the language, whereas the ObjVlisp language permits access equally to instances, classes, and metaclasses ([Cointe88]). Finally, Aiello and Levi survey some applications of metareasoning to interfaces. Some examples are mechanized interfaces, e.g. between separate bodies of knowledge, or between knowledge bases and databases. Other examples are between knowledge bases and users, in applications such as knowledge acquisition, user modeling, and (finally!) explanation. Three principles or suggestions resulted from this review of metareasoning that have importance for the question of explaining frames. The key observation is what data structures support explanation facilities. Two other principles concern the choice of language(s) at the meta- and object-levels, and the advantages of control knowledge at the metalevel. Sterling and Shapiro devote an entire chapter in [Sterling86] to metainterpreters, which they define as "an interpreter for a language written in the language itself". The chapter provides several examples of Prolog metainterpreters which can answer HOW? and WHY? queries. The technique of metainterpretation they present consists of metaprograms in which predicates take object-level programs as one of their arguments. Other arguments that are carried along in the metaprogram's predicates provide insight into the data structures needed to support HOW? and WHY? explanations. Those additional arguments are the equivalent of Mycin's goal tree and the currently active rule. Metalevel architectures vary as to whether they are monolingual or bilingual.
Jackson et al argue for the latter approach: a purely declarative object-level "with no concern for efficiency" and a metalevel for control with efficiency "its most prominent aspect" ([Jackson89]). This debate is not resolved, however; Hayes argues that factual and control information should be represented in the same scheme, so that control can be involved in inference ([Hayes85a]). Whether mono- or bilingual, Jackson et al cite numerous advantages to separating control knowledge from domain knowledge. One of these is the possibility of deeper explanations. What Jackson et al refer to are the kinds of explanations with deep knowledge sources that are taken up in the final chapter, not our concern with better explanations using existing tools and knowledge sources. Thus, this review of metareasoning has proven only somewhat more relevant than the review of frame logic to the problem addressed in this thesis. It reveals that there are many alternative uses of metareasoning and some variety in metalevel architectures. While the review found no clear principles for metalevel architectures in general, at least metainterpretation techniques indicate the flexibility of Mycin's data structures in producing explanation facilities for a variety of rule formalisms. However, Mycin-style production rules and metainterpretation of logic programs each deal with a single, uniform representation scheme. The metareasoning review has provided little guidance for explaining hybrid systems' wide variety of mechanisms for assigning slots' values.

3.5 REFLECTIVE ARCHITECTURES

Smith ascribes much of the subtlety and flexibility that humans bring to bear on the world to our ability to reflect: that is, our ability not only to think about the external world, but also to think about our own internal ideas, actions, feelings, and past experiences.
Interest in the self-referential aspect of reflective thought has sparked interest in psychological and knowledge representation circles, both of which apply the prefix meta- to the resultant theories and systems ([Smith85]). A computational system is said to have a reflective architecture if it incorporates structures representing aspects of its own structure and behavior. The object-level system solves problems in the (external) problem domain. The self-representation at the reflective level makes it possible for the system to answer questions about and to support actions on its object-level system. Though metareasoning and reflective reasoning are similar, Maes draws a distinction between them. Metareasoning may be implemented in a different language than the object-level system and may have only static access to the object-level system. In Maes' view, a reflective metalevel must be implemented in the same language as the object-level system and must have dynamic access to the object-level system ([Maes88]). Much research into reflective architectures has dealt with constructing general purpose reflective programming languages or theorem provers. For example, Maes describes the virtual infinite tower of circular interpreters implemented in most reflective languages, required to permit concepts like metametaobjects ([Maes88]). And Aiello and Levi describe the deductive apparatus that is necessary at both the object-level and metalevel for exportation of results from one level to the other ([Aiello88]). These architectural concerns have little bearing on the more limited problem pursued in this thesis. However, Maes presents properties of a reflective architecture for an object-oriented language, some of which do provide useful principles for the problem in this thesis ([Maes87]). We consider three basic architectural questions that must be resolved to explain frames, and whether Maes' answer is appropriate.
The most basic question that must be resolved is where to store metainformation for explanations about the expert system's objects that model the real world. Maes argues for maintaining a disciplined split between the object level and the metalevel. Her solution is to associate a metaobject with every object-level object. The proposal in the next chapter adopts Maes' solution, rather than alternative solutions such as storing that metainformation inside the same object, in metaslots or metafacets. Slot-changing agents (rules, methods, etc.) also require metainformation for explanations; the second question is where it should be stored. Maes suggests that a self-representation should be uniform: if all entities (instance objects, class objects, slot-changing agents) are objects, they can all be reflected upon. Since we wish to explain all objects, the proposal adopts this suggestion as well. Given that each object-level object will have an associated metaobject, it is necessary to decide where to draw the line between objects and metaobjects. Maes argues for a self-representation being complete, i.e. metaobjects should contain all the information about objects. This seems to leave object-level objects with just a Value facet, contrary to the case in most expert system shells. Since the metaobjects proposed here are for explanation only, not general-purpose metareasoning, this suggestion seems less important. The convention adopted is that all metainformation for explanations is stored in metaobjects; facets not used in explanations are stored in object-level objects. Maes has one other important observation to contribute: if a self-representation is causally connected to the aspects of the system it represents, then the self-representation is always accurate. The weakness of canned text is an example of a self-representation that is not causally connected to its system, which explains the manual maintenance problem it presents.
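The split Maes recommends can be sketched in modern object terms. The class names, facet layout, and recording scheme below are illustrative assumptions, not the thesis' specification: each object-level object carries only the facets it needs at run time, while an associated metaobject holds the explanation metainformation, kept accurate by recording the source of each slot value at the moment the slot changes.

```python
class MetaObject:
    """Holds explanation metainformation about one object-level object."""
    def __init__(self, described):
        self.described = described   # link back to the object it explains
        self.slot_sources = {}       # slot name -> agent that set the slot
        self.descriptions = {}       # slot name -> WHAT-IS-IT? text


class ObjectLevelObject:
    """Carries only the facets needed for object-level behavior."""
    def __init__(self, name):
        self.name = name
        self.slots = {}                   # slot name -> Value facet
        self.meta = MetaObject(self)      # every object gets a metaobject

    def set_slot(self, slot, value, agent):
        self.slots[slot] = value
        # Causal connection: recording the source as the slot changes keeps
        # the self-representation accurate, unlike manually maintained
        # canned text.
        self.meta.slot_sources[slot] = agent


# Hypothetical example consultation data:
patient = ObjectLevelObject("Patient-1")
patient.set_slot("diagnosis", "bacteremia", agent="RULE-047")
```

The design choice mirrors the convention adopted above: the object level stays lean, and everything an explanation needs lives in the metaobject.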
The thesis proposal attempts to have as much as possible of the information stored in explanation metaobjects be causally connected to the object-level objects. Some of the principles suggested by Maes for reflective systems are helpful in deciding on an architecture for explaining hybrid expert systems. Summarizing the two key principles this proposal adopts: all object-level entities are first-class objects and have an associated metaobject, and the metalevel contains all the information needed to explain object-level objects.

3.6 THE PROBLEM RESTATED

Hybrid shells permit rule- and frame-based processing to be as tightly integrated as a given object-level problem demands. The thesis' problem is to define a hybrid explanation facility that is equally integrated, affording frames the same level of explanatory power Mycin made standard for rules. This means that HOW? and WHY? queries must be answerable for all ways that frames' slots are computed, whether by initialization behavior, by the frame inheritance mechanism, or by any slot-changing agent. Ideally the problem's solution should also accommodate other kinds of explanations, and indeed a mechanism supporting WHAT-IS-IT? explanations is taken on as part of the problem in the thesis. The problem is constrained by relying, like Wick and Slagle, on existing technology and knowledge sources. The problem's solution is hindered by the fact that exploration into related topics uncovered little direct research into frame explanations. Frames' semantics is ad hoc or nonexistent, and the best remedy requires complicating links, i.e. redesigning frames from the ground up. Even that radical solution merely addresses the logic of links, ignoring the logic of methods. The solution must work around this gap if methods are to be explained.
Another hindrance: much research into metalevel architectures has been directed toward applications other than explanation, such as control. Furthermore, while that research has enough breadth to indicate alternative architectures, it has not yet produced enough results to make choices between the alternatives apparent. The problem can be viewed as reducing to a set of four related subproblems. The brief surveys presented in the preceding three sections have provided some first steps toward solutions of three of these four subproblems.

Subproblem #1: What general approach to providing explanations should be used? Since it is Mycin's functionality that is being extended, an obvious approach to explore is extending Mycin's mechanisms to explain other schemes besides production rules. This approach has been successfully applied, using metainterpretation, to explain logic programs. The approach requires explicit construction of and access to a goal tree, and constant awareness of what slot-changing agent is acting at any point in time.

Subproblem #2: How are slot-changing agents other than rules to be related to the goal tree? This is the subproblem most neglected by other researchers, and hence is the thesis' most original (and perhaps most controversial) contribution.

Subproblem #3: How is the Mycin approach to be extended to the entire hybrid architecture? The approach for slot-changing agents is straightforward. Assume the Mycin approach can be recast into object terms, and then can be generalized so as to be applicable to all slot-changing agents. Then, supplying the generalized Mycin capability to generic, first-class classes for all slot-changing agents (i.e. generic classes for rules, methods, demons, user input, and external access interfaces) allows all instances of rules, methods, etc., to inherit the generalized explanation capability. Initialization-time values and inherited values must be explainable also.
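The inheritance step in Subproblem #3 can be sketched as a small class hierarchy. All names and the explain_how method are assumptions made for illustration: a generic SlotChangingAgent class supplies the generalized Mycin-style HOW? behavior once, and the generic classes for rules, methods, demons, and user input inherit it rather than each implementing explanation separately.

```python
class SlotChangingAgent:
    """Generic, first-class agent class; every instance of every subclass
    inherits the generalized explanation capability defined here."""
    def __init__(self, name):
        self.name = name

    def kind(self):
        # The subclass name doubles as the agent kind in explanations.
        return type(self).__name__

    def explain_how(self, slot, value):
        # Generalized HOW? answer, shared by all agent kinds.
        return f"{slot} = {value!r} was determined by {self.kind()} {self.name}"


# Generic classes for each kind of slot-changing agent:
class Rule(SlotChangingAgent): pass
class Method(SlotChangingAgent): pass
class Demon(SlotChangingAgent): pass
class UserInput(SlotChangingAgent): pass


rule = Rule("RULE-047")
demon = Demon("ALERT-DEMON-2")
# Both answer HOW? through the one inherited mechanism; no per-kind
# explanation code is needed.
```

A demon instance thus explains itself exactly the way a rule does, which is the point of pushing the capability into the generic superclass.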
Subproblem #4: Where should metainformation for explanation behavior be stored, as opposed to information for object-level behavior? As already indicated, Maes' recommendation of a disciplined split between metaobjects and object-level objects is adopted in this thesis.

This chapter has explicated the problem of explanation in hybrid expert systems. Moving from a discussion of hybrid architectures to a survey of related research, the chapter has analyzed the problem in sufficient detail to point the way toward the solution presented in the next chapter.

4 A SOLUTION: EXPLAINING HYBRID EXPERT SYSTEMS USING A GOAL TREE OBJECT AND METAOBJECTS

This chapter proposes a solution to the problem of explaining expert systems constructed in a hybrid architecture. The solution extends the Mycin notion of a goal tree for explanations of rules so that it provides equivalent explanations for all ways of determining slots' values. The metaknowledge about objects used to generate explanations is stored in two places: in an Interface Manager object and in metaobjects. The four sections that define the proposal describe, first, structure, and then behavior at three points in time: compile time, execution time, and explanation time. (Of course the last two are often intertwined in actual consultations.) The viewpoint of the proposed solution is epistemological, in the sense defined in [McCarthy85]:

The epistemological part of AI studies what kinds of facts about the world are available to an observer with given opportunities to observe, how these facts can be represented in the memory of a computer, and what rules permit legitimate conclusions to be drawn from these facts. It leaves aside the heuristic problems of how to search spaces of possibilities and how to match patterns.
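The goal tree construct being extended can be sketched as a simple recursive structure. The node layout and method names below are illustrative assumptions: each node records a goal (a slot to be determined), the slot-changing agent that determined it, and the subgoals that agent required, so that HOW? is answered by walking down the tree and WHY? by walking up it.

```python
class GoalNode:
    """One goal in a Mycin-style goal tree: a slot whose value is sought."""
    def __init__(self, slot, agent=None, parent=None):
        self.slot = slot
        self.agent = agent        # slot-changing agent that determined it
        self.parent = parent      # link upward, used for WHY? answers
        self.children = []        # subgoals, used for HOW? answers
        if parent is not None:
            parent.children.append(self)

    def how(self):
        # HOW?: descend one level to the subgoals the agent relied on.
        subs = ", ".join(c.slot for c in self.children) or "no subgoals"
        return f"{self.slot} was determined by {self.agent} using: {subs}"

    def why(self):
        # WHY?: ascend to the goal that made this slot worth pursuing.
        if self.parent is None:
            return f"{self.slot} is the consultation's top-level goal"
        return (f"{self.slot} is needed by {self.parent.agent} "
                f"to determine {self.parent.slot}")


# Hypothetical fragment of a consultation's goal tree:
root = GoalNode("therapy", agent="RULE-100")
sub = GoalNode("organism", agent="inheritance", parent=root)
```

Because the agent field can name a rule, a method, a demon, or the inheritance mechanism alike, the same two traversals serve every way a slot's value is determined, which is the generalization the solution aims at.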
4.1 OVERVIEW: EXPLAINING HYBRID EXPERT SYSTEMS

The solution proposed here for providing a hybrid expert system shell with explanation capability on a par with Mycin's rule explanations is straightforward, and can be described in two conceptual steps. First, Mycin's explanation mechanism is recast into a frame architecture, i.e. it is objectified. In a sense this step just describes a view of the rule-based explanation component of existing hybrid shells. Second, that mechanism is generalized to provide explanations for the other ways slots' values are determined in a hybrid system, namely inheritance, initialization values, and slot-changing agents. To illustrate the flexibility of the solution, a third conceptual step extends the generalization of Mycin's HOW? and WHY? explanation mechanism to enable a third type of explanation: WHAT-IS-IT? explanations tell the user what a slot is and its significance in the expert system.

[Figure: the solution's objects arranged by level (object level, both levels, metalevel), including the Inference Engine, Object Compiler, Interface Manager, and Goal Tree objects.]
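The WHAT-IS-IT? extension can be sketched as a lookup into the same explanation metainformation. Everything below, including the Interface Manager's methods, is a hypothetical illustration rather than the thesis' specification: per-slot descriptions are registered at object compile time, and the Interface Manager surfaces them on demand, independently of the goal tree.

```python
class SlotMeta:
    """Per-slot WHAT-IS-IT? metainformation, filled at compile time."""
    def __init__(self, description, significance):
        self.description = description
        self.significance = significance


class InterfaceManager:
    """Fields user queries; WHAT-IS-IT? consults stored metainformation
    rather than canned text scattered through the object level."""
    def __init__(self):
        self.slot_meta = {}   # (object name, slot name) -> SlotMeta

    def register(self, obj, slot, description, significance):
        # Called once per slot when the object is compiled.
        self.slot_meta[(obj, slot)] = SlotMeta(description, significance)

    def what_is_it(self, obj, slot):
        m = self.slot_meta.get((obj, slot))
        if m is None:
            return f"No WHAT-IS-IT? information recorded for {obj}.{slot}"
        return f"{obj}.{slot}: {m.description} Significance: {m.significance}"


im = InterfaceManager()
im.register("Patient-1", "diagnosis",
            "The infection identified for the patient.",
            "Drives selection of therapy.")
```

Because registration happens at compile time while queries arrive at explanation time, the sketch also illustrates the separation of the three points in time the proposal's sections are organized around.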