UBC Theses and Dissertations

Implementing a Normative Theory of Communication in a Framework for Default Reasoning — Csinger, Andrew (1990)

Full Text
Implementing a Normative Theory of Communication in a Framework for Default Reasoning

by Andrew Csinger

A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in The Faculty of Graduate Studies, Department of Computer Science.

We accept this thesis as conforming to the required standard.

University of British Columbia, May 1990

© Andrew Csinger, 1990

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Computer Science
The University of British Columbia
Vancouver, Canada

Abstract

This thesis presents a framework for inter-agent communication, represented and partially implemented with default reasoning. I focus on the limited goal of determining the meaning for a Hearer-agent of an utterance u by a Speaker-agent, in terms of the beliefs of the interlocutors. This meaning is generally more than just the explicit propositional contents of u, and more than just the Speaker's goal to convey her belief that u. One way of determining this meaning is to let the Hearer take stock of the implicit components of the Speaker's utterances. Among the implicit components of the meaning of u, I show in particular how to derive certain of its presuppositions with a set of default schemata using a framework for default reasoning.

More information can be extracted from the communications channel between interlocutors by adopting a normative model of inter-agent communication, and using this model to explain or 'make sense' of the Speaker's utterances. I construct such a model expressed in terms of a set of default principles of communication using the same framework for default reasoning.

The task of deriving the meaning of an utterance is similar to the job required of a user-interface, where the user is the Speaker-agent, and the interface itself is the Hearer-agent. The goal of a user-interface as Hearer is to make maximal use of the data moving along the communications channel between user and application.

The result is an integrated theory of normative, inter-agent communications expressed within an ontologically and logically minimal framework. This work demonstrates the development and application of a methodology for the use of default reasoning. The implementation of the theory is also presented, along with a discussion of its applicability to practical user-interfacing. A view emerges of user-modelling as a component of a user-interface.
Contents

Abstract
List of Tables
List of Figures
Acknowledgements

1 Introduction
  1.1 What this thesis is about
  1.2 A Theory of Communication
    1.2.1 The Implicit and the Tacit
    1.2.2 Representation and Implementation
  1.3 User-interfacing and User-modelling
  1.4 Priorities
  1.5 Organization of this Thesis

2 Background
  2.1 Presupposition
    2.1.1 History of Presupposition
    2.1.2 Presuppositional Environments
    2.1.3 Terminological Confusion
    2.1.4 Summary
  2.2 Theories of Communication
    2.2.1 Principles of Cooperation (Grice)
  2.3 User Modelling
    Dimensions of categorization
  2.4 Belief and Rationality
    2.4.1 Beliefs
      Beliefs versus Knowledge
      The Epistemic Status of Belief
    2.4.2 Rationality
    2.4.3 Previous Work in Belief Modelling
      Allen
      Cohen et al.
      Perrault
      Konolige
  2.5 Non-monotonic Systems
    2.5.1 Theorist
    2.5.2 Theory Preference

3 Design Issues
  3.1 Causality and Point-of-view
    3.1.1 The General Model
  3.2 Default-Programming Methodology
    Status of Explananda
    Status of Assumptions
  3.3 The Communications Domain
  3.4 Domain Formulation
    Speaker-Hearer Duality
    The Shared-Information Constraint
  3.5 Alternative Implementation Strategies
    Case I
    Case II

4 Implementation
  4.1 Implementation Language
  4.2 Principles
  4.3 Presupposition
    4.3.1 Criterial and Non-criterial Properties
    4.3.2 Factive Verbs
  4.4 Implicatures
  4.5 Rationality
  4.6 Other Aspects
  4.7 Cancellation and Multiple Extensions

5 Conclusion
  5.1 Contribution
  5.2 Problems
    5.2.1 Multiple Extensions
    5.2.2 Goals, Plans and Desires
  5.3 Further work

Bibliography

A Theorist Listings
  A.1 Maxims
  A.2 Presupposition
  A.3 Implicature
  A.4 Rationality
  A.5 Miscellaneous
    A.5.1 World Information
    A.5.2 Lexical Information

List of Tables

2.1 Negation in Classical and Trivalent Logics
2.2 Summary of Previous Work in Presupposition
2.3 Summary of Previous Work in Belief Modelling
3.1 Domain-Formulation
3.2 Communication Domain Formulation
3.3 Four Possible Implementations of the Domain
3.4 Speaker-Hearer Duality

List of Figures

1.1 From Utterance to Belief
1.2 The User-Modeller as Part of the Interface
2.1 Grice's Maxims of Conversation
3.1 From Utterance to Belief via Communication
3.2 Causality Model for Interlocutor Pair
3.3 Theorist Architecture for Abduction and Prediction
4.1 Principles of Communications
4.2 Implicature Generators
4.3 Presupposition Schemas/Triggers
4.4 Rationality Constraint Schema
4.5 Principles and Grice's Maxims
4.6 Some of the Principles of Communications
4.7 Default Representation of Maxim of Sincerity
4.8 Default Representation of Maxim of Quantity
4.9 Default Representation of Maxim of Sarcasm
4.10 Non-criterial default schema
4.11 Factive Verb Presupposition Schema
4.12 Implicature-generating schema
4.13 Rationality Constraints

Acknowledgements

My wife. My poor, long-suffering Susan. Enough said.

My supervisor. David Poole has been very generous with his time and energy, and bought me that pitcher of beer when I really needed it.

My advisor. Richard Rosenberg has given me only good advice, and some of his enthusiasm has rubbed off on me.

Michael Horsch. Mike has been instrumental in getting parts of this thesis to make sense, and parts of the implementation to work. He was also there to help me drink that pitcher.
Emanuel Noik. Manny's mission, it seems to me, is to keep me motivated. I don't know from where he gets his boundless energy or vocation. I can only thank him and hope my good fortune continues.

UBC. The university is near some of the most breathtaking scenery in the world. These natural monuments are always around to remind me who I am whenever I start taking myself too seriously.

Chapter 1

Introduction

Common Sense: Those superstitions we learned before the age of eighteen. —Einstein.

1.1 What this thesis is about

At its highest level, this thesis is about user-interfacing. My conception of a user-interface is of a support structure for communications between an intelligent agent and an applications program. The user-interface bridges the gap between user and application, forming a channel along which communications can take place. The information-carrying capacity of this channel can be qualitatively described in terms of its bandwidth. The goal of user-interfacing is to broaden the bandwidth of the communications channel between user and application. There are potentially many ways to accomplish this broadening. Some that have been suggested are programmable command-decoders, graphical input-output devices, natural-language interfaces, multi-media output, and multi-sensory input-output.

I restrict myself in this thesis first of all to interfaces which can be implemented over a conventional serial (teletype-like) channel, and focus further on a natural language style environment. I accomplish the broadening effect by exploiting tacit and implicit components of user utterances, using a theory of communications. I choose to express the additional information gleaned from the utterance in terms of the beliefs of the user-agent. To this end, I build a model of the user based on the utterances she makes.

Figure 1.1 is a schema of the domain this thesis is concerned with; this schema is refined in later chapters to show the various sub-components.

Figure 1.1: From Utterance to Belief

A view emerges that a user-modeller can be considered to be a sub-component of the user-interface, and that user-modelling is one of the tasks that a user-interface might be called upon to do in fulfilling its goal of broadening the bandwidth of the communications channel. At its lower levels, this thesis is about presupposition, about theories of communication, and about implementing these in a default reasoning framework.

1.2 A Theory of Communication

A recurrent theme throughout this thesis is that the communicative content of what is uttered is not restricted to its propositional contents; in addition to what is directly asserted by an utterance, there is a set of propositions which are indirectly implied, and the set of those which are antecedently assumed. Loosely, the first set has been referred to as the implicatures of an utterance, while the latter includes what are known as felicity conditions. I show how to derive a subset of both the implicit and tacit contents of utterances, in terms of the beliefs of the interlocutors involved. Previous work has invariably employed some form of Cooperative Principle, according to which the utterances in a discourse are presumed to adhere to a set of guidelines, itself tacitly represented by the participants in the discourse. I too make use of such principles, but with the desire to capture the realistic departures that are routinely made in the attempt to mislead, to be sarcastic, and so on.
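To preview the mechanism in the simplest possible terms: the step from utterance to belief in Figure 1.1 is a defeasible one. The fragment below is a minimal Prolog sketch of that step under my own simplifying assumptions (the predicates utters/2, believes/2 and flouted/1 are illustrative, and this is not the thesis's actual Theorist code); negation as failure stands in for default assumption, and a recorded flouting (a lie, sarcasm) defeats the ascription.

```prolog
:- dynamic flouted/1.

% Situational information: what the Speaker has uttered.
utters(user, printer_is_broken).

% Defeasible ascription: the Hearer takes the Speaker to believe
% what she utters, unless that reading has been explicitly defeated.
believes(Speaker, P) :-
    utters(Speaker, P),
    \+ flouted(utters(Speaker, P)).

% ?- believes(user, printer_is_broken).
% true.
% After assert(flouted(utters(user, printer_is_broken))),
% the same query fails: the default reading is withdrawn.
```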
1.2.1 The Implicit and the Tacit

In general, implicatures of an utterance are those propositions which are implied but not directly stated by the utterance. Recent usage, however, has followed the work of Grice [Gri75], who identified certain types of inference which he then named implicatures; he further distinguished these into categories with distinct properties. Conventional implicatures are those which arise solely from features of the words employed in an utterance, and this thesis is concerned with only this kind of implicature. Henceforth, I use the term implicature in this technical sense, and show how some conventional implicatures can be derived from context-situated utterances in the framework of the principles of communication I define.

Tacit phenomena are fundamental to communication. Often expressed in terms of mutual beliefs, tacit information is generally held to be known by all members of the group under observation, and further to be known to be known to all these members. In particular, participants in a dialog are usually held to believe that the principles of cooperative communication alluded to above are in effect. In general, elements of world knowledge are also considered to be available to the members of a group, so that this information may go unsaid in conversations among members of a group. This type of tacit information has been referred to as presupposed by an utterance, or by the speaker making an utterance. I employ the term presupposition in its more technical sense, that of the sentential presuppositions of an utterance (see section 4.3). This is a class of pragmatic inference distinguished mainly by its defeasibility in the context of contradictory information, and by its characteristic behavior under negation. I show how sentential presuppositions of varying lexical environments can be derived from context-situated utterances, the cooperative principle, and the implicatures of the utterance.

1.2.2 Representation and Implementation

Tacit phenomena and pragmatic inference are often characterized in terms of their conjectural nature. Defeasibility has long been a distinguishing feature of natural language presupposition, and the maxims of cooperative communication are self-evidently fragile. In Chapter 2, I follow the historical thread of the defeasibility of pragmatic inference from first attempts at formalizing presupposition, to recent work using default reasoning. I see this work as continuing this trend, and the model I present in Chapter 3 is itself completely implemented in a default reasoning framework; I acquire and represent beliefs of agents from their utterances using the Theorist [Poo87] framework for common-sense reasoning.

1.3 User-interfacing and User-modelling

Much of my early work was aimed at improving user-interfaces for Computer-Aided Design (CAD) systems, where the efficacy of the interface can be measured qualitatively in terms of the maximum rate of information exchange between user and application [CdCF87], [CCBD87]; others have recognized this metric and have described it in different terms. In the domain of information retrieval, these words have been written:
... improvements in an information storage and retrieval system focus on the idea of improving the cost-effectiveness of the system, in terms of the quality of the information retrieved in relation to the time, effort, and expense of storing and retrieving it. [Kor85]

I have named this qualitative measure of the rate of information exchange between user and application the bandwidth of the communications channel, and sought in the past to increase this bandwidth with a variety of ad-hoc techniques suited to the CAD environment.

This thesis pursues and generalizes the idea of expanding the bandwidth of the communications channel between user and application, through the interpositioning of a User-Modeller (UM). The role of the user-interface expands to include the functions usually attributed to a UM. In this thesis I describe how principles of communication along with a related theory of presupposition (both implemented in a framework for default reasoning) constitute a UM capable of increasing the bandwidth of the user-application communications channel, in a principled manner.

Figure 1.2 is a variation on a conventional view of the user-interface [Pfa85]; I have added the UM unit, to be viewed either as a sub-component of, or as communicating with, the user-interface. This schema maps cleanly into the communications domain with the recognition of the user's role as Speaker-agent, and the role of the user-interface as Hearer-agent.

Figure 1.2: The User-Modeller as Part of the Interface

1.4 Priorities

Throughout this thesis, I argue for logical and ontological minimality. I see this work as part of the movement of "minimal AI", which seeks to accomplish its goals with the least posturing about psychological relevance, or cognitive validity. Certain linguists and psychologists have characterized my position as one of timidity, and have urged me to take a stand on the psychological relevance of the computational architecture set forth in this thesis. I believe they do this from a misunderstanding of the goals of AI in general, and the aims of this work in particular. There is plenty of room for differences in opinion on the former, so I will deal only with the latter objection. I am not engaged in empirical cognitive science here, but in minimal, empirical artificial intelligence. The approach is minimal, because I try to adopt only those elements of a logical calculus that are necessary to accomplish the representational requirements of this study. In particular, I represent (and derive) the following:

• presuppositions of natural language utterances
• principles of natural language communications
• principles for deriving beliefs from other beliefs

I do not argue anywhere that other approaches or representational schemes cannot accomplish the same objectives; I only demonstrate that these objectives can be accomplished in a principled manner within the particular framework for default reasoning that is minimal with respect to its underlying logical calculus. So it may well be that particular connectionist networks and a host of ad-hoc logics can implement systems with equivalent characteristics, but I show that these are not necessary to achieve the results of this system. I leave it to the psychologists, however, to decide the cognitive relevance of the computational units I describe.
1.5 Organization of this Thesis

In chapter 2, I survey previous work in the areas of presupposition, theories of communication and user-modelling. I explore some of the work done by philosophers and psychologists on belief and rationality, and I introduce the default reasoning formalism which I use to implement my own theory. Chapter 3 is a consideration of the issues I faced in deciding the eventual path that the implementation would take. Previously unexplored methodological issues are investigated, and some alternative implementation strategies are pursued. The implementation itself is detailed in chapter 4. Some of this work appeared elsewhere ([CP89]). I conclude in chapter 5 with what I consider the contribution of this thesis, along with a consideration of the problems that remain to be solved, and some suggestions that might lead to their resolution.

Chapter 2

Background

To spend too much time in studies is sloth. —Francis Bacon

In this chapter I trace the lineage of previous work that leads to my research in the pragmatics of communication. There are many dimensions along which a survey of this kind might be made. I pursue the growing recognition in the literature that certain classes of pragmatic inference are defeasible, with particular attention to the study of presuppositional phenomena. Early work attempted to stay within the bounds established by classical logic, but these 'semantic' theories appear to be giving way to 'pragmatic' varieties which take into account more than the behavior of truth-functional connectives in natural language.

Previous work in the formulation of 'cooperative' principles underlying communication is addressed as well. As in the discussion of presupposition, there is a unifying thread of defeasibility running through the literature. This thread has only recently been perceived as indicative of a default nature, and I amplify on this point. I discuss the relation of a model of communication to user modelling, and I present some salient issues in previous work. I introduce the terminology with which my own work will be described.

2.1 Presupposition

There are a variety of reasons for studying presuppositional phenomena in natural language, not the least of which is their ubiquity. As alluded to in the motivational preface to this thesis, masterful use of human language involves subtleties which are not captured by even the most detailed analyses of the propositional contents of a discourse.

Linguistic presupposition has been recognized from the start as something peculiarly extra-propositional, a blemish on the uniform face of classical logic. Certainly in the eyes of logicians of the day, the phenomenon had to be accounted for. Even within the domain of strictly linguistic analysis, terminological confusion runs rampant; I will make my attempt to clear this up in section 2.1.3. In the meantime I will employ presupposition in its pseudo-technical senses, with all the imprecision and vagueness that previous authors have enjoyed.

Much of the previous literature has been created out of a concern over the 'projection problem' associated with presupposition. This is the study of how the presuppositions of the constituent clauses of a compound sentence 'project' over the sentence.
Various perspectives will be considered, and I later argue, following Burton-Roberts and others, that the concern over projection has been due to previous definitions of the presupposition relation, rather than to the existence of a problem with projection as such. Another issue is the behavior under negation of the presupposition relation, and once again, I will consider various attempts to define presupposition in view of this behavior.

2.1.1 History of Presupposition

Despite the movement toward acceptance of the predicate calculus as the language of linguistic representation, it became clear very early in the process that it would place too severe restrictions on expression, and that certain relations manifest in natural language could not be captured with it. Previous study of presuppositional phenomena has typically resorted to various non-standard logics to avoid certain difficulties.

Negation in Natural Language

One problem which continues to plague a standard-logic analysis is the following.

Example 2-1:
Sentence 2-1: The king of France is bald.
Sentence 2-2: The king of France is not bald.
Sentence 2-3: There exists a king of France.

Example 2-1 is Strawson's [Str50], although this is in fact a very old story [Fre92]. Both sentences (2-1) and (2-2) are commonly held to presuppose (2-3). The problem arises when (2-3) is false; this is a case of presupposition failure. If (2-1) is regarded as false because of the non-existence of the referent, then if the natural language negation is interpreted in the wide-scope sense, (2-2) can only be given the value of true by recourse to the law of the excluded middle. One way out that has been taken is to adopt a tri-valent logic which assigns to (2-1) and (2-2) the third value in the case of presupposition failure [Fre92, Str50]. (See Table 2.1.)

  Classical Negation        Trivalent Negation
  φ    ¬φ                   φ    ¬φ
  T    F                    T    F
  F    T                    F    T
                            #    #

Table 2.1: Negation in Classical and Trivalent Logics

Although this and similar approaches avoid the above-mentioned contradiction, they suffer from an inflexibility of application: there are instances where presupposition failure does not deny a truth value to the sentence.

Sentence 2-4: The King of France is (not) a woman.

Sentence 2-4 is intuitively false (true?) in spite of the failure of the presupposition that there is a King of France.

Russell's approach was to represent sentences with presupposed referents along the following lines. Sentence 2-1 would be given the logical form of equation (2.1):

$\exists x\,(king(x) \land \neg\exists y\,(y \neq x \land king(y)) \land bald(x))$   (2.1)

A natural language negation operator can then be interpreted in various ways. Equation (2.2) is the logical negation of equation (2.1):

$\neg(\exists x\,(king(x) \land \neg\exists y\,(y \neq x \land king(y)) \land bald(x)))$   (2.2)

The speaker could be negating the 'kingliness' of the referent, as in (2.3), or his baldness, as in (2.4), or even the existence of the referent:

$\exists x\,(\neg king(x) \land \neg\exists y\,(y \neq x \land king(y)) \land bald(x))$   (2.3)

$\exists x\,(king(x) \land \neg\exists y\,(y \neq x \land king(y)) \land \neg bald(x))$   (2.4)

To Russell, natural language negation is thus inherently ambiguous. His approach touches upon an issue we are interested in only orthogonally: that of the correct logical form of an utterance. We seek a theory of communication which is independent of this issue.

Strawson argued for the truth-valuelessness of utterances like 2-1 and 2-2 on the basis of 'pure intuitions' to this effect.
Most so-called definitions of 'semantic presupposition' have in fact centered on Strawson's notion, paraphrased by Definition 1. The relation Strawson calls necessitation is an implication that does not support modus tollens.¹

Definition 1 (Strawson) Sentence A semantically presupposes sentence B iff sentence A necessitates sentence B, and the denial of sentence A necessitates sentence B.

From this, Strawson argues, if sentence B is not true, then sentence A is neither true nor false. Thus, semantic presupposition is not classical entailment, because there is no support for contrapositives, and it requires a tri-valent logic. Although there is no pre-theoretic or theoretical obstacle to such an account of presupposition, sufficiency is not adequacy in itself, and the semantic approach must stand against the challenges of other theories. As Lycan [Lyc84, p81] puts it,

(IV) The notion of "truth-valuelessness" engendered by the notion of "semantic presupposition" is unmotivated, specious, and pernicious to the study of natural language. Neglecting cases of indexicality and cases of vagueness ... we should hew to the line of bivalence in the semantic analysis of English.

It remains for the traditional semantic account to render a mapping of natural language connectives to logical truth-functional connectives, thereby allowing for a purely compositional interpretation of 'projection.' A pragmatic view of presupposition failure is that the utterance is somehow 'infelicitous,' having violated some of the maxims of cooperative communication (see sections 2.2.1 and 2.2). Early pragmatic accounts center on proposed solutions to the so-called projection problem, characterized by context-sensitive rules designed to override the normal behavior of the purely compositional rules proposed by Langendoen and Savin (1971) and others.

¹This relation is known elsewhere as weak entailment. See definitions 5 and 6 of Burton-Roberts, in this thesis.

The Negation Test

I have noted the distinctive behavior of the presupposition relation under negation, evidenced by some very simple examples. The following discussion exposes what is called the negation test, a criterion of linguistic ancestry which any successful definition of the presupposition relation must accommodate. It has been argued that both sentence 2-1 and sentence 2-2 presuppose sentence 2-3. This is to say that certain negated lexical environments carry the same presuppositions as their affirmative counterparts. This phenomenon has been promoted as a necessary condition on a relation, for it to be considered presupposition per se.

Defeasibility

Example 2-2:
Sentence 2-5: The King of France is not bald, because there is no King of France.
Sentence 2-6: *The King of France is bald, because there is no King of France.²

Example 2-2 demonstrates the defeasibility of presupposition. The presupposition of sentence 2-5 is cancelled from within the sentence itself, without upsetting the intuitions of a native speaker. The second clause serves to focus the scope of the negation operator on the existence of the referent, rendering the statement unambiguous. (This is known as internal negation.) The same presupposition of sentence 2-6 cannot be successfully defeated; an infelicity results [Aus62]. Along with its behavior under the negation test, the defeasible nature of the presupposition relation is another feature that distinguishes it from other candidate pragmatic inference classes.

²Some have argued that sentences of the form The king of France rules over Normandy, but there is no king of France are felicitous in contexts where the first clause refers to the intension, and the second clause to the extension, of king of France (referentially opaque and transparent readings, respectively). If the reader's intuitions tend in this direction, I urge that he replace king of France in all its occurrences with present king of France. I am interested in the extensions of the referring terms.
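The cancellation pattern of Example 2-2 is naturally read as a default inference. The fragment below is a minimal Prolog sketch of that reading, under my own simplifying assumptions (the term representation and the predicates presupposes/2 and asserted_in_context/1 are illustrative, and only unary predications are handled): the existential presupposition of a negated predication is drawn by default, and is withdrawn when the context (here, the sentence's own because-clause) denies it.

```prolog
:- dynamic asserted_in_context/1.

% Sentence 2-5: the because-clause puts the denial of the
% referent's existence into the context.
asserted_in_context(not(bald(king_of_france))).
asserted_in_context(not(exists(king_of_france))).

% Default schema: a negated predication presupposes the existence
% of its subject, unless the context already denies that existence.
presupposes(not(Predication), exists(Subject)) :-
    Predication =.. [_Property, Subject],
    \+ asserted_in_context(not(exists(Subject))).

% ?- presupposes(not(bald(king_of_france)), P).
% false.    % cancelled, as in sentence 2-5
%
% Retracting the second context fact restores the default, and the
% query succeeds with P = exists(king_of_france), as in sentence 2-2.
```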
Projection

The projection into the matrix sentence of the presuppositions of its constituent clauses has been recognized as a problem for theories of presupposition [KP79], [Gaz79].

Example 2-3:
Sentence 2-7: (He stopped singing) and (the audience began to applaud).
Sentence 2-8: He had been singing.
Sentence 2-9: The audience had not been applauding.

Horton [HH88, p78] gives example 2-3 as representative of a class of sentence in which presuppositions of constituent clauses project over the sentence. The presuppositions of the first and second clauses are sentences 2-8 and 2-9, respectively, and both of these project, or become presuppositions of sentence 2-7 itself.

Example 2-4:
Sentence 2-10: My cousin is a bachelor or [my cousin is] a spinster.
Sentence 2-11: My cousin is male.
Sentence 2-12: My cousin is female.

Example 2-4 is one in which some of the presuppositions of the clauses do not project over the matrix sentence. Sentences 2-11 and 2-12 are these presuppositions; they are mutually contradictory, and thus do not project. This is an example of cancellation from within the sentence itself. This example is dealt with in more detail in later sections of this thesis.

Karttunen and Peters contributed the "Plugs, Holes and Filters" account of presupposition projection [KP79]. They divided linguistic environments into three categories, distinguished by their effect on the projection of presuppositions. Holes are those environments in which presuppositions always survive embedding, while plugs block projection. Filters are middle ground, where presuppositions sometimes fail, sometimes project, depending upon filtering conditions. There are numerous objections to the approach. First, it is considered unprincipled by some [Lev88], in that the theory grows in complexity when presented with more complex data. The method has been shown to make incorrect predictions [ibid]. And last is the conflation of the presupposition relation with other pragmatic inference classes. Karttunen and Peters deny the defeasibility of presupposition, thereby losing what I see as its most distinguishing feature. Instead of using defeasibility (via the negation test, perhaps) as a defining characteristic of the relation, they attempt to develop a theory which predicts only those presuppositions that will not subsequently be cancelled.

Gazdar first put presuppositional analysis on a firm pragmatic footing. He argued convincingly in favor of an approach based on consistency, rather than on truth values, in projection. He developed a notion of 'satisfiable incrementation' [Gaz79, p131] which, in retrospect, presaged the newer theories associated with default logic and common-sense reasoning. Gazdar recognized and emphasized the defeasibility of the presupposition relation. He proposed rules to generate presuppositions which were to be regarded in some sense as conjectural, and which could be defeated by contradictory presuppositions of clauses in complex sentences, or by inconsistency with context. He called these potential presuppositions, or pre-suppositions, emphasizing that they were no more than 'notional entities' that played a 'technical role' in his theory [Gaz79]. These pre-suppositions, defined by Definition 2, become actual presuppositions only if they survive the mechanics of the context incrementation method introduced later by Gazdar.
He proposed rules to generate presuppositions which were to be regarded in some sense as conjectural, and which could be defeated by contradictory presuppositions of clauses in complex sentences, or by inconsistency with context. He called these potential presuppositions, or presuppositions, empha-sizing that they were no more than 'notional entities' that played a 'technical role' in CHAPTER 2. BACKGROUND 13 his theory[Gaz79]. These pre-suppositions, defined by definition 2, become actual presuppositions only if they survive the mechanics of the context incrementation method introduced later by Gazdar. Gazdar postulates a function for each lexical environment that carries presup-positions, and suggests that the set F of these functions has a cardinality which is "some small finite number," and that Obviously one can go further and define f$, f&, etc. for all the other sources of pre-suppositions but, as far as I can see, this is a theoretically trivial task, and I do not propose to pursue it here. It may be theoretically trivial, but it certainly poses a number of difficult practical questions! In particular, the definition explicitly ignores the surface form of the sentences that are its domain, and it remains unclear what the cardinality of F might be. Though perhaps theoretically uninteresting if the set is in fact finite, the actual size will no doubt reflect upon the efficiency of any implementation. Gazdar proposes, for instance, to capture the presuppositions of a sentence with a factive verb, with the following function: M<j>) = ty:(iP = KX)A(<j> = X ~ v ~ that - X - Y)} where v is a factive or semifactive verb, <j> and x a r e sentences, and X and Y are any strings, possibly null. K is read as the speaker knows that. Example 2—5: Sentence 2—13: Oedipus regrets that Jocasta drinks Sentence 2-14: K (Jocasta drinks) Gazdar presents example 2-5, where sentence 2-13 presupposes sentence 2-14.3 Notwithstanding the above-mentioned implementation difficulties, Gazdar, on the assumption that all the sources of presupposition can be written as functions, defines fp, the pre-supposition function which yields all the potential presuppositions of a sentence:4 3 He admits in a footnote that "this is insufficient, since most factives also presuppose that the subject of the matrix sentence knows the complement to be true...". With this proviso and with the change from a knowledge (K) operator to a belief-predicated expression of the form employed later in this thesis, I am in basic agreement with this approach. 4The equations in the definitions employ Gazdar's original notation and terminology. CHAPTER 2. BACKGROUND 14 Definition 2 (Gazdar: pre-suppositions) for any sentence 4>, fp{<j>) = u / e F / (<£) Gazdar also accounts for other inference classes, notably various implicatures, but does little to specify the functions that would generate them. He also admits that his theory is liable to the charge of ad-hocness, as the order in which the rules are applied is not argued for. Briefly, pre-suppositions and im-plicatures become presuppositions and implica-tures only if they survive the mechanics of Gazdar's 'satisfiable incrementation' sys-tem. In particular, this mechanism prevents the passing through of pre-suppositions which 1) should not project from the clauses of complex sentences into the set of presuppositions of the matrix sentence, 2) are inconsistent with the existing context, and 3) are also implicated or entailed by the sentence. 
Example 2-6:
Sentence 2-15: If John sees me then he will tell Margaret.
Sentence 2-16: I don't know that John will see me.

In example 2-6, Gazdar gives sentence 2-16 as an example of a clausal quantity im-plicature of sentence 2-15. So in particular, the set of clausal quantity im-plicatures for simple disjunctions or conditionals is given by Definition 3, where P is read as for all the speaker knows it is possible that.

Definition 3 (Gazdar: Clausal Quantity Im-plicatures)

$f_c(\phi \frown \text{or} \frown \psi) = f_c(\text{if} \frown \phi \frown \text{then} \frown \psi) = \{P\phi,\ P\psi,\ P\neg\phi,\ P\neg\psi\}$

Mercer sets out to formalize certain presuppositional phenomena within the framework of a default logic [Mer87]. He recognizes the crucial importance of the defeasibility of the presupposition relation, and takes this as persuasive evidence for modelling it within a default logic. He identifies three distinct sources of presupposition defeat: contextual, conversational, and where propositions which are presupposed by a sentence are also entailed by it. These desiderata, along with the behavior of the relation under negation, lead to Mercer's proof-theoretic definition of presupposition:
I am interested in developing a theory of communication, to which be-liefs of the interlocutors are crucial; Horton also recognizes the importance of agent beliefs, and her theory of presupposition is also couched in terms of these beliefs. Burton—Roberts One might think or at least hope that after so long a history, some sort of consensus would have been reached on what presuppositional phenom-5Mercer notes about this definition that: " . . . the only defaults in A u are the presupposition generating defaults. In reality the default theory would contain many other kinds of defaults. The definition would have to be changed so that the proof of a requires the invocation of a presupposition-generating default . . . As well, . . . all proofs must require the use of the statement representing the semantic representation of the uttered sentence." Similar considerations motivate aspects of the implementation presented in chapter 3 of this thesis. CHAPTER 2. BACKGROUND 16 ena are, and upon how they behave. Remarkably, not even the semantic-pragmatic division has been surmounted, as evidenced by the recent publication by Burton-Roberts [BR89], which claims nonetheless to prove once and for all that presuppo-sition is semantic in nature. Burton-Roberts' approach is well motivated. I concur with him on both his dissatisfaction with semantico-logical definitions of presuppo-sition, and with his observation that "projection problems are thrown up by defini-tions. Without a definition, there can be no problem." He then frames the project in terms of what he sees as three misguided assumptions pervasively manifested in previous work. First among these is the assumption that "to adopt the semantic hypothesis is to adopt the Standard Logical Definition of Presupposition" (see be-low). Second, that the SLDP is "satisfactory as far as elementary formulae (simple sentences) are concerned," and third, that the remaining problem for a semantic theory is to "solve or otherwise mitigate the projection problem attendant on that definition." Burton-Roberts identifies the Standard Logical Definition of Presupposition, which he states as definition 5, in a form similar to Strawson's (definition 1 of this chapter). Definition 5 (Burton-Roberts: SLDP) A presupposes B if and only if both • A implies B • -\A implies B. He notes the classical difficulty with the sense of 'implication' as employed in the def-inition. It cannot be classical entailment, for the reasons already discussed. Burton-Roberts therefore calls it weak entailment, identifying a relation that does not sup-port modus tollens. Such a relation, writes Burton-Roberts, commits the logician to the result that "the failure of a presupposition inevitably results in the presupposing sentence having a third logical status." He points out that it is this definition (the SLDP) that has been criticized in discussions of the 'semantic' approach. He there-fore proposes the Revised Logical Definition of Presupposition, distinguished from the SLDP . . . in the most general terms by the fact that the Standard theory induces a trivalent logic whereas the Revised theory induces a gapped bivalent logic . . . such a theory is to be preferred at least on these grounds, for the speaker intuition we are attempting to reconstruct is that of the lack of truth value inherent in statement failure. Definition 6 (Burton-Roberts: RLDP) 52 is a presupposition of S\ if and only if • 52 is a weak entailment of S\ CHAPTER 2. BACKGROUND 17 • S? is not a strong entailment of Si. 
Burton-Roberts explains: . . . the set of truth conditions of a sentence are its weak entailments (support-ing modus ponens). A subset of these weak entailments are strong entailments (supporting modus tollens). Presuppositions are weak entailments that form a subset complementing the subset consisting of the strong entailments... those weak entailments that are not strong entailments are presuppositions. He goes on to formulate his general theory of presupposition, defining along the way a relation which he calls generalized presupposition, by referral to the Salient Presuppositional Intuition: that a proposition is presupposed when it is implied by both a sentence and its negation. (This is just a reformulation of the criterion I have been calling the negation test). Definition 7 (Burton-Roberts: Generalized Presupposition) 52 is a generalized presupposition of S\ if and only if the non-truth of S2 renders S\ liable to lack of truth value. He further explains that "a sentence is liable to lack of truth value if and only if there is some possible state of affairs in which it actually lacks a truth value." Wondering about the nature of the relation from the negated sentence to its presuppositions, he suggests a name for it: intuitive implication. In sum, the motivation for his approach—like Strawson's—is the desire to ex-plicate a purported intuition about truth-valuelessness in sentences which admit of presupposition failure. His approach relies on a bivalent logic with gaps, of which he concludes he has "no means of conclusively demonstrating that the distinction between trivalence and gapped bivalence consists in what I say it consists in." He is able to handle a large range of examples that have been classically problematic for standard semantic theories of presupposition, but the approach is essentially Strawsonian. His technique appears to work for those examples which Mercer and others have identified as problematic for semantic theories, but as noted, Burton-Roberts does give up bivalence, and his motivations are different from mine. In shopping around for a definition of presupposition, motivations are relevant; he has very little to say about cancellation of presupposition by contradictory context. Some 'pragmatic', context-sensitive mechanism is required above and beyond even a successful semantic account of (truth-functional) projection. 2.1.2 Presuppositional Environments Burton-Roberts [BR89, 249] credits Rob van der Sandt (in conversation) with the observation that uevery theory is, in the final analysis, going to have to fist the CHAPTER 2. BACKGROUND 18 presupposition-inducing elements anyway." Gazdar has provided only a hint of how this might be accomplished via his pre-supposition generating function / p , repro-duced herein with definition 2. Mercer [Mer87, p34] lists a range of environments which carry presuppositions, and formulates some of these within a default logic. It is implicit in his work that although he has presented only some of these environ-ments, it is possible in principle to list them all. Karttunen, has listed thirty such environments. Horton [Hor87, p71] also lists a selection of presupposition-carrying environ-ments, which she calls triggers. In short, the consensus appears to be that there is a finite number of presupposi-tional environments, and that they can all -in principle- be enumerated. No one to my knowledge has made a claim as to the number involved. The theory presented in this thesis makes these assumptions as well. 
2.1.3 Terminological Confusion There has been so much emphasis over recent years on problems with various ac-counts of a putative presupposition relation, that the furor has served only to ob-scure the true source of the confusion. The latest work has admitted that what is subsumed under the name presupposition is far from agreed upon. Contemporary writers are variously adamant about terminological reform; for example, Lycan [Lyc84] writes: (I) The term "presupposition" is viciously misleading and is not scientifically well-behaved, in that the class of sentence pairs that have been subsumed under it is very far from constituting a natural kind. "Presupposition" is an ill-conceived umbrella word that is used to cover any number of importantly distinct and largely unrelated notions (from formal semantics, the theory of conversation, speech-act theory, the theory of speaker-meaning, the psychology of inference, and more). A single term devised to comprehend all these notions, or probably even two or more of them, would figure in no interesting (and true) linguistic generalizations. (II) The various implicative notions have, in fact, indeed epidemically, been conflated and licentiously interchanged in the literature, with the result that any number of theoretical issues have been stymied and several pseudoissues brought unhappily into being. (III) Though several of the various pragmatic notions collected under the only slightly more refined heading of "pragmatic presupposition" are individually clear, manageable, and theoretically important, the Strawsonian notion of "semantic presupposition," once clarified, is empty (or all but empty) and useless for the purpose of understanding the workings of natural language. [P81] CHAPTER 2. BACKGROUND 19 That point (I now decline to speak softly) is that people should stop using that word. R has caused nothing but trouble and error. Nor should any equivalent bastard term be introduced. Rather, we should simply adopt more specific and precise terms for each of the distinct but undisputedly real implicative relations that have formerly been forced into each other's company under the label of "presupposition," and use those terms exclusively. [p82] Lycan further notes Gazdar's sympathy with the above remarks, but also indicates that Gazdar has not complied with the exhortation. I have written above in terms of positive and negative forms of a sentence 'presup-posing' the same propositions, and am thereby guilty of a similar crime of conflation. There is clearly a distinction to be made between presuppositions of affirmative sen-tences, which are not defeasible, and those of negated sentences, which are defeasible. Mercer is likewise concerned with the conflation of different relations under the banner of an over-used term, and takes steps to clear the air right from the start in his PhD thesis [Mer87]. He is quite systematic about calling the affirmative relation 'entailment,' and the relation supported by the negated environment 'presupposi-tion.' In the past the label presupposition has been misused, capturing all sorts of phenomena including implicatures. Although this misuse provides a good reason to discard the label, I will continue to use it because it has become a standard term. [pl8] He cites Oh and Dineen [OD79] in his defense, and then reiterates his dissatisfaction with the usage: It is an unfortunate historical fact that the term presupposition does not have a single definition. 
The term has been used to describe everything from truth-value conditions to beliefs required of a speaker in order to make a successful utterance. The term has been used to capture a heterogeneous collection of phenomena including conventional and conversational implicatures, as well as the relation which would now be considered presupposition. [p26] Mercer's concerns are more than just terminological here, and he goes on to re-interpret the aforementioned negation test as the hallmark criterion of presuppo-sition: "Firstly, simple negated sentences must presuppose, and their unnegated simple sentence counterparts must entail the same inferences... Secondly, presuppo-sitions are implied, not said." cf. Lycan again [Lyc84, p93]: ... historically, the term 'presuppose' has been used in each of two different ways: one as contrasting with 'assert,' and the second as contrasting with 'entail.' The former usage is more natural, the latter technical. Mercer seems to have adopted the 'technical' reading. Finally, Stalnaker [Sta80] has something to say as well on the subject at hand: CHAPTER 2. BACKGROUND 20 As regards presupposition, the problem of separating description of the phe-nomena from theory is aggravated by the fact that the same word is used both as a descriptive term, identifying the relevant class of examples, and as a the-oretical term in semantics, denned in terms of logical relationships between sentences or propositions. 2.1.4 Summary Much of the work I have reviewed in this section seeks to provide an account of the truth-conditionality of sentences which exhibit presupposition failure. Thus, while Burton-Roberts' theory may succeed on this count, its usefulness to my project without some explanation of context-incrementation is limited. My project is the derivation of agents' beliefs from utterances, which must take into account much of the doxastic environment of the agents which are involved. Perhaps the greatest difficulty for proponents of a semantic approach to presup-position is the so-called projection problem. This is the study of how the presuppo-sitions of the clauses of a complex sentence 'project' over the sentence and into the context. Burton-Roberts [BR89] has argued that projection has been a problem for semantic approaches only because previous definitions of presupposition adopted by semanticists have been incorrect, and presents his own version. He also argues that the perceived ambiguity of natural language negation is likewise a by-product of a misconceived semantic definition. Pragmatic approaches avoid these issues largely by sidestepping them, and derive their (considerable) explanatory power from high-level theories of communications, although this is not always explicated. The result is that there is still no well-principled account of presupposition which can be derived strictly from a theory of communication. I go on to propose an axiomatization of such a theory, along with the necessary rules of inference to derive not only presupposition, but other classes of pragmatic inference as well. It may be quite simply that those who are most concerned with truth-function-ality in natural language are forced into a semantic account of implicit as well as explicit phenomena, while those concerned most with the effects of context will tend toward pragmatic approaches as the most profitable tools. 
As regards presupposition, the problem of separating description of the phenomena from theory is aggravated by the fact that the same word is used both as a descriptive term, identifying the relevant class of examples, and as a theoretical term in semantics, defined in terms of logical relationships between sentences or propositions.

2.1.4 Summary

Much of the work I have reviewed in this section seeks to provide an account of the truth-conditionality of sentences which exhibit presupposition failure. Thus, while Burton-Roberts' theory may succeed on this count, its usefulness to my project without some explanation of context-incrementation is limited. My project is the derivation of agents' beliefs from utterances, which must take into account much of the doxastic environment of the agents which are involved.

Perhaps the greatest difficulty for proponents of a semantic approach to presupposition is the so-called projection problem. This is the study of how the presuppositions of the clauses of a complex sentence 'project' over the sentence and into the context. Burton-Roberts [BR89] has argued that projection has been a problem for semantic approaches only because previous definitions of presupposition adopted by semanticists have been incorrect, and presents his own version. He also argues that the perceived ambiguity of natural language negation is likewise a by-product of a misconceived semantic definition. Pragmatic approaches avoid these issues largely by sidestepping them, and derive their (considerable) explanatory power from high-level theories of communications, although this is not always explicated. The result is that there is still no well-principled account of presupposition which can be derived strictly from a theory of communication. I go on to propose an axiomatization of such a theory, along with the necessary rules of inference to derive not only presupposition, but other classes of pragmatic inference as well. It may be quite simply that those who are most concerned with truth-functionality in natural language are forced into a semantic account of implicit as well as explicit phenomena, while those concerned most with the effects of context will tend toward pragmatic approaches as the most profitable tools.

While much of the previous work on presupposition centered on the attribution of truth values to sentences, I am more concerned in this thesis with the beliefs of agents in varying presuppositional environments. I have taken an 'opportunistic' approach to the use of presupposition within the implementation; wherever it seemed to me that additional beliefs could be derived from utterances, I implemented a presupposition schema to do my bidding. Thus, I make no claim that the range of presuppositional phenomena exploited by the system is complete; far from it. My approach does assume, following Karttunen and others [Kar74], that "The basic presuppositions of a simple sentence presumably can be determined from the lexical items in the sentence and from its form and derivational history..." and that it is possible to "give a finite list of basic presuppositions for each simple sentence of English."

Table 2.2, a summary of the previous research in presupposition evaluated in this section, provides an at-a-glance statement of particular researchers' attention to the dimensions I have identified. A question mark in any box indicates my feeling that the dimension, though mentioned in the work, is tangential to the thrust of the research.

                       Russell    Strawson   Gazdar   Mercer    Horton   Csinger
  Logic                Classical  3-valued   ad-hoc   default   modal    Theorist
  Defeasibility                              ✓        ✓                  ✓
  Implementation                                                         ✓
  Belief-predicated                                             ✓?       ✓
  Context-sensitivity                        ✓        ✓                  ✓?

Table 2.2: Summary of Previous Work in Presupposition

2.2 Theories of Communication

Several approaches have been taken to theorizing about, or modelling, communication. All are subject to the validity of numerous assumptions, and none of them has been completely adequate. These are some of the issues which any theory of communication must address, and which are briefly dealt with in the following sections:

• The meanings of utterances
• The mechanism(s) which support(s) the derivation of meanings of utterances
• The purpose of communication
• The organization of knowledge

Utterance Meaning

A well-established view [Cha76] is that the meaning of a natural language utterance consists of the logical form of the utterance itself, along with all of the inferences that can be made from this logical form and any relevant, available knowledge. This view remains plausible, but too vague to be more than a guideline. It makes no claim as to the nature of the logical form, the inference method, or categories of knowledge required. Various formalizations exist, which are more committed along one or more of these dimensions. In general, some distinction is made between the propositional content of an utterance and the meaning of an utterance; the former is a subset of the latter.
the meaning of an asserted declarative sentence is approximately equiva-lent to update your knowledge base with the logical form of the sentence just uttered. [Mer87] Bach and Harnish [BH79, pl50] discuss issues of sincerity versus 'literalness,' intending versus operative meaning versus Grice's notion of speaker meaning. Deriving Utterance Meaning While it is generally agreed that the meaning of a sentence is more than just its propositional contents, as noted above, there is not much consensus upon what this meaning actually consists of, much less any agreement about how to derive it. If, as Marr says, ". . . phrasing of information must be an artwork of suggestive-ness and insight", then the retrieval of the information must be via a process that is equally sophisticated. The purpose of human language is presumably to transform a data structure that is not inherently one-dimensional into one-dimensional form for transmis-sion as a sequential utterance, thereafter to be retranslated into some rough copy of the original in the head of the listener. [Mar81, pl51] Marr's view touches upon the related issues of representation and control. How-ever, he specifies neither the representation (e.g., logical form) nor the control (e.g., derivation processes). His caution is well-founded, of course, but others have been more daring in their proposals. Gazdar[Gaz79, pl33], for instance, suggests various levels beyond the literal meaning of an utterance; within the system of satisfiable incrementation which he defines with some precision, the relevant quantities are conveyed meaning, and conversational contribution, defined as follows: The conversational contribution of an utterance .. . is that proposition . . . which consists of all worlds except those that have both the following properties: • they were included by all the propositions in the context of [the] utter-ance and CHAPTER 2. BACKGROUND 23 • they are each excluded by at least one proposition (not necessarily the same one in each case) in the context that results from the utterance. Mercer[Mer87] defined sentence meaning as approximately equivalent to updating the hearer's database (above). He adds that implied in this is a "commitment to the principle that the inferences are generated by a well-founded proof theory working in conjunction with knowledge represented as statements in a logical language." Mer-cer deals only with asserted declarative sentences, and only with the generation of their presuppositions. The project then becomes one of defining the presupposition relation, which he goes on to do within a default logic formalism. The Purpose of Communication Linguists, philosophers, and -recently- com-puter scientists have grappled with the nature of communication. How is it accom-plished? What makes it possible? Certain assumptions have been at the heart of all theories thus far, often referred to as principles of cooperation. These amount to no less than normative guidelines for communicative acts and processes. Witness a philosopher's view [Den81, p238]: The norm for belief is evidential well-foundedness (assuring truth in the long run), and the norm for avowal of belief is accuracy (which includes sincerity). These two norms determine the pragmatic implications of our utterances ... Mercer's [Mer87, p7] first self-professed assumption regarding his model of cooper-ation is that ... the rules given in Grice's theory of cooperative communication govern the communication act. 
Neither is Horton [Hor87, p30] free from these assumptions. Her introduction reads:

Before proceeding, we now pause to discuss the assumptions that we make. In order to simplify the problem, we will follow Grice in assuming that conversation is cooperative. Specifically, we will assume the following:
• Sincerity Assumption: The speaker will only say what he believes to be true. In other words, the speaker will not deliberately try to deceive the listener.
• Straightforwardness Assumption: The speaker will not use sarcasm (a flouting of the maxim of Quality).

Although she goes on to suggest how her assumptions might be relaxed in order for her theory to model deceit and sarcasm [p96], hers is not a general theory of misleading.

Organization of knowledge

Language is a process of communication between people, and is inextricably enmeshed in the knowledge that those people have about the world. That knowledge is not a neat collection of concepts designed to manipulate ideas. It is in fact incomplete, highly redundant, and often inconsistent. There is no self-contained set of 'primitives' from which everything else can be defined. Definitions are circular, with the meaning of each concept depending on the other concepts. [Win71, p210]

Couched in this light, the project of identifying and representing categories of knowledge required for the modelling of communication appears hopeless. Although the enthusiasm of early researchers in knowledge representation is reflected in the terminology that still pervades the area, I do not see any gain in clarity via the use of such terms as 'knowledge base,' 'rationality module,' 'PLANNER,' 'Conniver,' etc. [PG86]. While such names are intentionally idiomatic, and highly suggestive of the roles they play in the (toy) implementations of their creators, they obscure the huge gulf between what they are and the psychological analogs they are designed to emulate. Following Perrault, I will try to reserve these tempting connotations for the text of grant applications, and use the less connotative 'information' instead of 'knowledge.'6 I suggest, then, that the following categories of information are salient.

6 There are even more principled reasons for avoiding the use of the term 'knowledge,' as discussed in section 2.3.

• Situational Information: Mercer distinguishes situational from background knowledge [Mer87, p11]. In particular, he writes that utterances are an important source of situational information, and thereafter restricts himself to only this form of situational information. He suggests that other sources might include information from previous parts of the discourse,7 the physical situation of the interlocutors, and their relative social statuses. As far as the current work is concerned, I too am interested only in the utterance itself as a source of situational information.

7 Although I tend to regard information from the discourse as part and parcel of the context.

• Background Information: Background information is everything non-situational. Aside from this obvious description of what it is not, various implementations have categorized it in different ways. I identify the following categories of information:

  - Linguistic Information: generally includes the following sources of information:
    * phonology
    * syntax
    * semantics
    * pragmatics

    The concerns of this thesis do not touch upon issues of phonology, and the domain can be safely ignored.
    Syntactic information is usually represented in the form of a grammar, which is used to build a structural representation of the utterance. Various formalisms exist, and their output is treated by some semantic process to yield a logical form, corresponding to what I have labelled the propositional content of the utterance. I do not wish here to argue either for a particular syntactic formalism, or for a particular representation language. I have adopted a first-order default-augmented representation not because I am ontologically committed to it, but because it offered certain pragmatically-motivated payoffs (see section 2.5). Semantic information is sometimes classified as including such facts as, for example, that factive verbs like regret and surprise entail their complements [Mer87, p12]. Although not crucial to my arguments, I go along with such categorizations. Pragmatic information encompasses a vast (and nebulous) area that includes information about the conversational usage of language in different situations.

    Of particular interest under the heading of (pragmatic) linguistic information is the aforementioned Cooperative Principle.

  - World Information: World, 'real-world,' or 'encyclopaedic' information includes facts about the world as it is. I take to be under this heading such 'knowledge' as the binary quality of human sexuality (i.e., that humans are generally male or female, and that these states are generally mutually exclusive), the 'knowledge' of the capital cities of the world, and so on. Information in this category can be of a default nature as well; some people are hermaphroditic.

  - Contextual Information: Human communication, and indeed the human concept of understanding, is grounded in context. Whether an utterance is successful is measured against the change it produces in the beliefs that the hearer has of the world; the speaker also has beliefs, and some of them are about the hearer. These beliefs are always subject to revision, and thus hint again at defeasibility from another direction. Context is conventionally regarded as the set of beliefs that are shared by the interlocutors. It has been called shared knowledge, mutual knowledge, common ground, etc.

2.2.1 Principles of Cooperation (Grice)

All of the previous systems have employed some version of the Cooperative Principle developed and summarized by Grice [Gri75], and repeated here as Figure 2.1.

1. Quantity:
   - Make your contribution as informative as is required.
   - Do not make your contribution more informative than is required.
2. Quality: Try to make your contribution one that is true.
3. Relation: Be relevant.
4. Manner: Be perspicuous.

Figure 2.1: Grice's Maxims of Conversation

In its simplest form, the principle accepts that the semantics of the language is a priori, and that utterance meaning depends upon this semantics augmented with inferences sanctioned by rules describing conversational use of utterances. These rules comprise the (Gricean) cooperative principle:

Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged.

Whether this principle owes its longevity to only its vagueness, or to some other quality, the fact remains that in some guise or another, it underlies every model of communication that I have surveyed [Gri75, BH79, Gaz79, GS80, Lyc84, Mer87, HH88, BR89].
The Cooperative Principle captures what I have called The Assumption of Minimal Perversity. This is the element of reasoning which eludes monotonic logics, and which all non-monotonic systems attempt to capture.8 In this study, minimal perversity manifests itself in that, given no indications to the contrary, the hearer assumes that the utterance adheres to the reasonable guidelines of the Cooperative Principle, embodied by the maxims of Figure 2.1. Typically, these maxims are applied from a strictly extra-logical perspective, although various re-formulations are in use.

8 I have noted the following definition of the Minimal Perversity Assumption: the assumption that of the (possibly, or even likely) infinite number of clauses which might affect the reasoning process, only those whose truth value is known must be considered. This is analogous to the well-known Closed World Assumption, and to various circumscriptive devices, all suggestive of non-monotonicity.

2.3 User Modelling

User modelling is the investigation of how assumptions about a user's background knowledge (as well as the user's plans and goals in consulting the system) can be automatically created, represented, and exploited by the system in the course of interaction with the user [KF88, p3].

User modelling is a special case of agent modelling; this thesis presents an approach to agent modelling in a natural language environment which is based on default reasoning. Others have considered the goals and plans of the user [CP79, AP80]; I restrict myself here to the user's beliefs. Although this may be a good point at which to launch into what beliefs actually are, I will pay these philosophical dues in section 2.4.1.

User modelling takes many forms. One approach has been to let the user do the modelling; some applications permit the user to modify variables in the environment which reflect user-proficiency parameters indirectly [CdCF87], while others query the user directly about preferences and capabilities. If user-modelling has come a long way since a user-determined 'help-level' first appeared on the menu of a popular word processing program, it has much farther to go before achieving the kind of flexibility we expect from our human interlocutors.

User modelling is what interactive systems will need to do to be responsive to the needs of the user. Tutorial systems need to gauge the user's competence in the subject of study [Kap82, p184]. For natural language systems to improve the utility of databases by providing access to inexpert users, they will have to obey the conversational principles observed by native speakers of the language.

The earlier discussion about the meaning of an utterance is relevant here (2.2). Presuppositions and other non-propositional utterance-related phenomena have recently resurfaced in the 'computational' literature because of their connection with database systems in general, and question-answering in particular. There has been some debate over what constitutes a cooperative response from a system when the question put to it suffers from presupposition failure [MR84, Kap82].

Dimensions of categorization

Kass and Finin [KF88] have enumerated a number of dimensions along which user-models can be categorized. Of particular interest are the dimensions they call

• Representation of beliefs
• Acquisition of beliefs
Representation of Beliefs

Of interest under this heading is the continuum between what Kass and Finin call implicit and explicit representation. Systems which model attitudes implicitly are of little interest to my project. These include any implemented program in which the programmer has made any assumptions about the prospective users of his system while writing the code. (Kass and Finin cite a generic FORTRAN compiler as an example.) Explicitly encoded attitude representation has some characteristics, some of which reflect on the project at hand. Part of what I have been referring to as Rationality, Kass and Finin call explicit representation, and go on to say:

The knowledge in the agent model is encoded in a representation language that is sufficiently expressive. Such a representation language will typically provide a set of inferential services, allowing some of the knowledge of an agent to be implicit, but automatically inferred when needed.9

9 cf. Levesque's distinction between implicit and explicit belief [Lev84], and section 2.4.2 of this thesis.

Acquisition of Beliefs

The method by which beliefs are acquired is relevant to the effectiveness of the user-modeller [KF88]. Acquisition has been described with the already overused terms implicit and explicit. Explicit acquisition takes place when the user makes explicit statements about what she does and does not believe. Implicit acquisition is more difficult, and involves deduction on the part of the user-modeller. One approach is the technique advanced in this thesis, wherein the user-modeller monitors the communications channel between user and application, and derives tacit and implicit user-beliefs from the propositional contents of user-utterances.

Belief acquisition in the domain of user-modelling has been further categorized [ibid.] as recognition oriented or constructive. Recognition oriented approaches are more limited in their scope, but more straightforward to implement. This kind of system relies on a stored set of belief stereotypes, which can be triggered by the form of a user-utterance [ibid.]:

Thus if a user indicates knowledge of a concept that triggers a stereotype, the whole collection of assumptions in the stereotype can be added to the model of the individual user. Stereotype modelling enables a robust model of an individual user to be developed after only a short period of interaction.

The approach to belief acquisition advanced in this thesis is thus implicit, and constructive.

2.4 Belief and Rationality

Beliefs and rationality are deeply intertwined, both within and without the computational paradigm. In the following subsections, I explore the relationship between belief and rationality, and consider various definitions of rationality. The aim of this investigation is to suggest directions that might lead to plausible formulations of belief introspection (i.e., rules to derive beliefs from existing beliefs). In this thesis, I take the working view that

1. Rationality is a property of intelligent agents of the sort for which the theory described in this thesis is formulated
2. An intelligent agent can be described in terms of its beliefs10
3.
Rationality is in some sense a well-formedness criterion for the beliefs of an intelligent agent

The first point is little more than a manifesto, a declaration to the effect that rationality is an identifiable property of intelligence, present in at least some agents which are intelligent.11 In particular, I consider normal humans to be rational.

The second point claims, without trying to explicate the nature of a belief (i.e., without making any ontological claims in respect of beliefs), that an intelligent agent can be described at some level by its beliefs; for instance, one agent can be distinguished from another by a difference in their respective beliefs. In general, I will refer only to a partial description of this sort. Thus, when I speak in terms of an agent's attitudes, goals, or desires in this thesis, I do so loosely, with the underlying assumption that these aspects of the agent's mental state can ultimately be reduced to some expression in terms of its beliefs.

The third and last point is the one that ties beliefs to rationality. Agents that exhibit an identifiable set of normative characteristics are predictable to the extent of their 'normativeness'. In general, inter-agent communication relies on this normative component, and deviance results in various pathologies (e.g., pluralistic ignorance and false consensus). In particular, the theory described in this thesis provides predictive power for agents which are normative with respect to a particular definition of rationality, to be described. Hearer-agents with this normativity can make inferences about the beliefs of speaker-agents who make utterances, and normative speaker-agents can derive the forms of their utterances from their beliefs. This is the view of communication taken in this thesis.

Previous views of this well-formedness criterion have amounted to anything from classical logical consistency to ad-hoc procedural specifications. I try to leave the definition as loose as possible, to be filled in with further results as necessary, but I do propose herein that a default reasoning framework offers immediate results towards resolving some well-known problems such as logical omniscience.

In this section, I pay my philosophical respects to others who have considered the relationship between rationality and belief.

10 Though I have not completely abandoned this original, naive hope that goals and desires might be expressed as complexes of beliefs, it certainly appears to me now that it is easier to treat them as primitive.
11 This is not a strong statement. I plan to pursue the relationship between rationality and intelligence. I only point out here that on my view, while all rational agents are intelligent, not all intelligent agents need be rational in the sense described.

2.4.1 Beliefs

There is much to say about beliefs, as the ample body of philosophical literature demonstrates, but there is very little that is not still under investigation. The researcher who wishes to represent beliefs in a computational environment has little more than his own intuitions to go on. My own intuitions urge me to remain as ontologically uncommitted as possible, and while I have surveyed a wide range of models of belief, I will stay with what I consider to be the most minimal.

Beliefs versus Knowledge.

Believing is a state of knowledge representing the propositions that the system ASSUMES to be true.
Reasoning is the process of inference to form beliefs from other beliefs using deduction rules. [Mor87]

Any complete model of a user will include information about what the user knows, or what he believes.

In the context of modelling other individuals, an agent does not have access to objective truth and hence cannot really distinguish whether a proposition is known or simply believed to be true [KF88]. We will speak only in terms of the beliefs of the agent being modelled. Other approaches in which the subjectivity of truth has been recognized are current, viz. [SM88]:

. . . start with only a definition of knowledge, any definition that you find acceptable, and define belief as a defeasible version of it.

This is, by the way, another brick in the wall of the argument for defeasibility.

My own views on belief are fairly radical in comparison, for I count myself among the eliminative materialists [Chu86, Chu88], who would not commit themselves ontologically to anything like 'beliefs in the head.' 'Beliefs' are common-sense entities, and become the objects manipulated by a default logic in virtue of just this quality. I doubt that there is anything very 'scientific' about the way humans come to their conclusions, and a normative theory of that process will be scientific only to the extent that it successfully captures the pragmatics of the process.

Tacit, implicit, shared, mutual, etc., beliefs.

Logical definitions of mutual belief have been offered by many researchers. The subject is of some importance to this project because the Cooperative Principle in operation demands mutual recognition, and hence representation. Suffice it to say that the interlocutors postulated for my theory mutually believe12 the elements of the Cooperative Principle. If they do not, a state of pluralistic ignorance or of false consensus might arise, in which the maxims of cooperation would be defeated. Agents necessarily maintain models of their peer-agents; part of this model usually includes beliefs to the effect that, among other things, their peer-agents believe the maxims of the Cooperative Principle. The unfoundedness of such a belief is characteristic of false consensus.

12 All the previously discussed reservations about 'belief' apply here.

The Epistemic Status of Belief

Hadley [Had89, p4] surveys the epistemic status of beliefs in artificial intelligence research:

We may summarize the stance towards belief currently adopted by many (though not all) AI researchers as follows: "Agent X believes sentence S if and only if S is explicitly present in X's belief base, or S is derivable, by means of a tractable epistemic logic, from a set of epistemic formulae corresponding to a subset of X's explicit belief base." Now, certain difficulties with the above emerge as soon as the thesis is explicitly stated. For example, the epistemic logics cited above do not address the fact that agents acquire beliefs over time. Nor do they address the fact that an agent may, on occasion, validly derive a conclusion from prior beliefs, but abandon that conclusion because it conflicts with another of the agent's beliefs. To be sure, if the agent is RATIONAL,13 the conclusion will be abandoned only if the agent also discards at least one premise of the retracted conclusion. Nevertheless, agents often have inconsistent beliefs, and do not automatically 'commit to' the conclusions they derived.

13 Hadley does not explain in this paper what he means by rational; the emphasis is mine.

Compare Konolige [Kon85]:

. . . a belief subsystem is the computational structure within an artificial agent responsible for representing his beliefs about the world . . . A belief subsystem consists of a finite list of facts the agent believes to be true of the world (the base set) together with some computational apparatus for inferring consequences of these facts . . . the belief set of an agent is the set of all queries that can be derived.
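Konolige's distinction is easy to state operationally. The following sketch is mine, not Konolige's: the propositions, the rule list, and the naive forward-chaining closure are illustrative assumptions only, standing in for a genuine inference apparatus.

    # A minimal sketch of Konolige's base set / belief set distinction.
    # The propositions and rules are invented for illustration.

    base_set = {"utt(s, regret(fred, leave(mary)))"}

    # Finite rule apparatus: antecedent -> consequent, applied by forward chaining.
    rules = [
        ("utt(s, regret(fred, leave(mary)))", "bel(s, leave(mary))"),
        ("bel(s, leave(mary))", "bel(s, gone(mary))"),
    ]

    def belief_set(base, rules):
        """Close the base set under the rule apparatus: the derivable belief set."""
        beliefs = set(base)
        changed = True
        while changed:
            changed = False
            for antecedent, consequent in rules:
                if antecedent in beliefs and consequent not in beliefs:
                    beliefs.add(consequent)
                    changed = True
        return beliefs

    print(sorted(belief_set(base_set, rules)))
    # The base set is explicit belief; everything else in the output is implicit,
    # in roughly Levesque's sense: present only via the inference apparatus.

The gap between the two sets is exactly where questions of logical omniscience and bounded inference arise; section 2.4.2 returns to this.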
A belief subsystem consists of a finite list of facts the agent believes to be true of the world (the base set) together with some computational apparatus for inferring conse-quences of these facts ... the belief set of an agent is the set of all queries that can be derived. Cognitive scientists are divided into two broad camps, each of which adheres to a different view of belief. There is the Representational Theory of Mind, championed by Jerry Fodor and Zenon Plyshyn, and the Syntactic Theory of Mind with an 2A11 the previously discussed reservations about 'belief apply here. 3Hadley does not explain in this paper what he means by rational; the emphasis is mine. CHAPTER 2. BACKGROUND 32 equally stellar group of supporters including Stephen Stich. To show that the issue remains far from settled are a group of respected philosophers who have thus far refused to give up their seats on the fence between extremes. Daniel Dennett [Den87] is one of these; his opponents call him 'slippery', his supporters 'careful'. There are also the connectionists, with their own view of cognition. This is a promising line of inquiry, but has little to offer to this project. A distinction should be made here. The goals of cognitive science are presumably to offer an explanation of the human cognitive example, while the goal of this project under the A l umbrella is to suggest a set of computational approaches which might be of use in a certain area. With this in mind, I do not claim that the representation of beliefs which is to follow is true to the human cognitive model. Nonetheless, the effort that has been devoted by philosophers of mind to the question of belief should not go unnoticed. In the absence of a computational work-bench, some inquirers have constructed models which bear a striking resemblance to the architecture of the system presented in this thesis, and are useful exemplars of the kind of behavior I would like an artificial agent to exhibit. The work of cognitive scientists serves then, as a high-level requirements analysis for researchers whose aim is to implement cognitive models. Stephen Stich's Content Theory of Belief [Sti83] is one such model. In it, he proposes an Inference mechanism whose resemblance to the Rationality or Intro-spection module of this implementation is obvious. He considers beliefs to be some form of mental sentence tokens, whose meanings are imbedded in their causal in-teractions with other sentence tokens and with the environment. To connect the agent with his environment, he adds perception and action-control units. To deal with the causes of actions, he includes a Practical reasoning mechanism, capable of generating desires from beliefs and desires. There is more than passing interest in this model, for it shows us how far a computational system must go before it can be considered to have beliefs in the same sort of way that we do. In addition to representing beliefs and providing a mechanism for introspection, it must take account of desires, and provide both a means to generate desires, and a way to interface with the environment. The notion of artificial agent in this thesis falls far short of these requirements. Stich's model nicely points the way to future work. While we have developed a hearer-based computational theory of communication, and an implementation that has application as a user-modeller, we have ignored goals and desires, and therefore a realistic account of a speaker model is still out of reach. 
2.4.2 Rationality

I take rationality to be the mechanism whereby an agent reasons about its world, generating new attitudes in concert with its previous attitudes and with incoming data from its environment (the context), and discarding any attitudes which it finds untenable in its system for rationality (i.e., it must discard those attitudes which are irrational). This circular definition leaves open many issues, and particularly the question of the mechanism itself. But controversy begins with the smallest move towards a more detailed account.14 15

Most work to date has tended to an often unexplored assumption that rationality is and must be at bottom logical. This belief has made its way into even the lay world, where (accompanied by admonishments from Mr. Spock on the bridge of the Starship Enterprise) irrationality and illogic are confused. To avoid crossing unnecessary ontological territory, ground that is likely someday to be lost to a better-founded theoretical assault, I want from the outset to clearly separate what is the domain of rationality, and what part logic is to play in it.

A goal or a belief can be rational only with respect to an agent and his inference mechanism. For instance, it may appear prima facie rational for an individual to plan to have children, if the axiomatization of that agent's beliefs includes only his built-in (innate) instincts. But if the additional constraints of global population density are added, such a plan is questionable in its rationality. Still, the inference mechanism remains unspecified, and the haste with which researchers join to define it in terms of first-order predicate logic is forgivable only in view of the shortage of viable rivals for the job.16 For the sake of the current computational implementation, I too follow the trend of exploring different logical axiomatizations of rationality.

What can be said of, or hoped for, a model of rationality? That it is useful? That it is faithful to the human example? That it is stronger or weaker than the human example? I think none of the logical formalisms have been faithful in this sense,17 that most of them have been stronger, and that their usefulness remains, in general, to be demonstrated. My approach is detailed in section 4.5.

I take rationality to be a property of the thinking (or lack of thinking) that goes into belief formation or decision making. In particular, it is a property of the methods used, the rules followed (or not followed), not of the outcome of the process. [Bar85, p5]

The purpose of rational thinking is to make decisions most consistent with the THINKER'S INTERESTS or to arrive at beliefs whose strengths are IN PROPORTION TO THE EVIDENCE available.18 [Bar85, p6]

14 I have already risked controversy in the preceding definition by my tacit acceptance of an agent which dispenses with irrational attitudes; this is already a somewhat idiosyncratic view of what can be only an ideal agent. Real agents can certainly be irrational; I am personally acquainted with several of this bent!
15 See [Che86] for a thorough discussion of these and related issues.
16 Nor does my search for alternatives promise to bear much quick fruit. I am at the early stages of exploration in connectionist methods, and can say only that my skepticism is unabated.
17 It is my opinion as well that they will never be.
18 My emphasis.
I include these quotations to provide myself an opportunity to discuss briefly the difficulties with defining rationality. My observations have not been completely ignored by others, especially within the area claimed by philosophers (see, for instance, [Den87]), but they are usually shrugged off, particularly by 'computational' researchers.

First, since rationality has been defined in terms of them, I would ask: what are the agent's interests? Who or what is to decide such a thing? If it is not the agent itself, but some meta-agent, then the familiar infinite level-regress must be avoided. And still some subjectivity would linger; is it rational for an agent to choose in favor of its own survival, when this may select against the continued survival of its race?19 My second question, based directly upon the preceding definition of rationality, would be: how do we measure the strength of a belief?

Though these are certainly not idly posed questions, I will not attempt to answer them here; I will adopt a simplistic view of what an agent's interests might be, in the context of Cooperative Communication, and will only hint in later sections (e.g., 2.5.2) as to how some beliefs might be preferred over others.

Within the 'computational' school, established rationality constraints are usually some variation upon a demand for logical consistency, which is in my view an unrealistic attitude. I would not want to go so far as to suggest that (logical) consistency is a prerequisite for rationality, much less that (logical) closure be a criterion of rationality. Previous strategies have all suffered from what has been called the problem of logical omniscience, wherein an agent who believes α is held to believe all of the (logical) consequences of α. This requirement imposes the following conditions [Lev84]:

• Every valid sentence must be believed
• If two sentences are logically equivalent, then one must be believed if the other is (regardless of its complexity)
• If a sentence and its negation are both believed, then so must every sentence

These conditions are undesirable20 as partial definitions of rationality. I want something less limiting, and turn first to default reasoning for both a defeasible version of closure and a language capable of expressing inconsistency. I will also explore the (partial) implementation of implicit versus explicit belief along the lines of Levesque [Lev84]. See section 4.5 for my implementation of the constraints on rationality.

19 These ideas are consistent with Dawkins' [Daw87].
20 Karttunen writes in a footnote that "It is implicit in this treatment that every individual's beliefs are considered to be closed under entailment. I am not sure whether this is a defect." [Kar74] Much of the work in this thesis is intended to address this question.

    Dimension                  Allen     Cohen     Perrault    Konolige    Csinger
    Utterance meaning          Prop.     Prop.     Prop.       Prop.       Imp.
    Beliefs vs knowledge                           ✓           ✓           ✓
    Rationality assumptions    Class.    Class.    Class.      Class.      Default
    Default conditions                             ✓           ✓           ✓
    Degree of implementation   ✓         ✓                                 ✓
    Intentions                 ✓         ✓         ✓
    Context-sensitivity        ✓                   ✓                       ✓

    Table 2.3: Summary of Previous Work in Belief Modelling

Implicit and Explicit Belief: Levesque [Lev84] hints at a solution by distinguishing between what he called implicit and explicit knowledge.21 22 He recognizes the difficulties inherent in a consistency- or closure-based approach to rationality, and identifies these approaches with possible-worlds semantics.
He proposes another semantics, which he calls a situation, described loosely as a 'partial possible world.' Roughly speaking, a situation may support the truth of some sentences and the falsity of others, but may fail to deal with other sentences at all. Explicit belief, then, is identified with possible worlds, while implicit beliefs are identified with situation semantics.

The qualities described above are the ones I will try to incorporate into my model as a first try at rationality. I have little choice but to define rationality, for the limited purposes of this work, as conformance to a prescriptive model (i.e., a set of rules); my approach will be to make the model as conservative as possible. I will press the formalism of default logic into service to this end, realizing from the start that a normative model of rationality itself is not within reach.

2.4.3 Previous Work in Belief Modelling

In this section I survey previous work in the modelling of beliefs, with particular attention to the aspects discussed above. It may be useful in what follows to refer to the summary Table 2.3.

21 An unfortunate choice of terminology from my point of view; the categories are only remotely related to my use of the terms implicit and explicit.
22 Charniak [Che86, p9] calls these conscious and unconscious beliefs; Konolige [Kon85] calls them the base set and the belief set.

Allen

Allen [AP80, All87] has advanced a theory of speech acts, along with an implementation that makes use of rules composed of preconditions, bodies, and effects. The preconditions serve to embed the rules within consistent contexts, and enforce the Gricean maxims along the way. Allen says as much in his description of the INFORM speech-act [All87, p443]:

As expected, there is a precondition that the speaker believes the proposition that is asserted, and the effect is that the hearer believes the proposition.

The definition is given [AP80] in terms of shared beliefs of the speaker and hearer, and with the addition of the preconditions:

    Action Class:       INFORM(Speaker, Hearer, Proposition)
    Want-Precondition:  Speaker want INFORM(Speaker, Hearer, Proposition)
    Precondition:       KNOW(Speaker, Proposition)
    Effect:             KNOW(Hearer, Proposition)

where KNOW is defined as the mutual knowledge operator, along the lines discussed in section 2.4.1. He defines other speech-acts (e.g., REQUEST), whose effects or preconditions involve the intentions of the agents involved, viz., their plans and goals. Allen notes that an accurate account of beliefs and intentions would need to be time-indexed. Other operators are introduced to represent further intentions.

Allen recognizes the limitations of the belief-logic he employs, which he says is to be interpreted more or less along the lines of Hintikka. Only the propositional contents of formulae are considered in utterance meaning. The point most salient to my project is Allen's recognition of the importance of context-sensitivity, which he implements via the preconditions of his operators.

Cohen et alia

Cohen and Perrault [CP79] also describe a system that makes use of rules consisting of preconditions and effects. They implement the INFORM and REQUEST speech-acts, and they interpret belief as a modal operator constrained with the following axiomatization, which they recognize as an 'idealization' that is clearly 'too strong to be a faithful model of human beliefs'.
They go on to say:

To reflect human beliefs more accurately, one needs to model (at least): degrees of belief, justifications, the failure to make deductions, inductive leaps, and knowing what/who/where something is. These refinements, though needed by a theory of speech acts, are outside its scope.

    B1  αBELIEVE(all axioms of the predicate calculus)
    B2  αBELIEVE(P) ⇒ αBELIEVE(αBELIEVE(P))
    B3  αBELIEVE(P) or αBELIEVE(Q) ⇒ αBELIEVE(P or Q)
    B4  αBELIEVE(P) and αBELIEVE(Q) ⇔ αBELIEVE(P and Q)
    B5  αBELIEVE(P) ⇒ ¬αBELIEVE(¬P)
    B6  αBELIEVE(P ⇒ Q) ⇒ (αBELIEVE(P) ⇒ αBELIEVE(Q))23
    B7  ∃x αBELIEVE(P(x)) ⇒ αBELIEVE(∃x P(x))
    B8  all agents believe that all agents believe B1 to B7

Cohen and Levesque provide a set of context-sensitive axioms [CL] to capture the consequences of utterances. Their approach makes use of a form of the closed world assumption, in that the preconditions for some of these rules involve statements about what an agent does not believe. Only the propositional contents of formulae are considered. Other operators are introduced to represent the intentions of agents.

23 B6 of Cohen and Perrault is equivalent to: (αBELIEVE(P → Q) and αBELIEVE(P)) → αBELIEVE(Q). This is the form in which I will consider this axiom in my own work.

Perrault

Perrault addresses the application of default logic to a speech act theory [Per87], and in so doing brings many issues to light.

He realizes and argues strongly for the context-sensitivity of the rules that capture the consequences of utterances, and this is a large part of his appeal to non-monotonic logic. Perrault deals exclusively with the propositional contents of declarative utterances. He distinguishes between knowledge and belief.

Perrault's approach to rationality makes no appeal to default logic. He states that "The beliefs of one agent at one time are taken to be consistent, distributive over conjunctions, closed under logical consequence and positive introspection. Beliefs need not be true." He thus axiomatizes the rationality constraints as follows, where B_{x,t}p is read as agent x believes that p at time t:

    Consistency             B_{x,t}p ⇒ ¬B_{x,t}¬p
    Distributivity          B_{x,t}(p ∧ q) ≡ B_{x,t}p ∧ B_{x,t}q
    Closure                 B_{x,t}p ∧ B_{x,t}(p ⇒ q) ⇒ B_{x,t}q
    Positive Introspection  B_{x,t}p ⇒ B_{x,t}B_{x,t}p
    Memory                  B_{x,t}p ⇒ B_{x,t+1}B_{x,t}p
    Persistence             B_{x,t+1}B_{x,t}p ⇒ B_{x,t+1}p

The last two axioms are intended to ensure that "agents remember their previous beliefs and continue to hold them." The strength of these axioms prevents revision of belief. Also, he indicates that all agents are assumed to believe that all axioms hold. This is not a default rule:

Definition 8 (Perrault: Axiom Closure) For every agent x, time t and axiom A above, B_{x,t}A is an axiom.

The default rules employed by Perrault both concern the incrementation of belief sets.

Definition 9 (Perrault: Belief Transfer Rule) B_{x,t}B_{y,t}p ⇒ B_{x,t}p

Definition 10 (Perrault: Declarative Rule) DO_{x,t}p ⇒ B_{x,t}p

The Declarative Rule is similar to the sincerity condition which has been referred to throughout this thesis, and which will also be implemented in section 4.2 as a default rule. DO_{x,t}p is to be interpreted as the action of agent x at time t of uttering a declarative sentence with propositional content p. He adds the following meta-rule, to implement closure of default rules:

Definition 11 (Perrault: Default Rule Closure) For all agents x and times t, if p ⇒ q is a default rule, so is B_{x,t}p ⇒ B_{x,t}q

The implementation of rules such as these within a logic24 is always a problem. (See section 4.5.)
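To make the operation of Definitions 9 and 10 concrete, here is a small sketch of the two defaults as a hearer might apply them; the tuple encoding of B and DO, the suppressed time index, and the blocked-set consistency test are my own illustrative devices, not Perrault's.

    # A toy reading of Perrault's Declarative and Belief Transfer defaults.

    def B(x, p):
        # B_x p, with the time index suppressed for brevity
        return ("B", x, p)

    def derive(facts, blocked=frozenset()):
        """Apply the two defaults to a fixed point, withholding any conclusion
        whose addition would be 'inconsistent' (i.e., is in blocked)."""
        facts = set(facts)
        changed = True
        while changed:
            changed = False
            for f in list(facts):
                new = None
                if f[0] == "DO":                       # Declarative: DO_x p => B_x p
                    new = B(f[1], f[2])
                elif f[0] == "B" and isinstance(f[2], tuple) and f[2][0] == "B":
                    new = B(f[1], f[2][2])             # Transfer: B_x B_y p => B_x p
                if new is not None and new not in facts and new not in blocked:
                    facts.add(new)
                    changed = True
        return facts

    # Speaker s utters declarative p; hearer h observes the utterance act.
    observed = {("DO", "s", "p"), B("h", B("s", "p"))}
    print(derive(observed))
    # Passing blocked={B("h", "p")} models a context in which the hearer cannot
    # consistently adopt p: the transfer default is then simply inapplicable.

Converting the persistence axiom itself into a default, as Perrault contemplates below, amounts to letting prior observations populate the blocked set, which is precisely where the multiple extension problem of section 2.5.2 enters.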
There is no sense in which one extension of a default theory has precedence over another; I will discuss this issue in section 2.5.2.

Perrault is able to show in this formulation the difference between theories representing sincere and insincere utterances by a speaker, but the persistence axiom (as noted above) prevents the retraction of previous beliefs. He briefly considers different belief strategies that might be described in default logic, pursuing the possibility of making some of the axioms into default rules. In particular, he mentions the persistence axiom and discusses its conversion into the persistence default rule. If both this and the memory axiom were converted to default rules of inference, multiple (mutually inconsistent) extensions would result, representing both the case in which the hearer's beliefs persist, and the case in which they do not. As he points out, "The theory would then give no precedence to either." Perrault does not pursue the subject any further, though it seems to me that this is the single most important unexplored thread. See section 2.5.2 for further exploration of the problem of choosing between multiple extensions.25

24 It is easy enough to add meta-logical control to implement rules like this, but these are variously ad-hoc.
25 Perrault goes on to consider intentions within the limits of his formulation, but these issues fall outside the scope of this thesis.

Konolige

Konolige [Kon85] and Batali [Bat83]26 both explore the ability of agents to reason about their own representations, a process that they call introspection. Konolige advances the view that 'a belief subsystem is the computational subsystem within an artificial agent responsible for representing his beliefs about the world.' Konolige [Kon84] argues that this belief subsystem is 'conceptually separate' from the rest of an agent's cognitive mechanisms. He also distinguishes between the finite list of facts which the agent believes a priori to be true of the world, and the set which the agent can derive via its computational inference apparatus. He calls the finite set of beliefs the base set, and the inferrable superset the belief set; these notions correspond closely with Levesque's explicit and implicit belief, respectively.

Konolige, with Appelt [AK88], also advocates the use of default logic, with emphasis upon attitude revision. He employs what he calls a hierarchic autoepistemic logic, characterized by a collection of subtheories linked in a hierarchy, rather than by a single default theory. I will have more to say about this approach in section 2.5.2.

26 Batali's work is a survey of several computational models of introspection, and includes an extended argument for continuing such research. Most of his observations are covered in this thesis in some form or another.

2.5 Non-monotonic Systems

Default logics were formulated to overcome some of the well-known problems of classical, monotonic logics.

Definition 12 (Monotonicity) A system is monotonic if and only if it has the following property: whenever it infers a conclusion C from a set of assumptions S, it will also infer C from any larger set of assumptions containing S.

One of the best known non-monotonic formalisms is due to Reiter [Rei80].

2.5.1 Theorist

The Theorist formulation for default reasoning lends itself particularly well to implementation in a logic programming environment [Poo87]. The Theorist implementation I used embodies a non-clausal first-order theorem-prover and a mechanism for defeasible rules of inference, making it a likely candidate for implementing both the principles of cooperative communication and the rules for presuppositional inference.
In Theorist the user provides two sets of first order formulae:

F is a set of closed formulae called the facts. These are intended to be true in the domain being modelled, and as such are assumed to be consistent.

Δ is a set of formulae which act as possible hypotheses, any consistent ground instance of which can be used as a premise in a logical argument.

Definition 13 (Scenario) A scenario of (F, Δ) is a set D ∪ F where D is a set of ground instances of elements of Δ such that D ∪ F is consistent.

Definition 14 (Explanation) If g is a closed formula then an explanation of g from (F, Δ) is a scenario of (F, Δ) which implies g.

Definition 15 (Extension) An extension is the set of logical consequences of a maximal (with respect to set inclusion) scenario.27

Definition 16 (Prediction) g is predicted if and only if g is in all extensions.

That is, g is explainable from (F, Δ) if there is a set D of ground instances of elements of Δ such that F ∪ D ⊨ g and F ∪ D is consistent, in which case F ∪ D is an explanation of g. Such a g will be referred to in this thesis as the explanandum28 of a logical argument. I will make extensive use of both prediction and explanation as described above, in the discussions to follow.

Theorist is an attempt to be a minimalist system. It is an attempt to see how far a very simple hypothetical reasoning framework can be pushed. It will also be of interest later in this thesis because exactly the same formal definition provides a definition for default reasoning, abductive reasoning, design, and recognition. These issues will arise in section 3.4.

27 This corresponds to Reiter's definition of extension in terms of fixed points [Rei80], [Poo88].
28 The plural of this term is, of course, explananda!

2.5.2 Theory Preference

The problem of multiple extensions arises in all default theories of any complexity. There is great representational power in being able to place into separate extensions mutually inconsistent formulae corresponding to distinct alternatives. This power has, however, gone unused because of the problems associated with choosing between the extensions. The implementation presented in this thesis also suffers from the multiple extension problem, as will be detailed in section 4.7. Some comments by Perrault [Per87] highlight the difficulties:

Ideally, one would like a theory in which it is possible for one agent's beliefs, say, to change depending on HOW STRONGLY29 he believed something before the utterance, and how much he believes what the speaker says. We cannot give such an account in detail, so we will rely on something simpler. We assume what one might call a persistence theory of belief: that old beliefs persist over time, and that new beliefs are adopted as a result of observing external facts as long as they do not conflict with old ones.

Perrault has not gone into the reasons for his inability to provide 'such an account'; even the ideal theory he refers to does not address the discarding of beliefs in the light of new facts, and the problem of implementing it looms large. This is no criticism of Perrault; his silence speaks eloquently for what needs to be done.
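The multiple extension problem is easy to exhibit directly in terms of Definitions 13 through 16. The following propositional sketch is my own toy encoding (brute-force model enumeration over three atoms, on the standard Nixon-diamond example), not Poole's implementation:

    # Scenarios, explanations, extensions and prediction (Definitions 13-16),
    # computed by brute force over a tiny propositional language.

    from itertools import combinations, product

    ATOMS = ["quaker", "republican", "pacifist"]

    def models(formulas):
        """All truth assignments satisfying every formula in the set."""
        for values in product([True, False], repeat=len(ATOMS)):
            m = dict(zip(ATOMS, values))
            if all(f(m) for f in formulas):
                yield m

    def consistent(fs):
        return any(True for _ in models(fs))

    def entails(fs, g):
        return all(g(m) for m in models(fs))

    F = [lambda m: m["quaker"], lambda m: m["republican"]]
    d1 = lambda m: (not m["quaker"]) or m["pacifist"]          # quakers are pacifists
    d2 = lambda m: (not m["republican"]) or not m["pacifist"]  # republicans are not
    Delta = [d1, d2]

    def maximal_scenarios(F, Delta):
        """Consistent hypothesis sets (Definition 13), maximal w.r.t. set
        inclusion; their consequences are the extensions (Definition 15)."""
        Ds = [set(D) for r in range(len(Delta) + 1)
              for D in combinations(Delta, r) if consistent(F + list(D))]
        return [D for D in Ds if not any(D < E for E in Ds)]

    g = lambda m: m["pacifist"]
    exts = maximal_scenarios(F, Delta)
    print("explained:", any(entails(F + list(D), g) for D in exts))   # True
    print("predicted:", all(entails(F + list(D), g) for D in exts))   # False

Both maximal scenarios are extensions here; pacifist is explainable but not predicted, and nothing in the formalism itself prefers one extension to the other.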
Time does not permit an exploration of the efforts thus far undertaken by researchers such as Poole [Poo85], Brewka [Bre89], Konolige [AK88], Geffner [Gef89], and others, broadly characterized by a common goal of achieving a reasoning behavior in closer correspondence with intuition. Most approaches take recourse to some form of semantic, domain-dependent cues, thereby abandoning one of the stated goals of this thesis, that of ontological and logical minimality. Csinger and Horsch [in progress] are currently exploring various syntactic approaches with the aim of maintaining this generality.

29 My emphasis.

Chapter 3

Design Issues

Consistency is the hobgoblin of small minds.
- Ralph Waldo Emerson

I return now to my stated goal of deriving an agent's beliefs from his utterances. Having argued, as have many others, that the meaning of an utterance is more than its propositional, explicit contents, I go on to show how certain elements of the implicit and tacit contents play roles in the derivation of beliefs. The general model I will pursue is illustrated in Figure 3.1. The lines of the diagram indicate inference paths; the solid line between utterance and belief represents the familiar entailment relation of monotonic logic, while the other lines are intended to be suggestive of the defeasible implication of non-monotonic logic.1 It is the purpose of this chapter to describe in some detail the inference processes which occur along these paths.

I embarked upon a default reasoning implementation not because I hold any measure of psychological reality for the formalism, but from the practically motivated desire to produce a system that would successfully ascribe a set of beliefs to an agent based upon his utterances. Some intentional idiom was needed, and default logic presented itself as the most accessible, the least ad-hoc, and the one with the least ontological baggage.2

The strategy in all of what follows is to abstract away from the temporal linearity of discourse that would lead into truth-maintenance considerations, and to assume instead that the entire discourse is available for analysis. The problem is then one of achieving a consistent explanation of the discourse.3

1 Compare Figure 3.1 with Figure 1.1; the former can be interpreted as the elaboration of the latter, or the simpler Figure 1.1 can be seen as the limiting effect of a purely monotonic logic. The only inference path available to a purely classical analysis is the one solid line of Figure 3.1.
2 The remainder of this chapter comprises the body of a paper (Hypothetical Reasoning and Discourse Structure) by Csinger and Poole, still in preparation.
3 Perrault of SRI almost convinced me during a recent seminar that his default-logic formulation of speech-acts requires the use of time-predicated modal operators. A full description of discourse will require some reference to time, and perhaps even to modal operators, but it is my contention that the more limited project of determining the propositional contents of an utterance is well within the domain of minimalist (default) logic.

Figure 3.1: From Utterance to Belief via Communication (nodes: Utterance, Implicatures, Presuppositions, Belief)

3.1 Causality and Point-of-view

The reasoning system which implements a theory of communication, and a user-modeller in particular, has only the utterances of some agents as input to the inference process. It is arguable that there is causality inherent in the domain, but my position is that since belief is relative to the believer (agent-relative), any causality must be relativized to the agent's point-of-view. The discussion to follow first makes precise the elements that participate in my theory, then sets out the limitations and restrictions I have accepted in this thesis.
I then attempt to resolve the problems associated with agent-relativity by first examining a causality model of conversation, and then showing the connection to default logic programming methodology.

3.1.1 The General Model

In general, communication consists of events which take place among an arbitrary set of agents called a population. One or more agents play the role of speaker, while the rest are hearers. No agent can be a speaker and a hearer at the same time. A communicative event takes place among a subset of the agents called a group, that includes only one speaker. This is the reasonable (among civilized tribes of interlocutors!) condition that only one agent speak at a time. There is no loss in generality in restricting a theory (as I have) to communicative events within groups composed of only two agents, viz. one speaker and one hearer. This group is to be known as a pair, specifically a speaker-hearer pair. The theory of communication described in this thesis applies in particular to conversations of such a pair of agents.

Point of view

As far as a speaker-agent is concerned, it is her intentions, goals and beliefs4 which compose the explanation of her utterance. Thus, in answer to a query about the purpose of her utterance, she might answer that she wished to inform, or mislead, her interlocutor; these are her goals. With reservations,5 I allow that no one is more qualified than the speaker herself to attest to her beliefs and intentions.

A hearer-agent has even less recourse to a claim of privileged access,6 and must resort to some form of theorizing about the speaker's mental states, based upon very scarce input. Ignoring physical posturing, sense-data from visual and other senses, etc., further reduces the available data, and the hearer that is thus restricted has only the utterances of the speaker to go by. It is in just such a frugal environment that the typical user-modeller must function, and for which the present theory is formulated: by 'listening in' to the utterances between interlocutors, the reasoner of the User-Modeller UM must reconstruct their mental states. In the simplest UM system, the reasoner plays the role of hearer; based upon the utterances of the speaker, it attempts to reconstruct via some inference process the mental state of the speaker.

I have already argued that it is advantageous for the system to make use of the entire bandwidth of the communications channel between speaker and hearer, between user and user-interface, and I have suggested how this might be at least partially accomplished [CP89]. In particular, I have presented the inference classes of presupposition and implicature as part of the process by which to derive the beliefs of the speaker. This is the purpose of the theory of communication I have been advocating in this thesis.

Causality

The speaker's mental state can be regarded as 'causing' her utterance, and this is likely to be the point of view of the speaker herself. Thus, we say speaker-belief 'causes' speaker-utterance. As far as the hearer is concerned, it is the speaker's utterances which are the source7 of the hearer's beliefs about the speaker.
Thus, we say speaker-utterance 'causes' hearer-belief. (Refer to Figure 3.2.) These and other causal influences are inherent in the domain, and are partially encoded in the lines of Figure 3.1. The implementation discussed in this thesis takes the point of view of the hearer-agent, and all formulae should be interpreted as the system's beliefs about the speaker-agent.

4 Which may be, of course, and in general will include, beliefs of the agent about other agents.
5 There are well known philosophical problems with claims of privileged access to mental states.
6 The hearer has even greater difficulty with the problem of other minds. [Den85, Den87, Sea84]
7 For instance, in the sense that the speaker's utterances are the 'data' for the hearer's recognition procedures.

3.2 Default-Programming Methodology

Although there is precious little data in the form of existing implementations from which to draw generalizations, Poole has devoted some time to the exploration of how his framework for default reasoning might best be employed. Just as any textbook on Pascal programming that purports to be informative and complete should include a chapter on top-down programming, a guide to the use of a default logic framework should refer to the prevailing, tried-and-true methodologies of that paradigm. This is what the author of Theorist has set out to do [Poo89c, Poo89b], and what this part of the thesis is concerned with.

In general, there are not enough constraints in a domain to uniquely determine the approach that the reasoning system should take in formalizing its characteristics [Poo89b]. The causality in the domain does not uniquely constrain its default-reasoning axiomatization. To make the presentation here more precise, we use the simple default reasoning framework of Theorist [PGA87]. Different uses of Theorist can be characterized along two dimensions:

• Status of Explananda, and
• Status of Assumptions

The first considers whether the explanandum is known to be true or whether it is something that has to be determined. The second considers whether the system is free to choose any hypothesis that it wants or whether it must try to "guess" some hypothesis that "nature" has already chosen.

Status of Explananda

The first dimension is whether the explanandum is known or not. We can divide this into two choices:

Abduction: The system knows that the explanandum (the observation of the world or the design objective) is true, and needs to find an explanation for it. The idea is to find assumptions that imply the explanandum. We consider all explanations as possible descriptions of the world.

Prediction: The system does not know if the explanandum is true, and the idea is to determine what can be predicted from the facts (the general knowledge and the observation or design objective).

One interesting difference between abduction and prediction is in the relevance of counter-arguments. In predicting g, it matters whether or not ¬g can be explained. In abduction, however, an explanation of ¬g is irrelevant.

Status of Assumptions

Along the other dimension we can distinguish between the two tasks:

Design can be defined as the case where the system can choose any hypothesis it wants. For example, a system can choose the components of the design in order to fulfil its design objective, or choose utterances to make in order to achieve a discourse goal. A consistency check is used to rule out impossible designs.
All other sets of components that fulfil the goal are possible, and the system can choose the "best design" to suit its goal. Design can be done in an abductive way, to try to hypothesize components in order to imply a design goal. Alternatively, design can be done in a predictive way, to derive a design from goals and any hypotheses we care to choose.

Recognition is the case where the underlying reality is unknown, and all we can do is guess at it based on the observations we make about it. This definition includes diagnosis, scene recognition and plan recognition. Recognition can also be done in an abductive manner or a predictive manner [Poo88], [Poo89b]. In an abductive framework, we need to treat all of the explanations as possible descriptions of the world. In the predictive framework, one appealing strategy is to predict something only if it is explained from the observations even when an adversary chooses the hypotheses [Poo89a], which corresponds to membership in all extensions (and corresponds, propositionally at least, to circumscription [Eth83]).

Note that these frameworks are different ways to use the same formal system for different purposes. In order to use the system we have to choose one way to implement our domain.

3.3 The Communications Domain

Understanding is difficult even in the simplest of communications domains. Typically, a Hearer attempts to reconstruct a Speaker's (complex) mental state from a limited set of verbal and non-verbal cues, given only a general a priori understanding of the communications domain. The reasoning system with which we propose to implement inter-agent communications has only the utterances of some agents and a set of shared principles as input to the inference process. A hearer-agent must resort to some form of theorizing about the speaker's mental states, based upon this very sparse input. It is in such a frugal environment that the typical user-modeller must function, and for which the present theory is formulated:
3.3 The Communications Domain

Understanding is difficult even in the simplest of communications domains. Typically, a Hearer attempts to reconstruct a Speaker's (complex) mental state from a limited set of verbal and non-verbal cues, given only a general a priori understanding of the communications domain. The reasoning system with which we propose to implement inter-agent communications has only the utterances of some agents and a set of shared principles as input to the inference process. A hearer-agent must resort to some form of theorizing about the speaker's mental states, based upon this very sparse input. It is in such a frugal environment that the typical user-modeller must function, and for which the present theory is formulated: by 'listening in' to the utterances between interlocutors, the reasoner of the UM must reconstruct components of their mental states.[8]

    [Figure 3.2: Causality Model for Interlocutor Pair]

In the simplest UM system, the reasoner plays the role of hearer; based upon the utterances of the speaker, it attempts to reconstruct, via some inference process, a subset of the mental state of the speaker.

As far as a speaker-agent is concerned, it is her intentions, goals and beliefs which compose the explanation of her utterance. The speaker's mental state can be regarded as causing her utterance, and this is likely to be the point of view of the speaker herself. The mental state of the speaker can be regarded as a representation of her design objectives; what she seeks is to design an utterance to fulfil these objectives.

As far as the hearer is concerned, it is the speaker's utterances which are the primary source of the hearer's beliefs about the speaker's mental state. Thus, the hearer seeks to recognize some components of the speaker's mental state from speaker-utterance. (Refer to Figure 3.2.)

This exploration of the inherent direction of domain causality leaves open the direction of inference that the system is to select. This choice is essentially a question of default-logic programming methodology, since the way the domain is axiomatized will impose a particular inference strategy on both hearer and speaker agents.

    [8] Kass and Finin [KF88] have referred to this approach to user-modelling as implicit with respect to acquisition, and explicit with respect to representation.

3.4 Domain Formulation

We now turn to formulating the domain within the default reasoning framework. The problem of finding the right constraints on the domain breaks down into the problem of where to place the interlocutors of the speaker-hearer pair on the domain-formulation grid of Table 3.1.

                              Explanandum
    Who                       Known (Abduction)    Unknown (Prediction)
    Design (User)
    Recognition (Nature)

    Table 3.1: Domain-Formulation

Elsewhere, we have discussed the kind of information needed to support interaction between rational agents, and have discussed specific points (e.g., world knowledge, linguistic knowledge, and the extent to which these are shared by the interlocutors [CP89]). Philosophical issues aside, we suggest that in reconstructing a model of the speaker from her utterances, a hearer makes particular use of shared knowledge. To make this easier, the shared knowledge should be represented in a form that supports the inferences of both the Speaker (as utterance designer) and the Hearer (as belief recognizer). If we accept that there are principles of communication [Gri75] which the Speaker adheres to in designing her utterance, it is reasonable that the Hearer make use of these principles as well during the recognition process. The central implementation question is then: how should the principles of communication be represented?

The answer to this question is hidden in an important characteristic of the interlocutor pair: Speaker-Hearer Duality.

Speaker-Hearer Duality

As we have presented the domain, there are essentially two kinds of information available to, and distributed between, the interlocutor pair: as a designer of utterances, the Speaker knows beliefs, while as a recognizer, the Hearer knows utterances. These aspects of the domain allow us to conclude that it is the Speaker-agent that occupies the first row of the domain-formulation table, and that the Hearer-agent will occupy the second. For convenience, we have labelled the agents with the coordinates of the box they occupy. The domain can be implemented in at least four different ways, corresponding to the four different possible combinations of Speaker and Hearer, as represented in the domain-formulation table. The four possible implementations are enumerated in Table 3.3. The first column of Table 3.2 represents a system where both members of the speaker-hearer pair know their explananda; but due to the nature of the domain itself, these explananda will be different. Likewise for column two, where the explananda are unknown.

                              Explanandum (x)
    Who                       Known (Abduction)           Unknown (Prediction)
    Design (User)             I.  Speaker_11, x = bels    II. Speaker_12, x = utts
    Recognition (Nature)      II. Hearer_21,  x = utts    I.  Hearer_22,  x = bels

    Table 3.2: Communication Domain Formulation

    Speaker agent uses        Hearer agent uses
    (1,2)  prediction         (2,2)  prediction
    (1,1)  abduction          (2,1)  abduction
    (1,1)  abduction          (2,2)  prediction
    (1,2)  prediction         (2,1)  abduction

    Table 3.3: Four Possible Implementations of the Domain

Speaker-Hearer Duality is a feature of the domain which gives rise to the Shared Information Constraint, which suggests that there are two reasonable ways to assign grid positions to speaker and hearer, and consequently, that there are two sensible implementation strategies.
The Shared-Information Constraint

We have already argued that a certain (probably large) percentage of the information available to hearer and speaker must be mutual to them both for successful communication. We suggest now that this places a useful constraint upon domain axiomatization, and gives us a partial answer to our implementation question: for the speaker and hearer to share knowledge, their worlds should be axiomatized the same way. In particular, given a set of principles of communication which express ('causal') relations between beliefs and utterances, the Speaker and Hearer should adopt the same view of this causality. This means that, for either of the axiomatizations presented, the two members of the speaker-hearer pair will use different inference mechanisms, viz. abductive or predictive reasoning. (Refer to Table 3.4.) We will call this useful domain-formulation constraint the Shared Information Constraint.

    Inference            Speaker                  Hearer
    Direction            knows    uses            knows    uses
    I.  utt => bel       bel      abduction       utt      prediction
    II. bel => utt       bel      prediction      utt      abduction

    Table 3.4: Speaker-Hearer Duality

Observe that there are (at least) two essentially distinct approaches to axiomatizing the speaker-hearer pair's communication domain. These correspond to what we have referred to loosely as the "directions of inference", and are labelled with roman numerals in Table 3.2. Note that in both cases, the Speaker is performing Design, while the Hearer is involved in Recognition; it is their explananda (along with the inference strategies they adopt) that vary depending upon their grid positions. In addition to the Shared Information Constraint, there are independent concerns which also motivate, and which may constrain, the implementation methodology. These are addressed in the following sections.

3.5 Alternative Implementation Strategies

Having accepted the argument for mutually represented information as compelling enough to constrain the formulation of the domain, there are still two alternatives. Any domain is likely to admit of this kind of 'vagueness', which is not unlike the problem of choosing an algorithm in a conventional programming language.

Case I

Choosing the axiomatization of case I means that the hearer agent uses prediction, while the speaker agent uses abduction, and that the principles of communication will be of the following form:[9]

    H_I = { principle_1,
            principle_2,
            ...
            principle_n }

    F_I = { utt(X, Y),
            principle_1 ∧ utt(α, ω) ⇒ bel(α, B_11) ∧ bel(α, B_12) ∧ ... ∧ bel(α, B_1b1),
            principle_2 ∧ utt(α, ω) ⇒ bel(α, B_21) ∧ bel(α, B_22) ∧ ... ∧ bel(α, B_2b2),
            ...
            principle_m ∧ utt(α, ω) ⇒ bel(α, B_m1) ∧ bel(α, B_m2) ∧ ... ∧ bel(α, B_mbm) }

    [9] Some of these facts actually function as hypotheses in our implementation; this distinction is unimportant here.

In adopting the predictive approach for the hearer, we consider the facts F to consist of the utterances themselves and all other information regarded as true; thus the utterances are the observations which are to be explained, or 'diagnosed'. H is, inter alia,[10] the default representation of the principles of communication, viz. the normality assumptions. For instance, a speaker is normally sincere, thereby believing what she says. We are prepared to accept sincerity as 'normal' (equation 3.1), and as a component in the diagnosis, as in equation 3.2.

    [10] Both my theory and implementation posit other elements which also add default rules, but which can be ignored for our purposes here.

    H = { sincere(Speaker, ω),
          lying(Speaker, ω) }                                        (3.1)

    sincere(S, ω) ∧ utt(S, ω) ⇒ bel(S, ω) ∧ relevant(S, ω)           (3.2)
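In Theorist syntax these normality conditions translate directly into a default and a fact. A minimal sketch follows; the predicate names anticipate chapter 4, and the relevant conjunct here follows equation 3.2 rather than the implemented rule of Figure 4.7:

    default sincere(S, U).
    fact sincere(S, U) and utt(S, U) => bel(S, U) and relevant(S, U).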
Sarcasm, misdirection, and outright lying are also possible explanations of the observations, and may enter into the Hearer's recognition process, as in equation 3.3.[11]

    lying(S, ω) ∧ utt(S, ω) ⇒ ¬bel(S, ω) ∧ ¬bel(S, ¬bel(Hearer, ω))   (3.3)

The system may not be able to predict any particular belief component of a mental state, even though it may be able to explain this component. In this way, the UM can entertain competing models of the Speaker's mental state.[12] The Speaker uses the default representation of the principles of communication, along with her beliefs, to abduce utterances which fulfil her design objectives.

    [11] See chapter 4 for a description of these and other predicates that appear in the implementation language.
    [12] This is perhaps a sceptical view of human communication, but lying is a well-established human trait. It is only reasonable to presume that our artificial interlocutors will someday fall prey to unscrupulous users unless forewarned of our propensity to mislead!

Case II

Choosing the axiomatization of case II means that the hearer agent uses abduction, while the speaker agent uses prediction, and that the principles of communication will be represented in the following form:

    H_II = { principle_1,
             principle_2,
             ...
             principle_m }

    F_II = { bel(X, Y),
             bel(α, B_11) ∧ bel(α, B_12) ∧ ... ∧ bel(α, B_1b1) ∧ principle_1 ⇒ utt(α, ω),
             bel(α, B_21) ∧ bel(α, B_22) ∧ ... ∧ bel(α, B_2b2) ∧ principle_2 ⇒ utt(α, ω),
             ...
             bel(α, B_m1) ∧ bel(α, B_m2) ∧ ... ∧ bel(α, B_mbm) ∧ principle_m ⇒ utt(α, ω) }

The principles of communication can be regarded here as possible hypotheses which would be acceptable as explanations of the observations. Stated in diagnostic terms, the principles would be the possible causes of the observed symptoms. Thus, in the presence of a conjectural intention[13] on the part of the speaker to communicate, one explanation of an observed utterance is based on conjectured sincerity.

    bel(Speaker, ω) ∧ relevant(Speaker, ω) ∧ sincere(Speaker, ω) ⇒ utt(Speaker, ω)   (3.4)

The facts for the Speaker are her beliefs, which are to be explained with those of the default principles which are consistent.

The reader should note here that there is a formulation and implementation of Theorist which allows for both abduction and prediction to be performed within the same framework, on a single database. This architecture, shown in Figure 3.3, is suited to implementing the communications domain of the Speaker and Hearer agents described in this chapter.[14] Figure 3.3 depicts the implementation alternative described in this chapter as Case I.

    [Figure 3.3: Theorist Architecture for Abduction and Prediction - abduction leads from Speaker belief to Utterance, and prediction from Utterance to Hearer-inferred beliefs]

    [13] I.e., in the presence of some belief-predicated term or terms expressing the Speaker's belief that her utterance is relevant, etc.
    [14] However, as noted elsewhere in this thesis, I have implemented only the Hearer's side of the conversation.
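To make the Case II alternative concrete, here is a sketch in Theorist syntax of equation 3.4 used abductively. It is hypothetical: as footnote 14 notes, only the Hearer's side of the conversation was implemented.

    % Case II: the principle is an assumable hypothesis that helps
    % explain an observed utterance.
    default sincere(S, U).
    fact bel(S, U) and relevant(S, U) and sincere(S, U) => utt(S, U).

    % Explaining the observation utt(user, w) yields theories containing
    % sincere(user, w), together with the speaker beliefs the rule
    % requires.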
Chapter 4

Implementation

    The beliefs of today may count as true today, if they carry us along the stream; but tomorrow they will be false, and must be replaced by new beliefs to meet the new situation.
        - Russell on Bergson's Finalism

The implementation is presented and discussed in this section. It is written entirely in the Theorist language. The code portions are presented in distinct units, loosely corresponding to the categories identified in the meta-schema presented earlier; in some cases, the code and sample output have been edited for readability. The unabridged code for the entire implementation is reproduced in Appendix A.[1]

After an introduction to the implementation language, I return to discuss presuppositions and the principles of communication, with an eye to isolating their roles in the current project.

4.1 Implementation Language

The underlying representation language is that of Theorist, as described in section 2.5.1. I define rules to represent various types of information, as described throughout this thesis, and particularly as distinguished in chapter 2.[2] The categories of interest are:

• The maxims of the cooperative principle
• Presupposition-generating rules (from lexical categories)
• Implicature-generating rules
• Ad-hoc belief support functions

    [1] Some of this work appeared in From Utterance to Belief, by Csinger & Poole [CP89].
    [2] Each of the following categories is represented by an inference path in Figure 3.1.

The following can be considered a meta-schema of the predictive version of the implementation. Figure 4.1 describes the form in which the maxims are to be captured. The interpretation I intend for the syntactic elements is as follows:

• utt(α, ω): The agent α 'utters' the statement ω.
• bel(α, β): The agent α 'believes' the statement β.
• imp(α, ι): The agent α 'implicates' the statement ι.
• pre(α, π): The agent α 'presupposes' the statement π.

As for the meanings of the quoted terms, I would like to leave their definitions as pre-theoretic as possible. Hadley has surveyed [Had89] the uses of 'belief' in the field of AI, and has concluded that it is unclear to what extent the various theories are taken by their proponents to be true theories, or realistic cognitive models. He also adds that the 'syntactic approach' underlies the others to varying degrees. With this in mind, and with the conviction that a realistic account of (human) cognition need not necessarily be logical in any sense, I do not wish to go beyond a syntactic characterization of the current model. In leaving the definitions as 'pre-theoretic' as possible, I mean to avoid imposing either a semantics or a claim to psychological validity. If history continues as it has in recent years, the lifespans of such claims are not likely to be long.

Thus, I can say that utterances are context-situated,[3] and that utt(α, ω) means the agent α expresses a statement ω. The information content of ω is its propositional content, augmented with the inferences sanctioned by both the rules of the cooperative principle, and the context embodied in the beliefs of the agent and those of his interlocutors.

    [3] Which is to say little more than that the theory I am constructing is a pragmatic one.

An agent α believes the information expressed by β just in case the quantity bel(α, β) holds true. As noted above, I hold fast to the syntactic view, by which device two expressions β1 and β2 are different, even if they can be considered synonymous under some semantically defined operation. Thus, I leave open the question of whether an agent who believes Mary has a brother also believes Mary has a male sibling. As far as my implementation goes, agents will not perceive such synonymies unless presented with an explicit rule to identify them.
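A hypothetical identification rule, in the style of the world knowledge of Appendix A.5, shows what such an explicit synonymy rule might look like (the property names are illustrative). With the rationality defaults of section 4.5, an agent believing the antecedent would then, by default, believe the consequent:

    fact mutual(=>(property(X, has_brother),
                   [property(X, has_male_sibling)])).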
An agent α implicates an expression ι just in case the quantity imp(α, ι) holds true. This happens when an inference is sanctioned by the line connecting utterance to implicature in Figure 3.1. ι can not be both implicated in this sense and uttered as described above. I.e.,

    ∀α, ω, ι . utt(α, ω) ∧ imp(α, ι) → ω ≠ ι

An agent α presupposes an expression π just in case the quantity pre(α, π) holds true. This happens when an inference is sanctioned by any line terminating in presupposition in Figure 3.1. π can not be presupposed in this sense if it is either uttered or implicated as described above. I.e.,

    ∀α, ω, ι, π . (utt(α, ω) ∧ pre(α, π) → ω ≠ π) ∧ (imp(α, ι) ∧ pre(α, π) → ι ≠ π)

    default principle_1 : utt(α, ω) ⇒ bel(α, B_11), bel(α, B_12), ..., bel(α, B_1b1).
    default principle_2 : utt(α, ω) ⇒ bel(α, B_21), bel(α, B_22), ..., bel(α, B_2b2).
    ...
    default principle_m : utt(α, ω) ⇒ bel(α, B_m1), bel(α, B_m2), ..., bel(α, B_mbm).

    Figure 4.1: Principles of Communications

    default implicature_1 : utt(α, ω), principle_y1 ⇒ imp(α, ι_11), imp(α, ι_12), ..., imp(α, ι_1l1).
    default implicature_2 : utt(α, ω), principle_y2 ⇒ imp(α, ι_21), imp(α, ι_22), ..., imp(α, ι_2l2).
    ...
    default implicature_p : utt(α, ω), principle_yp ⇒ imp(α, ι_p1), imp(α, ι_p2), ..., imp(α, ι_plp).

    Figure 4.2: Implicature Generators

In Figure 4.1, the B_ij are the beliefs adduced to capture the normative strengths of the maxims, as discussed in the relevant sections of this thesis. In Figure 4.2, the ι's are derived from the forms of the ω's; this places constraints on the ι's sufficient to guarantee, for instance, that ω ≠ ι. In Figure 4.3, the π's are derived from the forms of the ω's; this places constraints on the π's sufficient to guarantee, for instance, that ω ≠ π.

In addition to the default rules of Figures 4.1, 4.2, and 4.3, the rules 4.1 and 4.2 are needed to derive beliefs describing the mental state of the speaker (or rather, that of the system which models the mental state of the speaker).

    default bel_imp(α, ι) : imp(α, ι) ⇒ bel(α, ι).     (4.1)
    default bel_pre(α, π) : pre(α, π) ⇒ bel(α, π).     (4.2)

    default presupposition_1 : utt(α, ω), principle_x1 ⇒ pre(α, π_11), pre(α, π_12), ..., pre(α, π_1r1).
    default presupposition_2 : utt(α, ω), principle_x2 ⇒ pre(α, π_21), pre(α, π_22), ..., pre(α, π_2r2).
    ...
    default presupposition_s : utt(α, ω), principle_xs ⇒ pre(α, π_s1), pre(α, π_s2), ..., pre(α, π_srs).

    Figure 4.3: Presupposition Schemas/Triggers

    default rationality_1 : bel(α, B_1) ∧ principle_y1 ⇒ bel(α, B_11) ∧ bel(α, B_12) ∧ ... ∧ bel(α, B_1r1).
    default rationality_2 : bel(α, B_2) ∧ principle_y2 ⇒ bel(α, B_21) ∧ bel(α, B_22) ∧ ... ∧ bel(α, B_2r2).
    ...
    default rationality_s : bel(α, B_s) ∧ principle_ys ⇒ bel(α, B_s1) ∧ bel(α, B_s2) ∧ ... ∧ bel(α, B_srs).

    Figure 4.4: Rationality Constraint Schema

Rationality conditions can also be implemented as default rules, representing a set of normative constraints which exhibit the desirable behavior of defeasibility, thus relaxing the traditional requirements of closure and consistency.[4] The rationality (or introspection) schema cannot be implemented directly as shown in Figure 4.4 without some consideration of the underlying control mechanisms. See section 4.5 for details.

Different types of knowledge can be implemented either as facts or as defaults in the logic, depending upon their epistemic status as perceived by the implementor. I have adopted the view that all beliefs are defeasible, as suggested by Shoham [SM88].

    [4] Consistency remains a criterion of rationality in the implementation I present, but in the default-theoretical, as opposed to the traditional, monotonic sense. Nonetheless, I do not wish to claim that consistency is in any sense a property of rationality; I know of many empirical counterexamples to such a claim!
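This stance is realized directly in the rationality rules of section 4.5: even mutually known information enters an agent's belief set only through a default, and so can be withheld whenever assuming it would be inconsistent. For instance:

    % Mutual knowledge is believed by default, not by fact:
    default aware(Agent, A) : mutual(A) => bel(Agent, A).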
4.2 Principles

Others before me have felt free to implement and reformulate the Gricean Maxims, picking and choosing from among them as they saw fit. I see no reason why I should not indulge in a similar practice, with the accompanying explanations.

First, it is not so much the Gricean maxims that I wish to formulate, as it is the underlying intuition which they attempt to capture. So while Grice's work is no doubt a large part of the inspiration for what follows, I am not trying in any way to be faithful to his method. What I retain of Grice is the reasonable working hypothesis that communication is governed by a set of principles (which Grice calls his 'maxims'), which would, if completely explicated, provide explanations for natural language utterances. I do not make any claim regarding the number of these governing principles, and will refer instead to the set which contains them, even though its cardinality is unknown. It is these principles of communication which I implement in this thesis.

The relationship between 'my principles' and 'Grice's maxims' is summed up by observing that Grice restricted himself to 'cooperative' forms of communication. The principles I have in mind seek to capture normal [human] communications in a broader normative sense. In particular, different kinds of misleading are normal, rational communicative pursuits, and the theory should be able to represent these. See Figure 4.5. It is worthwhile to my project to bear in mind throughout the essential defeasibility of any of the principles. All of their exhortations should be prefixed with something along the lines of 'in the absence of any contradictory information ...,' or more significantly, perhaps: 'by default ...' Thus, while the principles are the expression of the norms of human communication, they give way to other, abnormal modes of communication, which I lump under the blanket term misleading,[5] to distinguish them from the cooperative mode.

    [5] I owe the use of this term to David Poole.

    [Figure 4.5: Principles and Grice's Maxims - Gricean Cooperation and Misleading as parts of the set of Principles]

The intuition that I seek to capture with the default reasoning implementation is articulated by van Frassen [vF75, p52]:

    And whether one is guilty of deception depends not so much on whether what one says is true or false (it is perfectly possible to deceive by making true statements) but on how and when one says it.

Further light is thrown on the project by Gazdar, quoting from Lewis [Lew69]: 'L is an actual language of a population P iff there prevails in P a convention of truthfulness in L.' The word 'prevail' is important: lying is an effective enterprise only in a population in which a convention of truth prevails. The point is once again hammered home that compliance with the principles is normative, that deviations in non-ideal agents are to be expected, and that theories founded on principles of rationality might be pressed into service as lie-detectors of sorts, if not truth-detectors.
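A hypothetical sketch of this 'lie-detector' reading: when the established context already rules out the speaker's belief in her utterance, the sincerity assumption of the next section becomes inconsistent, and only the misleading alternatives survive as explanations. The context fact here is contrived purely for illustration:

    fact utt(user, w).
    fact not bel(user, w).       % established from prior context
    default sincere(user, w).
    default lying(user, w).
    fact sincere(user, w) and utt(user, w) => bel(user, w).
    fact lying(user, w) and utt(user, w) => not bel(user, w).

    % sincere(user, w) contradicts the context, so explanations of the
    % utterance must assume lying(user, w) or some similar
    % non-cooperative principle.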
As described earlier in this thesis, Grice categorized his Cooperative Principle into a number of maxims which were intended to explain natural language communication between cooperating agents. In the discussion which follows, I refer to these categories only because they are a good starting point; I am not committed to a "Gricean" theory in any deeper sense.

The Maxim of Quality is a sincerity condition, the formulation of which follows, and is consistent with Searle's account of Speech Acts, expressed as follows:

    It is always possible to express a psychological state that one does not have, and that is how sincerity and insincerity in speech acts are distinguished. An insincere speech act is one in which the speaker performs a speech act and thereby expresses a psychological state even though he does not have that state. Thus an insincere statement (a lie) is one where the speaker does not believe what he says, an insincere apology is one where the speaker does not have the sorrow he expresses, an insincere promise is one where the speaker does not intend to do the things he promises to do. [SV85, p18]

Thus I shall say, naively, that a Speaker believes what she says. I will call this the Principle of Sincerity.

Quantity is the idea that a speaker should utter the most specific statement of what she wishes to communicate. A reasonable, but by no means exhaustive, formulation of this is that when a speaker utters a disjunction, she does so because no other natural language connective is expressive of the 'tentativeness' of her belief in either of the disjuncts. This rule thus sanctions the derivation of the clausal quantity implicatures as per Gazdar [Gaz79] and Mercer [Mer87]. This will be the basis of my Principle of Disjunction.

Relevance is tricky. I suggest that anyone who can completely formulate this one in any kind of logic will have solved most, if not all, of the problems of Artificial Intelligence! Needless to say, I am still working on it, although as a first attempt, I might expect the speaker to utter only what the speaker believes the hearer does not already know. Groenendijk and Stokhof [GS80] have referred to this as a principle of informativeness. This becomes my Principle of Relevance.[6]

Perspicuity is too vague a concept to admit of an obvious representation within the current framework, and I will leave it for future work.

Sarcasm, though not one of the 'original' maxims, can be captured simply along the following lines. A speaker is sarcastic when the speaker 1) does not believe her utterance, 2) believes that the hearer does not believe the utterance, and 3) believes that the hearer believes that the speaker does not believe the utterance. These conditions mark my Principle of Sarcasm.

The principles discussed above are summarized in Figure 4.6. As promised throughout this thesis, the principles of cooperative communication have been captured in the Theorist language, and the resulting implementation is presented in this section. Corresponding in spirit to each of the maxims discussed in sections 2.2 and 2.2.1 are the series of default rules described here. These rules are the simplest that could plausibly account for the inferences involved. Their interactions with the rules expressing presupposition and implicature are described in the upcoming sections dealing with those rules.

    [6] The case where a speaker utters ω even though she believes that the hearer already believes ω is not covered by this principle of relevance, but would be explainable via a principle of confirmation.
    Principle 1 (Sincerity): A Speaker believes what she says.

    Principle 2 (Disjunction): A Speaker may believe any of the disjuncts in her utterance.

    Principle 3 (Relevance): A Speaker believes the hearer does not a priori believe her utterance.

    Principle 4 (Sarcasm): A Speaker does not believe her utterance, and
    • believes the hearer does not believe the utterance
    • believes the hearer believes that she does not believe her utterance.

    Figure 4.6: Some of the Principles of Communications

A minimal condition on 'sincerity' is that the speaker believe what she says. The rule of Figure 4.7 expresses precisely this dictum. The system will assume the speaker's sincerity whenever it is consistent[7] to do so. As expressed here, sincerity does not involve the 'true beliefs' of the speaker, only that she have a belief which corresponds to the contents of her utterance.

    %% GRICEAN Quality Analog
    default sincere(S, U).
    fact sincere(S, U) and utt(S, U) => bel(S, U).

    Figure 4.7: Default Representation of Maxim of Sincerity

'Quantity' is the dictum that the speaker says nothing which she believes to be already known by the hearer.[8] This condition is expressed by the rules of Figure 4.8.

    %% GRICEAN Quantity Analog
    default quantity(S, U).
    fact quantity(S, U) and utt(S, U) => bel(S, not bel(hearer, U)).

    Figure 4.8: Default Representation of Maxim of Quantity

When a speaker is being 'sarcastic', it is usually true that she does not believe what she says, that she believes that the hearer does not believe what she says, and that she believes the hearer believes that the speaker believes that the hearer does not believe what she says. These appear to be the minimal requirements upon a condition of sarcasm, and are represented by the conjuncts of the rule in Figure 4.9.

    % Sarcasm predication
    default sarcastic(S, U).
    fact sarcastic(S, U) and utt(S, U) =>
        bel(S, not bel(hearer, U)) and
        bel(S, bel(hearer, bel(S, not bel(hearer, U)))) and
        not bel(S, U).

    Figure 4.9: Default Representation of Maxim of Sarcasm

    [7] Whenever it is consistent in the default-logical sense.
    [8] This is not the usual interpretation of the quantity principles. Call it relevance, or brevity, or whatever. "A rose by any other name..."

Note that a speaker can not be simultaneously sincere and sarcastic with respect to a particular utterance. Whereas a conventional logical approach would derive a contradiction, or not conclude anything, the mechanics of default reasoning derive the consequences of assuming both sincerity and sarcasm, with mutually inconsistent beliefs residing in separate extensions of the resulting theory. For example, given the utterance by Dave that John does not regret having jumped, Theorist derives the formulae of example 4-1.[9]

    Example 4-1: Beliefs derived, given:
        fact utt(dave, not property(john, regret, jumped)).

    bel(dave, not property(john, regret, jumped))
        sincere(dave, not property(john, regret, jumped))
    bel(dave, not bel(hearer, not property(john, regret, jumped)))
        sarcastic(dave, not property(john, regret, jumped))
    bel(dave, bel(hearer, bel(dave, not bel(hearer, not property(john, regret, jumped)))))
        sarcastic(dave, not property(john, regret, jumped))

    [9] Among others, which have been omitted here for clarity.

See section 2.5.2 for a discussion of how one extension might be 'preferred' over another.
4.3 Presupposition

Mercer [Mer87] shows how to represent a number of presuppositional schemas in Reiter's formalism for default logic. These schemas correspond largely to the fragments of Theorist code presented in this section. Theorist provides a useable implementation, and I have reified over properties to allow for a first-order representation.

4.3.1 Criterial and Non-criterial Properties

Mercer describes his schema for non-criterial properties in terms of the meaning-inheritance hierarchy of a lexeme. The criterial properties of a lexeme are those which define the terminal branches of the hierarchy, e.g., a bachelor is unmarried. Non-criterial properties are those which define the other levels of the hierarchy, e.g., bachelors are [generally] male, and adult. Mercer says of this category of presupposition that it is a "type of lexical presupposition which is based on the deciding criterion of a negated lexeme's meaning" [Mer87, p76].

    Example 4-2:
    Sentence 4-1: My cousin is not a bachelor.
    Sentence 4-2: The speaker's cousin is male.
    Sentence 4-3: The speaker's cousin is adult.

Mercer's example is reproduced as example 4-2. Given the utterance of sentence 4-1, the presuppositions of sentences 4-2 and 4-3 can be derived. The non-criterial presupposition schema is implemented as the Theorist default rule of Figure 4.10. This rule might be paraphrased as: when a negated lexical item appears in an utterance, and it has non-criterial properties, then if it is consistent to do so, infer that the speaker believes the indicated presupposition. The non-criterial properties of the lexemes, where applicable, are simply provided as facts in Theorist.

    % noncriterial presupposition schema:
    default pre_by_nonc(S, Object, Propty, Presupposition) :
        (utt(S, property(Object, not Propty)) or
         imp(S, property(Object, not Propty))) and
        nonc(Propty, Presupposition)
        => bel(S, property(Object, Presupposition)).

    Figure 4.10: Non-criterial default schema

    Example 4-3: Speaker's beliefs about bachelors, given:
        fact utt(andrew, property(cousin, not bachelor)).

    Answer is believes(andrew, property(cousin, male))
    Theory is [pre_by_nonc(andrew, cousin, bachelor, male)]

    Answer is believes(andrew, property(cousin, adult))
    Theory is [pre_by_nonc(andrew, cousin, bachelor, adult)]

Given the utterance by the agent Andrew of My cousin is not a bachelor, Theorist ascribes the beliefs of example 4-3 to Andrew.[10] Note that the antecedent of the presupposition schema contains a conjunct that is a disjunction of an utterance formula and an implicature formula. This is a reflection of the fact that implicatures can themselves sanction presuppositions; this will become clearer in the following section dealing with the implementation of implicatures, and again in section 4.7.

4.3.2 Factive Verbs

Utterances with factive verbs imply the relative clause, whether the verb is negated or not [HH88].

    Example 4-4: Presupposition by factive verb, given:
        fact utt(andrew, not property(john, regret, came(mary, party))).

    Answer is bel(andrew, bel(john, came(mary, party)))
    Theory is [pre_by_factive(andrew, john, came(mary, party), regret)]

    [10] Other beliefs are sanctioned as well, deriving from explanations of sincerity, sarcasm, etc., but they have been omitted in the interests of brevity and clarity. See Appendix B for unabridged sample sessions with the system.
    % factive presupposition schema:
    default pre_by_factive(Speaker, Subject, Presupposition, Factive) :
        (utt(Speaker, not property(Subject, Factive, Presupposition)) or
         imp(Speaker, not property(Subject, Factive, Presupposition))) and
        factive(Factive)
        => bel(Speaker, bel(Subject, Presupposition)).

    Figure 4.11: Factive Verb Presupposition Schema

The utterance by Andrew that John regrets that Mary came to the party entails that (Andrew believes that John believes that)[11] Mary came to the party. The negated form John does not regret that Mary came to the party presupposes the same thing. It is with the latter relationship that this implementation is concerned. Example 4-4 gives the presuppositions derived by application of the rule for factives, from the utterance by Andrew of John doesn't regret that Mary came to the party.

    [11] As noted elsewhere, this work follows Horton in that presuppositions are beliefs of agents.

4.4 Implicatures

I restrict myself here first of all to so-called clausal quantity implicatures, and second, to their appearance in disjunctive utterances. Other complex sentences carry similar implicatures (e.g., if-then sentences). (See definition 3 in chapter 2 of this thesis.) The intuition I am trying to capture is expressed by Gazdar [Gaz79, p61]:

    ... the utterance of such a complex sentence implicates that both the constituent sentence and its negation are compatible with what the speaker knows.

For instance, when a speaker utters a sentence of the form A is X or A is Y, she may mean any of A is X, A is Y, A is not X, A is not Y. These are the so-called clausal quantity implicatures, and Mercer assumes their a priori existence in his method for generating the presuppositions of complex sentential forms. It is my intention here to show that they can be accommodated within the theory presented in this thesis, and (equivalently) that they can be produced by the implementation. The intent in both Mercer's work and in this thesis is that those implicatures which are consistent (mutually and with existing context) will themselves carry presuppositions, and thus sanction additional inferences for the hearer about the mental state of the speaker. The 'survivability' of these potential implicatures is thus a central issue.

    Example 4-5: Candidate clausal quantity implicatures, given:
    Sentence 4-4: My cousin is a bachelor or a spinster.
    Sentence 4-5: My cousin is a bachelor.
    Sentence 4-6: My cousin is not a bachelor.
    Sentence 4-7: My cousin is a spinster.
    Sentence 4-8: My cousin is not a spinster.

Consider example 4-5. The utterance of sentence 4-4 produces the candidate implicatures of sentences 4-5 through 4-8. In this case, some of the candidates are mutually inconsistent, and thus should be placed in separate extensions of the default theory for further consideration.

Several obvious choices present themselves for the implementation of the implicature-generating rules, with interesting methodological repercussions. Of interest are the following:

1. a single disjunctive default
2. a single conjunctive default
3. separate default rules
Briefly, the first option suggests a default rule of the following form:

    utt(S, A ∨ B) ⇒ imp(S, A) ∨ imp(S, ¬A) ∨ imp(S, B) ∨ imp(S, ¬B)

With reference to example 4-5, this approach can be easily dismissed, for it is too weak; it allows the survival in a single extension of mutually inconsistent candidates, and will subsequently sanction the prediction of invalid presuppositions, resulting in a mental model of the speaker that is patently incorrect.

The second option requires a default rule of the following form:

    utt(S, A ∨ B) ⇒ imp(S, A) ∧ imp(S, ¬A) ∧ imp(S, B) ∧ imp(S, ¬B)

This approach is too strong; if any of the candidate implicatures is inconsistent (with context or with another candidate), then none of them will be predicted. This is because the conjunction requires that all of the candidates be true in some (single) extension of the default theory.

The last choice is a set of four default rules, one for each of the candidate implicatures in the disjunctive environment. This has the intended effect of letting only those candidates survive that are consistent with established context, while maintaining alternate possibilities. Figure 4.12 shows a possible implementation resulting from this third approach.

    % Clausal quantity implicature generating function, following Gazdar:
    default fc(1, S, U1, U2) : utt(S, or(U1, U2)) => imp(S, U1).
    default fc(2, S, U1, U2) : utt(S, or(U1, U2)) => imp(S, U2).
    default fc(3, S, U1, U2) : utt(S, or(U1, U2)) => imp(S, not U1).
    default fc(4, S, U1, U2) : utt(S, or(U1, U2)) => imp(S, not U2).

    Figure 4.12: Implicature-generating schema

The preceding discussion has been left at a deliberately intuitive level, as nothing would be gained from additional formality. The intent has been to give a justification of the approach taken to the implementation of the implicature-generating rules, and to provide a feeling for some of the default-logic programming issues that arise in practice.
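To see the schema of Figure 4.12 at work, consider a hypothetical run on sentence 4-4; the candidates follow from the four fc defaults, together with the rule of Appendix A.3 by which implicatures are believed by default:

    fact utt(andrew, or(property(cousin, bachelor),
                        property(cousin, spinster))).

    % Candidate implicatures, one per assumable default:
    %   fc(1, ...) : imp(andrew, property(cousin, bachelor))
    %   fc(2, ...) : imp(andrew, property(cousin, spinster))
    %   fc(3, ...) : imp(andrew, not property(cousin, bachelor))
    %   fc(4, ...) : imp(andrew, not property(cousin, spinster))
    % Given the world knowledge of Appendix A.5 (a bachelor is not a
    % spinster), mutually inconsistent candidates survive only in
    % separate extensions, while candidates inconsistent with
    % established context are cancelled outright.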
1 2 A depth-bound is a natural kind of restriction to place upon an inference mechanism, reflecting the finiteness of the agent concerned[Che86]. Figure 4.13 are some default rules that express likely conditions on rationality or introspection. They correspond to previous efforts by other researchers as related in earlier chapters of this thesis, and alleviate the problem of logical omniscience by relaxing the well-formedness criteria to one of default, rather than classical logical consistency. 4.6 O t h e r A s p e c t s Other kinds of information are also required by the theory, and must be represented in the implementation. In particular, world information, lexical information, etc., as discussed earlier, must be provided for. Refer to Appendix A.1.5 for details. 1 2 A version of Theorist which employs iterative deepening search strategies is under develop-ment. This version will be both sound and complete, and will exhibit all the desireable features of the depth-bounded implementation. CHAPTER 4. IMPLEMENTATION 68 4.7 C a n c e l l a t i o n a n d M u l t i p l e E x t e n s i o n s Mercer has provided an explication of how default logic might be employed to rep-resent and derive the presuppositions of natural language utterances, going as far as to show how this might be done for complex sentences such as disjunctions. His technique is to avoid multiple extension theories wherever possible, as there is no clear semantics for theories of this type, and only a hazy ontology. This is a gen-eral problem with default reasoning, and most practitioners have sought to avoid it, rather than solve it. Although Mercer urges that in the case of multiple extension theories, the actual presuppositions of a complex utterance are those which are in all extensions, he is unhappy with his definition because he cannot provide a clear interpretation of "membership in all extensions." I, on the other hand, have argued in this thesis that the extensions of a default theory can be regarded as mere technical compo-nents of a system, insofar as they serve to expedite the process of presupposition generation, and I have noted the correspondence of this claim to Gazdar's notion of pre-supposition. I am now prepared to go a little farther. When the Speaker utters Jack is a bachelor or a spinster, the (sceptical) criterion of membership-in-all-extensions permits only the derivation of the Speaker-belief that Jack is an adult. In particular, the system is unable on these grounds to decide the sex of Jack. But if the theory also includes a default rule to the effect that people with the name Jack are of the male sex, then there is what might be thought of as reinforcing evidence for the Speaker-belief that Jack is a male. This extra information can also be regarded as a new counter-argument against the Speaker-belief that Jack is a female. It is this intuition that I would like to promote as the basis for theory preference (see section 2.5.2). Chapter 5 Conclusion ...why may we not say, that all Automata (Engines that move themselves by springs and wheeles as doth a watch) have an arti-ficial! life? —Thomas Hobbes, Leviathan Let us likewise beware of believing the universe is a machine; it is certainly not constructed so as to perform some operation, we do it far too great honour with the word 'machine'. —Nietzsche, The Gay Science.1 5.1 C o n t r i b u t i o n This thesis has made contributions in several areas. 
Chapter 5

Conclusion

    ...why may we not say, that all Automata (Engines that move themselves by springs and wheeles as doth a watch) have an artificiall life?
        - Thomas Hobbes, Leviathan

    Let us likewise beware of believing the universe is a machine; it is certainly not constructed so as to perform some operation, we do it far too great honour with the word 'machine'.
        - Nietzsche, The Gay Science[1]

    [1] From the introduction to Thus Spoke Zarathustra, p17. Translated by R.J. Hollingdale, Penguin Books, 1969, New York, NY.

5.1 Contribution

This thesis has made contributions in several areas.

• A principled theory of communication has been developed, with particular attention to its application in the field of user-modelling.
• Mercer's [Mer87] theory of presupposition has been extended to include beliefs of interlocutors [HH88], and has additionally been implemented in the Theorist [Poo88] framework for default reasoning.
• The theory of communication has also been implemented in the same framework for default reasoning, allowing derivation of implicatures and presuppositions.
• The theory and implementation support the derivation of users' beliefs from their utterances, thereby demonstrating the application of default reasoning theory and practice to user-modelling.
• Issues of default-logic programming have been resolved, with resulting contributions to that body of knowledge.
• The theory and implementation allow representation of alternate interpretations of the discourse.

5.2 Problems

5.2.1 Multiple Extensions

The astute reader of this thesis will have noticed an apparent contradiction, which I have left unresolved to this point. The weakness to which I refer concerns the thorny issue of multiple extensions, and their differing interpretations within my system in the context of presupposition generation and of utterance-meaning. I have suggested a purely syntactic and ontologically agnostic view of multiple extensions with regard to their role in presupposition-generation (section 4.7), while with regard to the application of the principles of communication, I have suggested that a multiplicity of extensions has significant representational importance (sections 4.2, 2.2.1).

I have been admittedly opportunistic, and this issue will not disappear until an adequate basis for theory preference is established. I have suggested how this might be accomplished within an ontology that is purely syntactic (section 2.5.2), and hope to make some progress in this area. The syntactic account resolves the problem described, although the implementation presented in this thesis is not yet able to make use of these observations.

5.2.2 Goals, Plans and Desires

Though I am not yet ready to recant my earlier view that beliefs are enough to represent mental states of interlocutors, I now admit that there are immediate advantages to augmenting the representational language to include primitives for such things as goals and desires. A user-modeller, for instance, might profit from being able to reason about the user's goals.

5.3 Further Work

There are two obvious directions in which to take further work. As noted throughout this thesis, I have systematically avoided trying to account for the goals and desires of agents represented with this system. My reasons for this are quite practical. Such an effort would have taken me to the outer limits of pragmatics, where I would at best have been on shaky ground. I would then have had to take into account the Speaker's point of view as well, and this would not serve in the development of a User-modeller, for which a completely Hearer-based view is adequate.
Of course, none of these disclaimers prevent future expansion of this work to eventually encompass goals and desires of both Speaker and Hearer; the methodology and the reasoning framework employed were chosen to assure that such future efforts would remain consistent with what has already been presented here.

Thus, one avenue for future work is the development of principles of pragmatics, to be represented in a default reasoning framework. The search for these principles would be hampered by the lack of any underlying theory, and such an effort should probably be delayed until cognitive science has more to offer.

Another, and I think better, direction to take would be to probe further into the mechanism of the default reasoning framework itself. The current implementation is plagued by the well-known problem of multiple extensions, and any enhancement of the system to cover goals or desires would continue to suffer from these same problems. The still unresolved difficulties of preference in multiple-extension theories will continue to be a major impediment to the productive application of default reasoning.

Bibliography

[AK88] Douglas E. Appelt and Kurt Konolige. A nonmonotonic logic for reasoning about speech acts and belief revision. In 2nd International Workshop on Non-Monotonic Reasoning, pages 164-176, 1988.

[All87] James Allen. Natural Language Understanding. The Benjamin/Cummings Publishing Company, Inc., Menlo Park, CA, 1987.

[AP80] James Allen and Raymond Perrault. Analyzing intention in utterances. Artificial Intelligence, 15:143-178, 1980.

[Aus62] J. L. Austin. How to do Things with Words. Oxford University Press, 1962.

[Axe84] Robert Axelrod. The Evolution of Cooperation. Basic Books, Inc., 1984.

[Bar85] Jonathan Baron. Rationality and Intelligence. Cambridge University Press, 1985.

[Bat83] John Batali. Computational introspection. A.I. Memo 701, Artificial Intelligence Laboratory, M.I.T., 1983.

[BH79] Kent Bach and Robert M. Harnish. Linguistic Communication and Speech Acts. MIT Press, Cambridge, MA, 1979.

[BR89] Noel Burton-Roberts. The Limits to Debate: A revised theory of semantic presupposition. Cambridge University Press, 1989.

[Bre89] Gerhard Brewka. Preferred subtheories: An extended logical framework for logical reasoning. IJCAI, 1989.

[CCBD87] A. Csinger, H. da Costa, B. Forghani, and D.A. Lowther. Increasing CAD throughput with a programmable user interface. In Official Proceedings of the 3rd Int'l IMS '87, SATECH '87, Part I, Long Beach, CA, 1987.

[CdCF87] A. Csinger, H. da Costa, and B. Forghani. A general-purpose programmable command decoder. In IEEE Proceedings, Conference Compint, pages 139-41, November 1987.

[Cha76] E. Charniak. Inference and knowledge I. In E. Charniak and Y. Wilks, editors, Computational Semantics. North-Holland, 1976.

[Che86] Christopher Cherniak. Minimal Rationality. MIT Press: A Bradford Book, Cambridge, MA, 1986.

[Chu86] Paul M. Churchland. Scientific Realism and the Plasticity of Mind. Cambridge, 1986.

[Chu88] Paul M. Churchland. Matter and Consciousness. Cambridge: The M.I.T. Press (A Bradford Book), 1988.

[CL] Philip R. Cohen and Hector J. Levesque. Speech acts and rationality. CSLI.

[CP79] Philip R. Cohen and Raymond Perrault. Elements of a plan-based theory of speech-acts. Cognitive Science, 3:177-212, 1979.

[CP89] Andrew Csinger and David Poole. From utterance to belief: Default reasoning in user-modelling. In Proceedings of the Conference for Knowledge Based Computing Systems, KBCS-89, pages 408-419, Bombay, India, December 1989.
[Daw87] Richard Dawkins. The Blind Watchmaker. W.W. Norton, New York, NY, 1987.

[Den81] Daniel C. Dennett. Intentional systems. In John Haugland, editor, Mind Design. The M.I.T. Press, Cambridge, 1981.

[Den85] Daniel C. Dennett. Elbow Room. The M.I.T. Press (A Bradford Book), Cambridge, 1985.

[Den87] Daniel C. Dennett. The Intentional Stance. The MIT Press, Cambridge, 1987.

[Eth83] David W. Etherington. Formalizing non-monotonic reasoning systems. Technical Report 1, University of British Columbia, Vancouver, Canada, V6T 1W5, 1983.

[Fre92] Gottlob Frege. On sense and reference. In P. Geach and M. Black, editors, Translations from the Philosophical Writings of Gottlob Frege, pages 55-78. Blackwell, 1892.

[Gaz79] Gerald Gazdar. Pragmatics: Implicature, Presupposition and Logical Form. Academic Press, 1979.

[Gef89] Hector Geffner. Default reasoning, minimality and coherence. In KR89, page 137, Toronto, Canada, May 1989.

[Gri75] H.P. Grice. Logic and conversation. In P. Cole and J.L. Morgan, editors, Syntax and Semantics: Speech Acts, vol 3, pages 47-58. Academic Press, New York, 1975.

[GS80] Jeroen Groenendijk and Martin Stokhof. A pragmatic analysis of specificity. In Frank Heny, editor, Ambiguities in Intensional Contexts, pages 153-190. D. Reidel Company, Dordrecht, Holland, 1980.

[Had89] Robert F. Hadley. The many uses of 'belief' in AI. CSS-IS TR 03, Centre for Systems Science, Simon Fraser University, Burnaby, B.C., V5A 1S6, 1989.

[Her75] Hans G. Herzberger. Dimensions of truth. In Contemporary Research in Philosophical Logic and Linguistic Semantics. unknown, 1975.

[HH88] Diane Horton and Graeme Hirst. Presuppositions as beliefs. In COLING, 1988.

[Hor87] Diane Horton. Incorporating agents' beliefs in a model of presupposition. Technical Report 201, University of Toronto, 1987.

[Kap82] S.J. Kaplan. Cooperative responses from a portable natural language query system. Artificial Intelligence, 19(2):165-88, 1982.

[Kar74] L. Karttunen. Presupposition and linguistic context. Theoretical Linguistics, 1:181-194, 1974.

[KF88] Robert Kass and Tim Finin. Modelling the user in natural language systems. Computational Linguistics, 14(3):5, September 1988.

[Kon84] Kurt Konolige. Belief and incompleteness. Technical Report 319, SRI International, Menlo Park, CA, 1984.

[Kon85] Kurt Konolige. A computational theory of belief introspection. In IJCAI85, pages 502-508, 1985.

[Kor85] Robert R. Korfhage. Intelligent information retrieval: Issues in user modelling. Technical Report 85-CSE-9, Dept. of Computer Science and Engineering, Southern Methodist University, Dallas, Texas, May 1985.

[KP79] L. Karttunen and S. Peters. Conventional implicature. In C. K. Oh and D. A. Dineen, editors, Syntax and Semantics. Academic Press, 1979.

[Lev84] Hector J. Levesque. A logic of implicit and explicit belief. AAAI, pages 198-202, 1984.

[Lev88] Robert I. Levine. A Comprehensive Guide to Expert Systems: Turbo-Pascal Edition. McGraw-Hill, 1988.

[Lew69] D. Lewis. Convention. Cambridge University Press, 1969.

[Lyc84] William G. Lycan. Logical Form in Natural Language. MIT Press (A Bradford Book), Cambridge, MA, 1984.

[Mar81] David Marr. Artificial intelligence, a personal view. In John Haugland, editor, Mind Design. The M.I.T. Press, Cambridge, 1981.

[Mer87] Robert Mercer. A default logic approach to the derivation of natural language presuppositions. Technical Report 35, University of British Columbia, October 1987.
[Mor87] E. Morgado. Reasoning and meta-reasoning. In S. Shapiro, editor, Encyclopedia of Artificial Intelligence, page 601. J. Wiley and Sons, New York, 1987.

[MR84] Robert Mercer and Richard Rosenberg. Generating corrective answers by computing presuppositions of answers, not of questions. In Proceedings of the 1984 Conference, pages 16-19, University of Western Ontario, London, Ontario, May 1984. Canadian Society for Computational Studies of Intelligence.

[OD79] C. K. Oh and D. A. Dineen. Syntax and Semantics. Academic Press, 1979.

[Per87] C. Raymond Perrault. An application of default logic to speech act theory. CSLI 90, Center for the Study of Language and Information, Stanford, CA, 1987.

[Pfa85] G.E. Pfaff, editor. User Interface Management Systems. Springer-Verlag, Berlin, 1985.

[PG86] C. Raymond Perrault and Barbara J. Grosz. Natural language interfaces. Annual Review of Computer Science, 1:47-82, 1986.

[PGA87] David Poole, Randy Goebel, and Romas Aleliunas. A logical framework for default reasoning. In The Knowledge Frontier: Essays in the Representation of Knowledge, pages 331-352. Springer Verlag, New York, NY, 1987.

[Poo85] David Poole. On the comparison of theories: Preferring the most specific explanation. In IJCAI, volume I, pages 144-147, Los Angeles, CA, August 1985.

[Poo87] David Poole. A logical framework for default reasoning. Artificial Intelligence, 36(1):27-47, 1987.

[Poo88] David Poole. Representing knowledge for logic-based diagnosis. In International Conference on Fifth Generation Computing Systems, pages 1282-1290, Tokyo, Japan, November 1988.

[Poo89a] David Poole. Explanation and prediction: an architecture for default and abductive reasoning. Computational Intelligence, 5(2):97-110, 1989.

[Poo89b] David Poole. Normality and faults in logic-based diagnosis. In IJCAI, pages 1304-1310, Detroit, MI, August 1989.

[Poo89c] David Poole. What the lottery paradox tells us about default reasoning. In First International Conference on the Principles of Knowledge Representation and Reasoning (KR89), Toronto, Canada, May 1989.

[Rei80] R. Reiter. A logic for default reasoning. Artificial Intelligence, 13(1,2):81-132, 1980.

[Sea84] John Searle. Minds, Brains and Science. Harvard University Press, Cambridge, 1984.

[SM88] Yoav Shoham and Yoram Moses. Belief as defeasible knowledge. STAN-CS 1237, Department of Computer Science, Stanford University, Stanford, CA 94305, 1988.

[Sta80] Robert Stalnaker. Review of Gazdar's Pragmatics. Language, 56(4), 1980.

[Sti83] Stephen Stich. From Folk Psychology to Cognitive Science: The Case Against Belief. MIT Press (A Bradford Book), Cambridge, MA, 1983.

[Str50] P. F. Strawson. On referring. Mind, 59:320-344, 1950.

[SV85] John R. Searle and Daniel Vanderveken. Foundations of Illocutionary Logic. Cambridge University Press, 1985.

[vF75] Bas C. van Frassen. Incomplete assertion and Belnap connectives. In Contemporary Research in Philosophical Logic and Linguistic Semantics. unknown, 1975.

[Win71] Terry Winograd. Procedures as a representation for data in a computer program for understanding natural language. AI-TR 17, M.I.T. AI Laboratory, February 1971.

Appendix A

Theorist Listings

A.1 Maxims

% This version sets out to construct the agents' knowledge bases from
% an understanding of the Gricean Maxims of Cooperation, and from the
% utterances of the agents.
%% GRICEAN Quality Analog
% sincerity does not involve TRUE beliefs of the hearer:
default sincere(S, U).
fact sincere(S, U) and utt(S, U) => bel(S, U).

%% GRICEAN Quantity Analog
% the speaker doesn't necessarily believe what he says here; this is
% subsumed in the sincerity rule:
default quantity(S, U).
fact quantity(S, U) and utt(S, U) => bel(S, not bel(hearer, U)).
% don't say what you know the hearer knows

% Sarcasm predication
default sarcastic(S, U).
fact sarcastic(S, U) and utt(S, U) =>
    bel(S, not bel(hearer, U)) and
    bel(S, bel(hearer, bel(S, not bel(hearer, U)))) and
    not bel(S, U).

A.2 Presupposition

% PRESUPPOSITIONAL ANALYSES:
% default rules to enable presuppositions under negation:

% noncriterial presupposition schema:
default pre_by_nonc(S, Object, Propty, Presupposition) :
    (utt(S, property(Object, not Propty)) or
     imp(S, property(Object, not Propty))) and
    nonc(Propty, Presupposition)
    => bel(S, property(Object, Presupposition)).

% factive presupposition schema:
% what we really want in the following is the narrow-scope
% negation of the factive verb, but we adopt the wide-scope
% representation for convenience.
default pre_by_factive(Speaker, Subject, Presupposition, Factive) :
    (utt(Speaker, not property(Subject, Factive, Presupposition)) or
     imp(Speaker, not property(Subject, Factive, Presupposition))) and
    factive(Factive)
    => bel(Speaker, bel(Subject, Presupposition)).

A.3 Implicature

% follows from Quantity:
% 'the utterance of such a complex sentence implicates that
% both the constituent sentence and its negation are compatible
% with what the speaker knows.' [Gaz79, p61]
default fc(1, S, U1, U2) : utt(S, or(U1, U2)) => imp(S, U1).
default fc(2, S, U1, U2) : utt(S, or(U1, U2)) => imp(S, U2).
default fc(3, S, U1, U2) : utt(S, or(U1, U2)) => imp(S, not U1).
default fc(4, S, U1, U2) : utt(S, or(U1, U2)) => imp(S, not U2).

% implicatures are believed by default:
default sensible(S, U) : imp(S, U) => bel(S, U).

A.4 Rationality

% We are informed of knowledge that is mutually known in
% the community:
default aware(Agent, A) : mutual(A) => bel(Agent, A).

% If we believe the antecedent of a rule, we believe its consequent:
default implicit(Agent, A, C) :
    mutual(=>(A, C)) and bel(Agent, A) => bel(Agent, C).

% If we believe a list, we believe the items:
default conjunct(Agent, List, X) :
    bel(Agent, List) and member(X, List) => bel(Agent, X).

% Positive Introspection (patently left-recursive):
default pos_int(Agent, B) :
    bel(Agent, B) => bel(Agent, bel(Agent, B)).

% Motherhood...
fact member(X, [X | Tail]).
fact member(X, Tail) => member(X, [H | Tail]).

fact mutual(=>(A, not property(O, B))) and bel(S, A)
    => not bel(S, property(O, B)).

% Re-write rules:
fact bel(S, property(O, not B)) => bel(S, not property(O, B)).
fact bel(S, not property(O, B)) => bel(S, property(O, not B)).
fact imp(S, not property(O, B)) => imp(S, property(O, not B)).

A.5 Miscellaneous

A.5.1 World Information

% WORLD INFORMATION:
% definition of bachelor:
fact mutual(=>(property(X, bachelor),
               [property(X, male),
                property(X, adult),
                property(X, not married)])).

% definition of spinster:
fact mutual(=>(property(X, spinster),
               [property(X, female),
                property(X, adult),
                property(X, not married)])).

fact mutual(=>(property(Anyone, female), not property(Anyone, male))).
fact mutual(=>(property(Anyone, bachelor), not property(Anyone, spinster))).

A.5.2 Lexical Information

% The non-criterial facts:
fact nonc(bachelor, male).
fact nonc(bachelor, adult).
fact nonc(spinster, female).
fact nonc(spinster, adult).

% The factive facts:
fact factive(regret).

Papers and Publications

A. Csinger. Implementing a Theory of Communication. Master's Thesis, Dept. of Computer Science, University of British Columbia, in preparation for graduation in spring 1990.

A. Csinger, D. Poole. "From Utterance to Belief via Presupposition", Knowledge Based Computer Systems Proceedings, Dec. 11, 1989, pp408-419, Bombay, India. Also to appear in Lecture Notes in Computer Science, Springer-Verlag.

A. Csinger, D. Poole. "Hypothetical Reasoning and Discourse Structure", submitted to AAAI-90, Boston MA, July 1990.

A. Csinger, M. Horsch. "A Syntactic Basis for Theory Preference", in preparation.

A. Csinger, B. Forghani, D. Seward. "When Should You Use a CAD Consultant", Machine Design, Aug. 25, 1988, pp91-94. Reprinted in CAD/CAM World, No. 3, Sep. 13, 1989 as "Hvornår skal du bruge en CAD-konsulent?", page 8 (translated by Niels Moos).

A. Csinger, A. Shewchuk, B. Forghani. "Three-dimensional Finite Element Mesh Storage", IEEE Proceedings Compint '87, Nov. 9-12, 1987, Montreal, Canada, pp146-148.

A. Csinger, H. da Costa, B. Forghani, D.A. Lowther. "A Programmable User-Interface for CAD Systems", IEEE Proceedings Compint '87, Nov. 9-12, 1987, Montreal, Canada, pp139-141.

A. Csinger, D. Seward. "3D Finite Element Mesh Generation and Post-processing", Intermag '87 Poster Session, April 14-17, 1987, Tokyo, Japan.

A. Csinger, H. da Costa, B. Forghani, D.A. Lowther. "Increasing CAD Throughput with a Programmable User Interface", In Proceedings of the Third International IMS-87 Conference, SATECH '87, Sep. 14, 1987, Long Beach, CA.

MagNet3D User's Manuals, Infolytica Corp.
