UBC Theses and Dissertations (Open Collections)

A schema & constraint-based representation to understanding natural language
Kuttner, Eliza Wing-Mun, 1986

Full Text
A SCHEMA & CONSTRAINT-BASED REPRESENTATION TO UNDERSTANDING NATURAL LANGUAGE

By ELIZA WING-MUN KUTTNER
B.Sc., University of British Columbia, 1983

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (DEPARTMENT OF COMPUTER SCIENCE)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
November 1986
© Eliza Kuttner, 1986

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of
The University of British Columbia
1956 Main Mall
Vancouver, Canada
V6T 1Y3

Abstract

This thesis attempts to represent the syntax and semantics of English sentences using a schema and constraint-based approach. In this approach, syntactic and semantic knowledge represented by schemata are processed in parallel with the utilization of network consistency techniques and an augmented version of Earley's context-free parsing algorithm. A sentence's syntax and semantics are disambiguated incrementally as the interpretation proceeds left to right, word by word. Each word and recognized grammatical constituent provides additional information that helps to guide the interpretation process.
It is desirable to attempt to apply network consistency techniques and schema knowledge representations to understanding natural language since the former have been proven to be quite efficient and the latter provide modularity in representing knowledge. In addition, this approach is appealing because it can cope with ambiguities in an efficient manner. Multiple interpretations are retained if ambiguity exists as indicated by the words processed so far. However, incorrect interpretations are eliminated as soon as their inappropriateness is discovered. Thus, backtracking search, which is known to be inefficient, is avoided.

Contents

Abstract ii
Contents iii
List of Figures v
List of Tables vi
Acknowledgement vii
1 Introduction 1
2 Background 6
2.1 Early history 6
2.2 Knowledge-based systems 8
2.3 Network consistency techniques 16
2.4 Earley's context-free parsing algorithm 20
3 A Schema & Constraint-Based Approach to NLU 25
3.1 Overview of the system 25
3.2 Knowledge base (KB) 29
3.2.1 Syntactic schemata 30
3.2.2 Semantic schemata 34
3.3 System control 42
3.3.1 Predictor 44
3.3.2 Scanner 47
3.3.3 Completer 48
3.4 Constraint satisfaction 51
3.4.1 Syntax handler 51
3.4.2 Semantics handler 54
3.5 Morpher 62
3.6 Garbage collector 63
4 Discussion of examples
5 Conclusions and Future Directions
Bibliography
Appendix

List of Figures

3.1 Overview of the system 26
3.2 BNF grammar for a syntactic schema 31
3.3 BNF grammar for a semantic schema 35
3.4 Specialization hierarchy for the noun 'music' 39
3.5 Sample semantic schema for the verb 'compose' 41
3.6 Sample semantic schema for the preposition 'in' 41
3.7 Sample derived syntactic schema 45
3.8 Sample derived semantic schema 55
3.9 Sample derived semantic schema for a verb 56
3.10 Splitting a hypothetical NCG: description 59
4.1 Representation for 'famous keyboard music' 67
4.2 Representation for 'Mozart's father's birthday' 69
4.3 Representation for 'which three sonatas were written by Beethoven' 71
4.4 A representation for 'who composed the Messiah in Dublin' 73
4.5 A second representation for 'who composed the Messiah in Dublin' 74

List of Tables

3.1 Syntactic categories' abbreviations 30
3.2 Properties of a semantic schema 36
3.3 Sample schema relation for the verb 'compose' 40
3.4 Table of syntactic constraints 53
3.5 Table of semantic predicates 54

Acknowledgement

I would like to thank my supervisor, Bill Havens, for research ideas, help, support and patience. My thanks also to the Natural Sciences and Engineering Research Council of Canada and the Department of Computer Science at U.B.C. for financial support. I would also like to thank all my friends at the Department for making my years of graduate studies an enjoyable as well as educational experience. Special thanks to Michael, Alex, Lisa and my second reader, Richard Rosenberg, for their help in correcting this thesis. Thanks to my parents, my brother and my husband for their love and encouragement without which this thesis could never have reached completion. Lastly, I would like to thank Babaji and Satish for teaching and helping me with everything.

Chapter 1

Introduction

Research in knowledge representation (KR) for understanding natural language (NL) has gained prominence over the years. In order for a natural language understanding (NLU) system to attain the level of competence of humans, it must have the same kind of knowledge of the world that a human has, and this large store of "world-knowledge" must be represented and manipulated in an efficient manner. Many KRs have been utilized for the task of understanding NL.
Although there are advantages in each representation, there are disadvantages as well. Thus, none of the representations is perfect, and the search continues for a KR that encompasses the advantages of other approaches and yet can avoid the disadvantages. This thesis conducts one such search attempt by introducing a schema and constraint-based KR approach for understanding NL. It combines the advantages of logical and procedural representations while overcoming some of their disadvantages.

Logic is precise and seems to express facts in a way that corresponds to our rational understanding of some domain [Hayes, 1977] [McCarthy, 1977], and the use of first-order logic for KR in NLU is quite popular. However, NLU systems using logic for KR are inefficient. This is due to the fact that representation and processing are separated in the logic-based approach [Barr & Feigenbaum, 1981]. A logic representation may be appealing but the processing method that determines how the stored facts are manipulated can be inefficient. For example, the use of the resolution method of inference in QA3 [Green, 1969] causes combinatorial explosion in the number of ways facts can be combined to make inferences as the number of facts in the database increases.

A procedural approach to KR offers more efficiency because the embedding of knowledge into procedures allows control over the deduction process. However, it has disadvantages as well. In a logic-based system, addition of logical facts or assertions is straightforward, but in a procedural-based system, addition of facts can be complicated since it may cause changes in the heuristic knowledge of different procedures due to the unavoidable interaction between various facts and the nature of heuristics. Thus, modularity of knowledge in the database is lost in this approach. Winograd's SHRDLU [Winograd, 1972] is a system that adopted this KR.
Winograd pointed out in [Winograd, 1980] that the representation of the speaker/hearer internal structure in SHRDLU was ad hoc. As a result, the preciseness of logical representation is lost.

In this thesis, we describe a schema and constraint-based approach to NLU that attempts to incorporate the efficiency of a procedural approach, and yet can attain a reasonable level of modularity of knowledge and preciseness of representation that a logical approach offers. Efficiency is hard to achieve in most NLU systems with large databases because search is involved. Our approach takes an alternative view of language as providing semantic constraints on descriptions of the linguistic world instead of driving a search for such a description. This means that we are trying to view NLU as a constraint satisfaction problem that avoids search whenever possible.

A class of network consistency algorithms has been developed by Waltz, Montanari, Mackworth and Freuder respectively to solve constraint satisfaction problems efficiently. The complexity of these algorithms has been studied [Mackworth & Freuder, 1984], and it has been shown that one type of consistency, arc consistency, is achievable in quadratic time, and path consistency in cubic time in the worst case. These algorithms have been applied to machine perception tasks successfully [Waltz, 1972] [Mackworth, 1977b] [Havens & Mackworth, 1983] [Mulder, 1985]; we apply them to NLU in this thesis.

Computer vision and NLU can both be considered as recognition tasks which need to resolve uncertainties or ambiguities in the input. In computer vision, segmentation of the gray scale of a digitized picture into edges and regions is required before interpretation is performed, whereas in NLU, it is the segmentation of the input sentence into words and syntactic categories. In both cases, there is a need for feedback between segmentation and interpretation.
Also, knowledge of either a visual or natural language domain can be represented by specialization hierarchies [Brachman, 1982] which are knowledge structures that categorize classes of objects in the world that is being represented. These similarities between vision and NLU tasks as well as the successful application of network consistency techniques on vision tasks make us believe that it is appropriate to use a constraint-based approach for NLU.

Logic is the best tool for achieving preciseness and modularity of knowledge; however, efficiency is hard to attain at the same time. We believe that a schema and constraint-based approach offers efficiency as well as provides a reasonable degree of modularity and preciseness of representation. Several representations fall into the category of schema knowledge representations. They include frames [Minsky, 1975], scripts and plans [Schank, 1975] [Schank & Abelson, 1977]. Also, different forms of memory schemata [Bobrow & Norman, 1975] are classified as schema knowledge representations. [Rumelhart & Ortony, 1976] also discusses the representation of knowledge in memory in terms of schemata. In each case, a schema represents a self-contained collection of related knowledge. An expansion of the knowledge base (KB) is equivalent to the addition of schemata in this approach, and since the knowledge is well-organized into schemata with the interactions between them well-defined, additions to the KB are quite straightforward.

The combination of consistency techniques and schema knowledge representations as a representational formalism has been described as schema labelling [Havens, 1985]. The methodology has been used for interpreting hand-printed Chinese characters [Bult, 1986], and for recognizing VLSI circuit designs from their mask layouts [Albn & Havens, 1985]. This thesis adopts this approach to represent the syntax and semantics of English sentences.
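The general shape of such a schema network can be sketched in code. The record below is only an invented illustration (the thesis specifies its actual schema formats with BNF grammars in Chapter 3; the names and fields here are assumptions): a schema bundles the knowledge about one concept, and schema instances are tied together by labelled arcs to form a semantic network.

```python
# Invented minimal sketch of a schema instance: a self-contained bundle of
# related knowledge, linked to other instances by labelled arcs.
from dataclasses import dataclass, field

@dataclass
class Schema:
    name: str                   # the word or concept the schema models
    category: str               # e.g. "verb" or "noun"
    relations: dict = field(default_factory=dict)  # arc label -> schema

compose = Schema("compose", "verb")
mozart = Schema("Mozart", "noun")
compose.relations["agent"] = mozart  # labelled arc: compose --agent--> Mozart
print(compose.relations["agent"].name)  # Mozart
```

Adding knowledge then amounts to adding schema records, which is what makes the approach modular.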
The system implemented takes a simple sentence as input and produces a representation of the sentence's syntax and semantics as output in the form of a parse tree (PT) and a network consistency graph (NCG). The PT is the syntactic structure assigned to the input sentence after it has been analyzed according to the given phrase structure grammar. It represents the syntactic aspects of the input sentence. The semantics are captured in the NCG, which is a semantic network of schemata linked by labelled arcs which specify the relationships between the semantic entities of the sentence.

The PT and NCG are created in parallel as the system analyses the input sentence one word at a time from left to right. Each word may provide additional syntactic or semantic constraints that help the system to disambiguate the syntax and semantics of the sentence. Syntactic constraints are applied in the PT whereas semantic constraints are applied in the NCG. At this point, network consistency techniques are utilized to maintain consistency in the NCG after the semantic constraints are applied. Schema knowledge representations are used to represent all the knowledge needed by the system in the KB, which is a collection of static schemata of syntactic and semantic information. If a sentence is ambiguous, then multiple representations are produced. An augmented version of Earley's context-free parsing algorithm is used to provide control for the system [Havens, 1983].

Chapter 3 of this thesis presents this schema and constraint-based approach to NLU through a detailed description of our implementation. Chapter 2 provides a background study of the field of NL from the early history of machine translation in the 1940s to the knowledge-based systems of the 1970s. Also, descriptions of some network consistency techniques and Earley's context-free parsing algorithm are given.
Chapter 4 describes some sample sentence representations produced by the system. Chapter 5 discusses the merits and drawbacks of this approach, and the possibility of expansion of the system in the future.

Chapter 2

Background

This chapter gives a brief history of natural language research showing how the research trend has progressed from machine translation and pattern matching without the use of knowledge in the 1940s to knowledge-based systems in the 1970s. A collection of KR techniques for NLU is presented and the systems that adopt these techniques are mentioned and discussed. Finally, some background on network consistency techniques and Earley's context-free parsing algorithm is given.

2.1 Early history

Soon after computers came into existence in the 1940s, researchers were interested in applying them to the study of language. Initial attempts involved only surface-level processing of text such as the compiling of word indexes and concordances. During the 1950s, the main computer application for natural language was in machine translation [Weaver, 1949] where the computer tries to assume the role of a human translator in the translation of texts from one language to another. The basic method involves dictionary-lookup, word substitution, and rearrangement of the resulting string of words to fit the target language's word order. However, the focus on syntactic information alone in such attempts produced poor results. The problems that arose include the choosing of appropriate word equivalences when a word has several translations depending upon the context, and the rearrangement of words in the target language to produce a truly equivalent sentence. It then became apparent that high-quality translation can only be achieved if the system can "understand" the input text before it reconstructs the string of words in the target language.
In addition, such language understanding involves much "world-knowledge" which is applied implicitly when humans translate from one language to another [Bar-Hillel, 1960]. Thus, the focus of AI natural language research shifted to that of natural language understanding [Winograd, 1980]. However, the early NLU systems that were developed in the 1960s still did not deal with the issue of knowledge representation; these systems used ad hoc data structures to store facts about a restricted domain. In addition, the syntax of language was not dealt with in any sophisticated way, and semantic knowledge was only implicit in the patterns and heuristics used for parsing. A representative set of these early systems includes SAD-SAM [Lindsay, 1963], BASEBALL [Green et al., 1963], STUDENT [Bobrow, 1968], ELIZA [Weizenbaum, 1966], and SIR [Raphael, 1968].

The next and current phase of NL research that began in the 1970s is closely connected with research on the representation of knowledge. The NLU programs that belong to this category of research are called knowledge-based systems in [Barr & Feigenbaum, 1981], since these systems use a fair amount of knowledge about the domain of discourse to help understand sentences and the knowledge is stored within the system using some knowledge representation method.

2.2 Knowledge-based systems

Instead of using ad hoc data structures for storing facts about the domain of discussion like the early NLU systems did, all knowledge-based systems utilize some formal KR method for storing, retrieving and manipulating knowledge. The principal KR methods that have been utilized in NLU systems include logical forms, procedures, semantic networks and the various forms of schemata such as frames, scripts and plans. Logic was one of the first KR methods used in AI.
One of the early NLU systems that used logic to represent knowledge is the general-purpose question-answering system QA3 [Green, 1969]. QA3 could solve simple problems in a few different domains such as chemistry, robot movement and automatic programming. It used the resolution method of inference to perform deductions and was quite successful in solving simple problems with a certain degree of generality. However, as the number of facts increases in a database, the system's performance decreases because the resolution method causes combinatorial explosion in the number of ways facts are combined to make inferences.

More recent attempts in using logic for representing knowledge in NLU systems include the works of NL researchers such as Dahl, Pereira, Warren, and McCord. Dahl has written a NL query system to be consulted in Spanish/French [Dahl, 1981] where the grammar and the facts of the database were written in PROLOG, a programming language based on first-order logic. This system was later adapted to Portuguese consultation by H. Coelho and F. Pereira, and subsequently to English consultation by D. Warren and F. Pereira. With the development of the PROLOG programming language, logic was used in these cases as the underlying representational formalism as well as a programming language for the translation of NL input to a NLU system.

Logic programming, since the development of the PROLOG programming language, has become quite popular. In addition, several grammar formalisms have been developed and incorporated into PROLOG, including metamorphosis grammars [Colmerauer, 1978], extraposition grammars [Pereira, 1981], and a special case of metamorphosis grammars which Pereira and Warren have given the name definite clause grammars [Pereira & Warren, 1980].
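The appeal of definite clause grammars is that grammar rules are directly executable: each rule threads the unconsumed input through its body. As a rough, language-neutral sketch of that idea (the toy grammar and lexicon below are invented, not taken from any of the systems discussed), each nonterminal is a function from an input word list to the list of possible remainders:

```python
# Each nonterminal below mirrors how a DCG rule such as
#   sentence --> noun_phrase, verb_phrase.
# threads the unconsumed input through its body. Toy lexicon invented here.

def word(category):
    lexicon = {"det": {"the", "a"}, "noun": {"dog", "cat"}, "verb": {"sees"}}
    def parse(words):
        # Consume one word if it belongs to the lexical category.
        return [words[1:]] if words and words[0] in lexicon[category] else []
    return parse

def seq(*parsers):
    def parse(words):
        # Thread every partial remainder through the next parser in turn.
        rests = [words]
        for p in parsers:
            rests = [rest2 for rest in rests for rest2 in p(rest)]
        return rests
    return parse

noun_phrase = seq(word("det"), word("noun"))
verb_phrase = seq(word("verb"), noun_phrase)
sentence = seq(noun_phrase, verb_phrase)

# A sentence is recognized when some derivation consumes all the input.
print([] in sentence(["the", "dog", "sees", "a", "cat"]))  # True
```

In Prolog the threading of remainders is done for free by unification over difference lists; the explicit list of remainders here plays that role.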
Extraposition grammars were used in a NL question-answering system called Chat80 by Pereira and Warren which was also written in PROLOG and was designed to be efficient as well as easily adaptable to a variety of applications [Warren & Pereira, 1982]. NL analysis is performed in three separate phases in this system. The translation phase translates an English sentence into logic. Next, the planning phase augments the logical form with extra control information which makes it an efficient piece of PROLOG program. The last phase is execution where the optimised PROLOG code is executed to produce an answer.

Dahl and McCord have subsequently developed, both separately and jointly, NLU systems that solve NL problems such as coordination and quantifiers [Dahl & McCord, 1983], and use slots and modifiers for syntactic and semantic interpretation [McCord, 1982]. These systems were also written in PROLOG with the use of logic grammars.

The procedural representation of knowledge is another KR method that has been employed in NLU systems. The procedural approach stresses the importance of "knowing how" to use knowledge and achieves this by representing the knowledge of the world as procedures which when executed know how to carry out specific actions according to the heuristic knowledge that is embedded in the procedures. This type of domain-specific information allows more directed deduction processes and thus provides the efficiency that declarative representations lack. However, the procedural approach has its disadvantages as well. The knowledge that is captured in procedures cannot be stated explicitly as it can be in declarative representations, thus making it less accessible, and consequently making the task of adding and changing procedures quite difficult as minor changes in one procedure may have far-reaching effects on other procedures.
The merits and drawbacks of the two methods are discussed in [Winograd, 1975] in response to the declarative/procedural controversy of the 1970s.

Woods' question-answering system about airline flight schedules [Woods, 1968] is an example of an early procedural system. Input questions are translated into functions which when executed over the system's database produce the correct answer. The LUNAR program [Woods et al., 1972] follows from this research and takes a similar approach. It is a NL information-retrieval system that aids geologists with their task in evaluating data on moon rock and soil composition obtained from the Apollo-11 mission. The system processes English queries in three steps. The syntactic analysis step takes an English query and produces a derivation tree with the help of an augmented transition network parser [Woods, 1970]. The derivation tree is then transformed to an expression in a formal query language that captures the semantics of the original English query in the semantic interpretation step. Finally, the query language expression is executed over the database to produce a response.

The other representative system of the procedural approach is the SHRDLU program [Winograd, 1972]. The program maintains an interactive dialogue with the user regarding its simulation of a robot arm that manipulates a set of toy blocks on a table. Syntactic, semantic and reasoning knowledge are embodied in procedures which are pieces of executable code. The program was written in LISP and MICRO-PLANNER. The latter is a version of the PLANNER language [Hewitt, 1972] which represents procedural data in the form of "theorems". PLANNER's paradigm is theorem proving, and it satisfies a goal by looking for the appropriate "theorem" to prove it. However, it allows the efficiency of the process to be increased by providing the user the flexibility to specify his own heuristics in his procedures.
The semantic network formalism for KR has its origin in Quillian's development of a psychological model of human associative memory [Quillian, 1968] and his Teachable Language Comprehender (TLC) program [Quillian, 1969] which simulates this memory model. Many systems with a network-based representation have been written since then. However, all that some of these systems have in common is simply the superficial common notation of having nodes and arcs representing some sort of a taxonomic hierarchy. [Woods, 1975] stresses the importance of understanding what the notation means, and [Brachman, 1983] discusses the many different meanings that the IS-A link of a semantic network has adopted over the years and points out those meanings which are important in expressing knowledge and which give a semantic network its expressive power.

An early attempt at using the semantic network formalism that follows directly from Quillian's TLC work is Carbonell's work on a computer-aided instruction program called SCHOLAR [Carbonell, 1970]. This tutoring program can answer questions posed by students about South American geography; the geographical information is stored in a semantic network. Around the same time, Fillmore's work on linguistic case structure [Fillmore, 1968] was influencing the development of semantic networks. The cases of Fillmore were utilized as the underlying relationships that label the arcs in the networks in [Simmons & Slocum, 1972], [Simmons, 1973] and [Rumelhart & Norman, 1973]. Although these works all used the case frame approach in choosing arc types, Rumelhart and Norman's system was more psychologically oriented and placed more emphasis on the simulation of long-term memory and human cognitive processes, whereas Simmons' system put more emphasis on the generation of answers to questions using the semantic network.
[Shapiro, 1982] is a more recent attempt in the research in sentence generation from semantic networks. A couple of speech understanding systems, [Woods et al., 1976] and [Walker, 1978], used semantic networks for representing knowledge. In connection with Walker's system, Hendrix has developed the idea of "network partitioning" [Hendrix, 1975] [Hendrix, 1979] in which the nodes and arcs are partitioned into "net spaces" which allow for the representation of logical connectives, the delimiting of the scopes of quantified variables, the encoding of alternative and hypothetical worlds as well as the focusing of attention at particular levels of detail. Grosz has adopted the idea of network partitioning in her work on the representation and use of focus in dialogue understanding [Grosz, 1977].

Woods's 1975 paper, "What's in a link", has raised some important questions regarding the logical adequacy of the semantic network representation and has asked us to consider for the first time the meaning of the semantic network notation. In response to these issues raised by Woods, many researchers have attempted to investigate the underlying logical meaning of the network formalism and to deal with the expressive inadequacy of the notation. This includes the following works: [Schubert, 1976], [Hayes, 1977], [Schubert et al., 1979], [Levesque & Mylopoulos, 1979] and [Shapiro, 1979]. In fact, [Shapiro, 1979] follows directly from Shapiro's earlier work, [Shapiro, 1971], where a distinction was already made between the conceptual level and the structural level of the network. An excellent historical review of semantic networks may be found in [Brachman, 1979].

Related to the research on semantic networks is the work on schemata for knowledge representation. The term schema was first used by Bartlett in relation to his work on memory [Bartlett, 1932]. Schemata involve organizing knowledge into aggregate structures.
A variety of representations fall into this category. Minsky's frames [Minsky, 1975] and Schank's scripts and plans [Schank, 1975] [Schank & Abelson, 1977] are among the well-known ones. [Bobrow & Norman, 1975] and [Rumelhart & Ortony, 1976] also discuss schema representations in representing knowledge in memory.

Frames are data structures that are used to represent stereotyped situations; scripts are frame-like structures that are designed to represent sequences of events that describe some stereotyped human activities, and plans differ from scripts in the sense that they participate in the explanation of the sequences of actions that lead to a goal. An important characteristic of schema-based or frame-based systems is their capability to represent both declarative and procedural knowledge [Winograd, 1975]. A frame is organized into slots and besides representing a collection of static facts, a frame can have procedures attached to its slots to drive the reasoning process of the system. Other interpretations of the idea of frames are provided in [Hayes, 1981].

The GUS system [Bobrow et al., 1977] is an experiment with frame-based NLU. It was designed as a prototype of an automated airline reservation assistant. It attempted to illustrate the expectation-driven processing of frames and facilitate in the understanding of various aspects of NL such as indirect answers and anaphoric references. The systems SAM (Script Applier Mechanism) and PAM (Plan Applier Mechanism) were developed to demonstrate the use of scripts and plans in understanding simple stories [Schank et al., 1975] [Schank & Abelson, 1977]. By understanding a story, we mean the ability to paraphrase the story and make inferences from it. SAM does this by trying to fit the story into one or more scripts and it has been used to understand newspaper stories [Cullingford, 1978].
The PAM system [Wilensky, 1978] understands stories by determining the goals of the characters in the story and then interpreting their actions by matching them to plans that lead to those goals.

The idea of frames as structures that unify a collection of related knowledge is interesting but vague. The development of KRL (Knowledge Representation Language), a frame-based representation language, was an attempt to make precise the intuitive ideas about frames and to explore frame-based processing [Bobrow & Winograd, 1977]. The language KL-ONE [Brachman, 1978] [Brachman, 1979] [Brachman & Schmolze, 1985] has also been designed for supporting structured knowledge representations. The principal structure of KL-ONE is the "concept" whose primary component is a "role". Roles are like generalized attribute descriptions representing the relationships between entities denoted by the concept and other entities.

KL-ONE has inspired other new research efforts on representations including KRYPTON and KL-TWO. The KRYPTON system [Brachman et al., 1983a] [Brachman et al., 1983b] overcomes some of the trouble with frames with regard to their limitations in representing either assertions or descriptions. Instead of defining the system in terms of structures for representing knowledge, KRYPTON provides a functional view of a knowledge base emphasizing what it can be asked and told about the domain. The system has two major components, namely the terminological component and the assertional component. The former represents objects in terms of definitional knowledge using frame-like structures whereas the latter represents propositions in a manner similar to a first-order predicate calculus language. Thus, KRYPTON handles both terminological and assertional knowledge by combining frame-like structural representations with a first-order theorem prover.
This type of hybrid inference system that provides more than one language for expression of domain knowledge is appealing because an intelligent system usually has more than one kind of representational need [Brachman & Levesque, 1982] [Brachman et al., 1985].

KL-TWO [Vilain, 1985] is similar to KRYPTON in that it is also a hybrid representation system. However, instead of basing the representation system on a first-order theorem prover, KL-TWO is based on a device called RUP (Reasoning Utility Package) [McAllester, 1980] [McAllester, 1982]. RUP provides only a subset of the inferential power of a first-order theorem prover, but it is computationally more efficient. KL-TWO is basically composed of two components, called PENNI and NIKL. PENNI is a modified version of RUP and NIKL is a terminological reasoner which is a direct descendant of KL-ONE.

Another recent attempt in hybrid representation is the integration of frames with production rules in the KEE system [Kehler & Clemenson, 1984]. The use of a frame-based representation in this system to assist in the task of reasoning is well described in [Fikes & Kehler, 1985].

The latest attempt in providing a representation that incorporates schemata and yet can avoid their vagueness is schema labelling [Havens, 1985], which defines schemata precisely for recognition tasks. The theory shows how schema knowledge representations can be combined with network consistency techniques to yield a descriptively and procedurally adequate formalism. This thesis adopts the schema labelling approach and is therefore partially schema-based. The knowledge base for our system is organized into schemata. In addition, the resulting network representation that captures the semantics of an input sentence is a network of schemata, each being a particular instance of one of the model schemata given in the knowledge base.
Our schemata basically provide us with a structured representation of a collection of information, but do not have embedded procedures that participate in the reasoning process of the system as frames and scripts have.

2.3 Network consistency techniques

Some network consistency techniques are reviewed in this section to provide the background for understanding our constraint-based approach to NLU that adopts these techniques. Network consistency techniques were developed in the attempt to solve constraint satisfaction problems (CSPs). [Mackworth & Freuder, 1984] defines a CSP as follows: given a set of n variables, each with a particular domain, and a set of constraining relations each involving a subset of the variables, find all possible n-tuples such that each n-tuple is an instantiation of the n variables in their particular domains satisfying the relations. The variable domains can be continuous or discrete, and the relations can be unary, binary or generally n-ary. For discrete domains that consist of a finite set of values, backtracking may be used to solve a CSP. However, backtracking is inefficient and exhibits problems such as thrashing [Bobrow & Raphael, 1974]. As a result, network consistency algorithms were developed. Waltz's filtering algorithm [Waltz, 1972] was the first of a set of network consistency algorithms. His algorithm was designed to analyse line drawings of toy blocks. In particular, the algorithm was a procedure for filtering out impossible labels for the lines in the drawing. Lines in the drawing meet at points called junctions, and there is a set of possible labellings for each type of junction. The junctions in the drawing are the set of variables for this CSP. Each variable has a domain of allowable labelled junctions. The problem is to determine which junction interpretations provide a globally consistent interpretation.
Consistency is forced by the constraint that each line in the drawing must be assigned one and only one label along its entire length. Waltz's filtering algorithm works as follows. The filtering procedure goes through the junctions in any order. For each junction, the procedure looks at its neighbours and sees whether the constraint is satisfied. The constraint is satisfied if the line shared by two junctions has the same label at both ends. If this constraint is not satisfied, that labelling for the junction is eliminated. When such a deletion occurs, all neighbouring junctions whose interpretations are constrained by the deleted junction interpretation are revisited. This and other network consistency algorithms eliminate all local inconsistencies that cannot participate in any global solution, but they do not necessarily solve the CSP. In the best possible case, the CSP can be solved to yield one interpretation for each junction if there are sufficient constraints to propagate such a solution. However, if the algorithm does not leave us with a unique labelling for each junction when it terminates, a backtracking search can be used to enumerate the remaining possible labellings. In any case, the algorithm is at least an efficient preprocessor in that it reduces the size of the domains of the variables in only a single pass through the junctions. Waltz's filtering algorithm only deals with binary constraints. Mackworth [Mackworth, 1977a] presented three types of network consistency algorithms, namely node, arc and path consistency algorithms, for eliminating local inconsistencies that involve 1, 2 or 3 variables respectively. An obvious generalization of his algorithms would deal with arbitrary n-ary constraints. Waltz's filtering algorithm is in fact a special case of AC-2, the second arc consistency algorithm presented by Mackworth, whereas Montanari's algorithm [Montanari, 1974] is a version of a path consistency algorithm.
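The delete-and-revisit loop common to these arc consistency algorithms can be sketched in a few lines. The following is an illustrative sketch only, in the style of AC-3 rather than any one of the algorithms above, and the toy junction labels are made up for the example rather than drawn from Waltz's actual line-label alphabet:

```python
from collections import deque

def revise(domains, allowed, x, y):
    """Delete each label of x that has no consistent partner label in y."""
    changed = False
    for a in list(domains[x]):
        if not any((a, b) in allowed[(x, y)] for b in domains[y]):
            domains[x].discard(a)
            changed = True
    return changed

def arc_consistency(domains, allowed):
    """Propagate deletions: whenever x loses a label, revisit the arcs into x."""
    queue = deque(allowed)
    while queue:
        x, y = queue.popleft()
        if revise(domains, allowed, x, y):
            if not domains[x]:
                return False      # a variable lost every label: no solution
            queue.extend(arc for arc in allowed if arc[1] == x and arc[0] != y)
    return True

# Toy instance: two junctions share a line that must carry a single label,
# so only equal label pairs are allowed across the arc.
eq = {("+", "+"), ("-", "-")}
domains = {"j1": {"+", "-"}, "j2": {"+"}}
allowed = {("j1", "j2"): eq, ("j2", "j1"): eq}
consistent = arc_consistency(domains, allowed)   # prunes "-" from j1
```

As in the prose description, the algorithm only removes locally inconsistent labels; whether the surviving domains constitute a unique global interpretation is a separate question.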
Freuder's network consistency algorithm [Freuder, 1978], called k-consistency, removes all inconsistencies involving all subsets of size k out of the n variables. His algorithm is a generalization of Mackworth's algorithms. Node, arc and path consistency correspond to Freuder's 1-, 2- and 3-consistency for k = 1, 2 or 3 variables respectively. Freuder's algorithm differs from the others in the sense that it may determine all solutions to a CSP if we have k = n, thus eliminating the need for further tree search to determine the solutions. [Mackworth & Freuder, 1984] is a study of the complexity of these network consistency algorithms. It is shown that node, arc and path consistency can all be achieved in polynomial time. In particular, arc consistency is achievable in time linear in the number of binary constraints using AC-3, the third arc consistency algorithm, and the worst case complexity for arc and path consistency is O(n^2) and O(n^3) respectively. In comparison to depth-first backtracking, whose worst case complexity is exponential, these network consistency algorithms show a significant improvement. A network consistency algorithm known as HAC (Hierarchical Arc Consistency) has recently been developed [Mackworth et al., 1985] to exploit structured domains of the variables in CSPs. This algorithm is useful when the variable domains can be organized hierarchically such that all the possible labels in each domain need not be represented explicitly but can be represented implicitly by one or more abstract labels. Hierarchical arc consistency can also be achieved in linear time like AC-3. Under the specific condition where the domains are appropriately structured, HAC performs better than AC-3, but in the worst case where there is no solution, HAC may be twice as slow as AC-3. The HAC algorithm has been used in the Mapsee3 system [Mulder, 1985] for interpreting hand-drawn sketch maps.
Our approach to NLU is partially constraint-based. We try to view the task of NLU as a CSP so that we can exploit these network consistency techniques to achieve efficiency in analysing NL. Backtracking search is commonly used in NLU systems to discover the validity of a sentence, and as we have mentioned before, network consistency techniques are superior to backtracking. When viewing NLU, or any other recognition task, as a CSP, we need to identify the variables of the problem, their domains and the constraints that are applied on them. We have chosen the semantic entities of a sentence as the variables for the CSP. The domain for each of these entities is the set of meanings that are attached to it, which corresponds to all the facts of the database that are applicable to this entity. The constraints on these entities are the semantic constraints that express the allowable relationships or interactions between these entities. A hierarchical version of arc consistency, AC-2 [Mackworth, 1977a] in particular, is used by our system to arrive at an interpretation for each semantic entity. Although arc consistency does not guarantee a solution, the algorithm can remove all local inconsistencies such that the resulting network incorporates all the possible interpretations of the sentence. Backtracking search may then be used to assemble actual global interpretations. However, the use of network consistency techniques for pre-processing has restricted backtracking search to a minimum.

2.4 Earley's context-free parsing algorithm

Many parsing algorithms have been developed in the past. These include the left-parsable (LL) parsing algorithms, the operator precedence parsing algorithms, the predictive parsing algorithms and the right-parsable (LR) parsing algorithms. Although these algorithms are very efficient in that they run in linear space and time, they can only handle a small subset of context-free grammars.
However, these efficient algorithms are adequate for handling all the syntactic features of programming languages. Natural languages include more difficult phenomena than programming languages. For example, the grammar may be ambiguous, and usually all parses instead of just one are of interest. Thus, at least a general context-free parsing algorithm is needed for processing natural languages. Backtracking algorithms may be used but they require exponential time. Tabular methods, on the other hand, are asymptotically much faster than backtracking algorithms. Earley's context-free parsing algorithm [Earley, 1970] and the Cocke-Younger-Kasami algorithm [Aho & Ullman, 1972] are two of the most well-known ones that run in polynomial time. They each take space n^2 and time n^3 where n is the length of the input string. Earley's algorithm has the advantage that it only requires time n^2 whenever the grammar is unambiguous and can even operate in time n for most grammars for programming languages, whereas the Cocke-Younger-Kasami algorithm requires the grammar to be in Chomsky normal form. Also, it has been shown that Earley's algorithm is applicable to schema-based systems [Havens, 1983]. As a result, we adopted Earley's algorithm for our system. We will now describe the algorithm informally. A formal description may be found in [Earley, 1970]. The use of Earley's algorithm in the context of our NLU system is explained in detail in chapter 3. Earley's algorithm is a recognizer. This means that it takes as input a string and either accepts or rejects it depending on whether or not the string is a sentence of the specified grammar. The grammar is a context-free grammar (CFG) and is formally defined as a 4-tuple G = (N, T, P, S) where N is a finite set of non-terminal symbols, T is a finite set of terminal symbols, P is a finite set of production rules, and S is a distinguished symbol of N called the start symbol.
Each production in P is of the form A → α where A is a symbol in N and α is a string σ₁σ₂…σₖ where each σᵢ is in (N ∪ T). Let w = x₁x₂…xₙ be the input string with every xᵢ in T, 1 ≤ i ≤ n. Earley's algorithm scans the input string from left to right and, as each symbol xᵢ is scanned, a set Sᵢ of states is constructed. A state has the form [A → α·β, j], which represents the condition of the recognition process in recognizing the non-terminal A starting at xⱼ₊₁ in w using the production A → αβ where α and β are in (N ∪ T)*. The dot between α and β is a metasymbol not in N or T, and it delimits the portion, α, of the production's right hand side which has been recognized so far from the portion, β, that still needs to be recognized. Initially, since none of the symbols in the input string has been recognized, the algorithm begins by creating the state set S₀ containing just the state [φ → ·S ⊣, 0]. φ is a new non-terminal that is not in N, ⊣ is a new terminal symbol not in T, and S is the root of the grammar. ⊣ represents the null character at the end of an input string, acting as a terminator. As the algorithm processes each new symbol xᵢ₊₁, a new state set Sᵢ₊₁ is generated. If the entire input string is processed and the state set Sₙ₊₁ contains the single state [φ → S ⊣ ·, 0], then the input string is an accepted sentence of the given grammar. On the other hand, if at any point in the processing Sᵢ₊₁ remains empty after Sᵢ has been processed, then w is rejected as a sentence of the grammar G. A state set Sᵢ is operated on in the following manner. The states in the set are processed in order and, depending on the form of the state, one of three operations is performed on it. These three operations are known as predictor, scanner, and completer. They may add more states to Sᵢ and may also put states in the next state set Sᵢ₊₁. The algorithm moves on to process the states of Sᵢ₊₁ after all the states in Sᵢ have been processed.
The predictor operation is applicable to a state when there is a non-terminal to the right of the dot. This operation causes a new state to be added to Sᵢ for each alternative production of that non-terminal. Thus, in the beginning, the state [φ → ·S ⊣, 0] indicates that the predictor operation can be applied, and states of the form [S → ·α, 0] are placed in S₀ for each production S → α in P. The scanner operation is applicable to a state when there is a terminal to the right of the dot. This operation compares that terminal symbol with xᵢ₊₁, and if they match, it adds the state to Sᵢ₊₁ with the dot moved to the right of the terminal symbol to indicate that it has been scanned. Thus, if the state is [A → α·cβ, k] and c is a terminal symbol that matches the input symbol, then the state [A → αc·β, k] is placed in the next state set. The completer operation is applicable to a state such as [A → αc·, k] when the dot is at the end of the production. This operation retrieves all states from the previous state set Sₖ that have in their productions the non-terminal symbol A to the right of the dot. The dot is then moved past A in these states and these updated states are added to the current state set Sᵢ. Thus, if the state under processing is [A → α·, k], then the completer operation will retrieve all states from the state set Sₖ that have the form [B → α·Aδ, j], 1 ≤ j ≤ n, j ≤ i, propagate them as [B → αA·δ, j], and place them into Sᵢ. Conceptually, the predictor provides all possible extensions of a partially completed parse. The scanner checks potential extensions against the input. The completer updates the confirmed extensions to enable the parsing process to continue. Our system adapts the idea of Earley's parsing algorithm to processing sentences. In fact, the algorithm's three main operations of predicting, scanning and completing form the basis of control in our system.
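The three operations just described can be sketched as a minimal recognizer. This is an illustrative reimplementation, not the thesis system's code: the grammar encoding (a dictionary mapping each non-terminal to a list of right-hand-side tuples, with any symbol lacking productions treated as a terminal) is our own, the toy grammar rules are invented, and instead of the φ → S ⊣ bookkeeping state it simply checks the final state set for a completed start production.

```python
def earley_recognize(grammar, start, words):
    """Minimal Earley recognizer.  A state is (lhs, rhs, dot, origin)."""
    n = len(words)
    chart = [set() for _ in range(n + 1)]
    chart[0] = {(start, rhs, 0, 0) for rhs in grammar[start]}
    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs) and rhs[dot] in grammar:        # predictor
                for alt in grammar[rhs[dot]]:
                    new = (rhs[dot], alt, 0, i)
                    if new not in chart[i]:
                        chart[i].add(new)
                        agenda.append(new)
            elif dot < len(rhs):                              # scanner
                if i < n and words[i] == rhs[dot]:
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:                                             # completer
                for plhs, prhs, pdot, porigin in list(chart[origin]):
                    if pdot < len(prhs) and prhs[pdot] == lhs:
                        new = (plhs, prhs, pdot + 1, porigin)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)
    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[n])

# A toy grammar in the flavour of the thesis's domain (hypothetical rules).
grammar = {
    "S":  [("NP", "VP")],
    "NP": [("bach",), ("the", "music")],
    "VP": [("composed", "NP"), ("composed",)],
}
accepted = earley_recognize(grammar, "S", ["bach", "composed", "the", "music"])
```

Because the recognizer has no ε-productions to handle here, every completed state has origin < i, so each earlier state set is already final when the completer consults it.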
However, the productions that are used in our system have embedded syntactic and semantic constraints to enable the analysis of English sentences, whose grammar is not context-free. The algorithm has been altered to accommodate this and also to produce all the possible parse trees or derivations of the input sentence instead of just acting as a recognizer.

Chapter 3

A Schema & Constraint-Based Approach to NLU

The schema and constraint-based approach to NLU is explained in this chapter through the description of a NLU system which follows this approach. See [Havens, 1985] for an introduction to combined schema and constraint-based recognition.

3.1 Overview of the system

An overview of the system is shown in Figure 3.1. Each component of the system is described briefly in this section. The main purpose of the system is to explore the use of schema knowledge representations in combination with network consistency techniques in the task of NLU. Thus, the system is not designed as a NL query system that provides answers to questions on a database, but rather as a representational system that produces a syntactic and a semantic representation for a given input sentence. The system accepts simple declarative or interrogative English sentences as input. It utilizes the information provided by the knowledge base (KB) to analyse the sentence, producing as output a parse tree (PT) and a network consistency graph (NCG) which represent the syntax and semantics of the input sentence respectively.

Figure 3.1: Overview of the system

If there are multiple interpretations for a sentence, then multiple PTs and NCGs are produced in parallel with shared structures whenever possible. The focus of the system is on representing the meaning of input sentences.
A PT represents the syntactic structure of an input sentence and is built by the system as it analyses the sentence according to the grammar provided in the KB. This structure is a tree with each node representing a syntactic category; each leaf node also represents a word in the input sentence, thus identifying the word's syntactic category and its relation with respect to the entire structure. The formation of this structure is performed in parallel with the construction of the NCG, which represents the semantic structure of the input sentence. As each input word is analysed and appropriately identified with a PT node, one or more semantic nodes may be created in the NCG which correspond to the semantics of the input word. Links are established at this point between the PT node that represents the current input word and the corresponding semantic nodes in the NCG. This allows the ongoing parsing process to further guide the construction process of the NCG. The NCG is a semantic network that is built incrementally in parallel with the PT. The semantics of each input word is represented by one or more nodes in the NCG. The arcs that link these semantic nodes represent the relationships that exist between the semantic components of the sentence. These semantic components are entities from the database and the relationships between them correspond closely with the cases of Fillmore [Fillmore, 1968]. As each new node is added to the NCG, additional semantic constraints are introduced. These constraints may alter the existing interpretation that each node contributes towards the global interpretation of the sentence. In order to maintain a consistent global interpretation with all constraints satisfied, network consistency techniques are applied at this point. The KB provides the knowledge about the task domain.
It also contains both syntactic and semantic knowledge that guide the system in its construction of the PT and the NCG. The information within the KB is represented using schemata. The main driver of the system controls the interpretation of each sentence by manipulating a combination of processes. These processes are the predictor, the scanner, the completer and a garbage collector. They direct the analysis performed by the system and are derived from Earley's context-free parsing algorithm. The predictor computes the phrase structure grammar rules that may be involved in the derivation of the PT and predicts the syntactic nodes that may participate in one or more of the final PT structures. The scanner scans the input sentence one word at a time to provide the necessary information that leads to the acceptance or rejection of the predictions made by the predictor. The completer then narrows the predictions of the multiple interpretations to the appropriate ones that conform to the input. The garbage collector removes the impossible interpretations as soon as they are discovered. This involves the removal of syntactic and semantic nodes from the PT and the NCG. There are other subordinate processes defined in the system. The scanner has a subprocess called the morpher which performs suffix analysis on each word scanned. The completer has two subprocesses, namely the syntax and the semantics handlers. The syntax handler applies syntactic constraints on the nodes in the partially built PT whereas the semantics handler applies semantic constraints on the nodes in the partially built NCG. It also participates in the construction of the NCG. Network consistency techniques are utilized by the semantics handler to resolve the semantic constraints that arise as each word is scanned.
If either the syntactic or the semantic constraints are not resolved, then the completer is notified to terminate the current interpretation. In summary, the main driver, with its processes and subprocesses, performs the analysis of input sentences while the KB provides the knowledge that enables the interpretations to take place. The PT and NCG are the final products of this interpretation. Multiple PTs and NCGs will be created if there is more than one interpretation for an input sentence.

3.2 Knowledge base (KB)

The knowledge that the system needs in order to analyse English sentences resides in the KB. The task domain is composers and their music. The KB is a static collection of model schemata. Each of these schemata is a prototype which represents either syntactic or semantic knowledge. During the interpretation process, derived schemata are generated from these prototypes, forming the syntactic and semantic nodes of the PT and the NCG. There are two types of model schemata in the KB. The syntactic schemata represent syntactic knowledge which specifies how sentences are structured. This is the system's knowledge of English grammar. The other type of model schemata is the semantic schemata, which represent the semantics of words. These two types of model schemata are described in the following sections.

3.2.1 Syntactic schemata

Each syntactic schema represents a syntactic category and is referred to by the abbreviated name of the syntactic category which it represents. Table 3.1 shows the abbreviated names of the syntactic categories that are handled by the system. Figure 3.2 gives the BNF syntax of a syntactic schema.
s - sentence
np - noun phrase
subnp - noun group
vp - verb phrase
comps - subject-complements
mods - general modifiers
nmods - noun modifiers
adjs - adjectives
nposs - possessive nouns
pp(s) - prepositional phrase(s)
qword - question words
auxv - auxiliary verb
v - verb
det - determiner
adj - adjective
npr - proper noun
n - common noun
prep - preposition

Table 3.1: Syntactic categories' abbreviations

Each syntactic schema contains a set of composition rules which define the various ways the syntactic category can be composed. If the syntactic category denotes a terminal symbol in the grammar, then the syntactic schema will have the rule (*dot *term) to indicate that the syntactic category is composed of an input word. Otherwise, a composition rule is composed of the symbol *dot, grammar constituents, plus one or more predicates. The symbol *dot at the beginning of each rule serves a special purpose in Earley's parsing algorithm, described in section 2.4. It is used to delimit the portion of the rule that has been recognized, and whose constraints have been applied and resolved successfully, from the portion that still needs to be found from the input sentence and from the constraints that still need to be applied.

<syntactic schema> ::= (<composition rule1>+) | (<composition rule2>)
<composition rule1> ::= (*dot <grammar constituent>+ <predicate>*)
<composition rule2> ::= (*dot *term)
<grammar constituent> ::= s | np | ngroup | vp | comps | mods | nmods | adjs | nposs | pp | pps | qword | auxv | v | det | adj | npr | n | prep
<predicate> ::= (<syn predname> <syn args>*) | (<sem predname> <sem args>*)
<syn predname> ::= rwordis | hasfeature | gparhasno | gparhas | isposs | isnotposs | nvagree | vvagree | dnagree
<sem predname> ::= build | setptr | sp | checknum
<syn args> ::= <wlist> | <syn feature> | (<grammar constituent>) | <grammar constituent> <test>{0,1}
<sem args> ::= <grammar constituent> <word>{0,1} | <grammar constituent>^2 <arc reln>{0,1} | (schild <grammar constituent>) <grammar constituent> <arc reln>
<wlist> ::= a list of English words
<syn feature> ::= (trans) | (intrans) | (cop)
<test> ::= (iftrans) | (ifnosubj)
<word> ::= an English word
<arc reln> ::= <condn list> | <case> | <function>
<condn list> ::= ((<word> <case choice>)+)
<case> ::= agent | obj | mod | temp | locn | event
<function> ::= depends | rdepends
<case choice> ::= <case> | (<case>+)

Figure 3.2: BNF grammar for a syntactic schema

The predicates are either syntactic or semantic constraints that should be applied at the appropriate time in the recognition process. They provide context-sensitive information to the system in deciding the appropriate parse. They also provide the necessary instructions on when and how to build and link nodes and maintain consistency in the NCG. The semantics of these predicates is explained in section 3.4. The complete set of syntactic schemata can be found in the appendix. Each composition rule of a syntactic category denotes a possible composition, parse or interpretation for that syntactic category.
Multiple interpretations exist when multiple rules of a syntactic schema are simultaneously active. For a particular interpretation to be valid, all constraints in that rule must be satisfied and the input must conform with the rule's compositional elements. The syntactic schemata are models of grammatical rules. The schemata in the PT that capture the grammar of a sentence are derived schemata of these model syntactic schemata in the KB. They are created during the analysis of a sentence and are dependent both on the models and on the input. The *dot symbol is always at the beginning of a rule of a model syntactic schema, but each derived schema's rule will probably have its *dot in a different position in the rule, reflecting the status of that schema with respect to the condition of the parsing process. The creation and manipulation of these derived schemata is explained in more detail in the following sections. As an example, one rule in the set of composition rules for the sentence syntactic schema, named s, is:

(*dot np vp (nvagree np vp) (sp vp np agent) (sp np vp agent) (checknum vp))

This rule says that an input sentence is a valid sentence if it is composed of a noun phrase (np) followed by a verb phrase (vp). This is the rule for the composition of a declarative sentence, and there are other rules that specify alternate ways that a sentence can be formed. They can be found in the s syntactic schema given in the appendix. There are four predicates in this sample rule:

(nvagree np vp)
(sp vp np agent)
(sp np vp agent)
(checknum vp)

(nvagree np vp) is a syntactic constraint. It says that to satisfy the constraint there must exist a number agreement between the main noun in np and the main verb in vp. For instance, if the main noun is in singular form, and the main verb is in third person singular form, as in 'Bach lives', then the constraint is satisfied.
Interpretation of a sentence based on this rule is allowed to continue only if the constraint is satisfied. All syntactic constraints serve a similar purpose, which is to check certain nodes in the partially constructed PT for a particular property or for some syntactic agreement between nodes that must exist before a sentence can be considered valid and the interpretation process can continue. The remaining constraints in the example are semantic constraints, and are applied on the semantic nodes in the NCG(s). The first two constraints are called specialization constraints. They create arcs between the semantic node that represents np and the semantic node that represents vp in the NCG. The relationship between the two is marked 'agent'. This means that np fills the agent case of the schema relations of vp. Network consistency must exist when such links are made between semantic nodes. The technique of network consistency was introduced in section 2.3, and section 3.4.2 explains how it is performed on a NCG in detail. If the NCG is still consistent after a link is made between two of its nodes, then the semantic constraint is satisfied. The (checknum vp) semantic constraint does not create new links in the NCG. Instead, it tries to apply the number constraints provided by determiners on the semantic nodes of nouns. The specialization constraint and the number constraint are the two basic types of semantic constraints. The specialization constraint is usually less straightforward in its specification of the relationship that should exist between nodes than the 'agent' relationship given in the above example. In most cases, a list of possible relationships and the conditions under which they are applicable are specified, and it is the semantics handler's job to decide which relationship is appropriate for a given situation.
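As an illustration only, the two kinds of predicate might look like the following sketch. The node fields, the relation encoding, and the 'brook' reading of 'Bach' are our own inventions for the example, not the system's actual data structures:

```python
def nvagree(np_node, vp_node):
    """Syntactic constraint: number agreement between main noun and main verb."""
    return np_node["number"] == vp_node["number"]

def sp(ncg_arcs, head, filler, case):
    """Specialization constraint (sketch): link filler to head under `case`,
    then restore consistency by keeping only filler labels that some event
    label of the head admits as its `case` filler."""
    allowed = {rel[case] for rel in head["relations"].values() if case in rel}
    filler["labels"] &= allowed
    ncg_arcs.append((head["name"], case, filler["name"]))
    return bool(filler["labels"])    # constraint fails if the domain empties

# 'Bach lives'-style example: both nodes singular, so nvagree succeeds; the
# specialization constraint then prunes the non-composer reading of 'Bach'.
np = {"name": "bach-np", "number": "sg", "labels": {"Bach", "brook"}}
vp = {"name": "compose-vp", "number": "sg",
      "relations": {"compose1": {"agent": "Bach", "obj": "fugue1"}}}

arcs = []
ok = nvagree(np, vp) and sp(arcs, vp, np, "agent")
```

The point of the sketch is the division of labour: the syntactic predicate only inspects PT nodes, while the semantic predicate both builds an NCG arc and restricts the label domains it connects.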
3.2.2 Semantic schemata

For each English word, there is a schema in the KB that represents it. The schema contains all the syntactic and semantic knowledge of the word that is needed in order that the system may function. Figure 3.3 gives the syntax for such a semantic schema. A brief explanation of each of the properties that a semantic schema may have is given in Table 3.2.

<semantic schema> ::= <word> <syn cat> <inflns> <syn feature>{0,1} <labelset> <case list>{0,1} <schema relations>{0,1} <sem rule>{0,1}
<word> ::= a word
<syn cat> ::= n | npr | pn | adj | qword | det | prep | v | auxv
<inflns> ::= s | es | s-d | s-ed | es-ed | er-est | r-st | ur | * | <infln entry>
<syn feature> ::= (trans) | (intrans) | (cop)
<labelset> ::= (<label>+)
<case list> ::= (<case>+)
<schema relations> ::= (<sreln>+)
<sem rule> ::= (equal-num <num>) | (morethan <num>)
<infln entry> ::= (<rootwd> <possessive>{0,1} <number>{0,1} <tense>* <person>{0,1})
<label> ::= <word> | <fabricated label>
<case> ::= agent | obj | mod | temp | locn | event | label
<sreln> ::= (<fabricated label> <label>+)
<num> ::= 0|1|2|3|4|5|6|7|8|9
<rootwd> ::= the non-inflectional form of an English word
<possessive> ::= (poss)
<number> ::= (number pl)
<tense> ::= (infin) | (tns present) | (tns past) | (pastpart) | (prespart)
<person> ::= (pncode <code>)
<fabricated label> ::= <word><num>+
<code> ::= 1sg | 3sg | 13sg | x13sg

Figure 3.3: BNF grammar for a semantic schema

word: an English word or an identifier representing an English description.
syntactic category: the syntactic class that the word belongs to.
inflectional knowledge: the possible variety of terminations of a word to express the relations of number, person and tense.
syntactic features: exists only for verbs, classifying them as transitive, intransitive or copula verbs.
label set: a collection of labels to capture the meaning of a word.
case list: exists only for verbs and prepositions, specifying the cases that may be filled.
schema relations: exists only for verbs and prepositions, specifying the objects that may fill the given cases, thus showing the relationships between the objects.
semantic rule: exists only for determiners; a semantic predicate that specifies how to quantify nouns.

Table 3.2: Properties of a semantic schema

Words of different syntactic categories have different types of syntactic and semantic knowledge in their schemata. Thus, not every semantic schema has all the properties listed in Table 3.2. Each of these properties and the type of words to which it pertains is explained further below. Basically, syntactic knowledge includes the syntactic category that a word belongs to and its syntactic features. All words have syntactic knowledge in their semantic schemata. Words that are of the syntactic categories of nouns and verbs also have knowledge of their inflections included in the schemata. Inflection is a type of syntactic feature that through variance in word termination expresses the relations of number, person and tense. For example, the plural form of the noun 'composer' can be formed with the suffix 's' appended to the word. Thus, 's' is the inflectional knowledge within the schema for the word 'composer'. Certain verbs are considered irregular and have no inflectional knowledge that can guide the morpher in deriving the tense and person of an inflected form of the verb. Thus, the knowledge of person and tense is provided explicitly for each present, past and past participle form of the verb in the KB. Verbs also have the syntactic features that categorize them as transitive, intransitive or copula verbs. Syntactic knowledge of words is needed because the satisfaction of syntactic constraints depends on it.
For example, a syntactic constraint such as (hasfeature (trans)) is satisfied if the schema for the verb has (trans) as a syntactic feature, meaning that the verb is transitive. Similarly, the satisfaction of semantic constraints depends on the semantic knowledge within the semantic schemata.

Words of the syntactic category of determiner are basically quantifiers in our system. Their semantics is represented by predicates, such as (equal-num 1) for the determiner 'one'; each predicate is basically a semantic constraint which, when executed, captures the semantics of the word. Thus, we are assuming that the semantic significance of determiners is in providing number constraints on nouns.

Words of other syntactic categories have their semantics represented by label sets. A label set is a collection of labels which are meant to capture the meaning of a word. For example, the word 'composer', which is a noun, has the label set:

(Bach Handel Haydn Mozart Beethoven Chopin Berlioz Tchaikovsky Verdi)

We are assuming that a noun represents a class of objects, and we call each of these objects a label. A label may be another noun or a pseudo-noun which also has a label set whose labels also have label sets, and so on. As a result, a hierarchy of knowledge organized into classes and subclasses is formed for each noun. This is known as the specialization hierarchy [Brachman, 1982]. Pseudo-nouns are descriptions for a collection of nouns. They allow groups of nouns or labels to be categorized under one name. For example, 'voc-music' is a pseudo-noun that represents the collection of objects that fall under the category of vocal music. The specialization hierarchy for the word 'music' in Figure 3.4 shows how pseudo-nouns can group nouns together and be used as labels. The pseudo-nouns are in italics in the figure and the descriptions that they represent are given in parentheses.
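A determiner's semantic rule can be read as a predicate over the cardinality of a label set. The sketch below is an illustration in Python rather than the system's LISP, and `eval_sem_rule` is an invented name.

```python
# Sketch of a determiner's semantic rule as a predicate over a label set:
# (equal-num 1) succeeds when the refined label set has exactly one label;
# (morethan n) succeeds when it has more than n.
def eval_sem_rule(rule, label_set):
    name, num = rule
    if name == "equal-num":
        return len(label_set) == num
    if name == "morethan":
        return len(label_set) > num
    raise ValueError("unknown semantic rule: " + name)
```

Under this reading, the determiner 'one' contributes the constraint (equal-num 1), which holds only if interpretation has narrowed the noun's label set to a single object.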
As with nouns, the semantic knowledge of adjectives and question words is represented by label sets within the semantic schemata. For an adjective, its label set is the class of all objects that may be modified by the adjective. For example, the adjective 'famous' has in its schema the label set (music composer) in our KB because all of the musical compositions and composers are considered famous and can be modified by this adjective. Similarly, the label set of a question word represents a class of objects. For example, the question word 'who' represents the class of all people, and since all the people in our database are composers, its label set is simply (composer).

Prepositions and verbs also have label sets for representing the semantics of words. However, there is a substantial difference between their label sets and those of nouns, adjectives and question words. Verbs denote linguistic events, and each event involves the interaction of objects. For example, the verb 'compose' represents all the composing events, and each such event may involve a composer, a composition, and the time and place that the composition was written. Thus, verbs' label sets are not simply classes of objects as with nouns, but are classes of events. We represent each event by what we call a fabricated label so that the event with all the inter-related objects can be referred to by a single name.

Figure 3.4: Specialization hierarchy for the noun 'music'

For example, if there are altogether three composing events for the verb 'compose', then its label set will be (compose1 compose2 compose3) where each element in the list is a fabricated label denoting an event. In addition, we also need to include what we call a case list and schema relations within the semantic schemata of verbs. The case list within a semantic schema expresses the cases of the verb that may be filled.
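The specialization hierarchy can be sketched as a recursive expansion: a pseudo-noun's label set may contain further nouns or pseudo-nouns, and following the labels downward yields the leaf objects the word covers. The fragment below is an illustrative Python sketch in the spirit of Figure 3.4; the hierarchy data and the name `leaf_labels` are invented.

```python
# Sketch of a specialization hierarchy: each noun or pseudo-noun maps to
# its label set; expanding recursively yields the leaf objects it covers.
HIERARCHY = {
    "music": ["voc-music", "inst-music"],
    "voc-music": ["Messiah", "Mass-in-B-minor"],
    "inst-music": ["Well-Tempered-Clavier"],
}

def leaf_labels(label):
    """Return all leaf objects under a noun or pseudo-noun."""
    children = HIERARCHY.get(label)
    if children is None:          # a leaf label: an actual object
        return [label]
    leaves = []
    for child in children:
        leaves.extend(leaf_labels(child))
    return leaves
```

This mirrors how a word like 'music' ultimately denotes the individual compositions at the leaves of its class/subclass hierarchy.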
This is an adaptation of Fillmore's cases [Fillmore, 1968]. For each fabricated label, there is a corresponding schema relation that lists the objects that fill the cases of the verb for a particular incidence of that verb. The set of schema relations is a set of tuples which represents all the events that are denoted by the verb in the KB. Thus, for the verb 'compose', the case list is (label agent obj mod), and each line in Table 3.3 denotes a schema relation for one of the three events in the label set.

label     agent   obj                    mod
compose1  Bach    Mass-in-B-minor        in1
compose2  Bach    Well-Tempered-Clavier
compose3  Handel  Messiah                in2

Table 3.3: Sample schema relation for the verb 'compose'

The first schema relation in the table says that we have an event named 'compose1' which represents an event of composing where 'Bach' is the agent of the act of composing and the composition 'Mass-in-B-minor' is the object of the act. Modifiers on the act are summarized by the fabricated label 'in1'. The other two schema relations can be interpreted in a similar fashion. To summarize, in the case that there are only three composing events, the semantic schema for the verb 'compose' in the KB contains the information given in Figure 3.5.

word: compose
syntactic category: v
inflectional knowledge: s-d
syntactic features: (trans)
label set: (compose1 compose2 compose3)
case list: (label agent obj mod)
schema relations: (compose1 Bach Mass-in-B-minor in1)
                  (compose2 Bach Well-Tempered-Clavier)
                  (compose3 Handel Messiah in2)

Figure 3.5: Sample semantic schema for the verb 'compose'

word: in
syntactic category: prep
inflectional knowledge: *
label set: (in1 in2 in3)
case list: (label event temp locn)
schema relations: (in1 compose1 1733 Leipzig)
                  (in2 compose3 1742 Dublin)
                  (in3 born1 1685 Eisenach)

Figure 3.6: Sample semantic schema for the preposition 'in'
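The tuple reading of schema relations can be made concrete with a small sketch: each relation is a tuple whose positions correspond to the case list, so retrieving a case filler is an index lookup. This is an illustrative Python fragment (the system itself is in LISP), using the data of Table 3.3; `case_filler` is an invented helper name.

```python
# Sketch of the 'compose' schema relations from Table 3.3 as tuples keyed
# by the case list, plus a helper that retrieves the filler of a case.
CASE_LIST = ("label", "agent", "obj", "mod")
SCHEMA_RELATIONS = [
    ("compose1", "Bach", "Mass-in-B-minor", "in1"),
    ("compose2", "Bach", "Well-Tempered-Clavier", None),  # no modifier
    ("compose3", "Handel", "Messiah", "in2"),
]

def case_filler(event_label, case):
    """Return the object filling `case` for the event named `event_label`."""
    idx = CASE_LIST.index(case)
    for relation in SCHEMA_RELATIONS:
        if relation[0] == event_label:
            return relation[idx]
    return None
```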
In order to understand the fabricated label 'in1' that appeared in a schema relation in Figure 3.5, we now turn to the semantic schemata of prepositions. Somewhat like verbs, prepositions do not simply represent classes of objects but represent the relationships between objects. Thus, within the semantic schemata of prepositions, we again have label sets of fabricated labels, case lists and schema relations as in verb schemata. A simplified version of the semantic schema of the preposition 'in' is given in Figure 3.6 to illustrate this. The 'in1' in the schema relation of the fabricated label 'compose1' in the previous example is itself a fabricated label which expresses the relationship of the event of composing with the time and location of the event in one name. For example, the schema relation of 'in1' informs us that the event represented by 'compose1' happened in the year 1733 in the city of Leipzig.

3.3 System control

Earley's context-free parsing algorithm [Earley, 1970] and Havens' schema-based recognition method [Havens, 1983] have been adapted to provide the necessary control for the interpretation process in our system. Recognition is performed in a bottom-up fashion coupled with top-down control. This integrated bottom-up and top-down recognition method allows the construction of the PT and the NCG in parallel. In addition, all possible interpretations can be explored in parallel while each schema retains control of the recognition process. The result is the creation of multiple PTs and NCGs with shared networks for multiple interpretations. We shall describe control in terms of three processes, after Earley's three parsing functions, in the following sections. A brief description of the actions involved in the interpretation process is given below.
The main driver of the system is responsible for controlling the direction of interpretation of an input sentence. Direction is either bottom-up or top-down depending on the status of the derived syntactic schemata in the state sets of the system. The status of the derived syntactic schemata is in turn dependent on the input sentence and the KB. A derived syntactic schema is an instantiation of a model syntactic schema in the KB. Its creation signifies the possibility of its corresponding model syntactic schema capturing the syntax of part of or the entire input sentence. Derived syntactic schemata reside in state sets which eventually turn into the PT representation.

Initially, the main driver of the system creates a state¹ called the derived root schema that contains the rule (*dot s *term)² and places this schema in the state set called st0. This schema is responsible for the chain-reaction spawning of syntactic schemata derived in the future. The list (*dot s *term) represents the status of the derived root schema. *dot is a symbol that delimits the recognized portion of the rule from the unrecognized portion. Initially, *dot is at the far left of the rule, stating that s, which is the root of the grammar, has not been recognized yet. The user may choose another nonterminal symbol to be the root of the grammar. For example, if one wants to see the representation for a noun phrase instead of an entire sentence, then one may choose np to be the root of the grammar, and (*dot np *term) will be the derived root schema instead. *term is a terminal symbol whose presence causes the scanner process to scan a word from the input sentence; in this case a null character should be scanned to signify the end of the input sentence. It is necessary to recognize *term after s as stated in the above rule to ensure that the end of the input string is reached after s has been recognized.
The main driver contains a loop that processes state sets one at a time until a new version of the derived root schema with the rule (s *term *dot) appears in a state set, indicating that the input sentence has been interpreted successfully, or until an empty state set is encountered, indicating failure in the interpretation process. A state set is processed when all the derived syntactic schemata that it contains have been operated on in order; a derived syntactic schema may be operated on by either the predictor, the scanner or the completer process depending on its status. These three processes may add more derived syntactic schemata to the current state set under processing or to the next state set to be processed. The actions of each process are either top-down or bottom-up processing. Derived syntactic schemata that do not contribute to the final interpretation of the input sentence are deleted from the state set that contains them by the garbage collector as they are found to be inappropriate during the interpretation process. Thus, all the derived syntactic schemata in the series of state sets at the end of the interpretation process form a parse tree (PT) that provides a syntactic structure for the analysed input sentence. The three processes are described in the following sections.

¹Earley's terminology. [Aho & Ullman, 1972] calls a state an item, and calls a state set a parse list. We call the states in a state set derived syntactic schemata since each of them contains a collection of information based on the model syntactic schemata from the KB instead of being just a simple rule of the form [A → α·β].

²Our LISP representation of the right hand side of the rule [φ → ·S ⊣] described in section 2.4.
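The driver's dispatch over dotted rules can be sketched compactly: the symbol immediately after *dot determines which of the three processes handles a derived syntactic schema, exactly as in Earley's algorithm. The following Python fragment is only an illustration of that dispatch (the system is in LISP), with an invented set of nonterminals.

```python
# Minimal sketch of the driver's dispatch over dotted rules. A rule is a
# tuple of symbols with "*dot" marking progress; the symbol after the dot
# decides which process handles the schema, as in Earley's algorithm.
NONTERMINALS = {"s", "np", "vp"}

def next_symbol(rule):
    """Return the symbol immediately after *dot, or None if complete."""
    i = rule.index("*dot")
    return rule[i + 1] if i + 1 < len(rule) else None

def dispatch(rule):
    """Choose the process for a derived syntactic schema's rule."""
    sym = next_symbol(rule)
    if sym is None:
        return "completer"       # *dot at the far right: rule complete
    if sym in NONTERMINALS:
        return "predictor"       # nonterminal after the dot
    return "scanner"             # terminal symbol such as *term
```

The derived root schema (*dot s *term) thus goes to the predictor first, and the successful final form (s *term *dot) goes to the completer.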
3.3.1 Predictor

When the main driver inspects the status of a derived syntactic schema within the current state set being processed and notices that there is a nonterminal symbol to the right of *dot in the rule within the schema, the predictor process is invoked. The predictor creates a new derived syntactic schema for each composition rule within the model syntactic schema indicated by the nonterminal symbol, and places them in the current state set for future processing.

In the beginning, st0 is the current state set containing the derived root schema with the rule (*dot s *term). Since s is a nonterminal symbol, the predictor is invoked. It retrieves the model syntactic schema for the syntactic category s from the KB, and for each composition rule within the s model syntactic schema, a derived syntactic schema is created. One of the derived syntactic schemata of the s syntactic category created at this point is shown in Figure 3.7.

instance: s0-0
class: s
state: st0
rule: (*dot np vp (nvagree np vp) (sp vp np agent) (sp np vp agent) (checknum vp))
parent: root0-0
potchild: nil
child: nil
word: nil
sem: nil

Figure 3.7: Sample derived syntactic schema

The instance slot stores the name of a derived syntactic schema, which is 's0-0' in this example. Names are constructed following a special rule that allows them to reflect the syntactic class the derived syntactic schema belongs to, its position in the list of derived syntactic schemata of the same syntactic class, and the state set it resides in. The purpose of the naming convention is basically ease of identification during the creation of the PT. This schema is derived from the s syntactic category and is contained in the state set st0, as specified by the class slot and the state slot respectively. The rule slot contains the composition rule that represents the status of this schema in the recognition process.
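The predictor's action of instantiating one derived schema per composition rule can be sketched as follows. This Python fragment is illustrative only: the tiny grammar, the dict representation of a schema, and the name `predict` are all invented stand-ins for the LISP structures described in the text.

```python
# Sketch of the predictor: for each composition rule of the predicted
# nonterminal, create a derived syntactic schema (here a dict) in the
# current state set, recording the predicting schema as its parent.
GRAMMAR = {
    "s": [("*dot", "np", "vp")],
    "np": [("*dot", "det", "n"), ("*dot", "npr")],
}

def predict(nonterminal, parent_name, state_set):
    """Add a derived schema to state_set for each composition rule."""
    for rule in GRAMMAR[nonterminal]:
        state_set.append({
            "class": nonterminal,
            "rule": rule,
            "parent": [parent_name],
            "potchild": [], "child": [], "word": None, "sem": [],
        })
```

Predicting np, for instance, yields two competing derived schemata (determiner + noun, or proper noun), both carried forward in parallel.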
The parent slot records the name of the derived syntactic schema which caused the prediction of this schema. This is necessary so that the completer process knows which derived syntactic schema to complete to when this schema's rule has reached the completion stage. Completion of the rule reflects that the nonterminal symbol s has been captured in the input sentence by the composition rule of the s syntactic category represented in this schema.

The potchild slot records all the derived syntactic schemata that will be predicted from this schema. For example, this and other derived syntactic schemata spawned from 'root0-0' are all potential children of 'root0-0', representing all the possible parses with s as the root. Later, when this schema is processed, the nonterminal symbol np after *dot in the rule will cause all the possible compositions of np to be predicted as derived syntactic schemata; each of these will be placed in the potchild slot of this 's0-0' schema. The purpose of having this slot is to facilitate garbage collection. This schema can be garbage collected when all its potential children in the potchild slot have been deleted from the slot. The entry of a derived syntactic schema in the potchild slot is removed if that schema is garbage collected because its rule was not able to capture the input, thus destroying its potential to be a child of the current schema in the final PT. Alternatively, the entry of the derived syntactic schema in the potchild slot can be removed if that schema has completed its rule and has made a copy of the current schema, thus showing that it no longer needs the original version of the current schema. The copy of the current schema is that schema's new parent; the rule in this copy is updated to reflect the successful completion of that schema in recognizing the nonterminal symbol after *dot.
It is necessary to make a copy of the current schema and update the rule in the copy instead of just updating the rule in the original schema because the original one may have more than one potential child, some of which may complete to it later expecting the rule not to have changed in the meantime.

The child slot is used to record those potential children that have successfully completed their rules. This information enables the system to print out the PT representation at the end of the interpretation process, thus allowing Earley's algorithm to be a parser instead of simply a recognizer in the formal sense. The word slot is used to record the word from the input sentence that is scanned during the processing of the current schema so that it may be printed with the schema that it is associated with in the PT output. Only those derived syntactic schemata that have a terminal symbol in their rules can have the word slot filled. The sem slot provides a link to the NCG by storing the names of those semantic nodes in the NCG that represent the semantics of the current derived syntactic schema.

Lastly, there is one important point that needs to be mentioned with regard to the actions of the predictor. Before the creation of any new derived syntactic schemata, the predictor checks the current state set to see whether those schemata have already been predicted previously and are waiting to be processed. If so, the predictor will not create duplicate derived syntactic schemata, but will instead retrieve those derived syntactic schemata from the current state set and update their parent slots to reflect the fact that there exists another schema which also predicts them. This means that subparses may be shared by multiple PTs, thus yielding space efficiency.
3.3.2 Scanner

The scanner process is invoked when there is a terminal symbol, *term, to the right of *dot in the rule of the derived syntactic schema under processing. The scanner then fetches the next word from the input sentence and passes it to the morpher subprocess so that the root form of the input word can be determined. It then checks whether the resulting root word belongs to the syntactic category specified under the class slot of the current derived syntactic schema. If it does, it partially confirms the validity of the series of derived syntactic schemata that led to the prediction of the current derived syntactic schema. It is only a partial confirmation because there may still exist constraints that need to be satisfied before the derived syntactic schemata just mentioned can participate in the final PT representation.

A derived syntactic schema that has been scanned successfully will be updated and then placed in the next state set. The updating operation involves altering the name of the schema to comply with the naming convention and, most importantly, updating the rule in the schema to reflect its new status. Before scanning, the rule in the derived syntactic schema is (... *dot *term), and after updating, it will be (... *term *dot), which means that all symbols to the left of *dot have been recognized.

A special case that the scanner process has to deal with is when the scanned word is a null character signifying the end of the input sentence. If the derived syntactic schema is the root schema with the rule (s *dot *term), then the scanning is successful, and normal updating as mentioned above is performed. Otherwise, an incorrect prediction has occurred which expects more input and yet input has been exhausted.
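The scanner's check-and-advance step can be sketched as below. This is an illustrative Python fragment, not the thesis's LISP; the two-word lexicon stands in for the morpher, and `scan` is an invented name.

```python
# Sketch of the scanner: check that the scanned word's root belongs to the
# schema's syntactic class, then advance *dot past the terminal symbol.
LEXICON = {"composers": ("composer", "n"), "wrote": ("write", "v")}

def scan(schema, word):
    """Return an updated schema copy on success, or None on failure."""
    entry = LEXICON.get(word)          # stands in for the morpher
    if entry is None or entry[1] != schema["class"]:
        return None                    # unsuccessful scan: garbage collect
    i = schema["rule"].index("*dot")
    rule = list(schema["rule"])
    # (... *dot *term) -> (... *term *dot)
    rule[i], rule[i + 1] = rule[i + 1], rule[i]
    return {**schema, "rule": tuple(rule), "word": word}
```

A None result corresponds to the case where the garbage collector is invoked on the schema and its incorrectly predicted ancestors.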
As a result, the garbage collector is invoked to dispose of the current schema as well as any other related derived syntactic schemata that are involved in the incorrect prediction. The garbage collector is also invoked when an unsuccessful scan has occurred; that is, when the scanned input word does not belong to the syntactic class specified by the current derived syntactic schema or when it cannot be analysed successfully by the morpher.

3.3.3 Completer

The completer process is invoked whenever the derived syntactic schema under processing has *dot at the end of its rule. This usually means that a successful scan occurred when this derived syntactic schema was last processed, and now its parents should be informed that the prediction is correct. So, the first operation of the completer is to retrieve all the parents of the current schema as listed in its parent slot. If required, consistency checks are performed on each parent. If the results are satisfactory, then the parent schema is updated and placed at the end of the queue in the current state set to await further processing. This involves either more prediction to see if the currently predicted parse can be carried to its completion, or completion has actually been reached. In the latter case, a trace back along the prediction path is needed to confirm the success of the prediction.

When a parent is retrieved, a copy is made so that the status of the original will not be affected. The original is needed by other incomplete potential children and thus should not have its status changed. A new name is given to the copy and its rule is updated to reflect the completion of a derived syntactic schema predicted from it. For example, if the rule in this parent copy is (np *dot vp (nvagree np vp) (sp vp np agent) (sp np vp agent) (checknum vp)), then the updated rule will have *dot to the right of vp.
This indicates that a derived syntactic schema of the syntactic category vp has been correctly predicted and is now completing to its parents. Consistency checks are required for this parent since four constraint predicates, (nvagree np vp) (sp vp np agent) (sp np vp agent) (checknum vp), appear after the *dot in its rule. The syntax handler will be invoked to handle the syntactic constraints, and the semantics handler the semantic constraints. *dot will be advanced to the right of each constraint after it has been satisfied. When all constraints are satisfied, the completed derived syntactic schema will be recorded under the child slot of its parent. The potchild slot of this parent, which is the copy, is set to nil since no predictions have been made from it yet. The derived syntactic schemata representing the recognized portion of its rule are listed under its child slot. The original copy of this parent will have the completed derived syntactic schema removed from its potchild slot since it is now a completed child of the copy.

The changes made to Earley's parser are incorporated in the operations of the completer as mentioned above. Whereas Earley's completer just retrieves the parents and updates their rules, our completer also has subprocesses to apply syntactic and semantic constraints in order to incorporate context-sensitive information in the grammar rules in the appropriate places.

Although a completed derived syntactic schema may be shared by multiple parents, its corresponding semantic nodes in the NCG cannot be shared as well, since each parent may be involved in a different interpretation. Thus, copies of semantic nodes must be made to ensure that all the semantic interpretations are distinct and do not interact.
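The completer's constraint-advancing step, where *dot moves past each satisfied constraint predicate, can be sketched as follows. The Python fragment is an illustration only (the system is in LISP); the `evaluate` callback stands in for the syntax and semantics handlers, and `apply_constraints` is an invented name.

```python
# Sketch of the completer's constraint step: after *dot is advanced past a
# completed nonterminal, any constraint predicates (here, tuples) that
# follow the dot are applied in turn; the dot advances past each one that
# is satisfied, and the parse is discarded as soon as one fails.
def apply_constraints(rule, evaluate):
    """Advance *dot past trailing constraint predicates if all succeed.

    `evaluate` maps a constraint such as ("nvagree", "np", "vp") to a
    boolean. Returns the updated rule, or None if some constraint fails."""
    rule = list(rule)
    i = rule.index("*dot")
    while i + 1 < len(rule) and isinstance(rule[i + 1], tuple):
        if not evaluate(rule[i + 1]):
            return None                    # parse discarded
        rule[i], rule[i + 1] = rule[i + 1], rule[i]
        i += 1
    return tuple(rule)
```

A None result corresponds to garbage collection of the derived schemata, both syntactic and semantic, that fail the constraint.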
However, space efficiency is desirable, so some semantic nodes are still shared by different NCGs as long as the interpretations can be kept separate. This handling of multiple NCGs is described in section 3.4.2 on the semantics handler.

The description of the three system control processes has shown that the parsing procedure is nondeterministic. The predictor process predicts all the possible parses simultaneously in a top-down fashion. These multiple parses are incorporated in the state sets. The derived syntactic schemata which reside in the state sets are created as a result of the prediction process. They are retained unless they are found to be incorrectly predicted via the bottom-up action of the scanner and the completer. The derived syntactic schemata which cannot participate in the final PT representation will be deleted from the appropriate state sets. The scanner detects incorrectly predicted derived syntactic schemata by checking the prediction against the words in the input sentence. On the other hand, the completer detects incorrectly predicted derived syntactic schemata by calling the syntax and semantics handlers to apply the necessary constraints. Those derived schemata, both syntactic and semantic ones, that do not satisfy the constraints are then deleted.

3.4 Constraint satisfaction

This section explains how syntactic and semantic constraints are resolved. These constraints are embedded in the composition rules of a model syntactic schema. They are resolved through the actions of the syntax handler and the semantics handler respectively.

3.4.1 Syntax handler

The syntax handler is invoked by the completer whenever the derived syntactic schema under processing has syntactic constraints that need to be satisfied. Syntactic constraints are applied to the syntactic nodes of the PT.
These syntactic nodes are simply the derived syntactic schemata in the state sets from which the final one or more PTs are formed. The main function of these syntactic constraints is to provide context-sensitive information to the system so that an incorrect parse may be found as early as possible. The information that these constraints provide is essential for the parsing of English sentences since they do not conform to a simple context-free grammar.

The syntactic constraints may be divided into two categories. Syntactic constraints of the first category are basically conditions that a particular derived syntactic schema must satisfy, or properties that the derived syntactic schema must possess. This type of syntactic constraint assists the system in deciding which composition rule to follow at the appropriate stage during the interpretation process. The second category of syntactic constraints usually involves two particular derived syntactic schemata. The constraint specifies the number agreement that must exist between these two schemata. This type of syntactic constraint assists the system in deciding which composition rule to discard.

Both constraint types are represented as a list, such as (hasfeature (trans)), where the head of the list is the name of the constraint and the tail of the list gives the arguments to the constraint. The syntax handler's job is to call on the appropriate LISP function to perform the named constraint. It must also pass the required arguments to the LISP function. This usually involves a search for the appropriate derived syntactic schemata to which the constraint is applied. For example, the constraint (nvagree np vp) can only be applied after the syntax handler has located the main noun and the main verb of the sentence.
The main noun, for instance, can be found in the word slot of a derived syntactic schema of class n which is listed in the child slot of another derived syntactic schema of class np. Sometimes a constraint may have an argument that specifies the condition under which it is appropriate to apply the constraint. This type of embedded constraint must be handled by the syntax handler as well. A list of the syntactic constraints, and a brief explanation of each one, is given in Table 3.4.

• Category I syntactic constraints:

rwordis: The current derived syntactic schema must be associated with an input word whose root form is specified in the argument list of this constraint.
hasfeature: The argument of this constraint specifies the particular syntactic feature that the word in the word slot of the current derived syntactic schema must have.
gparhasno: The argument of this constraint specifies a syntactic class that the rule of the grandparent of the current derived syntactic schema must not have.
gparhas: The opposite of the above constraint.
isposs: The word in the word slot of the current derived syntactic schema must be in possessive form.
isnotposs: The opposite of the above constraint.

• Category II syntactic constraints:

nvagree: Number agreement must exist between the main noun and the main verb.
vvagree: Number agreement must exist between the auxiliary verb and the main verb.
dnagree: Number agreement must exist between the determiner and the main noun.

Table 3.4: Table of syntactic constraints

3.4.2 Semantics handler

The semantics handler is invoked whenever the completer process encounters semantic predicates in the rule of the derived syntactic schema under processing.
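The search-then-check pattern behind a Category II constraint such as (nvagree np vp) can be sketched as follows. This Python fragment is illustrative only: schemata are modeled as dicts, the injected `number_of` function stands in for the morpher's number analysis, and `find_word`/`nvagree` are invented names.

```python
# Sketch of the syntax handler applying (nvagree np vp): locate the main
# noun under the np schema and the main verb under the vp schema (via the
# child slots), then compare their number features.
def find_word(schema, target_class):
    """Search a schema's children recursively for a word of target_class."""
    for child in schema.get("child", []):
        if child["class"] == target_class:
            return child["word"]
        found = find_word(child, target_class)
        if found:
            return found
    return None

def nvagree(np_schema, vp_schema, number_of):
    """Number agreement between the main noun and the main verb."""
    noun = find_word(np_schema, "n")
    verb = find_word(vp_schema, "v")
    return number_of(noun) == number_of(verb)
```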
Semantic predicates may be constraints that need to be applied to the semantic nodes in the NCG, or they may be commands that specify how the NCG is built and linked to its corresponding derived syntactic schemata. The semantics handler also manages the creation of multiple NCGs to capture multiple semantic interpretations of an input sentence.

Each semantic predicate is in the form of a list similar to a syntactic constraint, with the name of the predicate given in the head of the list and the arguments of the predicate given in the tail. If the semantic predicate is a constraint, then the semantics handler has to perform a search operation similar to that of the syntax handler in finding the appropriate semantic nodes to which to apply the constraint. There are four semantic predicates, given in Table 3.5. Build and setptr are commands for the semantics handler, whereas sp and checknum are semantic constraints.

build: Specify what type of semantic node to build and which derived syntactic schema it should correspond with.
setptr: Specify additional links between syntactic and semantic nodes, and when to create copies of semantic nodes to allow multiple NCGs to exist separately.
sp: A semantic constraint that causes the execution of arc consistency between semantic nodes. As a result, links between semantic nodes are established and made consistent.
checknum: A semantic constraint on the number of labels in the label set of a semantic node as applied by a determiner.

Table 3.5: Table of semantic predicates

The checknum constraint is the simplest of the four. It simply checks the cardinality of the label set of a semantic node to see if it conforms with what the determiner indicates it to be. The operations of the semantics handler with regard to each of the other predicates are explained below.

The build predicate

The build predicate is applicable after an input word has been scanned.
It builds one or more semantic nodes to represent the additional semantic entities that the newly scanned word has introduced. A semantic node is a derived semantic schema, which is an instantiation of a model semantic schema in the KB. The form of a semantic node for the scanned word 'composer' is given in Figure 3.8.

snode: sem6
rword: composer
lset: (composer)
comps: nil
rcomps: nil
filledc: nil
rfilledc: nil
syn: (subnp4-3)

Figure 3.8: Sample derived semantic schema

The name of the semantic node is given in the snode slot. Names are numbered in the sequence in which they are created, and do not have any special significance as names of syntactic nodes do. The rword slot holds the root form of the scanned input word that has triggered the building of this node. It provides access to the model semantic schema associated with this derived semantic schema so that the case list and schema relations of the model one can be retrieved when needed. The lset slot initially contains the label set of the associated model semantic schema. The label set is refined as the interpretation process proceeds, and only consistent labels that satisfy all constraints applied to them are retained. The label set represents the interpretation of the semantic node. The comps slot holds a list of semantic nodes which are the components of this node. A semantic node is a component of another if it applies a constraint on the other semantic node. Thus, a change in the label set of a semantic node will affect all the semantic nodes that have it as a component. The rcomps slot holds a list of semantic nodes which have this semantic node as a component. These two slots allow the semantics handler to have access to all the semantic nodes that are linked to this node. The filledc and rfilledc slots provide case information for the nodes listed in the comps and rcomps slots respectively.
Figure 3.9 illustrates this case information through the example of a semantic node which denotes a verb.

snode: sem5
rword: write
lset: (compose10 compose11 compose12)
comps: (sem4 sem0)
rcomps: nil
filledc: ((sem4 (obj)) (sem0 (agent)))
rfilledc: nil
syn: (vp0-4)

Figure 3.9: Sample derived semantic schema for a verb

In Figure 3.9, the component, 'sem4', fills the objective case of the verb write, and 'sem0' fills the agent case. These two components have refined the labels in the lset slot to only three composing events, from the original list of all composing events when the node was first built. Lastly, the syn slot specifies the derived syntactic schema or schemata with which this semantic node is associated. The argument to the build predicate may specify the syntactic class of the derived syntactic schema with which this semantic node should associate. If unspecified, it defaults to the derived syntactic schema currently being processed by the completer which contains the build predicate in its rule.

The setptr predicate

The setptr predicate is applied when the list of semantic nodes of a completed derived syntactic schema should be incorporated into its parent's list. There are three special cases in which the setptr predicate applies:

1. A child has only one parent and the parent does not have other potential children.
2. A child has more than one parent to which to complete.
3. A child is completing to a parent which has other potential children.

The first case indicates that there is only one interpretation or parse active at the time; the other two cases indicate the existence of multiple parses or subparses. When multiple interpretations are involved, the current NCG needs to be split to capture each interpretation. This involves the copying of semantic nodes and the creation of a new NCG.
In the first case, with only one interpretation existing, the setptr predicate can simply be executed by updating the parent's sem slot to include the semantic nodes in the sem slot of its child. Also, the syn slot of all the semantic nodes must be updated to indicate a new link to another derived syntactic schema. These additional links are necessary so that the system control which is provided via the syntactic parsing process can be carried over to control the formation of the semantic representations, the NCGs.

This simple case of having only one active predicted parse or subparse seldom occurs. Usually, one or both of the remaining cases occur during a completion process. In the second case, each parent of the child represents a different parse or subparse. Even though the child schema can be shared in the syntactic description, its corresponding semantic nodes cannot be shared, since each parse represents a different semantic interpretation. Thus, appropriate actions must be taken to ensure that each parent has its own corresponding NCG. Similarly, in the third case, each potential child represents a different parse or subparse. As one of these potential children completes to its parent, a copy of this parent is made, as explained in Section 3.3.3. Since each copy is involved in a different interpretation, its set of semantic nodes must be kept separate from that of other copies.

Copies of semantic nodes must be made in the two cases just mentioned to keep the semantic interpretations separate. However, each semantic node is linked to other semantic nodes in an NCG. If a copy of a semantic node is needed, all the linked semantic nodes that cannot be shared between NCGs must also be copied. The result is a splitting of a single NCG description into multiple NCG descriptions. Figure 3.10 illustrates how this is done.
For the sake of readability, we use a hypothetical example to demonstrate the actions involved in splitting an NCG description. Here, semantic nodes have alphabetic names and only the comps and rcomps slots are shown. These slots contain the information on the links in an NCG and are the only slots that are affected in the splitting; thus we include them in the diagram to illustrate how they get updated.

[Figure 3.10 shows the hypothetical NCG before the split and the original and new NCGs after the split.]

Figure 3.10: Splitting a hypothetical NCG description

The example assumes that node A is a semantic node that needs to be copied. Thus, node A1 is created as a copy. However, the fact that node A is connected to other nodes in the network means that other nodes that cannot be shared by two interpretations must be copied and linked appropriately as well. Basically, if a node is copied, then all the nodes listed in its rcomps slot need to be copied as well, but the nodes listed in its comps slot need not be copied and can be shared. For example, node A's comps slot has nodes B and D, and its rcomps slot has nodes C and D. Thus, copies are made for C and D, whereas none is made for B. Copies are necessary for C and D because they have node A as a component which applies constraints on them, and if A's lset slot is changed, C and D's lset slots must be updated accordingly. Therefore, if node A has a copy A1 which is supposed to represent a separate interpretation, then C and D should not still be linked to A1, since they should not let their lset slots be affected by A1's interpretation as well. As a result, copies of C and D are made.
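The copy propagation in this example can be sketched as follows. This is an illustrative Python rendering (the real system manipulates Franz Lisp schemata, and also updates the lset, filledc and rfilledc slots, which are omitted here); the copy-naming convention is an assumption. Copying a node forces copies of every node reachable through rcomps, while the nodes in comps remain shared between the two NCGs.

```python
def split_ncg(node, nodes, copies=None):
    """Copy `node` and, recursively, every node that has it as a
    component (its rcomps); nodes in comps stay shared.
    `nodes` maps names to dicts with 'comps' and 'rcomps' lists."""
    if copies is None:
        copies = {}
    if node in copies:
        return copies
    copy_name = node + "1"
    copies[node] = copy_name
    # comps are shared: the copy initially keeps the same component links
    nodes[copy_name] = {"comps": list(nodes[node]["comps"]), "rcomps": []}
    for parent in nodes[node]["rcomps"]:
        split_ncg(parent, nodes, copies)          # propagate the copying
        pcopy = copies[parent]
        # relink the copied parent to copied children where copies exist
        nodes[pcopy]["comps"] = [copies.get(c, c)
                                 for c in nodes[parent]["comps"]]
        nodes[copy_name]["rcomps"].append(pcopy)
    return copies

# The example of Figure 3.10: A has comps (B D) and rcomps (C D).
nodes = {"A": {"comps": ["B", "D"], "rcomps": ["C", "D"]},
         "B": {"comps": [], "rcomps": ["A"]},
         "C": {"comps": ["A"], "rcomps": []},
         "D": {"comps": ["A"], "rcomps": ["A"]}}
copies = split_ncg("A", nodes)
# C and D are copied along with A; B is shared by both NCGs.
```

As in the text, the copy A1 ends up with the shared node B and the copy D1 as its components, and with the copies C1 and D1 in its rcomps slot.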
To summarize, the action of copying one semantic node usually leads to a propagation of the copying action until the two interpretations can be represented by two separate NCGs that do not interact, although they may share certain semantic nodes. An efficiency of representation is achieved with this sharing of nodes while keeping the interpretations separate.

The sp constraint

The sp constraint is applied whenever a link has to be established between two or more semantic nodes in the NCG. The arguments of the constraint give guidelines to the semantic handler in finding the semantic nodes that should be linked, and the relationship between them. Linking two nodes means one node is made a component of the other. The semantic handler then records the information concerning each link in the appropriate comps, rcomps, filledc and rfilledc slots of the two semantic nodes involved. In addition, it runs arc consistency to ensure that any inconsistencies introduced by the additional links are removed. Inconsistencies that cannot be resolved result in the constraint not being satisfied.

Consistency checking is basically a filtering mechanism for maintaining consistent label sets, which may also refine the label sets gradually to reflect a global interpretation if enough constraints exist. The network consistency algorithm that is adopted by our system is an implicit version of hierarchical arc consistency (HAC) [Mackworth et al., 1985]. In HAC, constraints are represented by relation matrices of leaf labels. These are explicitly established and preprocessed at the outset. In our system, the constraints are represented implicitly in the schema relations.

Network consistency in our system works as follows. For a link inserted between each pair of semantic nodes, one node is made the component of the other; say, for example, that node B is made a component of node A.
Each label in the label set of A is checked to see whether the addition of B as a component has made the label inconsistent. If so, that label is deleted from A's label set, and every neighbour of A in the network that has A as a component must also be checked for consistency. Thus, consistency checks propagate throughout the network to ensure that all labels in the label sets of all nodes are consistent, with the inconsistent ones removed in the process.

A label of node A is considered consistent if it, or at least one of its descendants in its specialization hierarchy of labels, is compatible with at least one label in the label set of node B. Compatibility basically means equality. Two labels are compatible if they are equal in name, or if one of them is compatible with a member of the schema relation of the other label under the specified case. For example, the label 'Verdi' is compatible with the label 'compose21', whose schema relation is (compose21 Verdi Otello), if the specified case is 'agent', since 'Verdi' is a member of its schema relation and resides in the agent case slot. If a fabricated label resides in that slot instead of 'Verdi', then the semantic handler has to fetch the fabricated label's schema relation to check for compatibility, and a recursive process is involved until an atomic label is reached to allow an atomic name comparison.

If a label is incompatible, its descendants at the next lower level in its specialization hierarchy will be fetched to replace it in the label set. If at least one of its descendants is consistent with node B's label set, consistency is achieved for that original label. The checking for consistency in the hierarchy of a label and its descendants is essentially HAC.

Constraints are predominantly given by the schema relations of verbs and prepositions. However, noun modifiers also apply constraints on nouns.
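The label-checking step just described — keep a label if it is compatible with some label of the component node, otherwise replace it by whichever of its hierarchy descendants survive — can be sketched roughly as follows. This is an illustrative Python sketch, not the system's Franz Lisp code; `hierarchy` and `compatible` stand in for the KB's specialization hierarchy and the schema-relation compatibility test, and the toy data uses the music labels discussed in this chapter.

```python
def revise(a_lset, b_lset, hierarchy, compatible):
    """Refine node A's label set against component node B's.
    A label survives if it is compatible with some label of B;
    otherwise it is replaced by whichever of its descendants at
    the next level of the specialization hierarchy survive -
    the essence of hierarchical arc consistency (HAC)."""
    result = []
    for label in a_lset:
        if any(compatible(label, b) for b in b_lset):
            result.append(label)
        elif label in hierarchy:               # expand the label and retry
            result.extend(revise(hierarchy[label], b_lset,
                                 hierarchy, compatible))
    return result

# Toy hierarchy of music labels; compatibility reduced to name equality.
hierarchy = {"voc-music": ["oratorio", "opera", "mass"],
             "ins-music": ["ballet", "symphony", "concerto",
                           "key-music", "pro-music"]}
same_name = lambda x, y: x == y
refined = revise(["voc-music", "ins-music"], ["key-music"],
                 hierarchy, same_name)
# refined is now ["key-music"]
```

Neither abstract label matches directly, so both are expanded; only the descendant 'key-music' survives, and all the other descendants are deleted as incompatible.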
Adjectives and possessive nouns are the two types of modifiers that the system can handle. They refine the label set of the noun that they are modifying to only those labels to which the noun modifiers can be applied.

3.5 Morpher

The morpher is a subprocess of the scanner. It performs suffix analysis on an input word. Words that are not root words themselves but are formed in a regular way from their roots are analysed by the morpher to have the root form determined. These words need not be entered into the KB explicitly. However, words that are formed in an irregular way must be placed in the KB. The morpher is given a table of possible suffixes and the syntactic features associated with each. It can determine the root form of the input word, the part of speech, and other syntactic features of the input word so that a semantic schema can be created for it dynamically. If a word cannot be analysed, the user is asked to either respell it, give a definition for it, or abort the interpretation process. The definition must be given in the format of a semantic schema.

3.6 Garbage collector

Over the course of the interpretation process, many derived schemata, both syntactic and semantic, are created. However, only some will actually be part of the final PT and NCG representations. Therefore, for reasons of space efficiency, a garbage collector is provided in our system to reclaim the space used by derived schemata once they are found to be incorrectly predicted and do not play a part in the global interpretation. The garbage collector is invoked whenever one of the three main processes of the system discovers that a derived schema is inappropriate. The garbage collection process works differently depending on which process is calling it. The following is a description of the actions of the garbage collector in each situation.
After scanning, a derived syntactic schema may be found to have incorrectly predicted the input word. Thus, this derived syntactic schema is garbage collected. In addition, all the derived syntactic schemata that are related to this prediction, and are not at the same time involved in an alternate prediction, are garbage collected. This involves tracing the prediction path to all the ancestors of the current derived syntactic schema and garbage collecting those which cannot participate in another parse. For each ancestor that can be garbage collected, its completed children may also be garbage collected if they are not shared by other derived syntactic schemata. As a result, the garbage collector traverses up and down the potential PT looking for derived syntactic schemata that can be disposed of. At the same time, derived syntactic schemata that are related to the garbage collected ones must have the appropriate slots in their schemata updated to delete their connections to the garbage collected schemata.

A similar process is performed when a derived syntactic schema cannot complete successfully to its parent because some constraints are not satisfied. In this case, the garbage collector must collect the related semantic nodes as well. Again, all connected semantic nodes that do not participate in another NCG description can also be garbage collected. Those that cannot be garbage collected must have their comps, rcomps, filledc and rfilledc slots updated to reflect the discontinued links.

The garbage collector can also be invoked when the predictor process is called to make further predictions after input has already been exhausted. In this case, the derived syntactic schema under processing cannot possibly be part of any final interpretation, and it is garbage collected accordingly.
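The up-and-down traversal described for the first situation can be sketched roughly as follows. This is an illustrative Python fragment: the dict-based schema layout and the `shared_elsewhere` predicate (deciding whether a schema participates in another live parse) are assumptions, and the real collector also updates the slots of surviving neighbours and handles semantic nodes.

```python
def collect(schema, shared_elsewhere, reclaimed=None):
    """Garbage collect a failed derived syntactic schema, the
    ancestors on its prediction path that cannot take part in
    another parse, and any of their children not shared by
    surviving schemata."""
    if reclaimed is None:
        reclaimed = set()
    if schema["name"] in reclaimed or shared_elsewhere(schema):
        return reclaimed                       # stop at shared schemata
    reclaimed.add(schema["name"])
    for parent in schema["parents"]:           # trace the prediction path up
        collect(parent, shared_elsewhere, reclaimed)
    for child in schema["children"]:           # reclaim unshared children
        collect(child, shared_elsewhere, reclaimed)
    return reclaimed

# A toy prediction path s1 -> np1 -> det1, where det1 mispredicts the
# input word and s1 also belongs to an alternate prediction:
s1 = {"name": "s1", "parents": [], "children": []}
np1 = {"name": "np1", "parents": [s1], "children": []}
det1 = {"name": "det1", "parents": [np1], "children": []}
np1["children"] = [det1]
s1["children"] = [np1]
shared = lambda sch: sch["name"] == "s1"
reclaimed = collect(det1, shared)
# det1 and np1 are reclaimed; s1 survives because it is shared.
```

The traversal stops as soon as it reaches a schema that participates in another parse, which is what keeps shared structure intact.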
Chapter 4

Discussion of examples

The NLU system is written in FRANZ LISP and runs on a VAX 11/780 under the UNIX operating system. The grammar and semantic knowledge base are given in the appendix. The system can handle simple declarative and interrogative sentences. Approximately twenty complete sentences and ten sentence fragments were tested successfully. Four of these and their corresponding PT and NCG representations are illustrated in this chapter. A brief discussion is given for each example. The examples show how constraint propagation works and how the label sets of semantic nodes are refined incrementally to give a globally consistent interpretation.

The first two examples illustrate how noun modifiers are represented and manipulated in the system. Both of these examples are sentence fragments. The third example shows how determiners and prepositional phrases are handled. The last example demonstrates how multiple representations are generated for a sentence which has multiple interpretations. The last two examples are complete sentences. A diagram showing the PT and NCG representations in network form is given for each example to facilitate understanding of the system's behaviour.

The contents of the syntactic and semantic nodes in the PT and the NCG are not shown in full in the diagrams, to increase readability. Only information essential to the understanding of the diagrams is included. Leaf nodes of the PT indicate each input word that is scanned. The rword and lset slots of semantic nodes are listed beside the nodes in the diagrams. Also, the schema relations of the abstract labels in the label sets of semantic nodes are listed, even though they are not explicitly represented in each semantic node. They reside only in the model semantic schemata in the KB, but are listed in the diagrams to aid the reader in interpreting the fabricated labels.
Figure 4.1 gives the representation of the sentence fragment 'famous keyboard music'. The PT is shown on the left and the NCG on the right in the diagram. This example shows how adjectives affect the label set of a noun that they modify. In this noun phrase, both adjectives, 'famous' and 'keyboard', provide constraints on the noun, 'music'. As the parsing process scans the words from left to right, the semantic nodes, 'sem3', 'sem4' and 'sem5', are created along with their corresponding syntactic nodes. The label set of the noun 'music' is originally (voc-music ins-music). When 'subnp3-3' has established both 'mods2-2' and 'n3-3' as its children, the sp constraint in its composition rule is applied. It specifies the addition of links between the semantic node representing the noun and the semantic nodes representing the adjectives. Consistency checks are performed when the links are established. This results in the label set of 'sem5' being refined to (key-music), to represent music which is famous and of type keyboard.

At first glance, neither 'voc-music' nor 'ins-music' in the original label set of 'sem5' matches the label 'key-music' in the label set of 'sem3'. However, the labels (oratorio opera mass)

[Figure 4.1 shows the PT and the NCG side by side. The diagram legend explains the notation: an arrow from a node A to a node B denotes that B is a component of A, meaning A's label set is constrained by B's, and every label in A's label set must be compatible with at least one label in B's label set; a case name k on a link indicates which case slot one node fills in the schema relations of the other.]
Figure 4.1: Representation for 'famous keyboard music'

and (ballet symphony concerto key-music pro-music) exist under the hierarchy of the two abstract labels, 'voc-music' and 'ins-music', and 'key-music' is a member of one of these lists. The consistency propagation process then causes these descendant labels to be retrieved and to replace (voc-music ins-music). All except the label 'key-music' are then deleted from the label set of 'sem5' because of incompatibility.

The second example, Figure 4.2, shows how possessive nouns differ from adjectives in the way they apply constraints on the main noun. A comparison of Figure 4.1 and Figure 4.2 illustrates this. In Figure 4.1, each adjective applies constraints on the main noun independently. In Figure 4.2, a chain relationship is established where each possessive noun only applies constraints on the immediately following noun. In the system, the noun phrases, 'father of Mozart' and 'Mozart's father', are equivalent in meaning. Although the syntactic structures generated for these two noun phrases are different, the semantic representations are the same. Thus, the NCG in Figure 4.2 represents both the noun phrases, 'Mozart's father's birthday' and 'birthday of the father of Mozart'.

At the beginning of the recognition process, multiple syntactic nodes of the syntactic class, subnp, are active. Each node is predicted to correspond with a particular composition rule that resides in the model syntactic schema of subnp. When 'Mozart's' is scanned from the input, only the syntactic nodes that are derived from composition rules with the grammar constituent, mods, are retained. The semantic nodes, 'sem0' and 'sem1', are built at this point. An sp constraint is applied to create a link between the two nodes to represent all the objects that are 'of Mozart'; for example, 'father of Mozart', 'music of Mozart' and 'birthday of Mozart'.
[Figure 4.2 shows the PT for 'Mozart's father's birthday' and the corresponding NCG, in which 'sem0' (rword: Mozart, lset: (Mozart)), 'sem1' (rword: of, lset: (of1 of8 of9 of27), with schema relations such as (of1 Leopold Mozart) and (of27 1756 Mozart)), 'sem2' (rword: father, lset: (Leopold)), 'sem3' (rword: of, lset: (of28), schema relation (of28 1719 Leopold)) and 'sem4' (rword: birthday, lset: (1719)) are linked in a chain.]

Figure 4.2: Representation for 'Mozart's father's birthday'

The label set of 'sem1' is refined from the list of all 'of' relationships to those with 'Mozart' as the agent in the schema relations. Subsequent scanning of the input word 'father's' results in the building of semantic nodes 'sem2' and 'sem3'. A similar link is established between these two nodes to reflect, in the label set of 'sem3', all the 'of' relationships pertaining to all the fathers in the KB. As the parsing process discovers that there are no more possessive nouns, the link between 'sem2' and 'sem1' is made. Consistency checks then refine the label set of 'sem2' to just the father of Mozart, Leopold. Lastly, the input word 'birthday' is scanned and 'sem4' is built. When 'mods1-2' and 'n4-3' have both completed to 'subnp3-3' successfully, the link between 'sem4' and 'sem3' is made. The label set of 'sem4' is then refined from the list of birthdays of every person in the KB to only the birthday of Mozart's father.

The third example is a complete sentence with a prepositional phrase and a determiner. Figure 4.3 shows the representation for this sentence, 'which three sonatas were written by Beethoven'. Multiple prepositional phrases also apply independent constraints on a verb or noun, as in the case of multiple adjectives.
Both the PT and the NCG are built incrementally, as in previous examples. The input words, 'which' and 'sonatas', cause the semantic nodes 'sem0' and 'sem1' to be built after the words have been scanned. The determiner, 'three', does not cause a semantic node to be constructed. Instead, it applies a number constraint on the cardinality of the label set of 'sem1' after the entire NCG has been built. In this case, there are exactly three sonatas in the label set; therefore, the constraint is satisfied. The input word, 'which', initially represents all the objects intensionally via its label set. Thus, the label set of 'sem0' is originally (composer music father date place). It is refined to the list of all sonatas when the input word 'sonatas' is scanned and a link is made between 'sem0' and 'sem1'.

[Figure 4.3 shows the PT and the NCG, in which 'sem0' (rword: which), 'sem1' (rword: sonata), 'sem5' (rword: Beethoven, lset: (Beethoven)), 'sem7' (rword: by, lset: (by10 by11 by12), with schema relations (by10 The-Pathetique-Sonata Beethoven), (by11 The-Moonlight-Sonata Beethoven) and (by12 The-Appassionata Beethoven)) and 'sem8' (rword: write, lset: (compose10 compose11 compose12)) appear; the refined label sets of 'sem0' and 'sem1' each contain the three sonatas.]

Figure 4.3: Representation for 'which three sonatas were written by Beethoven'

The node, 'sem8', representing the verb 'write', is built after the words 'were written' are scanned. Its label set initially contains all writing events, which are represented by schema relations.
It is refined to those writing events that have Beethoven as the agent after the prepositional phrase has been scanned and a link is made between 'sem8' and 'sem7'. The node, 'sem7', has its label set refined to those fabricated labels whose schema relations have Beethoven as an agent when the link between 'sem7' and 'sem5' is made. When all the grammar constituents of 's2-7', the sentence derived syntactic schema, have completed to it, the link between 'sem8' and 'sem1' is made. The label set of 'sem1' is then refined to only those sonatas written by Beethoven. Since 'sem1' is a component of 'sem0', the constraint propagation process causes the label set of 'sem0', which now contains all the sonatas in the KB, to be further refined to the same sonatas that are listed in the label set of 'sem1'. This actually provides the answer to the question.

The last example, illustrated in Figures 4.4 and 4.5, is also a complete sentence. The sentence is 'who composed the Messiah in Dublin'. It is chosen to demonstrate how multiple PT and NCG representations are generated for a sentence which has multiple parses or interpretations. In this case, the ambiguity lies in the attachment of the prepositional phrase to either the noun or the verb. Two diagrams are drawn separately for this example for clarity, although some nodes in the two PTs and NCGs are actually shared. The two interpretations yield different syntactic as well as semantic structures. In both diagrams, a box is drawn around the semantic nodes representing the prepositional phrase to help identify the two possible attachments of those nodes to the rest of the NCG. The semantic nodes, 'sem13' and 'sem12', have (temp locn) shown as a case slot to be filled by 'sem10' in the diagram. This actually means that 'sem10' should fill either the temporal or the locative case of the preposition 'in' within the two different possible NCGs of Figures 4.4 and 4.5 respectively.
'sem12' also has a case slot shown as event(obj), which is actually a combination of two cases. This means that 'sem14' should fill the 'obj' slot of the fabricated label 'compose1', which itself fills the event slot of 'in2' in 'sem12'.

[Figure 4.4 shows the first PT and NCG, in which the prepositional phrase is attached to the verb.]

Figure 4.4: A representation for 'who composed the Messiah in Dublin'

At the beginning of the parsing process, there are multiple active syntactic nodes for the syntactic category s. After the word 'who' has been scanned from the input, only two sets of predictions that involve an interrogative sentence are retained. The semantic node, 'sem16', is then built and has the label set, (composer), which represents all the people in the KB intensionally.

[Figure 4.5 shows the second PT and NCG, in which the prepositional phrase is attached to the noun; here the node for 'who' is refined to lset: (Handel), the node for 'compose' to lset: (compose3) with schema relation (compose3 Messiah in2), the node for 'Messiah' has lset: (Messiah), 'sem12' (rword: in) has lset: (in2), and 'sem10' (rword: Dublin) has lset: (Dublin).]

Figure 4.5: A second representation for 'who composed the Messiah in Dublin'

One set of predictions for an interrogative sentence predicts a np to follow the question word, and the other set predicts a vp to follow. When the word 'composed' is scanned, the syntactic nodes involved with the incorrect prediction of a np are garbage collected. The correct prediction of a vp thus causes 'sem15' to be built to represent the verb, 'compose', with a label set of all composing events. Multiple vp syntactic nodes are active at this point to indicate the different possible verb phrases that may exist. 'vp0-6' and 'vp1-6' in the two diagrams respectively are two of these possible predicted syntactic nodes for representing a vp. Next, 'Messiah' is scanned from the input and a semantic node representing it is built.
At this point, there is still only one NCG, with three unconnected semantic nodes representing the words, 'who', 'compose' and 'Messiah', respectively. Since a np may or may not contain a prepositional phrase, there are two branches of predictions involved. One branch completes the np to its vp parent, and the other is involved in further predictions of prepositional phrases attached to a noun phrase. The np that completes has two parents, 'vp0-6' and 'vp1-6'; therefore, multiple parses are possible and the NCG representation is split at this point. Copies of semantic nodes are made to allow for the splitting. The link between the semantic node representing 'Messiah' and the semantic node representing 'compose' is made in both NCGs.

As the interpretation process continues and the prepositional phrase is scanned, semantic nodes are built to represent the pp. The completion of the pp semantic structure causes a copy of it to be made, since it has two parents, one a np and the other a vp. The link between 'sem15' and 'sem13' in Figure 4.4 is made at this point. The link between 'sem14' and 'sem12' for the other NCG in Figure 4.5 is also made at the same time. Input has now been exhausted and both vp syntactic nodes complete to their parents. Since two parses exist, the syntactic node for s is copied, each copy having a different vp syntactic node as its child. The semantic node for the question word 'who' is also copied at this point. One copy is linked to 'sem15' in one NCG, and the other is linked to 'sem17' in the other NCG. In both cases, the label set of the semantic node representing 'who' is refined from the label set, (composer), which represents all composers intensionally, to just (Handel). This yields the answer to the question. Generally, multiple parses do not have to yield the same answer.
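The determiner constraint used in the third example ('which three sonatas were written by Beethoven') is the simplest of the predicates to state. A minimal sketch in Python, assuming the expected cardinality has already been extracted from the determiner:

```python
def checknum(sem_node_lset, expected):
    """checknum: succeed iff the cardinality of a semantic node's
    label set matches what the determiner indicates it to be."""
    return len(sem_node_lset) == expected

# After the NCG is built, the sonata node's label set has been
# refined to exactly three sonatas, so the constraint applied by
# the determiner 'three' is satisfied.
sonatas = ["The-Pathetique-Sonata", "The-Moonlight-Sonata",
           "The-Appassionata"]
ok = checknum(sonatas, 3)   # True
```

Because the constraint is applied only after the rest of the NCG has refined the label set, it acts as a final check on the interpretation rather than as a filter during it.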
Chapter 5

Conclusions and Future Directions

In this chapter, the merits of the schema and constraint-based approach to understanding English sentences are reviewed. In addition, possible extensions to the system are mentioned.

Traditional approaches to understanding English sentences have followed a linear paradigm that involves three phases in the analysis of a sentence. First, the input sentence is parsed by a syntactic parser to form a parse tree; then a semantic analyser takes the parse tree as input and produces a database query in a prespecified formal language. Lastly, this query is analysed by an evaluator which finds the answer from the given database. This approach is still adopted by many researchers in logic programming, as described in Section 2.2. The major drawback of this approach is its inability to deal with ambiguities efficiently. If there are multiple possible parses for a sentence, then one must be selected to allow the processing to continue. In the event that this parse is incorrect, as discovered later by the semantic analyser or the database evaluator, it is necessary for the system to back up and find another parse. A reliance on automatic backtracking makes this approach highly inefficient because of the thrashing behaviour of backtracking [Mackworth, 1977a]. Similarly, if the ambiguity exists in words having multiple senses, then one word sense must be chosen for each ambiguous word, and again backtracking is utilized if the wrong word senses have been chosen first.

Our approach handles both structural and word sense ambiguities much more gracefully and efficiently. All possible parses are explored in parallel, and thus no backtracking is ever needed. Multiple word senses are all incorporated in the label set of a semantic schema.
The inappropriate senses that do not satisfy all the existing constraints are removed through the application of network consistency. This approach remains non-committal towards any of the ambiguous parse tree structures and word senses until enough evidence is found to decide on the correct interpretations. Although all parses and word senses are processed in parallel, Earley's algorithm is still efficient. It is the sharing of structures in both the PT and the NCG that makes this parallel approach feasible. In addition, label sets represent objects of the world intensionally, further contributing to the efficiency of this approach.

This thesis indicates that this non-committal approach of taking into account all parses and all word senses until the input evidence says otherwise is more plausible than the traditional selective processing and backtracking approach. The traditional approach involves a three-pass process, whereas the schema and constraint-based approach is a one-pass, non-deterministic process where syntactic and semantic interpretations are carried out in parallel. In fact, this one-pass versus three-pass processing is part of a long-standing argument in the linguistics field about whether semantics affects parsing. [Flores d'Arcais & Schreuder, 1983] give a brief history of this issue, which is summarised below.

In the past, computing a structural description for a sentence was considered to be vital to the understanding of a sentence, and syntax was the main concern. Semantics was considered by linguists to be computed using a syntactic representation. This forms the basis of the three-pass processing approach. However, there has been an increasing interest in the relationships of grammatical representations to meaning structures, and models of sentence processing where semantics affects parsing became popular.
In spite of this, linguists themselves are not unanimous as to whether semantic processing should be separated from or intermingled with syntactic processing. Also, it is still an issue in psycholinguistics as to whether syntactic analysis of sentences constitutes a separate and autonomous stage in their perception. As Fodor says, "linguistic form recognition can't be context-driven because context doesn't determine form", and he knows of no convincing psychological evidence "that syntactic parsing is ever guided by the subject's appreciation of semantic context or of 'real world' background" [Fodor, 1983]. It is evident, however, from the point of view of efficiency in sentence processing, that semantic information should be brought in as soon as possible in order to limit search.

Thus, this thesis takes a middle-ground approach which splits the difference between strictly autonomous parsers and contextually driven ones. Our system builds the syntactic and semantic representations in parallel under the guidance of composition rules which contain grammatical constituents as well as syntactic and semantic predicates. The interpretation process is never driven by the semantic predicates, but if the semantic constraints are not satisfied, a possible parse under prediction by the grammar rules can be aborted. Therefore, although semantic information is never actually used to predict syntactic structures, it is introduced as soon as possible to help the system narrow down its search for the correct parse. Fodor mentions in [Fodor, 1983] that all the results on context effects in parsing that he knows of are compatible with this approach, and he believes that an approach of this flavour will ultimately prove to be valid.

A one-pass, non-deterministic approach is appropriate for resolving sentence ambiguities.
However, predicting and carrying all possible parses and interpretations simultaneously can be inefficient in terms of space. There is a space and time tradeoff involved. This approach tries to limit hypotheses on syntactic and semantic representations by eliminating and garbage-collecting the inappropriate predictions as soon as possible. The application of syntactic and semantic constraints with the use of network consistency ensures that incorrect hypotheses are eliminated as soon as input evidence precludes their existence. This process is known as specialization in Havens' theory of schema labelling [Havens, 1985].

Havens' theory also stresses the importance of combining the knowledge of composition with the specialization process. Our system follows this approach. The knowledge of composition in our system exists in the schema knowledge base, where the composition rules guide the interpretation process. Specialization is seen in the application of network consistency in refining the hypotheses to produce the final interpretation. The use of consistency techniques without knowledge of composition is inefficient, as noted earlier. In a similar manner, a schema approach with knowledge of composition but without specialization is also inefficient. A schema approach may provide hypotheses about the composition of a sentence; however, it needs to search for the correct hypothesis. Backtracking may be utilized, but it is highly inefficient. On the other hand, network consistency techniques are shown to be more efficient. Thus, combining the schema approach with network consistency techniques is a natural thing to do. In particular, we have chosen arc consistency for the specialization process because it is achievable in time linear in the number of constraints [Mackworth & Freuder, 1984].
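A minimal sketch of the arc consistency step (an AC-3-style formulation rather than the exact algorithm of the implementation; the variable names and integer domains are invented) shows how specialization deletes labels whenever a constraint precludes them:

```python
from collections import deque

def ac3(domains, constraints):
    """Revise each arc (i, j): keep only values of i that have some
    support in j; when a domain shrinks, re-queue the arcs into i."""
    queue = deque(constraints)
    while queue:
        i, j = queue.popleft()
        pred = constraints[(i, j)]
        revised = {x for x in domains[i]
                   if any(pred(x, y) for y in domains[j])}
        if revised != domains[i]:
            domains[i] = revised
            queue.extend(arc for arc in constraints if arc[1] == i)
    return domains

# Tiny example: X < Y over small integer domains.
doms = {"X": {1, 2, 3}, "Y": {1, 2}}
cons = {("X", "Y"): lambda x, y: x < y,
        ("Y", "X"): lambda y, x: x < y}
print(ac3(doms, cons))  # X loses 2 and 3; Y loses 1
```

No search is performed here at all: labels are only removed when the input (the constraints) precludes them, which is exactly the specialization behaviour described above.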
In addition, the fact that our domain can be structured hierarchically has allowed us to utilize an implicit form of hierarchical arc consistency [Mackworth et al., 1985], which can be more efficient than arc consistency. Also, all the possible labels in each domain need not be represented explicitly but can be represented implicitly by one or more abstract labels. This intensional representation of label sets yields efficiency. The organization of these labels in a hierarchy has also given the system deductive power, which is present in any semantic network formalism.

Logic is known to be a descriptively adequate formalism in that it is precise and is good for specification. However, logic programming based on the use of PROLOG usually relies on the automatic backtracking mechanism of PROLOG for control, and does not provide the structure for efficient computation. We have already mentioned the advantage of our approach in terms of efficiency. In addition, the composition rules in our syntactic schemata are precise and transparent as well. Although procedural linguistic knowledge is embedded in those rules to provide some control, it is not obscure as in the case of the ATN (Augmented Transition Network). Also, our formalism lends itself rather well to the relaxation of some syntactic constraints. Any syntactic constraint can be removed from the composition rules without affecting the actions of the system if such relaxation is deemed desirable. This is possible since the syntactic constraints are not incorporated within procedures in the system. Relaxation of syntactic constraints allows the system to be more flexible, and to tolerate some grammatical errors in a sentence as humans often do.

Our adoption of Earley's context-free parsing algorithm is quite efficient in that subtrees that belong to more than one parse tree may be shared without redundancy in representation.
However, this sharing only occurs if the subtree to be shared is predicted by the common lines of analysis in the same state set. Therefore, redundancy in the creation of similar subtree structures may still occur if a subtree created in one state set is predicted again in a later state set. An improvement can be made by adopting a similar context-free parsing algorithm by Tomita [Tomita, 1985], which is claimed to be more efficient in its representation of all parse trees in what he calls a shared-packed forest, where the above problem does not arise.

Our system has only a simple grammar, deals with a limited number of grammatical constructs, and works on a small domain. Thus, there is much room for development. The modularity of schemata allows ease of expansion, as syntactic schemata can easily be added to provide a larger grammar with its added constraints. The difficulties lie in determining the semantic significance of more complex grammatical constructs such as adverbs, and how they should interact with other semantic entities in the sentence. Relationships between grammatical constructs in the semantic sense must be determined before the system can know how to build an NCG to represent them. It is beyond the scope of this thesis to determine the semantics of all grammatical constructs, as linguists are still researching this topic. However, it suffices to say that improvements can be made to the current system with our existing knowledge of semantics. For example, determiners can be dealt with in a more thorough manner with the addition of scoping on the quantifiers and the utilization of fuzzy logic and sets. The system can also be extended to provide answers to questions instead of just outputting the syntactic and semantic representations.
This involves selecting the consistent labels from the semantic nodes of the final NCG representation and producing a coherent answer from them with the addition of a sentence generation component.

Perhaps the most important enhancement to the system would be to allow additional facts to be added, which would constitute a form of learning. This system will reject a sentence that is not supported by the given facts in the database. All or some of the label sets of the semantic nodes in the NCG will become empty to reflect this situation. However, the system may be altered to add the appropriate labels to the empty label sets for certain semantic entities such that rejection of the sentence would not occur. Once the representation is built, the knowledge base can be updated accordingly to reflect the new additions. For example, suppose we present the sentence 'John is a composer' to the system when such a fact does not currently exist in the database. During the interpretation process, a semantic node representing 'composer' will be built, and it will have an abstract label representing all the composers in the database in its label set. When the system tries to link the semantic node representing 'John' to the one representing 'composer', consistency checking is performed. We will find that 'John' is not in the list of composers represented by the abstract label under the semantic node of 'composer', thus causing the abstract label to be deleted. At this point, the system should notify the user of this fact and ask whether the user desires to have 'John' added to the list of composers. If the answer is in the affirmative, the label 'John' should be inserted in the now empty label set of the semantic node representing 'composer', and later be added to the label set of the word schema for 'composer' in the knowledge base to make it a permanent fact.
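The fact-addition behaviour just described can be sketched as follows; the `KB` table, the `link` function, and the `confirm` callback are invented names, a sketch of the proposed extension rather than the implemented system:

```python
# A toy knowledge base: category name -> set of known individuals.
KB = {"composer": {"Bach", "Handel", "Mozart"}}

def link(entity, category, confirm):
    """Attach `entity` to `category`.  If consistency checking empties
    the label set, consult `confirm` before rejecting the sentence;
    on approval the fact is made permanent in the knowledge base."""
    labels = KB[category] & {entity}        # consistency check
    if not labels and confirm(entity, category):
        KB[category].add(entity)            # update the knowledge base
        labels = {entity}
    return labels

# 'John is a composer' -- not in the database; the user answers yes.
print(link("John", "composer", lambda e, c: True))  # {'John'}
print("John" in KB["composer"])                     # True
```

An empty result with a declined confirmation would correspond to the current behaviour of rejecting the sentence.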
In this case, the system knows about the verb 'is' and thus is capable of establishing the representational structures to allow for the addition of a given fact. Thus, it is essential that the system know about the relationships between semantic entities before addition of facts can occur. Therefore, if a new fact contains a new verb not known to the system, it would not be possible for the system to perform such automatic additions to the knowledge base.

The approach introduced by this thesis could eventually be adapted to the understanding of stories and conversations as well. In these cases, more abstract schemata will exist in the knowledge base besides the simple word schemata. For example, schemata such as scripts and plans that represent situations will be needed. However, this approach would not require backtracking to search for the correct script to apply, just as it does not use backtracking to search for the appropriate word sense when a word is ambiguous. Instead, all possible scripts would be processed in parallel, and the system would refine to the appropriate ones when enough evidence is given in the input to decide on the correct ones. For example, let us consider the sentence 'John and Mary were walking down the aisle.' A standard approach is to select a script to apply to this situation [Schank & Abelson, 1977]. Assume that the system has chosen the 'marriage script'. However, if the next sentence is 'John took a can of apple juice off the shelf', then the system needs to backtrack and select the 'supermarket script' instead. In our approach, the system would not commit itself to either of the two scripts mentioned above after seeing the first sentence but would keep both of them around, since they are both consistent with the information given in the sentence. When it sees the second sentence, enough constraints
then exist to refine the interpretation to the 'supermarket script', and only at this point would the script be applied.

In summary, the schema and constraint-based approach to understanding natural language as introduced in this thesis has some merit. It can handle ambiguity more effectively than the traditional linear approach without utilising backtracking. It utilises semantics to aid parsing to increase efficiency, and yet does not let semantics determine syntactic form. Thus, the approach takes a middle ground concerning the issue of whether semantics affects parsing. Although the interpretation process of all parses is performed in parallel, inefficiency is avoided because intensional label sets are used, and the constraint propagation process, together with the knowledge of composition, helps to eliminate incorrect interpretations as early as possible. In addition, garbage collection is incorporated to give space efficiency. Lastly, extensions to the system are possible, and it has the potential to develop into a story or conversation understanding system without sacrificing efficiency.

Bibliography

[Aho & Ullman, 1972] Aho, A. V. and Ullman, J. D. 1972. The Theory of Parsing, Translation and Compiling. Englewood Cliffs, N.J.: Prentice-Hall.

[Alon & Havens, 1985] Alon, A. and Havens, W. 1985. Recognizing VLSI circuits from mask artwork by schema labelling. Tech. Report 85-1, Dept. of Computer Science, Univ. of British Columbia, Vancouver, Canada.

[Bar-Hillel, 1960] Bar-Hillel, Y. 1960. The present status of automatic translation of languages. In F. L. Alt (Ed.), Advances in Computers (Vol. 1). New York: Academic Press, 91-163.

[Barr & Feigenbaum, 1981] Barr, A. and Feigenbaum, E. A. 1981. The Handbook of Artificial Intelligence (Vol. 1). Los Altos, CA: William Kaufman, Inc.

[Bartlett, 1932] Bartlett, F. C. 1932. Remembering.
Cambridge, England: Cambridge University Press.

[Bobrow, 1968] Bobrow, D. G. 1968. Natural language input for a computer problem-solving system. In M. Minsky (Ed.), Semantic Information Processing. Cambridge, Mass.: MIT Press, 146-226.

[Bobrow et al., 1977] Bobrow, D. G., Kaplan, R. M., Kay, M., Norman, D. A., Thompson, H. and Winograd, T. 1977. GUS, a frame-driven dialogue system. Artificial Intelligence, 8: 155-173.

[Bobrow & Norman, 1975] Bobrow, D. G. and Norman, D. A. 1975. Some principles of memory schemata. In D. G. Bobrow and A. Collins (Eds.), Representation and Understanding. New York: Academic Press, 131-149.

[Bobrow & Raphael, 1974] Bobrow, D. G. and Raphael, B. 1974. New programming languages for artificial intelligence research. Computing Surveys, 6(3): 153-174.

[Bobrow & Winograd, 1977] Bobrow, D. G. and Winograd, T. 1977. An overview of KRL, a knowledge representation language. Cognitive Science, 1: 3-46.

[Brachman, 1978] Brachman, R. J. 1978. A structural paradigm for representing knowledge. Report No. 3605, Bolt, Beranek and Newman, Inc., Cambridge, Mass.

[Brachman, 1979] Brachman, R. J. 1979. On the epistemological status of semantic networks. In N. V. Findler (Ed.), Associative Networks. New York: Academic Press, 3-50.

[Brachman, 1982] Brachman, R. J. 1982. What IS-A is and isn't. Proc. Canadian Society for Computational Studies of Intelligence, May 1982, 212-221.

[Brachman, 1983] Brachman, R. J. 1983. What IS-A is and isn't: an analysis of taxonomic links in semantic networks. IEEE Computer, 16(10): 30-36.

[Brachman et al., 1983a] Brachman, R. J., Fikes, R. E. and Levesque, H. J. 1983a. Krypton: integrating terminology and assertion. AAAI-83, 31-35.

[Brachman et al., 1983b] Brachman, R. J., Fikes, R. E. and Levesque, H. J. 1983b. Krypton: a functional approach to knowledge representation. IEEE Computer, 16(10): 67-73.

[Brachman et al., 1985] Brachman, R. J., Gilbert, V.
P. and Levesque, H. J. 1985. An essential hybrid reasoning system: knowledge and symbol level accounts of KRYPTON. IJCAI-85, 532-539.

[Brachman & Levesque, 1982] Brachman, R. J. and Levesque, H. J. 1982. Competence in knowledge representation. AAAI-82, 189-192.

[Brachman & Schmolze, 1985] Brachman, R. J. and Schmolze, J. G. 1985. An overview of the KL-ONE knowledge representation system. Cognitive Science, 9(2): 171-216.

[Bult, 1986] Bult, T. P. 1986. Schema labelling applied to hand-printed Chinese character recognition. M.Sc. Thesis, Dept. of Computer Science, Univ. of British Columbia, Vancouver, Canada (in preparation).

[Carbonell, 1970] Carbonell, J. R. 1970. AI in CAI: an artificial intelligence approach to computer-assisted instruction. IEEE Transactions on Man-Machine Systems, MMS-11: 190-202.

[Colmerauer, 1978] Colmerauer, A. 1978. Metamorphosis grammars. In L. Bolc (Ed.), Natural Language Communication with Computers. Berlin: Springer-Verlag, 133-189.

[Cullingford, 1978] Cullingford, R. E. 1978. Script application: computer understanding of newspaper stories. Research Report 116, Dept. of Computer Science, Yale University.

[Dahl, 1981] Dahl, V. 1981. Translating Spanish into logic through logic. AJCL, 7(3): 149-164.

[Dahl, 1983] Dahl, V. 1983. Logic programming as a representation of knowledge. IEEE Computer, 16(10): 61-65.

[Dahl & McCord, 1983] Dahl, V. and McCord, M. C. 1983. Treating coordination in logic grammars. AJCL, 9(2): 69-91.

[Earley, 1970] Earley, J. 1970. An efficient context-free parsing algorithm. Communications of the ACM, 13(2): 94-102.

[Fikes & Kehler, 1985] Fikes, R. and Kehler, T. 1985. The role of frame-based representation in reasoning. Communications of the ACM, 28(9): 904-920.

[Fillmore, 1968] Fillmore, C. 1968. The case for case. In E. Bach and R. T. Harms (Eds.), Universals in Linguistic Theory. Chicago: Holt, Rinehart and Winston, 1-90.

[Flores d'Arcais & Schreuder, 1983] Flores d'Arcais, G. B. and Schreuder, R. 1983.
The process of language understanding: a few issues in contemporary psycholinguistics. In G. B. Flores d'Arcais and R. J. Jarvella (Eds.), The Process of Language Understanding. New York: Wiley and Sons, 1-41.

[Fodor, 1983] Fodor, J. A. 1983. The Modularity of Mind. Cambridge, Massachusetts: MIT Press.

[Freuder, 1978] Freuder, E. C. 1978. Synthesizing constraint expressions. Communications of the ACM, 21(11): 958-966.

[Green et al., 1963] Green, B. F., Jr., Wolf, A. K., Chomsky, C. and Laughery, K. 1963. BASEBALL: An automatic question answerer. In E. A. Feigenbaum and J. Feldman (Eds.), Computers and Thought. New York: McGraw-Hill, 207-216.

[Green, 1969] Green, C. C. 1969. The application of theorem-proving to question-answering systems. IJCAI-69, 219-237.

[Grosz, 1977] Grosz, B. J. 1977. The representation and use of focus in a system for understanding dialogue. IJCAI-77, 67-76.

[Havens, 1983] Havens, W. 1983. Recognition mechanisms for schema-based knowledge representations. Int. Journal of Computers and Mathematics, 9(1). Pergamon Press: 185-199.

[Havens, 1985] Havens, W. 1985. A theory of schema labelling. Computational Intelligence, 1(3): 127-139.

[Havens & Mackworth, 1983] Havens, W. and Mackworth, A. K. 1983. Representing knowledge of the visual world. IEEE Computer, 16(10): 90-98.

[Hayes, 1977] Hayes, Pat J. 1977. In defence of logic. IJCAI-77, 559-565.

[Hayes, 1981] Hayes, Pat J. 1981. The logic of frames. In B. L. Webber and N. J. Nilsson (Eds.), Readings in Artificial Intelligence. Palo Alto: Tioga Pub. Co., 451-458.

[Hayes, 1977] Hayes, Philip J. 1977. On semantic nets, frames and associations. IJCAI-77, 99-107.

[Hendrix, 1975] Hendrix, G. G. 1975. Expanding the utility of semantic networks through partitioning. IJCAI-75, 115-121.

[Hendrix, 1979] Hendrix, G. G. 1979. Encoding knowledge in partitioned networks. In N. V. Findler (Ed.), Associative Networks.
New York: Academic Press, 51-92.

[Hewitt, 1972] Hewitt, C. 1972. Description and theoretical analysis (using schemata) of PLANNER, a language for proving theorems and manipulating models in a robot. Report No. TR-258, AI Lab., M.I.T., Cambridge, MA.

[Kehler & Clemenson, 1984] Kehler, T. P. and Clemenson, G. D. 1984. An application development system for expert systems. System Software, 3(1): 212-224.

[Levesque & Mylopoulos, 1979] Levesque, H. and Mylopoulos, J. 1979. A procedural semantics for semantic networks. In N. V. Findler (Ed.), Associative Networks. New York: Academic Press, 93-120.

[Lindsay, 1963] Lindsay, R. K. 1963. Inferential memory as the basis of machines which understand natural language. In E. A. Feigenbaum and J. Feldman (Eds.), Computers and Thought. New York: McGraw-Hill, 217-233.

[Mackworth, 1977a] Mackworth, A. K. 1977a. Consistency in networks of relations. Artificial Intelligence, 8: 99-118.

[Mackworth, 1977b] Mackworth, A. K. 1977b. On reading sketch maps. IJCAI-77, 598-606.

[Mackworth & Freuder, 1984] Mackworth, A. K. and Freuder, E. C. 1984. The complexity of some polynomial network consistency algorithms for constraint satisfaction problems. Artificial Intelligence, 25: 65-74.

[Mackworth & Havens, 1981] Mackworth, A. K. and Havens, W. 1981. Structuring domain knowledge for visual perception. IJCAI-81, 625-627.

[Mackworth et al., 1985] Mackworth, A. K., Mulder, J. A. and Havens, W. S. 1985. Hierarchical arc consistency: exploiting structured domains in constraint satisfaction problems. Tech. Report 85-7, Dept. of Computer Science, Univ. of British Columbia, Vancouver, Canada.

[McAllester, 1980] McAllester, D. A. 1980. An outlook on truth maintenance. AI Memo 551, AI Lab., M.I.T., Cambridge, MA.

[McAllester, 1982] McAllester, D. A. 1982. Reasoning utility package user's manual. AI Memo 667, AI Lab., M.I.T., Cambridge, MA.

[McCarthy, 1977] McCarthy, J. 1977.
Epistemological problems of Artificial Intelligence. IJCAI-77, 1038-1044.

[McCord, 1982] McCord, M. C. 1982. Using slots and modifiers in logic grammars for natural language. Artificial Intelligence, 18(3): 327-367.

[Minsky, 1975] Minsky, M. 1975. A framework for representing knowledge. In P. Winston (Ed.), The Psychology of Computer Vision. New York: McGraw-Hill, 211-277.

[Montanari, 1974] Montanari, U. 1974. Networks of constraints: fundamental properties and applications in picture processing. Information Science, 7: 95-132.

[Mulder, 1985] Mulder, J. 1985. Using discrimination graphs to represent visual knowledge. Tech. Report 85-14, Dept. of Computer Science, Univ. of British Columbia, Vancouver, Canada.

[Pereira, 1981] Pereira, F. 1981. Extraposition grammars. AJCL, 7: 243-256.

[Pereira & Warren, 1980] Pereira, F. and Warren, D. 1980. Definite clause grammars for language analysis - a survey of the formalism and a comparison with augmented transition networks. Artificial Intelligence, 13(3): 231-278.

[Quillian, 1968] Quillian, R. 1968. Semantic memory. In M. Minsky (Ed.), Semantic Information Processing. Cambridge, Mass.: MIT Press, 227-270.

[Quillian, 1969] Quillian, R. 1969. The teachable language comprehender: a simulation program and the theory of language. Communications of the ACM, 12: 459-476.

[Raphael, 1968] Raphael, B. 1968. SIR: A computer program for semantic information retrieval. In M. Minsky (Ed.), Semantic Information Processing. Cambridge, Mass.: MIT Press, 33-145.

[Rumelhart & Norman, 1973] Rumelhart, D. E. and Norman, D. 1973. Active semantic networks as a model of human memory. IJCAI-73, 450-457.

[Rumelhart & Ortony, 1976] Rumelhart, D. E. and Ortony, A. 1976. The representation of knowledge in memory. TR-55, Centre for Human Info. Processing, Dept. of Psych., Univ. of Calif. at San Diego, La Jolla, CA.

[Schank, 1975] Schank, R. C. 1975. The structure of episodes in memory. In D. G. Bobrow and A.
Collins (Eds.), Representation and Understanding. New York: Academic Press, 237-272.

[Schank & Abelson, 1977] Schank, R. C. and Abelson, R. P. 1977. Scripts, Plans, Goals, and Understanding. Hillsdale, N.J.: Lawrence Erlbaum Associates.

[Schank et al., 1975] Schank, R., and Yale AI Project. 1975. SAM - a story understander. Research Report 43, Dept. of Computer Science, Yale University.

[Schubert, 1976] Schubert, L. K. 1976. Extending the expressive power of semantic networks. Artificial Intelligence, 7(2): 163-198.

[Schubert et al., 1979] Schubert, L. K., Goebel, R. G. and Cercone, N. J. 1979. The structure and organization of a semantic net for comprehension and inference. In N. V. Findler (Ed.), Associative Networks. New York: Academic Press, 121-175.

[Shapiro, 1971] Shapiro, S. C. 1971. A net structure for semantic information storage, deduction and retrieval. IJCAI-71, 512-523.

[Shapiro, 1979] Shapiro, S. C. 1979. The SNePS semantic network processing system. In N. V. Findler (Ed.), Associative Networks. New York: Academic Press, 179-203.

[Shapiro, 1982] Shapiro, S. C. 1982. Generalized augmented transition network grammars for generation from semantic networks. AJCL, 8(1): 12-25.

[Simmons, 1973] Simmons, R. 1973. Semantic networks: their computation and use for understanding English sentences. In R. C. Schank and K. M. Colby (Eds.), Computer Models of Thought and Language. San Francisco: Freeman.

[Simmons & Slocum, 1972] Simmons, R. and Slocum, J. 1972. Generating English discourses from semantic networks. Communications of the ACM, 15(10): 891-905.

[Tomita, 1985] Tomita, M. 1985. An efficient context-free parsing algorithm for natural languages. IJCAI-85, 756-763.

[Vilain, 1985] Vilain, M. 1985. The restricted language architecture of a hybrid representation system. IJCAI-85, 547-551.

[Walker, 1978] Walker, D. E. (Ed.) 1978. Understanding Spoken Language. New York: North Holland.

[Waltz, 1972] Waltz, D. E. 1972.
Generating semantic descriptions of scenes with shadows. Tech. Report MAC AI-TR-271, MIT, Cambridge, MA.

[Warren & Pereira, 1982] Warren, D. and Pereira, F. 1982. An efficient easily adaptable system for interpreting natural language queries. AJCL, 8(3-4): 110-119.

[Weaver, 1949] Weaver, W. 1949. Translation. In W. N. Locke and A. D. Booth (Eds.), Machine Translation of Languages. New York: Technology Press of MIT and Wiley and Sons (1955), 15-23.

[Weizenbaum, 1966] Weizenbaum, J. 1966. ELIZA - A computer program for the study of natural language communication between man and machine. Communications of the ACM, 9: 36-45.

[Wilensky, 1978] Wilensky, R. 1978. Understanding goal-based stories. Research Report No. 140, Dept. of Computer Science, Yale University.

[Winograd, 1972] Winograd, T. 1972. Understanding Natural Language. New York: Academic Press.

[Winograd, 1975] Winograd, T. 1975. Frame representations and the declarative/procedural controversy. In D. G. Bobrow and A. Collins (Eds.), Representation and Understanding. New York: Academic Press, 185-210.

[Winograd, 1980] Winograd, T. 1980. What does it mean to understand language? Cognitive Science, 4: 209-241.

[Winograd, 1983] Winograd, T. 1983. Language as a Cognitive Process. Vol. 1: Syntax. Reading, Mass.: Addison-Wesley.

[Woods, 1968] Woods, W. A. 1968. Procedural semantics for a question-answering machine. Fall Joint Computer Conference, 33: 457-471.

[Woods, 1970] Woods, W. A. 1970. Transition network grammars for natural language analysis. Communications of the ACM, 13(10): 591-606.

[Woods, 1975] Woods, W. A. 1975. What's in a link: foundations for semantic networks. In D. G. Bobrow and A. Collins (Eds.), Representation and Understanding. New York: Academic Press, 35-82.

[Woods et al., 1972] Woods, W. A., Kaplan, R., and Nash-Webber, B. 1972. The lunar sciences natural language information system: Final report. BBN Report No.
2378, Bolt, Beranek and Newman, Inc., Cambridge, Mass.

[Woods et al., 1976] Woods, W. A., et al. 1976. Speech understanding systems: Final report. BBN Report No. 3438, Bolt, Beranek and Newman, Inc., Cambridge, Mass.

Appendix

The appendix contains the listing of the KB. The first section is a list of all the model syntactic schemata and the second section is a list of all the model semantic schemata. The syntax and semantics of these schemata are explained in Section 3.2.

Listing of model syntactic schemata:

(s ((*dot np vp (nvagree np vp) (sp vp np agent) (sp np vp agent) (checknum vp)) (*dot qword (build qword) vp (nvagree qword vp (iftrans)) (setptr) (sp qword vp ((in (temp locn))(who agent)(where mod)(when mod)(what obj))) (checknum vp)) (*dot qword (rwordis (which what)) np (sp qword np) vp (nvagree np vp (ifnosubj)) (sp vp np (depends)) (sp np vp (rdepends)) (checknum vp)) (*dot auxv np (nvagree np auxv) vp (vvagree auxv vp) (sp vp np agent) (sp np vp agent) (checknum vp)) (*dot auxv (rwordis (be)) np (nvagree np auxv) comps (nvagree comps auxv) (sp np comps) (sp comps np) (checknum np) (checknum comps))))

(np ((*dot det subnp (dnagree det subnp) (setptr)) (*dot subnp (setptr))))

(subnp ((*dot npr (isnotposs) (build)) (*dot n (isnotposs) (build)) (*dot npr (isnotposs) (build) pps (sp subnp pps ((in event)(compose obj)))) (*dot mods n (isnotposs) (build) (sp subnp mods ((of obj)))) (*dot n (isnotposs) (build) pps (sp subnp pps ((from agent)(by obj)(of obj)))) (*dot mods n (isnotposs) (build) (sp subnp mods ((of obj))) pps (sp subnp pps ((from agent)(by obj)(of obj))))))

(vp ((*dot v (hasfeature (trans)) (build) np (sp vp np obj)) (*dot v (hasfeature (trans)) (build) np (sp vp np obj) pps (sp vp pps mod)) (*dot v (hasfeature (intrans)) (build)) (*dot v (hasfeature (intrans)) (build) pps (sp vp pps mod) (sp pps vp ((in event)(of obj)))) (*dot v (hasfeature (cop)) (gparhasno (auxv)) comps (nvagree comps v) (setptr)) (*dot auxv (gparhasno (auxv)) v
(vvagree auxv v) (build) comps (nvagree comps auxv (ifcop)) (sp vp comps ((in mod)(by agent))) (sp comps vp ((in event)(of obj)(by obj)))) (*dot auxv (gparhas (qword)) np (nvagree np auxv) v (vvagree auxv v) (build) (sp vp np agent))))

(comps ((*dot pps (setptr)) (*dot adj (build)) (*dot np (setptr))))

(mods ((*dot nmods (setptr3) adjs (setptr)) (*dot nmods (setptr3)) (*dot adjs (setptr))))

(nmods ((*dot nposs nmods (sp (schild nmods) nposs obj) (setptr3)) (*dot nposs (setptr3))))

(adjs ((*dot adj (build) adjs (setptr)) (*dot adj (build))))

(nposs ((*dot n (isposs) (build n) (build nposs of) (sp nposs n agent)) (*dot npr (isposs) (build npr) (build nposs of) (sp nposs npr agent))))

(pps ((*dot pp (setptr3) pps (setptr)) (*dot pp (setptr3))))

(pp ((*dot prep (build) np (sp pp np ((from locn)(by agent)(in (temp locn))(of agent))))))

(qword ((*dot *term)))
(auxv ((*dot *term)))
(v ((*dot *term)))
(det ((*dot *term)))
(adj ((*dot *term)))
(npr ((*dot *term)))
(n ((*dot *term)))
(prep ((*dot *term)))

Listing of model semantic schemata:

(composer n s (Bach Handel Haydn Mozart Beethoven Chopin Berlioz Tchaikovsky Verdi))
(Bach npr * (Bach))
(Handel npr * (Handel))
(Haydn npr * (Haydn))
(Mozart npr * (Mozart))
(Beethoven npr * (Beethoven))
(Chopin npr * (Chopin))
(Berlioz npr * (Berlioz))
(Tchaikovsky npr * (Tchaikovsky))
(Verdi npr * (Verdi))
(Leopold npr * (Leopold))
(music n * (voc-music ins-music))
(voc-music pn * (oratorio opera mass))
(ins-music pn * (ballet symphony concerto key-music pro-music))
(key-music pn * (sonata polonaise waltz Well-Tempered-Clavier))
(pro-music pn * (overture-fantasy))
(oratorio n s (Messiah The
-Creation))
(opera n s (Giulio-Cesare Don-Giovanni La-Traviata Aida Otello Falstaff))
(mass n s (Mass-in-B-minor))
(ballet n s (The-Nutcracker))
(symphony n s (The-Surprise-Symphony Symphony-no.40-in-G-minor The-Fifth-Symphony Symphonie-Fantastique))
(concerto n s (The-Emperor-Concerto))
(sonata n s (The-Pathetique-Sonata The-Moonlight-Sonata The-Appassionata))
(polonaise n s (Polonaise-in-A-flat))
(waltz n s (Valse-Brilliante))
(overture-fantasy n s (Romeo-and-Juliet))
(Well-Tempered-Clavier npr * (Well-Tempered-Clavier))
(Messiah npr * (Messiah))
(The-Creation npr * (The-Creation))
(Giulio-Cesare npr * (Giulio-Cesare))
(Don-Giovanni npr * (Don-Giovanni))
(La-Traviata npr * (La-Traviata))
(Aida npr * (Aida))
(Otello npr * (Otello))
(Falstaff npr * (Falstaff))
(Mass-in-B-minor npr * (Mass-in-B-minor))
(The-Nutcracker npr * (The-Nutcracker))
(The-Surprise-Symphony npr * (The-Surprise-Symphony))
(Symphony-no.40-in-G-minor npr * (Symphony-no.
4 0 - i n - G - m i n o r ) ) 9 7 The-Fifth-Symphony npr * (The-Fifth-Symphony)) Symphonie-Fantastique npr * (Symphonie-Fantastique)) The-Emperor-Concerto npr * (The-Emperor-Concerto)) The-Pathetique-Sonata npr * (The-Pathetique-Sonata)) The-Moonlight-Sonata npr * (The-Moonlight-Sonata)) The-Appassionata npr * (The-Appassionata)) Polonaise-in-A-flat npr * (Polonaise-in-A-flat)) Valse-Brilliante npr * (Valse-Brilliante)) Romeo-and-Juliet npr * (Romeo-and-Juliet)) father n s (Leopold)) date n s (birthday dday cday)) birthday n s (|1685| |1719| |1732| |1756| |1770| |1803| |1810| |1813| |18401)) dday pn * (117501 117591 117871 117911 11809] 118271 118491 118691 118931 119011)) cday pn * (117331 117421)) type n s (oratorio opera mass ballet symphony concerto sonata polonaise: waltz overture-fantasy)); place n s (birthplace dplace cplace)); birthplace n s (Eisenach Halle Rohrau Salzburg Augsburg Bonn Warsaw Grenoble Votkinsk Busseto)) dplace pn i (Leipzig London Vi Milan)) cplace pn * (Leipzig Dublin)) 116851 npr * (116851)) 117191| npr * (117191)> 117321 npr * (117321)) 117561 npr * (I1756D) |1770| npr * (117701)) 118031 npr * (118031)) |1810| npr * (118101)) |1813| npr * (I1813D) |1840| npr * (118401)) |1750| npr * (117501)) 117591 npr * (117591)) 117871 npr * (I1787D) 117911 npr * (117911)) |1809| npr * (118091)) 118271 npr * (1.18271)) |1849| npr * (118491)) 98 (|1869| npr * (|1869|)) (|1893| npr * (|1893|)) (119011 npr * (|19011)) (117331 npr * (|1733|)) (|1742| npr * (|1742|)) (Eisenach npr * (Eisenach)) (Halle npr * (Halle)) (Rohrau npr * (Rohrau)) (Salzburg npr * (Salzburg)) (Augsburg npr * (Augsburg)) (Bonn npr * (Bonn)) (Warsaw npr * (Warsaw)) (Grenoble npr * (Grenoble)) (Votkinsfc npr * (Votkinsk)) (Busseto npr * (Busseto)) (Leipzig npr * (Leipzig)) (London npr * (London)) (Vienna npr * (Vienna)) (Paris npr * (Paris)) (St-Petersburg npr * (St-Petersburg)) (Milan npr * (Milan)) (Dublin npr * (Dublin)) (vocal adj * (oratorio opera mass)) (instrumental adj * 
(ballet symphony concerto sonata polonaise waltz overture-fantasy Well-Tempered-Clavier)) (keyboard adj * (key-music)) (program adj; * (pro-music))' (famous adj * (music composer)) (dead adj * (composer)) (who qword * (composer)) (when qword * (date)) (where qword * (place)) (which qword * (composer music father date place)) (what qword * (composer music father date place)) (the det (the (number any))) (a det * (equal-num 1)) (one det * (equal-num 1)) (two det (two (number pi)) (equal-num 2)) (three det (three (number pi)) (equal-num 3)) 99 (some det (some (number pi)) (morethan 1)) (from prep * (froml from2 from3 from4 fromB from6 from7 from8 from9 fromlO) (label locn agent) ((froml Eisenach Bach) (from2 Halle Handel) (from3 Rohrau Haydn) (from4 Salzburg Mozart) (from5 Augsburg Leopold) (from6 Bonn Beethoven) (from7 Warsaw Chopin) (from8 Grenoble Berlioz) (from9 Votkinsk Tchaikovsky) (fromlO Busseto Verdi))) (by prep * (byl by2 by3 by4 by5 by6 by7 by8 by9 bylO byll byl2 byl3 byl4 byl5 byl6 byl7 byl8 byl9 by20 by21 by22) (label obj agent) ((byl Mass-in-B-minor Bach) (by2 Well-Tempered-Clavier Bach) (by3 Messiah Handel) (by4 Giulio-Cesare Handel) (by5 The-Surprise-Symphony Haydn) (by6 The-Creation Haydn) (by7 Don-Giovanni Mozart) (by8 Symphony-no.40-in-G-minor Mozart) (by9 The-Fifth-Symphony Beethoven) (bylO The-Pathetique-Sonata Beethoven) (byll' The-Moonlight-Sonata Beethoven) (byl2 The-Appassionata Beethoven) (byl3 The-Emperor-Concerto Beethoven) (byl4 Polonaise-in-A-flat Chopin) (byl5 Valse-Brilliante Chopin) (byl6 Symphonie-Fantastique Berlioz) (byl7 Romeo-and-Juliet Tchaikovsky) (byl8 The-Nutcracker Tchaikovsky) (byl9 La-Traviata Verdi) (by20 Aida Verdi) (by21 Otello Verdi) (by22 Falstaff Verdi))) (in prep * (inl in2 in3 in4 in5 in6 in7 in8 in9 inlO i n l l inl2 inl3 inl4 inl6 inl6 inl7 inl8 inl9 in20 in21 in22) (label event temp locn) ((inl composel 117331 Leipzig) (in2 compose3 117421 Dublin) (in3 boml 116851 Eisenach) (in4 born2 116851 Halle) (in5 born3 
117321 Rohrau) (in6 born4 117561 Salzburg) (in7 born5 117191 Augsburg) (in8 born6 117701 Bonn) (in9 bom7 118101 Warsaw) (inlO born8 118031 Grenoble) ( i n l l bornO 118401 Votkinsk) (inl2 bornlO 118131 Busseto) (inl3 diel 117501 Leipzig) (inl4 die2 117591 London) (inl5 die3 118091 Vienna) (inl6 die4 117911 Vienna) (inl7 die5 117871 Salzburg) (inl8 die6 118271 Vienna) (inl9 die7 118271 Vienna) (in20 die8 118691 Paris) 100 (in21 die9 118931 St-Petersburg) (in22 dielO 119011 Milan))) (of prep * (ofl of2 of3 of4 of5 of6 of7 of8 of9 oflO o f l l ofl2 ofl3 ofl4 ofl5 ofl6 ofl7 ofl8 ofl9 of20 of21 of22 of23 of24 of25 of26 of27 of28 of29 of30 of31 of32 of33) (label obj agent) ((ofl Leopold Mozart) (of2 Mass-in-B-minor Bach) (of3 Well-Tempered-Clavier Bach) (of4 Messiah Handel) (of5 Giulio-Cesare Handel) (of6 The-Surprise-Symphony Haydn) (of7 The-Creation Haydn) (of8 Don-Giovanni Mozart) (of9 Symphony-no.40-in-G-minor Mozart) (oflO The-Fifth-Symphony Beethoven) (o f l l The-Pathetique-Sonata Beethoven) (ofl2 The-Moonlight-Sonata Beethoven) (ofl3 The-Appassionata Beethoven) (ofl4 The-Emperor-Concerto Beethoven) (ofl5 Polonaise-in-A-flat Chopin) (of 16 Valse-Brilliante Chopin)' (of 17 Symphonie-Fantastique Berlioz)' (of18 Romeo-and-Juliet Tchaikovsky) (ofl9 The-Nutcracker Tchaikovsky) (of20 La-Traviata Verdi) (of21 Aida Verdi)' (of22 Otello Verdi) (of23 Falstaff Verdi) (of24 |1685| Bach) (of25 |1685| Handel) (of26 |1732| Haydn) (of27 117561; Mozart) (of28 |1719| Leopold) (of29 |1770|i Beethoven) (of30 |1810| Chopin) (of31 |1803| Berlioz) (of32 |1840| Tchaikovsky) (of33 118131 Verdi))) (compose v s-d (trans)) (composel compose2 compose3 compose4 compose5 compose6 compose7 compose8 compose9 composelO composell composel2 composelS compose14 compose15 composel6 composelT composel8 composelQ compose20 compose21 compose22) (label agent obj mod) ((composel Bach Mass-in-B-minor inl) (compose2 Bach Well-Tempered-Clavier) (compose3 Handel Messiah in2) (compose4 Handel Giulio-Cesare) 
   (compose5 Haydn The-Surprise-Symphony) (compose6 Haydn The-Creation)
   (compose7 Mozart Don-Giovanni) (compose8 Mozart Symphony-no.40-in-G-minor)
   (compose9 Beethoven The-Fifth-Symphony) (compose10 Beethoven The-Pathetique-Sonata)
   (compose11 Beethoven The-Moonlight-Sonata) (compose12 Beethoven The-Appassionata)
   (compose13 Beethoven The-Emperor-Concerto) (compose14 Chopin Polonaise-in-A-flat)
   (compose15 Chopin Valse-Brilliante) (compose16 Berlioz Symphonie-Fantastique)
   (compose17 Tchaikovsky Romeo-and-Juliet) (compose18 Tchaikovsky The-Nutcracker)
   (compose19 Verdi La-Traviata) (compose20 Verdi Aida)
   (compose21 Verdi Otello) (compose22 Verdi Falstaff)))
(write v irr (trans) (compose))
(writes v (write (tns present) (pncode 3sg)))
(wrote v (write (tns past)))
(written v (write (pastpart)))
(born v (born (infin) (pastpart)) (intrans)
  (born1 born2 born3 born4 born5 born6 born7 born8 born9 born10)
  (label agent mod)
  ((born1 Bach in3) (born2 Handel in4) (born3 Haydn in5) (born4 Mozart in6)
   (born5 Leopold in7) (born6 Beethoven in8) (born7 Chopin in9)
   (born8 Berlioz in10) (born9 Tchaikovsky in11) (born10 Verdi in12)))
(die v s-d (intrans)
  (die1 die2 die3 die4 die5 die6 die7 die8 die9 die10)
  (label agent mod)
  ((die1 Bach in13) (die2 Handel in14) (die3 Haydn in15) (die4 Mozart in16)
   (die5 Leopold in17) (die6 Beethoven in18) (die7 Chopin in19)
   (die8 Berlioz in20) (die9 Tchaikovsky in21) (die10 Verdi in22)))
(be v irr)
(be auxv irr (cop))
(is auxv (be (tns present) (pncode 3sg)))
(are auxv (be (tns present) (pncode x13sg)))
(was auxv (be (tns past) (pncode 13sg)))
(were auxv (be (tns past) (pncode x13sg)))
(is v (be (tns present) (pncode 3sg)))
(are v (be (tns present) (pncode x13sg)))
(was v (be (tns past) (pncode 13sg)))
(were v (be (tns past) (pncode x13sg)))
(do auxv irr)
(does auxv (do (tns present) (pncode 3sg)))
(did auxv (do (tns past)))
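The schemata above are written as s-expressions. The thesis's own implementation language is not shown in this listing; purely as an illustration, the following Python sketch (hypothetical helper names `tokenize`, `parse`, and `read_schema`) shows how one such entry could be read into a nested-list structure for processing.

```python
# Minimal s-expression reader for schema entries such as
# (Bach npr * (Bach)). Illustrative only; not the thesis's code.

def tokenize(text):
    """Split an s-expression string into '(', ')' and atom tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Consume tokens; return (parsed form, remaining tokens)."""
    token = tokens[0]
    if token == "(":
        form, rest = [], tokens[1:]
        while rest[0] != ")":
            item, rest = parse(rest)
            form.append(item)
        return form, rest[1:]   # drop the closing ')'
    return token, tokens[1:]    # a bare atom

def read_schema(text):
    """Read one schema entry into nested Python lists."""
    form, _ = parse(tokenize(text))
    return form

schema = read_schema("(Bach npr * (Bach))")
# schema == ['Bach', 'npr', '*', ['Bach']]
```

Read this way, each entry yields the schema's name, its syntactic category, a feature field, and a list of instances, which a program can then index by name.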

