UBC Theses and Dissertations

A principle-based system for natural language analysis and translation. Crocker, Matthew Walter, 1988.
A PRINCIPLE-BASED SYSTEM FOR NATURAL LANGUAGE ANALYSIS AND TRANSLATION

by

MATTHEW WALTER CROCKER
B.Sc. (Computer Science), The University of New Brunswick, 1986

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, DEPARTMENT OF COMPUTER SCIENCE

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
July 1988
© Matthew Walter Crocker, 1988

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Computer Science
The University of British Columbia
1956 Main Mall
Vancouver, Canada V6T 1Y3
Date: August 12, 1988

Abstract

Traditional views of grammatical theory hold that languages are characterised by sets of constructions. This approach entails the enumeration of all possible constructions for each language being described. Current theories of transformational generative grammar have established an alternative position. Specifically, Chomsky's Government-Binding theory proposes a system of principles which are common to human language. Such a theory is referred to as a "Universal Grammar" (UG). Associated with the principles of grammar are parameters of variation which account for the diversity of human languages. The grammar for a particular language is known as a "Core Grammar", and is characterised by an appropriately parametrised instance of UG.
Despite these advances in linguistic theory, construction-based approaches have remained the status quo within the field of natural language processing. This thesis investigates the possibility of developing a principle-based system which reflects the modular nature of the linguistic theory. That is, rather than stipulating the possible constructions of a language, a system is developed which uses the principles of grammar and language specific parameters to parse language. Specifically, a system is presented which performs syntactic analysis and translation for a subset of English and German. The cross-linguistic nature of the theory is reflected by the system, which can be considered a procedural model of UG.

Contents

Abstract
List of Figures
Acknowledgements
1 Introduction
2 Government-Binding Theory
  2.1 Acquisition and Explanation
  2.2 A Model of Grammar
    2.2.1 A System of Rules
    2.2.2 A System of Principles
  2.3 X-Theory
  2.4 θ-Theory and Lexical Selection
  2.5 Movement
  2.6 Government
  2.7 Case Theory
  2.8 Bounding Theory
3 Representations and Analysis
  3.1 The Lexicon
    3.1.1 The Dictionary
    3.1.2 The Morphology
  3.2 Phrase Structure
  3.3 Transformations
    3.3.1 NP-Substitution
    3.3.2 X-Substitution
4 Parsing with Principles
  4.1 The Parsing Module
    4.1.1 Parsing X Phrase Structure
    4.1.2 Parsing Specifiers, Adjuncts, and Arguments
  4.2 Principles as Constraints
    4.2.1 Applying Constraints
    4.2.2 The ECP
    4.2.3 Case Theory
    4.2.4 θ-Theory
    4.2.5 Subjacency and Movement
5 Principle-Based Translation
  5.1 Recovering D-structure
  5.2 Translation
  5.3 Generating S-structures
6 Evaluation and Discussion
  6.1 Principle-Based Systems
  6.2 The Lexicon
  6.3 Syntactic Analysis
    6.3.1 The Parsing Module
    6.3.2 The Constraint Module
  6.4 Translation and Generation
  6.5 Related Issues
    6.5.1 Partial Evaluation
    6.5.2 Modeling Linguistic Performance
7 Conclusions
8 References
A Example Translations
B Parsing Module
C Language Parameters
D Constraint Module
E Translation Module
F Morphological Analyser
G The Lexicon

List of Figures

2.1 Model of Grammar
4.1 The Parsing Model
5.1 The Translation Model

Acknowledgements

First and foremost I would like to express my gratitude to my supervisors: Dr. Michael Rochemont, for his invaluable assistance with the linguistic theory and guidance throughout the course of this thesis, and Dr. Harvey Abramson, for originally pointing me in the right direction and providing me with a background in logic programming.
I would especially like to thank Randy Sharp for his comments and suggestions during the final stages of the thesis, Brian Ross for his friendship and numerous helpful discussions, Barry Brachman for technical assistance, and the members of the Department of Computer Science who made the whole process enjoyable: Dave, Brent, Rick, and Heidi to name just a few. In addition, special thanks must go to my parents and Myron for their constant support and encouragement of my academic pursuits. Finally, I would like to thank the Natural Sciences and Engineering Research Council for its support through two postgraduate scholarships, Dr. Abramson for research assistantships under an IBM SUR Grant, and the Department of Computer Science for additional funding.

Chapter 1

Introduction

Linguistic theory as it has developed within the transformational generative enterprise is fundamentally concerned with the nature of the human language faculty. The underlying hypothesis is that humans are innately endowed with some knowledge of language. Those principles of grammar which are innate are said to constitute a theory of Universal Grammar (UG). Such a theory could, for example, shed light on the process of child language acquisition. As Chomsky observes, some fundamental questions which arise in this pursuit are [Chomsky 86b]:

(1) (i) What constitutes knowledge of language?
    (ii) How is knowledge of language acquired?
    (iii) How is knowledge of language put to use?

The questions of (1i) and (1iii) raise the distinction between a theory of linguistic competence, represented by a particular grammar, and a theory of linguistic performance. Specifically, a theory of grammar must provide an account of the fundamental principles of UG, as well as those elements of grammar which must be learned.
The latter must also be consistent with some psychologically plausible theory of language acquisition (i.e. the question of (1ii)). A theory of performance will indicate how knowledge of language is put to use in tasks of recognising and producing utterances. Work within transformational generative grammar focuses primarily on the questions (1i & 1ii). The prevalent theory is Chomsky's Government-Binding Theory, which posits a set of fundamental, language independent subtheories. Each subtheory consists of principles of grammar, which are taken to have parameters of variation which account for the diversity in phenomena across human languages. The set of principles constitutes an instance of UG, where the grammar for an individual language, or Core Grammar, is represented by a specific set of parameter settings. It is important to note that the program of research sketched above represents a significant shift in focus from systems of rules to systems of principles. That is, while some linguistic theories, including early transformational grammar, attempt to characterise languages via descriptive sets of rules, the current approach is oriented toward the determination of a set of underlying principles which can account for all languages. The advantages of such an approach are clear; in addition to accommodating theories of acquisition, the resulting theory will possess explanatory abilities which permit us to derive the properties of various languages instead of merely stipulating them. Despite these advances in linguistic theory, rule-based approaches have remained the status quo in natural language processing systems. These systems are typically based upon some large context free grammar which is either used to compute parsing tables, or used directly in an Augmented Transition Network (ATN) or Logic Grammar. As a result, they inherit many of the problems of the rule-based grammar upon which they are founded.
As Barton points out, rule systems are both unconstrained and stipulative in nature [Barton 84]. Not only are large numbers of language dependent rules required, but their correctness is difficult to ensure given the lack of underlying principles. An alternative approach is to construct systems which employ the principles of grammar directly. This thesis is concerned with the implementation of a natural language analysis system which is based upon a subset of the principles of Government-Binding theory. We propose a particular procedural interpretation of the principles of GB theory, which are realised as a Prolog logic program.[1]

[1] The discussion of implementation in Chapters 4 and 5 assumes a knowledge of Prolog on the part of the reader. For an introduction to the language see [Clocksin et al. 81], [Hogger 84], and [Sterling et al. 86]. For a more theoretical discussion see [Lloyd 87].

In this way, the parser can be considered to model human linguistic competence. An effort is made to reflect the modularity present within the theory, and preserve its cross-linguistic capacity. The parser attempts to maintain the distinction between the language independent principles of UG, and the specific rules and parameters relevant to an individual language. That is, a language independent parser is constructed which accesses the language specific information to parse a particular language. To illustrate this, a syntactic translation[2] system is developed, which permits bi-directional translation between English and German. The system consists of a principle-based parser which performs syntactic analysis, and a translator which translates lexical items and generates sentences in the target language. The basic approach was initially developed by Sharp, and later adopted by Dorr (see [Sharp 85], [Dorr 87]). Both describe principle-based parsing/translation systems for English and Spanish.
The basic goal of this research is to provide further insight into possible design strategies for principle-based systems. Emphasis is placed particularly on the design of an efficient system which retains the cross-linguistic capacity of the linguistic theory. The application to English and German brings to light certain language variations which are not clearly handled by the previous systems, and as such provides further evidence that a "universal parser" is indeed possible. In Chapter 2 we present the Government-Binding theory as an instance of UG. Specifically, we motivate the theory with respect to language acquisition and the desire for an explanatory theory of grammar. We then present the model of grammar and each of the relevant subsystems of principles and parameters. In Chapter 3, we describe the system of representations adopted for the lexicon and phrase structure. In addition, we present a language specific analysis of the transformations which are accounted for by the present system. Chapter 4 presents the implementation of the syntactic analysis component while Chapter 5 describes the translation component. Chapter 6 is devoted to a discussion and evaluation of the overall system, and examines related work in proposing possible improvements and extensions. Chapter 7 presents a general summary of the work and results, and suggests directions for future research. Finally, Appendix A illustrates the performance of the system with a number of example translations, while the remaining appendices contain the Prolog source code.

[2] By syntactic translation we mean to imply that no semantics or pragmatics are involved. Rather, a simple system of lexical equivalence is assumed.

Chapter 2

Government-Binding Theory

Current efforts in transformational generative grammar (TGG) have led to the approach we know as Government-Binding Theory (henceforth, GB theory).
The aim of the research has been to move away from systems of rules, favoring systems of principles. GB theory posits a set of independent subtheories of principles which are fundamental to all human languages. When combined, these subtheories yield a coherent theory of grammar, known as Universal Grammar. Each subsystem, or module, is intended to apply cross-linguistically, with some degree of parametrization of the principles for individual languages. Such a theory, with parameters instantiated appropriately, is said to constitute the Core Grammar of a specific language. In this chapter we will first discuss language acquisition and explanation as motivating forces in theoretical linguistics. We then outline the overall model of grammar and present each of the modules of GB theory relevant to this work.

2.1 Acquisition and Explanation

The relevance of language acquisition to linguistic theory was recognised by transformational grammarians as early as Aspects of the Theory of Syntax [Chomsky 65]. This early work, however, was oriented more towards achieving descriptive adequacy for natural language grammars than towards producing a grammatical theory. The more recent aim of Chomskian linguistics (see [Chomsky 73]) has been to provide a theory of grammar with certain explanatory abilities. Specifically, linguistic research has been guided more explicitly by the problem of language acquisition. This entails that grammatical theory account for the ability of children to master a rich and highly structured knowledge of grammar despite the fact that they are presented with data which is both degenerate and deficient. Hornstein and Lightfoot outline these deficiencies on three levels [Hornstein et al. 81]:

(2) (i) Children hear speech which does not consist uniformly of complete grammatical sentences, but also utterances with pauses, incomplete statements, slips of the tongue, etc.
    (ii) Despite being presented with finite data, children become able to deal with an infinite range of utterances.
    (iii) People attain knowledge of the structure of language, despite the absence of such data. That is, people are able to make judgements concerning complex/rare sentences, ambiguity relations, and grammaticality using knowledge which is not available as primary linguistic data (PLD) to the child.

The problem of language acquisition then is that children acquire an extremely sophisticated knowledge of language despite it being underdetermined through the poverty of stimuli of (2). Furthermore, this occurs rather uniformly despite variation in intelligence and experience. In attempting to account for this problem, the theory of grammar presupposes an a priori knowledge of language.[1] The assumption is that humans are endowed with a language faculty which consists of a set of "genetically encoded" principles [Lightfoot 82]. These principles are then activated appropriately during the acquisition process. A crucial observation at this point is that a child learns any language to which he is exposed. This entails that the innate principles be language independent. A fundamental goal of linguistics research is to determine the content of these principles such that they are abstract enough to account for all attainable languages, while still being rich enough to explain language acquisition under seemingly impoverished conditions.

[1] This is opposed to inductive theories of learning, which appear insufficient to account for the language acquisition problem given the poverty of stimuli assumption.

It is worth noting that these criteria
The value of a given parameter is drawn from a finite (presumably small) set of possible values. The theory of grammar defined by the set of principles is known as Universal Grammar (UG). When the parameters are appropriately instantiated for a specific language, UG is said to constitute a core grammar for that language. The process of language acquisition can now be considered a parameter setting operation. The child begins at the initial state 5,- with an uninstantiated set of principles. The parameters then become set via some Language Acquisition Device (LAD) which interprets the data presented to the child (for further discussion see [Chomsky 81b]). After sufficient information and experience, all the parameters are set and the child has a knowledge of the core grammar (i.e. the final state, Sj). If we take the data presented to the child to be random and unstructured, we can make a further simplification by suggesting that acquisition is effectively instantaneous . This essentially assumes that the relevant data is 2  presented to the child all at once. This hypothetical formulation of the acquisition process is known as the logical problem of language acquisition. By adopting such a solution to the problem of language acquisition, we now have a metric for evaluating the explanatory adequacy of a theory of grammar. The criteria for explanation have been outlined as follows [Hornstein et al. 81]: (3)  (i) Coverage of empirical data: show that facts follow from the principles. (ii) Simplicity and elegance of the principles. (iii) The principles should contribute insight to the problem of language acquisition.  2  While this may indeed be too strong a simplification, it should be noted that it does not conflict with theories which assume a more incremental learning process involving maturation of the child and language faculty. Rather, it abstracts away from these issues, so as to simplify the task at hand.  CHAPTER  2.  
The requirement of (3iii) entails that the parameters of variation meet certain learnability criteria. Specifically, they must be determinable on the basis of positive evidence only. The motivation for this is that children by assumption are not exposed to the ungrammatical data, in any systematic way, which might be necessary to fix parameters (for further discussion see [Wexler et al. 80]).

2.2 A Model of Grammar

Transformational generative grammar is considered to have its origins in Syntactic Structures [Chomsky 57]. The original model proposed two levels of syntactic representation: deep-structure (or, D-structure) and surface-structure (or, S-structure).[3] D-structures constituted a representation of the semantically relevant grammatical functions, and were considered to be generated by a set of phrase-structure rules. D-structures were then mapped to their "surface form" via a set of transformational operations. These transformational operations included rules for passivisation, wh-movement, subject raising, etc., and typically specified the structural description (SD) and structural change (SC). Consider for example the following:

(4) (a) The poem was written by Coleridge.
    (b) Coleridge [+TNS] write the poem.
    (c) SD: X  NP  AUX  V  NP  Y  by  Z
             1   2    3  4   5  6   7  8
        SC: 1  5  3+be  4+en  ∅  6  7+2  8

If we take the sentence of (4a) to have the D-structure representation of (4b), where the poem is the object of write, and Coleridge is the subject, then we can see that a rule such as that of (4c) is perfectly adequate to describe the passive transformation of (4b) to (4a). If we consider however the criteria and motivations for an explanatory theory of grammar as presented in Section 2.1, we see that the model presented above is inadequate. Not

[3] In fact, the notions of "deep" and "D" structure are not coextensive, and the same can be said for "surface" and "S" structure.
Indeed, the abbreviated notation was adopted by Chomsky explicitly to avoid confusion in the literature. For our purposes, however, we may take them to be similar.

only are the rules language dependent in nature, but they also are highly specialised and rather vast in number. The former almost certainly eliminates the plausibility of such rules being innate, and the latter presents a problem for learning under the poverty of stimuli assumption of (2). The pursuit of an explanatory theory of grammar has led to the reduction of the rule component in favor of universal principles. The current model has a similar format to that of early TGG, that is, the generation of D-structures which are mapped to S-structures via a transformational component. In addition, two interpretive components have been added: the LF (Logical Form) component, and the PF (Phonetic Form) component. Both are derived from S-structure, resulting in the model of grammar illustrated in Figure 2.1.

[Figure 2.1: Model of Grammar. D-structure is mapped to S-structure, from which the PF and LF representations are derived.]

In the remainder of this section, we will discuss the system of rules and principles which are relevant to the system presented here. Specifically, we adopt the linguistic framework of Government-Binding theory as presented in [Chomsky 81a], and recent developments within that theory. A fairly detailed account of the evolution of generative grammar is presented in [vRiemsdijk et al. 86].

2.2.1 A System of Rules

The three basic parts of the rule system of GB are outlined in [Chomsky 82] as follows:
GOVERNMENT-BINDING THEORY  (A) Lexicon (B) Syntax: (C) Interpretive:  (i) (ii) (i) (ii)  10  Base component Transformational component PF component LF component  The Lexicon constitutes the vocabulary of a language. That is, it consists of a set of entries for each lexical item, with information concerning its syntactic features, morphology, phonology, and selectional requirements. The Base Component, combined with information projected from the lexicon, generates D-structures. The Transformational Component maps a D-structure representation to a corresponding S-structure via application of the Move-a rule. This rule simply provides for the movement of elements from D-structure to S-structure positions. The resulting S-structure is then subject to the well-formedness conditions imposed by the principles of grammar. Viewed slightly differently, we can consider the principles to act as constraints on the application of Move-a. Additionally, the interpreted components map S-structures to their PF (phonetic form) and LF (logical form) representations respectively. As with the generation of S-structure, Move-a also plays a role in the generation of these representations. We restrict the discussion here however to the D-structure and S-structure levels of representation. The Base Component and Transformational Component together generate the structural representations, or Syntax of sentences. 2.2.2  A System of Principles  As we have seen, the rule systems are stated very generally. That is, sub categorization information projected from the lexicon, combined with X-theory  4  and lexical insertion  yields D-structures. The basic transformation operation Move-a then maps D-structures to S-structures. This simplicity in the systems of rules is made possible by the shift in emphasis to the system of principles. 4  Additionally, X-theory creates positions for adjuncts. 
Adjuncts may appear, regardless of subcategorization, for the purposes of modification and further description. We will return to the discussion of adjuncts in Section 2.3.

The principles of GB theory interact so as to impose well-formedness conditions at the various syntactic levels of representation (i.e. D-structure, S-structure, and LF). These subsystems of principles, as outlined in [Chomsky 82], are:

(6) (a) X-theory
    (b) θ-theory
    (c) Case theory
    (d) Binding theory
    (e) Bounding theory
    (f) Control theory
    (g) Government theory

As we have mentioned earlier, the generation of D-structures is determined by information projected from the lexicon in conjunction with some version of the X-theory of phrase structure. Specifically, lexical properties may include certain selectional requirements. Consider for example that some prepositions, such as for, subcategorize for an NP object, while others, such as away, do not. Additionally, verbs may subcategorize for a variety of phrasal constituents. In general, these subcategorized positions must be assigned a thematic role (henceforth, θ-role). It is also possible for a predicate VP to assign a θ-role to the subject position, even though that position is not subcategorized. The generation of S-structures via the rule Move-α is constrained by the interaction of several principles. The Projection Principle requires that selectional properties of lexical items be represented at each syntactic level. This entails that moved constituents leave a trace behind in their D-structure positions. Bounding theory imposes locality conditions on the application of Move-α via the principle of Subjacency. Note however that under the right conditions elements may still be moved an arbitrary distance. This is achieved by successive "hopping" from one subjacent position to another.
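The interaction of Move-α, traces, and Subjacency described above can be rendered schematically. The sketch below is ours, not the thesis's Prolog: a chain is represented as the list of positions a constituent has occupied, each position given as the path of category labels from the root, and a link obeys Subjacency if at most one bounding node (NP and IP, as standardly assumed for English) separates its two positions.

```python
# Schematic check of Subjacency over a movement chain (an illustration,
# not the thesis's implementation). A position is the list of maximal
# projections properly dominating it, root first; a chain lists positions
# from the final landing site down to the D-structure position.

BOUNDING = {"NP", "IP"}  # bounding nodes assumed for English

def crossed(source_path, target_path):
    """Bounding nodes dominating the source position but not the target."""
    i = 0  # length of the shared prefix (material dominating both positions)
    while (i < min(len(source_path), len(target_path))
           and source_path[i] == target_path[i]):
        i += 1
    return [label for label in source_path[i:] if label in BOUNDING]

def subjacent(source_path, target_path):
    return len(crossed(source_path, target_path)) <= 1

def chain_ok(chain):
    """Every link of the chain (deeper position -> next landing site) is subjacent."""
    return all(subjacent(chain[k + 1], chain[k]) for k in range(len(chain) - 1))

# "Who did you say John saw?": each hop crosses exactly one IP.
wh_chain = [
    ["CP"],                                 # final landing site (matrix Spec-CP)
    ["CP", "IP", "VP", "CP"],               # intermediate trace in embedded Spec-CP
    ["CP", "IP", "VP", "CP", "IP", "VP"],   # D-structure position of the wh-phrase
]
```

Moving the wh-phrase in one swoop from the D-structure position to the matrix Spec-CP would cross two IP nodes and fail this check, which is exactly why the successive "hopping" noted in the text is required.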
The abstract notion of chains is used to represent a constituent in terms of its "history of movement", in which the links of the chain record each application of Move-α. As we will see, chains provide an extremely convenient and elegant means of stating certain principles. Case theory concerns the assignment of Case to noun phrases (NP's). The so-called Case Filter is instrumental in determining the distribution of NP's by requiring that each
Furthermore, the choice of constituents which may follow the head (known as the complements), is already specified in the lexicon as selectional requirements. These two observations make possible the following general rule: (8) X 5  —y X  Complements  Using standard notation and terminology, we take those symbols on the left of the —• to be non-terminal, phrasal nodes, and the rest, by default, to be terminal symbols (i.e. representing a class of lexical items). Symbols appearing in ()'s are considered optional.  CHAPTER  2.  GOVERNMENT-BINDING THEORY  13  Where X may be any lexical category , N, V, A or P. X represents a non-terminal 6  node which inherits its grammatical properties from X . The choice of Complements is projected from the subcategorization information in the lexical entry for X. The next level of phrase structure includes the rule for specifiers. The possible specifiers of a phrase are determined by the lexical properties of a given head. For instance, the possible specifiers for noun phrases are taken to be determiners (eg. articles the,a and quantifiers each,all) and possessive NP's, as in the man's coat. For adjective and prepositional phrases, the specifiers may be degree modifiers. The phrase structure rule for specifiers is the following: (9) X  -»• Specifier ~X  Finally, the theory must account in some way for adjuncts to phrases. Adjuncts are those phrases which are in no way lexically selected by the head, but rather appear optionally to further modify or describe the phrase in question. Typical examples are prepositional phrases (PP's) which can modify NP's, VP' or AP's as in the following: (10)  (a) (b) (c)  the man on the hill watched with the telescope drunk at the party  (= NP) (= VP) (= AP)  Indeed, such PP adjuncts can also be a source of syntactic ambiguity as in the now famous example: / saw the man on the hill with the telescope, in which the two PP's can be interpreted as modifying either saw or man in a variety of ways. 
Additional adjuncts include adjectives and relative clauses for noun phrases and adverbs to verb phrases. The addition of appropriate rules for adjuncts yields the following set of rules for a theory of X : 7  6  The lexical categories are not "arbitrary". Rather, they can be considered abbreviations for the more fundamental substantive [± N] and predicative [± V] features. The correspondence is as follows: N = [+N -V], V = [-N +V], A = [+N +V], and P = [-N -V].  7  In the two-level X system, X = XP. That is, NP, VP, etc. refer to the maximal projections of their respective phrasal categories. We use both notations interchangeably throughout the thesis.  CHAPTER  2.  GOVERNMENT-BINDING THEORY  (11)  X X X X  —• Specifier X —• Adjuncts X —y X Adjuncts —>• X Complements  (a) (b) (c) (d)  14  We have so far considered phrase structure only with respect to the so-called lexical categories. We have yet to provide an analysis of the the structure of S (that is, the rule of (7a)) in terms of X-theory. Additionally, we must account for the structure of embedded and relative clauses. Earlier formulations of X phrase structure took these as exceptional constituents, and the following rules were stipulated for them: (12)  (a) (b)  S -+ NP Infl S -> Comp S  VP  The rule of (12a) is virtually identical to that of (7a), with the exception that the Aux  (auxilliary) element has been replaced by Infl (for, inflection). Infl is presumed to  contain information about the tense of the clause, typically represented by the [±TNS] feature. Just in the case Infl is [4-TNS], it also contains the agreement element (abbreviated AGR) consisting of the usual person, number, and gender attributes. The second rule, (12b), states that embedded or relative clauses consist of a (possibly optional) element Comp,  followed by an S. Comp is a complementizer element such as that or for. 
The  following example uses the traditional labeled bracketing representation to illustrate these constructions: (13)  [s I think [^ that [5 the man bought [JVP the book [^ that [5 I wanted ]]]]]]  A more recent analysis however has been to consider Infl and Comp as non-lexical categories, which are subject to the rules of X-theory in the same way as lexical categories [Chomsky 86a]. That is, take Infl (= I) to be the head of IP (=S) and Comp (= C) to be the head of CP (=S). Under this analysis, it seems natural to let the subject NP be a specifier to IP, and assume that VP is a complement which is inherently selected for by I. Additionally, C selects for an IP complement. The specifier position of CP, as we shall see in Chapter 3, is useful as a landing site for moved constituents (such as wh-phrases), but  CHAPTER  2.  GOVERNMENT-BINDING THEORY  15  is typically not filled at D-structure. Under this analysis the rules of (12) are replaced by the following instantiations of the X-theory: (14)  (a)  C  -> Wh-phrase  0>) 00  C  -> _C I  I -*  (d)  I  C  Nj I V  The X rules outlined in (11) and (14) are indeed sufficient for describing the phrase structure of languages such as English. Problems arise however when attempting to account for other languages. Specifically, these problems concern the order, or precedence relations. That is, while specifiers are initial in many Indo-European languages, this generalisation does not hold for many other languages . In addition, languages vary with respect to the 8  position of heads relative to their complements. For English, the heads of all categories are initial with respect to their complements. In German however, both V and I are considered to be final [Thiersch 78]. In general then, the dominance relationships of the rule in (11) hold, but the precedence relationships must be taken to vary parametrically for individual languages. It is possible then to express X-theory in even more general terms. 
The rules of (15a–b) represent similar dominance relations, but permit variations in precedence to be determined by parameters (or, as we shall see, other principles)⁹. Letting the superscripts represent the bar-levels used above, we take arguments (i.e. subcategorized constituents) to be sister to Xⁱ, i < 2, and adjuncts to be sister to X²¹⁰.

(15) (a) Xʲ → Y, Xⁱ
     (b) Xʲ → Xⁱ, Y
     (c) where i ≤ 2, j ≤ 2, and i, j ≥ 0.

We may take Y to be either an X² or a simple lexical specifier.

8. See [Lightfoot 82] for a discussion of constituent order for a variety of languages.

9. This version of X̄-theory is adapted from Chomsky's class notes, 1986.

10. This notion of X̄ phrase structure differs from that of (11) in two respects: firstly, adjuncts are now sister to X² instead of X¹, and secondly, the Specifier position is now a "special instance" of an adjunct. This is somewhat inconsistent with the literature, but is convenient for our purposes and nothing critical hinges on it.

2.4 θ-Theory and Lexical Selection

We remarked above that lexical items may have certain selectional requirements. Specifically, they may select, or subcategorize, for certain phrasal complements. Additionally, lexical items may require the presence of constituents considered necessary thematic participants of the sentence. In such cases, the required constituent is assigned a θ-role by the lexical item which selects for it. Subcategorization properties of a given head will require that constituents of a specific category appear adjacent to it¹¹, and dominated by some projection of the head. This is primarily a syntactic requirement which determines the complements of the head for the purposes of X̄-theory.

θ-theory concerns the thematic relations which are assigned by a lexical item. This is related to subcategorization, in that subcategorized constituents are generally assigned θ-roles (we will strengthen this notion below).
Furthermore, a lexical item may assign an external θ-role to its Subject¹². Consider for example the sentence John put the book on the refrigerator. Here, the head of VP, put, subcategorizes for NP and PP complements. Additionally, the verb assigns three θ-roles: Agent is assigned to the subject, Theme to the direct object, and Location to the prepositional phrase.

The assignment of a θ-role to a constituent is known as θ-marking. More precisely, a constituent receives a θ-role by virtue of occupying a position which is θ-marked (henceforth, a θ-position). It is a fundamental requirement that all θ-roles be assigned at D-structure, which is considered to constitute a "pure" representation of the thematic interpretation. The proper assignment of θ-roles is determined by the θ-Criterion, stated as follows:

(16) θ-Criterion: each argument bears one and only one θ-role, and each θ-role is assigned to one and only one argument.

11. In fact, the adjacency requirement may be taken to vary parametrically, as suggested by languages such as Japanese.

12. In fact, the external θ-role is taken to be assigned by the VP, as determined compositionally by the verb and its complements [Marantz 81].

Subcategorized elements are considered to be directly θ-marked. In addition, we observed above that the subject position may be indirectly θ-marked. This indirect θ-marking is generally considered to be mediated through I. If we further assume that the θ-criterion applies at each syntactic level of representation, then movement may never be to a θ-position, since this would result in a θ-role being assigned to two arguments (i.e. the D-structure argument which must have vacated the position, and the element which is moved to the position).

The Projection Principle in effect generalizes the notion of selectional requirements across each level of syntactic representation.
Specifically, it is stated as follows¹³:

(17) The Projection Principle:
(i) Subcategorized positions are θ-marked by the governing head, at each syntactic level of representation (i.e. D-structure, S-structure, and LF).
(ii) θ-marked positions must be represented categorially at each syntactic level of representation.

The Projection Principle is fundamental to GB theory, as it constrains the mapping between each level of syntactic representation. As a result of imposing this general well-formedness condition, the Projection Principle has as one of its consequences the theory of empty categories. That is, when an element is moved from its D-structure position, that position must remain in some sense, even though it is not filled by any lexical constituent. To account for this, a phrasal constituent of the same category as the moved element remains in the vacated position, dominating a trace which is co-indexed with the moved element. Note that movement or raising to object position is now prohibited by the Projection Principle and θ-criterion, since the raised constituent would receive two θ-roles¹⁴.

While the Projection Principle stipulates that direct θ-marking is entailed by subcategorization, indirect θ-marking is clearly not. The assignment of a θ-role to the subject is in no pre-determined sense obligatory, but rather is determined purely by the θ-marking properties of the head.

13. This statement of the Projection Principle differs slightly from the formulation in [Chomsky 81a], but is essentially the same in both spirit and function.

14. There have been recent claims that raising to object is indeed possible, and that the θ-criterion should be restated. For discussion of this see [Pullum et al. 88] and [Massam 85].
So, while raising to object is not possible, raising to subject sometimes is, as the following examples illustrate:

(18) (a) It seems [CP that John likes Mary].
     (b) Johnᵢ seems [IP eᵢ to like Mary].

While seems does select for a propositional complement, it does not θ-mark the subject position. In (18a) the non-referential, pleonastic element it is inserted to fill the position. This permits (18b) to be generated, where John moves from the subject position of the embedded clause (where it receives its θ-role) to the non-θ-marked subject position in the matrix clause (known as a θ̄-position).

The Extended Projection Principle introduces the additional stipulation that every clause must have a subject. This is necessary just in the case where the verb does not select for a subject, as was observed in (18). That is, the sentence Seems that John likes Mary is ruled ungrammatical by the Extended Projection Principle.

2.5 Movement

The shift from rules to principles has permitted the generalisation, indeed trivialisation, of the transformational component. While past approaches posited numerous transformational rules, such as that for passive illustrated in (4), the more recent tack has been to propose the singular operation Move-α. Move-α essentially says: move anything, anywhere. The possibility of over- or incorrect application of this operation is ruled out by the principles of GB. In short, these principles dictate when movement is possible, necessary, or prohibited.

We have introduced the notion of chains as a representation of arguments in terms of their "history of movement". More precisely, a chain consists of a tail (its D-structure position) and a head (its S-structure position). By virtue of the θ-criterion, each chain is assigned exactly one θ-role, and this must be to the tail of the chain. The chain may also
contain intermediate positions, represented by traces, through which the argument moved on the way to its destination S-structure position.

Despite the general nature of the Move-α operation, the types of movement possible are rather severely constrained by the principles. As Chomsky observes, there are essentially two types of movement: substitution and adjunction. We are concerned here primarily with substitution, which has the following general properties [Chomsky 86a]:

(19) (a) There is no movement to a complement position.
     (b) Only X⁰ can move to a head position.
     (c) Only X″ can move to a specifier position.
     (d) Only X⁰ and X″ are visible to Move-α.

The θ-criterion clearly accounts for (19a), while (19b–c) would appear to follow from the Structure Preserving Hypothesis [Emonds 76]. The constraint of (19d) is simply stipulated, but seems intuitively desirable.

In English and German, the head-to-head movement of (19b) typically involves the categories V and I. As we shall see in Chapter 3, this type of movement is responsible for have-be raising and Subject-Aux Inversion (SAI) in English, as well as for the so-called verb-second phenomena in German.

We have observed that movement must always be to a θ̄-position. In addition, we can make the general distinction between argument positions (A-positions), which may receive a θ-role, and non-argument positions (Ā-positions), which may not. The subject is therefore always an A-position, but not necessarily a θ-position. In general, any individual application of Move-α may be from an A-position to an A-position (henceforth, A-movement) or from an A or Ā position to an Ā-position (Ā-movement). A-movement is to a subject position (SPEC,IP) and is generally determined by Case theory, as we shall see in Section 2.7. An instance of Ā-movement would be movement of a wh-phrase to the SPEC,CP position.
2.6 Government

The phrase structure of X̄-theory permits us to describe certain structural relationships between constituents. Central to GB theory is the notion that many of its principles prescribe certain locality constraints on the relations between items in the tree. That is, a given principle defines a domain which is local with respect to the constituent that principle concerns. The most prominent, and indeed fundamental, of these structural domains is that of government. Government is itself defined in terms of the less restrictive notion of m-command. We take these to be defined as follows¹⁵:

(20) m-command: α m-commands β iff every maximal projection dominating α dominates β.

We may now define government as follows:

(21) government: α governs β iff
(i) α m-commands β, and
(ii) α is a head (i.e. X⁰), and
(iii) β m-commands α.

The theory of Government plays a central role in determining the distribution of empty categories. Specifically, there are two types of empty categories: traces, which occupy the positions which have been vacated by movement, and PRO, which is a phonetically null pronominal anaphor. The basic well-formedness condition for empty categories is known as the Empty Category Principle (ECP), which states that all traces must be properly governed. This might be extended to include the observation that PRO must be ungoverned. This Extended ECP [Chomsky 82] can be stated simply as follows:

(22) Extended ECP: If α is an empty category, then
(i) α is trace iff it is properly governed;
(ii) α is PRO iff it is ungoverned.

15. These definitions have been adapted from a variety of sources in the current literature.

where,

(23) Proper Government: α properly governs β iff α governs β (α a lexical X⁰), or α locally Ā-binds β

where the notion of locally Ā-bound entails that the empty category (henceforth, ec) be bound from an Ā-position which is "not too far away".
In general then, if an empty category is governed, it is a trace, and if not, it is PRO. We may also assume that PRO may only appear in subject position, since the object position is necessarily governed by its subcategorizing head. To see how the ECP applies, consider the following sentences:

(24) (a) Whyᵢ does Barry want [ PRO to be a rocket scientist ] tᵢ ?
     (b) Whoᵢ does Barry want [ tᵢ′ [ tᵢ to be a rocket scientist ] ] ?

In (24a) we see that the embedded subject ec is ungoverned, and therefore must be PRO. In (24b), however, we see that the subject ec is locally bound by tᵢ′, which is in the SPEC,CP position (an Ā-position). The subject ec of (24b) is therefore properly governed, and must be a trace.

2.7 Case Theory

Case theory concerns the assignment of Case to noun phrases, and plays an important role in determining their distribution. While Case may be realized morphologically, NP's are considered to receive Case regardless of their morphological status. In English, morphological case is rather impoverished, generally distinguishing only the nominative, accusative and genitive forms of pronouns such as he/him/his, we/us/our and the wh-pronouns who/whom/whose (although the distinction between who and whom is disappearing). German, on the other hand, maintains a fairly rich case system, in which the nominative, accusative, dative, and genitive forms of most NP's are distinguished (although not always uniquely). Case theory, however, is not concerned with such phenomena, which appear to vary widely among languages, but rather with the more general assignment of abstract Case.

The fundamental requirement of Case theory is that every NP receive (abstract) Case. An NP receives Case by occupying a position to which Case is assigned. Specifically, a position is assigned Case if it is governed by a Case-assigner. For English the Case-assigning categories are generally taken to be V, P, and I.
The ability of V and P to assign Case must additionally be specified in the lexicon. If a lexical item can assign Case, it is said to be transitive; otherwise, intransitive. In addition, I is a Case-assigner just in case it has the feature [+TNS]¹⁶. The fundamental principle of Case theory, the Case Filter, is stated as follows:

(25) Case Filter: *NP, where NP is phonetically realized and has no Case.

Chomsky has suggested that this requirement can be re-formulated in terms of chains (see [Chomsky 81a], Chapter 6). That is, Case theory requires that every chain be assigned Case exactly once. Consider for example the sentences below:

(26) (a) * It appears [IP John to like Mary].
     (b) Johnᵢ appears [IP eᵢ to like Mary].
     (c) It appears [CP that John likes Mary].
     (d) * Johnᵢ appears [CP (that) eᵢ likes Mary].

The Case Filter rules (26a) ungrammatical, since John is not assigned Case (I is untensed). This can however be rendered grammatical by moving John to the subject position of the tensed matrix clause, as in (26b). This results in the formation of the chain (John, eᵢ). This is grammatical since the chain receives a θ-role at eᵢ, and Case is assigned to John. If the embedded clause is tensed, as in (26c), then its subject need not move. Indeed, movement of the embedded subject is prohibited, since the resulting chain would be assigned Case twice, as shown in (26d)¹⁷.

There are however instances when the subject of an untensed clause does not need to move. Consider,

16. Another point of view is to consider AGR the Case-assigner, not I[+TNS], since AGR is present (or, rich) only for I[+TNS].

17. For other accounts of the ungrammaticality of (26d), see [Chomsky 81a].

(27) (a) I believe [IP John to like Mary].
     (b) I believe [CP that John likes Mary].

As we would expect, (27b) is grammatical much as (26c) is. How then can we explain the grammaticality of (27a)?
The solution adopted here is to assume that government may take place across the maximal projection IP (= S). The recent position on this is to assume that if X″ is governed, so are its specifier, SPEC,XP, and its head, X⁰ [Chomsky 86a]. Therefore, since believe governs the IP, the subject, SPEC,IP, is governed and thus can be Case-marked. This permits the matrix verb to Case-mark the subject of the embedded clause, rendering (27a) grammatical. The difference then between the verbs believe and appear is that believe is transitive, permitting it to assign Case, while appear is not. This process is generally referred to as Exceptional Case Marking (ECM)¹⁸.

As a final example, consider the following set of sentences:

(28) (a) I want [IP John to like Mary].
     (b) I want very much [CP for John to like Mary].
     (c) * I want [CP that John like(s) Mary].

At first glance, (28a) would suggest that want behaves similarly to believe. Examination of (28b), however, has led to the analysis that want selects for a CP with a for complementizer. The for acts as a preposition, able to assign Case to the embedded subject. While for may/must delete when adjacent to the verb, it must be present when there are intervening elements (behaving similarly to the that complementizer). An additional observation is that sentences with believe can be passivised, while those with want cannot.

It has been proposed that Case theory may also play a significant role in determining the so-called free and fixed word order properties of languages. The Case Adjacency Principle (CAP) [Stowell 81] requires that NP's be "string adjacent" to their Case-assigners. The invocation of this principle is taken to be determined by a parameter of variation for individual languages. Consider the following English and German examples:

18. Previous analyses suggested that believe-type verbs could somehow delete the S′ node, leaving only an S, which was a non-barrier to government [Chomsky 81a].
(29) (a) The boy will put the book on the table.
     (b) * The boy will put on the table the book.
     (c) Der Junge wird das Buch auf den Tisch legen.
     (d) Der Junge wird auf den Tisch das Buch legen.

We may explain the ungrammaticality of (29b) by assuming that the CAP is in effect for English. Additionally, we may take the relatively free word (or constituent) order of German to indicate that the principle does not apply in this language. We will not pursue this matter here, but simply note it as a principled solution to determining such properties of language.

2.8 Bounding Theory

The previous sections have outlined how principles constrain possible landing sites for movement. These constraints are imposed primarily by the θ-criterion, the Projection Principle, and Case theory. Bounding theory constrains movement directly by prohibiting a constituent from being moved "too far" in a single hop. Additionally, the theory imposes certain island constraints, where islands are taken to be domains out of which no constituent can be moved. An early approach was to specify these constraints individually. Consider for example the following island conditions proposed in [Ross 67]:

(30) Complex NP Constraint (CNPC): No element in an S dominated by an NP may be extracted from that NP:
* Whoᵢ do [S you like [NP the book that John gave to tᵢ ] ]

(31) Wh-Island Constraint (WhIC): No element contained in an indirect question, S′, may be moved out of that S′:
* Whatⱼ do [S you wonder [S′ whoᵢ [S tᵢ bought tⱼ ] ] ]

(32) Sentential Subject Condition (SSC): No element may be extracted from an S, if that S is a (sentential) subject:
* Whoᵢ did [S [NP [S that she dated tᵢ ] ] bother you ]

We can summarize these constraints by requiring that φ not be related¹⁹ to ψ in the following contexts:

(33) CNPC: ... φ ... [S ... [NP ... ψ ...
     WhIC: ... φ ... [S ... [S ... ψ ...
     SSC:  ... φ ... [S ... [NP ... [S ... ψ ...
19. We take "related" to include the antecedent-trace relationship of a moved constituent and its trace.

In an attempt to capture these constraints in a principled fashion, Chomsky introduced the Subjacency Condition [Chomsky 73], a formulation of which is stated below:

(34) Subjacency: A single application of Move-α may not cross more than one bounding node. That is, φ may not be related to ψ in the following context:
... φ ... [α ... [β ... ψ ...
where α and β are bounding nodes.

Additionally, it is necessary to impose the condition of Strict Cyclicity on the operation of Move-α [Chomsky 73]. This condition may be stated as follows:

(35) Strict Cycle Condition: Once Move-α applies across a cyclic node δ, no future application of Move-α may be applied so as to solely affect a subdomain of δ.

This condition essentially requires that Move-α be applied in a strictly "bottom-up" manner. While the notion of cyclic node has been defined in a variety of ways, we take it to be any maximal projection, following [Williams 74].

Under this notion of Subjacency, the bounding nodes for English were generally taken to be NP and S. As we can see, these choices for bounding nodes account for all the island constraints in (33). The choice of bounding nodes was, however, shown to be subject to parametric variation across languages, such as Italian, where evidence suggested the bounding nodes were NP and S′ [Rizzi 82].

The Subjacency condition is not sufficient, however, to account for all possible island effects. Consider for example the following sentences:

(36) (a) Whoᵢ did you read [ a book about tᵢ ] ?
     (b) * Whatᵢ did you read [ a book under tᵢ ] ?

While (36a) is perfectly grammatical, (36b) is not, with the intended reading: What was the book that you read resting under?
To account for this phenomenon, Huang proposed the Condition on Extraction Domain [Huang 82], which can be stated as follows:

(37) Condition on Extraction Domain (CED): A phrase α may be extracted out of a domain β only if β is properly governed.

The formulation of proper government adopted was assumed to exclude adjuncts. If we take book to optionally select for an about PP, then extraction out of the PP is possible, as in (36a) (since the PP will be properly governed). Extraction from the locative PP adjunct in (36b) is ruled ungrammatical by the CED principle, since the PP is not properly governed²⁰.

In an attempt to account for both the CED and Subjacency with one principle, Chomsky has proposed the concept of barriers [Chomsky 86a]. The work has been an attempt to recast the principles of government and bounding in terms of this more general notion. We restrict our discussion here to the account of Subjacency developed by Lasnik and Saito, which is a revision of that suggested by Chomsky. We take the definition of barrier to be as follows [Lasnik et al. 88a]:

(38) barrier: α is a barrier for β iff:
(i) α is a maximal projection,
(ii) α is not L-marked, and
(iii) α dominates β.

where,

(39) L-marking: α is L-marked by β iff α is directly θ-marked by β.

20. Note, this requires that the government domain be determined by the first (i.e. lowest) X² node. In the present system, however, we assume m-command to be determined by the "collective" set of X² nodes (i.e. the highest node of the phrase).

In addition, we follow Lasnik and Saito in assuming that VP can be L-marked by I, but that IP cannot be L-marked by C. With these notions defined, we can now restate Subjacency as follows:

(40) Subjacency: β is Subjacent to α iff, for every γ a barrier for
β, the maximal projection immediately dominating γ dominates α.

To see how Subjacency now applies, consider the following sentences, taken from the examples above²¹:

(41) (a) * Whoᵢ do [γ you like the book [γ OPⱼ that [γ John gave tᵢ tⱼ ] ] ] ?
     (b) * Whatᵢ do [γ you wonder whoⱼ [γ tⱼ bought tᵢ ] ] ?
     (c) * Whoᵢ did [γ [γ that she dated tᵢ ] bother you ] ?
     (d) Whoᵢ did [γ you read a book about tᵢ ] ?
     (e) * Whatᵢ did [γ you read a book [γ under tᵢ ] ] ?

The γ symbol has been used to indicate those maximal projections which are barriers, that is, those phrases which are not L-marked. In each case, our new formulation of Subjacency accounts for the phenomena of each of the other approaches outlined above. Notice that in (41d), book is considered to subcategorize (optionally) for an about PP. The PP is therefore L-marked, and not a barrier. In (41e), however, the under PP is simply an adjunct, and therefore not L-marked, making it a barrier. In this case, the antecedent for tᵢ appears outside the first maximal projection dominating γ, thus violating the Subjacency condition.

21. We follow linguistic convention in using OP to represent an empty operator, a phonetically null wh-pronoun.

Chapter 3

Representations and Analysis

In this chapter we will outline the system of representations adopted for the lexical and syntactic components of the system. The lexicon specifies the properties of words known to the system. The section discussing syntax will outline how the X̄ phrase structures are represented, and we also present the syntactic analysis of English and German which has been adopted in this work.

3.1 The Lexicon

The lexicon constitutes the vocabulary of the system. It is comprised primarily of two components: 1) a database of lexical entries, known as the dictionary, and 2) morphological information for constructing the regular inflected forms of the dictionary entries.
A lexical entry can contain basically three types of information:

(42) Morphological: Agreement features such as person, number, gender, and the tense and participle forms for verbs.
     Syntactic: Primarily subcategorization information, properties of transitivity, and any other features affecting syntax.
     Semantic: We restrict this to the θ-marking properties of lexical items.

For our purposes, the syntactic and semantic information is effectively merged. That is, subcategorization information is considered to indicate the direct θ-marking properties of lexical items, and indirect θ-marking properties (i.e. of the subject) are specified by a simple binary feature. No attempt is made here to actually assign thematic roles to constituents; rather, we observe only the syntactic requirements that this would impose.

3.1.1 The Dictionary

The format of the dictionary is similar to that of [Sharp 85], where entries are represented as Prolog facts, or unit clauses, of the following form:

(43) dict(Language, Category, Word, LexicalInfo).

Language simply specifies the language of the entry, and for our purposes may be instantiated as follows:

(44) Language = eng   English
               ger   German

Category indicates the lexical category for the entry. Additionally, a category may be parametrized, to permit subcategorization of a specific form or feature. The parameters may also assist in specifying and restricting possible phrase structure configurations. Category may take on the following values:

(45) Category =
     n(Case)       Noun, with a Case parameter (for German):
                   Case = "nom" (nominative), "acc" (accusative), "dat" (dative).
     v(TNS/Form)   Verb, with TNS (tense) and Form (participle):
                   TNS = "tns(+)" (tensed) or "tns(-)" (untensed);
                   Form = "part(pres)" (present participle) or "part(past)" (past participle).
     p(Type)       Preposition, where Type = "loc" (locative), "dir" (directional),
                   "tem" (temporal), "des" (descriptive).
     i(TNS)        Infl, with TNS parameter, similar to verbs.
     c(Lev/Type)   Complementizer, where Lev = "mat" (root clause) or "emb" (embedded clause),
                   and Type = "sent" (sentential complement) or "rel" (relative clause).
     d             Determiners, specifiers to noun phrases.

Word is the lexical item itself.

LexicalInfo is actually a list structure, containing all the morphological and syntactic information which is specific to the lexical entry. The list may contain the following information structures:

(46) LexicalInfo =
     ftr(FtrList)    Morphological and inflectional features, contained in FtrList.
     subcat(Frame)   Subcategorization frame, where Frame is a list of subcategorized categories.
     theta(±)        Indicates whether the verb θ-marks the subject: theta(+) (yes), theta(-) (no).
     irr(List)       A list of irregular forms for the entry.
     root(Word)      The root, if the entry is irregular.
     english(Eng)    Specifies the English lexical equivalent.
     german(Ger)     Specifies the German lexical equivalent.
     pl(Plural)      Specifies an irregular plural form.
     aux(±)          Marks an auxiliary verb: aux(+) (auxiliary), aux(-) (main).

The set of features contained in the ftr construct are drawn from the following:

(47) FtrList =
     per(Per)     Person: per(1), per(2), per(3).
     num(±)       Number: num(+) (plural), num(-) (singular).
     gen(Gen)     Gender: gen(m) (masc.), gen(f) (fem.), gen(n) (neut.).
     wh(±)        Wh-item: wh(+) (e.g. who) or wh(-).
     case(Case)   Case: case(X), X = nom, acc, or dat.
     proper(±)    Indicates if a noun is proper: proper(+) (proper), proper(-) (common).
     num(N,G,C)   Represents the triple of number, gender, and Case, used for determiner forms (German only).

To reduce the size of entries, and the redundancy of information, default feature sets can be specified for the individual categories of a language.
When the entry for a word is retrieved, the default FtrList of the category is consulted for any information not present in the entry itself. Defaults may be specified as follows:

(48) defaults(Language, Category, DefFtrList).

The defaults for the present system appear at the beginning of the lexicon for each language (cf. Appendix G).

3.1.2 The Morphology

Any given lexical item may have a variety of "surface" forms. That is, nouns have a plural form, determiners may vary according to Case, number, and gender (for German, specifically), and verbs may be tensed or appear in participle forms. Often, these forms can be constructed from the root word by applying specific morphological rules¹. An obvious example is the addition of "s" for the plural form of a noun, or the addition of "ed" for the simple past of a verb. If the various forms of a lexical entry can be constructed through the application of some predetermined operations for that language, the item is said to be regular. If these rules fail to apply to a given item, then it is irregular. Consider the plural form of "woman", which is not "womans", as our pluralisation rule would suggest, but rather "women".

While the present system performs only limited morphological analysis, it is capable of analyzing word suffixes. This permits the system to construct dictionary entries for the inflected forms of regular words, thus greatly reducing the size of the lexicon. The suffix rules are specified in the lexicon for each language, and have the following form:

(49) suff_table(Language, Category, InflEnd, RootEnd, Features).

where,

(50) Language = eng/ger    The language for which the rule applies.
     Category = n/v/i      The category for which the rule applies.
     InflEnd               The ending of the inflected form.
     RootEnd               The ending of the root form.
     Features = FtrList    The features associated with the inflected form.

1. We are concerned here with generative morphology.
Specifically, we account for the orthographic conventions for representing variations in the morphology.

The dictionary now consists primarily of two types of entries: 1) the root entries, containing the syntactic features, and 2) entries for the irregular forms, which contain their inflectional features and a pointer to the root entry. In fact, German presents a problem for the rather simple morphological analysis performed by this system. Specifically, regular lexical items may vary not just the suffix, but also the prefix, and possibly inner vowels. Although there may be formal schemes for deriving these variations, they are too complex for this system and are treated as irregular occurrences.

3.2 Phrase Structure

We adopt here the version of X̄-theory presented at the end of Section 2.3. The rules are repeated here for convenience:

(51) (a) Xⁱ → Yʲ, Xⁱ
     (b) Xⁱ → Xⁱ, Yʲ
     (c) X¹ → X⁰, Yʲ
     where: i ≤ 2, j ≤ 2, i, j ≥ 0.

We assume the Specifier position for all phrases appears as the first (highest and leftmost) pre-adjunct, and that all adjuncts are sister to an X² node. The complements are those arguments which are subcategorized by the head, and appear as sisters to the X⁰ node or its X¹ projections. The result is a strictly binary branching representation of X̄ phrase structure.

To represent this structure, we introduce the "/" operator to represent dominance, and use Prolog's list notation "[ ... ]" to indicate sisterhood and precedence. Since the phrase structure trees are strictly binary, a list must contain exactly two nodes. As a trivial example, the representation of "γ dominates α and β, where α and β are sisters and α precedes β" is as follows:

(52) γ / [ α, β ]

With the representation of these structural relations defined, we will now present the choice of representations for the nodes of the tree.
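The "/" and list encoding of (52) nests in the obvious way: a tree in which γ dominates α and δ, and δ in turn dominates β1 and β2, is written γ/[α, δ/[β1, β2]]. The following sketch (with illustrative atomic labels, not the actual node terms of the system) shows how such trees can be traversed in Prolog, collecting the leaves in left-to-right order:

```prolog
% leaves(+Tree, -Leaves): collect the leaves of a binary tree in the
% "/"-and-list encoding of (52), left to right.  Any term that is not
% of the form _/[L,R] is treated as a leaf.  (Illustrative only.)
leaves(_ / [L, R], Leaves) :- !,
    leaves(L, Ls),
    leaves(R, Rs),
    append(Ls, Rs, Leaves).
leaves(Leaf, [Leaf]).

% ?- leaves(g/[a, d/[b1, b2]], L).
% L = [a, b1, b2].
```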
The classes of nonterminal and terminal nodes correspond to the X̄ projections and the "lexical" nodes respectively. There are three possible nonterminal nodes. The first is the maximal projection: the highest X² node, which we will call the phrasal projection for the phrase (i.e. its root)². It has the following form:

(53) xmax(Category, ID, Features, Constraints)

where Category is the phrasal category, determined by the head. ID is a unique identifier for the phrase, used to distinguish it from other phrases. This is used by procedures which search the tree for a specific phrase. Features is a list of terms which indicate the features (such as agreement) and properties (such as θ-roles and L-marking) assigned to the phrase. Constraints is used in syntactic analysis, and will be discussed in Section 4.2.

The remaining non-terminals represent the intermediate X² and X¹ projections. These are simply represented as follows:

(54) (a) xmax(Category)
     (b) xbar(Category)

where Category is the same as that for the maximal node. Note that the X² maximal projection and the X² intermediate projection are distinguished by their arity.

The terminal nodes are used to represent the actual lexical items, or leaves, of the parse tree. Specifically, they may be simple lexical specifiers or adjuncts to the head of a phrase (i.e. X⁰). For these, we adopt the following representation:

(55) (a) spec(Category, Word, Features)   Specifier
     (b) adj(Category, Word, Features)    Adjunct
     (c) head(Category, Word, Features)   Head

²We introduce the notion of the maximal projection as the "highest" X² projection of a phrase. This is not to suggest that there is a third X level. The distinction being made here is a purely categorial one, not typological. In general, the maximal projection refers collectively to all the X² projections.
For convenience, however, the features of the phrase are attached to the highest node, and it is also used as the maximal projection for purposes of m-command and government. This differs from [Chomsky 86a], which takes the first X² projection as determining the m-command domain.

where,

(56) Category = the category of the lexical item.
     Word     = the lexical item itself.
     Features = the list of morphological/syntactic features of the word.

Additionally, punctuation symbols have the following representation:

(57) punc(punc, Punctuation, Mode)

where,

(58) Punctuation = "?" or "."
     Mode        = "ques" or "decl"

Finally, we must be able to represent empty categories (ec's), which are phonetically null, but nonetheless present in the structure. These have the general form:

(59) e_cat(Category, Type)

where Category represents the category of the position occupied by the ec, and Type indicates whether the ec is an np-trace, wh-trace or PRO, represented as "trace(np)", "trace(wh)" and "pro", respectively.

3.3 Transformations

In this section, we will discuss the various types of movement transformations modeled by this system. Specifically, we will examine the basic movement phenomena as presented in Section 2.5, and the conditions under which they may occur in English and German. Inasmuch as we are accounting for only a subset of each of these languages, we also account for only a subset of possible transformations. Specifically, these are head-to-head movement involving V, I, and C; wh-movement to the front of a matrix sentence (i.e. indirect questions are not handled); and NP-movement to a Case-marked position. These are fundamental substitution transformations, and no attempt is made here to account for the various types of adjunction transformations possible.

3.3.1 X⁰-Substitution

We take X⁰-Substitution to be movement of an X⁰ to another head position.
Within this system we account for only two such transformations: verb-raising and inversion.

Verb-raising is the movement of V to I, when I is not lexically filled. The result of the transformation is that the verbal element is appropriately inflected by the AGR features present in I. Consider for example the following two sentences:

(60) (a) Mozart [TNS_past] compose the sonata.
     (b) Mozart composed the sonata.

In this example, the inflection of compose in (60a) is the result of moving it to the head of the inflected I phrase. As we might expect, the same holds for German, as illustrated by the following embedded sentence constructions (recall that VP and IP are head final):

(61) (a) dass Brigitte nach Berlin fahren [TNS_pres]
     (b) dass Brigitte nach Berlin fährt

Inversion is the movement of I to C, typically in the matrix clause (at least for our purposes). This transformation accounts for the so-called Subject-Aux Inversion (SAI) phenomenon which appears in English and German question sentences. Consider the following:

(62) (a) The girl [TNS_pres] have read the book.
     (b) Has the girl read the book?

The generation of (62b) is the result of have moving first to I, where it receives its inflection as above, and then to C, resulting in its pre-subject position. The same occurs in German, in the analogous sentence: Hat das Mädchen das Buch gelesen, where haben raises to the final I position, and then moves to the head of CP, at the front of the sentence. A difference arises between the two languages, however, in that all German verbs may move to C, while in English only auxiliaries (e.g. have and be) are permitted to invert. The result is the insertion of the semantically null "do" element in the I position, in cases where no auxiliary is present and inversion is required³. This is illustrated by the following grammatical/ungrammatical examples:

(63) (a) The woman [TNS_past] write the letter.
     (b) * Wrote the woman the letter?
     (c) Did the woman write the letter?

In fact, inversion is an obligatory transformation in all German matrix clauses, resulting in the so-called verb-second phenomenon when combined with topicalisation. Additionally, it occurs in English wh-questions. Both of these constructions are discussed below.

3.3.2 X̄-Substitution

We take X̄-Substitution to include any substitution transformation involving a maximal projection. We observed in Section 2.5 that such movement may only be to a specifier position. This may be derived from the following: a) maximal projections may only move to positions where maximal projections can appear (i.e. not to an X⁰ position), by the Structure Preserving Hypothesis, and b) movement to an object position would violate the Projection Principle.

In the present treatment we are concerned primarily with two possible types of X̄-Substitution. The first is movement to the SPEC,IP position, known as raising to subject, and the second is movement to the SPEC,CP position, known as movement to COMP.

Raising to subject occurs in situations where an NP must move in order to receive Case. The two possibilities are raising of an object to subject, as in passive constructions, and raising of subject to subject. We are concerned here only with the latter. Consider the following examples:

(64) (a) e [TNS_pres] seem John to like Mary.
     (b) John_i seems t_i to like Mary.

³In the present system, this is accomplished by simply checking to see if inversion is to take place after raising. If so, then do is inserted if no auxiliary can be raised. For further discussion, see Section 5.3 and the raise_verb predicate in Appendix E.

In this situation, the embedded subject must move to receive Case. An additional requirement is that the SPEC,IP position be a θ̄-position, so as not to violate the θ-criterion.
Finally, the trace, t_i, of the embedded subject must be properly governed, so as to satisfy the ECP.

Movement to COMP accounts for the movement of a wh-phrase to the front of a clause. In the case of the matrix clause, wh-movement is usually accompanied by inversion⁴. Consider the following:

(65) (a) The boy [TNS_pres] have put what on the table.
     (b) What_i has the boy put t_i on the table?

In this example, the auxiliary verb inverts to the pre-subject position, and the wh-phrase moves to the front of the sentence, in the SPEC,CP position. The same phenomenon occurs in German, as in the sentence: Was hat der Junge auf den Tisch gelegt. In fact, the verb-second phenomenon of German is the result of inversion and movement of some topic to the SPEC,CP position. That is, if no wh-phrase has moved to the SPEC,CP position, then some other constituent is topicalised, for non-question sentences. This is exemplified by the following:

(66) (a) Der Junge das Buch auf den Tisch legen haben [TNS_pres].
     (b) Das Buch hat der Junge auf den Tisch gelegt.

In this example, haben raises to I and then inverts to C. In addition, some constituent is topicalised, in this case das Buch, resulting in the verb-second configuration of the matrix clause.

⁴Except in those cases where the matrix subject is the wh-phrase.

Chapter 4

Parsing with Principles

In constructing a parser based on the principles of GB theory, a variety of approaches may be taken. In Sharp's system [Sharp 85], possible S-structures are generated according to X̄ phrase structure, and then well-formedness is determined by the principles. The principles are specified as filters on possible S-structure configurations. If a constraint of some principle is violated, the parser backtracks, generating a new S-structure. This process continues until the parser generates a grammatical S-structure.

The efficiency problems of the above strategy are obvious.
That is, by postponing the application of well-formedness conditions until after the entire S-structure has been generated, the effects of backtracking are magnified. An alternative approach is to apply relevant constraints during the parsing stage. In this way it becomes possible to block erroneous parses as early as possible. The model adopted here is roughly illustrated in Figure 4.1.

[Figure 4.1: The Parsing Model — the sentence is input to the X̄-parser (which draws on the Projection Principle, parameters, and the lexicon), while the principles (θ-theory, Case theory, the ECP, and Subjacency) constrain the parse, yielding the S-structure.]

For our purposes, we take S-structure to include a representation of the sentence's phrase structure along with antecedent-trace relations of moved constituents (i.e. chains). In the present system, we exclude the recovery of NP coreference (Binding theory) and the reference of PRO (Control theory).

The syntactic analysis component of the system consists primarily of two modules. Specifically, these are the Parsing Module and the Constraint Module. The parsing module generates candidate S-structures. As each phrase is completed, the constraint module is called to apply relevant principles and constraints.

The parsing module employs an X̄-parser resembling that developed by Sharp. That is, the "language independent" rules of X̄-theory are represented in Definite Clause Grammar (DCG) notation. This includes the basic X̄ rules, as well as those for parsing arguments, adjuncts, and specifiers. The actual phrase structure rules used here are an instantiation of the X̄-rules presented in Chapter 2, tailored for parsing by DCGs. In addition to X̄-theory, the parser uses subcategorization information projected from the lexicon, combined with the Projection Principle, and language-specific information about possible choices for adjuncts and specifiers.
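Logic-grammar rules of this kind can be illustrated with a toy fragment (unrelated to the actual grammar of the system), in which an informant argument of each non-terminal assembles a parse tree by unification:

```prolog
% A toy DCG illustrating informant arguments: each non-terminal
% carries a Tree argument built by unification during parsing.
% (Illustrative only; the system's real rules appear in (68)-(79).)
s(s/[NP,VP])          --> np(NP), vp(VP).
np(np/[det(D),n(N)])  --> [D], {member(D, [the, a])}, [N].
vp(vp/head(v,V))      --> [V], {member(V, [sleeps, sings])}.

% ?- phrase(s(T), [the, girl, sleeps]).
% T = s/[np/[det(the), n(girl)], vp/head(v, sleeps)].
```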
The X̄-parser constructs phrase structure trees during the derivation. The constraint module is applied to these trees as each maximal projection is parsed, in a bottom-up fashion. The result is a "co-routining" of the parser and the principles, which is not unlike the approach taken by Dorr [Dorr 87]. Furthermore, a facility is provided for passing information which is relevant to constraints up the tree. The result is that violation of a given constraint may be detected as soon as possible, forcing the parser to backtrack at the point where the problem occurred.

In this chapter we will present the design, implementation, and operation of both the X̄-parser and the constraint module. These two modules constitute the syntactic analysis component of the system.

4.1 The Parsing Module

The purpose of the X̄-parser is to generate possible phrase structure representations for a given sentence. Since we are recovering S-structure, not D-structure, X̄-theory alone is insufficient. The parser must also be able to account for moved constituents and hypothesize empty categories. To this end, several elements of θ-theory are incorporated directly in the parser. Specifically, the parser accesses subcategorization information of lexical items to drive the parsing of arguments. If a subcategorized phrase is not present, then it is assumed to have moved, and an empty category is inserted. Furthermore, the parser assumes that all IPs have a Subject specifier. These two features essentially capture the Extended Projection Principle. θ-marking and L-marking of subcategorized constituents is also performed by the parser, but θ-marking of the subject and enforcement of the θ-criterion are handled by the constraint module.

The parser is also provided with language-specific knowledge about where possible adjuncts and specifiers may appear. Information concerning head position (w.r.t.
arguments) is specified for the categories of each language. Additionally, information about where moved elements may appear is included. These phrase structure parameters are accessed by the parser during the parsing stage. In this way, a distinction is maintained between the "universal" X̄-parser and the language-specific parameters.

In the parsing module, a logic grammar is used to specify the X̄ phrase structure rules. Logic grammars, originally developed by Colmerauer, permit the specification of typical rewrite rules using logical terms as grammar symbols [Colmerauer 78]. That is, grammar symbols may be functors containing arguments. The arguments, or informants [Dahl et al. 86], may be used to pass information via Prolog's inherent unification mechanism.

The specification of logic grammar rules is itself a specification of a parser. That is, Prolog's underlying theorem prover can be used to derive or parse strings for the language specified. As Prolog uses a top-down, left-to-right, backtracking theorem prover, the result is essentially a recursive descent parser for the logic grammar¹. In the present system we use the Definite Clause Grammar (DCG) formalism [Pereira et al. 80], which is a restricted form of Colmerauer's Metamorphosis Grammars (MG) [Colmerauer 78].

4.1.1 Parsing X̄ Phrase Structure

The X̄ phrase structure rules outlined in Sections 2.3 and 3.2 present a problem for the DCG formalism. Specifically, they contain left recursive rules. The phrase structure template adopted for parsing purposes is roughly the following:

(67) (a) X² → Spec X²
     (b) X² → Adjuncts X¹ Adjuncts
     (c) X¹ → Arguments X⁰
     (d) X¹ → X⁰ Arguments

It is important to note that this is not assuming a three-level X̄ system. The purpose of proposing distinct rules for (67a&b) is to permit the parsing of only one Specifier, and enforce its phrase-initial position (i.e.
the highest, leftmost position, at least for English and German). The DCG rules corresponding to (67a&b) are specified as follows:

(68) xmax(L,C,Tree,S,NS) -->
         spec(L,C,TXmax,STree,S,S1),
         xmax2(L,C,TXmax,S1,NS),
         {agree(L,STree,BTree), gen_ID(BTree),
          constraints(L,C,BTree,Tree)}.

     xmax2(L,C,Tree,S,NS) -->
         adjunct(L,pre,C,TApost,Tree,S,S1),
         xbar(L,C,TXbar,S1,S2),
         adjunct(L,post,C,TXbar,TApost,S2,NS).

These define the rules for parsing the maximal and intermediate X² projections for a language L, of a category C. The third argument, Tree, contains the parse tree representation of the current phrase. It is important to note that the parse tree does not exactly reflect the derivation of the DCG rules used, but rather is constructed in accordance with the representations presented in Section 3.2. The final two arguments represent the old and new lookahead lists, S and NS, respectively.

The three Prolog calls at the end of the first rule invoke the agree, gen_ID and constraints routines. The first ensures that the agreement relations between various elements are maintained (such as Spec/Noun and Subject/Verb agreement). The gen_ID predicate simply assigns a unique identifier to the maximal projection. The last calls the constraint module, which applies each of the relevant GB principles to the current subtree.

The following rules correspond to those of (67c&d), used for parsing arguments:

(69) xbar(L,C,Tree,S,NS) -->
         {head_position(L,C,initial), !},
         xmin(L,C,Args,HD,S,S1),
         {stack(L,initial,HD,S1,S2)},
         arg(L,post,C,Args,HD,[],Tree,S2,NS).

     xbar(L,C,Tree,S,NS) -->
         {head_position(L,C,final),
          HD = X/head(C,_,_),
          stack(_,final,head(C,W,_,F),S,R), !,
          getargs(L,head(C,W,As,F))},
         arg(L,pre,C,As,HD,[],Tree,R,S1),
         xmin(L,C,As,HD,S1,NS).

¹In fact, other parsing strategies may be used in parsing logic grammars. See for example [Abramson et al. 88], which employs the LL(k), LALR(k), and Earley parsing algorithms, among others.
     xbar(L,C,Tree,S,NS) -->
         {head_position(L,C,final), !,
          poss_empty(L,C,As),
          HD = X/head(C,empty,F)},
         arg(L,pre,C,As,HD,[],Tree,S,NS1),
         xmin(L,C,As,HD,NS1,NS).

Each rule begins by checking the head_position parameter, which determines whether the head of a phrase is initial or final with respect to its arguments. The parameter has the following form:

(70) head_position(L,C,Position).

where for a language L, and a phrase of category C, Position specifies the head as being "initial" or "final" with respect to its arguments. This head position parameter is language-specific, and must be set for each category of a language. We take all categories to be initial, except for V and I in German.

The first rule, for head-initial phrases, is straightforward. The head is parsed by xmin, which returns the subcategorization frame in As. Each argument is then parsed by arg, which is discussed in more detail below. The second two rules handle head-final cases. Since the head's subcategorization frame is necessary to parse the initial arguments, the routine gets the head by inspecting the lookahead stack (using the stack predicate). If the head is found, then getargs is used to retrieve the frame. The final rule handles the situation where the final head is not present, either due to absence or movement. The poss_empty parameter, shown in (71), specifies that a category C for language L may be empty for the above reasons. The default subcategorization frame, for categories such as C and I which are predictable, is stated in As.

(71) poss_empty(L,C,As).

The final two rules are used to parse the X⁰ level of a phrase:

(72) xmin(eng,C,Args,Tree,[HD|S],S) -->
         [],
         {HD = head(C,W,Args,F),
          poss_move(eng,c(mat/sent),C,_),
          head_anal(eng,C,W,HD,initial),
          getargs(eng,head(C,W,Args,F)),
          Tree = xbar(C)/head(C,empty,_), !}.

     xmin(L,C,Args,xbar(C)/Tree,S,S) -->
         head(L,C,Args,Tree).
The first rule handles the case where a V or I element has moved to the matrix C position (i.e. inversion), and is not present at I. The purpose of the clause is to determine what the moved element was, so that the appropriate subcategorization information can be retrieved. The second rule simply parses the head of the phrase as follows:

(73) head(L,C,As,head(C,W,F)) -->
         [Word],
         {head_anal(L,C,Word,HD,initial), !,
          HD = head(C,W,_,F),
          getargs(L,head(C,W,As,F))}.

     head(L,C,As,head(Cat,W,F)) -->
         [Word],
         {head_anal(L,Cat,Word,HD,initial),
          HD = head(Cat,W,Args,F),
          poss_move(L,C,Cat,As), !}.

     head(L,C,As,head(C,W,F)) -->
         [head(C,W,_,F)],
         {!, getargs(L,head(C,W,As,F))}.

     head(L,C,As,head(Cat,W,F)) -->
         [head(Cat,W,Args,F)],
         {poss_move(L,C,Cat,As), !}.

     head(L,C,As,head(C,empty,F)) -->
         [],
         {poss_empty(L,C,As), !}.

The first two rules parse initial heads. The first rule will parse a head and retrieve its subcategorization frame, used by the xbar rules. The second rule handles the case where a head of category Cat has moved to the head of a phrase of category C. This possibility is verified by the poss_move parameter, which has the following form:

(74) poss_move(L,C1,C2,Args).

This states that for language L, a head of category C2 may move to the head of a phrase of category C1. The default subcategorization frame for C1 is specified in Args. The various possibilities for head movement must be stated for each language, as they vary from one to the other, and are not handled in a principled fashion by the present system. The second pair of rules are similar to the first two, with the exception that they parse final heads which have been inserted on the lookahead stack. The final rule parses those heads which are not present, and calls the poss_empty parameter mentioned above.
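Taken together, the parameters of (70), (71), and (74) amount to a small language-specific database consulted by the universal rules. The following sketch shows plausible settings; the values (in particular the default frames) are invented for exposition, and the actual settings appear in the appendices:

```prolog
% Illustrative parameter settings; the frames are guesses.
head_position(ger, v(_), final).            % German VP is head-final
head_position(ger, i(_), final).            % German IP is head-final
head_position(_,   _,    initial).          % all other categories

poss_move(eng, c(mat/sent), i(_), [i(_)]).  % I-to-C (inversion)
poss_move(eng, i(_),        v(_), _).       % V-to-I (verb-raising)

poss_empty(eng, i(tns(_)), [v(_)]).         % I may be empty; expects VP
```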
4.1.2 Parsing Specifiers, Adjuncts, and Arguments

In addition to the X̄ rules outlined above, rules are necessary to parse the specifiers, adjuncts and arguments referred to by the above rules. That is, the DCG rules for the spec, adjunct and arg non-terminals must be specified.

The rules for parsing specifiers are relatively straightforward. The first clause parses the specifier, while the second clause permits the specifier to be absent for certain categories.

(75) spec(L,C,TX,xmax(C,ID,[],Cons)/[TSpec,TX],S,NS) -->
         spec(L,C,TSpec,S,NS).
     spec(L,C,X/R,xmax(C,ID,[],Cons)/R,S,S) -->
         [],
         {no_spec(L,CList),
          member(C,CList)}.

     no_spec(L,[n(_),v(_),p(_),c(mat/sent)]) :- !.

The choice of specifier is set for each category of a language. In this system, we take wh-phrases to be the specifiers for CP, and determiners for NP, in English. German is similar, with the addition that it may take a topic as a specifier to the matrix CP. We return to the parsing of wh-phrases and topics below.

The rules for parsing adjuncts are very similar to those for specifiers. The first rule below handles a special case for German, where the highest verb dominated by an infinitival IP must adjoin to the right of I. This can be illustrated by the sentence Ich versuchte das Buch zu sehen ("I tried to see the book"), where sehen has moved to an adjoined position to the right of zu². Note that the remaining rules are ordered such that the parser will first assume no adjuncts. This generally improves parse times, and also tends to yield preferred attachment preferences for PPs in accordance with the Right Association principle [Kimball 73]³.

(76) adjunct(ger,post,C,TX,xmax(C)/[TX,Tadj],S,S) -->
         {C = i(tns(-)), !},
         [head(v(T),W,F)],
         {Tadj = head(v(T),W,F)}.
     adjunct(L,Pos,C,X/Tree,xmax(C)/Tree,S,S) -->
         [].
     adjunct(L,pre,C,Tree,xmax(C)/[Tadj,Tree],S,NS) -->
         adj(L,pre,C,Tadj,S,NS).
     adjunct(L,post,C,Tree,xmax(C)/[Tree,Tadj],S,NS) -->
         adj(L,post,C,Tadj,S,NS).

The possible choices of adjuncts in the present system are PP and CP (relative clause) adjuncts to NPs, and PP adjuncts to VPs. The choices of specifier and adjunct possibilities appear in Appendix C.

²In fact, this is a case where head movement is an adjunction transformation, in contrast with head-to-head movement, which is substitution. This is the only example of adjunction dealt with by the present system, and it is incorporated only because of its obligatory application in German infinitivals.
³For further discussion of the RA principle and parsing, see [Pereira 85].

The rules for arguments attempt to parse each constituent in the subcategorization frame for the head of the phrase. The rules are as follows:

(77) arg(L,Pos,C,[],HD,ALst,T,S,S) -->
         [],
         {build_tree(Pos,C,HD,ALst,T)}.
     arg(L,Pos,C,As,HD,ALst,T,S,NS) -->
         {select(L,C,As,A,NewAs)},
         a_xmax(L,C,A,Tx,S,S1),
         arg(L,Pos,C,NewAs,HD,[Tx|ALst],T,S1,NS).
     arg(L,Pos,C,[A|Args],HD,ALst,T,S,NS) -->
         a_xmax_e(L,C,A,Tx),
         arg(L,Pos,C,Args,HD,[Tx|ALst],T,S,NS).

The first three arguments indicate language, head position, and category. The fourth informant is the list of categories of the subcategorized constituents. The fifth and sixth arguments represent the head of the phrase and the list of parsed arguments. The first rule is selected when all the arguments have been parsed. It calls build_tree with the head and list of parsed arguments, to construct the phrase structure representation, which is returned in the seventh informant T. The second rule attempts to parse an argument by calling a_xmax, and then calls arg recursively. The select predicate is used to retrieve an argument category from the subcategorization frame. This routine can be used to enforce the constituent order requirements for a specific language (i.e. fixed or free). The third rule is applied if the argument is not present, and parses an empty category using a_xmax_e.
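The role of select in enforcing fixed versus free constituent order can be sketched as follows. These clauses are hypothetical, invented to illustrate the idea; the actual definition resides with the system's parsing predicates:

```prolog
% select(+Lang, +Cat, +Frame, -Arg, -Rest): choose the next argument
% category to parse from the subcategorization frame.  For a fixed-
% order language only the frame's first element may be chosen; free
% order permits any element, leaving the rest.  (Sketch only.)
select(eng, _, [A|As], A, As).          % fixed order: head of frame
select(ger, _, Frame, A, Rest) :-       % free order: any element
    append(Front, [A|Back], Frame),
    append(Front, Back, Rest).
```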
The a_xmax and a_xmax_e predicates are identical to xmax and xmax_e, except that the former perform the θ- and L-marking of the parsed phrases, since they are subcategorized.

The following special rules are introduced to parse wh-phrases:

(78) wh_phrase(L,c(mat/_),Tree,S,NS) -->
         {empty_cat(L,C)},
         wh_xmax(L,Cat,C,Tree,S,NS).
     wh_phrase(L,c(emb/T),Tree,S,S) -->
         [],
         {gensym(e,ID)},
         {set_ec(L,T,ID,Tree)}.

The first rule attempts to parse a lexical wh-phrase in the specifier position of the matrix CP. Specifically, it will try to parse any phrase that can have wh status, which is specified by the parameter for each language (taken to be NP and PP here). The wh_xmax rule is similar to that for xmax, with the addition of a check for wh-morphology. The second clause will parse an empty operator as the specifier of a relative clause, or a comp-trace as the specifier of an embedded sentence. The comp-trace may be used in the construction of chains, as an intermediate position.

Topics with verb-second are a German phenomenon, and are treated here roughly as wh-phrases. Specifically, the topic is a constituent that has moved to the matrix SPEC,CP position, but is not a wh-phrase (i.e. it does not have wh-morphology)⁴. The rule for parsing the topic is as follows:

(79) topic(L,c(mat/sent),Tree,S,NS) -->
         {empty_cat(L,C)},
         spec(L,C,TXmax,XTree,S,S1),
         xmax2(L,C,TXmax,S1,NS),
         {add_feature([ant(+),case(_)],XTree,Tr),
          agree(L,Tr,BTree), gen_ID(BTree),
          constraints(L,C,BTree,Tree)}.

Again, this is similar to the xmax rule, with the exception that the topic is marked as an antecedent, much as a wh-phrase is. This is accomplished by adding the ant(+) term to the phrase's feature list, using the add_feature predicate.

4.2 Principles as Constraints

The X̄-parser uses X̄-theory and elements of θ-theory (notably, the Projection Principle and subcategorization) to generate candidate S-structures.
During the course of parsing, the constraint module is consulted for the purpose of constructing chains and verifying well-formedness. Specifically, the constraint module is invoked as each maximal projection is completed, as shown in the first clause of (68). The effect is that the constraints are applied in a bottom-up, left-to-right manner. Each principle comprises a component of the module. Here, these are taken to be the ECP, Case theory, θ-theory and Bounding theory. Each component is constructed such that only the subtree of the current maximal projection is accessible. That is, no information about higher, previously parsed constituents is available⁵. In this way, the constraints can be considered "local" in a very strict sense. Specifically, a constraint may affect only the m-command domain of the head of the current projection, a natural restriction since many of the principles are stated in terms of government.

While the constraints may be formulated so as to require only information that is contained in the current subtree, this can be very inefficient. The ECP, for example, will spend time looking for ec's in every subtree, even when there are none present. Additionally, some mechanism is necessary to keep track of partially constructed structures, notably chains. To deal with both of these problems we introduce the notion of a constraint list. Specifically, these lists maintain that information about the current subtree which is of direct relevance to the principles. That is, once the constraint module has been applied to a phrase, it may wish to pass information up the tree for later consideration.

⁴Topicalisation phenomena in German are much more complex than the treatment here reflects. In reality, a variety of phrasal categories may be topicalised, and the transformation is not restricted to matrix clauses. For more discussion see [Haider et al. 85].
⁵Recall that the parser does employ a top-down strategy, making access to the left context theoretically accessible. In the present system, however, only the current subtree is available at each node.
The constraint lists are stored as the fourth argument of the phrasal node, shown in (53) and repeated here for convenience:

(80) xmax(Category, ID, Features, Constraints)

The Constraints variable is left uninstantiated by the parser, for use by the constraint module. The constraint module first merges the constraint lists of the immediately dominated phrases. As each principle is applied, the list may be modified, and information no longer relevant is removed. When all the constraints have been applied, any new information may be added to the list, and Constraints becomes so instantiated. The constraint list affiliated with a phrasal node therefore represents the "state of affairs" for that node's subtree at the conclusion of the constraint module's application. We refer to information added to the constraint list as constraint requests, since their purpose is to request the attention of the individual constraints. The range of possible requests is as follows:

(81) (a) case(ID,Case)                        A request for Case by an NP.
     (b) theta(ID)                            A request for an external θ-role.
     (c) ec(Cat,ID,Type)                      An ec requests an antecedent.
     (d) ant(Type,Cat,ID)                     An antecedent requests a trace.
     (e) chain(Type,ID,Cat,Case,Theta,Chain)  A partially constructed chain.
     (f) c_chain(Type,ID,Cat,Case,Theta,Chain) A completely constructed chain.

In each case, the ID argument is used to indicate the position of the phrase which issued the request. That is, ID identifies the location of the NP requesting Case, the ec requesting an antecedent, and so on. In (81c-f), Cat simply indicates the category of the individual trace, antecedent, or chain (i.e. PP or NP). In the case of empty categories, Type is used to indicate whether the ec is an np-trace, wh-trace, or PRO.
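As a concrete illustration of (81), a merged constraint list percolated up from the embedded clause of "John_i seems t_i to like Mary" might contain entries like the following. The identifier e1 and the exact term shapes are invented for exposition; the real contents are system-internal:

```prolog
% A sketch of a merged constraint list, using the request terms of
% (81).  The atom e1 stands for a phrase identifier as produced by
% gen_ID; the unbound Case slot awaits satisfaction higher up.
example_constraints([
    ec(n(_), e1, trace(np)),          % the subject ec awaits an antecedent
    chain(a, e1, n(_), _, e1, [e1])   % a partial A-chain rooted at e1
]).
```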
For antecedents however, Type indicates whether the moved phrase is in an A or A position (i.e. as "a" or "abar"). For chains, Type is similar to that for antecedents, with the additional possibility of "pro" chains. This will be discussed in more detail below in the Section 4.2.5. In (81e&f), Case and Theta represent the Case and 8 marked position of the chain, while Chain is a list of each of the positions which make up the chain. Each of the constraint requests will be discussed in more detail with respect to their function in the Sections below.  4.2.1  Applying Constraints  The highest level of the constraint module involves three stages, as shown in the following Prolog segment: (82) constraints(L,C,Treel,Tree2) :— get_constraints(Treel,Cons), satisfy_constraints(L,C,Treel,Cons,Tree2,Constraints), add_constraints(Tree2,Constraints).  CHAPTER  4. PARSING WITH PRINCIPLES  50  The get-constraints predicate merges the constraint lists of each maximal projection which is immediately dominated by the current phrase. In doing so the constraint module becomes aware of what chains have been partially constructed, and what requests remain to be satisfied. The satisfy-constraints predicate applies each of the constraints to the phrase, passing the revised tree and constraint list as arguments. The Prolog code is stated simply as follows: (83) satisfy_constraints(L,C,Treel,Constraints,Tree3,NewConstraints) :— ecp(L,C,Treel,Constraints), case_theory(L,C,Treel,Tree2,Constraints,Constraints2), theta_theory(L,C,Tree2,Constraints2,Tree3,Constraints3), subjacency(L,C,Tree3,Constraints3,NewConstraints). We discuss each of these constraint routines in the remaining subsections of this chapter. The add-constraints routine generates any new constraint requests appropriate for the phrase itself. 
Specifically, "case" requests are issued for each lexical NP, "ant" requests are issued for each non-θ-marked argument, and "theta" requests are issued by VP's which assign an external θ-role. The "ec" requests are not added by this predicate, but are inserted into the constraint list during parsing by the xmax_e and a_xmax_e rules. The "chain" requests are created and updated by the subjacency routine, based on ec's.

4.2.2 The ECP

The Empty Category Principle (ECP) is instrumental in licensing empty categories. From a parsing point of view, it determines whether an empty category is a trace or PRO. We adopt the ECP basically as stated in (22), and repeat it here for convenience:

(84) Extended ECP:
     If α is an empty category, then
     (i) α is a trace iff it is properly governed
     (ii) α is PRO iff it is ungoverned.

In addition, we stated that a trace was properly governed if and only if it was governed by a lexical head, or locally Ā-bound (see (23)). The ECP constraint deals only with those cases where proper government is government by a lexical head, and defers those cases where a trace may be Ā-bound until later. That is, it determines if an ec is a trace by checking to see if it is governed by a lexical item. The following specifies the appropriate Prolog code:

(85) ecp(L,i(_),Tree,Constraints) :- !.
     ecp(L,C,Tree,Constraints) :-
         exists_ec(Constraints,Tree,ID,Type) ->
             ((properly_governs(L,C,Tree,xmax(Cat,ID,_,_)) ->
                  Type = trace(WH)
              ;   Type = pro),
              ecp(L,C,Tree,Constraints)),!.
     ecp(L,C,Tree,Constraints) :- !.

     exists_ec(Constraints,Tree,ID,Type) :-
         member(ec(_,ID,Type),Constraints),
         var(Type).

The first clause causes the constraint to ignore a subject ec, since it won't be governed by a lexical category within the IP projection. Determination of the subject as either trace or PRO is decided by virtue of its being locally Ā-bound or not.
This is not performed by the ecp routine, but rather by the Subjacency constraint or the Case constraint, both of which are discussed below. The second clause captures the rest of the ECP definition, which states that if an ec is governed by a lexical category it must be a trace, and is otherwise PRO. For our purposes, the properly_governs predicate is defined strictly as government by a lexical head (i.e. N, V, A, or P).[6] In this clause, the first goal checks whether there exists an empty category in the constraint list. Then, if the ec is properly governed it is marked as a trace, otherwise as PRO.[7] Note, since PRO can only appear in subject position, this predicate will never actually mark an ec as PRO in a grammatical situation.

[6] In the present system, adjectival phrases are not considered. So in fact, only N, V, and P are taken to be lexical heads here.

[7] In the present system, we take the m-command domain of a given head to be everything dominated by its highest maximal projection. This is contrary to Chomsky's Barriers formulation [Chomsky 86a], in which the m-command domain is determined by the first maximal projection dominating the head. In this way we predict that an ec adjunct of a lexical category is trace, and not PRO.

4.2.3 Case Theory

In the present system, Case theory has two effects. Firstly, it assigns Case to phonetically realised noun phrases. That is, if a noun phrase has requested Case, it must be assigned Case by the governing lexical item. Secondly, once an empty category has been identified as a trace (by the ECP), Case theory is used to determine whether it is an np-trace (i.e. it is not assigned Case) or a wh-trace (i.e. it is assigned Case).[8] Note that a subject ec of a tensed clause will not be dealt with by the ECP constraint, but will be marked as wh-trace by the Case routine.
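Taken together, the ECP and Case theory fully type an empty category. The Python sketch below is purely illustrative (the thesis implementation is Prolog, and the function name is invented): an ungoverned ec is PRO, and a properly governed ec is a wh-trace exactly when it is Case-marked, an np-trace otherwise.

```python
# Illustrative only: how ECP and Case theory jointly type an ec.
# classify_ec is a hypothetical name, not a predicate from the thesis.

def classify_ec(properly_governed, case_marked):
    """Type an empty category from its government and Case status."""
    if not properly_governed:
        return "PRO"            # ungoverned ec's are PRO
    if case_marked:
        return "wh-trace"       # governed and Case-marked
    return "np-trace"           # governed but Caseless

assert classify_ec(False, False) == "PRO"
assert classify_ec(True, True) == "wh-trace"
assert classify_ec(True, False) == "np-trace"
```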
The high-level predicate for the Case principle is as follows:

(86) case_theory(L,C,Tree1,Tree3,Cons1,Cons2) :-
         case_assigner(L,C,Tree1,Trans),!,
         case_mark_lexical(Tree1,Tree2,Trans,Cons1,Cons2),
         case_mark_traces(Tree2,Tree3,Trans,Cons2),!.
     case_theory(L,C,Tree,Tree,Cons,Cons) :- !.

The first goal, case_assigner, determines whether or not the head of the current maximal projection is a Case assigner. If not, then the predicate trivially succeeds, and application of the constraint is abandoned. If so, then any Case requests must be satisfied. The most crucial of the requests is that of a lexical NP, which if not satisfied will cause the parser to backtrack. The predicate for Case marking the lexical NP's is as follows:

(87) case_mark_lexical(Tree1,Tree3,Trans,Cons1,Cons3) :-
         remove(case(ID,Case),Cons1,Cons2),!,
         governed(xmax(n(_),ID,_,_),Tree1,Trans),
         mark_node(ID,case(ID),Tree1,Tree2),
         case_mark_lexical(Tree2,Tree3,Trans,Cons2,Cons3),!.
     case_mark_lexical(Tree,Tree,Trans,C,C) :- !.

If an NP dominated by the current maximal projection requires Case, then the case(ID,Case) request will appear in the constraint list. This request indicates the location of the NP (i.e. the node ID), and the Case requested (e.g. nominative, accusative, etc.). If the NP is governed, then it is Case-marked. This is accomplished by adding the case(ID) feature to the NP's feature list, using the mark_node predicate. If the NP is not governed by the Case assigner, then the routine fails. Note that wh-phrases and topics do not issue a Case request, since their traces must be Case-marked.

[8] This is included in Chomsky's statement of the Generalised Empty Category Principle [Chomsky 81a], which is stated as follows:
    Generalised ECP: If α is an empty category, then
    (i) α is PRO if and only if it is ungoverned
    (ii) α is trace if and only if it is properly governed
    (iii) α is a variable only if it is Case-marked
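The behaviour of the lexical Case-marking loop can be paraphrased procedurally. The sketch below is a hypothetical Python rendering, not the thesis code: every pending case request must name a governed NP, which is then Case-marked; an ungoverned requester causes failure (modelled here as an exception, standing in for Prolog backtracking).

```python
# Hypothetical sketch of lexical Case marking; the thesis code is Prolog,
# and the names here are invented for illustration.

def case_mark_lexical(requests, governed_ids, marked):
    """Satisfy every pending ("case", id, case) request.

    requests     -- constraint-list entries, e.g. ("case", 3, "acc")
    governed_ids -- node IDs governed by the Case-assigning head
    marked       -- set collecting Case-marked node IDs (features)
    """
    remaining = []
    for req in requests:
        if req[0] != "case":
            remaining.append(req)       # leave other requests untouched
        elif req[1] in governed_ids:
            marked.add(req[1])          # add case(ID) to the NP's features
        else:
            # in the Prolog system this failure forces backtracking
            raise ValueError("ungoverned NP cannot receive Case")
    return remaining

marked = set()
rest = case_mark_lexical([("case", 3, "acc"), ("theta", 7)], {3}, marked)
```

Note how non-Case requests simply survive the pass, mirroring the way the Prolog clause only removes case/2 terms from the constraint list.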
If there exists a trace in the current set of constraint requests, Case theory is used to distinguish np-trace from wh-trace, as mentioned above. The predicate for Case marking traces is specified as follows:

(88) case_mark_traces(Tree1,Tree3,Trans,Constraints) :-
         exists_trace(Constraints,Tree1,ID,WH) ->
             (governed(xmax(Cat,ID,_,_),Tree1,Trans) ->
                  (mark_node(ID,case(ID),Tree1,Tree2), WH = wh)
              ;   (Tree2 = Tree1, WH = np)),
             case_mark_traces(Tree2,Tree3,Trans,Constraints),!.
     case_mark_traces(Tree,Tree,Trans,Constraints) :- !.

     exists_trace(Constraints,Tree,ID,WH) :-
         member(ec(Cat,ID,trace(WH)),Constraints),var(WH).

The first goal checks to see if there are any traces. If there is a trace, the predicate determines whether it is governed, and if so the node is Case marked (as for lexical NP's), and the trace is marked as a wh-trace. If the trace is not Case marked, then it is marked as an np-trace.

4.2.4 θ-Theory

While θ-marking of arguments is performed by the parser, θ-marking of the subject is controlled by the constraint module. The θ constraint only applies at IP, and succeeds trivially for other maximal projections. Specifically it handles two cases: that where the subject is assigned a θ-role, and that where it is not. The first case is implemented by the following clause:

(89) theta_theory(L,i(_),Tree1,OldCons,Tree2,NewCons) :-
         remove(theta(+),OldCons,NewCons),!,
         Tree1 = X/[Subj1,Rest],
         Subj1 = xmax(_,ID,_,_)/_,
         add_feature(theta(ID),Subj1,Subj2),
         Tree2 = X/[Subj2,Rest].

If the subject is assigned a θ-role, then the theta(+) term will be present in the constraint list. If so, the subject position is marked as receiving a θ-role by adding the theta(ID) feature to the subject node, where ID indicates the subject's position.

The second clause handles the situation where the subject is not assigned a θ-role, indicated by the lack of the theta(+) term in the constraint list. This possibility is treated by the following clause:

(90) theta_theory(L,i(_),Tree,OldCons,Tree,NewCons) :-
         \+ member(theta(+),OldCons),
         Tree = X/[Subj,_],
         \+ pleonastic(L,Subj),
         \+ Subj = _/e_cat(_,_),!,
         Subj = xmax(_,ID,_,_)/_,
         append([ant(a,n(_),ID)],OldCons,NewCons).
     theta_theory(L,C,Tree,Constraints,Tree,Constraints) :- !.

     pleonastic(eng,NP) :- get_head(NP,head(_,it,_)).
     pleonastic(ger,NP) :- get_head(NP,head(_,es,_)).

In this case there are two possibilities: 1) the subject is a pleonastic element (e.g. it or es),[9] or 2) the subject is occupied by a referential NP which has received its θ-role elsewhere. In the case of the former, the predicate simply succeeds. In the latter instance, the requirement that the subject is an antecedent to an np-trace is added to the constraint list. This is done by adding the term ant(a,n(_),ID) to the list, where "a" indicates the antecedent is in an A-position, "n(_)" indicates it is an NP, and "ID" specifies its location. This constraint term is described in the following section, which discusses the implementation of Bounding theory.

4.2.5 Subjacency and Movement

In this section, we discuss the treatment of moved constituents. Specifically, this section of the constraint module is concerned with the recovery of chains. This involves initialising chains when θ-marked traces are encountered, and prepending each Subjacent intermediate position to the chain until the antecedent is found. In addition to the Subjacency principle, this routine incorporates elements of the ECP, Case and θ-theory. More precisely, the Case filter is verified by ensuring that each NP chain receives Case exactly once. The θ-criterion is enforced by requiring that each chain receive exactly one θ-role.
Finally, a mechanism is introduced whereby a subject ec of an infinitival can be determined as either np-trace or PRO, thus completing the implementation of the ECP. This is accomplished by introducing "PRO" chains, which have the effect of determining whether or not the ec is Ā-bound. In the discussion that follows, we will present in more detail how the chains are constructed, and how the above principles are incorporated. The highest level predicate of the subjacency routine is as follows:

(91) subjacency(L,C,Tree,OldCons,NewCons) :-
         bind_traces(L,C,Tree,OldCons,Cons1),
         check_pro_chains(L,C,Tree,Cons1,Cons2),
         \+ member(barrier(ID),Cons2),
         barriers(L,C,Tree,Cons2,NewCons).

The first goal, bind_traces, controls the construction of chains. The second goal, as we shall see, is involved in implementing the remaining portion of the ECP. The third goal ensures that Subjacency is not violated, while the final goal determines whether the present node is a barrier or not, for "higher" applications of the constraint.

The bind_traces predicate is by far the most involved. Its function is to initialise, construct, and close chains. The first two clauses, shown in (92), handle the cases where a chain of unknown type is present in the constraint list. These chains are created when an ec is found in the subject position of an infinitival IP (and may be either PRO or trace). We take advantage of the idea that if the IP is dominated by a CP, the ec may be locally Ā-bound, but if the IP is dominated by a VP, the ec must be properly governed (by the V) and hence be an np-trace. As a result, if the first node dominating the chain is a CP, then we suggest that the chain is of type "pro", meaning it may or may not be locally Ā-bound. We will return to this possibility below in the discussion of the check_pro_chains predicate.

[9] English in fact has another pleonastic element, "there".
If the first node dominating the chain is a VP, then we know the subject of the embedded IP is an np-trace, since an intervening CP node would have caused the previous clause to be applied. In this case the chain must be an A-chain (i.e. its type is "a"). This occurs in subject raising contexts, with verbs like seems, and is handled by the second clause.

(92) bind_traces(L,c(_),Tree,OldCons,NewCons) :-
         member(chain(Type,ID,CC,Case,Theta,List),OldCons),
         var(Type),!,
         get_head(Tree,head(_,empty,_)),
         Type=pro,
         bind_traces(L,c(_),Tree,OldCons,NewCons),!.
     bind_traces(L,v(_),Tree,OldCons,NewCons) :-
         member(chain(Type,ID,CC,Case,Theta,List),OldCons),
         var(Type),!,
         Type=a,set_last(trace(np),List),
         bind_traces(L,v(_),Tree,OldCons,NewCons),!.

The third clause, shown in (93), handles the closure of chains. This occurs when the antecedent has been reached, and added to the chain. The first goal indicates the possible relationships which may hold between an antecedent and its chain. That is, if the antecedent is in an A-position, so must be the current head of the chain (thus forbidding movement from an Ā to an A position). If the antecedent is in an Ā-position, then any type of chain is possible (i.e. "a", "abar", or "pro"). The second and third goals simply remove the antecedent and chain from the constraint list, while the fourth sets the tail ec of the chain to wh-trace, if it was a pro-chain. Finally we insert the completed chain, or c_chain, into the constraint list.

(93) bind_traces(L,C,Tree,OldCons,NewCns) :-
         member((At/Ct),[(a/a),(abar/a),(abar/abar),(abar/pro)]),
         remove(ant(At,CC,ID),OldCons,Cns1),
         remove(chain(Ct,IDC,CC,Case,Th,List),Cns1,Cns2),
         (Ct=pro,set_last(trace(wh),List) ; true),
         subjacent(ID,IDC,Tree,Cns2,Cns3),
         agree_case(ID,Tree,Case,Th),
         append([c_chain(At,ID,CC,Case,Th,[ant(At,CC,ID)|List])],Cns3,NewCns),!.

The following clauses handle the creation of new chains.
The first creates a chain of unknown type for the subject of an infinitival IP, as mentioned above. The second may bind an intermediate trace to an existing, Subjacent chain via the bind_to_chain predicate. Alternatively, it will initiate a chain for a trace in a θ-marked position. The make_chain routine is relatively straightforward, and the reader is referred to Appendix D for a listing of the Prolog code.

(94) bind_traces(L,i(tns(-)),Tree,OldCons,NewCons) :-
         exists_ec(OldCons,Tree,ID,Type) ->
             make_chain(Tree,ec(Cat,ID,Type),Unknown,OldCons,Cons1),
             bind_traces(L,i(tns(-)),Tree,Cons1,NewCons).
     bind_traces(L,C,Tree,OldCons,NewCons) :-
         member(ec(Cat,ID,trace(T)),OldCons),!,
         (bind_to_chain(ec(Cat,ID,trace(T)),Tree,OldCons,Cons1);
          make_chain(Tree,ec(Cat,ID,trace(T)),a,OldCons,Cons1)),
         bind_traces(L,C,Tree,Cons1,NewCons).
     bind_traces(L,C,Tree,Cons,Cons) :- !.

The bind_to_chain predicate, shown in (95), appends intermediate traces (i.e. those traces in SPEC,CP or SPEC,IP, which are non-θ-marked positions). The routine simply removes the old chain, verifies that the intermediate trace is Subjacent, and inserts the new chain, with the new trace, into the constraint list. The code is as follows:

(95) bind_to_chain(ec(C,ID,trace(T)),Tree,OldCons,NewCns) :-
         remove(ec(C,ID,Type),OldCons,Cns1),
         member(ChT,[a,abar,pro]),
         remove(chain(ChT,IDC,C,Cse,Th,List),Cns1,Cns2),
         subjacent(ID,IDC,Tree,Cns2,Cns3),
         agree_case(ID,Tree,Cse,Th),
         append([chain(ChT,ID,C,Cse,Th,[ec(C,ID,trace(T))|List])],Cns3,NewCns),!.

The interesting point here is the call to agree_case, which ensures that any chain is assigned Case and a θ-role exactly once. This captures half of the θ-criterion, and part of the Case filter. The Prolog to enforce these constraints is as follows:

(96) agree_case(ID,Tree,Case,Theta) :-
         get_subtree(ID,Tree,xmax(C,ID,F,Cons)/R),!,
         (member(case(IDC),F),Case1=case(IDC);true),
         (member(theta(IDT),F),Theta1=theta(IDT);true),!,
         Case=Case1,Theta=Theta1,!.
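The chain bookkeeping enforced across (92)-(96) can be illustrated with a small worked example. The Python sketch below is hypothetical (the system is Prolog, and these function names are invented): a chain is initialised at the base trace, extended through Subjacent intermediate positions, and closed at the antecedent, at which point the Case filter and θ-criterion are checked by requiring exactly one Case position and one θ position.

```python
# Hypothetical sketch of bottom-up chain recovery; the counts stand in
# for the agree_case bookkeeping described in (96).

def init_chain(trace_id, has_case, has_theta):
    """Start a chain at a base trace, recording its Case/theta status."""
    return {"positions": [trace_id],
            "case": int(has_case), "theta": int(has_theta)}

def extend_chain(chain, position):
    """Prepend a Subjacent intermediate trace (a non-theta position)."""
    chain["positions"].insert(0, position)
    return chain

def close_chain(chain, antecedent):
    """Add the antecedent, then check the Case filter and theta-criterion:
    a well-formed chain has exactly one Case and one theta position."""
    chain["positions"].insert(0, antecedent)
    if chain["case"] != 1 or chain["theta"] != 1:
        raise ValueError("ill-formed chain")
    return chain

# "Who did you see t?": the wh-trace is Case- and theta-marked in situ,
# an intermediate trace sits in SPEC,CP, and "who" closes the chain.
chain = close_chain(extend_chain(init_chain("t1", True, True), "spec_cp"),
                    "who")
```

The resulting list of positions, antecedent first and base trace last, corresponds to the Chain argument of the c_chain request in (81f).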
In the above discussion, we have mentioned pro-chains. These arise from the ECP condition which states that an ec is a trace if it is "locally Ā-bound". This notion is not locally determinable, since we don't know whether the ec is locally bound until an antecedent has been discovered. This possibility arises in the case of the subject position of an infinitival IP, where an ec may be either a wh-trace or PRO. In these cases, a chain of type "pro" is created, indicating that if an antecedent is found the ec is a trace for it; otherwise it is PRO. These chains have the lowest priority. That is, if two chains are competing for a single antecedent, the non-pro-chain wins.[10] The following code simply removes any pro-chain, if no Subjacent antecedent could be found, and sets the tail (i.e. the base ec) to "pro":

(97) check_pro_chains(L,C,Tree,Cons1,Cons4) :-
         remove(chain(pro,ID,CC,Case,Theta,List),Cons1,Cons2),
         remove(barrier(ID),Cons2,Cons3),
         set_last(pro,List),
         check_pro_chains(L,C,Tree,Cons3,Cons4),!.
     check_pro_chains(L,C,Tree,Cons,Cons) :- !.

The barriers predicate determines if the current maximal projection is a barrier. Recall that Chomsky's notion of barrier is a relative one, in which γ is a barrier for α under certain conditions [Chomsky 86a]. In the present system however, we have adopted the definition proposed by Lasnik and Saito, in which all non-L-marked maximal projections are taken to be barriers for movement [Lasnik et al. 88b].

[10] In fact, while this strategy works reasonably well, there do exist certain constructions which pose potential problems. Specifically, these are known as parasitic gaps (for discussion see [Taraldsen 81], [Engdahl 83], [Chomsky 82]).
The code is represented as follows:

(98) barriers(L,C,Tree,OldCons,NewCons) :-
         is_barrier(Tree),
         member(chain(Type,ID,CC,Case,Theta,List),OldCons),
         \+ member(barrier(ID),OldCons),!,
         append([barrier(ID)],OldCons,Cons1),
         barriers(L,C,Tree,Cons1,NewCons).
     barriers(L,C,Tree,Cons,Cons) :- !.

     is_barrier(xmax(C,ID,Features,Cons)/R) :-
         \+ member(l_marked,Features),!.

If the node is a barrier, that is, it is not L-marked, then a barrier(ID) constraint is added to the constraint list for each chain (where ID identifies the specific chain).

Chapter 5

Principle-Based Translation

In this chapter we present the implementation of the translation component of the system. Specifically, this component translates S-structures of a "source" language into S-structures of a "target" language. To accomplish this, we capitalise on the notion of D-structure as a language-independent syntactic representation, as it is intended by the linguistic theory. In so doing, the translation component avoids the problems which arise in dealing with seemingly idiosyncratic surface phenomena which may exist between the source and target languages. The translation component consists of three modules. The first recovers D-structure from the S-structure representation produced by the syntactic analysis component. The second translates the D-structure of the source language into that of the target language, and the third generates the S-structure representation for the target language. This model of translation is basically identical to that employed by [Sharp 85] and [Dorr 87] for translation between English and Spanish, and is illustrated in Figure 5.1. It is important to note that the present system performs only syntactic translation. That is, it assumes lexical items in each language have an essentially injective mapping. The system does not attempt to achieve the breadth of coverage of existing machine translation (MT) systems (see [Slocum 85] for a survey of existing systems).
Rather, its purpose is to show how a principle-based approach can greatly simplify the task, by performing translation at D-structure.

[Figure 5.1: The Translation Model. The source S-structure is parsed to a source D-structure, which is translated into a target D-structure, from which the target S-structure is generated.]

Each of the modules is called in sequence by the translate predicate, specified as follows:

(99) translate(SourceTree,TargetTree) :-
         write('Source Surface Structure:'),nl,
         pretty(SourceTree),nl,
         gen_deep_structure(SourceTree,SourceDeep),
         trans_lexical(SourceDeep,TargetDeep,Mode),
         gen_surf_structure(Mode,TargetDeep,TargetTree).

The remainder of this chapter will discuss the implementation of each module.

5.1 Recovering D-structure

The recovery of D-structure from S-structure is a relatively straightforward task. Essentially it involves "undoing", or reversing, the move-α transformations which are reflected by S-structure. Note, there is no need to employ any constraints at this stage, since the chains have already been verified as well-formed during syntactic analysis (e.g. the Case filter and θ-criterion have been satisfied). The predicate gen_deep_structure is therefore rather trivial, calling only one important goal, rev_move_alpha. Both are specified below:

(100) gen_deep_structure(SourceTree,TargetTree) :-
          rev_move_alpha(SourceTree,TargetTree),!,
          write('Source Deep Structure:'),nl,
          pretty(TargetTree),nl.

      rev_move_alpha(S_structure,D_structure) :-
          S_structure = xmax(C,ID,F,Cons)/_,
          make_trace_lists(1,Cons,Lists),
          rev_move_np(S_structure,D_structure1,Lists),
          rev_move_head(D_structure1,D_structure),!.

The rev_move_alpha predicate divides the reverse transformations into two categories; the first are moved X̄ constituents, namely NP's and wh-phrases, and the second are moved heads, such as verbs raising to I, and inversion to C.
These reverse transformations are performed by rev_move_np and rev_move_head respectively (these are not order dependent). The process of moving X̄'s to their D-structure positions is relatively straightforward. It essentially involves "collapsing" the chains which are recovered during syntactic analysis. Consider the specification of rev_move_np shown below:

(101) rev_move_np(D_str,D_str,[]) :- !.
      rev_move_np(S_str,D_str,[Chain|Rest]) :-
          extract_from_chain(Chain,SurfID,BaseID),
          collapse_chain(S_str,SurfID,D_str1,BaseID),
          rev_move_np(D_str1,D_str,Rest),!.

The first clause halts when the list of chains, the third argument, has been exhausted. The main clause calls extract_from_chain with the first chain in the list. This predicate simply returns the surface position of the moved constituent (i.e. the head of the chain) in SurfID, and the base position, or tail, in BaseID. These two positions, along with the S-structure representation, are passed to collapse_chain, which returns a new representation, D_str1, in which the constituent has been returned to its base position. The predicate is then called recursively until all chains have been collapsed.

In the present system, head-movement is not recovered through the use of chains, since movement is always to the next highest X° position (possibly successively). As a result, undoing head movement is also relatively straightforward. The task here is to first find a moved head, and then return it to its base position. This is performed by rev_move_head, which is written as follows:

(102) rev_move_head(OldDstr,NewDstr) :-
          find_moved_head(OldDstr,HD,OldDstr1),!,
          move_hd_base(HD,OldDstr1,NewDstr).
      rev_move_head(Dstr,Dstr).

The first goal, find_moved_head, takes an existing tree representation for the clause (OldDstr), and looks for situations in which a head does not match the category of the phrase.
If it finds such an occurrence, it returns the head as HD, and sets the head of the phrase to empty. The second goal then searches for the D-structure position of the head and sets it to be HD. In fact, the only case this must deal with is inversion of V or I to C, since the parser accounts for raising of V to I automatically.[1]

5.2 Translation

The translation module, as stated earlier, is highly simplified and performs only syntactic translation. The highest level, shown in (103), simply calls trans_lex to translate the lexical items of the source language into those of the target language.

(103) trans_lexical(SourceDeep,TargetDeep,Mode) :-
          trans_lex(SourceDeep,TargetDeep,Mode),
          write('Target Deep Structure:'),nl,
          pretty(TargetDeep),nl.

      trans_lex(SourceDeep,TargetDeep,Mode) :-
          SourceDeep = xmax(C,ID,F,Cons)/R,!,
          SourceDeep2 = xmax(C,ID,F,_)/R,
          trans_lex1(SourceDeep2,TargetDeep,Mode),
          target(L),!,
          rev_constraints(L,C,TargetDeep).
      trans_lex(SourceDeep,TargetDeep,Mode) :-
          trans_lex1(SourceDeep,TargetDeep,Mode).

The trans_lex predicate simply recurses through the D-structure of the source language, calling trans_lex1 at each maximal projection. The trans_lex1 predicate (see Appendix E) translates each lexical item of the phrase, and in turn calls trans_lex for each "embedded" maximal projection. The result is essentially a top-down, left-to-right translation of the sentence. During translation, the Mode variable also becomes instantiated. This indicates whether the sentence is interrogative ("ques") or declarative ("decl"), and is used for S-structure generation, discussed in the next section.

[1] That is, the X̄-parser will parse a V which has raised to I in its D-structure (V) position. This is relatively straightforward since these positions are generally string adjacent in both English and German; however, a more principled approach using chains would doubtless be preferable.
In addition to translation, the trans_lex predicate also calls rev_constraints at each maximal projection. The purpose of this is to determine which constituents of the target D-structure are going to have to move during the generation of S-structure. The result is the construction of chains which will be "unfolded" during generation.[2] The predicate is shown below:

(104) rev_constraints(L,C,Tree) :-
          get_constraints(Tree,Constraints),
          subjacency(L,C,Tree,Constraints,NewConstraints),
          rev_add_constraints(Tree,NewConstraints).

[2] This should probably be performed in the generation component, but was included here since the D-structure is being traversed anyway, for translation, and the application of the constraint module does not interfere.

This predicate essentially applies "in reverse" the constraint module used for parsing.[3] Note that while the translation is performed top-down, the application of constraints is done bottom-up, as the call appears at the end of the clause. In a complete system, the application of principles would begin from scratch, applying Case theory to determine those constituents which must move, and ensuring that none of the principles are violated during the transformation phase. Here however, we take advantage of the similar Case-marking properties of English and German. That is, if a constituent is returned to a non-Case-marked position in the recovery of English D-structure, then that position is not Case-marked in the German D-structure (and vice versa). In general, structural similarities between English and German make verification of the many principles non-critical. Therefore, for our purposes we need only apply the Subjacency module. This has been implemented such that it can be applied in reverse. That is, by instantiating the D-structure argument, it will construct the chains which are to be reflected in the corresponding S-structure.
The get_constraints predicate is as described in Chapter 4, and simply merges the constraint lists of immediately dominated maximal projections. The rev_add_constraints predicate is the reverse analog of the add_constraints predicate used in parsing. That is, it adds requests to the constraint list of each X̄ for consideration by the constraint module. The relevant code is shown below:

(105) rev_add_constraints(xmax(C,ID,F,NewCons)/R,Cons) :-
          rev_new_constraints(C,xmax(C,ID,F,Cons)/R,Cons1),
          append(Cons,Cons1,NewCons).

      rev_new_constraints(_,xmax(C,ID,_,_)/e_cat(Type),[Cons]) :- !,
          ( (Type = ant , Cons = ant(abar,_,ID))
          ; (Type = comp , Cons = ec(_,ID,trace(comp)))
          ; (Type = np , Cons = ant(a,_,ID)) ).
      rev_new_constraints(_,xmax(C,ID,_,_)/e_cat(_,Type),Cons) :- !,
          ( (Type = trace(comp) , Cons = [ec(_,ID,trace(comp))])
          ; Cons = [] ).
      rev_new_constraints(_,xmax(C,ID,Ftr,_)/R,[ec(C,ID,trace(wh))]) :-
          member(wh(+),Ftr),!.
      rev_new_constraints(n(_),xmax(C,ID,Ftr,_)/R,[ec(C,ID,trace(np))]) :-
          \+ member(case(_),Ftr),!.
      rev_new_constraints(_,Tree,[]).

The purpose of this predicate is basically to determine possible intermediate and destination positions for moved constituents. When chains are collapsed in recovering D-structure, the vacated S-structure positions (i.e. SPEC,CP and SPEC,IP positions) are filled by appropriate empty phrase markers. These are interpreted by the second clause of (105). Specifically, an empty subject position may be a destination for the head of an A-chain, a matrix SPEC,CP may be a landing site for a wh-phrase, and an embedded SPEC,CP may be an intermediate position (since indirect questions are not handled).

[3] In fact, if we want to remain true to the model of transformational grammar, the syntactic analysis applies the principles in reverse, while generation applies them in the intended manner.
The fourth and fifth clauses find non-Case-marked NP's and wh-phrases, which add the ec request to the constraint list. This has the effect of requesting that these positions be vacated.

5.3 Generating S-structures

In this section, we present the model for generating S-structures from D-structures in the target language. This involves performing all the necessary transformations, and ensuring that agreement is reflected by the morphology of relevant elements. The generation module is controlled by the following predicate, gen_surf_structure:

(106) gen_surf_structure(Mode,TargetDeep,TargetSurface) :-
          target(L),set_inversion(L,Mode,Inv),
          move_alpha(TargetDeep,TargetSurface1),
          raising(L,Inv,TargetSurface1,TargetSurface2),
          gen_pf(TargetSurface2,TargetSurface3),
          inversion(Inv,TargetSurface3,TargetSurface4),
          topicalize(L,Mode,TargetSurface4,TargetSurface),
          write('Target Surface Structure:'),nl,
          pretty(TargetSurface).

The clause is passed the D-structure representation, TargetDeep, and the mode of the sentence, Mode. When completed, the variable TargetSurface will be instantiated with the S-structure representation of the sentence in the target language. The first call simply instantiates L as the target language (either "english" or "german"), while the second sets the inversion flag, Inv, based on L and Mode using the parameters below:[4]

(107) set_inversion(ger,_,yes).
      set_inversion(eng,ques,yes).
      set_inversion(eng,decl,no).

These parameters indicate that inversion always takes place in German (thus accounting for "verb-second" phenomena), and inversion is performed in English questions. The next operation is to apply the move_alpha predicate, which uses the chains constructed by translate to move the various constituents. The Prolog is shown below:

(108) move_alpha(Deep,Surface) :-
          Deep = xmax(C,ID,F,Cons)/_,
          make_trace_lists(1,Cons,Lists),
          move_np(Surface,Deep,Lists).
The first two goals examine the constraint list and extract relevant information from the chains. The third goal, move_np, is responsible for actually moving the constituents, and is specified as follows:

(109)
move_np(D_str,D_str,[]) :- !.
move_np(S_str,D_str,[Chain|Rest]) :-
    extract_from_chain(Chain,SurfID,BaseID),
    collapse_chain(D_str1,SurfID,D_str,BaseID),
    move_np(S_str,D_str1,Rest),!.

This predicate works much as the rev_move_np predicate described in the previous section. That is, it determines the surface and base position for the constituent of each chain and then calls collapse_chain. Here, however, the D-structure argument of collapse_chain is instantiated, and the S-structure representation is generated.

[4] In fact, there exist more principled accounts of the parametric variation which determines inversion. For further discussion see [Davis 87].

The next task is to perform verb raising. As noted in Section 5.1, the analysis performed here is somewhat tailored to English and German. We take raising to be an obligatory transformation in which the "highest" verb of a clause moves to the head of the dominating IP, just in case I is empty. In this way, the first verbal element is inflected by the AGR features present in the I node. A difference does exist between English and German, however, in that only English auxiliaries (e.g. have and be) may raise, if inversion to C is to take place, while all German verbs raise, regardless of inversion.[5] As a result, this routine must be aware of whether or not inversion is to take place, since for English, "do-support" may be required in cases where no auxiliary is present. The main predicate for verb raising is as follows:

[5] For a more principled account of raising, see [Koopman 84], in which the ECP is used to account for these phenomena.
(110)
raising(Lang,Inv,DeepTree,SurfTree) :-
    DeepTree = xmax(i(Tns),ID,Ftr,Cons)/[Left,Right],!,
    raising(Lang,no,Left,NewL),raising(Lang,no,Right,NewR),
    DeepTree1 = xmax(i(Tns),ID,Ftr,Cons)/[NewL,NewR],
    check_subject(Left,Inv,NewInv),
    raise1(Lang,NewInv,ID,DeepTree1,SurfTree).
raising(Lang,Inv,X/[L,R],X/[NewL,NewR]) :- !,
    raising(Lang,Inv,L,NewL),raising(Lang,Inv,R,NewR).
raising(Lang,Inv,X/R,X/NewR) :- !,raising(Lang,Inv,R,NewR).
raising(Lang,Inv,X,X) :- !.

Basically, this traverses the tree until the matrix IP clause is found. Then raising is applied to the subtrees of the clause with the inversion argument set to "no", since inversion may only occur in the matrix clause. The subject of the IP is then checked to make sure it is not empty. This may revise the inversion flag, since an empty subject blocks the inversion transformation. Then the raise1 predicate is called, which is shown below:

(111)
raise1(Lang,Inv,ID,DeepInfl,SurfInfl) :-
    get_head(DeepInfl,head(i(_),empty,_)),!,
    raise_verb(Lang,Inv,ID,DeepInfl,DeepInfl1,Verb),
    set_head(Verb,ID,DeepInfl1,SurfInfl).
raise1(ger,Inv,ID,DeepInfl,SurfInfl) :-
    get_head(DeepInfl,head(i(tns(-)),zu,_)),!,
    raise_verb(ger,Inv,ID,DeepInfl,DeepInfl1,Verb),
    DeepInfl1 = Xmax/R,
    SurfInfl = Xmax/[xmax(i(tns(-)))/R,Verb].
raise1(L,Inv,ID,Infl,Infl) :- !.

The first clause checks to see if I is empty. If so, it calls raise_verb, which retrieves the head of the subcategorized VP (and sets its head to empty). The set_head routine is then called to instantiate the head of the IP with the verb. The second clause is not really a raising case, but rather handles the adjunction of the highest V to the right of I in German infinitival clauses. Again, raise_verb is called to find the verb, and then the adjunction is performed. The final case will trivially succeed if I is lexical.
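The effect of the first clause can be pictured with a toy sketch, given here in Python rather than the system's Prolog, and with an invented tree encoding; it illustrates the transformation itself, not the actual implementation:

```python
def raise_highest_verb(ip):
    """Move the highest verb into an empty I head, so that the verb is
    inflected by the AGR features present in the I node."""
    if ip["I"] is None:                  # I is empty: raising is obligatory
        ip = dict(ip)                    # copy, leaving the input untouched
        vp = dict(ip["VP"])
        ip["I"], vp["V"] = vp["V"], None # verb raises, vacating its VP slot
        ip["VP"] = vp
    return ip

clause = {"I": None, "VP": {"V": "read", "NP": "the book"}}
print(raise_highest_verb(clause))
# → {'I': 'read', 'VP': {'V': None, 'NP': 'the book'}}
```

When I is already lexical (e.g. a modal), the function returns the tree unchanged, mirroring the trivially succeeding final case.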
At this point, it is convenient to apply the agreement routines to ensure that person, number, gender, Case and tense are realised appropriately for the various phrases and lexical items. This is performed by the gen_pf routine, specified as follows:

(112)
gen_pf(SourceDeep,TargetDeep) :-
    SourceDeep = xmax(C,ID,F,Cons)/R,
    member(C,[n(_),c(emb/rel)]),!,
    target(L),
    rev_agree(L,SourceDeep,TargetDeep1),
    gen_pf1(TargetDeep1,TargetDeep).
gen_pf(SourceDeep,TargetDeep) :-
    SourceDeep = xmax(C,ID,F,Cons)/R,!,
    gen_pf1(SourceDeep,TargetDeep1),
    target(L),!,
    rev_agree(L,TargetDeep1,TargetDeep).
gen_pf(SourceDeep,TargetDeep) :-
    gen_pf1(SourceDeep,TargetDeep).

This predicate traverses the tree, applying rev_agree at each maximal projection. This task is performed in a basically bottom-up fashion by the second clause. The first clause, however, applies rev_agree first in the case of NPs and relative clause CPs, since the operator of the CP must agree with the dominating NP, and this may further be used to inflect the embedded clause. The rev_agree predicate simply calls the agree routine which is used during syntactic analysis. In this case, however, the D-structure, uninflected representation is instantiated, and the inflected S-structure is generated.

Once the various agreement inflections have been performed, the inversion transformation is applied, using the inversion predicate specified below:

(113)
inversion(yes,DeepTree,SurfTree) :- !,
    find_infl(DeepTree,SurfTree1,Infl),
    SurfTree1 = xmax(_,ID,_,_)/_,
    set_head(Infl,ID,SurfTree1,SurfTree).
inversion(no,Tree,Tree) :- !.

The two clauses handle the case where the inversion flag is set to "yes" or "no" respectively. In the latter case, the predicate trivially succeeds. In the case where inversion is to take place, the find_infl routine first locates the matrix IP and retrieves its head (setting it to empty).
Then the head of the matrix CP is instantiated by the I (or possibly raised V) element.

The final transformation is that for topicalisation in German. This is not performed by move_alpha, since it is done independently of the standard wh- or Case-motivated transformations. Specifically, if the SPEC,CP of the matrix CP is empty, then some constituent must move to that position. The topicalize routine is shown below:

(114)
topicalize(ger,decl,DeepTree,SurfTree) :- !,
    get_topic(DeepTree,Tree,Topic),
    attach_topic(Topic,Tree,SurfTree).
topicalize(_,_,Tree,Tree) :- !.

The first goal examines the current tree and returns an appropriate phrase to topicalise. In this system, the highest phrase dominated by the matrix IP is selected (i.e. the subject), although this routine could be altered to use a more sophisticated selection strategy depending on emphasis and possibly syntactic restrictions. The final goal simply attaches the phrase to the topic, SPEC,CP, position.

Chapter 6

Evaluation and Discussion

The preceding chapters have presented a system for natural language analysis and translation which is based upon the principles of transformational generative grammar. Specifically, we have presented the linguistic framework of Government-Binding theory, a system of representations, and an implementation based upon the linguistic principles. The discussion thus far has been primarily descriptive, focusing on the exposition of the theory, representations, and implementation. This chapter is devoted to an evaluation and discussion of the present system.

We begin with a discussion of some general issues central to the construction of principle-based systems. We will then turn to an evaluation of each component of the system with respect to our general design criteria, and compare ours with other principle-based systems. Finally, we discuss certain related issues which may prove relevant in future systems.
6.1 Principle-Based Systems

Thus far we have appealed largely to intuition in our use of the term "principle-based" system. We have presented a linguistic theory which is founded upon the notion of a set of language-independent principles, in which the grammar for a specific language is characterised by instantiating parameters of variation. This model has been proposed as a theory of linguistic competence. That is, it attempts to capture humans' innate knowledge of language.

In designing a principle-based parser, the basic goal is to construct a system which determines syntactic well-formedness through the application of the principles of grammar. The linguistic theory is largely declarative in nature. That is, it consists of principles which act as conditions on representations. From a parsing perspective this presents a problem: how are these "representations" to be constructed in the first place? In solving this problem it is necessary to assign some procedural interpretation to the declarative principles of the theory. That is, the principles must not act only as conditions on representations, but must also contribute to the construction of those representations. The degree to which principles of grammar are used procedurally or declaratively can be a significant factor in determining the organisation and efficiency of the system.

The present system uses the principles of X-bar theory and θ-theory, combined with lexical selection, to construct phrase structure representations. Principles are applied to partially constructed structures to determine their "local" well-formedness. This organisation dictates the existence of certain extra machinery, namely the constraint lists, to allow for the incremental application of the principles. This contrasts with Sharp's system, in which the principles of grammar are stated purely as conditions on possible S-structure representations.
The advantages of the more "procedural" approach taken here, and in Dorr's system, are reflected by the improved efficiency.

An interesting alternative to using X-bar theory to drive parsing is the licensing approach taken by Abney and Cole [Abney et al. 86]. Their parser, implemented within the Actors computational paradigm, capitalises on the licensing relationships which exist between lexical elements and constituents. Specifically, Case and θ theory are used directly to determine well-formedness and construct representations.

In the present work we are not concerned simply with the development of a principle-based parser, but rather with the development of a parser with cross-linguistic application. In designing such a system, the necessity of maintaining the modularity and autonomy of the various subtheories becomes readily apparent. That is, parsers where the principles of grammar are somehow embedded directly in the architecture, such as the Marcus parser [Marcus 80], seem unlikely candidates for cross-linguistic systems. Rather, it seems that a modular approach, reflecting the linguistic model, should be employed.

The ultimate metric with which to evaluate the system is that of linguistic fidelity. That is, how faithful is the system to the linguistic theory? The ideal system should reproduce precisely the same grammaticality judgements as the linguistic theory. The closer the parser design is to the linguistic model, the more likely it is to be correct.

6.2 The Lexicon

The shift within linguistic theory from systems of rules to systems of principles has led to increased emphasis on the lexicon. That is, properties of lexical items are projected from the lexicon and interact with the principles of grammar in determining well-formed utterances. The most notable examples of this are the subcategorization and θ-marking properties of lexical items.
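As a rough illustration of what such lexical information amounts to, consider the following sketch (a hypothetical Python encoding, invented for this example; the system's actual lexicon is a Prolog database with a different format):

```python
# Hypothetical lexical entries: each verb lists its possible
# subcategorization frames and whether it assigns an external θ-role
# to the subject position.
LEXICON = {
    "ask":   {"subcat": [["np"], ["cp"]], "external_role": True},
    "sleep": {"subcat": [[]],             "external_role": True},
}

def selects(verb, complements):
    """Check a sequence of complement categories against the verb's frames."""
    return complements in LEXICON[verb]["subcat"]

print(selects("ask", ["np"]))    # True:  'I asked the time'
print(selects("ask", ["cp"]))    # True:  'I asked what time it was'
print(selects("sleep", ["np"]))  # False: sleep takes no complement
```

Even this toy version shows the redundancy discussed below: ask must list two frames for what is arguably a single semantic selection.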
As we have seen, lexical items may select for certain constituents so as to ensure that semantic roles are appropriately filled. Furthermore, the possible categories for a selected constituent are generally constrained in some way. In the present system, θ-marking properties are not expressed in detail. That is, constituents are not assigned explicit θ-roles such as Agent, Patient or Theme. Rather, they are simply marked as receiving "some" θ-role. Subcategorization frames are specified, indicating the categories of the selected complements, and by the Projection Principle they are marked as θ-positions. Additionally, a lexical item (typically a verb) may indicate that it assigns an external θ-role to the subject position.[1]

[1] In the present system we assume all subjects are NPs, thus excluding the possibility of sentential subjects. Extending the system to include sentential subjects would not, however, present any problem.

While this approach to θ-marking is sufficient for enforcing the θ-criterion and the Projection Principle, there are clearly some problems which arise. In the first place, it may be necessary to specify a number of subcategorization frames, indicating different phrasal categories for the same selected θ-role. An example taken from [Chomsky 86b] illustrates how the verb ask semantically selects (or s-selects) for a Proposition complement, but
2  For purposes of translation it seems almost certain that a richer ^-marking system will prove necessary. Especially in more diverse languages where similar #-roles assigned by corresponding lexical items may be assigned to different structural positions. While the present system simply matches structural positions, a more complete system should match #-roles. Consider for example the following: (116)  (a) (b)  Er gefallt mir. I like him.  In this example, the Agent-role is assigned to the object position by gefallen and the subject by like, while the Patient-role is assigned to the subject and object position of these two verbs respectively . 3  6.3  Syntactic Analysis  The syntactic analysis component accepts an input sentence and recovers an annotated S-structure representation. That is, it recovers both phrase structure and chains. In the present system, this component is comprised of two central modules: the Parsing Module 2  3  There are instances where a verb s-selects for a Proposition which may only be a clause, such as wonder. That is, J wondered what time it was is grammatical, but / wondered the time is not. Case theory provides a reasonable explanation by suggesting that wonder but not ask is intransitive, resulting in a violation of the Case Filter in the latter sentence [Pesetsky 83]. A literal translation might equate gefallen with please, in which case He pleased me would have similar structural ^-marking properties and thus present no problem.  CHAPTER  6.  EVALUATION AND DISCUSSION  76  and the Constraint Module. The Parsing Module uses X-theory and the Extended Projection Principle to generate possible phrase structures and predict empty categories. During parsing it projects subcategorization information from the lexicon, and accesses language specific information such as head position for various categories, adjunct possibilities, and possible destinations for moved constituents. 
The Constraint Module is invoked as each maximal projection is completed. It incorporates Case theory, the ECP, θ theory, and Subjacency, and performs the dual task of constructing chains and verifying the well-formedness of S-structures.

6.3.1 The Parsing Module

The Parsing Module employs a DCG specification of the X-bar metarules to drive syntactic analysis. The result is a top-down, left-to-right, backtracking parser, based on Prolog's underlying theorem prover. The decision to use logic grammars was based primarily on their convenience and perspicuity. That is, X-bar rules are specified directly, with logical variables being instantiated with categorial and other phrase-specific information during parsing. In addition, the top-down parsing strategy facilitates the relatively straightforward "prediction" of empty elements.[4]

This parsing strategy does, however, have certain drawbacks. Specifically, a top-down approach leads to certain fundamental problems in a model where information relevant to parsing is projected from lexical items. The most obvious example of this is subcategorization information. In the present system, when a head is encountered its subcategorization frame is accessed and used to parse its arguments. The advantage is the prediction of traces in argument positions (i.e. if the argument isn't present, it must have moved and left a trace). A disadvantage is that a lookahead mechanism is required in the case of head-final phrases, such that the head's subcategorization frame can be accessed to parse the pre-arguments.

The obvious solution to this is to use a bottom-up strategy where phrases are projected

[4] This is to say that no extra machinery is required, as is typically the case with bottom-up approaches.
In such a system, some reduction procedure would be called upon to attach any arguments once the verb is encountered. Note, this doesn't entail the exclusion of logic grammars, but simply requires that some alternate parsing strategy be employed . 5  A number of existing systems employ bottom-up parsing strategies. Berwick and Weinberg present a modified version of Marcus' deterministic LR(k) parser [Marcus 80], [Berwick et al. 84]. These parsers use finitely bounded lookahead to compute their derivations without backtracking. The fundamental disadvantage of this approach is that the control table is language specific, in effect pre-computing the possible surface phenomena instead of directly consulting the principles of grammar as structures are computed. The cross-linguistic capability of their approach seems questionable, since the tables must be completely re-calculated for a different language . As Barton suggests, one approach might 6  be to introduce principles and parameters to the Marcus framework, thus reducing the tables of rules [Barton 84]. In fact, Kuhns' system aims at doing precisely this in his Prolog implementation of a principle-based parser which augments a Marcus parser with Binding and Control theory, and chains which are used to enforce the ^-criterion [Kuhns 86]. Another interesting approach is the assertion set parser presented in [Barton et al. 85] and [Berwick etal. 85]. This system uses phrase markers to parse input with "information monotonicity", and without lookahead. Unfortunately, not enough is known about this technique to determine its cross-linguistic abilities, especially in head final languages. Finally, we might pursue the use of more traditional, non-deterministic, parsing strategies. Dorr for example uses an Earley algorithm to parse a slight expanded X rule template [Dorr 87]. 
In her system, the parser is co-routined with the principles so as to block the derivation of ungrammatical structures, in a manner roughly similar to the system developed here. Pereira gives an interesting account of how a shift-reduce parser may be combined with an oracle for the purpose of resolving attachment preferences when shift/reduce conflicts occur [Pereira 85]. This could provide an interesting method of incorporating various performance/processing principles directly into the parser.[7]

[5] For further discussion of alternative parsing strategies for logic grammars see [Pereira et al. 87], [Abramson et al. 88].

[6] Indeed, it is not entirely clear that such a parser can handle a head-final language such as German, using reasonably bounded lookahead.

6.3.2 The Constraint Module

The Constraint Module operates in tandem with the Parsing Module. Its purpose is to apply the principles of grammar to the partially constructed S-structures generated by the parser. These principles of grammar are used to construct chains of movement and to verify the well-formedness of (sub-)structures with respect to their phrase structure and chains.

An attempt has been made in the current system to maintain the modularity of the subtheories of grammar. To some extent this has been accomplished; however, the Subjacency constraint has amalgamated the tasks of constructing chains and verifying their well-formedness by incorporating elements of Case and θ theory. The design of future systems may benefit from a more modular approach, which could prove more efficient, perspicuous, and easily modifiable.

While a relatively high degree of language independence has been achieved, some "short cuts" have been taken where English and German show similar characteristics. The most notable example of this is the treatment of head movement, which is largely stipulated in the present system.
More principled accounts of this phenomenon in terms of chains do exist within the theory, and should be reflected by future systems (see [Koopman 84]).

The present system applies constraints to completed phrases in a bottom-up fashion. This approach is especially convenient when combined with a bottom-up parsing strategy, as in Dorr's system [Dorr 87]. It is conceivable, however, that the constraints could be formulated as conditions on left contexts. This is especially relevant to Prolog-based systems in which Horn clause theorem provers may be used to parse a logic grammar (as in the present system). As Stabler shows, it is possible to begin with a first-order logic specification of the linguistic constraints and rewrite them as Horn clauses, assuming negation as failure [Stabler 87]. Specifically, Stabler introduces a simplified set of linguistic constraints, intended to constrain the possible derivations of an underspecified DCG grammar. While the original formulation of the constraints assumes the existence of an entire proof/parse tree, Stabler shows that a series of program transformations can be invoked to produce constraints that can be applied at any point during the derivation. The constraints are formulated so as to use a "specialised" left-context derivation tree, resulting in a much more efficient parser.

Admittedly, the feasibility of Stabler's approach remains to be determined. Currently, it has only been applied to extremely simplified constraints, with much of the burden still being placed on a phrase structure grammar. Furthermore, the process of transforming the original linguistic constraints into their specialised left-context form has only been partially automated.

[7] Pereira demonstrates his approach using a traditional phrase structure grammar. The suitability of his approach to an X-bar system remains to be determined.
The goal, however, of creating a system which can transform some first-order specification of the linguistic constraints into an efficient Horn clause theorem prover for parsing is an attractive one.

6.4 Translation and Generation

In the present system, the translation and generation components have been somewhat simplified. The translation component maps D-structures from the source language to the target language by directly translating lexical items. Additionally, some structural changes are made to account for the head-final position of V and I in German. Indeed, these configurational differences, combined with the possible thematic/structural divergences observed in Section 6.2, might lead us to question the suitability of D-structure as an "interlingual" representation for purposes of translation. As a solution to this, Dorr has suggested the use of a lexical conceptual structure (LCS) which represents sentence meaning through "predicate decomposition" [Dorr 88]. This would entail the development of a system to map the D-structures of a given language to and from an LCS representation. Such an approach certainly seems necessary in the development of translation systems for more diverse languages.

The generation component constructs S-structures from D-structures. That is, it applies the rule Move-α, subject to the constraints imposed by the principles of grammar and their parameters. The present system takes advantage of certain structural similarities between English and German in performing the generation. Specifically, it assumes similar Case-marking properties and landing sites for movement. The result is that only a single S-structure is generated, and it corresponds quite closely to that of the original source language sentence. An alternative, and likely preferable, approach would be to generate possible S-structures "from scratch".
That is, begin with a bare D-structure and apply each of the principles, so as to force and constrain possible movement. This strategy was employed in Sharp's system, which will generate a set of possible surface structures which vary with respect to moved constituents and inflection of embedded clauses [Sharp 85].

As mentioned earlier, this approach to translation does not displace a knowledge-based approach, but rather complements it. By performing translation at D-structure, or some other interlingual form, the task of dealing with idiosyncratic surface phenomena is sidestepped. This contrasts with surface-to-surface approaches, such as that employed by McCord [McCord 86]. The use of a surface form as the basis for translation entails that the transfer component perform complex restructuring of the surface structure in addition to translation. The result of McCord's approach is a relatively large set of language-dependent rules, which only apply uni-directionally.

6.5 Related Issues

In addition to the points made above, there remain a number of areas open for possible improvement and extension of the system. In this section we will discuss two of these. The first concerns the possibility of improving the performance of the system through compilation techniques, while the second discusses the possibility of incorporating principles of human language performance.

6.5.1 Partial Evaluation

The system as presented here uses the principles of grammar in an on-line fashion. That is, they are consulted during the parsing process. Furthermore, the principles of grammar access their respective parameter values in a similar on-line manner. This approach has a number of advantages in terms of maintaining the modular, autonomous nature of the principles and their cross-linguistic applicability. The efficiency, however, is hindered by the amount of computation which must be performed at parse time.
An obvious solution to this is to investigate possible techniques for "pre-computing" or compiling certain aspects or components of the parser. Within a logic programming framework, techniques of partial evaluation are of particular interest.

Partial evaluation is the automatic derivation of a specialised instance of a program.[8] Consider, for example, a function which computes x^y, with x, y as parameters. This function could be partially evaluated with respect to y = 3 to produce the specialised cubic function x^3 [Ershov 82]. The partial evaluation of logic programs is facilitated by the ease with which meta-interpreters can be developed to evaluate, and possibly resolve, Prolog clauses with respect to given input values. Additionally, a user must specify control information to limit the extent to which the evaluation is performed.[9]

The possibility of partially evaluating the principles of the constraint module with respect to language-specific parameters is both computationally and theoretically attractive. That is, it could be viewed as a core grammar generator, computing a specialised, efficient, language-specific constraint module. If the control information can be specified independently of a set of language parameters, then an entirely automatic partial evaluator can be constructed for this task. The only major disadvantage is that a constraint module is necessary for each language in the system (although certain non-evaluated components might

[8] For a detailed discussion of partial evaluation and Prolog see [Pereira et al. 87]. For a more theoretical discussion of the soundness and completeness of partial evaluation in Prolog see [Lloyd et al. 87].

[9] That is, if uncontrolled, a partial evaluator may attempt to expand or "compile-out" the original program beyond the point of any benefit.
A similar partial evaluation technique could also be used with the parsing module to construct a (limited) set of phrase structure rules, based on the X metarules, language specific information about possible adjuncts and their positions, and the constraints. This process basically derives possible surface configurations ahead of time, to expedite the parsing of common structures using pre-compiled phrase structure rules. This would in some sense constitute return to the traditional phrase structure approach. Note, however, that the rules are constructed automatically, based on the principles of grammar, and the number of rules can be limited as much as desired - the constraint module will still be used to verify well-formedness at parse time. 6.5.2  M o d e l i n g Linguistic Performance  The present system consists of a procedural model of UG applied to syntactic analysis, translation, and generation. That is, the system models the principles of grammar, a theory of linguistic competence. A further area of interest is the construction of systems which model human linguistic performance. Specifically, these systems must reflect the basic principles of human language processing, and ideally possess similar organisation and employ similar algorithms. Frazier, for example, draws on some intuitions supported by evidence from Dutch (a head final language similar to German) to suggest that principle-based systems should precompile the principles of X, Case and 6 theory to produce a limited set of phrase structure rules [Frazier 86]. The partial evaluation process discussed above would seem particularly relevant to such a theory of performance. Alternatively, Pritchett suggests that the principles of grammatical theory be employed directly in a theory of language processing [Pritchett 88]. Specifically, he proposes a 6Attachment principle and a 0-Reanalysis constraint, based on the ^-criterion, which accurately predict human processing of garden path sentences. 
Ultimately, he extends his theory to incorporate the entire theory of grammar by formulating the following principle:

(117) Σ-Attachment: Every principle of the syntax attempts to be satisfied at every point during processing.

Such a principle is compatible with the on-line, incremental application of principles performed by the present system and that of Dorr's. In fact, the above two theories may not be entirely incompatible, since some degree of pre-compilation would not necessarily conflict with Pritchett's Σ-Attachment principle. A more sophisticated discussion of these and other theories of performance is beyond the scope of this thesis. We simply wish to make the point here that reconciling models of competence with those of performance appears to be both an interesting and promising area for future research.[10]

[10] For further discussion see [Berwick 87].

Chapter 7

Conclusions

This thesis has presented a system for syntactic analysis and translation which is based upon the current theories of transformational generative grammar. Specifically, we have developed a system which employs the principles of Chomsky's Government-Binding theory in parsing and determining the well-formedness of sentences. As such, the system can be considered a procedural model of Universal Grammar, which accesses language-specific information in analysing sentences of a particular language. Furthermore, we have shown that the cross-linguistic nature of the system lends itself particularly well to the task of language translation.

The principle-based approach to natural language analysis represents a significant departure from traditional, construction-based systems. Specifically, the embodiment of universal principles of grammar drastically reduces the necessity of specifying language-specific rules. Rather, the grammar for an individual language can be stated as a compact set of parameters and language-specific information.
The current system accounts for a significant subset of German and English grammars. Specifically, a large set of substitution transformations are handled, including subject raising and A-movement. The system will handle a variety of constructions, including embedded sentences, relative clauses, and adjunct PP's. Notable omissions of the system include passive and subjunctive forms, adjectival phrases, and adjunction transformations. Finally, the system has excluded the theories of Binding and Control and the PF and LF levels of representation, which would be necessary for a complete implementation of the linguistic model. The translation and generation components of the present system have been simplified in favour of the syntactic analysis component. The incremental application of the principles during parsing represents a significant improvement in overall efficiency relative to Sharp's system. Specifically, the use of constraint lists provides a convenient method for constructing chains and enforcing constraints. The modular nature of the system suggests that extending the coverage of the system, and introducing new languages, should be relatively straightforward. Principle-based approaches to the design of natural language systems are becoming increasingly popular. Most of these systems, however, are suited to the analysis of an individual language. Specifically, only two multi-lingual systems have been previously constructed, namely those of Sharp and Dorr [Sharp 85], [Dorr 87], both of which handle English and Spanish. The present work has presented a system for the analysis of English and German which constitutes further evidence that the development of a parser with cross-linguistic application is indeed feasible. In addition, we have shown that modularity and incremental constraint application are fundamental to the efficient design of such systems.
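The incremental constraint application argued for above can be caricatured in a minimal Python sketch. Both "principles" here are toy stand-ins for the GB modules, not the thesis's actual Case or θ theory implementations:

```python
# Incremental constraint application: rather than checking well-formedness
# only on a completed parse, every principle is run as soon as each
# maximal projection is built, so doomed analyses are pruned early.

def case_filter(phrase):
    # toy version: a lexical NP must carry Case
    return phrase["cat"] != "NP" or phrase.get("case") is not None

def theta_criterion(phrase):
    # toy stand-in: every argument of the phrase must be theta-marked
    return all(arg.get("theta") for arg in phrase.get("args", []))

PRINCIPLES = [case_filter, theta_criterion]

def build_phrase(phrase):
    """Accept the phrase only if every principle holds right now,
    forcing an immediate re-parse (here: an exception) otherwise."""
    if all(p(phrase) for p in PRINCIPLES):
        return phrase
    raise ValueError("ill-formed: constraint violated during parsing")
```

The payoff is the one the thesis reports against Sharp's system: violations surface at the node where they arise, not after the whole sentence has been analysed.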
The construction of principle-based systems is relevant to a number of disciplines. As a model of the human language faculty, such systems are of direct interest in linguistics and cognitive science. Indeed, sophisticated systems may prove to be valuable testbeds for revisions or extensions of the theory. Additionally, principle-based approaches provide efficient, elegant, and robust techniques for natural language analysis, translation and generation within the fields of computational linguistics and artificial intelligence.

References

[Abney et al. 86] Steven Abney and Jennifer Cole. A Government-Binding Parser. Unpublished manuscript, MIT, 1986.

[Abramson et al. 88] Harvey Abramson, Matthew Crocker, Brian Ross, and Doug Westcott. Towards a Logic Based Expert System for Compiler Development. In PLILP'88 Proceedings, Institut National de Recherche en Informatique et en Automatique, Orleans, France, 1988.

[Barton 84] G. Edward Barton. Toward a Principle-Based Parser. AI Memo 788, MIT AI Laboratory, Cambridge, Massachusetts, 1984.

[Barton et al. 85] G. Edward Barton and Robert C. Berwick. Parsing with Assertion Sets and Information Monotonicity. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 769-771, IJCAI, Los Angeles, 1985.

[Berwick 87] Robert C. Berwick. Principle-Based Parsing. Technical Report 972, MIT AI Laboratory, Cambridge, Massachusetts, June 1987.

[Berwick et al. 84] Robert C. Berwick and Amy S. Weinberg. The Grammatical Basis of Linguistic Performance. Current Studies in Linguistics, The MIT Press, Cambridge, Massachusetts, 1984.

[Berwick et al. 85] Robert C. Berwick and Amy S. Weinberg. Deterministic Parsing and Linguistic Explanation. AI Memo 836, MIT AI Laboratory, Cambridge, Massachusetts, June 1985.

[Chomsky 57] Noam Chomsky. Syntactic Structures. Mouton, The Hague, 1957.

[Chomsky 65] Noam Chomsky. Aspects of the Theory of Syntax. MIT Press, Cambridge, Massachusetts, 1965.
[Chomsky 73] Noam Chomsky. Conditions on Transformations. In S. R. Anderson and P. Kiparsky, editors, A Festschrift for Morris Halle, Holt, Rinehart and Winston, New York, 1973.

[Chomsky 81a] Noam Chomsky. Lectures on Government and Binding. Foris Publications, Dordrecht, 1981.

[Chomsky 81b] Noam Chomsky. Principles and Parameters in Syntactic Theory. In Norbert Hornstein and David Lightfoot, editors, Explanation in Linguistics, chapter 2, pages 32-75, Longman, London, 1981.

[Chomsky 82] Noam Chomsky. Some Concepts and Consequences of the Theory of Government and Binding. Linguistic Inquiry Monograph Six, The MIT Press, Cambridge, Massachusetts, 1982.

[Chomsky 86a] Noam Chomsky. Barriers. Linguistic Inquiry Monograph Thirteen, The MIT Press, Cambridge, Massachusetts, 1986.

[Chomsky 86b] Noam Chomsky. Knowledge of Language: Its Nature, Origin and Use. Convergence Series, Praeger, New York, 1986.

[Clocksin et al. 81] W.F. Clocksin and C.S. Mellish. Programming in Prolog. Springer Verlag, 2nd edition, 1981.

[Colmerauer 78] Alain Colmerauer. Metamorphosis Grammars. In L. Bolc, editor, Lecture Notes in Computer Science, Springer Verlag, 1978.

[Dahl et al. 86] Veronica Dahl, Charles Brown, Michel Boyer, T. Pattabhiraman, and Diane Massam. Mechanizing Expertise in GB Theory. LCCR TR 86-10, LCCR, Simon Fraser University, Burnaby, B.C., Canada, 1986.

[Davis 87] Henry Davis. The Acquisition of the English Auxiliary System and its Relation to Linguistic Theory. PhD thesis, University of British Columbia, Vancouver, Canada, 1987.

[Dorr 87] Bonnie Dorr. UNITRAN: A Principle-Based Approach to Machine Translation. Master's thesis, MIT, Cambridge, Massachusetts, 1987.

[Dorr 88] Bonnie Dorr. A Lexical Conceptual Approach to Generation for Machine Translation. AI Memo 1015, MIT, Cambridge, Massachusetts, December 1988.

[Emonds 76] Joseph Emonds. A Transformational Approach to English Syntax. Academic Press, New York, 1976.
[Engdahl 83] Elisabet Engdahl. Parasitic Gaps. Linguistics and Philosophy, 6(1):5-34, 1983.

[Ershov 82] A. P. Ershov. Mixed Computation: Potential Applications and Problems for Study. Theoretical Computer Science, 18(1):41-67, 1982.

[Frazier 86] Lyn Frazier. Natural Classes in Language Processing. Unpublished manuscript, MIT, 1986.

[Haider et al. 85] Hubert Haider and Martin Prinzhorn, editors. Verb Second Phenomena in Germanic Languages. Publications in Language Sciences, Foris, Dordrecht, 1985.

[Hogger 84] Christopher J. Hogger. Introduction to Logic Programming. Volume 21 of APIC Studies in Data Processing, Academic Press, London, 1984.

[Hornstein et al. 81] Norbert Hornstein and David Lightfoot. Introduction. In Norbert Hornstein and David Lightfoot, editors, Explanation in Linguistics, chapter 1, pages 9-31, Longman, London, 1981.

[Huang 82] C.-T. James Huang. Logical Relations in Chinese and the Theory of Grammar. PhD thesis, MIT, Cambridge, Massachusetts, 1982.

[Kimball 73] John Kimball. Seven Principles of Surface Structure Parsing in Natural Language. Cognition, 2(1):15-47, 1973.

[Koopman 84] Hilda Koopman. The Syntax of Verbs. Foris, Dordrecht, 1984.

[Kuhns 86] Robert J. Kuhns. A PROLOG Implementation of Government-Binding Theory. In 11th International Conference on Computational Linguistics, pages 546-550, The International Committee on Computational Linguistics, Bonn, West Germany, August 1986.

[Lasnik et al. 88a] Howard Lasnik and Mamoru Saito. Forthcoming, 1988.

[Lasnik et al. 88b] Howard Lasnik and Juan Uriagereka. A Course in GB Syntax: Lectures on Binding and Empty Categories. Current Studies in Linguistics, MIT Press, Cambridge, Massachusetts, 1988.

[Lightfoot 82] David Lightfoot. The Language Lottery: Towards a Biology of Grammars. MIT Press, Cambridge, Massachusetts, 1982.

[Lloyd 87] John W. Lloyd. Foundations of Logic Programming. Springer-Verlag, second edition, 1987.

[Lloyd et al. 87] John W.
Lloyd and Joseph C. Shepherdson. Partial Evaluation in Logic Programming. Technical Report CS-87-09, University of Bristol, December 1987.

[Marantz 81] A. Marantz. On the Nature of Grammatical Relations. PhD thesis, MIT, Cambridge, Massachusetts, 1981.

[Marcus 80] Mitchell P. Marcus. A Theory of Syntactic Recognition for Natural Language. The MIT Press Series in Artificial Intelligence, The MIT Press, Cambridge, Massachusetts, 1980.

[Massam 85] Dianne Massam. Case Theory and the Projection Principle. PhD thesis, MIT, Cambridge, Massachusetts, 1985.

[McCord 86] Michael C. McCord. Design of a Prolog-Based Machine Translation System. In Ehud Shapiro, editor, Third International Conference on Logic Programming, pages 350-374, London, U.K., July 1986.

[Pereira 85] Fernando C. N. Pereira. A New Characterization of Attachment Preferences. In David R. Dowty, Lauri Karttunen, and Arnold M. Zwicky, editors, Natural Language Parsing, chapter 9, pages 307-319, Cambridge University Press, Cambridge, England, 1985.

[Pereira et al. 80] Fernando C.N. Pereira and D.H.D. Warren. Definite Clause Grammars for Language Analysis. Artificial Intelligence, 13:231-278, 1980.

[Pereira et al. 87] Fernando C.N. Pereira and Stuart M. Shieber. Prolog and Natural-Language Analysis. CSLI Lecture Notes, Center for the Study of Language and Information, Stanford, California, 1987.

[Pesetsky 83] D. Pesetsky. Paths and Categories. PhD thesis, MIT, Cambridge, Massachusetts, 1983.

[Pritchett 88] Brad Pritchett. Garden Path Phenomena and the Grammatical Basis of Language Processing. Language, September 1988.

[Pullum et al. 88] Geoffrey K. Pullum and Paul M. Postal. Expletive noun phrases in subcategorized positions. Linguistic Inquiry, 19(4), 1988.

[Rizzi 82] Luigi Rizzi. Issues in Italian Syntax. Foris, Dordrecht, 1982.

[Ross 67] John R. Ross. Constraints on Variables in Syntax. PhD thesis, MIT, Cambridge, Massachusetts, 1967.

[Sharp 85] Randall M.
Sharp. A Model of Grammar Based on Principles of Government and Binding. Master's thesis, University of British Columbia, Vancouver, Canada, October 1985.

[Slocum 85] Jonathan Slocum. A Survey of Machine Translation: Its History, Current Status, and Future Prospects. Computational Linguistics, 11(1):1-17, 1985.

[Stabler 87] Edward P. Stabler. Logic Formulations of Government-Binding Principles for Automatic Theorem Provers. Cognitive Science Memo 30, University of Western Ontario, London, Ontario, June 1987.

[Sterling et al. 86] Leon Sterling and Ehud Shapiro. The Art of Prolog. The MIT Press Series in Logic Programming, The MIT Press, Cambridge, Massachusetts, 1986.

[Stowell 81] Timothy Stowell. Origins of Phrase Structure. PhD thesis, MIT, Cambridge, Massachusetts, 1981.

[Taraldsen 81] K.T. Taraldsen. The Theoretical Interpretation of a Class of Marked Extractions. In A. Belletti, L. Brandi, and L. Rizzi, editors, Theory of Markedness in Generative Grammar, Proceedings of the 1979 GLOW Conference, pages 475-516, Scuola Normale Superiore, Pisa, 1981.

[Thiersch 78] Craig Thiersch. Topics in German Syntax. PhD thesis, MIT, Cambridge, Massachusetts, 1978.

[vRiemsdijk et al. 86] Henk van Riemsdijk and Edwin Williams. Introduction to the Theory of Grammar. Current Studies in Linguistics, The MIT Press, Cambridge, Massachusetts, 1986.

[Wexler et al. 80] Ken Wexler and Peter Culicover. Formal Principles of Language Acquisition. MIT Press, Cambridge, Massachusetts, 1980.

[Williams 74] Edwin Williams. Rule Ordering and Syntax. PhD thesis, MIT, Cambridge, Massachusetts, 1974.

Appendix A

Example Translations

This appendix illustrates the execution of the system on a number of example sentences from both English and German. The system was written in Quintus Prolog V2.2, under BSD UNIX 4.2 on a Sun 3 at the University of British Columbia. A complete listing of the Prolog source is presented in the following appendices.
The source language is specified using the "language" command, followed by the language abbreviation (e.g. "language ger"). The target language is set automatically. Once the sentence is parsed, the S-structure form is displayed, with the various traces (i.e. np-#, wh-#, op-# and comp-#) shown coindexed with their antecedent. Additionally, PRO is shown. The D-structure of the source sentence is then displayed with the chains "collapsed" and head-movement undone. Finally, once translation has been performed, the D-structure and S-structure representations in the target language are printed. Note, capitalisation and accents are not handled by the present system.

(1) >>>language eng.
>>>I have seen the book.
Source Surface Structure: i have see the book.
Source Deep Structure: i have see the book.
Target Deep Structure: ich das buch sehen haben.
Target Surface Structure: ich habe das buch gesehen.

(2) >>>Have the boys seen the girl?
Source Surface Structure: have the boy see the girl?
Source Deep Structure: the boy have see the girl?
Target Deep Structure: das junge das madchen sehen haben?
Target Surface Structure: haben die jungen das madchen gesehen?

(3) >>>What did the girl see on the table?
Source Surface Structure: what-1 do the girl see wh-1 on the table?
Source Deep Structure: the girl see what-1 on the table?
Target Deep Structure: das madchen was-1 sehen auf das tisch?
Target Surface Structure: was-1 sah das madchen wh-1 auf den tisch?

(4) >>>What will the woman put on the table?
Source Surface Structure: what-1 will the woman put wh-1 on the table?
Source Deep Structure: the woman will put what-1 on the table?
Target Deep Structure: das frau was-1 auf das tisch legen werden?
Target Surface Structure: was-1 wird die frau wh-1 auf den tisch legen?

(5) >>>Have you seen the books that the girls put on the table?
Source Surface Structure: have you see the book op-1 that the girl put wh-1 on the table?
Source Deep Structure: you have see the book that the girl put op-1 on the table?
Target Deep Structure: sie das buch das madchen das-1 auf das tisch legen sehen haben?
Target Surface Structure: haben sie die buchen die-1 das madchen wh-1 auf den tisch legte gesehen?

(6) >>>I believe that the girl tried to put the book on the table.
Source Surface Structure: i believe that the girl try PRO to put the book on the table.
Source Deep Structure: i believe that the girl try PRO to put the book on the table.
Target Deep Structure: ich dass das madchen dass PRO das buch auf das tisch legen zu versuchen glauben.
Target Surface Structure: ich glaube dass das madchen PRO das buch auf den tisch zu legen versuchte.

(7) >>>What do you believe the boy has put on the table?
Source Surface Structure: what-1 do you believe comp-1 the boy have put wh-1 on the table?
Source Deep Structure: you believe comp-1 the boy have put what-1 on the table?
Target Deep Structure: sie comp-1 dass das junge was-1 auf das tisch legen haben glauben?
Target Surface Structure: was-1 glauben sie comp-1 dass der junge wh-1 auf den tisch gelegt hat?

(8) >>>language ger.
>>>Ich habe das Buch gesehen.
Source Surface Structure: ich-1 haben wh-1 das buch sehen.
Source Deep Structure: ich-1 das buch sehen haben.
Target Deep Structure: i have see the book.
Target Surface Structure: i have seen the book.

(9) >>>Haben Sie die Frau gesehen?
Source Surface Structure: haben sie das frau sehen?
Source Deep Structure: sie das frau sehen haben?
Target Deep Structure: you have see the woman?
Target Surface Structure: have you seen the woman?

(10) >>>Was hat der Junge auf den Tisch gelegt?
Source Surface Structure: was-1 haben das junge wh-1 auf das tisch legen?
Source Deep Structure: das junge was-1 auf das tisch legen haben?
Target Deep Structure: the boy have put what-1 on the table?
Target Surface Structure: what-1 has the boy put wh-1 on the table?

(11) >>>Ich glaube dass der Junge die Frau gesehen hat.
Source Surface Structure: ich-1 glauben wh-1 dass das junge das frau sehen haben.
Source Deep Structure: ich-1 dass das junge das frau sehen haben glauben.
Target Deep Structure: i believe that the boy have see the woman.
Target Surface Structure: i believe that the boy has seen the woman.

(12) >>>Was glauben Sie dass der Junge auf den Tisch gelegt hat?
Source Surface Structure: was-1 glauben sie comp-1 dass das junge wh-1 auf das tisch legen haben?
Source Deep Structure: sie comp-1 dass das junge was-1 auf das tisch legen haben glauben?
Target Deep Structure: you believe comp-1 that the boy have put what-1 on the table?
Target Surface Structure: what-1 do you believe comp-1 that the boy has put wh-1 on the table?

Appendix B

Parsing Module

% Section: Parser
%
% Paradigm: parse(Language,Source,Tree).
%   Language: Source language {english,german}.
%   Source: Input sentence to be parsed.
%   Tree: The parse derivation tree.
%
% Description: This predicate controls parsing of the input
% string. It first computes the lookahead stack of final
% heads (bupdcg), and then calls the DCG rules which parse
% the input according to the X-bar phrase structure rules.
%

parse(L,String,Tree) :-
    bupdcg(L,String,XbarList,S),
    xmax(L,c(mat/sent),Tree,S,[],XbarList,[]).

%
% bupdcg/4: The predicate constructs a lookahead stack of all final heads.
% This stack is used during parsing to recover the subcategorization
% frames of final heads. The second clause also "undoes" the adjunction
% of a verb to the right of "zu".

bupdcg(L,[],[],[]) :- !.
bupdcg(ger,[zu|Rest],[HD|XRest],[Top,HD|Stack]) :-
    head_anal(ger,Cat,zu,HD,final),
    bupdcg(ger,Rest,XRest,[Top|Stack]).
bupdcg(L,[Word|Rest],[HD|XRest],[HD|Stack]) :-
    head_anal(L,Cat,Word,HD,final),
    bupdcg(L,Rest,XRest,Stack).
bupdcg(L,[Word|Rest],[Word|XRest],NS) :-
    head_anal(L,C,Word,HD,initial),
    bupdcg(L,Rest,XRest,NS),
    poss_move(L,Cat,C,_).
bupdcg(L,[LexItem|Rest],[LexItem|XRest],Stack) :-
    bupdcg(L,Rest,XRest,Stack).

%
% head_anal/5: Call the morpher to determine if the word is in the lexicon.
% Determine if the word is a phrasal head, what its category is,
% its subcategorization frame, and its position (final/initial).

head_anal(L,Cat,Word,head(Cat,Word,As,Features),Pos) :-
    morph(L,Word,Cat,Root,Features),
    is_head(Cat),
    getargs(L,head(Cat,Word,As,Features)),
    head_position(L,Cat,Pos).

is_head(C) :- member(C,[n(_),v(_),p(_),c(_),i(_)]).

%
% The DCG rules to parse according to X-bar theory.
%
% xmax(L,C,Tree,S,NS): parses the Specifier and Adjuncts of a phrase, and
% also verifies the features of Tree match those subcategorized for.

xmax(L,C,XTree,S,NS) -->
    spec(L,C,TXmax,STree,S,S1),
    xmax2(L,C,TXmax,S1,NS),
    {agree(L,STree,BTree),gen_ID(BTree),
     constraints(L,C,BTree,XTree)}.

xmax_e(L,C,Tree) -->
    [],
    {empty_cat(L,C),
     Tree1 = xmax(C,ID,[],[ec(C,ID,T)])/e_cat(C,T),
     agree(L,Tree1,Tree),gen_ID(Tree)}.

xmax2(L,C,Tree,S,NS) -->
    adjunct(L,pre,C,TApost,Tree,S,S1),
    xbar(L,C,TXbar,S1,S2),
    adjunct(L,post,C,TXbar,TApost,S2,NS).

%
% xbar(L,C,Tree,S,NS): parses the Arguments and Head of a phrase, depending
% on the order (head initial/final parameter). In the case of a final
% head, 'stack' is called to examine the lookahead list and retrieve
% the subcategorization information.

xbar(L,C,Tree,S,NS) -->
    {head_position(L,C,initial),!},
    xmin(L,C,Args,HD,S,S1),
    {stack(L,initial,HD,S1,S2)},
    arg(L,post,C,Args,HD,[],Tree,S2,NS).

xbar(L,C,Tree,S,NS) -->
    {head_position(L,C,final),
     HD=X/head(C,_,_),
     stack(_,final,head(C,W,_,F),S,R),!,
     getargs(L,head(C,W,As,F))},
    arg(L,pre,C,As,HD,[],Tree,R,S1),
    xmin(L,C,As,HD,S1,NS).
xbar(L,C,Tree,S,NS) -->
    {head_position(L,C,final),!,
     poss_empty(L,C,As),
     HD=X/head(C,empty,F)},
    arg(L,pre,C,As,HD,[],Tree,S,NS1),
    xmin(L,C,As,HD,NS1,NS).

%
% stack(L,Pos,Head,OldStack,NewStack): Maintains the lookahead of final
% heads. This is used to retrieve their subcategorization frames.

stack(_,final,head(C,W,As,F),S,R) :-
    reverse(S,[head(C,W,As,F)|Rest]),
    reverse(Rest,R),!.
stack(eng,initial,xbar(c(_))/head(C,W,F),S,[head(C,W,As,F)|S]) :-
    \+ C = c(_),!.
stack(ger,initial,X/head(C,W,F),[head(C,W,As,F)|R],New) :-
    reverse(R,R1),reverse([head(C,W,As,F)|R1],New),!.
stack(_,initial,_,S,S) :- !.

%
% xmin(L,C,Args,Tree,S,NS): parses the head of a phrase either as a lexical
% head (which has possibly moved from its base generated position) or
% as an empty head (if the category permits the head to be absent/moved).

xmin(eng,C,Args,Tree,[HD|S],S) -->
    [],
    {HD = head(C,W,Args,F),
     poss_move(eng,c(mat/sent),C,_),
     head_anal(eng,C,W,HD,initial),
     getargs(eng,head(C,W,Args,F)),
     Tree = xbar(C)/head(C,empty),!}.
xmin(L,C,Args,xbar(C)/Tree,S,S) -->
    head(L,C,Args,Tree).

head(L,C,As,head(C,W,F)) -->
    [Word],
    {head_anal(L,C,Word,HD,initial),!,
     HD = head(C,W,_,F),
     getargs(L,head(C,W,As,F))}.
head(L,C,As,head(Cat,W,F)) -->
    [Word],
    {head_anal(L,Cat,Word,HD,initial),
     HD = head(Cat,W,Args,F),
     poss_move(L,C,Cat,As),!}.
head(L,C,As,head(C,W,F)) -->
    [head(C,W,_,F)],
    {!,getargs(L,head(C,W,As,F))}.
head(L,C,As,head(Cat,W,F)) -->
    [head(Cat,W,Args,F)],
    {poss_move(L,C,Cat,As),!}.
head(L,C,As,head(C,empty,F)) -->
    [],
    {poss_empty(L,C,As),!}.

%
% The following are the general rules for Adjunct and Specifier productions.
% These refer to the language specific choices for possible adjuncts
% and specifiers.

spec(L,C,TX,xmax(C,ID,[],Cons)/[TSpec,TX],S,NS) -->
    spec(L,C,TSpec,S,NS).
spec(L,C,X/R,xmax(C,ID,[],Cons)/R,S,S) -->
    [],{no_spec(L,CList),
        member(C,CList)}.

no_spec(L,[n(_),v(_),p(_),c(mat/sent)]) :- !.

adjunct(ger,post,C,TX,xmax(C)/[TX,Tadj],S,S) -->
    {C = i(tns(-)),!},
    [head(v(T),W,F)],
    {Tadj = head(v(T),W,F)}.
adjunct(L,Pos,C,X/Tree,xmax(C)/Tree,S,S) -->
    [].
adjunct(L,pre,C,Tree,xmax(C)/[Tadj,Tree],S,NS) -->
    adj(L,pre,C,Tadj,S,NS).
adjunct(L,post,C,Tree,xmax(C)/[Tree,Tadj],S,NS) -->
    adj(L,post,C,Tadj,S,NS).

%
% topic: parses a topic in the SPEC, CP position of a German root clause.

topic(L,c(mat/sent),Tree,S,NS) -->
    {empty_cat(L,C)},
    spec(L,C,TXmax,XTree,S,S1),
    xmax2(L,C,TXmax,S1,NS),
    {add_feature([ant(+),case(_)],XTree,Tr),
     agree(L,Tr,BTree),gen_ID(BTree),
     constraints(L,C,BTree,Tree)}.

%
% wh_phrase: parses wh-phrases in the SPEC, CP position of root and relative
% clauses.

wh_phrase(L,c(mat/_),Tree,S,NS) -->
    {empty_cat(L,C)},
    wh_xmax(L,Cat,C,Tree,S,NS).
wh_phrase(L,c(emb/T),Tree,S,S) -->
    [],{gensym(e,ID)},
    {set_ec(L,T,ID,Tree)}.

wh_xmax(L,Cat,C,XTree,S,NS) -->
    spec(L,C,TXmax,STree,S,S1),
    xbar(L,C,TXmax,S1,NS),
    {wh_agree(L,Cat,STree,BTree),
     gen_ID(BTree),
     constraints(L,C,BTree,XTree)}.

set_ec(_,sent,ID,
    xmax(C,ID,[agree(_,_,_,_)],[ec(C,ID,trace(comp))])/e_cat(C,trace(comp))).
set_ec(eng,rel,ID,
    xmax(n(_),ID,[agree(_,_,_,_),wh(+)],[ant(abar,n(_),ID)])/e_cat(n(_),operator)).

%
% arg: uses the subcategorization frame of a head to parse its arguments. If
% the argument is not present, a trace is inserted. 'a_xmax' parses
% a lexical argument, 'a_xmax_e' parses a trace.

arg(L,Pos,C,[],HD,ALst,T,S,S) -->
    [],{build_tree(Pos,C,HD,ALst,T)}.
arg(L,Pos,C,As,HD,ALst,T,S,NS) -->
    {select(L,C,As,A,NewAs)},
    a_xmax(L,C,A,Tx,S,S1),
    arg(L,Pos,C,NewAs,HD,[Tx|ALst],T,S1,NS).
arg(L,Pos,C,[A|Args],HD,ALst,T,S,NS) -->
    a_xmax_e(L,C,A,Tx),
    arg(L,Pos,C,Args,HD,[Tx|ALst],T,S,NS).
arg(L,Pos,C,[{A}|Args],HD,ALst,T,S,NS) -->
    arg(L,Pos,C,Args,HD,ALst,T,S,NS).

a_xmax(L,Cat,C,XTree,S,NS) -->
    spec(L,C,TXmax,STree1,S,S1),
    xmax2(L,C,TXmax,S1,NS),
    {agree(L,STree1,BTree1),gen_ID(BTree1),
     BTree1 = xmax(_,ID,_,_)/Rest,
     l_mark(Cat,BTree1,BTree2),
     t_mark(Cat,ID,BTree2,BTree),
     constraints(L,C,BTree,XTree)}.

a_xmax_e(L,Cat,C,Tree) -->
    xmax_e(L,C,Tree1),
    {l_mark(Cat,Tree1,Tree2),
     t_mark(Cat,ID,Tree2,Tree)}.

%
% poss_move(Language,Cat1,Cat2,Args): A head of Cat2 can move from its base
% generated position to the head of Cat1. The default arguments of
% Cat1 are Args.

poss_move(L,C,C,_).
poss_move(L,c(mat/sent),i(tns(+)),[i(tns(+))]).
poss_move(L,c(mat/sent),v(nil/tns(+)),[i(tns(+))]).

%
% poss_empty(Language,Cat,Args): The head of Cat can be empty due to
% absence/movement. The default arguments of Cat are Args.

poss_empty(L,i(tns(+)),[v(nil/tns(+))]).
poss_empty(L,c(emb/rel),[i(tns(+))]).
poss_empty(eng,c(emb/sent),[i(_)]).
poss_empty(ger,c(emb/sent),[i(tns(-))]).
poss_empty(eng,c(mat/sent),[i(tns(+))]).
poss_empty(ger,v(T),_).

%
% empty_cat(Language,Category): Category is a legal empty category for Language.

empty_cat(L,n(_)).
empty_cat(L,p(_)).

%
% build_tree/5: constructs an appropriate binary tree for the X-bar level
% of a phrase (e.g. the head and its arguments).

build_tree(post,C,HD,ArgTrees,xmax(C)/Pair) :-
    reverse(ArgTrees,ATs),make_pairs(post,C,HD,ATs,Pair),!.
build_tree(pre,C,HD,ArgTrees,xmax(C)/Pair) :-
    make_pairs(pre,C,HD,ArgTrees,Pair),!.

make_pairs(Pos,C,xbar(C)/HD,[],HD) :- !.
make_pairs(Pos,C,HD,[A|As],Pair) :-
    pair(Pos,C,HD,A,P),make_pairs(Pos,C,xbar(C)/P,As,Pair),!.

pair(post,C,HD,A,[HD,A]) :- !.
pair(pre,C,HD,A,[A,HD]) :- !.

%
% select(L,C,As,A,NewAs): determines if C has fixed/free order arguments for
% language L, and selects arguments from As appropriately. For
% the present system, we assume both English and German to be fixed
% order, for efficiency purposes.

select(L,C,As,A,NewAs) :-
    order(L,C,Order),!,
    select(Order,As,A,NewAs).
select(free,As,A,NewAs) :-
    append(L,[A|R],As),
    append(L,R,NewAs).
select(fixed,[A|NewAs],A,NewAs) :- !.

order(eng,_,fixed).
order(ger,_,fixed).

%
% l_mark: L-marks all subcategorized constituents (except IP by CP).

l_mark(c(_),Tree,Tree) :- !.
l_mark(_,Tree1,Tree2) :- !,
    add_feature(l_marked,Tree1,Tree2).

%
% t_mark: Theta marks all arguments, except CP & IP do not theta-mark
% their arguments.

t_mark(c(_),_,Tree,Tree) :- !.
t_mark(i(_),_,Tree,Tree) :- !.
t_mark(_,ID,Tree1,Tree2) :- !,
    add_feature(theta(ID),Tree1,Tree2).

%
% add_feature: adds a given feature, or list of features, to a phrasal node.

add_feature(Feature,Tree,Tree) :- var(Feature),!.
add_feature([F|Flist],xmax(C,ID,F1,Cons)/R,xmax(C,ID,F2,Cons)/R) :- !,
    append([F|Flist],F1,F2).
add_feature(Feature,xmax(C,ID,F1,Cons)/R,xmax(C,ID,F2,Cons)/R) :- !,
    append([Feature],F1,F2).

%
% agree(L,SurfTree,DeepTree): verify that the agreement relations hold for
% SurfTree, and construct a "canonical" (uninflected) DeepTree for
% purposes of translation. The feature list of DeepTree receives the
% 'agree(Tns,Per,Num/Gen,Case)' feature to keep track of inflection.

agree(Language,SurfTree,BaseTree) :-
    case_agree(SurfTree,Case),
    SurfTree = xmax(C,_,_,_)/_,
    agree1(Language,C,SurfTree,BaseTree1,Tns,Per,NumGen,Case,wh(-)/Proper),
    add_feature([Proper,agree(Tns,Per,NumGen,Case)],BaseTree1,BaseTree),!.

%
% wh_agree: verify agreement and wh-status of wh-phrases.

wh_agree(Language,Cat,SurfTree,BaseTree) :-
    SurfTree = xmax(C,_,_,_)/_,
    (Cat = c(emb/rel),Proper = proper(+),!,C=n(_) ; true),
    agree1(Language,C,SurfTree,BaseTree1,Tns,Per,NumGen,Case,wh(+)/Proper),
    add_feature([wh(+),ant(+),Proper],BaseTree1,BaseTree2),
    add_feature(agree(Tns,Per,NumGen,Case),BaseTree2,BaseTree),!.

%
% rev_agree: similar to 'agree', but the DeepTree is instantiated and the
% inflected surface tree is constructed (subject to agreement). Used
% for generation purposes.

rev_agree(L,Target,NewTarget) :-
    Target = xmax(C,ID,F,Cons)/_,
    case_agree(Target,Case),
    (member(agree(Tns,Per,NumGen,Case),F);true),!,
    agree1(L,C,NewTarget,Target,Tns,Per,NumGen,Case,Wh).

case_agree(xmax(n(Case),_,_,_)/R,NCase) :- !,Case=NCase.
case_agree(xmax(_,_,_,_)/R,Case) :- !.

%
% The following predicates are used by the three agree predicates above.
% agree1: Traverses the tree, calling agree2.
% agree2: Enforces agreement relations between constituents, and
%   calls 'agree_terminal' for lexical items.
% agree_terminal: Accesses morphological information to determine the
%   agreement features of lexical items.

agree1(L,C,X/[LS,RS],X/[LB,RB],Tns,Per,NumGen,Case,Wh) :-
    agree2(L,C,LS,LB,Tns,Per,NumGen,Case,Wh),
    agree2(L,C,RS,RB,Tns,Per,NumGen,Case,Wh).
agree1(L,C,X/RS,X/RB,Tns,Per,NumGen,Case,Wh) :-
    agree2(L,C,RS,RB,Tns,Per,NumGen,Case,Wh).

agree2(L,i(tns(+)),Xmax,Xmax,Tns,Per,NumGen,Case,Wh) :-
    Xmax = xmax(v(nil/tns(+)),_,F,_)/R,!,
    member(agree(Tns1,_,_,_),F),
    (Tns1 = Tns ; Tns1 = -),!.
agree2(L,i(_),Xmax,Xmax,Tns,Per,NumGen,Case,Wh) :-
    Xmax = xmax(n(nom),_,F,_)/R,!,
    member(agree(_,Per,NumGen,_),F).
agree2(L,c(emb/rel),Xmax,Xmax,Tns,Per,NumGen,Case,Wh) :-
    Xmax = xmax(n(_),_,F,_)/R,!,
    member(agree(_,Per,NumGen,_),F).
agree2(L,c(mat/sent),Xmax,Xmax,Tns,Per,NumGen,Case,Wh) :-
    Xmax = xmax(i(_),_,F,_)/R,!,
    member(agree(ITns,Per,NumGen,_),F),
    (ITns = Tns ; true),!.
agree2(L,n(_),Xmax,Xmax,Tns,Per,NumGen,Case,Wh) :-
    Xmax = xmax(c(emb/rel),_,F,_)/R,!,
    Wh = WH/proper(-),
    member(agree(_,_,NumGen,_),F).
agree2(L,C,Xmax,Xmax,Tns,Per,NumGen,Case,Wh) :-
    Xmax = xmax(_,_,_,_)/R.
agree2(L,C,X/[LS,RS],X/[LB,RB],Tns,Per,NumGen,Case,Wh) :-
    agree2(L,C,LS,LB,Tns,Per,NumGen,Case,Wh),
    agree2(L,C,RS,RB,Tns,Per,NumGen,Case,Wh).
agree2(L,C,X/RS,X/RB,Tns,Per,NumGen,Case,Wh) :-
    agree2(L,C,RS,RB,Tns,Per,NumGen,Case,Wh).
agree2(L,C,XS,XB,Tns,Per,NumGen,Case,Wh) :-
    agree_terminal(L,XS,XB,Cat,Features),
    extract_features(Cat,Features,Tns,Per,NumGen,Case,Wh).

agree_terminal(L,SurfTerm,BaseTerm,Cat,Features) :-
    member(Node,[spec,head,adj]),
    SurfTerm =.. [Node,Cat,empty,Features],
    BaseTerm =.. [Node,Cat,empty,Features],!.
agree_terminal(L,SurfTerm,BaseTerm,C,F) :-
    member(Node,[spec,head,adj]),
    (var(BaseTerm),SurfTerm=..[Node,C,Word,_],BaseTerm=..[Node,C,Root,F];
     var(SurfTerm),BaseTerm=..[Node,C,Root,_],SurfTerm=..[Node,C,Word,F]),!,
    morph(L,Word,C,Root,F),
    get_cat_features(L,C,F).
agree_terminal(L,SurfTerm,BaseTerm,nil,Features) :-
    (var(SurfTerm),terminal(BaseTerm);
     var(BaseTerm),terminal(SurfTerm)),!,
    SurfTerm = BaseTerm.

get_cat_features(L,v(Form),Features) :- !,
    v_features(L,Features,Form).
get_cat_features(L,_,_) :- !.

%
% extract_features: retrieve relevant features from the dictionary entry.

extract_features(d,Features,Tns,Per,Num/Gen,Case,wh(W)/proper(-)) :- !,
    ext_ftr(wh(W),Features),
    ext_ftr(num(Num,Gen,Case),Features).
extract_features(p(_),Features,Tns,Per,Num/Gen,Case,wh(W)/Proper) :- !,
    ext_ftr(wh(W),Features).
extract_features(n(_),Features,Tns,Per,Num/Gen,Case,wh(W)/proper(P)) :- !,
    ext_ftr(wh(N),Features),
    (wh(N)=wh(W);(wh(N)=wh(-),\+P=(-))),!,
    ext_ftr(per(Per),Features),
    ext_ftr(num(Num),Features),
    ext_ftr(gen(Gen),Features),
    ext_ftr(case(Case),Features),
    ext_ftr(proper(P),Features).
extract_features(v(_),Features,Tns,Per,Num/Gen,Case,Wh) :- !,
    ext_ftr(tns(Tns),Features),
    ext_ftr(per(Per),Features),
    ext_ftr(num(Num),Features).
extract_features(i(_),Features,Tns,Per,Num/Gen,Case,Wh) :- !,
    ext_ftr(tns(Tns),Features),
    ext_ftr(per(Per),Features),
    ext_ftr(num(Num),Features).
extract_features(_,_,Tns,Per,Num/Gen,Case,Wh) :- !.

ext_ftr(Ftr,Features) :-
    member(ftr(Flist),Features),!,
    member(Ftr,Flist).
104  Appendix C  Language Parameters  % Section: Phrase Structure Parameters % % Description: This section contains the language specific choices for % possible specifiers and adjuncts. In addition the headjposition % parameter is specified for the various categories of each % language.  ;  %  % General rule to parse simple lexical items as specifiers. spec(L,C,spec(Cat,W,F),S,S)  —>  [W], {specifier(L,C,Cat), morph(L,W,Cat,R,F)}.  %,  % Parse punctuation  as an adjunct to root CP.  adj(L,post,c(mat/sent),Tree,S,S)—> adj(L,post,c(mat/sent),Tree,S,S)—>  ['.'],{Tree = punc(punc,'.' ,decl)}. ['?'],{Tree = punc(punc,'?',ques)}.  % % Section: English Configuration % % The following define the specific phrase structure rules for: English spec(eng,c(T),Tree,S,NS)  —>  wh_phrase(eng,c(T),Tree,S,NS).  105  APPENDIX  C.  LANGUAGE PARAMETERS  spec(eng,i(_) ,Tree,S ,S) spec(eng,i(_),Tree,S,S)  -> •>  xmax(eng,n(nom),Tree,[],[]). xmax_e(eng,n(nom),Tree).  adj(eng ,p ost ,n (_) ,Tree, S ,N S) adj(eng,post,n(_),Tree,S,NS) adj(eng,post,v(T),Tree,S,NS)  -> -> •>  xmax(eng,p(_),Tree,S,NS). xmax(eng,c(emb / rel),Tree,S ,NS). xmax(eng,p(_),Tree,S,NS).  specifier^ eng,n(_),d). head_position(eng,_,initial).  % All heads are initial.  % % Section: German  Configuration  % % The following define the specific phrase structure rules for:  German  spec(ger,c(mat/sent),Tree,S,NS) spec(ger,c(T) ,Tree,S ,NS) spec(ger,i(_),Tree,S,NS) spec(ger,i(_) ,Tree,S ,S)  •> -> -> ->  topic(ger,c(T),Tree,S,NS). wh_phrase(ger,c(T),Tree,S,NS). xmax(ger,n(nom),Tree,S,NS). xmax_e(ger,n(nom),Tree).  adj(ger,pre,v(_),Tree,S,NS) adj(ger,post,n(_),Tree,S,NS) adj(ger,post,n(_),Tree,S,NS)  -> -> ->  xmax(ger,p(_),Tree,S,NS). xmax(ger,p(_),Tree,S,NS). xmax(ger,c(emb/rel),Tree,S,NS).  specifier(ger,n(_),d). head_position(ger,n(_)4 itial). head_position(ger ,v(_) ,final). head_position(ger,a,initial). head_position(ger,p(_) ,initial). head_position(ger,i(_),final). head_position(ger,c(_) ,initial). 
Appendix D

Constraint Module

%
% Section: Constraints
%
% Paradigm: constraints(Language,Category,Tree).
%
% Description: This module evaluates the constraints, as appropriate for
%    the Category. The action is as follows:
%    1. Determine the constraint list, based on the immediately
%       dominated X-max nodes, and the properties of the current node.
%    2. Satisfy all constraints possible; if there is a definite
%       violation then fail (force a re-parse).
%    3. Return the new list of constraints, to be passed up the tree.

constraints(L,C,Tree1,Tree2) :-
    get_constraints(Tree1,Cons),
    satisfy_constraints(L,C,Tree1,Cons,Tree2,Constraints),
    add_constraints(Tree2,Constraints).

%
% get_constraints(Tree,Constraints): search the tree for all the maximal
%    projections dominated by Tree, get their constraint lists and
%    merge them.

get_constraints(xmax(C,ID,F,Cons)/R,Cons) :-
    \+ var(Cons),!.
get_constraints(X/[L,R],Cons) :- !,
    non_terminal(X),
    get_constraints(L,CL),
    get_constraints(R,CR),
    append(CL,CR,Cons),!.
get_constraints(X/Y,Cons) :- !,
    non_terminal(X),
    get_constraints(Y,Cons),!.
get_constraints(X,[]) :- terminal(X).

terminal(Term) :-
    Term =.. [Pred|_],
    member(Pred,[e_cat,head,spec,adj,punc]),!.

non_terminal(Node) :-
    Node =.. [XBAR|_],
    member(XBAR,[xmax,xbar]),!.

gen_ID(xmax(C,ID,F,Cons)/R) :-
    gensym(C,ID),!.

%
% add_constraints(Tree,Constraints): determines if any new constraints should
%    be added for the current tree (note: some are also added during the
%    satisfy constraint phase).

add_constraints(xmax(C,ID,F,Cons)/R,Constraints) :-
    new_constraints(C,xmax(C,ID,F,Cons)/R,NewCons),
    append(Constraints,NewCons,Cons),!.

new_constraints(n(Case),xmax(n(_),ID,F,Cons)/R,[case(ID,Case)]) :-
    \+ member(ant(+),F),!.
new_constraints(C,xmax(Cat,ID,F,Cons)/R,[ant(abar,Cat,ID)]) :-
    member(ant(+),F),!.
new_constraints(v(_),Tree,[theta(X)]) :-
    get_theta(Tree,X),!.
new_constraints(Cat,Tree,[]) :- !.
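As an illustration of how a constraint enters the list (the query uses a made-up node identifier np1 and is commentary only, not part of the module):

```prolog
% An NP that is not an antecedent (no ant(+) feature) posts a pending
% case requirement, to be discharged later by case theory; if it is
% never governed by a case assigner, the Case Filter forces a re-parse.
%
%   ?- new_constraints(n(acc),xmax(n(_),np1,[],_)/sub,New).
%   New = [case(np1,acc)].
```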
get_theta(Tree,X) :-
    get_head(Tree,head(C,W,F)),
    member(theta(X),F),!.
get_theta(_,+) :- !.

%
% satisfy_constraints(Language,Cat,Tree,Constraints,NewConstraints): applies
%    the principles of Government-Binding theory to the current set of
%    constraints for Tree. Each principle determines its own applicability
%    (if it doesn't apply, it will trivially succeed).

satisfy_constraints(L,C,Tree1,Constraints,Tree3,NewConstraints) :-
    ecp(L,C,Tree1,Constraints),
    case_theory(L,C,Tree1,Tree2,Constraints,Constraints2),
    theta_theory(L,C,Tree2,Constraints2,Tree3,Constraints3),
    subjacency(L,C,Tree3,Constraints3,NewConstraints).

%
% ecp(L,C,Tree,Constraints): if there exists an empty category in Constraints,
%    then if it is properly governed it is a 'trace', otherwise it is 'PRO'
%    (in fact, this avoids subjects, so never determines PRO).

ecp(L,a(_),Tree,Constraints) :- !.
ecp(L,C,Tree,Constraints) :-
    exists_ec(Constraints,Tree,ID,Type) ->
        ((properly_governs(L,C,Tree,xmax(Cat,ID,_,_)) ->
             Type = trace(WH) ; Type = pro),
         ecp(L,C,Tree,Constraints)),!.
ecp(L,C,Tree,Constraints) :- !.

exists_ec(Constraints,Tree,ID,Type) :-
    member(ec(_,ID,Type),Constraints),
    var(Type).

%
% case_theory(Language,Cat,Cons1,Cons2): case theory has two applications,
%    1. To mark lexical (non-wh) NP's with case, and fail if it
%       cannot (Case Filter).
%    2. To determine if a given trace is an np-trace or wh-trace.
%    First we determine if Cat is a case assigner for Language, and then
%    do (1) case_mark_lexical, and (2) case_mark_traces below.

case_theory(L,C,Tree1,Tree3,Cons1,Cons2) :-
    case_assigner(L,C,Tree1,Trans),!,
    case_mark_lexical(Tree1,Tree2,Trans,Cons1,Cons2),
    case_mark_traces(Tree2,Tree3,Trans,Cons2),!.
case_theory(L,C,Tree,Tree,Cons,Cons) :- !.

%
% case_mark_lexical(Tree,Cons1,Cons2): every case request in Cons1 must
%    be governed in Tree, else the Case Filter applies (ie. fail).
case_mark_lexical(Tree1,Tree3,Trans,Cons1,Cons3) :-
    remove(case(ID,Case),Cons1,Cons2),!,
    governed(xmax(n(_),ID,_,_),Tree1,Trans),
    mark_node(ID,case(ID),Tree1,Tree2),
    case_mark_lexical(Tree2,Tree3,Trans,Cons2,Cons3),!.
case_mark_lexical(Tree,Tree,Trans,C,C) :- !.

%
% case_mark_traces(Tree,Constraints): if there exists a trace in Constraints
%    of unknown type (wh/np), then if it is governed in Tree it is a
%    wh-trace, otherwise it is an np-trace.

case_mark_traces(Tree1,Tree3,Trans,Constraints) :-
    exists_trace(Constraints,Tree1,ID,WH) ->
        (governed(xmax(Cat,ID,_,_),Tree1,Trans) ->
            (mark_node(ID,case(ID),Tree1,Tree2), WH = wh)
          ; (Tree2 = Tree1, WH = np)),
        case_mark_traces(Tree2,Tree3,Trans,Constraints),!.
case_mark_traces(Tree,Tree,Trans,Constraints) :- !.

exists_trace(Constraints,Tree,ID,WH) :-
    member(ec(Cat,ID,trace(WH)),Constraints),var(WH).

%
% theta_theory: if a verb theta-marks its subject, the theta(+) constraint
%    is removed, and the new Tree, with theta-marked subject, is returned.
%    If at IP there is a theta(-) constraint, then the subject is not
%    theta-marked, but it must be an antecedent for an a-chain (or 'it').

theta_theory(L,i(_),Tree1,OldCons,Tree2,NewCons) :-
    remove(theta(+),OldCons,NewCons),!,
    Tree1 = X/[Subj1,Rest],
    Subj1 = xmax(_,ID,_,_)/_,
    add_feature(theta(ID),Subj1,Subj2),
    Tree2 = X/[Subj2,Rest].
theta_theory(L,i(_),Tree,OldCons,Tree,NewCons) :-
    \+ member(theta(+),OldCons),
    Tree = X/[Subj,_],
    \+ pleonastic(L,Subj),
    \+ Subj = _/e_cat(_,_),!,
    Subj = xmax(_,ID,_,_)/_,
    append([ant(a,n(_),ID)],OldCons,NewCons).
theta_theory(L,C,Tree,Constraints,Tree,Constraints) :- !.

pleonastic(eng,NP) :- get_head(NP,head(_,it,_)).
pleonastic(ger,NP) :- get_head(NP,head(_,es,_)).

%
% subjacency(Language,Cat,Tree,OldConstraints,NewConstraints): This
%    predicate controls the construction of chains, and applies the
%    Subjacency principle to determine their well-formedness.
subjacency(L,C,Tree,OldCons,NewCons) :-
    bind_traces(L,C,Tree,OldCons,Cons1),      % bind all traces.
    check_pro_chains(L,C,Tree,Cons1,Cons2),
    \+ member(barrier(ID),Cons2),             % no barriers remain.
    barriers(L,C,Tree,Cons2,NewCons).         % add new barriers if any.

%
% check_pro_chains(L,Cat,Tree,OldConstraints,NewConstraints): if a pro-chain
%    is present, and there is no antecedent available, then the original
%    empty category must have been PRO, and is set accordingly.

check_pro_chains(L,C,Tree,Cons1,Cons4) :-
    remove(chain(pro,ID,CC,Case,Theta,List),Cons1,Cons2),
    remove(barrier(ID),Cons2,Cons3),
    set_last(pro,List),
    check_pro_chains(L,C,Tree,Cons3,Cons4),!.
check_pro_chains(L,C,Tree,Cons,Cons) :- !.

%
% barriers(L,Cat,Tree,OldConstraints,NewConstraints): if the current node is
%    a barrier (ie, not L-marked), then it becomes a barrier to any
%    chains which it dominates.

barriers(L,C,Tree,OldCons,NewCons) :-
    is_barrier(Tree),
    member(chain(Type,ID,CC,Case,Theta,List),OldCons),
    \+ member(barrier(ID),OldCons),!,         % not already a barrier.
    append([barrier(ID)],OldCons,Cons1),
    barriers(L,C,Tree,Cons1,NewCons).
barriers(L,C,Tree,Cons,Cons) :- !.

is_barrier(xmax(C,ID,Features,Cons)/R) :-
    \+ member(l_marked,Features),!.           % current node is not L-marked.

set_last(Type,[ec(C,ID,Type)|[]]) :- !.
set_last(Type,[A|Y]) :- !,set_last(Type,Y).

%
% bind_traces(Language,Cat,Tree,OldConstraints,NewConstraints): this
%    attempts to either close a chain, by prepending an antecedent to
%    it (clause 1), or extend a chain, by prepending a trace to it
%    (clause 2).

bind_traces(L,c(_),Tree,OldCons,NewCons) :-
    member(chain(Type,ID,CC,Case,Theta,List),OldCons),
    var(Type),!,
    get_head(Tree,head(_,empty,_)),
    Type=pro,
    bind_traces(L,c(_),Tree,OldCons,NewCons),!.
bind_traces(L,v(_),Tree,OldCons,NewCons) :-
    member(chain(Type,ID,CC,Case,Theta,List),OldCons),
    var(Type),!,Type=a,set_last(trace(np),List),
    bind_traces(L,v(_),Tree,OldCons,NewCons),!.
bind_traces(L,C,Tree,OldCons,NewCns) :-
    member((At/Ct),[(a/a),(abar/a),(abar/abar),(abar/pro)]),
    remove(ant(At,CC,ID),OldCons,Cns1),
    remove(chain(Ct,IDC,CC,Case,Th,List),Cns1,Cns2),
    (Ct=pro,set_last(trace(wh),List) ; true),
    subjacent(ID,IDC,Tree,Cns2,Cns3),
    agree_case(ID,Tree,Case,Th),
    append([c_chain(At,ID,CC,Case,Th,[ant(At,CC,ID)|List])],Cns3,NewCns),!.
bind_traces(L,i(tns(-)),Tree,OldCons,NewCons) :-
    exists_ec(OldCons,Tree,ID,Type) ->
        make_chain(Tree,ec(Cat,ID,Type),Unknown,OldCons,Cons1),
        bind_traces(L,i(tns(-)),Tree,Cons1,NewCons).
bind_traces(L,C,Tree,OldCons,NewCons) :-
    member(ec(Cat,ID,trace(T)),OldCons),!,
    (bind_to_chain(ec(Cat,ID,trace(T)),Tree,OldCons,Cons1);
     make_chain(Tree,ec(Cat,ID,trace(T)),a,OldCons,Cons1)),
    bind_traces(L,C,Tree,Cons1,NewCons).
bind_traces(L,C,Tree,Cons,Cons) :- !.

%
% bind_to_chain(Trace,Tree,OldConstraints,NewConstraints): finds a trace
%    and a chain, makes sure they are subjacent, prepends the
%    trace to the chain, and puts the new chain in NewConstraints.

bind_to_chain(ec(C,ID,trace(T)),Tree,OldCons,NewCns) :-
    remove(ec(C,ID,Type),OldCons,Cns1),               % get the empty cat.
    member(ChT,[a,abar,pro]),
    remove(chain(ChT,IDC,C,Cse,Th,List),Cns1,Cns2),   % get the chain.
    subjacent(ID,IDC,Tree,Cns2,Cns3),                 % are they subjacent?
    agree_case(ID,Tree,Cse,Th),                       % only one theta role.
    append([chain(ChT,ID,C,Cse,Th,[ec(C,ID,trace(T))|List])],Cns3,NewCns),!.

%
% make_chain(Trace,OldConstraints,NewConstraints): if the trace couldn't be
%    attached to a chain, then it must start a new chain.

make_chain(Tree,ec(C,ID,Type),ChType,OldCons,NewCons) :-
    \+ var(Type), Type = trace(T),
    \+ var(T), T = comp,
    remove(ec(C,ID,Type),OldCons,NewCons),!.
make_chain(Tree,ec(C,ID,Type),ChType,OldCons,NewCons) :-
    remove(ec(C,ID,Type),OldCons,Cons1),              % remove the ec
    agree_case(ID,Tree,Case,Theta),
    append([chain(ChType,ID,C,Case,Theta,[ec(C,ID,Type)])],Cons1,NewCons),!.

subjacent(IDtrace,IDchain,Tree,OldCons,NewCons) :-
    remove(barrier(IDchain),OldCons,NewCons),!.
subjacent(IDtrace,IDchain,Tree,Cons,Cons).

%
% agree_case: ensure that a chain receives Case and a Theta role exactly once.

agree_case(ID,Tree,Case,Theta) :-
    get_subtree(ID,Tree,xmax(C,ID,F,Cons)/R),!,
    (member(case(IDC),F),Case1=case(IDC);true),
    (member(theta(IDT),F),Theta1=theta(IDT);true),!,
    Case=Case1,Theta=Theta1,!.

%
% governs(Language,Cat,Tree,Node): first determines if Cat is a governor
%    for Language, and then calls 'governed' to see if Node is governed
%    by the head of the current maximal projection.

governs(L,C,Tree,Node) :-
    governor(L,C),!,
    governed(Node,Tree,trans(-)).

governor(L,C) :- governor_list(L,List),!,member(C,List).
governor_list(_,[n(_),v(_),a,p(_),i(tns(+))]).

%
% properly_governs(Language,Cat,Tree,Node): first determines if Cat is a
%    proper governor for Language, and then calls 'governed' to see
%    if Node is governed by the head of the current maximal projection.

properly_governs(L,C,Tree,Node) :-
    proper_governor(L,C),!,
    pgoverned(Node,Tree).

proper_governor(L,C) :- pg_list(L,List),!,member(C,List).
pg_list(_,[n(_),v(_),p(_)]).

pgoverned(xmax(_,ID,_,_),xmax(C,IDG,F,Cons)/R) :-
    pgoverned1(ID,IDG,xmax(C,IDG,F,Cons)/R),!.

pgoverned1(ID,IDG,X/[L,R]) :-
    ( L = xmax(C)/T ; R = xmax(C)/T ),!,
    pgoverned1(ID,IDG,xmax(C)/T).
pgoverned1(ID,IDG,X/[L,R]) :-
    ( L = xbar(C)/T ; R = xbar(C)/T ),!,
    governed1(ID,IDG,X/[L,R],trans(+)).

%
% governed(Node,Tree): determines if Node is governed by the head of
%    the current maximal projection, which is the root of Tree.

governed(xmax(_,ID,_,_),xmax(C,IDG,F,Cons)/R,Trans) :-
    governed1(ID,IDG,xmax(C,IDG,F,Cons)/R,Trans),!.
governed1(ID,IDG,xmax(_,ID,_,_)/_,Trans) :- !.
governed1(ID,IDG,xmax(C,IDX,_,_)/_,trans(+)) :-
    \+ IDX=IDG, \+ C=i(tns(-)),!,fail.
governed1(ID,IDG,X/[L,R],Trans) :- !,
    (governed1(ID,IDG,L,Trans) ; governed1(ID,IDG,R,Trans)).
governed1(ID,IDG,X/Y,Trans) :-
    governed1(ID,IDG,Y,Trans).

%
% case_assigner: determines if the head of the phrase is a case-assigner.

case_assigner(L,v(_),Tree,trans(+)) :-
    get_head(Tree,head(_,_,Features)),
    member(trans(+),Features),!.
case_assigner(L,C,Tree,trans(-)) :-
    case_assigner(L,C),!.
case_assigner(L,c(emb/sent),Tree,trans(+)) :-
    get_head(Tree,head(_,Word,_)),
    member(Word,[for,empty]),!.

case_assigner(_,v(_)).
case_assigner(_,p(_)).
case_assigner(_,i(tns(+))).

%
% mark_node: find the phrase labeled ID and mark its feature list with
%    the term Mark.

mark_node(ID,Mark,xmax(C,ID,F1,Cons)/R,xmax(C,ID,F2,Cons)/R) :- !,
    append([Mark],F1,F2).
mark_node(ID,Mark,X/[L,R],X/[L1,R1]) :- !,
    ( (mark_node(ID,Mark,L,L1),R=R1) ;
      (mark_node(ID,Mark,R,R1),L=L1) ).
mark_node(ID,Mark,X/R,X/R1) :- !,
    mark_node(ID,Mark,R,R1).
mark_node(ID,Mark,X) :- !,fail.

%
% get_subtree: find (and return) the phrase labeled ID.

get_subtree(ID,xmax(C,ID,F,Cons)/R,xmax(C,ID,F,Cons)/R) :- !.
get_subtree(ID,X/[L,R],Subtree) :- !,
    (get_subtree(ID,L,Subtree) ; get_subtree(ID,R,Subtree)).
get_subtree(ID,X/R,Subtree) :- !,
    get_subtree(ID,R,Subtree).
get_subtree(ID,R,Subtree) :- !,fail.

Appendix E

Translation Module

%
% Section: Translate
%
% Paradigm: translate(SourceTree,TargetTree).
%
% Description: This section is responsible for translating the parse
%    tree for the source input into the target language. The high
%    level predicates are called as follows:
%    1. gen_deep_structures: this produces the source D-structure,
%       applying move-alpha in reverse.
%    2. trans_lexical: converts source lexical items into their
%       inflected, target equivalents.
The result is the D—structure for the target language. 3. gen_surf_structures: generates surface structures for the D—structures of the target language.  % translate(SourceTree,TargetTree) :— write('Source Surface Structure: ' ) , n l , pretty(SourceTree), gen_deep_structure(SourceTree,SourceDeep), trans_lexical(SourceDeep,TargetDeep,Mode), gen_surf_structure(Mode,TargetDeep,TargetTree).  % % gen_deep_structure(SurfaceTree,DeepTree): collapses all chains, returning % moved constituents to their base generated positions. Heads which % have moved to Comp (through inversion), are also returned to their  116  APPENDIX  %  E. TRANSLATION MODULE  base positions.  gen_deep_structure(SourceTree,TargetTree) :— rev_move_alpha( S ourceTree ,TargetTree),!, write(' Source Deep Structure: '),nl, pretty (TargetTree). %.  % rev_move_alpha: is essentially applying move—alpha in reverse. % It first calls revjnovejip,. and then revjnovejiead.  rev_move_alpha(S_structure,D_structure) :— S_structure = xmax(C,ID,F,Cons)/_, make_trace_lists(l,Cons,Lists), rev_move_np(S_structure,D_structurel,Lists), rev_move_head(D_structurel,D_structure),!.  % % revjnovejip: each NP/PP which is the head of a chain is moved back to its % base—generated position.  rev_move_np(D_str,D_str,[]) : - !. rev_move_np(S_str,D_str,[Chain|Rest]) :— extract_from_chain(Chain,SurfID,BaseID), collapse_cliaiii(S_str,SurfID,D_strl,BaseID), rev_move_np(D_strl,D_str,Rest),!. extract_from_chain(ecl(_,[SurflD|Rest]),SurfID,BaseID) :— r ever se( Rest, [B aselD |J),!.  % % collapsejchain(S— str,SID,D—str,BID): move an element whose S—Str positition % is SID to its D—str position specified by BID. This predicate is % bi-directional (either S—str/SID or D—str/BID may be specified).  collapse_chain(X/[L,R],SurfID,X/[NewL,NewR],BaseID) : R=xmax(C,SurfID,_,_)/_,leave_ec(X,R,NewR), move_to_base(R,BaseID,L,NewL),!. 
collapse_chain(X/[L,R],SurfID,X/[NewL,NewR],BaselD) : L=xmax(C,SurfID,_,_)/_,leave_ec(X,L,NewL), move_to_base(L,BaseID,R,NewR),!. coUapse_chain(X/[L,R],SurnD,X/[NewL,NewR],BaseID) : - !,  117  APPENDIX  E. TRANSLATION MODULE  collapse_chain(L,SurfID,NewL,BaseID), collapse_chain(R,SurflD,NewR,BaseID),!. collapse_chain(X/R,SurfID,X/NewR,BaseID) : collapse_chain(R,SurflD,NewR,BaseID),!. collapse_chain(X _,X,_) :— !. 1  % move_to_base: returns a phrase to its Base position (at BaselD). move_to_base(Ant,BaselD,X/[L,R] ,X/[L,NewAnt]) : R=xmax(C,BaseID,_,_)/_,set_features(R,Ant,NewAnt),!. move_to_base(Ant,BaselD,X/[L,R] ,X/[NewAnt,R]) : L=xmax( C,B aselD,_,_) /_,set_features(L ,Ant,NewAnt),!. move_to_base(Ant,BaseID,X/[L,R],X/[NewL,NewR]) : - !, move_to_base( Ant ,B aselD ,L ,NewL), move_to_base( Ant ,B aselD ,R,Ne wR),!. move_to_base(Ant,_,X,X) :— |. leave_ec(X,xmax( Cat ,ID ,F,Cons)/_,xmax( C at ,ID,_,_) /e_cat (Type)) :— x_node(X,C,_) -> ( (C = c(emb/rel), Type = ant) ; (C = c(mat/sent),Type = ant) ; (C — c(emb/sent),Type = comp) ; (C = i(_), Type = np) ). set_features(xmax(C,BID,Bftrs,BCons)/Xpcmax(C,AID,Aftrs,ACons)/R, xmax(C,BID,NewFtrs,[])/R) : \+ var(X), % Surface to Base. (\+ member(case(_),Aftrs),append([wh(+)],Aftrs,NewFtrs) ; Aftrs = NewFtrs),!. set_features(xmax(C,BID,Ftrs,BCons)/X,xmax(C,AID,Ftrs,_)/R, xmax(C,BID,Ftrs,_)/R) : var(X), % Base — > Surface. X = e_cat(C,trace(wh)).  %  % rev_move_head(OldDstr,NewDstr): heads which have been moved by have/be % raising or inversion are also returned to their base positions. rev_move_head(01dDstr,NewDstr) :— find_moved_head(01dDstr,HD,01dDstrl),!,  APPENDIX  E. TRANSLATION MODULE  move_hd_base(HD,01dDstrl,NewDstr). rev_move_head(Dstr,Dstr). find_moved_head(Dstr,head(C2,W,Ftrs),NewDstr) : Dstr = xmax(Cl,ID,F,Cons)/R, get_head(Dstr,head(C2,W,Ftrs)), \+ W = empty, \+ CI = C2,!, set_head(head(c(mat/sent),empty,[]),ID,Dstr,NewDstr). 
move_hd_base(head(i(_),do,_),01dDstr,NewDstr) : OldDstr = xmax(i(_),ID,F,_)/_, get_head( OldDstr ,head(C ,emptyj_)),!, member (agree(Tns,_,_,_) ,F), adjust_vp(tns(Tns),OldDstr,NewDstr). move_hd_base(head(C,W,F),01dDstr,NewDstr) : OldDstr = xmax(C,ID,_,_)/_, member(C,[i(_),v(_)]), get_head(01dDstr,head(C,empty^)),!, set_head(head(C,W,F),ID,01dDstr,NewDstr). move_hd_base(HD,X/[L,R],X/[NewL,NewR]) : - !, move_hd_base(HD,L,NewL), move_hd_base(HD,R,NewR). move_hd_base(HD,X/R,X/NewR) : - !, move_hd_base(HD,R,NewR). move_hd_base(HD,X,X) : - !. adjust_vp(Tns,VP,VP) : VP = xmax(C,_, J / , \+ member(C,[i(_),v(_)]),!. adjust_vp(Tns,01dVP,NewVP) : OldVP = xmax(v(nil/tns(-)),ID,F,_)/_,!, get_head(01dVP,head(C,W,Ftr)), set_head(head(v(nil/tns(+)),W,Ftr),ID,01dVP,VPl), adjust_tns(Tns,VPl,NewVP). adjust_vp(Tns,X/[L,R],X/[NewL,NewR]) : - !, adjust_vp(Tns,L,NewL), adjust_vp(Tns,R,NewR). adjust_vp(Tns,X/R,X/NewR) : - !, adjust_vp(Tns,R,NewR). adjust_vp(Tns,X,X) : - !.  119  APPENDIX  E. TRANSLATION MODULE  120  %.  % transJexical(SourceDeep,TargetDeep,Mode): translates each lexical item from % the source language to that of the target language. At phrasal nodes % %  feature aggreement is forced, producing the inflected (surface) form. Constituents are also re—ordered for the target language.  trans_lexical(SourceDeep,TargetDeep,Mode) :— trans_lex(SourceDeep,TargetDeep,Mode), write('Target Deep Structure:' ),nl, pretty(TargetDeep). trans_lex(SourceDeep,TargetDeep,Mode) :— SourceDeep = xmax(C,ID,F,Cons)/R,!, SourceDeep2 = xmax(C,ID,F,_)/R, trans_lexl(SourceDeep2,TargetDeep,Mode), target(L),!, rev_constraints(L,C,TargetDeep). trans_lex(SourceDeep,TargetDeep,Mode) :— transjexl (SourceDeep ,Target Deep,Mode). trans_lexl(X/[L,R],Tree,Mode) : X =.. 
[xmax,Cat|J, (x_bar(Cat,L) ; x_bar(Cat,R)),!, list_args(X/[L,R],Arglist,Head,Mode), order_args (Cat,Arglist,Ordered), x_node(X,Cat,Pos), (Pos=initial,Position=post •; Pos=final,Position=pre), build_tree(Position,Cat,Head,Ordered,_/As), Tree = X/As. trans_lexl(X/[L,R],X/[NL,NR],Mode) : - !, trans_lex(L ,NL ,Mode), trans_lex(R,NR,Mode). trans_lexl(X/R,X/NewR,Mode) : - !, trans_lex(R,NewR,Mode). transJexl(e_cat(C,operator),head(C,das,_),Mode) :— !. trans_lexl(head(c(emb/rel)^,_),head(c(emb/rel),empty.),Mode) : - target(ger). trans_lexl(head(c(emb/sent)^),head(c(emb/T),dass^),Mode) : - target(ger). transJexl(head(c(emb/T)^,J,head(c(emb/T),that^),Mode) : - target(eng). trans_lexl(HD,NewHD,Mode) : HD =.. [Pred,C,W,F], member(Pred,[head,spec,adj]),!, trans_word(C,W,F,NewW,NewF), i  APPENDIX  E. TRANSLATION MODULE  NewHD =.. [Pred,C,NewW,NewF],!. trans_lexl(Punc,Punc,Mode) :— Punc = punc(punc,_,Mode),!. trans_lexl(X,X,Mode) : - !. trans_word(Cat,empty,SFtr,empty,_) :— !. transjvord(c(emb/sent),_,F,NewW,_) :— (target(ger),NewW=dass) ; (target(eng),NewW=that),!. transjvord(c(emb/rel),_,F,NewW,_) :— (target(ger),NewW=empty) ; (target(eng),NewW=that),!. transjvord(Cat,SWord,SFtr,TWordJ : (member(engUsh(TWord),SFtr);member(german(TWord),SFtr)),!.  % % list_args(Tree,ArgList,Head,Mode): the arguments of a head (ie those sister % to x—bar) are returned as a list, along with the Head & Mode (if % applicable).  list_args(X/ [L ,R], [NewL | Rest] ,Head ,Mode) : x_node(X,C,_),x_bar(C,R),!, trans_lex(L,NewL,Mode), list_args(R,Rest,Head,Mode). list_args(X/[L,R],[NewR|Rest],Head,Mode) : x_node(X,C^)^c_bar(C,L),!, trans_lex(R,NewR,Mode), list_args(L,Rest,Head,Mode). list_args(Head,[],HD,Mode) : trans_lex(Head,HD,Mode).  % % order_args(Category,ArgList, OrderedArgs): accepts a list of arguments and % reorders them as appropriate for a given category. Currently this % simply moves accusative NP's next to the head.  
order_args(Cat,Args,OrdArgs) :— remove(xmax(n(acc),ID,F,C)/R,Args,New Args),!, append(NewArgs,[xmax(n(acc),ID,F,C)/R],OrdArgs). order_args(Cat,Args,Args) : - !. order_cons(X,NL,NR,NR,NL) : x_node(X,Cat,Pos), ((x_bar(Cat,NL),Pos=final);  APPENDIX  E. TRANSLATION MODULE  (x_bar(Cat,NR),Pos=imtial)),!. order_cons(X,NL,NR,NL,NR) : - !. x_node(Node,C,Pos) :— Node =.. [Xbar,C|J,member(Xbar,[xmax,xbar]), target(L),head_position(L,C,Pos). x_bar(C,xbar(C)/_) : - !. %.  % revjconstrainst(Language, Category, Tree): apply the appropriate constraints %  in reverse. Essentially, this routine uses 'subjacency' to determine  %  where Wh—phrases and caseless NP's should be moved.  rev_constraints(L,C,Tree) :— get_constraints(Tree,Constraints), subjacency(L,C,Tree,Constraints,NewConstraints), rev_add_constraints(Tree,NewConstraints). rev_add_constraints(xmax(C,ID,F,NewCons)/R,Cons) :— rev_new_constraints(C,xmax(C,ID,F,Cons)/R,Consl), append(Cons,Consl,NewCons). rev_new_constraints(_,xmax(C,ID,_ .)/e_cat(Type),[Cons]) :— !, ( (Type = ant , Cons = ant(abar^,ID)) ; (Type = comp , Cons = ec(_,ID,trace(comp))) ; (Type = np , Cons = ant(a,_,ID)) ). rev_new_constraints(_,xmax(C,ID,_^)/e_cat(_,Type),Cons) :— !, ( (Type = trace(comp) , Cons = [ec(_,ID,trace(comp))]); Cons = [] ). rev_new_constraints(_,xmax(C,ID,Ftr,_)/R,[ec(C,ID,trace(wh))]) :— member(wh(-|-),Ftr),!. rev_new_constraints(n(_),xmax(C,ID,Ftr,_)/R,[ec(C,ID,trace(np))]) :— \+ member(case(_),Ftr),!. rev_ne w_const rain t s (_,Tree, []). %  % % gen_surf_structure(Mode,DeepStructure,SurfaceStructure): generates a % surface structure for the target language by applying Move—alpha % and certain language specific transformations.  gen_surf_structure(Mode,TargetDeep,TargetSurface) :—  122  APPENDIX  E. 
TRANSLATION MODULE  t axget (L) ,set_inversion( L ,Mode ,In v), move_alpha(TargetDeep ,TargetSurface 1), raising(L,Inv,TargetSurfacel,TargetSurface2), gen_pf(TargetSurface2,TargetSurface3), in version (In v ,Tar ge t S urface 3, Target S ur face4), topicalize(L,Mode,TargetSurface4,TargetSurface), write('Target Surface Structure:'),nl, pretty(TargetSurface). %.  % movejalpha: moves non case—marked NP's and wh—phrase, according to the % chains constructed during translation.  move_alpha(Deep,Surface) :— Deep = xmax(C,ID,F,Cons)/_, make_trace_lists(l,Cons,Lists), move_np (S urface ,Deep,Lists). move_np(D_str,D_str,[]) :— !. move_np(S_str,D_str,[Chain|Rest]) :— extract_from_chain( Chain,SurfID ,B aselD), collapse_chain(D_strl,SurfID,D_str,BaseID), move_np(S_str,D_strl,Rest),!.  % % raising(Lang,Inv,DeepTree,SurfTree): if head of INFL is empty, then the % next highest verb is raised to that position (where is receives its % %  TNS,PER,NUM features). In English, if inversion is to take place, then the verb raised must aux(+), otherwise do—support is required.  raising(Lang,Inv,DeepTree,SurfTree) :— DeepTree = xmax(i(Tns),ID,Ftr,Cons)/[Left,Right],!, raising(Lang,no,Left ,NewL) ,raising(Lang,no,Right ,NewR), DeepTreel = xmax(i(Tns),ID,Ftr,Cons)/[NewL,NewR], check_subject(Left,Inv,NewInv), raisel(Lang,NewInv,ID,DeepTreel,S urfTree). raising(Lang,Inv,X/[L,R],X/[NewL,NewR]) : - !, raising(Lang,Inv,L,NewL),raising(Lang,Inv,R,NewR). raising(Lang,Inv,X/R,X/NewR) : - !,raising(Lang,Inv,R,NewR). raising(Lang,Inv,X,X) :— !. raisel(Lang,Inv,ID,DeepInfl,SurfInfi) :—  APPENDIX  E.  TRANSLATION MODULE  124  get_head( D eeplnfl ,head(i (_) ,emp ty,_)),!, raise_verb(Lang,Inv,ID,DeepInfl,DeepInnl,Verb), setJiead(VerbJD,DeepInnl,SurfInn). raisel(ger,Inv,ID,DeepInfl,SurfInfl) :— get_head( D eeplnfl ,head(i(tns(—)),zu,_)),!, raise_verb(ger,Inv,ID,DeepInfl,DeepInfll,Verb), Deeplnfll = Xmax/R, Surflnfl = Xmax/[xmax(i(tns(-)))/R,Verb]. raisel(L,Inv,ID,Infl,Infl) : - !.  
% % raisejverb:  retrieve  the head of the VP, and set the head of the phrase  %  to empty. For English,  %  to take place.  do support  is performed  if inversion  is  raise_verb(Lang,Inv,IDinfl,Tree,Tree, Verb) : — Tree = xmax(i(_),ID,_,_)/_, \+ ID = IDinfl,!. raise_verb(Lang,Inv,IDinfl,DeepVP,SurfVP,Verb) :— DeepVP = xmax(v(_),ID,_J/_,!, get_head(DeepVP,head(v(F),V,Ftr)), ((Lang = ger ; Inv = no ; aux(Lang,V)) — > (set_head(head(v(F),emptyJ,ID,DeepVP,SurfVP), Verb = head(v(F),VJ) ; (Lang — eng , Inv=yes) — > (set_head(head(v(nil/tns(-)),VJ,ID,DeepVP,SurfVPl), adjust_tns(tns(-),SurfVPl,SurfVP), F = Form/Tns,Verb = head(i(Tns),do,J) ). raise_verb(Lang,Inv,IDinfl,X/[L,R],X/[NewL,NewR],Verb) : - !, raise_verb(Lang,Inv,IDinfl,L,NewL,Verb), raise_verb(Lang,Inv,IDinfl,R NewR,Verb). raise_verb(Lang,Inv,IDinfl,X/R,X/NewR,Verb) : - !, raise_verb(Lang,Inv,IDinfl,R,NewR,Verb). raise_verb(Lang,Inv,IDinfl,X,X,Verb) :— !. 5  aux(L,V) : - morph(L,V,v(_),R,Ftr),!, member(aux(-|-),Ftr). check_subject(xmax(C,ID,F,Cns)/e_cat(_^),_,no) check_subject(_,Inv,Inv).  :— !.  adjust_tns(tns(T),xmax(C,ID,F,Cns)/R,xmax(C,ID,NewF,Cns)/R) : -  APPENDIX  E. TRANSLATION MODULE  remove(agree(Tns,Per,NumGen,Case),F,Fl), append([agree(T,Per,NumGen,Case)],Fl,NewF),!. set_head(HD,IDpcmax(C,IDX,F,Cons)/Rpcmax(C,IDX,F,Cons)/R) : \+ ID = IDX,!. set_head(HD,ID,X/[L,R],X/[NewL,NewR]) : - !, set_head(HD,ID,L,NewL), set_head(HD ,ID,R,NewR),!. set_head(HD,ID,X/R,X/NewR) : - !, set_head( HD ,ID ,R,NewR),!. set_head(HD,ID,head(_,_,_),HD) : - !. set_head(HD,ID,X,X) : - !.  % % inversion(Language,Mode,DeepTree,SurfTree): simply moves the head of the % matrix INFL to head of COMP. This must apply after raising.  inversion(yes,DeepTree,SurfTree) :— !, find_inn(DeepTree,SurfTreel,Infl), SurfTreel = xmax(_,ID,_,_)/_, set>ead(InflJD,SurfTreel,SurfTree). inversion(no,Tree,Tree) :— !. 
find_infl(Xmax,InflMax,Infi) :— Xmax = xmax(i(tns(+)),ID,_,_)/_,!, get_head(Xmax,Infl), set_head(head(i(tns(-|-)),empty,_),ID,Xmax,InflMax). findJnfl(X/[L,R],X/[NewL,NewR]Jnfl) : - !, ((find_inn(R,NewR,Infl),L=NewL); (nnd_infl(L,NewL,Infl),R=NewR)). find_infl(X/R,X/NewR,Infl) : - !, find_infl(R,NewR,Infl). set_inversion(ger^,yes). set_inversion(eng,ques,yes). set_inversion(eng,decl,no).  % % gen_pf(SourceTree,TargetTree): after move—alpha has applied to wh—phrase % and verbal heads, traverse the tree and generate the appropriate % surface forms for agreement.  APPENDIX  E. TRANSLATION MODULE  gen_pf(SourceDeep,TargetDeep) :— SourceDeep = xmax(C,ID,F,Cons)/R, member( C ,[n(_) ,c(emb /rel)]),!, target(L), rev_agree(L ,SourceDeep,Target Deep 1), gen_pf1 (Target Deep 1 ,Target Deep). gen_pf( SourceDeep ,TargetDeep) : SourceDeep = xmax(C,ID,F,Cons)/R,!, gen_pf1 (SourceDeep,Target Deep 1), target(L),!, rev_agree(L ,TargetDeep 1 ,Target Deep). gen_pf(SourceDeep,TargetDeep) :— gen_pf 1 (SourceDeep,Target Deep). gen_pfl(X/[L,R],X/[NewL,NewR]) : - !, gen_pf(L,NewL), gen_pf(R,NewR). gen_pfl(X/R,X/NewR) : - !, gen_pf(R,NewR). gen_pfl(X,X) : - !.  %__ % topicalize(Language,Mode,DeepTree,SurfTree): if Mode is Declarative then % find a Topic in DeepTree, and attach it to Comp. (German only).  topicalize(ger,decl,DeepTree,SurfTree) :— !, get_topic(DeepTree,Tree,Topic), attach_topic(Topic,Tree,SurfTree). topicalize(_,_,Tree,Tree) :— !.  % % get_topic(DeepTree,SurfTree,Topic): finds the first constituent under % Infl in DeepTree, then returns it as Topic, and SurfTree has an % ec where the topic was.  get_topic(Treel,Tree2,Topic) :— Treel = X/[Topic,R], x_node(X,i(tns(+)),_),!, leave_ec(X,Topic,EC), Tree2 = X/[EC,R]. get_topic(X/[L,R],X/[NewL,NewR],Topic) : - !, ((get_topic(L,NewL,Topic),R=NewR) ;  126  APPENDIX  E. TRANSLATION MODULE  (get_topic(R,NewR,Topic),L=NewL)). get_topic(X/R,X/NewR,Topic) : - !, get_topic(R,NewR,Topic). 
attach_topic(Topic,X/Rest,Tree) :-
    Tree = X/[Topic,xmax(c(mat/sent))/Rest].

Appendix F

Morphological Analyser

%
% Section: Morphological Analyzer
%
% Paradigm: morph(Language,Word,Category,Root,Features).
%    Language: Source language {english,german}.
%    Word:     The source word to be morphed.
%    Category: Lexical category {n,v,a,p,i,c,d}.
%    Root:     The Word's root, and
%    Features: The syntactic features of the word.
%
% Description: This section is responsible for morphological analysis
%    and dictionary lookup of lexical items. Language and either Word
%    or Root must be specified; Category is optional.
%    The <morph> predicate first checks if the entry is already in
%    the lexicon; if not, it performs a suffix analysis based on the
%    assumption that the item is not irregular. The predicate has
%    been written so that if Word is instantiated, Root+Features is
%    returned, and vice versa.

morph(L,Word,punc,punc(Word),[]) :- member(Word,['.','?','!']),!.
morph(L,Word,Cat,Root,Data) :-
    \+ atom(Word),var(Root),
    !, fail.
morph(L,Word,Cat,Root,Data) :-
    direction(Word,Root,forward),
    dict(L,Cat,Word,F1),
    remove(root(Root),F1,F2),
    get_root_features(L,Cat,Root,RF),
    remove_ftr(RF,NewRF),
    append(F2,NewRF,Data1),
    set_defaults(L,Cat,Data1,Data).
morph(L,Root,Cat,Root,Data) :-
    direction(Root,Root,forward),
    dict(L,Cat,Root,Data1),
    \+ member(root(_),Data1),
    set_defaults(L,Cat,Data1,Data).
morph(L,Word,Cat,Root,Data) :-
    direction(Word,Root,forward),
    suff_table(L,Cat,Suff,End,Sfeat),
    suffanal(Word,Cat,Suff,End,Root),
    get_root_features(L,Cat,Root,RF),
    adjust_ftr(RF,Sfeat,Data1),
    set_defaults(L,Cat,Data1,Data).
morph(L,Word,Cat,Root,Data) :-
    direction(Word,Root,backward),
    dict(L,Cat,Root,Data1),
    \+ member(root(_),Data1),
    member(irr(List),Data1),!,
    member(IrrWord,List),
    dict(L,Cat,IrrWord,Data2),
    set_defaults(L,Cat,Data2,Data).

direction(Word,Root,forward).
direction(Word,Root,backward) :- var(Word),atom(Root).
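A pair of illustrative calls showing the two directions (the expected bindings are indicative of shape rather than exact feature order, and assume the Appendix G entries):

```prolog
% Analysis direction: Word given, Root and Data returned. With the
% Appendix G entry dict(eng,n(_),boy,...) and the suffix table row
% suff_table(eng,n(_),s,'',[num(+)]), one would expect roughly:
%
%   ?- morph(eng,boys,n(_),Root,Data).
%   Root = boy,  with Data including num(+) once defaults are merged.
%
% Generation direction: Root bound and Word unbound, used during
% translation to reconstruct the inflected surface form.
```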
%
% The following feature predicates are used to extract features, and
% construct feature lists for "created" lexical items.

remove_ftr(Ftr,NewFtr) :-
    (remove(ftr(_),Ftr,NewFtr);Ftr=NewFtr),!.

adjust_ftr(RF,Sftr,Data) :-
    (remove(ftr(Rftr),RF,NewRF);Rftr=[],NewRF=RF),!,
    def_merge(Rftr,Sftr,Newftr),
    append([ftr(Newftr)],NewRF,Data),!.

get_root_features(L,Cat,Root,Features) :-
    dict(L,Cat,Root,Features),
    \+ member(root(_),Features).

set_defaults(L,Cat,OldFeatures,NewFeatures) :-
    (remove(ftr(OldFlist),OldFeatures,Ftrs);
     Ftrs = OldFeatures,OldFlist = []),
    defaults(L,Cat,Defaults),!,
    def_merge(Defaults,OldFlist,NewFlist),
    append([ftr(NewFlist)],Ftrs,NewFeatures),!.

def_merge([],Features,Features) :- !.
def_merge([DF|R],Features,[DF|NewFeatures]) :-
    nul_feature(DF,NF),
    \+ member(NF,Features),
    def_merge(R,Features,NewFeatures).
def_merge([_|R],Features,NewFeatures) :-
    def_merge(R,Features,NewFeatures).

nul_feature(F,NulF) :-
    F =.. [P|Args],
    nul_args(Args,NulArgs),
    NulF =.. [P|NulArgs].

nul_args([],[]) :- !.
nul_args([W|X],[Y|Z]) :-
    nul_args(X,Z).

%
% suffanal: Finds possible suffix/root combinations. (Either Word or Root
%    may be instantiated.)

suffanal(Word,Cat,Suff,End,Root) :-
    atom(Word),var(Root),
    process(Word,Wlist),
    process(Suff,Slist),
    match(Wlist,Slist,PRroot),
    reverse(PRroot,Proot),
    name(End,Endlist),
    append(Proot,Endlist,Rootlist),
    name(Root,Rootlist),!.
suffanal(Word,Cat,Suff,End,Root) :-
    var(Word),atom(Root),
    process(Root,Rlist),
    process(End,Endlist),
    match(Rlist,Endlist,BRlist),
    reverse(BRlist,Blist),
    name(Suff,Slist),
    append(Blist,Slist,Wordlist),
    name(Word,Wordlist),!.

%
% Determine the subcategorization frame (arguments) of a given head.

getargs(L,head(C,W,As,RF)) :-
    member(subcat(_),RF),!,
    member(subcat(Args),RF),
    convert_to_list(Args,As).
getargs(L,head(C,W,[],F)) :- !.

convert_to_list([A|As],[A|As]) :- !.
convert_to_list(A,[A]) :- !.
%
% get_head: retrieves the head of the current subtree.
%

get_head(xmax(C,ID,_,_)/R,Head) :-
    get_head1(ID,xmax(C,ID,_,_)/R,Head).

get_head1(ID,head(C,W,F),head(C,W,F)) :- !.
get_head1(ID,xmax(_,IDX,_,_)/R,_) :- \+ ID = IDX,!,fail.
get_head1(ID,X/[L,R],Head) :-
    (get_head1(ID,L,Head);get_head1(ID,R,Head)).
get_head1(ID,X/L,Head) :-
    get_head1(ID,L,Head).

%
% Determine the participle form and tense of the verb, for purposes
% of subcategorization.
%

v_features(L,F,Form/Tns) :- !,
    get_ftr(F,Ftr),
    get_form(Ftr,Form),
    get_tns(Ftr,Tns).

get_ftr(F,Ftr) :- member(ftr(Ftr),F),!.
get_ftr(F,[]) :- \+ member(ftr(Ftr),F),!.

get_form(F,part(Form)) :- member(part(Form),F),!.
get_form(F,nil) :- \+ member(part(Form),F),!.

get_tns(F,tns(-)) :- member(tns(-),F),!.
get_tns(F,tns(-)) :- \+ member(tns(X),F),!.
get_tns(F,tns(+)) :- member(tns(X),F),\+ X = '-',!.

%
% Utility routines used by morpher only.
%

match(Wlist,[],Wlist).
match([L|Wlist],[L|Slist],Root) :- match(Wlist,Slist,Root).

process('',[]).
process(Term,List) :-
    name(Term,List1),
    reverse(List1,List),!.

get_ftr(OldFeat,Feat,Rest) :-
    append(ftr(Feat),Rest,OldFeat),!.
get_ftr(R,[],R) :- !.

Appendix G

The Lexicon

%
% Section: Suffix Table (English)
%
% Paradigm: suff_table(Language,Category,InflEnd,RootEnd,Features).
%   Language: Language of source {eng,ger}.
%   Category: Lexical category {n,v}.
%   InflEnd:  The inflected ending.
%   RootEnd:  The ending of the root form.
%   Features: Features to be added by the inflection.
%

suff_table(eng,n(_),ies,y,[num(+)]).
suff_table(eng,n(_),ves,f,[num(+)]).
suff_table(eng,n(_),ves,fe,[num(+)]).
suff_table(eng,n(_),s,'',[num(+)]).
suff_table(eng,n(_),es,'',[num(+)]).
suff_table(eng,v(_),'','',[tns(pres),per(1)]).
suff_table(eng,v(_),'','',[tns(pres),per(2)]).
suff_table(eng,v(_),'','',[tns(pres),per(3),num(+)]).
suff_table(eng,v(_),ies,y,[num(-),per(3),tns(pres)]).
suff_table(eng,v(_),s,'',[num(-),per(3),tns(pres)]).
suff_table(eng,v(_),ied,y,[tns(past)]).
suff_table(eng,v(_),ed,e,[tns(past)]).
suff_table(eng,v(_),ed,'',[tns(past)]).
suff_table(eng,v(_),ed,e,[part(past),tns(-)]).
suff_table(eng,v(_),ed,'',[part(past),tns(-)]).
suff_table(eng,v(_),en,e,[part(past),tns(-)]).
suff_table(eng,v(_),en,'',[part(past),tns(-)]).
suff_table(eng,v(_),ing,e,[part(pres)]).
suff_table(eng,v(_),ing,'',[part(pres)]).

defaults(eng,n(_),[per(3),num(-),gen(_),wh(-),case(_),proper(-)]).
defaults(eng,d,[num(_,_,_),wh(-)]).
defaults(eng,p(_),[wh(-)]).
defaults(eng,v(_),[tns(-),per(_),num(_),aux(-),trans(+)]).
defaults(eng,i(_),[tns(+),per(_),num(_)]).
defaults(eng,Cat,[]).

%
% Section: Lexicon (English)
%
% Paradigm: dict(L,Cat,Word,[ftr([tns(T),gen(G),num(P),per(N),part(F),wh(M)]),
%             irr([F1,F2,...]),pastpart(F),prespart(F),
%             pl(PF),proper(M),pro(M),root(RF),subcat(Frame)]).
%

%%
%% Nouns
%%

dict(eng,n(_),book,[german(buch)]).
dict(eng,n(_),boy,[pl(boys),german(junge)]).
dict(eng,n(_),boys,[ftr([num(+)]),root(boy)]).
dict(eng,n(_),girl,[german(madchen)]).
dict(eng,n(_),table,[german(tisch)]).
dict(eng,n(_),woman,[pl(women),german(frau)]).
dict(eng,n(_),women,[ftr([num(+)]),root(woman)]).
dict(eng,n(_),i,[ftr([case(nom),per(1)]),proper(+),german(ich)]).
dict(eng,n(_),me,[ftr([case(acc),per(1)]),root(i)]).
dict(eng,n(_),we,[ftr([case(nom),num(+),per(1)]),proper(+),german(wir)]).
dict(eng,n(_),us,[ftr([case(acc),num(+),per(1)]),root(we)]).
dict(eng,n(_),you,[ftr([per(2)]),proper(+),german(sie)]).
dict(eng,n(_),he,[ftr([case(nom)]),proper(+),german(er)]).
dict(eng,n(_),him,[ftr([case(acc)]),root(he)]).
dict(eng,n(_),she,[ftr([case(nom)]),proper(+),german(sie)]).
dict(eng,n(_),her,[ftr([case(acc)]),root(she)]).
dict(eng,n(_),it,[german(es)]).
dict(eng,n(_),what,[ftr([wh(+),proper(+)]),german(was)]).
dict(eng,n(_),which,[ftr([wh(+),proper(+)]),german(das)]).
dict(eng,n(_),who,[ftr([wh(+),proper(+)]),german(wer)]).

%%
%% Verbs
%%

dict(eng,v(_),put,[subcat([n(acc),p(loc)]),irr([putting]),german(legen)]).
dict(eng,v(_),put,[ftr([part(past),tns(-)]),root(put)]).
dict(eng,v(_),put,[ftr([tns(past)]),root(put)]).
dict(eng,v(_),putting,[ftr([part(pres),tns(pres)]),root(put)]).
dict(eng,v(_),see,[subcat([n(acc)]),irr([seen,saw]),pastpart(seen),
    german(sehen)]).
dict(eng,v(_),seen,[ftr([part(past),tns(-)]),root(see)]).
dict(eng,v(_),saw,[ftr([tns(past)]),root(see)]).
dict(eng,v(_),try,[subcat(c(emb/sent)),irr([tried]),
    german(versuchen)]).
dict(eng,v(_),give,[subcat([n(_),p(dir)]),irr([gave])]).
dict(eng,v(_),gave,[ftr([tns(past)]),root(give)]).
dict(eng,v(_),believe,[subcat(i(tns(-))),subcat(c(emb/sent)),
    irr([believed]),german(glauben)]).
dict(eng,v(_),believed,[ftr([part(past)]),root(believe)]).
dict(eng,v(_),seem,[ftr([trans(-)]),subcat(i(tns(-))),subcat(c(emb/sent)),
    theta(-),german(sheinen)]).
dict(eng,v(_),want,[subcat(c(emb/sent)),pastpart(wanted),
    irr([wanted]),german(mogen)]).
dict(eng,v(_),have,[subcat(v(part(past)/tns(-))),aux(+),irr([has]),
    german(haben)]).
dict(eng,v(_),has,[ftr([tns(pres),per(3),num(-)]),root(have)]).
dict(eng,v(_),be,[subcat(v(part(pres)/tns(-))),aux(+),german(sein)]).
dict(eng,v(_),been,[ftr([part(past)]),subcat(v(part(pres)/tns(-))),
    german(sein)]).
dict(eng,v(_),being,[ftr([part(pres)]),subcat(v(part(pres)/tns(-))),
    german(sein)]).
dict(eng,v(_),is,[ftr([tns(pres),per(3),num(-)]),root(be)]).
dict(eng,v(_),am,[ftr([tns(pres),per(1),num(-)]),root(be)]).
dict(eng,v(_),are,[ftr([tns(pres),num(+)]),root(be)]).
dict(eng,v(_),are,[ftr([tns(pres),per(2)]),root(be)]).
dict(eng,v(_),was,[ftr([tns(past),per(3),num(-)]),root(be)]).
dict(eng,v(_),was,[ftr([tns(past),per(1),num(-)]),root(be)]).
dict(eng,v(_),were,[ftr([tns(past),num(+)]),root(be)]).
dict(eng,v(_),were,[ftr([tns(past),per(2)]),root(be)]).
%%
%% Modals
%%

dict(eng,i(tns(-)),to,[ftr([tns(-)]),subcat(v(nil/tns(-))),german(zu)]).
dict(eng,i(tns(+)),will,[ftr([tns(fut)]),subcat(v(nil/tns(-))),german(werden)]).
dict(eng,i(tns(+)),do,[subcat(v(nil/tns(-))),aux(+),german(nil)]).
dict(eng,i(tns(+)),do,[ftr([tns(pres),per(1),per(2)]),root(do)]).
dict(eng,i(tns(+)),do,[ftr([tns(pres),per(3),num(+)]),root(do)]).
dict(eng,i(tns(+)),does,[ftr([tns(pres),per(3),num(-)]),root(do)]).
dict(eng,i(tns(+)),did,[ftr([tns(past),per(_),num(_)]),root(do)]).

%%
%% Prepositions
%%

dict(eng,p(dir),to,[subcat(n(acc)),german([nach])]).
dict(eng,p(loc),on,[subcat(n(acc)),german(auf)]).
dict(eng,p(_),where,[ftr([wh(+)]),german(wo)]).

%%
%% Complementizers
%%

dict(eng,c(emb/_),that,[subcat(i(tns(+))),german(dass)]).
dict(eng,c(emb/_),for,[ftr([tns(-)]),subcat(i(tns(-))),german(denn)]).

%%
%% Determiners
%%

dict(eng,d,the,[german(das)]).
dict(eng,d,a,[ftr([num(-)]),german(ein)]).
dict(eng,d,what,[ftr([wh(+)]),german(welches)]).
dict(eng,d,which,[ftr([wh(+)]),german(welches)]).
dict(eng,d,every,[ftr([num(-)]),german(jede)]).

%
% Section: Suffix Table (German)
%
% Paradigm: suff_table(Language,Category,InflEnd,RootEnd,Features).
%   Language: Language of source {eng,ger}.
%   Category: Lexical category {n,v}.
%   InflEnd:  The inflected ending.
%   RootEnd:  The ending of the root form.
%   Features: Features to be added by the inflection.
%

suff_table(ger,n(_),en,'e',[num(+)]).
suff_table(ger,n(_),en,'',[num(+)]).
suff_table(ger,v(_),en,'',[part(past)]).
suff_table(ger,v(_),'','n',[per(1),num(-),tns(pres)]).
suff_table(ger,v(_),'','',[per(2),tns(pres)]).
suff_table(ger,v(_),'','',[num(+),tns(pres)]).

defaults(ger,n(_),[per(3),num(-),wh(-),gen(_),case(_),proper(-)]).
defaults(ger,d,[wh(-)]).
defaults(ger,p(_),[wh(-)]).
defaults(ger,v(_),[tns(-),per(_),num(_),trans(+)]).
defaults(ger,i(_),[tns(+),per(_),num(_)]).
defaults(ger,Cat,[]).
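The defaults/3 tables are folded into a word's feature list by set_defaults (Appendix F), which relies on def_merge/3: a default feature is added only when no feature with the same functor and arity is already present. A hypothetical query illustrating the merge:

```prolog
?- def_merge([per(3),num(-)], [num(+)], F).
%    F = [per(3),num(+)] : the num(-) default is suppressed
%    because num(+) is already specified.
```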
%
% Section: Lexicon (German)
%
% Paradigm: dict(L,Cat,Word,[ftr([tns(T),num(P,G,C),per(N),part(F),wh(M)]),
%             irr([F1,F2,...]),pastpart(F),prespart(F),
%             pl(PF),proper(M),pro(M),root(RF),subcat(Frame)]).
%

%%
%% Nouns
%%

dict(ger,n(_),ich,[ftr([case(nom),per(1)]),proper(+),irr([mich,mir]),
    english(i)]).
dict(ger,n(_),mich,[ftr([case(acc),per(1)]),root(ich)]).
dict(ger,n(_),mir,[ftr([case(dat),per(1)]),root(ich)]).
dict(ger,n(_),sie,[ftr([case(nom),case(acc),per(2)]),proper(+),irr([ihnen]),
    english(you)]).
dict(ger,n(_),ihnen,[ftr([case(dat),per(2)]),root(sie)]).
dict(ger,n(_),sie,[ftr([case(nom),case(acc)]),proper(+),irr([ihnen]),
    english(she)]).
dict(ger,n(_),ihr,[ftr([case(dat)]),root(sie)]).
dict(ger,n(_),er,[ftr([case(nom)]),proper(+),irr([ihn,ihm]),
    english(he)]).
dict(ger,n(_),ihn,[ftr([case(acc)]),root(er)]).
dict(ger,n(_),ihm,[ftr([case(dat)]),root(er)]).
dict(ger,n(_),wir,[ftr([case(nom),per(1),num(+)]),proper(+),irr([uns]),
    english(we)]).
dict(ger,n(_),uns,[ftr([per(1),num(+),case(acc),case(dat)]),root(wir)]).
dict(ger,n(_),buch,[ftr([gen(n)]),english(book)]).
dict(ger,n(_),junge,[ftr([gen(m)]),english(boy)]).
dict(ger,n(_),madchen,[ftr([gen(n)]),english(girl)]).
dict(ger,n(_),tisch,[ftr([gen(m)]),english(table)]).
dict(ger,n(_),frau,[ftr([gen(f)]),english(woman)]).
dict(ger,n(_),es,[english(es)]).
dict(ger,n(_),was,[ftr([case(nom),case(acc),proper(+),wh(+)]),irr([wem]),english(what)]).
dict(ger,n(_),wem,[ftr([case(dat),proper(+),wh(+)]),root(was)]).
dict(ger,n(_),wer,[ftr([case(nom),proper(+),wh(+)]),irr([wen,wem]),english(who)]).
dict(ger,n(_),wen,[ftr([case(acc),proper(+),wh(+)]),root(wer)]).
dict(ger,n(_),wem,[ftr([case(dat),proper(+),wh(+)]),root(wer)]).
dict(ger,n(_),das,[ftr([wh(+),proper(+),num(-),gen(n),case(nom),case(acc)]),
    irr([das,der,die,den,dem]),english(which)]).
dict(ger,n(_),der,[ftr([wh(+),proper(+),num(-),gen(m),case(nom)]),root(das)]).
dict(ger,n(_),der,[ftr([wh(+),proper(+),num(-),gen(f),case(dat)]),root(das)]).
dict(ger,n(_),die,[ftr([wh(+),proper(+),num(-),gen(f),case(nom),case(acc)]),root(das)]).
dict(ger,n(_),die,[ftr([wh(+),proper(+),num(+),gen(_),case(nom),case(acc)]),root(das)]).
dict(ger,n(_),den,[ftr([wh(+),proper(+),num(-),gen(m),case(acc)]),root(das)]).
dict(ger,n(_),den,[ftr([wh(+),proper(+),num(+),gen(_),case(acc)]),root(das)]).
dict(ger,n(_),dem,[ftr([wh(+),proper(+),num(-),case(dat),gen(m),gen(n)]),root(das)]).

%%
%% Verbs
%%

dict(ger,v(_),legen,[subcat([n(acc),p(loc)]),
    irr([lege,legen,legt,lag,lagen,gelegt]),english(put)]).
dict(ger,v(_),legt,[ftr([per(3),tns(pres)]),root(legen)]).
dict(ger,v(_),legte,[ftr([num(-),tns(past)]),root(legen)]).
dict(ger,v(_),legten,[ftr([num(+),tns(past)]),root(legen)]).
dict(ger,v(_),gelegt,[ftr([part(past)]),root(legen)]).
dict(ger,v(_),sehen,[subcat([n(acc)]),english(see)]).
dict(ger,v(_),sieht,[ftr([per(3),tns(pres)]),root(sehen)]).
dict(ger,v(_),sah,[ftr([num(-),tns(past)]),root(sehen)]).
dict(ger,v(_),sahen,[ftr([num(+),tns(past)]),root(sehen)]).
dict(ger,v(_),gesehen,[ftr([part(past)]),root(sehen)]).
dict(ger,v(_),geben,[subcat([n(acc),n(dat)]),english(give)]).
dict(ger,v(_),gibt,[ftr([per(3),tns(pres)]),root(geben)]).
dict(ger,v(_),gab,[ftr([num(-),tns(past)]),root(geben)]).
dict(ger,v(_),gaben,[ftr([num(+),tns(past)]),root(geben)]).
dict(ger,v(_),gegeben,[ftr([part(past)]),root(geben)]).
dict(ger,v(_),glauben,[subcat(i(tns(-))),subcat(c(emb/sent)),english(believe)]).
dict(ger,v(_),glaubt,[ftr([per(3),tns(pres)]),root(glauben)]).
dict(ger,v(_),glaubte,[ftr([num(-),tns(past)]),root(glauben)]).
dict(ger,v(_),glaubten,[ftr([num(+),tns(past)]),root(glauben)]).
dict(ger,v(_),geglaubt,[ftr([part(past)]),root(glauben)]).
dict(ger,v(_),versuchen,[subcat(c(emb/sent)),english(try)]).
dict(ger,v(_),versucht,[ftr([per(3),tns(pres)]),root(versuchen)]).
dict(ger,v(_),versuchte,[ftr([num(-),tns(past)]),root(versuchen)]).
dict(ger,v(_),versuchten,[ftr([num(+),tns(past)]),root(versuchen)]).
dict(ger,v(_),versucht,[ftr([part(past)]),root(versuchen)]).
dict(ger,v(_),sheinen,[subcat(i(tns(-))),subcat(c(emb/sent)),english(seem)]).
dict(ger,v(_),sheint,[ftr([per(3),tns(pres)]),root(sheinen)]).
dict(ger,v(_),gesheint,[ftr([part(past)]),root(sheinen)]).
dict(ger,v(_),sein,[subcat(v(_)),aux(+),english(be)]).
dict(ger,v(_),ist,[ftr([tns(pres),per(3),num(-)]),root(sein)]).
dict(ger,v(_),bin,[ftr([tns(pres),per(1),num(-)]),root(sein)]).
dict(ger,v(_),sind,[ftr([tns(pres),num(+)]),root(sein)]).
dict(ger,v(_),sind,[ftr([tns(pres),per(2)]),root(sein)]).
dict(ger,v(_),war,[ftr([tns(past),per(3),num(-)]),root(sein)]).
dict(ger,v(_),gewesen,[ftr([tns(past),per(1),num(-)]),root(sein)]).
dict(ger,v(_),haben,[subcat(v(part(past)/tns(-))),aux(+),english(have)]).
dict(ger,v(_),haben,[ftr([tns(pres),per(2),num(_)]),root(haben)]).
dict(ger,v(_),haben,[ftr([tns(pres),per(_),num(+)]),root(haben)]).
dict(ger,v(_),habe,[ftr([tns(pres),per(1),num(-)]),root(haben)]).
dict(ger,v(_),hat,[ftr([per(3),tns(pres),num(-)]),root(haben)]).

%%
%% Modals
%%

dict(ger,i(tns(-)),zu,[ftr([tns(-)]),subcat(v(nil/tns(-))),english(to)]).
dict(ger,i(tns(+)),werden,[subcat(v(nil/tns(-))),english(will)]).
dict(ger,i(tns(+)),werden,[ftr([tns(fut),per(2),num(_)]),root(werden)]).
dict(ger,i(tns(+)),werden,[ftr([tns(fut),per(_),num(+)]),root(werden)]).
dict(ger,i(tns(+)),werde,[ftr([tns(fut),per(1),num(-)]),root(werden)]).
dict(ger,i(tns(+)),wird,[ftr([tns(fut),per(3),num(-)]),root(werden)]).

%%
%% Prepositions
%%

dict(ger,p(dir),zu,[subcat(n(_)),english(to)]).
dict(ger,p(loc),auf,[subcat(n(_)),english(on)]).
dict(ger,p(loc),in,[subcat(n(_)),english(in)]).
dict(ger,p(_),wo,[ftr([wh(+)]),english(where)]).

%%
%% Complementizers
%%

dict(ger,c(emb/_),dass,[subcat(i(_)),english(that)]).
dict(ger,c(emb/_),denn,[ftr([tns(-)]),subcat(i(tns(-))),english(for)]).

%%
%% Determiners
%%

dict(ger,d,das,[irr([der,das,die,den,dem]),english(the)]).
dict(ger,d,das,[ftr([num(-,n,nom),num(-,n,acc)]),root(das)]).
dict(ger,d,der,[ftr([num(-,m,nom),num(-,f,dat)]),root(das)]).
dict(ger,d,die,[ftr([num(-,f,nom),num(-,f,acc),num(+,_,nom),num(+,_,acc)]),
    root(das)]).
dict(ger,d,den,[ftr([num(-,m,acc),num(+,_,acc)]),root(das)]).
dict(ger,d,dem,[ftr([num(-,m,dat),num(-,n,dat)]),root(das)]).
dict(ger,d,ein,[irr([ein,einer,eine,einem,einen]),english(a)]).
dict(ger,d,ein,[ftr([num(-,n,nom),num(-,n,acc),num(-,m,nom)]),root(ein)]).
dict(ger,d,einer,[ftr([num(-,f,dat)]),root(ein)]).
dict(ger,d,eine,[ftr([num(-,f,nom),num(-,f,acc)]),root(ein)]).
dict(ger,d,einen,[ftr([num(-,m,acc),num(+,_,acc)]),root(ein)]).
dict(ger,d,einem,[ftr([num(-,m,dat),num(-,n,dat)]),root(ein)]).
dict(ger,d,welches,[ftr([wh(+),num(_,_,_)]),english(which)]).
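Note that each root entry carries a german/1 or english/1 feature naming its counterpart in the other lexicon. Lexical transfer can therefore be sketched as a pair of dictionary lookups; the predicate name translate_root below is hypothetical and stands in for the system's actual transfer component:

```prolog
% Hypothetical sketch: map an English root to its German root
% via the german/1 feature, e.g. book -> buch, see -> sehen.
translate_root(eng, Cat, Root, GerRoot) :-
    dict(eng, Cat, Root, Features),
    member(german(GerRoot), Features).
```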

