Open Collections

UBC Theses and Dissertations

Towards an explanatory division of competence and performance : a language-independent parsing scheme Alphonce, Carl G. 1992

Full Text

TOWARDS AN EXPLANATORY DIVISION OF COMPETENCE AND PERFORMANCE: A LANGUAGE-INDEPENDENT PARSING SCHEME

By Carl G. Alphonce
B.Sc., McGill University, 1988
B.A., McGill University, 1990

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, DEPARTMENT OF COMPUTER SCIENCE

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
October 1992
© Carl G. Alphonce, 1992

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Computer Science
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada
V6T 1Z1

Abstract

This dissertation defends in some small measure the thesis that there is a universal parsing model for natural languages. Such a model will apply, without change, cross-linguistically.

The defense of this thesis proceeds by finding solutions to some apparently insurmountable problems which arise from the interaction of a set of basic and seemingly uncontroversial assumptions concerning both the linguistic framework and the computational one.
Some of the difficulties associated with parsing overtly and covertly derived unbounded A-bar dependencies, as instantiated in English and Chinese in particular, are explored; solutions which are psycholinguistically plausible are presented.

In further defense, it is claimed that there are certain linguistic phenomena, such as Relativized Minimality and some curious directional asymmetries in the movement rule move-α, which are better analyzed as artifacts of performance constraints rather than competence constraints; by removing some of the burden from the competence theory (the syntactic theory, in this case) and placing it on the performance theory (the parsing model), a more perspicuous model of human language-processing emerges. By delimiting more strictly the division between competence and performance in this manner, the adoption of a universal parsing scheme is made possible. This cross-linguistically applicable model makes strong computational and linguistic predictions, which are also explored.

Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgements
1 Introduction
  1.1 Thesis
  1.2 Motivation
  1.3 Outline
2 Linguistic Framework
  2.1 A Modular Theory
  2.2 A Multistratal Theory
  2.3 X-bar Theory
  2.4 Basic Definitions
    2.4.1 Graphs and Trees
    2.4.2 Structural Relations
  2.5 Affect-α and Chains
    2.5.1 Substitution
    2.5.2 Adjunction
    2.5.3 Chains
  2.6 A-movement
    2.6.1 θ Theory
    2.6.2 Case Theory
    2.6.3 Further Examples of A-movement
  2.7 Extended Projection Principle
  2.8 pro and PRO
  2.9 Predication
  2.10 A-bar-movement
    2.10.1 Wh-movement
    2.10.2 Operator Scope
    2.10.3 Selectional Restrictions
    2.10.4 Relative Clauses
  2.11 Government
  2.12 Relativized Minimality
  2.13 The Empty Category Principle
  2.14 Minimal Linguistic Assumptions
3 Computational Framework
  3.1 A General Model
    3.1.1 The Marcus Parser
    3.1.2 Licensing Parsers
    3.1.3 Chunk Parsing
    3.1.4 Other Details
    3.1.5 Summary
  3.2 Minimal Computational Assumptions
  3.3 Psycholinguistic Evidence
    3.3.1 Filler-Driven Parsing
    3.3.2 General Parsing Principles
4 Problems and Solutions
  4.1 Relativized Minimality Effects
  4.2 Brief Overview of the Parser's Operation
  4.3 A-type Movements
  4.4 Wh-movement
    4.4.1 S-Structure Wh-movement
    4.4.2 LF Wh-movement
  4.5 Revised Parsing Model
  4.6 Empty Operator Constructions
5 Implications
6 Related Work
  6.1 Three Approaches
    6.1.1 Language-Specific Parsing
    6.1.2 Parameterized Parsing
    6.1.3 Common Parsing Scheme
  6.2 Other Research
    6.2.1 Long-Distance A-bar-dependencies at S-Structure
    6.2.2 A Grammar Tester
7 Future Work
8 Conclusion
Bibliography
A The Parser
  A.1 Implementation
  A.2 Examples

List of Tables

2.1 The lexical categories
4.2 Movement-type diagnostics

List of Figures

2.1 A multistratal theory of grammar: the levels of GB theory
2.2 Generic X-bar projection for English
2.3 The structure of VP
2.4 A sample tree
2.5 XP adjoined to YP on the left
2.6 XP adjoined to YP on the right
2.7 An example of a chain
2.8 Unrelated chains are allowed
2.9 Nested chains of the same type are disallowed
2.10 Intersecting chains of the same type are disallowed
2.11 Nested chains are permitted in some configurations
3.12 Skeletal X-bar projections created by the chunker: chased = [+past] + chase
3.13 The mechanisms of relative clauses
4.14 The buffer
4.15 The stack
4.16 Possible attachment sites
4.17 English LF representation
4.18 Proposed Chinese LF representation
5.19 Possible CP projections

Acknowledgements

As everyone claims, there are far too many people to be thanked for their (perhaps unwitting) contributions to this thesis.
As is the custom, I mention those I remember at the moment I write this; I would humbly ask those people whom I have neglected to mention to forgive my utterly unintentional oversight.

First and foremost, I thank my thesis supervisors, David Poole and Henry Davis, who both forced me to read, think, and write. Without their help and persistence, none of what follows would have been written. I also thank the Department of Computer Science for generous financial support.

I thank David Leblanc for introducing me to Henry, for many a fruitful discussion, and for never being afraid to ask hard questions. For great help with Chinese, I thank Yan Feng Qu. For support, encouragement, and beer, I owe a debt of gratitude to Andrew Csinger and Mike Horsch. For support, encouragement, and coffee, I owe a debt of gratitude to Christopher Romanzin. He provided me with much LaTeX support and thesis encouragement. Thanks also to Scott, Rob, Yggy, Stan, DvanB, Johnny, and the rest of the 312 crowd.

I especially wish to thank my parents. [In Swedish in the original:] Thank you for the opportunity to suffer (I am joking) through the writing of a thesis, and for perhaps unknowingly having led me into the academic world.

Finally, and to me most importantly, my unending thanks to Averill, who has put up with me and my rantings and ravings; who, through it all, has stood by me, pushed and prodded me on, reminding me of why I am doing this; and who, despite all this, has committed herself to a lifetime of more of the same.

Chapter 1

Introduction

1.1 Thesis

This dissertation investigates a psycholinguistically motivated model of (human) linguistic processing, along with the computational and linguistic implications of such a model. The thesis I aim to defend is that there is one parsing mechanism which is sufficient to handle all human languages.
A proof of this is clearly an unreasonable undertaking; my efforts will therefore be directed towards showing that selected challenges to this hypothesis do not in fact constitute refutations of it.

In order to even begin making claims of universality of the parsing mechanism, the underlying linguistic theory must be cross-linguistically applicable. This has been a driving force behind the development of Government-Binding (GB) theory. GB theory is a constrained theory which makes strong predictions, addresses questions of acquisition, and attains a high level of empirical coverage. It is thus on this theory that the parser is based. A significant obstacle to overcome when basing a parser on a multistratal theory such as GB is being able to recover the appropriate representations at each level of the grammar. This difficulty is made more acute when additional constraints, such as the requirement that the parser be psycholinguistically plausible, are introduced.

It might be argued that the problems of dealing with multiple levels of representation could be easily dispensed with by basing the parser on a monostratal linguistic theory, such as Generalized Phrase Structure Grammar (GPSG) [Gazdar et al. 85]. Lexical Functional Grammar (LFG) [Bresnan 82], although a bistratal theory, might also be thought a better framework within which to work, since LFG has been developed with computational work in mind. However, neither of these theoretical frameworks can account for the same breadth of linguistic phenomena, in as great a variety of languages, in as constrained (and thus explanatory) a manner, as can GB theory. Furthermore, I show that by using a more suitable parsing model, the multiple levels of GB theory cease to be problematic.

I also aim to achieve an equitable division of labour between competence and performance.
I assume that all parameterization to account for cross-linguistic variation is limited to the competence module, and that the performance module is fixed. This is not an uncontroversial assumption. Mazuka & Lust [Mazuka et al. 90] claim, for instance, that in order for a "universal" parsing mechanism to be maintained, the parser must be parameterized. By allowing the parsing mechanism to be parameterized, they maintain that there is a family of parsers for natural language. Although I do not directly address the claim of Mazuka & Lust that this parameterization is necessary due to differences in the branching direction of languages (they argue that English must be parsed by a primarily top-down mechanism, while Japanese must be parsed using a bottom-up process), I indirectly show that this is not the case, as my parser handles English using a primarily bottom-up approach.

Lee, Chien, Lin, Huang, and Chen [Lee et al. 91] describe a parsing mechanism conceived of specifically to handle the syntactic peculiarities of Chinese as compared to English. They imply that Chinese must be handled with a mechanism different from that of English. I show that this is not the necessary conclusion. The parsing mechanism I develop in this thesis handles complex constructions in both English and Chinese.

I am not primarily concerned with achieving extensive linguistic coverage. The main focus is instead on finding ways to deal with theoretically problematic constructions. Because of this, only relevant constructions are dealt with in the main text of the thesis; a greater range of examples is shown in appendix A.2. It is clear that greater linguistic coverage can easily be achieved once the difficult cases are solved.

1.2 Motivation

The field of Artificial Intelligence (AI) encompasses many different activities; I take one of the most important of these activities to be the gaining of a deeper understanding of human intelligence.
Constructing systems to carry out tasks which humans do may be useful, but such systems are of real interest to the field of AI only insofar as they serve as models against which theories of human cognition can be tested.

Rensink & Provan [Rensink et al. 91] highlight the relevance of the "naturally intelligent" system in the pursuit of AI:

    Consider a cat in its natural environment. If it is to catch prey and escape from predators, the cat must not only be able to process visual information, but must also do so in real time. Its visual system is therefore best explained not only in terms of limitations on the information available to the eye, but also in terms of limitations on other resources, such as time and space. There is an increasing awareness — especially within the more computational sub-disciplines of cognitive science — that these more general resource limitations influence many kinds of perceptual and cognitive processes. [pg. 311]

Regardless of whether or not one subscribes to the hypothesis that only humans have the (cognitive) facilities to support a highly structured and productive communication system, it is clear that humans do possess such an ability. In constructing artificial systems to process language, heed should be paid to the manner in which humans parse and understand it.

A measure of how closely a model matches reality is needed. In this case, we need to ensure that our model is psycholinguistically plausible, that is, consistent with psycholinguistic evidence pertaining to how people process language. Chapter 3 covers a selection of psycholinguistic results which place constraints on the parsing model.

On a more concrete level, part of the motivation for this study is to begin formulating a theory of performance, one which will account for certain linguistic phenomena on its own.
By removing the burden of explanation from the competence theory, the competence theory can be simplified; in conjunction, these two simple and orthogonal components function as a more precise and accurate theory.

It should be noted that a theoretically motivated study is not without some practical fallout. A modular and universal parsing scheme will allow construction of natural language applications with a minimum of effort. Once the amount of language-specific code which must be written is minimized, creating multi-lingual programs is a relatively straightforward task. The syntactic aspects of machine translation should also be simplified if a universal parsing scheme is implemented.

1.3 Outline

The thesis proceeds as follows. The next chapter presents an overview of the linguistic theory in which this work is carried out. It is a short summary of those aspects of Government-Binding theory relevant to the subsequent discussion. Chapter 3 sets up the basic computational framework, describing briefly the parsing models from which I have drawn inspiration, and outlining some relevant psycholinguistic results. Chapter 4 attempts to merge the assumptions of chapters 2 and 3, showing where they do not mesh smoothly, and offering solutions in the form of revisions to either the linguistic or the computational model. Chapter 5 discusses the computational and linguistic implications of this work, while chapter 6 explores some related work. Chapter 7 outlines issues left for future work, and chapter 8 finally concludes the thesis with a summary.

Chapter 2

Linguistic Framework

The role of this chapter is to present a brief overview of the syntactic theory (Government-Binding (GB) theory) underlying this thesis.
It is not my intent to give a thorough account of all aspects of GB theory;[1] instead, the focus will be to provide coverage of those areas which are necessary background to understand the remainder of the thesis.

2.1 A Modular Theory

Any theory of natural language syntax must seek to account for the similarities and differences found amongst the languages of the world. Towards that end, GB theory has evolved into a highly modular theory. By isolating various components of the theory from each other, factoring the theory into the lowest or simplest terms, each module becomes simpler, and can hope to yield more accurate predictions. The relevant modules are:

• X-bar Theory
• Case Theory
• θ Theory
• Government Theory
• Control Theory

[1] For readers who are interested in exploring GB theory in more detail, there are several good general references, among them [van Riemsdijk et al. 86, Lasnik et al. 88, Haegeman 91]. For more ambitious readers, [Chomsky et al. 92] provides a very good overview of the current status of orthodox GB theory, though it is much more technical and dense than the other references cited.

Each module has associated with it several parameters which are set differently for different languages. It is hypothesized that the basic building blocks of human language can be captured in the modules, with slight parametric differences accounting for cross-linguistic variation. As of yet no definitive list of parameters has been forthcoming; some parameters will be mentioned in passing in what follows, to give the reader some feel for how they are meant to function in the general case.

2.2 A Multistratal Theory

Consider the two sentences shown below.

(1) The dog bit the cat.
(2) The cat was bitten by the dog.

The two sentences convey the same information, but with a slightly different emphasis. In the first sentence "the dog" is more prominent, whereas in the second one "the cat" is.
We say that "the dog" is the subject and "the cat" is the (direct) object in (1), while in (2) "the cat" is the subject and "the dog" is the object of the preposition "by". Note that even though the grammatical functions of "the dog" and "the cat" are different in the two sentences, "the dog" is doing the biting and it is "the cat" that is being bitten in both. We say that "the dog" is playing the role of the agent and "the cat" the role of the patient. Agent and patient are referred to as thematic roles (or θ-roles). This provides some semantic motivation for thinking of these two sentences as being syntactically related in some manner.

Sentences (1) and (2) are also related morphologically. Active-passive pairs exhibit a predictable morphological variation: "bit" versus "was bitten" in this case. This passive morphology is taken as a further indication that the two could profitably be viewed as being syntactically related in some fashion. This requirement of morphological relatedness excludes the possibility of analysing the following pair of sentences as being syntactically related:

(3) John sent the book to Mary.
(4) Mary received the book from John.

Because the sentences in (1) and (2) convey basically the same information and are morphologically related in a predictable fashion, GB theory holds that they are syntactically related in the following way: the two should have a common underlying representation, with different surface realizations encoding the difference in attention. Thus were born the levels D-Structure and S-Structure.[2] D-Structure is the level of representation at which thematic relations are encoded, while S-Structure can be thought of as the interface between two other levels, Phonological Form (PF) and Logical Form (LF).

In current (mainstream) GB theory, there are four levels in the grammar.[3] D-Structure and S-Structure have survived, while PF and LF are newer additions.
PF is meant to deal with post-lexical phonology, and is viewed as the interface to the motor-perceptual system. This level of the grammar will not be discussed further in this thesis. LF is the interface to semantics. It is at this level that things like quantifier scope are represented. This thesis deals with various problems which LF poses for parsing. Figure 2.1 shows the relationships amongst these levels and the lexicon.

[2] These levels were originally called Deep Structure and Surface Structure. These labels were abandoned in favour of the abbreviated forms now in use because Deep Structure was perceived as more of a level of semantics than as a syntactic level within the model of grammar.

[3] Though in recent work, Chomsky [Chomsky 92] abandons the level of D-Structure.

[Figure 2.1: A multistratal theory of grammar — the levels of GB theory. The Lexicon feeds D-Structure, which maps to S-Structure, which in turn branches to PF and LF.]

The lexicon contains entries for every lexical item. Among other things, an item's lexical entry contains its category. Many items also include subcategorization frames. A verb's subcategorization frame provides information about the types of complements it selects,[4] along with information about any Case or θ-marking which might be assigned to these complements. Prepositions also have subcategorization frames, but most other categories do not.

2.3 X-bar Theory

X-bar (pronounced "x bar") theory is a highly generalized theory of the structure of phrases. As formulated in [Chomsky 86a], the phrasal categories are divided into two different types. The lexical categories are based on the features [±N, ±V], as shown in Table 2.1.[5] The common abbreviations for these categories are A, V, N, and P. Note that this

[4] There are two types of selection, s-selection and c-selection [Chomsky 86b]. S-selection (or semantic selection) refers to the semantic type of the complements that the lexical item takes.
C-selection (or categorial selection) refers to the grammatical category of a lexical item's complements.

[5] English only has prepositions. Postpositions serve the same purpose as prepositions; they just occur after the phrase with which they are associated, rather than before it. To make for easier reading, I will henceforth refer to the class of pre- and post-positions simply as prepositions.

Table 2.1: The lexical categories

        +N          −N
  +V    adjective   verb
  −V    noun        pre-/post-positions

classification only deals with the major lexical categories.

The non-lexical[6] categories consist of inflection (abbreviated I), complementizer (abbreviated C), and determiner (abbreviated D). Inflection includes such "items" as tense and agreement; Chinese, which does not express tense as English does, encodes aspect in this position. The complementizer is meant for clausal complementizers, such as "that":

(5) I believe that John is lying.

[6] A better term for these categories might be functional, though I will use standard terminology throughout.

Each of these categories can head a projection, according to the following X-bar schema [Chomsky 86a]:[7]

X″ → Y* X′ , Y = specifier
X′ → X Z* , Z = complement
X″ → X″ W , W = adjunct

[7] "*" is the Kleene star, indicating that zero or more occurrences of the category are permitted.

Let us first take care of some terminology. X is known as the head of the projection, and is also written X°. X′ is known as the single bar-level projection (of X). X″ is called the maximal projection (of X); XP and X^MAX are other common notational variants of X″. X^n is the n-th bar-level projection of X, where n ∈ N.

[Figure 2.2: Generic X-bar projection for English: XP dominates the Specifier and X′; X′ dominates the head X and its Complement.]

Note that this schema is not meant to directly encode any linear constraints on the order of elements, only hierarchical ones.[8]
The linear order is determined by various parameter settings, which specify, for example, whether specifiers are projection-initial or projection-final, and whether heads are initial or final. Kayne [Kayne 84] argues for a binary-branching structure, which will allow for at most one specifier and at most one complement per projection. Figure 2.2 shows the prototypical X-bar projection in English. Note that English has projection-initial specifiers, and that heads precede their complements.

[8] By this I mean that there is no ordering of elements imposed on the right-hand side of each individual rule in the schema. This schema does encode linear constraints indirectly, because crossing branches are not allowed. For example, the specifier could never occur between the head and its complement.

X-bar constraints are traditionally satisfied at D-Structure, but need not be satisfied at all other levels of representation.[9]

[9] In recent work [Chomsky 92], X-bar constraints are satisfied at all levels of the grammar.

In recent literature, there have been proposals concerning the structure of VP. Koopman & Sportiche, Kuroda, and Speas [Koopman et al. 91, Kuroda 88, Speas 90] argue that the "subject" should be base-generated within the VP. I adopt the Kuroda/Speas version, as shown in figure 2.3, in which the "subject" is generated in the specifier of VP position.

For reasons of simplicity, the implementation maintains a uniform structure for all projections, regardless of the category of the head. Using this structure for the VP also, the parser cannot handle multiple-complement structures (as there is only one position available for complements).
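The uniform binary-branching projection just described can be sketched in a few lines. This is my own illustration in Python, not the thesis's implementation: one specifier slot, one complement slot, and the English parameter setting (specifier-initial, head before complement).

```python
# Illustrative sketch only (not the thesis's parser): a binary-branching X-bar
# projection as in figure 2.2, with at most one specifier and one complement.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Projection:
    category: str                             # head category, e.g. "V", "N", "I", "C"
    specifier: Optional["Projection"] = None
    complement: Optional["Projection"] = None

    def label(self) -> str:
        return self.category + "P"            # maximal projection label, e.g. "VP"

    def linearize(self) -> str:
        """English setting: specifier-initial, head precedes its complement."""
        parts = []
        if self.specifier:
            parts.append(self.specifier.linearize())
        parts.append(self.category)           # the head X
        if self.complement:
            parts.append(self.complement.linearize())
        return "[" + self.label() + " " + " ".join(parts) + "]"

# VP with a VP-internal subject in its specifier, as in figure 2.3:
vp = Projection("V",
                specifier=Projection("N"),    # the "subject"
                complement=Projection("N"))   # the object
print(vp.linearize())  # [VP [NP N] V [NP N]]
```

Because there is only one `complement` slot, a multiple-complement structure cannot be represented, which is exactly the limitation noted above.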
Though this is clearly undesirable for a general-purpose parser, this limitation is of no consequence for this thesis.

[Figure 2.3: The structure of VP. The "subject" occupies the specifier of VP; V′ dominates V and its complement.]

2.4 Basic Definitions

The purpose of this section is to provide a definition of a tree, and to define the basic structural relations between positions in a tree which GB theory makes use of.

2.4.1 Graphs and Trees

The definitions in this section are taken in large part from [Johnsonbaugh 84].

Definition 2.1: A graph G consists of a set N of nodes and a set E of edges such that each edge e ∈ E is associated with an unordered pair of nodes.

If the edge e is associated with the nodes m and n, then we write e = (m, n).

Definition 2.2: Two edges are parallel if they are associated with the same pair of nodes.

Definition 2.3: An edge associated with a pair of nodes m and n, where m = n, is called a loop.

Definition 2.4: A simple graph is a graph with neither loops nor parallel edges.

Definition 2.5: A path is a sequence of edges {(n_0, n_1), (n_1, n_2), ..., (n_{k-1}, n_k)} in which the edges are distinct.

The path shown in the above definition is abbreviated (n_0, n_1, ..., n_k).

Definition 2.6: A tree is a simple graph in which (i) there is a unique path between each pair of nodes, and (ii) there is a distinguished node called the root of the tree.

Let T be a rooted tree with root n_0. Suppose that (n_0, n_1, ..., n_k) is a path in T. Then we say that (i) n_{k-1} is the mother of n_k; (ii) n_0, ..., n_{k-1} are ancestors of n_k; and (iii) n_k is a daughter of n_{k-1}.

2.4.2 Structural Relations

We are now in a position to define some of the basic structural relations used in GB theory. These structural relations are defined over a single tree.

Definition 2.7: α immediately dominates β if α is the mother of β.

For example, in figure 2.4, I′ immediately dominates both I and VP.
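Immediate dominance, together with the dominance, c-command, and m-command relations defined in the remainder of this section (Definitions 2.9-2.11 below), can be made concrete with a short sketch. This is my own illustration over the tree of figure 2.4, not code from the thesis; node names are abbreviated.

```python
# Illustrative sketch (not the thesis's implementation): the tree of figure 2.4
# as a child -> mother map. "dominates" is the transitive closure of
# "immediately dominates"; c-command and m-command follow Defs. 2.10 and 2.11.

mother = {
    "CP Spec": "CP", "C'": "CP",
    "C": "C'", "IP": "C'",
    "IP Spec": "IP", "I'": "IP",
    "I": "I'", "VP": "I'",
    "VP Spec": "VP", "V'": "VP",
    "V": "V'", "VP Comp": "V'",
}
maximal = {"CP", "IP", "VP"}          # the maximal projections of this tree

def ancestors(a):
    m = mother.get(a)                 # the root (CP) has no mother
    return [] if m is None else [m] + ancestors(m)

def dominates(a, b):
    return a in ancestors(b)

def c_commands(a, b):
    # a does not dominate b, and every node dominating a dominates b
    return not dominates(a, b) and all(dominates(g, b) for g in ancestors(a))

def m_commands(a, b):
    # as c-command, but only maximal projections dominating a are considered
    return not dominates(a, b) and all(
        dominates(g, b) for g in ancestors(a) if g in maximal)

# I c-commands VP (and vice versa); I m-commands I' and IP Spec in addition.
print(c_commands("I", "VP"), m_commands("I", "IP Spec"))  # True True
```

Running the checks reproduces the worked examples in the text: I c-commands VP, VP Specifier, V′, V, and VP Complement, but only m-command extends I's reach to I′ and IP Specifier.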
We can also define the notion of sisterhood, which will come to play an important role in θ theory, to be discussed below.

Definition 2.8: α and β are sisters (or are in a sisterhood relation) if α and β are immediately dominated by the same node.

[Figure 2.4: A sample tree. CP dominates CP Specifier and C′; C′ dominates C and IP; IP dominates IP Specifier and I′; I′ dominates I and VP; VP dominates VP Specifier and V′; V′ dominates V and VP Complement.]

For example, I and VP are sisters in figure 2.4, as are IP Specifier and I′.

The relation dominates is the transitive closure of the relation immediately dominates.

Definition 2.9: α dominates β if (i) α immediately dominates β, or (ii) α immediately dominates γ, and γ dominates β.

In figure 2.4, C′ dominates C and IP, since it immediately dominates them. C′ also dominates everything that either C or IP dominates (namely IP Specifier, I′, I, VP, VP Specifier, V′, V, and VP Complement).

Another very central notion is that of c-command.

Definition 2.10: α c-commands β iff (i) α does not dominate β, and (ii) every γ that dominates α dominates β.

Consulting figure 2.4 again, it is easy to verify that I c-commands VP (and vice versa). I also c-commands VP Specifier, V′, V, and VP Complement. I′, however, only c-commands IP Specifier; I′ does not c-command anything which it also dominates. Finally, note that any node (except the root node of the tree) will c-command itself under this definition.

A closely related notion is that of m-command:

Definition 2.11: α m-commands β iff (i) α does not dominate β, and (ii) every γ^MAX that dominates α dominates β.

In figure 2.4, consider which nodes I m-commands; I m-commands everything that I c-commands, as well as m-commanding I′ and IP Specifier.

2.5 Affect-α and Chains

The theory provides one mechanism for transforming one level of representation into another. This mechanism is affect-α.
Crudely, affect-α allows anything to be done, as long as no constraints are violated; some of these well-formedness constraints are explored in the next section. In fact, affect-α is a cover term for a number of different processes, the most well-known of which is movement. The movement rule is known as move-α; move-α applies only to heads and maximal projections. Under the guise of move-α lurk two distinct operations, substitution and adjunction.

2.5.1 Substitution

Substitution takes place when an element is moved into a position generated empty at D-Structure. The following examples each exhibit instances of substitution. In each set, the sentence as we would hear it is shown first, while (a) shows the S-Structure representation, and (b) shows the D-Structure representation from which the S-Structure representation is derived. "e" denotes the empty D-Structure position into which the phrase will move at S-Structure. "t" denotes a trace. A trace is what is left in place of the phrase being moved: a phonetically empty category which shares the same features as the moved phrase.

(6) John was tricked.
    a. [ John was tricked t ]              (S-Structure)
    b. [ e was tricked John ]              (D-Structure)

(7) John seems to be sleeping.
    a. [ John seems [ t to be sleeping ] ] (S-Structure)
    b. [ e seems [ John to be sleeping ] ] (D-Structure)

(8) What did Mary give John?
    a. [ what did Mary give John t ]       (S-Structure)
    b. [ e Mary gave John what ]           (D-Structure)

Various constraints act to limit the applicability of substitution. Some of the effects of these constraints are that only heads may move to head positions, and only maximal projections may move to specifier positions.
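The substitution examples above can be mimicked with a small sketch of move-α over a flat list of words. This is my own illustration, not the thesis's parser (and it ignores morphology such as do-support in (8)): "e" marks the empty landing site generated at D-Structure, and "t" the trace left behind.

```python
# Illustrative sketch (not the thesis's parser): move-alpha as substitution over
# a flat D-Structure, as in examples (6)-(8).

def substitute(d_structure, phrase):
    s = list(d_structure)
    landing = s.index("e")     # empty position generated at D-Structure
    origin = s.index(phrase)   # D-Structure position of the moved phrase
    s[landing] = phrase        # the phrase moves into the empty position...
    s[origin] = "t"            # ...leaving behind a trace with its features
    chain = (phrase, "t")      # moved phrase plus its trace(s) form a chain
    return s, chain

# Example (6): [ e was tricked John ] -> [ John was tricked t ]
s_structure, chain = substitute(["e", "was", "tricked", "John"], "John")
print(s_structure)  # ['John', 'was', 'tricked', 't']
```

The pair returned as `chain` anticipates section 2.5.3: the moved element heads the chain and its trace is the foot.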
2.5.2 Adjunction

A second manner of incorporating a moved constituent into a tree is called adjunction. Adjunction applies only to maximal projections; a maximal projection XP can be adjoined to another maximal projection YP either on the left of YP or on the right of YP. The adjunction operation makes XP and YP sisters; their mother is a YP. Figures 2.5 and 2.6 show typical cases of adjunction on the left and right, respectively.

[Figure 2.5: XP adjoined to YP on the left: the new YP dominates XP and the original YP.]

[Figure 2.6: XP adjoined to YP on the right: the new YP dominates the original YP and XP.]

2.5.3 Chains

The sequence of traces left by an application of move-α forms a chain of traces. Each adjacent pair of traces in a chain is called a link of the chain. Links in a chain are subject to locality conditions, defined by the type of the chain. The type of a chain is defined by characteristics of the moved element and the location from which it moved. The chain formed from the movement shown in figure 2.7 is C = (XP, t''', t'', t', t). XP is called the head of the chain, t the foot of the chain. A single unmoved element α forms a chain of one element: (α).

[Figure 2.7: An example of a chain: XP at the head, with traces t''', t'', t', t at successively lower positions.]

What different types of maximal projections are there? According to Rizzi and Cinque [Rizzi 90, Cinque 90], moved elements are distinguished by their degree of referentiality. Their work will be discussed in more detail in section 2.12; it is sufficient to note here that two types of elements are distinguished: those which are highly referential according to some metric, and those which are not.

There are two types of positions from which a maximal projection may be extracted: A- and A-bar-positions. A-positions are characterized as those in which θ-roles are potentially assigned. There is disagreement as to the A/A-bar status of certain positions. In recent work [Chomsky et al.
92, Chomsky 92, Mahajian 90] an attempt to replace this notion by a more well-defined one, L-relatedness, has been made. For the purposes of this thesis an informal, if stipulatory, characterization of the A/A-bar status of positions will suffice. I will consider as A-positions the VP-internal positions (the VP-internal subject position, and the object position) as well as the specifier of IP position. Any other positions I will consider to be A-bar-positions.

Heads (X°s) are not distinguished along the same lines as maximal projections. Movement of heads is strictly local in nature. Since head movement is of little interest as far as this thesis is concerned, it will not be discussed in any detail.

2.6 A-movement

Move-α is constrained by various modules of the grammar. These constraints serve both to limit move-α's applicability in certain situations, and to force it in others. A-movement is the name given to applications of move-α which are constrained by θ Theory and Case Theory. An A-chain is a chain which originates in an A-position.[10]

[10] Another characterization of an A-chain, not involving the notion of A-position, is one whose foot is in a θ-marked position and whose head is in a Case-marked position [Chomsky 86b].

2.6.1 θ Theory

Sentences (1) and (2) from this chapter, repeated here for convenience, were used to introduce the notion of θ-role.

(9) The dog bit the cat.
(10) The cat was bitten by the dog.

GB theory has one condition to do with θ-roles, the θ-criterion:

θ-CRITERION
(i) Every θ-role which a verb assigns must be assigned to an A-chain;
(ii) Every A-chain must be assigned exactly one θ-role.

Recalling that even a single unmoved element constitutes a chain, it is clear that in the following examples the (a) sentences are grammatical, while the (b) sentences exhibit θ-criterion violations:[11]

(11) a. Carmilla hit the ball.
     b. * Carmilla hit.

(12) a. Carmilla gave Franklin the book.
     b.
* Carmilla kissed Franklin the book.In the first pair, the (b) sentence violates the first clause of the 0-criterion, while in thelatter, the second clause is not met.2.6.2 Case TheoryEnglish does not exhibit much overt case marking, but what little there is may be seenby considering pronouns.(13) She gave her the book.(14) He gave his mother the book."She" is marked morphologically with the nominative case, while "her" is exhibits theaccusative case. In the second sentence, "he" has nominative case (just like "she"), and"his" shows genitive case marking.Cross-linguistically, languages show great variation in the amount of overt casemarking which is present. GB theory assumes that all languages have case marking,although it may not always be morphologically realized. Case assignment obtains onlyin very specific situations (the assigner and assignee must be in a sisterhood relation;Case-assignment is also directional), and can therefore instead impose a relatively fixedword order in the language. This is the case in English; for instance, in the sentence,11 The examples given do not exhibit solely 0-criterion violations — categorial selection, 0-role assign-ment, and Case marking often go hand in hand; hence the sentences given also violate the Case Filterand selectional (categorial) restrictions.Chapter 2. Linguistic Framework^ 20(15) Mary gave Jane the is assumed that "Mary" is assigned nominative case and "Jane" accusative case, justlike "she" and "her" above, yet there is no overt marking to indicate this. This casemarking, since it is not always morphologically realized, is referred to as abstract Case(written with a capital C).Case can be assigned in a number of ways. Verbs assign Case to their complements,as do prepositions. 
Case can also be assigned to the specifier of IP position; it is assumedthat tense features 12 in the I node assign Case to this position.As important as the positions to which Case is assigned are the positions to whichCase is not assigned. The VP-internal subject position is not assigned Case by the verb(because this position is, strictly speaking, not a complement of the verb). Since tensefeatures are assumed to assign Case to the specifier of IP position, it follows that ininfinitival (tenseless) clauses there will be no Case assignment to this position. Passivemorphology also is taken to "absorb" Case marking to one of the verb's complements (aswell as any "external" 0-marking — see below). The constraint on Case marking can beformulated as follows:"CASE FILTEREvery A-chain must be headed by a Case marked position.'This is not the current view of how nominative Case assignment is achieved; other mechanisms havebeen proposed, mostly for theory-internal reasons. If not entirely correct, using tense features is morethan precise enough for my purposes.'There is an exception to this, namely the element PRO (to be discussed in section 2.8). Crucially,PRO must not be assigned Case marking.Chapter 2. Linguistic Framework^ 21The following pair can be explained in terms of the Case theory." The (a) version(which also happens to be a D-Structure representation) is ungrammatical because of aviolation of the Case Filter; the (b) version avoids a violation of the Case Filter throughan application of move-a.(16) a. * e seems John to be niceb. John seems t to be niceThe verb "seem" is a raising verb. These types of verbs do not assign 0-roles to theirsubject positions, but Case is assigned in the main clause by tense features. Since thereis no Case assigned to "John" in the embedded clause, application of move-a is forced.According to Aoun & Li [Aoun et al. 
89] Chinese does not have any Subject Raising.'If there are no Subject Raising phenomena in Chinese, the internal subject hypothesiscannot be maintained. It is therefore assumed that in Chinese, the subject is base-generated in the specifier of IP position, and that it receives the external 0-role of theverb in this position.Consider now some instances of Case filter violations which do not force movement.(17) a. * Mary thinks that John to be niceb. Mary thinks that John is niceIn (a), the embedded clause is infinitival, so there is no Case assignment. The chainC = (John) does not receive Case marking, and there is no manner in which movementof "John" can satisfy the Case filter. In (b), the embedded clause is tensed, and so Casewill be assigned, and the sentence passes the Case filter.14The (a) version does not solely violate the Case filter, but also the Extended Projection Principle,which is discussed in section 2.7. A sentence such as "* It seems John to be nice" violates only theCase filter. I use the example given in the text to make the D-Structure and S-Structure relationship astransparent as possible.15Though see [Li 90] for a differing view.Chapter 2. Linguistic Framework^ 22Finally, there is a special class of verbs known as Exceptional Case Marking (ECM)verbs. These verbs can assign case to the subject of an embedded clause. To illustratethis phenomena, consider the following sentences:(18) a. Mary thinks he is lyingb. Mary believes he is lying(19) a. * Mary thinks he to be lyingb. * Mary believes he to be lyingThe first pair shows what one would expect, namely that it is acceptable if the subjectof the embedded clause, "he", is assigned Case in the embedded clause. The second pairdemonstrates that there is no assignment of nominative Case in the embedded clause,because it is infinitival.Consider now this pair,(20) a. * Mary thinks him to be lyingb. Mary believes him to be lyingHere we see a difference between the two verbs. 
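The Case-assignment pattern in (18)–(20) can be illustrated with a small sketch. The toy lexicon and the `ecm` flag below are my own illustrative encoding, not the grammar formalism or the parser's actual data structures:

```python
# Illustrative sketch: does the subject of an embedded clause receive Case?
# The 'ecm' flag is a hypothetical encoding of Exceptional-Case-Marking status.
LEXICON = {
    "think":   {"ecm": False},
    "believe": {"ecm": True},
    "expect":  {"ecm": True},
}

def embedded_subject_gets_case(matrix_verb, embedded_tensed):
    """A tensed I assigns (nominative) Case to its specifier; otherwise the
    embedded subject gets Case only if the matrix verb is an ECM verb."""
    if embedded_tensed:
        return True
    return LEXICON[matrix_verb]["ecm"]

# "* Mary thinks him to be lying": no Case for "him", a Case Filter violation
print(embedded_subject_gets_case("think", embedded_tensed=False))    # False
# "Mary believes him to be lying": "believe" Case-marks the embedded subject
print(embedded_subject_gets_case("believe", embedded_tensed=False))  # True
```

On this simplification, a chain headed by a position for which the function returns False fails the Case Filter.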
"Believe" is an ECM verb, allowingit to assign case to the subject of the embedded clause. "Him", though marked withaccusative Case, which is the Case marking which objects typically get, is clearly thesubject of the embedded clause — "Mary" clearly does not believe "him" (she thinksthat he is lying). Another verb which falls into the ECM category is "expect".2.6.3 Further Examples of A-movementHere I present some further examples of A-type movements.Chapter 2. Linguistic Framework^ 23Movement of VP-Internal SubjectThe subject of an English clause,' although base-generated inside the VP and receivinga 0-role there, cannot remain in this position at S-Structure. The VP-internal subjectposition is not a Case-marked position, and the subject is thus forced to move in orderto pass the Case Filter. If the clause is tensed, then the subject will only move to thespecifier of IP position.(21) a. Tina kissed Kevin.b. [II, [DP Tina ]i [vp ti [v , [v kiss ]J]]If the clause is untensed, the subject will have to move further up the tree to find aCase-marked position which does not receive a 0-role.(22) a. Tina seemed to kiss Kevin.b. [1p [DP Tina ]i [vp [v , [v seem^[vp ti [v, [v kiss ]]M]]PassiveThe formation of the passive form of an active sentence is also governed by Case and0 considerations. It is claimed that passive morphology (the presence of the auxiliaryverb "be" and "-ed" suffix on the main verb in the English case) "absorbs" the external0-role and one internal Case marking. More formally, passive morphology is taken tochange the thematic and Case marking properties of the verb in a consistent manner —the external 0-role is no longer assigned, nor is one internal Case.The passive sentence,16 Recall that, according to Aoun & Li [Aoun et al. 89], Chinese does not have Subject Raising of anykind.Chapter 2. 
Linguistic Framework^ 24(23) The cat was chased.thus has the structure,(24) [ip [DP the cat]i [vp t, [vi [v was ] [vp ti [v , [v chased ] ti ]]]]]In order for the sentence to pass the Case Filter, application of move-a is forced toestablish the well-formed chain ([DP the cat ] i , ti , ti , ti ).2.7 Extended Projection PrincipleThe Projection Principle can be stated as follows:'THE PROJECTION PRINCIPLEThe 9-Criterion holds at D-Structure, S-Structure, and LF.It requires that the thematic relations established at D-Structure be maintained at theother levels of the grammar (save PF). These relations are maintained through chainformation; constraints on the various levels force the application of move-a, while theProjection Principle necessitates the construction of chains.There are some verbs which do not assign 9-roles to their subjects; the ProjectionPrinciple, as it stands, cannot require the presence of a subject in this case. Naturallanguage, however, does seem to require the presence of a subject for each VP. TheExtended Projection Principle (EPP) requires not only that the Projection Principle besatisfied, but also that every VP have a subject (regardless of whether or not the verbwhich heads the VP assigns a 0-role to its subject position).17This formulation of the Projection Principle is taken from [van Riemsdijk et al. 86].Chapter 2. Linguistic Framework^ 252.8 pro and PROGB theory postulates the existence of a number of different (phonetically) empty ele-ments. It is beyond the scope of this thesis to present the justification for the existenceof them (the reader is referred to the various GB overviews cited in the introductionfor further reading). Traces have already been introduced; the purpose of this section issimply to mention two other empty elements which will play an important part later inthe thesis.Some languages seem not to require the presence of a subject. In Italian, forexample, either of (a) or (b) is a well-formed sentence,(25) a. 
Io parlo.b. Parlo."Io" is the pronoun "I", and "parlo" is the first person singular form of the verb "totalk". Both sentences thus mean "I talk". The Extended Projection Principle (EPP)requires that a subject always be present, however. It is thus assumed that a covertsubject is present in sentences without an overt one. This empty element is called pro(or "little pro"). It is simply a phonetically null pronoun, which may be Case-markedand may receive a 0-role.The Binding Theory (which also is not discussed in this thesis) deals with possibleand required coreference of various elements. A consequence of the Binding Theory, thePRO-Theorem, holds that there is another phonetically empty pronoun, called PRO (or"big pro"). PRO must receive a 0-role, but it cannot be Case-marked. A property ofPRO is that it must not be governed (see section 2.11), yet Case-marking is taken tooccur only in configurations in which government holds. Hence PRO cannot appear inthe specifier of IP position of tensed clauses, only in untensed clauses.Chapter 2. Linguistic Framework^ 262.9 PredicationPredication is a relationship between an antecedent and a predicate. It relates the an-tecedent to the open position in the predicate. In the following examples' a predicationrelationship holds between the italicized elements,(26) a. John is sad.b. John ate the meat raw.c. John ate the meat nude.d. John made Bill mad.Here the predication relation holds between a DP and an Adjective Phrase (AP). Theitalicized DPs act as antecedents (the thing which is being modified), filling the openposition in the AP predicate (the modifier). Considering the preferred interpretationsonly, it is clear that in the first sentence sad is modifying John. In the second sentence,it is the meat which is understood to be raw. In contrast, John is taken to be modifiedby nude in the third sentence. 
In the last example, it is Bill who is mad.Each AP above is a headed (or simple) predicate, so called because the maximalprojection of the head is the predicate. Simple predicates can be based upon any of themajor lexical categories N, A, P, or V.There are also complex predicates. Such predicates are either IPs or CPs, withan explicit predicate variable in the specifier position. A predicate variable defines anopen position in the clause, which makes it a one-place predicate. For example, theconfiguration,(27)^[HD PRO [vp ...]]18The discussion and examples of this section derive from [Williams 80].Chapter 2. Linguistic Framework^ 27is a complex predicate, in which PRO is the predicate variable.Williams construes the predication relation as a coindexation of the predicate andits antecedent. Thus we have, for simple predicates,(28) a. John, is sad,b. John ate [the meat], raw,c. John, ate the meat nude,d. John made Bill, mad,Some examples of predication involving complex predicates are,(29) a. John, promised Bill [ ip PRO to leave ],b. John persuaded Bill, [ ip PRO to leave],c. John, tried [ ip PRO to leave],These last three examples exhibit what is known as control. Of note in thesesentences is that the interpretation of the subject of the embedded clause is determinedby the verb of the main clause. In the first sentence, John promised Bill that he (John)would leave. In the second sentence, John persuaded Bill that he (Bill) should leave. Inthe last case, it is John who attempted to leave.The main clause verb thus "controls" the interpretation of the subject of the em-bedded clause. The subject of the embedded clause is not overt in any of the cases.These sentences might at first seem amenable to analysis as A-type movement, asexemplified in,(30)^John i promised Bill rip t, to leave ]Chapter 2. Linguistic Framework^ 28This is not possible, however. 
The θ-criterion does not allow a DP to receive more than one θ-role, yet if the phenomenon of control were construed as A-movement, then this is exactly what would happen. Consider,

(31) John expected Bill to leave.

"Expect" is an Exceptional Case Marking (ECM) verb, assigning Case to the subject of the embedded clause, but no θ-role. John does not expect Bill, he expects Bill to leave. Bill is thus the agent of the embedded verb. In the control case, if this were analyzed as A-movement, then the chain (John_i, t_i) would receive two θ-roles. t_i is assigned a θ-role from "leave", while John_i gets one from "promise". Hence this analysis is not viable.

2.10 Ā-movement

Ā-movement is motivated primarily by scope considerations of the moved operator and selectional properties of verbs. Not all types of Ā-movement are obligatory; topicalization, for instance, is an optional stylistic movement. In this section wh-movement constructions and infinitival relative clauses, both of which are directly relevant to the discussion in chapter 4, will be considered. Appendix A.2 contains many further examples of Ā-movements and the parse trees which the parser produces for them.

2.10.1 Wh-movement

Wh-movement is an Ā-type movement dealing with the movement of question words from their base-generated positions to the positions they need to occupy at LF. Wh-movement is exemplified in the following examples,

(32) a. Who dislikes Susan?
b. What did Susan buy?

The structures of these sentences are,

(33) a. [CP Who_i [C' [C ] [IP t_i' [VP t_i [V' [V dislikes] Susan]]]]]
b. [CP What_i [C' [C did] [IP Susan_j [VP t_j [V' [V buy] t_i]]]]]

A question word,¹⁹ such as "who" or "what", is moved to the front of the clause, and an Ā-chain is established between the wh-phrase and the trace.

In English, wh-movement takes place between D-Structure and S-Structure (this is overt movement).
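A chain such as the one in (33b) can be represented with a minimal record. This is a sketch under the simplifying assumption that positions are just labels; it is not the parser's actual representation:

```python
# Minimal sketch of a movement chain: the head is the moved wh-phrase in
# the specifier of CP, the foot is its base-generated position. The position
# labels are hypothetical simplifications of the real tree configuration.
from dataclasses import dataclass

@dataclass
class Position:
    label: str   # e.g. "SpecCP", "object-of-V"
    kind: str    # "A" or "A-bar"

def chain_links(chain):
    """Each adjacent pair of chain members forms a link of the chain."""
    return list(zip(chain, chain[1:]))

# (33b) "What did Susan buy?": C = (what, t), an A-bar chain of two members
c = [Position("SpecCP", "A-bar"), Position("object-of-V", "A-bar")]
head, foot = c[0], c[-1]
print(head.label, foot.label)     # SpecCP object-of-V
print(len(chain_links(c)))        # 1
```

Each link returned by `chain_links` is what the locality conditions discussed above would be checked against.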
In some languages, such as Chinese, wh-phrases appear in their D-Structure positions at S-Structure,

(34) Ni da le shui
you hit ASP who
"Who did you hit?"

At first there might seem to be little if any justification for postulating that Chinese has wh-movement. Huang [Huang 82] investigated Chinese wh-constructions, and found motivation for proposing that Chinese has wh-movement between S-Structure and LF (covert movement).

Move-α is subject to a number of constraints. These movement constraints explain why wh-phrases cannot be extracted from certain constructions. Thus, while (a) is fine, (b) and (c) are both deviant to some degree. Moreover, under the intended interpretation,²⁰ (c) is significantly worse than (b).

¹⁹Question words are referred to as wh-words since they, in English, generally begin with the letters "wh" (as in "who", "what", "when", and "where"; "how" is an exception); these "words" are actually considered to be complete phrases (DPs), and so they are more often referred to as wh-phrases. This terminology, though not cross-linguistically applicable, is used throughout the literature.

²⁰The point is that "why" cannot be interpreted as a question about Marty's liking; "why" can only refer to Fred's belief.

(35) a. Who_i does Fred believe [CP that Marty likes t_i]
b. ? Who_i does Fred believe [NP the claim [CP that Marty likes t_i]]
c. * Why_i does Fred believe [NP the claim [CP that Marty likes Beverley t_i]]

If Chinese does not have covert wh-movement, then no such asymmetries should be exhibited; conversely, if these asymmetries are found in Chinese, they can be taken to be indicative of movement. Indeed, these asymmetries do exist in Chinese,

(36) Fred xiangxin Marty xihuan shui?
Fred believe Marty like ASP who
a. "Who does Fred believe that Marty likes?"

(37) Fred xiangxin Marty xihuan shui de shuofa?
Fred believe Marty like ASP who COMP claim
a. "Who does Fred believe the claim that Marty likes?"

(38) * Fred xiangxin Marty weishenme xihuan Beverley de shuofa?
Fred believe Marty why like ASP Beverley COMP claim
a. "Why does Fred believe the claim that Marty likes Beverley?"

Further support for the hypothesis that Chinese has wh-movement comes from selectional restrictions, discussed below in section 2.10.3.

2.10.2 Operator Scope

GB theory holds that an operator²¹ must, at some level of representation, make its scope explicit by structural means. This is done through the c-command relation: whatever subtree an operator c-commands at the relevant level of representation is taken to be its scope. An operator will bind a variable which falls in its scope. For example, when a wh-phrase is moved, as in,

(39) What_i did Averill buy t_i

the interpretation is,

(40) What is the thing x, such that Averill bought x

What_i binds the Ā-trace t_i, a variable. The interpretation of the trace is dependent on the operator.

²¹According to Chomsky [Chomsky 81] operators include, among other elements, quantifiers and wh-phrases.

There is a prohibition against the occurrence of unbound (or free) variables in natural language. There is also a prohibition against vacuous quantification; an operator, in order to be licensed to appear in a phrase structure tree, must bind some variable. In general, an operator may bind only one variable.

Languages differ as to the level of representation at which the scope of operators is made explicit. In Polish and Hungarian, scope is represented mostly at S-Structure; in English, scope is represented at S-Structure for wh-phrases, and at LF for quantifiers, while in Chinese, scope is represented at LF for wh-phrases, and at S-Structure for quantifiers.

2.10.3 Selectional Restrictions

Certain verbs require that their embedded clause either have or lack the wh feature; they select a clause that is either [+wh] or [−wh]. The paradigm below demonstrates this,

(41) a. Beverley asked me who bought books.
b. * Who does Beverley ask me bought books?

(42) a. * Beverley believes who bought books.
b. Who does Beverley believe bought books?

(43) a. Beverley knows who bought books.
b. Who does Beverley know bought books?

The verb "ask" selects for a [+wh] clause, while "believe" selects a [−wh] clause. The verb "know", on the other hand, selects both. This same contrast is found in Chinese, even though there is no overt wh-movement going on which could move the wh-phrase into the required position, the embedded clause CP specifier.

(44) Zhangsan wen wo shui mai le shu.
Zhangsan ask I who buy ASP book
a. "Zhangsan asked me who bought books."
b. * "Who did Zhangsan ask me bought books?"

(45) Zhangsan xiangxin shui mai le shu
Zhangsan believe who buy ASP book
a. * "Zhangsan believes who bought books."
b. "Who does Zhangsan believe bought books?"

(46) Zhangsan zhidao shui mai le shu
Zhangsan know who buy ASP book
a. "Zhangsan knows who bought books."
b. "Who does Zhangsan know bought books?"

Data such as this provides additional support for postulating LF wh-movement for Chinese. Assuming that Chinese (and similar languages) have covert wh-movement allows significant generalizations to be captured. However, as will become evident in chapter 4, covert movement also causes problems for a psycholinguistically plausible parser.

2.10.4 Relative Clauses

A relative clause is a clause which modifies a DP. As such, the relative clause is a one-place predicate. In the example below, the relative clause is "who Mary despises"; it modifies the DP "the man".

(47) Mike saw the man who Mary despises

Just like an AP, a relative clause is a one-place predicate, the antecedent of which is identified through predication. However, unlike the case of an AP, the antecedent of a relative clause must occur in a specific structural configuration.
Thus, the predicate is adjoined to and coindexed with its antecedent:

(48) [DP antecedent_i [CP ...]_i]

The complete structure of the above example is thus,

(49) Mike saw [DP the man [CP who_i [IP Mary despises t_i]]_i]

There is an Ā-chain which exists between who_i and t_i, while a predication relationship holds between [DP the man] and the predicate; the predicate variable who_i bears the same index as the clause and the antecedent.²²

²²According to Browning [Browning 87] there are two mechanisms necessary to make the proper link between the predicate and the predicate variable. These mechanisms are feature percolation, whereby the head of a phrase agrees in all its features with its maximal projection, and SPEC-HEAD agreement, whereby a head agrees with its specifier. For the purposes of this thesis I choose to ignore these additional mechanisms in the interests of simplicity of implementation. A "faithful" parser should, of course, implement these relations as well; at present, a direct coindexation between the antecedent and the predicate variable is carried out. See section 3.1.4 for more details on the implementation.

Empty Operator Constructions

The relative clauses considered so far have all been tensed (or finite) relative clauses. Being tensed, their specifier of IP positions are Case-marked. There are also untensed (or infinitival) relative clauses, which do not have Case-marked IP specifiers. Infinitival relative clauses are a type of empty operator construction. Empty operators are simply operators which are phonetically null.

Browning [Browning 87] discusses many different types of empty operator constructions, among them purpose clauses. Since purpose clauses and infinitival relative clauses actually have the same structure, I will discuss primarily the latter.²³

Infinitival relative clauses can be of either the subject-gap or the object-gap variety,

(50) a. Jennifer is the woman to watch Marlene. (subject-gap)
b. Jennifer is the woman to watch. (object-gap)

In the first sentence, "Jennifer" is going to watch Marlene. In the second sentence, "Jennifer" is perceived as the woman that should be watched, by some arbitrary person. There clearly is a difference in interpretation, which can only be resolved by the end of the sentence. The following pair shows that the "resolution point" might be embedded arbitrarily far from the start of the relative clause ([the woman ... ]).

(51) a. Jennifer is the woman to tell the police to watch Marlene.
b. Jennifer is the woman to tell the police to watch.

²³Purpose clauses have internal structures associated with them identical to those associated with infinitival relative clauses. The difference between the two types of clauses is that the purpose clause interpretation is possible only with a restricted set of verbs. The purpose clause reading can in these cases be forced by including the phrase "in order", as follows,

John bought the bike (in order) to ride.

The two possible interpretations are as follows. The first is the relative clause reading, while the second is the purpose clause reading.

John bought "the bike", the one which it is really cool to ride.
John bought the bike in order to ride every day to and from school.

Since I am interested in the difficulties posed by the difference in structure between subject-gap and object-gap varieties of these clauses, I choose to simply consider infinitival relative clauses.

Although the subject-gap / object-gap pairs seem to be of equal parsing complexity, their commonly accepted structural analyses would predict that they should be of very different processing complexity. Browning proposes the following structures for subject-gap and object-gap purpose clauses (and also for infinitival relative clauses).

(52) a. [IP PRO [VP ...]]
b. [CP Op_i [C' C [IP PRO [VP ... t_i]]]]

The first is a simple predication structure, in which PRO is the predicate variable. The second structure also involves predication.
The operator binds a trace, which is interpreted with respect to the antecedent, while PRO receives an arbitrary interpretation.

Browning notes that the major difference between subject-gap and object-gap purpose clauses is the incompatibility of the subject-gap variety with dative shift structures. She argues that thematic orientation rather than a structural difference accounts for this. Substantial structural differences thus seem ill motivated; subject-gap purpose clauses thus act more as a subclass of object-gap purpose clauses than as a class on their own.

2.11 Government

There are two types of government, a structural relationship, which are important in chain formation. The first is head government, the second antecedent government; these are defined as follows (Relativized Minimality and the notion of barrier will be discussed in the next section),

Definition 2.12 α head-governs β iff (i) α ∈ {A, N, P, V, I}; (ii) α m-commands β; (iii) no barrier intervenes; (iv) Relativized Minimality is respected.

It must be assumed that I does not head-govern its specifier position when it does not contain tense features.

Definition 2.13 α X antecedent-governs β (X ∈ {A, Ā, X⁰}) iff (i) α and β are coindexed; (ii) α c-commands β; (iii) no barrier intervenes; (iv) Relativized Minimality is respected.

2.12 Relativized Minimality

Chomsky [Chomsky 86a] introduces a condition of minimality into government,

Definition 2.14 In the configuration ... α ... [γ ... δ ... β ...], γ is a barrier for β if γ is the immediate projection of δ, a zero-level category distinct from β.

Rizzi [Rizzi 90] notes that Chomsky's minimality condition is rigid in that only if δ is a head does its presence interfere with government by α; furthermore such a head δ prevents both head government and antecedent government by α. Rizzi thus proposes that minimality should be relativized according to the type of government.
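The effect of relativizing minimality in this way can be sketched as an intervention check. The linear list of interveners below is a deliberate simplification of the real c-command configuration, and the type labels are my own:

```python
# Illustrative sketch of Relativized Minimality as an intervention check:
# a potential governor blocks a government relation only if it is of the
# SAME type ('head', 'A', or 'A-bar') as the relation being established.
def governs(gov_type, interveners):
    """gov_type: type of the government relation being checked.
    interveners: types of the typical potential governors standing
    between the would-be governor and the governee."""
    return all(kind != gov_type for kind in interveners)

# An intervening head blocks head government ...
print(governs("head", ["head"]))         # False
# ... but has no effect on A-bar antecedent government.
print(governs("A-bar", ["head", "A"]))   # True
```

Under rigid (non-relativized) minimality, by contrast, any intervening head would block both kinds of government.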
This implies that a local potential head governor will prevent a distant head from governing into its domain, and that a local antecedent governor will prevent a distant antecedent governor from doing the same; however, potential head governors interfere in no way with antecedent government, and likewise, potential antecedent governors have no effect on government by heads.

Taking this concept of relativity a step further, Rizzi also proposes to limit interaction of potential antecedent governors to those of the same type: X⁰, A, or Ā. Relativized Minimality (RM) thus construed restricts the formation of multiple chains of the same type. Figure 2.8 shows that completely disjoint chains are allowed under RM.

As a first approximation, nested chains of the same type are not permitted under RM. Figures 2.9 and 2.10 depict configurations which are not permissible if the chains are of the same type: nested chains and overlapping chains respectively.

RM is, however, stated in terms of typical potential governors for some element.

Figure 2.8: Unrelated chains are allowed
Figure 2.9: Nested chains of the same type are disallowed

RELATIVIZED MINIMALITY
α X-governs β only if there is no γ such that (i) γ is a typical potential X-governor for β; (ii) γ c-commands β and does not c-command α.

A typical potential governor is any element of the proper type which could, in some configuration, actually be a governor.

Definition 2.15 γ is a typical potential head governor for β iff γ is a head m-commanding β.

Figure 2.10: Intersecting chains of the same type are disallowed

Definition 2.16 γ is a typical potential A antecedent governor for β iff γ is an A specifier c-commanding β.

Definition 2.17 γ is a typical potential Ā antecedent governor for β iff γ is an Ā specifier c-commanding β.

Definition 2.18 γ is a typical potential X⁰ antecedent governor for β iff γ is a head c-commanding β.

If the possible government domain of neither chain interferes with that of the other, nesting of chains is allowed. This situation obtains when the embedded chain is contained in a left branch (in a right-branching language) of the tree, as in figure 2.11.

Figure 2.11: Nested chains are permitted in some configurations.

In order to account for the paradigm seen in (53)–(59), both Rizzi and Cinque [Rizzi 90, Cinque 90] appeal to the degree of referentiality of the moved phrase. Rizzi ties a phrase's degree of referentiality to the θ-role it receives, whereas Cinque calls upon Pesetsky's [Pesetsky 87] notion of D-linking. I will leave open the question of what an appropriate definition of referentiality might be, since no one notion has been settled upon (nor are any of the proposals particularly well-defined at the moment).

Cinque allows for two types of Ā-dependencies. Antecedent government is available to those elements which are not "referential enough"; antecedent government is a relatively restricted type of dependency-establishing method. Binding is the other manner in which Cinque allows an Ā-dependency to be set up. Binding is much less restricted than antecedent government.

Definition 2.19 α binds β iff (i) α and β have the same referential index; (ii) α c-commands β.

(Antecedent) government chains are sensitive to government barriers,

Definition 2.20 Every maximal projection that fails to be directly selected by a category nondistinct from [+V] is a barrier for government.

while binding chains are blocked by binding barriers,

Definition 2.21 Every maximal projection that fails to be (directly or indirectly) selected in the canonical direction by a category nondistinct from [+V] is a barrier for binding.

The canonical direction is the direction in which a verb assigns Case-marking to its complement. For English this is rightward. The categories non-distinct from [+V] are A, V, I, and C. Finally, the different notions of selection are defined as,

Definition 2.22 α directly selects β, where α is a lexical category, iff α directly s-selects β.

Definition 2.23 α directly selects β, where α is a non-lexical category, iff α directly c-selects β.

Direct in these cases refers to direct θ-marking (for internal θ-roles), as opposed to indirect θ-marking (for external θ-roles).

2.13 The Empty Category Principle

It has long been observed that there exists an asymmetry between the extraction of arguments versus adjuncts from various domains. Historically, domains from which extraction could not take place are called islands (see [Ross 67]). Two classes of islands are recognized: weak islands and strong islands. It is the weak islands which allow the argument/adjunct asymmetry to be seen. In the following examples, taken from Cinque [Cinque 90] (his (1)–(7)), the first three are considered strong islands, while the last four are weak islands.

(53) Subject island
a. * Which books did [ talking about t ] become difficult?
b. * How would [ to behave t ] be inappropriate?

(54) Complex NP island
a. * To whom have you found [ someone [ who would speak t ]]?
b. * How have you found [ someone [ who would fix it t ]]?

(55) Adjunct island
a. * To whom did you leave [ without speaking t ]?
b. * How was he fired [ after behaving t ]?

(56) Wh-island
a. ?? To whom didn't they know [ when to give their present t ]?
b. * How did they ask you [ who behaved t ]?

(57) Inner (negative) island
a. To whom didn't you speak t?
b. * How didn't you behave t?

(58) Factive island
a. To whom do you regret [ that you could not speak t ]?
b. * How do you regret [ that you behaved t ]?

(59) Extraposition island
a. To whom is it time to speak t?
b. * How is it time to behave t?

The Empty Category Principle (ECP) was formulated to account for this paradigm. It is a principle which has appeared in numerous forms in the literature [Lasnik et al. 84, Chomsky 86a, Kayne 84, Aoun et al. 87]. I adopt Cinque's formulation, which derives from that of Rizzi [Rizzi 90].

THE EMPTY CATEGORY PRINCIPLE
A nonpronominal empty category must be properly head-governed by a head non-distinct from [+V].

Proper head-government is defined as follows,

Definition 2.24 α properly head-governs β iff α head-governs β and α c-commands β.

2.14 Minimal Linguistic Assumptions

This chapter has presented a short overview of current syntactic theory within the GB framework. Although the problems to be discussed are cast in terms of GB theory, as are the solutions proposed, the ideas behind them are not tied to this theoretical model. Indeed, there is a minimal set of syntactic assumptions, as well as a minimal set of processing assumptions, which, when taken together, yield the problems which this thesis deals with. This section presents such a minimal set of syntactic assumptions, which are (relatively) uncontroversial.

It is assumed that there is some level of syntactic description at which the quantificational structure of sentences is reflected.
In GB theory, this is the level of logical form (LF).

It is further assumed that the scope of a wh-phrase (or an operator, in general) is represented structurally at this level of representation. In GB theory, wh-phrases must c-command their scope at LF; they come to c-command their scope through application of move-α (this subcase commonly being referred to as wh-movement).

Finally, it is assumed that this process of identifying the scope of a wh-phrase may occur either overtly or covertly. In GB theory, this implies that wh-movement may take place either between D-Structure and S-Structure (overt movement), or between S-Structure and LF (covert movement).

Chapter 3
Computational Framework

This chapter sets up the computational framework within which the work of this thesis is carried out. The general parsing model is presented first, alongside an exploration of various models on which it is based. Some minimal computational assumptions are outlined in the second section. The third and final section of the chapter discusses some motivating psycholinguistic evidence for these assumptions.

3.1 A General Model

This section considers various proposals regarding psycholinguistically plausible parsing methods. I will discuss each in turn, finally summarizing the general parsing model which I will use. Further refinements to this general model will be described in the next chapter, as the motivations for them are encountered and explored.

There are certain facts about the human language processing apparatus with which a psycholinguistically plausible model of parsing must be consistent; minimally, it must recognize that,

• humans have quite strict short-term memory limitations,
• humans process linguistic input very quickly, and
• not all sentence constructions are of equal processing complexity.

3.1.1 The Marcus Parser

Marcus [Marcus 80] made the first serious attempt to formulate a psycholinguistically plausible parsing mechanism.
Marcus made several very important observations about how such a model must be structured.

Most importantly, Marcus claimed that in order to be psycholinguistically plausible, a parsing mechanism must be deterministic. Marcus presented this idea as the Determinism Hypothesis:

... the syntax of any natural language can be parsed by a mechanism which operates "strictly deterministically" in that it does not simulate a nondeterministic machine ... [pg. 2]

Deterministic parsing entails that the parser must exhibit certain behaviours. First of all, Marcus observes that it must be the case that any structure which is built is permanent — the parser cannot be allowed to backtrack:

In terms of the structures that the interpreter creates and manipulates, this will mean that once a parse node is created, it cannot be destroyed; that once a node is labeled with a given grammatical feature, that feature cannot be removed; and that once one node is attached to another node as its daughter, that attachment cannot be broken. [pg. 12]

Furthermore, no structure which is built may ever be discarded:

... all syntactic substructures created by the grammar interpreter for a given input must be output as part of the syntactic structure assigned to that input. [pg. 12]

Finally, the internal state of the parser must not be able to encode temporary syntactic structure. If it could, the previous conditions would be vacuous.

In addition to exhibiting the above behaviours, the parser must, in order to be deterministic, operate in an at least partially bottom-up fashion. This implies that it will be driven by the input stream, projecting structure from the lexical items. It cannot operate exclusively bottom-up, however. The parser must "be able to reflect expectations that follow from general grammatical properties of the partial structures built up during the parsing process." [pg. 14] For example, subcategorization requirements are a form of top-down parsing. More generally, selection by one category of some particular category involves expectation on the parser's part of what it should "see" in the input stream. For example, C obligatorily selects IP as a complement, just as I obligatorily selects VP as a complement. (See section 3.1.2 for a more detailed discussion of selection.)

It is also assumed that the parser processes input in a strictly left-to-right manner. Although a parser would have to take a sound stream as input in order to truly claim to model the human language processing system, current technology prevents this. A parser taking written input, however, must certainly process it in the same order as a human would hear it if it were spoken, if any claims of psycholinguistic plausibility are to be made.

The left-to-right processing constraint, together with the determinism hypothesis, entails that the parser must have access to a certain amount of lookahead. This allows the parser to postpone deciding what to do with a certain element in the input stream until it has seen some number of further elements from the input stream. One can think of this as a "lag" in processing. The amount of lookahead must be constrained, else the determinism claim is invalidated — by using unbounded lookahead the parser would, in effect, be voiding the determinism claim.

The Marcus parser utilizes two major data structures. The first is a buffer, a first-in first-out data structure. The buffer contains nodes which are seeking mother nodes. The second is a stack, a last-in first-out data structure. The stack contains nodes which are seeking daughters. In brief, the parser reads input from the input stream, places partially processed input in the buffer, and tree fragments in the stack.

3.1.2 Licensing Parsers

There has been interest recently in methods of building phrase structure trees without using extensive phrase structure rules.
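The buffer-and-stack organization just described can be sketched as follows. This is a hypothetical Python illustration of the data-structure regime only, not the thesis implementation, and the chunk labels are invented:

```python
from collections import deque

class MarcusState:
    """Sketch of the Marcus parser's two major data structures:
    a FIFO buffer of nodes seeking mothers, and a LIFO stack of
    nodes seeking daughters."""

    def __init__(self):
        self.buffer = deque()  # partially processed input (FIFO)
        self.stack = []        # tree fragments (LIFO)

    def enqueue(self, node):
        """New material from the input stream enters at the rear."""
        self.buffer.append(node)

    def first(self):
        """Only the front of the buffer is directly accessible."""
        return self.buffer[0] if self.buffer else None

    def push(self):
        """Move the front buffer element onto the stack."""
        self.stack.append(self.buffer.popleft())

    def top(self):
        return self.stack[-1] if self.stack else None

state = MarcusState()
for node in ["DP(the dog)", "I([+past])", "VP(chase)"]:
    state.enqueue(node)
state.push()  # "DP(the dog)" now tops the stack
```

Bounded lookahead then amounts to the parser inspecting only a fixed number of cells at the front of the buffer before committing to an action.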
This section describes two such parsers; although what I adopt is closer to the work of [Abney et al. 86] than that of [LeBlanc 90], there are certain aspects of the latter which deserve mention here.

Abney

Abney and Abney & Cole [Abney 87, Abney et al. 86] describe how licensing conditions can be used to build phrase structure. Every element in a phrase structure representation must serve some purpose by being there; something must license its presence. As Abney [Abney 87] puts it, "Specifically, every element in the structure is licensed by performing a particular function in the structure; the structure is well-formed only if every element in it is licensed." [pg. 2]

The licensing conditions which Abney [Abney 87] uses are (i) functional selection, (ii) subjecthood, (iii) modification, and (iv) θ-role assignment.

Abney assumes that the functional selection relationship holds between C and IP, between I and VP, and between D and NP. IP is the only possible complement of C, as are VP and NP of I and D respectively. This information is very useful to the parser when it is building phrase structure. Functional selection, though similar to c-selection, is more restrictive. V c-selects (and s-selects) its complements, but V does not functionally select them.

The subjecthood relation is intended to license a subject's appearance; it is basically an encoding of the Extended Projection Principle's requirement that every clause have a subject.

Modification is intended to license modifiers. Since I do not deal with modifiers, I will not discuss this condition further.

The θ-criterion is a very strong condition, requiring both that every A-chain have exactly one θ-role, and that every θ-role be assigned to exactly one A-chain. Abney makes use of this, using θ-role assignment as a licensing condition. In other words, one way in which a maximal projection is licensed to appear in a phrase structure tree is by being assigned a θ-role.

Abney also assumes that sisterhood is necessary in order for a licensing relationship to hold. A directionality parameter is introduced into each licensing condition, allowing cross-linguistic variation to be accounted for.

I believe that the use of licensing conditions to build phrase structure is correct, though I do not adopt the same set of licensing conditions as Abney. I outline the licensing relations I incorporate into my model below.

LeBlanc

LeBlanc [LeBlanc 90] builds phrase structure without using an explicit X-bar schema. Based on a revision of GB theory by Davis [Davis 87], various Percolation Principles are used to derive the categorial features of a dominating node given two adjacent nodes.

LeBlanc's parser is driven by Case marking and θ-role saturation. In his use of Case marking to build phrase structure, LeBlanc's parser differs from that of Abney. Case marking is primitive to GB theory, and together with θ-assignment, it is central to A-movement. The adoption of Case marking as a licensing mechanism thus seems natural.

3.1.3 Chunk Parsing

In recent work Abney [Abney 91] has further explored the construction of phrase structure representations. It is observed that humans seem to parse linguistic input in "chunks". Abney proposes a chunking parser as more psycholinguistically plausible than one which deals directly with the lexical items. Following the spirit, though not the letter, of Abney's proposal, the parser described herein projects each lexical item to its full X-bar projection. These skeletal X-bar projections are then joined together by the parser proper (this is what Abney calls the "attacher").

There are three exceptions to this simplistic chunking procedure. The first involves the construction of DPs.
LeBlanc notes that noun phrases (be they conceived of as NPs or DPs) must be handled in special ways by the parser. I simply enforce the functional selection constraint as early as possible, namely in the "chunker" portion of the parser, to ensure that a complete DP is the smallest chunk that will be returned after encountering a determiner or a noun. Modifying prepositional phrases (PPs) can cause difficulties, as there may be arbitrarily many of them. Nominalizations such as,

(60) the destruction of the city

involve subcategorization. Although this subcategorization is optional, the chunker can use such information as a cue to search for a potential complement. If a complement is found, it is incorporated into the structure, else the structure is closed off at this point. Cases of adjuncts, which are not subcategorized for, are not dealt with.

The second special case is that of a tensed verb, which is analyzed as tense features plus an untensed verb. Each of these projects a skeletal X-bar projection. Figure 3.12 shows how the inflected verb "chased" is analyzed by the chunker. [Note: Chinese does not have tense; instead, the parser recovers aspectual features from the verb, and these head the I projection. The I projection is thus parameterized for either "tense" or "aspect".]

[Figure 3.12: Skeletal X-bar projections created by the chunker: chased = [+past] + chase]

The chunker also handles the case of head movement of an auxiliary in a special manner. For further details, refer to section 4.2.

Abney also notes that the chunker can contain language-specific rules. Although I do not think this is necessary, for coding simplicity I utilize two slightly different chunking procedures to handle the difference between English, in which "tense" heads the I projection, and Chinese, in which "aspect" heads the I projection.

3.1.4 Other Details

As mentioned in a footnote in section 2.10.1, not all mechanisms prescribed by the theory as necessary to represent relative clauses correctly are implemented in the parser. In order to simplify the programming involved, the mechanisms of feature percolation and SPEC-HEAD agreement are not implemented. Furthermore, the antecedent of the relative clause is not coindexed with the relative clause. This would result in a representation such as,

(61) [DP [DP the man ]i [CP whoi [IP ... ti ... ]]]

with the relationships shown in figure 3.13. Rather, the antecedent is coindexed with the predicate variable, which in turn binds some variable in the relative clause. When the parser identifies that a predication relationship needs to be established, it directly coindexes the predicate variable and the antecedent. In other words, the parser produces a structure similar to the proper one, but in a more direct and ad hoc manner.

[Figure 3.13: The mechanisms of relative clauses: predication between the antecedent DP and the CP, SPEC-HEAD agreement, and feature percolation]

(62) [DP the man ]i [CP whoi [IP ... ti ... ]]

The relationship between [DP the man ]i and whoi is taken to be that of predication.

As will be further discussed in Chapter 4, the implementation does not handle head movement very well. Since the primary purpose of the implementation is to demonstrate how Ā-movements are handled by the parsing scheme, head movement is carried out in a fairly ad hoc manner at present.
Proper handling of head movement is an issue left for future work.

A further weakness of the implementation is that adjuncts are not handled at all. There are several interesting aspects to adjuncts; though they do not bear directly on the points being made in the thesis, a natural extension of this work must address how they are parsed. Parsers driven by licensing mechanisms in general have a difficult time with adjuncts, since the appearance of adjuncts does not seem to be governed by any licensing conditions.

3.1.5 Summary

To conclude this section, I will summarize the parsing model which I assume as a starting point for this thesis.

The parser has two major (conceptual) components, corresponding to Abney's chunker and attacher. It is of the traditional buffer and stack variety, as defined in [Marcus 80]. The chunker takes lexical items from the input stream, constructs skeletal phrase structure representations (chunks), and deposits them at the end of the buffer. The attacher uses the buffer and the stack; it has direct access to the first element of the buffer and the top element of the stack (although it can look at all the items in the buffer, it can only modify the first one). It shuffles nodes between the two, employing the licensing mechanisms to attach nodes to each other to eventually form a complete phrase structure tree.

The licensing mechanisms I use are:

• θ-role assignment
• Case-marking
• functional selection
• scope assignment
• prohibition against vacuous quantification
• Extended Projection Principle (EPP)

The first three licensing mechanisms were addressed above. The remaining ones, though discussed in chapter 2, deserve a brief second mention here. The scope assignment requirement is used to license the appearance of quantificational elements and to flag movement of such elements. The prohibition against vacuous quantification forces the presence of a variable for a quantificational element to bind. In other words, if a quantifier has been identified, the appearance of a variable which this quantifier can bind is not only licensed, it is required. Finally, the EPP forces the presence of a subject in each clause. The EPP thus licenses the appearance of pleonastics, as in

(63) It seems Tina hit Kevin.

In conjunction with the prohibition against vacuous quantification, I use the EPP to solve a parsing dilemma in infinitival relative clause constructions. In these cases, the presence of a PRO subject is required.

3.2 Minimal Computational Assumptions

In this section the basic assumptions about the parsing model are laid out. These assumptions, which are relatively uncontroversial, taken in conjunction with the linguistic assumptions presented earlier, conspire to present seemingly insurmountable difficulties for any parsing mechanism adhering to them.

I assume that the parser has a limited lookahead capability. Though no explicit limit on the amount of lookahead is built into the parser, no more than two chunks' worth of material is looked at before a decision is reached as to what to do with the current item.

The parser does not backtrack. Once a piece of phrase structure has been attached to another, the parser is committed and cannot undo this attachment.

I also assume that the parser operates in a strictly left-to-right fashion. This simply implies that the parser is constrained to process items from the input stream in the same order in which a person would hear them.

It is further assumed that the parser operates primarily bottom-up (with subcategorization and functional selection accounting for the top-down aspects).
Finally, I assume that the parser is filler-driven rather than gap-driven in its chain-construction process (see below).

3.3 Psycholinguistic Evidence

The purpose of this section is to present psycholinguistic evidence for the processing assumptions outlined at the end of the previous section.

3.3.1 Filler-Driven Parsing

Frazier & Flores-D'Arcais [Frazier et al. 89] studied gap-locating strategies in English and Dutch. Their study concluded that the then commonly held view that parsing was gap-driven could not be correct. Parsing is gap-driven if the parser makes no attempt to establish a dependency between a filler and a gap until the gap is identified; in terms of GB theory, chain construction would in this case be triggered by the identification of the gap. Instead they advocate the use of a filler-driven strategy. Chain construction is in this case triggered by the identification of a filler. Kurtzman, Crawford, & Nychis-Florence [Kurtzman et al. 91] consider primarily how wh-traces are located; their results also support a filler-driven approach.

3.3.2 General Parsing Principles

Frazier & Rayner [Frazier et al. 88] conducted experiments in which subjects read sentences while their eye movements were tracked and recorded. The measurements allowed both the global and local complexity which the sentences presented to the subjects to be gauged. Frazier & Rayner proposed a number of constraints to account for their results. Among those constraints are the following,

Left-to-Right Constraint This constraint ensures that the parser will process words in the order in which it encounters them (i.e., from left to right).

First Analysis Constraint The parser pursues only one possible analysis at a time, and does not attempt multiple analyses in parallel.

Late Closure The parser should attempt to attach each new chunk into the most recently constructed parse tree fragment.

Most Recent Filler When constructing multiple chains, the most recently identified filler should be associated with the first gap found.

These constraints each support the computational assumptions I make. Though not formulated as an explicit constraint, the fact that people have a very limited short-term memory capacity is an underlying motivation for the First Analysis Constraint, Late Closure, and the Most Recent Filler constraint. My assumption of limited lookahead is a more explicit expression of this restriction.

The Left-to-Right Constraint certainly supports the assumption that the parser operates in a left-to-right manner.

The First Analysis Constraint indirectly justifies the non-backtracking assumption. As Marcus [Marcus 80] notes, if backtracking is allowed, then the determinism claim is invalidated, as backtracking is equivalent to pursuing multiple parses at once.

Late Closure is implied by the operation of the stack and buffer of the parser, as the attacher only works with the first element of the buffer and the top element of the stack.

The Most Recent Filler strategy, as will be seen in the next chapter, is incorporated very neatly into the parsing mechanism, and is used to explain certain linguistic effects.

Chapter 4
Problems and Solutions

The computational and linguistic assumptions made in the previous two chapters seem innocuous enough, yet they interact to produce seemingly insurmountable difficulties. This chapter studies a selection of syntactic constructs and linguistic effects, and the processing difficulties they pose given these assumptions.
Since my main focus is on dealing with theoretically problematic, and therefore interesting, phenomena, rather than striving simply to achieve substantial empirical coverage, only a limited number of pertinent examples are considered.

Three main issues are dealt with in this chapter. I first consider Relativized Minimality (RM), a set of conditions on possible chain interactions. I argue that RM is better analyzed as a processing (performance) constraint than as a linguistic (competence) constraint. The second issue concerns overt as opposed to covert movement; the difficulties posed by cross-linguistic variation in the syntactic level at which Ā-movement takes place are studied. Finally, the parsing problems caused by infinitival relative clauses (in English) of various types are looked at. There are two broad classes of infinitival relative clauses of interest, so-called subject-gap and object-gap clauses. One of these two classes is predicted to be unparsable given the previously outlined assumptions, while in reality both are equally easy to process.

The majority of problems are caused by Ā-type movements, and hence attention is focussed on them. However, A-type movements are considered as well; the parsing mechanism is refined throughout the chapter, and straightforward examples of the parser's operation make the rest of the chapter more easily understood.

4.1 Relativized Minimality Effects

The Marcus parser has a serious limitation, in that it cannot handle multiple wh-movement structures. Consider, for example,

(64) Which problem did John wonder how Mary solved?

The Marcus parser is able to keep track of only one wh-dependency, and only one which can be analyzed as arising from successive cyclic movement.
Recalling Relativized Minimality (RM) and the related discussion from Chapter 2, it becomes apparent that some mechanism more sophisticated than that used by the Marcus parser is needed to handle various filler-gap dependencies. Moreover, as both Rizzi and Cinque [Rizzi 90, Cinque 90] observe, there are certain long distance Ā-dependencies which cannot be analyzed as arising from successive cyclic wh-movement.

To summarize, RM captures the generalization that there are (at least) three distinct forms of dependencies induced by move-α: (i) A-chains, (ii) Ā-chains, and (iii) X°-chains (or head movement chains). Chains of different types do not interfere with each other, but in the case of chains of the same type, there must be no closer potential antecedent governor for the foot of the chain. [Note: Thanks to Mark Baker for pointing out that the effect of this is that only in left-branching contexts (in a predominantly right-branching language like English) do multiple dependencies of the same type occur. Thus a single store, rather than a stack, is required to achieve the desired effect.] This type of constraint on chains appears to be amenable to analysis as a parsing constraint rather than a grammar constraint, in the following way. [Note: There are two ways of explaining linguistic effects in terms of parsing constraints. The first is to encode the appropriate constraints directly into the parsing mechanism; this is the route I follow (a direct manner of explanation). The second casts the relevant constraints as part of the syntax, but holds that they exist solely to satisfy processing limitations (an indirect account). The first approach results in a more constrained theory. It allows only two types of constraints — processing constraints and syntactic constraints. The latter approach also allows processing-derived syntactic constraints, as well as syntactically-derived processing constraints.]

The parser must have some way of keeping track of the different chains which it is constructing, and must be able to keep the three different types of chains separate. The behaviour prescribed by RM is achieved if the parser has a separate stack for each type of chain it is constructing. [Note: Though, as noted, only in left-branching structures may other elements be pushed onto the stack; else the parser operates as though it has a single store for moved elements of a certain type.]

It is not necessary for the parser to keep track of the whole chain, but merely the current link in the chain. Once a link has been successfully created, it is clear that there is no interference from other potential antecedent governors. Keeping only one link of the chain in memory at once lessens the memory load on the parser — a requirement if the model is to maintain claims of psycholinguistic plausibility. If the model necessitated keeping the whole chain in memory, it would require a potentially unbounded amount of memory, as chains (especially Ā-chains) are potentially unbounded.

The parser must also be able to differentiate between the two different types of Ā-dependencies which may be established [Cinque 90], antecedent-government and binding. A constraint on the required degree of referentiality of the moved element is not yet well defined in the literature [Pesetsky 87, Rizzi 90, Cinque 90]. Since Ā-dependencies are of much greater interest in this thesis than the partitioning of lexical items into various classes, I have chosen to simply encode directly in the lexicon the type of Ā-dependency each lexical item enters into. Once a proper characterization of this constraint emerges, it can be straightforwardly incorporated into the parsing mechanism, and the current stipulative approach dispensed with.

There are also reasons why RM should not be considered a purely syntactic (competence) constraint. As it is defined, it is not open to cross-linguistic variation. If the parser is universal cross-linguistically, while the competence theory is parameterized, then the strong hypothesis is that all universal principles belong to the performance theory and only cross-linguistically variable principles are part of the competence theory. Although this is most certainly too strong a statement, RM seems to be well-motivated as a parsing constraint in the first place. Furthermore, at least as far as nested chains are concerned, there is no reason why the competence theory should rule them out, as ... filler1 ... filler2 ... gap2 ... gap1 ... structures have easily identifiable filler-gap relationships. However, there is a very good performance-related reason as to why nested structures are ruled out — they require a large amount of short-term memory to resolve properly. Moreover, RM violations, such as wh-island constraint violations, do not result in completely deviant structures. Instead, there seems to be a gradient effect. This is generally taken to be an indication of a performance rather than a competence constraint. Thus, as a performance constraint, RM has a good explanation; as a competence constraint, it seems singularly odd.

4.2 Brief Overview of the Parser's Operation

In this section I present a short discussion of the data structures used, as well as an outline of how the parser operates.

The main data structures of the parser are a buffer, a main stack, and several movement stacks. The buffer is a First-In First-Out (FIFO) structure (see figure 4.14). Elements enter the buffer from the rear, and leave it from the front.
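The movement stacks, the component added to handle Relativized Minimality, can be given a minimal sketch. This is hypothetical Python with invented labels, not the thesis implementation: one store per chain type, each holding only the current link, so that chains of different types never interfere:

```python
# One movement store per chain type, as proposed in section 4.1 for
# Relativized Minimality. Only the current link of each chain is held,
# keeping memory use bounded even though chains themselves may be
# unboundedly long.
CHAIN_TYPES = ("A", "A-bar", "head")

class MovementStores:
    def __init__(self):
        self.stores = {t: [] for t in CHAIN_TYPES}

    def push_link(self, chain_type, trace):
        # Pushing a second same-type link is only possible in
        # left-branching contexts; otherwise each store behaves as
        # a single cell.
        self.stores[chain_type].append(trace)

    def current_link(self, chain_type):
        store = self.stores[chain_type]
        return store[-1] if store else None

    def clear(self, chain_type):
        """Discharge the store once its chain is complete."""
        self.stores[chain_type].clear()

stores = MovementStores()
stores.push_link("A-bar", "which problem")  # wh-filler identified
stores.push_link("head", "t_aux")           # auxiliary moved from I to C
# The A-bar link is invisible to the head-movement store, so the two
# dependencies do not interfere, as RM requires.
```

A same-type intervener would have to go through the one store its type owns, which is what rules out the interfering configurations.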
The parser tries to always have at least two elements in the buffer, but it will not allow more than four elements in the buffer at once.

[Note: I take a universal principle to be one which applies cross-linguistically without change. In other words, I use universal to mean unparameterized in this case.]

[Note: The parser as implemented does not provide graded judgements; if a violation of RM occurs, the parser simply stops the parse at that point.]

[Figure 4.14: The buffer — elements enter at the rear and leave from the front]

The main stack is a Last-In First-Out (LIFO) structure (see figure 4.15). Ideally, the stack should have a limited size as well, but using Marcus' model and assumptions this is not possible (see section 3.1.1). The parser is not able to connect phrase structure fragments together as soon as it should. Instead, it is forced to wait until a phrase is complete before attaching it to something else. A projection is complete when all of its selectional restrictions have been met; this includes functional as well as lexical selection. Lexical selection includes the selection of complements specified in subcategorization frames. The reason that attachment cannot be carried out until a projection is complete is that once two pieces of phrase structure are connected, the parser cannot access the internal structure of the newly formed phrase in order to attach complements to a piece of it. This leads to a very large number of tree fragments sitting on the stack, which is not psycholinguistically plausible (see section 3.3.2 and chapter 7 for relevant discussion).

As an example, when a tensed verb is processed, both an IP chunk and a VP chunk are constructed (see section 3.1.3 for details). The VP chunk cannot be connected to the IP chunk until all of its selectional restrictions are satisfied. Until then, both the IP and the VP must just sit on the stack.
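The completeness condition just described, that a projection may be attached only once all of its functional and lexical selectional requirements are satisfied, might be checked as follows. This is a hypothetical sketch; the feature representation is invented:

```python
def is_complete(projection):
    """A projection is complete when every selected category
    (functional selection plus subcategorized complements) has
    actually been attached."""
    return all(sel in projection["attached"] for sel in projection["selects"])

# A tensed verb yields an IP chunk and a VP chunk (section 3.1.3):
ip = {"cat": "IP", "selects": ["VP"], "attached": []}  # I selects VP
vp = {"cat": "VP", "selects": [], "attached": []}      # intransitive V

# The VP has no outstanding requirements; the IP must wait on the
# stack until its VP is attached.
assert is_complete(vp) and not is_complete(ip)
ip["attached"].append("VP")  # attaching the VP completes the IP
assert is_complete(ip)
```

On this view the proliferation of stack-resident fragments falls out directly: each incomplete projection is just one whose "selects" list is not yet exhausted.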
If these two tree fragments are part of an embeddedclause, the problem is compounded, as the IP cannot be connected to the higher-levelclause until complete; it will not be complete until the VP is complete.'6A question which leaps to mind is how potentially unbounded structure, such as a DP with stackedadjectives, might be handled. Although the parser at present does not deal with such constructions, aChapter 4. Problems and Solutions^ 60xpTop of Stack •••Bottom of StackFigure 4.15: The stackThe parser makes use of three auxiliary data structures for moved elements. Thesedata structures are stacks, though pushing of new elements is restricted to left branchingstructures, due to the linear nature of the parse.The parser proceeds by cycling through a sequence of possible operations, perform-ing the first one which is possible in the current configuration, and then beginning a newcycle. The possible operations may be grouped as follows:Head Movement Head movement is handled by the parser only to the extent that itmust, and in an ad hoc manner. The single case of head movement the parserrecognizes is that of an auxiliary moved from I to C.The chunker recognizes configurations in which an element of category I cannotoccupy the head position of an IP. These configurations are (i) a wh-phrase followedproper chunking procedure can be used to parse them without the need for an unbounded data structureto store the iterating elements. When the chunker enounters a determiner it recognizes that an NP mustbe a part of the final DP which the determiner will head. If, in the course of searching for the nounwhich heads this NP the chunker encounters a sequence of stacked adjectives, it can simply incorporatethem into the partially built DP, without storing them in a buffer or stack. If new items are continuallyincorporated into existing structure, no increased processing load results.Chapter 4. Problems and Solutions^ 61by an auxiliary, and (ii) an auxiliary followed by a DP. 
In these cases, a skeletalCP is projected, with the head of the CP occupied by the auxiliary of category I.The parser assumes that a projection headed by an element of the wrong categorywas moved there via head movement. It thus begins to construct an X °-chain,placing an appropriate trace in the head movement stack. The head movementchain is terminated when the landing site (an empty head of the proper category)is found, subject to Relativized Minimality.Replenish Buffer The parser tries to keep at least two elements in the buffer at alltimes. To satisfy this need, the chunker is invoked. The chunker processes itemsfrom the input stream, generating chunks which are placed in the buffer. Thechunker does this either until there are at least two chunks in the buffer, or untilthere are no more items in the input stream.Predication The conditions under which the predication relation holds are checkednext. If licensed, the proper configuration is constructed and the proper relation-ships are established between the relevant elements (as discussed in section 3.1.4).Attachment This is the main part of the parser; it is the embodiment of the attacher(see sections 3.1.3 and 3.1.5). There are two main types of attachment. The firstinvolves the attachment of two pieces of phrase structure to each other, while thesecond involves the incorporation of a trace into a piece of phrase structure. Theparser always attempts to do these two types of attachment in that order.The parser always considers the element located at the front of the buffer and theelement at the top of the stack when attempting to attach two phrase structurefragments to each other. These are the only two positions of these data structureswhich are accessible to the attacher.In order to maintain the linear order of items from the input stream, the attacherSpecifier^SpecifierChapter 4. 
Figure 4.16: Possible attachment sites

has four choices as to how to try to attach these fragments together. Referring to figure 4.16, the possible attachments are as follows (where XP is sitting at the front of the buffer, and YP is located at the top of the stack):

1. If the specifier of XP is on the left, an attempt to attach YP into this position is made.

2. If the complement of XP is on the left, an attempt to attach YP into this position is made.

3. If the complement of YP is on the right, an attempt to attach XP into this position is made.

4. If the specifier of YP is on the right, an attempt to attach XP into this position is made.

If any attachment is possible, the first one (according to the order given) is carried out.

Concerning the incorporation of a trace into a phrase structure representation, the parser considers the same attachment sites, in the same order, for any trace which might be present in one of the movement stacks.

CP Insertion The conditions under which a CP should be inserted (such as when a wh-phrase has been identified, but there is no overt complementizer present) are tested; if needed, a skeletal CP is dropped into the first position of the buffer.

Stopping Conditions The conditions under which a parse is successful are checked. These conditions are (i) the input stream is empty, (ii) the buffer is empty, (iii) there is a single phrase structure representation in the stack, and (iv) all the movement stacks are empty.

Extended Projection Principle The constraints of the Extended Projection Principle are checked (see sections 2.7 and 2.8), and complied with at this point if necessary.

Else If nothing else can be done, the parser attempts to remove an element from the buffer and push it onto the stack.
If the buffer is empty, then this will not be possible — in this case the top element of the stack is popped and placed at the front of the buffer.

4.3 A-type Movements

A-type movement is driven by Case and θ-role saturation, and is thus relatively easy for the parser to "undo" in the course of building up the parse tree. The Case filter and the θ-criterion together conspire to force various DPs to move from a θ-marked position at D-Structure so as to occupy a position where Case is assigned by S-Structure.

This means that the "undoing" of A-type movement will be triggered by the identification of a DP in a Case-marked but non-θ-marked position. The parser begins building a chain for this moved element, and must therefore keep track of a "trace" of it.

The parser will deposit a trace of this moved element in each compatible position it encounters, until the terminal landing site has been encountered. Any non-Case-marked position, whether θ-marked or not, is a possible landing site. If the position is not θ-marked, it is an intermediate landing site, while if it is θ-marked, it is (necessarily) the foot of the chain.

Once a suitable landing site for the trace has been found, it will be inserted into the tree structure at this point. A local check of the Case and θ marking of the trace is done to see if all the well-formedness conditions on chains are met at this point. If the trace has been assigned both Case marking and a θ-role, then the A-chain is complete. The store for A-type movement is therefore cleared of its contents. If the Case and θ-role requirements of the trace have not yet been met, the contents of the A-movement stack remain, and the current chain is thus continued.

Let us now consider a detailed example of how the parser actually handles a simple A-type dependency.
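Before turning to the example, the trigger and chain-completion logic just described can be sketched as follows. This is a schematic under the assumption that Case and θ saturation are tracked as booleans; the helper names and the list-based store are mine, not the thesis implementation's.

```python
# Sketch of the A-movement bookkeeping described above. A DP attached
# into a Case-marked but non-theta-marked position triggers chain
# construction: a trace goes into the A-movement store. When the trace
# is later dropped into a landing site, a local check decides whether
# the chain is complete (Case and theta both satisfied) or continues.

a_store = []

def attach_dp(index, gets_case, gets_theta):
    """Attach an overt DP; start an A-chain if theta is still unsatisfied."""
    if gets_case and not gets_theta:
        a_store.append("t%d" % index)

def attach_trace(assigns_case, assigns_theta):
    """Drop the pending trace into a landing site and check the chain."""
    if not a_store or assigns_case:
        return None          # only non-Case-marked positions are landing sites
    if assigns_theta:
        a_store.clear()      # foot of the chain: the A-chain is complete
        return "chain complete"
    return "intermediate landing site"

attach_dp(1, gets_case=True, gets_theta=False)   # e.g. a DP in spec-IP
result = attach_trace(assigns_case=False, assigns_theta=True)
```

A non-Case-marked, non-θ-marked landing site would instead return the "intermediate landing site" outcome, leaving the store's contents in place so the chain continues, as in the text.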
Take an ordinary English sentence such as,

(65) The dog chased the cat.

This sentence is analyzed as having the structure shown below,

(66) [IP [NP the dog]i [I' [I +past] [VP ti chased the cat]]]

At first the buffer, (main) stack, and movement stacks are all empty, while the input stream contains the words of the sentence to be parsed. The parser has only limited access to the input stream; the chunker can remove the first element from the input stream, and project a skeletal X̄ structure for it (recall section 3.1.3). The initial state can thus be represented as follows, where "—" serves to separate visually the various elements in the buffer and the stack.⁷ Recall also that the buffer is a First-In-First-Out (FIFO) structure, while the stack is a Last-In-First-Out (LIFO) construct. In the parser state snapshots shown below, items enter the buffer from the right and leave it from the left, while items enter and leave the stack from the left (i.e., the top of the stack is on the left).

⁷As this example deals with an A-type movement, only the relevant movement stack is shown in these "snapshots"; the other movement stacks remain empty throughout the parse.

Store          Contents
Input          the dog chased the cat
Buffer
Stack
A mvnt stack

As described in section 3.1.5, the chunker operates by projecting skeletal X̄ structure from heads, and placing these chunks in the buffer. Three special cases discussed were (i) the parsing of DPs, (ii) the projection of both a skeletal IP and VP from a tensed verb, and (iii) the processing of the auxiliary verb "do". Thus, at this point, since a DP always functionally selects an NP, the chunker parses the first two elements from the input stream, the and dog, as a chunk, a DP. This complete DP is placed in the buffer.

Store          Contents
Input          chased the cat
Buffer         [DP the dog]
Stack
A mvnt stack

The parser attempts to always have at least two chunks in the buffer at once; it uses this lookahead to avoid backtracking.
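The snapshot conventions can be mirrored directly with two double-ended queues. This is purely illustrative; the thesis describes the data-structure discipline but does not give an implementation, and the helper names are mine.

```python
from collections import deque

# Buffer: FIFO -- chunks enter from the right, leave from the left.
# Stack: LIFO -- items enter and leave from the left (top of stack).
buffer = deque()
stack = deque()

def chunker_emit(chunk):
    buffer.append(chunk)                 # enters the buffer from the right

def shuffle():
    """The 'else' operation: front of buffer moves to top of stack."""
    stack.appendleft(buffer.popleft())

chunker_emit("[DP the dog]")             # the chunker's first output
chunker_emit("[IP [I' [I +past]]]")
shuffle()
# The DP is now at the top of the stack; the IP chunk is at the front
# of the buffer, exactly as in the snapshots that follow.
```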
Whenever there are fewer than two chunks in the buffer, the chunker is invoked to process items from the input stream until there are two or more chunks in the buffer. Since the processing of one item from the input stream may result in more than one chunk (consider the case of a tensed verb, which results in two chunks, a skeletal IP projection and a skeletal VP projection), there may be more than two chunks in the buffer after the chunker finishes its operation. Once there are at least two chunks in the buffer, the chunker relinquishes control to the attacher.

Briefly, the parser's basic order of operation involves four phases. The first priority for the parser is to keep at least two chunks in the buffer, if possible. This will not be possible if the input stream is empty. The second phase involves trying to attach pieces of phrase structure to each other, while the third one is chain construction. Finally, the last phase, which is entered only if nothing else can be done, shuffles material between the buffer and the stack.

Let us now return to the example. At this point in the parse there is now only one chunk in the buffer, and so the chunker is activated to replenish the buffer. This next word, the tensed verb chased, is extracted from the input stream and analyzed as [+past] and chase. The chunker recognizes that the feature [+past] must head an IP, and chase a VP; it builds skeletal X̄ projections headed by these elements, and places them in the buffer. Although IP functionally selects a VP, these two chunks cannot yet be attached, as the VP is not complete.

Store          Contents
Input          the cat
Buffer         [DP the dog] — [IP [I' [I +past]]] — [VP [V' [V chase]]]
Stack
A mvnt stack

The parser now has at least two chunks in the buffer — in fact, there are three. The attacher works with the first element of the buffer and the top element of the stack, trying to put them together somehow.
Since the stack is empty, and no attachment can thus be done, the parser removes the first element from the buffer, and pushes it onto the top of the stack.

Store          Contents
Input          the cat
Buffer         [IP [I' [I +past]]] — [VP [V' [V chase]]]
Stack          [DP the dog]
A mvnt stack

The DP at the top of the stack needs to be assigned both Case and a θ-role. Since there is Case to be assigned in the specifier of IP position, the DP is licensed to appear there. The attacher carries out the attachment; the structure resulting from any attachment is always placed at the front of the buffer. The attacher also recognizes that the θ requirements of the DP have not been satisfied. This, as described above, is the trigger for the parser to start constructing an A-chain. A trace of the DP is placed in the A-movement stack. Since there is now a trace awaiting attachment, a position satisfying its Case and θ requirements will be searched for in subsequent processing.

Store          Contents
Input          the cat
Buffer         [IP [DP the dog]i [I' [I +past]]] — [VP [V' [V chase]]]
Stack
A mvnt stack   ti

There are currently at least two chunks in the buffer (in fact, exactly two), so the chunker is not called upon to replenish it. No attachments can be done since the stack is empty, and no chain construction can be pursued in this particular configuration either. Hence, the first element is removed from the buffer and pushed onto the stack.

Store          Contents
Input          the cat
Buffer         [VP [V' [V chase]]]
Stack          [IP [DP the dog]i [I' [I +past]]]
A mvnt stack   ti

The buffer now contains fewer than two chunks, and so the chunker becomes active in order to place more items in the buffer, thereby re-establishing the parser's lookahead. The chunker examines the input stream, and places the chunk the cat in the buffer.

Store          Contents
Input
Buffer         [VP [V' [V chase]]] — [DP the cat]
Stack          [IP [DP the dog]i [I' [I +past]]]
A mvnt stack   ti

The attacher finally gets its turn again.
The VP and the IP cannot be attached to each other in any fashion,⁸ so the movement stacks are checked. The A-movement stack contains a trace which requires a θ-role (but must not be assigned Case, as it has already been assigned Case marking), and the verb chase assigns a θ-role to the specifier position of its VP (but no Case marking). The attacher can therefore attach the trace into this position. As this A-chain is now complete, no further trace is placed into the A-movement stack.

⁸Recall that any phrase must be complete before it can be attached to another phrase; all (functional or lexical) selection must be satisfied first. Once a piece of phrase structure has been incorporated into another, this incorporated piece is unavailable for further manipulation; hence a correct parse could not possibly be achieved if an incomplete phrase were attached to some parse tree fragment.

Store          Contents
Input
Buffer         [VP ti [V' [V chase]]] — [DP the cat]
Stack          [IP [DP the dog]i [I' [I +past]]]
A mvnt stack

As nothing more can be done in this configuration, the first element of the buffer is pushed onto the stack.

Store          Contents
Input
Buffer         [DP the cat]
Stack          [VP ti [V' [V chase]]] — [IP [DP the dog]i [I' [I +past]]]
A mvnt stack

The buffer once again contains fewer than two chunks, but since the input stream is empty the chunker is unable to replenish the buffer. Processing thus continues in this configuration. The DP the cat requires both Case marking and a θ-role, and chase assigns Case and a θ-role to the complement of the VP position; the attacher carries out the attachment. Since the Case and θ requirements of this DP are satisfied there is no A-movement going on, and no trace needs to be placed in the A-movement stack.

Store          Contents
Input
Buffer         [VP ti [V' [V chase] [DP the cat]]]
Stack          [IP [DP the dog]i [I' [I +past]]]
A mvnt stack

The VP is finally complete, and so it and the IP can be connected. This forms the final parse tree.
Store          Contents
Input
Buffer         [IP [DP the dog]i [I' [I +past] [VP ti [V' [V chase] [DP the cat]]]]]
Stack
A mvnt stack

This structure is pushed onto the (empty) stack, to make sure that the buffer is also empty. At this point everything except the stack is empty. The stack contains a single element. The parse was thus successful.

Other A-type movements are handled in an analogous fashion.

4.4 Wh-movement

Wh-movement is a type of Ā-movement. Ā dependencies are more difficult to determine than A dependencies. A dependencies are constrained by Case and θ theory; each member of an A-chain is therefore very easy to identify. This is not the case for Ā-chains, and the extremities of an Ā-chain cause special problems for the parser (the foot causes difficulties in languages like English, the head in those like Chinese).

4.4.1 S-Structure Wh-movement

Let us now consider a case of S-Structure wh-movement, as is exhibited in,

(67) Who did you kiss?

The structure which the parser is to recover for this sentence is,

(68) [CP whoi [IP did you [VP kiss ti ]]]

Again, the parser begins parsing in the following configuration.

Store            Contents
Input            who did you kiss
Buffer
Stack
X° mvnt stack
A mvnt stack
Ā mvnt stack

The chunker reads items from the input stream, projecting a skeletal X̄ structure for each, until there are at least two chunks in the buffer. It begins with the wh word who.⁹

Store            Contents
Input            did you kiss
Buffer           [DP who]
Stack
X° mvnt stack
A mvnt stack
Ā mvnt stack

Next the chunker encounters did. This word phonologically/morphologically reflects the presence of tense features. The syntactic theory holds that tense features are generated under the I node, and move, via X°-movement, to the C node.
The parser detects this X°-movement, placing the appropriate tense features on the X°-movement stack.

⁹In English it is clear that a wh-phrase situated at the front of a clause has been displaced (unless one adopts the Vacuous Movement Hypothesis [Chomsky 86a]), and one might question why the parser does not immediately begin constructing an Ā-chain. While this reasoning applies to English, the parser does not know which language it is dealing with and thus cannot make any assumptions about the location of any element until it is attached into a phrase structure fragment by the attacher. (The chunker does violate this principle in the case of head movement, as discussed in section 4.2.)

Store            Contents
Input            you kiss
Buffer           [DP who] — [CP [C' [C do]]]
Stack
X° mvnt stack    [+past]
A mvnt stack
Ā mvnt stack

Nothing can be done in this configuration, so the first element in the buffer is removed and pushed onto the stack.

Store            Contents
Input            you kiss
Buffer           [CP [C' [C do]]]
Stack            [DP who]
X° mvnt stack    [+past]
A mvnt stack
Ā mvnt stack

The chunker is again required to "fill up" the buffer, so that it contains at least two chunks.

Store            Contents
Input            kiss
Buffer           [CP [C' [C do]]] — [DP you]
Stack            [DP who]
X° mvnt stack    [+past]
A mvnt stack
Ā mvnt stack

It is now possible for the attacher to attach the DP who to the specifier of CP position. Since this DP has not yet received Case marking nor a θ-role, it is clear that it must move. This movement must be of the Ā type; an A-chain is always headed by a Case-marked position.
A trace of the DP is thus placed on the Ā-movement stack.

Store            Contents
Input            kiss
Buffer           [CP [DP who]i [C' [C do]]] — [DP you]
Stack
X° mvnt stack    [+past]
A mvnt stack
Ā mvnt stack     ti

Again, the first element from the buffer is removed and pushed onto the stack.

Store            Contents
Input            kiss
Buffer           [DP you]
Stack            [CP [DP who]i [C' [C do]]]
X° mvnt stack    [+past]
A mvnt stack
Ā mvnt stack     ti

The next item from the input stream is analyzed by the chunker. This item is the verb kiss. The chunker knows that there must be an IP projection before the VP (through both functional selection and the presence of inflection on the verb), so an empty IP is placed in the buffer before the VP.

Store            Contents
Input
Buffer           [DP you] — [IP [I' [I ]]] — [VP [V' [V kiss]]]
Stack            [CP [DP who]i [C' [C do]]]
X° mvnt stack    [+past]
A mvnt stack
Ā mvnt stack     ti

Since nothing can be done in this configuration, the first element from the buffer is removed and pushed onto the stack.

Store            Contents
Input
Buffer           [IP [I' [I ]]] — [VP [V' [V kiss]]]
Stack            [DP you] — [CP [DP who]i [C' [C do]]]
X° mvnt stack    [+past]
A mvnt stack
Ā mvnt stack     ti

The empty IP is now at the front of the buffer, and the X°-movement can be resolved. The tense features are deposited in the I node, and the X°-movement stack is cleared of its contents.

Store            Contents
Input
Buffer           [IP [I' [I +past]]] — [VP [V' [V kiss]]]
Stack            [DP you] — [CP [DP who]i [C' [C do]]]
X° mvnt stack
A mvnt stack
Ā mvnt stack     ti

Since the I node contains tense features, it is now able to assign Case marking to the specifier of the IP node. This allows the attacher to attach the DP you into this position. This DP now has Case marking, but no θ-role, so an A-chain must be constructed.
A trace of this DP, properly coindexed with an index distinct from that of the Ā dependency also in progress, is placed in the A-movement stack.

Store            Contents
Input
Buffer           [IP [DP you]j [I' [I +past]]] — [VP [V' [V kiss]]]
Stack            [CP [DP who]i [C' [C do]]]
X° mvnt stack
A mvnt stack     tj
Ā mvnt stack     ti

The first element of the buffer is removed and pushed onto the stack.

Store            Contents
Input
Buffer           [VP [V' [V kiss]]]
Stack            [IP [DP you]j [I' [I +past]]] — [CP [DP who]i [C' [C do]]]
X° mvnt stack
A mvnt stack     tj
Ā mvnt stack     ti

The VP is now at the front of the buffer. Since the specifier of VP position is assigned a θ-role but no Case marking, the A dependency can now be resolved. The trace is taken from the A-movement stack, and is placed in the VP-internal subject position. Since all the Case and θ requirements of this A-chain are now satisfied, no further trace is placed in the A-movement stack.

Store            Contents
Input
Buffer           [VP tj [V' [V kiss]]]
Stack            [IP [DP you]j [I' [I +past]]] — [CP [DP who]i [C' [C do]]]
X° mvnt stack
A mvnt stack
Ā mvnt stack     ti

Although there are now fewer than two items in the buffer, the chunker cannot supply more chunks as the input stream is empty. There is nothing that the attacher can do in this configuration, so the VP is removed from the buffer and pushed onto the stack.

Store            Contents
Input
Buffer
Stack            [VP tj [V' [V kiss]]] — [IP [DP you]j [I' [I +past]]] — [CP [DP who]i [C' [C do]]]
X° mvnt stack
A mvnt stack
Ā mvnt stack     ti

The attacher is now considering the right edge of the VP; there is a Case- and θ-marked position to the right of the verb. Since there is no overt NP which can be placed in this position, the attacher consults the movement stacks to see if there is any trace which would be compatible with this position.
Of course, the Ā-chain can now be resolved.

Store            Contents
Input
Buffer           [VP tj [V' [V kiss ti]]]
Stack            [IP [DP you]j [I' [I +past]]] — [CP [DP who]i [C' [C do]]]
X° mvnt stack
A mvnt stack
Ā mvnt stack

The VP, now complete, can be attached into the complement position of the IP. Functional selection licenses this attachment.

Store            Contents
Input
Buffer           [IP [DP you]j [I' [I +past] [VP tj [V' [V kiss ti]]]]]
Stack            [CP [DP who]i [C' [C do]]]
X° mvnt stack
A mvnt stack
Ā mvnt stack

Similarly, functional selection licenses the attachment of the complete IP into the complement of CP position. This results in a complete and correct parse tree for this sentence.

Store            Contents
Input
Buffer           [CP [DP who]i [C' [C do] [IP [DP you]j [I' [I +past] [VP tj [V' [V kiss ti]]]]]]]
Stack
X° mvnt stack
A mvnt stack
Ā mvnt stack

4.4.2 LF Wh-movement

Given what has been presented so far, it does not appear that Ā-movement should cause any fundamental difficulties for the parsing mechanism. However, if the parser is to operate in accordance with the minimal assumptions outlined, then LF wh-movement should be an unparsable construction. Languages such as Chinese have LF wh-movement, and it is to Chinese that we now turn our attention.

Some examples of direct questions in Chinese are shown below. "Shenme", the question word corresponding to the English "what", appears in its D-Structure position in all the sentences.

(69) a. Ni kanjian le shenme?
        you see ASP what
        "What did you see?"

     b. Ni shuo ni kanjian le shenme?
        you say you see ASP what
        "What did you say you saw?"

     c. Ni renwei ni shuo ni kanjian le shenme?
        you think you say you see ASP what
        "What did you think you said you saw?"

It is clear that the question word can be arbitrarily far away from the position where it should appear at LF, the matrix clause specifier of CP (compare the English translations, where "what" occupies this position, and see figure 4.17).
The difficulty which arises while the parser is processing the sentence is that it has no way of knowing that it is dealing with a wh construction until it encounters the very last word of the sentence. It seems inevitable, given the standard analysis of wh-movement, that the parser will have to use either unbounded lookahead or backtracking in order to produce an accurate parse tree.

It seems to be a fact of language that overt movement in general, and overt wh-movement in particular, is leftward. Rightward movement, while not ruled out, is severely constrained [Ross 67, Grosu 73]. While there is nothing within the theory which forces the movement to be leftward — no directionality of movement is imposed directly on the rule move-α — GB theory seems to make the strong assumption that all long-distance movement, whether it be overt or covert, is leftward (except under those very constrained conditions where rightward movement can occur).

Figure 4.17: English LF representation

From the left-to-right nature of the parser's operation and its filler-driven strategy for constructing movement chains, it falls out that the filler must come before (to the left of) the gap. A parsing-based explanation for this directional bias of move-α thus emerges.

At first sight, this explanation does not seem to help in the case of covert wh-movement; here the filler is located in its D-Structure position, and it seems that there is no way to relate such a wh-phrase to its proper LF position in the specifier of CP. There is, however, a straightforward way to get around this difficulty: assume that covert movement is rightward. There is nothing in the syntactic theory to rule out the LF representation shown in figure 4.18.

With this LF analysis for Chinese wh-movement, the parser can construct chains in basically the same manner as for English.
The parser does not start to construct a chain until it encounters the filler (the wh word). Instead of leaving the filler in this position and placing a trace in the Ā-movement stack, the parser places the filler in the Ā-movement stack, and leaves a properly coindexed trace behind. This filler is carried along towards the right until a final landing site for it is found; the parser will deposit intermediate traces along the way as required.

Two distinct types of filler-driven parsing are thus engaged in by the parser (as reported in [Alphonce et al. 92, Davis et al. 92]). The first is filler-driven gap-locating parsing (exemplified by English wh-movement), while the second is filler-driven gap-creating parsing (Chinese wh-movement). Gap-creating parsing allows the parser to start building a chain of covert movement when it has identified the filler. The filler is moved to the right, and a gap is created in the S-Structure position of the filler. Note that this is in contrast to the case of gap-locating movement, which leaves the filler in place and creates a trace which is then moved along.

Figure 4.18: Proposed Chinese LF representation

It is easy for the parser to identify whether an element should undergo gap-locating or gap-creating movement, as Case and θ conditions determine the type of movement that is required (see table 4.2). The Case and θ properties of an element in a given position unambiguously identify the type of movement it must undergo in order to saturate its Case-marking and θ-role requirements. Consider an element which has Case-marking but no θ-role. Since the head of an A-chain is Case-marked while its foot is assigned a θ-role, clearly this element must undergo A-movement. Furthermore, since it is lacking a θ-role, it must be a case of gap-locating movement.
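These diagnostics, summarized in table 4.2, reduce to a lookup on the two properties. The boolean encoding and the `movement_type` function below are my own illustrative rendering (with "A-bar" standing for Ā), not part of the thesis implementation.

```python
# Movement-type diagnostics from the Case/theta properties an element
# exhibits in the position where the parser finds it (cf. table 4.2):
#   (needs Case, needs theta) -> gap-locating A-bar movement
#   (has Case,   needs theta) -> gap-locating A movement
#   (needs Case, has theta)   -> gap-creating A-bar movement
#   (has Case,   has theta)   -> gap-creating A movement

def movement_type(has_case, has_theta):
    mode = "gap-creating" if has_theta else "gap-locating"
    chain = "A" if has_case else "A-bar"
    return mode, chain
```

For example, English "who" attached into the specifier of CP with neither Case nor a θ-role comes out as gap-locating Ā-movement, matching the S-Structure wh example above.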
In contrast, an element which has a θ-role but requires Case-marking must undergo gap-creating Ā-movement.

Filler-driven gap-locating parsing is used to recover S-Structure dependencies, while filler-driven gap-creating parsing is used to recover LF dependencies. Since movement chains can interfere with each other only if they are established at the same level of representation, the parser uses two sets of movement stacks, a gap-locating set and a gap-creating set.

Table 4.2: Movement-type diagnostics

            Type of Chain Formation
            Gap Locating                Gap Creating
            Ā-movement   A-movement     Ā-movement   A-movement
Case        needs        has            needs        has
θ-role      needs        needs          has          has

Let us now consider an example of how the parser deals with Chinese wh-movement. The sentence to be parsed is,

(70) Ni kanjian le shenme?

The assumed structure which the parser should recover is,

(71) [CP [IP Ni [I' kanjian le ti ]] shenmei ]

Note that "le" is an aspectual marker, and is treated by the parser as part of the verb. The initial state of the parser is as in the previous cases,

Store                          Contents
Input                          ni kanjian-le shenme
Buffer
Stack
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)

The chunker starts things off by processing the first item from the input stream. This first item is the pronoun "ni"; the chunker puts a DP chunk containing "ni" into the buffer.

Store                          Contents
Input                          kanjian-le shenme
Buffer                         [DP ni]
Stack
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)

Since the buffer only contains one item, the chunker is called upon to provide more chunks. In processing the next word from the input, the verb "kanjian-le", two chunks are produced and placed in the buffer.

Store                          Contents
Input                          shenme
Buffer                         [DP ni] — [IP [I' [I
+asp]]] — [VP [V' [V kanjian-le]]]
Stack
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)

The first element from the buffer is pushed onto the stack.

Store                          Contents
Input                          shenme
Buffer                         [IP [I' [I +asp]]] — [VP [V' [V kanjian-le]]]
Stack                          [DP ni]
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)

The DP is attached into the specifier of IP position. Since in Chinese verbs assign their external θ-roles to the specifier of IP position, the Case and θ properties of the DP are satisfied, and no A-chain is constructed (in contrast to the English case).

Store                          Contents
Input                          shenme
Buffer                         [IP [DP ni] [I' [I +asp]]] — [VP [V' [V kanjian-le]]]
Stack
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)

Again, nothing can be done and the stack is empty; the first element of the buffer is therefore pushed onto the stack.

Store                          Contents
Input                          shenme
Buffer                         [VP [V' [V kanjian-le]]]
Stack                          [IP [DP ni] [I' [I +asp]]]
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)

More chunks need to be placed in the buffer, as it now contains fewer than two. The chunker processes the next item from the input stream, "shenme", and places the chunk based on it in the buffer.

Store                          Contents
Input
Buffer                         [VP [V' [V kanjian-le]]] — [DP shenme]
Stack                          [IP [DP ni] [I' [I +asp]]]
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)

The first element from the buffer is pushed onto the stack.

Store                          Contents
Input
Buffer                         [DP shenme]
Stack                          [VP [V' [V kanjian-le]]] — [IP [DP ni] [I' [I +asp]]]
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)

Since the input stream is empty, no more chunks can be added to the buffer. Hence processing must continue in the present configuration; a gap-creating movement is initiated. "Shenme" is a wh-phrase which needs to move at LF to the specifier of some CP. It is to be attached into a position which is both Case-marked and assigned a θ-role.
This is the indication to the parser that it must construct a chain using gap-creation. Therefore, a trace is inserted into the complement position of the VP, and the DP itself is placed in the gap-creating Ā-movement stack.

Store                          Contents
Input
Buffer                         [VP [V' [V kanjian-le] ti ]]
Stack                          [IP [DP ni] [I' [I +asp]]]
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)   Ā: [DP shenme]i

The VP is now complete, so its attachment into the IP, licensed by functional selection, is done.

Store                          Contents
Input
Buffer                         [IP [DP ni] [I' [I +asp] [VP [V' [V kanjian-le] ti ]]]]
Stack
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)   Ā: [DP shenme]i

The parser can do nothing but push the first element from the buffer onto the stack in this configuration.

Store                          Contents
Input
Buffer
Stack                          [IP [DP ni] [I' [I +asp] [VP [V' [V kanjian-le] ti ]]]]
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)   Ā: [DP shenme]i

Although the stack contains a single phrase structure fragment and the buffer is empty, not all the movement stacks are void of content. Therefore, the parse is not yet complete. There is an Ā-dependency which needs to be resolved. The presence of the wh-phrase in the gap-creating Ā-movement stack, and the presence of a single IP in the stack, signals to the parser that the landing site for the wh-phrase must be created. The IP is dropped back into the buffer.

Store                          Contents
Input
Buffer                         [IP [DP ni] [I' [I +asp] [VP [V' [V kanjian-le] ti ]]]]
Stack                          [CP [C' [C ]]]
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)   Ā: [DP shenme]i

The IP is attached into the complement position of the CP.
Store                          Contents
Input
Buffer                         [CP [C' [IP [DP ni] [I' [I +asp] [VP [V' [V kanjian-le] ti ]]]]]]
Stack
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)   Ā: [DP shenme]i

Nothing can be done, so the first element from the buffer is pushed onto the stack.

Store                          Contents
Input
Buffer
Stack                          [CP [C' [IP [DP ni] [I' [I +asp] [VP [V' [V kanjian-le] ti ]]]]]]
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)   Ā: [DP shenme]i

The Ā-dependency can finally be resolved. The specifier of CP is on the right, and so the parser can in this configuration attach the DP from the movement stack into this position, completing the parse tree.

Store                          Contents
Input
Buffer                         [CP [C' [IP [DP ni] [I' [I +asp] [VP [V' [V kanjian-le] ti ]]]]] [DP shenme]i ]
Stack
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)

Finally, the parser moves the completed parse tree onto the stack, to ensure that the buffer is empty, which it is. The state is:

Store                          Contents
Input
Buffer
Stack                          [CP [C' [IP [DP ni] [I' [I +asp] [VP [V' [V kanjian-le] ti ]]]]] [DP shenme]i ]
Gap-locating stacks (X°/A/Ā)
Gap-creating stacks (X°/A/Ā)

4.5 Revised Parsing Model

The parser engages in two types of chain-creation — gap-locating and gap-creating. Since these correspond to application of the rule move-α at different levels of the grammar, there will be no interference between chains constructed using gap-location and those built using gap-creation. Each type of chain-creating process can build any type of chain, either X°, A, or Ā. The set of movement stacks used to implement Relativized Minimality must therefore be refined to reflect the chain-creation process used.
In other words, the parser has access to a set of gap-locating movement stacks and a set of gap-creating movement stacks.

4.6 Empty Operator Constructions

We will now study a type of construction which, again, poses a dilemma for the parser under the assumptions of sections 2.14 and 3.2. Since several detailed examples of how the parser functions have been presented earlier in the chapter, I will simply describe (without the aid of parser "snapshots") how the parser deals with these constructions. Example parses for these types of constructions are shown in appendix A.2.

Recall the examples of empty operator constructions:

(72) a. Jennifer is the woman to watch Marlene.
     b. Jennifer is the woman to tell the police to watch Marlene.

(73) a. Jennifer is the woman to watch.
     b. Jennifer is the woman to tell the police to watch.

Under the processing assumptions, Browning's analysis (section 2.10.4) predicts that the parser must backtrack in order to parse either an SGC or an OGC. The structure which should follow "the woman" cannot be determined until the end of the clause. As the (b) sentences show, this may be an unbounded distance away. Given the parsing assumptions, processing difficulties seem inevitable. The path I choose to pursue is that of maintaining both the linguistic and computational assumptions made earlier, while challenging the structure assigned to these clauses.

Abandoning Browning's analysis, I must offer an alternative. Pesetsky [Pesetsky 82] suggests that PRO may appear on the right. If this proposal is extended to any non-Case-marked specifier, a unified analysis emerges.

(74) a. [NP ] [CP Oi [IP [I' [VP ... ]] proi ]]
b.
[NP ] [CP O_i [IP [I' [VP t_i ] ] PRO_ARB ]]

The same structure is built at the start of the clause in both instances: an empty operator (which need not be assumed to be pro).

In the case of a subject-gap clause, the extended projection principle (EPP) requires the presence of a subject; pro is inserted on the right, yielding an Ā-bound pro [Cinque 90]. It is coindexed with the operator, thus avoiding a violation of the prohibition against vacuous quantification.

In the case of an object-gap clause, the parser finds a gap in the V' node, and coindexes this gap with the operator. The EPP requires the presence of a subject, hence PRO is inserted. This PRO cannot be bound by the operator as well; it would have to be parasitic in some sense, but this would violate the anti-c-command constraint on parasitic gaps.

A slight complication arises here. The specifier of VP position is a position in which a DP-trace can occur. The Empty Category Principle holds that such a trace must be properly governed, and thus governed. By the PRO theorem, PRO must be ungoverned. Therefore, PRO cannot remain in the specifier of VP position, but must instead move to the specifier of IP position (which is ungoverned in untensed clauses).

The parser constructs the proper structure for both subject-gap and object-gap clauses with no extra machinery. The difference between the two simply falls out as a result of the interaction of two linguistic principles, the EPP and the constraint against vacuous quantification.

Chapter 5

Implications

The purpose of this chapter is to discuss the various syntactic and computational implications of the parsing model I am proposing.

A basic assumption I make is that all parameterization to account for cross-linguistic variation must occur in the competence module; the performance module is not parameterized.
Conversely, I also make the assumption that any competence constraint is parameterized, and varies cross-linguistically, while any constraint which applies without change to all languages is a performance constraint. This has far-reaching implications as to the division of labour between competence and performance.

A case in point is Relativized Minimality (RM). As presented by Rizzi [Rizzi 90], RM does not allow for any variation across languages. I am therefore forced to cast RM as an artifact of the processing mechanism rather than as a principle of grammar. The parsing model presented herein incorporates RM in a natural manner.

I adopt the hypothesis that all long-distance dependencies are parsed in a filler-driven manner. Together with the processing assumptions, this predicts that long-distance overt movement must always be leftward, and similarly that long-distance covert movement is rightward. Conversely, long-distance leftward movement must be overt, while long-distance rightward movement is forced to be covert.

The fact that the directionality of movement and the level of representation at which the movement takes place are directly linked yields a performance explanation for this curious linear asymmetry. Since this is a performance constraint, it is predicted that long-distance overt movement in all languages is leftward, and long-distance covert movement in all languages is rightward. As far as I know, this is the case. It is also predicted that the restrictiveness of overt rightward movement will be mirrored for covert leftward movement.

Moreover, it is predicted that all possible configurations for the CP projection should be instantiated. As reported in [Davis et al. 92, Alphonce et al. 92], all possibilities are actually present. Figure 5.19 illustrates the possibilities.
Examples of languages associated with the various configurations are: (I) English; (II) Farsi, Vietnamese; (III) Vata; (IV) Chinese, Japanese.

Figure 5.19: Possible CP projections

The fact that there is never an overt CP specifier in languages with LF wh-movement, another curiosity, now receives a straightforward explanation: languages with covert wh-movement do not have CP specifiers on the left.

The interaction of the linguistic with the processing assumptions forces abandonment of Browning's [Browning 87] proposed structure for infinitival relative clauses. Under Browning's analysis, one or the other of subject-gap and object-gap infinitival relative clauses should be unparsable, which clearly is not the case. I propose a structure which circumvents the processing difficulties, and which is in some ways a theoretically more pleasing analysis than that of Browning.

A fallout of the gap-locating and gap-creating mechanisms for chain construction is that it is entirely feasible to build a one-pass deterministic parser based on a multistratal model of grammar. In particular, the multiple levels of GB theory pose no difficulty for a properly formulated parsing mechanism. Movements at different levels of the grammar are viewed by the parser simply as different types of chain building.

Furthermore, by adopting an appropriately articulated parsing model, it is reasonable to claim that this parsing scheme is applicable cross-linguistically. The practical consequences of this are numerous.

If one's goal is to produce multi-lingual applications, then an immediate benefit is that complex rule sets need not be developed for each new language. Once the parsing mechanism is in place, only the lexicon and the parameter settings need be specified (along with any language-specific information contained in the chunker, if any). A related benefit is that the amount of required maintenance for the parser is decreased.
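The idea that a new language requires only a lexicon and parameter settings can be pictured as a per-language configuration record consumed by a single fixed mechanism. The sketch below is an illustration in Python, not the Prolog implementation of appendix A; the record layout and function are mine, while the parameter values follow the text (English wh-movement is overt, at S-Structure; Chinese wh-movement is covert, at LF) and the lexical entries come from the appendix's mini dictionary:

```python
# Per-language specification: the parsing mechanism is shared, and only a
# lexicon plus parameter settings differ between languages.  The record
# layout and dispatch function are illustrative, not the thesis parser.

ENGLISH = {
    "wh_level": "S-Structure",                       # overt wh-movement
    "lexicon": {"dog": "N", "cat": "N", "chase": "V", "the": "D"},
}

CHINESE = {
    "wh_level": "LF",                                # covert wh-movement
    "lexicon": {"gou": "N", "mao": "N", "zhui": "V"},
}

def chain_builder_for_wh(params):
    """One fixed mechanism dispatching on a competence parameter:
    overt (S-Structure) wh-movement is handled by gap location,
    covert (LF) wh-movement by gap creation."""
    if params["wh_level"] == "S-Structure":
        return "gap-locating"
    return "gap-creating"
```

The function itself never changes between languages; only the competence object passed to it does, which is the division of labour argued for in this chapter.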
Testing and maintenance resources need not be distributed amongst a set of language-specific parsers, but can all be concentrated on the single parsing mechanism.

The task of machine translation (at least as far as the syntactic component is concerned) is greatly simplified. Sharp [Sharp 85] developed an English-Spanish translation system based on GB theory, while Crocker [Crocker 88] developed an English-German one. Each system uses the same underlying mechanisms for both languages. Extending this idea, the implication is that a (relatively) simple translation method can be used for any pair of languages.

Chapter 6

Related Work

In this chapter I discuss some recent work which is related in some manner to the topic of this thesis. First I consider work directly concerned with the proper method to use in processing more than one language. Second I look at some other recent work which, while not directly concerned with cross-linguistic parsing at the moment, is nonetheless interesting.

6.1 Three Approaches

Mazuka & Lust [Mazuka et al. 90] argue that there are three different paths one can follow in parsing more than one language. One can, for example, claim that there is a single parsing scheme which is sufficient for all languages. That is the route that I have taken in this thesis, as have, for example, Frazier & Rayner [Frazier et al. 88] and Sato [Sato 88]. The other obvious road to take is to reject this hypothesis completely, and to build separate parsers for each new language. Lee, Chien, Lin, Huang, and Chen [Lee et al. 91] seem to take this route, at least with respect to English and Chinese. Finally, there is the middle ground, which Mazuka & Lust themselves occupy. They argue for a parsing scheme which is "universal" yet parameterized; by this they mean that there is a family of parsers for different languages, all of them deriving from the same basic components, and varying minimally due to a parameter setting.
These proposals are considered briefly in the following three sections.

6.1.1 Language-Specific Parsing

Although their goal is not to produce a psycholinguistically plausible parsing scheme, Lee, Chien, Lin, Huang, and Chen [Lee et al. 91] are striving to produce an efficient parser for Chinese. They note that much work in natural language processing has been concerned with parsing "several western languages such as English" [p. 347]. Hypothesizing that a probable reason for the dearth of Chinese natural language processing systems stems from the fact that the syntactic structures of Chinese and English differ, they proceed to present a parser "specially designed for the Chinese language" [p. 347].

Although they refer to GB theory, their parser is based on a collection of phrase structure rules. They use a bidirectional chart parser, which could in itself be useful for a multitude of languages; the phrase structure rules are language-specific, however.

Thus, in tailoring their parser specifically to Chinese, they have limited the applicability of the parser to this one language. Although the bidirectionality of the parser renders it implausible as a model of human language processing, this feature of the parser could no doubt be exploited in parsing languages with radically different structures from that of Chinese.

6.1.2 Parameterized Parsing

Mazuka & Lust [Mazuka et al. 90] argue that certain constraints which Universal Grammar (UG) imposes on possible grammars of natural languages directly influence the parsing mechanism used to process any given language. In particular, they hypothesize that there is a parameter in UG which determines the branching direction of a language, and that this parameter affects the form of the parsing mechanism. Thus, they reject the assumption I make that the parsing mechanism is fixed cross-linguistically.

They note that a top-down approach, commonly used for right-branching languages,
predicts that left-branching constructions are difficult to parse. In conjunction with an assumption of universality of the parsing mechanism, this predicts that left-branching languages should be harder to parse than right-branching languages. This, however, is not the case. Mazuka & Lust therefore propose that right-branching languages, such as English, are parsed using a top-down approach, while left-branching languages, such as Japanese, use a bottom-up parsing algorithm. They claim that although universal, the human language processing system is parameterized; they claim further that this parameterization is due to the setting of a competence parameter, the branching direction parameter.

Mazuka & Lust themselves "recognize that neither a pure top-down nor a pure bottom-up approach will be appropriate for a complete model of natural language processing" [p. 181]. This is clearly the case for languages which exhibit mixed branching patterns, such as Chinese (which is head-final in its CP projection, but head-initial in its VP projection, for instance). A binary parameter based on branching direction seems unreasonable given the great diversity of branching patterns cross-linguistically.

Hasegawa [Hasegawa 90] also casts doubt on their conclusion. The evidence which Mazuka & Lust cite does not unambiguously support their conclusion; they admit that "the existing data are not always consistent and in no way conclusive at this time" [p. 197]. Hasegawa further notes that "although it is an open question how grammar and parsing interact with each other, [it] seems to be preferred, conceptually as well as methodologically, that these two components remain as separate modules, unless some strong evidence suggests otherwise." [p. 216] This is the null hypothesis to which I adhere.
6.1.3 Common Parsing Scheme

Sato [Sato 88] describes the Pattern Oriented Parser (POP), a bottom-up parser which relies on sentence and phrase patterns to parse both left-branching and right-branching languages.

POP builds parse trees according to sentence patterns. These patterns are "... parse tree frames, each of which is associated with one class of verbs... " [p. 22]. It is not clear whether chain wellformedness conditions are implemented, nor is it clear whether there is a fixed number of verb classes cross-linguistically. Even so, POP demonstrates that it certainly is possible to construct a single parsing mechanism which can handle both English and Japanese, in defiance of Mazuka & Lust's claim to the contrary.

Frazier & Rayner [Frazier et al. 88] argue for a universal parsing scheme which, due to the left- or right-branching nature of the language, takes on different forms to satisfy various processing constraints. Addressing the same problem as Mazuka & Lust, they arrive at a fundamentally different solution. Frazier & Rayner are able to maintain the distinction between competence and performance which Mazuka & Lust abandon, as well as the notion of a completely universal parsing scheme. They propose a model which must satisfy constraints such as Late Closure and the First Analysis Constraint (discussed in section 3.3.2) among others. The idea is that the branching direction of the language, determined independently, causes these constraints to be satisfied in different ways by different languages, resulting in differing parsing strategies. As mentioned in section 4.1, this is an indirect means of explanation. I deny that the branching patterns of different languages force the use of a parsing mechanism with a variable amount of top-down versus bottom-up processing; I believe that a fundamentally bottom-up parser with certain top-down aspects is sufficient.
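The flavour of this claim can be illustrated with a toy fragment: a single, fundamentally bottom-up shift-reduce routine whose only top-down aspect is a decision about when to reduce, driven by a branching-direction specification that belongs to the grammar rather than to the parser. The routine and grammar records below are my own illustration, not the thesis parser and not Frazier & Rayner's model:

```python
# Toy bottom-up routine covering both branching patterns.  The control
# mechanism is fixed; only the grammar object (a competence specification)
# varies between "languages".  Purely illustrative.

def reducible(stack, buf, grammar):
    """The small top-down aspect: decide whether to reduce now or to keep
    shifting, by consulting the grammar's branching direction."""
    if grammar["branching"] == "left":
        return True        # reduce eagerly: ((w1 w2) w3)
    return not buf         # delay until input is exhausted: (w1 (w2 w3))

def shift_reduce(words, grammar):
    """Shift words onto a stack; reduce the top two items into a binary
    constituent whenever `reducible` says so."""
    stack, buf = [], list(words)
    while buf or len(stack) > 1:
        if len(stack) >= 2 and reducible(stack, buf, grammar):
            right = stack.pop()
            left = stack.pop()
            stack.append((left, right))
        else:
            stack.append(buf.pop(0))   # shift the next word
    return stack[0]
```

With `{"branching": "left"}` the routine assembles `(("a", "b"), "c")` from `["a", "b", "c"]`, and with `{"branching": "right"}` it assembles `("a", ("b", "c"))`; the control routine itself never changes, only the competence object does.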
Frazier & Rayner's approach does not, however, necessitate the abandonment of a strict division between competence and performance, nor does it invalidate the claim that the parser is not parameterized, as does Mazuka & Lust's proposal.

6.2 Other Research

6.2.1 Long-Distance Ā-dependencies at S-Structure

Dahl, Popowich, and Rochemont [Dahl et al. 91] describe a GB-based parsing system based on Static Discontinuity Grammars (SDGs). They translate GB theory into an SDG, a grammar formalism for which there are efficient parsing methods available. This research is not motivated by psycholinguistic concerns, but it does present some interesting ideas.

The parser does not explicitly construct multiple levels of representation, but constructs instead what they term D/S Structure. This is a representation at which the constraints of both D-Structure and S-Structure are enforced. My parser recovers an LF representation, with constraints at the various levels enforced at the appropriate time during tree construction. Since the grammar is embedded in the parser, this is possible. Dahl, Popowich, and Rochemont do not have this option available to them, as they encode the grammar using an SDG, which is then parsed using a generic SDG parsing algorithm.

6.2.2 A Grammar Tester

Fong [Fong 90] constructs a highly declarative parsing system which can be used as a grammar tester and a grammar-building tool by the linguist. As such, if the grammar it is set up to implement can handle multiple languages, so can the parser. However, it is in no way psycholinguistically plausible, and therefore offers no insight into how the human language processing capacity might be structured.

Chapter 7

Future Work

There are many possible avenues to follow in future work. The first obvious path to travel is the cross-linguistic one: the parsing scheme should be tested on a greater variety of languages. Japanese is an interesting language, because it is a head-final language.
Using the parser with Japanese would allow a direct test of Mazuka & Lust's [Mazuka et al. 90] claim that English and Japanese cannot be parsed using the same parsing scheme. Polish and Hungarian are interesting due to their representation of scope at S-Structure. There are also various non-configurational languages, among them Warlpiri, which do not have a fixed surface word order.

Successful incorporation of a greater number of movement processes, such as Quantifier Raising, and modules of the grammar, such as Binding, would further strengthen the claim that the general parsing scheme is viable.

A fairly simple extension to the parser, though one which is not implemented currently, involves having it handle binding chains in addition to antecedent government chains. Achieving this would involve making the Ā-movement stack somewhat more fine-grained. It would have to be split into an Ā/Binding movement stack as well as an Ā/Antecedent-Government movement stack. The linguistic implications of this would need to be investigated.

An analysis of various unmoved wh-phrases should also be undertaken. Unmoved wh-phrases include so-called wh-in-situ and "echo questions", respectively exemplified by,

(75) Who remembers what we bought where?
(76) Beverley saw Averill with whom?

The difficulty here is that the scope of the unmoved wh-phrase is obviously not established by overt wh-movement. Furthermore, it seems that it cannot be given by covert wh-movement either. The difficulty stems from the possible interpretations for these types of wh-phrases, which differ from those of usual wh-phrases.

An explanation for the difficulty in processing centre-embedding constructions, beyond the usual "memory-limitation" proposal, may be possible. Centre-embedding itself does not seem to lie at the root of people's processing difficulties, as cross-linguistically there are several centre-embedding structures which cause no such difficulties.
In English, object relatives, in which the DP being modified by the relative clause is interpreted with respect to the object position of the verb in the relative clause, are bad when nested. In contrast, subject relatives (in which the modified DP is interpreted with respect to the subject position) can be nested without degradation.

(77) * The rat that the cat that the dog chased caught died.
(78) I saw the dog that chased the cat that caught the rat that died.

Interestingly, in Japanese and Chinese, nested object relatives are grammatical, while nested subject relatives are deviant. If one considers the movement of empty operators in these constructions, it seems as though improper interaction of chains of the same type is responsible for the processing difficulties, and not the fact that the constructions are centre-embedding. More study of this and other centre-embedding constructions is needed, however, before any firm conclusions can be drawn.

Chapter 8

Conclusion

This dissertation has investigated some issues which cast doubt on the existence of a universally applicable parsing scheme for natural language. Although I have not proved that there is a universal parsing scheme for all languages, it is hoped that the parsing model herein presented is a step in the proper direction.

This hope is based on the fact that the parser processes a right-branching language (English) in a primarily bottom-up manner (a mixed strategy, having both top-down and bottom-up aspects, is employed). Mazuka & Lust [Mazuka et al. 90] hold that there is a direct correspondence between the branching direction of a language and the top-down/bottom-up directionality of the parser; they claim that a top-down approach must be used for right-branching languages while left-branching languages require the use of a bottom-up approach.
A mixed strategy should be sufficient to process a language of any branching pattern.

The parser also embodies mechanisms (gap-locating and gap-creating chain construction) which permit it to recover D-Structure and LF information from an S-Structure representation in a single pass. This provides a solution to one of the problems encountered when basing a parser on a multistratal model of grammar.

Moreover, the parser adheres to (at least some of) the principles deemed necessary for a parser to be psycholinguistically plausible. Some aspects of its operation are, unfortunately, not plausible, yet these are related to the particular processing model on which I chose to base the parser; I do not believe that they will pose insurmountable difficulties.

Finally, in conformance with the hypothesis that all universal principles are performance principles and that all cross-linguistically varying conditions belong in the competence module, Relativized Minimality is cast as an artifact of processing constraints. This is one example of how processing considerations can yield explanations for linguistic phenomena where the syntactic theory could account for them only through stipulation. Other examples are the direction asymmetry exhibited by move-α, and the lack of overt CP specifiers on the left in languages with covert wh-movement.

Bibliography

[Abney 87] Steven Abney. Licensing and parsing. In Joyce McDonough and Bernadette Plunkett, editors, Proceedings of NELS 17, Amherst, Massachusetts, 1987. University of Massachusetts at Amherst.

[Abney 91] S. Abney. Parsing by chunks. In Robert C. Berwick, Steven P. Abney, and Carol Tenny, editors, Principle-based Parsing: Computation and Psycholinguistics. Kluwer Academic Publishers, Dordrecht, The Netherlands, 1991.

[Abney et al. 86] Steven Abney and Jennifer Cole. A government-binding parser. In S. Berman, J.-W. Choe, and McDonough, editors, Proceedings of NELS 16, Amherst, Massachusetts, 1986.
University of Massachusetts at Amherst.

[Alphonce et al. 92] Carl Alphonce and Henry Davis. Performance constraints and linguistic explanation. To appear in the proceedings of the Western Conference on Linguistics, 1992.

[Aoun et al. 87] J. Aoun, N. Hornstein, D. Lightfoot, and A. Weinberg. Two types of locality. Linguistic Inquiry, 18(4), 1987.

[Aoun et al. 89] J. Aoun and A. Li. Scope and constituency. Linguistic Inquiry, 20(2), 1989.

[Bresnan 82] J. Bresnan, editor. The Mental Representation of Grammatical Relations. MIT Press series on cognitive theory and mental representation. The MIT Press, 1982.

[Browning 87] M. Browning. Null Operator Constructions. PhD thesis, MIT, 1987.

[Chomsky 81] Noam Chomsky. Lectures on Government and Binding. Studies in Generative Grammar 9. Foris Publications, Dordrecht, The Netherlands, 1981.

[Chomsky 86a] Noam Chomsky. Barriers. Linguistic Inquiry Monographs 13. The MIT Press, 1986.

[Chomsky 86b] Noam Chomsky. Knowledge of Language. Convergence. Praeger Publishers, 1986.

[Chomsky 92] Noam Chomsky. A minimalist program for linguistic theory. Unpublished ms, MIT, 1992.

[Chomsky et al. 92] Noam Chomsky and Howard Lasnik. Principles and parameters theory. In J. Jacobs, A. von Stechow, and W. Sternefeld, editors, Syntax: An International Handbook of Contemporary Research. Walter de Gruyter, Berlin, 1992.

[Cinque 90] Guglielmo Cinque. Types of Ā-Dependencies. Linguistic Inquiry Monographs 17. The MIT Press, 1990.

[Clocksin et al. 84] W. F. Clocksin and C. S. Mellish. Programming in Prolog. Springer-Verlag, second edition, 1984.

[Crocker 88] Matthew W. Crocker. A principle-based system for natural language analysis and translation. Master's thesis, University of British Columbia, 1988.

[Dahl et al. 91] Veronica Dahl, Fred Popowich, and Michael Rochemont. A principled characterization of dislocated phrases: Capturing barriers with static discontinuity grammars. Technical Report
Technical ReportCMPT TR 91-09, School of Computing Science, Simon FraserUniversity, 1991.[Davis 87]^H. Davis. The Acquisition of the English Auxiliary System andIts Relation to Linguistic Theory. PhD thesis, University ofBritish Columbia, 1987.[Davis et al. 92]^Henry Davis and Carl Alphonce. Parsing, WH-movement andlinear asymmetry. To appear in the proceedings of the 22n dmeeting of the North Eastern Linguistics Society, 1992.[Fong 90]^Sandiway Fong. Computational Properties of Principle-BasedGrammatical Theories. PhD thesis, MIT Artificial IntelligenceLaboratory, 1990.[Frazier et al. 88]^Lyn Frazier and Keith Rayner. Parameterizing the languageprocessing system: Left- vs. right-branching within and acrosslanguages. In John A. Hawkins, editor, Explaining LanguageUniversals. B. Blackwell, 1988.[Frazier et al. 89]^Lyn Frazier and Giovanni B. Flores D'Arcais. Filler drivenparsing: A study of gap filling in Dutch. Journal of Memoryand Language, 28(3), 1989.[Gazdar et al. 85]^Gerald Gazdar, Ewan Klein, Geoffrey Pullum, and Ivan Sag.Generalized Phrase Structure Grammar. Basil Blackwell Pub-lisher Ltd., Oxford, UK, 1985.Bibliography^ 108[Grosu 73][Haegeman 91][Hasegawa 90]A. Grosu. On the status of the so-called right roof constraint.Language, 49:294-311, 1973.Liliane M. V. Haegeman. Introduction to Government andBinding Theory. B. Blackwell, 1991.Nobuko Hasegawa. Comments on Mazuka and Lust's paper.In L. Frazier and J. de Villiers, editors, Language Processingand Language Acquisition, pages 207-223. Kluwer, Dordrecht,1990.[Huang 82]^C.-T. J. Huang. Logical Relations in Chinese and the Theoryof Grammar. PhD thesis, MIT, 1982.[Johnsonbaugh 84]^Richard Johnsonbaugh. Discrete Mathematics. MacmillanPublishing Company, New York, NY, 1984.[Kayne 84]^R. Kayne. Connectedness and Binary Branching. Foris, Dor-drecht, 1984.[Koopman et al. 91]^Hilda Koopman and Dominique Sportiche. The position ofsubjects. Lingua, 85:211-258, 1991.[Kuroda 88]^S. Y. Kuroda. Whether we agree or not. 
In William J. Poser,editor, Papers from the Second International Conference onJapanese Syntax. Center for the Study of Language and Infor-mation, Stanford University, 1988.[Kurtzman et al. 91]^Howard S. Kurtzman, Loren F. Crawford, and Caylee Nychis-Florence. Locating WH-traces. In Robert C. Berwick,Steven P. Abney, and Carol Tenny, editors, Principle-basedParsing: computation and psycholinguistics. Kluwer AcademicPublishers, Dordrecht, The Netherlands, 1991.[Lasnik et al. 84]^H. Lasnik and M. Saito. On the nature of proper government.Linguistic Inquiry, 15:235-289, 1984.[Lasnik et al. 88]^Howard Lasnik and Juan Uriagereka. A Course in GB Syntax:Lectures on Binding and Empty Categories. The MIT Press,1988.[LeBlanc 90]^David C. LeBlanc. The generation of phrase-structure rep-resentations from principles. Master's thesis, University ofBritish Columbia, 1990.[Lee et al. 91]^Lin-Shan Lee, Lee-Feng Chien, Long-Ji Lin, James Huang, andK.-J. Chen. An efficient natural language processing systemspecially designed for the Chinese language. ComputationalLinguistics, 17(4), 1991.Bibliography^ 109[Li 90]^Y. Audrey Li. Order and Constituency in Mandarin Chinese.Kluwer Academic Publishers, 1990.[Mahajian 90]^A. Mahajian. The A/A-bar Distinction and Movement Theory.PhD thesis, MIT, 1990.[Marcus 80]^Mitchell P. Marcus. A Theory of Syntactic Recognition forNatural Language. The MIT Press, 1980.[Mazuka et al. 90]^R. Mazuka and B. Lust. On parameter-setting and parsing:Predictions for cross-linguistic differences in adult and childprocessing. In L. Frazier and J. de Villiers, editors, LanguageProcessing and Language Acquisition, pages 163-205. Kluwer,Dordrecht, 1990.[Pesetsky 82]^D. Pesetsky. Paths and Categories. PhD thesis, MIT, 1982.[Pesetsky 87] D. Pesetsky. Wh-in-situ: Movement and unselective binding.In E. Reuland and A. ter Meulen, editors, The Representationof (Inftlefiniteness. The MIT Press, Cambridge, Massachusetts,1987.[Rensink et al. 91]^Ronald A. 
Rensink and Gregory Provan. The analysis of resource-limited vision systems. In Program of the Thirteenth Annual Conference of the Cognitive Science Society, pages 311-316. Lawrence Erlbaum Associates, 1991.

[Rizzi 90] L. Rizzi. Relativized Minimality. Linguistic Inquiry Monographs 16. The MIT Press, 1990.

[Ross 67] J. R. Ross. Constraints on variables in syntax. PhD thesis, MIT, 1967. Distributed by the Indiana University Linguistics Club, Bloomington.

[Sato 88] Paul T. Sato. A common parsing scheme for left- and right-branching languages. Computational Linguistics, 14(1):20-30, 1988.

[Sharp 85] R. Sharp. A model of grammar based on principles of government and binding. Master's thesis, University of British Columbia, 1985.

[Speas 90] M. Speas. Phrase Structure in Natural Language. Kluwer Academic Publishers, Dordrecht, 1990.

[van Riemsdijk et al. 86] Henk van Riemsdijk and Edwin Williams. Introduction to the Theory of Grammar. The MIT Press, 1986.

[Williams 80] Edwin Williams. Predication. Linguistic Inquiry, 11:203-238, 1980.

Appendix A

The Parser

A.1 Implementation

The parsing model described in the main body of the text has been implemented in Prolog, a logic programming language. The purpose of this appendix is to briefly describe the implementation, and provide several examples of the output of the parser.

The implementation faithfully models the parsing mechanism described in the text. The main predicate is parse, which embodies the attacher and the meta-level control in the parser. The input stream, the buffer, the main stack, and the various movement stacks are arguments of the predicate parse.

The chunker is implemented by a predicate named chunk. The input routine used by the chunker is taken from Clocksin & Mellish [Clocksin et al. 84].

Two versions of the parser have been compiled, one for Chinese and one for English. The only differences between the two versions are that:

1.
the Chinese version consults a Chinese lexicon and subcategorization information, while the English version consults an English lexicon and subcategorization information, and

2. the parameter which indicates at which level wh-movement takes place is set to LF for Chinese and S-Structure for English, and

3. the branching directions for some X projections differ, and

4. the chunking procedure differs slightly.

A.2 Examples

The purpose of this section is to present the parser's output for various examples. In order to make the Chinese examples easier to understand, a mini Chinese-English dictionary is provided below.

Nouns
    Chinese   English
    gou       dog
    mao       cat
    ni        you
    shenme    what
    shui      who

Verbs
    Chinese       English
    da            hit
    mai           buy
    xiangxin      believe
    xiangzhidao   wonder
    xihuan        like
    zhui          chase

Miscellaneous
    Chinese   English
    bei       passive marker
    le        aspectual marker

To begin with, consider the English example shown below. It involves a simple A-dependency due to the movement of the subject from the VP-internal position into the IP specifier. Indices are printed in angled brackets, as in < gla=1 >. Indices are identified according to the type of chain building procedure which created them, as well as the type of chain they are in. For example, glab signifies a gap-locating Ā-chain, while gca is the label associated with a gap-creating A-chain. The indices for head movement chains (which are labelled glh or gch) are placed next to the maximal projection of the head, rather than next to the head itself.

[i2 < >
  [d2 < gla=1 >
    ec
    [d1
      d = the
      [n2 < >
        ec
        [n1
          n = dog
          ec ]]]]
  [i1
    i = past
    [v2 < >
      [d2 trace ] < gla=1 >
      [v1
        v = chase
        [d2 < >
          ec
          [d1
            d = the
            [n2 < >
              ec
              [n1
                n = cat
                ec ]]]]]]]]

the dog chased the cat.

This next example is the same sentence, but this time in Chinese.
It is a verysimple example for the parser to deal with, since there is no movement going on.[i2 < >[d2 < >ec[dld = ec[n2 < >ec[nln = gouec]]]]DAi = aspect[v2 < >ec[v1v = zhui[d2 < >ec[d1d = ec[n2 < >ec[n1n = maoec]]]]]]]]gou zhui mao.Appendix A. The Parser^ 114This is an example of raising; note the A-chain. Since there is no Subject Raisingin Chinese, there is no corresponding example in Chinese.[12 < >[d2 < gla=1 >ec[did = ec[n2 < >ec[111n =johnec]]]][iii = present[v2 < >[d2 trace ] < gla=1 >[viv = seem[12 < >[d2 trace ] < gla=1 >[iii = to[v2 < >[d2 trace ] < gla=1[viv = like[d2 < >ec[did = ec[n2 < >ec[nln = maryec]]]]]]]]]]]]john seems to like mary.Appendix A. The Parser^ 115Here we have passive sentence, first in English, then in Chinese. Recall that theparser treats the aspectual marker "le" as part of the verb (hence "da_le").[i2 < >[d2 < gla=1 >ec[d1d = ec[n2 < >ec[nln = billec]]]][i1i = past[v2 < >[d2 trace ] < gla=1 >[v1v = be[v2 < >[d2 trace ] < gla=1 >[vlv = hit[d2 trace ] < gla=1 >]]]]]]bill was hit.[i2 < >[d2 < gla=1 >ec[d1d = ec[n2 < >ec[n1n = billec]]]][iii = bei[v2 < >ec[v1v = da_le[d2 trace ] < gla=1 >]]]]bill bei da_le.Appendix A. The Parser^ 116This is a simple A-movement, a direct question. Compare the output for Englishand Chinese.[c2 < glh=1 >[d2 < glab=2 >ec[dld = ec[n2 < >ec[nln = whoec]]]][cic = do[i2 < glh=1 >[d2 < g1a=3 >ec[did = ec[n2 < >ec[nln = youec]]]][i1i = past[v2 < >[d2 trace ] < g1a=3 >[viv = hit[d2 trace ] < glab=2 >]]]]]]who did you hit.[c2 < >[ci[i2 < >[d2 < >ec[did= ec[n2 < >ec[nln = niec]]]][iii = aspect[v2 < >ec[v1v = da_le[d2 trace ] < gcab=1 >]]]]c = ec][d2 < gcab=1 >ec[dld = ec[n2 < >ec[nln = shuiec]]]]]ni da_le shui.Appendix A. The Parser^ 117This is a slightly more interesting case of A-movement, an indirect question. 
Notice that in the Chinese case, the parser uses the verb's selectional restrictions (the verb "wonder" requires a [+wh] CP as complement) to decide where the A-bar chain is to end.

[i2 < >
  [d2 < gla=1 >
    ec
    [d1 d = ec
      [n2 < >
        ec
        [n1 n = john
          ec]]]]
  [i1 i = past
    [v2 < >
      [d2 trace ] < gla=1 >
      [v1 v = wonder
        [c2 < >
          [d2 < glab=3 >
            ec
            [d1 d = ec
              [n2 < >
                ec
                [n1 n = what
                  ec]]]]
          [c1 c = ec
            [i2 < >
              [d2 < gla=2 >
                ec
                [d1 d = ec
                  [n2 < >
                    ec
                    [n1 n = bill
                      ec]]]]
              [i1 i = past
                [v2 < >
                  [d2 trace ] < gla=2 >
                  [v1 v = buy
                    [d2 trace ] < glab=3 >]]]]]]]]]]

john wondered what bill bought.

[i2 < >
  [d2 < >
    ec
    [d1 d = ec
      [n2 < >
        ec
        [n1 n = john
          ec]]]]
  [i1 i = aspect
    [v2 < >
      ec
      [v1 v = xiangzhidao
        [c2 < >
          [c1
            [i2 < >
              [d2 < >
                ec
                [d1 d = ec
                  [n2 < >
                    ec
                    [n1 n = bill
                      ec]]]]
              [i1 i = aspect
                [v2 < >
                  ec
                  [v1 v = mai_le
                    [d2 trace ] < gcab=1 >]]]]
            c = ec]
          [d2 < gcab=1 >
            ec
            [d1 d = ec
              [n2 < >
                ec
                [n1 n = shenme
                  ec]]]]]]]]]

john xiangzhidao bill mai_le shenme.

This example is very similar to the previous one, except that here the matrix clause verb selects for an embedded clause which is [-wh].

[c2 < glh=1 >
  [d2 < glab=2 >
    ec
    [d1 d = ec
      [n2 < >
        ec
        [n1 n = what
          ec]]]]
  [c1 c = do
    [i2 < glh=1 >
      [d2 < gla=3 >
        ec
        [d1 d = ec
          [n2 < >
            ec
            [n1 n = john
              ec]]]]
      [i1 i = past
        [v2 < >
          [d2 trace ] < gla=3 >
          [v1 v = believe
            [c2 < >
              [d2 trace ] < glab=2 >
              [c1 c = that
                [i2 < >
                  [d2 < gla=4 >
                    ec
                    [d1 d = ec
                      [n2 < >
                        ec
                        [n1 n = bill
                          ec]]]]
                  [i1 i = past
                    [v2 < >
                      [d2 trace ] < gla=4 >
                      [v1 v = buy
                        [d2 trace ] < glab=2 >]]]]]]]]]]]]

what did john believe bill bought.

[c2 < >
  [c1
    [i2 < >
      [d2 < >
        ec
        [d1 d = ec
          [n2 < >
            ec
            [n1 n = john
              ec]]]]
      [i1 i = aspect
        [v2 < >
          ec
          [v1 v = xiangxin
            [c2 < >
              [c1
                [i2 < >
                  [d2 < >
                    ec
                    [d1 d = ec
                      [n2 < >
                        ec
                        [n1 n = bill
                          ec]]]]
                  [i1 i = aspect
                    [v2 < >
                      ec
                      [v1 v = mai_le
                        [d2 trace ] < gcab=1 >]]]]
                c = ec]
              [d2 trace ] < gcab=1 >]]]]]
    c = ec]
  [d2 < gcab=1 >
    ec
    [d1 d = ec
      [n2 < >
        ec
        [n1 n = shenme
          ec]]]]]

john xiangxin bill mai_le shenme.

This is a case of topicalization, another A-bar movement.
Notice that the movement isleftward in both English and Chinese.[i: adjunct[d2 < glab=2 >ec[did = ec[n2 < >ec[n1n =johnec]]]][i2 < >[d2 < gla=1 >ec[d1d = ec[n2 < >ec[nin = maryec]]]][i1i = present[v2 < >[d2 trace ] < gla=1 >[viv = like[d2 trace ] < glab=2john mary likes.[i: adjunct[d2 < glab=1 >ec[dld = ec[n2 < >ec[nin = johnec]]]][i2 < >[d2 < >ec[did = ec[n2 < >ec[nln = maryec]]]][iii = aspect[v2 < >ec[viv = xihuan[d2 trace ] < glab=1 >]]]]]john mary xihuan.Appendix A. The Parser^ 122A subject-gap infinitival relative clause in English.[i2 < >[d2 < gla=1 >ecEd1d = ec[n2 < >ec[n1n =Johnec]]]]Ei1i = past[v2 < >[d2 trace ] < gla=1 >[viv = kissEd: adjunct[d2 < glab=2 >ec[did = the[n2 < >ec[n1n = womanec]]]][c2 < >[op2 < glab=2 >ec[optop = empty_opec]]Ec1c = ec[i2 < >[iii = to[v2 < >[viv = watch[d2 < >ec[d1d= ec[n2 < >ec[n1n = billec]]]]][d2 < glab=2 >ec[did = proec]]]]ec]]]]]illjohn kissed the woman to watch bill.Appendix A. The Parser^ 123An object-gap infinitival relative clause in English.[i2 < >[(12 < gla=1 >ec[d1d = ec[n2 < >ec[nin = johnec]]]][iii = past[v2 < >[d2 trace ] < gla=1 >[viv = kiss[d: adjunct[d2 < glab=2 >ecEd1d = the[n2 < >ec[111n = womanec]]]][c2 < >[op2 < glab=2 >ec[op1op = empty_opec]][clc = ec[i2 < >[ili = to[v2 < >[v1v = watch[op2 trace ] < glab=2 >][d2 trace ] < gca=3 >]][d2 < gca=3 >ec[d1d = bigPRO_arbec]]]]]]]]]]john kissed the woman to watch.The next example is an ambiguous sentence ^ is it [the man to kiss the woman]who John expected to watch the dog, or [the woman to watch the dog] who John expectedthe man to kiss? The parser only pursues a single parse, and comes up with the latteranalysis.Appendix A. 
The Parser^ 124[i2 < >[d2 < gla=1 >ec[did = ec[n2 < >ec[nin =johnec]]]][iii = past[v2 < >[d2 trace ] < gla=1 >[1,1v = expect[i2 < >[d2 < g1a=2 >ec[did = the[n2 < >ec[nin = manec]]]]i = to[v2 < >[d2 trace ] < gla=2 >[viv = kiss[d: adjunct[d2 < glab=3 >ec[dld = the[n2 < >ec[nin = womanec]]]][c2 < >[0p2 < glab=3 >ec[oplop = empty_opec]][cic = ec[i2 < >[iii = to[v2 < >[viv = watch[d2 < >ec[did = the[n2 < >ec[nin = dogec]]]]][d2 < glab=3 >ec[dld = proec]]]]ecil]]]]]]]]]]john expected the man to kiss the woman to watch the dog.Appendix A. The Parser^ 125This is a case of an Exceptional Case Marking (ECM) verb in a context wherethe subject of the embedded clause receives its Case marking from the I node of theembedded clause. Thus, this is not an ECM structure.[i2 < >Cd2 < gla=1 >ec[dld = ec[n2 < >ec[nin = maryec]]]][iii = present[v2 < >[d2 trace ] < gla=1 >v = believe< >[d2 < gla=2 >ec[dld = ec[n2 < >ec[nln = johnec]]]]i = present[v2 < >[d2 trace ] < gla=2 >v = be[d2 < >ec[did = a[n2 < >ec[nin = liarec]]]]]]]]]]]]mary believes john is a liar.Appendix A. The Parser^ 126Finally, a case of an (ECM) structure where the ECM verb assigns Case to thesubject of the embedded clause.[i2 < >[d2 < gla=1 >ec[d1d = ec[n2 < >ec[nln = maryec]]]][i1i = present[v2 < >[d2 trace ] < gla=1 >[v1v = believe[i2 < >[d2 < gla=2 >ec[d1d = ec[n2 < >ec[nln =johnec]]]][iii = to[v2 < >[d2 trace ] < gla=2 >[v1v = be[d2 < >ec[dld = a[n2 < >ec[n1n = liarec]]]Millllmary believes john to be a liar.
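The contrast between the glab indices in the English questions and the gcab indices in the Chinese ones follows from the wh-movement parameter listed at the start of this appendix: overt S-Structure movement leaves a gap the parser must locate (gl), while wh-in-situ forces the parser to create the gap covertly (gc). The sketch below is illustrative only, not the thesis implementation; the PARAMETERS table and the function name are assumptions made for this example.

```python
# Illustrative sketch only -- not the parser's actual code.  The
# PARAMETERS table and the function name are assumed for this example.

PARAMETERS = {
    # wh-movement takes place at S-Structure in English, at LF in Chinese
    "english": {"wh_movement": "s-structure"},
    "chinese": {"wh_movement": "lf"},
}

def wh_chain_label(language):
    """Label for a wh (A-bar) chain in the printed output.

    Overt wh-movement (S-Structure) leaves a gap the parser must
    locate (prefix 'gl'); wh-in-situ (LF movement) means the parser
    must create the gap covertly (prefix 'gc').  The suffix 'ab'
    marks an A-bar chain, as in the indices shown above.
    """
    overt = PARAMETERS[language]["wh_movement"] == "s-structure"
    return ("gl" if overt else "gc") + "ab"

print(wh_chain_label("english"))  # glab, as in "who did you hit"
print(wh_chain_label("chinese"))  # gcab, as in "ni da_le shui"
```

Note that the choice only affects wh (A-bar) chains: the gla indices on subject A-chains appear in both languages above.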


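Every tree printed in this appendix is a bracketing over category labels, lexical material, empty categories (ec), and angle-bracketed chain indices. A minimal well-formedness check for that notation can be sketched as follows; the helper is hypothetical, written only to describe the notation, and is not part of the parser.

```python
# Hypothetical helper for the bracketed output notation above; not
# part of the parser.  Square brackets carry the X-bar structure;
# angle-bracketed index annotations like "< gla=1 >" are ignored.

def balanced(tree_text):
    """True iff the square brackets of a printed parse nest properly."""
    depth = 0
    for ch in tree_text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:          # a ']' with no matching '['
                return False
    return depth == 0

print(balanced("[i2 < > [d2 trace ] < gla=1 > [i1 i = past ec]]"))  # True
```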