UBC Theses and Dissertations


Tier-based locality in long-distance phonotactics: learnability and typology. McMullin, Kevin James. 2016.

Tier-Based Locality in Long-Distance Phonotactics: Learnability and Typology

by

Kevin James McMullin

B.A., The University of North Carolina at Chapel Hill, 2009

A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Faculty of Graduate and Postdoctoral Studies (Linguistics)

The University of British Columbia (Vancouver)

February 2016

© Kevin James McMullin, 2016

Abstract

An important property of any language’s sound system is its phonotactics: the unique way in which it allows its inventory of speech sounds to combine. Interestingly, certain types of phonotactic co-occurrence restrictions found in natural languages may hold across any amount of intervening material. For example, the Samala (Chumash) language of Southern California exhibits a pattern of sibilant harmony, such that [s] and [ʃ] may not co-occur anywhere within the same word (e.g. /ha-s-xintila-waʃ/ becomes [ha-ʃ-xintila-waʃ] ‘his former gentile name’; Applegate, 1972).

Long-distance dependencies like this, despite being relatively common cross-linguistically, are known to pose serious problems for learnability. A learner needs an enormous amount of computational power to discover an interaction in an unbounded search space defined by arbitrary distances, resulting in patterns that are not learnable in practice. Their existence in natural languages thus suggests that humans are equipped with cognitive learning biases that restrict the available hypothesis space and facilitate the learning of patterns with certain properties but not others.

This dissertation presents a series of artificial language learning studies that support the hypothesis that the typology of locality relations in long-distance consonantal phonotactics is shaped, at least in part, by such biases. From a theoretical perspective, the goal is to explore and define the boundaries of the human learner’s hypothesis space for phonotactic patterns.
I argue that the seemingly simple constraints used in the Agreement by Correspondence framework (Rose and Walker, 2004; Hansson, 2010a; Bennett, 2013) generate many pathological patterns that are unattested cross-linguistically. By contrast, the properties of locality observed for patterns of long-distance consonant agreement and disagreement belong to a well-defined and relatively simple class of subregular formal languages (stringsets) called the Tier-based Strictly 2-Local languages (TSL2; Heinz et al., 2011). I therefore argue that the class of TSL2 stringsets offers an excellent approximation of the boundaries of possible, human-learnable phonotactics. More generally, I suggest that the formal-language-theoretic approach can be used to inform phonological theory, allowing for a better understanding of the computational complexity and learnability of predicted patterns.

Preface

All of the experimental work presented henceforth was conducted in the Language and Learning Laboratory at the University of British Columbia. All experiments and associated methods were approved by the University of British Columbia’s Research Ethics Board [certificate #H13-00857].

Portions of the text related to Experiment 1 (Section 2.2) and the discussion of modular learning (Section 4.1) have been modified from published material that described preliminary results of Experiment 1 [McMullin, K. and Hansson, G. Ó. (2014). Locality in long-distance phonotactics: evidence for modular learning. In Iyer, J. and Kusmer, L., editors, Proceedings of the 44th meeting of the North Eastern Linguistic Society, volume 2, pages 1–14. GLSA Publications, University of Massachusetts, Amherst, MA.]. I was the lead investigator for this project, responsible for all major areas of concept formation, experimental design, stimulus preparation, data collection and analysis, as well as manuscript composition.
Gunnar Ólafur Hansson was involved throughout the project, in concept formation, experimental design, data analysis, and manuscript composition.

Portions of the text related to the Agreement by Correspondence framework and the predictions thereof (Sections 2.3 and 3.1), and the Tier-based Strictly 2-Local characterization of long-distance phonotactics (Section 4.2) have been modified from an unpublished manuscript, which is based on a paper presented at the 2014 Annual Meeting on Phonology [McMullin, K. and Hansson, G. Ó. (2015). Long-distance phonotactics as Tier-based Strictly 2-Local languages. Unpublished ms. University of British Columbia.]. I was the lead investigator, and both authors contributed to concept formation, data analysis, and manuscript composition.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgments
1 Introduction
  1.1 Typology and Learning Bias
  1.2 Experimental Methodology
  1.3 Typology of Long-Distance Consonant Interactions
    1.3.1 Similarity
    1.3.2 Distance
    1.3.3 Blocking
  1.4 Theoretical Frameworks
    1.4.1 Optimality Theory
    1.4.2 Formal Language Theory
  1.5 Central Claims
  1.6 Structure of the Dissertation
2 Locality Relations in Consonant Harmony
  2.1 Unbounded and Transvocalic Consonant Harmony
  2.2 Consonant Harmony in Artificial Language Learning
    2.2.1 Experimental Methodology
      Recruiting Participants
      Stimuli
      Experimental Design: Three Phases
      Procedures
    2.2.2 Experiment 1: Liquid Harmony
      2.2.2.1 Participants
      2.2.2.2 Training Conditions for Experiment 1
      2.2.2.3 Results and Analysis
      2.2.2.4 Summary and Discussion
  2.3 Consonant Harmony as Agreement by Correspondence
    2.3.1 Deriving Unbounded and Transvocalic Harmony in ABC
    2.3.2 Biased Learning of ABC Constraint Rankings
  2.4 Summary and Conclusions
3 Problems with the ABC Approach
  3.1 Limited Typological Support for Predictions of ABC
    3.1.1 Pathological Harmony Patterns
      3.1.1.1 Agreement by Proxy
      3.1.1.2 Sensitivity to Number and Parity of Potential Correspondents
    3.1.2 Pathological Dissimilation Patterns
      3.1.2.1 Dissimilation in the ABC Framework
      3.1.2.2 Pathological Prediction: Beyond-Transvocalic Dissimilation
      3.1.2.3 Ranking Paradox: Basic Transvocalic Dissimilation
  3.2 Limited Experimental Support for ABC Predictions
    3.2.1 Summary of Predictions
    3.2.2 Experiment 2: Liquid Dissimilation
      3.2.2.1 Methodology
      3.2.2.2 Results and Analysis
      3.2.2.3 Discussion of Experiment 2
    3.2.3 Analysis of Successful Learners in Experiments 1 and 2
      3.2.3.1 Motivation for Additional Analysis
      3.2.3.2 Defining a Threshold for Successful Learning
      3.2.3.3 Results for Successful Learners in Experiments 1 and 2
      3.2.3.4 Discussion: Successful Learners in Experiments 1 and 2
  3.3 Summary and Conclusions
4 Locality in Formal Language Theory: A Tier-Based Solution
  4.1 Strictly Local and Strictly Piecewise Languages
    4.1.1 Learning Bias and the Argument for Modular Learning
    4.1.2 Evidence Against this Approach from Blocking
  4.2 Tier-Based Strictly 2-Local Languages
    4.2.1 Consonant Harmony from the TSL2 Perspective
    4.2.2 Locality as a Consequence of the Tier
  4.3 Pathological Patterns That Are Not TSL2
  4.4 Experiments 3 and 4: Rich-Stimulus Training
    4.4.1 Motivation for Experiments 3 and 4
    4.4.2 Methodology
      4.4.2.1 Participants, Stimuli, and Procedure
      4.4.2.2 Training Conditions
    4.4.3 Results and Analysis
      4.4.3.1 Results of Experiment 3
      4.4.3.2 Results of Experiment 4
    4.4.4 Individual Results and General Discussion
  4.5 Summary and Conclusions
5 Questions About the TSL2 Approach
  5.1 Is the TSL2 Region Too Big?
    5.1.1 Kinyarwanda Sibilant Harmony With Blocking
    5.1.2 Latin Liquid Dissimilation
    5.1.3 Experimental Learning of Arbitrary Tiers
    5.1.4 Conclusion: The TSL2 Region is Not Too Big
  5.2 Is the TSL2 Region Computationally Learnable?
    5.2.1 Summary of the Tier-based Strictly Local Inference Algorithm
    5.2.2 Example of the 2TSLIA: Sibilant Harmony with Blocking
    5.2.3 Conclusion: The TSL2 Class Is Learnable
  5.3 Is the TSL2 Region Too Small?
    5.3.1 Multiple Non-Conflicting TSL2 Patterns
      5.3.1.1 Tamashek Tuareg: Two Long-Distance Dependencies
      5.3.1.2 Imdlawn Tashlhiyt: Sibilant Harmony With Partial Blocking
    5.3.2 Multiple Conflicting TSL2 Dependencies
      5.3.2.1 Revisiting Tamashek Tuareg
      5.3.2.2 Samala Sibilant Harmony Overrides Palatalization
    5.3.3 TSL2 Constraints in Phonological Theory
    5.3.4 Conclusion: The TSL2 Region Is Not Too Small
  5.4 Summary and Conclusions
6 Summary and Conclusions
  6.1 Empirical Findings
  6.2 Assessing Theoretical Approaches
  6.3 Outstanding Issues
  6.4 Conclusion
Bibliography
A Full List of Stimuli Used in Experiments
B Statistical Analyses
  B.1 Experiment 1 (16 Subjects)
  B.2 Experiment 1 (12 “Successful Learners”)
  B.3 Experiment 2 (16 Subjects)
  B.4 Experiment 2 (12 “Successful Learners”)
  B.5 Experiment 3 (16 Subjects)
  B.6 Experiment 4 (16 Subjects)

List of Tables

Table 2.1  Attested and unattested variants of consonant harmony locality
Table 2.2  Breakdown of stimuli used in Experiments 1 through 4
Table 2.3  Example of testing trials for all experiments
Table 2.4  Examples of training items in Experiment 1
Table 2.5  Experiment 1: Logistic regression analysis
Table 2.6  Experiment 1: Summary of odds ratios
Table 2.7  Factorial typology of consonant harmony with ABC constraints
Table 3.1  Example of training items in Experiment 2
Table 3.2  Experiment 2: Logistic regression analysis
Table 3.3  Experiment 2: Summary of odds ratios
Table 3.4  Experiments 1 and 2: Summary of odds ratios for “successful learners”
Table 4.1  Example tier-based strings for a hypothetical word [pireʃaʃolus]
Table 4.2  TSL2 grammars for three types of sibilant harmony
Table 4.3  Three types of segments in a TSL2 grammar
Table 4.4  Unattested variants of long-distance locality
Table 4.5  Example of S-Harm-M-Faith training in Experiment 3
Table 4.6  Example of M-Diss-S-Faith training in Experiment 4
Table 4.7  Experiments 3 and 4: Summary of odds ratios
Table 5.1  TSL2 grammars for three attested variants of sibilant harmony
Table 5.2  Number of possible TSL2 grammars for Σ = {s, ʃ, p, t, a}
Table 6.1  Typology of long-distance locality relations
Table 6.2  Summary of training conditions for Experiments 1 through 4
Table 6.3  Factorial typology of ABC with different locality-based constraints
Table 6.4  Formal language characterization of dependencies with various locality relations
Table 6.5  String grammaticality for different TSL2 grammars
Table A.1  List of stimuli used in the practice phase
Table A.2  Training stimuli with liquids at “Medium-range”
Table A.3  Training stimuli with liquids at “Short-range”
Table A.4  Training stimuli with no liquids
Table A.5  List of stimuli used in testing phase
Table B.1  Experiment 1: Mixed-effects logistic regression (16 subjects per group, Short-range baseline)
Table B.2  Experiment 1: Mixed-effects logistic regression (16 subjects per group, Medium-range baseline)
Table B.3  Experiment 1: Mixed-effects logistic regression (16 subjects per group, Long-range baseline)
Table B.4  Experiment 1: Mixed-effects logistic regression (12 “successful learners” per group, Short-range baseline)
Table B.5  Experiment 1: Mixed-effects logistic regression (12 “successful learners” per group, Medium-range baseline)
Table B.6  Experiment 1: Mixed-effects logistic regression (12 “successful learners” per group, Long-range baseline)
Table B.7  Experiment 2: Mixed-effects logistic regression (16 subjects per group, Short-range baseline)
Table B.8  Experiment 2: Mixed-effects logistic regression (16 subjects per group, Medium-range baseline)
Table B.9  Experiment 2: Mixed-effects logistic regression (16 subjects per group, Long-range baseline)
Table B.10  Experiment 2: Mixed-effects logistic regression (12 “successful learners” per group, Short-range baseline)
Table B.11  Experiment 2: Mixed-effects logistic regression (12 “successful learners” per group, Medium-range baseline)
Table B.12  Experiment 2: Mixed-effects logistic regression (12 “successful learners” per group, Long-range baseline)
Table B.13  Experiment 3: Mixed-effects logistic regression (16 subjects per group, Short-range baseline)
Table B.14  Experiment 3: Mixed-effects logistic regression (16 subjects per group, Medium-range baseline)
Table B.15  Experiment 3: Mixed-effects logistic regression (16 subjects per group, Long-range baseline)
Table B.16  Experiment 4: Mixed-effects logistic regression (16 subjects per group, Short-range baseline)
Table B.17  Experiment 4: Mixed-effects logistic regression (16 subjects per group, Medium-range baseline)
Table B.18  Experiment 4: Mixed-effects logistic regression (16 subjects per group, Long-range baseline)

List of Figures

Figure 1.1  Basic model of learning
Figure 1.2  The Chomsky hierarchy
Figure 1.3  The subregular hierarchy
Figure 2.1  Results of Experiment 1: M-Harm and S-Harm learning
Figure 2.2  Results of Experiment 1: M-Harm generalization
Figure 2.3  Results of Experiment 1: S-Harm generalization
Figure 3.1  Results of Experiment 2: M-Diss and S-Diss learning
Figure 3.2  Results of Experiment 2: M-Diss generalization
Figure 3.3  Results of Experiment 2: S-Diss generalization
Figure 4.1  The subregular hierarchy
Figure 4.2  Results of Experiment 3: M-Harm-S-Faith and S-Harm-M-Faith learning
Figure 4.3  Results of Experiment 3: M-Harm-S-Faith generalization
Figure 4.4  Results of Experiment 3: S-Harm-M-Faith generalization
Figure 4.5  Results of Experiment 4: M-Diss-S-Faith and S-Diss-M-Faith learning
Figure 4.6  Results of Experiment 4: M-Diss-S-Faith generalization
Figure 4.7  Results of Experiment 4: S-Diss-M-Faith generalization
Figure 4.8  Individual results for M-Harm-S-Faith group (Exp. 3)
Figure 4.9  Individual results for M-Diss-S-Faith group (Exp. 4)

Acknowledgments

I am so grateful for so many people who encouraged, helped, inspired, and motivated me as I completed this task; it was not easy enough to be done on my own.

I will first acknowledge that my wife Ashley has provided me with a ridiculous amount of support during the last few years. While I can recognize certain sacrifices she made and the amount of faith she put in my abilities, I do not believe that I can fully understand how difficult it is to be on the other side of life as a Ph.D. student. Nonetheless, I offer her an assuredly heartfelt “Thank you!”, along with the hope that our adventure in Vancouver was both fun and worthwhile.
I love you.

I owe a great deal of thanks to my parents Stephen and Rosalie, and to my sister Nancy, who have always supported me and have rarely questioned my lofty goals. Furthermore, they continue to be excellent role models. Since I began as an undergraduate, all three of them have obtained a mid-career Masters or Ph.D. while retaining a full-time position. This astounds me, and it was often an unspoken reminder that my job was comparatively rather straightforward.

Like many others in this field, I was initially drawn to linguistics by the enthusiasm of those teaching my introductory courses. I therefore want to recognize two instructors from my time at UNC: Melissa Frazier, who was perhaps the best teacher I have ever had (despite writing her dissertation at the time) and Jen Smith, who developed my interest in phonology, and who provided me with much-needed advice and supervision from afar as I applied to graduate programs.

While at UBC, I benefited greatly from intellectual discussions and social endeavours with nearly all of my fellow graduate students. I extend my thanks to all of them, while recognizing but a subset by name: James Crippen, Ella Fund-Reznicek, Joash Gambarage, Michael McAuliffe, Stacey Menzies, Lauren Quinn, Jennifer Abel, Alexis Black, Masaki Noguchi, Sonja Thoma, Blake Allen, Andrei Anghelescu, Natalie Weber, Michael Schwan, Erin Guntly, Adriana Osa-Gómez, Emily Sadlier-Brown, Oksana Tkachman, Avery Ozburn, and Megan Keough.

I had the pleasure of working closely with a number of UBC faculty members who have influenced my research and my perception of what it means to be a scholar. In addition to those who taught my graduate courses, I would like to mention Kathleen Currie Hall and Molly Babel, who provided excellent demonstrations of what it will take to succeed early in my career.

As the non-phonologist on my committee, Carla Hudson Kam’s contributions were practical in nature, and essential to my success. She provided the facilities and equipment for all of my experiments, she stressed the importance of methodological detail and statistical rigour in my research, and her role in meetings was often to ensure that my tasks could actually be achieved on time. I am left wondering how anyone completes a Ph.D. without Carla on their committee.

It is difficult to overstate the impression that Doug Pulleyblank has made on me. I have observed and interacted with him in his role as a teacher, as a researcher, as Graduate Advisor, as Department Head, and as a member of my committee. I continue to be struck by his respect for others, his leadership, his general curiosity, his brilliance, and his focus on the importance of everyone’s work-life balance.

Inevitably, however, the person whose influence is most apparent in my work is Gunnar Hansson. I could not have asked for a better supervisor and I would not have been successful without him. Despite spending hundreds of hours together discussing research, I cannot pinpoint why exactly we work so well together. I can only hope that I will be able to exemplify similar qualities as a person and a linguist.

I have also benefited from the input and advice of several other people. In particular, discussion and correspondence with Will Bennett, Jeff Heinz, Adam Jardine, and Rachel Walker have provided more insight than they may realize, and my research is significantly better because of them.
Furthermore, Joe Stemberger and Joel Friedman (University Examiners), as well as Sharon Rose (External Examiner) offered several suggestions that have improved my dissertation.

Finally, I point out that my dissertation research was supported financially by a SSHRC Doctoral Fellowship, a UBC Faculty of Arts Graduate Research Award, and SSHRC Insight Grant 435–2013–0455 to Gunnar Ólafur Hansson.

Chapter 1

Introduction

This dissertation explores the boundaries of the human language learner’s hypothesis space with respect to the types of sound patterns that can be inferred from exposure to linguistic data. I focus mainly on phonotactic dependencies that hold between non-adjacent consonants and, more specifically, on the types of locality relations that are found in such patterns. The primary goal is to establish a computationally well-defined set of sound patterns that are predicted to be human-learnable. This is achieved by assessing the range of cross-linguistically attested patterns in light of experimental results from a series of artificial language learning studies.

As an example of a dependency that holds between non-adjacent consonants, consider the case of sibilant harmony in Samala (Ineseño Chumash; Applegate, 1972), in which two sibilants are not permitted to co-occur anywhere within the same word unless they agree in anteriority. Evidence of this can be seen when the sibilants that surface as [+anterior] segments [s] or [tsʰ] in the word [sapitsʰolus] ‘he has a stroke of good luck’ instead surface as [–anterior] segments when the perfective suffix [-waʃ] is added, becoming [ʃapitʃʰoluʃwaʃ] ‘he had a stroke of good luck’ (Applegate, 1972, p. 119).

Patterns like this, which sometimes hold across a great number of segments or syllables (e.g. /k-su-k’ili-mekeken-ʃ/ becomes [kʃuk’ilimekeketʃ] ‘I straighten myself up’; Applegate, 1972, p. 119), are known to pose serious problems for learnability.
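The Samala restriction can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: words are represented as hand-segmented lists of symbols, the four-sibilant inventory is limited to the segments mentioned in the examples above, and the function names are hypothetical. It ignores every non-sibilant and then checks that adjacent sibilants on the resulting tier agree in anteriority, which is one way to express a dependency that holds at any distance.

```python
# Illustrative sketch only: the inventory and all names are assumptions,
# not a description of the analysis developed in later chapters.
ANTERIOR = {"s", "tsʰ"}        # [+anterior] sibilants from the examples
NON_ANTERIOR = {"ʃ", "tʃʰ"}    # [-anterior] sibilants from the examples
SIBILANTS = ANTERIOR | NON_ANTERIOR


def project_sibilants(segments):
    """Return only the sibilants of a word, in their original order."""
    return [seg for seg in segments if seg in SIBILANTS]


def obeys_sibilant_harmony(segments):
    """True if every pair of tier-adjacent sibilants agrees in anteriority."""
    tier = project_sibilants(segments)
    for left, right in zip(tier, tier[1:]):
        if (left in ANTERIOR) != (right in ANTERIOR):
            return False
    return True


# Attested surface forms from the text, segmented by hand.
unsuffixed = ["s", "a", "p", "i", "tsʰ", "o", "l", "u", "s"]
suffixed = ["ʃ", "a", "p", "i", "tʃʰ", "o", "l", "u", "ʃ", "w", "a", "ʃ"]
# Hypothetical disharmonic form: the suffix is added without harmony applying.
disharmonic = unsuffixed + ["w", "a", "ʃ"]
```

Because the check inspects only the projected sibilants, the number of intervening vowels and non-sibilant consonants has no effect on the outcome.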
These learnability problems arise both from a general perspective on the cognitive limits of human learning (see, e.g., Creel et al., 2004; Newport and Aslin, 2004; Gebhart et al., 2009) and for computational models of learning (see, e.g., Heinz, 2007, 2010; Hayes and Wilson, 2008; Heinz et al., 2011; Goldsmith and Riggle, 2012; Jardine and Heinz, 2015). The core issue is that if sound patterns can hold across unknown, arbitrary, and potentially unbounded distances, the search space of possible patterns is simply too large for any learner (human or otherwise) to traverse efficiently. Nonetheless, long-distance dependencies are robustly attested in natural language, suggesting that some kind of cognitive learning biases must be present that enable successful learning of the exhibited sound patterns. The central questions guiding this dissertation are as follows:

• Does the typology of attested non-adjacent phonotactic patterns reflect the properties and inductive biases of human learning mechanisms?

• Do existing theories of long-distance consonant interactions over- or under-predict with respect to the range of patterns supported by empirical data?

Upon investigating the above questions, it becomes clear that current approaches do not necessarily offer the correct set of predictions about the typology and learnability of long-distance dependencies.
This motivates a third question:

• Can a computational (formal-language-theoretic) definition of the notion of a tier be incorporated into phonological theory in order to improve predictions about the set of possible, human-learnable phonotactic patterns?

The present research therefore contributes to a growing body of literature that highlights the importance of a computational foundation for phonological theory, and an explanation of the relationship between the typology and the learnability of linguistic patterns (see also, e.g., Heinz, 2007, 2010; Finley, 2008; Moreton, 2008, 2012; Hayes and Wilson, 2008; Lai, 2012; Morley, 2015).

The remainder of this chapter establishes a context for asking the above questions and defines the scope of my dissertation research. Section 1.1 outlines the basic assumptions I make about the relationship between linguistic typology and language learning. Section 1.2 motivates the need for further behavioural data about the learnability of linguistic sound patterns and summarizes the experimental paradigm of artificial language learning, a methodology that has seen a growing amount of attention in the literature, and which represents a sizeable portion of the research presented in this dissertation. In Section 1.3, I introduce the empirical focus of my dissertation, long-distance consonantal phonotactics, summarizing the basic cross-linguistic properties with respect to similarity, locality relations, and blocking, as well as any existing experimental evidence of learning biases that are associated with these aspects of the typology. Section 1.4 then outlines two distinct theoretical frameworks, Optimality Theory and formal language theory, and summarizes previous theoretical accounts of long-distance consonant interactions.
Section 1.5 states the central claims that will be made in this dissertation, and Section 1.6 outlines the structure of the remaining chapters.

1.1 Typology and Learning Bias

Patterns that are observed in natural language, irrespective of how they first arise, must be learnable in order to be acquired by a new generation of speakers. Cross-linguistic typological distributions may therefore provide an indirect window on properties of the human language learner, such as restrictions on the available hypothesis space or heuristics employed to navigate that space. Such learning biases, be they domain-specific (i.e. at play only in linguistic or phonotactic learning) or domain-general (applying also when learning, for example, visual or non-linguistic auditory patterns) are a major factor in shaping and constraining typological variation (see, e.g., Wilson, 2006; Finley and Badecker, 2007; Kirby et al., 2008; Moreton, 2008, 2012; Scott-Phillips and Kirby, 2010; Culbertson, 2012; Culbertson et al., 2012; Rafferty et al., 2013).

Figure 1.1 provides a basic illustration of the connections between a learner, a learner’s hypothesis space, and a set of cross-linguistically attested patterns. Consider a human learner whose goal is to correctly identify a grammar for a language Lx after being exposed to a finite amount of primary linguistic data (from Lx). If the data is drawn from a natural human language, we would certainly expect the learner to acquire the correct grammar. I assume that there are restrictions on the types of languages that the human learning mechanism is capable of identifying correctly, and one of the goals of this research is to determine exactly what that boundary is.
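The view of learning as a function from data to a grammar inside a restricted hypothesis space can be sketched in a few lines. This is a toy illustration under loudly stated assumptions: languages are finite sets of invented strings, the hypothesis space holds just two hypothetical candidates, and the selection rule (pick the smallest candidate consistent with the data) is an assumption made for the example, not a learning algorithm proposed in this dissertation.

```python
from typing import Dict, FrozenSet, Iterable, Optional

Language = FrozenSet[str]


def learn(data: Iterable[str],
          hypothesis_space: Dict[str, Language]) -> Optional[str]:
    """Map training data to the name of a language in the hypothesis space.

    Assumed selection rule: among the candidates that contain every observed
    form, return the smallest one (ties broken alphabetically). Return None
    if no candidate is consistent with the data.
    """
    observed = set(data)
    consistent = [name for name, language in hypothesis_space.items()
                  if observed <= language]
    if not consistent:
        return None
    return min(consistent, key=lambda name: (len(hypothesis_space[name]), name))


# Hypothetical two-language hypothesis space over invented forms.
SPACE: Dict[str, Language] = {
    "harmony": frozenset({"sasa", "ʃaʃa"}),
    "no_restriction": frozenset({"sasa", "ʃaʃa", "saʃa", "ʃasa"}),
}

from_harmony_data = learn(["sasa", "ʃaʃa", "sasa"], SPACE)
from_mixed_data = learn(["sasa", "saʃa"], SPACE)
from_unlearnable_data = learn(["sasa", "sapa"], SPACE)
```

The point of the sketch is only that the output is always drawn from the hypothesis space: data that fit no candidate cannot be mapped to a grammar of their own.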
I note, however, that in order to simplify the present investigation, I largely set aside issues of imperfect learning and assume an idealized learner that is successful whenever the language belongs to the learner’s hypothesis space (though see, e.g., Kiparsky, 1968 on imperfect learning and analogical change).

[Figure 1.1: Basic model of learning as a function that maps input training items to an output language in the range of the learner. Labels: Input: primary linguistic data; Function: learning algorithm with restrictions and biases; Output: grammar for a language in the hypothesis space; Hypothesis space (human-learnable patterns); Attested patterns; languages Lx, Ly, Lz.]

Imagine now that the learner is exposed to data from a hypothetical, unattested language Ly that contains a pattern that does not resemble anything found in the typology of natural language. If such a case were to arise, we might expect that the learner would erroneously map the training data from Ly to an alternative grammar that generates a language that is somewhat similar to Ly, but within range of the learner (Lz in Figure 1.1). The types of linguistic patterns that remain stable, persisting through transitions from one generation of speakers to the next, should be those that are human-learnable.

While the set of long-distance phonotactic dependencies observed in natural language may provide a rough estimate of the learner’s hypothesis space, I also assume that there exist many accidental gaps in the typology, that is, patterns that a human learner would correctly identify if provided the right training data. With respect to phonological patterns in particular, there also exist certain stochastic pressures shaping the typology in terms of which patterns are most likely to arise in or disappear from a language through, e.g., misperception, misproduction, errors in speech planning, and so on (Ohala, 1993; Blevins, 2004; Hansson, 2008; Garrett and Johnson, 2012).
The present research does not attempt to capture these influences on the relative frequency and distribution of all possible patterns, but focuses on establishing a definition of the categorical boundary between possible (human-learnable) and impossible (not human-learnable) long-distance phonotactic patterns, by evaluating the predictions of multiple hypotheses within two theoretical frameworks (see Section 1.4).

1.2 Experimental Methodology

Under the assumption that all patterns exhibited in natural language are learnable, any proposed region of possible languages must contain at least those patterns that are attested. While the range of observed patterns is likely a reasonable estimate of the limits on human learning, it cannot be taken for granted that other, logically possible but unattested patterns are unlearnable. As a means of testing the limits of human learning, this dissertation employs an artificial language learning paradigm. This methodology has become increasingly popular among linguists and cognitive psychologists interested in language learning (e.g. Pycha et al., 2003; Wilson, 2003, 2006; Hudson Kam and Newport, 2005, 2009; Finley and Badecker, 2009; Moreton, 2008, 2012; Finley, 2011, 2012; Moreton and Pater, 2012a,b). One of the main advantages of such methods is that they enable the researcher to overcome the relative rarity (or non-existence) of certain patterns, which makes it unfeasible to use the more traditional method of studying children throughout the process of language acquisition.

In a typical experiment, subjects complete a training phase in which they are exposed to certain forms from an artificial language, constructed by the researcher, that exhibits the pattern of interest. This is followed by a testing phase to determine whether or not they have learned the pattern, and in some cases whether or not they generalize it to novel contexts that were not encountered in training.
The artificial language learning methodology therefore provides an accessible way to obtain data about the learning of any type of phonological pattern, whether it is relatively frequent, rare, or completely unattested across the world’s languages. An additional benefit of the paradigm is that the researcher has total control over the learner’s input, such that direct comparisons between learners who received only slightly different sets of training items are possible. However, the methodology is not without criticisms (for a recent overview of the findings and criticisms of artificial phonology experiments, see Moreton and Pater, 2012a,b). For example, we may not know what biases the learners are bringing in from their own native language, or their language experience as a whole, and there has been little research concerning the relationship between artificial language learning and natural language learning (L1 or L2 acquisition; see Ettlinger et al., 2015, for a summary of the limited evidence, and an argument that artificial language learning is similar to second language learning). It is thus imperative to include a well-constructed control condition that can indicate any biases a learner might come in with or that result from the task itself, so that we can factor them out when performing statistical analyses (Reber and Perruchet, 2003; Finn and Hudson Kam, 2008).

1.3 Typology of Long-Distance Consonant Interactions

While many phonotactic patterns result from interactions between adjacent segments (e.g. voicing or place assimilation in consonant clusters, palatalization of consonants before front vowels), there are several types of phonological patterns that apply even across intervening material. Such patterns have long been a topic of debate in phonological theory (e.g.
Halle and Vergnaud, 1981; Poser, 1982; Steriade, 1987a,b; Odden, 1994; Gafos, 1999; Hansson, 2001, 2010a; Ní Chiosáin and Padgett, 2001; Pulleyblank, 2002; Rose and Walker, 2004; Nevins, 2010; Bennett, 2013). In what follows, I use the term long-distance phonotactics to refer to co-occurrence restrictions on segments in surface forms, primarily with respect to non-adjacent pairs of consonants. Such interactions may be assimilatory or dissimilatory in nature, but must hold between two consonants that are separated by at least an intervening vowel (Hansson, 2010a; Bennett, 2013). For example, as the data in (1) and (2) demonstrate (with evidence from suffix allomorphy), certain languages may require two co-occurring liquids [l, r] to agree (as in Bukusu; Odden, 1994), or to disagree (as in Georgian; Fallon, 1993; Odden, 1994), even when they are separated by several segments.

(1) Liquid harmony in Bukusu (Bantu; Odden, 1994)
    a. teex-el-a   ‘cook for’
    b. lim-il-a   ‘cultivate for’
    c. kar-ir-a   ‘twist’
    d. rum-ir-a   ‘send someone’
    e. reeb-er-a   ‘ask for’

In the data from Bukusu above, (1a) and (1b) show that when a verb stem contains no liquids, or when it contains a liquid [l], the applicative suffix surfaces as [-il] or [-el]. However, when the stem contains [r] as in (1c)-(1e), there is an alternation in the suffix, which becomes [-ir] or [-er]. The resulting generalization is that, in Bukusu, words with *[r…l] are not permitted.¹

The opposite generalization is shown by the Georgian data presented below:

(2) Liquid dissimilation in Georgian (Kartvelian; Fallon, 1993; Odden, 1994)
    a. dan-uri   ‘Danish’
    b. p’olon-uri   ‘Polish’
    c. ungr-uli   ‘Hungarian’
    d. aprik’-uli   ‘African’

The suffix /-uri/ remains faithful in (2a)-(2b), when there is no other [r] preceding the suffix. However, if another [r] does precede it, the suffix surfaces as [-uli], as seen in (2c)-(2d).
The generalization for the data in (2) is that in a word that contains two liquids, they may not both be [r].²

¹ This is the extent of the data as described by Odden (1994, based on his own field notes). Hansson (2010a) further notes that the pattern holds morpheme-internally, and that it may be optional in the longer-range contexts (e.g. [rum-ir-a] ~ [rum-il-a] ‘send for’; cited from the Comparative Bantu Online Dictionary available at http://www.cbold.ish-lyon.cnrs.fr/).

A practical motivation for my focus on long-distance consonant interactions is the relative recency and accessibility of comprehensive typological studies, both for consonant harmony (Hansson, 2001, 2010a; Rose and Walker, 2004) and long-distance consonant dissimilation (Suzuki, 1998; Bennett, 2013). The cross-linguistic distribution of these patterns reveals several interesting asymmetries that I hypothesize are related to human learning biases, but which have not been fully investigated in artificial language learning studies. The remainder of this section summarizes the typology and relevant experimental results for three important aspects of long-distance consonant interactions: the relative similarity of interacting segments, the distance between them, and whether or not the dependency can be blocked when a specific segment intervenes.

1.3.1 Similarity

First, with respect to similarity in terms of shared features, it seems that there is a general dispreference for the co-occurrence of highly similar segments, which can be repaired either with harmony by making them even more similar (perhaps identical), or with dissimilation by differentiating them beyond some threshold. For example, the most common type of consonant harmony (by far; Hansson, 2010a) is sibilant harmony, which prohibits the co-occurrence of two sibilants that do not match for some other feature (e.g. anteriority, voicing, or both).
Likewise, a well-attested type of dissimilation requires disagreement in major-place features among obstruents (Alderete and Frisch, 2007; Bennett, 2013). I note, however, that there is a difference between harmony and dissimilation in the types of features that are most often involved. In contrast to the relatively common patterns of sibilant harmony and major-place dissimilation, there are no attested cases of a language that prohibits *[s…s] and *[ʃ…ʃ], but allows [s…ʃ] and [ʃ…s] (i.e. a form of sibilant dissimilation), nor is there a language that exhibits a pattern of major-place harmony (though the latter is commonplace in child language; Levelt, 2011). The present research largely sets aside issues of similarity and feature specifications (for an extensive investigation of the relationship between the cross-linguistic properties of consonant assimilation and dissimilation, see Bennett, 2013).

² This is a simplified dataset presented for introductory purposes, as the ban on words containing *[r…r] is blocked by an intervening [l]. For a full discussion of the Georgian data, see Section [...].

The experimental literature, linguistic and non-linguistic alike, supports the idea that human learning of a dependency is influenced by the similarity of the two elements that enter into that dependency. Evidence from a number of studies shows that non-adjacent dependencies are more easily learned when the two interacting elements are more similar (e.g. Creel et al., 2004; Newport and Aslin, 2004; Gebhart et al., 2009). I note, however, that evidence from Koo and Oh (2013) suggests that similarity among interacting consonants may be best understood as a contributing factor to learnability rather than a necessary condition for human learning (see Section 5.1.3 for a full description of their study).

1.3.2 Distance

Apart from similarity, the relative distance separating the two segments can also be a conditioning factor.
Odden (1994) describes the Bukusu and Georgian patterns, in (1) and (2), respectively, as unbounded dependencies that hold across any number of non-participating interveners. He further argues that syllable adjacency (i.e. …Cv.C… contexts) is the appropriate characterization of locality for certain non-adjacent phonotactic dependencies (in addition to a third level of locality which requires direct adjacency in the string). Similarly, Pulleyblank (2002) proposes a set of constraints that drive harmony and dissimilation, which are specified for three discrete levels of locality: ‘Distant’ (unbounded), ‘Medium’ (roughly equivalent to syllable-adjacent), and ‘Close’ (string-adjacent).

The starting point of this dissertation is an extensive investigation of the distinction between unbounded patterns, which hold in all …C…C… contexts, and dependencies that apply only within a bounded …Cv.C… window. For consonant harmony in particular, this is a robust dichotomy and there is no other type of restriction on distance that is attested, such as a dependency that holds across at most one intervening consonant.

An example of the dichotomy is illustrated below with cases of sibilant harmony from two related Omotic languages. The data in (3) show an unbounded dependency found in Aari. The perfective suffix /-s/, which stays faithful in (3a), surfaces instead as [ʃ] when it is preceded (at any distance) by a lamino-postalveolar sibilant, as seen in (3b)-(3d).

(3) Unbounded sibilant harmony in Aari (Hayward, 1990)
    a. /baʔ-s-e/   baʔse   ‘he brought’
    b. /ʔuʃ-s-it/   ʔuʃʃit   ‘I cooked’
    c. /tʃʼa̤ːq-s-it/   tʃʼa̤ːqʃit   ‘I swore’
    d. /ʃed-er-s-it/   ʃederʃit   ‘I was seen’

In the data from Koyra below, the 3rd person masculine singular (perfective) suffix /-osːo/ harmonizes with a preceding sibilant when they are separated by a single vowel, as in (4b)-(4c). However, it surfaces faithfully across any greater distance (i.e.
when another surface consonant intervenes), as shown in (4d)-(4e).

(4) Transvocalic sibilant harmony in Koyra (Koorete; Hayward, 1982)
    a. /tim-d-osːo/   tindosːo   ‘he got wet’
    b. /patʃ-d-osːo/   patʃːoʃːo   ‘it became less’
    c. /giːʒ-d-osːo/   giːʒːoʃːo   ‘it suppurated’
    d. /ʃod-d-osːo/   ʃodːosːo   ‘he uprooted’
    e. /ʔatʃ-ut-d-osːo/   ʔatʃutːosːo   ‘he (polite) reaped’

Note that while Rose and Walker (2004) and Bennett (2013) follow Odden (1994) in defining patterns of the latter type in terms of syllable-adjacency (since they are most often observed in a …Cv.Cv… configuration), I instead follow Hansson (2010a) in characterizing them as transvocalic harmony, since the dependencies appear to hold across maximally one vowel (short or long) and are never seen to hold across an intervening consonant. The crucial data needed to distinguish between the syllable-adjacent and transvocalic definitions of locality would be words with at least one closed syllable, such that the two elements of the dependency are in adjacent syllables but are separated by an intervening consonant (e.g. Cvc.Cv or Cv.cvC). Hansson (2010a) notes, however, that many of the potentially informative languages do not allow coda consonants, and those that do only permit a limited set of consonants in coda position. While this topic merits future investigation, Hansson argues that what little evidence there is favours the transvocalic (rather than syllable-adjacent) characterization of these …Cv.C… dependencies.³

For patterns of sibilant harmony, results from artificial language learning studies align with the observed split between transvocalic and unbounded locality. Learners who encounter sibilant harmony only in transvocalic Sv.Sv contexts (where S represents a sibilant [s] or [ʃ]) tend not to generalize to greater distances, but exposure to Sv.cv.Sv harmony, across an intervening consonant, results in the learning of a genuinely unbounded pattern: subjects tend to generalize both to lesser and greater distances.
This is shown by Finley (2011, 2012), who exposed subjects to a left-to-right pattern of sibilant harmony in the form of a [-su] vs. [-ʃu] suffix alternation triggered by a sibilant in the stem. McMullin and Hansson (2014, Experiment 1) replicate this finding with right-to-left directionality, as subjects were exposed to sibilant alternations in ‘verb’ stems triggered by two separate suffixes, [-su] and [-ʃi], which indicated ‘past’ and ‘future’ tense, respectively. Finally, Experiment 1 of this dissertation (see Section 2.2.2) produces a further replication of these results, but with a different type of interaction in terms of the segments involved (i.e. liquids rather than sibilants).

With respect to long-distance consonant dissimilation, results from Koo and Cole (2006) provide evidence that liquid dissimilation can be learned from laboratory exposure to the pattern, but to my knowledge there has been no investigation into the learnability of different locality parameters for such patterns. Experiment 2 (see Section 3.2.2) fills this gap, showing that subjects learn and generalize long-distance patterns of liquid dissimilation in the same way as they do for harmony.

³ Note that in (4d)-(4e), under the standard assumption that the two halves of a geminate consonant straddle the syllable boundary, the two sibilants are technically in adjacent syllables (e.g. [ʔa.tʃut.tos.so] ‘he (polite) reaped’), but do not agree for anteriority. This is part of Hansson’s (2010a) evidence that the dichotomy is between unbounded and transvocalic (not syllable-adjacent) dependencies.

1.3.3 Blocking

In terms of whether or not certain intervening segments can block a long-distance interaction between consonants, the cross-linguistic details have historically been reported as being quite different for harmony and dissimilation.

The typology of consonant harmony, as reported by Hansson (2001), as well as Rose and Walker (2004), reveals a conspicuous absence of systems that exhibit blocking effects.
More recently, however, at least three languages with a genuine case of consonant harmony with blocking have been reported in the literature, including Slovenian (Jurgec, 2011, see Section 4.2.1), Kinyarwanda (Walker and Mpiranya, 2005; Hansson, 2007; Walker et al., 2008, see Section 5.1.1), and Imdlawn Tashlhiyt (Elmedlaoui, 1995; Hansson, 2010b, see Section [...]). By contrast, no such gap in the typology has been proposed for long-distance dissimilation (to my knowledge). This may be in part due to the case of Latin liquid dissimilation, which has been studied extensively in the literature (e.g. Watkins, 1970; Dressler, 1971; Jensen, 1974; Steriade, 1987a; Odden, 1994; Cser, 2010). The traditional description of the pattern is that underlying /-al/ surfaces as [-ar] when an [l] precedes it (e.g. /lun-al-is/ becomes [lun-ar-is] ‘lunar’), but that the dependency is blocked by an intervening [r] (e.g. /flor-al-is/ surfaces faithfully as [flor-al-is], not *[flor-ar-is]).⁴ Despite the relative prominence of the Latin pattern in the literature, however, there do not appear to be as many instances of long-distance consonant dissimilation with blocking as one might think. In particular, Bennett (2013) notes only three cases of long-distance dissimilation that is blocked by an intervening consonant, along with four others that are either blocked in some other way, or that are not fully supported empirically.

There is a need for an investigation of long-distance dependencies with blocking in terms of whether or not humans can learn such patterns in the laboratory, but this remains outside the scope of the present research.
Instead, I will argue for a unified account of the cross-linguistic properties of locality and blocking, and the resulting predictions will serve as the basis for artificial language learning studies in future research.

1.4 Theoretical Frameworks

This dissertation considers several potential characterizations of the learner’s hypothesis space, but is restricted in scope to a comparison of proposals within two theoretical frameworks in particular: Optimality Theory and formal language theory. Each presents different challenges for and predictions about learnability, which are outlined in general terms below.

⁴ Cser (2010) argues that this simple generalization is not sufficient for capturing the full regularity of the pattern. Specifically, he presents evidence from a corpus that shows that dissimilation is also blocked by intervening labial and velar consonants. This is described in detail in Section [...].

1.4.1 Optimality Theory

In Optimality Theory (OT; Prince and Smolensky, 2004), the learner’s hypothesis space is determined by an innate, universal set of ranked and violable constraints. The types of phonotactic patterns that the learner needs to consider are restricted to those generated by the factorial typology (i.e. all possible constraint rankings). If each attested pattern can be generated with at least one such ranking, then the learning problem is quite simple: the learner needs only to find a ranking that accounts for all of the encountered forms. Finding a correct constraint ranking is a relatively straightforward task if the learner has inherent access to the constraint set, and there are several algorithms that can do so (e.g. Boersma, 1997; Tesar and Smolensky, 2000; Goldwater and Johnson, 2003).
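To make the ranking search concrete, the following is a minimal Python sketch. It is not any of the algorithms cited above: it simply enumerates every ranking of a toy constraint set (the factorial typology) and keeps those under which the observed output wins. The two constraints (AGREE, IDENT), the candidate set, and the form sataʃ are hypothetical illustrations rather than analyses drawn from this dissertation.

```python
from itertools import permutations

def agree_sibilants(inp, out):
    """Toy markedness constraint: one violation if the output mixes [s] and [ʃ]."""
    return int("s" in out and "ʃ" in out)

def ident(inp, out):
    """Toy faithfulness constraint: one violation per segment changed from the input."""
    return sum(a != b for a, b in zip(inp, out))

# Hypothetical constraint set; the names are illustrative only.
CONSTRAINTS = {"AGREE": agree_sibilants, "IDENT": ident}

def winner(inp, candidates, ranking):
    """Return the candidate whose violation profile is lexicographically smallest."""
    def profile(out):
        return tuple(CONSTRAINTS[name](inp, out) for name in ranking)
    return min(candidates, key=profile)

def consistent_rankings(data):
    """Brute-force search over all rankings for those that fit every observed form."""
    found = []
    for ranking in permutations(CONSTRAINTS):
        if all(winner(inp, cands, ranking) == obs for inp, cands, obs in data):
            found.append(ranking)
    return found

# One observed mapping: (input, candidate outputs, attested output).
data = [("sataʃ", ["sataʃ", "ʃataʃ"], "ʃataʃ")]
print(consistent_rankings(data))
```

With these toy data, only the ranking in which AGREE dominates IDENT is consistent with the observed form. Exhaustive enumeration is feasible only for very small constraint sets, since the number of rankings grows factorially with the number of constraints.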
In short, the success of these learning algorithms is a result of the structural properties of OT grammars, which may be independent of the constraints themselves (Heinz, 2009; Tesar and Smolensky, 2000; Dresher, 1999 makes a similar argument about learning in Principles and Parameters frameworks).

If a new pattern is discovered that cannot be accounted for with the set of posited constraints, it is not a problem for learnability itself, as the phonologist needs only to formulate a new (though by hypothesis still innate) constraint that can account for the pattern. While there are rough criteria for what constitutes a plausible and well-formed constraint, especially in terms of phonetic grounding (Archangeli and Pulleyblank, 1994; Hayes, 1999), I argue that constraints must exhibit a further computational grounding. For example, the number of potential violations of gradient Align constraints grows quadratically with respect to the length of the word. Aside from the fact that such constraints predict a number of unattested patterns (Eisner, 1997; McCarthy, 2003), Riggle (2004) shows that they are formally too complex to compute optimization over.
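The quadratic growth can be illustrated with a small sketch. It assumes a simplified gradient alignment constraint, in which each segment bearing the aligned property incurs one violation for every segment separating it from the left edge of the word. The function name and the worst-case input (a word in which every segment bears the property) are hypothetical.

```python
def gradient_align_left(word, aligned):
    """Sum, over aligned segments, of their distance from the left edge."""
    return sum(i for i, seg in enumerate(word) if seg in aligned)

for n in (2, 4, 8, 16):
    word = "n" * n  # hypothetical worst case: every segment is aligned
    print(n, gradient_align_left(word, {"n"}))
```

Under these assumptions, a word of length n incurs 0 + 1 + ... + (n - 1) = n(n - 1)/2 violations in the worst case, so doubling the word length roughly quadruples the violation count.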
I note that this dissertation does not present any arguments against constraint-based approaches or optimization in general, but instead suggests that if the goal is to achieve a feasible model of the human grammar and learning mechanism within a constraint-based framework, we need a better understanding of the computational properties underlying individual constraints, and that they should be demonstrably learnable rather than provided a priori (see also, e.g., Ellison, 1992, 1994; Hayes and Wilson, 2008; van de Weijer, 2014).

Specifically, my dissertation focuses on the predictions of Agreement by Correspondence (ABC), a framework within OT for analyzing non-adjacent segmental interactions (Walker, 2000a,c; Hansson, 2001, 2010a; Rose and Walker, 2004). This approach has seen relative success in accounting for the typology of consonant harmony, and has more recently been extended to analyses of long-distance consonant dissimilation (Bennett, 2013, 2015). The idea, motivated by the fact that interaction seems to be facilitated primarily by similarity in terms of shared features (see Section 1.3.1 above), is that a similarity-based surface correspondence relation brings two (non-adjacent) consonants into each other’s purview, and that certain restrictions may be imposed on consonants that are in correspondence. The basic ABC framework is outlined in Section 2.3, and in Chapter 3 I argue that defining the human learner’s hypothesis space in terms of a factorial typology of ABC constraints does not offer a satisfactory approximation of the range of patterns indicated by the empirical evidence.

1.4.2 Formal Language Theory

I investigate issues of computational complexity and learnability of phonotactic patterns within the framework of formal language theory. From this perspective, languages can be thought of as sets of grammatical words, whose members include only the sequences of sounds that are well-formed in the language.
A phonotactic pattern is thus manifested as a restriction on the strings of segments that are permitted in the set. Strings that do not adhere to the pattern will be ungrammatical, and are therefore not members of the stringset. As the scope of this dissertation is limited to phonotactic complexity in particular, the terms pattern and language are used interchangeably in reference to the stringset (or formal language) that reflects the phonotactics of a language.

[Figure 1.2: The Chomsky hierarchy. The shaded region indicates that the complexity of all attested phonotactic patterns seems to be (at most) regular. Diagram labels: (non-computable); Type 0: recursively enumerable; (recursive); Type 1: context-sensitive; Type 2: context-free; Type 3: regular; (finite); (all phonotactics).]

A long-known property of phonological mappings (e.g. input strings to output strings) is that any pattern that can be generated with an ordered set of rewrite rules belongs to the class of regular relations (Johnson, 1972; Kaplan and Kay, 1994). Consequently, as Rabin and Scott (1959) show, all stringsets generated by these relations (i.e. the surface phonotactics) are members of the regular region of the Chomsky hierarchy (Chomsky, 1956), which includes several well-known classes of formal languages that are in a subset relationship, as illustrated in Figure 1.2. Certain syntactic processes are known to result in relatively complex context-sensitive stringsets (strings of words rather than segments; e.g. Culy, 1985; Shieber, 1985; Kobele, 2006), but it turns out that all attested phonological patterns are indeed regular, including long-distance consonant agreement and disagreement (Heinz, 2010; Heinz et al., 2011; Payne, 2014). While the resulting phonotactic patterns are therefore also regular, not every pattern that can be described as a regular stringset is attested in natural language; one unattested example is a dependency that holds between the first and last segments of a word (Lai, 2012).
However, the regular region can be further broken down into a hierarchy of well-studied formal language classes that are proper subsets of the regular languages (the subregular hierarchy; McNaughton and Papert, 1971; Rogers et al., 2010; Heinz et al., 2011; Rogers and Pullum, 2011).

[Figure 1.3: Illustration of the subregular hierarchy. The largest class of formal languages (i.e. Regular) is presented on the top, and subset classes are presented below. Each language class is thus a proper subset of any class that is above it and connected by a line. Subregular language classes that are most relevant to this dissertation are presented in boldface. Classes shown: Regular, Star-Free, Locally Threshold Testable, Locally Testable, Strictly Local, Piecewise Testable, Strictly Piecewise, Tier-based Strictly Local.]

Throughout this dissertation, several of the subregular classes shown in Figure 1.3 will be assessed as potential bases for defining a boundary between possible and impossible phonotactic patterns. In other words, this boundary can be thought of as a learning bias that restricts the hypothesis space for phonotactic patterns to exactly that class of stringsets.

For instance, one type of phonotactic co-occurrence restriction regulates sequences of string-adjacent segments up to some length k (k-factors; roughly equivalent to the concept of n-grams). Each such restriction defines a member of the Strictly k-Local class of formal languages (SLk). An example of a phonological constraint corresponding to an SL2 language would be *[−son, αvoi][−son, −αvoi] (cf. Agree[voice]; Lombardi, 1999), which bans any mixed-voicing sequence of two obstruents: *bk, *zt, *gθ, *pz, *xð, etc. SLk languages thus provide sufficient expressivity for describing interactions between consonants within a particular window (bounded by k). If k = 3, we can also capture transvocalic consonant interactions as members of the SL region.
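A Strictly k-Local restriction of this kind can be sketched in a few lines of Python. The sketch below is a simplification under stated assumptions: the obstruent inventory is a toy set chosen for illustration, word-boundary symbols (which standard SL definitions include) are omitted, and the function names are hypothetical.

```python
from itertools import product

VOICED = set("bdgzvð")      # toy voiced obstruents (assumption)
VOICELESS = set("ptksfθx")  # toy voiceless obstruents (assumption)

# Prohibited 2-factors: every mixed-voicing obstruent pair, in either order.
BANNED_2 = ({a + b for a, b in product(VOICED, VOICELESS)}
            | {a + b for a, b in product(VOICELESS, VOICED)})

def k_factors(word, k):
    """Return the set of contiguous substrings of length k in the word."""
    return {word[i:i + k] for i in range(len(word) - k + 1)}

def sl_grammatical(word, banned, k):
    """A word belongs to the SL_k language iff it contains no banned k-factor."""
    return k_factors(word, k).isdisjoint(banned)

for w in ["apta", "abda", "abka", "azta", "bak"]:
    print(w, sl_grammatical(w, BANNED_2, 2))
```

In this sketch the form bak is accepted even though it contains both a voiced and a voiceless obstruent, because the two are never string-adjacent: a window of k = 2 cannot see across the vowel, whereas a window of k = 3 could.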
For example, we might propose a constraint like *[+strid, αant][+voc][+strid, −αant] to account for transvocalic sibilant harmony, which can be thought of as an SL3 language that disallows sequences such as *sadʒ, *tʃʼoz, *tsuʃ, etc. However, the SL region is not a plausible measure of the limits on human learning, since it is too restrictive to allow for the unbounded types of long-distance interactions described in the previous section.

By referring instead to precedence relations (i.e. x…y), which are by definition blind to distance and intervening material, an unbounded dependency can be described as a Strictly 2-Piecewise (SP2) pattern that disallows certain subsequences of length 2, such as *s…ʃ, *ʃ…s for patterns of sibilant harmony (Heinz, 2010; McMullin and Hansson, 2014). However, this characterization only works well for unbounded phonotactic dependencies without blocking (Heinz, 2010), and although segmental blocking effects are relatively rare in long-distance patterns of consonant (dis)agreement, they are nonetheless attested (see Section 1.3.3) and must be accounted for.

As an alternative, unbounded sibilant harmony can be thought of as a restriction against contiguous [+strid, αant][+strid, −αant] segment pairs (*sʃ, *ʃs), where adjacency is crucially assessed only among sibilant consonants within the string. More generally, patterns that can be described in similar terms are members of the Tier-based Strictly 2-Local class of formal languages (TSL2; Heinz et al., 2011). In brief, a grammar for a TSL2 language is defined by the relevant subset of the inventory that comprises the ‘tier’ T, and the set of segment 2-factors (bigrams) that are permitted on T, denoted S (or R for the set of prohibited 2-factors). I will demonstrate that this characterization of long-distance dependencies, while still relatively simple, offers an account of the typological properties of locality that extends straightforwardly to cases of blocking.
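The difference between the SP2 and TSL2 characterizations can be sketched as follows. This is an illustrative Python sketch only: the segment inventory, the test words, and the function names are hypothetical, and treating all consonants as tier members is offered here as one assumed way of obtaining a transvocalic restriction from the same set R of prohibited 2-factors.

```python
def has_subsequence(word, pair):
    """True if pair[0] precedes pair[1] at any distance in the word."""
    first, second = pair
    i = word.find(first)
    return i != -1 and second in word[i + 1:]

def sp2_grammatical(word, banned_subsequences):
    """SP2 check: no banned length-2 subsequence may occur."""
    return not any(has_subsequence(word, p) for p in banned_subsequences)

def project(word, tier):
    """Erase every segment that is not a member of the tier T."""
    return "".join(seg for seg in word if seg in tier)

def tsl2_grammatical(word, tier, banned_bigrams):
    """TSL2 check: no banned 2-factor may occur on the tier projection."""
    t = project(word, tier)
    return not any(t[i:i + 2] in banned_bigrams for i in range(len(t) - 1))

R = {"sʃ", "ʃs"}                  # prohibited 2-factors
SIBILANTS = set("sʃ")             # tier for an unbounded pattern
CONSONANTS = set("sʃptkdgmnlr")   # toy tier containing every consonant

for w in ["sata", "ʃataʃ", "saʃa", "sataʃ"]:
    print(w,
          sp2_grammatical(w, R),
          tsl2_grammatical(w, SIBILANTS, R),
          tsl2_grammatical(w, CONSONANTS, R))
```

With a sibilant-only tier, sataʃ projects to sʃ and is rejected, matching the SP2 result. With every consonant on the tier, the same word projects to stʃ, so the intervening [t] separates the two sibilants and the word is accepted, while saʃa is still rejected. Only the membership of T changes between the two grammars.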
I note that the concept of tiers (or projections of segments) has long been used in theoretical phonology (e.g. Clements, 1980, 1985; Shaw, 1991; Odden, 1994; Blevins, 2004; Clements and Hume, 1995), but that the TSL2 approach differs primarily in that a tier can be defined by any subset of the segment inventory (i.e. it need not be a set of segments sharing some feature, or belonging to a natural class, etc.). I will argue that this formal characterization of a tier offers enough flexibility to account for a number of otherwise problematic patterns, without suffering consequences from the perspective of computational theory and learnability (Heinz et al., 2011; Jardine and Heinz, 2015).

1.5 Central Claims

The evidence presented in this dissertation points to a fundamental relationship between the cross-linguistic distribution of long-distance phonotactic dependencies and the way humans can be seen to learn those patterns in a set of artificial language learning experiments. I argue that an adequate theory of long-distance consonant interactions should account for this connection, but that the inherent properties of the Agreement by Correspondence framework (Rose and Walker, 2004; Hansson, 2010a; Bennett, 2013) result in a number of pathologies and predictions about learnability that are not borne out empirically. As such, I claim that the factorial typology of ABC constraints is not a satisfactory definition of what constitutes a possible, human-learnable phonotactic pattern. In approaching the problem from the perspective of formal language theory, the hypothesis that all long-distance phonotactic patterns are members of the Tier-based Strictly 2-Local class of stringsets (Heinz et al., 2011) withstands the scrutiny of comparison with a wide range of patterns observed in natural language.
I therefore claim that the TSL2 region of formal languages is an excellent, and moreover computationally well-defined, approximation of the learner’s hypothesis space that offers an account of both unbounded and transvocalic patterns, with the latter being a special case of a more general category of long-distance dependencies with blocking. I will argue that this solution can be thought of as an independent theory of long-distance dependencies, but that it also has the potential to be integrated with constraint-based frameworks in the form of computationally grounded and learnable markedness constraints that are defined as individual formal languages (stringsets) of the TSL2 class.

1.6 Structure of the Dissertation

The remainder of this dissertation proceeds as follows. Chapter 2 looks in more depth at the attested two-way split between unbounded and transvocalic locality for patterns of consonant harmony. After presenting the results of Experiment 1 as evidence that humans learn and generalize patterns of liquid harmony in a way that mirrors the cross-linguistic properties of locality, I offer an explanation of the dichotomy in terms of the constraints used in the Agreement by Correspondence framework. While the ABC model thus seems to provide a satisfactory account of certain empirical findings, Chapter 3 demonstrates that a number of pathological patterns can be generated within the factorial typology of ABC constraints, and argues that neither the cross-linguistic typology nor the results of Experiment 2 (an artificial language learning study of liquid dissimilation) provide support for the predictions of the ABC model. Chapter 4 turns to the alternative perspective of formal language theory, describing how phonotactic patterns can be characterized as members of several different types of subregular formal language classes (stringsets), focusing primarily on patterns that can be described as Tier-based Strictly 2-Local stringsets.
After demonstrating that the cross-linguistic typology of locality relations and blocking is easily captured by varying the set of segments that constitutes the ‘tier’ in the TSL2 grammar, I present experimental evidence that patterns outside of the TSL2 region (i.e. harmony or dissimilation that applies only in ‘beyond-transvocalic’ contexts) are simply not learned in the laboratory (Experiments 3 and 4). Chapter 5 further scrutinizes the TSL2 approach, arguing that the proposed region is not too big, demonstrating that it is computationally learnable, and considering the possibility of defining markedness constraints (i.e. co-occurrence restrictions) as individual TSL2 languages. Finally, Chapter 6 summarizes the overall argument, discusses certain issues that are outside the scope of the present research, and concludes.

Chapter 2

Locality Relations in Consonant Harmony

In this chapter, I first show that the typology of locality relations in consonant harmony, or long-distance consonant assimilation (Section 2.1), is associated with a cognitive learning bias, as evidenced by the results of an artificial language learning study of liquid harmony (Section 2.2). I then outline the Agreement by Correspondence (ABC) framework (Rose and Walker, 2004), arguing that satisfactory explanations of both the typology and the properties of human learning are tentatively offered by the implementation of ABC in Optimality Theory (Section 2.3). I note that the range of patterns considered in this chapter is intentionally restricted to those that are easily handled by a basic set of ABC constraints. This is done in order to exemplify what I consider to be an ideal scenario, in which an observed typological generalization that is demonstrably associated with a human learning bias is incorporated into the theory as a phonological constraint that seems to account for all of the empirical data.
The reader is asked to bear in mind, however, that the unified account of locality relations that is presented in this chapter is not sustainable, and that Chapter 3 will highlight a number of problematic predictions that arise when the ABC approach to long-distance phonotactics is extended to more complex cases of consonant harmony and to consonant dissimilation.

2.1 Unbounded and Transvocalic Consonant Harmony

As outlined in Section 1.3, typological surveys of long-distance consonant agreement (Rose and Walker, 2004; Hansson, 2010a) reveal several interesting cross-linguistic properties. This chapter focuses primarily on a robust dichotomy that is observed with respect to the locality of the dependency, or the maximum distance between two interacting consonants. First, there are unbounded dependencies that hold in all …C…C… contexts, across any number of intervening segments of any kind. The second type of consonant harmony holds only within a relatively local …CvC… window. Following Hansson (2010a), throughout this dissertation I will refer to this type of locality as transvocalic, which is defined in terms of a dependency that holds across maximally one vowel (short or long) and never across an intervening consonant. This characterization stands in contrast to Rose and Walker (2004) and Bennett (2013), who follow Odden (1994) in referring instead to syllable-adjacency as the relevant context (see Section 1.3).

The data in (1) and (2) provide examples of sibilant harmony from Aari and Koyra, two related Omotic languages of Ethiopia that exhibit the difference between unbounded and transvocalic locality.

In Aari (see Hayward, 1990), the perfective suffix /-s/ surfaces faithfully as [-s] when no [–anterior] sibilant such as [ʃ] or [ʒ] precedes it, as seen in (1a). However, (1b)-(1f) demonstrate a pattern of sibilant harmony, in that the suffix surfaces instead as [-ʃ] when a [–anterior] sibilant precedes it.
Note that there appears to be no upper limit on the trigger-target distance (Hansson, 2010a). For example, (1e) and (1f) show that the dependency holds across an intervening …VCVC… sequence.

(1) Unbounded sibilant harmony in Aari (Hayward, 1990)
    a. /baʔ-s-e/        baʔse        ‘he brought’
    b. /ʔuʃ-s-it/       ʔuʃʃit       ‘I cooked’
    c. /tʃʼa̤ːq-s-it/    tʃʼa̤ːqʃit    ‘I swore’
    d. /ʒaʔ-s-it/       ʒaʔʃit       ‘I arrived’
    e. /ʃed-er-s-it/    ʃederʃit     ‘I was seen’
    f. /ʒa̤ːg-er-s-e/    ʒa̤ːgerʃe     ‘it was sewn’

In Koyra (see Hayward, 1982), the 3mSg (perfective) suffix /-osːo/ surfaces faithfully as [-osːo] when no [–anterior] sibilant precedes it, as seen in (2a). In (2b) and (2c), the suffix surfaces instead as [-oʃːo], an alternation that is triggered whenever two sibilants are in a transvocalic context, separated by at most one vowel and no intervening consonants. The data in (2d) and (2e) provide evidence that the dependency holds only within the transvocalic window, as the alternation is not triggered when additional material intervenes.

(2) Transvocalic sibilant harmony in Koyra (Koorete; Hayward, 1982)
    a. /tim-d-osːo/       tindosːo       ‘he got wet’
    b. /patʃ-d-osːo/      patʃːoʃːo      ‘it became less’
    c. /giːʒ-d-osːo/      giːʒːoʃːo      ‘it suppurated’
    d. /ʃod-d-osːo/       ʃodːosːo       ‘he uprooted’
    e. /ʔatʃ-ut-d-osːo/   ʔatʃutːosːo    ‘he (polite) reaped’

Another example of the attested dichotomy is provided in (3) and (4), with data from two Bantu languages, Yaka and Lamba, spoken primarily in the Democratic Republic of the Congo and Zambia, respectively, that exhibit the difference between unbounded and transvocalic locality for a different type of long-distance interaction: nasal consonant harmony.¹

As shown in (3), Yaka (see Hyman, 1995) has a perfective suffix /-idi/ that surfaces as [-ini] when it is preceded at any distance by a nasal consonant within the stem.

(3) Unbounded nasal consonant harmony in Yaka (Hyman, 1995)
    a. /-tsúb-idi/      -tsúbidi      ‘wandered (perf.)’
    b. /-tsúm-idi/      -tsúmini      ‘sewed (perf.)’
    c. /-mák-idi/       -mákini       ‘climbed (perf.)’
    d. /-míːtuk-idi/    -míːtukini    ‘sulked (perf.)’

A similar alternation (perfective /-ile/ → [-ine]) is demonstrated by the data in (4) for Lamba (data from Odden, 1994). However, while the alternation is triggered in transvocalic contexts, such as that of (4b), the dependency does not hold at greater distances, as seen in (4c).

(4) Transvocalic nasal consonant harmony in Lamba (Odden, 1994)
    a. /-pat-ile/    -patile    ‘scolded (perf.)’
    b. /-uːm-ile/    -uːmine    ‘dried (perf.)’
    c. /-mas-ile/    -masile    ‘plastered (perf.)’

¹ Note that in patterns of Bantu nasal consonant harmony, intervening vowels do not surface with nasalization. That is, the production of (3b) is [-tsúmini], not *[-tsúmĩni]. For this reason, Rose and Walker (2004) and Hansson (2010a) present the patterns as cases of true long-distance agreement that is not achieved by spreading nasality (phonetically or phonologically).

The above cases of transvocalic consonant harmony in (2) and (4) can be thought of as long-distance dependencies that are bounded in terms of distance; the pattern still holds when two elements of the dependency are non-adjacent, but only if they are in a relatively local …CvC… context. Beyond the transvocalic window, in …C…c…C… contexts, the restriction does not apply since the intervening material exceeds the transvocalic threshold for locality in that it contains one or more consonants. Of particular interest is the fact that there are no attested cases of long-distance consonant agreement that are bounded by any other measure (e.g., sibilant harmony across at most one consonant, two vowels, five segments, or any other metric of distance). Likewise, there are no examples of a language with consonant harmony that holds across exactly one intervening consonant (not more or less), or across at least one consonant (i.e. in beyond-transvocalic contexts; see Section for more on beyond-transvocalic locality).
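The locality types just described can be stated as simple conditions on the number of consonants that intervene between trigger and target. The following Python sketch is illustrative only: the names `LOCALITY`, `intervening_consonants`, and `applies` are hypothetical, contexts are written as schematic templates in which `C` marks the two interacting consonants and `c` an intervening one, and the additional requirement that a transvocalic dependency spans at most one vowel is deliberately left out.

```python
# Illustrative sketch (not part of the original analysis): each locality type is
# modelled as a predicate over the number of intervening consonants.
LOCALITY = {
    "unbounded": lambda n: True,        # attested
    "transvocalic": lambda n: n == 0,   # attested
    "<=1 consonant": lambda n: n <= 1,  # unattested
    "=1 consonant": lambda n: n == 1,   # unattested
    ">=1 consonant": lambda n: n >= 1,  # unattested ("beyond-transvocalic")
}


def intervening_consonants(context: str) -> int:
    """Count lowercase 'c' symbols between the first and last 'C' of a template."""
    first = context.index("C")
    last = context.rindex("C")
    return context[first + 1:last].count("c")


def applies(locality: str, context: str) -> bool:
    """Return True if a dependency of the given locality type holds in the context."""
    return LOCALITY[locality](intervening_consonants(context))


if __name__ == "__main__":
    contexts = ["CvC", "CvcvC", "CvcvcvC"]
    for name in LOCALITY:
        row = ["+" if applies(name, ctx) else "-" for ctx in contexts]
        print(f"{name:15s}", " ".join(row))
```

Under these simplifying assumptions, only the first two predicates correspond to attested systems; the remaining three are equally easy to state, which is what makes their absence from the typology noteworthy.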
These typological generalizations are summarized in Table 2.1.

Table 2.1: Attested and unattested variants of consonant harmony locality

Locality         Status       …CvC…   …CvcvC…   …CvcvcvC…
unbounded        attested       +        +          +
transvocalic     attested       +        –          –
≤ 1 consonant    unattested     +        +          –
= 1 consonant    unattested     –        +          –
≥ 1 consonant    unattested     –        +          +

The remainder of this chapter investigates this dichotomy in more depth, providing evidence that it is associated with a human learning bias (Section 2.2), and summarizing how it has been incorporated into theory as a phonological constraint in the Agreement by Correspondence framework (Section 2.3).

Finally, note that in cases where both the unbounded and transvocalic versions of consonant harmony are represented within the same group of related languages, such as in the Omotic and Bantu cases mentioned above, it can often be concluded from independent evidence (e.g. geographic distributions) that the unbounded version of the sound pattern represents a secondary historical development from what was originally a strictly transvocalic dependency (Dolbey and Hansson, 1999; Gunnar Ólafur Hansson, pers. comm.). This entails a diachronic process of imperfect learning (overgeneralization) at some point in the history of the language(s) in question. While the exact nature of the issues surrounding a phonological change of this sort is not considered in depth in this dissertation, I point out that hints of overgeneralization from transvocalic to unbounded patterns are likewise evidenced in the results of the experiments presented throughout this dissertation.

2.2 Consonant Harmony in Artificial Language Learning

The typological locality universal described in the previous section has recently been reproduced in a laboratory setting by Finley (2011, 2012), who shows that, in the face of insufficient evidence, learners generalize in ways that adhere to the typology.
In a set of artificial language learning experiments, subjects were tasked with learning sibilant harmony from a restricted set of training items, in which the choice of a suffix allomorph ([-su] or [-ʃu]) was dependent on a sibilant in the stem ([s] or [ʃ], respectively). To summarize, subjects exposed to “first-order” harmony in cvSv-Sv (where S represents a sibilant) showed evidence of learning transvocalic harmony, but did not generalize to novel “second-order” Svcv-Sv forms (Finley, 2011, Experiment 1). By contrast, subjects learning harmony from …Svcv-Sv contexts did generalize the dependency, both inwards to cvSv-Sv contexts (Finley, 2011, Experiment 2) and outwards to Svcvcv-Sv (Finley, 2012, Experiment 2).

McMullin and Hansson (2014, Experiment 1) produced a cohesive replication of Finley (2011, 2012) that incorporated several modifications to the experimental design. Specifically, the language had two suffixes, [-su] and [-ʃi], corresponding to “past” and “future” tense, respectively. These triggered a suffix-to-stem sibilant harmony pattern, such that any sibilant in the cvcvcv verb stem would alternate to match the suffix sibilant. The pattern thus differed from those in Finley (2011, 2012) in two respects: regressive directionality (from suffix to stem) and harmony alternations in open-class items (stems) rather than among a closed set of suffix allomorphs. The results aligned with those of Finley’s experiments, extending and strengthening the overall evidence that the locality properties of attested long-distance phonotactics are shaped by a learning bias.

One motivation for using a sibilant contrast in studies of long-distance phonotactic learning is that, cross-linguistically, sibilant harmony is by far the most commonly attested type of non-adjacent consonant assimilation (Gafos, 1999; Hansson, 2010a).
In this section, I present a replication of Finley (2011, 2012) and McMullin and Hansson (2014, Experiment 1), in which the target pattern is liquid harmony, a robustly attested (e.g. Bukusu; Odden, 1994, see data in Section 1.3) though less common type of consonant harmony. While this experiment serves to extend previous findings to a different and perhaps less acoustically salient featural contrast, it is also explicitly designed to allow for comparison along different dimensions, such as contrasting the learning of harmony vs. dissimilation patterns, through a series of additional experiments presented in subsequent chapters of this dissertation.

2.2.1 Experimental Methodology

This section describes the aspects of the experimental methodology that apply generally to all experiments presented in this dissertation. Details specific to Experiment 1, a study of liquid harmony, are presented below in Section .

2.2.1.1 Recruiting Participants

All participants were recruited through a subject pool made up of students at the University of British Columbia, and were eligible to sign up for just one of the experiments (i.e. there is no single person who is included in more than one training condition). Each participant either was compensated with $10 or received course credit for taking part in the experiment, which took approximately 45 minutes to complete. There were no restrictions on the language background of a participant, but the results presented below only include data from those who are self-reported native speakers of English. Note that many of the participants whose data were retained for analysis were not monolingual English speakers, but that no participant had experience with another language that is known to exhibit a long-distance consonant interaction.

2.2.1.2 Stimuli

The entire list of stimuli, which was designed to be used for Experiments 1 through 4, consisted of 1560 items.
This included “verb” stems, as well as suffixed versions of each stem, where the suffixes [-li] and [-ɹu] correspond to “future” and “past” tense, respectively. The breakdown of these items is summarized in Table 2.2 (see Appendix A for the full list of stimuli), and Section describes the stimuli in the context of each phase of the experiment. The stimuli were divided into four counterbalanced lists of 390, which were randomized and recorded in a soundproof booth by four phonetically trained native English speakers (2 male, 2 female). The speakers were not made aware of the fact that the list of stimuli adhered to any phonotactic pattern, and upon being asked to identify any patterns in the data, none of the speakers suggested the possibility of an interaction between the liquid consonants. Each speaker was instructed to produce each stimulus with word-initial stress, without vowel reduction, and with all segments pronounced as they would be in normal speech (e.g. mid vowels as diphthongs [eɪ] and [oʊ]). The stimulus set was designed such that each of the four speakers produced an equal number of each consonant and vowel occurring in each position of the three-syllable verb stems for each of three phases, as described below.
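The item counts reported for the stimulus list can be checked with a short tally. The sketch below is a bookkeeping aid only; the variable names are hypothetical, and the per-stem numbers of suffixed forms (two for liquid-free and testing stems, four for liquid-containing training stems) are taken from the breakdown in Table 2.2.

```python
# Illustrative tally of the stimulus list (counts taken from Table 2.2).
# Each entry: (phase, stem type, number of stems, suffixed forms per stem).
STIMULI = [
    ("Practice", "cvcv", 8, 2),
    ("Training", "cvcvcv", 96, 2),
    ("Training", "cvcvLv", 96, 4),  # harmony and dissimilation variants
    ("Training", "cvLvcv", 96, 4),
    ("Testing", "cvcvLv", 32, 2),
    ("Testing", "cvLvcv", 32, 2),
    ("Testing", "Lvcvcv", 32, 2),
]


def items(n_stems: int, forms_per_stem: int) -> int:
    """Each stem contributes itself plus its suffixed forms."""
    return n_stems * (1 + forms_per_stem)


totals_by_phase = {}
for phase, _stem_type, n_stems, forms in STIMULI:
    totals_by_phase[phase] = totals_by_phase.get(phase, 0) + items(n_stems, forms)

grand_total = sum(totals_by_phase.values())
per_speaker = grand_total // 4  # four counterbalanced recording lists

if __name__ == "__main__":
    print(totals_by_phase, grand_total, per_speaker)
```

The tally agrees with the figures given in the text: 1560 items overall, split into four lists of 390.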
2.2.1.3 Experimental Design: Three Phases

All participants in Experiments 1 through 4 completed three phases of the experiment: a practice phase (identical for all groups), training (differed by group), and testing (identical for all groups).

Table 2.2: Breakdown of stimuli used in Experiments 1 through 4, where L denotes a liquid [l, ɹ], c is one of [p, t, k, b, d, g, m, n], and v is a vowel [i, e, o, u].

Phase      Stem type (#)   Suffixed forms (#)                              Total # of stimuli
Practice   cvcv (8)        cvcv-li (8), cvcv-ɹu (8)                        24
Training   cvcvcv (96)     cvcvcv-li (96), cvcvcv-ɹu (96)                  288
           cvcvLv (96)     cvcvlv-li (96), cvcvɹv-li (96),
                           cvcvlv-ɹu (96), cvcvɹv-ɹu (96)                  480
           cvLvcv (96)     cvlvcv-li (96), cvɹvcv-li (96),
                           cvlvcv-ɹu (96), cvɹvcv-ɹu (96)                  480
Testing    cvcvLv (32)     cvcvLv-li (32), cvcvLv-ɹu (32)                  96
           cvLvcv (32)     cvLvcv-li (32), cvLvcv-ɹu (32)                  96
           Lvcvcv (32)     Lvcvcv-li (32), Lvcvcv-ɹu (32)                  96
                                                                           Total = 1560

Practice Phase

A set of 8 verb stems (along with their two suffixed forms) was constructed for a practice phase, in which all participants (regardless of training condition) learned how to conjugate the verbs of the artificial language in past and future tense. During this phase, the participants first listened, over a set of headphones, to pairs of words consisting of a bare verb stem followed by its past tense form (e.g. [toke]…[toke-ɹu], [nipu]…[nipu-ɹu]). Note that, in contrast to the training phase described below, participants were not asked to repeat each of the words out loud in the practice phase. The interval time between the stem and its suffixed form was 500 ms. They then did the same for the future tense verbs (e.g. [toke]…[toke-li], [nipu]…[nipu-li]). To minimize any influence on the remainder of the experiment, verb stems in the practice phase were restricted to two-syllable cvcv stems with no liquids, where c represents a stop or nasal consonant [p, t, k, b, d, g, m, n] and v is one of four vowels [i, e, o, u].
The list of stimuli therefore included 24 individual items that were used in this phase (8 cvcv bare-stem forms, 8 cvcv-li future-tense forms, and 8 cvcv-ɹu past-tense forms).

Training Phase

The training phase consisted of a series of 192 verb triplets. Each triplet was produced by the same speaker, and began with a three-syllable verb stem and was followed by its two suffixed forms with [-li] and [-ɹu]. This phase of the experiment was self-paced, with participants using the keyboard to advance through each of the items. Note that participants were asked to repeat each word aloud after hearing it, which is known to aid learning in similar tasks (e.g. Warker et al., 2009). The list of stimuli included 1248 words that were used in the training phase of the various training conditions. (Section below describes the precise contents of the training phase for each of the three training conditions in Experiment 1. Similar sections are also included for subsequent experiments.)

One portion of this list included 96 cvcvLv stems (to be used for the “Short-range” training conditions), where L represents a liquid consonant [l, ɹ]. Note that since the list of stimuli was designed to be used for all experiments in this dissertation (which look at patterns of both liquid harmony and liquid dissimilation), four suffixed forms were recorded for each of these stems. For example, a verb stem like [pidele] had two forms that adhered to a suffix-triggered pattern of liquid harmony (future tense [pidele-li] and past tense [pideɹe-ɹu]), and two that instead exhibited a dissimilatory pattern (future tense [pideɹe-li] and past tense [pidele-ɹu]).
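The four suffixed forms of a liquid-containing stem follow mechanically from the suffix liquid. The sketch below illustrates this for the [pidele] example; it is not the procedure used to build the actual stimulus list, the function names are hypothetical, and it assumes that each stem contains exactly one liquid and that each segment is a single character.

```python
# Illustrative sketch: derive the harmonic and dissimilatory suffixed forms
# of a stem that contains a single liquid ([l] or [ɹ]).
LIQUIDS = ("l", "ɹ")
OTHER = {"l": "ɹ", "ɹ": "l"}
# tense -> (suffix, liquid contained in the suffix)
SUFFIXES = {"future": ("li", "l"), "past": ("ɹu", "ɹ")}


def with_liquid(stem: str, liquid: str) -> str:
    """Return the stem with its liquid replaced by the given liquid."""
    return "".join(liquid if seg in LIQUIDS else seg for seg in stem)


def suffixed_forms(stem: str) -> dict:
    """Build harmony and dissimilation variants for both tenses."""
    forms = {"harmony": {}, "dissimilation": {}}
    for tense, (suffix, suffix_liquid) in SUFFIXES.items():
        # Harmony: the stem liquid matches the suffix liquid.
        forms["harmony"][tense] = f"{with_liquid(stem, suffix_liquid)}-{suffix}"
        # Dissimilation: the stem liquid differs from the suffix liquid.
        forms["dissimilation"][tense] = (
            f"{with_liquid(stem, OTHER[suffix_liquid])}-{suffix}"
        )
    return forms


if __name__ == "__main__":
    print(suffixed_forms("pidele"))
```

In each pattern, one of the two suffixed forms is faithful to the stem liquid and the other requires an alternation, which is why four distinct recordings per stem were needed.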
For each Short-range training stem, the list of stimuli therefore included five total words (one stem and four suffixed forms), which resulted in 480 separate stimuli. These were counterbalanced for the number of stems with each liquid, the speaker who produced the stimuli for each set of stem+suffix items, and the frequency of the non-liquid segments in each position of the stem.

Similarly, 96 “Medium-range” stems (cvLvcv) were recorded along with each of their four possible suffixed forms. Each Medium-range stem was obtained by reversing the order of the second and third syllables of the Short-range stimuli (e.g. Short-range [kopeɹu] corresponded to Medium-range [koɹupe]). This maintained the counterbalancing for each of the same factors described above, and resulted in 480 different words that were also recorded for this portion of the set of stimuli.

The final contents of the training stimuli included 96 cvcvcv verb stems along with their past and future tense suffixed forms. Note that since these stems contained no liquids (e.g. [dutebi]), only two suffixed versions of each stem were required: one past tense ([dutebi-ɹu]) and one future tense ([dutebi-li]). This portion of the list therefore contained a total of 288 words.

Testing Phase

Finally, each participant completed a testing phase, which used a Two-Alternative Forced Choice (2AFC) paradigm to determine whether participants preferred liquid agreement (harmony) or disagreement (disharmony) at three different levels of locality: Short-range (cvcvLv-Lv), Medium-range (cvLvcv-Lv), and Long-range (Lvcvcv-Lv). On each of 96 trials (32 for each of the testing distances), subjects heard a verb stem that contained one of the two liquids, and were asked to choose the correct option from two possible suffixed forms, each with the same suffix. All three items in each trial were produced by the same speaker (one of the same four encountered in the training phase).
For example, the stem for one Long-range trial was [ɹomuge], with the options of [lomuge-li] or [ɹomuge-li]. The interval time between the verb stem and the first suffixed option was 500 ms, and the time between the first and second suffixed forms was 250 ms. Subjects were given a maximum of 3 seconds after the onset of the second option to respond before receiving an error message indicating that no response was recorded. The set of stimuli used in the testing phase (96 triplets = 288 total) was counterbalanced for several factors, including which liquid the verb stem contained, which suffix was used, which speaker produced the stimuli in each trial, whether the first or second 2AFC option showed liquid harmony, whether or not an alternation was required to achieve harmony, and the number of non-liquid consonants and vowels in each position of the test stems. Examples of testing trials at each distance are provided in Table 2.3.

Table 2.3: Example of testing trials for all experiments

Locality level              Test stem   2AFC options
Short-range (cvcvLv-Lv)     pidole      pidole-ɹu … pidoɹe-ɹu
                            duteɹe      dutele-li … duteɹe-li
Medium-range (cvLvcv-Lv)    tuluge      tuɹuge-li … tuluge-li
                            miɹete      milete-ɹu … miɹete-ɹu
Long-range (Lvcvcv-Lv)      lugoni      lugoni-ɹu … ɹugoni-ɹu
                            ɹomuge      lomuge-li … ɹomuge-li

2.2.1.4 Procedures

Upon providing consent to take part in the experiment, each participant was led into a small room that contained a computer, a set of headphones and a microphone. The experimenter gave the following oral instructions to each participant:

  In this experiment, you’re going to use the headphones to hear words from a language. In the first part of the experiment, there is a short section where you will learn a little bit about how the language works. In the second part of the experiment, you’ll hear a word, repeat it out loud, and then you can press any key on the keyboard to hear the next one, and so on. The microphone is going to record during that part of the experiment, but those recordings won’t be used for anything outside of this study. Then there is a final section where you’ll be tested on what you’ve learned, and for that part you’ll switch to using the button box. You’re going to hear two options for the right answer. If you think the first one is right, press ‘1’, and if you think the second one is right, press ‘2’. We ask that you use one finger from each hand when selecting your answer.

Specific instructions were repeated with on-screen text prior to each phase of the experiment. After completing the study, subjects completed a language-background questionnaire, and were then debriefed and given the opportunity to ask any questions about the study.

2.2.2 Experiment 1: Liquid Harmony

2.2.2.1 Participants

Forty-eight self-reported native speakers of North American English (36 female, 12 male, mean age 23) took part in Experiment 1, with 16 subjects assigned to each of the three training conditions described below.

2.2.2.2 Training Conditions for Experiment 1

Subjects in Experiment 1 were assigned to one of three groups that differed with respect to the type of verb stems encountered in training. There were two experimental groups, M-Harm (for “Medium-range harmony”) and S-Harm (for “Short-range harmony”), as well as a Control group. Recall that the training phase consisted of a series of 192 training triplets (a three-syllable verb stem followed by its two suffixed forms with [-li] and [-ɹu]).

For the experimental groups, a full half of the training stems contained no liquids and therefore provided no evidence of harmony in their suffixed forms (e.g. [dutebi ~ dutebi-ɹu, dutebi-li]). The remaining half of the training stems for the M-Harm group were of the form cvLvcv, where L represents a liquid [l] or [ɹ] in the second syllable, while the S-Harm group was instead exposed to verb stems of the shape cvcvLv.
Depending on their group, subjects were exposed to liquid harmony either across an intervening consonant (e.g. [pilede ~ pilede-li, piɹede-ɹu] for M-Harm subjects) or across one vowel with no intervening consonants (e.g. [gutoɹo ~ gutolo-li, gutoɹo-ɹu] for S-Harm subjects). For each training item in which the stem contained a liquid, one of the two suffixes triggered a liquid alternation resulting in harmony, whereas the other suffixed form already obeyed harmony by morpheme concatenation alone. Training triplets were divided into two blocks, and randomized by subject within each block. Examples of training items for the two experimental groups are provided in Table 2.4, along with the number of each type of item.

Table 2.4: Examples of training items for M-Harm and S-Harm groups in Experiment 1

Group     Training triplet                    Type and number of items
M-Harm    …dutebi…dutebi-ɹu…dutebi-li…        96 stems with no liquid
          …mekotu…mekotu-li…mekotu-ɹu…
          …pilede…pilede-li…piɹede-ɹu…        48 stems with [l]
          …nelogi…neɹogi-ɹu…nelogi-li…
          …koɹupe…kolupe-li…koɹupe-ɹu…        48 stems with [ɹ]
          …guɹoto…guɹoto-ɹu…guloto-li…
S-Harm    …dutebi…dutebi-ɹu…dutebi-li…        96 stems with no liquid
          …mekotu…mekotu-li…mekotu-ɹu…
          …pidele…pidele-li…pideɹe-ɹu…        48 stems with [l]
          …negilo…negiɹo-ɹu…negilo-li…
          …kopeɹu…kopelu-li…kopeɹu-ɹu…        48 stems with [ɹ]
          …gutoɹo…gutoɹo-ɹu…gutolo-li…

Control subjects for this experiment completed the same amount of training, but were not exposed to any stems containing liquids. Instead, in each of the two training blocks, they were given the full set of the 96 triplets with cvcvcv stems. (They thus differed from the experimental groups in that they were exposed to the same triplets twice each.) Participants in the Control group therefore saw no evidence for or against liquid harmony in their training. The results of the Control group are expected to reveal any underlying biases that may influence subject performance during the testing phase (e.g. from some aspect of the experimental design or from any gradient phonotactic patterns exhibited by the English lexicon), and will serve as a baseline in the statistical analysis.

2.2.2.3 Results and Analysis

Results of Experiment 1 were analyzed with a mixed-effects logistic regression model implemented in R (R Core Team, 2014) using the glmer function included in the lme4 package (Bates et al., 2014). In what follows, I provide a detailed description of the statistical model, which is summarized in Table 2.5.

Table 2.5: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 1 (N = 4518; log-likelihood = –2235.4)

Coefficient                 Estimate    SE        Pr(>|z|)
Intercept                   −1.13651    0.26127   < 0.0001
Harmony Faithful             2.47228    0.28800   < 0.0001
Harmony Second              −0.50501    0.12784   < 0.0001
Medium-range                 0.09456    0.16467   0.5658
Long-range                  −0.18878    0.16318   0.2473
S-Harm                       1.28500    0.29777   < 0.0001
M-Harm                       1.10671    0.30310   0.0003
Medium-range × S-Harm       −1.02310    0.22904   < 0.0001
Long-range × S-Harm         −0.89644    0.22821   < 0.0001
Medium-range × M-Harm        0.22645    0.23169   0.3284
Long-range × M-Harm         −0.29497    0.22715   0.1941

The categorical dependent variable in this model is whether the subject chose the test item exhibiting liquid harmony on a particular trial. The fixed effects portion of the model included, as main effects, the between-subjects variable of training group (M-Harm and S-Harm compared to the baseline Control group) and the within-subjects variable of trigger-target distance in the test item (Medium- and Long-range compared to the baseline Short-range items), and an interaction between Group and Distance. The model also includes two nuisance variables that contributed significantly to the model fit, labelled Harmony Faithful (whether the liquid in the option with harmony was faithful to the stem liquid, or whether it required an alternation to achieve harmony) and Harmony Second (whether the option with harmony was presented as the second member of the pair of 2AFC items).
The random component consisted of by-subject intercepts and slopes for the same two nuisance variables, which are intended to offset individual tendencies for choosing harmony vs. disharmony, faithfulness vs. alternations, and the first vs. second 2AFC alternative.

The baseline reference (Intercept term) of this model can be thought of as the log odds of choosing harmony for a subject in the Control group responding to a Short-range trial in which the first 2AFC option involved harmony by means of an unfaithful alternation (e.g. pidole … pidoɹe-ɹu, pidole-ɹu). The negative Intercept term thus indicates that on a trial like this, a Control subject is much less likely than chance to choose harmony. This is likely due in large part to the fact that an unfaithful alternation discourages a choice of harmony, as evidenced by the relatively large positive estimate for Harmony Faithful, which indicates that the log-odds of choosing harmony when the liquid remains faithful to the stem (e.g. tuluge … tuɹuge-li, tuluge-li) are highly increased compared to when an alternation is required. Similarly, the estimate for Harmony Second indicates that subjects are slightly less likely to choose harmony when it is presented as the second of the two options. The estimates for the main effect of the Distance parameters show that the Control group is slightly more likely to choose harmony at Medium-range (as compared to the Short-range distance), and slightly less likely to do so at Long-range, though neither effect reaches significance. The main effects of S-Harm and M-Harm training show that both experimental groups are significantly more likely than the Control group to choose harmony at the Short-range distance. The significant interactions between both Medium-range and Long-range distances with the S-Harm group show (with negative estimates) that the S-Harm group does not seem to apply harmony outside of the Short-range window.
By contrast, neither of the interactions of Medium- and Long-range distances with the M-Harm group approaches significance, indicating that the M-Harm group treats all three distances equally.

Note that in the logit mixed model presented in Table 2.5, the Short-range distance is used as the baseline reference, meaning that the coefficients for Group × Distance interactions merely show whether the experimental groups enforced harmony to the same degree at the other distances as they did in Short-range contexts; this is not equivalent to the hypothesis under consideration. Instead, Table 2.6 contrasts each experimental group with the control group by comparing the odds of selecting the form with harmony ([l…l] or [ɹ…ɹ]) at each of the three distance levels tested.

Table 2.6: Odds ratios comparing experimental groups to control group for choosing harmony with each of the three testing distances as model baselines. Contexts encountered in training are in boldface and all cells that reach significance are shaded.

                      Type of test item (trigger-target distance)
                      Short-range         Medium-range        Long-range
                      (cvcvLv-Lv)         (cvLvcv-Lv)         (Lvcvcv-Lv)
M-Harm vs. Control    3.02 (p < 0.001)    3.80 (p < 0.001)    2.25 (p ≈ 0.007)
S-Harm vs. Control    3.61 (p < 0.001)    1.30 (p ≈ 0.374)    1.47 (p ≈ 0.187)

Each number in the table represents an odds ratio; for example, the odds that a subject in the S-Harm group would choose the harmony-obeying form for items of the Short-range type (cvcvLv-Lv) are more than three times (3.61) those of a subject in the Control group doing the same. The odds ratios in Table 2.6 were extracted from the fitted logit mixed model by exponentiating the relevant coefficient estimates for the S-Harm and M-Harm terms in the model. Thus, for example, the figures in the Short-range column correspond to the coefficient estimates reported in Table 2.5, where Short-range was taken as the reference level of the Distance parameter ($e^{1.285} \approx 3.61$, $e^{1.107} \approx 3.02$).
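The exponentiation step can be reproduced directly from the estimates in Table 2.5. The Python sketch below is offered as a worked check rather than as part of the original analysis (which was carried out in R); the names `COEF` and `odds_ratio` are hypothetical. It also relies on a standard property of treatment-coded models, assumed here to apply: the group effect at a non-baseline distance is the sum of the group's main effect and the corresponding Group × Distance interaction. The p-values for those cells cannot be recovered this way.

```python
import math

# Fixed-effect estimates on the log-odds scale, copied from Table 2.5.
COEF = {
    "S-Harm": 1.28500,
    "M-Harm": 1.10671,
    "Medium-range x S-Harm": -1.02310,
    "Long-range x S-Harm": -0.89644,
    "Medium-range x M-Harm": 0.22645,
    "Long-range x M-Harm": -0.29497,
}


def odds_ratio(group: str, distance: str = "Short-range") -> float:
    """Odds ratio (group vs. Control) for choosing harmony at a given distance."""
    log_odds = COEF[group]
    if distance != "Short-range":  # Short-range is the reference level
        log_odds += COEF[f"{distance} x {group}"]
    return math.exp(log_odds)


if __name__ == "__main__":
    for group in ("M-Harm", "S-Harm"):
        for distance in ("Short-range", "Medium-range", "Long-range"):
            print(f"{group:7s} {distance:13s} {odds_ratio(group, distance):.2f}")
```

The values obtained agree with Table 2.6 to within rounding, with small differences in the last digit possible for the non-baseline cells.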
Odds ratios for the other two columns wereobtained by re-fitting the exact same model but with Medium-range and Long-range, respectively, serving as the reference level for Distance; this allows for adirect comparison with the Control group at each distance level.Finally, to facilitate a visual interpretation of the results of Experiment 1, severalplots depicting mean aggregated proportions of harmony choices for each group ateach level of locality, as well as the range proportions demonstrated by individualsubjects, are provided in Figures 2.1, 2.2, and 2.3. These are discussed below, witheffect sizes discussed with respect to the OR values reported in Table 2.6.The results in Figure 2.1 demonstrate that the experimental groups did show ev-idence of learning the harmony pattern at the trigger-target distance levels on whichthey were trained (those highlighted in boldface in Table 2.6: Medium-range for theM-Harm group, Short-range for the S-Harm group). Each group was significantlymore likely than the Control group to opt for a harmony response for test items ofthe relevant type, with odds ratios of 3.80 and 3.61, respectively.35S-Harm learning atShort-range?cvcvLv-Lv test items(saw harmony)Proportion harmony responses ([l…l] or [r…r])Control S-Harm0.000.250.500.751.00* * M-Harm learning atMedium-range?cvLvcv-Lv test items(saw harmony)Proportion harmony responses ([l…l] or [r…r])Control M-Harm0.000.250.500.751.00Figure 2.1: Plots comparing proportions of harmony responses for Controlsubjects to those of the M-Harm subjects in Medium-range test items(left panel) and to the S-Harm subjects in Short-range test items (rightpanel). Each dot represents individual subject performance, and groupmeans are indicated with a horizontal line. 
Significance is extracted froma mixed logit model and indicates learning of the pattern each group wasexposed to.More importantly, as shown in Figure 2.2, subjects in the M-Harm group choseharmony more often than those in the Control group at the unfamiliar Short-range(OR=3.02) and Long-range (2.25) distances in the testing phase, and both effectsreached statistical significance. This result, that the M-Harm group chose a testitem with harmony significantly more often than the Control group for all levelsof locality, is taken as evidence that subjects in the M-Harm training condition36M-Harm generalizationto Long-range?Long-range test items(saw no evidence)Proportion harmony responses ([l…l] or [r…r])Control M-Harm0.000.250.500.751.00* * M-Harm generalizationto Short-range?Short-range test items(saw no evidence)Proportion harmony responses ([l…l] or [r…r])Control M-Harm0.000.250.500.751.00Figure 2.2: Plots comparing proportions of harmony responses for Controlsubjects to those of the M-Harm subjects in Short-range test items (leftpanel) and in Long-range test items (right panel). Each dot representsindividual subject performance, and group means are indicated with ahorizontal line. 
Significance is extracted from a mixed logit model and indicates generalization of the pattern the group was exposed to.

tend to interpret the dependency as unbounded liquid harmony, generalizing from a Medium-range pattern (which they were trained on and for which they showed evidence of learning) both inwards to Short-range and outwards to Long-range.

A similar effect was not seen for the S-Harm group, as shown in Figure 2.3. When taken as a whole, the S-Harm group does not seem to generalize the pattern of liquid harmony from the Short-range distance (which they were trained on, and for which they showed evidence of learning) to Medium- (OR=1.30, p ≈ 0.374) or Long-range (OR=1.47, p ≈ 0.187) distances.

[Figure 2.3 plots. Panels: "S-Harm generalization to Medium-range?" (Medium-range test items, saw no evidence; Control vs. S-Harm) and "S-Harm generalization to Long-range?" (Long-range test items, saw no evidence; Control vs. S-Harm). Vertical axis: proportion of harmony responses ([l…l] or [r…r]), 0.00 to 1.00. Both comparisons are marked n.s.]

Figure 2.3: Plots comparing proportions of harmony responses for Control subjects to those of the S-Harm subjects in Medium-range test items (left panel) and in Long-range test items (right panel). Each dot represents individual subject performance, and group means are indicated with a horizontal line. (Non-)Significance is extracted from a mixed logit model and indicates generalization of the pattern the group was exposed to.

2.2.2.4 Summary and Discussion

As expected, the results of Experiment 1 replicate those of Finley (2011, 2012) and McMullin and Hansson (2014, Experiment 1), extending their findings to a different pair of segments (liquids [l, ɹ] rather than sibilants [s, ʃ]).
To summarize, subjects who are exposed to consonant harmony in Short-range contexts (across just one vowel, with no intervening consonants) tend to interpret this evidence conservatively: they internalize a pattern that harmonizes transvocalic consonant pairs but does not extend to the more distant Medium- and Long-range contexts that were not encountered in training.2 By contrast, subjects who are exposed to harmony in Medium-range contexts (spanning an intervening VCV sequence) tend to learn a dependency that holds at all locality levels, generalizing both inwards to Short-range and outwards to Long-range contexts. These results conform to the typological dichotomy among attested consonant harmony systems described in Section 2.1 above. I take this as evidence of a relationship between the limits of human phonotactic learning and the types of patterns observed in natural languages. Below I interpret these findings from the point of view of current phonological theory, showing that the treatment of locality in the Agreement by Correspondence framework straightforwardly captures both the typological generalizations and the properties of human learning with respect to patterns of consonant harmony.

2.3 Consonant Harmony as Agreement by Correspondence

While there exist many proposals for constraint-based analyses of consonant harmony, I focus here on the account provided by the Agreement by Correspondence framework (ABC; Walker, 2000a,c, 2001; Hansson, 2001, 2010a; Rose and Walker, 2004; Bennett, 2013), as this approach has seen relative success in accounting for the typology of consonant harmony. The ABC framework posits constraints that require pairs of segments to enter into a surface correspondence relation if they surpass some similarity threshold, due to sharing a certain set of feature values. For

2 As can be seen in the above plots, there are a number of participants whose data do not reflect the overall trends of each of the experimental groups.
Questions about individual differences, and the difference between “successful” learners vs. those who do not appear to learn anything, are discussed at greater length in Section

example, Corr[+strid] requires all co-occurring sibilants in the output to be surface correspondents of one another, regardless of how far apart they are in the word. The definition that I use for Corr[αF] (where [αF] is a set of feature specifications) is provided below in (5), which is a simplified version of the definition given by Bennett (2013, p. 55).3

(5) Corr[αF]: ‘Two [αF] consonants must correspond.’
For each distinct pair of output consonants, X and Y, assign a violation if:
a. X and Y both have the feature specification [αF], and
b. X and Y are not in the same surface correspondence class

With the above definition, I follow Bennett (2013), who specifies that surface correspondence is an equivalence relation, in that it is symmetric (i.e. if X is a correspondent of Y, then Y is also a correspondent of X), transitive (i.e. if X is a correspondent of Y, and Y is a correspondent of Z, then X is a correspondent of Z), and reflexive (i.e. each member of a correspondence class is a correspondent of itself). While these properties are not of immediate consequence, Chapter 3 will demonstrate that transitivity in particular gives rise to a number of problematic predictions.

Other constraints (CC·Limiters; Bennett, 2013) impose restrictions on surface-corresponding segments. Among these are CC-Ident[F] constraints, which require corresponding consonants to agree with respect to some feature (e.g. CC-Ident[ant] requires agreement in [±anterior]) and thus serve as potential triggers and targets of harmony for that feature. Again, I follow Bennett (2013, p. 72), who defines CC-Ident[F] as follows:

(6) CC-Ident[F]: ‘If two consonants correspond, then they agree on [±F].’
For each distinct pair of output consonants, X and Y, assign a violation if:
a. X and Y are in the same surface correspondence class, and
b.
X is [αF], and
c. Y is [βF]
…where F is some feature, [αF], [βF] are its possible values, and α≠β

3 The sole difference between the definition in (5) and Bennett’s (2013) proposal is that Bennett specifies that the Corr constraint is assessed with respect to some morphological or phonological domain D, which for present purposes, I assume to be the word.

The final types of CC·Limiters that are relevant for the present discussion include ones that have been proposed as an account for the locality dichotomy of unbounded vs. transvocalic consonant harmony systems as previously discussed in Section 2.1. In particular, the constraint Proximity (Rose and Walker, 2004) penalizes correspondence for any pair of consonants that are not within a bounded Cv.C window. While Rose and Walker (2004, p. 494) define Proximity in terms of syllable adjacency, I will use a modified version of their proposal that is framed with respect to transvocalic contexts:

(7) Proximity: ‘If two consonants correspond, then no consonant intervenes between them.’
For each distinct pair of output consonants, X and Y, assign a violation if:
a. X and Y are in the same surface correspondence class, and
b. there is some consonant Z that precedes Y and is preceded by X

Depending on the ranking of Corr, CC-Ident, and Proximity constraints (and additional CC·Limiters) relative to Faithfulness constraints (e.g. IO-Ident[ant]), different variants of consonant harmony patterns can be generated.

2.3.1 Deriving Unbounded and Transvocalic Harmony in ABC

Tableau (8) shows a derivation of the Aari form in (1d), /ʃed-er-s-it/ → [ʃederʃit] ‘I was seen’, with an OT grammar for unbounded sibilant harmony. For expository reasons, in this and subsequent examples I abstract away from the issue of directionality of assimilation by assuming that any CC-Ident constraint is specified for progressive or regressive directionality (sometimes indicated by C_LC_R-Ident[F] and C_RC_L-Ident[F], respectively; Rose and Walker, 2004).
In any case, since the focus of this dissertation is on the proper characterization and learnability of the phonotactics themselves (permitted vs. prohibited output sequences), the question of which repair strategy emerges as optimal, and how this gets determined in the grammar, is not directly relevant.

(8) …S…c…S… sibilant harmony in Aari
/ʃed-er-s-it/ | Corr[+strid] | CC-Ident[ant] | IO-Ident[ant] | Proximity
a. ʃ_xe.der.s_yit | ∗! | | |
b. ʃ_xe.der.s_xit | | ∗! | | ∗
c. ☞ ʃ_xe.der.ʃ_xit | | | ∗ | ∗

Candidate (8a) loses as the two sibilants are not in correspondence, a violation of Corr[+strid]. Candidate (8b) is not optimal because its two sibilants, although in surface correspondence, have mismatched [±ant] specifications. The winner, then, is (8c), which incurs an IO-Ident[ant] violation for the unfaithful mapping /s/ → [ʃ] in order to satisfy both the Corr and CC-Ident constraints. With Proximity ranked below Corr[+strid], the harmony will be enforced across any number of intervening segments or syllables. Note, however, that the constraint ranking in (8) would not be a suitable analysis of a transvocalic harmony pattern such as that exhibited by Koyra in (2), since in such cases harmony must be prevented from applying across an intervening consonant. Tableaux (9) and (10) give derivations of (2b) and (2c), demonstrating that a high-ranked Proximity constraint results in harmony being enforced within the desired …CvC… window, but not beyond.

(9) …SvS… sibilant harmony in Koyra
/patʃ-d-osːo/ | Proximity | Corr[+strid] | CC-Ident[ant] | IO-Ident[ant]
a. patʃː_xosː_yo | | ∗! | |
b. patʃː_xosː_xo | | | ∗! |
c. ☞ patʃː_xoʃː_xo | | | | ∗

(10) No …S…c…S… sibilant harmony in Koyra
/ʃod-d-osːo/ | Proximity | Corr[+strid] | CC-Ident[ant] | IO-Ident[ant]
a. ☞ ʃ_xodːosː_yo | | ∗ | |
b. ʃ_xodːosː_xo | ∗! | | ∗ |
c. ʃ_xodːoʃː_xo | ∗! | | | ∗

The attested dichotomy between unbounded and transvocalic variants of consonant harmony locality can thus be accounted for with the relative ranking of Proximity and Corr[F] constraints.
Assuming that a language has a pattern of consonant harmony in the first place, the dependency will be unbounded if Corr[F] ≫ Proximity, but will be transvocalic if Proximity ≫ Corr[F].

2.3.2 Biased Learning of ABC Constraint Rankings

This section explores the role that learning biases play in Optimality Theory and provides an account of the typological and experimental data as a reflection of ABC learning biases. The first type of bias to consider is one that shapes the boundary of the learner’s hypothesis space. That is, we want to define a theoretical criterion for assessing whether a pattern is learnable or not. In OT terms, this is provided by the constraints themselves. The learner’s goal is to discover the correct ranking of a finite number of constraints provided by a universal constraint set (CON). These constraints thus dictate the learner’s hypothesis space, as patterns that cannot be generated by some ranking permutation (i.e. within the factorial typology) fall outside the region of learnable grammars. In Experiment 1, the pattern in question involves liquids agreeing for the [±lateral] feature, with the relevant constraints being Corr[liquid] (where [liquid] represents a feature bundle such as [+son, –nas, +cons] that picks out [l, ɹ] and omits all other segments), CC-Ident[lat], Proximity, and IO-Ident[lat]. With respect to locality when applying the pattern to two liquids, the twenty-four (four factorial) possible rankings of these constraints generate exactly the patterns observed in the typology: liquid agreement both in …LvL… and …L…c…L… contexts (unbounded harmony), agreement that is necessary only at the shorter …LvL… distance (transvocalic harmony), and free co-occurrence of liquids at any distance (no harmony).
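The enumeration of the twenty-four rankings described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the candidate set is reduced to three hypothetical candidates per input (faithful without correspondence, faithful with correspondence, and harmonized with correspondence), the input always contains two disagreeing liquids, and the function names are invented for this sketch rather than taken from any existing OT software.

```python
from itertools import permutations

CONSTRAINTS = ("Corr[liquid]", "CC-Ident[lat]", "Proximity", "IO-Ident[lat]")

def candidates(transvocalic):
    """Return (harmonized?, violations) for an input with two disagreeing liquids.

    Assumption: correspondence across an intervening consonant costs exactly
    one Proximity violation; transvocalic correspondence costs none.
    """
    prox = 0 if transvocalic else 1
    return [
        (False, {"Corr[liquid]": 1}),                        # faithful, no correspondence
        (False, {"CC-Ident[lat]": 1, "Proximity": prox}),    # faithful, in correspondence
        (True, {"IO-Ident[lat]": 1, "Proximity": prox}),     # harmonized, in correspondence
    ]

def harmonizes(ranking, transvocalic):
    """True if the optimal candidate under a strict ranking shows harmony."""
    def profile(cand):
        # Strict domination = lexicographic comparison of violation vectors.
        return tuple(cand[1].get(c, 0) for c in ranking)
    return min(candidates(transvocalic), key=profile)[0]

def typology():
    """Classify all 24 rankings by the locality pattern they generate."""
    counts = {"unbounded": 0, "transvocalic": 0, "none": 0, "beyond_only": 0}
    for ranking in permutations(CONSTRAINTS):
        short = harmonizes(ranking, transvocalic=True)
        beyond = harmonizes(ranking, transvocalic=False)
        if short and beyond:
            counts["unbounded"] += 1
        elif short:
            counts["transvocalic"] += 1
        elif beyond:
            counts["beyond_only"] += 1
        else:
            counts["none"] += 1
    return counts

print(typology())
```

Under these assumptions no ranking yields harmony only at the longer distance, so the three attested pattern types exhaust the typology of this reduced constraint set.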
The crucial rankings for these patterns are presented in Table 2.7.

Table 2.7: Types of phonotactic patterns that can be generated within a factorial typology of ABC constraints.
Pattern | Crucial ranking | Number of rankings
Unbounded harmony | Corr[F], CC-Ident[G] ≫ IO-Ident[G]; Corr[F] ≫ Proximity | 5
Transvocalic harmony | Proximity ≫ Corr[F]; Corr[F], CC-Ident[G] ≫ IO-Ident[G] | 3
No restrictions (Faithfulness)4 | any other ranking | 16

In summary, the set of constraints posited by the ABC account of consonant harmony can be thought of as learning biases, the factorial typology of constraints corresponds to the types of grammars a learner must consider, and the prediction is that the set of possible phonotactics is exactly the set of patterns that can be generated by each of those grammars. There are many algorithms that can provably find a correct constraint ranking for a given pattern (e.g. Boersma, 1997; Tesar and Smolensky, 2000; Boersma and Hayes, 2001; Goldwater and Johnson, 2003), so long as it can be generated by at least one of the grammars in the factorial typology, and therefore both the unbounded and transvocalic variants of consonant harmony locality are learnable. Furthermore, other types of locality that are logically possible but unattested (such as those given above in Table 2.1) will not be learnable in this approach, since they cannot be generated using the set of constraints under consideration.

4 Though it is not relevant for present purposes, there are two types of faithful outputs that can emerge as optimal. Either the relevant pair of consonants will be in correspondence but will not harmonize (resulting in a winning candidate such as […l_xa-ɹ_xu]) or they will not correspond at all (resulting in, for example, […l_xa-ɹ_yu]), though these two options are indistinguishable phonetically.

It is also important to know what the learner’s first hypothesis is. In terms of OT, this is the initial constraint ranking, which can also be thought of as a type of learning bias.
That is, in the absence of any evidence for changing the relative ranking of two or more constraints, which pattern would emerge as the output of the OT grammar? In order to ensure a restrictive final grammar, for example, there must be a bias that favours high-ranked Markedness constraints and low-ranked Faithfulness constraints (see, e.g., Smolensky, 1996; Hayes, 2004; Prince and Tesar, 2004; for similar arguments framed in terms of Harmonic Grammar, see Jesney and Tessier, 2011). Asking this question in the context of Experiment 1 will allow us to account for why the Control group did not apply harmony at any of the three testing distances, why the S-Harm group learned harmony in cvcvLv-Lv contexts but did not generalize to other locality levels, and why the M-Harm group generalized from cvLvcv-Lv contexts to all distances.

The fact that the Control group showed no preference for harmony at any of the three testing distances (and indeed largely made their choices based on which testing item was faithful to the bare verb stem) suggests that IO-Ident[lat] begins as a highly ranked constraint with respect to the others.5 This is likely a reflection of the presumed adult grammar of the native English speakers who participated in the experiment (see also Pater and Tessier, 2003). Based only on the results of the Control group, however, we have no means of determining the full ranking and must also look at the results of the other two groups.

Note that to have any pattern of liquid harmony, irrespective of locality, the crucial ranking is Corr[liquid], CC-Ident[lat] ≫ IO-Ident[lat]. In order for the learner to transition to a grammar that generates liquid harmony, then, IO-Ident[lat] must be reranked such that it is dominated by both the Corr and CC-Ident constraints. Which hypothesis the learner considers first (i.e. transvocalic vs. unbounded harmony) will depend on the relative ranking of the other constraints.
If Proximity outranks Corr[liquid] in the initial state, then the learner will hypothesize a transvocalic pattern until it is given exposure to harmony at a greater distance. If instead Corr[liquid] outranks Proximity initially, then the learner will hypothesize an unbounded dependency until it is given explicit counter-evidence in beyond-transvocalic contexts. In Experiment 1, the S-Harm group was exposed to harmony only in cvcvLv-Lv items (a pattern that is compatible with both types of locality) and did not generalize the pattern to greater distances. This suggests that Proximity ≫ Corr[liquid] initially, but that the ranking can change if the learner is exposed to liquid harmony at a beyond-transvocalic distance, such as the cvLvcv-Lv items seen by the M-Harm group in Experiment 1, resulting in a pattern with unbounded locality.

5 It turns out that IO-Ident[lat] does not need to be undominated for Faithfulness to be optimal at all distances. As long as it is ranked above either Corr[liquid] or CC-Ident[lat], then an unfaithful candidate will never be optimal since it will always be better to leave the two liquids out of correspondence or to allow corresponding liquids to disagree.

2.4 Summary and Conclusions

This chapter outlined a robust dichotomy with respect to the distance between two interacting segments in systems of consonant harmony. I then provided evidence from an artificial language learning experiment that the properties of human phonotactic learning reflect this typological dichotomy (unbounded vs. transvocalic locality).
In the experiment, which replicates the results of similar studies looking at sibilant harmony (Finley, 2011, 2012; McMullin and Hansson, 2014), subjects in the M-Harm training condition were exposed to a pattern of liquid harmony that spanned an intervening non-liquid consonant (in cvLvcv-Lv contexts), and that group, as a whole, was significantly more likely than the Control group to apply the pattern to liquids at all three testing distances: Short-range (cvcvLv-Lv), Medium-range (cvLvcv-Lv), and Long-range (Lvcvcv-Lv). By contrast, subjects in the S-Harm training condition, who were exposed to cvcvLv-Lv harmony, applied the pattern in the Short-range test items as a group, but were no more likely than the Control group to apply harmony at Medium-range and Long-range distances. The overall result was interpreted as evidence for a cognitive learning bias that imposes a restriction on the types of patterns that are human-learnable.

Upon establishing this relationship between typology and human learning, I demonstrated how the constraints used in the Agreement by Correspondence framework can account for the empirical data. Specifically, the factorial typology of the ABC constraints presented in this section includes consonant harmony patterns with both the unbounded and transvocalic variants of locality (but no other type of consonant harmony). This provides a straightforward explanation of the result that subjects in the M-Harm training condition of Experiment 1 tended to interpret the training stimuli as evidence for a pattern with unbounded locality: no other pattern that is compatible with harmony in cvLvcv-Lv contexts can be generated with these constraints.
Furthermore, the fact that the S-Harm group did not show evidence of learning an unbounded dependency can be construed as reflecting a relatively high prior ranking of the Proximity constraint with respect to Corr[liquid].

Recall, however, that the purpose of this chapter was to present an intentionally restricted view of long-distance consonant interactions. The result is an illustration of what I consider to be the ideal scenario, in which a clear relationship between the typology and learnability of a phonotactic pattern is easily captured within a particular theoretical framework. In what follows, I will demonstrate that the ABC framework actually makes a number of problematic predictions, and argue for a new approach that is framed in terms of formal language theory.

Chapter 3

Problems with the ABC Approach

The purpose of this chapter is to call into question the validity of a number of predictions made by the Agreement by Correspondence (ABC) model of long-distance phonology (Hansson, 2001, 2010a; Rose and Walker, 2004; Bennett, 2013). I first point out that the factorial typology generated by ABC constraints, even in its simplest form, includes several phonotactic patterns that are unattested in natural language. With respect to long-distance consonant agreement (the types of patterns for which the framework was originally intended), the seemingly simple constraints used to account for the dichotomy of unbounded vs. transvocalic locality generate many pathological patterns that are unattested cross-linguistically (Section 3.1.1). Section 3.1.2 then presents a further problem that arises when extending the ABC framework to analyses of long-distance consonant disagreement (Bennett, 2013). The fact that both consonant harmony and dissimilation have similar cross-linguistic properties with respect to locality suggests that we should want to provide a unified theoretical account for the two types of patterns.
However, an attempt to account for both within ABC results in unavoidably different predictions about each: in situations where harmony is enforced, dissimilation is expected to occur in exactly the complement set of environments (see Section 3.1.2). This is Bennett’s (2013) “Mismatch Prediction”. Though Bennett does provide limited typological support for the predicted case of beyond-transvocalic dissimilation (a complex interaction of liquids in Sundanese infixation; see also Cohn, 1992; Bennett, 2015), the second portion of this chapter explores the characteristics of human learning and generalization, demonstrating that subjects in an artificial language learning study (analogous to Experiment 1; see Section 2.2.2) do not differ when the target pattern is liquid dissimilation rather than liquid harmony (Section 3.2.2). With support from an additional analysis of Experiments 1 and 2 that includes only those subjects who surpassed a threshold that was used to indicate successful learning of the training pattern, I conclude that neither the typological nor experimental evidence supports the predictions of the ABC model of long-distance phonotactics and that we should pursue a different approach in order to provide a cohesive theoretical account of the observed empirical data.

3.1 Limited Typological Support for Predictions of ABC

3.1.1 Pathological Harmony Patterns

This section describes two pathological cases of consonant harmony. The first is what Hansson (2014) refers to as agreement by proxy, in which two relatively dissimilar consonants are forced to agree due to a shared relationship with a third (potentially non-harmonizing) correspondent (Section ). The second problematic prediction is a sensitivity to the count (even vs. odd parity) of potential correspondents (Section ), which prompts a new definition of the locality-based CC·Limiter Proximity, which I call CC-cvc (modelled on CC·SyllAdj, as defined by Bennett, 2013).
Agreement by Proxy

The assumption that surface correspondence is an equivalence relation (i.e. it is symmetric, reflexive, and transitive; Bennett, 2013), partitioning the set of co-occurring segments within the output into equivalence classes, can give rise to bizarre agreement by proxy effects (Hansson, 2014). Two co-occurring segments that are normally not required to correspond (nor, therefore, to interact) can be forced to do so when a third segment is present somewhere in the word, provided that this third segment is sufficiently similar to each of the other two to force them into (covert) correspondence with itself, and thereby also with each other. For example, we can imagine a hypothetical obstruent voicing harmony, in which only homorganic obstruent pairs interact, and where the harmony is moreover strictly regressive, harmonizing [–voice]…[+voice] sequences to [+voice]…[+voice] (while leaving [+voice]…[–voice] sequences intact). An input like /sada/ is thus changed to [zada], whereas a form like /saga/ surfaces faithfully as [saga]. A pattern like this is straightforwardly captured with the ranking in (1) and (2), where Corr[–son, αPlace] requires that co-occurring homorganic obstruents stand in correspondence and C_RC_L-Ident[+voi] penalizes [–voi]…[+voi] sequences of surface correspondents. The more general constraint Corr[–son], which demands correspondence for heterorganic as well as homorganic obstruent pairs, is ranked too low to have any effect.

(1) Obstruent voicing parasitic on place: homorganic pairs
/sada/ | Corr[-son, αPlace] | C_RC_L-Ident[+voi] | IO-Ident[voi] | Corr[-son]
a. s_xad_ya | ∗! | | | ∗
b. s_xad_xa | | ∗! | |
c. ☞ z_xad_xa | | | ∗ |

(2) Obstruent voicing parasitic on place: heterorganic pairs
/saga/ | Corr[-son, αPlace] | C_RC_L-Ident[+voi] | IO-Ident[voi] | Corr[-son]
a. ☞ s_xag_ya | | | | ∗
b. s_xag_xa | | ∗! | |
c.
z_xag_xa | | | ∗! |

Let us now imagine that the same language has another highly ranked Corr constraint, which demands that segments that agree in both manner ([±continuant]) and voicing must also stand in surface correspondence to one another: Corr[αcont, βvoi]. In cases like (1) and (2), such a constraint has no bearing on the outcome, as all the relevant output candidates vacuously satisfy it. However, in a case like (3), where the same kind of /s…g/ sequence as in (2) co-occurs with a /x/ somewhere else in the word, we see how the mere presence of this /x/ causes regressive harmony to be triggered in the /s…g/ sequence.

(3) Transitive correspondence relation causes agreement by proxy
/sagaxa/ | Corr[αcont, βvoi] | Corr[–son, αPlace] | C_RC_L-Id[+voi] | IO-Id[voi] | Corr[-son]
a. s_xag_yax_za | ∗! | ∗! | | | ∗∗∗
b. s_xag_yax_ya | ∗! | | | | ∗∗
c. s_xag_yax_xa | | ∗! | | | ∗∗
d. s_xag_xax_xa | | | ∗! | |
e. ☞ z_xag_xax_xa | | | | ∗ |
f. z_xag_yax_ya | | | | ∗ | ∗!∗

For a [s…g…x] sequence like in (3a)-(3d), one Corr constraint requires [g…x] to be in correspondence (both are velar obstruents) while the other requires the same of [s…x] (both are voiceless fricatives). The only way to satisfy both Corr constraints is to place all three segments into the same correspondence class (3c)-(3d), but this means that [s…g] are also in correspondence with each other, unlike in cases like (2). Consequently, the generalization is that the regressive voicing harmony applies to heterorganic obstruent pairs if and only if the word also happens to contain a third obstruent that agrees in place with one but in manner with the other. To my knowledge, nothing resembling this kind of pattern has ever been attested in a natural language. Intuitively, it seems unlikely that this is an accidental gap, but it nonetheless falls within the boundary of possible phonotactic patterns as defined by the factorial typology of ABC constraints.
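The role that transitivity plays in the agreement by proxy effect can be illustrated with a small Python sketch. It assumes that both Corr constraints are satisfied (as in the winning candidates above) and computes the smallest set of correspondence classes compatible with them by merging every pair that either constraint forces into correspondence. The feature table and function names are hypothetical simplifications introduced only for this illustration.

```python
from itertools import combinations

# Assumed, simplified feature values: (place, continuancy, voicing).
FEATURES = {
    "s": ("coronal", "+cont", "-voi"),
    "z": ("coronal", "+cont", "+voi"),
    "d": ("coronal", "-cont", "+voi"),
    "g": ("dorsal", "-cont", "+voi"),
    "x": ("dorsal", "+cont", "-voi"),
}

def must_correspond(a, b):
    """True if either high-ranked Corr constraint demands correspondence."""
    place_a, cont_a, voi_a = FEATURES[a]
    place_b, cont_b, voi_b = FEATURES[b]
    homorganic = place_a == place_b                            # Corr[-son, αPlace]
    same_manner_voicing = (cont_a, voi_a) == (cont_b, voi_b)   # Corr[αcont, βvoi]
    return homorganic or same_manner_voicing

def correspondence_classes(obstruents):
    """Smallest partition (as sets of positions) satisfying both Corr constraints.

    Because correspondence is transitive, classes are merged whenever any
    pair across them is required to correspond.
    """
    classes = [{i} for i in range(len(obstruents))]
    for i, j in combinations(range(len(obstruents)), 2):
        if must_correspond(obstruents[i], obstruents[j]):
            class_i = next(c for c in classes if i in c)
            class_j = next(c for c in classes if j in c)
            if class_i is not class_j:
                classes.remove(class_j)
                class_i |= class_j
    return classes

print(correspondence_classes("sg"))   # obstruents of /saga/
print(correspondence_classes("sgx"))  # obstruents of /sagaxa/
```

In this sketch [s] and [g] end up in separate classes when they are the only obstruents, but share a class as soon as [x] is added, which is the configuration that exposes the pair to the CC-Ident constraint.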
While I do not provide any experimental evidence that a pattern of agreement by proxy is not human-learnable, Section 4.3 argues on computational grounds that such dependencies should indeed fall outside of the learner’s hypothesis space.

Sensitivity to Number and Parity of Potential Correspondents

Another problematic prediction is the result of the definition of Proximity (Rose and Walker, 2004, p. 494; redefined in transvocalic terms both in Section 2.3, and in (4) below). For expository purposes, I will reproduce the way in which transvocalic (or syllable-adjacent) harmony is generated in ABC, for a hypothetical example of transvocalic sibilant harmony (analogous to the pattern found in Koyra; see Section 2.1) that holds within a bounded SvS window, but not at greater distances. At first glance, this appears easily captured with a high-ranking Proximity constraint, which is defined in (4) below (in transvocalic terms, rather than syllable-adjacency as in Rose and Walker, 2004).

(4) Proximity: Correspondent consonants must not be separated by any intervening consonant.

In words where the two consonants are in a transvocalic context, note that Corr[+strid] can be satisfied without violating higher-ranked Proximity; a correspondence relation is therefore established and agreement in [±ant] is enforced over that relation. This is illustrated below in (5) with a hypothetical input /paʃasa/.

(5) High-ranked Proximity permits …SvS… harmony
/paʃasa/ | Proximity | Corr[+strid] | CC-Ident[ant] | IO-Ident[ant]
a. paʃ_xas_ya | | ∗! | |
b. paʃ_xas_xa | | | ∗! |
c. ☞ paʃ_xaʃ_xa | | | | ∗

By contrast, when additional material intervenes between the two sibilants, harmony is not enforced, as shown below in (6) for the input /ʃapasa/.

(6) High-ranked Proximity prevents …S…c…S… harmony
/ʃapasa/ | Proximity | Corr[+strid] | CC-Ident[ant] | IO-Ident[ant]
a. ☞ ʃ_xapas_ya | | ∗ | |
b. ʃ_xapas_xa | ∗! | | ∗ |
c. ʃ_xapaʃ_xa | ∗! | | |
∗

In the tableau in (6), the two sibilants are located outside of a transvocalic window and correspondence is prohibited due to the violations of Proximity for candidates (6b) and (6c). As a result, candidate (6a) [ʃ_xapas_ya] would emerge as optimal, since the two sibilants are not in correspondence and therefore the candidate does not violate top-ranked Proximity. This is a seemingly simple solution for generating transvocalic consonant harmony, and is indeed how Rose and Walker (2004) analyze the transvocalic nasal consonant harmony of Ndonga (in contrast to its unbounded counterpart in Kongo).

However, since each collection of surface correspondents within an output form constitutes a set (an equivalence class; see Bennett, 2013), in which every member is a correspondent of every other member, things quickly become complex once the number of correspondents goes above two, even if each local pair of consonants is in adjacent syllables (straddling a single vowel and nothing else). We should expect harmony to apply in a stepping-stone fashion, and this is indeed what happens in real cases of transvocalic consonant harmony, such as Koyra, as shown in (7).

(7) Stepwise SvS sibilant harmony in Koyra (Koorete; Hayward, 1982)
a. /dʒaʃ/ dʒaʃ ‘fear’
b. /dʒaʃ-us-/ dʒaʃ-uʃ ‘cause to fear’
c. /dʒaʃ-us-esːe/ dʒaʃ-uʃ-eʃːe ‘let him/them frighten (s.o.)!’

However, the tableau in (8) shows how things go wrong for cases involving three consonants of the relevant class. The same constraint ranking that enforces harmony in a /…ʃVsV…/ sequence like in (5) will fail to do so in a /…ʃVsVsV…/ sequence.

(8) Harmony fails with three sibilants
/paʃasasa/ | Proximity | Corr[+strid] | CC-Ident[ant] | IO-Ident[ant]
a. paʃ_xas_yas_za | | ∗∗∗! | |
b. paʃ_xas_xas_xa | ∗! | | ∗∗ |
c. ☹ paʃ_xaʃ_xaʃ_xa | ∗! | | | ∗∗
d.
paʃ_xaʃ_xas_ya | | ∗∗ | | ∗!
e. ☞ paʃ_xas_yas_ya | | ∗∗ | |

The problem in situations like (8) is that in a chain of two or more transvocalic pairs of consonants of the relevant type, placing all of the consonants in correspondence will always result in a violation of Proximity, since at least two of the correspondents will necessarily be separated by one or more intervening consonants, such as the first and third sibilants in (8b)-(8c). Given the ranking Proximity ≫ Corr (which as explained above is the defining property of transvocalic harmony), the optimal resolution is to leave one of the consonants out of correspondence. The choice of which consonant to leave out falls to lower-ranked considerations such as Faithfulness, as can be seen in (8d) vs. (8e). Similarly, an input like /paʃaʃasa/ will surface without harmony, because [paʃ_xaʃ_xas_ya], with the first two sibilants in correspondence, will do better on Faithfulness than [paʃ_xaʃ_yaʃ_ya] with correspondence (and hence harmony) between the second and third sibilants. The crucial factor is thus not the number of potential harmony targets but the overall number of sibilants in the sequence.

In fact, the nature of the pathology is even more bizarre, in that the key criterion is the parity of that number, where the predictions for the (non)application of harmony are different for odd-parity vs. even-parity cases. With Proximity ≫ Corr, an even number of potentially-harmonizing consonants is best partitioned into a series of transvocalic correspondence pairs (…C_xVC_xVC_yVC_yV…). Harmony is therefore predicted to be enforced in such words only between the 1st and 2nd consonant in the sequence, as well as in the 3rd and 4th consonant, etc., whereas there should be no requirement for harmony between the 2nd and 3rd, or the 4th and 5th (etc.) consonants.
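The partitioning just described can be checked by brute force. The following Python sketch is a simplified illustration rather than a full ABC implementation: words are reduced to strings of sibilants that are each separated by a single vowel, every candidate pairs a surface string with a correspondence partition, and the ranking is fixed as locality limiter ≫ Corr[+strid] ≫ CC-Ident[ant] ≫ IO-Ident[ant]. As a stand-in for the directionality that the tableaux abstract away from, the leftmost sibilant is assumed to stay faithful. All names are invented for this sketch; the `"CC-cvc"` option anticipates a variant of the limiter under which only an intervening non-correspondent incurs a violation.

```python
from itertools import combinations, product

def partitions(n):
    """Yield each partition of n positions as a tuple of class labels."""
    def extend(prefix):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        for label in range(max(prefix, default=-1) + 2):
            yield from extend(prefix + [label])
    yield from extend([])

def violations(inp, out, cls, limiter):
    """Counts ordered as: limiter >> Corr >> CC-Ident >> IO-Ident."""
    pairs = list(combinations(range(len(out)), 2))
    linked = [(i, j) for i, j in pairs if cls[i] == cls[j]]
    if limiter == "Proximity":
        # Any consonant between two correspondents incurs a violation.
        locality = sum(1 for i, j in linked if j - i > 1)
    else:
        # "CC-cvc": only an intervening non-correspondent incurs a violation.
        locality = sum(1 for i, j in linked
                       if any(cls[k] != cls[i] for k in range(i + 1, j)))
    corr = len(pairs) - len(linked)
    cc_ident = sum(1 for i, j in linked if out[i] != out[j])
    io_ident = sum(1 for a, b in zip(inp, out) if a != b)
    return (locality, corr, cc_ident, io_ident)

def winners(inp, limiter):
    """All optimal (surface string, class labels) candidates for an input."""
    cands = [("".join(out), cls)
             for out in product("sʃ", repeat=len(inp)) if out[0] == inp[0]
             for cls in partitions(len(inp))]
    best = min(violations(inp, out, cls, limiter) for out, cls in cands)
    return [(out, cls) for out, cls in cands
            if violations(inp, out, cls, limiter) == best]

print(winners("ʃss", "Proximity"))   # three sibilants, as in /paʃasasa/
print(winners("ʃsʃs", "Proximity"))  # four sibilants
print(winners("ʃss", "CC-cvc"))
```

With four sibilants the sketch returns more than one winner, because the direction of repair inside the second pair is left open here; what the optimal candidates share is the partition into two transvocalic pairs.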
For an odd number of consonants, by contrast, the optimal correspondence configuration is for one consonant to stand outside of correspondence with any of the others, and for the (even-parity) sets of consonants on either side of that consonant to be partitioned into individual, harmonizing correspondence pairs as described above. Just as in (8), the determination of which consonant is the “odd man out” will fall to lower-ranked considerations such as Faithfulness. Needless to say, no natural language displays anything remotely resembling such a sound pattern.

In part to avoid problematic predictions like this, Bennett (2013) explicitly redefines Rose and Walker’s (2004) Proximity constraint such that it is only violated when there is some non-correspondent consonant that intervenes. Bennett (2013) calls this constraint CC·SyllAdj, which is defined as in (9).

(9) CC·SyllAdj: ‘Cs in the same correspondence class must inhabit a contiguous span of syllables.’ (Bennett, 2013, p. 85)
For each distinct pair of output consonants X and Y, assign a violation if:
a. X and Y are in the same surface correspondence class,
b. X and Y are in distinct syllables, Σ_x and Σ_y
c. there is some syllable Σ_z that precedes Σ_y, and is preceded by Σ_x
d. Σ_z contains no members of the same surface correspondence class as X and Y

Before demonstrating that this alleviates the parity-sensitivity, a revised version of the above constraint, which I call CC-cvc, is provided in (10). The key departure from Bennett’s CC·SyllAdj is that CC-cvc assigns violations based on transvocalic locality rather than syllable-adjacency.

(10) CC-cvc: For each distinct pair of output consonants X and Y, assign a violation if:
a. X and Y are in the same surface correspondence class,
b. there is some consonant Z that precedes Y, and is preceded by X
c.
Z is not in the same surface correspondence class as X and YAs demonstrated in the tableau in (11), which uses the same ranking as the abovetableaux but replaces Proximity with CC-cvc, the desired pattern can now begenerated, since sequences of three or more correspondents no longer violate thelocality-based CC·Limiter.(11) Sequences of three correspondents harmonize/paʃasasa/ CC-cvc Corr[+strid]CC-Ident[ant]IO-Ident[ant]a. paʃxasyasza ∗!∗∗b. paʃxasxasxa ∗!∗c.+ paʃxaʃxaʃxa ∗∗d. paʃxaʃxasya ∗!∗ ∗e. paʃxasyasya ∗!∗55Note, however, that all segments in correspondence are meant to form an equiva-lence class, but that CC-cvc is evaluated only after separating the correspondentsinto a series of pairs with linear order (that is, not for every possible pair). I pointout (with Hansson, 2014) that this fix to the problem begins to undermine the def-inition of the correspondence relation, making correspondence classes look morelike tiers or projections of segments, which is precisely the type of representationthat Chapter 4 will argue for.3.1.2 Pathological Dissimilation PatternsThis section begins by outlining how the constraint machinery of the Agreementby Correspondence framework offers dissimilation as an alternative repair for vi-olations of a phonotactic restriction (Section The result, as pointed outby Bennett (2013), is that the ABC model makes a number of concrete predic-tions about the typology of dissimilation. I will focus primarily on two predic-tions about locality that I argue are not supported by empirical data. First, Sec-tion demonstrates how basic ABC constraints can produce a pathologicalbeyond-transvocalic pattern of dissimilation that applies only outside of the localCvC window. Second, Section shows that with these same constraints,it is impossible to reproduce simple patterns of transvocalic dissimilation that arewidely attested cross-linguistically (and discusses amendments to the ABC theorythat Bennett, 2013 proposes to address this problem). 
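As a brief aside on the definition in (10), the difference between a Proximity-style constraint and CC-cvc can be made concrete with a small sketch. Everything in it is an assumption introduced for illustration: the encoding of a candidate as a list of (consonant, correspondence index) pairs, the function names, and the hypothetical `interrupted` candidate. The Proximity function is also a simplification, counting every corresponding pair that is separated by at least one consonant, and CC-cvc is evaluated literally as stated in (10), over every pair.

```python
from itertools import combinations

# A candidate is encoded (hypothetically) as its consonants in linear order,
# each paired with a surface correspondence index; None means "no index".
# Vowels are omitted, since neither constraint refers to them here.

def _corresponding_pairs(consonants):
    """Yield position pairs (i, j), i < j, that share a correspondence index."""
    for i, j in combinations(range(len(consonants)), 2):
        index_i, index_j = consonants[i][1], consonants[j][1]
        if index_i is not None and index_i == index_j:
            yield i, j

def proximity_violations(consonants):
    """Simplified Proximity: a corresponding pair with any consonant between."""
    return sum(1 for i, j in _corresponding_pairs(consonants) if j - i > 1)

def cc_cvc_violations(consonants):
    """CC-cvc as in (10): a corresponding pair with an intervening consonant Z
    that is not in the same correspondence class."""
    count = 0
    for i, j in _corresponding_pairs(consonants):
        shared = consonants[i][1]
        if any(index != shared for _, index in consonants[i + 1:j]):
            count += 1
    return count

# Candidate (11c): all three sibilants in one correspondence class.
candidate_11c = [("p", None), ("ʃ", "x"), ("ʃ", "x"), ("ʃ", "x")]
# Hypothetical candidate: a non-correspondent sits between two correspondents.
interrupted = [("p", None), ("ʃ", "x"), ("s", "y"), ("ʃ", "x")]

print(proximity_violations(candidate_11c), cc_cvc_violations(candidate_11c))
print(proximity_violations(interrupted), cc_cvc_violations(interrupted))
```

On this encoding, the chain of three correspondents in (11c) incurs a Proximity-style violation for the first and third sibilants but no CC-cvc violation, because the only intervening consonant belongs to the same class.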
Dissimilation in the ABC Framework

In the Agreement by Correspondence model, long-distance dissimilation emerges as a strategy to avoid satisfying an agreement requirement by making the two consonants less similar, thereby eliminating the need to have them in correspondence in the first place (Bennett, 2013). To illustrate this, consider a hypothetical language that requires homorganic consonants to be in surface correspondence (facilitated by Corr[αPlace]), and that demands agreement for [±constricted glottis] (e.g. plain vs. ejective stop) among corresponding consonants (CC-Ident[c.g.]). With a low ranking for faithfulness to input specifications for [c.g.], the optimal repair strategy is consonant harmony, such that two homorganic consonants are either both plain or both ejective, so long as all other constraints remain unviolated. This is shown below in Tableau (12), in which candidate (12c) is the winner.

(12) Long-distance [c.g.] harmony

     /t’amata/             Corr[αPlace]   CC-Ident[c.g.]   IO-Ident[c.g.]
     a.    t’_xamat_ya     ∗!
     b.    t’_xamat_xa                    ∗!
     c. ☞  t’_xamat’_xa                                    ∗

However, Tableau (13) shows that after demoting an IO-Ident[place] constraint (which was previously assumed to be undominated), the candidate that exhibits harmony in (13c) is no longer the winner, and instead it is better to dissimilate the place feature in order to remove any need for correspondence at all, as seen in candidate (13d) below.

(13) Long-distance [place] dissimilation

     /mat’ata/             Corr[αPlace]   CC-Ident[c.g.]   IO-Ident[c.g.]   IO-Ident[place]
     a.    mat’_xat_ya     ∗!
     b.    mat’_xat_xa                    ∗!
     c.    mat’_xat’_xa                                    ∗!
     d. ☞  mat’_xak_ya                                                      ∗

More generally speaking, by becoming less similar, a pair of consonants evades the scope of a high-ranked Corr constraint that would otherwise require them to be correspondents. Such avoidance can be driven by one or more high-ranked CC·Limiter constraints (Bennett, 2013; see Section 2.3) on surface correspondence configurations, if these cannot be satisfied in any other way (due to high-ranked Faithfulness, for example).
This idea, that dissimilation happens only in situations where correspondence is penalized, whereas harmony can only take place where correspondence is permitted (since correspondence is the vehicle for agreement), gives rise to Bennett’s (2013) “Mismatch Prediction” regarding the typology of these two families of sound patterns. That is, contexts that favour harmony should, other things being equal, be ones that fail to trigger dissimilation, and vice versa. Given that one type of CC·Limiter in the ABC theory penalizes any pair of correspondents that are located too far apart, such as Proximity, CC·SyllAdj, or CC-cvc (which are all defined above), these constraints too should be able to trigger dissimilation. As a result, we expect a typological mismatch between the locality patterns that are possible under consonant harmony and dissimilation, respectively. I note that while each of the locality-based CC·Limiters essentially makes the same problematic predictions, the remainder of this section uses CC-cvc to illustrate them.

Pathological Prediction: Beyond-Transvocalic Dissimilation

A locality-based CC·Limiter will trigger dissimilation only when the distance separating the relevant consonants is greater than the distance specified by the constraint (i.e. greater than a CvC distance for CC-cvc). The ABC model thus predicts the possibility of dissimilation patterns that are strictly beyond-transvocalic, applying in exactly the complement set of environments to what is seen for transvocalic consonant harmony.

The tableaux in (14) and (15) illustrate this schematically. As above, the hypothetical case in question involves surface correspondence between homorganic stops (Corr[αPlace]) and a demand that surface correspondents agree for the [c.g.] feature (CC-Ident[c.g.]).
The latter requirement can in principle be satisfied either directly, by laryngeal harmony under correspondence (/t’…t/ → [t’…t’], violating IO-Ident[c.g.]), or indirectly, by place dissimilation out of correspondence (/t’…t/ → [t’…k], violating IO-Ident[place]). However, as (14) shows, CC-cvc will, when ranked highly enough, trigger dissimilation of its own accord in beyond-transvocalic homorganic consonant pairs, even if CC-Ident[c.g.] is too low-ranked to play any active role. For transvocalic pairs of consonants as in (15), on the other hand, correspondence is permitted and dissimilation is therefore not triggered. The result is dissimilation in beyond-transvocalic environments only. (As the comparison between (15b) and (15c) shows, such a dissimilation pattern can coexist either with harmony or faithful non-interaction in transvocalic contexts, depending on the ranking of the relevant CC-Ident and IO-Ident constraints.)

(14) Beyond-transvocalic dissimilation: CC-cvc triggers dissimilation

     /t’amata/             Corr[αPlace]   CC-cvc   IO-Id[Place]   IO-Id[c.g.]   CC-Id[c.g.]
     a.    t’_xamat_ya     ∗!
     b.    t’_xamat_xa                    ∗!                                    ∗
     c.    t’_xamat’_xa                   ∗!                      ∗
     d. ☞  t’_xamak_ya                             ∗

(15) Beyond-transvocalic dissimilation: no CvC dissimilation

     /mat’ata/             Corr[αPlace]   CC-cvc   IO-Id[Place]   IO-Id[c.g.]   CC-Id[c.g.]
     a.    mat’_xat_ya     ∗!
     b. ☞  mat’_xat_xa                                                          ∗
     c. ☞  mat’_xat’_xa                                           ∗
     d.    mat’_xak_ya                             ∗!

Unfortunately for the ABC model, this predicted mismatch between consonant harmony and dissimilation is a very poor fit for the attested typology. The only case that exhibits anything resembling a strictly beyond-transvocalic pattern, Sundanese rhotic dissimilation (Cohn, 1992; Bennett, 2013, 2015), is replete with other complications which make it far less persuasive as a test case (infixing morphology, co-existence with lateral harmony, sensitivity to stem-initial vs. non-stem-initial position, root vs. affix affiliation and onset vs. coda status).
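The claim that a beyond-transvocalic pattern falls out of the basic constraint set can be checked mechanically. The sketch below is a minimal illustration rather than part of the original analysis: it transcribes the violation profiles of tableaux (14) and (15), tries every total ranking of the five constraints, and records which candidate wins in each tableau. The names `winner` and `factorial_typology`, and the dictionary encoding of the tableaux, are assumptions made for this example.

```python
from itertools import permutations

CONSTRAINTS = ["Corr[αPlace]", "CC-cvc", "IO-Id[Place]", "IO-Id[c.g.]", "CC-Id[c.g.]"]

# Violation profiles transcribed from tableau (14), input /t’amata/.
BEYOND_TRANSVOCALIC = {
    "a": {"Corr[αPlace]": 1},
    "b": {"CC-cvc": 1, "CC-Id[c.g.]": 1},
    "c": {"CC-cvc": 1, "IO-Id[c.g.]": 1},
    "d": {"IO-Id[Place]": 1},
}
# Violation profiles transcribed from tableau (15), input /mat’ata/.
TRANSVOCALIC = {
    "a": {"Corr[αPlace]": 1},
    "b": {"CC-Id[c.g.]": 1},
    "c": {"IO-Id[c.g.]": 1},
    "d": {"IO-Id[Place]": 1},
}

def winner(tableau, ranking):
    """Return the candidate whose violations are lexicographically smallest
    when read off in ranking order (strict domination)."""
    def profile(candidate):
        return tuple(tableau[candidate].get(c, 0) for c in ranking)
    return min(tableau, key=profile)

def factorial_typology():
    """Map each (winner of (14), winner of (15)) pattern to one ranking
    that generates it."""
    patterns = {}
    for ranking in permutations(CONSTRAINTS):
        pattern = (winner(BEYOND_TRANSVOCALIC, ranking), winner(TRANSVOCALIC, ranking))
        patterns.setdefault(pattern, ranking)
    return patterns

patterns = factorial_typology()
for pattern, ranking in sorted(patterns.items()):
    print(pattern, " ≫ ".join(ranking))
```

Under these assumptions the patterns ("d", "b") and ("d", "c") are both generated, which corresponds to dissimilation beyond the CvC window combined with either faithful non-interaction or harmony inside it. No ranking makes (15d) win without also making (14d) win, which anticipates the ranking paradox taken up in the next subsection.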
Ranking Paradox: Basic Transvocalic Dissimilation

In contrast to the pathological beyond-transvocalic dependencies, which are easily generated by ABC constraints in spite of limited typological support, sound patterns in which dissimilation is confined to transvocalic locality are amply attested cross-linguistically (Odden, 1994; Bennett, 2013). However, no permutation of the ABC constraint types discussed thus far is capable of generating a dissimilation that is confined to transvocalic contexts without also applying at longer distances. The reason for this is that a ranking paradox arises in cases like those presented in Tableaux (16) and (17) below.¹

¹ The example in Tableaux (16) and (17) continues with the dissimilation of homorganic stops, but for clarity omits the IO-Ident[c.g.] constraint and does not offer laryngeal harmony as a possible repair.

(16) CvC dissimilation requires Corr[F], CC-Ident[G] ≫ IO-Ident[F]

     /mat’ata/            CC-cvc   Corr[αPlace]   CC-Ident[c.g.]   IO-Ident[Place]
     a.    mat’_xat_ya             ∗!
     b.    mat’_xat_xa                            ∗!
     c. ☞  mat’_xak_ya                                             ∗

The example in (16) shows that in order to have any dissimilation whatsoever in a …CvC… context, both Corr[αPlace] and CC-Ident[c.g.] must outrank IO-Ident[Place]. Otherwise, the output would remain faithful to the input form, with the homorganic stops either being out of surface correspondence, as in (16a), or entering a surface correspondence relation without repairing the sequence that violates the CC-Ident constraint, as in (16b). This holds true no matter what the ranking of the locality-based CC·Limiter is (i.e. CC-cvc), since it will never be violated by the …CvC… consonant pair.

However, with the necessary ranking to achieve …CvC… dissimilation being Corr[place], CC-Ident[c.g.] ≫ IO-Ident[Place], consonants in …C…c…C… contexts will also undergo dissimilation, no matter the ranking of CC-cvc, as shown in (17).

(17) Ranking implies dissimilation at all distances

     /t’amata/            CC-cvc   Corr[αPlace]   CC-Ident[c.g.]   IO-Ident[Place]
     a.    t’_xamat_ya             ∗!
     b.    t’_xamat_xa    ∗!                      ∗!
     c. ☞  t’_xamak_ya                                             ∗

This is due to the fact that CC·Limiters may only penalize corresponding consonants and so candidates like (17c), which dissimilates in order to avoid correspondence altogether, will always be optimal. That is, it is impossible to rank these ABC constraints in a way that enforces transvocalic dissimilation without implying that dissimilation will hold at greater distances as well.

To deal with this problem, Bennett (2013) is forced to augment the model with special domain-restricted versions of the Corr constraints (without disposing of the locality-based CC·Limiters), which call for correspondence only in transvocalic consonant pairs. For example, Corr-cvc[αPlace] would penalize (16a) but not (17a). Replacing Corr[αPlace] with this constraint in (16) and (17) would produce a transvocalic-only dissimilation pattern, with winners (16c) and (17a), respectively. Such Corr-cvc[αF] constraints have previously been advocated by Hansson (2001, 2010b), but as an alternative to Proximity/CC·SyllAdj rather than complementary to it. The inclusion of both constraint types in the model creates an undesirable duplication of effort as well as rampant ambiguity of analysis, as practically every case of transvocalic consonant harmony can be interpreted either as involving the ranking CC-cvc ≫ Corr[αF] (with Corr-cvc[αF] ranked too low to be relevant) or else the undominated status of Corr-cvc[αF] (with a low ranking of CC-cvc and Corr[αF]).

3.2 Limited Experimental Support for ABC Predictions

3.2.1 Summary of Predictions

Predictions for Experiment 1 (liquid harmony; see Section 2.2.2) were obtained by assuming a relationship between the typology of phonotactic patterns and the way humans learn those patterns.
Since there are no attested cases of harmony exhibiting anything other than unbounded or transvocalic locality, the hypothesis (which the results supported) was that the subjects in the M-Harm training group, who were learning from cvLvcv-Lv contexts, would generalize to all levels of locality. Furthermore, since cross-linguistic evidence suggests that the unbounded patterns emerge from more local transvocalic dependencies, there was reason to suspect that subjects in the S-Harm training group would not generalize beyond the distance encountered in their cvcvLv-Lv training items. This was also supported by the results, and the inductive biases shown by the learners were easily framed within the ABC model of consonant harmony, by varying the relative ranking of Proximity.

Based only on the typology, we would make the same predictions for dissimilation, since unbounded and transvocalic variants of long-distance consonant dissimilation are both attested and, with but one possible exception (Sundanese; see Cohn, 1992; Bennett, 2013, 2015), no other types of locality are found cross-linguistically. While it seems fortuitous that the ABC framework facilitates dissimilation as an alternative repair for prohibited pairs of (non-adjacent) consonants, using the same constraint set gives rise to a set of predictions that contradicts the previous logic. As discussed above in Section 3.1.2, the factorial typology of ABC constraints includes sound patterns in which dissimilation is enforced only beyond the CvC window. Furthermore, in order to produce simple cases of transvocalic dissimilation, Bennett (2013) posits that the ABC model includes not only Proximity, but an additional, and otherwise redundant, locality-based Corr constraint (Corr-cvc[αF]).
There are thus at least two competing, yet both motivated, hypotheses for an experiment looking at the learnability of long-distance dissimilation.

If there is indeed a strong connection between the learnability of phonotactic patterns and the way such patterns are distributed cross-linguistically, then we would expect subjects to learn and generalize the pattern of dissimilation in the same way as subjects did in Experiment 1 when learning harmony. Subjects in an M-Diss training group would generalize the learned pattern to all distances, both shorter and greater than the cvLvcv-Lv contexts encountered in training. S-Diss subjects, on the other hand, would tend to restrict the pattern to apply only in the types of Short-range cvcvLv-Lv items that they were exposed to in training and not at greater distances.

The ABC model of long-distance phonotactics generates a different set of predictions. For the M-Diss group, there are two potential outcomes that fit the theory. Since the cvLvcv-Lv dissimilation encountered in training is compatible with both the unbounded and beyond-transvocalic variants of locality for dissimilation that can be generated using basic ABC constraints, we would expect to see subjects either generalize to all distances (unbounded locality) or to generalize only to the greater Lvcvcv-Lv contexts, without enforcing the pattern for Short-range cvcvLv-Lv test items (beyond-transvocalic locality). By contrast, since a ‘strictly transvocalic’ variant of locality is rather difficult to generate using ABC constraints, subjects in the S-Diss group would be expected to have some tendency (at least more so than the S-Harm group) to interpret the observed pattern as an unbounded dependency.
These two contradictory sets of predictions are the motivation for extending the artificial language learning study to patterns of liquid dissimilation in Experiment 2.

3.2.2 Experiment 2: Liquid Dissimilation

3.2.2.1 Methodology

Participants, Stimuli, and Procedure

Participants for two additional training conditions were recruited and compensated as described in Section 2.2.1, resulting in 32 new participants for Experiment 2 (16 female, 8 male, mean age 23), with 16 assigned to each of the “M-Diss” and “S-Diss” groups.

The details of the methodology for Experiment 2 are nearly identical to those of Experiment 1 (see Section 2.2.1), and all stimuli required for both experiments were recorded during the same sessions (one session for each of the four speakers). All participants completed a practice phase, a training phase, and a testing phase, with the sole difference being the type of pattern subjects in the experimental groups were exposed to during training.

Training Conditions

All training stems were identical to those used in Experiment 1. However, rather than seeing alternations that resulted in liquid harmony, subjects in the M-Diss group were exposed to triplets that exhibited a pattern of liquid dissimilation in cvLvcv-Lv contexts. Subjects in the S-Diss group were also exposed to liquid dissimilation, but only for cvcvLv-Lv items. The testing phase in Experiment 2 was identical to that of Experiment 1.
A breakdown of the number and type of stimuli presented in the training phases for the M-Diss and S-Diss groups is provided in Table 3.1.

Table 3.1: Examples of training items for M-Diss and S-Diss groups in Experiment 2

     Group    Training triplet                  Type and number of items
     M-Diss   …dutebi…dutebi-ɹu…dutebi-li…      96 stems with no liquid
              …mekotu…mekotu-li…mekotu-ɹu…
              …pilede…piɹede-li…pilede-ɹu…      48 stems with [l]
              …nelogi…nelogi-ɹu…neɹogi-li…
              …koɹupe…koɹupe-li…kolupe-ɹu…      48 stems with [ɹ]
              …guɹoto…guloto-ɹu…guɹoto-li…
     S-Diss   …dutebi…dutebi-ɹu…dutebi-li…      96 stems with no liquid
              …mekotu…mekotu-li…mekotu-ɹu…
              …pidele…pideɹe-li…pidele-ɹu…      48 stems with [l]
              …negilo…negilo-ɹu…negiɹo-li…
              …kopeɹu…kopeɹu-li…kopelu-ɹu…      48 stems with [ɹ]
              …gutoɹo…gutolo-ɹu…gutoɹo-li…

Results and Analysis

This section presents the results of Experiment 2, analyzing them in much the same way as was done above for Experiment 1 (see the analysis of Experiment 1 for full descriptions). As the two experiments were designed and implemented at the same time, the analysis presented here uses the response data from the same 16 Control subjects used for the analysis of Experiment 1. The preliminary mixed-effects logistic regression analysis of the results includes data from 48 subjects (the first 16 in each of the three conditions who completed the study) and models the log-odds of choosing the test item that had two different liquids. The fixed effects portion of the model includes the between-subjects variable of training group (M-Diss and S-Diss compared to the baseline Control group) and the within-subjects variable of trigger-target distance in the test item (Medium- and Long-range compared to the baseline Short-range items), and an interaction between Group and Distance.
The model also includes two nuisance variables that contributed significantly to the model fit, labelled Dissimilation Faithful (whether the stem liquid in the option with disagreement was faithful to the liquid in the bare-stem form, or whether it required a dissimilatory alternation) and Dissimilation Second (whether the option with disagreeing liquids was presented as the second member of the pair of 2AFC items). The random component consisted of by-subject intercepts and slopes for the same two nuisance variables, which are intended to offset individual tendencies for choosing harmony vs. disharmony, faithfulness vs. alternations, and the first vs. second 2AFC alternative. This model, which uses the Short-range test items as its baseline for the Distance variable, is summarized in Table 3.2.

Table 3.2: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 2 (N = 4534; log-likelihood = −2128.3)

     Coefficient               Estimate    SE        Pr(>|z|)
     Intercept                 −0.72645    0.24415   0.0029
     Dissimilation Faithful     2.38678    0.29248   < 0.0001
     Dissimilation Second      −0.71357    0.13958   < 0.0001
     Medium-range              −0.09605    0.16489   0.5602
     Long-range                 0.18617    0.16339   0.2545
     S-Diss                     2.51628    0.30460   < 0.0001
     M-Diss                     1.47546    0.30615   < 0.0001
     Medium-range × S-Diss     −2.03104    0.24981   < 0.0001
     Long-range × S-Diss       −2.70534    0.25122   < 0.0001
     Medium-range × M-Diss     −0.00588    0.23654   0.9802
     Long-range × M-Diss       −0.98671    0.23281   < 0.0001

The model in Table 3.2 leads to many of the same conclusions that were drawn from the analogous mixed logit model for the first 16 subjects in each condition of Experiment 1 (see Table 2.5). The negative estimate for the intercept (which uses all of the baseline measures for each of the variables) can be interpreted as follows: a subject in the Control group is less likely than chance to choose a Short-range test item with disagreeing liquids ([l…ɹ] or [ɹ…l]) when it requires a dissimilatory alternation and is presented as the first item of the 2AFC pair.
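To make the log-odds scale of Table 3.2 more tangible, the following sketch converts the intercept and the two nuisance-variable estimates into predicted probabilities. It is an illustration only, and it rests on two assumptions that the text implies but does not spell out: that the predictors are treatment-coded (0 at the baseline, 1 otherwise), and that the fixed-effect intercept is read as the prediction for a subject whose random effects are zero. The function name `inv_logit` and the variable names are introduced here.

```python
import math

def inv_logit(log_odds):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Fixed-effect estimates copied from Table 3.2.
INTERCEPT = -0.72645
DISSIMILATION_FAITHFUL = 2.38678
DISSIMILATION_SECOND = -0.71357

# Control subject, Short-range item, alternation required, option presented first.
p_baseline = inv_logit(INTERCEPT)
# Same cell, but the disagreeing option is faithful to the bare stem.
p_faithful = inv_logit(INTERCEPT + DISSIMILATION_FAITHFUL)
# Same cell as the baseline, but the disagreeing option is presented second.
p_second = inv_logit(INTERCEPT + DISSIMILATION_SECOND)

print(round(p_baseline, 3), round(p_faithful, 3), round(p_second, 3))
```

On this reading the baseline probability is roughly one in three, which is what "less likely than chance" amounts to in a two-alternative task.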
The likelihood of preferring disagreeing liquids increases when the stem liquid does not require an alternation (Dissimilation Faithful), but decreases when it is the second item of the 2AFC choices (Dissimilation Second). The relatively small effects and lack of significance for the Medium-range and Long-range predictor variables indicate that a Control subject is not any more or less likely to choose dissimilation at either of these distances compared to the Short-range baseline. Main effects for both experimental groups (S-Diss and M-Diss) are positive and significant, suggesting that subjects in each of these two groups were more likely to choose dissimilation in Short-range test items than Control subjects were. Based on the large and significant negative coefficient estimates of the first two interaction terms (Medium-range × S-Diss and Long-range × S-Diss), a subject in the S-Diss group is much less likely to choose dissimilation at the Medium- and Long-range distances (which were not encountered in training), suggesting that they did not generalize outwards from the Short-range distance. Estimates for the final two interaction terms in the model (Medium-range × M-Diss and Long-range × M-Diss) indicate that the M-Diss subjects are statistically no less likely to choose dissimilation at Medium-range compared to the baseline Short-range, but that they are significantly less likely to do so at Long-range.

Recall from the earlier discussion that a model of subject behaviour that uses a particular baseline measure of Distance cannot, strictly speaking, allow us to assess the hypotheses under consideration.
As a more appropriate alternative, Table 3.3 provides a direct comparison, in the form of odds ratios, for each of the two experimental groups after re-fitting the same mixed-effects logistic regression model with different choices of the baseline level for the test-item Distance factor. This table, along with Figure 3.1, illustrates that both experimental groups show evidence of learning. Overall, the M-Diss subjects learned a pattern of dissimilation from their cvLvcv-Lv training items, as they were more than four times (4.35) more likely than the Control group to choose dissimilation in the novel Medium-range test items. Likewise, subjects in the S-Diss group appear to have learned the pattern they were exposed to, being more than twelve times (12.38) more likely than the Control group to choose dissimilation in the cvcvLv-Lv context that they were exposed to in training.

Table 3.3: Odds ratios comparing experimental groups to Control group for choosing dissimilation with each of the three testing distances as model baselines. Contexts encountered in training are in boldface and all cells that reach significance are shaded.

                                   Type of test item (trigger-target distance)
                                   Short-range        Medium-range       Long-range
                                   (cvcvLv-Lv)        (cvLvcv-Lv)        (Lvcvcv-Lv)
     Nontransvocalic vs. Control   4.37 (p < 0.001)   4.35 (p < 0.001)   1.63 (p ≈ 0.104)
     Transvocalic vs. Control      12.38 (p < 0.001)  1.62 (p ≈ 0.087)   0.83 (p ≈ 0.498)

As seen in Figure 3.2, the M-Diss group seems to generalize this pattern inwards to the Short-range test items (4.37 times more likely to choose dissimilation than the Control group), but the same effect was not seen for the Long-range test items.
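The point estimates in Table 3.3 can also be recovered arithmetically from the single fit in Table 3.2, since changing the Distance baseline only re-expresses the same fixed effects. The sketch below is a worked check under that assumption, not the procedure reported above: it exponentiates the group coefficient plus the relevant Group × Distance interaction. The function name `odds_ratio` and the dictionary layout are introduced for illustration, and the p-values of the re-fitted models are not reproduced.

```python
import math

# Group main effects and Group x Distance interactions copied from Table 3.2
# (Short-range is the baseline distance, Control the baseline group).
GROUP = {"M-Diss": 1.47546, "S-Diss": 2.51628}
INTERACTION = {
    ("Medium-range", "M-Diss"): -0.00588,
    ("Long-range", "M-Diss"): -0.98671,
    ("Medium-range", "S-Diss"): -2.03104,
    ("Long-range", "S-Diss"): -2.70534,
}

def odds_ratio(group, distance):
    """Odds ratio of `group` vs. Control at `distance`, from log-odds terms."""
    log_or = GROUP[group] + INTERACTION.get((distance, group), 0.0)
    return math.exp(log_or)

for group in ("M-Diss", "S-Diss"):
    for distance in ("Short-range", "Medium-range", "Long-range"):
        print(group, distance, round(odds_ratio(group, distance), 2))
```

For instance, the M-Diss value at Long-range is exp(1.47546 − 0.98671), which rounds to the 1.63 given in the table.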
The M-Diss group showed a small increase (OR = 1.63) in the likelihood of choosing disharmony, compared to the Control group, in test items with a stem-initial liquid (Lvcvcv-Lv items), but the effect did not reach statistical significance.

Finally, subjects in the S-Diss group are not significantly more likely to choose dissimilation at either of the greater distances, being about 1.62 times more likely to choose dissimilation at Medium-range and about equally as likely (OR = 0.83) as the Control group to choose dissimilation at Long-range. Neither of these effects reached statistical significance.

Discussion of Experiment 2

The goal of Experiment 2 was to create a study of long-distance phonotactic learning identical to Experiment 1 (in terms of the overall methodology) in order to evaluate two competing, yet both motivated hypotheses. It turns out, however, that the results of Experiment 2 (as presented in the preceding Results and Analysis section) are not compatible, at least in their entirety, with any of the predictions outlined in Section 3.2.1. As can be seen in Table 3.3, which presents the size and significance of the odds ratios that emerge from the statistical analysis that compared each group to the Control group at each of the three testing distances, subjects in the M-Diss group were significantly more likely to choose the items with disagreeing liquids at Short-range and Medium-range, but this result did not extend to Long-range test items. This is actually problematic for all of the considered hypotheses: M-Diss subjects were expected to either generalize the pattern from Medium-range to all distances, or only outwards to the Long-range test items in accordance with the beyond-transvocalic patterns predicted by ABC. Note, however, that the present results may be confounded by a more general reluctance for experimental subjects to extend phonological alternations into salient word-initial contexts² (see, e.g., Becker et al., 2012).

Figure 3.1: Plots comparing proportions of disharmony responses for Control subjects to those of the M-Diss subjects in Medium-range test items (left panel) and to the S-Diss subjects in Short-range test items (right panel). Each dot represents individual subject performance, and group means are indicated with a horizontal line. Significance is extracted from a mixed logit model and indicates learning of the pattern each group was exposed to.
[Panel titles: “M-Diss learning at Medium-range? cvLvcv-Lv test items (saw dissimilation)” and “S-Diss learning at Short-range? cvcvLv-Lv test items (saw dissimilation)”. Vertical axis: proportion of disharmony responses ([l…r] or [r…l]); horizontal axis: Control vs. experimental group.]

Figure 3.2: Plots comparing proportions of disharmony responses for Control subjects to those of the M-Diss subjects in Short-range test items (left panel) and in Long-range test items (right panel). Each dot represents individual subject performance, and group means are indicated with a horizontal line. (Non-)Significance is extracted from a mixed logit model and indicates generalization of the pattern the group was exposed to.
[Panel titles: “M-Diss generalization to Short-range? Short-range test items (saw no evidence)” and “M-Diss generalization to Long-range? Long-range test items (saw no evidence)”. Vertical axis: proportion of disharmony responses ([l…r] or [r…l]); horizontal axis: Control vs. M-Diss.]

² The same reluctance was, to some degree, also seen in Experiment 1. The M-Harm group was less likely to choose the test item with harmony in Long-range test trials (OR = 2.25; see Table 2.6) than for Short-range (OR = 3.02) or Medium-range (OR = 3.80) trials.
However, the resistance to initial-syllable alternations was not enough to reduce the effect below significance for the M-Harm group.

Figure 3.3: Plots comparing proportions of disharmony responses for Control subjects to those of the S-Diss subjects in Medium-range test items (left panel) and in Long-range test items (right panel). Each dot represents individual subject performance, and group means are indicated with a horizontal line. (Non-)Significance is extracted from a mixed logit model and indicates generalization of the pattern the group was exposed to.
[Panel titles: “S-Diss generalization to Medium-range? Medium-range test items (saw no evidence)” and “S-Diss generalization to Long-range? Long-range test items (saw no evidence)”. Vertical axis: proportion of disharmony responses ([l…r] or [r…l]); horizontal axis: Control vs. S-Diss.]

With respect to the S-Diss group, subjects applied the pattern of dissimilation to the Short-range test items (indicating learning), but were not significantly more likely than the Control group to choose dissimilation in either the Medium- or Long-range test items. This result is in line with the cross-linguistic distribution of non-adjacent consonant dissimilation, in that strictly-transvocalic locality is relatively common and does not necessarily imply dissimilation at greater distances.
The lack of generalization to either of the greater distances goes against the predictions of the ABC model, however, as the basic constraint set does not permit dissimilation to apply at a transvocalic distance without implying unbounded locality.

3.2.3 Analysis of Successful Learners in Experiments 1 and 2

Motivation for Additional Analysis

Evidence from the analysis of both Experiments 1 and 2, as described above for each experiment, suggests that the results are in line with the hypothesis that human language learners only have access to two possible variants of locality for long-distance phonotactics: transvocalic or unbounded. These are the only two types of locality that are reliably attested cross-linguistically, but are not the only patterns the learner should have access to if the ABC framework is an accurate account of long-distance consonant agreement and disagreement. However, the evidence is not entirely definitive, especially with respect to Experiment 2, and there exist reasons to be skeptical of any conclusions.

The analyses previously presented for Experiments 1 and 2 were mixed-effects logistic regression models whose random effects structures were included in order to factor out individual subject tendencies in the test phase. However, when examining the plots of individual subject responses in Figure 2.1 and Figure 3.1 (which show the results for each experimental group at the testing distance that corresponds to the type of pattern encountered in training), it is clear that the proportions of (dis)harmony responses are not clustered around the mean. Instead, subjects tend to fall into one of two categories: ‘successful learners’ and ‘non-learners’. Successful learners are those subjects whose responses place them closer to the 100% mark for the testing distance that corresponds to their training (i.e. Short-range for S-Harm and S-Diss subjects, Medium-range for M-Harm and M-Diss) and non-learners are those subjects who remain within or close to the range of the Control subjects (around 50%, usually due to choosing the 2AFC option that was faithful to the stem). While it is also an interesting question to ask “Which of the two types of locality is more difficult to learn?” (perhaps measured by the proportion or number of subjects that successfully learned the target pattern), the goal of the present study is to determine whether or not human learners generalize patterns learned from an impoverished input in a way that matches the typology (or the theoretical predictions). With this in mind, the remainder of Section 3.2.3 presents an alternative and arguably more appropriate analysis of Experiments 1 and 2, augmented with data from further subjects, as described in the next section.

Defining a Threshold for Successful Learning

The preliminary analyses of Experiments 1 and 2 included the response data from the first 16 participants in each group. The following procedure was used to instead obtain 12 “successful” learners in each of the S-Harm, M-Harm, S-Diss, and M-Diss groups. A subject was considered to have reached the threshold for successful learning provided that the proportion of responses that adhered to the target pattern at the same distance that the subject was exposed to in training surpassed the 95% confidence level using a one-tailed test on a binomial distribution.³ I note that this threshold was defined independently of the previously collected data (i.e. without knowing how many of the first 16 subjects in each condition would qualify as learners). Intuitively, the idea was to ensure that the probability of a subject who responded randomly being classified as a learner was less than 1 in 20.
For example, if an M-Harm subject in Experiment 1 registered a response on all 32 of the relevant test items (Medium-range in this case), at least 21 of them would need to have harmony in order for the subject to be classified as a successful learner, such that their data would be retained for the present analysis. In many cases subjects did not respond within the allotted three second window, and therefore did not have a registered response for all 32 trials. A subject who only registered 31 responses would have to have chosen 20 of the relevant test items with harmony to surpass the threshold of learning. Subjects who registered either 29 or 30 responses needed to reach 19 responses with harmony. Data from subjects who did not register at least 29 responses (on the 32 relevant test trials) were not considered for this analysis.

³ A one-tailed test was used (as opposed to a two-tailed test) since subjects who learned the target pattern are expected to choose the test items adhering to that pattern more often than the Control group, rather than simply being significantly different from them.

For reference, in Experiment 1, eight of the sixteen M-Harm subjects achieved the threshold for learning, having responded with harmony in a sufficient number of Medium-range test items, as did six of the sixteen S-Harm subjects, where learning was assessed based on the responses to Short-range test items. In Experiment 2, six of the M-Diss subjects and eleven of the S-Diss subjects were considered successful learners based on the same criteria. Data were collected from further subjects until the number of learners in each of the four groups reached twelve. Of the sixteen subjects in the Control group (the same subjects for both Experiment 1 and 2), no one surpassed the defined threshold at any of the three testing distances. To maintain consistency with the other groups, the analyses below include data only from the first twelve control subjects.
Results for Successful Learners in Experiments 1 and 2

After obtaining twelve successful learners in each of the S-Harm, M-Harm, S-Diss, and M-Diss groups, results were analyzed using mixed-effects logistic regressions with structures identical to those used in the above analyses of the first 16 subjects in each condition of Experiments 1 and 2. To highlight the important aspects of the extended analysis, tables summarizing the full regression models are omitted (see Appendix B for the full models), and I present only the odds ratios for each of the harmony and dissimilation conditions, as seen in Table 3.4.⁴

⁴ The results are presented in the same table in order to provide an easy visual comparison of the overall results. Data were analyzed with two separate mixed-effects logistic regressions: one for the harmony conditions and one for the dissimilation conditions.

With respect to the two groups that were exposed to harmony, the model indicates that both the M-Harm and S-Harm groups are significantly more likely than the Control group to choose a test item with liquid harmony at all three levels of locality given in the test phase of the experiment. This is a deviation from the otherwise consistent result of previous experiments looking at the learning and generalization of consonant harmony locality (Section 2.2.2 of this dissertation; Finley, 2011, 2012; McMullin, 2013; McMullin and Hansson, 2014). I point out that there is a dramatic difference in the size of these odds ratios (maximum of 14.59 for S-Harm at Short-range, minimum of 1.63 for S-Harm at Long-range), but will postpone my discussion of the implications of this result until the next section.

Table 3.4: Odds ratios comparing 12 successful learners from experimental groups to first 12 control subjects for choosing the target pattern with each of the three testing distances as model baselines for each of the harmony and dissimilation conditions. Contexts encountered in training are in boldface and all cells that reach significance are shaded.

                      Type of test item (trigger-target distance)
                      Short-range          Medium-range         Long-range
                      (cvcvLv-Lv)          (cvLvcv-Lv)          (Lvcvcv-Lv)
M-Harm vs. Control    11.95 (p < 0.001)    12.94 (p < 0.001)    3.64 (p ≈ 0.007)
S-Harm vs. Control    14.59 (p < 0.001)    1.78 (p ≈ 0.012)     1.63 (p ≈ 0.031)
M-Diss vs. Control    8.86 (p < 0.001)     14.65 (p < 0.001)    2.06 (p ≈ 0.009)
S-Diss vs. Control    12.01 (p < 0.001)    1.55 (p ≈ 0.106)     0.87 (p ≈ 0.614)

The results obtained for the two dissimilation groups, shown in the bottom two rows of Table 3.4, are more in line with the expected outcome based on the cross-linguistic typology of locality. Subjects in the M-Diss group were significantly more likely than the Control group to choose a test item with two different liquids at all three testing distances. Subjects in the S-Diss group were significantly more likely to do so only at the Short-range distance that they were exposed to in training. S-Diss subjects were just 1.55 times more likely to choose dissimilation in Medium-range test items (an effect that did not reach a significance level of p < 0.05), and at Long-range they were about as likely as the Control group to choose dissimilation (OR = 0.87, p ≈ 0.614).

Discussion: Successful Learners in Experiments 1 and 2

As described above, the results of the first twelve learners in the harmony conditions did not meet expectations in that not only the M-Harm group, but also the S-Harm group, were significantly more likely to choose test items with liquid harmony at all three levels of locality.
This deviates from previous experiments as the statistics, strictly speaking, support the conclusion that S-Harm subjects interpreted their relatively local cvcvLv-Lv training as being representative of a pattern with unbounded locality. This result brings up a number of interesting points for discussion.

First, even when the statistics are interpreted in this binary fashion, drawing conclusions based only on whether or not an effect reached a conventional significance level of p < 0.05 (as was the original intention for this approach), the result that both sets of subjects applied the pattern with unbounded locality does not contradict the idea that there is a relationship between the types of patterns found cross-linguistically and the types of patterns that are included in the human learner’s hypothesis space. For the M-Harm group, the only attested type of locality that their cvLvcv-Lv training items were compatible with was an unbounded pattern, since a dependency across two vowels (and an intervening consonant) implies that the same dependency should hold both at shorter and longer distances. Any other result would contradict the attested typology of locality for patterns of long-distance consonant agreement. However, the cvcvLv-Lv dependency that was presented in the S-Harm training phase is compatible with two different types of locality that are both attested: transvocalic and unbounded. From the preliminary analysis of Experiment 1, which used the first 16 subjects whether they were successful learners or not, I concluded that subjects interpreted the impoverished input conservatively and did not extend it to either of the two greater testing distances. The present analysis, however, shows that when the data are restricted to subjects who learned a pattern at the Short-range distance, those subjects do, when taken as a group, tend to generalize the dependency to pairs of liquids that are farther apart.
Importantly, this is true not only for the Medium-range, but also for the Long-range test items, in spite of an established reluctance for experimental participants to generalize phonological alternations into initial syllables (Becker et al., 2012; an effect also seen in Finley, 2012). As such, when this statistical analysis is interpreted with a strict criterion for ‘significant or not’, the result does differ from past experiments that were not restricted only to the learners of a pattern, but arguably strengthens the evidence for a strong relationship between learnability and typology.

Further information about the behaviour of the subjects in the harmony conditions can be drawn by considering the relative values of the odds ratios (i.e. the effect size) in each of the cells in Table 3.4, as opposed to only the shading that indicates a p-value of less than 0.05. In particular, note that the S-Harm group is more than fourteen times (OR = 14.59) more likely than the Control group to choose liquid harmony at the Short-range distance that corresponds to their training. This is not at all surprising given that each of the S-Harm subjects was selected for the analysis precisely because they provided a large proportion of harmony responses at this distance. Comparatively, however, the odds ratios at Medium- and Long-range are much smaller at 1.78 and 1.63 respectively. This means that although the effect at each testing distance reached significance, subjects in the S-Harm group were not even twice as likely as the Control group to choose harmony at either of the two distances not encountered in training.
The M-Harm subjects, who were about thirteen times (OR = 12.94) more likely than the Control group to choose harmony at the distance they were exposed to in training (Medium-range), also had a large effect for the Short-range distance (OR = 11.95, which indicates a strong tendency to generalize the pattern inwards to shorter distances) and the odds ratio for Long-range test items was relatively large as well (OR = 3.64). Though the latter figure is perhaps not as impressive as those with OR > 10, it nonetheless provides evidence that subjects in the M-Harm group tend to generalize the pattern outwards to word-initial Long-range contexts even more than S-Harm subjects applied the pattern to Medium-range test items.

As a final point of discussion for the results of the learners-only analysis of the harmony conditions, I consider the issue of where non-adjacent dependencies arise in the first place. Evidence suggests that unbounded consonant harmony emerges diachronically from systems that restrict the phonotactic dependency to transvocalic contexts (Dolbey and Hansson, 1999; Gunnar Ólafur Hansson, pers. comm.). In probabilistic terms, the overall tendency for language learners in this experiment (and conceivably for learners of a natural language with transvocalic consonant harmony) is to apply the pattern with a high probability in …CvC… contexts, and to overextend the pattern (albeit with a much lower probability) into …C…c…C… contexts. A small effect of this type would likely not undermine the stability of a pattern with transvocalic locality, but over time some such patterns could be interpreted as unbounded by a new generation of speakers due to any number of factors (e.g. the number of learners that over-generalized the pattern or the number of lexical items that adhered to an unbounded pattern surpassed some threshold).

I turn now to the results for the two groups in the dissimilation condition, which had provided the original motivation for looking only at the learners of the pattern.
Before the analysis was limited in this way, the results of the statistical model were troubling in that they did not support any of the predictions laid out in Section 3.2.1. However, as illustrated in the bottom two rows of Table 3.4, the results now provide evidence that humans learn and generalize patterns of long-distance dissimilation in exactly the same way that they do for harmony: the M-Diss group acquires an unbounded pattern, generalizing to all levels of locality (though, as with harmony, less so to the word-initial context of the Long-range test items), and the S-Diss group interprets the pattern as having strictly transvocalic locality and does not over-generalize to greater distances. I note, however, that there is also a small effect for the S-Diss group when choosing dissimilation in Medium-range test items (OR = 1.55). Though the effect does not quite reach statistical significance (p ≈ 0.106), it nonetheless leaves open the possibility that a diachronic shift from transvocalic to unbounded dissimilation might be predicated on the tendency for some learners to overextend the pattern.

Overall, after restricting the statistical analysis to data from those who surpassed a threshold for learning, the results of these experiments do not support the predictions of the ABC model. Specifically, subjects in the S-Diss training group would be expected to learn an unbounded pattern, or at least be more likely than subjects in the S-Harm group to generalize to greater distances, but there is no evidence to support either of these two predictions. The observed outcome thus contradicts the ABC model since a transvocalic-only pattern of dissimilation cannot be generated without the addition of redundant constraints that permit further undesirable pathologies (e.g. Corr constraints that stipulate a CVC window for assessing violations; see Section above).
Moreover, the M-Diss group showed no tendency to interpret the dissimilation in cvLvcv-Lv contexts as the sort of “beyond-transvocalic” dependency that the ABC model predicts to be possible.

3.3 Summary and Conclusions

This chapter first demonstrated that the Agreement by Correspondence (ABC) model of long-distance consonant interactions (Walker, 2000a,c; Hansson, 2001, 2010a; Rose and Walker, 2004; Bennett, 2013) produces a number of questionable predictions when applied to complex instances of consonant harmony and patterns of long-distance consonant dissimilation (Bennett, 2013, 2015), which are not supported by the typology. However, it is important to note that while these pathologies were generated under Bennett’s (2013) definition of the correspondence relation (i.e. an equivalence relation that is symmetric, reflexive, and transitive), prior formulations of ABC (e.g. Hansson, 2001, 2010a; Rose and Walker, 2004) do not make the same assumptions. While each proposal leads to slightly different sets of predictions, no version of ABC avoids all of the problematic predictions that were presented above.

The second portion of this chapter presented experimental results that indicate the same patterns of generalization for liquid dissimilation as were previously established for liquid harmony. Learners exposed to dissimilation at Short-range tend not to extend this pattern to further distances. More importantly, learners do elevate an observed Medium-range-only dissimilation pattern to an unbounded dependency, counter to the predictions of the ABC model.
The fact that the same learning bias is evidenced for both harmony and dissimilation argues against the way locality relations are referenced in correspondence-based analyses of consonant harmony and dissimilation and weakens the case for Bennett’s (2013) “mismatch prediction” regarding these two types of phenomena.

Chapter 4
Locality in Formal Language Theory: A Tier-Based Solution

This chapter seeks to address the apparent shortcomings of the Agreement by Correspondence (ABC; Walker, 2000a,c; Hansson, 2001, 2010a; Rose and Walker, 2004; Bennett, 2013) approach to long-distance consonant interactions by proposing a formal-language-theoretic account of the observed properties of the typology and learnability of such patterns. In approaching phonotactic dependencies from the perspective of formal language theory, the goal is to characterize the set of observed patterns as a class of stringsets (i.e. formal languages; see Section 1.4.2). Strings of segments that adhere to a set of phonotactic restrictions will be grammatical words in the language (stringset), but any word that violates the phonotactics will be ungrammatical (and therefore not a member of the stringset).

As discussed in Section 1.4.2, Johnson (1972) and Kaplan and Kay (1994) establish that any phonological mapping that can be generated with an ordered set of rewrite rules (i.e. A → B / C __ D) belongs to the class of regular relations. Furthermore, this means that all stringsets generated by these relations (the surface phonotactics) are members of the regular region of the Chomsky hierarchy (Chomsky, 1956; Rabin and Scott, 1959; see Figure 1.2). The result is that virtually all attested phonotactic patterns are indeed regular, including long-distance consonant agreement and disagreement (Heinz, 2010; Heinz et al., 2011; Payne, 2014).
While there is reason to be skeptical of the idea that the full class of regular languages is a plausible definition of what constitutes a possible (and human-learnable) phonotactic dependency (see, e.g., Heinz, 2007, 2010; Lai, 2012), the region can be further broken down into a number of subregular language classes (McNaughton and Papert, 1971; Rogers et al., 2010; Heinz et al., 2011; Rogers and Pullum, 2011). These can be organized into a subregular hierarchy, which is shown in Figure 4.1 (repeated from Figure 1.3 in Chapter 1).

[Diagram; class labels: Regular, Star-Free, Locally Threshold Testable, Locally Testable, Strictly Local, Strictly Piecewise, Piecewise Testable, Tier-based Strictly Local]

Figure 4.1: Illustration of the subregular hierarchy. The largest class of formal languages (i.e. Regular) is presented on the top, and subset classes are presented below. Each language class is thus a proper subset of any class that is above it and connected by a line. Subregular language classes that are most relevant to this dissertation are presented in boldface.

This chapter provides a foundation for studying the formal properties of long-distance phonotactics within the subregular hierarchy, outlining two alternatives for a characterization of the dichotomy of unbounded vs. transvocalic harmony. I begin by summarizing a modular approach advocated by McMullin and Hansson (2014), who argue that the typology of consonant harmony locality is a result of two distinct learning modules whose combination results in two distinct types of patterns: transvocalic dependencies between consonants may be acquired as Strictly 3-Local (SL3; McNaughton and Papert, 1971) languages that ban certain …CvC… sequences while unbounded patterns can instead be learned as Strictly 2-Piecewise (SP2; Heinz, 2010; Rogers et al., 2010) languages with restrictions on certain …C…C… subsequences (Section 4.1).
The resulting definition of the learner’s hypothesis space under this approach is a union of two subregular classes of stringsets: SL3 ∪ SP2.

While the modular account provides a relatively close fit to the empirical data, I argue in Section 4.2 that it is preferable to characterize long-distance dependencies as Tier-based Strictly 2-Local languages, based on cases of harmony and dissimilation with blocking that are easily captured within the TSL2 region (but not within the SL3 ∪ SP2 region). Furthermore, Section 4.3 demonstrates that none of the unattested patterns (including the ABC pathologies discussed in Section 3.1) can be characterized as a TSL2 stringset. Section 4.4 presents the results of Experiments 3 and 4, which show that patterns outside of the TSL2 region are extremely difficult for humans to learn in the lab: very few subjects who are exposed to such patterns are able to reproduce them in testing, and several subjects seem to learn the dependency as a TSL2 restriction in spite of overt counter-evidence in the training data against such an interpretation. Section 4.5 summarizes the arguments presented in this chapter and concludes.

As a final note, when characterizing the phonotactics of a language as a stringset, I assume that the patterns are surface true, in that all grammatical words are members of the stringset and any ungrammatical word is not. This stands in contrast to the notion that co-occurrence restrictions are violable constraints. While the present chapter approaches long-distance consonant interactions categorically (i.e. as grammatical or not), I will argue in Chapter 5 that defining constraints as individual subregular stringsets allows for a simple account of several attested patterns that are otherwise relatively complex.

4.1 Strictly Local and Strictly Piecewise Languages

Characterizing a co-occurrence restriction as a member of a subregular class is relatively simple when the dependency is bounded in terms of locality.
Such patterns can always be defined as Strictly k-Local (SLk), where k is the greatest number of segments over which the restriction must be enforced (including those involved in the restriction). Substrings of length k are called k-factors (i.e. segment n-grams), and the grammar for an SLk language can be thought of as a list of permitted (or, equivalently, prohibited) k-factors. For example, we can define a language that allows only CV syllable structure as SL2, since the restriction targets pairs of adjacent segments (2-factors, bigrams). With a simplified alphabet of Σ = {C, V}, the grammar for such a language needs only to include a few 2-factor restrictions: G = {*CC, *VV, *#V, *C#}, where # is a word boundary.¹ Transvocalic variants of consonant harmony and dissimilation can also be defined as SL languages, though k must provide a slightly larger window for application of the co-occurrence restriction. As defined in Chapter 2, transvocalic harmony does not hold over intervening consonants, but does hold across intervening vowels in …CvC… sequences. Consider a simplified version of the Koyra pattern shown in (1) (reproduced from above; see Section 2.1 for full description of the data), which reduces the segment inventory to Σ = {s, ʃ, t, a}, where ‘s’ represents a [+anterior] sibilant, ‘ʃ’ is a [–anterior] sibilant, ‘t’ is any other consonant, and ‘a’ is any vowel. The grammar would simply be a list of k-factors (with k ≤ 3) not permitted in the language,² namely any that include both [s] and [ʃ]: G = {*ʃas, *saʃ, *sʃ, *ʃs}.³

(1) Transvocalic sibilant harmony in Koyra (Koorete; Hayward, 1982)
    a. /tim-d-osːo/        tindosːo       ‘he got wet’
    b. /patʃ-d-osːo/       patʃːoʃːo      ‘it became less’        *patʃːosːo
    c. /giːʒ-d-osːo/       giːʒːoʃːo      ‘it suppurated’         *giːʒːosːo
    d. /ʃod-d-osːo/        ʃodːosːo       ‘he uprooted’
    e. /ʔatʃ-ut-d-osːo/    ʔatʃutːosːo    ‘he (polite) reaped’

¹ In general, this dissertation sets aside the issue of word boundaries. Technically speaking, they are usually incorporated into the theory by defining the language with a set of symbols Σ that is augmented with two word-boundary symbols: one for the beginning of a word and one for the end.

² 2-factors are included in SL3 grammars in order to reduce the number of k-factors that must be listed. A full grammar for this example would also include 3-factors such as {*ʃʃs, *stʃ, *asʃ, …}.

³ The data in (1) only demonstrate the ungrammaticality of *[–ant]V[+ant]. However, the restriction holds more generally as a morpheme structure constraint, such that no well-formed root may contain *[–ant]V[+ant] or *[+ant]V[–ant]. Also, since the language has no suffixes that contain an underlying [–ant] sibilant, there is no evidence that, e.g., *saʃ should actually be permitted (Hayward, 1982; Hansson, 2010a).

For patterns with unbounded locality, however, one of the main challenges is that any formal language characterization must allow for, in principle, an infinite number of non-participating segments to intervene between the relevant pair. Unbounded dependencies therefore cannot be SLk for any value of k, since the restrictions hold at all distances, including k + 1.

Strictly k-Piecewise (SPk) languages are those that can be defined in terms of linear precedence relations. Using subsequences of length k, which give information about linear order without reference to distance or intervening material, it is possible to capture the less restricted nature of unbounded consonant harmony and dissimilation. For example, the 2-subsequences (precedence relations) of a word [sataʃa] would include {s…a, s…t, s…ʃ, a…t, a…a, a…ʃ, t…a, t…ʃ, ʃ…a}.
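The two notions introduced so far, k-factors and 2-subsequences, can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions: words are plain strings over the simplified alphabets used above, a grammar is a set of banned factors, and the function and variable names are hypothetical.

```python
from itertools import combinations

def k_factors(word: str, k: int) -> set[str]:
    """Contiguous substrings of length k (the k-factors of the word)."""
    return {word[i:i + k] for i in range(len(word) - k + 1)}

def two_subsequences(word: str) -> set[tuple[str, str]]:
    """Ordered pairs (x, y) such that x precedes y somewhere in the word."""
    return set(combinations(word, 2))

def violates(word: str, banned: set[str]) -> bool:
    """True if any banned factor occurs among the word's factors of that length."""
    return any(f in k_factors(word, len(f)) for f in banned)

# SL2 grammar for CV-only syllables; '#' marks the word boundaries.
CV_BANNED = {"CC", "VV", "#V", "C#"}
# Simplified Koyra grammar from the text (banned factors with k <= 3).
KOYRA_BANNED = {"ʃas", "saʃ", "sʃ", "ʃs"}

print(violates("#CVCV#", CV_BANNED))     # every syllable is CV
print(violates("#CVC#", CV_BANNED))      # contains the banned factor C#
print(violates("sataʃa", KOYRA_BANNED))  # [t] separates the two sibilants
print(violates("tasaʃa", KOYRA_BANNED))  # contains the banned factor saʃ

# Precedence relations of [sataʃa], written with the ellipsis notation.
print(sorted(x + "…" + y for x, y in two_subsequences("sataʃa")))
```

The last line prints the same nine precedence relations that are listed for [sataʃa] in the text, which shows that repeated pairs (such as s…a) are counted only once.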
Since SP languages encode information about the order of segments while ignoring distance, most attested cases of unbounded consonant harmony can be characterized as SP2 (including patterns with asymmetric directionality or feature dominance; Heinz, 2010). This includes the case of Aari sibilant harmony shown in (2), in which the perfective suffix /-s/ surfaces as [-ʃ] when it is preceded by a [–ant] sibilant at any distance (see Section 2.1 for full description of the data). Using a simplified segment inventory, as above, with Σ = {s, ʃ, t, a}, this pattern of sibilant harmony with unbounded locality can be characterized as SP2 with the following grammar: G = {*ʃ…s, *s…ʃ}.

(2) Unbounded sibilant harmony in Aari (Hayward, 1990)
    a. /baʔ-s-e/        baʔse         ‘he brought’
    b. /ʔuʃ-s-it/       ʔuʃʃit        ‘I cooked’
    c. /tʃʼa̤ːq-s-it/    tʃʼa̤ːqʃit     ‘I swore’
    d. /ʃed-er-s-it/    ʃederʃit      ‘I was seen’

Finally, it is also important to note that transvocalic dependencies cannot be characterized as SP2, since precedence relations offer no information about the distance between two segments. If the target pattern is transvocalic sibilant harmony, the grammaticality of a word such as [sadaʃ] (which does not include any …SvS… substrings) implies that any word with an s…ʃ subsequence, including a word such as [saʃ], which violates transvocalic harmony, should also be grammatical (barring, of course, any other phonotactic violations). In fact, this holds more generally, in that any phonotactic dependency that is bounded by some measure of locality is not SPk for any value of k (provided, of course, that the value of k does not exceed the longest word in the language).

4.1.1 Learning Bias and the Argument for Modular Learning

Given some upper bound on k, sound patterns that are SLk or SPk are proven to be efficiently learnable in the limit from positive evidence (Gold, 1967) by relatively simple learning algorithms.
For SLk patterns, the learner simply keeps track of all n-grams, or k-factors, encountered in the training data (Garcia et al., 1990; Heinz, 2007). For SPk patterns, the learner instead records all encountered k-subsequences (a precedence learner; Heinz, 2010). Based on the fact that the transvocalic and unbounded dependencies found in natural language instantiate well-defined but distinct formal classes of languages, McMullin and Hansson (2014) argue that phonotactic learning is modular, with different learning algorithms responsible for detecting different types of phonotactic regularities (Heinz, 2010; for arguments for modular learning of phonological vs. syntactic patterns, see Heinz and Idsardi, 2011; Lai, 2012). Both types of locality can be efficiently learned under the assumption that the phonological learner contains at least two sub-modules: an n-gram learner for transvocalic harmony (SL3) and a precedence learner for unbounded harmony (SP2). By contrast, the unattested locality patterns, such as those depicted in Table 2.1, are neither SLk nor SPk (at least not for any reasonable value of k). From this perspective, the gaps in the typology of locality relations in consonant harmony are thus a direct reflection of learning bias. Alternative locality patterns, which are conceivable but unattested, are situated outside of the learner’s hypothesis space and are inaccessible (diachronically and synchronically) due to this inductive bias that operates in phonotactic learning.
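The two learners described above can be sketched in a few lines of Python. This is a minimal illustration and not the cited algorithms themselves: each learner simply stores every structure it observes in the positive data and later accepts a word only if all of that word's structures have been observed. The toy training words and all function names are hypothetical.

```python
from itertools import combinations

def three_factors(word: str) -> set[str]:
    """All contiguous substrings of length 3 (the n-gram learner's view)."""
    return {word[i:i + 3] for i in range(len(word) - 2)}

def precedences(word: str) -> set[tuple[str, str]]:
    """All ordered pairs (x, y) with x before y (the precedence learner's view)."""
    return set(combinations(word, 2))

def learn(words, extract):
    """Grammar = union of every structure attested in the training words."""
    grammar = set()
    for word in words:
        grammar |= extract(word)
    return grammar

def accepts(grammar, word, extract) -> bool:
    """A word is accepted iff all of its structures are attested."""
    return extract(word) <= grammar

# Hypothetical training words obeying transvocalic harmony only.
transvocalic_data = ["sadaʃ", "dasa", "ʃadas", "sasa", "ʃaʃa"]
sl3 = learn(transvocalic_data, three_factors)
sp2 = learn(transvocalic_data, precedences)

print(accepts(sl3, "dasadaʃ", three_factors))  # novel word, attested 3-factors
print(accepts(sl3, "saʃ", three_factors))      # 3-factor saʃ never observed
print(accepts(sp2, "saʃ", precedences))        # s…ʃ was observed in sadaʃ

# Hypothetical training words obeying unbounded harmony.
unbounded_data = ["sadas", "ʃadaʃ", "dasa", "ʃada"]
sp2_unbounded = learn(unbounded_data, precedences)
print(accepts(sp2_unbounded, "sadaʃ", precedences))  # s…ʃ never observed
```

On this toy data the precedence learner cannot reject [saʃ] once it has seen [sadaʃ], which mirrors the argument above that transvocalic dependencies are not SP2, while the 3-factor learner handles the transvocalic pattern but has no way to enforce a restriction at greater distances.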
Such patterns are beyond the capabilities of either the SL or the SP module and will therefore not be acquired and replicated faithfully by learners, other things being equal (as seen in the results of Experiments 1 and 2; see also Section 4.4 below for results of Experiments 3 and 4).

4.1.2 Evidence Against this Approach from Blocking

Though Heinz (2010) demonstrates that nearly all cases of unbounded consonant harmony from two typological surveys that were available at the time (Hansson, 2001; Rose and Walker, 2004) can be described as members of the SP2 class of formal languages, the potential extension of this approach to all types of long-distance dependencies is impeded by a number of attested phonotactic patterns that are not SP2 (nor SLk for that matter). For example, Heinz (2010) himself notes that certain cases of consonant dissimilation are known to exhibit segmental blocking effects (Odden, 1994; Heinz et al., 2011; Bennett, 2013). In the case of Georgian liquid dissimilation, illustrated in (3) with the liquids presented in boldface, the adjectival suffix /-uri/ surfaces as [-uli] when preceded by [r] anywhere in the word, except if [l] intervenes.⁴

(3) Liquid dissimilation with blocking in Georgian (Fallon, 1993; Odden, 1994; Bennett, 2013)
    a. dan-uri         ‘Danish’
    b. p’olon-uri      ‘Polish’
    c. ungr-uli        ‘Hungarian’
    d. aprik’-uli      ‘African’
    e. avst’ral-uri    ‘Australian’
    f. kartl-uri       ‘Kartvelian’
    g. bulgar-uli      ‘Bulgarian’

In the above data, (3a)-(3b) show that the suffix /-uri/ surfaces faithfully as [-uri] when the root contains no liquids, or when the root contains an [l]. As seen in (3c)-(3d), when the root contains an [r] at any distance away from the suffix, /-uri/ instead surfaces as [-uli]. However, (3e)-(3f) demonstrate a segmental blocking effect: when [l] intervenes between an [r] in the root and the suffix, the suffix surfaces faithfully as [-uri] in spite of the presence of a preceding [r] in the root.
Finally, (3g) shows that the pattern is blocked only when the [l] intervenes between two [r]s, not anytime there is an [l] in the root.

Note that the pattern found in Georgian cannot be classified as SP2 with a restriction against *r…r subsequences. Such an analysis would account for the basic generalization seen in (3a)-(3d), but would fail to permit cases like (3e)-(3f), which exhibit a blocking effect, as these still contain a supposedly-banned *r…r subsequence. Likewise, because of words like (3g), it cannot be analyzed with a relative ranking of two constraints, each defined as an SP2 stringset, such as *[l…l] ≫ *[r…r] (cf. *X…X constraints discussed by Pulleyblank, 2002).

⁴ Fallon (1993) provides evidence that the /r/→[l] alternation does not occur only for the adjectival suffix /-uri/ (which, when combined with the name of a country, denotes the nationality of a thing, not a person), but also for a number of other suffixes containing /r/ that adhere to the same generalization (i.e. triggered by a preceding [r], blocked by intervening [l]).

The typological evidence thus suggests that a description of long-distance phonotactics in terms of precedence relations is too restrictive, accounting for only a subset of attested patterns. The remainder of this chapter pursues a different strategy, demonstrating that the Tier-based Strictly 2-Local class of formal languages (Heinz et al., 2011) not only encompasses each of the attested parameters of locality and blocking (some of which the ABC model cannot easily handle) but also excludes the unattested patterns presented in Table 2.1 as well as the pathological patterns predicted by ABC, as described in Section.

4.2 Tier-Based Strictly 2-Local Languages

The precedence relations encoded in an SP2 grammar provide a convenient solution for ignoring irrelevant intervening material.
As an alternative means of achieving the same goal, the Georgian pattern shown in (3) can instead be thought of as a restriction against contiguous segment pairs {*ll, *rr}, where adjacency is crucially assessed only among liquid consonants.⁵ Patterns that can be similarly described are members of the Tier-based Strictly 2-Local class of formal languages (TSL2; Heinz et al., 2011). This characterization of long-distance dependencies, while still relatively simple, can account for several typological properties of locality that are presented throughout the remainder of this section.

⁵ The inclusion of *ll is a slight oversimplification, in order to provide a generalization that is directly comparable to the pattern used in Experiment 2. While the data available in Fallon (1993) do not include any root-internal [l…l] subsequences (unless an [r] intervenes, as in [liberal-ur-] ‘liberal’), there is another suffix /-eli/, which denotes the nationality of a person, and never surfaces as *[-eri], even in cases such as [p’olon-eli] ‘Polish’.

In more formal terms, TSLk languages can be defined as follows. For some alphabet Σ (a segment inventory), the grammar G of a Tier-based Strictly k-Local language is a two-tuple G = ⟨T, S⟩, where the tier T is some subset of Σ over which adjacency is assessed, and S is the set of k-factors permitted on that tier (for an exhaustive formal definition of TSL languages and proofs for several computational properties of the TSL class, see Heinz et al., 2011; Jardine and Heinz, 2015).
As shown in Table 4.1, a tier can be a set of segments that corresponds to a natural class, but since it is mathematically defined simply as a subset of the segment inventory, it could also be any arbitrary collection of segments.

Table 4.1: Example tier-based strings for a hypothetical word [pireʃaʃolus], when Σ = {p, s, ʃ, r, l, i, e, a, o, u}

Contents of T        Description of T    Tier-based string
Σ                    all segments        pireʃaʃolus
{p, s, ʃ, r, l}      consonants          prʃʃls
{i, e, a, o, u}      vowels              ieaou
{s, ʃ}               sibilants           ʃʃs
{r, l}               liquids             rl
{p, ʃ, i, u}         arbitrary set       piʃʃu

Note that to arrive at the tier-based string, Heinz et al. (2011) define an erasing function that removes any segments that are not in T.⁶ Following Jardine and Heinz (2015), I use R to denote the set of tier-based 2-factor restrictions (i.e. the complement of S with respect to all possible tier-based 2-factors), as it is often more convenient to describe a TSL2 grammar as G = ⟨T, R⟩. For the Georgian liquid dissimilation pattern presented in (3), the components of the TSL2 grammar encoding the dependency are T = {l, r} and R = {*ll, *rr}.

⁶ Specifically, $E_T(\sigma_1 \cdots \sigma_n) = u_1 \cdots u_n$, where $u_i = \sigma_i$ iff $\sigma_i \in T$ and $u_i = \lambda$ otherwise, where $\lambda$ denotes the empty string (Heinz et al., 2011, p. 60).

4.2.1 Consonant Harmony from the TSL2 Perspective

Patterns of long-distance consonant agreement can equally be described in TSL2 terms. The unbounded sibilant harmony pattern of Aari shown in (2) is a restriction on sequences of [αanterior][–αanterior] sibilants on a tier that includes all and only [+strident] segments. Continuing from the above examples with a simplified inventory of Σ = {s, ʃ, t, a}, this pattern can be generated by the following TSL2 grammar: G = ⟨T = {s, ʃ}, R = {*ʃs, *sʃ}⟩. Heinz (2010) hesitates to describe patterns of long-distance consonant agreement in terms of tiers, citing a lack of known systems that exhibit blocking.
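The erasing function and the Georgian grammar G = ⟨T, R⟩ given above can be illustrated with a short Python sketch. It is a simplified rendering under stated assumptions: segments are single characters, word boundaries are ignored, the function names are hypothetical, and the two starred Georgian forms are constructed here for illustration rather than taken from the cited sources.

```python
def project(word: str, tier: set[str]) -> str:
    """Erase every segment that is not on the tier (the erasing function)."""
    return "".join(seg for seg in word if seg in tier)

def tsl2_grammatical(word: str, tier: set[str], banned: set[str]) -> bool:
    """True if no banned 2-factor occurs in the tier-based string."""
    tier_string = project(word, tier)
    return all(tier_string[i:i + 2] not in banned
               for i in range(len(tier_string) - 1))

# Tier-based strings for the hypothetical word of Table 4.1.
WORD = "pireʃaʃolus"
for description, tier in [("consonants", set("psʃrl")),
                          ("vowels", set("ieaou")),
                          ("sibilants", set("sʃ")),
                          ("liquids", set("rl")),
                          ("arbitrary set", set("pʃiu"))]:
    print(description, project(WORD, tier))

# Georgian liquid dissimilation: T = {l, r}, R = {*ll, *rr}.
T, R = {"l", "r"}, {"ll", "rr"}
for form in ["danuri", "ungruli", "avst'raluri", "bulgaruli",
             "ungruri", "bulgaruri"]:  # last two: constructed violations
    print(form, project(form, T), tsl2_grammatical(form, T, R))
```

Because the blocker [l] is itself on the tier, a form like avst'raluri projects to rlr and contains no banned 2-factor, whereas the constructed form ungruri projects to rr and is rejected.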
I argue, however, that patterns with the widely attested locality type of transvocalic consonant harmony, such as the Koyra sibilant harmony shown in (1), can be recast as long-distance dependencies that are blocked by any intervening consonant. From the perspective of TSL2 languages, the phonotactic grammar of Koyra bans sequences of [αant][–αant] sibilants, exactly as in the unbounded case of Aari, but in Koyra violations are assessed on a tier that includes all of the consonants, rather than only sibilants. Forms like (1d) [ʃodːosːo] ‘he uprooted’ are grammatical precisely because another consonant, in this case [d], remains present when the segment string is reduced to the consonants. The [d], intervening between a pair of sibilants that constitutes a member of R, can therefore be construed as a ‘blocker’, since it interrupts a sequence of sibilants that would otherwise violate the phonotactic grammar.

Further motivation for including long-distance consonant agreement within the scope of the TSL2 approach is that additional cases have come to light that involve more obvious instances of consonant harmony with blocking. Relevant cases include Kinyarwanda (described in Section 5.1.1; Walker and Mpiranya, 2005; Hansson, 2007; Walker et al., 2008), Imdlawn Tashlhiyt Berber (described in Section; Elmedlaoui, 1995; Hansson, 2010b), and Slovenian (Jurgec, 2011), which is shown below in (4) with all coronal obstruents in boldface.

(4) Sibilant harmony with blocking in Slovenian (Jurgec, 2011)
    a. spi ‘sleeps’             ʃpi-ʃ ‘(you) sleep’
    b. za-klɔn ‘shelter’        ʒa-klɔn-iʃtʃe ‘bomb shelter’
    c. tsepəts ‘fool’           tʃeptʃ-ək ‘fool-dim’
    d. sit ‘full’               na-sit-iʃ ‘(you) feed’
    e. zida ‘(s/he) builds’     zida-ʃ ‘(you) build’

The data in (4) demonstrate that the regressive sibilant harmony pattern found in Slovenian, which bans *[+ant]…[–ant] subsequences in (4a)-(4c), is blocked when a coronal obstruent such as [t] or [d] intervenes, as in (4d)-(4e).
Note that in (4b) the coronal sonorants [n] and [l] are transparent, just like non-coronals are. As such, the grammar for Slovenian would prohibit 2-factors of, e.g., R = {*sʃ, *zʃ, *t͡st͡ʃ, etc.}, but the relevant tier would include all coronal obstruents (as opposed to just the sibilants, or all of the consonants).

4.2.2 Locality as a Consequence of the Tier

In terms of TSL2 grammars, the distinction between unbounded dependencies, dependencies with blocking, and transvocalic dependencies can thus be attributed to a difference in the particular subset of Σ that comprises the designated tier T (and its relationship to R), rather than any change to R itself. The grammars presented in Table 4.2 show this for three hypothetical languages with sibilant harmony, which are representative of the range of attested patterns (e.g. Aari, Slovenian, and Koyra, respectively). In order to facilitate a direct comparison between each type, the segment inventory is restricted to Σ = {s, ʃ, p, t, a}, and the set of prohibited 2-factors is always R = {*sʃ, *ʃs}.

Table 4.2: TSL2 grammars for three types of sibilant harmony

Type of pattern    Σ                  T               R
Unbounded          {s, ʃ, p, t, a}    {s, ʃ}          {*sʃ, *ʃs}
Blocking           {s, ʃ, p, t, a}    {s, ʃ, t}       {*sʃ, *ʃs}
Transvocalic       {s, ʃ, p, t, a}    {s, ʃ, p, t}    {*sʃ, *ʃs}

With Σ and R held constant, Table 4.2 shows that variation in (what appears to be) locality is merely a by-product of manipulating the contents of the relevant tier T over which violations are assessed. Unbounded dependencies are a result of a tier that includes only segments that are present in members of R.
If any additional segments are included in T, such as a coronal obstruent [t], these will block the pattern. The transvocalic locality type arises when all other consonants are also included in T (and are hence blockers), but the class of vowel segments is systematically absent from the tier.

More generally, the set of segments occurring in the members of R (such as {s, ʃ} for the languages above) can be thought of as a set of harmonic segments (i.e. potential triggers or targets), which I will denote Ρ (where each $\sigma_x \in \Sigma$ is a member of Ρ if and only if there is some 2-factor $\sigma_x\sigma_y$ or $\sigma_y\sigma_x$ that is in R). Note that Ρ is a subset of T, and likewise, T is by definition a subset of Σ. Because of this property (i.e. Ρ ⊆ T ⊆ Σ), there are overall just three possible types of segments. First, any segment that occurs in Ρ (and is therefore in both T and Σ) will participate in the dependency as a potential trigger or as a target for some repair strategy (e.g. harmony or dissimilation). Second, any segment that is in T (and therefore also in Σ), but is not present in Ρ, will act as a neutral segment that blocks interaction between two segments on either side of it. Finally, a segment that is in Σ but not T (and therefore not in Ρ) will be neutral and transparent to the dependency. This is summarized in Table 4.3, for some segment σ ∈ Σ.

Table 4.3: Three types of segments in a TSL2 grammar

Type of segment              σ ∈ Σ?    σ ∈ T?    σ ∈ Ρ?
Harmonic (trigger/target)    ✓         ✓         ✓
Neutral (opaque)             ✓         ✓         ✗
Neutral (transparent)        ✓         ✗         ✗

A final relevant property of the TSL2 language class is that when T = Σ, a stringset can also be described as a simple Strictly 2-Local pattern (i.e. restrictions against string-adjacent segments).
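The relationship between the tier and the apparent locality of a dependency can be made concrete with a short sketch based on the three grammars of Table 4.2. It assumes single-character segments and no word-boundary symbols; the function names and the test words are hypothetical.

```python
SIGMA = {"s", "ʃ", "p", "t", "a"}
BANNED = {"sʃ", "ʃs"}  # R, held constant across the three grammars

# The three tiers from Table 4.2; only T differs between the pattern types.
TIERS = {
    "unbounded": {"s", "ʃ"},
    "blocking": {"s", "ʃ", "t"},
    "transvocalic": {"s", "ʃ", "p", "t"},
}


def violates(word, tier, banned=BANNED):
    """True if two tier-adjacent segments of the word form a banned 2-factor."""
    tier_string = "".join(seg for seg in word if seg in tier)
    return any(tier_string[i:i + 2] in banned for i in range(len(tier_string) - 1))


def classify(segment, tier, banned=BANNED):
    """Return the segment type of Table 4.3 for one member of the inventory."""
    harmonic = {seg for factor in banned for seg in factor}  # the set Ρ
    if segment in harmonic:
        return "harmonic"
    if segment in tier:
        return "opaque"
    return "transparent"


for word in ["saʃa", "sapaʃa", "sataʃa"]:  # invented test words
    print(word, {name: violates(word, tier) for name, tier in TIERS.items()})
# saʃa   is ruled out by all three grammars
# sapaʃa is ruled out by the unbounded and blocking grammars only
# sataʃa is ruled out by the unbounded grammar only
```

Under the blocking tier, `classify` treats [t] as opaque and [p] as transparent, which mirrors the second and third rows of Table 4.3.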
Because any SL2 pattern can be stated as a TSL2 grammar with T = Σ, the TSL2 region properly includes all SL2 stringsets (a result that holds more generally for any value of k; Heinz et al., 2011).

4.3 Pathological Patterns That Are Not TSL2

I have now shown how the attested types of locality for long-distance consonant agreement and disagreement with and without blocking are easily captured by describing phonotactic patterns as members of the Tier-based Strictly 2-Local class of formal languages. In this section, I argue that unattested variants of locality, as well as the unusual varieties of long-distance dependency patterns that are predicted to be possible by the basic architecture of the ABC model (see Section 3.1), are pathological not merely in terms of typological attestation but also from the standpoint of computational complexity and learnability. Table 4.4 shows certain unattested types of locality that were discussed (with respect to consonant harmony) in Section 2.1, none of which can be described as a TSL2 pattern.

Table 4.4: Unattested variants of long-distance locality

Locality         …CvC…    …CvcvC…    …CvcvcvC…
≤ 1 consonant    +        +          –
= 1 consonant    –        +          –
≥ 1 consonant    –        +          +

The patterns in the first two rows of Table 4.4, dependencies that hold across either at most one or exactly one consonant, are still TSLk languages, but require at least k = 3. For an example using Σ = {s, ʃ, t, a}, if the target pattern is sibilant harmony that holds across at most one or exactly one intervening consonant, one must keep track of at least three consonants in order to prohibit words such as *[sataʃa] and *[ʃatasa] while still permitting [satataʃa] and [ʃatatasa]. While both patterns are TSL3 with T = {s, ʃ, t}, the difference between the two locality variants is whether or not the grammar also prohibits words that include tier-based 3-factors {Csʃ, Cʃs, sʃC, ʃsC}, where C is any of {s, ʃ, t}.
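The two TSL3 variants just described can be sketched as follows. Two assumptions go beyond the text and are labelled in the code: the tier string is padded with a boundary symbol `#` so that short words still yield 3-factors, and the context symbol C is allowed to be that boundary as well as any of {s, ʃ, t}. The set of factors banned by the exactly-one variant (any disagreeing sibilant pair separated by one tier segment) is likewise an interpretation of the prose, and all names are hypothetical.

```python
TIER = {"s", "ʃ", "t"}
EDGE = "#"  # assumption: word-boundary symbol added on both sides of the tier string
DISAGREEING = (("s", "ʃ"), ("ʃ", "s"))


def tier_factors(word, k=3):
    """Return the k-factors of the boundary-padded tier projection of the word."""
    tier_string = EDGE + "".join(seg for seg in word if seg in TIER) + EDGE
    return {tier_string[i:i + k] for i in range(len(tier_string) - k + 1)}


# Exactly one intervening consonant: ban sXʃ and ʃXs for any tier segment X.
EXACTLY_ONE = {a + x + b for a, b in DISAGREEING for x in TIER}

# At most one: additionally ban Csʃ, Cʃs, sʃC and ʃsC.
# Assumption: C may also be the boundary symbol.
CONTEXT = TIER | {EDGE}
AT_MOST_ONE = (
    EXACTLY_ONE
    | {c + a + b for a, b in DISAGREEING for c in CONTEXT}
    | {a + b + c for a, b in DISAGREEING for c in CONTEXT}
)


def is_grammatical(word, banned):
    """A word is well formed if none of its tier-based 3-factors is banned."""
    return tier_factors(word).isdisjoint(banned)


for word in ["sataʃa", "satataʃa", "saʃata", "taʃasa"]:
    print(word, is_grammatical(word, EXACTLY_ONE), is_grammatical(word, AT_MOST_ONE))
```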
That is, words such as [saʃata] and [taʃasa] would be ungrammatical if the dependency holds across up to one consonant (as in the first row of Table 4.4), but they would be grammatical for a pattern that holds across exactly one consonant (the second row of Table 4.4).

A dependency that holds across at least one consonant (as in the last row of Table 4.4) is not TSLk for any value of k, as the phonotactic legality of a word cannot be determined solely in terms of presence vs. absence of individual k-factors regardless of how the tier T is construed. Again using sibilant harmony as an example, such a language would need to permit sequences of …saʃa…, …ʃasa…, but any strings including *…sataʃa…, *…ʃatasa… would be prohibited. Crucially, the latter two restrictions cannot be ruled out simply by setting T to include all consonants and having R contain the 3-factors {*stʃ, *ʃts}. This is because the pattern of sibilant harmony is meant to apply regardless of the number of intervening consonants (provided it is more than one), as in *…satataʃa…, *…satatataʃa…, etc. There is thus no upper bound on k that will suffice to rule out all illegal sequences, since a word of the form *sa…(ta)^{k+1}…ʃa will always be erroneously classified as grammatical. Instead, a phonotactic pattern of this type falls into (a tier-based instantiation of) the Locally Testable class (Rogers and Pullum, 2011), which is defined in terms of Boolean operations over sets of k-factors. In the present example an illegal word is one that contains both a member of {ts, #s} and {tʃ, #ʃ} among its 2-factors on the consonantal tier (augmented by word boundaries).⁷

Recall from Section that the ABC framework can generate phonotactic dependencies of exactly this type, provided the dependency is a pattern of dissimilation (which I call ‘beyond-transvocalic’ dissimilation). Interestingly, if the pattern applies to identical consonants, such as two liquids that are both [+lateral] or both [–lateral] (i.e.
a beyond-transvocalic analogue of the Georgian case seen in (3) above), it lies outside even the Locally Testable class. This is because the grammar would need to be able to count the number of instances of certain k-factors; in the case of beyond-transvocalic liquid dissimilation with Σ = {l, r, t, a}, a word is illegal if it contains two or more occurrences of one of the 2-factors in {rt, tr, lt, tl} on the consonant tier. The relevant class is therefore (a tier-based instantiation of) the Locally Threshold Testable languages (Rogers and Pullum, 2011).

⁷ Note that ‘ts’ and ‘tʃ’ do not denote affricates here, but 2-factors of [t]+[s] or [t]+[ʃ].

The computational status of phonotactic patterns with beyond-transvocalic locality is thus somewhat analogous to the (unattested) “first-last harmony” pattern described by Lai (2012), where words of the structure #s…ʃ# or #ʃ…s# are banned (but both of, e.g., #s…s…s# and #s…ʃ…s# are permitted). Lai’s artificial language learning experiments showed a failure to learn first-last harmony, suggesting that Locally Testable patterns that lie outside the TSL (and SP) regions of the subregular hierarchy (Figure 4.1) are beyond the grasp of the human phonological learner, a proposal that is further supported by evidence from Experiments 3 and 4 presented below in Section 4.4.

Finally, neither of the two ABC pathologies discussed in Section 3.1.1 can be characterized as a TSL2 pattern. The status of the bizarre parity-sensitive harmony pattern generated by high-ranked Proximity (see Section for full description and tableaux) is unclear at present (it may, for instance, be Regular but not Star-Free), but it in any case falls beyond the SL, SP or TSL subregular classes. Also outside those classes are the “agreement by proxy” effects discussed in Section. In the example case, where /s…g/ → [z…g] assimilation is dependent on a nearby [x], the sound pattern cannot be expressed in TSLk terms for any k, even if the tier T is defined as {s, z, g, x}.
This is because the pattern holds no matter how many additional instances of [g] or [s] intervene between the [s…g] pair and the “proxy” [x].

4.4 Experiments 3 and 4: Rich-Stimulus Training

This chapter has thus far established that we can define the boundaries of a learner’s hypothesis space in terms of formal language theory. After first offering two alternatives for doing so within the subregular hierarchy, I argued that the class of Tier-based Strictly 2-Local formal languages provides an accurate approximation of the cross-linguistic properties of long-distance dependencies with respect to both locality and blocking. I now present experimental evidence that the TSL2 approach is on the right track in terms of offering a level of complexity that accurately defines the hypothesis space of a human learner. Specifically, Experiments 3 and 4 show that very few subjects who are exposed to patterns of liquid harmony or dissimilation that hold at a Medium-range distance, but demonstrably fail to hold at Short-range, are able to learn anything at all from their training. Furthermore, of those subjects who seem to pick up on a cvLvcv-Lv dependency, many appear to learn it as a TSL2 restriction that applies to Short-range contexts as well (i.e. as an unbounded dependency), in spite of the counter-evidence provided in their training.

4.4.1 Motivation for Experiments 3 and 4

Experiments 1 and 2 used a “Poverty-of-Stimulus” paradigm (e.g. Wilson, 2006; Finley and Badecker, 2009) to determine not only whether humans can learn a dependency between non-adjacent liquids, but also whether or not they generalize the learned pattern to contexts that were purposely withheld from them in the training phase. Recall that the results suggested that subjects who learn liquid harmony or dissimilation from Medium-range cvLvcv-Lv items tend to generalize the pattern to all levels of locality.
By contrast, subjects who learn from Short-range cvcvLv-Lv items tend to restrict the pattern to the Short-range transvocalic distance (though see Section for evidence that some subjects in the S-Harm group may generalize in an unbounded fashion).

Of relevance to the current discussion is that neither of the two Medium-range training groups in Experiments 1 or 2 (M-Harm or M-Diss) showed evidence of having learned a pattern with beyond-transvocalic locality (i.e. by applying the pattern only to the Medium- and Long-range test items, but not to the Short-range test items). These results were expected based on the typology of locality in long-distance phonotactics. Furthermore, the fact that the M-Diss group in particular did not learn a beyond-transvocalic pattern was used as an argument against the predictions of the Agreement by Correspondence framework, since the factorial typology of ABC predicts that beyond-transvocalic dissimilation should be a possible pattern (while, notably, beyond-transvocalic harmony should not).

This chapter has argued that beyond-transvocalic dependencies cannot exist in natural language because they are inaccessible to the human learner, whose hypothesis space for phonotactic patterns is defined by the proposed TSL2 region. Recall that Section 4.3 established that beyond-transvocalic dependencies are outside of this region. However, the results of Experiments 1 and 2 do not necessarily provide conclusive evidence for leaving patterns with beyond-transvocalic locality outside the space of human-learnable languages.
Humans may be capable of learning such a dependency, but simply have a preference for the (formally less complex) unbounded version of the pattern when presented with training items that are compatible with both, as was the case in the “Poverty-of-Stimulus” design used in Experiments 1 and 2.

For example, consider the possibility that consonant harmony with either unbounded or beyond-transvocalic locality is a possible pattern, and recall that the M-Harm group was exposed to liquid harmony in Medium-range (cvLvcv-Lv) contexts. Since the training phase presented no information about the behaviour of liquids at the Short-range (cvcvLv-Lv) distance or the Long-range (Lvcvcv-Lv) distance, the training data were, in principle, compatible with either unbounded harmony or beyond-transvocalic harmony. Even though subjects in the M-Harm group showed evidence of having generalized the dependency to all distances, this does not mean that they are incapable of learning beyond-transvocalic harmony. Instead, the strongest conclusion we can draw is that, when presented with data that are ambiguous between unbounded and beyond-transvocalic locality, learners prefer the unbounded interpretation.

The purpose of Experiments 3 and 4 is therefore to provide subjects with a “Rich-Stimulus” training phase, offering (conflicting) evidence about what happens to pairs of liquids that co-occur in Short- vs. Medium-range contexts, in order to determine whether or not the TSL2 region defines a hypothesis space that is too restrictive.

4.4.2 Methodology

4.4.2.1 Participants, Stimuli, and Procedure

Participants were recruited and compensated in the same way as in previous experiments, resulting in 32 new participants for each of Experiment 3 (26 female, 6 male, mean age 23) and Experiment 4 (21 female, 11 male, mean age 22), which were performed using the same stimulus set and procedures that were used for Experiments 1 and 2 (see Section 2.2.1).
Training Conditions

Data were collected for four new groups of subjects who were again tested on the same 96 test items, but differed in the types of words and phonotactic patterns they were exposed to in training. Recall that in Experiments 1 and 2, only 50% of the training stems for each of the S-Harm, M-Harm, S-Diss, and M-Diss conditions contained a liquid (either [l] or [ɹ]) in the relevant position. The remaining half of the stems contained no liquids whatsoever, and so all consonants remained faithful when the suffixes [-li] or [-ɹu] were attached. In Experiments 3 and 4, this second (faithful) half of the training data was replaced with a different set of stems. All segments in the replacement stems continued to stay faithful when the suffixes were added, but in this case they contained a liquid [l] or [ɹ]. Depending on whether the first portion of the training data showed evidence of alternating liquids at the Short- or Medium-range distance, the faithful liquids were located at the opposite distance. That is, subjects were given explicit evidence that liquids alternate at one of the Short- or Medium-range distances, but that they do not alternate at the other distance. There are thus four new groups of this type, which are labelled S-Harm-M-Faith and M-Harm-S-Faith for Experiment 3, and S-Diss-M-Faith and M-Diss-S-Faith for Experiment 4. These labels follow the conventions established for previous experiments (“S” indicates Short-range cvcvLv stems, “M” indicates Medium-range cvLvcv stems, and “Harm” or “Diss” indicates that the target pattern at this distance was harmony or dissimilation, respectively).
To further differentiate them from previous groups, the distance that showed counter-evidence in the form of faithful liquids is indicated by “-S-Faith” or “-M-Faith” at the end of each group’s label.

As an illustration of what the subjects were exposed to in training, Table 4.5 provides a breakdown of the number and type of stimuli encountered by the S-Harm-M-Faith group in Experiment 3.

Table 4.5: Example of S-Harm-M-Faith training in Experiment 3.

Training triplet                      Type and number of items
…begeli…begeɹi-ɹu…begeli-li…          48 cvcvLv stems with [l]
…domelo…domelo-li…domeɹo-ɹu…          (Short-range harmony)
…mopeɹe…mopeɹe-ɹu…mopele-li…          48 cvcvLv stems with [ɹ]
…tetoɹi…tetoli-li…tetoɹi-ɹu…          (Short-range harmony)
…pilede…pilede-li…pilede-ɹu…          48 cvLvcv stems with [l]
…nelogi…nelogi-ɹu…nelogi-li…          (Medium-range faithfulness)
…koɹupe…koɹupe-li…koɹupe-ɹu…          48 cvLvcv stems with [ɹ]
…guɹoto…guɹoto-ɹu…guɹoto-li…          (Medium-range faithfulness)

Note that a pattern of this type is compatible with the (attested) transvocalic variant of consonant harmony and is predicted to be learned as is. While it may seem strange to include a group like this, given that the results of Experiment 1 show that the S-Harm group is able to learn a transvocalic pattern even without being exposed to faithful liquids at the Medium- or Long-range distance, the S-Harm-M-Faith group (and the S-Diss-M-Faith group) is important for the interpretability of results and for a more direct comparison with Experiments 1 and 2. Consider the possibility that the responses given by the M-Harm-S-Faith group are not statistically different from the Control group. We cannot necessarily attribute a failure of learning to the fact that the pattern was more complex than those in Experiments 1 and 2, since there are methodological changes. For example,
For example,96every stem that subjects in Experiment 3 (and Experiment 4) were exposed to intraining contained a liquid, and this difference alone might lead to more difficultyin processing the input and learning a pattern from it. It is therefore useful to havenot only a comparison group in the form of the Control subjects, but also in the S-Harm-M-Faith group (and S-Diss-M-Faith) that allows for an overall comparabilitywith the previous experimental findings.Table 4.6 provides examples from the training phase of the M-Diss-S-Faithgroup in Experiment 4. This pattern complies with the (unattested and formallycomplex) beyond-transvocalic variant of dissimilation and is therefore predicted bythe TSL2 hypothesis to be inaccessible to human language learners, even though thefactorial typology of ABC predicts that it is a possible pattern (see Section 3.2.1).8Table 4.6: Example of M-Diss-S-Faith training in Experiment 4.Training triplet Type and number of items…begeli…begeli-ɹu…begeli-li… 48 cvcvLv stems with [l]…domelo…domelo-li…domelo-ɹu… (Short-range faithfulness)…mopeɹe…mopeɹe-ɹu…mopeɹe-li… 48 cvcvLv stems with [ɹ]…tetoɹi…tetoɹi-li…tetoɹi-ɹu… (Short-range faithfulness)…pilede…piɹede-li…pilede-ɹu… 48 cvLvcv stems with [l]…nelogi…nelogi-ɹu…neɹogi-li… (Medium-range dissimilation)…koɹupe…koɹupe-li…kolupe-ɹu… 48 cvLvcv stems with [ɹ]…guɹoto…guloto-ɹu…guɹoto-li… (Medium-range dissimilation)4.4.3 Results and AnalysisData was first collected for 16 native English speakers in each of the four newexperimental groups, and was analyzed in much the same way as was done forExperiments 1 and 2 (see Section for a full description), using the samegroup of control subjects (who were not exposed to any stems with liquids in thetraining phase) for a baseline comparison in the mixed-effects logistic regression8Note that the Agreement by Correspondence framework only predicts the possibility of a strictlybeyond-transvocalic variant of locality for patterns of consonant disagreement, and 
there is no wayto generate a beyond-transvocalic version of consonant harmony using ABC constraints.97analyses. Note that the results presented below omit the full statistical models (seeAppendix B for full summaries) in favour of the more informative tables of oddsratios. The OR values in Table 4.7 are presented as an indication of how each of thegroups performed as a whole. Plots comparing each group to the Control group ateach testing distance are provided below in Section (for Experiment 3) andSection (for Experiment 4), and results are discussed in more detail there,especially with respect to individual subject performance.Table 4.7: Odds ratios comparing groups in Experiments 3 and 4 to Controlsubjects for choosing the target pattern (values extracted frommixed-logitmodels relevelled at each testing distance). Contexts encountered in train-ing are in boldface and all cells that reach significance are shaded.Type of test item (trigger-target distance)Short-range Medium-range Long-range(cvcvLv-Lv) (cvLvcv-Lv) (Lvcvcv-Lv)M-Harm-S-Faith vs. Control(Experiment 3)2.22(p < 0.001)2.50(p < 0.001)1.74(p ≈ 0.011)S-Harm-M-Faith vs. Control(Experiment 3)7.42(p < 0.001)1.57(p ≈ 0.042)1.50(p ≈ 0.062)M-Diss-S-Faith vs. Control(Experiment 4)1.06(p ≈ 0.744)1.79(p ≈ 0.002)0.71(p ≈ 0.069)S-Diss-M-Faith vs. Control(Experiment 4)6.90(p < 0.001)1.34(p ≈ 0.118)0.90(p ≈ 0.583) Results of Experiment 3Learning Figure 4.2 shows that both the M-Harm-S-Faith and the S-Harm-M-Faith groups in Experiment 3 learned that a pattern of liquid harmony applies at theirrespective training distance. More specifically, the left panel of the figure comparesthe proportion of harmony responses given in cvLvcv-Lv test items for the M-Harm-S-Faith and Control groups. 
The mixed logit model estimates that subjects98S-Harm-M-Faith learningat Short-range?cvcvLv-Lv test items(saw harmony)Proportion harmony responses ([l…l] or [r…r])Control S-Harm-M-Faith0.000.250.500.751.00M-Harm-S-Faith learningat Medium-range?cvLvcv-Lv test items(saw harmony)Proportion harmony responses ([l…l] or [r…r])Control M-Harm-S-Faith0.000.250.500.751.00* * Figure 4.2: Plots comparing proportions of harmony responses for Controlsubjects to those of the M-Harm-S-Faith subjects in Medium-range testitems (left panel) and to the S-Harm-M-Faith subjects in Short-range testitems (right panel). Each dot represents individual subject performance,and group means are indicated with a horizontal line. Significance isextracted from a mixed logit model and indicates learning of the patterneach group was exposed to.in the former group are 2.5 times (summary of all OR values can be found above inTable 4.7) more likely than a Control subject to choose harmony at Medium-range(p < 0.001; it is clear that this effect is driven by a subset of subjects who weremore successful than the others). This result is interesting in that these subjectswere exposed to evidence against harmony at the Short-range distance, but thiswas not enough to stop a number of participants from learning the Medium-range99dependency. The right panel of Figure 4.2 shows a comparison of the S-Harm-M-Faith group and the Control group for Short-range test items. Subjects in theS-Harm-M-Faith group also picked up on the pattern of liquid harmony that waspresented to them in Short-range contexts, and the model estimates that they are7.42 times more likely than the Control group to choose harmony in cvcvLv-Lvtest items. It also appears that this group has a greater number of successful learnersthan the M-Harm-S-Faith group. 
This result is not surprising, as the pattern that the S-Harm-M-Faith group subjects were exposed to falls within the realm of TSL2 languages, and we have already seen (in Experiment 1) that subjects are able to learn transvocalic liquid harmony, even without any information about the treatment of liquids at the Medium-range distance.

M-Harm-S-Faith Generalization. The plots in Figure 4.3 show that the M-Harm-S-Faith group, as a whole, chose test items with harmony more often than the Control group at both the Short-range and Long-range distance. Note that the subjects in this group were not exposed to any evidence that liquid harmony should be enforced at either of these two distances. Moreover, they were given explicit evidence that liquids do not alternate in Short-range stems and that all possible combinations of liquids are permitted in cvcvLv-Lv contexts. Nonetheless, the left panel of the figure shows that many subjects over-generalized the pattern, with the mixed logit model estimating that subjects in the M-Harm-S-Faith group are 2.22 times (p < 0.001) more likely to choose harmony than the Control group in Short-range test items. The fact that the group also tends to generalize harmony to Long-range contexts (estimated odds ratio of 1.74, p ≈ 0.011) is further evidence that they have interpreted their training data as a pattern with unbounded locality, even though they were provided with evidence that it was not.

S-Harm-M-Faith Generalization. Finally, Figure 4.4 shows results of the S-Harm-M-Faith group for both Medium-range and Long-range test items. With respect to the left panel of the figure, the mixed logit model estimates a small effect for the S-Harm-M-Faith group at Medium-range (OR = 1.57).
However, this effect barely reaches statistical significance (p ≈ 0.042), and the effect likely stems from a few subjects that seem to apply the pattern of liquid harmony to Medium-range contexts. Nonetheless, this is slightly surprising in light of the fact that subjects in this group were exposed to evidence in their training that liquids in the stem stay faithful in Medium-range contexts. The right panel of the figure shows that only one subject seems to apply harmony to Long-range test items, and the effect for the group as a whole does not reach significance. The results for this group, though peculiar when considering only whether or not an effect reaches statistical significance at the group level, are similar to the results for the S-Harm group in Experiment 1 (see Section). This suggests that subjects who are exposed to a pattern compatible with the (attested) transvocalic variant of consonant harmony are very likely to apply the pattern in only those transvocalic contexts, but that there is a small tendency (both within individuals and for the group as a whole) to apply the pattern at Medium-range as well.

Figure 4.3: Plots comparing proportions of harmony responses for Control subjects to those of the M-Harm-S-Faith subjects in Short-range test items (left panel) and in Long-range test items (right panel). Each dot represents individual subject performance, and group means are indicated with a horizontal line. Significance is extracted from a mixed logit model and indicates generalization of the pattern the group was exposed to.

Figure 4.4: Plots comparing proportions of harmony responses for Control subjects to those of the S-Harm-M-Faith subjects in Medium-range test items (left panel) and in Long-range test items (right panel). Each dot represents individual subject performance, and group means are indicated with a horizontal line. (Non-)Significance is extracted from a mixed logit model and indicates generalization of the pattern the group was exposed to.

Results of Experiment 4

Learning. Figure 4.5 shows that both the M-Diss-S-Faith and the S-Diss-M-Faith groups in Experiment 4 learned that a pattern of liquid dissimilation applies at their respective training distance. For the M-Diss-S-Faith group, this is a relatively small effect, with the mixed logit model estimating an odds ratio of about 1.79 (p ≈ 0.002). The left panel of the figure shows that most subjects are clustered just above a 0.50 proportion of disharmony choices in Medium-range test items, and that the effect is driven by just two or three subjects who learned dissimilation with a varying degree of success. By contrast (but as expected), most subjects in the S-Diss-M-Faith group, whose results are shown in the right panel of Figure 4.5, successfully learned a pattern of dissimilation for the cvcvLv-Lv items that they were exposed to in training (the statistical model estimates a relatively large odds ratio of 6.90, p < 0.001). The overall results for the test items that represent learning in Experiment 4 thus do not differ from the results of Experiment 3 in terms of whether or not the effects reach statistical significance at the group level (see Figure 4.2 and the accompanying discussion).
However, there does appear to be a difference in the number of individual subjects who detected the target Medium-range pattern. Out of the 16 subjects in the M-Diss-S-Faith group, only three surpass a threshold of “successful learning” (see Section 3.2.3 for a description of the binomial test used to determine whether individual subjects surpassed a 95% confidence level of having successfully learned a pattern), as compared to seven out of the 16 M-Harm-S-Faith subjects. This issue is further pursued in the discussion of individual results in Section 4.4.4 below.

M-Diss-S-Faith and S-Diss-M-Faith Do Not Generalize

As shown in Figures 4.6 and 4.7, neither of the two experimental groups shows an overall tendency to generalize a pattern of liquid dissimilation to other distances. This result was expected for the S-Diss-M-Faith group, for whom the training data was compatible with a pattern of dissimilation with transvocalic locality, but differs slightly from expectations in the case of the M-Diss-S-Faith group. Specifically, given that the corresponding group from Experiment 3 (M-Harm-S-Faith) showed evidence of applying harmony in an unbounded fashion, along with the fact that a few subjects in the M-Diss-S-Faith group did appear to have learned a Medium-range dependency, it is surprising that not a single subject’s proportion of disharmony responses distinguishes them from the Control group in either Short- or Long-range testing items (see the individual dots in Figure 4.6). The following discussion of individual results for Experiment 4 further investigates the issue of an individual subject’s ability to learn a pattern from this type of input, and presents results from an extended (yet failed) attempt to collect data for at least 12 successful learners in the M-Diss-S-Faith condition.

[Figure 4.5 panels: “M-Diss-S-Faith learning at Medium-range?” (cvLvcv-Lv test items; saw dissimilation) and “S-Diss-M-Faith learning at Short-range?” (cvcvLv-Lv test items; saw dissimilation). x-axis: Control vs. experimental group; y-axis: proportion of disharmony responses. Both comparisons are marked * (significant).]

Figure 4.5: Plots comparing proportions of disharmony responses for Control subjects to those of the M-Diss-S-Faith subjects in Medium-range test items (left panel) and to the S-Diss-M-Faith subjects in Short-range test items (right panel). Each dot represents individual subject performance, and group means are indicated with a horizontal line. Significance is extracted from a mixed logit model and indicates learning of the pattern each group was exposed to.

[Figure 4.6 panels: “M-Diss-S-Faith generalization to Short-range?” (Short-range test items; saw counterevidence) and “M-Diss-S-Faith generalization to Long-range?” (Long-range test items; saw no evidence). x-axis: Control vs. M-Diss-S-Faith; y-axis: proportion of disharmony responses. Both comparisons are marked n.s.]

Figure 4.6: Plots comparing proportions of disharmony responses for Control subjects to those of the M-Diss-S-Faith subjects in Short-range test items (left panel) and in Long-range test items (right panel). Each dot represents individual subject performance, and group means are indicated with a horizontal line. (Non-)Significance is extracted from a mixed logit model and indicates generalization of the pattern the group was exposed to.

[Figure 4.7 panels: “S-Diss-M-Faith generalization to Medium-range?” (Medium-range test items; saw counterevidence) and “S-Diss-M-Faith generalization to Long-range?” (Long-range test items; saw no evidence). x-axis: Control vs. S-Diss-M-Faith; y-axis: proportion of disharmony responses. Both comparisons are marked n.s.]

Figure 4.7: Plots comparing proportions of disharmony responses for Control subjects to those of the S-Diss-M-Faith subjects in Medium-range test items (left panel) and in Long-range test items (right panel). Each dot represents individual subject performance, and group means are indicated with a horizontal line. (Non-)Significance is extracted from a mixed logit model and indicates generalization of the pattern the group was exposed to.

4.4.4 Individual Results and General Discussion

The purpose of Experiments 3 and 4 was to investigate the possibility that adult human learners are indeed capable of learning unattested, computationally complex phonotactic patterns that are outside of the proposed hypothesis space for a human learner. As such, the present discussion focuses especially on the M-Harm-S-Faith and M-Diss-S-Faith conditions, in which the training stimuli showed evidence that liquid harmony or dissimilation holds at Medium-range distances, but not at Short-range. Note that the training data are compatible with dependencies that hold across exactly one intervening consonant (Medium-range only) or across at least one intervening consonant (Medium- and Long-range, but not Short-range), and recall that Section 4.3 demonstrated that such patterns are situated outside of the TSL2 region. The prediction for these experiments was therefore that the M-Harm-S-Faith and M-Diss-S-Faith groups would either not learn any dependency between liquids, or that they would erroneously learn an unbounded pattern that can be characterized in TSL2 terms.
In the overall group results of Experiment 3, the latter was seen for the subjects in the M-Harm-S-Faith group, who were statistically more likely than the Control group to choose harmony no matter whether the liquids were in a Short-, Medium-, or Long-range test item. Of further interest are the individual subject results for the M-Harm-S-Faith group (illustrated in Figure 4.8). Out of the seven subjects with the highest proportions of harmony responses at the Medium-range distance, which is where harmony occurred in their training data, six of them appear to have generalized the pattern into Short-range contexts in spite of the counter-evidence provided. Only a few of these subjects also generalized to Long-range test items, though this result is similar to the findings of Experiments 1 and 2 in that subjects seem reluctant to extend the alternation to consonants in word-initial position.

Recall that of the 16 subjects in the M-Diss-S-Faith group in Experiment 4, only three surpassed the defined threshold of learning a dependency. Of these, no subject generalized the pattern of dissimilation to either the Short-range or the Long-range test items. In order to boost the number of learners in this group to get a more reliable picture of subject behaviour, more data was collected in an attempt to identify a minimum of 12 successful learners in the M-Diss-S-Faith group (as was done for the alternative analyses for Experiments 1 and 2; see Section 3.2.3). However, after running a total of 40 native English speakers in this training condition, the number of learners only rose to eight.
As seen in Figure 4.9, nearly all of the subjects cluster around a 0.50 proportion of choosing dissimilation at all three testing distances. Of those eight who do successfully learn the Medium-range dissimilation, five surpass the same threshold at the Short-range distance: subjects 736, 766, 757, 752, and 732. In this sense, the additional data that was collected beyond the original 16 subjects offers much better support for the prediction that subjects who were exposed to a beyond-transvocalic pattern should over-generalize into the Short-range contexts, as was also seen in the results of the M-Harm-S-Faith group.

[Figure 4.8 plot: “Experiment 3: M-Harm-S-Faith group, individual subject results”. x-axis: test-item type (Short-range, cvcvLv-Lv; Medium-range, cvLvcv-Lv; Long-range, Lvcvcv-Lv); y-axis: proportion of harmony responses ([l…l] or [r…r]); legend: 3-digit subject codes.]

Figure 4.8: Individual results for M-Harm-S-Faith group at each of the three testing distances. Individual subjects are distinguished by colour and 3-digit code.

[Figure 4.9 plot: “Experiment 4: M-Diss-S-Faith group, individual subject results (including extras)”. x-axis: test-item type (Short-range, cvcvLv-Lv; Medium-range, cvLvcv-Lv; Long-range, Lvcvcv-Lv); y-axis: proportion of disharmony responses ([l…r] or [r…l]); legend: 3-digit subject codes.]

Figure 4.9: Individual results for M-Diss-S-Faith group at each of the three testing distances. Individual subjects are distinguished by colour and 3-digit code.

With respect to the range of patterns observed for individual subjects in Experiments 3 and 4, several subjects exhibit certain unexpected behaviours. For example, subject 521 of the M-Harm-S-Faith group (results in light purple in Figure 4.8) appears to have learned a beyond-transvocalic version of liquid harmony, a pattern that is not predicted to be possible, even within the ABC framework.
Subject 750 (in Figure 4.9, also in light purple) seems to have learned a pattern of transvocalic liquid harmony, despite having been exposed to dissimilation as a member of the M-Diss-S-Faith training group. Such peculiarities bring up a number of important points. First and foremost is that results like these are not statistically reliable, and the experiments were not designed to draw post hoc conclusions from individual subjects that deviate from overall group behaviour. Second, we do not yet know enough about what subjects are actually doing in artificial language learning tasks to assess subjects on an individual basis. It may be, for example, that certain subjects access a completely different learning module for a superficial language learning task. Nonetheless, I think it is safe to assume that some humans learn languages differently than others, and furthermore that some humans may be capable of acquiring patterns that are more complex than those typically found in natural language. This idea calls into question what it means to be a learning bias, which up to this point I have discussed as a boundary that applies for all language learners. While I do not pursue the issue any further in this dissertation, it may be that learning biases are better conceived of in probabilistic terms, such that complex phonotactic patterns (should they arise in the first place) could be acquired as-is by certain learners. However, an overall tendency for the average learner to misinterpret their input as evidence for a simpler pattern (e.g. learning an unbounded pattern from beyond-transvocalic input) makes it highly unlikely that such a pattern could persist for even one generation.
From this view, it may be better to think of the TSL2 region not as a strict learning bias per se, but as a region of patterns that are likely to be learned by everyone, and therefore be stable over time.

4.5 Summary and Conclusions

In pursuing an answer to the question of what constitutes a possible, human-learnable phonotactic pattern, it is important to establish a well-defined boundary that divides patterns into those that are predicted to be in the human learner’s hypothesis space and those that are not. The central claim of this chapter (and indeed of this dissertation more generally) is that the class of Tier-based Strictly 2-Local formal languages (as outlined in Heinz et al., 2011) offers an excellent approximation of such a boundary. I have supported this argument in a number of ways.

From a typological perspective, the cross-linguistic properties of locality and blocking can be captured in TSL2 terms, as the difference between unbounded patterns, transvocalic patterns, and patterns with blocking is easily generated by varying the contents of the tier specified by the formal grammar without the need for modification to any other parameters. Furthermore, there are many unattested patterns (including several pathologies predicted by the Agreement by Correspondence framework) whose formal properties demonstrably situate them outside of the TSL2 region. With respect to experimental evidence, the overall results of Experiments 1 through 4 indicate that subjects are indeed able to learn patterns that are compatible with a TSL2 language in terms of locality.
However, when exposed to more complex patterns (as in the case of the M-Harm-S-Faith and M-Diss-S-Faith training conditions), very few subjects are able to detect any pattern whatsoever, and those that do often show evidence of having learned a TSL2 grammar that contradicts their training data.

When considered as a whole, I take the above findings as support for the proposal that the human phonotactic learner is equipped with an analytic learning bias that restricts the hypothesis space to patterns that fall within the TSL2 region (or at least something very close to it). Furthermore, the evidence suggests that this bias manifests itself in the form of typological gaps consisting of patterns that are logically possible, but that remain both synchronically and diachronically inaccessible because they cannot be learned as TSL2 patterns.

Chapter 5

Questions About the TSL2 Approach

The previous chapter argued that the region of Tier-based Strictly 2-Local formal languages is a reasonable, if not excellent formal definition of the boundary that establishes a set of possible, human-learnable phonotactic patterns and separates them from complex patterns that are not human-learnable (and therefore impossible). This chapter scrutinizes the TSL2 proposal by asking three questions:

1. Is the TSL2 region too big?
2. Is the TSL2 region computationally learnable?
3. Is the TSL2 region too small?

Sections 5.1, 5.2, and 5.3 investigate each of the above questions in turn, taking into account evidence from typological surveys of long-distance consonant interactions, the results of previous artificial language learning studies from the literature, and investigations of the computational properties of the TSL2 class of formal languages.
Section 5.4 concludes that characterizing phonotactic patterns as TSL2 stringsets remains a viable strategy in accounting for a wide variety of empirical data.

5.1 Is the TSL2 Region Too Big?

A potential reason to be skeptical of the TSL2 approach is that, formally speaking, a tier T can be any combination of segments in Σ despite the fact that most attested long-distance phonotactic dependencies seem to hold on a tier that could be defined by a phonological feature or natural class, rather than an arbitrary set of segments. For example, in each of the grammars for the three variants of sibilant harmony presented in Table 5.1 (adapted from above), the members of T can be described as a natural class, shown in parentheses.

Table 5.1: TSL2 grammars for three attested variants of sibilant harmony

Type of pattern   Σ                 T                                 R
Unbounded         {s, ʃ, p, t, a}   {s, ʃ} (sibilants)                {*sʃ, *ʃs}
Blocking          {s, ʃ, p, t, a}   {s, ʃ, t} (coronal obstruents)    {*sʃ, *ʃs}
Transvocalic      {s, ʃ, p, t, a}   {s, ʃ, p, t} (consonants)         {*sʃ, *ʃs}

I reiterate that this is a significant deviation from the concept of a tier (or projection of segments), which has long been used in theoretical frameworks such as feature geometry or autosegmental theory (e.g. Clements, 1980, 1985; Shaw, 1991; Odden, 1994; Blevins, 2004; Clements and Hume, 1995), which have seen relative success in accounting for the range of attested patterns in terms of the classes of interacting segments. It may therefore be desirable to integrate certain aspects of phonological theory into the TSL2 account of long-distance phonotactics in order to further limit the range of possible patterns. However, determining the best way to do so remains an open problem for formal-language-theoretic approaches to phonology more generally, and is outside the scope of the present research.
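As a rough illustration of how the grammars in Table 5.1 differ only in the contents of T, the following Python sketch evaluates strings by first projecting the tier and then scanning the projection for banned 2-factors. It is a minimal sketch under stated assumptions, not an implementation taken from this dissertation: the function names and the test strings are hypothetical, and each segment is assumed to be a single character.

```python
# Minimal sketch of TSL2 evaluation: project the tier, then scan 2-factors.
# All names here are illustrative assumptions, not the thesis's own notation.

def tier_projection(word, tier):
    """Keep only the segments of `word` that belong to the tier T."""
    return [seg for seg in word if seg in tier]

def is_grammatical(word, tier, banned):
    """True if no two tier-adjacent segments form a banned 2-factor in R."""
    proj = tier_projection(word, tier)
    return all(a + b not in banned for a, b in zip(proj, proj[1:]))

BANNED = {"sʃ", "ʃs"}                     # R is the same in all three rows
TIERS = {
    "unbounded": {"s", "ʃ"},              # sibilants
    "blocking": {"s", "ʃ", "t"},          # coronal obstruents
    "transvocalic": {"s", "ʃ", "p", "t"}, # consonants
}

if __name__ == "__main__":
    # Hypothetical strings over the inventory {s, ʃ, p, t, a}
    for word in ["saʃa", "sapaʃa", "sataʃa", "sapasa"]:
        verdicts = {name: is_grammatical(word, t, BANNED)
                    for name, t in TIERS.items()}
        print(word, verdicts)
```

Under these assumptions, a disharmonic string with an intervening [t] is rejected only by the unbounded grammar, while one with an intervening [p] is rejected by both the unbounded and the blocking grammars; only the tier changes between the three cases.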
Rather than pursue such an addition to the theory in this dissertation, I instead argue that the relative flexibility of tier specification is a potential advantage of the TSL2 approach because it offers a simple account of peripheral empirical data. Specifically, this section presents patterns from two languages, Kinyarwanda (Section 5.1.1) and Latin (Section 5.1.2), each of which contains a long-distance phonotactic dependency that belongs to the TSL2 class, but whose grammar necessarily specifies T with a set of segments that cannot be described as a natural class. I then summarize the experiments of Koo and Oh (2013), who provide evidence that adult human learners are able to detect arbitrarily defined non-adjacent dependencies in CvcvC contexts (Section 5.1.3).

5.1.1 Kinyarwanda Sibilant Harmony With Blocking

As an example of a TSL2 pattern of agreement whose tier cannot be characterized as a natural class of segments, I present data from a sibilant harmony pattern found in Kinyarwanda, which is blocked by certain intervening segments (see Walker and Mpiranya, 2005; Walker et al., 2008; Hansson, 2007, 2010a). Unless otherwise noted, all cited data are from Walker and Mpiranya (2005). As shown in (1), regressive retroflexion harmony among sibilant fricatives is obligatory in …Sv(ː)S… contexts, such that no segment belonging to the set [s, z, n͡z] may precede any of the retroflex [ʂ, ʐ, ɳ͡ʐ] in a transvocalic configuration. Note that the retroflex trigger may be derived from a following /i/, as illustrated below with the agentive suffix /-i/ and the perfective suffix, which is represented by /-i-e/ (following Walker and Mpiranya, 2005). As evidence that harmony is purely anticipatory and only triggered by the series of retroflex sibilant fricatives, Walker and Mpiranya (2005) provide the form /-ʂit-i-e/, which surfaces as [-ʂise] ‘penetrated (perf.)’, rather than *[-sise] or *[-ʂiʂe].

(1) Kinyarwanda: obligatory …Sv(v)S… harmony
    a. /-sas-i/        -ʂaʂi      ‘bed maker’             *-saʂi
    b. /-soːn͡z-i/     -ʂoːɳ͡ʐi   ‘victim of famine’      *-soːɳ͡ʐi
    c. /-úzuz-i-e/     -úʐuʐe     ‘filled (perf.)’        *-úzuʐe
    d. /-sáːz-i-e/     -ʂáːʐe     ‘became old (perf.)’    *-sáaʐe

Harmony is also optional beyond the transvocalic window, as seen in (2) where harmony is permitted (though not obligatory) across intervening non-coronal consonants such as [k, g, m].

(2) Kinyarwanda: optional …S…c…S… harmony
    a. /-sákuz-i-e/     -ʂákuʐe     ‘shouted (perf.)’            (also -sákuʐe)
    b. /-zímagiz-i-e/   -ʐímagiʐe   ‘misled (perf.)’             (also -zímagiʐe)
    c. /-ásamuz-i-e/    -áʂamuʐe    ‘made open mouth (perf.)’    (also -ásamuʐe)

For expository purposes, the present discussion treats the above pattern as a categorical case of unbounded sibilant harmony. I note that optionality is not itself problematic in the TSL2 approach, since we can simply label both options as phonotactically grammatical strings. However, in cases where harmony is obligatory at one level of locality, as is the case for the Kinyarwanda data in (1), but not at another, as in (2), the distinction cannot be captured as a single TSL2 language. I consider issues relating to patterns with multiple tiers in Section 5.3, but the treatment of optionality and gradience within the TSL2 framework is left for future research.

Considering the basic generalizations in (1) and (2), it is easy to see that in a TSL2 grammar (which requires harmony at all distances), the set of tier-based 2-factor restrictions would be R = {*sʂ, *sʐ, *sɳ͡ʐ, *zʂ, *zʐ, *zɳ͡ʐ, *n͡zʂ, *n͡zʐ, *n͡zɳ͡ʐ}. However, the grammar cannot simply specify the tier as consisting of the sibilant fricatives, with T = {s, z, n͡z, ʂ, ʐ, ɳ͡ʐ}, as the data in (3)-(5) demonstrate that harmony is blocked by a number of other consonants (unfaithful alternations resulting in harmony are not even optionally permitted when a blocker intervenes between the trigger and target).

(3) Kinyarwanda: harmony blocked by non-retroflex coronal stops/nasals
    a. /-síːtaːz-i-e/     -síːtaːʐe     ‘made stub (perf.)’           (*-ʂíːtaːʐe)
    b. /-sódoːk-i-iʐe/    -sódoːkeʐe    ‘made move slowly (perf.)’    (*-ʂódoːkeʐe)
    c. /-súnuːk-i-iʐe/    -súnuːkiʐe    ‘showed furtively (perf.)’    (*-ʂúnuːkiʐe)

(4) Kinyarwanda: harmony blocked by palatals (note: /n+i/ → [ɲ])
    a. /-zújaːz-i-e/      -zújaːʐe      ‘became warm (liquid) (perf.)’    (*-ʐújaːʐe)
    b. /-zíg-an-i-iʐe/    -zígaɲiʐe     ‘economized (perf.)’              (*-ʐígaɲiʐe)

(5) Kinyarwanda: [t͡s] is not a target and blocks harmony
    a. /-t͡siːmbaɽaz-i-e/   -t͡siːmbaɽaʐe   ‘made obstinate (perf.)’    (*-ʈ͡ʂiːmbaɽaʐe)
    b. /-set͡saguz-i-e/     -set͡saguʐe     ‘made carve up (perf.)’     (*-ʂet͡saguʐe, *-ʂeʈ͡ʂaguʐe)

The data in (5) are especially interesting. (5a) shows that the non-retroflex coronal affricate [t͡s] is not targeted by harmony despite being contrastive with a retroflex counterpart [ʈ͡ʂ], and (5b) shows that [t͡s] blocks harmony when it intervenes between two sibilant fricatives. The importance of this is that it eliminates any possibility of defining the set of blockers as (non-retroflex) coronals that are not contrastive for retroflexion. At present then, since we know that blockers must also be members of the tier in a TSL2 grammar, we know that T is comprised of (at least) the following segments: {s, z, n͡z, ʂ, ʐ, ɳ͡ʐ, t, d, n, j, ɲ, t͡s}. The final data that we need to consider before inducing the correct TSL2 grammar are presented in (6) and (7). Note that (7) is cited from Walker et al. (2008).

(6) Kinyarwanda: [ɽ] is transparent to harmony, and is not a trigger
    a. /-togoseɽez-i-e/   -togoʂeɽeʐe   ‘made boil for (perf.)’         (also -togoseɽeʐe)
    b. /-seɽuz-i-e/       -ʂeɽuʐe       ‘provoked, irritated (perf.)’   (also -seɽuʐe)
    c. /-soɽ-a/           -soɽa         ‘pay tax’                       (*-ʂoɽa)
    d. /-ziɽ-a/           -ziɽa         ‘be forbidden (taboo)’          (*-ʐiɽa)

(7) Kinyarwanda: harmony is blocked by [ɳ͡ɖ]
    [βasaːɳ͡ɖaʐe] ‘they blew up’ *[βaʂaːɳ͡ɖaʐe]

In the above data, (6a) and (6b) show that the sonorant [ɽ] is transparent to the (optional) retroflexion harmony among sibilant fricatives, suggesting that [ɽ] is not in T.
Further evidence for this is that sequences of …sVɽ… and …zVɽ… are permitted in (6c) and (6d). As such, [ɽ] is not a trigger of retroflexion harmony, and does not need to be included in any tier-based 2-factors in R. By contrast, the prenasalized stop [ɳ͡ɖ], which Walker et al. (2008) found to be phonetically retroflex (cf. the transcription /n͡d/ in Walker and Mpiranya, 2005), does block retroflexion harmony, as seen above in (7), and must be included in T. The final TSL2 grammar for Kinyarwanda is given in (8).

(8) G = 〈T = {s, z, n͡z, ʂ, ʐ, ɳ͡ʐ, t, d, n, j, ɲ, t͡s, ɳ͡ɖ};
         R = {*sʂ, *sʐ, *sɳ͡ʐ, *zʂ, *zʐ, *zɳ͡ʐ, *n͡zʂ, *n͡zʐ, *n͡zɳ͡ʐ}〉

Since the segment [ɽ] is not a member of the tier specified by the grammar in (8), there is no natural class that can be used to define T. Any attempt to do so will either erroneously include [ɽ] despite its demonstrable transparency (for example, T cannot be the coronal consonants), or exclude consonants that we know to be in T (for example, T cannot be the coronal obstruents or the coronal non-continuants). In order to approximate T with a natural class, we would have to stipulate that something like ‘the non-rhotic coronal consonants’ is a natural class of segments. While I know of no version of distinctive feature theory in which this holds true, it would be precisely a pattern such as this that might motivate such a proposal. However, if arbitrary collections of segments are not a problem for the learner, then we may not need to force a class of non-rhotic coronal consonants into a theory of natural classes for the sole purpose of accounting for rare patterns. Finally, I point out alternative ways of using natural classes to describe the contents of T for Kinyarwanda.
We could, for example, either reference a union of several natural classes (such as T = {sibilants} ∪ {coronal stops} ∪ {palatal consonants}), or use some other operation over sets of segments that are natural classes, such as T = {coronal consonants} − {ɽ} (using any preferred combination of features that picks out [ɽ] as a natural class consisting of one segment).

5.1.2 Latin Liquid Dissimilation

As another example of a long-distance dependency that can be characterized with a TSL2 grammar but whose tier cannot be described as a natural class of segments, this section presents data from the well-known case of the Latin -alis ~ -aris alternation. While there has been a historical lack of consensus in the literature with respect to the precise details of the pattern (see, e.g., Watkins, 1970; Dressler, 1971; Jensen, 1974; Steriade, 1987a), I follow a more recent description of the synchronic aspects of this allomorphy given by Cser (2010; all data below are cited from this study unless otherwise noted). Cser performs a corpus study of Latin focusing on the phonotactics in Classical and Post-classical Latin, between the 1st century BC and the 4th century AD.

The commonly cited, analyzed, and counter-exemplified generalization of the pattern found in Latin is that underlying /-al/ surfaces as [-ar] (e.g. the masc.nom.sg. form [-ar-is]) when preceded by an [l], unless [r] intervenes. The data in (9) provide evidence that the underlying form of the suffix is /-al/, which surfaces faithfully when there is no [l] in the stem. The basic dissimilation is seen in (10), when the stems contain a liquid [l] with no [r] following it. The data in (11) show that dissimilation is blocked when the stem contains an l…r subsequence.

(9) Latin: default form of suffix is -alis
    a. nav-al-is      ‘naval’
    b. autumn-al-is   ‘autumn-’
    c. hiem-al-is     ‘winter-’
    d. reg-al-is      ‘royal’

(10) Latin: dissimilation triggered by preceding [l]
    a. consul-ar-is   ‘consular’
    b. popul-ar-is    ‘popular’
    c. stell-ar-is    ‘stellar’
    d. milit-ar-is    ‘military’
    e. lun-ar-is      ‘lunar’

(11) Latin: dissimilation blocked by intervening [r]
    a. flor-al-is     ‘floral’
    b. plur-al-is     ‘plural’
    c. later-al-is    ‘side-, lateral’

The above description of the pattern is characteristic of those given in much of the literature (e.g. Watkins, 1970; Dressler, 1971; Jensen, 1974; Steriade, 1987a), with some authors making note of several counterexamples (such as legalis, cf. *legaris) in which dissimilation does not occur when expected. With respect to these basic generalizations for (9)-(11), the pattern is easily described as TSL2, with G = 〈T = {l, r}; R = {*ll}〉 (closely resembling the pattern in Georgian; see Section 4.1.2 above). However, Cser’s (2010) corpus investigation reveals that these ‘exceptions’ simply follow an additional component of the pattern: dissimilation is also blocked by non-coronal consonants. This extension includes both labial and velar consonants, as shown in (12) with an example for each of [b, m, w, k, g]. (Note that all examples in this section use an orthographical representation of Latin, in which ‘v’ and ‘c’ correspond to [w] and [k], respectively.)

(12) Latin: dissimilation blocked by intervening non-coronal consonants
    a. gleb-al-is       ‘consisting of clods’
    b. fulmin-al-is     ‘projectile’
    c. pluvi-al-is      ‘rainy’
    d. umbilic-al-is    ‘umbilical’
    e. leg-al-is        ‘legal’

Aggregating all of the above data, we have evidence that each of [l, r, b, m, w, k, g] must be on the tier, and that [t, n] cannot be on the tier, since (10d) and (10e) demonstrate that they are transparent (in addition to the vowels being transparent). With respect to additional consonants such as [s, d, h, p, f], Cser (2010) cites no examples in which they occur in the necessary context.
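The aggregated evidence can be made concrete with a small Python sketch that chooses between -alis and -aris by checking the tier projection of the candidate form against the single restriction *ll. This is a hedged illustration only: the tier below contains just the segments for which the text above reports positive evidence, the stem spellings are rough one-character-per-segment transcriptions that I have assumed (with ‘c’ and ‘v’ rendered as k and w), and the function names are hypothetical.

```python
# Sketch of -alis ~ -aris selection over a tier, assuming the attested tier
# members only: [l, r] plus the blockers [b, m, w, k, g].
TIER = {"l", "r", "b", "m", "w", "k", "g"}
BANNED = {("l", "l")}  # R = {*ll}

def violates(segments):
    """True if the tier projection contains a banned 2-factor."""
    proj = [seg for seg in segments if seg in TIER]
    return any(pair in BANNED for pair in zip(proj, proj[1:]))

def suffixed_form(stem):
    """Use -alis unless it would create *ll on the tier; then use -aris."""
    candidate = stem + "alis"
    return stem + "aris" if violates(candidate) else candidate

if __name__ == "__main__":
    # Assumed transcriptions of stems from (9)-(12)
    for stem in ["naw", "konsul", "milit", "lun", "flor", "leg", "umbilik"]:
        print(stem, "->", suffixed_form(stem))
```

Because [t] and [n] are left off the tier, the stem liquid and the suffix liquid are tier-adjacent in the militaris and lunaris cases, whereas an intervening [r], [g], or [k] separates them in the floralis, legalis, and umbilicalis cases.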
However, even in a best case scenario in which homorganic consonants behave in the same way, the TSL2 grammar for the pattern found in Latin would be as shown in (13).

(13) G = 〈T = {l, r, p, b, f, m, w, k, g, (h)}; R = {*ll}〉

The above tier, which includes all labial and velar consonants but only two coronals, clearly cannot be described as a natural class of segments. I therefore take the pattern found in Latin as further evidence that such grammars should not be absent from the human learner’s hypothesis space.

5.1.3 Experimental Learning of Arbitrary Tiers

As a different type of empirical support for retaining arbitrary patterns in the learner’s hypothesis space, I summarize the experimental findings of Koo and Oh (2013), who argue that human learners are capable of detecting phonotactic regularities among sets of segments that cannot be phonologically defined (e.g. using features or natural classes).

Koo and Oh (2013), a methodologically improved replication of Koo and Callahan (2012), present the results of two artificial language learning studies in which native Korean-speaking subjects were exposed to long-distance dependencies between consonants C1 and C3 in words of the form C1VC2VC3. For each of the words, C1 was one of three possible consonants {m, tʰ, tʃ}, C2 was one of {h, k, s}, and C3 was one of {ŋ, l, p}. In their Experiment A, words of the form [mVCVl], [tʰVCVŋ], and [tʃVCVp] were legal, and all others were not (illegal words were absent in the training phase).
In their Experiment B, the dependency still held between C1 and C3, but subjects were instead exposed to a different permutation of grammatical C1, C3 pairs, consisting of words of the form [mVCVŋ], [tʰVCVp], and [tʃVCVl].

Both sets of subjects completed a testing phase in which they were asked to rate the familiarity of a novel stimulus on a scale of 1 (least familiar) to 5 (most familiar). The interesting aspect of the study is that both sets of subjects completed the exact same testing phase, such that 50% of the novel test items adhered to the phonotactic dependency of Experiment A (and were therefore illegal words in Experiment B), and 50% followed the pattern in the training phase of Experiment B (and were therefore illegal words for Experiment A). Results of their study indicate that subjects in both experiments learned the respective patterns (rating the “legal” words as more familiar than the “illegal” words), which they take as evidence against the idea that human learners can only learn dependencies among segment pairs that are adjacent on tiers defined by natural classes.

To translate Koo and Oh’s results into TSL2 terms, subjects in Experiment A learned a pattern that can be represented by the grammar in (14), while subjects in Experiment B learned a pattern that is generated by the grammar in (15).

(14) GA = 〈TA = {m, tʰ, tʃ, ŋ, l, p}; SA = {ml, tʰŋ, tʃp}〉
(15) GB = 〈TB = {m, tʰ, tʃ, ŋ, l, p}; SB = {mŋ, tʰp, tʃl}〉

Since there were no restrictions on the distribution of {h, k, s} in the C2 position for either experiment, each of those three segments must be transparent, and cannot be a member of the tier specified by the above grammars. This means that both of the learned tiers (TA = TB) must include all of {m, tʰ, tʃ, ŋ, l, p} but exclude {h, k, s} (as well as the vowels), a distinction that cannot be made using natural classes.
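To make the contrast between the two experiments concrete, the following Python sketch states each grammar in terms of permitted (rather than banned) tier-based 2-factors, taking the permitted pairs directly from the legal word shapes described for each experiment ([mVCVl], [tʰVCVŋ], [tʃVCVp] for A; [mVCVŋ], [tʰVCVp], [tʃVCVl] for B). It is an illustrative sketch only: the function name is hypothetical, words are represented as lists of segments because some segments span more than one character, and the vowels in the example items are invented rather than taken from Koo and Oh’s stimuli.

```python
# Sketch of the two Koo and Oh (2013) conditions as "positive" TSL2 grammars:
# every pair of tier-adjacent segments must be in the permitted set S.
TIER = {"m", "tʰ", "tʃ", "ŋ", "l", "p"}  # {h, k, s} and vowels are off-tier
PERMITTED_A = {("m", "l"), ("tʰ", "ŋ"), ("tʃ", "p")}
PERMITTED_B = {("m", "ŋ"), ("tʰ", "p"), ("tʃ", "l")}

def is_legal(segments, permitted, tier=TIER):
    """True if every tier-adjacent pair of segments is a permitted 2-factor."""
    proj = [seg for seg in segments if seg in tier]
    return all(pair in permitted for pair in zip(proj, proj[1:]))

if __name__ == "__main__":
    # Invented C1VC2VC3 items; only C1 and C3 are on the tier.
    items = [["m", "a", "k", "i", "l"], ["tʃ", "u", "h", "a", "l"]]
    for item in items:
        print("".join(item),
              "A:", is_legal(item, PERMITTED_A),
              "B:", is_legal(item, PERMITTED_B))
```

The same test item receives opposite verdicts under the two grammars, which mirrors the design in which both groups of subjects rated an identical set of test items.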
Finally, note that not only are the tiers arbitrarily defined in (14) and (15), but the sets of tier-based 2-factors that are permitted (or restricted) are likewise arbitrary. For most attested phonotactic patterns, the sets S or R are easily described in terms of phonological theory, including many of the patterns presented as examples throughout this dissertation (e.g. avoid 2-factors with the features [+ant][–ant], or only permit 2-factors that have different places of articulation, etc). Though there do not seem to be many (or any) natural languages that contain such an extreme example of a dependency with an arbitrarily defined set of 2-factors in S or R, the results of Koo and Oh (2013) nonetheless indicate that subjects are capable of detecting such patterns in an artificial language.

5.1.4 Conclusion: The TSL2 Region is Not Too Big

Before concluding that the class of TSL2 languages does not include too many patterns, it is worth pointing out that the patterns found in Kinyarwanda (Section 5.1.1) and Latin (Section 5.1.2), as they are described above, remained undetected for quite some time even though the data was already accessible. In the case of Kinyarwanda, recall that sibilant harmony is obligatory in transvocalic contexts but optional outside of that window. While both Kimenyi (1979) and Coupez (1980) describe this basic generalization, the optional application of the pattern obscured the fact that certain segments categorically block harmony, and it was not until later work that a complete description of the pattern was provided (Walker and Mpiranya, 2005; Walker et al., 2008). With respect to the pattern found in Latin, words such as legalis and umbilicalis were long cited as exceptions to an otherwise intuitive description of the data, and (to my knowledge) it was not until Cser’s (2010) corpus study that a further aspect of the pattern, namely that labial and velar consonants block the dependency, was detected in the data.
It is therefore conceivable, and perhaps likely, that there exist many examples of relatively arbitrary phonotactic dependencies that have not yet been discovered, precisely because of their unexpected nature. Furthermore, the proposed learning bias that restricts the range of possible languages to the TSL2 region need not be the only bias at play, and there may be other reasons for the cross-linguistic underrepresentation of arbitrary patterns. That is, even if the overwhelming majority of phonotactic patterns can be described as a restriction against a phonologically defined set of 2-factors on a tier that is a natural class, this is not necessarily reason to omit others from a model of the learner’s hypothesis space, and a resolution of the issue depends on further empirical data (e.g. results from experimental studies of human learning). Importantly, the set of patterns predicted when allowing arbitrarily defined tiers is a superset of those predicted when imposing some restriction on the potential contents of a tier. While it would be trivial to reduce the learner’s hypothesis space in accordance with some proposed set of restrictions on what can be a tier, I do not pursue this option since, as the following section demonstrates (following Jardine and Heinz, 2015), it is not necessary to do so for reasons of computational learnability.

5.2 Is the TSL2 Region Computationally Learnable?

I have now argued that the class of TSL2 languages offers a close approximation of the types of patterns found in natural language, and the patterns that humans are (not) able to learn in the laboratory. I have also shown that the relative complexity of the grammar, which requires specification of both a tier and a set of restrictions against certain 2-factors, is necessary in order to account for attested properties of locality and blocking.
Furthermore, Experiments 3 and 4 (see Section 4.4) offered support for leaving certain dependencies that do not belong to the TSL2 language class outside of the human learner’s hypothesis space (i.e. those that apply only beyond the transvocalic window), and the results of Koo and Oh (2013) were cited as evidence in favour of leaving phonologically arbitrary TSL2 patterns within the range of human learners. However, the number of possible TSL2 grammars is enormous for a segment inventory of the typical size for a natural language. This presents a significant challenge to the proposal that the entire TSL2 region could be traversed efficiently by a human language learner in order to arrive at the correct grammar for any such pattern. After expanding on the precise nature of the TSL2 learning problem, Section 5.2.1 summarizes a formal learning algorithm recently proposed by Jardine and Heinz (2015) that can provably and efficiently learn the class of TSL2 languages. Section 5.2.2 then provides an example stepwise implementation of their algorithm, as applied to a schematic case of sibilant harmony with blocking modelled after Slovenian (Jurgec, 2011; see Section 4.2.1).

To better illustrate the problems posed to a potential learner of a TSL2 language, I present a calculation of the number of possible grammars for the simplified segment inventory used in Table 4.2: Σ = {s, ʃ, p, t, a}. Since there are five segments in Σ, there are exactly 32 combinations of segments in Σ that could constitute the tier T. More generally, for some Σ there are 2^{|Σ|} possible tiers, where |Σ| is the number of segments in the inventory.
Note that of the 32 options for T in this example, one includes all five segments (T1 = {s, ʃ, p, t, a}), five include four segments (T2 = {s, ʃ, p, t}, T3 = {s, ʃ, p, a}, T4 = {s, ʃ, t, a}, T5 = {s, p, t, a}, T6 = {ʃ, p, t, a}), ten include three segments, ten include two segments, five include one segment, and the remaining logical possibility is T32 = ∅ (where ∅ denotes the empty set, with |T32| = |∅| = 0). However, since a TSL2 grammar is a two-tuple that includes specification of both T and R, each of the 32 possible tiers also has a number of possibilities for the accompanying set of 2-factor restrictions in R. For T1 = {s, ʃ, p, t, a}, there are |T1|^2 = 25 possible 2-factors (i.e. {ss, sʃ, sp, st, sa, ʃs, ʃʃ, …, ap, at, aa}), each of which may or may not be included in R, resulting in 2^25 = 33 554 432 possible grammars when T = T1. Likewise, when |T| = 4, there are 2^16 = 65 536 possible sets R, and so on (where the number of possible specifications of R, given some T, is 2^{|T|^2}). I complete the present example by adding up the number of possible grammars for each of T1 through T32, as shown in Table 5.2. The result, that the number of possible TSL2 grammars reaches nearly 34 million with an inventory of just five segments, raises the concern that it may be impossible for a learner to traverse the space of TSL2 languages efficiently in order to arrive at the correct grammar.¹

Table 5.2: Number of possible TSL2 grammars for Σ = {s, ʃ, p, t, a}

|T|    # possible T = C(|Σ|, |T|)    # possible R = 2^{|T|^2}    # possible grammars = # of T × # of R
5      1                             33 554 432                  33 554 432
4      5                             65 536                      327 680
3      10                            512                         5 120
2      10                            16                          160
1      5                             2                           10
0      1                             1                           1
                                                                 Total = 33 887 403

As a general note on the learnability of TSLk languages, I point out that for any finite value of k and a Σ with finitely many segments, the number of possible grammars may be extremely large, but is nonetheless finite.
As such, for the class of TSL2 languages, even an algorithm that picks among grammars randomly will be successful in the limit, thus achieving Gold-learnability (Gold, 1967). However, when Heinz et al. (2011) first proposed the TSL class of formal languages, they investigated a number of computational properties of the class, but did not know whether or not it was possible to learn a TSLk grammar efficiently (i.e. with polynomial bounds on time and data; de la Higuera, 1997) without prior knowledge of the segments in T. Following up on the issue, Jardine and Heinz (2015) provide an algorithm that does so (see also Jardine, 2015). The algorithm, when given sufficient data in the training set, simultaneously acquires the two components of the grammar (T and R) in an efficient manner for any TSL2 language, regardless of the size of Σ. An outline of the technical aspects of this algorithm (named the Tier-based Strictly 2-Local Inference Algorithm, or 2TSLIA, by Jardine and Heinz, 2015) is provided below in Section 5.2.1, though I point out that my summary sacrifices a certain amount of mathematical precision in favour of accessibility to a broader readership. Jardine and Heinz (2015) do not ignore these details, and the reader is referred to their article for a full description of the 2TSLIA as well as formal proofs of certain mathematical properties of the learner.

¹ To further press the issue I point out that the number of possible TSL2 grammars for a relatively modest inventory of 20 segments is on the order of 10^120. For comparison, the number of atoms in the universe is estimated to be on the order of 10^80.
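The size of the hypothesis space discussed above can be checked with a short calculation. The following is a minimal sketch, assuming only the counting argument given in the text: a tier is any subset of Σ, and R is any subset of the |T|^2 possible 2-factors over that tier. The function name count_tsl2_grammars is hypothetical and chosen for illustration.

```python
from math import comb

def count_tsl2_grammars(n):
    """Count (T, R) pairs for an inventory of n segments.

    For each tier size k there are C(n, k) tiers, and each tier
    allows 2 ** (k * k) different sets R of banned 2-factors.
    """
    return sum(comb(n, k) * 2 ** (k * k) for k in range(n + 1))

# One row per tier size, for the five-segment inventory.
for k in range(5, -1, -1):
    tiers = comb(5, k)
    restriction_sets = 2 ** (k * k)
    print(k, tiers, restriction_sets, tiers * restriction_sets)

print(count_tsl2_grammars(5))

# Order of magnitude (power of ten) for a 20-segment inventory.
print(len(str(count_tsl2_grammars(20))) - 1)
```

With 2^16 = 65 536 for the |T| = 4 row, the five-segment sum comes to 33 887 403, and the 20-segment count has 121 digits, i.e. it is on the order of 10^120.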
Section 5.2.2 shows how the 2TSLIA would learn a pattern of sibilant harmony that is blocked by coronal obstruents (similar to the case of Slovenian, which was presented in example (4) in Chapter 4), an example that illustrates the practical importance of several aspects of the learning algorithm.

5.2.1 Summary of the Tier-based Strictly Local Inference Algorithm

Of crucial importance to the 2TSLIA is the concept of a 2-path (Jardine and Heinz, 2015). A 2-path is formally defined as a 3-tuple 〈x, Z, y〉 where x and y are segments in Σ that occur in a particular word (potentially two instances of the same segment), and Z is the set of segments that intervene between x and y in the word (note that Z includes only one instance of the same intervening segment, and thus Z is a subset of Σ). For example, in the word [sapʃ], [s] precedes [ʃ] and the two segments are separated by intervening segments [a] and [p]. The corresponding 2-path is 〈s, {a, p}, ʃ〉. Likewise, the entire set of 2-paths for the word [sapʃ] includes each of the following: 〈s, {}, a〉, 〈s, {a}, p〉, 〈s, {a, p}, ʃ〉, 〈a, {}, p〉, 〈a, {p}, ʃ〉, 〈p, {}, ʃ〉. Intuitively, 2-paths can therefore be thought of as x…y precedence relations that are augmented with the set of segment interveners.

The 2TSLIA takes as its input a segment inventory with each segment labelled for some (arbitrary) order, Σ = {σ1, σ2, σ3, …, σn}, as well as a finite set I of grammatical strings (i.e. words that adhere to the phonotactic restrictions of the target language) that serves as the algorithm’s training data. The output of the learning algorithm is a TSL2 grammar defined by G = 〈T, S〉 (with a grammar in the form G = 〈T, R〉 easily obtained as well).

The algorithm begins with a hypothesized tier T0 = Σ (i.e. every segment in the inventory is on the tier) and attempts to remove segments from the tier one at a time (iteratively hypothesizing a potentially different tier in each step: T0, T1, …, Tn).
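The definition of a 2-path given above can be made concrete in a few lines. This is a minimal sketch, assuming that a word is a string in which every character is one segment; the function name two_paths is hypothetical.

```python
from itertools import combinations

def two_paths(word):
    """Return the set of 2-paths (x, Z, y) of a word.

    x precedes y in the word, and Z is the frozenset of segments
    that occur between them (each intervener counted once).
    """
    return {
        (word[i], frozenset(word[i + 1:j]), word[j])
        for i, j in combinations(range(len(word)), 2)
    }

# The six 2-paths of [sapʃ], printed in a stable order.
for x, z, y in sorted(two_paths("sapʃ"), key=lambda p: (p[0], len(p[1]))):
    print(x, sorted(z), y)
```

For [sapʃ] this yields exactly the six 2-paths listed in the text, including 〈s, {a, p}, ʃ〉.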
In order to safely remove a segment σi from T_{i−1} (such that in the next step Ti = T_{i−1} − {σi}), two conditions need to be satisfied. First, σi must be what Jardine and Heinz (2015) call a free element. The second condition is that σi is not what they call an exclusive blocker. Being a free element means that σi may freely co-occur with any segment (including itself), both preceding and following it. This amounts to stating that σi would not be present in any of the 2-factors in R if T_{i−1} (the current guess of T) is in fact the correct tier. To determine whether or not this is true in the first step, where T0 = Σ, the algorithm checks the set of 2-paths that are present in the training data. If for every segment σ′ ∈ Σ, the set of 2-paths includes both 〈σi, {}, σ′〉 and 〈σ′, {}, σi〉, then σi satisfies the first (free element) condition. Note that in subsequent iterations of the algorithm (when certain segments have been removed from the hypothesis for T), 2-paths of the form 〈σi, {Z}, σ′〉 and 〈σ′, {Z}, σi〉, where Z contains only segments that have already been removed from the tier (i.e. Z ⊆ Σ − T_{i−1}), may also provide evidence that σi is a free element.

If the free element condition is satisfied for σi, the algorithm then moves on to the second (exclusive blocker) condition. Being an exclusive blocker means that the segment is a member of the tier T, but is not present in any member of the 2-factor restrictions in R. In practice, this means that it can block a phonotactic dependency, even though it does not actively participate in the restrictions, such as a [t] that blocks sibilant harmony. More formally, having determined that σi is a free element, we also know σi is not part of any 2-factor in R. However, this is not enough evidence to safely remove σi from the tier.
That is, if σi is indeed a member of the tier T that is specified by the target grammar G, then there may be some 2-factor *σxσy ∈ R such that if σi was incorrectly removed from T, then *σxσy would erroneously be evidenced in a tier-adjacent context in the input data. Specifically, this type of case would arise if there is a grammatical word σxσiσy: removing σi from the tier may cause the learner to determine that σx or σy is a free element, even though they are actually part of a 2-factor restriction in the target grammar (and hence both must be in T). In order to determine if σi is an exclusive blocker, the algorithm searches the set of 2-paths to ensure that there is no pair of segments, σx and σy, whose tier-based adjacency is dependent on the presence of an intervening σi (see Step 4 in Section 5.2.2 for a practical illustration).

If σi satisfies both of the above conditions (i.e. it is a free element, but not an exclusive blocker), then the learner’s hypothesized tier T_{i−1} is updated to Ti with the removal of σi. If either of the conditions is not satisfied, then no change is made and Ti = T_{i−1}. The entire process is then repeated for σi+1, σi+2, …, σn, at which point the learner will have determined whether or not each of the segments in the inventory is a member of T.

Once the 2TSLIA has induced the tier T, the final step is to fill in S with the set of tier-based 2-factors that are observed in the data. To do this, the learner can simply check each of the 2-paths 〈x, Z, y〉. If none of the segments in Z are also members of T, then x and y are tier-adjacent, and the 2-factor xy is added to S.
If the desired output is G = 〈T, R〉, then R can be obtained by taking the complement of S with respect to all possible 2-factors comprised of segments in T.

5.2.2 Example of the 2TSLIA: Sibilant Harmony with Blocking

Target pattern. In the following example, I show each step of the 2TSLIA to illustrate how it would learn a case of sibilant harmony that bans *[s…ʃ] and *[ʃ…s] subsequences, unless a coronal obstruent [t] intervenes. With an inventory of Σ = {a, s, ʃ, t, p}, the target grammar is thus G = 〈T = {s, ʃ, t}, S = {ss, st, ʃʃ, ʃt, ts, tt, tʃ}〉, or equivalently G = 〈T = {s, ʃ, t}, R = {*sʃ, *ʃs}〉.

Input to 2TSLIA. The algorithm is provided with a segment inventory Σ = {a, s, ʃ, t, p}, which has each segment labelled in some (potentially arbitrary) order. In this case, let σ1 = [a], σ2 = [s], σ3 = [ʃ], σ4 = [t], and σ5 = [p]. The algorithm is also provided with a set of grammatical training words I. The types of words that need to be included (or not included) in I will be discussed throughout the example and aggregated in (16) at the end of this section as a sufficient set of training items.

Step 1: determining that [a] is not in T. The learner’s initial hypothesis is that the tier is T0 = Σ = {a, s, ʃ, t, p}. Beginning with the segment [a] (since it is labelled σ1), the learner must determine whether it meets the free element condition, and if so, whether it meets the exclusive blocker condition. As we already know that [a] is not a member of the tier specified by the target grammar, we know that it must be a free element but not an exclusive blocker. In order for the 2TSLIA to determine this as well, it proceeds as follows. First, it looks for each of the nine possible 2-paths of the form 〈a, Z, τ〉 and 〈τ, Z, a〉 where τ is any member of the current guess T0 = {a, s, ʃ, t, p}, and Z does not include any members of T0. Since T0 = Σ, it is therefore looking for evidence that [a] may occur in a string-adjacent position with all segments.
The following two members of the input strings I, which are both grammatical words in the target language, would be sufficient for it to do so: {tasapa, ʃaataʃ}. The relevant subset of 2-paths provided by [tasapa] includes 〈t, {}, a〉, 〈a, {}, s〉, 〈s, {}, a〉, 〈a, {}, p〉, 〈p, {}, a〉, and the remaining four possibilities, 〈ʃ, {}, a〉, 〈a, {}, a〉, 〈a, {}, t〉, 〈a, {}, ʃ〉, are provided as a subset of the 2-paths in [ʃaataʃ]. The segment [a] is therefore a free element, and the algorithm moves on to the next step.

In order for the algorithm to determine that [a] is not an exclusive blocker, and that it may be safely removed from the tier, the input training items must include evidence that no pair of segments requires [a] to intervene in order for them to co-occur. To state this in a more intuitive way, the set of segment pairs x, y that are observed in 2-paths of the form 〈x, {}, y〉 must not be a proper subset of the segment pairs that are observed in 〈x, {a}, y〉 2-paths (where neither x nor y is [a]). Note that this condition can easily be satisfied since all string-adjacent 2-factors except {*sʃ, *ʃs} are permitted, and I might include, for example, {assa, asta, aspa, aʃʃa, aʃta, aʃpa, atsa, atʃa, atta, atpa, apsa, apʃa, apta, appa}. Finally, since words containing *[…saʃ…] or *[…ʃas…] are prohibited, the input strings in I will not include any instances of 〈s, {a}, ʃ〉 or 〈ʃ, {a}, s〉. With this type of evidence the 2TSLIA can determine that [a] is not an exclusive blocker and that it can be safely removed from the tier for the algorithm’s next hypothesis.

Steps 2 and 3: [s] and [ʃ] are in T. After removing [a] from the tier, the algorithm’s next step is to hypothesize a tier T1 = {s, ʃ, t, p}, and to check whether σ2 = [s] should also be removed. It is easy to see, however, that [s] is not a free element in the language.
Since the language prohibits any *[…s(a)ʃ…] or *[…ʃ(a)s…] sequences, the training set I will contain no instances of the 2-paths 〈s, {}, ʃ〉, 〈s, {a}, ʃ〉 or 〈ʃ, {}, s〉, 〈ʃ, {a}, s〉, which would be required in order for [s] to satisfy the free element condition. Since [s] is not a free element, the learner does not need to check the exclusive blocker condition, and instead moves directly to the next hypothesis: T2 = {s, ʃ, t, p}. The same result will hold during the 2TSLIA’s subsequent iteration for σ3 = [ʃ], and thus both [s] and [ʃ] remain in the learner’s guess of T. (Note that no changes to the hypothesized tier were made in these steps, and T1 = T2 = T3 = {s, ʃ, t, p}.)

Step 4: [t] is an exclusive blocker. With T3 = {s, ʃ, t, p}, the learner moves on to check the two conditions that determine whether or not σ4 = [t] is in T. Since [t] is not present in any member of R, we know that it is a free element and so it will satisfy the first condition. The evidence that is necessary for the algorithm to conclude this is available from the members of I already provided in a previous step: {asta, aʃta, atsa, atʃa, atta, atpa, apta}. This set of training items results in each of the nine 2-paths that are required for [t] to be a free element: 〈a, {}, t〉, 〈t, {}, a〉, 〈s, {}, t〉, 〈ʃ, {}, t〉, 〈t, {}, s〉, 〈t, {}, ʃ〉, 〈t, {}, t〉, 〈t, {}, p〉, 〈p, {}, t〉. (In fact, the algorithm does not technically need to search for the first two members of this list of 2-paths, as [a] was already removed from the tier after Step 1.)

Since [t] satisfies the free element condition, the 2TSLIA also checks the second condition, which in this case will determine that [t] is an exclusive blocker. The reason for this is as follows.
We know that the grammatical words in I will not contain any instances of the 2-paths 〈s, {}, ʃ〉 or 〈ʃ, {}, s〉, 〈s, {a}, ʃ〉 or 〈ʃ, {a}, s〉. However, since the pattern of sibilant harmony is blocked by intervening coronal obstruents, if there is even one instance of a word such as [astʃa] or [ʃatasa], which include the 2-paths 〈s, {t}, ʃ〉 and 〈ʃ, {a, t}, s〉, the algorithm will conclude that [t] is an exclusive blocker and must be a member of the tier T. Specifically, this is because the presence of a 〈ʃ, {Z}, s〉 2-path, where Z is a subset of {a, t}, is dependent on the presence of [t] in Z.

Step 5: [p] is not in T. The hypothesized tier is now T4 = {s, ʃ, t, p} (this has not changed since the end of Step 1), and the last of the five segments in the inventory to be checked is σ5 = [p]. It is clear that [p] is a free element, as the example words in I provided above include [p] in a string-adjacent context with all members of the inventory, including itself. Unlike [t], however, [p] is not an exclusive blocker since any words of the form *[aspʃa] or *[ʃapasa] would violate the phonotactics of the TSL2 language, and therefore could not be provided in I. Since [p] is a free element but not an exclusive blocker, the algorithm may safely remove it from the tier, arriving at its final (correct) hypothesis that T5 = T = {s, ʃ, t}.

Step 6: 2-factors in S (or R). After running through each σi ∈ Σ, and having discovered the correct tier T that is specified by the grammar, the last step that the 2TSLIA needs to complete is to compile a list of all tier-based 2-factors that are observed in I. To achieve this, the algorithm simply records all pairs τ1, τ2 where τ1, τ2 ∈ T that are observed in 2-paths of the form 〈τ1, Z, τ2〉, where Z does not include any members of T. This will result in the correct specification of S = {ss, st, ʃʃ, ʃt, ts, tt, tʃ}, which completes the learning of the target TSL2 grammar for this example.
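The six steps above can be traced mechanically. The following is a simplified sketch that follows the informal summary of the 2TSLIA given in this chapter, not the formal definition in Jardine and Heinz (2015); the names two_paths and learn_tsl2, and the exact way the two conditions are encoded, are assumptions made for illustration. It is run on the training words that the example relies on.

```python
from itertools import combinations, product

def two_paths(word):
    """All 2-paths (x, Z, y): x precedes y, Z = set of interveners."""
    return {
        (word[i], frozenset(word[i + 1:j]), word[j])
        for i, j in combinations(range(len(word)), 2)
    }

def learn_tsl2(sigma, words):
    """Induce (T, S, R) from an ordered inventory and grammatical words."""
    paths = set().union(*(two_paths(w) for w in words))
    tier = set(sigma)                      # T0 = Σ
    for seg in sigma:
        off_tier = set(sigma) - tier
        # Pairs attested with nothing but off-tier segments between them.
        adjacent = {(x, y) for x, z, y in paths if z.issubset(off_tier)}
        # Free element: seg co-occurs both ways with every tier member.
        if not all((seg, t) in adjacent and (t, seg) in adjacent for t in tier):
            continue
        # Pairs of other tier members attested across an intervening seg.
        others = tier - {seg}
        across = {
            (x, y) for x, z, y in paths
            if seg in z and z.issubset(off_tier | {seg})
            and x in others and y in others
        }
        # Exclusive blocker: some pair is only ever attested across seg.
        if across.issubset(adjacent):
            tier.discard(seg)
    off_tier = set(sigma) - tier
    permitted = {
        x + y for x, z, y in paths
        if x in tier and y in tier and z.issubset(off_tier)
    }
    banned = {x + y for x, y in product(tier, repeat=2)} - permitted
    return tier, permitted, banned

SIGMA = ["a", "s", "ʃ", "t", "p"]          # σ1 … σ5 as in the example
TRAINING = [
    "tasapa", "ʃaataʃ", "assa", "asta", "aspa", "aʃʃa", "aʃta", "aʃpa",
    "atsa", "atʃa", "atta", "atpa", "apsa", "apʃa", "apta", "appa",
    "astʃa", "ʃatasa",
]

T, S, R = learn_tsl2(SIGMA, TRAINING)
print(sorted(T), sorted(S), sorted(R))
```

Under these assumptions the sketch removes [a] and [p], keeps [s], [ʃ] and the blocker [t], and returns the seven permitted 2-factors of the target grammar.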
If the preferred format of the grammar is in terms of the tier-based 2-factor restrictions (i.e. the 2-factors in R), we can simply take the complement of S with respect to the set of all possible 2-factors comprised of segments in T. In the present case, this yields R = {*sʃ, *ʃs}.

Representative list of training words. The set of words used in the above example of the 2TSLIA (Jardine and Heinz, 2015) was as follows:

(16) I = {tasapa, ʃaataʃ, assa, asta, aspa, aʃʃa, aʃta, aʃpa, atsa, atʃa, atta, atpa, apsa, apʃa, apta, appa, astʃa, ʃatasa}

Crucially, the training words must include a set of 2-paths that provides sufficient evidence about whether each segment is a free element and whether it is an exclusive blocker. That is, if a particular 2-path is absent from the training data, the algorithm runs the risk of leaving a segment on the hypothesized tier either because it induces that it is not a free element (when there are in fact no relevant phonotactic restrictions on its occurrence) or that it is an exclusive blocker (even if it does not actually block any dependencies).

5.2.3 Conclusion: The TSL2 Class Is Learnable

When Heinz et al. (2011) proposed the class of Tier-based Strictly 2-Local formal languages, they did not know whether or not it would be possible to design an algorithm that could efficiently learn the entire class without prior knowledge of the contents of T. The 2TSLIA designed by Jardine and Heinz (2015), which is provably efficient and correct, thus provides a significant increase in the plausibility of a model that equates the human learner’s hypothesis space to the class of TSL2 stringsets, as the algorithm offers a computationally tractable solution to the problem of TSL2 learnability.

5.3 Is the TSL2 Region Too Small?

As a final question about the empirical validity of the TSL2 approach, I consider whether or not there are any patterns beyond the proposed TSL2 boundary that should also be included in the set of possible, human-learnable languages.
In what follows, I describe a number of languages whose overall (consonant) phonotactics cannot be generated with a single TSL2 grammar. However, each of the cases presented is clearly composed of multiple, and in some cases interacting, TSL2 dependencies that can be characterized individually before combining them. I argue that the theory needs to be expanded in order to accommodate such systems in a mathematically principled fashion, and I discuss a few preliminary suggestions for how we might do so.

5.3.1 Multiple Non-Conflicting TSL2 Patterns

The first type of phonotactic system to consider is one whose grammar must specify two different sets of restrictions (R1 and R2) indexed for two different tiers (T1 and T2). To illustrate the problem, I present data from two different Berber languages: Tamashek Tuareg (Heath, 2005; Hansson, 2010a; Bennett, 2013) and Imdlawn Tashlhiyt (Elmedlaoui, 1995; Hansson, 2010a,b). Each of the patterns highlights a number of interesting issues that must be taken into account as we extend the theory to account for more complex phonotactic patterns.

5.3.1.1 Tamashek Tuareg: Two Long-Distance Dependencies

The Tamashek dialect of Tuareg (Berber; Heath, 2005) has a pattern of sibilant harmony similar to what is found in many other Berber languages. The basic generalization, illustrated in (17) with alternations of the causative prefix /s(ː)-/, is that sibilants must agree in both anteriority and voicing. (17) shows that the prefix surfaces faithfully as [s-] when the root contains no sibilants, or when the only sibilant in the root is [s]. (18) demonstrates anticipatory harmony, in that the prefix is required to agree in both anteriority and voicing with any other sibilant [ʃ, z, ʒ] that is present in the root.
Note that the surface form of the vowels in Tamashek Tuareg varies considerably by context, and the below examples of causative verbs simply use ‘V’ to represent any vowel if a full surface form is not provided in Heath (2005) (note that Heath further distinguishes between ‘short’ and ‘full’ vowels).

(17) Tamashek Tuareg: underlying causative prefix /s-/ (Heath, 2005)
        Causative        Gloss
    a.  -s-VdufV-        ‘make plump’
    b.  -s-VŋŋV-         ‘cook’
    c.  -s-VsVfVr-       ‘treat (patient)’
    d.  -s-VskVr-        ‘hold upright’

(18) Tamashek Tuareg: unbounded sibilant harmony (Heath, 2005)
        Causative        Gloss
    a.  -ʃ-VlVjtVʃ-      ‘shake off’
    b.  ʃ-ùkməʃ          ‘make scratch!’
    c.  -z-VgzVl-        ‘shorten’
    d.  zˤ-ìhəzˤ         ‘make approach!’
    e.  -ʒ-VʒVlwVʁ-      ‘glare at’

The above pattern of sibilant harmony is easily represented with the following TSL2 grammar: G1 = 〈T1 = {s, ʃ, z, ʒ}, R1 = {*sʃ, *sz, *sʒ}〉.²

² G1 is slightly simplified. Heath (2005) points out that the restriction on co-occurring sibilants seems to hold more generally, even morpheme-internally, and so a full set of 2-factor restrictions would be R1 = {*sʃ, *sz, *sʒ, *ʃs, *ʃz, *ʃʒ, *zs, *zʃ, *zʒ, *ʒs, *ʒʃ, *ʒz}.

Interestingly, however, the sibilant harmony co-exists with an independent pattern of long-distance labial dissimilation. This is exemplified below with /m/→[n] alternations in several types of prefix, including the medio-passive and the agentive, when they are followed by a labial [b, f, m] at any distance.

(19) Tamashek Tuareg: underlying prefixes containing /m/ (Heath, 2005)
    a.  æ-m-ɑ́jrɑd       ‘one who can disappear’ (agentive)
    b.  -æ̀m-erɑ-         ‘be opened (Perf)’ (medio-passive)

(20) Tamashek Tuareg: long-distance labial dissimilation (Heath, 2005)
    a.  -ə̀nː-əbdˤɑ-      ‘be dislocated’       *-ə̀mː-əbdˤɑ-
    b.  ɑ-n-ə̀frən        ‘be chosen’           *ɑ-m-ə̀frən
    c.  ɑ-n-ɑ́nɑm         ‘one who is fond’     *ɑ-m-ɑ́nɑm

The pattern of long-distance labial dissimilation can likewise be generated with a TSL2 grammar: G2 = 〈T2 = {b, f, m}, R2 = {*mb, *mf, *mm}〉.

As it stands, Tamashek Tuareg exhibits two independent long-distance dependencies that can individually be characterized as TSL2 patterns. However, since the two grammars specify two completely different tiers, we must ask whether or not they can be combined into a single grammar that requires just one tier with a single set of restrictions. One way of attempting to achieve this might be to take the union of the tiers (T1 ∪ T2), and the union of the sets of 2-factor restrictions (R1 ∪ R2), resulting in G3 = 〈T3 = {s, ʃ, z, ʒ, b, f, m}, R3 = {*sʃ, *sz, *sʒ, *mb, *mf, *mm}〉. This strategy results in the incorrect prediction that the set of labial consonants should block sibilant harmony, and that the set of sibilants should block labial dissimilation. For example, since both of the 2-factors [sm] and [mʃ] are permitted on T3, we would erroneously allow a word such as *[s-ùkmɑʃ] (cf. [ʃ-ùkməʃ] in (18b)).

As an alternative, I point out that in cases where patterns can individually be described in TSL2 terms with no overlap in the members of each tier (and hence no overlap in the tier-based 2-factor restrictions), a single grammar can be generated as the conjunction of each of the TSL2 grammars. That is, a string is a member of the language if and only if it is permitted by each of the conjoined grammars, adhering to all of the individual patterns.
This is done below with the grammar in (21).

(21) G = 〈T1 = {s, ʃ, z, ʒ}, R1 = {*sʃ, *sz, *sʒ}〉 ∧ 〈T2 = {b, f, m}, R2 = {*mb, *mf, *mm}〉

As a useful test case for the above grammar, Heath (2005) does provide one example of the two patterns being upheld in a single word, which is given in (22).

(22) Simultaneous sibilant harmony and labial dissimilation (Heath, 2005)
    ɑ-zˤ-ənː-ət-ə́lməzˤ   ‘act of spitting up saliva’   *ɑ-sˤ-əmː-ət-ə́lməzˤ

5.3.1.2 Imdlawn Tashlhiyt: Sibilant Harmony With Partial Blocking

The Imdlawn dialect of Tashlhiyt (Berber; Elmedlaoui, 1995; Hansson, 2010a,b) has a pattern of sibilant harmony nearly identical to what was illustrated above in (17) and (18) for Tamashek Tuareg. The basic generalization, again demonstrated with alternations of the causative prefix /s(ː)-/, is that sibilants must agree in both anteriority and voicing. (23) shows that the prefix surfaces as [s-] when no sibilants follow it, or when the only sibilant in the root is [s]. In (24), the prefix agrees in both anteriority and voicing with any other sibilant [ʃ, z, ʒ] that is present in the root.

(23) Imdlawn Tashlhiyt: underlying causative prefix /s-/ (Elmedlaoui, 1995; Hansson, 2010a,b)
        Base         Causative       Gloss
    a.  gdʷm         s-gdʷm          ‘arrange upside down’
    b.  uga          sː-uga          ‘be evacuated’
    c.  nsa          sː-nsa          ‘spend the night’
    d.  asːtwa       s-asːtwa        ‘settle, be levelled’

(24) Imdlawn Tashlhiyt: sibilant harmony (Elmedlaoui, 1995; Hansson, 2010a,b)
        Base         Causative       Gloss
    a.  fiaʃr        ʃ-fiaʃr         ‘be full of straw, of discord’
    b.  bːukʃːa      ʃ-bukʃːa        ‘be full to overflowing’
    c.  bruzːa       z-bruzːa        ‘crumble’
    d.  nza          zː-nza          ‘be sold’
    e.  mːʒdawl      ʒ-mːʒdawl       ‘stumble’
    f.  gˤrˤuˤʒˤːmˤ  ʒˤ-gˤrˤuˤʒˤːmˤ  ‘be extinguished (in cooking)’

The phonotactics of the basic form of Imdlawn Tashlhiyt sibilant harmony shown in (23) and (24) are easily captured as a TSL2 language, using a formal grammar that is typical of a pattern with unbounded locality: the tier T contains exactly the set of segments that actively participate in the dependency, namely the sibilants in this case, and the set of permitted 2-factors on the tier is S = {ss, ʃʃ, zz, ʒʒ} (with R containing all other permutations of two sibilants). This would be enough to characterize the pattern of sibilant harmony as it is found in other varieties of Berber, but the case of Imdlawn Tashlhiyt is peculiar in that the requirement for agreement in voicing (but not anteriority) is blocked by intervening voiceless obstruents. This is shown below in (25), with examples for each of the voiceless obstruents [ħ, k, f, χ, q] blocking voicing harmony, and with (25e)-(25f) demonstrating that agreement for anteriority is still enforced across a blocker.

(25) Imdlawn Tashlhiyt: sibilant voicing harmony blocked (Elmedlaoui, 1995; Hansson, 2010a,b)
        Base           Causative        Gloss
    a.  ħuz            s-ħuz            ‘annex’
    b.  ukz            sː-ukz           ‘recognize’
    c.  rˤuˤfˤːzˤ      sˤ-rˤuˤfˤzˤ      ‘appear resistant, recalcitrant’
    d.  m-χazaj        smχazaj          ‘loathe each other’
    e.  qːuʒːi         ʃ-quʒːi          ‘be dislocated, broken’
    f.  mˤ-ħˤaˤrˤaˤʒˤ  ʃˤ-mˤ-ħˤaˤrˤaˤʒˤ  ‘get angry with each other’

In contrast to the similar, previously described cases of sibilant harmony with blocking that occur in Slovenian (see Section 4.2.1) and in Kinyarwanda (see Section 5.1.1), the pattern in Imdlawn Tashlhiyt cannot be captured using a TSL2 grammar whose tier includes all of the sibilants and all of the blockers. With respect to the above data, this would yield T1 = {s, ʃ, z, ʒ, ħ, k, f, χ, q}, and the observed set of 2-factors on this tier is S1 = {ss, ʃʃ, zz, ʒʒ, sħ, ħz, sk, kz, sf, fz, sχ, χz, ʃq, qʒ, ʃħ, ħʒ}.
However, this cannot be the correct grammar for Imdlawn Tashlhiyt: the presence of both {sħ, ħʒ} on the relevant tier falsely implies the grammaticality of a word such as *[sˤ-mˤ-ħˤaˤrˤaˤʒˤ] (cf. the correct form in (25f) [ʃˤ-mˤ-ħˤaˤrˤaˤʒˤ]). However, if we leave out the voiceless obstruents, with T2 = {s, ʃ, z, ʒ}, the data above exhibit each of the tier-based 2-factors in S2 = {ss, ʃʃ, zz, ʒʒ, sz, ʃʒ}, misidentifying *[s-bruzːa] and *[ʃ-mːʒdawl] as permissible strings (cf. (24c) and (24e) above).

Instead we can characterize the above phonotactics as a combination of two patterns of agreement (one for anteriority, one for voicing) using a conjunction of TSL2 languages, as was done for Tamashek Tuareg in the previous section. The difference for the present case is that here the two patterns operate on tiers that partially overlap. One applies on a tier that is made up of all sibilants and the voiceless obstruents, T1 = {s, ʃ, z, ʒ, ħ, k, f, χ, q}, banning 2-factors whose two members are sibilants that disagree in voicing, R1 = {*sz, *sʒ, *ʃz, *ʃʒ, *zs, *zʃ, *ʒs, *ʒʃ}. The second pattern is enforced on the tier of sibilants, T2 = {s, ʃ, z, ʒ}, banning any 2-factors of sibilants that disagree for anteriority, R2 = {*sʃ, *sʒ, *ʃz, *ʃs, *zʃ, *zʒ, *ʒs, *ʒz}. A grammar that conjoins these two patterns would be sufficient to characterize the phonotactics of Imdlawn Tashlhiyt as presented above, and is given below in (26).

(26) G = 〈T1 = {s, ʃ, z, ʒ, ħ, k, f, χ, q}, R1 = {*sz, *sʒ, *ʃz, *ʃʒ, *zs, *zʃ, *ʒs, *ʒʃ}〉 ∧ 〈T2 = {s, ʃ, z, ʒ}, R2 = {*sʃ, *sʒ, *ʃz, *ʃs, *zʃ, *zʒ, *ʒs, *ʒz}〉

Lastly, there are a couple of interesting notes about the grammar in (26).
First is that T1 is another example of a tier that cannot be classified as a natural class, since it includes all sibilants (both voiced and voiceless), but only the voiceless obstruents (voiced obstruents are not blockers), and so this pattern serves as further support for not restricting the set of tiers that a learner may consider to natural classes (see Section 5.1 for more examples). Second, note not only that T2 is a subset of T1 but also that several 2-factors occur both in R1 and R2, namely each of {*sʒ, *ʃz, *zʃ, *ʒs}. This redundancy does not affect the success of the grammar, but it does leave room for a more efficient grammar. Specifically, the grammar needs only to specify these 2-factor restrictions for the sibilant tier (i.e. T2 in this case), since the absence of a 2-factor on a tier implies its absence on any superset of that tier. We could therefore reduce R1 to {*sz, *ʃʒ, *zs, *ʒʃ}, since the banning of {*sʒ, *ʃz, *zʃ, *ʒs} is enforced on T2 (implying that they may not occur on T1 either). Further issues concerning the overlap of the formal properties of co-existing patterns are certainly of broader interest, but are left for future research.

5.3.2 Multiple Conflicting TSL2 Dependencies

With the types of patterns like those of Tamashek Tuareg and Imdlawn Tashlhiyt as motivation for further study of multiple TSL2 patterns, McMullin and Allen (2015) offered a preliminary investigation of the computational properties of TSL2 conjunctions, arguing that they form a lattice class of languages (Heinz et al., 2012), and are therefore Gold-learnable (Gold, 1967). However, there also exist several attested patterns that cannot be described as a conjunction of TSL2 patterns, since there are multiple individual dependencies between consonants (each characterizable as TSL2) that are in direct conflict with each other.
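Before turning to conflicting dependencies, the conjunction of TSL2 grammars used in (21) and (26) can be illustrated with a short sketch. It assumes that a grammar is a pair of a tier and a set of banned 2-factors, that a word is evaluated by deleting every character that is not on the tier, and that each tier segment is a single character; hyphens and secondary articulation marks such as [ˤ] are simply treated as off-tier material. The function names are hypothetical, vowel diacritics are omitted from the Tamashek forms, and the form "s-quʒːi" is a constructed ill-formed counterpart to (25e), not an attested example.

```python
def tier_factors(word, tier):
    """2-factors of the projection of word onto tier."""
    projection = [c for c in word if c in tier]
    return {a + b for a, b in zip(projection, projection[1:])}

def permits(grammar, word):
    """True if no banned 2-factor occurs on the grammar's tier."""
    tier, banned = grammar
    return not (tier_factors(word, tier) & banned)

def permits_all(grammars, word):
    """Conjunction: the word must satisfy every grammar."""
    return all(permits(g, word) for g in grammars)

# Tamashek Tuareg, as in (21), and the single-tier union grammar G3.
G1 = (set("sʃzʒ"), {"sʃ", "sz", "sʒ"})
G2 = (set("bfm"), {"mb", "mf", "mm"})
G3 = (G1[0] | G2[0], G1[1] | G2[1])

print(permits(G3, "s-ukmɑʃ"))             # union grammar: [m] wrongly blocks
print(permits_all([G1, G2], "s-ukmɑʃ"))   # conjunction rejects the form
print(permits_all([G1, G2], "ʃ-ukməʃ"))   # harmonized form is accepted

# Imdlawn Tashlhiyt, as in (26): voicing tier H1 and anteriority tier H2.
H1 = (set("sʃzʒħkfχq"), {"sz", "sʒ", "ʃz", "ʃʒ", "zs", "zʃ", "ʒs", "ʒʃ"})
H2 = (set("sʃzʒ"), {"sʃ", "sʒ", "ʃz", "ʃs", "zʃ", "zʒ", "ʒs", "ʒz"})

for form in ["s-ħuz", "s-bruzːa", "ʃ-quʒːi", "s-quʒːi"]:
    print(form, permits_all([H1, H2], form))
```

The design choice here is that each conjunct is evaluated on its own tier projection, so a segment that blocks on one tier has no effect on the other.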
Revisiting Tamashek TuaregAs a first example, I expand on the phonotactic generalizations of Tamashek Tuaregthat were presented above in Section Recall that the language includes twoseparate long-distance dependencies with unbounded locality: sibilant harmonyand labial dissimilation. The present discussion is relevant only for the latter ofthese patterns, for which the data are reproduced below in (27) and (28), in whicha prefix containing /m/ surfaces with [n] when a labial consonant [b, f, m] followsit at any distance. All data are cited from (Heath, 2005).137(27) Tamashek Tuareg: underlying prefixes containing /m/a. æ-m-ɑ́jrɑd ‘one who can disappear’ (agentive)b. -æ̀m-erɑ- ‘be opened (perf.)’ (medio-passive)(28) Tamashek Tuareg: long-distance labial dissimilationa. -ə̀nː-əbdˤɑ- ‘be dislocated’ *-ə̀mː-əbdˤɑ-b. ɑ-n-ə̀frən ‘be chosen’ *ɑ-m-ə̀frənc. ɑ-n-ɑ́nɑm ‘one who is fond’ *ɑ-m-ɑ́nɑmA TSL2 grammar for the pattern of labial dissimilation is provided in (29), whichaccounts for the generalization that prefixes containing /m/ do not surface faithfullywhen they precede a labial consonant at any distance.(29) G = 〈T = {b, f, m};R = {*mb, *mf, *mm}〉However, there is also evidence that the same prefixes are permitted to surfacefaithfully if the resulting [m] precedes an adjacent [b], as shown in (30).(30) Tamashek Tuareg: adjacent nasal place assimilation resulting in [mb]a. -æ̀m-bæbbɑ- ‘carried each other (perf.)’ *-æ̀n-bæbbɑ-b. -æm-bə̀lədˤwəj- ‘fell over (perf.)’ *-æn-bə̀lədˤwəj-The fact that [mb] sequences are permitted directly contradicts the grammar in (29).A general property of TSL2 languages is that if Tx ⊆ Ty and a 2-factor *σ1σ2is banned on Tx, then *σ1σ2 is also banned on Ty. Intuitively, this translates tothe idea that an unbounded dependency should also hold in transvocalic contexts,string-adjacent contexts, and so on. However, the data above show that there is arestriction against *[mb] on T = {b, f, m}, even though [mb] is permittedwhen T =Σ (i.e. 
when they are adjacent on a tier that includes all segments). Since {b, f, m} is a subset of the segment inventory Σ, it is not possible to achieve independent TSL2 characterizations of each pattern without contradicting the other.

The resulting problem, that one pattern is violated in favour of another, is not new to phonologists. From a formal language theoretic perspective, an ideal solution might be to define a new mathematical operator (i.e. in addition to conjunction, disjunction, etc.) that would allow for this type of combination of individual formal grammars. From the perspective of theoretical phonology, there are already (at least) two well-known strategies for solving the problem: rule ordering and constraint ranking (or weighting). I suggest that the best solution is to circle back to a constraint-based approach (e.g. OT; Prince and Smolensky, 2004), representing each constraint against the co-occurrence (adjacent or non-adjacent) of two segments as an individual, violable and ranked TSL2 grammar. An example of this is shown below in Tableaux (31) and (32), each of which compares two possible surface strings with [m/n] preceding [b] at some distance.

(31) Tamashek Tuareg: non-adjacent disagreement

                         〈T = Σ; R = {*nb}〉    〈T = {b, f, m}; R = {*mb, *mf, *mm}〉
    a.    -ə̀mː-əbdˤɑ-                            ∗!
    b. ☞  -ə̀nː-əbdˤɑ-

(32) Tamashek Tuareg: adjacent agreement

                         〈T = Σ; R = {*nb}〉    〈T = {b, f, m}; R = {*mb, *mf, *mm}〉
    a. ☞  -æ̀m-bæbbɑ-                             ∗
    b.    -æ̀n-bæbbɑ-     ∗!

Note that (31) and (32) are meant to provide a visualization of how formal grammars can be thought of as independent markedness constraints in an Optimality Theory framework. As I continue to limit the scope of my dissertation to phonotactics in particular, the tableaux do not include any input forms or faithfulness constraints; the intention is to illustrate the relative preference between two possible output forms.
Other, potentially optimal output candidates that do not violate any markedness constraints are presumed to be ruled out for independent reasons. Another example of a pattern that requires a similar approach is provided in the next section, prior to continuing the discussion of TSL2 constraints.

Samala Sibilant Harmony Overrides Palatalization

Though it may seem intuitive that a restriction against two adjacent segments trumps a dependency that holds at longer distances, this is not always the case. As empirical evidence in support of this, I present data from Samala (Ineseño Chumash; Applegate, 1972), in which a regressive sibilant harmony with unbounded locality overrides a restriction against string-adjacent {*st, *sn, *sl} that results in a pattern of dissimilation (in which, e.g., /st/ surfaces as [ʃt]).

This is a well-known and often-cited case in the theoretical literature on long-distance interactions in phonology, starting with Poser (1982). It is important to note that in that body of theoretical work, the descriptive generalization has typically been understood as being the exact opposite, with local dissimilation (palatalization before /t, n, l/) overriding the non-local sibilant harmony (Poser, 1982, 1993; McCarthy, 2007; Hansson, 2010a). However, closer scrutiny of the primary descriptive source, Applegate’s (1972) grammar, strongly suggests that he intends to describe the interaction of these patterns as it is presented below.³ All data in this section are drawn directly from Applegate (1972).

The data in (33) illustrate the pattern of anticipatory sibilant harmony, in which sibilants are required to agree in anteriority with any sibilant that follows them. Note that both [–ant] and [+ant] segments can be triggers or targets, and that the dependency holds across relatively large distances.

(33) Samala: unbounded sibilant harmony
    a. /k-su-ʃojin/    kʃuʃojin    ‘I darken it’
    b. /s-api-tʃʰo-us/    sapitsʰolus    ‘he has a stroke of good luck’
    c. /s-api-tʃʰo-us-waʃ/    ʃapitʃʰoluʃwaʃ    ‘he had a stroke of good luck’
    d. /k-su-k’ili-mekeken-ʃ/    kʃuk’ilimekeketʃ    ‘I straighten myself up’

³ Evidence for this includes Applegate’s proposed relative ordering of the two phonological rules, the inclusion of both [st] and [sn] in his set of permitted word-medial consonant clusters, several data points included in the text, and an explicit statement that palatalization is reversed by the subsequent process of sibilant harmony (Applegate, 1972, p. 120). It is also apparent that the misinterpretation of Applegate’s description arises from a small list of exceptional words in which the local dissimilation occurs despite the presence of another [s] later in the word. I extend thanks to Jeff Heinz and Bill Idsardi (for further support for the description of the pattern provided in this section, see Heinz and Idsardi, 2010) for drawing my attention to the correct generalization of the pattern and suggesting that I look more closely at the data as presented in Applegate (1972).

The data in (34a)-(34c) demonstrate the local restriction against *[st, sn, sl], in which the prefix /s-/ surfaces as [ʃ] when it immediately precedes an alveolar consonant [t, n, l]. (34d) shows that the resulting [ʃ] may also serve as a trigger for sibilant harmony.

(34) Samala: /s/→[ʃ] when preceding (adjacent) [t, n, l]
    a. /s-tepuʔ/    ʃtepuʔ    ‘he gambles’
    b. /s-niʔ/    ʃniʔ    ‘his neck’
    c. /s-lokʼin/    ʃlokʼin    ‘he cuts it’
    d. /s-is-tɨʔ/    ʃiʃtɨʔ    ‘he finds it’

Finally, (35) shows that when both patterns cannot be upheld simultaneously, long-distance agreement is given priority over local disagreement. In (35a), an underlying /ʃ/ surfaces as [s] even though it immediately precedes a [t], because it is later followed by a non-adjacent [s] in the suffix. In (35b), the prefix /s-/ surfaces faithfully despite the following adjacent [n], in order to satisfy the requirement for agreement among sibilants.

(35) Samala: long-distance agreement overrides local disagreement
    a. /s-iʃ-tiʃi-jep-us/    sistisijepus    ‘they (dual) show him’
    b. /s-net-us/    snetus    ‘he does it to him’

The phonotactics of Samala present a challenge similar to the case of Tamashek Tuareg above, in that it is not possible to capture the overall pattern with a single TSL2 grammar. Since both [st] and [sn] are observed in a string-adjacent context, they must be permitted as 2-factors on a tier that includes all segments (even though they are only permitted when a [+ant] segment such as [s] follows them later in the string). However, since [st] and [sn] are allowed to occur in such contexts, a TSL2 grammar would have no means of banning *[st] and *[sn] when there is no subsequent [s] in the string.

The tableaux in (36) and (37) demonstrate that two markedness constraints that are themselves TSL2 grammars can be combined in an OT fashion to arrive at the correct pattern, when the higher ranked constraint operates on the tier of sibilants rather than the tier that includes all segments.

(36) Local disagreement in Samala

                    〈T = {s, ʃ}; R = {*sʃ, *ʃs}〉    〈T = Σ; R = {*st, *sn, *sl}〉
    a. ☞  ʃniʔ
    b.    sniʔ                                        ∗!

(37) Non-adjacent agreement in Samala

                    〈T = {s, ʃ}; R = {*sʃ, *ʃs}〉    〈T = Σ; R = {*st, *sn, *sl}〉
    a.    ʃnetus    ∗!
    b. ☞  snetus                                      ∗

5.3.3 TSL2 Constraints in Phonological Theory

The patterns found in Tamashek Tuareg and Samala are both examples of languages that cannot be accounted for with a single TSL2 grammar, nor with a conjunction of multiple TSL2 grammars, since two different tiers permit/prohibit conflicting sets of 2-factors. Moreover, in the case of Tamashek Tuareg, these patterns co-exist with an additional TSL2 pattern of sibilant harmony (see above). As discussed above, ordered rules and ranked constraints have long been used in the phonological analysis of such patterns, and I have presented a preliminary illustration of how to do so using constraints that are defined as individual, violable and ranked TSL2 grammars.
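
As a minimal sketch of how the tableaux in (31), (32), (36) and (37) could be evaluated mechanically, the code below treats each TSL2 grammar as a binary markedness constraint and picks the candidate with the best violation profile. The function names are hypothetical, the candidate forms are written without morpheme boundaries and accent marks, and the use of `None` to stand for the all-segment tier (T = Σ) is an assumption made for convenience.

```python
def violates(word, tier, banned):
    """True if some banned 2-factor is adjacent on the tier projection.

    tier=None stands for the full inventory (T = Σ), i.e. plain adjacency.
    """
    proj = [seg for seg in word if tier is None or seg in tier]
    return any(a + b in banned for a, b in zip(proj, proj[1:]))

def winner(candidates, ranking):
    """Return the candidate with the best violation profile.

    `ranking` lists (tier, banned) pairs from highest to lowest ranked;
    each constraint assigns 1 if the candidate is outside its stringset.
    """
    def profile(word):
        return tuple(int(violates(word, tier, banned)) for tier, banned in ranking)
    return min(candidates, key=profile)

# Rankings as in the tableaux: the left-hand constraint outranks the right.
TAMASHEK = [(None, {"nb"}), (set("bfm"), {"mb", "mf", "mm"})]
SAMALA = [(set("sʃ"), {"sʃ", "ʃs"}), (None, {"st", "sn", "sl"})]

print(winner(["əmːəbdˤɑ", "ənːəbdˤɑ"], TAMASHEK))  # cf. (31)
print(winner(["æmbæbbɑ", "ænbæbbɑ"], TAMASHEK))    # cf. (32)
print(winner(["ʃniʔ", "sniʔ"], SAMALA))            # cf. (36)
print(winner(["ʃnetus", "snetus"], SAMALA))        # cf. (37)
```

Comparing violation profiles as tuples gives strict domination for free: a violation of a higher ranked constraint can never be outweighed by satisfying lower ranked ones.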
A violation is assigned if the candidate does not belong to the stringset extension of the grammar, and the constraints can be ranked in any order. Intuitively, this family of constraints (i.e. those defined by TSL2 grammars) does not differ greatly from the types of output well-formedness constraints that have been proposed in the literature as drivers of harmony and dissimilation, such as Agree[F] or *XY for harmony and OCP[F] or *XX for dissimilation (see, e.g., Suzuki, 1998; Baković, 2000; Pulleyblank, 2002). From that perspective, the main contribution of the present work, aside from arguing that constraints of this kind are indeed necessary, is that it provides an extended definition of the constraint family that is computationally grounded. It is also of note that each of the patterns presented in Section 5.3.1 as a conjunction of TSL2 languages can be generated in the same way, by ranking each of the conjoined grammars in any order (since they do not conflict, there will never be any evidence for determining the relative ranking of the two constraints).

While it would be interesting to explore whether or not the typology is skewed with respect to a particular ranking of two such constraints (e.g. the local constraints being ranked above non-local constraints), the literature does not provide enough examples to allow for such an assessment. This is in part due to the fact that in order for a language to exemplify two phonotactic patterns operating on two different tiers, the language must contain at least one long-distance dependency in the first place (e.g.
non-adjacent labial dissimilation in Tamashek Tuareg, or sibilant harmony in Samala), and the sparsity of relevant patterns is an inevitable result of having limited the scope of my dissertation research to long-distance interactions between consonants in particular, which are relatively uncommon to begin with. Moreover, both of the examples presented above involve one constraint whose TSL2 grammar operates on a tier that includes all segments.

To my knowledge, there is only one potential case that involves two conflicting dependencies between consonants, both of which apply in non-adjacent contexts. In Sundanese (see Cohn, 1992; Bennett, 2013, 2015), there is a pattern of (unbounded) liquid dissimilation, evidenced when the infix /-ar-/ surfaces as [-al-] when it precedes another [r] elsewhere in the word (e.g. [ŋab-ar-edol] ‘pull in’ vs. [ŋ-al-umbara] ‘go abroad’; Cohn, 1992, p. 206). However, another pattern of (transvocalic) liquid harmony overrides the dissimilation in some cases, resulting in words that include grammatical sequences of [rVr] or [lVl] (e.g. [r-ar-ɨwat] ‘startled’, [l-al-ɨtik] ‘little’; Cohn, 1992, p. 206). Unfortunately, the full pattern of Sundanese /-ar-/ infixation is not suitable for the present investigation, as it involves a number of descriptively complex conditioning factors that the TSL2 account is not yet equipped to deal with (e.g. sensitivity to stem-initial vs. non-stem-initial position, root vs. affix affiliation, and onset vs. coda status; for further details, see Cohn, 1992; Bennett, 2015).

As an alternative source of empirical evidence that a Sundanese-like pattern should be included in the set of possible languages, it will be useful, in future research, to conduct an additional artificial language learning experiment. In particular, two new training conditions could be constructed (e.g. labelled M-Harm-S-Diss and M-Diss-S-Harm).
For the M-Harm-S-Diss group, half of the training stems would contain liquids in the Medium-range context (cvLvcv) and exhibit a pattern of liquid harmony when the suffixes [-li] or [-ɹu] are attached. The remaining half of the stems, of the Short-range variety (cvcvLv), would instead exhibit a pattern of liquid dissimilation triggered by the suffixes. The structure of the training phase would be similar for subjects in the M-Diss-S-Harm group, but they would instead be exposed to liquid dissimilation at Medium-range and liquid harmony at Short-range. (Note that the patterns corresponding to the training phases of the proposed M-Harm-S-Diss and M-Diss-S-Harm groups can both be generated with TSL2 constraints, but that only the latter can be derived within the factorial typology of ABC constraints.)

Even though the overall patterns cannot be described as individual TSL2 languages, subjects are predicted to be able to learn the appropriate patterns. For example, the pattern corresponding to the training data of an M-Harm-S-Diss group could be generated with the use of TSL2 constraints, as shown in (38) and (39): a higher ranked constraint bans identical liquids that are adjacent on the consonant tier (Short-range dissimilation), and a lower ranked constraint bans disagreeing liquids on the liquid tier (harmony elsewhere).

(38) M-Harm-S-Diss target grammar: Medium-range harmony

    Medium-range        〈T = {c, l, ɹ}; R = {*ll, *ɹɹ}〉    〈T = {l, ɹ}; R = {*lɹ, *ɹl}〉
    a. ☞  cvlvcv-lv
    b.    cvɹvcv-lv                                         ∗!

(39) M-Harm-S-Diss target grammar: Short-range dissimilation

    Short-range         〈T = {c, l, ɹ}; R = {*ll, *ɹɹ}〉    〈T = {l, ɹ}; R = {*lɹ, *ɹl}〉
    a.    cvcvlv-lv     ∗!
    b. ☞  cvcvɹv-lv                                         ∗

The prediction that such patterns should be learnable can be tested with the same stimuli and procedures that were used in Experiments 1 through 4, and plans are currently underway to do so as a follow-up to this dissertation research.

5.3.4 Conclusion: The TSL2 Region Is Not Too Small

The empirical evidence presented in this section suggests that the TSL2 region of the subregular hierarchy does not offer enough complexity to account for certain patterns observed in natural language.
However, each of the problematic patterns is a combination of multiple TSL2 patterns that co-exist in a single language and that require different tiers to be specified for certain sets of 2-factor restrictions. In many cases (when there is no conflict between the two patterns), a single phonotactic grammar can be achieved with a simple conjunction of multiple TSL2 grammars (see Section 5.3.1). There are a few additional cases in which this cannot be achieved, namely when two phonotactic generalizations are in direct conflict with one another and neither can be satisfied without violating the other (Section 5.3.2). Although there do not seem to be many such languages attested cross-linguistically, pilot results from an extension of Experiments 1 through 4 suggest that human learners are indeed able to learn such patterns in the lab. To account for this, I propose that individual TSL2 grammars can be thought of as a family of constraints that can be integrated into various constraint-based frameworks. In particular, constraints of this type are attractive because they are similar to many other constraint families that have been proposed in the literature (see Section 5.3.3), but are not subject to the same theoretical restrictions. This relative flexibility of TSL2 constraints is precisely what allows them to straightforwardly account for descriptively complex patterns, such as long-distance dependencies with blocking by a relatively arbitrary set of intervening consonants. I believe that this, along with their computational properties of learnability, makes the TSL2 definition of (long-distance co-occurrence) markedness constraints an attractive alternative to other constraint families that have been proposed in the literature as a means of deriving consonant harmony and long-distance dissimilation. I note, however, that the 2TSLIA (Jardine and Heinz, 2015) is only designed to discover surface-true patterns that belong to the TSL2 region.
Since the types of phonotactic systems that require constraint-like interactions are precisely those in which one pattern overrides another, rendering one of the patterns non-surface-true, further research must be done to determine what conditions are necessary in order to achieve efficient learning for the types of grammars that were considered in this section.

5.4 Summary and Conclusions

The purpose of this chapter was to support the argument for the TSL2 approach to characterizing long-distance phonotactics and the human learner’s hypothesis space for such patterns by asking three questions about the class of languages. I first argued in Section 5.1 that the TSL2 region is not too big, as there exist certain patterns whose TSL2 grammar requires specification of a relatively arbitrary set of segments as the tier. Furthermore, there is experimental evidence from Koo and Oh (2013) that human learners are capable of detecting an arbitrary set of dependencies on an arbitrarily defined tier, which supports the idea that they should indeed be included in the hypothesis space. However, the relative descriptive complexity of TSL2 stringsets (as compared to, for example, the Strictly 3-Local or Strictly 2-Piecewise classes of formal languages; see Section 4.1) motivates the question of whether or not they are even computationally learnable. Section 5.2 provided a summary and an example implementation of the Tier-Based Strictly 2-Local Inference Algorithm (2TSLIA) proposed by Jardine and Heinz (2015), who prove that TSL2 grammars are efficiently learnable. Finally, Section 5.3 presented several patterns that cannot be characterized as members of the TSL2 class of stringsets.
However, each pattern is made up of interacting dependencies that may override each other. I have demonstrated how we can use TSL2 grammars to define phonological markedness constraints that can be integrated into more familiar constraint-based frameworks, and I argued that doing so offers a number of advantages from a computational perspective.

Chapter 6
Summary and Conclusions

6.1 Empirical Findings

This dissertation has established a set of empirical results that any theory of long-distance consonant phonotactics needs to be able to account for. With respect to locality relations, Table 6.1 summarizes the range of attested patterns.

Table 6.1: Typology of locality relations in patterns of consonant harmony (Harm) and long-distance consonant dissimilation (Diss). Note that the label ‘>transvocalic’ refers to beyond-transvocalic locality.

    Type   Locality         CvC   CvcvC   CvcvcvC   Attested?
    Harm   unbounded         +      +        +         ✓
           transvocalic      +      –        –         ✓
           >transvocalic     –      +        +         ✗
    Diss   unbounded         +      +        +         ✓
           transvocalic      +      –        –         ✓
           >transvocalic     –      +        +         ✗¹

¹ Note that I classify beyond-transvocalic dissimilation as an unattested pattern, despite the potential empirical support from the case in Sundanese (Cohn, 1992; Bennett, 2013, 2015), a complex pattern that I argue is better interpreted as a local (transvocalic) requirement for liquid harmony that overrides a more general restriction against the co-occurrence of identical liquids (see Section 5.3.3).

The typology in Table 6.1 raises the question of why there exists a robust dichotomy between unbounded and transvocalic dependencies, and why other logically possible patterns remain categorically unattested, such as a co-occurrence restriction that is enforced only in beyond-transvocalic contexts. In the above chapters, I have pursued in depth the idea that certain phonotactic patterns are unattested because there is no grammar within the human learner’s hypothesis space that could generate that pattern.
In other words, the learner is equipped with an inductive learning bias that renders certain patterns synchronically and diachronically inaccessible.

To further investigate these issues, I conducted a series of artificial language learning experiments, in which participants were tasked with learning a dependency between two liquid consonants from various permutations of training stimuli that resulted in nine training conditions (summarized in Table 6.2).

Table 6.2: Summary of training conditions for Experiments 1 through 4. Coloured cells, marked with ‘+’, indicate that evidence of a dependency that holds at that distance was presented in the training phase. Grey cells, marked with ‘–’, indicate that the training stimuli included liquids at that distance, but that they always stayed faithful, and thus did not conform to any systematic harmony or dissimilation pattern. White cells with ‘?’ are contexts for which no exposure to liquids was provided during the training phase.

    Exp.   Group             …Lv-Lv   …Lvcv-Lv   Lvcvcv-Lv
    1      M-Harm               ?         +          ?
           S-Harm               +         ?          ?
    2      M-Diss               ?         +          ?
           S-Diss               +         ?          ?
    3      M-Harm-S-Faith       –         +          ?
           S-Harm-M-Faith       +         –          ?
    4      M-Diss-S-Faith       –         +          ?
           S-Diss-M-Faith       +         –          ?
    1-4    Control              ?         ?          ?

I argued that the results of Experiments 1 through 4 provide relatively strong evidence in support of the hypothesis that the typology of long-distance dependencies is a reflection of a human learning bias.

In Experiments 1 and 2, human learners were presented with training data that did not offer complete information about the exact nature of the pattern, and the general results were the same whether the target pattern was liquid harmony (Experiment 1) or liquid dissimilation (Experiment 2). Recall that the participants in the M-Harm (Exp. 1) and M-Diss (Exp. 2) training conditions were exposed to pairs of liquids that were separated by a Medium-range distance (cvLvcv-Lv), but did not encounter any data on Short-range (cvcvLv-Lv) or Long-range (Lvcvcv-Lv) distances.
Only one attested (and by hypothesis possible) type of locality is compatible with the evidence presented in their training phase (namely, unbounded). Indeed, even though none of the participants had any prior experience with such a pattern, they tended to internalize it with unbounded locality, applying the dependency to all three distances in the testing phase (as opposed to, e.g., learning a dependency that holds specifically between a liquid in the second syllable of the stem and a liquid in the suffix). For participants in the S-Harm (Exp. 1) and S-Diss (Exp. 2) groups, the training data included evidence of a phonotactic dependency between liquids at Short-range distances, but they received no exposure to pairs of liquids in Medium- or Long-range contexts. Such a pattern is, in principle, compatible with either of the attested transvocalic or unbounded locality variants, but learners tended to interpret the pattern as strictly transvocalic, and only a small effect (if any) was observed at the group level for generalizing the target pattern to greater distances. Experiment 1 thus replicates the findings of previous experiments looking at the learning of locality relations in patterns of sibilant harmony (Finley, 2011, 2012; McMullin and Hansson, 2014, Experiment 1), and furthermore extends them to a different class of segments (i.e. liquids rather than sibilants). Likewise, the results of Experiment 2 suggest that humans learn and generalize phonotactic dependencies in the same way, whether the nature of the interaction is assimilatory or dissimilatory.

In Experiments 3 and 4, the training phase provided participants with information about the behaviour of liquids in both Short-range and Medium-range contexts. For each of the experimental groups, the training phase exhibited a suffix-triggered pattern of liquid harmony (Exp. 3) or liquid dissimilation (Exp. 4) at one of the two distances, but, at the other distance, there was no restriction on the co-occurrence of liquids.
Note that the resulting patterns shown to the S-Harm-M-Faith (Exp. 3) and the S-Diss-M-Faith (Exp. 4) groups are not in conflict with the typology, as such patterns are compatible with the well-attested transvocalic variants of consonant harmony and dissimilation, respectively. As expected, this did not impede learning, and the subjects in both groups tended to apply the pattern exactly as it was evidenced in training. By contrast, the patterns exhibited in the training phases for the M-Harm-S-Faith (Exp. 3) and M-Diss-S-Faith (Exp. 4) groups, which exhibited phonotactic restrictions that held at Medium-range (triggering liquid alternations), but not at Short-range (overt evidence of faithful non-alternation), are not compatible with any type of attested pattern in terms of locality. Specifically, these beyond-transvocalic patterns violate the observed typological universal that if a language enforces a particular phonotactic restriction in Medium-range contexts, then the same restriction applies at Short-range. (The contrapositive is also true: no restriction at Short-range implies no Medium-range restriction.) Not surprisingly, such patterns prove to be extremely difficult for humans to learn in an artificial language learning task, and very few individual participants in these training conditions successfully learned that a dependency held in Medium-range contexts. Furthermore, of those that did, the majority over-generalized, applying the same phonotactic pattern to pairs of liquids in Short-range contexts, in spite of the overt evidence in the training data that transvocalic pairs of liquids should remain faithful.
In sum, I interpret the results of Experiments 1 through 4 as evidence for the connection between typology and human learning bias.

6.2 Assessing Theoretical Approaches

In light of the above empirical data, this dissertation has assessed the predictions of two distinct theoretical frameworks in terms of whether or not they offer a satisfactory account of the boundary between phonotactic patterns that are human-learnable and those that are not.

Agreement by Correspondence

Within Optimality Theory (Prince and Smolensky, 2004), I focused primarily on the Agreement by Correspondence framework (Walker, 2000a,c; Hansson, 2001, 2010a; Rose and Walker, 2004), which has seen relative success in accounting for the typology of consonant harmony, and which has recently been extended as a comprehensive analysis of long-distance consonant dissimilation (Bennett, 2013). From this perspective, the boundary between a possible and impossible language can be defined directly in terms of the factorial typology of the universal constraint set assumed by ABC. Two main strategies have been proposed as a means of dealing with locality in ABC. Interestingly, although each relies on some version of a locality-based constraint, they result in different sets of patterns that can or cannot be generated. The first approach is to define a CC·Limiter constraint that penalizes correspondence outside of the relevant window, such as Proximity (defined by Rose and Walker, 2004, in terms of syllable-adjacency; redefined in Chapter 2 with respect to transvocalic contexts), or CC-cvc (based on CC·SyllAdj; Bennett, 2013). The second strategy is to define a Corr[αF] constraint that enforces correspondence only within the bounded range (e.g. Corr-cvc[αF]).
Such constraints were originally advocated by Hansson (2001, 2010a) as an alternative to constraints like Proximity or CC-cvc; Bennett (2013) argues that they are necessary in addition to a locality-based CC·Limiter.

Table 6.3 summarizes the types of patterns that can be generated within a factorial typology of five ABC constraints: four basic constraints (including Corr[αF], CC-Ident[G], IO-Ident[F], IO-Ident[G]), and one of Proximity, CC-cvc, or Corr-cvc[αF]. In this schematic overview of the predictions, note that each pattern of harmony would enforce agreement for [G] among those consonants with a particular set of shared features (one of which is [F]), whereas the patterns of dissimilation would require disagreement for [F].²

Table 6.3: Types of phonotactic patterns and locality relations that can (✓) or cannot (✗) be generated within the factorial typology of ABC, using different locality-based constraints. Green and red icons indicate whether the prediction matches the typology or not, respectively.

                                                 Constraint version
    Type   Locality         Attested?   Proximity   CC-cvc   Corr-cvc
    Harm   unbounded            ✓           ✓          ✓         ✓
           transvocalic         ✓           ✓          ✓         ✓
           >transvocalic        ✗           ✗          ✗         ✗
    Diss   unbounded            ✓           ✓          ✓         ✓
           transvocalic         ✓           ✗          ✗         ✓
           >transvocalic        ✗           ✓          ✓         ✗

² Recall that, all else equal, IO-Ident[G] ≫ IO-Ident[F] favours a pattern of consonant harmony, and IO-Ident[F] ≫ IO-Ident[G] favours dissimilation.

With respect to consonant harmony, the ABC model makes all of the correct predictions no matter which of the three proposed locality-based constraints is used (Proximity, CC-cvc, or Corr-cvc[αF]). This is no doubt a reflection of the fact that the surface-correspondence approach was developed specifically as an analysis of consonant harmony and the unique cross-linguistic properties exhibited by such patterns.
However, Proximity and CC-cvc, which penalize correspondence outside of transvocalic contexts, predict that beyond-transvocalic patterns of dissimilation should be attested, while the simple (and widely attested) transvocalic variant of dissimilation should not (cf. the “Mismatch Prediction” about the typologies of consonant harmony vs. dissimilation; Bennett, 2013; discussed in Section 3.1.2).

Moreover, Section 3.1.1 demonstrated that the global evaluation of Proximity leads to pathological patterns, such as dependencies that are sensitive to the count (even vs. odd parity) of potential correspondents. CC-cvc seems to provide some improvement, since it is evaluated only for those pairs of surface-corresponding consonants that are “local”, in the sense of not being separated by a member of the same correspondence class. Thus, in a sequence …Cx…Cx…Cx…, the C1↔C2 and C2↔C3 correspondent pairs are subject to CC-cvc but not the C1↔C3 pair, much as though the correspondents are being treated as a tier (ordered subsequences of the output string) rather than as the unordered sets (equivalence classes) defined by the formal correspondence relation. The alternative strategy of using Corr-cvc[αF] constraints, which enforce correspondence only in transvocalic contexts, seems to provide a much better fit to the empirical data. Notice, however, that this move indirectly incorporates the notion of a “consonant tier”, in that the presence of any intervening consonant nullifies the demand for correspondence (and thereby the motivation for harmony or dissimilation) between the segments of interest.

It is notable that additional proposals for modifying the evaluation of ABC constraints also trend in the direction of treating collections of surface-correspondents as if they were tiers. Hansson (2007) suggests, for example, that CC-Ident constraints can be evaluated only for segment pairs that are adjacent in the “correspondence chain”.
The motivation, again, is to avoid pathological predictions of the factorial typology, which includes harmony systems where a “majority rule” (Lombardi, 1999; Baković, 2000) determines the directionality of assimilation, or even the status of an intervening segment as opaque vs. transparent to the harmony, on a word-by-word basis. I note, however, that in spite of all of the proposed (tier-like) changes that seem to improve the predictions, the inherently complex ABC machinery still permits certain pathologies. In particular, I draw attention to cases of “agreement by proxy” (Hansson, 2014; discussed at length earlier in this dissertation), in which two relatively dissimilar segments (e.g. [s] and [g]) are forced to be in surface-correspondence (and therefore to agree in some way) if and only if each one is sufficiently similar to a third consonant (e.g. [x] occurring somewhere else in the word). Such patterns are expected to arise due to the transitivity of the surface-correspondence relation, whereby in a sequence …Cx…Cx…Cx…, if C1↔C3 and C2↔C3 pairs are in surface correspondence, then so is the C1↔C2 pair. It seems that this prediction cannot be avoided unless the idea of an all-encompassing correspondence relation is abandoned altogether.

Based on the above evidence, this dissertation argued that the factorial typology of ABC constraints does not (and cannot) provide an accurate characterization of the set of possible patterns, and therefore that we need to pursue a different theoretical account of the typology and learnability of long-distance phonotactics.

Subregular Stringsets

As an alternative approach, I investigated potential solutions within the framework of formal language theory, in which the phonotactics of a language can be thought of as a systematic distinction between the set of grammatical vs. ungrammatical words (where words are strings of segments).
One of the conceptual draws of this approach is that the properties of phonotactic dependencies can be investigated computationally, and the proposed space of possible patterns can be expressed as a well-defined class of formal languages. More specifically, this dissertation explored long-distance phonotactic patterns in terms of where they are situated within the subregular hierarchy (see, e.g., McNaughton and Papert, 1971; Rogers et al., 2010; Heinz et al., 2011; Rogers and Pullum, 2011), focusing primarily on three such classes: the Strictly 3-Local (SL3), Strictly 2-Piecewise (SP2), and Tier-based Strictly 2-Local (TSL2) languages. A summary of the types of patterns that are contained within each of these regions (as well as the region defined by the union of the SL3 and SP2 classes) is provided in Table 6.4.

Table 6.4: Types of phonotactic patterns and locality relations that can (✓) or cannot (✗) be generated as members of different subregular classes of formal languages (stringsets). Green and red icons indicate whether the prediction matches the typology or not, respectively.

                                               Subregular class
    Type   Locality         Attested?   SL3   SP2   SL3 ∪ SP2   TSL2
    Harm   unbounded            ✓        ✗     ✓        ✓         ✓
           transvocalic         ✓        ✓     ✗        ✓         ✓
           >transvocalic        ✗        ✗     ✗        ✗         ✗
    Diss   unbounded            ✓        ✗     ✓        ✓         ✓
           transvocalic         ✓        ✓     ✗        ✓         ✓
           >transvocalic        ✗        ✗     ✗        ✗         ✗

While neither the SL3 nor the SP2 class contains all of the desired patterns, one possibility is that long-distance dependencies can be either SL3 or SP2 (Heinz, 2010). More specifically, McMullin and Hansson (2014) argue that the human phonotactic learner consists of (at least) two modules for learning long-distance consonantal phonotactics: an n-gram learner for acquiring transvocalic patterns (as SL3 languages that ban certain *CvC trigrams, or 3-factors), as well as a precedence learner (Heinz, 2010) that is responsible for detecting unbounded dependencies (as SP2 languages with restrictions on certain C…C subsequences).
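
To make the contrast between the two kinds of grammar concrete, here is a minimal sketch of an SL3 check (banned 3-factors) and an SP2 check (banned 2-subsequences). It illustrates only the grammars themselves, not the learners; the function names and the toy words over {s, ʃ, t, p, a} are assumptions made for illustration.

```python
from itertools import combinations

def sl3_accepts(word, banned_3factors):
    """Strictly 3-Local: no banned contiguous trigram occurs in the word."""
    return all(word[i:i + 3] not in banned_3factors for i in range(len(word) - 2))

def sp2_accepts(word, banned_subsequences):
    """Strictly 2-Piecewise: no banned pair occurs as a subsequence.

    combinations() yields pairs in left-to-right order, so each pair is an
    ordered (earlier, later) precedence relation at any distance.
    """
    return all(a + b not in banned_subsequences for a, b in combinations(word, 2))

# Toy sibilant harmony with a single vowel 'a' (hypothetical inventory).
TRANSVOCALIC = {"saʃ", "ʃas"}  # SL3: ban disagreeing sibilants across one vowel
UNBOUNDED = {"sʃ", "ʃs"}       # SP2: ban disagreeing sibilants at any distance

for w in ["sapas", "saʃ", "sapaʃ", "sataʃ"]:
    print(w, sl3_accepts(w, TRANSVOCALIC), sp2_accepts(w, UNBOUNDED))
```

In this sketch the SL3 grammar is blind to the disagreeing sibilants in sapaʃ, while the SP2 grammar rejects sataʃ along with sapaʃ, since a precedence restriction has no way of treating an intervening [t] as a blocker.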
The resulting characterization of the learner’s hypothesis space is the union of these two classes of formal languages (SL3 ∪ SP2), which, as shown in Table 6.4, reflects the range of locality relations that are observed in patterns of consonant harmony and long-distance consonant dissimilation.

The TSL2 class of formal languages (Heinz et al., 2011) likewise provides a sufficient account of the empirical findings with respect to locality relations. Recall that TSL2 languages are defined by a tier T (a subset of the segment inventory Σ), and a set of 2-factors that are prohibited in tier-adjacent contexts (labelled R; or alternatively, a set S of 2-factors that are permitted on the tier). From this perspective, the difference between the transvocalic and unbounded variants of long-distance dependencies is whether the grammaticality of each word is assessed on the tier of consonants (where only those consonant pairs in a C(v)C relationship can potentially violate the phonotactics), or a tier composed only of potential triggers or targets (e.g. the liquid tier, the sibilant tier, etc.).

As both of the proposed regions (i.e. the TSL2 class and the union of the SL3 and SP2 classes) seem to offer a good definition of the boundary between possible and impossible patterns with respect to locality relations, I compared additional types of patterns contained within each region and argued that there is more evidence that supports the TSL2 approach. Specifically, I argued that treating the two elements of a dependency as adjacent segments on a tier allows for a unified account of both locality and blocking in long-distance phonotactics. The difference between whether a long-distance consonant interaction can or cannot be blocked by a specific intervening segment is simply whether or not that segment is a member of the relevant tier.
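The definition of a TSL2 grammar as a tier T plus a set R of prohibited tier-adjacent 2-factors can be sketched in a few lines. The sketch below assumes the inventory Σ = {s, ʃ, t, p, a} and the sibilant-harmony restriction R = {*sʃ, *ʃs} used in Table 6.5, ignores word boundaries, and uses hypothetical names (`project`, `tsl2_ok`, `TIERS`) chosen only for illustration.

```python
def project(word, tier):
    """Erase every segment that is not on the tier."""
    return "".join(seg for seg in word if seg in tier)

def tsl2_ok(word, tier, banned):
    """TSL2 check: no banned 2-factor among tier-adjacent segments."""
    t = project(word, tier)
    return all(t[i:i + 2] not in banned for i in range(len(t) - 1))

BANNED = {"sʃ", "ʃs"}  # R: disagreeing sibilants may not be tier-adjacent

# Growing the tier shrinks the reach of the restriction.
TIERS = {
    "unbounded": set("sʃ"),
    "blocking": set("sʃt"),
    "transvocalic": set("sʃtp"),
    "direct-adjacency": set("sʃtpa"),
}

WORDS = ["sapas", "sataʃ", "sapaʃ", "saʃ", "asʃa"]

for name, tier in TIERS.items():
    marks = ["ok" if tsl2_ok(w, tier, BANNED) else "*" for w in WORDS]
    print(f"{name:17}", *marks)
```

The only thing that changes from one pattern to the next is the membership of the tier: adding [t] makes it a blocker, adding all consonants yields the transvocalic pattern, and adding the vowel as well reduces the restriction to strict adjacency.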
As an example of this, Table 6.5 demonstrates that specifying different sets of segments as members of the tier results in exactly the types of patterns that are attested cross-linguistically.

Table 6.5: Illustration of how the grammaticality of words with sibilant subsequences varies as a result of modifying the contents of the tier specified by a TSL2 grammar, with a segment inventory Σ = {s, ʃ, t, p, a}. Each cell indicates whether the word is grammatical (✓) or not (✗).

| Sibilant harmony (R = {*sʃ, *ʃs}) | Tier | sapas | sataʃ | sapaʃ | saʃ | asʃa |
|---|---|---|---|---|---|---|
| unbounded | {s, ʃ} | ✓ | ✗ | ✗ | ✗ | ✗ |
| blocking | {s, ʃ, t} | ✓ | ✓ | ✗ | ✗ | ✗ |
| transvocalic | {s, ʃ, t, p} | ✓ | ✓ | ✓ | ✗ | ✗ |
| (direct-adjacency) | {s, ʃ, t, p, a} | ✓ | ✓ | ✓ | ✓ | ✗ |

Although long-distance dependencies with blocking are relatively rare across the world’s languages, they are indeed attested for both assimilatory and dissimilatory interactions between consonants. Since these patterns cannot be captured within the SL3 ∪ SP2 region, I concluded that the class of TSL2 formal languages provides a close approximation of the range of patterns that are supported empirically.

Finally, it is important to note that formal language theory and phonological theory are not inherently incompatible, and I argue that the two approaches offer a mutual benefit. For example, many constraints that have been proposed in order to account for long-distance interactions are defined in ways that very closely resemble a TSL2 grammar (where a violation is assigned if a particular candidate is not a member of the corresponding stringset). I suggest that stating them in formal terms allows us to better understand the range of predictions and to account for many patterns that are otherwise rather difficult to handle within the confines of phonological theory (e.g. segmental blocking effects, arbitrary tiers, etc.).
Likewise, an individual TSL2 grammar has no way of accounting for complex patterns that quite clearly arise due to the interaction of two phonotactic restrictions, in which one overrides the other when they cannot both be satisfied simultaneously. There is no clear path to dealing with this strictly in terms of formal language theory, but the integration of formal grammars with the well-studied notion of constraint rankings (or rule orderings) often allows for relatively simple solutions for analyzing otherwise complex phonotactic interactions. Finally, I point out that while Jardine and Heinz (2015) have recently proposed an algorithm (the 2TSLIA; see Section 5.2.1) that can provably and efficiently acquire a correct grammar for any individual TSL2 pattern, further investigation into the learnability of multiple TSL2 patterns (that are not necessarily surface-true) is needed before this strategy can be considered computationally tractable.

6.3 Outstanding Issues

As the scope of this dissertation was restricted to locality relations in long-distance consonantal phonotactics, there are a number of issues that need to be addressed. In this section, I briefly discuss four important areas of research that deserve attention in future work: segmental blocking, the role of phonological similarity, other types of non-adjacent dependencies, and the computational properties of input-output mappings.

First, there is a need for further empirical investigation of the learnability of blocking effects in patterns of consonant harmony and long-distance consonant dissimilation. Due to the cross-linguistic sparsity of such patterns, there is much to be gained from the use of an artificial language learning paradigm to study issues related to their learnability. However, to my knowledge, no such study has been conducted, and there are a number of questions that need to be addressed.
For example (among many others): Can long-distance consonant interactions that are blocked by certain intervening segments be learned in the laboratory? If so, what conditions are necessary to achieve learning? How do learners generalize from limited information in the training phase? Can any segment be a blocker? Do the predictions align with the set of attested patterns (few as they may be), and the predictions of the TSL2 approach?

With respect to phonological similarity and natural classes, I have argued that an advantage of TSL2 grammars is the ability to specify a tier that contains any (potentially arbitrary) set of segments. However, it is clear that the majority of attested patterns can be characterized with a TSL2 language whose tier is indeed a set of segments that form a natural class (e.g., the sibilants, the liquids, the voiced obstruents, etc.). At present, it remains unclear how features (or other representational structure) are best treated in terms of formal language theory. However, future advances in this area may shed light on the relationship between how the Tier-based Strictly 2-Local Inference Algorithm (2TSLIA; Jardine and Heinz, 2015) learns phonotactic patterns (i.e. with no preference for ‘natural’ tiers), and the biases exhibited by human learners in artificial language learning studies that consider various types of feature interactions (see, e.g., Wilson, 2003; Moreton, 2008, 2012; Koo and Oh, 2013).

As a future source of data, the TSL2 approach can be extended beyond the analysis of long-distance interactions between consonants. For example, Jardine (2015) shows that the 2TSLIA (Jardine and Heinz, 2015; see Section 5.2.1 above) can be used to learn a TSL2 grammar for the phonotactics of vowels in Finnish (using data from Goldsmith and Riggle, 2012), which exhibits several interesting properties of locality, transparency, and blocking in vowel harmony.
Patterns of vowel harmony are widespread cross-linguistically, and there also exist certain interactions that hold between consonants and vowels (e.g. nasal harmony). Patterns like these may therefore provide an excellent empirical testing ground for the predictions of the TSL2 approach (for details and previous analyses of vowel harmony and vowel-consonant harmony, see, e.g., van der Hulst and van de Weijer, 1995; Walker, 2000b; Archangeli and Pulleyblank, 2007; Finley, 2008; Nevins, 2010).

Finally, the treatment of long-distance dependencies as members of the Tier-based Strictly 2-Local class of formal languages is rather limited, in that it only offers an account of the phonotactic restrictions on surface forms. However, recent research into the formal characterization of input-output mappings has focused on establishing a hierarchy of well-defined classes of subregular relations (as opposed to stringsets), and investigating their associated computational properties, such as relative complexity and learnability (see, e.g., Chandlee, 2014; Chandlee and Jardine, 2014; Chandlee et al., 2014; Jardine et al., 2014; Payne, 2014). Although there is still much work to be done in this area, present results from the literature suggest that this approach to phonological mappings can provide an attractive alternative to constraint-based frameworks, and can be used to assess the computational implications of particular constraint sets.

6.4 Conclusion

With an increasing amount of support for the hypothesis that typological distributions of phonological patterns are shaped, in part, by human learning biases, it is important to understand the range of patterns that any theoretical model predicts to be possible and human-learnable. With respect to phonotactic patterns that can be generated in a constraint-based framework, this is not necessarily an easy task, since complex interactions of seemingly unrelated constraints can inadvertently over-generate, resulting in a number of pathologies.
I argue that pursuing questions of pattern complexity and learnability within formal language theory can offer us a ‘computational grounding’ of phonology that may help to rectify certain problematic predictions, especially with respect to the structural properties of phonological patterns (e.g. locality and opacity), in the same way that ‘phonetic grounding’ is an attempt to confine predictions in terms of substance (e.g. perceptual similarity). While there is much work to be done before we fully understand the computational properties of phonological patterns in natural language, each step of the pursuit can both advance our knowledge of the limits on human learnability and bring us closer to a cohesive explanation of phonological typology.

Bibliography

Alderete, J. and Frisch, S. A. (2007). Dissimilation in the grammar and the lexicon. In de Lacy, P., editor, The Cambridge Handbook of Phonology, pages 379–398. Cambridge University Press, Cambridge. → pages 8

Applegate, R. B. (1972). Ineseño Chumash grammar. Doctoral dissertation, University of California, Berkeley. → pages ii, 1, 140

Archangeli, D. and Pulleyblank, D. (1994). Grounded phonology. MIT Press, Cambridge, MA. → pages 13

Archangeli, D. and Pulleyblank, D. (2007). Harmony. In de Lacy, P., editor, The Cambridge Handbook of Phonology, pages 353–378. Cambridge University Press, Cambridge. → pages 158

Baković, E. (2000). Harmony, dominance and control. Doctoral dissertation, Rutgers University. → pages 142, 153

Bates, D., Maechler, M., and Bolker, B. (2014). lme4: Linear mixed-effects models using S4 classes. R package version 1.1-6. → pages 32

Becker, M., Nevins, A., and Levine, J. (2012). Asymmetries in generalizing alternations to and from initial syllables. Language, 88(2):231–268. → pages 69, 75

Bennett, W. (2013). Dissimilation, consonant harmony, and surface correspondence. Doctoral dissertation, Rutgers University.
→ pages ii, 6, 7, 8, 9, 10, 12, 14, 18, 21, 39, 40, 48, 49, 53, 54, 55, 56, 57, 59, 61, 62, 78, 79, 85, 131, 143, 147, 151, 152

Bennett, W. (2015). Assimilation, dissimilation, and surface correspondence in Sundanese. Natural Language and Linguistic Theory, 33(2):371–415. → pages 14, 49, 59, 62, 78, 143, 147

Blevins, J. (2004). Evolutionary phonology: The emergence of sound patterns. Cambridge University Press, Cambridge. → pages 5, 17, 113

Boersma, P. (1997). How we learn variation, optionality, and probability. In Proceedings of the Institute of Phonetic Sciences of the University of Amsterdam, volume 21, pages 43–58. → pages 13, 44

Boersma, P. and Hayes, B. (2001). Empirical tests of the Gradual Learning Algorithm. Linguistic Inquiry, 32(1):45–86. → pages 44

Chandlee, J. (2014). Strictly Local phonological processes. PhD thesis, University of Delaware. → pages 158

Chandlee, J., Eyraud, R., and Heinz, J. (2014). Learning Strictly Local subsequential functions. Transactions of the Association for Computational Linguistics, 2:491–503. → pages 158

Chandlee, J. and Jardine, A. (2014). Learning phonological mappings by learning strictly local functions. In Kingston, J., Moore-Cantwell, C., Pater, J., and Staubs, R., editors, Proceedings of the 2013 Meeting on Phonology, Washington, DC. Linguistic Society of America. → pages 158

Chomsky, N. (1956). Three models for the description of language. IRE Transactions on Information Theory, 2(3):113–124. → pages 15, 79

Clements, G. N. (1980). Vowel harmony in nonlinear generative phonology: an autosegmental model. Indiana University Linguistics Club, Bloomington, IN. → pages 17, 113

Clements, G. N. (1985). The geometry of phonological features. Phonology Yearbook, 2:225–252. → pages 17, 113

Clements, G. N. and Hume, E. V. (1995). The internal organization of speech sounds. In Goldsmith, J. A., editor, The handbook of phonological theory, pages 245–306. Blackwell, Oxford. → pages 17, 113

Cohn, A. C. (1992). The consequences of dissimilation in Sundanese.
Phonology, 9:199–220. → pages 49, 59, 62, 143, 147

Coupez, A. (1980). Abrégé de grammaire rwanda. Institut National de Recherche Scientifique, Butare. → pages 122

Creel, S. C., Newport, E. L., and Aslin, R. N. (2004). Distant melodies: statistical learning of nonadjacent dependencies in tone sequences. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30(5):1119–1130. → pages 1, 9

Cser, A. (2010). The alis / aris allomorphy revisited. In Rainer, F., Dressler, W. U., Kastovsky, D., and Luschützky, H. C., editors, Variation and change in morphology: Selected papers from the 13th International Morphology Meeting, Vienna, February 2008, pages 33–51. John Benjamins, Amsterdam. → pages 12, 118, 119, 122

Culbertson, J. (2012). Typological universals as reflections of biased learning: Evidence from artificial language learning. Language and Linguistics Compass, 6:310–329. → pages 3

Culbertson, J., Smolensky, P., and Legendre, G. (2012). Learning biases predict a word order universal. Cognition, 122:306–329. → pages 3

Culy, C. (1985). The complexity of the vocabulary of Bambara. Linguistics and Philosophy, 8:345–351. → pages 15

de la Higuera, C. (1997). Characteristic sets for polynomial grammatical inference. Machine Learning, 27(2):125–138. → pages 124

Dolbey, A. and Hansson, G. Ó. (1999). The source of naturalness in synchronic phonology. In Billings, S. J., Boyle, J. P., and Griffith, A. M., editors, Papers from the 35th meeting of the Chicago Linguistic Society, volume 1, pages 59–69, Chicago, IL. Chicago Linguistic Society. → pages 24, 76

Dresher, E. (1999). Charting the learning path: cues to parameter setting. Linguistic Inquiry, 30:27–67. → pages 13

Dressler, W. (1971). An alleged case of nonchronological rule insertion. Linguistic Inquiry, 2:597–599. → pages 12, 118, 119

Eisner, J. (1997). What constraints should OT allow? Paper presented at the 71st Annual Meeting of the Linguistic Society of America, Chicago, January 1997. [ROA-204 (talk handout)]. → pages 13

Ellison, T.
M. (1992). The machine learning of phonological structure. PhD thesis, University of Western Australia. → pages 14

Ellison, T. M. (1994). Phonological derivation in optimality theory. In Proceedings of the 15th International Conference on Computational Linguistics (COLING ’94), volume 2, pages 1007–1013, Stroudsburg, PA. Association for Computational Linguistics. [ROA-75]. → pages 14

Elmedlaoui, M. (1995). Aspects des représentations phonologiques dans certaines langues chamito-sémitiques. PhD thesis, Université Mohammed V. → pages 12, 88, 131, 134, 135

Ettlinger, M., Morgan-Short, K., Faretta-Stutenberg, M., and Wong, P. C. (2015). The relationship between artificial and second language learning. Cognitive Science. → pages 6

Fallon, P. D. (1993). Liquid dissimilation in Georgian. In Kathol, A. and Bernstein, M., editors, Proceedings of the 10th Eastern States Conference on Linguistics, pages 105–116, Ithaca, NY. DMLL Publications. → pages 7, 85, 86

Finley, S. (2008). Formal and cognitive restrictions on vowel harmony. PhD thesis, Johns Hopkins University. → pages 2, 158

Finley, S. (2011). The privileged status of locality in consonant harmony. Journal of Memory and Language, 65:74–83. → pages 5, 11, 24, 25, 39, 46, 73, 149

Finley, S. (2012). Testing the limits of long-distance learning: learning beyond a three-segment window. Cognitive Science, 36:740–756. → pages 5, 11, 24, 25, 39, 46, 73, 75, 149

Finley, S. and Badecker, W. (2007). Towards a substantively biased theory of learning. Berkeley Linguistics Society, 33:142–154. → pages 3

Finley, S. and Badecker, W. (2009). Artificial language learning and feature-based generalization. Journal of Memory and Language, 61(3):423–437. → pages 5, 93

Finn, A. S. and Hudson Kam, C. L. (2008). The curse of knowledge: first language knowledge impairs adult learners’ use of novel statistics for word segmentation. Cognition, 108:477–499. → pages 6

Gafos, A. I. (1999). The articulatory basis of locality in phonology. Garland, New York.
→ pages 6, 25

Garcia, P., Vidal, E., and Oncina, J. (1990). Learning locally testable languages in the strict sense. In Arikawa, S., Goto, S., Ohsuga, S., and Yokomori, T., editors, Algorithmic Learning Theory, First International Workshop, ALT ’90, pages 325–338. Springer/Ohmsha. → pages 84

Garrett, A. and Johnson, K. (2012). Phonetic bias in sound change. In Yu, A. C. L., editor, Origins of sound change: Approaches to phonologization. Oxford University Press, Oxford. → pages 5

Gebhart, A. L., Newport, E. L., and Aslin, R. N. (2009). Statistical learning of adjacent and non-adjacent dependencies among non-linguistic sounds. Psychonomic Bulletin & Review, 16:486–490. → pages 1, 9

Gold, E. M. (1967). Language identification in the limit. Information and Control, 10:447–474. → pages 84, 124, 137

Goldsmith, J. A. and Riggle, J. (2012). Information theoretic approaches to phonological structure: the case of Finnish vowel harmony. Natural Language and Linguistic Theory, 30:859–896. → pages 2, 157

Goldwater, S. and Johnson, M. (2003). Learning OT constraint rankings using a maximum entropy model. In Proceedings of the Workshop on Variation within Optimality Theory, Department of Linguistics: Stockholm University. → pages 13, 44

Halle, M. and Vergnaud, J.-R. (1981). Harmony processes. In Klein, W. and Levelt, W., editors, Crossing the boundaries in linguistics: Studies presented to Manfred Bierwisch, pages 1–22. Reidel, Dordrecht. → pages 6

Hansson, G. Ó. (2001). Theoretical and typological issues in consonant harmony. Doctoral dissertation, University of California, Berkeley. → pages 6, 8, 11, 14, 39, 48, 61, 78, 79, 84, 151

Hansson, G. Ó. (2007). Blocking effects in agreement by correspondence. Linguistic Inquiry, 38(2):395–409. → pages 12, 88, 114, 153

Hansson, G. Ó. (2008). Diachronic explanations of sound patterns. Language and Linguistics Compass, 2(5):859–893. → pages 5

Hansson, G. Ó. (2010a). Consonant harmony: long-distance interaction in phonology.
University of California Press, Berkeley, CA. → pages ii, 6, 7, 8, 10, 11, 14, 18, 21, 22, 25, 39, 48, 78, 79, 82, 114, 131, 134, 135, 140, 151

Hansson, G. Ó. (2010b). Long-distance voicing assimilation in Berber: spreading and/or agreement? In Heijl, M., editor, Actes du Congrès de l’ACL 2010 / 2010 CLA Conference Proceedings. Canadian Linguistic Association. → pages 12, 61, 88, 131, 134, 135

Hansson, G. Ó. (2014). (Dis)agreement by (non)correspondence: inspecting the foundations. Paper presented at ABC↔Conference, Berkeley. [Slides published in 2014 UC Berkeley Phonology Lab Annual Report: ABC↔Conference Archive, 3–62. Online: http://linguistics.berkeley.edu/phonlab/annual_report/documents/2014/annual_report_2014_ABCC.html]. → pages 49, 56, 153

Hayes, B. (1999). Phonetically driven phonology: the role of optimality theory and inductive grounding. In Darnell, M., Moravcsik, E., Newmeyer, F. J., Noonan, M., and Wheatley, K., editors, Functionalism and formalism in linguistics, volume 1, pages 243–285. John Benjamins. → pages 13

Hayes, B. (2004). Phonological acquisition in Optimality Theory. In Kager, R., Pater, J., and Zonneveld, W., editors, Fixing priorities: constraints in phonological acquisition, pages 158–203. Cambridge University Press, Cambridge. → pages 44

Hayes, B. and Wilson, C. (2008). A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry, 39:379–440. → pages 2, 14

Hayward, R. J. (1982). Notes on the Koyra language. Afrika und Übersee, 65:211–268. → pages 10, 21, 22, 53, 82

Hayward, R. J. (1990). Notes on the Aari language. In Hayward, R. J., editor, Omotic language studies, pages 425–493. School of Oriental and African Studies, University of London, London. → pages 10, 21, 83

Heath, J. (2005). A grammar of Tamashek (Tuareg of Mali). Mouton de Gruyter, Berlin. → pages 131, 132, 133, 134, 137

Heinz, J. (2007). Inductive learning of phonotactic patterns. Doctoral dissertation, University of California, Los Angeles. → pages 2, 80, 84

Heinz, J. (2009).
On the role of locality in learning stress patterns. Phonology, 26:303–351. → pages 13

Heinz, J. (2010). Learning long-distance phonotactics. Linguistic Inquiry, 41(4):623–661. → pages 2, 15, 17, 79, 80, 81, 83, 84, 85, 87, 154

Heinz, J. and Idsardi, W. J. (2010). Learning opaque generalizations: the case of Samala. Unpublished ms. University of Delaware, University of Maryland. → pages 140

Heinz, J. and Idsardi, W. J. (2011). Sentence and word complexity. Science, 333:295–297. → pages 84

Heinz, J., Kasprzik, A., and Kötzing, T. (2012). Learning in the limit with lattice-structured hypothesis spaces. Theoretical Computer Science, 457:111–127. → pages 137

Heinz, J., Rawal, C., and Tanner, H. G. (2011). Tier-based strictly local constraints for phonology. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages 58–64, Portland, OR. Association for Computational Linguistics. → pages iii, 2, 15, 17, 18, 79, 80, 85, 86, 87, 90, 111, 124, 131, 154, 155

Hudson Kam, C. L. and Newport, E. L. (2005). Regularizing unpredictable variation: the roles of adult and child learners in language variation and change. Language Learning and Development, 1:151–195. → pages 5

Hudson Kam, C. L. and Newport, E. L. (2009). Getting it right by getting it wrong: when learners change languages. Cognitive Psychology, 59:30–66. → pages 5

Hyman, L. M. (1995). Nasal consonant harmony at a distance: the case of Yaka. Studies in African Linguistics, 24:5–30. → pages 22

Jardine, A. (2015). Learning tiers for long-distance phonotactics. Unpublished ms. University of Delaware. [To appear in Proceedings of the 6th Conference on Generative Approaches to Language Acquisition North America (GALANA 2015)]. → pages 124, 157

Jardine, A., Chandlee, J., Eyraud, R., and Heinz, J. (2014). Very efficient learning of structured classes of subsequential functions from positive data.
In Clark, A., Kanazawa, M., and Yoshinaka, R., editors, Proceedings of the 12th International Conference on Grammatical Inference (ICGI 2014), volume 34 of JMLR Workshop Proceedings, pages 94–108. → pages 158

Jardine, A. and Heinz, J. (2015). Learning Tier-based Strictly 2-Local languages. Unpublished ms. University of Delaware. → pages 2, 17, 87, 122, 123, 124, 125, 126, 130, 131, 145, 146, 156, 157

Jensen, J. (1974). Variables in phonology. Language, 50:675–686. → pages 12, 118, 119

Jesney, K. and Tessier, A.-M. (2011). Biases in Harmonic Grammar: the road to restrictive learning. Natural Language and Linguistic Theory, 29:251–290. → pages 44

Johnson, C. D. (1972). Formal aspects of phonological description. Mouton, The Hague. → pages 14, 79

Jurgec, P. (2011). Feature spreading 2.0: a unified theory of assimilation. PhD thesis, University of Tromsø. → pages 12, 88, 123

Kaplan, R. M. and Kay, M. (1994). Regular models of phonological rule systems. Computational Linguistics, 20(3):331–378. → pages 14, 79

Kimenyi, A. (1979). Studies in Kinyarwanda and Bantu phonology. Linguistic Research, Carbondale, IL. → pages 122

Kiparsky, P. (1968). Linguistic universals and linguistic change. In Bach, E. and Harms, R., editors, Universals in linguistic theory, pages 170–202. Holt, Rinehart and Winston, New York. → pages 4

Kirby, S., Cornish, H., and Smith, K. (2008). Cumulative cultural evolution in the laboratory: an experimental approach to the origins of structure in human language. In Proceedings of the National Academy of Sciences, volume 105, pages 10681–10686. → pages 3

Kobele, G. M. (2006). Generating copies: an investigation into structural identity in language and grammar. PhD thesis, University of California, Los Angeles. → pages 15

Koo, H. and Callahan, L. (2012). Tier-adjacency is not a necessary condition for learning phonotactic dependencies. Language and Cognitive Processes, 27:1425–1432. → pages 120

Koo, H. and Cole, J. (2006).
On learnability and naturalness as constraints on phonological grammar. In Botinis, A., editor, Proceedings of ISCA Tutorial and Research Workshop on Experimental Linguistics, pages 165–168, Athens. University of Athens. → pages 11

Koo, H. and Oh, Y.-I. (2013). Beyond tier-based bigrams: an artificial grammar learning study. Language Sciences, 38:53–58. → pages 9, 114, 120, 121, 123, 146, 157

Lai, R. (2012). Domain specificity in learning phonology. Doctoral dissertation, University of Delaware. → pages 2, 15, 80, 84, 92

Levelt, C. C. (2011). Consonant harmony in child language. In van Oostendorp, M., Ewen, C. J., Hume, E., and Rice, K., editors, Blackwell companion to phonology, volume 3, pages 1691–1716. Blackwell, Oxford. → pages 8

Lombardi, L. (1999). Positional faithfulness and voicing assimilation in optimality theory. Natural Language and Linguistic Theory, 17:267–302. → pages 16, 153

McCarthy, J. J. (2003). OT constraints are categorical. Phonology, 20:75–138. → pages 13

McCarthy, J. J. (2007). Consonant harmony via correspondence: evidence from Chumash. In Bateman, L., O’Keefe, M., Reilly, E., and Werle, A., editors, Papers in Optimality Theory III (University of Massachusetts Occasional Papers in Linguistics 32), pages 223–238. GLSA Publications, University of Massachusetts, Amherst, MA. → pages 140

McMullin, K. (2013). Learning consonant harmony in artificial languages. In Proceedings of the 2013 Meeting of the Canadian Linguistic Association. → pages 73

McMullin, K. and Allen, B. H. (2015). Phonotactic learning and the conjunction of Tier-based Strictly Local languages. Paper presented at the 89th Annual Meeting of the Linguistic Society of America. Portland, OR. → pages 137

McMullin, K. and Hansson, G. Ó. (2014). Locality in long-distance phonotactics: evidence for modular learning. In Iyer, J. and Kusmer, L., editors, Proceedings of the 44th meeting of the North Eastern Linguistic Society, volume 2, pages 1–14, Amherst. GLSA Publications, University of Massachusetts.
→ pages 11, 17, 24, 25, 39, 46, 73, 80, 84, 149, 154

McNaughton, R. and Papert, S. A. (1971). Counter-free automata. MIT Press, Cambridge, MA. → pages 15, 80, 154

Moreton, E. (2008). Analytic bias and phonological typology. Phonology, 25:83–127. → pages 2, 3, 5, 157

Moreton, E. (2012). Inter- and intra-dimensional dependencies in implicit phonotactic learning. Journal of Memory and Language, 67:165–183. → pages 2, 3, 5, 157

Moreton, E. and Pater, J. (2012a). Structure and substance in artificial phonology learning. Part I: Structure. Language and Linguistics Compass, 6(11):686–701. → pages 5, 6

Moreton, E. and Pater, J. (2012b). Structure and substance in artificial phonology learning. Part II: Substance. Language and Linguistics Compass, 6(11):702–718. → pages 5, 6

Morley, R. (2015). Can phonological universals be emergent? Modeling the space of sound change, lexical distribution, and hypothesis selection. Language, 91(2):e40–e70. → pages 2

Nevins, A. (2010). Locality in vowel harmony. MIT Press, Cambridge, MA. → pages 6, 158

Newport, E. L. and Aslin, R. N. (2004). Learning at a distance I: statistical learning of non-adjacent dependencies. Cognitive Psychology, 48:127–162. → pages 1, 9

Ní Chiosáin, M. and Padgett, J. (2001). Markedness, segment realization, and locality in spreading. In Lombardi, L., editor, Segmental phonology in optimality theory, pages 118–156. Cambridge University Press, Cambridge. → pages 6

Odden, D. (1994). Adjacency parameters in phonology. Language, 70:289–330. → pages 6, 7, 9, 10, 12, 17, 21, 22, 23, 25, 59, 85, 113

Ohala, J. J. (1993). The phonetics of sound change. In Jones, C., editor, Historical linguistics: Problems and perspectives, pages 237–278. Longman, London. → pages 5

Pater, J. and Tessier, A.-M. (2003). Phonotactic knowledge and the acquisition of alternations. In Solé, M., Recasens, D., and Romero, J., editors, Proceedings of the 15th International Congress on Phonetic Sciences, pages 1177–1180, Barcelona. → pages 45

Payne, A. (2014).
Dissimilation as a subsequential process. In Iyer, J. and Kusmer, L., editors, Proceedings of the 44th meeting of the North Eastern Linguistic Society, volume 1, pages 79–90, Amherst. GLSA Publications, University of Massachusetts. → pages 15, 79, 158

Poser, W. J. (1982). Phonological representations and action-at-a-distance. In van der Hulst, H. and Smith, N., editors, The structure of phonological representations, volume 2, pages 121–158. Foris, Dordrecht. → pages 6, 140

Poser, W. J. (1993). Are strict cycle effects derivable? In Hargus, S. and Kaisse, E. M., editors, Studies in lexical phonology, pages 315–321. Academic Press, New York. → pages 140

Prince, A. and Smolensky, P. (2004). Optimality Theory: constraint interaction in generative grammar. Blackwell, Malden, MA. [Originally distributed 1993 as Technical Report RuCCS-TR-2/CU-CS-696-93, Rutgers Center for Cognitive Science, Rutgers University, and Department of Cognitive Science, University of Colorado at Boulder; revised 2002 version as ROA-537]. → pages 13, 139, 151

Prince, A. and Tesar, B. (2004). Learning phonotactic distributions. In Kager, R., Pater, J., and Zonneveld, W., editors, Fixing priorities: constraints in phonological acquisition, pages 245–291. Cambridge University Press, Cambridge. → pages 44

Pulleyblank, D. (2002). Harmony drivers: no disagreement allowed. In Larson, J. and Paster, M., editors, Proceedings of the 28th Annual Meeting of the Berkeley Linguistics Society, pages 249–267. Berkeley Linguistics Society, Berkeley, CA. → pages 6, 9, 86, 142

Pycha, A., Nowak, P., Shin, E., and Shosted, R. (2003). Phonological rule-learning and its implications for a theory of vowel harmony. In Tsujimura, M. and Garding, G., editors, Proceedings of the 22nd West Coast Conference on Formal Linguistics, pages 423–435, Somerville, MA. Cascadilla Press. → pages 5

R Core Team (2014). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna. → pages 32

Rabin, M. O. and Scott, D.
(1959). Finite automata and their decision problems. IBM Journal of Research and Development, 3:114–125. → pages 14, 79

Rafferty, A. N., Griffiths, T. L., and Ettlinger, M. (2013). Greater learnability is not sufficient to produce cultural universals. Cognition, 129:70–87. → pages 3

Reber, R. and Perruchet, P. (2003). The use of control groups in artificial grammar learning. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 56A:91–115. → pages 6

Riggle, J. (2004). Generation, recognition and learning in finite-state optimality theory. Doctoral dissertation, University of California, Los Angeles. → pages 13

Rogers, J., Heinz, J., Bailey, G., Edlefsen, M., Visscher, M., Wellcome, D., and Wibel, S. (2010). On languages piecewise testable in the strict sense. In Ebert, C., Jäger, G., and Michaelis, J., editors, The mathematics of language, pages 255–265. Springer, Berlin. → pages 15, 80, 81, 154

Rogers, J. and Pullum, G. K. (2011). Aural pattern recognition experiments and the subregular hierarchy. Journal of Logic, Language and Information, 20(3):329–342. → pages 15, 80, 91, 92, 154

Rose, S. and Walker, R. (2004). A typology of consonant agreement as correspondence. Language, 80:475–531. → pages ii, 6, 8, 10, 11, 14, 18, 20, 21, 22, 39, 41, 48, 51, 52, 53, 54, 78, 79, 84, 151

Scott-Phillips, T. and Kirby, S. (2010). Language evolution in the laboratory. Trends in Cognitive Sciences, 14:411–417. → pages 3

Shaw, P. A. (1991). Consonant harmony systems: the special status of coronal harmony. In Paradis, C. and Prunet, J.-F., editors, The special status of coronals: internal and external evidence, pages 125–158. Academic Press, San Diego, CA. → pages 17, 113

Shieber, S. (1985). Evidence against the context-freeness of natural language. Linguistics and Philosophy, 8:333–343. → pages 15

Smolensky, P. (1996). On the comprehension/production dilemma in child language. Linguistic Inquiry, 27:720–731. → pages 44

Steriade, D. (1987a). Locality conditions and feature geometry.
In McDonough, J.and Plunkett, B., editors, Proceedings of the 17th meeting of the North EasternLinguistic Society, pages 595–617. GLSA Publications, University ofMassachusetts, Amherst, MA. → pages 6, 12, 118, 119Steriade, D. (1987b). Redundant values. In Bosch, A., Need, B., and Schiller, E.,editors, CLS 23: Papers from the parasession on autosegmental and metricalphonology, pages 339–362. Chicago Linguistic Society, Chicago, IL. → pages 6Suzuki, K. (1998). A typological investigation of dissimilation. Doctoraldissertation, University of Arizona. → pages 8, 142Tesar, B. and Smolensky, P. (2000). Learnability in Optimality Theory. MITPress, Cambridge, MA. → pages 13, 44van de Weijer, J. (2014). The origin of OT constraints. Lingua, 142:66–75. →pages 14van der Hulst, H. and van de Weijer, J. (1995). Vowel harmony. In Goldsmith, J.,editor, The handbook of phonological theory, pages 495–531. Blackwell,Oxford. → pages 158Walker, R. (2000a). Long-distance consonantal identity effects. In Billerey, R.and Lillehaugen, B. D., editors, Proceedings of the 19th West Coast Conferenceon Formal Linguistics, pages 532–545. Cascadilla Press, Somerville, MA. →pages 14, 39, 78, 79, 151Walker, R. (2000b). Nasalization, neutral segments, and opacity effects. Garland,New York. → pages 158171Walker, R. (2000c). Yaka nasal harmony: spreading or segmentalcorrespondence? In Conathan, L., Good, J., Kavitskaya, D., Wulf, A., and Yu,A. C. L., editors, Proceedings of the 26th Annual Meeting of the BerkeleyLinguistics Society, pages 321–332. Berkeley Linguistics Society, Berkeley,CA. → pages 14, 39, 78, 79, 151Walker, R. (2001). Consonantal correspondence. In Kirchner, R., Pater, J., andWikeley, W., editors, Proceedings of the Workshop on the Lexicon in Phoneticsand Phonology (Papers in Experimental and Theoretical Linguistics 6), pages73–84, Edmonton. Department of Linguistics, University of Alberta. → pages39Walker, R., Byrd, D., and Mpiranya, F. (2008). 
An articulatory view ofKinyarwanda coronal harmony. Phonology, 25:499–535. → pages 12, 88, 114,116, 117, 122Walker, R. and Mpiranya, F. (2005). On triggers and opacity in coronal harmony.In Proceedings of the Berkeley Linguistic Society, volume 31, pages 383–394.→ pages 12, 88, 114, 117, 122Warker, J. A., Xu, Y., Dell, G. S., and Fisher, C. (2009). Speech errors reflect thephonotactic constraints in recently spoken syllables, but not in recently heardsyllables. Cognition, 112:81–96. → pages 28Watkins, C. (1970). A case of non-chronological rule insertion. LinguisticInquiry, 1:525–527. → pages 12, 118, 119Wilson, C. (2003). Experimental investigation of phonological naturalness. InTsujimura, M. and Garding, G., editors, Proceedings of the 22nd West CoastConference on Formal Linguistics, pages 533–546, Somerville, MA. CascadillaPress. → pages 5, 157Wilson, C. (2006). Learning phonology with substantive bias: an experimentaland computational study of velar palatalization. Cognitive Science,30:945–982. 
Appendix A
Full List of Stimuli Used in Experiments

Table A.1: List of stimuli used in the practice phase.
Stem  Past  Future
bigo  bigoli  bigoru
deto  detoli  detoru
gone  goneli  goneru
kumu  kumuli  kumuru
mebi  mebili  mebiru
nipu  nipuli  nipuru
pudi  pudili  pudiru
toke  tokeli  tokeru

Table A.2: Training stimuli with liquids at “Medium-range”
M-Stem  M-Harm-Past  M-Harm-Fut  M-Diss-Past  M-Diss-Fut
polipu  polipuli  poripuru  poripuli  polipuru
pilede  piledeli  pirederu  piredeli  pilederu
bilono  bilonoli  bironoru  bironoli  bilonoru
neluki  nelukili  nerukiru  nerukili  nelukiru
belibu  belibuli  beriburu  beribuli  beliburu
pelege  pelegeli  peregeru  peregeli  pelegeru
tilipe  tilipeli  tiriperu  tiripeli  tiliperu
noleni  nolenili  noreniru  norenili  noleniru
nuloto  nulotoli  nurotoru  nurotoli  nulotoru
giluko  gilukoli  girukoru  girukoli  gilukoru
melomi  melomili  meromiru  meromili  melomiru
kolugu  koluguli  koruguru  koruguli  koluguru
berigi  beligili  berigiru  berigili  beligiru
mureke  mulekeli  murekeru  murekeli  mulekeru
moropo  molopoli  moroporu  moropoli  moloporu
gurubo  guluboli  guruboru  guruboli  guluboru
nuronu  nulonuli  nuronuru  nuronuli  nulonuru
girudi  giludili  girudiru  girudili  giludiru
guritu  gulituli  gurituru  gurituli  gulituru
piredu  pileduli  pireduru  pireduli  pileduru
birobe  bilobeli  biroberu  birobeli  biloberu
neruti  nelutili  nerutiru  nerutili  nelutiru
tirime  tilimeli  tirimeru  tirimeli  tilimeru
peremo  pelemoli  peremoru  peremoli  pelemoru
gulidu  guliduli  guriduru  guriduli  guliduru
mulegu  muleguli  mureguru  mureguli  muleguru
molobi  molobili  morobiru  morobili  molobiru
gulune  guluneli  guruneru  guruneli  guluneru
bolipi  bolipili  boripiru  boripili  bolipiru
keluko  kelukoli  kerukoru  kerukoli  kelukoru
delito  delitoli  deritoru  deritoli  delitoru
dilemo  dilemoli  diremoru  diremoli  dilemoru
nilobu  nilobuli  niroburu  nirobuli  niloburu
tilute  tiluteli  tiruteru  tiruteli  tiluteru
telede  teledeli  terederu  teredeli  telederu
bulomi  bulomili  buromiru  buromili  bulomiru
poriku  polikuli  porikuru  porikuli  polikuru
norego  nolegoli  noregoru  noregoli  nolegoru
meroni  melonili  meroniru  meronili  meloniru
korupe  kolupeli  koruperu  korupeli  koluperu
turebe  tulebeli  tureberu  turebeli  tuleberu
gorodo  golodoli  gorodoru  gorodoli  golodoru
deriki  delikili  derikiru  derikili  delikiru
direno  dilenoli  direnoru  direnoli  dilenoru
buromu  bulomuli  buromuru  buromuli  bulomuru
muruge  mulugeli  murugeru  murugeli  mulugeru
kuripu  kulipuli  kuripuru  kuripuli  kulipuru
tiruti  tilutili  tirutiru  tirutili  tilutiru
gilipo  gilipoli  giriporu  giripoli  giliporu
koledi  koledili  korediru  koredili  kolediru
nelogi  nelogili  nerogiru  nerogili  nelogiru
molunu  molunuli  morunuru  morunuli  molunuru
golobo  goloboli  goroboru  goroboli  goloboru
muluke  mulukeli  murukeru  murukeli  mulukeru
kulipo  kulipoli  kuriporu  kuripoli  kuliporu
tulegi  tulegili  turegiru  turegili  tulegiru
bilotu  bilotuli  biroturu  birotuli  biloturu
pelume  pelumeli  perumeru  perumeli  pelumeru
tolitu  tolituli  torituru  torituli  tolituru
mileme  milemeli  miremeru  miremeli  milemeru
girige  giligeli  girigeru  girigeli  giligeru
tereno  telenoli  terenoru  terenoli  telenoru
niropu  nilopuli  niropuru  niropuli  nilopuru
kerude  keludeli  keruderu  kerudeli  keluderu
boriki  bolikili  borikiru  borikili  bolikiru
korebu  kolebuli  koreburu  korebuli  koleburu
geridi  gelidili  geridiru  geridili  gelidiru
pureto  puletoli  puretoru  puretoli  puletoru
nerobo  neloboli  neroboru  neroboli  neloboru
morumi  molumili  morumiru  morumili  molumiru
toronu  tolonuli  toronuru  toronuli  tolonuru
duruke  dulukeli  durukeru  durukeli  dulukeru
gelibe  gelibeli  geriberu  geribeli  geliberu
kelego  kelegoli  keregoru  keregoli  kelegoru
buloku  bulokuli  burokuru  burokuli  bulokuru
diluni  dilunili  diruniru  dirunili  diluniru
dolepe  dolepeli  doreperu  dorepeli  doleperu
melodo  melodoli  merodoru  merodoli  melodoru
kilibi  kilibili  kiribiru  kiribili  kilibiru
puleti  puletili  puretiru  puretili  puletiru
tolodo  tolodoli  torodoru  torodoli  tolodoru
poluku  polukuli  porukuru  porukuli  polukuru
dulimu  dulimuli  durimuru  durimuli  dulimuru
dulune  duluneli  duruneru  duruneli  duluneru
duribi  dulibili  duribiru  duribili  dulibiru
mireko  milekoli  mirekoru  mirekoli  milekoru
burogu  buloguli  buroguru  buroguli  buloguru
dirudu  diluduli  diruduru  diruduli  diluduru
kirine  kilineli  kirineru  kirineli  kilineru
perupi  pelupili  perupiru  perupili  pelupiru
torite  toliteli  toriteru  toriteli  toliteru
kerepi  kelepili  kerepiru  kerepili  kelepiru
birogo  bilogoli  birogoru  birogoli  bilogoru
porumo  polumoli  porumoru  porumoli  polumoru
dorete  doleteli  doreteru  doreteli  doleteru
meromu  melomuli  meromuru  meromuli  melomuru

Table A.3: Training stimuli with liquids at “Short-range”
S-Stem  S-Harm-Past  S-Harm-Fut  S-Diss-Past  S-Diss-Fut
pupoli  pupolili  puporiru  puporili  pupoliru
depile  depileli  depireru  depireli  depileru
nobilo  nobiloli  nobiroru  nobiroli  nobiloru
kinelu  kineluli  kineruru  kineruli  kineluru
bubeli  bubelili  buberiru  buberili  bubeliru
gepele  gepeleli  gepereru  gepereli  gepeleru
petili  petilili  petiriru  petirili  petiliru
ninole  ninoleli  ninoreru  ninoreli  ninoleru
tonulo  tonuloli  tonuroru  tonuroli  tonuloru
kogilu  kogiluli  kogiruru  kogiruli  kogiluru
mimelo  mimeloli  mimeroru  mimeroli  mimeloru
gukolu  gukoluli  gukoruru  gukoruli  gukoluru
giberi  gibelili  giberiru  giberili  gibeliru
kemure  kemuleli  kemureru  kemureli  kemuleru
pomoro  pomololi  pomororu  pomoroli  pomoloru
boguru  bogululi  bogururu  boguruli  boguluru
nunuro  nunuloli  nunuroru  nunuroli  nunuloru
digiru  digiluli  digiruru  digiruli  digiluru
tuguri  tugulili  tuguriru  tugurili  tuguliru
dupire  dupileli  dupireru  dupireli  dupileru
bebiro  bebiloli  bebiroru  bebiroli  bebiloru
tineru  tineluli  tineruru  tineruli  tineluru
metiri  metilili  metiriru  metirili  metiliru
mopere  mopeleli  mopereru  mopereli  mopeleru
duguli  dugulili  duguriru  dugurili  duguliru
gumule  gumuleli  gumureru  gumureli  gumuleru
bimolo  bimololi  bimororu  bimoroli  bimoloru
negulu  negululi  negururu  neguruli  neguluru
piboli  pibolili  piboriru  piborili  piboliru
kokelu  kokeluli  kokeruru  kokeruli  kokeluru
todeli  todelili  toderiru  toderili  todeliru
modile  modileli  modireru  modireli  modileru
bunilo  buniloli  buniroru  buniroli  buniloru
tetilu  tetiluli  tetiruru  tetiruli  tetiluru
detele  deteleli  detereru  detereli  deteleru
mibulo  mibuloli  miburoru  miburoli  mibuloru
kupori  kupolili  kuporiru  kuporili  kupoliru
gonore  gonoleli  gonoreru  gonoreli  gonoleru
nimero  nimeloli  nimeroru  nimeroli  nimeloru
pekoru  pekoluli  pekoruru  pekoruli  pekoluru
beture  betuleli  betureru  betureli  betuleru
dogoro  dogololi  dogororu  dogoroli  dogoloru
kideri  kidelili  kideriru  kiderili  kideliru
nodire  nodileli  nodireru  nodireli  nodileru
muburo  mubuloli  muburoru  muburoli  mubuloru
gemuru  gemululi  gemururu  gemuruli  gemuluru
pukuri  pukulili  pukuriru  pukurili  pukuliru
titiru  titiluli  titiruru  titiruli  titiluru
pogili  pogilili  pogiriru  pogirili  pogiliru
dikole  dikoleli  dikoreru  dikoreli  dikoleru
ginelo  gineloli  gineroru  gineroli  gineloru
numolu  numoluli  numoruru  numoruli  numoluru
bogolo  bogololi  bogororu  bogoroli  bogoloru
kemulu  kemululi  kemururu  kemuruli  kemuluru
pokuli  pokulili  pokuriru  pokurili  pokuliru
gitule  gituleli  gitureru  gitureli  gituleru
tubilo  tubiloli  tubiroru  tubiroli  tubiloru
mepelu  mepeluli  meperuru  meperuli  mepeluru
tutoli  tutolili  tutoriru  tutorili  tutoliru
memile  memileli  memireru  memireli  memileru
gegiri  gegilili  gegiriru  gegirili  gegiliru
notere  noteleli  notereru  notereli  noteleru
puniro  puniloli  puniroru  puniroli  puniloru
dekeru  dekeluli  dekeruru  dekeruli  dekeluru
kibori  kibolili  kiboriru  kiborili  kiboliru
bukore  bukoleli  bukoreru  bukoreli  bukoleru
digeri  digelili  digeriru  digerili  digeliru
topure  topuleli  topureru  topureli  topuleru
bonero  boneloli  boneroru  boneroli  boneloru
mimoru  mimoluli  mimoruru  mimoruli  mimoluru
nutoro  nutololi  nutororu  nutoroli  nutoloru
keduru  kedululi  kedururu  keduruli  keduluru
begeli  begelili  begeriru  begerili  begeliru
gokele  gokeleli  gokereru  gokereli  gokeleru
kubulo  kubuloli  kuburoru  kuburoli  kubuloru
nidilu  nidiluli  nidiruru  nidiruli  nidiluru
pedole  pedoleli  pedoreru  pedoreli  pedoleru
domelo  domeloli  domeroru  domeroli  domeloru
bikili  bikilili  bikiriru  bikirili  bikiliru
tipule  tipuleli  tipureru  tipureli  tipuleru
dotolo  dotololi  dotororu  dotoroli  dotoloru
kupolu  kupoluli  kuporuru  kuporuli  kupoluru
muduli  mudulili  muduriru  mudurili  muduliru
nedulu  nedululi  nedururu  neduruli  neduluru
biduri  bidulili  biduriru  bidurili  biduliru
komire  komileli  komireru  komireli  komileru
guburo  gubuloli  guburoru  guburoli  gubuloru
dudiru  dudiluli  dudiruru  dudiruli  dudiluru
nekiri  nekilili  nekiriru  nekirili  nekiliru
piperu  pipeluli  piperuru  piperuli  pipeluru
tetori  tetolili  tetoriru  tetorili  tetoliru
pikere  pikeleli  pikereru  pikereli  pikeleru
gobiro  gobiloli  gobiroru  gobiroli  gobiloru
moporu  mopoluli  moporuru  moporuli  mopoluru
tedore  tedoleli  tedoreru  tedoreli  tedoleru
mumero  mumeloli  mumeroru  mumeroli  mumeloru

Table A.4: Training stimuli with no liquids.
Stem  Past  Future
tikemu  tikemuli  tikemuru
kibupi  kibupili  kibupiru
pupugu  pupuguli  pupuguru
gonuni  gonunili  gonuniru
bipobe  bipobeli  bipoberu
tepobi  tepobili  tepobiru
tomeku  tomekuli  tomekuru
pibogo  pibogoli  pibogoru
nekine  nekineli  nekineru
mutumu  mutumuli  mutumuru
dubope  dubopeli  duboperu
degiti  degitili  degitiru
kukedo  kukedoli  kukedoru
nomene  nomeneli  nomeneru
gegebi  gegebili  gegebiru
butopi  butopili  butopiru
dodigo  dodigoli  dodigoru
nimimo  nimimoli  nimimoru
pededu  pededuli  pededuru
minoko  minokoli  minokoru
gutudo  gutudoli  gutudoru
mogiku  mogikuli  mogikuru
konute  konuteli  konuteru
bedite  bediteli  bediteru
podoge  podogeli  podogeru
gibipe  gibipeli  gibiperu
topidu  topiduli  topiduru
potetu  potetuli  poteturu
bumumo  bumumoli  bumumoru
dimumi  dimumili  dimumiru
botini  botinili  botiniru
gipebu  gipebuli  gipeburu
denenu  denenuli  denenuru
nupidi  nupidili  nupidiru
bokuno  bokunoli  bokunoru
nonegu  noneguli  noneguru
kemoti  kemotili  kemotiru
digupo  digupoli  diguporu
pubigi  pubigili  pubigiru
medoto  medotoli  medotoru
tuniki  tunikili  tunikiru
kekoke  kekokeli  kekokeru
megobe  megobeli  megoberu
gebepu  gebepuli  gebepuru
migode  migodeli  migoderu
nukuko  nukukoli  nukukoru
kidubo  kiduboli  kiduboru
tuteme  tutemeli  tutemeru
pemoti  pemotili  pemotiru
tetobu  tetobuli  tetoburu
tetige  tetigeli  tetigeru
begiku  begikuli  begikuru
pipeto  pipetoli  pipetoru
gokine  gokineli  gokineru
monube  monubeli  monuberu
dobemu  dobemuli  dobemuru
nituki  nitukili  nitukiru
mupudu  mupuduli  mupuduru
pibupi  pibupili  pibupiru
doduno  dodunoli  dodunoru
kugumi  kugumili  kugumiru
kugeko  kugekoli  kugekoru
nemide  nemideli  nemideru
mubope  mubopeli  muboperu
bikote  bikoteli  bikoteru
tidido  tididoli  tididoru
gomepu  gomepuli  gomepuru
duponi  duponili  duponiru
kodegu  kodeguli  kodeguru
nenobi  nenobili  nenobiru
gukego  gukegoli  gukegoru
binimo  binimoli  binimoru
mipede  mipedeli  mipederu
pegono  pegonoli  pegonoru
kikuge  kikugeli  kikugeru
dibigi  dibigili  dibigiru
bukoke  bukokeli  bukokeru
ninipu  ninipuli  ninipuru
bemubo  bemuboli  bemuboru
kedetu  kedetuli  kedeturu
bobeki  bobekili  bobekiru
gepunu  gepunuli  gepunuru
podupo  podupoli  poduporu
metopo  metopoli  metoporu
nodobu  nodobuli  nodoburu
titeme  titemeli  titemeru
kotike  kotikeli  kotikeru
nubutu  nubutuli  nubuturu
momime  momimeli  momimeru
ginoto  ginotoli  ginotoru
tunenu  tunenuli  tunenuru
gukidi  gukidili  gukidiru
pumogi  pumogili  pumogiru
togudi  togudili  togudiru
degemi  degemili  degemiru
dupibo  dupiboli  dupiboru

Table A.5: List of stimuli used in testing phase.
Distance  Stem  “l” option  “r” option
Short  dotile  dotileli  dotireli
Short  tipoli  tipolili  tiporili
Short  bibolo  bibololi  biboroli
Short  pudele  pudeleli  pudereli
Short  guneli  gunelili  gunerili
Short  momilu  momiluli  momiruli
Short  negulu  negululi  neguruli
Short  kekulo  kekuloli  kekuroli
Short  pidole  pidoleru  pidoreru
Short  nonolu  nonoluru  nonoruru
Short  tepilo  tepiloru  tepiroru
Short  gigili  gigiliru  gigiriru
Short  mukelu  mukeluru  mukeruru
Short  detule  detuleru  detureru
Short  komuli  komuliru  komuriru
Short  bubelo  bubeloru  buberoru
Short  gegori  gegolili  gegorili
Short  kidure  kiduleli  kidureli
Short  dutere  duteleli  dutereli
Short  popero  popeloli  poperoli
Short  nibiru  nibiluli  nibiruli
Short  memoru  memoluli  memoruli
Short  bonuro  bonuloli  bonuroli
Short  tukiri  tukilili  tukirili
Short  mipuru  mipuluru  mipururu
Short  ditore  ditoleru  ditoreru
Short  pemeri  pemeliru  pemeriru
Short  goniro  goniloru  goniroru
Short  kudire  kudileru  kudireru
Short  nuburi  nubuliru  nuburiru
Short  tokoro  tokoloru  tokororu
Short  begeru  begeluru  begeruru
Medium  beliki  belikili  berikili
Medium  dilopo  dilopoli  diropoli
Medium  molutu  molutuli  morutuli
Medium  pelemi  pelemili  peremili
Medium  kilono  kilonoli  kironoli
Medium  gulibe  gulibeli  guribeli
Medium  tuluge  tulugeli  turugeli
Medium  noledu  noleduli  noreduli
Medium  pilepe  pileperu  pireperu
Medium  muluto  mulutoru  murutoru
Medium  nulimu  nulimuru  nurimuru
Medium  tolone  toloneru  toroneru
Medium  golobi  golobiru  gorobiru
Medium  keleku  kelekuru  kerekuru
Medium  deludo  deludoru  derudoru
Medium  bilegi  bilegiru  biregiru
Medium  burike  bulikeli  burikeli
Medium  dorupi  dolupili  dorupili
Medium  merenu  melenuli  merenuli
Medium  puremo  pulemoli  puremoli
Medium  korogo  kologoli  korogoli
Medium  nirobu  nilobuli  nirobuli
Medium  tiriti  tilitili  tiritili
Medium  gerude  geludeli  gerudeli
Medium  porodi  polodiru  porodiru
Medium  mirete  mileteru  mireteru
Medium  girupu  gilupuru  girupuru
Medium  teriko  telikoru  terikoru
Medium  nereme  nelemeru  neremeru
Medium  kuroni  kuloniru  kuroniru
Medium  duribo  duliboru  duriboru
Medium  borugu  boluguru  boruguru
Long  letubi  letubili  retubili
Long  linode  linodeli  rinodeli
Long  limegu  limeguli  rimeguli
Long  lugupi  lugupili  rugupili
Long  ledimo  ledimoli  redimoli
Long  lokenu  lokenuli  rokenuli
Long  lipoke  lipokeli  ripokeli
Long  lebito  lebitoli  rebitoli
Long  lunedo  lunedoru  runedoru
Long  lotiku  lotikuru  rotikuru
Long  lokite  lokiteru  rokiteru
Long  lemogo  lemogoru  remogoru
Long  lugoni  lugoniru  rugoniru
Long  lipube  lipuberu  ripuberu
Long  lobupu  lobupuru  robupuru
Long  ludemi  ludemiru  rudemiru
Long  rupimu  lupimuli  rupimuli
Long  ronupe  lonupeli  ronupeli
Long  romuge  lomugeli  romugeli
Long  rebeti  lebetili  rebetili
Long  ruteki  lutekili  rutekili
Long  rikono  likonoli  rikonoli
Long  rodobo  lodoboli  rodoboli
Long  rugidu  lugiduli  rugiduli
Long  regedi  legediru  regediru
Long  ritoko  litokoru  ritokoru
Long  ribopo  liboporu  riboporu
Long  rumibu  lumiburu  rumiburu
Long  renitu  lenituru  renituru
Long  rodume  lodumeru  rodumeru
Long  rekune  lekuneru  rekuneru
Long  ripegi  lipegiru  ripegiru

Appendix B
Statistical Analyses

B.1 Experiment 1 (16 Subjects)

Table B.1: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 1 with Short-range baseline (N = 4518; log-likelihood = −2235.4)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −1.13651  0.26127  < 0.0001
Harmony Faithful  2.47228  0.28800  < 0.0001
Harmony Second  −0.50501  0.12784  < 0.0001
Medium-range  0.09456  0.16467  0.5658
Long-range  −0.18878  0.16318  0.2473
S-Harm  1.28500  0.29777  < 0.0001
M-Harm  1.10671  0.30310  0.0003
Medium-range × S-Harm  −1.02310  0.22904  < 0.0001
Long-range × S-Harm  −0.89644  0.22821  < 0.0001
Medium-range × M-Harm  0.22645  0.23169  0.3284
Long-range × M-Harm  −0.29497  0.22715  0.1941

Table B.2: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 1 with Medium-range baseline (N = 4518; log-likelihood = −2235.4)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −1.04300  0.26124  < 0.0001
Harmony Faithful  2.47209  0.28791  < 0.0001
Harmony Second  −0.50513  0.12783  < 0.0001
Short-range  −0.09291  0.16463  0.5725
Long-range  −0.28145  0.16398  0.0861
S-Harm  0.26217  0.29489  0.3740
M-Harm  1.33476  0.30544  < 0.0001
Short-range × S-Harm  1.02171  0.22899  < 0.0001
Long-range × S-Harm  0.12460  0.22549  0.5805
Short-range × M-Harm  −0.22857  0.23162  0.3237
Long-range × M-Harm  −0.52396  0.23064  0.0231

Table B.3: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 1 with Long-range baseline (N = 4518; log-likelihood = −2235.4)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −1.3242  0.2614  < 0.0001
Harmony Faithful  2.4724  0.2879  < 0.0001
Harmony Second  −0.5050  0.1278  < 0.0001
Short-range  0.1877  0.1632  0.2500
Medium-range  0.2826  0.1640  0.0848
S-Harm  0.3877  0.2940  0.1872
M-Harm  0.8975  0.3005  0.0071
Short-range × S-Harm  0.8975  0.2282  < 0.0001
Medium-range × S-Harm  −0.1260  0.2255  0.5764
Short-range × M-Harm  0.2964  0.2271  0.1919
Medium-range × M-Harm  0.5225  0.2307  0.0235

B.2 Experiment 1 (12 “Successful Learners”)

Table B.4: Summary of the fixed effects portion of the mixed-effects logistic regression for learners in Experiment 1 with Short-range baseline (N = 3400; log-likelihood = −1491.9)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −1.14488  0.23378  < 0.0001
Harmony Faithful  2.43613  0.32511  < 0.0001
Harmony Second  −0.37411  0.12162  0.0021
Medium-range  0.04516  0.18374  0.8058
Long-range  −0.18983  0.18263  0.2986
S-Harm  2.67994  0.26191  < 0.0001
M-Harm  2.48025  0.26095  < 0.0001
Medium-range × S-Harm  −2.10084  0.28427  < 0.0001
Long-range × S-Harm  −2.18634  0.28641  < 0.0001
Medium-range × M-Harm  0.07936  0.30243  0.7930
Long-range × M-Harm  −1.18575  0.28199  < 0.0001

Table B.5: Summary of the fixed effects portion of the mixed-effects logistic regression for learners in Experiment 1 with Medium-range baseline (N = 3400; log-likelihood = −1491.9)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −1.10047  0.23475  < 0.0001
Harmony Faithful  2.43685  0.32674  < 0.0001
Harmony Second  −0.37468  0.12170  0.0021
Short-range  −0.04425  0.18376  0.8097
Long-range  −0.23266  0.18318  0.2041
S-Harm  0.57830  0.22909  0.0116
M-Harm  2.56041  0.26558  < 0.0001
Short-range × S-Harm  2.10158  0.28435  < 0.0001
Long-range × S-Harm  −0.08781  0.26121  0.7368
Short-range × M-Harm  −0.08072  0.30242  0.7895
Long-range × M-Harm  −1.26740  0.28582  < 0.0001

Table B.6: Summary of the fixed effects portion of the mixed-effects logistic regression for learners in Experiment 1 with Long-range baseline (N = 3400; log-likelihood = −1491.9)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −1.33227  0.23452  < 0.0001
Harmony Faithful  2.43742  0.32626  < 0.0001
Harmony Second  −0.37446  0.121704  0.0021
Short-range  0.18682  0.18263  0.3063
Medium-range  0.23086  0.18317  0.2075
S-Harm  0.48959  0.22688  0.0309
M-Harm  1.29107  0.26724  < 0.0001
Short-range × S-Harm  2.19090  0.28648  < 0.0001
Medium-range × S-Harm  0.08921  0.26120  0.7327
Short-range × M-Harm  1.18938  0.28196  < 0.0001
Medium-range × M-Harm  1.26937  0.28581  < 0.0001

B.3 Experiment 2 (16 Subjects)

Table B.7: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 2 with Short-range baseline (N = 4534; log-likelihood = −2128.3)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −0.72714  0.24416  0.0029
Dissimilation Faithful  2.38800  0.29266  < 0.0001
Dissimilation Second  −0.71309  0.13950  < 0.0001
Medium-range  −0.09619  0.16489  0.5596
Long-range  0.18624  0.16338  0.2543
S-Diss  2.51590  0.30450  < 0.0001
M-Diss  1.47592  0.30610  < 0.0001
Medium-range × S-Diss  −2.03036  0.24977  < 0.0001
Long-range × S-Diss  −2.70521  0.25118  < 0.0001
Medium-range × M-Diss  −0.00647  0.23652  0.9782
Long-range × M-Diss  −0.98679  0.23280  < 0.0001

Table B.8: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 2 with Medium-range baseline (N = 4534; log-likelihood = −2128.3)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −0.82611  0.24453  0.0007
Dissimilation Faithful  2.38980  0.29295  < 0.0001
Dissimilation Second  −0.71033  0.13961  < 0.0001
Short-range  0.09523  0.16490  0.5636
Long-range  0.28440  0.16423  0.0833
S-Diss  0.48717  0.28413  0.0864
M-Diss  1.47160  0.30499  < 0.0001
Short-range × S-Diss  2.03270  0.24984  < 0.0001
Long-range × S-Diss  −0.67777  0.22487  < 0.0001
Short-range × M-Diss  0.00435  0.23653  0.9853
Long-range × M-Diss  −0.98355  0.23217  < 0.0001

Table B.9: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 2 with Long-range baseline (N = 4534; log-likelihood = −2128.3)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −0.5424  0.2431  0.0257
Dissimilation Faithful  2.3903  0.2933  < 0.0001
Dissimilation Second  −0.7096  0.1397  < 0.0001
Short-range  −0.1892  0.1634  0.2469
Medium-range  −0.2843  0.1642  0.0835
S-Diss  −0.1908  0.2829  0.5000
M-Diss  0.4884  0.2997  0.1032
Short-range × S-Diss  2.7103  0.2512  < 0.0001
Medium-range × S-Diss  0.6781  0.2249  0.0026
Short-range × M-Diss  0.9875  0.2328  < 0.0001
Medium-range × M-Diss  0.9830  0.2322  < 0.0001

B.4 Experiment 2 (12 “Successful Learners”)

Table B.10: Summary of the fixed effects portion of the mixed-effects logistic regression for first 12 learners in Experiment 2 with Short-range baseline (N = 3403; log-likelihood = −1564.2)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −0.69269  0.24504  0.0047
Dissimilation Faithful  2.16993  0.33560  < 0.0001
Dissimilation Second  −0.57469  0.12261  < 0.0001
Medium-range  −0.04593  0.18393  0.8028
Long-range  0.18743  0.18290  0.3055
S-Diss  2.14799  0.28624  < 0.0001
M-Diss  2.14863  0.30144  < 0.0001
Medium-range × S-Diss  −1.83421  0.27418  < 0.0001
Long-range × S-Diss  −2.35790  0.27517  < 0.0001
Medium-range × M-Diss  0.49996  0.30027  0.0959
Long-range × M-Diss  −1.46357  0.27468  < 0.0001

Table B.11: Summary of the fixed effects portion of the mixed-effects logistic regression for first 12 learners in Experiment 2 with Medium-range baseline (N = 3403; log-likelihood = −1564.2)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −0.73727  0.24535  0.0027
Dissimilation Faithful  2.16952  0.33587  < 0.0001
Dissimilation Second  −0.57574  0.12263  < 0.0001
Short-range  0.04554  0.18395  0.8045
Long-range  0.23465  0.18343  0.2008
S-Diss  0.31389  0.26695  0.2397
M-Diss  2.64772  0.31482  < 0.0001
Short-range × S-Diss  1.83424  0.27421  < 0.0001
Long-range × S-Diss  −0.52551  0.25236  0.0373
Short-range × M-Diss  −0.50210  0.30029  0.0945
Long-range × M-Diss  −1.96560  0.28927  < 0.0001

Table B.12: Summary of the fixed effects portion of the mixed-effects logistic regression for first 12 learners in Experiment 2 with Long-range baseline (N = 3403; log-likelihood = −1564.2)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −0.5036  0.2444  0.0393
Dissimilation Faithful  2.1752  0.3369  < 0.0001
Dissimilation Second  −0.5773  0.1225  < 0.0001
Short-range  −0.1887  0.1829  0.3024
Medium-range  −0.2337  0.1834  0.2027
S-Diss  −0.2128  0.2657  0.4233
M-Diss  0.6777  0.2808  0.0158
Short-range × S-Diss  2.3585  0.2752  < 0.0001
Medium-range × S-Diss  0.5254  0.2524  0.0374
Short-range × M-Diss  1.4647  0.2747  < 0.0001
Medium-range × M-Diss  1.9622  0.2892  < 0.0001

B.5 Experiment 3 (16 Subjects)

Table B.13: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 3 with Short-range baseline (N = 4490; log-likelihood = −2124.8)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −1.28430  0.24824  < 0.0001
Harmony Faithful  2.68550  0.31605  < 0.0001
Harmony Second  −0.43627  0.12893  0.0007
Medium-range  0.09698  0.16511  0.5570
Long-range  −0.18805  0.16336  0.2497
S-Harm-M-Faith  2.00380  0.23778  < 0.0001
M-Harm-S-Faith  0.79664  0.22255  0.0003
Medium-range × S-Harm-M-Faith  −1.55412  0.24271  < 0.0001
Long-range × S-Harm-M-Faith  −1.59964  0.24277  < 0.0001
Medium-range × M-Harm-S-Faith  0.12194  0.23500  0.6038
Long-range × M-Harm-S-Faith  −0.24235  0.23050  0.2931

Table B.14: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 3 with Medium-range baseline (N = 4490; log-likelihood = −2124.8)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −1.18733  0.24887  < 0.0001
Harmony Faithful  2.68610  0.31641  < 0.0001
Harmony Second  −0.43653  0.12893  0.0007
Short-range  −0.09711  0.16512  0.5565
Long-range  −0.28618  0.16432  0.0816
S-Harm-M-Faith  0.44918  0.22104  0.0421
M-Harm-S-Faith  0.91821  0.22555  < 0.0001
Short-range × S-Harm-M-Faith  1.55429  0.24273  < 0.0001
Long-range × S-Harm-M-Faith  −0.04407  0.22868  0.8472
Short-range × M-Harm-S-Faith  −0.12202  0.23500  0.6036
Long-range × M-Harm-S-Faith  −0.36298  0.23268  0.1188

Table B.15: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 3 with Long-range baseline (N = 4490; log-likelihood = −2124.8)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −1.4733  0.2481  < 0.0001
Harmony Faithful  2.6844  0.3160  < 0.0001
Harmony Second  −0.4364  0.1290  0.0007
Short-range  0.1896  0.1633  0.2458
Medium-range  0.2874  0.1643  0.0802
S-Harm-M-Faith  0.4062  0.2173  0.0616
M-Harm-S-Faith  0.5556  0.2179  0.0108
Short-range × S-Harm-M-Faith  1.5977  0.2428  < 0.0001
Medium-range × S-Harm-M-Faith  0.0435  0.2287  0.8491
Short-range × M-Harm-S-Faith  0.2411  0.2305  0.2955
Medium-range × M-Harm-S-Faith  0.3625  0.2327  0.1193

B.6 Experiment 4 (16 Subjects)

Table B.16: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 4 with Short-range baseline (N = 4518; log-likelihood = −2253.0)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −0.83807  0.18363  < 0.0001
Dissimilation Faithful  2.56905  0.28066  < 0.0001
Dissimilation Second  −0.57949  0.11888  < 0.0001
Medium-range  −0.09637  0.16470  0.5585
Long-range  0.18660  0.16301  0.2523
S-Diss-M-Faith  1.93214  0.20439  < 0.0001
M-Diss-S-Faith  0.06160  0.18875  0.7442
Medium-range × S-Diss-M-Faith  −1.63444  0.23878  < 0.0001
Long-range × S-Diss-M-Faith  −2.04029  0.23802  < 0.0001
Medium-range × M-Diss-S-Faith  0.51938  0.22372  0.0203
Long-range × M-Diss-S-Faith  −0.40628  0.22213  0.0674

Table B.17: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 4 with Medium-range baseline (N = 4518; log-likelihood = −2253.0)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −0.93381  0.18449  < 0.0001
Dissimilation Faithful  2.56793  0.28062  < 0.0001
Dissimilation Second  −0.57989  0.11889  < 0.0001
Short-range  0.09676  0.16469  0.5568
Long-range  0.28368  0.16391  0.0835
S-Diss-M-Faith  0.29786  0.19036  0.1177
M-Diss-S-Faith  0.58095  0.18966  0.0022
Short-range × S-Diss-M-Faith  1.63310  0.23875  < 0.0001
Long-range × S-Diss-M-Faith  −0.40721  0.22436  0.0695
Short-range × M-Diss-S-Faith  −0.52090  0.22371  0.0199
Long-range × M-Diss-S-Faith  −0.92738  0.22379  < 0.0001

Table B.18: Summary of the fixed effects portion of the mixed-effects logistic regression for Experiment 4 with Long-range baseline (N = 4518; log-likelihood = −2253.0)
Coefficient  Estimate  SE  Pr(>|z|)
Intercept  −0.6561  0.1821  < 0.0001
Dissimilation Faithful  2.5685  0.2804  < 0.0001
Dissimilation Second  −0.5782  0.1189  < 0.0001
Short-range  −0.1817  0.1630  0.5570
Medium-range  −0.2757  0.1639  0.2648
S-Diss-M-Faith  −0.1034  0.1884  0.5832
M-Diss-S-Faith  −0.3421  0.1884  0.0694
Short-range × S-Diss-M-Faith  2.0337  0.2379  < 0.0001
Medium-range × S-Diss-M-Faith  0.3968  0.2243  0.0769
Short-range × M-Diss-S-Faith  0.4018  0.2221  0.0704
Medium-range × M-Diss-S-Faith  0.9179  0.2238  < 0.0001
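
As a reading aid for the tables in this appendix: estimates from a logistic regression are on the log-odds scale. The Python sketch below is not part of the original analysis; it only illustrates how a single estimate might be converted to a probability or an odds ratio. The helper names inv_logit and odds_ratio are hypothetical, and the two example values are copied from Table B.18. Reading the intercept as a baseline probability assumes treatment (dummy) coding of the predictors and ignores the random effects; the tables themselves do not confirm either assumption.

```python
import math


def inv_logit(log_odds: float) -> float:
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds_ratio(coefficient: float) -> float:
    """Multiplicative change in the odds associated with a coefficient."""
    return math.exp(coefficient)


# Example values copied from Table B.18 (Experiment 4, Long-range baseline).
intercept = -0.6561        # log-odds at the reference level of every predictor
short_by_s_diss = 2.0337   # Short-range x S-Diss-M-Faith interaction term

# Assumption: treatment coding, so the intercept alone describes the
# reference cell of the design; random effects are ignored here.
baseline_probability = inv_logit(intercept)
interaction_odds_ratio = odds_ratio(short_by_s_diss)

print(f"baseline probability: {baseline_probability:.3f}")
print(f"odds ratio for the interaction term: {interaction_odds_ratio:.2f}")
```

Because the three tables reported for each experiment refit the same model with a different baseline distance, the log-likelihood is identical across them and only the reference cell of the comparisons changes.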

