UBC Theses and Dissertations


Kinematic patterning of flaps, taps and rhotics in English Derrick, Donald 2011

Kinematic patterning of flaps, taps and rhotics in English

by Donald Derrick

B.A. Anthropology, St. Mary’s University, 1995
M.A. Anthropology, Dalhousie University, 1997

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy in THE FACULTY OF GRADUATE STUDIES (Linguistics)

The University Of British Columbia (Vancouver)
August 2011
© Donald Derrick, 2011

Abstract

Psychological researchers have found evidence for speech planning down to the syllable, with some evidence for planning at the level of the phoneme (Levelt, 1989) or feature (Bernhardt and Stemberger, 1998). Speech scientists who examine coarticulation argue for no speech planning (Saltzman and Munhall, 1989), or limited planning (Whalen, 1990). I provide evidence for subphonemic speech planning, using B/M-mode ultrasound to measure tongue shape and motion. I identify four categorical variants of flap/taps (‘T’) in North American English [alveolar taps ([ɾ↕]), down-flaps ([ɾ↓]), up-flaps ([ɾ↑]), and postalveolar taps ([ɾ↔])], and two broad categories of rhotic vowels (‘R’) [tongue tip-up rhotics ([ɻ̩]) and tongue tip-down rhotics ([ɹ̩])], even across repetitions of the same utterance in identical phonetic contexts. I explain the pattern of variation in terms of hypothesized constraints on rapid articulation. These include articulatory conflicts (Gick and Wilson, 2006) between segments that require an articulator to be in two places at once, and the end-state comfort effect (Rosenbaum et al., 1992), where an articulator begins a complex sequence in an awkward position in order to end comfortably. Speakers who can repeat syllables quickly are more likely to avoid articulatory conflicts during normal speech production. Speakers who repeat syllables more slowly produce ‘T’ variants involving fewer changes in motion, sometimes forcing non-rhotic vowels in the middle of ‘T’ sequences to become rhotacized in exchange for canonical vowels at the end.
These results provide evidence for planning across syllable, morpheme and word boundaries. Other hypothesized constraints on speech planning, such as gravity and tissue elasticity, are also examined, and demonstrate a mismatch between the number of distinct articulatory actions and the number of phoneme units in a given speech sequence. The results support a theory of subphonemic speech planning that takes into account potential upcoming articulatory conflicts, a person’s motor skills, and the effects of gravity and elasticity.

Preface

This dissertation includes four chapters containing articles for publication. Chapter 2, “There are no Fixed Motor Programs in Speech: Evidence from Categorical Kinematic Variants of English Flaps and Taps”, was written by Donald Derrick, in collaboration with Bryan Gick, the dissertation supervisor. As with all the chapters in this dissertation, Bryan Gick piloted early research into tongue motion during flap production in English, and helped with experiment design and editing.

Chapter 3, “Three phonological segments, one motor event: Evidence for speech-motor disparity from English flap production”, was written by Donald Derrick, in collaboration with Bryan Gick and Ian Stavness. Ian Stavness and the ArtiSynth team created ArtiSynth, and they were the primary researchers on the tongue-jaw-hyoid model that formed the simulation environment used in Chapter 3’s simulations. Ian Stavness also wrote some of the description of ArtiSynth used in this paper, helped create the simulations, and edited the paper along with Bryan Gick.

Chapters 4, “Subphonemic planning across syllable, morpheme and word boundaries”, and 5, “Maximum syllable repetition rate influences categorical variation of English flaps and taps during normal speech”, were written in collaboration with Bryan Gick, as described above.
As the author of the dissertation, I did most of the work on all these chapters, including designing and conducting the experiments, video analysis, statistical analysis, writeup and argumentation. Publication of the results of this dissertation is in the early stages, consisting of conference presentations and proceedings, listed here:

Derrick, D. and Gick, B. (2008). Quantitative analysis of subphonemic flap/tap variation in NAE. Canadian Acoustics, 36(3):162-163.

Derrick, D. and Gick, B. (2009). End-state comfort governs kinematic variation in English flap/tap sequences. Journal of the Acoustical Society of America, 125(4):2569.

Derrick, D. and Gick, B. (2010). Two phonological segments, one motor event: Evidence for speech-motor disparity from English flap production. Canadian Acoustics, 38(3):128-129.

The research was conducted with the approval of the Behavioural Research Ethics Board, as part of the research project entitled “Processing Complex Speech Motor Tasks”, H04-80337 (current version A010), and B04-0337, principal investigator Bryan Gick, co-investigator Janet Werker. My personal certificate for completing the Interagency Advisory Panel on Research Ethics introductory tutorial for the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (TCPS) was issued on February 23, 2006.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Glossary
Acknowledgements
Dedication
1 Introduction
  1.1 Rhotics, flaps and taps
  1.2 Constraints on ‘R’ and ‘T’ variant production
  1.3 Disparity between motor actions and units of Linguistic representation
  1.4 Collecting data on categorical variation in speech production
  1.5 Theoretical importance of research
2 One phonological segment, multiple motor events: Evidence from English flaps and taps
  2.1 Introduction
    2.1.1 English ‘T’ variants in conflicting phonetic context
    2.1.2 Methodological solutions
    2.1.3 Hypothesis
  2.2 Methods
    2.2.1 Measurement methods
    2.2.2 Exemplars
  2.3 Results
    2.3.1 Flap orientation by phrase
    2.3.2 ‘T’ variant by surrounding ‘R’ variants
    2.3.3 ‘T’ variation across immediate vocalic contexts
  2.4 Discussion
    2.4.1 Main effect of local context
    2.4.2 Flaps preferred over taps, [ɻ̩] preferred over [ɹ̩]
    2.4.3 End-state comfort
    2.4.4 No fixed categories of articulator motion in speech
3 Three phonological segments, one motor event: Evidence for speech-motor disparity from English flap production
  3.1 Introduction
  3.2 Experiment
    3.2.1 Experiment methods
    3.2.2 Experiment results
    3.2.3 Simulation
    3.2.4 Simulation methods
    3.2.5 Simulation results
  3.3 Discussion
    3.3.1 Future work
4 Subphonemic planning across syllable, morpheme and word boundaries
  4.1 Introduction
    4.1.1 Hypotheses
  4.2 Methods
  4.3 Results
    4.3.1 Hypothesis 1: mammifer vs. editor/auditor
    4.3.2 Hypothesis 2: ‘edify/audify’ ([VV] sequences) vs. ‘editor/auditor’ ([VVR] sequences)
    4.3.3 Hypothesis 3: ‘otter’ ([VR]) vs. ‘editor/auditor’ ([VVR]) sequences
    4.3.4 Hypothesis 4: ‘edit/audit a’ ([VVV]) vs. ‘edit/audit the’ ([VV]) sequences
  4.4 Discussion
5 Syllable iterance rate influences categorical variation of English flaps and taps during normal speech
  5.1 Introduction
    5.1.1 Articulatory conflict
    5.1.2 One vs. two directions of motion
    5.1.3 Iterance rate
    5.1.4 Hypothesis
  5.2 Methods
  5.3 Results
    5.3.1 Single ‘T’ phrases
    5.3.2 First ‘T’ in double ‘T’ phrases
    5.3.3 Second ‘T’ in double ‘T’ phrases
  5.4 Discussion
    5.4.1 Future work
6 Conclusion
  6.1 Subphonemic planning and constraints on speech production
  6.2 Coordinative structures
  6.3 Disparity between units of action and units of linguistic representation
  6.4 Defining task space
  6.5 Future work
  6.6 Limitations
  6.7 Conclusion
Bibliography
A Appendices for Chapter 2
  A.1 ‘T’ variants by ‘R’ variants by participant
    A.1.1 Distribution of ‘T’ variants by initial ‘R’ variant in the phrase ‘We have Berta beep’ by participant
    A.1.2 Distribution of ‘T’ variants by final ‘R’ variant in the phrase ‘We have otter books’ by participant
    A.1.3 ‘R’ variant by initial and final ‘R’ variant in the phrase ‘We have him murder a mob’, by participant
  A.2 ‘R’ variants - ‘T’ vs. control phrases
  A.3 Full dataset
B Appendices for chapter 3
  B.1 Variability
    B.1.1 Results
    B.1.2 Discussion

List of Tables

Table 2.1  single ‘T’ list. ‘T’ = flap/tap, c = control phrase
Table 2.2  Wilcoxon signed-rank tests comparing prevalence of ‘T’ variants based on phrase. VR = ‘otter’, RV = ‘Berta’, RR = ‘murder’, VV = ‘autumn’. * = significant (α = 0.05). V is the sum of ranks assigned to the differences with positive sign.
Table 2.3  Wilcoxon signed-rank tests comparing prevalence of ‘T’ variants based on the initial ‘R’ variant in ‘Berta’. * = significant (α = 0.05).
Table 2.4  Wilcoxon signed-rank tests comparing prevalence of ‘T’ variants based on initial ‘R’ variant in ‘otter’. * = significant (α = 0.05).
Table 2.5  Wilcoxon signed-rank tests comparing prevalence of ‘T’ variants based on the final ‘R’ variant in ‘otter’. ɻ̩ɹ̩ = tip-up initial, tip-down final ‘R’, ɹ̩ɹ̩ = tip-down initial and final ‘R’, ɻ̩ɻ̩ = tip-up initial and final ‘R’, ɹ̩ɻ̩ = tip-down initial, tip-up final ‘R’. * = significant (α = 0.05).
Table 2.6  Wilcoxon signed-rank tests comparing prevalence of ‘T’ variants based on vowel context before and after the ‘T’. * = significant (α = 0.05).
Table 3.1  double ‘T’ list
Table 3.2  Wilcoxon signed-rank tests comparing prevalence of ‘R’ variants in ‘Saturday’ vs. ‘Peppermint’. U = [ɻ̩], D = [ɹ̩]. * = significant (α = 0.05).
Table 3.3  Wilcoxon signed-rank tests comparing prevalence of initial ‘T’ variant in the word ‘Saturday’ based on ‘R’ variant. * = significant (α = 0.05).
Table 3.4  Wilcoxon signed-rank tests comparing prevalence of final ‘T’ variant in the word ‘Saturday’ based on ‘R’ variant. * = significant (α = 0.05).
Table 4.1  single ‘T’ list. ‘T’ = flap/tap, ‘C’ = control phrase.
Table 4.2  Wilcoxon signed-rank tests comparing prevalence of ‘R’ variants in ‘mammifer’ vs. ‘editor/auditor’. * = significant (α = 0.05).
Table 4.3  Wilcoxon signed-rank tests comparing prevalence of initial ‘T’ types in ‘edify/audify’ ([VV] sequences) vs. ‘editor/auditor’ ([VVR] sequences). * = significant (α = 0.05).
Table 4.4  Wilcoxon signed-rank tests comparing prevalence of final ‘T’ variants for ‘editor/auditor’ ([VVR] sequences) vs. ‘otter’ ([VR] phrase). * = significant (α = 0.05).
Table 4.5  Wilcoxon signed-rank tests comparing prevalence of final ‘T’ variants in the words ‘edit/audit a’ ([VVV] sequences) vs. ‘edit/audit the’ ([VV] sequences). * = significant (α = 0.05).
Table 4.6  Wilcoxon Signed rank-sum tests comparing prevalence of final ‘T’ variants in ‘edit/audit a’ ([VVV] sequences) vs. ‘edit/audit the’ ([VV] sequences). * = significant (α = 0.05), + = marginally significant (α = 0.1).
Table 5.1  Potential articulatory conflicts based on phrase.
Table 5.2  single ‘T’ phrase list
Table 5.3  double ‘T’ phrase list
Table 5.4  Iterance duration (ms), and iterance rate (Hz), by participant.
Table 5.5  VGLM comparison of t scores showing the change in likelihood of production of ‘T’ variants in single flap phrases, based on the iterance duration. * t scores > ± 2 are significant.
Table 5.6  VGLM comparison of t scores showing the change in likelihood of ‘T’ variant in the first ‘T’ of double ‘T’ phrases, based on iterance duration. * t scores > ± 2 are significant.
Table 5.7  VGLM comparison of t scores showing the change in likelihood of flap/tap types in the second ‘T’ in double ‘T’ phrases, based on iterance duration. * t scores > ± 2 are significant.
Table A.1  Wilcoxon signed-rank tests comparing prevalence of ‘R’ variants in ‘Berta’ vs. ‘Burma’. * = significant (α = 0.05).
Table A.2  Wilcoxon signed-rank tests comparing prevalence of ‘R’ variants in ‘otter’ vs. ‘offer’. * = significant (α = 0.05).
Table A.3  Wilcoxon signed-rank tests comparing prevalence of initial ‘R’ variants in ‘murder’ vs. ‘murmur’. * = significant (α = 0.05).
Table A.4  Wilcoxon signed-rank tests comparing prevalence of final ‘R’ variants in ‘murder’ vs. ‘murmur’. * = significant (α = 0.05).
Table A.5  Complete table of all stimuli used in this dissertation, part 1. ‘C’ = control phrase, ‘T’ = flap phrase, ‘N’ = nasal phrase.
Table A.6  Complete table of all stimuli used in this dissertation, part 2. ‘C’ = control phrase, ‘T’ = flap phrase, ‘N’ = nasal phrase.

List of Figures

Figure 2.1  Example of ‘R’ variants from an MRI trace of held vowels. Lines point to the tongue tip.
Figure 2.2  Schematics of ‘T’ variant motions
Figure 2.3  Schematic tongue tip trajectory showing matrix of possible tongue-tip trajectories for productions of ‘autumn’
Figure 2.4  Schematic tongue tip trajectory showing matrix of possible tongue-tip trajectories for productions of ‘Berta’
Figure 2.5  Schematic tongue tip trajectory showing matrix of possible tongue-tip trajectories for productions of ‘otter’
Figure 2.6  Schematic tongue tip trajectory showing matrix of possible tongue-tip trajectories for productions of ‘murder’
Figure 2.7  [ɾ ]
Figure 2.8  [ɾ ]
Figure 2.9  Alternate [ɾ ] exemplars. White lines are for visual aid, and are approximations only.
Figure 2.10  [ɾ ] and [ɾ↔]
Figure 2.11  ‘R’ variants. White lines are for visual aid, and are approximations only.
Figure 2.12  Distribution of flap variants by participant.
Figure 2.13  Distribution of flap variants by subjects (2-12) by phrase.
Figure 2.14  Distribution of ‘T’ variants by subjects (13-26) by phrase.
Figure 2.15  Distribution of ‘T’ variants by phrase.
Figure 2.16  Distribution of ‘T’ variants in the phrase ‘We have Berta beep’.
Figure 2.17  Distribution of ‘T’ variants in the phrase ‘We have Otter beep’.
Figure 2.18  Distribution of ‘T’ variants in the phrase ‘we have him murder a mob’ based on initial and final ‘R’ variants.
Figure 2.19  Distribution of ‘T’ variants based on vowel context before and after the ‘T’, with main hypotheses highlighted underneath.
Figure 2.20  Distribution of ‘T’ variants in the phrase ‘We have autumn books’ with relation to end-state-comfort as presented in the matrix of possible tongue-tip trajectories. The small number of [ɾ ] may be the result of the initial vowel having an extremely low tongue tip position.
Figure 2.21  Distribution of ‘T’ variants in the phrase ‘We have Berta beep’ with relation to end-state-comfort as presented in the matrix of possible tongue-tip trajectories.
Figure 2.22  Similar constriction location and degree
Figure 2.23  Schematic tongue tip trajectory showing differences in outcomes, even in the same speaker and context, result from different weighting of constraints. Illustrative example taken from ‘Berta’.
Figure 3.1  Schematic tongue tip trajectory showing ‘Saturday’: [ɾ ɻ̩ɾ ], with one arc of tongue-tip motion, vs. [ɾ ɹ̩ɾ ], with 2 arcs of tongue-tip motion.
Figure 3.2  Schematic of possible underlying patterns of muscle contractions for production of the tongue tip motions in the word ‘Saturday’.
Figure 3.3  Distribution of final ‘R’ variants by phrase (‘We have Saturday off’ vs. the control phrase ‘We have peppermint now’).
Figure 3.4  Distribution of final ‘T’ variants in the word ‘Saturday’ based on ‘R’ variant (in ‘ur’).
Figure 3.5  Distribution of initial ‘T’ type in the word ‘Saturday’ based on ‘R’ variant (in ‘ur’).
Figure 3.6  Flap sequences in ‘Saturday’ based on the ‘R’ variant. X axis lists the initial ‘T’, Y axis lists the final ‘T’.
Figure 3.7  Active model: Tongue tip positions in relation to ArtiSynth muscle activations with active [ɾ ] and [ɾ ] muscle activations.
Figure 3.8  Passive model: Tongue tip positions in relation to ArtiSynth muscle activations with [ɾ ] muscle activations, and no [ɾ ] muscle activations.
Figure 4.1  Schematic of possible tongue-tip motions for ‘editor/auditor’ phrases. The example on the left demonstrates end-state comfort in exchange for middle-state vowel quality.
Figure 4.2  Schematic of possible tongue-tip motions for ‘edit/audit a’ phrases. The example on the left demonstrates end-state comfort in exchange for middle-state vowel quality.
Figure 4.3  Schematic of hypothesis for ‘editor/auditor’ ([VVR] sequences) vs. ‘edify/audify’ ([VV] sequences).
Figure 4.4  Schematic of hypothesis for ‘edit/audit a’ ([VV] sequences) vs. ‘edit/audit the’ ([VVV] sequences).
Figure 4.5  Distribution of ‘R’ variants by participant, top = ‘C’ (‘mammifer’), bottom = ‘T’ (‘editor/auditor’). ‘C’ = control sequences, ‘T’ = flap/tap sequences.
Figure 4.6  Distribution of ‘T’ variants by participants by phrase group: top: ‘edify/audify’ ([VV] sequences), bottom: first ‘T’, ‘editor/auditor’ ([VVR] sequences).
Figure 4.7  Distribution of ‘T’ variants by phrase group: left: ‘edify/audify’ ([VV] sequences), right: first ‘T’, ‘editor/auditor’ ([VVR] sequences).
Figure 4.8  Distribution of ‘T’ variants by participants by phrase group: top: otter, bottom: second ‘T’, editor/auditor.
Figure 4.9  Distribution of ‘T’ variants by phrase group: left: ‘otter’ ([VR] phrase), right: second ‘T’, ‘editor/auditor’ ([VVR] sequences).
Figure 4.10  Distribution of ‘T’ variants by participants by phrase group: top: ‘edit/audit a’ ([VVV] sequences), bottom: ‘edit/audit the’ ([VV] sequences).
Figure 4.11  Distribution of ‘T’ variants by phrase group: left: ‘edit/audit a’ ([V (V)VV] sequences), right: ‘edit/audit the’ ([VV] sequences).
Figure 4.12  Count by ‘T’ sequences in ‘edit/audit a’ ([VVV] sequences).
Figure 5.1  Schematic tongue tip trajectory showing matrix of possible tongue-tip trajectories for ‘T’ productions based on vowel context.
Figure 5.2  Schematic tongue tip trajectory showing flaps (green) have one direction of motion, and taps (red dashed) have two, as highlighted by the black arrows.
Figure 5.3  Schematic tongue tip trajectory showing sequences of flaps (i.e. [ɾ ], [ɾ ]) have one arc of motion, but sequences of taps (i.e. [ɾ ], [ɾ ]) have two.
Figure 5.4  Schematic tongue tip trajectory showing hypothesis that speakers with slower iterance rates are more likely to produce the ‘T’ motions seen on the left, speakers with faster iterance rates are more likely to produce the ‘T’ motions seen on the right.
Figure 5.5  ‘T’ variants in single ‘T’ phrases compared with iterance duration.
Figure 5.6  Details of ‘T’ variants in single ‘T’ phrases compared with iterance duration.
Figure 5.7  ‘T’ variants for the first ‘T’ in double ‘T’ phrases compared with iterance duration.
Figure 5.8  Details of ‘T’ variants for the first ‘T’ in double ‘T’ phrases compared with iterance duration.
Figure 5.9  ‘T’ variants for the second ‘T’ in double ‘T’ phrases compared with iterance duration.
Figure 5.10  ‘T’ variants for the second ‘T’ in double ‘T’ phrases compared with iterance duration.
Figure 6.1  Schematic tongue tip trajectory showing constraint against articulatory conflict. For a given ‘T’, the best outcome avoids rapid transitions into and out of the ‘T’.
Figure 6.2  Schematic tongue tip trajectory showing end-state comfort constraint: For a given ‘T’, the best outcome avoids rapid transitions, the next avoids transitions out of the ‘T’, the third best avoids transitions into the ‘T’, and the worst outcome has rapid transitions into and out of the ‘T’.
Figure 6.3  Schematic tongue tip trajectory showing end-state comfort constraint: Example from the word ‘edit a’.
Figure 6.4  Schematic tongue tip trajectory showing one direction of motion > two directions of motion: Flaps ([ɾ ], [ɾ ]), which have one direction of motion, are preferred over taps ([ɾ ], [ɾ↔]), which have two.
Figure 6.5  Schematic tongue tip trajectory showing one arc of motion > two arcs of motion: Flap sequences ([ɾ ], [ɾ ] or [ɾ ], [ɾ ]), which have one arc of motion, are preferred over tap sequences ([ɾ ], [ɾ ] or [ɾ↔], [ɾ↔]), which have two arcs of motion.
Figure 6.6  Schematic tongue tip trajectory showing gravitational and myoelastic constraint: ‘T’ sequences that use gravity ([ɾ ], [ɾ ]) are preferred over sequences that do not ([ɾ ], [ɾ ]). The worst are those that oppose gravity ([ɾ ], [ɾ ]).
Figure 6.7  Relative importance of one direction > two directions of motion constraint vs. articulatory conflict constraint.
Figure 6.8  Relative importance of one direction > two directions of motion constraint vs. state comfort constraint. Here an initial flap (green) is assumed to be preferred over a final flap (gold). Also note that a preference for a word-final [ɻ̩] predicts different outputs from a preference for word-final [ɹ̩].
Figure 6.9  Both ‘R’ variants ([ɹ̩], [ɻ̩]) can be generated by a coordinative structure over the same constriction location and constriction degree.
Figure 6.10  All the ‘T’ variants ([ɾ ], [ɾ ], [ɾ ], [ɾ↔]) can be generated by a coordinative structure over the same constriction location and constriction degree.
Figure 6.11  Schematic tongue tip trajectory showing some ‘T’ and ‘R’ sequences, such as [ɾ ], [ɻ̩], [ɾ ], can result from one coordinative structure, whereas others, such as [ɾ ], [ɹ̩], [ɾ ], will result from multiple coordinative structures.
Figure A.1  Distribution of ‘T’ variants in the phrase ‘We have Berta beep’ by participants based on initial ‘R’ variant.
Figure A.2  Distribution of ‘T’ variants by initial ‘R’ variant in the phrase ‘We have otter books’ by participant.
Figure A.3  Distribution of ‘T’ variants by initial and final ‘R’ variants in the phrase ‘We have him murder a mob’, by participant (2-12).
Figure A.4  Distribution of ‘T’ variants by initial and final ‘R’ variants in the phrase ‘We have him murder a mob’, by participant (13-26). DD = initial and final [ɹ̩], UD = initial [ɻ̩], final [ɹ̩], DU = initial [ɹ̩], final [ɻ̩], UU = initial and final [ɻ̩].
Figure A.5  Distribution of final ‘R’ variants in the single ‘T’ phrase ‘We have Berta beep’ vs. the control phrase ‘We have Burma books’.
Figure A.6  Distribution of final ‘R’ variants in the single ‘T’ phrase ‘We have otter books’ vs. the control phrase ‘we have him offer books’.
Figure A.7  Distribution of initial ‘R’ variants in the single ‘T’ phrase ‘We have him murder a mob’ vs. the control phrase ‘we have him murmur a vow’. d = ‘murder’, m = ‘murmur’.
Figure A.8  Distribution of final ‘R’ variants in the single ‘T’ phrase ‘We have him murder a mob’ vs. the control phrase ‘we have him murmur a vow’. d = ‘murder’, m = ‘murmur’.
Figure B.1  Variability of ‘T’ sequences in ‘Saturday’ vs. ‘herded her’. X axis lists the initial ‘T’, Y axis lists the final ‘T’.

Glossary

[ɻ̩]     tongue tip-up rhotic
[ɹ̩]     tongue tip-down rhotic
[ɾ↑]    up-flap
[ɾ↓]    down-flap
[ɾ↕]    alveolar tap
[ɾ↔]    postalveolar tap
‘V’     non-rhotic vowel
‘R’     rhotic vowel
‘C’     consonant
‘T’     flap/tap
‘N’     nasals
SL      superior longitudinal
TRANS   transversus
VERT    verticalis
GGP     posterior genioglossus
GGM     medial genioglossus
GGA     anterior genioglossus
IL      inferior longitudinal
STY     styloglossus
FEM     finite element method
EMG     electromyography
PG      palatoglossus
HG      hyoglossus
JTHP    jaw-tongue-hyoid-palate
CT      computed tomography
EMMA    electromagnetic midsagittal articulometer
DICOM   Digital Imaging and Communications in Medicine
VGLM    Vector generalized linear model
fps     frames per second

Acknowledgements

Completing a dissertation is a task of monumental difficulty, requiring the help of many people to accomplish with any degree of skill. I would like to thank my family, including Linda Derrick (my mother), Jim Derrick (my father), and my sisters Diane and Jennifer.
I also thank my girlfriend Lay Tong Khoo for her incredible support, kindness and generosity - I love you so much! To my cohort Shujun Koon, Calisto Mudzingwa, Mario Ch´avez-Pe´on, Amelia Reis-Silva and Ryan Waldie, thank you for struggling with me through the first years of coursework, and for your continued support since. I would like to thank Dan Archambault, Peter Anderson, Sheldon Green and Ian Stavness for their friendship and for the excellent co-authorship on papers. Special thanks to Alex Harmsen, Ian Hansen, Mark Warren and Janet Joy, and most importantly Laura Turner for being the best of people to share a flat with. I also want to thank Laura for her years of friendship, support, and encouragement you are awesome! Thank you to Jessica and Steve Aument, Susan and Tim Carter, Simon Charles, Emily Eidsmoe, Karl and Meg Persson, Jennifer West, and Abigail Scott (may she rest in peace) for being the best of friends in the best Bible study of my life. Thank you to Cory and Shelly Seguin; Cory has been my friend since forever, and has been amazingly kind and supportive throughout my entire PhD. To my whole lab at ISRL, thank you for your questions and answers and years of good research together. There are too many to name, but I thank Chenhao Chiu for his excellent research and for connecting me to the kinesiologists at UBC, to Beth Rogers, Anita Szakay for both their excellent research, introducing me to ELAN and helping me with R, and Mark Scott for all that he has taught me about xxi  psychology and the human mind. To these four and all the rest, you make the lab meetings awesome and fun! Thank you to Ian Franks, John Houde, Martina Wiltschko, Pat Shaw, Lisa Matthewson, Hotze Rullmann, Doug Pullyblank, Henry Davis, and the many other professors who taught me linguistics and helped me with my journal articles. Thank you to Guy Carden for teaching me to be careful and diligent in design and execution. 
Guy Carden spent countless hours training and mentoring me during my early PhD career, and left me with a concern for the details that has helped me in all my endeavours since. Extra thanks to Aislin Stott, who did much of the acoustic segmentation for my research data. I would not have been able to get through this work were it not for her months of dedicated service. This research was funded by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (NSERC) to Bryan Gick, and by National Institutes of Health (NIH) Grant DC-02717 to Haskins Laboratories. Especially in regards to chapter 3, I also thank Alan Hannam for providing the jaw model and his expertise, Gipsa-Lab, Grenoble for providing CT image data (P. Badin) and the tongue model geometry and constitutive law (S. Buchaillard, J.M. Gérard, P. Perrier, Y. Payan), and the ArtiSynth team. I conclude with my deepest thanks to my committee members. Thanks to Eric Vatikiotis-Bateson for his excellent advice and unfathomable depth of knowledge of the literature and history of all the research programs which I have barely touched. I cannot count the times that a ten-minute conversation with Eric has saved weeks of effort on my part. I thank Joe Stemberger, who has provided amazing support, advice, and incredibly useful editorial work. When Joe reads a document, he quickly gets to the point of it, identifying the major issues that still need resolution, and the best path to accomplish the same. Even more so, I would like to thank my supervisor Bryan Gick for providing countless hours of his time and training to me. Bryan’s ability to explain theory and synthesize arguments is incredible, and I would not be a good scientist today were it not for his patience and dedication. Bryan is not only a great supervisor, but a true friend.

Dedication

In loving memory of my dear friend Abigail Mary Scott
May 16th, 1986 - June 18th, 2011
I will miss you all the days of my life.

Soli Deo Gloria!
Chapter 1

Introduction

Researchers interested in speech planning have typically worked downwards from higher-level structure, identifying evidence for planning down to units as small as the syllable, and possibly smaller; those interested in lower-level speech phenomena have long been interested in coarticulation, but have seldom found clear evidence for planning. I provide evidence that speakers plan speech acts at the level of subphonemic movements, considering a variety of factors in the construction of their plans including avoidance of articulatory conflicts, accommodation of their own motor skills, gravity, and elasticity, and prioritizing end-state comfort. The resulting actions do not match one-to-one with any linguistic units previously discussed in the literature. In order to track the timing and direction of these rapid motions of the tongue tip and blade, I use a combination of B-mode and M-mode ultrasound imaging.

1.1  Rhotics, flaps and taps

Take, for instance, the word ‘murder’. A speaker can pronounce this word by using a variety of very different-looking sequences of motion while still maintaining the same meaning. One reason is that this word has two English rhotic vowels, which I will refer to collectively as ‘R’. There are two well-known, categorically different but meaningfully identical (that is, subphonemic) variants: a tongue tip-up rhotic ([ɻ̩]) and a tongue tip-down rhotic ([ɹ̩]) (minimally, see Delattre and Freeman, 1968 for a description of eight phonologically conditioned variants of this sound across English dialects). For the tongue tip-up rhotics, the tongue tip is above the alveolar ridge, and this includes both tip-up and retroflex rhotic vowels (‘R’) (Hagiwara, 1995). For the tongue tip-down rhotics (also called ‘bunched’), the tongue tip is below the alveolar ridge. Both the first and second ‘R’ in ‘murder’ could be either of these variants, for a total of four possible ‘R’ sequences. 
Another reason speakers can pronounce the word ‘murder’ with very different tongue motions is that the middle consonant between the two ‘R’s is an English flap/tap (‘T’). These are the rapid coronal sounds that make words like ‘ladder’ and ‘latter’ sound ambiguous when spoken out loud in many dialects. This ‘T’ can take the form of one of four categorically different, but meaningfully identical variants: an alveolar tap ([ɾ↕]), an up-flap ([ɾ↖]), a down-flap ([ɾ↘]) or a postalveolar tap ([ɾ↔]) (see chapter 2 and Derrick and Gick, 2008). I describe these four variants within the context of the four ‘R’ combinations for ‘murder’. With an alveolar tap, the tongue moves from an initial [ɹ̩] position below the alveolar ridge upwards, makes contact, and moves back down into a position below the alveolar ridge for the final [ɹ̩], giving us [mɹ̩ɾ↕ɹ̩]. With a down-flap, the tongue moves from above the alveolar ridge from an initial [ɻ̩], makes contact, and continues downward below the alveolar ridge to a final [ɹ̩], giving us [mɻ̩ɾ↘ɹ̩]. With an up-flap, the tongue moves from an initial [ɹ̩] below the alveolar ridge, makes contact, and continues upward into a position above the alveolar ridge for a final [ɻ̩], giving us [mɹ̩ɾ↖ɻ̩]. With a postalveolar tap, the tongue tip starts above the alveolar ridge from an initial [ɻ̩], moves roughly horizontally to a point above the ridge, makes contact, and moves back to a position that remains above the alveolar ridge for a final [ɻ̩], giving us [mɻ̩ɾ↔ɻ̩]. These four ‘T’ variants are described in detail in chapter 2, and I gave them these symbols in order to make them easy to identify in text.

1.2  Constraints on ‘R’ and ‘T’ variant production

I will show that speakers not only use different strategies from each other, but the same speaker will use different strategies for the same word within the same sentence (e.g. ‘We have him murder a mob’) from utterance to utterance. 
Speakers also vary ‘T’ independently of ‘R’ in these contexts. As a result, ‘R’ and ‘T’ variants are not simply allophones of their phoneme that are conditioned by phonological context. The variations for both ‘R’ and ‘T’ are neither random nor simply transparently conditioned by phonological contexts, but follow consistent patterns driven by factors that have not been included in our theories of speech motor control (see Chapters 4 and 5). Nevertheless, the observed kinematic patterns fall out of constraints including avoidance of articulatory conflict, accommodation of end-state comfort, differences in motor skills, and constant effects such as gravity and elasticity. In relation to this research, the production of ‘T’ variants in contexts other than the types listed above, for example [mɹ̩ɾ↘ɹ̩], would require a rapid transition of tongue tip raising out of the ɹ̩ and into the ɾ↘. This kind of multiple requirement has been referred to as articulatory conflict (Gick and Wilson, 2006), or blending coarticulation (Saltzman and Munhall, 1989). As well, except for [ɻ̩], no sound in English is retroflexed. However, a flap requires the tongue tip to be above the alveolar ridge either before (in the case of [ɾ↘]) or after (in the case of [ɾ↖]) the flap. This conflicts with any other sound in English as the tongue tip cannot simultaneously be both above and below the alveolar ridge. For example, in the phrase ‘edit a’, potential articulatory conflicts may be avoided by using a double-tap sequence (‘Vɾ↕Vɾ↕V’). However, producing two taps ([ɾ↕] or [ɾ↔]) in a row may be kinematically difficult for some speakers in comparison to one arc of motion in [ɾ↖], [ɾ↘], or [ɾ↘], [ɾ↖] sequences. This means that sequences of flaps should be easier to produce than sequences of taps for many speakers. 
As a result, speakers may prefer using flaps, allowing articulatory conflicts in the beginning or middle of a sequence to avoid them at the end, as in an alternate production of ‘edit a’: ‘Vɾ↖Vɾ↘V’. In this case, the speaker’s middle vowel (the ‘i’ in ‘edit’) becomes ‘rhotacized’ because the tongue tip is forced into a retroflex position, while the initial and final vowels are produced normally. In order to understand why speakers would ever produce ‘T’ variants that do not resolve these articulatory conflicts, I turn to skeletal motor behaviour research. In research on hand grasping choices, Rosenbaum et al. (1992) demonstrated that participants will tolerate initially awkward positions in order to end a motion sequence with a comfortable posture, and named this end-state comfort. That is, they plan for the end-state in that they decide between two or more possible choices at a given point in advance of a desired outcome. Results like the production of ‘edit a’ as ‘Vɾ↖Vɾ↘V’ show support for the ‘end-state comfort effect’ in speech. Arguments for local end-state comfort effects are presented in chapter 2, and arguments for end-state comfort effects across morpheme and word boundaries are presented in chapter 4. Evidence showing that people who can produce repeated syllables rapidly are more likely to be able to avoid articulatory conflicts, whereas people who cannot are more likely to allow conflicts with preference for end-state comfort, is presented in chapter 5. Also, I argue that fixed effects such as gravity and elasticity can influence patterns of speech production, and help explain why most speakers produce the word ‘Saturday’ as [sæɾ↖ɻ̩ɾ↘eɪ]. That is, speakers contract muscles to produce the ɾ↖ motion into a ɻ̩, and the ɾ↘ results from relaxing those muscles. Evidence for this argument is presented in chapter 3. 
These tradeoffs between avoiding articulatory conflicts and the limitations of motor skills, mediated by end-state accommodation, gravity, and elasticity, allow the formation of a theory of speech planning at the level of subphonemic actions. Speakers must plan these subphonemic details, using local information (chapter 2), as well as information across syllable, morpheme and word boundaries (chapter 4) for each speech act. I also believe that the reason that speakers will sometimes employ different strategies for the same words in the same sentence frame is that the importance of each constraint may vary from utterance to utterance, and that there may be constraints that have not been identified or examined in this research. Such low-level planning challenges, or at least constrains, nearly every existing theory of speech motor control.

1.3  Disparity between motor actions and units of linguistic representation

While nothing in this thesis directly examines how the nervous system controls motion, observations of categorical kinematic variation are exceedingly rare in the speech literature, and tell us a lot about how theories of motor control must be constrained in order to account for these speech acts. For instance, there are at least two categorically distinct actions, defined as the motion of a single articulator in a direction or arc of motion, that correspond to ‘R’, and four that correspond to ‘T’. Yet under most phonological analyses, these variants correspond to no one segment or phoneme each. 
Where such mismatches exist in very similar phonological contexts, they pose a challenge to a host of theoretical positions that argue for parity between speech actions and units of linguistic representation, whether at the feature (Chomsky and Halle, 1968; Meyer and Gordon, 1985), phoneme (Guenther et al., 1999; Perkell et al., 2000), gesture (Browman and Goldstein, 1986, 1989, 1992), syllable (Levelt, 1994), or word (Browman and Goldstein, 1986, 1989, 1992) level. These mismatches also add to many arguments against the ‘string of beads’ method that linguists use to transcribe speech, and are discussed in chapter 2. Mismatches where one discrete action corresponds to multiple segments or phonemes are discussed in chapter 3. Also, the two ‘R’ variants involve making constrictions with two different parts of the tongue, and the four ‘T’ variants involve making contact with the alveolar ridge from different directions. But the ‘R’ variants both share a constriction location and constriction degree, as do the four ‘T’ variants. This observation fits especially well with theories of speech motor control that use coordinative structures, as explained in chapter 2.

1.4  Collecting data on categorical variation in speech production

These results provide new challenges to theories of speech motor control in part because the fields of research that generated theories of speech motor control and the methods used to investigate motor control have not allowed for such observations in the past. In psychological research, the typical experiments used to study speech involve acoustic recording of participants where patterns of speech behaviour such as speech errors or speech onset times are recorded. Such research can let researchers infer the processing complexity of a speech act, or identify the patterns of organization of speech, but it is limited in the ability to document articulator motion. 
As a result, psychological theories of speech behaviour, and particularly planning, tend not to extend below the phoneme because this is the lowest level that these techniques can reasonably examine (see Levelt, 1989). In contrast, speech scientists often use tools for articulatory measurement such as electromagnetic midsagittal articulometer (EMMA) (Schönle et al., 1987), X-ray cine (Delattre and Freeman, 1968), anatomical MRI (Narayanan and Alwan, 1997), or ultrasound (Kelsey et al., 1969) to measure speech articulator motion. These techniques can be used to look at the smallest of speech articulator motions, but for reasons related to each instrument have rarely been employed to identify evidence of low-level planning. Speech scientists have typically argued for coarticulated features with no planning (Fowler, 1980; Öhman, 1966, 1967; Saltzman and Munhall, 1989), defined as the generation of a strategy for speech production at utterance time. However, from the beginning, Henke (1966) incorporated low-level planning in his computational model to deal with anticipatory coarticulation, and Whalen (1990) has found evidence for limited planning in coarticulation. Testing for subphonemic planning benefits greatly from a method that at once allows the identification of rapid, categorically describable articulator motion and also allows collection of a large number of randomized but repeated tokens. Otherwise there is no way to identify cases when the same speaker will produce different categorical variants in the same tokens and context, which makes it extremely difficult to unambiguously argue for planning at the subphonemic level. I address the data collection portion of this problem by using simultaneous B/M mode ultrasound carefully aligned to the acoustic signal. B-mode ultrasound is used to capture 2-dimensional images of the midsagittal plane of the tongue at 30 frames per second (fps). 
The M-mode (motion mode) ultrasound provides a progressive scan of three selected one-dimensional lines accessible from an ultrasound probe. These one-dimensional M-mode lines follow the line of the palate, in the region of intercept with the blade/tip of the tongue. Because M-mode ultrasound is a progressive scan, it presents the motion data at the full capture rate of the ultrasound probe, which ranged from 60-100 Hz depending on the depth of the scan, inside regular video. The M-mode data, when synchronized with audio data, allowed me to capture the general direction of motion of the front of the tongue, which is ideal for identifying the ‘T’ variants described above when combined with our knowledge of the physical constraints on tongue-tip motion. At the same time, the B-mode ultrasound allows examination of the midsagittal plane of the tongue surface at 30 fps, which along with the M-mode data allowed identification of the ‘R’ variants described above. I describe the use of B/M ultrasound for identifying ‘R’ and ‘T’ variants in detail and with exemplars in chapter 2. These variants are relatively easy to identify with a frame-by-frame analysis of the ultrasound data time-aligned to the acoustic data. Because the measurement technique is relatively non-invasive and the data are easy to collect, a lot of data can be collected for each participant, allowing many participants and many repetitions of the relevant data sets per participant.

1.5  Theoretical importance of research

The results of these studies present challenges to theories of motor control in that motion direction and the position of specific articulators form the basis of planned variation in English ‘R’ and ‘T’. The results also show that the relationship between speech actions and units of linguistic representation depends on constraints, such as end-state comfort, gravity, elasticity and differences in individual motor skills, that are beyond the usual domain of Linguistics. 
The importance of these constraints can vary from utterance to utterance, resulting in categorically different productions of the same words and phrases. As well, the resulting speech action disparities, including different ‘R’ and ‘T’ productions in the same phrase, or single speech actions spanning multiple phonemes, as in the typical production of ‘Saturday’, require rethinking the relationship between linguistics and speech action, and present a path for future research.

Chapter 2

One phonological segment, multiple motor events: Evidence from English flaps and taps

2.1  Introduction

Characterizing speech variation has been very important in speech production research because such an understanding is useful for figuring out how speech motor control works on the one hand, and how phonology works on the other. In particular, understanding categorical speech variation is especially important because it could provide a missing link between gradient speech variation and categorical phonological variation. However, there are very few known examples of categorical speech variation. Most of the time, when speakers produce a particular sound, the articulators will move in a similar pattern and produce a similar shape. Differences are typically believed to depend on production competency or phonological context, and are often argued to be gradient in the mathematical sense, varying around a single ‘average’ shape and motion from one instance to another. Nevertheless, examples of categorical speech variation do exist. Delattre and Freeman (1968) demonstrated that English ‘r’ has categorical variants. That is, the shape and position of the tongue does not vary around a single ‘average’ shape, but around several different shapes. These categorical subphonemic variants extend across speakers based on dialect, and within speakers based on phonological context (see Westbury et al., 1999). 
Delattre and Freeman identified eight distinct allophonic subcategories, and found that among speakers who produced more than one ‘r’ variant, they were predictable from vowel context. For instance, Mielke et al. (2010) found that, for those American English participants who showed variation in /r/ shape (11 out of 27, or about 40 percent of participants in their study), ‘bunched’ variants of /r/ were more likely to occur adjacent to the vowel /i/ while ‘tip-up’ postures occurred coupled with /a/ and /o/. Nevertheless, while this work points to the possibility, these researchers did not identify categorical variation within the same speaker and phonological context. Variation within the same context has been observed quite often, as in tradeoffs between muscles during labial closures, with variable amounts of contribution of jaw movement, lower lip movement, and upper lip movement on different tokens on the same targets under the same circumstances (Folkins and Abbs, 1975), but unlike the ‘r’ variations above, these are gradient tradeoffs around an average lip constriction degree. Clear evidence of categorical variation within the same speaker and phonetic context would contradict many theories of speech motor control, and force researchers to more closely examine the relationship between speech articulation and phonological representation. However, few researchers have sought such variation, and none have observed it. We hypothesize that the same individuals will sometimes produce different categorical variants for the same stimuli in the same sentence frame. To test this hypothesis we identified a second sound with categorical subphonemic variation, flap/tap (‘T’), which occurs in a position of articulatory conflict with neighbouring sounds, including English ‘r’. Articulatory conflict occurs when two articulations within a speech act are in direct physical conflict (Gick and Wilson, 2006; Wood, 1996). 
When two articulations within a speech act are in direct physical conflict, multiple strategies could be used to resolve the conflict (Gick and Wilson, 2006), creating a context where categorical subphonemic kinematic variation, even for the same speaker and same phonological context, might occur.

2.1.1  English ‘T’ variants in conflicting phonetic context

Flaps and taps are short stop-like consonants made through rapid gestures initiated by muscle constrictions and aimed at an articulatory region. They are ballistic (DeJong, 1998; Hockett, 1955), muscle-induced and controlled gestures that typically impact roughly along the dental or alveolar ridge (Ladefoged and Maddieson, 1996). In most dialects of North American English, coronal stop consonants (/t/, /d/), and occasionally other coronal sounds, can be realized as flaps (Umeda, 1976). Flaps differ from stop consonants in that there is little or no buildup of air pressure at the place of articulation, and as a result there is little or no stop burst at the closure release (Zue and Laferriere, 1979). Tap or flap production is particularly likely when the consonant follows a stressed syllable. In the case of /t/ and /d/, the phonetic distinction largely neutralizes because flaps are typically voiced. Taps are constrictions where the tongue begins and ends its movement in the same relative location (either above or below the alveolar ridge), while flaps are constrictions where the tongue begins its movement tangentially, striking its target and continuing, either upward or downward, along a single trajectory. In most languages, including English (though see Ladefoged 1968), taps and flaps are not contrastive. At least one dialect of Warlpiri, spoken in the Tanami Desert and surrounding area of central Australia, has a phonemic distinction between apico-alveolar taps, which vary in predictable ways with apico-alveolar trills, and postalveolar flaps (Ingram and Laughren, 1999). 
Hindi also has both a retroflex flap and an alveolar tap (possibly also varying with a trill), as contrasting phonemes (Kelkar, 1968; Maddieson, 1984). In Spanish, [ɾ↕] starts and ends just below and behind the alveolar ridge and as a result the formant transitions into and out of Spanish taps are similar to each other. In contrast, English downward flaps involve a preparatory raising and retraction of the tongue, and as a result the formant transitions before and after the flap are quite different from each other (Ladefoged and Maddieson, 1996; Monnot and Freeman, 1972). English ‘T’ must interface with surrounding vocalic segments. As noted above, Mielke et al. (2010) demonstrated that among speakers who produced more than one ‘r’ variant, the variant was related to vowel context. For this study, we focus on the distinction between tongue-tip down bunched ‘r’ (hereafter [ɹ̩]), tongue-tip up bunched ‘r’, and retroflex ‘r’ (hereafter collectively [ɻ̩]) because for this study, tongue tip position above or below the alveolar ridge is the most important characteristic of the categorical variants of ‘R’ (see Hagiwara, 1995). Mielke et al.’s results demonstrate that articulatory conflict is one cause of subphonemic variation in ‘r’. Similarly, within the context of a given flap/tap, articulatory conflict occurs when the tongue tip is raised adjacent to a non-rhotic vowel (‘V’) or a [ɹ̩], or when the tongue tip is lowered adjacent to a [ɻ̩]. One example each of the many possible versions of [ɻ̩] and [ɹ̩] rhotic vowels is seen in Figure 2.1.

Figure 2.1: Example of ‘R’ variants from an MRI trace of held vowels: (a) tip-up rhotic [ɻ̩], (b) tip-down rhotic [ɹ̩]. Lines point to the tongue tip.
In contrast to the categorical variability of rhotic vowels, non-rhotic vowels in English tend to have tongue position targets that are usually low in the mouth below the alveolar ridge, as can be seen by examining both the acoustic and articulatory data in early cinefluorographic vowel studies. English vowels all have low tongue tip positions in comparison to [ɻ̩], but low and back vowels tend to have lower tongue tip positions than front or high vowels (Gay, 1974; Perkell, 1969). As a result, in order to avoid articulatory conflict, the tongue tip must be low adjacent to a non-rhotic vowel. So, assuming that speakers can produce all the variants that resolve these articulatory conflicts, then English should have four subphonemic kinematic variations of ‘T’. The first, an alveolar tap ([ɾ↕]), would be most likely to occur when a single flap is flanked by two ‘V’, [ɹ̩], or combination of same. The tongue tip would move from below the alveolar ridge upwards, make contact and move back down into position for the following vowel. The second, a [ɾ↘], would be most likely to occur when a single flap is preceded by a [ɻ̩] and followed by a vowel or [ɹ̩]. The tongue tip would move from above the alveolar ridge, make contact, and continue downward into position for the following vowel. The third, an [ɾ↖], would be most likely to occur when a single flap is preceded by a vowel or [ɹ̩] and followed by a [ɻ̩]. The tongue tip would move from below the alveolar ridge, make contact, and continue upward into position for the [ɻ̩]. The fourth, a [ɾ↔], would be most likely to occur when a single flap is flanked by two [ɻ̩]s. The tongue tip would move from above the alveolar ridge roughly horizontally to a point above the ridge and back for the following [ɻ̩]. Schematics of the four types of ‘T’ listed above are found in Figure 2.2.
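The context-to-variant predictions in the preceding paragraph can be summarized as a small lookup, sketched here in Python. This is only an illustration of the hypothesized conflict-free mapping, not an analysis tool used in the study: the segment labels ('V', 'tip_down_R', 'tip_up_R'), the `Tip` type, and the function name are hypothetical names introduced for the example.

```python
from enum import Enum


class Tip(Enum):
    """Tongue tip position relative to the alveolar ridge."""
    LOW = "below the alveolar ridge"
    HIGH = "above the alveolar ridge"


# Hypothetical segment labels (not from the thesis): non-rhotic vowels and
# tip-down rhotics keep the tip low; tip-up rhotics keep it high.
TIP_POSITION = {
    "V": Tip.LOW,
    "tip_down_R": Tip.LOW,
    "tip_up_R": Tip.HIGH,
}

# (tip position before 'T', tip position after 'T') -> variant that avoids
# articulatory conflict, following the four cases described in the text.
CONFLICT_FREE_VARIANT = {
    (Tip.LOW, Tip.LOW): "alveolar tap",
    (Tip.HIGH, Tip.LOW): "down-flap",
    (Tip.LOW, Tip.HIGH): "up-flap",
    (Tip.HIGH, Tip.HIGH): "postalveolar tap",
}


def conflict_free_variant(before: str, after: str) -> str:
    """Return the 'T' variant predicted when no articulatory conflict is tolerated."""
    return CONFLICT_FREE_VARIANT[(TIP_POSITION[before], TIP_POSITION[after])]
```

For example, `conflict_free_variant("tip_up_R", "V")` looks up the (high, low) case and returns the down-flap. The mapping only encodes the conflict-free expectation; the study's point is that speakers do not always follow it.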
Figure 2.2: Schematics of ‘T’ variant motions: (a) alveolar tap [ɾ↕], (b) down-flap [ɾ↘], (c) up-flap [ɾ↖], (d) postalveolar tap [ɾ↔].

The production of ‘T’ variants in contexts other than the ones listed would produce articulatory conflicts that could be resolved by rapid transitions into or out of the ‘T’, or by producing the relevant vowel in a non-canonical way (i.e. a tip-up non-rhotic vowel). As mentioned above, there is no guarantee that all speakers can or will perform all the possible ‘T’ variants. Individual variation is expected, even for speakers of the same or similar dialects. For instance, articulatory conflict conditions can vary between speakers because some speakers may prefer producing [ɻ̩] and others [ɹ̩], so then the preferred ‘T’ will vary between speakers. Other speakers may not produce particular ‘T’ types consistently. In the cases where a speaker does not produce a particular ‘T’ type consistently regardless of potential articulatory conflicts, Rosenbaum’s end-state-comfort hypothesis provides a potential response (Rosenbaum et al., 1992). In the classic description of this effect, a person asked to pick up a glass and put it down again will grasp it with their wrist in mid-rotation (thumb and index finger up) and put the glass back down. The same person asked to pick the glass up and put it down upside-down will rotate their wrist and grasp the glass with their thumb and index finger down (an awkward grip) so that they can rotate their arm back to a comfortable position when they put the glass back down. We expect a similar result with speech, in that the speakers will produce the vowels before the ‘T’ with awkward transitions in exchange for standard productions of the vowel after the ‘T’. 
But most importantly, we hypothesize that, having identified the categories of kinematic variation relevant to this study, we will see that individual participants produce different categorical variants of ‘T’ and ‘R’ in different repetitions of the same sentence for the same context. Because each ‘T’ variant is clearly produced with a different articulatory motion leading to a different path for the ‘T’ variant, such observations would demonstrate that there are no fixed subphonemic categories of articulator motion in speech, even in the same phonological context.

2.1.2  Methodological solutions

Testing these hypotheses is a challenge because the speed of ‘T’ articulation creates difficulty in accurate collection and interpretation of data. That is, there is a need to collect data at a fast rate and for a long time. Nevertheless, since identifying the type of ‘T’ is based largely on direction of motion of the tongue tip and blade, potentially both imaging and point tracking techniques could provide suitable data. One technique, ultrasound, is ideal for capturing both tongue shape and motion data. Also, ultrasound has the advantage of easily allowing long recording sessions. Since one of the goals of this study is to identify kinematic variation in ‘T’, and since most participants are expected to shift their strategies across repetitions of otherwise identical sentences, this study requires large bodies of data for each speaker. As a result, at least 10 tokens of each phrase, combined with control stimuli and stimuli for future work, taking 15-20 minutes of recording time, is desirable. However, ultrasound machine data capture rates have typically been too slow to fulfil all the needs of this project. For most ultrasound machines, B-mode ultrasound (2D images of the tongue) records data at a rate as high as 120 Hz. 
However, the recording is reduced to 24 or 30 Hz based on the standard video display rate for the video region in which the machine is sold (Hedrick et al., 1995). This is fast enough to identify ‘R’ variants easily, but it is too slow to be ideal for capturing the motion involved in the production of ‘T’ variants. Even HD ultrasounds, with 60 frames per second, are not perfectly suited for the task. Some modern ultrasound machines can output data in Digital Imaging and Communications in Medicine (DICOM) format at speeds up to 120 Hz, which is more than fast enough, but at the time of data collection, computer interfaces allowed only about 8 seconds of data capture at a time. Another problem is that many ultrasound setups do not allow researchers to record the tongue tip. Nevertheless, the use of thin ultrasound probes placed close to the thyroid notch allows most or all of the tongue tip to be imaged most of the time. As a result, careful control of the experiment setup allows easy identification of ‘R’ variants. Nevertheless, there is concern that the tip of the tongue might disappear for some frames in which rapid tongue gestures occur - just the frames needed to study ‘T’ variants. One solution then is to combine B-mode with M-mode ultrasound. M-mode ultrasound is a one-dimensional view of ultrasound data that is displayed as a progressive scan in the video output of an ultrasound machine. As a result, M-mode provides information from the ultrasound machine at the higher rate of data collection and displays it all on a single video frame (Kelsey et al., 1969). It is therefore capable of providing high speed data of the motion of one dimension of the tongue using a low speed display (Ostry et al., 1983). The temporal resolution is high enough to allow the study of uvular trills (Kavitskaya et al., 2009), and is therefore more than high enough for tracking ‘T’ variants. 
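The rate trade-offs above reduce to simple arithmetic, sketched below in Python. The figures plugged in are the ones stated in the text (30 fps video, the 60-100 Hz M-mode capture range reported in section 1.4, roughly 8 seconds per DICOM capture, 15-20 minute sessions); the function names are hypothetical and the calculation is only a back-of-envelope illustration, not part of the study's processing pipeline.

```python
import math


def ms_between_samples(rate_hz: float) -> float:
    """Time between successive samples, in milliseconds, at a given rate."""
    return 1000.0 / rate_hz


def scan_lines_per_video_frame(mmode_hz: float, video_fps: float) -> float:
    """Number of M-mode scan lines drawn into each displayed video frame."""
    return mmode_hz / video_fps


def captures_needed(session_minutes: float, capture_seconds: float) -> int:
    """Separate fixed-length captures needed to cover a whole recording session."""
    return math.ceil(session_minutes * 60 / capture_seconds)


if __name__ == "__main__":
    # B-mode video at 30 fps versus M-mode at the 60-100 Hz capture range.
    for rate in (30, 60, 100):
        print(f"{rate} Hz: one sample every {ms_between_samples(rate):.1f} ms")
    for mmode in (60, 100):
        lines = scan_lines_per_video_frame(mmode, 30)
        print(f"{mmode} Hz M-mode: {lines:.2f} scan lines per 30 fps frame")
    # 8-second DICOM captures versus a 15-20 minute session.
    for minutes in (15, 20):
        print(f"{minutes} min session: {captures_needed(minutes, 8)} captures")
```

At these figures each video frame carries two or more M-mode scan lines, while covering a full session with 8-second captures would take more than a hundred separate recordings, which is consistent with the choice of video-embedded M-mode over DICOM capture.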
At the same time, if the ultrasound image is carefully aligned to the acoustic signal, it is relatively easy to know the point of contact for a ‘T’, regardless of the variant. Therefore, knowing that the tongue tip/blade has a limited range of motion in line with the hypothesized ‘T’ variants listed above, tracking the time-aligned motion of that tongue region allows for accurate identification of ‘T’ variants. As a result, it does not matter if the tongue tip disappears from some frames, as this method is designed to track the motion of the tip/blade region, and make reasonable inferences about ‘T’ variants based on this motion. At the same time, B-mode ultrasound is more than suitable for identifying ‘R’ variants because [õ] and [ô] look quite different from each other, even if part of the tongue tip is cut off. A combination of B and M-mode ultrasound is therefore suitable to the task as it provides high-speed data over extended periods of time using a methodology that is unlikely to interfere with natural speech articulation, and allows confirmation of the motion scanned in M-mode with the overall tongue shape shown in B-mode (see Chang et al., 2003; Miller and Watkin, 1997).

Data analysis solutions

Another issue is the method by which the categories of ‘T’ and ‘R’ are identified. While the analysis had one primary rater, all the data for the first 8 participants were completely reexamined a second time to ensure the accuracy of ratings, and samples of the data from all 18 participants were reexamined by a second rater. Detailed examples of how these identifications were made are presented in the section on exemplars. Obtaining statistical analyses that make appropriate assumptions with regard to the data collected proved to be a serious challenge. There is no implied ordered relationship between the individual ‘T’ variants or ‘R’ variants.
Therefore, these categorical data require non-parametric statistical analysis, which limits the number of usable models, especially since there are four ‘T’ variants, which therefore require tests that can deal with multinomial distributions. An ideal model would be a generalized linear mixed-effects model fitted to multinomial distributions; however, such models require massive amounts of data, and all attempts to use them failed to converge. As a result, we turned to a very simple model, the Wilcoxon signed-rank test. These tests make the most appropriate assumptions. The dependent variables of this study, either ‘R’ or ‘T’ variants depending on the test, involve data that were placed in ‘bins’ based on a set of well-described standards. Nevertheless, there is no intrinsic ordering or ranking between one group and the next. The distinctions between these variants are based on tongue tip position in relation to the alveolar ridge for ‘R’, and tongue motion direction before and after contact for ‘T’. Wilcoxon tests do not require an ordering or ranking of the categories to provide suitable results, and so they are most appropriate. At the same time, we expect speech production results such as these to be easily understood from the descriptive statistics. That is, the results should be robust against Type II statistical errors, avoiding the biggest weakness of the Wilcoxon signed-rank test.

2.1.3  Hypothesis

We test the hypothesis that there are no fixed categories of articulator motion for speech production. Therefore we expect the same individuals will sometimes produce different variants of ‘T’ for the same stimuli. However, we expect that speakers can make use of all possible directions of tongue tip motion toward and away from an articulatory target.
This means that there are four categorical kinematic variants of ‘T’ in North American English:

I) alveolar tap ([ɾ↕]): Defined as a ‘T’ where the tongue moves from below the alveolar ridge upwards, makes contact, and moves back down below the alveolar ridge.
II) down-flap ([ɾ↘]): Defined as a ‘T’ where the tongue moves from above the alveolar ridge, makes contact, and continues downward below the alveolar ridge.
III) up-flap ([ɾ↖]): Defined as a ‘T’ where the tongue moves from below the alveolar ridge, makes contact, and continues upward above the alveolar ridge.
IV) postalveolar tap ([ɾ↔]): Defined as a ‘T’ where the tongue moves from above the alveolar ridge to a point above or at the ridge horizontally and back to a position above the alveolar ridge.

We anticipate that most categorical variation will be predictable from local phonetic context. In the case of ‘T’ in English, this leads to the following predictions:

I) In ‘VTV’ context, such as in the word ‘autumn’, we predict a preponderance of [ɾ↕].
II) In ‘RTV’ context, such as in the word ‘Berta’:
   a) Given an initial [õ], we predict [ɾ↘], and
   b) Given an initial [ô], we predict [ɾ↕].
III) In ‘VTR’ context, such as in the word ‘otter’:
   a) Given a final [õ], we predict [ɾ↖], and
   b) Given a final [ô], we predict [ɾ↕].
IV) In ‘RTR’ context, such as in the word ‘murder’:
   a) Given an initial and final [õ], we predict [ɾ↔].
   b) Given an initial and final [ô], we predict [ɾ↕].
   c) Given an initial [ô] and final [õ], we predict [ɾ↖], and
   d) Given an initial [õ] and final [ô], we predict [ɾ↘].

Where speakers do not produce results in accordance with the hypotheses above, we expect speakers to produce ‘T’ variants that respect end-state comfort as opposed to beginning-state comfort.
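The predictions above can be read as a lookup from the tongue tip position before and after the ‘T’ to one of the four variants. The following is a minimal sketch of that reading, not part of the original analysis. It assumes that non-rhotic vowels and tip-down rhotics leave the tongue tip below the alveolar ridge and that tip-up rhotics leave it above; the segment labels `V`, `R_down` and `R_up` and the function name are hypothetical.

```python
# Hypothetical segment labels for this sketch:
#   "V"      non-rhotic vowel (tongue tip assumed below the alveolar ridge)
#   "R_down" tongue tip-down rhotic (tip assumed below the ridge)
#   "R_up"   tongue tip-up rhotic (tip assumed above the ridge)
TIP_ABOVE_RIDGE = {"V": False, "R_down": False, "R_up": True}

# (tip starts above ridge, tip ends above ridge) -> predicted 'T' variant,
# following the start and end positions in the four variant definitions.
VARIANT_BY_PATH = {
    (False, False): "alveolar tap",
    (True, False): "down-flap",
    (False, True): "up-flap",
    (True, True): "postalveolar tap",
}


def predict_t_variant(before: str, after: str) -> str:
    """Return the predicted 'T' variant for the segments flanking the 'T'."""
    return VARIANT_BY_PATH[(TIP_ABOVE_RIDGE[before], TIP_ABOVE_RIDGE[after])]


if __name__ == "__main__":
    examples = [
        ("autumn", "V", "V"),
        ("Berta", "R_up", "V"),
        ("otter", "V", "R_up"),
        ("murder", "R_up", "R_up"),
    ]
    for word, before, after in examples:
        print(word, before, after, "->", predict_t_variant(before, after))
```

Treating the prediction as a function of start and end position only is a simplification: it says nothing about end-state comfort, which the text invokes for cases where speakers depart from these predictions.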
Figures 2.3, 2.4, 2.5, and 2.6 present a schematization of most possible motions of the tongue tip (similar to what is seen in M-mode ultrasound) during the production of the words ‘autumn’, ‘Berta’, ‘otter’, and ‘murder’. The thick coloured line represents the motion of the tongue tip, with likelihood represented from green as most likely to yellow, orange, and lastly red as least likely. Thin dashed lines represent articulatory conflicts before the ‘T’, or violations of beginning-state comfort, and thick close-dashed lines represent articulatory conflicts after the ‘T’, or violations of end-state comfort.

Figure 2.3: Schematic tongue tip trajectory showing matrix of possible tongue-tip trajectories for productions of ‘autumn’. [Panels: alveolar tap (ɾ↕), down-flap (ɾ↘), up-flap (ɾ↖) and postalveolar tap (ɾ↔) in the frame V ɾ V.]

Figure 2.4: Schematic tongue tip trajectory showing matrix of possible tongue-tip trajectories for productions of ‘Berta’. [Panels: the same four variants in the frames ɹ̩ ɾ V and ɻ̩ ɾ V.]

Figure 2.5: Schematic tongue tip trajectory showing matrix of possible tongue-tip trajectories for productions of ‘otter’. [Panels: the same four variants in the frames V ɾ ɹ̩ and V ɾ ɻ̩.]

Figure 2.6: Schematic tongue tip trajectory showing matrix of possible tongue-tip trajectories for productions of ‘murder’. [Panels: the same four variants in the frames ɹ̩ ɾ ɹ̩, ɻ̩ ɾ ɹ̩, ɹ̩ ɾ ɻ̩ and ɻ̩ ɾ ɻ̩.]

2.2  Methods

Twenty-six native speakers of North American English between the ages of 18 and 40 participated in the study.
Seven of the participants produced stops instead of ‘T’ during read speech, and the equipment failed to record one person correctly, leaving 18 participants (ten males and eight females). All participants had normal speech and hearing. Participants were seated in a customized American Optical Co. model 507-a (1953) ophthalmic chair with a 2-cup rear headrest adjusted to contact the base of the skull just above the neck. A UST-9118 EV 180 electronic curved array ultrasound probe was placed under the chin. The probe has a variable frequency range of 3-9.0 MHz, and the mean (µ) slice thickness of the tissue viewed with this probe is approximately 3 mm (Medicines and Healthcare products Regulatory Agency, 2004). The probe was attached to an Aloka ProSound SSD-5000 ultrasound machine connected via S-video cable to a Canopus ADVC-110 advanced digital video recorder. A Sennheiser MKH-416 short shotgun microphone was mounted on a microphone stand and aimed at the participant from approximately 30 cm away from the mouth. The microphone was plugged into an M-Audio DMP3 preamplifier via a balanced XLR cable, and the preamplifier output was sent through an unbalanced RCA cable to the Canopus card to guarantee time synchronization between the ultrasound and audio output. The Canopus card was connected via FireWire to a Mac Pro Quad Core 2.8 GHz computer. An LCD monitor was mounted on the ophthalmic chair’s monitor mount facing the front of the participant. A computer running the experiment stimuli presentation software was connected to the LCD monitor so that the participant could easily read the stimuli from the screen. The ultrasound machine was set up in B/M mode and aligned to the acoustic signal. B-mode ultrasound was used to capture 2-dimensional images of the midsagittal plane of the tongue at 30 fps. The M-mode (motion mode) ultrasound provided a progressive scan of three selected one-dimensional lines accessible from an ultrasound probe.
These one-dimensional M-mode lines follow the line of the palate, in the region of intercept with the blade/tip of the tongue. Because M-mode ultrasound is a progressive scan, it presents the motion data at the full capture rate of the ultrasound probe, which ranged from 60-100 Hz depending on the depth of the scan. While this motion is not connected to any specific flesh-point, it allows capture of the general direction of motion of the front of the tongue, which is ideal for identifying the ‘T’ variants described above. At the same time, the B-mode ultrasound allows examination of the midsagittal plane of the tongue surface at 30 fps, which along with the M-mode data allowed identification of the ‘R’ variants described above. Tokens were selected to contain a single ‘T’ or sequences of ‘T’s in consecutive syllables. Data were collected on 17 control sentences, 9 sentences with 1 ‘T’, 10 sentences with double ‘T’ sequences, and 2 sentences with triple ‘T’ sequences, for a total of 38 unique sequences. This report focuses on 4 of the sentences, described in Table 2.1. Stimuli were carefully selected to control for vowel/rhotic contexts and syllable count. However, the need to keep the study within a reasonable duration of 20 minutes, and the limited number of relevant tokens in English (this study used a large percentage of the available words in the language), dominated many other concerns. As a result, controlling for word frequency was not possible. Single ‘T’ and double ‘T’ words were chosen to provide a minimum of one token for each of the full set of possible contexts for rhotic and non-rhotic vowels. This report is based on the data collected in this larger experiment, but here we focus on the single ‘T’ sequence data, as shown in Table 2.1 (tokens 1-4), as well as the control tokens (5-7) designed to allow comparison of ‘R’ variants in similar words that do not have ‘T’.
Token  Word    Carrier Phrase             ‘T’  Context  Type
1      autumn  We have autumn books       1    VV       ‘T’
2      Berta   We have Berta beep         1    RV       ‘T’
3      otter   We have otter books        1    VR       ‘T’
4      murder  We have him murder a mob   1    RR       ‘T’
5      Burma   We have Burma books        0    RV       ‘C’
6      offer   We have him offer books    0    VR       ‘C’
7      murmur  We have him murmur a vow   0    RR       ‘C’

Table 2.1: Single ‘T’ list. ‘T’ = flap/tap, ‘C’ = control phrase.

In words where a non-rhotic vowel precedes the first ‘T’, vowels ending in the high-front position were avoided because Zue and Laferriere (1979) showed these examples have ‘T’ with longer duration than ‘T’ with other vowels, including ‘R’, preceding the ‘T’. Stimuli were also chosen to have as few syllables as possible, which meant 2 syllables for these (C)VCV sequences. Each stimulus was wrapped in a carrier phrase. The carrier phrase was designed to be as repetitive as possible and therefore to help place focus on the stimulus instead of the carrier phrase. For this reason, all stimuli begin with ‘We have (him)’. The phrase ‘We have (him)’ was also chosen because it contains only labial or glottal consonants and so leaves the tongue free for other articulations. Similarly, carrier phrases end in words that have coronal/velar consonants only at the end (i.e. ‘books’), if at all. The stimuli were presented using PXlabRT set to present stimuli such that each sentence was displayed on an LCD screen for 2.2 seconds for a total of 12 blocks. The software automatically paused the experiment after the first 6 blocks (9 minutes) to allow participants to swallow some water or take a short break if needed. The 12 blocks were presented in set order, but the entire set of 38 sentences was randomized for each block. The experiment began with a test for tonguing speed in which the participant was asked to make a rapid sequence of the syllable ‘ta’.
The participant was then asked to repeat sentences containing ‘T’ sequences while the ultrasound machine was configured to match the size and shape of their head and tongue. The experiment software was then activated and experiment data were recorded as described above. Participants were asked to repeat ‘ta’ at least 10 times rapidly in order to record tongue motion speed and to provide data for audio synchronization. Participants were then asked to say 38 stimuli repeated 6 times for each of 2 blocks, for a total of 456 stimuli. Each block took 9 minutes, for a total of 18 minutes recording time.

2.2.1  Measurement methods

Data were recorded directly onto a Macbook via the Canopus card, and the audio was extracted from the DV recordings. Audio and video synchronization was confirmed by matching the acoustic transients of the alveolar stop releases in the spoken sequences of ‘ta’ with the associated tongue-dropping gestures. The Canopus card’s audio and video synchronization was consistently within 1 frame, requiring no special post-production synchronization. The acoustic signal was labeled and transcribed in PRAAT, with attention to identifying segment boundaries and the acoustic low amplitude point (centre) of ‘T’. Data were then imported into ELAN and the kinematic variants of ‘T’ identified. The process of identifying kinematic variants of ‘T’ in English is described next.

2.2.2  Exemplars

An [ɾ↖] is produced when the tongue tip is raised from below, makes contact with the alveolar ridge and keeps moving above the ridge. In [ɾ↖], the white line in the M-mode ultrasound image, indicating the tongue-surface trajectory, moves upward as seen in Figure 2.7. Examples of [ɾ↖] are shown in ‘VTR’ sequences, as in Subfigure 2.7(a), ‘RTR’, as in Subfigure 2.7(b) and ‘VTV’, as in Subfigure 2.7(c).

Figure 2.7: [ɾ↖]. [Panels: (a) [ɾ↖], ‘VTR’, ‘otter’; (b) [ɾ↖], ‘RTR’, ‘murder’; (c) [ɾ↖], ‘VTV’, ‘autumn’.]

Similarly, a [ɾ↘] is produced when the tongue tip begins from above the alveolar ridge, makes contact and keeps moving below the ridge. In a [ɾ↘], the white line in the M-mode ultrasound image, indicating the tongue-surface trajectory, moves downward as seen in Figures 2.8 and 2.9. An example of the ‘RTV’ sequences with [õ] is seen in Subfigure 2.8(a).

Figure 2.8: [ɾ↘]. [Panel: (a) [ɾ↘], ‘RTV’, ‘Berta’.]

A variant of [ɾ↘] occurs with what initially appears to be an [ô] that becomes [õ] halfway through the segment, as seen in Subfigure 2.9(a). Since the ‘R’ in these cases is [õ] for at least 66 ms before the [ɾ↘], as seen in Subfigures 2.9(b) and 2.9(c), these flaps are identified as [ɾ↘].

Figure 2.9: Alternate [ɾ↘] exemplars. White lines are for visual aid, and are approximations only. [Panels: (a) [ɾ↘], ‘RTV’, ‘Berta’; (b) [ɾ↘], ‘VTV’, ‘Autumn’, frame preceding [ɾ↘]; (c) [ɾ↘], ‘VTV’, ‘Autumn’, frame following [ɾ↘].]

The two taps include an [ɾ↕], where the tongue moves up to the alveolar ridge, hits the ridge and falls back, typically in a double ‘T’ sequence following a [ɾ↘], or in a single ‘T’ sequence between two ‘V’s. During an [ɾ↕] the white line in the M-mode ultrasound image, indicating the tongue-surface trajectory, curves up and back down as seen in Subfigure 2.10(a). English speakers also produce a [ɾ↔] where the tongue moves higher in the oral cavity to a post-alveolar region, makes contact and is retracted back, typically in a double ‘T’ sequence following an [ɾ↖], or in a single ‘T’ sequence between two ‘R’s. During a [ɾ↔] the white line in the M-mode ultrasound image, indicating the tongue-surface trajectory, is ‘squiggly’ and higher than the peak in an [ɾ↕], as seen in Subfigure 2.10(b).

Figure 2.10: [ɾ↕] and [ɾ↔]. [Panels: (a) [ɾ↕], ‘VTV’, ‘Autumn’; (b) [ɾ↔], ‘RTR’, ‘Murder’.]

Identification of ‘R’ variants requires only the B-mode data, as seen in Figure 2.11. The important distinction is between [ô], with the tongue tip bunched into the sublingual cavity, as seen in Subfigure 2.11(a), and the other two ‘R’ variants. Both the tip-up bunched ‘R’ seen in Subfigure 2.11(b), and the retroflex ‘R’ seen in Subfigure 2.11(c) have tongue tips above the alveolar ridge, and are therefore both examples of [õ]. Identification of ‘R’ variants is also confirmed through identification of ‘T’ variants, particularly [ɾ↖] (which ends with the tongue tip up) and [ɾ↘] (which ends with the tongue tip down).

Figure 2.11: ‘R’ variants. White lines are for visual aid, and are approximations only. [Panels: (a) tip-down (bunched) [ô], ‘RCV’, ‘Burma’; (b) tip-up [õ], ‘RTR’, ‘murder’; (c) retroflex [õ], ‘VTRTV’, ‘Saturday’.]
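The exemplar criteria above turn on where the tongue surface sits relative to the alveolar ridge just before and just after contact. The following is a hedged sketch of how that rule could be applied to a digitized M-mode trace. It is not the procedure used in this study, where ratings were made by inspecting the images. The trace format, the averaging window and the function name are all assumptions introduced for illustration.

```python
from statistics import mean
from typing import Sequence


def classify_t_variant(
    heights: Sequence[float],
    contact_index: int,
    ridge_height: float,
    window: int = 3,
) -> str:
    """Classify a 'T' from a hypothetical trace of tongue-surface heights.

    heights: tongue-surface height at successive M-mode samples (any unit).
    contact_index: sample aligned with the acoustic centre of the 'T'.
    ridge_height: height of the alveolar ridge in the same unit.
    window: number of samples averaged on each side of the contact.
    """
    if not 0 < contact_index < len(heights) - 1:
        raise ValueError("contact_index needs at least one sample on each side")
    before = heights[max(0, contact_index - window):contact_index]
    after = heights[contact_index + 1:contact_index + 1 + window]
    starts_above = mean(before) > ridge_height
    ends_above = mean(after) > ridge_height
    if starts_above and ends_above:
        return "postalveolar tap"
    if starts_above:
        return "down-flap"
    if ends_above:
        return "up-flap"
    return "alveolar tap"


if __name__ == "__main__":
    ridge = 10.0
    rising = [6.0, 7.0, 8.0, 10.0, 12.0, 13.0, 14.0]
    print(classify_t_variant(rising, contact_index=3, ridge_height=ridge))
```

A real trace would need smoothing and a principled choice of window, and the M-mode line does not follow a fixed flesh-point, so any such rule would have to be checked against the B-mode images.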
2.3  Results

The results show a high degree of between-subject variability, as seen in Figure 2.12. Five of the 18 participants (participants 9, 10, 14, 17 and 26) produced all four ‘T’ variants more than once, while 9 more (participants 2, 4, 5, 8, 12, 13, 15, 18 and 21) produced each of the four variants at least once. Three more produced each variant except the [ɾ↔] (participants 6, 16 and 23), and the last (participant 3) produced flaps only. Flaps were more common than taps, with the [ɾ↔] the least common variant.

Figure 2.12: Distribution of flap variants by participant. [Panels: (a) By participant, one panel each for subjects 2, 3, 4, 5, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 21, 23 and 26; (b) Summary: [ɾ↕] 202, [ɾ↘] 225, [ɾ↖] 326, [ɾ↔] 83.]

2.3.1  Flap orientation by phrase

Examining ‘T’ variants by phrase allows us not only to identify differences between participants, but also to identify whether speakers will select a different ‘T’ variant for repetitions of the same phrase in the same context. The results are shown in Figures 2.13 and 2.14. For the phrase ‘We have autumn books’ (the top rows of Figures 2.13 and 2.14), 12 of 18 participants produced [ɾ↕] most of the time, and 4 more produced [ɾ↕] some of the time and [ɾ↖] the rest of the time. Participants 3 and 21 did not produce [ɾ↕] at all.
Note that 12 of 18 participants produced more than one kinematic variant for the same phrase and context, with five participants (10, 12, 13, 18 and 21) using two variants frequently. For the phrase ‘We have Berta beep’ (the second rows of Figures 2.13 and 2.14), all but participant 23 produced [ɾ↘] all or most of the time. Participant 23 produced [ɾ↕] more often than [ɾ↘]. The results for this phrase show the least within- and between-subject variability. For the phrase ‘We have otter books’ (the third rows of Figures 2.13 and 2.14), 16 of 18 participants produced [ɾ↖] all or most of the time. Participant 15 produced [ɾ↖] half of the time and [ɾ↕] the other half of the time, and participant 23 produced [ɾ↕] most of the time. The phrase ‘We have him murder a mob’ (the bottom rows of Figures 2.13 and 2.14) shows a higher degree of between-subject variability than any of the other phrases. Participants 2, 3, 11, 16 and 21 produce [ɾ↖] most of the time, participants 10, 14, 17 and 23 produce [ɾ↔] most of the time, and the rest have much more variable kinematic selections.
Figure 2.13: Distribution of flap variants by subjects (2-12) by phrase. [One column of panels per subject; rows from top to bottom: ‘Autumn’, ‘Berta’, ‘Otter’, ‘Murder’.]

Figure 2.14: Distribution of ‘T’ variants by subjects (13-26) by phrase. [One column of panels per subject; rows from top to bottom: ‘Autumn’, ‘Berta’, ‘Otter’, ‘Murder’.]
Examining ‘T’ variants by phrase shows that every participant had at least one example of using a different ‘T’ variant for the same phrase. Half of the participants (participants 2, 3, 4, 5, 6, 14, 16, 17, 26) did this very rarely, but the other half did this much more often, especially with the phrases ‘We have autumn books’ and ‘We have him murder a mob’. The summary results in Figure 2.15 show the least ‘T’ variability for the phrases that have an ‘R’ on one side of the ‘T’ and a ‘V’ on the other side (‘Berta’ and ‘Otter’), followed by the phrase with ‘V’ on either side (‘Autumn’), leaving the phrase with ‘R’ surrounding the ‘T’ (‘Murder’) with the most variability.

phrase      [ɾ↕]  [ɾ↘]  [ɾ↖]  [ɾ↔]
a) autumn   160   9     45    0
b) Berta    7     193   3     2
c) otter    21    2     189   0
d) murder   14    21    89    81

Figure 2.15: Distribution of ‘T’ variants by phrase.

Wilcoxon signed-rank tests were performed on the data summarized in Figure 2.15. For each of the four ‘T’ variants, the percentage of productions matching that ‘T’ variant in each of the four phrase contexts was compared with the percentage in each of the other contexts. As expected from the descriptive statistics in Figure 2.15, many of these comparisons are significant, and the results can be seen in Table 2.2.
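The comparisons described above pair each participant's percentage for a given ‘T’ variant in one phrase with the same participant's percentage in another phrase. The following is a minimal sketch of the signed-rank statistic V (the sum of the ranks of the positive differences) for such paired percentages. It computes the statistic only, not the p value, and the function name and example data are hypothetical; it is not the software used for the reported tests.

```python
from typing import List, Sequence, Tuple


def signed_rank_v(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Return (V, n) for paired samples x and y.

    V is the sum of ranks of the positive differences x - y, ranking the
    absolute differences with average ranks for ties. Pairs with zero
    difference are dropped, and n is the number of pairs that remain.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    diffs: List[float] = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next absolute difference is tied.
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[start]])):
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    v = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    return v, len(diffs)


if __name__ == "__main__":
    # Hypothetical percentages of one variant for four participants
    # in two phrase contexts.
    context_a = [100.0, 50.0, 75.0, 0.0]
    context_b = [0.0, 50.0, 25.0, 25.0]
    print(signed_rank_v(context_a, context_b))
```

With 18 participants and no zero differences, the largest possible value is 18 × 19 / 2 = 171, which is reached when every participant's difference has the same sign.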
contexts    [ɾ↕] V   p             [ɾ↘] V   p             [ɾ↖] V   p             [ɾ↔] V   p
VR : RR     31       p = 0.341     5        *p = 0.044    133      *p < 0.001    0        *p = 0.001
VR : RV     21       p = 0.355     0        *p < 0.001    171      *p < 0.001    0        p = 0.346
VR : VV     2        *p < 0.001    0        p = 1         153      *p < 0.003    0        NA
RR : RV     15       p = 0.059     0        *p < 0.001    91       *p < 0.002    105      *p = 0.001
RR : VV     0        *p < 0.001    36       p = 0.12      81       p = 0.079     105      *p = 0.001
RV : VV     0        *p < 0.001    171      *p < 0.001    3        *p = 0.003    3        p = 0.346

Table 2.2: Wilcoxon signed-rank tests comparing prevalence of ‘T’ variants based on phrase. VR = ‘otter’, RV = ‘Berta’, RR = ‘murder’, VV = ‘autumn’. * = significant (α = 0.05). V is the sum of ranks assigned to the differences with positive sign.

2.3.2 ‘T’ variant by surrounding ‘R’ variants

Looking at the ‘T’ variant based on surrounding ‘R’ variants allows us to identify the relationship between the two, making it possible to test the relationship to local phonetic context effectively. The initial ‘R’ variant in the phrase ‘We have Berta beep’ did not covary much with ‘T’ variant, as seen in Figure 2.16. Binomial logistic regression comparing ‘T’ variants based on the initial ‘R’ variant uncovers no significant differences.

initial rhotic    ɾ↕    ɾ↘     ɾ↖    ɾ↔
ɹ̩                 7     113    3     0
ɻ̩                 0     80     0     2

Figure 2.16: Distribution of ‘T’ variants in the phrase ‘We have Berta beep’.

Wilcoxon signed-rank tests were performed on the data summarized in Figure 2.16. As expected from the descriptive statistics in Figure 2.16, none of the results were significant, as seen in Table 2.3.

contexts    [ɾ↕] V   p         [ɾ↘] V   p            [ɾ↖] V   p            [ɾ↔] V   p
ɻ̩ : ɹ̩       0        p = 1     38       p = 0.683    0        p = 0.174    3        p = 0.371

Table 2.3: Wilcoxon signed-rank tests comparing prevalence of ‘T’ variants based on the initial ‘R’ variant in ‘Berta’. * = significant (α = 0.05).

For the phrase ‘We have otter books’, the final ‘R’ type covaries with the ‘T’ variant, as shown in Figure 2.17.
While there are few examples of final [ɹ̩], when they were produced participants were more likely to produce [ɾ↕] instead of [ɾ↖].

final rhotic    ɾ↕    ɾ↘    ɾ↖     ɾ↔
ɹ̩               18    2     4      0
ɻ̩               3     0     185    0

Figure 2.17: Distribution of ‘T’ variants in the phrase ‘We have otter books’.

Wilcoxon signed-rank tests were performed on the data summarized in Figure 2.17. As expected from the descriptive statistics in Figure 2.17, the results are significant for [ɾ↖], and marginally significant for [ɾ↕], as seen in Table 2.4.

contexts    [ɾ↕] V   p            [ɾ↘] V   p        [ɾ↖] V   p             [ɾ↔] V   p
ɻ̩ : ɹ̩       20       p = 0.058    1        p = 1    1        *p < 0.001    0        NA

Table 2.4: Wilcoxon signed-rank tests comparing prevalence of ‘T’ variants based on the final ‘R’ variant in ‘otter’. * = significant (α = 0.05).

For the phrase ‘We have him murder a mob’, ‘T’ variant covaries with tongue-tip position both before and after the ‘T’, as illustrated in the raw data in Figure 2.18. An initial [ɹ̩] vowel followed by a [ɻ̩] vowel is highly related to [ɾ↖], and an initial [ɻ̩] vowel followed by a [ɻ̩] vowel is highly related to [ɾ↔]. While there is much less data on word-final [ɹ̩] vowels, an initial [ɹ̩] vowel followed by a [ɹ̩] vowel is most commonly associated with [ɾ↕], and the 7 initial [ɻ̩] vowels followed by a [ɹ̩] vowel all had [ɾ↘].

rhotic context    ɾ↕    ɾ↘    ɾ↖    ɾ↔
ɹ̩ɹ̩                13    5     0     6
ɹ̩ɻ̩                1     2     89    6
ɻ̩ɹ̩                0     9     0     0
ɻ̩ɻ̩                0     5     0     69

Figure 2.18: Distribution of ‘T’ variants in the phrase ‘We have him murder a mob’ based on initial and final ‘R’ variants.

Wilcoxon signed-rank tests were performed on the data summarized in Figure 2.18. For each of the four ‘T’ variants, the percentage of productions matching that ‘T’ variant was compared across the combinations of initial and final ‘R’ variant. As expected from the descriptive statistics in Figure 2.18, many of the results were significant, as seen in Table 2.5.
contexts      [ɾ↕] V   p            [ɾ↘] V   p            [ɾ↖] V   p             [ɾ↔] V   p
ɻ̩ɹ̩ : ɹ̩ɹ̩     0        p = 0.058    14       p = 0.099    0        NA            0        p = 0.181
ɻ̩ɹ̩ : ɻ̩ɻ̩     0        NA           14       p = 0.099    0        NA            0        *p < 0.001
ɻ̩ɹ̩ : ɹ̩ɻ̩     0        1            18       p = 0.131    0        *p = 0.001    0        p = 0.191
ɹ̩ɹ̩ : ɻ̩ɻ̩     15       p = 0.058    5        p = 1        0        NA            10       *p = 0.005
ɹ̩ɹ̩ : ɹ̩ɻ̩     14       p = 0.104    7        p = 0.584    0        *p = 0.001    11.5     p = 0.916
ɻ̩ɻ̩ : ɹ̩ɻ̩     0        p = 1        5        p = 1        0        *p = 0.001    91       *p = 0.001

Table 2.5: Wilcoxon signed-rank tests comparing prevalence of ‘T’ variants based on the initial and final ‘R’ variants in ‘murder’. ɻ̩ɹ̩ = tip-up initial, tip-down final ‘R’, ɹ̩ɹ̩ = tip-down initial and final ‘R’, ɻ̩ɻ̩ = tip-up initial and final ‘R’, ɹ̩ɻ̩ = tip-down initial, tip-up final ‘R’. * = significant (α = 0.05).

2.3.3 ‘T’ variation across immediate vocalic contexts

Figure 2.19 presents ‘T’ variation based on the immediate vocalic contexts, independent of the phrase. The results show the strong covariance of the ‘T’ variant with tongue position before and after the ‘T’. The big exception is the results for [ɹ̩]‘T’‘V’ sequences. Here we expected mostly [ɾ↕], but instead we see mostly [ɾ↘] (see the [ɹ̩]‘T’‘V’ context in Figure 2.19).

context    ɾ↕     ɾ↘     ɾ↖     ɾ↔
VV         160    9      45     0
ɹ̩ɹ̩         13     5      0      6
ɻ̩ɹ̩         0      9      0      0
ɹ̩ɻ̩         1      2      89     6
ɹ̩V         7      113    3      0
Vɹ̩         18     2      4      0
ɻ̩V         0      80     0      2
Vɻ̩         3      0      185    0
ɻ̩ɻ̩         0      5      0      69

Main hypotheses: ɾ↕ = tip-down before and after; ɾ↘ = tip-up before, tip-down after; ɾ↖ = tip-down before, tip-up after; ɾ↔ = tip-up before and after.

Figure 2.19: Distribution of ‘T’ variants based on vowel context before and after the ‘T’, with main hypotheses highlighted underneath.

Wilcoxon signed-rank tests were performed on the data summarized in Figure 2.19. For each of the four ‘T’ variants, the percentage of productions matching that ‘T’ variant in the non-rhotic ‘V’ preceding, [ɹ̩] following (VD) context was calculated and compared with each of the other contexts.
As expected from the descriptive statistics in Figure 2.19, many of these comparisons are significant, and the results can be seen in Table 2.6.

contexts     [ɾ↕] V   p             [ɾ↘] V   p             [ɾ↖] V   p             [ɾ↔] V   p
Vɹ̩ : Vɻ̩     20       p = 0.058     1        p = 1         1        *p < 0.001    0        NA
Vɹ̩ : ɻ̩ɹ̩     15       p = 0.054     1        p = 0.090     6        p = 0.181     0        NA
Vɹ̩ : ɹ̩ɹ̩     20.5     p = 0.777     2.5      p = 1         6        p = 0.181     0        p = 0.181
Vɹ̩ : ɻ̩ɻ̩     15       p = 0.054     2        p = 0.789     6        p = 0.181     0        *p < 0.001
Vɹ̩ : ɹ̩ɻ̩     19       p = 0.089     3        p = 1         7        *p = 0.004    0        p = 0.181
Vɹ̩ : ɻ̩V     15       p = 0.054     1        *p < 0.001    6        p = 0.181     0        p = 0.371
Vɹ̩ : ɹ̩V     15       p = 0.054     0        *p = 0.001    14       p = 0.528     0        NA
Vɹ̩ : VV     6        *p = 0.002    0        p = 1         18       p = 0.068     0        NA

Table 2.6: Wilcoxon signed-rank tests comparing prevalence of ‘T’ variants based on vowel context before and after the ‘T’. * = significant (α = 0.05).

2.4 Discussion

All four ‘T’ variants were quite common overall, as seen in Figure 2.12. These results provide an extreme example of covert categorical subphonemic variation: we demonstrate four easily distinguishable categorical kinematic variations of ‘T’. There are as yet no IPA symbols for these variants, but the four symbols ([ɾ↕], [ɾ↘], [ɾ↖], and [ɾ↔]) provide a useful method for distinguishing the four ‘T’ variants. Participants also preferred flaps over taps, and [ɻ̩] over [ɹ̩]. Where these two goals conflicted, participants favoured the production of [ɻ̩] following a ‘T’, demonstrating look-ahead planning through accommodation of end-state comfort. Here planning is simply defined as a decision between two or more possibilities at a given point in advance of a desired outcome. Most importantly, participants sometimes produced different ‘T’ and ‘R’ variants for repetitions of the same word in the same context, indicating that there are no fixed motor programs or tasks. Each main finding is discussed in more detail below.
2.4.1 Main effect of local context

In single ‘T’ sequences, ‘T’ variants were strongly related to the immediate vocalic context. Figure 2.19 shows that ‘V’, [ɹ̩] vowels and [ɻ̩] vowels all have different influences on the selection of ‘T’ variants. These influences match the expectations outlined in the hypothesis section above, with one notable exception: the initial ‘R’ variant in ‘Berta’ had little or no relationship to the following ‘T’ variant, which was almost always [ɾ↘]. This result contradicts prediction II.b (that for the word ‘Berta’, given an initial [ɹ̩], we predict [ɾ↕]). However, it does fit the end-state comfort hypothesis, and will be discussed in more detail shortly.

2.4.2 Flaps preferred over taps, [ɻ̩] preferred over [ɹ̩]

All four ‘T’ variants were commonly attested. Figure 2.12 shows that flaps were more common than taps, especially compared to the least common [ɾ↔]. All of the participants produced both [ɾ↘] and [ɾ↖], but some participants failed to produce one or both of the taps. While the [ɾ↕] seen in the ‘autumn’ data, as shown in Figure 2.15 and highlighted in Figure 2.20, could be influenced by the tongue position of the initial low back vowel, the evidence from the rest of the phrases provides strong support for the argument that flaps are generally preferred over taps.

[Figure: the VV context (ɾ↕ 160, ɾ↘ 9, ɾ↖ 45, ɾ↔ 0) set against the matrix of possible tongue-tip trajectories.]

Figure 2.20: Distribution of ‘T’ variants in the phrase ‘We have autumn books’ with relation to end-state comfort as presented in the matrix of possible tongue-tip trajectories. The small number of [R ] may be the result of the initial vowel having an extremely low tongue tip position.

We see in Figure 2.15 that in the phrase ‘We have Berta beep’, [ɾ↘] were much more common than [ɾ↕]. Similarly, [ɾ↖] were more common than [ɾ↕] in the phrase ‘We have otter books’.
One factor that may contribute to the preference for flaps over taps is that flaps involve only one direction of motion, whereas taps involve two. Similarly, in flap sequence contexts (such as the phrase ‘edit a’), [R ], [R ] and [R ], [R ] sequences involve one arc of motion, whereas two taps in a row involve two arcs of motion. Therefore, flap sequences involve fewer arcs of motion than tap sequences.

These results could also be interpreted to suggest instead that [ɻ̩] are preferred over [ɹ̩]. Data from the word ‘murder’ show support for both conclusions at the same time. The most common configuration is either [ɻ̩ɾ↔ɻ̩] or [ɹ̩ɾ↖ɻ̩], as seen in Figures 2.18 and 2.15. That is, there is an overall preference for [ɻ̩], and when a speaker does not follow this pattern, they usually produce an [ɾ↖]. That is, their preference for flaps over taps is typically resolved by favouring the final ‘R’ variant over the initial ‘R’ variant.

2.4.3 End-state comfort

The word ‘otter’ almost always has an [ɾ↖], as shown in Figure 2.15, and the final ‘R’ in ‘otter’ is almost always [ɻ̩] in this dataset. However, when the final ‘R’ in ‘otter’ was a [ɹ̩], the ‘T’ was usually an [ɾ↕], fitting our initial prediction.

In contrast, for the word ‘Berta’ the ‘T’ is almost always a [ɾ↘], regardless of whether the ‘R’ variant is a [ɻ̩] or [ɹ̩], as shown in Figure 2.16. This result contradicts prediction II.b: given an initial [ɹ̩], the results were more often [ɾ↘] rather than the predicted [ɾ↕].

Specifically, in the [ɹ̩] cases for ‘Berta’, the tongue tip usually moves upward throughout the production of the ‘R’ and then moves down again for the [ɾ↘]. That is, the ‘T’ variant is influenced by the end-state requirements for the final vowel, as is the transition out of initial ‘R’. These results demonstrate accommodation of end-state comfort, a result that parallels the hand motion choices discussed in Rosenbaum et al.
(1992), and so provides evidence for look-ahead planning for speech. These results demonstrate planning that spans two phonemes within the same word, and further research is needed to see if there is evidence for planning across longer spans. The results are highlighted again against the initial hypotheses in Figure 2.21.

[Figure: the contexts VV (ɾ↕ 160, ɾ↘ 9, ɾ↖ 45, ɾ↔ 0), ɹ̩ɹ̩ (13, 5, 0, 6), Vɹ̩ (18, 2, 4, 0) and ɹ̩V (7, 113, 3, 0) set against the matrix of possible tongue-tip trajectories.]

Figure 2.21: Distribution of ‘T’ variants in the phrase ‘We have Berta beep’ with relation to end-state comfort as presented in the matrix of possible tongue-tip trajectories.

2.4.4 No fixed categories of articulator motion in speech

The main hypothesis for this paper was that participants would use different subphonemic variants of ‘T’ and ‘R’ for the same word in the same phrase, with the same time allotment, across repetitions. Half of the participants (participants 2, 3, 4, 5, 6, 14, 16, 17, 26) changed strategies only occasionally, while the other half changed strategies frequently, particularly for the word ‘murder’. The results show categorical variability in speech production for what would appear to be the same phonological context.

This type and degree of variability in speech has different implications for different theories of motor control. Within a motor theory that relies on muscle commands (Keele, 1968), speakers would have to memorize muscle behaviours for each ‘R’ and ‘T’ variant. To a certain degree, this would also be true in a schema theory (Schmidt, 1975), where patterns of movement are memorized instead of muscle commands.
[Figure: (a) similar constriction location; (b) same constriction location and degree. Labelled elements: postalveolar tap [ɾ↔], up-flap [ɾ↖], down-flap [ɾ↘], alveolar tap [ɾ↕], tongue-tip (TT) tip-up rhotic, TT tip-down rhotic, TT schwa, palatal constriction.]

Figure 2.22: Similar constriction location and degree

In contrast, a dynamical perspective on speech production (Kelso et al., 1986; Saltzman and Byrd, 2000) relies on coordinative structures instead of motor programs. Coordinative structures are assembled and exist only until the task is accomplished, and require no central brain control (Turvey et al., 1982). At their smallest scope, coordinative structures emerge out of a task space, and are based on constriction degree and constriction location, which may or may not be reached for a number of reasons such as speed and stiffness of articulation (see Munhall et al., 2000). In such an analysis, the observed ‘T’ and ‘R’ variants could fall out of constraints on speech production at the time of utterance. Figure 2.22 illustrates how the ‘T’ and ‘R’ variants share roughly the same degree and location of constriction.

However, none of these theories of speech motor control address the reasons why a speaker may produce one articulatory sequence at one time, and a different sequence during a later repetition of the same sentence. In these cases, it appears that the constraints on speech production identified in this paper, such as local context, flap preference over taps (a potential indicator of motor skills), and end-state comfort, may vary in relative importance from utterance to utterance.
While the experiment was designed to reduce the possibility of contextual differences as much as possible, it may be that small changes in circumstances at the time of speech production, such as fatigue or subtle changes in speech rate, can result in categorical differences in motor behaviour. Figure 2.23 illustrates an example, highlighting the potential effects of the importance of avoiding articulatory conflict vs. preferring flaps over taps for the production of the word ‘Berta’ where the initial ‘R’ is the [ɹ̩] variant.

[Figure: two schematic rankings, 1) and 2), of the constraints end-state comfort, articulatory conflict and motor skills for ‘Berta’, each shown with candidate tongue-tip trajectories through [ɹ̩], a ‘T’ variant ([ɾ↕] or [ɾ↘]) and [ʌ].]

Figure 2.23: Schematic tongue tip trajectory showing how differences in outcomes, even in the same speaker and context, can result from different weighting of constraints. Illustrative example taken from ‘Berta’.

The importance of a constraint like end-state comfort, which requires some kind of ‘awareness’ of the desired end-state, is that there must be planning involved in each speech act. Therefore the results of this research support a theory of speech motor control that takes into account subphonemic planning as choice and skills-based anticipation, storage of tasks used in speech acts, and self-assembled coordinative structures in speech.

Nevertheless, more research is needed to uncover the span of end-state comfort effects. The current evidence for the end-state comfort effect is based on very local constraints: the transition before a ‘T’ is compromised to achieve the desired tongue position for the vowel after the same ‘T’. Therefore more research is needed to confirm the importance of low-level planning. Similarly, both the number of potential constraints and the reasons for shifts in the importance of these constraints remain to be identified.
Chapter 3

Three phonological segments, one motor event: Evidence for speech-motor disparity from English flap production

3.1 Introduction

A pervasive assumption in speech motor behaviour is that units of speech production systematically correspond to actions. This is true despite widely varying views among researchers concerning the definition of a speech action and the units of speech production to which they match. The notion of contrastive features originally included acoustic and articulatory targets (Jakobson et al., 1951). While features have been largely described in articulatory terms since the writing of The Sound Pattern of English (Chomsky and Halle, 1968), features are often still thought of as an interface between intermediate linguistic units like words and syllables, and physical subunits like articulatory targets and motor commands (Meyer and Gordon, 1985). Associating larger units such as phonemes with distinct speech actions has been even more complicated due to the difficulty in defining the term phoneme, and the idea that phonemes are composed of differing numbers of features.

In part to address these issues, Browman and Goldstein (1986, 1989, 1992) argued for a different phonological unit, the gesture. Gestures are primitive phonological units that are neither features nor phonemes, but sometimes appear to correspond to one or the other (Browman and Goldstein, 1992). In this theory of articulatory phonology, each gesture corresponds to a coordinative structure, a soft-assembled speech action.

In North American and other varieties of English, when uttering ‘Saturday’, there are typically two critical movements of the tip of the tongue, an upward movement and a downward movement. We argue that, during fluent speech, while the upward movement is produced through muscle activation, the downward movement occurs automatically due to gravity and elasticity.
In Chapter 2, we identified four subphonemic categorical kinematic variations of ‘T’ that vary based on how the tongue tip approaches and leaves the alveolar ridge. The first is an [ɾ↕], in which the tongue moves from below the alveolar ridge upwards, makes contact and moves back down into position for the following vowel. The second is a [ɾ↘], in which the tongue moves from above the alveolar ridge, makes contact, and continues downwards below the alveolar ridge. The third is an [ɾ↖], in which the tongue moves from below the alveolar ridge, makes contact, and continues upward into a position above the alveolar ridge. The fourth is a [ɾ↔], in which the tongue moves from above the alveolar ridge to a point at or above the ridge horizontally and back to a position above the alveolar ridge.

We also know that there is a strong relationship between ‘T’ variants and tongue tip position of the vowels before and after a ‘T’. This tongue tip position is dependent upon whether surrounding vowels are ‘R’ (rhotic, as in the word ‘herd’), or ‘V’ (non-rhotic, as in the word ‘heed’, or ‘had’). In the case of a ‘T’ with a ‘V’ preceding and a [ɻ̩] following (as in the word ‘otter’), there is a higher likelihood of [ɾ↖] occurring in single ‘T’ words. There is also a higher likelihood of [ɻ̩] in such words than in similar words without ‘T’ (as in the word ‘offer’). In the case of a ‘T’ with an ‘R’ preceding and a ‘V’ following (as in the word ‘Berta’), there is a higher likelihood of [ɾ↘] occurring in single ‘T’ words, as seen in Chapter 2.

So, assuming that the ‘R’ in ‘Saturday’ is a [ɻ̩], we expect a high likelihood of [ɾ↖], [ɾ↘] sequences, producing [sæɾ↖ɻ̩ɾ↘eɪ]. The ‘T’ variant [ɾ↘] is unique among the four variants in that it involves motion that takes advantage of gravity and elasticity. Down-flaps take advantage of gravitational forces because, while a speaker is standing or sitting, the tongue tip moves from a high position to a low position.
They take advantage of elasticity because the tongue moves towards a shape similar to that of the speech rest position (Gick et al., 2004). The human nervous system does not completely compensate for the effects of gravitational load on speech; for example, jaw motion during speech differs based on whether a speaker is prone (face down) or supine (face up) (Shiller et al., 1999). These results also show that tongue motion does not entirely compensate in place of jaw motion, as evidenced by differences in measurements of F1 and F2 during vowel production in prone and supine position. The evidence demonstrates that actions depend on assumptions about the direction of gravity, and such actions are slow to respond to changes in gravitational direction. Similarly, despite the fact that the tongue is a highly flexible muscular hydrostat, muscle structure and elasticity of the tongue play an important role in tongue motion trajectories during speech. Perrier et al. (2003) have provided experimental and 2D finite element method (FEM) vocal tract simulation-based evidence that tissue elasticity factors in the motions of vocal tract articulators during the production of velar stops. FEM is a well known computational technique for calculating the effect, or distribution, of stress within a structure to which stress was applied, and is therefore useful for modelling muscle, cartilage and bone. In their example, much of the forward looping pattern of velar stop production in VCV sequences is based on the anatomical structure of the tongue such that planning may be based on target sequence as much or more than trajectory motion. This suggests that the planning system incorporates information about the structure and elasticity of the anatomy. 
Based on the potential effects of gravity and elasticity on articulator motion and planning, it is reasonable to expect that both forces contribute to the production of [ɾ↘] by contributing to the lowering of the tongue tip from an initial high position above the alveolar ridge. In the case of the word ‘Saturday’, as mentioned before, we would expect an [ɾ↖], [ɾ↘] sequence. This is especially true since the tongue tip must move upward from the initial ‘V’ to the alveolar ridge for the ‘T’, and since ‘R’ can be produced with the tongue tip up [ɻ̩] or down [ɹ̩], it requires less motion just to continue moving the tongue tip up into position for a retroflex [ɻ̩]. In short, we expect that [ɾ↖], [ɾ↘] sequences are preferred in ‘VTRTV’ sequences in order to take advantage of the added efficiency.

Particularly from a retroflex [ɻ̩] position, it may then be simply a matter of relaxing the muscles to allow the tongue tip to fall back down, gravity and elasticity thereby completing the closure for a [ɾ↘]. We therefore propose that the first movement in an [ɾ↖], [ɾ↘] sequence is mediated by muscle control, whereas the second movement emerges automatically due to the effects of gravity and elasticity. If this hypothesis is correct, it will show that one motor action can encompass the production of three segments spanning a syllable boundary, indicating disparity between phonological representation and actions.

We expect that for most speakers most of the time, there will be one arc of motion, as in [ɾ↖ɻ̩ɾ↘], instead of two, as in [ɾ↕ɹ̩ɾ↕], in production sequences for the word ‘Saturday’. One arc of motion vs. two is illustrated in Figure 3.1.

[Figure: ‘Saturday’ tongue-tip trajectories, æ ɾ↖ ɻ̩ ɾ↘ eɪ vs. æ ɾ↕ ɹ̩ ɾ↕ eɪ.]

Figure 3.1: Schematic tongue tip trajectory showing ‘Saturday’: [ɾ↖ɻ̩ɾ↘], with one arc of tongue-tip motion, vs. [ɾ↕ɹ̩ɾ↕], with 2 arcs of tongue-tip motion.
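The contrast between one and two arcs of motion can be made concrete with a toy encoding. The sketch below is only an illustration: the three height codes and the function `count_arcs` are assumptions introduced here, not measurements or methods from the study. It counts an arc each time a rise in tongue-tip height is followed by a fall.

```python
def count_arcs(heights):
    """Count up-then-down arcs in a sequence of coded tongue-tip heights."""
    arcs = 0
    rising = False
    for prev, cur in zip(heights, heights[1:]):
        if cur > prev:
            rising = True
        elif cur < prev and rising:
            arcs += 1       # a rise followed by a fall closes one arc
            rising = False
    return arcs


# Assumed height codes: 0 = below the alveolar ridge,
# 1 = contact at the ridge, 2 = above the ridge.
one_arc = [0, 1, 2, 1, 0]    # vowel, up-flap, tip-up rhotic, down-flap, vowel
two_arcs = [0, 1, 0, 1, 0]   # vowel, tap, tip-down rhotic, tap, vowel

print(count_arcs(one_arc), count_arcs(two_arcs))
```

Under this coding the up-flap, tip-up rhotic, down-flap sequence yields a single arc, while the tap, tip-down rhotic, tap sequence yields two.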
Specifically, this hypothesis leads to two predictions. The first is that there will be more instances of [ɻ̩] in ‘Saturday’ than in a similar word without ‘T’s, like ‘peppermint’. The second is that ‘Saturday’ will usually be produced with [ɾ↖], [ɾ↘] sequences. This pattern should be stable across repetitions and speakers.

However, documenting the disproportionate use of stable [ɾ↖], [ɾ↘] sequences for the production of ‘Saturday’ is not enough. Figure 3.2 shows how one arc of motion in ‘Saturday’ (left, [ɾ↖ɻ̩ɾ↘]) might be produced with either one or two sets of muscle contractions. In comparison, the alternative production (right, [ɾ↕ɹ̩ɾ↕]) would always require two sets of muscle contractions.

[Figure: ‘Saturday’ as æ ɾ↖ ɻ̩ ɾ↘ eɪ (1 arc of motion; 1 or 2 muscle activation groups) vs. æ ɾ↕ ɹ̩ ɾ↕ eɪ (2 arcs of motion; 2 muscle activation groups), with the potential underlying muscle control shown beneath.]

Figure 3.2: Schematic of possible underlying patterns of muscle contractions for production of the tongue tip motions in the word ‘Saturday’.

We therefore need to determine whether gravity and myoelasticity can, in principle, complete a [ɾ↘] closure, and complete it fast enough to produce a ‘T’ instead of a stop. Ideally, we might test this hypothesis via electromyography (EMG) to see if muscle activity involved in the production of the [ɾ↖] and tongue tip position of the following retroflex [ɻ̩] ended before or during the production of the [ɾ↘]. Aside from evidence based on the tongue muscle vectors (Abd-El-Malek, 1939; Miyawaki, 1974), surface EMG studies have been used to identify the tongue muscles used in the production of vowels (MacNeilage and Sholes, 1964), and identify regions of muscle control (Miyawaki et al., 1975).
However, [ɾ↖] and retroflex [ɻ̩] are produced using tongue muscles such as the styloglossus (STY) that interdigitate extensively with other muscles that would likely not participate, or would add confounding activations, such as the inferior longitudinal (IL), palatoglossus (PG), transversus (TRANS), verticalis (VERT), and hyoglossus (HG), as well as muscles that would act as synergists, such as the posterior genioglossus (GGP), and medial genioglossus (GGM) (Abd-El-Malek, 1939; Saito and Itoh, 2007). Worse, in this analysis we expect at least some of the [ɾ↘] to take place without activation of either the IL or the anterior genioglossus (GGA) muscle, both of which could be used to actively produce a [ɾ↘]. However, activations of any nearby muscles for lateral tongue bracing could corrupt the analysis. In short, the complex interdigitation of muscles within the body of the tongue makes EMG a less desirable option. Computer simulations provide another tool for understanding muscle activation, as described in Section 3.2.3. Below we present our experiments, followed by our simulations, in the same order as the introduction above.

3.2 Experiment

The use of ultrasound imaging to look at midsagittal slices of the tongue (B-mode) along with three one-dimensional slices that cut through the tip and blade of the tongue (M-mode) can provide indirect evidence of tongue-tip motion in rapid sequences. Using a narrow transducer placed against the skin near the angle of the neck, B-mode provides a low-speed image (30 frames per second [fps]) of the overall shape of the midsagittal surface of the tongue from the root to the tip. M-mode ultrasound provides high-speed trajectories (60-120 fps) of the direction of tongue motion through fixed cross-sections in the vocal tract.

We expect that ‘R’ in the word ‘Saturday’ would be more likely to be a [ɻ̩] than ‘R’ in the word ‘peppermint’, similar to the situation with the word ‘otter’ as opposed to ‘offer’ from Chapter 2.
We also expect most instances of initial [ɾ↖] to be followed by [ɻ̩], whereas we would expect most instances of initial [ɾ↕] to be followed by [ɹ̩].

Similarly, we expect most instances of [ɻ̩] to be followed by [ɾ↘], and most instances of [ɹ̩] to be followed by [ɾ↕].

As a result of this strong preponderance of [ɻ̩], we expect that most of the sequences in ‘Saturday’ will be [ɾ↖ɻ̩ɾ↘] sequences, as per our single action hypothesis. Most of the rest should be [ɾ↕ɹ̩ɾ↕] sequences, produced by three distinct actions.

3.2.1 Experiment methods

Eighteen native speakers of North American English between the ages of 18 and 40 participated in the study. All participants had normal speaking and hearing. Participants were seated in a customized American Optical Co. model 507-a (1953) ophthalmic chair with a 2-cup rear headrest adjusted to contact the base of the skull just above the neck. A UST-9118 EV 180 electronic curved array ultrasound probe was placed under the chin. The probe has a variable frequency range of 3-9.0 MHz with an average slice thickness of the tissue viewed with this probe of approximately 3 mm (Medicines and Healthcare products Regulatory Agency, 2004). The probe was attached to an Aloka ProSound SSD-5000 ultrasound machine connected via s-video cable (marked video IN) to a Canopus ADVC-110 advanced digital video recorder. A Sennheiser MKH-416 short shotgun microphone was mounted on a microphone stand and aimed at the participant about 30 cm away from their mouth. The microphone was plugged into an M-Audio DMP3 pre-amplifier via XLR balanced cable and out with an unbalanced RCA cable to the Canopus card to guarantee time synchronization between the ultrasound and audio output. The Canopus card was connected via FireWire to a MacPro Quad Core 2.8 GHz computer. An LCD monitor was mounted on the ophthalmic chair’s monitor mount and placed in front of the participant.
A computer containing the experiment stimuli presentation software was connected to the LCD monitor so that the participant could easily read the stimuli from the screen.

The ultrasound machine was set up in B/M mode and aligned to the acoustic signal. B-mode ultrasound was used to capture 2-dimensional images of the midsagittal plane of the tongue at 30 fps. The M-mode (motion mode) ultrasound provided a progressive scan of three selected one-dimensional lines accessible from an ultrasound probe. These one-dimensional M-mode lines follow the line of the palate, in the region of intercept with the blade/tip of the tongue. Because M-mode ultrasound is a progressive scan, it presents the motion data at the full capture rate of the ultrasound probe, which ranged from 60-100 Hz depending on the depth of the scan. While this motion is not connected to any specific flesh-point, it allows capture of the general direction of motion of the front of the tongue, which is ideal for identifying the ‘T’ variants described above. At the same time, the B-mode ultrasound allows examination of the midsagittal plane of the tongue surface at 30 fps, which along with the M-mode data allowed identification of the ‘R’ variants described above.

Tokens were selected to contain single ‘T’ or sequences of two ‘T’s within consecutive syllables. Data were collected on 17 control sentences, 9 sentences with 1 ‘T’, 10 sentences with double ‘T’ sequences, and 2 sentences with triple ‘T’ sequences, for a total of 38 unique sequences. The sentences were randomized for each of 12 blocks, giving a total of 456 stimulus sentences. The stimuli were presented using PXlabRT such that each sentence was displayed on an LCD screen for 2.2 seconds. The software automatically paused the experiment after the first 6 blocks to allow participants to swallow some water or take a short break if needed. Each set of 6 blocks took 9 minutes, for a total of 18 minutes recording time.
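The stimulus counts and sampling rates above can be checked with a few lines of arithmetic. This is a sketch using only the figures stated in the text; the variable and function names are illustrative. It also converts the stated frame rates into the time between successive frames, which bounds how finely each mode can resolve a rapid ‘T’ closure.

```python
# Sentence counts as stated in the text.
sentence_counts = {"control": 17, "single 'T'": 9, "double 'T'": 10, "triple 'T'": 2}
blocks = 12
blocks_per_half = 6
display_seconds = 2.2

unique_sentences = sum(sentence_counts.values())
total_stimuli = unique_sentences * blocks
# Display time alone for one set of six blocks, in minutes.
half_display_minutes = unique_sentences * blocks_per_half * display_seconds / 60


def frame_interval_ms(rate_hz):
    """Time between successive frames, in milliseconds."""
    return 1000.0 / rate_hz


print(unique_sentences, total_stimuli)
print(round(half_display_minutes, 2))
print(round(frame_interval_ms(30), 1))                                      # B-mode
print(round(frame_interval_ms(60), 1), round(frame_interval_ms(100), 1))    # M-mode range
```

The display time alone comes to a little under eight and a half minutes per set of six blocks, which is in line with the reported 9 minutes. The source does not say what accounts for the remainder.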
This report is based on the data collected in this larger experiment, but this section focuses on some of the double ‘T’ sequence data, as shown in Table 3.1.

Token  Word/Phrase  Carrier Phrase           ‘T’s  Syl.  Context
1      Saturday     We have Saturday off     2     3     ‘VRV’
2      peppermint   We have peppermint now   0     3     ‘VRV’

Table 3.1: double ‘T’ list

For all instances of ‘T’ in the dataset, the kinematic variant of the ‘T’ was identified using the motion of the tongue front captured in the B/M-mode ultrasound data synchronized with the audio signal, as described in the introduction and in Chapter 2. The tongue positions of the ‘R’ were also identified by examining the tongue position at vowel midpoints, as seen in the B-mode ultrasound data, and coded as to whether the vowel was [ɻ̩] or [ɹ̩]. The ‘T’ closure times were also identified as the points of lowest amplitude, and the duration, in milliseconds, between the first and second ‘T’ in phrases 1 and 2 was recorded. The ‘R’ and ‘T’ variants in all four tokens were used to test the hypothesis.

3.2.2  Experiment results

As seen in Figure 3.3, six of the 18 participants (3, 4, 6, 8, 16, 21) produced [ɹ̩] exclusively in the phrase ‘We have peppermint now’, and [ɻ̩] exclusively in the phrase ‘We have Saturday off’. Six others showed a similar pattern with most of their tokens (2, 5, 12, 13, 17, 18). Three other participants usually produced [ɻ̩] in both conditions (9, 10, 26). One participant produced [ɹ̩] in both conditions (23), and the last two were mixed in both conditions (14, 15).
(a) By participant

Sbj.    Saturday [ɹ̩]  Saturday [ɻ̩]  peppermint [ɹ̩]  peppermint [ɻ̩]
2       1              11             12               0
3       0              12             12               0
4       0              12             11               0
5       0              12             9                3
6       0              12             12               0
8       0              10             12               0
9       0              12             1                10
10      0              12             0                12
12      0              12             10               2
13      0              12             6                6
14      3              9              0                9
15      3              9              4                8
16      0              11             11               0
17      0              12             3                9
18      2              10             11               1
21      0              12             12               0
23      11             1              12               0
26      0              12             1                11

(b) Summary

Total   20             193            139              71

Figure 3.3: Distribution of final ‘R’ variants by phrase (‘We have Saturday off’ vs. the control phrase ‘We have peppermint now’).

Wilcoxon signed-rank tests were performed on the data summarized in Figure 3.3. For each of the two ‘R’ variants, the percentage of productions with that tongue-tip position was compared between the words ‘Saturday’ and ‘peppermint’. As expected from the descriptive statistics in Figure 3.3, the results are significant, as seen in Table 3.2.

contexts                  [ɻ̩] V   p            [ɹ̩] V   p
Peppermint vs. Saturday   147.5    *p < 0.001   5.5      *p < 0.001

Table 3.2: Wilcoxon signed-rank tests comparing prevalence of ‘R’ variants in ‘Saturday’ vs. ‘Peppermint’. U = [ɻ̩], D = [ɹ̩]. * = significant (α = 0.05).
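The V statistics in Table 3.2 can be recomputed from the per-participant counts in Figure 3.3. The following is a minimal sketch, assuming the usual convention that zero differences are dropped and tied absolute differences share an average rank; the text does not state which software or settings were used, and the function name is illustrative. Exact fractions are used so that ties are detected reliably. The sketch computes only the rank sums, not the p-values.

```python
from fractions import Fraction

# Counts read from Figure 3.3, per participant:
# (Saturday tip-down, Saturday tip-up, peppermint tip-down, peppermint tip-up)
COUNTS = {
    2: (1, 11, 12, 0), 3: (0, 12, 12, 0), 4: (0, 12, 11, 0),
    5: (0, 12, 9, 3), 6: (0, 12, 12, 0), 8: (0, 10, 12, 0),
    9: (0, 12, 1, 10), 10: (0, 12, 0, 12), 12: (0, 12, 10, 2),
    13: (0, 12, 6, 6), 14: (3, 9, 0, 9), 15: (3, 9, 4, 8),
    16: (0, 11, 11, 0), 17: (0, 12, 3, 9), 18: (2, 10, 11, 1),
    21: (0, 12, 12, 0), 23: (11, 1, 12, 0), 26: (0, 12, 1, 11),
}


def signed_rank_sums(diffs):
    """Return (positive rank sum, negative rank sum).

    Assumed convention: zero differences are dropped and tied absolute
    differences receive the mean of the ranks they span.
    """
    nonzero = sorted((d for d in diffs if d != 0), key=abs)
    pos = neg = Fraction(0)
    i = 0
    while i < len(nonzero):
        j = i
        while j < len(nonzero) and abs(nonzero[j]) == abs(nonzero[i]):
            j += 1
        avg_rank = Fraction(i + 1 + j, 2)  # mean of ranks i+1 .. j
        for d in nonzero[i:j]:
            if d > 0:
                pos += avg_rank
            else:
                neg += avg_rank
        i = j
    return pos, neg


# Difference in proportion of tip-up tokens: 'Saturday' minus 'peppermint'.
tip_up_diffs = [
    Fraction(su, sd + su) - Fraction(mu, md + mu)
    for sd, su, md, mu in COUNTS.values()
]
v_pos, v_neg = signed_rank_sums(tip_up_diffs)
print(float(v_pos), float(v_neg))  # 147.5 5.5
```

Because the tip-down proportion is the complement of the tip-up proportion for every participant, the two rank sums simply swap roles for the tip-down comparison, which is why Table 3.2 reports 147.5 for one variant and 5.5 for the other.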
Most initial ‘T’ variants in ‘Saturday’ were [ɾ↖], 191 out of 213, as seen in Figure 3.4. Of these, 186 were followed by [ɻ̩]. In contrast, of the 20 tokens of ‘Saturday’ with [ɹ̩], 15 were [ɾ↕].

Initial ‘T’   rhotic vowel: U   rhotic vowel: D
[ɾ↕]          7                 15
[ɾ↘]          0                 0
[ɾ↖]          186               5
[ɾ↔]          0                 0

Figure 3.4: Distribution of initial ‘T’ type in the word ‘Saturday’ based on ‘R’ variant (in ‘ur’).

Wilcoxon signed-rank tests were performed on the data summarized in Figure 3.4. For each of the four ‘T’ variants, the percentage of productions matching that ‘T’ variant was compared across the two ‘R’ variants in ‘Saturday’. As expected from the descriptive statistics in Figure 3.4, the results are significant for [ɾ↖], as seen in Table 3.3.

contexts       [ɾ↕] V   p           [ɾ↘] V   p    [ɾ↖] V   p            [ɾ↔] V   p
[ɻ̩] : [ɹ̩]    12       p = 0.279   0        NA   1        *p < 0.001   0        NA

Table 3.3: Wilcoxon signed-rank tests comparing prevalence of initial ‘T’ variant in the word ‘Saturday’ based on ‘R’ variant. * = significant (α = 0.05).

Most of the final ‘T’s in ‘Saturday’ were [ɾ↘], 194 out of 213. Of these, 187 were preceded by [ɻ̩]. In contrast, of the 20 tokens of ‘Saturday’ with [ɹ̩], 13 were [ɾ↕].

Final ‘T’     rhotic vowel: U   rhotic vowel: D
[ɾ↕]          0                 13
[ɾ↘]          187               7
[ɾ↖]          1                 0
[ɾ↔]          5                 0

Figure 3.5: Distribution of final ‘T’ variants in the word ‘Saturday’ based on ‘R’ variant (in ‘ur’).

Wilcoxon signed-rank tests were performed on the data summarized in Figure 3.5. For each of the four ‘T’ variants, the percentage of productions matching that ‘T’ variant was compared across the two ‘R’ variants in ‘Saturday’. As expected from the descriptive statistics in Figure 3.5, the results are significant for [ɾ↘], as seen in Table 3.4.

contexts       [ɾ↕] V   p           [ɾ↘] V   p            [ɾ↖] V   p       [ɾ↔] V   p
[ɻ̩] : [ɹ̩]    10       p = 0.098   0        *p < 0.001   0        p = 1   0        p = 0.181

Table 3.4: Wilcoxon signed-rank tests comparing prevalence of final ‘T’ variant in the word ‘Saturday’ based on ‘R’ variant. * = significant (α = 0.05).
The results also show that of the 213 ‘T’ sequences among the 18 participants of this study, 180 were [ɾ↖], [ɻ̩], [ɾ↘] sequences, representing 84.5% of the sequences, as seen in Figure 3.6.

(a) Flap sequence ‘Saturday’, tip-up rhotic [ɻ̩] (initial ‘T’, final ‘T’: count)
[ɾ↖], [ɾ↘]: 180    [ɾ↖], [ɾ↖]: 1    [ɾ↖], [ɾ↔]: 5    [ɾ↕], [ɾ↘]: 7    all other combinations: 0

(b) Flap sequence ‘Saturday’, tip-down rhotic [ɹ̩] (initial ‘T’, final ‘T’: count)
[ɾ↕], [ɾ↕]: 13    [ɾ↕], [ɾ↘]: 2    [ɾ↖], [ɾ↘]: 5    all other combinations: 0

Figure 3.6: Flap sequences in ‘Saturday’ based on the ‘R’ variant. X axis lists the initial ‘T’, Y axis lists the final ‘T’.

3.2.3  Simulation

Biomechanical simulation is well suited for this study as it characterizes the mechanics of a biological system, i.e. how the forces within the system interact in order to generate observed movements. ArtiSynth is a biomechanics simulation toolkit, targeted toward modelling and simulation of the human vocal tract (Fels et al., 2009, 2006, 2003). Recently, a model of coupled jaw-tongue-hyoid dynamics has been developed within the ArtiSynth framework (Stavness et al., 2010). This jaw-tongue-hyoid-palate (JTHP) model was built from reference tongue and jaw models.

The reference tongue model (Buchaillard et al., 2009) was originally based on the work of Gérard et al. (2006, 2003), who themselves used data from the Visible Human Project® and work by Wilhelms-Tricarico (2000). In mapping the muscle groups and hyoid bone connections for their model, the authors used CT, MRI, and X-ray data to match the characteristics of an exemplar male. Muscle fibres are embedded within the tongue body to represent the muscle structure of the tongue. Takemoto (2001) provides a thorough description of the tongue musculature that forms the foundation of this model. Tissue properties originally derived from fresh cadavers were modified to match living tissue (Gérard et al., 2005).
The tongue model uses the finite element method (FEM) to represent the non-linear, large deformation tissue properties of the tongue. Muscle control is based in part on EMG studies of the tongue which argue for partially independent control of parts of the genioglossus (Miyawaki et al., 1975), but due to the historical difficulty in identifying motor units that control parts of tongue muscles (see Slaughter et al., 2005), the authors rely heavily on the anatomical structures of the muscles themselves for their control groups. Buchaillard et al. (2009) also used this model to test the effects of gravity on vowel production, and this capability has been preserved in the JTHP model.

The reference jaw model (Hannam et al., 2008) is composed of rigid body components for the skeletal structures (cranium, mandible, hyoid bone) connected with point-to-point Hill-type muscles and has been used to analyze forces during unilateral chewing. The model itself is based on decades of research into EMG readings of jaw muscles (see Peck et al., 2000), integrated into a hyoid and larynx model (Stavness et al., 2006), and is registered to high resolution computed tomography (CT) scans of an exemplar male.

The JTHP model is composed from these reference models for the tongue and jaw, and adapted to fit the anatomy of a single speaker (Stavness et al., 2010). It includes bone structures, a deformable tongue model, muscle forces, dynamic coupling (tongue muscle forces act on the jaw and vice versa), and contact (tongue-jaw and tongue-palate). The JTHP includes a general mechanism for attaching rigid and deformable bodies, a collision detection system based on intersection contours between surface meshes, full coupling between the FEM tongue model and the jaw-hyoid dynamics, and real time physics simulation. Simulations reported for the coupled JTHP model have shown a wide range of plausible speech and chewing motions.
Given the decades of careful research throughout the history of the JTHP model and its reference models, we believe it is highly suitable for our simulation needs. Here, we use this model to investigate the effect of muscle forces, elasticity, and gravity on [ɾ↘] closure.

We expect certain muscles to participate in the formation of an [ɾ↖], such as the muscles for raising the jaw, the superior longitudinal (SL) muscle for curling up the tongue tip, the GGP and GGM for advancing the tongue tip and body, and the TRANS for narrowing the tongue and elevating the surface. We also found the STY was necessary to retract the tongue sufficiently to allow the production of a retroflex ‘r’. For the production of [ɾ↖], we expect that contracting the muscles above will lead to tongue-tip motion upward, contacting the alveolar ridge and pulling away into a tip-up (retroflex) position. Potential antagonists for a [ɾ↘] include the GGA and IL muscles for lowering the tongue tip. However, we do not expect these muscles to be needed to produce a [ɾ↘].

We used the JTHP model to test whether an [ɾ↖], [ɾ↘] sequence can be produced with muscle activations for the [ɾ↖] only, and we used the JTHP model to create an [ɾ↖], [ɾ↘] sequence via direct activation of muscles for both flaps. For the active simulation, we demonstrate that an [ɾ↖] motion into a [ɻ̩] can be generated with one set of muscle contractions, and the [ɾ↘] can be generated with a second set of muscle contractions.

For the passive simulation, we expect an [ɾ↖] will occur during or just after completion of muscle activations for the [ɾ↖], and the [ɾ↘] will occur during or just after muscle deactivation. The duration between the [ɾ↖] and [ɾ↘] may be due to either the strength of the initial muscle activations, or the length of time for which the muscle activations were sustained.
The [ɾ↘] will occur more slowly, and possibly at a different tongue-contact point than in the active model, but still fast enough to be a flap and not a stop. This passive [ɾ↘] will be distinguishable from the active model [ɾ↘] because of a slight increase in duration between the flaps. Nevertheless, we expect the differences to be subtle enough that it is difficult for us to imagine identifying them in human experiments without EMG recordings.

3.2.4  Simulation methods

To test the two simulation hypotheses above, we created two simulation models. Input probes were created for the JTHP model in order to simulate an [ɾ↖] followed by the tongue-tip position for a retroflex ‘r’. For both models, the jaw positioning and [ɾ↖] muscle activations were the same. Bilateral jaw closers (the masseter, temporalis, and medial pterygoid muscles) were programmed to move the jaw into position for speech.

For the [ɾ↖] muscle activations, the SL probe was set to 33.5% of maximum, TRANS to 33% of maximum, the GGP to 60% of maximum, and GGM to 32% of maximum. All of these probes were set to activate over a 25 ms period, be sustained for 110 more ms, and be relaxed to 0% over a period of 25 more ms. These muscles were used to create the [ɾ↖] motion of the tongue tip. The STY probe was also activated to 29% of maximum, starting 20 ms after the activation of the other muscles, over a 25 ms period, sustained over a 70 ms period, and relaxed over a 25 ms period. The STY muscle was used to pull the tongue away from the alveolar ridge into a retroflex ‘r’ position. These very specific activations were generated from well-known ideas about how the tongue tip is raised, and from careful hand-tuning of the JTHP system.
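The probe settings above amount to ramp-hold-ramp activation profiles. The sketch below writes them out as plain functions of time. It assumes linear ramps and an onset at t = 0 ms for the tongue-tip raising probes, neither of which is stated in the text, and it is not ArtiSynth code; `trapezoid`, `PROBES`, and `activations` are hypothetical names.

```python
def trapezoid(t, onset, ramp_up, hold, ramp_down, peak):
    """Activation (fraction of maximum) at time t in ms.

    Assumes linear ramps: rise over ramp_up, stay at peak for hold,
    fall back to zero over ramp_down.
    """
    t1 = onset + ramp_up       # end of rise
    t2 = t1 + hold             # end of plateau
    t3 = t2 + ramp_down        # end of relaxation
    if t <= onset or t >= t3:
        return 0.0
    if t < t1:
        return peak * (t - onset) / ramp_up
    if t <= t2:
        return peak
    return peak * (t3 - t) / ramp_down


# Peak levels and durations as reported; onset=0 is an assumption.
PROBES = {
    "SL":    dict(onset=0, ramp_up=25, hold=110, ramp_down=25, peak=0.335),
    "TRANS": dict(onset=0, ramp_up=25, hold=110, ramp_down=25, peak=0.33),
    "GGP":   dict(onset=0, ramp_up=25, hold=110, ramp_down=25, peak=0.60),
    "GGM":   dict(onset=0, ramp_up=25, hold=110, ramp_down=25, peak=0.32),
    "STY":   dict(onset=20, ramp_up=25, hold=70, ramp_down=25, peak=0.29),
}


def activations(t):
    """Activation of every probe at time t (ms)."""
    return {name: trapezoid(t, **params) for name, params in PROBES.items()}


for t in (0, 25, 45, 115, 135, 160):
    print(t, activations(t))
```

Under these assumptions the tongue-tip raising probes are fully relaxed at 160 ms, and the STY probe plateaus from 45 ms to 115 ms and is fully relaxed at 140 ms.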
The JTHP models were then run with the above input probes, and the position of the tongue tip was recorded from the beginning of activation until 55 ms after the relaxation of the SL, TRANS, GGP, and GGM ([ɾ↖]) probes in order to see if the tongue moved through a [ɾ↘] while the muscles were relaxing. The passive simulation involved no other muscle activations. For the active model, the [ɾ↘] muscle activations involved two muscles. The GGA and the IL were set to activate to 30% of maximum over 25 ms, and then deactivate over 25 ms, for a total of 50 ms of activation. These constitute the [ɾ↘] probes. In the active simulation, the [ɾ↘] probes activate while the [ɾ↖] probes are deactivating, such that they reach full activation just as all the [ɾ↖] probes are fully deactivated.

3.2.5  Simulation results

The results of the simulations for the active and passive models are presented below. These include the timings of the [ɾ↖] contact, the mid-point of the ‘R’, the [ɾ↘] contact, and the mid-point of the final vowel, all in relation to the muscle activations.

Active simulation

The active simulation showed that the [ɾ↖] and the time at which the tongue is farthest away from the alveolar ridge remain the same. However, the [ɾ↘] muscle activations, which overlap the passive production of the [ɾ↘], cause the [ɾ↘] to occur 10 ms sooner, and with a contact point slightly anterior to the passive [ɾ↘] contact point, as seen in Figure 3.7.

[Figure 3.7 plot: time axis from -5 to 215 ms; up-flap muscles GGP, GGM, STY, TRANS, SL; down-flap muscles GGA, IL; up-flap and down-flap contacts marked; labelled time points 0, 45, 115, 145, 185 ms.]

Figure 3.7: Active model: Tongue tip positions in relation to ArtiSynth muscle activations with active [ɾ↖] and [ɾ↘] muscle activations.
Passive simulation

The passive simulation showed that the [ɾ↖] and the time at which the tongue is farthest away from the alveolar ridge are generated through active muscle control, but the [ɾ↘] takes place during relaxation of the same muscles; that is, it results from the passive elasticity and gravitational forces in the model. The results are seen in Figure 3.8.

[Figure 3.8 plot: time axis from -5 to 215 ms; up-flap muscles GGP, GGM, STY, TRANS, SL; down-flap muscles GGA, IL; up-flap and down-flap contacts marked; labelled time points 0, 45, 115, 155, 215 ms.]

Figure 3.8: Passive model: Tongue tip positions in relation to ArtiSynth muscle activations with [ɾ↖] muscle activations, and no [ɾ↘] muscle activations.

The simulation can be programmed to provide shorter and longer retroflex ‘r’ durations based on the strength and/or length of muscle contractions. Stronger contractions lead to more pronounced retroflexion and longer tongue-tip contact durations during the [ɾ↖], but are otherwise similar to the simulations above.

3.3  Discussion

The results of the experiments support the hypothesis that speakers prefer the [ɾ↖ ɻ̩ ɾ↘] sequence in ‘Saturday’: there are 185 [ɾ↖], [ɾ↘] sequences recorded out of 213 tokens for the word ‘Saturday’. The ‘R’ in the word ‘Saturday’ is significantly more likely to be [ɻ̩] (193 out of 213) than the ones in the control phrase ‘peppermint’ (71 out of 210), similar to the results of the comparison of the words ‘otter’ and ‘offer’ in Chapter 2. The results therefore show that the vast majority of ‘Saturday’ tokens were produced with an [ɾ↖ ɻ̩ ɾ↘] sequence, a result we expected if the whole sequence is due to one action.

The results of the simulations support the simulation hypotheses that both active and passive control of a [ɾ↘] are possible.
The JTHP simulation can be programmed to produce an [ɾ↖] into a [ɻ̩] with a cluster of muscle activations; relaxing these muscles allows the tongue to passively move downward and forward, producing a [ɾ↘] and supporting the passive [ɾ↘] hypothesis. The gravitational force allowed the [ɾ↘] contact to move quickly enough that total occlusion of the alveolar ridge lasted less than 10 ms, about the duration of a ‘T’ contact. The myoelastic properties of the tongue allowed the tongue tip to make appropriate contact with the alveolar ridge. The active model produced the [ɾ↘] in a similar fashion to the passive model, but more quickly, and with a different tongue position after the ‘T’ (more like a low front vowel position).

These results show that the [ɾ↖ ɻ̩ ɾ↘] sequence in ‘Saturday’ is a stable sequence. Since this sequence follows a pattern of ending by allowing the tongue to take advantage of both gravity and its natural elasticity, we take these results as support for our argument.

Evidence from the production of the word ‘Saturday’ shows that one speech action, initiated at the initial ‘T’ and culminated through muscle relaxation, gravity and elasticity at the end of the final ‘T’, can produce the most common [ɾ↖ ɻ̩ ɾ↘] sequence. In comparison, the less common sequence [ɾ↕ ɹ̩ ɾ↕] involves at least two actions. These results show a disparity between phonology and actions such that one action can lead to the production of three phones when in other cases three actions are needed. We also expect that there are many more such disparities that will be revealed as simulations of the human vocal tract become more detailed and researchers uncover the ways in which the shape and elasticity of parts of the vocal tract make such disparities possible.

3.3.1  Future work

Famously, phonation and trills are produced based on a combination of myoelastic principles and aerodynamic factors (Van Den Berg, 1958).
However, the degree to which aerodynamic forces influence articulation during other speech acts must not be underestimated. Houde (1968), Perkell (1969), and Kent and Moll (1972) noticed a forward looping of the tongue during the production of alveolar and velar stops. Hoole et al. (1998) further demonstrated that aerodynamic forces influence the shape and extent of this forward looping. In their study, participants were asked to produce VCV sequences with velar and alveolar stops at two different loudness conditions and two different airflow conditions, egressive and ingressive. The ingressive tokens had less forward tongue motion during the stop closure than the egressive tokens, indicating that air pressure behind the tongue pushes the tongue forward during motion towards a stop closure. It is therefore reasonable to assume that similar effects will take place during ‘T’ production.

Similarly, airflow out of the mouth will produce forward and downward pressure on the tongue tip in roughly the direction of the [ɾ↘], an observation that can be seen directly in studies of airflow leaving the mouth during speech (Derrick et al., 2009), and is likely due to the height of the upper vocal tract in relation to the mouth. ArtiSynth does not yet have the capability of simulating aerodynamic effects, but researchers are already hard at work on this task, and future analysis would benefit from including aerodynamic simulations.

We may also want to reexamine this question by recording these data from participants with an EMMA, with participants who are seated, in the prone position, and in the supine position, and with participants who are speaking vs. mouthing the words. Doing so could achieve two goals: one is to map point tracking to ‘T’ motion in speech, and the other is to obtain higher resolution of tongue tip motion to better understand the effects of gravity and aerodynamics on ‘T’ production.
Chapter 4

Subphonemic planning across syllable, morpheme and word boundaries

4.1  Introduction

Speech scientists have been trying to explain speech coarticulation without planning for several decades (Boyce, 1990; Fowler, 1980; Öhman, 1966, 1967; Saltzman and Munhall, 1989). Planning is the generation of a strategy for the implementation of speech production at utterance time. Planning differs from memorization, which takes place long before utterance time and changes slowly if at all, but planning can and to some degree must involve choosing between memorized actions.

Planning must take place at the level of the phrase or sentence because there are a nearly infinite number of possible sentences that can be uttered. However, it is possible to imagine memorizing all the phoneme combinations in a language because there are a limited number of them, all of which a native speaker of a language will experience many times long before reaching adulthood.

While many speech researchers have tried to argue against low-level planning, some have argued for limited planning in anticipatory coarticulation (Whalen, 1990), and others have written computer models incorporating planning (Henke, 1966). Furthermore, some psychological researchers have found evidence for speech planning down to the level of the syllable, with some evidence for planning at the phoneme (Dell, 1986; Levelt, 1989) or feature (Bernhardt and Stemberger, 1998; Dell, 1986; Mowrey and MacKay, 1990) level. Researchers such as Munhall et al. (2000) have also illustrated the difficulty in determining the importance of planning in speech output because actual speech output may look very similar regardless of whether there is low-level planning or not. What is needed is a metric for identifying low-level motor planning in speech.

The end-state comfort literature offers such a metric. Rosenbaum et al.
(1996, 1992) and Cohen and Rosenbaum (2004) first observed that people grasp objects at the beginning of transport in a way that allows joints to be in mid-range at the end of transport. If one is asked to pick up a glass and put it down, the hand is held with the thumb in medial position throughout, but if one is asked to put the glass down upside-down, the arm is twisted so that the thumb is in lateral position when the glass is picked up, and twisted back to the more comfortable thumb-medial position when the glass is put down upside-down. In speech articulation this would translate as motor activity planned so that an action is completed with the articulators in a “comfortable” (canonical or at-rest) position. Observations of the end-state comfort effect have been used as diagnostics of motor planning in humans, lemurs (Chapman et al., 2010), and cotton-top tamarins (Weiss et al., 2007).

End-state comfort effects are difficult to identify in speech because there is a need to identify a context in natural speech where articulators can be in a categorically less-than-ideal position at the beginning or in the middle of a difficult sequence, as in a rhotacized non-rhotic vowel. The interaction of ‘T’ and surrounding ‘V’ and ‘R’ provides such a context.

In Chapter 2, we identified four subphonemic categorical kinematic variations of ‘T’ types in English. The first is an [ɾ↕], in which the tongue moves from below the alveolar ridge upwards, makes contact and moves back down into position for the following vowel. The second is a [ɾ↘], in which the tongue moves from above the alveolar ridge, makes contact, and continues downwards below the alveolar ridge. The third is an [ɾ↖], in which the tongue moves from below the alveolar ridge, makes contact, and continues upward into a position above the alveolar ridge.
The fourth is a [ɾ↔], in which the tongue moves from above the alveolar ridge to a point at or above the ridge horizontally and back to a position above the alveolar ridge. Similarly, previous research (Delattre and Freeman, 1968; Hagiwara, 1995) demonstrated that rhotic vowels can be produced in one of two broad categories, tongue tip-down or ‘bunched’ [ɹ̩], or tip-up [ɻ̩]. In contrast, ‘V’ are typically produced tip-down only.

One indication that end-state comfort is an important factor in speech is that in words with one ‘T’, ‘T’ variants are better predicted from the following ‘R’ variant than from the preceding ‘R’ variant. For example, for the word ‘Berta’, speakers largely ignored the ‘R’ variant, producing [ɾ↘] even following a [ɹ̩]. In these cases, the [ɹ̩] would rapidly transition to a [ɻ̩] position prior to the [ɾ↘]. There were few if any similarly odd transitions into the final ‘R’ for ‘otter’.

However, this evidence for planning based on end-state comfort focuses on the immediate context, extending no further than the transition from a preceding ‘R’ into a ‘T’ variant in relation to the tongue tip position of the following ‘V’. This result could potentially be explained via local coarticulation. We therefore want to see if there is evidence that speakers take into account information from following morphemes and words in choosing subphonemic variants of sounds during speech production. Here we use the simple definition of a morpheme as the smallest component of a word or other linguistic unit with semantic meaning. Words include free morphemes that can and do appear on their own, as opposed to bound morphemes that must combine with at least one other morpheme to form a word.

To see if there is planning across morpheme boundaries, we compare ‘edify/audify’ ([VV] sequences) vs. ‘editor/auditor’ ([VVR] sequences). In the case of the [VV] sequences we expect the sequences to end with a low tongue tip, as it is in all non-rhotic vowels.
This means that the end-state comfort position for the tongue is tip-down. In the case of the [VVR] sequences, the tongue tip may be tip-up or tip-down as rhotic vowels have tip-up and tip-down variants. However, Chapter 2 shows that the ‘R’s following ‘T’s are typically [ɻ̩], meaning that we expect the most common end-state comfort position for the tongue to be tip-up.

In the case of sequences ending with a ‘V’, we expect the preceding ‘T’ to end in a tip-down position ([ɾ↕] or [ɾ↘]). In the case of a sequence ending with a [ɻ̩], we expect the preceding ‘T’ to be more likely to end in a tip-up position ([ɾ↖] or [ɾ↔]). If the preceding vowel was a ‘V’, we might expect a [V ɾ↖ ɻ̩] sequence. However, the tongue tip must move upward to a high position in order to touch the alveolar ridge for the initial ‘T’. Keeping the tongue tip high throughout the next vowel and producing a [ɾ↔] into the word-final [ɻ̩] vowel would produce less tongue motion, leading to a [V ɾ↖ V ɾ↔ ɻ̩] sequence for ‘editor/auditor’ ([VVR] sequences). More importantly, this sequence demonstrates the end-state comfort effect most effectively because the middle vowel is rhotacized, whereas the end-state rhotic vowel is produced in the most ideal fashion.

Figure 4.1 helps explain the argument. The black lines illustrate the predicted position of tongue tip height during the given production. The blue circle highlights the predicted end-state, or [ɻ̩]. The dashed red circle highlights a predicted rhotacized middle vowel, whereas the green circle highlights a predicted normal non-rhotic vowel. In this diagram, only the example on the left provides clear evidence of the end-state comfort effect because it is the only example where a middle-state vowel quality is sacrificed for the end-state vowel.

[Figure 4.1 panels: V ɾ↖ ɪ˞ ɾ↔ ɻ̩    V ɾ↕ ɪ ɾ↖ ɻ̩]

Figure 4.1: Schematic of possible tongue-tip motions for ‘editor/auditor’ phrases.
The example on the left demonstrates end-state comfort in exchange for middle-state vowel quality.

[Figure 4.1, further panel: V ɾ↕ ɪ ɾ↕ ɹ̩]

This preferred sequence involves economy of motion (Lindblom, 1983; Nelson, 1983). However, it relies upon an expectation of a particular end-state at the time of production of the initial ‘T’, which would be evidence of look-ahead planning across a morpheme boundary. That is, the usual ‘T’ used in the same initial morpheme is predicted to differ based on the context generated by a following morpheme. This prediction is not based on local context across a morpheme boundary, which we already know is altered for ‘edit’ and ‘audit’ because the word-final stop becomes a ‘T’. Instead the argument is that the morpheme ending will interact with the initial flap, which is both separated by and not adjacent to the morpheme boundary.

Similarly, to see if there is planning across word boundaries, we compare the first ‘T’ in ‘edit a’ and ‘audit a’ ([VVV] sequences) vs. the ‘T’ in ‘edit the’ and ‘audit the’ ([VV] sequences). In the case of [VV] sequences, we expect a preponderance of [ɾ↕], as we saw with the same speakers producing the word ‘autumn’ in Chapter 2. In the case of [VVV] sequences, we might expect [V ɾ↕ V ɾ↕ V] sequences. But we also expect some speakers to produce a [V ɾ↖ V ɾ↘ V] sequence based either on a preference for less motion and jerk in the sequence, or on a simple inability of the speaker to produce double [ɾ↕] sequences at a normal speech rate. This [ɾ↖], [ɾ↘] sequence from speakers who produce [ɾ↕] in the same words followed by ‘the’ would provide evidence that the same word’s (initial) ‘T’ is produced differently based on context provided by a following word.

The argument is illustrated in Figure 4.2. Like the previous example, both motion sequences preserve the end-state, but only the one on the left shows evidence of the end-state comfort effect because the middle-state vowel quality is sacrificed for the end-state vowel.
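The reasoning about ‘T’ variants and economy of motion can be made concrete with a toy model. The sketch below treats tongue-tip height at each vowel as simply "low" or "high", derives the ‘T’ variant that links two neighbouring vowels, and counts vertical strokes. The stroke costs (2 for an alveolar tap, 1 for either flap, 0 for a postalveolar tap) are an illustrative assumption rather than a measurement from this study, and all names are hypothetical.

```python
# 'T' variant implied by tongue-tip height before and after contact,
# following the four kinematic categories described in the introduction.
VARIANTS = {
    ("low", "low"): "alveolar tap",       # up to the ridge and back down
    ("high", "low"): "down-flap",         # from above, continuing downward
    ("low", "high"): "up-flap",           # from below, continuing upward
    ("high", "high"): "postalveolar tap", # roughly horizontal, stays high
}

# Assumed number of vertical strokes per variant (illustrative only).
STROKES = {
    "alveolar tap": 2,
    "down-flap": 1,
    "up-flap": 1,
    "postalveolar tap": 0,
}


def flaps_between(heights):
    """'T' variants linking each pair of consecutive vowel heights."""
    return [VARIANTS[(a, b)] for a, b in zip(heights, heights[1:])]


def stroke_count(heights):
    """Total assumed vertical strokes for a vowel-height sequence."""
    return sum(STROKES[f] for f in flaps_between(heights))


# 'editor/auditor': rhotacized middle vowel vs. canonical middle vowel.
print(flaps_between(["low", "high", "high"]), stroke_count(["low", "high", "high"]))
print(flaps_between(["low", "low", "high"]), stroke_count(["low", "low", "high"]))

# 'edit a/audit a': rhotacized middle vowel vs. double alveolar tap.
print(flaps_between(["low", "high", "low"]), stroke_count(["low", "high", "low"]))
print(flaps_between(["low", "low", "low"]), stroke_count(["low", "low", "low"]))
```

Under this toy cost, the sequences with a raised (rhotacized) middle vowel need fewer strokes than those that keep the middle vowel canonical, which is the trade-off the end-state comfort argument relies on.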
[Figure 4.2 panels: V ɾ↖ ɪ˞ ɾ↘ ə    V ɾ↕ ɪ ɾ↕ ə]

Figure 4.2: Schematic of possible tongue-tip motions for ‘edit/audit a’ phrases. The example on the left demonstrates end-state comfort in exchange for middle-state vowel quality.

We might also expect an alternative production of [VV] phrases with a [ɾ↘] among speakers who do not always produce [ɾ↕], because a [ɾ↘] preserves end-state comfort in exchange for rhotacizing the initial vowel.

4.1.1  Hypotheses

The hypotheses for speech production behaviour across morpheme and word boundaries are presented below:

Morpheme boundary

First, we test the hypothesis that subphonemic planning spans morpheme boundaries, as follows:

1) Because we expect more [ɻ̩] following ‘T’ than following labial or glottal consonants, we expect the final ‘R’ in ‘editor/auditor’ to be [ɻ̩] more often than the final ‘R’ in ‘mammifer’.

2) If part 1 is true, we expect [ɾ↕] for ‘edify/audify’, but [ɾ↖] for the first ‘T’ in ‘editor/auditor’.

3) If part 1 is true, we expect [ɾ↔] for the final ‘T’ in ‘editor/auditor’ as opposed to [ɾ↖] in ‘otter’.

Figure 4.3 illustrates hypotheses 2 and 3. The dashed grey lines represent a rough location for the morpheme boundary. The blue circle highlights the predicted end-state for [VVR] sequences, a [ɻ̩]. The dashed red circle highlights the predicted rhotacized middle vowel for [VVR] sequences. The green circle highlights the predicted first ‘T’ variant for both [VVR] and [VV] sequences. Lastly, the grey arrow emphasizes the relationship between the initial ‘T’ variant and the predicted end-state for [VVR] sequences.

[Figure 4.3 panels: ‘VVR’ phrase: V ɾ↖ ɪ˞ ɾ↔ ɻ̩    ‘VV’ phrase: V ɾ↕ ɪ f ai]

Figure 4.3: Schematic of hypothesis for ‘editor/auditor’ ([VVR] sequences) vs. ‘edify/audify’ ([VV] sequences).
Word boundary

Second, we also test the hypothesis that subphonemic planning spans word boundaries, as follows:

4) In cases where the speaker cannot or does not produce sequences of taps, we expect the ‘T’ in ‘edit/audit the’ to be an [ɾ↕] but the initial ‘T’ in ‘edit/audit a’ to be an [ɾ↖], that is, the beginning of an [ɾ↖], [ɾ↘] sequence.

Figure 4.4 illustrates hypothesis 4, similar to Figure 4.3 above. The dashed grey lines instead represent a rough location for the word boundary, and the blue circle highlights the predicted end-state for [VV‘V’] sequences, a non-rhotic vowel.

[Figure 4.4 panels: ‘VVV’ phrase: V ɾ↖ ɪ˞ ɾ↘ ə    ‘VV’ phrase: V ɾ↕ ɪ t θ]

Figure 4.4: Schematic of hypothesis for ‘edit/audit a’ ([VVV] sequences) vs. ‘edit/audit the’ ([VV] sequences).

5) In cases where the speaker does not produce an [ɾ↕] for ‘edit/audit the’, we might expect them to produce [ɾ↘] instead, and so more instances of [ɾ↘] than in [VV‘V’] phrases.

4.2  Methods

Eighteen native speakers of North American English between the ages of 18 and 40 participated in the study. All participants had normal speech and hearing. Participants were seated in a customized American Optical Co. model 507-a (1953) ophthalmic chair with a 2-cup rear headrest adjusted to contact the base of the skull just above the neck.

A UST-9118 EV 180 electronic curved array ultrasound probe was placed under the chin. The probe has a variable frequency range of 3-9.0 MHz, and the average slice thickness of the tissue viewed with this probe is approximately 3 mm (Medicines and Healthcare products Regulatory Agency, 2004). The probe was attached to an Aloka ProSound SSD-5000 ultrasound machine connected via s-video cable (marked video IN) to a Canopus ADVC-110 advanced digital video recorder. A Sennheiser MKH-416 short shotgun microphone was mounted on a microphone stand and aimed at the participant’s mouth from 30 cm away.
The microphone was plugged into an M-Audio DMP3 pre-amplifier via XLR balanced cable and out with an unbalanced RCA cable to the Canopus card to guarantee time synchronization between the ultrasound and audio output. The Canopus card was connected via FireWire to a MacPro Quad Core 2.8 GHz computer. An LCD monitor was mounted on the ophthalmic chair’s monitor mount and positioned in front of the participant. A computer containing the experiment stimuli presentation software was connected to the LCD monitor so that the participant could easily read the stimuli from the screen.

The ultrasound machine was set up in B/M mode and aligned to the acoustic signal. B-mode ultrasound was used to capture 2-dimensional images of the midsagittal plane of the tongue at 30 fps. The M-mode (motion mode) ultrasound provided a progressive scan of three selected one-dimensional lines accessible from an ultrasound probe. These one-dimensional M-mode lines follow the line of the palate, in the region of intercept with the blade/tip of the tongue. Because M-mode ultrasound is a progressive scan, it presents the motion data at the full capture rate of the ultrasound probe, which ranged from 60-100 Hz depending on the depth of the scan. While this motion is not connected to any specific flesh-point, it allows capture of the general direction of motion of the front of the tongue, which is ideal for identifying the ‘T’ variants described above. At the same time, the B-mode ultrasound allows examination of the midsagittal plane of the tongue surface at 30 fps, which along with the M-mode data allowed identification of the ‘R’ variants described above.

Tokens were selected to contain single ‘T’s or sequences of ‘T’s in consecutive syllables. Data were collected on 17 control sentences, 9 sentences with 1 ‘T’, 10 sentences with double ‘T’ sequences, and 2 sentences with triple ‘T’ sequences, for a total of 38 unique sequences.
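The stimulus counts for this design can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the sentence counts given above together with the 12 blocks and the 2.2-second display time stated for the presentation procedure, and the variable names are hypothetical rather than taken from the experiment software.

```python
# Sentence counts per category, as stated in the Methods text.
sentence_counts = {
    "control": 17,
    "single_T": 9,
    "double_T": 10,
    "triple_T": 2,
}

BLOCKS = 12            # number of randomized blocks (from the text)
DISPLAY_SECONDS = 2.2  # on-screen time per sentence (from the text)

unique_sentences = sum(sentence_counts.values())
total_stimuli = unique_sentences * BLOCKS

# Pure display time only; pauses and any other overhead are not modelled.
display_minutes = total_stimuli * DISPLAY_SECONDS / 60

print(unique_sentences, total_stimuli, round(display_minutes, 1))
# prints: 38 456 16.7
```

The display time alone comes to a little under 17 minutes, which is a lower bound on the reported recording time because it leaves out anything other than sentence display.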
The sentences were randomized for each of 12 blocks, giving a total of 456 stimulus sentences. The stimuli were presented using PXlabRT such that each sentence was displayed on an LCD screen for 2.2 seconds. The software automatically paused the experiment after the first 6 blocks to allow participants to swallow some water or take a short break if needed. Each set of 6 blocks took 9 minutes, for a total of 18 minutes recording time. The results from this chapter are based on the subset of phrases collected, as shown in Table 4.1.

Token  Word       Carrier phrase                ‘T’ count  Context  Type
1      editor     We have editor books          2          VVR      ‘T’
2      auditor    We have auditor books         2          VVR      ‘T’
3      mammifer   We have mammifer books        0          VVR      ‘C’
4      edify      We have him edify a book      1          VV       ‘T’
5      audify     We have him audify a book     1          VV       ‘T’
6      otter      We have otter books           1          VR       ‘T’
7      audit the  We have him audit the books   1          VV       ‘T’
8      edit the   We have him edit the books    1          VV       ‘T’
9      edit a     We have him edit a book       2          VVV      ‘T’
10     audit a    We have him audit a book      2          VVV      ‘T’

Table 4.1: single ‘T’ list. ‘T’ = flap/tap, ‘C’ = control phrase.

The acoustic signal was labeled and transcribed in PRAAT and then imported into ELAN, where the ‘T’ variants were identified according to the description in the introduction above. These methods, along with exemplars, were presented in detail in chapter 2.

4.3  Results

The results of the experiment are presented below, including descriptive statistics and logistic regression tests for each of the four hypotheses listed above.

4.3.1  Hypothesis 1: mammifer vs. editor/auditor

The kinematic variants of the (final) ‘R’ produced during the phrase ‘We have mammifer books’ vs. the phrases ‘We have editor/auditor books’ are shown in Figure 4.5. The results show that six participants (2, 3, 5, 8, 16 and 23) produced mostly [ɹ̩] for ‘mammifer’, but [ɻ̩] for ‘editor/auditor’. Four more participants (9, 13, 17 and 26) produced mostly [ɻ̩] for both phrase sets.
Three participants (10, 14 and 18) had a mixture of [ɻ̩] and [ɹ̩] for ‘editor/auditor’, and mostly [ɻ̩] for ‘mammifer’. One participant (15) produced [ɹ̩] for ‘editor/auditor’, and [ɻ̩] for ‘mammifer’. Lastly, participants 4, 12 and 21 sometimes, and participant 6 always, pronounced the word ‘mammifer’ as ‘mammifier’, making some or all of their data unusable.

[Figure 4.5 panels. (a) By participant: one panel per subject (2, 3, 4, 5, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 21, 23 and 26) showing counts of [ɹ̩] and [ɻ̩] for ‘C’ and ‘T’ phrases. (b) Summary: ‘C’: [ɹ̩] 90, [ɻ̩] 81; ‘T’: [ɹ̩] 85, [ɻ̩] 314.]

Figure 4.5: Distribution of ‘R’ variants by participant, top = ‘C’ (‘mammifer’), bottom = ‘T’ (‘editor/auditor’). ‘C’ = control sequences, ‘T’ = flap/tap sequences.

Wilcoxon Signed-Rank tests were performed on the data summarized in Figure 4.5.
For each of the two ‘R’ variants, the percentage of productions with that tongue-tip position was compared between ‘mammifer’ and ‘editor/auditor’. As expected from the descriptive statistics in Figure 4.5, the results are significant, as seen in Table 4.2.

Contexts                           [ɻ̩]               [ɹ̩]
                                   V     p            V     p
‘mammifer’ vs. ‘editor/auditor’    26    *0.018       110   *0.032

Table 4.2: Wilcoxon Signed rank tests comparing prevalence of ‘R’ variants in ‘mammifer’ vs. ‘editor/auditor’. * = significant (α = 0.05).

4.3.2  Hypothesis 2: ‘edify/audify’ ([VV] sequences) vs. ‘editor/auditor’ ([VVR] sequences).

The ‘T’ variants produced during the phrases ‘We have him edify/audify a book’ ([VV] sequences) vs. the phrases ‘We have editor/auditor books’ ([VVR] sequences) are shown in Figures 4.6 and 4.7. The results show that five participants (2, 4, 5, 6 and 26) produced exclusively or almost exclusively [R ] during the production of [VV] sequences, but exclusively or almost exclusively [R ] during the production of [VVR] sequences. Seven more participants (9, 10, 12, 14, 16, 17 and 18) followed the same pattern, but not as consistently. One participant (13) followed the opposite pattern, producing mostly [R ] during [VVR] sequences, and [R ] during [VV] sequences. Three participants (8, 15 and 23) almost always produced [R ] in all four phrases. One participant (3) produced mostly [R ] in all four phrases. The last participant (21) produced a mixture of [R ], [R ] and [R ].
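The Wilcoxon signed-rank comparisons reported in this chapter operate on paired per-participant percentages. The following is a minimal sketch of how such a statistic can be computed, assuming that V in the tables denotes the sum of the ranks of the positive paired differences (a common convention in statistical software, but an assumption here). The counts are hypothetical illustration values, not data from the study, and the function names are invented for this example. No p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def signed_rank_v(first, second):
    """Sum of the ranks of |d| for pairs with d = first - second > 0.

    Pairs with a zero difference are dropped before ranking.
    """
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    return sum(r for d, r in zip(diffs, ranks) if d > 0)


def percent(count, total):
    """Percentage of productions matching one variant (0 if no tokens)."""
    return 100.0 * count / total if total else 0.0


# Hypothetical (matching count, total tokens) per participant and condition.
control_counts = [(0, 12), (2, 12), (12, 12), (1, 11)]
flap_counts = [(21, 22), (20, 24), (24, 24), (1, 22)]

control_pct = [percent(c, n) for c, n in control_counts]
flap_pct = [percent(c, n) for c, n in flap_counts]

v = signed_rank_v(control_pct, flap_pct)
print(v)
```

Working from percentages rather than raw counts keeps participants with different numbers of usable tokens comparable, which matters here because some productions were unusable or excluded.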
[Figure 4.6 panels: one panel per subject (2, 3, 4, 5, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 21, 23 and 26) showing counts of the ‘T’ variants ɾ↕, ɾ↘, ɾ↖ and ɾ↔. Key in figure: AT = alveolar tap, DF = down flap, UF = up flap, PAT = postalveolar tap.]

Figure 4.6: Distribution of ‘T’ variants by participants by phrase group: top: ‘edify/audify’ ([VV] sequences), bottom: first ‘T’, ‘editor/auditor’ ([VVR] sequences).

[Figure 4.7 panels: ‘phrase: -ify’ and ‘phrase: first flap: -or’, each showing counts of the ‘T’ variants ɾ↕, ɾ↘, ɾ↖ and ɾ↔.]

Figure 4.7: Distribution of ‘T’ variants by phrase group: left: ‘edify/audify’ ([VV] sequences), right: first ‘T’, ‘editor/auditor’ ([VVR] sequences).

Wilcoxon Signed-Rank tests were performed on the data summarized in Figure 4.7.
For each of the four ‘T’ variants, the percentage of productions matching that variant was compared for [VV] vs. [VVR] sequences. As expected from the descriptive statistics in Figure 4.7, the results are significant for [R ] and [R ], as seen in Table 4.3.

Contexts      [R ]             [R ]             [R ]             [R↔ ]
              V     p          V     p          V     p          V     p
VV vs. VVR    127   *0.002     3     0.371      8     *0.002     0     1

Table 4.3: Wilcoxon Signed rank tests comparing prevalence of initial ‘T’ types in ‘edify/audify’ ([VV] sequences) vs. ‘editor/auditor’ ([VVR] sequences). * = significant (α = 0.05).

4.3.3  Hypothesis 3: ‘otter’ ([VR]) vs. ‘editor/auditor’ ([VVR]) sequences.

The final ‘T’ variants produced during the phrase ‘We have otter books’ ([VR] sequences) vs. the phrases ‘We have editor books’ and ‘We have auditor books’ ([VVR] sequences) are shown in Figures 4.8 and 4.9. The results show that for six participants (2, 3, 4, 5, 6 and 26) most of the ‘T’ variants used in the [VR] sequence were [R ], but most of the final ‘T’ variants used in the [VVR] sequences were [R↔ ].
Seven other participants (9, 10, 12, 14, 16, 17 and 18) follow the same pattern, but with more variability. Two participants (8 and 13) produced mostly [R ] in both phrase sets. One participant (23) produced mostly [R ] in both phrase sets, and two participants (15 and 21) had highly variable productions in both phrase sets.

[Figure 4.8 panels: one panel per subject (2, 3, 4, 5, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 21, 23 and 26) showing counts of the ‘T’ variants ɾ↕, ɾ↘, ɾ↖ and ɾ↔. Key in figure: AT = alveolar tap, DF = down flap, UF = up flap, PAT = postalveolar tap.]

Figure 4.8: Distribution of ‘T’ variants by participants by phrase group: top: otter, bottom: second ‘T’, editor/auditor.
[Figure 4.9 panels: ‘group: otter’ and ‘group: second flap: -or’, each showing counts of the ‘T’ variants ɾ↕, ɾ↘, ɾ↖ and ɾ↔.]

Figure 4.9: Distribution of ‘T’ variants by phrase group: left: ‘otter’ ([VR] phrase), right: second ‘T’, ‘editor/auditor’ ([VVR] sequences).

Wilcoxon Signed-Rank tests were performed on the data summarized in Figure 4.9. For each of the four ‘T’ variants, the percentage of productions matching that variant was compared for [VR] vs. [VVR] sequences. As expected from the descriptive statistics in Figure 4.9, the results are significant for [R ], [R ], and [R↔ ], as seen in Table 4.4.

Contexts      [R ]             [R ]             [R ]              [R↔ ]
              V     p          V     p          V     p           V     p
VR vs. VVR    5.5   *0.003     3     1          171   *< 0.001    0     *< 0.001

Table 4.4: Wilcoxon Signed rank tests comparing prevalence of final ‘T’ variants for ‘editor/auditor’ ([VVR] sequences) vs. ‘otter’ ([VR] phrase). * = significant (α = 0.05).

4.3.4  Hypothesis 4: ‘edit/audit a’ ([VVV]) vs. ‘edit/audit the’ ([VV]) sequences.

The ‘T’ variants produced during ‘We have him edit/audit the books’ ([VV] sequences) vs. ‘We have him edit/audit a book’ ([VVV] sequences) are shown in Figures 4.10 and 4.11. While a number of individual participants exhibited distinct patterns in [VVV] vs. [VV] sequences, individual strategies varied quite dramatically, making across-subject results less clear than for the other three hypotheses. One participant (2) consistently used [R ] for [VVV] sequences, and [R ] for [VV] sequences, whereas six participants (10, 12, 15, 16, 17 and 18) used [R ] in [VVV] sequences more often than in [VV] sequences. Four participants (8, 9, 14 and 23) have predominantly [R ] in all conditions, and two other participants (4 and 6) have mostly [R ], but with more variability. Three other participants (13, 15 and 26) have mostly [R ] in [VVV] sequences and [R ] in [VV] sequences. Lastly, two other participants (3 and 21) have a mixture of [R ], [R ] and [R ].
Several participants sometimes did not produce a second ‘T’ in the [VVV] sequences because they slowed down and produced the ‘T’ as a stop, most notably participant 5. These productions were excluded from analysis.
Figure 4.10: Distribution of ‘T’ variants by participant and by phrase group: top: ‘edit/audit a’ ([VVV] sequences), bottom: ‘edit/audit the’ ([VV] sequences). Key: AT = alveolar tap, DF = down flap, UF = up flap, PAT = postalveolar tap.
Figure 4.11: Distribution of ‘T’ variants by phrase group: left: ‘edit/audit a’ ([VVV] sequences), right: ‘edit/audit the’ ([VV] sequences).

Wilcoxon signed-rank tests were performed on the data summarized in Figure 4.11. For each of the four ‘T’ variants, the percentage of productions matching that variant was compared for [VV] vs. [VVV] sequences. The results of these tests were not significant, as seen in Table 4.5.

contexts      [R ]              [R ]             [R ]              [R↔ ]
              V    p            V    p           V    p            V    p
VV vs. VVV    86   p = 0.366    7    p = 0.584   40   p = 0.451    0    NA

Table 4.5: Wilcoxon signed-rank tests comparing prevalence of final ‘T’ variants in ‘edit/audit a’ ([VVV] sequences) vs. ‘edit/audit the’ ([VV] sequences). * = significant (α = 0.05).

However, there are individuals who appear to produce different ‘T’ variants based on the [VV] vs. [VVV] phrase distinction. Wilcoxon rank-sum tests for each participant demonstrate significance for 6 of 18 participants (2, 10, 12, 15, 17 and 18), and marginal significance for 1 more (participant 16), as seen in Table 4.6. Note that Wilcoxon signed-rank tests require paired comparisons, and so were inappropriate for comparisons of data within each participant, which is why the Wilcoxon rank-sum test was employed instead.
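The difference between the paired and unpaired tests can be illustrated with a minimal sketch of the two test statistics. It assumes the conventions of R's `wilcox.test` (V is the sum of ranks of the positive paired differences; W is the rank sum of the first sample minus its minimum possible value); the text does not say which software was used. The function names and the input values are hypothetical, as is the coding of each production as 1 (matches the variant) or 0.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def signed_rank_v(x, y):
    """Paired statistic: sum of ranks of the positive differences x - y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    ranks = average_ranks([abs(d) for d in diffs])
    return sum(r for r, d in zip(ranks, diffs) if d > 0)


def rank_sum_w(x, y):
    """Unpaired statistic: rank sum of x in the pooled sample, minus its minimum."""
    ranks = average_ranks(list(x) + list(y))
    return sum(ranks[:len(x)]) - len(x) * (len(x) + 1) / 2


# Hypothetical per-participant percentages in two contexts (paired by participant).
print(signed_rank_v([50, 80, 10, 60], [20, 90, 10, 0]))  # 5.0

# Hypothetical tokens from ONE participant, coded 1 if the variant was produced.
# Tokens in the two contexts are not paired, so the rank-sum statistic applies.
print(rank_sum_w([1, 1, 0, 1], [0, 0, 1, 0]))  # 12.0
```

The paired statistic needs one value per participant in each context, which is why it cannot be applied to the unpaired tokens within a single participant.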
participant   [R ]                  [R ]                 [R ]
              W      p              W      p             W      p
2             25.5   *p < 0.001     144.5  NA            263.5  *p < 0.001
3             90     p = 0.601      94.5   p = 0.353     58.5   p = 0.184
4             126    p = 0.194      112    NA            98     p = 0.194
5             NA     NA             NA     NA            NA     NA
6             168    p = 0.541      153    NA            138    p = 0.541
8             162    NA             162    NA            162    NA
9             36     NA             36     NA            36     NA
10            60     *p < 0.001     153    NA            246    *p < 0.001
12            76.5   *p = 0.007     144.5  NA            212.5  *p = 0.007
13            198    p = 0.119      162    NA            126    p = 0.119
14            136    p = 0.347      144.5  NA            153    p = 0.347
15            198    *p = 0.015     153    NA            108    *p = 0.015
16            87.5   +p = 0.069     117    NA            146.5  +p = 0.068
17            96.5   *p = 0.047     136    NA            175.5  *p = 0.047
18            108    *p = 0.015     153    NA            198    *p = 0.015
21            153    p = 0.412      117    p = 0.412     135    NA
23            20     p = 0.716      18     NA            16     p = 0.716
26            129    p = 0.211      96     p = 0.260     99     p = 0.541

Table 4.6: Wilcoxon rank-sum tests, by participant, comparing prevalence of final ‘T’ variants in ‘edit/audit a’ ([VVV] sequences) vs. ‘edit/audit the’ ([VV] sequences). * = significant (α = 0.05), + = marginally significant (α = 0.1).

Note that three participants (3, 21 and 26) produced more [R ] in [VV] sequences than in [VVV] sequences, as predicted in hypothesis 5, but the results were not statistically significant. Lastly, an examination of ‘T’ sequences shows that many participants produced [R ], [R ] sequences. However, there were 81 tokens with [R ], [R ] sequences, as seen in Figure 4.12.

[Figure: ‘Flap Sequence for edit/audit a’; token counts for each combination of initial ‘T’ variant and final ‘T’ variant.]

Figure 4.12: Count by ‘T’ sequences in ‘edit/audit a’ ([VVV] sequences).

4.4  Discussion

The results show that, as per hypothesis 1, participants are more likely to produce [õ] for the word-final ‘R’ in ‘auditor’ and ‘editor’ than in the control phrase ‘mammifer’. This result demonstrated the necessary context to test whether participants would produce different ‘T’ variants based on end-state comfort across morpheme boundaries.
The descriptive statistics for the test of hypothesis 2 show that participants produce more [R ] for ‘edify/audify’ and more [R ] for ‘editor/auditor’, supporting the hypothesis. Similarly, the descriptive statistics for the test of hypothesis 3 show that participants produce more instances of [R↔ ] for the second ‘T’ in ‘editor/auditor’, and more instances of [R ] for ‘otter’. Also, the number of instances of [R ] produced for ‘editor/auditor’ phrases is in line with the number of [R ] produced for the same phrases. For hypothesis 4, most of the participants produced [R ], [R ] sequences, which fits the hypothesis that many of them were capable of producing [R ] sequences, as seen in Figure 4.12. This behaviour removes the possibility of articulatory conflict, which removes the ability to observe the end-state comfort effect, so while the results are consistent with look-ahead planning, the diagnostic test cannot be used to confirm this planning. However, the descriptive statistics show that, overall, participants were more likely to produce [R ] for the ‘audit/edit the’ phrases, and [R ] for ‘audit/edit a’ phrases, and that these results were significant for 6 participants, and marginally significant (α = 0.1) for 1 more. For these participants, the end-state comfort effect was observed as predicted in hypothesis 4. For hypothesis 5, three participants produced more [R ] in [VV] sequences than in [VVV] sequences. While the results were not statistically significant, there were no examples of trends in the other direction.

Alternate analyses of these observations, wherein a wide range of possible outcomes are stored rather than planned, could in principle account for these results. For instance, usage-based grammars (e.g., Bybee, 1995; Tomasello, 2005), in which commonly used chunks such as ‘edit/audit a’ ([VVV]) vs. ‘edit/audit the’ ([VV]) are stored in memory, could explain differences in ‘T’ selection.
Similarly, exemplar theories (e.g., Pierrehumbert, 2001) in which multiple variants of the whole words ‘edit’ and ‘audit’ are stored based on individual experience, with different contextual variants used before vowels and consonants, could explain different ‘T’ selection between phrase contexts. Lastly, models that allow complex representation of each word with stored choices at specific points in the word (e.g., Hudson, 1980) could also account for the observed trends because a different representation would be pulled from stored memory based on the context. However, the proponents of these accounts would all have difficulty explaining why individual speakers’ repetitions of phrases within the same context vary categorically. This speech behaviour cannot be entirely explained through the memorization of possible productions based on phonological context because the observed behaviours are themselves behavioural trends, not behavioural certainties. Therefore we argue that these results demonstrate look-ahead planning across both morpheme and word boundaries, seriously challenging any analysis of the motor control of speech production that does not involve planning.

Chapter 5

Syllable iterance rate influences categorical variation of English flaps and taps during normal speech

5.1  Introduction

Previous research demonstrated that speakers engage in subphonemic planning of speech based on local constraints (chapter 2), and across syllable, morpheme and word boundaries (chapter 4). Here we seek to identify a possible constraint on motor skills that will help provide a framework for a theory of subphonemic speech planning. To do this, we look at the speed at which speakers can produce repeated syllables in rapid sequence, or their iterance rate, and compare it to their choice of ‘T’ variants in words/phrases with one ‘T’ and sequences of two ‘T’s.
We expect speakers with faster iterance rates to be able to avoid articulatory conflicts through the production of more taps ([ɾ↕], [ɾ↔]) than flaps ([ɾ↘], [ɾ↖]) where appropriate.

In chapter 2, we identified four subphonemic categorical kinematic variants of English flap/tap (‘T’). The first is the alveolar tap [ɾ↕], in which the tongue moves from below the alveolar ridge upwards, makes contact and moves back down into position for the following vowel. The second is the down-flap [ɾ↘], in which the tongue moves from above the alveolar ridge, makes contact, and continues downwards below the alveolar ridge. The third is the up-flap [ɾ↖], in which the tongue moves from below the alveolar ridge, makes contact, and continues upward into a position above the alveolar ridge. The fourth is the postalveolar tap [ɾ↔], in which the tongue moves from above the alveolar ridge to a point at or above the ridge horizontally and back to a position above the alveolar ridge.

5.1.1  Articulatory conflict

Production of any of these variants can lead to articulatory conflict (Gick and Wilson, 2006) if the tongue position at the start or end of the ‘T’ differs from the one normally associated with the adjacent vowel. Because the tongue tip position in non-rhotic vowels (‘V’) is normally tip-down (Gay, 1974; Perkell, 1969), and rhotic vowels (‘R’) can be either [õ] or [ô] [see Delattre and Freeman (1968) for more variants], there will be potential articulatory conflicts as seen in Figure 5.1.

Figure 5.1: Schematic matrix of possible tongue-tip trajectories for ‘T’ productions based on vowel context.
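The conflict logic described in this section can be written out as a small counting sketch, offered as an illustration rather than as the procedure used in the thesis. It assumes that each ‘T’ variant can be reduced to a start and end tongue-tip height ("low" for below the alveolar ridge, "high" for above it), that non-rhotic vowels are always tip-down ("low"), and that rhotic vowels may be either tip-down or tip-up. The names `T_VARIANTS`, `VOWEL_TIP` and `conflict_range` are hypothetical.

```python
from itertools import product

# (start, end) tongue-tip height for each 'T' variant, following the
# descriptions of the four variants given in the introduction.
T_VARIANTS = {
    "alveolar tap": ("low", "low"),
    "down-flap": ("high", "low"),
    "up-flap": ("low", "high"),
    "postalveolar tap": ("high", "high"),
}

# Possible tongue-tip heights of the neighbouring vowel:
# 'V' = non-rhotic (tip-down only), 'R' = rhotic (tip-down or tip-up).
VOWEL_TIP = {"V": ("low",), "R": ("low", "high")}


def conflict_range(variant, before, after):
    """Return (min, max) potential conflicts for a variant between two vowels."""
    start, end = T_VARIANTS[variant]
    counts = [
        (prev != start) + (nxt != end)  # one conflict per mismatched edge
        for prev, nxt in product(VOWEL_TIP[before], VOWEL_TIP[after])
    ]
    return min(counts), max(counts)


for context in (("V", "V"), ("R", "V"), ("V", "R"), ("R", "R")):
    for variant in T_VARIANTS:
        print(context, variant, conflict_range(variant, *context))
```

Under these assumptions the alveolar tap has the lowest potential for conflict and the postalveolar tap the highest, which can be compared with the counts reported in Table 5.1.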
Since ideal productions of words can include either ‘R’ variant ([ɻ̩], [ɹ̩]), but only tip-down ‘V’ variants, [ɾ↕] involves the fewest potential conflicts, [ɾ↘] and [ɾ↖] have the next fewest, and [ɾ↔] has the most potential conflicts in a balanced dataset of single ‘T’ words with all combinations of rhotic and non-rhotic vowels before and after the ‘T’, as shown in Table 5.1.

                  Number of conflicts
Flap      autumn    Berta     otter     murder
          ‘VTV’     ‘RTV’     ‘VTR’     ‘RTR’
[ɾ↕]      0         0-1       0-1       0-2
[ɾ↘]      1         0-1       1-2       0-2
[ɾ↖]      1         1-2       0-1       0-2
[ɾ↔]      2         1-2       1-2       0-2

Table 5.1: Potential articulatory conflicts based on phrase.

5.1.2  One vs. two directions of motion

At the same time, taps ([ɾ↕], [ɾ↔]) differ from the flaps ([ɾ↘], [ɾ↖]) in that they involve moving the tongue tip towards a target and then back toward where it originated. The flaps move in one direction only, as seen in Figure 5.2. As a result, we expect flaps to be preferred over taps.

Figure 5.2: Schematic tongue tip trajectory showing flaps (green) have one direction of motion, and taps (red dashed) have two, as highlighted by the black arrows.

Even more importantly, in words with ‘T’ sequences, a [ɾ↖], [ɾ↘] or a [ɾ↘], [ɾ↖] sequence will have one arc of motion, whereas a [ɾ↕], [ɾ↕] or [ɾ↔], [ɾ↔] sequence will have two arcs of motion, as seen in Figure 5.3. In these cases, flaps are more easily seen as preferable to taps.

Figure 5.3: Schematic tongue tip trajectory showing that sequences of flaps (i.e. [ɾ↖], [ɾ↘]) have one arc of motion, but sequences of taps (i.e. [ɾ↕], [ɾ↕]) have two.

5.1.3  Iterance rate

We propose that iterance rate can act as an indicator of a speaker’s ability to consistently produce ‘T’s that have more changes in motion, thereby allowing them to avoid articulatory conflicts. Rapid iterance rate has been used to measure the rate of rapid articulator motion in speech as distinguished from non-speech (Nelson et al., 1984).
While articulators in such sequences only move at about half the maximum rate seen in normal speech, iterance rate places strain on the speech production system that effectively distinguishes one speaker’s patterns of motion from another’s because speakers demonstrate a wide range of maximum movement velocities in repeated syllables. We measured iterance rate among participants by asking them to repeat the syllable ‘ta’ in rapid succession, defining the iterance rate in terms of acoustic measurements taken from the stop release of a ‘ta’ to the stop release of a following ‘ta’. Because of the range of tongue motion between the alveolar stop and the low vowel, the iterance rate is in part limited by the speed at which the tongue can repeat these motions.

The ‘T’ types were identified with B/M mode ultrasound carefully aligned to the acoustic signal. The M-mode (motion mode) ultrasound provides a progressive scan of three selected one-dimensional lines accessible from an ultrasound probe. These one-dimensional M-mode lines follow the line of the palate, in the region of intercept with the blade/tip of the tongue. Because M-mode ultrasound is a progressive scan, it presents the motion data at the full capture rate of the ultrasound probe, ranging from 60-100 Hz depending on the depth of the scan. This motion data allows us to capture the general direction of motion of the front of the tongue. When synchronized with audio, this method is sufficient to identify the ‘T’ variants described above, because ‘T’ contact time is identified as the point of greatest decrease in sound amplitude (Zue and Laferriere, 1979). At the same time, the B-mode ultrasound allows examination of the midsagittal plane of the tongue surface at 30 fps, providing information about the position of the whole tongue (used to identify ‘R’ variants), as well as confirmation of the position of the front of the tongue.
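The release-to-release measurement described above can be sketched as follows. This is a minimal illustration, assuming that the stop-release times of one speaker’s ‘ta’ sequence are already available as a list of timestamps in seconds (for example, exported from an annotation tier); the function name `iterance_stats` and the sample timestamps are hypothetical. Iterance rate in Hz is taken here as the reciprocal of the mean release-to-release duration.

```python
def iterance_stats(release_times_s):
    """Return (mean duration in ms, iterance rate in Hz) for one speaker,
    given the times (in seconds) of successive 'ta' stop releases."""
    if len(release_times_s) < 2:
        raise ValueError("need at least two stop releases")
    # Duration from each stop release to the following stop release.
    durations_ms = [
        (later - earlier) * 1000.0
        for earlier, later in zip(release_times_s, release_times_s[1:])
    ]
    mean_ms = sum(durations_ms) / len(durations_ms)
    # A mean duration of d ms corresponds to 1000 / d syllables per second.
    return mean_ms, 1000.0 / mean_ms


if __name__ == "__main__":
    # Made-up timestamps for illustration: releases every 125 ms.
    mean_ms, rate_hz = iterance_stats([0.0, 0.125, 0.25, 0.375])
    print(f"{mean_ms:.1f} ms, {rate_hz:.3f} Hz")
```

The same conversion relates the two measures reported per participant: a mean duration of 111.3 ms, for instance, corresponds to 1000 / 111.3, or about 8.985 Hz.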
5.1.4  Hypothesis

Because speakers with faster iterance rates can move their tongues more quickly, we expect that the faster the iterance rate, the higher the potential capacity for changes in tongue tip/blade motion, and therefore the more taps ([ɾ↕] and [ɾ↔]) are possible in relation to flaps ([ɾ↘] and [ɾ↖]), both in single ‘T’ phrases and in the initial and final ‘T’s of double ‘T’ phrases, as illustrated in Figure 5.4 below. That is, we expect speakers with faster iterance rates to avoid the articulatory conflicts mentioned above through the production of more taps where appropriate.

Figure 5.4: Schematic tongue tip trajectory showing the hypothesis that speakers with slower iterance rates are more likely to produce the ‘T’ motions seen on the left, and speakers with faster iterance rates are more likely to produce the ‘T’ motions seen on the right.

5.2  Methods

Eighteen native speakers of North American English between the ages of 18 and 40 participated in the study. All participants had normal speaking and hearing. Participants were seated in a customized American Optical Co. model 507-a (1953) ophthalmic chair with a 2-cup rear headrest adjusted to contact the base of the skull just above the neck. A UST-9118 EV 180 electronic curved array ultrasound probe was placed under the chin. The probe has a variable frequency range of 3-9.0 MHz, with an average slice thickness of the tissue viewed with this probe of approximately 3 mm (Medicines and Healthcare products Regulatory Agency, 2004). The probe was attached to an Aloka ProSound SSD-5000 ultrasound machine connected via s-video cable to a Canopus ADVC-110 advanced digital video recorder. A Sennheiser MKH-416 short shotgun microphone was mounted on a microphone stand and aimed at the participant’s mouth from about 30 cm away.
The microphone was plugged into an M-Audio DMP3 pre-amplifier via XLR balanced cable and out with an unbalanced RCA cable to the Canopus card, which synchronizes the audio and video to within 1 frame. The Canopus card was connected via FireWire to a Mac Pro Quad Core 2.8 GHz computer. An LCD monitor was mounted on the ophthalmic chair’s monitor mount and aimed in front of the participant. A computer with the experiment stimuli presentation software was connected to the LCD monitor so that the participant could easily read the stimuli from the screen.

The ultrasound machine was set up in B/M mode and aligned to the acoustic signal. B-mode ultrasound was used to capture 2-dimensional images of the midsagittal plane of the tongue at 30 fps. The M-mode (motion mode) ultrasound provided a progressive scan of three selected one-dimensional lines accessible from an ultrasound probe. These one-dimensional M-mode lines follow the line of the palate, in the region of intercept with the blade/tip of the tongue. Because M-mode ultrasound is a progressive scan, it presents the motion data at the full capture rate of the ultrasound probe, which ranged from 60-100 Hz depending on the depth of the scan. While this motion is not connected to any specific flesh-point, it allows capture of the general direction of motion of the front of the tongue, which is ideal for identifying the ‘T’ variants described above. At the same time, the B-mode ultrasound allows examination of the midsagittal plane of the tongue surface at 30 fps, which along with the M-mode data allowed identification of the ‘R’ variants described above.

Each collection session began with a recording of iterance rate in which the participant was asked to make a rapid sequence of the syllable ‘ta’ at least 10 times in a row. This timing information was used both to align the audio and video, and to provide the tongue speed data used in this report.
Tokens were selected to contain single ‘T’s or sequences of two ‘T’s in consecutive syllables. Data were collected on 17 control sentences, 9 sentences with a single ‘T’, 10 sentences with double ‘T’ sequences, and 2 sentences with triple ‘T’ sequences, for a total of 38 unique sequences. The sentences were randomized for each of 12 blocks, giving a total of 456 stimulus sentences. The stimuli were presented using PXlabRT such that each sentence was displayed on an LCD screen for 2.2 seconds. The software automatically paused the experiment after the first 6 blocks to allow participants to swallow some water or take a short break if needed. Each set of 6 blocks took 9 minutes, for a total of 18 minutes recording time.

This analysis is based on a subset of the data containing single ‘T’ and double ‘T’ sequence phrases. In the single ‘T’ phrases, the relevant words have two syllables with a ‘T’ separating the first from the second. There is one phrase for each combination of a ‘V’ or ‘R’ preceding and following the ‘T’.

Token   Word      Carrier Phrase              Syllables   Context
1       autumn    We have autumn books        2           ‘VV’
2       Berta     We have Berta beep          2           ‘RV’
3       otter     We have otter books         2           ‘VR’
4       murder    We have him murder a mob    2           ‘RR’

Table 5.2: single ‘T’ phrase list

We also used the double ‘T’ sequence data, as shown in Table 5.3. The words or phrases in this set of data begin with a stressed syllable, followed by two more syllables separated by ‘T’s. These double ‘T’ sequences include a full factorial of ‘V’ vs. ‘R’ contexts surrounding the ‘T’s, except for /VTRTR/, for which we could not find an example in English.
Token   Word/Phrase     Carrier Phrase                Syllables   Context
1       audit a         We have him audit a book      3           ‘VVV’
2       edit a          We have him edit a book       3           ‘VVV’
3       herded a        We have herded a mob          3           ‘RVV’
4       absurdity       We have ‘absurdity fests’     4           ‘RVV’
5       Saturday        We have Saturday off          3           ‘VRV’
6       murdered a      We have murdered a mob        3           ‘RRV’
7       auditor         We have auditor books         3           ‘VVR’
8       editor          We have editor books          3           ‘VVR’
9       herded her      We have herded her mob        3           ‘RVR’
10      murdered her    We have murdered her mob      3           ‘RRR’

Table 5.3: double ‘T’ phrase list

To measure ‘T’ direction, the acoustic signal was labeled and transcribed in Praat, and then imported into ELAN, where the ‘T’ variants were identified according to the description in the introduction above. Exemplars for each of the ‘T’ variants can be found in chapter 2.

To measure iterance rates, we measured the acoustic duration between the stop releases in the above-mentioned sequences of ‘ta’s. The average of these durations was recorded for each participant.

The percentage of each ‘T’ variant used in the single ‘T’ and double ‘T’ data sets was compared to iterance rates using vector generalized linear model (VGLM) statistics (Yee, 2008; Yee and Wild, 1996). This method allows for the effective analysis of multinomial interactions like the kind used in this paper, as it provides optimal linear regressions that take into account not just the data for the ‘T’ variant under consideration, but all the others as well. VGLM is one of the few methods that can provide a realistic analysis of the relationship between the tongue speed data and the distribution of ‘T’ variants in order to avoid both type I and type II errors.

5.3  Results

Iterance durations for rapidly repeated ‘ta’s varied from 104.6 milliseconds (ms) to as high as 155.4 ms, as seen in Table 5.4. This is a difference of 50.8 ms, and the fastest ‘ta’s were produced 48.5% faster than the slowest.

Part.   2       3       4       5       6       8       9       10      12
Dur.    111.3   143.8   133.0   149.0   155.4   104.6   142.2   131.6   145.9
Rate    8.985   6.954   7.519   6.711   6.435   9.560   7.032   7.599   6.854

Part.   13      14      15      16      17      18      21      23      26
Dur.    129.6   133.6   122.1   136.3   122.4   117.0   145.3   135.5   127.8
Rate    7.716   7.485   8.190   7.337   8.170   8.547   6.882   7.380   7.825

Table 5.4: Iterance duration (ms), and iterance rate (Hz), by participant.

5.3.1  Single ‘T’ phrases

VGLM analysis of the relationship between ‘T’ variants and rate in single ‘T’ phrases shows that speakers are less likely to use [ɾ↖] and more likely to use [ɾ↕] and [ɾ↔] as iterance rate increases, as shown in Figures 5.5 and 5.6, and Table 5.5.

Figure 5.5: ‘T’ variants in single ‘T’ phrases compared with iterance duration. [Axes: fitted (VGLM) probability (%) against iterance duration (ms); one curve per variant: ɾ↕, ɾ↘, ɾ↖, ɾ↔.]

Figure 5.6: Details of ‘T’ variants in single ‘T’ phrases compared with iterance duration. [Panels: (a) alveolar tap, (b) down-flap, (c) up-flap, (d) postalveolar tap; axes as in Figure 5.5.]

Table 5.5 shows the t-scores for the data presented graphically in Figure 5.6. Reading Table 5.5 by rows shows how much more likely speakers are to produce the listed ‘T’ variant compared to the variant listed in the columns, as iterance rate increases. For instance, the third row shows speakers are significantly less likely to produce [ɾ↖] compared to [ɾ↕] (t = -2.89), and [ɾ↔] (t = -2.29).
            1) [ɾ↕]    2) [ɾ↘]    3) [ɾ↖]    4) [ɾ↔]
1) [ɾ↕]     N/A        1.00       * 2.89     -0.11
2) [ɾ↘]     -1.00      N/A        1.88       -0.89
3) [ɾ↖]     * -2.89    -1.88      N/A        * -2.29
4) [ɾ↔]     0.11       0.89       * 2.29     N/A

Table 5.5: VGLM comparison of t-scores showing the change in likelihood of production of ‘T’ variants in single ‘T’ phrases, based on the iterance duration. * t-scores beyond ± 2 are significant.

5.3.2  First ‘T’ in double ‘T’ phrases

The VGLM analysis of the relationship between ‘T’ variant and iterance rate in the first ‘T’ of double ‘T’ phrases shows that speakers are less likely to use [ɾ↖] and [ɾ↔], and more likely to use [ɾ↕] and [ɾ↘], as iterance rate increases, as shown in Figures 5.7 and 5.8, and Table 5.6.

Figure 5.7: ‘T’ variants for the first ‘T’ in double ‘T’ phrases compared with iterance duration. [Axes: fitted (VGLM) probability (%) against iterance duration (ms); one curve per variant: ɾ↕, ɾ↘, ɾ↖, ɾ↔.]

Figure 5.8: Details of ‘T’ variants for the first ‘T’ in double ‘T’ phrases compared with iterance duration. [Panels: (a) alveolar tap, (b) down-flap, (c) up-flap, (d) postalveolar tap; axes as in Figure 5.7.]

The first row of Table 5.6 shows that speakers are significantly more likely to produce [ɾ↕] than [ɾ↖] (t = 3.35) the faster the iterance rate. The second row shows that speakers are significantly more likely to produce [ɾ↘] than [ɾ↖] (t = 4.17), and [ɾ↔] (t = 2.53) the faster the iterance rate.
            1) [ɾ↕]    2) [ɾ↘]    3) [ɾ↖]    4) [ɾ↔]
1) [ɾ↕]     N/A        -0.93      * 3.35     1.96
2) [ɾ↘]     0.93       N/A        * 4.17     * 2.53
3) [ɾ↖]     * -3.35    * -4.17    N/A        -0.02
4) [ɾ↔]     -1.96      * -2.53    0.02       N/A

Table 5.6: VGLM comparison of t-scores showing the change in likelihood of ‘T’ variant in the first ‘T’ of double ‘T’ phrases, based on iterance duration. * t-scores beyond ± 2 are significant.

5.3.3  Second ‘T’ in double ‘T’ phrases

VGLM analysis of the relationship between ‘T’ variant and iterance rate for the second ‘T’ of double ‘T’ phrases shows that as iterance rate increases, speakers are less likely to use [ɾ↘] and [ɾ↔], and more likely to use [ɾ↕] and [ɾ↖], as shown in Figures 5.9 and 5.10, and Table 5.7.

Figure 5.9: ‘T’ variants for the second ‘T’ in double ‘T’ phrases compared with iterance duration. [Axes: fitted (VGLM) probability (%) against iterance duration (ms); one curve per variant: ɾ↕, ɾ↘, ɾ↖, ɾ↔.]

Figure 5.10: Details of ‘T’ variants for the second ‘T’ in double ‘T’ phrases compared with iterance duration. [Panels: (a) alveolar tap, (b) down-flap, (c) up-flap, (d) postalveolar tap; axes as in Figure 5.9.]

The first row of Table 5.7 shows that speakers were significantly less likely to produce [ɾ↕] than [ɾ↖] (t = -3.53), and more likely to produce [ɾ↕] than [ɾ↔] (t = 3.69) as iterance rate increases. The second row shows that speakers were significantly less likely to produce [ɾ↘] than [ɾ↖] (t = -4.09), and more likely to produce [ɾ↘] than [ɾ↔] (t = 2.18) as iterance rate increases.
The third row shows that speakers were significantly more likely to produce [ɾ↖] than [ɾ↔] (t = 5.47) as iterance rate increases.

            1) [ɾ↕]    2) [ɾ↘]    3) [ɾ↖]    4) [ɾ↔]
1) [ɾ↕]     N/A        1.27       * -3.53    * 3.69
2) [ɾ↘]     -1.27      N/A        * -4.09    * 2.18
3) [ɾ↖]     * 3.53     * 4.09     N/A        * 5.47
4) [ɾ↔]     * -3.69    * -2.18    * -5.47    N/A

Table 5.7: VGLM comparison of t-scores showing the change in likelihood of flap/tap types in the second ‘T’ in double ‘T’ phrases, based on iterance duration. * t-scores beyond ± 2 are significant.

5.4  Discussion

The results show evidence for a relationship between ‘T’ variant production and iterance rate. For single ‘T’ phrases, the results were as expected, with participants producing fewer [ɾ↖] than either of the two taps the faster the iterance rate. The results also showed that [ɾ↕] were consistently more common in both the first and second ‘T’ in double ‘T’ phrases for speakers with faster iterance rates. Since [ɾ↕] have the lowest likelihood of articulatory conflicts of the four ‘T’ variants overall (see Table 5.1), the results suggest that those with faster iterance rates are more likely to be able to avoid all articulatory conflicts in rapid speech sequences.

The results also suggest that, at least in double ‘T’ phrases, other factors may be at play. In the first ‘T’ of double ‘T’ sequences, [ɾ↘] were also more common among people with faster iterance rates, a result for which we do not yet have a good explanation. Similarly, the results for the second ‘T’ in double ‘T’ sequences showed that while [ɾ↕] was more common as syllable duration decreased, so was [ɾ↖]; in comparison, [ɾ↔] was much less common. As we observed in chapter 4, speakers are more likely to produce [ɾ↖], [ɾ↔] sequences in the phrases ‘We have editor books’ and ‘We have auditor books’, a sequence that takes advantage of the need to move the tongue tip up from an initial ‘V’ during the first ‘T’ even if it is an [ɾ↕].
The speakers are likely to reduce tongue travel time in return for pronouncing a retroflexed vowel in the middle of the words ‘editor’ or ‘auditor’. It seems that speakers capable of faster iterance rates are more likely to avoid such articulatory conflicts in the middle of the double ‘T’ sequences.

The results show that iterance rate has an impact on ‘T’ variant selection such that faster iterators are able to produce articulations with longer travel distances and/or more frequent changes in motion to avoid articulatory conflicts. The results also show that North American English speakers have at least some individual subphonemic targets that differ from each other based on iterance rates. Most significantly, this interaction in subphonemic speech planning between avoidance of articulatory conflicts and the limitations of a person’s speech motor skill provides a framework for studying how speakers cope with limitations in production skill based on physical or cognitive disability.

5.4.1  Future work

In measuring the data, we noticed that people with longer tongue tips appeared to be more likely to produce initial down-flaps, indicating that there may be a relationship between tongue shape and ‘T’ variant selection worth examining. The importance of differences in individual physical morphology for speech production, as evidenced by differences in higher-order formants, is well known for identifying speakers (Sambur, 2003), but very little is known about the effect of differences in physical morphology on speech variation.

This paradigm for examining categorical variation in speech can also be directly applied to the examination of speech produced at different speech rates. We have long known that humans and animals change the way they move as they increase speed.
Decades ago, researchers discovered that animals switch the type of skeletal motions they employ as they move faster in order to conserve energy: humans switch from walking to running (Margaria, 1938), and horses switch from a walk to a trot to a gallop (Hoyt and Taylor, 1981). Each of these motions has an ideal speed, and deviations above and below that speed increase the energy consumption considerably.

Research reveals significant differences between vocal tract motion and skeletal motions (excluding the jaw). As long as there are no categorical shifts in motion style, the velocity contours, or first derivative, of skeletal motion maintain the same shape despite changes of duration and amplitude, but in vocal tract motion, changes in duration do change the velocity contour (Ostry et al., 1987). Also, faster vocal tract articulations have higher amplitude and stiffness (Ostry and Munhall, 1985), and people reduce tongue travel distance to increase speech rate (Goozée et al., 2005). These results point towards nonlinear, but gradient changes in vocal tract articulation as a result of speech rate changes.

Nevertheless, it is known that speaking rates induce categorical changes in articulation towards optimal syllable phase alignments (Kelso et al., 1986). Also, data from short laboratory experiments show that faster speech rates correlate with simpler vocal tract trajectories, but at least one attempt to predict speech rate from kinematic data from long data sets has failed (Tillmann and Pfitzinger, 2003).

Researchers may have found it difficult to find patterns in high speed speech because their analyses have been based on velocity (1st order derivative), acceleration (2nd order derivative), and jerk (3rd order derivative) or stiffness (the ratio of maximum velocity to distance travelled). Researchers had at least one good reason to do this.
While energy use by the vocal tract is more difficult to measure due to its small size in relation to the whole body, there is a strong relationship between the derivatives of motion and energy use. As a result, measuring changes in motion over time allows researchers to make assumptions about energy use.

Nevertheless, analyzing derivatives of motion abstracts away from the direction of motion, and obscures categorical changes. Vocal tract articulators do slow down for slow speech, but instead of just speeding up during fast speech, they change movement patterns in person-specific ways (McClean, 2000), making general predictions difficult, but also pointing back to the need to compare categorical kinematic changes relating to individual articulator speed. As a result, looking directly at categorical variation in speech, rather than derivatives of the same, will provide more insight into speech planning.

Chapter 6

Conclusion

I have provided data in this dissertation that necessitate substantial modification of theories of speech production. Evidence is presented suggesting that speech planning takes place below the level of the phoneme, and is constrained by factors such as avoiding articulatory conflict and differences in individuals’ motor skills. Further, a single speaker’s repetitions of a particular word in an identical environment vary not just gradiently, but categorically. Because categorical behaviour in speech production does not result exclusively from differences in phonological status and is instead governed by non-linguistic (e.g., motoric) factors, units of speech production do not straightforwardly map onto known units of linguistic representation.

6.1  Subphonemic planning and constraints on speech production

The first violable motor system constraint on speech production I examined is articulatory conflict. Gick and Wilson (2006) noted that languages use different strategies for resolving or avoiding potential articulatory conflicts.
The present study shows evidence from speakers’ responses to articulatory conflicts that such strategies are under the control of individual speakers. Strategies observed here include avoidance, quick transitions and end-state accommodation. In Chapter 2, the results showed that for single ‘T’ words, speakers usually selected particular ‘T’ variants in order to avoid articulatory conflict, as seen in Figures 2.13 and 2.14, and detailed in Figures A.1, A.2, A.3, and A.4. The results support a constraint against articulatory conflict, as illustrated in Figure 6.1.

V ɾ↕ V  >  V ɾ↘ V , V ɾ↖ V  >  V ɾ↔ V

Figure 6.1: Schematic tongue tip trajectory showing constraint against articulatory conflict. For a given ‘T’, the best outcome avoids rapid transitions into and out of the ‘T’.

The above constraint is incomplete, though, because it does not explain some of the data in chapter 2. For the word ‘Berta’, speakers produced quick transitions out of a word-initial [ɹ̩], raising the tongue tip prior to a [ɾ↘] for productions with a [ɹ̩ɾ↘V] sequence. Similar results did not occur with production of ‘otter’; that is, there were almost no [Vɾ↖ɹ̩] sequences. The results support an end-state comfort constraint, where ‘T’ productions that require transitions before the ‘T’ are preferred over ‘T’ productions that require rapid transitions after the ‘T’, as illustrated in Figure 6.2.

V ɾ↕ V  >  V ɾ↘ V  >  V ɾ↖ V  >  V ɾ↔ V

Figure 6.2: Schematic tongue tip trajectory showing end-state comfort constraint: For a given ‘T’, the best outcome avoids rapid transitions, the next avoids transitions out of the ‘T’, the third best avoids transitions into the ‘T’, and the worst outcome has rapid transitions into and out of the ‘T’.
In words/phrases with ‘T’ sequences, speakers sometimes avoided all articulatory conflicts, and other times only avoided conflicts at the beginning and end of the sequences, as evidenced in ‘editor’ and ‘auditor’, where the vowel in the middle of the word was often rhotacized, as can be inferred from Figures 4.7 and 4.9. Similar results were observed with 6 (or 7, α = 0.1) speakers for the ‘edit a’ and ‘audit a’ sequences, as seen in Figure 4.10 and Table 4.6. These results further demonstrate the Rosenbaum et al. (1992) end-state comfort effect in speech, which itself has often been used in the motor literature as evidence for planning. That is, for a given word/phrase with two ‘T’s, the best outcome achieves non-end-state and end-state comfort, followed by a violation of non-end-state comfort only, followed by a violation of end-state comfort only, and finally the worst is a violation of both non-end-state and end-state comfort, as illustrated in Figure 6.3.

Figure 6.3: Schematic tongue tip trajectory showing end-state comfort constraint: Example from the word ‘edit a’. [Four ranked candidate trajectories, 1) to 4).]

Since all of these constraints would support productions that completely avoid rapid transitions or state-comfort violations, they cannot explain any of the observed variability within or between speakers. In order to explain why behaviour in the same environment differs between and within speakers, there must be a constraint opposed to always successfully avoiding articulatory conflicts. For instance, speakers may not be good at producing one or more of the ‘T’ types. Four of the participants never used [ɾ↔]s in single ‘T’ words, as seen in Figure 2.12, and three participants (8, 15, and 23) rarely used them in any context (see Figure 4.8). I therefore propose a constraint favouring flaps over taps, as illustrated in Figure 6.4.
ɾ↘ , ɾ↖  >  ɾ↕ , ɾ↔

Figure 6.4: Schematic tongue tip trajectory showing one direction of motion > two directions of motion: Flaps ([ɾ↘], [ɾ↖]), which have one direction of motion, are preferred over taps ([ɾ↕], [ɾ↔]), which have two.

Similarly, in ‘T’ sequences, the results, especially from Chapters 3 and 4, provide evidence that flap sequences that produce one arc of motion over two ‘T’s are preferred over tap sequences. This constraint is illustrated in Figure 6.5.

ɾ↖ ɾ↘  >  ɾ↕ ɾ↕

Figure 6.5: Schematic tongue tip trajectory showing one arc of motion > two arcs of motion: Flap sequences ([ɾ↖], [ɾ↘] or [ɾ↘], [ɾ↖]), which have one arc of motion, are preferred over tap sequences ([ɾ↕], [ɾ↕] or [ɾ↔], [ɾ↔]), which have two arcs of motion.

This constraint can be refined with reference to the evidence for gravitational and elastic effects on speech production, as presented in Chapter 3, and in particular Figure B.1. The refined constraint is illustrated in Figure 6.6.

ɾ↖ ɾ↘  >  ɾ↕ ɾ↕  >  ɾ↘ ɾ↖

Figure 6.6: Schematic tongue tip trajectory showing gravitational and myoelastic constraint: ‘T’ sequences that use gravity ([ɾ↖], [ɾ↘]) are preferred over sequences that do not ([ɾ↕], [ɾ↕]). The worst are those that oppose gravity ([ɾ↘], [ɾ↖]).

While not presented in this thesis, preliminary results on the interaction between speech errors and strategy shifts (changes in production between otherwise identical utterances) in my data suggest that speech errors may impact these plans for many minutes after the error.

In this analysis, subphonemic speech planning takes into account potential upcoming articulatory conflicts, a preference for flaps over taps, and the relationship between articulator motion and constraints like gravity and elasticity to form a particular utterance at a particular moment.
The results from Chapter 5 suggest that speakers who are capable of producing rapid syllable repetition at a faster rate were more likely to be able to produce ‘T’ sequences that avoided articulatory conflicts than those who produced such sequences more slowly. That is, speakers optimize and plan their speech productions to avoid violating these constraints on speech production.

In this analysis, the differences in actual speech productions for different speakers are represented as differences in the relative importance of each constraint for different speakers. Much like in an OT representation (McCarthy and Prince, 1993), lower ranking constraints may be violated in the production of optimal output. Also, given that many factors can influence motor skills, much of the variance demonstrated in chapter 2 may be related to changes in the current motor skills of the speaker (e.g., fatigue, attention, recent speech errors). That is, within-speaker variability may result from changes in the relative importance of these constraints at the time of utterance. Because details of these constraints, in particular a person’s motor skills at the time of utterance, can change from utterance to utterance, the plan may change and therefore the directions of articulator motion may change from utterance to utterance. An example of the impact of different ordering of the one direction vs. two directions of motion constraint and the articulatory conflict constraint, based on an utterance of the word ‘Berta’, is illustrated in Figure 6.7.

Figure 6.7: Relative importance of one direction > two directions of motion constraint vs. articulatory conflict constraint. [Two rankings shown for ‘Berta’: with articulatory conflict ranked higher the output is ɹ̩ ɾ↕ ʌ; with 1 motion > 2 motions ranked higher the output is ɹ̩ ɾ↘ ʌ.]
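The idea that reordering two constraints changes the selected production can be sketched as a small ranked-constraint evaluation. This is an illustrative toy under stated assumptions, not a formal model proposed in this thesis: the violation counts are hypothetical, assigned for a ‘Berta’ production with a tip-down initial rhotic, and the names `CANDIDATES` and `best_candidate` are invented for the example. Candidates are compared lexicographically, so a violation of a higher-ranked constraint always outweighs violations of lower-ranked ones.

```python
# Hypothetical violation counts for two candidate productions of 'Berta'
# with a tip-down initial rhotic. The numbers illustrate the idea only.
CANDIDATES = {
    # [ɹ̩ ɾ↕ ʌ]: no articulatory conflict, but the tap has two directions.
    "alveolar_tap": {"articulatory_conflict": 0, "one_direction": 1},
    # [ɹ̩ ɾ↘ ʌ]: one direction of motion, but the tip must first be raised.
    "down_flap": {"articulatory_conflict": 1, "one_direction": 0},
}


def best_candidate(candidates, ranking):
    """Return the candidate whose violation profile is lexicographically
    smallest when constraints are read from highest to lowest rank."""
    return min(
        candidates,
        key=lambda name: tuple(candidates[name][c] for c in ranking),
    )


if __name__ == "__main__":
    print(best_candidate(CANDIDATES, ["articulatory_conflict", "one_direction"]))
    print(best_candidate(CANDIDATES, ["one_direction", "articulatory_conflict"]))
```

In this sketch, a speaker or moment of utterance is represented by nothing more than the order of the `ranking` list, which is one simple way to picture variation between and within speakers.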
Alternative analyses of these observations, wherein a wide range of possible outcomes are stored rather than planned, could in principle account for some of these results. Examples of theories that employ such mechanisms include usage-based grammars (e.g., Bybee, 1995; Tomasello, 2005), exemplar theories (e.g., Pierrehumbert, 2001), or models that allow complex representation of each word with stored choices at specific points in the word (e.g., Hudson, 1980). One thing this family of theories can explain is why we should care where our tongue tip is for non-rhotic vowels. Speakers would have predominantly tip-down vowels in their cloud of experiences, making this tongue position desirable in speech. Also, speakers appear to have a preference for tip-up vs. tip-down rhotic vowels in some circumstances based on usage patterns. The problem is illustrated in Figure 6.8, and a usage-based constraint might explain the preference for options 1 and 2 over options 3 and 4, just as a very low tongue tip position for the initial vowel might explain the typical preference for option 1 over option 2. That is, since a very low tongue tip requires a larger motion for an initial ‘T’ than for a final ‘T’, the preference for an initial flap (highlighted in green) is more significant than the preference for a final flap (highlighted in gold).

Figure 6.8: Relative importance of the one direction > two directions of motion constraint vs. the state comfort constraint, shown for four candidate productions of ‘editor’. Here an initial flap (green) is assumed to be preferred over a final flap (gold). Also note that a preference for a word-final [ɻ̩] predicts different outputs from a preference for a word-final [ɹ̩].

However, most such accounts would have difficulty explaining why individual speakers’ repetitions of phrases within the same context vary categorically; this speech behaviour cannot be entirely explained through the memorization of possible productions based on phonological context. The results of this research therefore support a theory of subphonemic speech planning that responds to constraints that may vary from speaker to speaker and utterance to utterance. Such an analysis works well with a theory of speech motor control that employs coordinative structures.

6.2  Coordinative structures

Many researchers criticize the speech motor program for emphasizing brain computation and memorization over the dynamical properties of speech motor control (Easton, 1972; Turvey et al., 1978). When the dynamical properties of the motor systems of living things are taken into account, the degrees of freedom of motion (Bernstein, 1967) are reduced by the natural properties of living tissues (Kelso et al., 1981). This type of criticism led to and helped support an alternative vision of motor control, the coordinative structure. Coordinative structures are not memorized motor routines; instead, the motor routines of the coordinative structure are assembled and exist only until the task is accomplished, and require no central brain control (Summers and Anson, 2009; Turvey et al., 1982). Coordinative structures still require some external command component. In task dynamics, this external command is based on task space, which involves no particular muscle motions until those motions are required to achieve the task (Kelso et al., 1986). These tasks are described in terms of constriction location and constriction degree, independent of specific articulators.
Only in the actual execution of a coordinative structure is this task attached to body space through articulator-based tract variables (see Saltzman, 1979; Saltzman and Kelso, 1987). At this point the recruitment of specific muscles, and through them the direction and contour of motion, is assembled and executed to achieve the task. In this theory, the direction of motion to and from the target is not memorized, but instead encoded at the time of execution. Such structures can go a long way toward explaining the observations in this dissertation, but only if one accepts some extreme disparities between units of action/scope of coordinative structures and units of linguistic representation.

6.3  Disparity between units of action and units of linguistic representation

There have been many attempts to match units of action to units of linguistic representation. In SPE, Chomsky and Halle (1968) posited a trivial and direct conversion of features to physiological output. Meyer and Gordon (1985) proposed an interactive-activation model matching phonemes to features to articulation. Perkell et al. (2000) match phonemes to memorized acoustic targets, mapping articulation to area functions (shapes) of the vocal tract. Browman and Goldstein (1986, 1989, 1992) map coordinative structures to gestures, which are themselves units of linguistic representation ranging in approximate size from the feature to the phoneme. Gestures are in turn organized as gestural scores, which through task dynamics (Saltzman and Byrd, 2000; Saltzman and Kelso, 1987) become speech actions. Gestural scores map to either the word (Browman and Goldstein, 1992) or the syllable (Levelt, 1994), such that these theories have a one-to-one correspondence between gestures and coordinative structures, and between gestural scores and syllables or words.
While the particulars of these theories vary considerably, they all share the goal of finding a one-to-one correspondence between actions and linguistic units. However, the present findings suggest that there are disparities between speech actions and linguistic units such that one linguistic unit will correspond to many speech actions, or one speech action will correspond to many linguistic units, depending on the context and what is most efficient.

It is a well-known observation that speakers never produce the same articulation the same way twice, even in the same context. Typically such differences are seen as a matter of normal gradient variability around a mean articulation pattern. For instance, recent research suggests that the variability in production of vowels is confined to a relatively tight space of articulation, regardless of the size of the phonetic inventory/perception space of the language (Meunier et al., 2003), pointing toward a unified motor routine executed slightly differently each time. There were already recorded exceptions to this generalization, including the well-known rhotic variability in English described by Delattre and Freeman (1968). However, the different rhotics all share similar constriction locations and degrees around the alveolar region, as illustrated in Figure 6.9. That is, depending on one’s definition of production goals, [ɹ̩] and [ɻ̩] may share a similar task (as well as pharyngeal and labial constrictions).

Figure 6.9: Both ‘R’ variants ([ɹ̩], [ɻ̩]) can be generated by a coordinative structure over the same constriction location and constriction degree.

While this categorical variability was only documented for differing phonological contexts, there are cases of proven extreme variability within the same contexts.
One of the most significant and interesting is the production of the Spanish alveolar fricative /s/. Many speakers will produce a long-duration [s], a short [s], an [h], or no audible production for the same words in apparently the same context (File-Muriel and Brown, 2010). While the differences in frication can arguably be attributed to very small variations in place and degree of constriction (see Stevens, 2000), the differences in duration in the same contexts are harder to explain away as slight variations in the same motor routines. Nevertheless, the differences are gradient, not categorical, and were not used to reanalyze the relationship between linguistic units and actions.

This dissertation documents categorical kinematic variation within the same phonetic context. Evidence for similar behaviour in English /t/ vs. glottal stop variation was collected for Manuel and Vatikiotis-Bateson (1988), but not reported. Close examination of this variation has revealed a range of influencing factors. For the word ‘murder’, all but three speakers used at least two of the four most likely sequences, as presented in the introduction (see Figures 2.13 and 2.14). In an extreme case, participant 8 produced all four versions in a single sitting. This represents extreme categorical subphonemic variation, standing in stark contrast to the stability of the linguistic representation. Nevertheless, Figure 6.10 illustrates how all four of these ‘T’ variants share the same constriction location and constriction degree (i.e., arguably the same task).

Figure 6.10: All the ‘T’ variants (alveolar tap [ɾ↕], down-flap [ɾ↘], up-flap [ɾ↖], postalveolar tap [ɾ↔]) can be generated by a coordinative structure over the same constriction location and constriction degree.

Yet a very similar action can span multiple phones/segments in the production of the word ‘Saturday’.
One action, instantiated in the initial ‘T’ and perpetuated through muscle relaxation, gravity and elasticity to the end of the final ‘T’, can produce the most common [ɾ↖ɻ̩ɾ↘] sequence, compared to the three actions that probably control each phone for much less common sequences such as [ɾ↕ɹ̩ɾ↕]. The difference between these two sequences is illustrated in Figure 6.11.

Figure 6.11: Schematic tongue tip trajectory showing that some ‘T’ and ‘R’ sequences, such as [ɾ↖], [ɻ̩], [ɾ↘], can result from one coordinative structure, whereas others, such as [ɾ↕], [ɹ̩], [ɾ↕], will result from multiple coordinative structures.

This means that the relationship between actions (i.e., the scope of coordinative structures) and units of linguistic representation will vary depending upon the context. That is, it depends on whether efficiencies within the production environment are available to reduce the number of actions needed to accomplish an accurately perceivable output. By disconnecting units of linguistic representation from units of action, each system is free to function in the most efficient manner required to achieve the goals of both.

6.4  Defining task space

As described above, theories with coordinative structures can provide an economical description of categorical kinematic variation in this dataset. To account for the data in this dissertation, however, coordinative structures need to be uncoupled from tasks, as well as from units of linguistic representation, in order to take advantage of potential efficiencies within the production environment. Also, to avoid needing to encode direction of motion, coordinative structures need to use local and longer-distance articulatory conflicts, a person’s motor skills, gravity and elasticity to provide information that is used to generate direction of motion at utterance time.
Because coordinative structures are soft-assembled, with a sufficiently inclusive task space they may be able to account for the observed variability, including the within-speaker, within-context variability, with soft-assembled solutions that take into account the relevant information at the time of utterance. When combined with what we know about language, these results point toward an analysis that depends on much more than just constriction location, constriction degree, and planning. Action tasks must also access articulator behaviour, and aerodynamic, visual, and auditory information. For instance, as Kelso et al. (1986) originally suggested, fixed spatial locations and degrees of constriction are not adequate for describing what happens with labials, since labial stops involve compressing the lips together, labial fricatives are typically produced by pulling the lower lip against the bottom of the upper teeth, and labial approximants are produced by rounding the lips. There is therefore no stable shared location in relation to incremental changes in the degree of constriction, and so the task would need access to information about how articulators work. (Though see Gick et al. (in press) for a simulation-based reanalysis of the labial problem.) Aerodynamic information provides more detailed information for differentiating many speech sounds, as well as explaining many phonological observations (see Ohala, 1997). Visual information is not just useful in speech perception (Sumby and Pollack, 1954), but in some cases necessary for the perception of speech (Ménard et al., 2008; Sato et al., 2010), especially at the end of some words, e.g. in Oneida and Blackfoot (Gick et al., 2011). In these cases, the task space must include visual information.
Similarly, some sounds, especially vowels, appear to rely on auditory information as much as or more than place and degree of constriction (see Stevens, 2000), as there is too much instability in production for the system to depend on place and degree of constriction alone. Given the preference for non-rhotic vowels to be produced with the tongue tip down, shape might also be a constraint. Such evidence points toward an analysis of the task space of speech motor control as a multi-dimensional structure, with the importance of one or more parts of the structure based largely on what is required to achieve the individual task.

6.5  Future work

The results of this research do not include any analysis of the acoustic signal beyond that which is required to identify the contact time of ‘T’ variants, as indicated in Zue and Laferriere (1979). Future work therefore includes studying acoustic correlates of the ‘T’ variants. Possible correlates include the duration of acoustic effects of the ‘T’ variant, as well as transitions into and out of the ‘T’, in particular the F4 and F5 formants (see Zhou et al., 2007). An ability to identify the four ‘T’ variants based on acoustic correlates would make future research much easier, in particular if acoustic correlates of ‘R’ variants, as identified in Zhou et al. (2007), are consistently useful in identifying the differences between [ɻ̩] and [ɹ̩], as well as between normal and rhotacized non-rhotic vowels. This information would reduce the need for articulatory measures and allow much larger-scale projects on categorical differences in ‘T’ and ‘R’ production. But even if such correlates are not identifiable through acoustic measurements, they may still be perceivable by native speakers of English. That is, ‘T’ variants in the same phonological context may have an impact on speech perception. Similarly, it is likely that the difference between rhotacized and non-rhotacized ‘V’ may have a significant impact on speech perception.
The potential differences in perception of these sounds are worth studying, both when the sounds are produced in the contexts in which they were found and when they are produced in isolation. This information will help identify whether state-comfort effects are purely articulatory, or whether there is a potential perceptual acoustic factor. There is also a need to follow up on the preliminary evidence that previous speech errors have an impact on ‘T’ and ‘R’ variants produced by the same speakers in the same phonological context. Such observations would provide evidence for another cause of within-speaker variation. Other future research includes plans to look at the effects of induced speech rate changes on ‘T’ production, as well as the effects of speech motor disabilities or partial glossectomies on ‘T’ production. The results also suggest that primitive aspects of phonology, specifically constraint ranking, are present in speech motor behaviour. However, these constraint rankings are person- and event-specific, and the relationship between these constraints and other traditional phonological constraints has yet to be established. Therefore, further research is indicated.

6.6  Limitations

This research used an ultrasound probe that was placed under the chin in such a way that it may attenuate jaw motion in some speakers. Also, because of the way ultrasound works, the effects of the remaining jaw motion are difficult to distinguish from the effects of tongue motion. Future research might benefit from the use of an ultrasound helmet that allows the probe to move freely with the jaw (see Miller and Finch, 2011). Future research might also benefit from the use of recording techniques that do not restrict jaw motion, such as the new WAVE 3D EMA device.
At the same time, such a device would allow very careful analysis of the motion of a few points on the tongue tip and blade, making it possible to see any gradient motion effects that are masked by pooling data into categorical ‘bins’. On that note, while the ‘T’ and ‘R’ data are relatively easy to describe in terms of categories, there are many possible gradient differences within each category, the identification of which might help explain shifts in categorical behaviour within the same speaker and phonological context. At the very least, it may be possible to quantify potential fatigue effects in this way. Using EMA to reveal the gradient details of ‘T’ and ‘R’ motion would be particularly useful in planned research on the effects of speech rate on ‘T’ production, since some of the variability (e.g., ‘T’ to stop in slow speech, or ‘T’ to approximant in fast speech) is an entirely gradient production change. Nevertheless, EMA devices have receivers attached to the tongue, which can interfere with tongue-tip/palate interactions unless the receivers are moved back 1 cm or more from the tip, in which case much of the relevant information may be lost. Also, pellets can fall off, making it difficult to run long experiments. In addition, with flesh-point measurement, it can be difficult to reconstruct the motion of the articulator due to the limited number of points measured. Lastly, while EMA is more invasive than ultrasound, and so more difficult to use with non-standard speaker populations, using an EMA device for research into ‘T’ and ‘R’ production among those with speech motor disabilities or partial glossectomies might be useful for identifying possible productions that are difficult to predict in advance of the research.
6.7  Conclusion

Results in this dissertation implicate a system of constraints minimally including articulatory conflict, a preference for one motion direction or motion arc (flaps and flap sequences) over two (taps and tap sequences), and gravity and elasticity, mediated by individual- and utterance-specific motor skills interacting with the details of motor control down to the level of the articulators. While the results are consistent with some planning to account for the longer-distance constraints across syllable, morpheme and word boundaries, this does not imply that we plan all aspects of our speech acts from scratch all the time. A motor control system that requires planning of every detail of every speech act would be computationally impossible for the brain to implement, and it does not match what we already know about the brain, including the sensory register, short-term and long-term memory (Klatzky, 1975). At the same time, the results strongly argue for a disparity between motor actions and coordinative structures on the one hand, and units of linguistic representation on the other. Instead, the two interact to accomplish accurately perceivable output. As a result, I believe a complete model of the speech motor control system must include some memorization, some planning, and some degree of self-organization, free to self-organize in whatever way and over whatever scope most efficiently accomplishes the output required for accurate speech perception.

We plan small speech acts. Language intertwines freely, both most efficient.

Bibliography

Abd-El-Malek, S. (1939). Observations on the morphology of the human tongue. Journal of Anatomy, LXXIII(2):201–212. → pages 49, 50
Bernhardt, B. H. and Stemberger, J. P. (1998). Handbook of phonological development from a nonlinear constraints-based perspective. Academic Press, San Diego, CA. → pages ii, 67
Bernstein, N. A. (1967). The co-ordination and regulation of movements. Pergamon Press, Oxford.
→ pages 116 Boyce, S. E. (1990). Coarticulatory organization for lip rounding in Turkish and English. Journal of the Acoustical Society of America, 88:2584–2595. → pages 66 Browman, C. P. and Goldstein, L. (1986). Towards an articulatory phonology. Phonology Yearbook, 3:219–252. → pages 5, 46, 116 Browman, C. P. and Goldstein, L. (1989). Articulatory gestures as phonological units. Phonology, 6:201–251. → pages 5, 46, 116 Browman, C. P. and Goldstein, L. (1992). Articulatory phonology: An overview. Phonetica, 49:166–180. → pages 5, 46, 116, 117 Buchaillard, S., Perrier, P., and Payan, Y. (2009). A biomechanical model of cardinal vowel production: Muscle activations and the impact of gravity on tongue positioning. Journal of the Acoustical Society of America, 126(4):2033–2051. → pages 58 Bybee, J. (1995). Regular morphology and the lexicon. Language and Cognitive Process, 10(5):425–455. → pages 87, 114 Chang, Y.-C., Lee, F.-P., Peng, C.-L., and Lin, C.-T. (2003). Measurement of tongue movement during vowels production with computer-assisted B-mode 125  and M-mode ultrasonography. Otolaryngology Head and Neck Surgery, 128(6):805–814. → pages 16 Chapman, K. M., Weiss, D. J., and Rosenbaum, D. A. (2010). Evolutionary roots of motor planning: The end-state comfort effect in lemurs. Journal of Comparative Psychology, 124(2):229–232. → pages 67 Chomsky, N. and Halle, M. (1968). The sound pattern of English. Harper & Row, New York. → pages 5, 45, 116 Cohen, R. G. and Rosenbaum, D. A. (2004). Where grasps are made reveals how grasps are planned: generation and recall of motor plans. Experimental Brain Reseaarch, 157(4):487–95. → pages 67 DeJong, K. (1998). Stress-related variation in the articulation of coda alveolar stops: fapping revisited. Journal of Phonetics, 26:283–310. → pages 10 Delattre, P. and Freeman, D. (1968). A dialect study of American Rs by x-ray motion picture. Linguistics, 44:29–68. → pages 1, 6, 8, 68, 90, 117 Dell, G. S. (1986). 
A spreading-activation theory of retrieval in sentence production. Psychological Review, 93(3):283–321. → pages 67 Derrick, D., Anderson, P., Gick, B., and Green, S. (2009). Characteristics of air puffs produced in English ‘pa’: Experiments and simulations. Journal of the Acoustical Society of America, 125(4):2272–2281. → pages 65 Derrick, D. and Gick, B. (2008). Quantitative analysis of subphonemic flap/tap variation in nae. Canadian Acoustics, 36-3:162–163. → pages 2 Easton, T. A. (1972). On the normal use of reflexes. American Scientist, 60:591–599. → pages 115 Fels, S., Stavness, I., Hannam, A., Lloyd, J. E., Anderson, P., Batty, C., Chen, H., Combe, C., Pang, T., Mandal, T., Teixeira, B., Greena, S., Bridson, R., Lowe, A., Almeida, F., Fleetham, J., and Abugharbieh, R. (2009). Advanced tools for biomechanical modeling of the oral, pharyngeal, and laryngeal complex. In International Symposium on Biomechanics Healthcare and Information Science, page online. → pages 58 Fels, S., Vogt, F., van den Doel, K., Lloyd, J., Stavness, I., and Vatikiotis-Bateson, E. (2006). Artisynth: A biomechanical simulation platform for the vocal tract and upper airway. Technical Report TR-2006-10, Computer Science Dept., University of British Columbia. → pages 58 126  Fels, S. S., Vogt, F., Gick, B., Jaeger, C., and Wilson, I. (2003). User-centered Design for an Open source 3D Articulatory synthesizer. In Proceedings of the 15th International Congress of Phonetic Science (ICPhS), pages 179–182. → pages 58 File-Muriel, R. J. and Brown, E. K. (2010). The Gradient Nature of S-Lenition in Cale˜no Spanish. In University of Pennsylvania Working Papers in Linguistics, volume 16 (2), pages 46–55. → pages 118 Folkins, J. W. and Abbs, J. H. (1975). Lips and jaw motor control during speech: Responses to resistive loading of the jaw. Journal of Speech and Hearing Research, 18:207–220. → pages 9 Fowler, C. A. (1980). Coarticulation and theories of extrinsic timing. 
Journal of Phonetics, 8(113-133). → pages 6, 66 Gay, T. (1974). A cineflurographic study of vowel production. Journal of Phonetics, 2(255-266). → pages 11, 90 G´erard, J.-M., J.Ohayon, Luboz, V., Perrier, P., and Payan, Y. (2005). Non-linear elastic properties of the lingual and facial tissues assessed by indentation technique. application to the biomechanics of speech production. Medical Engineering & Physics, 27:884–892. → pages 58 G´erard, J.-M., Perrier, P., and Payan, Y. (2006). 3D biomechanical tongue modelling to study speech production. In Harrington, J. and M. Tabain (Psychology, N. Y., editors, Speech Production: Models, Phonetic Processes and Techniques, pages 85–102. → pages 58 G´erard, J.-M., Wilhelms-Tricarico, R., Perrier, P., and Payan, Y. (2003). A 3D dynamical biomechanical tongue model to study speech motor control. Recent Research Developments in Biomechanics, 1:49–64. → pages 58 Gick, B., Bliss, H., Michelson, K., and Radanov, B. (2011). Articulation without acoustics: “Soundless” vowels in Oneida and Blackfoot. Journal of Phonetics. To appear. → pages 121 Gick, B., Stavness, I., Chiu, C., and Fels, S. (In Press). Categorical variations in lip posture is determined by quantal biomechanical-articulatory relations. Canadian Acoustics. → pages 121 Gick, B. and Wilson, I. (2006). Excrescent schwa and vowel laxing: Cross-linguistic responses to conflicting articulatory targets. In Goldstein, L., 127  Walen, D. H., and Best, C. T., editors, Laboratory Phonology 8, pages 635–659. Mouton de Gruyter, New York. → pages ii, 3, 9, 90, 109 Gick, B., Wilson, I., Koch, K., and Cook, C. (2004). Language-specific articulatory settings: Evidence from inter-utterance rest position. Phonetica, 61:220–233. → pages 47 Gooz´ee, J. V., Stephenson, D. K., Murdoch, B. E., Darnell, R. E., and Lapointe, L. L. (2005). Lingual kinematic strategies used to increase speech rate: Comparison between younger and older results. Clinical Linguistics & Phonetics, 19(4):319–334. 
→ pages 108 Guenther, F. H., Espy-Wilson, C. Y., Boyce, S. E., Matthies, M. L., Zandipour, M., and Perkell, J. S. (1999). Articulatory tradeoffs reduce acoustic variability during American English /r/ production. Journal of the Acoustical Society of America, 105(5):2854–2865. → pages 5 Hagiwara, R. (1995). Acoustic Realizations of American /r/ as Produced by Women and Men. PhD thesis, UCLA. → pages 2, 11, 68 Hannam, A., Stavness, I., Lloyd, J. E., and Fels, S. (2008). A dynamic model of jaw and hyoid biomechanics during chewing. Journal of Biomechanics, 41(5):1069–1076. → pages 58 Hedrick, W. R., Hykes, D. L., and Starchman, D., editors (1995). Ultrasound Physics and Instrumentation. Moasby, St. Louis, 3rd edition. → pages 15 Henke, W. (1966). Dynamic articulatory model of speech production using computer simulation. PhD thesis, MIT. → pages 6, 66 Hockett, C. (1955). A Manual of Phonology, volume 11. Indiana University Publications in Anthropology and Linguistics, Baltimore. → pages 10 Hoole, P., Munhall, K., and Mooshammer, C. (1998). Do airstream mechanisms influence tongue movement paths? Phonetica, 55(3):131–146. → pages 65 Houde, R. A. (1968). A study of tongue body motion during selected speech sounds. Speech Communication Research Laboriatory Monograph No. 2, Santa Barbara, CA. → pages 65 Hoyt, D. F. and Taylor, C. R. (1981). Gait and the energetics of locomotion in horses. Nature, 292(16):239–240. → pages 107 Hudson, G. (1980). Automatic alternations in non-transformational phonology. Language, 51(1):94–125. → pages 87, 114 128  Ingram, J. and Laughren, M. (1999). The stop flap contrast in western warlpiri. In Henderson, J., editor, Australian Linguistic Society Conference, 1999: proceedings., Perth. Australian Linguistic Society. → pages 10 Jakobson, R., Fant, G., and Halle, M. (1951). Preliminaries to speech analysis: The distinctive features and their correlates. MIT Press, Cambridge, MA. 
→ pages 45 Kavitskaya, D., Iskarous, K., Moiray, A., and Proctor, M. (2009). Trills and Palatalization: Consequences for Sound Change. Yale and Haskins Laboratories. → pages 15 Keele, S. W. (1968). Movement control in skilled motor performance. Psychological Bulletin, 70:387–403. → pages 42 Kelkar, A. R. (1968). Studies in Hindi-Urdu, pages ix + 87. Deccan College Building Centenary and Jubilee Series, 35, Poona: Deccan College Postgraduate and Research Institute. → pages 10 Kelsey, C. A., Crummy, A. B., and Schulman, E. Y. (1969). Comparison of ultrasonic and cineradiographic measurement of lateral pharyngeal wall motion. Investigative Radiology, 4:241–245. → pages 6, 15 Kelso, J. A., Saltzman, E. L., and Tuller, B. (1986). The dynamical perspective on speech production: Data and theory. Journal of Phonetics, 14(1):29–59. → pages 43, 108, 116, 121 Kelso, J. A. S., Holt, K. G., Rubin, P., and Kugler, P. N. (1981). Patterns of human interlimb coordination emerge from the properties of non-linear, limit cycle oscillatory processes: Theory and data. Journal of Motor Behavior, 13(4):226–261. → pages 116 Kent, R. and Moll, K. (1972). Cineflurographic analysis of selected lingual consonants. Journal of Speech and Hearing Research, 15:453–473. → pages 65 Klatzky, R. L. (1975). Human Memory: Structures and Processes. W. H. Freeman & Co., San Francisco. → pages 124 Ladefoged, P. and Maddieson, I. (1996). The Sounds of the World’s Languages. Blackwell, Oxford. → pages 10 Levelt, W. J. M. (1989). Speaking: From Intention to Articulation. MIT Press, Cambridge, Massachusetts. → pages ii, 6, 67 129  Levelt, W. J. M. (1994). Do speakers have access to a mental syllabary? Cognition, 50:239–269. → pages 5, 117 Lindblom, B. (1983). Economy of speech gestures. In MacNeilage, editor, The production of speech. Springer, New York. → pages 69 MacNeilage, P. F. and Sholes, G. N. (1964). An Electromyographic Study of the Tongue During Vowel Production. 
Journal of Speech and Hearing Research, 7(3):209–232. → pages 49 Maddieson, I. (1984). Patterns of Sounds. Cambridge University Press, Cambridge. → pages 10 Manuel, S. Y. and Vatikiotis-Bateson, E. (1988). Oral and glottal gestures and acoustics of unterlying /t/ in English. Journal of the Acoustical Society of America, 84:S84. → pages 118 Margaria, R. (1938). Physiology and energy expenditure during walking and running at different speeds and slopes of the ground. Atti della Reale Accademia Nazionale dei Lincei, 7:277–283. → pages 107 McCarthy, J. and Prince, A. (1993). Prosodic morphology: Constraint interaction and satisfaction. Rutgers University Center for Cognitive Science Technical Report 3. → pages 113 McClean, M. D. (2000). Patterns of orofacial movement velocity across variations in speech rate. Journal of Speech, Language, and Hearing Research, 43:205–216. → pages 108 Medicines and Healthcare products Regulatory Agency (2004). Evaluation report mhra 03107. Technical report, MHRA. → pages 20, 51, 73, 94 M´enard, L., Leclerc, A., Brisebois, A., Aubin, J., and Brasseur, A. (2008). Production and perception of french vowels by blind and sighted speakers. In 8th international Seminar on Speech Production, pages 197–200. → pages 121 Meunier, C., Frenck-Mestre, C., Lelekov-Boissard, T., and Le Besnerais, M. (2003). Production and perception of vowels: does the density of the system play a role? In Submission. → pages 117 Meyer, D. E. and Gordon, P. C. (1985). Speech Production: Motor Programming of Phonetic Features. Journal of Memory and Language, 24:3–26. → pages 5, 45, 116  130  Mielke, J., Baker, A., and Archangeli, D. (2010). Variability and homogeneity in american english /ô/ allophony and /s/ retraction. Labphon. → pages 9, 10 Miller, A. L. and Finch, K. B. (2011). Corrected high-frame rate anchored ultrasound with software alignment. Journal of Speech, Language, and Hearing Research, 54:471–486. → pages 123 Miller, J. L. and Watkin, K. L. (1997). 
Lateral pharyngeal wall motion during swallowing using real time ultrasound. Dysphagia, 12:125–132. → pages 16 Miyawaki, K. (1974). A study on the musculature of the human tongue. Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, 8:23–50. → pages 49 Miyawaki, K., Hirose, H., Ushijima, T., and Sawashima, M. (1975). A preliminary report on the electromyographic study of the activity of lingual muscles. Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, 9:91–106. → pages 49, 58 Monnot, M. and Freeman, M. (1972). A comparison of Spanish single-tap /r/ with American /t/ and /d/ in post-stress intervocalic position. In Valdman, A., editor, Papers in Linguistics to the memory of Pierre Delattre, pages 409–416, The Hague. Mouton. → pages 10 Mowrey, R. A. and MacKay, I. R. A. (1990). Phonological primitives: Electromyographic speech error evidence. Journal of the Acoustical Society of America, 88(3):1299–1312. → pages 67 Munhall, K. G., Kawato, M., and Vatikiotis-Bateson, E. (2000). Coarticulation and physical models of speech production. In Broe, M. B. and Pierrehumbert, J. B., editors, Papers in Laboratory Phonology V: Acquisition and the Lexicon, chapter 1, pages 9–28. Cambridge University Press, Cambridge, UK. → pages 43, 67 Narayanan, S. S. and Alwan, A. A. (1997). Toward articulatory-acoustic models of liquid approximants based on MRI and EPG data. Part I. The laterals. Journal of the Acoustical Society of America, 101(2):1064–1077. → pages 6 Nelson, W. L. (1983). Physical principles for economies of skilled movements. Biological Cybernetics, 46(2):135–147. → pages 69 Nelson, W. L., Perkell, J. S., and Westbury, J. R. (1984). Mandible movements during increasingly rapid articulations of single syllables: Preliminary observations. Journal of the Acoustical Society of America, 75(3):945–951. → pages 92 Ohala, J. J. (1997). Aerodynamics of phonology.
In Proceedings of the Seoul International Conference on Linguistics, pages 92–97. → pages 121 Öhman, S. (1966). Coarticulation in VCV utterances: Spectrographic measurements. Journal of the Acoustical Society of America, 39(1):151–168. → pages 6, 66 Öhman, S. (1967). Numerical model of coarticulation. Journal of the Acoustical Society of America, 41(2):310–320. → pages 6, 66 Ostry, D., Keller, E., and Parush, A. (1983). Similarities in the control of the speech articulators and the limbs: Kinematics of tongue dorsum movement in speech. Journal of Experimental Psychology: Human Perception and Performance, 9:622–636. → pages 15 Ostry, D. J., Cooke, J. D., and Munhall, K. G. (1987). Velocity curves of human arm and speech movements. Experimental Brain Research, 68:37–46. → pages 108 Ostry, D. J. and Munhall, K. G. (1985). Control of rate and duration of speech movements. Journal of the Acoustical Society of America, 77(2):640–648. → pages 108 Peck, C. C., Langenbach, G. E. J., and Hannam, A. G. (2000). Dynamic simulation of muscle and articulator properties during human wide jaw opening. Archives of Oral Biology, 45(11):963–982. → pages 58 Perkell, J. S. (1969). Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study. Research Monograph No. 53. The M.I.T. Press, Cambridge, Massachusetts. → pages 11, 65, 90 Perkell, J. S., Guenther, F. H., Lane, H., Matthies, M. L., Perrier, P., Vick, J., Wilhelms-Tricarico, R., and Zandipour, M. (2000). A theory of speech motor control and supporting data from speakers with normal hearing and with profound hearing loss. Journal of Phonetics, 28:233–272. → pages 5, 116 Perrier, P., Payan, Y., Zandipour, M., and Perkell, J. (2003). Influences of tongue biomechanics on speech movements during the production of velar stop consonants: A modeling study. Journal of the Acoustical Society of America, 114(3):1582–1599. → pages 47 Pierrehumbert, J. (2001).
Exemplar dynamics: Word frequency, lenition and contrast. In Bybee, J. and Hopper, P., editors, Frequency effects and emergent grammar, pages 137–157. John Benjamins. → pages 87, 114 Rosenbaum, D. A., van Heugten, C. M., and Caldwell, G. E. (1996). From cognition to biomechanics and back: the end-state comfort effect and the middle-is-faster effect. Acta Psychologica (Amsterdam), 94(1):59–85. → pages 67 Rosenbaum, D. A., Vaughan, J., Barnes, H. J., and Jorgensen, M. J. (1992). Time course of movement planning: selection of handgrips for object manipulation. Journal of Experimental Psychology: Learning, Memory and Cognition, 18(5):1058–1073. → pages ii, 3, 14, 41, 67, 111 Saito, H. and Itoh, I. (2007). The three-dimensional architecture of the human styloglossus especially its posterior muscle bundles. Annals of Anatomy, 189(3):261–267. → pages 50 Saltzman, E. (1979). Levels of sensorimotor representation. Journal of Mathematical Psychology, 20:91–163. → pages 116 Saltzman, E. and Byrd, D. (2000). Task-dynamics of gestural timing: Phase windows and multifrequency rhythms. Human Movement Science, 19:499–526. → pages 43, 117 Saltzman, E. and Kelso, J. A. S. (1987). Skilled Actions: A Task-Dynamic Approach. Psychological Review, 94(1):84–106. → pages 116, 117 Saltzman, E. and Munhall, K. G. (1989). A dynamical approach to gestural patterning in speech production. Ecological Psychology, 1:333–382. → pages ii, 3, 6, 66 Sambur, M. R. (2003). Selection of acoustic features for speaker identification. IEEE Transactions on Acoustics, Speech and Signal Processing, 23(2):176–182. → pages 107 Sato, M., Cavé, C., Ménard, L., and Brasseur, A. (2010). Auditory-tactile speech perception in congenitally blind and sighted adults. Neuropsychologia, page 4. doi:10.1016/j.neuropsychologia.2010.08.017. → pages 121 Schmidt, R. A. (1975). A schema theory of discrete motor skill learning. Psychological Review, 82:225–260. → pages 42 Schönle, P.
W., Gräbe, K., Wenig, P., Höhne, J., Schrader, J., and Conrad, B. (1987). Electromagnetic articulography: Use of alternating magnetic fields for tracking movements of multiple points inside and outside the vocal tract. Brain and Language, 31:26–35. → pages 6 Shiller, D. M., Ostry, D. J., and Gribble, P. L. (1999). Effects of Gravitational Load on Jaw Movements in Speech. Journal of Neuroscience, 19(20):9073–9080. → pages 47 Slaughter, K., Li, H., and Sokoloff, A. J. (2005). Neuromuscular organization of the superior longitudinalis muscle in the human tongue. I. Motor endplate morphology and muscle fiber architecture. Cells Tissues Organs, 181:51–64. → pages 58 Stavness, I., Hannam, A. G., Lloyd, J. E., and Fels, S. (2006). An Integrated dynamic jaw and laryngeal model constructed from CT data. In Proceedings of the International Symposium for Biomedical Simulation (ISBMS06), pages 169–177. Springer LNCS 4072. → pages 58 Stavness, I., Lloyd, J., Payan, Y., and Fels, S. (2010). Coupled hard-soft tissue simulation with contact and constraints applied to jaw-tongue-hyoid dynamics. International Journal for Numerical Methods in Biomedical Engineering, In Press. → pages 58 Stevens, K. (2000). Acoustic Phonetics. MIT Press, Cambridge, Massachusetts. → pages 118, 121 Summers, J. J. and Anson, J. G. (2009). Current status of the motor program: Revisited. Human Movement Science, 28:566–577. → pages 116 Sumby, W. H. and Pollack, I. (1954). Visual Contribution to Speech Intelligibility in Noise. Journal of the Acoustical Society of America, 26:212–215. → pages 121 Takemoto, H. (2001). Morphological analysis of the human tongue musculature for three-dimensional modeling. Journal of Speech, Language and Hearing Research, 44:95–107. → pages 58 Tillmann, H. G. and Pfitzinger, H. R. (2003). Local Speech Rate: Relationships between Articulation and Speech Acoustics. In 15th ICPhS Barcelona, pages 3177–3180. → pages 108 Tomasello, M. (2005).
Constructing a language: A usage-based theory of language acquisition. Harvard University Press. → pages 87, 114 Turvey, M. T., Fitch, H. L., and Tuller, B. (1982). The Bernstein perspective: I. The problems of degrees of freedom and context-conditioned variability. In Kelso, J. A. S., editor, Human Motor Behavior: an Introduction. Lawrence Erlbaum Associates. → pages 43, 116 Turvey, M. T., Shaw, R., and Mace, W. M. (1978). Issues in the theory of action: Degrees of freedom, coordinative structures and coalitions. In Requin, J., editor, Attention and Performance VII, pages 557–595, Hillsdale, NJ. Lawrence Erlbaum Associates, Inc. → pages 115 Umeda, N. (1976). Consonant duration in American English. Journal of the Acoustical Society of America, 61(3):846–858. → pages 10 Van Den Berg, J. W. (1958). Myoelastic-aerodynamic theory of voice production. Journal of Speech and Hearing Research, 1:227–244. → pages 65 Weiss, D. J., Wark, J. D., and Rosenbaum, D. A. (2007). Monkey See, monkey plan, monkey do: The end-state comfort effect in cotton-top tamarins (Saguinus oedipus). Psychological Science, 18(12):1063–1068. → pages 67 Westbury, J. R., Hashi, M., and Lindstrom, M. J. (1999). Differences among speakers in lingual articulation for American English /ô/. Speech Communication, 26:203–226. → pages 9 Whalen, D. H. (1990). Coarticulation is largely planned. Journal of Phonetics, 18:3–35. → pages ii, 6, 66 Wilhelms-Tricarico, R. (2000). Development of a tongue and mouth floor model for normalization and biomechanical modelling. In Proceedings of the Fifth Speech Production Seminar and CREST Workshop on Models of Speech Production, pages 141–148, Kloster Seeon, Bavaria. → pages 58 Wood, S. A. J. (1996). Assimilation or coarticulation? Evidence from the temporal co-ordination of tongue gestures for the palatalization of Bulgarian alveolar stops. Journal of Phonetics, 24(1):139–164. → pages 9 Yee, T. W. (2008). The VGAM package. R News, 8(2):28–39. → pages 96 Yee, T. W.
and Wild, C. J. (1996). Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58:481–493. → pages 96 Zhou, X. H., Espy-Wilson, C. Y., Boyce, S., and Tiede, M. (2007). An articulatory and acoustic study of “retroflex” and “bunched” American English rhotic sound based on MRI. In Interspeech 2007. → pages 122 Zue, V. W. and Laferriere, M. (1979). Acoustic study of medial /t, d/ in American English. Journal of the Acoustical Society of America, 66(4):1039–1050. → pages 10, 22, 93, 122

Appendix A

Appendices for Chapter 2

A.1  ‘T’ variants by ‘R’ variants by participant

A.1.1  Distribution of ‘T’ variants by initial ‘R’ variant in the phrase ‘We have Berta beep’ by participant

Here we present individual breakdowns of ‘T’ variants by ‘R’ variants in the immediate context. Half the participants consistently produced a single initial ‘R’ variant, either [ô] or [õ], for the phrase ‘We have Berta beep’. The other half produced different ‘R’ variants in different repetitions of the same phrase, as seen in Figure A.1. In general, and as demonstrated by the Wilcoxon signed-rank test in the main paper, the initial rhotic did not covary with the ‘T’ variant. However, participant 23 did show more [R ] than [R ] when the preceding ‘R’ was a [ô].
[Figure A.1 panels: one per participant (subjects 2, 3, 4, 5, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 21, 23 and 26); categories: ɾ↕, ɾ↘, ɾ↖, ɾ↔; legend: tip-up rhotic, tip-down rhotic.]

Figure A.1: Distribution of ‘T’ variants in the phrase ‘We have Berta beep’ by participant, based on initial ‘R’ variant.

A.1.2  Distribution of ‘T’ variants by final ‘R’ variant in the phrase ‘We have otter books’ by participant

Most participants produced the phrase ‘We have otter books’ with a final [õ]. However, participants 15, 17, 18, 21, 23 and 26 sometimes produced the phrase with a final [ô]. These participants usually produced [R ] instead of [R ] in the context of final [ô], as seen in Figure A.2.

[Figure A.2 panels: one per participant (subjects 2, 3, 4, 5, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 21, 23 and 26); categories: ɾ↕, ɾ↘, ɾ↖, ɾ↔; legend: tip-up rhotic, tip-down rhotic.]

Figure A.2: Distribution of ‘T’ variants by final ‘R’ variant in the phrase ‘We have otter books’ by participant.

A.1.3  ‘T’ variants by initial and final ‘R’ variant in the phrase ‘We have him murder a mob’, by participant

Not surprisingly, the word ‘murder’ showed the most within-subject variability across repetitions, as seen in Figures A.3 and A.4. There were four possible combinations of ‘R’ variants for this phrase. Only five participants (3, 8, 10, 14 and 23) produced all the repetitions with just one of these ‘R’ variant combinations. Nine participants (2, 9, 12, 13, 16, 17, 18, 21 and 26) produced two of the possible ‘R’ variant combinations. Three others (4, 5 and 15) used three of the combinations, and one participant (8) used all four possible combinations.
Nine participants consistently used the same ‘T’ variant for the same ‘R’ variant context. Seven other participants (9, 13, 15, 17, 18, 23 and 26) used different ‘T’ variants for different repetitions with the same ‘R’ variant context.

[Figure A.3 panels: one per participant; categories: ɾ↕, ɾ↘, ɾ↖, ɾ↔; legend: DD, DU, UD, UU.]

Figure A.3: Distribution of ‘T’ variants by initial and final ‘R’ variants in the phrase ‘We have him murder a mob’, by participant (2-12).
Otolaryngology Head and Neck 128(6):805–814. ment during vowelsSurgery, production with computer-assisted b22 phy. Otolaryngology Head and Neck Surgery, 128(6):805–8 22  References  subject: 2  subject: 4  ɾ↘ 0  0  subject: 5 ɾ↕ 0 0  subject: 16 0 ɾ0↖ ɾ↖  ɾ↖ 0  ɾ↘ 0  ɾ↖  ɾ↔  ɾ0↕ 1  ↘ 0ɾ 0  0  ɾ↔ 2 ɾ0↔  0  ɾ↔  subject: 6 ɾ↕ 0 0  ɾ↖ 0  ɾ↘ 0  ɾ↖  ɾ↔  ɾ0↕ 0  ↘ 0ɾ 0  0  subject: 17  0 0  0 ɾ0↖ ɾ↖ 0 ɾ0↔  ɾ↔  9  ɾ↔  subject: 8 ɾ↕ 1 0  ɾ↖  0  ɾ↘ 2  ɾ↖  ɾ↔  ɾ1↕ 0  ↘ 0ɾ 0  2  1ɾ↔ ɾ↖5 ↖ ɾ↔ ɾsubject: 8 subject: 18 ɾ↘ ɾ↕ ɾ0↘ ɾ↕0 0 0  subject: 18  0 0  0 ɾ0↖ ɾ↖ 2 ɾ0↔  ɾ↔  5 ɾ↔  subject: 9 ɾ↕ 0 0  subject: 21  0 ɾ0↖ ɾ↖  ɾ↖  0  ɾ↘ 0  ɾ↖  ɾ↔  ɾ0↕ 0  ↘ 0ɾ 0  0  ɾ↔  7 ɾ0↔  0 ɾ↔  subject: 10 ɾ↕ 0 0  ɾ↖  0  ɾ↘ 0  ɾ↖  ɾ↔  ɾ0↕ 8  ↘ 0ɾ 3  0  subject: 23  0 0  0 0  0 ɾ0↖ ɾ↖ 12 ɾ0↔  ↔  ↔  ɾ↔  0 ɾ↔  subject: 12 ɾ↕ 0 0  ɾ↖  0  ɾ↘ 0  subject: 12 ↘ subject: ɾ26 ɾ↕  ɾ↖  ɾ↔  ɾ0↕ 0  ↘ 0ɾ 0  0  0 ɾ0 ɾ↖ ɾ↔ ɾ↖ subject: 12 ↘ subject: ɾ26 ɾ↕ ɾ0↘ ɾ0↕ 0 0  4ɾ ɾ↖1 ↔ ɾ12 ɾ↖ subject: subject: 26 ɾ↘ ɾ↕ ɾ0↘ ɾ↕0 0 0  subject: 26  4 0  0 0  0 ɾ0↖ ɾ↖ 7 ɾ0↔  ↔  ↔  ɾ↔  7 ɾ↔  143  use5,different flap/tap variants the experiment. Though the particwhether the samethroughout subjects uttering thetakes same inhalf theofunderlying same or very sim ipants (subject 2,will 3, 4, 6, 14, 16, 17, 26) changed strategies very rarely, they all did The last observation usphrases back to the purpo (subject 2,will 3, 4, 6, 14, 16, 17, changed strategies very rarely, they all didinhalf use flap/tap variants the experiment. Though even in the singleipants flap phrases described in5,different this paper. The26) other halfthroughout changed strategies whether the same subjects uttering the same phrases theo even infor thethe single flap ‘We phrases this paper. 
The26) other halfthroughout changed strategies ipants (subject 2,will 3, books’, 4, 5,different 6,and 14, 16, 17, changed strategies very rarely, more often, particularly phrases havedescribed otter ‘We have him murder usein flap/tap variants the experime more variability often, particularly phrases havedescribed otter books’, ‘We him murder infor thethe single flap phrases paper. The26) other half chang a mob’. The allowable ofeven rhotic vowels andipants the ‘We importance comfort (subject 2,of3,end-state 4, in 5, this 6,and 14, 16, have 17, changed strat a mob’. The allowable ofeven rhotic vowels andflap the importance of end-state comfort more often, the phrases ‘We havedescribed otter books’, and ‘We have over beginning-state comfort appeared to variability reduceparticularly confounding constraints on flap variability, infor the single phrases in this paper. The beginning-state comfort to reduce confounding constraints on ‘We flap variability, a mob’. Theappeared allowable variability of or rhotic vowels and the importance of end-s making it possibleover to see that participants do not have any fixedparticularly task motor program, more often, for the phrases have otter book making it possible to see doand not have to any fixed confounding task or motor program, beginning-state comfort appeared reduce constraints on fla not down to the subphonemic level,over and notthat evenparticipants when phraseThe prosodic concerns were a mob’. allowable variability of rhotic vowels and the im not down to the subphonemic level,theories and notof even when phrase prosodic concerns wereor mot making it possible to see that participants doand notwhich have to any fixed confounding task accounted. The results demonstrate a need to revise speech motor control over beginning-state comfort appeared reduce accounted. The demonstrate a need tothe revise theories speech motor control notsubphonemic down to the level. 
subphonemic level, and not even when phrase prosodic can account for planning down to results the Like evidence forof look-ahead making it possible to see that participants doand notwhich have anyc can for for planning down the Like evidence foroflook-ahead accounted. The results demonstrate a need to the revise theories speech motor planning uncovered in account the support hypothesis 4 to above, we are to already analyzing larger notsubphonemic down the level. subphonemic level, and not even when phrac the support for hypothesis 4to above, wedemonstrate are alreadylevel. analyzing the larger account for planning down the subphonemic Like evidence fo dataset in order toplanning develop uncovered a theorycan ofinmotor control that accounts for these observations. accounted. The results a need to the revise theories dataset in order toplanning developuncovered a theorycan ofinmotor control accounts forthe these the support forthat hypothesis above, weobservations. are alreadylevel. analyzi DD account for planning down4to subphonemic Li DD dataset in order toplanning developuncovered a theory ofinmotor controlforthat accounts4 above, for these DU the support hypothesis weoa DU DD UD dataset in order to develop a theory of motor control that ac UD DU UU DD UU UD DU UU UD References UU ɾ↕ 0  ɾ↖  0  ɾ0↕ 0  0ɾ↔ ɾ↖5 2 ↖ ɾ↔ ɾsubject: ɾ↘ ɾ↕ subject: 13 ɾ0↘ ɾ↕0 0 0  0  subject: 2 ↘ ɾ13 ɾ↕ subject:  subject: 3 ↘ ɾ14 ɾ↕ subject:  subject: 4 ↘ ɾ15 ɾ↕ subject:  subject: 5 ↘ ɾ16 ɾ↕ subject:  subject: 6 ↘ ɾ17 ɾ↕ subject:  subject: 8 ↘ ɾ18 ɾ↕ subject:  subject: 9 ↘ ɾ21 ɾ↕ subject:  subject: 10 ↘ ɾ23 ɾ↕ subject:  10 0 0 0 ɾ↔ ɾ↖ subject: ↖ ɾ2↔ ɾ ↘ ɾ13 ɾ↕ subject: ɾ0↘ ɾ0↕ 0 1  0 12 0 0 ɾ↔ ɾ↖ subject: ↖ ɾ3↔ ɾ ↘ ɾ14 ɾ↕ subject: ɾ0↘ ɾ0↕ 0 0  8 0 0 2 ɾ↔ ɾ↖ subject: ↖ ɾ4↔ ɾ ↘ ɾ15 ɾ↕ subject: ɾ0↘ ɾ0↕ 0 0  9 0 0 0 ɾ↔ ɾ↖ subject: ↖ ɾ5↔ ɾ ↘ ɾ16 ɾ↕ subject: ɾ1↘ ɾ0↕ 0 0 0 11 0 0 ɾ↔ ɾ↖ subject: ↖ ɾ6↔ ɾ ↘ ɾ17 ɾ↕ subject: ɾ0↘ ɾ0↕ 0 0 0 2 0 0 ɾ↔ ɾ↖ subject: ↖ ɾ8↔ ɾ ↘ ɾ18 ɾ↕ subject: ɾ2↘ ɾ0↕ 0 1 0 0 
0 2 ɾ↔ ɾ↖ subject: ↖ ɾ9↔ ɾ ↘ ɾ21 ɾ↕ subject: ɾ4↘ ɾ0↕ 0 0 0 ɾ0 ɾ↖ subject: 10 ɾ↔ ɾ↖ ↘ ɾ23 ɾ↕ subject: ɾ0↘ ɾ0↕ 0 0  0 0 ↖ 0ɾ↔ ɾ11 5 ↖ ɾ↔ ɾsubject: ɾ↘ ɾ↕ subject: 16 ɾ0↘ ɾ↕0 0 0 1ɾ↔ ɾ↖1 6 ↖ ɾ↔ ɾsubject: ɾ↘ ɾ↕ subject: 17 ɾ0↘ ɾ↕0 0 0  0 0 ↖ 0ɾ↔ ɾ10 9 ↖ ɾ↔ ɾsubject: ɾ↘ ɾ↕ subject: 21 ɾ0↘ ɾ↕0 0 0 0ɾ↔ ɾ↖0 ↔ subject: ɾ10 ɾ↖ ɾ↘ ɾ↕ subject: 23 ɾ0↘ ɾ↕0 0 0  0 ɾ0↖ ɾ↖  subject: 13  ɾ↖  References  subject: 2  ɾ↔  0  0ɾ↘ 0  0  ɾ↔  1 ɾ0↔  ɾ↕ 0  ɾ↘ 0  0  5  ɾ↔  ɾ↕ 0  0  0ɾ↔ ɾ↖0 3 ↖ ɾ↔ ɾsubject: ɾ↘ ɾ↕ subject: 14 ɾ0↘ ɾ↕0 0 0  subject: 14  0  0  0 ɾ0↖ ɾ↖  0 ɾ0↔  ɾ↔  ɾ↕ 0  ɾ↘ 1  0  9  ɾ↖  ɾ↘ 0  ɾ↖  ɾ↔  ɾ0↕ 0  0ɾ↘ 0  0  ɾ↔  ɾ↕ 1  0  ɾ↖  ɾ↔  ɾ0↕ 2  0ɾ↘ 0  0ɾ↔ ɾ↖0 4 ↖ ɾ↔ ɾsubject: ɾ↘ ɾ↕ subject: 15 ɾ0↘ ɾ↕0 0 2  subject: 15  0  0  0 ɾ0↖ ɾ↖  2 ɾ0↔  ɾ↔  ɾ↕ 0  ɾ↘ 4  0  1  ɾ↖  0  ɾ↔  22  subject: 3  22  Abbs, J. H., Nadler, R. D., and Fujimura, O. (1988). X-ray microbeams track the shape of References J. H., Nadler, R. D., and Fujimura, O. (1988). X-ray microbeams track the shape of speech. SOMA, Abbs, 2:29–34. References speech. SOMA, Abbs, 2:29–34. J. H., Nadler, R. D., and Fujimura, O. (1988). X-ray microbeams track Allott, R. (2003). Outline of a motor theory of natural language. Cognitive systems, 6(1):93– speech. SOMA, 2:29–34. J.ofH., Nadler, R. D., and Fujimura, O. (1988). X-ray Allott, R. (2003). Outline of a motorAbbs, theory natural language. Cognitive systems, 6(1):93– 101. speech. SOMA, 2:29–34. 101. Allott, R. (2003). Outline of a motor theory of natural language. Cognitive syste Bates, D. and Sarkar, D. (2008). lme4: Linear mixed-effects models using S4 classes. URL 101. Allott, R. (2003). Outline models of a motor theory of natural langua Bates, D. org/package= and Sarkar, D.lme4, (2008). lme4: Linear using S4 classes. URL http://CRAN. R-project. R package versionmixed-effects 0.999375-28. 101. http://CRAN. R-project. R package versionmixed-effects 0.999375-28.models using S4 c Bates, D. org/package= and Sarkar, D.lme4, (2008). 
lme4: Linear Chang, Y.-C., Lee, F.-P., Peng, C.-L., and Lin, C.-T. (2003). Measurement of tongue movehttp://CRAN. R-project. org/package= R package versionmixed-effects 0.999375-28 Bates, D. Sarkar, D.lme4, (2008). lme4: of Linear Chang, Y.-C., Lee, Peng, C.-L., and Lin,and C.-T. (2003). Measurement tongue movement during vowels production withF.-P., computer-assisted b-mode and m-mode ultrasonograhttp://CRAN. R-project. org/package= lme4, R package ve ment during production withF.-P., computer-assisted b-mode Chang, Y.-C., Lee, Peng, C.-L., and Lin, and C.-T.m-mode (2003).ultrasonograMeasurement of t phy. Otolaryngology Head and vowels Neck Surgery, 128(6):805–814. phy. Otolaryngology Head and vowels Neck Surgery, 128(6):805–814. ment during production withF.-P., computer-assisted b-mode Chang, Y.-C., Lee, Peng, C.-L., and Lin, and C.-T.m-mode (2003).u phy. Otolaryngology Head and vowels Neck Surgery, 128(6):805–814. ment during production with computer-assisted b-m 22 phy. Otolaryngology Head and Neck Surgery, 128(6):805–81 22 ɾ↘ 0  subject: 4  ɾ↘ 0  subject: 5 ɾ↕ 0 0  subject: 16  0 ɾ0↖ ɾ↖ ɾ↔  ɾ↕ 0 ɾ↘ 0  0  0  ɾ↖  ɾ↘ 0  ɾ↖  ɾ↔  ɾ0↕ 1  0ɾ↘ 0  0  2 ɾ0↔  ɾ↔  subject: 6 ɾ↕ 0 0  subject: 17  0 0  0 ɾ0↖ ɾ↖ 0 ɾ0↔  ɾ↔  ɾ↕ 0  ɾ↘ 0  0 9  ɾ↖  ɾ↘ 0  ɾ↖  ɾ↔  ɾ0↕ 0  0ɾ↘ 0  0  ɾ↔  subject: 8 ɾ↕ 1 0  1ɾ↔ ɾ↖5 8 ↖ ɾ↔ ɾsubject: ɾ↘ ɾ↕ subject: 18 ɾ0↘ ɾ↕0 0 0  subject: 18  0 0  0 ɾ0↖ ɾ↖ 2 ɾ0↔  ɾ↔  ɾ↕ 0  ɾ↘ 0  0 5  ɾ↖  ɾ↘ 2  ɾ↖  ɾ↔  ɾ1↕ 0  0ɾ↘ 0  2  ɾ↔  subject: 9 ɾ↕ 0 0  subject: 21  0 ɾ0↖ ɾ↖ ɾ↔  ɾ↕ 0  ɾ↘ 0  0 0  ɾ↖  ɾ↘ 0  ɾ↖  ɾ↔  ɾ0↕ 0  0ɾ↘ 0  0  7 ɾ0↔  ɾ↔  subject: 10 ɾ↕ 0 0  subject: 23  ɾ↖  ɾ↔  ɾ0↕ 8  0ɾ↘ 3  0 0  0 0  0 ɾ0↖ ɾ↖ 12 ɾ0↔  ɾ↕ 0  ɾ↘ 0  0 0  ɾ↖  ɾ↘ 0 0  ↔  ɾ↔  ɾ↔  subject: 12 ɾ↕ 0 0  subject: 12 ↘ ɾ26 ɾ↕ subject:  0 ɾ0 ɾ↖ subject: 12 ɾ↔ ɾ↖ ↘ ɾ26 ɾ↕ subject: ɾ0↘ ɾ0↕ 0 0  4ɾ↔ ɾ↖1 ↔ subject: ɾ12 ɾ↖ ɾ↘ ɾ↕ subject: 26 ɾ0↘ ɾ↕0 0 0  subject: 26  ɾ↖  ɾ↔  ɾ0↕ 0  0ɾ↘ 0  4 0  0 0  0 ɾ0↖ ɾ↖ 7 ɾ0↔  ɾ↔  ɾ↕ 0  ɾ↘ 0  0 7  ɾ↖  ɾ↘ 0 0  ↔  ɾ↔  Figure A.4: Distribution of ‘T’ variants by 
initial and final ‘R’ variants in the phrase ‘We have him murder a mob’, by participant (13-26). DD = initial and final [ô], UD = initial [õ], final [ô], DU = initial [ô], final [õ], UU = initial and " " " " " final [õ]. "  A.2  ‘R’ variants - ‘T’ vs. control phrases  Comparing the ‘R’ variants in phrase containing ‘T’ vs. similar phrases without ‘T’ lets us identify the interaction of the ‘T’ variant with the ‘R’ variants in the immediate context. This information helps with testing hypothesis 3 regarding end-state comfort. There is very little difference in the initial ‘R’ variant in the phrase ‘we have Berta beep’ compared to the control phrase ‘We have Burma books’, as seen in Figure A.5.  144  subject: 2  Berta : ɹ̩ 12  Berta : ɻ̩ 0  subject: 3  Berta : ɹ̩ 12  Berta : ɻ̩ 0  subject: 4  Berta : ɹ̩ 11  Berta : ɻ̩ 1  subject: 5  Berta : ɹ̩ 8  Berta : ɻ̩ 4  subject: 6  Berta : ɹ̩ 11  Berta : ɻ̩ 1  subject: 8  Berta : ɹ̩ 11  Berta : ɻ̩ 1  0 0 0 0 0 11 11 11 11 1 12 12 Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩  subject: 9  Berta : ɹ̩ 0  Berta : ɻ̩ 12  subject: 10  Berta : ɹ̩ 0  Berta : ɻ̩ 12  subject: 12  Berta : ɹ̩ 3  Berta : ɻ̩ 4  subject: 13  Berta : ɹ̩ 8  Berta : ɻ̩ 4  subject: 14  Berta : ɹ̩ 0  Berta : ɻ̩ 9  subject: 15  Berta : ɹ̩ 0  Berta : ɻ̩ 10  9 0 0 10 3 0 12 12 1 1 11 12 Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩  subject: 16  Berta : ɹ̩ 12  Berta : ɻ̩ 0  subject: 17  Berta : ɹ̩ 0  Berta : ɻ̩ 12  subject: 18  Berta : ɹ̩ 10  Berta : ɻ̩ 2  subject: 21  Berta : ɹ̩ 12  Berta : ɻ̩ 0  subject: 23  Berta : ɹ̩ 11  Berta : ɻ̩ 0  subject: 26  Berta : ɹ̩ 2  Berta : ɻ̩ 10  9 0 10 0 8 0 0 0 12 2 12 12 Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩ Burma : ɹ̩ Burma : ɻ̩  (a) By participant Berta : ɹ̩ 123  Berta : ɻ̩ 82  126 
Burma : ɹ̩  79 Burma : ɻ̩  (b) Summary  Figure A.5: Distribution of final ‘R’ variant sin the single ‘T’ phrase ‘We have Berta beep’ vs. the control phrase ‘We have Burma books’. Generalized linear mixed model tests did not show significance (AIC = 220, z = -1.552, p = 0.121). Wilcoxon Signed-Rank tests were performed on the data summarized in Figure A.5. For each of the two (initial) ‘R’ variants, and by participant, the percentage of productions matching that variant based on whether the word in question is ‘Berta’ or ‘Burma’ is compared. The results were not significant, as seen in Table A.1. 145  contexts ‘Berta’ vs. ‘Burma’  [õ] " V 14  p p = 0.184  [ô] " V 40  p p = 0.220  Table A.1: Wilcoxon signed-rank tests comparing prevalence of ‘R’ variants in ‘Berta’ vs. ‘Burma’. * = significant (α = 0.05).  As shown in the Figure A.6, final ‘R’ variant is more likely to be a [ô] in the " control phrase ‘We have him offer books’ than in the ‘T’ phrase ‘We have otter books’.  146  subject: 2  subject: 3  subject: 4  subject: 5  subject: 6  subject: 8  otter : ɹ̩ 0  otter : ɻ̩ 11  otter : ɹ̩ 0  otter : ɻ̩ 12  otter : ɹ̩ 0  otter : ɻ̩ 12  otter : ɹ̩ 0  otter : ɻ̩ 12  otter : ɹ̩ 0  otter : ɻ̩ 11  otter : ɹ̩ 0  otter : ɻ̩ 12  10 offer : ɹ̩  0 offer : ɻ̩  12 offer : ɹ̩  0 offer : ɻ̩  9 offer : ɹ̩  0 offer : ɻ̩  9 offer : ɹ̩  1 offer : ɻ̩  12 offer : ɹ̩  0 offer : ɻ̩  12 offer : ɹ̩  0 offer : ɻ̩  subject: 9  subject: 10  subject: 12  subject: 13  subject: 14  subject: 15  otter : ɹ̩ 0  otter : ɻ̩ 12  otter : ɹ̩ 0  otter : ɻ̩ 12  otter : ɹ̩ 0  otter : ɻ̩ 12  otter : ɹ̩ 0  otter : ɻ̩ 12  otter : ɹ̩ 0  otter : ɻ̩ 12  otter : ɹ̩ 5  otter : ɻ̩ 6  0 offer : ɹ̩  12 offer : ɻ̩  0 offer : ɹ̩  12 offer : ɻ̩  12 offer : ɹ̩  0 offer : ɻ̩  2 offer : ɹ̩  9 offer : ɻ̩  0 offer : ɹ̩  12 offer : ɻ̩  1 offer : ɹ̩  11 offer : ɻ̩  subject: 16  subject: 17  subject: 18  subject: 21  subject: 23  subject: 26  otter : ɹ̩ 0  otter : ɻ̩ 12  otter : ɹ̩ 1  otter : ɻ̩ 11  otter : ɹ̩ 1  otter : ɻ̩ 11  
otter : ɹ̩ 5  otter : ɻ̩ 6  otter : ɹ̩ 11  otter : ɻ̩ 1  otter : ɹ̩ 1  otter : ɻ̩ 11  11 offer : ɹ̩  1 offer : ɻ̩  0 offer : ɹ̩  12 offer : ɻ̩  11 offer : ɹ̩  1 offer : ɻ̩  12 offer : ɹ̩  0 offer : ɻ̩  12 offer : ɹ̩  0 offer : ɻ̩  2 offer : ɹ̩  9 offer : ɻ̩  (a) By participant otter : ɹ̩ 24  otter : ɻ̩ 188  127 offer : ɹ̩  80 offer : ɻ̩  (b) Summary  Figure A.6: Distribution of final ‘R’ variants in the single ‘T’ phrase ‘We have otter books’ vs. the control phrase ‘we have him offer books’. Wilcoxon Signed-Rank tests were preformed on the data summarized in Figure A.6. For each of the two (final) ‘R’ variants, and by participant, the percentage of productions matching that variant based on whether the word in question is ‘otter’ or ‘offer’ is compared. The results were significant, as seen in Table A.2.  147  contexts ‘otter’ vs. ‘offer’  [õ] " V 113  p *p = 0.003  [ô] " V 6  p *p = 0.002  Table A.2: Wilcoxon signed-rank tests comparing prevalence of ‘R’ variants in ‘otter’ vs. ‘offer’. * = significant (α = 0.05).  As shown in the Figure A.7, initial ‘R’ variant is more likely to be [ô] in the " control phrase ‘We have him murmur a vow’ than in the ‘T’ phrase ‘We have him murder a mob’.  
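The by-participant comparison behind Table A.2 can be sketched as follows. This is a minimal illustration, not the original analysis script: the counts are the per-participant values read from Figure A.6, the function names are hypothetical, and the statistic is assumed to be the sum of the ranks of the positive paired differences (‘otter’ minus ‘offer’), with zero differences dropped and tied differences sharing an average rank. No p-value is computed.

```python
from fractions import Fraction

# (tip-down, tip-up) counts of the final 'R' per participant, read from Figure A.6.
OTTER = {2: (0, 11), 3: (0, 12), 4: (0, 12), 5: (0, 12), 6: (0, 11), 8: (0, 12),
         9: (0, 12), 10: (0, 12), 12: (0, 12), 13: (0, 12), 14: (0, 12), 15: (5, 6),
         16: (0, 12), 17: (1, 11), 18: (1, 11), 21: (5, 6), 23: (11, 1), 26: (1, 11)}
OFFER = {2: (10, 0), 3: (12, 0), 4: (9, 0), 5: (9, 1), 6: (12, 0), 8: (12, 0),
         9: (0, 12), 10: (0, 12), 12: (12, 0), 13: (2, 9), 14: (0, 12), 15: (1, 11),
         16: (11, 1), 17: (0, 12), 18: (11, 1), 21: (12, 0), 23: (12, 0), 26: (2, 9)}


def tip_up_share(counts):
    """Proportion of tip-up productions, kept exact to avoid spurious ties."""
    down, up = counts
    return Fraction(up, down + up)


def signed_rank_v(xs, ys):
    """Sum of the ranks of the positive paired differences xs - ys.

    Zero differences are dropped; tied absolute differences share
    their average rank (an assumption about tie handling).
    """
    diffs = sorted((x - y for x, y in zip(xs, ys) if x != y), key=abs)
    v = Fraction(0)
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        # 0-based positions i..j-1 hold ranks i+1..j; their mean is (i + 1 + j) / 2.
        avg_rank = Fraction(i + 1 + j, 2)
        v += avg_rank * sum(1 for d in diffs[i:j] if d > 0)
        i = j
    return v


subjects = sorted(OTTER)
otter_share = [tip_up_share(OTTER[s]) for s in subjects]
offer_share = [tip_up_share(OFFER[s]) for s in subjects]
v_tip_up = signed_rank_v(otter_share, offer_share)
print(float(v_tip_up))
```

Under these assumptions the sketch gives V = 113.5 for the tip-up variant, close to the V = 113 reported in Table A.2. The half-point gap could come from how tied differences were resolved in the original analysis, which the text does not specify.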
(a) By participant

subject  d [ɹ̩]  d [ɻ̩]  m [ɹ̩]  m [ɻ̩]
2        10     1      9      0
3        12     0      11     0
4        9      2      10     0
5        9      3      11     0
6        11     0      10     0
8        8      4      10     0
9        0      11     0      10
10       0      12     0      12
12       4      7      11     0
13       6      5      10     2
14       0      10     1      11
15       4      7      1      9
16       12     0      11     0
17       2      9      1      9
18       7      5      11     0
21       12     0      10     0
23       11     0      10     0
26       5      7      0      12

(b) Summary

         murder [ɹ̩]  murder [ɻ̩]  murmur [ɹ̩]  murmur [ɻ̩]
total    122         83          127         65

Figure A.7: Distribution of initial ‘R’ variants in the single ‘T’ phrase ‘We have him murder a mob’ vs. the control phrase ‘We have him murmur a vow’. d = ‘murder’, m = ‘murmur’.

Wilcoxon signed-rank tests were performed on the data summarized in Figure A.7. For each of the two initial ‘R’ variants, the percentage of productions matching that variant was compared between the words ‘murder’ and ‘murmur’. The results were not significant, as seen in Table A.3.

contexts                 [ɻ̩] V   [ɻ̩] p        [ɹ̩] V   [ɹ̩] p
‘murder’ vs. ‘murmur’    16      p = 0.142    49      p = 0.168

Table A.3: Wilcoxon signed-rank tests comparing prevalence of initial ‘R’ variants in ‘murder’ vs. ‘murmur’. * = significant (α = 0.05).

As shown in Figure A.8, the final ‘R’ variant is more likely to be [ɹ̩] in the control phrase ‘We have him murmur a vow’ than in the ‘T’ phrase ‘We have him murder a mob’.
(a) By participant

subject  d [ɹ̩]  d [ɻ̩]  m [ɹ̩]  m [ɻ̩]
2        0      11     9      0
3        0      12     9      2
4        1      10     10     0
5        1      11     7      4
6        0      11     10     0
8        7      5      9      1
9        4      7      0      10
10       0      12     0      12
12       0      11     11     0
13       0      11     0      12
14       0      10     0      12
15       6      5      0      10
16       1      11     10     1
17       0      11     0      10
18       0      12     10     1
21       2      10     10     0
23       11     0      10     0
26       0      12     0      12

(b) Summary

         murder [ɹ̩]  murder [ɻ̩]  murmur [ɹ̩]  murmur [ɻ̩]
total    33          172         105         87

Figure A.8: Distribution of final ‘R’ variants in the single ‘T’ phrase ‘We have him murder a mob’ vs. the control phrase ‘We have him murmur a vow’. d = ‘murder’, m = ‘murmur’.

Wilcoxon signed-rank tests were performed on the data summarized in Figure A.8. For each of the two final ‘R’ variants, the percentage of productions matching that variant was compared between the words ‘murder’ and ‘murmur’. The results were significant, as seen in Table A.4.

contexts                 [ɻ̩] V   [ɻ̩] p         [ɹ̩] V   [ɹ̩] p
‘murder’ vs. ‘murmur’    5       *p = 0.008    73      *p = 0.008

Table A.4: Wilcoxon signed-rank tests comparing prevalence of final ‘R’ variants in ‘murder’ vs. ‘murmur’. * = significant (α = 0.05).

There is also evidence for a preference for [ɻ̩] in the word ‘murder’, as shown by the increased tendency for word-final [ɻ̩] compared to the control word ‘murmur’, as seen in Figure A.8.
Similarly, there is evidence for the same preference for final [ɻ̩] in the word ‘otter’ compared to the control word ‘offer’, as seen in Figure A.6. These results show that ‘T’ variants are more tightly correlated with the final ‘R’ variant than with the initial ‘R’ variant.

A.3  Full dataset

This appendix documents the entire dataset collected in the studies used throughout this dissertation. The full stimulus list is presented in Tables A.5 and A.6. Nasal phrase stimuli were not used in this dissertation, but were collected for future analysis.

token  word         carrier phrase                ‘T’s  syllables  morphemes  vocalic context  stimuli type
1      acerbity     We have acerbity books        1     4          2          VV               ‘T’
2      autumn       We have autumn books          1     2          1          VV               ‘T’
3      fauna        We have fauna books           0     2          1          VV               ‘N’
4      Emma         We have Emma beep             0     2          1          VV               ‘C’
5      edit the     We have him edit the books    1     3          2          VV               ‘T’
6      edify        We have him edify a book      1     3          2          VV               ‘T’
7      vomit the    We have him vomit the book    0     3          2          VV               ‘C’
8      vomit a      We have him vomit a book      0     3          2          VV               ‘C’
9      audit the    We have him audit the books   1     3          2          VV               ‘T’
10     audify       We have him audify a book     1     3          2          VV               ‘T’
11     pawnin’ the  We have him pawnin’ the book  0     3          3          VV               ‘N’
12     Berta        We have Berta beep            1     2          1          RV               ‘T’
13     Burma        We have Burma books           0     2          1          RV               ‘C’
14     Myrna        We have Myrna beep            0     2          1          RV               ‘N’
15     offer        We have him offer books       0     2          1          VR               ‘C’
16     otter        We have otter books           1     2          1          VR               ‘T’
17     honour       We have honour books          0     2          1          VR               ‘N’
18     murder       We have him murder a mob      1     2          1          RR               ‘T’
19     burner       We have burner bibs           0     2          2          RR               ‘N’

Table A.5: Complete table of all stimuli used in this dissertation, part 1. ‘C’ = control phrase, ‘T’ = flap phrase, ‘N’ = nasal phrase.
token  word          carrier phrase                ‘T’s  syllables  morphemes  vocalic context  stimuli type
20     murmur        We have him murmur a vow      0     2          1          RR               ‘C’
21     edit a        We have him edit a book       2     3          2          VVV              ‘T’
22     audit a       We have him audit a book      2     3          2          VVV              ‘T’
23     pawnin’ a     We have him pawnin’ a book    0     3          3          VVV              ‘N’
24     editor        We have editor books          2     3          2          VVR              ‘T’
25     mammifer      We have mammifer books        0     3          2          VVR              ‘C’
26     auditor       We have auditor books         2     3          2          VVR              ‘T’
27     Saturday      We have Saturday off          2     3          1          VRV              ‘T’
28     peppermint    We have peppermint now        0     3          3          VRV              ‘C’
29     herded a      We have herded a mob          2     3          3          RVV              ‘T’
30     absurdity     We have absurdity fests       2     4          2          RVV              ‘T’
31     acerbified    We have acerbified food       0     4          2          RVV              ‘C’
32     murdered a    We have murdered a mob        2     3          3          RRV              ‘T’
33     herded her    We have herded her mob        2     3          3          RVR              ‘T’
34     herbifer      We have my herbifer book      0     3          2          RVR              ‘C’
35     murdered her  We have murdered her mob      2     3          3          RRR              ‘T’
36     murmur her    We have him murmur her a vow  0     3          2          RRR              ‘C’
37     edited a      We have edited a book         3     4          3          VVVV             ‘T’
38     audited a     We have audited a book        3     4          3          VVVV             ‘T’

Table A.6: Complete table of all stimuli used in this dissertation, part 2. ‘C’ = control phrase, ‘T’ = flap phrase, ‘N’ = nasal phrase.

Appendix B

Appendices for chapter 3

B.1  Variability

We would also expect speakers to be more consistent in their ‘T’ sequence variants in ‘Saturday’ than in ‘herded her’. As already noted, we expect a gravitational and myoelastic advantage for [ɾ ], [ɾ ] sequences in ‘Saturday’. On the other hand, there are many possible forces at work on the expected flap sequences for ‘herded her’. First, with ‘herded her’, there are four expected sequences: 1) [ɾ ], [ɾ ], 2) [ɾ ], [ɾ ], 3) [ɾ ], [ɾ ], or 4) [ɾ ], [ɾ ], as opposed to two with ‘Saturday’: 1) [ɾ ], [ɾ ] and 2) [ɾ ], [ɾ ].
Secondly, speakers have demonstrated a tendency to select ‘T’ variants based on end-state comfort, in particular producing [ɾ ] into a word-final vowel in single ‘T’ sequences, regardless of the tongue position before the ‘T’ (Chapter 2). This pattern did not occur with words that ended with ‘R’, because ‘R’s may be either [ɻ̩] or [ɹ̩]. So, there is no end-state-comfort reason to end ‘herded her’ with either the tongue tip up or down. Nevertheless, we might expect [ɾ ], [ɾ ] sequences the most because this sequence follows gravity while simultaneously having the fewest changes in tongue motion direction. Similarly, we could expect [ɾ ], [ɾ ] sequences because they involve the fewest changes in tongue motion direction, and [ɾ ], [ɾ ] sequences because they avoid the need to lift the tongue high against gravity. Lastly, we expect the ‘T’ sequence in ‘Saturday’ to be less variable than the one in ‘herded her’ because we believe that the ‘T’s in ‘Saturday’ involve one planned gesture instead of two.

B.1.1  Results

Graphs of the flap types show that in the phrase ‘We have Saturday off’, the vast majority are [ɾ ], [ɾ ]. In comparison, in the phrase ‘We have herded her mob’, most are [ɾ ], [ɾ ], followed by [ɾ ], [ɾ↔], and [ɾ ], [ɾ ] sequences. There were also significant numbers of [ɾ ], [ɾ ] sequences. The results were therefore considerably more variable than those for ‘We have Saturday off’, as seen in Figure B.1.

(a) Flap Sequence for ‘Saturday’    (b) Flap Sequence for ‘Herded her’

Figure B.1: Variability of ‘T’ sequences in ‘Saturday’ vs. ‘herded her’. X axis lists the initial ‘T’, Y axis lists the final ‘T’.

B.1.2  Discussion

The results show that almost all of the ‘T’ sequences from the word ‘Saturday’ are [ɾ ], [ɾ ] sequences.
In comparison, the ‘T’ sequences from the phrase ‘herded her’ were much more variable across speakers, and even across productions from the same speaker. The most common sequence was [ɾ ], [ɾ ], as predicted, followed by [ɾ ], [ɾ ] (the motion likely to have the least jerk), and [ɾ ], [ɾ ] (the motion with the least opposition to gravity). However, there were plenty of [ɾ ], [ɾ↔] and [ɾ↔], [ɾ↔] sequences. These sequences both violated the predictions and required the production of tip-up or retroflexed schwas in the second syllable of the word ‘herded’, productions that are not typical for schwas. This observation will be addressed more thoroughly in the next chapter when we discuss planning across the morpheme and word boundary.
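The variability contrast between the two panels of Figure B.1 can be illustrated with a rough summary. The sketch below is only an illustration under stated assumptions: the cell counts are taken as read from the figure (16 cells per panel), the mapping of cells to specific ‘T’ sequences is not used, and the two measures (share of the most frequent sequence, number of attested sequences) are chosen here for convenience rather than taken from the dissertation.

```python
from fractions import Fraction

# Cell counts as read from Figure B.1; which cell belongs to which
# 'T' sequence is deliberately ignored here.
SATURDAY = [0, 185, 1, 5, 0, 0, 0, 0, 13, 9, 0, 0, 0, 0, 0, 0]
HERDED_HER = [1, 0, 0, 13, 1, 1, 0, 31, 82, 0, 30, 1, 18, 0, 0, 0]


def modal_share(counts):
    """Share of all productions that fall in the most frequent sequence."""
    return Fraction(max(counts), sum(counts))


def attested(counts):
    """Number of distinct sequences produced at least once."""
    return sum(1 for c in counts if c > 0)


for label, counts in (("Saturday", SATURDAY), ("herded her", HERDED_HER)):
    print(f"{label}: modal share {float(modal_share(counts)):.3f}, "
          f"{attested(counts)} attested sequences")
```

On these counts, the most frequent sequence accounts for roughly 87% of the ‘Saturday’ productions but only about 46% of the ‘herded her’ productions, which matches the qualitative description above.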
