UBC Theses and Dissertations


Startling auditory stimulus as a window into speech motor planning (Chiu, Cheng-hao, 2015)



Startling Auditory Stimulus as a Window into Speech Motor Planning

by

Cheng-hao Chiu

B.A., National Chengchi University, Taiwan, 2002
M.A., National Chung Cheng University, Taiwan, 2005

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in The Faculty of Graduate and Postdoctoral Studies (Linguistics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

December 2015

© Cheng-hao Chiu 2015

Abstract

While speech planning has long been a topic of discussion in the literature, the specific content of speech plans has remained largely conjectural. The present dissertation brings to this problem a methodology using a startling auditory stimulus (SAS) to examine the contents of prepared movement plans unaltered by feedback regulation.

The startling auditory stimulus (SAS, > 120 dB) has been found to elicit rapid release of prepared movements with high accuracy and largely unaltered EMG muscle activity patterns. Because the response latency of these SAS-triggered movements is too short to allow for feedback or correction processes, the executed movements have been used to reveal the contents of movement plans with little or no feedback information influencing the prepared motor behaviours.

In the present dissertation, the first experiment applied this methodology to CV syllable production to test whether English CV syllables can be elicited in the same manner as other limb movements. Results show that a SAS can trigger an early release of a well-formed prepared English CV syllable, including intact lip kinematics and vowel formants. The second experiment investigated whether the observed short latency and additional lip compression are speech-specific or generic to any oral movement.
Results show that while prepared speech-like and non-speech movements are subject to early release by SAS, lip compression does not occur as frequently as it does in Spoken speech, suggesting that this preparatory compression may be speech-specific, likely relating to aerodynamic factors. The third experiment further tests whether lip compression that is independent of aerodynamic factors is observed in all speech-related tasks and is elicited at a short latency by SAS. Results show that comparable lip compression resulting from movement overshoot was observed for both Spoken and Mouthed speech. The fourth experiment looked into the level of suprasegmental gestures in speech planning. The results show that while both pitch contour and formants were maintained in the SAS-induced responses, pitch levels were compromised, suggesting that a prepared syllable ought minimally to include phonemic contrasts. SAS provides a useful tool for observing the contents of speech plans.

Preface

This dissertation contains one chapter of introduction, four content chapters, and one chapter of conclusion. Three papers drawn from these chapters have been published in different journals, and one chapter has been accepted for a conference proceedings. Detailed descriptions of collaboration and contribution are as follows.

Chapter 1, Section 1.2, "Neural pathways for normal and SAS-induced speech", was written by Cheng-hao Chiu in collaboration with Dr. Bryan Gick, the dissertation supervisor. Experiments reported in Chapters 2–4 were designed and run in collaboration with the Motor Control and Learning Laboratories at UBC, directed by Dr. Ian Franks. The experiment design was generated by Dr. Bryan Gick, Cheng-hao Chiu, and the group from the Motor Control and Learning Laboratories, including Dr. Romeo Chua, Dr. Dana Maslovat, and Andrew Stevenson. Cheng-hao Chiu was responsible for the writing of these chapters, with collaboration from Dr. Bryan Gick, Dr. Ian Franks, and Dr. Eric Vatikiotis-Bateson.
Chapter 2, "The StartReact effect in syllable production", was included as Experiment 2 of 2 in Stevenson et al. (2014). Chapter 3, "Startling spoken, mouthed, and non-speech movements", has been submitted to the Annual Meeting of the Canadian Acoustical Association and will be published in Canadian Acoustics. Chapter 5, "Pitch planning in English and Taiwanese Mandarin", was designed and conducted by Cheng-hao Chiu, with help from Dr. Yu-an Lu at the Department of Foreign Language and Literature at National Chiao Tung University, Hsinchu, Taiwan.

Published papers included in this dissertation are listed here.

Chiu, C. and Gick, B. (2014a). Pitch planning in English and Taiwanese Mandarin: Evidence from startle-elicited responses. Journal of the Acoustical Society of America Express Letters, 136(4):EL322–328.

Chiu, C. and Gick, B. (2014b). Startling speech: Eliciting prepared speech using startling auditory stimulus. Frontiers in Psychology – Cognitive Science, 5(1082).

Stevenson, A. J., Chiu, C., Maslovat, D., Chua, R., Gick, B., Blouin, J.-S., and Franks, I. M. (2014). Cortical involvement in the StartReact effect. Neuroscience, 269:21–34.

Chiu, C. and Gick, B. (in press). Startle-induced evidence for multidimensional information in speech plans. In Canadian Acoustics.

The experiments included in this dissertation were conducted with the approval of the UBC Behavioural Research Ethics Board: Preparation and Control of Movement (H09-00632) and Processing Complex Speech Motor Tasks (H04-80337). I have completed the Interagency Advisory Panel on Research Ethics Introductory Tutorial for the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (TCPS). The certificate was issued on August 22, 2008.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication

1 Introduction
  1.1 Startle paradigm
    1.1.1 SAS and its applications
    1.1.2 Simple RT tasks and the StartReact effect
  1.2 Neural pathways for normal and SAS-induced speech
  1.3 Organization of this dissertation

2 The StartReact effect in syllable production
  2.1 Introduction
    2.1.1 The StartReact effect and cortically determined processes
    2.1.2 Syllables as speech units
    2.1.3 Proposal
  2.2 Methods
    2.2.1 Participants
    2.2.2 Apparatus, task, and procedures
    2.2.3 Recording equipment
    2.2.4 Interpretation of EMG and lip displacement
    2.2.5 Data reduction, dependent measures, and statistical analyses
  2.3 Results
    2.3.1 Startle indicators
    2.3.2 The StartReact effect
  2.4 Discussion

3 Startling spoken, mouthed, and non-speech movements
  3.1 Introduction
  3.2 Methods
    3.2.1 Participants
    3.2.2 Apparatus, task, and procedures
    3.2.3 Recording equipment
    3.2.4 Data reduction, dependent measures, and analyses
  3.3 Results
    3.3.1 Lip compression
    3.3.2 Startle indicator
    3.3.3 Kinematic markers, reaction times, and timing
  3.4 Discussion

4 Startling syllables with pre-speech movements
  4.1 Introduction
  4.2 Methods
    4.2.1 Participants
    4.2.2 Apparatus, task, and procedures
    4.2.3 Data reduction, dependent measures, and analyses
  4.3 Results
    4.3.1 Startle indicator
    4.3.2 Kinematic markers, reaction times, and timing
    4.3.3 Lip kinematics
  4.4 Discussion

5 Pitch planning in English and Taiwanese Mandarin
  5.1 Introduction
  5.2 Methods
    5.2.1 Participants
    5.2.2 Apparatus, task, and procedures
    5.2.3 Recording equipment
    5.2.4 Data preparation and statistical analyses
  5.3 Results
  5.4 Discussion

6 General discussion
  6.1 Summaries
  6.2 Theoretical implications
    6.2.1 Startle paradigm as a window into speech motor planning
    6.2.2 Forward control
  6.3 Conclusion and future work

Bibliography

Appendices
A Chapter 3 additional figures

List of Tables

2.1 Experimental results of Exp. 2, showing mean and standard deviations (in brackets)
3.1 Numbers of trials with lip compression across conditions. Percentages were calculated from the trials included in analyses.
3.2 Mean compression displacement across conditions (mm); standard deviations in parentheses.
3.3 Mean reaction times (in ms) to dependent markers across conditions; standard deviations in parentheses. The voluntary movement onsets of Spoken responses from Chapter 2 are included for comparison.
4.1 Mean reaction time (in ms) at dependent markers across conditions; standard deviations are in parentheses.
4.2 Mean lip compression onset latency (in ms), lip compression duration (in ms), and lip compression displacement (in mm); standard deviations are in parentheses.
5.1 The four lexical tones in Mandarin Chinese and possible words
5.2 Average acoustic burst release times and syllable durations (in ms) across 4 Taiwanese Mandarin tones. Standard deviations are shown in parentheses.

List of Figures

1.1 Information processing for simple RT tasks
1.2 Proposed neural circuits for control and SAS-induced speech production. The grey arrows show the pathways for control responses and the black arrows present the pathways for SAS-induced responses. The solid lines are stimulus-triggered pathways in forward control and the dashed lines represent the origin of the feedback information input.
2.1 Relative positions of the monitor, participant, and loudspeaker.
2.2 Sternocleidomastoid (SCM) and orbicularis oris muscles for the attachment of surface EMG electrodes. Note the SCM and orbicularis oris figures were adapted and modified from Gray's Anatomy (20th ed.)
2.3 Experimental apparatus and set-up.
2.4 Average rectified raw EMG traces of baseline trials, including (from top) right SCM, left SCM, upper lip, and lower lip. EMG activity was plotted with respect to the imperative stimulus (the vertical grey line).
2.5 EMG activity, lip displacement, and response acoustic waveforms of a control exemplar. The EMG measure includes (from top) right SCM, left SCM, upper lip, and lower lip. EMG activity was rectified and plotted with respect to the imperative stimulus (the vertical grey line). Lip displacement and response acoustic waveforms are also plotted with respect to the imperative stimulus. Point A marks the beginning of voluntary movement; point B marks the time to the lower lip opening onset; point C marks the lowest position of the lower lip. The intervals across kinematic markers were labeled from t1 to t5. See text for details.
2.6 Mean EMG activity and kinematic displacement of control (grey) and startle (black) trials in the vocalization condition with respect to the imperative stimulus. The top four channels represent the mean EMG across four muscles (from top to bottom): left SCM, right SCM, upper lip, and lower lip. The bottom two channels depict the mean upper lip and lower lip displacement trajectories, respectively.
2.7 Mean EMG activity and kinematic displacement of control (grey) and startle (black) trials in the vocalization condition with respect to the lower lip opening onset. The top four channels represent the mean EMG across four muscles (from top to bottom): left SCM, right SCM, upper lip, and lower lip. The bottom two channels depict the mean upper lip and lower lip displacement trajectories, respectively.
2.8 SS ANOVA comparison of formant frequencies F1 and F2 over syllable /ba/ durations. The black line denotes the predicted fit for control responses and the white line denotes the startle responses. The dark grey and light grey bands surrounding the predicted fits represent a 95% confidence interval. Any white space between the transition lines for each condition represents a statistically significant difference between the given measures.
3.1 Lip displacement and response acoustic waveforms plotted with respect to the imperative stimulus. Point A marks the beginning of voluntary movement; point B marks the time to the lower lip opening onset; point C marks the lowest position of the lower lip. The intervals across these markers were labeled as t1, t2, and t3. See text for details.
3.2 Mean upper lip (left) and lower lip (right) EMG activity and displacement trajectories in the Spoken condition across participants. Data in all channels were normalized to the lower lip opening onset (vertical line). The distance of lip displacement between control and startle responses does not reflect the absolute distance between control and startle responses.
3.3 Mean upper lip (left) and lower lip (right) EMG activity and displacement trajectories in the Mouthed condition across participants. Data in all channels were normalized to the lower lip opening onset (vertical line). The distance of lip displacement between control and startle responses does not reflect the absolute distance between control and startle responses.
3.4 Mean upper lip (left) and lower lip (right) EMG activity and displacement trajectories in the Non-speech condition across participants. Data in all channels were normalized to the lower lip opening onset (vertical line). The distance of lip displacement between control and startle responses does not reflect the absolute distance between control and startle responses.
3.5 Mean EMG activity and kinematic displacement of control (grey) and startle (black) trials in the Spoken (repeated from Chapter 2), Mouthed, and Non-speech conditions. All data were normalized to the imperative stimulus. The top four channels represent the mean EMG across four muscles (from top to bottom): left SCM, right SCM, upper lip, and lower lip. Each channel is of the same scale but vertically arranged for visualization. The bottom two channels depict the mean upper lip and lower lip displacement trajectories, respectively. Red boxes mark the window of startle reflex activity, ranging from 30 ms to 120 ms after the onset of the imperative stimulus.
3.6 Relative timing ratio (non-transformed) of various kinematic time markers for startle and control trials. The whiskers for each bar are standard error bars. The time frame for these events is the time between lower lip voluntary movement onset and lower lip displacement trough (t3). Time intervals include the time between voluntary movement onset and lower lip opening onset (t1) and between lower lip opening onset and lowest lower lip displacement trough (t2).
4.1 Schematic Spoken response with lip displacement and response acoustic waveforms plotted with respect to the imperative stimulus. Point A marks the beginning of voluntary movement; point B marks the time to the lower lip opening onset; point C marks the acoustic onset; point D marks the lowest position of the lower lip. The intervals across these markers were labeled as t1, t2, t3, t4, and t5. See text for details.
4.2 Lip compression markers. Points M and N mark the onset and offset of lip compression, respectively (see text for details). Lip compression is here defined as the latency between Points M and N (tc). Lip compression displacement measures the change in lower lip displacement within tc.
4.3 Schematic lip displacement trajectories (top) from the upper and lower lips and corresponding lip aperture profile (bottom). Data was taken from one example trial from a participant. The two lips reach an equilibrium point and lip aperture stabilizes. The vertical grey line marks the onset of lip compression. See text for details.
4.4 Schematic lip displacement trajectories (top) from the upper and lower lips and corresponding lip aperture profile (bottom). Data was taken from one trial from one of the participants. This figure shows that lip aperture continues to decrease after the two lips make contact. The vertical grey line marks the onset of lip compression. An inflection of the lip aperture profile is observed. See text for details.
4.5 Mean EMG activity and kinematic displacement of control (grey) and startle (black) trials in the Spoken, Mouthed, and Non-speech conditions. All data were normalized to the imperative stimulus. The top four channels represent the mean EMG across four muscles (from top to bottom): left SCM, right SCM, upper lip, and lower lip. Each channel is of the same scale but vertically arranged for visualization. The bottom two channels depict the mean upper lip and lower lip displacement trajectories, respectively. Red boxes mark the window of startle reflex activity, ranging from 30 ms to 120 ms after the onset of the imperative stimulus.
4.6 Relative timing ratio (non-transformed) of various kinematic and acoustic time markers for startle and control trials. The time frame for these events is the time between lower lip voluntary movement onset and lower lip displacement trough (t5). Time intervals include the time between: voluntary movement onset and lower lip opening onset (t1), lower lip opening onset and voice onset (t2), voluntary movement onset and voice onset (t3), and lower lip opening onset and lower lip displacement trough (t4).
4.7 Upper lip (left) and lower lip (right) EMG activity and displacement trajectories in the Spoken condition. Data in all channels were normalized to the lower lip opening onset (vertical line).
4.8 Upper lip (left) and lower lip (right) EMG activity and displacement trajectories in the Mouthed condition. Data in all channels were normalized to the lower lip opening onset (vertical line).
4.9 Upper lip (left) and lower lip (right) EMG activity and displacement trajectories in the Non-speech condition. Data in all channels were normalized to the lower lip opening onset (vertical line).
5.1 SS ANOVA comparison of formant frequencies F1 and F2 over syllable /ba/ durations. The black line denotes the predicted fit for control responses and the white line denotes the startle responses. The dark grey and light grey bands surrounding the predicted fits represent a 95% confidence interval. Any white space between the transition lines for each condition represents a statistically significant difference between the given measures. NB: The figure is a revised version of Figure 2.8.
5.2 Schematic timeline of potential feedback for SAS-elicited speech
5.3 Taiwanese Mandarin F1 and F2 range for control and startle responses. SS ANOVA results of F1 and F2 frequency range across 4 tones. Each shaded band represents a 95% confidence interval. The white space between the transition lines for each condition represents a statistically significant difference between the given measures.
5.4 SS ANOVA comparison of pitch over English /ba/ durations. Each shaded band represents a 95% confidence interval. The white space between the transition lines for each condition represents a statistically significant difference between the given measures.
5.5 SS ANOVA results of pitch profiles across 4 Taiwanese Mandarin tones. Other legends follow Figure 5.3.
A.1 Number of trials with lip compression in the Mouthed condition. The bars in black represent trials occurring in the first half of the block; the grey bars represent trials occurring in the second half of the block.
A.2 Number of trials with lip compression in the Non-speech condition. The bars in black represent trials occurring in the first half of the block; the grey bars represent trials occurring in the second half of the block.

Acknowledgements

I am immensely grateful to have come this far. A moment like this would not be possible were it not for all the support that I received from the people around me during my Ph.D. years.
Here I give my sincere thanks.

First and foremost, my deepest gratitude goes to my advisor, Bryan Gick, for his guidance through every step of this dissertation work and for being a mentor for my academic career as well as an inspiring friend for life. My perspectives on research and on life have been brought to a whole new level by the fellowship we built together. Thank you, Bryan, for your unwavering support and encouragement, particularly during the most difficult times.

I am also enormously grateful to Dr. Ian Franks from UBC Kinesiology for all his patient guidance, responsive communication, and selfless support. Working with Dr. Franks truly broadened my view of science and of knowledge, particularly from a perspective other than linguistics. His profound understanding of motor control and his careful attitude to research are the two things that I look up to the most. Thank you, Dr. Franks, for showing me how to become a scientist. I expect to shout out "Eureka!" to you in return one day in the future.

I am also thankful to Eric Vatikiotis-Bateson, my other supervisory committee member. Thank you for sharing your thoughts and providing alternative perspectives on my research, especially through the last few years of my writing process. Your insights into data and research questions are more than valuable. I cannot thank you enough for all your input to my dissertation.

During the final stage of this dissertation write-up, precious comments and suggestions came from my university and external examiners: Todd Handy, Carla Kam, and Philip Hoole. Thank you all for your helpful comments and suggestions on my work.

Throughout my time at UBC, my home lab, the Interdisciplinary Speech Research Laboratory (ISRL), undoubtedly provided me with the strongest support. In particular, I would like to give my most sincere thanks to Donald Derrick and Mark Scott, for their inspiring research and insights. I was truly privileged to work with you.
To the wonderful people I met over the past few years, including Scott Moisik, Noriko Yamane, Murray Schellenberg, Jen Abel, and Alexis Black: thank you for all your support and your faith in me.

My connection to the Department of Kinesiology at UBC was also an unforgettable experience. Romeo Chua, Dana Maslovat, Andrew Stevenson, Chris Forgaard, and Jarrod Blinch, you have all shown me how to approach a discipline outside linguistics. Here I would like to show my gratitude for your help and sharing. Thank you all!

Over these years at UBC, the Department of Linguistics not only ignited my enthusiasm for research but also accommodated me both academically and personally. My special thanks go to all of the UBC Linguistics faculty, staff, and graduate fellows. Molly Babel, Strang Burton, Gunnar Hansson, Lisa Matthewson, Amanda Miller, Douglas Pulleyblank, Hotze Rullmann, Joseph Stemberger, and Martina Wiltschko, thank you for your direction and inspiration. Thank you, Sonja Thoma, Meagan Louie, Joel Dunham, and Joash Johannes, for your support and cover during the low points of my life. My gratitude also goes to Edna Dharmaratne and Shaine Meghji for their hearty help, support, and kindness over these years. Your efforts on administrative matters made my life at UBC feel like home. Thank you!

I would also like to give my thanks to some scholars I have worked with over the past few years: Sid Fels, Ian Stavness, Peter Anderson, Ling Tsou, Cormac Flynn, Andrew Ho, Jonathan Evans, Jackson T.-S. Sun, and Yuwen Lai. Thank you for including me in your projects and collaborations. My appreciation for all your support and guidance cannot be overlooked.

This dissertation and the research alongside it involved a great deal of laboratory work, and the work would not have been finished as smoothly as it was without the help of these laboratories.
I would like to thank the laboratories that kindly offered space and help for my research, and their directors: the ISRL (UBC), the Motor Control and Learning Laboratory (UBC), the Psycholinguistics Laboratory (NCTU, Taiwan), and the Experimental Phonology Laboratory (NCTU, Taiwan). Thank you, everyone.

Many thanks to the National Institutes of Health Grant DC-02717 awarded to Haskins Laboratories and to a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (NSERC) awarded to Bryan Gick for supporting my research throughout these years.

My journey through the Ph.D. could not have been as fruitful had there not been friends, cohorts, and colleagues sharing all my highs and lows. Thank you, Amélia Reis Silva, Analía Gutiérrez, Raphael Girard, Masaki Noguchi, Mark Scott, Anita Szakay, and Beth Stelle. Your support and the time we spent together have become an unforgettable chapter of my life. I will always remember our laughs and tears together; they made my Ph.D. stories. My thanks also go to my caring Taiwanese friends: Rae (林瑞屏), Sally (潘昭穎), Chiang (江佳陽), Justin (林昱廷), and John (吳宗聖). The true friendship we share is one of the most valuable things I gained from UBC. Thank you all from the bottom of my heart.

I am much blessed by the love and support of my beloved family. My gratitude to all of you can never be overemphasized.
Finally, I want to thank my family. Thank you to my father, 邱垂榮, and my mother, 吳玉女, for raising me and for your unconditional love; everything I have today was made possible by what you gave me, and beyond my gratitude I have no way to repay you. Thank you to my elder brother, 邱振訓, who has always been not only my role model but also the person I rely on the most. Thank you to my parents-in-law, 盧振哲 and 劉秋旻, to 盧郁文 and her husband Sung Lee, to my nephew Caleb Lee, and to 盧郁心, for being my strongest backing over these years. Most importantly, I thank my beloved wife, 盧郁安: thank you for your companionship, tolerance, and support along the way. You gave me the confidence and courage to keep moving forward in academia and in life, and you broadened my horizons. With you and 真語 as my closest family, my life is more complete. Thank you both; I love you.

All the remaining errors in this work are my full responsibility.

Dedication

To my mother in heaven

Mom, everything good about me is from you.
Hope this makes you proud.

Mom, thank you for making my dream come true.
I hope you feel proud.

Chapter 1

Introduction

When people are startled, a series of physiological reflexes can be observed: we shut our eyes, shrug our shoulders, and grimace. These startle reflexes are the result of a protective mechanism. Interestingly, if people are startled while they are engaged in a prepared movement, this prepared movement will be performed at a much shorter latency than a normal voluntary movement (e.g., Carlsen et al., 2004b; Siegmund et al., 2001; Valls-Solé et al., 1999). This rapid-release reflex opens a window that permits researchers to investigate how people plan actions, in particular complex actions such as speech. This dissertation employs the startle paradigm to explore speech motor planning.

While speech motor control has attracted a lot of attention over the past decades, only a few studies have offered insight into the specific motor content of a speech plan. The content of such a plan should presumably specify, at minimum, those aspects of speech that are essential in determining linguistic contrast, though it may omit many aspects of a physical speech utterance that can be determined by production or feedback mechanisms. Recent speech research focuses on forward models (Pickering and Garrod, 2013; Scott et al., 2013; Tian and Poeppel, 2012), expanding on a long tradition of work on the programming of speech motor plans (e.g., Keele, 1981; Klapp, 2003; Lashley, 1951).
These works have attempted to uncover the detailed content of forward plans by assessing behavioural performance and neuroimaging results. However, the plans revealed in these studies were not investigated using methods that allow the forward plans to be observed independently of feedback control. This dissertation uses an experimental paradigm with a startling auditory stimulus (the "startle" paradigm; see below for details) to elicit prepared speech production. This paradigm is considered a reliable method for eliciting prepared responses while only limited or no feedback information is in play, and would thus allow researchers to reveal some of the details of speech plans before any feedback adjustments.

This chapter first introduces the startle paradigm and then provides an overview of the neural pathways for SAS-induced responses. A series of experiments using this startle paradigm to investigate speech motor control is described in the subsequent chapters. The overall structure of this dissertation is outlined in the final section of the chapter.

1.1 Startle paradigm

1.1.1 SAS and its applications

Studies over the past four decades have illustrated that the response in a simple reaction time (RT) task may be programmed prior to the introduction of the imperative stimulus (Klapp, 1977, 1995, 2003; Wadman et al., 1979). The presentation of the imperative stimulus then triggers the initiation of the prepared response. In limb movement studies, a startling acoustic stimulus (SAS, > 120 dB) has been shown to be an effective trigger of prepared movements (Carlsen et al., 2012; Valls-Solé et al., 1999).
Prepared responses to a SAS are performed at shorter latencies than normally triggered responses, while the kinematic variables and EMG characteristics of prepared responses remain largely unchanged; for instance, a SAS was found to trigger a prepared arm movement after a time lapse as small as 70 ms after the stimulus onset (e.g., Carlsen et al., 2004b; Valls-Solé et al., 1999). This early release of prepared action in response to a SAS, termed the StartReact effect, has been shown to produce a high accuracy rate with unmodified muscle activity profiles (Valls-Solé et al., 2008, 1999). Valls-Solé et al. (2008) hypothesized that the StartReact effect bypasses feedback input because the response latencies produced are too short to accommodate the usual cortical processing for voluntary movements. Because of their very short onset latency, SAS-induced actions may be fully executed before they are affected by sensory feedback, thus offering the researcher a window onto the unmodified plan.

The largely unaltered muscle activity patterns observed in startle trials suggest that the SAS-induced release is not simply the result of a startle reflex superimposed on a faster voluntary movement. Rather, the early release of the voluntary movement indicates response triggering at a subcortical level, since the reaction times observed are too short to allow for any cortico-cortical transfer of activity (cf. Carlsen et al., 2004b; Valls-Solé et al., 1999). Carlsen et al. (2009a) proposed that the voluntary and startle systems interact at the supra-spinal level (in the reticular formation in the brainstem), where information for the prepared movements is stored. A comparison of finger and arm movements in Carlsen et al. (2009a) shows that movements involving subcortical pathways are more susceptible to early initiation by a SAS, while movements that are more strongly mediated by cortico-spinal connections, such as finger-lifting movements, are less sensitive to SAS triggering.
These findings suggest that the StartReact response involves a rapid triggering of subcortically stored information.

Recent studies, however, have provided evidence for cortical activation during startle, suggesting that not all programming is stored and triggered at subcortical levels. Transcranial magnetic stimulation (TMS) studies have demonstrated that, when stimulation is applied to the motor cortex together with a SAS, the resulting cortical silent period delays the StartReact response in startle trials (Alibiglou and MacKinnon, 2012; Stevenson et al., 2014).¹ If only subcortical processes were involved, TMS should not affect the StartReact response; the observed delay in the StartReact response therefore suggests that the pathways involved in the StartReact response are mediated by, rather than bypass, cortical areas.

¹ Exp. 2 of 2 in Stevenson et al. (2014) is included below as Chapter 2 of this dissertation.

1.1.2 Simple RT tasks and the StartReact effect

In the motor control literature, a great deal of attention has been paid to response preparation and execution. From the viewpoint of information processing, a movement response is invoked through a series of processes, from the input of the imperative stimulus to the output of motor behaviour. A simple reaction time (RT) task with a single target response can be used to test the StartReact effect. In this task, participants prepare the response in advance and only respond to the imperative stimulus, thus ensuring that the programming of the response occurs prior to the stimulus presentation (Figure 1.1).

Figure 1.1: Information processing for simple RT tasks

Klapp (1995, 2003) demonstrated that in a simple RT task, the reaction time needed to initiate the prepared speech act is only sensitive to the number of response elements (e.g., the number of repetitions), not to the complexity of the response (e.g., the number of syllables).
When a speech response has been prepared in advance of a stimulus, the internal complexity of that response does not affect the reaction time. Klapp (1995, 2003) argued, based on these data, that speech motor plans are determined by two different processes: the internal (INT) structure and the sequence (SEQ) structure. In a simple RT task, the complexity of a prepared speech sequence (INT) does not affect reaction time, since the internal structure of the response (including duration) is specified before the stimulus; only the number of chunks (SEQ) affects the RT in this type of task. In the context of the SAS design, it is assumed that in a simple RT task the internal structure of a response is pre-specified before the stimulus, and that these pre-specified details, including kinematics and acoustics, are then subject to rapid release by the presentation of a SAS. However, Klapp only considers the INT and SEQ processes at the level of segments and syllables. It is not clear whether phonemic features, pre-speech postures, and suprasegmental information can also be pre-specified or organized in a speech plan. The present thesis intends to bridge this gap by using the startle paradigm to examine the planning of syllable production.

In a simple RT task using SAS, the target response is prepared in advance, while the imperative stimulus may be either a control tone or an unpredictable SAS. As the selection and programming of responses occur before the stimulus presentation, the prepared upper limb response can be released by a SAS at a short response latency (Carlsen et al., 2004b; Valls-Solé et al., 1999). When participants are startled, reflexive EMG activity can be observed in the bilateral sternocleidomastoid (SCM) muscles; such activity is considered to be a reliable indicator of the startle reflex (Carlsen et al., 2011). Such reflex activity (pre-motor time shorter than 70 ms) is consistently observed when a SAS is presented to participants.
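As an illustration, the SCM criterion just described can be expressed as a short trial-screening script. This is a hypothetical sketch: the baseline-mean-plus-3-SD onset rule and the function names are illustrative conventions, not the detection method reported in the cited studies.

```python
import numpy as np

def emg_onset_ms(emg, fs_hz, baseline_ms=100, k=3.0):
    """Onset time (ms after the stimulus at t = 0): first post-stimulus sample
    exceeding baseline mean + k * SD. `emg` is rectified EMG whose first
    `baseline_ms` milliseconds precede the stimulus. Returns None if quiet."""
    n_base = int(baseline_ms * fs_hz / 1000)
    base = emg[:n_base]
    threshold = base.mean() + k * base.std()
    above = np.nonzero(emg[n_base:] > threshold)[0]
    return None if above.size == 0 else above[0] * 1000.0 / fs_hz

def is_startle_trial(scm_emg, fs_hz, cutoff_ms=70.0):
    """Count a trial as showing a startle reflex when SCM activity
    begins within 70 ms of the stimulus (the pre-motor-time criterion)."""
    onset = emg_onset_ms(scm_emg, fs_hz)
    return onset is not None and onset < cutoff_ms
```

With EMG sampled at 4 kHz (the rate used in Chapter 2), an SCM burst beginning 40 ms after the stimulus would classify the trial as a startle trial, while a trial with no suprathreshold SCM activity would be rejected.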
In addition to this reflexive muscle activity, the StartReact effect is characterized by rapid release of a prepared response with unaltered kinematics (e.g., Carlsen et al., 2012; Valls-Solé et al., 1999).

The StartReact effect has also been observed in experiments on head-turning movements (Oude Nijhuis et al., 2007; Siegmund et al., 2001), eye movements (Castellote et al., 2007), and stepping (MacKinnon et al., 2007). In addition to the early release of the prepared response, the SAS can also cause the release of independently prepared motor events. For example, Forgaard et al. (2013) used SAS to elicit prepared upper limb movements, which typically display an agonist (ag1) – antagonist (ant) – agonist (ag2) tri-phasic EMG pattern. While the tri-phasic activation pattern in limb movements is traditionally understood to be a temporally bound movement series, the observed spatiotemporal complexity between agonist and antagonist muscles suggests that these targeted movements in fact result from coordination of muscle synergies (d'Avella and Bizzi, 2005): muscle synergies that are associated with the same functional task may rely on time-varied coordination, as opposed to a time-fixed clock. Forgaard et al. (2013) showed that, when a SAS is presented at the onset of ag1 during a prepared short elbow extension movement (30°), the ant and ag2 bursts are released at a shorter latency. The fact that a SAS causes early elicitation of ant and ag2 suggests that agonist and antagonist forces may be independently programmed and released by a SAS. These results indicate that the SAS paradigm can be used to determine whether serial or concurrent motor behaviours associated with a common functional purpose are independently programmed and whether or not their initiation is time-locked.

The SAS paradigm has been applied to both cortically and subcortically determined processes.
As noted in the paragraphs above, an ongoing debate exists concerning whether a SAS triggers only subcortically stored programs (Carlsen et al., 2004a; Castellote et al., 2012; Nonnekes et al., 2014; Valls-Solé et al., 1999), or whether it also affects cortically processed motor behaviours (Alibiglou and MacKinnon, 2012; Stevenson et al., 2014). This dissertation first seeks evidence to test whether the StartReact effect occurs in cortically involved processes. SAS triggering may also be an effective method for probing motor behaviours with heavy cortical involvement (see Section 1.2), such as speech. In this case, the pathways for the StartReact response would be mediated by, rather than bypass, the cortical areas. In Section 1.2, I summarize current scholarship on the neural communication underlying voluntary and SAS-induced speech production.

1.2 Neural pathways for normal and SAS-induced speech

Two types of voluntary vocalization can be identified, with differentiated neural pathways: low-level, innate (nonverbal or emotional) vocalization, and speech vocalization (Jürgens, 2002; Simonyan and Horwitz, 2011). As summarized in Jürgens (2002), nonverbal vocalization is initiated from the anterior cingulate cortex. The periaqueductal gray receives the projection from the anterior cingulate cortex and functions as a relay station by exerting activations on the reticular formation in the lower brainstem. Voluntary speech, by contrast, involves a wider and more complicated neural communication relay across cortical and subcortical areas. Speech motor preparation, storage, and initiation rely on transcortical communications, which are mediated by cortico-spinal connections.

One of the techniques most commonly used to identify cortical areas associated with a particular motor behaviour is transcranial magnetic stimulation (TMS). When TMS is applied through the scalp to a cortical area, cortical activity is paused for a very short period of time.
A quiet period of EMG activity is observed after the presentation of TMS, resulting in a delay of the intended motor response. Terao et al. (2001) performed TMS over the cortex during the preparation of vocalization in response to a visual cue. Their results show that the RT of vocalization is delayed when TMS is delivered up to 150 ∼ 200 ms before the expected onset of vocalization. Similarly, Schuhmann et al. (2009) showed that object-naming utterances can be delayed when TMS is applied 300 ms after the presentation of a picture prompt.

While TMS-induced delay provides evidence for the involvement of cortical processing in speech production, fMRI studies also reveal strong evidence of cortical involvement for speech preparation and initiation. Riecker et al. (2005) investigated the role of the cortex in the production of syllable repetition. A blood-oxygen-level dependent (BOLD) contrast measurement revealed substantial activations in the left supplementary motor area (SMA), left dorsolateral prefrontal cortex (including Broca's area), left anterior insula, and right superior cerebellum 3 ∼ 5 seconds after the onset of acoustic stimulation (i.e., clicks). On the other hand, the left sensorimotor cortex, left thalamus, left putamen/pallidum, left caudatum, and right inferior cerebellum achieved their peak activations 8 ∼ 9 seconds after the onset of speech production. The segregation of activations across these areas suggests two levels of speech motor control. The areas showing early BOLD signal changes are responsible for motor preparation, whereas the areas with late BOLD signal changes account for the execution process. Brendel et al. (2010) employed a similar design to investigate the time course of activation of the mesio-frontal cortex during the preparation and execution of syllable repetitions.
A widespread network, including the brainstem, thalamus, basal ganglia, SMA proper, inferior frontal gyrus (IFG; i.e., Broca's area), and insula, showed significant activations for both motor preparation and execution (see Brendel et al., 2010, for details). In particular, extensive activations in the SMA, IFG, and insula suggest that these areas are involved in both the preparation and execution of syllable repetitions.

Once prepared, speech motor commands are then delivered via a descending pathway through the putamen and substantia nigra to the parvocellular reticular formation, which is directly connected to the phonatory motoneurons in the medulla oblongata (Iwata et al., 1996; Jürgens, 2002; Simonyan and Horwitz, 2011). It should be noted that the initiation of prepared speech hinges on cortically determined processes and cannot be released simply by reticulo-spinal commands. When participants produce a pre-cued syllable sequence in response to a GO signal (as opposed to a NO GO signal), more significant activity is observed in the primary and somatosensory cortices, the superior temporal plane, the anterior insula, and the medial premotor areas, with particularly rich activity in the SMA near the superior convexity and portions of the pre-SMA and anterior cingulate sulcus (Bohland and Guenther, 2006).
A main effect of overt speech (i.e., GO trials) suggests that these cortical areas are associated more with initiation than with planning or sequence buffering of the prepared speech.

The above studies suggest that the preparation of intended speech involves a wide range of cortical and subcortical components; in particular, the initiation and execution of prepared speech depend on the exertion of cortical projections to subcortical areas via a descending cortico-subcortical pathway.

The fact that presentation of a startling stimulus can cause rapid release of a response suggests that responses can be prepared ahead of time by the relevant systems and stored (or kept on hold) somewhere before the release. The debate about the storage location is still ongoing. While some SAS studies argue for subcortical storage based on the rapid release triggered by a SAS (Carlsen et al., 2004a; Castellote et al., 2012; Nonnekes et al., 2014; Valls-Solé et al., 1999), TMS data show that a SAS can elicit cortically prepared responses via a subcortically mediated pathway; the StartReact effect is therefore also seen in cortically processed responses (Alibiglou and MacKinnon, 2012; Stevenson et al., 2014).

The storage of phonological representations was discussed by Baddeley et al. (1984), who demonstrated that the articulatory loop comprises subsystems for both storage and rehearsal. While initial storage may decay rapidly, rehearsal keeps providing updates (Baddeley, 1998; Baddeley et al., 1984; Jacquemot and Scott, 2006). Neuroimaging data also support the notion of continuous updates provided by rehearsal, which continually feed information to the storage (e.g., Chein and Fiez, 2001). The encoding, storage, and rehearsal of phonological representations are believed to take place in a wide range of cortical areas, including the SMA, premotor cortex, and bilateral inferior frontal regions (see e.g., Chein and Fiez, 2001).

We now turn to the neural pathways for SAS-induced responses.
As summarized in Carlsen et al. (2012), the StartReact response is mediated via an ascending thalamo-cortical pathway, activated by exertion of the reticular formation on the thalamus. The neural pathways used during the StartReact response are considered to be separate from the normal pathways used for voluntary movements. Increased activation in the thalamus provides input to the primary motor cortex to initiate the cortically prepared movement via a descending cortico-spinal pathway. It is noteworthy that the StartReact pathways for upper limb movements largely overlap with the pathways for speech production. Similarly, speech production may also rely on thalamo-cortical circuits and a descending cortico-spinal pathway. Specifically, after receiving input from the cerebellum, the thalamus projects to the primary motor cortex and to Broca's area; commands are mediated via the putamen and reticular formation and sent down to the phonatory motoneurons in the spine (Guenther et al., 2006; Iwata et al., 1996; Jürgens, 2002). Given that speech production and upper limb movements both involve similar thalamo-cortical pathways, it is hypothesized that upper limb movements and speech movements also share the same StartReact pathways when elicited by a SAS. Shorter reaction times in StartReact responses are accounted for by increased neuron activity, causing the response signal to reach the initiation threshold more quickly (see Carlsen et al., 2012, for details).

Figure 1.2 depicts the neural pathways for control and SAS-induced speech production. The speech plan is encoded and prepared primarily in the cortical areas (the grey circle), with the perceptual input and feedback information originating from other cortical areas and the cerebellum (the grey dashed lines inputting to the encoded program). While waiting for the imperative stimulus, the encoded program can be rehearsed and constantly updated (the black circle).
When the imperative stimulus is presented, the most updated motor program is released.

Figure 1.2: Proposed neural circuits for control and SAS-induced speech production. The grey arrows show the pathways for control responses and the black arrows present the pathways for SAS-induced responses. The solid lines are stimulus-triggered pathways in forward control and the dashed lines represent the origin of the feedback information input.

As summarized in Carlsen et al. (2012), the SAS-induced response traverses an ascending pathway from the cochlear nucleus, through the inferior colliculus, reticular formation, and thalamus, to the motor cortex (the black arrows). The released prepared program then travels along the descending motor pathways (the black arrows). For speech production, the commands of the prepared program can be released via the reticular formation and sent down to the spinal motoneurons. If the speech performance can be initiated at a short latency with limited feedback information, this suggests that the triggering for speech and limb movements shares a similar subcortically mediated pathway.

We can conservatively calculate the time required for a speech response by following the same procedure used to calculate the required time span for voluntary limb movements. First, Schroeder and Foxe (2002) reported a response latency of 10 ∼ 25 ms from the onset of the auditory stimulus to the activation in the auditory cortex. Another 5 ∼ 10 ms are required for conduction between the lateral lemniscus and the thalamus (Stockard et al., 1977) for auditorily evoked responses. Transcortical and thalamus-primary motor cortex transmissions require 2 ∼ 4 ms for conduction (Carlsen et al., 2012; Guenther et al., 2006).
Finally, the latency of the orofacial muscle EMG response to TMS on the face area of the motor cortex is about 11 ∼ 12 ms (Meyer et al., 1994), and the motor time for the muscle movement adds roughly another 30 ms. Adding these values together yields a minimum of 58 ∼ 81 ms for a response to a SAS. As reported in Stevenson et al. (2014), the onset of SAS-induced responses occurs at about 75 ms, suggesting the presence of a StartReact effect for speech movement. As the results reported in the following chapters will demonstrate, voluntary movement onsets fall within the predicted range for a StartReact response, suggesting that the release of the prepared syllable follows the proposed StartReact pathway.

To sum up this section: the neural correlates and pathways for SAS-induced responses support the view that SAS-induced speech responses may contain unaltered details of speech plans, providing researchers with a window into speech motor planning that bypasses afferent feedback information.

1.3 Organization of this dissertation

The main purpose of this dissertation is to use the SAS paradigm to investigate how speakers construct their speech plans in speech motor control. Specifically, a series of experiments was designed to determine which details are included in the speech plan. The SAS paradigm elicits prepared responses both rapidly and accurately; these SAS-induced responses may therefore reveal details that are pre-specified in the speech plan and executed independently of feedback regulation.

Each chapter of this dissertation tackles a different aspect of the planning of spoken syllables. The dissertation content is structured as follows.

Chapter 2 begins by investigating whether the StartReact effect is observed in prepared speech elicited by a SAS.
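The conservative conduction-latency tally from Section 1.2 can be reproduced in a few lines (a sketch; the per-stage bounds are the values quoted there from the cited studies):

```python
# Per-stage latency bounds in ms (lower, upper), as quoted in Section 1.2.
stages = {
    "auditory cortex activation (Schroeder and Foxe, 2002)": (10, 25),
    "lateral lemniscus to thalamus (Stockard et al., 1977)": (5, 10),
    "transcortical / thalamus-M1 conduction (Carlsen et al., 2012)": (2, 4),
    "orofacial EMG response to motor-cortex TMS (Meyer et al., 1994)": (11, 12),
    "motor time for the muscle movement": (30, 30),
}

lower = sum(lo for lo, _ in stages.values())  # 58 ms
upper = sum(hi for _, hi in stages.values())  # 81 ms
print(f"minimum SAS-to-speech latency: {lower} ~ {upper} ms")

# The ~75 ms response onsets reported by Stevenson et al. (2014)
# fall inside this predicted window.
assert lower <= 75 <= upper
```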
While a number of studies on limb movements suggest that the StartReact effect may be more associated with triggering from subcortical areas, it is not clear whether prepared movements that involve heavy cortical processing, such as speech production, are also subject to the same effect. An experiment was conducted that employed a simple RT task under SAS. The upshot of this test is a straightforward confirmation that the startle paradigm is applicable to prepared speech. Specifically, the results show that the StartReact effect is observed for prepared syllable production, and that speech planning may govern a domain of syllables at least up to CV length.

Chapter 3 further tests whether multi-dimensional aspects of speech tasks are preserved in SAS-induced responses. In particular, this chapter tackles the question of whether SAS-induced effects are speech-specific or generic to any voluntary lip movements. To test this, Mouthed speech and Non-speech oral movements were employed in a SAS experiment. SAS-elicited responses reveal that lip compression did not occur in the Mouthed speech responses as frequently as in Spoken speech responses, and that lip compression, measured as lower lip vertical displacement, was not as large in the Mouthed speech responses as in the Spoken speech responses. The results suggest that lip compression, potentially driven by aerodynamic factors, may be specific to speech, and that multi-dimensional information that is pre-specified in speech tasks is preserved in response to SAS.

Chapter 4 investigates whether, in a simple RT task, lip compression that is independent of the aerodynamics of the responses is observed in all speech-related tasks and can be elicited at a short latency by SAS. In particular, the experiment examines lip compression in prepared sequences with a preceding mouth-closing movement and compares these with sequences without preparatory movements.
While reaction time was not affected by the additional preparatory movement, lip compression appears to be comparable in Spoken and Mouthed responses; the differences in lip compression seen in the mouth-closed condition are not observed in the mouth-open condition.

Chapter 5 compares SAS responses in English and Taiwanese Mandarin, showing that acoustic details that create phonemic contrasts, such as formants and pitch contour profiles, may be pre-specified in the speech plan and thus be resistant to SAS-induced effects.

Finally, Chapter 6 discusses the implications of the findings and concludes the dissertation.

Chapter 2

The StartReact effect in syllable production

2.1 Introduction

In order to use the startle paradigm to examine different aspects of speech plans, it is first necessary to confirm that this experimental design is applicable to prepared speech production. In this chapter, an experiment using SAS was conducted to study prepared CV syllables. Positive results support the conclusion that the startle paradigm effectively elicits prepared speech, suggesting likely cortical involvement in startle effects, and thus providing a platform for further investigation.

2.1.1 The StartReact effect and cortically determined processes

Recently, research investigating the preparation and initiation of goal-directed actions (e.g., arm extension movements) has shown that preparation can be completed in advance, possibly stored in subcortical areas, and then triggered by a startling acoustic stimulus (i.e., SAS; > 120 dB) (Carlsen et al., 2004b; Valls-Solé et al., 1999). When a participant is engaged in a reaction time task and is waiting for an imperative stimulus to produce a prepared movement, a startling auditory stimulus can trigger the specific pre-programmed response in significantly reduced reaction time without compromising response accuracy.
The resulting response latency is so short (< 70 ms) that it appears unlikely the process of initiation could involve cortical or cognitive mechanisms. This early release of the intended movement, termed the "StartReact effect" (Valls-Solé et al., 1999), is accompanied by largely unaltered kinematic phasing and muscle activity patterns.

The StartReact effect has been reported in head-turning movements (Oude Nijhuis et al., 2007; Siegmund et al., 2001), eye movements (Castellote et al., 2007), and stepping (MacKinnon et al., 2007). Nevertheless, it is not clear whether speech movements, which presumably hinge on heavy cortical processing, can also be elicited in a similar fashion. In this thesis, I will explore whether a SAS can induce a StartReact effect in speech production. A simple reaction time (RT) task will be employed to examine the preplanning of various details of speech production.

Early in the literature on the StartReact effect, it was proposed that intended responses are programmed and stored in subcortical areas (e.g., Carlsen et al., 2004b; Valls-Solé et al., 1999), with the release of the prepared program bypassing normal cortical pathways for voluntary movements. However, while these studies emphasize the storage and triggering in subcortical areas, other studies propose that the subcortical storage and triggering hypothesis may not provide the only account of the StartReact mechanism. For example, Siegmund et al. (2001) attributed the early release to enhanced perceptual processing and intersensory facilitation in cortical circuits. Evidence from transcranial magnetic stimulation (TMS) also provides support for cortical involvement in the initiation of prepared responses (e.g., Alibiglou and MacKinnon, 2012; Stevenson, 2011).
More recent studies propose that the startle reflex induces increased activation in the thalamus, allowing the prepared response to be released as early as an involuntary response (e.g., Carlsen et al., 2012; Maslovat et al., 2011). Following Carlsen et al. (2012), it is hypothesized that "a SAS acts as a subcortically mediated trigger for a cortically stored motor program" (Carlsen et al., 2012, p. 30).

Considering speech production, it is widely acknowledged that a wide range of cortical and subcortical areas is involved in the preparation and initiation of speech. In particular, the initiation of prepared speech hinges on cortically determined processes and cannot be released simply by reticulo-spinal commands (Bohland and Guenther, 2006). In that study, participants were instructed to produce a pre-cued syllable sequence when a GO signal stimulus was presented; no syllable production was elicited when the stimulus was a NO GO signal. The results for GO trials showed more significant responses in the primary and somatosensory cortices, the superior temporal plane, the anterior insula, and the medial premotor areas, with particular focus in the SMA near the superior convexity and portions of the pre-SMA and anterior cingulate sulcus. A main effect of overt speech (i.e., GO trials) suggests that these cortical areas are associated more with the initiation than with the planning or sequence buffering of the prepared speech. Applying the startle paradigm to prepared speech not only allows us to examine the potential impact of the StartReact effect on speech production, but more generally can lend support to the hypothesis that cortically determined motor programs are subject to rapid triggering by a SAS.

2.1.2 Syllables as speech units

For the purposes of the current study, CV syllables were chosen as the target syllable for a number of reasons. CV syllables are cross-linguistically common in terms of word inventory (Bell and Hooper, 1978).
Earlier studies discussed planning at different levels of phonological/phonetic representation, such as syllables (Levelt et al., 1999), features (Bernhardt and Stemberger, 1988), and subphonemic representations (Derrick, 2011). Syllables in particular have attracted the most attention, and the role of the mental syllabary has been emphasized through different experimental settings, such as naming tasks (Ferrand and Segui, 1998; Levelt and Wheeldon, 1994), priming (Cholin and Levelt, 2009), and fMRI (Brendel et al., 2011). As fundamental units in speech production, the mental syllabary is interpreted as the output of projections from the premotor cortex to the primary motor cortex, supplemented by the cerebellum (Guenther, 2006).

The status of syllables as fundamental units of speech production is not only grounded in experimental results, but also supported by human biomechanical motor development. In language acquisition, it has been suggested that the CV syllable (e.g., /ba/) may be a development that emerged from mastication behaviours (Davis et al., 2002; MacNeilage, 1998; MacNeilage et al., 2000). In pre-speech babbling, CV syllables are commonly observed, as they have much in common with chewing behaviours. Take /ba/ for example: the closure of the consonant and the opening of the vowel form a syllabic frame. The closure is completed between the lips, whereas no positional specification is required for the tongue. Thus, when the lips burst open, the null specification of the tongue naturally leads to a central vowel /a/, as opposed to front vowels (e.g., /i/) or back vowels (e.g., /o/) (cf. MacNeilage et al., 2000).
This is why /ba/ is commonly seen in early speech babbling. If a syllable is the result of the composition of these motor command projections, its articulatory content should include details from these motor commands.

2.1.3 Proposal

The aim of this study is to investigate whether the StartReact effect is also observed in movements that involve heavy cortically determined commands, such as prepared speech. There is detailed evidence suggesting that the preparation of intended speech involves a wide range of cortical and subcortical networks, and that the initiation and execution of prepared speech depend on exerting cortical projections to subcortical areas via a descending cortico-subcortical pathway (Iwata et al., 1996; Jürgens, 2002; Simonyan and Horwitz, 2011). I believe the use of speech provides a novel alternative to cortically dependent finger movements (cf. Carlsen et al., 2009a; Honeycutt et al., 2013), as speech requires a high degree of cortical and cognitive involvement (Grimme et al., 2011) and has been shown to be controlled in a different manner than non-speech facial movements (Tremblay et al., 2003). Given the proximity between the mouth and the neocortex, it is assumed that the threshold for the StartReact effect may not be as high as that in finger-lifting tasks. Here we hypothesize that the initiation of prepared speech relying on cortical commands is also susceptible to the rapid release triggered by a SAS. I test this hypothesis by employing the SAS design. As revealed in previous studies, when paired with an imperative stimulus, a SAS can trigger an early release of the prepared response. If there is cortical involvement in the StartReact effect, we would predict that the syllable /ba/ will be subject to the StartReact effect and triggered early by the SAS.

As proposed by Carlsen et al. (2012), the StartReact circuit involves an ascending reticulo-thalamo-cortical pathway. The SAS invokes a rapid involuntary release of a prepared response.
Given this account of the mechanism that underlies the StartReact effect, I hypothesize that the initiation of prepared speech is also susceptible to rapid release triggered by a SAS.

2.2 Methods

2.2.1 Participants

Data were collected and analyzed from nine participants (3 male and 6 female; mean = 23 years, SD = 4.2 years) who showed a consistent startle reflex in the sternocleidomastoid (SCM) muscle on baseline startle trials (a pretest startle trial prior to the testing session, no speech required) and on more than 50% of the startle testing trials. Participants were all native speakers of North American English. Prior to the testing, participants signed an informed consent form and were naïve to the hypothesis under investigation. The experiment was conducted following the ethical guidelines established by the University of British Columbia.

2.2.2 Apparatus, task, and procedures

Participants sat in an upright chair facing a computer monitor (Acer X223W, 22", 60 Hz refresh rate) at a distance of approximately 1.5 meters and were instructed to look straight ahead at the monitor and respond to an acoustic stimulus by vocalizing the target syllable /ba/ as quickly as possible. A visual display of the syllable /ba/ was presented on the monitor concurrently with the acoustic stimulus. Throughout the testing session, participants were asked to start with their mouths closed in a relaxed posture, without compressing their lips during the preparation. Prior to this testing session, a baseline startle trial was introduced: participants were seated and waiting for testing to begin when the unexpected baseline SAS (124 ± 2 dB, 40 ms, 1,000 Hz, < 1 ms rise time) was delivered. Figure 2.1 depicts the positioning of participants and experimental apparatus.

Figure 2.1: Relative positions of the monitor, participant, and loudspeaker.

All testing trials began with a warning tone (100 ms, 1000 Hz, 80 dB) played directly from the computer's sound card.
The acoustic imperative stimulus and the visual /ba/ followed the warning tone after a random foreperiod of between 1500 and 2500 ms. This auditory signal was either a control stimulus (80 ± 2 dB, 100 ms, 1,000 Hz) or a startling stimulus (124 ± 2 dB, 40 ms, 1,000 Hz, < 1 ms rise time), generated by a customized computer program. The acoustic stimuli were amplified (HiFi stereo power audio amplifier A180W) and then presented via a loudspeaker placed directly behind the head of the participant. The acoustic stimulus intensities were measured using a sound level meter (Cirrus Research model CR:252B, "A" weighted scale, impulse response mode) at a distance of 30 cm from the loudspeaker (approximately the distance to the ears of the participant).

Participants performed a single testing session of approximately 20 minutes. The testing block consisted of twenty control trials and five startle trials. The five startle trials were presented pseudo-randomly such that the first trial was never a startle trial, nor were there two consecutive startle trials. Prior to this testing, participants also performed lip movement and mouthed articulation tasks in order to familiarize themselves with the testing procedure. Only results from the vocalized syllables were included in the analyses.

2.2.3 Recording equipment

Participants performed the tasks with three infrared light-emitting diodes placed on the center of the upper lip, the lower lip, and the bridge of the nose. 3D positions of these diodes were monitored using an OPTOTRAK (Northern Digital Inc., Waterloo, Ontario) motion analysis system (spatial resolution 0.01 mm). The data collected from the bridge of the nose served as a reference marker for the other two landmarks. The OPTOTRAK camera unit was placed above the computer monitor that was used to display the syllable /ba/. The 3D positions of the upper and lower lips were sampled at 500 Hz.
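For illustration, zero-phase ("dual-pass") low-pass filtering of the kind applied to such kinematic position signals can be sketched in Python with SciPy. This is a hedged reconstruction for exposition only, not the original analysis code; the function and variable names are my own, while the 500 Hz sampling rate and 10 Hz cutoff come from the text.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 500.0     # OPTOTRAK sampling rate in Hz (from the text)
CUTOFF = 10.0  # low-pass cutoff in Hz (from the text)

def lowpass_dual_pass(signal, fs=FS, cutoff=CUTOFF, order=2):
    """Second-order Butterworth low-pass applied forward and backward
    (filtfilt), giving zero phase lag at the cost of doubled effective order."""
    b, a = butter(order, cutoff / (fs / 2.0))  # cutoff normalized to Nyquist
    return filtfilt(b, a, signal)

# Example: smooth a synthetic noisy lower-lip displacement trace.
rng = np.random.default_rng(0)
t = np.arange(0, 1.0, 1.0 / FS)
raw = np.sin(2 * np.pi * 2 * t) + 0.1 * rng.standard_normal(t.size)
smooth = lowpass_dual_pass(raw)
```

The dual-pass step matters for kinematic latency measures: a single forward pass would shift every marker (onsets, peaks, troughs) later by the filter's group delay, whereas the forward-backward pass leaves event timing unchanged.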
Raw data from the OPTOTRAK were converted into 3D coordinates and digitally filtered using a second-order dual-pass Butterworth filter with a low-pass cutoff frequency of 10 Hz.

Figure 2.2: Sternocleidomastoid (SCM) and orbicularis oris muscles for the attachment of surface EMG electrodes. Note the SCM and orbicularis oris figures were adapted and modified from Gray's Anatomy (20th ed.), Figures 385 (http://www.bartleby.com/107/illus385.html) and 381 (http://www.bartleby.com/107/illus381.html), respectively.

Bipolar surface electromyography (EMG) electrodes (Therapeutics Unlimited Inc., Iowa City, IA) were attached to four different locations: the skin above the upper vermilion border (labeled "upper lip"), the skin below the lower vermilion border (labeled "lower lip"), and the left and right SCM muscles. The lip EMG electrodes were placed centered between the midline and the right corner of the mouth, parallel to the line of force of the muscles. A ground electrode was placed on the right ulnar styloid process. A wired lapel microphone was pinned onto the collar of the participant in order to record the participant's responses. Acoustic data were collected by the wired lapel microphone through a preamp (USBPre Microphone Interface for Computer Audio, Sound Devices, LLC) before analyses.

Figure 2.3: Experimental apparatus and set-up.

Figure 2.2 highlights the SCM and lip muscles to which the surface EMG electrodes were attached; Figure 2.3 illustrates the participants' set-up for the current experiment. A customized LabVIEW® computer program controlled the stimulus presentation and the collection of EMG and acoustic data at a rate of 4 kHz (National Instruments, PC-MIO-16E-1). Data collection began 500 ms before the presentation of the stimulus and was terminated 2500 ms later.

2.2.4 Interpretation of EMG and lip displacement

In the current design, the surface EMG electrodes were placed on the skin above and below the vermilion border.
The electrodes were positioned to capture activity from the orbicularis oris superior (OOS) and inferior (OOI), respectively. It is acknowledged that the surface EMG may also pick up some activity from other surrounding muscles, such as levator labii superioris, depressor labii inferioris, or depressor anguli oris. It is assumed in this study that the upper lip electrode indicates by and large EMG activity from OOS and that the lower lip electrode shows primarily EMG activity from OOI.

Averaged rectified raw EMG traces of all baseline startle trials are displayed in Figure 2.4. EMG responses were observed in the left and right SCM as well as the upper and lower lips at approximately the same time (upper lip Mean = 59 ms, SD = 9 ms; lower lip Mean = 54 ms, SD = 13 ms; left SCM Mean = 60 ms, SD = 16 ms; right SCM Mean = 62 ms, SD = 17 ms); all were considered startle indicators in the following analyses. While it appears that all four muscles can be considered startle indicators, we chose to use the SCM as our primary indicator for consistency with previous work (Carlsen et al., 2011). However, the involvement of the prime movers in the reflexive startle response did not allow for determination of voluntary EMG onset for vocalization during startle trials, and thus our dependent measures included kinematic markers and acoustic burst onset (see below).

Figure 2.4: Average rectified raw EMG traces of baseline trials, including (from top) right SCM, left SCM, upper lip, and lower lip. EMG activity was plotted with respect to the imperative stimulus (the vertical grey line).

Typical profiles of EMG and lip displacement for control trials are displayed in Figure 2.5. In response to an imperative stimulus, the upper and lower lips first compress against each other (note the upward displacement of the lower lip from point A to B, Figure 2.5). This lip compression was evident in 91% of control trials and 87% of startle trials.
Such compression between the lips is anticipated in order to contain the intraoral air pressure associated with the bilabial consonant [b]. As such, only the trials with lip compression were analyzed. Following lip compression, the opening phase of the vowel [a] was realized primarily through the downward movement of the lower lip, with only a minimal contribution from the upward movement of the upper lip (point B through C, as seen in Figure 2.5). The SCM muscles also showed activity during this opening phase, presumably associated with jaw lowering (see Uemura et al., 2008, for similar findings).

Figure 2.5: EMG activity, lip displacement, and response acoustic waveforms of a control exemplar. The EMG measure includes (from top) right SCM, left SCM, upper lip, and lower lip. EMG activity was rectified and plotted with respect to the imperative stimulus (the vertical grey line). Lip displacement and response acoustic waveforms are also plotted with respect to the imperative stimulus. Point A marks the beginning of voluntary movement; point B marks the time of the lower lip opening onset; point C marks the lowest position of the lower lip. The intervals across kinematic markers are labeled t1 to t5. See text for details.

2.2.5 Data reduction, dependent measures, and statistical analyses

A total of 36 of the 225 trials (16%) were excluded from the analyses. Reasons for discarding trials included delayed lip movement onsets or voice onsets (defined as 2 SD away from the mean; 25 trials), startle trials in which no detectable SCM startle response or a delayed response (> 120 ms) was observed (10 trials), and trials with poor EMG data in which no obvious onsets could be identified (1 trial).

In each response, the initiation of lip compression was marked as the voluntary movement onset (point A on the lower lip trace in Figure 2.5), which allowed us to measure the RT of the voluntary movement onset (the time between the onset of the IS and the beginning of lip compression).
Differences between the RTs of voluntary movements in control and startle responses were measured and analyzed using a paired Student's t-test. After the lower lip reached its highest point (point B in Figure 2.5), the opening phase of the movement began, here termed the lower lip opening onset. The lower lip continued to move downward during the production of /ba/ until it reached its lowest position (point C in Figure 2.5). Thus, two kinematic events can be identified: lip compression and lip opening. The acoustic burst release time was defined as the latency from the presentation of the IS to the onset of the acoustic burst of the bilabial consonant [b] (see the acoustic signal in Figure 2.5). To measure the time of the initial acoustic burst release, the acoustic signals were first de-emphasized by applying a filter at a level of 10 Hz using the PRAAT software (Boersma and Weenink, 2009). A narrow-band spectrogram (bandwidth of 43 Hz; window length of 30 ms) was displayed in PRAAT to mark the acoustic burst release. Syllable durations were also measured, marking the interval from the acoustic burst release to the end of periodicity of the vowel [a].

In order to examine whether the kinematic events in control responses differ from those in startle responses, we calculated the relative timing across kinematic markers A, B, and C by measuring the time between these markers, as well as the relative timing between the kinematic markers and the acoustic burst onset (see Figure 2.5). The time between voluntary movement onset and lower lip opening onset is shown as t1 (duration of lip compression), whereas t2 marks the duration of the opening movement from the lower lip opening onset to the lower lip's lowest position. The time frame for both of these events is t3 (the time between points A and C in Figure 2.5), and the relative timing of each kinematic event with respect to the entire time course was calculated.
To examine the relative timing between the kinematic events and the acoustic signal, we calculated the time from movement onset to acoustic burst release (t4) and from lower lip opening to acoustic burst release (t5). The ratios t1/t3, t2/t3, t4/t3, and t5/t3 for each trial (both startle and control) were calculated, with the ratio data transformed by trial using the arcsine square root transformation (used due to correlations between markers; McDonald, 2009). The transformed ratios were then aggregated by participant and analyzed via a paired Student's t-test.

The peak-to-peak displacement of the lower lip was calculated as the vertical distance from the highest position (Figure 2.5, point B) to the lowest position (Figure 2.5, point C). The difference in peak-to-peak displacement between control and startle trials was analyzed using a paired Student's t-test. Acoustic burst release times and syllable durations were both similarly analyzed using paired Student's t-tests.

In order to confirm that the target syllable /ba/ was successfully produced, the acoustics of the produced syllables in control and startle trials were analyzed. Acoustic formants F1 and F2 (the two lowest resonant frequencies of the vocal tract) are standard indicators of overall vocal tract shape for speech sounds, encoding information about articulator positions during vowels (Fant, 1960; Peterson and Barney, 1952) as well as about consonants and CV transitions (Delattre et al., 1955). Frequency values for F1 and F2 were extracted throughout the entire duration of the produced syllables using the LPC formant tracker in PRAAT and normalized to their respective z-scores within each participant. Data were analyzed across normalized durations.
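The arcsine square-root transformation and paired comparison applied to the timing ratios above can be sketched as follows. This is an illustrative reconstruction in Python with SciPy; the per-participant values are invented for the example and are not the study's data or code.

```python
import numpy as np
from scipy import stats

def arcsine_sqrt(p):
    """Arcsine square-root transform for proportion data in [0, 1]."""
    return np.arcsin(np.sqrt(np.asarray(p, dtype=float)))

# Hypothetical per-participant t1/t3 ratios (proportion of the movement
# spent in lip compression), aggregated by participant as in the text.
control = np.array([0.31, 0.28, 0.33, 0.29, 0.30, 0.27, 0.32, 0.31, 0.29])
startle = np.array([0.30, 0.29, 0.31, 0.28, 0.30, 0.28, 0.31, 0.30, 0.28])

# Paired Student's t-test on the transformed ratios (one pair per participant).
t_stat, p_val = stats.ttest_rel(arcsine_sqrt(control), arcsine_sqrt(startle))
```

Note that a raw ratio of about 0.30 transforms to arcsin(√0.30) ≈ 0.58 rad, which matches the scale on which the transformed t1/t3 values are reported in Table 2.1.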
For statistical comparison, formant data were submitted to a smoothing spline analysis of variance (SS ANOVA; see Davidson, 2006; Derrick and Schultz, 2013).

2.3 Results

2.3.1 Startle indicators

Average EMG traces of every trial in each condition for the four muscles of interest (left SCM, right SCM, upper lip, and lower lip) are displayed in Figure 2.6. When EMG traces are temporally aligned to the stimulus onset (Figure 2.6), all of the muscles involved in the voluntary action of saying /ba/ also show startle responses. These startle indicators are quite invariant and display typical reflexive patterns (as seen in Carlsen et al., 2009b; Maslovat et al., 2009, 2011), followed by more diffuse EMG activity associated with the subsequent voluntary action. Figure 2.6 also shows that the onset latencies of the startle EMG responses in the /ba/ trials were comparable with those in the baseline trials (Figure 2.4).

As seen in Figure 2.6, the more diffuse EMG activity for the voluntary movement was due to higher variability in the onset of the voluntary action. Normalizing data to the lower lip opening onset allows us to better examine the muscle activity associated with the voluntary action and to clearly compare the difference between control and startle trials. When the EMG traces are normalized to the onset of the lower lip opening movement (i.e., temporally aligned to the lower lip opening onset, as marked by the vertical grey line in Figure 2.7), the EMG trace of the voluntary action is distinct (variability of the time between the onset of the opening movement and the EMG is low). As seen in Figure 2.7, in control trials both the upper and lower lips were engaged in the lip compression prior to the opening, whereas only the lower lip showed EMG activity during the voluntary opening movement. Similar patterns were also observed in startle trials. Notably, stronger EMG activity in response to the SAS was observed in the lip muscles as well as in the SCM muscles.
The SCM muscles showed both startle responses (activity before zero, marked by the vertical grey line in Figure 2.7) and EMG activity for the intended movement (activity after zero, marked by the vertical grey line in Figure 2.7).

Figure 2.6: Mean EMG activity and kinematic displacement of control (grey) and startle (black) trials in the vocalization condition with respect to the imperative stimulus. The top four channels represent the mean EMG across four muscles (from top to bottom): left SCM, right SCM, upper lip, and lower lip. The bottom two channels depict the mean upper lip and lower lip displacement trajectories, respectively.

Figure 2.7: Mean EMG activity and kinematic displacement of control (grey) and startle (black) trials in the vocalization condition with respect to the lower lip opening onset. The top four channels represent the mean EMG across four muscles (from top to bottom): left SCM, right SCM, upper lip, and lower lip. The bottom two channels depict the mean upper lip and lower lip displacement trajectories, respectively.

2.3.2 The StartReact effect

A summary of the results for all measures, including means, standard deviations, and p-values, is provided in Table 2.1.

Table 2.1: Experimental results of Exp. 2, showing means and standard deviations (in brackets)

Measure                                       Control       Startle       p-value   Significant
RT of voluntary movement onset (ms)           116.33 (32)   75.46 (30)    0.0036    *
RT of acoustic burst release time (ms)        268.33 (73)   203.64 (43)   0.018     *
Lower lip peak-to-peak displacement (mm)      18.5 (5.4)    21.17 (6.2)   0.007     *
t1/t3 (transformed arcsine square root)       0.58 (0.08)   0.57 (0.06)   0.68      –
t2/t3 (transformed arcsine square root)       0.99 (0.08)   1.00 (0.06)   0.68      –
t4/t3 (transformed arcsine square root)       0.86 (0.08)   0.85 (0.06)   0.37      –
t5/t3 (transformed arcsine square root)       0.55 (0.06)   0.55 (0.06)   0.89      –
Syllable duration (ms)                        176.29 (62)   168.42 (51)   0.59      –

It was predicted that a SAS would induce a faster onset of movements of a prepared syllable. Our results support this prediction. Shorter reaction times of voluntary movement onsets (time from stimulus presentation to point A in Figure 2.5) were observed for startle trials (Mean = 75 ms, SD = 30 ms) than for control trials (Mean = 116 ms, SD = 32 ms; t(8) = 4.07, p < 0.01). The latency of the acoustic burst release (from stimulus presentation to point C in Figure 2.5) was also accelerated in startle trials (Mean = 204 ms, SD = 43 ms) compared to control trials (Mean = 268 ms, SD = 73 ms; t(8) = 2.97, p = 0.018). In addition to shorter latencies of the kinematic markers, the SAS triggered increased muscle activity (see Figures 2.6 and 2.7), yielding a significantly larger lower lip peak-to-peak displacement in startle trials (Mean = 21.2 mm) than in control trials (Mean = 18.5 mm; t(8) = -3.83, p < 0.01).

However, the relative timing relationship of kinematic and acoustic events remained unaffected.
The time between voluntary movement onset and lower lip opening onset (t1 in Figure 2.5) for control and startle responses was 30% and 29%, respectively, of the latency from voluntary movement onset to the time when the lower lip reached its lowest point (t3 in Figure 2.5). Secondly, the latencies between lower lip opening onset and the time of the lower lip's lowest point (t2 in Figure 2.5) for control and startle trials were 70% and 71%, respectively, of the latency from voluntary movement onset to the time of the lower lip's lowest point (t3 in Figure 2.5). Thirdly, the latencies from the voluntary onset to the acoustic burst release (t4 in Figure 2.5) for control and startle trials were 58% and 56%, respectively, of t3. Lastly, the latencies between lower lip opening onset and acoustic burst release (t5 in Figure 2.5) for control and startle trials were 27% and 26%, respectively, of t3. All these ratios were transformed using the arcsine square root transformation; the results are summarized in Table 2.1. No statistical difference was found for the time course of these kinematic events between control and startle responses, suggesting that the relative timing of the kinematic markers was not compressed during the startle trials.

The SS ANOVA results for the formants are displayed in Figure 2.8, including the fit predicted by the SS ANOVA model and shading of the 95% confidence intervals. For F1 and F2, the frequency profiles of control and startle responses were within the 95% confidence interval (as indicated by overlapping confidence interval bands in Figure 2.8).
No difference in syllable duration was observed between control and startle trials (t(8) = -0.56, p = 0.59). In addition to shorter latencies of the kinematic markers, the SAS triggered increased muscle activity, as indicated by EMG, yielding a significantly increased range of lower lip peak-to-peak displacement relative to control trials (t(8) = -3.83, p < 0.01).

Figure 2.8: SS ANOVA comparison of formant frequencies F1 and F2 over syllable /ba/ durations. The black line denotes the predicted fit for control responses and the white line denotes the startle responses. The dark grey and light grey bands surrounding the predicted fits represent a 95% confidence interval. Any white space between the transition lines for each condition represents a statistically significant difference between the given measures.

2.4 Discussion

Beyond limb movements performed in response to an auditory stimulus, the current experiment considers syllable production in speech as another movement task known to require cortical involvement. In syllable production, primary motor cortex and Broca's area (i.e., left inferior frontal gyrus) are associated with linguistic processing, including syllabification and phonetic encoding (Hickok and Poeppel, 2000, 2004, 2007; Indefrey and Levelt, 2004; Papoutsi et al., 2009; Tourville and Guenther, 2011). We predicted that prepared syllables would be subject to rapid release by a SAS. The current experiment found that both the lower lip voluntary movement and the acoustic burst were initiated earlier in startle trials, while the acoustics of the produced syllables remained intact.
These results support our hypothesis that speech is subject to the StartReact effect despite its heavy cortical involvement.

It is conceivable that only the early part of the response (e.g., points A and B in Figure 2.5) would be included in the preplan, such that a SAS would only trigger these early parts without accelerating the rest of the response (i.e., the mouth opening from point B to point C). However, we found that the response triggered by the imperative stimulus was executed as a coherent whole, as evidenced by a fixed timing relationship across all the kinematic markers. Likewise, the acoustic formant profiles of SAS-induced syllables (indicating the shape and position of the speech articulators) matched those of the voluntary responses (Figure 2.8). Thus, when a preplanned syllable is triggered by a SAS, both the temporal and acoustic properties of the syllable remain intact, indicating that the SAS effects rapid release of the entire prepared syllable.

It is noteworthy that the StartReact pathways largely overlap with the pathways for speech production. As summarized in Maslovat et al. (2011) and Carlsen et al. (2012), we propose that the StartReact response is mediated via an ascending thalamo-cortical pathway, generated by activation from the reticular formation exerting on the thalamus. Increased activation in the thalamus provides input to primary motor cortex to initiate the cortically prepared movement via a descending corticospinal pathway. Similarly, speech production also relies on thalamo-cortical circuits and a descending corticospinal pathway. Receiving input from the cerebellum, the thalamus projects to primary motor cortex and Broca's area.
The commands are mediated via the putamen and reticular formation and sent down to the phonatory motoneurones in the spine (Guenther et al., 2006; Iwata et al., 1996; Jürgens, 2002). Triggering by a SAS involves the initiation of the cortically prepared and stored syllable movements via the same StartReact pathways used for upper limb movements. Shorter reaction times in StartReact responses are accounted for by increased neural activation reaching the initiation threshold faster (see Carlsen et al., 2012, for details).

In SAS-induced responses, the voluntary movements for the prepared syllable were accelerated. In addition to the StartReact effect, we also found startle reflex activity in the perioral region. As both the startle reflex and the accelerated StartReact responses occur in the perioral region, it is a potential confound whether there are two events (i.e., a startle reflex followed by a voluntary movement) or only one responsive activity in the perioral muscles. It is no surprise to observe a startle reflex in the lip muscles, as similar reflex activity in the lips and orofacial muscles has been reported in other startle studies (Brown et al., 1991; Valls-Solé et al., 2008) and lip tapping studies (Bratzlavsky, 1979; McClean, 1991). As observed in Figure 2.6, the earliest EMG activity in SAS-induced responses is associated with the startle reflex, ranging from 40 ms to 90 ms. The startle reflex activity was present in all muscles measured, but was more robust in the two SCM muscles and the lower lip. Note that this reflex activity is distinct from the subsequent activity of the lower lip, which corresponds with the prepared but accelerated speech movement. Using TMS and SAS, Stevenson et al. (2014) showed that the startle reflex is dissociated from the StartReact responses. The startle reflex is modulated via a descending pathway from the pontine reticular formation to the nRPC (Carlsen et al., 2012; Yeomans and Frankland, 1995).
When a SAS is presented, activation is projected to cranial nerve VII (the facial nerve), by which the perioral muscles, including the orbicularis oris and the levator and depressor labii muscles, are innervated. This innervation accounts for the observed startle reflex in the perioral muscles, whereas the accelerated voluntary responses are mediated by the dissociated pathways for StartReact responses. Therefore, SAS-induced EMG activity reflects two separate events associated with distinct neural pathways.

In SAS-elicited responses, the SCM muscles have been considered a reliable startle indicator. The current experiment measured not only the SCM but also the perioral muscles in response to the SAS. The results revealed that, in addition to the SCM, the lower lip muscle also consistently exhibited startle reflex activity prior to the activity for the voluntary movement (Figure 2.6). When data are normalized to the voluntary opening onset (Figure 2.7), two EMG events are observed in the lip muscles: the first is associated with lip compression, whereas the second is directly related to the opening movement. While both lips are involved in lip compression, only the lower lip exhibited activity for the opening movement. Whether these observations are speech-specific or generic to any oral movement will require more experiments and empirical support.

In summary, this experiment is the first study applying the startle paradigm to prepared speech. In response to a SAS, the prepared syllable can be rapidly released with the temporal and acoustic properties of the response syllable preserved intact. In terms of associated neural pathways, the results suggest that the early startle reflex observed in the lips is mediated via reflex pathways dissociated from the StartReact pathways, and that upper limb movements and syllable production share similar neural pathways for StartReact responses.
The rapid and accurate release of the prepared syllable by a SAS confirms that this startle paradigm can serve as a suitable design to further investigate other details of speech plans. A more significant implication of the startle paradigm design is that the StartReact effect is observed in tasks that are dependent on heavy cortical processing, such as syllable production.

Chapter 3

Startling spoken, mouthed, and non-speech movements

3.1 Introduction

Chapter 2 showed that a speech plan as long as a CV sequence (i.e., /ba/) can be prepared in advance and subject to rapid execution when elicited by a SAS. While only limited feedback may be available at such a short latency, the prepared syllable is performed as intended without any interruption to the synergistic system. While Chapter 2 has demonstrated that the StartReact effect is observed in prepared syllable production, it remains unclear whether these SAS-induced effects are speech-specific or generic to any lip movements. Minimally, if detailed information is specified for different speech tasks, these details ought to be preserved in SAS-induced responses. As such, we can apply the startle paradigm to different speech tasks in order to examine the pre-specified details and possible induced effects.

A number of studies have suggested that speech plans may encode multi-dimensional details, such as aerodynamics (e.g., Cho et al., 2002; Gick et al., 2012; Murphy et al., 1997), muscular structure and coordination (e.g., Gracco and Löfqvist, 1994; Löfqvist and Gracco, 1997), and sensitivity to somatosensory feedback (e.g., Larson et al., 2008; Tremblay et al., 2003). It should be noted that different speech tasks may construct speech plans with different specifications across these dimensions. For example, Murphy et al.
(1997) compared four different speech-related tasks: vocalized speech, mouthed speech (i.e., speech with articulatory movements but no audible output), unarticulated speech (i.e., speech with vocalization but no corresponding articulatory movements), and internal speech (i.e., silent speech without any articulation or vocalization). Their results show that the breathing patterns for vocalized speech and unarticulated speech are more similar, whereas the breathing pattern for mouthed speech resembles the pattern for internal speech. In addition to breathing patterns, spoken speech and mouthed speech also differ in EMG activity and corresponding kinematics. Compared with spoken speech, mouthed speech has shorter word durations, reduced EMG amplitude, and overall hypoarticulation in lip movements (Crevier-Buchman et al., 2011; Janke et al., 2010; Wand et al., 2009). While different multi-dimensional information is included in the preplans for these speech tasks, both spoken and mouthed speech tasks serve similar functions (i.e., both are speech-related) and both have a similar somatosensory basis during speech production. Tremblay et al. (2003) examined somatosensory feedback during speech production by applying mechanical perturbation to the jaw. In their study, the participant's jaw was perturbed during the production of vocalized speech, mouthed speech, and a non-speech movement. After training, adaptation to compensate for the perturbation was observed in vocalized and mouthed speech. For both speech-related tasks, an after-effect was reported when the perturbation was removed from trials. However, the training effect and after-effect were not found in the non-speech movement task. Their results suggest that the somatosensory basis for speech-related tasks may be dissociated from the basis for generic non-speech movements.

In Chapter 2, participants were instructed to produce the sequence /ba/ from a mouth-closed condition.
Intuitively, the initial component of the syllable production should be the release of the bilabial stop. However, as revealed by the results, the voluntary onset action was the two lips compressing against each other, in particular the lower lip pushing upward against the upper lip. This lip compression occurred in 87% of startle /ba/ responses. It is important to note here that this compression is not a side effect of the StartReact itself; the same compression was also observed in 91% of control responses, suggesting that the observed lip compression is most likely a preparatory movement, part of the motor plan in preparation for the upcoming speech movements. We have proposed that the observed lip compression is associated with the intraoral pressure required for the production of /b/; Chiu and Gick (2013) used a computational model to simulate the production of bilabial stops, reporting results supporting the view that anticipatory lip compression is required when intraoral pressure is implemented during production. Based on this previous work, we can predict that reduced lip compression should be observed in speech tasks where no or limited air pressure is required, as in mouthed instances of the syllable /ba/. Alternatively, there remains the possibility that this lip compression could simply be a precursor to any opening movement (including a non-speech movement), in which case such compression should be observed to occur as frequently in other speech and non-speech tasks. A more probable prediction is that, since no intraoral pressure is required, much reduced lip compression would take place in a non-speech lip opening movement, compared with the spoken and mouthed speech tasks.
The complexity of a prepared speech response does not affect reaction time, since the internal structure of the response (including duration) is specified before the presentation of the stimulus (cf. Klapp, 2003). Therefore, in terms of response latency, a reasonable prediction would be that there should be no difference when triggered by an auditory stimulus, regardless of the target response; that is, when elicited by an auditory stimulus (i.e., a control "go" signal), the latency of voluntary movement for spoken speech should not differ from that for mouthed speech or non-speech responses. Similarly, when a SAS is presented as an imperative GO signal, prepared mouthed and spoken speech should be released at comparable latencies.

This chapter examines how a SAS can elicit prepared mouthed-speech and non-speech movements. Two predictions can be made. First, reduced and less frequent lip compression would be expected for the mouthed speech and non-speech oral movements. Second, a SAS may induce accelerated release of prepared mouthed speech and non-speech movements, whereas no difference in reaction time between conditions is expected. That is, the latency of the voluntary movement onset in control responses for mouthed speech and non-speech movement should be comparable to that for Spoken speech, and the latencies of the voluntary movement onset in startle responses, though shortened, should be comparable across all three conditions.

3.2 Methods

The experiment presented here was conducted to test the hypothesis that the StartReact effect and the associated lip compression observed in spoken syllables can be observed in speech-like (i.e., Mouthed speech) and Non-speech (i.e., mouth opening movement) motor behaviours.

3.2.1 Participants

Data were collected and analyzed from the same nine participants reported in Chapter 2 (3 male and 6 female; Mean = 23 years, SD = 4.2 years). Participants were all native speakers of North American English.
Prior to the testing, participants signed an informed consent form and were naïve to the hypothesis under investigation. The experiment was conducted following the ethical guidelines established by the University of British Columbia.

3.2.2 Apparatus, task, and procedures

The same apparatus from Chapter 2 was used. Two tasks were designed in blocks. Throughout the testing session, participants were asked to start with their mouths closed in a relaxed posture without compressing their lips during the preparation. The first block (hereafter the Non-speech condition) required participants to respond to an acoustic stimulus by opening their mouths. In the second block, participants were instructed to respond to the acoustic stimulus by mouthing a silent articulation of /ba/ (hereafter the Mouthed condition), in which no pulmonic air flow (and hence no phonation) was produced.

The testing procedures were identical to those reported in Chapter 2. Participants performed a single testing session of approximately 20 minutes. Each testing block consisted of twenty control trials and five startle trials. The five startle trials were presented pseudo-randomly such that the first trial was never a startle trial, nor were there two consecutive startle trials.

3.2.3 Recording equipment

The same recording equipment and set-up from Chapter 2 were used.

3.2.4 Data reduction, dependent measures, and analyses

Two conditions and twenty-five trials in each condition for nine participants yielded 450 trials in total. A total of 40 of the 450 trials were excluded from analysis for the following reasons: anticipation (4 trials; 0.89%), data loss (1 trial; 0.2%), false startle (6 trials; 1.3%), hesitation (5 trials; 1.1%), startle indicator with late RT (> 120 ms; 5 trials; 1.1%), and outlier filtering by subject (19 trials; 4.2%), where an “outlier” was defined as a voluntary onset 2 SD or more away from the mean.
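The by-subject outlier filter described above can be sketched as follows (an illustrative sketch only; the function name and sample latencies are hypothetical, and the use of a population rather than sample SD is an assumption):

```python
import statistics

def filter_outliers(onsets, k=2.0):
    """Exclude trials whose voluntary onset lies k or more SDs
    from the subject's mean onset latency (here k = 2)."""
    mean = statistics.mean(onsets)
    sd = statistics.pstdev(onsets)  # population SD; an assumption
    kept = [t for t in onsets if abs(t - mean) < k * sd]
    excluded = [t for t in onsets if abs(t - mean) >= k * sd]
    return kept, excluded

# Hypothetical voluntary onset latencies (ms) for one participant
onsets = [110, 118, 122, 115, 240, 119, 113]
kept, excluded = filter_outliers(onsets)  # the 240 ms trial is excluded
```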
The remaining 410 trials were examined and further analyzed.

Following the analysis procedure used in Chapter 2, EMG activity collected from the SCM and lip muscles was aggregated and analyzed. Mean EMG activity in relation to the imperative stimulus is shown in the results. Lip displacement trajectories in the Mouthed and Non-speech conditions are also included. It was expected that the SCM and lower lip muscles would show startle reflex activity in response to a SAS. Meanwhile, amplified EMG levels and accelerated lip displacements were also anticipated in SAS-induced responses.

The acceleration of lip movement was determined on the basis of three kinematic markers. As illustrated in Figure 3.1, the latencies of three kinematic markers were measured: (1) the voluntary movement onset (point A), defined as the initiation of lip compression, (2) the lower lip opening onset (point B), defined as the highest point of the lower lip, and (3) the lower lip displacement trough (point C), defined as the lowest position of the lower lip. Measurement analyses were performed using a 3 Condition × 2 Stimulus repeated measures ANOVA. Greenhouse-Geisser correction was used for any violation of sphericity. Partial eta-squared (ηp²) values were calculated as a measure of effect size.

Relative timing between the kinematic markers A, B, and C was calculated by measuring the time between these markers. The time between voluntary movement onset and lower lip opening onset is shown as t1 (duration of lip compression), while t2 marks the duration of the opening movement from the lower lip opening onset to the lower lip’s lowest position. The time frame for both of these events is t3 (time between points A and C in Figure 3.1), and the relative timing of each kinematic event with respect to the entire time course was calculated.
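The relative-timing calculation just described can be sketched as follows (an illustrative sketch; the function names are hypothetical and the example latencies are invented round numbers, not study data):

```python
import math

def relative_timing(a_ms, b_ms, c_ms):
    """Given latencies of voluntary onset (A), lower lip opening
    onset (B), and displacement trough (C), return t1/t3 and t2/t3."""
    t1 = b_ms - a_ms  # duration of lip compression
    t2 = c_ms - b_ms  # duration of the opening movement
    t3 = c_ms - a_ms  # whole time course, point A to point C
    return t1 / t3, t2 / t3

def arcsine_sqrt(ratio):
    """Arcsine square root transformation for proportion data."""
    return math.asin(math.sqrt(ratio))

# Hypothetical marker latencies (ms): A = 100, B = 180, C = 420
r1, r2 = relative_timing(100, 180, 420)  # 0.25 and 0.75
z1, z2 = arcsine_sqrt(r1), arcsine_sqrt(r2)
```

Note that because t1 + t2 = t3, the two raw ratios sum to 1 by construction, so they are not independent of one another.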
Figure 3.1: Lip displacement and response acoustic waveforms are also plotted with respect to the imperative stimulus. Point A marks the beginning of voluntary movement; point B marks the time to the lower lip opening onset; point C marks the lowest position of the lower lip. The intervals across these markers were labeled as t1, t2, and t3. See text for details.

The ratios t1/t3 and t2/t3 for each trial (both startle and control) were calculated, with the ratio data transformed by trial using the arcsine square root transformation (this was used due to correlations between markers; McDonald, 2009). The transformed ratios were then aggregated by participant and analyzed using Student’s t-tests.

The trials were also analyzed for lip compression. In each experimental trial, lower lip movements were carefully examined in terms of their displacement trajectories. Following Chapter 2, the initiation of the lower lip movement was recorded as the voluntary movement onset (i.e., point A in Figure 3.1). Since the lips began in a mouth-closed position, after the initiation of the voluntary movement, the upward movement observed in the lower lip is understood as lip compression. Conversely, direct downward movement of the lower lip at the onset of voluntary movement indicates an absence of lip compression. The number of trials with and without lip compression was calculated. Compression displacement (i.e., the vertical height between voluntary movement onset and lower lip opening onset) was also measured and analyzed.

3.3 Results

3.3.1 Lip compression

As discussed in Chapter 2, lip compression was observed at the initiation of voluntary movement in Spoken responses. This compression occurred in 91% of control responses and 87% of startle responses.
The frequency of lip compression in the current design was also calculated, with the raw count of trials showing lip compression presented in Table 3.1. As can be seen, lip compression in Mouthed and Non-speech responses did not occur as frequently as in Spoken responses. The proportion of lip compression across control and startle was highest for Spoken, followed by Mouthed, and then Non-speech. For Spoken responses, 91% of trials showed lip compression, whereas for Mouthed responses, the occurrence of lip compression was 56%. Only 22% of Non-speech responses exhibited lip compression. Note that a decrease in the occurrence of lip compression from control to startle trials was also observed in Mouthed responses, whereas an increase was observed in Non-speech responses.

One may wonder whether a training effect may be responsible for the reduced lip compression frequency in the Mouthed and Non-speech conditions. Additional analyses showed no effect of training. Please refer to Appendix A for further details.

Table 3.1: Numbers of trials with lip compression across conditions. Percentages were calculated from the trials included in analyses.

            Spoken       Mouthed      Non-speech
Control     145 (91%)    104 (63%)    23 (14%)
Startle     26 (87%)     10 (24%)     22 (55%)
TOTAL       171 (90%)    114 (56%)    45 (22%)

Compression displacement (as measured by the vertical height between lower lip voluntary onset and lower lip opening onset) across conditions is summarized in Table 3.2. Only trials that exhibited lip compression were included in the analyses here. In order to perform a 2 × 3 ANOVA, missing values were filled in. Since there was no training effect, missing data were treated as random and substituted with the mean across participants in the same condition.

As the table shows, the largest compression displacement was observed in the Spoken condition, followed by the Mouthed condition, and then the Non-speech condition.
Across all three conditions, more compression displacement was observed for startle responses than for control responses. Results revealed a main effect of Condition (F(2, 16) = 12.46, p < .01, ηp² = 0.61). No effect of Stimulus (p = 0.06) or Condition × Stimulus interaction (p = 0.47) was noted. Post-hoc analyses found significant differences between Spoken and Non-speech (p = 0.027) and between Spoken and Mouthed (p = .006), but not between Mouthed and Non-speech (p = 1).

Table 3.2: Mean compression displacement across conditions (mm); standard deviations in parentheses.

            Spoken        Mouthed       Non-speech
Control     0.96 (0.73)   0.30 (0.41)   0.21 (0.12)
Startle     1.38 (0.95)   0.48 (0.42)   0.48 (0.50)

To further examine muscle activity and lip displacement during lip compression, EMG data and lip displacement trajectories were normalized to the lower lip voluntary movement onset. This marker was chosen because the lower lip demonstrated more robust displacement than the upper lip. In particular, the closed-mouth position of the lips at onset made it possible for the upper lip to show no obvious displacement at the instant of lip compression. Figure 3.2 illustrates upper lip and lower lip mean EMG activity and lip displacement trajectories in the Spoken condition. As the figure shows, in control responses (grey lines), both the upper lip and lower lip produced EMG activity before the lower lip voluntary movement onset. The EMG activity onset preceded the voluntary movement onset due to the motor time required for the muscle to initiate the displacement. EMG activity elicited under a SAS was of larger amplitude than that elicited under a control stimulus. Compared to the lower lip, the upper lip did not exhibit obvious displacement until after the lower lip had started its opening movement. That is, while the upper lip was actively engaged during lip compression (as evidenced by EMG activity), no displacement was observed for the upper lip.
This absence might be due either to any downward movement of the upper lip being canceled out by the upward movement from the lower lip, or to deformation of the lip induced by the compression.

Figure 3.2: Mean upper lip (left) and lower lip (right) EMG activity and displacement trajectories in the Spoken condition across participants. Data in all channels were normalized to the lower lip opening onset (vertical line). The distance of lip displacement between control and startle responses does not reflect the absolute distance between control and startle responses.

Figure 3.3 depicts lip EMG activity and displacement trajectories in the Mouthed condition. The averaged EMG activity for both lips across control responses in the Mouthed condition was of smaller amplitude than the same measure averaged across Spoken control responses. This reduced EMG activity may be due to fewer trials exhibiting lip compression in the Mouthed condition. As observed with Spoken responses, the upper lip did not generate obvious displacement at the instant of lower lip voluntary movement onset. Displacement was only observed when the lower lip started the opening movement and dragged the upper lip with it.

Upper lip and lower lip mean EMG activity and displacement trajectories from the Non-speech condition are shown in Figure 3.4. In Non-speech control responses, the lower lip displayed a clear EMG event, whereas the upper lip did not show much activity during the response. As revealed in the control responses (Figure 3.4, left), only limited upper-lip EMG activity was observed, appearing to occur later than the lower lip voluntary movement onset (i.e., to the right of the grey vertical line). It is likely that the observed upper lip EMG was involved in the lower lip opening movement, rather than voluntary upward compression. SAS-elicited startle reflex activity was more robust in the lower lip than the upper lip.
As observed with both Spoken and Mouthed responses, no obvious displacement from the upper lip took place until the lower lip opening onset.

Figure 3.3: Mean upper lip (left) and lower lip (right) EMG activity and displacement trajectories in the Mouthed condition across participants. Data in all channels were normalized to the lower lip opening onset (vertical line). The distance of lip displacement between control and startle responses does not reflect the absolute distance between control and startle responses.

Figure 3.4: Mean upper lip (left) and lower lip (right) EMG activity and displacement trajectories in the Non-speech condition across participants. Data in all channels were normalized to the lower lip opening onset (vertical line). The distance of lip displacement between control and startle responses does not reflect the absolute distance between control and startle responses.

As revealed by Figures 3.2 and 3.3, for Spoken and Mouthed responses, the voluntary movement involved both lips compressing against each other. Non-speech responses, on the other hand, did not involve time-locked activation of the upper and lower lips (Figure 3.4). For Non-speech control responses, mean latency to the EMG onset of the upper lip was 180.3 ms (SD = 65 ms) and 127.8 ms for the lower lip (SD = 45 ms). While the lower lip was active for the voluntary (upward compressing) movement, the upper lip did not show activity until the lower lip started to move downward for the opening movement (at 157 ms from the stimulus onset; see Table 3.3 in the next section). This EMG pattern suggests a lack of voluntary compression between the upper and lower lips; in this case, displacement from the lower lip is more likely to be anticipatory to the opening movement.
It is possible that this anticipatory movement may serve a different function from voluntary compression in the speech-related responses (i.e., Spoken and Mouthed responses).

3.3.2 Startle indicator

In Chapter 2, we observed distinct startle reflex activity in the two SCM muscles and the lower lip muscle when a speech response was elicited by a SAS. Figure 3.5 aligns mean EMG activity and lip kinematic trajectories from the Spoken (repeated from Figure 2.6), Mouthed, and Non-speech responses in relation to the stimulus onset. As the figure shows, larger EMG amplitudes from the SCM and lips were found in SAS-elicited responses than in control responses. More importantly, for both Mouthed and Non-speech responses, the startle reflex activity was observed in the two SCM muscles and the lower lip muscle, but not in the upper lip muscle. A window ranging from 30 ms to 120 ms after the onset of the imperative stimulus was labeled for all three conditions. This window marks the range for startle reflex activity. As revealed, the observed reflex activity is distinct from a second EMG event that is associated with the voluntary mouth opening. This echoes the findings in Chapter 2.

Figure 3.5: Mean EMG activity and kinematic displacement of control (grey) and startle (black) trials in the Spoken (repeated from Chapter 2), Mouthed, and Non-speech conditions. All data were normalized to the imperative stimulus. The top four channels represent the mean EMG across four muscles (from top to bottom): left SCM, right SCM, upper lip, and lower lip.
Each channel is of the same scale but vertically arranged for visualization. The bottom two channels depict the mean upper lip and lower lip displacement trajectories, respectively. Red boxes mark the window of startle reflex activity, ranging from 30 ms to 120 ms after the onset of the imperative stimulus.

3.3.3 Kinematic markers, reaction times, and timing

A summary of the results for all measures, including means and standard deviations, is presented in Table 3.3. Note that the lower lip opening onset and lower lip displacement trough were not observed in Chapter 2 for the Spoken response. These measures of Spoken responses are restated and analyzed here for comparison.

Chapter 2 reported that RTs for voluntary movement onset were shorter in SAS-induced responses (Mean = 75 ms) than in control responses (Mean = 116 ms). In the present experiment, as the table illustrates, slower RTs were observed in Mouthed and Non-speech responses when elicited by a SAS. Main effects of Condition (F(2, 16) = 4.44, p = 0.049, ηp² = 0.04) and Stimulus (F(1, 8) = 17.87, p < .01, ηp² = 0.32) were found. A Condition × Stimulus interaction was also reported (F(2, 16) = 5.21, p = .02, ηp² = 0.03). Post-hoc analyses using Tukey’s HSD found that for control responses, the reaction time for each task was not significantly different from the others (all p > .05). Similarly, no differences in reaction time for startle responses were reported between tasks (all p > .05). Across conditions, pairwise comparisons (using a Holm-Bonferroni correction; significance level at 0.01667) confirmed that there were no differences between Spoken and Non-speech (p = 0.06) or between Mouthed and Non-speech (p = 0.52). However, a significant difference was noted between Spoken and Mouthed (p = 0.001).
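The Holm-Bonferroni procedure invoked here (whose first, most stringent step-down threshold is 0.05/3 ≈ 0.01667) can be sketched with the three across-condition p-values just reported (an illustrative sketch; the function name is hypothetical):

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Step-down Holm procedure: test p-values in ascending order
    against alpha/m, alpha/(m-1), ..., stopping at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all remaining (larger) p-values are retained
    return reject

# Spoken vs Non-speech, Mouthed vs Non-speech, Spoken vs Mouthed
decisions = holm_bonferroni([0.06, 0.52, 0.001])
# only the third (Spoken vs Mouthed) comparison is rejected:
# [False, False, True]
```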
Between control and startle responses, pairwise comparisons (using a Holm-Bonferroni correction; significance level at 0.01667) confirmed that latencies of startle responses were significantly shorter than latencies of control responses in the Spoken (p = 0.004) and Non-speech (p = 0.002) conditions. However, latencies between control and startle responses were not significantly different in the Mouthed condition (p = 0.02).

Considering the lower lip opening onset, shorter latencies were observed in SAS-induced responses in all three conditions. Main effects of Condition (F(2, 16) = 22.99, p < .01, ηp² = 0.2) and Stimulus (F(1, 8) = 14.78, p < .01, ηp² = 0.26) were found, but no Condition × Stimulus interaction (p = 0.27). Post-hoc analyses using Tukey’s HSD found that for control responses, the reaction time for each task was not significantly different from the others (all p > .05). Similarly, no differences in reaction time for startle responses were reported between tasks (all p > .05). Comparing across conditions, pairwise comparisons (using a Holm-Bonferroni correction; significance level at 0.01667) confirmed that latencies were longer for Spoken than for Mouthed and for Non-speech (both p < .001). No difference was reported between Mouthed and Non-speech responses (p = 0.07).

Similarly, shorter latencies measured from the lower lip displacement trough were observed in SAS-induced responses. Main effects of Condition (F(2, 16) = 9.76, p < .01, ηp² = 0.06) and Stimulus (F(1, 8) = 11.68, p < .01, ηp² = 0.29) were found, but no Condition × Stimulus interaction (p = 0.38). Post-hoc analyses using Tukey’s HSD found that for control responses, the reaction time for each task was not significantly different from the others (all p > .05). Similarly, no differences in reaction time for startle responses were reported between tasks (all p > .05).
Comparing across conditions, pairwise comparisons (using a Holm-Bonferroni correction; significance level at 0.01667) confirmed that latencies were longer for Spoken than for Mouthed (p < .001) and for Non-speech (p = 0.002). No difference was reported between Mouthed and Non-speech responses (p = 0.04).

Table 3.3: Mean reaction times (in ms) to dependent markers across conditions; standard deviations in parentheses. The voluntary movement onsets of Spoken responses from Chapter 2 are included for comparison.

                               Spoken             Mouthed            Non-speech
                               control  startle   control  startle   control  startle
Voluntary movement onset       116 (32)  75 (30)  128 (40)  96 (19)  138 (43)  78 (35)
Lower lip opening onset        198 (60) 142 (44)  157 (51) 107 (35)  143 (39) 101 (24)
Lower lip displacement trough  377 (89) 302 (42)  339 (72) 268 (48)  351 (77) 265 (49)

In Chapter 2, it was reported that while shorter latencies at different kinematic markers were observed in startle responses, the relative timing between these kinematic markers remained unaffected. In the current experiment the results also show no significant difference between control and startle values in Mouthed responses for the arcsine square-root-transformed relative time data between voluntary movement onset and lower lip opening onset (i.e., t1/t3; control Mean = 0.44, startle Mean = 0.47, p = 0.38), or between lower lip opening onset and the lower lip displacement trough (i.e., t2/t3; control Mean = 0.96, startle Mean = 0.94, p = 0.48). Timing across markers was also unaffected in the Non-speech condition.
No significant difference was found between control and startle values for the arcsine square-root-transformed relative time data between voluntary movement onset and lower lip opening onset (i.e., t1/t3; control Mean = 0.42, startle Mean = 0.45, p = 0.58), or between lower lip opening onset and the lower lip displacement trough (i.e., t2/t3; control Mean = 0.99, startle Mean = 0.96, p = 0.55).

The results suggest that the relative timing across these kinematic markers is not altered by SAS. These relative time frames for control and startle trials are shown in Figure 3.6, with the relative time intervals representing untransformed data.

Figure 3.6: Relative timing ratio (non-transformed) of various kinematic time markers for startle and control trials. The whiskers for each bar are standard error bars. The time frame for these events is the time between lower lip voluntary movement onset and lower lip displacement trough (t3). Time intervals include the time between voluntary movement onset and lower lip opening onset (t1) and between lower lip opening onset and the lower lip displacement trough (t2).

3.4 Discussion

The experiment presented in this chapter was designed to test whether multi-dimensional (in this case aerodynamic) details may be revealed by SAS and whether the startle paradigm has a different impact on Mouthed speech versus Non-speech movements. The results supported our two predictions: (1) differentiated lip compression frequency was observed for different speech and non-speech tasks, and (2) SAS was not observed to impact the latency of voluntary movements differently across tasks. Much as we saw with the Spoken responses in Chapter 2, the results of the current study confirm that the timing profile of lip kinematics for Non-speech and Mouthed responses remains unaffected when elicited by a SAS.
These findings provide support for the StartReact effect (i.e., prepared responses are accelerated but unaffected in other measures) and demonstrate that both prepared speech and non-speech oral responses are subject to this effect.

In parallel to the analysis of Spoken responses in Chapter 2, EMG activity from the SCM and lip muscles, lip kinematics, and reaction times at different kinematic markers of the Mouthed and Non-speech responses were examined in this study. As shown in Figure 3.5, for both Mouthed and Non-speech responses, SCM and lower lip EMG activity exhibited startle reflex behaviour when triggered by a SAS. This result echoes the findings in Chapter 2. In addition to initiating startle reflex activity, the SAS also produced an accelerated initiation of the prepared response for both Mouthed and Non-speech movements. Although Spoken responses exhibit more frequent, longer, and larger lip compression than Mouthed and Non-speech responses, all these prepared responses are susceptible to early release by SAS.

The prediction regarding differentiated lip compression for Spoken, Mouthed, and Non-speech movements was supported. While 90% of Spoken responses showed lip compression overall, only 56% of Mouthed responses and 22% of Non-speech responses demonstrated compression (Table 3.1). For Spoken responses, a SAS did not appear to have an impact on the occurrence frequency of lip compression. The frequent lip compression observed in Spoken responses supports the view that such lip movements are part of the plan for a Spoken task. While non-startled Mouthed responses revealed less lip compression than Spoken responses did overall, the percentage of Mouthed responses with lip compression decreased even further in SAS-elicited trials. As summarized in Table 3.1, lip compression was observed least frequently in the Non-speech condition.
Compared with Mouthed responses, the much lower lip compression frequency in Non-speech responses suggests that non-speech lip movement does not require lip compression as part of the preparatory movement. Lip compression may not be directly connected with mouth opening, but may be performed in anticipation of speech behaviours, possibly in anticipation of the need to contain an increase in intraoral pressure associated with an oral stop. It should be noted that EMG activity from the upper lip and lower lip was temporally aligned in Mouthed and Spoken responses, whereas in Non-speech responses, EMG activity from the upper lip was not observed to be temporally aligned with lower lip EMG activity, suggesting a lack of active involvement of the upper lip in Non-speech responses. Meanwhile, lip compression in Non-speech and Mouthed responses is unlikely to be the result of a training effect, since the initiation of lip compression did not increase in frequency towards the end of the block. As such, lip compression appears to be a speech-specific movement, rather than a preparatory kinematic movement for the upcoming opening movement. A more general implication is that details about speech motor control for speech and non-speech movements may be revealed by SAS.

In addition to the lower frequency of lip compression, significantly shorter latencies to the lower lip opening onset and lower lip displacement trough for Mouthed and Non-speech responses also distinguished their lip compression from that of Spoken responses. As a SAS reliably elicits prepared movements, these results suggest that lip compression may not be pre-tuned for Mouthed speech tasks. If lip compression were part of the speech-related motor plan, we should have observed a comparable amount and frequency of compression in both Mouthed and Spoken responses.
The contrast in the frequency of lip compression between these two speech modes can best be accounted for by the hypothesis that different muscular structures are involved in the implementation and anticipation of intraoral pressure for Mouthed and Spoken tasks. The implementation of intraoral pressure as part of a speech motor plan requires further investigation.

Compared with Spoken responses, Mouthed responses exhibited less frequent lip compression (Table 3.1). Meanwhile, it is noted that longer latencies to the voluntary movement onset were observed for Mouthed responses when elicited by a SAS (96 ms vs. 75 ms for Spoken and 78 ms for Non-speech; see Table 3.3). Recall that voluntary movement onset correlates with the initiation of lip compression; this being the case, the long latency of Mouthed responses to the voluntary movement onset may best be accounted for by assuming optionality of lip compression in the prepared program. That is, the choice to incorporate orofacial muscle tensing in a Mouthed response may result in extra online workload and consequently increase the reaction time. Compared with Spoken responses, the performance latency for the Mouthed responses was also longer. As Spoken responses are more “common” than Mouthed responses (speakers perform speech tasks every day and are very experienced with them, whereas mouthing a silent sound is not a commonly performed task), extra inhibitory effects, such as a concerted effort to hold back air flow, may be required for Mouthed responses. This additional inhibition may also lead to increased processing load and longer reaction time for execution.
If these inhibitory efforts are included in the speech plan, their timing in relation to other movements may be pre-specified, in which case no delay in reaction time should be observed. The current results reveal the opposite of this prediction, suggesting that a potential inhibitory effort may be implemented after the initiation of the response, consequently resulting in a longer reaction time latency.

As mentioned in the Introduction (Section 3.1), a finding of different breathing patterns would indicate different aerodynamic structures encoded in speech tasks. While Spoken and Mouthed speech tasks may involve different aerodynamics, this kind of differentiation can also be found in different spoken tasks. For example, Gick et al. (2012) show that /p/ and /m/ require different degrees of muscle tension to resist intraoral pressure, even though both sounds are bilabial stops. In the production of /p/, intraoral pressure is built up during lip closure and the perioral muscles are activated in order to keep the mouth closed and maintain the air pressure within. Compared to /p/, the production of the bilabial nasal /m/, which requires very limited or no aspiration or air pressure, does not require such intensive muscle activity in the perioral areas. Comparing /pa/ with /ma/ would be useful to further test the possibility of the implementation of intraoral pressure in the speech plan, though there could be other unknown kinematic differences between these sounds that would confound such a comparison. The current experiment design did not directly or independently test the implementation of intraoral pressure, or the controlled kinematics of lip compression, by comparing spoken /ba/ and mouthed /ba/. Applying the startle paradigm to /ba/ vs. /ma/ would likely improve our understanding of the role of intraoral pressure in the speech plan.
This design, which requires more careful analysis of lip kinematics, will be exploited in future studies.

As all of the trials in the current experiment involved prepared sequences, we can conclude that the observed latency differences do not reflect differences in planning, but in execution. In Klapp’s (2003) model, the complexity of a prepared response does not affect reaction time, but the sequencing of the elements does. Recently, Maslovat et al. (2014) showed that effects on reaction time may be associated with an inability to prepare the timing of the elements of a prepared response. The current results suggest that there may be other potential mechanisms, such as inhibition of air flow and vocalization, that impact the reaction time of Mouthed startle responses. Further investigation into the timing of inhibitory efforts will call for future research.

Chapter 4

Startling syllables with pre-speech movements

4.1 Introduction

Chapters 2 and 3 examined the elicitation of prepared Spoken syllables, Mouthed syllables, and Non-speech lip movements. When a SAS is used as an imperative stimulus, shorter latencies of voluntary movement onsets are induced. We also observed lip compression in elicited speech responses. The observed lip compression appeared to be speech-specific, as it occurred with significantly greater frequency in Spoken vs. Mouthed or Non-speech responses, suggesting that lip compression may be associated with the need to contain air flow required for Spoken speech and thus be included in the speech plan. As proposed by McClean and Clay (1995), in connected speech, lip compression is necessary in order to guarantee a full closure for the bilabial sound and to retain possible oral pressure within. This chapter tests whether, in a simple RT task, lip compression that is independent of aerodynamic factors is observed in all speech-related tasks and is elicited at a short latency by SAS.
This will be tested by observing the production of both Spoken and Mouthed syllables as well as Non-speech mouth opening movements starting from a mouth-open posture, thus effectively adding a preceding mouth-closing movement to the prepared sequences.

As suggested by the results in Chapter 3, lip compression appears to be speech-specific and is neither guaranteed nor fully realized for other oromotor tasks. It should be noted that lip compression may also be achieved by other forces. Löfqvist and Gracco (1997) considered lip compression (i.e., negative lip aperture in their terms) a virtual target for bilabial stops. Boucher (2008, p. 297) further proposes that such compression implies an overshoot of the vocal-tract space. As the major contrast distinguishing Spoken from Mouthed speech is the presence vs. absence of air flow, not articulatory movements or lip kinematics, it is thus predicted that, when starting from a mouth-open condition, Spoken and Mouthed speech, sharing comparable articulatory movements and lip kinematics, should generate comparable lip compression.

Temporally speaking, from a simplistic point of view, one can consider a mouth-open response as the linear combination of a mouth-closing movement plus a mouth-closed response (cf. Perrier et al., 1996; Sanguineti et al., 1998). That is, producing a /ba/ from a mouth-open position would be considered as the production of a mouth-closing movement followed by a production of /ba/ from a mouth-closed condition.
In a simple RT task, adding this additional pre-speech closure movement should not affect reaction time. It is predicted that prepared Spoken /ba/, Mouthed /ba/, and Non-speech oral movements are all subject to early release by SAS and that no differences in reaction time will be observed.

This chapter will report on an experiment on SAS-induced responses from a mouth-open position and present data to test the above predictions.

4.2 Methods

4.2.1 Participants

Data were collected and analyzed from the same nine participants reported in Chapter 2 (3 male and 6 female; Mean = 23 years, SD = 4.2 years). Participants were all native speakers of North American English. Prior to the testing, participants signed an informed consent form and were naïve to the questions under investigation. The experiment was conducted following the ethical guidelines established by the University of British Columbia.

4.2.2 Apparatus, task, and procedures

The same apparatus and procedures from Chapter 3 were applied, except that participants were asked to begin each task with their mouths open. The first experimental block required participants to respond to an acoustic stimulus by closing their mouths (hereafter the Non-speech condition). In the second block, participants were instructed to respond to the acoustic stimulus by mouthing a silent articulation of /ba/ (hereafter the Mouthed condition). In the final block, participants responded to the acoustic stimulus by producing a vocalized /ba/ (hereafter the Spoken condition).

The testing procedures used for this experiment were identical to those reported in Chapters 2 and 3. Participants performed a single testing session of approximately 20 minutes. Each testing block consisted of twenty control trials and five startle trials.
The five startle trials were presented pseudo-randomly such that the first trial was never a startle trial, nor were there two consecutive startle trials.

4.2.3 Data reduction, dependent measures, and analyses

Nine participants were exposed to three conditions, each containing twenty-five trials, for a total of 675 trials. A total of 43 trials were excluded from the analyses for the following reasons: incorrect movement (i.e., starting with the mouth closed; 5 trials; 0.74%), hesitation (2 trials; 0.3%), anticipation (5 trials; 0.74%), bad audio recording (1 trial; 0.15%), data loss (2 trials; 0.3%), and outlying data points (defined as those voluntary onsets at least 2 SD away from the mean; 28 trials; 4.1%). The remaining 632 trials were examined and further analyzed.

EMG activity measured from the SCM and lip muscles was aggregated and analyzed. Individual trials were first aggregated by participant and condition. Mean EMG activity was taken across the aggregated by-participant means. Lip displacement trajectories from all conditions are also included. It was expected that the SCM and lower lip muscles would show startle reflex activity in response to the SAS; amplified EMG levels and accelerated lip displacements were also anticipated.

Latencies to three kinematic markers were measured: the lower lip voluntary onset, the lower lip opening onset, and the lower lip displacement trough. In the current experimental design, participants started with their mouths open. Within this context, the lower lip voluntary onset was defined as the instant when the lower lip started to move upward for the closure (point A in Figure 4.1); the lower lip opening onset was defined as the instant when the lower lip started to descend for the opening burst of the syllable /ba/ (point B in Figure 4.1); the lower lip displacement trough was defined as the lowest point reached by the lower lip during the opening movement (point D in Figure 4.1).

Figure 4.1: Schematic Spoken response with lip displacement and response acoustic waveforms plotted with respect to the imperative stimulus. Point A marks the beginning of voluntary movement; point B marks the time to the lower lip opening onset; point C marks the acoustic onset; point D marks the lowest position of the lower lip. The intervals across these markers were labeled as t1, t2, t3, t4, and t5. See text for details.

To measure the timing of the initial acoustic burst release, the acoustic signals were first filtered (at 10 Hz) using PRAAT (Boersma and Weenink, 2009). A narrow-band spectrogram (bandwidth of 43 Hz; window length of 30 ms) was displayed in PRAAT to mark the acoustic burst release (point C in Figure 4.1). Reaction times to points A, B, C, and D were measured and analyzed. Measurement analyses were performed using a 3 Condition × 2 Stimulus repeated-measures ANOVA. Greenhouse-Geisser correction was used for any violation of sphericity. Partial eta-squared (ηp²) values were calculated as a measure of effect size.

Relative time between the kinematic markers A, B, C, and D was measured (Figure 4.1), with the time differences labelled as follows: t1 measures the latency between voluntary movement onset and lower lip opening onset; t2 measures the latency between lower lip opening onset and acoustic onset; t3 measures the latency between lower lip voluntary onset and acoustic onset; t4 measures the latency between lower lip opening onset and lower lip displacement trough. The time frame for all of these events is t5 (the time between points A and D in Figure 4.1); the latency of each kinematic event with respect to t5 was also calculated.
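The ratio analysis used in this chapter, in which each interval is expressed as a proportion of the total movement time t5 and variance-stabilized with an arcsine square root transformation before t-tests, can be sketched as follows. The interval values and the function name below are hypothetical illustrations only, not data from the experiment.

```python
import numpy as np

def arcsine_sqrt(p):
    """Arcsine square root transform, a standard variance-stabilizing
    transformation for proportion data in [0, 1]."""
    return np.arcsin(np.sqrt(np.asarray(p, dtype=float)))

# Hypothetical per-trial intervals (ms): t1 = voluntary onset to opening
# onset, t5 = voluntary onset to displacement trough.
t1 = np.array([204.0, 210.0, 198.0])
t5 = np.array([300.0, 305.0, 290.0])

ratios = t1 / t5                      # proportions of the whole movement
transformed = arcsine_sqrt(ratios)    # values entering the t-tests
```

The transform maps proportions in [0, 1] onto [0, π/2], pulling in the compressed variance near the endpoints before the aggregated by-participant means are compared.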
For each trial (startle and control), the ratios t1/t5, t2/t5, t3/t5, and t4/t5 were calculated and transformed using the arcsine square root transformation (this was used due to correlations between markers; McDonald, 2009). The transformed ratios were then aggregated by participant and analyzed via Student's t-tests.

In Chapter 3, we observed lip compression during the production of elicited bilabial stops. The current design asked participants to begin each task with their mouths in an open position. As predicted by Löfqvist and Gracco (1997), a negative lip aperture is anticipated in the production of the bilabial /b/. From a mouth-open position, the two lips start to move toward each other. After contact, the lower lip, which usually generates greater force, pushes against the upper lip and reverses the upper lip's direction of movement (from downward to upward); this is shown as Point M in Figure 4.2.

Figure 4.2: Lip compression markers. Points M and N mark the onset and offset of lip compression, respectively (see text for details). Lip compression is here defined as the latency between Points M and N (tc). Lip compression displacement measures the change in lower lip displacement within tc.

The opening burst is mainly determined by the lower lip (and the jaw). As the two lips come into contact, the initiation of opening from the lower lip affects upper lip movement, pulling the upper lip downward when the opposing forces between the two lips are not strong enough to separate them. Once the opposing force generated by the two lips becomes greater than the stickiness between them, the two lips move apart.
This series of steps produces a w-shaped movement trajectory in the upper lip (the upper channel in Figure 4.2).

To measure lip compression during the mouth-open experiment, two temporal points were marked. The onset of lip compression was defined as the point at which the movement trajectory of the upper lip changes (Point M in Figure 4.2). Arrival at this point reliably indicates that the two lips have come into contact and that the closure force from the lower lip is affecting the upper lip movement trajectory. The offset of lip compression was defined by the lower lip (Point N in Figure 4.2), which is a reliable agonist for the opening movement. Lip compression duration (tc) was calculated as the interval between the onset and offset of lip compression. Lip compression displacement was calculated as the change in lower lip vertical displacement between the onset and offset of lip compression. Lip compression measurements were then analyzed by a 2 Stimulus (control vs. startle) × 2 Condition (Mouthed vs. Spoken) repeated-measures ANOVA.

Figure 4.3: Schematic lip displacement trajectories (top) from the upper and lower lips and the corresponding lip aperture profile (bottom). Data were taken from one example trial from one participant. The two lips reach an equilibrium point and lip aperture stabilizes. The vertical grey line marks the onset of lip compression. See text for details.

Figure 4.3 shows schematic lip displacement trajectories and the corresponding lip apertures. The upper and lower lip data from one trial from one participant are plotted. The lip displacement trajectories in relation to the imperative stimulus (the grey line in Figure 4.3) are illustrated in the top panel of Figure 4.3.
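The compression measures defined above (onset at Point M, offset at Point N, duration tc, and displacement over tc) can in principle be extracted from the two displacement channels. The sketch below runs on synthetic trajectories; its detection rules, taking Point M as the first reversal of upper-lip velocity and Point N as the lower lip's displacement peak, are simplifying assumptions and not necessarily the exact procedure used in this dissertation.

```python
import numpy as np

def lip_compression(upper, lower, dt_ms=1.0):
    """Estimate Point M (compression onset), Point N (compression offset),
    duration tc (ms), and the lower-lip displacement change over tc.
    Simplified rules: M = first sample where upper-lip velocity turns from
    downward to upward; N = lower-lip displacement peak (opening onset)."""
    u_vel = np.diff(upper)
    reversal = np.where((u_vel[:-1] < 0) & (u_vel[1:] > 0))[0]
    m = int(reversal[0]) + 1          # Point M (sample index)
    n = int(np.argmax(lower))         # Point N (sample index)
    tc = (n - m) * dt_ms
    displacement = lower[n] - lower[m]
    return m, n, tc, displacement

# Synthetic traces: upper lip descends, is pushed up briefly, then descends;
# lower lip rises to its peak and then opens.
upper = np.concatenate([np.linspace(30, 10, 41),
                        np.linspace(10, 12, 10)[1:],
                        np.linspace(12, 8, 20)[1:]])
lower = np.concatenate([np.linspace(0, 20, 61),
                        np.linspace(20, 5, 9)[1:]])
m, n, tc, disp = lip_compression(upper, lower)
```

With these synthetic traces, Point M falls at the upper lip's velocity reversal (sample 40) and Point N at the lower lip's peak (sample 60), giving tc = 20 ms at a 1 ms sampling interval.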
To obtain lip aperture data, the absolute vertical difference between the upper and lower lips was calculated: the vertical height of the lower lip was subtracted from the vertical height of the upper lip (Figure 4.3, bottom). When participants initiate the prepared response, the two lips move closer to each other and lip aperture starts to decrease. When the two lips make contact, lip aperture stabilizes (bottom of Figure 4.3). However, while the lower lip is moving upward and pushing against the upper lip, the upper lip may continue moving downward for the lip closure. In that case, lip aperture does not stay constant; an inflection is observed in the lip aperture profile in Figure 4.4.

Figure 4.4: Schematic lip displacement trajectories (top) from the upper and lower lips and the corresponding lip aperture profile (bottom). Data were taken from one trial from one of the participants. This figure shows that lip aperture continues to decrease after the two lips make contact. The vertical grey line marks the onset of lip compression. An inflection of the lip aperture profile is observed. See text for details.

4.3 Results

4.3.1 Startle indicator

For all three conditions, mean EMG activity and lip displacement with respect to the imperative stimulus are plotted in Figure 4.5. Across all three conditions, startle reflex activity was observed in both SCM muscles and the lower lip in SAS-induced responses, whereas only limited activity was observed in the upper lip muscles in the same context. In each case, the onsets of startle reflex activity in the SCM muscles and lower lip occurred within 100 ms of the imperative stimulus.

As observed in mouth-closed responses (Chapters 2 and 3), the observed startle reflex activity in the current mouth-open condition was consistently present in the SCM and lower lip muscles, distinct from EMG activity for voluntary movements, and not task-dependent.
In the current mouth-open design, voluntary movement started at the instant when both lips came towards each other for the closure movement. Unlike in the mouth-closed experiments, the upper lip was engaged in the voluntary closure movement and showed EMG activity from the initiation of the closure. For all three conditions, a larger lip displacement range was observed in SAS-elicited responses.

Figure 4.5: Mean EMG activity and kinematic displacement of control (grey) and startle (black) trials in the Spoken, Mouthed, and Non-speech conditions. All data were normalized to the imperative stimulus. The top four channels represent the mean EMG across four muscles (from top to bottom): left SCM, right SCM, upper lip, and lower lip. Each channel is on the same scale but vertically arranged for visualization. The bottom two channels depict the mean upper lip and lower lip displacement trajectories, respectively. Red boxes mark the window of startle reflex activity, ranging from 30 ms to 120 ms after the onset of the imperative stimulus.

4.3.2 Kinematic markers, reaction times, and timing

A summary of reaction times derived from kinematic markers is presented in Table 4.1. Concerning lower lip voluntary movement onset, longer latencies were observed for control responses than for SAS-elicited responses across all three conditions. Two-way ANOVA results showed a main effect for Stimulus (F(1, 8) = 24.78, p < .01, ηp² = 0.29).
No effect was found for Condition (p = 0.054) or for the Condition × Stimulus interaction (p = 0.7).

Table 4.1: Mean reaction time (in ms) at dependent markers across conditions; standard deviations are in parentheses.

                                          Spoken     Mouthed     Non-speech
  Voluntary movement onset     Control    119 (26)   136 (31)    132 (32)
                               Startle     91 (22)   101 (27)     99 (22)
  Lower lip opening onset      Control    265 (69)   282 (84)    –
                               Startle    210 (28)   230 (37)    –
  Lower lip displacement       Control    423 (94)   461 (125)   –
  trough                       Startle    352 (46)   382 (68)    –

Latency to the lower lip opening onset was measured and analyzed only for Mouthed and Spoken responses. As reported above for the lower lip voluntary movement onset, shorter latencies were observed in SAS-elicited responses than in control responses. Results showed main effects for Condition (F(1, 8) = 5.96, p = 0.04, ηp² = 0.03) and for Stimulus (F(1, 8) = 7.39, p = 0.03, ηp² = 0.19), but no Condition × Stimulus interaction (p = 0.68). Compared to the control responses, startle responses showed shorter latencies before the lower lip displacement trough was reached. A main effect for Stimulus was found (F(1, 8) = 10.6, p = .01, ηp² = 0.17). No other effect or interaction was reported. Note that standard deviations for both Mouthed and Spoken responses gradually increased from voluntary movement onset, to lower lip opening onset, and then lower lip displacement trough. This increase in standard deviation was driven by one participant (P19), who exhibited much longer lip compression during the closure. For this participant, longer latencies to the lower lip opening onset and displacement trough were observed. These longer latencies resulted in the observed large standard deviations across participants.

While the lower lip initiated the voluntary movement at around the same time for all three conditions, it varied in the time taken to reach its peak position in the closure (i.e., the opening onset) and its lowest position (i.e., displacement trough).
As summarized in Table 4.1, shorter latencies to these markers were observed for Spoken responses than for Mouthed responses. These shorter latencies may be due to the presentation order of the blocks. In the current design, the presentation order of the blocks was fixed: participants performed all tasks in the order Non-speech > Mouthed > Spoken. As they were asked to begin each response from a mouth-open condition, the degree of opening decreased as the tasks proceeded. The largest opening (measured by the distance between the two lips at the lower lip voluntary onset) was observed for Non-speech (Mean = 23.5 mm, SD = 8.6 mm), followed by Mouthed (Mean = 18.09 mm, SD = 8.3 mm), and then Spoken responses (Mean = 16.79 mm, SD = 8.1 mm). As such, the shorter latencies to the lower lip peak position observed in the Spoken condition may be accounted for by differences in the degree of opening at the starting position.

While shorter latencies at different kinematic markers were observed in startle responses, the relative timing between these kinematic markers remained unaffected. For Non-speech responses, only the latency between lower lip voluntary movement onset and lower lip opening onset (i.e., t1 in Figure 4.1) was measured. The t-test result showed no significant difference between control (Mean = 204 ms, SD = 37 ms) and startle responses (Mean = 206 ms, SD = 45 ms; p = 0.93). For Mouthed responses, the lower lip kinematic markers at the lower lip's voluntary movement onset, opening onset, and displacement trough were measured.
The t-test results showed no statistical difference between control and startle values for the arcsine square root transformed relative time data between voluntary movement onset and lower lip opening onset (i.e., t1/t5; control Mean = 0.68, startle Mean = 0.69, p = 0.48) or between lower lip opening onset and lower lip displacement trough (i.e., t4/t5; control Mean = 0.77, startle Mean = 0.75, p = 0.52).

For Spoken responses, no significant difference was reported between control and startle values for the arcsine square root transformed relative time data between voluntary movement onset and lower lip opening onset (i.e., t1/t5; control Mean = 0.70, startle Mean = 0.69, p = 0.54), between lower lip opening onset and voice onset (i.e., t2/t5; control Mean = 0.63, startle Mean = 0.52, p = 0.14), between voluntary movement onset and voice onset (i.e., t3/t5; control Mean = 0.85, startle Mean = 0.81, p = 0.21), or between lower lip opening onset and displacement trough (i.e., t4/t5; control Mean = 0.74, startle Mean = 0.76, p = 0.53). As in the mouth-closed conditions (Chapters 2 and 3), the relative timing across these kinematic markers in the mouth-open condition was not altered by SAS elicitation. The relative time frames for control and startle trials are illustrated in Figure 4.6, with the relative time intervals representing untransformed data.

Figure 4.6: Relative timing ratios (non-transformed) of various kinematic and acoustic time markers for startle and control trials. The time frame for these events is the time between lower lip voluntary movement onset and lower lip displacement trough (t5).
Time intervals include the time between: voluntary movement onset and lower lip opening onset (t1), lower lip opening onset and voice onset (t2), voluntary movement onset and voice onset (t3), and lower lip opening onset and lower lip displacement trough (t4).

4.3.3 Lip kinematics

Table 4.2 summarizes lip compression onset latency and duration in Mouthed and Spoken responses. Observed lip compression onset latencies (illustrated by the onset latency for Point M in Figure 4.2) were longer for Mouthed responses than for Spoken responses, and longer for control responses than for startle responses. Main effects for Condition (F(1, 8) = 17.61, p < .01) and Stimulus (F(1, 8) = 25.07, p < .01) were observed. No Condition × Stimulus interaction was reported (p = 0.76). Mouthed and Spoken responses showed no difference in lip compression duration (p = 0.91). While shorter mean lip compression durations were observed in startle responses than in control responses, the difference was not significant (p = 0.5). No Condition × Stimulus interaction was reported (p = 0.86).

Table 4.2: Mean lip compression onset latency (in ms), lip compression duration (in ms), and lip compression displacement (in mm); standard deviations are in parentheses.

                                                  Mouthed      Spoken
  Lip compression onset latency (ms)   Control    233 (54)     216 (47)
                                       Startle    190 (37)     169 (29)
  Lip compression duration (ms)        Control    59.9 (66)    57.5 (38)
                                       Startle    50.5 (28)    49.7 (19)
  Lip compression displacement (mm)    Control    3.70 (4.3)   4.09 (4.5)
                                       Startle    5.93 (5.6)   5.05 (5.1)

No main effect for Condition (p = 0.61) or Stimulus (p = 0.11) was reported with regard to lower lip compression displacement, whereas a
Figures 4.9, 4.8, and4.7 present mean EMG activity and lip displacement trajectories for theSpoken, Mouthed, and Non-speech conditions, respectively. Across all threeFigure 4.7: Upper lip (left) and lower lip (right) EMG activity and dis-placement trajectories in the Spoken condition. Data in all channels werenormalized to the lower lip opening onset (vertical line).89conditions, for both lips, larger EMG activity and displacement ranges wereobserved in startle responses than in control responses. Compared withthe upper lip, the lower lip generated more EMG activity and exhibitedlarger displacement for both control and startle responses. It is noted that,Figure 4.8: Upper lip (left) and lower lip (right) EMG activity and dis-placement trajectories in the Mouthed condition. Data in all channels werenormalized to the lower lip opening onset (vertical line).for Mouthed and Spoken responses, the lower lip demonstrated two EMGevents, whereas the upper lip only showed one instance of EMG activity.The first EMG event measured in the lower lip was temporally bound tothe event observed in the upper lip. This EMG event is considered to be90related to the closing movement. Comparable EMG activity for the closuremovement was also seen in Non-speech responses (Figure 4.9). The secondEMG activity observed in the lower lip, on the other hand, was responsiblefor the opening of the bilabial burst.Figure 4.9: Upper lip (left) and lower lip (right) EMG activity and displace-ment trajectories in the Non-speech condition. Data in all channels werenormalized to the lower lip opening onset (vertical line).914.4 DiscussionThe results of this chapter show that the StartReact effect is observedfor utterances produced from a mouth-open starting position for all threeconditions in this study (Spoken, Mouthed, and Non-speech), as evidencedby the presence of startle reflex activity in the SCM and the lower lip andby accelerated voluntary movement onset for the prepared response. 
All the responses were elicited at shorter latencies by a SAS; stronger EMG activity and larger displacement amplitudes were observed for SAS-induced responses than for control responses. As predicted, adding an additional pre-speech closure movement does not affect reaction time. The earlier onset of lip compression for Spoken responses than for Mouthed responses was consistent with the smaller degree of mouth opening observed for Spoken responses. Meanwhile, relative timing across different kinematic markers was unaffected by SAS (Figure 4.6).

Comparing Table 3.3 with Table 4.1, for both Spoken and Mouthed syllables, voluntary movement onsets for mouth-closed and mouth-open responses were comparable. It is therefore suggested that in a simple RT task, adding an additional preparatory movement does not appear to yield any additional delay of execution. The present study took the position that the mouth-open condition is a temporal combination of a mouth-closing movement and a mouth-closed condition. While an overshoot from the mouth-closing movement is expected, such an overshoot does not contribute more lip compression to the existing mouth-closed speech plan. Lip compression appears to fall within a narrow range, regardless of the starting position of the mouth. Further research into lip compression would be needed to determine whether the compression from the mouth-closed speech plan has been overwritten or otherwise modified.

Different degrees of mouth opening may affect the timing, strength, and duration of lip compression. In the current design, the degree of mouth opening in each participant's initial posture was not controlled for. Given different beginning positions, different lip forces and lip movement velocities may be generated, which in turn may affect lip compression kinematics. The observed lip deformation may thus be the result of purely biomechanical interaction between the two lips.
In instances where the upper lip is affected by the lower lip, it is not always clear that the upper lip has been pushed upward and forced to change its trajectory. In fact, after contact between the two lips is initiated, the influence coming from the lower lip may not be great enough to change the moving trajectory of the upper lip. The force generated by the upper lip may be strong enough to resist the impact from the (upward-moving) lower lip. As a result, the upper lip may be affected only in terms of its velocity, while maintaining its downward movement. In turn, measured lip compression may not necessarily reflect differentiated lip compression for Spoken and Mouthed responses.

To summarize, the results show that preparatory oral movements can be included in a speech plan and may be subject to rapid release by a SAS. While lip compression is speech-specific in the mouth-closed condition, it is not unique to Spoken speech in the mouth-open condition.

Chapter 5
Pitch planning in English and Taiwanese Mandarin

5.1 Introduction

Previous chapters have demonstrated that the SAS experimental paradigm is capable of uncovering the content of speech plans. Using this method, a prepared CV syllable can be elicited at shorter latencies, and prepared CV syllables can be performed rapidly and accurately, though some aspects of coordination may be disturbed. The present chapter tests whether phonemic and non-phonemic aspects of pitch are included in a speech plan.

As reported in Chapter 2, prepared English CV syllables elicited by SAS are initiated at shortened latencies, with lower lip movements for /ba/ initiated earlier in SAS-elicited responses (Mean = 75 ms) than in control responses (Mean = 116 ms).
In English SAS-triggered syllable production, throughout the syllable, lip kinematics and vowel F1 and F2 (Figure 5.1) are unaffected by the presentation of a SAS, indicating that these parameters are encoded in the speech plan and are not dependent on afferent feedback. The lack of effect on F1 and F2 indicates that SAS-induced CV syllables share comparable vowel quality with control responses. These responses are therefore considered "accurate." However, in addition to formants, other measurements, such as pitch (i.e., fundamental frequency f0) and speaking rate, may also indicate different properties of the syllable. Results for parameters relating to pitch control, such as pitch height and contour profile, were not discussed in Chapter 2. This leaves an important gap, as it remains unclear whether or not SAS elicitation impacts pitch. If a SAS imposes any effects on pitch, at least three questions ought to be answered. First, how is pitch affected by SAS? Second, are SAS-induced responses with affected pitch still considered "accurate"? Third, if these SAS-induced responses are not accurate, is there any correction or adjustment based on feedback? To answer these questions, the current chapter investigates the pitch height and contour of SAS-induced responses.

The SAS paradigm is ideally suited for investigating pitch control, both because a SAS can elicit rapid release of prepared movement sequences, allowing less time for effects of sensory feedback, and because the associated startle reflex provides a natural, physiological pitch perturbation. Following presentation of a SAS, a brief physiological startle reflex response occurs (muscle pre-motor reaction time around 40 ms), followed by accelerated release of the prepared movement sequence (see Stevenson et al., 2014, for details); by the time of voice onset (Mean = 204 ms), this initial startle reflex has long since been resolved.
In an experiment conducted by Baer (1979), startle responses were observed when loud clapping was introduced near the participants' ears. The participants' continuous phonation was perturbed by this startling acoustic stimulus, with a consequent increase in the fundamental frequency of the phonation at a latency of 50 ms (see Baer, 1979, for details). The perturbation lasted for approximately 100 ms, after which the overall frequency of phonation was higher than before the perturbation. Baer argues that tension in the larynx increases after the perturbation due to a protective closure reflex that occurs in response to the unexpected loud sound. When an unexpected startling auditory stimulus was presented during constant phonation, the startle reflex caused an increase in laryngeal tension, resulting in a momentary elevation in pitch height, followed by a correction (i.e., dropoff) back to baseline level.

Figure 5.1: SS ANOVA comparison of formant frequencies F1 and F2 over syllable /ba/ durations. The black line denotes the predicted fit for control responses and the white line denotes the startle responses. The dark grey and light grey bands surrounding the predicted fits represent a 95% confidence interval. Any white space between the transition lines for each condition represents a statistically significant difference between the given measures. NB: The figure is a revised version of Figure 2.8.

Extrapolating from Baer's (1979) findings for constant phonation to the production of spoken syllables, let us consider the sequence of pitch-related events beginning with presentation of the SAS. First, a speaker who is planning to produce a spoken syllable hears a SAS; after about 40 ms, the first premotor reaction associated with the startle reflex can be observed, initiating an increase in laryngeal tension.
As soon as 20 ms after the startle reflex has begun to perturb laryngeal tension, any initial somatosensory (e.g., proprioceptive) feedback-based correction may begin to take place (cf. Larson et al., 2008). Over 100 ms later, the physiological reflex and the initial response having resolved, voicing onset occurs (∼204 ms); vocal fold vibration offers a second opportunity for somatosensory (e.g., vibrotactile, aerotactile) feedback to take effect at a further latency of ∼20 ms after the pitch onset (cf. Larson et al., 2008). If auditory feedback plays a role in correction, compensation would occur at an additional latency of as early as 100–150 ms (Hain et al., 2000) up to 210 ms (Jones and Munhall, 2002) following voicing onset. Thus, if initial pitch height is pre-specified in the speech plan (i.e., if there exists a specified baseline to which the system might be corrected), proprioceptive feedback streams should have had sufficient time to correct for SAS perturbation to laryngeal configuration prior to vowel onset. However, if initial pitch height is not part of the speech plan, an uncorrected increase in onset pitch height may be expected to persist in SAS-induced responses. Taking the above observations into account, we expect vowel-initial pitch height to be elevated by SAS perturbation. Figure 5.2 illustrates the windows for potential somatosensory and auditory feedback for a SAS-induced response.

Figure 5.2: Schematic timeline of potential feedback for SAS-elicited speech.

While elevated pitch height is anticipated in response to a SAS, it is also necessary to consider whether perturbed pitch control affects response accuracy. In the experiment reported in Chapter 2, the target response was a nonsense syllable /ba/.
Since this is not a word in English, perturbed pitch control cannot produce a meaning change or shift. Therefore, response accuracy cannot be determined in this design. In addition, even for a grammatical English word, such as /ti/ ('tea'), pitch height does not create a meaning contrast. For example, /ti/ produced in a higher pitch range by a 3-year-old toddler conveys the same meaning as /ti/ produced in a much lower range by a 65-year-old man. Thus, examining SAS-induced English pitch alone cannot sufficiently address the question of elicited response accuracy.

Mandarin Chinese, on the other hand, serves as a qualified candidate for investigating this issue. As a tonal language, Mandarin Chinese uses differentiated pitch levels and contours to distinguish word meanings. The syllable /ba/ can be combined with four lexical tones to produce four different lexical content words (cf. Table 5.1). Studying SAS-induced Mandarin pitch can thus reveal whether or not pitch contour profiles are affected by SAS, resulting in a change in word meaning. If pitch contour profiles are pre-specified in the speech plan, they should not be perturbed by a SAS. In this case, the induced responses can be considered accurate responses.

Table 5.1: The four lexical tones in Mandarin Chinese and possible words

  Matched syllable   Tone     Pitch control              Word meaning
  /ba/               Tone 1   High-level                 'eight'
                     Tone 2   Rising                     'pull'
                     Tone 3   (Mid-) Falling (-rising)   'target'
                     Tone 4   Falling                    'father'

Compared with pitch height, pitch contour in Mandarin appears to be fairly stable, possibly because of its more crucial role in maintaining lexical contrasts. Liu et al. (2007), for example, showed that while Mandarin-speaking mothers use a higher pitch level to address infants than to address adults, the contours associated with phonemic contrasts are maintained and even exaggerated between adult-directed and infant-directed speech.
Thus, lexical contrasts can be maintained by preserving the contour profiles that are used to differentiate lexical tones, despite changes in absolute pitch height. If absolute pitch height is part of the speech plan, a correction to baseline level in SAS-induced responses is anticipated at some point during the vowel (i.e., auditory feedback-based correction), compromising pitch contour. However, if pitch contour is pre-specified but absolute height is not (as we might expect from the previous literature described above), the contour profile should be preserved when elicited by a SAS, even if at a higher pitch range. Thus, while we predict an elevated pitch height at vowel onset, we do not expect to see a rapid dropoff in pitch signifying feedback-based correction following the physiological response. Rather, we expect to see pitch contours maintained, though at an elevated level.

The current experiment was designed to examine how pitch in tonal and non-tonal languages is produced in response to a SAS. English, a non-tonal language, is compared with a Mandarin dialect spoken in Taiwan (hereafter Taiwanese Mandarin). The issue of how pitch control affects response accuracy under SAS elicitation is discussed.

5.2 Methods

5.2.1 Participants

Fifteen Taiwanese Mandarin native speakers were recruited for the study, of whom seven (2 male and 5 female) exhibited a consistent startle reflex in response to SAS (e.g., shoulder shrugging or facial grimace); all the data collected from these seven participants were analyzed. Prior to testing, participants signed an informed consent release and were left uninformed about the research question under investigation. The experiment was conducted following the ethical guidelines established by the University of British Columbia (UBC). For the Taiwanese Mandarin experiment, all testing was conducted in a sound-attenuated room at National Chiao Tung University, Hsinchu, Taiwan. Data from Stevenson et al.
(2014) were reanalyzed to identify SAS-induced responses in English for comparison. Apparatus and procedures were as described in Stevenson et al. (2014), except for minor changes as described here.

5.2.2 Apparatus, task, and procedures

Participants sat in an upright chair facing a computer monitor (Viewsonic VE175, 17", 75 Hz refresh rate) at a distance of approximately 0.5 meters and were instructed to look straight ahead at the monitor and respond to an acoustic stimulus by vocalizing the target syllable as quickly as possible. A visual display of the syllable /ba/ was presented on the monitor concurrently with an acoustic stimulus of either 80 dB (control) or 124 dB (startle). Participants were instructed to respond to the auditory stimulus by vocalizing the target syllable /ba/ while acoustics and video were collected.

In the Taiwanese Mandarin experiment, the syllable /ba/ was matched with each of the four lexical tones and presented in blocks, so that all syllables in each block contained the same tone (in the order of Tone 1, 2, 3, and 4). Target syllables were visually displayed using Taiwanese phonetic orthography (Zhuyin bopomofo), with tonal information specified.

All testing trials began with a warning tone (500 ms, 440 Hz, 80 dB). The acoustic imperative stimulus and the visual display of the target syllable followed the warning tone by a random time delay of between 1500 and 2500 ms. This auditory signal was either a control stimulus (80 ± 2 dB, 100 ms, 1,000 Hz) or a startling stimulus (124 ± 2 dB, 40 ms, 1,000 Hz, < 1 ms rise time). The acoustic stimuli were amplified (R Long Audio amplifier AK302) and then presented via a loudspeaker placed directly behind the head of the participant. The acoustic stimulus intensities were measured using a sound-level meter (TES sound level meter model 1350A, "A" weighted scale, impulse response mode) at a distance of 30 cm from the loudspeaker (approximately the distance to the ears of the participant).
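The randomized interval between the warning tone and the imperative stimulus can be sketched as a simple sampler. This is an illustrative reconstruction, not the presentation script actually used; the function name is hypothetical.

```python
import random

def foreperiod_ms(rng: random.Random) -> float:
    """Sample the random delay between the warning tone and the
    imperative stimulus: uniform between 1500 and 2500 ms, as in
    the procedure described above."""
    return rng.uniform(1500.0, 2500.0)

# An unpredictable foreperiod prevents participants from timing their
# responses to the warning tone rather than to the stimulus itself.
rng = random.Random(42)
delays = [foreperiod_ms(rng) for _ in range(25)]  # one delay per trial
```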
Prior to the testing session, a baseline startle trial (124 ± 2 dB, 40 ms, 1,000 Hz, < 1 ms rise time) was introduced.

Participants performed a single testing session of approximately 15 minutes. The testing block consisted of twenty control trials and five startle trials. The five startle trials were presented pseudo-randomly such that the first trial was never a startle trial, nor were there two consecutive startle trials.

5.2.3 Recording equipment

Acoustic production was recorded by the computer program PsychoPy (www.psychopy.org). Acoustic data were collected by a microphone directly connected to the computer. Data were sampled at 44.1 kHz; collection began 160 ms before the presentation of the stimulus and was terminated 2000 ms later. Video was recorded using a Lumix digital camera (Panasonic model DMC-LX3).

5.2.4 Data preparation and statistical analyses

A total of 34 of the 700 Taiwanese Mandarin trials (4.8%) were excluded from analysis, due either to slow voice onset (defined as more than 2 SD from the mean) or to participants' unfamiliarity with the trial event at the beginning of the testing. For both English and Taiwanese Mandarin, acoustic signals were processed using Praat (Boersma and Weenink, 2009, http://www.fon.hum.uva.nl/praat/). Syllable boundaries were visually examined by the experimenter and manually marked. Before marking syllable boundaries, the acoustic waveforms were first filtered via de-emphasis at 50 Hz.

Voice onset was marked based on the appearance of substantial acoustic wave bursts; the end of the syllable was marked as the end of periodic waveforms. Acoustic burst release times and syllable durations were submitted as dependent measures and analyzed using paired Student's t-tests.

Formants were extracted using the LPC formant tracker, with a window from 0 to 2000 Hz and a window length of 0.0125 s. Data were then analyzed across normalized durations.
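Analyzing tracks across normalized durations requires resampling each token onto a common time grid. A minimal sketch of one way to do this with plain linear interpolation follows; the function name and the 100-point grid are illustrative, not the analysis scripts actually used.

```python
def resample_track(values, n_points=100):
    """Linearly resample a variable-length pitch or formant track onto
    a fixed grid spanning normalized time 0.0 to 1.0, so that tokens of
    different durations can be compared point-by-point."""
    m = len(values)
    if m == 1:
        return [float(values[0])] * n_points
    out = []
    for i in range(n_points):
        # Position of this output sample on the original index scale.
        pos = i * (m - 1) / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out
```

Tracks resampled this way share a common abscissa, which is what comparisons over normalized duration (such as the SS ANOVA fits reported below) require.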
Acoustic burst release times and syllable durations were used to measure reaction time and submitted as dependent variables. Measurement analyses were performed using a 2 Stimulus × 4 Tone repeated measures ANOVA. Pitch and formant transitions from both English and Taiwanese Mandarin were analyzed using a smoothing spline analysis of variance (SS ANOVA) to examine the fit of each variant (cf. Davidson, 2006; Derrick and Schultz, 2013).

5.3 Results

As with the English results reported in Stevenson et al. (2014), in Taiwanese Mandarin, significantly shorter latencies for acoustic burst release time were observed for SAS-induced responses than for control responses (Table 5.2). Main effects were observed for Stimulus (F(1, 6) = 10.87, p = 0.02), Tone (F(3, 18) = 8.94, p < .01), and the Stimulus × Tone interaction (F(3, 18) = 4.9, p = 0.01). Post-hoc analyses revealed that the reaction time latency for Tone 4 was significantly shorter than that for Tone 1 and Tone 2. The main effects for Tone and the Stimulus × Tone interaction may be interpreted as a practice effect, as reaction time latencies gradually decreased from Tone 1 to Tone 4. Syllable durations are summarized in Table 5.2. No effect for Stimulus or Tone was found. No statistically significant difference was found for either F1 or F2 (as indicated by the overlaps between the control and startle fits in Figure 5.3), suggesting that formant profiles, acting in a phonemic role, are preserved in SAS-induced responses.

Table 5.2: Average acoustic burst release times and syllable durations (in ms) across 4 Taiwanese Mandarin tones.
Standard deviations are shown in parentheses.

          Burst release time          Syllable duration
          control        startle      control        startle
Tone 1    297.07 (70)    225.49 (49)  275.32 (103)   252.38 (79)
Tone 2    272.37 (67)    233.01 (48)  349.49 (127)   313.03 (109)
Tone 3    244.24 (74)    207.96 (55)  271.07 (124)   257.12 (122)
Tone 4    209.40 (68)    180.06 (43)  206.47 (60)    209.99 (63)

Figure 5.3: Taiwanese Mandarin F1 and F2 range for control and startle responses. SS ANOVA results of F1 and F2 frequency range across 4 tones (frequency in Hz over normalized syllable duration). Each shaded band represents a 95% confidence interval. The white space between the transition lines for each condition represents a statistically significant difference between the given measures.

For English, SAS-induced pitch levels were significantly higher than those of control responses, as indicated by the distinct SS ANOVA curves shown in Figure 5.4. The onset frequency in SAS-induced responses was elevated by 40 ∼ 50 Hz compared to control responses. It is also noted that pitch contour, although not phonemic in English, was unaffected by the SAS.

Figure 5.4: SS ANOVA comparison of pitch over English /ba/ durations (frequency in Hz over normalized duration). Each shaded band represents a 95% confidence interval. The white space between the transition lines for each condition represents a statistically significant difference between the given measures.

For Taiwanese Mandarin, as with English, SAS-elicited pitch levels were elevated and distinguished from control responses throughout most of their duration (Figure 5.5). At vowel onset, substantial pitch elevation was observed for all four tones, with an increase of 39 Hz (≈ 222 cents) for Tone 1, 27 Hz (≈ 189 cents) for Tone 2, 23 Hz (≈ 166 cents) for Tone 3, and 30 Hz (≈ 167 cents) for Tone 4. Pitch contour for each tone, on the other hand, remained largely unchanged by the SAS.
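The cent equivalents reported above follow from the standard logarithmic conversion between two frequencies, sketched below (1200 cents per octave; the function name is illustrative).

```python
import math

def cents(f_ref_hz: float, f_obs_hz: float) -> float:
    """Interval between two frequencies in cents (1200 per octave),
    which expresses a pitch elevation independently of the speaker's
    absolute register."""
    return 1200.0 * math.log2(f_obs_hz / f_ref_hz)

# Sanity check: one equal-tempered semitone above A4 is exactly 100 cents.
semitone = cents(440.0, 440.0 * 2 ** (1 / 12))
```

Because the scale is logarithmic, the same elevation in Hz corresponds to fewer cents at a higher reference frequency, which is why the 30 Hz increase for Tone 4 and the 23 Hz increase for Tone 3 yield similar cent values.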
Also visible in Figure 5.5, tone-specific variance is maintained across the different tones (i.e., Tones 3 and 4 exhibit more variance than Tones 1 and 2 in control responses, and this variance is preserved in startle conditions).

Figure 5.5: SS ANOVA results of pitch profiles across 4 Taiwanese Mandarin tones (one panel per tone; frequency in Hz over normalized duration). Other legends follow Figure 5.3.

The prediction that fundamental frequency will increase in response to a SAS is supported by the SS ANOVA results. An elevated pitch contour was observed in startle responses for both English and Taiwanese Mandarin (Figures 5.4 and 5.5). The SS ANOVA results for pitch show a substantial difference (∼ 40 Hz) between control and startle responses.

5.4 Discussion

In this chapter, a simple RT task using SAS was conducted to compare Taiwanese Mandarin, a language in which different pitch contour profiles produce contrasts in meaning, with English, in which pitch contour is not used to distinguish word meanings. The purpose of the experiment was to investigate how formant and pitch control are prepared in the speech plan and how these details are revealed in SAS-induced responses. Essentially, the experiment revealed that SAS-induced responses are both rapid and accurate.

As indicated by the results, the introduction of a SAS triggers a rapid release of prepared syllables. This result echoes the findings for English reported in previous chapters of this dissertation. SS ANOVA results for F1 and F2 in both English and Taiwanese Mandarin also confirm that formants remain largely unaffected by a SAS.
As formants play a phonemic role in both English and Taiwanese Mandarin, the lack of effect on formant profiles in response to SAS elicitation suggests that a prepared target syllable ought minimally to include phonemic information (i.e., information necessary for lexical contrast). Note that there is a tendency for English F1 to increase in the middle of the response during startle trials. Increased vocal effort is usually accompanied by a higher fundamental frequency and a lower jaw, which in turn elevates F1 values. Therefore, the observed elevation of F1 values in the middle of production may be attributed to the physiological jaw-lowering response to the SAS.

For both English and Taiwanese Mandarin, the pitch height of SAS-induced responses is significantly elevated throughout the syllable's duration. As reported in Hain et al. (2000), pitch compensation can occur as early as 100 ∼ 150 ms after the onset of auditory feedback. Thus, for SAS-induced responses, if any feedback correction in pitch were to occur, it could start as early as 100 ms after voice onset. The current results revealed no evidence of feedback-based correction of pitch level to a pre-specified baseline at any stage in the English production. Thus, it appears that absolute pitch height is not pre-specified in speech plans for English. The elevation can be explained as relating to a physiological startle reflex, with an increase in laryngeal tension presumably acting as a protective maneuver in response to the SAS (Baer, 1979).

In a tonal language such as Taiwanese Mandarin, pitch plays a phonemic role in speech production; on the assumption that contrastive phonemic information should be pre-specified in the speech plan to guarantee response accuracy, one might predict that, for Taiwanese Mandarin, such a plan would include both absolute pitch height and contour profile, while this may not be the case for a non-tonal language such as English.
The results from the current study, however, show that both languages respond similarly to the SAS, with initial pitch heights being compromised (elevated) in both languages, while contour profiles are preserved throughout the syllable for all four Taiwanese Mandarin tones. This difference between pitch height and contour profiles may relate to the observation that speakers are capable of producing reliable pitch contours despite a wide range of baseline vocal pitch levels in the population. Pitch control realized within a range of levels, as opposed to an absolute height, has been associated with speakers' long-term exposure to the sound inventory of the language, to other speakers of the same language, and to musical training from different ages (Deutsch et al., 2006, 2009). As summarized in Table 5.2, syllable durations across the four Taiwanese Mandarin tones ranged from 210 ms to 313 ms. The effects of afferent (sensory) feedback should thus have been expected to begin as early as 30% ∼ 50% of the way through the syllable's production. Our results show that for all SAS-induced tonal contours, no attempt to correct the pitch to an absolute or baseline level was observed (Figure 5.5).

In Figure 5.5, the falling contour in SAS-induced responses starts to level out slightly around 50% of the way through the syllable's duration. One potential explanation could be feedback in action. Previous studies have suggested that pitch regulation is sensitive to auditory feedback (e.g., Houde and Jordan, 2002; Jones and Munhall, 2002; Xu et al., 2004); in continuous vowel phonation, speakers compensate for perturbed auditory feedback at a latency of around 210 ms (Jones and Munhall, 2002). Similarly, during the production of bi-tonal sequences, when speakers' feedback is shifted, the production is compensated at a latency of around 164 ms (Xu et al., 2004). In the current study, for Tone 4 responses, the average syllable duration was 205 ms for control responses and 210 ms for startle responses.
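The 30% ∼ 50% figure is straightforward arithmetic on the compensation latency and the syllable durations; the sketch below makes it explicit, using the earliest (100 ms) auditory compensation latency from Hain et al. (2000) and mean durations from Table 5.2 (the function name is illustrative).

```python
def feedback_onset_fraction(latency_ms: float, syllable_ms: float) -> float:
    """Fraction of the way through a syllable at which feedback-based
    correction could first appear, given a compensation latency
    measured from voicing onset."""
    return latency_ms / syllable_ms

# Earliest (100 ms) latency against the longest and shortest mean
# syllable durations in Table 5.2:
longest = feedback_onset_fraction(100.0, 313.03)   # Tone 2 startle, ~0.32
shortest = feedback_onset_fraction(100.0, 206.47)  # Tone 4 control, ~0.48
```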
If the leveling of pitch contour were the result of feedback correction, a compensation occurring around 100 ms (50% of the way through the syllable's duration) would be earlier than any feedback compensation otherwise reported. If the significant pitch drop were the result of feedback correction, the correction would be expected to occur much later in time. Moreover, if afferent feedback looped back during production, the pitch contour and level would be anticipated to fall within a profile and range similar to those of control responses (i.e., the dashed line in Figure 5.5). Instead of becoming more similar to control responses, the leveling of Tone 4 startle responses is more likely to be accounted for by creakiness in the second half of the syllable.

Together with the findings reported in Chapter 2, the observations we have made here reveal a good deal of similarity between plans for speech and other SAS-induced (e.g., upper limb) movements. In SAS-induced speech, overall trajectories for a prepared response are generally maintained (for instance, in the context of vowel formants, pitch, and lip movements). Likewise, just as a SAS causes increased EMG muscle activity compared with control responses for upper limb movements (e.g., Carlsen et al., 2012), similar effects are also observed in SAS-induced speech movements (cf. Chapter 2). In the present study, amplified activity in laryngeal muscles elicited by a SAS may also account for increased laryngeal tension and the concomitant elevation in pitch level.

To conclude, SAS has not been found to introduce any perturbation to pre-specified formant and pitch contour profiles. Response accuracy thus remains unaffected. However, higher-pitched syllables are observed in response to a SAS; this suggests a limited capacity in the preplanning of speech tasks. While pitch contour is likely to be included in speech plans under feedforward control, absolute pitch height may be implemented at a later stage in production.
This suggestion, that pitch contour and pitch height are introduced at different stages of production, calls for further research.

Chapter 6

General discussion

6.1 Summaries

This dissertation employs the design of a startling auditory stimulus to trigger prepared speech and investigates different aspects of the planning of spoken syllables. The results from each chapter are summarized as follows.

Chapter 2 demonstrates that the StartReact effect is also observed in prepared syllable production. The implication of this study is that this methodology is also applicable to motor behaviours with heavy cortical involvement. In that chapter, it has been shown that a syllable such as a CV sequence can be prepared in advance and is subject to early release by SAS. In addition to the early release, additional lip compression was observed at the beginning of the voluntary spoken /ba/. This compression was seen in both control and startle responses. Chapter 3 followed up on the observations of both latency and lip compression and further examined whether these effects are speech-specific or generic to other speech-like and non-speech movements. As revealed by the results, lip compression occurred most frequently in Spoken syllables, followed by Mouthed and then Non-speech movements. Compared with Spoken responses, Mouthed and Non-speech responses exhibited less compression displacement and shorter compression duration. Differentiated frequencies and kinematic profiles of lip compression across Spoken, Mouthed, and Non-speech responses suggest that such lip compression is part of the speech plan for Spoken speech, and possibly more associated with the air flow required for Spoken speech. The latencies to the voluntary movement onsets for all three conditions were comparable. However, a slightly longer reaction time and more reduced lip compression were observed in Mouthed speech, compared with the other two.
It is conjectured that an inhibitory effect may be induced for Mouthed speech, since it is not as well-rehearsed as Spoken speech; this potential inhibitory effect may be responsible for the observation.

To compare with the speech-specific lip compression observed in Chapter 3, Chapter 4 further tested whether lip compression resulting from movement overshoot is present in all responses and subject to early release by SAS. An additional pre-speech movement (i.e., a mouth-closing movement) was added to the beginning of the prepared /ba/. The results showed that all three response types (Spoken, Mouthed, and Non-speech) respond similarly in terms of both latency and lip compression.

Chapter 5 investigates how suprasegmental information is included in speech plans. By comparing English and Taiwanese Mandarin, the results show that pitch levels are subject to perturbation and elevation by SAS while pitch contour and formants remain unaffected, suggesting that phonemic gestures may be included in speech plans and are more resistant to external perturbation.

6.2 Theoretical implications

The experimental results of this dissertation not only address the immediate research questions but also lead to further discussion and implications regarding previous theoretical work, as discussed in the following sub-sections.

6.2.1 Startle paradigm as a window into speech motor planning

It is commonly acknowledged that a speech event not only specifies the intended linguistic function but also coordinates across multiple muscular structures. As defined by Maas et al. (2008, p. 107), a speech motor program is "a set of processes responsible for transforming an abstract linguistic (phonological) code into spatially and temporally coordinated patterns of muscle contractions that produce speech movements." For example, to successfully produce a CV syllable /ba/, the lips are bound to make a closure followed by a burst into the vowel /a/.
Note that the production of /b/ is not as simple as making a bilabial closure. It also involves the distinctive properties that differentiate /b/ from other phonemes, such as /p/, for which aspiration on the burst is required in English. In addition to respiratory and laryngeal mechanisms for maintaining and modulating air flow and phonation, the planning of a syllable as simple as a CV /ba/ would minimally require coordinated lip and tongue kinematics as well as perioral muscular activities anticipating the aerodynamics within.

To understand these coordinated patterns of muscle activity for speech production, McClean and Tasko (2002) examined the relationship between supralaryngeal kinematics and laryngeal control, and between supralaryngeal and respiratory configurations. Participants were directed to produce a designated phrase ("a bad daba") with different intensity levels and speaking rates. Their results show a strong, positive correlation between lip kinematics and laryngeal postures for different fundamental frequencies and intensity levels, although individual variation was observed across participants. A more robust correlation was observed between jaw kinematics and laryngeal configuration. McClean and Tasko (2003) further demonstrated that orofacial EMG activity is positively correlated with orofacial kinematics (i.e., distance and speed) across variations in speech rate and intensity. The authors' results also showed that lip kinematics and EMG are highly correlated with speech rate and intensity, suggesting a tight coupling between orofacial movements and laryngeal and respiratory movements. Similar results are reported in Wohlert and Hammen (2000).

Similarly, other studies have also found a tight coupling between lower lip, jaw, and laryngeal movements.
For example, Gracco and Löfqvist (1994) demonstrated that lip displacement and glottal configurations are highly correlated in terms of timing during consonant production, as are jaw lowering and glottal closure velocity during vowel production. Based on these observed tight couplings, Gracco and Löfqvist (1994) proposed that speech motor behaviours are organized on a functional basis to successfully produce phonemic segments, and that the motor actions required for a phonemic segment may be stored together as a motor program. While Gracco and Löfqvist (1994) proposed that phonemic segments are the fundamental units of speech production, they also note that these units may be combined into sequences through the use of blending motions (i.e., trajectories). Speech units, on this view, are sequential compositions of different motor programs moderated by coarticulation mechanics. Syllable construction is encompassed within the programming of units through context-dependent adjustment.

In addition to a high correlation between lip kinematics and laryngeal configurations, research has also found a direct effect of suprasegmental adjustments on lip kinematics. Kelso et al. (1985) showed that different speaking rates and stress patterns may induce different lip kinematics. Larger displacements are observed in normal speech than in fast speech; greater displacements and longer durations are observed under stress. Lip closing and opening movements are also affected by speaking rate to a variable extent (Adams et al., 1993). This observed tight coupling between orofacial kinematics and laryngeal configurations suggests that speech motor control involves motor commands across muscular and kinematic organizations, including both segmental and suprasegmental components.

If speech plans indeed pre-specify coordination at a muscular level with corresponding kinematics, a SAS-elicited response should be able to reveal the details of such a speech plan.
With only limited or no feedback correction, a SAS can reliably elicit prepared speech motor responses. The results from Chapters 2–4 showed that a prepared speech event as long as a CV sequence is subject to rapid release by SAS. Uncompromised formants in SAS-induced responses indicate that muscular coordination across oral, lingual (tongue), and laryngeal control was performed largely unperturbed by SAS. A more general view from this is that an entire CV syllable may be considered a coherent event. In Chapter 5, prepared syllables were performed with phonemic contrasts, but phonemic parameters were not independently performed or triggered, making these results inconclusive with regard to planning at a phonemic level. Moreover, Chapter 5 only examined pitch contour as one of the phonemic parameters. An alternative argument for sub-phonemic information being part of a pre-plan is implicated in studies of inner speech, for example. In inner speech (unarticulated speech), corollary discharge providing a sensory prediction of the motor command can affect listeners' perception (Scott et al., 2013). Mouthing and imagining speech tasks affect listeners' perception, based on the sensory prediction of their motor commands for the tasks. When mouthing [afa], for example, speakers are more likely to categorize a sound token from the [ava]–[aba] continuum as [ava]; in other words, they choose the token alternative in which the labiodental feature is shared between their sensory prediction (from [afa]) and their auditory perception (from [ava]). These results suggest that speech planning may include information detailed to the sub-phonemic level. With our current design, it would be too soon to conclude whether phonemic parameters can act as discrete units for speech production, and more specifically whether they are subject to rapid release by SAS. A more reasonable design for examining independently prepared and triggered phonemic parameters would be a choice RT task or a Go vs.
No Go task using SAS (e.g., Carlsen et al., 2008; Kumru et al., 2006), in which phonemic parameters may be controlled as factors and further examined as to whether they can be independently elicited.

6.2.2 Forward control

For decades in the speech motor control literature, speech production was considered to be dependent on inter-articulator coordination between planning and feedback regulation. For example, perturbation studies investigating coordination across articulators and muscles have emphasized the interdependence between forward and feedback control (e.g., Abbs and Gracco, 1984; Kelso et al., 1984). In particular, several of these studies observed compensatory responses from the upper lip when an external force perturbed the closure movements of the lower lip, suggesting a predictive, open-loop process (Abbs and Gracco, 1984; Kelso et al., 1984; Shaiman and Gracco, 2002). These findings reveal that speech motor behaviour relies on forward control and suggest that some aspects of articulation involve "preplanning" while others are automatic responses. Speech plans may be revealed when an appropriate perturbation or stimulus is delivered. Note, however, that the forward control revealed by these studies is conditional; forward control movements only arise when there is perturbation and feedback information available. In other words, speech production cannot normally be observed under forward control alone.

Inner speech and efference copy also reflect the content of planning. For example, as reported by Ylinen et al. (2014), when an auditorily presented vowel is concordant with the rehearsed item, a suppression effect is observed in the auditory cortex; when the auditorily presented vowel is different from the rehearsed item, the auditory cortex shows enhanced activity. Their findings revealed that planning may be as detailed as the level of phonemes or syllables (not distinguished by Ylinen et al.).
Efference copy studies similarly demonstrate that phonemic vowels are part of speech plans (Niziolek et al., 2013).

Along with the above studies, the results from this dissertation also reveal that prepared speech may be performed under largely forward control, i.e., when little or no feedback correction is introduced. Speech and non-speech movements are performed as intended but with shorter latencies. While there appear to be some consequences of the elicitation by SAS, such as augmented EMG, larger acoustic amplitude, elevated pitch level, and greater magnitude of lip displacement, these affected aspects of performance suggest instead that they are not central to the task as specified in the speech plan, and may thus be more susceptible to the SAS perturbation. This is also in line with the Minimal Intervention Principle, which argues that variability in task-irrelevant dimensions is allowed and feedback information is used to correct only those deviations that interfere with the intended goal of the task (Todorov and Jordan, 2002, 2003). As such, unperturbed lip movement trajectories (e.g., lip compression and bilabial burst) and phonemic details are more associated with forward control and are resistant to other corrections or variations.

6.3 Conclusion and future work

Results in this dissertation lead to several broader implications. First, as the StartReact effect is also observed in heavily cortically determined processes like speech, speech motor control may share more in common than previously believed with body motor control in terms of neurological pathways and the execution of motor commands. Second, the results not only uncover a number of aspects of oral motor control that can be pre-specified in a speech plan, but also demonstrate how these aspects may be planned and executed in forward control when only limited feedback correction is available.
Third, speech plans may be detailed to the phonemic level, whereas suprasegmental control is more susceptible to SAS perturbation.

These and other questions arising from this work can be further tested and confirmed through experimental and simulation approaches. These will call for future research.

Bibliography

Abbs, J. H. and Gracco, V. L. (1984). Control of complex motor gestures: Orofacial muscle responses to load perturbations of lip during speech. Journal of Neurophysiology, 51(4):705–723.

Adams, S. G., Weismer, G., and Kent, R. D. (1993). Speaking rate and speech movement velocity profiles. Journal of Speech and Hearing Research, 36:41–54.

Alibiglou, L. and MacKinnon, C. D. (2012). The early release of planned movement by acoustic startle can be delayed by transcranial magnetic stimulation over the motor cortex. Journal of Physiology, 590(4):919–936.

Baddeley, A. (1998). Recent developments in working memory. Current Opinion in Neurobiology, 8(2):234–238.

Baddeley, A., Lewis, V., and Vallar, G. (1984). Exploring the articulatory loop. The Quarterly Journal of Experimental Psychology Section A: Human Experimental Psychology, 36(2):233–252.

Baer, T. (1979). Reflex activation of laryngeal muscles by sudden induced subglottal pressure changes. Journal of the Acoustical Society of America, 65(5):1271–1275.

Bell, A. and Hooper, J. B. (1978). Syllables and Segments. North-Holland.

Bernhardt, B. H. and Stemberger, J. P. (1988). Handbook of Phonological Development from a Nonlinear Constraint-based Perspective. Academic Press, San Diego, CA.

Boersma, P. and Weenink, D. (2009). Praat: Doing Phonetics by Computer (version 5.1.15). [Computer program] Retrieved August 30, 2009, from http://www.praat.org/.

Bohland, J. W. and Guenther, F. H. (2006). An fMRI investigation of syllable sequence production. NeuroImage, 32:821–841.

Boucher, V. J. (2008).
Intrinsic factors of cyclical motion in speech articulators: Reappraising postulates of serial-ordering in motor-control theories. Journal of Phonetics, 36:295–307.
Bratzlavsky, M. (1979). Feedback control of human lip muscle. Experimental Neurology, 65:209–217.
Brendel, B., Erb, M., Riecker, A., Grodd, W., Ackermann, H., and Ziegler, W. (2011). Do we have a mental syllabary in the brain? An fMRI study. Motor Control, 15:34–51.
Brendel, B., Hertrich, I., Erb, M., Lindner, A., Riecker, A., Grodd, W., and Ackermann, H. (2010). The contribution of mesiofrontal cortex to the preparation and execution of repetitive syllable productions: An fMRI study. NeuroImage, 50(3):1219–1230.
Brown, P., Rothwell, J. C., Thompson, P. D., Britton, T. C., Day, B. L., and Marsden, C. D. (1991). New observations on the normal auditory startle reflex in man. Brain, 114:1891–1902.
Carlsen, A. N., Chua, R., Dakin, C. J., Sanderson, D. J., Inglis, T. J., and Franks, I. M. (2008). Startle reveals an absence of advance motor programming in a Go/No-go task. Neuroscience Letters, 434:61–65.
Carlsen, A. N., Chua, R., Inglis, T. J., Sanderson, D. J., and Franks, I. M. (2004a). Can prepared responses be stored subcortically? Experimental Brain Research, 159:301–309.
Carlsen, A. N., Chua, R., Inglis, T. J., Sanderson, D. J., and Franks, I. M. (2004b). Prepared movements are elicited early by startle. Journal of Motor Behavior, 36(3):253–264.
Carlsen, A. N., Chua, R., Inglis, T. J., Sanderson, D. J., and Franks, I. M. (2009a). Differential effects of startle on reaction time for finger and arm movements. Journal of Neurophysiology, 101:306–314.
Carlsen, A. N., Chua, R., Summers, J. J., Inglis, T. J., Sanderson, D. J., and Franks, I. M. (2009b). Precues enable multiple response preprogramming: Evidence from startle. Psychophysiology, 46(2):241–251.
Carlsen, A. N., Maslovat, D., and Franks, I. M. (2012). Preparation for voluntary movement in healthy and clinical populations: Evidence from startle.
Clinical Neurophysiology, 123:21–33.
Carlsen, A. N., Maslovat, D., Lam, M. Y., Chua, R., and Franks, I. M. (2011). Considerations for the use of a startling acoustic stimulus in studies of motor preparation in humans. Neuroscience and Biobehavioral Reviews, 35(3):366–376.
Castellote, J. M., Kumru, H., Queralt, A., and Valls-Solé, J. (2007). A startle speeds up the execution of externally guided saccades. Experimental Brain Research, 177:129–136.
Castellote, J. M., Queralt, A., and Valls-Solé, J. (2012). Preparedness for landing after a self-initiated fall. Journal of Neurophysiology, 108:2501–2508.
Chein, J. M. and Fiez, J. A. (2001). Dissociation of verbal working memory system components using a delayed serial recall task. Cerebral Cortex, 11:1003–1014.
Chiu, C. and Gick, B. (2013). Producing whole speech events: Anticipatory lip compression in bilabial stops. In International Congress of Acoustics 2013 Proceedings.
Cho, T., Jun, S.-A., and Ladefoged, P. (2002). Acoustic and aerodynamic correlates of Korean stops and fricatives. Journal of Phonetics, 30:193–228.
Cholin, J. and Levelt, W. J. M. (2009). Effects of syllable preparation and syllable frequency in speech production: Further evidence for syllabic units at a post-lexical level. Language and Cognitive Processes, 24(5):662–684.
Crevier-Buchman, L., Gendrot, C., Denby, B., Pillot-Loiseau, C., Roussel, P., Colazo-Simon, A., and Dreyfus, G. (2011). Articulatory strategies for lip and tongue movements in silent versus vocalized speech. In Proceedings of the 17th International Congress of Phonetic Sciences, pages 1–4.
d’Avella, A. and Bizzi, E. (2005). Shared and specific muscle synergies in natural motor behaviors. Proceedings of the National Academy of Sciences, 102(8):3076–3081.
Davidson, L. (2006). Comparing tongue shapes from ultrasound imaging using smoothing spline analysis of variance. Journal of the Acoustical Society of America, 120(1):407–415.
Davis, B. L., MacNeilage, P. F., and Matyear, C. L. (2002).
Acquisition of serial complexity in speech production: A comparison of phonetic and phonological approaches to first word production. Phonetica, 59:75–107.
Delattre, P. C., Liberman, A. M., and Cooper, F. S. (1955). Acoustic loci and transitional cues for consonants. Journal of the Acoustical Society of America, 27(4):769–773.
Derrick, D. (2011). Kinematic Patterning of Flaps, Taps and Rhotics in English. PhD thesis, University of British Columbia.
Derrick, D. and Schultz, B. (2013). Acoustic correlates of flaps in North American English. In Proceedings of Meetings on Acoustics, volume 19, page 060260.
Deutsch, D., Henthorn, T., Marvin, E., and Xu, H. (2006). Absolute pitch among American and Chinese conservatory students: Prevalence differences, and evidence for a speech-related critical period. Journal of the Acoustical Society of America, 119(2):719–722.
Deutsch, D., Le, J., Shen, J., and Henthorn, T. (2009). The pitch levels of female speech in two Chinese villages. Journal of the Acoustical Society of America Express Letters, 125(5):EL208–EL213.
Fant, G. (1960). Acoustic Theory of Speech Production. The Hague: Mouton.
Ferrand, L. and Segui, J. (1998). The syllable’s role in speech production: Are syllables chunks, schemas, or both? Psychonomic Bulletin and Review, 5(2):253–258.
Forgaard, C. J., Maslovat, D., Carlsen, A. N., Chua, R., and Franks, I. M. (2013). Startle reveals independent preparation and initiation of triphasic EMG burst components in targeted ballistic movements. Journal of Neurophysiology, 110:2129–2139.
Gick, B., Francis, N., Chiu, C., Stavness, I., and Fels, S. (2012). Producing whole speech events: Differential facial stiffness across the labial stops. The Journal of the Acoustical Society of America, 131(4):3345.
Gracco, V. L. and Löfqvist, A. (1994). Speech motor coordination and control: Evidence from lip, jaw, and laryngeal movements. Journal of Neuroscience, 14(11):6585–6597.
Gray, H. (2000). Anatomy of the Human Body.
Philadelphia: Lea and Febiger, 1918; Bartleby.com, 2000. http://www.bartleby.com/107/, 20th edition.
Grimme, B., Fuchs, S., Perrier, P., and Schöner, G. (2011). Limb versus speech motor control: A conceptual review. Motor Control, 15:5–33.
Guenther, F. H. (2006). Cortical interactions underlying the production of speech sounds. Journal of Communication Disorders, 39:350–365.
Guenther, F. H., Ghosh, S. S., and Tourville, J. A. (2006). Neural modeling and imaging of the cortical interactions underlying syllable production. Brain and Language, 96(3):280–301.
Hain, T. C., Burnett, T. A., Kiran, S., Larson, C. R., Singh, S., and Kenney, M. K. (2000). Instructing subjects to make a voluntary response reveals the presence of two components to the audio-vocal reflex. Experimental Brain Research, 130:133–141.
Hickok, G. and Poeppel, D. (2000). Towards a functional neuroanatomy of speech perception. Trends in Cognitive Sciences, 4(4):131–138.
Hickok, G. and Poeppel, D. (2004). Dorsal and ventral streams: A framework for understanding aspects of the functional anatomy of language. Cognition, 92:67–99.
Hickok, G. and Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8(5):393–402.
Honeycutt, C. F., Kharouta, M., and Perreault, E. J. (2013). Evidence for reticulospinal contributions to coordinated finger movements in humans. Journal of Neurophysiology, 110:1476–1483.
Houde, J. and Jordan, M. I. (2002). Sensorimotor adaptation of speech I: Compensation and adaptation. Journal of Speech, Language, and Hearing Research, 45:295–310.
Indefrey, P. and Levelt, W. J. M. (2004). The spatial and temporal signatures of word production components. Cognition, 92(1-2):101–144.
Iwata, K., Yagi, J., Tsuboi, Y., Koshikawa, N., Sumino, R., and Cools, A. R. (1996). Anatomical connections of the ventral but not the dorsal part of the striatum with the parvicellular reticular formation: Implications for the anatomical substrate of oral movements.
Neuroscience Research Communications, 18(2):71–78.
Jacquemot, C. and Scott, S. K. (2006). What is the relationship between phonological short-term memory and speech processing? Trends in Cognitive Sciences, 10(11):480–486.
Janke, M., Wand, M., and Schultz, T. (2010). Impact of lack of acoustic feedback in EMG-based silent speech recognition. In Interspeech, pages 2686–2689.
Jones, J. A. and Munhall, K. G. (2002). The role of auditory feedback during phonation: Studies of Mandarin tone production. Journal of Phonetics, 30:303–320.
Jürgens, U. (2002). Neural pathways underlying vocal control. Neuroscience and Biobehavioral Reviews, 26:235–258.
Keele, S. W. (1981). Behavioral analysis of movement. In Brooks, V., editor, Handbook of Physiology, Sec. 1: The Nervous System, Vol. 2: Motor Control, pages 1391–1414. Baltimore, MD: Williams & Wilkins.
Kelso, J. A., Tuller, B., Vatikiotis-Bateson, E., and Fowler, C. (1984). Functionally specific articulatory cooperation following jaw perturbations during speech: Evidence for coordinative structures. Journal of Experimental Psychology: Human Perception and Performance, 10(6):812–832.
Kelso, J. A., Vatikiotis-Bateson, E., Saltzman, E., and Kay, B. (1985). A qualitative dynamic analysis of reiterant speech production: Phase portraits, kinematics, and dynamic modeling. Journal of the Acoustical Society of America, 77(1):266–280.
Klapp, S. T. (1977). Reaction time analysis of programmed control. Exercise and Sport Sciences Reviews, 5:231–253.
Klapp, S. T. (1995). Motor response programming during simple and choice reaction time: The role of practice. Journal of Experimental Psychology: Human Perception and Performance, 21(5):1015–1027.
Klapp, S. T. (2003). Reaction time analysis of two types of motor preparation for speech articulation: Actions as a sequence of chunks. Journal of Motor Behavior, 35(2):135–150.
Kumru, H., Urra, X., Compta, Y., Castellote, J. M., Turbau, J., and Valls-Solé, J. (2006).
Excitability of subcortical motor circuits in Go/noGo and forced choice reaction time tasks. Neuroscience Letters, 406:66–70.
Larson, C. R., Altman, K. W., Liu, H., and Hain, T. C. (2008). Interactions between auditory and somatosensory feedback for voice F0 control. Experimental Brain Research, 187:613–621.
Lashley, K. S. (1951). The problem of serial order in behavior. In Jeffress, L. A., editor, Cerebral Mechanisms in Behavior, pages 112–131. New York: Wiley.
Levelt, W. J. M., Roelofs, A., and Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22:1–75.
Levelt, W. J. M. and Wheeldon, L. (1994). Do speakers have access to a mental syllabary? Cognition, 50:239–269.
Liu, H.-M., Tsao, F.-M., and Kuhl, P. K. (2007). Acoustic analysis of lexical tone in Mandarin infant-directed speech. Developmental Psychology, 43(4):912–917.
Löfqvist, A. and Gracco, V. L. (1997). Lip and jaw kinematics in bilabial stop consonant production. Journal of Speech, Language, and Hearing Research, 40:877–893.
Maas, E., Robin, D. A., Wright, D. L., and Ballard, K. J. (2008). Motor programming in apraxia of speech. Brain and Language, 106:107–118.
MacKinnon, C. D., Bissig, D., Chiusano, J., Miller, E., Rudnick, L., Jager, C., Zhang, Y., Mille, M.-L., and Rogers, M. W. (2007). Preparation of anticipatory postural adjustments prior to stepping. Journal of Neurophysiology, 97:4368–4379.
MacNeilage, P. F. (1998). The frame/content theory of evolution of speech production. Behavioral and Brain Sciences, 21:499–546.
MacNeilage, P. F., Davis, B. L., Kinney, A., and Matyear, C. L. (2000). The motor core of speech: A comparison of serial organization patterns in infants and languages. Child Development, 71(1):153–163.
Maslovat, D., Carlsen, A. N., Chua, R., and Franks, I. M. (2009). Response preparation changes during practice of an asynchronous bimanual movement. Experimental Brain Research, pages 383–392.
Maslovat, D., Hodges, N. J., Chua, R., and Franks, I. M.
(2011). Motor preparation and the effects of practice: Evidence from startle. Behavioral Neuroscience, 125(2):226–240.
Maslovat, D., Klapp, S. T., Jagacinski, R. J., and Franks, I. M. (2014). Control of response timing occurs during the simple reaction time interval but on-line for choice reaction time. Journal of Experimental Psychology: Human Perception and Performance, 40(5):2005–2021.
McClean, M. D. (1991). Lip muscle reflex and intentional response levels in a simple speech task. Experimental Brain Research, 87:662–670.
McClean, M. D. and Clay, J. L. (1995). Activation of lip motor units with variations in speech rate and phonetic structure. Journal of Speech and Hearing Research, 38(4):772–788.
McClean, M. D. and Tasko, S. M. (2002). Association of orofacial with laryngeal and respiratory motor output during speech. Experimental Brain Research, 146:481–489.
McClean, M. D. and Tasko, S. M. (2003). Association of orofacial muscle activity and movement during changes in speech rate and intensity. Journal of Speech, Language, and Hearing Research, 46:1387–1400.
McDonald, J. H. (2009). Handbook of Biological Statistics. Sparky House Publishing.
Meyer, B. U., Werhahn, K., Rothwell, J. C., Roericht, S., and Fauth, C. (1994). Functional organisation of corticonuclear pathways to motoneurones of lower facial muscles in man. Experimental Brain Research, 101:465–472.
Murphy, K., Corfield, D. R., Guz, A., Fink, G. R., Wise, R. J. S., Harrison, J., and Adams, L. (1997). Cerebral areas associated with motor control of speech in humans. Journal of Applied Physiology, 83:1438–1447.
Niziolek, C. A., Nagarajan, S. S., and Houde, J. F. (2013). What does motor efference copy represent? Evidence from speech production. Journal of Neuroscience, 33(41):16110–16116.
Nonnekes, J., Oude Nijhuis, L. B., de Niet, M., de Bot, S. T., Pasman, J. W., van de Warrenburg, B. P. C., Bloem, B. R., Weerdesteyn, V., and Geurts, A. C. (2014).
StartReact restores reaction time in HSP: Evidence for subcortical release of a motor program. Journal of Neuroscience, 34(1):275–281.
Oude Nijhuis, L. B., Janssen, L., Bloem, B. R., Gert van Dijk, J., Gielen, S. C., Borm, G. F., and Overeem, S. (2007). Choice reaction times for human head rotations are shortened by startling acoustic stimuli, irrespective of stimulus direction. Journal of Physiology, 584(1):97–109.
Papoutsi, M., Zwart, J. A., Jansma, J. M., Pickering, M. J., Bednar, J. A., and Horwitz, B. (2009). From phonemes to articulatory codes: An fMRI study of the role of Broca’s area in speech production. Cerebral Cortex, 19:2156–2165.
Perrier, P., Ostry, D. J., and Laboissière, R. (1996). The equilibrium point hypothesis and its application to speech motor control. Journal of Speech and Hearing Research, 39(2):365–378.
Peterson, G. E. and Barney, H. L. (1952). Control methods used in a study of the vowels. Journal of the Acoustical Society of America, 24(2):175–184.
Pickering, M. J. and Garrod, S. (2013). An integrated theory of language production and comprehension. Behavioral and Brain Sciences, 36:329–392.
Riecker, A., Mathiak, K., Wildgruber, D., Erb, M., Hertrich, I., Grodd, W., and Ackermann, H. (2005). fMRI reveals two distinct cerebral networks subserving speech motor control. Neurology, pages 700–706.
Sanguineti, V., Laboissière, R., and Ostry, D. J. (1998). A dynamic biomechanical model for neural control of speech production. Journal of the Acoustical Society of America, 103(3):1615–1627.
Schroeder, C. E. and Foxe, J. J. (2002). The timing and laminar profile of converging inputs to multisensory areas of the macaque neocortex. Cognitive Brain Research, 14:187–198.
Schuhmann, T., Schiller, N. O., Goebel, R., and Sack, A. T. (2009). The temporal characteristics of functional activation of Broca’s area during overt picture naming. Cortex, 45:1111–1116.
Scott, M., Yeung, H. H., Gick, B., and Werker, J. F. (2013).
Inner speech captures the perception of external speech. Journal of the Acoustical Society of America Express Letters, 133(4):EL286–EL292.
Shaiman, S. and Gracco, V. L. (2002). Task-specific sensorimotor interactions in speech articulation. Experimental Brain Research, 146:411–418.
Siegmund, G. P., Inglis, T. J., and Sanderson, D. J. (2001). Startle response of human neck muscles sculpted by readiness to perform ballistic head movements. Journal of Physiology, 535(1):289–300.
Simonyan, K. and Horwitz, B. (2011). Laryngeal motor cortex and control of speech in humans. Neuroscientist, 17(2):197–208.
Stevenson, A. J. (2011). Examining cortical involvement in the StartReact effect using TMS. Master’s thesis, University of British Columbia.
Stevenson, A. J., Chiu, C., Maslovat, D., Chua, R., Gick, B., Blouin, J.-S., and Franks, I. M. (2014). Cortical involvement in the StartReact effect. Neuroscience, 269:21–34.
Stockard, J. J., Stockard, J. E., and Sharbrough, F. W. (1977). Detection and localization of occult lesions with brain-stem auditory responses. Mayo Clinic Proceedings, 52(12):761–769.
Terao, Y., Ugawa, Y., Enomoto, H., Furubayashi, T., Shiio, Y., Machii, K., Hanajima, R., Nishikawa, M., Iwata, N. K., Saito, Y., and Kanazawa, I. (2001). Hemispheric lateralization in the cortical motor preparation for human vocalization. Journal of Neuroscience, 21(5):1600–1609.
Tian, X. and Poeppel, D. (2012). Mental imagery of speech: Linking motor and perceptual systems through internal simulation and estimation. Frontiers in Human Neuroscience, 6:1–11.
Todorov, E. and Jordan, M. I. (2002). Optimal feedback control as a theory of motor coordination. Nature Neuroscience, 5(11):1226–1235.
Todorov, E. and Jordan, M. I. (2003). A minimal intervention principle for coordinated movement. In Advances in Neural Information Processing Systems, volume 15, pages 27–34. MIT Press.
Tourville, J. A. and Guenther, F. H. (2011). The DIVA model: A neural theory of speech acquisition and production.
Language and Cognitive Processes, 26(7):952–981.
Tremblay, S., Shiller, D. M., and Ostry, D. J. (2003). Somatosensory basis of speech production. Nature, 423:866–869.
Uemura, N., Tanaka, M., and Kawazoe, T. (2008). Study on motor learning of sternocleidomastoid muscles during ballistic voluntary opening. Journal of the Japan Prosthodontic Society, 52:494–500.
Valls-Solé, J., Kumru, H., and Kofler, M. (2008). Interaction between startle and voluntary reactions in humans. Experimental Brain Research, 187:497–507.
Valls-Solé, J., Rothwell, J. C., Goulart, F., Cossu, G., and Muñoz, E. (1999). Patterned ballistic movements triggered by a startle in healthy humans. Journal of Physiology, 516(3):931–938.
Wadman, W. J., Denier Van Der Gon, J. J., Geuze, G. H., and Mol, C. R. (1979). Control of fast goal-directed arm movements. Journal of Human Movement Studies, 5:3–17.
Wand, M., Jou, S.-C., Toth, A. R., and Schultz, T. (2009). Impact of different speaking modes on EMG-based speech recognition. In Interspeech, pages 648–651.
Wohlert, A. B. and Hammen, V. L. (2000). Lip muscle activity related to speech rate and loudness. Journal of Speech, Language, and Hearing Research, 43:1229–1239.
Xu, Y., Larson, C. R., Bauer, J. J., and Hain, T. C. (2004). Compensation for pitch-shifted auditory feedback during the production of Mandarin tone sequences. Journal of the Acoustical Society of America, 116(2):1168–1178.
Yeomans, J. S. and Frankland, P. W. (1995). The acoustic startle reflex: Neurons and connections. Brain Research Reviews, 21(3):301–314.
Ylinen, S., Nora, A., Leminen, A., Hakala, T., Huotilainen, M., Shtyrov, Y., Mäkelä, J. P., and Service, E. (2014). Two distinct auditory-motor circuits for monitoring speech production as revealed by content-specific suppression of auditory cortex.
Cerebral Cortex, doi:10.1093/cercor/bht351.

Appendix A

Chapter 3 additional figures

To further examine the phenomenon of lip compression in Mouthed and Non-speech responses, the frequency of lip compression was analyzed. Figures A.1 and A.2 depict the numbers of trials with lip compression occurring in the first and second halves of the block. Overall, more lip compression was observed in Mouthed responses (Figure A.1) than in Non-speech responses (Figure A.2), although some individuals showed the opposite pattern (e.g., subject 8). As shown by the figures, the frequency of trials with lip compression is comparable in the first and second halves; training effects from the first half of the block did not influence lip compression frequency in the second half.

Figure A.1: Number of trials with lip compression in the Mouthed condition. The bars in black represent trials occurring in the first half of the block; the grey bars represent trials occurring in the second half of the block.

Figure A.2: Number of trials with lip compression in the Non-speech condition. The bars in black represent trials occurring in the first half of the block; the grey bars represent trials occurring in the second half of the block.
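The half-block tally behind the appendix figures can be sketched in a few lines. This is an illustrative reconstruction, not the original analysis script: the trial flags below are invented, and the helper name `compression_counts_by_half` is hypothetical.

```python
def compression_counts_by_half(trials):
    """Split one subject's ordered trial flags (True = lip compression
    observed) into first and second halves, counting compressions in each."""
    mid = len(trials) // 2
    return sum(trials[:mid]), sum(trials[mid:])

# Invented example: 10 ordered trials for one hypothetical subject
trials = [True, False, True, True, False, True, False, True, True, False]
first, second = compression_counts_by_half(trials)
print(first, second)  # -> 3 3
```

Comparable counts in the two halves, as in this toy example, are what the figures show for most subjects, i.e., no training effect across the block.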
