UBC Theses and Dissertations

Crosslinguistic similarity and structured variation in Cantonese-English bilingual speech production

Johnson, Khia Anne


Bilingual speech production is highly variable. This variability arises from numerous sources, ranging from the heterogeneity of linguistic experiences to crosslinguistic influence and more. This area has historically been challenging to study, given the relative lack of high-quality bilingual speech corpora and of the scientific inquiry that such resources enable. This dissertation introduces the SpiCE corpus of bilingual Speech in Cantonese and English and reports on two studies with the corpus. Chapter 2 describes how SpiCE was designed, collected, transcribed, and annotated. Broadly, it comprises recordings of 34 early Cantonese-English bilinguals conversing in both languages, hand-corrected orthographic transcripts, and force-aligned phone-level annotations. Chapters 3 and 4 are motivated by a desire to understand how crosslinguistic similarity shapes phonetic variation in speech production. Chapter 3 addresses this question at the level of voice. Using 24 filter- and source-based acoustic measurements over all voiced speech in the interviews, principal components and canonical redundancy analyses demonstrate that while talkers vary in the degree to which they have the same "voice" across languages, all talkers show strong similarity with themselves. To a lesser extent, talkers exhibit similarities with one another, providing further support for prototype models of voice. Chapter 4 pivots to the level of sound categories. Prior work in this area emphasizes detecting crosslinguistic influence for phonetically distinct yet phonologically similar sounds. This chapter leverages the uniformity framework to assess underlying phonetic similarity for the long-lag stop series in Cantonese and English. Results indicate moderate patterns of uniformity within and across languages but suggest that a slightly coarser view of uniformity is more appropriate. Additionally, there was a clear difference across languages, supporting simultaneous roles for talker and language.
Together, Chapters 3 and 4 give shape to how crosslinguistic similarity is structured and offer a solid ground for generating perceptual hypotheses for areas like multilingual talker identification. Altogether, this dissertation provides a novel resource and highlights the importance of corpus research, both for understanding production processes and for guiding perception research.

Attribution-NonCommercial-NoDerivatives 4.0 International