SpiCE: Speech in Cantonese and English
Johnson, Khia A.
Description
This is the Speech in Cantonese and English (SpiCE) corpus. SpiCE is an audio corpus of conversational Cantonese-English bilingual speech recorded in Vancouver, Canada between 2018 and 2020. The corpus includes high-quality recordings of 34 early bilinguals in both English and Cantonese. Participants completed a sentence reading task, a storyboard narration, and a conversational interview in each language. For each talker, the tasks for a given language are combined in a single audio file. A Praat TextGrid file accompanies each audio file. The TextGrids provide hand-corrected orthographic transcription and phoneme-level forced alignment in Cantonese and English. As an open-access language resource, SpiCE will promote bilingualism research for a typologically distinct pair of languages, of which Cantonese remains understudied despite having millions of speakers around the world. The SpiCE corpus is especially well suited for phonetic research on conversational speech, and enables researchers to study cross-language within-speaker phenomena for a diverse group of early Cantonese-English bilinguals. These are areas with few existing high-quality resources. Corpus documentation is available at https://spice-corpus.readthedocs.io/.
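Because each audio file is paired with a Praat TextGrid, the time-aligned labels can be read with a few lines of code. The following is a minimal sketch, not part of the corpus tooling: it assumes the TextGrid is in Praat's long text format and has already been decoded to a string, and the tier names `word` and `phone` in the sample are hypothetical placeholders rather than the corpus's actual tier names (see the corpus documentation for those).

```python
import re
from typing import List, NamedTuple


class Interval(NamedTuple):
    tier: str     # name of the tier the interval belongs to
    start: float  # start time in seconds
    end: float    # end time in seconds
    label: str    # transcription or phone label ("" for unlabeled spans)


# One interval entry in Praat's long text format. Praat escapes a double
# quote inside a label by doubling it, hence the (?:[^"]|"")* pattern.
_INTERVAL = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([\d.eE+-]+)\s*'
    r'xmax = ([\d.eE+-]+)\s*'
    r'text = "((?:[^"]|"")*)"'
)


def read_intervals(textgrid_text: str) -> List[Interval]:
    """Return every interval from every interval tier, in file order."""
    rows = []
    # Split on "item [n]:" headers; the first chunk is the file header.
    for chunk in re.split(r'\n\s*item \[\d+\]:', textgrid_text)[1:]:
        name_match = re.search(r'name = "(.*?)"', chunk)
        tier = name_match.group(1) if name_match else ""
        for m in _INTERVAL.finditer(chunk):
            label = m.group(3).replace('""', '"')
            rows.append(Interval(tier, float(m.group(1)), float(m.group(2)), label))
    return rows


# Hand-written sample for illustration only; it is not corpus data.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 1.0
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "word"
        xmin = 0
        xmax = 1.0
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.4
            text = ""
        intervals [2]:
            xmin = 0.4
            xmax = 1.0
            text = "hello"
    item [2]:
        class = "IntervalTier"
        name = "phone"
        xmin = 0
        xmax = 1.0
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 1.0
            text = "HH"
'''

for row in read_intervals(SAMPLE):
    print(row)
```

For a real file, the text would first be read from disk with the appropriate encoding (Praat can write UTF-8 or UTF-16, so this may need checking per file). A dedicated TextGrid library is likely a better choice for anything beyond quick inspection.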
Item Metadata
Title | SpiCE: Speech in Cantonese and English
Alternate Title | A transcribed audio corpus of conversational Cantonese-English bilingual speech
Creator |
Contributor |
Date Issued | 2021-05-20
Subject |
Type |
Language | Chinese; English
Date Available | 2021-01-27
Provider | University of British Columbia Library
License | CC-BY 4.0
DOI | 10.14288/1.0398086
URI |
Publisher DOI |
Grant Funding Agency | Social Sciences and Humanities Research Council
Rights URI |
Aggregated Source Repository | Dataverse