- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Faculty Research and Publications /
- A Survey of Grapheme-to-Phoneme Conversion Methods
Open Collections
UBC Faculty Research and Publications
A Survey of Grapheme-to-Phoneme Conversion Methods Cheng, Shiyang; Zhu, Pengcheng; Liu, Jueting; Wang, Zehua
Abstract
Grapheme-to-phoneme conversion (G2P) is the task of converting letters (grapheme sequences) into their pronunciations (phoneme sequences). It plays a crucial role in natural language processing, text-to-speech synthesis, and automatic speech recognition systems. This paper provides a systematical overview of the G2P conversion from different perspectives. The conversion methods are first presented in the paper; detailed discussions are conducted on methods based on deep learning technology. For each method, the key ideas, advantages, disadvantages, and representative models are summarized. This paper then mentioned the learning strategies and multilingual G2P conversions. Finally, this paper summarized the commonly used monolingual and multilingual datasets, including Mandarin, Japanese, Arabic, etc. Two tables illustrated the performance of various methods with relative datasets. After making a general overall of G2P conversion, this paper concluded with the current issues and the future directions of deep learning-based G2P conversion.
Item Metadata
Title |
A Survey of Grapheme-to-Phoneme Conversion Methods
|
Creator | |
Publisher |
Multidisciplinary Digital Publishing Institute
|
Date Issued |
2024-12-17
|
Description |
Grapheme-to-phoneme conversion (G2P) is the task of converting letters (grapheme sequences) into their pronunciations (phoneme sequences). It plays a crucial role in natural language processing, text-to-speech synthesis, and automatic speech recognition systems. This paper provides a systematical overview of the G2P conversion from different perspectives. The conversion methods are first presented in the paper; detailed discussions are conducted on methods based on deep learning technology. For each method, the key ideas, advantages, disadvantages, and representative models are summarized. This paper then mentioned the learning strategies and multilingual G2P conversions. Finally, this paper summarized the commonly used monolingual and multilingual datasets, including Mandarin, Japanese, Arabic, etc. Two tables illustrated the performance of various methods with relative datasets. After making a general overall of G2P conversion, this paper concluded with the current issues and the future directions of deep learning-based G2P conversion.
|
Subject | |
Genre | |
Type | |
Language |
eng
|
Date Available |
2025-01-10
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
CC BY 4.0
|
DOI |
10.14288/1.0447724
|
URI | |
Affiliation | |
Citation |
Applied Sciences 14 (24): 11790 (2024)
|
Publisher DOI |
10.3390/app142411790
|
Peer Review Status |
Reviewed
|
Scholarly Level |
Faculty; Researcher
|
Rights URI | |
Aggregated Source Repository |
DSpace
|
Item Media
Item Citations and Data
Rights
CC BY 4.0