Open Collections
UBC Theses and Dissertations
Representation learning for Arabic dialect identification
Sullivan, Peter
Abstract
Arabic dialect identification (ADI) is an important component of the Arabic speech processing pipeline, particularly for dialectal Arabic automatic speech recognition (ASR) models. In this work, we present an overview of corpora and methods applicable to both ADI and dialectal Arabic ASR, then benchmark two approaches to using pre-trained speech representation models for ADI: first, direct fine-tuning; second, using fixed representations extracted from pre-trained models as an intermediate step in the ADI process. We train and evaluate our models on the granular ADI-17 Arabic dialect corpus (92% F1 for our fine-tuned HuBERT model), and further probe generalization by evaluating our trained models on the coarse-grained ADI-5 (80% F1 for fine-tuned HuBERT).
Item Metadata
Title |
Representation learning for Arabic dialect identification
|
Creator |
Sullivan, Peter
|
Supervisor | |
Publisher |
University of British Columbia
|
Date Issued |
2022
|
Genre | |
Type | |
Language |
eng
|
Date Available |
2022-08-22
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
Attribution-NonCommercial-NoDerivatives 4.0 International
|
DOI |
10.14288/1.0417468
|
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor |
University of British Columbia
|
Graduation Date |
2022-11
|
Campus | |
Scholarly Level |
Graduate
|
Rights URI | |
Aggregated Source Repository |
DSpace
|