Bayesian models of learning and generating inflectional morphology

UBC Theses and Dissertations

Bayesian models of learning and generating inflectional morphology Allen, Blake H.

Abstract

In many languages of the world, the form of individual words can undergo systematic variation in order to express concepts including tense, gender, and relative social status. Accurate models of these inflectional systems, such as verb conjugation and noun declension systems, are indispensable for purposes of both language research and language technology development. This dissertation presents a theoretical framework for understanding and predicting native speakers’ use of their languages’ inflectional systems. I propose a probabilistic interpretation of the task that speakers face when inferring unfamiliar inflected forms, and I argue in favor of a Bayesian approach to modeling this task. Specifically, I develop the theory of sublexical morphology, which augments the Bayesian approach with intuitive methods for calculating necessary probabilities. Sublexical morphology also possesses the virtue of computational implementability: this dissertation defines all data structures used in sublexical morphology, and it specifies the procedures necessary to use a model for morphological inference. I provide along with this dissertation a Python package that implements all the classes and methods necessary to perform inference with a sublexical morphology model. I also describe an implemented learning algorithm that allows induction of sublexical morphology models from labeled but unparsed training data. As empirical support for my core claims, I describe the outcomes of two behavioral experiments. Evidence from a test of Icelandic speakers’ inflection of novel words demonstrates that speakers are able to additively make use of information from multiple provided inflected forms of a word, and evidence from a similar test on Polish speakers suggests that speakers may be limited to this additive way of combining such pieces of information. 
In clear support of a Bayesian interpretation of morphological inference, both experiments additionally demonstrate that prior probabilities—understood as reflecting lexical frequencies of different groupings of words—play a major role in speakers’ use of their inflectional systems. This is shown to be true even when influence from prior probabilities results in speakers apparently deviating from exceptionless lexical patterns in those systems.
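To make the role of prior probabilities concrete, the following is a minimal sketch of Bayesian inference over groupings of words. It is not the dissertation's Python package or its API: the function name `posterior`, the group labels, and all probability values are hypothetical and chosen only for illustration. It also assumes one possible reading of "additive" combination, namely that each provided inflected form contributes an independent likelihood term, so the log-likelihoods add.

```python
import math


def posterior(priors, likelihoods_per_form):
    """Return P(group | observed forms) for every group.

    priors: dict mapping group name -> prior probability
        (hypothetically, the share of the lexicon in that group).
    likelihoods_per_form: list with one dict per observed inflected form,
        each mapping group name -> P(form | group).
    All probabilities are assumed to be strictly positive.
    """
    log_scores = {}
    for group, prior in priors.items():
        score = math.log(prior)
        # Assumption: forms are conditionally independent given the group,
        # so each form adds one log-likelihood term.
        for form_likelihoods in likelihoods_per_form:
            score += math.log(form_likelihoods[group])
        log_scores[group] = score

    # Normalize in a numerically stable way (log-sum-exp).
    peak = max(log_scores.values())
    total = sum(math.exp(s - peak) for s in log_scores.values())
    return {g: math.exp(s - peak) / total for g, s in log_scores.items()}


# Hypothetical numbers: class_a is far more frequent in the lexicon,
# but both observed forms fit class_b better.
priors = {"class_a": 0.8, "class_b": 0.2}
form_1 = {"class_a": 0.3, "class_b": 0.6}
form_2 = {"class_a": 0.2, "class_b": 0.7}

one_form = posterior(priors, [form_1])
two_forms = posterior(priors, [form_1, form_2])
```

With these illustrative values, a single form is not enough to overcome the prior, so `class_a` stays more probable even though the form fits `class_b` better. Adding the second form tips the posterior toward `class_b`.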

Item Metadata

Title	Bayesian models of learning and generating inflectional morphology
Creator	Allen, Blake H.
Publisher	University of British Columbia
Date Issued	2016
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2016-10-13
Provider	Vancouver : University of British Columbia Library
Rights	Attribution 4.0 International
DOI	10.14288/1.0319124
URI	http://hdl.handle.net/2429/59429
Degree (Theses)	Doctor of Philosophy - PhD
Program (Theses)	Linguistics
Affiliation	Arts, Faculty of; Linguistics, Department of
Degree Grantor	University of British Columbia
Graduation Date	2016-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by/4.0/
Aggregated Source Repository	DSpace
