TY - THES
AU - Leroux, Brian
PY - 1989
TI - Maximum likelihood estimation for mixture distributions and hidden Markov models
KW - Thesis/Dissertation
LA - eng
M3 - Text
AB - This thesis deals with computational and theoretical aspects of maximum likelihood estimation for data from a mixture model and a hidden Markov model.
A maximum penalized likelihood method is proposed for estimating the number of components in a mixture distribution. This method produces a consistent estimator of the unknown mixing distribution, in the sense of weak convergence of distribution functions. The proof of this result consists of establishing consistency results concerning maximum likelihood estimators (which allow an unrestricted number of components) and constrained maximum likelihood estimators (which assume a fixed finite number of components). In particular, a new proof of the consistency of maximum likelihood estimators is given. Also, the large sample limits of a sequence of constrained maximum likelihood estimators are identified as those distributions minimizing Kullback-Leibler divergence from the true distribution. If the number of components of the true mixture distribution is not greater than the assumed number, the constrained maximum likelihood estimator is consistent in the sense of weak convergence. If the assumed number of components is exactly correct, the estimators of the parameters which define the mixing distribution are also consistent (in a certain sense).
An algorithm for computation of maximum likelihood estimates (and the maximum penalized likelihood estimate) is given. The EM algorithm is used to locate local maxima of the likelihood function and a method of automatically generating "good" starting values for each possible number of components is incorporated. The estimation of a Poisson mixture distribution is illustrated using a distribution of traffic accidents in a population and a sequence of observations of fetal movements.
One way of looking at the finite mixture model is as a random sample of "states" from a mixing distribution and a sequence of conditionally independent observed variables with distributions determined by the states. In the hidden Markov model considered here, the sequence of states is modelled by a Markov chain. The use of the EM algorithm for finding local maxima of the likelihood function for the hidden Markov model is described. Problems arising in the implementation of the algorithm are discussed, including the automatic generation of starting values and a necessary adjustment to the forward-backward equations. The algorithm is applied, with Poisson component distributions, to the sequence of observations of fetal movements.
The consistency of the maximum likelihood estimator for the hidden Markov model is proved. The proof requires the consideration of identifiability, ergodicity, entropy, cross-entropy, and convergence of the log-likelihood function. For instance, the conclusion of the Shannon-McMillan-Breiman theorem on entropy convergence is established for hidden Markov models.
A class of doubly stochastic Poisson processes which corresponds to a continuous time version of the hidden Markov model is also considered. We discuss some preliminary work on the extension of the EM algorithm to these processes, and also the possibility of applying our method of proof of consistency of maximum likelihood estimators.
UR - https://open.library.ubc.ca/collections/831/items/1.0098216
ER - 