Statistical modelling and inference for discrete and censored familial data

UBC Theses and Dissertations

Statistical modelling and inference for discrete and censored familial data Zhao, Yinshan

Abstract

The analysis of familial data with quantitative traits based on the multivariate normal distribution has been well studied. However, little attention has been devoted to traits that do not have a multivariate normal distribution, such as traits with discrete or censored values. In this thesis, we (1) construct models for familial data when the trait value is discrete and/or censored, and (2) study alternative estimation methods for cases where maximum likelihood estimation is infeasible. We discuss two existing classes of models: models with multivariate normally distributed random effects, and models constructed from the multivariate normal copula. These two classes include a variety of models that can be applied to familial data. We also propose another class of models, which we call conditional independence models. This type of model is based on a conditional independence assumption: for a trait variable, we assume that a pair of non-sibling relatives are independent conditional on their parents, so that the dependence structure is built on the Markov property. Maximum likelihood estimates are generally difficult to obtain for random effect models and copula models when large families are involved. We propose two estimation procedures based on composite likelihoods. The first is a two-stage method in which the univariate marginal parameters are estimated from the univariate marginal distributions, and the dependence parameters are then estimated separately from the bivariate marginal distributions with the marginal parameters treated as known. In the second, all the parameters are estimated using the likelihoods of the bivariate marginal distributions. The composite likelihood methods can greatly reduce the computation needed for parameter estimation, but at the cost of some loss of efficiency.
In this thesis, extensive investigations based on asymptotic covariance matrices and simulations were carried out to compare the asymptotic efficiency of these two procedures with that of the maximum likelihood method. In our efficiency comparisons, we investigate the multivariate normal model for a continuous trait, the multivariate probit model for a binary trait, the multivariate Poisson-lognormal mixture model for a count trait, and the multivariate lognormal model for a censored variable. We found that when the dependence is strong, the first approach is inefficient for the regression parameters, whereas when the dependence is weak, the second approach is inefficient for the dependence parameters. In many familial analyses, quantifying familial association is of great interest. For a binary trait, the odds ratio may be used as a measure of association between a parent-offspring pair or a sibling pair. We develop theory so that the asymptotic variance of an odds ratio can be computed from a 2 x 2 contingency table formed by dependent pairs.
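
To make the two-stage composite likelihood idea in the abstract concrete, the following is a minimal sketch for the simplest case it mentions, a multivariate normal trait. It is not the thesis's code. It assumes, purely for illustration, a single mean, a single variance and one exchangeable within-family correlation rho with no covariates; the function names (simulate_families, stage_one, stage_two, pairwise_loglik) and the golden-section search are hypothetical choices, not taken from the thesis.

```python
import math
import random
from itertools import combinations


def simulate_families(n_families, size, mu, sigma, rho, seed=0):
    """Simulate exchangeable normal traits via a shared family effect (needs 0 <= rho <= 1)."""
    rng = random.Random(seed)
    families = []
    for _ in range(n_families):
        shared = rng.gauss(0.0, 1.0)
        members = [
            mu + sigma * (math.sqrt(rho) * shared
                          + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
            for _ in range(size)
        ]
        families.append(members)
    return families


def stage_one(families):
    """Stage 1: estimate the marginal mean and SD from the univariate margins only."""
    values = [y for fam in families for y in fam]
    mu_hat = sum(values) / len(values)
    var_hat = sum((y - mu_hat) ** 2 for y in values) / len(values)
    return mu_hat, math.sqrt(var_hat)


def pairwise_loglik(rho, pairs):
    """Sum of standard bivariate normal log-densities over standardised pairs."""
    one_minus = 1.0 - rho * rho
    total = 0.0
    for z1, z2 in pairs:
        quad = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / one_minus
        total += -math.log(2.0 * math.pi) - 0.5 * math.log(one_minus) - 0.5 * quad
    return total


def stage_two(families, mu_hat, sigma_hat, tol=1e-6):
    """Stage 2: maximise the pairwise log-likelihood in rho with the margins held fixed."""
    pairs = [
        ((a - mu_hat) / sigma_hat, (b - mu_hat) / sigma_hat)
        for fam in families
        for a, b in combinations(fam, 2)
    ]
    # Golden-section search on (-0.99, 0.99); this assumes the objective is
    # unimodal in rho on that interval, which is an assumption of this sketch.
    lo, hi = -0.99, 0.99
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        if pairwise_loglik(x1, pairs) < pairwise_loglik(x2, pairs):
            lo = x1
        else:
            hi = x2
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    fams = simulate_families(400, 4, mu=2.0, sigma=1.5, rho=0.5, seed=1)
    mu_hat, sigma_hat = stage_one(fams)
    print(mu_hat, sigma_hat, stage_two(fams, mu_hat, sigma_hat))
```

Stage 2 only ever evaluates bivariate densities, so its cost grows with the number of within-family pairs rather than with the dimension of the full joint distribution. That is the computational saving the abstract refers to. The second procedure in the abstract would instead maximise the same pairwise objective over the mean, the variance and rho together.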

Item Metadata

Title	Statistical modelling and inference for discrete and censored familial data
Creator	Zhao, Yinshan
Publisher	University of British Columbia
Date Issued	2004
Extent	7808346 bytes
Genre	Thesis/Dissertation
Type	Text
File Format	application/pdf
Language	eng
Date Available	2009-12-01
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0099795
URI	http://hdl.handle.net/2429/16044
Degree	Doctor of Philosophy - PhD
Program	Statistics
Affiliation	Science, Faculty of; Statistics, Department of
Degree Grantor	University of British Columbia
Graduation Date	2004-05
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

ubc_2004-902935.pdf -- 7.45MB
