Subgroup-specific regression models

UBC Theses and Dissertations

Subgroup-specific regression models Yaghoubi, Marjan

Abstract

Advances in data collection technologies are producing ever larger data sets and are altering the nature of biomedical research. With many techniques, these huge data sets can be challenging, or even impossible, to analyse accurately. In biomedical settings, data sets are frequently heterogeneous, with samples representing various disease subtypes that are thought to differ in their underlying biology. A motivating example is the study of progressive diseases such as Alzheimer's disease (AD). Although the number of studies that concentrate on regression modeling of disease progression has increased significantly, they ignore the fact that patterns of change differ profoundly between patients with different initial profiles. Estimating a separate model for each subgroup is extremely difficult because of small sample sizes in the high-dimensional setting, but may yield more accurate and reliable results. Moreover, recognizing homogeneous subgroups of predictors can be cumbersome in high-dimensional regression analysis over subgroups of samples. This thesis attempts to improve upon an established method of regularized regression for group-structured data sets by using a linear combination of two penalty functions to select predictive clusters of correlated variables and to allow for subgroup-specific parameter estimates. To showcase the performance of the proposed methodology, we conducted a series of experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) data set, which includes three groups of subjects, Cognitively Normal Controls (CN), Late Mild Cognitive Impairment (LMCI), and Alzheimer's disease (AD), to estimate Mini-Mental State Examination (MMSE) scores at multiple future time points. In terms of Root Mean Square Error (RMSE), the results show that the proposed method is more effective than several well-known statistical methods in two subgroups, AD and LMCI. In the CN group, however, the proposed method performed better than the other methods at two time points. We also investigated how the prediction performance of the proposed method compares with multiple multi-task learning regression methods.
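
The abstract reports performance as RMSE within each diagnostic subgroup. As a minimal sketch of that evaluation step only, the snippet below computes RMSE separately per subgroup label. The function name `rmse_by_subgroup` and all numeric values are hypothetical and chosen for illustration; they are not ADNI data and not results from the thesis, and the sketch does not implement the penalized regression method itself.

```python
import math
from collections import defaultdict


def rmse_by_subgroup(groups, y_true, y_pred):
    """Return {subgroup label: RMSE}, computed separately within each subgroup."""
    squared_errors = defaultdict(list)
    for label, truth, pred in zip(groups, y_true, y_pred):
        squared_errors[label].append((truth - pred) ** 2)
    # RMSE = square root of the mean squared error within each subgroup.
    return {
        label: math.sqrt(sum(errs) / len(errs))
        for label, errs in squared_errors.items()
    }


# Made-up MMSE-like scores for illustration only (not ADNI data).
groups = ["CN", "CN", "LMCI", "LMCI", "AD", "AD"]
mmse_true = [29.0, 30.0, 26.0, 24.0, 20.0, 18.0]
mmse_pred = [28.0, 30.0, 27.0, 22.0, 23.0, 14.0]

scores = rmse_by_subgroup(groups, mmse_true, mmse_pred)
for label, value in scores.items():
    print(f"{label}: RMSE = {value:.3f}")
```

Reporting the error per subgroup, rather than pooling all subjects, is what makes it possible to see that a method can rank differently in the CN group than in the LMCI and AD groups.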

Item Metadata

Title	Subgroup-specific regression models
Creator	Yaghoubi, Marjan
Supervisor	Tavakoli, Javad
Publisher	University of British Columbia
Date Issued	2021
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2021-05-31
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0398202
URI	http://hdl.handle.net/2429/78484
Degree (Theses)	Master of Science - MSc
Program (Theses)	Mathematics
Affiliation	Science, Irving K. Barber Faculty of (Okanagan); Computer Science, Mathematics, Physics and Statistics, Department of (Okanagan)
Degree Grantor	University of British Columbia
Graduation Date	2021-09
Campus	UBCO
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
