UBC Theses and Dissertations

Robust sparse covariance-regularized regression for high-dimensional data with casewise and cellwise outliers

Liu, Yitong (Maggie)

Abstract

Modern biomedical datasets, such as those from genomic and proteomic studies, often involve a large number of predictor variables relative to the number of observations, which calls for statistical methods designed specifically for high-dimensional data. For regression in particular, regularized methods are needed to select a sparse model, that is, one that uses only a subset of the available features to predict a response. The presence of outliers in the data further complicates this task. Many existing robust and sparse regression methods are computationally expensive when the dimensionality of the data is high. Furthermore, most of these methods were developed under the assumption that outliers occur casewise, which is not always realistic in high-dimensional settings. We propose a sparse and robust regression method for high-dimensional data that is based on regularized precision matrix estimation. Our method can handle both casewise and cellwise outliers in low- and high-dimensional settings. Through simulation studies, we compare our method to existing sparse and robust methods in terms of computational efficiency, prediction performance, and variable selection.
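
As a rough illustration of how regression coefficients can be read off a regularized precision matrix, the following minimal sketch may help. It is not the thesis's method: the abstract does not specify the estimator, so this example substitutes simple ridge-type shrinkage of the sample covariance, and it is neither sparse nor robust to outliers. The function name `precision_regression` and the `ridge` parameter are hypothetical. The sketch relies only on the standard block-inverse identity: if Omega is the precision matrix of the joint vector (x, y), then the coefficients of y regressed on x are -Omega_xy / Omega_yy.

```python
import numpy as np

def precision_regression(X, y, ridge=1e-2):
    """Hypothetical helper: coefficients from a ridge-regularized joint precision matrix."""
    # Stack predictors and response, then center each column.
    Z = np.column_stack([X, y])
    Z = Z - Z.mean(axis=0)
    # Sample covariance of the joint vector (x, y).
    S = Z.T @ Z / Z.shape[0]
    # Assumption: a ridge term on the diagonal stands in for whatever
    # regularizer the thesis uses; it keeps S invertible when p > n.
    Omega = np.linalg.inv(S + ridge * np.eye(S.shape[0]))
    # Block-inverse identity: beta = -Omega_xy / Omega_yy.
    return -Omega[:-1, -1] / Omega[-1, -1]

# Simulated data (illustrative only, not from the thesis).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
beta_true = np.array([2.0, 0.0, -1.0, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=200)

beta_hat = precision_regression(X, y, ridge=1e-8)
```

With a negligible `ridge`, the result coincides with ordinary least squares on centered data. A sparse or outlier-resistant variant would replace the sample covariance and the inversion step with sparse and robust estimators, which this sketch does not attempt.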

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International