Open Collections
UBC Theses and Dissertations
Application of supervised learning models to compare epigenetic predictors of gene expression across healthy breast cell types
Tello Palencia, Marco Antonio
Abstract
Moderate associations have been identified between gene expression and DNA methylation variability, predicted transcription factor binding sites, and transcription factor expression across multiple human tissues, including healthy mammary cells and diverse cancer-related cellular contexts. However, previous models summarized DNA methylation primarily at promoter regions, ignoring methylation variability in other genomic regions. In this thesis, I propose using Variably Methylated Regions (VMRs) to summarize DNA methylation, and I hypothesize that models trained on VMR-derived features outperform promoter-centered models in predicting individual gene expression across healthy mammary cell types. Results largely supported this hypothesis: VMR-based models predicted standardized individual gene expression across held-out samples better than their promoter counterparts. Additionally, the DNA methylation feature contributed most to the performance of VMR-based models. Although all regression models had difficulty generalizing association patterns to unseen data, this thesis is the first study to use and rigorously evaluate the contribution of VMR-derived features to explaining gene expression variability across healthy mammary cell types.
Item Metadata
Title |
Application of supervised learning models to compare epigenetic predictors of gene expression across healthy breast cell types
|
Creator |
Tello Palencia, Marco Antonio
|
Publisher |
University of British Columbia
|
Date Issued |
2024
|
Language |
eng
|
Date Available |
2024-02-13
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
Attribution-NonCommercial-NoDerivatives 4.0 International
|
DOI |
10.14288/1.0439887
|
Degree Grantor |
University of British Columbia
|
Graduation Date |
2024-05
|
Scholarly Level |
Graduate
|
Aggregated Source Repository |
DSpace
|