Open Collections

UBC Theses and Dissertations

DNA methylation microarray data reduction for co-methylation analysis

Gatev, Evan

Abstract

DNA methylation (DNAm) is an epigenetic modification that is present across the human genome, primarily in the context of CpG dinucleotides. In human population studies, high-throughput bead chip microarray assays are the prevalent way to simultaneously measure the methylation state of many thousands of genomic CpG sites. Proximal genomic CpGs have correlated methylation states within a single cell and often function as a single biological unit. The prevailing common methylation state of multiple CpGs within such a unit has been the subject of intense study because of its immediate relevance for gene expression regulation and ultimately for health and disease. I designed and implemented a biologically motivated data reduction method for DNAm arrays that constructs co-methylated regions (CMRs) while incorporating information about the genomic CpG background from the reference human genome annotation. The method relies on the correlations of CpG methylation across individuals for proximal CpG probes. It aims for enhanced statistical power and specificity, including in downstream applications. For example, epigenome-wide association studies (EWAS), an important such application, often focus on group “hits” in which multiple adjacent CpGs are significant, because their genomic proximity makes it more likely that the detected correlations are not spurious. The CMRs capture such groups, and I showed that CMRs constructed in public whole blood data have high statistical specificity in the context of EWAS for chronological age and biological sex. When the composite CMR methylation measures were used to perform EWAS for age and sex, they had high sensitivity and specificity and uncovered additional associated CpGs not detected by conventional EWAS.
The utility of the data reduction method was further discussed within the broader context of applying machine learning algorithms to high-dimensional DNAm array data analysis.
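
The idea of grouping proximal, correlated CpG probes can be illustrated with a minimal sketch. This is not the thesis's algorithm: the function names `build_cmrs` and `composite`, the distance cutoff `max_gap`, the correlation cutoff `min_corr`, the use of Pearson correlation between adjacent probes only, and the mean as a composite measure are all assumptions made for illustration. The abstract does not specify these details, and the sketch ignores the genomic CpG background that the actual method incorporates.

```python
import numpy as np


def build_cmrs(positions, betas, max_gap=1000, min_corr=0.5):
    """Group adjacent probes into candidate co-methylated regions.

    positions: probe coordinates on one chromosome, sorted ascending.
    betas: array of shape (n_individuals, n_probes) with methylation values.
    max_gap, min_corr: hypothetical cutoffs, not taken from the thesis.
    Returns a list of regions, each a list of probe column indices.
    """
    positions = np.asarray(positions)
    betas = np.asarray(betas, dtype=float)
    regions, current = [], [0]
    for j in range(1, len(positions)):
        # Adjacent probes must be close on the genome ...
        close = positions[j] - positions[j - 1] <= max_gap
        # ... and correlated across individuals.
        r = np.corrcoef(betas[:, j - 1], betas[:, j])[0, 1]
        if close and r >= min_corr:
            current.append(j)
        else:
            # Close the running group; keep it only if it has 2+ probes.
            if len(current) > 1:
                regions.append(current)
            current = [j]
    if len(current) > 1:
        regions.append(current)
    return regions


def composite(betas, region):
    """Per-individual mean methylation over the probes of one region.

    The mean is an assumed summary; the thesis may use a different one.
    """
    return np.asarray(betas, dtype=float)[:, region].mean(axis=1)
```

Under these assumptions, each returned region replaces several probe columns with one composite column, which is the sense in which the data are reduced before a downstream analysis such as EWAS.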

Item Metadata

Title	DNA methylation microarray data reduction for co-methylation analysis
Creator	Gatev, Evan
Publisher	University of British Columbia
Date Issued	2020
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2020-05-05
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0390361
URI	http://hdl.handle.net/2429/74339
Degree (Theses)	Doctor of Philosophy - PhD
Program (Theses)	Bioinformatics
Affiliation	Science, Faculty of
Degree Grantor	University of British Columbia
Graduation Date	2020-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
