UBC Theses and Dissertations


Assessing accuracy and utility of epigenetic age prediction algorithm in a large-scale targeted methylation sequencing study

Vasileva, Denitsa


The Horvath epigenetic clock was developed using Illumina HumanMethylation 450K and 27K array data with the purpose of predicting epigenetic age. In this study, targeted methylation sequencing using Illumina’s MethylCapture sequencing library was completed on 932 samples from three Canadian studies: the Canadian Asthma Primary Prevention Study (CAPPS, n=632 samples), the Saguenay-Lac-Saint-Jean study (SLSJ, n=180 samples) and the Canadian Peanut Allergy Registry (CanPAR, n=120 samples). CAPPS is a longitudinal cohort that follows children at high risk of developing asthma from birth to year 15, with targeted methylation sequencing done on at least one of three occasions: birth, year 7 and/or year 15. Maternal samples for CAPPS were also included in the methylation study. SLSJ consists of three-generational triads from families of French-Canadian descent. CanPAR consists of individuals with peanut allergy. The objective of this study was three-fold: (1) to assess the accuracy of the Horvath epigenetic clock in children; (2) to evaluate its applicability and its utility as a quality control (QC) metric in targeted methylation sequencing experiments; and (3) to identify age-informative CpG sites in the dataset. The Horvath age prediction algorithm was shown to be accurate in this data set (mean absolute error, MAE ± S.D. = 7.00 ± 7.03; median absolute error = 4.57). Additional QC steps included principal component analyses to assess age, ethnicity and cell counts, and genotype concordance checks using overlapping SNPs from GWAS and sequencing studies. These steps identified possibly swapped samples, which the age prediction also flagged as outliers: their relative difference (RD = abs(predicted age - expected age) / expected age) was greater than the mean + 2 S.D. for their reported groups. Linear and mixed-effects models that utilize the CAPPS longitudinal data structure identified hundreds of thousands of new CpG sites significantly associated with age in children.
We have demonstrated the applicability of the Horvath age algorithm to targeted methylation sequencing studies and the opportunity for its improvement.
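
The error summary and the relative-difference outlier rule described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis code. The function names, the input layout, the use of the sample standard deviation, and the choice to skip samples with an expected age of 0 (where RD is undefined, for example birth samples) are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, median, stdev


def summarize_errors(predicted, expected):
    """Return (MAE, S.D. of absolute errors, median absolute error)."""
    abs_err = [abs(p - e) for p, e in zip(predicted, expected)]
    return mean(abs_err), stdev(abs_err), median(abs_err)


def relative_difference(predicted, expected):
    """RD = abs(predicted age - expected age) / expected age."""
    if expected == 0:
        # Assumption: RD is undefined at expected age 0, so return None.
        return None
    return abs(predicted - expected) / expected


def flag_outliers(samples, n_sd=2.0):
    """Flag samples whose RD exceeds mean + n_sd * S.D. within their group.

    `samples` is a list of (sample_id, group, predicted_age, expected_age)
    tuples; this layout is hypothetical.
    """
    by_group = defaultdict(list)
    for sample_id, group, pred, exp in samples:
        rd = relative_difference(pred, exp)
        if rd is not None:
            by_group[group].append((sample_id, rd))

    flagged = []
    for items in by_group.values():
        rds = [rd for _, rd in items]
        if len(rds) < 2:
            continue  # a standard deviation needs at least two values
        # Assumption: sample S.D. (n - 1 denominator).
        threshold = mean(rds) + n_sd * stdev(rds)
        flagged.extend(sid for sid, rd in items if rd > threshold)
    return flagged
```

Samples flagged this way are only candidates for a swap; in the study they were cross-checked against the principal component analyses and genotype concordance results.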

Attribution-NonCommercial-NoDerivatives 4.0 International