UBC Theses and Dissertations


DNA methylation variation across the human life course McEwen, Lisa Marie 2018

DNA METHYLATION VARIATION ACROSS THE HUMAN LIFE COURSE

by

Lisa Marie McEwen

B.Sc., The University of Victoria, 2013

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES

(Medical Genetics)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

October 2018

© Lisa Marie McEwen, 2018

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled:

DNA methylation variation across the human life course

submitted by Lisa McEwen in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Medical Genetics

Examining Committee:

Dr. Michael Kobor, Supervisor
Dr. Sara Mostafavi, Supervisory Committee Member
Supervisory Committee Member
Dr. Peter Lansdorp, University Examiner
Dr. Angela Devlin, University Examiner

Additional Supervisory Committee Members:

Dr. Lara Boyd, Supervisory Committee Member
Dr. Jan Friedman, Supervisory Committee Member

Abstract

Aging is a multifaceted process occurring in all living organisms, and it involves the breakdown of biological robustness. Although much research has revealed fascinating features of cellular mechanisms related to aging and lifespan, we have yet to understand the underpinnings driving this inevitable progression. Epigenetics is one area of aging research that has attracted significant interest, as certain epigenetic modifications, such as DNA methylation, have been proposed to mediate the relationship between the environment and gene expression and also show age-associated patterns. Interestingly, predictors of age based on DNA methylation at fewer than 400 CpG sites have been created to estimate age with impressive accuracy; however, these tools are less well characterized in pediatric populations.
To further characterize DNA methylation in human populations, I used the Illumina 450K and EPIC microarrays to assess more than 2,500 genome-wide profiles from cohorts spanning the human life course. Specifically, I examined the relationship between buccal DNA methylation and age in pediatric samples, creating an accurate multivariate age predictor with DNA methylation at 94 CpG sites (unbiased test data set median absolute difference: <0.40 years). Furthermore, I observed unique DNA methylation patterns in a long-lived population from the Nicoya Peninsula of Costa Rica; specifically, 20 differentially methylated regions (q-value ≤ 0.05, β-value ≥ 5%) as compared to non-Nicoyans, as well as lower overall variability of the measured sites. Additionally, I tested whether physical activity was associated with methylation changes by examining peripheral blood mononuclear cells before and after a six-month intervention, finding no apparent alterations, although, interestingly, I did observe potential associations between the change in DNA methylation at 12 CpG sites and percent weight loss (p-value < 0.00005, q-value ≤ 0.12, ∆β-value ≥ 5%). Lastly, I investigated potential technical factors that may contribute to variance when estimating age with the most popular predictor based on DNA methylation. I observed moderate variability in the estimated DNA methylation age across three common preprocessing methods (error = 1.44–3.10 years) and provided needed evidence that the newest microarray technology, the EPIC, may still provide accurate measures despite missing predictor sites (correlation with 450K data: r = 0.91–0.96, depending on preprocessing method).
In summary, I have contributed to the characterization of the human DNA methylome across the life course, ranging from age-associated patterns in the pediatric life stage and variation related to a midlife lifestyle intervention to DNA methylation signatures in a long-lived population.

Lay Summary

The goal of this work was to investigate the relationship between aging and a molecular process called DNA methylation. DNA methylation is a reversible process that involves adding a methyl group to DNA, which ultimately contributes to how a cell functions. The most important role of DNA methylation is to define a cell’s identity; however, this mark can change in association with environmental exposures and aging. Interestingly, DNA methylation and aging are highly associated, as patterns of DNA methylation change with age across all individuals. The functional consequences of these changes are not well established, but DNA methylation may be a promising biomarker of aging. I examined this epigenetic mark in a variety of human populations, ranging from samples collected from infants to those from supercentenarian individuals, in order to characterize the variation in DNA methylation related to aging.

Preface

All data chapters are presented in manuscript format, as they are currently published (Chapters 3 & 4) or in preparation for submission (Chapters 2 & 5).

Chapter 1

Portions of this section were published as a book chapter:

McEwen, L. M., Goodman, S. J., Kobor, M. S., & Jones, M. J. (2016). The DNA Methylome: An Interface Between the Environment, Immunity, and Ageing. In The Ageing Immune System and Health (Vol. 13, pp. 35–52). Cham: Springer International Publishing. http://doi.org/10.1007/978-3-319-43365-3_3. Copyright © 2017, Springer Nature

Chapter 2

Chapter 2 is unpublished work.
The metadata and DNA methylation data from all cohorts were previously generated by collaborators, and the work presented in this chapter was primarily a secondary use of these data. I compiled and processed both metadata and methylation data across all cohorts. Dr. Steve Horvath performed the regression analysis to select the final age-predictive CpG sites. I performed all remaining analyses independently. I generated all figures and wrote the manuscript with feedback from co-authors.

Chapter 3

A version of this chapter was published as:

McEwen, L. M., Morin, A. M., Edgar, R. D., Macisaac, J. L., Jones, M. J., Dow, W. H., et al. (2017). Differential DNA methylation and lymphocyte proportions in a Costa Rican high longevity region. Epigenetics & Chromatin, 10(1), 21.

© 2017 McEwen et al., licensee BioMed Central.

Participant recruitment, metadata gathering, and sample collection were performed by members of the CRELES cohort. Members of this cohort also extracted blood from participants in 2005, and genomic DNA was subsequently isolated and stored at UC Berkeley. I performed bisulfite conversion and processing of the Illumina Infinium HumanMethylation450 BeadChips with Dr. Julia MacIsaac, and independently performed all data processing and analyses. I wrote the article with help from Dr. David Rehkopf and Dr. Michael Kobor. All authors contributed to editing the article.

Ethics: The ‘Committee on Science and Ethics’ of the University of Costa Rica approved sample recruitment and sample collection, and all participants provided written informed consent. All data analyses were approved by the Administrative Panel for the Protection of Human Subjects at Stanford University.

Chapter 4

A version of this chapter was published as:

McEwen, L. M., Gatev, E. G., Jones, M. J., Macisaac, J. L., McAllister, M. M., Goulding, R. E., et al. (2018).
DNA methylation signatures in peripheral blood mononuclear cells from a lifestyle intervention for women at midlife: a pilot randomized controlled trial. Applied Physiology, Nutrition, and Metabolism, 43(3), 233–239. http://doi.org/10.1139/apnm-2017-0436

Participant recruitment, intervention implementation, and blood extraction were performed by collaborators. I extracted the peripheral blood mononuclear cells, isolated genomic DNA, performed bisulfite conversion and processing of the Illumina Infinium HumanMethylation450 BeadChips with Dr. Julia MacIsaac, and then performed all data processing and analyses independently.

Ethics: This was a sub-study of a lifestyle intervention for community-dwelling older women from Vancouver, Canada (ClinicalTrials.gov identifier: NCT01842061). Ethics approval was granted by the University of British Columbia Institutional Review Board, and all participants signed a written informed consent form prior to participation in the study.

Chapter 5

A version of Chapter 5 is accepted for publication in Clinical Epigenetics as:

McEwen, L.M., Jones, M.J., Lin, D.T.S., Edgar, R.D., Husquin, L.T., MacIsaac, J.L., Ramadori, K.E., Morin, A.M., Rider, C.F., Carlsten, C., Quintana-Murci, L., Horvath, S., Kobor, M.S. (2018). Systematic evaluation of DNA methylation age estimation with common preprocessing methods and the Infinium MethylationEPIC BeadChip array. Clinical Epigenetics, accepted October 1, 2018: http://doi.org/10.1186/s13148-018-0556-2.

Cohorts were recruited and sampled by collaborators. Dr. Julia MacIsaac performed bisulfite conversion and processing of the Illumina Infinium HumanMethylation450 and HumanMethylationEPIC BeadChips. I generated the research questions, performed all data processing and analyses, and created the figures. I also wrote the manuscript with feedback from co-authors.
Ethics: The monocyte EVOIMMUNOPOP project was approved by the Ethics Committee of Ghent University, the Ethics Board of Institut Pasteur (EVOIMMUNPOP-281297) and the relevant French authorities (CPP, CCITRS and CNIL). Samples were collected after written informed consent had been obtained. Approval for the DE3 study was granted by the University of British Columbia Research Ethics Board (H11-01831).

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
List of Symbols
List of Abbreviations
Acknowledgements
Dedication
Chapter 1: Introduction
1.1 Overview and hypotheses
1.2 DNA methylation is an important component of the epigenetic landscape
1.2.1 Definition and genomic context
1.2.2 Transcription factors and gene expression
1.2.3 DNA methylation and the immune system
1.3 DNA methylation and the environment
1.3.1 Physical environmental exposures
1.3.2 Social and lifestyle factors
1.4 The Illumina DNA methylation microarray
1.4.1 Platform overview
1.4.2 Probe chemistry
1.5 Considerations for meta-epigenome studies
1.5.1 Epigenome-wide association studies overview
1.5.2 Cell type heterogeneity is a major contributor to DNA methylation variation
1.5.3 Genetic structure intersects with DNA methylation variation
1.6 The Human Lifespan
1.6.1 Genetics and longevity
1.6.2 Environmental influences on lifespan
1.7 DNA methylation and aging
1.7.1 A historical perspective
1.7.2 Age-associated DNA methylation
1.7.3 Epigenetic Drift
1.7.4 Epigenetic Clocks
1.8 Research Objectives
Chapter 2: DNA methylation age estimator for pediatric buccal swabs and potential applications to autism
2.1 Introduction
2.2 Methods
2.2.1 Cohort descriptions
2.2.2 DNA methylation data processing
2.2.3 Pan tissue DNA methylation age (353 CpG model)
2.2.4 Pediatric BEC DNA methylation age predictor
2.2.5 Genomic enrichment
2.3 Results
2.3.1 Training and test data set characteristics
2.3.2 A precise tool to measure pediatric DNA methylation age in BEC samples
2.3.3 Pediatric BEC DNA methylation age prediction was highly accurate across longitudinal sampling
2.3.4 Positive pediatric DNA methylation age acceleration was associated with Autism Spectrum Disorder
2.3.5 Pediatric BEC DNA methylation age in saliva and blood
2.3.6 A pediatric BEC DNA methylation epigenetic clock for future studies
2.4 Discussion
2.5 Acknowledgments
Chapter 3: Differential DNA methylation and lymphocyte proportions in a Costa Rican high longevity region
3.1 Introduction
3.2 Materials and methods
3.2.1 Sample preparation and data collection
3.2.2 Data preprocessing and normalization
3.2.3 Estimation of blood cell proportions
3.2.4 Prediction of epigenetic age
3.2.5 DNA methylation analysis
3.2.6 Inferred genetic ancestry
3.3 Results
3.3.1 Cohort characteristics and DNA methylation data
3.3.2 Nicoyans had fewer estimated CD8+ memory cells and more naïve T cells
3.3.3 Epigenetic age does not differ between Nicoyans and non-Nicoyans
3.3.4 DNA methylation variability is lower in Nicoyans
3.3.5 Unique region-based differential methylation in Nicoyans
3.3.6 Site-specific differential methylation in Nicoyans and technical verification
3.4 Discussion
3.5 Conclusions
Chapter 4: DNA methylation signatures in peripheral blood mononuclear cells from a lifestyle intervention for women at midlife: A pilot RCT
4.1 Introduction
4.2 Materials and methods
4.2.1 Participants
4.2.2 Intervention
4.2.3 Sample collection and data processing
4.2.4 Statistics
4.3 Results
4.3.1 Cohort characteristics
4.3.2 DNA methylation across the intervention period
4.4 Discussion
Chapter 5: Systematic evaluation of DNA methylation age estimation with common preprocessing methods and the Infinium MethylationEPIC BeadChip array
5.1 Introduction
5.2 Methods
5.2.1 Cohort characteristics
5.2.2 DNA methylation quantification
5.2.3 DNA methylation age
5.3 Results
5.3.1 The epigenetic clock accurately predicts DNA methylation age from EPIC methylation data
5.3.2 Data preprocessing methods affects the calculated DNA methylation age but within error margins of the epigenetic clock
5.4 Discussion
Chapter 6: Discussion
6.1 Summary
6.2 Biomarkers of aging
6.2.1 The epigenetic clock
6.2.2 Telomeres and Mitotic Division
6.2.3 Other age predictors
6.3 Intervention studies
6.4 Data preprocessing
6.5 Limitations
6.5.1 Cell-type heterogeneity
6.5.2 Genetics
6.5.3 Association studies
6.6 Future directions
6.6.1 Development versus aging
6.6.2 Investigations of the aging DNA methylome
6.7 Potential applications
6.7.1 Anti-aging
6.7.2 Forensics
References
Appendices
Appendix A Supplementary materials for Chapter 2
A.1 Supplementary figures
A.2 Supplementary tables
A.3 Cohort descriptions
Appendix B Supplementary materials for Chapter 3
B.1 Supplementary figures
B.2 Supplementary tables
Appendix C Supplementary materials for Chapter 4
C.1 Supplementary figures
C.2 Supplementary tables
Appendix D Supplementary materials for Chapter 5
D.1 Supplementary figures
D.2 Supplementary tables

List of Tables

Table 2.1 Cohort Characteristics
Table 2.2 Description of the 94 CpGs in the pediatric DNA methylation BEC epigenetic clock based on training data
Table 3.1 Cohort characteristics (means and percents), Nicoyans and non-Nicoyans
Table 3.2 Differentially methylated genomic regions between Nicoyans and non-Nicoyans
Table 3.3 Characteristics of four biologically and statistically significant DNA methylation sites between Nicoyans and non-Nicoyans
Table 4.1 Participant baseline characteristics
Table 4.2 Significant CpGs associated with percent weight change over six months

List of Figures

Figure 1.1 Schematic of chronological age versus DNA methylation age
Figure 2.1 Chronological age versus estimated pediatric buccal DNA methylation age
Figure 2.2 Longitudinal data demonstrates comparative accuracy between buccal pediatric DNA methylation age predictor and 353 CpG pan-tissue model
Figure 2.3 Buccal DNA methylation age acceleration was moderately higher in individuals diagnosed with Autism spectrum disorder (ASD) than typically developing (TD) children
Figure 2.4 Predictive model of age in pediatric buccal samples derived from all available data
Figure 3.1 Nicoyans had differential CD8+ naïve and memory T cell abundance levels
Figure 3.2 DNA methylation age correlated with chronological age across all samples
Figure 3.3 DNA methylation variability was lower in Nicoyans
Figure 3.4 Significantly differentially methylated regions between Nicoyans and non-Nicoyans
Figure 3.5 Pyrosequencing of significantly differentially methylated single CpGs
Figure 4.1 Volcano plots of the change in DNA methylation over six months
Figure 4.2 Scatterplot displaying 12 significant CpGs for percent change in weight change
Figure 5.1 DNA methylation age comparison between 450K or EPIC Monocyte data across preprocessing methods
Figure 5.2 EPIC DNA methylation age estimated in control samples from the Diesel Exhaust III Study across three tissues
Figure 5.3 DNA methylation age acceleration variation across preprocessing methods
Figure 5.4 Absolute difference between technical replicate pairs for each processing method

List of Symbols

β: beta value
Δβ: delta beta, mean difference between group beta values
rs: Spearman’s rank correlation coefficient
τ: Kendall’s rank coefficient (tau)

List of Abbreviations

27K: Infinium Methylation27 BeadChip
450K: Infinium HumanMethylation450
ASD: autism spectrum disorder
BaL: Bronchoalveolar lavage
BEC: buccal epithelial cell
BH: Benjamini-Hochberg
BMIQ: beta-mixture quantile dilation
bp(s): base pair(s)
Brush: Bronchial brushing
Chr: chromosome
CpG: cytosine-phosphate-guanine dinucleotide
CRELES: Costa Rican Study on Longevity and Healthy Aging
CRP: C-reactive protein
DE3: Diesel Exhaust III Study
DMR: differentially methylated region
DNA methylation: DNA methylation
DNMT: DNA methyltransferase
EPIC: Infinium MethylationEPIC
EWAS: epigenome-wide association study
FDR: False discovery rate
GS: GenomeStudio
GEO: Gene Expression Omnibus
IQR: interquartile range
mQTL: methylation quantitative trait loci
N: number
Noob: normal-exponential out-of-band
PBMC: peripheral blood mononuclear cell
PC(s): principal component(s)
PCA: principal component analysis
RCT: randomized controlled trial
SNP: single nucleotide polymorphism
SWAN: subset-quantile within array normalization
TD: typically developing
TSS: transcription start site
UCSC: University of California, Santa Cruz

Acknowledgements

The liberty of performing scientific research is an ultimate privilege and I have many individuals to thank for this experience.
This dissertation would not have been possible without the mentorship and opportunities provided to me by my supervisor, Dr. Michael Kobor, whose steady confidence in my research abilities has allowed me to explore my research potential in a way I could have never imagined. My choice to pursue academic research was not made in isolation, and I have to acknowledge the many discussions that I had the pleasure to be a part of with Dr. Meaghan Jones and Dr. Julia MacIsaac; thank you for the encouragement to pursue this research. I would like to express my gratitude to the faculty, staff and my fellow students at UBC, who have inspired me to continue my work in this field. Particularly, to my graduate committee, Drs. Jan Friedman, Lara Boyd, and Sara Mostafavi: I greatly appreciate the support, time, and efforts you have contributed to mentor my progress and development as a scientific researcher. My positive graduate program experience would not have been the same without the fantastic group I had the privilege of working with. Beyond our discussions of intellectual concepts related to our research, the emotional support and friendship I found there have really contributed to my progress, so thank you Drs. Meaghan Jones, Julia MacIsaac, David Lin, Alexander Lussier and soon-to-be Drs. Sumaiya Islam, Rachel Edgar, and Nicole Gladish. Additional thanks to all current and previous Koborites. This work would not have been possible without the support of the collaborators with whom I have had the opportunity to work. Thank you Drs. David Rehkopf, Chris Carlsten, Maureen Ashe, Annaliese Beery, Steve Horvath, Christopher Verschoor, Kieran O’Donnell, and Lluis Quintana-Murci. Thank you to all cohort participants for donating their time, biological samples, and information to allow this research to occur.
My sincerest gratitude to my partner, Joshua Fauchon, as his patience and support have provided me with the stability in my life that is undoubtedly the ultimate root of my academic successes. Thank you for being so understanding and helpful these past years. Special thanks are owed to my parents, Norman McEwen and Susan McEwen, as well as my sister, Jennifer Fyfe. Your work ethic, integrity, drive, willpower, strength, and optimism are all traits that I have looked up to and have personally strived for. I am so thankful to have had such an amazing, supportive family.

Dedication

I dedicate this work to my parents, my father Norman Douglas McEwen and my mother Susan Irene McEwen. I am forever grateful for your unconditional love and support.

Chapter 1: Introduction

1.1 Overview and hypotheses

An individual’s genomic sequence is identical across nearly all of the conservatively estimated 200 cell types in the human body, but how DNA is differentially regulated depends on several processes that have not been fully elucidated. Interestingly, these nuclear activities are not maintained consistently across the human life course, and these changes may contribute to a subsequent decline in cell functioning, which may potentially be an underlying driver of the aging process. One particular area of relevant research is epigenetics, a field of study that focuses on several phenomena including mechanisms involved in DNA compaction and regulating gene expression. A commonly studied epigenetic modification is DNA methylation, the addition of a methyl group primarily, but not exclusively, in the context of cytosine nucleotides. DNA methylation profiles differ across cell types, and although these patterns are indicative of cellular identity, they have also been associated with environmental influences, a very popular and exciting area of interest.
With this, DNA methylation may act as a mediator between the environment and phenotype, and the potential for DNA methylation to serve as a biomarker of lifestyle exposures has gained considerable attention. In addition to the association between lifestyle exposures and DNA methylation variation, DNA methylation changes with respect to time across an individual’s lifespan. The overarching goal of my thesis was to further characterize these associations by exploring DNA methylation in a variety of contexts related to the human life course. I hypothesized that the pediatric life stage would have a unique relationship between aging and DNA methylation as compared to adults, that mid-life lifestyle interventions may be associated with a deceleration of age-related DNA methylation changes, and that long-lived populations would have unique age-related DNA methylation signatures. I employed genome-wide technologies to quantify DNA methylation and asked biological questions using established bioinformatics methods.

The ability to measure genome-wide DNA methylation in human populations has developed significantly, and this evolution has been a primary factor in driving the biology and our understanding of this epigenetic mark. One DNA methylation quantification technology that has dramatically expanded our knowledge of the human methylome is the microarray, as it offers a relatively inexpensive and high-throughput approach that allows researchers to measure over 850,000 sites across the genome. However, data processing pipelines are not universal across all labs, and there are several different approaches one can take when preprocessing, normalizing, and analyzing DNA methylation microarray data.
In Chapters 2-4, I present studies of DNA methylation at different life stages in the context of age-specific differences as well as in association with a lifestyle intervention, whereas in the last data chapter (Chapter 5), I report findings of technical variation in a commonly used tool to measure epigenetic age that are relevant for interpretation and reproducibility standards applicable to the field.

1.2 DNA methylation is an important component of the epigenetic landscape

1.2.1 Definition and genomic context

DNA methylation is a stable, yet reversible, epigenetic process crucial for mammalian development, involving the covalent attachment of a methyl group to the 5’ carbon of cytosine nucleotides (5mC). This modification typically occurs in the context of a cytosine adjacent to a guanine nucleotide, commonly referred to as a CpG (cytosine-phosphate-guanine) dinucleotide (P. A. Jones, 2012), but can also occur at CpH sites (“H” is one of adenine/thymine/cytosine). There are approximately 28 million CpG sites in the human genome, most of which are methylated (Bird, 2002), although CpG dinucleotides occur at a lower frequency than expected by chance (1/100) due to the spontaneous deamination of methylated cytosines to thymines (Duncan & Miller, 1980; Shen, Rideout, & Jones, 1994). A unique feature of CpG dinucleotides is that they are palindromic, meaning they have the same sequence on both complementary DNA strands when each is read in the 5’ to 3’ direction, as a result of complementary base-pairing. This symmetry allows DNA methylation to be preserved throughout cell division, as hemimethylated DNA is bound by the maintenance enzyme DNA methyltransferase 1 (DNMT1), which methylates the newly synthesized strand. In contrast, de novo CpG methylation is established by DNMT3a and DNMT3b after the global erasure events that occur during early development (Bestor, 2000). Demethylation can be achieved by passive or active processes, with active demethylation occurring via oxidation or deamination.
Ten-Eleven-Translocation (TET) enzymes perform the oxidation of 5mC to 5-hydroxymethylcytosine (5hmC), which is further converted into 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). 5caC is excised and replaced via base excision repair, which results in an unmethylated cytosine nucleotide (Ito et al., 2011).

Although CpGs are scattered across the entire genome, they are enriched in densely packed regions referred to as CpG islands, defined as regions that contain >50% GC content, are >200 bp in length, and are commonly unmethylated and located near gene promoters (Saxonov, Berg, & Brutlag, 2006; Weber et al., 2007). The function of DNA methylation has not been fully characterized, as I discuss below, although it has been closely linked with defining tissue-specificity as well as many biological processes, such as X-chromosome inactivation, imprinting, cancer, and aging (Feil & Fraga, 2012).

DNA methylation is only one example of an epigenetic mark; many other modifications to DNA and DNA packaging elements partake in shaping the dynamic epigenome, for example, histone modifications such as acetylation, methylation, phosphorylation, and ubiquitination.

1.2.2 Transcription factors and gene expression

The vicinity of CpG islands to gene promoters highlights a potential role of DNA methylation in regulating gene expression. An early model posited that CpG island methylation had an inverse linear relationship with expression of the proximal gene, as methylation could interfere with DNA binding of the transcriptional machinery. However, it has become increasingly evident that the association between DNA methylation and gene silencing is not as straightforward as once hypothesized.
For example, a recent study showed that the artificial addition of methyl groups to gene promoters is insufficient to repress transcription in many cases (Ford et al., 2017), a finding that has challenged our current understanding of the function of DNA methylation in transcription. Another intriguing aspect is that genic methylation has been associated with gene activation (Gutierrez Arcelus et al., 2013) and has also been correlated with regulating alternative promoters and gene splicing (M. J. Jones, Fejes, & Kobor, 2013; Lev Maor, Yearim, & Ast, 2015; Maunakea, Chepelev, Cui, & Zhao, 2013). Additionally, enhancer regions display differential DNA methylation across cell types (Fleischer et al., 2017; Lister et al., 2009). Interestingly, although not surprisingly, the associations between DNA methylation and gene expression are not stable across development (Yu et al., 2013). Genomic context and temporal stage are major contributors to the interaction between DNA methylation and gene regulation, and this interaction is considerably more complex than once proposed; therefore, it is an active and ongoing area of research in the field.

Given the intricacies of DNA methylation and gene expression, it is important to consider the relationship with more proximal regulators of gene expression, such as transcription factors. The interaction between transcription factor binding and DNA methylation is also considerably complicated, but what is clear is that the relationship is not unidirectional. For example, some transcription factors specifically bind methylated CpG dinucleotides, whereas others are blocked from binding DNA if methylation is present in their respective binding sequence (Defossez & Stancheva, 2011; Schübeler, 2015; Zhu, Wang, & Qian, 2016). Recently, it has been observed that transcription factors that preferentially bind methylated DNA are often involved in developmental processes (Bonder et al., 2017; Yin et al., 2017).
The interaction between DNA methylation and transcription factors is complex, depending on cell type, environment, and organism age; future targeted studies will unravel the context-specific details to ultimately better understand the epigenomic contribution to gene regulation.

1.2.3 DNA methylation and the immune system

In addition to the potential role played by DNA methylation in the multifaceted regulation of gene expression, DNA methylation is deeply involved during early development when tissue identity is determined and throughout the life course, contributing to the maintenance of cell characteristics throughout subsequent divisions. In the immune system, for example, maintenance of DNA methylation patterns is required for hematopoietic stem cell self-renewal and differentiation (Trowbridge, Snow, Kim, & Orkin, 2009). During hematopoietic stem cell differentiation, DNA methylation gains and losses at specific regions of the genome retain precise marks that allow cells with the same genetic material to only express the genes required for their unique cellular processes and identities (Meissner, 2010). This is illustrated by the lineage-specific epigenetic differences that arise when early multipotent progenitors split into myeloerythroid and lymphoid lineages; these patterns become more discrete as differentiation progresses, resulting in cell type-specific DNA methylation signatures in mature cells (Calvanese et al., 2011; Reinius et al., 2012; Schuyler et al., 2016; Suarez-Alvarez, Rodríguez, Fraga, & López-Larrea, 2012). Hematopoietic stem cells lacking the maintenance methyltransferase prematurely lose their self-renewal capabilities, and cells lacking de novo methyltransferases show impaired lineage commitment, illustrating that these DNA methylation changes are essential for correct lineage development (Bröske et al., 2009; Challen et al., 2014).
Additionally, certain blood cell type proportions have been shown to change with aging (Jaffe & Irizarry, 2014), highlighting the importance of considering the potential variation introduced by cellular heterogeneity in DNA methylation analyses. Not surprisingly, because of these cell type-specific DNA methylation patterns, cell and tissue types are the most significant contributors to DNA methylation variation in healthy individuals (Davies et al., 2012; Farré et al., 2015; Ziller et al., 2013). Further work will determine whether aberrations of DNA methylation in early life can influence the trajectory of immune aging. It is possible that the establishment of these DNA methylation patterns in response to environmental exposures serves to predict future phenotype, including immune responses. For example, pre-stimulation DNA methylation differences in leukocytes can predict their cytokine responses when stimulated through Toll-like receptor pathways (Lam et al., 2012). As the variability in these baseline patterns was representative of the differences in lifetime environmental exposures between the cells, DNA methylation may function both as a memory of past exposures and as a predictor of future immune responses.

1.3 DNA methylation and the environment

1.3.1 Physical environmental exposures

Although DNA methylation is stable in that it is, in general, faithfully transmitted from mother to daughter cells, paradoxically it also appears to be malleable in response to certain environmental exposures (Boyce & Kobor, 2015; Feil & Fraga, 2012; Marsit & Marsit, 2015). Several studies have examined environmentally induced changes in DNA methylation in both gene-specific contexts as well as genome-wide changes, such as average global methylation across repetitive elements (Bollati & Baccarelli, 2010).
These changes may be transient and revert to their original state after the exposure ends, but in some cases, they can remain associated long after the event has passed (Essex et al., 2013). It is currently hypothesized that early life is a particularly sensitive time for the long-term epigenetic embedding of experiences, but in many cases, it is not until later in life that health outcomes associated with these exposures are revealed.

Some DNA methylation changes are strongly associated with previous and current cigarette smoke exposure (Monick et al., 2012; Shenker et al., 2013). The cigarette smoke-related DNA methylation change in the promoter of a well-characterized tumour suppressor gene, aryl hydrocarbon receptor repressor (AHRR), is currently the most robust environmentally induced epigenetic alteration. Within the AHRR gene, changes in both DNA methylation and gene expression have been observed upon exposure to firsthand and secondhand cigarette smoke, as well as prenatal exposure to maternal smoking (Joubert et al., 2012; Miyake et al., 2018; Monick et al., 2012). Additionally, postnatal cigarette exposure has been associated with differential methylation at AHRR (Novakovic et al., 2014). The AHRR protein regulates an enzyme responsible for binding nicotine, thus supporting a plausible mechanism for a DNA methylation response to cigarette smoke exposure. In addition to cigarette smoke exposure, studies have examined the effect of air pollution such as diesel exhaust, finding genome-wide effects on DNA methylation in peripheral blood and primary lung epithelial cells (Clifford et al., 2016; Jiang, Jones, Sava, Kobor, & Carlsten, 2014). The implications of these findings are not well characterized but provide evidence to suggest a potential pathway for environmental exposures “getting under the skin”.

Another example supporting the plasticity of DNA methylation in association with environmental conditions is the Dutch Hunger Winter.
A study of this famine observed DNA methylation changes later in life in individuals who had experienced in-utero malnutrition; these individuals were also identified as having a higher incidence of adult obesity (Heijmans et al., 2008). Recent studies have built on this work, finding that DNA methylation mediates the relationship between prenatal famine exposure and adult BMI and serum triglycerides (Tobi et al., 2018). DNA methylation has also been connected with other prenatal exposures such as smoking and perfluoroalkyl compounds (Guerrero-Preston et al., 2010), alcohol consumption (Sharp et al., 2018), and maternal stress (Rijlaarsdam et al., 2016). These studies have contributed significantly to laying the foundation for future work in characterizing DNA methylation as a possible link between adverse early-life conditions and adult well-being.

1.3.2 Social and lifestyle factors

In addition to physical environmental exposures, studies have shown that the social environment also associates with DNA methylation changes. For example, early life trauma and low socioeconomic status in childhood have been associated with DNA methylation differences later in life (Houtepen et al., 2016; G. E. Miller et al., 2009; Stringhini et al., 2015).

The prospective intervention design in DNA methylation studies allows for a more in-depth investigation into possible social and lifestyle factors that influence change in DNA methylation. However, these study designs are challenged by issues common to human population research; for example, it is sometimes not possible to untangle other environmental or lifestyle influences that occur simultaneously with the intervention, which limits the possible depth of interpretation. In population studies, intervention designs provide strength in determining whether lifestyle factors contribute to DNA methylation changes, and it is tempting to speculate that altering one’s lifestyle choices may affect DNA methylation.
Such studies have focused on whether increasing exercise influences DNA methylation, with some reports of genome-wide effects on DNA methylation (Brown, 2015; Nitert et al., 2012; Rönn et al., 2013; Voisin, Eynon, Yan, & Bishop, 2015). Body-mass index (BMI) has also been shown to associate repeatedly with DNA methylation at certain loci in both adipose tissue and blood (Dick et al., 2014; Mendelson et al., 2017; Richmond et al., 2016), and alterations in BMI have also been associated with DNA methylation changes (Demerath et al., 2015). A prevailing hypothesis across these studies is that DNA methylation is involved in a potential biological mechanism mediating the relationship between obesity and chronic disease; however, minimal evidence for any causal pathway exists, and future work is required to establish these links.

Current studies support the hypothesis that DNA methylation is a malleable molecular mark that is intimately associated with an organism’s environment, and that DNA methylation may be a mediator between phenotype and the milieu of potential social or physical exposures one is subjected to throughout one’s lifetime.

1.4 The Illumina DNA methylation microarray

1.4.1 Platform overview

Throughout this thesis, all DNA methylation was quantified by microarray experiments. Although many methods exist for measuring DNA methylation, genome-wide microarray technologies have contributed to expanding our potential to explore biology in population-based studies. In particular, DNA methylation microarrays have allowed researchers to quantify a genome-wide representation of the human methylome, requiring minimal DNA input, in an efficient and relatively inexpensive manner as compared to other available methods such as whole genome bisulfite sequencing.
The microarray design is based on the Illumina genotyping approach; however, an initial step includes a bisulfite treatment to convert unmethylated cytosines to uracils (read as thymines after amplification), whereas methylated cytosines remain unchanged, allowing for the quantification of a C/T polymorphism to determine the site-specific percent of methylated DNA molecules in a given sample.

1.4.2 Probe chemistry

The most recent microarray technologies (450K and EPIC) apply two different probe designs. Type I probes have one probe for the methylated locus and another for the unmethylated locus, assuming the same methylation status for adjacent CpG sites underlying the probe binding region. Type II probes use only one probe for both the methylated and unmethylated locus and do not carry the underlying assumption of similar adjacent CpG methylation; instead, they contain up to three degenerate bases, allowing for multiple methylation statuses in order to measure the target CpG of interest. The differing probe chemistries allow for more extensive coverage of the human methylome; however, the two designs introduce variation due to differing dynamic ranges, which can be accounted for by established normalization methods (Dedeurwaerder et al., 2014; Maksimovic, Gordon, & Oshlack, 2012; Teschendorff et al., 2013). Fluorescent intensity scores are converted into a β-value = methylated/(methylated+unmethylated), representing the proportion of methylation and ranging from 0 to 1. However, for analysis purposes, a logit (log2) transformation of the β-value, referred to as an M-value, is often used for statistical tests as this value is less heteroscedastic (Du et al., 2010).

I discuss two platforms throughout the work presented here, the Illumina Infinium HumanMethylation450 BeadChip (450K) and the Illumina Infinium MethylationEPIC BeadChip (EPIC) microarrays.
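The β-value and M-value conversions described in section 1.4.2 can be sketched as follows. This is a minimal illustration, assuming the M-value is computed as log2(β/(1 − β)) and ignoring the small stabilizing offsets that array software may add to the intensities; the function names and the example intensities are hypothetical and not taken from the thesis pipeline.

```python
import numpy as np

def beta_value(meth, unmeth):
    """Proportion methylated: methylated / (methylated + unmethylated)."""
    meth = np.asarray(meth, dtype=float)
    unmeth = np.asarray(unmeth, dtype=float)
    return meth / (meth + unmeth)

def m_value(beta):
    """Logit (base 2) transformation of a beta value; undefined at exactly 0 or 1."""
    beta = np.asarray(beta, dtype=float)
    return np.log2(beta / (1.0 - beta))

def beta_from_m(m):
    """Inverse transformation, mapping an M-value back to a beta value."""
    m = np.asarray(m, dtype=float)
    return 2.0 ** m / (2.0 ** m + 1.0)

# Hypothetical intensities for three CpG sites (illustrative only)
meth = np.array([9000.0, 5000.0, 1000.0])
unmeth = np.array([1000.0, 5000.0, 9000.0])

beta = beta_value(meth, unmeth)   # 0.9, 0.5, 0.1
m = m_value(beta)                 # symmetric around 0 for beta = 0.5
```

The M-value scale is unbounded and symmetric around zero, which is one way to see why it behaves better than the bounded β-value in linear models, while the β-value remains easier to interpret biologically.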
The EPIC microarray is the most recent DNA methylation array available and targets over 850,000 CpG sites, almost doubling the coverage from the previous version of this technology, the 450K array. Although the EPIC microarray targets less than 4% of the entire human methylome, probes are strategically distributed to regions predicted to have the greatest functional impact on gene regulation to capture a genome-wide representation of DNA methylation. The development of the EPIC array from the 450K array included an overlap of >90% of 450K probes, with roughly 333,000 new CpGs enriched in regulatory regions such as enhancers, transcription factor binding sites, and genomic regions with supportive evidence for open chromatin states and DNase I hypersensitivity (Moran, Arribas, & Esteller, 2015). While the increase in genomic coverage is exciting, the upsurge in data quantity can pose new concerns, as discussed in section 1.5.

1.5 Considerations for meta-epigenome studies

1.5.1 Epigenome-wide association studies overview

With the advent of microarray technology, epigenome-wide association studies (EWAS) are now subject to similar challenges as genome-wide association study (GWAS) approaches and therefore should adhere to similar statistical thresholds for determining genome-wide significance, although there are features of EWAS that reveal new obstacles that require different guidelines.

In EPIC microarray data, many EWAS will test over 850,000 linear models and consequently increase the chance of false discoveries; therefore, it is important to account for multiple testing by using established statistical approaches to limit false positives. In addition, new methods have provided means to reduce the multiple-testing space, such as removing invariant probes based on empirical evidence (Edgar, Jones, Robinson, & Kobor, 2017) or using data reduction methods like principal component analyses or region-based analyses (T. J. Peters et al., 2015).
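As an illustration of one established multiple-testing approach, the sketch below implements the Benjamini-Hochberg (BH) false discovery rate adjustment, which appears in the List of Abbreviations. Treating BH as the correction of choice here is an assumption made for illustration; the function name and the example p-values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)                          # indices of p-values, smallest first
    scaled = p[order] * n / np.arange(1, n + 1)    # p * n / rank
    # Enforce monotonicity, working down from the largest p-value
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

# Hypothetical p-values from five CpG-level linear models (illustrative only)
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
q_values = benjamini_hochberg(p_values)
# q_values: 0.005, 0.02, 0.05125, 0.05125, 0.6

# For comparison, a Bonferroni threshold at alpha = 0.05 across 850,000 tests
bonferroni_threshold = 0.05 / 850_000   # about 5.9e-08
```

Reducing the number of tested sites, for example by removing invariant probes, lowers n in both calculations and therefore relaxes the burden of correction.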
Another challenging aspect is that performing power calculations for EWAS is difficult, as the expected effect size is not straightforward to determine; however, methods have been proposed (Saffari et al., 2017; Tsai & Bell, 2015). Regardless, it is evident that epigenetic studies face very similar obstacles to GWAS and should be held to similar standards to report meaningful associations.

Despite the challenges EWAS are subject to, they do provide the means of discovery of novel findings in human populations. As we increase the testing space in the methylome (increasing number of measured loci), the gold standard is and will continue to be replication studies: observing the same results in a separate cohort.

1.5.2 Cell type heterogeneity is a major contributor to DNA methylation variation

As mentioned previously, DNA methylation varies considerably across cell types, and this variation needs to be accounted for during EWAS analyses in tissue samples of heterogeneous cell types, such as blood, brain, or saliva. Conventional methods to adjust for these known cell type-associated differences in DNA methylation include: 1) measuring cell counts in samples prior to quantifying DNA methylation and then including these values as covariates in a linear model when assessing associations between CpG methylation and a variable of interest (Lam et al., 2012); 2) predicting cellular proportions using previously established reference-based deconvolution methods that were developed from cell-sorted data (Guintivano, Aryee, & Kaminsky, 2013; Houseman et al., 2012; A. K. Smith et al., 2015); or 3) using a reference-free approach, which operates under the assumption that the highest proportion of DNA methylation variation is due to cell type heterogeneity, with procedures such as surrogate variable analysis (Kaushal et al., 2017; Leek & Storey, 2007; McGregor et al., 2016).
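The first two approaches both end with cell type proportions entering a linear model as covariates. The sketch below illustrates that step on simulated data, assuming a single CpG, a binary exposure, and one cell type proportion that is confounded with the exposure. All variable names, effect sizes, and the simulation itself are hypothetical and chosen only to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated data (illustrative assumptions only):
# a balanced binary exposure, and a cell type proportion that is
# slightly higher in exposed samples (i.e., confounded with exposure)
exposure = np.tile([0.0, 1.0], n // 2)
cell_prop = 0.5 + 0.1 * exposure + rng.normal(0.0, 0.03, n)

# M-values at one CpG: true exposure effect of 0.5, strong cell type effect
m_values = 0.5 * exposure + 4.0 * cell_prop + rng.normal(0.0, 0.1, n)

def fit_exposure_effect(y, exposure, covariates=None):
    """Ordinary least squares; returns the coefficient on the exposure term.

    covariates: optional array of shape (n,) or (k, n), one row per covariate.
    """
    columns = [np.ones_like(exposure), exposure]
    if covariates is not None:
        columns.extend(np.atleast_2d(covariates))
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

unadjusted = fit_exposure_effect(m_values, exposure)
adjusted = fit_exposure_effect(m_values, exposure, cell_prop)
```

In this simulation the unadjusted estimate absorbs the cell type effect and overstates the exposure association, whereas the adjusted estimate falls close to the simulated effect of 0.5. Measured cell counts or reference-based predicted proportions would take the place of `cell_prop` in a real analysis.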
When examining cellularly heterogeneous tissues, it is crucial to correct for cell type proportions, as neglecting this can mask interesting biological information. For example, one study examining the association between DNA methylation and early life socioeconomic status (SES) found that while DNA methylation was associated with early life SES, this association was only visible after correction for white blood cell type (Lam et al., 2012). Additionally, cell proportions may be confounded with the variable of interest, increasing false positives. However, since DNA methylation can be used to predict cell type proportions, the estimated values can also be used to find biologically interesting results (Esposito et al., 2016).

1.5.3 Genetic structure intersects with DNA methylation variation

An important consideration for DNA methylation analyses and overall interpretation is underlying genetic information, as this can influence DNA methylation status. One such scenario is the direct influence of a polymorphic locus, with only some individuals having a cytosine available for methylation. CpG methylation can also be affected by single-nucleotide polymorphisms (SNPs), referred to as methylation quantitative trait loci (mQTLs), in a variety of situations. An mQTL within a transcription factor binding sequence can subsequently influence DNA methylation at a neighbouring CpG. Currently, there is an ongoing exploration into factors that affect mQTL relationships, such as ancestry (Fraser, Lam, Neumann, & Kobor, 2012), tissue-specificity with only particular tissues demonstrating mQTL behavior at specific loci (Gutierrez Arcelus et al., 2015; A. K.
Smith et al., 2014), age-related mQTLs where the association between the polymorphism and DNA methylation is only present within certain age ranges (Gaunt et al., 2016; van Dongen et al., 2016), as well as environmental or disease mQTLs where the association is only observed in specific contexts, such as in neurological disorders, cancer, smoking, childhood trauma, blood pressure, and type II diabetes (Chambers et al., 2015; Gamazon et al., 2013; Gao, Thomsen, Zhang, Breitling, & Brenner, 2017; Heyn et al., 2014; Kato et al., 2015; Klengel et al., 2013; Q. Li et al., 2013). The nature of the methylome is further being defined in the context of genetic influences, and as the field continues to characterize DNA methylation it is important to consider this source of variation in human populations.

1.6 The Human Lifespan

1.6.1 Genetics and longevity

A particular area of focus for many EWAS is aging research. Much of the research that assesses biological aging focuses on unravelling the causes of the variable aging rates observed across individuals. This variation in lifespan is largely influenced by a combination of genetic, environmental, and lifestyle factors. The heritability of lifespan, or the degree to which variation in lifespan is attributed to genetic composition, has been approximated at only 20-30% of the variance in life length (Christiansen et al., 2015; Murabito, Yuan, & Lunetta, 2012). Interestingly, this heritability is not constant across the life course; for example, the heritability of age at the time of death has been estimated at between 15 and 30% (Jylhävä, 2014), whereas genetics seems to contribute more to the time of death in humans at older ages (Brooks-Wilson, 2013). Furthermore, there have been speculations that the genetic influence on longevity in age groups close to 90 years is different from that on reaching centenarian age (Jylhävä, 2014).
Many gene variants identified in model organisms have been recognized for increasing lifespan, with over one hundred genes identified as being involved (Kaeberlein, Rabinovitch, & Martin, 2015). In human studies, it has remained a challenge to map particular gene variants to lifespan, with the exception of APOE, which has been observed as statistically significant in several GWAS (Y. Ma et al., 2015). The difficulty in identifying single genes responsible for modifying lifespan results from the complexity of the aging process.

1.6.2 Environmental influences on lifespan

Most environmental or lifestyle behaviours that are recognized as influencing lifespan stem from animal studies, where these factors can be strictly controlled, and from large population-based cohort studies, which offer increased statistical power. For example, caloric restriction of 20-30% has been observed to prolong lifespan in mice (Anderson & Weindruch, 2010; McCay, Crowell, & Maynard, 1989), a finding that has been reproduced in other model organisms such as rhesus monkeys (Colman et al., 2009; 2014), but remains to be investigated in humans. However, a Mediterranean diet has been reported to associate with lower incidences of mortality and disease (Fung et al., 2009; Lopez-Garcia et al., 2014). Physical activity has also been posited as a contributor to health span, the length of time spent in good health, but not lifespan (Garcia-Valles et al., 2013), although these findings have been inconsistent when studied in humans (Gellert, Schöttker, & Brenner, 2012; I. M. Lee & Paffenbarger, 2000; S. C. Moore et al., 2012). In humans, both current and former cigarette smoking have been associated with decreased lifespan (Gellert et al., 2012). Alcohol consumption has demonstrated less consistent associations, although both have been implicated in lifespan variability (Reid, Boutros, O'Connor, Cadariu, & Concato, 2002).
Aging is influenced by genetics, environmental factors, and undoubtedly a combination through gene-environment interactions.

1.7 DNA methylation and aging

1.7.1 A historical perspective

While the link between DNA methylation and aging has been studied over an extended period, recent findings and advances have sharpened the focus on the role of DNA methylation in the aging process. Early studies began assessing age-associated DNA methylation patterns over 40 years ago, using techniques such as liquid chromatography to measure bulk methylation levels in salmon, rodent, cattle, and chicken (Hoal-van Helden & van Helden, 1989; Romanov & Vanyushin, 1981). Of note, early studies in rodent models identified a loss of average DNA methylation over the life course across many tissues, a finding that later was validated in blood from a cross-sectional human cohort consisting of both newborns and centenarians (Heyn et al., 2012; Wilson, Smith, Ma, & Cutler, 1987). These explorations laid the foundation for human DNA methylation aging studies, and as technology continues to advance allowing easier access to the methylome, we enter an exciting era of epigenetic aging research, where DNA methylation is emerging as a critical player in the aging process.

1.7.2 Age-associated DNA methylation

It is estimated that one-third of the epigenome’s DNA methylation content changes over the human lifespan, and recent advances have helped further elucidate the context and potential function of these changes. For example, there is a global loss of methylation with age (Unnikrishnan et al., 2018), preferentially occurring at regions of low CpG density and often located within gene bodies (Heyn et al., 2012). Despite the fact that mean DNA methylation decreases with age, there are specific age-related methylation changes that involve a gain in methylation as well. These tend to be found within CpG islands, or areas of high CpG density (B. C.
Christensen et al., 2009; Johansson, Enroth, & Gyllensten, 2013). Together, these changes demonstrate a regression-to-the-mean pattern: low CpG density regions, which are typically highly methylated, lose methylation with age, while high-density regions, which tend to have low levels of DNA methylation, gain methylation with age. Since most CpGs in the genome are methylated, this translates to a global loss of DNA methylation. There are exceptions to the density pattern; for instance, repetitive elements tend to be highly methylated and lose DNA methylation with age despite their high CpG density (Bollati et al., 2009). Interestingly, few age-associated DNA methylation sites have been correlated with nearby gene expression, indicating that the function of age-related methylation is far from well-defined (Hannum et al., 2013; Marttila et al., 2015; Reynolds et al., 2014; Steegenga et al., 2014). A host of research explorations, including animal models, human longitudinal twin studies, and age-variable cohorts, have all contributed to identifying DNA methylation patterns associated with age. From the combination of these research findings, it is evident that two common trends of epigenetic aging have emerged: 1) random changes to DNA methylation that are inconsistent across individuals, and 2) predictable, site-specific DNA methylation changes occurring in a similar way across individuals with age (Boks et al., 2015; M. J. Jones, Goodman, & Kobor, 2015a).

1.7.3 Epigenetic Drift

One repeatedly observed feature of age-associated DNA methylation is an increase in variability, resulting from changes in DNA methylation that do not share a common direction (gain or loss) across individuals (Zampieri et al., 2015). These age-related changes are collectively referred to as epigenetic drift, a trend of non-directional DNA methylation changes in part driven by environmental influences (van Dongen et al., 2016).
Epigenetic drift was well illustrated in an early study of monozygotic twins, which found that in infancy twin pairs possessed almost indistinguishable methylation profiles, while older twin pairs had highly divergent methylomes (Fraga et al., 2005). The increased inter-individual variability of DNA methylation with age is also reflected in increased variability in transcriptional regulation (Hannum et al., 2013). Epigenetic drift may not be benign, as variability in DNA methylation has been suggested to increase the risk for diseases such as depression and cancer (Córdova-Palomera et al., 2015; Zheng, Widschwendter, & Teschendorff, 2016). The source of this increased variation is still unknown and may represent random DNA methylation events or an age-associated decline in the efficiency of the machinery responsible for maintaining DNA methylation, such as downregulated or inhibited DNMTs or passive demethylation (Bonder et al., 2017; Hannum et al., 2013; Winnefeld & Lyko, 2012). Others have proposed that an individual’s unique combination of lifelong environments and experiences may create differences in cellular processes that in turn lead to higher variability in DNA methylation over time (Murgatroyd et al., 2009). Regardless of its cause, the phenomenon of epigenetic drift is fascinating, as it may lead to differential cellular functioning and diverse health outcomes, possibly reflected in varying aging rates.

1.7.4 Epigenetic Clocks

In contrast to epigenetic drift, there are CpGs in the genome that are highly associated with age across individuals throughout the life course. These sites follow the same age-related trajectories as epigenetic drift, in which sites that are poorly methylated gain DNA methylation, and sites that are highly methylated lose DNA methylation with age (Bocklandt et al., 2011; Hannum et al., 2013; Weidner et al., 2014).
The major difference is that the specific sites that change with age and the direction of that change are consistent across individuals (M. J. Jones, Goodman, & Kobor, 2015a; McEwen, Goodman, Kobor, & Jones, 2016). These sites have been used to construct multivariate chronological age predictors, giving rise to the concept of the “epigenetic clock” (Bocklandt et al., 2011; Hannum et al., 2013; Horvath, 2013; Koch & Wagner, 2011; Weidner et al., 2014) (Figure 1.1). Such models can predict chronological age either within a specific tissue or across multiple tissues (Guintivano et al., 2013; Hannum et al., 2013; Horvath, 2013; Weidner et al., 2014). The most commonly used age predictor to date was generated from over 8,000 DNA methylation profiles derived from 51 different cell types, identifying 353 CpG sites capable of predicting chronological age with a median absolute error of 3.6 years (Horvath, 2013).

Figure 1.1 Schematic of chronological age versus DNA methylation age (McEwen et al., 2016).

The widespread application of epigenetic age prediction has shown very high concordance between chronological age and predicted age; however, some individuals show large discrepancies between the two. These discrepancies have sparked a plethora of studies focused on determining the relationship between lifelong environmental exposures, biological age as measured by the epigenetic clock, and the presence of health and disease during aging. Specifically, the residuals extracted from a linear model of DNA methylation age regressed on chronological age, referred to as age acceleration or age deviation, have been associated with advanced or delayed aging phenotypes. Recent findings have shown epigenetic age acceleration in a number of diseases and disorders, though few studies have been able to determine whether this acceleration preceded, was concurrent with, or followed disease manifestation in late-onset diseases.
For example, features of neurodegeneration, such as declines in cognitive function, episodic memory, and working memory, as well as neuropathological measures, such as diffuse and neuritic plaques and amyloid load, have been associated with epigenetic age acceleration (M. E. Levine, Lu, Bennett, & Horvath, 2015b). Also, individuals with Down syndrome, which has been associated with early cognitive decline, have an average epigenetic age 6.6 years older than their chronological age (Horvath, Garagnani, et al., 2015a). Many other studies have shown deviations in the relationship between epigenetic age and chronological age in conditions such as schizophrenia, post-traumatic stress disorder, Parkinson’s disease, and HIV infection (Boks et al., 2015; Horvath & Levine, 2015; Horvath & Ritz, 2015; Rickabaugh et al., 2015; van Eijk et al., 2015; Wolf et al., 2015; 2017). In one case, however, researchers were able to show an association between lung cancer incidence and increased epigenetic age acceleration prior to diagnosis (M. E. Levine et al., 2015a). Additionally, a broad analysis of DNA methylation age and lifestyle indicated that nutritional factors such as fish intake, moderate alcohol consumption, fruit and vegetable intake, and poultry consumption were associated with epigenetic age acceleration (Quach et al., 2017). Furthermore, body-mass index (BMI) has been positively associated with DNA methylation age acceleration in the liver, but not in blood (Horvath et al., 2014). Together, these studies show that particular diseases and disorders are associated with increased biological age, a relationship consistent with the toll diseases take on human health.

The connection between accelerated epigenetic age and poor health is further reinforced by work analyzing the association between epigenetic age acceleration and all-cause mortality.
A longitudinal study found that an epigenetic age more than five years higher than chronological age was associated with a 21% increased mortality rate (Marioni, Shah, McRae, Chen, et al., 2015a). The heritability of age acceleration, that is, the degree to which it is attributable to genetic composition, was also assessed in a parent-offspring cohort and revealed that approximately 40% of the variation in age acceleration is due to genetic factors (Marioni, Shah, McRae, Chen, et al., 2015a). However, this genetic estimate has been shown to vary across groups and biological sex (Horvath, 2013; Horvath et al., 2016). These results indicate that although a significant proportion of age-related methylation changes may be under strong genetic influence, there are even more substantial unknown non-genetic contributions to the variation in these events. These findings provide one of the first links between DNA methylation-predicted age and mortality, highlighting the potential clinical relevance of age-related DNA methylation.

Another study investigated associations between epigenetic age and mortality in a cohort of 378 Danish twins, consisting of both monozygotic and dizygotic twin pairs, aged 30-82 years. Upon resampling the 86 oldest twins during a 10-year follow-up, a mean 35% higher mortality risk was associated with each 5-year increase in epigenetic age. Interestingly, in a separate intra-pair twin analysis, a 3.2 times greater risk of mortality per 5-year epigenetic age difference within twin pairs was observed for the epigenetically older twin, after controlling for familial factors (Christiansen et al., 2015). These findings highlight the link between mortality and DNA methylation-predicted age, exemplifying the capacity of DNA methylation to discriminate between biologically younger and older individuals independent of genetic sequence.
In addition to the multi-tissue epigenetic clock, other epigenetic age predictors have been established in blood (Hannum et al., 2013; Weidner et al., 2014). Training on chronological age has drawn some criticism, as it is unclear whether such tools capture true biological age; therefore, a DNA methylation-based clock was recently designed to predict health-span by training the model on a composite of age-related phenotypes, with the output referred to as “phenotypic epigenetic age” (M. E. Levine et al., 2018). In addition to epigenetic age predictors across the life course, there are also epigenetic predictors of gestational age at birth (Bohlin et al., 2016; Knight et al., 2016). Epigenetic clocks in other species, such as mice, dogs, and wolves, have also been generated using DNA methylation with impressive accuracy (De Paoli-Iseppi et al., 2017; Stubbs et al., 2017; Thompson, vonHoldt, Horvath, & Pellegrini, 2017). The high correlation between estimated epigenetic age and chronological age supports DNA methylation as a molecular marker of age.

The described relationships, in which the presence of disease is associated with acceleration in DNA methylation age, which in turn is related to mortality, are highly suggestive that epigenetic age may be an excellent biomarker of human health. Future work will determine whether acceleration in biological aging is reversible, and what factors might be involved in modifying the progression of aging.

1.8 Research Objectives

The goal of my dissertation research was to further characterize DNA methylation across the human life course. Specifically, I aimed to achieve this by filling several gaps in knowledge. First, I addressed the error in the predictive capacity of DNA methylation to estimate age for samples obtained from individuals under the age of 20.
Next, I explored patterns of DNA methylation in a healthy aging population enriched for centenarians to investigate unique biological signatures of longevity. Additionally, I examined the stability of DNA methylation over a 6-month lifestyle intervention promoting good health by increasing exercise, investigating whether the trajectory of epigenetic age differed in individuals who had experienced the physical activity program. Finally, I provided important evidence that the epigenetic clock performs well on the newest version of Illumina’s microarray technology, the EPIC array, despite the lack of 19 clock CpGs, and characterized the technical error of this tool due to preprocessing methods, findings that will encourage reproducibility in future studies.

Epigenetics is a fascinating field that remains to be fully unravelled in the context of aging and lifestyle variation, and the research I have outlined here will contribute to a more in-depth understanding of this area. This work may lay a foundation for unveiling potential mechanisms underlying the role of DNA methylation in aging. The characterization of DNA methylation in early life, particularly the work I showcased in the pediatric age range, contributes to elucidating the developmental origins of health and disease. Furthermore, my investigations into DNA methylation and longevity add to our current understanding of the underpinnings of healthy aging, and future work will hopefully build on these results to extend the number of healthy, mobile years spent in old age, potentially reducing the economic impact on our healthcare systems.
Chapter 2: DNA methylation age estimator for pediatric buccal swabs and potential applications to autism

2.1 Introduction

Epigenetic age, based on CpG methylation and often referred to as DNA methylation age, has recently emerged as a highly accurate estimator of chronological age (Bocklandt et al., 2011; Garagnani et al., 2012; Hannum et al., 2013; Horvath, 2013; Lin et al., 2016; Weidner et al., 2014). A widely used pan-tissue age estimator (herein referred to as the 353 CpG model) was developed on DNA methylation data of over 8,000 samples from 51 healthy tissues (Horvath, 2013). This epigenetic clock has been applied to many independent data sets, each showing strong correlations with chronological age. Deviations between DNA methylation age and chronological age, referred to as DNA methylation age acceleration, have been associated with several age-related health variables, with higher epigenetic age associated with an increase in mortality, frailty, cognitive decline, and decrease in time until death (Breitling et al., 2016; B. H. Chen et al., 2016; Christiansen et al., 2015; Marioni, Shah, McRae, Ritchie, et al., 2015b; Perna et al., 2016). Furthermore, DNA methylation age acceleration has been associated with conditions expected to accelerate biological aging, such as lifetime stress (Zannas et al., 2015).

Although correlations between chronological age and DNA methylation age, as measured by the 353 CpG model, have been reported in pediatric samples, a high degree of variability from chronological age has been observed (Simpkin et al., 2016). The inaccuracy in predicting age in pediatric versus adult populations is not surprising, as challenges have been reported when extrapolating several adult-based biomarkers to children (Goldman, Becker, Jones, Clements, & Leeder, 2011). Furthermore, the rate of DNA methylation change is much higher in the pediatric age range compared to adulthood (Alisch et al., 2012).
Thus, there is a need for an epigenetic predictor of age specific to the pediatric age range, in order to accurately detect deviations across populations that may be reflective of developmental trajectories or pediatric disease. In the present study, we have generated a tool to measure biological age using DNA methylation that is specific to the pediatric age range. We focused our predictor on buccal epithelial cells (BECs), as this tissue involves a non-invasive sample collection procedure, contains a lower degree of cellular heterogeneity, and has a high degree of DNA methylation stability (Forest et al., 2018). The utility of this highly precise pediatric molecular biomarker remains to be explored in detail; however, we anticipate deviations between pediatric DNA methylation age and chronological age to be representative of developmental processes and/or pediatric disease, as they are in adults.

2.2 Methods

2.2.1 Cohort descriptions

Training and test data set inclusion criteria consisted of BEC Illumina Infinium450 (450K array) or BEC Illumina InfiniumEPIC (EPIC array) microarray DNA methylation data derived from typically developing individuals ranging from birth to 20 years old. Samples were excluded if exact age in days was not available (sample collection date – date of birth) or if predicted biological sex did not match reported sex, based on the mean DNA methylation of three CpGs in the X-inactivation center (chrXIC: 72987573-72988170) on the X chromosome (Cotton et al., 2015). We obtained DNA methylation profiles of 1,721 individuals from 11 independent cohorts for our analyses. We divided these data into a training data set (data sets 1-7, n = 1,032) to generate the predictor and an independent test data set (data sets 8-11, n = 689) in order to report unbiased performance metrics.
We also included three non-BEC data sets to investigate the predictor accuracy in saliva (data set 12, n = 65) and blood (data sets 13 and 14, n = 19 and n = 134, respectively). Further data set-specific details, including genomic DNA extraction methods, are provided in Appendix A.3.

2.2.2 DNA methylation data processing

For all in-house data sets (data sets 1-12), 750 ng of genomic DNA was bisulfite converted using the EZ DNA Methylation Kit™ (Zymo Research, Irvine, CA) to deaminate unmethylated cytosine nucleotides, producing uracil nucleotides for sequence-based differentiation of methylated and unmethylated CpGs. Following this, approximately 160 ng of bisulfite-converted DNA was interrogated using the 450K or EPIC array, according to the manufacturer’s instructions (Illumina). Microarrays were scanned with an Illumina iScan system. Intensity data were transformed into beta values ranging from 0 to 1 and subsequently background subtracted and colour corrected using GenomeStudio software. Data were loaded into R statistical software (version 3.2.3) for all downstream analyses. The following probes were removed: probes predicted to cross-hybridize to more than one site in the genome, probes targeting polymorphic CpG sites, and probes designed to bind either the X or Y chromosome (Y.-A. Chen et al., 2013; Price et al., 2013). Additionally, we reduced our data set to probes represented on both the 450K array and the EPIC array, the most recent technology currently available. Probes with a bead count fewer than three in 5% of samples, as well as probes having a detection p-value greater than 0.01 in 1% of samples, were removed using the function ‘pfilter’ from the wateRmelon Bioconductor package (Pidsley et al., 2013). Non-variable probes, defined as having an inter-quantile range ≤ 0.05, were also removed (Edgar et al., 2017). Missing methylation data were imputed with the ‘impute.knn’ function (Troyanskaya et al., 2001).
Data were normalized using a modified beta mixture quantile dilation method to adjust for the two probe type design differences within the microarray platform (Horvath, 2013; Teschendorff et al., 2013). Lastly, we estimated BEC proportions using a previously described DNA methylation-based method (A. K. Smith et al., 2015). For blood-derived DNA methylation data sets (data sets 13 and 14), blood cell-type heterogeneity was accounted for by predicting proportions using a commonly used reference-based method (Houseman et al., 2012; Koestler et al., 2013), where the top principal components of these estimates were regressed out from the DNA methylation data. For any test data set with poorly performing clock CpGs (for example, clock probes with a poor detection p-value that would normally have been removed during pre-processing), values were set to NA and imputed based on mean methylation across samples prior to normalization.

2.2.3 Pan tissue DNA methylation age (353 CpG model)

For all test (data sets 8-11) and non-BEC data sets (data sets 12-14), data were processed using the methods described above. The 353 CpG model was applied using R statistical software with code supplied from https://labs.genetics.ucla.edu/horvath/DNAmage/.

2.2.4 Pediatric BEC DNA methylation age predictor

Methods similar to those used in the development of the epigenetic clock (353 CpG model) were used to create this predictor (Horvath, 2013). We employed an elastic net approach to select 94 CpG sites within a training data set (n = 1,032), and a test data set (n = 689) was used to evaluate and report accuracy metrics of the selected model. Lastly, we put forward a secondary model generated from all available data (collapsed training and test) for future researchers to employ, creating a user-friendly online tool to easily calculate pediatric DNA methylation age within their BEC-based cohorts: www.shinyapps.com/pedDNAmAge (link to be activated upon publication).
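The elastic net selection step described above can be illustrated with a minimal sketch. It is not the code used for this work: it is a plain-Python coordinate descent written for illustration, the function names (`soft_threshold`, `fit_elastic_net`, `predict_age`), the penalty parameters `lam` and `alpha`, and the toy beta values are all hypothetical, and any transformation of age applied before fitting is omitted. The point it shows is how the L1 part of the penalty can shrink uninformative probes to exactly zero, leaving a sparse set of clock CpGs.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 step of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(betas, ages, lam, alpha=0.5, n_sweeps=200):
    """Coordinate descent for the objective
    (1/2n) * sum((age - b0 - betas . b)^2)
        + lam * (alpha * |b|_1 + (1 - alpha) / 2 * |b|_2^2).
    betas: list of samples, each a list of methylation values (assumed layout).
    Returns (intercept, coefficients)."""
    n, p = len(betas), len(betas[0])
    col_means = [sum(row[j] for row in betas) / n for j in range(p)]
    age_mean = sum(ages) / n
    # Centre probes and ages so the intercept can be recovered afterwards.
    centred = [[row[j] - col_means[j] for j in range(p)] for row in betas]
    residual = [a - age_mean for a in ages]
    coefs = [0.0] * p
    col_sq = [sum(centred[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant probe carries no age information
            # Covariance of probe j with the residual, adding back its own effect.
            rho = sum(
                centred[i][j] * (residual[i] + centred[i][j] * coefs[j])
                for i in range(n)
            ) / n
            new_coef = soft_threshold(rho, lam * alpha) / (
                col_sq[j] + lam * (1.0 - alpha)
            )
            delta = new_coef - coefs[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= centred[i][j] * delta
                coefs[j] = new_coef
    intercept = age_mean - sum(c * m for c, m in zip(coefs, col_means))
    return intercept, coefs


def predict_age(intercept, coefs, sample):
    """Predicted age for one sample of methylation values."""
    return intercept + sum(c * v for c, v in zip(coefs, sample))


# Toy example (invented values): two probes measured in six samples.
toy_betas = [[0.1, 0.5], [0.2, 0.4], [0.3, 0.5], [0.4, 0.4], [0.5, 0.5], [0.6, 0.4]]
toy_ages = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]
intercept, coefs = fit_elastic_net(toy_betas, toy_ages, lam=0.01, alpha=0.5)
selected = [j for j, c in enumerate(coefs) if c != 0.0]  # indices of retained probes
```

In practice the penalty strength would be chosen by cross-validation within the training data only, so that the test data remain untouched until the final accuracy assessment.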
2.2.5 Genomic enrichment

To explore the genomic context of the 94 CpGs selected from the training data, CpGs were associated with gene features using the annotation described previously (Edgar, Tan, Portales-Casamar, & Pavlidis, 2014) and with CpG island features as provided in the Illumina annotation (Bibikova et al., 2011). The counts of clock CpGs located in each gene feature (promoter, intragenic, 3 prime region, and intergenic) and CpG island feature (island, north and south shore, north and south shelf, and no island association) were compared to the background counts of all CpGs measured. To compare the clock CpG counts to the background in each region or feature, 1,000 simulations of random CpG lists were used to calculate fold change values over the background.

2.3 Results

2.3.1 Training and test data set characteristics

We separated our 1,721 samples into two data sets: 1) a training data set (n = 1,032) containing BEC-derived DNA methylation profiles from typically developing individuals evenly distributed across our selected age range of 0-20 years old, and 2) a test data set (n = 689), also including typically developing individuals, constructed for the purpose of independently validating our predictor. We had a balanced sample of males and females in both data sets (training: 48% male, test: 53.3% male). Training and test data were processed independently during all quality control, filtering, and normalization steps. The Autism Spectrum Disorder (ASD) cohort consisted of both individuals affected with ASD and typically developing (TD) individuals under 20 years old (ASD = 47, TD = 32). Typically developing samples from the ASD cohort were included in the test data.
The tissue data sets consisted of two publicly available blood DNA methylation data sets (GSE64495 n = 19, GSE36054 n = 134) as well as a saliva data set (GRADY n = 65), all of which were restricted to typically developing individuals under 20 years old (Table 2.1).

Data set ID | Cohort¹ | N | Sex (% male) | Age range (years) | Mean age (years) [SD] | Microarray
Training data*
1 | APrON | 145 | 53.1 | 0.17-0.60 | 0.26 [0.06] | 450K
2 | C3ARE | 89 | 57.3 | 3.6-5.8 | 4.5 [0.51] | 450K
3 | GECKO | 254 | 48.4 | 7.0-12.0 | 9.40 [0.93] | 450K
4 | NDN I | 95 | 40.0 | 5.5-18.7 | 11.7 [3.41] | 450K
5 | MAVAN I | 98 | 48.0 | 4.2-11.2 | 6.66 [1.48] | 450K
6 | PAWS | 172 | 47.7 | 9.4-13.4 | 10.91 [0.93] | 450K
7 | WSFW | 179 | 43.0 | 18.0-19.5 | 18.46 [0.30] | 450K
 | Training total | 1032 | 48.0 | 0.17-19.5 | 9.47 [5.59] |
Test data*
8 | BIBO | 142 | 52.8 | 3.57-5.82 | 6.09 [0.15] | EPIC
9A | GSE50759 | 32 | 37.5 | 1.15-19.96 | 9.24 [4.92] | 450K
10 | MAVAN II | 265 | 53.2 | 4.97-10.96 | 7.80 [1.37] | 450K
11 | NEURON | 250 | 55.6 | 0.01-1.16 | 0.51 [0.48] | EPIC
 | Testing total | 689 | 53.3 | 0.01-19.96 | 4.87 [3.65] |
ASD data
9B | GSE50759 | 81 (47 ASD, 32 TD) | 61.7 | 1.15-19.96 | 8.01 [4.30] | 450K
Tissue data sets*
12 | GRADY (saliva) | 65 | 52.3 | 6-13 | 9.71 [1.7] | 450K
13 | GSE64495 (blood) | 19 | 31.6 | 2.3-10.8 | 6.32 [2.8] | 450K
14 | GSE36054 (blood) | 134 | 51.5 | 1-16.9 | 4.61 [4.1] | 450K

Table 2.1 Cohort Characteristics
¹ For full cohort descriptions, see Appendix A.3. * All typically developing (TD) children; ASD = Autism Spectrum Disorder; SD = standard deviation.

2.3.2 A precise tool to measure pediatric DNA methylation age in BEC samples

Using the training data set, we generated an unbiased measure of pediatric DNAm age using elastic net regression, which selected 94 informative age-related CpG sites. To validate our tool, we applied the 94 CpG predictor in the test data set, finding a Pearson correlation coefficient between chronological age and pediatric DNAm age of r = 0.98 (p-value ≤ 2.2 x 10^-16), with a test error (defined as the median absolute difference between DNAm age and chronological age) of 0.35 years (Figure 2.1).
Out of the 94 CpGs, DNAm at 50 CpG sites increased with chronological age and at 44 decreased. We also investigated the genomic context of the 94 CpG sites by annotating each CpG to a gene feature using the annotation described previously (Edgar et al., 2014) and using CpG island features as provided in the Illumina annotation (Bibikova et al., 2011). We found that the 94 CpGs were significantly depleted in open sea regions of the genome (Monte Carlo simulations, FDR ≤ 0.01), trended non-significantly toward enrichment in CpG islands (Monte Carlo simulations, FDR = 0.17), and showed no significant enrichment in other annotated gene features (Supplementary figure 2.1).

Figure 2.1 Chronological age versus estimated pediatric buccal DNA methylation age. a The training data (data sets 1-7) used in an elastic net regression for selection of 94 age-related CpG sites. Estimated age (DNA methylation age) (y-axis) versus chronological age (x-axis). b The test data (data sets 8-11) were used in an independent assessment of the 94 CpG model (r = 0.98, error = 0.35 years). Each data point represents an individual and the colour indicates the corresponding data set. Solid black lines indicate a linear regression line; dashed black lines represent the perfect correlation line.

2.3.3 Pediatric BEC DNA methylation age prediction was highly accurate across longitudinal sampling

To further investigate the accuracy of the 94 CpG model, we took advantage of the longitudinal nature of two test data sets, 10 (MAVAN, unique n = 127) and 11 (NEURON, unique n = 107 at both time points), which had repeated measures separated by 6 months to 2 years, depending on the individual and study. As expected, after predicting pediatric DNAm age at each time point, all samples from both data sets were estimated as being older at time point 2 than at time point 1.
Additionally, when estimating DNAm age in these samples using the 353 CpG epigenetic clock model, a larger error was observed across both time points compared to the 94 CpG model (MAVAN: 94 CpG error = 0.43 years, R² = 0.80, 353 CpG error = 3.8 years, R² = 0.48; NEURON: 94 CpG error = 0.20 years, R² = 0.93, 353 CpG error = 0.66 years, R² = 0.71), highlighting the accuracy of our pediatric age estimator (Figure 2.2).

Figure 2.2 Longitudinal data demonstrate comparative accuracy between the buccal pediatric DNA methylation age predictor and the 353 CpG pan-tissue model. a Data set 10 (MAVAN) 94 CpG model versus chronological age across two to three time points separated by 6 months to 2 years, depending on the individual. Each point colour represents an individual, and points from the same individual are joined by a line indicating the time between sampling. b Data set 10 (MAVAN) 353 CpG model. c Data set 11 (NEURON) 94 CpG model versus age across two time points separated by one year between sampling. d Data set 11 (NEURON) 353 CpG model. Each coloured line represents an individual and the matched data points indicate sample collection time point. Additionally, for data set 10, individuals at time point 1 varied in age between 2-12 years. Buccal DNA methylation age was calculated for each individual at both time points.

2.3.4 Positive pediatric DNA methylation age acceleration was associated with Autism Spectrum Disorder

Our predictive model was trained and tested on typically developing individuals in order to best represent general developmental patterns, but we expected that deviations from this estimate might serve as a biomarker for altered developmental trajectories. Autism spectrum disorder (ASD) has been characterized as a pediatric-onset disease with an altered development phenotype, and so we tested whether pediatric DNAm age differed in ASD-affected individuals in a publicly available data set restricted to individuals between the ages of 0-20 years old (GSE50759).
Similar to how the epigenetic clock has been investigated in adult samples, we extracted the residuals from a linear model of DNAm age regressed onto chronological age to obtain the “age acceleration residual” (referred to here as “age deviation”). We assessed the age deviation in a cohort of 47 ASD cases and 34 typically developing (TD) individuals (data set 9B), while controlling for covariates such as ethnicity, experimental batch, and bioinformatically predicted BEC proportions (A. K. Smith et al., 2015). We observed a significant difference in the age acceleration residual between ASD and TD (p-value = 0.01), with ASD cases having positive age acceleration (Figure 2.3). We performed a sensitivity analysis by re-testing this association after removing the outlier in the TD group, and observed a stronger association (p-value = 0.005, Supplementary figure 2.2). To further verify this association, we employed a non-parametric propensity score matching method to ensure the groups were balanced regarding covariate measures (Ho, Imai, King, & Stuart, 2017). This resulted in 22 ASD cases and 17 TD individuals, and the difference remained significant (p-value = 0.03). Furthermore, to address any concern that familial status was influencing this result, we randomly removed one individual from each sibling pair from our analyses, and ASD and TD were still significantly different in terms of age acceleration residuals (p-value = 0.03, ASD = 47, TD = 20). To account for unbalanced sample sizes between ASD and TD, we again performed propensity score matching on this subset and observed a moderate difference between ASD and TD age acceleration residuals (p-value = 0.04, ASD = 16, TD = 12). Given the small group sizes in these analyses, we emphasize cautious interpretation of this result and present these findings as strictly exploratory, requiring independent validation in more samples.
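The age deviation defined above, the residual from regressing DNAm age on chronological age, can be sketched as follows. This is a simplified illustration rather than the analysis code: it fits a single-predictor ordinary least-squares line in plain Python and omits the covariates (ethnicity, batch, predicted BEC proportions) that were controlled for here. The function name `age_deviation` and the toy values are hypothetical.

```python
def age_deviation(dnam_age, chron_age):
    """Residuals from an ordinary least-squares fit of DNAm age on chronological age.

    A positive residual means the sample is predicted to be older than expected
    for its chronological age (positive age acceleration)."""
    n = len(chron_age)
    if n != len(dnam_age) or n < 2:
        raise ValueError("need two equal-length lists with at least two samples")
    mean_x = sum(chron_age) / n
    mean_y = sum(dnam_age) / n
    sxx = sum((x - mean_x) ** 2 for x in chron_age)
    if sxx == 0.0:
        raise ValueError("chronological ages must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(chron_age, dnam_age))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(chron_age, dnam_age)]


# Toy example (invented values, in years).
chron = [1.0, 2.0, 3.0, 4.0]
dnam = [1.5, 2.0, 3.5, 4.0]
deviations = age_deviation(dnam, chron)  # approximately [0.1, -0.3, 0.3, -0.1]
```

Because the residuals are taken from a fitted line, they sum to zero within the fitted group and are uncorrelated with chronological age, which is why group comparisons are made on the residuals and not on the raw difference between predicted and chronological age.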
Figure 2.3 Buccal DNA methylation age acceleration was moderately higher in individuals diagnosed with Autism spectrum disorder (ASD) than in typically developing (TD) children. In data set 9B (GSE50759, n = 81), DNA methylation age was predicted using a 94 CpG model and subsequently regressed onto chronological age, while controlling for sex, batch, predicted buccal proportion, and ethnicity. A non-parametric propensity score matching method was applied to ensure the groups were balanced regarding covariate measures. Box plot hinges represent the 25th and 75th percentiles, and the upper and lower whiskers correspond to the largest and smallest values no further than 1.5 x the inter-quartile range from the hinges, respectively.

2.3.5 Pediatric BEC DNA methylation age in saliva and blood

As we trained our predictor exclusively in samples obtained from BECs, we chose to explore its performance in other commonly used tissues, specifically saliva and blood. Using the saliva samples from GRADY (data set 12, n = 65), we observed a moderate association between predicted age using the pediatric BEC clock and chronological age (r = 0.50, error = 1.31 years, Supplementary figure 2.3). However, these data lacked exact age; because the reported age was rounded to whole years, the reported predictor performance is less precise and could deviate by up to one year.

In an independent publicly available blood data set (data set 13, n = 134), samples were restricted to control individuals under 20 years old. We observed a correlation of r = 0.79 between chronological age and pediatric BEC age in data uncorrected for blood cellular heterogeneity. Notably, the error of the pediatric BEC clock in blood was much larger than in the test BEC data sets (error = 3.26 years).
Upon correcting for blood cell type heterogeneity, the correlation between predicted age and chronological age was moderately lower with a similar degree of error (r = 0.60, error = 3.02 years) (Supplementary figure 2.4a, Supplementary figure 2.5c). Similarly, in another publicly available data set (data set 14, n = 19), the pediatric BEC clock performed well (r = 0.88, error = 1.89 years), but when applied to cell type corrected DNAm data the correlation was considerably lower (r = -0.27, error = 2.82 years, Supplementary figure 2.5a, Supplementary figure 2.5c), meaning that variation in estimated cell type proportions contributed greatly to reducing the accuracy of the predictor. For both blood data sets 13 and 14, the 353 CpG pan-tissue model performed well (r = 0.96, error = 0.57 years; r = 0.95, error = 1.66 years, respectively) on data prior to cell composition adjustment (Supplementary figure 2.4b, Supplementary figure 2.5b). However, similar to the pediatric BEC clock, a reduction in accuracy was observed when the pan-tissue 353 CpG model was applied after cell type correction (data set 13: r = 0.82, error = 1.29 years; data set 14: r = -0.29, error = 5.69 years, Supplementary figure 2.4d, Supplementary figure 2.5d). These observations highlight the effect of blood cellular heterogeneity on age prediction in both the pan-tissue and 94 CpG clocks.

2.3.6 A pediatric BEC DNA methylation epigenetic clock for future studies

In addition to establishing a model using our training data and determining its accuracy using our test data, we have established a model based on the entirety of our samples (n = 1,721), following previous recommendations for future studies using DNA methylation age predictors (Simpkin, Suderman, & Howe, 2017) (Figure 2.4). With this approach, we cannot report test accuracy, but we have made this model available for use by the community in future studies (www.shinyapps.com/pedDNAmAge - link to be activated upon publication).
We employed the same parameters as the initial predictor from the n = 1,032 training data, but with all 1,721 samples included, and the number of probes in this model was coincidentally also 94 CpGs (Supplementary table 2.1); however, the overlap between this predictor and the model established from our training data set was only 64 CpGs (overlap indicated in Table 2.2).

Figure 2.4 Predictive model of age in pediatric buccal samples derived from all available data. a Chronological age versus DNA methylation age estimated by the 94-CpG model based on training data only. b Chronological age versus age estimates from a second 94 CpG model, based on the combination of both the training and test data sets, composed of typically developing, healthy individuals aged 0-20 years old. Error is the median absolute difference between estimated age and chronological age; the Pearson correlation coefficient (r) is shown.

| CpG ID | Coefficient | Age Correlation (TRAIN) | Age Correlation (TEST) | Chr | Coordinate¹ | Gene | Isoform: Genomic region | CpG Class |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| cg00059225 | 0.02 | 0.87 | 0.89 | 5 | 151304357 | GLRA1 | promoter | HC |
| cg00085493 | -0.1 | -0.78 | -0.81 | 1 | 208040203 | NA | intergenic | LC |
| cg00095976 | 0.01 | 0.69 | 0.76 | 6 | 118228060 | SLC35F1 | promoter | HC |
| cg00609333* | 0.02 | 0.63 | 0.8 | 12 | 16505000 | MGST1 | 1-4: intragenic, 5: promoter | LC |
| cg01287592 | -0.06 | -0.72 | -0.58 | 8 | 142139839 | DENND3 | intragenic | ICshore |
| cg01704999 | -0.1 | -0.7 | -0.57 | 15 | 65931192 | SLC24A1 | intragenic | LC |
| cg02209075* | 0.14 | 0.58 | 0.63 | 22 | 21798794 | HIC2 | intragenic | LC |
| cg02310103 | -0.08 | -0.84 | -0.81 | 3 | 50391305 | CYB561D2 | three_plus | LC |
| cg02426178* | -0.36 | -0.87 | -0.92 | 19 | 10747142 | SLC44A2 | 1-2: intragenic | IC |
| cg02821342* | -0.13 | -0.66 | -0.78 | 7 | 130793551 | MKLN1; LINC-PINT | promoter | LC |
| cg02980055* | -0.14 | -0.79 | -0.88 | 8 | 47866183 | NA | intergenic | IC |
| cg03020208* | 0.21 | 0.66 | 0.76 | 12 | 50354962 | AQP5 | promoter | HC |
| cg03466124 | -0.02 | -0.75 | -0.8 | 3 | 179168156 | GNB4 | intragenic | LC |
| cg03473016* | 0 | 0.71 | 0.79 | 3 | 123577794 | MYLK | intragenic | LC |
| cg03493146* | -0.16 | -0.69 | -0.8 | 7 | 69062082 | NA | intergenic | HC |
| cg03555227* | 0.64 | 0.91 | 0.93 | 5 | 170289070 | RANBP17 | promoter | HC |
| cg04221461* | 0.03 | 0.83 | 0.85 | 1 | 243877024 | AKT3 | 1-3: intragenic | LC |
| cg04452203 | -0.03 | -0.68 | -0.62 | 4 | 114332844 | NA | intergenic | LC |
| cg04937184 | -0.05 | -0.72 | -0.83 | 14 | 54423845 | BMP4 | promoter | HC |
| cg04948475 | 0 | -0.71 | -0.54 | 4 | 69810056 | UGT2A3 | intragenic | LC |
| cg05024939* | 0.15 | 0.77 | 0.77 | 4 | 113442251 | NA | intergenic | IC |
| cg05271255* | -0.12 | -0.83 | -0.87 | 17 | 79821342 | NA | intergenic | LC |
| cg05923197 | 0.04 | 0.77 | 0.76 | 14 | 54418804 | BMP4 | 1-2: intragenic | IC |
| cg05928290* | 0.16 | 0.74 | 0.83 | 14 | 54418728 | BMP4 | 1-2: intragenic | IC |
| cg06048436* | -0.15 | -0.62 | -0.72 | 13 | 43938637 | ENOX1 | 1-2: intragenic | LC |
| cg06144905* | 0.32 | 0.81 | 0.91 | 17 | 27369780 | PIPOX | promoter | LC |
| cg06198384 | 0 | 0.7 | 0.39 | 1 | 63791999 | NA | intergenic | IC |
| cg06416491* | -0.19 | -0.8 | -0.87 | 12 | 39837660 | KIF21A | promoter | ICshore |
| cg06430061 | -0.02 | -0.8 | -0.83 | 5 | 75013592 | POC5 | promoter | ICshore |
| cg06455149* | 0.09 | 0.82 | 0.86 | 8 | 25905811 | NA | intergenic | HC |
| cg06712013* | -0.05 | -0.72 | -0.73 | 12 | 49759545 | SPATS2 | promoter | LC |
| cg07066898* | 0.07 | 0.88 | 0.9 | 19 | 17717094 | UNC13A | intragenic | HC |
| cg07638500* | -0.03 | -0.67 | -0.72 | 3 | 123371420 | MYLK | intragenic | LC |
| cg07809484* | 0.07 | 0.85 | 0.87 | 19 | 51231968 | NA | intergenic | HC |
| cg08207604* | -0.11 | -0.76 | -0.76 | 6 | 35265308 | DEF6 | promoter | LC |
| cg08448547 | 0.06 | 0.73 | 0.34 | 8 | 47862954 | NA | intergenic | LC |
| cg09367901* | 0.04 | 0.71 | 0.82 | 14 | 54418851 | BMP4 | 1-2: intragenic | IC |
| cg09678615* | -0.08 | -0.75 | -0.76 | 16 | 52579163 | TOX3 | 1-2: intragenic | ICshore |
| cg10308629* | -0.09 | -0.72 | -0.83 | 7 | 134354803 | BPGM | intragenic | IC |
| cg11071401* | 0.23 | 0.9 | 0.74 | 17 | 48637194 | CACNA1G-AS1; CACNA1G | intragenic; promoter | ICshore |
| cg11084334* | 0.03 | 0.88 | 0.9 | 3 | 9594264 | LHFPL4 | intragenic | HC |
| cg11092416* | -0.09 | -0.77 | -0.79 | 6 | 29617549 | NA | intergenic | IC |
| cg11117945* | -0.01 | -0.78 | -0.79 | 20 | 45524714 | EYA2 | intragenic | LC |
| cg11298786 | -0.1 | -0.68 | -0.7 | 1 | 228195794 | WNT3A | intragenic | HC |
| cg11529819* | -0.09 | -0.74 | -0.78 | 1 | 27695677 | FCN3; NA | three_plus; intergenic | IC |
| cg11705975* | 0.18 | 0.93 | 0.9 | 10 | 120354248 | PRLHR | intragenic | HC |
| cg11804928* | -0.25 | -0.65 | -0.67 | 17 | 38220298 | THRA | 1-3: intragenic | LC |
| cg12483947* | -0.01 | -0.75 | -0.86 | 10 | 72640100 | SGPL1 | intragenic | LC |
| cg13462622 | 0.04 | 0.8 | 0.76 | 9 | 34578996 | CNTFR-AS1; CNTFR | intragenic; 1-2: intragenic | LC |
| cg13739066 | 0.05 | 0.74 | 0.69 | 14 | 105622312 | JAG2 | intragenic | IC |
| cg13783238 | 0 | 0.43 | -0.11 | 3 | 160121821 | SMC4; MIR15B, MIR16-2 | 1-2: intragenic; promoter, promoter | LC |
| cg13848598* | 0.05 | 0.87 | 0.87 | 10 | 115804578 | ADRB1 | intragenic | HC |
| cg14676592* | 0.14 | 0.84 | 0.85 | 16 | 49910862 | NA | intergenic | IC |
| cg15774391 | -0.14 | -0.74 | -0.46 | 17 | 40439628 | STAT5A | promoter | LC |
| cg16181396* | 0.09 | 0.9 | 0.86 | 3 | 147126206 | ZIC1 | promoter | IC |
| cg16243756 | 0.07 | 0.62 | 0.73 | 8 | 15397637 | TUSC3 | promoter | HC |
| cg16314254 | 0.12 | 0.7 | 0.71 | 1 | 230249965 | GALNT2 | intragenic | IC |
| cg16374999* | 0.02 | 0.81 | 0.85 | 12 | 49688980 | PRPH | promoter | HC |
| cg16514113* | 0.32 | 0.86 | 0.75 | 16 | 34428061 | NA | intergenic | LC |
| cg16618789* | 0.15 | 0.67 | 0.65 | 7 | 10116380 | NA | intergenic | LC |
| cg16703882 | 0.1 | 0.85 | 0.33 | 3 | 157823479 | SHOX2 | intragenic | HC |
| cg16852756* | -0.05 | -0.81 | -0.89 | 7 | 130630462 | LINC-PINT | 1-2: intragenic | LC |
| cg16867657* | 0.47 | 0.93 | 0.95 | 6 | 11044877 | ELOVL2-AS1; ELOVL2 | 1-2: intragenic, promoter; promoter | HC |
| cg17882660 | 0.04 | 0.77 | 0.76 | 13 | 100637258 | ZIC2 | intragenic | HC |
| cg18244483* | 0.22 | 0.8 | 0.89 | 4 | 107423468 | NA | intergenic | IC |
| cg18826352 | 0.09 | 0.59 | 0.64 | 20 | 57765900 | ZNF831 | promoter | LC |
| cg19203457* | 0.09 | 0.79 | 0.86 | 6 | 114181798 | MARCKS | intragenic | HC |
| cg19381811* | -0.08 | -0.83 | -0.86 | 3 | 49851713 | UBA7 | promoter | LC |
| cg19662687* | -0.03 | -0.72 | -0.81 | 1 | 161337679 | C1orf192 | promoter | LC |
| cg20283670* | -0.14 | -0.77 | -0.9 | 10 | 116162728 | AFAP1L2 | intragenic | ICshore |
| cg20397034* | -0.68 | -0.74 | -0.7 | 7 | 130794584 | MKLN1; LINC-PINT | promoter; promoter | IC |
| cg20704560* | -0.03 | -0.68 | -0.71 | 1 | 10698313 | CASZ1 | intragenic | ICshore |
| cg20755989* | 0.19 | 0.74 | 0.88 | 5 | 28403960 | NA | intergenic | LC |
| cg20847090 | 0.01 | 0.76 | 0.62 | 1 | 23763959 | ASAP3 | intragenic | IC |
| cg20897936* | 0.19 | 0.91 | 0.9 | 6 | 28554741 | SCAND3 | intragenic | IC |
| cg21515406* | 0.01 | 0.63 | 0.72 | 5 | 3325150 | NA | intergenic | IC |
| cg21572722* | 0.65 | 0.9 | 0.92 | 6 | 11044894 | ELOVL2-AS1 | 1-2: intragenic, promoter; promoter | HC |
| cg22281505 | -0.04 | -0.71 | -0.69 | 17 | 41158783 | IFI35 | promoter | LC |
| cg23500537* | 0.08 | 0.84 | 0.9 | 5 | 140419819 | NA | intergenic | IC |
| cg23586440 | 0.01 | 0.6 | 0.66 | 20 | 31067009 | C20orf112 | 1-2: intragenic | LC |
| cg23731991* | -0.07 | -0.63 | -0.56 | 7 | 130795031 | MKLN1; LINC-PINT | promoter; promoter | IC |
| cg24374161* | -0.11 | -0.81 | -0.84 | 11 | 46582058 | AMBRA1 | 1-2: intragenic | LC |
| cg24441922* | -0.15 | -0.79 | -0.82 | 2 | 235401950 | ARL4C; NA | 1-2: three plus, intergenic | LC |
| cg24739596 | 0 | 0.8 | 0.67 | 1 | 243967381 | AKT3 | 1-3: intragenic | LC |
| cg24881205 | 0.04 | 0.77 | 0.61 | 7 | 20826188 | SP8 | intragenic | ICshore |
| cg24996883* | -0.24 | -0.78 | -0.88 | 1 | 226821857 | ITPKB | intragenic | LC |
| cg25062539* | -0.35 | -0.79 | -0.84 | 8 | 79718482 | IL7 | promoter | LC |
| cg25936902* | -0.05 | -0.65 | -0.75 | 13 | 30982971 | NA | intergenic | ICshore |
| cg26535072* | -0.15 | -0.74 | -0.77 | 9 | 72288156 | APBA1 | promoter | HC |
| cg26650655 | -0.06 | -0.79 | -0.7 | 5 | 88178400 | MEF2C | 2-3: intragenic | LC |
| cg26718511 | 0.18 | 0.78 | -0.13 | 6 | 25652531 | SCGN | promoter | HC |
| cg27061971* | 0.14 | 0.82 | 0.86 | 5 | 72746990 | NA | intergenic | HC |
| cg27396655 | -0.33 | -0.77 | -0.79 | 22 | 39240517 | NPTXR | promoter | HC |
| cg27437510* | 0.11 | 0.7 | 0.82 | 7 | 150817197 | AGAP3 | 1-3: intragenic | IC |

Table 2.2 Description of the 94 CpGs in the pediatric DNA methylation BEC epigenetic clock based on training data

2.4 Discussion

Birth to late adolescence is a tremendously dynamic period of development and growth, and an accurate molecular marker representing this biological activity has yet to be established. By assessing DNAm profiles in 1,721 healthy individuals, ranging in age from 0 to 20 years, we have generated a predictor of age specific to pediatrics using DNAm values in BECs at 94 CpG sites. Importantly, we highlighted that children with a neurodevelopmental disorder, ASD, showed a higher BEC DNAm age than those considered typically developing.
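For reference, the two accuracy summaries reported for the clocks throughout this chapter, the Pearson correlation coefficient (r) and the error (median absolute difference between estimated and chronological age), can be sketched as follows. This is a minimal Python illustration using made-up values; it is not the code or data used in this work, and the function names are hypothetical.

```python
import math
from statistics import mean, median


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def median_absolute_error(predicted, actual):
    """Median of |predicted - actual|, the 'error' reported for each clock."""
    return median(abs(p - a) for p, a in zip(predicted, actual))


# Toy values for illustration only (years); not data from this study.
chronological_age = [2.0, 5.5, 9.0, 13.0, 17.5]
dnam_age = [2.4, 5.0, 9.6, 12.5, 18.0]

r = pearson_r(chronological_age, dnam_age)
error = median_absolute_error(dnam_age, chronological_age)
print(f"r = {r:.2f}, error = {error:.2f} years")
```

Reporting both values is useful because a clock can track age closely (high r) while still being offset from it (large error), as seen when the BEC clock was applied to blood.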
The original 353 CpG pan-tissue age estimator applies to the entire life course: from prenatal samples to super-centenarians (Horvath, 2013; Horvath, Pirazzini, et al., 2015b; Simpkin et al., 2016; 2017). However, more accurate age estimators can be constructed by focusing on a more limited age range. In particular, more accurate epigenetic age estimators have been developed for gestational age based on cord blood samples (Bohlin et al., 2016; Knight et al., 2016). Our study demonstrates that the BEC estimator outperforms the pan-tissue estimator in buccal samples from a pediatric population. It is not surprising that previous predictors of age do not perform exceptionally well in children, as this is a unique period of rapid change in DNAm that is unlikely to mimic adult methylome dynamics (Alisch et al., 2012). Given the differences in development and aging across the life course, applying adult-based markers to pediatric populations may not be an appropriate approach.

The 353 CpG clock is agnostic of tissue type, which inevitably sacrifices some precision when estimating age within any single tissue. Part of the reason for the increased accuracy of the BEC estimator stems from building the predictor in a single target tissue within a focused age range of 0-20 years old. The specificity of the sample target may be viewed as a limitation and, as we have shown, makes the predictor less robust in other tissues. However, the loss of applicability to other tissues was necessary as it allowed our predictor to reach the highest accuracy of any epigenetic clock to date. We believe that since age-related DNAm changes occur at different rates across tissues, focusing on a single tissue type is best for obtaining the highest possible accuracy. Additionally, given the difficulty in obtaining pediatric biological samples, BECs are very commonly used in pediatric biomarker research and offer a non-invasive collection protocol.
Therefore, our predictor targets a single tissue with wide applicability in pediatric research, which we hope will maximize its utility. The cell type proportions of heterogeneous tissues, such as blood, can change with age, for example the proportion of CD8+ T cells (Jaffe & Irizarry, 2014). Focusing on a more homogeneous tissue, such as BECs, should reduce this source of confounding. Interestingly, in the blood test data sets, when blood cell type composition was adjusted for in the DNAm data prior to calculating pediatric BEC DNAm age, the association between estimated age and chronological age decreased considerably. This suggested that BEC DNAm age predictions in blood were not driven exclusively by changes in methylation, but were instead highly associated with age-related changes in blood cell type proportions, underscoring that the present tool is intended to be applied exclusively to BEC samples. As DNAm is highly specific to age range and tissue, it is not surprising that we note a minimal overlap of only one CpG site (cg06144905) when comparing the 353 CpG model sites and the 94 CpG (training-based) model sites used in our pediatric epigenetic clock. Since this epigenetic clock is unique to children and has minimal overlap with the 353 CpG pan-tissue model, which has been heavily characterized in adult populations, we expect the 94 CpG model may be capturing developmental phenotypes rather than age-related measures in adults, such as frailty and mortality. This difference highlights the unique relationship between DNAm and aging during childhood and the need for a specific tool to measure DNAm age in individuals during the pediatric period.

Our results investigating the ASD cohort suggest that deviations between pediatric DNAm age and chronological age may be associated with altered developmental trajectories and pediatric disorders.
We chose to assess this pediatric-onset disease because it has previously been reported to have altered developmentally related phenotypes, such as increased body growth, head growth, and body weight during early development (Surén et al., 2013). We showed that individuals with ASD had increased DNAm age acceleration compared to typically developing children. Previous research has also indicated DNAm differences in individuals affected by ASD (Berko et al., 2014), emphasizing the potential utility of DNAm as a biomarker in this disorder. In conclusion, this study describes a highly accurate molecular measure of chronological age using DNAm obtained from BECs in pediatric samples. We have presented a pediatric DNAm clock based on all available data for future applications, as recommended by others in the field (Simpkin et al., 2017). The utility of this tool remains to be explored, but it may open new frontiers in this area of research. The pan-tissue epigenetic age estimator has revealed associations between epigenetic age acceleration and pubertal staging (Binder et al., 2018), Down Syndrome (Horvath, Garagnani, et al., 2015a), and Werner syndrome (Maierhofer et al., 2017). Future studies involving the BEC DNAm age predictor should explore whether similar associations can be observed in buccal epithelial cells from children.

2.5 Acknowledgments

The authors would like to acknowledge all study participants. We also thank the researchers who generously uploaded their data to the public repository GEO; without their contributions this work would not be possible. MS Kobor is the Canada Research Chair in Social Epigenetics, Senior Fellow of the Canadian Institute for Advanced Research, and Sunny Hill BC Leadership Chair in Child Development. LM McEwen is supported by a Frederick Banting and Charles Best CIHR Doctoral Research Award (F15-04283).
Chapter 3: Differential DNA methylation and lymphocyte proportions in a Costa Rican high longevity region

3.1 Introduction

Aging is a complex biological process that progressively leads to physiological decline and an increased risk for death. The genetic component of lifespan is estimated to be less than 30%, leaving the remainder to be determined by environmentally and socially influenced factors such as diet, exposure to infection, and lifestyle choices (Brooks-Wilson, 2013; Passarino, De Rango, & Montesanto, 2016). While the mechanistic regulation of these non-genetic influences is poorly understood, previous work has suggested that epigenetic processes may be tightly interwoven with biological aging (M. J. Jones, Goodman, & Kobor, 2015a). Epigenetics generally refers to the study of altered chromatin states, such as modifications to DNA and the proteins involved in its packaging and regulation. To date, DNA methylation is the most commonly studied epigenetic mark in human populations, as recent advances in technology have allowed for the inexpensive high throughput measurement of >400,000 CpG sites across the genome. There are many other studied epigenetic processes, such as post-translational histone modifications, histone variants, and non-coding RNAs; however, these modifications have been studied mainly in model organism and cancer research.

DNA methylation is one type of epigenetic modification that impacts how genes are expressed and thus can have important phenotypic and functional consequences for an organism. Unlike the DNA sequence itself, DNA methylation is changeable through environmental influences over an individual’s life course. DNA methylation involves the covalent addition of a methyl group to the 5-carbon of a cytosine, most often in the context of CpG dinucleotides (cytosine-phosphate-guanine).
Genetic variants, such as single-nucleotide polymorphisms (SNPs), can affect DNA methylation at nearby CpG sites; such variants are called methylation quantitative trait loci (mQTL) (Shoemaker, Deng, Wang, & Zhang, 2010). DNA methylation also varies in association with environmental and behavioral factors, such as diesel fume exposure and smoking. Additionally, variability in DNA methylation accumulates across one’s entire lifespan, as discussed in detail below. DNA methylation patterns are associated with altered gene activity. Shifts in DNA methylation levels may follow a change in gene expression or may act in the recruitment of methylation-dependent transcription factors that regulate transcriptional machinery. Understanding the effect of environmental influences on DNA methylation is important for unravelling the intricate regulation of genes and possible functional consequences of these alterations.

Age-related DNA methylation encompasses at least two distinct phenomena. First, specific CpGs associated with chronological age have been identified and, in some cases, replicated in several human populations. These age-related DNA methylation signatures can be either tissue-specific or occur across several tissue types (Farré et al., 2015). The “epigenetic clock” is a tool based on individual CpGs that change with age. Epigenetic clocks are DNA methylation-based markers of biological age, either confined to a single tissue or consistently accurate across tissues (Hannum et al., 2013; Horvath, 2013; Weidner et al., 2014). Age acceleration, i.e., the deviation of the epigenetic age estimated by an epigenetic clock from chronological age, has been associated with all-cause mortality, time until death, and frailty (Breitling et al., 2016; B. H. Chen et al., 2016; Marioni, Shah, McRae, Chen, et al., 2015a). Second, inter-individual DNA methylation variability increases with age, a phenomenon referred to as epigenetic drift (Martin, 2005).
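One common way to quantify age acceleration is as the residual from regressing epigenetic age on chronological age. The sketch below illustrates that idea in Python with a simple one-predictor least-squares fit and made-up values. It is an assumption-laden simplification: the analyses in this thesis additionally adjusted for covariates such as sex and cell type proportions, and the function name is hypothetical.

```python
from statistics import mean


def age_acceleration(dnam_age, chronological_age):
    """Residuals of DNAm age regressed on chronological age (simple OLS).

    A positive residual means the sample is epigenetically older than
    expected for its chronological age; a negative one means younger.
    """
    mx, my = mean(chronological_age), mean(dnam_age)
    sxy = sum((x - mx) * (y - my) for x, y in zip(chronological_age, dnam_age))
    sxx = sum((x - mx) ** 2 for x in chronological_age)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x)
            for x, y in zip(chronological_age, dnam_age)]


# Toy values for illustration only (years); not data from this study.
chronological = [60, 70, 80, 90]
epigenetic = [55, 66, 71, 84]
residuals = age_acceleration(epigenetic, chronological)
print([round(v, 2) for v in residuals])
```

Because the residuals are centred on the fitted line, they sum to zero within a cohort, so this measure captures relative acceleration between individuals rather than an overall offset between epigenetic and chronological age.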
It is critical to address cell type heterogeneity when investigating DNA methylation patterns in tissues containing mixed cell types. Not only does cell type composition change over one's lifespan, but it is also the primary source of variation in DNA methylation across healthy individuals. DNA methylation profiles obtained from identical cell types in different individuals show higher similarity than two different cell types from the same individual (M. J. Jones, Islam, Edgar, & Kobor, 2015b). Given that isolating DNA from a single cell type is not always feasible and that cell count information is sometimes not available, bioinformatics-based methods have been developed to estimate cell type proportions using DNA methylation profiles in blood and brain (Guintivano et al., 2013; Houseman et al., 2012). The blood-based predictions are closely correlated with complete blood count measures, supporting the validity of these methods for deriving accurate blood count information bioinformatically (Koestler et al., 2013). It is also worth noting that measures of epigenetic age acceleration specific to whole blood have been defined to account for age-induced changes to cell type proportions (Horvath, 2013). These measures are integral when analyzing DNA methylation from whole blood, as the proportion of certain blood cell types, such as CD8+ memory and naïve T cells, changes with age (Fagnoni et al., 2000).

Aging research in humans commonly investigates the unique biological and lifestyle characteristics of individuals surviving to old age (Beekman et al., 2013). An alternative approach, used in the current study design, is to examine the underlying biology of longevity by examining a population characterized as having a particularly high life expectancy (Rose, 2001), that of the Nicoya region of Costa Rica, and comparing it to the rest of Costa Rica, which has moderately lower life expectancy.
By averaging out the stochastic variation in aging among individuals within each geographic region of the country, this approach offers a way to identify regional (rather than individual) differences associated with healthy aging and longevity. While the method of examining area-based determinants of health and longevity has received substantial attention in biomedical research (Diez Roux, 2001), a lack of appropriate data sources has limited its application in understanding the biological mechanisms of longevity.

The population of the Nicoya peninsula of Costa Rica has been characterized by exceptionally high longevity, providing an intriguing framework to explore the relationship between DNA methylation and aging (Rosero-Bixby, Dow, & Rehkopf, 2013b). Mortality rates among elderly Costa Ricans in Nicoya are substantially lower than in the rest of Costa Rica, with individuals in Nicoya being some of the most long-lived in the world. The relative mortality rate of Nicoya as compared to similar age cohorts in the rest of Costa Rica is 0.80. This advantage remains significant after statistical control for level of education and type of health insurance (Rosero-Bixby et al., 2013b). The Nicoyan advantage is particularly evident in cardiovascular disease, despite the fact that risk factors like smoking, physical activity and systolic blood pressure are similar throughout Costa Rica. One key indicator of the Nicoyan characteristic is longer knee height, an anthropometric biomarker that is associated with early childhood environment (Bogin & Varela-Silva, 2010). Nicoyans also have lower BMI, waist circumference, and, among men, lower levels of HbA1c, glucose, triglycerides and total/HDL cholesterol ratio (Rosero-Bixby et al., 2013b). We have previously shown that leukocyte telomere length, an aging biomarker, also has more favorable (longer) levels among Nicoyans compared to individuals in the rest of Costa Rica (Rehkopf et al., 2013).
In this study we examined DNA methylation in a nationally representative population of Costa Ricans, investigating potential biological differences that may help explain the higher longevity observed in Nicoyans as compared to other Costa Ricans. We focused on DNA methylation, as this epigenetic mark can be modified by environmental influences, has the potential to regulate gene expression, and most importantly, has an established relationship with aging across the mammalian lifespan (M. Jung & Pfeifer, 2015). Using genome-wide DNA methylation patterns to predict blood cell type composition, we determined differential estimated proportions of age-related immune cells. While we did not observe differences in epigenetic age acceleration, we did find significantly decreased DNA methylation variability in Nicoyans. Finally, we identified DNA methylation patterns unique to Nicoyans, at both genomic regions and specific CpG sites. Understanding differences in DNA methylation patterns between Nicoyans and other Costa Ricans (non-Nicoyans) may offer new insights into the role of DNA methylation in aging and help to illuminate why Nicoyans have among the longest old age life expectancies.

3.2 Materials and methods

3.2.1 Sample preparation and data collection

Whole blood was collected from participants and genomic DNA was extracted at the University of Costa Rica from 2 ml of frozen whole blood using the phenol-chloroform method. DNA was bisulfite converted with the Zymo Research EZ DNA Methylation Kit (Irvine, CA, USA). Approximately 160 ng of bisulfite converted DNA from each sample, with the addition of one technical replicate, were randomized across eight 450K array BeadChips and run in one batch according to the manufacturer’s protocol (Illumina, San Diego, CA, USA).
Qiagen Pyromark Assay Design 2.0 software (Hilden, Germany) was used to generate pyrosequencing assays targeted to three of the identified differentially methylated 450K array CpGs in all samples to verify our findings and confirm the results were not an artifact of technical variability of the microarray. Pyrosequencing was performed on a Qiagen Pyromark™ Q96 (Hilden, Germany) according to the manufacturer’s instructions. All primer sequences are listed in Supplementary table 3.1.

3.2.2 Data preprocessing and normalization

Illumina GenomeStudio Software (San Diego, CA, USA) was used to subtract background noise and colour correct raw data using control probes. Data were extracted in the form of an average β-value matrix and imported into R Statistical Software (version 3.2.3) for the remainder of data processing and statistical analyses. β-values were logit transformed to M-values, which are less heteroscedastic, and M-values were used for all statistical analyses; β-values were used for visualization purposes as they represent percent methylation (0 = no methylation, 100 = fully methylated).

We removed a subset of probes that could potentially lead to erroneous results. These consisted of 65 SNP control probes, probes that were specific to either the X or Y sex chromosomes, probes with missing β-values or with a detection p-value above 0.01 in 5 or more samples, polymorphic CpG probes, and cross-hybridizing XY probes. The total number of probes post filtering based on these criteria was 441,109 (Price et al., 2013). Sample outliers, defined as having more than 5% of their total probes fail, were not removed.

Subset-quantile within array normalization (SWAN) was used to account for type I and II probe differences on the 450K array (Maksimovic et al., 2012). Known technical variation (Sentrix ID and position) was accounted for with the R function ‘ComBat’ (W. E. Johnson, Li, & Rabinovic, 2007).
These corrections were confirmed by principal component analysis performed before and after each step.

3.2.3 Estimation of blood cell proportions

A validated cellular deconvolution method was used to estimate cell type proportions in each blood sample, namely: CD4+ and CD8+ T cells, natural killer cells, B cells, monocytes and granulocytes (Koestler et al., 2013). The predicted abundance levels of CD8+ T naïve and CD8+ T memory cells were obtained from the ‘Advanced Blood Analysis’ of the online DNA methylation age predictor (Horvath, 2013). Significance values were generated by performing a Kruskal-Wallis test for each cell type proportion by group.

3.2.4 Prediction of epigenetic age

Three epigenetic clocks were used to predict biological age. The ‘Horvath’ and ‘Hannum’ estimates were computed with the online epigenetic clock software (Hannum et al., 2013; Horvath, 2013). The ‘Weidner’ age prediction was generated using the previously described 99 CpG model (Lin et al., 2016).

3.2.5 DNA methylation analysis

The R package ‘DMRcate’ was used to find differentially methylated regions (DMRs) (T. J. Peters et al., 2015). The DMRcate model contained Nicoya group, chronological age, sex, and estimated cell type proportions. This tool uses Gaussian kernel smoothing of DNA methylation across the genome. The Benjamini-Hochberg (BH) method was applied to correct for multiple comparisons, with a significance threshold of 0.05 and a β-value difference of at least 5% (Hochberg & Benjamini, 1990). Site-specific differential DNA methylation analysis was conducted using moderated t-statistics with empirical Bayes variance estimation using the Bioconductor R package ‘limma’ with chronological age, sex, and cell type proportions as covariates (Smyth, 2004). M-values consisted of logit-transformed β-values to achieve a measure with uniform variation and decreased heteroscedasticity. Significance values were corrected for multiple testing using the BH method (Hochberg & Benjamini, 1990).
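Two of the numerical steps described in these methods, the logit transformation of β-values to M-values and the Benjamini-Hochberg adjustment of p-values, can be sketched in a few lines. The Python below is an illustrative sketch only; the analyses themselves were run in R, the function names are hypothetical, and the small clamping constant `eps` is an assumption added here so that β-values of exactly 0 or 1 do not produce infinite M-values.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Logit (base 2) transform of a beta-value on the 0-1 scale to an M-value.

    eps is an illustrative assumption: it clamps beta away from 0 and 1.
    """
    b = min(max(beta, eps), 1 - eps)
    return math.log2(b / (1 - b))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy values for illustration only; not data from this study.
m_values = [beta_to_m(b) for b in (0.1, 0.5, 0.8)]
adjusted_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print([round(v, 3) for v in m_values])
print([round(v, 4) for v in adjusted_p])
```

A β-value of 0.5 maps to an M-value of 0, and values near 0 or 1 are stretched outward, which is why M-values have more uniform variance across the methylation range.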
DNA methylation variability was calculated as the inter-quantile range (IQR) between the 90th and 10th percentiles, computed for each group independently at each CpG. A significance value was generated by performing a Wilcoxon signed-rank test between groups.

3.2.6 Inferred genetic ancestry

Population structure was inferred using the ‘Epistructure’ method in the command-line tool GLINT (Rahmani et al., 2016). This method applies principal component (PC) analysis on a reference list of genetically informative 450K array probes to infer genetic structure. Linear regression was used to compare each PC with group status, while adjusting for sex, cell type proportions, and age.

3.3 Results

3.3.1 Cohort characteristics and DNA methylation data

We examined a subset of samples from the Costa Rican Study on Longevity and Healthy Aging (CRELES), a longitudinal sample of close to 3,000 adults aged 60 years and over, collected mostly in 2005, with over-sampling of older ages (Rosero-Bixby, Dow, & Fernández, 2013a). We assayed DNA methylation profiles of 48 Nicoyans (longevity group) and 47 non-Nicoyans (control group). In order to maximize statistical power for our age-based hypothesis, we randomly sampled half of the individuals from those aged 60 to 75 and the other half from those aged 95 and above, selecting an equal number in each age category among Nicoyans and non-Nicoyans. Table 3.1 shows the mean characteristics of these populations. Nicoyans tend to have lower levels of education and lower wealth than non-Nicoyans, but are similar on observed health-related characteristics. We obtained DNA methylation profiles from whole blood using the Infinium HumanMethylation450 (450K) array, a genome-wide microarray that quantifies DNA methylation at over 485,000 sites. We applied data quality controls to remove poor performing probes, probes that hybridized to the X and Y chromosomes, and probes predicted to cross-hybridize (Price et al., 2013).
Our final data set for subsequent analyses consisted of 441,109 sites.

| Characteristics | Nicoya (n = 48) | Non-Nicoya (n = 47) |
| --- | --- | --- |
| Mean age (sd) | 83 (14) | 85 (16) |
| Female (%) | 57 | 55 |
| Low education (%) | 80 | 68 |
| Low wealth (%) | 35 | 21 |
| Currently smoke (%) | 4 | 6 |
| Systolic blood pressure (mean mmHg) | 139 (23) | 140 (25) |
| Diastolic blood pressure (mean mmHg) | 78 (12) | 78 (13) |
| Body mass index (mean) | 24 (7.1) | 25 (5.8) |

Table 3.1 Cohort characteristics (means and percents), Nicoyans and non-Nicoyans. Standard deviations are shown in parentheses. Low education is not completing primary school. The wealth index was based on a simple count of ten goods and conveniences in the household, ranging from running water and a toilet to having a clothes washer and a car. Low wealth is defined as having six or fewer of these items. Systolic and diastolic blood pressure and body mass index were measured at the time of interview. Currently smoking was assessed through questionnaire. Further details on survey measures are available elsewhere (Rosero-Bixby & Dow, 2009).

3.3.2 Nicoyans had fewer estimated CD8+ memory cells and more naïve T cells

Whole blood is a heterogeneous tissue containing certain cell types that change with age, with age-related decreases occurring in CD8+ T, CD4+ T and B lymphocytes, and the greatest increases in natural killer cells and monocytes (Jaffe & Irizarry, 2014). To assess these differences in our cohort, we performed a previously described blood cell type deconvolution (Houseman et al., 2012) by using the DNA methylation profiles of each sample to estimate the proportions of granulocytes, natural killer cells, CD8+ T lymphocytes, CD4+ T lymphocytes, monocytes, and B lymphocytes. We found that Nicoyans had a statistically significantly lower proportion of estimated CD8+ T cells when compared to non-Nicoyans (Kruskal-Wallis p-value = 0.0038) (Figure 3.1a).
We also observed that Nicoyans had a higher mean level of estimated granulocyte proportions, although this reached only borderline significance (Kruskal-Wallis p-value = 0.0486). It is important to note that we did not focus on the blood composition as a whole, as we were primarily interested in specific age-related cell type trends.

Figure 3.1 Nicoyans had differential CD8+ naïve and memory T cell abundance levels. a Box plots of bioinformatically derived white blood cell proportions in Nicoyans and non-Nicoyans. Cell proportions were estimated using the Houseman method. Blue: Nicoyans, white: non-Nicoyans. The p-value is derived from a nonparametric group comparison using the Kruskal-Wallis test. b Box plots illustrating the bioinformatically derived CD8+ naïve T cell and CD8+ memory T cell abundance across Nicoyans and non-Nicoyans. Box plot hinges represent the 25th and 75th percentiles and the upper and lower whiskers correspond to the largest and smallest value no further than 1.5 x the inter-quartile range, respectively. c Scatter plots of chronological age plotted against the CD8+ naïve T cell and CD8+ memory T cell abundance level of each sample. CD8+ naïve T cells show a decrease with age, whereas CD8+ memory T cells increase with chronological age. Pearson’s r coefficients were derived from log-transformed age correlated with each respective cell type level. Blue: Nicoyans, black: non-Nicoyans. Line of best fit shown with 95% confidence intervals shaded in the respective group color. The cell abundance scale is a bioinformatically derived prediction for the respective cell type, which uses flow-sorted counts from other data sets to infer the proportion of that specific isolated cell type from the DNA methylation data (Horvath & Levine, 2015; Horvath & Ritz, 2015; Marioni, Shah, McRae, Chen, et al., 2015a).
To further investigate the differential proportion of estimated CD8+ T cells, we applied a more detailed cell deconvolution tool that provides an expanded estimation of CD8+ T naïve cells (CD8+CD45RA+CCR7+) and CD8+ T memory cells (CD8+CD28-CD45RA-) (Horvath, 2013). CD8+ T memory cells typically increase with age and CD8+ T naïve cells generally decrease with age through thymic involution (Fagnoni et al., 1996). Given that these measures are proportional and highly correlated, it is statistically appropriate to assess the ratio of CD8+ T naïve cells to CD8+ T memory cells (Supplementary figure 3.1). Using this approach, we found a significant difference between Nicoyans and non-Nicoyans (Kruskal-Wallis p-value=0.0135). We observed a greater abundance of predicted CD8+ T naïve cells in Nicoyans and a lower abundance of estimated CD8+ T memory cells in Nicoyans, as compared to non-Nicoyans (Figure 3.1b). These trends were suggestive of a younger immune cell profile in Nicoyans.

Pearson’s correlation coefficients were computed to assess the relationship between chronological age (log transformed) and the estimated proportion of each CD8+ T cell type (Figure 3.1c). We observed a negative correlation between chronological age and CD8+ T naïve proportion (Nicoyans and non-Nicoyans; Pearson’s r = -0.55 and -0.61, respectively). As expected, there was a positive correlation between age and CD8+ T memory cells (Nicoyans and non-Nicoyans; Pearson’s r = 0.61 and 0.61, respectively; Figure 3.1c), potentially demonstrating a known immunological aging trend, but here based on epigenetic data. We found no significant difference in the regression slopes of chronological age on either CD8+ T cell type, when comparing Nicoyans to non-Nicoyans (p > 0.50). While we did observe differences in mean levels of both estimated CD8+ T cell types, we did not see any group difference in the nature of how these cell types changed across age.
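The two summary quantities used in this section, the ratio of naïve to memory CD8+ T cell estimates and the Pearson correlation of each estimate with log-transformed age, can be sketched as follows. This is a minimal standard-library Python illustration; the function names and all numeric values are hypothetical and do not reproduce the cohort data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def naive_to_memory_ratio(naive, memory):
    """Per-sample ratio of naive to memory CD8+ T cell estimates."""
    if any(m == 0 for m in memory):
        raise ValueError("memory estimate of zero makes the ratio undefined")
    return [n / m for n, m in zip(naive, memory)]


# Hypothetical ages and abundance estimates, for illustration only.
ages = [62, 70, 85, 99, 104]
naive = [210.0, 190.0, 150.0, 120.0, 110.0]
memory = [4.0, 5.0, 7.0, 9.0, 10.0]

log_age = [math.log(a) for a in ages]
r_naive = pearson_r(log_age, naive)    # negative: naive cells fall with age
r_memory = pearson_r(log_age, memory)  # positive: memory cells rise with age
ratios = naive_to_memory_ratio(naive, memory)
```

Working with the ratio collapses two correlated, proportional measures into a single value per sample, which can then be compared between groups with a rank-based test.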
3.3.3 Epigenetic age does not differ between Nicoyans and non-Nicoyans

Having established differences in estimated CD8+ T cell populations between the two groups, we next examined DNA methylation age as an established metric of biological aging. We examined DNA methylation age of Nicoyans compared to non-Nicoyans using three epigenetic clocks, which provided measures of biological age using DNA methylation levels at different groups of CpG sites (Figure 3.2a) (Hannum et al., 2013; Horvath, 2013; Weidner et al., 2014). Across all samples, we found correlations between DNA methylation age and chronological age (Horvath: Pearson’s r = 0.86, Hannum: Pearson’s r = 0.85, Weidner: Pearson’s r = 0.86). However, we found no significant difference between Nicoyans and non-Nicoyans in terms of DNA methylation age (as calculated by each clock), while adjusting for chronological age (ANOVA p > 0.30, 95% CI for the Horvath clock was -6.3 to 3.8 years). We did, however, observe a mean difference of -6.9 years between epigenetic age and chronological age for all samples, suggesting that Costa Ricans, inclusive of both Nicoyans and non-Nicoyans, may on average be epigenetically younger than their chronological age (Figure 3.2b). We investigated the possibility that this finding was due to a global batch effect influencing all samples by performing the DNA methylation age calculation on raw data, after SWAN normalization, and again after ComBat correction. In all cases the mean DNA methylation age for all Costa Ricans was younger than the mean chronological age. We chose to proceed with calculating DNA methylation age using the most corrected data, as we expected data that are corrected for technical batch effects, inclusive of probe design and chip-to-chip variance, to best represent true biological signal.
Furthermore, when we restricted these corrected data to centenarians (age ≥ 100 years old), the mean difference between DNA methylation age and chronological age was -12.7 years. We further examined biological age by assessing two other recently defined measures of age acceleration: intrinsic and extrinsic epigenetic acceleration, which are independent of and dependent on blood cell-type proportions, respectively. We did not find any significant difference between Nicoyans and non-Nicoyans in any of these acceleration measures (ANOVA p ≥ 0.5, Supplementary figure 3.2).

Figure 3.2 DNA methylation age correlated with chronological age across all samples. a Left to right: Horvath age estimation method was used to derive DNA methylation age using 353 CpGs; Hannum method to estimate DNA methylation age using 71 CpGs, and the Weidner 99 CpG model. b Histogram of age acceleration difference (DNA methylation age–chronological age). Dotted line shows the mean of all samples.

3.3.4 DNA methylation variability is lower in Nicoyans

After assessing whether any differences in epigenetic age existed, we next investigated epigenetic drift between Nicoyans and non-Nicoyans. We hypothesized that lower variability in Nicoyans would be representative of a biologically younger profile based on the epigenetic drift phenomenon where stochastic variation in DNA methylation occurs with age. We calculated the interquantile range (IQR) at each CpG site (90th – 10th percentile) represented on the 450K array to account for outliers and found a significant variability difference between Nicoyans and non-Nicoyans (Wilcoxon signed-rank test; p < 2.2 × 10^-16), with a lower level of total mean DNA methylation variation in Nicoyans (Figure 3.3a). Furthermore, we assessed the level of DNA methylation variability across individuals between 60-80 years old and individuals >80 years old, in Nicoyans and non-Nicoyans.
We found a lower DNA methylation variability in the younger group, in both populations; however, the difference between the two age groups was greater in non-Nicoyans than in Nicoyans (non-Nicoya: IQR mean difference = 0.0065, p < 2.2 × 10^-16, Nicoya: IQR mean difference = 0.0043, p = 1.79 × 10^-10). Lastly, when a β-value IQR threshold of greater than 5% was applied to retain only variable sites, we found 129,971 and 146,047 variable sites in the >80 year old range of Nicoyans and non-Nicoyans, respectively. For the younger age range we found 116,038 and 120,711 sites that had greater than 5% IQR in Nicoyans and non-Nicoyans, respectively (Figure 3.3b). In summary, we found that a lower degree of DNA methylation variability was associated with Nicoyans for both age groups, 60-80 years old and >80 years old, when compared to non-Nicoyans.

Figure 3.3 DNA methylation variability was lower in Nicoyans. a Scatter plot of log-transformed interquantile range (IQR; 90th–10th) at each CpG, with values generated independently in each Nicoyan and non-Nicoyan group. Blue: 41,695 CpGs had higher variability in Nicoyans. Red: 98,073 CpGs had higher variability in non-Nicoyans. Gray: insignificant variability, less than 20% IQR between groups. Significance value from Wilcoxon signed-rank test of IQR values between Nicoyans and non-Nicoyans. b Number of sites with a β-value IQR greater than 5% in each age group. Blue: Nicoyans, gray: non-Nicoyans. β-values were corrected for cell-type proportions.

3.3.5 Unique region-based differential methylation in Nicoyans

We next investigated differentially methylated regions (DMRs) between Nicoyans and non-Nicoyans to identify unique epigenetic signatures that may be associated with the longevity observed in Nicoya.
Using the R package ‘DMRcate’, we found that in the comparison of Nicoyans and non-Nicoyans, 20 DMRs containing three or more CpG sites passed a false discovery rate of ≤ 0.05, as well as an arbitrary biological cutoff of a β-value difference ≥ 5% between groups for at least one CpG site in each DMR (Table 3.2, Figure 3.4, Supplementary figure 3.3). Age, sex, and blood cell type proportions were used as covariates. DMRs were associated with genes based on the closest proximity to a transcription start site (TSS). The mean length of the DMRs was 411 bp, with the shortest being 76 bp and the longest being 1,416 bp. On average, out of the 20 DMRs observed, there were seven CpG sites per region, with a range from three to 16. Six DMRs were found within 1500 bp of a TSS associated with the following genes’ promoter regions: Nudix Hydrolase 12 (NUDT12) (6 CpGs), Vault RNA 2-1 (VTRNA2-1) (16 CpGs), Peptidase M20 Domain Containing 1 (PM20D1) (8 CpGs), Active BCR-Related (ABR) (3 CpGs), tRNA-Leu (5 CpGs), and LOC100128885 (uncharacterized) (3 CpGs). The largest group of DMRs was found in intergenic regions (8/20), with an average number of four CpGs per DMR and average length of 267 bp. Four DMRs were intragenic in the bodies of the following genes: Glutamate-rich protein 1 (ERICH1) (8 CpGs), Hook Microtubule Tethering Protein 2 (HOOK2) (4 CpGs), GATA2 Antisense RNA 1 (GATA2-AS1) (3 CpGs), and C15orf26 (uncharacterized) (4 CpGs). Two DMRs were found in the 3′ end regions of Mitochondrial Ribosomal Protein L21 (MRPL21) (3 CpGs) and BolA family member 3 (BOLA3) (5 CpGs). The average absolute difference in DNA methylation between Nicoyans and non-Nicoyans was 7.9% when assessing the single CpG site for each DMR that showed the greatest difference between groups. When DNA methylation was averaged across the DMR per group, the mean β-value difference was 5.9%.
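The two effect-size summaries described above, the largest single-CpG difference within a DMR and the difference averaged across the DMR, together with the 5% cutoff, can be illustrated with a short standard-library Python sketch. It is not the DMRcate procedure, which was run in R; the function names and β-values below are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def region_delta_beta(betas_a, betas_b):
    """Summarise per-CpG group differences within one region.

    betas_a and betas_b hold one list of sample beta values per CpG,
    in the same CpG order for both groups. Differences are group A
    minus group B.
    """
    deltas = [mean(a) - mean(b) for a, b in zip(betas_a, betas_b)]
    max_delta = max(deltas, key=abs)  # signed value of the largest difference
    mean_delta = mean(deltas)         # difference averaged over the region
    return deltas, max_delta, mean_delta


def passes_effect_cutoff(deltas, cutoff=0.05):
    """True if at least one CpG differs by the cutoff or more."""
    return any(abs(d) >= cutoff for d in deltas)


# Hypothetical three-CpG region with three samples per group.
nicoya = [[0.40, 0.42, 0.44], [0.50, 0.52, 0.54], [0.30, 0.31, 0.32]]
non_nicoya = [[0.48, 0.50, 0.52], [0.55, 0.57, 0.59], [0.33, 0.34, 0.35]]

deltas, max_delta, mean_delta = region_delta_beta(nicoya, non_nicoya)
keep_region = passes_effect_cutoff(deltas)
```

Reporting both summaries distinguishes a region driven by one strongly shifted CpG from one in which all CpGs shift by a similar amount.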
Table 3.2 Differentially methylated genomic regions between Nicoyans and non-Nicoyans

| Genomic location* | # CpGs | Min FDR | Max Δβ | Mean Δβ | Associated gene | DMR length (bp) | Genomic region | Gene ontology |
|---|---|---|---|---|---|---|---|---|
| chr5:102898223-102898733 | 6 | 1.0E−11 | −0.07 | −0.05 | NUDT12 | 511 | Promoter | NAD+ diphosphatase activity; NADH pyrophosphatase activity |
| chr4:132896266-132897018 | 6 | 4.9E−08 | −0.09 | −0.06 | – | 753 | Intergenic | – |
| chr5:180643432-180643713 | 3 | 1.9E−05 | −0.05 | −0.05 | – | 282 | Intergenic | – |
| chr7:155832831-155832992 | 3 | 2.9E−04 | 0.06 | 0.05 | – | 162 | Intergenic | – |
| chr11:68658383-68658836 | 3 | 7.3E−04 | −0.09 | −0.08 | MRPL21 | 454 | 3′ end | Poly(A) RNA binding, ribosomal protein |
| chr6:6894084-6894182 | 3 | 5.3E−04 | 0.06 | 0.05 | – | 99 | Intergenic | – |
| chr7:64034943-64035529 | 3 | 2.4E−03 | 0.07 | 0.06 | LOC100128885 | 593 | Promoter | – |
| chr12:130707332-130707407 | 3 | 2.6E−03 | 0.07 | 0.05 | – | 76 | Intergenic | Wnt-protein binding |
| chr3:128215433-128216122 | 3 | 1.7E−03 | −0.06 | −0.05 | GATA2-AS1 | 690 | Intragenic | Enhancer sequence-specific binding, chromatin/transcription factor/C2H2 zinc finger binding |
| chr5:42924215-42924694 | 4 | 2.6E−03 | 0.07 | 0.05 | – | 480 | Intergenic | – |
| chr1:152161237-152162507 | 8 | 2.1E−06 | −0.07 | −0.05 | – | 113 | Intergenic | – |
| chr16:3988694-3988869 | 3 | 1.9E−02 | −0.07 | −0.05 | – | 176 | Intergenic | – |
| chr6:28446794-28447115 | 5 | 1.3E−02 | 0.06 | 0.06 | tRNA-Leu | 322 | Promoter | – |
| chr19:12876846-12877188 | 4 | 3.6E−02 | −0.14 | −0.11 | HOOK2 | 343 | Intragenic | Kinase modulator; membrane traffic protein |
| chr17:1133546-1133706 | 3 | 3.5E−02 | −0.07 | −0.05 | ABR | 161 | Promoter | Transcription factor; protein binding; guanyl-nucleotide exchange factor |
| chr15:81410745-81411066 | 4 | 2.3E−02 | 0.09 | 0.06 | C15orf26 | 322 | Intragenic | – |
| chr2:74357527-74358223 | 5 | 2.0E−02 | 0.08 | 0.06 | BOLA3 | 697 | 3′ end | Production of iron–sulfur clusters |
| chr1:205818956-205819609 | 8 | 2.2E−02 | −0.08 | −0.07 | PM20D1 | 654 | Promoter | Deacetylase, metalloprotease |
| chr8:599525-600940 | 8 | 2.7E−04 | −0.12 | −0.05 | ERICH1 | 1416 | Intragenic | – |
| chr5:135415693-135416613 | 16 | 1.1E−03 | −0.09 | −0.06 | VTRNA2-1 | 921 | Promoter | Potential tumor suppressor |

DMR: differentially methylated region; Δβ: delta beta, the absolute mean or max difference of β values between groups (Nicoya–non-Nicoya); *Genome coordinates from Human Genome GRCh37/hg19 Assembly.

Figure 3.4 Significantly differentially methylated regions between Nicoyans and non-Nicoyans. Top six of 20 significant DMRs found using DMRcate. Unadjusted β-values are displayed on the y-axis and genomic distance (bp) to the most proximal gene transcriptional start site (TSS) is plotted on the x-axis. Blue: Nicoyans, red: non-Nicoyans. Group mean represented by respective colored line.

3.3.6 Site-specific differential methylation in Nicoyans and technical verification

To complement the region-specific analysis, we also assessed DNA methylation differences at the site-by-site level to identify any single sites that were differentially methylated in Nicoyans as compared to non-Nicoyans. By investigating site-specific changes, we did not limit ourselves to highly represented genomic regions on the 450K array. We performed an epigenome-wide association study using a linear model in which M-values at each CpG site were regressed on population group, with sex, age, and blood cell type proportions included as covariates (Supplementary figure 3.4). We have also included a comparison table of β-values and M-values for CpGs identified in the site-specific differential methylation analysis (Supplementary table 3.2). We observed nine CpGs below a nominal p-value significance threshold of p ≤ 5 × 10^-7 (q value < 0.022) that were differentially methylated between Nicoyans and non-Nicoyans. After applying a biological threshold similar to that of the DMRcate analysis, four CpG sites passed our significance criteria: cg02853387 (DSCAML1), cg02438481 (C6orf123), cg13979274 (OR10H1), and cg26107275 (BC042649).
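The site-wise model was fitted on M-values, while effect sizes are reported as β-value differences. The two scales are related by the logit transform M = log2(β / (1 − β)), which is the standard definition and is assumed here to be the one applied in this analysis. A minimal standard-library Python sketch with hypothetical function names:

```python
import math


def beta_to_m(beta):
    """Convert a beta value in the open interval (0, 1) to an M-value."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m):
    """Convert an M-value back to a beta value."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# The same 0.05 shift in beta corresponds to different M-value shifts
# depending on where on the beta scale it occurs.
shift_mid = beta_to_m(0.55) - beta_to_m(0.50)
shift_high = beta_to_m(0.95) - beta_to_m(0.90)
```

Because the transform stretches values near 0 and 1, a fixed β difference maps to a larger M difference at the extremes than near 0.5, which is one reason to report both scales side by side.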
Not surprisingly, the four significant CpG sites did not overlap with our DMR findings, as these single CpGs identified through the site-by-site analysis were either: 1) found in a genomic region with no proximal array probes, or 2) flanked by nearby CpG sites that were not significantly correlated. The nearest neighbouring array CpG probe was greater than 1 kb away for three out of the four significant CpG sites (cg02853387, cg13979274, and cg26107275). For cg02438481, the closest array CpG (cg00788354) was 324 bp away, but it had a Pearson’s correlation of r < 0.10 and was not significantly differentially methylated between groups.

We performed pyrosequencing, a targeted DNA methylation sequencing technology, to verify that our single-site differential DNA methylation results were reproducible using an independent platform. We designed assays to measure three CpG sites (cg02853387, cg02438481, and cg13979274) that were observed to have a significant between-group (Nicoyans and non-Nicoyans) difference in DNA methylation of ≥5.0% (sequences listed in Supplementary table 3.1). Correlation coefficients between the pyrosequencing and 450K array for each CpG site showed a strong concordance between the two technologies (Pearson’s r = 0.87 (cg02853387), r = 0.92 (cg02438481), and r = 0.88 (cg13979274)). Bland-Altman plots were generated for each CpG site to illustrate the agreement between the two quantification techniques (Figure 3.5).

Figure 3.5 Pyrosequencing of significantly differentially methylated single CpGs. Left: Bland–Altman plot of concordance between 450K array and pyrosequencing result for each CpG. Text labels represent sample IDs. Middle: scatter plot displaying correlation between 450K array and pyrosequencing for each CpG. Spearman correlation coefficients shown. Right: box plots of significant difference between Nicoyans and non-Nicoyans at each CpG site, measured using pyrosequencing.
Significance value from regression model of CpG methylation on group status, while controlling for sex, age, and cell-type proportions. Box plot hinges represent the 25th and 75th percentiles and the upper and lower whiskers correspond to the largest and smallest value no further than 1.5 x the inter-quartile range, respectively.

To verify our differential DNA methylation findings between Nicoyans and non-Nicoyans, we regressed DNA methylation determined by pyrosequencing onto group status, while controlling for age, cell type proportions, and sex. All three sites remained significantly different between Nicoyans and non-Nicoyans (Table 3.3).

Table 3.3 Characteristics of four biologically and statistically significant DNA methylation sites between Nicoyans and non-Nicoyans

| Probe ID | Pyrosequencing Δβ | Pyrosequencing p value | 450K array Δβ | 450K array p value | 450K array q value | Chr | Distance to closest TSS (bp) | Genomic region: gene | Gene ontology |
|---|---|---|---|---|---|---|---|---|---|
| cg02853387 | 0.06 | 7.9E−05 | 0.07 | 1.3E−07 | 0.012 | 11 | 183072 | Intragenic: intron 3 of DSCAML1 | Protein homodimerization activity |
| cg02438481 | 0.08 | 1.9E−05 | 0.08 | 1.6E−07 | 0.012 | 6 | −1861 | Intergenic: C6orf123 | – |
| cg13979274 | 0.06 | 9.1E−07 | 0.06 | 2.0E−07 | 0.012 | 19 | 4347 | Intergenic: ~3 kb from 3′ end of OR10H1 | G-protein coupled receptor/olfactory receptor activity |
| cg26107275 | – | – | −0.07 | 2.2E−07 | 0.012 | 12 | −407 | Promoter: BC042649 | – |

Δβ: delta beta, the absolute mean difference of β values between groups (Nicoyans–non-Nicoyans); TSS: transcription start site; Chr: chromosome.

Lastly, to determine whether a measure of genetic population structure was confounded with group (Nicoyans vs non-Nicoyans), we performed a post hoc analysis using ‘Epistructure’ (Rahmani et al., 2016). Principal component analysis was completed on DNA methylation of CpGs previously identified as genetically informative loci.
The first two principal components generated from this analysis have been proposed to serve as composite measures of genetic structure that can be used as covariates in a DNA methylation analysis. Using this technique, we did not observe a significant difference in the Epistructure principal components between Nicoyans and non-Nicoyans, while controlling for sex, age, and cell type proportions (PC1: p = 0.60, PC2: p = 0.93, Supplementary figure 3.5); therefore, we did not include these measures in our analyses.

3.4 Discussion

In this study, we investigated differential patterns of DNA methylation in a population with well-characterized high longevity: Nicoya, Costa Rica. We aimed to identify unique patterns of DNA methylation that may underlie biological pathways associated with the longevity observed in Nicoya. Our study sample was drawn from a demographic study of Costa Ricans aged 60 years and over, and we randomly sampled individuals from within Nicoya and the rest of Costa Rica who were aged 60-75 years and 95 years and above in order to assess age-associated DNA methylation. We have four primary findings. First, we observed a bioinformatically inferred younger immune profile in Nicoyan individuals compared to those living in the rest of Costa Rica, finding estimated cellular proportion differences in CD8+ naïve and CD8+ memory T cells. Second, we found a lower level of total mean DNA methylation variation in Nicoyans compared to non-Nicoyans. Third, we found 20 DMRs and four single CpG sites that were significantly differentially methylated between Nicoyans and non-Nicoyans. While any of these differences may be due to genetic differences in the populations, our observation that inferred genetic structure did not differ between Nicoyans and non-Nicoyans suggests that these biological differences are more likely to be the result of environmental differences between these two populations.
Lastly, DNA methylation age was not significantly different between Nicoyans and non-Nicoyans, although Costa Ricans overall had a younger mean DNA methylation age than chronological age.

Our finding of proportional differences in CD8+ naïve and CD8+ memory T cells is intriguing in the context of previous work from animal models and other human research that has established that these blood cell proportions change as a function of age. Specifically, the naïve T cell response diminishes with time as naïve T cells are naturally replaced by memory T cells through age-related thymic involution (Fagnoni et al., 2000; Wherry & Kurachi, 2015). Therefore, younger immune profiles have been hypothesized to delay the onset of infection vulnerability and extend health span (Gui, Mustachio, Su, & Craig, 2012). Interestingly, centenarian offspring have been investigated in this context and show this “youthful” immune cell phenotype (Pellicanò et al., 2013). The fact that our work suggested that Nicoyans had a lower proportion of CD8+ T memory cells and higher CD8+ T naïve cells is interesting in the context of immunoaging and might be suggestive of an age-related immune phenotype in Nicoyans.

Given the lower mortality rate of Nicoyans, we were surprised by the lack of differences in DNA methylation age between them and non-Nicoyans, especially given that this biological aging measure has been associated with many age-related conditions such as cognitive fitness decline, frailty, and mortality (Breitling et al., 2016; Marioni, Shah, McRae, Chen, et al., 2015a; Marioni, Shah, McRae, Ritchie, et al., 2015b). It is important to note, however, that we only had the statistical power to test for very large differences in DNA methylation age, and our 95% confidence interval suggests that Nicoyan individuals could be up to six years younger in DNA methylation age.
We calculated epigenetic age in our samples using three published DNA methylation age predictors and found no significant difference in these measures of biological age between Nicoyans and non-Nicoyans. Previous work found that peripheral blood mononuclear cells from the offspring of semi-supercentenarians (individuals who reached 105-109 years of age) in an Italian cohort appear 5.1 years epigenetically younger than controls. Centenarians also were reportedly 8.6 years younger than their chronological age (Horvath, Pirazzini, et al., 2015b). This is consistent with our population: when we collapsed the Nicoyan and non-Nicoyan groups, we found that centenarians were 12.7 years epigenetically younger. Furthermore, all Costa Ricans were on average 6.9 years epigenetically younger than their chronological age. It is possible that this might reflect the recently reported epigenetically younger phenotype in Hispanic populations (Horvath et al., 2016). However, more research into how epigenetic clocks perform in older populations is needed to confirm that this tool is applicable to elderly samples.

To examine age-associated DNA methylation variability, we assessed a measure of variance in Nicoyans and non-Nicoyans independently. We were able to demonstrate the phenomenon of epigenetic drift within each of these groups, as our sample consisted of two age ranges in each group. Additionally, we observed that DNA methylation variation was lower in Nicoyans, both in the ≤ 80-year-old and > 80-year-old age groups, when compared to non-Nicoyans. Given that DNA methylation variability across individuals has been reported to increase with age, our finding may highlight an epigenetic characteristic of the longevity in Nicoyans (Martin, 2005). Although the biology of increased DNA methylation variability is poorly understood, our findings suggest lower DNA methylation variability may be associated with lower mortality.
Some explanations for this age-associated DNA methylation variability have been proposed, such as that it results from a functional decline in DNA methylation maintenance machinery or that it is a product of environmental exposures over time (Shah et al., 2014).

We found 20 genomic regions and four single CpGs that were significantly differentially methylated between Nicoyans and non-Nicoyans. One such DMR contained six CpGs and was located in the promoter region of NUDT12, a gene encoding a protein shown in vitro to cleave NADH, NADPH, and NAD+ (Garten et al., 2015). Given that NUDT12 may play a role in NAD metabolism, a regulatory process associated with health span and aging, it is plausible that DNA methylation is involved in the regulation of this gene, which may in turn have downstream effects on NAD biosynthesis. Nicoyans had a lower level of DNA methylation in the promoter region of NUDT12, a signature often associated with higher gene expression. In addition to our DMR findings, we also found four individual sites to be both statistically and biologically significant, three of which existed in intergenic regions. We further investigated three of these CpGs by quantifying DNA methylation with pyrosequencing, allowing us to verify both the accuracy of the 450K array as well as the significant differences between Nicoyans and non-Nicoyans. However, it remains unclear whether differential DNA methylation of these single CpGs or DMRs at the observed effect sizes (<10%) is sufficient to yield a biological change. Interpretation of these findings at a biological level will require future mechanistic experiments.

Our findings should be interpreted within the context of several limitations. One limitation was the lack of genotype information for these samples, as genetic variation is strongly associated with DNA methylation (Banovich et al., 2014).
In order to reduce genetic heterogeneity, we restricted control sampling to areas within Costa Rica but outside of Nicoya. Nicoyan status was determined as of the time of the survey, i.e. at older ages, not based on birth or life-course residence, but in our sample 44 out of 48 Nicoyan residents had lived there their entire lives. While there are no documented differences between the historical migration patterns of the inhabitants of Nicoya and the rest of Costa Rica, minor differences may exist. Therefore, we implemented ‘Epistructure’, a recently published tool from the Python package GLINT, to infer genetic information using DNA methylation data obtained from the 450K array (Rahmani et al., 2016). We found no significant difference in population structure measures between Nicoyans and non-Nicoyans, and thus did not include these composite measures in our analyses. Another consideration of our study results is that we used a bioinformatics approach to predict CD8+ T cell proportion differences, which are relative compositional estimates. As these predictions do not estimate actual cell counts, the abundance of other cell types will affect the proportional estimate of the CD8+ T cells. Ideally, we will need to validate our findings using a quantitative approach, such as fluorescence-activated cell sorting, that estimates actual cell counts. We note, however, that we did not observe a significant difference when we performed an overall compositional analysis of these predictions (van den Boogaart & Tolosana-Delgado, 2013), meaning the composition overall was not different between Nicoyans and non-Nicoyans. In addition, our findings are supported by using two separate reference-based approaches (Horvath, 2013; Houseman et al., 2012), both of which identified CD8+ T cells as being significantly different between Nicoyans and non-Nicoyans.
Furthermore, these bioinformatic cell type proportion predictions have been well validated in the literature, albeit less so in older samples, and the estimated cell type proportions have been shown to correlate strongly with actual cell counts (Koestler et al., 2013).

3.5 Conclusions

Our findings highlight DNA methylation as a potential factor underlying the unique longevity observed in the Nicoya region of Costa Rica. This work also supports the demographic data on longevity that characterize this population as unique. The specific differences in immune cell proportions we observed in Nicoyans will lay the framework for a validation study to observe whether cell-sorting experiments yield similar results. Additionally, the differential DNA methylation findings may provide a candidate list of CpGs to test for differences in other longevity populations. Lastly, our work, if validated, will contribute to narrowing the focus of mechanistic studies to assess whether the DNA methylation differences we observed are involved in gene regulation that may alter gene expression trajectories.

Chapter 4: DNA methylation signatures in peripheral blood mononuclear cells from a lifestyle intervention for women at midlife: A pilot RCT

4.1 Introduction

Physical activity is a fundamental aspect of healthy living, contributing to the prevention of obesity, type II diabetes, cardiovascular disease, and other lifestyle-related diseases (Warburton, Nicol, & Bredin, 2006). Benefits of an active lifestyle include reductions in stress, risk for developing frailty, and premature mortality (Barlow, Otahal, Schultz, Shing, & Sharman, 2014; Nocon et al., 2008). Physical activity is also associated with normalizing blood pressure and serum triglyceride and cholesterol levels (Mann, Beedie, & Jimenez, 2014; Nunan, Mahtani, Roberts, & Heneghan, 2013).
Although the association between physical activity and health is well established, research investigating the molecular pathways, particularly epigenetic processes, underlying the response to physical activity remains limited. Characterizing such molecular changes may contribute to understanding the biological processes that link physical activity and health outcomes. The field of epigenetics focuses on studying modifications to DNA and DNA packaging that have the potential to influence gene expression without altering the genetic sequence (Bird, 2007). In human populations, the most widely studied and characterized epigenetic modification is DNA methylation: the chemical process of adding a methyl group onto DNA. DNA methylation occurs predominantly at cytosine DNA nucleotides followed by guanine bases connected by a phosphate bond (referred to as a ‘CpG’). CpGs are enriched in regions called CpG islands, areas of >200 bp densely packed with CpGs (Hatchwell & Greally, 2007; Illingworth & Bird, 2009). Approximately 60% of promoters are associated with a CpG island, emphasizing the role of CpG methylation in controlling gene expression (P. A. Jones, 2012).

An important role of DNA methylation is to orchestrate cell type differentiation during embryogenesis. Given that all somatic cell types have nearly identical DNA sequences, the epigenome in general and DNA methylation in particular allow for the specification of cell type identity. Thus, not surprisingly, the primary source of DNA methylation variation is tissue type. Cellular differences in DNA methylation are larger than inter-individual variation, such that DNA methylation profiles of two different tissues obtained from one individual are less alike than the DNA methylation of the same tissue from two different people (Farré et al., 2015).
Though having a smaller effect than tissue or cell type, underlying genetic makeup at certain sites has also been shown to substantially affect DNA methylation of nearby CpGs (Banovich et al., 2014). Additionally, DNA methylation is a dynamic mark that can act as a mediator between environmental influences and genomic regulation. With that, DNA methylation is variable over the life-course and can be altered in response to certain environmental or lifestyle exposures. Specific exposures such as cigarette smoking, diesel exhaust fumes, and physical activity have recently been associated with DNA methylation signatures (Brown, 2015; Clifford et al., 2016; Joubert et al., 2012). As most research has been focused on cross-sectional study designs, the majority of the findings have merely identified associations. Therefore, intervention studies are crucial in distinguishing the specific effects of environmental or lifestyle influences on the methylome. The observed effects of short-term lifestyle changes on DNA methylation have been subtle, and replication studies of these findings are still needed. Although notable physical activity intervention studies have observed differences in DNA methylation (Archer, 2015), they lack consistency in the type/length of exercise program, study population characteristics (sex/age/ethnicity), and type of tissue examined. Weight-loss interventions, including caloric-restriction programs or gastric bypass, have also produced changes in DNA methylation, but again results have been inconsistent (Benton et al., 2015; Campión, Milagro, Goyenechea, & Martínez, 2009; Milagro et al., 2011). Although this field is still in its infancy, the results thus far of a relationship between DNA methylation and lifestyle changes related to physical activity and/or weight loss are encouraging.
Therefore, in the present study, we investigated DNA methylation patterns before and after a lifestyle intervention from a six-month pilot randomized controlled trial (Ashe et al., 2015) of previously inactive post-menopausal women. We focused on characterizing genome-wide DNA methylation changes associated with measures of physical activity, percent body weight change, and C-reactive protein (CRP).

4.2 Materials and methods

4.2.1 Participants

This was a sub-study of a lifestyle intervention for community-dwelling older women from Vancouver, Canada (ClinicalTrials.gov identifier: NCT01842061). We provide a description of the methods and results, following CONSORT 2010 guidelines (Moher, 2010), from the main study (1:1 parallel two-arm randomized pilot trial) in our previously published report (Ashe et al., 2015). For this current study we included women aged 55-70 years who self-identified as not meeting the physical activity guideline of 150 minutes/week of moderate-to-vigorous physical activity (MVPA), but who could walk 400 meters and climb a flight of stairs. Ethics approval was obtained from the university institutional review board, and all participants signed a written informed consent form prior to participation in the study.

4.2.2 Intervention

The six-month self-management intervention, called “Everyday Activity Supports You” (EASY), was designed to reduce sitting time and promote physical activity through incremental increases in activities of daily living and utilitarian walking (Ashe et al., 2015). The EASY model contains the following three key elements: group education, individualized sessions with an exercise professional, and the use of an activity monitor (Fitbit, USA). The intervention group was offered nine two-hour sessions (over six months) and concurrently the control group had six one-hour monthly sessions. The format for the EASY model was as follows: the first hour was an interactive educational session on a number of physical activity and health related topics.
The second hour consisted of participatory group-based work to address action and coping plans for being active, and similar topics. In addition, all participants had an individual short session with an exercise professional. Participants’ programs were developed and progressed, based on feedback from the activity monitor, to reduce sitting time and increase their daily step count. The control group received six one-hour sessions on health- (but not physical activity-) related topics. Descriptive information was collected, and height, weight, and blood pressure [BpTRU BPM-200; BpTRU Medical Devices, Coquitlam, BC] were measured using standard procedures. Participants were also requested to wear a waist-mounted accelerometer, the ActiGraph GT3X+ (ActiGraph LLC, Fort Walton Beach, FL, USA). Further details of the pilot study are described elsewhere (Ashe et al., 2015).

4.2.3 Sample collection and data processing

We requested that participants provide non-fasting blood samples at baseline and again at six months. Blood was drawn into Vacutainer Cell Preparation Tubes (Becton Dickinson) by antecubital venipuncture. A complete blood cell count was performed, and no outliers were observed. Samples were inverted 10 times and then subjected to density-gradient centrifugation at 1800 relative centrifugal force (RCF) for 20 minutes at room temperature within one hour of collection. We simultaneously collected 1 ml of blood plasma, which was used for CRP measurement with an immunoassay (Meso Scale Discovery, Maryland, USA). Approximately 3 ml of buffy coat was transferred into a tube, topped up to 15 ml with warm RPMI-1640 medium (Sigma-Aldrich), inverted five times, and centrifuged at 300 RCF for 10 minutes at room temperature, and the supernatant was discarded. The cell pellet was re-suspended in 10 ml of warm RPMI medium with a pipette and centrifuged at 300 RCF for 10 minutes at room temperature, and the supernatant was again discarded.
The cell pellet was then re-suspended in 1 ml of RPMI medium and thoroughly mixed by pipetting. A 5 µl aliquot of the peripheral blood mononuclear cell (PBMC) suspension was added to 45 µl of Trypan Blue Solution 0.4% (Sigma-Aldrich), and 10 µl of the trypan blue/cell mixture was transferred into a Bright-Line™ Hemacytometer (Sigma-Aldrich) to count cells. Approximately six million cells were aliquoted per microfuge tube and centrifuged at maximum speed for 3 minutes, the supernatant was discarded, and the pellet was stored at -80˚C until DNA extraction. One tube for each sample intended for DNA methylation quantification was lysed and homogenized in a QiaShredder Spin Column (Qiagen, Hilden, Germany). Genomic DNA was isolated using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany). Quantity and quality of DNA were assessed using a NanoDrop™ 8000 Spectrophotometer (Thermo Scientific). Bisulfite conversion of 750 ng of DNA was performed using the EZ-DNA Methylation™ Kit (Zymo Research, Irvine, CA, USA). We processed 47 samples (20 paired samples, three technical replicates, and four unpaired samples) using the Infinium HumanMethylation450 BeadChip per manufacturer’s instructions (Illumina, San Diego, CA, USA). For the purpose of our analyses we only included individuals who provided samples at both time points (n=20). Data preprocessing included exporting colour-corrected and background-subtracted β-values from the Illumina GenomeStudio software (Illumina, San Diego, CA, USA). We subsequently imported and processed these data using R version 3.2.3. Probe filtering followed standard practice and removed poorly performing probes as well as probes known to potentially introduce bias (Price et al., 2013), leaving 433,056 probes for further analysis. We accounted for known variation that had potential to confound or bias our results, such as probe design, batch effects, and cell type proportion.
Specifically, we employed quantile normalization across samples and subset-quantile within array normalization across probes to adjust for technical bias (Maksimovic et al., 2012); we used ‘ComBat’ from the package ‘sva’ to adjust for microarray chip and position batch effects (W. E. Johnson et al., 2007); and we corrected for predicted blood cell type proportions using previously established methods (Houseman et al., 2012; M. J. Jones, Islam, Edgar, & Kobor, 2015b). All analyses and reports of DNA methylation are as β-values ranging from 0-1 (0 = no methylation, 1 = fully methylated).

4.2.4 Statistics

To avoid examining sites that did not change across individuals, we performed a within sample variability filter to yield a reduced data set of 39,238 CpGs that showed an inter-quantile range (80th – 20th percentile) above 5% DNA methylation (Lemire et al., 2015). We then regressed changes in methylation at each of these CpGs onto the following variables in separate linear models: intervention group status, daily percent change in step count, percent change in CRP, and percent change in body weight, while correcting for baseline BMI and age. We used a significance threshold of p-value < 5×10⁻⁵, as this was an exploratory study with a small sample size. We also reported multiple test corrected values for statistical transparency.

4.3 Results

4.3.1 Cohort characteristics

Samples were obtained from participants of an exercise intervention program called Everyday Activity Supports You (EASY). A subsample of the EASY cohort was used and included 20 healthy middle-aged [mean (SD) 64.1 (4.6) years] women who self-reported as being physically inactive (not meeting physical activity guidelines) and resided in Metro Vancouver, British Columbia (Table 4.1). We previously reported significant differences between groups for some variables (step count, weight, and diastolic blood pressure) favouring the intervention group (Ashe et al., 2015).
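As an illustration only, the variability filter and change-score regression described in section 4.2.4 can be sketched in Python. This is not the analysis code used in the study (which was run in R): the β-values, the CpG names `cpg_A` and `cpg_B`, and the helper functions are hypothetical, the percentile definition (linear interpolation across all samples) is an assumption, and the baseline BMI and age covariates are omitted to keep the sketch short.

```python
import math

def percentile(values, q):
    """Percentile with linear interpolation between order statistics (assumed definition)."""
    s = sorted(values)
    pos = (len(s) - 1) * q
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def is_variable(betas, threshold=0.05):
    """Keep a CpG if its 80th minus 20th percentile range exceeds 5% methylation."""
    return percentile(betas, 0.80) - percentile(betas, 0.20) > threshold

def delta_beta(pre, post):
    """Per-participant change in methylation (follow-up minus baseline)."""
    return [b - a for a, b in zip(pre, post)]

def ols_slope(x, y):
    """Slope of a simple least-squares fit of y on x (no covariates in this sketch)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical β-values for five participants at baseline and six months.
baseline = {"cpg_A": [0.50, 0.52, 0.48, 0.51, 0.49],
            "cpg_B": [0.20, 0.35, 0.28, 0.40, 0.22]}
followup = {"cpg_A": [0.51, 0.52, 0.49, 0.50, 0.49],
            "cpg_B": [0.24, 0.33, 0.34, 0.38, 0.30]}
pct_weight_change = [-2.0, 1.0, -3.0, 1.0, -4.0]  # hypothetical percent change in weight

# Step 1: variability filter applied to all β-values of each CpG.
variable_cpgs = [c for c in baseline if is_variable(baseline[c] + followup[c])]

# Step 2: regress the change in methylation on percent weight change, one CpG at a time.
slopes = {c: ols_slope(pct_weight_change, delta_beta(baseline[c], followup[c]))
          for c in variable_cpgs}
print(variable_cpgs, slopes)
```

In this toy data only `cpg_B` passes the filter, and its change in methylation decreases as percent weight change increases. A full analysis would add the covariates and a p-value for each slope.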
Characteristics                  Control (n = 8)    Intervention (n = 12)
Age (years)                      63.1 (4.8)         64.8 (4.6)
Weight (kg)                      90.2 (18.9)        69.7 (19.3)
BMI (kg/m²)                      32.9 (6.8)         26.9 (6.8)
Step Count (steps/day)           5340 (1966)        6402 (2534)
MVPA (min/day)                   24.3 (32.7)        23.39 (15.2)
Daily Sedentary behaviour (%)    62.4 (13.0)        67.8 (7.5)
Systolic BP (mmHg)               140.0 (13.3)       127.9 (16.3)
Diastolic BP (mmHg)              83.4 (9.3)         77.3 (6.9)

Table 4.1 Participant baseline characteristics. BMI: body-mass index; MVPA: moderate-to-vigorous physical activity; BP: blood pressure.

4.3.2 DNA methylation across the intervention period

Initially, from the total data set (433,056 CpGs), we selected CpG sites whose DNA methylation change was robust to technical variation and more likely to reflect a biological change, defined as an absolute DNA methylation change of ≥ 5% across the intervention period. Using this method, we identified 39,268 variable CpGs (vCpGs) for downstream analyses. Applying a minimum threshold of DNA methylation change allowed us to assess changes that were less likely to be a result of technical noise and more likely to provide evidence of a biological difference. We investigated whether DNA methylation changed differently in the intervention group as compared to the control group by testing the change in DNA methylation at each of the 39,238 vCpG sites, while controlling for chronological age and baseline BMI. Two CpG sites passed a nominal p-value threshold (p < 5e-5), cg09786593 (p-value = 2.89e-05, q-value = 0.737) and cg11630939 (p-value = 4.28e-05, q-value = 0.737), although these findings did not remain significant after applying a multiple test correction (Figure 4.1, Supplementary figure 4.1).

Figure 4.1 Volcano plots of the change in DNA methylation over six months. a The change in DNA methylation over six months for 39,268 CpGs was modeled as the dependent variable and intervention group was the independent variable with BMI at baseline and chronological age as covariates.
Red points: intervention group lost DNA methylation compared to controls, blue points: intervention group gained DNA methylation. b The change in DNA methylation over six months modeled as the dependent variable and percent weight change in all samples was the independent variable with BMI at baseline and chronological age as covariates. Red points: percent weight change was associated with a loss of DNA methylation compared to controls, blue points: weight loss associated with a gain of DNA methylation over six months. “Delta beta difference” = 1) intervention group: (Intervention Group[time point B – time point A]) – (Control Group[time point B – time point A]), 2) percent weight loss: slope*maximum value from linear regression of change in methylation over time regressed on percent change in weight.

Since no statistically significant differences in DNA methylation were found between the intervention and control groups, we focused on the association between physiological variables measured at baseline and post intervention and the change in DNA methylation in all individuals. Using linear regression, we examined the association between DNA methylation changes at each vCpG and percent change in body weight (kg) and found 12 CpG sites that demonstrated a change in DNA methylation (p-value < 0.00005, q value < 0.12) (Figure 4.2, Table 4.2). DNA methylation at seven of these CpG sites was positively correlated with percent weight change, and DNA methylation at five was negatively correlated with percent weight change. Many of these sites were located within gene bodies (5/12), although some were located between genes (4/12) or within promoter regions (3/12). This is not surprising, as these distributions are comparable to what is represented on the 450K array. The effect sizes of these identified CpGs ranged from 8.6% to 15.1%, with an average β-value change of 10.9%.

Figure 4.2 Scatterplot displaying 12 significant CpGs for percent change in weight.
Linear regression line is shown in black with 95% confidence intervals displayed by grey shading. Dotted lines represent no change in DNA methylation and no change in weight.

CpG ID        p value     q value   β-value range   Chromosome   Genomic Location   Associated Gene
cg10171794    2.95e-06    0.10       0.09           16           Intergenic         NA
cg06833613    8.73e-06    0.10       0.14           17           Intragenic         ABR
cg16062320    1.53e-05    0.10       0.09           5            Intergenic         SLCO4C1
cg10711871    1.57e-05    0.10       0.09           3            Intergenic         WNT7A
cg16165651    1.59e-05    0.10       0.15           2            Intragenic         RASGRP3
cg10632837    1.66e-05    0.10       0.10           10           Intergenic         CYP2E1
cg07095671    1.85e-05    0.10      -0.12           8            Promoter           CA13
cg05004518    2.62e-05    0.13       0.11           7            Intragenic         NAMPT
cg15014975    3.55e-05    0.15      -0.11           1            Intragenic         RUNX3
cg16028201    4.32e-05    0.15      -0.10           1            Promoter           KANK4
cg10176110    4.43e-05    0.15      -0.10           6            Promoter           SMOC2
cg26624398    4.58e-05    0.15      -0.11           5            Intragenic         SLIT3

Table 4.2 Significant CpGs associated with percent weight change over six months

We did not find any association between the change in DNA methylation over time and either the percent change in daily step count or the percent change in serum CRP. We do, however, note that the change in CRP was quite variable across individuals, ranging from -64% to 95% (Supplementary figure 4.1).

4.4 Discussion

In this study, our goal was to test whether a six-month lifestyle intervention was associated with epigenetic changes. Although we did not find significant associations between DNA methylation and either the intervention or physical activity measures, we did observe a change in DNA methylation at 12 CpG sites that was associated with weight loss. Weight loss was an unexpected significant outcome from the EASY model pilot study, with an average between-group difference of approximately -4 kg, favouring the intervention group (Ashe et al., 2015), and, interestingly, weight loss has previously been associated with DNA methylation (Benton et al., 2015; Martínez, Milagro, Claycombe, & Schalinske, 2014; Wahl et al., 2017).
In the current study, one of the CpG sites (cg05004518) associated with weight change has also been reported previously to relate to changes in body weight. The site cg05004518 is located within an intron of nicotinamide phosphoribosyltransferase (NAMPT). NAMPT is an enzyme involved in regulation of the nicotinamide adenine dinucleotide (NAD) pool. NAD is important in cellular redox reactions, and levels of NAD decrease with age and in some metabolic disorders (Garten et al., 2015). Furthermore, expression of NAMPT has been found to increase after weight loss in previously obese individuals (Rappou et al., 2016). At the intragenic NAMPT CpG, we observed a decrease in DNA methylation that was associated with greater weight loss. This relationship is generally associated with reduced gene expression, although it is not always linear (P. A. Jones, 2012). Another significant CpG (cg15014975) is located 777 bp upstream of the transcription start site (TSS) of RUNX3, which encodes a transcription factor that has a pivotal role in activating the immune response (Smeets et al., 2012). At this site, we found that a gain of methylation was associated with greater weight loss. While it is tempting to speculate that the observed associations between body weight change and DNA methylation within NAMPT and RUNX3 may have biological implications, we emphasize that these findings are exploratory in nature and require further studies to confirm their biological significance. The ten other genes identified with nearby body weight-related DNA methylation included ABR, SLCO4C1, WNT7A, RASGRP3, CYP2E1, CA13, KANK4, SMOC2, SLIT3, and GABRG3; to our knowledge these genes have not previously been associated with weight loss. It is encouraging that the effect sizes at these sites, ranging from 8-15%, illustrated expected trends: participants who did not lose or gain weight during the intervention period had no change in DNA methylation (~0%) at the identified sites.
We also investigated whether a change in activity levels, objectively measured as MVPA, daily step count, and proportion of the day spent in sedentary behaviour, was associated with changes in DNA methylation over the intervention period. The EASY lifestyle intervention aimed to increase everyday activity but did not focus on generating high intensity physical activity (e.g., strenuous exercise), and since the activity variables were highly correlated, we decided to focus on step count as a representative variable for activity. However, we did not find associations between any change in DNA methylation and daily step count over the intervention period. We also investigated inflammation (CRP) over time, as this is a pro-inflammatory marker highly related to physiological processes, but we did not see any association with the change in DNA methylation. It is possible that the physical activity was not vigorous enough to produce a uniform epigenetic signature across participants in this study, as other studies that observed DNA methylation changes concordant with exercise have included a more intense activity regime (Rönn et al., 2013). Additionally, the variability in CRP may have also originated from differential times of blood collection for each participant, as the non-fasting conditions of our study design could have introduced unwanted variation. For example, food consumption, specifically foods high in fat or carbohydrates, prior to blood sampling could have influenced CRP levels, as has been shown with other blood inflammatory markers (Poppitt et al., 2008). We note limitations in this study that should be considered when interpreting results. The use of PBMCs instead of a more relevant tissue, such as adipose, was based on feasibility, but does impose a limitation, as concordance across these tissues has not been examined at the genomic sites identified.
Another limitation stems from the small number of observations in our sample, as a much larger sample size is necessary to confidently establish associations in DNA methylation studies. Furthermore, the baseline status of the participants may have also contributed to potential confounding between the intervention and control groups. This pilot study was designed to investigate whether changes at the epigenetic level occurred in response to a physical activity intervention, and although we did not find an epigenomic effect of physical activity, we do highlight a potential signature of body weight loss in the blood methylome. We believe these findings provide a basis for a larger study of this nature.

Chapter 5: Systematic evaluation of DNA methylation age estimation with common preprocessing methods and the Infinium MethylationEPIC BeadChip array

5.1 Introduction

Epigenetics is a rapidly evolving field in the contexts of new biological discoveries as well as the available technologies used to drive such findings. The most commonly studied epigenetic mark in humans is DNA methylation (DNAm), defined as the covalent addition of a methyl group to DNA, most frequently occurring at cytosine-guanine dinucleotides (CpGs) (P. A. Jones & Takai, 2001). Cellular DNAm profiles change naturally during the development of an organism, resulting in tissue identity being the strongest predictor of DNAm variation. As such, DNAm variability between tissues within an individual is larger than variability across individuals from the same tissue (Farré et al., 2015; M. D. Schultz et al., 2015). Though smaller than between-tissue variation, the variation seen within a tissue is still of interest for epigenetic association studies. Inter-individual variability in DNAm has been linked to a number of different sources, including but not limited to the underlying DNA sequence, environmental exposures, and health outcomes.
One of the most active areas of research related to DNAm inter-individual variability in human cohorts focuses on the relationship between DNAm and aging, as there has been substantial evidence that DNAm changes with age, both linearly and non-linearly, across the entire life course (Horvath, 2013). Rapidly evolving new technologies and resources have fueled exponential growth in human DNAm research over the past decade, enhancing our ability to address questions such as the effects of aging. Although many methodologies can be used to measure DNAm quantitatively, Illumina microarrays are the most common method for population-based epigenetic studies, as they provide an economical and accessible high-throughput platform. Over a little more than a decade, the capacity of the Illumina DNAm microarray platform has increased from 1,506 CpGs to more than 860,000 CpGs. The increased number of CpGs reflects both better coverage across genes and expanded interrogation of genomic regions. For example, the Illumina 27K (27K) array targeted >27,000 CpG sites and interrogated at least one CpG per gene, but was biased towards CpG islands (Bibikova et al., 2009). Its successor, the Illumina Infinium HumanMethylation450 (450K) array, assessed >485,000 CpGs and covered 99% of RefSeq genes. The Illumina Infinium MethylationEPIC (EPIC) array is the newest tool and allows the quantification of over 860,000 CpG sites, with the additional content providing higher coverage of specific genomic regions, such as enhancers and non-coding regions. The EPIC array generally uses the same DNAm measurement protocol as the 450K array and includes over 94% of the 450K content (Moran et al., 2015). However, the increased genomic resolution and complexity of the EPIC array, in conjunction with the missing 6% of the 450K CpGs, necessitates an evaluation of the applicability of bioinformatic tools established for the 27K or 450K arrays.
To accommodate advancements in DNAm array technology and the increasing volume of data, many pipelines for data preprocessing, normalization, and analyses have been developed to streamline data handling (Aryee et al., 2014; Davis, Du, Bilke, Triche, & Bootwalla, 2014; Morris et al., 2014; Teschendorff et al., 2013; Triche & Weisenberger, 2013). Here we refer to “preprocessing methods” as algorithms commonly performed on DNAm data prior to probe-type normalization, including methods to reduce background fluorescence or adjust for dye bias, which if unaddressed can reduce the dynamic range of beta values (Triche, Weisenberger, Van Den Berg, Laird, & Siegmund, 2013). Probe-type normalization is a necessary adjustment for Illumina microarray DNAm data, as there are two different probe designs that possess differential beta distributions (Bibikova et al., 2009). Tools such as the R function ‘preprocessNoob’ in the minfi package subtract background based on the out-of-band intensities (for example, Infinium I probes fluorescing in the colour channel opposite their designed base extension). Colour or dye bias adjustment is applied to account for the two colour channels that type II probes employ, one for methylated and one for unmethylated CpGs, since background fluorescence can introduce unwanted variation. Tools to account for the colour bias include those in the Bioconductor package ‘methylumi’, which is based on smooth quantile normalization, and the Illumina GenomeStudio software, which implements shift-and-scaling normalization (Dedeurwaerder et al., 2014). Although these methods have been reviewed in comparison to one another (T. Wang et al., 2015; Wilhelm-Benartzi et al., 2013; Yousefi et al., 2014), a wide variety of pipelines are used across the literature, and the influence of method selection on detecting true positives or generating accurate predictions should be investigated both within and across array technologies.
One tool that could be compromised by different preprocessing methods or the lack of certain 450K CpG sites on the EPIC array is the epigenetic clock, a popular predictive model that estimates an individual’s biological age, irrespective of tissue type, using DNAm at 353 CpGs (Horvath, 2013). Established on DNAm profiles (obtained from 27K and 450K data) from 51 different tissues from over 8,000 individuals, the epigenetic clock calculates DNAm age, which has been shown to correlate well with chronological age (r > 0.80) across the life course (Horvath & Raj, 2018). This epigenetic clock is hypothesized to be an accurate molecular biomarker of biological aging, and deviations between chronological and DNAm age, commonly referred to as epigenetic age acceleration (which can be positive or negative), have been correlated with a host of age-related conditions, such as Parkinson’s disease, time until death, frailty, and cognitive and physical decline (Breitling et al., 2016; Horvath & Ritz, 2015; Marioni, Shah, McRae, Chen, et al., 2015a; Marioni, Shah, McRae, Ritchie, et al., 2015b). However, 19 of the 353 clock-CpGs are not present on the EPIC array, and since the 450K platform is no longer available, it is crucial to assess the performance of the pan-tissue epigenetic clock despite the missing probes, if use of this tool is to be continued. Here, we investigated 1) the consistency between DNAm age measured from 450K and EPIC array data from the same individuals to evaluate the utility of EPIC array data in the Horvath epigenetic clock (Horvath, 2013), given that it is missing 19 clock CpGs, and 2) whether DNAm age estimates differ with different preprocessing methods. We found that EPIC data can be used to predict DNAm age accurately using the epigenetic clock.
Additionally, we observed differences in DNAm age across the assessed preprocessing methods, although the differences across the values were below the reported median absolute error of the epigenetic clock. Lastly, we replicated accurate measurement of DNAm age across tissues using an EPIC data set with three different tissues from 13 individuals. Our findings provide preliminary support for the epigenetic clock as a robust tool that may be applied with EPIC array data in the future.

5.2 Methods

5.2.1 Cohort characteristics

We used two different cohorts in order to assess the pan-tissue epigenetic clock on the EPIC array. The first consisted of primary monocytes collected from 172 healthy males, aged 19-50 years old, of self-reported African or European descent from the EVOIMMUNOPOP project (Husquin, 2018). Genomic DNA was isolated using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany). Quantity and quality of DNA were assessed using a NanoDrop™ 8000 Spectrophotometer (Thermo Scientific), and the DNA was then subjected to bisulfite conversion with the EZ DNA Methylation Kit (ZymoResearch, Irvine, CA). We quantified DNAm on all samples using two separate Illumina microarray platforms: 450K and EPIC arrays (Illumina, San Diego, CA), following the manufacturer’s instructions. To verify that samples were labelled consistently across technologies, we assessed the correlation between the overlapping quality control single nucleotide polymorphism (SNP) probes present on both microarrays (59 SNPs), observing that all sample pairs correlated with a Pearson’s coefficient of r ≥ 0.99 (Supplementary figure 5.1). Four technical replicates were included during the 450K processing and 12 technical replicates were included during the EPIC sample processing, with two common technical replicates across technologies.
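The sample identity check described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the β-values are hypothetical, only six SNP probes are shown instead of 59, and the function names are our own. SNP probes cluster near 0, 0.5, or 1 according to genotype, so the same individual assayed on both arrays should yield nearly identical profiles.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def same_individual(snp_450k, snp_epic, threshold=0.99):
    """Flag a 450K/EPIC sample pair as concordant if r >= threshold."""
    return pearson_r(snp_450k, snp_epic) >= threshold

# Hypothetical SNP-probe β-values (genotype clusters near 0, 0.5, and 1).
sample_a_450k = [0.02, 0.51, 0.97, 0.49, 0.03, 0.98]
sample_a_epic = [0.03, 0.50, 0.96, 0.52, 0.02, 0.97]  # same person, EPIC array
sample_b_epic = [0.50, 0.98, 0.03, 0.02, 0.49, 0.51]  # a different person

matched = same_individual(sample_a_450k, sample_a_epic)
mismatched = same_individual(sample_a_450k, sample_b_epic)
print(matched, mismatched)
```

A pair that falls below the threshold would point to a possible sample swap or labelling error between platforms.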
Our secondary cohort, consisting of 13 individuals aged 23-46 years old from the control subset of Diesel Exhaust Study III (DE3), comprised DNAm from peripheral blood mononuclear cells (PBMCs), bronchoalveolar lavage (BAL), and bronchial brushings (brush). All samples were collected from individuals after control (filtered air/saline) exposures. Primary cohort characteristics are provided in Supplementary table 5.1. Genomic DNA was isolated using the DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany) and subsequently bisulfite converted using the EZ DNA Methylation Kit (ZymoResearch, Irvine, CA). Bisulfite-treated samples were processed using the EPIC array as above (Illumina, San Diego, CA).

5.2.2 DNA methylation quantification

All microarrays were scanned with an Illumina HiScan system. For the EPIC array data, we used the most current manifest file, “Infinium MethylationEPIC v1.0 B4 Manifest File”, released by Illumina on May 26, 2017, and consisting of 865,918 probes, whereas for the 450K we used the “HumanMethylation450 v1.2 Manifest File” with 485,577 probes. Both manifest files are available at: https://support.illumina.com/downloads.html. In addition to unprocessed (raw) data, we used data preprocessed in three different ways: 1) colour corrected/background subtracted in GenomeStudio (GS), 2) quantile-normalized using “preprocessQuantile” (Touleimat & Tost, 2012), and 3) normal-exponential out-of-band (noob)-normalized with “preprocessNoob” (Triche & Weisenberger, 2013). Raw data and data that were to be quantile or noob normalized were uploaded directly into R from IDAT files using the ‘minfi’ package function “read.metharray” (Fortin, Triche, & Hansen, 2016). For colour correction/background subtracted preprocessing, data were background subtracted/colour corrected with GenomeStudio, and then uploaded into R with the package ‘methylumi’, function ‘lumiMethyR’ (Davis et al., 2014).
5.2.3 DNA methylation age

We calculated DNAm age for each sample by using a modified version of the publicly available R code at https://dnamage.genetics.ucla.edu, with the normalization feature set to “TRUE”, which applies a version of beta-mixture quantile normalization to adjust for the probe type distribution differences (Horvath, 2013; Teschendorff et al., 2013). We have provided the R code used for all analyses presented here at https://github.com/lmcewen/EPIC450K. We focused our inquiry on data preprocessing only, and not probe-type normalization methods, as the epigenetic clock code applies an imputation of missing values and performs a calibrated version of a beta-mixture quantile normalization (Horvath, 2013; Teschendorff et al., 2013).

5.3 Results

5.3.1 The epigenetic clock accurately predicts DNA methylation age from EPIC methylation data

From the 450K array, 33,059 (6.8%) of 485,577 probes are not represented on the EPIC array, including 19/353 epigenetic clock-CpGs (5.4%). The lack of 19 epigenetic clock CpGs on the EPIC array could reduce the accuracy of the epigenetic clock when using EPIC array data. Therefore, we investigated the consistency between DNAm age as calculated from the 450K (original 353-CpG model) and the EPIC (reduced 334-CpG model) arrays. We applied the epigenetic clock to data from samples run on both platforms and found a high correlation between the 450K and EPIC array DNAm age values regardless of preprocessing method (r = 0.91–0.96, error = 1.44–3.10 years, R² = 0.83–0.91) (Figure 5.1), observing consistent patterns between chronological age and DNAm age as measured from both the EPIC (r = 0.84–0.86) and 450K (r = 0.86–0.87) arrays (Supplementary figure 5.2). As a second approach to assess the direct consequence of the absent clock sites, we removed the 19 missing EPIC clock-CpGs in the 450K data to simulate the 334-CpG model.
We then calculated DNAm age using both the 353-CpG model and the 334-CpG model from the 450K data, finding a strong correlation of r = 0.998, indicating that the missing 19 CpGs did not adversely affect DNAm age prediction in monocytes (Supplementary figure 5.3).

Figure 5.1 DNA methylation age comparison between 450K and EPIC monocyte data across preprocessing methods. Identical samples were assayed on both the 450K and EPIC arrays, and then each preprocessed in one of four ways prior to calculating DNA methylation (DNAm) age: raw unprocessed, GenomeStudio colour correction/background subtraction (GS), normal exponential out-of-band (noob) normalization, or quantile normalization. Solid coloured line represents corresponding group regression line. For each regression, the Pearson’s correlation coefficient, error (median absolute error between EPIC DNAm age and 450K DNAm age), R² value, and p-value corresponding to the correlation coefficient are shown.

Lastly, to confirm that the epigenetic clock could produce an accurate estimate of age from EPIC data across different tissues, we used an independent cohort of three tissues (PBMCs, BALs, brushes) collected from 13 healthy adults. Calculating DNAm age with GS-preprocessed data, we observed a strong correlation with chronological age using EPIC data for the PBMC and BAL samples (r = 0.88, p-value = 4.1×10⁻⁹; r = 0.89, p-value = 0.003, respectively), but a lesser degree of correlation for the brush samples (r = 0.59, p-value = 0.05) (Figure 5.2). We note that the brush beta-value distribution across all probes was considerably more variable than that of the other tissues, which may explain the lower correlation in brush samples (Supplementary figure 5.4).

Figure 5.2 EPIC DNA methylation age estimated in control samples from the Diesel Exhaust III Study across three tissues.
DNA methylation (DNAm) age was estimated using the EPIC 334-CpG model from GenomeStudio background-subtracted and colour-channel-adjusted EPIC data. The linear regression line is shown with 95% confidence intervals in grey. Error is the median absolute difference between EPIC DNAm age and chronological age. Pearson’s correlation coefficient (r) and corresponding p-value are shown for each tissue. BAL = bronchoalveolar lavage, PBMC = peripheral blood mononuclear cells, Brush = bronchial brushing.

5.3.2 Data preprocessing methods affect the calculated DNA methylation age, but within the error margins of the epigenetic clock

Given that there is no accepted standard method of preprocessing data prior to calculating DNAm age, we assessed the potential effects of different commonly used data preprocessing methods on the DNAm age estimates. We compared DNAm age estimates calculated from raw data as well as after applying three separate standard data preprocessing methods: colour correction and background normalization with GenomeStudio software (abbreviated GS), quantile normalization, or noob normalization. Imputation and a probe-type normalization were performed the same way across preprocessing methods using the R code supplied with the epigenetic clock method (Horvath et al., 2012; Teschendorff et al., 2013; Troyanskaya et al., 2001). Using monocyte-derived data from 172 subjects on both the 450K and EPIC arrays, we found that DNAm age was highly correlated between raw data and data from the three preprocessing methods (r > 0.91) (Supplementary figure 5.2). However, we observed shifts in mean DNAm age between methods, indicating that although mean DNAm age differed, the trends with age were consistent across preprocessing methods (Figure 5.3a). This was further supported by significant Kendall rank correlation coefficients for DNAm age between preprocessing methods (τ = 0.86–0.94, p-value < 2.2 × 10⁻⁶) (Supplementary figure 5.5).
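The comparison described above, where the rank order of samples is preserved while mean DNAm age shifts between preprocessing methods, can be sketched as follows. This is a minimal illustration and not the analysis code used in this chapter: the DNAm ages are hypothetical, the function names are invented for the example, and the coefficient is the simple tau-a form without tie correction, which may differ from the variant reported in the text when ties are present.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of sample pairs.

    Assumption: no tie correction is applied.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def mean_shift(x, y):
    """Mean of the per-sample differences y - x, in years."""
    return sum(b - a for a, b in zip(x, y)) / len(x)


# Hypothetical DNAm ages (years) for the same five samples under two methods.
raw_age = [30.0, 41.5, 52.0, 60.5, 71.0]
noob_age = [33.0, 44.0, 56.5, 63.0, 75.5]

tau = kendall_tau(raw_age, noob_age)
shift = mean_shift(raw_age, noob_age)
print(f"tau = {tau:.2f}, mean shift = {shift:.2f} years")
```

In this toy input every sample keeps its rank, so the rank correlation is maximal even though each estimate is shifted upward, which is the pattern the paragraph above describes.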
Figure 5.3 DNA methylation age acceleration variation across preprocessing methods. (a) Scatter plot of EPIC DNA methylation (DNAm) age calculated from raw data and from data after three different preprocessing methods: quantile, GenomeStudio (GS), and normal exponential out-of-band (noob) normalization. Coloured lines are the regression lines for each group, and the surrounding shaded grey areas represent 95% confidence intervals. (b) Boxplot of estimated DNAm age minus chronological age (acceleration difference) for each preprocessing method. The median is illustrated by a horizontal line, with upper and lower hinges representing the 25th and 75th percentiles; upper and lower whiskers correspond to the largest and smallest values no further than 1.5 × the inter-quartile range from the hinges, respectively. Coloured data points represent individual samples for each group. (c) Boxplot of residuals from a linear regression (DNAm age ~ chronological age) across methods.

To further investigate the sample-to-sample trend in DNAm age across methods, we explored two common measures associated with the epigenetic clock, both considered measures of epigenetic age acceleration: the difference between DNAm age and chronological age (age acceleration difference) and the residuals from a linear model of DNAm age regressed onto chronological age (age acceleration residual). Given the observed mean DNAm age shifts when using different preprocessing methods (Figure 5.3a), age acceleration difference is more likely to be affected by the choice of preprocessing method. Age acceleration residual, on the other hand, is less affected by mean differences as it is expressed relative to the measured population.
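The two acceleration measures defined above can be illustrated with a short sketch. The values below are hypothetical, the function names are invented for the example, and the uniform 4-year offset is only a stand-in for the kind of mean shift a preprocessing method might introduce; it is not an estimate from this study.

```python
def acceleration_difference(dnam_age, chron_age):
    """Age acceleration difference: DNAm age minus chronological age."""
    return [d - c for d, c in zip(dnam_age, chron_age)]


def acceleration_residual(dnam_age, chron_age):
    """Age acceleration residual: residuals of DNAm age ~ chronological age,
    fitted by ordinary least squares within the measured sample set."""
    n = len(chron_age)
    mean_x = sum(chron_age) / n
    mean_y = sum(dnam_age) / n
    sxx = sum((x - mean_x) ** 2 for x in chron_age)
    if sxx == 0:
        raise ValueError("chronological ages must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(chron_age, dnam_age))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(chron_age, dnam_age)]


# Hypothetical ages in years for five samples.
chron = [35.0, 48.0, 52.0, 61.0, 70.0]
dnam = [33.0, 50.0, 49.5, 63.0, 68.0]
shifted = [a + 4.0 for a in dnam]  # assumed uniform shift of 4 years

diff_a = acceleration_difference(dnam, chron)
diff_b = acceleration_difference(shifted, chron)
res_a = acceleration_residual(dnam, chron)
res_b = acceleration_residual(shifted, chron)

print("difference changes by:", [round(b - a, 6) for a, b in zip(diff_a, diff_b)])
print("residual changes by:  ", [round(b - a, 6) for a, b in zip(res_a, res_b)])
```

Because the regression intercept absorbs a constant offset, the residuals are unchanged by the shift while every difference value moves by the full offset.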
As expected, we observed significant discrepancies in the mean age acceleration difference measure for nearly all comparisons (p-value < 0.0002) except for noob versus GS (p-value = 0.23) (Figure 5.3b), and minimal variation in the mean age acceleration residual across preprocessing methods (p-value = 1) (Figure 5.3c). This supports the previous suggestion of using age acceleration residuals (Horvath, 2013) to correct for preprocessing-specific shifts in DNAm age estimates in order to accurately compare DNAm age between people.

We assessed how different preprocessing methods influenced the DNAm age estimate by examining the concordance of DNAm age measured from EPIC array technical replicates. A technical replicate pair represented an identical DNA sample quantified twice for quality control purposes; specifically, the sample was divided into two separate tubes after bisulfite conversion and DNAm was quantified separately. Technical replicate sample identity was confirmed by examining the 59 SNP probes present on the EPIC array (Supplementary figure 5.6). We focused on the 24 technical replicates (12 pairs) from the EPIC array, calculating DNAm age for each technical replicate from data subjected to each separate preprocessing method: raw, GS colour corrected and background subtracted, quantile normalized, or noob normalized. We calculated the median absolute difference between each technical replicate pair’s DNAm age estimates in each data set. We found that the GS colour correction and background subtraction had the least deviation across replicates (error_GS = 2.17 years), followed by noob normalization (error_noob = 2.41 years) and quantile normalization (error_quantile = 2.89 years), with raw data having the largest deviation (error_raw = 3.14 years) (Figure 5.4). Notably, these values are all below the median absolute error of the epigenetic clock (3.6 years) (Horvath, 2013).
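The replicate concordance measure described above, the median absolute difference in DNAm age within technical replicate pairs compared against the reported clock error of 3.6 years, can be sketched as below. The replicate values and the function name are hypothetical and serve only to show the calculation; they are not the data behind Figure 5.4.

```python
from statistics import median

# Median absolute error of the epigenetic clock as cited in the text (years).
CLOCK_ERROR_YEARS = 3.6


def replicate_error(pairs):
    """Median absolute difference in DNAm age across technical replicate pairs."""
    return median(abs(a - b) for a, b in pairs)


# Hypothetical DNAm ages (years) for three replicate pairs under two methods.
pairs_by_method = {
    "raw": [(40.0, 43.5), (55.0, 52.0), (61.0, 64.0)],
    "GS": [(41.0, 43.0), (54.0, 52.5), (62.0, 64.5)],
}

errors = {method: replicate_error(pairs) for method, pairs in pairs_by_method.items()}
for method, err in errors.items():
    status = "below" if err < CLOCK_ERROR_YEARS else "at or above"
    print(f"{method}: {err:.2f} years ({status} the reported clock error)")
```

Using the median rather than the mean keeps a single discordant replicate pair from dominating the summary, which matters with only 12 pairs.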
Figure 5.4 Absolute difference between technical replicate pairs for each processing method. The y-axis represents the absolute difference between each technical replicate pair’s DNA methylation (DNAm) age. Each processing method is represented on the x-axis. The median difference is indicated by the red cross for each group. Each colour of data point represents one technical replicate pair for ease of interpretation across methods. Error refers to the median absolute difference between DNAm age and chronological age.

5.4 Discussion

This study had two primary aims: 1) to investigate whether using EPIC methylation data to calculate DNAm age is an appropriate approach, given that 19 out of 353 clock CpGs are missing on this new platform, and 2) to evaluate the effect of various data preprocessing methods prior to calculating DNAm age, as a standard pipeline for processing data prior to calculating DNAm age does not exist. By analyzing monocyte DNAm from 172 individuals quantified on both the 450K and EPIC arrays, we demonstrated that the lack of the 19 clock CpGs on the EPIC array did not compromise the utility of the epigenetic clock. We also evaluated the performance of the EPIC epigenetic clock (334-CpG model) on another EPIC data set, consisting of three tissues from 13 individuals, finding comparable correlations with age to those reported with the 450K array (353-CpG model). Furthermore, we found small differences in the DNAm age estimate between data preprocessing methods, implying that although the methods assessed here differed in mean values, the trends with respect to chronological age were consistent across methods. Finding that preprocessing method influenced mean values of DNAm age is important for the interpretation of future analyses, as we demonstrated that variation in DNAm age can be introduced by how the data are preprocessed.
Our work here provides supporting evidence for the DNAm age acceleration residual measure, since this value is reflective of inter-individual variability within a measured data set and is, therefore, more comparable across studies. In contrast, the DNAm age difference, the crude difference between estimated DNAm age and chronological age, can be reflective of global DNAm shifts due to preprocessing methods.

Whichever measure of DNAm age is used (difference, residual, or age itself), small effect sizes should be interpreted with caution. To highlight this point, we calculated DNAm age for technical replicates from raw data and three different preprocessing methods. We found that while there was some variability in DNAm age across technical replicates, regardless of preprocessing method, the observed median absolute error in DNAm age for each method (2.17–3.14 years) was lower than the reported error of the epigenetic clock (3.6 years) (Horvath et al., 2016). GS-preprocessed data produced the tightest replicates, followed by noob and then quantile normalization, and the consistency between replicates was lowest when DNAm age was calculated from raw data. These findings may suggest using preprocessed rather than raw data, but overall we emphasize the importance of considering the technical error of the epigenetic clock and advise caution when interpreting changes of less than 3.6 years.

To examine the appropriateness of using EPIC data to calculate DNAm age for future research, we took advantage of a cohort with DNAm data on the same individuals on both the 450K and EPIC arrays. It is crucial to examine whether the epigenetic clock can continue to be used on EPIC data, as the 450K platform is no longer available. There was high consistency between 450K and EPIC DNAm age estimates, and the lack of 19 CpG sites did not significantly affect the prediction accuracy of the epigenetic clock.
This consistency lends support to the strength of this predictive model on the EPIC platform and will allow users to continue applying this bioinformatic tool to calculate DNAm age.

To further examine the application of EPIC array data to predict DNAm age, we estimated DNAm age in an independent EPIC array cohort. We observed correlations between EPIC DNAm age and chronological age that were comparable to previous reports, specifically in PBMCs and BAL samples. The strong association in PBMCs is consistent with previous reports of DNAm age in PBMCs as generated from 450K data (Horvath & Levine, 2015; Horvath, Pirazzini, et al., 2015b). We observed less consistency in the brush samples; however, this tissue was not included in the training data of the 353-CpG epigenetic clock, so the lower performance may reflect a property of the clock itself rather than of the EPIC array. This is reinforced by our experiment removing the 19 clock CpGs not present on the EPIC array from the 450K data, where we observed a nearly perfect correlation with the 353-CpG data, suggesting that the loss of the 19 clock CpG sites did not influence the accuracy of the epigenetic clock.

There are limitations to this study that should be taken into consideration when interpreting the results. The primary data sets we investigated when comparing EPIC versus 450K estimated DNAm age were from monocyte samples, and although we found that the lack of 19 CpGs did not affect the DNAm age estimate in this specific cell type, those 19 CpGs may be important for estimating age in other tissues. Their importance to other tissues remains to be explored. Additionally, the methods applied in the current study should not be generalized across all studies.
For example, global normalization methods, such as quantile normalization, are not appropriate in all cases, as interesting biological information can be removed in data sets with large variation across samples, such as cancer-versus-normal or multiple-tissue projects. Instead, the use of these data transformation methods should be considered on a study-by-study basis (Hicks & Irizarry, 2015). Furthermore, while we are cognizant that several other preprocessing options are available, for the purposes of our exploration and presentation of these data we assessed only three of the most common methods.

In summary, we have investigated the variation in DNAm age calculated from raw data and from data subjected to different preprocessing methods. Additionally, we have shown the accuracy of calculating DNAm age with the most recent human DNAm platform, the EPIC array. Our work will provide researchers the confidence to investigate DNAm age using the EPIC array, as well as encourage users to critically consider the technical error of the epigenetic clock when interpreting future findings.

Chapter 6: Discussion

6.1 Summary

The body of work reported in this dissertation focuses on analyzing DNA methylation changes across the life course, assessing the period of early development to late adolescence, the effects of a mid-life health intervention, and the unique DNA methylation profiles of a long-lived population. Additionally, I performed an analysis of potential technical variance introduced when using a common tool for predicting epigenetic age.

Firstly, I explored pediatric epigenetic aging and the potential for a predictor specific to this life stage. I combined 14 different buccal epithelial DNA methylation data sets, both unpublished and published, all containing typically developing individuals ranging in age from 0 to 20 years and totalling over 1,700 samples. By using a training/test design and in collaboration with Dr.
Steve Horvath, I developed a predictive model of age using elastic net regression to select 94 age-informative CpG sites for the pediatric age range. The test error of this predictive model was less than 4 months, and its estimates correlated very highly with chronological age. I also showed that deviations in the estimated pediatric DNA methylation age differed moderately in individuals with Autism Spectrum Disorder when controlling for age, estimated buccal proportion, ethnicity, and biological sex. The findings from this study are exciting but remain to be fully explored to determine what, if any, biological importance or alterations in developmental trajectories these DNA methylation patterns represent.

Next, I investigated whole blood DNA methylation measured at over 485,000 CpGs across the genome from one of the world's longest-lived populations, who reside in Nicoya, Costa Rica. In this study, I compared Nicoyan DNA methylation profiles to those of individuals from other areas of Costa Rica, assessing known age-related trends in methylation such as epigenetic drift and the epigenetic clock. I found that Nicoyans had less DNA methylation variability than non-Nicoyans, a characteristic that may reflect a “younger” epigenome, and also that there were no differences in DNA methylation age. Rather, I observed that on average all Costa Ricans had a younger than expected DNA methylation age. Additionally, I found a significant difference in predicted CD8+ T cells; specifically, I observed a higher proportion of estimated CD8+ naïve T cells and a lower proportion of CD8+ memory T cells in the Nicoyans, which is also indicative of a younger biological phenotype. These results are supportive of Nicoyans being a biologically younger population; however, the findings of DNA methylation age suggest that both Nicoyan and non-Nicoyan Costa Ricans may be an epigenetically “younger” group.
Given that lifestyle interventions have been associated with changes in DNA methylation, including in a physical activity study in men (Rönn et al., 2013), I examined DNA methylation before and after a physical activity intervention in a mid-life post-menopausal cohort. This was a small pilot study investigating DNA methylation in peripheral blood collected from women before and after a moderate 6-month exercise intervention, as well as in a control group who did not undergo the intervention. I did not find any association between DNA methylation and the intervention program. However, one major outcome of the original intervention study was a significant weight loss that favoured the intervention group. I therefore assessed whether any DNA methylation changes were associated with percent body weight change and found 12 CpG sites that showed a concordant change over time. These differential DNA methylation findings were exploratory and cannot be interpreted as evidence for DNA methylation playing a mechanistic role in weight loss, but they may help pinpoint the gene pathways involved and do suggest that DNA methylation may be influenced by weight loss. Given that few intervention studies have been performed in the context of DNA methylation changes, these findings support the need for a well-designed larger trial to further our understanding of the complexities of the methylome and the factors that sculpt it.

Lastly, I reported results from a technical analysis assessing the epigenetic clock in two contexts. One is the utility of using the multi-tissue DNA methylation age predictor with data quantified with the EPIC platform, as this new microarray does not include all 353 clock sites. The second is an assessment of how different data preprocessing methods may influence the estimate of DNA methylation age using the Horvath epigenetic clock.
I found that epigenetic clock performance was not compromised by the loss of 19 clock CpG sites and that the clock is therefore applicable to EPIC data. This observation is important because the previous 450K microarray is no longer available. Additionally, I observed moderate differences in DNA methylation age across the three different preprocessing methods I assessed, providing evidence of data preprocessing method variability.

6.2 Biomarkers of aging

Given the results described in this thesis, specifically in Chapters 2, 3 and 5, it is clear that DNA methylation is closely associated with aging. However, epigenetic age is not the only biomarker of aging identified to date, and to gain insight into the biological underpinnings of the aging process it is important to consider how age-associated measures are interconnected with each other.

6.2.1 The epigenetic clock

The concept of epigenetic age has gained attention in many research fields as it allows investigators to easily estimate one possible measure of biological age that can be useful in a variety of health research studies. I applied the multi-tissue epigenetic clock as well as tissue-specific DNA methylation age predictors to each of the data sets mentioned in the work described throughout Chapters 2-5. Epigenetic predictors with few sites, such as the Weidner clock, did not perform as well as the epigenetic clocks with more CpG sites, such as the Hannum or Horvath (“multi-tissue”) epigenetic clock (Hannum et al., 2013; Horvath, 2013). This implies that accuracy in age prediction is achieved by using more information from the DNA methylome; however, fewer sites may be useful in certain contexts, such as forensic applications or when microarray quantification is not feasible (Horvath & Raj, 2018).

The predictor outlined in Chapter 2 uses DNA methylation at 94 CpG sites and has a high accuracy, with an error of less than 4 months. This tool is specific to buccal epithelial samples and to a particular age range.
The choice to restrict this epigenetic clock to individuals under 20 years old was based upon previous research that has defined birth to late adolescence as a highly dynamic period compared to later life (Alisch et al., 2012). Given that there is only one CpG overlapping between this predictor and the multi-tissue epigenetic clock, it is easy to speculate that these two tools are capturing different molecular processes. It seems likely that the DNA methylation changes in early life reflect or serve different purposes than those in adulthood. Further work will be required to fully characterize this predictor as well as determine whether estimate deviations relate to expected developmental measures.

6.2.2 Telomeres and Mitotic Division

Another prominent biomarker of age is leukocyte telomere length. Telomeres, the caps of chromosomes composed of DNA segments and protein, shorten with each cell division, and therefore the length of these repeated genomic elements is associated with chronological aging (Blackburn, Greider, & Szostak, 2006). Telomere length has been associated with aging as well as certain age-related diseases, such as cardiovascular disease (Rehkopf et al., 2016). Interestingly, telomere length is longer in Nicoyans compared to non-Nicoyans (Rehkopf et al., 2013), but this population-specific younger biological age was not observed when I compared epigenetic age. This finding parallels other investigations into epigenetic age and telomere length, which have identified them as associating independently with age (Belsky et al., 2017; Marioni et al., 2016). Additionally, mitotic cell division is not related to epigenetic age, as the multi-tissue epigenetic clock predicts similar estimates across tissues that are highly proliferative and those that are less proliferative.
However, it has been suggested that the negligible correlation between these two biomarkers of aging may only be in differentiated cells, as studies in stem cells suggest that telomere biology may relate to epigenetic aging (Horvath & Raj, 2018). Further investigations into these separate biological signatures of time will provide answers to whether these are completely independent entities or if they converge in a common pathway.

6.2.3 Other age predictors

In addition to DNA methylation and telomere length, other less frequently applied biological predictors of age have been identified. In terms of molecular markers, deviations in transcriptomic age, based on gene expression, have been positively correlated with blood pressure, serum cholesterol level, and body-mass index (M. J. Peters, n.d.). However, the association between transcriptomic age and the epigenetic clock is very weak, and it has therefore been suggested that these measures reflect different features of biological aging (Jylhävä, Pedersen, & Hägg, 2017). Biological age has also been predicted from protein glycosylation, candidate protein levels, and a selected group of metabolites (Jylhävä et al., 2017).

Tissue aging has also been investigated to predict biological age. For example, neuroimaging of brain structures has been used to estimate age (Franke, Ziegler, Klöppel, Gaser, Alzheimer's Disease Neuroimaging Initiative, 2010). These measures have been associated with mortality and showed higher predictive capacity when combined with DNA methylation age (Cole et al., 2018).

6.3 Intervention studies

Most population-based DNA methylation studies in the current literature are cross-sectional by design, which limits any interpretation regarding causal inference or temporal stability. Longitudinal studies are able to answer questions about directionality of associations and can provide information about reversibility when coupled with an intervention.
Intervention studies with pre/post-sampling are powerful in that they control for possible individual confounding factors that may contribute unwanted variation (Hussey, Lindley, & Mastana, 2017). In humans, intervention studies looking at DNA methylation have primarily focused on diet, methods of physical activity, and weight loss (Aronica et al., 2017; Harkess, Ryan, Delfabbro, & Cohen-Woods, 2016; Horvath et al., 2014; Hussey et al., 2017; Rönn et al., 2013).

The study reported in Chapter 4, which was designed to look at a physical activity intervention, did not find any differential DNA methylation changes between those who participated in the program and controls over the 6-month study period. This may have resulted from the small sample size or from the physical activity program not being rigorous enough, as previous studies have found DNA methylation changes over exercise regimes, although these were in adipose tissue or skeletal muscle, which may be more relevant tissues (Nitert et al., 2012; Rönn et al., 2013). However, we did observe changes associated with percent weight loss. A meta-analysis of DNA methylation and weight loss interventions recently reported limited overlap of results in genome-wide studies, and most of the sites that did validate had <5% change in methylation (Aronica et al., 2017). Given that in Chapter 4 a 5% minimum methylation change was used as a threshold for the EWAS to reduce false discoveries, it may not be surprising that none of the sites I identified over the intervention period overlapped with the validated findings of the meta-analysis.

6.4 Data preprocessing

All data reported throughout this dissertation were subjected to a pipeline of data processing, including but not limited to quality control steps, normalization procedures, data filtering, batch effect removal, and data reduction techniques.
Previous work has characterized optimal protocols for certain aspects of this pipeline, such as comparative studies outlining the most effective method to normalize probe design differences (T. Wang et al., 2015). Furthermore, studies have examined how certain batch effects can actually influence results (Buhule et al., 2014; Cazaly et al., 2016). However, with new tools such as the epigenetic clock, no study has outlined a specific data handling procedure prior to implementing the calculation of epigenetic age.

The results I reported in Chapter 5 showed greater variance in epigenetic age due to preprocessing method than expected but also showed that some methods may reduce this technical noise better than others. The results also illustrated an approximate threshold for effect sizes when reporting findings from epigenetic clock studies. In summary, these observations will provide the field with needed information when using the epigenetic clock and may ultimately support consistency across the literature for both accurate validation studies and comparative analyses.

6.5 Limitations

There are limitations to the work presented in these studies that should be considered when interpreting the results. Many common factors arise when performing human EWAS on DNA obtained from a cell-type heterogeneous tissue, such as blood or buccal epithelial cells. Additionally, the lack of genetic background information, such as genotyping, was a limiting factor and could influence the accuracy of the pediatric DNA methylation age predictor as well as the differentially methylated CpG findings reported in Chapters 3 and 4.

6.5.1 Cell-type heterogeneity

Given that DNA methylation is a binary measure, either present or absent, for each cell, the data we obtained from microarray experiments were converted into values representing the percentage of methylated cells for each locus.
To add to this complexity, the majority of EWAS projects focus on tissues that are heterogeneous in terms of cell type, such as blood or saliva, rather than on an isolated cell type, as separating samples into individual cell types is expensive and not usually feasible in large human cohort studies. However, even in samples that have gone through cell sorting prior to DNA methylation quantification, variation may be observed that is attributable to inefficient sorting or that potentially reflects different cellular states of the same cell type (Lappalainen & Greally, 2017). As discussed previously, each cell type has a unique DNA methylation profile that is indicative of that cell's identity, and this feature introduces a considerable level of variation in human DNA methylation microarray data. It is therefore not surprising that principal component analyses of the DNA methylation cohorts presented here consistently highlight the first component of variation being associated with estimated or counted cell type measures. To account for this, there are reference-based methods for brain (Guintivano et al., 2013), blood (Houseman et al., 2012), saliva (A. K. Smith et al., 2015), or buccal samples (Eipel et al., 2016), which all use DNA methylation profiles from isolated cell types to infer estimated cell type proportions in a given sample. Although these methods may not perform perfectly, they are currently the most appropriate approach for the sample types used in such studies to reduce variation attributed to cellular heterogeneity.

Although this variation is most often unwanted in epigenetic analyses, using DNA methylation to estimate cell type proportions can have interesting outcomes and may pose a confounding limitation when examining heterogeneous tissue composed of age-related cell types.
For example, removing variance attributed to cell type proportions could potentially remove biologically interesting signals related to aging, such as what I observed in Chapter 3. This has also been observed by others who have found interesting differences in DNA methylation-based estimated cell proportions, such as white blood cell composition, between adopted and non-adopted youth (Aronica et al., 2017; Esposito et al., 2016). Therefore, the ideal route of epigenetic investigation is to examine isolated cell types; however, that in itself is a challenge, as not all purification systems are precise and significant proportions of cells may be in different states even if they share a specific cell type marker (Lappalainen & Greally, 2017).

6.5.2 Genetics

An important consideration for DNA methylation analysis is the underlying genetic variation, as this is a known contributor to DNA methylation variability (Banovich et al., 2014). For all the population studies presented in this dissertation, genetic background information (genotyping) was not available, which is a notable limitation of this research. When reported ethnicity was accessible, this measure was used as a categorical representation of genetic background in regression models. Although this approach would not account for all site-specific mQTL relationships, large variation in DNA methylation attributed to ethnicity differences would be adjusted for. Furthermore, methods to infer genetic structure were employed, for example, the ‘epistructure’ method (Rahmani et al., 2017) in Chapter 3, but these approaches have not been fully validated. Including genetic information is important for certain DNA methylation analyses, as previous work in umbilical cord tissue has shown that an interaction between genotype and environment performs the best in terms of explaining DNA methylation variation.
Moving forward, in the absence of actual genotyping data for a given cohort, the rapidly emerging availability of mQTL databases based on large populations will provide the field with a useful resource for post hoc analyses to assess whether any significant DNA methylation sites observed have been previously identified as being under genetic regulation. Additionally, it is possible that mQTLs could be tissue-, age-, or environment-specific, and future work to map out this information will be required to ensure differential methylation findings are not substantially influenced by genetic factors.

6.5.3 Association studies

Another limitation of the work reported here is that the majority of the analyses were exploratory in nature, and we can only speculate on the potential mechanistic pathways that may underlie the associations observed. Exploratory studies are essential for generating targets for experimental studies that focus on specific loci or molecular pathways. It will be of most interest to build on the work described here with a mechanistic approach; for example, to assess the functionality of the differential DNA methylation changes identified in association with weight loss in Chapter 4 and whether there is a consequent difference in gene expression. Another promising avenue to build on these findings is the use of molecular techniques such as CRISPR to alter the methylation status of targeted age-associated loci in a mammalian model, to assess potential functional relevance in terms of whether these alterations slow the aging process.

With regard to investigating the role of DNA methylation in aging, the results presented here and throughout the literature only support an association. It is possible that DNA methylation may only be a biomarker of other functional cellular changes related to aging.
This may be likely, as it has been proposed that DNA methylation is an after-the-fact mark of other molecular aging processes (Leenen, Muller, & Turner, 2016). Mechanistic studies, such as cell culture experiments, will be necessary to fully evaluate the function of this epigenetic mark in the process of aging.

6.6 Future directions
6.6.1 Development versus aging
Aging has been classified as a disease or at least a risk factor for several late-onset diseases, and with that comes a negative mindset surrounding the idea of aging. In contrast, early life development is most often viewed as a positive stage of growth and maturity. An important, and perhaps under-recognized, feature of the life course is when developmental processes switch from positive growth and maturity into dysfunction and decline. However, the mechanistic underpinnings of this window are not well explored. The research presented throughout this dissertation highlights not only the common nature of aging across the life course but also the unique relationships between aging and DNA methylation during specific periods of life. For instance, there is a significant difference in the relationship between DNA methylation and chronological age in early life compared to the patterns seen across adulthood. This observation may reveal another layer of molecular differences between early and late life, where future work will be required to define both how DNA methylation is related to each of these life stages and the functional relevance of these methylome signatures over time.

6.6.2 Investigations of the aging DNA methylome
It is well known that DNA methylation exhibits strong correlations with age, but possible functional relationships have yet to be unravelled. Since genetic sequence accounts for less than 30% of variation in lifespan, the major driver of human longevity must be attributed to non-genetic factors such as diet, physical activity, smoking or other exposures (K. 
Christensen, Johnson, & Vaupel, 2006; Gebel et al., 2015). Alterations in DNA methylation may occur as a result of the combination of programmed changes in cell type or function with age, the biological embedding of lifelong environmental exposures, and possible stochastic events over time. Given the combination of evidence that DNA methylation is involved in the embedding of previous environments and the development of disease and that the human methylome is age-sensitive, it seems possible that DNA methylation may have a functional role in aging; however, this is entirely speculative at this stage. One emerging possibility is that these age-related epigenetic changes contribute, in part, to the molecular pathway responsible for the decline in immune function observed with age (Goronzy, Li, & Weyand, 2010). Over the past years, work has focused on determining the relative contribution of specific environments and stochastic changes in altering DNA methylation patterns during aging. Research into RNA and DNA patterns at the single cell level may provide further information to at least partially answer these questions.
Other areas of epigenetic research that have yet to be fully characterized in the context of aging are histone modifications, non-coding RNAs, and transcription factors. Certain histone modifications have been associated with age; for example, trimethylation of histone 3 lysine 4 (H3K4) and H4K20 is positively associated with age, whereas age is negatively associated with H3K27 trimethylation and methylation of H3K9 (Han & Brunet, 2012). Furthermore, short non-coding RNAs, such as microRNAs, have been observed to regulate lifespan across species including yeast, nematodes, fruit flies, mice, and humans (Smith-Vikos & Slack, 2012). 
As interesting as each of these independent associations is, studies looking at how these marks or signatures are linked together will ultimately reveal important conclusions, allowing us to better characterize the epigenome across the life course.

6.7 Potential applications
6.7.1 Anti-aging
DNA methylation is an attractive age-related molecular mark as it is reversible, and therefore the idea of reversing or slowing down the aging process through manipulating this epigenetic modification has gained considerable interest. As mentioned above, utilizing technologies such as CRISPR to intentionally methylate or de-methylate specific loci may ultimately uncover features of how DNA methylation plays a role in different regulatory processes. Future prospects are particularly exciting in the research area of age-related DNA methylation because, if DNA methylation does indeed play a functional role in aging, there may be potential to slow down this biological process. However, the most impactful implementation of this research would likely be to apply interventions not to extend the human lifespan but to increase the number of healthy years spent in old age.

6.7.2 Forensics
DNA methylation is emerging as an intriguing biological mark for potential forensic applications as it is stable and relatively straightforward in terms of measurement and sample inputs and can be highly informative in a variety of contexts. For example, as mentioned previously, DNA methylation is tissue-specific and therefore can be used in determining the origin of crime scene biological samples (H. Y. Lee, Jung, Lee, Yang, & Shin, 2016). Furthermore, the application of DNA methylation for age prediction is very exciting and has recently been viewed as a potential tool for forensic purposes (Vidaki & Kayser, 2017; Vidaki et al., 2017). 
Although the pan-tissue epigenetic clock has remarkable accuracy in predicting age, the number of CpG sites (353) poses a challenge for forensic use, as there are no current technologies available to quantify this many sites across the genome in a straightforward and cost-effective manner. Additionally, the epigenetic clock has been characterized as measuring biological age instead of chronological age and, in that light, this particular predictor would not be useful in a criminal investigation. However, research in this area is still in the early stages, and continued focus on improving accuracy, reproducibility, and feasibility may fully unravel the utility of this biological mark in a forensic capacity.

References
Alisch, R. S., Barwick, B. G., Chopra, P., Myrick, L. K., Satten, G. A., Conneely, K. N., & Warren, S. T. (2012). Age-associated DNA methylation in pediatric populations. Genome Research, 22(4), 623–632. http://doi.org/10.1101/gr.125187.111
Anderson, R. M., & Weindruch, R. (2010). Metabolic reprogramming, caloric restriction and aging. Trends in Endocrinology and Metabolism: TEM, 21(3), 134–141. http://doi.org/10.1016/j.tem.2009.11.005
Archer, T. (2015). Epigenetic Changes Induced by Exercise. Journal of Reward Deficiency Syndrome, 1(2). http://doi.org/10.17756/jrds.2015-011
Aronica, L., Levine, A. J., Brennan, K., Mi, J., Gardner, C., Haile, R. W., & Hitchins, M. P. (2017). A systematic review of studies of DNA methylation in the context of a weight loss intervention. Epigenomics, 9(5), 769–787. http://doi.org/10.2217/epi-2016-0182
Aryee, M. J., Jaffe, A. E., Corrada-Bravo, H., Ladd-Acosta, C., Feinberg, A. P., Hansen, K. D., & Irizarry, R. A. (2014). Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics (Oxford, England), 30(10), 1363–1369. http://doi.org/10.1093/bioinformatics/btu049
Ashe, M. C., Winters, M., Hoppmann, C. A., Dawes, M. G., Gardiner, P. 
A., Giangregorio, L. M., et al. (2015). “Not just another walking program”: Everyday Activity Supports You (EASY) model—a randomized pilot study for a parallel randomized controlled trial. Pilot and Feasibility Studies, 1(1), 4. http://doi.org/10.1186/2055-5784-1-4
Banovich, N. E., Lan, X., McVicker, G., van de Geijn, B., Degner, J. F., Blischak, J. D., et al. (2014). Methylation QTLs are associated with coordinated changes in transcription factor binding, histone modifications, and gene expression levels. PLoS Genetics, 10(9), e1004663. http://doi.org/10.1371/journal.pgen.1004663
Barlow, P. A., Otahal, P., Schultz, M. G., Shing, C. M., & Sharman, J. E. (2014). Low exercise blood pressure and risk of cardiovascular events and all-cause mortality: systematic review and meta-analysis. Atherosclerosis, 237(1), 13–22. http://doi.org/10.1016/j.atherosclerosis.2014.08.029
Beekman, M., Blanché, H., Perola, M., Hervonen, A., Bezrukov, V., Sikora, E., et al. (2013). Genome-wide linkage analysis for human longevity: Genetics of Healthy Aging Study. Aging Cell, 12(2), 184–193. http://doi.org/10.1111/acel.12039
Belsky, D. W., Moffitt, T. E., Cohen, A. A., Corcoran, D. L., Levine, M. E., Prinz, J. A., et al. (2017). Eleven Telomere, Epigenetic Clock, and Biomarker-Composite Quantifications of Biological Aging: Do They Measure the Same Thing? American Journal of Epidemiology. http://doi.org/10.1093/aje/kwx346
Benton, M. C., Johnstone, A., Eccles, D., Harmon, B., Hayes, M. T., Lea, R. A., et al. (2015). An analysis of DNA methylation in human adipose tissue reveals differential modification of obesity genes before and after gastric bypass and weight loss. Genome Biology, 16(1), 8. http://doi.org/10.1186/s13059-014-0569-x
Berko, E. R., Suzuki, M., Beren, F., Lemetre, C., Alaimo, C. M., Calder, R. B., et al. (2014). Mosaic epigenetic dysregulation of ectodermal cells in autism spectrum disorder. PLoS Genetics, 10(5), e1004402. http://doi.org/10.1371/journal.pgen.1004402
Bestor, T. 
H. (2000). The DNA methyltransferases of mammals. Human Molecular Genetics, 9(16), 2395–2402.
Bibikova, M., Barnes, B., Tsan, C., Ho, V., Klotzle, B., Le, J. M., et al. (2011). High density DNA methylation array with single CpG site resolution. Genomics, 98(4), 288–295. http://doi.org/10.1016/j.ygeno.2011.07.007
Bibikova, M., Le, J., Barnes, B., Saedinia-Melnyk, S., Zhou, L., Shen, R., & Gunderson, K. L. (2009). Genome-wide DNA methylation profiling using Infinium® assay. Epigenomics, 1(1), 177–200. http://doi.org/10.2217/epi.09.14
Binder, A. M., Corvalan, C., Mericq, V., Pereira, A., Santos, J. L., Horvath, S., et al. (2018). Faster ticking rate of the epigenetic clock is associated with faster pubertal development in girls. Epigenetics: Official Journal of the DNA Methylation Society, 13(1), 85–94. http://doi.org/10.1080/15592294.2017.1414127
Bird, A. (2002). DNA methylation patterns and epigenetic memory. Genes & Development, 16(1), 6–21. http://doi.org/10.1101/gad.947102
Bird, A. (2007). Perceptions of epigenetics. Nature, 447(7143), 396–398. http://doi.org/10.1038/nature05913
Blackburn, E. H., Greider, C. W., & Szostak, J. W. (2006). Telomeres and telomerase: the path from maize, Tetrahymena and yeast to human cancer and aging. Nature Medicine, 12(10), 1133–1138. http://doi.org/10.1038/nm1006-1133
Bocklandt, S., Lin, W., Sehl, M. E., Sánchez, F. J., Sinsheimer, J. S., Horvath, S., & Vilain, E. (2011). Epigenetic predictor of age. PloS One, 6(6), e14821. http://doi.org/10.1371/journal.pone.0014821
Bogin, B., & Varela-Silva, M. I. (2010). Leg length, body proportion, and health: a review with a note on beauty. International Journal of Environmental Research and Public Health, 7(3), 1047–1075. http://doi.org/10.3390/ijerph7031047
Bohlin, J., Håberg, S. E., Magnus, P., Reese, S. E., Gjessing, H. K., Magnus, M. C., et al. (2016). Prediction of gestational age based on genome-wide differentially methylated regions. Genome Biology, 17(1), 207. 
http://doi.org/10.1186/s13059-016-1063-4
Boks, M. P., van Mierlo, H. C., Rutten, B. P. F., Radstake, T. R. D. J., De Witte, L., Geuze, E., et al. (2015). Longitudinal changes of telomere length and epigenetic age related to traumatic stress and post-traumatic stress disorder. Psychoneuroendocrinology, 51, 506–512. http://doi.org/10.1016/j.psyneuen.2014.07.011
Bollati, V., & Baccarelli, A. (2010). Environmental epigenetics. Heredity, 105(1), 105–112. http://doi.org/10.1038/hdy.2010.2
Bollati, V., Schwartz, J., Wright, R., Litonjua, A., Tarantini, L., Suh, H., et al. (2009). Decline in genomic DNA methylation through aging in a cohort of elderly subjects. Mechanisms of Ageing and Development, 130(4), 234–239. http://doi.org/10.1016/j.mad.2008.12.003
Bonder, M.-J., Luijk, R., Zhernakova, D. V., Moed, M., Deelen, P., Vermaat, M., et al. (2017). Disease variants alter transcription factor levels and methylation of their binding sites. Nature Genetics, 49(1), 131–138. http://doi.org/10.1038/ng.3721
Boyce, W. T., & Kobor, M. S. (2015). Development and the epigenome: the “synapse” of gene-environment interplay. Developmental Science, 18(1), 1–23. http://doi.org/10.1111/desc.12282
Boyce, W. T., Den Besten, P. K., Stamperdahl, J., Zhan, L., Jiang, Y., Adler, N. E., & Featherstone, J. D. (2010). Social inequalities in childhood dental caries: the convergent roles of stress, bacteria and disadvantage. Social Science & Medicine (1982), 71(9), 1644–1652. http://doi.org/10.1016/j.socscimed.2010.07.045
Breitling, L. P., Saum, K.-U., Perna, L., Schöttker, B., Holleczek, B., & Brenner, H. (2016). Frailty is associated with the epigenetic clock but not with telomere length in a German cohort. Clinical Epigenetics, 8(1), 21. http://doi.org/10.1186/s13148-016-0186-5
Brooks-Wilson, A. R. (2013). Genetics of healthy aging and longevity. Human Genetics, 132(12), 1323–1338. http://doi.org/10.1007/s00439-013-1342-z
Brown, W. M. (2015). 
Exercise-associated DNA methylation change in skeletal muscle and the importance of imprinted genes: a bioinformatics meta-analysis. British Journal of Sports Medicine, 49(24), 1567–1578. http://doi.org/10.1136/bjsports-2014-094073
Bröske, A.-M., Vockentanz, L., Kharazi, S., Huska, M. R., Mancini, E., Scheller, M., et al. (2009). DNA methylation protects hematopoietic stem cell multipotency from myeloerythroid restriction. Nature Genetics, 41(11), 1207–1215. http://doi.org/10.1038/ng.463
Buhule, O. D., Minster, R. L., Hawley, N. L., Medvedovic, M., Sun, G., Viali, S., et al. (2014). Stratified randomization controls better for batch effects in 450K methylation analysis: a cautionary tale. Frontiers in Genetics, 5, 354. http://doi.org/10.3389/fgene.2014.00354
Calvanese, V., Fernandez, A. F., Urdinguio, R. G., et al. (2011). A promoter DNA demethylation landscape of human hematopoietic differentiation. Nucleic Acids Research, 40(1), 116–131. http://doi.org/10.1093/nar/gkr685
Campión, J., Milagro, F. I., Goyenechea, E., & Martínez, J. A. (2009). TNF-alpha promoter methylation as a predictive biomarker for weight-loss response. Obesity (Silver Spring, Md.), 17(6), 1293–1297. http://doi.org/10.1038/oby.2008.679
Cazaly, E., Thomson, R., Marthick, J. R., Holloway, A. F., Charlesworth, J., & Dickinson, J. L. (2016). Comparison of pre-processing methodologies for Illumina 450k methylation array data in familial analyses. Clinical Epigenetics, 8(1), 75. http://doi.org/10.1186/s13148-016-0241-2
Challen, G. A., Sun, D., Mayle, A., Jeong, M., Luo, M., Rodriguez, B., et al. (2014). Dnmt3a and Dnmt3b Have Overlapping and Distinct Functions in Hematopoietic Stem Cells. Cell Stem Cell, 15(3), 350–364. http://doi.org/10.1016/j.stem.2014.06.018
Chambers, J. C., Loh, M., Lehne, B., Drong, A., Kriebel, J., Motta, V., et al. (2015). 
Epigenome-wide association of DNA methylation markers in peripheral blood from Indian Asians and Europeans with incident type 2 diabetes: a nested case-control study. The Lancet. Diabetes & Endocrinology, 3(7), 526–534. http://doi.org/10.1016/S2213-8587(15)00127-8
Chen, B. H., Marioni, R. E., Colicino, E., Peters, M. J., Ward-Caviness, C. K., Tsai, P.-C., et al. (2016). DNA methylation-based measures of biological age: meta-analysis predicting time to death. Aging, 8(9), 1844–1865. http://doi.org/10.18632/aging.101020
Chen, Y.-A., Lemire, M., Choufani, S., Butcher, D. T., Grafodatskaya, D., Zanke, B. W., et al. (2013). Discovery of cross-reactive probes and polymorphic CpGs in the Illumina Infinium HumanMethylation450 microarray. Epigenetics: Official Journal of the DNA Methylation Society, 8(2), 203–209. http://doi.org/10.4161/epi.23470
Christensen, B. C., Houseman, E. A., Marsit, C. J., Zheng, S., Wrensch, M. R., Wiemels, J. L., et al. (2009). Aging and environmental exposures alter tissue-specific DNA methylation dependent upon CpG island context. PLoS Genetics, 5(8), e1000602. http://doi.org/10.1371/journal.pgen.1000602
Christensen, K., Johnson, T. E., & Vaupel, J. W. (2006). The quest for genetic determinants of human longevity: challenges and insights. Nature Reviews. Genetics, 7(6), 436–448. http://doi.org/10.1038/nrg1871
Christiansen, L., Lenart, A., Tan, Q., Vaupel, J. W., Aviv, A., McGue, M., & Christensen, K. (2015). DNA methylation age is associated with mortality in a longitudinal Danish twin study. Aging Cell, n/a–n/a. http://doi.org/10.1111/acel.12421
Chudley, A. E., Conry, J., Cook, J. L., Loock, C., Rosales, T., LeBlanc, N., Public Health Agency of Canada's National Advisory Committee on Fetal Alcohol Spectrum Disorder. (2005, March 1). Fetal alcohol spectrum disorder: Canadian guidelines for diagnosis. CMAJ: Canadian Medical Association Journal = Journal De l'Association Medicale Canadienne. Canadian Medical Association. 
http://doi.org/10.1503/cmaj.1040302
Clifford, R. L., Jones, M. J., Macisaac, J. L., McEwen, L. M., Goodman, S. J., Mostafavi, S., et al. (2016). Inhalation of diesel exhaust and allergen alters human bronchial epithelium DNA methylation. The Journal of Allergy and Clinical Immunology. http://doi.org/10.1016/j.jaci.2016.03.046
Cole, J. H., Ritchie, S. J., Bastin, M. E., Valdés Hernández, M. C., Muñoz Maniega, S., Royle, N., et al. (2018). Brain age predicts mortality. Molecular Psychiatry, 23(5), 1385–1392. http://doi.org/10.1038/mp.2017.62
Colman, R. J., Anderson, R. M., Johnson, S. C., Kastman, E. K., Kosmatka, K. J., Beasley, T. M., et al. (2009). Caloric restriction delays disease onset and mortality in rhesus monkeys. Science (New York, N.Y.), 325(5937), 201–204. http://doi.org/10.1126/science.1173635
Colman, R. J., Beasley, T. M., Kemnitz, J. W., Johnson, S. C., Weindruch, R., & Anderson, R. M. (2014). Caloric restriction reduces age-related and all-cause mortality in rhesus monkeys. Nature Communications, 5, 3557. http://doi.org/10.1038/ncomms4557
Cotton, A. M., Price, E. M., Jones, M. J., Balaton, B. P., Kobor, M. S., & Brown, C. J. (2015). Landscape of DNA methylation on the X chromosome reflects CpG density, functional chromatin state and X-chromosome inactivation. Human Molecular Genetics, 24(6), 1528–1539. http://doi.org/10.1093/hmg/ddu564
Córdova-Palomera, A., Fatjó-Vilas, M., Gastó, C., Navarro, V., Krebs, M.-O., & Fañanás, L. (2015). Genome-wide methylation study on depression: differential methylation and variable methylation in monozygotic twins. Translational Psychiatry, 5(4), e557. http://doi.org/10.1038/tp.2015.49
Davies, M. N., Volta, M., Pidsley, R., Lunnon, K., Dixit, A., Lovestone, S., et al. (2012). Functional annotation of the human brain methylome identifies tissue-specific epigenetic variation across brain and blood. Genome Biology, 13(6), R43. 
http://doi.org/10.1186/gb-2012-13-6-r43
Davis, S., Du, P., Bilke, S., Triche, T., & Bootwalla, M. (2014). methylumi: Handle Illumina methylation data. R package version.
De Paoli-Iseppi, R., Deagle, B. E., McMahon, C. R., Hindell, M. A., Dickinson, J. L., & Jarman, S. N. (2017). Measuring Animal Age with DNA Methylation: From Humans to Wild Animals. Frontiers in Genetics, 8, 106. http://doi.org/10.3389/fgene.2017.00106
Dedeurwaerder, S., Defrance, M., Bizet, M., Calonne, E., Bontempi, G., & Fuks, F. (2014). A comprehensive overview of Infinium HumanMethylation450 data processing. Briefings in Bioinformatics, 15(6), 929–941. http://doi.org/10.1093/bib/bbt054
Defossez, P.-A., & Stancheva, I. (2011). Biological functions of methyl-CpG-binding proteins. Progress in Molecular Biology and Translational Science, 101, 377–398. http://doi.org/10.1016/B978-0-12-387685-0.00012-3
Demerath, E. W., Guan, W., Grove, M. L., Aslibekyan, S., Mendelson, M., Zhou, Y.-H., et al. (2015). Epigenome-wide association study (EWAS) of BMI, BMI change and waist circumference in African American adults identifies multiple replicated loci. Human Molecular Genetics, 24(15), 4464–4479. http://doi.org/10.1093/hmg/ddv161
Dick, K. J., Nelson, C. P., Tsaprouni, L., Sandling, J. K., Aïssi, D., Wahl, S., et al. (2014). DNA methylation and body-mass index: a genome-wide analysis. The Lancet, 383(9933), 1990–1998. http://doi.org/10.1016/S0140-6736(13)62674-4
Diez Roux, A. V. (2001). Investigating Neighborhood and Area Effects on Health. American Journal of Public Health, 91(11), 1783–1789. http://doi.org/10.2105/AJPH.91.11.1783
Du, P., Zhang, X., Huang, C.-C., Jafari, N., Kibbe, W. A., Hou, L., & Lin, S. M. (2010). Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis. BMC Bioinformatics, 11(1), 587. http://doi.org/10.1186/1471-2105-11-587
Duncan, B. K., & Miller, J. H. (1980). Mutagenic deamination of cytosine residues in DNA. 
Nature, 287(5782), 560–561. http://doi.org/10.1038/287560a0
Edgar, R. D., Jones, M. J., Robinson, W. P., & Kobor, M. S. (2017). An empirically driven data reduction method on the human 450K methylation array to remove tissue specific non-variable CpGs. Clinical Epigenetics, 9(1), 11. http://doi.org/10.1186/s13148-017-0320-z
Edgar, R., Tan, P. P. C., Portales-Casamar, E., & Pavlidis, P. (2014). Meta-analysis of human methylomes reveals stably methylated sequences surrounding CpG islands associated with high gene expression. Epigenetics & Chromatin, 7(1), 28. http://doi.org/10.1186/1756-8935-7-28
Eipel, M., Mayer, F., Arent, T., Ferreira, M. R. P., Birkhofer, C., Gerstenmaier, U., et al. (2016). Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures. Aging, 8(5), 1034–1048. http://doi.org/10.18632/aging.100972
Esposito, E. A., Jones, M. J., Doom, J. R., Macisaac, J. L., Gunnar, M. R., & Kobor, M. S. (2016). Differential DNA methylation in peripheral blood mononuclear cells in adolescents exposed to significant early but not later childhood adversity. Development and Psychopathology, 1–15. http://doi.org/10.1017/S0954579416000055
Essex, M. J., Boyce, W. T., Hertzman, C., Lam, L. L., Armstrong, J. M., Neumann, S. M. A., & Kobor, M. S. (2013). Epigenetic vestiges of early developmental adversity: childhood stress exposure and DNA methylation in adolescence. Child Development, 84(1), 58–75. http://doi.org/10.1111/j.1467-8624.2011.01641.x
Fagnoni, F. F., Vescovini, R., Mazzola, M., Bologna, G., Nigro, E., Lavagetto, G., et al. (1996). Expansion of cytotoxic CD8+ CD28- T cells in healthy ageing people, including centenarians. Immunology, 88(4), 501–507.
Fagnoni, F. F., Vescovini, R., Passeri, G., Bologna, G., Pedrazzoni, M., Lavagetto, G., et al. (2000). Shortage of circulating naive CD8+ T cells provides new insights on immunodeficiency in aging. Blood, 95(9), 2860–2868. 
http://doi.org/10.1016/0167-5699(95)80064-6
Farré, P., Jones, M. J., Meaney, M. J., Emberly, E., Turecki, G., & Kobor, M. S. (2015). Concordant and discordant DNA methylation signatures of aging in human blood and brain. Epigenetics & Chromatin, 8(1), 19. http://doi.org/10.1186/s13072-015-0011-y
Feil, R., & Fraga, M. F. (2012). Epigenetics and the environment: emerging patterns and implications. Nature Reviews. Genetics, 13(2), 97–109. http://doi.org/10.1038/nrg3142
Fleischer, T., Tekpli, X., Mathelier, A., Wang, S., Nebdal, D., Dhakal, H. P., et al. (2017). DNA methylation at enhancers identifies distinct breast cancer lineages. Nature Communications, 8(1), 1379. http://doi.org/10.1038/s41467-017-00510-x
Ford, E. E., Grimmer, M. R., Stolzenburg, S., Bogdanovic, O., de Mendoza, A., Farnham, P. J., et al. (2017). Frequent lack of repressive capacity of promoter DNA methylation identified through genome-wide epigenomic manipulation. bioRxiv, 170506. http://doi.org/10.1101/170506
Forest, M., O'Donnell, K. J., Voisin, G., Gaudreau, H., Macisaac, J. L., McEwen, L. M., et al. (2018). Agreement in DNA methylation levels from the Illumina 450K array across batches, tissues, and time. Epigenetics: Official Journal of the DNA Methylation Society, 1–14. http://doi.org/10.1080/15592294.2017.1411443
Fortin, J. P., Triche, T. J., Jr, & Hansen, K. D. (2016). Preprocessing, normalization and integration of the Illumina HumanMethylationEPIC array with minfi. Bioinformatics (Oxford, England).
Fraga, M. F., Ballestar, E., Paz, M. F., Ropero, S., Setien, F., Ballestar, M. L., et al. (2005). Epigenetic differences arise during the lifetime of monozygotic twins. Proceedings of the National Academy of Sciences, 102(30), 10604–10609. http://doi.org/10.1073/pnas.0500398102
Franke, K., Ziegler, G., Klöppel, S., Gaser, C., Alzheimer's Disease Neuroimaging Initiative. (2010). 
Estimating the age of healthy subjects from T1-weighted MRI scans using kernel methods: exploring the influence of various parameters. NeuroImage, 50(3), 883–892. http://doi.org/10.1016/j.neuroimage.2010.01.005
Fraser, H. B., Lam, L. L., Neumann, S. M., & Kobor, M. S. (2012). Population-specificity of human DNA methylation. Genome Biology, 13(2), R8. http://doi.org/10.1186/gb-2012-13-2-r8
Fung, T. T., Rexrode, K. M., Mantzoros, C. S., Manson, J. E., Willett, W. C., & Hu, F. B. (2009). Mediterranean diet and incidence of and mortality from coronary heart disease and stroke in women. Circulation, 119(8), 1093–1100. http://doi.org/10.1161/CIRCULATIONAHA.108.816736
Gamazon, E. R., Badner, J. A., Cheng, L., Zhang, C., Zhang, D., Cox, N. J., et al. (2013). Enrichment of cis-regulatory gene expression SNPs and methylation quantitative trait loci among bipolar disorder susceptibility variants. Molecular Psychiatry, 18(3), 340–346. http://doi.org/10.1038/mp.2011.174
Gao, X., Thomsen, H., Zhang, Y., Breitling, L. P., & Brenner, H. (2017). The impact of methylation quantitative trait loci (mQTLs) on active smoking-related DNA methylation changes. Clinical Epigenetics, 9(1), 87. http://doi.org/10.1186/s13148-017-0387-6
Garagnani, P., Bacalini, M. G., Pirazzini, C., Gori, D., Giuliani, C., Mari, D., et al. (2012). Methylation of ELOVL2 gene as a new epigenetic marker of age. Aging Cell, 11(6), 1132–1134. http://doi.org/10.1111/acel.12005
Garcia-Valles, R., Gomez-Cabrera, M. C., Rodriguez-Mañas, L., Garcia-Garcia, F. J., Diaz, A., Noguera, I., et al. (2013). Life-long spontaneous exercise does not prolong lifespan but improves health span in mice. Longevity & Healthspan, 2(1), 14. http://doi.org/10.1186/2046-2395-2-14
Garten, A., Schuster, S., Penke, M., Gorski, T., de Giorgis, T., & Kiess, W. (2015). Physiological and pathophysiological roles of NAMPT and NAD metabolism. Nature Reviews. Endocrinology, 11(9), 535–546. http://doi.org/10.1038/nrendo.2015.117
Gaunt, T. 
R., Shihab, H. A., Hemani, G., Min, J. L., Woodward, G., Lyttleton, O., et al. (2016). Systematic identification of genetic influences on methylation across the human life course. Genome Biology, 17(1), 61. http://doi.org/10.1186/s13059-016-0926-z
Gebel, K., Ding, D., Chey, T., Stamatakis, E., Brown, W. J., & Bauman, A. E. (2015). Effect of Moderate to Vigorous Physical Activity on All-Cause Mortality in Middle-aged and Older Australians. JAMA Internal Medicine, 175(6), 970–977. http://doi.org/10.1001/jamainternmed.2015.0541
Gellert, C., Schöttker, B., & Brenner, H. (2012). Smoking and all-cause mortality in older people: systematic review and meta-analysis. Archives of Internal Medicine, 172(11), 837–844. http://doi.org/10.1001/archinternmed.2012.1397
Goldman, J., Becker, M. L., Jones, B., Clements, M., & Leeder, J. S. (2011). Development of biomarkers to optimize pediatric patient management: what makes children different? Biomarkers in Medicine, 5(6), 781–794. http://doi.org/10.2217/bmm.11.96
Goronzy, J. J., Li, G., & Weyand, C. M. (2010). DNA Methylation, Age-Related Immune Defects, and Autoimmunity. In Epigenetics of Aging (pp. 327–344). New York, NY: Springer New York. http://doi.org/10.1007/978-1-4419-0639-7_18
Guerrero-Preston, R., Goldman, L. R., Brebi-Mieville, P., Ili-Gangas, C., Lebron, C., Witter, F. R., et al. (2010). Global DNA hypomethylation is associated with in utero exposure to cotinine and perfluorinated alkyl compounds. Epigenetics: Official Journal of the DNA Methylation Society, 5(6), 539–546. http://doi.org/10.4161/epi.5.6.12378
Gui, J., Mustachio, L. M., Su, D.-M., & Craig, R. W. (2012). Thymus Size and Age-related Thymic Involution: Early Programming, Sexual Dimorphism, Progenitors and Stroma. Aging and Disease, 3(3), 280–290.
Guintivano, J., Aryee, M. J., & Kaminsky, Z. A. (2013). A cell epigenotype specific model for the correction of brain cellular heterogeneity bias and its application to age, brain region and major depression. 
Epigenetics: Official Journal of the DNA Methylation Society, 8(3), 290–302. http://doi.org/10.4161/epi.23924
Gutierrez Arcelus, M., Lappalainen, T., Montgomery, S. B., Buil, A., Ongen, H., Yurovsky, A., et al. (2013). Passive and active DNA methylation and the interplay with genetic variation in gene regulation. eLife, 2, 56. http://doi.org/10.7554/eLife.00523
Gutierrez Arcelus, M., Ongen, H., Lappalainen, T., Montgomery, S. B., Buil, A., Yurovsky, A., et al. (2015). Tissue-specific effects of genetic and epigenetic variation on gene regulation and splicing. PLoS Genetics, 11(1), e1004958. http://doi.org/10.1371/journal.pgen.1004958
Han, S., & Brunet, A. (2012). Histone methylation makes its mark on longevity. Trends in Cell Biology, 22(1), 42–49. http://doi.org/10.1016/j.tcb.2011.11.001
Hannum, G., Guinney, J., Zhao, L., Zhang, L., Hughes, G., Sadda, S., et al. (2013). Genome-wide methylation profiles reveal quantitative views of human aging rates. Molecular Cell, 49(2), 359–367. http://doi.org/10.1016/j.molcel.2012.10.016
Harkess, K. N., Ryan, J., Delfabbro, P. H., & Cohen-Woods, S. (2016). Preliminary indications of the effect of a brief yoga intervention on markers of inflammation and DNA methylation in chronically stressed women. Translational Psychiatry, 6(11), e965–e965. http://doi.org/10.1038/tp.2016.234
Hatchwell, E., & Greally, J. M. (2007). The potential role of epigenomic dysregulation in complex human disease. Trends in Genetics: TIG, 23(11), 588–595. http://doi.org/10.1016/j.tig.2007.08.010
Heijmans, B. T., Tobi, E. W., Stein, A. D., Putter, H., Blauw, G. J., Susser, E. S., et al. (2008). Persistent epigenetic differences associated with prenatal exposure to famine in humans. Proceedings of the National Academy of Sciences of the United States of America, 105(44), 17046–17049. http://doi.org/10.1073/pnas.0806560105
Heyn, H., Li, N., Ferreira, H. J., Moran, S., Pisano, D. G., Gomez, A., et al. (2012). 
Distinct DNA methylomes of newborns and centenarians. Proceedings of the National Academy of Sciences of the United States of America, 109(26), 10522–10527. http://doi.org/10.1073/pnas.1120658109
Heyn, H., Sayols, S., Moutinho, C., Vidal, E., Sanchez-Mut, J. V., Stefansson, O. A., et al. (2014). Linkage of DNA methylation quantitative trait loci to human cancer risk. Cell Reports, 7(2), 331–338. http://doi.org/10.1016/j.celrep.2014.03.016
Hicks, S. C., & Irizarry, R. A. (2015). quantro: a data-driven approach to guide the choice of an appropriate normalization method. Genome Biology, 16(1), 117. http://doi.org/10.1186/s13059-015-0679-0
Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2017). Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference. Political Analysis, 15(03), 199–236. http://doi.org/10.1093/pan/mpl013
Hoal-van Helden, E. G., & van Helden, P. D. (1989). Age-related methylation changes in DNA may reflect the proliferative potential of organs. Mutation Research/DNAging, 219(5-6), 263–266. http://doi.org/10.1016/0921-8734(89)90027-1
Hochberg, Y., & Benjamini, Y. (1990). More powerful procedures for multiple significance testing. Statistics in Medicine, 9(7), 811–818.
Horvath, S. (2013). DNA methylation age of human tissues and cell types. Genome Biology, 14(10), R115. http://doi.org/10.1186/gb-2013-14-10-r115
Horvath, S., & Levine, A. J. (2015). HIV-1 Infection Accelerates Age According to the Epigenetic Clock. The Journal of Infectious Diseases, 212(10), 1563–1573. http://doi.org/10.1093/infdis/jiv277
Horvath, S., & Raj, K. (2018). DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nature Reviews. Genetics, 23, 223. http://doi.org/10.1038/s41576-018-0004-3
Horvath, S., & Ritz, B. R. (2015). Increased epigenetic age and granulocyte counts in the blood of Parkinson's disease patients. Aging.
Horvath, S., Erhart, W., Brosch, M., Ammerpohl, O., von Schönfels, W., Ahrens, M., et al. 
(2014). Obesity accelerates epigenetic aging of human liver. Proceedings of the National Academy of Sciences of the United States of America, 111(43), 15538–15543. http://doi.org/10.1073/pnas.1412759111 Horvath, S., Garagnani, P., Bacalini, M. G., Pirazzini, C., Salvioli, S., Gentilini, D., et al. (2015a). Accelerated epigenetic aging in Down syndrome. Aging Cell, 14(3), 491–495. http://doi.org/10.1111/acel.12325 Horvath, S., Gurven, M., Levine, M. E., Trumble, B. C., Kaplan, H., Allayee, H., et al. (2016). An epigenetic clock analysis of race/ethnicity, sex, and coronary heart disease. Genome Biology, 17(1), 171. http://doi.org/10.1186/s13059-016-1030-0 Horvath, S., Pirazzini, C., Bacalini, M. G., Gentilini, D., Di Blasio, A. M., Delledonne, M., et al. (2015b). Decreased epigenetic age of PBMCs from Italian semi-supercentenarians and their offspring. Aging. Horvath, S., Zhang, Y., Langfelder, P., Kahn, R. S., Boks, M. P. M., van Eijk, K., et al. (2012). Aging effects on DNA methylation modules in human brain and blood tissue. Genome Biology, 13(10), R97. http://doi.org/10.1186/gb-2012-13-10-r97 Houseman, E., Accomando, W. P., Koestler, D. C., Christensen, B. C., Marsit, C. J., Nelson, H. H., et al. (2012). DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics, 13(1), 86. http://doi.org/10.1186/1471-2105-13-86 Houtepen, L. C., Vinkers, C. H., Carrillo-Roa, T., Hiemstra, M., van Lier, P. A., Meeus, W., et al. (2016). Genome-wide DNA methylation levels and altered cortisol stress reactivity following childhood trauma in humans. Nature Communications, 7, 10967. http://doi.org/10.1038/ncomms10967 Hussey, B., Lindley, M. R., & Mastana, S. (2017). Epigenetics and epigenomics: the future of nutritional interventions? Future Science OA, 3(4), FSO237. 
http://doi.org/10.4155/fsoa-2017-0088 Husquin, L.T., Rotival, M., Fagny, M., Quach, H., Zidane, N., McEwen, L.M., MacIsaac, J.L., Kobor, M.S., Aschard, H., Patin, E., Quintana-Murci, L. (2018). Exploring the Genetic Basis of Human Population Differences in DNA Methylation and their Causal Impact on Immune Gene Regulation. BioRxiv 371872. doi: https://doi.org/10.1101/371872.  Illingworth, R. S., & Bird, A. P. (2009). CpG islands--'a rough guide'. FEBS Letters, 583(11), 1713–1720. http://doi.org/10.1016/j.febslet.2009.04.012 Ito, S., Shen, L., Dai, Q., Wu, S. C., Collins, L. B., Swenberg, J. A., et al. (2011). Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science (New York, N.Y.), 333(6047), 1300–1303. http://doi.org/10.1126/science.1210597 Jaffe, A. E., & Irizarry, R. A. (2014). Accounting for cellular heterogeneity is critical in epigenome-wide association studies. Genome Biology, 15(2), R31. http://doi.org/10.1186/gb-2014-15-2-r31 Jiang, R., Jones, M. J., Sava, F., Kobor, M. S., & Carlsten, C. (2014). Short-term diesel exhaust inhalation in a controlled human crossover study is associated with changes in DNA methylation of circulating mononuclear cells in asthmatics. Particle and Fibre Toxicology, 11(1), 71. http://doi.org/10.1186/s12989-014-0071-3 Johansson, A., Enroth, S., & Gyllensten, U. (2013). Continuous Aging of the Human DNA Methylome Throughout the Human Lifespan. PloS One, 8(6), e67378. http://doi.org/10.1371/journal.pone.0067378 Johnson, W. E., Li, C., & Rabinovic, A. (2007). Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics (Oxford, England), 8(1), 118–127. http://doi.org/10.1093/biostatistics/kxj037 Jones, M. J., Fejes, A. P., & Kobor, M. S. (2013). DNA methylation, genotype and gene expression: who is driving and who is along for the ride? Genome Biology, 14(7), 126. http://doi.org/10.1186/gb-2013-14-7-126 Jones, M. J., Goodman, S. J., & Kobor, M. S. (2015a). 
DNA methylation and healthy human aging. Aging Cell, n/a–n/a. http://doi.org/10.1111/acel.12349 Jones, M. J., Islam, S. A., Edgar, R. D., & Kobor, M. S. (2015b). Adjusting for Cell Type Composition in DNA Methylation Data Using a Regression-Based Approach. Methods in Molecular Biology (Clifton, N.J.), (Chapter 262). http://doi.org/10.1007/7651_2015_262 Jones, P. A. (2012). Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nature Reviews. Genetics, 13(7), 484–492. http://doi.org/10.1038/nrg3230 Jones, P. A., & Takai, D. (2001). The role of DNA methylation in mammalian epigenetics. Science (New York, N.Y.), 293(5532), 1068–1070. http://doi.org/10.1126/science.1063852 Joubert, B. R., Håberg, S. E., Nilsen, R. M., Wang, X., Vollset, S. E., Murphy, S. K., et al. (2012). 450K epigenome-wide scan identifies differential DNA methylation in newborns related to maternal smoking during pregnancy. Environmental Health Perspectives, 120(10), 1425–1431. http://doi.org/10.1289/ehp.1205412 Jung, M., & Pfeifer, G. P. (2015). Aging and DNA methylation. BMC Biology, 13(1), 7. http://doi.org/10.1186/s12915-015-0118-4 Jylhävä, J. (2014). Determinants of longevity: genetics, biomarkers and therapeutic approaches. Current Pharmaceutical Design, 20(38), 6058–6070. Jylhävä, J., Pedersen, N. L., & Hägg, S. (2017). Biological Age Predictors. EBioMedicine, 21, 29–36. http://doi.org/10.1016/j.ebiom.2017.03.046 Kaeberlein, M., Rabinovitch, P. S., & Martin, G. M. (2015). Healthy aging: The ultimate preventative medicine. Science (New York, N.Y.), 350(6265), 1191–1193. http://doi.org/10.1126/science.aad3267 Kaplan, B. J., Giesbrecht, G. F., Leung, B. M. Y., Field, C. J., Dewey, D., Bell, R. C., et al. (2012). The Alberta Pregnancy Outcomes and Nutrition (APrON) cohort study: rationale and methods. Maternal & Child Nutrition, 10(1), 44–60. http://doi.org/10.1111/j.1740-8709.2012.00433.x Kato, N., Loh, M., Takeuchi, F., Verweij, N., Wang, X., Zhang, W., et al. (2015). 
Trans-ancestry genome-wide association study identifies 12 genetic loci influencing blood pressure and implicates a role for DNA methylation. Nature Genetics, 47(11), 1282–1293. http://doi.org/10.1038/ng.3405 Kaushal, A., Zhang, H., Karmaus, W. J. J., Ray, M., Torres, M. A., Smith, A. K., & Wang, S.-L. (2017). Comparison of different cell type correction methods for genome-scale epigenetics studies. BMC Bioinformatics, 18(1), 216. http://doi.org/10.1186/s12859-017-1611-2 Klengel, T., Mehta, D., Anacker, C., Rex-Haffner, M., Pruessner, J. C., Pariante, C. M., et al. (2013). Allele-specific FKBP5 DNA demethylation mediates gene-childhood trauma interactions. Nature Neuroscience, 16(1), 33–41. http://doi.org/10.1038/nn.3275 Knight, A. K., Craig, J. M., Theda, C., Bækvad-Hansen, M., Bybjerg-Grauholm, J., Hansen, C. S., et al. (2016). An epigenetic clock for gestational age at birth based on blood methylation data. Genome Biology, 17(1), 206. http://doi.org/10.1186/s13059-016-1068-z Koch, C. M., & Wagner, W. (2011). Epigenetic-aging-signature to determine age in different tissues. Aging, 3(10), 1018–1027. Koestler, D. C., Christensen, B., Karagas, M. R., Marsit, C. J., Langevin, S. M., Kelsey, K. T., et al. (2013). Blood-based profiles of DNA methylation predict the underlying distribution of cell types: a validation analysis. Epigenetics : Official Journal of the DNA Methylation Society, 8(8), 816–826. http://doi.org/10.4161/epi.25430 Lam, L. L., Emberly, E., Fraser, H. B., Neumann, S. M., Chen, E., Miller, G. E., & Kobor, M. S. (2012). Factors underlying variable DNA methylation in a human community cohort. Proceedings of the National Academy of Sciences of the United States of America, 109 Suppl 2(Supplement_2), 17253–17260. http://doi.org/10.1073/pnas.1121249109 Lappalainen, T., & Greally, J. M. (2017). Associating cellular epigenetic models with human phenotypes. Nature Reviews. Genetics, 18(7), 441–451. http://doi.org/10.1038/nrg.2017.32 Lee, H. 
Y., Jung, S.-E., Lee, E. H., Yang, W. I., & Shin, K.-J. (2016). DNA methylation profiling for a confirmatory test for blood, saliva, semen, vaginal fluid and menstrual blood. Forensic Science International. Genetics, 24, 75–82. http://doi.org/10.1016/j.fsigen.2016.06.007 Lee, I. M., & Paffenbarger, R. S. (2000). Associations of light, moderate, and vigorous intensity physical activity with longevity. The Harvard Alumni Health Study. American Journal of Epidemiology, 151(3), 293–299. Leek, J. T., & Storey, J. D. (2007). Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genetics, 3(9), 1724–1735. http://doi.org/10.1371/journal.pgen.0030161 Leenen, F. A. D., Muller, C. P., & Turner, J. D. (2016). DNA methylation: conducting the orchestra from exposure to phenotype? Clinical Epigenetics, 8(1), 92. http://doi.org/10.1186/s13148-016-0256-8 Lemire, M., Zaidi, S. H. E., Ban, M., Ge, B., Aïssi, D., Germain, M., et al. (2015). Long-range epigenetic regulation is conferred by genetic variation located at thousands of independent loci. Nature Communications, 6, 6326. http://doi.org/10.1038/ncomms7326 Lev Maor, G., Yearim, A., & Ast, G. (2015). The alternative role of DNA methylation in splicing regulation. Trends in Genetics : TIG, 31(5), 274–280. http://doi.org/10.1016/j.tig.2015.03.002 Levine, M. E., Hosgood, H. D., Chen, B., Absher, D., Assimes, T., & Horvath, S. (2015a). DNA methylation age of blood predicts future onset of lung cancer in the women's health initiative. Aging, 7(9), 690–700. Levine, M. E., Lu, A. T., Bennett, D. A., & Horvath, S. (2015b). Epigenetic age of the pre-frontal cortex is associated with neuritic plaques, amyloid load, and Alzheimer's disease related cognitive functioning. Aging. Levine, M. E., Lu, A. T., Quach, A., Chen, B. H., Assimes, T. L., Bandinelli, S., et al. (2018). An epigenetic biomarker of aging for lifespan and healthspan. Aging, 10(4), 573–591. 
http://doi.org/10.18632/aging.101414 Li, Q., Seo, J.-H., Stranger, B., McKenna, A., Pe'er, I., Laframboise, T., et al. (2013). Integrative eQTL-based analyses reveal the biology of breast cancer risk loci. Cell, 152(3), 633–641. http://doi.org/10.1016/j.cell.2012.12.034 Lin, Q., Weidner, C. I., Costa, I. G., Marioni, R. E., Ferreira, M. R. P., Deary, I. J., & Wagner, W. (2016). DNA methylation levels at individual age-associated CpG sites can be indicative for life expectancy. Aging, 8(2), 394–401. Lister, R., Pelizzola, M., Dowen, R. H., Hawkins, R. D., Hon, G., Tonti-Filippini, J., et al. (2009). Human DNA methylomes at base resolution show widespread epigenomic differences. Nature, 462(7271), 315–322. http://doi.org/10.1038/nature08514 Lopez-Garcia, E., Rodriguez-Artalejo, F., Li, T. Y., Fung, T. T., Li, S., Willett, W. C., et al. (2014). The Mediterranean-style dietary pattern and mortality among men and women with cardiovascular disease. The American Journal of Clinical Nutrition, 99(1), 172–180. http://doi.org/10.3945/ajcn.113.068106 Lussier, A. A., Morin, A. M., Macisaac, J. L., Salmon, J., Weinberg, J., Reynolds, J. N., et al. (2018). DNA methylation as a predictor of fetal alcohol spectrum disorder. Clinical Epigenetics, 10(1), 5. http://doi.org/10.1186/s13148-018-0439-6 Ma, Y., Smith, C. E., Lai, C.-Q., Irvin, M. R., Parnell, L. D., Lee, Y.-C., et al. (2015). Genetic variants modify the effect of age on APOE methylation in the Genetics of Lipid Lowering Drugs and Diet Network study. Aging Cell, 14(1), 49–59. http://doi.org/10.1111/acel.12293 Maierhofer, A., Flunkert, J., Oshima, J., Martin, G. M., Haaf, T., & Horvath, S. (2017). Accelerated epigenetic aging in Werner syndrome. Aging. http://doi.org/10.18632/aging.101217 Maksimovic, J., Gordon, L., & Oshlack, A. (2012). SWAN: Subset-quantile Within Array Normalization for Illumina Infinium HumanMethylation450 BeadChips. Genome Biology, 13(6), R44. 
http://doi.org/10.1186/gb-2012-13-6-r44 Mann, S., Beedie, C., & Jimenez, A. (2014). Differential effects of aerobic exercise, resistance training and combined exercise modalities on cholesterol and the lipid profile: review, synthesis and recommendations. Sports Medicine (Auckland, N.Z.), 44(2), 211–221. http://doi.org/10.1007/s40279-013-0110-5 Marioni, R. E., Harris, S. E., Shah, S., McRae, A. F., Zglinicki, von, T., Martin-Ruiz, C., et al. (2016). The epigenetic clock and telomere length are independently associated with chronological age and mortality. International Journal of Epidemiology. http://doi.org/10.1093/ije/dyw041 Marioni, R. E., Shah, S., McRae, A. F., Chen, B. H., Colicino, E., Harris, S. E., et al. (2015a). DNA methylation age of blood predicts all-cause mortality in later life. Genome Biology, 16(1), 25. http://doi.org/10.1186/s13059-015-0584-6 Marioni, R. E., Shah, S., McRae, A. F., Ritchie, S. J., Muniz-Terrera, G., Harris, S. E., et al. (2015b). The epigenetic clock is correlated with physical and cognitive fitness in the Lothian Birth Cohort 1936. International Journal of Epidemiology, dyu277. http://doi.org/10.1093/ije/dyu277 Marsit, C. J. (2015). Influence of environmental exposure on human epigenetic regulation. Journal of Experimental Biology, 218(1), 71–79. http://doi.org/10.1242/jeb.106971 Martin, G. M. (2005). Epigenetic drift in aging identical twins. Proceedings of the National Academy of Sciences, 102(30), 10413–10414. http://doi.org/10.1073/pnas.0504743102 Martínez, J. A., Milagro, F. I., Claycombe, K. J., & Schalinske, K. L. (2014). Epigenetics in adipose tissue, obesity, weight loss, and diabetes. Advances in Nutrition (Bethesda, Md.), 5(1), 71–81. http://doi.org/10.3945/an.113.004705 Marttila, S., Kananen, L., Häyrynen, S., Jylhävä, J., Nevalainen, T., Hervonen, A., et al. (2015). Ageing-associated changes in the human DNA methylome: genomic locations and effects on gene expression. BMC Genomics, 16(1), 179. 
http://doi.org/10.1186/s12864-015-1381-z Maunakea, A. K., Chepelev, I., Cui, K., & Zhao, K. (2013). Intragenic DNA methylation modulates alternative splicing by recruiting MeCP2 to promote exon recognition. Cell Research, 23(11), 1256–1269. http://doi.org/10.1038/cr.2013.110 McCay, C. M., Crowell, M. F., & Maynard, L. A. (1989). The effect of retarded growth upon the length of life span and upon the ultimate body size. 1935. Nutrition (Burbank, Los Angeles County, Calif.) (Vol. 5, pp. 155–71– discussion 172). McEwen, L. M., Goodman, S. J., Kobor, M. S., & Jones, M. J. (2016). The DNA Methylome: An Interface Between the Environment, Immunity, and Ageing. In The Ageing Immune System and Health (Vol. 13, pp. 35–52). Cham: Springer International Publishing. http://doi.org/10.1007/978-3-319-43365-3_3 McGregor, K., Bernatsky, S., Colmegna, I., Hudson, M., Pastinen, T., Labbe, A., & Greenwood, C. M. T. (2016). An evaluation of methods correcting for cell-type heterogeneity in DNA methylation studies. Genome Biology, 17(1), 84. http://doi.org/10.1186/s13059-016-0935-y Meissner, A. (2010). Epigenetic modifications in pluripotent and differentiated cells. Nature Biotechnology, 28(10), 1079–1088. http://doi.org/10.1038/nbt.1684 Mendelson, M. M., Marioni, R. E., Joehanes, R., Liu, C., Hedman, Å. K., Aslibekyan, S., et al. (2017). Association of Body Mass Index with DNA Methylation and Gene Expression in Blood Cells and Relations to Cardiometabolic Disease: A Mendelian Randomization Approach. PLoS Medicine, 14(1), e1002215. http://doi.org/10.1371/journal.pmed.1002215 Milagro, F. I., Campión, J., Cordero, P., Goyenechea, E., Gómez-Úriz, A. M., Abete, I., et al. (2011). A dual epigenomic approach for the search of obesity biomarkers: DNA methylation in relation to diet-induced weight loss. FASEB Journal : Official Publication of the Federation of American Societies for Experimental Biology, 25(4), 1378–1389. http://doi.org/10.1096/fj.10-170365 Miller, G. E., Chen, E., Fok, A. K., et al. (2009). Low early-life social class leaves a biological residue manifested by decreased glucocorticoid and increased proinflammatory signaling. Proceedings of the National Academy of Sciences, 106(34), 14716–14721. http://doi.org/10.1073/pnas.0902971106 Miyake, K., Kawaguchi, A., Miura, R., Kobayashi, S., Tran, N. Q. V., Kobayashi, S., et al. (2018). Association between DNA methylation in cord blood and maternal smoking: The Hokkaido Study on Environment and Children’s Health. Scientific Reports, 8(1), 77. http://doi.org/10.1038/s41598-018-23772-x Moher, D. (2010). CONSORT 2010 Explanation and Elaboration: updated guidelines for reporting parallel group randomised trials (Chinese version). Journal of Chinese Integrative Medicine, 701–741. http://doi.org/10.3736/jcim20100801 Monick, M. M., Beach, S. R. H., Plume, J., Sears, R., Gerrard, M., Brody, G. H., & Philibert, R. A. (2012). Coordinated changes in AHRR methylation in lymphoblasts and pulmonary macrophages from smokers. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 159B(2), 141–151. http://doi.org/10.1002/ajmg.b.32021 Moore, S. C., Patel, A. V., Matthews, C. E., Berrington de Gonzalez, A., Park, Y., Katki, H. A., et al. (2012). Leisure time physical activity of moderate to vigorous intensity and mortality: a large pooled cohort analysis. PLoS Medicine, 9(11), e1001335. http://doi.org/10.1371/journal.pmed.1001335 Moore, S. R., McEwen, L. M., Quirt, J., Morin, A., Mah, S. M., Barr, R. G., et al. (2017). Epigenetic correlates of neonatal contact in humans. Development and Psychopathology, 29(5), 1517–1538. http://doi.org/10.1017/S0954579417001213 Moran, S., Arribas, C., & Esteller, M. (2015). Validation of a DNA methylation microarray for 850,000 CpG sites of the human genome enriched in enhancer sequences. Epigenomics, 8(3), 389–399. http://doi.org/10.2217/epi.15.114 Morris, T. J., Butcher, L. M., Feber, A., Teschendorff, A. 
E., Chakravarthy, A. R., Wojdacz, T. K., & Beck, S. (2014). ChAMP: 450k Chip Analysis Methylation Pipeline. Bioinformatics (Oxford, England), 30(3), 428–430. http://doi.org/10.1093/bioinformatics/btt684 Murabito, J. M., Yuan, R., & Lunetta, K. L. (2012). The search for longevity and healthy aging genes: insights from epidemiological studies and samples of long-lived individuals. The Journals of Gerontology Series a: Biological Sciences and Medical Sciences, 67(5), 470–479. http://doi.org/10.1093/gerona/gls089 Murgatroyd, C., Patchev, A. V., Wu, Y., Micale, V., Bockmühl, Y., Fischer, D., et al. (2009). Dynamic DNA methylation programs persistent adverse effects of early-life stress. Nature Neuroscience, 12(12), 1559–1566. http://doi.org/10.1038/nn.2436 Nitert, M. D., Dayeh, T., Volkov, P., Elgzyri, T., Hall, E., Nilsson, E., et al. (2012). Impact of an exercise intervention on DNA methylation in skeletal muscle from first-degree relatives of patients with type 2 diabetes. Diabetes, 61(12), 3322–3332. http://doi.org/10.2337/db11-1653 Nocon, M., Hiemann, T., Müller-Riemenschneider, F., Thalau, F., Roll, S., & Willich, S. N. (2008). Association of physical activity with all-cause and cardiovascular mortality: a systematic review and meta-analysis. European Journal of Cardiovascular Prevention and Rehabilitation : Official Journal of the European Society of Cardiology, Working Groups on Epidemiology & Prevention and Cardiac Rehabilitation and Exercise Physiology, 15(3), 239–246. http://doi.org/10.1097/HJR.0b013e3282f55e09 Novakovic, B., Ryan, J., Pereira, N., Boughton, B., Craig, J. M., & Saffery, R. (2014). Postnatal stability, tissue, and time specific effects of AHRR methylation change in response to maternal smoking in pregnancy. Epigenetics : Official Journal of the DNA Methylation Society, 9(3), 377–386. http://doi.org/10.4161/epi.27248 Nunan, D., Mahtani, K. R., Roberts, N., & Heneghan, C. (2013). 
Physical activity for the prevention and treatment of major chronic disease: an overview of systematic reviews. Systematic Reviews, 2(1), 56. http://doi.org/10.1186/2046-4053-2-56 O'Donnell, K. A., Gaudreau, H., Colalillo, S., Steiner, M., Atkinson, L., Moss, E., et al. (2014). The maternal adversity, vulnerability and neurodevelopment project: theory and methodology. Canadian Journal of Psychiatry. Revue Canadienne De Psychiatrie, 59(9), 497–508. http://doi.org/10.1177/070674371405900906 Passarino, G., De Rango, F., & Montesanto, A. (2016). Human longevity: Genetics or Lifestyle? It takes two to tango. Immunity & Ageing : I & A, 13(1), 12. http://doi.org/10.1186/s12979-016-0066-z Pellicanò, M., Buffa, S., Goldeck, D., Bulati, M., Martorana, A., Caruso, C., et al. (2013). Evidence for Less Marked Potential Signs of T-Cell Immunosenescence in Centenarian Offspring Than in the General Age-Matched Population. The Journals of Gerontology Series a: Biological Sciences and Medical Sciences, 69(5), glt120–504. http://doi.org/10.1093/gerona/glt120 Perna, L., Zhang, Y., Mons, U., Holleczek, B., Saum, K.-U., & Brenner, H. (2016). Epigenetic age acceleration predicts cancer, cardiovascular, and all-cause mortality in a German case cohort. Clinical Epigenetics, 8(1), 64. http://doi.org/10.1186/s13148-016-0228-z Peters, M. J. (n.d.). The transcriptional landscape of age in human peripheral blood. Retrieved October 26, 2015, from http://www.nature.com/ncomms/2015/151022/ncomms9570/pdf/ncomms9570.pdf Peters, T. J., Buckley, M. J., Statham, A. L., Pidsley, R., Samaras, K., V Lord, R., et al. (2015). De novo identification of differentially methylated regions in the human genome. Epigenetics & Chromatin, 8, 6. http://doi.org/10.1186/1756-8935-8-6 Pidsley, R., Y Wong, C. C., Volta, M., Lunnon, K., Mill, J., & Schalkwyk, L. C. (2013). A data-driven approach to preprocessing Illumina 450K methylation array data. BMC Genomics, 14(1), 293. http://doi.org/10.1186/1471-2164-14-293 Poppitt, S. D., Keogh, G. F., Lithander, F. E., Wang, Y., Mulvey, T. B., Chan, Y.-K., et al. (2008). Postprandial response of adiponectin, interleukin-6, tumor necrosis factor-alpha, and C-reactive protein to a high-fat dietary load. Nutrition (Burbank, Los Angeles County, Calif.), 24(4), 322–329. http://doi.org/10.1016/j.nut.2007.12.012 Portales-Casamar, E., Mah, S. M., MacIsaac, J. L., Barhdadi, A., Provost, S., Dubé, M.-P., et al. (2015). DNA methylation changes in fetal alcohol spectrum disorder. International Journal of Developmental Neuroscience : the Official Journal of the International Society for Developmental Neuroscience, 47(Pt A), 126. http://doi.org/10.1016/j.ijdevneu.2015.04.336 Price, M. E., Cotton, A. M., Lam, L. L., Farré, P., Emberly, E., Brown, C. J., et al. (2013). Additional annotation enhances potential for biologically-relevant analysis of the Illumina Infinium HumanMethylation450 BeadChip array. Epigenetics & Chromatin, 6(1), 4. http://doi.org/10.1186/1756-8935-6-4 Quach, A., Levine, M. E., Tanaka, T., Lu, A. T., Chen, B. H., Ferrucci, L., et al. (2017). Epigenetic clock analysis of diet, exercise, education, and lifestyle factors. Aging, 9(2), 419–446. http://doi.org/10.18632/aging.101168 Rahmani, E., Shenhav, L., Schweiger, R., Yousefi, P., Huen, K., Eskenazi, B., et al. (2016). Genome-wide methylation data mirror ancestry information. bioRxiv (p. 066340). 
Cold Spring Harbor Labs Journals. Rahmani, E., Shenhav, L., Schweiger, R., Yousefi, P., Huen, K., Eskenazi, B., et al. (2017). Genome-wide methylation data mirror ancestry information. Epigenetics & Chromatin, 10(1), 1. http://doi.org/10.1186/s13072-016-0108-y Rappou, E., Jukarainen, S., Rinnankoski-Tuikka, R., Kaye, S., Heinonen, S., Hakkarainen, A., et al. (2016). Weight Loss Is Associated With Increased NAD+/SIRT1 Expression But Reduced PARP Activity in White Adipose Tissue. The Journal of Clinical Endocrinology & Metabolism, 101(3), 1263–1273. http://doi.org/10.1210/jc.2015-3054 Rehkopf, D. H., Dow, W. H., Rosero-Bixby, L., Lin, J., Epel, E. S., & Blackburn, E. H. (2013). Longer leukocyte telomere length in Costa Rica's Nicoya Peninsula: a population-based study. Experimental Gerontology, 48(11), 1266–1273. http://doi.org/10.1016/j.exger.2013.08.005 Rehkopf, D. H., Needham, B. L., Lin, J., Blackburn, E. H., Zota, A. R., Wojcicki, J. M., & Epel, E. S. (2016). Leukocyte Telomere Length in Relation to 17 Biomarkers of Cardiovascular Disease Risk: A Cross-Sectional Study of US Adults. PLoS Medicine, 13(11), e1002188. http://doi.org/10.1371/journal.pmed.1002188 Reid, M. C., Boutros, N. N., O'Connor, P. G., Cadariu, A., & Concato, J. (2002). The health-related effects of alcohol use in older persons: a systematic review. Substance Abuse, 23(3), 149–164. http://doi.org/10.1080/08897070209511485 Reinius, L. E., Acevedo, N., Joerink, M., Pershagen, G., Dahlen, S.-E., Greco, D., et al. (2012). Differential DNA Methylation in Purified Human Blood Cells: Implications for Cell Lineage and Studies on Disease Susceptibility. PloS One, 7(7), e41361. http://doi.org/10.1371/journal.pone.0041361 Reynolds, L. M., Taylor, J. R., Ding, J., Lohman, K., Johnson, C., Siscovick, D., et al. (2014). Age-related variations in the methylome associated with gene expression in human monocytes and T cells. Nature Communications, 5, 5366. http://doi.org/10.1038/ncomms6366 Richmond, R. 
C., Sharp, G. C., Ward, M. E., Fraser, A., Lyttleton, O., McArdle, W. L., et al. (2016). DNA Methylation and BMI: Investigating Identified Methylation Sites at HIF3A in a Causal Framework. Diabetes, 65(5), 1231–1244. http://doi.org/10.2337/db15-0996 Rickabaugh, T. M., Baxter, R. M., Sehl, M., Sinsheimer, J. S., Hultin, P. M., Hultin, L. E., et al. (2015). Acceleration of age-associated methylation patterns in HIV-1-infected adults. PloS One, 10(3), e0119201. http://doi.org/10.1371/journal.pone.0119201 Rijlaarsdam, J., Pappa, I., Walton, E., Bakermans-Kranenburg, M. J., Mileva-Seitz, V. R., Rippe, R. C. A., et al. (2016). An epigenome-wide association meta-analysis of prenatal maternal stress in neonates: A model approach for replication. Epigenetics : Official Journal of the DNA Methylation Society, 11(2), 140–149. http://doi.org/10.1080/15592294.2016.1145329 Romanov, G. A., & Vanyushin, B. F. (1981). Methylation of reiterated sequences in mammalian DNAs. Effects of the tissue type, age, malignancy and hormonal induction. Biochimica Et Biophysica Acta, 653(2), 204–218. Rose, G. (2001). Sick individuals and sick populations. International Journal of Epidemiology, 30(3), 427–32– discussion 433–4. http://doi.org/10.1093/ije/30.3.427 Rosero-Bixby, L., & Dow, W. H. (2009). Surprising SES Gradients in mortality, health, and biomarkers in a Latin American population of adults. The Journals of Gerontology. Series B, Psychological Sciences and Social Sciences, 64(1), 105–117. http://doi.org/10.1093/geronb/gbn004 Rosero-Bixby, L., Dow, W. H., & Fernández, X. (2013a). CRELES: Costa Rican Longevity and Healthy Aging Study. Rosero-Bixby, L., Dow, W. H., & Rehkopf, D. H. (2013b). The Nicoya region of Costa Rica: a high longevity island for elderly males. Vienna Yearbook of Population Research / Vienna Institute of Demography, Austrian Academy of Sciences, 11, 109–136. Rönn, T., Volkov, P., Davegårdh, C., Dayeh, T., Hall, E., Olsson, A. H., et al. (2013). 
A Six Months Exercise Intervention Influences the Genome-wide DNA Methylation Pattern in Human Adipose Tissue. PLoS Genetics, 9(6), e1003572. http://doi.org/10.1371/journal.pgen.1003572 Saffari, A., Silver, M. J., Zavattari, P., Moi, L., Columbano, A., Meaburn, E. L., & Dudbridge, F. (2017). Estimation of a significance threshold for epigenome-wide association studies. Genetic Epidemiology, 42(1), 20–33. http://doi.org/10.1002/gepi.22086 Saxonov, S., Berg, P., & Brutlag, D. L. (2006). A genome-wide analysis of CpG dinucleotides in the human genome distinguishes two distinct classes of promoters. Proceedings of the National Academy of Sciences, 103(5), 1412–1417. http://doi.org/10.1073/pnas.0510310103 Schultz, M. D., He, Y., Whitaker, J. W., Hariharan, M., Mukamel, E. A., Leung, D., et al. (2015). Human body epigenome maps reveal noncanonical DNA methylation variation. Nature, 523(7559), 212–216. http://doi.org/10.1038/nature14465 Schuyler, R. P., Merkel, A., Raineri, E., Altucci, L., Vellenga, E., Martens, J. H. A., et al. (2016). Distinct Trends of DNA Methylation Patterning in the Innate and Adaptive Immune Systems. Cell Reports, 17(8), 2101–2111. http://doi.org/10.1016/j.celrep.2016.10.054 Schübeler, D. (2015). Function and information content of DNA methylation. Nature, 517(7534), 321–326. http://doi.org/10.1038/nature14192 Shah, S., McRae, A. F., Marioni, R. E., Harris, S. E., Gibson, J., Henders, A. K., et al. (2014). Genetic and environmental exposures constrain epigenetic drift over the human life course. Genome Research, 24(11), 1725–1733. http://doi.org/10.1101/gr.176933.114 Sharp, G. C., Arathimos, R., Reese, S. E., Page, C. M., Felix, J., Küpers, L. K., et al. (2018). Maternal alcohol consumption and offspring DNA methylation: findings from six general population-based birth cohorts. Epigenomics, 10(1), 27–42. http://doi.org/10.2217/epi-2017-0095 Shen, J.-C., Rideout, W. M., III, & Jones, P. A. (1994). 
The rate of hydrolytic deamination of 5-methylcytosine in double-stranded DNA. Nucleic Acids Research, 22(6), 972–976. http://doi.org/10.1093/nar/22.6.972 Shenker, N. S., Polidoro, S., van Veldhoven, K., Sacerdote, C., Ricceri, F., Birrell, M. A., et al. (2013). Epigenome-wide association study in the European Prospective Investigation into Cancer and Nutrition (EPIC-Turin) identifies novel genetic loci associated with smoking. Human Molecular Genetics, 22(5), 843–851. http://doi.org/10.1093/hmg/dds488 Shoemaker, R., Deng, J., Wang, W., & Zhang, K. (2010). Allele-specific methylation is prevalent and is contributed by CpG-SNPs in the human genome. Genome Research, 20(7), 883–889. http://doi.org/10.1101/gr.104695.109 Simpkin, A. J., Hemani, G., Suderman, M., Gaunt, T. R., Lyttleton, O., McArdle, W. L., et al. (2016). Prenatal and early life influences on epigenetic age in children: A study of mother-offspring pairs from two cohort studies. Human Molecular Genetics, 25(1), 191–201. http://doi.org/10.1093/hmg/ddv456 Simpkin, A. J., Suderman, M., & Howe, L. D. (2017). Epigenetic clocks for gestational age: statistical and study design considerations. Clinical Epigenetics, 9(1), 100. http://doi.org/10.1186/s13148-017-0402-y Smeets, R. L., Fleuren, W. W., He, X., Vink, P. M., Wijnands, F., Gorecka, M., et al. (2012). Molecular pathway profiling of T lymphocyte signal transduction pathways; Th1 and Th2 genomic fingerprints are defined by TCR and CD28-mediated signaling. BMC Immunology, 13(1), 1. http://doi.org/10.1186/1471-2172-13-12 Smith, A. K., Kilaru, V., Klengel, T., Mercer, K. B., Bradley, B., Conneely, K. N., et al. (2015). DNA extracted from saliva for methylation studies of psychiatric traits: Evidence tissue specificity and relatedness to brain. American Journal of Medical Genetics. Part B, Neuropsychiatric Genetics : the Official Publication of the International Society of Psychiatric Genetics, 168(1), 36–44. http://doi.org/10.1002/ajmg.b.32278 Smith, A. 
K., Kilaru, V., Kocak, M., Almli, L. M., Mercer, K. B., Ressler, K. J., et al. (2014). Methylation quantitative trait loci (meQTLs) are consistently detected across ancestry, developmental stage, and tissue type. BMC Genomics, 15(1), 145. http://doi.org/10.1186/1471-2164-15-145 Smith-Vikos, T., & Slack, F. J. (2012). MicroRNAs and their roles in aging. Journal of Cell Science, 125(Pt 1), 7–17. http://doi.org/10.1242/jcs.099200 Smyth, G. K. (2004). Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, 3(1), Article3–25. http://doi.org/10.2202/1544-6115.1027 Steegenga, W. T., Boekschoten, M. V., Lute, C., Hooiveld, G. J., de Groot, P. J., Morris, T. J., et al. (2014). Genome-wide age-related changes in DNA methylation and gene expression in human PBMCs. Age (Dordrecht, Netherlands), 36(3), 9648–1540. http://doi.org/10.1007/s11357-014-9648-x Stringhini, S., Polidoro, S., Sacerdote, C., Kelly, R. S., van Veldhoven, K., Agnoli, C., et al. (2015). Life-course socioeconomic status and DNA methylation of genes regulating inflammation. International Journal of Epidemiology, 44(4), 1320–1330. http://doi.org/10.1093/ije/dyv060 Stubbs, T. M., Bonder, M.-J., Stark, A.-K., Krueger, F., BI Ageing Clock Team, Meyenn, von, F., et al. (2017). Multi-tissue DNA methylation age predictor in mouse. Genome Biology, 18(1), 68. http://doi.org/10.1186/s13059-017-1203-5 Suarez-Alvarez, B., Rodríguez, R. M., Fraga, M. F., & López-Larrea, C. (2012). DNA methylation: a promising landscape for immune system-related diseases. Trends in Genetics, 28(10), 506–514. http://doi.org/10.1016/j.tig.2012.06.005 Surén, P., Stoltenberg, C., Bresnahan, M., Hirtz, D., Lie, K. K., Lipkin, W. I., et al. (2013). Early growth patterns in children with autism. Epidemiology (Cambridge, Mass.), 24(5), 660–670. http://doi.org/10.1097/EDE.0b013e31829e1d45 Teschendorff, A. 
E., Marabita, F., Lechner, M., Bartlett, T., Tegnér, J., Gomez-Cabrero, D., & Beck, S. (2013). A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics (Oxford, England), 29(2), 189–196. http://doi.org/10.1093/bioinformatics/bts680 Thompson, M. J., vonHoldt, B., Horvath, S., & Pellegrini, M. (2017). An epigenetic aging clock for dogs and wolves. Aging, 9(3), 1055–1068. http://doi.org/10.18632/aging.101211 Tobi, E. W., Slieker, R. C., Luijk, R., Dekkers, K. F., Stein, A. D., Xu, K. M., et al. (2018). DNA methylation as a mediator of the association between prenatal adversity and risk factors for metabolic disease in adulthood. Science Advances, 4(1), eaao4364. http://doi.org/10.1126/sciadv.aao4364 Touleimat, N., & Tost, J. (2012). Complete pipeline for Infinium® Human Methylation 450K BeadChip data processing using subset quantile normalization for accurate DNA methylation estimation. Epigenomics, 4(3), 325–341. http://doi.org/10.2217/epi.12.21 Triche, T. J., Jr, Weisenberger, D. J., Van Den Berg, D., Laird, P. W., & Siegmund, K. D. (2013). Low-level processing of Illumina Infinium DNA Methylation BeadArrays. Nucleic Acids Research, 41(7), e90–e90. http://doi.org/10.1093/nar/gkt090 Trowbridge, J. J., Snow, J. W., Kim, J., & Orkin, S. H. (2009). DNA Methyltransferase 1 Is Essential for and Uniquely Regulates Hematopoietic Stem and Progenitor Cells. Cell Stem Cell, 5(4), 442–449. http://doi.org/10.1016/j.stem.2009.08.016 Troyanskaya, O., Cantor, M., Sherlock, G., Brown, P., Hastie, T., Tibshirani, R., et al. (2001). Missing value estimation methods for DNA microarrays. Bioinformatics (Oxford, England), 17(6), 520–525. http://doi.org/10.1093/bioinformatics/17.6.520 Tsai, P.-C., & Bell, J. T. (2015). 
Power and sample size estimation for epigenome-wide association scans to detect differential DNA methylation. International Journal of Epidemiology, 44(4), 1429–1441. http://doi.org/10.1093/ije/dyv041 Unnikrishnan, A., Hadad, N., Masser, D. R., Jackson, J., Freeman, W. M., & Richardson, A. (2018). Revisiting the genomic hypomethylation hypothesis of aging. Annals of the New York Academy of Sciences, 166, 237. http://doi.org/10.1111/nyas.13533 van den Boogaart, K. G., & Tolosana-Delgado, R. (2013). Fundamental Concepts of Compositional Data Analysis. In Analyzing Compositional Data with R (pp. 13–50). Berlin, Heidelberg: Springer Berlin Heidelberg. http://doi.org/10.1007/978-3-642-36809-7_2 van Dongen, J., Nivard, M. G., Willemsen, G., Hottenga, J.-J., Helmer, Q., Dolan, C. V., et al. (2016). Genetic and environmental influences interact with age and sex in shaping the human methylome. Nature Communications, 7, 11115. http://doi.org/10.1038/ncomms11115 van Eijk, K. R., de Jong, S., Strengman, E., Buizer-Voskamp, J. E., Kahn, R. S., Boks, M. P., et al. (2015). Identification of schizophrenia-associated loci by combining DNA methylation and gene expression data from whole blood. European Journal of Human Genetics : EJHG, 23(8), 1106–1110. http://doi.org/10.1038/ejhg.2014.245 Vidaki, A., & Kayser, M. (2017). From forensic epigenetics to forensic epigenomics: broadening DNA investigative intelligence. Genome Biology, 18(1), 238. http://doi.org/10.1186/s13059-017-1373-1 Vidaki, A., Ballard, D., Aliferi, A., Miller, T. H., Barron, L. P., & Syndercombe Court, D. (2017). DNA methylation-based forensic age prediction using artificial neural networks and next generation sequencing. Forensic Science International. Genetics, 28, 225–236. http://doi.org/10.1016/j.fsigen.2017.02.009 Voisin, S., Eynon, N., Yan, X., & Bishop, D. J. (2015). Exercise training and DNA methylation in humans. Acta Physiologica (Oxford, England), 213(1), 39–59. 
http://doi.org/10.1111/apha.12414 Wahl, S., Drong, A., Lehne, B., Loh, M., Scott, W. R., Kunze, S., et al. (2017). Epigenome-wide association study of body mass index, and the adverse outcomes of adiposity. Nature, 541(7635), 81–86. http://doi.org/10.1038/nature20784 Walker, R. F., Liu, J. S., Peters, B. A., Ritz, B. R., Wu, T., Ophoff, R. A., & Horvath, S. (2015). Epigenetic age analysis of children who seem to evade aging. Aging, 7(5), 334–339. Wang, T., Guan, W., Lin, J., Boutaoui, N., Canino, G., Luo, J., et al. (2015). A systematic study of normalization methods for Infinium 450K methylation data using whole-genome bisulfite sequencing data. Epigenetics : Official Journal of the DNA Methylation Society, 10(7), 662–669. http://doi.org/10.1080/15592294.2015.1057384 Warburton, D. E. R., Nicol, C. W., & Bredin, S. S. D. (2006). Health benefits of physical activity: the evidence. CMAJ : Canadian Medical Association Journal = Journal De l'Association Medicale Canadienne, 174(6), 801–809. http://doi.org/10.1503/cmaj.051351 Weber, M., Hellmann, I., Stadler, M. B., Ramos, L., Pääbo, S., Rebhan, M., & Schübeler, D. (2007). Distribution, silencing potential and evolutionary impact of promoter DNA methylation in the human genome. Nature Genetics, 39(4), 457–466. http://doi.org/10.1038/ng1990 Weidner, C. I., Lin, Q., Koch, C. M., Eisele, L., Beier, F., Ziegler, P., et al. (2014). Aging of blood can be tracked by DNA methylation changes at just three CpG sites. Genome Biology, 15(2), R24. http://doi.org/10.1186/gb-2014-15-2-r24 Wherry, E. J., & Kurachi, M. (2015). Molecular and cellular insights into T cell exhaustion. Nature Reviews Immunology, 15(8), 486–499. http://doi.org/10.1038/nri3862 Wilhelm-Benartzi, C. S., Koestler, D. C., Karagas, M. R., Flanagan, J. M., Christensen, B. C., Kelsey, K. T., et al. (2013). Review of processing and analysis methods for DNA methylation array data. British Journal of Cancer, 109(6), 1394–1402. 
http://doi.org/10.1038/bjc.2013.496 Wilson, V. L., Smith, R. A., Ma, S., & Cutler, R. G. (1987). Genomic 5-methyldeoxycytidine decreases with age. The Journal of Biological Chemistry, 262(21), 9948–9951. Winnefeld, M., & Lyko, F. (2012). The aging epigenome: DNA methylation from the cradle to the grave. Genome Biology, 13(7), 165. http://doi.org/10.1186/gb-2012-13-7-165 Wolf, E. J., Logue, M. W., Hayes, J. P., Sadeh, N., Schichman, S. A., Stone, A., et al. (2015). Accelerated DNA methylation age: Associations with PTSD and neural integrity. Psychoneuroendocrinology, 63, 155–162. http://doi.org/10.1016/j.psyneuen.2015.09.020 Wolf, E. J., Maniates, H., Nugent, N., Maihofer, A. X., Armstrong, D., Ratanatharathorn, A., et al. (2017). Traumatic stress and accelerated DNA methylation age: A meta-analysis. Psychoneuroendocrinology. http://doi.org/10.1016/j.psyneuen.2017.12.007 Yin, Y., Morgunova, E., Jolma, A., Kaasinen, E., Sahu, B., Khund-Sayeed, S., et al. (2017). Impact of cytosine methylation on DNA binding specificities of human transcription factors. Science (New York, N.Y.), 356(6337), eaaj2239. http://doi.org/10.1126/science.aaj2239 Yousefi, P., Huen, K., Schall, R. A., Decker, A., Elboudwarej, E., Quach, H., et al. (2014). Considerations for normalization of DNA methylation data by Illumina 450K BeadChip assay in population studies. Epigenetics : Official Journal of the DNA Methylation Society, 8(11), 1141–1152. http://doi.org/10.4161/epi.26037 Yu, D.-H., Ware, C., Waterland, R. A., Zhang, J., Chen, M.-H., Gadkari, M., et al. (2013). Developmentally programmed 3' CpG island methylation confers tissue- and cell-type-specific transcriptional activation. Molecular and Cellular Biology, 33(9), 1845–1858. http://doi.org/10.1128/MCB.01124-12 Zampieri, M., Ciccarone, F., Calabrese, R., Franceschi, C., Bürkle, A., & Caiafa, P. (2015). Reconfiguration of DNA methylation in aging. Mechanisms of Ageing and Development, 151, 60–70. 
http://doi.org/10.1016/j.mad.2015.02.002 Zannas, A. S., Arloth, J., Carrillo-Roa, T., Iurato, S., Röh, S., Ressler, K. J., et al. (2015). Lifetime stress accelerates epigenetic aging in an urban, African American cohort: relevance of glucocorticoid signaling. Genome Biology, 16(1), R741. http://doi.org/10.1186/s13059-015-0828-5 Zheng, S. C., Widschwendter, M., & Teschendorff, A. E. (2016). Epigenetic drift, epigenetic clocks and cancer risk. Epigenomics, 8(5), 705–719. http://doi.org/10.2217/epi-2015-0017 Zhu, H., Wang, G., & Qian, J. (2016). Transcription factors as readers and effectors of DNA methylation. Nature Reviews. Genetics, 17(9), 551–565. http://doi.org/10.1038/nrg.2016.83 Ziller, M. J., Gu, H., Müller, F., Donaghey, J., Tsai, L. T.-Y., Kohlbacher, O., et al. (2013). Charting a dynamic DNA methylation landscape of the human genome. Nature, 500(7463), 477–481. http://doi.org/10.1038/nature12433

Appendices

Appendix A  Supplementary materials for Chapter 2
A.1 Supplementary figures

Supplementary Figure 2.1: Genomic enrichment plot of 94-clock CpG sites selected from training data. The 94-clock CpGs were associated with gene features using the annotation described previously (Edgar et al., 2014) and with CpG island features as provided in the Illumina annotation (Bibikova et al., 2011). Counts of CpGs located in each gene feature (promoter, intragenic, 3 prime region and intergenic) and each CpG island feature (island, north and south shore, north and south shelf, and no island association) were compared to the background counts of all CpGs measured. Monte Carlo simulations were performed using 1,000 permutations of random lists of 94 CpG sites to calculate fold change values over the background.

Supplementary Figure 2.2: Buccal DNA methylation age acceleration was moderately higher in individuals diagnosed with autism spectrum disorder (ASD) than in typically developing (TD) children, with outlier removed. 
In data set 9B (GSE50759, n = 80), after removing the outlier observed in Figure 3, DNA methylation age was predicted using a 94-CpG model and subsequently regressed onto chronological age, while controlling for sex, batch, predicted buccal proportion and ethnicity.

Supplementary Figure 2.3: Pediatric buccal DNA methylation age performance in saliva. Pediatric buccal DNA methylation age (y-axis) estimated from the 94-CpG model (based on training data) in data set 12 (GRADY, n = 65). Dashed line indicates perfect correlation y=x, and solid black line represents linear regression line with 95% confidence intervals shaded in grey. Error = median absolute difference between DNA methylation age and chronological age, unit in years. Spearman correlation coefficients displayed.

Supplementary Figure 2.4: Pediatric buccal DNA methylation age performance in whole blood data set 13 before and after cell type composition adjustment. a Pediatric buccal DNA methylation (DNAm) age (y-axis) estimated from the 94-CpG model (based on training data) in data set 13 (GSE36054, n = 134), prior to adjustment for cell type composition, compared to chronological age (x-axis). b DNAm age (y-axis) calculated from the 353-CpG pan-tissue model in data set 13, prior to adjustment for cell type composition, compared to chronological age (x-axis). c Pediatric buccal DNAm age (y-axis) estimated from the 94-CpG model (based on training data) in data set 13, after adjustment for cell type composition (Houseman et al., 2012), compared to chronological age (x-axis). d DNAm age (y-axis) calculated from the 353-CpG pan-tissue model in data set 13 after adjustment for cell type composition using the same method as above, compared to chronological age (x-axis). Dashed line indicates perfect correlation y=x, and solid black line represents linear regression line with 95% confidence intervals shaded in grey. Error = median absolute difference between DNAm age and chronological age.
Supplementary Figure 2.5: Pediatric buccal DNA methylation age performance in whole blood data set 14 before and after cell type composition adjustment. a Pediatric buccal DNA methylation (DNAm) age (y-axis) estimated from the 94-CpG model (based on training data) in data set 14 (GSE64495), prior to adjustment for cell type composition, compared to chronological age (x-axis). b DNAm age (y-axis) calculated from the 353-CpG pan-tissue model in data set 14, prior to adjustment for cell type composition, compared to chronological age (x-axis). c Pediatric buccal DNAm age (y-axis) estimated from the 94-CpG model (based on training data) in data set 14, after adjustment for cell type composition using the Houseman method (Houseman et al., 2012), compared to chronological age (x-axis). d DNAm age (y-axis) calculated from the 353-CpG pan-tissue model in data set 14 after adjustment for cell type composition using the same cellular deconvolution method, compared to chronological age (x-axis). Dashed line indicates perfect correlation y=x, and solid black line represents linear regression line with 95% confidence intervals shaded in grey. Error = median absolute difference between DNAm age and chronological age, unit in years. Spearman correlation coefficients displayed.
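The permutation procedure described for Supplementary Figure 2.1 (random lists of 94 CpG sites drawn from the background, repeated 1,000 times, to obtain fold change over background) could look roughly like the sketch below. It is an illustration under stated assumptions, not the code behind the figure: the function name, the feature labels and the toy counts are hypothetical, and the empirical p value is an addition that the caption does not mention.

```python
import random


def fold_change_enrichment(selected, background, n_perm=1000, seed=0):
    """Compare feature counts in `selected` with random draws from `background`.

    selected:   feature label of each clock CpG (e.g. 94 labels)
    background: feature label of every CpG measured
    Returns {feature: (fold change over mean permuted count, empirical p value)}.
    """
    rng = random.Random(seed)
    k = len(selected)
    features = sorted(set(background))
    observed = {f: selected.count(f) for f in features}
    null_sum = {f: 0 for f in features}
    at_least = {f: 0 for f in features}

    for _ in range(n_perm):
        draw = rng.sample(background, k)  # random list of k CpGs, without replacement
        for f in features:
            count = draw.count(f)
            null_sum[f] += count
            if count >= observed[f]:
                at_least[f] += 1

    result = {}
    for f in features:
        mean_null = null_sum[f] / n_perm
        fold = observed[f] / mean_null if mean_null > 0 else float("inf")
        p_value = (at_least[f] + 1) / (n_perm + 1)  # one-sided, for enrichment (assumption)
        result[f] = (fold, p_value)
    return result


# Hypothetical toy annotation, for illustration only.
background = ["island"] * 200 + ["shore"] * 300 + ["open_sea"] * 500
selected = ["island"] * 47 + ["shore"] * 20 + ["open_sea"] * 27
enrichment = fold_change_enrichment(selected, background, n_perm=1000, seed=1)
```

A fold change above 1 indicates that a feature is over-represented among the selected sites relative to random lists of the same size; a value below 1 indicates depletion.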
A.2 Supplementary tables

Supplementary table 2.1: Description of 94 CpGs for model derived from all available DNA methylation data (both buccal training and test data sets)

CpG  Coef  Age Correlation  Associated gene (nearest TSS)  Chromosome  Distance to TSS (bp)  CpG Class
cg00085493*  -0.105  -0.80  AK123177  1  2214  LC
cg00609333*  0.026  0.67  MGST1  12  -1350  LC
cg01912040  0.021  0.81  ABR  17  -15937  HC
cg02209075*  0.048  0.60  HIC2  22  2046  LC
cg02426178*  -0.387  -0.87  SLC44A2  19  663  IC
cg02553520  0.023  0.63  PDIA6  2  35145  IC
cg02821342*  -0.272  -0.71  FLJ43663  7  11  LC
cg02980055*  -0.192  -0.80  LINC00293  8  113676  IC
cg03020208*  0.057  0.72  AQP5  12  -316  HC
cg03182000  0.030  0.80  VGLL4  3  9991  LC
cg03370490  -0.001  -0.58  BRSK2  11  -30823  IC
cg03473016*  0.090  0.76  MYLK  3  25355  LC
cg03493146*  -0.164  -0.69  AUTS2  7  -1822  HC
cg03555227*  0.668  0.91  RANBP17  5  175  HC
cg03627290  -0.016  -0.69  TMBIM1  2  2032  LC
cg03834031  -0.279  -0.51  LOC554174  22  -11677  ICshore
cg04100595  -0.033  -0.56  C12orf70  12  -4604  LC
cg04221461*  0.077  0.85  AKT3  1  129560  LC
cg04267101  -0.088  -0.81  SPATA20  17  -1381  HC
cg05018671  -0.088  -0.84  TRIM35  8  22074  LC
cg05024939*  0.114  0.81  NEUROG2  4  -4923  IC
cg05271255*  -0.149  -0.84  P4HB  17  -2798  LC
cg05928290*  0.127  0.76  BMP4  14  1445  IC
cg06048436*  -0.179  -0.70  ENOX1  13  -3424  LC
cg06144905*  0.363  0.83  PIPOX  17  -137  LC
cg06416491*  -0.207  -0.81  KIF21A  12  -468  ICshore
cg06455149*  0.043  0.85  EBF2  8  -3171  HC
cg06712013*  -0.098  -0.72  SPATS2  12  -1142  LC
cg07057579  0.015  0.86  LOC400456  15  217  HC
cg07066898*  0.133  0.88  GLT25D1  19  37777  HC
cg07352544  -0.013  -0.67  SEPN1  1  -433  LC
cg07638500*  -0.050  -0.68  MYLK  3  11685  LC
cg07809484*  0.097  0.86  CLEC11A  19  5364  HC
cg08003628  -0.036  -0.69  ZFPM2  8  1042  HC
cg08207604*  -0.031  -0.77  DEF6  6  -286  LC
cg08458637  0.024  0.64  LOC100287814  1  -70  LC
cg09023624  0.033  0.70  UBE2E1  3  -62467  IC
cg09367901*  0.021  0.74  BMP4  14  1322  IC
cg09678615*  -0.056  -0.76  TOX3  16  1643  ICshore
cg09877744  0.000  0.82  MAPK8IP2  22  795  HC
cg09938443  0.016  0.72  ZDHHC7  16  25273  IC
cg10043090  -0.058  -0.77  MOB2  11  -28862  IC
cg10308629*  -0.098  -0.74  BPGM  7  23273  IC
cg11071401*  0.004  0.86  CACNA1G  17  -1254  ICshore
cg11084334*  0.137  0.90  LHFPL4  3  1222  HC
cg11092416*  -0.019  -0.76  MOG  6  -7208  IC
cg11117945*  -0.024  -0.76  EYA2  20  1452  LC
cg11529819*  -0.009  -0.72  MAP3K6  1  -2340  IC
cg11705975*  0.102  0.92  PRLHR  10  508  HC
cg11804928*  -0.116  -0.62  THRA  17  1231  LC
cg12483947*  -0.050  -0.75  PCBD1  10  8441  LC
cg12791192  -0.057  -0.66  SPATA20  17  -665  HC
cg13848598*  0.033  0.88  ADRB1  10  773  HC
cg14051236  -0.082  -0.74  Y_RNA  14  -19590  LC
cg14343652  -0.003  -0.60  FLJ43663  7  2776  LC
cg14676592*  0.171  0.86  ZNF423  16  -49944  IC
cg16181396*  0.191  0.91  ZIC1  3  -974  IC
cg16243756*  0.005  0.70  TUSC3  8  42  HC
cg16374999*  0.063  0.83  PRPH  12  72  HC
cg16514113*  0.254  0.86  UBE2MP1  16  -23299  LC
cg16618789*  0.042  0.67  U3  7  -146043  LC
cg16852756*  -0.004  -0.82  LOC646329  7  -32393  LC
cg16867657*  0.601  0.93  ELOVL2  6  -46  HC
cg17110586  0.019  0.89  LRFN3  19  26602  IC
cg17349352  -0.002  -0.73  CCM2  7  -494  LC
cg17508941  0.030  0.91  BC043576  7  -635  ICshore
cg18244483*  0.303  0.82  LOC100507096  4  -134876  IC
cg19203457*  0.167  0.84  MARCKS  6  3272  HC
cg19381811*  -0.066  -0.80  UBA7  3  -322  LC
cg19662687*  0.000  -0.73  C1orf192  1  -15  LC
cg20224218  -0.103  -0.76  Mir_1302  9  31873  LC
cg20283670*  -0.214  -0.79  AFAP1L2  10  1787  ICshore
cg20397034*  -0.502  -0.73  FLJ43663  7  69  IC
cg20704560*  -0.026  -0.72  CASZ1  1  23142  ICshore
cg20755989*  0.198  0.78  LOC729862  5  -523016  LC
cg20897936*  0.191  0.92  SCAND3  6  371  IC
cg21515406*  0.000  0.71  LOC285577  5  211058  IC
cg21572722*  0.651  0.90  ELOVL2  6  -29  HC
cg22686066  0.001  0.57  TAF4  20  -317  HC
cg23500537*  0.103  0.84  PCDHB1  5  -11159  IC
cg23709172  -0.001  -0.81  SLC25A29  14  -1665  LC
cg23731991*  -0.080  -0.65  MKLN1  7  177  IC
cg24374161*  -0.012  -0.79  AMBRA1  11  -12008  LC
cg24441922*  -0.061  -0.78  ARL4C  2  3283  LC
cg24996883*  -0.336  -0.80  C1orf95  1  85357  LC
cg25062539*  -0.232  -0.77  IL7  8  -724  LC
cg25397922  0.002  0.79  NEUROG2  4  4762  ICshore
cg25540871  -0.156  -0.74  TACC1  8  -24478  LC
cg25936902*  -0.081  -0.69  LINC00426  13  -34935  ICshore
cg26535072*  -0.244  -0.73  APBA1  9  -881  HC
cg27015773  0.024  0.49  ADAMTS10  19  2979  LC
cg27061971*  0.119  0.84  FOXD1  5  -2638  HC
cg27437510*  0.105  0.74  AX747175  7  -1203  IC
cg27490391  0.125  0.83  LHFP  13  25169  LC

A.3 Cohort descriptions

Data set 1: APrON
Summary: APrON (Alberta Pregnancy Outcomes and Nutrition) is a Canadian cohort of pregnant women with 3-year child follow-up. DNA methylation was measured at birth (cord blood) and at age 3 months (buccal epithelial cells); however, only the buccal epithelial data were included in the current study.
Tissue extraction/genomic DNA isolation: Genomic DNA was extracted using the Gentra Puregene Buccal Cell Kit (Qiagen #158845).
Bisulfite conversion: Genomic DNA was bisulfite converted using the EZ DNA Methylation Kit™ (Zymo Research, Irvine, CA)
DNA methylation quantification: Infinium HumanMethylation450K BeadChip
Relevant citations: (Kaplan et al., 2012)
Cohort link: http://www.apronstudy.ca/

Data set 2: C3ARE
Summary: The C3ARE Study includes mothers of singleton infants born at > 34 weeks gestation who are residents of Greater Vancouver. The primary motive of this study was to assess the relationship between maternal contact/caregiving and epigenetics. Samples from the infants were collected at the participant’s location and time of choice (either at their home or at BC Women & Children’s Health Center).
Tissue extraction/genomic DNA isolation: Buccal samples were collected using Isohelix Buccal Swabs (Cell Projects, Harrietsham, Kent, England) and stabilized per manufacturer’s instructions for storage at room temperature. 
Genomic DNA was isolated using Isohelix Buccal DNA Isolation Kits (Cell Projects, Harrietsham, Kent, England), and then purified and concentrated using DNA Clean & Concentrator (Zymo Research, Irvine, CA).
Bisulfite conversion: Genomic DNA was bisulfite converted using the EZ DNA Methylation Kit™ (Zymo Research, Irvine, CA)
DNA methylation quantification: Infinium HumanMethylation450K BeadChip
Relevant citations: (S. R. Moore et al., 2017).

Data set 3: GECKO
Summary: GECKO (Gene Expression Collaborative for Kids Only) comprises individuals aged 6-11 years from Vancouver, British Columbia. All experimental procedures were conducted in accordance with institutional review board policies at the University of British Columbia. Written informed consent was obtained from a parent or legal guardian and assent was obtained from each child before study participation.
Tissue extraction/genomic DNA isolation: Buccal epithelial cells were collected with Isohelix DNA buccal swabs using standard collection procedures and stabilized with Isohelix Dri-Capsules for storage at room temperature prior to genomic DNA extraction (Cell Projects Ltd., Kent, UK). Genomic DNA was purified and concentrated using DNA Clean & Concentrator (Zymo Research, CA, USA)
Bisulfite conversion: Genomic DNA was bisulfite converted using the EZ DNA Methylation Kit™ (Zymo Research, Irvine, CA)
DNA methylation quantification: Infinium HumanMethylation450K BeadChip
Cohort link: http://earlylearning.ubc.ca/gecko/

Data set 4: NDN I
Summary: NDN I (NeuroDevNet I) is a cohort of children aged 5-18 years with fetal alcohol spectrum disorder (FASD), together with age- and sex-matched typically developing individuals. Individuals were diagnosed with FASD according to previously published guidelines (Chudley et al., 2005), and were recruited across Canada from several FASD diagnostic clinics. For the current study, only typically developing children were included. 
Ethics for this project were reviewed and approved by the “Children’s and Women’s Research Ethics Board—Clinical” (H10-01149). All experimental procedures were reviewed and approved by the Health Research Ethics Boards at Queen’s University, University of Alberta, Children’s Hospital of Eastern Ontario, University of Manitoba, and the University of British Columbia. Written informed consent was obtained from a parent or legal guardian, and assent was obtained from each child before study participation.
Tissue extraction/genomic DNA isolation: Isohelix buccal swabs and Dri-Capsule (Cell Projects Ltd., Kent, UK) were used to collect buccal epithelial samples. Genomic DNA was purified and concentrated using DNA Clean & Concentrator (Zymo Research, CA, USA).
Bisulfite conversion: Genomic DNA was bisulfite converted using the EZ DNA Methylation Kit™ (Zymo Research, Irvine, CA)
DNA methylation quantification: Infinium HumanMethylation450K BeadChip
DNA methylation data are publicly available: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE80261
Relevant citations: (Lussier et al., 2018; Portales-Casamar et al., 2015)
Cohort link: http://neurodevnet.ca/

Data set 5 & 10: MAVAN I & II
Summary: MAVAN (Maternal Adversity, Vulnerability and Neurodevelopment). Eligible women were recruited in pregnancy; inclusion criteria were age under 18 years, singleton pregnancy and fluency in French or English. Women with severe chronic illness, obstetric complications and those who delivered preterm (<37 weeks gestational age) were excluded. Full ethical approval for this study was provided by the Institutional Review Boards at the Douglas Research Institute at McMaster University.
Tissue extraction/genomic DNA isolation: Buccal epithelial cells were collected from the children (4-11 years) using Catch-All Swabs (Epicentre, Illumina, USA). 
Genomic DNA was extracted using the Masterpure system (Epicentre, USA), quantified using a NanoPhotometer P300 (Implen, Germany) and subsequently purified and concentrated using DNA Clean & Concentrator (Zymo Research, CA, USA).
Bisulfite conversion: Genomic DNA was bisulfite converted using the EZ DNA Methylation Kit™ (Zymo Research, Irvine, CA)
DNA methylation quantification: Infinium HumanMethylation450K BeadChip
Relevant citations: (O'Donnell et al., 2014)
Cohort link: https://douglas.research.mcgill.ca/maternal-adversity-vulnerability-and-neurodevelopment-project-theory-and-methodology

Data set 6: PAWS
Summary: PAWS (The Peers and Wellness Study) is a cohort of children originally recruited from public school kindergarten classrooms in the San Francisco Bay area (aged 9-13 years at time of buccal epithelial sample collection). The primary goal of this study was to investigate social dominance, biological responses to adversity and mental/physical health.
Tissue extraction/genomic DNA isolation: Isohelix T-swab Buccal Swabs (Isohelix) were used for sample collection; cells were first mixed with stabilizing solution and then stored at room temperature until DNA was extracted. Genomic DNA was purified and concentrated using DNA Clean & Concentrator (Zymo Research, CA, USA).
Bisulfite conversion: Genomic DNA was bisulfite converted using the EZ DNA Methylation Kit™ (Zymo Research, Irvine, CA)
DNA methylation quantification: Infinium HumanMethylation450K BeadChip
Relevant citations: Bush et al. (submitted), (Boyce et al., 2010)
Cohort link: http://bushlab.ucsf.edu/article/paws-peers-and-wellness-study

Data set 7: WSFW
Summary: WSFW (Wisconsin Study of Families and Work) is a longitudinal cohort focused on family health throughout pregnancy and post-partum. For the current study, only samples at age 18 were included.
Tissue extraction/genomic DNA isolation: Buccal epithelial swabs were collected using MasterAmp Buccal Swabs (Epicentre Biotechnologies, Madison, WI, USA). DNA was extracted using Isohelix Buccal DNA Isolation Kits (Cell Projects, Harrietsham, Kent, England), then purified and concentrated using DNA Clean & Concentrator (Zymo Research, Irvine, CA, USA).
Bisulfite conversion: Genomic DNA was bisulfite converted using the EZ DNA Methylation Kit™ (Zymo Research, Irvine, CA, USA)
DNA methylation quantification: Infinium HumanMethylation450K BeadChip
Relevant citations: Park et al. (to be submitted).
Cohort link:

Data set 8: BIBO
Summary: BIBO (Basale Invloeden op de Baby Ontwikkeling, translated from Dutch = Basal Influences on Baby Development).
Tissue extraction/genomic DNA isolation: Isohelix T-swab Buccal Swabs. Genomic DNA was purified and concentrated using DNA Clean & Concentrator (Zymo Research, CA, USA).
Bisulfite conversion: Genomic DNA was bisulfite converted using the EZ DNA Methylation Kit™ (Zymo Research, Irvine, CA, USA)
DNA methylation quantification: Infinium MethylationEPIC Kit
Cohort link: https://dpblab.org/bibo-study/informatie-over-bibo/

Data set 9A & 9B: GSE50759
Summary: GSE50759 is a publicly available data set obtained from the Gene Expression Omnibus website (https://www.ncbi.nlm.nih.gov/geo). The full data set is composed of 96 individuals; 45 were diagnosed with autism spectrum disorder and 45 were controls. Data set 9A contained only the typically developing individuals for the purposes of accurate test assessment, whereas 9B included both typically developing and ASD cases in order to examine the effect of ASD status on the deviation between predicted age and actual age.
Relevant citations: (Berko et al., 2014)

Data set 11: NEURON
Summary: NEURON is a project focused on early maternal traumatic childhood experiences and their effects on pregnancy and the postpartum period. 
Tissue extraction/genomic DNA isolation: Isohelix T-swab Buccal Swabs. Genomic DNA was purified and concentrated using DNA Clean & Concentrator (Zymo Research, CA, USA).
Bisulfite conversion: Genomic DNA was bisulfite converted using the EZ DNA Methylation Kit™ (Zymo Research, Irvine, CA, USA)
DNA methylation quantification: Infinium MethylationEPIC Kit

Data set 12: GRADY
Summary: The Grady Trauma Project is a trauma-based study at Grady Memorial Hospital and Emory University School of Medicine in Atlanta, Georgia. The project focuses on Post-Traumatic Stress Disorder and the clinical and physiological implications of trauma exposure.
Tissue extraction/genomic DNA isolation: Oragene DNA Saliva Kit (DNA Genotek, ON, Canada).
Bisulfite conversion: Genomic DNA was bisulfite converted using the EZ DNA Methylation Kit™ (Zymo Research, Irvine, CA, USA)
DNA methylation quantification: Infinium MethylationEPIC Kit
Cohort link: http://gradytraumaproject.com/

Data set 13: GSE36054
Publicly available data set: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi
Relevant citations: (Alisch et al., 2012)

Data set 14: GSE64495
Publicly available data set: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE64495
Relevant citations: (Walker et al., 2015)

Appendix B  Supplementary materials for Chapter 3
B.1 Supplementary figures

Supplementary Figure 3.1: Correlation plot of DNA methylation-based estimated blood cell-type proportions. Colored blocks represent correlation p values below 0.05; red indicates negative correlation and blue indicates positive correlation. Gran = granulocyte, Mono = monocyte, NK = natural killer.

Supplementary Figure 3.2: EEAA (extrinsic epigenetic age acceleration), General Age Acceleration (residuals from a linear model of DNAm age regressed onto chronological age), IEAA (intrinsic epigenetic age acceleration). All measures were generated from the online epigenetic age software. 
No significant differences were observed between Nicoyans (blue) and non-Nicoyans (red). Significance values were generated from ANOVA statistical tests.
[Figure: three panels (EEAA, General, IEAA); y-axis: acceleration; annotated p values: 0.89, 0.64, 0.54.]

Supplementary Figure 3.3: Continuation of differentially methylated regions between Nicoyans and non-Nicoyans. The remaining 14 of the 20 statistically significant DMRs identified with the R package ‘DMRcate’. Unadjusted DNA methylation values, shown as percent of cells methylated, are displayed on the y-axis, and genomic distance (bp) to the TSS is plotted on the x-axis. Associated genes are based on closest distance to the TSS of each differentially methylated region. Nicoyans are represented by blue points, with the mean at each CpG site illustrated by a blue line. Non-Nicoyans are represented in red, with each point indicating an individual and the red line illustrating the mean at each CpG.

Supplementary Figure 3.4: QQ plots of each identified CpG modeled using M values or β values. Linear model included DNA methylation value regressed on group (Nicoya vs non-Nicoya) with sex, age, and estimated cell-type proportions included as covariates.

Supplementary Figure 3.5: Epistructure-derived principal component analysis. Genetically informative 450K array sites were used to estimate genetic population structure in our data. Principal component analysis was performed on a reference CpG list to generate measures (PCs) of genetic structure. No significant difference in these estimates was seen between Nicoyans and non-Nicoyans.
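Supplementary Figure 3.4 contrasts models fitted on M values and on β values. The two scales are related by the standard logit transform, M = log2(β / (1 − β)), which can be sketched as below. This is a minimal illustration; the function names and the clipping constant `eps` are assumptions introduced here to keep the logarithm finite at β = 0 or β = 1, and are not taken from the thesis.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a β value (proportion methylated, 0-1) to an M value.

    `eps` is a hypothetical clipping constant that avoids log2(0) and
    division by zero at the boundaries.
    """
    b = min(max(beta, eps), 1 - eps)
    return math.log2(b / (1 - b))


def m_to_beta(m):
    """Inverse transform: convert an M value back to a β value."""
    return 2 ** m / (2 ** m + 1)


# β = 0.5 maps to M = 0; β values near 0 or 1 are stretched on the M scale.
m_values = [beta_to_m(b) for b in (0.1, 0.5, 0.8)]
```

β values are easier to interpret biologically, whereas M values tend to behave better statistically near the extremes of the methylation range, which is a common reason for fitting a model on both scales.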
B.2 Supplementary tables

Assay CpG target  Primer                      Sequence
cg02853387        Forward                     AGGGAAGAAAAGTTATTAAGTTGT
cg02853387        Reverse (5’biotinylated)    ACAAATACAAAACCCATATTCTCAA
cg02853387        Sequencing                  GTGTAGGTTTTTAGTTTATAGT
cg02438481        Forward                     GTTTTGGGTTTGGTGATTTGGTTTTA
cg02438481        Reverse (5’biotinylated)    ATTTCTTAATCAATACCACCTTCTTCTATA
cg02438481        Sequencing                  TTTATTTTAGGTGGGAGT
cg13979274        Forward (5’biotinylated)    AGGGGAGTATTTTAGTTTAGTGTATAG
cg13979274        Reverse                     CCAACTTAAAAAAACCAAACTTCAATATC
cg13979274        Sequencing                  AAAACAATTACAACCCTC
Supplementary Table 3.1: Pyrosequencing primer sequences designed with Qiagen Pyromark Assay Design 2.0 software

CpG ID      M values: p value  M values: q value*  β values: p value  β values: q value*
cg02853387  1.3e-07            0.012               1.6e-07            0.014
cg02438481  1.6e-07            0.012               2.2e-07            0.016
cg13979274  2.0e-07            0.012               2.9e-07            0.018
cg26107275  2.2e-07            0.012               3.2e-07            0.018
Supplementary Table 3.2: Comparison of M values and β values of each identified CpG. *BH corrected

Appendix C  Supplementary materials for Chapter 4
C.1 Supplementary figures

Supplementary Figure 4.1: Density plots of change in metavariables over time by group allocation. BMI: body mass index, CRP: C-reactive protein, Diastolic BP: diastolic blood pressure, HR: heart rate, MVPA: minutes of daily moderate-vigorous physical activity, Step Count: daily step count, Systolic BP: systolic blood pressure, Sedentary Activity: percent day spent inactive. All data except for CRP measurements are based on results originally reported by Ashe et al. (2015).

Supplementary Figure 4.2: SERPINE-1 baseline methylation at cg17968347 of all participants on y-axis and actual percent weight loss over six-month period displayed on x-axis. Spearman correlation coefficient shown.

Supplementary Figure 4.3: Correlation plot of metadata variables. “pc” = percent change, “diff” = difference in variable measure between time points. Blue indicates a positive correlation, red represents a negative correlation. 
Correlation values were generated from a Pearson's correlation test. All data, except for CRP measurements, are based on results originally reported in Ashe et al. (2015).

C.2 Supplementary tables

Probe ID      p-value     q value(1)   β-value difference   Chr   Genomic location   Associated gene
cg09786593    2.89e-05    0.737        -0.034               2     Intergenic         FLJ43879
cg11630939    4.28e-05    0.737        0.051                17    Intergenic         AX747630

Supplementary Table 4.1: Top 5 CpG probes from univariate linear model of group status regressed on DNA methylation change over 6 months. (1) Multiple-test-corrected p-value using the Benjamini & Hochberg method (Hochberg & Benjamini, 1990). Chr: chromosome.

Appendix D  Supplementary materials for Chapter 5

D.1 Supplementary figures

Supplementary Figure 5.1: Correlation heat-map of 59 common polymorphic control probes for 172 common samples run on the EPIC and 450K methylation platforms.

Supplementary Figure 5.2: Correlations between chronological age, 450K DNA methylation age, and EPIC DNA methylation age estimates from each data input. Pearson's correlation coefficients shown. Green = comparison between EPIC predictions, blue = comparison between 450K predictions, orange = comparison between EPIC and 450K estimates at each processing stage, pink = comparison of each data platform and stage with chronological age. Bold indicates same-processing-stage correlation between the 450K and EPIC platforms.

Supplementary Figure 5.3: DNA methylation (DNAm) age from a reduced epigenetic age predictor (334 CpGs) compared to the full epigenetic age predictor (353 CpGs) using the same 450K data set. DNAm age derived from a reduced raw 450K data set with only probes present on the EPIC platform (334 clock sites) compared to age acceleration calculated from a full raw 450K data set (353 CpGs).

Supplementary Figure 5.4: Diesel Exhaust Study III EPIC DNA methylation beta-value distribution across 795,882 sites.
Each individual is represented by a colored line. BAL = bronchoalveolar lavage samples, PBMC = peripheral blood mononuclear cells, Brush = bronchial brushing samples.

Supplementary Figure 5.5: Heat-map of Kendall rank coefficients across preprocessing methods in EPIC data.

Supplementary Figure 5.6: Dendrogram of 59 single nucleotide polymorphic control probes of technical replicates from the EPIC array.

D.2 Supplementary tables

                   Monocyte Cohort   Diesel Exhaust Controls
N                  172               13
Age (mean [SD])    30.0 [7.1]        31.2 [7.7]
Sex (% male)       100%              53.9%

Supplementary Table 5.1: Cohort characteristics. SD = standard deviation.

