UBC Faculty Research and Publications

An empirically driven data reduction method on the human 450K methylation array to remove tissue specific… Edgar, Rachel D; Jones, Meaghan J; Robinson, Wendy P; Kobor, Michael S Feb 2, 2017


METHODOLOGY Open Access

An empirically driven data reduction method on the human 450K methylation array to remove tissue specific non-variable CpGs

Rachel D. Edgar, Meaghan J. Jones, Wendy P. Robinson and Michael S. Kobor*

Abstract

Background: Population based epigenetic association studies of disease and exposures are becoming more common with the availability of economical genome-wide technologies for interrogation of the methylome, such as the Illumina 450K Human Methylation Array (450K). Often, the expected small number of differentially methylated cytosine-guanine pairs (CpGs) in studies of the human methylome presents a statistical challenge, as the large number of CpGs measured on the 450K necessitates careful multiple test correction. While the 450K is a highly useful tool for population epigenetic studies, many of the CpGs tested are not variable and thus of limited information content in the context of the study and tissue. CpGs with observed lack of variability in the tissue under study could be removed to reduce the data dimensionality, limit the severity of multiple test correction and allow for improved detection of differential DNA methylation.

Methods: Here, we performed a meta-analysis of 450K data from three commonly studied human tissues, namely blood (605 samples), buccal epithelial cells (121 samples) and placenta (157 samples).
We developed lists of CpGs that are non-variable in each tissue.

Results: These lists are surprisingly large (blood 114,204 CpGs, buccal epithelial cells 120,009 CpGs and placenta 101,367 CpGs) and thus will be valuable filters for epigenetic association studies, considerably reducing the dimensionality of the 450K and subsequently the multiple testing correction severity.

Conclusions: We propose this empirically derived method for data reduction to allow for more power in detecting differential DNA methylation associated with exposures in studies on the human methylome.

Keywords: Non-variable, 450K, Tissue, Filter, Power, Multiple-test correction, DNA methylation, Dimensionality reduction

Background

Population studies that interrogate epigenetic signatures associated with environmental variation and disease are becoming increasingly common. The challenge with the majority of epigenome wide association studies (EWAS) of environment and disease is that the epigenetic signals, in terms of detectable number of epigenetic changes and the effect size of changes, between groups are relatively small compared to those observed in EWAS of development, tissues or cancer. Therefore careful and specific methodological steps need to be implemented in analyses to separate any true biological signal from stochastic variation in DNA methylation (DNAm), a phenomenon commonly referred to as noise [1].

One of the most common types of population based epigenetic studies is the examination of DNAm using the Illumina Infinium 450K array (450K) or its related arrays [2]. The Illumina series of DNAm arrays, while highly useful as tools for epigenetic studies, were not designed for any specific human tissue, and a large number of cytosine-guanine pairs (CpGs) lack variability within single tissue studies on the arrays [3–8].
* Correspondence: msk@cmmt.ubc.ca
Department of Medical Genetics, BC Children’s Hospital, University of British Columbia, Vancouver, Canada

© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Edgar et al. Clinical Epigenetics (2017) 9:11 DOI 10.1186/s13148-017-0320-z

CpGs that are non-variable in a study of a specific disease or tissue may be variable in another context and therefore are still valuable on the 450K. However, these tissue specific non-variable CpGs contribute to the high dimensionality of the 450K data and partially necessitate severe multiple test correction. In an effort to rigorously determine the epigenetic signals of environmental exposure and/or disease phenotypes, dimensionality reduction techniques are often employed. These include mixture modelling, principal component analysis, weighted gene co-expression network analysis and elastic net models, among others [9–12]. While these techniques are effective for high-dimensional data reduction, they do not take into account the wealth of independent DNAm data available to build empirical data reduction filters. A common data-driven dimensionality reduction technique is to remove non-variable CpGs from within a specific study and then test only variable sites for association with the exposure of interest [3–8]. While this practice can reduce severe multiple test correction penalties, it can introduce a bias toward significant results [13].
A promising alternative from gene expression analyses is to use a filter based on prior biological knowledge from independent data, which can be highly effective in improving sensitivity while maintaining specificity [13].

Here, we have developed an empirically derived data reduction method in the form of CpG lists which are non-variable in independent cohorts of samples from three commonly used human tissues: blood, buccal epithelial cells and placenta. We anticipate these independently identified non-variable CpG lists will be useful for confirmation of a lack of variability at CpGs in 450K studies of interest. As such, our non-variable CpGs might serve as a benchmark to cross-reference CpGs also seen as non-variable in a study of interest so that these CpGs can be filtered prior to differential DNAm analysis. Removal of these independently verified non-variable CpGs should then allow for a reduced multiple testing space and allow for more power to detect differential DNAm in the study of interest. While this approach will be immediately useful for studies of 450K data, it will also provide a blueprint for similar approaches with emerging technologies such as the Illumina EPIC array. Our filtering approach for data reduction is focused on CpG-by-CpG EWAS analyses, which are very common approaches in DNAm analysis. However, this filtering approach also has the potential to improve the performance of other analyses where a strong signal is expected at a small subset of CpGs and noise in the data is a concern. In the context of the rapidly increasing number of DNAm datasets being produced, we have made our code available so that independent non-variable CpG lists can be rapidly developed for other tissues of interest on the 450K and the EPIC as data becomes available.

Methods

Data collection

The tissue datasets were collected from Gene Expression Omnibus (GEO) [14]. In all tissues, cancer samples were excluded, as cancer is associated with high DNAm variability [15].
For individual tissues, there were a range of exclusion terms by which samples were filtered (Additional file 1: Table S2). Exclusion terms were based on whether the term indicated cancerous tissue, a tissue other than the tissue of interest or a species other than human. In general, data was downloaded as non-normalized betas, but in some cases, M values were converted to beta, and normalized data was used. Each tissue dataset was then filtered down to the minimal number of CpGs with DNAm values across all samples of a tissue (blood 469,961 CpGs, buccal epithelial cells 420,374 CpGs and placenta 484,621 CpGs).

Quality control

To remove CpGs and samples that consistently did not perform well on the 450K, CpGs were filtered if greater than 5% of samples had fewer than three beads contributing to the signal across all samples from a tissue. Samples were removed if 2.5% of CpGs in a sample had fewer than three beads contributing to the signal. Samples were also removed if they had low sample-sample correlation compared to all other samples of a tissue. One sample was filtered from placenta and four entire studies were filtered from blood (total of 158 samples from blood; see Fig. 1; Additional file 1: Table S2). The final studies and samples included are listed in Additional file 1: Table S3.

Non-variable calling

To designate a CpG as non-variable in a tissue, a threshold of 5% range in beta values (DNAm level ranging from 0 to 1) between the 10th and 90th percentile was used [16]. While effect sizes as small as 1% are used in EWAS [8, 17, 18], we used a slightly more stringent definition of change in beta of 5% as we are asking only that the population as a whole varies by at least 5% and are not testing an effect size between groups.
CpGs with less than 5% reference range of beta values in a single tissue population were considered non-variable in that tissue.

Genomic enrichment

To explore the genomic context of non-variable CpGs, all CpGs were associated with gene features using the annotation described previously [19] and with CpG island features as provided in the Illumina annotation [2]. The count of non-variable CpGs located in each gene feature (promoter, intragenic, 3 prime region and intergenic) and CpG island feature (island, north and south shore, north and south shelf, and no island association) were compared to the background counts of all CpGs measured, in each tissue. To compare the non-variable CpG counts to the background in each region, 1000 permutations of random CpG lists were used to calculate fold change values over the background [20].

Application of data reduction method

To reproduce the published findings of AHRR DNA methylation changes associated with smoke exposure, a linear modelling approach was used on previously published data [21]. In short, DNAm values were normalized using BMIQ [22], and cell composition was normalized between blood samples [23, 24]. A linear model was run at all CpG sites and delta beta effect sizes were calculated between smokers and non-smokers in the full dataset of 111 blood samples. To simulate a study with reduced power, ten permutations of 24 random samples (12 smokers and 12 non-smokers) were selected and the same linear model was run at all CpGs. To test the data reduction method, the CpGs in the ten smaller cohorts were filtered to 374,945 variable CpGs by overlapping the CpGs that were non-variable in GSE53045 (264,578 CpGs non-variable at a reference range of 0.05) and the blood non-variable CpGs identified in the independent samples (114,204 CpGs described above). Then, the same linear model was run on only variable CpGs.
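The non-variable calling and overlap filtering described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `call_nonvariable` and `filter_confirmed`, the toy beta matrix and the CpG labels are all hypothetical. The reference range is taken here as the 90th minus the 10th percentile of beta values, and a CpG is dropped only if it is non-variable both in the study data and in an independent reference list.

```python
import numpy as np

def call_nonvariable(betas, cpg_ids, threshold=0.05):
    """Return the CpG IDs whose 10th-90th percentile range of beta is below threshold.

    betas: 2-D array with one row per CpG and one column per sample (values in [0, 1]).
    """
    p10, p90 = np.percentile(betas, [10, 90], axis=1)
    reference_range = p90 - p10
    return {cpg for cpg, r in zip(cpg_ids, reference_range) if r < threshold}

def filter_confirmed(betas, cpg_ids, reference_nonvariable, threshold=0.05):
    """Drop CpGs that are non-variable in the study data AND in the independent list."""
    study_nonvariable = call_nonvariable(betas, cpg_ids, threshold)
    to_remove = study_nonvariable & set(reference_nonvariable)
    keep = np.array([cpg not in to_remove for cpg in cpg_ids])
    kept_ids = [cpg for cpg, k in zip(cpg_ids, keep) if k]
    return betas[keep], kept_ids, to_remove

# Toy study data (hypothetical labels, 4 CpGs x 5 samples).
cpg_ids = ["cpg_a", "cpg_b", "cpg_c", "cpg_d"]
betas = np.array([
    [0.01, 0.02, 0.01, 0.03, 0.02],  # narrow range, also in the reference list
    [0.10, 0.45, 0.80, 0.30, 0.60],  # clearly variable
    [0.95, 0.96, 0.97, 0.95, 0.96],  # narrow range, also in the reference list
    [0.90, 0.91, 0.92, 0.90, 0.91],  # narrow range here, but not in the reference list
])
reference_list = {"cpg_a", "cpg_c"}  # stands in for an independent tissue list

filtered, kept_ids, removed = filter_confirmed(betas, cpg_ids, reference_list)
print(kept_ids)         # CpGs carried forward to the linear model
print(sorted(removed))  # CpGs confirmed non-variable in both sources
```

Note that `cpg_d` is retained even though it is non-variable in the toy study data, because the independent list does not confirm it. Requiring agreement between the two sources is the design choice that distinguishes this filter from one based on within-study variability alone.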
CpGs were associated to genes as previously described [19].

Results

Tissues showed similar levels of non-variable CpGs

DNAm data from publicly available studies was collected for blood, buccal epithelial cells and placenta (21, 3 and 4 studies, respectively). Meta-analysis of samples for each of the tissues showed generally high correlations (70% of sample pairs correlated above 0.95). While there were some samples with higher within study correlations than across study correlations, the overall high correlation of cross study samples can be taken as evidence of the consistency of the 450K across research groups (Fig. 1). While four studies of blood were removed due to low correlation, no obvious explanation of the lack of correlation could be found in the available study characteristic information (Additional file 1: Table S2). The generally high concordance of the DNAm samples from the same tissue but different studies gives us confidence going forward in the appropriateness of comparing variability across studies. After quality control of the data, 605, 121 and 157 samples were used from blood, buccal epithelial cells and placenta, respectively.

Fig. 1 Quality control of samples from GEO for each tissue type. a Heat maps showing sample-sample correlation values. Side colours show the study ID of each sample, and samples are ordered by study ID. b Plots of the average sample-sample correlation for each sample to show possible outliers and studies with overall low average sample-sample correlation

A substantial number of tissue-specific non-variable CpGs were identified, thus providing a solid baseline for potential removal from studies of interest to reduce dimensionality. The total number of non-variable CpGs was similar across tissues: blood 114,204 (24%), buccal epithelial cells 120,009 (29%) and placenta 101,367 (21%) and showed a significant overlap of 42,315 non-variable CpGs (permutation p < 0.0001; Fig. 2a).
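A permutation test of this kind of three-way overlap could look like the sketch below. The text does not spell out the null model used, so this is an assumption: random lists of the same sizes are drawn without replacement from a shared background, and the empirical p value is the share of draws whose three-way overlap is at least as large as the observed one. The function name `overlap_permutation_p` and the toy integer CpG indices are hypothetical.

```python
import numpy as np

def overlap_permutation_p(lists, background, n_perm=1000, seed=0):
    """Empirical p value for the overlap shared by all lists.

    Assumed null: each list is a random draw (same size, without replacement)
    from the common background of measured CpGs.
    """
    rng = np.random.default_rng(seed)
    background = np.array(sorted(background))
    sets = [set(items) for items in lists]
    observed = len(set.intersection(*sets))
    sizes = [len(s) for s in sets]
    hits = 0
    for _ in range(n_perm):
        draws = [set(rng.choice(background, size=n, replace=False)) for n in sizes]
        if len(set.intersection(*draws)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy example: 200 background CpGs, three lists of 40 that share 20 CpGs.
background = range(200)
shared = list(range(20))
blood = shared + list(range(20, 40))
buccal = shared + list(range(40, 60))
placenta = shared + list(range(60, 80))

observed, p_value = overlap_permutation_p([blood, buccal, placenta], background)
print(observed, p_value)
```

With the add-one correction the smallest attainable p value is 1 / (n_perm + 1), so the number of permutations bounds how small a p value can be reported under this scheme.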
Non-variable CpGs existed in either a fully methylated or an unmethylated state, with few non-variable CpGs observed at an intermediate DNAm level. In all tissues, the 99th percentile of non-variable CpGs had a mean DNAm greater than 0.80 or less than 0.16 (Fig. 2b). To test robustness of the non-variable CpG lists, we compared the list of non-variable CpGs prior to processing with a similar list generated after normalization or after cell type correction. We found that non-variable CpG lists overlapped by 90% with all processing strategies (Additional file 1).

While exploring the biological role of non-variable CpGs was not the primary focus of this analysis, we did observe that non-variable CpGs from each tissue were significantly enriched in promoters and CpG islands (relative enrichment = 2.46–8.20, false discovery rate (FDR) = 0.01; Fig. 3), with maximum enrichment in blood and lowest enrichment in placenta. Based on the large overlap in and similar genomic localization of non-variable CpGs between the three tissues, it is likely that the non-variable CpGs identified have similar underlying properties in each tissue.

Application of data reduction method to smoking cohort

To test the utility of filtering non-variable CpGs as a dimensionality reduction method capable of improving statistical power and sensitivity, we attempted to demonstrate the gain in statistical power in reproducing a well-accepted true positive DNAm modification associated with smoking. In particular, one of the most reproducible biomarkers in DNAm association studies to date is decreased DNAm associated with smoke exposure at two CpGs in the gene body of AHRR [21, 25–28]. To validate our data reduction method, we used the AHRR signal in response to smoke exposure as a true positive.
By reanalyzing all 111 blood samples available with smoking status in the original unfiltered data set (GSE53045) [21], we reproduced the finding of significantly decreased DNAm at two CpGs (cg05575921, cg23576855; FDR <0.05, delta beta 0.1) in AHRR. Interestingly, the non-variable CpGs often reached statistical significance (Fig. 4a), supporting the idea that targeted removal of non-variable CpGs from EWAS improves specificity and reduces spurious associations.

Fig. 2 Non-variable CpGs had similar characteristics in all tissues. a Venn diagram showing the overlap of non-variable CpGs between tissues. b Methylation levels of representative non-variable CpGs from each tissue

To simulate a less powered study of smoke exposure, we randomly sampled the cohort down to 24 samples (12 smokers, 12 non-smokers) ten times. The same linear model, as used in the full cohort, was run on each of the ten randomly sampled smaller cohorts, but with either all 485,512 CpGs included in the EWAS or with filtering of 110,567 non-variable CpGs (filtered EWAS). This resulted in several interesting insights. First, in nine of the ten low powered EWAS sub samples, the multiple test corrected p values of the two true positive AHRR CpGs of interest were smaller in the filtered data set (Fig. 4b). Second, beyond AHRR, only six out of ten sub samples had any significantly differentially methylated CpGs regardless of whether we used filtered or unfiltered data (FDR <0.05, delta beta 0.1). Third, in five of these six, the filtered data set EWAS resulted in more CpGs with significant differential DNAm.
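The first observation above, smaller multiple test corrected p values after filtering, is what one would expect from how an FDR adjustment scales with the number of tests. The sketch below assumes a Benjamini-Hochberg adjustment, which the text does not name explicitly, and uses made-up p values: one signal CpG and 99 null CpGs, a third of which are flagged as non-variable by a hypothetical independent list. The function name `bh_adjust` and every number in the example are illustrative assumptions.

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity: each value is the minimum over all larger ranks.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

# Made-up raw p values: index 0 is the signal CpG, the remaining 99 are nulls.
raw_p = np.concatenate([[0.0006], np.linspace(0.01, 1.0, 99)])

# Hypothetical independent non-variable flags covering every third null CpG.
nonvariable = np.zeros(raw_p.size, dtype=bool)
nonvariable[1::3] = True

adj_unfiltered = bh_adjust(raw_p)
adj_filtered = bh_adjust(raw_p[~nonvariable])

print(round(adj_unfiltered[0], 4))  # signal CpG tested alongside all 100 CpGs
print(round(adj_filtered[0], 4))    # same raw p value, 67 CpGs tested
```

In this toy case the signal CpG moves from an adjusted value of 0.06 to about 0.040 without any change in its raw p value, only because fewer tests are corrected for. The gain is only legitimate when the filter is independent of the test statistic under the null, which is the reason for preferring an independently derived list over within-study variability alone [13].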
The greater significance of AHRR in the filtered EWAS suggested that filtration of non-variable CpGs should allow for prioritization of true positives, potentially even when the differential DNAm signal is not as strong as AHRR in smoking.

Discussion

Here, we have developed an empirically derived dimensionality reduction method for EWAS, which can reduce noise in 450K data from tissue specific non-variable CpGs. Our proposed method for removing our empirically identified non-variable CpGs is to first confirm that they are also non-variable in the new dataset of interest and remove only those CpGs which are confirmed as non-variable, as presented in the analysis of the AHRR signal in response to smoke exposure. This procedure would avoid removing CpGs that were non-variable in the data collected previously, but do in fact vary in new data being analyzed from the tissue. Generally, previous analyses on 450K data have either filtered based on variability within the study data or not filtered the data on variability at all. We consider our filtration method to be a more moderate compromise between false positives and false negatives. Our method is less biased toward false positives than filtering based on variability just in the study data, and also less likely to result in false negatives due to severe multiple test correction than when no variability filter is used at all [13].

In defining our non-variable CpG list, we were agnostic to normalization methods and did not correct for batch effects between laboratories, beyond removing samples with low sample-sample correlations. We have therefore left in variability in the data due to technical factors that would have been minimized had we combined the data for normalization and performed batch correction.
Our list of non-variable CpGs is thus conservative, but should be robust to study specific technical variability, increasing its utility in the community.

Fig. 3 Non-variable CpGs were enriched in CpG islands and promoters. All plots show the enrichment fold change of non-variable CpGs compared to all CpGs available for a tissue. Each pair of plots shows the fold changes in gene regions (top) and CpG resort features (bottom). a Blood non-variable CpGs. b Buccal epithelial cell non-variable CpGs. c Placenta non-variable CpGs

We have demonstrated the utility of the filtration in the analysis of smoke exposure in GSE53045, through the successful identification of differential DNAm at the true positive AHRR and the identification of more CpGs genome wide with significant differential DNAm. We do not propose simply observing more CpGs with differential DNAm as a good metric for the utility of our data reduction method, as some of the significant CpGs identified with filtration will be false positives.
However, in combination with the observation of significant differential DNAm at the true positive, AHRR, more consistently with filtration, we are confident that our data reduction method will have utility in allowing identification of replicable differential DNAm in other datasets. Filtering for data reduction will be particularly useful when there is an expectation of CpGs with strong differential methylation signals (>5%); so, the expected magnitude of DNAm change should be carefully considered by the researcher before applying any data reduction. In concert with a stringent biological filter for the change in DNAm level between groups (5–10%) [1, 29], and validation of the 450K results with another technology such as pyrosequencing [1], this tissue specific DNAm data dimensionality reduction method may allow for better and more stringent identification of epigenetic signatures of exposure or disease.

Conclusions

While the ability to define a tissue specific non-variable list will ultimately depend on the amount of data available for the tissue in public repositories, we expect there are already other tissues of interest with sufficient 450K data for which a useful list of non-variable CpGs could be developed. We have therefore made our code for building tissue specific non-variable lists available on GitHub (github.com/redgar598/Tissue_Nonvariable_450K_CpGs). We hope our analysis can be reapplied in the future to update the non-variable CpGs lists for blood, buccal epithelial cells and placenta as more samples become available, and be expanded to more tissues. Additionally, with the increased dimensionality of the newly released Illumina Infinium EPIC array, the need for tissue specific dimensionality reduction will be even greater. The analysis we have outlined and made available can easily be applied to EPIC array datasets as more are released [30].

Additional file

Additional file 1: Table S1. Gene expression omnibus data description and additional analysis.
Terms used to exclude samples not of interest in a given tissue. Table S2. Quality control filters for each tissue and the resulting final study, sample and CpG numbers. Table S3. Series IDs of the final samples used in the meta-analysis of tissue non-variable CpGs. Additional analysis on the stability of the non-variable CpG list with different data processing approaches. (PDF 63 kb)

Fig. 4 Multiple test corrected p values were lower in the filtered EWAS. a Volcano plot of the differential methylation analysis between smoking and non-smoking samples, with no filtering of non-variable CpGs. Vertical lines indicate a DNAm difference of 0.1. The horizontal line represents an FDR corrected p value of 0.05. Points are coloured to highlight CpGs exceeding both the biological and statistical cutoffs. Points with a black outline are CpGs found to be non-variable in blood. b Shown are the multiple test corrected p values (FDR) for the two CpGs of interest in AHRR. Lines connect FDR values between paired permutation sub-samples to show the trend between paired cohorts. The horizontal line shows the FDR value of 0.05

Abbreviations

450K: Illumina 450K human methylation array; CpG: Cytosine-guanine pair; DNAm: DNA methylation; EWAS: Epigenome wide association studies; FDR: False discovery rate; GEO: Gene expression omnibus

Acknowledgements

The authors would like to thank Dr. Magda Price and Dr. Sara Mostafavi for their thoughts and careful consideration of the approach taken in this work.
Thank you to the researchers who kindly deposited their data in public repositories; without their contribution this work would not be possible.

Funding

Support for this work was provided by Canadian Institutes of Health Research (FRN146331); R. Howard Webster Foundation (F13-00031) and W. Garfield Weston Foundation/Brain Canada Foundation (F13-02369).

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

Authors’ contributions

RDE performed the data analysis and drafted the manuscript. RDE, MJJ, WPR and MSK wrote, reviewed and edited the manuscript. WPR and MSK conceived and coordinated the study. All authors have approved the final version.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Not applicable.

Received: 7 October 2016 Accepted: 31 January 2017

References

1. Michels KB, Binder AM, Dedeurwaerder S, Epstein CB, Greally JM, Gut I, Houseman EA, Izzi B, Kelsey KT, Meissner A, Milosavljevic A, Siegmund KD, Bock C, Irizarry RA. Recommendations for the design and analysis of epigenome-wide association studies. Nat Methods. 2013;10(10):949–55.
2. Bibikova M, Barnes B, Tsan C, Ho V, Klotzle B, Le JM, Delano D, Zhang L, Schroth GP, Gunderson KL, Fan JB, Shen R. High density DNA methylation array with single CpG site resolution. Genomics. 2011;98(4):288–95.
3. Byun HM, Siegmund KD, Pan F, Weisenberger DJ, Kanel G, Laird PW, Yang AS. Epigenetic profiling of somatic tissues from human autopsy specimens identifies tissue- and individual-specific DNA methylation patterns. Hum Mol Genet. 2009;18(24):4808–17.
4. Glossop JR, Nixon NB, Emes RD, Haworth KE, Packham JC, Dawes PT, Fryer AA, Mattey DL, Farrell WE. Epigenome-wide profiling identifies significant differences in DNA methylation between matched-pairs of T- and B-lymphocytes from healthy individuals. Epigenetics. 2013;8(11):1188–97.
5. Duong CV, Emes RD, Wessely F, Yacqub-Usman K, Clayton RN, Farrell WE. Quantitative, genome-wide analysis of the DNA methylome in sporadic pituitary adenomas. Endocr Relat Cancer. 2012;19(6):805–16.
6. Fryer AA, Emes RD, Ismail KM, Haworth KE, Mein C, Carroll WD, Farrell WE. Quantitative, high-resolution epigenetic profiling of CpG loci identifies associations with cord blood plasma homocysteine and birth weight in humans. Epigenetics. 2011;6(1):86–94.
7. Lam LL, Emberly E, Fraser HB, Neumann SM, Chen E, Miller GE, Kobor MS. Factors underlying variable DNA methylation in a human community cohort. Proc Natl Acad Sci. 2012;109(suppl2):17253–60.
8. Esposito EA, Jones MJ, Doom JR, MacIsaac JL, Gunnar MR, Kobor MS. Differential DNA methylation in peripheral blood mononuclear cells in adolescents exposed to significant early but not later childhood adversity. Dev Psychopathol. 2016;28(4pt2):1–15.
9. Meng H, Joyce AR, Adkins DE, Basu P, Jia Y, Li G, Sengupta TK, Zedler BK, Murrelle EL, van den Oord EJ. A statistical method for excluding non-variable CpG sites in high-throughput DNA methylation profiling. BMC Bioinformatics. 2010;11:227.
10. Farré P, Jones MJ, Meaney MJ, Emberly E, Turecki G, Kobor MS. Concordant and discordant DNA methylation signatures of aging in human blood and brain. Epigenetics Chromatin. 2015;8:19.
11. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559.
12. Horvath S. DNA methylation age of human tissues and cell types. Genome Biol. 2013;14(10):115.
13. Bourgon R, Gentleman R, Huber W. Independent filtering increases detection power for high-throughput experiments. Proc Natl Acad Sci U S A. 2010;107(21):9546–51.
14. Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002;30(1):207–10.
15. Hansen KD, Timp W, Bravo HC, Sabunciyan S, Langmead B, McDonald OG, Wen B, Wu H, Liu Y, Diep D, Briem E, Zhang K, Irizarry RA, Feinberg AP. Increased methylation variation in epigenetic domains across cancer types. Nat Genet. 2011;43(8):768–75.
16. Lemire M, Zaidi SHE, Ban M, Ge B, Assi D, Germain M, Kassam I, Wang M, Zanke BW, Gagnon F, Morange PE, Trégouët DA, Wells PS, Sawcer S, Gallinger S, Pastinen T, Hudson TJ. Long-range epigenetic regulation is conferred by genetic variation located at thousands of independent loci. Nat Commun. 2015;6:6326.
17. Rakyan VK, Beyan H, Down TA, Hawa MI, Maslau S, Aden D, Daunay A, Busato F, Mein CA, Manfras B, Dias KRM, Bell CG, Tost J, Boehm BO, Beck S, Leslie RD. Identification of type 1 diabetes associated DNA methylation variable positions that precede disease diagnosis. PLoS Genet. 2011;7(9):e1002300.
18. Stringhini S, Polidoro S, Sacerdote C, Kelly RS, van Veldhoven K, Agnoli C, Grioni S, Tumino R, Giurdanella MC, Panico S, Mattiello A, Palli D, Masala G, Gallo V, Castagné R, Paccaud F, Campanella G, Chadeau-Hyam M, Vineis P. Life-course socioeconomic status and DNA methylation of genes regulating inflammation. Int J Epidemiol. 2015;44(4):1320–30.
19. Edgar R, Tan PPC, Portales-Casamar E, Pavlidis P. Meta-analysis of human methylomes reveals stably methylated sequences surrounding CpG islands associated with high gene expression. Epigenetics Chromatin. 2014;7(1):28.
20. Hannon E, Spiers H, Viana J, Pidsley R, Burrage J, Murphy TM, Troakes C, Turecki G, O’Donovan MC, Schalkwyk LC, Bray NJ, Mill J. Methylation QTLs in the developing brain and their enrichment in schizophrenia risk loci. Nat Neurosci. 2016;19(1):48–54.
21. Dogan MV, Shields B, Cutrona C, Gao L, Gibbons FX, Simons R, Monick M, Brody GH, Tan K, Beach SRH, Philibert RA. The effect of smoking on DNA methylation of peripheral blood mononuclear cells from African American women. BMC Genomics. 2014;15:151.
22. Teschendorff AE, Marabita F, Lechner M, Bartlett T, Tegner J, Gomez-Cabrero D, Beck S. A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics. 2013;29(2):189–96.
23. Houseman EA, Accomando WP, Koestler DC, Christensen BC, Marsit CJ, Nelson HH, Wiencke JK, Kelsey KT. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics. 2012;13:86.
24. Jones MJ, Islam SA, Edgar RD, Kobor MS. Adjusting for cell type composition in DNA methylation data using a regression-based approach. Methods Mol Biol. 2015. doi:10.1007/7651_2015_262.
25. Monick MM, Beach SRH, Plume J, Sears R, Gerrard M, Brody GH, Philibert RA. Coordinated changes in AHRR methylation in lymphoblasts and pulmonary macrophages from smokers. Am J Med Genet B Neuropsychiatr Genet. 2012;159B(2):141–51.
26. Philibert RA, Beach SRH, Lei MK, Brody GH. Changes in DNA methylation at the aryl hydrocarbon receptor repressor may be a new biomarker for smoking. Clin Epigenetics. 2013;5(1):19.
27. Shenker NS, Polidoro S, van Veldhoven K, Sacerdote C, Ricceri F, Birrell MA, Belvisi MG, Brown R, Vineis P, Flanagan JM. Epigenome-wide association study in the European Prospective Investigation into Cancer and Nutrition (EPIC-Turin) identifies novel genetic loci associated with smoking. Hum Mol Genet. 2013;22(5):843–51.
28. Bauer M, Fink B, Thürmann L, Eszlinger M, Herberth G, Lehmann I. Tobacco smoking differently influences cell types of the innate and adaptive immune system: indications from CpG site methylation. Clin Epigenetics. 2016;8:83.
29. Tsai PC, Bell JT. Power and sample size estimation for epigenome-wide association scans to detect differential DNA methylation. Int J Epidemiol. 2015;44(4):1429–41.
30. Moran S, Arribas C, Esteller M. Validation of a DNA methylation microarray for 850,000 CpG sites of the human genome enriched in enhancer sequences. Epigenomics. 2016;8(3):389–99.

