UBC Faculty Research and Publications

Comparison of weighting approaches for genetic risk scores in gene-environment interaction studies
Hüls, Anke; Krämer, Ursula; Carlsten, Christopher; Schikowski, Tamara; Ickstadt, Katja; Schwender, Holger
Dec 16, 2017

METHODOLOGY ARTICLE Open Access

Comparison of weighting approaches for genetic risk scores in gene-environment interaction studies

Anke Hüls1,2*, Ursula Krämer1, Christopher Carlsten3,4,5, Tamara Schikowski1, Katja Ickstadt2† and Holger Schwender6†

Abstract

Background: Weighted genetic risk scores (GRS), defined as weighted sums of risk alleles of single nucleotide polymorphisms (SNPs), are statistically powerful for detecting gene-environment (GxE) interactions. To assign weights, the gold standard is to use external weights from an independent study. However, appropriate external weights are not always available. In such situations and in the presence of predominant marginal genetic effects, we have shown in a previous study that GRS with internal weights from marginal genetic effects (“GRS-marginal-internal”) are a powerful and reliable alternative to single SNP approaches or the use of unweighted GRS. However, this approach might not be appropriate for detecting predominant interactions, i.e. interactions showing an effect stronger than the marginal genetic effect.

Methods: In this paper, we present a weighting approach for such predominant interactions (“GRS-interaction-training”) in which parts of the data are used to estimate the weights from the interaction terms and the remaining data are used to determine the GRS. We conducted a simulation study for the detection of GxE interactions in which we evaluated power, type I error and sign-misspecification. We compared this new weighting approach to the GRS-marginal-internal approach and to GRS with external weights.

Results: Our simulation study showed that in the absence of external weights and with predominant interaction effects, the highest power was reached with the GRS-interaction-training approach. If marginal genetic effects were predominant, the GRS-marginal-internal approach was more appropriate.
Furthermore, the power to detect interactions reached by the GRS-interaction-training approach was only slightly lower than the power achieved by GRS with external weights. The power of the GRS-interaction-training approach was confirmed in a real data application to the Traffic, Asthma and Genetics (TAG) Study (N = 4465 observations).

Conclusion: When appropriate external weights are unavailable, we recommend using internal weights from the study population itself to construct weighted GRS for GxE interaction studies. If the SNPs were chosen because a strong marginal genetic effect was hypothesized, GRS-marginal-internal should be used. If the SNPs were chosen because of their collective impact on the biological mechanisms mediating the environmental effect (hypothesis of predominant interactions), GRS-interaction-training should be applied.

Keywords: Polygenic approach, Training dataset, Internal weights, External weights, Simulation study, Power, Type I error

* Correspondence: Anke.Huels@IUF-Duesseldorf.de
† Equal contributors
1 IUF-Leibniz Research Institute for Environmental Medicine, Düsseldorf, Germany
2 Faculty of Statistics, TU Dortmund University, Dortmund, Germany
Full list of author information is available at the end of the article

© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Hüls et al. BMC Genetics (2017) 18:115 DOI 10.1186/s12863-017-0586-3

Background

For many diseases, genetic influences are exceedingly complex and cannot be explained by simple Mendelian modes of inheritance only. Moreover, genetic and environmental factors may jointly contribute to susceptibility. This underlines the importance of analyzing gene-environment (GxE) interactions, which can be defined as “a different effect of environmental exposure in disease risk in persons with different genotypes” [1].

Since most complex diseases are influenced by hundreds of genetic variants, each having a small effect on its own, polygenic approaches that deal with the genetic basis en masse often access more of the heritable component of complex traits than is possible by single-variant approaches [2]. The most common polygenic approach is the weighted genetic risk score (GRS) approach, in which a weighted GRS is calculated from a pre-selected number of genetic variants to define a person’s individual genetic risk for disease development [3]. One of the first GRS applications was published by Purcell et al., who used GRS to argue that schizophrenia has a polygenic risk [4]. Although their genome-wide association study (GWAS) identified few individually significant single nucleotide polymorphisms (SNPs), they provided evidence for a substantial polygenic component to risk of schizophrenia involving thousands of common alleles of very small effect. In addition, GRS show promise for patient stratification and subphenotyping [2]. Hamshere et al. showed that among bipolar disorder cases GRS for schizophrenia risk could distinguish schizo-affective cases from others [5]. Moreover, GRS were successfully used in interaction analyses to examine the genetic susceptibility to air pollution-induced type 2 diabetes [6], air pollution-induced airway inflammation [7] and fried food-induced obesity [8].

The high power of GRS approaches to detect GxE interactions has been confirmed in a recent methodological paper by Aschard [9].
In this publication, Aschard showed that if most interaction effects point in the same direction, the use of GRS increases the power to detect GxE interactions in comparison to the common univariate single-variant approaches, e.g. with Bonferroni correction, and the joint test of main genetic and interaction effects [9, 10]. Furthermore, by combining SNPs of a certain biological pathway, GRS can be used as a simple statistical approach for the complex biological pathways through which environment-induced diseases might be caused [7].

GRS have been employed to summarize genetic effects among an ensemble of markers that do not individually achieve significance and to estimate the variance explained by a marker panel [3]. In these applications, the gold standard is to use external weights, e.g. marginal genetic effects estimated in an independent study population [3, 11].

In a recent publication, we presented a new GRS approach that can be applied if no appropriate external weights are available and the marginal genetic effects are predominant, which means that the marginal genetic effects are stronger than the interaction effects [12]. In this approach, we used GRS with internal weights from the marginal genetic effects of the study itself and showed that using these GRS substantially increased the power to detect gene-environment interactions compared to the common single SNP approach and to the use of unweighted GRS, while the type I error remained well controlled [12]. In addition, GRS with weights from the marginal genetic effects estimated with elastic net regression [13] were able to handle a large number of correlated SNPs as well as noise SNPs, i.e. SNPs having no effect on the outcome of interest.
Applying this approach to an epidemiological study, we showed in a study population of only 402 women that genetic variation in the endoplasmic reticulum (ER) stress pathway might play a role in air pollution-induced inflammation in the lung [7].

However, in scenarios with predominant interaction effects, a better approach might be to split the data into test and training data, using the training data to estimate the weights from the interaction term itself and the remaining test data to determine the GRS. Dudbridge (2013) evaluated a GRS approach in which the data were split into test and training data for the detection of marginal genetic effects [3]. Dudbridge reported that the optimal balance of sample sizes between training and test data sets is close to one-half regardless of the proportion of noise SNPs or the p-value threshold [3]. Therefore, given an initial sample to be split into training and test subsets, an obvious rule of thumb is to make an even split [3]. However, to the best of our knowledge, this approach has never been evaluated for the detection of GxE interactions.

The aim of the current study is to present a new GRS approach for GxE interaction studies, called GRS-interaction-training, in which the weights are obtained from the interaction terms in a training dataset that is split off from the sample data and the remaining test data are used to determine the GRS. We performed a simulation study on the detection of gene-environment interactions in which we compared the performance of GRS-interaction-training to GRS with external weights (gold standard) and to weighted GRS-marginal-internal [12]. We considered scenarios with predominant marginal genetic effects and smaller additional GxE interaction effects, and vice versa. We simulated scenarios with an increasing number of noise SNPs (up to 200) and with varying minor allele frequencies.

Moreover, we applied these different weighting approaches to a real data set from the Traffic, Asthma and Genetics (TAG) Study (N = 4465 observations in a pooled dataset across six birth cohorts) concerned with investigating the role of genetic variation of the oxidative stress and inflammation pathway on air pollution-induced asthma at school age.

Methods

Determination of weighted GRS

Weighted GRS (GRSi) are defined as weighted sums of the number of risk alleles (coded as 0, 1, 2) of k considered SNPs (gi1, …, gik) for the n subjects (i = 1, …, n):

$$\mathrm{GRS}_i = w_1 g_{i1} + \dots + w_k g_{ik}. \tag{1}$$

The most common weighting approach is to use external weights w1, …, wk, e.g. marginal genetic effects of the k SNPs estimated in an independent study population [3, 11].

Genome-wide meta-analyses that provide the combined effect estimates of a range of independent studies are usually preferred, followed by meta-analyses that only include a selected number of SNPs identified to be relevant for the phenotype, and by GWAS in large single cohorts. Determining weights from two or more different external studies should be treated with caution because effect estimates from different cohorts are often not comparable, e.g. due to differences in study design, ethnicity or phenotype definitions.

A limitation of GRS with external weights is that we can only include SNPs for which the marginal genetic effects have been published. In this regard, GRS with external weights are usually restricted to SNPs with a genome-wide significant (p-value < 5 × 10⁻⁸) marginal genetic effect in the external study population, whereas SNPs with a predominant interaction effect are usually not presented.
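
The weighted sum in eq. (1) can be illustrated with a minimal Python sketch. The function name `weighted_grs`, the toy genotypes and the toy weights are hypothetical and chosen only for illustration; they are not taken from the study, whose analyses were run in R.

```python
from typing import List, Sequence


def weighted_grs(genotypes: Sequence[Sequence[int]],
                 weights: Sequence[float]) -> List[float]:
    """Return GRS_i = w_1*g_i1 + ... + w_k*g_ik for every subject i (eq. 1)."""
    k = len(weights)
    scores = []
    for row in genotypes:
        if len(row) != k:
            raise ValueError("each subject needs one genotype per weight")
        if any(g not in (0, 1, 2) for g in row):
            raise ValueError("risk-allele counts must be coded as 0, 1 or 2")
        # Weighted sum of risk-allele counts for this subject.
        scores.append(sum(w * g for w, g in zip(weights, row)))
    return scores


# Hypothetical toy data: 3 subjects, 3 SNPs, illustrative weights.
toy_genotypes = [[0, 1, 2], [2, 0, 0], [1, 1, 1]]
toy_weights = [0.5, 0.0, 0.25]
toy_scores = weighted_grs(toy_genotypes, toy_weights)
```

A weight of zero, as for the second toy SNP, removes that SNP from the score. Setting all weights to the same value would correspond to an unweighted GRS up to scaling.
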
Furthermore, large-scale GWAS have not been published for every phenotype, and sometimes they have been conducted only in populations that differ in ethnicity, sex or age range.

GRS-marginal-internal approach

If no appropriate external weights are available, one approach that we developed recently is to estimate the weights w1, …, wk from the internal marginal genetic effects of the study sample itself [12], called GRS-marginal-internal.

In this approach, the weights (w1, …, wk) = (β̂1, …, β̂k) in eq. (1) are estimated internally from a multivariate elastic net regression analysis [13–15] for the combined marginal genetic effect of k pathway-related SNPs on the health outcome y in the study population itself. In the elastic net regression model, the values of the unknown parameters for the intercept β0 and the marginal genetic effects of the k SNPs βj (j = 1, …, k) can be estimated by minimizing the sum of the residual sum of squares and a penalty term:

$$\left(\hat{\beta}_0, \hat{\beta}\right) = \underset{\beta_0,\,\beta}{\arg\min} \left[ \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{k} \beta_j G_{ij} \right)^2 + P(\lambda, \beta) \right]. \tag{2}$$

Here, G = (gi1, …, gik) is an n × k matrix holding the k considered SNPs for the n subjects, and the penalty function

$$P(\lambda, \beta) := \lambda \sum_{j=1}^{k} \left( \tfrac{1}{2}(1-\alpha)\,\beta_j^2 + \alpha\,|\beta_j| \right)$$

is a combination of the lasso and ridge regression penalties. We used cross-validation to find the optimal value of the regularization parameter λ, i.e. the largest λ value such that the cross-validated mean squared error is within 1 standard error (SE) of the minimum (minMSE), as implemented in the R package glmnet [14] and recommended in [15]. The penalty weight α can be chosen between 0 and 1. The elastic net with a penalty weight of α = 1 is identical to the lasso regression, whereas the elastic net with α = 0 is identical to the ridge regression [15]. Since we showed in our recent publication that the penalty weight α only has a minor impact on power and type I error for the detection of interactions [12], we chose a penalty weight of α = 0.5 in this publication to obtain a good balance between ridge and lasso regression.
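
To make the penalty term and the one-standard-error choice of λ concrete, the following Python sketch evaluates P(λ, β) for given coefficients and picks λ from a grid of cross-validation results. It is only an illustration: the study itself used the R package glmnet, and the function names, coefficients and cross-validation numbers below are hypothetical.

```python
from typing import Sequence


def elastic_net_penalty(beta: Sequence[float], lam: float, alpha: float) -> float:
    """P(lambda, beta) = lambda * sum_j(0.5*(1-alpha)*beta_j^2 + alpha*|beta_j|)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 (ridge) and 1 (lasso)")
    return lam * sum(0.5 * (1.0 - alpha) * b * b + alpha * abs(b) for b in beta)


def lambda_one_se(lambdas: Sequence[float],
                  cv_mean: Sequence[float],
                  cv_se: Sequence[float]) -> float:
    """Largest lambda whose mean CV error is within 1 SE of the minimum."""
    best = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[best] + cv_se[best]
    return max(l for l, m in zip(lambdas, cv_mean) if m <= threshold)


# Hypothetical coefficients and cross-validation results, for illustration only.
beta = [1.0, -2.0, 0.0]
lasso_penalty = elastic_net_penalty(beta, lam=0.5, alpha=1.0)   # pure lasso
ridge_penalty = elastic_net_penalty(beta, lam=0.5, alpha=0.0)   # pure ridge
mixed_penalty = elastic_net_penalty(beta, lam=0.5, alpha=0.5)   # alpha used in this paper
chosen_lambda = lambda_one_se(
    lambdas=[1.0, 0.5, 0.1, 0.01],
    cv_mean=[2.0, 1.3, 1.2, 1.25],
    cv_se=[0.2, 0.15, 0.15, 0.15],
)
```

In the toy grid, the minimum error occurs at λ = 0.1, but λ = 0.5 is still within one standard error of it and is therefore selected, which yields a sparser model.
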
Zou and Hastie proposed the elastic net penalty for linear regression models [13]; it was later extended to logistic regression and multinomial regression [14] and to Cox regression [16].

GRS-interaction-training approach

In scenarios with predominant interaction effects, i.e. in scenarios in which the GxE interaction effects are stronger than the marginal genetic effects, a better approach might be to use the coefficients from the interaction terms to determine the weights instead of using the marginal genetic effect estimates.

In this new approach, which we call the GRS-interaction-training approach, SNPs receive a larger weight the more strongly they interact with the environmental exposure.

Up to now, the use of training and test datasets for the construction of GRS has only been described for the detection of marginal genetic effects. If GRS are used to estimate marginal genetic effects, Dudbridge pointed out that the weights must be estimated from the marginal genetic effects in a training sample and be used to construct a GRS in an independent test dataset [3]. Along the same lines, Burgess et al. showed that using internal weights instead of weights from a training dataset should be avoided because it leads to biased effect estimates [17, 18].

Transferring this knowledge to GxE interaction analyses that use GRS with weights from the interaction term itself, it is necessary to estimate these internal interaction weights in an independent training sample as well.

In the first step of the GRS-interaction-training approach, the initial sample is split randomly into a training dataset and a test dataset.
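
The random split in this first step can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the function name `split_train_test`, the seed handling and the rule of rounding the test-set size down are assumptions made for the example.

```python
import random
from typing import List, Tuple


def split_train_test(n: int, train_parts: int, test_parts: int,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly split subject indices 0..n-1 at a ratio train_parts:test_parts."""
    # Assumption: the test-set size is rounded down to the nearest integer.
    n_test = (n * test_parts) // (train_parts + test_parts)
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # reproducible random permutation
    test = sorted(indices[:n_test])
    train = sorted(indices[n_test:])
    return train, test


# Example: N = 3000 subjects, even split (1:1) between training and test data.
train_idx, test_idx = split_train_test(3000, train_parts=1, test_parts=1)

# Test-set sizes for a cohort of 593 subjects at ratios 1:1, 1:2 and 1:3.
test_sizes_593 = [len(split_train_test(593, 1, parts)[1]) for parts in (1, 2, 3)]
```

Under the rounding-down assumption, a cohort of 593 subjects gives test-set sizes of 296, 395 and 444 for ratios of 1:1, 1:2 and 1:3, which agrees with the Ntest values reported for GINIplus in the real data application. Weights are then estimated only on the training indices, while the GRS and the interaction test use only the test indices.
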
Next, the elastic net regression is used to estimate the interaction parameters δj (j = 1, …, k) between each of the k SNPs and the environmental factor E by minimizing the sum of the residual sum of squares and a penalty term in the training data:

$$\left(\hat{\beta}_0, \hat{\beta}, \hat{\gamma}, \hat{\delta}\right) = \underset{\beta_0,\,\beta,\,\gamma,\,\delta}{\arg\min} \left[ \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{k} \beta_j G_{ij} - \gamma E_i - \sum_{j=1}^{k} \delta_j G_{ij} E_i \right)^2 + P(\lambda, \beta, \gamma, \delta) \right] \tag{3}$$

with E = (e1, …, en) being an n × 1 matrix holding the considered environmental exposure E for the n subjects, the environmental effect parameter γ and the penalty function:

$$P(\lambda, \beta, \gamma, \delta) := \lambda \left( \sum_{j=1}^{k} \left[ \left( \tfrac{1}{2}(1-\alpha)\,\beta_j^2 + \alpha\,|\beta_j| \right) + \left( \tfrac{1}{2}(1-\alpha)\,\delta_j^2 + \alpha\,|\delta_j| \right) \right] + \left( \tfrac{1}{2}(1-\alpha)\,\gamma^2 + \alpha\,|\gamma| \right) \right).$$

The remaining parameters are defined as in eq. (2). The effect estimates for the interaction terms δ̂j (j = 1, …, k) are then used as weights w1, …, wk for the GRS (see eq. (1) for the general definition of weighted GRS) in the remaining test data.

Interaction analysis

In the subsequent gene-environment interaction analysis, a generalized linear model (GLM) [19, 20] is applied to estimate the gene-environment interaction (GRSxE interaction; interaction between GRS and environmental exposure) for the same health outcome y as in eqs. (2, 3). In a GLM, y is usually assumed to be generated from a distribution in the exponential family that includes, e.g., the normal, binomial, Poisson and gamma distribution. The mean μ of this distribution depends on the independent variables X through:

$$E(Y) = \mu = g^{-1}(X\tau)$$

where E(Y) is the expected value of the random variable Y, g is the link function and X = (grsi, ei, grsi ei) is an n × 3 matrix holding the considered GRS, the environmental exposure E and the interaction between the GRS and E for the n subjects. The unknown parameter vector τ is estimated using maximum likelihood.

Simulation study

Simulation design

The data for the simulation study were generated using the function simulateSNPglm from the R package scrime [21]. Each of the simulated datasets contains six independent genetic risk factors (i.e. SNPs) and either 6, 50, 100, or 200 additional noise SNPs. The impact of more noise SNPs (up to 840) and highly correlated SNPs was discussed in our previous publication, where we showed that weighted GRS with weights estimated in the elastic net regression can handle even a high number of noise and correlated SNPs very well [12]. In most scenarios, we randomly chose minor allele frequencies (MAF) between 0.01 and 0.45 for the six risk SNPs as well as for the noise SNPs. When analyzing the impact of the MAF, we varied the MAFs of the six risk SNPs between 0.01 and 0.45, whereas the MAFs for the noise SNPs were randomly selected. A dominant mode of inheritance was considered for each risk SNP.

We compared two scenarios:

In scenario (a), we constructed a predominant interaction effect, which means that the interaction between each of the six risk SNPs and an environmental exposure E is set to an interaction effect of 1.5 with a smaller marginal genetic effect that is not explicitly defined (see [21]).

In scenario (b), we constructed a predominant marginal genetic effect, which means that the marginal genetic effect of each of the six risk SNPs is set to 1.5 with an additional (smaller) interaction effect. For the simulation of the gene-environment interaction terms in scenario (b), we followed the procedure previously described [12].

Effect estimates and p-values for the marginal genetic effects, the environmental effects and the interaction effects of a simulated example dataset of N = 3000 are given for scenarios (a) and (b) in Tables S1 and S2 of Additional file 1.

Simulation of external weights

In real data applications, it is often difficult or impossible to obtain appropriate external weights. Therefore, we simulated different types of external data with varying degrees of fit to our own study sample.
First, external weights were estimated from the marginal genetic effects in an external dataset that was simulated from the same distribution as our study sample data (perfect weights).

In addition, we simulated two scenarios with less appropriate external weights. In the first scenario, the effect estimates of the risk SNPs in our own study sample were larger than in the external data (underestimating weights) and in the second scenario, only one of the six risk SNPs of the external data was associated with the outcome in our own study sample (overestimating weights).

We simulated external data with the same sample size as our own study sample and external data with a sample size four times larger than our own study sample, and varied the number of noise SNPs from 6 to 200.

Evaluation of power, proportion of sign-misspecification, and type I error

The main focus of the model comparison was to maximize the power to detect a gene-environment interaction with an acceptable type I error.

Power was evaluated in datasets with N = 3000 or N = 1000 observations and 100 or 1000 replications, depending on the running time and precision needed in different scenarios. As shown in [12], the restriction to 100 replications only caused a minor sampling error of around 3 percentage points in power and type I error.

The power of the model was calculated as the proportion of times a true-positive interaction was correctly identified (sign of the parameter estimate for the GRSxE interaction term correctly identified and p-value < 0.05) across all replications. The type I error of the model was calculated as the proportion of times a false-positive interaction was identified under the null hypothesis.
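
The two proportions defined above can be computed from replication results as in the following Python sketch. Each replication is represented by a hypothetical pair (interaction estimate, p-value); the function names and all numbers are invented for illustration. The helper `monte_carlo_se` uses the usual binomial standard error of an estimated proportion, which is an assumption added here as one way to gauge sampling error and is not necessarily the calculation used in [12].

```python
import math
from typing import Sequence, Tuple

Result = Tuple[float, float]  # (GRSxE interaction estimate, p-value)


def power(results: Sequence[Result], true_sign: int, level: float = 0.05) -> float:
    """Share of replications with p < level and a correctly signed estimate."""
    hits = sum(1 for est, p in results if p < level and est * true_sign > 0)
    return hits / len(results)


def type_one_error(null_results: Sequence[Result], level: float = 0.05) -> float:
    """Share of replications simulated under the null with p < level."""
    return sum(1 for _, p in null_results if p < level) / len(null_results)


def monte_carlo_se(proportion: float, replications: int) -> float:
    """Binomial standard error of a proportion estimated from replications."""
    return math.sqrt(proportion * (1.0 - proportion) / replications)


# Hypothetical replication results, for illustration only.
alt_results = [(0.8, 0.01), (0.5, 0.20), (-0.3, 0.03), (1.1, 0.04)]
null_results = [(0.2, 0.60), (-0.1, 0.04), (0.3, 0.35), (0.0, 0.81)]

toy_power = power(alt_results, true_sign=+1)
toy_type_one = type_one_error(null_results)
se_100 = monte_carlo_se(0.9, 100)
```

In the toy data, the third replication is significant but has the wrong sign, so it does not count toward power. With these assumptions, a proportion near 0.9 estimated from 100 replications has a standard error of about 0.03, the same order of magnitude as the sampling error of roughly 3 percentage points mentioned above.
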
We further evaluated the proportion of sign-misspecifications, which was calculated as the proportion of times a significant interaction was identified, but the sign of the parameter estimate for the GRSxE interaction term was not correctly determined.

Within the evaluation of our GRS-interaction-training approach, we investigated the optimal balance between training and test datasets by comparing different proportions: We started with the scenario recommended by Dudbridge (2013) for GRS used for the detection of marginal genetic effects [3], in which the training and the test datasets have an even sample size (1:1). Further scenarios are based on smaller training datasets (1:2, 1:3, 1:4, 1:9 and 1:19) and larger training datasets (19:1, 9:1, 4:1, 3:1, 2:1) than test datasets.

All analyses were performed using R 3.3.1 [22].

Results

Simulation study

GRS-interaction-training approach – Balance between training vs. test data

In a first step, we evaluated the optimal balance between training and test data applying our GRS-interaction-training approach.

In Fig. 1, power and type I error to detect GxE interactions for (a) predominant interaction effects and (b) predominant marginal genetic effects are presented. Power and type I error were evaluated with an increasing sample size of the training data in comparison to the test data (from 19:1 to 1:19).

This figure reveals that in scenarios with many noise SNPs, the optimal split is close to one-half and the balance is roughly symmetrical around one-half. However, with a decreasing number of noise SNPs, a higher power was achieved by increasing the test data in comparison to the training data. In scenarios with an equal number of noise and risk SNPs, i.e. with six noise and six risk SNPs, the optimal balance between training and test data lay between 1:3 and 1:4. The type I error was well controlled over all scenarios and there was no difference in power and type I error between scenarios with predominant interaction effects (Fig. 1a) and scenarios with predominant marginal genetic effects (Fig. 1b).

Fig. 1 Impact of the balance between training vs. test data on power and type I error of the GRS-interaction-training approach. Scenarios with predominant interaction effects (a) and predominant marginal genetic effects (b). Balance training vs. test data increases from 19:1 to 1:19, scenarios with 6 risk SNPs that interact with the environmental exposure and 6, 50, 100 and 200 additional noise SNPs that are not associated with the outcome (N = 3000 observations and 1000 replications)

GRS-interaction-training in comparison to previous weighting approaches

Next, we compared the GRS-interaction-training approach (balance training vs. test data 1:1) to our previously published GRS-marginal-internal approach [12] and to GRS with external weights (which is typically considered the gold standard) in scenarios with (a) predominant interaction effects and (b) predominant marginal genetic effects with an increasing number of noise SNPs (up to 200).

In scenarios with predominant interaction effects (see Fig. 2a), the GRS-interaction-training approach achieved a higher power than the GRS-marginal-internal approach. In particular, in scenarios with many noise SNPs, the GRS-marginal-internal approach reached a very low power to detect interaction effects. Furthermore, with more noise, there was a high number of sign-misspecifications when using the GRS-marginal-internal approach in scenarios with predominant interaction effects.

Fig. 2 External vs. internal weights with increasing number of noise SNPs (up to 200) in scenarios with predominant interaction effects (a) and predominant marginal genetic effects (b).
Power, sign-misspecifications and type I error comparison of i) the GRS-interaction-training approach (red lines; one half of the data used as training data and the other half as test data), ii) the GRS-marginal-internal approach (blue lines) and iii) GRS with external weights (black lines). We compared three types of external weights (perfect: data from the same distribution as the sample data; overestimating: only one of the six risk SNPs of the external data was associated with the outcome in the sample data; underestimating: effect estimates of the risk SNPs in the sample data were 30% larger than in the external data). External weights with “1:1” and “1:4”: Balance between size of sample data vs. size of external data (N = 3000 observations and 1000 replications)

In scenarios with predominant marginal genetic effects (see Fig. 2b), the GRS-marginal-internal approach achieved a slightly higher power to detect interaction effects than the GRS-interaction-training approach, but the differences became smaller with an increasing number of noise SNPs. There were no sign-misspecifications in scenarios with predominant marginal genetic effects.

GRS with perfect external weights, obtained from external data that were simulated from the same distribution as our study sample data, outperformed the GRS-interaction-training and the GRS-marginal-internal approaches. However, if the sample size of the external data was not larger than our own study sample size, the GRS-interaction-training approach achieved a higher power than GRS with perfect external weights in scenarios with predominant interaction effects (Fig. 2a).

Furthermore, in real data applications, there is usually no perfect match between the external data and the sample data, e.g., effect estimates in our own study sample might differ from those in the external data or only a subset of risk SNPs identified in the external data is associated with the outcome in our own study sample. In these scenarios, the GRS-interaction-training approach was often more appropriate to detect predominant interaction effects than GRS with external weights. The GRS-marginal-internal approach only outperformed GRS with external weights in the detection of predominant marginal genetic effects if there were < 100 noise SNPs in the data (Fig. 2b).

The type I error was well controlled over all scenarios (Fig. 2).

GRS-interaction-training vs. GRS-marginal-internal – Impact of MAF

In a last step, we analyzed the impact of the MAFs of the six risk SNPs on power, proportion of sign-misspecifications and type I error of the GRS-interaction-training approach in comparison to the GRS-marginal-internal approach.

In scenarios with a predominant interaction effect (see Fig. 3a), the power achieved by the GRS-interaction-training approach was highest for MAFs between 0.05 and 0.20. Furthermore, there were no sign-misspecifications and the type I error was well controlled. The power achieved by the GRS-marginal-internal approach was even higher than the power achieved by the GRS-interaction-training approach in scenarios with only a small number of noise SNPs and small MAFs. However, with more noise and MAFs > 0.1, the GRS-interaction-training approach outperformed the GRS-marginal-internal approach. Most interestingly, there was a high number of sign-misspecifications in scenarios with MAFs ≥ 0.2 when applying the GRS-marginal-internal approach, especially in scenarios with many noise SNPs.

In scenarios with a predominant marginal genetic effect (see Fig. 3b), the GRS-marginal-internal approach achieved a higher power than the GRS-interaction-training approach with an acceptable proportion of sign-misspecifications.

The type I error was well controlled in all scenarios, but with a higher variation due to the reduced number of replications (100 instead of 1000).

Real data application

The real data application was based on a dataset from the Traffic, Asthma and Genetics (TAG) Study (N = 4465 observations in the pooled dataset across six birth cohorts) in which the interaction between air pollution and SNPs associated with oxidative stress and inflammation on incident childhood asthma was investigated.

Traffic-related air pollution, asthma, SNPs, and potential confounder data were pooled across six birth cohorts. Parents reported physician-diagnosed asthma from birth to 7–8 years of age (confirmed by a pediatric allergist in two cohorts). Individual estimates of annual average air pollution [nitrogen dioxide (NO2), particulate matter ≤ 2.5 μm (PM2.5), PM2.5 absorbance, ozone] were assigned to each child’s birth address using land use regression, atmospheric modeling, and ambient monitoring data. Gene-environment interactions between air pollution and SNPs in GSTP1 (rs1138272 and rs1695) and TNF (rs1800629) on asthma were investigated.

The main findings of the pooled analyses were that NO2 (OR = 1.23; 95%-CI: 1.03, 1.46, for a 10-μg/m³ increase in NO2) and GSTP1 rs1138272 (TT/TC vs. CC; OR = 1.49; 95%-CI: 1.20, 1.84) were marginally associated with asthma and a significant interaction between GSTP1 rs1138272 and NO2 on asthma was detected (Bonferroni-corrected p = 0.012) [23].

More information about the TAG study can be found in [23–25].

In our analysis, we focused on the German Infant Study on the influence of Nutritional Intervention plus environmental and genetic influences on allergy development (GINIplus) as study sample (N = 593 observations), which is one of the six birth cohorts included in the TAG study.
We compared the p-values derived from weighted GRS with weights from the pooled analysis as published in [23] (proxy for external weights) to p-values from the GRS-marginal-internal approach and to p-values from the GRS-interaction-training approach (balance training vs. test data 1:1 (Ntest = 296), 1:2 (Ntest = 395) and 1:3 (Ntest = 444)).

In Table 1, an overview of the marginal genetic effects in the pooled analysis [23] and in GINIplus is given. Only the marginal genetic association between GSTP1 rs1138272 and asthma was significant in the pooled TAG analysis. Effect estimates differed only slightly between the pooled analysis and GINIplus, being ~30% stronger in GINIplus than in the pooled analysis. However, due to the small sample size of GINIplus (N = 593), this marginal association was not significant in GINIplus.

Table 2 shows the results of the GxE interaction analysis in GINIplus. The significant GxE interaction between GSTP1 rs1138272 and NO2 on asthma, which was identified in the pooled analysis [23], was identified by each GRS approach. The lowest p-values were achieved by applying the GRS-marginal-internal approach and GRS with external weights, followed by the GRS-interaction-training approach (using 25% of the data for training and the remaining 75% as test data). The weights from the GRS-marginal-internal approach were almost identical to the univariate estimates from the pooled analysis. The GRS-interaction-training approach was the only approach that correctly identified GSTP1 rs1138272 as the only SNP that interacts with air pollution (cf. [23]) by setting the weights of the other SNPs to zero.

Fig. 3 Power, sign-misspecifications and type I error comparison of the GRS-interaction-training approach (one half of the data used as training data and the other half as test data) vs. the GRS-marginal-internal approach. Scenarios with predominant interaction effects (a) and predominant marginal genetic effects (b). Minor allele frequencies of the 6 risk SNPs increase from 0.01 to 0.45, scenarios with 6, 50 and 100 noise SNPs (N = 1000 observations and 100 replications)

Discussion

In this article, we presented a new weighting approach, called GRS-interaction-training, for GRSxE interaction studies in which parts of the study sample are used to estimate the weights and the remaining data are employed to determine the GRS.

In a simulation study and a subsequent real data application, we compared the performance of this approach to weighted GRS with internal weights from the marginal genetic effects, called GRS-marginal-internal [12], and GRS with external weights for the detection of gene-environment interactions.

Our simulation study has shown that the power for detecting GxE interactions reached by applying the GRS-interaction-training approach was only slightly lower than the power achieved by weighted GRS with external weights from the marginal genetic effects estimated in an independent study population that fits our own study sample perfectly. If, however, the external data did not fit our own study sample perfectly or the sample size of the external data was not larger than our own sample size, the power was higher when using the GRS-interaction-training approach.

The sample size of the test data in the GRS-interaction-training approach is only half of the sample size available to the GRS-marginal-internal approach, because in the GRS-interaction-training approach half of the data is used to determine the weights and the remaining test data are used to calculate the GRS and to estimate the interaction. Nevertheless, if there were no external weights available and the underlying GxE interaction effect was larger than the marginal genetic effect, the highest power was reached with the GRS-interaction-training approach.
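The two-stage logic of the GRS-interaction-training approach (weights from the interaction terms in a training half, GRS and GRSxE test in the remaining half) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the function names `fit_logistic` and `grs_interaction_training` are hypothetical, the data are simulated with arbitrary effect sizes and additive 0/1/2 SNP coding, and an unpenalized logistic regression is swapped in for the weight estimation step, so no weight is shrunk to exactly zero as in Table 2.

```python
import math
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Unpenalized logistic regression fitted by Newton-Raphson.

    Returns the coefficients (intercept first) and their Wald standard errors.
    """
    X = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        info = X.T @ (X * (p * (1.0 - p))[:, None])  # Fisher information
        beta = beta + np.linalg.solve(info, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    info = X.T @ (X * (p * (1.0 - p))[:, None])
    se = np.sqrt(np.diag(np.linalg.inv(info)))
    return beta, se


def grs_interaction_training(snps, env, y, train_frac=0.5, seed=0):
    """Hypothetical helper: split the data, estimate weights, test GRSxE."""
    rng = np.random.default_rng(seed)
    n, k = snps.shape
    idx = rng.permutation(n)
    n_train = int(round(train_frac * n))
    train, test = idx[:n_train], idx[n_train:]

    # Stage 1 (training data): weights are the SNP x E interaction estimates.
    X_tr = np.column_stack(
        [snps[train], env[train], snps[train] * env[train][:, None]]
    )
    beta_tr, _ = fit_logistic(X_tr, y[train])
    weights = beta_tr[k + 2:]  # skip intercept, k SNP main effects, E

    # Stage 2 (test data): build the GRS and test the GRSxE term.
    grs = snps[test] @ weights
    X_te = np.column_stack([grs, env[test], grs * env[test]])
    beta_te, se_te = fit_logistic(X_te, y[test])
    z = beta_te[3] / se_te[3]
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided Wald p-value
    return weights, beta_te[3], p_value, len(test)


# Simulated example with a predominant interaction effect (assumed values).
rng = np.random.default_rng(1)
n, k = 1000, 3
snps = rng.binomial(2, 0.3, size=(n, k)).astype(float)  # additive coding
env = rng.normal(size=n)
logit = -1.0 + 0.5 * snps.sum(axis=1) * env
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit))).astype(float)

weights, beta_int, p_value, n_test = grs_interaction_training(snps, env, y)
print("weights:", weights)
print("GRSxE estimate:", beta_int, "p-value:", p_value, "N test:", n_test)
```

Changing `train_frac` to 1/3 or 1/4 corresponds to the 1:2 and 1:3 balances of training vs. test data; a penalized estimator would be the natural replacement for `fit_logistic` in stage 1 when many noise SNPs are present.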
If the underlying marginal genetic effect was substantially larger than the GxE interaction effect, the GRS-marginal-internal approach was more appropriate.

GRS-interaction-training approach: balance between training vs. test data

Motivated by the idea that the interaction itself might be more suitable to estimate the weights than the marginal genetic effect, we divided each of our datasets into a training and a test dataset and used the interaction estimates from the training data as weights for the GRS in the test data. Dudbridge (2013) evaluated a similar approach for the detection of marginal genetic effects and reported that the optimal balance of sample sizes between training and test datasets is close to one-half regardless of the proportion of noise SNPs or the p-value threshold [3]. In our study, this recommendation proved true for scenarios with many noise SNPs (e.g., 6 risk SNPs and 200 noise SNPs), and the balance was roughly symmetrical around one-half, which is also in line with [3]. However, in contrast to Dudbridge (2013), with a decreasing number of noise SNPs (down to only 6), a higher power was achieved by increasing the size of the test data relative to the size of the training data. This finding was confirmed in our real data application with only two noise SNPs and one risk SNP, as a lower p-value was achieved when using more test data than training data. Nevertheless, since a large number of noise SNPs is usually considered in gene-environment interaction studies, we generally support Dudbridge's rule of thumb to make an even split between training and test data for GxE interaction studies.

Table 1 Real data application. Marginal genetic effects for the associations of three GSTP1 & TNF SNPs with parent-reported physician-diagnosed asthma from birth to 7–8 years of age in the pooled TAG data and in GINIplus, considering a dominant mode of inheritance for the three SNPs

                                    Association with asthma
SNP                Sample           N       OR (a)   p-value (b)
GSTP1 rs1138272    Pooled (c)       4465    1.49     <0.001
                   GINIplus (d)     593     1.67     0.348
GSTP1 rs1695       Pooled (c)       4635    0.91     0.430
                   GINIplus (d)     593     0.75     0.972
TNF rs1800629      Pooled (c)       4356    1.04     0.647
                   GINIplus (d)     593     0.80     1.000

(a) Adjusted for study, city, intervention, infant sex, maternal age at birth, maternal smoking during pregnancy, environmental tobacco smoke in the home, birth weight, and parental atopy. (b) p-values were corrected for multiple testing using the Bonferroni method (raw p-values multiplied by the number of analyzed SNPs (3)). (c) Pooled data from BAMSE, CAPPS, GINIplus, LISAplus, SAGE and PIAMA; N, ORs and p-values as published in MacIntyre et al. (2014). (d) Determined for this publication

Table 2 Real data application. GxE interaction analysis in GINIplus between a GRS of three GSTP1 & TNF SNPs and air pollution exposure (NO2) with parent-reported physician-diagnosed asthma from birth to 7–8 years of age

                                                                   Weights for GRS                                         GRSxE interaction
Approach                                                    N      GSTP1 rs1138272   GSTP1 rs1695       TNF rs1800629      OR (a)   p-value
GRS with weights from pooled marginal genetic effects (b)   593    ln(1.49) ≈ 0.40   ln(0.91) ≈ −0.09   ln(1.04) ≈ 0.04    16.31    0.004
GRS-marginal-internal (c)                                   593    0.69              −0.09              0.00               8.83     0.004
GRS-interaction-training (1:1) (d, e)                       296    0.63              0.00               0.00               9.71     0.028
GRS-interaction-training (1:2) (d, f)                       395    0.64              0.00               0.00               9.24     0.014
GRS-interaction-training (1:3) (d, g)                       444    0.85              0.00               0.00               7.34     0.007

(a) OR and p-values for the interaction effects. Adjusted for study, city, intervention, infant sex, maternal age at birth, maternal smoking during pregnancy, environmental tobacco smoke in the home, and parental atopy. (b) Pooled data from BAMSE, CAPPS, GINIplus, LISAplus, SAGE and PIAMA; ln(ORs) as published in MacIntyre et al. (2014) were used as weights (compare Table 1). (c) Estimated in GINIplus within this publication; estimates from the elastic net regression (α = 0.5) for the marginal genetic effects in GINIplus. (d) Weights from the interaction term itself when using parts of the data to estimate the weights and the remaining data to determine the GRS. (e) Balance training vs. test data 1:1. (f) Balance training vs. test data 1:2. (g) Balance training vs. test data 1:3

Internal vs. external weights

Our simulation study has confirmed that the gold standard for the construction of GRS is to use external weights, e.g., from the marginal genetic effects estimated in independent study populations, if the external data fit the study sample very well. This strong assumption means that the marginal genetic associations in the external data are the same as in our own study sample. This may, but need not, hold if the phenotype is assessed in exactly the same way and there is no ethnic or age difference between the study populations. In real data analyses, these assumptions are often not fulfilled because large-scale GWAS are not published for every phenotype and sometimes only in populations with a different ethnicity, sex or age range.

The violation of these assumptions might lead to a decrease of power for detecting interaction effects with GRS with external weights. Therefore, in the practical analysis of real data, using internal weights from the study population itself might often be a more powerful alternative to detect GxE interactions.

However, in our real data application, the power reached by GRS with external weights was similar to the power reached by the two approaches with internal weights. One reason for that might be that our study sample (GINIplus) was included in the estimation of the "external" effects.
Therefore, the effect estimates from the pooled analysis might fit the GINIplus data slightly better than they would have if the GINIplus data had not been part of the pooled analysis. Furthermore, a limitation of the GRS-interaction-training approach is that the GRSxE interaction term can only be estimated in a subset (i.e. the test data) of the original sample data, which reduces the power to detect interactions.

A major limitation of GRS with external weights is that we can only include SNPs for which the marginal genetic effects have been published. In this regard, GRS with external weights are usually restricted to SNPs with a genome-wide significant (p-value <5 × 10^−8) marginal genetic effect in the external study population, whereas SNPs with a predominant interaction effect are usually not presented. For GxE interaction studies, this leads to a publication bias towards SNPs with predominant marginal genetic effects. To avoid this publication bias and to increase the power for detecting GxE interactions, estimates from genome-wide gene-environment interaction studies might be used. However, up to now, very few genome-wide gene-environment interaction studies have been published because of the limited power to detect interactions in genome-wide analyses.

From a biological perspective, a pathway-oriented GxE interaction analysis might be a more powerful and biologically plausible alternative to genome-wide approaches. Very recently, we showed, e.g., in a study population of 402 women that genetic variations in the ER stress pathway might play a role in air pollution-induced inflammation in the lung using the GRS approach with internal weights from the marginal genetic effects, although there was no significant marginal genetic effect on the individual SNP level [7].

GRS-interaction-training vs. GRS-marginal-internal

In scenarios with a predominant interaction effect, i.e. an interaction effect that is (substantially) larger than the marginal genetic effect, the GRS-interaction-training approach was more powerful than the GRS-marginal-internal approach, particularly in the presence of noise SNPs. Furthermore, applying the GRS-marginal-internal approach in scenarios with predominant interaction effects might lead to a high number of sign-misspecifications when the MAFs of the risk SNPs are ≥0.2 and in the presence of noise.

However, in scenarios with a predominant marginal genetic effect and a smaller additional interaction effect, the GRS-marginal-internal approach achieved a slightly higher power than the GRS-interaction-training approach with an acceptable number of sign-misspecifications.

In real data applications, the decision whether the interaction or the marginal genetic effect is predominant should be made a priori and be based on biological knowledge. If the SNPs were chosen because the underlying genes had been identified to be marginally associated with the same or a related phenotype (e.g. in a large-scale genome-wide meta-analysis), independently of the environmental exposure, the weights should be determined from the marginal genetic effects (GRS-marginal-internal). Conversely, if the SNPs were chosen because of their potential impact on the biological mechanisms mediating the association between the environmental exposure and disease development, the weights should be determined from the interaction term (GRS-interaction-training approach). This knowledge might be based either on mechanistic studies or on epigenome-wide association studies (EWAS). EWAS present differentially methylated probes (DMPs) and regions (DMRs) in relation to disease outcomes (e.g. [26] for lung function). Since EWAS identify regions that are modified by environmental factors, they might provide a good pre-selection of genetic regions to be considered in GxE interaction studies.

In the TAG study, e.g., the considered SNPs were chosen because the associated biological mechanisms were thought to underlie both the toxicity of traffic-related air pollution and the development of asthma [27]. This was confirmed by our analysis, which shows that the GRS-marginal-internal approach reached almost the same power as the GRS-interaction-training approach.

Strengths and limitations

Our study has several strengths. To our knowledge, this is the first study presenting GRS with weights from the interaction term itself and comparing GRS with internal vs. external weights for the detection of gene-environment interactions. Furthermore, this is the first study comparing interaction approaches in scenarios with predominant interaction vs. predominant marginal genetic effects, a differentiation that is often ignored in real data practice but which was shown to have a major impact on the selection of the most powerful analytic strategy. A further strength is that we analyzed the performance of the GRS approaches in the presence of noise and SNPs with different MAFs to cover several data structures common in GxE interaction studies.

A few limitations and outstanding issues should be noted. In our simulation study, we compared the performance of GRS with internal and external weights in quite simple scenarios, which might not cover all types of interaction models. We did not include different modes of inheritance, gene-gene or other more complex interactions in these scenarios. Such considerations might be beneficial to further optimize the weighted GRS for other scenarios.

Moreover, a comparison of the considered GRS approaches with other state-of-the-art interaction approaches might be interesting.
However, as Aschard recently showed, the use of GRS can increase the power to detect GxE interactions in comparison to common univariate single-variant approaches and the joint test of main genetic and interaction effects [4, 5]. We additionally compared our GRS approaches with a multiple logistic lasso regression considering p-values estimated using the significance test for the lasso [28]. The results of this comparison, presented in Additional file 1, show that our GRS approaches outperform the lasso regression in the considered scenarios.

Furthermore, there is room for improvement regarding the decision-making process between a predominant interaction effect and a predominant marginal genetic effect because detailed a priori knowledge about the biological pathways is often limited. One possibility to improve the a priori knowledge might be to use information from EWAS. The growing field of epigenetics might clarify many of the biological pathways by which environmental exposures induce health problems and thereby improve the selection process of candidate SNPs for pathway-based GxE interaction studies. A possibility to improve the GRS approaches might be to combine the GRS-marginal-internal approach and the GRS-interaction-training approach to reach a good power for the detection of interactions in scenarios with predominant marginal genetic effects as well as in scenarios with predominant interaction effects.

Our real data application has the limitation that we could only include the three SNPs for which we had previous knowledge about the marginal genetic and interaction effects from a large pooled analysis [23]. However, this is often a limitation in daily practice as well, since external weights are often limited to, e.g., genome-wide significant SNPs because other effect estimates are often not reported.
Furthermore, since GINIplus (N = 593) was part of the TAG consortium (N = 4465), the weights from the pooled marginal genetic effects were not independent of our sample data. However, this problem also often occurs in real data practice because large-scale genome-wide meta-analyses often include all study populations that are available for the considered phenotype and thereby often include one's own study sample as well.

Conclusion

In conclusion, when no appropriate external weights are available (due to, e.g., ethnic differences or differences in the phenotype assessment), we recommend using internal weights from the study population itself to construct weighted GRS for GxE interaction studies. If the SNPs were chosen because a marginal genetic effect was hypothesized, the weights should be estimated from the marginal genetic effects (GRS-marginal-internal approach). If the SNPs were chosen because of their potential impact on the biological mechanisms mediating the association between the environmental exposure and disease development, the weights should be estimated from the interaction term itself in a training dataset (GRS-interaction-training approach).

Additional file

Additional file 1: Table S1. Simulated predominant interaction effects (example data for N = 3000). Table S2. Simulated marginal genetic effects (example data for N = 3000). Comparison of GRS approaches and lasso regression. Figure S1. Power and sign-misspecifications comparison. Figure S2. Type I error comparison.
(DOCX 405 kb)

Abbreviations

EWAS: Epigenome-wide association study; GINIplus: German Infant Study on the influence of Nutritional Intervention plus environmental and genetic influences on allergy development; GLM: Generalized linear model; GRS: Genetic risk score; GRSxE interaction: Interaction between GRS and environmental exposure; GWAS: Genome-wide association study; GxE interaction: Gene-environment interaction; MAF: Minor allele frequency; NO2: Nitrogen dioxide; PM2.5: Particulate matter ≤2.5 μm; SNP: Single nucleotide polymorphism; TAG Study: Traffic, Asthma and Genetics Study

Acknowledgements

We would like to thank all members of the TAG consortium for providing us with the data for the real data application: Allan Becker, Andrew Sandford, Andrea von Berg, Anita L. Koryrskyj, Anna Bergström, Anna Gref, Barbara Hoffmann, Beate Schaaf, Bert Brunekreef, Carl Peter Bauer, Carla M. T. Tiesler, Cilla Söderhäll, Claudia Klümper, Dietrich Berdel, Dirkje S. Postma, Elaina A MacIntyre, Elaine Fuertes, Elisabeth Thiering, Eric Melén, F. Nicole Dijk, Gerard H. Koppelman, Göran Pershagen, Inger Kull, Joachim Heinrich, Juha Kere, Marie Standl, Mario Bauer, Marit Westman, Marjan Kerkhof, Meaghan Macnutt, Melanie Waldenberger, Michael Brauer, Moira Chan-Yeung, Nathalie Acevedo, Olf Herbarth, Sibylle Koletzko, Tom Bellander, Ulrike Gehring.

Funding

This project was part of AH's PhD thesis at the Faculty of Statistics, TU Dortmund University and was funded by the IUF-Leibniz Research Institute for Environmental Medicine, Düsseldorf. This work was also supported by the Deutsche Forschungsgemeinschaft (grant SCHW 1508/3–1 to HS). We further acknowledge financial support by the Deutsche Forschungsgemeinschaft and TU Dortmund University within the funding programme Open Access Publishing.

Availability of data and materials

All data generated within the simulation study can be made available to readers upon request.

Authors' contributions

AH, HS, KI and UK conceived and designed the simulation study. AH, UK and TS (PI of the GINIplus study) and CC (PI of the TAG consortium) contributed to the study design of the real data application. AH performed the simulation study and real data application and was the major contributor in writing the manuscript. All authors read and approved the final manuscript.

Ethics approval and consent to participate

The GINIplus study was approved by the relevant ethics committees (Ethikkommission der Ärztekammer Nordrhein and Ethikkommission der Bayerischen Landesärztekammer) with written informed consent obtained from the parents of all participants.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Author details

1 IUF-Leibniz Research Institute for Environmental Medicine, Düsseldorf, Germany. 2 Faculty of Statistics, TU Dortmund University, Dortmund, Germany. 3 Department of Medicine, University of British Columbia, Vancouver, BC, Canada. 4 Institute for Heart and Lung Health, Vancouver, BC, Canada. 5 School of Population and Public Health, University of British Columbia, Vancouver, BC, Canada. 6 Mathematical Institute, Heinrich Heine University, Düsseldorf, Germany.

Received: 18 September 2017 Accepted: 7 December 2017

References

1. Ottman R. Gene–environment interaction: definitions and study designs. Prev Med (Baltim). 1996;25:764–70.
2. Dudbridge F. Polygenic epidemiology. Genet Epidemiol. 2016;40:268–72.
3. Dudbridge F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 2013;9:e1003348.
4. Purcell SM, Wray NR, Stone JL, Visscher PM, O'Donovan MC, Sullivan PF, et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009;72:1343–54.
5. Hamshere ML, O'Donovan MC, Jones IR, Jones L, Kirov G, Green EK, et al. Polygenic dissection of the bipolar phenotype. Br J Psychiatry. 2011;198:284–8.
6. Eze IC, Imboden M, Kumar A, von Eckardstein A, Stolz D, Gerbase MW, et al. Air pollution and diabetes association: modification by type 2 diabetes genetic risk score. Environ Int. 2016;94:263–71.
7. Hüls A, Krämer U, Herder C, Fehsel K, Luckhaus C, Stolz S, et al. Genetic susceptibility for air pollution-induced airway inflammation in the SALIA study. Environ Res. 2017;152:43–50.
8. Qi Q, Chu AY, Kang JH, Huang J, Rose LM, Jensen MK, et al. Fried food consumption, genetic risk, and body mass index: gene-diet interaction analysis in three US cohort studies. BMJ. 2014;348:g1610.
9. Aschard H. A perspective on interaction effects in genetic association studies. Genet Epidemiol. 2016;40:678–88.
10. Kraft P, Yen YC, Stram DO, Morrison J, Gauderman WJ. Exploiting gene-environment interaction to detect genetic associations. Hum Hered. 2007;63:111–9.
11. Che R, Motsinger-Reif AA. Evaluation of genetic risk score models in the presence of interaction and linkage disequilibrium. Front Genet. 2013;4:1–10.
12. Hüls A, Ickstadt K, Schikowski T, Krämer U. Detection of gene-environment interactions in the presence of linkage disequilibrium and noise by using genetic risk scores with internal weights from elastic net regression. BMC Genet. 2017;18:55.
13. Zou H, Hastie T. Regularization and variable selection via the elastic-net. J R Stat Soc. 2005;67:301–20.
14. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2009;33:1–22.
15. Waldmann P, Mészáros G, Gredler B, Fuerst C, Sölkner J. Evaluation of the lasso and the elastic net in genome-wide association studies. Front Genet. 2013;4:1–11.
16. Simon N, Friedman J, Hastie T, Tibshirani R. Regularization paths for Cox's proportional hazards model via coordinate descent. J Stat Softw. 2011;39:1–13.
17. Burgess S, Dudbridge F, Thompson SG. Combining information on multiple instrumental variables in Mendelian randomization: comparison of allele score and summarized data methods. Stat Med. 2016;35:1880–906.
18. Burgess S, Thompson SG. Use of allele scores as instrumental variables for Mendelian randomization. Int J Epidemiol. 2013;42:1134–44.
19. McCullagh P, Nelder JA. Generalized linear models. 2nd ed. London: Chapman and Hall; 1989.
20. Nelder JA, Wedderburn RWM. Generalized linear models. J R Stat Soc A. 1972;135:370–84.
21. Schwender H, Fritsch A. scrime: Analysis of High-Dimensional Categorical Data such as SNP Data. R package version 1.3.3. 2013.
22. R Development Core Team. R: A language and environment for statistical computing [internet]. Vienna, Austria: R Foundation for Statistical Computing; 2017. Available from: http://www.r-project.org/
23. MacIntyre EA, Brauer M, Melén E, Bauer CP, Bauer M, Berdel D, et al. GSTP1 and TNF gene variants and associations between air pollution and incident childhood asthma: the traffic, asthma and genetics (TAG) study. Environ Health Perspect. 2014;122:418–24.
24. MacIntyre EA, Carlsten C, MacNutt M, Fuertes E, Melén E, Tiesler CMT, et al. Traffic, asthma and genetics: combining international birth cohort data to examine genetics as a mediator of traffic-related air pollution's impact on childhood asthma. Eur J Epidemiol. 2013;28:597–606.
25. Fuertes E, Brauer M, MacIntyre E, Bauer M, Bellander T, Von Berg A, et al. Childhood allergic rhinitis, traffic-related air pollution, and variability in the GSTP1, TNF, TLR2, and TLR4 genes: results from the TAG study. J Allergy Clin Immunol. 2013;132:342–52.
26. Lee M, Hong Y, Kim W, London S. Epigenome-wide association study of chronic obstructive pulmonary disease and lung function in Koreans. Epigenomics. 2017;9:971–84.
27. Kelly FJ. Oxidative stress: its role in air pollution and adverse health effects. Occup Environ Med. 2003;60:612–6.
28. Lockhart R, Taylor J, Tibshirani RJ, Tibshirani RA. Significance test for the lasso. Ann Stat. 2014;42:413–68.

