UBC Faculty Research and Publications

Network-based analysis reveals novel gene signatures in peripheral blood of patients with chronic obstructive… Obeidat, Ma’en; Nie, Yunlong; Chen, Virginia; Shannon, Casey P; Andiappan, Anand K; Lee, Bernett; Rotzschke, Olaf; Castaldi, Peter J; Hersh, Craig P; Fishbane, Nick; Ng, Raymond T; McManus, Bruce; Miller, Bruce E; Rennard, Stephen; Paré, Peter D; Sin, Don D Apr 24, 2017


Full Text

RESEARCH    Open Access

Network-based analysis reveals novel gene signatures in peripheral blood of patients with chronic obstructive pulmonary disease

Ma’en Obeidat1*, Yunlong Nie1, Virginia Chen2, Casey P. Shannon2, Anand Kumar Andiappan3, Bernett Lee3, Olaf Rotzschke3, Peter J. Castaldi4,5, Craig P. Hersh4,6, Nick Fishbane1, Raymond T. Ng2, Bruce McManus1,2, Bruce E. Miller7, Stephen Rennard8,9, Peter D. Paré1,10 and Don D. Sin1,10

Abstract

Background: Chronic obstructive pulmonary disease (COPD) is currently the third leading cause of death and there is a huge unmet clinical need to identify disease biomarkers in peripheral blood. Compared to gene-level differential expression approaches to identify gene signatures, network analyses provide a biologically intuitive approach which leverages the co-expression patterns in the transcriptome to identify modules of co-expressed genes.

Methods: A weighted gene co-expression network analysis (WGCNA) was applied to the peripheral blood transcriptome from 238 COPD subjects to discover co-expressed gene modules. We then determined the relationship between these modules and forced expiratory volume in 1 s (FEV1). In a second, independent cohort of 381 subjects, we determined the preservation of these modules and their relationship with FEV1. For those modules that were significantly related to FEV1, we determined the biological processes as well as the blood cell-specific gene expression that were over-represented using additional external datasets.

Results: Using WGCNA, we identified 17 modules of co-expressed genes in the discovery cohort. Three of these modules were significantly correlated with FEV1 (FDR < 0.1). In the replication cohort, these modules were highly preserved and their FEV1 associations were reproducible (P < 0.05).
Two of the three modules were negatively related to FEV1 and were enriched in IL8 and IL10 pathways and correlated with neutrophil-specific gene expression. The positively related module, on the other hand, was enriched in DNA transcription and translation and was strongly correlated to CD4+, CD8+ T cell-specific gene expression.

Conclusions: Network based approaches are promising tools to identify potential biomarkers for COPD.

Trial registration: The ECLIPSE study was funded by GlaxoSmithKline, under ClinicalTrials.gov identifier NCT00292552 and GSK No. SCO104960.

Keywords: COPD, FEV1, Blood, mRNA, Gene expression, Co-expression, WGCNA, Biomarker, Transcriptome

* Correspondence: maen.obeidat@hli.ubc.ca
1The University of British Columbia Centre for Heart Lung Innovation, St Paul’s Hospital, 1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
Full list of author information is available at the end of the article

© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Obeidat et al. Respiratory Research (2017) 18:72 DOI 10.1186/s12931-017-0558-1

Background

Chronic obstructive pulmonary disease (COPD) is currently the third leading cause of death [1]. The disease is under genetic and environmental control with cigarette smoking being the major modifiable risk factor in the Western world [2]. COPD is characterized by chronic irreversible airflow limitation that is often accompanied by systemic inflammation [3, 4].
The two main morphologic phenotypes of COPD are small airway obstruction and emphysematous destruction and enlargement of airspaces. While the molecular mechanisms underlying the two processes may be different, COPD is diagnosed and assessed using lung function parameters; the most commonly used are the forced expiratory volume in 1 s (FEV1) and its ratio with the forced vital capacity (FEV1/FVC).

There is a huge unmet clinical need to identify clinically useful biomarkers for COPD [5]. To this end, blood biomarkers would be highly desirable since blood is very accessible. However, the main limitation of blood as a source for biomarker discovery is that its signals may not reflect the disease process in the lungs, which are the predominant site of disease in COPD. Recently, a number of studies have evaluated the relationship of gene expression profiles in peripheral blood with COPD endpoints and have demonstrated some signal [6, 7]. One major limitation of using gene expression data for biomarker discovery is the statistical stringency required to declare expression changes significant. Moreover, this traditional gene-by-gene approach is not biologically intuitive, since genes are expressed (and function) in clusters or networks rather than as independent entities.

To address this limitation, in this study we used weighted gene co-expression network analysis (WGCNA) [8] to identify “modules” of co-expressed genes in peripheral blood of former smokers with COPD. We then used these modules to discover novel molecular pathways that are related to FEV1.

Methods

Overall study design

The overall study design is shown in Fig. 1. First, in the discovery cohort, using the WGCNA approach, we identified modules of strongly co-expressed genes. We then determined the association between these discovered gene expression modules and FEV1% predicted in the discovery cohort. Next, we determined the reproducibility of these relationships in an independent replication cohort.
In both the discovery and replication cohorts, all analyses were performed with and without adjustment for cell counts in the peripheral circulation. We also determined whether the discovered co-expression patterns in the discovery cohort were preserved in the replication cohort. Finally, we used external cell-specific gene expression studies to determine whether the discovered gene expression modules were enriched (i.e. over-represented) for specific cell types in the peripheral circulation (e.g. neutrophils, eosinophils, lymphocytes, monocytes, etc.).

Fig. 1 Overall study design. Flowchart steps: peripheral blood gene expression in the discovery cohort (n = 238); construct gene co-expression modules; module-level associations with FEV1; peripheral blood gene expression in the replication cohort (n = 381); test modules for preservation in the replication cohort; association of discovery modules with FEV1 in the replication cohort; test FEV1-associated modules for enrichment in cell-specific gene expression

Study subjects

The discovery and replication populations were subsets of the ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints) study [9]. ECLIPSE was a 3-year non-interventional, multicentre, longitudinal prospective study of COPD progression. ECLIPSE included 2164 COPD patients aged 40–75 years (smoking history ≥10 pack-years with a post-bronchodilator FEV1/FVC < 0.70 and FEV1 < 80% predicted) and 337 smokers and 245 non-smokers who were control subjects (FEV1/FVC > 0.70 and FEV1 > 90% predicted). Blood was collected in PAXgene RNA tubes and frozen at −80 °C. The gene expression sub-study of ECLIPSE was originally designed to determine gene signatures of exacerbation in peripheral blood of patients with COPD [10]. The discovery cohort consisted of 238 former smokers with COPD. The replication cohort included 381 subjects (54.3% former and 38.6% current smokers) who were not part of the discovery set.
The parent ECLIPSE study was approved by the relevant ethics review boards at each of the participating centres. Study participants provided written informed consent, and participants’ information was de-identified. The ECLIPSE study was funded by GlaxoSmithKline, under ClinicalTrials.gov identifier NCT00292552 and GSK No. SCO104960. This gene expression sub-study was funded by Genome British Columbia and was approved by the Providence Health Care Research Ethics Board (REB) of the University of British Columbia (UBC) (H11-00786).

Microarray data processing

The PAXgene Blood miRNA kit from PreAnalytiX (Cat. #763134) was used to extract total RNA, which was then hybridized to the Affymetrix Human Gene 1.1 ST array. The Affymetrix GeneTitan MC Scanner (Affymetrix Inc.) was used to scan the array plates. The oligo Bioconductor [11] and RMA Express [12] packages were used to perform quality control on the microarray data. Background correction, normalization and summarization of the data, and filtering out of non-informative probe sets, were undertaken using the Factor Analysis for Robust Microarray Summarization (FARMS Bioconductor package) [13]. The gene expression data are available on the NCBI Gene Expression Omnibus (GEO) under http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE71220.

Weighted gene co-expression network analysis (WGCNA)

The WGCNA R package [8] was used to cluster groups of strongly co-expressed genes into co-expression networks. We followed the WGCNA tutorials (http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/index.html). A weighted gene co-expression network reconstruction algorithm was used to create co-expression networks among the 18,892 unique genes [14]. The workflow of WGCNA began by creating a matrix of Pearson correlations between genes, and transforming this into an adjacency matrix through soft thresholding, that is, by raising each correlation to a power β.
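As a rough illustration of the soft-thresholding step just described, the following Python sketch computes an adjacency matrix from a handful of expression profiles. It is not the WGCNA R implementation: the function names, the toy data and the exponent passed in the example are hypothetical, and the sketch assumes an unsigned network, in which the absolute value of each correlation is raised to the power β.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(profiles, beta):
    """Return a_ij = |cor(gene_i, gene_j)| ** beta for every pair of genes.

    Assumption: an unsigned network (absolute correlation). The text does
    not state whether a signed or an unsigned network was used.
    """
    n_genes = len(profiles)
    adjacency = [[1.0] * n_genes for _ in range(n_genes)]  # diagonal stays 1
    for i in range(n_genes):
        for j in range(n_genes):
            if i != j:
                r = pearson(profiles[i], profiles[j])
                adjacency[i][j] = abs(r) ** beta
    return adjacency


# Hypothetical toy data: three genes measured in five samples.
toy_profiles = [
    [1.0, 2.0, 3.0, 4.0, 5.0],  # gene A
    [2.0, 4.1, 5.9, 8.2, 9.9],  # gene B, closely tracks gene A
    [5.0, 1.0, 4.0, 2.0, 3.0],  # gene C, only weakly related to gene A
]
# The exponent here is arbitrary and chosen only for illustration.
adjacency = soft_threshold_adjacency(toy_profiles, beta=6)
```

Raising correlations to a power leaves strong correlations close to 1 while pushing weak ones towards 0, so that in this toy example the A-B connection remains strong and the A-C connection nearly vanishes, without imposing a hard cut-off.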
In this study, β = 7 was selected so that the resulting adjacency matrix approximated a scale-free topology. The adjacency matrix was transformed into a topological overlap matrix (TOM) [15]. Modules were defined as groups of highly interconnected genes. To identify modules of highly co-expressed genes, we used average linkage hierarchical clustering to group genes based on the topological overlap of their connectivity, followed by a dynamic tree-cut algorithm to cluster dendrogram branches into gene modules [16]. Each of the resulting modules was assigned a color. For each gene, we calculated a Module Membership (MM), whose values ranged between 0 and 1, by correlating the gene’s expression profile with the module eigengene, defined as the first principal component of the gene expression profiles in that module. A gene that has an MM approaching 1 is considered to be highly connected to other genes in that module. In this study, “hub” genes, which are considered to be central to the module, were defined based on the sum of ranks of their MM and gene significance for association with FEV1.

Module preservation

To test for module preservation in the replication sample, we used the Zsummary statistic of the WGCNA package [17]. Zsummary integrates two preservation measures: a density preservation statistic, which determines whether a module’s genes remain highly connected in the replication network, and a connectivity-based preservation statistic, which determines whether the connectivity pattern between genes in the discovery network is similar to that in the replication network [17]. A permutation test is used to assess the significance of the preservation statistics and to compute Zsummary, including for the “gold” module, which is a random sample of genes representing the entire network. Based on the thresholds determined by Langfelder et al. [17], modules with a Zsummary score >10 demonstrate strong preservation.

Differential gene and module expression analysis

In the discovery dataset, linear regression was used to identify genes whose expression was significantly related to FEV1% predicted, after adjustments for age, sex and pack-years of smoking. The same approach was used to identify modules obtained from WGCNA that were differentially expressed with regard to FEV1% predicted. In a sensitivity analysis, both the gene and module-level associations were adjusted for cell counts in peripheral blood. For this analysis, the first three principal components from the five cell types (neutrophils %, lymphocytes %, monocytes %, eosinophils % and basophils %) accounted for 99.7% of variation in the cell percentages and were used as covariates in the linear regression model. The Benjamini-Hochberg method was applied to correct for multiple testing [18].

Replication of module FEV1 associations

For modules identified in the discovery cohort, we computed their eigengene values in the replication cohort and then tested them for associations with FEV1 in this independent set of subjects using linear regression. As in the discovery cohort, the analysis was adjusted for age, sex and pack-years, with additional adjustment for smoking status given that the cohort consisted of former and current smokers. A parallel analysis with additional adjustment for cell count was also performed.

Enrichment in cell specific gene expression

The three modules associated with FEV1 were tested for enrichment in cell specific gene expression data from three independent studies. These include: 1) the study by Allantaz et al. [19], which performed miRNA and mRNA expression profiling in a panel of nine human immune cell subsets (neutrophils, eosinophils, monocytes, B cells, NK cells, CD4 T cells, CD8 T cells, myeloid dendritic cells (mDCs) and plasmacytoid dendritic cells (pDCs)) to identify cell-type specific expression (GSE28490 and GSE28491) in a discovery and a validation cohort; 2) the study by Naranbhai et al. [20], which measured gene expression and mapped expression quantitative trait loci (eQTL) in peripheral blood CD16+ neutrophils from 101 healthy European adults (E-MTAB-3536); and 3) the study by Fairfax et al. [21], which measured gene expression in B cells and monocytes (E-MTAB-945).

Affymetrix arrays were normalized using RMA and Illumina arrays were normalized using quantile normalization. Non-overlapping genes across the three studies were removed. Spearman rank correlations were computed to determine the extent of correlation between genes in significant modules and the cell specific expression values. Furthermore, a permutation analysis was performed by shuffling the expression data for 10,000 iterations and counting the number of times that the permuted rho was greater than or equal to the value obtained for each module. In addition to P values, the enrichment was ranked using rho values, and agreement between studies was considered in the assignment of the most likely cell type.

Ingenuity pathway analysis

QIAGEN’s Ingenuity Pathway Analysis (IPA®, QIAGEN Redwood City, www.qiagen.com/ingenuity) was used to analyze the gene sets for enriched canonical pathways.

Statistical analysis software

All analyses were performed with R version 3.2.1 and Bioconductor packages [22]. Data processing was performed using Biovia Pipeline Pilot.

Results

The discovery study included 238 former smokers with COPD, while the replication cohort included 323 COPD patients and 58 controls.
The demographics of study participants are shown in Table 1.

Table 1 Subjects demographics
Variable                      Discovery      Replication    Replication    P-value*
N                             238 COPD       323 COPD       58 controls
Age                           64.2 ± 6.2     63.9 ± 6.1     59.6 ± 6.5     <0.001
Male                          64.3%          67.8%          62.1%          0.556
BMI                           28.1 ± 6       26.5 ± 5.8     28.6 ± 4.3     0.001
Smoker                                                                     <0.001
  Former                      96.6%          54.5%          53.4%
  Current                     3.4%           45.5%          0%
  Never                       0%             0%             46.6%
Pack years                    46 ± 26.9      48.2 ± 26.4    26.9 ± 14.1    <0.001
FEV1% predicted               49.5 ± 16.2    49.7 ± 15.9    109.7 ± 15.8   <0.001
FEV1/FVC                      0.46 ± 0.13    0.45 ± 0.11    0.79 ± 0.06    <0.001
GOLD                                                                       –
  2                           43.7%          39.6%          –
  3                           44.1%          49.2%          –
  4                           12.2%          11.1%          –
Exacerbations in prior year   1.6 ± 1.8      0.4 ± 0.5      –              <0.001
*P-value is from F test for continuous variables and chi-square test for categorical variables

Gene level associations with FEV1

At the gene level, the strongest association with FEV1 was observed for BTN2A1 (Butyrophilin subfamily 2 member A1), which was negatively correlated with FEV1 (FDR = 0.094). It was the only gene that had an FDR < 0.1. The top 10 genes associated with FEV1 are shown in Additional file 1: Table S1.

Module identifications and associations with FEV1

Applying WGCNA to the 18,892 genes expressed in blood cells led to the identification of 17 modules of various sizes, ranging from 117 genes in the “grey60” module to 5783 genes in the “turquoise” module. A total of 3659 genes could not be mapped to any module; these genes were grouped into the “grey” module and were not considered further in the differential expression analyses.

Three modules showed strong associations with FEV1 after adjustments for age, sex and pack-years of smoking. The most significantly correlated module was the “yellow” module, containing 918 genes. It had a negative relationship with FEV1 (FDR = 0.004). The module with the second strongest association with FEV1 (FDR = 0.007) was the “green” module, which contained 553 genes and was also negatively correlated with FEV1. Finally, the “brown” module, which contained 1569 genes, was positively correlated to FEV1 (FDR = 0.03). The relationships between all the modules and FEV1 are shown in Table 2.

Table 2 Module associations with FEV1 in the discovery cohort
Module         Estimate   SE       p-value    FDR        Module size
Yellow         −59.924    16.106   2.49E-04   4.49E-03   918
Green          −54.373    15.998   7.96E-04   7.17E-03   553
Brown          45.964     16.373   5.42E-03   3.25E-02   1569
Greenyellow    −38.935    16.077   1.62E-02   5.97E-02   369
Blue           38.616     16.000   1.66E-02   5.97E-02   2399
Magenta        −34.945    16.104   3.10E-02   9.31E-02   455
Red            30.352     16.007   5.92E-02   1.52E-01   510
Pink           21.163     16.248   1.94E-01   4.37E-01   471
Turquoise      −19.410    16.096   2.29E-01   4.58E-01   5783
Grey60         −16.590    16.461   3.15E-01   5.66E-01   117
Midnightblue   9.907      16.225   5.42E-01   7.14E-01   152
Black          9.638      16.210   5.53E-01   7.14E-01   493
Tan            −9.510     16.190   5.58E-01   7.14E-01   330
Purple         −8.022     16.543   6.28E-01   7.14E-01   442
Lightcyan      −7.684     16.168   6.35E-01   7.14E-01   123
Salmon         −4.268     16.136   7.92E-01   8.03E-01   316
Cyan           4.047      16.209   8.03E-01   8.03E-01   233
SE standard error, FDR false discovery rate

Genes in the yellow, green and brown modules showed strong enrichment for certain biological processes, suggesting that these modules have distinct biological functions (Table 3). The green module, for instance, was enriched in the interleukin (IL)-10, triggering receptor expressed on myeloid cells 1 (TREM1), Fc receptor-mediated phagocytosis in macrophages and monocytes, and peroxisome proliferator-activated receptor (PPAR) signalling pathways. The yellow module was enriched in IL-8 signalling, the production of nitric oxide and reactive oxygen species in macrophages, and the caveolar-mediated endocytosis and relaxin signalling pathways. The brown module was enriched in processes related to DNA transcription and translation.

Mapping “hub” genes

The identification of modules allowed mapping of hub genes, which are central to their respective modules. To identify hub genes, we used a combination of the gene’s significance (P value) for its association with FEV1 and the gene’s module membership (MM). MM is a measure of how well that gene is connected to the entire module and is reflective of a gene’s centrality. Using this approach, the top hub genes for the green module were the dedicator of cytokinesis 5 (DOCK5) and DENN domain containing 3 (DENND3) genes. For the yellow module, the top two hub genes were RAB3D, member RAS oncogene family (RAB3D) and GRB2 (Growth factor receptor-bound protein 2) associated binding protein 2 (GAB2). For the brown module, the top hub genes were DCAF16 and EIF2AK3. The networks of GAB2, DOCK5 and DCAF16 are shown in Fig. 2.

Impact of adjustment for complete blood count (CBC) and differential on the gene and module level associations with FEV1

Because peripheral blood contains a mixture of inflammatory cells, we evaluated the impact of complete blood count (CBC) and differential on gene expression at the gene as well as the module level. The correlation of eigengenes with CBC in peripheral blood of the same subjects is shown in Additional file 1: Table S2 for the discovery and replication cohorts.

The yellow and green modules, which were negatively associated with FEV1, were positively correlated with neutrophils (P < 0.001) and negatively correlated to lymphocytes (P < 0.001) in peripheral blood. The brown module, which showed positive association with FEV1, was negatively correlated with neutrophils (P < 0.001) and positively correlated to lymphocytes (P < 0.001). These relationships were replicated in the replication cohort.

To evaluate the impact of peripheral blood cell count on gene level association, the analysis was repeated by including CBC and differential of peripheral blood in the statistical model we had used previously. Results are shown in Additional file 1: Table S3. The smallest FDR value after cell count adjustment was 0.64, indicating that cell count adjustment had a large effect on peripheral blood gene expression signatures for FEV1. There was a modest correlation (r = 0.56, P < 2.2 × 10^−16) between the P values from the cell count adjusted and the non-cell count adjusted associations.

We performed a similar analysis by adding CBC and differential as covariates in the module level analysis (Additional file 1: Table S4). This led to the inflation of P values and a loss of statistical significance in the relationships between modules and FEV1: the yellow and green modules, for instance, ranked third and seventh with P values of 0.158 and 0.282, respectively, and an FDR = 0.653 in the CBC adjusted analysis.

The relationship between modules and inflammatory cells in peripheral blood

To determine which specific cell types were influencing gene expression in each of the modules, we evaluated three external datasets that had captured cell specific gene expression in peripheral blood. The results are shown in Table 4. The green and yellow modules, which were both negatively associated with FEV1, were enriched in neutrophils, while the brown module, which showed positive association with FEV1, was enriched in CD4+ T cells, CD8+ T cells and CD56+ NK cells.

Modules’ preservation and reproducibility of FEV1 associations

The WGCNA modules were tested for preservation in a replication cohort of 381 current and former smokers with COPD.
The resulting preservation Zsummary was >10, which was higher than that of the randomly assigned “gold” module, suggesting that all modules (except grey) were strongly preserved in the replication cohort (Fig. 3).

To determine whether the module associations were reproducible, eigengenes were computed in the replication cohort for modules from the discovery cohort. The new eigengenes were then tested for association with FEV1 in the replication cohort (Additional file 1: Table S5). Interestingly, the top three modules associated with FEV1 in the discovery cohort, brown, yellow and green, were also the top three modules associated with FEV1 in the replication cohort, with P = 0.024, P = 0.035, and P = 0.036 for brown, green and yellow, respectively (Fig. 4). Similar to results from the discovery cohort, adjustments for cell count in the replication cohort led to the inflation of P values for these modules (Additional file 1: Table S6).

Table 3 Biological processes enrichment for the three FEV1 associated modules
Canonical Pathways                                                      p-value
Green Module
  IL-10 Signaling                                                       4.47E-08
  TREM1 Signaling                                                       1.14E-06
  PPAR/RXR Activation                                                   3.96E-06
  Fc Receptor-mediated Phagocytosis in Macrophages and Monocytes        9.75E-06
  PPAR Signaling                                                        9.75E-06
Yellow Module
  IL-8 Signaling                                                        5.90E-09
  Production of Nitric Oxide and Reactive Oxygen Species in Macrophages 8.84E-07
  Caveolar-mediated Endocytosis Signaling                               2.29E-05
  Role of Tissue Factor in Cancer                                       2.95E-05
  Relaxin Signaling                                                     3.26E-05
Brown Module
  tRNA Charging                                                         2.06E-10
  Purine Nucleotides De Novo Biosynthesis II                            5.12E-04
  Cleavage and Polyadenylation of Pre-mRNA                              8.27E-04
  Nur77 Signaling in T Lymphocytes                                      1.61E-03
  Leucine Degradation I                                                 2.17E-03

Table 4 Cell type enrichment for the three FEV1 associated modules
Reference dataset              Cell type          Module   rho*    Permutation best rho
Allantaz et al. (Discovery)    Neutrophil         Green    0.747   0.216
Allantaz et al. (Validation)   Neutrophil         Green    0.715   0.193
Naranbhai et al.               Neutrophil         Green    0.656   0.186
Allantaz et al. (Discovery)    Neutrophils        Yellow   0.773   0.147
Allantaz et al. (Validation)   Neutrophils        Yellow   0.729   0.145
Naranbhai et al.               Neutrophils        Yellow   0.697   0.143
Allantaz et al. (Discovery)    CD4+ T cells       Brown    0.618   0.121
Allantaz et al. (Discovery)    CD8+ T cells       Brown    0.589   0.127
Allantaz et al. (Validation)   CD8+ T cells       Brown    0.571   0.118
Allantaz et al. (Discovery)    CD56+ NK cells     Brown    0.568   0.122
Allantaz et al. (Validation)   CD4+ T cells       Brown    0.567   0.109
Allantaz et al. (Validation)   NK cells           Brown    0.536   0.142
Allantaz et al. (Discovery)    CD14+ monocytes    Brown    0.531   0.129
*denotes P < 1 × 10^−308 for all the reported Spearman’s rho values. Permutation best rho: the highest rho value obtained during permutation

Discussion

COPD is an inflammatory lung disease, which has a significant systemic component that contributes to its overall morbidity and mortality. Because inflammation is thought to play a central role in the pathogenesis of COPD, there has been a tremendous surge of interest in studying circulating immune and inflammatory cells as potential biomarkers for the disease. There is a pressing need to identify genomic signatures of disease severity and activity that can guide therapeutic decisions and address the growing burden of COPD worldwide.
In this study, we used modules of co-expressed genes in a highly accessible tissue, peripheral blood, to identify genomic signatures of COPD severity using FEV1 as the readout.

The main findings of the present study were that: 1) at the gene level, only one gene was associated with FEV1 (FDR < 0.1); 2) the 18,892 genes expressed in peripheral blood mapped to 17 modules of co-expressed genes; 3) three of the modules were associated with FEV1; 4) in a second and larger cohort of current and former smokers with COPD and controls, all of the modules were preserved at the co-expression level; 5) the three modules in the discovery cohort that were statistically associated with FEV1 showed the strongest associations with FEV1 in the replication cohort (P < 0.05); and 6) the two modules which were negatively related to FEV1 were enriched in IL10 and IL8 pathways and were strongly correlated to neutrophil cell-specific expression, while the positively related module was enriched in DNA transcription pathways and strongly correlated to T cell specific expression.

Fig. 2 Networks of GAB2, DOCK5 and DCAF16. The figure shows the networks for GAB2, DOCK5 and DCAF16 in the yellow, green and brown modules, respectively. The genes shown are the top 50 significant genes that had an FDR adjusted P value <0.05 for association with FEV1. The size of the circle is proportional to the P value on the −log10 scale (larger = smaller P value). The thickness of the edge is proportional to the topological overlap measure (TOM) identified in WGCNA

Fig. 3 Preservation Zsummary of modules from discovery cohort in the replication cohort. The Y axis shows the modules vs. their corresponding Zsummary statistics on the X axis. All modules (except the grey module) showed a strong preservation based on the threshold prescribed in Langfelder et al. [17] of a Zsummary score >10. Furthermore, the “gold” module consists of 1000 randomly selected genes that represent a sample of the whole genome, constructed for module preservation analysis. The grey module consists of genes that were not assigned to any module in the network

Fig. 4 Scatter plot of module associations with FEV1 in discovery and replication cohorts. The Y axis shows the P values (−log10 scale) for FEV1 associations in the replication cohort while the X axis shows the association P values in the discovery cohort

Previous studies investigating differential expression in COPD have mainly tested genes and probesets individually; however, in vivo, genes are co-expressed in networks. By leveraging co-expression patterns, networks of closely co-expressed genes can be identified, often revealing novel functional pathways. The resulting network modules can then be tested for differential expression with FEV1. Another major advantage of network analyses is that this approach can significantly decrease false negatives (Type II error) by markedly reducing the number of features that are tested. In the present study, the three modules reproducibly associated with FEV1 were enriched in biological pathways, suggesting that co-expressed genes share biological functions within a particular module.

In each of the co-expressed networks, driver or “hub” genes can be identified, which can additionally inform the biology of these modules as they relate to FEV1. The top hub gene for the green module was DOCK5, which is a member of the DOCK family of guanine-nucleotide exchange factors that activate Rho-family GTPases by exchanging bound GDP for free guanosine triphosphate (GTP) [23]. DOCK5 has been shown to interact with the regulatory and catalytic subunits of protein phosphatase 2, encoded by PPP2R1A/B/C [24]. In mice, protein phosphatase 2A has been shown to regulate innate immune and proteolytic responses to cigarette smoke exposure in the lung [25]. A top hub gene for the yellow module was GAB2, which was negatively correlated to FEV1. GAB2 is a member of the growth factor receptor-bound protein 2 (GRB2) associated binding protein (GAB) gene family, which acts as an adapter molecule in signal transduction of cytokine and growth factor receptors, and T and B cell antigen receptors [26]. GAB2 is the principal activator of phosphatidylinositol-3 kinase in response to activation of the high affinity IgE receptor [27]. In a previous study, the expression of GAB2 in sputum was significantly increased in patients with severe emphysema compared to those who had minimal emphysema [28]. In the brown module, DDB1 and CUL4 associated factor 16 (DCAF16) and eukaryotic translation initiation factor 2 alpha kinase 3 (EIF2AK3) were the top two FEV1 hub genes. Little is known about DCAF16, and EIF2AK3 encodes a protein which functions as an endoplasmic reticulum stress sensor [29].

Although the present study is one of the largest to date that have evaluated peripheral gene expression signatures in COPD [6], at the gene level only one gene, butyrophilin subfamily 2 member A1 (BTN2A1), was significantly associated with FEV1. Butyrophilin has been shown to regulate immune function [30]. In contrast to the gene-by-gene comparison approach, the use of network based modules identified a larger number of genes within the three significant modules which were related to FEV1, highlighting the value of network approaches in identifying gene signatures. Previous work on exacerbations in COPD demonstrated similar findings [31].

It is notable that adjustments for cell count had a large impact on the relationship between gene expression signatures and FEV1. This is not surprising given that peripheral whole blood is a heterogeneous tissue composed of many different immune cell subsets. Moreover, its cellular composition varies in response to physiological or pathological processes.
These processes often involve cell differentiation and/or transit of specific cell types between blood and tissues, resulting in important shifts in the cellular makeup of samples under different conditions affecting blood-derived gene expression data. Disentangling causal from reactive relationships is challenging in observational studies. Although it is common practice to statistically adjust for peripheral blood cell composition by including CBC and differential cell counts as covariates, regression methods do not fully take into account cell-specific gene expression and thus may obfuscate important cell-specific signatures. To explore this possibility, in the present study, in addition to the standard regression analysis, we interrogated cell-specific gene expression in three external studies that contained cell-specific gene expression data that were generated by using cell isolation methods. Using this approach, we found that the two modules which were negatively associated with FEV1 contained strong neutrophil-specific gene expression, suggesting that increased number and/or activation of peripheral neutrophils is associated with airway obstruction. The role of neutrophils in the pathogenesis of COPD is well established [32, 33]. The module that was positively related to FEV1, on the other hand, contained gene expression signals that were T and B cell specific. Previous studies have highlighted the role of the adaptive immune response in COPD [34–37].

The current study has a number of limitations. First, gene expression signatures in peripheral blood may not reflect disease process in lungs of COPD patients. However, peripheral blood is more accessible than lung tissue and may provide information on biological processes such as immune responses that may be relevant in COPD. Second, FEV1 may not fully capture disease activity in COPD and could reflect different pathological processes (emphysema or airway disease).
Finally, the cell count adjustment had a large effect on the relationship between modules and FEV1. Given that changes in cell abundance can be causally related to changes in FEV1 and disease status [38, 39] and given the strong correlations with cell specific expression in external datasets, the regression methods used for adjustment may have been overly conservative. Most published studies to date on peripheral blood in COPD do not adjust for cell count [6, 31, 40]. Future studies are warranted that incorporate differences in cell counts and/or measurement of cell specific expression changes.

Conclusions

In conclusion, we identified gene co-expression modules in peripheral blood of patients with COPD that are highly reproducible. Three modules showed strong associations with FEV1 and were sensitive to cell count. In a larger replication cohort, the module-based co-expression patterns were preserved and associated with FEV1 in the same direction. Network based analyses represent a novel approach to discover biomarkers for COPD and warrant further attention in future studies.

Additional file

Additional file 1: Table S1. Top 10 FEV1 differentially expressed genes in the discovery cohort. Table S2. Module correlations with cell counts in the discovery and replication cohorts. Table S3. Top 10 FEV1 differentially expressed genes in the discovery cohort after adjusting for cell count. Table S4. Module associations with FEV1 in the discovery cohort adjusting for cell count. Table S5. Modules association with FEV1 in the replication cohort. Table S6. Modules association with FEV1 in the replication cohort adjusting for cell counts.
(DOCX 43 kb)

Abbreviations

COPD: Chronic obstructive pulmonary disease; ECLIPSE: Evaluation of COPD longitudinally to identify predictive surrogate endpoints; FDR: False discovery rate; FEV1: Forced expiratory volume in 1 s; FEV1%pred: Percentage of predicted forced expiratory volume in 1 s; FEV1/FVC: Ratio of forced expiratory volume in 1 s to forced vital capacity; FVC: Forced vital capacity; MM: Module membership; PC: Principal component; WGCNA: Weighted gene co-expression network analysis

Acknowledgement

Ma’en Obeidat is a Postdoctoral Fellow of the Michael Smith Foundation for Health Research (MSFHR) and the Canadian Institute for Health Research (CIHR) Integrated and Mentored Pulmonary and Cardiovascular Training program (IMPACT). He is also a recipient of a British Columbia Lung Association Research Grant.

Funding

The ECLIPSE study was funded by GlaxoSmithKline, under ClinicalTrials.gov identifier NCT00292552 and GSK No. SCO104960. This gene expression sub-study was funded by Genome British Columbia.

Availability of data and materials

The blood gene expression data are available on the NCBI Gene Expression Omnibus (GEO) under http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE71220.

Authors’ contributions

Conception and design of study: MO, DDS. Performed the experiments/measurements: MO, VC, CPS, BEM, YN. Analysis and interpretation of data: MO, VC, CPS, YN, AKA, BL, OR, NF. Drafting the manuscript for important intellectual content: MO, DDS, PDP, RN. Contributed to discussion: CPH, BM and PC.
All authors read and approved the final manuscript.

Competing interests

BEM is an employee and shareholder of GSK.

SR has served as a consultant, participated in advisory boards, received honorarium for speaking or grant support from: American Board of Internal Medicine, Advantage Healthcare, Almirall, American Thoracic Society, AstraZeneca, Baxter, Boehringer Ingelheim, Chiesi, ClearView Healthcare, Cleveland Clinic, Complete Medical Group, CSL, Daiichi Sankyo, Decision Resources, Forest, Gerson Lehman, Grifols, GroupH, Guidepoint Global, Haymarket, Huron Consulting, Inthought, Johnson and Johnson, Methodist Health System – Dallas, NCI Consulting, Novartis, Pearl, Penn Technology, Pfizer, PlanningShop, PSL FirstWord, Qwessential, Takeda, Theron and WebMD. Since August 10, 2015 he has served as chief clinical scientist, new clinical development, AstraZeneca, UK.

DDS: Over the past 3 years, DDS has served as a consultant on AstraZeneca (AZ) and Novartis Advisory Boards for COPD. He has been a consultant with Amgen and Almirall. He has received research funding from AZ and Boehringer Ingelheim (BI). He has given lectures sponsored by BI and AZ.

Consent for publication

Not applicable.

Ethics approval and consent to participate

The ECLIPSE study was approved by the relevant ethics review boards at each of the participating centres. Study participants provided written informed consent, and participants’ information was de-identified. This gene expression sub-study was approved by the Providence Health Care Research Ethics Board (REB) of the University of British Columbia (UBC) (H11-00786).

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author details

1 The University of British Columbia Centre for Heart Lung Innovation, St Paul’s Hospital, 1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada.
2 Prevention of Organ Failure (PROOF) Centre of Excellence, Vancouver, BC, Canada.
3 Singapore Immunology Network, 8A Biomedical Grove, Singapore, Singapore.
4 Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, USA.
5 Division of General Internal Medicine and Primary Care, Brigham and Women’s Hospital and Harvard Medical School, Boston, USA.
6 Pulmonary and Critical Care Division, Brigham and Women’s Hospital and Harvard Medical School, Boston, USA.
7 GlaxoSmithKline, King of Prussia, PA, USA.
8 Division of Pulmonary and Critical Care Medicine, University of Nebraska Medical Center, Omaha, NE, USA.
9 Clinical Discovery Unit, Early Clinical Development, AstraZeneca, Cambridge, UK.
10 Respiratory Division, Department of Medicine, University of British Columbia, Vancouver, BC, Canada.

Received: 30 October 2016 Accepted: 20 April 2017

References

1. WHO. The top 10 causes of death. Geneva: World Health Organization; 2014.
2. Vestbo J, Hurd SS, Agusti AG, Jones PW, Vogelmeier C, Anzueto A, Barnes PJ, Fabbri LM, Martinez FJ, Nishimura M, et al. Global strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary disease: GOLD executive summary. Am J Respir Crit Care Med. 2013;187:347–65.
3. GOLD. Global initiative for chronic obstructive lung disease. 2011.
4. Decramer M, Janssens W. Chronic obstructive pulmonary disease and comorbidities. Lancet Respir Med. 2013;1:73–83.
5. Sin DD, Hollander Z, DeMarco ML, McManus BM, Ng RT. Biomarker development for chronic obstructive pulmonary disease. From discovery to clinical implementation. Am J Respir Crit Care Med. 2015;192:1162–70.
6. Bahr TM, Hughes GJ, Armstrong M, Reisdorph R, Coldren CD, Edwards MG, Schnell C, Kedl R, LaFlamme DJ, Reisdorph N, et al. Peripheral blood mononuclear cell gene expression in chronic obstructive pulmonary disease. Am J Respir Cell Mol Biol. 2013;49:316–23.
7. Chang Y, Glass K, Liu Y-Y, Silverman EK, Crapo JD, Tal-Singer R, Bowler R, Dy J, Cho M, Castaldi P. COPD subtypes identified by network-based clustering of blood gene expression. Genomics.
2016;107:51–8.
8. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinf. 2008;9:559.
9. Vestbo J, Anderson W, Coxson HO, Crim C, Dawber F, Edwards L, Hagan G, Knobil K, Lomas DA, MacNee W, et al. Evaluation of COPD longitudinally to identify predictive surrogate end-points (ECLIPSE). Eur Respir J. 2008;31:869–73.
10. Obeidat M, Fishbane N, Nie Y, Chen V, Hollander Z, Tebbutt SJ, Bosse Y, Ng RT, Miller BE, McManus B, et al. The effect of statins on blood gene expression in COPD. PLoS One. 2015;10:e0140022.
11. Carvalho BS, Irizarry RA. A framework for oligonucleotide microarray preprocessing. Bioinformatics. 2010;26:2363–7.
12. Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003;4:249–64.
13. Hochreiter S, Clevert D-A, Obermayer K. A new summarization method for affymetrix probe level data. Bioinformatics. 2006;22:943–9.
14. Zhang B, Horvath S. A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol. 2005;4:Article17.
15. Yip A, Horvath S. Gene network interconnectedness and the generalized topological overlap measure. BMC Bioinf. 2007;8:22.
16. Langfelder P, Zhang B, Horvath S. Defining clusters from a hierarchical cluster tree: the dynamic tree cut package for R. Bioinformatics. 2008;24:719–20.
17. Langfelder P, Luo R, Oldham MC, Horvath S. Is my network module preserved and reproducible? PLoS Comput Biol. 2011;7:e1001057.
18. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Methodol. 1995;57:289–300.
19. Allantaz F, Cheng DT, Bergauer T, Ravindran P, Rossier MF, Ebeling M, Badi L, Reis B, Bitter H, D’Asaro M, et al.
Expression profiling of human immune cell subsets identifies miRNA-mRNA regulatory relationships correlated with cell type specific expression. PLoS One. 2012;7:e29979.
20. Naranbhai V, Fairfax BP, Makino S, Humburg P, Wong D, Ng E, Hill AVS, Knight JC. Genomic modulators of gene expression in human neutrophils. Nat Commun. 2015;6:7545.
21. Fairfax BP, Makino S, Radhakrishnan J, Plant K, Leslie S, Dilthey A, Ellis P, Langford C, Vannberg FO, Knight JC. Genetics of gene expression in primary immune cells identifies cell type-specific master regulators and roles of HLA alleles. Nat Genet. 2012;44:502–10.
22. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5:R80.
23. Côté J-F, Vuori K. Identification of an evolutionarily conserved superfamily of DOCK180-related proteins with guanine nucleotide exchange activity. J Cell Sci. 2002;115:4901–13.
24. Glatter T, Wepf A, Aebersold R, Gstaiger M. An integrated workflow for charting the human interaction proteome: insights into the PP2A system. Mol Syst Biol. 2009;5:237.
25. Wallace AM, Hardigan A, Geraghty P, Salim S, Gaffney A, Thankachen J, Arellanos L, D’Armiento JM, Foronjy RF. Protein phosphatase 2A regulates innate immune and proteolytic responses to cigarette smoke exposure in the lung. Toxicol Sci. 2012;126:589–99.
26. Hibi M, Hirano T. Gab-family adapter molecules in signal transduction of cytokine and growth factor receptors, and T and B cell antigen receptors. Leuk Lymphoma. 2000;37:299–307.
27. Gu H, Saito K, Klaman LD, Shen J, Fleming T, Wang Y, Pratt JC, Lin G, Lim B, Kinet J-P, Neel BG. Essential role for Gab2 in the allergic response. Nature. 2001;412:186–90.
28. Singh D, Fox SM, Tal-Singer R, Plumb J, Bates S, Broad P, Riley JH, Celli B. Induced sputum genes associated with spirometric and radiological disease severity in COPD ex-smokers. Thorax. 2011;66:489–95.
29.
Liu J, Hoppman N, O’Connell JR, Wang H, Streeten EA, McLenithan JC, Mitchell BD, Shuldiner AR. A functional haplotype in EIF2AK3, an ER stress sensor, is associated with lower bone mineral density. J Bone Miner Res. 2012;27:331–41.
30. Arnett HA, Escobar SS, Viney JL. Regulation of costimulation in the era of butyrophilins. Cytokine. 2009;46:370–5.
31. Morrow JD, Qiu W, Chhabra D, Rennard SI, Belloni P, Belousov A, Pillai SG, Hersh CP. Identifying a gene expression signature of frequent COPD exacerbations in peripheral blood using network methods. BMC Med Genomics. 2015;8:1–11.
32. Hoenderdos K, Condliffe A. The neutrophil in chronic obstructive pulmonary disease. Am J Respir Cell Mol Biol. 2013;48:531–9.
33. Gernez Y, Tirouvanziam R, Chanez P. Neutrophils in chronic inflammatory airway diseases: can we target them and how? Eur Respir J. 2010;35:467–9.
34. Faner R, Cruz T, Casserras T, Lopez-Giraldo A, Noell G, Coca I, Tal-Singer R, Miller B, Rodriguez-Roisin R, Spira A, et al. Network analysis of lung transcriptomics reveals a distinct B-cell signature in emphysema. Am J Respir Crit Care Med. 2016;193:1242–53.
35. Lloyd CM. Chair’s summary: innate and adaptive immune responses in airway disease. Ann Am Thorac Soc. 2014;11:S234–5.
36. Polverino F, Cosio BG, Pons J, Laucho-Contreras M, Tejera P, Iglesias A, Rios A, Jahn A, Sauleda J, Divo M, et al. B cell-activating factor. An orchestrator of lymphoid follicles in severe chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2015;192:695–705.
37. Polverino F, Seys LJM, Bracke KR, Owen CA. B cells in chronic obstructive pulmonary disease: moving to center stage. Am J Physiol Lung Cell Mol Physiol. 2016;311:L687.
38. Chan-Yeung M, Abboud R, Buncio AD, Vedal S. Peripheral leucocyte count and longitudinal decline in lung function. Thorax. 1988;43:462–6.
39. Yeung MC, Buncio AD. Leukocyte count, smoking, and lung function. Am J Med. 1984;76:31–7.
40.
Bhattacharya S, Tyagi S, Srisuma S, Demeo DL, Shapiro SD, Bueno R, Silverman EK, Reilly JJ, Mariani TJ. Peripheral blood gene expression profiles in COPD subjects. J Clin Bioinf. 2011;1:12.

