UBC Faculty Research and Publications

Factors affecting the accuracy of genomic selection for growth and wood quality traits in an advanced-breeding… Lenz, Patrick R; Beaulieu, Jean; Mansfield, Shawn D; Clément, Sébastien; Desponts, Mireille; Bousquet, Jean Apr 28, 2017

RESEARCH ARTICLE    Open Access

Factors affecting the accuracy of genomic selection for growth and wood quality traits in an advanced-breeding population of black spruce (Picea mariana)

Patrick R.N. Lenz1,2*, Jean Beaulieu1,2, Shawn D. Mansfield3, Sébastien Clément1, Mireille Desponts4 and Jean Bousquet2

Lenz et al. BMC Genomics (2017) 18:335
DOI 10.1186/s12864-017-3715-5

* Correspondence: Patrick.Lenz@canada.ca
1 Canadian Wood Fibre Centre, Canadian Forest Service, Natural Resources Canada, Government of Canada, 1055 du PEPS, P.O. Box 10380, Québec, Québec G1V 4C7, Canada
2 Canada Research Chair in Forest Genomics, Institute of Systems and Integrative Biology and Centre for Forest Research, Université Laval, 1030, Avenue de la Médecine, Québec, Québec G1V 0A6, Canada
Full list of author information is available at the end of the article

© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Abstract

Background: Genomic selection (GS) uses information from genomic signatures consisting of thousands of genetic markers to predict complex traits. As such, GS represents a promising approach to accelerate tree breeding, which is especially relevant for the genetic improvement of boreal conifers characterized by long breeding cycles. In the present study, we tested GS in an advanced-breeding population of the boreal black spruce (Picea mariana [Mill.] BSP) for growth and wood quality traits, and concurrently examined factors affecting GS model accuracy.

Results: The study relied on 734 25-year-old trees belonging to 34 full-sib families derived from 27 parents and that were established on two contrasting sites. Genomic profiles were obtained from 4993 Single Nucleotide Polymorphisms (SNPs) representative of as many gene loci distributed among the 12 linkage groups common to spruce. GS models were obtained for four growth and wood traits. Validation using independent sets of trees showed that GS model accuracy was high, related to trait heritability and equivalent to that of conventional pedigree-based models. In forward selection, gains per unit of time were three times higher with the GS approach than with conventional selection. In addition, models were also accurate across sites, indicating little genotype-by-environment interaction in the area investigated. Using information from half-sibs instead of full-sibs led to a significant reduction in model accuracy, indicating that the inclusion of relatedness in the model contributed to its higher accuracies. About 500 to 1000 markers were sufficient to obtain GS model accuracy almost equivalent to that obtained with all markers, whether they were well spread across the genome or from a single linkage group, further confirming the implication of relatedness and potential long-range linkage disequilibrium (LD) in the high accuracy estimates obtained. Only slightly higher model accuracy was obtained when using marker subsets that were identified to carry large effects, indicating a minor role for short-range LD in this population.

Conclusions: This study supports the integration of GS models in advanced-generation tree breeding programs, given that high genomic prediction accuracy was obtained with a relatively small number of markers due to high relatedness and family structure in the population. In boreal spruce breeding programs and similar ones with long breeding cycles, much larger gain per unit of time can be obtained from genomic selection at an early age than by the conventional approach. GS thus appears highly profitable, especially in the context of forward selection in species which are amenable to mass vegetative propagation of selected stock, such as spruces.

Keywords: Genomic selection, Black spruce, Wood properties, Tree improvement and breeding, Genomic-estimated breeding values, Gene SNPs

Background

Genomics of forest trees is rapidly gaining momentum as it promises to unravel the genetic control of adaptive and economically important traits, in part to satisfy the increasing demand for high quality wood fibre worldwide but also, to cope with the increasing challenges imposed by changing climates and environments [1, 2]. Uses of genomic information for breeding are diverse and may rely on reconstruction of the pedigree, verification of co-ancestry in breeding populations for genetic diversity management purposes, or on the correlation of marker information with phenotypes for selection [3]. For example, marker-assisted selection (MAS) was among the first approaches suggested to accelerate tree improvement [4]. In conifers, with the use of current statistical methods and correction for multiple testing, a limited number of markers or genes have been reported to be genetically linked to economically important traits, such as wood quality and/or tree growth [5–7]. Individual Single Nucleotide Polymorphism (SNP) markers are largely constrained to explaining only a minor proportion of the variation, as they rarely explain more than 5% of quantitative trait variation [5–9]. Therefore, for most quantitative traits, the association approach does not appear useful enough in breeding selection where accurate predictions of genetic values are necessary. When using large progeny sets, the proportion of trait variance explained by individual quantitative trait loci (QTLs) has only been slightly higher, which is consistent with the multigenic control of these traits [9–11]. Additionally, the weak linkage disequilibrium (LD) between markers and QTLs across different genetic backgrounds, as well as the narrow range captured in typical QTL studies limits their use to within family selection.

To overcome the limitations of MAS, genomic selection (GS) uses dense genomic marker information (called genomic signatures or profiles) of individuals, as well as their parents when available, to predict their breeding value [12]. Contrary to MAS, GS models simultaneously estimate the effects of all available markers in a training population. Predictions or genomic-estimated breeding values (GEBVs) are then made for progeny of the same or future generations [12]. One of the basic assumptions is that the markers are scattered throughout the genome so that at least some of them are in direct linkage with causal loci [13].

Most economically important traits are of quantitative nature and controlled by many genes [2]. Consequently, it is crucial that an adequate number of markers are used to attain sufficient coverage of the genome, and that as many loci as possible are in LD with QTLs [14]. The combination of the rapidly decreasing costs of high-throughput genotyping as well as the significant advances in sequencing and subsequent computation has facilitated GS-based selection approaches to be considered in organisms even with very large genomes, such as trees and conifers, in particular.

Initially employed in dairy cattle breeding [12, 14, 15], GS has also been applied to other animals such as mice [16], and in plant and crop breeding [17–19]. In trees, GS has been tested in both angiosperms such as eucalypts [20], and gymnosperms such as loblolly pine (Pinus taeda L.) [21, 22], maritime pine (Pinus pinaster Aiton), [23, 24] and white spruce and its hybrids (Picea glauca (Moench) Voss) [25–28]. However, studies on boreal conifer species are still relatively rare, which is most likely due to the fact that substantive genomic resources were only recently made available for several species (reviewed by De La Torre et al. [29]).

Generally, tree improvement of boreal conifer species is characterized by breeding cycles of 30 years and longer, which includes production of crosses, field evaluation of progeny and performing selections, and the propagation of selected superior material through sexual or vegetative means [30]. Early selection methods that facilitate accurate prediction of mature phenotypes at a younger stage are therefore vital to shorten breeding cycles and ultimately improve the cost-efficiency of such breeding programs. Traditionally, these methods have relied on indirect methods of phenotypic selection, which may be less effective, especially when genetic correlations between the juvenile and mature traits are low [31]. Performing selection directly on genomic marker information would, in theory, avoid the loss of prediction
accuracy due to imperfect correlation between different life stages, and would mostly be dependent on the heritability of the target trait at the mature stage [32].

Genomic selection promises to significantly reduce the time commitment for completing a breeding cycle and thus to increase genetic gain per time unit by avoiding the field testing stages, which often represent the largest time commitment [3, 32]. Technically, the estimates of breeding values could be obtained at the seedling stage or even from seed or somatic tissue [1]. Hence, the time required to select elite genotypes of northern conifers, which is largely impacted by their slow inherent growth, could be shortened, and mature phenotypes would only be needed for model construction and validation [26].

Previous GS studies in forest trees have reported moderate to high accuracy of selection models, with correlations between 0.6 and 0.8 when comparing GEBVs with conventional EBVs, when full-sib families were considered [20, 25, 33]. This suggests that under certain conditions of high relatedness, where long-range linkage disequilibrium (LD) is likely a more potent factor of accuracy than short-range LD, the GS approach can easily compete with the conventional pedigree-based breeding approach or even outperform it due to the reduction in time needed to complete a breeding cycle [25, 32, 33]. In contrast, lower accuracy values (0.3 and 0.5) have been observed when half-sib families were considered [26, 28], and GS models developed in unrelated trees had very low power to make predictions [25, 26]. The marginal to low performance of these latter two cases of GS is linked to the larger effective population size and low LD, which are common to natural populations of undomesticated trees [34–36].
Large population sizes indeed result in more recombination and a greater diversity among available gametes within populations, from high outcrossing rates in wind-pollinated species such as black spruce [37]. Moreover, the genetic drift common to small populations with fewer parents, which generates non-random associations among alleles at different loci, i.e., gametic disequilibrium or LD [38], is not a significant factor in such conditions.

Black spruce (Picea mariana [Mill.] B.S.P.) is one of the most abundantly distributed trees throughout the transcontinental boreal forest of Canada and Alaska. It is also one of the most reforested species in Canada, with 65 million seedlings being planted per year in the province of Québec alone, and has been the subject of advanced-generation breeding programs [30].

The objectives of this study were to (1) estimate accuracies and gains per unit of time derived from GS for wood quality and growth traits in an advanced-breeding population of the black spruce; (2) assess variation in accuracy from building site-specific or multiple-site models; (3) explore the roles of relatedness, short- and long-range linkage disequilibrium in genomic prediction accuracy by simulating different family structures and testing different subsets of markers; and (4) investigate the effect of sample sizes on model accuracy.

To meet these objectives, we used trait data obtained from 25-year-old black spruces grown in a genetic test replicated on two environmentally contrasted sites in Québec, Canada.
The genetic test consisted of full-sib families generated using a partial diallel mating design. Trees were genotyped for 4993 SNP markers representative of as many distinct gene loci, and originating from a black spruce high-confidence SNP catalogue previously assembled and validated [39].

Methods

Sampling of plant material

Phenotypic and genetic data were obtained for 734 25-year-old progeny trees belonging to 34 controlled-pollinated families derived from a partial diallel mating design that consisted of 27 parents originating from 9 provenances from the Canadian provinces of Ontario, New Brunswick and Manitoba, and one from Maine in the northeastern United States. The parental trees had previously been identified for their superior growth and stem form in a range-wide provenance trial established on 4 sites in Québec [40]. Progeny were raised as cuttings in the nursery for three years and then established in 1991 by the Ministère des Forêts, de la Faune et des Parcs du Québec (MFFPQ) on two forest sites using a randomized complete block design, with 4-tree row plots and a 2 m by 2 m spacing. The test sites are located in two contrasting environments in the eastern part of the province of Québec: 1) the Matapedia arboretum (latitude: 48° 32’N, longitude: 67° 25’W, elevation 216 m, 333 trees sampled), which is characterized by a warmer climate typical of the balsam fir – yellow birch forest, and 2) the Robidoux site (latitude: 48° 18’N, longitude: 65° 31’W, elevation 275 m, 401 trees sampled), which is characterized by a colder climate typical of the balsam fir – white birch forest domain and under the influence of the Chic-Choc mountains (1270 m) on the Gaspésie peninsula. Needle tissues for DNA extraction were sampled in the upper third of the crown, immediately placed on ice and stored at −10 °C until further processing. DNA extraction protocols used for collected samples are described in detail in Pavy et al. [41].

Phenotypic trait determination

Tree height and diameter at breast height (DBH, 1.3 m above ground) were recorded in 2013 at the age of 25 years, and a wood increment core was extracted from the south-facing side of each tree. Cores were stored in a freezer, conditioned to 7% moisture and cut to 1.68 mm thickness prior to X-ray densitometry analyses (Quintek Measurement Systems, TN). Wood density was calculated as a ring area weighted mean from recorded pith to bark wood density profiles. Microfibril angle (MFA) was estimated using X-ray diffraction, on a Bruker D8 Discovery unit equipped with an area array detector, as per Ukrainetz et al. [42].

Genotyping

In recent years, significant genomic resources have been developed for white spruce and Norway spruce (Picea abies (L.) Karst.), including draft genome sequences [43–46], genetic and QTL maps [11, 47, 48], gene catalog and expression chips [49], large SNP registries and high throughput genotyping chips [40, 50], but comparatively few genomic resources exist for black spruce. Before investigating GS in black spruce, exome capture and sequencing were used together with an in-house bioinformatic pipeline to produce a registry of 97,000 high-confidence SNPs pertaining to around 15,000 gene sequence clusters [39]. In addition, we compiled successfully genotyped gene SNPs from previous black spruce genomic studies [9, 41, 51–53]. Altogether, this information was used to develop an Infinium iSelect SNP genotyping array (Illumina, San Diego, CA) containing 5,300 SNPs representing as many distinct black spruce gene contigs [39]. Based on control DNA replicates, the chip genotyping reproducibility rate obtained was 99.9%. For genomic selection modelling and analyses, genotyping information from a total of 4,993 SNPs representing as many distinct gene loci spread among the 12 spruce linkage groups was retained for each of the 734 progeny trees, resulting in a total of >3.6 million SNP calls.
Given the estimated 1,850 centimorgans (cM) of the black spruce genome [51, 54], this corresponds to a marker density of approximately 2.7 per cM. For all 4,993 SNPs retained for GS analyses, the following in-house quality criteria were met: biallelic SNPs matching with both parental genotypes, a GenTrain (Illumina, San Diego, California) quality score ≥0.25, a call rate ≥85%, a fixation index |FIS| ≤ 0.50, and a minor allele frequency (MAF) ≥0.0055. Furthermore, SNPs with minor frequency alleles that were present in fewer than 10 individuals were not considered. Of the retained markers, only 2% had a MAF < 5%, indicating a small number of rare allele markers.

Estimating "true" breeding values

For each trait, an individual tree (so called “animal”) model was fit using the GS3 software [55] in order to estimate reference or “true” breeding values:

$$ y = X\beta + Sp + Tu + e \qquad (1) $$

where β is a vector of fixed effects, including an overall mean and the fixed site effect, p is the permanent random block effect, u is the vector of random additive polygenic effects following a distribution $N(0, A\sigma_u^2)$ and e is the error term with $N(0, I\sigma_e^2)$. X, T and S are incidence matrices, A is the numerator relationship matrix describing the additive relationships among individuals, and I is the identity matrix. All the trees sampled were used to estimate these reference breeding values.

Basic genomic selection models with all information

In the first set of analyses, phenotypes from both sites were combined and GS models were fit using a simple Bayesian framework in the GS3 software [55, 56], considering all SNPs in the models. Phenotypes were standardized by block-within-site effects and site standard deviation in order to account for differences between sites.
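The standardization step is described only briefly, so the following is a minimal sketch of one plausible reading rather than the procedure actually used: each phenotype has its block-within-site mean subtracted and is then divided by the standard deviation of the raw values at its site. The record layout `(site, block, value)`, the function name `standardize_phenotypes` and the example numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardize_phenotypes(records):
    """Return standardized phenotypes in the same order as `records`.

    records: list of (site, block, value) tuples (hypothetical layout).
    Assumption: remove the block-within-site mean, then scale by the
    sample standard deviation of the raw values observed at that site.
    """
    by_block = defaultdict(list)
    by_site = defaultdict(list)
    for site, block, value in records:
        by_block[(site, block)].append(value)
        by_site[site].append(value)

    block_mean = {key: mean(vals) for key, vals in by_block.items()}
    site_sd = {site: stdev(vals) for site, vals in by_site.items()}

    return [
        (value - block_mean[(site, block)]) / site_sd[site]
        for site, block, value in records
    ]


# Made-up measurements: two sites, two blocks each, very different scales.
records = [
    ("site1", "b1", 10.0), ("site1", "b1", 12.0),
    ("site1", "b2", 20.0), ("site1", "b2", 22.0),
    ("site2", "b1", 100.0), ("site2", "b1", 140.0),
    ("site2", "b2", 300.0), ("site2", "b2", 340.0),
]
z = standardize_phenotypes(records)
```

After this step the values within each block sum to zero and the two sites are on a comparable scale, which is the stated purpose of the standardization before fitting a combined-site model.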
The genomic-estimated breeding value (GEBV) of each tree was estimated for all phenotypic traits using SNP marker information whose effect was estimated with the linear mixed model:

$$ y = X\beta + Z\alpha + e \qquad (2) $$

where y is the vector of standardized phenotypes, β is a vector of fixed effects including an overall mean, α is the vector of random marker effects, e is the random error, and X and Z are incidence matrices. The latter was built from the number of alleles observed for each individual and SNP, coded as 0, 1 and 2 representing the number of copies of the minor allele. Approximately 1.8% of the total number of genotypes was missing and imputed as the mean of the respective marker rounded to the next genotype value. Given that only biallelic markers were retained for the analyses, values of $+0.5a_j$ and $-0.5a_j$ were arbitrarily assigned to alleles 1 and 2, respectively, which follows the conventional parametrization where the difference between the two homozygotes equals $2a_j$ [55]. Marker effects were assumed to follow a normal distribution $N(0, I\sigma_a^2)$, where I is the identity matrix. The model hence assumes a common variance, and all marker coefficients are shrunk to the same extent, which is commonly called ridge regression. The approach was deemed appropriate for similar traits evaluated in white spruce and presumably controlled by many genes of small effect [25], and was hence retained for all analyses in this study. GEBVs were estimated as

$$ \hat{g}_i = \sum_{j=1}^{n} Z'_{ij}\,\hat{a}_j \qquad (3) $$

where $Z'_{ij}$ is the indicator covariate (−1, 0 or 1) for the ith tree at the jth locus and $\hat{a}_j$ is the estimated effect at the jth locus.

The GS3 software uses the Gauss-Seidel algorithm with residual update for best linear unbiased prediction (BLUP), and for best linear unbiased estimation of random and fixed effects respectively [55]. The algorithm is extended by Gibbs sampling for estimation of variance components. The Gibbs sampler was run for 100,000 iterations with a burn-in of 20,000 iterations. Every thousand iterations, a sample was retained and convergence of the posterior distribution was verified using trace plots. Flat priors were found to give the most stable results of convergence for the various models and subsamples tested.

Model validation and estimation of accuracy

Tenfold cross-validation was performed for pedigree-based and marker-based models in order to obtain predicted breeding values for both approaches. For each round of cross-validation, 10% of the trees were randomly drawn from each family and set aside for validation, the remainder being used for model training. Each individual was thus included in one validation set. Model quality was evaluated by the accuracy, r(GEBV, EBV), which is defined here as the correlation between the cross-validated genomic-predicted breeding values and the “true” or reference breeding values previously calculated using pedigree information and all available sampled trees. The predictive ability, r(y, ŷ), was similarly evaluated as the correlation between predicted and actual phenotypes.

Testing site specificity

Different GS scenarios were also tested to investigate the influence of site on model quality: the basic model contained all 734 trees from the two sites combined (see above); another set of models was trained and validated with only data from a single site, and the last models were trained with data from one site and validated with data from the second site.

Testing the effect of relatedness on accuracy estimates

In order to address questions related to the influence of relatedness and long-range linkage disequilibrium (LD) on genomic selection accuracy, we investigated the effect of family structure. Training and validation sets were genetically related in the basic models, i.e. different trees, but originating from the same full-sib families. For comparison purposes, a genomic selection scenario with half-sib families was also tested where the training and validation sets shared female progenitors, but from different families, hence implicating different males, or vice versa. Estimates of model accuracy were also obtained by using completely unrelated trees from families and provenances excluded from model training.

Testing subsets of markers

To investigate the effect of LD, and that of various subsets of markers on GS model accuracy, different subsets were delineated and used to build models. Hence, genomic selection models were constructed with subsets of 1,000, 500 and 250 markers randomly drawn from the 4,993 SNPs available, as well as with a set of 250 markers having the largest absolute effects and a set containing the 4,743 remaining markers. Accuracy and predictive ability of these reduced models were then compared with those of the basic model. GS models were also constructed separately with markers identified for each of the 12 black spruce linkage groups, in order to investigate the contribution of various genomic regions to the genetic control of the different traits and the influence of LD on GS model accuracy. Given the very high synteny and collinearity between the white spruce and black spruce genomes [51], we used the more complete linkage mapping information of white spruce gene homologs to approximate the genomic positions of black spruce genes carrying the SNPs considered herein. Of the 4,993 black spruce gene contigs of the present study (with gene nomenclature according to the white spruce gene catalogue of Rigault et al. [49] as described in Pavy et al. [39]), 2,928 genes had homologs mapped on the most recent white spruce reference genetic map containing nearly 9,000 genes [48].
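The marker subsets described in this section (random draws of 1,000, 500 and 250 SNPs, the 250 markers with the largest absolute effects, and the 4,743 remaining markers) can be delineated as in the sketch below. It is illustrative only: the function name `marker_subsets`, the fixed seeds and the simulated effects are assumptions, and in practice the effects would come from the fitted basic model, with each subset then used to refit its own model.

```python
import random


def marker_subsets(effects, sizes=(1000, 500, 250), top_n=250, seed=0):
    """Return marker index lists for each subset scenario.

    effects: estimated marker effects from the full model, one per SNP.
    The seed is only there to make the random draws reproducible.
    """
    rng = random.Random(seed)
    all_idx = list(range(len(effects)))

    # Random subsets of decreasing size, drawn without replacement.
    subsets = {
        f"random_{k}": sorted(rng.sample(all_idx, k))
        for k in sizes if k <= len(all_idx)
    }

    # Rank markers by absolute effect, largest first.
    ranked = sorted(all_idx, key=lambda j: abs(effects[j]), reverse=True)
    subsets[f"top_{top_n}"] = sorted(ranked[:top_n])
    subsets["remaining"] = sorted(ranked[top_n:])
    return subsets


# Simulated effects for 4,993 markers (placeholder values, not study data).
sim = random.Random(1)
effects = [sim.gauss(0.0, 1.0) for _ in range(4993)]
subsets = marker_subsets(effects)
```

Keeping the subsets as index lists means the same genotype matrix can be sliced column-wise for every scenario, so differences in accuracy can be attributed to the markers included rather than to differences in the trees sampled.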
These homologs were well distributed over the 12 linkage groups with, on average, 244 (± 15) gene homologs represented per chromosome.

Testing the size of training data set

Random subsets of individuals of various sizes were used to examine the minimum number of trees needed to build GS models without significantly losing accuracy. Starting with the complete sample set of 4,993 SNPs and 734 trees, about one third of the trees was iteratively removed, creating subsets of 490, 330, 224, 147 and 106 trees that were subsequently analysed with all the 4,993 SNPs.

Genetic gain estimations

Estimates of genetic gain from conventional selection or from GS were obtained by using predicted breeding values for each trait and by considering a selection intensity of 5%. Gains per year were obtained by assuming a spruce breeding cycle of a minimum of 28 years for conventional selection, and a shortened cycle of 9 years for GS, with 4 years for crosses and production of seedlots that are full-siblings to the training population, followed by 1 year for selection of individuals using markers and genomic selection models, and 4 years for vegetative propagation of selected individuals for seedling production. This last scenario thus assumes GS under a forward selection scheme, which is possible in black spruce [1].

Results

Genotypic and phenotypic information was gathered from a total of 734 25-year-old black spruce trees planted in two different forest environments, and representing an advanced-generation breeding population of full-sib families from crosses involving parents from nine Canadian provenances and one from Maine in the U.S.A. Population structure was found to be weak, in congruence with previous genetic studies relying on molecular markers and implicating populations from eastern Canada [52, 53, 57].
Indeed, spectral decomposition of the genomic relationship matrix [15, 25] according to the geographic distribution of origins revealed that about five percent of the variation was captured by the first eigenvector. We concluded that the genetic and genomic relationship matrices were sufficient to capture relatedness among individuals, and as such population structure was not considered in subsequent GS analyses.

High accuracy of GS models in combined-site analyses

When the two test sites were simultaneously considered, with all markers and independent trees for validation, the accuracy of the GS models was high for all traits assessed. Moreover, these accuracy estimates were similar to those of the polygenic models using pedigree and phenotypic information (Table 2). Differences in accuracy between the different phenotypic traits for both the GS and polygenic models were also minor. Larger differences were found for predictive ability, where the most accurate predictions were recorded for height growth. The accuracies largely mirrored the trends in individual trait heritability (Table 1), where high estimates of genetic control were observed for MFA (h2 = 0.74) and height growth (h2 = 0.68), and somewhat weaker values observed for DBH and wood density (h2 = 0.57 and h2 = 0.41, respectively).

In terms of genetic gains, the model containing trees from both sites resulted in the largest gain predictions (Table 2). The gain ratio, i.e. the gain from marker-based models relative to that from conventional pedigree-based selection, was high for all traits, with DBH showing the weakest ratio (below 90%). Gain ratios above 100% were obtained for wood traits, indicating that marker-based models would allow for better gains to be obtained than pedigree-based models. When considering the potential time saved for breeding based on GS, the annual gain ratios for marker-based models versus conventional selection increased considerably to around 3.
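The jump from a per-cycle gain ratio near 1 to an annual gain ratio near 3 follows directly from the cycle lengths assumed in the Methods (28 years for conventional selection, 4 + 1 + 4 = 9 years for GS). A minimal arithmetic sketch, in which the function names and the per-cycle gains of 1.0 and 0.9 are hypothetical placeholders rather than values from Table 2:

```python
def annual_gain(gain_per_cycle, cycle_years):
    """Genetic gain per year for a breeding cycle of the given length."""
    return gain_per_cycle / cycle_years


def annual_gain_ratio(gain_gs, gain_conv, cycle_gs=9, cycle_conv=28):
    """Ratio of annual gain under GS to annual gain under conventional selection."""
    return annual_gain(gain_gs, cycle_gs) / annual_gain(gain_conv, cycle_conv)


# GS cycle: crosses and seedlots (4 y) + marker-based selection (1 y)
# + vegetative propagation (4 y), as stated in the Methods.
GS_CYCLE_YEARS = 4 + 1 + 4

# Equal per-cycle gains: the ratio reduces to 28 / 9, about 3.1.
ratio_equal = annual_gain_ratio(1.0, 1.0, cycle_gs=GS_CYCLE_YEARS)

# A per-cycle gain ratio of 90% (placeholder) still gives about 2.8.
ratio_90 = annual_gain_ratio(0.9, 1.0, cycle_gs=GS_CYCLE_YEARS)
```

Because the cycle-length factor 28/9 is fixed by the assumed breeding schedule, the annual ratio stays close to 3 for any per-cycle gain ratio in the 90 to 100% range.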
This is because GS is conducted without field testing and thus avoids the delays associated with growing trees and assessing phenotypic trait variation from mature trees.

Between-site differences in genomic selection models in black spruce

In order to assess the extent of the genotype-by-environment interaction, and to evaluate if GS model accuracy is affected by sites, we constructed and validated GS models for each of the two different sites (Table 2). Accuracy estimates of models built using data from one site only were slightly lower than accuracies obtained with the combined-site analyses, especially for site 1. Accuracy estimates of models developed with data collected on one site and validated with data collected from the second site were high, and only marginally lower than estimates for validations carried out within the same site. These results indicate that genotype-by-environment interaction is low, and that GS models can most likely be applied over a large range of sites without the need for increased tree sampling for independent model construction, given that the two sites in the current study represented quite contrasting environmental conditions.

The relatively good overall accuracies obtained by applying models developed on one site to the second one do not mask important within-site environmental differences. Field records identified more trait variation and vegetation competition on site 1 (Matapedia, warmer site) compared with site 2 (Robidoux, colder site), which led to lower heritability estimates for site 1 (Table 1). Additionally, a smaller number of trees were available for site 1 compared to site 2 (333 and 401 trees analysed, respectively). Hence, GS models trained on site 2 and validated on site 1 or site 2 were marginally more accurate, which was especially true for wood density and diameter.
The same trend was observed for prediction models based on pedigree and phenotypic information.

Relatedness

Models built with a half-sib structure led to a large decrease in accuracy, predictability and genetic gain (Table 3). Although model validation was set up equally for all traits, accuracy varied much more among traits than when full-sib models were applied. The loss of model quality was most pronounced for MFA, where accuracy was only half. When cross-validation was conducted with completely unrelated progeny from different provenances and families (whether with full-sib or half-sib structure), model accuracy dropped virtually to zero and was associated with high error rates (results not shown).

Marker subsets inform on the nature of causative linkage disequilibrium

We also investigated the effect of reduced marker sets on the accuracy of GS models. The use of fewer markers would result in significant cost reductions for genotyping, which would have some impact on the overall cost of GS, but would also impact the extent of LD that is picked up by the prediction models. Decreasing the number of markers from almost 5,000 to 1,000 randomly sampled markers did not lead to a notable reduction in the accuracy of GS models (Fig. 1). However, using fewer than 500 markers led to an appreciable loss in accuracy for all four traits. Figure 1 shows the accuracy plots for several scenarios, where the loss of accuracy was mostly related to less scattered correlation plots due to a range reduction of genomic-estimated breeding values (GEBV).

The modelling of 250 markers that had previously been identified to contribute the largest absolute effects in the basic model led to accuracy estimates comparable to those of the basic model including all 4,993 markers (Fig. 1). These models also performed substantially better than models based on 250 random markers, which is again related to an increase of range in GEBV estimates (Fig. 1). However, the average MAF of largest-effect markers was also significantly higher (Student's t-test, P < 0.0001, for all traits) than the average MAF of the remaining 4,743 markers of lower effects. Markers with the largest effects most likely retraced family linkages better, given their higher MAF and thus, higher information value. Modelling the remaining 4,743 lower-effect markers led to accuracies marginally lower than estimates for the full model.

Table 1 Variance component estimates from genomic selection analyses, combined-site and single-site analyses. “Pedigree” indicates variance component estimation based on pedigree and phenotypic information, and “Markers” indicates SNP information from all 4,993 SNPs

Trait              Analysis        Model     σ2a [a]   VA [b]    σ2u [c]   σ2e [d]   h2i [e]
Wood density       Combined sites  Pedigree  -         -         597.73    870.01    0.41
                                   Markers   0.29      551.46    -         878.19    0.39
                   Site 1          Pedigree  -         -         973.94    1407.01   0.41
                                   Markers   0.50      957.04    -         1398.98   0.41
                   Site 2          Pedigree  -         -         653.68    216.32    0.75
                                   Markers   0.27      523.71    -         291.00    0.64
Diameter (DBH)     Combined sites  Pedigree  -         -         126.42    93.69     0.57
                                   Markers   2.9E-2    55.88     -         135.02    0.29
                   Site 1          Pedigree  -         -         123.22    129.34    0.49
                                   Markers   0.04      68.09     -         164.37    0.29
                   Site 2          Pedigree  -         -         143.65    54.22     0.73
                                   Markers   0.03      60.26     -         106.18    0.36
Height             Combined sites  Pedigree  -         -         3503.73   1613.54   0.68
                                   Markers   0.97      1851.53   -         2514.81   0.42
                   Site 1          Pedigree  -         -         3232.04   1623.81   0.67
                                   Markers   1.14      2182.23   -         2227.79   0.49
                   Site 2          Pedigree  -         -         4494.96   1222.76   0.79
                                   Markers   1.32      2530.77   -         2341.31   0.52
Microfibril angle  Combined sites  Pedigree  -         -         0.14      4.8E-2    0.74
                                   Markers   3.5E-5    6.7E-2    -         8.8E-2    0.43
                   Site 1          Pedigree  -         -         0.15      5.5E-2    0.73
                                   Markers   4.5E-5    8.6E-2    -         9.2E-2    0.48
                   Site 2          Pedigree  -         -         0.15      3.1E-2    0.83
                                   Markers   4.3E-5    8.2E-2    -         7.2E-2    0.53

[a] σ2a is the additive genetic variance explained by marker loci
[b] VA, the additive genetic variance based on markers, was estimated as $V_A = 2\sigma_a^2 \sum_{k} p_k q_k$ (4)
[c] σ2u is the polygenic variance estimated based on pedigree
[d] σ2e is the residual variance
[e] h2i is the individual trait heritability

Linkage groups have similar effects on GS model accuracies

In a further analysis, we investigated the effect of the genomic location of markers on GS models for the different traits. We therefore used markers of genes for which a set of 2928 homologs have been recently mapped to the white spruce genome [48], which has been previously shown to be highly syntenic and collinear to the black spruce genome [51]. GS analyses considering sets of markers delimited according to linkage groups showed some differences in accuracy estimates from one chromosome to another, but these differences were not statistically significant (Fig. 2). At this marker resolution, we were thus unable to identify particular linkage groups that would control more for a specific trait. Overall, the accuracies obtained for

Table 2 Genomic selection analyses based on all marker information
application. “Pedigree” indicates that only pedigree information was u
the conventional breeding approach in the body of the text; “Markers
from cross-validations using 10 replicates on randomly-selected trees
models. The first scheme considered all 734 trees from the two sites c
whether applied on the same or on the other site, respectively. For co
5 models run on 359 trees corresponding to the mean number of trees
validated estimated breeding value (using independent sets of tr
ability is the correlation between the predicted and the actual p
percentages are given in brackets. Gain estimates are based on p
gain estimates were based on assumptions of a conventional breeding
shortened cycle length of 9 years for selection with markers (“Markers”
full-siblings to the training population, followed by 1 year for selection
4 years for vegetative propagation of selected individuals for seedling

Trait                  GS scenario     Accuracy (error)           Predictive ability (er
                                       Pedigree      Markers      Pedigree      Marke
Wood density           Combined sites  0.89 (0.03)   0.84 (0.02)  0.45 (0.09)   0.49 (
                       Site1           0.80 (0.08)   0.77 (0.06)  0.34 (0.16)   0.38 (
                       Site2           0.88 (0.05)   0.82 (0.10)  0.52 (0.15)   0.56 (
                       Site_mean       0.85 (0.05)   0.81 (0.05)  0.42 (0.18)   0.43 (
                       Site1→2         0.79 (0.05)   0.77 (0.08)  0.42 (0.16)   0.44 (
                       Site2→1         0.85 (0.04)   0.80 (0.08)  0.36 (0.13)   0.39 (
Height                 Combined sites  0.88 (0.03)   0.86 (0.03)  0.56 (0.08)   0.57 (
                       Site1           0.84 (0.05)   0.81 (0.04)  0.51 (0.11)   0.51 (
                       Site2           0.88 (0.02)   0.85 (0.03)  0.58 (0.09)   0.58 (
                       Site_mean       0.85 (0.05)   0.83 (0.05)  0.55 (0.12)   0.55 (
                       Site1→2         0.84 (0.03)   0.83 (0.04)  0.56 (0.12)   0.56 (
                       Site2→1         0.85 (0.04)   0.84 (0.03)  0.51 (0.11)   0.52 (
Diameter (DBH)         Combined sites  0.86 (0.03)   0.83 (0.04)  0.45 (0.06)   0.43 (
                       Site1           0.76 (0.08)   0.74 (0.08)  0.38 (0.15)   0.34 (
                       Site2           0.87 (0.02)   0.82 (0.04)  0.53 (0.05)   0.48 (
                       Site_mean       0.82 (0.05)   0.79 (0.06)  0.45 (0.09)   0.42 (
                       Site1→2         0.76 (0.05)   0.75 (0.05)  0.43 (0.09)   0.43 (
                       Site2→1         0.80 (0.05)   0.78 (0.05)  0.33 (0.13)   0.34 (
Microfibril angle [c]  Combined sites  0.88 (0.03)   0.84 (0.04)  0.51 (0.11)   0.51 (
                       Site1           0.83 (0.06)   0.79 (0.04)  0.47 (0.15)   0.45 (
                       Site2           0.86 (0.04)   0.82 (0.04)  0.54 (0.16)   0.52 (
                       Site_mean       0.84 (0.06)   0.80 (0.05)  0.48 (0.15)   0.48 (
                       Site1→2         0.80 (0.03)   0.76 (0.04)  0.48 (0.13)   0.47 (
                       Site2→1         0.84 (0.04)   0.81 (0.04)  0.44 (0.13)   0.45 (

[a] Genetic gains are given in absolute values; units 
are kg/m3 for wood density,bM/P, markers to pedigree gain ratiocNegative genetic gain for MFA indicates trait improvement,993 SNPs) and following different schemes of model building andd for prediction after model calibrations, which is also referred to asndicates that SNP information was used for prediction. All results aret included in model fitting, but from the same families used to fitbined. Models in other schemes were trained on one site only andparative purposes, the “Site_mean” scheme represents the mean ofthe four traits were lower than those of models basedon all markers, but they were slightly higher than es-timates for models based on an equivalently reducednumber of markers but sampled randomly across theentire genome (see Fig. 1, random 250 SNPs). Theseresults suggest that relatedness is a likely strongcontributing factor to the high prediction accuracy ofGS models.per site. Model accuracy is the correlation between the cross-ees) and the “true” reference breeding value. The predictivehenotypes. Genetic gains are given in absolute values andredicted phenotypes and a selection intensity of 5%. 
Annualcycle length of 28 years for pedigree selection (“Pedigree”), and a), with 4 years for crosses and production of seedlots that areof individuals using markers and genomic selection models, andproductionror) Gaina (percent) GainratioGain per year Gain peryear ratiors Pedigree Markers M/Pb Pedigree Markers M/Pb0.07) 34.74 (0.08) 35.63 (0.09) 1.03 1.24 3.96 3.190.12) 35.90 (0.09) 31.87 (0.08) 0.89 1.28 3.54 2.770.12) 26.25 (0.06) 29.48 (0.07) 1.12 0.94 3.28 3.490.16) 30.15 (0.07) 28.55 (0.07) 0.95 1.08 3.17 2.940.19) 24.60 (0.06) 21.47 (0.05) 0.87 0.88 2.39 2.720.11) 40.02 (0.10) 45.46 (0.11) 1.14 1.43 5.05 3.530.07) 105.17 (0.13) 104.57 (0.13) 0.99 3.76 11.62 3.090.11) 53.95 (0.07) 52.17 (0.07) 0.97 1.93 5.8 3.010.12) 82.89 (0.10) 78.69 (0.10) 0.95 2.96 8.74 2.950.12) 87.00 (0.11) 84.56 (0.11) 0.97 3.11 9.4 3.020.13) 75.93 (0.09) 74.56 (0.09) 0.98 2.71 8.28 3.060.11) 63.67 (0.08) 62.31 (0.08) 0.98 2.27 6.92 3.050.07) 16.65 (0.14) 14.81 (0.13) 0.89 0.59 1.65 2.800.15) 12.09 (0.10) 10.15 (0.09) 0.84 0.43 1.13 2.630.08) 14.70 (0.13) 13.18 (0.12) 0.90 0.53 1.46 2.750.11) 13.44 (0.12) 11.84 (0.10) 0.88 0.48 1.32 2.750.10) 10.99 (0.10) 9.65 (0.08) 0.88 0.39 1.07 2.740.12) 16.09 (0.14) 13.75 (0.12) 0.85 0.57 1.53 2.680.10) −2.71 (−0.15) −2.78 (−0.15) 1.02 −0.10 −0.31 3.100.13) −2.47 (−0.12) −2.29 (−0.11) 0.92 −0.09 −0.25 2.780.15) −1.72 (−0.11) −1.80 (−0.12) 1.09 −0.06 −0.20 3.330.14) −1.84 (−0.11) −1.80 (−0.11) 1.00 −0.07 −0.20 2.860.12) −2.05 (−0.13) −1.95 (−0.12) 0.92 −0.07 −0.22 3.140.11) −2.14 (−0.10) −2.16 (−0.10) 1.01 −0.08 −0.24 3.00cm for height, mm for diameter and degrees for microfibril anglebgrebiagee linro(pre(0(0(0(−fLenz et al. BMC Genomics  (2017) 18:335 Page 9 of 17Sample sizes used to build genomic selection modelsThe effect of sample size used for GS model training wasinvestigated. Starting with the full set of 734 sampled treesfrom both sites, we randomly removed trees and evaluatedmodel quality for various subsets. 
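The random removal of trees described above can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names `nested_subsets` and `pearson`, the seed, the intermediate subset sizes (550 and 180) and the choice to draw each smaller subset from the previous one are all hypothetical. Only the full set of 734 trees and the 330-tree threshold come from the text.

```python
import random


def pearson(x, y):
    """Pearson correlation, the kind of statistic used as an accuracy
    measure (predicted versus reference breeding values)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def nested_subsets(tree_ids, sizes, seed=0):
    """Remove trees at random, step by step.

    Assumption: each smaller subset is drawn from the previous one,
    so the subsets are nested. The study may have drawn them differently.
    """
    rng = random.Random(seed)
    current = list(tree_ids)
    subsets = {}
    for size in sorted(sizes, reverse=True):
        current = rng.sample(current, size)
        subsets[size] = current
    return subsets


# 734 trees in total; the intermediate sizes are hypothetical.
subsets = nested_subsets(range(734), sizes=[734, 550, 330, 180])
for size, ids in subsets.items():
    print(size, len(ids))
```

In such a setup, a model would be refitted on each subset and `pearson` applied to its cross-validated predictions and the reference breeding values, giving one accuracy estimate per training-set size.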
As expected, accuracy of the GS models from the combined-site analysis generally decreased when fewer trees were used (Fig. 3). The reduction was much more important in marker-based models compared with pedigree-based models, leading to a decay in the marker-to-pedigree accuracy ratio. When considering less than a quarter of the trees initially present in the training set, model accuracy decreased by 50% on average. Sample sets with 330 or more trees showed comparable or only marginally inferior model accuracy compared with the full sample set, capturing existing LD and relatedness well. However, when training models with this number of trees, the accuracy of models was associated with larger errors for wood quality traits, especially for wood density where accuracy degraded faster and more irregularly in both marker-based and pedigree-based models.

Similarly, errors of predicted genetic gain based on the use of GS models increased for sample sets of 330 trees or fewer. This was especially true for tree height and wood density, where the average predicted genetic gain of marker-based models even increased slightly when models were based on low numbers of individuals (Fig. 4). Together with the important loss of accuracy for both traits (Fig. 3), this result highlights the value of a large training set leading to precise accuracy estimates from GS models in order to make well-grounded selection decisions.

Table 3 Accuracies of genomic selection models based on half-sib families using all 4,993 markers in a combined-site analysis, thus removing full-sib family linkage. “Pedigree” indicates that only pedigree information was used for predictions after model calibrations; “Markers” indicates that SNP information was used. The predictive ability is the correlation between the predicted and the actual phenotypes. Genetic gains are given in absolute values and percentages are given in brackets. Gain estimates are based on predicted phenotypes and a selection intensity of 5%. Annual gain estimates were based on assumptions of a conventional breeding cycle length of 28 years for pedigree selection (“Pedigree”), and a shortened cycle length of 9 years for selection with markers (“Markers”), with 4 years for crosses and production of seedlots that are full-siblings to the training population, followed by 1 year for selection of individuals using markers and genomic selection models, and 4 years for vegetative propagation of selected individuals for seedling production

Trait                 Accuracy (error)          Predictive ability (error)  Gain (a) (percent)                Gain ratio  Gain per year     Gain per year ratio
                      Pedigree     Markers      Pedigree      Markers       Pedigree         Markers          M/P (b)     Pedigree Markers  M/P (b)
Wood density          0.77 (0.14)  0.65 (0.18)  0.38 (0.16)   0.37 (0.15)   26.48 (0.06)     26.33 (0.06)     0.99        0.95     2.93     3.08
Height                0.80 (0.12)  0.76 (0.13)  0.50 (0.19)   0.49 (0.18)   78.59 (0.10)     76.70 (0.10)     0.98        2.81     8.52     3.03
Diameter (DBH)        0.64 (0.17)  0.63 (0.17)  0.30 (0.20)   0.31 (0.21)   14.41 (0.13)     12.19 (0.11)     0.85        0.51     1.35     2.65
Microfibril angle (c) 0.40 (0.41)  0.42 (0.28)  0.18 (0.32)   0.23 (0.25)   −2.01 (−0.11)    −1.97 (−0.11)    1.00        −0.07    −0.22    3.14

(a) Genetic gains are given in absolute values; units are kg/m3 for wood density, cm for height, mm for diameter and degrees for microfibril angle
(b) M/P, markers to pedigree gain ratio
(c) Negative genetic gain for MFA indicates trait improvement

Fig. 1 Accuracy of genomic selection models with reduced marker density. Accuracy estimates for subsets of markers are shown by correlations between the genomic-predicted breeding values (x-axis) and the true breeding values (y-axis) for tree height, diameter at breast height (DBH), wood density, and microfibril angle. Associated errors of accuracy estimates are presented in brackets. On the y-axis of the fifth row of plots, largest SNPs indicate SNPs with largest absolute effects. On the y-axis of the sixth row of plots, remaining SNPs indicate the subset of SNPs without those 250 SNPs with largest absolute effects

Fig. 2 Effect of linkage group on accuracy of genomic selection models. Accuracy (black circles) and associated errors for models based on markers pertaining to the same individual linkage group. Dashed grey lines indicate the means of accuracy estimates of the 12 linkage groups; long-dashed black lines are the accuracies of models combining all markers of known map positions (2928 markers, see Methods) and spanning the entire genome

Fig. 3 Accuracy of genomic selection models obtained using subsets of trees. Accuracy estimates for pedigree-based models (light grey line) and marker-based models (dark line), as well as their ratio (histograms). Estimates are given with their associated standard errors

Fig. 4 Predicted genetic gain using subsets of trees to build pedigree-based models (light grey line) and marker-based models (dark line), and the corresponding coefficient of variation (error bars). The ratio of marker- to pedigree-predicted genetic gain is presented by histograms. Gain estimates are based on predicted phenotypes and a selection intensity of 5%

Discussion

Accuracy of genomic selection models with complete information

This study clearly shows that medium-dense marker panels with several thousand markers well distributed over the genome can be effectively used in GS to predict additive breeding values in advanced-generation tree breeding programs, echoing previous results in similar settings [21, 24, 25, 33]. The accuracy of the GS models obtained with the present black spruce breeding population was high for both growth and wood quality traits, reaching values of approximately 0.8 when using several hundred trees to build the GS models. Accuracy estimates from GS models were comparable with their pedigree-based counterparts, and were superior to results obtained in white spruce for a population of full-sib families [25]. These findings suggest that the current marker panel was marginally more efficient than the pedigree information in retracing family linkages (see below), especially if some errors affected pedigree information. These results lead to the conclusion that GS can efficiently be applied for this boreal conifer species in advanced-breeding programs and highly structured populations of full-sib families, resulting in much higher gains per year than conventional selection. Moreover, applying a forward GS scheme at an early stage appears possible in black spruce, given that it inherently displays a high propensity for vegetative propagation at an early age, as seen for other spruces [1].

The present accuracy estimates were somewhat higher than estimates previously obtained for similar traits in full-sib families of white spruce of similar age in a comparable study [25], and they were considerably higher than accuracy estimates obtained for loblolly pine [22]. Following the simulation results of Grattapaglia and Resende [32] and the parameters of the present study, one would expect accuracy estimates ranging between 0.7 and 0.8 for genomic models relying on 2 to 3 markers per cM, with an effective population size close to 30, as well as a training population set somewhat below 1,000 individuals. The present estimates are on the upper limit or surpass these expectations, likely because of the high heritabilities observed in the present black spruce field trial for these traits.

Overall, the accuracy estimates only showed minor differences among traits. Model quality for tree diameter was somewhat lower than that for tree height, which is congruent with earlier reports [24, 25]. This is most likely related to the higher heritability of tree height compared to diameter, as often noted in conifers [25, 27, 58].

Genotype-by-environment interaction

The genotype-by-environment interaction was low. Models calibrated on one site led to good predictions of GEBVs on the other site, indicating low genotype-by-environment interactions in this test despite the contrasting site conditions and the large geographic distribution of parental trees used to produce the full-sib families. Low genotype-by-environment interactions were also previously reported from provenance-progeny tests replicated on multiple sites in Québec [59, 60]. Similar observations of good transfer of model accuracy among distant sites from two large breeding zones in Québec were also reported for white spruce for both half-sib and full-sib GS models [25, 26]. These results contrast with reports on hybrid spruce from western Canada [27] and loblolly pine from the southeastern United States [33], where the need to recalibrate GS models in each breeding zone was shown.

Boreal spruces in eastern Canada, such as black and white spruces, are reforested on a geographically restricted land base compared to the extent of their natural distribution, at the southern edge of their natural range where the commercial forest is mostly located. Based on provenance-progeny tests targeting these reforestation areas, reduced genotype-by-environment interaction was noted and the reforestation sites have been split into only a few large breeding zones [40, 60]. Furthermore, little phylogeographic structure has been reported in the province of Québec and its vicinity for black spruce [52, 53, 61, 62] or white spruce [63], indicating a homogenous historical genetic background. Little among-population differentiation has also been observed with various molecular markers, indicating limited population structuring and reflecting the recent post-glacial recolonization in eastern Canada [52, 53, 57, 64].
Furthermore, the parental trees employed in the present black spruce advanced-breeding populations were first-generation superior trees selected in provenance tests assembling multiple provenances, well scattered geographically beyond single breeding zones and performing well on multiple sites. Therefore, their genetic background may have been indirectly selected toward generalists bearing a plastic adaptability to local conditions. This study shows that models can be applied to different sites or be built by pooling data from different sites without significantly compromising accuracy. From a practical point of view, it also means that fewer trees are necessary to train and obtain models with good accuracy, thus reducing phenotyping and genotyping costs for GS model development.

The effect of size of model training set on genomic selection accuracy

The size of the dataset employed to train models has previously been shown to have a large effect on the accuracy of GS models [32]. In advanced-breeding populations with small effective population size, we hypothesized that a smaller number of samples per progeny should be sufficient to obtain good model accuracy. Using resampling, Perron et al. [65] reported that optimal subgroups should include 6 to 8 trees per family and site in order to precisely estimate genetic parameters for wood density and growth in an open-pollinated black spruce test. Our current findings concur with those of Perron et al. [65], as a sizeable loss in accuracy was only observed when the training sets contained fewer than 330 trees, corresponding to a minimum of about 4 trees per site and full-sib family in the present context. One interesting observation was that marker-based models had a tendency to lose accuracy more quickly than their pedigree-based counterparts. From a practical point of view, a smaller number of trees necessary to train and obtain models with good accuracy will help reduce the genotyping and phenotyping costs for GS model development.

Genomic selection model accuracy and level of family structure

Genomic selection models constructed using data from a subset of half-sibs from the same test resulted in much lower and more variable accuracies. However, models built with half-sibs had higher accuracies than those from previous reports similarly derived from half-sib families of eastern white spruce or western hybrid spruce in Canada [26, 28]. The difference is likely due to the low effective population size of half-sibs in the present study, leading to a higher level of relatedness among trees in the training and validation sets compared to true open-pollinated families where a large number of mostly unrelated pollen donors intervene and greatly increase the effective population size [26].

The shift in accuracy and predicted genetic gain observed between using half-sib versus full-sib sets further highlights that GS is most efficiently applied in more structured populations where relatedness and LD are higher, which necessitates less genome coverage to attain high prediction accuracy. Similar observations were reported by Beaulieu et al. [25, 26] and Zapata-Valenzuela et al. [21]. Below, we discuss additional evidence to support this interpretation.

A limited role for short-range LD in genomic prediction

The main obstacles for the application of MAS in largely undomesticated populations of conifers are their large genome sizes, often exceeding 20 Gbp [66], and their low LD [35, 36], which in turn would require very high genome coverage in order to pick up short-range marker-QTL LD and make accurate predictions. This, besides the multigenic control of most relevant traits, was already identified as a drawback in association studies [5–7], where single markers explained only a low percentage of variance. In this context, the role of short-range marker-QTL LD in the accuracy of GS models obtained with moderate genome coverage appears negligible. In the present study, a density of approximately 2.7 markers/cM was used, which resulted in GS model accuracy roughly equivalent to that from pedigree-based models. The same trend was also observed in previous studies [25, 26]. GS model accuracy decreased significantly when using half-sibs instead of full-sibs, which is consistent with the trend seen from other studies displaying similar genome coverage [25–27]. Using only markers with large effects resulted in marginally better model accuracies compared to those obtained with the same number of markers randomly picked. However, we showed that this aspect is entangled with higher average MAF values for these markers. Also, the non-significant differences among model accuracies obtained with markers from different chromosomes likely indicate that limited short-range marker-QTL LD could be traced.
Altogether, these observations point to relatedness and an increasing size of unrecombined chromosome blocks as the main drivers of GS model accuracy. Relatedness and the ability to retrace family linkages should be seen as the key factors for the high accuracies obtained, given that restricting GS model building to markers from single chromosomes led to somewhat higher accuracies compared to models relying on an equivalent random sample of markers covering all 12 spruce chromosomes. This trend is further supported by the fact that increasing the number of markers used to build GS models tenfold (from 500 to 4,993 markers) only led to incremental, though useful, improvements in model accuracy.

These observations are further supported by the fact that when GS models were applied to trees from families not included during model building, little or no accuracy was obtained, further confirming the limited role of short-range LD in the high accuracies obtained when using full-sibs. These results are not surprising given that black spruce and most conifers are essentially undomesticated, outbred, wind-pollinated organisms with large effective population sizes, hence lacking population-wide LD. For instance, LD decays rather rapidly in natural populations of spruces and pines, usually well within gene limits and in many cases within a few hundred base pairs, which is a likely consequence of historically large effective population sizes [35, 36]. Consequently, our results and those of others (e.g. [22, 23]) suggest that genomic prediction may not be possible for unrelated individuals at current marker densities.

A corollary is that obtaining GS model accuracy beyond that of pedigree-based models would likely necessitate increasing genome coverage by a factor of at least 10 to 100 relative to that used herein, resulting in the use of a very large number of markers, likely in the hundreds of thousands.
Simulations and historical data in cattle led to the conclusion that 50k markers would allow for the capture of causal loci within breeds, but 300k markers would be needed for accurate prediction across breeds [67]. A similar trend is emerging for crop plants, with the preparation of genotyping arrays containing over half a million SNPs (e.g. McCouch C. et al., International Rice Consortium, in preparation). Furthermore, based on simulations, Grattapaglia and Resende [32] showed that increasing genome coverage tenfold from 1–2 markers/cM to 10–20 markers/cM only asymptotically improves GS model accuracy.

The fact that slightly higher GS model accuracy was obtained with models using only markers with largest effects (top 250 out of 4,993 markers), compared to models estimated with all markers, could imply that part of the short-range LD can be picked up by the GS models when a reduced number of large-effect markers is employed. However, there could be confounding factors, such as the a priori information value of markers. Indeed, a significantly higher average minor allele frequency (MAF) was observed for markers with largest effects. Such markers could track family linkages more effectively than random markers, especially when small numbers of markers are used. Also, the marginally higher accuracy achieved with markers located on a specific chromosome, compared to the same number of markers spread over the entire genome, indicates that a better tracking of family linkages is achieved when a small number of markers are located on the same chromosome.
Thus, this factor could also be potentially useful to reduce genotyping and GS costs in highly structured populations when the genome location of markers is known.

Conclusions

The results of the present study on black spruce indicate that, at least in the short term, GS holds substantial promise for efficient application in populations of small effective size, such as advanced-breeding populations. With the scale of marker densities usually employed (at a rate of a few markers per cM), the overall short-range LD between markers and causal loci is likely not sufficiently well captured to make GS efficient in largely unstructured natural populations of conifers or between unrelated breeding populations. In small and structured populations, we show that prediction accuracy is null when relatedness between training and testing data sets is absent. Marker densities of one or several orders of magnitude higher would likely be necessary to improve accuracy in such conditions. Thus, high relatedness between individuals appears to be a prerequisite to obtain highly accurate GEBVs.

From an applied point of view, lowering marker densities may be feasible without major loss in accuracy in order to reduce genotyping costs. Information relative to relatedness would be more precise when markers are located on a single chromosome instead of spreading an equivalent number of markers over the whole genome, and when using markers with the highest MAF. These approaches may be considered when a reduced genotyping assay is needed.

Based on the present results, the GS application with highest potential for spruce breeders would be to select with high accuracy superior individuals within a group of full-sib families. This would effectively increase the relatedness as well as the size of unrecombined chromosome blocks generated by controlled crossing in a small breeding population.
One obvious beneficial implementation of GS in such a context would be to repeat the controlled crosses that were used to build the GS models to generate much larger full-sib families and thus apply higher selection intensities and obtain larger genetic gains. Future studies also need to evaluate to what extent genomic selection models developed for the present generation could be applied to the next generation of progeny, as recombination should break up some of the established linkage [23]. However, given the good accuracy of GS models that we obtained when considering individuals sharing only one parent, we hypothesize that predictions in a following generation may be relatively accurate when sharing the same parental material. Also, because family linkages could be efficiently traced with genomic profiles, polycross strategies, where male pollen donors are mixed so as to reduce the cost of crosses, could likely be used without losing significantly on prediction accuracy. Such screening of larger families from polycross would make it possible to increase selection intensity and the ensuing genetic gain in a cost-effective fashion, especially for traits such as wood quality parameters or pest resistance, which are expensive and cumbersome to assess on large test populations. Thus, candidate individuals for selection would only have to be genotyped at the early seedling stage and those having the highest GEBVs predicted using the available GS models would be selected.

For species amenable to vegetative propagation and for organizations having access to somatic embryogenesis and/or rooted cutting facilities, as is the case in eastern Canada for spruces, individuals identified with GS at the early seedling stage could also be propagated and mass-produced for reforestation programs within only a few years [1].
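To make the time savings concrete, the per-year gains reported in Table 2 can be reproduced from the cycle lengths assumed there: 28 years for conventional pedigree selection, and 4 + 1 + 4 = 9 years for selection with markers. The short sketch below does this for wood density in the combined-site analysis. The function name `gain_per_year` and the variable names are ours and purely illustrative; the input values are taken from Table 2.

```python
# Cycle lengths assumed in Table 2 (years).
PEDIGREE_CYCLE_YEARS = 28
MARKER_CYCLE_YEARS = 4 + 1 + 4  # crosses + marker-based selection + propagation


def gain_per_year(gain, cycle_years):
    """Genetic gain per breeding cycle divided by the cycle length."""
    return gain / cycle_years


# Wood density, combined sites (kg/m3), from Table 2.
pedigree_rate = gain_per_year(34.74, PEDIGREE_CYCLE_YEARS)
marker_rate = gain_per_year(35.63, MARKER_CYCLE_YEARS)
ratio = marker_rate / pedigree_rate

print(round(pedigree_rate, 2), round(marker_rate, 2), round(ratio, 2))
# prints: 1.24 3.96 3.19
```

The gains per cycle are nearly identical for the two approaches, so the roughly threefold advantage in gain per year comes almost entirely from the shorter cycle (28 / 9 ≈ 3.1).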
With such a forward selection scheme, the breeding and production cycles could be significantly reduced and gain per time unit would be multiplied by a factor of 3 (Table 2). Other schemes based on sexual reproduction could be deployed, such as top-grafting of selected progeny in previous generation seed orchards followed by polymix crossing, which would facilitate the production of genetically improved seeds with minimal delays. At the same time, by quite drastically shortening the breeding cycles for slow-growing species such as temperate and boreal conifers, the use of GS tools should result in more flexibility for tree breeders, which appears especially important in the context of rapid environmental changes and evolution of wood products markets.

Abbreviations

cM: Centimorgan; EBV: Estimated breeding value; GEBV: Genomic estimated breeding value; GS: Genomic selection; LD: Linkage disequilibrium; MAF: Minor allele frequency; MAS: Marker-assisted selection; MFA: Cellulose microfibril angle in the secondary cell wall; QTL: Quantitative trait locus; SNP: Single nucleotide polymorphism

Acknowledgements

The authors acknowledge the contribution of M. Villeneuve (formerly at Direction de la recherche forestière - DRF, Ministère des Forêts, de la Faune et des Parcs du Québec - MFFPQ) for assembling and supervising the initiation of the black spruce biparental test used in this study more than 25 years ago, and the help of G. Gagnon, G. Numainville, and numerous others (DRF-MFFPQ) for conducting crosses to establish the test and for field test sampling and measurements. We thank S. Blais and F. Gagnon (Canada Research Chair in Forest Genomics - CRCFG, Univ. Laval) for their help with the acquisition and validation of genotyping data, and A. Deschêne and N. Pavy (CRCFG, Univ. Laval) for the black spruce SNP resource and help in designing the genotyping chip. We also thank D. Vincent (Génome Québec Innovation Centre, McGill Univ., Montréal, Québec) and his team for performing the Infinium genotyping assay. We finally thank T. Doerksen (Canadian Wood Fibre Centre, now at BC Forests, Lands and Natural Resource Operations) for help with data validation at the onset of the project, as well as Simon Nadeau (Canadian Wood Fibre Centre, Natural Resources Canada) and two anonymous reviewers for their helpful comments.

Funding

This study was made possible through funding from the Fonds de recherche du Québec – Nature et technologie (FRQ-NT) for phenotyping, DNA sampling, SNP discovery, and salaries for lab assistant, bioinformatician and postdoc. The Québec Ministry of Forests, Wildlife and Parks (Ministère des Forêts, de la Faune et des Parcs du Québec) provided salary for maintenance of test sites and sampling. The Canadian Wood Fibre Centre (CWFC) contributed support for databasing computational environment and salary for data management and data analyses. Genome Canada, Génome Québec and Genome BC through the spruce genomics projects SMarTForests, FastTRAC and Spruce-Up provided funding for genotyping, publication costs, and postdoc salary for genetic mapping aspects.

Availability of data and materials

In order to comply with Intellectual Property Policies (IPP) of participating governmental institutions in this work, the supporting phenotyping data is not deposited into a public domain. The original data is stored in the institutions’ databases and may be shared upon request to the corresponding author according to our IPP. The SNP dataset is available on DRYAD entry doi:10.5061/dryad.tr87v [38].

Authors’ contribution

PL performed the bulk of GS modelling and drafted the manuscript. SC was responsible for data preparation and bioinformatics. SDM conducted the characterization of wood traits. MD helped conceive the study, provided access to field experiments and supervised field data collection as scientist responsible for the Québec black spruce breeding program.
JBe and JBo designed the study, assisted in drafting the manuscript, and obtained funding. All co-authors significantly contributed to the present study. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Consent for publication
Not applicable.

Ethics approval and consent to participate
The plant material analysed for this study comes from common garden experiments (plantations) that were established and maintained by the Province of Québec for breeding selections and research purposes. The provincial black spruce breeding program is overseen by M. Desponts, who is co-authoring this publication. Black spruce is a widespread boreal conifer species on the North American continent and no particular permission for sampling was required.

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author details
1 Canadian Wood Fibre Centre, Canadian Forest Service, Natural Resources Canada, Government of Canada, 1055 du PEPS, P.O. Box 10380, Québec, Québec G1V 4C7, Canada.
2 Canada Research Chair in Forest Genomics, Institute of Systems and Integrative Biology and Centre for Forest Research, Université Laval, 1030, Avenue de la Médecine, Québec, Québec G1V 0A6, Canada.
3 Department of Wood Science, Forest Sciences Centre, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada.
4 Ministère des Forêts, de la Faune et des Parcs, Gouvernement du Québec, Direction de la recherche forestière, 2700 rue Einstein, Québec, Québec G1P 3W8, Canada.

Received: 13 December 2016 Accepted: 21 April 2017

References
1. Park Y-S, Beaulieu J, Bousquet J. Multi-varietal forestry integrating genomic selection and somatic embryogenesis. In: Park YS, Bonga JM, Moon H-K, editors. Vegetative Propagation of Forest Trees. Seoul: National Institute of Forest Science (NiFos); 2016. p. 302–22.
2. White TL, Neale DB, Adams WT. Forest Genetics. Wallingford: CABI Publishing; 2007. p. 682.
3. Burdon RD, Wilcox PL. Integration of molecular markers in breeding. In: Plomion C, Bousquet J, Kole C, editors. Genetics, Genomics and Breeding of Conifers. New York: Edenbridge Science Publishers and CRC Press; 2011. p. 276–322.
4. Lande R, Thompson R. Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics. 1990;124:743–56.
5. Porth I, Klapšte J, Skyba O, Hannemann J, McKown AD, Guy RD, et al. Genome-wide association mapping for wood characteristics in Populus identifies an array of candidate single nucleotide polymorphisms. New Phytol. 2013;200:710–26.
6. Beaulieu J, Doerksen T, Boyle B, Clement S, Deslauriers M, Beauseigle S, et al. Association genetics of wood physical traits in the conifer white spruce and relationships with gene expression. Genetics. 2011;188:197–214.
7. Gonzalez-Martinez SC, Wheeler NC, Ersoz E, Nelson CD, Neale DB. Association genetics in Pinus taeda L. I. Wood property traits. Genetics. 2007;175:399–409.
8. Holliday JA, Ritland K, Aitken SN. Widespread, ecologically relevant genetic markers developed from association mapping of climate-related traits in Sitka spruce (Picea sitchensis). New Phytol. 2010;188:501–14.
9. Prunier J, Pelgas B, Gagnon F, Desponts M, Isabel N, Beaulieu J, et al. The genomic architecture and association genetics of adaptive characters using a candidate SNP approach in boreal black spruce. BMC Genomics. 2013;14:368.
10. Ritland K, Krutovsky KV, Tsumura Y, Pelgas B, Isabel N, Bousquet J. Genetic mapping in conifers. In: Plomion C, Bousquet J, Kole C, editors. Genetics, Genomics and Breeding of Conifers. New York: Edenbridge Science Publishers and CRC Press; 2011. p. 196–238.
11. Pelgas B, Bousquet J, Meirmans PG, Ritland K, Isabel N. QTL mapping in white spruce: gene maps and genomic regions underlying adaptive traits across pedigrees, years and environments. BMC Genomics. 2011;12:145.
12. Meuwissen T, Hayes B, Goddard M. Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001;157:1819–29.
13. Grattapaglia D, Plomion C, Kirst M, Sederoff RR. Genomics of growth traits in forest trees. Curr Opin Plant Biol. 2009;12:148–56.
14. Hayes B, Goddard M. Genome-wide association and genomic selection in animal breeding. Genome. 2010;53:876–83.
15. VanRaden P. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91:4414–23.
16. Legarra A, Robert-Granié C, Manfredi E, Elsen J-M. Performance of genomic selection in mice. Genetics. 2008;180:611–8.
17. Desta ZA, Ortiz R. Genomic selection: genome-wide prediction in plant improvement. Trends Plant Sci. 2014;19:592–601.
18. Jannink JL, Lorenz AJ, Iwata H. Genomic selection in plant breeding: from theory to practice. Brief Funct Genomics. 2011;9:166–77.
19. Heffner EL, Sorrells ME, Jannink J-L. Genomic selection for crop improvement. Crop Sci. 2009;49:1–12.
20. Resende MDV, Resende MFR, Sansaloni CP, Petroli CD, Missiaggia AA, Aguiar AM, et al. Genomic selection for growth and wood quality in Eucalyptus: capturing the missing heritability and accelerating breeding for complex traits in forest trees. New Phytol. 2012;194:116–28.
21. Zapata-Valenzuela J, Isik F, Maltecca C, Wegrzyn J, Neale D, McKeand S, et al. SNP markers trace familial linkages in a cloned population of Pinus taeda: prospects for genomic selection. Tree Genet Genomes. 2012;8:1307–18.
22. Resende Jr MFR, Muñoz P, Resende MDV, Garrick DJ, Fernando RL, Davis JM, et al. Accuracy of genomic selection methods in a standard dataset of Loblolly pine (Pinus taeda L.). Genetics. 2012;190:1503–10.
23. Bartholomé J, Van Heerwaarden J, Isik F, Boury C, Vidal M, Plomion C, et al. Performance of genomic prediction within and across generations in maritime pine. BMC Genomics. 2016;17:604.
24. Isik F, Bartholomé J, Farjat A, Chancerel E, Raffin A, Sanchez L, et al. Genomic selection in maritime pine. Plant Sci. 2016;242:108–19.
25. Beaulieu J, Doerksen TK, MacKay J, Rainville A, Bousquet J. Genomic selection accuracies within and between environments and small breeding groups in white spruce. BMC Genomics. 2014;15:1048.
26. Beaulieu J, Doerksen T, Clément S, MacKay J, Bousquet J. Accuracy of genomic selection models in a large population of open-pollinated families in white spruce. Heredity. 2014;113:342–52.
27. Gamal El-Dien O, Ratcliffe B, Klápště J, Chen C, Porth I, El-Kassaby YA. Prediction accuracies for growth and wood attributes of interior spruce in space using genotyping-by-sequencing. BMC Genomics. 2015;16:370.
28. Ratcliffe B, Gamal El-Dien O, Klápště J, Porth I, Chen C, Jaquish B, et al. A comparison of genomic selection models across time in interior spruce (Picea engelmannii × glauca) using unordered SNP imputation methods. Heredity. 2015;115:547–55.
29. De La Torre A, Birol I, Bousquet J, Ingvarsson P, Jansson S, Jones SJ, et al. Insights into conifer giga-genomes. Plant Physiol. 2015;166:1724–32.
30. Mullin TJ, Andersson B, Bastien J, Beaulieu J, Burdon R, Dvorak W, et al. Economic importance, breeding objectives and achievements. In: Plomion C, Bousquet J, Kole C, editors. Genetics, Genomics and Breeding of Conifers. New York: Edenbridge Science Publishers and CRC Press; 2011. p. 40–127.
31. Kremer A. Predictions of age-age correlations of total height based on serial correlations between height increments in Maritime pine (Pinus pinaster Ait.). Theor Appl Genet. 1992;85:152–8.
32. Grattapaglia D, Resende MDV. Genomic selection in forest tree breeding. Tree Genet Genomes. 2011;7:241–55.
33. Resende M, Munoz P, Acosta J, Peter G, Davis J, Grattapaglia D, et al. Accelerating the domestication of trees using genomic selection: accuracy of prediction models across ages and environments. New Phytol. 2012;193:617–24.
34. Isik F. Genomic selection in forest tree breeding: the concept and an outlook to the future. New For. 2014;45:379–401.
35. Pavy N, Namroud M, Gagnon F, Isabel N, Bousquet J. The heterogeneous levels of linkage disequilibrium in white spruce genes and comparative analysis with other conifers. Heredity. 2012;108:273–84.
36. Neale DB, Savolainen O. Association genetics of complex traits in conifers. Trends Plant Sci. 2004;9:325–30.
37. Perry DJ, Bousquet J. Genetic diversity and mating system of post-fire and post-harvest black spruce: an investigation using codominant sequence-tagged-site (STS) markers. Can J For Res. 2001;31:32–40.
38. Hill WG. Estimation of effective population size from data on linkage disequilibrium. Genet Res. 1981;38:209–16.
39. Pavy N, Gagnon F, Deschênes A, Boyle B, Beaulieu J, Bousquet J. Development of highly reliable in silico SNP resource and genotyping assay from exome capture and sequencing: an example from black spruce (Picea mariana). Mol Ecol Resour. 2016;16:588–98.
40. Beaulieu J, Corriveau A, Daoust G. Phenotypic stability and delineation of black spruce breeding zones in Québec: Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Centre, Information Report Lau-X-85E; 1989.
41. Pavy N, Gagnon F, Rigault P, Blais S, Deschênes A, Boyle B, et al. Development of high-density SNP genotyping arrays for white spruce (Picea glauca) and transferability to subtropical and nordic congeners. Mol Ecol Resour. 2013;13:324–36.
42. Ukrainetz NK, Kang KY, Aitken SN, Stoehr M, Mansfield SD. Heritability and phenotypic and genetic correlations of coastal Douglas-fir (Pseudotsuga menziesii) wood quality traits. Can J For Res. 2008;38:1536–46.
43. Jackman SD, Warren RL, Gibb EA, Vandervalk BP, Mohamadi H, Chu J, et al. Organellar genomes of white spruce (Picea glauca): assembly and annotation. Genome Biol Evol. 2016;8:29–41.
44. Warren RL, Keeling CI, Yuen MMS, Raymond A, Taylor GA, Vandervalk BP, et al. Improved white spruce (Picea glauca) genome assemblies and annotation of large gene families of conifer terpenoid and phenolic defense metabolism. Plant J. 2015;83:189–212.
45. Birol I, Raymond A, Jackman SD, Pleasance S, Coope R, Taylor GA, et al. Assembling the 20 Gb white spruce (Picea glauca) genome from whole-genome shotgun sequencing data. Bioinformatics. 2013;29:1492–7.
46. Nystedt B, Street NR, Wetterbom A, Zuccolo A, Lin Y-C, Scofield DG, et al. The Norway spruce genome sequence and conifer genome evolution. Nature. 2013;497:579–84.
47. Pavy N, Pelgas B, Laroche J, Rigault P, Isabel N, Bousquet J. A spruce gene map infers ancient plant genome reshuffling and subsequent slow evolution in the gymnosperm lineage leading to extant conifers. BMC Biol. 2012;10:84.
48. Pavy N, Lamothe M, Pelgas B, Gagnon F, Birol I, Bohlmann J, et al. A high-resolution reference genetic map positioning 8.8 K genes for the conifer white spruce: structural genomics implications and correspondence with physical distance. Plant J. 2017;90:189–203.
49. Rigault P, Boyle B, Lepage P, Cooke JEK, Bousquet J, MacKay JJ. A white spruce gene catalogue for conifer genome analyses. Plant Physiol. 2011;157:14–28.
50. Pavy N, Deschênes A, Blais S, Lavigne P, Beaulieu J, Isabel N, et al. The landscape of nucleotide polymorphism among 13,500 genes of the conifer Picea glauca, relationships with functions, and comparison with Medicago truncatula. Genome Biol Evol. 2013;5:1910–25.
51. Pavy N, Pelgas B, Beauseigle S, Blais S, Gagnon F, Gosselin I, et al. Enhancing genetic mapping of complex genomes through the design of highly-multiplexed SNP arrays: application to the large and unsequenced genomes of white spruce and black spruce. BMC Genomics. 2008;9:21.
52. Prunier J, Laroche J, Beaulieu J, Bousquet J. Scanning the genome for gene SNPs related to climate adaptation and estimating selection at the molecular level in boreal black spruce. Mol Ecol. 2011;20:1702–16.
53. Prunier J, Gerardi S, Laroche J, Beaulieu J, Bousquet J. Parallel and lineage-specific molecular adaptation to climate in boreal black spruce. Mol Ecol. 2012;21:4270–86.
54. Pelgas B, Bousquet J, Beauseigle S, Isabel N. A composite linkage map from two crosses for the species complex Picea mariana × Picea rubens and analysis of synteny with other Pinaceae. Theor Appl Genet. 2005;111:1466–88.
55. Legarra A, Misztal I. Technical note: computing strategies in genome-wide selection. J Dairy Sci. 2008;91:360–6.
56. Legarra A. GS3 web folder. [cited 2016 2016-07-31]; Available from: http://genoweb.toulouse.inra.fr/~alegarra/gs3_folder/.
57. Isabel N, Beaulieu J, Bousquet J. Complete congruence between gene diversity estimates derived from genotypic data at enzyme and random amplified polymorphic DNA loci in black spruce. Proc Natl Acad Sci U S A. 1995;92(14):6369–73.
58. Lenz P, Auty D, Achim A, Beaulieu J, Mackay J. Genetic improvement of white spruce mechanical wood traits: early screening by means of acoustic velocity. Forests. 2013;4:575–94.
59. Beaulieu J, Perron M, Bousquet J. Multivariate patterns of adaptive genetic variation and seed source transfer in Picea mariana. Can J For Res. 2004;34:531–45.
60. Li P, Beaulieu J, Bousquet J. Genetic structure and patterns of genetic variation among populations in eastern white spruce (Picea glauca). Can J For Res. 1997;27:189–98.
61. Jaramillo-Correa JP, Beaulieu J, Bousquet J. Variation in mitochondrial DNA reveals multiple distant glacial refugia in black spruce (Picea mariana), a transcontinental North American conifer. Mol Ecol. 2004;13:2735–47.
62. Gérardi S, Jaramillo-Correa JP, Beaulieu J, Bousquet J. From glacial refugia to modern populations: new assemblages of organelle genomes generated by differential cytoplasmic gene flow in transcontinental black spruce. Mol Ecol. 2010;19:5265–80.
63. De Lafontaine G, Turgeon J, Payette S. Phylogeography of white spruce (Picea glauca) in eastern North America reveals contrasting ecological trajectories. J Biogeogr. 2010;37:741–51.
64. Jaramillo-Correa JP, Beaulieu J, Bousquet J. Contrasting evolutionary forces driving population structure at expressed sequence tag polymorphisms, allozymes and quantitative traits in white spruce. Mol Ecol. 2001;10:2729–40.
65. Perron M, DeBlois J, Desponts M. Use of resampling to assess optimal subgroup composition for estimating genetic parameters from progeny trials. Tree Genet Genomes. 2013;9:129–43.
66. Murray BG. Nuclear DNA amounts in gymnosperms. Ann Bot. 1998;82(suppl A):3–15.
67. De Roos A, Hayes BJ, Spelman R, Goddard ME. Linkage disequilibrium and persistence of phase in Holstein–Friesian, Jersey and Angus cattle. Genetics. 2008;179:1503–12.

