UBC Faculty Research and Publications

Bioinformatically predicted deleterious mutations reveal complementation in the interior spruce hybrid… Conte, Gina L; Hodgins, Kathryn A; Yeaman, Sam; Degner, Jon C; Aitken, Sally N; Rieseberg, Loren H; Whitlock, Michael C Dec 15, 2017

RESEARCH ARTICLE Open Access

Bioinformatically predicted deleterious mutations reveal complementation in the interior spruce hybrid complex

Gina L. Conte1,2*, Kathryn A. Hodgins1,4, Sam Yeaman1,5, Jon C. Degner1, Sally N. Aitken1, Loren H. Rieseberg2 and Michael C. Whitlock3

Abstract

Background: Mutation load is expected to be reduced in hybrids via complementation of deleterious alleles. While local adaptation of hybrids confounds phenotypic tests for reduced mutation load, it may be possible to assess variation in load by analyzing the distribution of putatively deleterious alleles. Here, we use this approach in the interior spruce (Picea glauca x P. engelmannii) hybrid complex, a group likely to suffer from high mutation load and in which hybrids exhibit local adaptation to intermediate conditions. We used PROVEAN to bioinformatically predict whether non-synonymous alleles are deleterious, based on conservation of the position and abnormality of the amino acid change.

Results: As expected, we found that predicted deleterious alleles were at lower average allele frequencies than alleles not predicted to be deleterious. We were unable to detect a phenotypic effect on juvenile growth rate of the many rare alleles predicted to be deleterious. Both the proportion of alleles predicted to be deleterious and the proportion of loci homozygous for predicted deleterious alleles were higher in P. engelmannii (Engelmann spruce) than in P. glauca (white spruce), due to higher diversity and frequencies of rare alleles in Engelmann. Relative to parental species, the proportion of alleles predicted to be deleterious was intermediate in hybrids, and the proportion of loci homozygous for predicted deleterious alleles was lowest.

Conclusion: Given that most deleterious alleles are recessive, this suggests that mutation load is reduced in hybrids due to complementation of deleterious alleles.
This effect may enhance the fitness of hybrids.

Keywords: Deleterious mutations, Mutation load, Complementation, Hybridization, Population genomics, Conifers, Spruce

* Correspondence: conte@zoology.ubc.ca
1 Department of Forest and Conservation Sciences, University of British Columbia, 3041-2424 Main Mall, Vancouver, BC V6T 1Z4, Canada
2 Department of Botany, University of British Columbia, 3200-6270 University Blvd, Vancouver, BC V6T 1Z4, Canada
Full list of author information is available at the end of the article

© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Conte et al. BMC Genomics (2017) 18:970 DOI 10.1186/s12864-017-4344-8

Background

Mutation load is the reduction in fitness caused by deleterious alleles segregating at mutation-selection balance in populations [1, 2]. Decreased mutation load in hybrids may increase their fitness relative to parents [3–5]. Because mutations are random and deleterious alleles may rise in frequency through genetic drift, mutation load in partially reproductively isolated groups is likely to result from largely distinct sets of alleles [6]. Therefore, relative to parents, mutation load due to additive deleterious alleles should be intermediate in hybrids due to their intermediate number of deleterious alleles. However, most deleterious alleles are thought to be at least partially recessive [2, 7–9], and mutation load due to recessive deleterious alleles should be lower in hybrids than in parental species due to their lower homozygosity of deleterious alleles. This latter effect is known as complementation and is the mechanism underlying the dominance hypothesis of heterosis [3, 4].

Reduction of mutation load in hybrids may commonly contribute to hybrid zone dynamics. However, the possibility that hybrids are also locally adapted to environmental conditions in contact zones confounds our ability to phenotypically detect reduced mutation load. In the case of bounded hybrid superiority, hybrids are predicted to be more fit than parents in their own environment and less fit in parental environments, a clear signal of local adaptation [10]. However, since we have no theoretical expectation for the precise differences in fitness between hybrids and parents in each environment that are caused by local adaptation, it is difficult to tell whether reduced mutation load has an additional effect on hybrid fitness, which should enhance their fitness across all environments.

Here we introduce an alternative approach to assess whether mutation load is reduced in hybrid zones, which identifies bioinformatically predicted deleterious alleles that are segregating in populations and then compares properties of these alleles in hybrid and non-hybrid individuals. Traditionally, identifying the alleles underlying mutation load in natural populations has been very difficult. Most deleterious alleles contributing to mutation load tend to be of small effect and are kept at low frequency by purifying selection [2, 11], making their individual effects on fitness or phenotype too small to detect with reasonable sample sizes. With the diversity of genomic data available today, it is now possible to use protein conservation to predict whether nonsynonymous alleles are likely to be deleterious.
Generally, alleles are thought to be more deleterious when they involve either a nonsynonymous change in a phylogenetically conserved position or a change to a substantially different amino acid (inferred based on substitution probabilities or biochemical qualities). Several methods have been developed that implement this approach, including GERP [12], MAPP [13], SIFT [14], PolyPhen-2 [15] and PROVEAN [16]. While not all alleles predicted to be deleterious by these methods will actually be so, and while some deleterious alleles may be missed, tests of functionally verified deleterious alleles have typically found both specificity and sensitivity to be between 70 and 90% [15–17] (though specificity was lower in [18]; see the Discussion regarding accuracy of specificity estimates). Furthermore, a handful of studies have now applied these methods to genomic datasets, finding that alleles predicted to be deleterious are at lower average allele frequencies than those predicted to not be deleterious, consistent with the expected effect of purifying selection [17, 19–25]. By providing insight into the genetic basis of mutation load, these methods offer a means for studying load in a comparative manner. Thus far, few studies employing these methods at a genome-wide scale have focused on natural populations and fewer still have investigated natural hybrid zones.

The interior spruce hybrid complex is an ideal natural system to investigate the patterns of mutation load in hybridizing species. The complex is composed of Picea glauca (Moench) Voss (white spruce), Picea engelmannii Parry ex Engelm. (Engelmann spruce), and their hybrids. White spruce is a boreal species with a widespread continuous range across Canada and Alaska [26]. Rapid postglacial northwestern spread in the western interior of Canada has led to low chloroplast DNA diversity in white spruce populations west of the Great Lakes and east of Alaska [26, 27].
Engelmann spruce is a sub-alpine species with a fragmented range in western North America [28]. Populations in western Canada are the result of northward post-glacial expansion [29]. Contrary to expectations, isozyme diversity of Engelmann spruce increases with latitude, apparently due to hybridization with white spruce [29]. The two species hybridize extensively where their ranges overlap in western Canada. Hybrids occupy intermediate ecological and elevational niches [30], and they appear to be fitter than parental species in these intermediate habitats, supporting the bounded superiority model of hybrid zone maintenance [31]. We predict that these hybrids should also benefit from decreased mutation load. However, the observed local adaptation of hybrids to intermediate environments hampers our ability to determine this at the phenotypic level. Instead, we can test for a signature of reduced mutation load in hybrids at the genetic level.

Conifers, in general, are known to suffer from high mutation load [32–34], and a reduction in load via hybridization may provide a substantial fitness benefit. Previous estimates of mutation load in conifers have been derived by comparing viable seed numbers per cone from self-pollination with those from unrelated crossings. The number of lethal equivalents per zygote is estimated using 2B = −4 ln R, where B is the average number of lethal equivalents per gamete and R is the ratio of percent selfing survivors to percent outcross survivors [7, 34]. The estimated number of lethal equivalents in white spruce, 12.6 per diploid individual (affecting seed yield, germination, and survival to age 17 years), is one of the highest estimated in conifers or in any life form [35]. However, the deleterious alleles underlying their high mutation load have yet to be identified. Furthermore, mutation load has not been estimated in Engelmann spruce or in white-Engelmann hybrids.
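As a rough illustration of the lethal-equivalents relationship quoted above (2B = −4 ln R), the following Python sketch evaluates the formula in both directions. The function names are hypothetical and are not taken from the cited studies; the sketch assumes that R is supplied as a ratio of survival proportions and that the formula can be applied to a cumulative survival figure.

```python
import math


def lethal_equivalents_per_zygote(selfed_survival, outcrossed_survival):
    """Return 2B = -4 ln R, with R = selfed survival / outcrossed survival.

    Both arguments are survival proportions (or percentages on the same
    scale); only their ratio matters.
    """
    if selfed_survival <= 0 or outcrossed_survival <= 0:
        raise ValueError("survival values must be positive")
    r = selfed_survival / outcrossed_survival
    return -4.0 * math.log(r)


def relative_selfed_survival(lethal_equivalents):
    """Invert the formula: R = exp(-2B / 4) for a given value of 2B."""
    return math.exp(-lethal_equivalents / 4.0)


# Selfed progeny surviving half as often as outcrossed progeny
print(lethal_equivalents_per_zygote(0.5, 1.0))

# Relative survival implied by the 12.6 lethal equivalents quoted above
print(relative_selfed_survival(12.6))
```

Under these assumptions, 12.6 lethal equivalents per zygote corresponds to selfed progeny surviving at roughly 4% of the rate of outcrossed progeny, accumulated over the life stages included in the estimate.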
By identifying deleterious alleles in white spruce, Engelmann spruce and their hybrids, we are able to infer the relative severity of mutation load in the two species and their hybrids, and to better understand the effects of hybridization on load.

We used the program PROVEAN [16] to identify putative deleterious alleles within a previously reported exome capture dataset [36]. We then determined the proportion of alleles that are putatively deleterious in individuals of each of the two spruce species and in hybrids, as well as the proportions of loci that are heterozygous or homozygous for putative deleterious alleles. This information allowed us to test the following hypotheses: first, if putative deleterious alleles are actually deleterious, then we predict that they should occur at lower frequencies than non-deleterious alleles and be associated with a decrease in fitness (via a phenotypic fitness proxy); and second, we hypothesize that mutation load should be reduced in hybrids.

Methods

Data collection

We used a previously reported exome capture dataset containing about nine million single nucleotide polymorphisms (SNPs) identified in 579 spruce individuals from 254 locations in British Columbia and Alberta, Canada, and grown in a common garden experiment [36, 37]. Methods for sample collection and growth, and for SNP identification are documented therein. Briefly, for all sequence alignment and downstream analysis, we used the February 2013 version of the white spruce genome (SMarTForests Project [38]). Sequenced reads were filtered and trimmed using the FASTX toolkit (http://hannonlab.cshl.edu/fastx_toolkit/index.html). We aligned the remaining reads to the draft genome using the Burrows-Wheeler Aligner mem algorithm [39], marked and removed polymerase chain reaction (PCR) duplicates using Picard MarkDuplicates (http://broadinstitute.github.io/picard/), and performed realignment around indels using GATK’s IndelRealigner [40].
Base Quality Score Recalibration (GATK) was conducted using intermediate SNP databases. Following recalibration, a working set of SNPs was called using GATK UnifiedGenotyper (v3.3.0 [40]), with any genotypes that did not have individual quality scores > 20 and depth > 5× set to ‘N’ (i.e. missing data). The set of SNPs was filtered to eliminate any loci that did not meet the following GATK criteria: quality score ≥ 20, map quality score ≥ 40, FisherStrand score ≤ 40, HaplotypeScore ≤ 13, MQRankSumTest ≤ −12.5, ReadPosRankSum > −8. An additional “allele balance” filter based on the ratio of the number of reads supporting reference vs. alternate alleles in heterozygotes was implemented, where any loci with a balance ratio > 2.3 were filtered. The filter values were chosen based on the success/failure rate of SNPs on two Affymetrix SNP arrays that were designed de novo to genotype ~50,000 SNPs in a test sample of 384 individuals. Quality metrics were plotted for successful vs. failed SNPs, and the quality cutoffs were chosen visually based on scores that would have optimized the yield of successful SNPs while eliminating failed SNPs. We then filtered the SNP table to 519,902 genic SNPs (annotated as synonymous or non-synonymous [37, 41]) for which at least 10% of individuals were genotyped and for which heterozygosity was no greater than 0.7. Following the filtering of SNPs, we removed six individuals in the bottom one percentile of the number of SNPs genotyped. Finally, we removed seven individuals who were outliers for inbreeding coefficient (range of 0.38 to 0.57), as determined using 221,950 synonymous SNPs from the original dataset and the R package ‘SNPRelate’ [42].

The common garden experimental design and methods for phenotyping total biomass are described in [43]. Species ancestry between white spruce and Engelmann spruce was estimated with unsupervised analysis in ADMIXTURE, using default parameters [44].
The present dataset contained sufficient sampling within the allopatric ranges of white spruce and Engelmann spruce to estimate ancestry from these species. We also estimated ancestry from Sitka spruce (Picea sitchensis), a parapatric congener, and removed 23 individuals whose value was estimated to be greater than 12% (the average percent expected from a double-backcross). To provide reference genotypes for Sitka spruce, similar sequence capture data from 26 pure Sitka spruce individuals and four white spruce x Sitka spruce hybrids were obtained from an unrelated study (Joane Elleouet, unpublished data). A subset of 289,987 SNPs was used to estimate ancestry; these were selected on the basis of low linkage with other SNPs, having a minor allele frequency > 0.01, and having <30% missing data across both the present dataset and the Sitka spruce data.

Amino acid genotypes and PROVEAN

In 539 trees from 247 locations (Fig. 1), we checked for amino acid variation resulting from 437,639 SNPs in 10,196 coding regions having complete open reading frame predictions [37, 41]. We constructed individual codon genotypes at each codon containing one or more SNPs using the reference transcriptome [41] for all monomorphic positions and SNP calls for polymorphic positions. We then translated codon genotypes at polymorphic codons to amino acid genotypes using the R package ‘Biostrings’ [45]. We removed variants with fewer than 500 alleles genotyped. Variants involving stop codons (i.e. variants at stop positions and nonsense variants) were not considered further in the present study. We calculated allele frequencies of amino acid variants. Polymorphic amino acid positions having more than one major allele (i.e. two or more alleles of equally high frequency) were removed from the dataset. We were left with 165,576 amino acid variants at 153,961 positions in 6928 proteins.

We translated all genes/predicted open reading frames containing amino acid variation to protein sequences using ‘Biostrings’.
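The codon-to-amino-acid step described above can be sketched in a few lines. The study itself used the R package ‘Biostrings’; the Python below is only an illustrative stand-in under the standard genetic code, and every function name in it is hypothetical. It substitutes SNP alleles into a reference codon, translates the result, and identifies the major amino acid allele at a position, returning no major allele when the two most frequent alleles are tied (such positions were removed from the dataset).

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_allele(ref_codon, snp_alleles):
    """Substitute SNP alleles, given as {offset: base}, into a reference codon."""
    bases = list(ref_codon)
    for offset, base in snp_alleles.items():
        bases[offset] = base
    return "".join(bases)


def translate(codon):
    """Translate one codon to a one-letter amino acid ('*' marks a stop)."""
    return CODON_TABLE[codon]


def major_and_minor(amino_acid_alleles):
    """Return (major allele, list of minor alleles) for one position.

    Returns (None, []) when the two most frequent alleles are tied.
    """
    counts = Counter(amino_acid_alleles).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None, []
    return counts[0][0], [aa for aa, _ in counts[1:]]


# Reference codon GAA (Glu): a third-position A>G change is synonymous,
# while a first-position G>A change yields Lys.
print(translate(codon_allele("GAA", {2: "G"})))  # E
print(translate(codon_allele("GAA", {0: "A"})))  # K
print(major_and_minor(list("EEEEEEEKKK")))       # ('E', ['K'])
```

Only the minor alleles returned by such a step would then be scored for a deleterious signature.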
To avoid reference bias [21], we replaced the reference allele with the major allele at polymorphic amino acid positions (the reference allele was different from the major allele in about 1% of polymorphic amino acid positions) and tested the minor allele for a deleterious signature. Testing only minor alleles for a deleterious signature was also justified for this study because our major alleles are ‘major’ across two species and are therefore unlikely to be truly deleterious. While deleterious alleles may occasionally rise to high frequencies within populations at non-equilibrium conditions, it is very unlikely that the same allele would do so in each of two species. We confirmed that the vast majority (99.3%) of the relatively small number of major alleles that would have been flagged as deleterious (n = 1328, as indicated by PROVEAN score of minor allele > +2.282) were indeed major alleles in both species. Rather than deleterious, these alleles may actually be beneficial, but abnormal enough to receive a low PROVEAN score.

We used the program PROVEAN [16] to predict whether or not minor amino acid variants were deleterious. PROVEAN assumes that changes are more likely to be deleterious when they are more abnormal, with scores being lower when amino acid replacements are rarely seen among homologous protein sequences and their types rarely seen in conserved protein sequences. The score is based on the difference in alignment score between the reference protein with and without the mutation of interest, aligned to homologous proteins in the NCBI NR database using BLASTP. Alignment scores are derived from a BLOSUM substitution matrix [46], which gives the log-odds for every possible amino acid substitution based on the relative frequencies of amino acids and their observed substitution probabilities in very conserved regions of protein families in the BLOCKS database.
Thus, we chose this program because the severity of amino acid mutations is empirically estimated using conserved protein sequences that are subject to purifying selection. Importantly, PROVEAN does not use information about allele frequency in its calculations.

We used a threshold PROVEAN score of −2.282, below which variants were predicted to be deleterious. This value was empirically determined to result in the highest balanced accuracy (78.75, based on a sensitivity of 78.39 and a specificity of 79.11) in tests of known disease variants and common polymorphisms (assumed to be neutral) in the “Human Polymorphisms and Disease Mutations” dataset by [16] (see the Discussion regarding the accuracy of specificity estimates). This balanced accuracy has been shown to be very similar to that of other programs available for predicting deleterious alleles (tested using similar methods), including SIFT, PolyPhen2 and Mutation Assessor [16]. Furthermore, studies that have used both PROVEAN and SIFT have found that the program used did not qualitatively affect the results [23, 25]. Finally, here we compare the relative strength of mutation load across populations, rather than estimating the absolute strength of mutation load. Thus, while we recognize that some fraction of the predicted deleterious alleles are likely false positives, the specific rate of false positives should not qualitatively affect our results, as we do not expect false positives to differ in average frequency across populations.

Fig. 1 Sample collection locations. Seed was collected from the 249 locations indicated across British Columbia and Alberta, Canada. The average ancestry proportion of individuals in a given collection location is indicated by point color which ranges from blue, representing pure white spruce, to red, representing pure Engelmann spruce. Background colors show predicted species ranges (based on climatic niche model) of white spruce (blue), Engelmann spruce (red) and hybrids (purple). Niche envelopes were generated by Tongli Wang (unpubl.) with methodology as described in Wang et al. [57]. All maps were generated by JD, and produced using ESRI ArcGIS 10.2.2. No copyright permissions were required

Nucleotide diversity

To determine nucleotide diversity in the coding regions we implemented a modified version of the SNP calling pipeline in [37] (see Supplementary Methods). Briefly, we filtered SNPs using Qual scores and genotype quality scores over 20, and map quality scores of 40 and filtered all sites, including non-variable sites, with a minimum depth threshold of five for each individual. Using custom perl scripts we created corresponding fasta files for each individual and included “N” at sites that failed the filtering criteria for individuals. We extracted coding sequences for each coding region from these fasta alignments based on identified open reading frames for the transcripts [47].

To assess nucleotide polymorphism for each species, we used a modified version of Polymorphorama perl script [48–50] to generate summary statistics for each gene [47]. We estimated the number of synonymous sites, non-synonymous sites, and average pairwise diversity at synonymous (πs) and non-synonymous sites (πn). We only examined variable coding regions with more than 100 bp sequenced (with no missing data) in 20 or more individuals in each species (hybrid, white and Engelmann) leaving coding regions for 3531 separate genes. We used the non-parametric Kruskal-Wallis rank sum test to determine if the species differed for these sequence diversity statistics and the Nemenyi-tests (R package PMCMR) for multiple comparisons when significance was detected.

Results

Proportion of alleles predicted to be deleterious

We identified 291,892 polymorphic codon positions in 539 individuals of white spruce, Engelmann spruce and their hybrids (Fig. 1) in a previously reported exome capture dataset [36].
Of these, 126,316 coded for the same amino acid (which we refer to as synonymous) and 165,576 resulted in amino acid variants. The mean PROVEAN score of minor amino acid variants was −0.775 (0.004 SE) (Additional file 1: Figure S1). Of the minor variants, 13.35% had PROVEAN scores below the threshold of −2.282, and were therefore predicted to be deleterious (Additional file 1: Figure S1). On average, individuals carried a heterozygous putative deleterious minor allele at 0.4% (0.001% SE) of polymorphic loci (note that from here forward, ‘polymorphic’ refers to the amino acid level) and were homozygous for putative deleterious minor alleles at 0.05% (0.0003% SE) of polymorphic loci. Also from here forward, we often refer to ‘predicted deleterious’ variants as simply ‘deleterious’ and ‘predicted non-deleterious’ variants as ‘non-deleterious’.

The efficacy of PROVEAN predictions in spruce

Variants predicted to be deleterious were at significantly lower allele frequencies on average than both synonymous codons (p < 10^−15 based on Mann–Whitney U test) and variants predicted to be non-deleterious (p < 10^−15 based on Mann–Whitney U test) (Fig. 2). To ensure that this result was not dependent on the PROVEAN score threshold chosen, we varied the threshold between −1 and −6 and found that the difference persisted across all values (data not shown). With lower thresholds, the magnitude of allele frequency differences increased but power to detect these differences decreased as sample size of predicted deleterious alleles decreased.

Because deleterious variants tend to be kept at low frequencies by purifying selection, we did not use a minor allele frequency cutoff in this study. However, some rare variants may be sequencing or genotyping errors rather than real variants, and these errors may appear deleterious more often than real variants since they are more likely to cause unusual amino acid changes and have never been exposed to natural selection.
To address this issue, we removed all variants observed only once from the dataset (29,900 synonymous and 52,176 non-synonymous) and found that qualitative patterns of differences in allele frequency persisted (p < 10^−15 based on Mann–Whitney U test for deleterious variants vs. synonymous codons, and p < 10^−15 based on Mann–Whitney U test for deleterious vs. non-deleterious variants) (Additional file 1: Figure S2).

Fig. 2 Folded site frequency spectrum. Synonymous minor alleles are shown in grey, nonsynonymous non-deleterious minor alleles are shown in green and nonsynonymous deleterious minor alleles are shown in orange

Effect of mutation load on a fitness proxy

We found that the proportion of ancestry from Engelmann spruce, hereafter ‘ancestry proportion’, explained significant variation in total biomass of seedlings (dry root weight + shoot weight) of individuals, in a quadratic regression (r^2 = 0.04, F(2,533) = 11.24, p = 1.7 × 10^−5) (Fig. 3). After accounting for species identity, we found no effect of either the proportion of alleles at polymorphic loci that were predicted to be deleterious (F(1,534) = 0.004, p = 0.95) or the proportion of polymorphic loci predicted to be homozygous deleterious (F(1,534) = 0.005, p = 0.95) on total biomass (Fig. 3). However, we did find that the seven individuals who were likely recently inbred, as inferred from their unusually high inbreeding coefficients (range of 0.38 to 0.57), had significantly lower total biomass than other individuals (t = 3.1, df = 6, p = 0.02) (Additional file 1: Figure S3), providing support for use of total biomass as a fitness proxy.

Fig. 3 Effect of deleterious alleles on a fitness proxy. Quadratic regression of the proportion of ancestry from Engelmann spruce on total biomass (a) and linear regressions of those residuals on the proportion of alleles at polymorphic loci predicted to be deleterious (b) and the proportion of polymorphic loci predicted to be homozygous deleterious (c). The proportion of ancestry from Engelmann is represented by a color gradient with warm colors indicating a high proportion and cool colors indicating a low proportion

Deleterious load carried by individuals

Below, to be concise, we refer to the ‘proportion of alleles at polymorphic loci’ as simply the ‘proportion of alleles’ and the ‘proportion of polymorphic loci’ as simply the ‘proportion of loci’. There was a strong positive linear relationship between the proportion of ancestry from Engelmann spruce and both the proportion of alleles that are non-deleterious minor alleles (r^2 = 0.93, F(1,537) = 6943, p < 10^−15) and the proportion of alleles that are deleterious minor alleles (r^2 = 0.39, F(1,537) = 337.3, p < 10^−15) (Fig. 4a, b). We also tested for differences among binned species groups of ‘pure Engelmann spruce’ (i.e. proportion of ancestry from Engelmann spruce ≥90%), ‘pure White spruce’ (i.e. proportion of ancestry from Engelmann spruce ≤10%), and ‘intermediate hybrids’ (i.e. 40% ≤ proportion of ancestry from Engelmann spruce ≤60%) (Additional file 1: Figure S5). Both the proportion of minor alleles that are non-deleterious and the proportion of minor alleles that are deleterious were highly significantly different among these binned species groups (F(2,357) = 2444.2, p < 10^−15; F(2,357) = 131.79, p < 10^−15, respectively). On average, Engelmann spruce individuals carried 26% more non-deleterious minor alleles than white spruce individuals (Tukey HSD p = 0) and 12% more deleterious minor alleles than white spruce individuals (Tukey HSD p = 0) (Fig. 4a, b). Intermediate hybrids had intermediate values, with 14% more non-deleterious minor alleles than white spruce individuals (Tukey HSD p = 0) and 10% fewer non-deleterious minor alleles than Engelmann spruce individuals (Tukey HSD p = 0). Intermediate hybrids also had 7% more deleterious minor alleles than white spruce individuals (Tukey HSD p = 0) and 4% fewer deleterious minor alleles than Engelmann spruce individuals (Tukey HSD p = 5.7 × 10^−6) (Fig. 4a, b). Qualitatively, the same patterns were found for mean allele frequencies of non-deleterious and deleterious minor alleles among the binned species groups (Additional file 1: Figure S6). Furthermore, we tested how the ratio of the proportion of alleles that are minor and deleterious (response variable in Fig. 4b) to the proportion of alleles that are minor and non-deleterious (response variable in Fig. 4a) changed with the proportion of ancestry from Engelmann spruce. We found that this ratio decreased significantly with increasing ancestry from Engelmann spruce (r^2 = 0.53, F(1,537) = 598.9, p < 10^−15) (Fig. 4c).

Both the proportion of loci that were homozygous for a non-deleterious minor allele and the proportion that were homozygous for a deleterious minor allele increased with the proportion of ancestry from Engelmann spruce with positive quadratic curvature (r^2 = 0.77, F(2,536) = 887.4, p < 10^−15; r^2 = 0.29, F(2,536) = 108.1, p < 10^−15, respectively) (Fig. 4d, e). The proportion of loci that were homozygous for a non-deleterious minor allele and the proportion that were homozygous for a deleterious minor allele were both highly significantly different among binned species groups (F(2,357) = 579.19, p < 10^−15; F(2,357) = 66.721, p < 10^−15, respectively). On average, Engelmann spruce individuals had a 51% higher proportion of loci that were homozygous for a non-deleterious minor allele and an 11% higher proportion of loci that were homozygous for a deleterious minor allele than white spruce individuals (Tukey HSD p = 0 for both) (Fig. 4d, e). Intermediate hybrids had the lowest values, having a 5% lower proportion of loci that were homozygous for a non-deleterious minor allele than white spruce individuals (Tukey HSD p = 0.011) and 37% lower proportion of these than Engelmann spruce individuals (Tukey HSD p = 0). Intermediate hybrids had a 12% lower proportion of loci that were homozygous for a deleterious minor allele than white spruce individuals (Tukey HSD p = 0) and 21% lower proportion of these than Engelmann spruce individuals (Tukey HSD p = 0) (Fig. 4d, e). The ratio of the proportion of loci that were homozygous for a deleterious minor allele (response variable in Fig. 4e) to the proportion that were homozygous for a non-deleterious minor allele (response variable in Fig. 4d) decreased significantly with increasing ancestry from Engelmann spruce with negative quadratic curvature (r^2 = 0.46, F(2,536) = 228.3, p < 10^−15) (Fig. 4f).

Fig. 4 Prevalence of non-deleterious and deleterious minor alleles per individual by ancestry proportion. The proportion of ancestry from Engelmann spruce is shown against the proportion of alleles at polymorphic loci that are non-deleterious minor alleles (a), the proportion of alleles at polymorphic loci that are deleterious minor alleles (b), the ratio of the proportion of alleles at polymorphic loci that are deleterious minor alleles to the proportion of alleles at polymorphic loci that are non-deleterious minor alleles (c), the proportion of polymorphic loci that are homozygous for a non-deleterious minor allele (d), the proportion of polymorphic loci that are homozygous for a deleterious minor allele (e) and the ratio of the proportion of polymorphic loci that are homozygous for a deleterious minor allele to the proportion of polymorphic loci that are homozygous for a non-deleterious minor allele (f). Lines in (a)–(c) represent linear regressions and those in (d)–(f) represent quadratic regressions. Vertical colored bars represent 95% confidence intervals for the mean of each species group (blue for pure white spruce, red for pure Engelmann spruce and purple for intermediate hybrid) and columns of the corresponding background colors indicate the range of individuals included in each species group. Note difference in Y-axis scale among panels, especially for deleterious and non-deleterious alleles

We had more white-like individuals than Engelmann-like individuals in our sample. This uneven sampling may have caused alleles at higher relative frequency in white than in Engelmann spruce to be deemed major alleles, and therefore not tested for deleterious signatures. Thus, we estimated the number of major alleles that would have been minor alleles if we had even sampling. We did this by counting the number of major alleles at higher relative frequency in the white spruce binned group than in the Engelmann spruce binned group, and for which the relative frequencies of the two groups sum to less than one. We found only three of these alleles in the dataset and therefore conclude that uneven sampling has had a minimal effect on our results.

Nucleotide diversity

We found that average pairwise diversity at synonymous sites (πs; χ^2 = 71.54, df = 2, p < 0.001) and non-synonymous sites (πn; χ^2 = 57.65, df = 2, p < 0.001) were significantly different among the binned species groups. White spruce had significantly lower πs (mean = 0.0033, SE = 9.7 × 10^−5) than both Engelmann (mean = 0.0040, SE = 1.0 × 10^−4) and intermediate hybrids (mean = 0.0040, SE = 1.0 × 10^−4, p < 0.001), but Engelmann and intermediate hybrids did not differ in their average pairwise diversity at synonymous sites. The same pattern of significance was found for non-synonymous sites (πn) (mean ± SE: white = 0.0012 ± 4.1 × 10^−5, Engelmann = 0.0014 ± 4.4 × 10^−5, intermediate hybrid = 0.0014 ± 4.4 × 10^−5).

Discussion

We used the distribution of putatively deleterious alleles carried by individuals to infer the relative strength of mutation load across a natural hybrid complex.
This approach allowed us to test the prediction that mutation load is reduced in hybrids that are locally adapted to intermediate environments, a pattern that confounds our ability to test this prediction phenotypically.

Proportion of alleles predicted to be deleterious

Approximately 13% of nonsynonymous minor alleles segregating in the interior spruce hybrid complex were flagged as deleterious (using the threshold PROVEAN score recommended by [16]). This estimate is not far from other such estimates in wild populations, including about 20% in Arabidopsis thaliana and rice [17], about 12% in Helianthus annuus [23] and about 28% in Populus trichocarpa [25]. Here, we refrain from comparing numbers or proportions of deleterious alleles per individual (reflecting the extent of mutation load) with those estimated for other taxa, due to differences in methods and cautions explained below. However, inbreeding experiments have long suggested that conifers suffer from relatively high mutation load [32, 34, 35, 51, 52].

We recommend caution when interpreting absolute numbers and absolute proportions of deleterious alleles estimated bioinformatically in this and other studies. Choi et al. [16] suggest that PROVEAN has approximately an 80% specificity (meaning that 20% of all true negative SNPs tested would give false positive results). Surprisingly, this and other studies have found a lower proportion of variants that are flagged as deleterious (a group that should include true positives + false positives) than are expected or even possible given the estimated specificity of PROVEAN (or the other similar programs that have been used) [17, 20, 22, 23, 25]. Finding fewer positive results than predicted by the expected false positive rate is difficult to explain unless the stated specificity of the PROVEAN method is estimated in error. To accurately estimate specificity, one must test the focal program on a set of known neutral alleles and calculate the number of false positives generated.
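The arithmetic behind this argument can be made explicit with a short sketch. It computes the fraction of tested variants expected to be flagged for a given specificity, sensitivity and true fraction of deleterious variants. The 13% observed value and the 80% specificity come from the text; the sensitivity value and the 10% true-deleterious fraction are purely illustrative assumptions, and the function name is hypothetical.

```python
def expected_flagged_fraction(
    true_del_fraction: float,
    sensitivity: float,
    specificity: float,
) -> float:
    """Expected fraction of tested variants flagged as deleterious.

    flagged = false positives among neutral variants
            + true positives among deleterious variants
    """
    for x in (true_del_fraction, sensitivity, specificity):
        if not 0.0 <= x <= 1.0:
            raise ValueError("all arguments must lie in [0, 1]")
    false_pos = (1.0 - specificity) * (1.0 - true_del_fraction)
    true_pos = sensitivity * true_del_fraction
    return false_pos + true_pos


OBSERVED_FLAGGED = 0.13      # reported in the text
STATED_SPECIFICITY = 0.80    # Choi et al. [16], as quoted in the text
ASSUMED_SENSITIVITY = 0.80   # illustrative assumption, not from the text

# If no variant were truly deleterious, every flag would be a false positive.
floor = expected_flagged_fraction(0.0, ASSUMED_SENSITIVITY, STATED_SPECIFICITY)

# With, say, 10% truly deleterious variants the expectation only goes up.
with_some_true = expected_flagged_fraction(0.10, ASSUMED_SENSITIVITY, STATED_SPECIFICITY)
```

Under these assumptions the expected flagged fraction is about 0.20 when nothing is truly deleterious and about 0.26 when 10% of variants are, so whenever sensitivity exceeds the false positive rate, 0.20 is a floor. An observed value of 0.13 sits below that floor, which is the inconsistency the paragraph describes.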
However, confirming neutrality is very difficult. Studies estimating specificity (including Choi et al. [16]) tend to deem disease-causing variants as truly deleterious and variants not known to cause disease as truly neutral. We predict that many of the variants assumed to be truly neutral are actually weakly deleterious, with effects too small to detect phenotypically or through functional assays, but large enough to be kept at low frequency by selection. If true, this would help to explain why studies like ours often find fewer false positives than predicted by specificity estimates. While this issue calls into question direct interpretation of bioinformatically estimated numbers and proportions of deleterious alleles, such estimates are still useful for relative comparisons across groups within studies (e.g., across the interior spruce hybrid complex) and across studies that use the same PROVEAN score threshold or for which the programs used have similar sensitivity and specificity, estimated in a similar way.

The efficacy of PROVEAN predictions in spruce

PROVEAN and similar programs predict that alleles are deleterious when they occur in conserved amino acid positions and when the amino acid replacement is either relatively rare or causes a substantial biochemical change at the site [12–16]. However, in some cases such alleles may not be truly deleterious. These alleles may represent genetic innovations that are globally beneficial to the focal species, or they may be beneficial in particular environments and neutral or deleterious in others. In support of the PROVEAN predictions, however, we find strong evidence that alleles predicted to be deleterious are at lower allele frequencies on average than those not predicted to be deleterious, suggesting that predicted deleterious alleles are enriched for truly deleterious alleles.
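The frequency comparison described here can be sketched as follows: variants are split by PROVEAN score and the mean minor allele frequency of each class is compared. The cutoff of -2.5 is assumed here as the commonly cited PROVEAN default (scores at or below it are treated as deleterious); the input layout and function name are hypothetical, and the study itself may have used a different statistical test.

```python
from statistics import mean
from typing import Iterable, Tuple

# Assumption: the commonly cited default PROVEAN cutoff.
PROVEAN_CUTOFF = -2.5


def mean_maf_by_class(
    variants: Iterable[Tuple[float, float]],
    cutoff: float = PROVEAN_CUTOFF,
) -> Tuple[float, float]:
    """Return (mean MAF of predicted deleterious, mean MAF of the rest).

    Each variant is a (provean_score, minor_allele_frequency) pair.
    Raises statistics.StatisticsError if either class is empty.
    """
    rows = list(variants)  # allow generators to be passed in
    deleterious = [maf for score, maf in rows if score <= cutoff]
    other = [maf for score, maf in rows if score > cutoff]
    return mean(deleterious), mean(other)


# Toy data: two variants at or below the cutoff, two above it.
toy = [(-4.1, 0.01), (-2.5, 0.03), (-1.0, 0.10), (0.5, 0.30)]
del_maf, other_maf = mean_maf_by_class(toy)
```

In the toy data the predicted deleterious class has the lower mean frequency, which is the direction of the pattern reported in the text. A real analysis would also need a significance test and a check that rare genotyping errors are not producing the difference.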
We also find that non-synonymous mutations that were not predicted to be deleterious were at lower allele frequencies than synonymous mutations, most likely reflecting the presence of false negatives in that category. Nonetheless, similar evidence that predicted deleterious alleles are enriched for truly deleterious alleles has recently been found in other systems as well, including Arabidopsis [17], maize [20], barley and soybean [22], sunflower [23] and humans [19, 21, 24], suggesting that these approaches are useful for identifying deleterious alleles underlying mutation load.

Genotyping error presents an underappreciated complication to studies identifying deleterious alleles bioinformatically. Because we expect deleterious alleles to be at low frequencies, we cannot use a minor allele frequency cutoff when filtering SNPs to help eliminate rare genotyping errors, as is typically done in studies of the genetics of adaptation. Because genotyping errors are not real alleles that are exposed to selection, they may appear to be alleles resulting in more severe biochemical changes and thus be called deleterious more often than real alleles. This represents an alternative explanation for the commonly reported pattern that predicted deleterious alleles are at lower frequencies. However, when we eliminated apparent alleles observed only a single time (the frequency class most likely to contain genotyping errors), we still found strong evidence that predicted deleterious alleles are at lower allele frequencies. This suggests that while rare genotyping errors are almost certainly present in the dataset, they are not driving the pattern, and instead, many real deleterious alleles have been identified.

Ideally, we would like to confirm that predicted deleterious alleles indeed have a negative effect on a phenotypic fitness proxy.
Because most deleterious alleles are of small effect and are at low allele frequencies, detecting their individual phenotypic effects requires prohibitively massive sample sizes. Here, we tested for cumulative effects of deleterious alleles (i.e. the proportion of alleles that are deleterious, or the proportion of loci that are homozygous deleterious) on the total biomass of seedling individuals, a proxy for juvenile fitness, and we were not able to detect an effect of either variable beyond the effects of ancestry proportion itself. Because deleterious alleles tend to be strongly differentiated among species, their patterns are tightly correlated with those of ancestry. This has perhaps led to low power for detecting an additional effect of mutation load on our phenotypic fitness proxy. Moreover, seedling biomass is only one small component of total fitness and the correlation between this trait and fitness may not be strong. Finally, our sample of putatively deleterious alleles is likely only a small fraction of those that exist in the genome, and many may be outside of coding regions, which our approach could not target. Zhang et al. carried out a similar test in Populus trichocarpa, and they did detect a significant effect of the proportion of putatively deleterious homozygous alleles on plant height after accounting for distance from the range center and population structure using principal components analysis [25]. Other studies have also found that genes associated with complex traits or genes with known functional effects are enriched for bioinformatically predicted deleterious variants [17, 20, 25].

Relative amount of mutation load in white spruce and Engelmann spruce

Our results suggest that Engelmann spruce carries a greater mutation load than white spruce.
Engelmann spruce individuals tend to be burdened by more deleterious alleles (in both heterozygous and homozygous state) due to both higher diversity and higher frequencies of all rare alleles, including deleterious ones. Mitochondrial DNA haplotype diversity is also considerably greater in Engelmann spruce than in white spruce where their ranges overlap (JC Degner, unpublished data). Furthermore, isozyme diversity increases with latitude in Engelmann spruce, and it is highest where its range overlaps with white spruce, a pattern attributed to hybridization [29]. On the other hand, white spruce in this area are at the leading edge of a rapid and long-distance range expansion [26, 27]. Therefore, their relatively low diversity may be due to serial founder events that have taken place during range expansion. It may seem counterintuitive that the species with greater genetic diversity has greater mutation load. This pattern could be explained by anything that maintains a larger species-level population size or a higher mutation rate. In particular, if local population sizes are low and there is low dispersal between local populations, then drift is strong relative to selection within local populations, allowing neutral and deleterious alleles to drift locally to high frequencies, while low dispersal maintains a high level of genetic diversity at the species level [53]. Engelmann spruce is adapted to high elevations and its range is currently fragmented on mountain tops [28]. It is plausible (although by no means certain) that during glaciation, refugial population sizes were small with low dispersal between them, contributing to the pattern we observe.

While with increasing ancestry from Engelmann spruce, individuals have more deleterious alleles on average, we also find that the same individuals have proportionately fewer deleterious alleles relative to non-deleterious alleles (both in total and in homozygous state).
In other words, patterns of deleterious alleles vary with ancestry proportion when correcting for patterns in non-deleterious alleles. First, this provides evidence that patterns in deleterious alleles are distinguishable from those of demography (as represented by non-deleterious alleles). Second, it may provide evidence that selection is less efficient at removing deleterious variants in more white spruce-like populations, due to weaker purifying selection and/or stronger genetic drift. Given that these white spruce populations are at the leading edge of a recent, long-distance range expansion [26, 27], these results may be a signature of serial founder effects during range expansion (i.e. expansion load) [54]. However, further work is needed to test for expansion load in white spruce. Similarly, Lohmueller et al. [55] found that while African American individuals contained more non-deleterious and deleterious variants than European American individuals, on average, they had a lower proportion of deleterious variants, likely resulting from both increased drift due to serial founder effects and a decreased strength of purifying selection in populations that expanded out of Africa [24]. Note, though, that several other relevant human studies have been done, with a range of results (e.g. [21, 24, 55, 56]). Together they are forming a detailed picture of the effects of the out-of-Africa expansion on deleterious variation in humans. Importantly, while there are some similarities between our results and human results, there are also differences. These are likely due in part to significant differences in demographic history between the systems.

Relative amount of mutation load in parental species and their hybrids

Hybrids are intermediate relative to parental species for the proportion of alleles that are deleterious. Thus, if most deleterious alleles have additive effects, then hybrids should have an intermediate mutation load relative to parental species.
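A small calculation helps show why the dominance of deleterious alleles matters for this comparison. The sketch below is an idealised illustration, not the analysis performed in the study: it assumes Hardy-Weinberg proportions within each parental species, first-generation hybrids formed by drawing one gamete from each species, and made-up allele frequencies. The function name and the dictionary layout are hypothetical.

```python
from typing import Dict, Sequence


def expected_burden(
    freq_a: Sequence[float],
    freq_b: Sequence[float],
) -> Dict[str, Dict[str, float]]:
    """Expected per-individual burden in two parental species and their F1.

    freq_a[i] and freq_b[i] are the frequencies of the deleterious allele
    at locus i in species A and species B.
    'alleles' is the expected number of deleterious copies carried
    (relevant if effects are additive); 'homozygous_loci' is the expected
    number of loci homozygous for the deleterious allele (relevant if
    effects are recessive).
    """
    if len(freq_a) != len(freq_b):
        raise ValueError("both species need one frequency per locus")
    return {
        "species_a": {
            "alleles": sum(2 * p for p in freq_a),
            "homozygous_loci": sum(p * p for p in freq_a),
        },
        "species_b": {
            "alleles": sum(2 * q for q in freq_b),
            "homozygous_loci": sum(q * q for q in freq_b),
        },
        # An F1 receives one gamete from each species.
        "f1_hybrid": {
            "alleles": sum(p + q for p, q in zip(freq_a, freq_b)),
            "homozygous_loci": sum(p * q for p, q in zip(freq_a, freq_b)),
        },
    }


# Made-up frequencies: two loci with species-private alleles, one shared.
burden = expected_burden([0.2, 0.0, 0.1], [0.0, 0.1, 0.1])
```

With these toy values the hybrid carries an intermediate number of deleterious copies (0.5, between 0.6 and 0.4) but has the fewest homozygous loci (0.01, against 0.05 and 0.02), because alleles private to one species cannot be homozygous in an F1. Real hybrid zones contain later-generation hybrids with a continuum of ancestry, so the size of the effect will differ from this idealised case.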
However, evidence suggests that deleterious alleles tend to be (at least partially) recessive [2, 7–9]. Here, we find that hybrids have a lower proportion of loci that are homozygous for deleterious alleles than either parental species. While this pattern is also expected (and was found) for non-deleterious alleles, showing a decrease in the homozygosity of deleterious alleles provides direct evidence for the possibility of complementation in hybrids. Complementation is the mechanism underlying the dominance hypothesis of heterosis [3, 4]. Even if complementation of deleterious alleles does not have a large enough effect to result in heterosis (i.e., hybrid fitness > parent fitness), it may still contribute to higher fitness in hybrids than they would otherwise have and therefore have a significant impact on the outcomes of hybridization and on the stability of hybrid zones. When hybrids also benefit from local adaptation to their own environment, decreased mutation load due to complementation of deleterious alleles may give them a fitness advantage as well, though its effects on phenotypic fitness proxies would be inseparable from those of local adaptation. Studying mutation load at the genetic level has allowed us to infer that interior spruce hybrids, which are locally adapted to environmental conditions that are intermediate to the parental species (Engelmann and white spruce), also benefit from reduced mutation load due to complementation of deleterious alleles, given that most deleterious alleles are recessive.

Conclusions

Here we showed that PROVEAN is a useful tool for identifying the genetic basis of mutation load in the interior spruce hybrid complex. The set of putatively deleterious alleles we identified allowed us to compare the relative strength of mutation load across the hybrid complex. We found that Engelmann spruce suffers from greater mutation load than white spruce due to higher frequencies of rare deleterious alleles.
Given that deleterious alleles tend to be recessive, we also find that hybrids have lower mutation load than either of the parental species, due to complementation of deleterious alleles introduced by each of the parental species. Along with bounded hybrid superiority, this reduced mutation load likely contributes to the high hybrid fitness previously reported in this complex [31].

Interior spruce is an economically important species complex in Canada, and is the second most planted tree type in British Columbia [31]. Because these trees suffer from high mutation load, understanding the genetic basis of mutation load and the factors that contribute to its strength will help us with management and breeding of this important genetic resource.

Additional file
Additional file 1: Supplementary Material. Supplementary Figures S1-S6 (referenced in text). (PDF 9957 kb)

Abbreviations
PCR: Polymerase chain reaction; SNP: Single nucleotide polymorphism

Acknowledgements
Not applicable.

Funding
This work was part of the AdapTree Project (S.N. Aitken and A. Hamann, co-Project Leaders), funded by the Genome Canada Large Scale Applied Research Project program, with co-funding from Genome BC, the BC Ministry of Forests, Lands and Natural Resources Operations, Forest Genetics Council of BC, Alberta Innovates Bio Solutions, Virginia Tech, and the University of British Columbia. The funding bodies did not provide input into the design of the study, collection, analysis, interpretation of results or in writing the manuscript.

Availability of data and materials
All sequence data used here can be found in the NCBI sequence read archive under SRA accession number SRP071805, https://trace.ncbi.nlm.nih.gov/Traces/sra/?study=SRP071805, and BioProject accession number PRJNA251573, https://www.ncbi.nlm.nih.gov/bioproject/PRJNA251573.

Authors’ contributions
GLC, MCW, LHR and SNA conceived of the study. SY and KAH generated the SNP genotypes.
GLC analyzed the data and wrote the manuscript. KAH conducted nucleotide diversity analyses. JCD provided ancestry proportion data and produced map figures. All authors assisted with writing the manuscript. All authors read and approved the final manuscript.

Ethics approval and consent to participate
Existing seed collections registered and stored in BC and AB provincial government seedbanks were donated by government and industry for this research and met all guidelines for operational reforestation seedlots. A list of organizations donating seed is available at http://adaptree.forestry.ubc.ca/seed-contributors. No licenses were required.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author details
1 Department of Forest and Conservation Sciences, University of British Columbia, 3041-2424 Main Mall, Vancouver, BC V6T 1Z4, Canada.
2 Department of Botany, University of British Columbia, 3200-6270 University Blvd, Vancouver, BC V6T 1Z4, Canada.
3 Department of Zoology, University of British Columbia, 4200-6270 University Blvd, Vancouver, BC V6T 1Z4, Canada.
4 Present Address: School of Biological Sciences, Monash University, Clayton Campus, Melbourne, Victoria 3800, Australia.
5 Present Address: Department of Biological Sciences, University of Calgary, 2500 University Dr NW, Calgary, AB T2N 1N4, Canada.

Received: 25 May 2017 Accepted: 21 November 2017

References
1. Kimura M, Maruyama T, Crow JF. The mutation load in small populations. Genetics. 1963;48:1303.
2. Agrawal AF, Whitlock MC. Mutation load: the fitness of individuals in populations where deleterious alleles are abundant. Annu Rev Ecol Evol Syst. 2012;43:115–35.
3. Crow JF. Alternative hypotheses of hybrid vigor. Genetics. 1948;33:477–87.
4. Gowen JW.
Heterosis; a record of researches directed toward explaining and utilizing the vigor of hybrids. Iowa City: Iowa State College Press; 1952.
5. Lippman ZB, Zamir D. Heterosis: revisiting the magic. Trends Genet. 2007;23:60–6.
6. Whitlock MC, Ingvarsson PK, Hatfield T. Local drift load and the heterosis of interconnected populations. Heredity. 2000;84:452–7.
7. Morton NE, Crow JF, Muller HJ. An estimate of the mutational damage in man from data on consanguineous marriages. Proc Natl Acad Sci U S A. 1956;42:855–63.
8. Simmons MJ, Crow JF. Mutations affecting fitness in Drosophila populations. Annu Rev Genet. 1977;11:49–78.
9. Agrawal AF, Whitlock MC. Inferences about the distribution of dominance drawn from yeast gene knockout data. Genetics. 2011;187:553–66.
10. Kawecki TJ, Ebert D. Conceptual issues in local adaptation. Ecol Lett. 2004;7:1225–41.
11. Haldane JBS. The effect of variation on fitness. Am Nat. 1937;71:337–49.
12. Cooper GM, Stone EA, Asimenos G, Green ED, Batzoglou S, Sidow A. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 2005;15:901–13.
13. Stone EA, Sidow A. Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity. Genome Res. 2005;15:978–86.
14. Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc. 2009;4:1073–81.
15. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7:248–9.
16. Choi Y, Sims GE, Murphy S, Miller JR, Chan AP. Predicting the functional effect of amino acid substitutions and indels. PLoS One. 2012;7:e46688.
17. Günther T, Schmid KJ. Deleterious amino acid polymorphisms in Arabidopsis thaliana and rice. Theor Appl Genet. 2010;121:157–68.
18. Flanagan SE, Patch A-M, Ellard S. Using SIFT and PolyPhen to predict loss-of-function and gain-of-function mutations.
Genet Test Mol Biomark. 2010;14:533–7.
19. Xue Y, Chen Y, Ayub Q, Huang N, Ball EV, Mort M, et al. Deleterious- and disease-allele prevalence in healthy individuals: insights from current predictions, mutation databases, and population-scale resequencing. Am J Hum Genet. 2012;91:1022–32.
20. Mezmouk S, Ross-Ibarra J. The pattern and distribution of deleterious mutations in maize. G3. 2014;4:163–71.
21. Simons YB, Turchin MC, Pritchard JK, Sella G. The deleterious mutation load is insensitive to recent population history. Nat Genet. 2014;46:220–4.
22. Kono TJY, Fu F, Mohammadi M, Hoffman PJ, Liu C, Stupar RM, et al. The role of deleterious substitutions in crop genomes. Mol Biol Evol. 2016;33:2307–2317.
23. Renaut S, Rieseberg LH. The accumulation of deleterious mutations as a consequence of domestication and improvement in sunflowers and other Compositae crops. Mol Biol Evol. 2015;32:2273–83.
24. Henn BM, Botigué LR, Peischl S, Dupanloup I, Lipatov M, Maples BK, et al. Distance from sub-Saharan Africa predicts mutational load in diverse human genomes. Proc Natl Acad Sci. 2016;113:E440–9.
25. Zhang M, Zhou L, Bawa R, Suren H, Holliday JA. Recombination rate variation, hitchhiking, and demographic history shape deleterious load in poplar. Mol Biol Evol. 2016;33:2899–910.
26. Ritchie JC, MacDonald GM. The patterns of post-glacial spread of white spruce. J Biogeogr. 1986;13:527–40.
27. Anderson LL, Hu FS, Nelson DM, Petit RJ, Paige KN. Ice-age endurance: DNA evidence of a white spruce refugium in Alaska. Proc Natl Acad Sci. 2006;103:12447–50.
28. Alexander RR, Shepperd WD. Picea engelmannii Parry ex. Engelm. Engelmann spruce. In: Burns RM, Honkala BH, editors. Silv. N. Am. Vol conifers agric. Handb. 654. Washington, DC: USDA Forest Service; 1990. p. 187–203.
29. Ledig FT, Hodgskiss PD, Johnson DR. The structure of genetic diversity in Engelmann spruce and a comparison with blue spruce. Can J Bot. 2006;84:1806–28.
30.
De La Torre A, Ingvarsson PK, Aitken SN. Genetic architecture and genomic patterns of gene flow between hybridizing species of Picea. Heredity. 2015;115:153–64.
31. De La Torre AR, Wang T, Jaquish B, Aitken SN. Adaptation and exogenous selection in a Picea glauca × Picea engelmannii hybrid zone: implications for forest management under climate change. New Phytol. 2014;201:687–99.
32. Namkoong G, Bishir J. The frequency of lethal alleles in forest tree populations. Evolution. 1987;41:1123–6.
33. Klekowski EJ. Genetic load and its causes in long-lived plants. Trees. 1988;2:195–203.
34. Savolainen O, Karkkainen K, Kuittinen H. Estimating numbers of embryonic lethals in conifers. Heredity. 1992;69:308–14.
35. Fowler DP, Park YS. Population studies of white spruce. I. Effects of self-pollination. Can J For Res. 1983;13:1133–8.
36. Suren H, Hodgins KA, Yeaman S, Nurkowski KA, Smets P, Rieseberg LH, et al. Exome capture from the spruce and pine giga-genomes. Mol Ecol Resour. 2016;16:1136–46.
37. Yeaman S, Hodgins KA, Lotterhos KE, Suren H, Nadeau S, Degner JC, et al. Convergent local adaptation to climate in distantly related conifers. Science. 2016;353:1431–3.
38. Birol I, Raymond A, Jackman SD, Pleasance S, Coope R, Taylor GA, et al. Assembling the 20 Gb white spruce (Picea glauca) genome from whole-genome shotgun sequencing data. Bioinformatics. 2013;29:1492–7.
39. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
40. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43:491–8.
41. Yeaman S, Hodgins KA, Suren H, Nurkowski KA, Rieseberg LH, Holliday JA, et al. Conservation and divergence of gene expression plasticity following c. 140 million years of evolution in lodgepole pine (Pinus contorta) and interior spruce (Picea glauca × Picea engelmannii). New Phytol. 2014;203:578–91.
42.
Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics. 2012;28:3326–8.
43. Liepe KJ, Hamann A, Smets P, Fitzpatrick CR, Aitken SN. Adaptation of lodgepole pine and interior spruce to climate: implications for reforestation in a warming world. Evol Appl. 2016;9:409–19.
44. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009;19:1655–64.
45. Pages H, Aboyoun P, Gentleman R, DebRoy S. Biostrings: string objects representing biological sequences, and matching algorithms. 2014 [cited 2016 Feb 18]. Available from: http://bioconductor.org/packages/Biostrings/.
46. Henikoff S, Henikoff JG. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A. 1992;89:10915–9.
47. Hodgins KA, Yeaman S, Nurkowski KA, Rieseberg LH, Aitken SN. Expression divergence is correlated with sequence evolution but not positive selection in conifers. Mol Biol Evol. 2016;33:1502–516.
48. Bachtrog D, Andolfatto P. Selection, recombination and demographic history in Drosophila miranda. Genetics. 2006;174:2045–59.
49. Andolfatto P. Hitchhiking effects of recurrent beneficial amino acid substitutions in the Drosophila melanogaster genome. Genome Res. 2007;17:1755–62.
50. Haddrill PR, Bachtrog D, Andolfatto P. Positive and negative selection on noncoding DNA in Drosophila simulans. Mol Biol Evol. 2008;25:1825–34.
51. Franklin EC. Genetic load in loblolly pine. Am Nat. 1972;106:262–5.
52. Doerksen TK, Bousquet J, Beaulieu J. Inbreeding depression in intra-provenance crosses driven by founder relatedness in white spruce. Tree Genet Genomes. 2013;10:203–12.
53. Wright S. Statistical genetics in relation to evolution. Actual. Sci. Ind. 802 expo. Biom. Stat. Biol. XIII. Paris: Hermann et Cie; 1939.
54. Excoffier L, Foll M, Petit RJ. Genetic consequences of range expansions. Annu Rev Ecol Evol Syst. 2009;40:481–501.
55.
Lohmueller KE, Indap AR, Schmidt S, Boyko AR, Hernandez RD, Hubisz MJ, et al. Proportionally more deleterious genetic variation in European than in African populations. Nature. 2008;451:994–7.
56. Do R, Balick D, Li H, Adzhubei I, Sunyaev S, Reich D. No evidence that selection has been less effective at removing deleterious mutations in Europeans than in Africans. Nat Genet. 2015;47:126–31.
57. Wang T, Wang G, Innes J, Nitschke C, Kang H. Climatic niche models and their consensus projections for future climates for four major forest tree species in the Asia–Pacific region. For Ecol Manag. 2016;360:357–66.

