UBC Undergraduate Research

Using a genotyping-by-sequencing (GBS) approach to elucidate population structure in Garry Oak (Quercus… Degner, Jonathan C. Apr 30, 2014

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
52966-Degner_Jonathan_FRST_498.pdf [ 1.18MB ]
Metadata
JSON: 52966-1.0075637.json
JSON-LD: 52966-1.0075637-ld.json
RDF/XML (Pretty): 52966-1.0075637-rdf.xml
RDF/JSON: 52966-1.0075637-rdf.json
Turtle: 52966-1.0075637-turtle.txt
N-Triples: 52966-1.0075637-rdf-ntriples.txt
Original Record: 52966-1.0075637-source.json
Full Text
52966-1.0075637-fulltext.txt
Citation
52966-1.0075637.ris

Full Text

1  USING A GENOTYPING-BY-SEQUENCING (GBS) APPROACH TO ELUCIDATE POPULATION STRUCTURE IN GARRY OAK (Quercus garryana)  by Jonathan C. Degner     A thesis submitted in partial fulfillment of the requirements for the degree of BACHELOR OF SCIENCE IN FOREST AND CONSERVATION SCIENCES  in  The Faculty of Forestry THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)      April 2014  © Jonathan C. Degner, 2014  2  Abstract  A pilot study was performed to assess the usability of the genotyping-by-sequencing (GBS) method to perform population genetic analyses across the species range of Garry oak (Quercus garryana). Ninety-two individuals representing 11 populations were sequenced in a single multiplexed GBS library, which resulted in 18.1 Gigabases of sequence data covering 20 Megabases of unique sequences for 87 of the 92 individuals sequenced, which is estimated to represent ~2% of the genome. 20,751 high-quality single nucleotide polymorphisms (SNPs) were obtained, which were used to determine population differentiation and genetic structure. A more-refined set of 2,194 SNPs was developed and used for Bayesian genetic clustering analysis. Clustering analyses revealed two distinct genetic clusters, one corresponding to Q. garryana var. garryana and the other to both vars. semota and breweri. Some evidence supports the presence of additional hierarchical structure within both of these clusters, although there was no evidence to support the taxonomic division between vars. semota and breweri. Overall FST averaged 0.091, and was strongly correlated with physical distance (R2=0.93, p<0.0001). Pair-wise FST was significantly higher between populations of different genetic clusters than among them. Heterozygosity was negatively correlated with latitude (R2=0.72, p=0.001), and was higher in vars. semota and breweri than in var. garryana (p<0.0001). Inbreeding coefficient was also negatively correlated with latitude (R2=0.65, p=0.003). Genetic distance was positively correlated at distances up to 400 km (Mantel test, 10,000 permutations; p<0.001) and negatively correlated beyond 1000km (p<0.001). The success of sequencing and analyses using GBS merits expansion of this pilot project into a full-scale study with higher representation of sample size both within and among populations.  Keywords: genomics; reduced representation library; RRL; Oregon white oak; semota; breweri; fruticosa; Fst; inbreeding coefficient; heterozygosity; latitudinal cline; genetic cluster; STRUCTURE; discriminant analysis of principal components; DAPC;   3  Table of Contents Abstract ...................................................................................................................................................... 2 Table of Contents ........................................................................................................................................ 3 List of Figures .............................................................................................................................................. 4 List of Tables ............................................................................................................................................... 4 Acknowledgements ..................................................................................................................................... 5 1. Introduction ............................................................................................................................................ 6 1.1 Study species ................................................................................................................................................................. 6 1.2 Common gardens .......................................................................................................................................................... 9 1.3 Genotyping-by-sequencing (GBS) ............................................................................................................................... 10 2. Materials and Methods .......................................................................................................................... 11 2.1 Sampling design .......................................................................................................................................................... 11 2.2 Variety identification ................................................................................................................................................... 12 2.3 Tissue sampling ........................................................................................................................................................... 12 2.4 DNA extraction ............................................................................................................................................................ 13 2.5 GBS library preparation ............................................................................................................................................... 13 2.6 Sequencing .................................................................................................................................................................. 14 2.7 De-multiplexing and de novo assembly ...................................................................................................................... 14 2.8 SNP filtering and population genetic parameters ....................................................................................................... 15 2.9 Population structure analyses ..................................................................................................................................... 15 2.10 Data visualization and statistical software ................................................................................................................ 16 3. Results ................................................................................................................................................... 17 3.1 Sequencing, de-multiplexing and de novo assembly .................................................................................................. 17 3.2 Filtering ....................................................................................................................................................................... 17 3.3 Population genetic parameters ................................................................................................................................... 18 3.4 STRUCTURE ................................................................................................................................................................. 18 3.5 DAPC ............................................................................................................................................................................ 19 4. Discussion .............................................................................................................................................. 21 4.1 GBS as an approach for population genetics study .................................................................................................... 21 4.2 Population structure in Quercus garryana .................................................................................................................. 22 4.3 Comparison with previous studies .............................................................................................................................. 24 4.4 Taxonomic implications .............................................................................................................................................. 25 4.5 Conservation implications ........................................................................................................................................... 26 5. Conclusions ........................................................................................................................................... 29 References ................................................................................................................................................ 31  4  List of Figures Figure 1: Species distribution of Garry oak and sampling locations........................................................................................................34 Figure 2: Presence and abundance of the three Garry oak varieties in sampled California populations..........................................................35 Figure 3: Bioanalyzer visualization of GBS library quality…..................................................................................................................35 Figure 4: Distribution of minor allele frequencies among called SNPs……...............................................................................................36 Figure 5: Plot of genetic vs. physical distance among sampled populations……........................................................................................36 Figure 6: Comparison of genetic distances between and among populations of different Garry oak varieties..................................................37 Figure 7: Plot of average allelic heterozygosity per population vs. latitude..............................................................................................37 Figure 8: Plot of average inbreeding coefficient per population vs. latitude.............................................................................................37 Figure 9: Mantel correlogram of genetic correlation and distance.........................................................................................................38 Figure 10: Log-likelihood of models of population structure for any given number of genetic clusters, from STRUCTURE...................................38 Figure 11: K likelihood of models of population structure for any given number of genetic clusters, from STRUCTURE....................................38 Figure 12: Individual genetic cluster assignments for two to four genetic clusters, from STRUCTURE............................................................39 Figure 13: Population-level cluster assignments for two and four genetic clusters, from STRUCTURE............................................................40 Figure 14: BIC likelihood of models of population structure for any given number of genetic clusters, from DAPC............................................41 Figure 15: Clustering densities of individuals in the first discriminant function of DAPC, for two genetic clusters.............................................41 Figure 16: Plot of individual clustering assignments along the first two discriminant functions of DAPC, for four genetic clusters. ......................42   List of Tables Table 1: Geoclimatic variables of sampled populations.......................................................................................................................43 Table 2: Population genetic parameters of sampled populations..........................................................................................................43     5  Acknowledgements  First and foremost I would like to thank my supervisor, Sally Aitken, for her ideas and guidance throughout this project, and for providing me access to this amazing research experience. Her knowledge has ever been an inspiration and her assistance in interpretation has been invaluable. I would also like to give my sincere thanks to Kristin Nurkowski for her critical help in designing the GBS protocol, helping me prepare the library, and coordinating the sequencing. Without her expertise in troubleshooting this often temperamental protocol, it would never have been half the success it has been.  I’d like to thank Colin Huebert for establishing the common garden from which I was able to get my genetic material. Studying trees from locations 1,500 km apart wouldn’t have been possible without his hard work over several years to establish this garden. I’d like to thank Sean King for his assistance in collecting tissue for this project. His long hours, sun burns, and wasp bites endured to get all of the material collected in a very narrow time frame was a tremendous help. I would like to thank Nicolas Feau for his patience in teaching me how to navigate a UNIX server, and to use the bioinformatic tools necessary for my analyses.  I would, of course, like to thank the AdapTree Large-Scale Applied Research Project for funding the sequencing in this project and to The Centre for Forest Conservation Genetics for funding the laboratory costs associated with this project. I have been incredibly fortunate to have access to exceptional funding for only a BSc project.  I would like to thank Pia Smets, Sam Yeaman, and Kay Hodgins for their advice and assistance in various aspects of this project.  Finally I would like to thank my wife, Paige Sonmor, for her continued love and support. She often worked long hours so I could afford to stay late in the lab working on this project, and her willingness to listen to me yammer on about trees for hours on end has always been much appreciated.   6    1. Introduction  1.1 Study species  Quercus garryana, commonly known as Garry oak or Oregon white oak, is a deciduous tree species distributed throughout submaritime regions from Southern California to Southwestern British Columbia (Fowells 1965). It possesses the largest latitudinal range of western oak species and is the only oak species native to Washington and British Columbia, as well as the primary oak species in Oregon. Though Q. garryana is present in a wide range of climates (mean annual temperature 8-18C, mean annual precipitation 170-2630mm), it tends to inhabit relatively warm and dry sites in any given locality (Stein 1990). Prior to European settlement in North America, Q. garryana-dominated savannahs, characterized by low tree density and extensive herbaceous vegetation, were common throughout much of the species range. These ecosystems, which typically succeed to Pseudotsuga menziesii-dominated closed-canopy forests in the absence of disturbance (Agee 1996), were maintained through prescribed burning by native populations (Thilenius 1965). A cessation of this practice, compounded with land use conversion in the fertile soils characteristic of lowland Q. garryana savannahs, has led to massive habitat loss and fragmentation in the last 150 years (Pellatt et al. 2007).  Three taxonomic varieties of Q. garryana are currently accepted: Quercus garryana Douglas ex Hooker, vars. garryana, semota, and breweri (Nixon 1997). Q. garryana var. garryana is an erect tree to 25m that is present across the majority of the species range, whereas the other two varieties are multi-stemmed shrubs endemic to the Sierra Nevada mountain range of California (var. semota) and the Siskiyou Mountains of Southern Oregon and Northwestern California (var. breweri) (Jepson 1909), hereafter collectively referenced as “shrub varieties”. The distinctiveness of the two shrub varieties has been debated, as they appear to be neither geographically or morphologically distinct (McMinn 1951). However, Nixon (1997) notes distinctive characteristics of bud and trichome morphology that are useful 7  in distinguishing the two varieties. As no genetic study has ever been performed with these varieties (although see Marsico et al. 2009), it is currently unknown whether these varieties constitute evolutionarily significant units (Moritz 1994) that may merit separate management strategies, or merely outdated taxonomy. Q. garryana ecosystems are highly threated in Canada, with robust ecosystems comprising only one to five percent of their pre-European distribution in British Columbia (Fuchs 2001). These ecosystems host 49 plant species at risk in British Columbia (BC Conservation Data Centre 2014), 30 of which are recognized to be at risk at the national level (COSEWIC 2011). Additionally, 5 animal species native to Q. garryana ecosystems are at risk in British Columbia (BC Conservation Data Centre 2014), 2 of which have national designation (COSEWIC 2011). As such, conservation of Q. garryana ecosystems is of critical concern in British Columbia and Washington, where their presence is most threatened (Fuchs 2001). The range of habitats available to Q. garryana is expected to dramatically change in future predicted climates, with climatic conditions favouring Q. garryana establishment and presence in 75,000-200,000 km2 in 2099, compared to ~50,000 km2 at present (Pellatt et al. 2012). Limited acorn dispersal abilities will likely hinder Q. garryana from rapidly invading these newly accessible habitats (Marsico et al. 2009). However, these favourable climatic changes may allow for novel range expansion and conservation programs for this species and its related ecosystems. However, understanding the genetic structure of the species is important in establishing appropriate seed transfer guidelines (Aitken et al. 2008).   The genetics of Q. garryana has received limited attention. Only three studies have so far analysed genetic structure within this species, and only one other study has generated unique sequence data.   The first group to study population structure in Q. garryana was Ritland et al. (2005). Using 42 populations distributed from Northern California to the northern limit of the species range, and analyzing isozyme diversity, Ritland et al. found a fundamental genetic divide between “mainland 8  populations” i.e. those found on continental North America, and “island populations” i.e. those found on Vancouver Island and neighbouring Gulf Islands, suggesting that island colonisation events are rare.  They estimated population differentiation (Nei’s GST) at 0.084, and estimated average inbreeding coefficient (F) at 0.012.   They detected negative latitudinal correlations with allelic diversity and inbreeding coefficient, although only in mainland populations. Genetic parameters were roughly constant between island populations. No mention was paid to the shrub varieties of Q. garryana, though the lack of genetic divide between Northern California populations and other mainland populations in that study suggests that either solely var. garryana was sampled, or genetic differentiation is weak between varieties.  Huebert (2009), using 15 populations from across the species range established in a common garden upon which this current research is founded, studied population differentiation in ecologically-relevant quantitative traits including bud phenology and cold hardiness. He found population differentiation (QST) among quantitative traits and their principal components (PCs) to average 0.15 (range 0.08-0.31). These estimates of QST are lower than those found in many other forest tree species studied, but similar in rank (Alberto et al. 2013, table 1), suggesting that Q. garryana is less locally-differentiated than many other forest tree species studied, potentially due to high levels of gene flow. However, many of the quantitative traits studied correlated strongly with environmental variables, suggesting at least some degree of local adaptation in this species. Additionally, Huebert noted very high within-population genetic variation, suggesting high levels of standing genetic diversity in this species. Marsico et al. (2009) studied gene flow using 22 populations distributed from Southern Oregon to the northern limit of the species range to analyse diversity among chloroplast and nuclear microsatellites.  Their results agreed with those of Ritland et al. (2005) in that haplotype diversity in chloroplast microsatellites was low in island populations, suggesting few colonisation events as chloroplast DNA is maternally inherited in oak (Dumolin et al. 1995). Their results were also in agreement with those of Ritland et al. in that they found a negative correlation between latitude and 9  allelic diversity. Genetic and physical distance were positively correlated at scales < 200 km and negatively correlated at scales >700km, suggesting that the effects of gene flow are stronger than drift at scales well-beyond what would be expected of pollen travel. Rates of pollen flow were estimated to be ~120 times that of seed flow. Chloroplast haplotypes were generally shared within populations and a few dominant haplotypes were present throughout the mainland. Interestingly, high diversity and unique chloroplast haplotypes were observed at one site in Southern Oregon that the authors speculate may have been var. breweri. The only unique DNA sequence data generated from Q. garryana was from one world-wide phylogenetic study of Fagaceae species that sequenced the CRAB CLAWS gene in numerous Quercus species (Oh and Manos 2008). This study found Q. garryana to be closely related to a number of other well-studied Quercus species including Q. robur and Q. alba. As both of these species have available expressed sequence tag (EST) libraries available and a Q. robur draft genome is in preparation (A. Kremer, pers. comm. 2013), there is high potential for successful alignment of Q. garryana sequence reads against these available resources.  1.2 Common gardens    Early studies of population genetics that focused on quantitative traits required a method of disentangling phenotypic variation caused by environmental versus genetic factors. The common garden experimental design, in which seeds from multiple populations are grown in one location with an effectively constant environment, has long been useful in studying population genetic parameters (early studies reviewed in  Guries and Ledig 1976). These designs also allow for simultaneous measurement of individuals from distant locations, making them ideal for comparisons of traits across species ranges.   10  1.3 Genotyping-by-sequencing (GBS)  The high costs of genome resequencing projects are difficult to justify for species without substantial economic importance or critical conservation concern. Additionally, the highly repetitive nature of many genomes often makes whole-genome resequencing inefficient for marker discovery studies that are more interested in genic and regulatory regions, particularly when the extent of repetitive genomic elements is unknown for the study species (Altshuler et al. 2000).   Various methods exist for alleviating these problems by sequencing portions of the genome, a strategy broadly termed reduced-representation library (RRL) genotyping. For example, RNA sequencing is useful for targeting genic regions and may provide some metric of transcription level, but is technically difficult and omits regulatory regions. Restriction enzyme-based approaches are a promising RRL method that allow for simple sequence alignment and DNA amplification but early efforts were fraught with inconsistency and low sequencing depth (Altshuler et al. 2000).  New restriction-based approaches, including genotyping-by-sequencing (GBS) (Elshire et al. 2011) and RAD-seq (Davey and Blaxter 2010), have emerged as low-cost methods for generating genome-wide SNPs that overcome some of the previous hurdles of restriction enzyme-based RRL generation.  In GBS, genomic DNA is subjected to restriction enzyme digestion and ligated with a sample-specific adapter sequence, as well as an adapter common to all sequences. This makes amplification more consistent, and allows for sequence multiplexing, wherein 96 samples can be sequenced in a single Illumina HiSeq lane, greatly reducing cost. Additionally, utilising methylation-sensitive enzymes such as PstI or ApeKI biases restriction to genic and other active genomic regions, providing greater coverage in these regions of interest.    11  2. Materials and Methods  2.1 Sampling design  All individuals were sampled from a common garden experiment established in March 2007 in Vancouver, British Columbia (49.256791, -123.249984, Fig. 1). This common garden represents a collection of 140 half-sib families from 15 populations across the species range of Quercus garryana, divided into 12 randomised replicates, totaling 1692 individuals. Procedures for population sampling, acorn collection, germination, and garden design are described by Huebert (2009). A subset of these trees was used for the current pilot GBS study to test the applicability of this approach to larger-scale genetic study of Q. garryana. The subsampled trees contain members of 11 populations from the first 2 replicates of the common garden, selected for an even spatial distribution across the species range and to capture the full range of among-population phenotypes available for the trees (height, basal diameter, and bud burst phenology; from Huebert 2009). Provenance information and a few key geoclimatic variables are provided in Table 2. Of the populations omitted from analysis, two had an insufficient number of individuals to accurately describe within-population variation, and two were in close proximity to other locations but exhibited less-extreme phenotypes. Between eight and nine half-sib families were chosen per population, based on the presence of living trees in the first two replicates, and to capture the full range of phenotypes present within populations. Ninety-nine families were used, six containing one representative, and the remainder containing two representatives, totaling 192 individuals. As GBS had never previously been used for a population genetics study in the Centre for Conservation Genetics laboratory, ninety-six individuals were selected, primarily from the second replicate, for a pilot GBS study analysed below.  12  2.2 Variety identification  Individuals were identified to the variety level based on growth form and bud morphology to distinguish between var. garryana and the other two varieties, and then by trichome morphology to distinguish between vars. semota and breweri, as described by Nixon (1997). All individuals from populations 7-11, and the majority of those from Population 6, identify as var. garryana. Individuals from population 4 were primarily identified as var. breweri, and those from population 3 primarily as var. semota. Populations 1 and 2 contained several individuals distinctly of var. semota, as well as others appearing to be var. breweri or potentially hybrids between var. semota and Q. douglasii, which have been reported based on macromorphological observations (Tucker 1980; K. Nixon, pers. comm. 2013). Population 5 appeared to be in an area of introgression between at least vars. breweri and semota, and potentially with var. garryana as well. Type specimens of vars. breweri and semota were both identified from this population, as well as a multitude of phenotypically intermediate individuals. The results of the varietal identification are visualised in Fig. 2.  2.3 Tissue sampling  Over a period of two days in May 2013, newly emergent leaf tissues were collected from the sampled trees. Using a 5 mm diameter leaf-sampling hole punch, approximately 5 discs of leaf tissue were collected from the tip and outer lobes of 5-8 leaves per tree, whenever possible. Fewer leaves were collected from trees of exceptionally small stature and/or low vigour. All sampling surfaces were sterilised with 0.3% bleach solution between samples to prevent contamination. Tissue samples were stored at -80°C until further use.    13  2.4 DNA extraction  DNA was extracted from the juvenile leaf tissues using a 96-well NucleoSpin Plant II kit (Macherey-Nagel, Düren, Germany),and quantified using a Qubit 2.0 Fluorometer (Life Technologies, Carlsbad, CA, USA). If the first extraction did not meet a minimum acceptable concentration criterion of 8.5 ng/µL for GBS (K. Nurkowski, pers. comm. 2013), a new tissue sample was extracted individually using the same protocol and concentrated using a DNA Clean & Concentrator kit (Zymo Research, Irvine, CA, USA). Five samples from Block 2 proved to be especially problematic in yielding DNA. When these samples were re-extracted using CTAB, the purified pellet was highly discoloured and exhibited high secondary metabolite content when assayed with a NanoDrop 2000c UV-Vis spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). This may be indicative of secondary metabolites binding the DNA during extraction, which may have also prevented fluorescent dye binding during the Qubit assay and thus biased concentration estimates downward. These samples were not used in further analyses. Instead, one sample was replaced by its corresponding family member in Block 1, and 4 samples were re-sequenced (1 each from populations 1, 3, 4, and 5) to hopefully provide better coverage on these individuals and to provide some sense of the within-individual replicability of GBS.  2.5 GBS library preparation  A single 96-plexed GBS library was prepared using 100 ng DNA per sample. PstI was used for enzyme digestion due to its methylation-sensitivity, resulting in preferential amplification of non-repetitive and transcribed genomic elements. After adapter ligation, size selection was performed by running the library on an agarose gel and excising the region corresponding to 350-600 basepairs (bp). Library quality was assessed using a 2100 Bioanalyzer (Agilent, Santa Clara, CA, USA) to determine fragment size distribution, Qubit 2.0 to assess library concentration, and qPCR to verify correct adapter 14  ligation prior to sequencing.  Bioanalyzer results suggest that size selection was very successful. The fragment distribution shows a peak at 550bp, and a range of approximately 450-700 bp (Fig. 3). 2.6 Sequencing  The completed library was sent to Genome Quebec Innovation Centre and sequenced with an Illumina HiSeq 2500. Unfortuantely, the flowcell was clustered too tightly for a GBS library, and the sequencing of the return-ends of the fragments failed for all samples due to the low-complexity nature of the common adapter. New bioinformatic techniques have been developed to recover this type of data loss (Krueger et al. 2011), but unfortunately the original image files were discarded by the sequencing centre before they could be obtained for re-analysis. The return reads were discarded and the sequence data was used as single-reads.   2.7 De-multiplexing and de novo assembly  Reads were assembled using STACKS ver. 1.12 (Catchen et al. 2013). Due to the multiple lengths of barcodes used in this study, a tiered de-multiplexing system was used. Individuals with 8 bp barcodes were de-multiplexed first, and the discarded reads were then run through individuals with 7 bp barcodes, and then again for 6 bp barcodes. The 100 bp reads were then trimmed to 92 bp to account for this multi-length barcode system. The four samples that were sequenced twice were then concatenated, and the fastq files were run through the STACKS de novo assembly and SNP calling pipeline with default parameters.     15  2.8 SNP filtering and population genetic parameters  The SNP set generated from STACKS was filtered for quality using vcftools (Danecek et al. 2011). Two separate SNP datasets were generated. The first set, hereon referred to as the “full SNP set”, includes all SNPs with a sequencing depth of at least 5x, present in at least 80% of individuals, and with a mean maximum depth less than 500 to remove SNPs in repetitive and paralogous regions. The second set, hereon referred to as the “strict SNP set”, was filtered to ensure that no SNPs are far outside of Hardy-Weinberg equilibrium (p>0.001), includes only sites with a minor allele frequency >0.05, require SNPs to be present in 90% of individuals, and include only 1 SNP per 92 bp stack to reduce linkage disequilibrium.  FST on a per-site and population-pair-wise basis was calculated with the full SNP set in vcftools using the Weir and Cockerham method (Weir and Cockerham 1984). FST was analysed as FST/(1-FST) after Rousset (1997), who found this transformation to be informative in detecting isolation by distance. Physical distances between populations were calculated in ArcGIS. Vcftools was also used to calculate observed and expected heterozygosity on a per-site and per-individual basis. Observed and expected heterozygosity was used to calculate the inbreeding coefficient F (Wright 1921), as used by Ritland et al. (2005). Mantel correlograms (Mantel 1967) were generated in R (R Development Core Team 2011), with the program mantel.correl from the “vegan” package using 10,000 permutations to assess probabilities.   2.9 Population structure analyses  The full SNP set described above was used for multivariate clustering in discriminant analysis of principal components (DAPC) (Jombart et al. 2010). This method was chosen for its combination of the advantages of principal component analysis e.g. fast computational time and quantitative clustering of large datasets, with those of discriminant analysis e.g. accurate assignment of individuals to clusters and minimization of within cluster:between cluster divergence. DAPC was performed using the adegenet 16  package in R (Jombart 2008) with default optimization parameters, which makes very few assumptions regarding marker neutrality and presence, allowing for the use of the full marker set. All available PCs were included to determine the optimum number of clusters and a-score, an experimental feature of the adegenet package, was used to determine the optimum number of PCs for use with any particular number of genetic clusters.  The strict SNP set described above was used for Bayesian cluster modelling, which requires strict assumptions to be met regarding allelic structure (Francois and Durand 2010). STRUCTURE (Pritchard et al. 2000) was chosen for its detailed and customizable interface along with its widespread use in population genetics.  K values ranging from 1 to 11 were tested. For each value of K, 5 replicates were performed with a 100,000-step burn-in phase and 500,000-step post-burn-in phase. Prior sampling information was provided to improve detection of weak structure (Hubisz et al. 2009). Results were analysed in STRUCTURE HARVESTER (Earl and vonHoldt 2012).  2.10 Data visualization and statistical software   All maps were generated in ArcGis 10.2 (ESRI 2013). Figures were generated using Microsoft Excel (Microsoft 2009) and DAPC (Jombart 2008). Statistical analyses were performed using SAS 9.3 (SAS Institute 2010) and R 2.14 (R Development Core Team 2011).     17  3. Results   3.1 Sequencing, de-multiplexing and de novo assembly  De-multiplexing was very successful despite poor read quality in the first 2 bases of the barcode. 197.1 million (91.94%) reads were retained. The majority of discarded reads were due to ambiguous bar codes (7.33%). Only 0.49% of reads were discarded due to low quality, suggesting that the GBS library preparation was highly successful in generating a clean library. Overall, 18.1 Gigabases of sequence data were retained. Although the genome size of Q. garryana has not been estimated, this likely corresponds to ~19-fold coverage based on genome sizes of other Quercus species of the same subgenus (Favre and Brown 1996).  The de novo assembly built 217,358 unique 92-bp contigs, totaling 20 Megabases of sequence data (perhaps ~2% of the total genome size), and called 132,226 SNPs. Each SNP was found in an average of 63.6% (SD=31.9%) of individuals, although the majority of SNPs were rare and found in <10% of individuals. Individuals contained an average of 36.3% (range: 20.3-51.8%) of called SNPs. Average depth per individual was highly variable, averaging 38.6 (range: 2.94-123.94). This highlights one of the key issues with GBS, namely the preferential amplification of some individuals over others. Ultimately, five individuals were dropped from further analysis due to high rates of missing data, leaving a total of 87 individuals on which subsequent analyses were performed.  3.2 Filtering  The full SNP set retained 20,751 SNPs. The strict set retained 2,194 SNPs. The majority of SNPs different between the two sets were those that had a minor allele frequency < 0.05. 8141 (39.2%) SNPs 18  had a minor allele present in only a single individual, leading to 7400 (35.6%) SNPs with a minor allele frequency < 0.01 (Fig. 4).  3.3 Population genetic parameters  Overall FST across the full SNP set averaged  0.091. A very strong correlation was observed between transformed FST and physical distance (R2=0.93, F=297.3, p<0.0001, Fig. 5). This model fits very well with Rousset’s model of one-dimensional isolation by distance in a narrow habitat (Rousset 1997). When physical distance was used as a covariate, transformed FST values were found to be significantly higher in shrub-tree population comparisons than between shrub-shrub or tree-tree comparisons whereas tree-tree comparisons were not significantly different from shrub-shrub comparisons (Fig. 6). This suggests some degree of divergence between these groups. Observed heterozygosity was found to be negatively correlated with latitude (R2=0.72, F=22.94, p=0.001, Fig. 7 and Table 3). Overall, heterozygosity within shrub populations was significantly higher than heretozygosity in var. garryana populations (F=113.39, p<0.0001). Inbreeding coefficient was found to be negatively correlated with latitude (R2=0.65, p=0.003, Fig. 8, Table 3). The results of the Mantel tests show positive correlation between genetic and physical distance at distances up to 400 km (p<0.001). Negative correlations were observed beyond 1000km (p<0.001, Fig. 9).   3.4 STRUCTURE  Runs were stable up to K=7 (Fig. 10), and consistent between replicates up to K=5. Increases in log-likelihood of the model were negligible beyond K=4, thus this was chosen as the threshold for analysis. Using the optimum group selection based on K (Evanno et al. 2005), there is a very clear peak 19  at K=2, which declines rapidly due to increases in likelihood variance (Fig. 11), suggestive of two major clusters.   Posterior assignments for K=2-4 are shown in Fig. 12 and mapped in Fig. 13. In all runs, there is a clear distinction between populations that were identified as belonging to the two shrub varieties, and those identified as var. garryana. However, no genetic distinction can be found between vars. breweri and semota. The population identified primarily as var. breweri (population 4) based on bud and trichome morphology does not seem to be substantially different than the admixed population (population 5), despite the former showing very little morphological evidence of any var. semota individuals (see Fig. 2). Population 6, which identified as relatively purely var. garryana with some suggestion of admixture, was found to be similar to the southern populations in all runs, suggesting high gene flow between varieties.   3.5 DAPC  Utilising all available PCs, the best fit for the data was found at K=2 (Fig. 14).Values up to K=4 were analysed for consistency with the STRUCTURE analysis. The optimum number of retained PCs for each analysis of K values of 2 to 4 was 5, 10, and 2, respectively. The relatively low a-scores for K=2 and 3 compared to K=4 and the low number of PCs required to adequately explain the variation at K=4 (results not shown) suggest that it may be a more stable population model than lower values of K, suggesting the presence of hierarchical population structure.  The results of DAPC are largely consistent with the results from STRUCTURE, but provide more information on the hierarchical organization of variation. At K=2, individuals are clustered based on identification as shrub varieties or var. garryana, with no discernible difference between vars. breweri and semota. The bimodal distribution of individuals across the discriminant function for this model (Fig. 20  15) again shows individuals from population 6 clustering intermediately between the remainder of var. garryana and the shrub varieties.  The two models do differ in some respects. At K=3, DAPC identified populations 4-6 as a distinct cluster from populations 1-3, as opposed to STRUCTURE finding the next most-important division to be between BC populations and mainland populations. Also at K=3, individuals from population 6 seem to have little clustering similarity to other populations of var. garryana. However, the high number of PCs required to explain the structure at K=3 suggests that this may not be a stable model.  The model at K=4 is highly congruent to that of STRUCTURE. The four clusters found correspond almost identically to those identified by structure (Fig. 16). Individuals from population 6, while clustering with the remainder of mainland var. garryana populations, are intermediate in discriminant function 1 between the rest of its cluster and the Northern California shrub cluster, corresponding to populations 4 and 5 and again suggesting high genetic similarity. Population  10 clusters with population 11 but is intermediate in both discriminant axes between population 11 and the mainland populations. This is reflected in all STRUCTURE runs, which show approximately equal contributions from mainland and Vancouver Island clusters in that population.       21  4. Discussion  4.1 GBS as an approach for population genetics study   Overall, GBS was highly successful in generating robust SNP marker sets for use in population genetic analyses. Marker quality and inclusion was very high, and there appears to be a good proportion of neutral SNPs. 20 Megabases of unique sequence data was generated, perhaps 2% of the total genome. This is a large step towards generating useful genomic tools for understanding Q. garryana, as this species had no substantial genomic resources prior to this study. Also, due to the methylation-sensitivity of PstI, this 2% is likely over-representative of coding and regulatory regions, thus making this level of sequencing even more useful for future study.  Due to the single-end reads generated here, contigs longer than 92-bp could not be assembled. This has several implications for downstream analysis including not being able to measure linkage disequilibrium between markers, an important measure for detection of marker neutrality. The short reads also greatly reduce the power of marker annotation via BLAST searches, so the proportion of genic SNPs could not be determined. Expansion of the current study to include paired-end data should be able to provide better estimates of the true neutrality and location of markers obtained here. Furthermore, mapping reads to current Q. robut or Q. alba resources or the Q. robur draft genome in preparation will provide estimates of marker distribution. The issues in sequencing reported here highlight the importance of having GBS libraries sequenced by facilities familiar with the procedure, as the highly repetitive nature of the first few bases due to enzyme cleavage and the common adapter used in amplification require special attention during sequencing. Also, obtaining the raw Illumina HiSeq image files for re-processing may be wise until read integrity can be established.  The majority of difficulties incurred in making a GBS library for Q. garryana were in DNA extraction, which was largely due to tissue quality, and size selection. Quantification may have been 22  influenced by high levels of secondary metabolites observed in the elute following DNA extraction. Preferential amplification of some individuals was somewhat problematic, but not a major issue overall. Ultimately, researchers should err on the large side when selecting sample sizes for GBS due to the potential for individual variability and subsequent exclusion of some samples. With these caveats in mind, GBS appears to be a good approach for marker discovery in Q. garryana and potentially other species with comparable genome sizes, at a relatively low cost compared to non-anonymous genomic markers such as those obtained from sequence capture. PstI seems to be an effective enzyme for this species, despite reports of groups having difficulties with its use in Q. lobata, a closely-related species (V. Sork, pers. comm., 2013). Several highly repetitive elements were sequenced, but overall sequencing favoured single-copy regions (results not shown), perhaps due to the methylation sensitivity of PstI. 4.2 Population structure in Quercus garryana    The results of DAPC and STRUCTURE analyses both show maximum support for a two-cluster model of population structure, in which var. garryana clusters separately from vars. semota and breweri. The high degree of admixture and intermediacy observed in the southernmost Q. garryana population and northernmost shrub populations suggests only limited reproductive isolation between varieties. This may be due to limited elevational separation observed between tree and shrub populations in this species (Tucker 2012). Due to the high morphological distinction between shrub and tree varieties, this clustering is fairly logical and the genetic division is clearly fairly deep as suggested by the high K value observed at K=2. As such, the population structure is likely a hierarchical one, with a strong division between tree and shrub varieties, and some apparent substructure within those clusters. For example, at  K=3 in the STRUCTURE analysis, the distinction between mainland Pacific Northwest and Vancouver Island populations noted in Ritland et al. 2005 and Marsico et al. 2009 can be seen (Fig. 12).   23   The notion of hierarchical population structure in Q. garryana is supported by the increased log-probability and high stability of STRUCTURE models at K=3-5, and the low number of PCs required to explain the structure at K=4 in DAPC. Additionally, the striking similarities between STRUCTURE and DAPC at K=4 despite large differences in methodology and markers used suggests a more complex population structure than is suggested by the “optimal K” criteria for both programs. The K=4 model is supported by high stability in DAPC and is the highest value of K to show substantial K (see inset in Fig. 11). Additionally, this model is congruent with previous studies of population structure in Q. garryana and is biologically sound.  At K=4, the significant clusters may be identified as “Vancouver Island”, “mainland Oregon and Washington”, “Northern California”, and “Southern California”. The divergence between Vancouver Island and Washington State mainland populations is well-supported by the previous genetic studies performed with Q. garryana (Ritland et al. 2005, Marsico et al. 2009). The Northern California cluster is particularly interesting as it represents the area considered to be introgressive between the shrub varieties, and shows differentiation between the southernmost var. garryana trees and the more northern mainland populations that show little genetic difference (see Fig. 12). The Southern California cluster is also of interest because neither STRUCTURE nor DAPC showed divergence between or support for the separate identities of vars. semota and breweri.   The extremely strong correlation between genetic and physical distance between populations is strongly suggestive of a one-dimensional isolation by distance model for population differentiation (Rousset 1997). This makes sense as Q. garryana occupies a long latitudinal gradient with little longitudinal variation. The long distance at which genetic and physical distance are positively correlated (up to 400 km, Fig. 9) suggests that despite apparent low gene flow via seed dispersal (Marsico et al. 2009), genetic correlation remains high well beyond the range of most pollination events. Substantial levels of gene flow could still be facilitated by rare long-distance pollen or seed dispersal events, both of 24  which have been previously documented to affect gene flow in oak species (Hampe et al. 2013 and Gómez 2003, respecitively), and which may not have been apparent at the low resolution offered by the 12 microsatellites used by Marsico et. al. Additionally, genetic correlation could be attributable not to high gene flow, but rather to shared ancestry from post-glacial range expansion (Stone et al. 2011). This high genetic correlation across long distances could result in the obscuring of hierarchical population structure as was observed in the clustering analyses. 4.3 Comparison with previous studies  Overall, mean FST among the full SNP set was estimated at 0.091. This is quite similar to Ritland et al.’s (2005) estimate of GST at 0.084 based on 23 isozyme loci and substantially higher than Marsico et al.’s (2009) estimate of Nuclear FST at 0.049 based on 7 microsatellite loci. However, when only individuals from the range sampled by Marsico et al. are included, the average FST drops to 0.056, which is well within the range of reasonable similarity. Also, using highly polymorphic markers such as microsatellites is likely to bias estimates of FST downward (Hedrick 2005). The current study’s estimates of population differentiation fall quite short of the QST estimates of Huebert (2008), with a maximum of 0.31despite using the same trees. This discrepancy suggests that local adaptation has strong influences on population structure for some key ecologically-relevant traits. However, it is of value to note that QST for many of the traits quantified by Huebert fall within the range of those found in this study, and those traits are presumably not under divergent selection. Estimates of inbreeding coefficient (F) followed the same trend as observed by Ritland et al., but values were often different at individual sites. Ritland et al. found their highest value for F (0.21) in the area that corresponds to Population 7 in this study, where this study found its second-lowest estimate (-0.03). This could reflect biases of low marker number in the Ritland et al. study, effects of null alleles at some of those loci, or potentially a fine-scale, patchy mating system in the area. 25  The results of the Mantel correlation are largely comparable to those of Marsico et al. in trend, but differ in scale. Marsico et al. found positive correlation only to 200 km, whereas the current study found positive correlation up to 400km. This could possibly be due to the small number of microsatellite markers used by Marsico et al. not representing genome-scale gene flow in the same way that many SNPs might, or the limited number of populations compared in the current study.  The current study is in agreement with the findings of Ritland et al. and Marsico et al. in that genetic diversity decreases with latitude (Fig. 7).  As the range of Q. garryana extends far north of any other related oak species, it would follow that the species has been more strongly influenced by drift and strong selection as it entered progressively cooler and more competitive habitats, following the leading-edge hypothesis of range expansion (Soltis et al. 1997). Despite the presence of Quercus pollen near the edge of the ice sheet at the last glacial maximum (Marsico et al. 2009), it is likely that temperatures in the northern part of the range limited Q. garryana to a few isolated refugia which have been steadily revitalized by the genetic variation from more southern, unglaciated portions of the var. garryana range.   4.4 Taxonomic implications   All analyses support division between var. garryana and the shrub varieties. However, no support could be found for a division between vars. semota and breweri. This holds despite shrub populations showing high inbreeding coefficients (Fig. 8)suggestive of reduced gene flow between these populations .  Although trichome and bud morphology appear to be relatively successful at delineating between varieties (see Fig. 2), the complete lack of correlation with genetic data could be due to any one of several factors, including: a) Population 4, dubbed relatively pure var. breweri, is not a good example of the variety and sampling more populations from deeper within the range of var. breweri may show higher differentiation; b) The amount of genomic sequence obtained (perhaps 2% of the total 26  genome) may not adequately capture enough genomic variability to adequately separate these varieties or c) the separation is caused by only a few key loci not sampled near in this study, or d) the morphological traits used to distinguish vars. semota from breweri are polymorphic within a single variety, and this variation does not reflect a deeper taxonomic divide but is rather an example of incomplete lineage sorting. .   However, even if there are indeed genetic differences, the complete lack of clustering suggests there are, if any, very weak reproductive barriers between or geographic separation of these varieties. The high presence of var. breweri-like samples in the southernmost populations further complicates this matter and suggests that trichome and/or bud morphology may simply be site-specific local adaptations that have occurred in multiple localities of a single shrub variety. Anecdotal reports of differences in floral and acorn morphology (Nixon 1997) that could not be tested here due to immaturity of trees may lend future insights into this issue if the trees in the common garden are given time to mature.   Based on the current lack of evidence for two genetically-distinct shrub varieties, it is suggested that they be treated as one evolutionarily significant unit until more in-depth genetic study can be done to resolve the matter. Despite the current study’s evidence of substantial introgression between the two varieties of Q. garryana, reproductive barriers are notoriously weak even at the species level among oaks (Rushton 1993), so this should not necessarily be used as an argument to determine speciation status in this taxon. The fairly sharp genetic divide between var. garryana and the shrub populations, even in close proximity, suggests that these taxa are evolutionarily divergent.  4.5 Conservation implications   The results of the population structure analyses suggest that Q. garryana can be divided into two evolutionarily-significant units, one corresponding to what is currently referred to as Q. garryana var. garryana, and one that is composed of both Q. garryana var. semota and Q. garryana var. breweri. 27  As var. breweri (Watson 1880) was described before var. semota (Jepson 1909), it is suggested that the unified variety be described as Quercus garryana Dougl. Ex Hook var. breweri Engelm., following the notes of Watson. Although neither var. semota or breweri are currently threatened in their range, managing them separately, should the need arise, may not be necessary. However, the small sample size of putatively pure populations in the current study may not encompass the full variability of these varieties. Further sampling in these regions would be useful in verifying whether there is substantive evolutionary divergence between these taxa.  The presence of reduced genetic diversity in the northern populations is suggestive of either increasingly strong selection as trees established northward (Smith and Haigh 1974), or the effects of drift in a northward leading-edge model (Soltis et al. 1997). Given the strong trend of decreasing heterozygosity across >20,000 loci, most of which are likely neutral, drift is the more likely explanation for this species. Either way, this decreased heterozygosity may limit the adaptive potential of these populations (Gilpin and Soulé 1986). With a changing climate expected to favour Q. garryana at the northern extent of its range, having viable and adaptable seed sources at the current edge could be critical in allowing this species to migrate or adapt in the future (Aitken et al. 2008).  The apparent high gene flow between populations of Q. garryana could be valuable in adapting northern populations to a changing climate (Daivs and Shaw 2001), but massive habitat fragmentation in the northern range of this species may be generating a novel barrier to gene flow that could hinder the transfer of warm-adapted genes from the core of the species range (Watkinson and Sutherland 1995). Assisted gene flow could be used to enhance adaptation of northern populations to new climates (Aitken and Whitlock 2013).  With the relatively low population differentiation in Q. garryana and the low differentiation between populations in many ecologically-relevant quantitative traits (Huebert 2009), seed transfer may be an effective means of assisted gene flow, supplying adaptive potential to the northernmost populations (Aitken and Whitlock 2013). The genetic correlation between populations up to 400 km 28  suggests that individuals from well within the core of the species range may be available for genetic infusion into northern populations without substantial outbreeding depression (Lopez et al. 2009). Although the one-dimensional nature of the species range could potentially result in mismatches of phenology even over relatively short scales, Huebert (2009) found relatively low differentiation in phenology-related traits.  The high levels of heterozygosity observed in southern populations are a good sign for the shrub varieties of Q. garryana. In general, southerly and high-elevation populations are expected to be the most strongly affected by climate change (Davis and Shaw 2001), and the habitat of the shrub varieties meets both of these criteria. Additionally, gene flow from the geographically homogeneous centre of the species range could cause further introgression of alleles maladapted to the highly heterogeneous landscape of the southern populations (Kirkpatrick and Barton 1997). Despite the relatively high level of inbreeding observed in these populations that could be indicative of the mountainous landscape limiting gene flow, the shrub populations showed the weakest cline in genetic divergence with distance, although not significantly weaker than between populations of var. garryana.  Hopefully the high levels of variation in these populations are enough to buffer the effects of climate change.        29  5. Conclusions   This study has shown significant divergence between two major lineages in the evolutionary history of Quercus garryana, one that is isolated to the montane regions of California and Southern Oregon, and one that has spread through the lowland valleys of Oregon and Washington into the Gulf Islands and Vancouver Island in British Columbia. The presence of hierarchical population structure within varieties seems possible given the congruence of multiple analyses and the results of previous studies, but the limited sample size of the current study and poor ability of current tools to handle hierarchical population structures may be hindering characterization of this structure. Additionally, high levels of gene flow via pollen may be preventing populations from differentiating. Gene flow appears quite high in this species and admixture is common between populations even when separated by apparent geographic barriers, as has been suggested by previous studies. Several trends appear to vary latitudinally in Q. garryana, including a northward decline in genetic diversity and inbreeding. This is consistent with the strong correlation between FST and physical distance, suggesting isolation-by-distance and a leading edge model of migration. Natural gene flow appears to be high enough that assisted gene flow could be a potentially useful conservation tool for adapting northern populations to climate change. Overall, GBS appears promising as a technique for marker discovery in Quercus garryana and potentially other related species. A sufficient number of SNP markers with high sequencing depth were obtained to allow for resolution of several range-wise trends, even given a small sample size (87 individuals). The results showing distinct trends and strong clines with only a limited sample of individuals merits expansion of these methods to a more statistically-powerful scale. Increasing sample size within and among populations will hopefully allow for finer-scale resolution of many of the trends noted in this study. Increasing the number of sampled individuals could also yield yet more SNPs and 30  enough statistical power to perform genome-wide association study on a number of ecologically-relevant traits in this species that have been phenotyped in a common garden by Huebert (2009). The addition of some closely-related outgroup species from elsewhere in the California white-oak complex could yield valuable comparisons for the evolutionary position of this species as well as evaluating potential hybridization with other oaks. Paired-end sequence data could allow for more meaningful sequence annotations than were possible with the short reads generated in this study. Although there are some worrying signs regarding the future of Q. garryana populations including low genetic variability and extensive loss of habitat in the north and high susceptibility to climate change in the south, the high degree of gene flow in this species may be sufficient to allow Q. garryana to adapt and migrate into new habitats expected to become ecologically available. Due to high levels of fragmentation in much of the current species distribution, assisted gene flow or the establishment of pollination corridors may be useful in maintaining historic levels of gene flow.31  References  Agee, J. 1996. Fire in restoration of Oregon white oak woodlands. In The Use of Fire in Forest Restoration. Edited by C. Hardy and S. Amo. USDA Forest Service Technical Report INT GTR-341. Intermountain Forest and Range Experiment Station. Ogden, UT.  Aitken, S., Yeaman, S., Holliday, J. et al. 2008. Adaptation, migration or extirpation: Climate change outcomes for tree populations. Evolutionary Applications 1(1): 95-111. Aitken, S. and Whitlock, M. 2013. Assisted gene flow to facilitate local adaptation to climate change. Annual Review of Ecology 44: 367-388. Altshuler, D., Pollara, V., Cowles, C. et al. 2000. An SNP map of the human genome generated by reduced representation shotgun sequencing. Nature 407: 513-516. BC Conservation Data Centre. 2014. BC Species and Ecosystems Explorer [online]. BC Ministry of Environment. Victoria, B.C. Available from http://a100.gov.bc.ca/pub/eswp/ [accessed 24 March  2014]. Catchen, J., Hohenlohe, P., Bassham, S. et al. 2013. Stacks: An analysis tool set for population genomics. Molecular Ecology 22(11): 3124-3140. COSEWIC. 2011. Canadian Wildlife Species at Risk. Committee on the Status of Endangered Wildlife in Canada. Ottowa, ON. Danecek, P., Auton, A., Abecasis, G. et al. 2011. The variant call format and VCFtools. Bioinformatics 27(15): 2156-2158. Davey, J. and Blaxter, M. 2010. RADSeq: next-generation population genetics. Briefings in Functional Genomics 9: 416-423. Davis, M. and Shaw, R. 2001. Range shifts and adaptive responses to Quaternary climate change. Science 86: 1704-1714. Dumolin, S., Demesure, B. and Petit, R. 1995. Inheritance of chloroplast and mitochondrial genomes in pedunculate oak investigated with an efficient PCR method. Theoretical and Applied Genetics 91: 1253-1256. Earl, D. and vonHoldt, B. 2012. STRUCTURE HARVESTER: A website and program for visualizing STRUCTURE output and implementing the Evanno method. Conservation Genetics Resources 4(2): 359-361. Elshire, R., Glaubitz, J., Sun, Q. et al. 2011. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS 6(5): e19379. ESRI. 2013. ArcGIS Desktop: Release 10.2. Environmentla Systems Research Institude: Redlands, CA. Evanno, G., Regnaut, S., and Goudet, J. 2005. Detecting the number of clusters of individuals using the software STRUCTURE: A simulation study. Molecular Ecology 14(8): 2611-2620. Favre, J. and Brown, S. 1996. A flow cytometric evaluation of the nuclear DNA content and GC percent in genomes of European oak species. Annales des Sciences Forestieres 53: 915-917. Fowells, H. 1965. Silvis of Forest Trees In The United States. USDA Handbook 271. USDA Forest Serivce, Washington, DC. Francois, O. and Durand, E. 2010. Spatially explicit Bayesian clustering models in populations genetics. Molecular Ecology Resources 10: 773-784. Fuchs, M. 2001. Towards a recovery strategy for Garry oak and associated ecosystems in Canada: Ecological assessment and literature review. Available from Canadian Wildlife Service Pacific and Yukon Region, Delta, B.C. Publ. GBEI/EC-00-030. Gilpin, M. and Soulé, M. 1986. Minimum viable populations: Processes of extinction. In Conservation Biology: The Science of Scarcity and Diversity. Edited by M. Soulé. Sinauer, Sunderland. Gómez, J. 2003. Spatial patterns in long-distance dispersal of Quercus ilex acorns by jays in a heterogeneous landscape. Ecography 26(5): 573-584. Guries, R. and Ledig, F. 1976. Genetic structure and differentiation in forest trees. In Proceedings of the Symposium on Isozymes of North American Forest Trees and Forest Insects. 27 July, 1979. Berkeley, CA. pp. 42-47. Hampe, A., Pemonge, M-H., and Petit, R. 2013. Efficient mitigation of founder effects during the establishment of a leading-edge oak population. Philosophical Transactions of the Royal Society of Biological Sciences 280. doi: 10.1098/rspb.2013.1070.  Hedrick, P. 2005. A standardized genetic differentiation measure. Evolution 59(8): 1633-1638. 32  Hubisz, M., Faluch, D., Stephens, M. et al. 2009. Inferring weak population structure with the assistance of sample group information. Molecular Ecology Resources 9(5): 1322-1332.  Huebert, C. 2009. The ecological and conservation genetics of Garry oak (Quercus garryana Dougl. Ex Hook). M.Sc. thesis, Department of Forestry, University of British Columbia.  Jepson, W. 1909. Oregon oak. In The trees of California. Cunningham, Curtis, and Welch. San Francisco, California, US. pp. 158-161. Jombart, T. 2008. Adegenet: A R package for the multivariate analysis of genetic markers. Bioinformatics 24: 1403-1405.BMC Genetics 11: 94. Jombart, T., Devillard, S., and Balloux, S. 2010. Discriminant analysis of principal components: A new method for the analysis of genetically structured populations.  Kirkpatrick, M. and Barton, N. 1997. Evolution of a species’ range. The American Naturalist 150(1): 1-23. Krueger, F., Andrews, S., and Osborne, C. 2011. Large scale loss of data in low-diversity Illumina sequencing libraries can be recovered by deferred cluster calling. PLoS ONE 6(1): e16607. Lopez, S., Rousset, F., Shaw, F. et al. 2009. Joint effects of inbreeding and local adaptation on the evolution of genetic load after fragmentation. Conservation Biology 23: 1618-1627. Mantel, N. 1967. The detection of disease clustering and a generalized regression approach. Cancer Research 27: 209-220. Marsico, T., Hellmann, J., and Romero-Severson, J. 2009. Patterns of seed dispersal and pollen flow in Quercus garryana (Fagaceae) following post-glacial climatic changes. Journal of Biogeography 36(5): 929-941. McMinn, H. 1951. An Illustrated Manual of California Shrubs. University of California Press, Berkeley, CA. Microsoft. 2009. Excel 2010. Microsoft Corporation: Redmond, WA. Moritz, C. 1994. Defining ‘evolutionarily significant units’ for conservation. Trends in Ecology and Evolution 9(10): 373-375. Nixon, K.C. 1997. Quercus garryana. In Flora of North America North of Mexico, Vol. 3 [online]. Edited by Flora of North America Editorial Committee. New York and oxford.  Available from http://www.efloras.org/florataxon.aspx?flora_id=1&taxon_id=233501033 [accessed 6 August 2013]. Pellatt, M., Gedalof, Z., McCoy, M. et al. 2007. Fire history and ecology of Garry oak and associated ecosystems in British Columbia. Final Report for the Interdepartmental Recovery Fund Project 733. WNSC Pub., Vancouver, BC. Pellatt, M., Goring, S., Bodtker, K. et al. 2012. Using a down-scaled bioclimate envelope model to determine long-term temporal connectivity of Garry oak (Quercus garryana) habitat in Western North America: Implications for protected area planning. Environmental Management 49: 802-815. Pritchard, J., Stephens, M., and Donnelly, P. 2000. Inference of population structure using mulitlocus genotype data. Genetics 155(2): 945-959. R Development Core Team. 2011. R: A language and environment for statistical computing. R Foundation for Statistical Computing: Vienna, Austria. Ritland, K., Meagher., L., Edwards, D., et al. 2005. Isozyme variation and the conservation genetics of Garry oak. Canadian Journal of Botany 83: 1478-1487. Rousset, F. 1997. Genetic differentiation and estimation of gene flow from F-statistics under isolation by distance. Genetics 145: 1219-1228. Rushton, B. 1993. Natural hybridization within the genus Quercus L. Annals of Forest Science 50: 73-90. SAS Institute. 2010. SAS 9.3. SAS Institute: Cary, NC. Smith, J. and Haigh, J. 1974. The hitch-hiking effect of a favourable gene. Genetical Research 23(1): 23-35. Soltis, D., Gitzendanner, M., Strenge, D. et al. 1997. Chloroplast DNA intraspecific phylogeography of plants from the Pacific Northwest of North America. Plant Systematics and Evolution 206: 353-373. 33  Stein, W. 1990. Quercus garryana Dougl. Ex Hook. – Oregon White Oak. In Silvics of North America: 2. Hardwoods. Edited by R. Burns and B. Honkala. USDA Handbook 654. USDA Forest Serivce, Washington, DC. Stone, G., Nee, S., and Felsenstein, J. 2011. Controlling for non-independence in comparative analysis of patterns across populations within species. Philosophical Transactions of the Royal Society of Biological Sciences 366. doi: 10.1098/rstb.2010.0311. Thilenius, J. 1968. The Quercus garryana forests of the Willamette Valley, Oregon. Ecology 49: 1124-1133. Tucker, J. 1980. Taxonomy of California oaks. In Proceedings of the symposium on the ecology, management and utilization of California oaks, Claremont, California, 26-28 June 1979. Edited by Plumb, T. USDA Forest Service Pacific Southwest Forest and Range Experiment Station, Berkeley, Cal. pp. 19-29. Tucker, J. 2012. Quercus. In Jepson eFlora [online]. Edited by Jepson Flora Project. Available from http://ucjeps.berkeley.edu/cgi-bin/get_IJM.pl?tid=64728. Accessed 28 March 2014. Watkinson, A. and Sutherland, W. 1995. Sources, sinks, and pseudo-sinks. Journal of Animal Ecology 64: 126-130. Watson, S. 1880. Botany of California Vol. 2. Cambridge University Press, Cambridge, MA.  Weir, B. and Cockerham, C. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38(6): 1358-1370. Wright, S. 1921. Systems of mating. II. The effects of inbreeding on the genetic composition of a population. Genetics 6(2): 124-143.         34   Figure 1: Species distribution of Quercus garryana and sampling locations. Distribution modified from Fuchs (2001) to include proposed variety distributions based on observations from Jepson (1909) and McMinn (1951). 35  Figure 2: Varietal identity of Quercus garryana populations as assessed by trichome and bud morphology of individuals planted in a common garden in Vancouver, British Columbia. All populations north of the area shown were identified as purely var. garryana. N=17-37 individuals assessed per population.              Figure 3: Bioanalyzer analysis of a 96-plexed genotyping-by-sequencing (GBS) library of Quercus garryana showing DNA fragment size distribution. The tight peak around 500 bp indicates highly successful amplification and size selection.  36  Figure 4: Distribution of identified minor allele frequencies of 20,751 single nucleotide polymorphism (SNP) markers discovered in individuals from across the species range of Quercus garryana.           Figure 5: Multiple linear regression of physical and genetic distance among populations of Quercus garryana. FST values were calculated from 20,751 single nucleotide polymorphism (SNP) markers. “Tree” type refers to Q. garryana var. garryana and “shrub” type refers to vars. breweri and semota. 00.050.10.150.20.250.30.350.4SNP Frequency Minor allele frequency 00.050.10.150.20.250.30 500 1000 1500Pairwise genetic distance (F ST (1- F ST)-1) Physical distance (km) shrub-shrubtree-treeshrub-treeshrub-shrubtree-treeshrub-treeR2=0.93 p<0.0001 37  Figure 6: Comparison of mean pairwise FST values of populations of Quercus garryana adjusted for physical distance, with standard error given in bars. FST values were calculated from 20,751 single nucleotide polymorphism (SNP) markers. “Tree” type refers to Q. garryana var. garryana and “shrub” type refers to vars. breweri and semota. Letters represent statistical differences at the α=0.05 level.     Figure 7: Correlation between provenance latitude and mean observed hererozygosity in 20,751 single nucleotide polymorphism (SNP) markers discovered in individuals from across the species range of Quercus garryana.     Figure 8: Correlation between provenance latitude and population inbreeding coefficient (F) in 20,751 single nucleotide polymorphism (SNP) markers discovered in individuals from across the species range of Quercus garryana.     B A B 00.020.040.060.080.10.12shrub-shrub shrub-tree tree-treeAdjusted mean pairwise F ST  Comparison type R² = 0.7183 0.060.0650.070.0750.080.0850.0935 40 45 50Observed heterozygosity Latitude (degrees) vars. semota and brewerivar. garryanaR² = 0.6449 -0.1-0.0500.050.10.1535 40 45 50Inbreeding coefficient (F) Latitude (degrees) vars. semota and brewerivar. garryana38  05001000150020002500300035004000450050002 3 4 5 6 7 8 9 10K Number of genetic clusters (K) 0501001502 3 4 5 6K K Figure 9: Mantel correlogram of pairwise genetic distance (FST/(1-FST)) and physical distance. Filled squares indicate significant correlation with 10,000 permutations at a level of α=0.05 and open squares represent non-significant correlations. Square size is proportional to the number of comparisons in that distance class with the largest squares representing 12 comparisons and the smallest representing 3.   Figure 10: ln probability of genetic clusters determined among 11 populations of Quercus garryana in STRUCTURE. STRUCTURE runs were performed using 2,194 single nucleotide polymorphisms (SNPs) with a 100,000-step burn-in and 500,000 post-burn-in steps. Five replicates were performed per cluster.     Figure 11: K for genetic clusters determined among 11 populations of Quercus garryana in STRUCTURE, as determined using the method of Evanno et al. (2005). STRUCTURE runs were performed using 2,194 single nucleotide polymorphisms (SNPs) with a 100,000-step burn-in and 500,000 post-burn-in steps. Five replicates were performed per cluster. -230000-220000-210000-200000-190000-180000-170000-160000-1500001 2 3 4 5 6 7 8 9 10 11ln probability Number of genetic clusters (K) -0.6-0.4-0.200.20.40.60 200 400 600 800 1000 1200 1400rM Geogrpahic distance (km) 39  Figure 12: Individual posterior assignment probabilities among 11 populations of Quercus garryana in STRUCTURE for 2-4 genetic clusters. STRUCTURE runs were performed using 2,194 single nucleotide polymorphisms (SNPs) with a 100,000-step burn-in and 500,000 post-burn-in steps. Five replicates were performed per cluster. Dark bars represent divisions between populations. Clusters corresponding to var. garryana are shown in shades of green, and clusters corresponding to vars. breweri and semota are shown in blue.40   Figure 13: Population mean posterior assignment probabilities among 11 populations of Quercus garryana in STRUCTURE for 2 (a) and 4 (b) genetic clusters. STRUCTURE runs were performed using 2,194 single nucleotide polymorphisms (SNPs) with a 100,000-step burn-in and 500,000 post-burn-in steps. Five replicates were performed per cluster. Clusters corresponding to var. garryana are shown in shades of green, and clusters corresponding to vars. breweri and semota are shown in blue. 41  Figure 14: Genetic clustering likelihood among 20,751 single nucleotide polymorphism (SNP) markers in Quercus garryana as determined by discriminant analysis of principal components (DAPC). Lower BIC values represent a more likely model.           Figure 15: Clustering assignment of 87 Quercus garryana individuals as determined by discriminant analysis of principal components (DAPC) using 20,751 single nucleotide polymorphism (SNP) markers and the first 5 principal components. Individuals assigned to the blue cluster represent vars. semota and breweri and individuals assigned to the green cluster represent var. garryana. The small green cluster in the middle corresponds to a population in Northern California that was identified as var. garryana but shows some admixture with vars. semota and/or breweri in other analyses. 5505555605655705755801 2 3 4 5 6 7 8 9 10 11Bayesian Information Criterion (BIC) Number of genetic clusters (K) 42    Figure 16: Clustering assignment of 87 Quercus garryana individuals as determined by discriminant analysis of principal components (DAPC) using 20,751 single nucleotide polymorphism (SNP) markers and the first 2 principal components. Clusters corresponding to var. garryana are shown in shades of green, and clusters corresponding to vars. breweri and semota are shown in blue. “Tree” type refers to Q. garryana var. garryana and “shrub” type refers to vars. breweri and semota. Cal.=California, US; OR=Oregon, US; WA=Washington, US; BC=British Columbia, Canada. The red circle highlights a population in Northern California that was identified as var. garryana but shows some admixture with vars. semota and/or breweri in other analyses. The orange circle highlights a population in Southern Vancouver Island, BC that clusters with the other Vancouver Island population, but showed high levels of admixture with OR/WA populations in other analyses.   -4-3-2-10123456-8 -6 -4 -2 0 2 4 6 8Discriminant function 2 Discriminant function 1 S. Cal. shrubN. Cal. shrubOR/WA/N. Cal. treeBC43  Table 1:  Geographic and climatic variables of sampled provenances of Quercus garryana Population number from Huebert (2009) Population number for current study Latitude Longitude Elevation (m) Mean annual temperature (°C) Mean annual precipitation (mm) Summer heat-moisture index 1 1 35.87 -118.64 1224 12.3 551 520.9 2 2 36.80 -119.09 1093 14.8 743 489.9 3 3 39.10 -120.85 844 13.7 1302 262.6 4 4 40.36 -122.94 788 14.8 916 372.8 5 5 40.85 -122.03 475 15.6 1439 178.2 6 6 41.85 -122.84 535 10 618 273.4 8 7 45.01 -123.17 191 10.8 1258 115.2 9 8 45.28 -121.35 650 9.2 472 263.2 11 9 46.83 -123.01 63 10.4 1311 82.5 14 10 48.46 -123.4 25 10 871 120.1 15 11 49.73 -125.02 35 9.3 1398 72.3   Table 2: Population-level genetic parameters of sampled provenances of Quercus garryana based on 20,751 single nucleotide polymorphism (SNP) markers. Proportion cluster identity as assessed by STRUCTURE using 2,194 single nucleotide polymorphisms (SNPs) with a 100,000-step burn-in and 500,000 post-burn-in steps. Five replicates were performed per cluster. Population Observed heterozygosity (HO) Expected heterozygosity (HE) Inbreeding coefficient(F) Primary genetic cluster (K) Proportion of identity with primary genetic cluster 1 0.084 0.089 0.058 1 1 2 0.081 0.093 0.124 1 1 3 0.087 0.094 0.071 1 0.976 4 0.086 0.092 0.061 1 0.707 5 0.086 0.093 0.078 1 0.784 6 0.073 0.078 0.065 2 0.653 7 0.073 0.071 -0.030 2 0.963 8 0.073 0.076 0.048 2 0.935 9 0.071 0.072 0.017 2 0.985 10 0.069 0.071 0.020 2 1 11 0.069 0.065 -0.063 2 1  

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.52966.1-0075637/manifest

Comment

Related Items