Heritable somatic mutations accumulate slowly in Sitka spruce but increase the per-generation mutation… Hanlon, Vincent Charles Terrence 2018

HERITABLE SOMATIC MUTATIONS ACCUMULATE SLOWLY IN SITKA SPRUCE BUT INCREASE THE PER-GENERATION MUTATION RATE CONSIDERABLY

by

Vincent Charles Terrence Hanlon

B.Sc. (Hons.), Queen’s University at Kingston, 2014

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Forestry)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

April 2018

© Vincent Charles Terrence Hanlon, 2018

Abstract

The rate and biological significance of heritable somatic mutations accumulating with vegetative growth or age in trees have long been a subject of debate. Somatic mutation rates are unknown for conifers, some of which can reach exceptional sizes and ages. I investigated the somatic mutation rate in the conifer Sitka spruce (Picea sitchensis (Bong.) Carr.) by sequencing 113.1 Mb of nuclear DNA from the top and bottom of 20 exceptional old growth trees averaging 76 m in height. I estimate a somatic base substitution rate of 2.7 × 10⁻⁸ per base pair within a generation. Since this is comparable to germline mutation rates in the fastest-mutating eukaryotes, somatic mutations increase the per-generation heritable mutation rate in conifers considerably; but because these Sitka spruce are old, the increase is very small on an annual basis. I argue that although somatic mutations raise genetic loads in conifers, they generate important genetic variation, and they may also enable selection among cell lineages within individual trees.

Lay Summary

Because trees live so long they should evolve slowly; and yet, paradoxically, tree populations are well adapted to local climates and evolve effective responses to changing stresses such as herbivorous insects.
A plausible solution is that trees produce more mutations than other species because they do not have a segregated germline, that is, a tree’s reproductive cells come from the same cells that build the tree’s trunk, branches, and leaves. Heritable mutations that arise in non-reproductive cells (“somatic mutations”) could alter how trees evolve by generating more variation for natural selection every generation or even by enabling selection among the branches or cells of an individual tree. I find that somatic mutations are common enough in trees that they could provide an answer to the paradox, especially over longer periods of time.

Preface

The research topic originated with Dr. Sally Aitken, and both Dr. Aitken and I developed the details of the experimental design. I conducted the fieldwork, the labwork, the bioinformatics, the analysis, the interpretation, and the writing. Dr. Aitken’s advice guided every stage of the process, and Dr. Sally Otto contributed advice on Sanger verification and the calculation of the mutation rate.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
Chapter 1: Introduction
  1.1 What are somatic mutations?
  1.2 Causes
  1.3 Occurrence and frequency
  1.4 Evolutionary implications
  1.5 Context for the experimental design
Chapter 2: The Somatic Mutation Rate in Sitka Spruce
  2.1 Introduction
  2.2 Results
    2.2.1 Base substitutions: identification
    2.2.2 The IGV method
    2.2.3 Base substitutions: characteristics
    2.2.4 The mutation rate
  2.3 Discussion
  2.4 Methods
    2.4.1 Samples and processing
    2.4.2 SNP calling and filtering
    2.4.3 The IGV criteria
    2.4.4 Estimating the size of the search space
    2.4.5 Synonymous and non-synonymous substitutions
Chapter 3: Conclusion
References
Appendices
  Appendix A
  Appendix B
  Appendix C

List of Tables

Table 2.1 Mutations and their characteristics
Table A.1 Data and references for Figure 2.3
Table B.1 Geographic coordinates of trees sampled

List of Figures

Figure 2.1 Schematic diagram of the protocol for detecting somatic mutations in 20 trees
Figure 2.2 Schematic diagram of the bark samples and their consensus genotype
Figure 2.3 Mutation rates (µ) per generation (A) and per year (B) for multicellular species with sufficient data available

Acknowledgements

First, thanks to my supervisor Dr. Sally Aitken for her patience in teaching a disaffected mathematics student to do biology, and for the many ideas and people she recommended to me. I am especially grateful for the offer of such a compelling research question, particularly now that we know it leads somewhere interesting. I also appreciate the help and commentary of my committee members Dr. Rob Guy and Dr. Sally Otto, most notably Dr. Guy’s information on meristems and Dr. Otto’s work on the calculation of the size of the search space.

I am indebted to all members of the Aitken lab for answering thousands of questions, especially during the first year of research, for volunteering plenty of useful advice, for making suggestions about labwork, logistics, and bioinformatics, for friendship, and for supplying me with snacks. Thanks to Joane Elleouet and Dr. Dragana Obreht Vidakovic for help and commiseration in the lab, to Jon Degner for introducing me to UNIX and for continuous aid when all I could do was type cd and ls, to Dr. Pia Smets for instilling in me a great respect for receipts, and to Dr. Ian MacLachlan for practicality and an early trip to Carmanah. I gratefully acknowledge Lea Zhecheva’s stoicism in the face of a broken radio, rain, and over-large boots on an abortive attempt at fieldwork on Haida Gwaii.
Similarly, I thank Jon once again for help collecting samples in Carmanah, TJ Watt for his wonderful photos of trees despite a cloud of mosquitoes, and James Luce, Matthew Beatty, and Ryan Murphy for their herculean efforts while climbing 23 of the tallest trees in Canada so expertly. Thanks to the Varsity Outdoor Club for sore muscles and a burnt nose on Mondays and for an unfocussed mind on Fridays. I also appreciate the forbearance of my friends and family whenever I mentioned somatic mutations in otherwise civilized conversation. I am sincerely grateful for the support you have all given me during the past two and a half years.

Chapter 1: Introduction

1.1 What are somatic mutations?

All mutations outside the germline are somatic mutations: in trees, these are mutations that occur in any structure except sporangia, their immediate precursors, and the cell lineages they produce by meiosis for sexual reproduction. Somatic mutations have the potential to make each tree a mosaic of genetic variation that is broadly or finely partitioned according to the tissue in question. For example, a mutation in an axillary meristem could produce an entirely mutant branch, or a mutation in the cambium could propagate through a narrow radial sector in the wood within a transverse layer of xylem and phloem cells; but a slow-dividing mutant cell in a growing leaf might give rise to just a few mutant daughter cells. Somatic mutations can be inherited in taxa that lack a segregated germline or that reproduce asexually. However, since the popularization of Weismann’s doctrine that evolution is confined to the germline (Weismann 1892, cited in Buss 1983), as well as Magni’s (1963) observation that meiosis produces more mutations than mitosis in yeast, it has been widely assumed that most heritable mutations in eukaryotes are germline mutations (Buss 1983).
Yet unlike in mammals, the inheritance of somatic mutations is the rule rather than the exception in plants, fungi, and many animals (Buss 1983). Because most model organisms are small and short-lived, often with a segregated germline, the effects of heritable somatic mutations in large, long-lived organisms such as trees may be overlooked.

Researchers since the early 1980s have hypothesized that somatic mutations may be essential to the evolution of trees (Whitham and Slobodchikoff 1981; Gill et al. 1995; Petit and Hampe 2006; Schultz et al. 2011). Because apical meristems perform primary growth by repeated cell division, they are related to the zygote that established the tree by a long series of mitoses across decades, centuries, or even millennia. They can accumulate somatic mutations as they grow, from replication errors or other DNA damage, and they may produce gametes containing the mutant alleles. If heritable somatic mutations increase the per-generation mutation rate significantly, then tree populations should experience a concomitant increase in both genetic load and standing genetic variation (Whitham and Slobodchikoff 1981). The latter is particularly interesting because it might help trees overcome the adaptive handicap of long generation times.

1.2 Causes

It is generally thought that somatic mutations in plants originate from DNA replication during mitosis, and the hypothesis that the mitotic rate predicts the per-generation mutation rate across plants of different sizes or ages has been repeatedly proposed (Antolin and Strobeck 1985; Scofield and Schultz 2006; Lanfear et al. 2013). However, this hypothesis is not yet well supported (Gaut et al. 2011). The literature on human cancer identifies UV light, cigarette smoke, and the deamination of methylated cytosine as particularly important sources of somatic mutations (see Alexandrov et al. 2013).
Analogously, in trees, mutagens and spontaneous damage could cause somatic mutations that accumulate steadily with time rather than successive cell divisions. If this is so, the implications for trees that may live for thousands of years are intriguing.

An insightful analytic model by Gao et al. (2016) shows how difficult it is to discern the causes of mutation. In particular, under plausible conditions, DNA damage accruing with absolute time can couple the mutation rate with the rate of cell division if damaged sites are repaired quickly, that is, if damage escapes repair only when it occurs immediately before replication. Conversely, under more restrictive conditions, replication errors can accumulate as mutations at a constant rate per time or nearly so. Since the evidence that replication drives mutation comes from indirect correlations between neutral substitution rates and inferred rates of mitosis (Baer et al. 2007; Smith and Donoghue 2008; Arnheim and Calabrese 2009; Lanfear et al. 2013; Bromham et al. 2015), and since such relationships are easily confounded, the causes of mutations in plants may not be as well established as the literature on the subject suggests.

Knowledge of the causes of mutations could help predict variation in the somatic mutation rate among species and populations. If somatic mutations are not caused by replication, then factors like the proportion of methylated sites, UV exposure, or metabolic rate could predict mutation rates. Conversely, if replication is the primary culprit, we might expect ancient but stunted species (e.g. high-elevation subalpine larch (Larix lyallii) or bristlecone pine (Pinus longaeva)) to have lower per-generation mutation rates than large but comparatively short-lived species (e.g. valley-bottom Sitka spruce).
Morphological characteristics such as the architecture of apical meristems and the average length of the cells they produce (or, for clonal species, the tissue of origin of the ramet) might also be predictive of somatic mutation rates, all else being equal. Of course, all else is not equal. Differences in average generation time, effective population size (Lynch 2010b), genome size (Bromham et al. 2015), and other unknown mechanisms, such as variation in the susceptibility to infection (Ranade et al. 2014), can inform or complicate such inferences.

Differences in genomic architecture and composition among taxa could also alter mutation rates. For example, members of the Pinaceae have more interspersed repeats (40–86% of the genome; Wegrzyn et al. 2014) than angiosperms such as Arabidopsis (Arabidopsis thaliana; 14%; The Arabidopsis Genome Initiative 2000). Conversely, chromosomal rearrangements are less common in the Pinaceae than in angiosperms, along with duplications (Bowers et al. 2003), with 3 times more collinearity among homologous genes shared by Picea glauca and Pinus taeda than those shared by Arabidopsis and Populus trichocarpa despite similar divergence dates around 100 million years ago (Wang et al. 2009; Pavy et al. 2012; Wang et al. 2012). Such differences in genomic architecture could alter mutational processes, especially since repeats such as transposable elements can be mutagenic (Kidwell and Lisch 1997).

1.3 Occurrence and frequency

Conspicuous phenotypic variation within individuals provided the first evidence of somatic mutations in plants. Agriculturally important examples include the origins of pink grapefruits and nectarines from mutant sectors of wild-type grapefruit trees (Citrus x paradisi; Hartmann and Kester 1975) and peach trees (Prunus persica; Philp and Davis 1936, cited in McGregor 1976), as well as the high frequency of albinism in embryos produced by old mangroves (Rhizophora mangle; Klekowski and Godfrey 1989).
Induced or spontaneous mutations affecting the colour or shape of leaves or flowers are also well documented throughout the last century (e.g. Punnett 1919, cited in Gates 1920; Lindstrom 1933; Cuany et al. 1958, cited in Mericle and Mericle 1965; Heiken 1960; Whitham and Slobodchikoff 1981). More recently, direct genetic evidence has come from estimated rates of somatic mutation at microsatellite loci in a variety of trees (Pla et al. 2000; Cloutier et al. 2003; Lian et al. 2004; O’Connell and Ritland 2004; Burg et al. 2006; Helmersson et al. 2008; Ally et al. 2008; Kuchma et al. 2011; Ranade et al. 2014), although such rates can be difficult to interpret because microsatellite loci are generally non-functional and because of rate variation among loci.

Nonetheless, new morphological work suggests that heritable somatic mutations may be comparatively uncommon. Assuming they are caused by replication errors, a low rate of mitosis would be evidence for a low somatic mutation rate. A recent study found exactly this in the meristems of tomato (Solanum lycopersicum) and Arabidopsis during branching (Burian et al. 2016): that is, few cell divisions are necessary to produce an axillary bud from an apical meristem. If we extrapolate to conifers, however, it may not follow that there are few cell divisions (and hence perhaps few mutations) in apical meristems, since the production of long internodes or of apical buds could conceal many more cell divisions than branching events. A study of Arabidopsis pursuing a similar line of evidence found that small plants undergo only 13% fewer meristematic cell divisions than large plants (Watson et al. 2016). Sequencing the parents and progeny of large and small mismatch-repair-deficient mutants, the authors also showed that there was no significant difference among plant types in the number of new mutations.
Again, as detailed above, we might infer that large increases in plant size require only small increases in the number of cell divisions, and if somatic mutations result from replication errors, this could limit the somatic mutation rate even in very large trees. Little progress on the problem was technically feasible before next-generation sequencing methods became widely available. In 2016, Xie et al. reported the first per-generation mutation rate in a tree from sequence data. They estimated 7.77 × 10⁻⁹ substitutions per base pair per generation from the leaf of a young peach tree to the leaves of its very young progeny, including germline mutations. Their mutation identification protocol likely excluded most somatic mutations by requiring that each mutation be unique among progeny from different branches, but an additional application of their method showed at most a modestly elevated mutation rate even when the parent was 200 years old: that is, it did not indicate a strong effect of somatic mutation.

The most up-to-date work on the somatic mutation rate in trees is a new estimate for an angiosperm, a 234-year-old common oak (Quercus robur) in Switzerland (Schmid-Siegert et al. 2017). Whole-genome sequencing of two branches revealed 17 heterozygous mutations and a base substitution rate of 4.2–5.2 × 10⁻⁸ per base pair within a generation. This is the highest mutation rate reported for a multicellular species per generation. But although an oak undergoes much more vegetative growth during its lifetime than Arabidopsis, for example, the difference between their mutation rates is not proportionately large. This comparison suggests either that oak meristems perform fewer mitoses per meter of growth than Arabidopsis or perhaps that replication errors do not drive somatic mutation. Regardless, if oaks are similar to conifers, despite the fact that angiosperms evolve much faster than conifers in general (De La Torre et al.
2017; see below), then we have a broad idea of how the somatic mutation rate behaves: conifers may generate many more new mutations each generation than most taxa, but we expect a low rate of mutation per year because they are so long-lived.

Variation among sites or regions within a genome, for instance variation associated with recombination rates (Lercher and Hurst 2002), is an important caveat to any discussion of mutation rates. Not only are mutations more common in some nucleotide contexts, but mutation rates in different contexts are also differently correlated with replication time, recombination, and other processes, at least in mammals (Hodgkinson and Eyre-Walker 2011). Moreover, in humans, different tissues can have vastly different somatic mutation rates (Lynch 2010a). Any study of the somatic mutation rate must therefore contend with possible heterogeneity within species.

1.4 Evolutionary implications

Research on somatic mutations is motivated by the potential consequences of an unusually high dose of heritable mutation for large or long-lived species. Although many hypotheses exist, few have been thoroughly tested. It has been suggested that somatic mutations may permit trees to fine-tune local adaptation to current or changing climates (Whitham and Slobodchikoff 1981), maintain a coevolutionary arms race with insect herbivores (Whitham and Slobodchikoff 1981), create a branch-by-branch mosaic of plant defenses (Whitham and Slobodchikoff 1981), allow natural selection among branches and cell lineages within individual trees (Gaul 1964), contribute to genetic load (Klekowski 1988), or force the evolution of outcrossing mating systems (Scofield and Schultz 2006).
There is some theory on the latter and even more on selection within the individual (Klekowski and Kazarinova-Fukshansky 1984; Otto and Orive 1995; Otto and Hastings 1998; Pineda-Krch and Fagerstrom 1999; Orive 2001; Folse and Roughgarden 2012), as well as on the implications of somatic mutations for senescence (Klekowski 2003; Ally et al. 2008; Brutovská et al. 2013; Groot and Laux 2016), but molecular studies are restricted to plant defense mosaics (Padovan et al. 2013).

In order to understand many of these hypotheses, more detailed knowledge is required of the physiological processes that determine how somatic mutations spread through a tree and the effects they may have. For instance, selection among cell lineages within apical meristems is predicted to be most efficient if both the number of apical initials and the number of mitoses are high (Klekowski and Kazarinova-Fukshansky 1984), if cells are haploid, and if the meristem is unstructured (Otto and Orive 1995). Although the basic organization of conifer meristems shows that they fulfill few of these requirements (Evert 2006), the implications for selection within the individual will be unknown until meristems are described in finer detail. And even if meristem organization is not conducive to such selection, the physiology of branching events (Burian et al. 2016) might make selection among cell lineages efficient nonetheless.

Similarly, the efficiency of selection among branches (Antolin and Strobeck 1985) depends both on the genetic homogeneity of individual branches (produced by patterns of cell division in meristems) and also on the capacity of vigorous branches to generate or requisition more resources and produce more cones or flowers than weaker branches.
For example, although strong apical dominance limits the independence of individual branches in conifers, mutant branches could obtain more or fewer resources for cone production by photosynthesizing more efficiently or across greater leaf area, by producing cytokinin phytohormones to draw in more resources (Logan et al. 2013; Rob Guy, personal communication March 2018), by surviving stresses such as defoliation (Edwards et al. 1990), or simply by growing faster or slower than their neighbours. All of these processes are determined by physiology, not only by the amount of genetic variation within a tree.

All the same, data on somatic mutation rates are also needed to evaluate all the evolutionary hypotheses listed above. So far, the best available data for most plants are indirect inferences from neutral substitution rates. Simple but compelling arguments show that the neutral substitution rate is equal to the average heritable mutation rate per year, including mutations from meiosis, regardless of effective population size (Kimura 1983) or linked selection (Birky and Walsh 1988; but see Phung et al. 2016). If most heritable mutations in trees are somatic, then the neutral substitution rate is an informative upper bound for the somatic mutation rate. Estimates of the neutral substitution rate are confounded, however, by the difficulty of identifying truly neutral loci (Shields et al. 1988). The neutral substitution rate also incorporates historical alterations to the mutation rate due, perhaps, to variation in generation time, the appearance of mutator alleles, or changes in the efficiency of selection on the mutation rate itself, and this could decouple past substitution rates from the current mutation rate. A number of studies have estimated that substitution rates in gymnosperms are 0.5–1 × 10⁻⁹ per base pair per year, with variation among genera, and higher in angiosperms at 4–6.5 × 10⁻⁹ per base pair per year (Willyard et al. 2006; Buschiazzo et al.
2012; Chen 2012; De La Torre et al. 2017). The inferences that can be drawn from these data are limited. Perhaps angiosperm trees have a higher somatic mutation rate than conifers of a similar age and size because of differences in metabolism, meristematic architecture, genome composition, or any number of other factors. Picea, the genus considered in this study, displays a faster substitution rate than most other gymnosperms, so we may suspect that it accumulates more somatic mutations as well. However, the main implication is that since conifers have low neutral substitution rates compared with other plants, somatic mutations do not compensate for the slow pace that long generation times impose on evolution. If we want to evaluate hypotheses about the evolutionary effects of somatic mutations, then we need better estimates than neutral substitution rates provide. Phenotypic observations and neutral substitution rates are a poor substitute for direct evidence. The available estimates of mutation rates in trees are encouraging, but information from multiple species of angiosperms and gymnosperms, as well as knowledge both of the causes of somatic mutations and of the variation in the somatic mutation rate with life history traits, is necessary before a complete picture of their evolutionary significance emerges.

1.5 Context for the experimental design

To provide an estimate of the somatic mutation rate in a gymnosperm, I sequenced tissue from the top and bottom of exceptionally tall Sitka spruce. I chose this approach over sequencing two branch tips from each tree because the number of mutations between the top and bottom is a better approximation of the total contribution of heritable somatic mutations to the per-generation mutation rate, because samples from two branch tips cannot be separated by as much vegetative growth (i.e. as many mitoses), and because obtaining samples from the ends of branches on tall trees would be challenging.
I also chose to focus on base substitutions rather than indels, rearrangements, or other mutations, since base substitutions should be both more common and easier to detect against the highly fragmented reference genome available (Birol et al. 2013).

Because most literature on the subject suggests that somatic mutations are caused by replication errors (see above), and since greater height could imply a greater number of mitoses in primary apical meristems, I chose exceptionally tall rather than exceptionally old trees for this research. Sitka spruce is among the tallest tree species in the world, occurs locally, and has better genomic resources than the other local tall tree, Douglas-fir (Pseudotsuga menziesii). I sampled trees from the well-known population of seral, old-growth Sitka spruce on the floodplain of the Carmanah River valley (Carmanah Walbran Provincial Park), which is maintained by regular erosion when the river changes course and which contains the tallest tree in Canada (Little et al. 2013). Although these striking trees may not be representative of the average Sitka spruce, their height and age increase the odds of observing somatic mutations at all; and this advantage is necessary given the technical challenges of detecting such very rare events.

Chapter 2: The Somatic Mutation Rate in Sitka Spruce [1]

2.1 Introduction

Evolution depends on heritable mutations. Because new mutations increase genetic variation, can speed up adaptation, and also contribute to genetic load, any process that might alter the heritable mutation rate is of fundamental interest in evolutionary biology. The inheritance of somatic mutations is one such process in many animals and in all plants (Buss 1983).
Without a segregated germline, mutations that arise in a plant’s stem-cell-like apical meristems during vegetative growth can be propagated through new or growing branches, incorporated into the gametes those branches produce, and inherited by offspring.

Long-lived or large trees could have drastically elevated mutation rates if age or successive mitoses drive the accumulation of somatic mutations (Whitham & Slobodchikoff 1981). This possibility has inspired a variety of tantalizing hypotheses. Within a single tree, somatic mutations could permit selection among cell lineages to purge deleterious mutations (Gaul 1964), expose different combinations of new mutations to selection in offspring derived from different branches, and slow adaptation in insect herbivores by confronting them with a genetic mosaic of plant defenses (Whitham & Slobodchikoff 1981; but see Folse and Roughgarden 2012). Within a species, the extra genetic variation that somatic mutations generate could permit closer adaptation to local or changing conditions (Whitham & Slobodchikoff 1981), increase genetic load and promote outcrossing mating systems (Klekowski 1988; Scofield & Schultz 2006), and allow for coevolution with short-lived insect herbivores (Whitham & Slobodchikoff 1981).

[1] Chapter 2 is a manuscript for publication to which Sally Aitken and Sally Otto also contributed, so “we” is used rather than “I”.

There is some evidence that the somatic mutation rate is low per year and, at most, high per generation in conifers (Petit and Hampe 2006). Namely, the neutral nucleotide substitution rate, which equals the mutation rate per year irrespective of effective population size (Kimura 1983) or linked selection (Birky & Walsh 1988), is lower in conifers than the per-generation mutation rate in annual rice and Arabidopsis (De La Torre et al. 2017; Ossowski et al. 2010; Yang et al. 2015).
Thus, somatic mutations should contribute more to per-generation processes in conifers (e.g. selection within the individual) than to per-year processes (e.g. arms races with herbivores). A recent first estimate of the somatic mutation rate between two branch tips of a 234-year-old common oak (Quercus robur) corroborates this interpretation for angiosperm trees (Schmid-Siegert et al. 2017), but it remains to be seen whether it extends to conifers as well.

Early researchers argued that the ubiquity of mutant phenotypes in horticulture and the expectation of a high mitotic rate imply a high somatic mutation rate in trees (see Gill et al. 1995). Evidence of somatic mutations in nature soon followed. Strikingly, Klekowski and Godfrey (1989) observed that the incidence of mutant chlorophyll-deficient embryos in long-lived mangroves (Rhizophora mangle) is roughly 25 times higher than in short-lived barley (Hordeum vulgare) and buckwheat (Fagopyrum esculentum); and Edwards et al. (1990) reported a presumed-mutant Eucalyptus melliodora tree with a branch uniquely resistant to herbivory. But recent work on cell divisions and apical meristem organization in Arabidopsis (Arabidopsis thaliana) and tomato (Solanum lycopersicum) suggests that increases in body size may not entail concomitant increases in the number of mitoses, implying that somatic mutations may make a minor contribution to the total heritable mutation rate (Watson et al. 2016; Burian et al. 2016).

However, elegant theory cautions that the causes of mutations, whether mitosis or time-dependent DNA damage, are nearly indistinguishable (Gao et al. 2016). If mitosis is not the main source of mutation, then the number of mitoses may be a poor predictor of the somatic mutation rate.
Regardless, conifers and angiosperms have dramatically different physical architectures, since conifers typically have a single dominant stem and little bifurcative branching, and the two taxa have been evolving separately for 300–350 million years (Wang & Ran 2014): that is, the number of mitoses in annual angiosperms may not predict the somatic mutation rate in conifers. Although estimates of the somatic mutation rate in conifers exist for fast-mutating microsatellite loci (e.g. O’Connell & Ritland 2004), the rate of heritable somatic mutations in protein-coding regions is unknown. Here we estimate the somatic mutation rate in 20 exceptionally tall, old-growth Sitka spruce (Picea sitchensis) in the range of 220–500 years old (Little et al. 2013). We targeted and sequenced 10.2 Mb of DNA, primarily in or adjacent to exons, in samples from the top and bottom of each tree that were separated by an average of 74 m of vegetative growth and 3–4 branching events, compared their sequences to identify somatic mutations, and verified the results using Sanger sequencing.

2.2 Results

2.2.1 Base substitutions: identification

We took 2 samples of cambium and phloem tissues (“bark samples”) from the base of each tree as well as a sample of young foliage from as high in the crown as possible, with the foliage sample divided into 2 technical replicates (Figure 2.1). To include only heritable mutations, the bark samples were taken from opposite sides of the trunk. Because their most recent common ancestor is a cell in the centre of the tree, near to the tree’s meristem when it was a young seedling, this guarantees that any genotype shared by both bark samples represents the ancestral state, that is, the genotype of the young seedling itself (Figure 2.2).
We extracted DNA separately for all four samples from each of the 20 trees, sequenced reduced-representation libraries that targeted exon-rich regions totaling 10.2 Mb, aligned the data to the draft genome of a closely related natural hybrid (Picea glauca x engelmannii; Birol et al. 2013; Warren et al. 2015), and called and filtered SNPs to obtain a set of high-confidence variants. Overall, we obtained 276 Gb of Illumina sequence data, with a read depth of 34 over the target region and 71% of target bases covered to a depth of at least 5 in all four samples from each tree. Comparing consensus bark genotypes with consensus foliage genotypes within each tree yielded a conservative pool of 24 candidate base substitutions; Sanger sequencing verified 3 of these, rejected 16 as false positives, and was inconclusive for 5. We used the Integrative Genomics Viewer (IGV; Robinson et al. 2011) to manually examine raw sequence alignments (see The IGV method, below), identifying an additional 3 likely mutations from a broader pool of 141 candidate base substitutions obtained by relaxing filtering criteria. Sanger sequencing verified 2 of these 3 and rejected the remaining one as a false positive, bringing the total number of verified base substitutions to 5.

Figure 2.1 Schematic diagram of the protocol for detecting somatic mutations in 20 trees. DNA from samples and replicates was extracted separately. Illustration by Matt Strieby.

Figure 2.2 Schematic diagram of the bark samples and their consensus genotype. To infer the genotype of the tree when it was a young seedling, we discard loci for which the genotypes of the two bark samples differ, thus eliminating non-heritable mutations in the vascular cambium.
2.2.2 The IGV method

To assess candidate mutations from the conservative pool that failed either PCR amplification or Sanger sequencing despite multiple attempts, as well as to pre-screen the broader pool of 141 candidate base substitutions, we developed a set of criteria based on Keightley et al. (2014) to distinguish between true and false positives. In brief, we rejected a candidate mutation if unfiltered, aligned reads showed 3 haplotypes or if the mutant allele was present in putatively wild type samples (see Methods). We tested the criteria by making predictions for the original 24 candidate base substitutions and then Sanger sequencing them: the predictions were supported in every case.

Based on the IGV method, we discarded all 5 of the candidate mutations we could not Sanger sequence. Raw alignments for 3 showed heterozygotes with roughly equal numbers of reads supporting each allele in all four samples. The other 2 candidate mutations were at adjacent nucleotides in the same tree and displayed 3 well-supported haplotypes, even though Sitka spruce is diploid. All four samples for this tree contained at least 1 read supporting the 2 mutant alleles, and in every case the supporting read also contained a unique allele at a nearby locus, indicating that the candidate mutation resulted from the capture and sequencing of a paralogous locus outside the draft reference. For the broader pool of 141 candidate base substitutions, the IGV method identified every candidate as a false positive except the 3 likely base substitutions and the 3 verified base substitutions from the conservative pool, which is a subset of the broader pool.

2.2.3 Base substitutions: characteristics

For all of the 5 verified somatic mutations, a homozygote mutated to a heterozygote containing an allele that was unique among all trees for the locus. No tree or scaffold contained more than 1 mutation.
Three of the 5 base substitutions were G/C to A/T transitions and 2 of these 3 occurred at C-containing dipyrimidine sites (Table 2.1), which are known to be susceptible to UV-induced mutations (Friedberg et al. 2006). Moreover, 2 of the 5 mutations were non-synonymous and 1 was synonymous, although the remaining 2 did not align well to the hybrid spruce transcriptome and may have occurred in introns or in non-genic sequences outside the target region. The genes in which the synonymous and non-synonymous mutations occurred were predicted from a transcriptome assembly (Yeaman et al. 2014), and as such the functional consequences of these mutations are unknown.

The sequence quality at confirmed mutant loci was high. Read depth after filtering ranged from 21 to 141 in each of the four samples from a mutant tree, and no reads in the homozygous samples supported the mutant allele. Nonetheless, errors in alignment and SNP calling seem to be responsible for the large number of false positives among candidate mutations, presumably because the reference genome is fragmented and incomplete.

Table 2.1 Mutations and their characteristics. Positions given for the PG29 v4.1 reference genome.

Contig | Position | Mutation | CpG site | C-containing dipyrimidine site | Effect | Transcriptome reading frame
--- | --- | --- | --- | --- | --- | ---
ALWZ04S2162146.1 | 18479 | G/C to A/T | yes | yes | non-synonymous | serine to leucine
ALWZ04S2011522.1 | 5666 | G/C to A/T | no | no | - | no alignment
ALWZ04S1889965.1 | 2816 | G/C to A/T | no | yes | synonymous | proline
ALWZ04S2907496.1 | 48235 | G/C to T/A | no | no | non-synonymous | asparagine to lysine
ALWZ04S1913287.1 | 11681 | A/T to T/A | no | no | - | incomplete alignment

2.2.4 The mutation rate

We estimate a somatic base substitution rate of 2.7 x 10^-8 [95% CI: 7.2 x 10^-9 – 7.6 x 10^-8] per base pair within a generation, calculated as the number of verified mutations from the conservative pool (3) divided by the size of the search space, with an exact Poisson confidence interval.
The 2 additional verified mutations from the broader pool, which we screened using the IGV method, were excluded from this estimate because we could not assess the IGV method’s false negative rate. However, including both the conservative pool and the broader pool (all 5 verified mutations), as well as expanding the search space to account for the relaxed filters that produced the broader pool, yields a similar somatic base substitution rate estimate of 2.5 x 10^-8 [95% CI: 1.0 x 10^-8 – 5.8 x 10^-8] per base pair within a generation. The cumulative search space across all trees was estimated to be 113.1 Mb for the conservative pool and 197.6 Mb for the broader pool, calculated as a sum of the number of genotypes contained in invariant sites and the number of genotypes contained in variable sites that were checked for mutations. Some filters could only be applied to variable sites, and we reduced the number of invariant sites in the search space proportionally: since 47% of all genotypes contained in variable sites passed filters and were checked for mutations, 47% of all genotypes contained in invariant sites were included in the search space. After accounting for diploidy, a 7% correction was required to compensate for bias against heterozygotes in the genotype-level filters. Because verified mutations in our study consist of two homozygous genotypes (bark) and two heterozygous genotypes (needles), such bias means that the filters retain mutant genotypes at a slightly different rate than wild type genotypes (see Methods).

Somatic mutations discarded in the process of filtering are not false negatives for the mutation rate, since the corresponding search space was reduced in proportion to the number of genotypes that did not pass filters.
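As a rough check on the two point estimates above, the sketch below divides the number of verified mutations by the cumulative search space for each pool. The helper name is illustrative and not part of the original pipeline, and the confidence intervals are deliberately not recomputed, since the text does not specify the interval construction beyond "exact Poisson".

```python
# Point estimates of the per-generation somatic base substitution rate.
# Counts and search-space sizes are the values reported in the text;
# substitution_rate is a hypothetical helper, not the authors' code.

def substitution_rate(verified_mutations: int, search_space_bp: float) -> float:
    """Return verified mutations per base pair of search space."""
    if search_space_bp <= 0:
        raise ValueError("search space must be positive")
    return verified_mutations / search_space_bp


conservative_rate = substitution_rate(3, 113.1e6)  # conservative pool only
combined_rate = substitution_rate(5, 197.6e6)      # conservative + broader pool

print(f"conservative pool: {conservative_rate:.1e} per bp per generation")
print(f"both pools:        {combined_rate:.1e} per bp per generation")
```

Under these inputs the two estimates round to the reported 2.7 x 10^-8 and 2.5 x 10^-8.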
Nevertheless, we compared the frequency of verified mutations in the broader pool (relaxed filters) and the conservative pool (more stringent filters) to confirm that candidate mutations discarded by the conservative pool’s stringent filters were largely false positives. In general, the relaxed filters targeted a subset of the quality statistics targeted by the stringent filters, and the relaxed filters excluded less of the genome. The frequency of true mutations in the conservative pool of 24 candidate base substitutions was 7 times higher than among the additional candidates in the broader pool of 141 base substitutions.

2.3 Discussion

We report a somatic mutation rate in old growth Sitka spruce that is among the highest rates of heritable base substitution per generation in any eukaryote, even without mutations from error-prone meiosis (Magni 1963). Differences in age and height within a tree species (Whitham & Slobodchikoff 1981), among other effects (e.g. Sharp and Agrawal 2012), are predicted to produce broad variation in somatic mutation rates among individuals and populations. Our population of Sitka spruce has outlived the average generation time, so we expect an elevated per-generation mutation rate; and this increase could be non-linear if older trees are more or less susceptible to mutagenic infections or other stress (Ranade et al. 2014). In contrast, the annual somatic mutation rate in Sitka spruce is remarkably low: 7.4 x 10^-11 base substitutions per base pair per year, assuming an average age of 360 years for the trees sampled (Figure 2.3). Perhaps this is unsurprising, since height and genome size are negatively correlated with neutral substitution rates in both gymnosperms (De La Torre et al. 2017) and angiosperms (Bromham et al. 2015), and Sitka spruce is among the tallest tree species and has a ~21 Gb genome (Birol et al. 2013).
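Two figures in the passage above can be reproduced with simple arithmetic, sketched here under stated assumptions. The per-year rate divides the conservative-pool estimate (3 verified mutations in 113.1 Mb) by the assumed mean age of 360 years. The pool comparison assumes that the "additional" candidates are the 141 - 24 = 117 candidates unique to the broader pool, of which 2 were verified; that reading is an interpretation of the text, and the variable names are illustrative.

```python
# Per-year rate: per-generation estimate divided by the assumed tree age.
verified_conservative = 3
search_space_bp = 113.1e6
assumed_age_years = 360

per_generation = verified_conservative / search_space_bp
per_year = per_generation / assumed_age_years
print(f"annual rate: {per_year:.1e} substitutions per bp per year")

# Frequency of verified mutations among candidates in each pool.
# Assumption: the broader pool contains the 24 conservative candidates
# plus 117 additional ones, 2 of which were verified.
conservative_candidates = 24
broader_candidates = 141
additional_candidates = broader_candidates - conservative_candidates
verified_additional = 2

freq_conservative = verified_conservative / conservative_candidates
freq_additional = verified_additional / additional_candidates
ratio = freq_conservative / freq_additional
print(f"verified frequency ratio (conservative / additional): {ratio:.1f}")
```

With these inputs the annual rate rounds to 7.4 x 10^-11 and the frequency ratio is a little over 7, in line with the values quoted in the text.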
Meiosis aside, the per-year mutation rate in Sitka spruce is one-hundredth of that reported for rice (Oryza sativa; Yang et al. 2015), for instance, as well as one-third of the per-year somatic mutation rate in oak (Schmid-Siegert et al. 2017), consistent with the much lower neutral substitution rates reported for gymnosperms (De La Torre et al. 2017).

Finally, we note the discrepancy of nearly an order of magnitude between the annual somatic mutation rate for Sitka spruce and the higher average neutral substitution rate reported for Picea (average of P. glauca, P. sitchensis, and P. abies; De La Torre et al. 2017). This might be explained by the addition of meiosis and subsequent mitoses to the substitution rate, interspecific differences, or possibly by episodes of faster mutation in the history of the genus.

Figure 2.3 Mutation rates (µ) per generation (A) and per year (B) for multicellular species with sufficient data available. Per-year rates were calculated using a reported age, an estimate of generation time, or an average of two estimates of generation time (Table A.1, Appendix A). For the trees, µ was estimated in exceptionally long-lived specimens; and for Quercus and Picea, µ excludes the contribution of meiosis, subsequent mitoses, and early growth.

That the somatic mutation rate is high per generation but low per year suggests that it should be interpreted according to the timescale of the evolutionary process in question. For example, somatic mutations may explain the high genetic load of most trees (Klekowski 1988). Although there are examples of nearly loadless species, a median of 7 lethal equivalents per zygote in loblolly pine (Pinus taeda) and 10 in Douglas-fir (Pseudotsuga menziesii) are considered representative for conifers, compared with just 2 lethal equivalents per zygote in humans, and considerable genetic load has also been documented in angiosperms such as Eucalyptus regnans (Klekowski 1988).
Since an increase to the deleterious mutation rate directly increases mutation load (Haldane 1937), mutation load due to the accumulation of somatic mutations likely accounts for much of the observed genetic load in trees. Selection among or within a tree’s branches (i.e. among or within apical meristems) is likewise a per-generation (or per-cell-generation) process. Although the extent of variation within an apical meristem is unknown, our results indicate that the somatic mutation rate between two distant branches on an average Sitka spruce could well be near the human germline mutation rate (1.20 x 10^-8; Kong et al. 2012). This is a difference of 4–5 base substitutions in coding regions and UTRs (untranslated regions), assuming that Sitka spruce’s transcriptome is the same size as that of Picea glauca x engelmannii natural hybrids (182.2 Mb; Yeaman et al. 2014). Indels, inversions, and structural changes, as well as further base substitutions in regulatory regions and introns, would generate additional dissimilarities. If selection on such variation is effective, it could skew the distribution of fitness effects for heritable mutations by removing variants that are deleterious at the cell or branch level (Gaul 1964). In fact, one caveat to our estimate is that selection within or among branches may have already depressed the somatic mutation rate. Over short timescales, however, the annual addition of heritable somatic mutations is just 0.03 mutations per year in coding regions and UTRs (in 182.2 Mb; see above). In comparison, given its per-generation mutation rate (Ossowski et al. 2010), an annual plant such as Arabidopsis could generate more than 80 times as many mutations per year in an equivalent amount of sequence. All else equal, it is improbable that such a minute boost to the mutation rate in Sitka spruce could sidestep its long generation times and let it adapt to fast-changing climates (Aitken et al.
2008) or keep pace with coevolving insect herbivores (Whitham and Slobodchikoff 1981). Nevertheless, it is possible that the somatic mutations observed in Sitka spruce comprise an elevated proportion of beneficial alleles. If selection among branches or cell lineages within a tree has removed many deleterious mutations, then the beneficial mutation rate may be higher than it appears, and somatic mutations may do more to facilitate adaptation over short timescales than the comparison with Arabidopsis suggests. The structure of conifer apical meristems determines the importance of this process (Otto and Orive 1995). If deleterious mutations depress the rate of cell division in mutant apical initials by just 20% in a structured meristem, selection could remove as much as two-thirds of deleterious mutations (Otto and Orive 1995). Conifer meristems have more structure than the structured meristem considered by Otto and Orive (Evert 2006), however, and this may make them resistant to such selection. Alternatively, it is also possible that enough (beneficial) mutations arise early in the development of conifers, meaning they appear in most reproductive branches, that they make an outsized contribution to adaptation by appearing in gametes at a higher rate per year than expected.

Even if the adaptive contribution of somatic mutations over short timescales is negligible, somatic mutations could generate enough variation to strengthen local adaptation over centuries or generations. When adaptation requires a very rare beneficial mutation or sequence of mutations, for example, somatic mutations improve the odds that it will appear in a population, although beneficial mutations are easily lost to drift when they first arise. Selection within the individual could even promote a beneficial somatic mutation if it improves the reproductive success of the branches that bear it.
Similarly, since conifers often have large effective population sizes (Dodd & Silvertown 2000), a higher mutation rate can produce considerably stronger adaptation when many loci underlie adaptive traits (Yeaman 2015).

A striking similarity between Sitka spruce and other plants is the preponderance of G/C to A/T transitions (Yang et al. 2015; Xie et al. 2016; Schmid-Siegert et al. 2017; Exposito-Alonso et al. 2018), which often result either from UV damage or the deamination of methylated cytosine at CpG sites (Friedberg et al. 2006). Although mitosis can transform such damage into mutation, mitosis does not cause either methylation damage or UV damage, and so the common doctrine that replication errors are the dominant cause of mutation in plants deserves more scrutiny than it receives. For example, if we accept predictions of the mitotic rate based on the number of mitoses during branching events in Arabidopsis (assumed to be bifurcative; Burian et al. 2016), just 18–24 mitoses from an embryonic meristem to a third- or fourth-order branch in Sitka spruce yielded 4 times more mutations than meiosis and 30–50 mitoses combined do over the life of an Arabidopsis plant (Ossowski et al. 2010). Alternatively, if the number of mitoses is proportional to the length of vegetative growth, then Arabidopsis, which might be less than 0.3 m tall, produced nearly a quarter of the mutations that occurred in 74 m of Sitka spruce. In this example, then, estimates of the mitotic rate do not explain the mutation rate.

We offer the first direct estimate of the somatic mutation rate in a conifer. The somatic mutation rate is very high in large, old Sitka spruce within a generation but very low both per year and per meter of growth, and we argue that this distinction determines the relevance of somatic mutation for evolutionary processes.
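The comparison with Arabidopsis above can be laid out as a short calculation. This is a sketch under one outside assumption: the Arabidopsis per-generation rate is taken to be 7 x 10^-9 base substitutions per site, the figure commonly attributed to Ossowski et al. (2010) but not stated in this chapter. All other values come from the text, and the variable names are illustrative.

```python
# Values from the text, except ARABIDOPSIS_RATE, which is an assumption
# (the per-generation figure commonly attributed to Ossowski et al. 2010).
SPRUCE_RATE = 2.7e-8            # per bp, within a generation (this study)
ARABIDOPSIS_RATE = 7e-9         # per bp per generation (assumed)

SPRUCE_MITOSES = (18, 24)       # embryonic meristem to 3rd/4th-order branch
ARABIDOPSIS_MITOSES = (30, 50)  # over the life of the plant, meiosis excluded
SPRUCE_GROWTH_M = 74            # average vegetative growth between samples
ARABIDOPSIS_HEIGHT_M = 0.3      # upper bound used in the text

fold_difference = SPRUCE_RATE / ARABIDOPSIS_RATE
spruce_per_m = SPRUCE_RATE / SPRUCE_GROWTH_M
arabidopsis_per_m = ARABIDOPSIS_RATE / ARABIDOPSIS_HEIGHT_M

print(f"spruce / Arabidopsis, per generation: {fold_difference:.1f}x")
print(f"per metre of growth: spruce {spruce_per_m:.1e}, "
      f"Arabidopsis {arabidopsis_per_m:.1e}")
for n in SPRUCE_MITOSES:
    print(f"spruce, {n} mitoses: {SPRUCE_RATE / n:.1e} per bp per mitosis")
for n in ARABIDOPSIS_MITOSES:
    print(f"Arabidopsis, {n} mitoses: {ARABIDOPSIS_RATE / n:.1e} per bp per mitosis")
```

Whichever denominator is used, the two species do not converge on a common per-mitosis or per-metre rate, which is the point the paragraph makes.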
This interpretation supports the hypotheses that somatic mutations cause high genetic load in trees and provide significant genetic variation both for selection within the individual and for adaptation over long timescales. It does not, however, support the hypothesis that somatic mutations allow trees to maintain adaptation under fast-changing conditions.

Future research on intraspecific variation of the somatic mutation rate with tree age, height, and environment could identify useful correlates for predicting interspecific differences in the mutation rate.

2.4 Methods

2.4.1 Samples and processing

Twenty Sitka spruce trees were chosen for exceptional height along a one-kilometer stretch of the Carmanah River on Vancouver Island, Canada (48°66ˈ N, 124°69ˈ W; Table B.1, Appendix B). A previous study of the population (Little et al. 2013) found that the largest Sitka spruce in the Carmanah River valley are between 220 and 500 years old, and we used 360 years (the midpoint of the range) as an estimate for the age of the trees we sampled (e.g. in Figure 2.3). Two bark samples were taken from opposite sides of the base of each tree with a leather punch. The outer bark and as much of the phloem as possible were shaved off, leaving a mixture of cambium and phloem to be dried in silica beads in the field, and DNA from each sample was extracted in the laboratory using a CTAB protocol modified from Zeng et al. (2002). In particular, we omitted the additives used during grinding and added 5 µL β-mercaptoethanol with the 3% CTAB. Each sample of needles was taken from a single twig as high in the tree as possible, similarly dried, and then divided into 2 technical replicates before DNA extraction (NucleoSpin). Sequence capture probes (Roche Nimblegen) were designed for a subset of the exon-rich hybrid spruce contigs targeted in a previous study (Suren et al. 2016).
We targeted the contigs that were captured and sequenced most successfully across the most individuals using Suren et al.’s sequence capture probes on Sitka spruce (Joane Elleouet, unpublished data). Prepared libraries were barcoded and sequenced on 4 lanes of an Illumina HiSeq 4000 with 100 bp paired-end reads at Genome Quebec (Montreal, QC).

2.4.2 SNP calling and filtering

We aligned the reads from each sample to the PG29 v4.1 hybrid spruce draft reference genome (Birol et al. 2013) using NextGenMap v0.5.0 (Sedlazeck et al. 2013) and marked PCR duplicates with Picard MarkDuplicates v2.8.1. The contigs of the highly fragmented reference genome were first assembled into pseudo-scaffolds separated by 400 “N”s to facilitate the use of bioinformatics programs downstream. We then applied HaplotypeCaller and GenotypeGVCFs from GATK v3.7 with the flag -all_sites to create a VCF file of both variable and invariant sites (Van der Auwera et al. 2013).

Because we found that genotype calls at sites with a high minor allele frequency (MAF) across all trees were less reliable, we divided sites according to a MAF threshold of 0.05 and processed low- and high-MAF sites separately, with more stringent filters on high-MAF sites. We then applied a series of unbiased filters. We discarded indels, multiallelic sites, and sites with more than 85% missing data before filtering loci by site depth (high-MAF: <6000 reads across all samples; low-MAF: <7000 reads) and genotype depth (high-MAF: >11 and <71 reads per sample; low-MAF: >11 and <76 reads) to obtain a basic set of good-quality sites (the initial set, below). We then separated invariant and variable sites, filtered the latter by GATK’s ReadPosRankSumTest (high-MAF: >-0.6 and <0.7; low-MAF: >-1.3 and <1.6) and BaseQualityRankSumTest (high-MAF: >-0.6 and <1.0; low-MAF: >-0.7 and <1.3), and further filtered high-MAF sites using the strand bias tests StrandOddsRatio (<1.4) and FisherStrand (<12).
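The site-level thresholds above can be collected into a small lookup table. The sketch below is one possible way to express them, not the code used in the study: the dictionary layout, field names, and helper functions are hypothetical, each quoted range is read as the range of values that were retained, and a site with MAF exactly equal to 0.05 is arbitrarily treated as low-MAF. Per-sample genotype depth filters are omitted for brevity.

```python
# Thresholds transcribed from the text; everything else is illustrative.
SITE_FILTERS = {
    "high": {
        "site_depth_below": 6000,
        "read_pos_rank_sum": (-0.6, 0.7),
        "base_quality_rank_sum": (-0.6, 1.0),
        "strand_odds_ratio_below": 1.4,  # high-MAF sites only
        "fisher_strand_below": 12.0,     # high-MAF sites only
    },
    "low": {
        "site_depth_below": 7000,
        "read_pos_rank_sum": (-1.3, 1.6),
        "base_quality_rank_sum": (-0.7, 1.3),
    },
}
MAF_THRESHOLD = 0.05
MAX_MISSING_FRACTION = 0.85


def maf_class(maf: float) -> str:
    """Classify a site as high- or low-MAF (boundary handling is assumed)."""
    return "high" if maf > MAF_THRESHOLD else "low"


def passes_variable_site_filters(site: dict) -> bool:
    """Return True if a variable site passes every site-level filter."""
    limits = SITE_FILTERS[maf_class(site["maf"])]
    if site["missing_fraction"] > MAX_MISSING_FRACTION:
        return False
    if site["site_depth"] >= limits["site_depth_below"]:
        return False
    for key in ("read_pos_rank_sum", "base_quality_rank_sum"):
        low, high = limits[key]
        if not low < site[key] < high:
            return False
    if "strand_odds_ratio_below" in limits:
        if site["strand_odds_ratio"] >= limits["strand_odds_ratio_below"]:
            return False
        if site["fisher_strand"] >= limits["fisher_strand_below"]:
            return False
    return True


# Made-up example record, for illustration only.
example_site = {
    "maf": 0.20, "missing_fraction": 0.10, "site_depth": 3200,
    "read_pos_rank_sum": 0.1, "base_quality_rank_sum": 0.4,
    "strand_odds_ratio": 0.9, "fisher_strand": 3.0,
}
print(passes_variable_site_filters(example_site))
```

Keeping the thresholds in one table makes the asymmetry between the two MAF classes easy to see: only high-MAF sites are subject to the strand bias tests.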
We avoided stringent filtering in order to procure a generous search space for candidate mutations. Next, we defined homozygous-reference, heterozygous, and homozygous-alternate genotypes by setting requirements for the proportion of reads supporting the reference and alternate alleles (from the hybrid spruce reference). In particular, heterozygotes were called if 40–60% of filtered reads supported each allele (high-MAF; 23–77% for low-MAF); and homozygous-reference or homozygous-alternate genotypes were called if >98% of filtered reads supported one allele (high-MAF; >97% for low-MAF). Such requirements are biased because they skew the proportion of heterozygous genotypes and hence also the proportion of true mutations (bias corrected below). We also filtered by GATK’s genotype likelihood values (high-MAF: >53; low-MAF: >15). The result was two final sets of high-confidence genotypes, one for low-MAF sites (containing $n_{low}$ genotypes, below) and one for high-MAF sites (containing $n_{high}$ genotypes). We considered any locus for which both bark samples have the same genotype, both foliage samples have the same genotype, and the bark and foliage genotypes differ to be a candidate somatic mutation.

Candidate mutations were verified using Sanger sequencing. We designed primers for each mutation, amplified it, and sent it to the Sequencing and Bioinformatics Consortium (the University of British Columbia) for sequencing. If the mutant allele was not found in a heterozygous sample, we considered the candidate mutation a false positive. If the mutant allele was both confirmed in a heterozygous sample and also absent in a homozygous sample, we considered the candidate mutation a true positive.

2.4.3 The IGV criteria

We examined raw BAM files for the four relevant samples to evaluate select candidate mutations.
If we define the candidate mutant allele to be whichever allele is present in the heterozygous samples but not the homozygous samples, then we consider a candidate mutation to be a false positive if at least one of the following is true:
i. All reads that show the candidate mutant allele also show a unique allele at a nearby locus in the heterozygous samples, and at least 1 of the homozygous samples contains reads supporting both alleles.
ii. At the candidate locus and a nearby polymorphic locus there are 3 haplotypes in the heterozygous samples, each supported by multiple reads.
iii. All four samples contain at least 1 read supporting the candidate mutant allele.

2.4.4 Estimating the size of the search space

The calculation of the search space accounts for the fact that not all filters could be applied to invariant sites and corrects for the bias in the filtering process against heterozygotes. The bias against heterozygotes was observed as a decrease in heterozygosity after the genotype definitions (above) were applied to variable sites. Since a mutation consists of two homozygous bark samples and two heterozygous needle samples, this implies a bias against mutations as well. Thus, we reduced the size of the search space by 7% to bring it in line with the slightly lower pass rate of mutant loci (Appendix C).

More precisely, the search space was calculated as

$$2 S^{-1} \left( n_{high} h_{high} + n_{low} h_{low} \right)$$

where $n_{high}$ and $n_{low}$ are the numbers of high- and low-MAF high-confidence variable genotypes, $S$ is the fraction of all genotypes that are contained in variable sites (counted in the initial set, above), and the factor of 2 adjusts for diploidy. Finally, $h_{low}$ and $h_{high}$ constitute the heterozygosity correction described above.

2.4.5 Synonymous and non-synonymous substitutions

Mutant codons were identified using BLAST (Altschul et al. 1990) to align 200 bp around each mutation to the Picea glauca x engelmannii transcriptome (Yeaman et al.
2014), retaining only high-confidence matches (E value <10^-40). We then mapped mutations onto the longest predicted open reading frames produced by Yeaman et al. (2014) and determined the impact on the predicted protein (Table 2.1).

Chapter 3: Conclusion

I estimated a somatic base substitution rate of 2.7 x 10^-8 per base pair within a generation in 20 exceptionally tall old-growth Sitka spruce trees. This is among the highest known mutation rates for any species on a per-generation basis, but it corresponds to a remarkably low rate per year because of the great age of these trees (Figure 2.3). Most of the mutations I identified were G/C to A/T transitions (Table 2.1), as in other plants, and they may have been caused by UV damage or the deamination of methylated cytosine.

One caveat is that the mutation rate reported here is probably higher than average for Sitka spruce. Their mean generation time and mitotic rate are unknown, but if mutations accumulate linearly with time or cell divisions, we might expect most individuals to mutate at half or less of the rate estimated here because they are much smaller on average, although heritable somatic mutations would still be exceptionally common per generation. Likewise, since I sampled 20 trees, I could only sequence a small fraction of Sitka spruce’s ~21 Gb genome. Whole-genome sequencing (as in Schmid-Siegert et al. 2017) might have provided enough data to reveal a clearer excess of G/C to A/T transitions, as well as a more precise estimate of the somatic mutation rate, but would have required resources far in excess of those available. In order to disentangle the causal effects of time and vegetative growth on the somatic mutation rate, one could study conifers of similar heights but different ages and vice versa.
For example, one could choose a species such as Abies amabilis that grows across a wide variety of elevations at the same latitude, since high-elevation trees both grow more slowly and are exposed to more UV light. If taller, younger trees mutate more frequently than older, smaller trees, then replication is the culprit; but if the converse is true, then time-dependent DNA damage is more likely (although patterns of mitosis in apical meristems under different conditions, as well as the joint increase in age and UV intensity at higher elevations, could confound these inferences). Alternatively, since the pattern of mitosis in the cambium is well studied (e.g. Newman 1956; Bannan 1957; Murmanis 1970; Larson 1994), it would be interesting to sequence tissue culture derived from single cambial cells on opposite sides of the trunk and evaluate whether the per-cell-division mutation rate (of non-heritable mutations) is constant for trees of different ages.

Similarly, an initial estimate of the somatic mutation rate is far from sufficient to evaluate the significance of somatic mutations for conifer evolution. My result can only suggest that somatic mutations are likely to be of greater interest for per-generation processes than per-year processes. For instance, recent theory on alleles of small effect explores the adaptive consequences of a boost to the mutation rate such as somatic mutations provide (Yeaman 2015). In particular, Yeaman’s simulations show that for multilocus traits, transient alleles can replace each other to maintain adaptation even as some are lost to drift, and that even a modest increase to the per-generation mutation rate can enhance local adaptation of this sort. The implication is that somatic mutations might contribute to the remarkable local adaptation that trees display, but over many generations and not just many years.
Unless selection within the individual drastically increases the proportion of beneficial mutations that a tree produces, it is unlikely that a tiny annual contribution of less than 10^-10 mutations per base pair could do much to accelerate adaptive responses to the rapidly changing conditions created by species invasions or increasing drought frequency with climate change, for example.

The proposal that selection lowers the mutation rate until drift obscures the benefit of any further decrease is one hypothesis that is clearly locked to the per-generation mutation rate: this is the drift-barrier hypothesis of mutation rate evolution (Lynch 2010), and the implication is that in species with large effective population sizes such as many conifers, the heritable mutation rate per generation should be low. But because the accumulation of somatic mutations results in a high per-generation mutation rate in some conifers despite their large effective population size, they may offer a chance to test the generality of the hypothesis. That is, are mutation rates similar among conifers and non-conifer species with similar effective population sizes?

For many of these investigations, better genomic resources and more knowledge of basic population genetic parameters are required. In trees these are scarce (although they could be obtained for fast-growing angiosperm trees with small genomes), but since somatic mutations are heritable in many species, perhaps some of these ideas could be evaluated in well-studied model organisms. Nonetheless, many apply only to large or long-lived species, and so trees provide one of the better systems for research. In particular, a comparison of the distribution of fitness effects for somatic mutations between large, old trees and seedlings could assess the efficiency of selection within the individual, and the methods described above could elucidate both the causes of somatic mutations and their adherence to the drift-barrier hypothesis.
I hope that early estimates of the somatic mutation rate in conifers will be of use for further investigation of these problems.

References

Aitken SN, Yeaman S, Holliday JA, Wang T, Curtis-McLane S. 2008. Adaptation, migration or extirpation: climate change outcomes for tree populations. Evol. Appl. 1:95–111.
Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SAJR, Behjati S, Biankin AV, Bignell GR, Bolli N, Borg A, Børresen-Dale AL, et al. 2013. Signatures of mutational processes in human cancer. Nature 500:415–421.
Ally D, Ritland K, Otto SP. 2008. Can clone size serve as a proxy for clone age? An exploration using microsatellite divergence in Populus tremuloides. Mol. Ecol. 17:4897–4911.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410.
Antolin M, Strobeck C. 1985. The population genetics of somatic mutation in plants. Am. Nat. 126:52–86.
Arnheim N, Calabrese P. 2009. Understanding what determines the frequency and pattern of human germline mutations. Nat. Rev. Genet. 10:478–488.
Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, del Angel G, Levy-Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J, et al. 2013. From FastQ data to high-confidence variant calls: the genome analysis toolkit best practices pipeline. Curr. Protoc. Bioinforma. 43:1–11.
Baer CF, Miyamoto MM, Denver DR. 2007. Mutation rate variation in multicellular eukaryotes: causes and consequences. Nat. Rev. Genet. 8:619–631.
Bannan MW. 1957. The relative frequency of the different types of anticlinal divisions in conifer cambium. Can. J. Bot. 35:875–884.
Birky CW, Walsh JB. 1988. Effects of linkage on rates of molecular evolution. Proc. Natl. Acad. Sci. U.S.A. 85:6414–6418.
Birol I, Raymond A, Jackman SD, Pleasance S, Coope R, Taylor GA, Saint Yuen MM, Keeling CI, Brand D, Vandervalk BP, et al. 2013. Assembling the 20 Gb white spruce (Picea glauca) genome from whole-genome shotgun sequencing data. Bioinformatics 29:1492–1497.
Bowers JE, Chapman BA, Rong J, Paterson AH. 2003. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422:433–438.
Bromham L, Hua X, Lanfear R, Cowman PF. 2015. Exploring the relationships between mutation rates, life history, genome size, environment, and species richness in flowering plants. Am. Nat. 185:507–524.
Brutovská E, Sámelová A, Dušička J, Mičieta K. 2013. Ageing of trees: application of general ageing theories. Ageing Res. Rev. 12:855–66.
Burg K, Helmersson A, Bozhkov P, von Arnold S. 2006. Developmental and genetic variation in nuclear microsatellite stability during somatic embryogenesis in pine. J. Exp. Bot. 58:687–698.
Burian A, de Reuille PB, Kuhlemeier C. 2016. Patterns of stem cell divisions contribute to plant longevity. Curr. Biol. 26:1–10.
Buschiazzo E, Ritland C, Bohlmann J, Ritland K. 2012. Slow but not low: genomic comparisons reveal slower evolutionary rate and higher dN/dS in conifers compared to angiosperms. BMC Evol. Biol. 12. doi:10.1186/1471-2148-12-8.
Buss LW. 1983. Evolution, development, and other units of selection. Proc. Natl. Acad. Sci. U.S.A. 80:1387–1391.
Castiglia R, Capanna E. 1999. Contact zones between chromosomal races of Mus musculus domesticus. 1. Temporal analysis of a hybrid zone between the CD chromosomal race (2n=22) and populations with the standard karyotype. Heredity. 83:319–326.
Chen J. 2012. Conifer evolution, from demography and local adaptation to evolutionary rates [dissertation]. Uppsala Universitet. 52.
Cloutier D, Rioux D, Beaulieu J, Schoen DJ. 2003. Somatic stability of microsatellite loci in eastern white pine, Pinus strobus L. Heredity. 90:247–52.
Davis KF, Gephart JA, Gunda T. 2016. Sustaining food self-sufficiency of a nation: the case of Sri Lankan rice production and related water and fertilizer demands. Ambio 45:302–312.
Dodd ME, Silvertown J. 2000. Size-specific fecundity and the influence of lifetime size variation upon effective population size in Abies balsamea. Heredity. 85:604–609.
Edwards PB, Wanjura WJ, Brown WV, Dearn JM. 1990. Mosaic resistance in plants. Nature 347:434.
Evert RF. 2006. Esau’s plant anatomy: meristems, cells, and tissues of the plant body: their structure, function, and development. 3rd ed. John Wiley and Sons. [accessed 2015 Nov 5].
Exposito-Alonso M, Becker C, Schuenemann VJ, Reiter E, Setzer C, Slovak R, Brachi B, Grimm G, Chen J, Busch W, et al. 2018. The rate and potential relevance of new mutations in a colonizing plant lineage. PLOS Genet. 14:1–21.
Falahati-Anbaran M, Lundemo S, Stenøien HK. 2014. Seed dispersal in time can counteract the effect of gene flow between natural populations of Arabidopsis thaliana. New Phytol. 202:1043–54.
Folse HJ, Roughgarden J. 2012. Direct benefits of genetic mosaicism and intraorganismal selection: modeling coevolution between a long-lived tree and a short-lived herbivore. Evolution. 66:1091–1113.
Friedberg E, Walker G, Siede W, Wood R, Schultz R, Ellenberger T. 2006. DNA repair and mutagenesis. 2nd ed. Washington, DC: ASM Press.
Gao Z, Wyman MJ, Sella G, Przeworski M. 2016. Interpreting the dependence of mutation rates on age and time. PLoS Biol. 14:1–37.
Gates RR. 1920. Mutations and evolution. New Phytol. 19:64–88.
Gaul H. 1964. Mutations in plant breeding. Radiat. Bot. 4:155–232.
Gaut B, Yang L, Takuno S, Eguiarte LE. 2011. The patterns and causes of variation in plant nucleotide substitution rates. Annu. Rev. Ecol. Evol. Syst. 42:245–66.
Geraldes A, Basset P, Gibson B, Smith KL, Harr B, Yu HT, Bulatova N, Ziv Y, Nachman MW. 2008. Inferring the history of speciation in house mice from autosomal, X-linked, Y-linked and mitochondrial genes. Mol. Ecol. 17:5349–5363.
Gill DE, Chao L, Perkins SL, Wolf JB. 1995. Genetic mosaicism in plants and clonal animals. Annu. Rev. Ecol. Syst. 26:423–444.
Groot EP, Laux T. 2016. Ageing: how do long-lived plants escape mutational meltdown? Curr. Biol. 26:530–532.
Haldane JBS. 1937. The effect of variation on fitness. Am. Nat. 71:337–349.
Hartmann H, Kester D. 1975. Plant propagation. Englewood Cliffs, New Jersey: Prentice Hall Inc.
Heiken A. 1960. Spontaneous and X-ray-induced somatic aberrations in Solanum tuberosum L. [dissertation]. Uppsala Universitet. 125.
Helmersson A, Jansson G, Bozhkov PV, Von Arnold S. 2008. Genetic variation in microsatellite stability of somatic embryo plants of Picea abies: a case study using six unrelated full-sib families. Scand. J. For. Res. 23:2–11.
Keightley PD. 1994. The distribution of mutation effects on viability in Drosophila melanogaster. Genetics 138:1315–1322.
Keightley PD, Ness RW, Halligan DL, Haddrill PR. 2014. Estimation of the spontaneous mutation rate per nucleotide site in a Drosophila melanogaster full-sib family. Genetics 196:313–320.
Keightley PD, Pinharanda A, Ness RW, Simpson F, Dasmahapatra KK, Mallet J, Davey JW, Jiggins CD. 2015. Estimation of the spontaneous mutation rate in Heliconius melpomene. Mol. Biol. Evol. 32:239–243.
Kidwell MG, Lisch D. 1997. Transposable elements as sources of variation in animals and plants. Proc. Natl. Acad. Sci. U.S.A. 94:7704–7711.
Kimura M. 1983. The neutral theory of molecular evolution. Cambridge University Press.
Klekowski EJJ. 1988. Genetic load and its causes in long-lived plants. Trees:195–203.
Klekowski EJJ. 1988. Mutation, developmental selection, and plant evolution. New York: Columbia University Press.
Klekowski EJJ. 2003. Plant clonality, mutation, diplontic selection and mutational meltdown. Biol. J. Linn. Soc. 79:61–67.
Klekowski EJJ, Godfrey P. 1989. Ageing and mutation in plants. Lett. to Nat. 340:389–391.
Klekowski EJJ, Kazarinova-Fukshansky N. 1984. Shoot apical meristems and mutation: selective loss of disadvantageous cell genotypes. Am. J. Bot. 71:28–34.
Kong A, Frigge ML, Masson G, Besenbacher S, Sulem P, Magnusson G, Gudjonsson SA, Sigurdsson A, Jonasdottir AA, Jonasdottir AA, et al. 2012. Rate of de novo mutations and the importance of father’s age to disease risk. Nature 488:471–475.
Kuchma O, Vornam B, Finkeldey R. 2011. Mutation rates in Scots pine (Pinus sylvestris L.) from the Chernobyl exclusion zone evaluated with amplified fragment-length polymorphisms (AFLPs) and microsatellite markers. Mutat. Res. 725:29–35.
De La Torre A, Li Z, Van de Peer Y, Ingvarsson P. 2017. Contrasting rates of molecular evolution and patterns of selection among gymnosperms and flowering plants. Mol. Biol. Evol. 34:1363–1377.
Lanfear R, Ho SYW, Davies TJ, Moles AT, Aarssen L, Swenson NG, Warman L, Zanne AE, Allen AP. 2013. Taller plants have lower rates of molecular evolution. Nat. Commun. 4:1879. doi:10.1038/ncomms2836.
Larson PR. 1994. The vascular cambium: development and structure. Berlin Heidelberg: Springer-Verlag.
Lercher MJ, Hurst LD. 2002. Human SNP variability and mutation rate are higher in regions of high recombination. Trends Genet. 18:337–340.
Lian C, Oishi R, Miyashita N, Hogetsu T. 2004. High somatic instability of a microsatellite locus in a clonal tree, Robinia pseudoacacia. Theor. Appl. Genet. 108:836–841.
Lindstrom EW. 1933. Hereditary radium-induced variations in the tomato. J. Hered. 24:129–137.
Little PJ, Richardson JS, Alila Y. 2013. Channel and landscape dynamics in the alluvial forest mosaic of the Carmanah River valley, British Columbia, Canada. Geomorphology 202:86–100.
Logan BA, Reblin JS, Zonana DM, Dunlavey RF, Hricko CR, Hall AW, Schmiege SC, Butschek RA, Duran KL, Emery RJN, et al. 2013. Impact of eastern dwarf mistletoe (Arceuthobium pusillum) on host white spruce (Picea glauca) development, growth and performance across multiple scales. Physiol. Plant. 147:502–513.
Lynch M. 2010a. Evolution of the mutation rate. Trends Genet. 26:345–352.
Lynch M. 2010b. Rate, molecular spectrum, and consequences of human mutation. Proc. Natl. Acad. Sci. U.S.A. 107:961–968.
Magni G. 1963. The origin of spontaneous mutations during meiosis. Proc. Natl. Acad. Sci. U.S.A. 50:2–7.
McGregor SE. 1976. Insect pollination of cultivated crop plants. USDA.
Mericle LW, Mericle RP. 1965. Biological discrimination of differences in natural background radiation level. Radiat. Bot. 5:475–492.
Murmanis L. 1970. Locating the initial in the vascular cambium of Pinus strobus L. by electron microscopy. Wood Sci. Technol. 4:1–14.
Naisbit RE, Jiggins CD, Mallet J. 2001. Disruptive sexual selection against hybrids contributes to speciation between Heliconius cydno and Heliconius melpomene. Proc. R. Soc. B Biol. Sci. 268:1849–1854.
Nei M, Maruyama T, Chakraborty R. 1975. The bottleneck effect and genetic variability in populations. Evolution. 29:1–10.
Newman IV. 1956. Pattern in meristems of vascular plants: 1. cell partition in living apices and in the cambial zone in relation to the concepts of initial cells and apical cells. Phytomorphology 6:1–19.
O’Connell LM, Ritland K. 2004. Somatic mutations at microsatellite loci in western redcedar (Thuja plicata: Cupressaceae). J. Hered. 95:172–176.
Orive ME. 2001. Somatic mutations in organisms with complex life histories. Theor. Popul. Biol. 59:235–249.
Ossowski S, Schneeberger K, Lucas-Lledó JI, Warthmann N, Clark RM, Shaw RG, Weigel D, Lynch M. 2010. The rate and molecular spectrum of spontaneous mutations in Arabidopsis thaliana. Science 327:92–94.
Otto SP, Hastings IM. 1998. Mutation and selection within the individual. Genetica 102/103:507–524.
Otto SP, Orive ME. 1995. Evolutionary consequences of mutation and selection within an individual. Genetics 141:1173–1187.
Padovan A, Keszei A, Foley WJ, Külheim C. 2013. Differences in gene expression within a striking phenotypic mosaic Eucalyptus tree that varies in susceptibility to herbivory. BMC Plant Biol. 13. doi:10.1186/1471-2229-13-29.
Pavy N, Pelgas B, Laroche J, Rigault P, Isabel N, Bousquet J. 2012. A spruce gene map infers ancient plant genome reshuffling and subsequent slow evolution in the gymnosperm lineage leading to extant conifers. BMC Biol. 10. doi:10.1186/1741-7007-10-84.
Petit RJ, Hampe A. 2006. Some evolutionary consequences of being a tree. Annu. Rev. Ecol. Evol. Syst. 37:187–214.
Phung TN, Huber CD, Lohmueller KE. 2016. Determining the effect of natural selection on linked neutral divergence across species. PLOS Genet. 12. doi:10.1371/journal.pgen.1006199.
Pineda-Krch M, Fagerstrom T. 1999. On the potential for evolutionary change in meristematic cell lineages through intraorganismal selection. J. Evol. Biol. 12:681–688.
Pla M, Jofré A, Martell M, Molinas M, Gómez J. 2000. Large accumulation of mRNA and DNA point modifications in a plant senescent tissue. FEBS Lett. 472:14–16.
Ranade SS, Ganea L-S, Razzak AM, Garcia Gil MR. 2014. Fungal infection increases the rate of somatic mutation in Scots pine (Pinus sylvestris L.). J. Hered.:1–9.
Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. 2011. Integrative genomics viewer. Nat. Biotechnol. 29:24–26.
Schmid-Siegert E, Sarkar N, Iseli C, Calderon S, Gouhier-Darimont C, Chrast J, Cattaneo P, Schütz F, Farinelli L, Pagni M, et al. 2017. Low number of fixed somatic mutations in a long-lived oak tree. Nat. Plants 3:926–929.
Schrider DR, Houle D, Lynch M, Hahn MW. 2013. Rates and genomic consequences of spontaneous mutational events in Drosophila melanogaster. Genetics 194:937–954.
Schultz ST, Scofield DG. 2011. Mutation accumulation in real branches: fitness assays for genomic deleterious mutation rate and effect in large-statured plants. Am. Nat. 174:163–175.
Scofield DG, Schultz ST. 2006. Mitosis, stature and evolution of plant mating systems: low-phi and high-phi plants. Proc. R. Soc. B Biol. Sci. 273:275–282.
Sedlazeck FJ, Rescheneder P, von Haeseler A. 2013. NextGenMap: fast and accurate read mapping in highly polymorphic genomes. Bioinformatics 29:2790–2791.
Seeley TD. 1978. Life history strategy of the honey bee, Apis mellifera. Oecologia 32:109–118.
Sharp NP, Agrawal AF. 2012. Evidence for elevated mutation rates in low-quality genotypes. Proc. Natl. Acad. Sci. U.S.A. 109:6142–6146.
Shields DC, Sharp PM, Higgins DG, Wright F. 1988. “Silent” sites in Drosophila genes are not neutral: evidence of selection among synonymous codons. Mol. Biol. Evol. 5:704–716.
Smith SA, Donoghue MJ. 2008. Rates of molecular evolution are linked to life history in flowering plants. Science. 322:86–89.
Suren H, Hodgins KA, Yeaman S, Nurkowski KA, Smets P, Rieseberg LH, Aitken SN, Holliday JA. 2016. Exome capture from the spruce and pine giga-genomes. Mol. Ecol. Resour. 16:1136–1146.
The Arabidopsis Genome Initiative. 2000. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796–815.
Thompson L. 1994. The spatiotemporal effects of nitrogen and litter on the population dynamics of Arabidopsis thaliana. J. Ecol. 82:63–68.
Uchimura A, Higuchi M, Minakuchi Y, Ohno M, Toyoda A, Fujiyama A, Miura I, Wakana S, Nishino J, Yagi T. 2015. Germline mutation rates and the long-term phenotypic effects of mutation accumulation in wild-type laboratory mice and mutator mice. Genome Res. 25:1125–1134.
Wang H, Moore MJ, Soltis PS, Bell CD, Brockington SF, Alexandre R, Davis CC, Latvis M, Manchester SR, Soltis DE. 2009. Rosid radiation and the rapid rise of angiosperm-dominated forests. Proc. Natl. Acad. Sci. U.S.A. 106:3853–3858.
Wang X-Q, Ran J-H. 2014. Evolution and biogeography of gymnosperms. Mol. Phylogenet. Evol. 75:24–40.
Wang Y, Tang H, Debarry JD, Tan X, Li J, Wang X, Lee TH, Jin H, Marler B, Guo H, et al. 2012. MCScanX: A toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 40. doi:10.1093/nar/gkr1293.
Warren RL, Keeling CI, Saint Yuen MM, Raymond A, Taylor GA, Vandervalk BP, Mohamadi H, Paulino D, Chiu R, Jackman SD, et al. 2015. Improved white spruce (Picea glauca) genome assemblies and annotation of large gene families of conifer terpenoid and phenolic defense metabolism. Plant J. 83:189–212.
Watson JM, Platzer A, Kazda A, Akimcheva S, Valuchova S, Nizhynska V, Nordborg M, Riha K. 2016. Germline replications and somatic mutation accumulation are independent of vegetative life span in Arabidopsis. Proc. Natl. Acad. Sci. U.S.A. 113:12226–12231.
Wegrzyn JL, Liechty JD, Stevens KA, Wu LS, Loopstra CA, Vasquez-Gross HA, Dougherty WM, Lin BY, Zieve JJ, Martínez-García PJ, et al. 2014. Unique features of the loblolly pine (Pinus taeda L.) megagenome revealed through sequence annotation. Genetics 196:891–909.
Whitham TG, Slobodchikoff CN. 1981. Evolution by individuals, plant-herbivore interactions, and mosaics of genetic variability: the adaptive significance of somatic mutations in plants. Oecologia 49:287–292.
Willyard A, Syring J, Gernandt DS, Liston A, Cronn R. 2006. Fossil calibration of molecular divergence infers a moderate mutation rate and recent radiations for Pinus. Mol. Biol. Evol. 24:90–101.
Xie Z, Wang L, Wang L, Wang Z, Lu Z, Tian D, Yang S, Hurst L. 2016. Mutation rate analysis via parent-progeny sequencing of the perennial peach. I. a low rate in woody perennials and a higher mutagenicity in hybrids. Proc. R. Soc. B Biol. Sci. 283:1–9.
Yang S, Wang L, Huang J, Zhang X, Yuan Y, Chen J-Q, Hurst LD, Tian D. 2015. Parent–progeny sequencing indicates higher mutation rates in heterozygotes. Nature 523:463–467.
Yeaman S. 2015. Local adaptation by alleles of small effect. Am. Nat. 186:574–589.
Zeng J, Zou Y-P, Bai J-Y, Zheng H-S. 2002. Preparation of total DNA from “recalcitrant plant taxa.” Acta Bot. Sinica 44:694–697.

Appendices

Appendix A

Table A.1 Data and references for Figure 2.3

Species | µ per gen. | µ per year | Estimated age (years) | Notes | References
Heliconius melpomene | 2.90 x 10^-9 | 1.16 x 10^-8 | 0.25 | Assumes 6-month lifespan and continual reproduction | (Boggs 1979, cited in Naisbit et al. 2001; Keightley et al. 2015)
Mus musculus | 5.40 x 10^-9 | 2.16 x 10^-8 | 0.25, 1.0 | Average generation time | (e.g. Castiglia and Capanna 1999; e.g. Geraldes et al. 2008; Uchimura et al. 2015)
Drosophila melanogaster | 5.49 x 10^-9 | 5.49 x 10^-8 | 0.1, 0.25 | Average generation time | (e.g. Nei et al. 1975; e.g. Keightley 1994; Schrider et al. 2013)
Apis mellifera | 6.80 x 10^-9 | 4.53 x 10^-9 | 1.5 | Assumes 3-year lifespan and yearly swarming | (Seeley 1978; Yang et al. 2015)
Arabidopsis thaliana | 7.00 x 10^-9 | 1.40 x 10^-8 | 0.5, 1.3 | Average generation time | (Thompson 1994; Falahati-Anbaran et al. 2014; Ossowski et al. 2010)
Oryza sativa | 7.10 x 10^-9 | 1.42 x 10^-8 | 0.9 | Harvest frequency | (Davis et al. 2016; Yang et al. 2015)
Prunus mira | 9.48 x 10^-9 | 4.74 x 10^-11 | 200 | | (Xie et al. 2016)
Pan troglodytes | 1.20 x 10^-8 | 5.00 x 10^-10 | 24 | | (Venn et al. 2014)
Homo sapiens | 1.20 x 10^-8 | 4.04 x 10^-10 | 29.7 | | (Kong et al. 2012)
Picea sitchensis | 2.65 x 10^-8 | 7.36 x 10^-11 | 360 | Age estimated from population dendrochronology data | (Little et al. 2013)
Quercus robur | 4.25 x 10^-8 | 1.82 x 10^-10 | 234 | | (Schmid-Siegert et al. 2017)

Appendix B

Table B.1 Geographic coordinates of trees sampled. The accuracy of these GPS coordinates is questionable, since they were taken under a large forest canopy and because some of the trees were close together. In some cases identical GPS coordinates are given for multiple trees. The trees are best identified by photographs, which can be provided upon request.
Latitude (° N) | Longitude (° W)
48.6627 | 124.6955
48.6619 | 124.6954
48.6625 | 124.6918
48.6627 | 124.6916
48.6629 | 124.6911
48.6629 | 124.6911
48.6629 | 124.6911
48.6630 | 124.6910
48.6634 | 124.6907
48.6634 | 124.6907
48.6634 | 124.6907
48.6632 | 124.6909
48.6634 | 124.6907
48.6634 | 124.6907
48.6632 | 124.6909
48.6557 | 124.6951
48.6559 | 124.6948
48.6559 | 124.6948
48.6560 | 124.6951
48.6557 | 124.6951

Appendix C

The heterozygosity correction to the size of the search space is derived from the observed decrease in heterozygosity caused by the genotype-level filters. If these filters discard a proportion $p$ of homozygous genotypes and a proportion $p + q$ of heterozygous genotypes (where a genotype is specific to a particular locus and sample from a tree), and if all four genotypes for any locus in a tree must pass all filters if the locus itself is to pass filters, then $(1 - p)^4$ of 4-sample homozygous loci and $(1 - p - q)^4$ of 4-sample heterozygous loci would pass filters. Both quantities can be directly estimated from the numbers of heterozygous and homozygous consensus genotypes (i.e. all four samples for a tree agree) before and after filtering, so $p$ and $q$ can be calculated. Since mutations consist of two heterozygous and two homozygous genotypes, $(1 - p)^2 (1 - p - q)^2$ of them pass filters.

Let $n_{het}$ and $n_{homo}$ be the known number of heterozygotes and homozygotes, respectively, among the 4-sample genotypes that are checked for mutations (i.e. that have passed all filters), and let $n'_{het}$ and $n'_{homo}$ be the same among the genotypes that have passed site-level filters but not yet genotype-level filters. Then,

\[
\sigma = n_{het}\,\frac{(1 - p)^2 (1 - p - q)^2}{(1 - p - q)^4} + n_{homo}\,\frac{(1 - p)^2 (1 - p - q)^2}{(1 - p)^4}
\]

is the size of the search space that would have resulted if all loci passed filters at the rate mutations do. Substituting expressions for $p$ and $q$ in terms of the four known quantities, we obtain

\[
\sigma = (n_{het} + n_{homo})\,h
\]

where

\[
h = \frac{x\,n_{het} + x^{-1}\,n_{homo}}{n_{het} + n_{homo}}
\]

and

\[
x = \sqrt{\frac{n_{homo}\,n'_{het}}{n_{het}\,n'_{homo}}}
\]

And if we restrict $n_{het}$ and $n_{homo}$ (and $n'_{het}$ and $n'_{homo}$) to genotypes contained in either low-MAF or high-MAF sites, then we obtain $h = h_{low}$ and $h = h_{high}$, respectively.
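The correction in Appendix C can be sketched in a few lines of code. The following is only an illustration under stated assumptions, not the analysis script used for this thesis: the function names and the example counts are hypothetical, and the only formulas used are the Appendix C definitions, namely x = sqrt(n_homo n'_het / (n_het n'_homo)), h = (x n_het + n_homo / x) / (n_het + n_homo) and σ = (n_het + n_homo) h. The second function recomputes σ from the estimated pass rates (1 − p)^4 and (1 − p − q)^4 as a consistency check on the algebra.

```python
from math import sqrt


def heterozygosity_correction(n_het, n_homo, n_het_pre, n_homo_pre):
    """Return (x, h, sigma) following the Appendix C definitions.

    n_het, n_homo         -- 4-sample loci that passed all filters
    n_het_pre, n_homo_pre -- the same loci counted before the
                             genotype-level filters were applied
    """
    x = sqrt((n_homo * n_het_pre) / (n_het * n_homo_pre))
    h = (x * n_het + n_homo / x) / (n_het + n_homo)
    sigma = (n_het + n_homo) * h
    return x, h, sigma


def sigma_from_pass_rates(n_het, n_homo, n_het_pre, n_homo_pre):
    """Recompute sigma from the estimated per-locus pass rates."""
    hom_pass = n_homo / n_homo_pre        # estimates (1 - p)^4
    het_pass = n_het / n_het_pre          # estimates (1 - p - q)^4
    mut_pass = sqrt(hom_pass * het_pass)  # (1 - p)^2 (1 - p - q)^2
    return n_het * mut_pass / het_pass + n_homo * mut_pass / hom_pass


# Hypothetical counts, chosen only to exercise the formulas.
x, h, sigma = heterozygosity_correction(1_000, 100_000, 2_000, 125_000)
check = sigma_from_pass_rates(1_000, 100_000, 2_000, 125_000)
print(f"x = {x:.4f}, h = {h:.4f}, sigma = {sigma:.1f}, check = {check:.1f}")
```

With these made-up counts, heterozygous loci pass the genotype-level filters less often than homozygous loci, so x exceeds 1 and the corrected search space σ is smaller than the raw number of loci. When both classes pass at the same rate (q = 0), x and h both equal 1 and no correction is applied.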
