UBC Theses and Dissertations

The landscape of divergence in silverleaf sunflowers Moyers, Brooke Taylor 2015

THE LANDSCAPE OF DIVERGENCE IN SILVERLEAF SUNFLOWERS

by

Brooke Taylor Moyers

B.A., Reed College, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES

(Botany)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

May 2015

© Brooke Taylor Moyers, 2015

Abstract

The Texas endemic silverleaf sunflower, Helianthus argophyllus, exhibits striking genetic variation in life history: some individuals flower in late summer and are relatively short, while other individuals delay flowering in favor of growth until fall. The central goal of this research is to identify and characterize the evolutionary drivers of this variation: either local adaptation under divergent natural selection, neutral phenotypic divergence resulting from reduced gene flow and subsequent genetic drift, or both. Helianthus argophyllus exhibits strong regional genetic structure. However, populations from the central area of the species range form a single genetic cluster but are split into two phenotypic clusters: mainland coast populations, which are primarily tall and late flowering, and barrier island populations, which contain short/early flowering and tall/later flowering individuals at roughly equal frequencies. Some traits, including floral size characters, are more differentiated across the species range than is expected based on neutral genetic divergence (QST > FST), a signal of local adaptation. In a reciprocal transplant experiment, barrier island plants had higher survival rates and overall fitness than non-local individuals at barrier island sites. Observations of selection in wild populations revealed directional selection for early flowering in barrier island populations that contrasts with selection for a later flowering optimum in mainland coast populations. Collectively, these analyses support a hypothesis of adaptive divergence in flowering time in H.
argophyllus, although the ecological mechanism(s) and genetic basis of this divergence have yet to be explored.

Preface

I alone designed, carried out, and analyzed this research. There are a few exceptions: (A) Sébastien Renaut and I worked together to prepare and analyze the transcriptome data from which I designed the SNP panels used for genotyping in Chapter 2, and (B) under my direction Chris Grassa wrote the perl and UNIX scripts used to filter and randomize the sites chosen for the SNP panels in Chapter 2. My advisor Loren Rieseberg provided advice and feedback throughout, and I supervised numerous field, greenhouse, and laboratory research assistants who helped with plant cultivation, sample preparation, and data collection.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication
Chapter 1: Introduction
  1.1 Motivation
  1.2 Background
  1.3 Variation in reproductive timing in plants
  1.4 Natural history
  1.5 Research questions
Chapter 2: Genetic variation and population structure
  2.1 Introduction
  2.2 Methods
    2.2.1 Collections
    2.2.2 Genotyping
    2.2.3 Population structure
    2.2.4 Relatedness
  2.3 Results
    2.3.1 Population structure in H. argophyllus
    2.3.2 Population structure and gene flow among all Helianthus populations
    2.3.3 Relatedness
  2.4 Discussion
    2.4.1 Helianthus argophyllus and sympatric congeners
    2.4.2 Helianthus argophyllus expatriates
    2.4.3 Relatedness among family members
Chapter 3: Quantitative genetics and QST versus FST
  3.1 Introduction
  3.2 Methods
    3.2.1 Common garden
    3.2.2 Trait analyses
    3.2.3 Observation of wild populations
    3.2.4 Quantitative genetic analyses
  3.3 Results
    3.3.1 Phenotypic variation in the common garden
    3.3.2 Additive genetic variation and heritabilities
    3.3.3 QST versus FST
    3.3.4 Observations in wild populations
  3.4 Discussion
    3.4.1 Trait variation
    3.4.2 Quantitative genetics and QST versus FST
    3.4.3 Expatriate H. argophyllus
Chapter 4: Reciprocal transplant
  4.1 Introduction
  4.2 Methods
    4.2.1 Field experiment
    4.2.2 Selection analyses
      Viability selection
        Kaplan-Meier estimator
        Cox proportional hazards
        Parametric survival regression
      Fertility selection and selection gradients
      Aster modeling
  4.3 Results
    4.3.1 Viability selection
    4.3.2 Fertility selection and selection gradients
    4.3.3 Aster modelling
  4.4 Discussion
Chapter 5: Selection in wild populations
  5.1 Introduction
  5.2 Methods
    5.2.1 Data collection
    5.2.2 Data analyses
  5.3 Results
  5.4 Discussion
Chapter 6: Conclusion
  6.1 Discussion
  6.2 Strengths and limitations
  6.3 Future directions
References
Appendix

List of Tables

Table 2.1 Populations used in this dissertation
Table 2.2 Single nucleotide polymorphisms genotyped in this study
Table 2.3 Hierarchical F-statistics for Helianthus argophyllus populations at four levels
Table 2.4 Tracy-Widom statistics for principal components of genetic variance in H. argophyllus
Table 2.5 Hierarchical F-statistics for all Helianthus populations at five levels
Table 2.6 Tracy-Widom statistics for principal components of genetic variance among all Helianthus individuals
Table 3.1 Variance component and heritability estimates of traits and principal components of quantitative traits
Table 3.2 Regional quantitative trait differentiation (QST) for traits and principal components of quantitative traits
Table 4.1 Model significance and terms for survival functions by region of origin at each reciprocal transplant site
Table 4.2 Estimated standardized selection gradients for reciprocal transplant
Table 5.1 Per-site summary of trait values in seven wild populations
Table 5.2 Estimated standardized selection gradients in wild populations

List of Figures

Figure 1.1 Helianthus argophyllus growing at the UBC Farm in early September 2009
Figure 1.2 Typical Helianthus argophyllus reproductive and vegetative structures.
Figure 1.3 Ranges of four annual sunflowers in Texas
Figure 2.1 Map of South Texas Helianthus collections
Figure 2.3 Sample Sequenom® iPLEX® Gold SNP genotyping data
Figure 2.4 Helianthus argophyllus regional genetic differentiation versus observed heterozygosity
Figure 2.5 STRUCTURE barplots for H. argophyllus populations only
Figure 2.6 Population structure across the range of H. argophyllus
Figure 2.7 Genetic principal components of H. argophyllus populations
Figure 2.8 Isolation by distance among H. argophyllus Texas populations and regions
Figure 2.9 STRUCTURE barplots for all Helianthus populations in this study
Figure 2.10 Genetic principal components of all Helianthus populations
Figure 2.11 Violin and boxplot distributions of relatedness
Figure 3.1 Time-lapse camera and census observation sites of wild populations
Figure 3.2 Trait distributions for six floral morphology traits by region
Figure 3.3 Trait distributions for four bimodally-distributed traits by region
Figure 3.4 Trait distributions for two morphological and two functional traits by region
Figure 3.5 Loading of quantitative traits on the first two principal components
Figure 3.6 The first two principal components of quantitative trait variation
Figure 3.7 Geographic pattern of quantitative trait syndrome
Figure 3.8 Mosaic plot comparing regions by branch flower initiation
Figure 3.9 Branch initiation of flowering in a wild individual in coastal South Texas.
Figure 3.10 Mosaic plots comparing regions by stem color (left) and stem hairiness (right)
Figure 3.11 Example stem sections for each stem hairiness index
Figure 3.12 Means and 95% confidence intervals for FST and QST of traits
Figure 3.13 Observed first day of flowering in South Texas wild populations in 2010
Figure 3.14 Rare phenotype of yellow disc floret lobes
Figure 4.1 Map of reciprocal transplant sites in coastal South Texas
Figure 4.2 Kaplan-Meier and log-logistic regression estimates of the overall survival functions for each site
Figure 4.3 Mosaic plot comparing site to cause of death
Figure 4.4 Cox proportional hazards estimated survival functions for different regions at each site
Figure 4.5 Kaplan-Meier and log-logistic regression estimates of the survival functions for different regions at each site
Figure 4.6 Trait space occupied by surviving individuals at the two island sites
Figure 4.7 Contour plots of two dimensional fitness landscapes
Figure 4.8 Fitness landscapes for individual traits
Figure 4.9 Expected flowerhead number for an average individual from each region of origin at the two island sites
Figure 5.1 Wild population sites chosen for observation of natural selection
Figure 5.2 Pearson’s product-moment correlation coefficient for flowering time and height at flowering versus the percent of surviving individuals at each site
Figure 5.3 Trait space occupied by wild individuals in each region
Figure 5.4 Tensor product smooth-based contour plots of the GAM function relating flowerhead number to flowering time and height at flowering in the Island and Coast regions
Figure 5.5 The GAM smooth functions relating traits to flowerhead number in the Inland region
Figure 5.6 Estimated fitness functions for each trait at each region
Figure 5.7 Fitness landscape contour plots for each region

List of Abbreviations

ANOVA  analysis of variance
cM  centimorgan
DNA  deoxyribonucleic acid
GAM  generalized additive model
LG  linkage group, also known as a chromosome
MAF  minor allele frequency
MCMC  Markov chain Monte Carlo
PC(A)  principal component (analysis)
PCR  polymerase chain reaction
QTL  quantitative trait locus/loci
SNP  single nucleotide polymorphism
USDA  United States Department of Agriculture

Acknowledgements

I am deeply indebted to my supervisor Loren Rieseberg for his support, advice, and patience throughout my graduate career.
The members of my committee (Jeannette Whitton, Dolph Schluter, Amy Angert, and at one time Mark Vellend) have been very generous with their time and expertise. Darren Irwin provided valuable comments on my dissertation proposal.

The research described here would not have been possible without material and logistical support from the following institutions and people: Laura Marek and many of her colleagues at the USDA; the UBC Farm, particularly Andrew Rushmere and Tim Carter; Tom Juenger at the University of Texas at Austin; John Crutchfield at Brackenridge Field Lab; Veril Barr, Steven Lanoux, and Ed Buskey at the University of Texas Marine Science Institute; Terry Blankenship at the Welder Wildlife Foundation; Glenn Cole at Dupont Pioneer; and a number of Texans who generously allowed me to use their land: Karen and Donna, Helen, Margaret and Jerry, and the managers at Island Moorings Marina in Port Aransas and the Mustang Island Conference Center.

This work would also not have been possible without a vast army of volunteers and research assistants, especially: Natalie Walker, Maxime Lepine, Rebecca Seifert, Abi Johnson, Audrey Kelly, Richie Sinclair, and Alexandra Paquet.

I have learned most of what I know about molecular ecology from my fellow Rieseberglars, including: Kathryn Turner, Kate Ostevik, Greg Baute, Greg Owens, Chris Grassa, Nolan Kane, Rose Andrew, Dan Ebert, Sébastien Renaut, Marco Todesco, Hannes Dempewolf, and honorary Rieseberglar Kieran Samuk. My roommate Lea Dunn is probably responsible for my continued sanity as a graduate student, and my parents are to thank for getting me here in the first place.

Funding for the research was provided by Genome Canada, Genome BC, and the Canadian Natural Sciences and Engineering Research Council. My graduate career was supported in part by a UBC Four Year Fellowship and a US National Science Foundation Graduate Research Fellowship.
Dedication

This thesis is dedicated to genetic inheritance, differential fitness, random error, and time, without which none of us would be here to read it.

and

This thesis is also dedicated to the R Platform for Statistical Computing and other open source analysis tools, without which there would be much less to read.

Chapter 1: Introduction

1.1 Motivation

Figure 1.1 Two populations of Helianthus argophyllus growing at the UBC Farm in early September 2009. The population on the left is from inland South Texas, and the population on the right is from the coast.

This research was motivated by a single observation: during a collaborative survey of sunflower biomass production in 2009, I was struck by the strongly bimodal distribution of flowering time in four populations of Helianthus argophyllus. Plants from two populations on the Texas coast uniformly began flowering in late July, while plants from the two populations from inland Texas delayed flowering and continued growth until at least late September (Figure 1.1). In an annual plant species with a relatively restricted latitudinal range, this degree of variation in the timing of reproduction is surprising. As an evolutionary biologist, I was intrigued.

1.2 Background

Understanding how and why trait variation arises and is maintained in natural populations is fundamental to evolutionary biology. Phenotypic variation is the basis upon which all evolutionary change depends, and can have far-reaching ecological consequences (e.g. Bolnick et al. 2011). With an understanding of trait variation, we can begin to explain patterns of biological diversity and the history of life on earth, and even make predictions about how living organisms will respond to specific environmental changes. Applications for the study of trait variation range from the clinical and medical sciences to conservation biology, and even encompass the selective improvement of domesticated species.
Classically, natural variation originates with a mutation that arises within a single individual. This mutation spreads or is lost within a species through some combination of natural selection, gene flow, and genetic drift (Dobzhansky 1937; Grant 1963). Under these processes, phenotypic variation that aligns with environmental variation is most often attributed to local adaptation, the response to natural selection favoring different trait values in different environments. A long history of study, particularly in plants, has demonstrated that this pattern of high fitness of local individuals relative to foreign individuals occurs frequently (Turesson 2010; Kawecki & Ebert 2004; Leimu & Fischer 2008).

The process of local adaptation via divergent natural selection can be opposed by gene flow, which when high acts to homogenize variation among populations (Slatkin 1987). If locally adapted alleles are continually swamped out by the influx of non-adaptive alleles from other populations, then population mean fitness will be reduced (e.g. Stanton et al. 1997). On the other hand, low levels of gene flow can result in a pattern of phenotypic differentiation among populations via the effects of genetic drift (Garant et al. 2007). In this case, populations may diverge both genetically and phenotypically across a species range, even in the absence of divergent selection, as local allele frequencies change through repeated random subsampling or genetic drift. Wright (1951) described this pattern as ‘isolation by distance’, and our expectations of the history and future of populations isolated by distance differ from our expectations of locally adapted populations.
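The idea that allele frequencies drift apart through repeated random subsampling can be illustrated with a minimal Wright-Fisher sketch. This is an illustration only and not an analysis from this thesis: the function name `wright_fisher`, the population size, the number of generations, and the starting frequency are all assumptions chosen for the example, and the model deliberately omits selection, mutation, and gene flow.

```python
import random


def wright_fisher(p0, n_diploid, generations, rng):
    """Track one allele's frequency under drift alone.

    Hypothetical helper for illustration: no selection, mutation, or migration.
    """
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each generation, 2N gene copies are sampled at random
        # from the allele frequency of the previous generation.
        copies = sum(1 for _ in range(2 * n_diploid) if rng.random() < p)
        p = copies / (2 * n_diploid)
        trajectory.append(p)
    return trajectory


if __name__ == "__main__":
    rng = random.Random(1)
    # Five isolated populations that all start at the same allele frequency.
    finals = [wright_fisher(0.5, 50, 100, rng)[-1] for _ in range(5)]
    print(finals)
```

Running the sketch with several isolated populations that start at the same frequency typically gives final frequencies that differ among populations, which is the sense in which drift alone, without divergent selection, can produce differentiation. Smaller values of `n_diploid` make the random changes per generation larger.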
Distinguishing between patterns of variation driven by divergent natural selection and those produced following the reduction of gene flow is therefore essential to understanding trait variation, although in some cases both processes may act in tandem if adaptation under divergent natural selection reduces gene flow (‘isolation by adaptation’; Nosil et al. 2008).

Beyond divergent natural selection and genetic drift, phenotypic plasticity, or the ability of an organism to respond to environmental variation, can influence phenotypic divergence among populations. Phenotypic plasticity can create a pattern of trait variation among populations in variable environments in the absence of genetic divergence, and can slow adaptive genetic divergence when individuals are able to modify their phenotypes in response to new environments (Dickerson 1962; Crispo 2008). Phenotypic plasticity can further influence the expression of genetic covariance among quantitative traits, such that selection that acts to increase one trait might indirectly act to increase a correlated trait in one environment, and indirectly act to decrease the same correlated trait in another environment (Stearns 1989). Finally, the expression of phenotypic plasticity can itself evolve if the relationship between environment and phenotype, the reaction norm, is heritable and variable (Via & Lande 1985; Schlichting 1986).

1.3 Variation in reproductive timing in plants

For an annual plant, correctly timing the transition from vegetative growth to reproduction is critically important. Plants mate assortatively by flowering time (e.g. Lennartsson 1997; Weis & Kossler 2004), so a plant flowering out of sync will have fewer mating opportunities. The timing of flowering also determines the environment experienced during fertilization and embryonic development. In a meta-analysis, Munguía-Rosas et al.
(2011) show that selection on flowering time is relatively strong and tends to favor earlier flowering, especially in temperate environments. Theory supports these observations: early flowering tends to be favored when pre-reproductive mortality rates are moderate to high (Kozłowski 2003), which is often the case for plants. Some researchers have additionally hypothesized an intrinsic cost to later flowering, positing mechanisms such as meristem limitation (Kudoh et al. 2002). This pattern is reflected at higher taxonomic scales: in a phylogenetic analysis of plant developmental timing, Li and Johnston (2000) found that rate changes almost always involve earlier termination (e.g. earlier flowering) in descendant species. There are, unsurprisingly, exceptions: for one example, see Pilson (2000) for evidence of selection for later flowering by seed predators in Helianthus annuus. Further, there are extrinsic and intrinsic constraints on how early a plant can flower, and so the optimal strategy is likely to be some balance among all of these factors (Forrest & Miller-Rushing 2010).

One of the most prevalent patterns in flowering time variation is latitudinal divergence, which has been linked to divergent selection in many systems. This pattern is well characterized in the genetic model plant Arabidopsis thaliana (Caicedo et al. 2004), in the wild beet Beta vulgaris subsp. maritima (Dijk et al. 1997), and in two sunflower species, Helianthus annuus (Blackman et al. 2011) and H. maximiliani (Kawakami et al. 2011), among many others. In these systems, which generally span a wide latitudinal range, populations are locally adapted to flower during the appropriate season, using cues of temperature or photoperiod that are regulated by genetic mechanisms. Many of the genetic mechanisms that have been identified in latitudinal flowering clines involve the same regulatory networks, even in distantly related species.
Additionally, the establishment of latitudinal clines in flowering has occurred repeatedly in some systems, including Helianthus (Henry et al. 2014).

Another common pattern of flowering time divergence occurs over water availability gradients (e.g. in Mimulus guttatus, Hall & Willis 2006; Chaetanthera moenchoides, Bull-Hereñu & Arroyo 2009). Plants in drier environments are often under selection to flower earlier to complete reproduction before the driest period of the year, while plants in more temperate environments often delay flowering in favor of increased vegetative growth. Biotic interactions can also drive divergence in flowering time, e.g. through variation in pollinator or herbivore presence or preferences (Sandring et al. 2007; Sandring & Ågren 2009). A review of the literature suggests that pollinators tend to select for earlier flowering, while seed predators tend to favor later flowering, but these trends can vary strongly between systems and local environments (Elzinga et al. 2007).

The timing of flowering can evolve rapidly, although some analyses suggest that flowering time may be relatively phylogenetically constrained, at least at the level of family (Kochmer & Handel 1986). Under artificial disruptive selection, Murty et al. (1972) demonstrated a dramatic response in flowering time in the crop species Brassica campestris in only six generations. Rapid evolution of flowering time occurred over only two generations in wild California Brassica rapa in response to severe drought (Franke et al. 2006; Franks et al. 2007). Less dramatically, but still rapidly on evolutionary timescales, several invasive species, including Lythrum salicaria (Montague et al. 2007; Colautti & Barrett 2013) and Microstegium vimineum (Novy et al. 2012), have established latitudinal clines in flowering time during invasive range expansion in the few centuries since their introduction, despite having relatively low initial genetic variation.
Natural selection is not required for the divergence of flowering time in natural populations. Devaux and Lande (2008) show that it is possible to evolve distinct clusters of flowering phenology in the absence of selection, especially in small populations. Clusters are more likely to form when there is high mutational variance of flowering time, short individual flowering duration, or long flowering seasons, and cluster formation is hindered by inbreeding depression, stabilizing selection on flowering time, and pollinator limitation (Devaux & Lande 2008). Some authors even argue that the timing of flowering is largely selectively neutral, and that most of the variation we observe is due to drift (Ollerton & Lack 1992; but see Munguía-Rosas et al. 2011). Plants also exhibit phenotypic plasticity in phenology, varying flowering time in response to developmental cues, size, photoperiod, temperature, resource availability, exposure to winter temperatures (vernalization), and other factors (Forrest & Miller-Rushing 2010; Pigliucci & Schlichting 1998). Phenotypic plasticity can be a useful response when the optimal strategy varies over time or space, or when gene flow limits the degree of local adaptation (Levin 2009; de Jong 2005). Because of this, both phenotypic plasticity and adaptive evolution can underlie adaptive changes in flowering time, as is the case for Boechera stricta (Anderson et al. 2010; Anderson et al. 2012). Regardless of the primary evolutionary process driving divergence, flowering time variation has implications for the future trajectory of a species. Plants mate assortatively by flowering time, and gene flow is important in maintaining cohesion among members of a species (Coyne & Orr 2004). The reduction of gene flow between populations that flower at different times provides opportunities for further divergence and potentially speciation (Baker 1959; Grant 1981; Rieseberg & Willis 2007; Stam 1983). 
Indeed, phenology appears to provide a relatively strong barrier in many recently diverged plant systems (Lowry et al. 2008). These implications are why this research focuses on flowering time variation in Helianthus argophyllus, which also varies in other characters including size at maturity (Figure 1.1).

Wild populations of H. argophyllus have never been the focus of intensive study, despite a long history of ecological, systematic, and genetic research in the genus (Heiser 1948). Yet the species is remarkable in several ways: later flowering individuals flower much later than most other annual sunflowers (although see Moyers & Rieseberg 2013), and the observed dramatic flowering time variation occurs not along a large latitudinal gradient but over a scale of tens of kilometers. This scale of reproductive variation is striking in an outcrossing species with a relatively small range and small effective population size (Heiser et al. 1969; Strasburg et al. 2011). Further, the direction of flowering time variation along a moisture availability cline appears to contrast with most previous studies (e.g. Mimulus guttatus; Hall & Willis 2006): later flowering in drier environments. As such, this system has the potential to provide novel information about how flowering time evolves in wild species, and more broadly to inform theory on the maintenance of intraspecific variation in reproductive characters.

1.4 Natural history

Figure 1.2 Typical Helianthus argophyllus reproductive and vegetative structures.

Helianthus argophyllus Torrey & Gray is an annual sunflower native to sandy soils along the southern Gulf coast of Texas and up to 200 km inland (Figure 1.3), and introduced or naturalized in several other locations globally (Heiser 1951; Heiser et al. 1969). The species is colloquially known as the silverleaf sunflower due to the dense, silky pubescence of the leaves (Figure 1.2). 
Helianthus argophyllus is an obligate outcrosser that produces substantial woody vascular tissue (Ziebell et al. 2013), grows relatively tall (1–5 m, Heiser et al. 1969), and is comparatively drought- and salt-tolerant (Heiser 1951; Jamaux et al. 1997; Rauf 2008; although see Richards 1992). As H. argophyllus is sister to and partially reproductively compatible with the domesticated sunflower, H. annuus, it is a potential donor of agronomically important genotypes (Belhassen et al. 1994; Timme et al. 2007). Indeed, alleles for downy mildew resistance (Wieckhorst et al. 2010) and cytoplasmic male sterility (Christov 1990) from H. argophyllus have already been used in sunflower crop breeding. Sympatric with H. argophyllus are subspecies of three other species of annual sunflower. Broadly sympatric, although generally growing on more mesic soils, are wild populations of H. annuus, primarily the subspecies H. annuus subsp. texanus but also possibly H. annuus subsp. annuus (Heiser 1954; Figure 1.3). A more distantly related species pair splits the range of H. argophyllus between them: the cucumberleaf sunflower, H. debilis subsp. cucumerifolius, primarily overlaps with the northern and inland range, while Runyon’s sunflower, H. praecox subsp. runyonii, is sympatric with southern and coastal populations (Heiser et al. 1969; Figure 1.3). Note that throughout this document I occasionally refer to each subspecies by only the specific epithet. The flowering season for Helianthus argophyllus begins later than that for all three sympatric annual Helianthus species, although early flowering H. argophyllus individuals still have significant overlap in flowering period with the other species, especially H. annuus (Heiser et al. 1969; Blackman et al. 2011). In addition to differences in flowering, H. argophyllus exhibits strong intrinsic post-mating reproductive barriers with H. debilis and H. praecox (Heiser et al. 1962), although similarly strong barriers have not prevented two other annual Helianthus species from forming three independent hybrid species (Rieseberg 1997).

Figure 1.3 Ranges of four annual sunflowers in Texas. Helianthus annuus = blue, H. argophyllus = red, H. debilis subsp. cucumerifolius = green, and H. praecox subsp. runyonii = green. Adapted from Heiser et al. 1969 and Rogers et al. 1982.

1.5 Research questions

The fundamental question driving this research is: what are the causes of flowering time variation in Helianthus argophyllus?

The two strongest hypotheses are: (1) local adaptation in response to divergent ecological selection, or (2) isolation by distance, where phenotypic divergence is the result of neutral genetic divergence caused by genetic drift and low levels of gene flow. It is also possible that flowering time variation could be the result of geographically heterogeneous introgression from sympatric species. These hypotheses are not mutually exclusive, especially as divergent selection on flowering time would reduce opportunities for gene flow, a pattern that has been called isolation by adaptation (Nosil et al. 2008). Further, divergent selection may not be acting directly on flowering time, but instead indirectly through selection on some correlated phenotype, and geographic variation in flowering time could represent phenotypic plasticity. To explore the evidence for these hypotheses, I conducted the following studies:

• In Chapter 2, I examine population structure in H. argophyllus, and ask whether the partitioning of genetic variation is evidence for isolation by distance. In this chapter I also explore the evidence for contemporary gene flow between H. argophyllus and sympatric annual Helianthus species, and calculate the mean pairwise relatedness of seeds from the same open-pollinated maternal plant.

• In Chapter 3, I describe a common garden experiment where I characterized the geographic pattern of quantitative trait variation in H. argophyllus. I report covariation among and heritabilities and coefficients of additive genetic variation for a set of life history, physiological, and morphological traits. I compare quantitative trait differentiation (QST) to neutral genetic differentiation (FST) to look for evidence of divergent selection on those traits.

• In Chapter 4, I report the results of a reciprocal transplant that experimentally tested for local adaptation in H. argophyllus. I examine viability selection, fertility selection, and their cumulative effects using aster models. I also estimate selection gradients and fitness functions for two traits and their interaction: age and size at flowering.

• In Chapter 5, I further examine selection on age and size at flowering in wild populations of H. argophyllus across three geographic regions during a single flowering season. I estimate selection gradients and fitness functions for these traits and their interaction, and compare selection across geographic regions. I ask whether the patterns of selection are consistent with the observed geographic distribution of phenotypic variation for those traits.

In the final chapter I synthesize the results from these four studies to come to some conclusions about the evolutionary drivers of flowering time variation in H. argophyllus. I discuss the strengths and limitations of this research, and explore avenues for future research.

Chapter 2: Genetic variation and population structure

2.1 Introduction

To assess the relative influence of selection and geography on phenotypic variation in Helianthus argophyllus, I must first characterize the geographic partitioning of genetic variation, or population structure. 
Population structure, defined most broadly, is a pattern of organization of individuals into groups that share more recent genetic ancestry with each other than with members of other such groups. This pattern is common in natural populations: most species have larger ranges than individual members of that species, and often mating primarily occurs between individuals in the same locality (Kimura & Weiss 1964). Over time, allele frequencies in each group of intermating individuals can change due to local differences in natural selection, genetic drift, mutation, or even introgression with closely related species. In the absence of significant gene flow, groups will evolve to have different allele frequencies, even at loci that are not under divergent selection. This pattern creates a reduction of heterozygosity across all individuals in a species, even if the groups themselves are in Hardy–Weinberg equilibrium (the Wahlund effect; Sinnock 1975). More importantly, at least for the purpose of this research, this pattern can create local differences in phenotypic traits, even in the absence of geographically divergent natural selection. Neutral traits, those that do not affect individual fitness, will change as the genetic loci underlying them evolve via genetic drift. To study phenotypic variation in natural populations, it is therefore useful to assess the partitioning of genetic variance and the extent of population structure in the studied system. Wright (1951) proposed the use of F-statistics to describe population structure, and these parameters and their unbiased estimators (e.g. Weir & Cockerham 1984) remain the most common approach used to infer population structure. Fixation indices (F) measure the proportion of genetic variance contained within groups at one level of organization (e.g. individuals, subpopulations, populations) relative to the total genetic variance at a higher level of organization. 
These parameters vary between 0 and 1, with zero indicating that groups at the lower level of organization represent essentially random subsets of the higher level, and one indicating that each group at the lower level of organization is fixed for unique alleles at the locus or loci under study.

There are three classic indices that relate individuals (I) within (sub)populations (S) within the total population, or species (T): FIS is the proportion of genetic variance found within individuals relative to the total genetic variance in a population, FIT is the proportion of genetic variance found within individuals relative to the total genetic variance of the sample, and the most widely used, FST, is the proportion of genetic variance contained within populations relative to the total (Wright 1965; Weir & Cockerham 1984). FST therefore measures the degree to which genetic variance is shared versus partitioned among populations, and high values of FST indicate that the populations differ greatly in allele frequencies and hence are highly structured. The classic three-level hierarchy of organization can also be extended to examine more complex hierarchies: for example, partitioning of genetic variance at the population level relative to the regional level, or at the regional level relative to the total (Yang 1998). Population structure can also be inferred by modelling. The most widely used population genetics software program, STRUCTURE, models population structure using Markov chain Monte Carlo (MCMC) simulations on allele frequencies from multiple loci to estimate genetic cluster membership of individuals in a given number of genetic clusters, K (Pritchard et al. 2000; Falush et al. 2003). 
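
As a rough illustration of the three fixation indices defined above, the following sketch computes FIS, FIT, and FST for a single bi-allelic locus from observed and expected heterozygosities (FIS = 1 - HI/HS, FIT = 1 - HI/HT, FST = 1 - HS/HT). It is a simplified heterozygosity-based formulation with unweighted population means, not the Weir & Cockerham (1984) estimator cited in the text; the function name f_statistics and the toy genotypes are hypothetical and chosen only for illustration.

```python
import numpy as np

def f_statistics(genotypes, populations):
    """Heterozygosity-based F-statistics for one bi-allelic locus.

    genotypes   : per-individual counts (0, 1, or 2) of one allele
    populations : per-individual population labels
    Assumes the locus is polymorphic (H_S and H_T are non-zero).
    """
    genotypes = np.asarray(genotypes)
    populations = np.asarray(populations)
    labels = np.unique(populations)

    # H_I: observed heterozygosity, averaged over populations
    h_i = np.mean([np.mean(genotypes[populations == k] == 1) for k in labels])
    # Allele frequency within each population
    p_k = np.array([genotypes[populations == k].mean() / 2 for k in labels])
    # H_S: expected heterozygosity within populations, averaged
    h_s = np.mean(2 * p_k * (1 - p_k))
    # H_T: expected heterozygosity from the mean allele frequency
    p_bar = p_k.mean()
    h_t = 2 * p_bar * (1 - p_bar)

    f_is = 1 - h_i / h_s
    f_it = 1 - h_i / h_t
    f_st = 1 - h_s / h_t
    return f_is, f_it, f_st

# Toy data (hypothetical): population A is at Hardy-Weinberg proportions
# with p = 0.5; population B is fixed for the counted allele.
geno = [0, 1, 1, 2, 2, 2, 2, 2]
pops = ["A"] * 4 + ["B"] * 4
f_is, f_it, f_st = f_statistics(geno, pops)
print(f"{f_is:.3f} {f_it:.3f} {f_st:.3f}")  # prints "0.000 0.333 0.333"
```

In this toy case all of the deficit in heterozygosity comes from allele frequency differences between the two groups (FIS = 0, FST = FIT), which is the Wahlund effect described earlier. The three values also satisfy the standard identity (1 - FIT) = (1 - FIS)(1 - FST).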
STRUCTURE is often used to determine the probable number of discrete populations and to classify individuals into these populations, but the software can also be used to examine population structure more broadly, and provides a very intuitive output in the form of the estimated ancestry for an individual in each genetic cluster. STRUCTURE does not, however, provide any formal tests for population structure analysis. A more recent method, genetic principal components analysis (PCA), does provide the means to statistically test for population structure, without making any assumptions about the underlying population genetic model (Patterson et al. 2006; Novembre & Stephens 2008; Price et al. 2010). The method, developed by Patterson et al. (2006), uses Tracy-Widom statistics to formally test the significance of each of a set of eigenvalues (following Tracy & Widom 1994 and Johnstone 2001), with the number of significant eigenvalues representing the number of ‘axes’ of genetic variation among individuals. Unlike STRUCTURE, this approach is free from the assumption that genotypes can be grouped into discrete clusters, but is not as robust to the effects of linkage disequilibrium or high levels of admixture (Patterson et al. 2006). These three approaches (fixation indices, STRUCTURE, and PCA) each have elements to recommend them, as well as potential drawbacks. In this study, I take all three approaches to examine population structure in Helianthus argophyllus and closely related sympatric species. If populations of H. argophyllus are connected by high levels of gene flow, then I expect to observe little population structure in any approach, indicating that divergent selection is the most likely explanation for significant phenotypic divergence. If, on the other hand, H. argophyllus has significant population structure, I cannot rule out the effects of geography and drift. 
Further, identifying the geographic scale at which individuals cluster genetically will inform my later hypotheses of phenotypic divergence in this system (Chapters 3–5). Using data from 247 Helianthus argophyllus individuals genotyped at 64 bi-allelic single nucleotide polymorphisms (SNPs), I ask:

(1) How is genetic variance partitioned in H. argophyllus?

(2) At which hierarchical level(s) (e.g. regionally, local populations) is H. argophyllus genetically structured?

I also use these data with additional genotyping of sympatric species or deeper sampling within H. argophyllus families to answer a set of tangential questions about genetic variation in H. argophyllus:

(1) Is there evidence for contemporary gene flow between H. argophyllus and sympatric Helianthus species?

This question is motivated by my observation of phenotypically intermediate individuals both in the field and in my collections grown in a common garden (Chapter 3). Are these individuals of hybrid origin, and is there any evidence of on-going, later-generation hybridization? If so, gene flow from sympatric congeners may play a role in the observed phenotypic variation in H. argophyllus. In addressing this question, I use analyses of population structure across collections of H. argophyllus, H. annuus, H. debilis subsp. cucumerifolius, and H. praecox subsp. runyonii to identify individuals with putative hybrid ancestry.

(2) What are the likely origin(s) for populations of H. argophyllus that have established outside of Texas?

Helianthus argophyllus has established invasive/exotic populations in regions around the world, primarily in coastal areas. Heiser (1969) speculates that these populations were established by Portuguese sailors in the 17th or 18th centuries, possibly through bilgewater. This may be quite fanciful, but it spurred me to attempt to identify a likely geographic origin (or origins) for these populations in the native range. 
I address this question by including a sample of invasive/exotic populations in my analyses of population structure in H. argophyllus and identifying the native populations with which these expatriate populations most closely cluster.

(3) How closely related are individuals from the same maternal plant in H. argophyllus?

My collections were made from open-pollinated self-incompatible plants, so individuals from the same maternal plant may be anything from half-siblings (r = 0.25) to full siblings (r = 0.5), or even more related if the parents are inbred. An estimate of the average relatedness among individuals from the same maternal plant is essential to accurately calculating measures of quantitative trait variation in Chapter 3. I address this question by calculating the average pairwise relatedness of individuals collected from the same open-pollinated maternal plant of H. argophyllus.

2.2 Methods

2.2.1 Collections

Figure 2.1 Map of South Texas Helianthus collections. Number corresponds with the ID column in Table 2.1. The right panel is a detailed view of the boxed area in the left panel. Locations are slightly modified for better visualization. Texan H. argophyllus populations are assigned to four regions based on gaps in the species distributions: those inland and above and below 27.5 degrees latitude (left panel) to North and South Inland, respectively, with populations on the Barrier Islands and on the mainland Coast (mainly in right panel) forming the other two regions. See also Figure 2.2.

I obtained the seeds used in this study from two sources: collections from wild populations I made in October 2009, and accessions maintained by the U.S. Department of Agriculture’s (USDA) North Central Regional Plant Introduction Station in Ames, Iowa (Figure 2.1, Table 2.1). The USDA maintains collections of wild relatives of major crop species, and for H. argophyllus many of these collections were made prior to 1985. I included the H. 
argophyllus USDA accessions in part to assess whether phenotypic or genetic differences could be observed over this time period, and in part to incorporate fuller geographic sampling (including exotic/invasive H. argophyllus and non-Texas H. annuus populations). Detailed information about USDA accessions is available from the Germplasm Resources Information Network (http://www.ars-grin.gov/) via an accession’s Plant Inventory (PI) number (Table 2.1). These accessions are available as bulked seed obtained from one or more generations of within-accession mating of descendants from the original collection, so family structure is not maintained. For collections made from wild populations, I removed all mature seed heads from up to 30 individual plants of each species, each growing at least 2 m apart, per population. I required collection sites to be at least 1 km distant from each other, and attempted to collect from all major regions of the species range. I kept seeds from each maternal plant in these wild collections separate, and hereafter refer to these as “families”.

Table 2.1 Populations used in this dissertation. Sources are either wild populations collected in 2009 (‘wild’) or from USDA accessions with the given plant inventory (PI) numbers. Pop ID corresponds with population as used later in this chapter. The species are Helianthus argophyllus (ARG), H. debilis subsp. cucumerifolius (DEB), H. praecox subsp. runyonii (PRA), and H. annuus (ANN). Province indicates a hierarchical level used in these analyses between species and region. Region indicates a further subdivision of ARG populations (referred to throughout this research and depicted in Figure 2.2). 
Note that in these analyses region for non-Texas ARG populations is ‘Expat’, and for ANN populations is identical to province: state/country of origin are listed here for information only. Latitude and longitude (decimal degrees) for most USDA collections are approximate based on locality information. Coastal distance (km) is the shortest distance between the population locality and open ocean (the Gulf of Mexico). Study indicates in which analysis each population was included: population genetics (PG, Chapter 2), common garden (CG, Chapter 3), or reciprocal transplant (RT, Chapter 4).

Population | Source | PI | Pop ID | Species | Province | Region | Latitude | Longitude | Coastal distance (km) | Study
arg14B | wild | NA | 46 | ARG | not South | North Inland | 28.69902 | -97.38758 | 88 | PG, CG, RT
arg2B | wild | NA | 48 | ARG | not South | North Inland | 29.35995 | -97.75291 | 168 | PG, CG
arg4B | wild | NA | 47/79 | ARG/DEB | not South/debilis | North Inland/debilis | 28.72903 | -97.14055 | 75 | PG, CG
btm25 | wild | NA | — | ARG | not South | North Inland | 28.26540 | -97.30904 | 49 | —
btm26 | wild | NA | 44 | ARG | not South | North Inland | 28.44285 | -97.32563 | 63 | PG, CG, RT
btm27 | wild | NA | 45 | ARG | not South | North Inland | 28.61525 | -97.32674 | 77 | PG, CG
btm30 | wild | NA | — | ARG | not South | North Inland | 29.19997 | -97.49628 | 138 | —
btm31 | wild | NA | — | ARG | not South | North Inland | 29.27076 | -97.66214 | 153 | —
btm32 | wild | NA | 50 | ARG | not South | North Inland | 29.64619 | -97.69089 | 188 | PG, CG, RT
btm34 | wild | NA | 49 | ARG | not South | North Inland | 29.55369 | -97.78306 | 201 | PG, CG, RT
btm17 | wild | NA | 19 | ARG | not South | Barrier Island | 27.45275 | -97.29214 | 1 | PG, CG, RT
btm18 | wild | NA | 20 | ARG | not South | Barrier Island | 27.57816 | -97.22827 | 1 | PG, CG
btm19 | wild | NA | 21 | ARG | not South | Barrier Island | 27.61768 | -97.21676 | 2 | PG, CG, RT
btm20 | wild | NA | 22 | ARG | not South | Barrier Island | 27.67927 | -97.17168 | 1 | PG, CG, RT
btm21 | wild | NA | 23 | ARG | not South | Barrier Island | 27.75700 | -97.11933 | 1 | PG, CG, RT
btm22 | wild | NA | 27 | ARG | not South | Barrier Island | 27.85657 | -97.08013 | 4 | PG, CG, RT
btm9 | wild | NA | 28 | ARG | not South | Coast | 27.87247 | -97.19174 | 14 | PG
btm10 | wild | NA | 31 | ARG | not South | Coast | 27.91720 | -97.13564 | 12 | PG, CG, RT
btm12 | wild | NA | 33 | ARG | not South | Coast | 27.99752 | -97.07198 | 11 | PG
btm13 | wild | NA | 38 | ARG | not South | Coast | 28.04006 | -97.04207 | 12 | PG, CG, RT
arg6B | wild | NA | 40/60 | ARG/ANN | not South/sympatric | Coast/Texas | 28.10970 | -97.02723 | 16 | PG, CG
arg11B | wild | NA | 15 | ARG | South | South Inland | 27.27322 | -97.80295 | 44 | PG, CG, RT
btm5 | wild | NA | 7 | ARG | South | South Inland | 27.14243 | -98.14900 | 77 | PG, CG, RT
btm7/7b | wild | NA | 14/74 | ARG/PRA | South/praecox | South Inland/praecox | 27.25746 | -97.89252 | 54 | PG, CG
ARG-1813 | USDA | 494577 | 42 | ARG | not South | North Inland | 28.238 | -97.325 | 45 | PG
ARG-1812 | USDA | 494576 | 43 | ARG | not South | North Inland | 28.253 | -97.687 | 75 | PG
ARG-1834 | USDA | 494582 | — | ARG | not South | North Inland | 28.805 | -97.005 | 76 | —
ARG-1806 | USDA | 494572 | 4 | ARG | not South | Barrier Island | 26.833 | -97.367 | 1 | PG
ARG-1807 | USDA | 494573 | 26 | ARG | not South | Barrier Island | 27.833 | -97.05 | 1 | PG
ARG-1805 | USDA | 494571 | 24 | ARG | not South | Coast | 27.78 | -97.398 | 26 | PG
ARG-415 | USDA | 435629 | 25 | ARG | not South | Coast | 27.78 | -97.398 | 26 | PG
ARG-408 | USDA | 435628 | 29 | ARG | not South | Coast | 27.882 | -97.212 | 16 | PG
ARG-1808 | USDA | 494574 | 30 | ARG | not South | Coast | 27.9 | -97.133 | 11 | PG
ARG-407 | USDA | 435627 | 32 | ARG | not South | Coast | 27.985 | -97.15 | 17 | PG
ARG-1317 | USDA | 468649 | 34 | ARG | not South | Coast | 28.02 | -97.054 | 11 | PG
ARG-1315 | USDA | 468648 | 35 | ARG | not South | Coast | 28.027 | -97.053 | 12 | PG
ARG-404 | USDA | 435625 | 36 | ARG | not South | Coast | 28.027 | -97.053 | 12 | PG
ARG-406 | USDA | 435626 | 37 | ARG | not South | Coast | 28.027 | -97.053 | 12 | PG
ARG-1809 | USDA | 494575 | 39 | ARG | not South | Coast | 28.06 | -97.037 | 13 | PG
ARG-400 | USDA | 435623 | 41 | ARG | not South | Coast | 28.167 | -97 | 18 | PG
ARG-1803 | USDA | 494570 | 8 | ARG | South | South Inland | 27.222 | -97.79 | 42 | PG
ARG-425 | USDA | 435632 | 16 | ARG | South | South Inland | 27.297 | -97.815 | 46 | PG
ARG-422 | USDA | 435631 | 51 | ARG | South | South Inland | 27.297 | -97.815 | 46 | PG
ARG-420 | USDA | 435630 | 18 | ARG | South | South Inland | 27.367 | -97.8 | 45 | PG
ARG-1802 | USDA | 494569 | 1 | ARG | South | South Inland | 26.48 | -97.782 | 52 | PG
ARG-1822 | USDA | 494581 | 2 | ARG | South | South Inland | 26.55 | -98.117 | 82 | PG
No. 83 | USDA | 649863 | 3 | ARG | South | South Inland | 26.647 | -97.782 | 47 | PG
No. 81 | USDA | 649862 | 5 | ARG | South | South Inland | 26.883 | -98.133 | 76 | PG
ARG-1820 | USDA | 494580 | 6 | ARG | South | South Inland | 26.89 | -98.132 | 76 | PG
ARG-1818 | USDA | 494579 | 9 | ARG | South | South Inland | 27.225 | -98.143 | 77 | PG
ARG-427 | USDA | 435633 | 10 | ARG | South | South Inland | 27.225 | -98.143 | 77 | PG
ARG-428 | USDA | 435634 | 11 | ARG | South | South Inland | 27.225 | -98.143 | 77 | PG
ARG-430 | USDA | 435635 | 12 | ARG | South | South Inland | 27.225 | -98.143 | 77 | PG
No. 91 | USDA | 649864 | 13 | ARG | South | South Inland | 27.225 | -98.143 | 77 | PG
ARG-1815 | USDA | 494578 | 17 | ARG | South | South Inland | 27.358 | -98.123 | 77 | PG
ANN-1 | USDA | 649866 | 52 | ARG | South | South Inland | — | — | — | PG
ARG-2624 | USDA | 664730 | 54 | ARG | not South | North Carolina | 33.876 | -77.999 | — | PG
ARG-1575 | USDA | 468651 | 53 | ARG | not South | Florida | 29.254 | -81.021 | — | PG
HEL153/83 | USDA | 649865 | 59 | ARG | not South | former USSR | — | — | — | PG
Moz291 | USDA | 490291 | 55 | ARG | not South | Mozambique | — | — | — | PG
QLD-11 | USDA | 664803 | 56 | ARG | not South | Australia | -23.29861 | 150.78806 | — | PG
QLD-09 | USDA | 664801 | 57 | ARG | not South | Australia | -23.23944 | 150.825 | — | PG
QLD-08 | USDA | 664800 | 58 | ARG | not South | Australia | -23.18806 | 150.79194 | — | PG
btm14 | wild | NA | 77 | PRA | praecox | praecox | 28.08714 | -97.03754 | 15 | PG
btm15 | wild | NA | 76 | PRA | praecox | praecox | 27.49065 | -97.27976 | 1 | PG
btm16 | wild | NA | 75 | PRA | praecox | praecox | 27.43203 | -97.29486 | 1 | PG
btm24 | wild | NA | 78 | DEB* | debilis | debilis | 28.12755 | -97.42551 | 46 | PG
btm33 | wild | NA | 82 | DEB | debilis | debilis | 29.52810 | -97.70129 | 170 | PG
btm36 | wild | NA | 80 | DEB | debilis | debilis | 29.24064 | -97.99017 | 164 | PG
btm37 | wild | NA | 81 | DEB | debilis | debilis | 29.36161 | -98.22041 | 195 | PG
btm1 | wild | NA | 64 | ANN | sympatric | Texas | 27.91457 | -98.61809 | 142 | PG
btm2 | wild | NA | 61 | ANN | sympatric | Texas | 27.92014 | -98.56528 | 138 | PG
btm3 | wild | NA | 63 | ANN | sympatric | Texas | 27.54784 | -98.26807 | 97 | PG
btm8 | wild | NA | 62 | ANN | sympatric | Texas | 27.25746 | -97.89252 | 53 | PG
btm11 | wild | NA | 66 | ANN | sympatric | Texas | 27.98979 | -97.07954 | 68 | PG
btm23 | wild | NA | 65 | ANN | sympatric | Texas | 27.97398 | -97.806055 | 70 | PG
btm28 | wild | NA | 67 | ANN | sympatric | Texas | 28.68221 | -97.30436 | 78 | PG
btm35 | wild | NA | 68 | ANN | sympatric | Texas | 29.53182 | -97.95792 | 186 | PG
ANN-1405 | USDA | 468559 | 70 | ANN | allopatric | New Mexico | 32.757 | -107.264 | — | PG
ANN-1399 | USDA | 468556 | 71 | ANN | allopatric | New Mexico | 33.511 | -105.464 | — | PG
ANN-886 | USDA | 435619 | 69 | ANN | allopatric | Oklahoma | 36.154 | -95.993 | — | PG
A-1572 | USDA | 413123 | 73 | ANN | allopatric | Mexico | — | — | — | PG
A-1516 | USDA | 413067 | 72 | ANN | allopatric | Mexico | — | — | — | PG

*Note that population btm24 occurs on the range border of H. debilis with H. praecox (Figure 1.3), and was phenotypically intermediate between these species.

I assigned H. argophyllus populations from Texas to four regions based on observed distribution gaps in the species range (Table 2.1, Figure 2.2). I grouped populations growing on the barrier islands into one region, ‘Barrier Island’ or ‘Island’, and populations within 20 km of the mainland coast into another, ‘Coast’ (generally clustered towards the center of the species range, see Figure 2.1). Populations more than 20 km inland form two disjunct regions: North Inland (above ~28 degrees latitude) and South Inland (below ~27.5 degrees latitude), as seen in Figure 2.1. In addition to the H. argophyllus regions depicted in Figure 2.2, I combined all non-Texas originating H. argophyllus populations into a ‘region’ called ‘Expat’. In some analyses, I also split populations of H. annuus into two groups: sympatric (collected in Texas) and allopatric (collected outside of Texas) with H. argophyllus.

Figure 2.2 Map of the four Texas H. argophyllus regions used throughout this research (red = North Inland, green = South Inland, purple = Coast, and blue = Barrier Island, the thin strip of islands running parallel to the East coast). These regions are assigned based on observed gaps in the species distribution. See Figure 2.1 and Table 2.1 for the H. 
argophyllus populations assigned to each region. Not pictured is the ‘Expat’ region, which includes all non-Texas collections.

2.2.2 Genotyping

I collected young leaf tissue from plants in the common garden (Chapter 3; see also Table 2.1) and immediately placed leaves between layers of silica gel to dry for a minimum of 120 hours. I also germinated additional seeds from USDA and wild collected populations (see Table 2.1) and collected leaf tissue from four-week-old seedlings, drying the material in silica as described above. I extracted DNA from dried leaves using a modified 96-well plate Qiagen protocol optimized for common issues with sunflower extractions (described in Horne et al. 2004) and eluted into 10 mM Tris-HCl (pH 8.0). I randomly assigned each sample to a well within an extraction plate. I evaluated DNA quantity using a dsDNA Broad Range Assay with a Qubit® 2.0 Fluorometer (Life Technologies, Carlsbad, CA, USA) and DNA quality using a NanoDrop 1000 Spectrophotometer (Fisher Scientific, Pittsburgh, PA, USA). I re-extracted samples that had concentrations of <20 ng/µl, or that exhibited 260/230 or 260/280 absorbance ratios <1.6. If the second extraction also failed to meet these standards, I pooled the two extractions, precipitated the DNA in ethanol, and concentrated the samples by eluting into a smaller volume. I submitted the 657 samples that were of sufficient quality and quantity to be genotyped with two 40-SNP marker panels using the Sequenom® iPLEX® Gold Genotyping Technology (San Diego, CA, USA) at the Genome Quebec Innovation Centre (McGill University, Montreal, QC, Canada). Sequenom® iPLEX® Gold genotyping uses a multiplex PCR followed by single base extension using locus-specific primers. The PCR products are separated and allelic differences detected by mass spectrometry (Oeth et al. 2005; see Figure 2.3). The genotype call error rate is estimated by the service providers as <0.1% (Oeth et al. 2005). 
I also submitted 15 negative controls, randomly placed among samples, which were treated identically starting with DNA extraction (although without leaf tissue), as a way to estimate the error rate of our entire protocol due to cross-contamination.

To design the SNP panels used in genotyping, I started with twenty-five H. argophyllus mRNASeq transcriptome libraries aligned against a H. annuus reference transcriptome (see Renaut et al. 2012 and Renaut et al. 2013 for a full description of these samples, how they were aligned to the transcriptome, how the transcriptome was constructed, and the genetic map on which it was placed). Using a custom Perl script, I generated a set of 27,375 biallelic SNPs by filtering the alignments for sites with exactly two alleles, no more than 10% missing data, observed heterozygosity (HO) < 0.5, expected heterozygosity (HE) > 0.05, and which were not within 10 bp of a site with HO > 0.6. I then randomly selected a subset by keeping the top 300 of a Fisher-Yates permutation of the sites, throwing out any that were found in the same contig as a site higher on the list or that were within 200 bp of the start or end of a contig (required for the genotyping technology). I used these filtering steps to increase the likelihood that the sites would be informative (truly polymorphic in H. argophyllus) while minimizing bias associated with deliberate choice of markers. This list of potential SNPs was forwarded to the technical staff at Genome Quebec, who selected a subset optimized to perform well in the multiplex PCR for each panel of 40 (Table 2.2, and see Appendix for sequences).

Figure 2.3 Sample Sequenom® iPLEX® Gold SNP genotyping data. Each panel represents a single SNP, plotting the relative intensity of each sample for each allele at that site. 
Samples are called as either homozygote (low intensity of one allele and high intensity of the other, blue or green above) or heterozygote (approximately equal, and relatively high intensity of both alleles, yellow above), or are not called if the relative intensities are ambiguous (black above). The reliability of the genotype call for each SNP (left: ‘Good’, center: ‘Fair’, right: ‘Failed’, also see Table 2.2) denotes the number and intensities of the uncalled samples relative to the called samples.

The output of the Sequenom® iPLEX® Gold genotyping platform is a bi-dimensional scatter plot for each SNP, showing the relative intensity of each sample for each allele (see Figure 2.3). Samples that have ambiguously intermediate or low intensities for both alleles are called as NN. I additionally excluded SNPs that had low call rates (< 70%) or poor call reliability (where the two alleles were not well-distinguished by mass spectrometry, Figure 2.3). It is likely that some of the failed SNPs represent paralogous sites in the H. argophyllus genome, which would create “ambiguous” genotype calls if, for example, one site was heterozygous and the other homozygous (resulting in a 3:1 intensity ratio).

Table 2.2 Single nucleotide polymorphisms (SNPs) genotyped in this study. Contig, site, linkage group (LG), and centimorgan (cM) reference the putative location of the SNP within a transcript and the H. annuus genome (see Renaut et al. 2012; Renaut et al. 2013). Minor allele frequency (MAF) and observed heterozygosity (HO) are calculated from all genotyped samples, and call rate indicates the percentage of samples that were genotyped. Call reliability is a measure of the general performance of a SNP across all genotyped samples produced by the genotyping platform (see Figure 2.3). Panel is the set in which the SNP was multiplexed. SNPs denoted ‘F#’ and italicized either had a call rate < 70% or poor call reliability, and were not used in subsequent analyses.
SNPs with unusual behavior (see Figure 2.4) are bolded. See Appendix for the submitted sequences.

SNP  LG  cM     Contig        Site (bp)  Major Allele  Minor Allele  MAF   HO    Call Rate  Call Reliability  Panel
1    1   2.69   BigSet015031  782        G             A             0.23  0.20  93.46%     Good              1
2    1   13.44  BigSet002568  777        T             C             0.35  0.26  93.53%     Fair              2
3    1   24.76  BigSet009543  1121       G             C             0.15  0.23  99.28%     Good              2
4    1   30.67  BigSet008902  453        T             G             0.05  0.06  97.66%     Excellent         2
5    1   35.51  BigSet007266  660        A             G             0.27  0.31  97.12%     Good              2
6    1   39.28  BigSet006518  593        T             C             0.16  0.22  97.39%     Good              1
7    1   46.80  BigSet005942  656        T             C             0.15  0.20  98.86%     Good              1
8    2   44.65  BigSet008428  235        G             A             0.31  0.29  97.84%     Fair              2
9    2   85.02  BigSet004075  614        C             T             0.06  0.05  91.99%     Good              1
10   3   31.19  BigSet013449  508        G             A             0.17  0.12  98.53%     Good              1
11   3   32.27  BigSet004967  1513       C             T             0.01  0.02  100.00%    Excellent         1
12   3   41.95  BigSet001877  461        T             C             0.28  0.28  93.79%     Good              1
13   3   54.31  BigSet000121  559        C             T             0.16  0.21  97.71%     Good              1
14   3   69.37  BigSet003287  501        C             T             0.03  0.06  98.53%     Good              1
15   4   50.1   BigSet002602  1047       T             C             0.08  0.11  94.24%     Fair              2
16   5   3.23   BigSet008927  353        A             G             0.19  0.15  98.69%     Good              1
17   5   6.99   BigSet004837  689        T             C             0.04  0.05  99.18%     Excellent         1
18   5   31.21  BigSet014819  856        A             G             0.13  0.15  95.92%     Good              1
19   5   56.48  BigSet012137  1146       A             G             0.16  0.07  99.84%     Excellent         1
20   5   64.01  BigSet014931  328        G             A             0.15  0.18  98.04%     Good              1
21   6   10.22  BigSet007066  633        T             C             0.20  0.21  95.92%     Good              1
22   6   25.81  BigSet002055  357        G             A             0.21  0.19  97.84%     Good              2
23   7   22.85  BigSet012604  577        G             A             0.21  0.19  98.92%     Good              2
24   7   23.12  BigSet015961  579        G             A             0.24  0.18  71.94%     Fair              2
25   8   17.21  BigSet011551  586        T             C             0.40  0.41  99.82%     Good              2
26   8   32.26  BigSet008521  754        G             C             0.44  0.65  81.47%     Fair              2
27   8   37.10  BigSet006107  417        A             G             0.10  0.11  97.39%     Good              1
28   8   43.02  BigSet003217  451        A             C             0.35  0.29  94.28%     Good              1
29   8   55.93  BigSet013825  446        G             A             0.20  0.22  96.76%     Fair              2
30   9   1.08   BigSet006581  236        A             G             0.17  0.19  88.56%     Fair              1
31   9   25.82  BigSet005112  904        T             A             0.03  0.05  97.06%     Fair              1
32   9   36.84  BigSet014771  421        T             C             0.45  0.40  96.08%     Good              1
33   9   38.18  BigSet001038  704        G             T             0.37  0.20  81.29%     Fair              2
34   9   46.25  BigSet013222  693        T             C             0.50  0.34  97.84%     Good              2
35   9   50.01  BigSet013470  944        G             A             0.22  0.21  94.28%     Fair              1
36   9   94.12  BigSet013583  971        C             T             0.47  0.32  98.37%     Good              1
37   10  45.74  BigSet013487  1202       A             G             0.24  0.41  98.20%     Fair              2
38   10  48.7   BigSet011831  493        C             T             0.15  0.28  96.94%     Good              2
39   10  87.97  BigSet004768  429        T             C             0.22  0.26  98.69%     Good              1
40   11  16.94  BigSet010423  928        G             A             0.08  0.12  100.00%    Good              2
41   12  1.61   BigSet000202  1198       A             G             0.13  0.19  98.53%     Good              1
42   12  52.16  BigSet006163  1295       G             A             0.37  0.30  98.02%     Good              2
43   12  53.77  BigSet014556  316        T             C             0.36  0.32  95.86%     Fair              2
44   12  62.94  BigSet007798  348        A             G             0.37  0.30  93.30%     Fair              1
45   12  65.36  BigSet002959  1264       T             A             0.09  0.14  99.18%     Excellent         1
46   12  68.59  BigSet000728  447        T             C             0.16  0.20  97.22%     Good              1
47   13  1.88   BigSet004617  549        A             G             0.48  0.27  97.88%     Good              1
48   13  31.72  BigSet004008  1096       G             T             0.24  0.16  98.04%     Good              1
49   14  17.21  BigSet000515  987        A             G             0.08  0.09  98.53%     Good              1
50   14  37.92  BigSet011930  551        A             C             0.34  0.08  93.63%     Good              1
51   14  74.76  BigSet005803  678        G             C             0.28  0.22  97.66%     Fair              2
52   15  47.32  BigSet011115  254        G             A             0.20  0.27  96.24%     Good              1
53   15  55.4   BigSet012376  385        C             T             0.18  0.23  95.86%     Fair              2
54   15  70.99  BigSet000528  255        A             G             0.21  0.16  99.28%     Good              2
55   15  74.22  BigSet007409  526        G             A             0.26  0.28  96.41%     Good              1
56   15  75.29  BigSet004314  231        G             A             0.15  0.19  92.65%     Fair              1
57   16  13.98  BigSet001225  402        G             A             0.05  0.09  99.28%     Good              2
58   16  36.04  BigSet010692  659        G             A             0.03  0.06  99.10%     Good              2
59   16  49.48  BigSet011600  830        C             T             0.33  0.32  96.90%     Good              1
60   16  59.16  BigSet009360  621        A             C             0.10  0.13  90.11%     Good              2
61   16  70.99  BigSet011629  492        A             G             0.21  0.22  96.22%     Good              2
62   17  25.82  BigSet009150  457        T             C             0.46  0.35  93.88%     Fair              2
63   17  30.92  BigSet013370  326        T             G             0.06  0.05  78.96%     Fair              2
64   17  39.80  BigSet000232  286        T             G             0.45  0.34  96.24%     Good              1
F1   1   8.07   BigSet015505  568        G             C             0.00  0.00  0.00%      Failed            1 & 2
F2   3   54.31  BigSet014877  565        A             G             0.46  0.32  77.70%     Poor              2
F3   4   63.81  BigSet001741  538        T             C             0.00  0.00  0.00%      Failed            2
F4   5   33.36  BigSet008442  983        G             A             0.00  0.00  0.00%      Failed            1 & 2
F5   7   8.07   BigSet002692  953        G             C             0.41  0.32  94.78%     Poor              2
F6   10  61.33  BigSet006166  515        T             C             0.00  0.00  0.00%      Failed            2
F7   11  30.12  BigSet006816  1126       T             C             0.21  0.31  92.09%     Poor              2
F8   13  52.42  BigSet010881  479        T             C             0.00  0.00  0.00%      Failed            2
F9   13  55.38  BigSet009324  797        G             A             0.00  0.00  0.00%      Failed            2
F10  14  65.08  BigSet004529  959        T             C             0.00  0.00  0.00%      Failed            2
F11  16  36.04  BigSet010692  659        G             A             0.00  0.00  0.00%      Failed            1
F12  16  53.24  BigSet008650  1024       G             A             0.00  0.00  0.00%      Failed            1 & 2
F13  17  55.93  BigSet015952  857        C             A             0.41  0.19  56.83%     Failed            2

2.2.3 Population structure

I excluded samples with fewer than 70% genotyped SNPs from my analyses. The samples collected from the common garden plot (Chapter 3) consisted of multiple individuals from the same maternal family. To examine population structure, I subset one individual per family by choosing arbitrarily among those individuals within a family with > 80% genotype calls. I performed all analyses with two independent subsets of these data; the two produced almost identical results, so I present only one here.

I included all representative samples from the USDA accessions, as family structure is not preserved in their collections and so relatedness is unknown. This subset dataset includes 325 individuals, 2–11 from each of 59 H. argophyllus, 13 H. annuus, 4 H. praecox subsp. runyonii, and 5 H. debilis subsp. cucumerifolius collections (Table 2.1). I performed additional analyses using only data from the 247 H. argophyllus individuals. I estimated FST for each SNP among populations or regions with the wc() function in the R package HierFstat, which uses Weir and Cockerham’s unbiased estimator (Weir & Cockerham 1984; Goudet 2005).
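As a rough illustration of the single-locus estimator cited above, the following Python sketch computes Weir and Cockerham's (1984) θ for one biallelic SNP from per-population genotype counts. It is not the HierFstat wc() implementation: the function name wc84_theta and the input layout are assumptions made for this example, and missing data and averaging across loci are ignored.

```python
import math


def wc84_theta(genotype_counts):
    """Single-locus Weir & Cockerham (1984) theta for a biallelic SNP.

    genotype_counts: one (n_AA, n_Aa, n_aa) tuple per population
    (hypothetical input layout chosen for this sketch).
    Returns NaN when the locus is monomorphic in the whole sample.
    """
    r = len(genotype_counts)
    if r < 2:
        raise ValueError("at least two populations are required")

    sizes = [sum(counts) for counts in genotype_counts]
    # Frequency of allele A and heterozygote proportion in each population.
    freqs = [(2 * aa + ab) / (2 * n) for (aa, ab, _), n in zip(genotype_counts, sizes)]
    hets = [ab / n for (_, ab, _), n in zip(genotype_counts, sizes)]

    n_bar = sum(sizes) / r
    n_c = (r * n_bar - sum(n * n for n in sizes) / (r * n_bar)) / (r - 1)
    p_bar = sum(n * p for n, p in zip(sizes, freqs)) / (r * n_bar)
    s2 = sum(n * (p - p_bar) ** 2 for n, p in zip(sizes, freqs)) / ((r - 1) * n_bar)
    h_bar = sum(n * h for n, h in zip(sizes, hets)) / (r * n_bar)

    base = p_bar * (1 - p_bar) - (r - 1) / r * s2
    # Variance components: among populations (a), among individuals
    # within populations (b), and between alleles within individuals (c).
    a = (n_bar / n_c) * (s2 - (base - h_bar / 4) / (n_bar - 1))
    b = (n_bar / (n_bar - 1)) * (base - (2 * n_bar - 1) / (4 * n_bar) * h_bar)
    c = h_bar / 2

    total = a + b + c
    return a / total if total != 0 else math.nan
```

With two populations fixed for alternate alleles the estimate is 1, and with identical genotype counts in every population it is slightly negative, which reflects the sample-size correction that makes the estimator unbiased.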
I next estimated hierarchical F-statistics with HierFstat (Goudet 2005; implementing Yang 1998). This approach looks at hierarchical, nested levels of organization, such as individuals within populations within regions. I organized each dataset into biologically sensible hierarchical levels: first individuals, then collection ‘population’, then region (for H. argophyllus) or broader geography (for H. annuus), and finally species. I also included ‘province’ at the level below species: an emergent division within H. argophyllus that I observed in the H. argophyllus STRUCTURE K = 2 runs, essentially the South Inland region versus all other populations (see Figure 2.5 and Table 2.1). For the full Helianthus dataset, the examined levels included, from highest to lowest: species, province with H. annuus split into allopatric and sympatric groups, region (for H. argophyllus), population, and individual. For the analysis with only H. argophyllus collections, I looked at four levels: province, region, population, and individual. Each of these levels is fully nested within the previous levels, so that one individual belongs to only one population, one population to one region, etc. To test for the effect of any hierarchical level on population structure, HierFstat randomly permutes genotypes at lower hierarchical levels among the level of interest, while keeping higher hierarchical level membership in place, and estimates the log-likelihood ratio G-statistic across all loci for each permutation (Goudet 2005). The G-statistic is a goodness of fit statistic that accounts for alleles within genotypes, and is more powerful than FST estimators when sampling is unbalanced (Goudet et al. 1996). For each test, I ran 1000 permutations, and calculated p as the proportion of permutations that gave a global G-statistic equal to or larger than that observed in the data. For both datasets, I estimated population structure using the program STRUCTURE (Pritchard et al. 2000).
After examining convergence of the provided summary statistics (α, F, D, and likelihood) for a range of burn-in and run lengths (following Gilbert et al. 2012), I decided to set each run for 100,000 burn-in and 100,000 recorded MCMC iterations. I used the admixture model with correlated allele frequencies, and all other parameters set to default. For both datasets, I examined K values from 1 to 10, with ten replicate runs for each K. I chose to examine in more detail both the K that maximized the log likelihood of the data (Falush et al. 2007) and the K with the highest ΔK (ΔK = mean of the absolute value of the 2nd order rate of change of the log likelihood from K – 1 to K across replicate runs, divided by the standard deviation of the log likelihood for replicate runs of K; Evanno et al. 2005, implemented by STRUCTURE HARVESTER, Earl & vonHoldt 2011). To visualize the results for replicate runs of a particular K, I used the CLUMPAK (Cluster Markov Packager Across K) beta program (http://clumpak.tau.ac.il/), which attempts to sum membership probabilities for individuals across replicate runs using CLUMPP (Jakobsson & Rosenberg 2007) while identifying distinct alternate modes in the replicate space with MCL (Enright 2002; Van Dongen 2008), and provides the resulting summed probabilities as well as graphical illustrations. I used a third approach to analyze population structure in both datasets: principal components analysis using the EIGENSOFT software (Patterson et al. 2006). This approach provides formal tests for population structure using Tracy-Widom statistics, outputs each individual’s genetic “coordinates” along a number of axes of variation, and provides tests for differentiation among pre-defined populations. I ran the sub-program smartpca for ten axes of variation (i.e. eigenvectors), without removing outliers, and normalized each SNP by allele frequency as recommended (Patterson et al. 2006).
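The allele-frequency normalization mentioned above can be sketched in a few lines of Python. This is only an illustration of the centering and scaling described by Patterson et al. (2006), not the smartpca code itself: the function name normalize_genotypes, the use of None for uncalled genotypes, and the use of the plain sample allele frequency are assumptions of this sketch, and the software may estimate the frequency slightly differently.

```python
import math


def normalize_genotypes(genotypes):
    """Center and scale a samples-by-SNPs matrix of 0/1/2 allele counts.

    genotypes: list of rows (one per sample); None marks a missing call.
    Each SNP column is centered on its mean and divided by
    sqrt(p * (1 - p)), where p = mean / 2 is the sample allele frequency.
    Missing calls become 0 after centering, i.e. the column mean.
    """
    n_snps = len(genotypes[0])
    normalized = [[0.0] * n_snps for _ in genotypes]
    for j in range(n_snps):
        observed = [row[j] for row in genotypes if row[j] is not None]
        if not observed:
            continue  # SNP never called: leave the column at zero
        mean = sum(observed) / len(observed)
        p = mean / 2.0
        scale = math.sqrt(p * (1.0 - p))
        if scale == 0.0:
            continue  # monomorphic SNP carries no information
        for i, row in enumerate(genotypes):
            if row[j] is not None:
                normalized[i][j] = (row[j] - mean) / scale
    return normalized
```

Dividing by sqrt(p(1 - p)) gives rare and common SNPs comparable weight in the covariance matrix from which the eigenvectors are computed.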
The normalization step also scales each SNP genotype column to have equal variance. I used Tracy-Widom statistics to evaluate the statistical significance of each principal component (eigenvector) by running the twstats subprogram on the resulting eigenvalues. This approach is in practice similar to a chi-squared test, although theoretically more complex: the distribution of expected eigenvalues is approximately a distribution described by Tracy and Widom (1994), and the value of an observed eigenvalue can therefore be used as a statistic to compute a p-value against this distribution (Johnstone 2001; Patterson et al. 2006). A ‘significant’ eigenvector or axis of variation is one whose eigenvalue is large enough to reflect ‘real’ covariance among the sampled individuals, beyond simply sampling variance (Patterson et al. 2006; Price et al. 2010). I additionally tested whether regions overall (as pre-defined) differed significantly along each significant axis of variation, as well as whether each pair of regions was significantly differentiated along that axis, using ANOVA. I also examined the SNPs that had maximum weight for each principal component, and report those with weight > 2. Weights indicate the value by which each SNP genotype is multiplied to get the component score and are proportional to the correlation across samples for each SNP and PC. I examined isolation by distance within only the H. argophyllus samples collected in Texas, using the R package ‘adegenet’ (Jombart and Ahmed 2011). I used the dist.genpop() function and individual genotypes to calculate pairwise Edwards’ genetic distances (a Euclidean measure of angular distance; Edwards 1971) between all population pairs (i.e. collection localities, Table 2.1). I used the dist() function to calculate the pairwise Euclidean geographic distance between all population pairs based on their latitude and longitude (Table 2.1).
Finally, I performed a Mantel test for correlations between these two distance matrices using the function mantel.randtest(), and tested for significance using a randomization test with 999 permutations (Mantel 1967). I performed these analyses including all H. argophyllus populations from Texas, as well as within each of the three major genetic clusters identified in my STRUCTURE analyses at K = 4 (North Inland, South Inland, and the central coast, consisting of populations from the Coast and the Barrier Islands).

2.2.4 Relatedness

To calculate the average pairwise relatedness among individuals within open-pollinated H. argophyllus families, I subset the SNP genotypes of 19 populations (denoted ‘CG’ in Table 2.1) represented by 3–6 individuals from each of 4–5 families, for a total of 421 plants. I used the compareestimators function in the R package ‘related’ to simulate individual data of known relatedness from the observed allele frequencies and compare the performance of four commonly used estimators on those data (Lynch & Ritland 1999; Wang 2011; Wang 2014). While all four estimators performed relatively well, I chose the Lynch & Ritland (1999) estimator because it showed the greatest discrimination between half- and full-sib data. I estimated relatedness for every pair of individuals, and calculated the grand mean relatedness of all within family comparisons (N = 95 family means). I also compared the distribution of pairwise relatedness between members of the same family, the same population (but not family), the same region (but not population), and individuals in different regions.

2.3 Results

From two panels of 40 markers, 64 SNPs were successfully genotyped in > 70% of individuals and were used in subsequent analyses. Of the 657 samples, 607 were genotyped at >70% of those SNPs. The successful markers are distributed throughout the genome, most at least 1–2 cM away from the next nearest marker and so likely not in linkage disequilibrium (Table 2.2).
The mean genotype rate for all samples was 90.1%. I observed high potential for cross-contamination: ‘no DNA’ negative controls, which had been treated as all other samples from DNA extraction through sequencing, were genotyped at an average of 19% of SNPs (range: 1–31%). This measure is likely conservatively high: the Sequenom® iPLEX® Gold platform is a PCR-based approach, so a small amount of contamination in a ‘no DNA’ well would likely be detectable but would not necessarily affect the genotype call in a well with a large quantity of DNA. Cross-contamination would also increase the likelihood of a sample having an ambiguous genotype (called as ‘NN’) in the Sequenom® system. In this case, cross-contamination would most likely reduce evidence for population structure, as samples were randomized prior to DNA extraction. The mean FST estimated from all SNPs among H. argophyllus collection populations (as defined in Table 2.1) is 0.161, while the mean FST estimated among H. argophyllus regions is 0.097. Unsurprisingly, the FST estimates for each SNP are strongly correlated between these datasets (corr. coef. = 0.89; t = 15.88, df = 62, p < 2.2e-16). From the analyses of population structure reported below, region appears to be the primary level of population differentiation, so I present the regional data.

Figure 2.4 Helianthus argophyllus regional genetic differentiation (mean pairwise FST among regions) versus observed heterozygosity (HO) for the 64 SNPs included in this study. SNPs marked red may be under divergent selection among regions, while the SNP marked in black may be under global balancing selection or represent paralogous sites.

Most SNPs appear to behave neutrally in this dataset, with five exceptions (Figure 2.4).
SNPs 10, 15, 22, and especially 50 exhibit higher levels of regional genetic differentiation than would be expected given observed heterozygosity, which suggests that these loci or genetically linked loci might be under divergent natural selection among regions. SNP 26 is also behaving oddly, with very high observed heterozygosity and low regional differentiation. It is possible this locus is under global balancing selection that favors heterozygotes, or that it actually represents paralogous sites that are fixed or nearly fixed for alternate nucleotides at the sites of interest. Alternatively, the behaviors observed in these five SNPs could simply be due to genetic drift, which can create similar patterns under certain circumstances. In any case, the exclusion or inclusion of any of these putatively non-neutral loci did not substantially affect any of the analyses of population structure.

2.3.1 Population structure in H. argophyllus

Helianthus argophyllus individuals show strong regional population structure in all analyses. In my STRUCTURE analyses, all replicate runs of K converged on a single parameter space mode (i.e. individual cluster membership did not vary substantially across replicate runs). K = 2 had the highest ΔK (best K following Evanno et al. 2005) and K = 4 had the highest log likelihood (best K following Pritchard et al. 2000). Figure 2.5 illustrates the individual membership probabilities for both cases: the clusters at K = 2 (top panel) are related to latitude, with Southern collections strongly differentiated from Northern collections with others somewhat intermediate. As several regions (as identified in Section 2.2.1) had a strong majority in one of the two clusters at K = 2, I used this emergent division as the level of province in my hierarchical F-statistic analyses.
At K = 4 (Figure 2.5, bottom panel) the clusters correspond strongly to region, with orange as South Inland, blue as Coast and Barrier Island, purple as North Inland, and green as the primary cluster for the ‘Expat’ populations. This is illustrated geographically in Figure 2.6.

Figure 2.5 STRUCTURE barplots for H. argophyllus populations only (top: K = 2, bottom: K = 4). Each vertical bar represents a single individual’s estimated membership proportion for each colored genetic cluster, grouped by population, then ordered by latitude (South on left to North on right, with ‘Expat’ populations collected outside of Texas on the far right). The numbers correspond to the population ID column in Table 2.1. K = 4 clusters correspond well with region in Table 2.1 and Figure 2.2.

Figure 2.6 Population structure across the range of H. argophyllus. The four colors represent genetic clusters identified using the program STRUCTURE (K = 4, Figure 2.5 above), with the proportion of color in a circle representing the mean estimated membership proportion of that cluster in that local group of plants. Each pie represents a group of 2–30 individuals collected near the locality on which the circle is centered (some representing multiple populations from Table 2.1). The five oceanic circles represent ‘Expat’ populations collected in Florida (ID 53), North Carolina (ID 54), Mozambique (ID 59), the former USSR (ID 55), and Australia (IDs 56–58). The panel on the right is an expanded view of the grey box in the left panel.

Under permutation testing, the only hierarchical level with a significant effect on population structure was region (p = 0.004).
Population (p = 0.775) and province (p = 0.194) did not significantly affect population structure. Table 2.3 reports the hierarchical F-statistics for H. argophyllus populations at four levels, with each cell representing the F-statistic for the column level relative to the row level, i.e. the proportion of the genetic variance contained within groups at the column hierarchical level relative to the total genetic variance at the row hierarchical level. The primary statistics of interest are on the diagonal. FIndividual/Population is equivalent to FIS, while the other three are conceptually equivalent to FST at different hierarchical levels.

Table 2.3 Hierarchical F-statistics for Helianthus argophyllus populations at four levels: province, region, population, and individual. Each cell represents the F[column level]/[row level] statistic, with individual nested within population nested within region nested within province. The statistics on the diagonal (bolded) represent conceptual equivalents of FST at each hierarchical level, except FIndividual/Population which is equivalent to FIS. Region is the only level with a significant effect on population structure under permutation (p = 0.004).

            Province   Region     Population  Individual
Total       0.0731528  0.1297597  0.21328951  0.2930737
Province    -          0.0610747  0.15119721  0.2372785
Region      -          -          0.09598475  0.1876654
Population  -          -          -           0.1014149

Principal component analysis provides additional support for regional population structure in H. argophyllus. The first four principal components (PC) are statistically significant, indicating that there are four axes of genetic variance along which individuals co-vary beyond what might be expected from sampling variance (Table 2.4).
The pre-defined regions differ significantly along the first three axes (Table 2.4).

Table 2.4 Tracy-Widom (T-W) statistics for the four statistically significant principal components of genetic variance in H. argophyllus. Proportion variance is the proportion of genetic variance explained by that axis. The ANOVA p-value tests whether the pre-assigned region means differ significantly along that axis.

PC  Proportion variance  Eigenvalue  Difference  T-W statistic  T-W p-value  T-W effect size  ANOVA p-value
1   0.089                21.909      -           16.927         4.77e-22     76.05            2.22e-16
2   0.069                16.882      -5.027      18.438         8.14e-25     109.43           < 1e-16
3   0.041                10.096      -6.786      6.282          9.36e-07     154.37           1.11e-16
4   0.033                8.231       -1.864      1.330          0.03         168.68           0.097

PC1 correlates strongly with latitude (corr. coef. = -0.815; t = -20.9032, df = 221, p < 2.2e-16) and differentiates the North Inland and South Inland regions from each other and all other regions (Figure 2.7). The four SNPs that are putatively under divergent selection among regions (Figure 2.4) have the largest weights (the value by which each SNP genotype is multiplied to get the component score) for PC1: SNP 50 (weight = 3.274), SNP 10 (2.476), SNP 15 (2.258), and SNP 22 (2.087). PC2 correlates strongly with longitude (corr. coef. = 0.410; t = 6.6792, df = 221, p = 1.9e-10) and differentiates the ‘Expat’ individuals from the Coast and Barrier Island regions, which are in turn differentiated from the two Inland regions (Figure 2.7). Among the SNPs with the highest weight for PC2 are SNP 59 (weight = 2.409), SNP 56 (2.033), and SNP 46 (2.004). Together, the first two principal components recapitulate the geography of H. argophyllus. PC3 separates the Coast and Barrier Island regions from all other regions, and is most heavily weighted by SNP 27 (weight = 3.064) and SNP 28 (2.430), which are putatively 6 cM apart on linkage group 8, as well as SNP 56 (2.127), which also loads on PC2.

Figure 2.7 Genetic principal components of H.
argophyllus populations. The first four principal components (PCs) are significant using Tracy-Widom statistics (Table 2.4). PC1 and PC2 correlate strongly with latitude and longitude, respectively.

Across all H. argophyllus populations in Texas, there is a significant pattern of isolation by distance (Mantel test, observation = 0.355, p = 0.001; Figure 2.8, top left panel). However, within each of the three primary genetic regions identified at STRUCTURE K = 4, genetic and geographic distance are only correlated among North Inland populations (Mantel test, observation = 0.369, p = 0.037; Figure 2.8, top right panel), and not among South Inland or central coast populations (Figure 2.8, bottom panels).

Figure 2.8 Isolation by distance among all H. argophyllus Texas populations (top left) and within three genetic regions: North Inland (top right), South Inland (bottom right), and the central coast, including the Coast and Barrier Island populations (bottom left). Edwards’ genetic distance and Euclidean geographic distance are significantly positively correlated among all populations (p = 0.001) and within the North Inland region (p = 0.037), and not correlated among South Inland or Central populations.
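For readers unfamiliar with the Mantel test reported above, the following Python sketch shows the general logic: the correlation between two distance matrices is compared with the correlations obtained after jointly permuting the rows and columns of one matrix. It is a minimal illustration, not the mantel.randtest() code; the function names, the fixed random seed, and the convention of counting the observed value among the permutations are assumptions of this sketch.

```python
import math
import random


def upper_triangle(matrix, order):
    """Off-diagonal entries of a square matrix after relabelling by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, geographic, n_perm=999, seed=0):
    """One-sided Mantel test for a positive matrix correlation.

    Returns (observed correlation, permutation p-value). Population
    labels of the genetic matrix are shuffled, which permutes its rows
    and columns together and keeps the matrix a valid distance matrix.
    """
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    observed = pearson(upper_triangle(genetic, identity), geo)
    as_extreme = 0
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)
        if pearson(upper_triangle(genetic, order), geo) >= observed:
            as_extreme += 1
    # The observed value is counted as one of the possible outcomes.
    return observed, (as_extreme + 1) / (n_perm + 1)
```

Permuting whole rows and columns, rather than individual matrix cells, is what respects the non-independence of pairwise distances that share a population.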
2.3.2 Population structure and gene flow among all Helianthus populations

The SNPs used in this study strongly differentiate H. argophyllus from the other species in this analysis, but are not as useful for distinguishing among those species, even though H. annuus is relatively distantly related to either H. debilis or H. praecox. In the STRUCTURE analyses, K = 2 had the highest ΔK and distinguished H. argophyllus individuals from all other species (Figure 2.9). Several possible hybrids can be identified with > 10% membership in both clusters, five from putative H. annuus mothers (three from population arg6B–#60, and one each from two populations in the North region, btm28–#67 and btm35–#68) and nine from putative H.
argophyllus populations, all either USDA collections (ARG-420–#18, ARG-1805–#24, and Moz291–#55), from the North Inland region (arg4B–#47 and btm34–#49), or both (ARG-1812–#43 and ARG-1813–#42). Setting K = 5 maximized log likelihood, and the clusters identified in the H. argophyllus-only K = 4 analysis are recapitulated with the addition of a cluster containing all other species (Figure 2.9).

Figure 2.9 STRUCTURE barplots for all Helianthus populations in this study (top: K = 2, bottom: K = 5). The numbers correspond to the population ID column in Table 2.1. Each bar represents a single individual’s estimated membership proportion for each colored genetic cluster. Helianthus argophyllus populations (1–59) are ordered as in Figure 2.5, followed by H. annuus (60–73), H. praecox subsp. runyonii (74–77), and H. debilis subsp. cucumerifolius (78–82) populations, which consistently cluster together.

The only hierarchical level with a significant effect on population structure among all species was region (p = 0.001, see Figure 2.7). Population (p = 0.777), province (p = 0.194) and species (p = 1.000) did not significantly affect population structure under permutation.

Table 2.5 Hierarchical F-statistics for all Helianthus populations at five levels: species, province, region, population, and individual. Each cell represents the F[column level]/[row level] statistic, with individual nested within population nested within region nested within province nested within species.
The statistics on the diagonal (bolded) represent conceptual equivalents of FST at each hierarchical level, except FIndividual/Population which is equivalent to FIS. Only region has a significant effect on population structure under permutation (p = 0.001).

            Species   Province    Region      Population  Individual
Total       0.307522  0.36816859  0.40706374  0.4674837   0.5245493
Species     -         0.08757913  0.14374719  0.2309990   0.3134068
Province    -         -           0.06155938  0.1571861   0.2475038
Region      -         -           -           0.1018996   0.1981419
Population  -         -           -           -           0.1071622

There are six axes of genetic variance along which individuals vary significantly in the all Helianthus principal component analysis, four of which correspond with significant differences among the pre-defined regions and species (Table 2.6).

Table 2.6 Tracy-Widom (T-W) statistics for the six significant principal components of genetic variance among all Helianthus individuals. Proportion variance is the proportion of genetic variance explained by that axis. The ANOVA p-value tests whether the pre-assigned H. argophyllus regions and other species’ means differ significantly along that axis.

PC  Proportion variance  Eigenvalue  Difference  T-W statistic  T-W p-value  T-W effect size  ANOVA p-value
1   0.202                65.568      -           17.641         2.43e-23     23.99            < 1e-16
2   0.062                20.198      -45.371     15.306         3.28e-19     91.31            < 1e-16
3   0.052                16.926      -3.272      17.872         9.14e-24     121.27           1.11e-16
4   0.031                10.158      -6.767      4.852          3.24e-05     167.08           2.22e-16
5   0.029                9.481       -0.677      4.792          3.73e-05     179.71           0.10
6   0.025                8.114       -1.367      1.149          0.04         193.36           0.52

The first principal component cleanly separates H. argophyllus individuals from the other three species (Figure 2.10). Here again putative early-generation hybrids can be seen with coordinates between 0.01 and 0.05, which correspond to the six individuals identified as having >30% membership in both clusters in the K = 2 all population STRUCTURE analysis (see Figure 2.9).
The four SNPs with highest weight on this axis are SNP 19 (weight = 2.434), SNP 16 (2.195), SNP 48 (2.182), and SNP 54 (2.113).

Figure 2.10 The first two genetic principal components of all Helianthus populations. All symbols where species is not named in the legend belong to H. argophyllus. Putative early generation hybrids can be seen with coordinates between 0.01 and 0.05 on PC1.

Later principal components in the all-Helianthus analysis largely correspond with those from the H.-argophyllus-only principal coordinate analysis and are not shown here. Helianthus annuus, H. debilis and H. praecox cluster together along all significant axes except PC5, which differentiates H. annuus from the other two species, with H. argophyllus samples intermediate. PC5 is weighted most heavily by SNP 35 (weight = 2.582), SNP 30 (2.461), SNP 25 (2.424), and SNP 56 (weight = 2.167, also loading on PC2 and PC3 in H. argophyllus).

2.3.3 Relatedness

The grand mean of pairwise relatedness for individuals within families is 0.411 ± 0.191 s.d., which is closer to the average relatedness of full-siblings (0.5) than the average relatedness of half-siblings (0.25). Relatedness decreased predictably from within families to within populations to within regions to between regions (Figure 2.11). Individuals from different regions were on average slightly less related than would be expected from a random draw from the total sample (between regions: r = -0.0577 ± 0.0425 s.d.; total: r = 0.0271 ± 0.106 s.d.), another indicator of regional population structure.

Figure 2.11 Violin and boxplot (quartile) distributions of relatedness: (1) within families, (2) within populations but not family, (3) within region but not population, and (4) between individuals from different regions.
2.4 Discussion

Helianthus argophyllus exhibits clear, regional genetic population structure. This is perhaps unsurprising, given the degree of phenotypic differentiation in reproductive behavior observed at the regional level (Chapter 3). However, previous studies characterized the species as relatively genetically homogeneous (Strasburg et al. 2009; Strasburg et al. 2011). The discrepancy may reflect the depth of sampling in this study, or suggest that markers useful for comparisons between species (as in previous studies) may not always reveal intraspecific patterns. Most genetic variation appears to be latitudinal, with the four most divergent SNPs loading onto the first principal component, which correlates strongly with latitude. Three regions can be genetically distinguished in Texas populations: North Inland, South Inland, and the central coast (including the Coast and Barrier Island regions). Populations on the Barrier Islands are not genetically differentiable from populations in the Coast region. The observed regional genetic structure could be the result of multiple evolutionary processes. It is possible that some unobserved factor, for example pollinator distributions, limits gene flow among the three regions, and the primary driver of differentiation is genetic drift. This pattern of variation has resulted in range-wide isolation by distance, where genetic differentiation increases with increasing geographic distance due to genetic drift (Kimura & Weiss 1964). However, the scale of isolation by distance appears to be primarily regional: geographic distance does not correlate with genetic distance within the two regions with more densely distributed populations (South Inland and the central coast).
It is also possible that the three regions differ in selective regime, so that some trait values with high fitness in one region have low fitness in another, and so the observed genetic differentiation is the result of local adaptation and subsequent reduced gene flow. Had I observed a lack of population structure, I would have had some justification to suggest natural selection as the most likely driver of the observed phenotypic variation. As it is, the question remains open, to be addressed again in Chapters 3, 4 and 5.

2.4.1 Helianthus argophyllus and sympatric congeners

In this study, the three sympatric annual Helianthus species are clearly differentiated from H. argophyllus but not easily differentiated from each other. The SNPs used in these analyses were identified in H. argophyllus transcriptome sequence, and filtered for a 5% minor allele frequency in that dataset, so it is not surprising that they might be less informative in the other species. Indeed, at all but five sites the three species share the same major allele. While all 64 SNPs are variable in H. argophyllus, H. annuus individuals are fixed at 19 sites and H. praecox and H. debilis together are fixed at those sites and an additional fifteen. Those nineteen SNPs are likely novel mutations in the H. argophyllus lineage. There is some evidence for contemporary hybridization between H. argophyllus and sympatric Helianthus species, most likely H. annuus. I observed putative early generation hybrids in both the STRUCTURE and PCA analyses, in samples from phenotypically H. argophyllus and H. annuus mothers. One of these, from an apparently H. annuus mother, had an estimated ancestry proportion of 0.90 in the H. argophyllus STRUCTURE cluster at K = 2. This is consistent with observations I made while collecting: multiple species growing together in several localities, occasionally with a few phenotypically intermediate, putative hybrid individuals.
I took care not to collect seed from intermediate individuals (which would likely have low fertility in any case; Heiser et al. 1962), but decided to collect in those populations to gain an unbiased view of genetic variation in the species. It appears unlikely that hybridization has resulted in significant gene flow into Texan H. argophyllus from sympatric species, but this remains an open question. Helianthus argophyllus and H. annuus are closely related and capable of producing at least partially fertile offspring with artificial pollination (Heiser 1951), but they also differ in chromosomal arrangement (Barb et al. 2014), in soil preference (Heiser et al. 1969), and, especially for the late flowering H. argophyllus, reproductive timing. Helianthus annuus is relatively new to South Texas (Heiser 1954), and populations in Texas are at the southern edge of the species range. Helianthus argophyllus, on the other hand, appears to have expanded its range northward during the last 60 years (pers. obs., compared against maps in Heiser 1951). As such, their opportunities to exchange genes may be increasing as the two shift from a mainly parapatric to a more sympatric relationship. This may explain why the putative early generation hybrids in Texan H. argophyllus were primarily collected from North Inland populations.

2.4.2 Helianthus argophyllus expatriates

Helianthus argophyllus has established permanent populations in coastal areas around the globe, including the USDA accessions from the Atlantic coast of the United States, port towns in Mozambique and South Africa (Vischi et al. 2004), Queensland, Australia, as well as vouchered reports from Fiji, Tonga, and Japan (http://www.hear.org/pier/species/helianthus_argophyllus.htm). Interestingly, the widespread populations included in my analysis form a largely independent genetic cluster in both the H. argophyllus STRUCTURE K = 4 analysis and along PC2 of the H. argophyllus principal component analysis.
The populations appear to represent genotypes that are rare in the native range but have risen to high frequency in expatriate populations. This suggests a shared evolutionary history (e.g. a common origin or dispersal mechanism) or a shared history of selection. In any case, it is likely that the expatriate populations originated from ancestors of the South Inland, Coast, or Barrier Island populations, as they appear to share some ancestry with those regions.

2.4.3 Relatedness among family members

In an open pollinated, self-incompatible plant species, seeds from the same family are at minimum half-siblings (r = 0.25) but may also be full siblings (r = 0.5) if they share the same paternal parent or even more related if their parents are inbred (Wang 2014). The degree of relatedness is an important factor in calculations of quantitative genetics parameters, and had I assumed either a half-sibling or intermediate value (e.g. 0.33), I would have consequently over-estimated additive genetic variation for quantitative traits in H. argophyllus. The grand mean of within-family relatedness is certainly higher than I expected, given that H. argophyllus has a sporophytic self-incompatibility system that should act to reduce inbreeding among close relatives (Heiser et al. 1969; Gandhi et al. 2005). Possibly my method of collecting influenced this result: seeds collected at a single time point from a maternal plant may be more likely to share a paternal parent due to patterns of pollination and flowering. In any case, I am now armed with an empirical estimate of the relatedness of individuals within families in H. argophyllus, which will be useful in Chapter 3.

Chapter 3: Quantitative genetics and QST versus FST

3.1 Introduction

Phenotypic variation among individuals can be the result of genetic differences, plastic responses to variable environments, or both.
Determining which is the case is important to studying the evolution of a trait: our expectations of how a trait will respond to selection depend on our understanding of its genetic basis. Experiments to discriminate between phenotypic plasticity and genetic variation have a long and distinguished history, beginning with Darwin and his contemporaries and first used systematically by Clausen, Keck and Hiesey in 1940 (via Olmsted 1941). The simplest experiment involves growing individuals in a common environment, sometimes called a common garden experiment. If phenotypic variation observed among individuals in natural populations persists, one may conclude that this variation is the result of genetic variation. Conversely, if the observed natural variation disappears when individuals are reared together, it may stem from a plastic response to the environment shared by all populations. There are a number of caveats to this reasoning that limit our ability to completely rule out phenotypic plasticity: (1) genetic variation and phenotypic plasticity are not mutually exclusive and may act in combination to produce variation (e.g. Bull-Hereñu & Arroyo 2009), (2) parental effects may cause an inter-generational cascade of phenotypic plasticity with the result that populations in a common garden appear more genetically variable than they are (Roach & Wulff 1987; Räsänen & Kruuk 2007), and (3) phenotypic plasticity itself can have a genetic basis that varies among individuals (Scheiner 1993; Schlichting 1986). Further, the choice of common environment can significantly affect the degree of observed variation if that variation is expressed in some environments and not others. One way to address this issue, at least in part, is to compare phenotypic variation in a common garden to variation observed in natural populations.
If the observations are similar in degree and pattern, one can be reasonably assured that the choice of common environment did not significantly alter the expression of phenotypic variation.

By controlling for the effects of variable environments (VE), common garden experiments also enable researchers to measure the additive genetic variance of a trait (from individuals of known relatedness, e.g. parents and offspring, siblings). Traits with high additive genetic variance (VA) relative to phenotypic variance (VP) are traits for which relatives often resemble one another, and so we use the ratio of these two values to estimate narrow-sense heritability (h²). It is important to note that low heritability can reflect either a trait whose variance is primarily determined by the environment (VA << VE) or a trait with low additive genetic variance and little environmental variance (VA & VE << VP), including traits whose genetic variance has been depleted by selection (Houle 1992; Hartl & Clark 1997). Additive genetic variance can also be used to estimate the amount of genetic differentiation among populations. For quantitative traits, this measure is QST, representing the genetic variance for a trait among populations relative to the total genetic variance for that trait (Spitze 1993; Whitlock 2008). QST is analogous to FST: it represents the correlation of individual genotypes at loci underlying the quantitative trait in question within populations, relative to the correlation of genotypes at those loci across all individuals. This provides a way to detect divergent selection on a quantitative trait: for traits evolving neutrally, the mean and variance of QST values should equal the mean and variance of FST values for neutrally-evolving genetic loci, although the former distribution is more difficult to measure experimentally (McKay & Latta 2002; Whitlock 2008).
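For reference, the two ratios described above can be written out explicitly. The expression for QST is the form commonly given for diploid, outcrossing populations (as in the Spitze 1993 and Whitlock 2008 citations above); the subscripted symbols for among-population and within-population additive genetic variance are notation introduced here for illustration and are not taken from the thesis.

```latex
h^2 = \frac{V_A}{V_P},
\qquad
Q_{ST} = \frac{V_{A,\mathrm{among}}}{V_{A,\mathrm{among}} + 2\,V_{A,\mathrm{within}}}
```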
Consequently, any trait with a QST that falls outside of the distribution of FST values for neutral loci is likely to be under divergent (if higher) or stabilizing (if lower) selection among the set of populations being studied (Whitlock & Guillaume 2009). In practice, stabilizing selection can be difficult to detect unless populations are highly diverged; the distribution of FST per locus often overlaps with zero for closely related groups. The signal of divergent selection is conversely easier to detect when population divergence is low.

To examine genetic variation for phenotypic traits in Helianthus argophyllus and determine whether those traits show a pattern consistent with divergent selection, I grew populations collected from throughout the species range in a common garden and measured a suite of phenotypes. I ask: (1) How do quantitative traits co-vary in H. argophyllus? (2) What is the geographic pattern of phenotypic variation in H. argophyllus? (3) How heritable and ‘evolvable’ are these traits in H. argophyllus? (4) Are quantitative traits more divergent than expected given neutral genetic divergence?

3.2 Methods

3.2.1 Common garden

In 2010, I grew five individuals from each of five families from 19 wild-collected H. argophyllus populations in a common garden at the UBC Farm (7 North Inland, 6 Barrier Island, 3 Coast, and 3 South Inland, see Table 2.1). I scarified seeds from wild families on 5 May 2010 by removing 1/3 of the seed coat and cotyledon furthest from the embryo, and then germinated them in petri dishes on moist filter paper. In some cases, germination rates were low, so I scarified a second round of seeds on 11 May 2010. After seedlings developed a primary root and cotyledons (7–9 days post-scarifying), I transplanted them into a seedling medium and randomized them into racks of 98 one-inch diameter Ray Leach “cone-tainers”™ (Stuewe & Sons, Inc., Tangent, OR, USA).
I grew the seedlings in the UBC Horticultural greenhouse for three weeks under ambient light conditions at 22 °C. Before transplanting into the field, I moved the seedlings to partially shaded, external benches at the UBC Farm greenhouses for one week to acclimatize them to outdoor conditions. I included these two phases of protected development to reduce transplant-associated mortality.

The UBC Farm is a 40-hectare research, teaching and production facility on the University of British Columbia’s Point Grey campus. The field site for this project was approximately 0.1 hectare, located on the east edge of the farm property at N 49.248728, W 123.23652. I transplanted one individual from each family (five families per population) into each of five randomized complete blocks arrayed to account for two potential sources of microenvironment variation in the field (slope and shade). Each block was made up of two rows of randomly-ordered plants spaced 1 m apart, with 1 m separating the two rows. Blocks were separated by 3 m, with two border rows at each side of the plot and two border individuals at each end of a row to buffer against edge effects. After transplanting, I hand watered seedlings daily for one week. Mortality from transplant shock was < 2%, and I replaced those seedlings that died within the first week with another individual from the same family when possible. After the first week, I ceased watering, and my only further manipulation of the environment was a monthly removal of weeds during the first four months.

I also grew a subset of H. argophyllus accessions from the USDA in parallel with this common garden. I started these plants as described above and randomly assigned three plants per population to transplant locations in two rows planted just north of the common garden, without buffer plants.
I included this component because I was curious to see whether these collections differed phenotypically from mine: almost all of the USDA accessions were collected prior to 1985 from localities that mainly overlap with my collections, and so might represent an earlier evolutionary “snapshot” if patterns of variation in H. argophyllus have changed in the last few decades. Of course, these collections could differ from mine for other reasons, including selection imposed by the seed bulking process at the USDA and micro-environmental differences between the planting plots.

I censused both the common garden and USDA accession plants three times per week until either the plant started to senesce or was harvested the week of 22 November 2010. I noted occurrences of damage or apparent ill-health (e.g. herbivory, fungal growth, apical damage, etc.). I recorded days to first flower (hereafter, days to flower) as the number of days from germination to the first observation of reproductively active flowers (visible stigmas or pollen). On the first day that flowering was observed, I also recorded the length of the primary stem (flowering height, in cm) and a number of measures of the first flowerhead, including: diameter of disc (head width, in mm), disc floret color (purple/red or yellow), number of ray florets, the length and width of a typical ray floret ligule (ray length/width, in mm), and the length and width of a typical phyllary (phyllary length/width, in mm). I also noted whether the first flowerhead was produced apically on the primary stem (typical to the species), or on a branch (occurred in 16.7% of observed flowerings). On the same day, I removed a recently produced, fully expanded, unshaded and undamaged leaf blade and calculated specific leaf area as the ratio of fresh leaf area (measured in m² using ImageJ on scanned images of leaves; Schneider et al. 2012) to dry mass (measured in kg after leaf was dried for a minimum of 72 hours at 60 °C).
I timed leaf collection at a single developmental event to account for developmental differences in leaf production (Wilson et al. 1999). Specific leaf area is a measure of physiological investment in photosynthetic capacity: plants with leaves of high area per mass tend to have higher photosynthetic rates and increased growth rates, while plants with leaves of low area per mass tend to have slower rates of growth but also require lower water and nutrient input (Reich et al. 1998; Adler et al. 2014). I chose these traits and those described below because they are variable in the species as well as the genus Helianthus (pers. obs.; Heiser et al. 1969), and therefore potentially targets of natural selection.

I harvested one block per day during the week of 22 November 2010, regardless of its life history stage. At this point the majority of plants had initiated flowering (67.5%), and many had started to senesce, but the decision was also practical: the first frost of the season had occurred on 21 November 2010, and H. argophyllus is not frost tolerant. I recorded those plants that had not yet flowered (32.5%) as flowering at least 206 days after planting, and measured the characters described above that are not dependent on the presence of flowers immediately prior to harvesting. To harvest, I cut each plant down at the base of the primary stem, and recorded the length of the primary stem (mature height, in cm), the circumference of the primary stem at the first node (basal stem circumference, in cm), the length of the stem from the most apical leaf to the primary flowerhead (peduncle length, in mm), and the number of branches produced on the primary stem that were at least 2 cm in length (branch number). I also cut a section of the primary stem from the most basal node to 20 cm above.
I scored this section for pigmentation (stem color, 1 = entirely green to 5 = almost to entirely purple) and hairiness (stem pubescence, 1 = almost hairless to 5 = densely wooly hairs), and measured stem density as the ratio of dry mass (measured in g after section was dried for a minimum of 120 hours at 60 °C) to fresh volume (in ml, estimated from the mass of water displaced by the fresh section when fully submerged). Basal stem density is highly correlated with wood density in sunflowers, as a majority of both volume and mass of the stem base is woody secondary growth (Ziebell et al. 2013). Increased wood density predicts resistance to vascular embolism caused by drought stress (Hacke et al. 2001), and wood is also hypothesized to provide mechanical support and defense against herbivores and pathogens (Poorter et al. 2009).

3.2.2 Trait analyses

Many of the traits I measured are likely to co-vary (e.g. measurements on the size of flowerhead components). To account for this and to assess how traits co-varied, I performed a principal component analysis on all fourteen quantitative traits using the R package ‘FactoMineR’ (Husson et al. 2010). For each trait, I scaled the data to unit variance. I performed an analysis with only the common garden populations, with only the USDA populations, and with both datasets combined. Because the two datasets did not differ substantially, I present results from the combined analysis. I excluded both putative early-generation hybrid (see Chapter 2) and ‘Expat’ individuals from the final calculation of eigenvalues and eigenvectors, as these individuals are phenotypically distinct and my primary interest lies in understanding variation of H. argophyllus in the native range. I did, however, calculate their coordinates on the resulting principal components.
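The idea of fitting the principal components on one set of individuals and then placing the excluded individuals on the resulting axes can be sketched as follows. This is a minimal illustration in plain Python, not the FactoMineR analysis itself: it recovers only the first axis (by power iteration), uses the n - 1 denominator for scaling, and all function names and trait values are hypothetical.

```python
import math

def column_stats(rows):
    """Per-trait mean and sample standard deviation (n - 1 denominator)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return means, sds

def standardize(rows, means, sds):
    """Center each trait and scale it to unit variance."""
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def first_axis(z_rows, iterations=200):
    """Leading eigenvector of the trait correlation matrix (power iteration).
    The sign of the returned axis is arbitrary, as in any PCA."""
    n, p = len(z_rows), len(z_rows[0])
    corr = [[sum(r[i] * r[j] for r in z_rows) / (n - 1) for j in range(p)]
            for i in range(p)]
    v = [1.0 / (j + 1) for j in range(p)]  # arbitrary non-uniform start
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def scores(z_rows, axis):
    """Coordinates of each standardized individual along one axis."""
    return [sum(z * a for z, a in zip(r, axis)) for r in z_rows]

# Hypothetical data: rows are plants, columns are three traits.
reference = [[150.0, 20.0, 30.0], [160.0, 22.0, 33.0], [175.0, 25.0, 31.0],
             [190.0, 27.0, 36.0], [200.0, 30.0, 38.0]]
supplementary = [[120.0, 15.0, 25.0]]  # e.g. an excluded individual

# Means, scaling, and the axis come from the reference individuals only.
means, sds = column_stats(reference)
axis = first_axis(standardize(reference, means, sds))
ref_scores = scores(standardize(reference, means, sds), axis)

# Excluded individuals are standardized with the reference statistics
# and projected, so they do not influence the axis itself.
sup_scores = scores(standardize(supplementary, means, sds), axis)
```

The design point is that the excluded individuals reuse the reference means, standard deviations, and axis, so they receive coordinates without contributing to the eigenvectors.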
I asked whether regions differed in qualitative trait scores (branch flowering initiation, stem color, stem hairiness, and disc floret color) using Pearson’s Chi-squared tests with simulated p-values (using a Monte-Carlo test with 1,000 replicates), as several expected values of the contingency tables were less than 5. I visualized the contingency tables shaded with the resulting Pearson’s residuals in a mosaic plot created with the R package ‘vcd’ (Meyer et al. 2006).

3.2.3 Observation of wild populations

During the summer of 2011, I installed time-lapse cameras (Wingscapes WSCA04 Timelapse Outdoor PlantCam, EBSCO Industries, Inc., Birmingham, AL, USA) at seven wild populations in the natural range of Helianthus argophyllus. These seven populations included two North Inland, two Coast, and three Barrier Island sites that spanned the geographic range of observed phenotypic variation in flowering time, but did not include any South Inland sites (see Figure 3.1, circles). I installed the cameras 6–8 feet above ground, within 25 m of and facing at least 30 H. argophyllus individuals. The goal of the study was to assay the geographic pattern of flowering time in wild populations. Fortunately, both the silvery color of H. argophyllus vegetative growth and the bright yellow ray florets of the flowerheads contrast sharply with most environmental features in South Texas, and the species generally initiates flowering apically and is the tallest annual plant in the region. I programmed the cameras to take two 2560 x 1920 pixel photographs daily at 1 pm of the maturing populations (30 to >100 individuals per field of view) from late April 2011 until all populations had started to senesce in early December. I checked the cameras every four to six weeks, and adjusted/re-focused as necessary to observe the growing plants.
At one site, the focal plants were mowed before any had flowered, so the camera was repositioned to observe another set of plants that had not yet flowered. I noted the population-level first day of flowering as the first day that a H. argophyllus flower was photographed by a time-lapse camera at each site. Note that these data are potentially biased later than the actual first date of flowering in a population, as not all individuals were in the camera’s field of view, and even those that were may have flowered from a branch facing away from the lens, or been hidden by other individuals’ vegetative growth. Fortunately, the observed date of first flowering does not appear to be related to population density.

Figure 3.1 Time-lapse camera (filled circles) and census (crosses) observation sites of wild H. argophyllus populations.

I also censused seven additional populations (two North Inland, two Coast, and three Barrier Island, see crosses in Figure 3.1) in person weekly for flowering behavior. These populations are described in further detail in Chapter 5. I include in this analysis only the locations of these populations and the first date that we observed flowering in any focal plant in each population.

3.2.4 Quantitative genetic analyses

To compare regional differentiation in quantitative traits to neutral genetic differentiation, I used the R package ‘QstFstComp’ (Gilbert & Whitlock 2014). The QstFstComp function tests for a statistical difference between the observed QST for a quantitative trait and the mean FST of putatively neutral genetic markers, and includes a model that can account for an unbalanced dataset of half-siblings with a shared maternal parent, such as my data here.
QstFstComp conducts parametric resampling of QST and bootstrapping across marker loci to estimate the uncertainty of mean FST, assuming neutrality of both the phenotype and the marker loci. These two distributions are then used to generate a null distribution of QST minus FST against which the observed value can be compared. I designated a “half-sibling” breeding design with maternal plant nested within population and an average sibling relatedness of 0.411 (as estimated in Chapter 2). I input trait values for each individual within each family from each population in the common garden, excluding any family with fewer than three individuals measured for that trait. To account for block effects from the common garden, I adjusted each set of trait measurements by adding the trait mean to the residuals of a linear model with the trait as the response and block as a fixed effect. I included the fourteen quantitative traits measured in the common garden, along with the pseudo-quantitative traits of stem hairiness and stem color (the traits themselves are continuous, but for ease of measurement I binned them into a qualitative scale). I also calculated QST for individual coordinates on the first two principal components of the quantitative traits. I used 63 SNP genotypes for one individual per family from these populations (same individual subset presented in Chapter 2, excluding SNP 50 which showed strong evidence of non-neutrality) to calculate FST. I used 10,000 resampling steps for each QstFstComp analysis. I ran the analysis both with populations and with those populations grouped into region (North Inland, Coast, Barrier Island, and South Inland; see Table 2.1). The results did not differ substantially, so I report only the regional analysis, as this is the hierarchical level at which I detected population structure (Chapter 2).
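The block adjustment described above can be sketched as follows. This is a minimal illustration in Python rather than the code used for the analyses: with block as the only fixed effect, the fitted value for each plant is its block mean, so the adjusted value is the overall trait mean plus the plant's deviation from its block mean. The function name and example values are hypothetical, and the sketch assumes no missing measurements.

```python
from collections import defaultdict
from statistics import mean

def adjust_for_block(values, blocks):
    """Return grand mean + residual from a one-way model (trait ~ block).

    For a model with block as the only fixed effect, the fitted value of
    each observation is the mean of its block, so the residual is simply
    the observation minus that block mean.
    """
    by_block = defaultdict(list)
    for value, block in zip(values, blocks):
        by_block[block].append(value)
    block_means = {block: mean(vals) for block, vals in by_block.items()}
    grand_mean = mean(values)
    return [grand_mean + (value - block_means[block])
            for value, block in zip(values, blocks)]

# Hypothetical flowering heights (cm) for six plants in two blocks,
# where block B plants are uniformly 30 cm taller than block A plants.
heights = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]
blocks = ["A", "A", "A", "B", "B", "B"]
adjusted = adjust_for_block(heights, blocks)
```

After adjustment both blocks share the same mean, while the differences among plants within a block are unchanged.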
QstFstComp also outputs estimates of the additive genetic variance (VA) and the coefficient of additive genetic variance (CVA) for each trait, with 95% confidence intervals (Gilbert & Whitlock 2014). The former measure is relative to the unit in which the trait is measured, while the latter is standardized by the trait mean and can be compared among traits. The CVA of a trait is related directly to the response to selection, and so is sometimes termed a measure of the ‘evolvability’ of a trait (Houle 1992; Garcia-Gonzalez et al. 2012). For half-siblings, VA is equal to four times the variance among families (Cornelius 2006); in this case, it was scaled to reflect the observed degree of relatedness among siblings within a family. Because coefficients of variation are only sensible for data on a ratio scale (only non-negative values), I did not calculate CVA for the principal component coordinates. I calculated narrow-sense heritability (h²) for each trait as the ratio of additive genetic variance estimated by QstFstComp to the variance of the trait, or VA/VP. Note that my estimates of heritability will be inflated by the presence of maternal effects, given my breeding design (siblings from the same open-pollinated maternal plant).

3.3 Results

3.3.1 Phenotypic variation in the common garden

Helianthus argophyllus exhibited substantial phenotypic and additive genetic variation in the common garden. Figures 3.2–3.4 show the distributions of each quantitative trait, split by the region of origin.
[Figure 3.2: six panels plotting First Flowerhead Diameter (cm), Number of Ray Florets, Ray Floret Ligule Length (mm), Ray Floret Ligule Width (mm), Phyllary Length (mm), and Phyllary Width (mm) against Region (N_Inland, Coast, Island, S_Inland, Expat).]

Figure 3.2 Trait distributions for six floral morphology traits by region. Circles denote plants from the common garden, and Xs plants from the USDA plot, with black indicating individuals with putative hybrid ancestry (>30% in both clusters at K = 2 in the all-Helianthus STRUCTURE analysis). The median, 1st, and 3rd quartiles are plotted over individual data points that have been jittered to improve visualization.

Figure 3.3 Trait distributions for four bimodally-distributed traits (flowering time, height at flowering and maturity, and branch number) by region. Circles denote plants from the common garden, and Xs plants from the USDA plot, with black indicating individuals with putative hybrid ancestry (>30% in both clusters at K = 2 in the all-Helianthus STRUCTURE analysis). Individual data points have been jittered to improve visualization.
[Figure 3.3 plots: four panels of trait values by region (N_Inland, Coast, Island, S_Inland, Expat). Panels: Days to Flower, Branch Number, Height at Flowering (cm), Height at Maturity (cm).]

Figure 3.4 Trait distributions for two morphological and two functional traits by region. Circles denote plants from the common garden, and Xs plants from the USDA plot, with black indicating individuals with putative hybrid ancestry (>30% in both clusters at K = 2 in the all-Helianthus STRUCTURE analysis). The median, 1st, and 3rd quartiles are plotted over individual data points that have been jittered to improve visualization.
[Figure 3.4 plots: four panels of trait values by region (N_Inland, Coast, Island, S_Inland, Expat). Panels: Peduncle Length (cm), Specific Leaf Area (m^2/kg), Basal Stem Circumference (cm), Basal Stem Density (g/ml).]

There was substantial regional variation in quantitative traits, with the Island region and expatriate plants tending to differentiate from both Inland regions and most Coast plants. Four traits are clearly bimodally distributed, including flowering time, with most individuals from the native range outside of the Island region flowering late (Figure 3.3). Most of the few plants from either Inland region that flowered early were identified as early-generation hybrids in Chapter 2.

All but one (basal stem circumference) of the quantitative traits measured are highly correlated, and load onto the first principal component (Figure 3.5).
The first two principal components account for 47.68% and 9.81% of the variance among individuals in the common garden, respectively. Later principal components each account for < 5% of the observed variance and are not reported here.

Figure 3.5 Loading of quantitative traits on the first two principal components. The traits include: days to flower (Fdays), height at flowering (Fheight), width of first mature flowerhead disc (headW), the number of ray florets (rayN), width and length of a typical ray floret ligule (rayW & rayL), width and length of a typical phyllary (phyW & phyL), specific leaf area (SLA), peduncle length (pedL), height at harvest (matH), branch number (branches), basal stem circumference (basalC), and basal stem density (SSD).

[Figure 3.5 plot: trait loadings on Dim 1 (47.68%) and Dim 2 (9.81%), both axes from −1.0 to 1.0. Labeled traits: Fdays, Fheight, headW, rayN, rayW, rayL, phyW, phyL, matH, pedL, basalC, branches, SSD, SLA.]

Individuals form two discrete phenotypic clusters on the first two principal components that are largely differentiated along PC1 (Figure 3.6). Individuals with values less than ~1.5 on PC1 are late flowering and tall, and produce more branches, leaves of higher specific leaf area, initial flowerheads with somewhat more ray florets, and somewhat denser stems. Individuals with values greater than ~2.5 are short and early flowering, with larger initial flowerheads that are held further above the vegetative portion of the plant on longer peduncles. I will refer to these two sets of traits as a “flowering time syndrome”, with early and late modes. Individuals in both clusters vary in basal stem circumference.

Figure 3.6 The first two principal components of quantitative trait variation, explaining 47.68% and 9.81% of the variance among individuals. Individuals are plotted with circles colored by region. Regional mean values are plotted as triangles. Note that ‘Expat’ individuals were not used to construct these principal components.
[Figure 3.6 plot: individuals on Dim 1 (47.68%) and Dim 2 (9.81%), colored by region (N_Inland, Coast, Island, S_Inland, Expat).]

PC1 also differentiates among regions: both Inland regions and the Coast region fall in the “late flowering” syndrome, while the mean for the Barrier Island region is intermediate, with approximately equal representation of the individuals from the Island region in each cluster. This can be seen clearly in Figure 3.7, which shows the geographic pattern of variation in flowering time syndrome.

Figure 3.7 Geographic pattern of quantitative trait syndrome for H. argophyllus common garden plants. Individuals with PC1 values of > 2.5 are binned into the ‘early flowering’ trait syndrome (green), individuals with PC1 < 1.5 are binned in ‘late’ (orange), and individuals with intermediate values are in blue (see Figure 3.6). Each pie represents data from a population grown in the common garden, with the connected point fill color representing the region of that population (red = North Inland, purple = Coast, blue = Island, & green = South Inland). The diameter of each pie is proportional to the sample size for that population. The panel on the right is an expanded view of the grey box in the left panel. Note that most ‘early’ individuals in the North Inland populations (red points in left panel) are putative early generation hybrids (Chapter 2).

[Figure 3.7 maps: left panel covers longitude −99 to −95 and latitude 26 to 30 (scale bar 0–100 km; Houston and San Antonio marked); right panel covers longitude −97.6 to −96.8 and latitude 27.4 to 28.2 (scale bar 0–10 km; Corpus Christi marked). Legend: Late, Intermediate, Early; pie size key N = 10 and N = 30.]

The majority of H. argophyllus plants, and indeed plants in the genus Helianthus, produce their first mature flowerhead on the apex of their primary stem.
When I first observed the initiation of flowering on a secondary axis (a branch), I assumed it to be a plastic response to damage of the apical meristem by herbivores, pathogens, or mechanical injury. However, branch first flowering was not significantly associated with observed apical damage (Fisher’s exact test, p = 0.3722), and most plants that initiated flowering on a branch later produced an apical flower (pers. obs.). Instead, branch flowering initiation appears to be regionally differentiated and is observed more often than expected by chance in Coast individuals and less often than expected by chance in Island individuals (Figure 3.8). I also observed this phenotype in wild coastal populations (Figure 3.9).

Figure 3.8 Mosaic plot of the contingency table comparing regions by branch flower initiation. The presence of significant associations was tested using a Pearson’s Chi-squared test with MC simulated p-value. The cells are shaded with Pearson’s Chi-squared residuals: blue are larger than expected, and red are smaller.

[Figure 3.8 plot: mosaic of Region (Expat, North, South, Island, Coast) by Branch Flower Initiation (No, Yes).]

Figure 3.9 Branch initiation of flowering in a wild individual observed in coastal South Texas.

Stem hairiness is also strongly associated with region, while stem color is not (Figure 3.10). Plants from the North Inland region have stems with higher hairiness indices than expected, while plants from the Coast or Island regions have stems with lower hairiness indices than expected (Figure 3.11).

Figure 3.10 Mosaic plots of the contingency tables comparing regions by stem color (left) and stem hairiness (right). The presence of significant associations was tested using a Pearson’s Chi-squared test with MC simulated p-value. The cells are shaded with Pearson’s Chi-squared residuals: blue are larger than expected, and red are smaller.

Figure 3.11 Example stem sections for each stem hairiness index. Sparse (1) on left to dense (5) on right.
[Figure 3.10 plots: mosaics of Region (Expat, North, South, Island, Coast) by Stem Color (green --> purple, indices 1 to 5) and by Stem Hair (sparse --> dense, indices 1 to 5).]

3.3.2 Additive genetic variation and heritabilities

Estimates of heritability (h2) ranged from 0.25 for stem hairiness to 0.81 for flowering time (Table 3.1). Overall, the traits with the highest estimated heritability (> 0.5) included the four bimodally-distributed traits shown in Figure 3.3: flowering time, height at flowering, height at harvest, and branch number, along with peduncle length. Principal component 1 is also highly heritable (h2 = 0.55). The estimated coefficients of additive genetic variance (CVA), which are a measure of the ability of a population to respond to selection on that trait, span an order of magnitude from basal stem density to peduncle length (Table 3.1).

Table 3.1 Variance component and heritability estimates of fourteen quantitative traits, two pseudo-quantitative traits (stem color and hairiness rated on a scale of 1–5), and the first two principal components of quantitative traits. VP is the phenotypic variance, while VA and CVA are the additive genetic variance and the coefficient of additive genetic variance for that trait, given along with 95% confidence intervals (estimated using the R package QstFstComp). h2 is narrow-sense heritability, and is equal to VA/VP.

Trait                     VP       VA       VA C.I.          CVA    CVA C.I.      h2
Flowering time            1078.77  870.60   530.20–1268.83   17.35  13.54–20.95   0.81
Flowering height          5440.23  3462.68  1887.44–5278.84  20.91  15.44–25.82   0.64
Head width                40.42    19.56    9.26–31.58       22.24  15.30–28.26   0.48
Ray number                16.60    6.28     1.65–11.53       10.92  5.60–14.80    0.38
Ray length                83.53    41.94    20.45–66.88      26.36  18.41–22.38   0.50
Ray width                 17.19    8.46     4.08–12.74       28.69  19.92–36.57   0.49
Phyllary length           14.99    6.11     2.34–10.44       14.89  9.21–19.45    0.41
Phyllary width            3.39     1.69     0.81–2.73        20.22  13.97–25.82   0.50
Specific leaf area        25.45    8.86     2.03–16.80       13.43  6.44–18.49    0.35
Peduncle length           52.82    40.56    16.42–68.00      81.65  51.96–105.72  0.77
Harvest height            4386.74  3335.08  1827.54–5113.29  19.91  14.74–24.65   0.76
Branch number             191.50   125.72   70.63–190.91     22.36  16.76–27.55   0.66
Basal stem circumference  117.18   31.31    1.81–65.46       11.10  2.66–16.04    0.27
Basal stem density        <0.01    <<0.01   0.00–0.00        8.09   2.86–11.40    0.30
Stem color                1.13     0.24     0.00–0.57        18.25  0.00–28.25    0.21
Stem hairiness            1.10     0.28     0.06–0.52        17.79  8.59–24.26    0.25
PC1                       8.57     4.68     2.68–7.02        n/a    n/a           0.55
PC2                       1.73     0.47     0.04–0.94        n/a    n/a           0.27

3.3.3 QST versus FST

Estimates of quantitative trait differentiation at the regional level ranged from 0.00 for peduncle length to 0.44 for stem hairiness (Table 3.2). Overall the confidence intervals for QST were quite large, often overlapping with zero (Figure 3.12), which is likely the result of the unbalanced dataset. Table 3.2 reports two-tailed (QST – FST ≠ 0) and upper one-tailed (QST – FST > 0) p-values for the difference between QST and FST for each trait, tested against the resampled neutral distribution of QST minus FST for that trait. Because I am testing a hypothesis of divergent natural selection (QST > FST) and chose traits known to vary across the species’ range, the upper one-tailed value is likely appropriate, but I report both.
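The relationship between the one-tailed and two-tailed p-values described above can be illustrated with a minimal Python sketch. It assumes that each p-value is the proportion of resampled neutral QST – FST values at least as extreme as the observed value, and that the two-tailed value doubles the smaller tail; this is a common convention and is not confirmed here as the exact procedure used by QstFstComp. The function name and the placeholder neutral distribution are hypothetical.

```python
import numpy as np

def resampled_p_values(observed_diff, neutral_diffs):
    """Tail probabilities of an observed QST - FST value against resampled neutral values."""
    neutral = np.asarray(neutral_diffs, dtype=float)
    upper = float(np.mean(neutral >= observed_diff))  # tests QST - FST > 0
    lower = float(np.mean(neutral <= observed_diff))  # tests QST - FST < 0
    two_tailed = min(1.0, 2.0 * min(upper, lower))    # assumed doubling convention
    return {"upper": upper, "lower": lower, "two_tailed": two_tailed}

# Hypothetical neutral distribution: 100 placeholder values on an arbitrary scale.
neutral = np.arange(100)
result = resampled_p_values(95, neutral)
print(result)
```

Under this convention a trait with a large upper one-tailed p-value can still have a small lower one-tailed p-value, which is the situation reported for peduncle length.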
Floral display size (head width, ray floret ligule length and width, phyllary length and width) and stem hairiness are more regionally differentiated than expected by neutral genetic divergence (with QST significantly higher than FST; Table 3.2). QST may be larger than FST for height at flowering, although the evidence is not conclusive, and there is evidence that peduncle length is under stabilizing selection among the common garden populations (QST < FST, lower one-tailed p = 0.045). Unsurprisingly, PC1, which loads with all of the floral characters, appears to have QST > FST. I should note that the principal component QST analyses are not directly comparable to the single trait analyses: (1) the PC dimensions by design maximize individual variation (and might be expected to have high QST, although PC2 does not), and (2) the PC dimensions are derived from the single trait measurements.

Table 3.2 Regional quantitative trait differentiation (QST) for fourteen quantitative traits, two pseudo-quantitative traits (stem color and hairiness rated on a scale of 1–5), and the first two principal components of quantitative traits. The QST estimate and 95% confidence interval were calculated from the observed values using the R package QstFstComp, which also was used to estimate an expected neutral QST for that trait by bootstrapping across resampled simulations. The two-tailed p-values represent tests that the observed QST minus FST value is significantly different from a resampled neutral QST – FST distribution for that trait, while the upper one-tailed p-values test whether QST minus FST is significantly greater than the neutral distribution. * p < 0.05, ^ p < 0.1

Trait                     QST   QST C.I.    Neutral QST (resampled)  Upper one-tailed p-value  Two-tailed p-value
Flowering time            0.20  0.00–0.47   0.09                     0.102                     0.204
Flowering height          0.23  0.00–0.52   0.09                     0.072 ^                   0.143
Head width                0.31  0.01–0.64   0.09                     0.026 *                   0.051 ^
Ray number                0.17  -0.01–0.53  0.10                     0.205                     0.410
Ray length                0.30  0.01–0.63   0.09                     0.025 *                   0.050 ^
Ray width                 0.29  0.01–0.62   0.09                     0.032 *                   0.064 ^
Phyllary length           0.30  0.01–0.65   0.10                     0.041 *                   0.082 ^
Phyllary width            0.28  0.01–0.61   0.09                     0.041 *                   0.082 ^
Specific leaf area        0.24  -0.01–0.65  0.10                     0.111                     0.221
Peduncle length           0.00  -0.05–0.11  0.10                     0.955                     0.090 ^
Harvest height            0.15  0.00–0.40   0.09                     0.200                     0.400
Branch number             0.19  0.00–0.46   0.09                     0.127                     0.253
Basal stem circumference  0.06  -0.04–0.40  0.12                     0.557                     0.887
Basal stem density        0.09  -0.02–0.46  0.07                     0.412                     0.825
Stem color                0.03  -0.18–0.45  0.13                     0.649                     0.702
Stem hairiness            0.44  0.03–0.82   0.13                     0.022 *                   0.043 *
PC1                       0.29  0.02–0.60   0.09                     0.025 *                   0.050 ^
PC2                       0.07  -0.03–0.43  0.13                     0.498                     0.996

Figure 3.12 Means and 95% confidence intervals for FST (red) and QST of traits (black) and principal components (blue). The dashed red line indicates mean FST, and stars denote traits with QST minus FST significantly greater than a resampled neutral QST – FST distribution for that trait (Table 3.2). The traits are (left to right): days to flower, height at flowering, width of first mature flowerhead disc, the number of ray florets, length and width of a typical ray floret ligule, length and width of a typical phyllary, specific leaf area, peduncle length, height at harvest, branch number, basal stem circumference, basal stem density, stem hairiness (on a scale of 1–5), and stem color (on a scale of 1–5). The first two principal components of variation include the fourteen traits other than stem color and hairiness.

3.3.4 Observations in wild populations

The timing of flowering initiation observed in wild populations shows the same pattern as that observed in the common garden.
Populations from inland (in this case, only North Inland) initiated flowering later than populations from the Coast and Island regions (Figure 3.13).

[Figure 3.12 plot: y-axis Differentiation; x-axis categories: Fst, flwr days, flwr height, head width, ray num, ray length, ray width, phy length, phy width, SLA, ped length, mature height, branches, basal cir, stem density, stem color, stem hair, PC1, PC2.]

Figure 3.13 Observed first day of flowering in South Texas wild populations in 2010. Data from time-lapse camera (filled circles) and census (crosses) observation sites mapped in Figure 3.1. Note that this is a population-level measure, and that the value is the index of the date of observation counting from 1 Jan.

[Figure 3.13 plot: First Day of Observed Flowering (1 = 1 Jan), axis from 150 to 350, by Region (Inland, Coast, Island).]

3.4 Discussion

This study examines the geographic partitioning and genetic correlation of quantitative trait variation in Helianthus argophyllus. Overall, the measured traits are highly correlated and regionally divergent. These quantitative traits on the whole have relatively high heritability and additive genetic variance, and some appear to have evolved under regionally divergent selection.

3.4.1 Trait variation

Although I measured fourteen quantitative traits, only two axes of variation are apparent in both common garden and USDA-plot individuals: flowering time syndrome (PC1) and basal stem circumference (PC2). I observe very clear regional phenotypic differentiation: with the exception of a few, primarily putative hybrid individuals, both inland regions exclusively exhibit late flowering syndrome, which is also the majority syndrome in the Coast region. Plants from the Barrier Island region exhibit either syndrome at approximately equal proportions. This pattern is especially interesting in light of the fact that North and South Inland populations are strongly genetically differentiated, while Coast and Island individuals form a single genetic cluster (Chapter 2).
The mismatch between phenotypic and genetic clusters suggests that at least some of the traits measured may be under divergent selection between at least the Coast and Island regions.

Flowering time syndrome may represent a trade-off in resource allocation: individuals that flower early invest in larger flowers that are elevated further above the vegetative portion of the plant, but make relatively fewer branches (and consequently flowers) than late flowering plants that invest in further growth. While H. argophyllus has indeterminate growth, individuals generally cease vertical growth and the production of new branches after initiating flowering, which limits the maximum size of the plant. Similar trade-offs have been observed in other systems (Li & Johnston 2000; Kudoh et al. 2002), such as Rhinanthus glacialis, where a ten-week difference in the extremes of flowering initiation is related to a trade-off in timing versus growth: early flowering plants have higher survival rates but later flowering plants produce more flowers (Zopfi 1995). I will explore the relationship among size, phenology, and fitness in H. argophyllus further in Chapters 4 & 5.

One trait that does partition as one would expect from the genetic structure of the species is stem hairiness: North Inland individuals have hairier stems than other regions, and both Coast and Island individuals have less hairy stems than expected. Plant hair serves any number of adaptive (or maladaptive) roles, from protection against herbivory or UV radiation to providing a safe haven for fungal spores to germinate (Levin 1973), any or none of which might be playing a role here.

Branch initiation of flowering is another interesting trait. It may simply be a by-product of some developmental process or even wounding by insect predators, but I hypothesize that it may serve a functional role.
The apical flowerhead is almost always the largest, with the largest number of reproductive disc florets, while the branch-first flowerheads tend to be smaller with few disc florets but many ray florets, which are typically sterile (data not shown). Perhaps the branch-first flower serves as a distracter for seed predators or floral herbivores, directing them away from the important investment in reproduction that the apical flowerhead represents. This is of course entirely speculative.

There are three factors that limit the conclusions that I can draw from these data. First, the site at which I conducted this experiment is quite outside of the native range of Helianthus argophyllus. Experiments conducted in controlled environments do not always recapitulate the pattern of variation in the wild (e.g. Anderson et al. 2010). This issue is somewhat addressed by my observations in wild populations: for at least one trait, flowering time, wild populations exhibited a similar degree and geographic pattern of trait variation. Second, and stemming from the first, the fact that almost one third of the common garden individuals did not flower before harvest biases the available data for any flowering-associated traits, including height at flowering, floral display and size characters, peduncle length, and even specific leaf area. The data are likely also biased for some traits for which I do have data, such as height at harvest, as these plants might have continued growing before flowering in a warmer environment. However, assuming that very late flowering plants did not differ from somewhat late-flowering plants in the set of character values associated with the late flowering syndrome, many of my conclusions may still be valid. Finally, these plants were grown from seed collected from wild mothers, so I cannot rule out maternal effects. Sunflowers can exhibit large maternal effects, especially for early life history traits (e.g.
seed size, seed dormancy, and germination timing; A. N. Weiss et al. 2013; Alexander et al. 2014) and in hybrids between wild species (e.g. Sambatti et al. 2008) or between domesticated and wild populations of the common sunflower (Alexander et al. 2014). However, maternal effects in sunflowers tend to be smaller in wild populations (e.g. Mercer et al. 2011) and for later life history traits such as those measured here, even in hybrid individuals (e.g. flowering time, floral traits, mature size; Alexander et al. 2014). The study design, which minimized the effects of seed dormancy and differences in germination timing, should have helped to reduce even early life-history differences caused by maternal effects.

3.4.2 Quantitative genetics and QST versus FST

To respond to natural selection, a trait must be variable, heritable, and correlated with differences in fitness among individuals. This analysis addresses the first two requirements, and indirectly approaches the third. Many of the traits I measured are quite variable and highly heritable among plants in the common garden, although it is important to note that heritability estimated in a common environment can be inflated relative to wild individuals living in heterogeneous environments (e.g. Simons & Roff 1994, although see Weigensberg & Roff 1996). Coefficients of additive genetic variance, used by Houle (1992) as a way to estimate a trait’s ‘evolvability’, were also moderately high for most traits (Table 3.1). Floral display traits (floral display size, peduncle length) appear to harbor the largest amount of additive genetic variance (mean CVA = 29.28), followed by measurements of plant size (mean CVA = 18.57), stem color and hair (mean CVA = 18.02), and flowering time.
The estimated CVAs for physiological traits (specific leaf area & basal stem density, mean CVA = 10.76) are relatively low, which may reflect higher plasticity or a history of selection that has reduced variation in the loci underlying these traits. Physiological traits also have the lowest heritabilities in this study (mean h2 = 0.32) along with stem hair and color (mean h2 = 0.23), while flowering time heritability was quite high. The combination of high heritability and relatively low additive genetic variation for flowering time likely reflects the essentially qualitative nature of this trait in this dataset (i.e. either “early” or “late”).

Unsurprisingly, given the distribution of trait values, flowering time syndrome (or PC1) appears to be highly heritable (Table 3.1). This syndrome also shows a potential signal of divergent selection among regions: the difference between QST and FST for PC1 is significantly greater than would be expected under neutrality. This is also the case for stem hairiness and five measurements of floral display size (which may be driving the signal in PC1), but not for any measure of plant size, or for flowering time (Figure 3.12). This suggests that divergence in flowering time is no more than would be expected under neutral processes (however, see following paragraph). Divergent selection on floral display size is a reasonable hypothesis in this system, especially if there is a fitness trade-off between timing and growth. Taller plants, larger floral displays or flowers, and more conspicuous flowers all have been shown to increase pollinator attraction, and so may be favored in some environments (e.g. Cruzan et al. 1988; Eckhart 1991; Sandring & Ågren 2009). In other contexts, these larger displays may be more energetically costly, attract pests, or be constrained by abiotic factors, so alternate trait values may be favored across a species’ range (Strauss & Whittall 2006).
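The group means of CVA quoted in this section, and the relationship h2 = VA/VP, can be checked against the values in Table 3.1. The following Python sketch is only an arithmetic check and not part of the original analysis; the assignment of traits to groups is an assumption inferred from the text (for example, that ray number counts as a floral display trait), and all variable names are hypothetical.

```python
# CVA values transcribed from Table 3.1.
cva = {
    "head width": 22.24, "ray number": 10.92, "ray length": 26.36,
    "ray width": 28.69, "phyllary length": 14.89, "phyllary width": 20.22,
    "peduncle length": 81.65, "flowering height": 20.91,
    "harvest height": 19.91, "branch number": 22.36,
    "basal stem circumference": 11.10, "stem color": 18.25,
    "stem hairiness": 17.79, "specific leaf area": 13.43,
    "basal stem density": 8.09,
}

# Assumed grouping of traits (inferred from the text, not stated in the table).
groups = {
    "floral display": ["head width", "ray number", "ray length", "ray width",
                       "phyllary length", "phyllary width", "peduncle length"],
    "plant size": ["flowering height", "harvest height", "branch number",
                   "basal stem circumference"],
    "stem color and hair": ["stem color", "stem hairiness"],
    "physiological": ["specific leaf area", "basal stem density"],
}

def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)

group_mean_cva = {name: mean(cva[t] for t in traits)
                  for name, traits in groups.items()}

# (VP, VA, reported h2) for a few traits, also transcribed from Table 3.1.
variance_components = {
    "flowering time": (1078.77, 870.60, 0.81),
    "peduncle length": (52.82, 40.56, 0.77),
    "stem hairiness": (1.10, 0.28, 0.25),
    "PC1": (8.57, 4.68, 0.55),
}
h2 = {trait: va / vp for trait, (vp, va, _) in variance_components.items()}

for name, value in group_mean_cva.items():
    print(f"{name}: mean CVA = {value:.2f}")
for trait, value in h2.items():
    print(f"{trait}: h2 = VA/VP = {value:.2f}")
```

With this grouping the means round to 29.28, 18.57, 18.02, and 10.76, the values quoted in the text, and the recomputed heritabilities agree with the reported h2 to two decimal places.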
The model I used to calculate QST assumes that each trait is normally distributed when generating the neutral expectation for that trait, which may not be appropriate for bimodally-distributed traits like flowering time (Michael C. Whitlock, pers. comm.). Unfortunately, an appropriate model also capable of accounting for the design of the study does not yet exist. It seems wise, however, to take the results of the QST versus FST comparisons with some caution, especially given the wide confidence intervals for most QST values. Again, it is likely that the unbalanced dataset served to increase uncertainty. In addition to the non-normal distribution of some traits, the presence of maternal effects that increased trait similarities among siblings would likely bias estimates of QST downwards (Gilbert & Whitlock 2014). These plants were grown from seed collected in wild populations, so I cannot rule out maternal effects. I observed individuals sharing the same maternal parent that exhibited different flowering time syndromes, which suggests that the observed variation is not due to the effect of maternal environment alone.

3.4.3 Expatriate H. argophyllus

Returning again to the expatriate populations of H. argophyllus, I note that they represent a unique phenotypic cluster: exclusively and even ‘transgressively’ (relative to individuals collected in the native range) early flowering, with very large flowerheads and very long peduncles. Peduncle length was a trait with high heritability and a very high coefficient of additive genetic variance (at least as measured in the native populations in the common garden), as were other floral display characters, and reproductive timing is often a target of selection during range expansion or invasion (Barrett et al. 2008).
This again suggests that these widespread ‘expatriate’ populations may have experienced similar selective pressures in their new environments, or may share more recent common ancestry than might be expected given their current far-flung origins. The expatriate group was also the only group in which I observed yellow disc florets (in 14% of individuals; Figure 3.14), a trait that is common among other members of the genus but very rare in wild H. argophyllus (Heiser et al. 1969; pers. obs.).

Figure 3.14 Rare phenotype of yellow disc floret lobes, observed in only expatriate individuals in the common garden. This photo, however, was taken in a wild population in coastal South Texas, one of two wild individuals observed with this phenotype in 2011. See Figure 1.2 for typical disc floret color in H. argophyllus.

In this chapter I have established that flowering time is variable and heritable, as well as more regionally differentiated than might be expected from the results of Chapter 2 (at least for the Coast and Barrier Island regions). In the chapters that follow I explore how flowering time and correlated traits relate to plant fitness in the natural environment.

Chapter 4: Reciprocal transplant

4.1 Introduction

The classic method for detecting local adaptation under divergent ecological selection is the reciprocal transplant experiment (Turesson 2010; Olmsted 1941). Here, the fitness of populations in their native habitats is reciprocally tested against their fitness in a different part of the same species range. If divergent ecological selection has acted locally, foreign individuals should have reduced fitness relative to native individuals (Schluter 2000; Kawecki & Ebert 2004).

Even a clear signal of local adaptation, however, will not necessarily point to the phenotypic dimensions of adaptation. Identifying which phenotypes are under divergent selection requires further analysis.
One way to approach this is to estimate the relationship between phenotypic traits and fitness in each environment, and to ask whether differences in fitness among local and non-local individuals can be attributed to their phenotypes. This requires that one estimate the strength, shape, and direction of selection on a trait. The most widely-used approach to measuring selection on quantitative traits was first proposed by Lande and Arnold (1983). Put simply, this method involves ordinary least squares multiple regression of relative fitness (or some component of fitness) on a set of phenotypic traits that may or may not be correlated. The resulting regression coefficients can then be used to estimate the strength and direction of linear (directional) selection (termed β) directly on each trait while holding the other traits statistically constant. One can also use the squared value of each trait or the product of two traits as predictors to estimate the strength and direction of non-linear (quadratic) selection (termed γ). The implementation and interpretation of this approach are therefore straightforward, but it makes a problematic assumption: that the conditional distribution of fitness is normal, which is very often not the case (Mitchell-Olds & Shaw 1987; Shaw & Geyer 2010; Morrissey & Sakrejda 2013). Further, estimates of non-linear selection can be misleading, as they do not incorporate data about the range of observed trait values or reliably indicate the presence of fitness maxima or minima (Mitchell-Olds & Shaw 1987; Schluter 1988). However, the method has been widely adopted, and selection gradients estimated this way are often used to compare the strength, direction, and shape of selection across time, space, and biological units (e.g. Siepielski et al. 2013).

Recent developments in measuring selection gradients and fitness landscapes have provided alternative approaches that help to address issues with earlier methods.
First, Morrissey and Sakrejda (2013) present a method that estimates selection gradients via derivatives from generalized functions relating fitness to phenotype. These models can explicitly accommodate non-normal distributions of fitness and account for odd fitness function shapes, and the estimated selection gradients (and fitness functions) are quantitatively comparable to those proposed by Lande and Arnold (1983). A second approach, aster modeling, is particularly useful when examining the joint effects of multiple components of life histories (e.g. survival and reproduction) on fitness (Geyer et al. 2007). Aster models allow each variable to have a separate conditional distribution (e.g. Bernoulli for survival, Poisson for counts of offspring), and the structure of aster models reflects the real conditional dependence of later life history variables on earlier ones (e.g. reproduction is conditional on survival to reproduction). These biologically realistic features are lacking in other approaches, where multiple episodes of selection across the life cycle of an organism must be modeled independently (Shaw & Geyer 2010).

For this study, I conducted a reciprocal transplant of wild populations from across the range of H. argophyllus into four sites, two in the North Inland region, and two on the barrier islands. If H. argophyllus populations are locally adapted at the regional scale, plants from populations within the local region should have higher fitness than plants from other regions. Further, if divergent selection has driven local adaptation in flowering time, the locally optimal strategy for flowering time, estimated using the above methods, should match the observed pattern of phenotypic divergence (Chapter 3). With these hypotheses, I ask: (1) Is there evidence for local adaptation in H. argophyllus? (2) How does (female) fitness relate to age and size at flowering at each site?
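The Lande and Arnold (1983) regression described earlier in this introduction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis used in this thesis: traits are standardized to unit variance, fitness is converted to relative fitness, β is taken from a model with linear terms only, and the squared terms enter the quadratic model as one half of the squared trait value so that their coefficients can be read directly as γ. The function name, the trait columns, and the simulated noise-free data are all hypothetical.

```python
import numpy as np

def selection_gradients(traits, fitness):
    """Estimate linear (beta) and quadratic (gamma) selection gradients by OLS."""
    z = np.asarray(traits, dtype=float)
    w = np.asarray(fitness, dtype=float)
    z = (z - z.mean(axis=0)) / z.std(axis=0, ddof=1)  # standardize each trait
    w_rel = w / w.mean()                              # relative fitness
    n, k = z.shape
    ones = np.ones((n, 1))

    # Linear model: intercept + traits; coefficients after the intercept are beta.
    linear_design = np.hstack([ones, z])
    beta = np.linalg.lstsq(linear_design, w_rel, rcond=None)[0][1:]

    # Quadratic model: add 0.5 * z_i^2 and the pairwise products z_i * z_j.
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    cross = (np.column_stack([z[:, i] * z[:, j] for i, j in pairs])
             if pairs else np.empty((n, 0)))
    full_design = np.hstack([ones, z, 0.5 * z**2, cross])
    coef = np.linalg.lstsq(full_design, w_rel, rcond=None)[0]

    gamma = np.zeros((k, k))
    gamma[np.diag_indices(k)] = coef[1 + k:1 + 2 * k]
    for (i, j), g in zip(pairs, coef[1 + 2 * k:]):
        gamma[i, j] = gamma[j, i] = g
    return beta, gamma

# Hypothetical data: two traits (e.g. days to flower, height) for 200 plants,
# with a noise-free, purely directional fitness surface for illustration.
rng = np.random.default_rng(0)
raw = rng.normal(size=(200, 2))
std = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
fitness = 1.0 + 0.3 * std[:, 0] - 0.2 * std[:, 1]
beta, gamma = selection_gradients(raw, fitness)
print(beta)
print(gamma)
```

Because ordinary least squares is used throughout, the sketch inherits the normality assumption criticized above; real fitness data such as seed counts would usually call for one of the alternative approaches.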
4.2 Methods

4.2.1 Field experiment

Figure 4.1 Map of reciprocal transplant sites in coastal South Texas. Color indicates region (blues = Island, reds = North Inland) and shading indicates vegetation management status (dark = recent, bright = not within 10 years). Note that only two of four regions identified in Chapters 2 & 3 were included as transplant sites in this study. [Map extent: −98.0 to −96.6 by 27.6 to 28.4; scale bar 0 to 40 km; labeled points: Corpus Christi, Papalote, Welder, Marine, Port Aransas.]

In 2011, I conducted a reciprocal transplant experiment at two North Inland (Papalote & Welder) and two Barrier Island (Marine & Port Aransas) sites in coastal South Texas (Figure 4.1). I will hereafter refer to these sites as ‘inland’ or ‘island’. I chose these sites for their availability and size, and to span the range of the phenotypic cline observed among the common garden populations while requiring no more than three hours driving between the two most distant sites. All sites were located within 100 m of a wild H. argophyllus population. Within each region, one site had a recent history of human vegetation management (mowing or grazing; at Papalote and Marine) and one site had been reportedly free of human management for at least ten years (Welder and Port Aransas). Helianthus argophyllus grows at high density in both types of environment, although individual plants are generally not tolerant of apical damage. At three sites I transplanted 11 wild-collected H. argophyllus populations: two Coast, two South Inland, three North Inland, and five Barrier Island (see Table 2.1 for included populations). At the fourth, smaller site of Port Aransas, I transplanted a subset of seven populations: one South Inland, two North Inland, and four Barrier Island. I chose to include a larger representation of the Barrier Island populations because these populations showed the greatest phenotypic variation in the common garden experiment (Chapter 3).
I scarified seeds from wild families on 11 March 2011, and then germinated them in petri dishes on moist filter paper. After seedlings developed a primary root and cotyledons (7–9 days post-scarifying), I transplanted them into a sandy clay loam soil medium (similar to the soil at the transplant sites) and randomized them into racks of 98 one-inch diameter Ray Leach “cone-tainers”™ (Stuewe & Sons, Inc., Tangent, OR, USA). I grew the seedlings in a Brackenridge Field Laboratory (University of Texas at Austin) greenhouse for three weeks under ambient light and temperature conditions. Before transplanting into the field, I transported the seedlings to South Texas and hardened them off for one week in partial shade (at the Welder site for the plants intended for the inland plantings, and at the Marine site for those intended for the island plantings).

Before transplanting and for the first month after transplanting, I removed any wild silverleaf sunflower seedlings growing at the transplant sites to avoid confusion with experimental plants. I tagged each experimental individual with a labeled aluminum tab, and marked its location with two small plastic stakes. At the three larger sites, I transplanted one individual from each of five families per population into each of four randomized blocks, for a total of 55 individuals per block. At the smaller site, Port Aransas, I planted only three blocks, with a total of 35 individuals per block. I transplanted the Welder and Marine sites on 4 April 2011, and the Papalote and Port Aransas sites on 5 April 2011. At all sites, each block consisted of a row of plants spaced 1 m apart, separated from the next block by 2 m. I included a border row to either side of the experimental blocks and two border individuals at either end of each block to mitigate edge effects. After transplanting, I hand watered seedlings every other day for two weeks.
I replaced the seedlings that died within this period (from transplant shock or animal uprooting) with another individual from the same family when possible.

Two weeks post-transplanting I began censusing each site approximately weekly. I noted any damage or apparent ill health, as well as the first date that a plant was observed as dead (or missing) and the putative cause of death. These causes included wilting, herbivory, uprooting, and burial. For those plants that survived to flower, I noted the first date that a reproductively active flower was observed, along with the length of the primary stem (flowering height, in cm). To put these date data on the same scale as those in Chapter 5, I transformed each date to the index of that day counting from 1 January 2011. I continued to census the plants until senescence, when I counted the number of branches and the number of flowerheads that the plant had produced in its lifetime as an estimate of (female) fecundity. The last recorded flowering date was 7 November, or 310 days, and the last date of census was 8 December 2011.

4.2.2 Selection analyses

I asked whether sites differed by causes of death using a Pearson’s Chi-squared test with a simulated p-value (using a Monte-Carlo test with 1,000 replicates), as several expected values of the contingency table were less than 5. I visualized the contingency table shaded with the resulting Pearson’s residuals in a mosaic plot created with the R package ‘vcd’ (Meyer et al. 2006).

I analyzed viability and fertility selection both independently and together using aster models (Geyer et al. 2007).

Viability selection

For viability selection, I estimated survival functions for each site and for each region of origin (Island, Coast, North Inland, & South Inland) at each site using three modelling approaches: non-parametric (Kaplan-Meier), semi-parametric (Cox Proportional Hazards), and parametric (e.g. regression specifying a specific hazard function).
Each approach has advantages and disadvantages (see below), but in these analyses the results of all approaches are in agreement. The data for these analyses are right-censored; I truncated the ‘observation period’ on the day after the last plant to die before flowering was recorded, as I am primarily interested in survival to reproduction.

Kaplan-Meier estimator

The Kaplan-Meier (KM) estimator has the advantage of being able to take into account censored data and is free from any assumptions about the underlying hazard function except that the survival function between observed time points is constant (Fleming & Harrington 1984). I fit KM survival functions using the survfit function of the R package ‘survival’, and tested for differences among the fitted functions using the survdiff function, which implements the G-rho family of tests (Harrington & Fleming 1982).

Cox proportional hazards

The Cox proportional hazards model is semi-parametric because it makes no assumptions about the form of the baseline hazard function (Therneau & Grambsch 2000). This makes it a popular approach, but it also assumes that the effects of the covariates are constant and proportional, i.e. each unit increase of a covariate results in a proportional scaling of the hazard ratio independent of time (Grambsch & Therneau 1994). I only fit Cox models to the within-site comparisons of region of origin, as the between-site comparison matrix contains singularities (i.e. at two sites, all plants died). Similarly, I was not able to fit within-site models including population or family as fixed effects. I fit mixed Cox models using the R package ‘coxme’ (Therneau et al. 2003) including family within population as random effects. I tested the proportional hazards assumption for each model by fitting a weighted least-squares line to the residuals of each covariate and evaluating the hypothesis that the slope = 0, implemented with the function cox.zph.
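As an aside on the Kaplan-Meier estimator described above, the product-limit calculation can be sketched in a few lines. This is a minimal illustration in plain Python, not the survfit implementation used for the analyses; the function name kaplan_meier and the toy data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed death time.

    times  : follow-up time for each plant (e.g. days after transplanting)
    events : 1 if death was observed at that time, 0 if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every plant leaves the risk set at its own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving past t
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Deaths and censored plants at time t are both removed afterwards
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: six plants, three of them censored (event = 0)
toy_times = [10, 20, 20, 35, 50, 50]
toy_events = [1, 1, 0, 1, 0, 0]
km_curve = kaplan_meier(toy_times, toy_events)
```

Censored plants contribute to the risk set up to the time they are censored but never trigger a drop in the curve, which is why the estimate stays constant between observed deaths.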
Given that the data fit the proportional hazards assumption, I tested for the effect of region of origin using likelihood ratio tests against baseline models lacking this predictor. I obtained the hazard ratio for each category of a predictor variable as the exponential of the maximum partial log-likelihood estimate of the Cox regression coefficient.

Parametric survival regression

Parametric survival regressions provide more power to detect differences among groups, but they assume a specific distribution that may or may not be appropriate for the data. I estimated parametric survival functions using the survreg function of the R package ‘survival’ (Therneau & Grambsch 2000). I assumed a log-logistic distribution for all survival functions, as models assuming this distribution uniformly had the lowest Akaike Information Criterion values relative to models assuming normal, log-normal, logistic, exponential, or Weibull distributions. I tested for the effects of predictors (site, region, population, and the interaction of site by region for the all sites analysis; region and population for the within-site analyses) using likelihood ratio tests against baseline models lacking those variables. To test for differences in the log-logistic scale parameter (which models the ‘rate’ of death) among categories of a significant predictor, I used the Wald test against a normal distribution. For each fitted model I also visually compared the estimated parametric survival functions to the Kaplan-Meier estimates to examine how the assumed distribution matched the observed shape of the data.

Fertility selection and selection gradients

As no plants survived to flower at either Inland site and very few survived at the Island sites, I combined all Island site survivors into a single dataset. With so few data points, any complex model is likely to be over-fitted, so this analysis should be taken with some caution.
Nevertheless, I was interested in examining the relationship between female fitness (estimated as the number of flowerheads produced), days to flower, and height at flowering for the surviving plants. I used the R package 'gsg' to estimate the selection gradients and fitness functions for each trait and their interaction (Morrissey & Sakrejda 2013). I fit both a Lande and Arnold (1983) multiple regression model (which approximates a least-squares regression of relative fitness on phenotype) and a generalized additive model (GAM) with a Poisson error distribution, separate smoothing functions for each trait, and no smoothing function for the interaction. The former approach is the field standard, and provides estimates of selection gradients that can be standardized to compare across studies (Morrissey & Sakrejda 2013). The latter approach is essentially a spline-based semi-parametric regression analysis, following the approach advocated by Schluter (1988), which has the advantage of not requiring that the residuals of fitness be normally distributed. I checked the fit of the GAM using the gam.check function, which runs diagnostic tests to determine if the basis dimensions of the smooth terms are large enough to avoid over-smoothing (Morrissey & Sakrejda 2013). For the GAM, I estimated the directional and quadratic selection gradients for each trait by approximation to the first and second order partial derivatives of population mean fitness with respect to mean phenotype, calculating standard errors and p-values using parametric bootstrapping with 1000 replicates (as implemented in the gam.gradients function; Morrissey & Sakrejda 2013). For the multiple regression model, I estimated the selection gradients using the regression approach described by Lande and Arnold (1983; again with the gam.gradients function), and calculated standard errors and p-values by case bootstrapping over 1000 replicates.
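The least-squares half of this procedure can be sketched as follows. This is a minimal illustration in plain Python of linear selection gradients with case-bootstrap standard errors, not the gsg implementation used for the analyses: the helper names (solve, linear_gradients, case_bootstrap_se), the z-score standardization, and the toy data are all assumptions made for the example, and quadratic terms are omitted for brevity.

```python
import random
import statistics


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def linear_gradients(fitness, traits):
    """OLS of relative fitness on z-scored traits; returns one beta per trait."""
    mean_w = statistics.fmean(fitness)
    w = [f / mean_w for f in fitness]  # relative fitness
    cols = []
    for j in range(len(traits[0])):
        col = [row[j] for row in traits]
        mu, sd = statistics.fmean(col), statistics.stdev(col)
        cols.append([(v - mu) / sd for v in col])
    design = [[1.0] + [c[i] for c in cols] for i in range(len(w))]
    k = len(design[0])
    xtx = [[sum(r[p] * r[q] for r in design) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * wi for r, wi in zip(design, w)) for p in range(k)]
    return solve(xtx, xty)[1:]  # drop the intercept


def case_bootstrap_se(fitness, traits, reps=1000, seed=1):
    """Resample individuals with replacement; return the SD of each gradient."""
    rng = random.Random(seed)
    n = len(fitness)
    draws = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            draws.append(linear_gradients([fitness[i] for i in idx],
                                          [traits[i] for i in idx]))
        except ZeroDivisionError:
            continue  # skip degenerate resamples (no trait variation)
    return [statistics.stdev([d[j] for d in draws]) for j in range(len(draws[0]))]


# Hypothetical toy data: (days to flower, height in cm) for 12 plants,
# with flowerhead number set to an exact linear function of height.
toy_traits = [(250, 150), (255, 90), (260, 120), (262, 60), (268, 140), (270, 80),
              (275, 110), (280, 70), (284, 100), (290, 130), (265, 95), (272, 125)]
toy_fitness = [0.5 * h - 20 for _, h in toy_traits]
betas = linear_gradients(toy_fitness, toy_traits)
ses = case_bootstrap_se(toy_fitness, toy_traits, reps=200)
```

Because the toy fitness values depend only on height, the gradient on days to flower comes out at zero (to numerical precision) while the gradient on height equals the slope rescaled by the trait standard deviation and mean fitness.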
For both models, I standardized the estimated selection gradients to the trait variance. I also used the gsg package to estimate the fitness landscape for each trait (independent of the other trait) as well as for their interaction from the GAM fitness function using the fitness.landscape function (Morrissey & Sakrejda 2013). For the fitness functions of each trait, I also estimated 95% prediction intervals using parametric bootstrapping with 1000 replicates.

Aster modeling

To examine the joint effects of viability and fertility selection, I used aster modeling as implemented in the R package ‘aster’ (Geyer et al. 2007). This approach is designed for life history analysis, where the variables that make up fitness often have different probability distributions (e.g. Bernoulli for survival, Poisson for counts of offspring). An aster model’s joint distribution of multiple variables is the product of conditional distributions of each life history variable (e.g. reproductive fitness is conditional on survival to reproduction). I created a very simple graph for my analysis, with just two nodes: survival to flower, coded as 0 or 1, pointing at number of flowerheads produced, coded as 0 for all non-surviving individuals. I specified a Bernoulli distribution for survival and a Poisson distribution, truncated at 0 to manage the large number of non-reproducing individuals, for the number of flowerheads. As recommended by Shaw and Geyer (2010), I modeled fitness as an unconditional canonical parameter, testing for the effect of region of origin as a predictor of fitness by comparing a full model with the interaction of region of origin and number of flowerheads as a predictor to a null model (with only site and the graph nodes as predictors) using a likelihood ratio test.
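The likelihood ratio tests used throughout these analyses compare the deviance D between nested models with a chi-squared distribution. The sketch below assumes that standard reference distribution and uses a closed form that only holds for even degrees of freedom; the function name chi2_sf_even_df is hypothetical. The two deviances are values reported in the Results (the site by region interaction in the log-logistic model, and the region of origin term in the aster model).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with even df."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i   # (x/2)^i / i!
        total += term
    return math.exp(-half) * total


# Deviances and degrees of freedom as reported in the Results
p_interaction = chi2_sf_even_df(1.91, 6)    # reported as p = 0.927
p_region = chi2_sf_even_df(105.63, 6)       # reported as p < 0.001
```

For tests with odd degrees of freedom, a general chi-squared survival function from a statistics library would be needed instead.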
My analysis was restricted to the two Island sites; I was unable to include the data for the Inland sites, or to include variables for block, location within block, or population of origin, because the cells for these variables contain singularities. The Inland site analyses, at least, would not have differed substantially from the viability selection analyses (node 1 of my graph), and population was not a significant term in any survival model for either Island site. I calculated maximum likelihood estimates of the number of flowerheads for an individual from each region at each Island site using the full model described above.

4.3 Results

4.3.1 Viability selection

No plants survived to flower at either inland site (Welder or Papalote), while few plants survived to flower at each island site: 10 (4.2%) at Marine and 11 (10.5%) at Port Aransas. The overall survival functions differed significantly among the four sites (N = 809; KM: χ² = 177, df = 3, p < 0.0001; Parametric: D = 95.26, df = 17, p < 0.0001; Figure 4.2). The Port Aransas log-logistic scale parameter (α) is significantly larger than the Marine scale parameter (3.85 versus 3.58; Wald p < 0.0001), which is in turn significantly larger than either scale parameter for Papalote or Welder (3.39 & 3.41, respectively; Wald p < 0.0016).

Including population as a predictor improved the model fit significantly (D = 44.97, df = 12, p < 0.0001), while including an interaction term for site by region of origin did not (D = 1.91, df = 6, p = 0.927), so that term was dropped from the final model. As can be seen in Figures 4.2 and 4.5, the log-logistic regression curves tend to fit the data (represented by Kaplan-Meier survival functions) more closely at earlier time points but converge at later times, suggesting that the regression analyses might have reduced power to detect later differences in survival.

Figure 4.2.
Kaplan-Meier (solid line) and log-logistic regression (dashed) estimates of the overall survival functions for each site (inland sites = red, island sites = blue). Site significantly affects the Kaplan-Meier survival function estimates (χ² = 177, df = 3, p < 0.0001). The Port Aransas log-logistic scale parameter (α) is significantly larger than the Marine α (3.85 versus 3.58; Wald p < 0.0001), which is in turn significantly larger than the α for either Papalote or Welder (3.39 & 3.41, respectively; Wald p < 0.001). Note that the log-logistic estimates appear to fit less well at later time points, and that the Papalote and Welder log-logistic curves are largely overlapping. [Axes: Days after Transplanting (x) and Proportion Surviving (y); curves labeled Papalote, Welder, Marine, Port Aransas.]

At all sites, the primary observed (proximate) cause of death was wilting (Figure 4.3). There were also differences in cause of death among the sites (Pearson’s χ² = 62.64, simulated p-value (1,000 replicates) < 0.001). Port Aransas experienced greater death by herbivory than expected, Welder had both greater death by unearthing and more missing plants than expected, and Papalote had fewer missing plants than expected (Figure 4.3). The primary source of missing plants was likely herbivory: I observed both deer and ants removing all above-ground tissue from several plants, although in one instance I also observed an uneaten tagged plant that had been unearthed and moved about 300 m away from the transplant site.

Figure 4.3 Mosaic plot of the contingency table comparing site (Inland: P = Papalote, W = Welder; Island: M = Marine, A = Port Aransas) to cause of death (w = wilted, b = buried, u = unearthed, h = herbivory, m = missing). Cells are shaded with Pearson’s χ² residuals: blue are larger than expected, and red are smaller.

At three sites (both inland sites & Marine), the survival functions did not differ significantly by region of origin in any analysis (Table 4.1a–c).
At both inland sites, including a fixed term for population significantly improved the fit of the log-logistic regression, indicating that individual populations differed in estimated hazard ratio (Table 4.1). Population terms did not improve the fit of the log-logistic regression models of survival at the Island sites (Table 4.1).

Table 4.1 Model significance and terms for survival functions by region of origin at each reciprocal transplant site. The models are: non-parametric Kaplan-Meier (KM), semi-parametric Cox proportional hazards (Cox PH), and log-logistic regression. The Cox PH models were fit with random effects for family within population, while the log-logistic models include population as a fixed effect. I tested for differences in the scale parameter coefficient for each region in the log-logistic regressions using a Wald test. The Welder dataset violates the proportional hazards assumption of the Cox PH model, and so that analysis should be interpreted with caution. * p < 0.05, ** p < 0.01, *** p < 0.001.

Model          Test statistic   df   p-value      Scale coefficient
(a) Papalote (inland), N = 239
KM             χ² = 3.1         3    0.373        -
Cox PH         D = 2.35         5    0.795        -
Log-logistic   D = 30.35        15   0.011*       no difference
(b) Welder (inland), N = 240
KM             χ² = 2.5         3    0.478        -
Cox PH         D = 7.78         5    0.169        -
Log-logistic   D = 45.85        15   <0.001***    no difference
(c) Marine (island), N = 240
KM             χ² = 2.8         3    0.430        -
Cox PH         D = 3.04         5    0.693        -
Log-logistic   D = 7.48         15   0.940        no difference
(d) Port Aransas (island), N = 105
KM             χ² = 12.8        2    0.002**      -
Cox PH         D = 9.75         4    0.043*       -
Log-logistic   D = 18.21        7    0.011*       Island > North & South

At the fourth site, Port Aransas, region of origin had a significant effect on survival function in all three analyses (Table 4.1d).
At this site, the hazard estimated by the Cox proportional hazards model for Island plants was 68% of the hazard for North Inland plants, and 39% of the hazard for South Inland plants (Figure 4.4). Similarly, a Wald test of the log-logistic regression coefficients revealed that the scale parameter for Barrier Island plants (the ‘rate’ of the function) was significantly larger than that for the North Inland or South Inland plants (3.91 versus 3.53 and 3.25, respectively; Figure 4.5). Collectively, these analyses present strong evidence that plants from the barrier islands survived at higher rates than inland plants at the Port Aransas site.

Figure 4.4 Cox proportional hazards estimated survival functions for plants from different regions at each site (solid = maximum likelihood estimate, dashed = 95% confidence interval; red = North Inland, green = South Inland, purple = Coast, blue = Barrier Island). At Port Aransas, Barrier Island plants experienced a significantly lower hazard (68% versus North Inland and 39% versus South Inland), evidence for local adaptation. At all other sites, region of origin did not significantly affect the hazard or survival functions. [Four panels, one per site (two Inland, two Island); x-axis: Days after Transplanting; y-axis: Proportion Surviving; legend: Island, Coastal, North Inland, South Inland; Cox P.H. estimate and 95% C.I.]

Figure 4.5 Kaplan-Meier (solid line) and log-logistic regression (dashed) estimates of the survival functions for plants from different regions at each site (red = North Inland, green = South Inland, purple = Coast, blue = Barrier Island).
At Port Aransas, both estimators agree with the Cox PH model (Figure 4.4): the Barrier Island survival function is significantly different from either Inland survival function, with a larger log-logistic scale parameter (α = 3.9 vs. 3.5 North and 3.3 South Inland). At all other sites, there are no significant differences among the estimated survival functions for any region of origin. [Four panels, one per site (two Inland, two Island); x-axis: Days after Transplanting; y-axis: Proportion Surviving; legend: Island, Coastal, North Inland, South Inland; Kaplan-Meier and log-logistic estimates.]

4.3.2 Fertility selection and selection gradients

Among the 21 plants that survived to flower, region of origin did not have a significant effect on the number of flowerheads produced (ANOVA, F = 1.80, df = 3, p = 0.19). For both Inland (North and South) and Island individuals, the number of flowerheads produced ranged from 1 to 92, with the lone Coast survivor producing seven flowerheads. Days to flower and height at flowering were not significantly correlated (Pearson’s product-moment correlation: t = -1.19, df = 19, p = 0.25), unlike the observed pattern of trait correlation in the common garden (Chapter 3). The lack of correlation may be due to a single outlier individual (see Figure 4.6, bottom left); removing that individual reveals a significant negative correlation (r = -0.57, t = -2.97, df = 18, p = 0.008), which is opposite in direction to the correlation observed in the common garden.

Figure 4.6 Trait space occupied by surviving individuals at the two island sites, colored by region of origin (red = North Inland, green = South Inland, purple = Coast, blue = Barrier Island).
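The correlation tests reported above give a t statistic for the full dataset and a correlation coefficient for the reduced one, and the two quantities are related by t = r·sqrt(df / (1 − r²)). The short sketch below applies that standard identity to the reported values; the function names are hypothetical and the inputs are the rounded figures from the text, so small discrepancies with the reported statistics reflect rounding.

```python
import math


def r_from_t(t, df):
    """Pearson correlation implied by a t statistic with df = n - 2."""
    return t / math.sqrt(t * t + df)


def t_from_r(r, df):
    """t statistic implied by a Pearson correlation with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


r_all = r_from_t(-1.19, 19)       # all 21 survivors
t_trimmed = t_from_r(-0.57, 18)   # early-flowering outlier removed
```

With all 21 survivors the implied correlation is weakly negative (about -0.26), so removing the outlier strengthens a negative relationship rather than reversing a positive one.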
[Figure 4.6 axes: Flowering Time (days) on the x-axis; Height at Flowering (cm) on the y-axis.]

Both the Lande and Arnold (1983) and the Morrissey and Sakrejda (2013) models of the fitness function predict significant directional selection on height at flowering (Table 4.2). Only the GAM approach predicted significant positive quadratic selection (as well as marginally significant negative directional selection) on flowering time. Indeed, the sign and magnitude of the estimated quadratic selection gradients on individual traits are quite different between the two approaches, which may represent differences in the effects of non-normal trait distributions or small sample size, but could also result from the odd shape of the fitness function (Morrissey & Sakrejda 2013; Figure 4.8). Inclusion or removal of the plant that flowered very early (205 days; Figure 4.6) does not substantially affect the estimated selection gradients.

Table 4.2 Estimated standardized selection gradients (β: linear; γ: higher order, e.g. quadratic) for days to flower and height at flowering from the number of flowerheads produced (a measure of female fitness). (a) GAM selection gradients were estimated by approximation to the first and second order partial derivatives of population mean fitness with respect to mean phenotype, following Morrissey & Sakrejda (2013). (b) Selection gradients estimated using the approach first proposed by Lande and Arnold (1983). Standard errors (SE) and p-values were estimated by parametric (a) or case (b) bootstrapping over 1000 replicates. ^ p < 0.1, * p < 0.05, *** p < 0.001.
Coefficient                    Estimate   SE      p-value
(a) GAM selection gradients
β: days to flower              -0.290     0.143   0.056 ^
β: height                       0.659     0.046   <0.001 ***
γ: days to flower               1.251     0.475   0.038 *
γ: height                       0.131     0.099   0.182
γ: days to flower*height       -0.314     0.222   0.242
(b) Least squares regression selection gradients
β: days to flower              -0.216     0.278   0.296
β: height                       0.747     0.224   <0.001 ***
γ: days to flower              -0.028     0.290   0.800
γ: height                      -0.117     0.462   0.334
γ: days to flower*height       -0.218     0.415   0.226

The estimated fitness function for the interaction of days to flower and height at flowering reflects the observed GAM selection gradients (and is derived from them; Figure 4.7). It is steepest in the height dimension and expected population mean fitness declines dramatically with shorter plants (Figure 4.8, top). For flowering time, population mean fitness is maximized at earlier flowering, but only if population mean height is tall (Figure 4.7, left). At shorter mean height (or when height is held constant), population mean fitness appears to increase at either extreme, consistent with the significant positive quadratic selection gradient (Table 4.2; Figure 4.8 bottom). When the unusually early flowering individual is excluded from the analysis, the fitness landscape shifts somewhat, particularly along the flowering time dimension (Figure 4.7, right). In this landscape, population mean fitness is maximized at either moderately early flowering (~265 days) or late flowering (after 285 days), suggesting the possibility of two fitness optima.

Figure 4.7 Contour plots of two dimensional fitness landscapes: expected population mean fitness as a function of hypothetical population mean values for flowering time and height at flowering. The panel on the left includes all observations, while the panel on the right does not include the individual that flowered unusually early (at 205 days, Figure 4.6). [Both panels: x-axis Flowering Time (days); y-axis Height at Flowering (cm).]
Figure 4.8 Fitness landscapes for individual traits. Population mean fitness is shown as a function of height at flowering (top) and flowering time (bottom), each evaluated holding the other constant at the observed population mean. The dashed lines indicate 95% prediction intervals estimated by bootstrapping, similar to a standard error. These fitness landscapes are estimated including all observations. [x-axes: Height at Flowering (cm), top, and Flowering Time (days), bottom; y-axes: Flowering Head Number.]

4.3.3 Aster modelling

Aster modelling shows a highly significant effect of region of origin on lifetime fitness as represented by the number of flowerheads produced (likelihood ratio test versus the null model: D = 105.63, df = 6, p < 0.001). The maximum likelihood estimates of predicted fitness for a ‘typical’ individual from each region at each site are shown in Figure 4.9. For both island sites, only Barrier Island (that is, local) individuals have a predicted fitness distribution that does not overlap with zero.

Figure 4.9 Expected flowerhead number for an average individual from each region of origin at the two island sites (Marine = solid; Port Aransas = dashed), showing maximum likelihood estimates ± 95% confidence intervals from an aster model of the interaction of region of origin with fitness (full model described in Section …). [x-axis: Region of Origin (Coast, Island, North Inland, South Inland); y-axis: Predicted Fitness (Flowerheads); legend: Marine, Port Aransas.]

4.4 Discussion

This study provides evidence for local adaptation in H. argophyllus. When differences in fitness by region of origin were observed, local individuals had higher rates of survival (at Port Aransas), and higher predicted cumulative fitness (at both Island sites).
The aster model results are likely largely driven by the differences in survival, as among the surviving plants local individuals did not have higher fecundity. This observation makes it unlikely that the observed local adaptation had any association with flowering time. In fact, most plants initiated flowering within a window of ~20 days between late September and October, regardless of region of origin. Some proportion of the observed effects may be due to maternal effects (e.g. Alexander et al. 2014), as this study again used seed collected directly from wild populations. I observed no differences in fitness between locals and non-locals at Inland sites: every plant died, at more or less equal rates. This pattern of non-reciprocal local adaptation is not uncommon in plant reciprocal transplant experiments (Leimu & Fischer 2008), but 2011 also may have been an unusual year for South Texas.

The year 2011 had the hottest summer on record for the state (http://www.ncdc.noaa.gov/cdo-web/datatools/records), although this record has since been surpassed. This is likely related to the high rates of death by wilting before reproduction. I observed similar mortality in wild populations adjacent to the reciprocal transplant sites (Chapter 5). In addition, high temperatures may explain the higher mortality at inland sites: coastal Texas generally experiences a more moderate climate than inland areas. Unlike the other three sites, the site with the highest survival rates (Port Aransas) was shaded by surrounding trees, which may have acted to further ameliorate the effect of high temperature.

It is important to note that fitness as estimated in this study only represents the female fecundity component of each hermaphroditic plant. With that caveat, I observed significant selection on both flowering time and height at flowering among the surviving study plants at the two Island sites.
This is straightforward for height at flowering: there was strong directional selection on height, with taller plants producing more flowerheads. For flowering time, the story is more complicated, and it is likely that the small dataset obscures much of the picture. However, there is suggestive evidence for disruptive selection on flowering time, with alternate optima represented at either end of the observed flowering range. If this signal is indeed real, it would be consistent with the observation of early and late flowering plants at approximately equal proportions in Island populations (Chapter 3). Although the selection gradient for the interaction of flowering time and height at flowering was not significantly different from 0, plants with the highest observed fitness flowered early and at a large size.

Flowering time in H. argophyllus may be fully or partially constrained by a plant size threshold, below which flowering is delayed or suppressed. Several studies, primarily in perennial plants, have shown a threshold-size effect on flowering time (Wesselingh et al. 1997; Burd et al. 2006). In this case, individuals that grew more slowly (due to genotype or stress caused by environmental micro-heterogeneity) would be expected to flower later and at a relatively smaller size than individuals that grew more quickly. Many species exhibit an earlier age at maturity under improved growth conditions (Stearns & Koella 1986; Stearns 1989). This hypothesis is consistent with the potentially negative correlation between age and height at flowering observed in this study, in contrast to the strong positive correlation between those traits under less stressful conditions in the common garden (Chapter 3). Plants in the common garden overall grew more quickly and reached larger mature sizes relative to the survivors of the reciprocal transplant.
Given high rates of mortality by wilting at all sites, the surviving reciprocal transplant plants likely experienced significant water stress. I attempted to control for environmental heterogeneity within each site by experimental design, but cannot rule it out completely. In addition, local adaptation may have acted to moderate the degree of stress experienced by each plant, if plants that survived were also better able to access soil or water resources or tolerate local biotic factors.

If selection in H. argophyllus on the Texas barrier islands is largely for increased plant size, with large plants having higher fitness, then a threshold size for the transition to flowering could represent adaptive phenotypic plasticity. Further, depending on the shape and slope of the reaction norms and the type or existence of this developmental threshold, natural selection could favor alternate flowering strategies: fast-growing and early flowering versus slow growing and late flowering. It is possible that phenotypic plasticity could consequently be maintaining the observed phenotypic variation for flowering time in Barrier Island populations (Chapter 3). However, my ability to extrapolate is hampered by my limited data. The characterization of selection in H. argophyllus needs further work, and I explore this in more detail in Chapter 5.

Chapter 5: Selection in wild populations

5.1 Introduction

Perhaps the most direct method to assess the strength, direction, and shape of natural selection on phenotypes is to observe wild populations. The classic approach is to measure the relationship between components of fitness (e.g. survival, mating success, reproduction) and a set of phenotypic traits over the course of a single generation or across multiple generations (Endler 1986; Linnen & Hoekstra 2009).
With these data, one can use the approaches discussed in Chapter 4 to estimate selection gradients and fitness functions, and evaluate the evidence for natural selection on those traits.

The results of Chapter 4 suggest that selection in H. argophyllus may be acting more directly on plant size at first reproduction than on flowering time, and possibly also on the reaction norm relating flowering time and size at maturity to environmental variation. Both selection for optimal flowering phenology and size at maturity (e.g. Lotz 1990) and the genetic correlation among these traits can vary across environments (Stearns 1989; Schlichting 1989). In annual plants a positive correlation between age and size at flowering is relatively common (e.g. Mitchell-Olds 1996; Franks & Weis 2008; Colautti & Barrett 2013). However, negative correlations between age and size at maturity are also quite common (Rowe & Ludwig 1991), and may be largely related to developmental thresholds (Day & Rowe 2002). Environmental variation can modify these correlations via reaction norms, such that traits that are positively correlated in one environment may be negatively correlated in another (Burd et al. 2006; Heino et al. 2002; Day & Rowe 2002; Stearns 1989; Schlichting 1989). Many studies have demonstrated dramatic changes in the phenotypic architecture of plants under resource stress (e.g. water limitation, competition) relative to plants in optimal conditions (e.g. Volis et al. 2004; Pigliucci & Schlichting 1998).

If divergent selection is driving the observed regional variation in flowering time in H. argophyllus, then the direction and possibly shape of selection on flowering time should vary across regions. Specifically, later flowering plants should be favored at Inland, and, to a lesser degree, Coast sites, while earlier flowering plants should be selectively neutral or favored at Barrier Island sites.
Further, if flowering time is constrained by a plant size threshold and individuals vary in their bivariate reaction norm with respect to age and size at flowering, then the strength and possibly sign of the genetic covariance between age and size at flowering should change in more stressful conditions (e.g. be negatively correlated in ‘stressful’ wild environments despite strong positive genetic correlations in the common garden, Chapter 3). To address these hypotheses, I studied the relationship between reproductive (female) fitness and age and size at flowering in a single generation in seven wild populations of H. argophyllus along a cline from the barrier islands to North inland Texas. I ask: (1) How do age and size at flowering affect (female) fitness in each region? (2) How are age and size at flowering correlated in each region, and do the correlation coefficients vary predictably with environmental ‘stress’? (3) Are patterns of selection consistent with the geographic partitioning of phenotypic variation observed in the common garden (Chapter 3)?

5.2 Methods

5.2.1 Data collection

In early June 2011, I identified seven sites at which to track individual plants throughout the growing season (Figure 5.1): three in the Barrier Island region, two in the Coast region, and two in the North Inland region. I required that each site be relatively undisturbed (although in one case, Mancaves, this requirement was not met) and have evidence of a persistent local population in the form of old, remnant woody stems in various states of decomposition.

Figure 5.1 Wild population sites chosen for observation of natural selection on size and age at flowering. Note that the South Inland region was not represented in this study.
[Figure 5.1 map not reproduced. Labeled sites: Island, Pack, Beach, Tracks, Mancaves, Welder, Far; reference city: Corpus Christi; legend: Region (Inland, Coast, Island); scale bar: 0 to 40 km.]

At each site I established three 100-meter transects of 50 tagged plants, with transects separated by at least 100 m. I haphazardly chose transect placement and direction by tossing a Frisbee into a patch of individuals, then proceeding from that point in the direction that the Frisbee landed when tossed vertically into the air. If the chosen direction did not contain enough individuals, I repeated this process. I attached a labeled aluminum tag to the individual plant closest to each two-meter point along the transect. After tagging, I measured the height of each plant along the main stem in centimeters.

Following the initial tagging, I returned to each site approximately weekly and censused the plants for budding, flowering, and survival. On the first day I observed a reproductively active flower, I measured the plant’s height and noted whether the first flower was produced on a branch or the primary stem apex. Because I did not have information about the timing of germination, I measured flowering time as the index of days post 1 January 2011. If a plant died before any plant in the population (across all three transects and among all untagged plants in sight) had flowered, I noted this and tagged the next nearest individual. As I was primarily interested in the relationship between fitness and flowering time, I ceased re-tagging once the first plant in a population had initiated flowering. Once a plant ceased producing new flowerheads and started to senesce, I counted the number of flowerheads it had produced over its lifetime. I continued to census until all focal plants had either died before reproducing or ceased producing new flowerheads.
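The day index used for flowering time can be mapped back to calendar dates with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes that 1 January 2011 is counted as day 0 (the text does not say whether the index starts at 0 or 1).

```python
from datetime import date, timedelta


def index_to_date(days_post_jan1, year=2011):
    """Convert a count of days after 1 January into a calendar date.

    Assumption: 1 January itself is day 0.
    """
    return date(year, 1, 1) + timedelta(days=days_post_jan1)


def date_to_index(d):
    """Convert a calendar date into days elapsed since 1 January of that year."""
    return (d - date(d.year, 1, 1)).days


# Example: an index of 257 falls in mid-September 2011.
example = index_to_date(257)
print(example.isoformat())     # 2011-09-15
print(date_to_index(example))  # 257
```

If the original index instead counted 1 January as day 1, every converted date would shift one day earlier.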
5.2.2 Data analyses

To understand how the correlation between flowering time and height at flowering varied by site, I ran a simple linear regression predicting the Pearson’s product-moment correlation coefficient between the two traits by survival rate (as a proxy for environmental ‘stress’) at each site. I calculated survival rate as the percent of all original focal individuals (i.e. excluding replacement individuals) that survived to produce at least one flowerhead. I tested the fit of the regression model using ANOVA. I also tested for significant correlations between flowering time and height at flowering at the regional level using two-tailed t-tests based on Pearson’s product-moment correlation coefficients. To examine the relationship between fitness (measured as number of flowerheads produced) and age and size at flowering, I used the R package 'gsg' as described in Chapter 4. I explored the data for each site, and found that sites within a region showed consistent patterns of selection. As I am primarily interested in differences at the regional scale, for each region I fit both generalized additive models (with Gaussian distribution functions and random effects for transect within site) and traditional Lande and Arnold (1983) multiple regression models. For the Island and Coast regions, there were sufficient data to justify fitting each GAM with a tensor product smooth of the two traits. The tensor product smooth is scale invariant, which is useful when traits are measured on different scales (Wood 2006; Morrissey & Sakrejda 2013). At the Inland sites, very few individuals survived to flower, so I fit the GAM with separate smoothing terms for each predictor. I chose Gaussian distribution functions for the GAMs because they uniformly minimized model Akaike information criteria relative to models fit with Poisson distributions, even though the response variable is count data.
I checked the fit of each GAM using the gam.check function, and estimated selection gradients for each trait by approximation to the first and second order partial derivatives of population mean fitness with respect to mean phenotype (with gam.gradients; Morrissey & Sakrejda 2013). I calculated standard errors and p-values using parametric bootstrapping with 1000 replicates. For the multiple regression model, I estimated selection gradients as described by Lande and Arnold (1983), and calculated standard errors and p-values by case bootstrapping over 1000 replicates. For both models, I standardized the estimated selection gradients to the trait variance. I also used gsg to estimate fitness landscapes from the GAM fitness function at each site for the interaction of the two traits and each trait independently, holding the other trait constant at the observed population mean.

5.3 Results

Pre-reproductive mortality varied dramatically across the sites, with the lowest survival at the Inland sites and the highest survival at the Coast site Tracks and the Barrier Island site Island (Table 5.1). Table 5.1 also reports the trait means and standard errors for flowering time, height at flowering, and flowerhead number at each site. The most common observed cause of death was wilting (62%), with herbivory as the second most common (28%). Survival at each site, as a proxy for environmental stress, significantly predicted Pearson’s product-moment correlation coefficient for flowering time and height at flowering (slope = 0.009, R² = 0.58; F = 6.81, df = 1 and 5, p = 0.048; Table 5.1 and Figure 5.2).
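The regression of the per-site correlation coefficient on percent survival can be re-derived from the rounded per-site values reported in Table 5.1. The sketch below is not the original analysis script (the thesis analyses were run in R); it is a plain-Python ordinary least squares fit, with hypothetical function names, and because the inputs are rounded it should only approximately reproduce the reported slope, R² and F.

```python
# Per-site values transcribed from Table 5.1, in the order:
# Far, Welder, Tracks, Mancaves, Beach, Island, Pack.
survival_pct = [2, 9, 63, 57, 43, 67, 48]
corr_coeff = [-0.77, -0.68, 0.21, -0.47, -0.38, -0.36, -0.23]


def simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns slope, intercept, R^2, and the F statistic on (1, n - 2) df.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    f_stat = r_squared / (1 - r_squared) * (n - 2)
    return slope, intercept, r_squared, f_stat


slope, intercept, r_squared, f_stat = simple_ols(survival_pct, corr_coeff)


def predict_corr(survival):
    """Predicted correlation coefficient at a given percent survival."""
    return intercept + slope * survival


print(f"slope = {slope:.4f}, R^2 = {r_squared:.2f}, F = {f_stat:.2f}")
```

The p-value is omitted because it requires the F distribution, which the Python standard library does not provide. Calling `predict_corr` for survival rates outside the observed 2 to 67% range is an extrapolation and should be read with corresponding caution.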
Table 5.1 Per-site summary of percent survival (of the 150 original focal plants at each site, excluding replacement individuals), the sample size for plants that survived to flower (N), flowering time (index days post 1-Jan), height at flowering (in cm), Pearson’s product-moment correlation coefficient for flowering time and height at flowering, and flowerhead number (a proxy for female fecundity). Values for the three traits are means ± SE. See also Figure 5.2, and Figure 5.3 for regional pattern.

Site      Region          Survival (%)  N    Flowering Time (days)  Height at Flowering (cm)  Corr. Coeff.  Flowerhead Number
Far       North Inland     2              5  297.8 ± 3.4             38.8 ± 4.3               -0.77          0.1 ± 0.1
Welder    North Inland     9             20  286.9 ± 3.2             62.1 ± 11.6              -0.68          1.4 ± 0.9
Tracks    Coast           63            126  267.8 ± 2.0            136.3 ± 7.2                0.21         23.6 ± 3.8
Mancaves  Coast           57            133  259.9 ± 1.2            114.8 ± 6.8               -0.47         14.3 ± 2.4
Beach     Barrier Island  43             90  248.0 ± 2.9            137.1 ± 7.2               -0.38         53.0 ± 10.0
Island    Barrier Island  67            122  267.1 ± 1.4            148.0 ± 6.3               -0.36         39.8 ± 4.2
Pack      Barrier Island  48            129  254.0 ± 2.4             89.9 ± 5.5               -0.23         26.0 ± 4.4

Although I did not observe a bimodal distribution of flowering time, the distribution of flowering dates in each region corresponds well with that observed in the common garden (Chapter 3): more early flowering individuals at Island sites than at Coastal sites, and no early flowering individuals Inland (Figure 5.3). Regional mean flowering date was 257 days post-January 1st in the Barrier Island region, 264 in the Coast region, and 289 in the Inland region. Among the plants that survived to flower, flowering date and height at flowering were significantly negatively correlated in the Barrier Island region (corr. coef. = -0.194: t = -3.644, df = 337, p < 0.001) and Inland region (corr. coef. = -0.689: t = -4.554, df = 23, p < 0.001), and not significantly correlated in the Coast region (p = 0.92; although see Table 5.1 for between site variation).
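The t statistics reported for the regional correlations follow from the standard relationship between Pearson's r and Student's t, t = r·sqrt(df / (1 − r²)) with df = n − 2. The short sketch below (hypothetical function name) applies it to the rounded coefficients given in the text, so it agrees with the reported t values only to about two decimal places.

```python
import math


def t_from_pearson_r(r, df):
    """t statistic for testing a Pearson correlation against zero.

    df is the degrees of freedom (number of paired observations minus 2).
    """
    return r * math.sqrt(df / (1.0 - r * r))


# Rounded correlation coefficients and degrees of freedom as reported in the text.
t_island = t_from_pearson_r(-0.194, 337)
t_inland = t_from_pearson_r(-0.689, 23)

print(f"Barrier Island: t = {t_island:.2f}")
print(f"Inland:         t = {t_inland:.2f}")
```

Converting these statistics into two-tailed p-values requires the t distribution, which is not in the Python standard library and is therefore left out of this sketch.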
Figure 5.2 Pearson’s product-moment correlation coefficient for flowering time and height at flowering versus the percent of surviving individuals at each site (a proxy for environmental stress). Each site is labeled with the site name (see Figure 5.1) and colored by region (red = Inland, purple = Coast, blue = Island). The line represents the best fit regression of the correlation coefficient on survival at each site (slope = 0.009, R² = 0.58; F = 6.81, df = 1 and 5, p = 0.048).

[Figure 5.2 plot not reproduced. Axes: survival of focal individuals (x) versus Pearson's correlation coefficient (y); points labeled Beach, Far, Island, Mancaves, Pack, Tracks, Welder.]

Figure 5.3 Trait space occupied by wild individuals who survived to flower in each region (Island = blue, Coast = purple, Inland = red). Symbols denote sites within a region as indicated in each panel legend. Note that censusing for flowering occurred approximately weekly: the data are jittered to improve visualization.

[Figure 5.3 plots not reproduced. Three panels, each showing flowering date in days (x) versus height at flowering in cm (y), with site legends Beach, Island, Pack; Tracks, Mancaves; and Welder, Far.]

Figure 5.4 Tensor product smooth-based contour plots of the GAM function relating flowerhead number to flowering time and height at flowering in the Island region (blue, top) and the Coast region (purple, bottom). Partial residuals are plotted as points. The numbers on the contour lines represent flowerhead number (a proxy for female fecundity). Note that the smooth function will perform less well away from the data.
[Figure 5.4 plots not reproduced. Axes in both panels: flowering time in days (x) versus height at flowering in cm (y).]

The estimated fitness functions (the fitted tensor product smooth for each GAM) are quite steep, with dramatic changes in predicted fitness over relatively narrow phenotypic ranges (Figure 5.4). Island and Coast sites appear to differ primarily in the dimension of flowering date. These functions are not directly comparable to the Inland analysis (Figure 5.5), as the smoothing functions differ. Note that the Inland model was fit with few data points, and so resulting estimates should be interpreted with caution.

Figure 5.5 The GAM smooth functions relating each term (left: flowering date in days; right: height at flowering in cm) to flowerhead number in the Inland region. Partial residuals are plotted as points and as a rug at the foot of each plot. Note that this model is likely overfitted due to low sample size: the shape of the smooth function for height, right, is largely driven by three data points.

[Figure 5.5 plots not reproduced. Two panels: flowering date (left) and height at flowering (right) on the x-axis, smoothed fitness function on the y-axis.]

I detected significant positive directional selection on height at flowering at all sites and in all analyses (Table 5.2; Figure 5.6). The shapes of the individual fitness functions for height vary across regions, however: in the Island and Inland regions the fitness landscape is relatively linear, while in the Coast region the relationship between height and flowering date appears to involve an inflection point above which the rate of gain in estimated mean fitness per cm of mean height increases (Figure 5.6, left column).
There is some evidence for either correlational selection on height and flowering date (GAM selection gradients) or possibly positive quadratic selection on height (regression selection gradients) in the Inland analysis, but I am hesitant to rely heavily on a model fitted to so few data points. In general, I feel more confident in the GAM-derived selection gradients, as several of the single-trait fitness landscapes appear to have complex shapes (Figure 5.6), which the classic regression approach might fail to adequately capture (Morrissey & Sakrejda 2013).

Table 5.2 Estimated standardized selection gradients for flowering date and height at flowering in each region (A. Island; B. Coast; C. Inland), with fitness estimated as the lifetime number of flowerheads produced. (i) GAM selection gradients were estimated by approximation to the first and second order partial derivatives of population mean fitness with respect to mean phenotype, following Morrissey & Sakrejda (2013). (ii) Selection gradients estimated using the approach first proposed by Lande and Arnold (1983). Standard errors (SE) and p-values were estimated by parametric (i) or case (ii) bootstrapping over 1000 replicates. Significance: ^ p < 0.1, * p < 0.05, *** p < 0.001.

Coefficient                  Estimate  SE     p-value

A. Island (N = 319)
(i) GAM selection gradients
β–flowering date             -0.201    0.093  0.026*
β–height at flowering         0.817    0.118  0.000***
γ–flowering date              0.155    0.180  0.430
γ–height at flowering        -0.059    0.275  0.958
γ–flowering date*height      -0.201    0.168  0.216
(ii) Least squares regression selection gradients
β–flowering date             -0.099    0.079  0.186
β–height at flowering         0.828    0.061  0.000***
γ–flowering date              0.209    0.143  0.120
γ–height at flowering        -0.098    0.098  0.434
γ–flowering date*height      -0.138    0.096  0.070^

B. Coast (N = 225)
(i) GAM selection gradients
β–flowering date              0.210    0.166  0.272
β–height at flowering         0.430    0.254  0.044*
γ–flowering date             -1.218    0.422  0.010*
γ–height at flowering         0.846    1.108  0.660
γ–flowering date*height       0.833    0.500  0.204
(ii) Least squares regression selection gradients
β–flowering date             -0.145    0.161  0.460
β–height at flowering         1.027    0.092  0.000***
γ–flowering date              0.070    0.148  0.554
γ–height at flowering         0.152    0.217  0.328
γ–flowering date*height      -0.162    0.287  0.628

C. Inland (N = 23)
(i) GAM selection gradients
β–flowering date             -0.011    0.062  0.774
β–height at flowering         0.765    0.131  0.000***
γ–flowering date             <0.000    0.001  0.988
γ–height at flowering         0.318    0.282  0.392
γ–flowering date*height      -0.259    0.135  0.026*
(ii) Least squares regression selection gradients
β–flowering date              0.018    0.129  0.792
β–height at flowering         0.681    0.357  0.010*
γ–flowering date              0.001    0.046  0.670
γ–height at flowering         1.429    1.972  0.084^
γ–flowering date*height      -0.041    0.426  0.604

The strength, direction, and possibly shape of selection on flowering date vary across regions. In the Island analysis, I detected relatively weak negative directional selection on flowering date (Table 5.2; Figure 5.6, right column). This contrasts with the Coast region, where there appears to be relatively strong, possibly stabilizing selection on flowering date, with the population mean optimum at relatively later flowering (Table 5.2; Figure 5.6, right column). There is no signal of linear or non-linear selection on flowering date in the Inland region.

Figure 5.6 Estimated fitness functions for each trait (left: height at flowering, right: flowering date; each holding the other constant at the observed population mean) at each region (Island = blue, Coast = purple, Inland = red). Dashed lines represent 50% prediction intervals obtained by bootstrapping, approximately equivalent to standard errors. Note that the x-axis scale varies among panels in the same column.
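For readers unfamiliar with the regression approach of Lande and Arnold (1983), the following is a minimal Python sketch of how linear (β) and quadratic (γ) selection gradients and case-bootstrap standard errors can be computed. It is not the thesis code (those analyses were run in R), the data are synthetic, and all names are hypothetical. It assumes traits are standardized to unit variance, fitness is converted to relative fitness, β comes from a linear-only model, and quadratic regression coefficients are doubled to give γ, which is a common convention that the thesis may or may not have followed.

```python
import numpy as np


def lande_arnold(traits, fitness):
    """Estimate selection gradients for two traits by least squares.

    traits  : (n, 2) array of raw trait values
    fitness : (n,) array of absolute fitness (e.g. flowerhead counts)
    Returns (beta, gamma): linear gradients from a first-order model and
    quadratic/correlational gradients from a second-order model.
    """
    z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)
    w = fitness / fitness.mean()  # relative fitness
    n = len(w)
    z1, z2 = z[:, 0], z[:, 1]

    # Linear gradients: w ~ 1 + z1 + z2
    x_lin = np.column_stack([np.ones(n), z1, z2])
    beta = np.linalg.lstsq(x_lin, w, rcond=None)[0][1:]

    # Nonlinear gradients: w ~ 1 + z1 + z2 + z1^2 + z2^2 + z1*z2
    x_quad = np.column_stack([np.ones(n), z1, z2, z1**2, z2**2, z1 * z2])
    coef = np.linalg.lstsq(x_quad, w, rcond=None)[0]
    # Squared-term coefficients are doubled to obtain gamma (assumed convention).
    gamma = np.array([2 * coef[3], 2 * coef[4], coef[5]])
    return beta, gamma


def case_bootstrap_se(traits, fitness, n_boot=200, seed=0):
    """Standard errors of beta by resampling individuals with replacement."""
    rng = np.random.default_rng(seed)
    n = len(fitness)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        beta_b, _ = lande_arnold(traits[idx], fitness[idx])
        draws.append(beta_b)
    return np.std(np.array(draws), axis=0, ddof=1)


# Synthetic illustration only: taller, earlier plants make more flowerheads.
rng = np.random.default_rng(1)
n_plants = 300
day = rng.normal(260, 15, n_plants)      # flowering date (days)
height = rng.normal(120, 40, n_plants)   # height at flowering (cm)
heads = 30 + 0.3 * (height - 120) - 0.3 * (day - 260) + rng.normal(0, 3, n_plants)
heads = np.clip(heads, 0, None)

X = np.column_stack([day, height])
beta, gamma = lande_arnold(X, heads)
se = case_bootstrap_se(X, heads)
print("beta (date, height):", np.round(beta, 3), "SE:", np.round(se, 3))
print("gamma (date^2, height^2, date*height):", np.round(gamma, 3))
```

The bootstrap here uses 200 replicates to keep the example fast; the analyses described in the text used 1000.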
[Figure 5.6 plots not reproduced. Six panels: height at flowering in cm (left column) or flowering date in days (right column) on the x-axis, relative fitness on the y-axis.]

Examining the combined fitness landscapes (shown in Figure 5.7), we can see that in all regions the population mean height that maximizes population mean fitness is tall. In both the Island and (with appropriate caution) Inland fitness landscapes, population mean fitness is also maximized for an early mean flowering date (although note that the actual values are relative to the observed population means; Figure 5.7). In the Coast fitness landscape, a too-early mean flowering date appears quite problematic if the population is, on average, tall, although no observed phenotypes actually fell into that range (Figure 5.3).

Figure 5.7 Fitness landscape contour plots (estimated from flowerhead number) as a function of flowering date and height at flowering for each region (Island = blue, Coast = purple, Inland = red). Each panel shows expected regional mean fitness to 1 s.d. on each side of observed phenotypic mean (so the scale varies among panels).

[Figure 5.7 plots not reproduced. Axes in each panel: flowering time in days (x) versus height at flowering in cm (y).]

As a final note, branch flowering initiation was observed in all regions at approximately equal rates (19.7% across all regions), although in the Inland region it was primarily a response to apical damage by herbivory. This contrasts with my observations in the common garden, where branch flowering was significantly associated with the Coast region (Chapter 3).
5.4 Discussion

The pattern of selection on flowering time in wild populations of Helianthus argophyllus varies dramatically across regions and is consistent with my hypothesis of local adaptation in this trait. Earlier flowering was favored in the Island populations, where early flowering plants occur in higher proportion (Figure 3.7). In contrast, plants that flowered at the population mean or slightly later had the highest fitness in the Coast populations (Figure 5.7), where most individuals are late flowering. The decline in fitness for early flowering individuals in Coast populations is much more dramatic than that for late flowering individuals in Island populations (Figure 5.6, right column). If the observed pattern of selection is typical, this might explain the higher than expected prevalence of ‘low fitness’ late flowering plants in Island populations, as gene flow between the Coast and Islands is moderately high (Chapter 2). The mean of flowering date in Island populations (257 days after January 1st) is later than the value that would maximize population mean fitness (≤ 230 days), suggesting that some mechanism other than direct natural selection may be maintaining later flowering individuals on the Islands. At least qualitatively, the selection gradients and fitness landscapes that I estimated for my Island reciprocal transplant survivors align with those I observed in the Island region (Figure 4.7–4.8 vs. Figure 5.6–5.7). Such strong spatial variation in the direction and shape of selection has been observed relatively rarely in natural populations (Siepielski et al. 2013), although local adaptation is frequently detected, suggesting that it may in fact be common (Leimu & Fischer 2008).

Flowering time and height at flowering were significantly negatively correlated in the Island and Inland regions, but not in the Coast region.
Survival rate in each population was a significant predictor of the population’s flowering time versus height at flowering correlation coefficient, with higher survival rates increasing the correlation (Figure 5.2). This pattern may be driven by variation in local environmental ‘stress’ shifting the expression of genetic covariance of these two traits, particularly for the populations in the Inland region where survival was low. However, the model is likely somewhat flawed: it predicts the correlation coefficient between these two traits for the common garden (98% survival; Chapter 3) as 0.13, rather less than the observed 0.84. A similar mismatch is seen for the reciprocal transplant survivors (Chapter 4), where 6.5% survival predicts a correlation coefficient of -0.71 rather than the observed -0.26 (with outlier individual) or -0.57 (without outlier). Survival rate may not fully capture ‘stress’ in each environment, and it is possible that a linear model is not the best fit for the relationship between environment and phenotypic expression. It is also possible that trait-biased low survival has eliminated enough individuals to obscure the formerly observed positive genetic covariance.

Very low survival rates in Inland populations hindered my ability to estimate selection on flowering time in that region. The few inland survivors were also relatively homogeneous for age and size at flowering, which may reflect strong viability selection or low genetic variation for the reaction norm relating age and size at flowering to the environment. The latter interpretation might be sensible in the context of my common garden study, where North Inland populations were relatively phenotypically homogeneous (Chapter 3). However, this study was conducted the same year as the reciprocal transplant experiment, and as I hypothesize in Chapter 4, the unusually high temperatures likely had stronger effects Inland than in the Coast or Island regions.
At the two Inland populations, I observed many old woody stems from previous generations of H. argophyllus that were more than three meters in length, and yet the tallest focal plant at either site was just over two meters tall. This suggests that something, likely the climate, imposed a significant constraint on the resources available to Inland plants during the year of observation.

It is important to note that I only measured one component of fitness (female fecundity). This is a common caveat to many studies of fitness in wild plants, especially outcrossing hermaphrodites like H. argophyllus (Leimu & Fischer 2008). Researchers have hypothesized that much of the selection on floral traits is caused by differences in male fitness, as male fitness should be determined primarily by pollination or mating success (Sutherland & Delph 1984; although see Conner & Via 1993). The logistical requirements of studying male fitness were prohibitive, and yet it is possible that the patterns I observed in this study only show part of the picture of selection on flowering time.

In addition, this study only examined selection on two phenotypes: age and size at flowering. I chose these traits because my central interest is in flowering time, and size at flowering is often strongly correlated with flowering time in plants (e.g. Mitchell-Olds 1996; Franks & Weis 2008) and appears to be important in this system (Chapter 4). However, my analyses in Chapter 3 suggest that floral size characters (or stem hairiness) might be more important targets of divergent selection among regions, and these traits are also (negatively) correlated with flowering time in my common garden data. It is possible that the selection I estimate as acting directly on flowering time in H. argophyllus is actually indirectly affecting flowering time through direct selection on those or other characters.
Finally, this study is limited to only a single year, and a year that appears to have been a climatic outlier (although subsequent years have continued in the same vein). The strength of selection can vary among years, as can the direction of selection (Siepielski et al. 2009, although see Morrissey & Hadfield 2012). To fully understand how selection has shaped flowering time variation in this species, I would need to examine how selection varies across time as well as space.

Chapter 6: Conclusion

6.1 Discussion

The evidence that I have presented points to divergent natural selection acting on flowering time, either directly or indirectly, to create the pattern of phenotypic differentiation that I observe in controlled and wild environments. The evidence is strongest for divergent selection acting between the barrier islands and the mainland coast. Populations from these two regions form a single genetic cluster, and yet populations from the Coast region are primarily later flowering while populations in the Barrier Island region are made up of both early and late flowering individuals at roughly equal proportions (at least in the common garden). Island plants survived at significantly higher rates at one barrier island reciprocal transplant site, and island plants had the highest cumulative fitness overall across both island sites, consistent with local adaptation of island populations. During the same year, island populations experienced moderate directional selection for earlier flowering, while populations in the Coast region experienced strong selection to flower at the observed population mean or later. This dramatic shift in selective regime took place over less than 20 km, and the selective regime shifted again a further 25 km along the same cline. In inland populations, viability selection was very strong, and obscured any relationship between flowering time and fitness, or signal of local adaptation.
One potential complexity is the shifting genetic covariance between the traits of flowering time and height at flowering as measured in the common garden versus the reciprocal transplant and wild populations, from a strong positive correlation to either no or a negative correlation. This sign change is very likely the result of a genotype by environment interaction, where individuals vary in the bivariate reaction norm relating age and size at flowering to the environment. Patterns like this can occur when a population harbors genetic variation for reaction norm slope, shape, or range (Stearns 1989; Stearns 2000), and are relatively common in life history studies (Stearns & Koella 1986; Rowe & Ludwig 1991; Day & Rowe 2002). To investigate this more fully, I would need to measure reaction norms for genotypes across multiple environments, and potentially examine how variation in reaction norm affected fitness in each environment (e.g. Scheiner & Callahan 1999). This pattern might also occur if flowering time or size at flowering were under fluctuating selection across the lifetime of an individual (Schluter et al. 1991), although this hypothesis would be difficult to test with these traits without more knowledge of their genetic bases because they are not expressed until later in a plant’s life history.

These analyses do not clarify the pattern of phenotypic variation across the entire species’ range. If North and South Inland populations are not under selection for later flowering, the absence of earlier flowering individuals from either region may be the result of genetic drift and reduced gene flow due to the relatively larger geographic scales: a pattern of isolation by dispersal limitation (Orsini et al. 2013).
Selection favoring taller plants, which I observed in North Inland wild populations, may also result in indirect positive directional selection on flowering time, as the two traits are highly correlated (at least in the relatively stress-free conditions of the common garden). If inland environments tend to be more “stressful” than marine environments, as suggested by higher mortality rates in my data, then the resulting slower growth might interact with selection for tall plants to favor individuals who delay the transition to reproduction in favor of an extended growing period. This hypothesis is supported by life-history theory (Stearns & Koella 1986; Scheiner 1993). Finally, there are suggestions of more wrinkles to the story: several of the SNP markers used to characterize population structure in this study show signals of non-neutral differentiation along a latitudinal cline, as does at least one trait (stem hairiness).

It is also possible that the observed patterns are driven by isolation by adaptation (e.g. Nosil et al. 2008). If selection on flowering time (or a correlated trait) is heterogeneous across regions, then divergence in flowering time through local adaptation could lead to a subsequent reduction in gene flow. This reduction would be the result not only of divergence in flowering time, but also selection against immigrants or hybrids that displayed non-locally-adapted phenotypes. Following a reduction in gene flow, the effects of genetic drift could further differentiate populations both genetically and phenotypically. This interplay between natural selection, gene flow, and genetic drift is increasingly recognized as a common driver of population differentiation (Orsini et al. 2013). Under isolation by adaptation, we might expect an accelerated rate of divergence, with evolutionary forces combining to form a positive feedback loop. From this, we might consider H. argophyllus a case study in incipient ecological speciation.
Putting aside speculation, the value of the present study lies in this system’s dissimilarity to the systems examined in previous research on flowering time variation in wild plant species. The observed variation is not latitudinal, and, if it does align with a cline in moisture availability, the direction of this relationship (later flowering in drier habitats) is contrary to most other well-characterized systems (e.g. Mimulus guttatus; Hall & Willis 2006). The pattern does not appear to be driven by differences in pollinator preferences, at least from personal observation. In most regions, late flowering appears to be the optimal strategy, which contrasts with the finding that earlier flowering is generally favored in plants (Munguía-Rosas et al. 2011). Although flowering time shifts are relatively common in the annual sunflowers (Henry et al. 2014), late flowering H. argophyllus individuals flower much later than other annual species of sunflower (although see Moyers & Rieseberg 2013), which suggests that the genetic basis for this phenotype is in some way novel. In sum, these data add to our understanding of what is possible in a wild plant species and in a sunflower.

6.2 Strengths and limitations

In longitudinal studies of wild populations the strength and even direction of selection can vary from year to year, although in many cases this may be due to sampling error (Siepielski et al. 2009; Morrissey & Hadfield 2012). This research was limited in temporal scope, and it is possible that my conclusions are highly dependent on the year of observation. This seems plausible given that my field studies were conducted in a year of exceptional heat and drought for South Texas. Additionally, it is possible that the observed relationships between fitness and flowering time in each region are primarily driven by selection on unmeasured traits.
In my studies of local adaptation and selection in wild populations (Chapters 4 & 5), I measured only two components of fitness: survival from seedling to reproduction and female fecundity. It is possible that female fecundity is only a minor component of lifetime fitness in this species. Similarly, it is possible that selection during earlier, unobserved life history stages (e.g. germination, seedling establishment) is as important as, or even more important than, later survival and reproduction. If the real picture of natural selection were dramatically different from what I observed in this research, I would need to re-evaluate some of my hypotheses.

The development of general theory in evolutionary biology comes from two sources: theoretical models that explore possible hypotheses about the origin, loss, and maintenance of biological variation, and empirical data that provide evidence for (or against) these models and can serve as inspiration for the generation of new hypotheses. This work fits squarely into the second category, and provides data to support the general theory of divergent selection on a reproductive trait. This research uses a set of well-supported tools to ask familiar questions in an entirely new system. Although Helianthus may in some ways be a model system for ecological adaptation research, H. argophyllus has never been the focus of careful evolutionary study, and the patterns of genetic and phenotypic differentiation that I have characterized have not been previously reported. The degree of observed variation within some populations (and even some families) is both unexpected and largely without equal in the literature, especially for a reproductive trait in an outcrossing species with a relatively small range and small effective population size (Heiser et al. 1969; Strasburg et al. 2011). This research has provided a strong basis upon which to build further studies in this system.
6.3 Future directions

There are two primary questions that remain to be addressed to fully understand flowering time variation in Helianthus argophyllus:

(1) What are the ecological drivers of selection in this system?

I have established that flowering time variation is likely under divergent selection between Coast and Barrier Island populations. However, the question remains: what is imposing divergent selection in this system? Answering this will require testing specific hypotheses about potential agents of selection, which in turn requires that likely candidates be identified. This is relatively complicated, as an entire suite of potential candidates all co-vary along the same island to mainland cline, including abiotic (temperature, precipitation, humidity, light intensity, salt, nutrient availability, tropical storm intensity) and biotic factors (plant communities, pollinators, seed predators, herbivores, human disturbance). Some of these factors are more likely candidates as agents of selection on flowering time (e.g. pollinators, seed predators, tropical storms), and careful hypothesis testing and experimental design could attempt to tease them apart. This is something that I would very much like to do, should funding and time permit in the future.

(2) What is the genetic basis of flowering time variation?

Understanding the genetic architecture of a trait allows us to make predictions about how that trait will respond to selection, or what levels of gene flow might be necessary to counteract the effects of adaptive evolution. Towards this end, I have generated and phenotypically evaluated two replicate F2 mapping populations, one a late flowering North Inland by early flowering Coast cross, and the other a late flowering South Inland by early flowering Coast cross.
I am currently generating a genetic map to support analyses to identify the quantitative trait loci (QTL) underlying flowering time in these populations, as well as QTL for the suite of quantitative traits measured in the common garden population (and others, including variation in UV patterning on ray florets). I look forward to exploring patterns of genetic correlations among these traits, and asking whether the genetic architecture of late flowering differs between North and South Inland populations.

References

Adler, P.B. et al., 2014. Functional traits explain variation in plant life history strategies. Proceedings of the National Academy of Sciences of the United States of America, 111(2), pp.740–745.
Alexander, H.M. et al., 2014. Roles of maternal effects and nuclear genetic composition change across the life cycle of crop-wild hybrids. American Journal of Botany, 101(7), pp.1176–1188.
Anderson, J.T. et al., 2012. Phenotypic plasticity and adaptive evolution contribute to advancing flowering phenology in response to climate change. Proceedings of the Royal Society B: Biological Sciences, 279, pp.3843–3852.
Anderson, J.T., Lee, C.-R. & Mitchell-Olds, T., 2010. Life-history QTLs and natural selection on flowering time in Boechera stricta, a perennial relative of Arabidopsis. Evolution, 65(3), pp.771–787.
Baker, H.G., 1959. Reproductive methods as factors in speciation in flowering plants. Cold Spring Harbor Symposia on Quantitative Biology, 24, pp.177–191.
Barb, J.G. et al., 2014. Chromosomal evolution and patterns of introgression in Helianthus. Genetics, 197(3), pp.969–979.
Barrett, S.C.H., Colautti, R.I. & Eckert, C.G., 2008. Plant reproductive systems and evolution during biological invasion. Molecular Ecology, 17(1), pp.373–383.
Belhassen, E. et al., 1994. Dynamic management of genetic resources: First generation analysis of sunflower artificial populations. Genetics Selection Evolution, 26(Suppl 1), pp.241s–253s.
Blackman, B.K., Michaels, S.D.
& Rieseberg, L.H., 2011. Connecting the sun to flowering in sunflower adaptation. Molecular Ecology, 20, pp.3503–3512.
Bolnick, D.I. et al., 2011. Why intraspecific trait variation matters in community ecology. Trends in Ecology & Evolution, 26(4), pp.183–192.
Bull-Hereñu, K. & Arroyo, M.T.K., 2009. Phenological and morphological differentiation in annual Chaetanthera moenchioides (Asteraceae) over an aridity gradient. Plant Systematics and Evolution, 278(3-4), pp.159–167.
Burd, M. et al., 2006. Age-size plasticity for reproduction in monocarpic plants. Ecology, 87(11), pp.2755–2764.
Caicedo, A.L. et al., 2004. Epistatic interaction between Arabidopsis FRI and FLC flowering time genes generates a latitudinal cline in a life history trait. Proceedings of the National Academy of Sciences of the United States of America, 101(44), pp.15670–15675.
Christov, M., 1990. A new source of cytoplasmic male sterility in sunflower originating from Helianthus argophyllus. Helia, 13(13), pp.55–61.
Colautti, R.I. & Barrett, S.C.H., 2013. Rapid adaptation to climate facilitates range expansion of an invasive plant. Science, 342(6156), pp.364–366.
Conner, J. & Via, S., 1993. Patterns of phenotypic and genetic correlations among morphological and life-history traits in wild radish, Raphanus raphanistrum. Evolution, 47(2), pp.704–711.
Cornelius, J., 2006. Heritabilities and additive genetic coefficients of variation in forest trees. Canadian Journal of Forest Research, 24, pp.372–379.
Coyne, J.A. & Orr, H.A., 2004. Speciation, Sunderland, MA: Sinauer Associates.
Crispo, E., 2008. Modifying effects of phenotypic plasticity on interactions among natural selection, adaptation and gene flow. Journal of Evolutionary Biology, 21(6), pp.1460–1469.
Cruzan, M.B., Neal, P.R. & Willson, M.F., 1988. Floral display in Phyla incisa: Consequences for male and female reproductive success. Evolution, 42(3), pp.505–515.
Day, T. & Rowe, L., 2002.
Developmental thresholds and the evolution of reaction norms for age and size at life-history transitions. The American Naturalist, 159(4), pp.338–350.
de Jong, G., 2005. Evolution of phenotypic plasticity: Patterns of plasticity and the emergence of ecotypes. New Phytologist, 166(1), pp.101–118.
Devaux, C. & Lande, R., 2008. Incipient allochronic speciation due to non-selective assortative mating by flowering time, mutation and genetic drift. Proceedings of the Royal Society B: Biological Sciences, 275(1652), pp.2723–2732.
Dickerson, G.E., 1962. Implications of genetic-environmental interaction in animal breeding. Animal Production, 4(1), pp.47–63.
Dijk, H.V. et al., 1997. Flowering time in wild beet (Beta vulgaris ssp. maritima) along a latitudinal cline. Acta Oecologica, 18(1), pp.47–60.
Dobzhansky, T.G., 1937. Genetics and the Origin of Species N. Eldredge & S. J. Gould, eds., New York: Columbia University Press.
Earl, D.A. & vonHoldt, B.M., 2011. STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conservation Genetics Resources, 4(2), pp.359–361.
Eckhart, V.M., 1991. The effects of floral display on pollinator visitation vary among populations of Phacelia linearis (Hydrophyllaceae). Evolutionary Ecology, 5(4), pp.370–384.
Edwards, A.W.F., 1971. Distance between populations on the basis of gene frequencies. Biometrics, 27, pp.873–881.
Elzinga, J.A. et al., 2007. Time after time: Flowering phenology and biotic interactions. Trends in Ecology & Evolution, 22(8), pp.432–439.
Endler, J.A., 1986. Natural Selection in the Wild, Princeton, New Jersey: Princeton University Press.
Enright, A.J., 2002. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Research, 30(7), pp.1575–1584.
Evanno, G., Regnaut, S. & Goudet, J., 2005. Detecting the number of clusters of individuals using the software structure: A simulation study. Molecular Ecology, 14, pp.2611–2620.
Falush, D., Stephens, M. & Pritchard, J.K., 2007. Inference of population structure using multilocus genotype data: Dominant markers and null alleles. Molecular Ecology Resources, 7(4), pp.574–578.
Falush, D., Stephens, M. & Pritchard, J.K., 2003. Inference of population structure using multilocus genotype data: Linked loci and correlated allele frequencies. Genetics, 164(4), pp.1567–1587.
Fleming, T.R. & Harrington, D.P., 1984. Nonparametric estimation of the survival distribution in censored data. Communications in Statistics: Theory and Methods, 13(20), pp.2469–2486.
Forrest, J. & Miller-Rushing, A.J., 2010. Toward a synthetic understanding of the role of phenology in ecology and evolution. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 365(1555), pp.3101–3112.
Franke, D.M. et al., 2006. A steep cline in flowering time for Brassica rapa in Southern California: Population-level variation in the field and the greenhouse. International Journal of Plant Sciences, 167(1), pp.83–92.
Franks, S.J. & Weis, A.E., 2008. A change in climate causes rapid evolution of multiple life-history traits and their interactions in an annual plant. Journal of Evolutionary Biology, 21(5), pp.1321–1334.
Franks, S.J., Sim, S. & Weis, A.E., 2007. Rapid evolution of flowering time by an annual plant in response to a climate fluctuation. Proceedings of the National Academy of Sciences of the United States of America, 104(4), pp.1278–1282.
Gandhi, S.D. et al., 2005. The self-incompatibility locus (S) and quantitative trait loci for self-pollination and seed dormancy in sunflower. Theoretical and Applied Genetics, 111(4), pp.619–629.
Garant, D., Forde, S.E. & Hendry, A.P., 2007. The multifarious effects of dispersal and gene flow on contemporary adaptation. Functional Ecology, 21(3), pp.434–443.
Garcia-Gonzalez, F. et al., 2012. Comparing evolvabilities: Common errors surrounding the calculation and use of coefficients of additive genetic variation.
Evolution, 66(8), pp.2341–2349.
Geyer, C.J., Wagenius, S. & Shaw, R.G., 2007. Aster models for life history analysis. Biometrika, 94(2), pp.415–426.
Gilbert, K.J. & Whitlock, M.C., 2014. QST–FST comparisons with unbalanced half-sib designs. Molecular Ecology Resources.
Gilbert, K.J. et al., 2012. Recommendations for utilizing and reporting population genetic analyses: The reproducibility of genetic clustering using the program STRUCTURE. Molecular Ecology, 21(20), pp.4925–4930.
Goudet, J., 2005. hierfstat, a package for r to compute and test hierarchical F-statistics. Molecular Ecology Resources, 5(1), pp.184–186.
Goudet, J. et al., 1996. Testing differentiation in diploid populations. Genetics, 144, pp.1933–1940.
Grambsch, P.M. & Therneau, T.M., 1994. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika, 81(3), pp.515–526.
Grant, V., 1981. Plant Speciation 2nd ed., New York: Columbia University Press.
Grant, V., 1963. The Origin of Adaptations, New York: Columbia University Press.
Hacke, U.G. et al., 2001. Trends in wood density and structure are linked to prevention of xylem implosion by negative pressure. Oecologia, 126(4), pp.457–461.
Hall, M.C. & Willis, J.H., 2006. Divergent selection on flowering time contributes to local adaptation in Mimulus guttatus populations. Evolution, 60(12), pp.2466–2477.
Harrington, D.P. & Fleming, T.R., 1982. A class of rank test procedures for censored survival data. Biometrika, 69(3), pp.553–566.
Hartl, D.L. & Clark, A.G., 1997. Principles of Population Genetics 4 ed., Sunderland, MA: Sinauer Associates.
Heino, M., Dieckmann, U. & Godø, O.R., 2002. Measuring probabilistic reaction norms for age and size at maturation. Evolution, 56(4), pp.669–678.
Heiser, C.B., Jr, 1951. Hybridization in the annual sunflowers: Helianthus annuus x H. argophyllus. The American Naturalist, 85(820), pp.65–72.
Heiser, C.B., Jr, 1948. Taxonomic and cytological notes on the annual species of Helianthus.
Bulletin of the Torrey Botanical Club, pp.512–515.
Heiser, C.B., Jr, 1954. Variation and subspeciation in the common sunflower, Helianthus annuus. American Midland Naturalist, 51(1), pp.287–305.
Heiser, C.B., Jr et al., 1969. The North American sunflowers (Helianthus). Memoirs of the Torrey Botanical Club, 22, pp.1–217.
Heiser, C.B., Jr, Martin, W.C. & Smith, D.M., 1962. Species crosses in Helianthus: I. Diploid species. Brittonia, 14(2), pp.137–147.
Henry, L.P., Watson, R.H.B. & Blackman, B.K., 2014. Transitions in photoperiodic flowering are common and involve few loci in wild sunflowers (Helianthus; Asteraceae). American Journal of Botany, 101(10), pp.1748–1758.
Horne, E.C. et al., 2004. Improved high-throughput sunflower and cotton genomic DNA extraction and PCR fidelity. Plant Molecular Biology Reporter, 22(1), pp.83–84.
Houle, D., 1992. Comparing evolvability and variability of quantitative traits. Genetics, 130, pp.195–204.
Husson, F., Lê, S. & Pagès, J., 2010. Exploratory Multivariate Analysis by Example Using R, London: Chapman and Hall.
Jakobsson, M. & Rosenberg, N.A., 2007. CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics, 23(14), pp.1801–1806.
Jamaux, I., Steinmetz, A. & Belhassen, E., 1997. Looking for molecular and physiological markers of osmotic adjustment in sunflower. New Phytologist, 137(1), pp.117–127.
Johnstone, I.M., 2001. On the distribution of the largest eigenvalue in principal components analysis. Annals of Statistics, 29(2), pp.295–327.
Jombart, T. & Ahmed, I., 2011. adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics, 27(21), pp.3070–3071.
Kawakami, T. et al., 2011. Natural selection drives clinal life history patterns in the perennial sunflower species, Helianthus maximiliani. Molecular Ecology, 20(11), pp.2318–2328.
Kawecki, T.J. & Ebert, D., 2004. Conceptual issues in local adaptation.
Ecology Letters, 7(12), pp.1225–1241.
Kimura, M. & Weiss, G.H., 1964. The stepping stone model of population structure and the decrease of genetic correlation with distance. Genetics, 49(4), pp.561–576.
Kochmer, J.P. & Handel, S.N., 1986. Constraints and competition in the evolution of flowering phenology. Ecological Monographs, 56(4), pp.303–325.
Kozłowski, J., 2003. Optimal allocation of resources to growth and reproduction: Implications for age and size at maturity. Trends in Ecology & Evolution, 7(1), pp.15–19.
Kudoh, H. et al., 2002. Intrinsic cost of delayed flowering in annual plants: Negative correlation between flowering time and reproductive effort. Plant Species Biology, 17(2-3), pp.101–107.
Leimu, R. & Fischer, M., 2008. A meta-analysis of local adaptation in plants. PLoS ONE, 3(12), p.e4010.
Lennartsson, T., 1997. Seasonal differentiation: a conservative reproductive barrier in two grassland Gentianella (Gentianaceae) species. Plant Systematics and Evolution, 208(1-2), pp.45–69.
Levin, D.A., 2009. Flowering-time plasticity facilitates niche shifts in adjacent populations. New Phytologist, 183(3), pp.661–666.
Levin, D.A., 1973. The role of trichomes in plant defense. Quarterly Review of Biology, 48(1), pp.3–15.
Li, P. & Johnston, M.O., 2000. Heterochrony in plant evolutionary studies through the twentieth century. The Botanical Review, 66(1), pp.57–88.
Linnen, C.R. & Hoekstra, H.E., 2009. Measuring natural selection on genotypes and phenotypes in the wild. Cold Spring Harbor Symposia on Quantitative Biology, 74, pp.155–168.
Lotz, L., 1990. The relation between age and size at first flowering of Plantago major in various habitats. The Journal of Ecology, 78(3), pp.757–771.
Lowry, D.B. et al., 2008. The strength and genetic basis of reproductive isolating barriers in flowering plants. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 363(1506), pp.3009–3021.
Lynch, M. & Ritland, K., 1999.
Estimation of pairwise relatedness with molecular markers. Genetics, 152(4), pp.1753–1766.
Mantel, N., 1967. The detection of disease clustering and a generalized regression approach. Cancer Research, 27, pp.209–220.
McKay, J.K. & Latta, R.G., 2002. Adaptive population divergence: Markers, QTL and traits. Trends in Ecology & Evolution, 17(6), pp.285–291.
Mercer, K.L., Alexander, H.M. & Snow, A.A., 2011. Selection on seedling emergence timing and size in an annual plant, Helianthus annuus (common sunflower, Asteraceae). American Journal of Botany, 98(6), pp.975–985.
Meyer, D., Zeileis, A. & Hornik, K., 2006. The Strucplot framework: visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), pp.1–48.
Mitchell-Olds, T., 1996. Genetic constraints on life-history evolution: Quantitative-trait loci influencing growth and flowering in Arabidopsis thaliana. Evolution, 50(1), pp.140–145.
Mitchell-Olds, T. & Shaw, R.G., 1987. Regression analysis of natural selection: Statistical inference and biological interpretation. Evolution, 41(6), pp.1149–1161.
Montague, J.L., Barrett, S.C.H. & Eckert, C.G., 2007. Re-establishment of clinal variation in flowering time among introduced populations of purple loosestrife (Lythrum salicaria, Lythraceae). Journal of Evolutionary Biology, 21, pp.234–245.
Morrissey, M.B. & Hadfield, J.D., 2012. Directional selection in temporally replicated studies is remarkably consistent. Evolution, 66(2), pp.435–442.
Morrissey, M.B. & Sakrejda, K., 2013. Unification of regression-based methods for the analysis of natural selection. Evolution, 67(7), pp.2094–2100.
Moyers, B.T. & Rieseberg, L.H., 2013. Divergence in gene expression is uncoupled from divergence in coding sequence in a secondarily woody sunflower. International Journal of Plant Sciences, 174, pp.1079–1089.
Munguía-Rosas, M.A. et al., 2011. Meta-analysis of phenotypic selection on flowering phenology suggests that early flowering plants are favoured.
Ecology Letters, 14(5), pp.511–521.
Murty, B.R. et al., 1972. Effects of disruptive selection for flowering time in Brassica campestris var. brown Sarson. Heredity, 28(3), pp.287–295.
Nosil, P., Egan, S.P. & Funk, D.J., 2008. Heterogeneous genomic differentiation between walking-stick ecotypes: “Isolation by adaptation” and multiple roles for divergent selection. Evolution, 62(2), pp.316–336.
Novembre, J. & Stephens, M., 2008. Interpreting principal component analyses of spatial population genetic variation. Nature Genetics, 40(5), pp.646–649.
Novy, A., Flory, S.L. & Hartman, J.M., 2012. Evidence for rapid evolution of phenology in an invasive grass. Journal of Evolutionary Biology, 26(2), pp.443–450.
Oeth, P. et al., 2005. iPLEX™ assay: increased plexing efficiency and flexibility for MassARRAY® system through single base primer extension with mass-modified terminators. Sequenom Application Note.
Ollerton, J. & Lack, A.J., 1992. Flowering phenology: An example of relaxation of natural selection? Trends in Ecology & Evolution, 7(8), pp.274–276.
Olmsted, C.E., 1941. Growth and development in range grasses: I. Early development of Bouteloua curtipendula in relation to water supply. Botanical Gazette, 102(3), pp.499–519.
Orsini, L. et al., 2013. Drivers of population genetic differentiation in the wild: Isolation by dispersal limitation, isolation by adaptation and isolation by colonization. Molecular Ecology, 22(24), pp.5983–5999.
Patterson, N., Price, A.L. & Reich, D., 2006. Population structure and eigenanalysis. PLoS Genetics, 2(12), p.e190.
Pigliucci, M. & Schlichting, C.D., 1998. Reaction norms of Arabidopsis: V. Flowering time controls phenotypic architecture in response to nutrient stress. Journal of Evolutionary Biology, 11(3), pp.285–301.
Pilson, D., 2000. Herbivory and natural selection on flowering phenology in wild sunflower, Helianthus annuus. Oecologia, 122(1), pp.72–82.
Poorter, L. et al., 2009.
The importance of wood traits and hydraulic conductance for the performance and life history strategies of 42 rainforest tree species. New Phytologist, 185(2), pp.481–492.
Price, A.L. et al., 2010. New approaches to population stratification in genome-wide association studies. Nature Reviews Genetics, 11(7), pp.459–463.
Pritchard, J.K., Stephens, M. & Donnelly, P., 2000. Inference of population structure using multilocus genotype data. Genetics, 155(2), pp.945–959.
Rauf, S., 2008. Breeding sunflower (Helianthus annuus L.) for drought tolerance. Communications in Biometry and Crop Science, 3(1), pp.29–44.
Räsänen, K. & Kruuk, L.E.B., 2007. Maternal effects and evolution at ecological time-scales. Functional Ecology, 21(3), pp.408–421.
Reich, P.B., Ellsworth, D.S. & Walters, M.B., 1998. Leaf structure (specific leaf area) modulates photosynthesis-nitrogen relations: Evidence from within and across species and functional groups. Functional Ecology, 12(6), pp.948–958.
Renaut, S. et al., 2013. Genomic islands of divergence are not affected by geography of speciation in sunflowers. Nature Communications, 4, p.1827.
Renaut, S. et al., 2012. The population genomics of sunflowers and genomic determinants of protein evolution revealed by RNAseq. Biology, 1(3), pp.575–596.
Richards, R.A., 1992. Increasing salinity tolerance of grain crops: Is it worthwhile? Plant and Soil, 146(1-2), pp.89–98.
Rieseberg, L.H., 1997. Hybrid origins of plant species. Annual Review of Ecology and Systematics, 28, pp.359–389.
Rieseberg, L.H. & Willis, J.H., 2007. Plant speciation. Science, 317(5840), pp.910–914.
Roach, D.A. & Wulff, R.D., 1987. Maternal effects in plants. Annual Review of Ecology and Systematics, 18, pp.209–235.
Rowe, L. & Ludwig, D., 1991. Size and timing of metamorphosis in complex life cycles: Time constraints and variation. Ecology, 72(2), pp.413–427.
Sambatti, J.B.M. et al., 2008. Ecological selection maintains cytonuclear incompatibilities in hybridizing sunflowers.
Ecology Letters, 11(10), pp.1082–1091.
Sandring, S. & Ågren, J., 2009. Pollinator-mediated selection on floral display and flowering time in the perennial herb Arabidopsis lyrata. Evolution, 63(5), pp.1292–1300.
Sandring, S. et al., 2007. Selection on flowering time and floral display in an alpine and a lowland population of Arabidopsis lyrata. Journal of Evolutionary Biology, 20(2), pp.558–567.
Scheiner, S.M., 1993. Genetics and evolution of phenotypic plasticity. Annual Review of Ecology and Systematics, 24, pp.35–68.
Scheiner, S.M. & Callahan, H.S., 1999. Measuring natural selection on phenotypic plasticity. Evolution, 53(6), pp.1704–1713.
Schlichting, C.D., 1989. Phenotypic plasticity in Phlox. Oecologia, 78(4), pp.496–501.
Schlichting, C.D., 1986. The evolution of phenotypic plasticity in plants. Annual Review of Ecology and Systematics, 17, pp.667–693.
Schluter, D., 1988. Estimating the form of natural selection on a quantitative trait. Evolution, 42(5), pp.849–861.
Schluter, D., 2000. The Ecology of Adaptive Radiation 2nd ed., Oxford: Oxford University Press.
Schluter, D., Price, T.D. & Rowe, L., 1991. Conflicting selection pressures and life history trade-offs. Proceedings of the Royal Society B: Biological Sciences, 246(1315), pp.11–17.
Schneider, C.A., Rasband, W.S. & Eliceiri, K.W., 2012. NIH Image to ImageJ: 25 years of image analysis. Nature Methods, 9, pp.671–675.
Shaw, R.G. & Geyer, C.J., 2010. Inferring fitness landscapes. Evolution, 64(9), pp.2510–2520.
Siepielski, A.M. et al., 2013. The spatial patterns of directional phenotypic selection P. Thrall, ed. Ecology Letters, 16(11), pp.1382–1392.
Siepielski, A.M., DiBattista, J.D. & Carlson, S.M., 2009. It’s about time: The temporal dynamics of phenotypic selection in the wild. Ecology Letters, 12(11), pp.1261–1276.
Simons, A.M. & Roff, D.A., 1994. The effect of environmental variability on the heritabilities of traits of a field cricket. Evolution, 48(5), pp.1637–1649.
Sinnock, P., 1975.
The Wahlund effect for the two-locus model. The American Naturalist, 109(969), pp.565–570.
Slatkin, M., 1987. Gene flow and the geographic structure of natural populations. Science, 236(4803), pp.787–792.
Spitze, K., 1993. Population structure in Daphnia obtusa: Quantitative genetic and allozymic variation. Genetics, 135(2), pp.367–374.
Stam, P., 1983. The evolution of reproductive isolation in closely adjacent plant populations through differential flowering time. Heredity, 50(2), pp.105–118.
Stanton, M.L., Galen, C. & Shore, J., 1997. Population structure along a steep environmental gradient: Consequences of flowering time and habitat variation in the snow buttercup, Ranunculus adoneus. Evolution, 51(1), pp.79–94.
Stearns, S.C., 2000. Life history evolution: Successes, limitations, and prospects. Naturwissenschaften, 87(11), pp.476–486.
Stearns, S.C., 1989. The evolutionary significance of phenotypic plasticity. Bioscience, 39(7), pp.436–445.
Stearns, S.C. & Koella, J.C., 1986. The evolution of phenotypic plasticity in life-history traits: predictions of reaction norms for age and size at maturity. Evolution, 40(5), pp.893–913.
Strasburg, J.L. et al., 2011. Effective population size is positively correlated with levels of adaptive divergence among annual sunflowers. Molecular Biology and Evolution, 28(5), pp.1569–1580.
Strasburg, J.L. et al., 2009. Genomic patterns of adaptive divergence between chromosomally differentiated sunflower species. Molecular Biology and Evolution, 26(6), pp.1341–1355.
Strauss, S.Y. & Whittall, J.B., 2006. Non-pollinator agents of selection on floral traits. In L. D. Harder & S. C. H. Barrett, eds. Ecology and Evolution of Flowers. Oxford: Oxford University Press, pp. 120–138.
Sutherland, S. & Delph, L.F., 1984. On the importance of male fitness in plants: Patterns of fruit-set. Ecology, 65(4), pp.1093–1104.
Therneau, T.M. & Grambsch, P.M., 2000. Modeling Survival Data: Extending the Cox Model, New York: Springer.
Timme, R.E., Simpson, B.B. & Linder, C.R., 2007. High-resolution phylogeny for Helianthus (Asteraceae) using the 18S-26S ribosomal DNA external transcribed spacer. American Journal of Botany, 94(11), pp.1837–1852.
Tracy, C.A. & Widom, H., 1994. Level-spacing distributions and the Airy kernel. Communications in Mathematical Physics, 159(1), pp.151–174.
Turesson, G., 2010. The genotypical response of the plant species to the habitat. Hereditas, 3(3), pp.211–350.
Van Dongen, S., 2008. Graph clustering via a discrete uncoupling process. SIAM Journal on Matrix Analysis and Applications, 30(1), pp.121–141.
Via, S. & Lande, R., 1985. Genotype-environment interaction and the evolution of phenotypic plasticity. Evolution, 39(3), pp.505–522.
Vischi, M. et al., 2004. Comparison of populations of Helianthus argophyllus and H. debilis ssp. cucumerifolius and their hybrids from the African coast of the Indian Ocean and the USA using molecular markers. Helia, 27(40), pp.123–132.
Volis, S. et al., 2004. Phenotypic selection and regulation of reproduction in different environments in wild barley. Journal of Evolutionary Biology, 17(5), pp.1121–1131.
Wang, J., 2011. coancestry: a program for simulating, estimating and analysing relatedness and inbreeding coefficients. Molecular Ecology Resources, 11(1), pp.141–145.
Wang, J., 2014. Marker-based estimates of relatedness and inbreeding coefficients: An assessment of current methods. Journal of Evolutionary Biology, 27(3), pp.518–530.
Weigensberg, I. & Roff, D.A., 1996. Natural heritabilities: Can they be reliably estimated in the laboratory? Evolution, 50(6), pp.2149–2157.
Weir, B.S. & Cockerham, C.C., 1984. Estimating F-statistics for the analysis of population structure. Evolution, 38(6), pp.1358–1370.
Weis, A.E. & Kossler, T.M., 2004. Genetic variation in flowering time induces phenological assortative mating: Quantitative genetic methods applied to Brassica rapa. American Journal of Botany, 91(6), pp.825–836.
Weiss, A.N.
et al., 2013. Maternal effects and embryo genetics: Germination and dormancy of crop–wild sunflower hybrids. Seed Science Research, 23(4), pp.241–255.
Wesselingh, R.A. et al., 1997. Threshold size for flowering in different habitats: effects of size-dependent growth and survival. Ecology, 78(7), pp.2118–2132.
Whitlock, M.C., 2008. Evolutionary inference from QST. Molecular Ecology, 17(8), pp.1885–1896.
Whitlock, M.C. & Guillaume, F., 2009. Testing for spatially divergent selection: Comparing QST to FST. Genetics, 183(3), pp.1055–1063.
Wieckhorst, S. et al., 2010. Fine mapping of the sunflower resistance locus PlARG introduced from the wild species Helianthus argophyllus. Theoretical and Applied Genetics, 121(8), pp.1633–1644.
Wilson, P.J., Thompson, K. & Hodgson, J.G., 1999. Specific leaf area and leaf dry matter content as alternative predictors of plant strategies. New Phytologist, 143(1), pp.155–162.
Wood, S.N., 2006. Low-rank scale-invariant tensor product smooths for generalized additive mixed models. Biometrics, 62(4), pp.1025–1036.
Wright, S., 1951. The genetical structure of populations. Annals of Eugenics, 15(1), pp.323–354.
Wright, S., 1965. The interpretation of population structure by F-statistics with special regard to systems of mating. Evolution, 19(3), pp.395–420.
Yang, R.C., 1998. Estimating hierarchical F-statistics. Evolution, 52(4), pp.950–956.
Ziebell, A.L. et al., 2013. Sunflower as a biofuels crop: An analysis of lignocellulosic chemical properties. Biomass and Bioenergy.
Zopfi, H.-J., 1995. Life history variation and infraspecific heterochrony in Rhinanthus glacialis (Scrophulariaceae). Plant Systematics and Evolution, 198(3-4), pp.209–233.

Appendix

Sequences used to genotype samples in Chapter 2, using the Sequenom® iPLEX® Gold Genotyping Technology (San Diego, CA, USA) at the Genome Quebec Innovation Centre (McGill University, Montreal, QC, Canada). The target site is given in square brackets with alternate alleles (e.g. [A/G]) surrounded by 200 base pairs of flanking sequence in each direction. Repetitive regions are marked in lower case, and neighboring polymorphisms are denoted either in IUPAC code (SNPs) or as ‘N’ (indels).

SNP Sequence
1 GAAYRTAAGATTCTAGAAGCAAAGAAGGAGGCTCTSGTAARAGAATGTTCYGAAAGCAGTCTGWTTGGTTCTGCKTTGMATGATAAGTTGCGTGAACTCGATTCAGTGAARCCTAACCCATACGGAACATTACACGCGGCRTCCGAAGTTTATAACGAAGCTGGAATACRRGGATTTTGGAAAGGCCTTGTTCCTACTTT[A/G]ATCATGGTGTGCAATCCATCAATYCAGTTCATGATATATGAGACGGCKTTAAAGTACTTGAAAGCTAAACGTGCTGAKAAGAAGCAAAGTTCAAATAAAGTTTCTGCTCTGGAGGTWTTTRTWGTAGGTGCCATTGCTAAACTTGGAGCAACTGTGACAACATATCCTTTGTTAGTTGTAAAGTCAAGACTTCAAGCAAA
2 TAACTATCAAATATCGTGTTGAGAATTGCTGTTCCTCGAGAAGCGGTYAAAATTGCATTTCTSAGTCCAAGAAGCCCACGAGTTGGTATCTTGTATTTGAGYAGTGTTGTTCCTTCAGATCCAAGTCCTTGCATATCAAACATTTGACCACGTCTTTTACCRAGAAGTTCAACAACCGATCCCATATGTTCTTCAGGCAC[C/T]TCCACAGTGGCAATCTCATAAGGTTCAAGCAACTTGTCATCAACTTTCTTGTTRATTACTTTTGGGGGTCCCACCATGAACTCGTAATTTTCCCTTCGCATGTTCTCTATTAATATTGTAATGTGTAAAGTACCACGTCCGCTAACAAGGAACGTATCTGCAGTTTCACCGTCTTCAACTTTCATGGCCAAATTTCTTTC
3 GAATCGTTAAACGGTAGAAAACCTGCATTCATCACGAACAAWATGATACCGCATGACCAGATGTCCGCCTTCGCTCCGTCGTAGCCTCTCTTCGTTAAAATCTCCGGCGAGACATACGACGGCGTRCCGCATAACGTGTGCAACAATCCGTCAACTCGGATCTGATCCGTCAACGCACTCAAMMCGAAATCGGAAACCTT[C/G]AGATCTCCGTTCTCGTCGATTAACAGATTCTCCGGCTTCARATCTCGATGAAACACACCTTTCGAATGACAATATCCGATTGCGGAAATCAACTGTTGAAAATACTTCCGACTATTCGATTCCGATAACCGTCCTTTSGCTACCTTCGCGAACAACTCACCTCCTTTCACAAACTCCATAACGACATAGATCTTCGTTTT
4 CCGTGGGGACGACACCGATTACAAAGATATGCATCTCGGAAGGTGCCGCCTCGGGGTGGCACCCTTGCAGCCGGAATGAAGATGTAGCAGCGGATGATGGCGGCGAGAGTCGCCGGAGGCTGCTCACTTGGTCAAAGATGGGCAACAACGCCAGGCGGATMTTGGCAGGTGCTACTAGTGGAGGCGACGACAAATGTGCT[A/C]AAAAGGGTAAAGTATCGTTCATGTCGTATGACGGGGTTCATCAACTTCATATATTCATCTTTGCTCTCGCATTATTTCATGTTATTTATAGTGTCCTTACTATGGCTTTAGGACAGGCCAAGATGAGAAAATGGAAAAAYTGGGAGAAAGAGACAAGGACTATAGAGTATCAATTCTCTCACGATCCAGAGAGGTTTAGA
5 CTTATAGAKAATGGGAATCAAGAAAARGGTGGGGAGAAAGTGGAAGATGAAGATGATGARTCAAAGAAACTGCTGTTGCCTAACAAAGGCGGGTTATCRAMGAAGAAGAAATCGGGGAATAAGCACMGAAAAGTGCAGTGGAATGATAGGAATGGATATAAACTTGCTGAAGTTTTGGAATATCAACCAAGTGATGTAAG[C/T]GATTCTGAAGAYGAAGATGAGGATTCTTGCATATGTAACATAATGTAGATCTTACACCGCACGCATCCGGCTATTGGCTGGTAATACCATTAGGCTAATCTTCTTAATATCGTGTCAAATCWAAGAGTTCTTGATCGTGTCATGAAACATTRTTGTTGCTATACTATATTACNNTATATRTGATATGGAGAAAAAATRTT
6 AAATSTGAAGATGAAAAGTGCCCTCGACCCATGTGCTACAAGGCCTATGGAAGTGGAAAAGAAGATAGCCCTCCTTGTGACGTTCCTGGATTTGAGAACWGCAAGATGAAACTGATGAGGCATGTATCGTTTGTYGACTGTCCAGGACACGATATTCTCATGGCCACTATGCTTAACGGAGCAGCTATYATGGATGGAGC[A/G]TTACTTCTCATAGCAGCAAATGAAAGYTGYCCCCARCCTCAGACYTCTGAGCACTTGGCTGCTGTTGAGATTATGAGGCTMCAACATATTATAATTCTTCARAACAAGGTTGATCTAATTCAAGAAAATGTAGCCATTAACCARCACGACGCWATTCAGCAATTYATCCAGGGAACGGTTGCYGACGGTGCACCAGTGAT
7 GTTTRGAAGAACTCCTTTATGATGTTCCGGGATCCGGAGGGGTTCTCRACTCRCCATTGCTACTTATTCAGGTGACAAAATTATTATGTGGAGGTTTTATCTTTGCTCTACGACTGAAYCACACCATGAGTGATGCACCGGGKCTCGTTCAATTCATGACAGCACTTGGTGAAATGGCTCAAGGTGCATYRAAACCTTCA[A/G]TTTTGCCGGTGTGGCAAAGGGAGTTGCTTTTTGCACGASAYCCACCACGTGTAACGTGYACTCATCAYGAGTATGAYAAAGTSGAAGACACYAAGGGTACCATAATYCCACTAGATGACATGGCRCATAAATCMTTTTTCTTTGGACCTGCTGAGGTSTCAGCRTTRCGTAGATTTGTTCCMCCVTACCTTAAAAADTGT
8 CTGCCGAMACCACCCTCCACGACCGGCTAGCAATCGTAANCCGGCGGATCCCGCGGYATCGGGAAAGCCATCTCCCTGCATCTCGCATCTCTMGGGGCAAAACTCATAATCAACTACAAYTCTAACTCTAYWCARGCAGATCTAACAGTCTCGGAAATYAATACCAAATTCCCATCACGGGCTGTTGCTGTCAAAGCTGA[C/T]GTGTCTGACCCAGTTCAAGTKAAAAGTCTATTTGATGCAGCGGAATCTGCCTTTGGTTCCCCTYTGTACATCTTTGTTAATGCTGCYGGAATCTTGGATTCCTCYTACTCGTCRATACCGAATTCCTCGCTTGAGGAATTCGATCGGACGTTTGCGGTGAATACMAGAGGVGCTTAYTTGTGYTGCAAGGAAGCTGCKAA
9
GGCTTTCCACCAGACAGAATATTGCTCTCCGTAGGTTCCACGCCAATCACCTTAATATYAGGATTTTGAGTTTTGAGAAATCGACCAGCCCCTGAWATAGTTCCACCRGTKCCAATTCCYGCAACAAATATATCAACTTTTCCATTTGTGTCTTCCCAGATCTCCGGGCCAGTTGTCTCAAAATGTATCTTTGGGTTAGC[A/G]GGATTATCGAACTGTTGAAGCATGTAAGCRTTGGGWGTGCTGTTGACRATTTCTTCRGCTTTCTGAACYGCTCCTTTCATRCCCTTTGAAGAGTCWGTTAGWACAAGATCRGCTCCAAATGCYTTTAGAAGAACCCTTCTCTCRAGACTCATTGAYGCKGGCATCGTCAATATCAGCTTRTAYCCTTTTGAAGCAGCRAT 10 YGAAGACAACAKAAGCSTTAATACTATAGTCCATTTTTATTCATTTCATATTCATAGAAATGGAGCTACCCCATGGTGTAACTTGGGGTAGGCAAGTTGATCACTGTGSTCTTAACTTTYGTCATCATATTAATGCTATTTGTGATTCCTTGGGATCCGATCCCGCTGTCCTTCAAACCCTGAAAAGGGAAATGATCAGG[A/G]CCACGAGCTGGAGCTGARTTAATCTGGACGGTTCCTGTTTCCATCGCGTCACTAATCAAWATCGCTTTGTTWATGTCYTTGGTGAACACGCATCCCTGAAGACCAAAATTGCTAGCATTGCAATGGTGGATCCCTTCTTCAACRGAGTTGATCCTAATAACSGGAATGATGGGCCCAAACGGCTCCTCCCATGCGATTCT 11 GTGGTGAGYTTYAATTCAGGAGTTAARTCTTGYGARGAAGCAAGWGCAACCGCAACATTGTTAGCCTGRGAAGCCTTAGCCAATGAAGCAACAGACTCACCRAGACTTCTRTAAGCGGAAGTTGWTGAGTCGGTGGGCCCTTYTCCGAGCCCAACTAAACTGACCCGTTTGGTACCGAGACCAGAAAGCCGAAGAATAGT[C/T]GAYTGTCCAGCCTTTCCGGTRAAATCTTCTTCYGTAGATACTTCRGATAATAGCCCGTTTAATTGAGAATCGAGYGTCTTCAGGACAWCGTTTTGGAATTTTGAATTTTCGTCTTTTGTCATGTCTTTCTCYGTGACACCAACTGCTAGAATGTCTCCTTTCCATTCRAGCAAGTTSGTATCTTTTGCAGCAAAAGATAT 12 TATTTTCTATGTTCCCCCTCTCGTATTCCARGAACTCTTTTACCGCTTTTGGGCTTGAAAGTGCTTCTATTTGCTCTTCGCTCAGTTTCATATTTGACTGTATTTGTAGTAGTTTGTCTCCATAAACCTGCATGCCATGCTCTTCAAACAACRATTCAGGAATTTCAACTTGAATCATCTTTTGAAGCTGGTCAAGAAYR[A/G]CGTTATCAGTTGCTTGATTTTTAGCTGTTTTTTCCAGTYCTATACAACTCTCTAACAATGACTCCTTAACCTGCTTGATRTTGGTGCATYCAKGAAGAAGCTTWTCGGCAATAGCATCATTCAWCTCCGGCAATTCTCTATAAAAGAGYTCTTTACATTCAACAGTAAACTGACAAGRAAGACCACKKAGATCTTCTTGT 13 
GGRTACAAGCTTTCAATCTTTATCGTATTCATGTCAATACGTTGGCCGAGAAKTCGTGCAAGGATCAAACCCTTTCGTGCAACATCCATACCGCTGAGATCGTCACGAGGATCTGGTTCAGTATAGCCGAGGTCTTTAGCAGCTTTAACAACTTCACTGAACGGTTTTCCATTTTCTACCTCRCTCATTACATAGCCCAA[C/T]GTACCACTCAAGCTGCCAACTATTGATGTMACTTCGTCTCCRGATGAAATAAYACGGTTTAATGATGCTATAACGGGGAGGCCAGCACCAACAGTTGATTCRTGCCTAATGYRACGYGGTTGTAAAACAATTGGTCAAATTCTTCCATTGRACCMGTAAGCGGTTTCTTATTGGCAAGCACAGTGCAGCATCCTAATTTT 14 CTAGTGGTCTTTATTTATTATAATTATAACTAATTTCCACTTCAAACAWCTTTCTTCCGTACAAWADTMTTCCATCCTCACTGCACCTTRGTGCARTCAGTGCTTKKGCTGATCTTGTAAGGGATGTTAACGCCACACTTGCCGGGGAGGCTGGCRGCATTGYYAGCGTTGATSTTGGRCGAGGCAGCAMTCTTTAAGCA[A/G]TTACAAATAGCCTGACGATCAGCSGTTGTTTTAGCCAAGCTATTGAGTCCTTTTACCCCAMTACAACAATTAGGCGACACARCGCCCCCTTTGGTTAGGTAGCCATAGCAYGRMAYYAGTTTRCYYGTCACCTGRCCRCAACTAATAGCCTCCGCATAGGGTGCAGCCACCACCATGCAAGTCACCACTACACATAAAAY 15 CTCCCCTCACTAGCATTTTCGCACTTCCAAAATCACAAATCTTGACTTGGTGAGTCAAAGGGTCCACCAAAACGTTCTGAGGTTTYAAGTCCCTGTGGCATACTCCGGTAACCATGTGCATGTACGCCARGCCCCTAAARATTTGATAYGTGTAAAGTTTCACGTAAATAAGGGGCATGCTTTGGTGTAGATTTTCATAA[C/T]GTTTYAAAACTCGATAAATCGTCTCCGGGAYGTATTCCATGACTASGTTTaaaaaaaCCTCGTCTCTGCTTGTAGTAGAAAAGAAACAATGTTTTAAGGARACRACRTTTGGGTGGTCCATCGCGCGCRTTAATTGCAACTCGCGGTTCTTGTATCTTCGGTCTTGCAAAACCTTTTTTATCGCAACGGTTTCTCCAGTT 16 ACAYRTTCTCAATGCATAAACAACACAKTTTCTAGTTTTCTTCATCATCTAAGGGCRTTAATGARAGCCATCAAMGAGTCAACATCTCTCTTCTCGGACGCATACTTRATAGGCCTTGACGAGTGCTTCGGGAAGAAAAGGATYGTCGGGAAGCTCCCAAGCTGCAATTCTTGTTTCGCRAACGTCTTCTGGTCACCGTC[A/G]GCTCTAAATTTCCCAACCTTTACACCACTTCCAGCTAACTTATCAGCCAGTTCCTCATACGATGYTTCCATTGCCTGACAGAAGGGGCACCAGGGTGCRTAAAGAACAACCATCCAACCGTCTTTTCGGTCTTCCATCTTCAGAAGATTTTCAATACCAGGCCGGCTCAAGTTCACGATGTTTTGGCTCTCAAAGATATC 145  SNP Sequence 17 
AGGATTCARCTAAGAGTGCCYGAGTATGGKTGGACAACACARAAAGTYTTCCACCTCATGAACTTCATTGTAAATGGAGTGCGTGCCATTGTCTTTGGGTTCCACTTACAAGTGTTTAATTTGCATCCAAAGGTTTGCAYCTGGATGCTATTGGAGGTGCCCGGTCTACTGTTTTTCTCAACRTATACTCTTCTTGTCCT[A/G]TTTTGGGCTGAGATATATCACCAGGCTAAGAATTTACCAACGGGTAAACTCAGGATTATATACGTATCAGTTAACGCAGGSGTTTATTTGATACAAGGTTGTCTTTGGYTATATCTATGGTTGGATGATAGCAGCTTGGTGCAATTTCTCGGRAAGATATTTGTTGCAGTTATATCRCTTATGGCTGCAGTAGGCTTNCC 18 TCRACAGCATAMGGRACCGCAGACGGATCACCTGGCCCATTGCTGAAAAGAACTCCATCTGGTTTCATTTTAAGAGTCTCAGCCGCAGGCCAYGTTGACGGAACAACGGTTATTTTGCACCCGTAAGACRCCARCCGCCTYARTATGTTGTGCTTGATACCAAAATCATACGCCACCACATRGAAAGTCTCCTCGTGTCT[C/T]CGATTGGAGTTAAAATCCCATTCCGAACCGGTCTTTTCAACCCATTCATAAGGYTGCTTACAWGAGACRYTAGAAATTAAATCCACGCCRACAATRTCCCAWGTGCGGGACATCTCCAAAAGTTCTTCATCGGTTTTYGACTCTTCTGTGCTCAAGACACCAATCRGGCTTCCATCTTCTCGTAACCGGCGAGTAATCGC 19 CAATTACCGATGGGGTTGATWTGGGYGGACACTGTGATGGTGGTTATGGATATTTCTGTTCAGCTAATGAACTCCGTTTGGATTCAACGGGTGTACAAGATCTTGATAAATTAGTCATCAGCACCTTGTTTGAGRTTGTATCCACTGAGAGTAGGAATATTCCGTTCNATTCTATTCATGAAAGATGCCGAGAAATCCAT[C/T]GTTGGGAGCTCAGAAGCRTATTCAACATTCCGAGATTGGCTYGATAAGCTCCCAGACAATGTTGTTGTAATCGGTTCACAAACTCACACCGACAATCGCAAAGAAAAGKCGCATCCTGGTGGTCTACTTTTCACAAAATTCGGCAGCAACCAAACCGCTTTGCTCGACTTTAATCTTCCKGATATCGGAAGATTRCATGA 20 AAAAAGCTTGTGCTTCCTGATGGATCRATAYTRCGGGCCAAACTYCCAGGGAGACCDACAMGAGATTGCTTGTTTACTGATCCCACCAGAGAYGGAAAAAGTCTTTTGAAGATCTGGAACGTTAACRRTTACACTGGAGTGGTKGGAGTYTTCAACTGCCAGGGAGCAGGGTGGTGCAAAGAYGGTAAAAAGATCCTCAC[A/G]CACGATGAACAACCAARCACCATCACTGGCGTCATCAGGGCTAAAGATGTAAATTACYTGCCKAAAGTTGYCGAYTCTACATGGGATGGTGATGCAGTTGYATACAGTCAYGTCGGYGGTGGACTGKTTTACCTACCAAAGAATGCATCCATCCCAGTAACATTGAAACCGAGAGAATACRAGGTTTTTACAGTAGTTCC 21 
TTRCAATTAGCGACGAACCACATCGGTCACTTCCATATGACAAAYGGTTTATTGGAAACCATGAAAAACACTGCCGAAAAGTCRGGAGTYCAGGGTAGGATCGTCATAGTGTCDTCSGTGCTTCAGAAAACGACTTATAAAGAAGGGATTAGATTCGATAAAATAAATGACGAAAAGAGYTACAATGCTTTACTGGCGTA[C/T]GGACARTCWAAGCTTGCTAATGCTTTGCATGCAAAGGAACTCGCAAGGCGACTCAAGGAAGAAGGTGTTAATATAACYGTAAATTCACTCCATCCGGGAGTTATCGCGACCAATCTTGCACGCCACACAGCTGTKATGCGAGTGATYTTTGGGTACGTCGTACGACTTTTCYTGAAGAATGTGGAACAGGGGGCSTCAAC 22 GAGTTGYACACAGKWTWWTGAWACGATAAAAGCGGACACGACATATATTATTGCAACTGTTGGCAYATAAAAAATCATACACAAACAACGTATGTGAYGGMAATATATMTTAATAATCATATTAAYGCTATCGAAATCTAGGTTGTGCGGCAACCACCATCCCACSATCATCTGCTTCAAGGTGYTTTSGACTTGGGTGC[C/T]GATGGAAGCAACTGATAACATGATATGAGCGAGGTGGTGAATCCGAAGGCTCCTGTRACRCGTGGAGTGATCTTTTTAGGTGCCARTTGAAGCAGCCCAACRGCAACCACTACATCCATCGCTGCTTTGATCAAGGCCAATGTCCTCTCGTTTGAYGTTGCAAGTTTAGCAYGRTATTGTTCATTCTTGTGTTTATCGGC 23 TTAGCAAGCTCCTCTTCRTCYCCTCGCCCGCCAGCTGGAAGCGCRACCGCATTGTCATAACCGGTGGACCCACCTCGGCCCTTTGGGTCAAGGAAAGAAGAGCCTCTGTAAGACGGCACTAGAAACTCCCCACTAAAGTTCTCGGGTGGACCGGTTGCCACAAGTTCYTTGACTGTAAACAAGAACGGCACACGCTCACC[A/G]CCAGGAAGCTGAACRGTSACTGCAGCGTAATCGATACCGTCTTTTTCTTCGAATTTGAGGGTGCCRTCGGATGACACTTCCAACGGRCCCTCGATTTCGTCAVGGGTGTATGTCAAACGGGTCATTAGTTTGGTCTTTTGRAACTCGGGTSYAGAGTTCTTGCTTATACCCTCGGCTTTGACCGTGAAAGATGTCGGCTC 24 TTTGCGATTTTWCCTTCCARGTGATTGACTTCAAYYTCCTTTTTCTCCACCTCRATAACCTTAACTTTGAATTCATCGTYTAACGATTTTCTCTTTTCTTCCATTTCTAACTCAAACTCGCGTGTCTTYGCATCCAATAKGRCCTTGTGTTCATCCWTGAGTTTTTGAATCTCTACTTTTTCTCTCGCGTTCAGCTTCTC[C/T]TCTACCCCTTGTAATTCCYTTTCTTTTACYTCTAGATTCTTTCTCATAACCGCAGCTTCCTTCTCTTTCAAAGCTACGTTTTTCATCCTCGTGCTGATTTCATCKTCTTTAGCTTTTAAAGCCGCRTTACCGATTTCAATCTTCTTTTGTTCTTYTTCGAGATCAAGTTGTTTCTGCTTAAMGATTTTATCRTTRTCATT 25 
ACRTTGTATGGAGCYGAAGACATTACCCTTTCTTTGCGCTTCTCAAGAAACCGAGCCAAGGATGCTTTGCGCGCTTGCGGAACAGCYGAYTGCATAACTTGTTCAAGTGAGCCGATAGCTCTWGGGGTRTCCGCTTTAACTAYTTTAGCACCTTCRTCTTTGTTAGTARGTCTAAAAGGGTGCGAKGACACAGACATTGG[A/G]CTCGAGATGGCTGAGCRGGGCTGTRGGTTAATCGGTTGGGTCATATAAACCGCATCCVYTMCYGGTGTTCKTGAAACAGKCACCTGGGCTTGAACCCTAGGCTGAGGTGTGSHYGCARACGCWCCATTTGCCGCCAAGAACATGATAGCTTGAGCCTTCTCSGGGGAGATRTCATTGTACACGTTAACCGTTCCACCGTA 26 CRCTAGAAGCAATGACCKTGATTTGCTCACTTGTATCCAYGGGACCAAACTTAGTAGAAGTGCATTCAAYTTTGATAACAGCTTACGATTATTATCAGAAAAATTCGCGACGCCTTTAAAGGATTCCCGCAAAACCTTCCCGAAAATCAGAGGTATCAAAAGTGTGACRACGAGRCTTTTAAATAACTGATCAGCAGGAA[C/G]AGATGTACCCACTCCACCAGCTATTAGTTTGGATATRGTAAAGGGAACAAAAAGAATTCCCAACAGATTAGATATTAMGGTCATTGCAAGAGCCAAGGCTGAATTTCCTCCAGCAAGCCGGGTTAGAGCAACTCCGCTTGATAATGTTGTAGGCATACAGSAAAATAGAGCAAGTCCTGTGACAAATTCTTGRGGTTTAA 146  SNP Sequence 27 CTCGACATTCTTGTTGTTGGCGGTGGAGCCACCGGATGTGGCGTTGCTCTTGACGCCGTTACTAGAGGCCTTAGGGTTGKTCTGGTGGAAMGTGATGATTTTTCKTCTGGAACGTCGTCTAGATCTACGAAGTTGATTCATGGAGGTGTTCGCTACTTGGAAAAAGCCGTGTTCAATCTAGATTACGGGCAACTAAAGCT[A/G]GTTTTTCATGCRCTAGAGGAACGCAAACAGGTTATCGATAACGCACCACACCTCTGTCATGCTTTACCCTGCATGACACCATGTTTTAGCTGGTTTGAGGCAGTATACTATTGGGTGGGCTTAAAAATGTAYGATCTAGTGGCWGGAAAACATTTACTTCACTTATCTAGATATTACTCTGCACAAGAGTCCRCCGAACT 28 ACMAATGGCAAAAAAGGTGCCATGTCATATTTGGCACAYAGGAAATAATAAAGTCCTAACATCAATTATTCATGTTGYTTATYSGKWTATGATCAATATTGAAYGGAYWTTWCTGCGGTCATGAAAGAGGATATAYGCGATRTATCTTTTTGAATTCCATCTGAAGCRCATATACCACTGCTGACATCAAGTCCATCTGG[G/T]TTTAGTATAGATATGGCTTCRGTTACATTCTCGGGTTTCATTCCTCCGGCCAAGAGCCATCCATGTTTGCTATTAACYGGTGGTAGCTTGAACCGAGACCAATTAAATCCTTTACCACTGCCCCCTTTTGCACTGTCAACCAGAACCCAATCAACTAAAGAACAATCTCCATCTGATATTTGGTTCATAAGGKCACCATC 29 
AGCAGATGCTGCAAGGKCTTWAAGATCATAAAATTACNNATATTCAATTCCCAATTTYRCACCTTCATACTGAATTTCATTCTTTTTTAACCTAATATTTGCAATYYGCAWYGTCAATTYGGTRTATCATCTAACAATCTCTCTTTYGTGAYAACWARGGCTTATGACAAGGCAGCGATCGAGTGTAACGGAAGAGAAGC[C/T]GTGACCAACTTCGAACCCCGTTCTTACRAAGGAAACACACTTCCTRCAATTCATCATGAAAATGATCACAACCTGAATTTAAGCTTGGGAATTTCGACCCCTTCGCATGGAGCGGGCTCAAGTAGAACTGACAATGTAGAGCAAGTCCATTTGAATTACRTGCATGAYCCTAGGAGATTACAGTTGGAGAACCCTGTTTC 30 AAGAAACAAGACTCACCCGCTGCGGAAAACCAATCCTGCASCAACTTTGGTAGAGGAAAATGGGTCAGGGATGACACTCGCCCTCTGTATTCMGGATTTGGTTGTAARCAGTGGTTGTCATCRATGTGGGCTTGCCGKTTGACACAACGCACRGATTTTSAATACGAGAAGTTAAGATGGCAGCCAAAAGATTGCCAAAC[A/G]GATGATTTTTCAGGGCCTAAATTTCTTAAAAGGATGCAAGACAAAACGCTGGCGTTTGTCGGAGATTCCTTAGGGCGGCAGCAGTTCCAGTCATTAATGTGCATGATCACTGGCGGTGAAGAGCGGCAAGAYGTTCAAGAYGTKGGTAAGGARTTCGGTCTAGTGAAAGCCCGTGGGTCGGTTCGACCCGACGGCTGGGC 31 TACTACAATCGGTTATTTCCACAGTGGCTTCTSTACTACCCACCTATTGATATTTTTATGAAGGCCCAAAACTGTTTGAACCAAGAAAAAGACAGAGCTAGTCAATTCTTGCATCAAACCTCGGTTGAGAAGTTGCTACAAGTTGTGCATGAGCAGTTATTGGGTCAAACCGCATTGGATCAAGCTAAACTTTACGAAAA[A/T]CAGAAGACTGAGTGTGGCGATTGCTCAATCCAATACCAGGAGGTGCTGTCGGGATGTGCGGGTTTGAATCTCGGTGAAGGGAGTTCGCGAGTGTGATGCAGATTTAGAYGGAGAATTTTGGTCATGAACCACGGYTATTTTAGTAATTCCATTAAATGAATATAGATGCAGATTTACTCACCCTTTTAACCTGTCTCTWC 32 ACCCTATTCATTATCCCAADAGGTTTTGTTAGCTCATCAAACTCATCYGCATCTAAACCCAAAGGCTCACTSATCATCAAGCTTGTTCTCGCCTCACCCGTTAYCTCCCCARCTTCTTCATTACCAACATCTAAGCTGATATATTCAGCAAACTTTCCTTCCTGACACAAKTCCACATCATCATCCATGTCATCATTCTT[C/T]ARATAAAGCTTTTCTGATAYGCCCBTTKCTGATAYGCCYSTTGCCTCTTCAACCCTAACRGAACCAAAGTAACGACCCCTCRAATTCTTCTGCTGTTGATCCATTGWGCTTTGYTGTTTCCTTTTCAACTGGTTTTTMGGATGGTGGAGCTTGCACTTTGTACCTTGAGAACACTTGCCTGTTGCTTCAAACGCTGGGCA 33 
AAGCATTCTTCTGTGAYGTGTGTGCTATCKCCTCGGTATGGAAGTAGCAAACTTAGCTTTTTTAAGCAACAGGTTGTAGGGACCGCTTTTACACATCAYCARAAATGTGTTCTTGTAGATACACAGGCTGCTGGTAACAATAGAAAGGTTACATCATTTATAGGTGGTCTTGATCTCTGTGATGGCCGTTATGACACACC[G/T]GARCATAGGTTATTTCACGATCTAGACACTGTTTTTCTTGACGATGTGCAYCAGCCGACTTACCCGGCAGGAACCAARGCCCCGAGGCAGCCATGGCAYGATTTGCACTGCAAGATTGACGGTCCAGCTGCATATGATGTACTGMTGAACTTTGAACAGCGTTGGAAAAAAGCGACCAAGTGGAGAGAATTTGCACTCCT 34 GCCATTAAGCTGACCCGAGACAATATCAAAGCTATAATTATGGACGTTATGTTTGGTGGTACGGAGACAGTGGCTTCTGCTATAGAATGGGCCTTGACCGAGCTGATGCATACACCGGAAGCRCTRAAAWCCGTGCAWCAAGAGTTGGCTGATGTCGTTGGTCTTRACCGTCGGGTCGAAGAATCCGACTTGGAGAAGTT[A/G]ACGTACTTCAAATGTGTTGTCAAAGAGACRCTTCGTCTACACCCTCCTATCCCGGTTACCCTCCACAAATCGTCGGAGGACACAAAGGTTGCGGGATACCACATCCCGAAAGGGTCACGTGTTATGGTCAACGTGTTYGCCATCAACCGCGATAAGARCTCTTGGGAAGATCCTAAWACTTTCAACCCGTCTCGGTTTTT 35 GGTTTTCCCAAATTCTTCTTAGCAGCAAAYTCAACAATAATAGACTTGCAAAAGATTTTCTCCTGTTGTATAAAAYGCAATGCRTCAGTCTTTGYCTGCTGTGTTATGGCAWTGTATGCAACTCTKGAAGCATCYGCTTCCGACATCTTACGTTGGTGAAAYTTGTTRGAAGATGTGTAGCTCACACCATCGACCCAAAC[A/G]GTGGTCCTGAACCATGGTAGATGCTCAGATCCTTCATTTTRTKTTTGATAWATGGGGGGTGGCTTCTTCAACTTTTGAACATGTTGGTTAAGAAGACACTTGTATAGAAAAGTTGGCTTTTGACCAGCAGATAAGGTCTKAGAGGAATGGTGAGCATTCTCCGGCGCCGGTTGAGACTCTGGYGTCGACATAACGCCGCC 36 AAAGGCATATGGAGTTATGTGTGTAAGATGGATAAAGCTCTTCGGAWGTATTCTGCCGYCAAACGTATTCAGYTGACTTCAACTGTTRGTGCAATCACTTTTGTTCAGAAAGTACCACTTTCATTGGATTCCACTGGCGARAGGGCTGATCCAGAAGTSYYGGYCTCTGAAGAGTGCAWTAAGAAAAAACYGTCTAGAAA[A/G]CCTTCAAAGAAAGTGATAGCGAACGGGTTGATCATTGCTGGTGGTGTGATCTGTTTGGCTCGTGGWCACACCAAYTTCAGTGCAAAGGTTGCAATGGCATACATAKTAAGTAAGTTGACAAAGCGTCGTGAAACTTCTTTGCACTAGGGAGGGGGCRTSTAAATGCTTTTTATGGAGATTTGATATGCSAATACAASCGA 147  SNP Sequence 37 
AGMGTCTKAACAGAGTGTCCAACTGCGGCAATTTCACCACTAACATCACGCCCCAAAATTAAAGGAAGAAGTGACTCAAATAGAGAACGMCCGTAGCCTGCTCGCATTCTAGTGTCAAGCGGATTAACCGAAACAGCGCGAGCTCGAACGAGGACCTCATTCGGCTTGAGATCAGGAACACGAACATCATCRCGCACTWC[C/T]AGAACATCAGCGGAACCRAAGCGCGGCAAAAGCACGGCTCTAGMGGTGGTGACAACGGTTCTAATACCRAAAYCTAASTTGTTTGAAGGAGAATGAAGGTGTTTGATGTKRTCAGTTGTTGASAATTGGAGTTGTTTAGATCGGAATAASGAGCGCATCGTGTGATTGTTGTGGTRARWGATCRGAAAATTGAAGATATT 38 TTAGTCAGCATATTTTTAGTTGGATTATATATCTAYCAGCCCAGGAATACTGCTGCCTGYTAYATCTTTTTGTCCCGTGGCTGTTCYGATCTTGAAAAYACTCCAGCAGTTCCTTCCAGAGAATTAACAGATGAAGAAAMYGCAGCTCGGGTGGTCTTCACGGAAATTTTGAAATCATCWCCTGCTGCTCCAAAGAACGC[A/G]AAAGTCGCTTTCATGTTTTTTGACTCCTGGCCCATTACCGTTTGAAATGCTATGGGACAAGTTCTTTCAAGGCCATGATGGTAGATTTACTGTTTATGTGCATGCCTCAAGAGAACAACCGCCTCATGTTAGTCCATACTTCAATGGTAGAAATATAAGAAGTGAAAAGGTTGATTGGGGGAAAATATCAATGGTTGATG 39 CATGTTAACAATTYCGTACATGAACAAYTGTAAAACAACAAATTCAACRGTACAAGAATTCGAATRGCAGTGTATGTGKATTAGCGCATCTTGTATTGTTGTGTTCTTGAGAYRACACARCGTTGACCRKTTAACGAGTATTTGAGTGTGATATCCACGTCYCGTGGATTCTTCTTATTTTGTGMSACWGYCATGMTCCC[A/G]GTAATCGATTCGCCTTCACATATAGTCAACACCTCTTCAAGRTATAAAACRGTTTGTTTCCAGTGTGTGTTTCTTGWTCTTGGTCCTGTAGAAAACCCTGTGRGYTTGTGACACACRGTAAAYGCTACATCAAAATATGCGACTAAAGCATGRACATAATCGTCACGTTCAGCCACAAGTTTAAAAGGTGCTGTGAAAGA 40 CTATAGAAATGGATATGCYTTTGTTGACTCGATYKTCTCTCRAAACTCCCGCAATCAACCTGTGTTTGWTTCTCTGTTACAATGTTATGTCTTTACCATAARATATGCCGACTGTTATTAATATAATACCATCRTCATGTGAAGCCGGTGCCCACCCTGACTGGTCTTATACGYCTTGGAACTCTTCGTATAMATTCAAA[C/T]GCTAATACAAAAATGCTAAATATGTATAGCACGAYGSGCCTAAYTGTTTCAGCAATGCCACCTTCCGACTTGTATCCACTAGAGAAATTCATTGACCAGAAGCCAGTGCTYYTCTCYGACAAATTGGTCCAGTCATGYGGGCTTGTCTTGCYTTCTGMTGCCTTTATCAGTTCAAGCAAAGCTTCRTTGACCTCTTTCGT 41 
AACACYCGGCGAAAGGTTGACTGGTTGACCGACAAAATGCGTAGCCGAGACCACACAGTCTCYGCCACWCACGGTGACATGGACCAAAACACCCGTGACATCATCATGCGAGAATTCCGTTCRGGTTCCTCACGTGTTCTGATCACCACTGATCTTTTGGCACGTGGGATTGACGTACAGCAAGTGTCACTKGTGATCAA[C/T]TATGATCTTCCAACTCAGCCTGAAAACTATCTKCATCGTATTGGACGTAGTGGTCGGTTTGGGAGGAAGGGTGTTGCGATCAACTTTGTGACTTTAGACGATGAACGGATGCTTGCGGATATTCAGAAGTTTTATAATGTGGTGGTTGAGGAATTGCCATCTAATGTTGCTGATCTGATTTAGAAAAGATTTTGAGATGT 42 ACTGTCAGCCAATCGTCCAARCCGCCCTTTCCCTTTCCKGCATGAATAGCAATCAAATTCTTATATGCAGGGGCACTGTCGTCTTGATMGTCCTCTTCRTCKTCTTCTTCSAAATCATCCTTAAATTCAGCACTCCTGGATTTAAAGCTGGGAATTTCACCACCCCAGGCTTCATCATTCTCTTTTTGTGAAGCGTTTCC[A/G]CTTGCTTTAGTTTCTTGGGMTTTAAGATACWCTTTAGGAGGTTTAGTAGATATCATAACCCTTCTTTTCAGAGTTTCTGGAGACGGAAATTCTGGTAAAACCTCTGYGTCRGGRGTAAACASCATGTCYCCMAATGTGTCATGAACCATCTTAGCCACTTTAGCCTGAAGATCRGTAGTAAGGTGGTCTTCTAGGGTAAT 43 TCTGTTCTACACKTCTYGATTGATTAAAAGGTGGAAATCATGAAGCTTGACTTTAGTGGACTAGAATCAAGTGCACCACTCTACGGGGGACCAAATGAGTTRCTTTGTGASGGGTTCTCAGCTGCCCCKTCGTTTGACCTTCCAGTTACATCAGATTTTGATGGATTCCAGAAGAATTCTATCCAAATGGTAAAGCCAGC[A/G]AAGGGAACAACAACCTTGGCTTTTATTTTTAAGGAGGGTGTGATGGTTGCTGCTGATTCCCGAGCTAGCATGGGGGGCTATATCTCATCTCAATCCGTGAAGAAGATCATTGAAATCAATCCCTATATGCTGGGTACAATGGCAGGAGGAGCTGCWGATTGCCAATTCTGGCACAGAAAKCTAGGCATTAAGTGYYGTTT 44 CATTATATATTTCTTATTTTCTTTGTCACCCACATTTTGTCTAACTTCTACATATCATTTGTATTCTTTATAGCTCCAAGACTATTAATTTCCACTCATTTCACAAGTTTCAAAGGCCTCATCATCACATTATCYTCATGATCCAATATATGGCAATGGTAGACATATCCCGGCTCCRCGGTTGCATCRAACGCATACGA[C/T]GCATTCGTATGTATATACGCAAACTTCACAATGATCTTCGTCACGTAWCCGGGATGCATCTTRTACACATTCTTCCATCCTTTCTCATACGGTTCCACACTCAACTTTTCACCACGTGCGTATTTTTCAATATGACATTTTTCCGCATTATTCAACTTATTCATACATTCTCTAAATTCATCTATATTTGTTATATTCGT 45 
GACCCACTGCAGGCRTTGCAAGCTACRAATGGAGTTGTCGGGCCTATTGTTGGAGCRTTTTCGCTCCTTGCCATTGCAACTTCTTACATTGGATTYGTTTTGGGCCTCTCCGACTTCCTCTCTGACTTGCTGAAACTACCATCTGATAAAAACAGACCGCTGCCTTACATCCTCACRGTGTTTCCGCCGCTMATACTGTC[A/T]TTGCTAGACCCTGATATCTTTTTCAAAGCACTGGATTTTGCTGGAACTTATGGRGTTCTGGTGCTGTTCGGCGTTCTTCCTGCTGCAATGKCCTGGTCCGATAGATACTCGAGTTCCTCTTTGTCCTCARATATCCCAGAGCTTGTACCYGGAGGCAGGTTCACCCTTTCACTAGTAATTGGAGGTGCAGGATATGTTAT 46 GTGCAATATGCTCTTAGYAGGATTAGAAATGCTGCAAGAATGCTTTTGACTCTTGAAGAGAAGGAYCCACGTAGGATTTTTGAGGGTGAAGCRCTTATGCGAAGAATGAACCGTTACGGGCTTTTAGACGAGAGTCAAAATAAGCTTGATTACGTTCTGGCTCTTACTGTTGAGAACTTTCTAGAACGCCGTTTGCAGAC[A/G]CTTGTGTTTAAAACTGGTATGGCTAARTCTATTCACCATGCTAGAGTCCTCATCAARCAGAGGCACATCAGGGTCGGGCGACAAGTGGTGAACGTACCVTCATTCATGGTGAGGGTYGACTCACAGAAACACATTGACTTTTCTCTCACYAGTCCKTTTGGYGGWGGCCGRCCAGGCAGAGTGAAGCGAAAGAACCAGAA 148  SNP Sequence 47 TTAGTCATCGRTTTCGGTTATCCAAAAGCGATAARAACCGGGCCTCAAGACTGGTTTTAGGTGGTCAGTTTTCTGGCTGTTCTGTTGTTAGAGCAAATCCAGTGAACAACAGCATTACTCCRGCTGGAGATGAYGACGAAGGCGTTTCGTTGGGGACGATGAAGCTMCCGATTAATACTGATCTTGATCGGTTTGAAACG[C/T]TGTTGTTTCAGTGGGCAAATAGTATGAATCAAGGTGCACAGATACCACTTCCTATGCCTCTCAAGGTGGACAAAGTTAAAGGAGGAATAAGGMTAGGTTTCATTACAATCGGCGACGGAGTGAYTGAGGTTCCRGTGTAYATCGACTGTTTGGTTTATCCSGCAGYCGCTGGTTCACCACCGATTTTCCGGGCTATACGC 48 ACACTGCCGCGCGCAGTTGCCGATGGGTTGTTATGGAACGGGAACTTCGAGCTCGGGCCAAAACCCGCGGACATGAAAGGCACGGAGGTGTTAAAGCATGATGCCATCCCCGGATGGAAGATCTTCGGTTTTGTCGAGTATATTAAATCAGGCCAAAAGCAAGGRGACATGTTGYTAGTCGTTCCTGAGGGAGCCTTTGC[A/C]GTTAGGCTAGGAAACGAAGCAACAATCGAGCAGACCATAAACGTGACTAAAGGAATGTATTATTCCCTCACGTTTAGTGCTGCTCGAACATGTGCTCAAGAAGAGACACTCAATGTATCGGTAGCACCTGATTTTGGYGTGCTACCAATGCAGACRTTGTATAGYAGTAGTGGATGGGACTCATATGCSTGGGCTTTTCA 49 
GAACGTTTAGCTTGTCAATGTGACGGGTGCACGTGCAAGAACACKTATGGTGGATATGATTGCAAATGCAAMGGGGACAAGMTTTATATAGCYGATCAAGATGCGTGTATTGAAAGAAAGGCTTCAAAGTATGCTTGGTTCATTAGCTTGGTGGTACTAGGGGTGGTCGYGAGTGCTGGTTTAGCCGGCTACATCTTCTA[C/T]AGATACAGGCTAAGGGCATACATGGATTCGGAGATCATGGCAATYATGTCTCAATACATGCCTCTCGACAACCAGAACCAAAATCAAGTGGTTGTTCAYGAAWMCGAACCTCTRCGTCAAACMTCAACCGTATGATGCACAAGGAARGCGRCARGYGACAGACAAAAAACAGGACCCATGGGACAATTTTCGGACCAATG 50 TGCCGATGGACGACACCTACGCTAGAAGAYCCAGTGAATTGTGAGAACTATAACTATTTCAACAATGCTTTAGGTGAAACATCAATGACAAGTGTAGAAGATGAAGAGGTTGTTAACCGTTGTGGTAGGYACACATTAATGGCAAGCAAGAAGATGGTAGGAGTGTTCATAAGTGTGTGGATAAAYACATCATTGTTCGA[G/T]AAGTATAATATATCAAGAGTGAAAGTGAGTGCTGTGGCATGTGGGATTATGGGTTATTTGGGAAACAAAGGGTCKGTGGCGGYTAGCATGTCGATTGAAGGAACAAGCTTCTGCTTTGTTGTTGCTCACYTGGCTTCYGGTGAGAAGAAAGGSGATGARGGGAGAAGGAATCACCAAGTTTCTGAGATCTTTAAACGAAC 51 GGSCTTCATCGTCCGCTTTTGTCAATCCAGCTAACAAAGCTCAAAGATGGGCTTGCAATGGGCTGCGCRTTCAACCACGCAATCTTAGACGGTACATCCACGTGGCACTTCATGAGYTCATGGGCCGAAATTTGCACCGGATCCAAATCCATATCCATTCAACCTTTCCTAGACCGGACCCAAGCGCGTAACACGCGCGT[C/G]AAGCTCGACCTAACCCCACCRGAKCAACAAAACGGCGACGGAGCAACCGCCACAAAGCCACCACCACywaaaagaaaaaaTCTTCAAATTCTCAGAATCCTCAATCGACAAAATCAAAGCTAAAGTCAACGCCAATCCARCAAACGAATCCACCAAACCGTTCTCCACTTTCCAATCGTTATCCACACACATCTGGCACG 52 ACGCGGTTATGGGCATGCGTGKTTAGCTTGCTATTGGGTTTCCTTAGTTTCTTATGTCTTTCTATGGAAAGCATACAAACATGTGTCTGATCTTAGAGCTGCAGCATTGATGTCACCCGAAGCTAAACCCGAACAAWTTKGCTGTTTTAGKKKAGAGAYATACCRGCATCATCTGACGGGCAGTCTCGAAAGGAWCAAGT[C/T]GACGCGTATTTCAAAAACATATATCCCGAAACGTTCTACARGTCRTTGGTGGTRACAGAGAACAAAGAAGTCAACAAAATCTACGAAGAGTTGGAAGAGTGTAAAAAGAAGCTCAGACGAGCTGAGATCATATACGCTGACTCGAAAAAAGTCAACCCCGAAGGAGTTGTACAGACTCACAAGACTGGCTTTCTTGGTCT 53 
ATACAAATATWTTACATCACATCACATATSACCCAACAAATTTTTKATTTTACTTCAATTGTTTCTTTTTCCAGATCAATGCTAATCCGAACACAATACGAGTCGATCTTCAGTTCATTTTGSCCGTCTCTTTACAKCGTCAGTGTTCTTGATACAAAGATCTKTTTCTTGTCCTCRCRAATAATGATCATTTTAAACCC[A/G]AAATAATAATAGTTAGTAGCAGCGTCAGCATCACCACCGCTACATGAACTAAACCAATATCTTTTTACATATAGATACATATTTACAGTTAATGGGCTAACGTASAYCGCTCAGTTGTGCTGTAGGGATGACARCCATTGTTCCATGTACGCTTCGCTCTCGGGAGARTAGATGTCGGGGACTTGCGCAACGCCAATAGT 54 CRACARYCAAATCKAACAAWCATGTTCCTTYGAAGGTTTGCTCGRCCATCATCMWTGATGATGATGGCAARGGTGAAGGAAACAACAGGGATCGTCGGTCTGGAAGTGGYCCCTAACGCACGGGAGGTKTTAATCGATCTTTACAMCAAAACCCTAACCGAGATCCAGCGAGTACCGGAGGACGAAGGCTACCGTAAGGC[C/T]GTCGAGAGCTTCACGCGCCACCGGCTGAGCGTRTGTGARCAGGAAGAAGACTGGGAGWCCATTGAGAAGAAACTYGGYTGTGGTCAGGTTGAAGARCTTATTGAAGAGGCTYGGGATGAACTCAAACTCATCGACAAAATGATCGAGTGGGACCCATGGGGTGTGCCMGATGACTACGAATGCGAAGTAGTTGAAAACGA 55 GAGGACAATTCTGTAGAAAATAATCCRTCTTTGAAGTCATTCACTCTCAAGGAATTTACBTCTTTAAKKDTCCTTTMTTWTTATTAATYGTTYVKTTCACTACSACCGAYTGKTGTTAGTTACYAGATGTTATATAACTCCCATCCTTAACAYGTTATTWTARTGTTCAACAGTTGTGATGTTTTGAAGCCTTAYGTCCC[C/T]CATATAGATGACATATTCAAAGATTTYACRTCRTAYAAGGTTCGAGTTCCTGTAACAGGGGCTATTATTTTGGATCAAACTTATGAAAGGTGTGTRCTGGTAAAAGGATGGAAGGGRACAAGCTGGAGTTTCCCTAGAGGAAAAAAGAACAAAGATGAGGAAGATGATGCKTGTGCYATYCGAGAGGTTCTGGAAGAAAC 56 ATCTCCTCAATTTCTACATTTTWGATGGAGGCTGCAGCGTATAAYCCCCGCACAGTCGAAGAAGTTTTTAGGGATTTCAAAGGTCGTCGTGCCGGAATGATCAAGGCTCTCACCGCCGAGGTTGAAGATTTCTACCAGCAGTGTGATCCTGAAAAGGAGAATTTGTGTTTGTATGGTTTTCCAAGTGAGCAATGGGAGGT[A/G]AATTTGCCTGCTGAAGAGGTCCCACCAGAACTCCCGGAACCAGCATTGGGTATCAACTTTGCTAGAGACGGAATGCAAGARAAAGATTGGTTATCGTTGGTTGCTGTTCATAGTGAYGCTTGGTTGCTTTCCGTCGCATTCTATTTTGGTGCCAGATTTGGCTTTGATAAAGCTGACAGGAAGCGTCTGTTCAACATGAT 149  SNP Sequence 57 
GGYGATTCGAATCACTGTGACACSATTCGGAACTTGCTTAACATTGTTTTGAAYTTYGAAAMGGGGATTATTCCGCGGCTGTTTGATAYGAACATTACTAAMCCCGAGTAATCCTCGGCAGTCGTTCAGGATCACAAGTGTTTTCAACGGGCATTATAACCATACCGGAAACTATCAGGGRWTTGAATTCRCTATACAAC[C/T]ATGGTTACCGAGAGTACTTGAACARSTACTTYTCGGGGGAGACTGTTCCGGACACGTTGATCATGAATTCRGGGTTGCACGAYGGGGTTTACTGGCCGAATTTGAGAAGATTCATAAAGGGGGCCGAGGATGCTGCMGCATTTTGGGCGGAGGTTCTTGATGGAGTGAGGCGGAGAAAGGTTGCCGTCCCGAATGTTATT 58 AATCTTACAAAATCTGCAGGAAGACTCCTTGTTGCTGATTCCGACTGTAGGGCTTCGAAGCTTGATAGACGTCTTAGATTAGCTCTATCTTTTCGCTGTGCGGAAACACCTTGAATGGCGGTAAAAATMTCTGGTGTAAATACATAAACACCACAATTTATTCGGTCACTAACGAAAGTTTCAGGTTTCTCRGTATAATG[C/T]AGCAATTCGTTGGTATCGGGATCAGCTACCAGCTCACCAAATTGGTCTGCTGATTCAGGGGAAACCTTAATTACAAGGATCGTTCCCATCCCACCGTATCTTTTATGAGCATCRAGCATYKCAGGYAATGGGAAACTGCAGCAAACATCACAGTTTAGCAAGAAGATGTGTGACGGATTATCTTCCATGATYAGATCTCT 59 AGAGGACTAGACGATTACTTYCCYGAAGAAGAAATCCGTATGAGAAGTCATGAAATGCTTGAAAACGAAGATATGCAGCATTTGCTTCGGTTATTCAACATGGGCAACGGTGGGGCCCATGGTCAAACCTCGGGTAGTCATGTTAACGAAAACTATTATCAGTATTCATCAGGTTATATGCCGAACACACCTTCTAACTT[C/T]GGGTATGGATTCGATGTGGATAAAACCCGTTCTTCTGGGAAAGCTGTGGTCGGGTGGCTTAAACTCAAAGCRGCGTTGAGATRGGGCATTTTCATAAGGAAACAAGCAGCCGAAAGAAGGGCCCAGATCRTTGAACTGGAAGAAAGCCCGTAACTAAGGTAACCTCGTTTGGTTTTGATGTTTTGAAGGCTTCTTTGGCG 60 ATATCRATAAAACCGATCGCCTTGGCAGCACGACTTCTTAGTGGTCCTGCAGATATTGTGTTSACTCTGATTCCATGYTTTCTTCCAGCTTCAAATGCCAAAACTTGTGTGTCACTCTCCARACCAGCYTTGGCTGAACTCATACCTCCACCATATCCTGGTATGATCCTTTCAGAAGCAATGTAGGTCAGAGAAATTGA[A/C]GCACCACCTGGGTTCATTATTGGGGCAAAATGCTGAAGCAAAGATACATAAGAATAGCTYGACGCTGATATGGCAGCAAGATACCCATACCTTGATGTTTCTAAGAGAGGTTTGCTAACCTCGGGTCCATTGGCTAATGAATGCACAAGAATGTCGATGCTYCCAAAATCCTCTTTCACAGATTCRGCTACTTCCTTAAC 61 
CCTTTATCYGACACTTGCCCCYTTTTAACCTTGCAATTTACCACAAARACAGCTCGGTATTGTGTCACRTATTTTTCATGCCTTTGGCACTCGAGGCAYAGGGTTGTAAAACACGCGGGACAGCTAAGCACTGCATCGGAATATTGACCCTTCTTTTTCTTTTGAACCCACAAYTGGTCTTTATCGTCAAGCTTTGGGTC[A/G]TAGAACTCGGGTTTRACYGAATAGTCRATGATGTCATCATCTGAAACAACYGTTTCCTCGACAKTTTGATCTTGAYttttttctttttCTTCACTATCAAYGGATTCAKTTAAAACTGCRTCTTCYTTCTGTTGCGACATTGTTGATTCGCAGTTAAAACAGAGGACGATTATCAAACAGGAAGAGAAGCATCCGGAAGY 62 GTTCTGGTCTTYGTATTATACAAGAAGGAAGCATCTCGGGTTGAGAATATGTTACAAAGAAGGGGTTGGAAGGTTGTTTCCATTAGCGGTGACAAACAACAAAAGGCGCGTACCGAGGCACTTCARTTATTCAAGGAYGGTACTTCTCCTCTATTGATCGCTACAGATGTAGCTGCTAGGGGGCTGGATATTCCAGATGT[C/T]GAAGTTGTAATWAACTACAGCTTCCCGTTGACCACAGAGGATTATGTCCACAGAATTGGAAGAACGGGYCGTGCTGGTAAAAAGGGCGTTGCACACACTTTCTTCATGAAGGAGAAYAAGGCRCTTTCTGGTGAGCTGATAAATGTTCTTAGWGAAGCAGGACAGAATGTACCGACAAACCTTTTGAATTTCGGAACTCA 63 CAAGTCAACTTCCACCTGAATTCTTGAAACCGTCTCCKGATAAGAAAYTGGTGATTGGTTTTGACTGTGAAGGCGTTGACCTTTGCCGCAATGGAACTTTATGTATTATGCAGCTAGCTTTTCCVGATGCTATATACYTRGTYGATGCAATTGAAGGTGGAAAAACRCTTGTGGAAGCCTGTAAGCCTGCACTTGAATCT[A/C]GTTACATCACTAAAGTTATTCATGATTGCAAACGCGATAGTGAGGCGTTGTATTTTCAGTTCAACATCAAGTTGCACAATGTSTTTGATACTCAGATTGCTTACAGTTTGATTGAGGAGCAAGAAGGCGGGACAAAARTACCAGACGACTACATATCATTTGTTGGTCTCCTTGCTGACCCTCGTTACTGTGGTATATCT 64 TGTACGGTGGCGATTGCRATGATGGATGCMAAGTGTTTGACGAAAGAAGGGTATGCTGCGAATCAYCCTGCYGGAMGGATTGGCAAGAGTTTGATCTTTAAGGTGAAGGATGTRATGAAGAAGCAAGAAGAACTTCCAGTTTGTAAAGAAGGAGATCTAATAATGGATCAATTAGTCGAGCTTACAAGTAAAGGATGCGG[G/T]TGCCTTCTGGTGATYRATGAYGATTACCACCTCATTGGCACATTCACCGATGGYGATCTCAGGCGAACACTCAAAGCCAGCAAAGAGGGCATCTTCAARCTCACCGTCGGTGAAATGTGCAACAGGAACCCGAGAACTATAACTGCAGAAAGAATGGCGGTTGAMGCAATGCAGAAGATGGAGGCTCCTCCATCMCCTGT F1 
ATTGGGTATGTAATTCCAACAACBGTMGTGTCTCATTTCCTAGATGATTATGAGAGAAACGGGAAGTACACCGGTTTCCCTTCCCTTGGAATAYTATTGCAAAARTTAGAGAAYCCAGCTTTACGTGCCTGCTTAAAAGTGCCATCTAAYGAGGGTGTACTTGTCCGTCGTGTGGAGCCYACTTCTGGTGCCAGTAATGT[C/G]TTGAAGGAGGGGGATGTAATTGTAAGCTTYGATGGTGTAGAAGTCGGATCAGAAGGGACAGTCCCATTTCGMTCAACYGAACGCATTGCATTCCGCTACCTCATTAGTCAAAAATTTACAGGGGATATAGCAGAAGTTGGTATCATCAGATCAGGGGCWTTCATGAAAGTTCACACTGCTATGAATCCACGTGTTCATTT F2 TTATGCCTTTGCACTGAGGTCTTCCTGTGCTTTGGGAACTCAGTAGCATCSACTCGGTTTCTATTGCAAGAYGAGTTTAACATACAAACAACAMAATGCGATAACTGCATYATTGGTTTCATGTTTTGCCTCCAACAACTTGCATGCATATTCTCCATTGTTGCTTGTATTGTYGGAAGYGAAGAACTYAGYGAGGCATC[C/T]CAGTTGCTGAATTGCTTAGCCGATATGGTTTATTGCACRGTTTGTGCTTGCATGCAGACACAACACAAGGTGGAAATGGACAAACGAGATGGGAAGTTTGGACCGCAACCGATGGCAGTACCCCCGGTGCAGCAAATGTCAAGGATTGATCAACCGTATCCTCCAAATGTCGGATATGGVCAACAACCTTAYGGTTAYCC 150  SNP Sequence F3 ATGTAAGCYGATATGCTCGGATRYTTAAGGTGTACGGYATGGTTTTCRTCRAGATCAAAYTCCTTTTCCGGTTCCCAAACAACAATATCWGCATGCTTTCCGATTTCAATAGCTCCCTTAAGATCTTGGCTAGCAAGTTTAGCRGGCTTCTCACTCCACCAYGAAAYTAATTTCTCTAAAGTTATACCATATTKAACCCC[A/G]TATGACCATGTCACAGGAAGAACAAACTGTAACGAAGATATACCACCCCATGCCCTCAAAAAATCACCTTCGGCGAATAGTTTGAGCTCCGGTTCCGATGGTGAATGATCMGAGCTTAWCATGTCAATATCCCCATCCATCAAGGCCTCCCAGAGTTYTTGTCTRTTAGCTGCTTCGCGAATGGGCGGTGCACACTTAAA F4 AGTTTCATTCACGGTRTTCATGCTCGCGATGCCACTTTTCAGAACRAGCGGTGGCATTCTTCTTCAAGAGGCCCCACCCAGTATTCATTCCTCAGCRTTGACCAAATGCTGGAGACAGGTGGCTTCTCTTGAAGATGTTATTGAAGTATCTGAAGCCCGACTCTGGGAGTTTGTACCGGRTCAYGTGGTYGGATCAGTTT[C/T]GCTACAGGTGAAGAAGGGGGTGGATGATCAACCTGTTCTTGGTTTYGTAYGCGATTTATACCATGATTTAGGAGTACAAGATTTCACAGCTCAATTGGACTTGGCSGACTGATTGGAAMCTTGGTAATACAGTTGATGRCGGTTATAGCAAATTTGAACACAGTTTAAAGAGATTATTACTTRTAATTTAATTGTTAAAA F5 
CAGGCCAGGCAACTCATAGGCTTTTAAGGGGAATCCATGACAAGGGGCATACGGTTTCGAGGAATCTGAAGTCAGATGGTCAGGTGGATACAATGCAAGTGTTGCACAACATCAATGAAGATGAACTTGATGGCTTTGAAGAGGTTTGGAATGGAAAAGCTAGGAAACATCTTCCTGGCTGGACTGCTGGATCTAGTACT[C/G]GTGAAGGGTTAGGTGAGCCAAGAAGGGCACTGCCATCAACGGAGGAATTTAGGAACCATGGAAGTTCCTCTGKCTCACATGTGGGTGTGAGAAGGAGATCGGATGGTGCGGAACATGGTCTATCATCAAAGATAAGGAGAGTCTAGGTCAATGGGTTCTTGTAGGGTAAGTTGCAGTCGCGCTTCAAGGTGATATATTCG F6 TTCATAACGAAGCKGAGAAGAACATGGGACTCCGTGCAGGGTCTACYGCTAGCATAAGAAACAGCAGAGCAGCRCAAGCTAATATAAGCCCTAATTTYAACCGACCYGGTACMGGTGGCATGATGCCVGGGATGCCTGGGGCCCGCWTGATGCCNRGGTWCRCCTGGATTAGAAAACGATAACTGGGAGGTTCCGAGGTC[C/T]CGCTCGATGCCTCGRGGTGTTCAACCACCGTTGCTTAACAAGACACCGTCACCAAGYCAGAAGTTTCTTCCTCAAGGTAGTGTYGGTGKYGGTGKYGGTKGCGGTTTCATTAGCGGCAARCCTAGCGCTTTATTGCAAGGCAGTGGTGGACCCGGTGGTGCCGCTTCTGTYCCCATCTCHCCGGTGACGGCTCCGACTAG F7 CTGAATGATTTCATGACGGATGATCTACCGGATAGCAGATCTTCRCGTCCTTTGTTCARCAGGCAGATCATGCAACCGAGAACGGTTACTATCTTATCTTCRAGTAATGGTTTGTCCTCYGAAGGGAAAGCGTAAGAAACACTGCATTCACCKGTAAGYGGGTTGCATAAYGTGTCCATTARTGCATTGTGCAGGCTTTC[A/G]GCAGCTCTGAAGATTCTGCAGTATGCTTCAACTTCGGCTATATCTCCYGGAAGCGGGCCCACCCAAGGTAACTGTGTCAAATTGTGGGATTGGTATGTCTGAGAATCCAAACCWATGTTAASAGAAAAAGGATTAGARACMTCAGCACCMGCASCAGCARCCACASTAASACGACGTCGTTCAGGTASCCGCCACTTGGT F8 GTTTCACTCTGTTCTTTCTTCCTCAAAGCTCTCATYGAAGAAGAAAGATGCTTTGTTTTCTCTTTGGCTTCATCAATGTCRATYTGYTCTGCTCTACACTCTTCACTACAGAATGGYKTGTCCCCTCTGTACATGAAGATATCTCTATTATGSCCCAACATTTTCTTGCAAAGAAAACAAGCRTCCAAAAAATGAGGTTG[C/T]TCATCAAATCTCCCATGATTAAACCTCCCAGATCTCGGAGAASACAGGTTTCCSATAYTCGCTTTTCTCGGCSAGTATAACGGTCGCAAGATCAGGTGGTTTCCGTTTCCGSCACGGTGGTTCTCCTCCGCCGACGACGACRGAGATGACGAAAACCCACTTTCAAAATCCGAAATTGACGCTAACCCATTATTTTCCTC F9 
GCTYTGGTCAGAGGTCATGCTGTGAGAAAGCAAGCTGCGATTACTCTTAGGTGTATGCAAGCCTTAGTTAGGGTTCAGGCACGGGTTCGAGCCAGGCGGGTTCGCATTGCATTGGAAGGTCAGACAGAACAACAAAAACATCARCAGCAACTCCAACACGARGCTCATGTTCGTGAAATAGAGCAAGGATGGTGTGATAG[C/T]GTTGGATCTGTTGAAGAAATTCAAGCGAAATTGATAAAGAGACARGAAGCAGCTGCTAAACGTGAAAGAGCCATGGCWTATGCTCTTGCTCACCAGTGGCAAGCAGGATCCAAACAACAGRTAACYCACTCRGGATTTGAACCAGATAAAAGTAACTGGGGTTGGAAYTGGCTRGAGAGATGGATGGCTGTCCGTCCATG F10 CAGAAGTTYAAATACCGATCAATAATGTTCAACRTTAAGGATCCGAAAAATCCGGATTTCAGAAGAAAAGTGTTGYTAGGGCATGTGAARCYVGAWCGGATTCTGGAGTTGACCCCTGAAGAGATGGCRARCACCGAACGRCAAATGGAGAATGTGAAAATTAAGGAGAAAGCGTTGTTCGACTGTGAACGAGGYGGGCC[A/G]CCAAAAGCTACGACCGATCAGTTTARGTGCGGTCGWTGTGGGAAAAGGAAGTGCACTTATTATCAGCTGCAGACTAGGAGTGCTGATGAACCTATGACGACGTTTGTAACGTGTGTGAACTGYGACAATCATTGGAARTTCTGTTRAGGAACTYTTGTCABYAGTRGATTYGAGTTKAATGGATCTGATTAGATCTCTTC F11 AATCTTACAAAATCTGCAGGAAGACTCCTTGTTGCTGATTCCGACTGTAGGGCTTCGAAGCTTGATAGACGTCTTAGATTAGCTCTATCTTTTCGCTGTGCGGAAACACCTTGAATGGCGGTAAAAATMTCTGGTGTAAATACATAAACACCACAATTTATTCGGTCACTAACGAAAGTTTCAGGTTTCTCRGTATAATG[C/T]AGCAATTCGTTGGTATCGGGATCAGCTACCAGCTCACCAAATTGGTCTGCTGATTCAGGGGAAACCTTAATTACAAGGATCGTTCCCATCCCACCGTATCTTTTATGAGCATCRAGCATYKCAGGYAATGGGAAACTGCAGCAAACATCACAGTTTAGCAAGAAGATGTGTGACGGATTATCTTCCATGATYAGATCTCT F12 CCTCTCAATACCATCGTAAATACTTCATGAATCAGTCCCTTATAAGTTGGYTCTAACTTGTCCTTGTATTTRGTAGCGAAAAGATCTTCATTCATYGTCAASGTGCTTTCCACCACATAATCTGTTTCAAACTGCAACACTATATGYGGATACATWGTTTGACCYTTACGGATAGGTGGATCRAGGGTAACAACAACAAA[C/T]GTATGTGGTTGATTTGACTTGGGTAGTACAAACACRCGAACKACACTGCTGAACTGGATTTTRAAGTCGTTGGCTTGCCCTTGCAGCCTCARAAAGGATAGATGAAGTTCAACRTTGTAACGACCCCTTGGAGTGAGAATRGCAATGCCYTCAAAYGTGACAACRGCCTCTTCACCTCCAGCRCCRACATCKGCCATTGA 151  SNP Sequence F13 
GGTGCGGTTATTGCGCTGTTTCATCTTCTGATCACTMGATCRGATAAAGTCAGAGCACTTCGGGAAGCTTTCTATCGGCARAATCTACCWAATGTGACTAATTTACTTGCTACTGTTYTGATCTTCCTCATTGTCATCTACTTCCAAGGCTTCCGTGTCGTTTTGCCTGTGAGGTCCAAGAATGCCCGTGGGCAACAAGG[A/C]TCKTACCCYATTAAGTTGTTYTATACTTCCAACATGCCTATTATYCTTCAGTCTGCMCTTGTRTCYAACCTTTAYTTCATCTCTCAGTTGCTKCACAGAAAGTACAGYGGGAATTTCTTGGTYAACTTGTTGGGAAAGTGGAAGGAGTCTGAATACTCGGGCCARTCKGTTCCAGTTGGTGGKCTTGCTTACTATGTTAC  
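
The record format described at the start of this appendix (a bracketed target site with two alternate alleles, flanking sequence on either side, lower-case repeats, IUPAC codes for neighboring SNPs, and 'N' for indels) can be read programmatically. The following is a minimal sketch in Python, not part of the original genotyping workflow. The function name `parse_snp_record`, the pattern `RECORD_PATTERN`, and the dictionary keys are illustrative assumptions, and the example record is a short made-up string rather than one of the sequences listed above.

```python
import re

# Standard IUPAC nucleotide codes; in this appendix 'N' marks neighboring indels.
IUPAC_CODES = set("ACGTRYSWKMBDHVN")

# Hypothetical pattern for one record: left flank, [allele/allele], right flank.
RECORD_PATTERN = re.compile(r"^([A-Za-z]+)\[([ACGT])/([ACGT])\]([A-Za-z]+)$")


def parse_snp_record(text):
    """Split one appendix-style record into its flanks and target alleles."""
    cleaned = "".join(text.split())  # remove whitespace left by line wrapping
    match = RECORD_PATTERN.match(cleaned)
    if match is None:
        raise ValueError("record is not of the form FLANK[X/Y]FLANK")
    left, first, second, right = match.groups()
    flanks = left + right
    unknown = set(flanks.upper()) - IUPAC_CODES
    if unknown:
        raise ValueError("non-IUPAC characters: " + "".join(sorted(unknown)))
    return {
        "left_flank": left,
        "alleles": (first, second),
        "right_flank": right,
        # Neighboring polymorphisms: any flank base that is not plain A/C/G/T.
        "ambiguous_sites": sum(1 for b in flanks.upper() if b not in "ACGT"),
        # Repetitive regions are written in lower case.
        "repeat_bases": sum(1 for b in flanks if b.islower()),
    }


# Toy record for illustration only (not taken from the table above).
record = parse_snp_record("ACGTRaacN[A/G]TT Yg")
print(record["alleles"])          # ('A', 'G')
print(record["ambiguous_sites"])  # 3
print(record["repeat_bases"])     # 4
```

Counting ambiguous and lower-case flank bases in this way gives a quick check of how much neighboring variation or repetitive sequence surrounds each target site, which may be useful when screening records before assay design.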

