UBC Theses and Dissertations

Applications of molecular markers to forest genetics [microform] : genetic diversity, genetic linkage… Glaubitz, Jeffrey Curtis 1995

APPLICATIONS OF MOLECULAR MARKERS TO FOREST GENETICS: GENETIC DIVERSITY, GENETIC LINKAGE MAPPING, AND GENE EXPRESSION

by

JEFFREY CURTIS GLAUBITZ

B.Sc.(Forestry), The University of British Columbia, 1987

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES, Department of Forest Sciences and The Biotechnology Laboratory

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA

November 1995

©Jeffrey Curtis Glaubitz, 1995

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Forest Sciences
The University of British Columbia
Vancouver, Canada

Date December 13, 1995

ABSTRACT

This thesis comprises four studies utilizing molecular genetic markers with coniferous trees. PCR and direct sequencing, RAPD markers, and RFLP markers were used in studies of gene expression (mRNA editing), genetic linkage mapping, and genetic diversity. In the first study, direct sequencing of PCR products was used to demonstrate the existence of RNA editing in the mitochondria of western red cedar (Thuja plicata Donn ex D. Don). This was the first demonstration of this phenomenon in a gymnosperm. The next two studies involved the use of RAPD markers for genetic linkage mapping in conifers. In the first of these, RAPD markers were shown to segregate in a Mendelian fashion among F1 progeny of Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco), and hence to be useful as genetic markers in conifers.
A testcross mapping strategy was proposed to allow the efficient construction of genetic linkage maps in diploid organisms such as trees using these dominant markers. The feasibility of this strategy was demonstrated in Douglas-fir. This was the first study using RAPD markers in a conifer. An alternative mapping strategy, taking advantage of the availability of haploid megagametophyte tissue in conifers, was explored in the next study. A partial genetic linkage map was constructed for a single white spruce (Picea glauca [Moench] Voss) tree by following the segregation of RAPD markers among a set of megagametophytes obtained from seeds of that tree. It was shown that the disadvantage of the dominance of RAPD markers could be overcome by using this haploid tissue as a DNA source. The approach to genetic linkage mapping first taken in this study is now commonly used by conifer geneticists. The final study was on DNA-level genetic diversity in western red cedar. Previous results with isozyme markers and terpenes had suggested that genetic diversity is relatively low in this species. Single or low copy number nuclear RFLP markers were developed to further explore this question. The resulting estimate of species-wide genetic diversity (expected heterozygosity) closely agreed with those obtained from isozyme studies. This was the first large scale population genetics study in a conifer using single or low copy number nuclear RFLP markers.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgement
Foreword
Dedication
Chapter 1: GENERAL INTRODUCTION
  Development of this thesis
  Relationship between genetic linkage mapping and population genetics
  Use of DNA markers for conservation genetics of forest trees
Chapter 2: RNA EDITING IN THE MITOCHONDRIA OF A CONIFER
  Introduction
  Materials and Methods
  Results
  Discussion
Chapter 3: SEGREGATION OF RANDOM AMPLIFIED POLYMORPHIC DNA MARKERS IN F1 PROGENY OF DOUGLAS-FIR
  Introduction
  Materials and Methods
  Results and Discussion
Chapter 4: SINGLE-TREE GENETIC LINKAGE MAPPING IN WHITE SPRUCE USING DNA FROM HAPLOID MEGAGAMETOPHYTES
  Introduction
  Materials and Methods
  Results and Discussion
Chapter 5: NUCLEAR RFLP ANALYSIS OF GENETIC DIVERSITY IN WESTERN RED CEDAR
  Introduction
  Materials and Methods
  Results
  Discussion
Chapter 6: GENERAL CONCLUSION
  Future methodology
LITERATURE CITED

LIST OF TABLES

Table 2.1. RNA editing of coxI in western red cedar restores evolutionary conservation in the protein encoded
Table 3.1. Segregation analysis of RAPD bands polymorphic between Douglas-fir parents
Table 4.1. Random sample of primers showing G+C content, total number of amplification products and number of segregating bands per primer
Table 5.1. Results of range-wide RFLP analysis of western red cedar with 30 single or low copy number nuclear probes
Table 5.2. Allele frequencies at the 9 polymorphic loci
Table 5.3. Genetic variability measures from nuclear RFLP analysis of western red cedar
Table 5.4. Deviation from Hardy-Weinberg genotype proportions within regions
Table 5.5. Estimates of F_IR, F_IT and F_RT at all polymorphic loci
Table 5.6. Hierarchical F statistics over all loci
Table 5.7. Matrix of Nei's genetic similarity and distance coefficients
Table 5.8. Matrix of Reynolds, Weir and Cockerham's genetic distances
Table 5.9. Allele frequency resolution for various degrees of deviation from panmixia

LIST OF FIGURES

Figure 2.1. Comparison of the partial gene sequence and the corresponding partial cDNA sequence from a 712 base pair portion of the gene coxI from western red cedar
Figure 2.2. Corresponding portions of DNA and cDNA sequencing autoradiographs from coxI in western red cedar
Figure 3.1. DNA polymorphisms between 6 parental Douglas-fir genotypes from 4 RAPD primers
Figure 3.2. Segregation of a polymorphic DNA band (RAPD marker) produced by primer Ap9
Figure 3.3. Segregation of a polymorphic DNA band (RAPD marker) produced with primer Ap3
Figure 4.1. Ethidium bromide stained 1.4% agarose gel of RAPD products from 47 megagametophyte DNAs and primer 230
Figure 4.2. Ethidium bromide stained 1.4% agarose gel of RAPD products from 47 megagametophyte DNAs and primer 282
Figure 4.3. Ethidium bromide stained 1.4% agarose gel of RAPD products from 47 megagametophyte DNAs and primer 285
Figure 4.4. Partial RAPD marker linkage map of white spruce based on a minimum LOD score of 3.0 in the two-point analysis
Figure 4.5. Partial RAPD marker linkage map of white spruce based on a minimum LOD score of 2.4 in the two-point analysis
Figure 5.1. Range map of western red cedar showing locations of sampled trees
Figure 5.2. Contaminating polysaccharides affecting migration of western red cedar foliar DNA samples are removed by cesium chloride gradient purification
Figure 5.3. Autoradiograms from copy number analysis of homologous western red cedar RFLP probes
Figure 5.4. Autoradiograms from range-wide western red cedar RFLP analysis with HindIII-digested DNA samples vs. two single copy, monomorphic probes
Figure 5.5. Autoradiograms from range-wide western red cedar RFLP analysis with HindIII-digested DNA samples vs. two polymorphic probes
Figure 5.6. Phylogenetic tree showing relationships among geographical regions within the range of western red cedar

ACKNOWLEDGEMENT

Far more people than I can mention here have assisted me throughout the course of my Ph.D. research. I apologize to those deserving recognition whom I may have inadvertently left out of the following acknowledgement.

First I would like to thank my research supervisor, Dr. John E. Carlson, for his guidance, encouragement, and constant support. I would also like to thank all members (past and present) of my supervisory committee for their invaluable guidance. In particular, I thank Dr. Yousry El-Kassaby for his crucial role and for initially suggesting the central topic of this thesis, the study of DNA-level genetic diversity in western red cedar. Special thanks must go to Dr. John McLean, Associate Dean of Graduate Studies and Research for the Faculty of Forestry, without whose support the completion of this thesis would not have been possible. I thank the external examiner, Dr. Steven H. Strauss, for his suggested revisions, which have led to significant improvements of the thesis.

Also, I wish to thank all of my colleagues (past and present) in the Carlson lab for their technical advice and encouragement. I am indebted to Bodo von Schilling, Darryl Secret, and Dr. John Russell for collecting foliar samples of western red cedar on my behalf. I also extend my gratitude to Tim Crowder of Fletcher Challenge Ltd. and Steve Joyce of Western Forest Products Ltd. for providing access to seed orchard material (in the Mt. Newton and Lost Lake seed orchards respectively).

My deepest gratitude belongs to my wife, Hiroko, for her love, support, encouragement and patience, and for never losing faith in me, even at times when I had lost faith in myself. I also thank my children, Axel and Erika, for their patience and unconditional love.

FOREWORD

Chapters 2, 3 and 4 of this thesis are based on previously published papers.
This foreword is provided to outline the contributions of the various co-authors to each of these papers and, in particular, to delineate the role of the thesis author.

Chapter 2 is based on the following article: "Glaubitz JC and Carlson JE (1992) RNA editing in the mitochondria of a conifer. Current Genetics 22:163-165." All the laboratory research reported therein was performed by the thesis author. The manuscript was also written by the thesis author. Dr. J.E. Carlson acted in a supervisory role, providing guidance and editing the initial drafts.

Chapter 3 is a modified version of the following article: "Carlson JE, Tulsieram LK, Glaubitz JC, Luk VWK, Kauffeldt C and Rutledge R (1991) Segregation of random amplified DNA markers in F1 progeny of conifers. Theoretical and Applied Genetics 83:194-200." The contribution to this paper of our collaborators, C. Kauffeldt and R. Rutledge, namely, the data reported from the tree species white spruce (Picea glauca [Moench] Voss), has been omitted from the thesis chapter. Hence the thesis chapter is based only on the data produced in the lab of J.E. Carlson (by L.K. Tulsieram, the thesis author, and VWK Luk) from Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco). VWK Luk assisted with DNA isolation. All of the remaining lab work with Douglas-fir was shared equally between L.K. Tulsieram and the thesis author, as was the data analysis. The research supervisor, Dr. J.E. Carlson, wrote the initial draft of the manuscript, which was then edited by L.K. Tulsieram and the thesis author.

Chapter 4 is based on the following publication: "Tulsieram LK, Glaubitz JC, Kiss G and Carlson JE (1992) Single-tree genetic linkage mapping in conifers using haploid DNA from megagametophytes. Bio/technology 10:686-690." G. Kiss provided the open-pollinated, maternal half-sib seed (from a terminal weevil [Pissodes strobi]-tolerant white

DEDICATION

To Hiroko, Axel, and Erika.
Chapter 1

GENERAL INTRODUCTION

Forest molecular genetics is a young, rapidly growing field of scientific inquiry. Application of the tools of molecular genetics has much to offer the field of forestry, and these tools can be utilized for the exploration of both basic and applied research goals. Because the field of forest molecular genetics was very new at the time the research described in this thesis was initiated, the development of molecular tools, or methodology, initially took precedence over specific applications. The research presented herein has led to several seminal contributions in the field of forest molecular genetics, both in terms of methodology and applications. Several molecular marker systems were adapted for use with coniferous tree species. These include RAPD markers, direct sequencing of PCR products, and single or low copy nuclear RFLP markers. This general introduction is provided for an overview of the flow of research undertaken in this thesis and to show the inter-relatedness of the various experiments. Marker systems were developed and are described within the context of a variety of applications: genetic linkage mapping, gene expression (i.e., RNA editing), and genetic diversity. In addition to developing the required methodology, important observations were obtained in each area of application. Three different conifer species, Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco), white spruce (Picea glauca [Moench] Voss), and western red cedar (Thuja plicata Donn ex D. Don), were involved. However, the primary goal of the work was the use of DNA markers for the study of genetic diversity in western red cedar.

Development of this thesis

The original goal of this thesis was to study genetic diversity at the DNA-level in western red cedar.
This species was of particular interest because it was thought to contain relatively low levels of genetic diversity, presumably as a result of a severe population bottleneck during glaciation (see Introduction to Chapter 5 for background). A thorough study of this question would involve DNA marker data collected from all three genomes (nuclear, mitochondrial, and chloroplast). The initial approach to this question was to attempt to study RFLP variation in mitochondrial DNA, as organelle genomes can be reliable indicators of population bottlenecks (Wilson et al., 1985). To this end a reliable protocol for mitochondrial DNA purification from western red cedar was sought. However, this proved to be extremely difficult to accomplish, and was abandoned after 8 months of concentrated effort. Around that time, PCR (Saiki et al., 1988) had emerged as a popular and powerful technique. Taking advantage of this approach, PCR primers were developed from conserved sequences in angiosperm plants to amplify the mitochondrial coxI gene from western red cedar and other conifers (see Chapter 2). The amplified gene from western red cedar was then sequenced successfully using the challenging approach of direct sequencing of PCR products. From this sequence data an unexpected discovery was made. Recent reports had shown the occurrence of post-transcriptional messenger RNA editing in the mitochondria of angiosperm plants (Gualberto et al., 1989; Covello and Gray, 1989; Hiesel et al., 1989). From the DNA sequence of western red cedar coxI obtained, it could be inferred that RNA editing was also occurring in this gymnosperm. The corresponding mRNA was obtained by reverse transcriptase PCR (RT-PCR) and sequenced, confirming that RNA editing was occurring (see Chapter 2). This was the first evidence of RNA editing in the mitochondria of a gymnosperm, and thus the first evidence suggesting that plant mitochondrial C-to-U editing arose prior to the divergence of angiosperms and gymnosperms.
Direct sequencing of the coxI mitochondrial derived PCR product from several western red cedar trees was then attempted for a direct measure of nucleotide diversity. However, the coxI gene is very highly conserved in sequence even among genera, and hence, it was not surprising that identical sequences were obtained from 10 western red cedar trees of widespread origin (data not shown). In contrast, intergenic regions were expected to be more highly variable. The next strategy was to obtain intergenic regions flanking the coxI gene using a technique called Inverse PCR (Ochman et al., 1988), for the purpose of surveying sequence variation within western red cedar. However, several attempts at Inverse PCR failed. Meanwhile, the "universal" PCR primers of Taberlet et al. (1991) for the amplification of intergenic regions from the chloroplast genome were synthesized and found to work well in western red cedar. At this point, however, work with organelle markers was put on hold so that effort could be focused on nuclear DNA markers, which were expected to contain higher levels of within species variation. While the work on organelle markers described above proceeded, experiments were initiated by the author with nuclear markers. A brand new genetic marker system, known as "Random Amplified Polymorphic DNA," or RAPD, a clever derivative of PCR, had just been devised by Williams et al. (1990). This new method, demonstrated with soybean, corn, Neurospora and human DNA, generated much excitement and became famous even before its publication in 1990. We became interested in RAPDs early on, and became the first lab to publish on the use of this technique with conifers (Carlson et al., 1991; Chapter 3, this thesis). Of particular interest to this thesis was the eventual use of RAPD markers to study genetic diversity in western red cedar. However, RAPD markers also held great promise for the field of forest molecular genetics.
The RAPD marker system was first tested in a Douglas-fir genetic linkage mapping project to see if it would reveal many polymorphisms segregating in a Mendelian fashion. If so, the system would also be useful in the study of genetic diversity in western red cedar. The tests with Douglas-fir were successful (Carlson et al., 1991; Chapter 3, this thesis). A drawback with RAPD markers, for both mapping and population genetic applications, is their dominance. Therefore it was of considerable interest to see if the RAPD assay could be made to work with very small amounts of DNA isolated from the haploid megagametophytes of conifers. With such haploid tissue, dominance is no longer a hindrance. A genetic linkage mapping study in white spruce was initiated for evaluation of the RAPD marker system with DNA from haploid megagametophytes. A partial, RAPD-marker based genetic linkage map of a white spruce tree was produced in demonstration of this approach (Tulsieram et al., 1992b; Chapter 4, this thesis). At that point, the possibility of using DNA from megagametophytes of western red cedar in the RAPD assay was assessed for the study of genetic diversity in this species. However, the very small size of the western red cedar megagametophyte proved to be a major obstacle. In spite of much effort to increase DNA yields, it was not possible to consistently purify enough DNA from western red cedar megagametophytes to make a population genetic study feasible. The alternative lay in the use of diploid tissue as a DNA source. With dominant RAPD markers, population allele frequencies, and hence traditional genetic diversity parameters, could be estimated only if the populations were in Hardy-Weinberg equilibrium. However, as western red cedar had recently been found to have the highest selfing rate (68%) of any conifer studied (El-Kassaby et al., 1994), random mating (i.e., Hardy-Weinberg equilibrium) within populations could not be assumed for this species.
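The sensitivity of dominant-marker allele frequency estimates to the Hardy-Weinberg assumption can be illustrated numerically. The following is a minimal sketch and is not taken from the thesis. It assumes a two-allele RAPD locus with a recessive band-absent (null) allele and, as a further assumption, a population at inbreeding equilibrium under mixed mating, so that the fixation index is F = s/(2 - s) for selfing rate s. The function names and the band-absent frequency of 0.25 are hypothetical.

```python
import math


def fixation_index_from_selfing(selfing_rate):
    """Equilibrium fixation index under a mixed-mating model (an assumption):
    F = s / (2 - s)."""
    return selfing_rate / (2.0 - selfing_rate)


def null_allele_frequency(band_absent_freq, fixation_index=0.0):
    """Estimate the frequency q of the recessive band-absent allele.

    The expected frequency of band-absent (null homozygous) trees is
        x = q^2 + F*q*(1 - q) = (1 - F)*q^2 + F*q,
    which is solved here for q. With F = 0 this reduces to the
    Hardy-Weinberg estimate q = sqrt(x).
    """
    x, f = band_absent_freq, fixation_index
    if not 0.0 <= x <= 1.0 or not 0.0 <= f < 1.0:
        raise ValueError("need 0 <= x <= 1 and 0 <= F < 1")
    a = 1.0 - f
    return (-f + math.sqrt(f * f + 4.0 * a * x)) / (2.0 * a)


# Hypothetical locus at which 25% of sampled trees lack the band.
F = fixation_index_from_selfing(0.68)  # selfing rate cited in the text
q_hwe = null_allele_frequency(0.25)         # assumes random mating
q_inbred = null_allele_frequency(0.25, F)   # allows for inbreeding
print(f"F = {F:.3f}, q (HWE) = {q_hwe:.3f}, q (inbred) = {q_inbred:.3f}")
```

Under these assumptions the random-mating estimate overstates the null allele frequency, which is the kind of bias that makes dominant markers unattractive for a highly selfing species.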
For these reasons, the codominant marker system of single copy nuclear RFLPs was chosen for the study of nuclear DNA-level genetic diversity of western red cedar. As is typical for molecular genetics studies, especially in non-"model" organisms, the RFLP approach was not without hazards. Methodological obstacles were encountered and eventually overcome in this RFLP study, as detailed in Chapter 5. The major difficulty concerned the isolation of the large quantities of highly pure DNA needed for single copy nuclear RFLP analysis. Also, hybridization signals were initially very weak when many samples were probed. As a result of the delays caused by these methodological difficulties, only 90 of the 250 range-wide DNA samples isolated could be analyzed (with 30 RFLP probes) in the time remaining. However, this provided enough data to answer the fundamental question: as detailed in Chapter 5, western red cedar is indeed low in nuclear DNA-level genetic variation as measured by single copy nuclear RFLPs.

Relationship between genetic linkage mapping and population genetics

Although genetic linkage mapping and population genetics were treated separately in this thesis, these two areas are definitely related. It is often of interest to utilize mapped markers in population genetic studies. The use of mapped markers assures that they are truly Mendelian. Also, the availability of a marker-based genetic map allows the option of choosing evenly spaced, loosely linked markers providing wide coverage of the genome for population genetic studies. This minimizes the chance of linkage disequilibrium between marker loci, and thus the resulting violation of the assumption of statistical independence among loci that is commonly made in data analysis (Nei, 1987). On the other hand, linkage disequilibrium itself can be a topic of interest for population genetic studies (e.g., Epperson and Allard, 1987; Hastings, 1990).
The availability of a large number of mapped markers would allow a thorough exploration of the relationship between map distance and levels of linkage disequilibrium in natural populations. A thorough analysis may allow distinctions to be made regarding the possible cause of any particular disequilibrium, as epistatic selection, inbreeding, drift (bottlenecks), mutation, and migration (admixture) can all play a role (Hastings, 1990; Briscoe et al., 1994).

Use of DNA markers for conservation genetics of forest trees

One of the primary goals of conservation genetics is the preservation of genetic variation within species (Ledig, 1986; Loeschcke et al., 1994). In order to preserve genetic diversity, it would be helpful to be able to measure it. DNA markers provide one means by which genetic diversity can be measured. Additional means are provided by isozymes, variation in terpenoids and other secondary compounds, and common garden studies of quantitative traits. As is generally assumed to be the case for isozyme variation, the majority of the variation in random DNA markers is expected to be selectively neutral (Kimura, 1983). Hence, measurements of genetic diversity based on neutral DNA markers will not always reflect existing levels of variation at currently adaptive loci, due to potential effects of selection on the latter. However, neutral variation is expected to accurately reflect the influence of population genetic forces other than selection. These forces include genetic drift, migration, the mating system, and mutation. The first three of these forces act on a genome wide basis, and thus, inferences can be drawn regarding their effect on the whole genome from a sufficient sample of neutral loci. The effects of genetic drift and migration can sometimes override that of selection on adaptive loci (Hartl and Clark, 1989).
Furthermore, selection at an adaptive locus will also affect variation at linked neutral loci, through the "hitchhiking effect" (Maynard Smith and Haigh, 1974). Also, because the footprints of evolution (i.e., shared ancestry) are left behind in current patterns of neutral variation, such variation can provide an excellent record of evolutionary legacy (i.e., phylogenetic history; Strauss et al., 1992a). Evolutionary legacy itself may sometimes have a lasting imprint on growth and adaptive traits, as appears to be the case in Norway spruce, as a result of the colonization of different parts of the species range from different glacial refugia (Lagercrantz and Ryman, 1990). Hence, patterns of neutral and adaptive genetic variation can often be correlated. However, even when neutral DNA marker variation does not provide a reliable measure of currently adaptive variation, a sufficient sample of loci should provide a good measure of neutral variation in the genome as a whole. The inclusion of DNA markers will allow more of the genome to be sampled than with isozymes alone. Because such neutral variation provides the raw material for future evolution, its preservation is also an important conservation genetic goal (Vida, 1994). Currently neutral variation may prove to have adaptive significance in the future. Neutral variation becomes even more relevant in the context of artificial selection, such as that practiced in tree improvement programs. In this case it is desirable to minimize the loss of variation at those adaptive loci that are neutral to whatever form of artificial selection is being practiced. When small breeding populations and/or high selection intensities are involved, genetic drift, the effects of which are readily quantifiable using neutral markers, can become an important factor in diversity loss at such loci.
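The expected effect of drift on neutral diversity in a small breeding population can be made concrete with the standard result that, in an idealized population of effective size Ne, expected heterozygosity declines by a factor of 1 - 1/(2Ne) per generation. The sketch below is illustrative only. The function name, the starting heterozygosity, and the population sizes are assumptions chosen for the example, not values from this thesis.

```python
def expected_heterozygosity(h0, effective_size, generations):
    """Expected heterozygosity after a number of generations of pure drift
    in an idealized population: H_t = H_0 * (1 - 1/(2*Ne))**t."""
    if effective_size <= 0 or generations < 0:
        raise ValueError("effective_size must be > 0 and generations >= 0")
    return h0 * (1.0 - 1.0 / (2.0 * effective_size)) ** generations


# Fraction of the initial heterozygosity retained after 10 generations
# for three hypothetical breeding population sizes.
for ne in (10, 50, 500):
    retained = expected_heterozygosity(1.0, ne, 10)
    print(f"Ne = {ne:4d}: {retained:.3f} of initial heterozygosity retained")
```

Comparing marker-based heterozygosity across generations against such an expectation is one way a monitoring program might flag faster-than-expected diversity loss.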
Tree breeding strategies have been specifically designed to minimize the loss of genetic diversity, for example, the multiple-population breeding strategy (Namkoong, 1984). Neutral isozyme and DNA markers can be used to monitor the effectiveness of such strategies (for example, Williams et al., 1995). Not only tree breeding, but the entire domestication process can potentially affect genetic diversity in tree species (Savolainen and Karkkainen, 1992; El-Kassaby, 1992; El-Kassaby, 1995). For example, fertility selection in seed orchards may lead to unequal contributions of clones to the resulting seed crops, and a concomitant reduction in genetic diversity (for example, Schoen and Stewart, 1986). Furthermore, unexpected selective effects may be occurring in seedling nurseries which may also have genetic impacts (El-Kassaby, 1992). Such issues have heretofore been addressed with isozyme markers; however, DNA markers should also be very useful in this regard, especially in those species, like western red cedar, where isozyme variability is insufficient. Effects of the domestication process as a whole on genetic diversity can be monitored by comparing marker diversity levels (isozymes and DNA) in the final products of the domestication process (artificially regenerated stands) to those occurring in undisturbed natural stands. Currently, such an undertaking with a large number of DNA markers would be a time consuming and expensive proposition. However, as the field of molecular genetics is moving very rapidly, technological improvements and advances (e.g., automation, multiplexing) may soon make such studies practical.

Chapter 2

RNA EDITING IN THE MITOCHONDRIA OF A CONIFER

Introduction

RNA editing is the post-transcriptional alteration of the primary sequence of an RNA molecule from that of its corresponding gene. Editing was first discovered in the mitochondria of trypanosome protozoa (Benne et al., 1986).
Shortly after its initial discovery this intriguing amendment to the central dogma of molecular biology was observed in a variety of genetic systems (reviewed in Gray and Covello, 1993). In plants, RNA editing was first found in the mitochondria of angiosperms by three different groups simultaneously (Gualberto et al., 1989; Covello and Gray, 1989; Hiesel et al., 1989). Many subsequent studies showed that RNA editing occurs in the mitochondria of all angiosperms examined (reviewed in Gray et al., 1992). From such studies the characteristic pattern of plant mitochondrial RNA editing became apparent: the vast majority of edits are from C-to-U (rarely U-to-C) and are non-silent, resulting in increased evolutionary conservation in the proteins encoded. It appears that, when present, editing is essential to plant mitochondrial protein function. Such knowledge of the character of plant mitochondrial RNA editing allowed the inference to be made, based on the DNA sequence of the entire mitochondrial genome of a liverwort (Marchantia polymorpha), that editing does not occur in this lower plant (Oda et al., 1992). Similarly, editing was not expected to occur in the green alga Chlamydomonas reinhardtii (Gray et al., 1992). Clearly, further exploration of the taxonomic distribution of the phenomenon was needed so that the timing of its evolutionary origin could be accurately inferred (presuming a single origin of this complex phenomenon). The fact that C-to-U mitochondrial RNA editing had been found in all angiosperms examined (6 dicots and 3 monocots), but not in a liverwort or a green alga, led some researchers to claim that it is a "general feature of higher plant mitochondria." However, until the work described in this chapter was completed, the phenomenon had not yet been observed in a gymnosperm. Prior to this work, sequence data from mitochondrial mRNA or DNA in gymnosperms had not yet been available.
This chapter contains the first evidence for the occurrence of RNA editing in the mitochondria of a gymnosperm. This evidence was obtained by sequence analysis of the mitochondrial gene cytochrome oxidase subunit I (coxI) and its corresponding transcript in the conifer western red cedar (Thuja plicata Donn ex D. Don).

Materials and methods

PCR amplification of the coxI gene

Western red cedar total DNA was isolated from foliage following the protocol of Wagner et al. (1987). PCR amplification of an internal portion of the coxI gene was performed under standard conditions (Perkin-Elmer/Cetus), with an annealing temperature of 55°C and a MgCl2 concentration of 1.5 mM. The PCR primer sequences, specific to coxI (see below for primer design), were TTA TTA TCA CTT CCG GTA CT and AGC ATC TGG ATA ATC TGG.

Amplification of the coxI transcript by RT-PCR

The same tree that was used as the source of DNA was also used for RNA. Total RNA from foliage was isolated following the protocol of Hughes and Galau (1988) with the following modifications by M. Carol Alosi (personal communication): lithium dodecylsulphate was omitted from the homogenization buffer and, instead of potassium acetate precipitation, cesium chloride gradient centrifugation was employed for final RNA purification. To remove any trace of contaminating DNA from the RNA preparation, further steps, consisting of an additional round of cesium chloride gradient purification followed by RNase-free DNase I (Boehringer Mannheim) treatment, were added for this study. RT-PCR was carried out according to Kawasaki (1990). Random primers (BRL) were employed for the reverse transcription step. For the PCR amplification step, the same primers used above for amplification of the gene were employed.

Sequencing of PCR products

Direct sequencing of double-stranded PCR products was performed according to Casanova et al. (1990).
To obtain sequence from both strands, internal primers (TCT ACC GCA GCT ACC ATG AT and AGA AGG ATG GAC CCT ACA GC) were employed for sequencing in addition to the amplification primers. The genomic PCR product was also blunt-end cloned into E. coli (Scharf, 1990) and several clones were sequenced.

Results

Amplification primers are universal

The PCR primers for coxI amplification were designed based on regions sharing both amino acid sequence conservation among maize and 4 non-plant species (see Figure 3 in Isaac et al., 1985) and nucleotide sequence conservation among the angiosperms maize (Isaac et al., 1985), sorghum (Bailey-Serres et al., 1986), wheat (Bonen et al., 1987), rice (Kadowaki et al., 1989), soybean (Grabau, 1986), Oenothera (Hiesel et al., 1987), and pea (Kemmerer et al., 1989). The size (750 base pairs including primers) and sequence of the product confirmed that the correct region was amplified from western red cedar DNA. The amplified region spans the middle of the gene, from nucleotide position 615 to 1326 relative to the maize sequence (Isaac et al., 1985). The primers work with all other conifers tested so far, including larch, pine, and Douglas-fir, as well as with maize (as expected), and thus can be considered "universal" for higher plants.

Editing sites and their effect on the protein sequence

Comparison of the coxI partial gene and partial cDNA sequences from western red cedar obtained here revealed 26 C-to-U editing sites over the 712 bases examined (Figures 2.1 and 2.2). Twenty-five of the 26 edits lead to a change in the amino acid encoded (i.e., are non-silent). All of these amino acid changes (23 in total) are to the consensus residues predicted from the known gene sequences of 7 angiosperm species (Table 2.1). This is in agreement with the RNA-editing data from angiosperms where the majority of edits are non-silent and restore consensus (Gray et al., 1992).
The existence of the sole silent C-to-U editing site (position 918 in Figure 2.1) may be related to the proximity of two closely flanking non-silent editing sites (two bases upstream and downstream respectively).

Discussion

Prediction of editing sites from the DNA sequence

Covello and Gray (1990) suggested that RNA editing sites can often be predicted from plant mitochondrial gene sequences alone. Prediction of protein sequences allows the identification of radical amino acid substitutions at otherwise highly conserved positions. Those substitutions that can be avoided by C-to-U edits in the mRNA are indeed likely to be edited. Predictions made in this manner proved to be highly accurate for coxI in western red cedar. From the partial gene sequence alone, 20 C-to-U edits were predicted (marked "*" in Figure 2.1) on the basis that they prevent 18 amino acid substitutions at residues that are invariant among four non-plant species and maize (Table 2.1). Such predictions from the DNA sequence alone provided the incentive to obtain the sequence of the corresponding transcript for this study. The cDNA sequence subsequently obtained revealed that all of these predicted edits do indeed occur. More of the 26 editing sites could have been predicted if additional positions where amino acids are conserved only among the angiosperm plant species (based on gene-predicted protein sequences; see Table 2.1) were also considered (marked "+" in Figure 2.1). In this manner, 26 C-to-U edits in total would have been predicted from the gene sequence (marked "*" or "+" in Figure 2.1), 25 of which are shown in fact to occur by the cDNA sequence (Figure 2.1 and Table 2.1).
The ability to successfully predict 5 additional editing sites based on amino acid conservation only among the angiosperms suggests that there are certain amino acid residues that are crucial to the function of the coxI protein specifically in plants.

615      A   G   A   I   T   M  S/L  L   T   D   R   S   F   N   T   T  L/F  F   D
dna   A GCA GGG GCA ATT ACC ATG TCA TTA ACC GAT CGA AGC TTT AAT ACA ACC CTT TTC GAT
rna                              T                                      T
                                 *                                      *

673    P   A   G   G   G   D   P   I   L   Y   Q   H  P/L P/F  W   F   F   G   H
dna   CCT GCT GGA GGG GGA GAC CCG ATC TTA TAC CAG CAT CCC CCT TGG TTC TTC GGC CAT
rna                                                    T  TT
                                                       *  **

730    P   E   V   Y   I   L   I  P/L  P   G   F   G   I   I   S   H   I   V   S
dna   CCA GAG GTG TAT ATT TTG ATT CCG CCT GGA TTC GGT ATC ATC AGT CAT ATC GTA TCG
rna                                T
                                   +

787    T   F  P/S  G   K   P   V   F   G   Y   L   G   M   V   Y   A   M   I   S
dna   ACT TTT CCG GGA AAA CCG GTC TTC GGG TAT CTG GGC ATG GTT TAT GCC ATG ATA AGT
rna           T
              *

844    I   G   V   L   G   F   L   V   W   A   H   H  T/M  F   T   V   G  P/L  D
dna   ATT GGT GTT CTT GGA TTC CTT GTT TGG GCT CAT CAT ACG TTT ACT GTG GGC CCA GAC
rna                                                    T                   T
                                                       *                   +

901    V   D   T   R   A  H/Y S/F  T   A   A   T   M   I  T/I  A   V   P   T   G
dna   GTT GAT ACG CGT GCT CAC TCT ACC GCA GCT ACC ATG ATC ACA GCT GTC CCC ACT GGA
rna                       T T  T                           T
                          *    *                           *

958    I   K   I   F   S   W   I   A   T   M   W   G   G   S   I   R   Y   K   T
dna   ATC AAA ATC TTT AGT TGG ATC GCT ACC ATG TGG GGA GGT TCG ATA CGA TAC AAA ACA
rna

1015   P   M  P/L S/F  A   V   G  S/F  I  L/F  L  S/F  T   I   G   G   L   T   G
dna   CCT ATG CCA TCT GCT GTA GGG TCC ATC CTT CTG TCC ACC ATA GGA GGA CTT ACT GGA
rna            T   T               T      T        T
               *   +               *      *        *

1072   I   V   L   A   N   S   G   L   D   I   A  P/L  H   D   T  H/Y  Y   V   V
dna   ATA GTC CTG GCG AAT TCC GGG CTA GAC ATT GCT CCG CAT GAT ACT CAT TAC GTG GTT
rna                                                T              T
                                                   +              *

1129   A   H  S/F  H   Y   V   L  P/S  M   G   A   V   F   A   L   F   A   G   F
dna   GCA CAT TCC CAT TAT GTA CTT CCC ATG GGA GCC GTT TTC GCT TTA TTC GCA GGA TTT
rna            T                  T
               *                  *

1186   Y   F  R/W  V   G   K   I   S   G   R   T   Y   P   E   T   L   G   Q   I
dna   TAC TTC CGG GTG GGG AAG ATC TCT GGT CGA ACA TAC CCC GAA ACT TTA GGT CAA ATT
rna           T
              *                    +

1243   H   F  R/W  I   T   F   V   G   V   N   L   T  P/F  F   P   M   H   F   L
dna   CAT TTT CGG ATC ACT TTC GTC GGG GTG AAT TTG ACC CCC TTT CCC ATG CAT TTC TTG
rna           T                                       TT
              +                                       **

1300   G   L   S   G   M   P   R   R   I
dna   GGG CTT TCG GGT ATG CCA CGT CGC ATT
rna

Figure 2.1. Comparison of the partial gene sequence (dna) and corresponding partial cDNA sequence (rna) from a 712 base pair portion of the gene coxI from western red cedar. For the cDNA sequence, only the nucleotides which differ from those in the gene sequence are shown (all of these are "T" [i.e., "U" in mRNA]). Predicted RNA editing sites (from the gene sequence alone) are indicated by "*" or "+" (see text for details). The predicted protein sequence is shown above the gene sequence. Differences in the predicted protein sequence from gene to mRNA are indicated as X/Y, where X equals the amino acid predicted by the gene and Y equals that predicted by the mRNA. Numbering is relative to the maize sequence (Isaac et al., 1985). The partial sequences of the western red cedar coxI gene and of its transcript have been assigned EMBL accession numbers X58887 and X64833 respectively.

Figure 2.2. Corresponding portions of DNA and cDNA sequencing autoradiographs from coxI in western red cedar. Five C-to-T (U in mRNA) differences are shown (arrows). The portions shown correspond to nucleotides 1009 to 1061 in Figure 2.1.

Table 2.1. RNA editing of coxI in western red cedar restores evolutionary conservation in the protein encoded.

Edited         Codon      Codon        Amino acid           Angiosperm      Non-plant
nucleotide(s)  position   DNA   RNA    predicted from       amino acid(s)   amino acid(s)
                                       DNA    RNA
635            212        TCA   uUa    ser    leu           leu             leu
664            222        CTT   Uuu    leu    phe           phe             phe
710            237        CCC   cUc    pro    leu           leu             leu
712,713        238        CCT   UUu    pro    phe           phe             phe
752            251        CCG   cUg    pro    leu           leu             leu,ile
793            265        CCG   Ucg    pro    ser           ser             ser
881            294        ACG   aUg    thr    met           met             met
896            299        CCA   cUa    pro    leu           leu             leu,met
916,918        306        CAC   UaU    his    tyr           tyr             tyr
920            307        TCT   uUu    ser    phe           phe             phe
941            314        ACA   aUa    thr    ile           ile             ile
1022           341        CCA   cUa    pro    leu           leu             leu
1025           342        TCT   uUu    ser    phe           phe             trp,phe,tyr
1037           346        TCC   uUc    ser    phe           phe,ser         phe
1042           348        CTT   Uuu    leu    phe           phe             phe
1049           350        TCC   uUc    ser    phe           phe             phe
1106           369        CCG   cUg    pro    leu           leu             leu,phe
1117           373        CAT   Uau    his    tyr           tyr             tyr
1136           379        TCC   uUc    ser    phe           phe             phe
1150           384        CCC   Ucc    pro    ser           ser             ser
1192           398        CGG   Ugg    arg    trp           trp             trp
1249           417        CGG   Ugg    arg    trp           trp             trp,thr,ile
1279,1280      427        CCC   UUc    pro    phe           phe,leu         phe

Codons in the RNA are in lower case except where they differ from the DNA. The Angiosperm Amino Acid(s) column lists the gene-predicted amino acids that occur at homologous positions in maize (Isaac et al., 1985), sorghum (Bailey-Serres et al., 1986), wheat (Bonen et al., 1987), rice (Kadowaki et al., 1989), soybean (Grabau et al., 1986), Oenothera (Hiesel et al., 1987), and pea (Kemmerer et al., 1989). The Non-Plant Amino Acid(s) column lists the gene-predicted amino acids occurring in Neurospora, yeast, humans, and Drosophila (see Fig. 3 in Isaac et al., 1985). In both columns, where more than one amino acid occurs among species at a position, the consensus amino acid is given first. Numbering is relative to maize (Isaac et al., 1985).
Editing of coxI in angiosperms

Prior to the work described herein, due to the lack of mRNA sequence data, RNA editing of a plant coxI transcript had not been demonstrated. To our knowledge, wheat was the only plant in which any coxI mRNA sequence had been determined (Gualberto et al., 1989). In the partial sequence examined, RNA editing was not observed. However, from angiosperm coxI gene sequences, Covello and Gray (1990) inferred that, while coxI mRNAs of monocotyledons may not be edited at all, those of dicotyledons are likely to be edited. They identified 8 probable editing sites in the dicotyledons pea, soybean, and Oenothera, 5 of which lie within the 712 base portion of the gene that we have examined in western red cedar. Only one of these 5 sites (position 1279 in Figure 2.1) is edited in cedar. The remaining 4 sites do not require editing in cedar as they already have T's in the DNA. Note also that almost all of the editing sites identified in cedar do not require editing in any of the angiosperms. These observations indicate that RNA editing sites in coxI have evolved independently among gymnosperms, dicotyledons, and monocotyledons.

CoxI is not likely to be edited in Douglas-fir

In addition to western red cedar, the same portion of the coxI gene (but not the transcript) was PCR amplified and sequenced in Douglas-fir (data not shown). From this gene sequence alone it could be inferred that the coxI transcript was not likely to be edited in this species. All of the editing sites identified in western red cedar already encoded the proper residue in the Douglas-fir sequence and therefore did not need to be edited. Furthermore, no other sites were identified in the Douglas-fir sequence where a C-to-U edit could prevent an amino acid substitution of a conserved residue. Thus, it appears that RNA editing of coxI occurs in western red cedar but does not occur in Douglas-fir.
A probable explanation of this finding is that the coxI gene was replaced via reverse-transcription of a fully-edited coxI transcript in a predecessor of the latter species. This same explanation was invoked by Covello and Gray (1990) for the analogous situation in angiosperms where coxI appears to be edited in dicots but not in monocots. Note that, in contrast to coxI, most other mitochondrial genes examined in monocots are edited (Gray et al., 1992). Similarly, the apparent lack of editing of coxI in Douglas-fir does not rule out the occurrence of RNA editing of other mitochondrial transcripts in this species.

Alternate hypotheses

In order to confirm that the product of the RT-PCR was not derived from residual DNA contamination in the RNA preparation, a reverse-transcriptase minus control was included. No amplification products could be detected from this control (data not shown). Because of the characteristic pattern displayed, the conclusion that the substitutions between the coxI gene and cDNA observed here are due to mitochondrial RNA editing cannot be reasonably disputed. However, it could be argued that RNA editing no longer occurs in present day western red cedar, and that what was observed in this chapter is in fact a relic of ancestral RNA editing. Nugent and Palmer (1991) showed that during angiosperm evolution the coxII gene was transferred from the mitochondria to the nucleus via an edited RNA intermediate. Of those angiosperm species that have a nuclear copy of coxII, most have retained a residual, non-functional mitochondrial copy, which formerly, when it was functional, required RNA editing (Nugent and Palmer, 1991; Covello and Gray, 1992).
Thus, it is possible that the coxI cDNA sequence obtained here from western red cedar is that of the transcript of a nuclear copy of the gene (derived from gene transfer via an edited RNA intermediate in an ancestor) while the DNA sequence obtained is that of a residual mitochondrial copy (whose former transcript required RNA editing). Due to the lack of reliable protocols for the isolation of pure mitochondrial RNA or DNA from conifers, this possibility could not be completely ruled out. However, there are two reasons why it is unlikely. To begin with, although other plant mitochondrial genes are known to have been transferred to the nucleus (Nugent and Palmer, 1991), transfer of the coxI gene to the nucleus and transcription of this transferred gene have not been observed in angiosperms. The second, more compelling reason is the lack of sequence divergence (other than that at the postulated RNA editing sites) between the portions of the gene and cDNA sequenced here. If such a transfer, followed by functional activation of the nuclear copy, did occur in an ancestor of western red cedar, it must have occurred millions of years ago, so that nuclear transcription of the gene and mitochondrial import of the protein would have had time to evolve. Note that the most recent transfer known, coxII in legumes, occurred between 60 and 200 million years ago (Nugent and Palmer, 1991). Accordingly, sequence divergence, other than that at the RNA editing sites, between the nuclear copy and residual mitochondrial copy, is expected in those species retaining both copies. In fact, such sequence divergence between nuclear and residual mitochondrial copies of coxII has been shown to be abundant in soybean (Covello and Gray, 1992). In contrast, sequence divergence other than that at the postulated RNA editing sites was not observed between the portion of the coxI DNA and cDNA examined here in western red cedar.
This can be considered as strong evidence against this alternate hypothesis and, conversely, in favour of the occurrence of RNA editing in present day western red cedar mitochondria. Nevertheless, even if the alternate hypothesis could not be absolutely disproved, the observations reported in this chapter still confirm that mitochondrial RNA editing of coxI either occurs in western red cedar or occurred in one of its ancestors. It is unlikely that such a complex biochemical process as C-to-U RNA editing arose independently in both angiosperms and gymnosperms (Covello and Gray, 1993). Other examples of RNA editing in distantly related taxa differ in character (e.g., insertions of U's in trypanosomes). Thus, these results suggest that plant mitochondrial RNA editing of the C-to-U type evolved prior to the divergence of gymnosperms and angiosperms.

Recent progress

The results reported in this chapter were published in 1992 (Glaubitz and Carlson, 1992). Progress since that time in regard to the taxonomic distribution of plant mitochondrial RNA editing will now be discussed. It should be first noted, however, that RNA editing far less extensive than, but very similar in nature to, that in plant mitochondria has been observed in chloroplasts as well (Hoch et al., 1991; Kudla et al., 1992; Maier et al., 1992). Perhaps now that molecular biologists know to look out for it, RNA editing of one form or another will turn out to be commonplace among genetic systems. At any rate, as far as the taxonomic distribution of plant mitochondrial editing goes, further progress has been made that both corroborates the results reported here and sets the date of origin of plant mitochondrial editing further back in evolutionary time. From the genomic sequence of a PCR-amplified portion of the coxII gene in the primitive vascular plant Psilotum nudum (Pteridophyta), Sper-Whitis et al. (1994) inferred that the transcript must be RNA edited.
Furthermore, an extensive study of the taxonomic distribution of plant mitochondrial RNA editing was carried out by Hiesel et al. (1994). They examined corresponding portions of the coxIII gene and its transcript via PCR and RT-PCR from representatives of all major plant groups. RNA editing was found in all Spermatophyta (seed plants) and Pteridophyta (primitive vascular plants) members examined but not in the Bryophyta (mosses and liverworts) or Chlorophyta (green algae) members. Taken together with previous knowledge (described above in "Introduction"), their results suggest that plant mitochondrial RNA editing arose before the evolution of vascular plants (kormophytes). The inability to find mitochondrial RNA editing to date in any of the bryophytes or chlorophytes examined suggests that it arose in a common ancestor of the kormophytes after the divergence of the bryophyte lineage. However, because editing may occur in other mitochondrial genes or bryophyte species not yet examined, this negative result is subject to further confirmation (cf. coxI in monocots). Also, the hypothesis that plant mitochondrial RNA editing arose before the divergence of the bryophyte lineage and then was lost in this lineage (and possibly in chlorophytes as well), though less likely, cannot currently be disproved (Hiesel et al., 1994). The study of Hiesel et al. (1994) included three gymnosperm species: Picea abies, Ginkgo biloba, and Cycas revoluta. Of all the taxa they studied, these gymnosperms displayed the most extensive editing of the 321 nucleotide portion of the coxIII transcript examined, with Ginkgo biloba having the largest number (21) of editing sites. Taken together with my results (26 edits over a 712 nucleotide portion of coxI in Thuja plicata), it would appear, subject to further confirmation, that editing tends to be more extensive in gymnosperm mitochondria than in the mitochondria of angiosperms or pteridophytes.
Evolution of RNA editing

The phenomenon of plant mitochondrial RNA editing (and that of RNA editing in general) raises many difficult questions, of which this is by far the most perplexing: how did it arise in the first place? Several possible selective advantages of this process have been hypothesized (e.g., Benne, 1990; Mulligan, 1991; Schuster et al., 1991). RNA editing may provide an additional level of regulation of gene expression. The availability of a pool of unedited mRNA molecules may facilitate rapid response to sudden increases in demand for the enzyme. Also, this additional level of regulation may allow "fine tuning" of enzyme levels. Alternatively, the primary function of editing could be as an additional, RNA level, repair mechanism, allowing for the correction of deleterious mutations that escaped correction at the DNA level. However, it would be limited in this role (as only T-to-C transitions would be repaired) and inefficient (as the repair of each newly arising mutation would require the evolution of a specificity determinant targeting that particular mutation for editing). Another possibility is that proteins varying in sequence are produced from the same gene by the translation of differentially edited mRNAs. These variant proteins may be present in a single mitochondrion, or certain forms may be specific to certain tissues or developmental stages, or produced in response to particular environmental conditions. One problem with this idea is that mitochondrial enzymes perform critical functions and are thus highly conserved in sequence among organisms. As many RNA edits appear to be crucial for enzyme function, many of the possible variants would then be non-functional. Perhaps the presence of RNA editing allows the process of evolution to occur at the less risky RNA level rather than at the DNA level.
New edits in the mRNA could be "tested" via partial editing, with the combined presence of the original mRNA (i.e., edited in the original manner) in the same mitochondria acting as a "safety net". Beneficial edits could then become fixed (i.e., edited in all transcripts). In contrast to the above hypotheses, Gray and Covello (1993; see also Covello and Gray, 1993) propose that RNA editing arose by means of genetic drift (I prefer to think of it as "biochemical drift"), in the absence of a selective benefit. This could have happened by the "recruitment" by mutation of an existing enzyme (e.g., a cytidine deaminase acting on free nucleoside) to act upon mRNA. In their view, once established, the process subsequently became essential (by the establishment of essential editing sites). The task of sorting out the above hypotheses (or other alternatives) is one of the greatest challenges faced by workers in this area.

Chapter 3

SEGREGATION OF RANDOM AMPLIFIED POLYMORPHIC DNA MARKERS IN F1 PROGENY OF DOUGLAS-FIR

Introduction

The use of restriction fragment length polymorphisms (RFLPs) to generate genetic linkage maps represents an important contribution of molecular genetics to plant improvement programs. At the time that the work reported in this chapter was completed, RFLP-based genetic linkage maps had been constructed, or were underway, for maize (Helentjaris et al., 1986; Helentjaris, 1987; Burr and Burr, 1991; Beavis and Grant, 1992), tomato (Bernatzky and Tanksley, 1986), lettuce (Landry et al., 1987), rice (McCouch et al., 1988), wheat (Kam-Morgan and Gill, 1989), soybean (Kiem et al., 1989), lentil (Harvey and Muehlbauer, 1989), barley (Blake, 1990; Nilan, 1990), Brassica oleracea (Slocum, 1990), and Arabidopsis thaliana (Chang et al., 1988; Nam et al., 1989). The most common use for RFLP-based genetic linkage maps in plant improvement is to localize genes controlling important traits and to thereby provide markers for marker-assisted selection (MAS).
The advantages of MAS had been well documented (Beckman and Soller, 1986 and 1988; Patterson et al., 1988; Soller and Beckman, 1990; Lande and Thompson, 1990). Studies in tomato (Osborn et al., 1987; Nienhuis et al., 1987; Weller et al., 1988) and corn (Stuber, 1987; Grant et al., 1988) had demonstrated that marker based selection is not limited to qualitative traits, that is, traits controlled by one or a few genes, but is also applicable to quantitative and semi-quantitative traits. Early selection of progeny in the development of new cultivars for annual crops may save several months to several years of growing time as well as labour and space requirements. For genetic improvement of long lived species such as coniferous trees, however, the savings could be much more dramatic. Early selection (e.g., before one half or even one quarter of rotation age) is a universal feature of conifer tree breeding programs but is not risk-free (Lambeth, 1980; Newman and Williams, 1991). Weak juvenile-mature correlation at the phenotypic level could result in the inordinate risk of seriously reducing net gain if phenotypic selection were done too early. In contrast, MAS is performed at the genotypic level, so weak juvenile-mature correlation will only be a problem when associations between markers and trait loci are first being established, but not when MAS is applied in subsequent generations. With MAS, heritable traits need not even be expressed at the time selection is performed. Similarly, environmental effects that may confound phenotypic selection will be of little consequence at the time when MAS is applied (although they may complicate the initial establishment of marker/trait associations).
Despite significant potential advantages for MAS in forest tree improvement programs, especially with conifers, at the time that the research reported in this chapter and the following chapter was completed, no molecular marker based linkage map had yet been reported for a tree species. Linkage studies in forest trees had been limited to isozyme markers (Guries et al., 1978; Rudin and Ekberg, 1978; O'Malley et al., 1979; Eckert et al., 1981; Conkle, 1981; El-Kassaby et al., 1982; Cheliak et al., 1984a; Strauss and Conkle, 1986). These linkage analyses were restricted by the small number of isozyme loci in tree species and the level of polymorphism. The most comprehensive study at that time (Conkle, 1981) identified five linkage groups in six conifer species, covering a maximum of 226.4 cM. One potential problem in RFLP-based genetic linkage mapping with conifers is the large size of their nuclear genomes relative to herbaceous crops. For example, the genome size of Douglas-fir is approximately 25 pg per haploid nucleus (Ingle et al., 1975), while that of soybean is approximately 1 pg (Goldberg, 1978). Thus, reproducibly detecting single-copy genes by Southern hybridization and saturating RFLP-based linkage maps was expected to be more difficult for conifers than for angiosperms. Hence, the development of time-saving and non-hybridization based techniques would be of particular benefit to the molecular genetics of coniferous trees. The development of the "Random Amplified Polymorphic DNA" ("RAPD") marker system by Williams et al. (1990) appeared to be the innovation that was needed. The RAPD system is a modification of the polymerase chain reaction (PCR) which, instead of using pairs of long (roughly 20-mer) oligonucleotide primers designed to specifically amplify particular predetermined segments, uses single, short (10-mer) primers of arbitrary sequence.
Such short primers can bind to many places in the template genome, and as a result amplify DNA segments from multiple, non-predetermined locations. The multiple amplification products produced by a given 10-mer primer can be visualized as bands on an ethidium bromide stained agarose gel. Most of the time, bands of different size originate from different loci (Williams et al., 1990). The presence or absence of a particular band is a reflection of a genetic polymorphism at the locus of origin. Unlike the RFLP system, the RAPD assay does not require radioactive isotopes or long exposure times. This new approach seemed to hold great promise for quickly placing markers on linkage groups, even with large genomes. The RAPD system for genotyping can save much time relative to the standard RFLP approach of cDNA or genomic library construction and Southern hybridizations. However, RFLPs (like isozymes) are codominant markers while RAPD markers are dominant (Williams et al., 1990). In other words, in the RAPD assay, homozygotes for the presence of an amplification product (+/+) at a locus are indistinguishable from heterozygotes (presence/absence or +/-), as both genotypes produce the corresponding band on the gel. Considerations of the potential of marker assisted selection had previously been based on the potential availability of genetic linkage maps saturated with codominant markers (RFLPs). To take advantage of the sudden availability of potentially large numbers of dominant markers afforded by the advent of the RAPD marker system, a paradigm shift was necessary. The work reported in this chapter, carried out while the paper of Williams et al. (1990) was still in press and published in 1991 (Carlson et al., 1991), contributed in some way to this paradigm shift, especially with regard to mapping in forest trees.
It was intended to answer the following three questions: 1) Can the RAPD assay be made to work with a coniferous tree, particularly in light of the large size of the conifer genome? 2) Do polymorphic RAPD bands segregate in F1 progeny of conifers as Mendelian alleles? and 3) Does the dominant nature of RAPD markers restrict their use in genome mapping of diploid organisms such as forest trees? For the promise of RAPD markers to be realized in forest trees, appropriate answers to these questions were critical. These questions were answered using Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) as a model conifer. Both the empirical results obtained and theoretical considerations suggested that RAPD markers can indeed be used to construct genetic linkage maps in trees. The ability to efficiently map dominant markers such as RAPDs in diploid organisms such as trees depends on the use of markers in testcross configurations. An alternative strategy, taking advantage of the availability of haploid tissue in the megagametophytes of conifers, is explored in the following chapter.

Materials and Methods

Genetic material

A partial diallel between 6 selected Douglas-fir genotypes was established in 1987 by Dr. Y.A. El-Kassaby of Canadian Pacific Forest Products, Ltd., and Dr. J. Maze of the University of British Columbia. Thirty progeny from three of the crosses in this Douglas-fir partial diallel were sampled.

DNA isolation

DNA was isolated from needles by the procedure of Wagner et al. (1987). Needle DNA samples from the six parental genotypes were further purified by banding in CsCl equilibrium density gradients (DeVerno et al., 1989). Concentrations of all purified DNA samples were determined by A260 absorbance on a UV spectrophotometer.

RAPD assay

The conditions reported by Williams et al. (1990) for the RAPD assay were optimized for use with conifer template DNA.
Tests of reaction components included varying DNA concentration, magnesium concentration, source of enzymes, enzyme concentration and primer concentration, as detailed in the Results and Discussion section. The optimum reaction mix, developed with Douglas-fir template DNA, consisted of 200 ng template DNA, 0.2 µM primer (single, 10 base oligonucleotide), 2 Units Perkin-Elmer/Cetus Amplitaq enzyme, 1.9 mM MgCl2, and 200 µM each dATP, dCTP, dGTP, dTTP in a final volume of 25 µl 1X Cetus reaction buffer, overlaid with 25 µl mineral oil. Amplification conditions included a preliminary 7 min. denaturation at 94°C, followed by addition of the enzyme, an additional 2 min. soak at 94°C, and then a total of 45 cycles of: 1 min. at 94°C, 1 min. at 35°C and 2 min. at 72°C in the Perkin-Elmer/Cetus DNA Thermocycler. The amplifications finished with an incubation at 72°C for 10 minutes, followed by a 4°C soak until recovery. To reduce the potential for contamination of the reaction mixes or reagents with PCR product, reaction mixes were prepared under a sterile laminar flow hood using dedicated pipettors.

Segregation analysis

DNA samples extracted from Douglas-fir F1 progeny were amplified with a set of eight PCR primers. This set of 10 base oligonucleotides (Ap3, Ap4, Ap4c, Ap5a, Ap5h, Ap6, Ap9, and Ap13) was selected from primers used by Williams et al. (1990). RAPD markers were detected by electrophoresis of the complete amplification reaction products from progeny and parents of each cross on 1.5% agarose horizontal gels. Gels were stained with ethidium bromide and photographed, and the distribution of markers among progeny was recorded. Goodness-of-fit to the 1:1 pattern of segregation for marker DNA bands, as predicted for Mendelian characters in a test-cross, was determined using the Chi-square test.
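As an illustration of the goodness-of-fit test just described, the following Python sketch computes the Chi-square statistic for a 1:1 (band present : band absent) expectation, together with the corresponding probability for one degree of freedom. The function name `chi_square_1to1` and the progeny counts are hypothetical and are not data from this study; the sketch applies no continuity correction.

```python
import math


def chi_square_1to1(present, absent):
    """Chi-square statistic (1 d.f.) and P-value for a 1:1 presence/absence ratio."""
    n = present + absent
    expected = n / 2  # each class is expected in half of the progeny
    chi2 = ((present - expected) ** 2 + (absent - expected) ** 2) / expected
    # With one degree of freedom the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: a band scored as present in 18 and absent in 12 progeny.
chi2, p = chi_square_1to1(18, 12)
print(f"chi-square = {chi2:.2f}, P = {p:.3f}")
```

A statistic below 3.84 (the 5% critical value for one degree of freedom) means the observed counts do not depart significantly from 1:1 segregation.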
Results and Discussion

Theoretical considerations for use of dominant molecular markers in genome mapping

Prior to conducting segregation tests with RAPD markers, it was considered whether genetic theory would support or deny the hypothesis that the dominance of RAPD markers would seriously limit their usefulness in genetic linkage mapping studies with outbreeding diploid organisms such as conifers. For a more complete understanding of the issues involved, the potential for mapping with large numbers of dominant markers in traditional agricultural pedigrees, the F2 intercross and the backcross, was first considered.

F2 intercross pedigrees

In agricultural crops such as Zea mays, the traditional mapping population consists of progeny from an F2 intercross between two inbred lines. In this case, the codominance of RFLP markers is clearly an advantage. After crossing the two homozygous parents AABB X aabb (or AAbb X aaBB), the selfing or inter-mating of the F1 (AaBb) produces nine two-locus F2 genotypes, all of which are recognizable by RFLPs (although the coupling and repulsion versions of double heterozygotes are indistinguishable). However, only 4 groups of genotypes would be recognizable with RAPD (dominant) markers: A-B-, A-bb, aaB-, and aabb. For genome mapping with F2 populations, the inability to distinguish all genotypes will translate into a loss of information from RAPD markers relative to RFLPs. In this case, linkage can only be detected by significant deviation from the expected 9:3:3:1 phenotypic ratio in the progeny corresponding to independent assortment. Hence, to detect linkage and accurately estimate recombination frequencies with dominant markers, many more F2 individuals must be sampled, especially when linked markers are in repulsion phase (Allard, 1956).
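The loss of information described above can be made concrete by enumerating the F2 progeny of an AaBb F1 under independent assortment. The Python sketch below is an illustration only (the function name `f2_classes` is hypothetical): it counts the two-locus genotypes that codominant markers could distinguish and the phenotypic classes that remain when both markers are dominant.

```python
from collections import Counter
from itertools import product


def f2_classes():
    """Enumerate F2 progeny of an AaBb F1 for two unlinked loci."""
    gametes = ["AB", "Ab", "aB", "ab"]  # equally frequent without linkage
    genotypes, phenotypes = Counter(), Counter()
    for g1, g2 in product(gametes, repeat=2):
        # Sorting puts the upper-case (dominant) allele first: 'AA', 'Aa' or 'aa'.
        locus_a = "".join(sorted(g1[0] + g2[0]))
        locus_b = "".join(sorted(g1[1] + g2[1]))
        genotypes[locus_a + locus_b] += 1
        # A dominant marker only reveals whether the band allele is present.
        pheno_a = "A-" if "A" in locus_a else "aa"
        pheno_b = "B-" if "B" in locus_b else "bb"
        phenotypes[pheno_a + pheno_b] += 1
    return genotypes, phenotypes


genotypes, phenotypes = f2_classes()
print(len(genotypes), "two-locus genotypes")
print(dict(phenotypes))
```

Out of 16 equally likely gamete combinations, the nine genotypes collapse into four dominant-marker classes in the 9:3:3:1 ratio referred to in the text.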
Backcross pedigrees

Dominant markers should be much more informative when backcross pedigrees, derived from crossing a heterozygous F1 individual back against either or both of the homozygous parents (inbred lines), are used for mapping. In this case, testcross configurations (AaBb X aabb) will exist where all progeny can be reliably genotyped even with dominant markers. In such testcross configurations, no information is lost when dominant markers are used instead of co-dominant markers. Linkage between a pair of loci in this configuration can be detected by significant deviation from the expected 1:1:1:1 ratio (AaBb:Aabb:aaBb:aabb) in the progeny corresponding to independent assortment. When multiple loci are considered for the backcross case, only those dominant marker loci in a multi-point testcross configuration (i.e., AaBbCcDdEeFf... X aabbccddeeff...) can be mapped relative to each other. Inbred lines in which all detectable dominant marker loci (i.e., RAPD markers) are homozygous recessive ("testers", i.e., aabbccddeeff...) very likely do not exist. Which loci group into a multi-point testcross configuration in a backcross with either of the parents must then depend on the genotypes of the parental inbred lines. Consider the mating of inbred lines of hypothetical (but realistic) genotypes AABBccDDeeffGG (parent 1) and aabbCCddEEFFgg (parent 2) producing an F1 genotype AaBbCcDdEeFfGg. In the backcross involving parent 2, the four loci A, B, D and G are in the required (four-point) testcross configuration (i.e., AaBbDdGg X aabbddgg). Conversely, in the backcross with parent 1, the remaining three loci, C, E and F, are in the required (three-point) testcross configuration (i.e., CcEeFf X cceeff). Testcross configurations will not exist in either of the backcrosses for all pairs of loci where one locus is homozygous recessive and the other locus is homozygous dominant in the same parent (e.g., A and C, A and E, A and F, B and C, etc.).
Hence, all dominant marker loci that are polymorphic between the two parental inbred lines will fall into one of two mutually exclusive groups: (1) those loci in a multi-point testcross configuration in the backcross of the F1 to one parent, and (2) those loci in a multi-point testcross configuration in the backcross of the F1 to the other parent (i.e., all remaining loci). Thus, all loci within one of these groups can be mapped relative to each other using a testcross model. If progeny from both possible backcrosses (i.e., F1 X one parent and F1 X the other parent) are available, two separate maps, consisting of group 1 loci and group 2 loci respectively, can be constructed. In this manner all dominant markers polymorphic between the two inbred lines can be placed on one or the other of these two maps.

Conifer pedigrees

In contrast, inbred lines are not available in coniferous trees. In fact, because of the long generation time of most conifers, even three-generation pedigrees are rare. Hence, it is important to consider whether linkage maps composed of dominant markers can be constructed using outbred, two-generation, full-sib pedigrees such as those available in the Douglas-fir partial diallel described in the Materials and Methods section of this chapter. As noted above, the testcross configuration is the most efficient for detecting linkage with dominant markers. As most conifers are highly heterozygous (the average heterozygosity of isozyme loci in Douglas-fir in British Columbia is 0.16 [Yeh and O'Malley, 1980; Yeh, 1981]), testcross configurations should be common among pairs of dominant marker loci. When a dominant RAPD marker is polymorphic between two parent trees, the parent tree lacking the band must be homozygous recessive (aa) while the parent tree in which the marker is present is either a heterozygote (Aa) or homozygous dominant (AA). In the latter case, no segregation will be observed in the progeny and the marker locus cannot be mapped.
However, in the former testcross case (Aa X aa), segregation of the dominant allele will be observed in the progeny. To extend this to multiple loci, consider numerous dominant loci polymorphic between the two parent trees and heterozygous in the parent possessing the dominant allele (e.g., parent 1: AabbccDdeeFfgg X parent 2: aaBbCcddEeffGg). All those loci with the dominant allele present in the same parent (e.g., parent 1) will be in a multi-point testcross configuration and can be mapped relative to each other (e.g., AaDdFf X aaddff). Conversely, all the remaining loci with the dominant allele present in the other parent (e.g., parent 2) will be in a multi-point testcross configuration and can be mapped relative to each other (e.g., BbCcEeGg X bbcceegg). Hence, as in the backcross situation, two separate maps can then be constructed. In this case, however, the two maps can be constructed by following the segregation of markers in a single set of progeny. Another difference between this case and the backcross case described above is the linkage phase of the segregating markers. In the backcross case, because the F1 is derived from inbred lines, all linked segregating markers in one of the two multi-point testcross groups will be in coupling phase. In contrast, with the outbred two-generation pedigree considered here, linkage phase of the heterozygous markers in either of the parents is unknown, and therefore must be inferred from the segregation data. With reasonable numbers of progeny (e.g., >50) and closely linked markers (e.g., recombination less than 30%) this should not be a problem. Hence, from a theoretical standpoint, it should be possible to construct saturated linkage maps in conifers (or other outbred diploid organisms) using two-generation, full-sib pedigrees and dominant RAPD markers. At this point, what is meant by a "saturated" linkage map will be defined more clearly.
For the purposes of this study, a saturated map will be defined as one where 90% of the genome lies within 15 cM or less of a mapped marker. The number of markers required is then dependent on genome size, and is given by the equation of Lange and Boehnke (1982):

n = ln(1 - P) / ln(1 - 2c/k)

where P is the proportion of the genome within a maximum distance of c (in cM) from the nearest marker and k is the genome size (in cM). Neale and Williams (1991) gave an estimate of the conifer genome size of 2500 cM. This has proven to be a reasonably accurate estimate (Nelson et al., 1993; Nelson et al., 1994; Gerber and Rodolphe, 1994a; Devey et al., 1994). For this genome size, greater than 200 markers would be needed for a saturated map as defined above. The conclusion from the above theoretical considerations will hold true in practice if enough RAPD marker loci can be identified that are heterozygous in one parent and homozygous recessive in the other. This will depend on the number of available RAPD loci and on the heterozygosity of the parent trees. As over 1 million unique 10-mer primers are possible, the number of available RAPD loci is not a limitation. Furthermore, as noted above, most coniferous trees are highly heterozygous. Therefore, theoretical considerations suggest that it is highly plausible that saturated genetic linkage maps can be constructed in conifers using dominant RAPD markers in testcross configurations. The optimism of these theoretical considerations was then tested empirically in Douglas-fir. It was determined whether the RAPD assay works well with Douglas-fir DNA, whether polymorphisms could be found among parents of available controlled crosses, and, by analysis of the segregation of these polymorphisms in the progeny, what proportion of the polymorphic loci identified were in the required testcross configuration. The results of this empirical test follow.
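As a rough numerical check, the Lange and Boehnke (1982) expression quoted above can be evaluated for the saturation criterion used here (P = 0.90, c = 15 cM, k = 2500 cM). This is a minimal sketch; the function and argument names are illustrative assumptions.

```python
import math


def markers_for_coverage(p, c_cm, genome_cm):
    """Evaluate n = ln(1 - P) / ln(1 - 2c/k) for the given coverage target."""
    return math.log(1.0 - p) / math.log(1.0 - 2.0 * c_cm / genome_cm)


# Values taken from the text: 90% of the genome within 15 cM of a marker,
# with a conifer genome size of 2500 cM.
n = markers_for_coverage(p=0.90, c_cm=15.0, genome_cm=2500.0)
print(f"markers required: {n:.1f}")
```

With these inputs the expression evaluates to roughly 191 markers, the same order of magnitude as the figure of about 200 markers used in the text.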
Optimization of amplification conditions for conifers

The first attempts made with template DNA from Douglas-fir utilizing the reaction conditions recommended by Williams et al. (1990) revealed many polymorphisms between several individuals (data not shown). However, to use the RAPD system routinely with conifer template DNA it was apparent that reaction conditions were needed that would increase reproducibility of bands and decrease background staining in gel lanes due to non-specific amplification products. It is likely that the increased background observed here relative to that observed by Williams et al. (1990) with soybean is due to the much larger genome size of Douglas-fir. To determine the effect of magnesium concentration, we tested four concentrations (1.5 mM, 1.9 mM, 2.3 mM, 2.7 mM) in the RAPD protocol, using one PCR primer (Ap3) vs. three Douglas-fir genotypes (Fd25, Fd439, Fd48). To determine the optimum DNA concentration, we tested five concentrations of DNA from one Douglas-fir parent (12.5, 25, 50, 100, 200 ng in 25 µl) in separate reactions with two primers (Ap3 & Ap4). We also looked at the effect of enzyme source by comparing the production of RAPD markers from DNA using the Cetus Perkin Elmer enzyme (AmpliTaq) with two Douglas-fir genotypes (Fd37 & Fd196) and one PCR primer (Ap9) with amplification products from thermostable DNA polymerases of several other commercial suppliers. Finally, we tested different approaches to purifying the synthetic oligonucleotide primers (Sep packs [Millipore], G-25 Sephadex drip and spun columns, Pharmacia NAP-10 columns, ethanol precipitation and no purification). The optimal reaction conditions that we arrived at for generating RAPD markers from conifer template DNA are given in the Materials and Methods section. The best results were obtained when Sep packs or Pharmacia NAP-10 columns were used to purify the primers.
Some of the observations that are pertinent to obtaining reproducible and readily scorable amplification products with conifers include: 1) enzyme concentration and enzyme source are critical; 2) 1.9 mM Mg2+ is optimal, as higher concentrations result in background smearing while lower concentrations produce less intense PCR bands; 3) reproducibility and resolution are enhanced with higher DNA template concentration; and 4) lower molecular weight DNA bands are usually more reproducible as RAPD markers than higher molecular weight (> 1.5 kb) bands.

Segregation analyses

As a first step, we determined which of the eight PCR primers could produce scorable RAPD markers with template DNA from the six Douglas-fir parental genotypes (Fd-25, -37, -48, -120, -196, -439). Figure 3.1 shows the electrophoretic results of reactions with 4 of the 8 primers. We found that 7 of the 8 primers revealed scorable (i.e., unambiguous) polymorphisms. Although not all parents gave unique products with every primer, certain primers, e.g., Ap4c, revealed more polymorphisms than others. The primer that did not produce scorable markers from any of the parents, Ap4, differed from the very productive Ap4c primer at only the third nucleotide (data not shown). Based on the screening of the 6 parents with the 8 primers, 11 polymorphic bands in total, produced by 5 of the primers (Ap3, Ap4c, Ap5a, Ap9 and Ap13), were chosen for the segregation analysis. Of the progeny sets available in the partial diallel, three were chosen for this analysis (Fd-25 X Fd-439, Fd-37 X Fd-196 and Fd-48 X Fd-439). Note that one of the chosen bands (a 0.4 kb band produced by primer Ap3) was tested for segregation in two different crosses (Fd-37 X Fd-196 and Fd-48 X Fd-439). Under Mendelian assumptions, when a dominant RAPD marker is present in one parent and absent in the other (i.e., A- X aa), the marker will segregate in the progeny only when the parent in which the marker is present is a heterozygote.
This is the test cross configuration (Aa X aa) desired for mapping. Segregation is expected to follow a 1:1 ratio. Conversely, the lack of segregation of the marker among a large sample (i.e., > 10) of progeny (i.e., presence of the band in all progeny) is indicative that the parent in which the marker is present is a homozygote. The relative frequency of these two cases will affect the feasibility of constructing a saturated genetic linkage map using the test cross approach with diploid organisms.

Figure 3.1. DNA polymorphisms between 6 parental Douglas-fir genotypes from 4 RAPD primers. Shown is an ethidium bromide stained agarose gel of RAPD products from primer Ap4c, lanes 1-7; primer Ap5a, lanes 8-14; primer Ap5h, lanes 15-21; primer Ap13, lanes 22-28; parental genotype Fd-25, lanes 1, 8, 15, and 22; parental genotype Fd-37, lanes 2, 9, 16, and 23 (lack of amplification left lane 16 blank); parental genotype Fd-48, lanes 3, 10, 17, and 24; parental genotype Fd-120, lanes 4, 11, 18, and 25; parental genotype Fd-196, lanes 5, 12, 19, and 26; parental genotype Fd-439, lanes 6, 13, 20, and 27; minus template DNA controls, lanes 7, 14, 21, 28. Lanes marked "M" contain Lambda/HindIII size standards.

Segregation of RAPD markers in 30 F1 progeny from individual crosses in a diallel between the Douglas-fir parents was analyzed by horizontal agarose gel electrophoresis. Also included on each gel were two sets of RAPD reactions from the two parental genotypes and a contamination control (a RAPD reaction without template DNA). Figure 3.2 shows the segregation among the progeny of Fd-37 and Fd-196 of a polymorphic band produced by primer Ap9. Figure 3.3 shows the segregation among the same progeny of a polymorphic band produced by primer Ap3. These figures reveal how easily the RAPD markers can be scored as segregating genetic alleles (by presence or absence of an ethidium bromide stained parental DNA band among progeny).
The results from the segregation test of the 12 selected RAPD marker bands are summarized in Table 3.1. Only two of the twelve markers did not segregate. For those 10 Douglas-fir RAPD markers that did segregate, results for Chi-square analysis of goodness of fit to a 1:1 segregation ratio are given. Seven cases fit a 1:1 segregation while 3 cases did not, tested at the 0.05 level of significance. This 30% rate of segregation distortion is much higher than the rate expected by chance alone (5%). Therefore, the distortions observed are very likely to have underlying biological or methodological causes. The 30% distortion rate observed here is much higher than rates generally observed in studies of forest trees with isozymes (e.g., Cheliak et al., 1984a; Strauss and Conkle, 1986; Cheliak et al., 1987; Adams et al., 1990) or RFLPs (Devey et al., 1991; Devey et al., 1994; Bradshaw and Stettler, 1994; Jermstad et al., 1994). However, a recent study of Sugi (Cryptomeria japonica) found a 21% rate of segregation distortion over a total of 128 RFLP loci (Mukai et al., 1995). Also, high rates of segregation distortion (ranging from 17 to 28%) have been found in a number of RFLP studies of progeny of intraspecific crosses in angiosperms (McCouch et al., 1988; Landry et al., 1991 and references therein).

Figure 3.2. Segregation of a polymorphic DNA band (RAPD marker) produced by primer Ap9. Shown is an ethidium bromide stained agarose gel of RAPD products from Douglas-fir parent Fd-37, lanes 17 and 35; parent Fd-196, lanes 18 and 36; F1 progeny, lanes 1-15 and 19-33; and no template DNA controls, lanes 16 and 34. Arrows indicate position of the polymorphic DNA band. Lack of amplification left lanes 1 and 11 blank. Lanes marked "M" contain Lambda/HindIII size standards.

Figure 3.3. Segregation of a polymorphic DNA band (RAPD marker) produced with primer Ap3. Progeny, parents and lane arrangement identical to Figure 3.2.

Table 3.1. Segregation analysis of RAPD bands polymorphic between Douglas-fir parents.

Primer  Approx. band size (kb)  Cross *   Band present  Band absent  Total  Chi-square (1:1 ratio)
Ap3     0.4                     37 X 196  16            14           30     0.13
Ap3     0.4                     48 X 439  21            0            21     **
Ap3     1.2                     48 X 439  7             15           22     2.91
Ap4c    1.1                     25 X 439  20            8            28     5.14 ***
Ap4c    1.3                     25 X 439  15            13           28     0.14
Ap4c    0.3                     37 X 196  21            9            30     4.80 ***
Ap4c    0.5                     37 X 196  17            13           30     0.53
Ap4c    1.2                     37 X 196  13            17           30     0.53
Ap5a    0.5                     25 X 439  20            7            27     6.26 ***
Ap9     0.5                     25 X 439  15            14           29     0.03
Ap9     1.0                     37 X 196  11            17           28     1.29
Ap13    0.6                     37 X 196  30            0            30     **

* The parent tree in which the band is present is underlined.
** No segregation.
*** Significant deviation from 1:1 ratio at 0.05 level.

Two possible biological causes of the observed segregation distortions are as follows. One is meiotic drive, the effect of a "selfish" DNA element that acts to distort segregation in its own favour by disrupting meiosis or gametogenesis (reviewed in Lyttle, 1993). Another is close linkage of the RAPD marker in question to an allele negatively affecting viability of gametes, megagametophytes, and/or embryos (Sorensen, 1967; Bradshaw and Stettler, 1994). As genetic load is known to be particularly high in conifers (Sorensen, 1969; Savolainen, 1994), such deleterious loci are expected to be common. Alternatively, the observed segregation distortions may be due to methodological causes. All three distorted loci showed a 3:1 (presence:absence) pattern. This could be due to co-migration of segregating bands from two loci, both heterozygous in one parent and homozygous absent in the other. Also, the observed distortions could be due to low reproducibility of the scored bands. Perhaps all 30 progeny actually received the "presence allele" at one (or more) of these three loci but the band(s) in question was (were) only 75% reproducible (possibly due to partially failed reactions).
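The goodness-of-fit values in Table 3.1, and the observation that the three distorted loci resemble a 3:1 pattern, can be reproduced with a short sketch. The function name and the locus labels are illustrative assumptions; the counts are those in Table 3.1, and 3.841 is the standard critical value of chi-square with one degree of freedom at the 0.05 level.

```python
def chi_square(present, absent, ratio=(1, 1)):
    """Goodness-of-fit chi-square for band presence:absence against a given ratio."""
    total = present + absent
    expected_present = total * ratio[0] / sum(ratio)
    expected_absent = total * ratio[1] / sum(ratio)
    return ((present - expected_present) ** 2 / expected_present
            + (absent - expected_absent) ** 2 / expected_absent)


CRITICAL_0_05_DF1 = 3.841  # chi-square critical value, 1 df, 0.05 level

# The three loci flagged as distorted in Table 3.1 (present, absent).
distorted = {
    "Ap4c 1.1 kb, 25 X 439": (20, 8),
    "Ap4c 0.3 kb, 37 X 196": (21, 9),
    "Ap5a 0.5 kb, 25 X 439": (20, 7),
}

for label, (present, absent) in distorted.items():
    fit_1to1 = chi_square(present, absent, (1, 1))
    fit_3to1 = chi_square(present, absent, (3, 1))
    print(f"{label}: 1:1 chi-square = {fit_1to1:.2f}, 3:1 chi-square = {fit_3to1:.2f}")
```

Under this calculation each of the three loci exceeds the critical value against a 1:1 ratio but falls well below it against a 3:1 ratio. That agrees with the 3:1 pattern noted in the text, although it does not distinguish between the biological and methodological explanations.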
The competitive nature of the RAPD assay and its sensitivity to variation in reaction conditions would appear to demand careful vigilance from researchers in regard to potential artifactual results. It is suggested that reproducibility tests be built into all future RAPD marker studies to guard against this possibility. For example, 5-10% of all reactions could be replicated (preferably using replicated DNA isolations) to provide a reproducibility screen and an estimate of the experiment-wise reproducibility rate. As it was not determined in this study whether the observed segregation distortions were due to biological or methodological causes, only those markers showing 1:1 segregation were considered to be in the desired test cross configuration. Based on this, 7 out of the 12 markers examined proved to be in the test cross configuration. Note that independent assortment of multiple RAPD markers produced by the same primer within one family was observed with primer Ap4c (Table 3.1). This primer revealed 3 segregating markers in the Fd-37 X Fd-196 cross, and two segregating markers in the Fd-25 X Fd-439 cross. The potential to detect multiple segregating loci with a single primer should boost the efficiency of linkage mapping with RAPD markers. The data obtained in this preliminary study with Douglas-fir and white spruce suggest that the optimism from theoretical considerations for use of RAPD markers in genome mapping has been borne out. Loci in the test cross situation appear to be relatively common in Douglas-fir. At least 7 of 12, or over 50%, of the polymorphic loci that were followed in this preliminary study proved to be test cross cases. Furthermore, 7 out of 8, or over 85%, of the random primers tested in this study revealed polymorphisms between the six Douglas-fir parents.
The knowledge that most random primers generate one or more polymorphisms, that over 1 million unique 10-mer primers can theoretically be synthesized, and that test cross cases for RAPD markers in Douglas-fir are so common suggests that suitable numbers of loci in the proper test cross arrangement can be obtained for use in constructing saturated genetic linkage maps of conifers.

Mapping strategy

The following strategy is suggested for linkage map construction with dominant RAPD markers using DNA from diploid tissue and a two-generation pedigree: 1) identify primers that reveal plus/minus DNA band polymorphisms between the selected parents (i.e., A- X aa); 2) screen the polymorphisms for test cross cases (i.e., for "Aa" heterozygotes) by testing for segregation among a small sample (e.g., 6) of progeny; 3) conduct full-scale segregation analysis among all progeny (>50) with all test cross RAPD markers (markers that deviate significantly from 1:1 segregation can be eliminated from the analysis); 4) separate the scored markers into two data sets, according to which parent was heterozygous; and 5) perform linkage analysis (parental phase unknown) for each data set separately. Two linkage maps with no common loci would result from this analysis. These two maps should contain roughly the same number of loci. Linkage groups in each map could then be correlated by choosing RAPD products from several loci in each linkage group to generate RFLPs or SCARs (Sequence Characterized Amplified Regions; Paran and Michelmore, 1993) that can be followed for linkage. Furthermore, data from RAPD markers could be pooled with data from RFLPs and isozymes to create saturated linkage maps. The above proposal to convert mapped RAPD bands to RFLPs to allow map integration will be hampered somewhat by the fact that many RAPD products contain repetitive sequences (Williams et al., 1990; Reiter et al., 1992).
Grattapaglia and Sederoff (1994) found in Eucalyptus that almost 50% of their mapped RAPD bands contained repetitive sequences. This would imply that twice as much effort would be required to achieve the goal of map integration than if all RAPD bands were single copy. A rough estimate of the number of RAPD primers that would be needed to construct a pair of saturated linkage maps in a typical conifer can be made based on the results of this study (presuming that the heterozygosity of RAPD markers in Douglas-fir is typical of conifers in general). Herein, polymorphisms were detected among six Douglas-fir parents, and segregation followed among progeny of three of the available crosses, involving 5 of the 6 parents. However, for construction of a pair of maps following the strategy outlined above, only a single two-generation pedigree composed of progeny of a single cross would be used. Hence, to estimate the number of primers needed only polymorphisms and segregation data in one of the crosses examined here should be considered. The cross for which the most data was obtained in this study was Fd-37 X Fd-196. Seven RAPD primers revealed 9 polymorphisms in total between these two parents, or 1.3 polymorphisms per primer. (Due to reaction failure with Fd-37 in the screening step, data from primer Ap5h were not obtained; see Figure 3.1.) Of these 9 polymorphisms, 6 were tested for segregation in the progeny of Fd-37 X Fd-196 (Table 3.1). Four of the 6 tested polymorphisms segregated in the 1:1 ratio indicative of a testcross, corresponding to 0.67 testcrosses per polymorphism. Hence, the number of testcrosses per primer can be calculated as follows: 1.3 polymorphisms per primer X 0.67 testcrosses per polymorphism = 0.86 testcrosses per primer. In order to construct two saturated maps (one for the heterozygous loci in each parent) of at least 200 markers each, at least 400 testcross loci will be needed. Dividing 400 by 0.86 yields a rough estimate of 465 primers.
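The arithmetic behind this primer estimate can be laid out as a short sketch. The variable names are illustrative assumptions; the counts are those reported above for the Fd-37 X Fd-196 cross.

```python
# Counts reported in the text for the Fd-37 X Fd-196 cross.
primers_screened = 7
polymorphisms_found = 9
polymorphisms_tested = 6
testcross_segregants = 4   # tested polymorphisms that fit a 1:1 ratio

# Target: two maps (one per parent) of at least 200 markers each.
markers_per_map = 200
number_of_maps = 2

polymorphisms_per_primer = polymorphisms_found / primers_screened
testcrosses_per_polymorphism = testcross_segregants / polymorphisms_tested
testcrosses_per_primer = polymorphisms_per_primer * testcrosses_per_polymorphism
primers_needed = number_of_maps * markers_per_map / testcrosses_per_primer

print(f"polymorphisms per primer:     {polymorphisms_per_primer:.2f}")
print(f"testcrosses per polymorphism: {testcrosses_per_polymorphism:.2f}")
print(f"testcrosses per primer:       {testcrosses_per_primer:.2f}")
print(f"primers needed:               {primers_needed:.0f}")
```

Carrying the unrounded intermediate values through gives about 467 primers rather than 465. The small difference comes from rounding the intermediate ratios and does not affect the planning conclusion.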
For planning purposes, it is best to overestimate the number of primers needed. Therefore, it is suggested that this number be increased to 600. Thus, roughly 600 primers are needed for saturated map construction following this testcross strategy in conifers. Based on the paper of Williams et al. (1990) and the success of this study, a RAPD primer synthesis project was initiated here at U.B.C. by Dr. John Carlson. Seven hundred RAPD primers were synthesized to be shared among researchers world-wide at low cost. Hence, this source alone should provide enough primers for saturated map construction. The mapping strategy outlined in this study should be applicable not only to conifers but to any highly heterozygous, diploid, outbred species for which controlled crosses can be made. For example, angiosperm tree species would also be excellent candidates. A powerful advantage of this approach is that no prior genetic information is needed. With conifers, the availability of haploid megagametophyte tissue from seeds allowed the formulation of an alternative mapping strategy not possible in most other diploid organisms. This alternative mapping strategy is the subject of the following chapter.

Progress since the publication of the results in this chapter

The results reported in this chapter were published in 1991 (Carlson et al., 1991). Progress since that time in regard to the use of RAPD markers derived from diploid tissue to construct genetic linkage maps in forest trees will now be discussed (see the next chapter for the use of haploid tissues in conifers). The mapping strategy described in this chapter and published in 1991 (Carlson et al., 1991) has come to be known as the "two-way pseudotestcross mapping strategy", or more concisely, the "pseudotestcross mapping strategy" (Grattapaglia and Sederoff, 1994). The latter authors have used this strategy to construct two saturated linkage maps (containing 240 and 251 RAPD markers respectively) in Eucalyptus.
The efficiency of the strategy was improved by using an interspecific pedigree (E. grandis X E. urophylla). This improves the likelihood that a RAPD marker heterozygous in one parent will be absent in the other. Also, to further increase the efficiency, the chosen interspecific cross involved highly heterozygous (presumably on the basis of isozyme data) parents. The strategy proved to be highly efficient as by screening just 305 primers, 558 segregating markers were produced, 491 of which were placed on one or the other map. A different mapping strategy was employed in Populus by Bradshaw et al. (1994). They used a three-generation pedigree founded by an interspecific cross of P. trichocarpa X P. deltoides to mimic an F2 intercross derived from inbred lines. Only those RAPD markers that were both polymorphic among the two parents and present in both of the F1 individuals mated to form the F2 were included in the analysis. As a result, the mated F1 individuals were both heterozygous for all RAPD loci analysed, and thus phenotypic segregation ratios of 3:1 were expected. As noted above, because linkage between a pair of such loci is detected by deviation from a phenotypic 9:3:3:1 ratio, large sample sizes are needed for accurate estimation of recombination frequencies. The maximum number of F2 progeny assayed in this study was 90. However, a large number of codominant RFLP and STS (sequence-tagged site; Olson et al., 1989) markers were also included in the analysis. Use of such a mixture of codominant and dominant markers greatly improved the power of the analysis. Of 343 markers (111 RAPDs, 215 RFLPs, and 17 STSs) followed for segregation, 312 were placed on the map, with 31 markers (19 RAPDs and 12 RFLPs) remaining unlinked. Subsequently, Bradshaw and Stettler (1995) were able to use this map effectively to identify several quantitative trait loci (QTLs) for a variety of traits.
In conclusion, it has now become clear that, in spite of their dominance, RAPD markers can be used, either alone or in combination with other codominant markers, to construct saturated linkage maps in forest tree species, even when diploid tissues are assayed. As the future widespread availability of such maps will lead to the identification of QTLs affecting commercially important traits in many forest tree species, their construction is the first step towards the use of marker-assisted selection as an important tool in forest tree improvement programs.

Chapter 4

SINGLE-TREE GENETIC LINKAGE MAPPING IN WHITE SPRUCE USING DNA FROM HAPLOID MEGAGAMETOPHYTES

Introduction

The previous chapter described a study into the potential for the use of dominant RAPD markers for genetic linkage mapping in forest tree species using diploid tissues. It showed that the limitation of dominance could be overcome when diploid tissue is assayed by following the segregation of only those markers in test cross configurations. Alternatively, for those dominant RAPD markers heterozygous in both parents, significant deviation from the 9:3:3:1 phenotypic ratio expected in the progeny for a pair of unlinked loci could be detected by following the segregation among much larger numbers of progeny. In conifers, an alternative possibility existed, due to the availability of maternally derived haploid megagametophyte tissue in conifer seeds. As the conifer megagametophyte is derived from a single product of maternal meiosis (Gifford and Foster, 1989), a collection of megagametophytes from the seeds of a single tree can be thought of, for linkage mapping purposes, as a set of haploid progeny of that tree. With haploid progeny, dominance would no longer be an issue. Hence, all RAPD markers heterozygous in the maternal tree, and therefore segregating in the megagametophytes, could be mapped relative to each other using reasonable numbers of progeny (i.e., megagametophytes).
This approach should then be much more efficient than the test cross strategy using diploid tissues, under which it will not be possible to map many heterozygous RAPD loci because they are either heterozygous or homozygous dominant in the other parent. Furthermore, controlled crosses would not be necessary, a particular advantage in conifer breeding programs that often involve half-sib rather than full-sib progeny. The approach should also be equally applicable in those angiosperms for which haploid cell culture, dihaploid plants or recombinant inbred lines are available. The objectives in this study were to assess the utility of the RAPD marker technique with megagametophyte DNA from white spruce, Picea glauca (Moench) Voss, especially in terms of reproducibility of banding patterns (i.e., when identical reactions are repeated), inheritance of marker bands and the level of polymorphisms revealed, and, if found satisfactory, to demonstrate the use of the RAPD technique to construct tree-specific genetic linkage maps. The feasibility of this approach depended in part on how much DNA, if any, could be isolated from a single white spruce megagametophyte, and on how little DNA needed to be used in a RAPD reaction to obtain reliable and reproducible banding patterns. The use of white spruce presented a particular challenge, as many other important conifers have larger megagametophytes.

Materials and Methods

Plant material

Seeds were collected from open-pollinated cones of the terminal weevil (Pissodes strobi Peck)-tolerant parental tree PG29 at the Province of British Columbia Forest Service interior spruce breeding orchard at Vernon, B.C.

DNA isolation

DNA was extracted from megagametophytes of individual seeds by a modified CTAB procedure (Wagner et al., 1987). To avoid the potential contamination of DNA samples with RAPD products, isolations were performed under laminar flow with dedicated pipettors.
Seeds were imbibed in water for 4 hours to overnight at room temperature prior to dissection for removal of the diploid embryo and outer brown "scale" (integument) covering the megagametophyte. The isolated tissue was ground in a microcentrifuge tube containing 30 µl wash buffer (50 mM Tris HCl pH 8.0, 25 mM EDTA, 0.35 M sorbitol, 0.1% β-mercaptoethanol) using a motorized pestle grinder. An additional 236 µl of wash buffer was added to the homogenate, followed by 53 µl of 5% sarkosyl, mixing by inversion and incubation at room temperature for 3-5 minutes. After addition of 46 µl 5 M NaCl and 37 µl 8.6% CTAB in 0.7 mM NaCl and mixing by inversion, the tubes were incubated at 65°C for 15 minutes. RNA was digested by addition of 2 µl RNAse (1 mg/ml) and incubation at 37°C for 15 minutes. The homogenate was extracted with one volume phenol:chloroform:isoamyl alcohol (24:24:1) and phases separated in a microcentrifuge run at maximum speed for 5 minutes. A second extraction with 1 ml of ether was then performed. DNA was precipitated by addition of 2.5 volumes of ice cold 100% ethanol and incubation at -20°C for 1 hour to overnight. The DNA was pelleted and washed with 1 ml 70% ethanol, briefly dried and resuspended in 1X T/E (10 mM Tris HCl pH 8.0, 1 mM EDTA). DNA concentration was estimated by electrophoresis of 1/40th of the sample on 1% agarose gel and visual comparison with known amounts of phage lambda DNA run as standards.

RAPD assay

The optimized PCR conditions for the use of megagametophyte DNA were a modification of the procedure reported by Williams et al. (1990). This involved a reduction in the amount of genomic DNA and Taq DNA polymerase relative to that reported in the previous chapter (Chapter 3) and the corresponding publication (Carlson et al., 1991) with conifer needles or buds. To avoid potential contamination of the RAPD reactions with RAPD products from previous reactions, reactions were set up under laminar flow using dedicated pipettors.
The components of our optimized PCR reactions consisted of 2 ng DNA, 2.3 mM MgCl2, 0.3 µM primer, 200 µM each dNTP and 0.625 U Perkin Elmer Cetus Amplitaq enzyme per reaction volume of 25 µl. Template DNA (in 5 µl T/E) was overlaid with 25 µl oil and denatured at 94°C. The remaining components made up in a master mix were then aliquoted to individual tubes containing the DNA. The tubes were mixed by tapping and then spun briefly. Amplification involved 45 cycles of 1 minute at 94°C, 1 minute at 36°C and 2 minutes at 72°C in a Perkin Elmer Cetus DNA Thermal Cycler. Amplification finished with an incubation at 72°C for 10 minutes followed by 4°C soak until recovery. Amplified products were separated by gel electrophoresis on 1.4% agarose and detected by ethidium bromide staining.

Primers

The oligonucleotide decamers were synthesized on an Applied Biosystems Inc. PCR-MATE DNA synthesizer at U.B.C. and purified using NAP-5 (Pharmacia) disposable columns. The sequence of each primer was arbitrary and generated on a random basis within the constraints of G+C content between 50 and 80% and no palindromic sequences including 6 or more nucleotides.

Screening of primers and scoring of segregating markers

In order to identify primers that detect heterozygous loci in the parent tree, each primer was used in PCR reactions with DNA from the megagametophytes of 5 seeds. Those that detected one or more polymorphic bands among the 5 megagametophytes were used for segregation analysis of the polymorphic band among an additional set of 47 megagametophytes. Presence of a band was scored as a plus (+) while absence of the band was scored as minus (-). Scoring was done independently by two observers. In cases where presence or absence of a band was unclear, it was recorded as missing data.

Linkage analysis

Goodness of fit to a 1:1 Mendelian ratio of segregating loci was tested by Chi-square analysis at the 0.01 significance level.
Markers that did not segregate according to the expected ratio were excluded from the linkage analysis. Linkage was analyzed using the computer program MAPMAKER/EXP version 3.0 (Lander et al., 1987; Lincoln et al., 1992) using an "f2 backcross" data file. In order to allow MAPMAKER to identify linkage between loci in repulsion phase, each locus was entered twice: once in a "natural" fashion, where the presence of the band (+) was coded as "H" (heterozygote) and the absence of the band (-) was coded as "A" (homozygote), and once in an "inverted" fashion (+ as A; - as H; Nelson et al., 1993). The two codings were distinguished by adding the letter "P" ("positive", for the naturally coded data) or "N" ("negative", for the inverted data) to the end of the locus label. Loci were sorted into linkage groups by two-point analysis (GROUP command) with linkage criteria of a minimum LOD score of 3.0 and a maximum recombination fraction of 0.4. For each linkage group obtained from the two-point analysis that contained more than two loci, the relative order of the loci was determined via multipoint analysis (COMPARE command), and the order with the highest log likelihood was taken as correct. Distances between loci, expressed in centiMorgans (cM), were calculated using the Kosambi mapping function (Kosambi, 1944). As a result of coding each locus twice, two identical sets of linkage groups were obtained, differing only in the letter ("P" or "N") at the end of each locus label. At the end of the analysis, one linkage group of each identical pair was arbitrarily discarded. A second map was also constructed in which linkage groups were formed by two-point analysis with a more relaxed minimum LOD score of 2.4 (and a maximum recombination fraction of 0.4).

Results and discussion

DNA extraction from megagametophytes

The extraction of DNA from megagametophyte tissue was a scaled-down version of a protocol used for extraction of DNA from conifer needles (Wagner et al., 1987).
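To make the double-coding scheme described under Linkage analysis concrete, the following sketch shows why entering each locus in both "natural" and "inverted" form lets a coupling-only two-point test detect repulsion-phase linkage. It is an illustration under stated assumptions and not a reproduction of MAPMAKER's internal algorithm: the function names, the "?" input symbol and "-" output symbol for missing data, and the example scores are all hypothetical. The two-point LOD is the standard maximum-likelihood form for a 1:1 segregating population, and the distance uses the Kosambi mapping function.

```python
import math


def double_code(label, scores):
    """Return the natural ('P') and inverted ('N') codings of one locus.
    scores is a string of '+' (band present), '-' (absent), '?' (missing)."""
    natural = {"+": "H", "-": "A", "?": "-"}
    inverted = {"+": "A", "-": "H", "?": "-"}
    return {
        label + "P": "".join(natural[s] for s in scores),
        label + "N": "".join(inverted[s] for s in scores),
    }


def two_point_lod(coded_a, coded_b):
    """Recombination fraction and LOD score for two coded loci,
    treating matching codes as non-recombinant (coupling phase)."""
    pairs = [(a, b) for a, b in zip(coded_a, coded_b) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.5, 0.0
    recombinants = sum(1 for a, b in pairs if a != b)
    r = recombinants / n
    if r >= 0.5:
        return r, 0.0  # no evidence for coupling-phase linkage
    lod = n * math.log10(2.0)
    if recombinants > 0:
        lod += recombinants * math.log10(r)
    lod += (n - recombinants) * math.log10(1.0 - r)
    return r, lod


def kosambi_cm(r):
    """Kosambi map distance in centiMorgans for recombination fraction r < 0.5."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


if __name__ == "__main__":
    a = double_code("244a", "++--++--+-")   # hypothetical scores
    b = double_code("289a", "--++--++-+")   # band in repulsion with 244a
    print(two_point_lod(a["244aP"], b["289aP"]))  # natural vs natural
    print(two_point_lod(a["244aP"], b["289aN"]))  # natural vs inverted
    print(round(kosambi_cm(0.1), 2))
```

In this toy case the two naturally coded loci never match, so no coupling-phase linkage is found, whereas the natural coding of one locus matches the inverted coding of the other at every seed. This is also why the analysis yields two mirror-image sets of linkage groups, one of which is discarded.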
Adequate homogenization of the megagametophyte tissue appears critical for optimal yields. Extraction with ether after the phenol/chloroform extraction also improved yield and purity. DNA yields per seed varied from 300 to 500 ng. Since only 2 ng of DNA was needed per RAPD reaction to obtain reliable and reproducible results, 150 to 250 reactions were possible with the DNA extracted from each megagametophyte. Hence it is clear that, even with the relatively small megagametophyte of white spruce, enough DNA can be obtained to allow genetic map construction via RAPD markers.

Primer base composition, amplification products and polymorphisms

The nucleotide sequence of each primer was unique and arbitrary. Composition followed the recommendations of Williams et al. (1990), that is, 10 bases in length, a G+C content between 50 and 80%, and no palindromic sequences of 6 or more nucleotides. Once the PCR reaction conditions were optimized, it was found that the number of detectable amplification products on horizontal agarose gels generally ranged from 5 to 12, and only rarely were fewer than four products generated (Figures 4.1, 4.2 and 4.3). Each primer produced a unique banding pattern. No obvious association was found between primer sequence or G+C content and the number of detectable amplification products or the number of segregating bands (Table 4.1).

Reproducibility of amplification products

One of the first questions addressed was the reproducibility of the PCR amplification products generated by specific primer/DNA combinations. This was initially tested by setting up five identical reactions (same primer and template) for several primer/template combinations. The results of these initial tests were highly satisfactory. With the exception of very faint bands or excessively large (> 1500 bp) or small (< 300 bp) bands, RAPD patterns were highly (> 95%) reproducible.
Also, after scoring segregating bands for all polymorphic primers among the 47 seed genotypes, three primers were chosen at random and reactions repeated under the same conditions and the five segregating bands produced were scored again. The genotypes obtained were then compared to those originally scored. With this test a repeatability estimate of 97.4% (221 matches out of 227 comparisons) was obtained. This level of repeatability should be quite adequate for genome mapping applications. However, higher repeatability may be needed for fine scale mapping (i.e., within 1 or 2 cM) of a region of particular interest (i.e., containing an important QTL).

Figure 4.1. Ethidium bromide stained 1.4% agarose gel of RAPD products from 47 megagametophyte DNAs and primer 230. Lambda/HindIII size marker is shown at the extreme left lane of each panel (M). A no template control is shown in the extreme right lane of the lower panel (C). All remaining lanes contain RAPD products from the 47 megagametophytes. The asterisk denotes a blank lane due to amplification failure. Three segregating bands were scored (arrows).

Figure 4.2. Ethidium bromide stained 1.4% agarose gel of RAPD products from 47 megagametophyte DNAs and primer 282. Lambda/HindIII size marker is shown at the extreme left lane of each panel (M). A no template control is shown in the extreme right lane of the lower panel (C). All remaining lanes contain RAPD products from the 47 megagametophytes. The asterisk denotes a blank lane due to amplification failure. One segregating band was scored (arrows).

Figure 4.3. Ethidium bromide stained 1.4% agarose gel of RAPD products from 47 megagametophyte DNAs and primer 285. Lambda/HindIII size marker is shown at the extreme left lane of each panel (M). A no template control is shown in the extreme right lane of the lower panel (C). All remaining lanes contain RAPD products from the 47 megagametophytes. The asterisk denotes a failed reaction. Three segregating bands were scored (arrows).

Table 4.1. Random sample of primers showing G+C content, total number of amplification products and number of segregating bands per primer.

Primer  Sequence    #G+C  #Amplif. products  #Segregating loci (1:1)
271     GCCATCAAGA  5     5                  1
289     ATCAAGCTGC  5     6                  1
216     CATAGACTCC  5     7                  0
235     CTGAGGCAAA  5     7                  0
256     TGCAGTCGAA  5     8                  4
297     GCGCATTAGA  5     11                 1
231     AGGGAGTTCC  6     6                  1
225     CGACTCACAG  6     8                  2
278     GGTTCCAGCT  6     11                 3
229     CCACCCAGAG  7     5                  3
264     TCCACCGAGC  7     6                  0
295     CGCGTTCCTG  7     7                  3
218     CTCAGCCCAG  7     8                  2
298     CCGTACGGAC  7     8                  0
266     CCACTCACCG  7     9                  2
203     CACGGCGAGT  7     10                 0
210     GCACCGAGAG  7     10                 1
244     CAGCCAACCG  7     12                 3
300     GGCTAGGGCG  8     8                  2
285     GGGCGCCTAG  8     9                  3

Screening primers

Three hundred primers were screened in this study. In order to identify polymorphic primers each was screened with an initial subset of DNA from 5 megagametophytes. Since segregation of markers is expected to follow a 1:1 Mendelian ratio of presence:absence of marker band, the use of 5 seeds should give reasonable confidence (approx. 94% of the time) in identifying primers that detect heterozygous loci in the maternal parent tree. For the first 200 primers (#1-#200, U.B.C. RAPD Primer Synthesis Project), 26 primers revealed polymorphism among the 5 haploid genotypes. As many primers did not produce useful patterns (i.e., no bands, high background, or a lack of strong, scorable bands) with the RAPD protocol used at this stage, adjustments to the protocol were made so that successful results were obtained with a higher proportion of primers. The amount of enzyme per reaction was halved while MgCl2 and primer concentration were increased (see Materials and Methods). Under these new experimental conditions 43 of the third set of a hundred primers (#201-#300, U.B.C.
RAPD Primer Synthesis Project) detected one or more scorable bands segregating among the 5 genotypes. White spruce is a recently domesticated, open-pollinated species with high levels of genetic variation based on both phenotypic (Murray and Skeates, 1982) and isozyme (King et al., 1984; Cheliak et al., 1984b; Alden and Loopstra, 1987) analyses. Thus the higher level of polymorphism seen with the third set of primers is more consistent with the high level of heterozygosity expected in spruce.

Scoring segregating markers and linkage mapping

From the 69 primers that were initially characterized as revealing polymorphic bands in the screening process, the 44 best, based on the scorability of the polymorphic band(s) produced, were used for the segregation analysis involving 47 megagametophytes. These 44 primers revealed 59 marker bands that segregated in conformance with the expected 1:1 Mendelian ratio. Thirty-five of the 44 primers detected one segregating locus, 3 detected 2 loci each, and 6 detected 3 loci each. Thirteen additional loci were scored that did not conform to the expected 1:1 ratio, as determined by Chi-square tests (α = 0.05). These were excluded from the linkage analysis, primarily because of suspicion regarding their reliability. In both agronomic and forest tree species the occurrence of distorted ratios in segregating RFLP and isozyme alleles is not uncommon (Cheliak et al., 1984a; Edwards et al., 1987; McCouch et al., 1988; Slocum et al., 1990; Heun et al., 1991; Landry et al., 1991 and references therein; Beavis and Grant, 1992; Tulsieram et al., 1992a). There appears to be a lack of consensus among researchers as to whether markers showing segregation distortion should be used in linkage mapping studies.
Some authors include them in the linkage analysis from the beginning (e.g., Slocum et al., 1990; Heun et al., 1991; Landry et al., 1991; Bradshaw and Stettler, 1994), others exclude them initially and then later place them onto a "framework" map (e.g., McCouch et al., 1988; Grattapaglia and Sederoff, 1994), while others exclude them completely (e.g., Beavis and Grant, 1992). Clustering of distorted loci within a linkage group may be indicative of the presence of an embryonic or gametic lethal or semi-lethal allele. The presence of such an embryonic lethal causing local distortion was demonstrated by Bradshaw and Stettler (1994). However, it has recently been shown that classical tests for genetic linkage and marker order (such as those used by MAPMAKER) can be considerably biased when distorted loci are analyzed, particularly when dominant markers are used (Lorieux et al., 1995a; Lorieux et al., 1995b). As a result, segregation distortion can generate false-positive or false-negative linkages or lead to incorrect marker orders. The authors of the latter two studies suggest that special tests developed by Bailey (1949) be used in place of classical tests when analysing distorted loci. They note that when non-distorted loci are analysed, these tests simplify and become equivalent to the classical tests. Hence it would be helpful if these tests were incorporated into future versions of genetic linkage analysis programs such as MAPMAKER.

Of the 59 segregating loci, 37 were mapped to 16 linkage groups, while the remaining 22 were unlinked, using a minimum log likelihood ratio (LOD) score of 3.0 and a maximum recombination fraction of 0.4 (Figure 4.4). One linkage group was comprised of 5 loci, two linkage groups were each comprised of 3 loci, and the remaining 13 groups all involved pairs of loci.
For the three linkage groups containing 3 or more loci, the likelihood ratio between the order shown (highest likelihood) and the order with the next highest likelihood is given beneath each linkage group (Figure 4.4). The 16 linkage groups covered a total of 343.2 cM. Based on the haploid DNA content, it has been estimated that the conifer genome size is approx. 2500 cM (Neale and Williams, 1991). Thus many more markers need to be scored before a complete, saturated map will be obtained. When these markers are added it is expected that most of the unlinked markers found in this demonstration study will be integrated into the resulting linkage groups. The construction of a saturated map was beyond the goals of this study.

Figure 4.4. Partial RAPD marker linkage map of white spruce based on a minimum LOD score of 3.0 in the two-point analysis. The loci are listed on the right and map distance in Kosambi centiMorgans on the left. Loci identifiers begin with a three-digit number denoting the U.B.C. primer number, followed by a single small letter to signify the particular band scored, and then by a capital P or N to denote how that locus was coded ("P" denotes "Positive" or "natural" coding where + = H; "N" denotes "Negative" or "inverted" coding where + = A; see Materials and Methods for details). Numbers beneath linkage groups with 3 or more loci are likelihood ratios between the order shown and that of next highest likelihood.

Figure 4.5. Partial RAPD marker linkage map of white spruce based on a minimum LOD score of 2.4 in the two-point analysis.

It has been shown that LOD scores of 3 or higher provide highly conservative tests for linkage (Gerber and Rodolphe, 1994b). Such conservative tests are needed when a large number (e.g., 200-300) of markers are scored for saturated map construction. However, because only 59 markers were analyzed for linkage in this study, it was of interest to explore possible further linkage relationships previously rejected at LOD 3.0. Hence, a second linkage map was constructed based on linkage groups formed by two-point analysis with a smaller LOD score of 2.4 (i.e., 250 to 1 odds of pairwise linkage) and is shown in Figure 4.5. This map is, of course, less reliable than that shown in Figure 4.4 based on a LOD score of 3.0. As expected more of the markers were mapped into linkage groups (46 markers on 16 linkage groups; 13 unlinked markers). Also, larger linkage groups were obtained: 2 groups of 7 loci each, 1 group of 4 loci, 2 groups of 3 loci each, and 11 groups of 2 loci each were obtained covering a total of 577.1 cM. It would be interesting to see how many of the more tentative linkage relationships given in this "LOD 2.4 map" would be confirmed by the addition of more markers bridging the gaps between pairs of loci linked at LOD scores of less than 3.0.
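The figures quoted for the two maps can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid with hypothetical function names. It converts a LOD threshold to odds, tallies loci and groups from the group-size counts given in the text, and expresses the summed map length as a fraction of the approximately 2500 cM genome estimate. That last ratio is a crude indicator, since it ignores the genome flanking each marker and the gaps between groups.

```python
def lod_to_odds(lod):
    """Odds in favour of linkage implied by a LOD score (log10 scale)."""
    return 10.0 ** lod


def tally(group_sizes):
    """group_sizes maps 'loci per group' to 'number of such groups'.
    Returns (number of groups, number of mapped loci)."""
    n_groups = sum(group_sizes.values())
    n_loci = sum(size * count for size, count in group_sizes.items())
    return n_groups, n_loci


def fraction_of_genome(map_cm, genome_cm=2500.0):
    """Summed map length as a fraction of the estimated genome length."""
    return map_cm / genome_cm


if __name__ == "__main__":
    lod30_map = {5: 1, 3: 2, 2: 13}          # group sizes at LOD 3.0
    lod24_map = {7: 2, 4: 1, 3: 2, 2: 11}    # group sizes at LOD 2.4
    print(round(lod_to_odds(2.4)), round(lod_to_odds(3.0)))
    print(tally(lod30_map), tally(lod24_map))
    print(round(fraction_of_genome(343.2), 3), round(fraction_of_genome(577.1), 3))
```

A LOD of 2.4 corresponds to odds of about 251:1, consistent with the "250 to 1" quoted above. The tallies reproduce 37 loci in 16 groups at LOD 3.0 and 46 loci in 16 groups at LOD 2.4, and the two maps span roughly 14% and 23% of the estimated genome length.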
Identification of QTLs using single-tree maps derived from megagametophytes

It would be most efficient if map construction and analysis of segregating traits of interest were performed on the same population. As traits will not be measured in the megagametophytes themselves, it might appear on the surface that construction of maps using megagametophytes is a dead end with regard to the localization of loci contributing to traits of interest. However, in this regard the biology of conifer seed development again comes to the rescue. Fortunately, the genetic makeup of a haploid megagametophyte is identical to the maternal contribution to (i.e., the maternal haplotype of) the corresponding embryo, as both the megagametophyte and the egg cell are derived from the same product of maternal meiosis (Gifford and Foster, 1989). Hence, there is an opportunity to score segregating maternal markers in megagametophytes and measure phenotypic traits in trees derived from the corresponding embryos (O'Malley et al., 1994). This could be accomplished either by embryo rescue, where embryos are carefully dissected and then germinated on nutritive media mimicking the composition of megagametophytes, or by recovering megagametophytes after they are no longer required by the germinated seedlings (but before the seed coats fall off) and, if possible, isolating DNA from the megagametophytes at that time. In this manner it may be possible to map segregating trait loci derived from the maternal tree only, provided that the associated genetic variance can be distinguished from that derived from heterozygous trait loci in the paternal parent or from trait loci polymorphic among a mixture of paternal parents. For this to be realized, the theoretical aspects of this approach must be further explored (which is definitely beyond the scope of this chapter).
Applications of the RAPD marker system in forest trees

The use of RAPD markers in conifers presents a realm of opportunities in genetics and breeding not previously possible with isozymes, due to their limitation in number. The study of simply inherited traits or more complex quantitative traits could be accomplished as demonstrated for a number of agronomic crops (Edwards et al., 1987; Nienhuis et al., 1987; Osborn et al., 1987; Stuber et al., 1987). As the models used for annual inbred species may not be appropriate for outcrossing forest species, new methods need to be developed, based either on the available breeding populations or on modified population structures. Marker-assisted selection (MAS) permits early selection, an advantage that is of considerable importance in conifers due to their long rotation age. Note, however, that there are many potential obstacles that may impede the application of MAS in forest tree breeding programs. These obstacles are considered in detail by Strauss et al. (1992b). Some of these obstacles, including inconsistent marker-trait locus associations among genetic backgrounds due to linkage equilibrium (or insufficiently high levels of linkage disequilibrium), can be overcome by construction of genetic maps for every family, or single-tree maps for every tree, of interest in the breeding program. The high potential for automation of the RAPD process, and its relative ease compared to an RFLP-based approach, may soon make this goal feasible. At any rate, Strauss et al. (1992b) note that the identification of QTLs using genetic linkage maps will provide useful information regarding the genetic architecture of quantitative traits in forest trees. The high number of marker bands produced with each primer using the RAPD technique seems to make this approach ideal for genotyping/fingerprinting parental trees, clones or any seed orchard material for identification purposes and for analyzing pollen contribution in controlled crosses.
This provides a tool in addition to isozymes to verify the many assumptions involved in the genetic management of domesticated tree species (Muona, 1990).

RAPD analysis of variation at the nuclear genome level has advantages over RFLPs in genetic diversity studies. For example, a single primer usually amplifies multiple loci, and hence a larger portion of the genome can be analyzed. The disadvantage of the dominance of RAPD markers can be overcome by the use of haploid megagametophytes as a source of DNA, as demonstrated in this study. As is commonly done in isozyme studies in conifers, the analysis of 5 or more megagametophytes from each sampled tree will allow determination of diploid genotypes. In evolutionary and population genetic studies, the RAPD analysis of the nuclear genome can add a new dimension to the available data currently based on organelle genomes and isozymes. The potential availability of large numbers of RAPD markers should aid in the drawing of population genetic inferences that depend on large samples of loci from the genome in question. Furthermore, the availability of mapped loci will be helpful in population genetic studies, allowing the selection of effectively unlinked (>50 cM apart) loci providing genome-wide coverage. Conversely, the use of loci linked at known map distances will be useful in studies of linkage disequilibrium in populations (e.g., Epperson and Allard, 1987; Hastings, 1990).

The advantages that haploid DNA offers when using the RAPD marker system for genetic studies will not be limited to conifers but should find application in angiosperm plants as well through the use of anther culture. Haploid plants have been produced by anther culture in 85 genera belonging to 38 families (Srivastava and Johri, 1988). Since this study has shown that a linkage map can be constructed from one microgram or less of DNA per haploid genotype, even microcalli from anthers would be a suitable source of template DNA.
This extends the RAPD genome mapping technique to those many plant species (including angiosperm trees) that may only respond transiently to anther culture by producing small haploid calli but not necessarily haploid plants. Non-destructive assays, involving the isolation of DNA from small amounts of tissue excised from microspore-derived plantlets for RAPD analysis, may also be possible, as has been demonstrated by Horn and Rafalski (1992) in Brassica.

Recent progress in RAPD mapping in conifers

The demonstration study reported in this chapter was published in 1992 (Tulsieram et al., 1992b). Since that time, RAPD markers have become increasingly popular, and they have been used in hundreds of research articles involving genetic mapping, fingerprinting, and population genetics. This is in itself testimony to their relative ease of use, utility, and reliability, as well as to the pioneering nature of the work described in this and the previous chapter. Here, only those publications concerning the construction of RAPD marker based linkage maps using megagametophytes in coniferous trees will be briefly considered. Shortly after the completion of this demonstration study, construction of an extensive RAPD map for a single loblolly pine clone was undertaken by a large group at Ron Sederoff's lab at North Carolina State University (Grattapaglia et al., 1991). The nearly saturated linkage map produced, following essentially the same strategy as that used in the study described in this chapter, contained 191 mapped loci. The fact that this map was constructed in just 60 days is further testimony to the power of this approach. Single-tree linkage maps based on RAPD marker segregation in megagametophytes have now been constructed in several additional conifer species, including slash pine (Pinus elliottii var.
elliottii; Nelson et al., 1993), Norway spruce (Picea abies; Binelli and Bucci, 1993), longleaf pine (Pinus palustris; Nelson et al., 1994) and Scots pine (Pinus sylvestris; Yazdani et al., 1995). For the map in Norway spruce, the mapping process was partially automated by using a Hamilton Microlab ATplus robot to set up the RAPD reactions (Binelli and Bucci, 1993). These authors intend to use their map as a tool in population genetic studies in Norway spruce. Nelson et al. (1993) hope to use their map to localize genes involved in resistance/susceptibility of slash pine to fusiform rust. The goal of our lab is to use spruce maps to find markers linked to terminal weevil (Pissodes strobi) resistance genes.

In conclusion, it is clear that the approach to genetic linkage mapping in conifers first taken in the demonstration study described in this chapter (and almost simultaneously by Grattapaglia et al., 1991) has now become a popular and mainstream approach among conifer geneticists. Hopefully, further technological progress combined with further research will allow the major obstacles to the use of MAS in forest tree breeding, described by Strauss et al. (1992b), to soon be overcome, so that genetic linkage mapping in forest trees will quickly evolve from a basic research exercise to an economic imperative.

Prospects for MAS as a tree improvement tool

A brief consideration of the medium to long term commercial potential of MAS would be pertinent at this point. For MAS to be a profitable exercise (e.g., for a forest company), cost/benefit analyses would need to demonstrate that the additional gain obtained by using MAS in conjunction with conventional phenotypic selection (see equation 1 in Strauss et al., 1992b) would more than offset the cost of performing MAS.
It is likely that, in the short term, most of the cost of initial identification and verification of marker-QTL associations (MQTLs) will not be borne by forest companies themselves, but by research grants to universities and government agencies. Once potentially useful MQTLs have been identified and verified at the basic research level, industry is more likely to show interest and provide substantial support. However, it should be noted that forward-thinking companies may be willing to invest in MAS without expecting an immediate return on their investment, just so that they will be in an advantageous competitive position to generate and profit from future advances in knowledge and technology in this area. One such forward-thinking company is Tasman Forestry Ltd. of New Zealand, which is currently utilizing markers linked to QTLs affecting wood density in its radiata pine breeding program (Gleed, 1995).

Identification and verification of MQTLs involves many considerations. To begin with, very large family sizes are needed for identification of MQTLs, especially for traits of low heritability. If a particular trait is targeted, however, the number of progeny within a family to be assayed for the markers can be greatly reduced by genotyping only the phenotypic extremes (Lander and Botstein, 1989). Dividing each extreme into two independent groups will allow two independent analyses to be performed. Spurious MQTLs (due to Type I error) will be unlikely to show up in both analyses, and thus can be detected in this manner. Consideration of multiple families (e.g., the superior families identified by among-family phenotypic selection) leads to three problems. First, as noted above, among families in which the same QTL and marker are segregating, a given MQTL may not be consistent, due to linkage equilibrium. Hence the linkage relationship (coupling or repulsion) must be established for every such family.
Second, although the same QTL may be segregating in two families, a linked marker identified in one may not be segregating (i.e., is homozygous) in the other. Third, it is probable that different sets of QTLs are segregating in different families, although these sets are likely to overlap between families. These three problems taken together mean that marker-QTL analyses must be performed for every family of interest.

Multiple environments must also be considered. The general applicability of a given QTL will depend upon the magnitude of genotype by environment interaction (G X E) for that particular QTL. QTLs that show minimal G X E will be the most useful. Non-additive interactions within (dominance) and between (epistasis) QTLs are also possible. It is here that molecular markers could make a major contribution to the advancement of the field of tree improvement. Current approaches to tree improvement are based on the assumption of additivity among QTLs. Gross violation of this assumption due to prevalent epistasis among loci (there must be interaction among genes and gene products!) will preclude a lasting response to long term directional selection. Molecular markers should provide powerful tools for the diagnosis of epistatic interactions. Our selection strategies can then be adjusted accordingly (e.g., Kelly et al., 1995).

One way to evaluate the effectiveness of MAS without being delayed by the long generation times of forest trees would be to perform a "retrospective" MAS trial. Where three generations of germplasm from multiple families are still available (likely only in loblolly or radiata pine), "mock" MAS could potentially be performed and its effectiveness compared to the known results of phenotypic selection. Unfortunately, it is unlikely that adequate sample sizes of progeny from the appropriate crosses are available to pursue this time-saving avenue. However, the possibility would certainly be worthy of examination.
The above considerations suggest that there is a great deal of challenging and potentially fruitful research to be done regarding the use of MAS in forest trees. It is very encouraging that at least one forestry company (Tasman Forestry Ltd.) is taking the prospect of MAS seriously enough to invest in this area.

Chapter 5

NUCLEAR RFLP ANALYSIS OF GENETIC DIVERSITY IN WESTERN RED CEDAR

Introduction

Western red cedar (Thuja plicata Donn ex D. Don) is a long-lived coniferous tree species native to western North America. Its range is divided into two portions, a coastal portion ranging from northern California to southernmost Alaska, and an interior portion ranging from the panhandle of northern Idaho to just east of Prince George, B.C. (Figure 5.1). Within this range the species grows in a wide variety of ecosystems, with a wide variety of associated vegetation, although it is not typically found on dry sites (Fowells, 1965).

A number of studies have suggested that this species is extremely low in genetic diversity relative to other conifers. Isozyme studies have been conducted by Copes (1981), Yeh (1988), and El-Kassaby et al. (1994). In the first study no variation was found, while in the latter two studies very low expected heterozygosity estimates of 4% and 6%, respectively, were obtained. The latter two estimates place western red cedar as the third most genetically uniform conifer studied to date in terms of isozyme variation. Only two conifer species have less isozyme variation: the narrow California endemic Torrey pine (Ledig and Conkle, 1983) and red pine, a wide-ranging species of eastern North America (Fowler and Morris, 1977; Allendorf et al., 1982; Simon et al., 1986; Mosseler et al., 1991). The results from the isozyme studies in western red cedar are corroborated by studies of leaf oil terpene variation (von Rudloff and Lapp, 1979; von Rudloff et al., 1988).
The first study stated that western red cedar "has one of the lowest degrees of variability [in leaf oil terpene composition] found thus far in Northern American conifer species." The second terpene study confirmed this result.

Figure 5.1. Range map of western red cedar showing locations of sampled trees. The species range was divided into five geographic regions: the U.S. Coast (USC), the B.C. Coast (BCC), the Queen Charlotte Islands (QCI), Idaho (IDA) and the B.C. Interior (BCI). A pair of numbers is associated with each location (dot), the first indicating the number of trees from which DNA samples were successfully isolated, and the second, in parentheses, indicating the number of these DNA samples used in the nuclear RFLP analysis. Trees sampled from the same location were at least 100 meters apart.

Results from studies of quantitative traits have been less clear-cut. A provenance study of three Vancouver Island seed sources found no variation among provenances in 5-year height and survival (Bower and Dunsworth, 1988). A more detailed study of western red cedar populations from the southern half of the interior portion of the species range found among-population variation in only one third of the variables analysed, and within-population variation in only one half of the variables analysed, results that the author felt confirmed that genetic variation in this species is low (Rehfeldt, 1994). However, more recent provenance trials, involving a wider sample of the species, have uncovered higher levels of variation, and high heritabilities for some traits (Cherry, 1995).

This chapter describes a study of genetic variation using DNA markers, intended to provide additional information regarding the level of genetic diversity in western red cedar. The initial intent was to use RAPD markers (Williams et al., 1990) in this endeavour, following their development and use in genetic mapping applications (Chapters 3 and 4).
The disadvantage of the dominance of RAPD markers could be overcome in conifers by the use of haploid megagametophyte tissue as a source of DNA for the assay. However, in western red cedar, the megagametophytes are very small, and despite considerable effort, not enough DNA could be isolated from western red cedar megagametophytes for this approach to work. Within-population allele frequencies could have been estimated from dominant RAPD markers derived from diploid tissue (e.g., foliar DNA) if populations were known to be in Hardy-Weinberg equilibrium (Lynch and Milligan, 1994). Standard genetic diversity parameters (e.g., expected heterozygosity) could then be estimated for comparison to the isozyme data. However, as western red cedar is known to have an extremely high selfing rate (68% according to El-Kassaby et al., 1994), random mating (and thus Hardy-Weinberg equilibrium genotype proportions) within populations could not be assumed in this species. For these reasons, RFLP markers were chosen for use in this study, as they are codominant. To allow genetic interpretation and the estimation of standard genetic diversity parameters, single or low copy nuclear RFLP probes were developed. A DNA isolation protocol was optimized for the isolation from western red cedar foliage of the large quantity of high purity DNA needed for single copy nuclear RFLP analysis. DNA was isolated from more than 250 range-wide trees. For this thesis, time constraints allowed the RFLP analysis of only 90 of these samples, enough to get a clear picture of the amount of DNA-level genetic diversity in the species. The results, documented herein, show that DNA-level genetic diversity (i.e., expected heterozygosity) in this species is very similar to that measured with isozymes.

Materials and Methods

Sample collection

Branches from at least 50 western red cedar trees were sampled from each of 5 regions, covering most of the natural range of the species: 1) U.S. Coast; 2) B.C.
Coast; 3) Queen Charlotte Islands; 4) Idaho; and 5) B.C. Interior (Figure 5.1). With the exception of one tree from the University of British Columbia campus, all B.C. Coast samples were obtained from seed orchards (41 trees from Lost Lake Seed Orchard, Western Forest Products Ltd., Victoria; 8 trees from Mt. Newton Seed Orchard, Fletcher Challenge Ltd., Saanichton). Twelve of the Queen Charlotte Island samples were also obtained from the Lost Lake Seed Orchard. These seed orchards were established from branch cuttings collected from natural stands within coastal B.C. The remainder of the samples were collected directly from natural stands. At least 45 g of foliage was cut from the branches sampled from each tree, then frozen by immersion in liquid nitrogen and stored at -70°C until further processing.

DNA isolation

Western red cedar total DNA was isolated from foliage using a considerably modified version of the protocol of Wagner et al. (1987). Twenty grams of foliage was ground to a fine powder in liquid nitrogen using a stainless steel Waring blender. The recovered powder (approximately 16 grams) was then stored at -70°C until needed. The frozen powder was thoroughly resuspended in 150 ml cold extraction buffer (50 mM Tris [pH 8.0], 5 mM EDTA, 0.35 M sorbitol, 10% polyethylene glycol [mol. wt. 3350], 0.1% bovine serum albumin, 0.1% spermine, 0.1% spermidine, 0.2% 2-mercaptoethanol) and then filtered first through 1 layer and then 3 layers of cheesecloth. The filtrate was then centrifuged at 11,000 RPM in a Beckman JA-14 rotor for 15 minutes at 4°C and the supernatant discarded. The resulting pellet (far more gummy than that obtained with other conifer species) was resuspended in 8 ml wash buffer (50 mM Tris [pH 8.0], 25 mM EDTA, 0.35 M sorbitol, 0.2% 2-mercaptoethanol) using a glass rod. The resuspension of persistent gummy lumps was facilitated by pipetting up and down with a 10 ml pipette.
After adding more wash buffer to bring the suspension to 16 ml, 4 ml of 5% n-laurylsarcosine was added, followed by gentle mixing and incubation at room temperature for 15 minutes. The lysate was then brought to 0.7 M NaCl by adding 3.5 ml of 5 M NaCl and then to 1% CTAB by adding 3.1 ml of 8.6% CTAB/0.7 M NaCl and mixing. The lysate was then incubated in a 75°C water bath for 30 to 60 minutes with occasional mixing, followed by extraction with 25 ml of 24:1 chloroform:isoamyl alcohol and centrifugation at 5000 rpm in a table top centrifuge at room temperature. As recommended by Fang et al. (1992), to reduce co-precipitation of contaminating polysaccharides, the recovered aqueous phase was brought from 0.7 M NaCl to 1.4 M NaCl by addition of 1/5 volume 5 M NaCl prior to precipitation of nucleic acids with 2/3 volume room temperature isopropanol. The nucleic acid precipitate was then hooked out with a glass hook, washed twice in 70% ethanol/10 mM ammonium acetate, dried briefly under laminar flow, and then dissolved in 2 ml T/E buffer (pH 8.0) at 60°C. To each crude DNA sample in 2 ml T/E, 2.08 g of CsCl was added and dissolved at 60°C. This DNA/CsCl solution was then combined with 2.7 ml of an EtBr/CsCl mixture in T/E (prepared by adding 1.05 g of CsCl and 18.5 µl of EtBr [10 mg/ml] for every milliliter of T/E buffer). The resulting solution was then transferred to a 5 ml "quickseal" tube (Beckman) and spun at 55,000 RPM in a VTi 65 rotor (Beckman) in an ultracentrifuge at 25°C overnight. The DNA band was recovered with a wide gauge needle and syringe and the volume of the recovered band determined. Several rounds (4-6) of butanol extractions were then performed to remove all ethidium bromide from the recovered DNA. Two times the original band volume of sterile dH2O was then added, followed by 6 volumes (again based on the original band volume) of ethanol.
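The salt and detergent adjustments in the lysis steps above are simple mixing calculations, and the stated targets (0.7 M NaCl, 1% CTAB, 1.4 M NaCl) can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that volumes are additive and that the lysate contains no NaCl before the 5 M stock is added (none of the listed buffers contains NaCl). The function name `mix_concentration` is hypothetical and is not part of the original protocol.

```python
def mix_concentration(vol_a, conc_a, vol_b, conc_b):
    """Concentration after combining two solutions.

    Volumes must share one unit and concentrations another;
    volumes are assumed to be additive.
    """
    return (vol_a * conc_a + vol_b * conc_b) / (vol_a + vol_b)


# 16 ml suspension + 4 ml 5% n-laurylsarcosine (assumed NaCl-free).
lysate_ml = 16.0 + 4.0

# Add 3.5 ml of 5 M NaCl.
nacl_after_first = mix_concentration(lysate_ml, 0.0, 3.5, 5.0)
volume_ml = lysate_ml + 3.5

# Add 3.1 ml of 8.6% CTAB in 0.7 M NaCl.
ctab_percent = mix_concentration(volume_ml, 0.0, 3.1, 8.6)
nacl_after_ctab = mix_concentration(volume_ml, nacl_after_first, 3.1, 0.7)

# Aqueous phase at a nominal 0.7 M NaCl plus 1/5 volume of 5 M NaCl.
nacl_before_precip = mix_concentration(1.0, 0.7, 0.2, 5.0)

print(f"NaCl after first addition: {nacl_after_first:.2f} M")
print(f"CTAB after addition:       {ctab_percent:.2f} %")
print(f"NaCl after CTAB addition:  {nacl_after_ctab:.2f} M")
print(f"NaCl before isopropanol:   {nacl_before_precip:.2f} M")
```

Under these assumptions the computed values are about 0.74 M NaCl, 1.0% CTAB, and 1.42 M NaCl, which agree with the nominal concentrations given in the protocol.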
The pure DNA precipitate was then hooked out with a glass hook, washed twice in 4 ml 70% ethanol, dried briefly under laminar flow, and then dissolved in a minimal volume (50 to 500 µl depending on the size of the DNA precipitate) of T/E buffer (pH 8.0). Concentrations of the purified DNA samples were determined by gel electrophoresis and visual comparison to known amounts of intact lambda phage DNA.

Genomic library construction

Western red cedar total DNA (200 µg) was double-digested with methylation-sensitive restriction enzymes PstI and XhoI (500 U each enzyme) and then electrophoresed a short distance on a 1% agarose gel. A gel slice containing DNA fragments from 500 bp to about 9 kb was excised and the size-selected DNA fragments purified and concentrated using the Qiaex gel purification kit (Qiagen). The plasmid vector pBSII KS(+) (Stratagene) was also digested with the same restriction enzymes and gel purified. One fifth of the recovered western red cedar DNA fragments were combined in a 10 µl ligation reaction with approximately 100 ng of the digested purified vector and the ligation allowed to proceed overnight at 14°C. The ligation was diluted 5-fold and then 1 µl used to transform E. coli strain DH5α (BRL), according to the supplier's protocol. Putative transformants were identified as white colonies growing on LB plates containing 100 µg/ml ampicillin onto which 50 µl of X-gal (20 mg/ml) and 5 µl of IPTG (200 mg/ml) had been spread. Plasmid DNA from 435 of the putative transformants was purified using the miniprep protocol of Zhou et al. (1990).

Probe screening

Transformants were screened for those containing single or low copy number inserts in two stages. The first stage consisted of "reverse probing" using western red cedar total DNA as a probe.
Roughly equivalent amounts of each of the 435 plasmids were digested with PstI and XhoI to liberate the inserts, run on 1.5% agarose gels (with RNase in the loading buffer), and blotted as described below (except that the depurination step was omitted). Labeling of western red cedar total DNA (100 ng) and hybridization were as described below (except that polyethylene glycol was not used). Inserts that gave signals on the resulting autoradiograms were considered repetitive and not used in the next stage of screening. The second stage of screening consisted of probing a single HindIII-digested western red cedar DNA sample (the same sample used to construct the library) with 104 of the remaining inserts. Multiple single-lane blots were made so that up to 20 probes could be simultaneously screened using scaled-down hybridization reactions in 50 ml Falcon (Becton-Dickinson) tubes (two 50 ml tubes could fit in each of the 10 available hybridization bottles). Labeling reactions were also scaled down to reduce the cost of screening so many probes: one pre-aliquoted "rediprime" (Amersham) reaction was split among seven probes. Highly concentrated plasmids were digested with PstI and XhoI and the inserts were excised from low melting point agarose gels. The resulting gel slices were diluted with three times their weight of dH2O, melted at 65°C, and 10 µl aliquots were taken for denaturation, of which 2.5 µl was used as template in the scaled-down (6.5 µl) labeling reactions. Probes that gave small numbers of relatively weak bands in this stage of screening were considered putative single or low copy number nuclear loci.

Southern blotting and hybridization

Ten micrograms of HindIII-digested DNA sample (10 units HindIII per µg DNA) were loaded per lane on 0.8% agarose gels. The gels were depurinated for 10 minutes in 0.25 M HCl and then blotted overnight onto nylon membranes (Zetaprobe GT - Bio-Rad) by alkaline transfer with 0.4 M NaOH (Reed and Mann, 1985).
The blots were rinsed briefly in 2X SSC (20X SSC = 3 M NaCl, 0.3 M trisodium citrate) then baked in a vacuum oven for 1-2 hours at 80°C and stored dry at room temperature until needed. Prior to hybridization, the blots were incubated at 65°C in a hybridization oven (Robbins Scientific) for several hours in pre-hybridization solution (7% SDS, 0.25 M Na2HPO4 [pH 7.2], 1 mM EDTA, 20 µg/µl denatured herring sperm DNA). Radioactive probes were prepared by incorporation of [α-32P]-dCTP (3000 Ci/mmol) via random-primed labeling (rediprime - Amersham) of gel-purified inserts. Unincorporated nucleotides were removed using NICK columns (Pharmacia). Prehybridization solution was replaced with hybridization solution (the same as prehybridization solution plus the denatured probe at 3 to 5 million cpm/ml) and the hybridization allowed to proceed overnight. When multiple "population" samples on multiple blots were hybridized with a single probe, 10% polyethylene glycol [mol. wt. 3350] was added to the hybridization solution (see below under "Results"). Blots were then washed twice for 15 minutes each in 5% SDS / 20 mM Na2HPO4 [pH 7.2] / 1 mM EDTA and then once for 30-60 minutes in 1% SDS / 20 mM Na2HPO4 [pH 7.2] / 1 mM EDTA. All three washes were performed at 65°C. Blots were wrapped in cellophane and then exposed to X-ray film (Kodak Biomax) with one intensifying screen for 1-3 weeks prior to development. After autoradiography, the probes were stripped in 2.5 litres of 0.1X SSC / 0.5% SDS at 95°C for 30 minutes.

Copy number analysis

To confirm the copy number of putative single or low copy number nuclear probes, a reconstruction experiment (copy number analysis) was performed. From the gel slices obtained above, putative single or low copy number inserts were purified and concentrated using the Qiaex gel purification kit (Qiagen) and then quantified by visual comparison to known amounts of lambda phage DNA.
Amounts of each insert corresponding to the number of copies expected to be present in 2 µg of western red cedar nuclear DNA for sequences present at 100, 10, 1 and 0.1 copies per haploid genome were loaded onto adjacent slots on a slot blotter (Biodot SF - Bio-Rad). Calculations were made based on the genome size of Thuja occidentalis (6.3 pg per haploid nucleus - Dhillon, 1987), as the genome size of Thuja plicata had not been reported. Also included in each of these slots was 2 µg of herring sperm DNA as a "carrier". For each probe two additional slots, a control slot with 2 µg of herring sperm DNA only and a slot with 2 µg of western red cedar DNA, were also included. The resulting slot blots were cut into separate strips corresponding to the 6 slots (one row on the slot blotter) used for each probe. The appropriate strip was probed simultaneously with the population sample blots. Copy number analysis was also performed on homologous organelle DNA probes obtained from western red cedar by PCR. A portion of the mitochondrial gene coxI was amplified by PCR as described in Chapter 2. The intergenic spacer between the chloroplast trnL and trnF tRNA genes was amplified by PCR from western red cedar using the primers described by Taberlet et al. (1991). Slot blots for these organelle probes were made as described above, except that instead of 100, 10, 1 and 0.1, copy number equivalents of 10,000, 1000, 100 and 10 were loaded.

Data analysis

Data were analysed and statistics estimated with the aid of the BIOSYS-1 (release 1.7) computer program (Swofford and Selander, 1981; Swofford, 1989). The steps of this program that were invoked from the input file were as follows (corresponding options given in parentheses): VARIAB (FULLOUT, PCRIT=0); HDYWBG (EXACTP); SIMDIS (ALLCOEF); COEFOUT (ABOVE=1, BELOW=2); HIERARCHY (NLEVEL=1); FSTAT (OUTPUT=1); WRIGHT78 (NOCORR).
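To make the per-locus quantities behind these analysis steps concrete, the sketch below computes allele frequencies, observed heterozygosity, expected heterozygosity, and a fixation index from codominant genotypes at a single locus. It is an illustration only, not a reimplementation of BIOSYS-1: the function name `locus_stats`, the use of the 2n/(2n - 1) small-sample correction, and the example genotype counts are all assumptions made here rather than details taken from the thesis.

```python
from collections import Counter


def locus_stats(genotypes):
    """Summarize one codominant locus.

    genotypes: list of (allele, allele) tuples, one per diploid tree.
    Returns (allele_freqs, h_obs, h_exp, f), where h_exp carries the
    2n/(2n - 1) small-sample correction and f = 1 - h_obs / h_exp.
    """
    n = len(genotypes)
    allele_counts = Counter(a for pair in genotypes for a in pair)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}

    # Observed heterozygosity: direct count of heterozygous trees.
    h_obs = sum(1 for a, b in genotypes if a != b) / n

    # Expected heterozygosity under Hardy-Weinberg proportions.
    h_exp = (1.0 - sum(p * p for p in freqs.values())) * 2 * n / (2 * n - 1)

    # Fixation index; undefined for a monomorphic locus.
    f = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return freqs, h_obs, h_exp, f


# Hypothetical locus scored in 18 trees: 12 AA, 4 AB, 2 BB.
example = [("A", "A")] * 12 + [("A", "B")] * 4 + [("B", "B")] * 2
freqs, h_obs, h_exp, f = locus_stats(example)
print(freqs, round(h_obs, 3), round(h_exp, 3), round(f, 3))
```

A positive f indicates fewer heterozygotes than expected under random mating, and a value near zero indicates genotypes close to Hardy-Weinberg proportions. Averaging expected heterozygosity over all loci, monomorphic ones included, gives the kind of overall diversity figure reported for isozymes and RFLPs.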
Alternative estimates of F-statistics over all regions and loci for a one level hierarchy, along with their bootstrap confidence intervals, were obtained according to Weir and Cockerham (1984) using the program DIPLOID.FOR (Weir, 1990). F_IS values for each region across loci and their confidence intervals were also obtained using this program. To do so, separate data files were made for each region. To allow the program to run, the data from each region were artificially divided into two populations (within the same data file). In this manner, the F_IT estimate given by the program was actually an estimate of F_IS for that region. In addition, gene diversity analysis with corrections for small sample size according to Nei and Chesser (1983) was performed using the computer program GENESTAT-PC version 2.1 (Whitkus, 1988). An unrooted phylogenetic tree was constructed using programs from the computer package PHYLIP (version 3.5c; Felsenstein, 1993). The program GENDIST from this package was used to calculate Reynolds, Weir and Cockerham's (1983) genetic distance between all regions. The resulting genetic distance matrix was used by the program FITCH to construct an unrooted phylogenetic tree. Options G (Global branch-swapping) and J (Jumble taxon input orders - 10 different input orders were tested) of this program were invoked. The reliability of the dendrogram obtained was then tested by bootstrapping 100 times over loci using the programs SEQBOOT and CONSENSE.

Results

DNA isolation protocol

DNA was successfully isolated from 274 trees in total. During the course of DNA isolation several modifications to the protocol of Wagner et al. (1987) were made to improve both the yield and quality of the DNA obtained. The initial protocol was much closer to that of Wagner et al.
(1987), the only modifications being the omission of Miracloth in the filtration step (due to the much greater viscosity of resuspended, powdered western red cedar foliage relative to that of other conifers) and the omission of the Polytron step (an experiment had shown no difference in DNA yield without it). With the protocol in this form, DNA yields, after isolation from numerous (about 130) samples, turned out to be quite variable and generally low. Furthermore, the DNA samples were quite difficult to dissolve at the high concentrations needed for RFLP analysis, likely due to co-purification of substantial amounts of polysaccharides. Many (about 70) samples were degraded by the prolonged incubation at 60°C and occasional vigorous mixing needed to dissolve the DNA. Hence, these DNA samples had to be re-isolated. Examination under a fluorescent microscope of DAPI-stained material collected from the thick interface formed during the chloroform extraction step revealed that quite substantial amounts of DNA were bound to the interface debris. To increase the recovery in the aqueous phase of this bound DNA, the temperature of the water bath used in the incubation step (after the addition of CTAB and prior to chloroform extraction) was raised from 60°C (Wagner et al., 1987) to 75°C. This modification greatly increased both the DNA yields and the uniformity of DNA yield. However, the problem of polysaccharide contamination remained. Polysaccharides appear as a clear, gelatinous co-precipitate along with the DNA after ethanol or isopropanol precipitation, and can make the DNA sample more difficult to dissolve and, when dissolved, far more viscous than DNA alone (Fang et al., 1992; Michaels et al., 1994). Such was the case with the western red cedar DNA samples obtained. Furthermore, contaminating polysaccharides can interfere with digestion of DNA by most restriction enzymes (Fang et al., 1992; Michaels et al., 1994).
Although the western red cedar DNA samples appeared to be fully digested by the restriction enzyme HindIII (HindIII appears to be relatively insensitive to polysaccharide contamination according to Michaels et al., 1994), the polysaccharide contamination affected the mobility of the digested DNA on agarose gels. Different DNA samples migrated at different rates, likely due to differential degrees of polysaccharide contamination. Furthermore, the presence of contaminating polysaccharide seemed to cause a constriction in migrating DNA from 3 to 6 kb in size ("pinching in") resulting in a narrowing of that portion of the lanes (see lanes with non-CsCl treated samples in Figure 5.2), and in pronounced "smiling" of detected RFLP bands (data not shown). Samples which displayed the most pinching in also showed the greatest retardation of migration. The high viscosity of resuspended, powdered western red cedar foliage relative to that of other conifers may be related to the high degree of polysaccharide contamination of western red cedar DNA. Perhaps polysaccharides are particularly abundant in western red cedar foliage. Several different means of removing the contaminating polysaccharides were attempted. Additional rounds of chloroform or phenol/chloroform extraction, either prior to or after restriction enzyme digestion, did not alleviate the problem. Furthermore, the incorporation into the DNA isolation protocol of an increased NaCl concentration (1.4 M) prior to isopropanol precipitation, as suggested by Fang et al. (1992) to reduce polysaccharide co-precipitation, also did not alleviate the problem. Nor did ethanol precipitation of the DNA at high salt concentration (1.6 M prior to adding ethanol) after restriction enzyme digestion. Also, attempts to differentially precipitate the polysaccharides but not the DNA with 0.35 volumes of ethanol (Michaels et al., 1994) were not successful (lanes labeled "EtOH" in Figure 5.2).
The last resort was CsCl gradient purification, a labour intensive and expensive undertaking for a large number of DNA samples. However, this treatment proved to be effective at removing the contaminating polysaccharides (Figure 5.2), which formed a gelatinous pellet at the bottom of the gradients.

Figure 5.2. Contaminating polysaccharides affecting migration of western red cedar foliar DNA samples are removed by cesium chloride gradient purification. Shown are HindIII-digested DNA samples run on a 0.8% agarose gel. The numbers identify particular DNA samples. "CsCl" indicates samples digested after cesium chloride treatment and "EtOH" indicates samples digested after treatment with 0.35 volumes of ethanol according to Michaels et al. (1994). The remaining lanes contain samples that were digested before CsCl or EtOH treatment. See text for details. Lanes marked "M" contain Lambda/HindIII size standards.

Yet even this success did not come without complications. Following the standard protocol for CsCl gradient purification (Sambrook et al., 1989) often led to large losses of DNA. It was eventually determined that addition of concentrated ethidium bromide to the DNA/CsCl mixture was responsible for the losses. The addition of concentrated EtBr caused the immediate precipitation of polysaccharides surrounding the EtBr droplets. The polysaccharide - ethidium bromide complex was gelatinous in nature and appeared to swell up and "engulf" proximal DNA strands. The engulfed DNA then pelleted with the polysaccharides at the bottom of the gradients and was thus lost. To make matters worse, banding of the remaining DNA was often disrupted, as some of the DNA band appeared to be associated with the pelleted polysaccharide precipitate.
The solution to this problem was the reduction of the amount of ethidium bromide added (from 200 µl of a 10 mg/ml solution added to the 5 ml gradients to 40 µl) and the pre-dilution of this ethidium bromide in T/E and CsCl (half a gradient's worth) before adding it to the other half of the gradient containing the DNA (also in CsCl and T/E). In this manner, a visible precipitate was not formed, the DNA banded properly in the gradient, and the polysaccharides formed a pellet at the bottom of the gradient. Hence, the unacceptable loss of "precious" DNA was avoided.

Library construction and probe screening

The directional cloning strategy was quite effective at preventing re-ligation of the vector as the majority of the colonies obtained were white rather than blue. Furthermore, the majority of the white colonies from which plasmids were isolated contained inserts (426 out of 435). The first stage of screening, by reverse probing, was not as effective as expected, as the labeled western red cedar DNA hybridized detectably to only a small fraction of the blotted inserts (58 out of 426). It is likely that only highly repetitive sequences were screened out at this step. Of the 104 inserts (chosen from those that did not give signals after reverse probing) examined in the next stage of screening, use of the inserts as probes versus a single HindIII-digested western red cedar DNA sample, a large number (36) gave strong hybridization signals (visible after overnight exposure) or smears and likely contained repetitive nuclear or organelle sequences. However, an additional 36 of the 104 inserts gave small numbers (1-4) of relatively faint bands requiring long exposures and were thus considered putative single or low copy number nuclear probes. The remaining 32 of the 104 inserts either gave extremely faint bands or no bands at all.
Hybridization protocol

When the first two putative single copy nuclear probes were used against multiple restriction enzyme digested samples (90 samples) on multiple blots (5 blots), bands were not observed after 1-2 week exposures. An experiment in which the probe concentration was increased by using an entire labeling reaction (one 50 µl "rediprime" [Amersham] reaction) for only one blot (18 samples) showed that the amount and concentration of the probe was the limiting factor. However, as it would have been too expensive to use an entire labeling reaction for every 18 samples, an improvement of the protocol was needed. To this end, several modifications of the hybridization protocol were tested: 1) addition of the hybridization accelerator 10% dextran sulphate (Wahl et al., 1979; Devey et al., 1991) to the hybridization solution; 2) addition of the alternate hybridization accelerator 10% polyethylene glycol (Amasino, 1986; Budowle and Baechtel, 1990); 3) use of a more traditional SSC-based hybridization solution (6X SSC, 5X Denhardt's reagent, 0.5% SDS - Sambrook et al., 1989) rather than that given in the materials and methods above (based on Church and Gilbert [1984], as recommended by the instructions included with the nylon membrane [Zetaprobe GT - Bio-Rad]); and 4) use of the SSC-based hybridization solution plus 10% dextran sulphate. Significant enhancement of the hybridization signal was only obtained with the second modification, addition of 10% polyethylene glycol to the Church and Gilbert (1984) hybridization solution. In contrast, addition of 10% dextran to the Church and Gilbert hybridization solution had no discernible effect. Furthermore, the only effect of the addition of 10% dextran to the SSC-based hybridization solution was increased background (i.e., non-specific probe binding to the membrane).
Hence, to enhance the hybridization signals when multiple, population samples were probed with putative single or low copy number nuclear probes, 10% PEG was added to the hybridization solution.

Copy number analysis

Copy number analysis of the homologous chloroplast and mitochondrial probes yielded copy number estimates of 5000 and 100 copies per haploid genome equivalent respectively. Hence, single or low copy number nuclear probes can be easily distinguished from organelle probes by copy number analysis. Of the 36 putative single or low copy number probes identified in the screening steps, 32 were used to probe 90 range-wide DNA samples and the copy number strip blots in parallel. Of these 32 probes, only one had a copy number greater than 10, the majority being single or low copy number. Some of the resulting autoradiograms from the copy number analysis are shown in Figure 5.3.

RFLP analysis of range-wide samples

Of the 274 range-wide DNA samples isolated, 90 were analysed with 30 or more single or low copy number nuclear probes. These 90 samples were composed of 18 samples from each of the 5 geographic regions (Figure 5.1). The probes that reveal polymorphisms among the 90 samples are available for future use against 160 remaining DNA samples (50 total per region). As noted above, 32 probes were used in the analysis of the 90 range-wide samples. One of these probes did not work, revealing no bands after prolonged autoradiography. Another probe, as noted above, had a copy number greater than 10 (roughly 20) and was thus not used in the analysis. The remaining 30 probes all produced one or more scorable bands.

Figure 5.3. Autoradiograms from copy number analysis of homologous western red cedar RFLP probes. Each horizontal strip (slot blot) was hybridized with one probe.
Numbers above slots indicate copy number equivalents of the probe in question, to which the signal in the slot marked "WRC" is compared. Slots marked "0" are control slots onto which 2 µg herring sperm DNA ("carrier") only was blotted. "CP" indicates a homologous chloroplast probe (intergenic spacer between chloroplast trnL and trnF genes) and "MT" indicates a homologous mitochondrial probe (coxI). Probes 24, 158, 194, and 237 are homologous single or low copy number nuclear probes. See Materials and Methods for further details.

Nine of these 30 probes revealed polymorphisms (Table 5.1). The 30 probes produced 62 bands in total, 19 (31%) of which were polymorphic (Table 5.1). Genetic interpretations of the banding patterns were expedited by several observations including the copy number results, band sizes and intensities, whether or not HindIII sites were present internally in the probe in question, and the pattern of polymorphism among the range-wide samples. Probe 171 defied straightforward genetic interpretation and was thus excluded from the analysis. It produced 3 monomorphic bands and one polymorphic band of high frequency. Because this band was of high frequency, it was expected to occur in homozygous form among the 90 samples, resulting in the disappearance of its allelic counterpart. As this did not happen the probe was excluded. The bands produced from the remaining 29 probes were scored as 44 putative loci (Table 5.1). At least one polymorphism was detected at 9 of the 44 loci. Example autoradiograms are shown in Figures 5.4 and 5.5. Allele frequencies at the 9 polymorphic loci are shown in Table 5.2 for the five regions as well as for the three coastal regions pooled, the two interior regions pooled, and overall. It is noteworthy that private alleles, occurring in only one region, were not detected.
Furthermore, all alleles, even the rare ones (e.g., allele A at locus 445-2), were detected in both the interior and coastal portions of the species range. Eight of the nine polymorphic loci had only two alleles, with locus 8-1 having three alleles, two of which were at low frequency overall.

Table 5.1. Results of range-wide RFLP analysis of western red cedar with 30 single or low copy nuclear probes.

Probe*  Total   Polymorphic  Putative  Polymorphic  Internal
        bands   bands        loci      loci         HindIII sites
012     6       0            5         0            1
017     1       0            1         0            0
022     1       0            1         0            0
061     1       0            1         0            0
083     1       0            1         0            0
104     1       0            1         0            0
123     1       0            1         0            0
125     1       0            1         0            0
130             0                      0            0
132     1       0            1         0            0
149     1       0            1         0            0
152     1       0            1         0            0
157             0                      0            0
165             0            1         0            1
170     1       0            1         0            0
173     1       0            1         0            0
187     1       0            1         0            0
194     1       0            1         0            0
227             0                      0            0
237     1       0            1         0            0
239     1       0            1         0            0
008     3       3            1         1            0
024     4       2            3         1            0
114     5       4            3         2            0
137     3       2            2         1            0
155     3       2            2         1            0
158     3       2            1         1            1
160     2       2            1         1            0
171     4       1            2??       1??          1
445     5       1            3         1            1
Totals  62      19           46        10           5

*Probes that did not reveal polymorphisms are given first.

Figure 5.4. Autoradiograms from range-wide western red cedar nuclear RFLP analysis with HindIII-digested DNA samples vs. two single copy, monomorphic probes. Each autoradiogram (A and B) contains 18 range-wide samples (of a total of 90 analysed) from the geographic regions indicated ("Q.C.I." = Queen Charlotte Islands). "M" denotes lanes containing Lambda/HindIII size markers (mobility and sizes in kb indicated on the extreme left). A: probe 194. B: probe 237.

Figure 5.5. Autoradiograms from range-wide western red cedar nuclear RFLP analysis with HindIII-digested DNA samples vs. two polymorphic probes. Each autoradiogram (A and B) contains 18 range-wide samples (of a total of 90 analysed) from the geographic regions indicated ("Q.C.I." = Queen Charlotte Islands). "M" denotes lanes containing Lambda/HindIII size markers (mobility and sizes in kb indicated on the extreme left). A: probe 24. B: probe 158.

[Table 5.2: allele frequencies at the 9 polymorphic loci by region; contents illegible in the source.]

[Table 5.3: measures of genetic variability from all loci by region; contents illegible in the source.]

Measures of genetic variability from all loci are given in Table 5.3. All measures were uniformly low across all of the regions. The average number of alleles per locus was 1.2 in all regions, as most loci were monomorphic and only one locus had more than two alleles. The percentage of polymorphic loci ranged from 14 to 18% when only loci where the frequency of the most common allele was less than or equal to 0.95 were considered polymorphic, and from 16 to 21% in the absence of a frequency criterion. Observed and expected heterozygosities (H_O and H_E) were quite uniform across the regions, and generally approximated one another. The greatest difference between H_O and H_E occurred in the U.S. Coast, where there was an apparent deficiency of heterozygotes. The overall expected heterozygosity was only 6.2%.

Inbreeding coefficients, or panmictic indices, (F_IS) over loci for each region are given in Table 5.4, along with their 95% bootstrap confidence intervals. With the exception of the U.S. Coast, all of the regions had F_IS values very close to zero, indicating that the genotypes are in random mating proportions. The bootstrap confidence intervals indicate that none of these F_IS estimates, not even that for the U.S. Coast, were significantly different from zero. The relatively high (though not significantly different from zero) F_IS value obtained for the U.S.
Coast (0.23) is likely due to the "patchy" distribution of western red cedar in northern California and southern Oregon, areas from which most of the U.S. Coast samples were drawn. The distribution of the species in the other four regions is far more continuous. This patchy distribution likely results in lower levels of within-region gene flow, leading to greater within-region subdivision and thus to the observed deficiency of heterozygotes ("Wahlund effect"), as reflected in the relatively large, positive F_IS value for the U.S. Coast.

F_IS values for each locus over all regions, and the overall F_IS value across loci and regions, are given in Table 5.5, along with additional F statistics (Wright, 1951) calculated according to Nei (1977). However, I have modified the subscripts to better match the sampling strategy used in this study. The subscript "R" (for "region") replaces Wright's subscript "S" (for subpopulations), as geographic regions were sampled in this study rather than subpopulations (i.e., single stands). In this manner, F_IS becomes F_IR. Even when averaged across regions, F_IR values fluctuate considerably from locus to locus (Table 5.5), most likely due to sampling error. The overall average across regions and loci is, however, close to zero, indicating random mating genotype proportions within regions. The coefficient of gene differentiation among regions, F_RT (equivalent to Nei's G_ST [Nei, 1973], with the subscript change as above), was 9% overall. Also given in Table 5.5 are alternative estimates of F_IR, F_IT and F_RT calculated, over all regions and loci, according to Weir and Cockerham (1984), along with their bootstrap confidence intervals (Weir, 1990). These estimates differ somewhat, although not drastically, from those calculated according to Nei (1977). However, the bootstrap confidence intervals on these estimates are quite wide, indicating their lack of precision.

Table 5.4. Deviation from Hardy-Weinberg genotype proportions within regions.

Region             F_IS      95% Confidence Interval
U.S. Coast          0.228    -0.033 to 0.428
B.C. Coast         -0.055    -0.178 to 0.182
Q. Charlotte I.    -0.054    -0.267 to 0.173
Idaho               0.021    -0.169 to 0.337
B.C. Interior       0.020    -0.168 to 0.243

F_IS estimates over loci and their 95% confidence intervals were calculated using the computer program DIPLOID.FOR (Weir, 1990). The program calculates confidence intervals by bootstrapping over all loci.

Table 5.5. Estimates of F_IR, F_IT and F_RT at all polymorphic loci.*

Locus           F_IR               F_IT               F_RT
008-1          -0.042             -0.010              0.031
024-1          -0.272              0.051              0.254
114-1           0.151              0.201              0.060
114-3           0.083              0.231              0.161
137-1          -0.121             -0.023              0.088
155-1          -0.150             -0.117              0.029
158-1           0.089              0.132              0.047
160-1           0.493              0.516              0.045
445-2          -0.029             -0.011              0.017
Overall (1)    -0.003              0.086              0.088
Overall (2)     0.026              0.105              0.081
95% C.I. (2)   -0.094 to 0.160    -0.002 to 0.221     0.024 to 0.157

* Subscripts I, R, and T denote individual, region, and total range respectively (see "Results" for rationale for modification of subscripts).
(1) Calculated according to Nei (1977) using step FSTAT in BIOSYS-1.
(2) Calculated according to Weir and Cockerham (1984) using the computer program DIPLOID.FOR (Weir, 1990). This program also calculated the 95% confidence intervals by bootstrapping over all loci.

F statistics calculated with an additional level of hierarchy are given in Table 5.6. Here, regions have been grouped according to their location within the coastal (U.S. Coast, B.C. Coast, Queen Charlotte Islands) or interior (Idaho and B.C. Interior) portions of the species range (Figure 5.1). As in Table 5.5, the subscript "R" denotes "region". The coastal and interior portions of the species range have been denoted "zones" with the associated subscript "Z".
In this analysis, the absolute level of genetic differentiation among regions, D_RT (= 0.0054), is partitioned into the differentiation of regions within zones (D_RZ = 0.0045) and the differentiation of the two zones (coast and interior) from each other (D_ZT = 0.0009). The much larger magnitude of D_RZ relative to D_ZT suggests that most of the differentiation among regions occurs within, rather than between, the coastal and interior zones. Alternative estimates of some of the statistics given in Table 5.6, adjusted for small sample size within regions according to Nei and Chesser (1983), are provided in the same table. Estimates of D_RT and F_RT adjusted in this manner are somewhat smaller than the original estimates.

The matrix of Nei's unbiased genetic distance and similarity coefficients (Nei, 1978) between regions is given in Table 5.7. This genetic distance is an "unbiased" version of Nei's original genetic distance (Nei, 1972), adjusted to remove the "spurious distance" that can result due to sampling error (Nei, 1978). The table shows that, overall, the genetic distances are quite small, indicating a very low degree of differentiation between populations. The three largest distances all involve the Queen Charlotte Islands. This result suggests that a mild "founder effect" occurred when this region was re-colonized after glaciation. It is also notable that Idaho and the B.C. Interior appear to be completely undifferentiated from each other.

Table 5.6. Hierarchical F statistics over all loci.

Statistic*    Estimate†    Unbiased‡
H_T           0.0617       0.0620
H_R           0.0563       0.0579
H_Z           0.0608
D_RZ          0.0045
D_ZT          0.0009
D_RT          0.0054       0.0041
F_RZ          0.074
F_ZT          0.015
F_RT          0.088        0.067

* Subscripts are as follows: T = total range (i.e. species); R = region; Z = zone (i.e. coast or interior).
† Calculated according to Wright (1978) using step WRIGHT78 (option NOCORR) in BIOSYS-1.
‡ Unbiased estimates (for sample size) calculated according to Nei and Chesser (1983) using the computer program GENESTAT-PC (Whitkus, 1988).

Table 5.7. Matrix of Nei's genetic similarity and distance coefficients.

Region                1        2        3        4        5
1 U.S. Coast        *****    0.999    0.990    0.996    0.999
2 B.C. Coast        0.001    *****    0.997    0.993    0.996
3 Q. Charlotte I.   0.010    0.003    *****    0.986    0.988
4 Idaho             0.004    0.007    0.014    *****    1.000
5 B.C. Interior     0.001    0.005    0.012    0.000    *****

Below diagonal: Nei (1978) unbiased genetic distance. Above diagonal: Nei (1978) unbiased genetic identity.

Nei's genetic distance is formulated for an infinite alleles model of mutation, where differences accrue between populations due to both mutation and genetic drift (Nei, 1972). The model assumes that effective population sizes have remained constant and equal in all populations. Genetic distance is linearly related to time since divergence under this model. In contrast, the genetic distance of Reynolds et al. (1983) is based on a pure drift model (no mutation) in which effective population sizes are allowed to vary both between populations and within a population over time. It is linearly related to the amount of genetic drift since divergence rather than time. This distance measure is appropriate for studies of short term, intraspecific differentiation, such as the current study. As no private alleles were found in this study, and as all rare alleles observed were widespread, it can be reasonably concluded that mutation did not contribute significantly to the observed differentiation. Hence, the distance measure of Reynolds et al. (1983) appears to be more appropriate than that of Nei for this study, particularly for the purposes of phylogeny reconstruction. Accordingly, the matrix of Reynolds et al.'s (1983) genetic distances between regions was calculated and is given in Table 5.8. As can be expected for such a "short term" genetic distance measure, the magnitudes are much greater than the corresponding Nei's distances (Table 5.7).
However, the pattern of relative magnitudes is identical between the two matrices.

Table 5.8. Matrix of Reynolds, Weir and Cockerham's (1983) genetic distances.

Region                1         2         3         4         5
1 U.S. Coast        ******    0.0429    0.1581    0.0858    0.0441
2 B.C. Coast        0.0429    ******    0.0642    0.1220    0.0934
3 Q. Charlotte I.   0.1581    0.0642    ******    0.2001    0.1812
4 Idaho             0.0858    0.1220    0.2001    ******    0.0301
5 B.C. Interior     0.0441    0.0934    0.1812    0.0301    ******

Figure 5.6. Phylogenetic tree showing relationships among geographical regions within the range of western red cedar. The unrooted tree shown was constructed based on Reynolds, Weir and Cockerham's (1983) genetic distances between regions using the algorithm of Fitch and Margoliash (1967) with the aid of the computer program package PHYLIP (version 3.5c; Felsenstein, 1993). Numbers before the nodes indicate the number of trees containing that particular clade out of a total of 100 bootstrapped trees. Note that the branches leading to BCC, USC and BCI are all of zero length. One centimeter of branch length is equivalent to a genetic distance of 0.02.

An unrooted phylogenetic tree, constructed based upon the matrix of Reynolds et al.'s (1983) genetic distances (Table 5.8) using the algorithm of Fitch and Margoliash (1967), is shown in Figure 5.6. Bootstrap analysis showed that the tree obtained is fairly robust, especially in consideration of the small sample sizes analysed per region. This tree illustrates in graphical form the relationships described above. The B.C. Interior and Idaho regions are the most closely related. Furthermore, the long branch leading to the Queen Charlotte Islands indicates that genetic drift has been most pronounced in this region, with most of this drift probably occurring during the founding of this geographically isolated region.
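The statement that the two distance matrices share the same pattern of relative magnitudes can be checked directly from the published values. The following is a minimal sketch, not part of the original analysis: the dictionaries simply transcribe the lower triangles of Tables 5.7 and 5.8 (regions numbered 1 to 5 as in those tables), and the helper name discordant_pairs is hypothetical. Tied distances are ignored, since rounding to three decimals produces one tie in Table 5.7.

```python
from itertools import combinations

# Pairwise distances transcribed from Tables 5.7 and 5.8.
# 1 = U.S. Coast, 2 = B.C. Coast, 3 = Q. Charlotte I., 4 = Idaho, 5 = B.C. Interior
NEI_1978 = {
    (1, 2): 0.001, (1, 3): 0.010, (1, 4): 0.004, (1, 5): 0.001,
    (2, 3): 0.003, (2, 4): 0.007, (2, 5): 0.005,
    (3, 4): 0.014, (3, 5): 0.012, (4, 5): 0.000,
}
REYNOLDS_1983 = {
    (1, 2): 0.0429, (1, 3): 0.1581, (1, 4): 0.0858, (1, 5): 0.0441,
    (2, 3): 0.0642, (2, 4): 0.1220, (2, 5): 0.0934,
    (3, 4): 0.2001, (3, 5): 0.1812, (4, 5): 0.0301,
}


def discordant_pairs(dist_a, dist_b):
    """List pairs of region pairs that the two matrices rank in opposite order.

    Ties in either matrix are skipped rather than counted as disagreements.
    """
    conflicts = []
    for first, second in combinations(sorted(dist_a), 2):
        diff_a = dist_a[first] - dist_a[second]
        diff_b = dist_b[first] - dist_b[second]
        if diff_a * diff_b < 0:  # opposite signs: the rankings disagree
            conflicts.append((first, second))
    return conflicts


def largest(dist, k=3):
    """Return the k region pairs with the largest distances."""
    return sorted(dist, key=dist.get, reverse=True)[:k]


if __name__ == "__main__":
    print("discordant pairs:", discordant_pairs(NEI_1978, REYNOLDS_1983))
    print("largest Nei distances:", largest(NEI_1978))
    print("largest Reynolds distances:", largest(REYNOLDS_1983))
```

An empty list of discordant pairs means that no two region pairs are ordered differently by the two measures, which is what "identical pattern" is taken to mean in this sketch.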
The overall topology of the tree is consistent with a scenario in which the current species range was re-colonized after the last glaciation from a refugial population located in or near the U.S. Coast region. Re-colonization is likely to have proceeded in two directions: from the U.S. Coast to the B.C. Coast and from there to the Queen Charlotte Islands; and, secondly, from the U.S. Coast to Idaho and from there to the B.C. Interior.

Discussion

Sampling strategy

The sampling strategy used in this study involved a low density sampling of geographic regions (Figure 5.1) rather than the more traditional concentrated sampling within "populations" (i.e., locations). There are several reasons for this strategy, mostly involving practical considerations. On the theoretical side, however, since the total gene diversity (= expected heterozygosity) of a species is equivalent to the probability that two gametes, each drawn at random from any individual in the species, contain different alleles at a randomly chosen locus (Nei, 1973), the sampling strategy employed in this study, being closer to a random sample of the whole species, may actually be better for estimating this parameter. Furthermore, a single population may not be representative of the genetic makeup of an entire region. On the practical side, advantage was taken of the availability of clones of trees from the B.C. Coast (and to a lesser extent, the Queen Charlotte Islands) in seed orchards in or near Victoria, B.C. These clones were derived from widespread trees. Hence, to be consistent, widespread trees were chosen for the most part when sampling other regions. Another practical reason is that, since almost all of the sampling of natural stands was done by one person working alone, the travel of large distances from roadsides that would be necessary to sample many trees from one stand (but further than 100 m from each other) was considered unsafe.
The drawback of this sampling strategy is that any deviation of genotypic proportions within each region from those expected under Hardy-Weinberg equilibrium (random mating) may be a result of two causes. Under this sampling strategy, the effect of a lack of panmixis within populations (i.e., heterozygote deficiency due to, for example, a high rate of selfing) cannot be distinguished from that due to population subdivision within regions ("Wahlund effect"). (This point is discussed further under "Inbreeding" below.)

DNA isolation for RFLP analysis requires more effort in western red cedar

Because of the apparent high concentration of polysaccharides in western red cedar foliage, CsCl gradient purification was necessary to obtain DNA of sufficient quality for RFLP analysis (high levels of polysaccharide contamination negatively affect digestibility by restriction enzymes and can alter electrophoretic mobility). This makes RFLP analysis more laborious and expensive than for most other conifers, where CsCl gradient purification does not appear to be needed (e.g., Wagner et al., 1987; Devey et al., 1991; Strauss et al., 1993; Ahuja et al., 1994; Jermstad et al., 1994). However, the availability of the optimized protocol developed in this project should greatly assist future studies with this species. Hopefully, the protocol will also be useful with other problematic species with polysaccharides similar to those in western red cedar foliage. Notably, the optimized Southern/RFLP protocol developed in this study gave excellent signal intensity over background.

Level of resolution of the range-wide analysis of 90 trees

The nuclear RFLP analysis of 90 range-wide trees was intended to give an accurate depiction of species-level genetic diversity. To get a feel for the level of resolution of this analysis, it would be pertinent to consider the minimum allele frequency at which an allele will have a 95% probability of being included in the sample.
Consider a codominant allele A of (low) frequency p. The probability (α) of not detecting that allele in a sample of one diploid individual is equal to the probability of sampling an individual of genotype "aa" (where "a" represents any allele that is "not A", of combined frequency q, and p + q = 1), which is

    \alpha = q^{2} + F q (1 - q)

(Hartl and Clark, 1989). Here, F is the inbreeding coefficient (panmictic index), and, since we are considering a range-wide sample, it should be F_IT. The probability of not detecting allele A in a sample of N individuals is then

    \alpha = [q^{2} + F q (1 - q)]^{N}.

Rearranging this equation for q and then subtracting it from 1 to yield p gives:

    p = 1 - [\alpha^{1/N}/(1 - F) + (F/(2 - 2F))^{2}]^{1/2} + F/(2 - 2F).

Solutions of this equation for a 95% probability of detecting the allele (α = 0.05), for the range-wide sample size of 90 trees employed in this study, and for various values of F (i.e., F_IT) are given in Table 5.9. In the worst case scenario, where F = 1 and all individuals carrying the allele in question are homozygotes (so that the probability of missing the allele in one individual reduces to q), the minimum range-wide frequency at which the allele has a 95% probability of being included in the sample is 3.3%. Under random mating genotype frequencies (F = 0) this minimum frequency is 1.7%. At F = 0.1, approximating the overall estimate of F_IT in this study of 0.086 (Table 5.5), the minimum frequency is still 1.7%. Hence, it is highly certain that an allele with an overall frequency of about 2% or more will be included in the range-wide sample of 90 trees employed in this study. Since alleles of such low frequency would contribute little to a species-wide estimate of expected heterozygosity, the sample size of 90 should be quite adequate for estimating this statistic. Also given in Table 5.9 are the minimum allele frequencies corresponding to a sample size of 18, the number of trees sampled per region in this study.
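The detection threshold derived above is easy to evaluate numerically. The sketch below is an illustration only, with a hypothetical function name; it solves the quadratic in q for given N, F and α, and treats F = 1 separately because the closed form divides by (1 - F), whereas under complete inbreeding the per-individual probability of missing the allele is simply q. Rounded to three decimals, the values it returns should correspond to the entries reported in Table 5.9.

```python
import math


def min_detectable_frequency(n, f, alpha=0.05):
    """Smallest allele frequency p detected with probability 1 - alpha.

    n     : number of diploid individuals sampled
    f     : inbreeding coefficient F (panmictic index), f <= 1
    alpha : accepted probability of missing the allele
    """
    # alpha^(1/N) equals the per-individual miss probability q^2 + F q (1 - q)
    per_individual = alpha ** (1.0 / n)
    if f == 1.0:
        # Complete inbreeding: only homozygotes exist, so the miss probability is q
        return 1.0 - per_individual
    half = f / (2.0 - 2.0 * f)
    # Positive root of (1 - F) q^2 + F q - alpha^(1/N) = 0
    q = math.sqrt(per_individual / (1.0 - f) + half ** 2) - half
    return 1.0 - q


if __name__ == "__main__":
    print("    F   n=90   n=18")
    for f in (-0.25, 0.0, 0.1, 0.25, 0.5, 1.0):
        p90 = min_detectable_frequency(90, f)
        p18 = min_detectable_frequency(18, f)
        print(f"{f:5.2f}  {p90:.3f}  {p18:.3f}")
```

The same function can be called with other sample sizes when planning how many trees are needed to capture alleles above a chosen frequency.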
For reasonable values of F (F_IS [or with my subscripts, F_IR] in this case), an allele of frequency of roughly 10% or less in a given region has a 5% chance of not being included in the sample of 18 individuals. This would imply that any inferences based on rare alleles present in one or two populations but not in others would be very weak, as one would expect with this sample size.

Table 5.9. Allele frequency resolution for various degrees of deviation from panmixia.

            Minimum allele frequency (p)
    F         n = 90      n = 18
  -0.25       0.015       0.071
   0.00       0.017       0.080
   0.10       0.017       0.084
   0.25       0.019       0.091
   0.50       0.022       0.106
   1.00       0.033       0.153

Comparison to gene diversity estimates from isozyme studies

The estimate of overall expected heterozygosity obtained in this study of 6% is in close agreement with estimates of 4% (Yeh, 1988) and 6% (El-Kassaby et al., 1994) obtained from isozyme studies. This is somewhat surprising, as it was expected that somewhat higher levels of diversity would be uncovered with single or low copy number RFLP markers than with isozymes, as the two types of markers examine different levels of organization. Isozyme variation is due to amino acid substitutions that cause electrophoretic mobility shifts in enzymes involved in important metabolic processes. As such, many of the possible amino acid substitutions likely do not persist in populations due to deleterious effects on enzyme function. Furthermore, many of the possible amino acid substitutions that do persist may not be detectable on gels (i.e., may not cause mobility shifts). In contrast, the RFLPs in this study were derived from random, single or low copy number genomic clones, and are unlikely to occur within expressed sequences. Any nucleotide substitutions within the restriction enzyme sites that define the detected fragments, as well as insertions or deletions within the detected fragments, are uncovered with RFLP markers.
Such substitutions, insertions, and deletions, because they probably occur in intergenic sequences and are thus more likely to be selectively neutral, are expected to occur with higher frequency than amino acid substitutions within isozymes. For example, expected heterozygosity estimates obtained in an RFLP study of common bean (Phaseolus vulgaris) with low copy number random genomic probes were 3-4 times higher than those obtained with isozymes (Becerra Velasquez and Gepts, 1994).

However, the expectation of higher RFLP variation relative to isozyme variation may be tempered by the fact that only one restriction enzyme (HindIII) was used in the current study. This was done in order to maximize the number of probes used: because insertion/deletion variation is expected to predominate in plant genomic DNA (Apuya et al., 1988; McCouch et al., 1988), and in the absence of restriction site maps, fragments from different enzymes uncovered by the same probe cannot be considered independent loci. As the probes employed were short (500 to 1600 bp), most of them did not contain a HindIII site and none contained more than one HindIII site (Table 5.1). Hence, only two or three restriction sites were examined per locus, or 12 to 18 base pairs per locus (the HindIII recognition sequence is 6 base pairs long). However, all insertions or deletions internal to, and large enough relative to the size of, the fragments produced would still be detected. Note that in the current study, polymorphism may have been under-estimated due to the possible presence of insertions or deletions too small to be detected on the blots produced.

Note that close agreement between genetic diversity estimates from isozyme and DNA level data (RAPD markers) was also obtained in a study in black spruce (Picea mariana; Isabel et al., 1995).
As RAPD markers detect nucleotide substitutions within flanking 10-mer primer sites (therefore 20 base pairs at most are assayed for substitutions per locus) as well as insertion/deletion variation between or encompassing the primer sites, levels of variation in Mendelian RAPD markers should be similar to those observed with Mendelian RFLP markers where only one restriction enzyme was used, as in this study.

The agreement between the species-wide gene diversity estimate obtained in this study and estimates based on isozymes (Yeh, 1988; El-Kassaby et al., 1994) confirms the placement of western red cedar as the third most genetically uniform conifer ever studied, at least in terms of electrophoretically detectable genetic variation (including DNA marker variation presented here). The only conifer species with less electrophoretically detectable genetic variation are Torrey pine (Pinus torreyana; Ledig and Conkle, 1983) and red pine (Pinus resinosa; Fowler and Morris, 1977; Allendorf et al., 1982; Simon et al., 1986; Mosseler et al., 1991), for which expected heterozygosity estimates of 1.7% and 0.0% respectively were obtained. The lack of genetic variation in Torrey pine is not that surprising considering its very limited range and small total population size (Ledig and Conkle, 1983). However, red pine, like western red cedar, has a quite extensive range.

Comparison to studies in other conifers using Mendelian nuclear DNA markers

To the author's knowledge, this is the first population genetics study of a conifer using single or low copy number nuclear RFLP markers. Thus, there are no "benchmarks" in other conifer species with which the results obtained here can be compared. However, some population genetic studies of nuclear DNA variation in conifers have been conducted with RAPD markers. The lack of genetic variation in red pine has been confirmed with RAPD markers (Mosseler et al., 1992; DeVerno and Mosseler, 1994).
In the former study, a species-wide survey with 69 RAPD primers revealed only one variant RAPD band in a single tree. In the latter study, a further 400 RAPD primers were surveyed and no polymorphisms were found. Hence red pine represents the extreme case for genetic uniformity in a conifer (variation in quantitative traits in red pine is also lacking; Fowler and Lester, 1970). As noted above, black spruce (Picea mariana), a species abundant in isozyme variation (e.g., Yeh et al., 1986), has been studied for RAPD marker variation (Isabel et al., 1995), the result being that RAPD markers uncovered a high level of variation (expected heterozygosity of 33%) in close agreement with that found with isozymes. The estimate of expected heterozygosity in Douglas-fir (Pseudotsuga menziesii) in British Columbia obtained from RAPD markers (45%; Carlson et al., 1994) was much higher than that obtained in isozyme studies (e.g., 16% in Yeh and O'Malley, 1980; El-Kassaby and Ritland, 1995). Hence, nuclear DNA-level genetic variation in western red cedar, as uncovered by single or low copy number RFLP markers in this study, appears to be much lower than that uncovered by RAPD marker studies in other conifers (with the exception of red pine, where both isozyme and DNA variation are lower than in western red cedar).

Variation in leaf oil terpenes and quantitative traits

The evidence for low levels of genetic variation in western red cedar obtained from isozyme studies (Copes, 1981; Yeh, 1988; El-Kassaby et al., 1994) and from this RFLP study is corroborated by very low levels of leaf oil terpene variation (von Rudloff and Lapp, 1979; von Rudloff et al., 1988). The results with quantitative traits, however, are conflicting. A provenance test of three Vancouver Island provenances revealed no significant differences among provenances for survival or height (Bower and Dunsworth, 1988).
Common garden studies of populations covering the southern half of the interior portion of the range uncovered genetic variation among populations in only one third of the variables analysed, and the slopes of detectable clines were very gentle (Rehfeldt, 1994). Furthermore, genetic variation within populations was found for only one half of the variables analysed. The author thus concluded that "the present tests also support the conclusion that genetic variation among and within western red cedar populations is low" (Rehfeldt, 1994). More recent provenance trials, involving a wider sample of the species, have found more pronounced levels of variation, and high heritabilities for some traits (Cherry, 1995). However, Russell et al. (1995) have noted differential selfing rates among trees, and variation in inbreeding depression among families produced by selfing. Because of these confounding factors, they suggest that open-pollinated maternal half-sib progeny (as used by Cherry, 1995) are unreliable for estimating genetic parameters. From their studies of progeny of controlled pollinations, Russell et al. (1995) conclude that western red cedar contains "modest, though far from negligible genetic variation." Also, genetic variation in germination parameters has recently been observed (Y. El-Kassaby, Pacific Forest Products Ltd., Saanichton, B.C., personal communication).

In summary, it is clear that genetic variation in quantitative traits exists in western red cedar. The presence of moderate levels of quantitative trait variation in combination with low levels of single-locus variation in isozyme and single or low copy number nuclear RFLP markers is somewhat paradoxical, but not unprecedented in trees. In Acacia mangium, very low levels of isozyme diversity (Moran et al., 1989) are combined with high levels of quantitative trait variability (Atipanumpai, 1989).
It cannot be assumed that levels of different types of genetic variation (i.e., isozyme, neutral nuclear DNA marker, and quantitative trait variation) are always proportional (Lande, 1988; Lewontin, 1984). It is also noteworthy that studies with Drosophila, houseflies, and ryegrass have observed increased genetic variance in quantitative traits after population bottlenecks (see below), accompanied by the expected loss of variation at isozyme loci (Carson, 1990).

Population bottleneck

The favoured explanation for the low level of isozyme and leaf oil terpene variation in western red cedar is the occurrence of a severe population bottleneck during the last ice age (Critchfield, 1984; Yeh, 1988; El-Kassaby et al., 1994). Although the location of the glacial refugium (or refugia) for western red cedar during the last ice age is unknown, paleobotanical records suggest that its location was far south of the glacial limits (Baker, 1983; Critchfield, 1984; Hebda and Matthewes, 1984). It is possible that the refugium was located to the south of the species' current range. The low level of RFLP variation observed here seems to corroborate the suggestion that a bottleneck (or multiple bottlenecks over time) has occurred in the "recent" history of the species.

Perhaps western red cedar is more prone to population bottlenecks due to glaciation than other northern conifers. This may be due in part to a slow migration rate via seed: because of their small wing surface, seeds of this species fall shorter distances than those of associated conifers (Siggins et al., 1933). Paleobotanical records indicate that the species was one of the slowest to migrate northward after the last glaciation, and reached the northern tip of Vancouver Island only 3,000 years ago (Hebda, 1983).
If this was due to a slow migration rate via seed, then, during times of glacial advance, the species may have also been slow to retreat southward, possibly resulting in a relatively restricted range and population size (i.e., a single, small refugium) during glacial maxima. Alternatively, the presumed failure to retreat efficiently could have resulted from a scarcity of suitable habitat to retreat to, or from intense competition from other species for such habitat. If the species is particularly prone to glacial bottlenecks, then the current low level of isozyme, terpene, and RFLP variation may be due not only to the most recent ice age, but to recurrent bottlenecks (i.e., post-glacial re-colonization from single refugia of small effective population size) over the many cycles of glaciation occurring throughout the 2 million year Pleistocene epoch. (Note, however, that the presumed slow rate of retreat may sometimes lead to multiple, isolated refugia, which, through drift and divergent selection, could result in increased genetic diversity.)

Another possibility is that western red cedar experienced a bottleneck during the post-glacial Xerothermic period that other more drought tolerant conifers did not. This warm, dry period appears to have lasted from 10,000 to 6,000 yr. B.P. in the Pacific Northwest (Baker, 1983).

Differentiation between the coast and interior

Because of the small within-region sample sizes, inferences drawn from the F statistics and genetic distances obtained in this study must be viewed with caution. However, the apparent low level of differentiation between the coastal and interior portions of the species range (F_ZT of 0.015 and D_ZT of 0.0009 [Table 5.6]; Nei's unbiased genetic distance between coast and interior zones of 0.005) suggests that they originate from a common glacial refugium. The lack of private alleles and the detection of all rare alleles in both portions of the species range (Table 5.2) support this inference.
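The zone-level statistics cited above can be related to the gene diversities reported in Table 5.6. The sketch below assumes the usual gene diversity partition, in which each absolute differentiation D is a difference between diversities at two hierarchical levels and each relative measure F is that difference divided by the diversity at the higher level. This is an assumption made for illustration; it is not a description of what the BIOSYS-1 step WRIGHT78 computes internally, although it is consistent with the uncorrected values in the table. The function name is hypothetical.

```python
# Uncorrected gene diversities transcribed from Table 5.6
H_T = 0.0617  # total range (species)
H_Z = 0.0608  # within zones (coast, interior)
H_R = 0.0563  # within regions


def hierarchical_partition(h_total, h_zone, h_region):
    """Split total gene diversity into region-within-zone and zone components."""
    d_rz = h_zone - h_region    # regions within zones
    d_zt = h_total - h_zone     # zones within the total range
    d_rt = h_total - h_region   # regions within the total range
    return {
        "D_RZ": d_rz,
        "D_ZT": d_zt,
        "D_RT": d_rt,
        "F_RZ": d_rz / h_zone,
        "F_ZT": d_zt / h_total,
        "F_RT": d_rt / h_total,
    }


if __name__ == "__main__":
    for name, value in hierarchical_partition(H_T, H_Z, H_R).items():
        print(f"{name} = {value:.4f}")
```

Because D_RT is the sum of D_RZ and D_ZT, the share of among-region differentiation attributable to the coast versus interior split is D_ZT / D_RT, which is about one sixth with these values.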
The phylogenetic tree obtained (Figure 5.6) is consistent with a hypothesis that this refugium was located in or near the U.S. Coast region.

An alternative explanation for the lack of divergence between the coast and interior would be a high rate of migration between the two zones. However, given the large distance separating the two, this seems unlikely (although it is possible that the coast and the interior zones were connected in the recent past).

Differentiation among regions

Estimates of the relative degree of differentiation among regions (F_RT) obtained in this study were 9%, 8%, and 7% (Tables 5.5 and 5.6; calculated according to Nei [1977], Weir and Cockerham [1984] and Nei and Chesser [1983] respectively). The 95% confidence interval for the second estimate of 8%, calculated by bootstrapping over loci, was 2 to 16% (Table 5.5). These estimates, though lacking in precision, are close to the average G_ST (= F_ST) for gymnosperms of 7% (Hamrick and Godt, 1990) and larger than the estimate obtained by Yeh (1988) with isozymes for western red cedar of 3.3%. However, Nei (1987) cautions that small values of H_T (like that obtained here with western red cedar) can cause G_ST estimates to be inflated. A case in point is Torrey pine, which has an extremely low H_T of 0.017 and an extremely high G_ST of 1.00 (Ledig and Conkle, 1983; Adams, 1983). For comparative purposes then, the absolute measure of genetic differentiation, D_ST, is more useful (Nei, 1987). For western red cedar in this study, estimates of D_ST (called D_RT in this study) of 0.0054 and 0.0041 were obtained (Table 5.6; calculated according to Wright [1978] and Nei and Chesser [1983] respectively), indicating, in absolute terms, low differentiation between regions.

Inbreeding

The rate of selfing in a western red cedar seed orchard has been estimated at 68% (El-Kassaby et al., 1994). This is the highest rate ever reported for a conifer.
Coupled with this high selfing rate is a lack of inbreeding depression in reproductive biology (Owens et al., 1990) and early growth traits (Larson, 1937; Cherry, 1995). Hence, it appears that western red cedar has a mixed-mating system, rather than the outcrossing mating system typical of conifers (Adams and Birkes, 1991). Close relatives of western red cedar, Thuja occidentalis and T. orientalis, also have atypically high estimated selfing rates of 0.37 (Perry and Knowles, 1990) and 0.25 (Xie et al., 1991), respectively, although these are considerably lower than the rate reported for western red cedar.

The mixed-mating system of western red cedar is expected to lead to a heterozygote deficiency in populations relative to Hardy-Weinberg proportions. For a population in which the rate of self-fertilization (S) is 0.68, the equilibrium inbreeding coefficient is expected to be S/(2-S), or 0.52 (Hartl and Clark, 1989). In addition to the deficiency caused by the mixed-mating system, the sampling strategy employed in this study, involving low-density sampling of each geographic region (Figure 5.1) rather than concentrated sampling of individual "populations" (i.e., single stands), is expected, in the presence of population subdivision within regions, to cause a further deficiency of heterozygotes ("Wahlund effect"). It is somewhat surprising, then, that estimates of the overall coefficient of inbreeding within regions (FIR) were close to zero (-0.003 and 0.026 ± 0.13 [95% C.I.]; Table 5.5). This result is corroborated by that of Yeh (1988), where a low overall coefficient of inbreeding of 0.057 was obtained from polymorphic isozyme loci using large population sample sizes.

Although inbreeding depression has not been observed in early growth traits in western red cedar, recent results show that inbreeding depression manifests itself at a later stage, in trees of greater than 2-3 years of age (Cherry, 1995; J. Russell, B.C. Ministry of Forests, personal communication).
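The expected inbreeding coefficient quoted above, S/(2-S), can be checked numerically. The sketch below is illustrative only: it assumes a constant selfing rate S with all remaining matings random, uses the standard recursion F' = S(1 + F)/2 whose fixed point is S/(2-S), and the function names are hypothetical.

```python
def equilibrium_inbreeding(selfing_rate):
    """Equilibrium inbreeding coefficient F = S / (2 - S) under mixed mating."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing rate must lie between 0 and 1")
    return selfing_rate / (2.0 - selfing_rate)


def iterate_inbreeding(selfing_rate, generations, f0=0.0):
    """Apply F' = S * (1 + F) / 2 for the given number of generations."""
    f = f0
    for _ in range(generations):
        f = selfing_rate * (1.0 + f) / 2.0
    return f


def expected_heterozygote_frequency(p, f):
    """Heterozygote frequency at a two-allele locus: 2p(1 - p)(1 - F)."""
    return 2.0 * p * (1.0 - p) * (1.0 - f)


f_eq = equilibrium_inbreeding(0.68)   # selfing rate reported for the seed orchard
print(round(f_eq, 2))                 # 0.52, as quoted in the text
print(round(iterate_inbreeding(0.68, 50), 4))  # recursion settles on the same value
```

Under these assumptions, heterozygote frequencies at equilibrium would be about half of Hardy-Weinberg expectations (a factor of 1 - F, or roughly 0.48), which is why the near-zero FIR estimates are notable.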
The apparent lack of inbreeding in mature stands, as inferred from the inbreeding coefficients obtained in this study and by Yeh (1988), suggests that the selective disadvantage caused by this late-stage inbreeding depression is severe enough to purge inbred progeny from more mature stands (note that heterozygote deficiencies symptomatic of inbreeding were not detected in an isozyme study of Thuja occidentalis either [Perry et al., 1990]). If this is true, one has to wonder why western red cedar selfs at such a high rate in the first place! As suggested by El-Kassaby et al. (1994), tolerance of high selfing rates by western red cedar may be more the result of a history of population bottlenecks than an evolutionary strategy. Forced inbreeding due to small population sizes during bottlenecks may have exposed deleterious recessive genes involved in early-stage inbreeding depression (which would otherwise have prevented selfing), leading to their elimination from the population by selection (in addition to drift), and hence to a tolerance of selfing. Perhaps recessive genes involved in later-stage inbreeding depression were not purged in a similar manner due to low interspecific competition amongst older trees at that time.

Alternatively, the apparent lack of inbreeding in mature stands may be due to a far lower selfing rate in natural stands than that observed in the seed orchard. However, as seed orchard conditions are manipulated to enhance outcrossing, this is unlikely (El-Kassaby et al., 1994).

Phenotypic plasticity

Recent studies by Yousry El-Kassaby (Pacific Forest Products Ltd., Saanichton, B.C., personal communication) have demonstrated a strikingly high degree of phenotypic plasticity in western red cedar, in particular a marked responsiveness of shoot growth to temporal fluctuations in temperature.
This may help to explain how a species low in genetic variation (at least as measured by isozymes, terpenes, and the RFLP markers in this study) can have such a wide ecological amplitude. As suggested by Lewontin (1957), homeostasis for the fitness of a genotype in the face of environmental variation (e.g., temporal or spatial variation in the length of the growing season) may be achieved by phenotypic plasticity (e.g., of shoot growth in response to temperature). Hence, phenotypic plasticity can be viewed as an alternative strategy to genetic diversity for dealing with environmental heterogeneity (Lewontin, 1957). It has been hypothesized that a species or population with abundant adaptive phenotypic plasticity may have less need for genetic variation and thus may be more likely to lose it (Levins, 1963; Marshall and Jain, 1968). However, a consistent inverse relationship between genetic diversity and plasticity of species or populations has failed to materialize in plants (Schlichting, 1986). It seems likely that these two alternative adaptive strategies are not mutually exclusive but may coexist in varying degrees within a species or population (Schlichting, 1986), as was originally suggested by Lewontin (1957).

Significant contributions of this study

This study was designed primarily to obtain a good estimate of the overall level of genetic diversity (HT) in western red cedar using single or low copy number RFLP markers. In order to do this, several methodological obstacles had to be overcome, especially with regard to the isolation of the large quantities of high quality DNA needed for single copy nuclear RFLP analysis (which turned out to be particularly difficult in western red cedar), and the development of a sensitive and reliable hybridization protocol. Once they were optimized, the protocols proved to be exceptionally reliable.
Due to the very large genome sizes of coniferous trees (Dhillon, 1987), single copy nuclear RFLP analysis is notoriously difficult, and only two other labs in the world have successfully conducted this analysis with conifer species and large numbers of different probes (e.g., Devey et al., 1991; Mukai et al., 1995). As both of these labs have used the method for genome mapping, this study represents the first application of single or low copy number nuclear RFLP markers in a population genetics study in a conifer. The main result of this study is the finding that genetic diversity does indeed appear to be low in western red cedar, as the estimate of HT obtained was 0.06, in close agreement with those obtained from isozyme studies.

Future work in this area should involve the application of probes that revealed polymorphic loci in this thesis research to a larger sample size per region (32 more DNA samples per region are available to bring the total sample size per region to 50), using DNA samples that were isolated (but not utilized) as part of this work (Figure 5.1). The increased sample sizes would allow more definitive answers to questions such as the variation of genetic diversity among regions, the amount of inbreeding in populations, and the degree of differentiation between regions and between the coast and interior portions of the species range.

The polymorphic loci uncovered in this thesis could be used as additional tools to augment the few available polymorphic isozyme loci in any further genetic marker studies of western red cedar. El-Kassaby et al. (1994) suggest that further studies of the selfing rate of western red cedar, especially in natural stands, are needed. Also, these markers are likely to be of use in studies of the genetic efficiency of western red cedar seed orchards. For these applications, the ability to assay small amounts of tissue is desirable.
This could be accomplished if the polymorphic RFLP probes uncovered were converted to PCR-based markers (i.e., STSs [Olson et al., 1989] or SCARs [Paran and Michelmore, 1993]) by sequencing their ends and designing PCR primers.

Chapter 6

GENERAL CONCLUSION

Several seminal contributions to the field of forest molecular genetics were described in this thesis. The first evidence for the occurrence of RNA editing in the mitochondria of a gymnosperm was described in Chapter 2. The publication of these results (Glaubitz and Carlson, 1992) prompted further investigations into the taxonomic range and timing of the evolutionary origin within the plant kingdom of this intriguing phenomenon (Sper-Whitis et al., 1994; Hiesel et al., 1994). Also, this work was cited in a major review of plant mitochondrial molecular biology by Gray et al. (1992).

The work described in Chapter 3 led to the first publication involving the use of RAPD markers in conifers. The strategy described therein for the construction of genetic linkage maps with dominant RAPD markers in diploid organisms has since been adopted by a number of workers for mapping angiosperm trees (e.g., Grattapaglia and Sederoff, 1994; Bradshaw et al., 1995).

An alternate mapping strategy, taking advantage of the availability of haploid megagametophyte tissue in conifers, was explored in Chapter 4. In demonstration of this strategy, a partial genetic linkage map was produced, based upon segregation of RAPD markers among megagametophytes from a single white spruce tree (Tulsieram et al., 1992b). This single-tree mapping approach has now become the most popular strategy for the construction of genetic linkage maps in conifers (Grattapaglia et al., 1991; Nelson et al., 1993; Binelli and Bucci, 1994; Yazdani et al., 1995).

The final study, of DNA-level genetic diversity in western red cedar, described in Chapter 5, represents the first large-scale population genetic study in a conifer using single or low copy number RFLP markers.
The estimate of overall expected heterozygosity resulting from this study (6%) closely agreed with previous results based on isozymes, confirming that genetic diversity is relatively low in this wide-ranging species.

Future methodology

At this point we should look to the future, particularly concerning technological advances in DNA marker methodology. The author's vision of population genetics and forest molecular genetics of the future involves automation, PCR-based assays, multiplexing via the use of fluorescent labels, and maximal integration of marker assays and data analysis. Achievement of this integration would require the cooperative effort of many experts.

Many of the tasks involved in DNA marker work are tedious and repetitive. Automation of these tasks will allow molecular genetic researchers of the future to spend less time labeling tubes, aliquoting solutions, loading gels, etc., and more time surveying the literature, planning experiments, analyzing data, and writing papers. Machines are already available for many of these tasks, although for small labs they are relatively expensive. Hopefully, production of greater numbers and/or increased competition among producers of such machines will lead to greater affordability in the future. Alternatively, investment could be put into a very well equipped, centralized facility to which many researchers could book access.

PCR-based methods, such as RAPD markers, are particularly amenable to automation (Rafalski and Tingey, 1993). Once RAPD bands of interest have been identified, many researchers now convert these marker bands to SCARs, or Sequence Characterized Amplified Regions (Paran and Michelmore, 1993). This conversion process involves sequencing the ends of the RAPD band (or RFLP probe or any cloned DNA fragment) of interest, and from this sequence information, designing a pair of primers that allow specific PCR amplification of only that band (i.e., a single locus).
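The primer-design step of this conversion can be sketched as follows. This is a deliberately simplified illustration, not a protocol from this thesis: it assumes the sequenced fragment is available as a string of A, C, G, and T, takes the first 20 bases as the forward primer and the reverse complement of the last 20 bases as the reverse primer, and ignores considerations such as melting temperature, GC content, and primer-dimer formation that real primer design must address. The function names are hypothetical.

```python
# Translation table mapping each base to its complement
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def scar_primer_pair(fragment, primer_len=20):
    """Pick a naive primer pair from the two ends of a sequenced fragment.

    The forward primer copies the 5' end of the given strand; the reverse
    primer is the reverse complement of its 3' end, so that both primers
    are written 5' to 3' and point toward each other.
    """
    fragment = fragment.upper()
    if len(fragment) < 2 * primer_len:
        raise ValueError("fragment is too short for two non-overlapping primers")
    forward = fragment[:primer_len]
    reverse = reverse_complement(fragment[-primer_len:])
    return forward, reverse


# Hypothetical 50-base fragment used only to exercise the function
example_fragment = "A" * 20 + "C" * 10 + "G" * 20
forward_primer, reverse_primer = scar_primer_pair(example_fragment)
print(forward_primer)
print(reverse_primer)
```

The choice of 20 bases follows the approximate primer length mentioned in the text; in practice only the end sequences of the band need to be known, which is why sequencing the ends is sufficient.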
The term "SCAR" is thus a way of saying "product of a single-locus PCR reaction." Because a pair of relatively long (approximately 20 base pairs each) primers is used in the SCAR assay, rather than the single, short (10 base pairs) primer used in RAPD, the SCAR assay (i.e., single-locus PCR) is more robust than the RAPD assay, and is therefore more highly reproducible. Also, amplification nulls are less frequent with SCARs than with RAPDs, so SCAR markers are generally codominant. Allelic variation at a codominant SCAR locus is usually detected by restriction enzyme digestion.

An alternative, more powerful method of detecting allelic variation in PCR products such as SCARs is Single Strand Conformation Polymorphism (SSCP) analysis (Hayashi, 1992). SSCP allows, with minimal manipulation, the detection of greater than 90% of the possible single-nucleotide substitutions within PCR products (or restriction fragments of PCR products) of about 200 bases in length. For PCR products close to 400 bases in length, greater than 80% are detected (Hayashi, 1992). PCR products are simply denatured prior to their resolution on a cooled, non-denaturing polyacrylamide gel. The non-denaturing conditions of the gel allow the single DNA strands to develop secondary structure via internal base pairing. Most single-nucleotide substitutions result in differences in the secondary structure of one or both of the single strands, detectable as mobility shifts on the gel. Size differences (insertions and deletions) should also result in mobility shifts. To date, SSCP has been predominantly applied to the detection of mutations that may be responsible for human hereditary diseases, and has become very popular in this context. The obvious potential of this technique for population genetics applications has so far received little attention (but see, for example, Clark et al., 1995; Zietkiewicz et al., 1995).
Microsatellite, or Simple Sequence Repeat (SSR), markers (Litt and Luty, 1989; Weber and May, 1989), as they are PCR-based, should also be amenable to automation. Such markers uncover polymorphisms consisting of variable numbers of tandem repeats of small size (2 to 6 bp). A single, codominant microsatellite locus can be detected via PCR using primers based upon unique (single-copy) sequences flanking the repeat. Because these markers are both highly abundant in eukaryotic genomes and very highly polymorphic, they are currently attracting considerable attention. Microsatellite markers have recently been developed in many organisms, including forest trees (e.g., radiata pine [Smith and Devey, 1994] and bur oak [Dow et al., 1995]). Such markers should be particularly useful for studies where maximum genetic discrimination is required, such as fingerprinting, mating system, and paternity analyses.

Another recently developed PCR-based genetic marker system is AFLP (Amplified Fragment Length Polymorphism; Zabeau and Voss, 1993). This method involves ligation of "linkers" to restriction enzyme-digested DNA followed by PCR amplification using a primer that matches part of the linker sequence. In this manner, multiple bands are amplified as in the RAPD assay. However, amplification conditions are stringent, so the PCR primer (long enough to be unlikely to amplify untreated genomic DNA) will only bind to the perfectly matching linkers. Hence, this technique should be more robust than the RAPD assay.

Automated DNA sequencers can be used for analysis of SCARs, microsatellite markers, or AFLPs. SSCP analysis can also be performed on these machines if non-denaturing polyacrylamide gels are used (Hayashi, 1992). Current automated sequencers are capable of discriminating between four different fluorescent labels, one for each of the 4 bases in DNA.
It is proposed that apparatuses for future DNA marker work should be far more flexible, and capable of distinguishing a far larger number of fluorescent labels. This would permit the use of multiplexing, the analysis of several PCR loci (e.g., 10 or more) simultaneously on a single lane of a gel. Different PCR loci can be identified in a multiplex reaction by the use of labels fluorescing at different wavelengths attached to one or both of the PCR primers specific to each locus. In this manner it is also possible to simultaneously amplify the various loci via multiplex PCR in a single tube. For electrophoresis, size or mobility standards, distinguished by their own particular label, can be included in each lane of the gel. These internal standards provide for correction of lane-to-lane mobility variation, facilitating identification of allelic variants across samples with maximum precision.

The software that accompanies automated DNA sequencers allows direct printout of the DNA sequences obtained, removing the need for tedious manual interpretation of sequence ladders. Similarly, future machines for population genetics should be able to automatically score the alleles from multiple loci and save the data. Standardization of data file formats will allow data analysis to be immediately performed by the appropriate program. Alternatively, data could be accumulated over many separate runs and then analyzed in toto. Hence, in the "superlab" of the future, all the steps in the process will be integrated.

Achievement of this vision would of course require the cooperative effort of experts from many disciplines, including engineers, molecular geneticists, computer programmers, and genetic statisticians. However, as many of the elements are already in place (e.g., the ABI PRISM 310 Genetic Analyzer by Perkin Elmer), this vision may soon become a reality.

LITERATURE CITED

Adams WT (1983) Application of isozymes in tree breeding.
In: Tanksley SD, Orton TJ (eds) Isozymes in plant genetics and breeding, part A. Elsevier Science Publishers B.V., Amsterdam, pp 381-400.
Adams WT, Birkes DS (1991) Estimating mating patterns in forest tree populations. In: Fineschi S, Malvolti ME, Cannata F, Hattemer HH (eds) Biochemical markers in the population genetics of forest trees. SPB Academic Publishing bv, Hague, Netherlands, pp 157-172.
Adams WT, Neale DB, Doerksen AH, Smith DB (1990) Inheritance and linkage of isozyme variants from seed and vegetative bud tissues in coastal Douglas-fir (Pseudotsuga menziesii var. menziesii [Mirb.] Franco). Silvae Genetica 39:153-167.
Ahuja MR, Devey ME, Groover AT, Jermstad KD, Neale DB (1994) Mapped DNA probes from loblolly pine can be used for restriction fragment length polymorphism mapping in other conifers. Theor Appl Genet 88:279-282.
Alden J, Loopstra C (1987) Genetic diversity and population structure of Picea glauca on an altitudinal gradient in interior Alaska. Can J For Res 17:1519-1526.
Allard RW (1956) Formulas and tables to facilitate the calculation of recombination values in heredity. Hilgardia 24:235-278.
Allendorf FW, Knudsen KL, Blake GM (1982) Frequencies of null alleles at enzyme loci in natural populations of ponderosa and red pine. Genetics 100:497-504.
Amasino RM (1986) Acceleration of nucleic acid hybridization rate by polyethylene glycol. Anal Biochem 152:304-307.
Apuya NR, Frazier BL, Keim P, Roth EJ, Clark KG (1988) Restriction fragment length polymorphisms as genetic markers in soybean, Glycine max (L.) Merrill. Theor Appl Genet 75:889-901.
Atipanumpai L (1989) Acacia mangium: studies on the genetic variation in ecological and physiological characteristics of a fast growing plantation tree species. Acta Forestalia Fennica 206:1-92.
Bailey NTJ (1949) The estimation of linkage with differential viability, II and III. Heredity 3:220-228.
Bailey-Serres J, Hanson DK, Fox TD, Leaver CJ (1986) Mitochondrial genome rearrangement leads to extension and relocation of the cytochrome c oxidase subunit I gene in sorghum. Cell 47:567-576.
Baker RG (1983) Holocene vegetational history of the western United States. In: Wright HE (ed) Late-Quaternary environments of the United States, Volume 2: The Holocene. Univ. of Minnesota Press, Minneapolis, pp 109-127.
Beavis WD, Grant D (1992) A linkage map based on information from four F2 populations of maize (Zea mays L.). Theor Appl Genet 82:636-644.
Becerra Velasquez VL, Gepts P (1994) RFLP diversity of common bean (Phaseolus vulgaris) in its centres of origin. Genome 37:256-263.
Beckman JS, Soller M (1986) Restriction fragment length polymorphisms and genetic improvement of agricultural species. Euphytica 35:111-124.
Beckman JS, Soller M (1988) Detection of linkage between marker loci affecting quantitative traits in crosses between segregating populations. Theor Appl Genet 76:228-236.
Benne R (1990) RNA editing in trypanosomes: is there a message? TIG 6:177-181.
Benne R, Van Den Burg J, Brakenhoff JPJ, Sloof P, Van Boom JH, Tromp MC (1986) Major transcript of the frameshifted coxII gene from trypanosome mitochondria contains four nucleotides that are not encoded in the DNA. Cell 46:819-826.
Bernatzky R, Tanksley S (1986) Toward a saturated linkage map in tomato based on isozymes and random cDNA sequences. Genetics 112:887-898.
Binelli G, Bucci G (1994) A genetic linkage map of Picea abies Karst., based on RAPD markers, as a tool in population genetics. Theor Appl Genet 88:283-288.
Blake TK (1990) RFLP analysis in barley: making the technology accessible. Agri Canada Biocrop Network RFLP Workshop. Manitoba, Canada. Abstract p 4.
Bonen L, Boer PH, McIntosh JE, Gray MW (1987) Nucleotide sequence of the wheat mitochondrial gene for subunit I of cytochrome oxidase. Nucleic Acids Res 15:6734.
Bower RC, Dunsworth BG (1988) Provenance test of western red cedar on Vancouver Island. In: Smith NJ (ed) Western red cedar-does it have a future? Conference Proceedings, University of British Columbia, Faculty of Forestry, pp 131-135.
Bradshaw HD, Stettler RF (1994) Molecular genetics of growth and development in Populus. II. Segregation distortion due to genetic load. Theor Appl Genet 89:551-558.
Bradshaw HD, Stettler RF (1995) Molecular genetics of growth and development in Populus. IV. Mapping QTLs with large effects on growth, form, and phenology traits in a forest tree. Genetics 139:963-973.
Bradshaw HD, Villar M, Watson BD, Otto KG, Stewart S, Stettler RF (1994) Molecular genetics of growth and development in Populus. III. A genetic linkage map of a hybrid poplar composed of RFLP, STS, and RAPD markers. Theor Appl Genet 89:167-178.
Briscoe D, Stephens JC, O'Brien SJ (1994) Linkage disequilibrium in admixed populations: applications in gene mapping. J Hered 85:59-63.
Budowle B, Baechtel FS (1990) Modifications to improve the effectiveness of restriction fragment length polymorphism typing. Applied and Theoretical Electrophoresis 1:181-188.
Burr B, Burr FA (1991) Recombinant inbreds for molecular mapping in maize: theoretical and practical considerations. Trends Genet 7:55-60.
Carlson JE, Hong Y-P, Brown GR, Glaubitz JC (1994) FISH and RAPD marker genome analysis in conifers. In: Gresshoff PM (ed) Plant Genome Analysis. CRC Press, Boca Raton, Florida, pp 69-82.
Carlson JE, Tulsieram LK, Glaubitz JC, Luk VWK, Kauffeldt C, Rutledge R (1991) Segregation of random amplified DNA markers in F1 progeny of conifers. Theor Appl Genet 83:194-200.
Carson HL (1990) Increased genetic variance after a population bottleneck. Trends Ecol Evol 5:228-230.
Casanova JL, Pannetier C, Jaulin C, Kourilsky P (1990) Optimal conditions for directly sequencing double-stranded PCR products with Sequenase. Nucleic Acids Res 18:4028.
Chang C, Bowman JL, DeJohn AW, Lander ES, Meyerowitz EM (1988) Restriction fragment length polymorphism linkage map for Arabidopsis thaliana. Proc Natl Acad Sci USA 85:6856-6860.
Cheliak WM, Morgan K, Dancik BP, Strobeck C, Yeh FC (1984a) Segregation of allozymes in megagametophytes of viable seed from a natural population of jack pine, Pinus banksiana Lamb. Theor Appl Genet 69:145-151.
Cheliak WM, Pitel JA, Murray G (1984b) Population structure and the mating system of white spruce. Can J For Res 15:301-308.
Cheliak WM, Skroppa T, Pitel JA (1987) Genetics of the polycross. I. Experimental results from Norway spruce. Theor Appl Genet 73:321-329.
Cherry M (1995) Genetic variation in western red cedar (Thuja plicata Donn) seedlings. Ph.D. Dissertation, University of British Columbia, Vancouver, B.C.
Church GM, Gilbert W (1984) Genomic sequencing. Proc Natl Acad Sci USA 81:1991-1995.
Clark AG, Aguade M, Prout T, Harshman LG, Langley CH (1995) Variation in sperm displacement and its association with accessory gland protein loci in Drosophila melanogaster. Genetics 139:189-201.
Conkle MT (1981) Isozyme variation and linkage in six conifer species. In: Proc. Sym. Isozymes of North American Forest Trees and Forest Insects. USDA For. Serv. Gen. Tech. Rep., PSW-48, pp 11-17.
Copes DL (1981) Isoenzyme uniformity in western red cedar seedlings from Oregon and Washington. Can J For Res 11:451-453.
Covello PS, Gray MW (1989) RNA editing in plant mitochondria. Nature 341:662-666.
Covello PS, Gray MW (1990) Differences in editing at homologous sites in messenger RNAs from angiosperm mitochondria. Nucleic Acids Res 18:5189-5196.
Covello PS, Gray MW (1992) Silent mitochondrial and active nuclear genes for subunit II of cytochrome c oxidase (coxII) in soybean: evidence for RNA-mediated gene transfer. EMBO J 11:3815-3820.
Covello PS, Gray MW (1993) On the evolution of RNA editing. Trends Genet 9:265-268.
Critchfield WB (1984) Impact of the Pleistocene on the genetic structure of North American conifers. In: Lanner RM (ed) Proceedings of the 8th North American Forest Biology Workshop. Logan, Utah: Utah State University, pp 70-118.
DeVerno LL, Byrne JR, Pitel JA, Cheliak WM (1989) Constructing conifer genomic libraries: a basic guide. Information report PI-X-88, Petawawa National Forestry Institute, Forestry Canada.
DeVerno LL, Mosseler A (1994) Analysis of genetic diversity in forest trees using RAPD markers. Sixth International Meeting, Molecular Genetics Working Party, IUFRO, Portland, Maine, May 20-23. Abstract.
Devey ME, Fiddler TA, Lui BH, Knapp SJ, Neale DB (1994) An RFLP linkage map for loblolly pine based on a three-generation pedigree. Theor Appl Genet 88:273-278.
Devey ME, Jermstad KD, Tauer CG, Neale DB (1991) Inheritance of RFLP loci in a loblolly pine three-generation pedigree. Theor Appl Genet 83:238-242.
Dhillon SS (1987) DNA in tree species. In: Bonga JM, Durzan DJ (eds) Cell and tissue culture in forestry. Martinus Nijhoff Publishers, Dordrecht Boston Lancaster, pp 298-313.
Dow BD, Ashley MV, Howe HF (1995) Characterization of highly variable (GA/CT)n microsatellites in the bur oak, Quercus macrocarpa. Theor Appl Genet 91:137-141.
Eckert RT, Joly R, Neale DB (1981) Genetics of isozyme variants and linkage relationships among allozyme loci in 35 eastern white pine clones. Can J For Res 11:573-579.
Edwards MD, Stuber CW, Wendel JF (1987) Molecular marker facilitated investigations of quantitative trait loci in maize. I. Numbers, genomic distribution and type of gene action. Genetics 116:113-125.
El-Kassaby YA (1992) Domestication and genetic diversity-should we be concerned? For Chron 68:687-700.
El-Kassaby YA (1995) Evaluation of the tree-improvement delivery system: factors affecting genetic potential. Tree Physiol 15:545-550.
El-Kassaby YA, Ritland K (1995) Genetic variation in low elevation Douglas-fir of British Columbia and its relevance to gene conservation. Biodiversity and Conservation 4: (In press).
El-Kassaby YA, Russell J, Ritland K (1994) Mixed mating in an experimental population of western red cedar, Thuja plicata. J Hered 85:227-231.
El-Kassaby YA, Sziklai O, Yeh FC (1982) Linkage relationships among 19 polymorphic allozyme loci in coastal Douglas-fir (Pseudotsuga menziesii var. menziesii). Can J Genet Cytol 24:101-108.
Epperson BK, Allard RW (1987) Linkage disequilibrium between allozymes in natural populations of lodgepole pine. Genetics 115:341-352.
Fang G, Hammar S, Grumet R (1992) A quick and inexpensive method for removing polysaccharides from plant genomic DNA. Biotechniques 13:52-55.
Felsenstein J (1993) PHYLIP (Phylogeny Inference Package) version 3.5c. Distributed by the author. Dept of Genetics, University of Washington, Seattle.
Fitch WM, Margoliash E (1967) Construction of phylogenetic trees. Science 155:279-284.
Fowells HA (1965) Western red cedar (Thuja plicata Donn). In: Silvics of forest trees of the United States. U.S. Department of Agriculture, Forest Service, Agricultural Handbook No. 271, pp 686-691.
Fowler DP, Lester DT (1970) Genetics of red pine. U.S. For. Serv. Wash. Off. Res. Pap. WO-8.
Fowler DP, Morris RW (1977) Genetic diversity in red pine: evidence for low genic heterozygosity. Can J For Res 7:343-347.
Gerber S, Rodolphe F (1994a) An estimation of the genome length of maritime pine (Pinus pinaster Ait.). Theor Appl Genet 88:289-292.
Gerber S, Rodolphe F (1994b) Estimation and test for linkage between markers: a comparison of LOD score and χ² test in a linkage study of maritime pine (Pinus pinaster Ait.). Theor Appl Genet 88:293-297.
Gifford EM, Foster AS (1989) Morphology and evolution of vascular plants. W.H. Freeman and Co., New York.
Gill KS, Lubbers EL, Gill BS, Raupp WJ, Cox TS (1991) A genetic linkage map of Triticum tauschii (DD) and its relationship to the D genome of bread wheat (AABBDD). Genome 34:362-372.
Glaubitz JC, Carlson JE (1992) RNA editing in the mitochondria of a conifer. Curr Genet 22:163-165.
Gleed JA (1995) Incorporating biotechnology into a forest program (a New Zealand example). The Leslie L. Schaffer Lectureship in Forest Science, November 22, 1995, The University of British Columbia, Vancouver, B.C.
Goldberg R (1978) DNA sequence organization in the soybean plant. Biochem Genet 16:45-68.
Grabau EA (1986) Nucleotide sequence of the cytochrome oxidase subunit I gene from soybean mitochondria. Plant Mol Biol 7:377-384.
Grant D, Blair D, Behrendsen W, Meier R, Beavis W, Bowen S, Tenborg R, Martich J, Fincher R, Smith S, Smith H, Keaschall J (1988) Identification of quantitative-trait loci for plant height and ear height using RFLP's. Maize Coop News Lett 62:71-73.
Grattapaglia D, Sederoff R (1994) Genetic linkage maps of Eucalyptus grandis and Eucalyptus urophylla using a pseudo-testcross: mapping strategy and RAPD markers. Genetics 137:1121-1137.
Grattapaglia D, Wilcox P, Chaparro JX, O'Malley D, McCord S, et al (1991) A RAPD map of loblolly pine in 60 days. Third International Congress of the International Society for Plant Molecular Biology, abstract 2224.
Gray MW, Covello PS (1993) RNA editing in plant mitochondria and chloroplasts. FASEB J 7:64-71.
Gray MW, Hanic-Joyce PJ, Covello PS (1992) Transcription, processing and editing in plant mitochondria. Annu Rev Plant Physiol Plant Mol Biol 43:145-175.
Gualberto JM, Lamattina L, Bonnard G, Weil JH, Grienenberger JM (1989) RNA editing in wheat mitochondria results in the conservation of protein sequences. Nature 341:660-662.
Guries RP, Friedman ST, Ledig FT (1978) A megagametophyte analysis of genetic linkage in pitch pine (Pinus rigida Mill.). Heredity 40:309-314.
Hamrick JL, Godt MJW (1990) Allozyme diversity in plant species. In: Brown AHD, Clegg MT, Kahler AL, Weir BS (eds) Plant population genetics, breeding, and genetic resources. Sinauer Associates Inc., Sunderland, Massachusetts, pp 43-63.
Hartl DL, Clark AG (1989) Principles of population genetics. Sinauer Associates Inc., Sunderland, Massachusetts.
Harvey MJ, Muehlbauer FJ (1989) Linkages between restriction fragment length, isozyme and morphological markers in lentil. Theor Appl Genet 77:395-401.
Hastings A (1990) The interaction between selection and linkage in plant population. In: Brown AHD, Clegg MT, Kahler AL, Weir BS (eds) Plant population genetics, breeding, and genetic resources. Sinauer Associates Inc., Sunderland, Massachusetts, pp 163-180.
Hayashi K (1992) PCR-SSCP: A method for detection of mutations. GATA 9:73-79.
Hebda RJ (1983) Late-glacial and postglacial vegetation history at Bear Cove Bog, northeast Vancouver Island, British Columbia. Can J Bot 61:3172-3192.
Hebda RJ, Mathewes RW (1984) Holocene history of cedar and Native Indian cultures of the North American Pacific Coast. Science 225:711-713.
Helentjaris T (1987) A genetic linkage map for maize based on RFLPs. Trends Genet 3:217-221.
Helentjaris T, Slocum M, Wright S, Schaeffer A, Nienhuis J (1986) Construction of genetic linkage maps in maize and tomato using restriction fragment length polymorphisms. Theor Appl Genet 72:761-769.
Heun M, Kennedy AE, Anderson JA, Lapitan NLV, Sorrells ME, Tanksley SD (1991) Construction of a restriction fragment length polymorphism map for barley (Hordeum vulgare). Genome 34:437-444.
Hiesel R, Combettes B, Brennicke A (1994) Evidence for RNA editing in mitochondria of all major groups of land plants except the Bryophyta. Proc Natl Acad Sci USA 91:629-633.
Hiesel R, Schobel W, Schuster W, Brennicke A (1987) The cytochrome oxidase subunit I and subunit III genes in Oenothera mitochondria are transcribed from identical promoter sequences. EMBO J 6:29-34.
Hiesel R, Wissinger B, Schuster W, Brennicke A (1989) RNA editing in plant mitochondria. Science 246:1632-1634.
Hoch B, Maier RM, Appel K, Igloi GL, Kossel H (1991) Editing of a chloroplast mRNA by creation of an initiation codon. Nature 353:178-180.
Horn P, Rafalski A (1992) Non-destructive RAPD genetic diagnostics of microspore-derived Brassica embryos. Plant Mol Biol Rep 10:285-293.
Hughes DW, Galau G (1988) Preparation of RNA from cotton leaves and pollen. Plant Mol Biol Rep 6:253-257.
Ingle J, Timmis J, Sinclair J (1975) The relationship between satellite deoxyribonucleic acid, ribosomal ribonucleic acid gene redundancy, and genome size in plants. Plant Physiol 55:496-501.
Isaac PG, Jones VP, Leaver CJ (1985) The maize cytochrome c oxidase subunit I gene: sequence, expression and rearrangement in cytoplasmic male sterile plants. EMBO J 4:1617-1623.
Isabel N, Beaulieu J, Bousquet J (1995) Complete congruence between gene diversity estimates derived from genotypic data at enzyme and RAPD loci in black spruce. Proc Natl Acad Sci USA 92:6369-6373.
Jermstad KD, Reem AM, Henifin JR, Wheeler NC, Neale DB (1994) Inheritance of restriction fragment length polymorphisms and random amplified polymorphic DNAs in coastal Douglas-fir. Theor Appl Genet 89:758-766.
Kadowaki K, Suzuki T, Kazama S, Oh-fuchi T, Sakamoto W (1989) Nucleotide sequence of the cytochrome oxidase subunit I gene from rice mitochondria. Nucleic Acids Res 17:7519.
Kam-Morgan LNW, Gill BS (1989) DNA restriction fragment length polymorphisms: a strategy for genetic mapping of D genome of wheat. Genome 32:724-732.
Kawasaki ES (1990) Amplification of RNA. In: Innis MA, Gelfand DH, Sninsky JJ, White TJ (eds) PCR Protocols. Academic Press, San Diego New York, pp 21-27.
Keim P, Shoemaker RC, Palmer RG (1989) Restriction fragment length polymorphism diversity in soybean. Theor Appl Genet 77:786-792.
Kelly JD, Afanador L, Haley SD (1995) Pyramiding genes for resistance to bean common mosaic virus. Euphytica 82:207-212.
Kemmerer EC, Kao T, Deng G, Wu R (1989) Isolation and nucleotide sequence of the pea cytochrome oxidase subunit I gene. Plant Mol Biol 13:121-124.
Kimura M (1983) The neutral theory of molecular evolution. Cambridge University Press, Cambridge.
King JN, Dancik BP, Dhir NK (1984) Genetic structure and mating system of white spruce (Picea glauca) in a seed production area. Can J For Res 14:639-643.
Kosambi DD (1944) The estimation of map distance from recombination values. Ann Eugen 12:172-175.
Kudla J, Igloi GL, Metzlaff M, Hagemann R, Kossel H (1992) RNA editing in tobacco chloroplasts leads to the formation of a translatable psbL mRNA by a C to U substitution within the initiation codon. EMBO J 11:1099-1103.
Lagercrantz U, Ryman N (1990) Genetic structure of Norway spruce (Picea abies): Concordance of morphological and allozymic variation. Evolution 44:38-53.
Lambeth CC (1980) Juvenile-mature correlations in Pinaceae and implications for early selection. For Sci 26:571-580.
Lande R (1988) Genetics and demography in biological conservation. Science 241:1455-1460.
Lande R, Thompson R (1990) Efficiency of marker-assisted selection in improvement of quantitative traits. Genetics 124:743-756.
Lander ES, Botstein D (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121:185-199.
Lander ES, Green P, Abrahamson J, Barlow A, Daly MJ, Lincoln SE, Newburg L (1987) Mapmaker: an interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1:174-181.
Landry BS, Hubert N, Etoh T, Harada JJ, Lincoln SE (1991) A genetic map for Brassica napus based on restriction fragment length polymorphisms detected with expressed DNA sequences. Genome 34:543-552.
Landry BS, Kesseli R, Farrara B, Michelmore RW (1987) A genetic map of lettuce (Lactuca sativa L.) with restriction fragment length polymorphism, isozyme, disease resistance and morphological markers. Genetics 116:331-337.
Lange K, Boehnke M (1982) How many polymorphic genes will it take to span the human genome? Am J Hum Genet 34:842-845.
Larson CS (1937) The employment of species, types and individuals in forestry. Royal Vet. and Agr. Coll. Yearbook, Copenhagen, Denmark, pp 74-154.
Ledig FT (1986) Conservation strategies for forest gene resources. For Ecol Manage 14:77-90.
Ledig FT, Conkle MT (1983) Gene diversity and genetic structure in a narrow endemic, Torrey pine (Pinus torreyana Parry ex Carr). Evolution 37:79-85.
Levins R (1963) Theory of fitness in a heterogeneous environment. II. Developmental flexibility and niche selection. Am Nat 97:75-90.
Lewontin RC (1957) The adaptations of populations to varying environments. Cold Spring Harbor Symp Quant Biol 22:395-408.
Lewontin RC (1984) Detecting population differences in quantitative characters as opposed to gene frequencies. Am Nat 123:115-124.
Lincoln S, Daly M, Lander E (1992) Constructing genetic maps with MAPMAKER/EXP 3.0. Whitehead Institute Technical Report 3rd edition.
Litt M, Luty JA (1989) A hypervariable microsatellite revealed by in vitro amplification of a dinucleotide repeat within the cardiac muscle actin gene. Am J Hum Genet 44:397-401.
Loeschcke V, Tomiuk J, Jain SK (1994) Conservation genetics. Birkhauser Verlag, Basel, Switzerland.
Lorieux M, Goffinet B, Perrier X, Gonzalez de Leon D, Lanaud C (1995a) Maximum-likelihood models for mapping genetic markers showing segregation distortion. I. Backcross populations. Theor Appl Genet 90:73-80.
Lorieux M, Goffinet B, Perrier X, Gonzalez de Leon D, Lanaud C (1995b) Maximum-likelihood models for mapping genetic markers showing segregation distortion. II. F2 populations. Theor Appl Genet 90:81-89.
Lynch M, Milligan BG (1994) Analysis of population genetic structure with RAPD markers. Mol Ecol 3:91-99.
Lyttle TW (1993) Cheaters sometimes prosper: distortion of mendelian segregation by meiotic drive. Trends Genet 9:205-210.
Maier RM, Hoch B, Zeltz P, Kossel H (1992) Internal editing of the maize chloroplast ndhA transcript restores codons for conserved amino acids. Plant Cell 4:609-616.
Marshall DR, Jain SK (1968) Phenotypic plasticity of Avena fatua and A. barbata. Am Nat 102:457-467.
Martin B, Nienhuis J, King D, Shaefer A (1989) Restriction fragment length polymorphisms associated with water use efficiency in tomato. Science 243:1725-1728.
Maynard Smith J, Haigh J (1974) The hitch-hiking effect of a favourable gene. Genet Res 23:23-35.
McCouch SR, Kochert G, Yu ZH, Wang ZY, Khush GS, Coffman WR, Tanksley SD (1988) Molecular mapping of rice chromosomes. Theor Appl Genet 76:815-829.
Michaels SD, John MC, Amasino RM (1994) Removal of polysaccharides from plant DNA by ethanol precipitation. BioTechniques 17:274-276.
Moran GF, Muona O, Bell JC (1989) Acacia mangium: a tropical forest tree of the coastal lowlands with low genetic diversity. Evolution 43:231-235.
Mosseler A, Egger KN, Hughes GA (1992) Low levels of genetic diversity in red pine confirmed by random amplified polymorphic DNA markers. Can J For Res 22:1332-1337.
Mosseler A, Innes DJ, Roberts BA (1991) Lack of allozymic variation in disjunct Newfoundland populations of red pine (Pinus resinosa). Can J For Res 21:525-528.
Mukai Y, Suyama Y, Tsumura Y, Kawahara T, Yoshimaru H, Kondo T, Tomaru N, Kuramoto N, Murai M (1995) A linkage map for sugi (Cryptomeria japonica) based on RFLP, RAPD, and isozyme loci. Theor Appl Genet 90:835-840.
Mulligan RM (1991) RNA editing: when transcript sequences change. Plant Cell 3:327-330.
Muona O (1990) Population genetics in forest tree improvement. In: Brown AHD, Clegg MT, Kahler AL, Weir BS (eds) Plant population genetics, breeding, and genetic resources. Sinauer Associates Inc., Sunderland, Massachusetts, pp 282-298.
Murray G, Skeates D (1982) Variation in height of White Spruce Provenances after 10 and 20 years in five field tests. In: Eskert (ed) Proc. of 28th North Eastern Tree Improvement Conference. Durham, New Hampshire. pp 82-89.
Nam H-G, Giraudat J, den Boer B, Moonan F, Loos WDB, Hauge BM, Goodman HM (1989) Restriction fragment length polymorphism linkage map of Arabidopsis thaliana. Plant Cell 1:699-705.
Namkoong G (1984) A control concept of gene conservation. Silvae Genet 33:160-163.
Neale DB, Williams CG (1991) Restriction fragment length polymorphism mapping in conifers and applications to forest genetics and tree improvement. Can J For Res 21:545-554.
Nei M (1972) Genetic distance between populations. Am Nat 106:283-292.
Nei M (1973) Analysis of gene diversity in subdivided populations. Proc Natl Acad Sci USA 70:3321-3323.
Nei M (1977) F-statistics and analysis of gene diversity in subdivided populations. Ann Hum Genet 41:225-233.
Nei M (1978) Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics 89:583-590.
Nei M (1987) Molecular evolutionary genetics. Columbia University Press, New York.
Nei M, Chesser RK (1983) Estimation of fixation indices and gene diversities. Ann Hum Genet 47:253-259.
Nelson CD, Kubisiak TL, Stine M, Nance WL (1994) A genetic linkage map of longleaf pine (Pinus palustris Mill.) based on random amplified polymorphic DNAs. J Hered 85:433-439.
Nelson CD, Nance WL, Doudrick RL (1993) A partial genetic linkage map of slash pine (Pinus elliottii Engelm. var. elliottii) based on random amplified polymorphic DNAs. Theor Appl Genet 87:145-151.
Newman DH, Williams CG (1991) The incorporation of risk in optimal selection age determination. For Sci 37:1350-1364.
Nienhuis J, Helentjaris T, Slocum M, Ruggero B, Schaefer A (1987) Restriction fragment length polymorphism analysis of loci associated with insect resistance in tomato. Crop Sci 27:797-803.
Nilan RA (1990) The North American barley genome mapping project. Agri Can Biocrop Network RFLP Workshop. Manitoba, Canada. Abstract p 5.
Nugent JM, Palmer JD (1991) RNA-mediated transfer of the gene coxII from the mitochondrion to the nucleus during flowering plant evolution. Cell 66:473-481.
O'Malley DM, Allendorf FM, Blake GM (1979) Inheritance of isozyme variation and heterozygosity in Pinus ponderosa. Biochem Genet 17:233-250.
O'Malley DM, Crane B, McKeand SE, Liu BH, Sederoff RR (1994) Genomic mapping of quantitative traits in loblolly pine. In: Proc. TAPPI Biological Sciences Symposium. Minneapolis, Minnesota, October 3-6. pp 173-177.
Ochman H, Gerber AS, Hartl DL (1988) Genetic applications of an inverse polymerase chain reaction. Genetics 120:621-623.
Oda K, Yamato K, Ohta E, Nakamura Y, Takemura M, Nozato N, Akashi K, Kanegae T, Ogura Y, Kohchi T, Ohyama K (1992) Gene organization deduced from the complete sequence of liverwort Marchantia polymorpha mitochondrial DNA. A primitive form of plant mitochondrial genome. J Mol Biol 223:1-7.
Olson M, Hood L, Cantor C, Botstein D (1989) A common language for physical mapping of the human genome. Science 245:1434-1440.
Osborn TC, Alexander DC, Fobes JF (1987) Identification of restriction fragment length polymorphisms linked to genes controlling soluble solids content in tomato fruit. Theor Appl Genet 73:350-356.
Owens JN, Colangeli AM, Morris S (1990) The effect of self-, cross, and no pollination on ovule, embryo, seed, and cone development in western red cedar (Thuja plicata). Can J For Res 20:66-75.
Paran I, Michelmore RW (1993) Development of reliable PCR-based markers linked to downy mildew resistance genes in lettuce. Theor Appl Genet 85:985-993.
Paterson AH, Lander ES, Hewitt JD, Peterson S, Lincoln SE, Tanksley SD (1988) Resolution of quantitative traits into Mendelian factors by using a complete linkage map of restriction fragment length polymorphisms. Nature 335:721-726.
Perry DJ, Knowles P (1990) Evidence of high self fertilization in natural populations of eastern white cedar (Thuja occidentalis L.). Can J Bot 68:663-668.
Perry DJ, Knowles P, Yeh FC (1990) Allozyme variation of Thuja occidentalis L. in Northwestern Ontario. Biochem Syst Ecol 18:111-115.
Rafalski JA, Tingey SV (1993) Genetic diagnostics in plant breeding: RAPDs, microsatellites and machines. Trends Genet 9:275-280.
Reed KC, Mann DA (1985) Rapid transfer of DNA from agarose gels to nylon membranes. Nucleic Acids Res 13:7207-7221.
Rehfeldt GE (1994) Genetic structure of western red cedar populations in the Interior West. Can J For Res 24:670-680.
Reiter RS, Williams JGK, Feldman KA, Rafalski JA, Tingey SV, Scolnik PA (1992) Global and local genome mapping in Arabidopsis thaliana by using recombinant inbred lines and random amplified polymorphic DNAs. Proc Natl Acad Sci USA 89:1477-1481.
Reynolds J, Weir BS, Cockerham CC (1983) Estimation of the coancestry coefficient: basis for a short-term genetic distance. Genetics 105:767-779.
Rudin D, Ekberg I (1978) Linkage studies in Pinus sylvestris using macro-gametophyte allozymes. Silvae Genet 27:1-12.
Russell JH, Burdon R, Yanchuk A (1995) Inbreeding depression and experimental selfed populations in western redcedar (Thuja plicata). Evolution and tree breeding: meeting of the Canadian Tree Improvement Association/Western Forest Genetics Association, Victoria, B.C., August 28 - September 1, 1995. Abstract.
Saiki RK, Gelfand DH, Stoffel S, Scharf SJ, Higuchi R, Horn GT, Mullis KB, Erlich HA (1988) Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239:487-491.
Sambrook J, Fritsch EF, Maniatis T (1989) Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
Savolainen O (1994) Genetic variation and fitness: conservation lessons from pines. In: Loeschcke V, Tomiuk J, Jain SK (eds) Conservation genetics. Birkhauser Verlag, Basel, Switzerland.
Savolainen O, Karkkainen K (1992) Effect of forest management on gene pools. New Forests 6:329-345.
Scharf SJ (1990) Cloning with PCR. In: Innis MA, Gelfand DH, Sninsky JJ, White TJ (eds) PCR Protocols. Academic Press, San Diego New York, pp 84-91.
Schlichting CD (1986) The evolution of phenotypic plasticity in plants. Ann Rev Ecol Syst 17:667-693.
Schoen DJ, Stewart SC (1986) Variation in male reproductive investment and male reproductive success in white spruce. Evolution 40:1109-1120.
Schuster W, Wissinger B, Hiesel R, Unseld M, Gerold E, Knoop V, Marchfelder A, Binder S, Schobel W, Scheike R, Granger P, Ternes R, Brennicke A (1991) Between DNA and protein - RNA editing in plant mitochondria. Physiol Plant 81:437-445.
Siggins HW (1933) Distribution and rate of fall of conifer seeds. J Agric Res 47:119-128.
Simon JP, Bergeron Y, Gagnon D (1986) Isozyme uniformity in red pine (Pinus resinosa) in the Abitibi Region, Quebec. Can J For Res 16:1133-1135.
Slocum MK, Figdore SS, Kennard WC, Suzuki YK, Osborn TC (1990) Linkage arrangement of restriction fragment length polymorphism loci in Brassica oleracea. Theor Appl Genet 80:57-64.
Smith DN, Devey ME (1994) Occurrence and inheritance of microsatellites in Pinus radiata. Genome 37:977-983.
Soller M, Beckman JS (1990) Marker-based mapping of quantitative trait loci using replicated progenies. Theor Appl Genet 80:205-208.
Song KM, Suzuki JY, Slocum MK, Williams PH, Osborn TC (1991) A linkage map of Brassica rapa (syn. campestris) based on restriction fragment length polymorphism loci. Theor Appl Genet 82:296-304.
Sorensen F (1967) Linkage between marker genes and embryonic lethal factors may cause disturbed segregation ratios. Silvae Genet 16:132-134.
Sorensen F (1969) Embryonic genetic load in coastal Douglas-fir, Pseudotsuga menziesii var. menziesii. Am Nat 103:389-398.
Sper-Whitis GL, Russell AL, Vaughn JC (1994) Mitochondrial RNA editing of cytochrome c oxidase subunit II (coxII) in the primitive vascular plant Psilotum nudum. Biochim Biophys Acta 1218:218-220.
Srivastava PS, Johri BM (1988) Pollen embryogenesis. J Pathology 23-24:83-99.
Strauss SH, Bousquet J, Hipkins VD, Hong YP (1992a) Biochemical and molecular genetic markers in biosystematic studies of forest trees. New Forests 6:125-158.
Strauss SH, Conkle MT (1986) Segregation, linkage, and diversity of allozymes in knobcone pine. Theor Appl Genet 72:483-493.
Strauss SH, Hong YP, Hipkins VD (1993) High levels of population differentiation for mitochondrial DNA haplotypes in Pinus radiata, muricata, and attenuata. Theor Appl Genet 86:605-611.
Strauss SH, Lande R, Namkoong G (1992b) Limitations of molecular-marker-aided selection in forest tree breeding. Can J For Res 22:1050-1061.
Stuber CF, Edwards MD, Wendel JF (1987) Molecular marker facilitated investigations of quantitative trait loci in maize: II Factors influencing yield and its component traits. Crop Sci 27:639-648.
Swofford DL (1989) BIOSYS-1, Release 1.7. Illinois Natural History Survey, University of Illinois, Champaign, Ill.
Swofford DL, Selander RB (1981) BIOSYS-1: a FORTRAN program for the comprehensive analysis of electrophoretic data in population genetics and systematics. J Hered 72:281-283.
Taberlet P, Gielly L, Pautou G, Bouvet J (1991) Universal primers for amplification of three non-coding regions of chloroplast DNA. Plant Mol Biol 17:1105-1109.
Tulsieram LK, Compton WA, Morris R, Thomas-Compton M, Eskridge K (1992a) Analysis of genetic recombination in maize populations using molecular markers. Theor Appl Genet 84:65-72.
Tulsieram LK, Glaubitz JC, Kiss G, Carlson JE (1992b) Single-tree genetic linkage mapping in conifers using haploid DNA from megagametophytes. Bio/Technology 10:686-690.
Vida G (1994) Global issues of genetic diversity. In: Loeschcke V, Tomiuk J, Jain SK (eds) Conservation genetics. Birkhauser Verlag, Basel, Switzerland, pp 9-19.
von Rudloff E, Lapp MS (1979) Population variation in leaf oil terpene composition of western red cedar, Thuja plicata. Can J Bot 57:476-479.
von Rudloff E, Lapp MS, Yeh FC (1988) Chemosystematic study of Thuja plicata: multivariate analysis of leaf oil terpene composition. Biochem Syst Ecol 16:119-125.
Wagner DB, Furnier GR, Saghai-Maroof MA, Williams SM, Dancik BP, Allard RW (1987) Chloroplast DNA polymorphisms in lodgepole and jack pines and their hybrids. Proc Natl Acad Sci USA 84:2097-2100.
Wahl GM, Stern M, Stark GR (1979) Efficient transfer of large DNA fragments from agarose gels to diazobenzyloxymethyl-paper and rapid hybridization by using dextran sulfate. Proc Natl Acad Sci USA 76:3683-3687.
Weber JL, May PE (1989) Abundant class of human DNA polymorphism which can be typed using the polymerase chain reaction. Am J Hum Genet 44:388-396.
Weir BS (1990) Genetic data analysis: methods for discrete population genetic data. Sinauer Associates Inc., Sunderland, Massachusetts.
Weir BS, Cockerham CC (1984) Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370.
Weller JI, Soller M, Brody T (1988) Linkage analysis of quantitative traits in an interspecific cross of tomato (Lycopersicon esculentum x Lycopersicon pimpinellifolium) by means of genetic markers. Genetics 118:329-339.
Whitkus R (1988) Modified version of GENESTAT: A program for computing genetic statistics from allelic frequency data. Plant Genet Newsletter 4:10.
Williams CG, Hamrick JL, Lewis PO (1995) Multiple-population versus hierarchical conifer breeding programs: a comparison of genetic diversity levels. Theor Appl Genet 90:584-594.
Williams JGK, Kubelik AR, Livak KJ, Rafalski JA, Tingey SV (1990) DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res 18:6531-6535.
Wilson AC, Cann RL, Carr SM, George M, Gyllensten UB, Helm-Bychowski KM, Higuchi RG, Palumbi SR, Prager EM, Sage RD, Stoneking M (1985) Mitochondrial DNA and two perspectives on evolutionary genetics. Biol J Linn Soc 26:375-400.
Wright S (1951) The genetical structure of populations. Ann Eugenics 15:323-354.
Wright S (1978) Evolution and the genetics of populations, vol. 4. Variability within and among natural populations. University of Chicago Press, Chicago.
Xie CY, Dancik BP, Yeh FC (1991) The mating system in natural populations of Thuja orientalis. Can J For Res 21:333-339.
Yazdani R, Yeh FC, Rimsha J (1995) Random amplified polymorphic DNA (RAPD) linkage map in Pinus sylvestris (L.). Dendrome: Forest Tree Genome Research Updates 2(1):3.
Yeh FC (1981) Analysis of gene diversity in some species of conifers. In: M.T. Conkle (ed) Proc. of Sym. on Isozymes of North American Forest Trees and Insects. U.S. For. Serv. Tech. Rep. PSW-48. pp 48-52.
Yeh FC (1988) Isozyme variation of Thuja plicata (Cupressaceae) in British Columbia. Biochem Syst Ecol 16:373-377.
Yeh FC, Khalil MAK, El-Kassaby YA, Trust DC (1986) Allozyme variation in Picea mariana from Newfoundland: genetic diversity, population structure, and analysis of differentiation. Can J For Res 16:713-720.
Yeh FC, O'Malley DM (1980) Enzyme variation in natural populations of Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco) from British Columbia. 1. Genetic variation patterns in coastal populations. Silvae Genet 29:83-92.
Zabeau M, Voss P (1993) Selective restriction fragment amplification: a general method for DNA fingerprinting. European Patent Application 92402629.7 (Publ. Number 0 534 858 A1).
Zhou C, Yang Y, Jong AY (1990) Miniprep in ten minutes. Biotechniques 8:172-173.
Zietiewicz E, Akalin N, Labuda D (1995) Neutral polymorphisms in the deletion-prone regions of the dystrophin gene. Hum Hered 45:80-83.

