UBC Theses and Dissertations

The evolution of alternative splicing after polyploidy Tack, David Christopher 2016

THE EVOLUTION OF ALTERNATIVE SPLICING AFTER POLYPLOIDY

by

David Christopher Tack

B.Sc., The University of Michigan, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES

(Botany)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

May 2016

© David Christopher Tack, 2016

Abstract

Gene and genome duplications have made major contributions to the genomes of eukaryotes. Alternative splicing modulates gene expression and alters protein function. First, I examine alternative splicing patterns in the allopolyploid Brassica napus, revealing that the genome-wide trends of alternative splicing in duplicated genes of an evolutionarily new allotetraploid plant are very similar overall to those found in Arabidopsis thaliana. Within Brassica napus, I show that the alternative splicing patterns of the reunited homeologs are not well conserved, highlighting that alternative splicing is a rapidly evolving aspect of gene expression. Second, using Arabidopsis thaliana, I investigated the divergence of alternative splicing between paralogs, revealing about 30% qualitative conservation of alternative splicing events. I determined that qualitatively conserved events are most often not quantitatively conserved, indicating either incomplete divergence or specialization. I examined the duplicate gene pair CCA1/LHY in detail, showing a case of subfunctionalization of alternative splicing after gene duplication that has implications for the cold response pathway of A. thaliana. By analyzing a transcriptome data set from nonsense-mediated decay mutants, I showed that alternative splicing-mediated nonsense-mediated decay has diverged significantly between pairs of whole genome duplicates and between pairs of tandem duplicates. Third, I investigated the immediate effects of allopolyploidization on gene expression and alternative splicing using three resynthesized Brassica napus lines.
Many of the effects of allopolyploidization are repeatable; however, some changes to gene expression and alternative splicing are unique to an instance of polyploidy. In all three polyploids surveyed, intron retention events that changed their frequency did so in an overwhelmingly negative fashion (i.e., the levels of alternatively spliced transcripts went down), and the majority of these changes were parallel between polyploids. Other classes of alternative splicing events showed a far more balanced set of changes in response to polyploidy. Natural B. napus showed significantly more increases in intron retention frequency vs. the parental species than any of the resynthesized lines. I assert that many of the changes in levels of alternatively spliced transcripts can be attributed to the stochastic nature of polyploidization.

Preface

A version of chapter 2 has been published. Chalhoub B, Denoeud F, Liu S, Parkin IA, Tang H, Wang X, Chiquet J, Belcram H, Tong C, Samans B et al. (82 co-authors) (2014) Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome. Science, 345(6199), 950-953. The project was conceived by BC, KLA, and me. Genescope raised the plants, extracted the RNA, and sequenced the libraries. I mapped the reads, processed the reads with custom scripts to detect alternative RNA splicing, and compared these splicing patterns within homeolog pairs.

A version of chapter 3 has been published. Tack DC, Pitchers, WR and Adams, KL. (2014) Transcriptome Analysis Indicates Considerable Divergence in Alternative Splicing Between Duplicated Genes in Arabidopsis thaliana. Genetics, 198(4), 1473-1481. The project was conceived by KLA and me. I grew the plants, extracted the RNA, prepared the libraries, mapped the reads, and processed and counted AS events in the reads. WRP and I analyzed the data. I wrote the first draft. KLA, WRP and I edited the manuscript and wrote the final draft.

Chapter 4 is based on a manuscript in preparation.
Tack DC, Pitchers, WR and Adams, KL. (in preparation) Transcriptomic changes accompanying allopolyploidization are variable between resynthesized lines of Brassica napus. I grew the plants, extracted the RNA, mapped the reads, and processed and counted the AS events in the reads. WRP and I analyzed the data. I wrote the first draft. KLA, WRP and I edited the manuscript and wrote the final draft.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
Chapter 1: Introduction
  1.1 Polyploidy
    1.1.1 Gene Evolution after Duplication
  1.2 Alternative Splicing
    1.2.1 Alternative Splicing after Gene Duplication
  1.3 Research Objectives
Chapter 2: Profiling and Analyzing Alternative Splicing in Brassica napus
  2.1 Introduction
  2.2 Materials and Methods
    2.2.1 Quality Checking and Exploration of the B. napus Genome Sequence
    2.2.2 Comparing Brassica napus to Arabidopsis thaliana
    2.2.3 Quality Checking B. napus RNA-seq Data
    2.2.4 Mapping of RNA-seq Data
    2.2.5 Development of Custom Analysis Pipeline
    2.2.6 Characterization of Homeolog Pairs
  2.3 Results
    2.3.1 Quality Checking and Mapping
    2.3.2 Development of a Custom Analysis Pipeline
    2.3.3 Genome-wide Characterization of AS
    2.3.4 Characterization of Homeolog Pairs
  2.4 Discussion
Chapter 3: Transcriptome Analysis Indicates Considerable Divergence in Alternative Splicing between Duplicated Genes in Arabidopsis thaliana
  3.1 Introduction
  3.2 Materials and Methods
    3.2.1 Plant Growth, RNA Extraction, and Library Preparation
    3.2.2 Library Sequencing and Mapping/Processing
    3.2.3 Equivalent Junction and Event Calling
    3.2.4 Analyses of leaf RNA-seq Data
    3.2.5 Analyses of NMD Data
    3.2.6 Generation of Singletons and Duplicate Sets
    3.2.7 RT-PCR Confirmations
  3.3 Results
    3.3.1 Conservation of Alternative Splicing Within Paralog Pairs
    3.3.2 CCA1 and LHY Have Functionally Characterized, Divergent AS
    3.3.3 Conservation of AS-Induced NMD within Paralog Pairs
  3.4 Discussion
Chapter 4: Transcriptomic Changes Accompanying Allopolyploidization are Variable between Resynthesized and Natural Lines of Brassica napus
  4.1 Introduction
  4.2 Materials and Methods
    4.2.1 Plant Material and RNA-seq
    4.2.2 Mapping and Scripting
    4.2.3 Statistical Methods
    4.2.4 RT-PCR
  4.3 Results
    4.3.1 Global Changes in Expression of Genes upon Allopolyploidy
    4.3.2 Global Changes in B. napus vs. Resynthesized Allopolyploids
    4.3.3 Changes in Alternative Splicing Upon Allopolyploidization
    4.3.4 Changes in Alternative Splicing in B. napus
    4.3.5 Gene Ontology for Alternative Splicing Categories
    4.3.6 Experimental Verification of Alternative Splicing Events
    4.3.7 Changes in Serine-Arginine Gene Alternative Splicing
    4.3.8 Patterns of Splicing between Homeologs
  4.4 Discussion
Chapter 5: Conclusion
  5.1 Summary
  5.2 Future Directions
Bibliography
Appendices
  Appendix A Supplementary tables and figures of chapter 3
  Appendix B Supplementary tables and figures of chapter 4

List of Tables

Table 2.1 Description of RNA-Seq reads obtained by sequencing cDNA with the Solexa Illumina technology (single reads) from major tissue and developmental stages of B. napus
Table 2.2 Summary of the draft genome of B. napus
Table 2.3 Progression of RNA-seq reads through clipping, mapping, quality filtering the mapped results, then calling with the custom Python scripts
Table 2.4 Summary of alternate splicing mapping, event calling, and discovered events in both subgenomes of B. napus
Table 2.5 Summary of annotated and verified transcripts and events in B. napus
Table 2.6 Comparison of alternative splicing patterns observed between pairs of homoeologs in the An and Cn subgenomes of B. napus
Table 2.7 Coverage of introns by RNA Seq data
Table 2.8 Event thresholds and rules
Table 2.9 Chi-squared test confirming the non-independence of conserved junctions
Table 3.1 Conservation of alternative splicing events in duplicated gene pairs
Table 3.2 NMD status of paralog pairs
Table 3.3 RT-PCR primer sets
Table 4.1 Expression changes between parentals and allopolyploids
Table 4.2 Conserved changes shared by all allopolyploids
Table 4.3 Expression changes between parentals and B. napus
Table 4.4 Conserved changes shared by B. napus
Table 4.5 Intron retention frequency changes after allopolyploidy
Table 4.6 Intron retention frequency changes in B. napus
Table 4.7 Homeolog analysis
Table A.1 Analysis numbers for using 3 read threshold
Table A.2 Analysis numbers for using 5 read threshold
Table A.3 Analysis numbers for using 8 read threshold
Table B.1 Summary of data and mapping
Table B.2 Alternative acceptor changes in synthetics
Table B.3 Alternative donor changes in synthetics
Table B.4 Alternative position changes in synthetics
Table B.5 Alternative acceptor changes in B. napus
Table B.6 Alternative donor changes in B. napus
Table B.7 Alternative position changes in B. napus
Table B.8 Homeolog analysis in B. napus
Table B.9 Summary of GO analysis
Table B.10 Changes in SR gene splicing levels
Table B.11 Summary of PCR verification

List of Figures

Figure 2.1 A typical FastQC inspection of read quality from the RNA-seq libraries
Figure 2.2 Paradigm for calling equivalent junctions
Figure 2.3 AS event frequencies in B. napus
Figure 2.4 Graphs of intron coverage
Figure 2.5 Design schematic for Python algorithm as it processes individual RNA-seq reads when iterating through SAM (Sequence Alignment Map) files
Figure 3.1 Distribution of fold change differences for α-WG and tandem duplicates
Figure 3.2 Graphical representation of conservation status of alternative splicing events between paralogs
Figure 3.3 RT-PCR results
Figure 3.4 Alternative splicing divergence between CCA1 and LHY paralogs
Figure 4.1 Examples of Modeled Events
Figure 4.2 Density of AS fold change, synthetics
Figure 4.3 Density of AS fold change, natural
Figure 4.4 Results of RT-PCR event confirmations

Acknowledgements

I thank my supervisor, Dr. Keith Adams, for allowing me the opportunity to explore the frontiers of plant genome research and tackle new and exciting projects in an emergent field. I thank my committee members, Dr. Loren H. Rieseberg, Dr. Quentin Cronk, and Dr. Naomi Fast, for keeping me on my toes and tempering modern in-silico analysis with the rigours of the modern scientific process and criticism. I thank Dr. Cameron Grisdale and Donald Wong for helpful discussions of biology and bioinformatics, as well as the entirety of the Fast Lab for comradeship throughout my time at UBC. I thank Vivienne Lam and the Graham lab for an equal amount of learning opportunities and the ability to learn more aspects of applying computational approaches to problems in biology. Finally, I thank my parents for their understanding and the limitless support they provided for what would be considered by some to be an eccentric and bizarre turn of a research career, and for instilling in me a curiosity about the world we happen to inhabit. This research was funded in part by an NSERC Discovery Grant awarded to Keith Adams.

1 Introduction

Genomes and genes are dynamic, appearing static in the present yet constantly changing over evolutionary time and geographic distance. The size and structure of genomes fluctuate as they are expanded via polyploidy or transposable element proliferation and shrunk via diploidization. Genes share a similarly ephemeral existence: the total number of genes in a genome changes over time as new genes are born and others are removed. Duplicate genes are therefore curious study subjects, as they are the raw protean material of adaptation and innovation.
One aspect of gene expression is alternative splicing, in which the primary mRNA transcript is processed in different ways to yield different final transcripts. I aim to investigate the genome-wide trends of how alternative splicing evolves between duplicated genes.

1.1 Polyploidy

The angiosperm lineage is marked by repeated and iterative episodes of polyploidy (Blanc and Wolfe 2004a, Cui et al. 2006, Jiao et al. 2011, Adams and Wendel 2005, Soltis et al. 2009), with as many as 15% of angiosperm speciation events having an associated ploidy level change (Wood et al. 2009). The expansion of the genome gives evolution both the raw material and genomic contexts for adaptation and speciation (Cui et al. 2006, Blanc and Wolfe 2004a, Otto and Whitton 2000, Tinti et al. 2012). Autopolyploidy is an internal multiplication of a species' genome, usually caused by the union of unreduced gametes. Allopolyploidy hybridizes genomes from two different species; hybridization is usually followed by at least one round of genome doubling, which resolves chromosome pairing and restores fertility. Many key crop species, such as wheat, sugarcane, coffee, cotton, and canola, are polyploids. Polyploidy can allow for rapid niche adaptation and speciation (Comai 2005, Rieseberg and Willis 2007).

Polyploidy is a recurrent theme in the evolutionary history of the angiosperm lineage (Blanc and Wolfe 2004; reviewed in Adams and Wendel 2005), and has even been inferred to have offered adaptive advantages to polyploid lineages that survived the K-T event (Fawcett et al. 2009). The genetic and genomic mechanisms responsible for unleashing such phenotypic novelty range from genome-level rearrangements and deletions to complex epigenetic changes (reviewed in Chen and Ni 2006, Doyle et al. 2008, Soltis et al. 2014).
Immediate consequences of polyploidy can include gene silencing, novel gene expression, gain or loss of alternative splicing events, organ-specific reciprocal expression of homeologs (reunited orthologs), and even large-scale changes in methylation (e.g., Adams et al. 2003, Adams and Wendel 2005, Zhou et al. 2011, Salmon et al. 2005, reviewed in Yoo et al. 2014). Some of these immediate changes may be caused by cis-element differences between homeologs in a new trans-environment; each homeolog has a cis-architecture adapted for its native trans-environment, whereas in allopolyploids, homeologs are in a new hybrid trans-environment with their existing cis-architecture, opening the way for novel expression profiles (Chaudhary et al. 2009, reviewed in Yoo et al. 2014). Perhaps most puzzling is that the allopolyploidization process itself is highly stochastic (reviewed in Buggs et al. 2014), with repeated instances of the same event producing different phenotypic outcomes (Gaeta et al. 2007), indicating that the 'genome shock' that occurs during a merger is chaotic and not entirely repeatable. Thus chance and contingency may play a large role in granting polyploids the right set of traits needed to persist and capitalize on a specific niche or set of conditions. However, this area remains understudied, and the relative strength and importance of repeatable changes accompanying a given allopolyploidization event versus uncommon changes is not currently known.

1.1.1 Gene Evolution after Duplication

On an evolutionary timescale, genes duplicated by polyploidy are often pseudogenized or eliminated from the genome entirely as the genome undergoes diploidization. Subgenomes within a polyploid may experience differential amounts of gene loss, a process called biased fractionation (e.g., Cheng et al. 2012), leaving some subgenomes largely reduced, with most genes pseudogenized, and others very conserved.
Some genes and classes of genes are more likely to revert to single copy status after polyploidy (De Smet et al. 2012), whereas some classes of genes are more likely to persist (Blanc and Wolfe 2004b), and those that persisted after one round of polyploidy are likely to persist through another (Seoighe and Gehring 2004). Those genes that are retained frequently experience asymmetric protein-coding sequence evolution and have differing transcriptional profiles (Blanc and Wolfe 2004b, Casneuf et al. 2006). In Arabidopsis thaliana, a particularly interesting case of duplicate gene divergence has been observed for the whole genome duplicate gene pair Brassinosteroid Kinase 1 (BSK1) and SHORT SUSPENSOR (SSP) (Adams and Liu 2010). The ancestral gene expression pattern and function are retained by BSK1; however, SSP has diverged after duplication, changing from a brassinosteroid signal transducer to a paternal controller of zygote elongation with expression restricted to pollen, rather than expression throughout the plant as in BSK1. Several models for the retention of duplicate genes have been proposed, such as the duplication, degeneration, complementation model (Force et al. 1999), in which each duplicate loses one function or regulatory motif retained by the other, forcing selection to keep both gene copies because neither one is completely redundant. This is generally referred to as subfunctionalization, whereas the evolution of a completely new function or expression pattern is termed neofunctionalization. Subfunctionalization may open the evolutionary path to neofunctionalization by delaying the onset of pseudogenization, allowing time for new mutations to give rise to new functions on which selection may then act.

1.2 Alternative Splicing

Alternative splicing is the differential processing of a gene's primary mRNA transcript, allowing multiple proteins to be encoded by a single gene.
The proteome is typically larger than its corresponding genome, especially in animals, and this disparity is partially accounted for by alternative splicing (Gravely 2001). Alternative splicing is well studied in animal systems, where several key processes that depend on it have been characterized. For example, bcl-x either inhibits or promotes apoptosis depending on the splicing of its primary transcript, which is controlled by an upstream element (Boise et al. 2003, Lee et al. 2012). In Drosophila, alternative splicing plays a functional role in sex determination, sex-specific behaviours, fine-tuning myosin proteins for individual muscle groups, facilitating neural patterning, and the regulation and targeting of homeotic transcription factors involved in developmental morphology (Venables et al. 2012). Although alternative splicing is much less studied in plants, there are good examples of functional consequences. Resistance to cold-induced sweetening is greatly enhanced in potatoes that utilize alternate splice forms of invertase inhibitor at high rates (Brummel et al. 2011). In Arabidopsis, one of YUCCA4's isoforms is cytosolic and present in all tissues, whereas another isoform is only present in flowers and attached to the ER membrane, where it putatively plays a role in auxin synthesis; this demonstrates a case of tissue-specific and subcellular location-specific alternative splicing that contributes to the synthesis of a key hormone for flower development (Kriechbaumer et al. 2012). Numerous examples of alternative splicing being tied to control of the circadian clock exist (James et al. 2012), including Circadian Clock Associated 1 (CCA1), which is alternatively spliced to maintain inhibition of the cold response and parts of the circadian clock (Seo et al. 2012). Alternative splicing plays a large role in mediating the immune response in plants.
One of the classic examples is the N gene in tobacco, which confers resistance to tobacco mosaic virus only when both of its splice forms are present (Whitham et al. 1994, Dinesh-Kumar and Baker 2000). Similarly, in Arabidopsis, RESISTANCE TO PSEUDOMONAS SYRINGAE4 (RPS4) requires several splice variants to fully confer its resistance (Zhang and Grassman 2007). Alternative splicing also plays a role in transcript abundance. Retaining an intron or shifting the reading frame via other events may cause inclusion of a premature stop codon, thus invoking the nonsense-mediated decay (NMD) pathway. In Arabidopsis, up to 18% of regulatory genes that are alternatively spliced are affected by alternative splicing attenuating transcript levels (Kalyna et al. 2011). It was discovered that 17.4% of genes with more than two exons produce alternatively spliced transcripts that are targeted by NMD (Drechsel et al. 2013), highlighting an emergent role for what was largely presumed to be transcriptional noise. Perhaps one of the more interesting cases is the Glycine Rich Protein AtGRP7 (Staiger et al. 2003). AtGRP7 auto-regulates its own transcript level: when the protein concentration is sufficiently high, the protein causes a premature stop codon to be included in AtGRP7 mRNA during splicing. A similar mechanism of auto-regulation occurs in AtGRP8 (Schöning et al. 2008), highlighting some nuances of gene expression that are mediated by alternative splicing. Tissue-specific, stress-specific, subcellular target-specific, and ecotype-specific splicing forms and patterns have all been characterized (Kriechbaumer et al. 2012, Paula 2011, Dammann et al. 2003, Kissen et al. 2012). The relationship between overall gene expression level and alternative splicing has been presumed to be negative based on EST data (English et al. 2010), which suggests that highly expressed genes may be under selection for a particular function.
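The link between intron retention and a premature stop codon described above can be illustrated with a minimal Python sketch. The sequences are hypothetical toy sequences, not real gene sequence, and the function names `first_stop_index` and `has_premature_stop` are assumptions introduced only for illustration. The sketch checks only whether an in-frame stop codon appears before the last codon of the transcript; it does not model how the NMD machinery actually recognizes its targets.

```python
# Toy illustration: intron retention can introduce an in-frame premature stop codon.
# All sequences and names below are hypothetical and chosen for illustration only.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_index(transcript):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(transcript) - 2, 3):
        if transcript[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def has_premature_stop(transcript):
    """True if an in-frame stop codon occurs before the final codon of the transcript."""
    stop = first_stop_index(transcript)
    n_codons = len(transcript) // 3
    return stop is not None and stop < n_codons - 1


exon1 = "ATGGCTGAA"        # ATG GCT GAA
intron = "GTAAGTTGACAG"    # GTA AGT TGA CAG (contains an in-frame TGA)
exon2 = "CTTGGAAAATAA"     # CTT GGA AAA TAA (ends with the normal stop codon)

spliced = exon1 + exon2            # intron removed
retained = exon1 + intron + exon2  # intron retention event

print(first_stop_index(spliced), has_premature_stop(spliced))
print(first_stop_index(retained), has_premature_stop(retained))
```

In this toy case the fully spliced transcript is read through to its final codon, whereas the intron-retaining transcript meets a stop codon inside the retained intron, which is the kind of transcript the text describes as a candidate for NMD.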
Thus alternative splicing can act as both a modulator of gene function and gene expression level, and can be thought of as a mechanism for “fine tuning” of the transcriptome.

1.2.1 Alternative Splicing after Gene Duplication

Subfunctionalization can partition transcripts that were previously produced from a single gene by alternative splicing between duplicates, with each new gene specializing in one of the previous splice variants. In one example, a study in zebrafish found that two isoforms produced by the ancestral Mitf gene had been completely partitioned between two duplicates (mitfa, mitfb) generated by a teleost-specific whole genome duplication event, whereas these two isoforms are retained in a single gene in all other vertebrates (Lister et al. 2001). Another example occurs between mangrove and poplar, in which a fusion event created the chimeric gene SODcp-PRL32 in their ancestor; the two isoforms were subfunctionalized between duplicates in poplar, whereas mangrove has a single gene that is alternatively spliced to produce both transcripts (Cusack and Wolfe 2007). Despite characterization of a few examples, the interaction of alternative splicing and gene duplication remains relatively understudied. If alternative splice forms are frequently partitioned between duplicates in a manner indicating subfunctionalization of splice forms, duplicated genes will presumably have a low level of conservation of alternative splicing overall.
However, each now fully partitioned splice form would be located in a separate gene, allowing selection to further refine an individual transcript and its regulatory motifs for a particular purpose, perhaps escaping the adaptive constraint found in a single gene with two or more functions.

Extensive divergence of alternative splicing was found between a limited number of whole genome and tandem duplicates in Arabidopsis utilizing an RT-PCR based approach, where many duplicates showed divergent organ-specific or stress-specific splicing patterns compared to their paralog (Zhang et al. 2010). In resynthesized allopolyploid Brassica napus, as many as 30% of the genes duplicated by allopolyploidy show changes to alternative splicing compared to their parental splicing pattern, with losses being common (Zhou et al. 2011). These changes can be specific to an organ or stress condition, with some occurring in parallel between independently resynthesized lines, suggesting that some parts of genome shock and allopolyploid formation may be repeatable rather than stochastic (Zhou et al. 2011). Immediate partitioning of splice forms can even occur within a homeolog pair generated by allopolyploidy, potentially acting as a very rapid form of splice form subfunctionalization (Zhou et al. 2011). As many as 20% of homeolog pairs in natural B. napus have altered alternative splicing compared to the parental species, but it is unknown if this change occurred upon or after allopolyploid formation; many of the changes to alternative splicing that occurred upon allopolyploidization in resynthesized hybrids were not found in natural B. napus (Zhou et al. 2011). There are exciting prospects for future investigation, as Zhou's study shows divergence of alternative splicing both at the onset of allopolyploidy and thousands of years later, and suggests that these patterns are perhaps very different from each other. Yet the study was somewhat limited by reliance on RT-PCR of known alternative splice events.
It is difficult to distinguish the long-term patterns of divergence due to selection from those that arise immediately as a result of genome shock with so few data points, but the prospects for future research using more modern techniques are promising. The Zhou et al. (2011) study both represented the first large scale study of alternative splicing divergence in plants and highlighted the stress- and organ-specific nature of many changes to alternative splicing after gene duplication. While this is still the largest reported experiment specifically testing the divergence of alternative splicing in plants, it only encompassed a small subset of whole genome duplicate pairs for which primers could be designed around previously annotated events. Additionally, this method only revealed the presence or absence of an alternatively spliced transcript, and lacks any quantitative information about the overall alternative splicing levels, which represent an as yet untested mode of potential alternative splicing divergence between duplicates. With next-generation sequencing technology, a much more complete set of paralogs could be assayed simultaneously, with no a priori knowledge of events or need for specific primers, while also adding a powerful quantitative dimension of comparison between paralogs.

Cis elements, one aspect of gene regulation that controls alternative splicing, are of particular interest in an allopolyploid, as both sets of genes have cis architecture adapted to the native parental trans environment and regulation. When genes exist in a new hybrid and polyploid trans environment, transcription factors from another subgenome may create transgressive effects on gene expression and splicing, and the stoichiometry of native factors may be perturbed, leading to new or altered splicing and expression patterns. 
Changes in cis elements were implicated to account for most species-specific alternative splicing patterns in vertebrates (Barbosa-Morais et al. 2012), and in another study about 80% of non-exon-skipping events were attributed to divergent cis architecture in Drosophila (McManus et al. 2014). Thus cis elements may be able to reproduce their parental splicing even in a new hybrid trans environment, but this has yet to be investigated in plants. Are the parental species' splicing patterns and levels conserved or greatly changed after allopolyploidy? In cases where splicing has diverged between duplicates, can cis elements be implicated as one of the principal mechanisms driving divergence?

1.3 Research objectives

The aim of my thesis is to further understand the evolutionary relationship between gene duplication and alternative splicing. With only a handful of individual case studies in specific genes, and a few studies finding evidence for genome-wide trends and patterns, the intersection of alternative splicing and gene and genome duplication is a tantalizing area of research. Furthermore, modern sequencing techniques have enabled a more accurate assessment of the importance of alternative splicing, showing it to be a major contributor to both the regulation of the transcriptome and the diversity of the proteome, rather than a mere transcriptional novelty present in a few genes. From 2003 to 2012, the estimate of the proportion of genes that have alternative splicing in Arabidopsis changed from 1.2% to 61% (Syed et al. 2012), highlighting the levels of biological complexity that were previously inaccessible for study, while also expanding the relevance of alternative splicing overall. Next-generation sequencing has enabled much of this progress, as sequencing a cDNA library using Illumina® technology (mRNA-seq) allows for a rapid yet detailed analysis of the transcriptome. 
Expressed Sequence Tags (ESTs) provided very complete information about a few transcripts at a time, but simply do not compare to the information contained in the hundreds of millions of short reads that next-generation technologies offer. I aimed to use next-generation sequencing to study the evolution of alternative splicing in duplicate genes in evolutionarily recent, resynthesized, and paleopolyploid plant systems. These patterns have never been characterized on a genome-wide scale in polyploid plant genomes on any time scale.

The major objectives of my thesis were to:

1. Characterize alternative splicing in Brassica napus. Brassica napus (canola) is an important crop plant and model system for allopolyploidy, yet there has been no large scale examination of its alternative splicing. Using next-generation sequencing, I annotated alternative splicing events for the genome sequencing project using RNA-seq data, while at the same time describing the genome-wide trends in the types and frequencies of different event classes. Additionally, I analyzed the conservation of alternative splicing between homeologs (reunited orthologs) to obtain a more refined and complete assessment of how alternative splicing diverges after allopolyploidy.

2. Study the divergence of alternative splicing patterns in duplicated genes in Arabidopsis thaliana. Arabidopsis is an ideal study system, with decades of research and annotation complementing a comparatively small genome and well-defined sets of duplicate genes. Using mRNA-seq, I closely examined the alpha whole genome duplicate set (α-WG) as well as tandem duplicates, assessing conservation of alternative splicing both qualitatively and quantitatively. I detailed a curious case of splice form subfunctionalization, where the duplicate pair CCA1/LHY has almost entirely partitioned an ancestral splicing event to CCA1. 
Finally, I showed that nonsense mediated decay mediated by AS has diverged remarkably rapidly, with most instances of AS-NMD occurring in only one pair member.

3. Investigate the immediate effects of allopolyploidization on alternative splicing on a genome-wide scale using resynthesized Brassica napus. I show that the frequency of alternative splicing events can change dramatically upon allopolyploidization, with intron retention events changing the most drastically. Not all changes are repeated between instances of polyploidy, highlighting allopolyploidization as chaotic. Homeolog pairs with shared yet quantitatively biased alternative splicing events in the parental species tend to retain that bias even in the allopolyploid, pointing to cis element control being one of the main drivers of alternative splicing divergence between species, and one that may persist even through altered trans environments. Transcriptional changes in the resynthesized allopolyploids were compared with those found in a natural B. napus (canola line), revealing the natural allopolyploid to be transcriptionally distinct yet still containing many differences from its parental species.

Chapter 2: Profiling and Analyzing Alternative Splicing in Brassica napus¹

2.1 Introduction

Whole genome duplication is a recurring phenomenon in the plant lineage (reviewed in Blanc and Wolfe 2004a; Cui et al. 2006; Kasahara 2007; Jiao et al. 2011). This has caused considerable interest in characterizing the evolutionary outcomes of duplicate genes, centering on the retention mechanisms of seemingly redundant genes, and/or the divergence of their expression or protein products. One aspect of gene expression is alternative splicing, which modulates expression (Kalyna et al. 2011) or increases the number of protein products a gene can produce (Nilsen and Graveley 2010, reviewed in Reddy et al. 2013). Stress responses frequently invoke alternative splicing (reviewed in Staiger and Brown 2013). 
Alternative splicing has been studied on a genome-wide scale in the model plant system Arabidopsis thaliana (Marquez et al. 2012, Filichkin et al. 2010). The conservation of alternative splicing patterns between duplicate genes has also been studied in Arabidopsis thaliana (Zhang et al. 2010), as well as in Brassica napus (Zhou et al. 2011), on a small scale using RT-PCR. However, alternative splicing has not yet been studied on a genome-wide scale in Brassica napus, nor has the conservation of alternative splicing between duplicate genes within its genome.

_________________
¹ Most of chapter 2 has been published in Chalhoub B, Denoeud F, Liu S, Parkin I, Tang H, Wang X, Chiquet J, Belcram H, Tong C, Samans B, Corréa M, Da Silva C, Just J, Falentin C, Koh C, Le Clainche I, Bernard M, Bento P, Noel B, Labadie K, Alberti A, Charles M, Arnaud D, Guo H, Daviaud C, Alamery S, Jabbari K, Zhao M, Edger P, Chelaifa H, Tack D, Lassalle G, Mestiri I, Schnel N, Le Paslier MC, Fan G, Renault V, Bayer P, Golicz A, Manoli S, Lee T, Dinh-Thi V, Chalabi S, Hu Q, Fan C, Tollenaere R, Lu Y, Battail C, Shen J, Sidebottom C, Wang X, Canaguier A, Chauveau A, Bérard A, Deniot G, Guan M, Liu Z, Sun F, Lim Y, Lyons E, Town C, Bancroft I, Wang X, Meng J, Ma J, Pires J, King G, Brunel D, Delourme R, Renard M, Aury J, Adams K, Batley J, Snowdon R, Tost J, Edwards D, Zhou Y, Hua W, Sharpe A, Paterson A, Guan C, Wincker P (2014). Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome. Science, 345(6199), 950-953.

Brassica napus is an allotetraploid, formed about 7,500 years ago from its parental species, B. rapa and B. oleracea (Chalhoub et al. 2014). It can be resynthesized, allowing the study of transcriptional changes immediately after allopolyploidization, as well as thousands of years later in the domesticated variety. Some changes in gene expression and genome structure quickly set in after allopolyploidy (e.g., Adams et al. 2003, Gaeta et al. 
2007, Chaudhary et al. 2009), resulting in unique phenotypes that may allow rapid adaptation (Rieseberg and Willis 2007, Otto and Whitton 2000). While a limited number of changes to splicing are known to occur after allopolyploidy (Zhou et al. 2011), these changes have yet to be studied on a genome-wide scale. In resynthesized B. napus, as many as 30% of the genes duplicated by allopolyploidy show changes to alternative splicing relative to the parental splicing pattern, with losses being most common (Zhou et al. 2011). These changes can be organ or stress specific, with most occurring in parallel between independently resynthesized lines, suggesting many parts of genome shock may be static rather than stochastic (Zhou et al. 2011). Partitioning of ancestral splice forms can even occur within a homeolog pair, potentially acting as a rapid form of subfunctionalization (Zhou et al. 2011). As many as 20% of homeolog pairs in natural B. napus have altered alternative splicing compared to the parental state, but it is unknown whether these changes occurred upon or after allopolyploidization, as many of the changes to alternative splicing that occurred in resynthesized plants were not found in natural B. napus (Zhou et al. 2011).

Brassica napus is an important crop plant, as well as an excellent and tractable system for the study of allopolyploidy. The relative abundance and importance of alternative splicing in plants has been characterized in Arabidopsis (Marquez et al. 2012, Filichkin et al. 2010, Kalyna et al. 2011, reviewed in Reddy et al. 2013, Staiger et al. 2013). There is currently considerable interest in annotating and analyzing alternative splicing in crop plants. Of particular interest in an allopolyploid is the behavior of the homeologs, or reunited orthologs from the two parental species. 
After existing in separate species for millions of years, then being reunited in allopolyploidy and exposed to further selection and diploidization, the question becomes: to what extent do the homeologs share alternative splicing patterns? To this end, the transcriptome of B. napus was analyzed for alternative splicing patterns in order to annotate these alternative splicing events. These events were examined in the context of how they are conserved between homeologs.

2.2 Materials and Methods

2.2.1 Quality Checking and Exploration of the B. napus Genome Sequence

Boulos Chalhoub's research group provided all the data for this study electronically via the Genescope servers; the data are summarized in table 2.1 (RNA-Seq reads) and table 2.2 (Summary of the draft genome of B. napus). We received 351 million RNA-seq reads, which were 101bp long and sequenced from a single end. Also included was a list of the homeologous gene pairs. The Brassica napus genome and annotation were analyzed to ascertain their completeness, with details in Figure 2.2. The genome itself contains 79,498 genes; 55,461 of them are anchored to a known locus while 24,037 genes remain in un-anchored regions, though some have been at least narrowed to a chromosome. The sequencing does appear slightly asymmetric, as 14,325 C genome (originating from B. oleracea) genes remain completely un-anchored, compared to 5775 A genome (originating from B. rapa) genes. Regardless of fragmentation, so long as the genes are physically present to be mapped against by RNA-seq reads, and the homeologs are called accurately, an alternative splicing analysis should not be heavily impaired. The gene annotation, predicted in large part by GAZE, has only inferred some of the 5' and 3' untranslated regions of genes along with their coding sequences (CDS), but no exon/intron structure has been directly predicted in the given annotation. 
To infer gene structure when calling splicing events and comparing the structures of the homeologs to each other in all subsequent analyses, UTR regions bordering a CDS were fused to it to form either the first or last exon of the gene, and every other CDS was considered its own exon; introns were inferred to exist in the gaps between such CDS.

2.2.2 Comparing Brassica napus to Arabidopsis thaliana

Brassica napus and Arabidopsis thaliana both belong to the Brassicaceae family, but represent different extremes in terms of genome size and gene number. Arabidopsis thaliana, once chosen for sequencing as the ideal model plant due to its small and potentially simple genome, turned out to be the complex daughter of many successive rounds of duplication and diploidization. Diploidization reduces genome size and complexity, thus plants which have gone through extended diploidization have fewer genes, and fewer similar genes, allowing for a more precise mapping and subsequent analysis of RNA-seq data. Brassica napus is simply gigantic; it is an evolutionarily recent allotetraploid created by merging the genomes of B. rapa and B. oleracea, which each have had a respective triplication or duplication of their genomes, yet without a lengthy amount of time for diploidization to fully reduce genome size. The total number of genes listed for Arabidopsis thaliana after decades of research (TAIR 10) is ~27900 genes, including pseudogenes, while the total number of genes listed in the 'special pre-release draft' of B. napus is 79498, or almost three times the number. From a genomic perspective, having so many homeologs further complicates this problem: uniquely mapping RNA-seq reads will be more challenging because there are many very similar regions and genes between the A and C genomes. Simply put, the size of this genome and the number of similar genes make the experimental data requirements for researching B. napus much more extensive than for a study in A. 
thaliana, for which the most detailed splicing analysis to date has been done (Marquez et al. 2012).

2.2.3 Quality Checking B. napus RNA-seq Data

The RNA-seq data lack any replication structure and are from disparate tissues and environmental conditions (table 2.1). Additionally, while the quality of the reads is high (Figure 2.1), after they have been trimmed with sickle (low quality bases and reads removed), there are only a grand total of 343 million non-paired 101bp reads (table 2.3) across all tissues and treatments. With so few reads against such a massive genome, the average coverage per gene will be much less than in other studies of alternative splicing. Marquez et al. (2012) had a total of 115 million paired-end reads from a normalized library, against ~27900 genes. We have 343 million single-end reads from non-normalized libraries against 79498 genes. While these data could certainly lead to the identification of many alternative splicing events, they are not optimal for a detailed analysis of conservation of homeolog splicing patterns. Sequencing from both ends of an RNA fragment generates paired-end reads, which double the amount of a transcript covered, thereby greatly increasing the chances of capturing any alternative events carried by the transcript itself. The smaller 'windows' into transcripts offered by single-end reads considerably diminish the amount of information each read can carry. The lack of replicates is perhaps a more pressing issue; there is no way to ascertain the robustness or consistency of a splicing event, and it is much more difficult to distinguish it from an experimental artifact. To this end, we have pooled all the lanes together, and set up a 'global' analysis of splicing discovery and conservation between homeologs. This precludes doing any quantitative work at all, as many different tissue types and conditions have been pooled, meaning gene expression and splicing can no longer be normalized to any standard. 
Thus, based on the data provided, what will be investigated experimentally is whether the homeologs are capable of reproducing each other's behavior across a pooled panel of trans environments. Stated another way: how qualitatively conserved are the alternative splicing events of homeologs, given many different trans environments? Although this is an interesting question, and many splice events will be annotated incidentally in its investigation, it is different from examining patterns within discrete trans environments and asking how the homeologs have been conserved or have diverged within one particular context, be it tissue, stress or developmental state.

2.2.4 Mapping of RNA-seq Data

GSNAP was used to map all data against the reference genome provided. The settings used were -J 64 and -j -31 for quality score scale adjustment, --nthreads=8 to use 8 threads, -N 1 to allow novel splicing events, -n 1 and -Q to report only the best hit available for each read, and --nofails to omit reads which fail to produce a meaningful alignment. An example command follows:

gsnap --db=bna_v2.0_ncmtcp_gmap -J 64 -j -31 -N 1 --nthreads=8 --ordered --format=sam -D indexes -n 1 -Q --nofails [read_file].fastq > [output_file].sam

While both TopHat and GSNAP are splice-aware aligners for RNA-seq, we chose GSNAP as it demonstrated a higher tolerance of mismatches, which is important for mapping short reads back to a new genome. GSNAP was also found to be better than TopHat at finding unique hits, which is critical when mapping to duplicate-rich genomes such as B. napus, as reads that are not uniquely mapped are of limited use due to their ambiguity. Overall, the number of reads that remained was reasonable (table 2.3), given the genome size and structure relative to the length of the reads. Sequence alignment map (SAM) files are output by mapping software and analyzed downstream by other software. 
After post-processing of the SAM files to remove erroneous mappings with a custom Python script, 279,371,104 reads out of 343,554,609, or about 81%, remained, with ~19% lost due to either having no mappings or having multiple equally good mappings. The progression of filtering/trimming with sickle, mapping with GSNAP, then re-filtering to remove errant mappings with a custom Python script is summarized in tabular form.

Development of Custom Analysis Pipeline

Cufflinks is by far the most common piece of software for the purpose of analyzing alternative splicing. However, without paired-end reads, it is unable to accurately assemble transcript isoforms, thus its use on single-end reads is not recommended (Cufflinks user manual). Moreover, the reliability of its predictions even when supplied with paired-end reads has been called into question, particularly in plants, which have very different gene structure than animals (Liu et al. 2014). One of the larger issues with Cufflinks and other similar isoform reconstruction software is precisely that it attempts to reconstruct entire transcript isoforms, which can be many kilobases long, from short reads which are at most 200-300bp long. While the mathematical modelling behind this reconstruction is certainly valid in a platonic realm of pure math, it has long been observed, even in animals, that Cufflinks can create chimeric transcripts, combining two separate alternative splicing events into the same transcript isoform even though those events never actually occur together in vivo. Better annotation certainly does assist Cufflinks: decades of molecular research in human, rat, and mouse have given it a strong panel of confirmed forms (supplied via .gff) with which to guide its judgment in an RNA-seq experiment, whereas plant resources have lagged behind considerably in comparison. While single alternative splicing events can be discovered by breaking down Cufflinks results into their event components, it then becomes difficult to accurately quantify these on a per event or junction basis, as Cufflinks will report everything in FPKM (Fragments Per Kilobase of transcript per 
While single alternative splicing events can be discovered by breaking down Cufflinks results into their event components, it then becomes difficult to accurately quantify these on a per event or junction basis, as Cufflinks will report everything in FPKM (Fragments Per Kilobase of transcript per 18Million mapped reads), which is a metric of how much of each predicted or annotated isoform is present, rather than an event based metric. These problems call for a simpler software suite which makes fewer isoform assumptions, is less reliant on having a battery of annotated isoforms, is species and kingdom agnostic, yet can make the best use of short reads possible by adopting a junction and  event centric view of alternative splicing. A series of Python scripts were created with these goals in mind, where individual reads which map to a gene locus are then each compared to the constitutive gene model, logging any incongruities as well as instances of normal splicing. For each junction, counts of both the constitutive patterns and any alternative patterns are collected, opening the way for either event discovery, or to compare the same junction between treatments, or even to compare that junction to an equivalent junction in a related gene, for both types of events present and the relative frequency of those events. More complex patterns like exon skipping are handled similarly, where the details and frequency of the event are logged, but with either no numbered junction, or in relation to a non-standard junction, i.e. a three-five junction where exon four was skipped.Each line of a SAM file represents one entire illumina read (or in the case of paired-end reads, one half of the pair), which the script first compares against a digital index of the genome to assign it to a gene. The read is evaluated on if it mapped to no genes, one gene, or two or moregenes; only reads that hit one gene were considered further. 
Using the CIGAR string, which encodes and describes the way the read aligns to the genome, the read is then compared to the gene model, and assigned to the exons and any junctions that it covers. Reads which contain the CIGAR operations I or D, representing an insertion to or deletion from the reference respectively, are not considered because of the complications they introduce. All constitutive reads were simply saved as a hit to the exon that the read maps to, or the junction that the read spans. Any deviation in the read's mapping from the expected gene model caused the program to examine the read further. The read could then be called as an alternative donor (5' end of a junction is either elongated or truncated), alternative acceptor (3' end of a junction is either elongated or truncated), intron retention (at least 8 base pairs overlap an intron, and the read is not a junction-spanning read), complex event (both donor and acceptor are non-constitutive), or exon skipping (exons are cut out of the transcript, so that non-contiguous exons are linked by a single read). For alternative donors, acceptors and complex events, the exact change of the event was logged and counted in memory, both for annotation purposes and because different events can happen at the same junction; i.e. the program independently counts the occurrences of different alternative donor events at the same junction. Figure 2.5 is a flow chart of how the script processes reads. Table 2.7 summarizes the criteria used to accept events in this study based on the calls of this script. With the exception of intron retention, these criteria were based on the requirements laid out by other studies (Marquez et al. 2012). This script was run on all of the pooled data lanes we were supplied with (table 1.1), after mapping and filtering (table 1.3), resulting in an output file containing each gene's unique splicing profile in a flat text format. 
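The classification rules above can be sketched as follows. This is a simplified illustration, not the thesis pipeline: the function names and event labels are hypothetical, the gene is assumed to lie on the forward strand (on the reverse strand donor and acceptor swap), and the exact shift of each alternative site is not recorded. The 8 bp intron overlap and the exclusion of reads with I or D operations come from the text.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
MIN_INTRON_OVERLAP = 8  # base pairs inside the intron, as stated in the text


def aligned_blocks(pos, cigar):
    """Return reference blocks [(start, end), ...], or None if the read has I/D."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    if any(op in "ID" for _, op in ops):
        return None
    blocks, cursor, block_start = [], pos, None
    for length, op in ops:
        if op in "M=X":              # consumes read and reference
            if block_start is None:
                block_start = cursor
            cursor += length
        elif op == "N":              # spliced gap: reference only
            if block_start is not None:
                blocks.append((block_start, cursor - 1))
            block_start = None
            cursor += length
        # S, H and P do not advance along the reference
    if block_start is not None:
        blocks.append((block_start, cursor - 1))
    return blocks


def classify_read(pos, cigar, exons):
    """Classify one read against a forward-strand gene model (assumption).

    exons: sorted list of (start, end), 1-based inclusive.
    """
    blocks = aligned_blocks(pos, cigar)
    if blocks is None:
        return ["not_considered_indel"]
    donors = {end: i for i, (_, end) in enumerate(exons)}
    acceptors = {start: i for i, (start, _) in enumerate(exons)}
    if len(blocks) == 1:             # unspliced read: check intron overlap
        start, end = blocks[0]
        calls = []
        for i in range(len(exons) - 1):
            intron_start, intron_end = exons[i][1] + 1, exons[i + 1][0] - 1
            overlap = min(end, intron_end) - max(start, intron_start) + 1
            if overlap >= MIN_INTRON_OVERLAP:
                calls.append(f"intron_retention:{i + 1}-{i + 2}")
        return calls or ["constitutive_exon"]
    calls = []
    for (_, left_end), (right_start, _) in zip(blocks, blocks[1:]):
        d, a = donors.get(left_end), acceptors.get(right_start)
        if d is not None and a is not None:
            kind = "constitutive_junction" if a == d + 1 else "exon_skipping"
            calls.append(f"{kind}:{d + 1}-{a + 1}")
        elif d is not None:          # annotated donor, novel acceptor
            calls.append(f"alt_acceptor:{d + 1}-{d + 2}")
        elif a is not None:          # novel donor, annotated acceptor
            calls.append(f"alt_donor:{a}-{a + 1}")
        else:                        # neither end is annotated
            calls.append("complex")
    return calls


# Hypothetical five-exon gene model, for illustration only.
exons = [(100, 200), (300, 400), (500, 600), (700, 800), (900, 1000)]
print(classify_read(550, "51M299N50M", exons))  # exon 3 joined directly to exon 5
```

Counting these labels per junction, rather than assembling whole isoforms, is what keeps the approach usable with short single-end reads.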
Alternative events were not accepted if there was no read support for the predicted constitutive junction, to prevent erroneous calls. However, we felt the standards for accepting an intron retention (Marquez et al. 2012) were too stringent, prompting a further analysis.

To assess our criteria for calling an intron retention event, namely requiring a constitutive read at the junction in question and 10+ reads each at least 8bp inside of the intron, we performed a coverage analysis to compare our results with the coverage-based methods used in other studies of alternative splicing. Previous studies required 100% coverage (Marquez et al. 2012) or 75% coverage (Gan et al. 2011) of the intron by RNA-seq reads to accept an event. All lanes of data were combined using the samtools merge command, and the depth at each position was obtained by calling samtools depth on the merged BAM file. The resulting depth was compared to the introns inferred from the .gff file with a Python script, generating coverage statistics for all introns. These statistics were compared with the 56372 intron retention events called by our criteria, with results listed in table 2.7 and visualized in an accompanying figure.

Characterization of Homeolog Pairs

Alternative splicing events have been called for every annotated gene; however, we need to be able to compare these events within homeolog pairs to assess the degree of alternative splicing conservation. To control for structural divergence between duplicate genes, such as exon gain/loss or fission/fusion, the exon structure of the homeolog pairs must be investigated. By investigating how much exon structure is conserved within a pair, the junctions which retain enough homology for a meaningful comparison of alternative splicing can be discerned, and alternative splicing events at those junctions can be compared against each other, thus accounting for any structural rearrangements that have taken place. 
The supplied genome and annotation provided the putative exon structure for each gene. The exact coordinates of these inferred exons were used to query the B. napus genome so their sequences could be extracted and saved to a separate file (akin to the TAIR file containing all exons). This file was then converted into a local BLAST database and queried with itself, resulting in a BLAST report containing all homologous regions between all exons in the B. napus genome, and more specifically, the conservation of structure between known homeologs. The BLAST output was parsed with a custom Python script, which collected all hits between exons in homeolog pairs, and evaluated which junctions were still intact, or homeologous junctions (see Figure 2.2). A single exonic gain or loss in one homeolog does not cripple the analysis of the entire pair; it only removes one or more junctions that can no longer be called homeologous. Likewise, for fissions/fusions, a single junction is eliminated, but the remaining junctions, if they still have homology, remain resolvable to assess the status of alternative splicing conservation. These structural rearrangements result in the renumbering of exons and associated junctions, i.e. a homeologous junction pair may be between junction 2-3 in the A homeolog and junction 3-4 in the C homeolog; so long as exon 2A matches exon 3C, and exon 3A matches exon 4C (Figure 2.2), the physical junction is considered conserved. These junctions are resolvable in a further analysis of the level of splicing conservation. Of 19,555 homeolog pairs, 15,914 had identical exon structures, and 3641 had one or more rearrangements which were factored into subsequent analysis and assignment of homeologous junctions. Additionally, only junction pairs which had read support for both constitutive models were considered, adding another level of scrutiny. 
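A minimal sketch of how homeologous junctions could be derived from exon-level BLAST matches follows. It assumes the BLAST report has already been reduced to pairs of matching exon numbers within one homeolog pair; that input format, the function name, and the example are hypothetical and are not the thesis script. The example uses the exon correspondence given in the text (exon 2A matching exon 3C and exon 3A matching exon 4C), plus an assumed match between exon 1A and exon 2C.

```python
def homeologous_junctions(exon_matches):
    """Pair junctions whose flanking exons both match between homeologs.

    exon_matches: iterable of (a_exon, c_exon) exon numbers (1-based) that
    matched by BLAST within one homeolog pair.
    Returns a list of ((a_exon, a_exon + 1), (c_exon, c_exon + 1)).
    """
    matches = set(exon_matches)
    pairs = []
    for a, c in sorted(matches):
        # The junction is conserved when the next exon downstream also matches.
        if (a + 1, c + 1) in matches:
            pairs.append(((a, a + 1), (c, c + 1)))
    return pairs


# The C homeolog carries one extra 5' exon, so numbering is shifted by one.
shifted = [(1, 2), (2, 3), (3, 4)]
print(homeologous_junctions(shifted))
```

A gain or loss of one exon only removes the junctions that touch it; every other junction whose two flanking exons still match is kept for comparison, which is the behavior the text describes.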
That is, junctions in both homeologs must have read support for their constitutive model, and the junction must be homeologous via BLAST, before any splicing differences will be assessed.

There are two ways to study alternative splicing via RNA-seq: the qualitative aspect, being the events themselves, and the quantitative aspect, being the frequency at which an event occurs if it is shared qualitatively. The design of the RNA-seq experiment can influence the resolution of the analysis on either of these two points. If one seeks only to catalogue or annotate the most events, a normalized library (Marquez et al. 2012) is potentially the best approach: a nuclease is applied to the sample to eliminate common and redundant reads, such that more of the reads which make it to the Illumina sequencer are from less common transcripts, which are thus enriched for rare alternative splicing events. The pitfall of this design is that one can no longer reliably assay expression, or approach the alternative splicing question from a quantitative perspective. However, normalized libraries are a veritable gold mine of new splice forms, as Marquez et al. demonstrated, gaining 46,955 new splice events in the well annotated and studied Arabidopsis thaliana out of just 115 million reads. In order to analyze the quantitative aspect of alternative splicing, one must have ample read depth in order to accurately capture the relative rates of alternative splicing events. This can be very revealing in that measurable changes in the frequency of an event can add another level of explanatory power over just the binary presence or absence of an event. When using either a qualitative or quantitative method to study homeologs, or any set of related genes, one should strive to account for both types of potential conservation or divergence. 
Conservation is not a binary yes or no, and like most things in biology, it should be placed on an appropriate scale; it is possible for an event to be qualitatively conserved, where the event is shared, but not quantitatively conserved, because it occurs at different frequencies.

In any study, read length is critical to many aspects. When mapping, the longer the read, the more accurately it can be mapped to a gene; this is especially important in duplicate-rich genomes, where the presence of one or two SNPs can make the difference between mapping to a specific paralog and being thrown out of the analysis due to ambiguity. Longer reads also have more splicing information content: they are larger windows into the transcript from which the read originated, so if an alternative event happened, the read has a better chance of carrying it as length increases. Even if an event is not particularly rare as a percentage of a gene's expression, requiring reads that originate from the specific locus of an alternative transcript means mining the data for a fraction of a fraction of the reads; thus every increase in read length helps prevent missing events. Most reads are going to map to an exon, fewer will originate from exon-exon junctions, and even fewer will originate from the specific locus of an alternative transcript to actually carry an alternative event. Consider a 10kb transcript that retains an intron of 150 bp; assuming even fragmentation, only 1.5% of the transcript has a chance to contain the event. Hence, the longer the RNA-seq reads, the higher the chance that they will contain relevant information, and the lower the chance that they will miss events from the transcript they originated from. Higher read depth magnifies the effect of read length and further increases the chances of finding events. Thus it is possible with RNA-seq to examine both quantitative and qualitative aspects of alternative splicing, provided the length and depth are of superior quality. 
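The read-length argument above can be made concrete with a small calculation. This sketch rests on strong simplifying assumptions that are not in the thesis: read start positions are uniform along the transcript, the retained intron sits at a hypothetical position in the middle of the transcript, and a read counts as informative when at least 8 bp of it fall inside the intron (borrowing the 8 bp overlap used by the pipeline). The function name and parameters are illustrative only.

```python
def fraction_informative(transcript_len, region_start, region_len,
                         read_len, min_overlap=8):
    """Fraction of possible read start positions whose read overlaps the
    region of interest by at least min_overlap bases (0-based coordinates)."""
    n_starts = transcript_len - read_len + 1
    region_end = region_start + region_len  # exclusive
    informative = 0
    for s in range(n_starts):
        overlap = min(s + read_len, region_end) - max(s, region_start)
        if overlap >= min_overlap:
            informative += 1
    return informative / n_starts


# The example from the text: a 150 bp retained intron in a 10 kb transcript.
for read_len in (50, 101, 200):
    frac = fraction_informative(10_000, 5_000, 150, read_len)
    print(f"{read_len} bp reads: {frac:.2%} of start positions are informative")
```

Under these assumptions the informative fraction grows with read length, which is the point made in the text; real fragmentation and expression are not uniform, so the absolute values should not be read as estimates for this data set.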
Our data are of neither superior length nor depth given the study subject of B. napus. Thus, we aim only to annotate and compare, qualitatively, the alternative splicing events that we discover.

Using the BLAST results to infer which junction pairs between homeolog pairs retain structural homology, we compared the events present in both equivalent junctions to assess conservation of alternative splicing. This was done on a simple presence or absence rule: as long as an event passed the detection threshold, it was considered present.

2.3 Results

2.3.1 Quality Checking and Mapping

The reads were found to be of suitable quality, with an overall high per-base quality score (Figure 2.1). Although their lack of a replicate structure prohibited several types of analyses, they were found to be adequate for annotation and a simple homeolog alternative splicing analysis. The mapping was very successful (Table 2.3), keeping the majority of reads throughout the various stages of the processing pipeline. A total of 68.9% of all reads were called as uniquely assigned to genes, which is a good result considering the complexity, size, redundancy, and obviously incomplete state of the B. napus genome.

2.3.2 Development of a Custom Python Pipeline

In designing the core components and algorithms of this custom set of scripts, the groundwork was laid for further development and refinement for use on other systems. The basic method has not changed (Figure 2.5), even with the incorporation of the ability to use paired-end reads, which we did not have in this study, and it has since been applied to Arabidopsis thaliana and Cyanidioschyzon merolae in various studies with great success.
In this study, it made the best use of the single-end reads we were provided, and quantified them in discrete counts such that annotation and even a simple homeolog analysis were possible.

We show that our criteria have a very high congruence with coverage-based standards (Table 2.7, Figure 2.4). The vast majority (86.9%) of retained introns called by number of reads have complete coverage, and 95.8% of retained introns have at least 70% coverage. Of the 56372 intron retentions inferred using our criteria, 48974 had complete read coverage; using less stringent coverage requirements does not substantially increase the number of introns callable under a coverage threshold. Requiring 100% coverage to ascertain an intron retention event may be an artifact of EST-based methods that is unrealistic to apply to short reads, where a more quantitative approach may be best. Coverage of retained introns, assuming they are being produced biologically, is a stochastic function of read depth. More depth increases coverage, but assuming even and complete coverage of a very small part of a potentially rare transcript is unrealistic, and a coverage paradigm severely biases intron retention calls towards shorter introns. The density of the coverage data has been plotted (Figure 2.4), revealing intron coverage that is distinctly bimodal. This suggests that highly but incompletely covered introns still group with the introns that by chance have the proper distribution of reads to attain 100% coverage, and should not be considered differently. Additionally, intron coverage has been plotted against intron length, revealing a stark bias.
Introns up to about 75 bp in length show the strictest adherence to bimodality, as it does not take many reads to cover every base, while introns larger than about 75 bp show a very fluid distribution of coverage, with a downward trend as length increases.

2.3.3 Genome-wide Characterization of AS

RNA-Seq data were generated from eight different tissue/stress conditions and pooled to annotate the transcriptome and gain insight into the patterns of alternative splicing that follow allopolyploidy. Globally, the A and C genomes showed no differentiation in the type or prevalence of alternative splicing events. Intron retention was the most frequently observed type of event (62% of AS events) (Table 2.4, Figure 2.3), with exon skipping being the least common (2% of AS events). These frequencies of event types in both subgenomes mirror those found in Arabidopsis thaliana (Marquez et al. 2012) (Table 2.4, Figure 2.3). The splicing events found in the parental species may not be repeated in the allopolyploid (Zhou et al. 2011), potentially due to a new, common trans-environment. Combining all tissues may have missed tissue-specific splicing biases, as seen with gene expression biases between subgenomes in allopolyploids (Chaudhary et al. 2009). Nevertheless, this represents the first study of alternative splicing in an allopolyploid. Overall, one or more AS events were found in 48% of the genes analyzed (Table 2.4). This number is consistent with recent RNA-seq studies in Oryza sativa (48% of all genes in Lu et al. 2010) and Arabidopsis thaliana (61% of intron-containing genes in Marquez et al. 2012).

We were able to validate 262,990 exon-exon junctions that were predicted by GAZE, as well as annotate 99,823 transcripts in the Brassica napus genome. Table 2.5 represents the summation of this work; for genes that had read support for all of their GAZE-predicted junctions, every potential transcript that the data supported was output to a standard gff3 formatted annotation file.
Each constitutive form was given its own transcript, and each alternative splicing event was placed on its own unique transcript, making no assumptions about how these events segregate given the difficulty of full transcript assembly.

2.3.4 Characterization of Homeolog Pairs

The conservation of alternative splicing patterns between members of homeolog pairs was assessed, showing different levels of conservation between different types of events (Table 2.6). Intron retention was the most commonly reciprocated, while exon skipping was the least commonly reciprocated within the pair members. Fragmentation biases may be partially responsible for this result; longer transcripts produce more fragments, and intron retention creates the longest alternative transcripts. The likelihood of a false negative is much lower for intron retention events, while other types of events, which affect transcripts more subtly, may produce more instances of false asymmetry due to the comparative difficulty of resolving the event in the data. Thus, across several tissues and stresses, these rates represent a lower bound on the ability of homeolog pairs to reciprocate each other's splice patterns. It is difficult to determine whether these low rates of reciprocation are due to A and C genome divergence before allopolyploidization, to post-allopolyploidy subfunctionalization, or even to radical change upon allopolyploidization, although this opens up exciting avenues for future study.

We lack tissue- or stress-specific information for each event, so determining the partitioning of forms between homeolog pair members per tissue or stress is impossible, despite this potentially representing a large part of what makes allopolyploidy unique (Zhou et al. 2011). We pooled data from many conditions, whereas some of the most interesting known trends between homeolog pair members are biased expression of one member within a particular condition.
On an organismal scale, homeologs may be able to reciprocate each other's alternative splicing to some level, but the more interesting question is how, and how often, allopolyploidy partitions their actual expression and splicing across ontogenic, tissue, and stress conditions. Although some rates of conservation may be low, they are far greater than one would expect by chance alone, i.e. if events were segregating independently and randomly between junctions. Table 2.9 shows chi-square tests demonstrating the non-independence of the homeologous junctions' behavior, with 'lazy' categories requiring an event of the same type at the homeologous junction, but not necessarily the exact same event (i.e. a donor or acceptor is permitted to differ by a few base pairs). Essentially, if events were randomly positioned within gene junctions, one would not expect to observe this many events occurring in both related junctions; phrased another way, an event in one homeolog's junction increases the chance that an equivalent event will be found in the same junction in the other homeolog.

2.4 Discussion

This survey of the transcriptome of Brassica napus represents the first detailed analysis of alternative splicing of another member of the Brassicaceae. As a resource for future studies, many of the predicted gene structures have been confirmed, although there are undoubtedly many more genes to find and properly annotate in such a large genome. RNA-seq is an exceptionally powerful tool for rapid genome annotation. The overall trends and frequencies of types of alternative splicing in the allotetraploid Brassica napus have been shown to be exceptionally similar to those in the paleopolyploid Arabidopsis thaliana. Although we were not able to do a detailed analysis and pick out any specific effects of allopolyploidy on alternative splicing, the relative abundances of particular event classes appear quite static.
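The non-independence test on homeologous junctions reported in Table 2.9 can be illustrated with the intron retention counts from that table. This is a minimal sketch under stated assumptions: it uses an uncorrected Pearson chi-square on a 2x2 table, which may differ from the exact test settings used for the thesis, and the function name and argument names are hypothetical. Only the four counts are taken from Table 2.9.

```python
import math


def chi_square_2x2(both, only_first, only_second, neither):
    """Pearson chi-square test of independence on a 2x2 table (1 d.f.).

    Rows: event present/absent at the junction in the first homeolog.
    Columns: event present/absent at the junction in the second homeolog.
    No continuity correction is applied (an assumption of this sketch).
    """
    table = [[both, only_first], [only_second, neither]]
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            stat += (table[i][j] - expected) ** 2 / expected

    # Number of shared events expected if the two junctions were independent.
    expected_both = row_sums[0] * col_sums[0] / total
    # Survival function of chi-square with 1 d.f. is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, expected_both, p_value


if __name__ == "__main__":
    # Intron retention, exact-event rule, counts from Table 2.9:
    # both junctions, one homeolog only, other homeolog only, no A.S.
    stat, expected_both, p = chi_square_2x2(9435, 6767, 5818, 47584)
    print(stat, expected_both, p)
```

Comparing `expected_both` with the 9435 junction pairs actually observed to share an intron retention event shows how far the data depart from independent placement of events.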
Homeologs sharing 42% of intron retention events seems quite remarkable; events that are conserved are more likely to have a conserved function, having been retained in both parental species and again after allopolyploidization. While conservation of alternative acceptor (21%) and alternative donor (16%) events is much lower, these forms may require more specificity from trans-acting factors selecting for specific sites than intron retention does, and are thus more likely to be lost through time and allopolyploidization, though it is unknown whether these events were lost before or after allopolyploidization.

Future studies can build on the annotation, and further refine investigations into alternative splicing with more and better data, although the basic method of finding similar junctions via BLAST search and using them as a point of comparison between homeologs (or other duplicate types) serves as a good model. Comparing entire isoforms can become excessively tricky in the case of a rearrangement or addition/deletion of an exon, whereas comparing splicing at specific conserved junctions explicitly investigates the conservation of the alternative splicing event. Even for other types of investigations into alternative splicing, comparing the splicing profile of a specific junction between treatments, rather than levels of a putative transcript, may prove more accurate and meaningful until such a time as RNA-seq reads, or whatever technology comes after them, are long enough to accurately resolve isoforms in non-model systems.

New analysis tools open up exciting new avenues for discovery, yet they must be tempered with critical thinking and comparison to older and more validated methods. We had little control over our data in this experiment, having no say in the experimental design. Thus we compared our quantitative standards for intron retention to coverage-based standards, with which they show very high parity.
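The comparison between read-count criteria and coverage-based standards can be made concrete with a short sketch. The helper below is hypothetical and is not the thesis pipeline: it computes the fraction of an intron covered by a set of reads given as inclusive (start, end) coordinates, and the constants restate two rows of Table 2.7 to show what share of the 56372 called intron retentions would also pass a given coverage threshold.

```python
def intron_coverage(intron_start, intron_end, reads):
    """Fraction of intron bases covered by at least one read.

    Coordinates are inclusive. `reads` is an iterable of (start, end)
    alignment intervals; this interval representation is an assumption
    made for illustration.
    """
    covered = set()
    for start, end in reads:
        lo = max(start, intron_start)
        hi = min(end, intron_end)
        covered.update(range(lo, hi + 1))  # empty when the read misses the intron
    return len(covered) / (intron_end - intron_start + 1)


# Counts taken from Table 2.7: introns passing each coverage threshold (%).
CALLED_INTRON_RETENTIONS = 56372
COVERED_AT_THRESHOLD = {100: 48974, 70: 53997}


def share_passing(threshold):
    """Share of read-count-called intron retentions passing a coverage threshold."""
    return COVERED_AT_THRESHOLD[threshold] / CALLED_INTRON_RETENTIONS


if __name__ == "__main__":
    # A short intron is fully covered by two well placed 100 bp reads...
    print(intron_coverage(101, 175, [(60, 130), (120, 200)]))
    # ...while a longer intron with the same number of reads is not.
    print(intron_coverage(101, 300, [(60, 130), (250, 320)]))
    print(share_passing(100), share_passing(70))
```

The two toy introns illustrate the length bias discussed for Figure 2.4: the same number of reads yields complete coverage for a short intron and partial coverage for a long one.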
In future experiments, we will be able to run RT-PCR on the same RNA that was used to generate RNA-seq libraries, thus opening up new avenues for further validation of our analysis tools. It is an exciting time in biology, as knowledge is being generated at a faster pace than ever before, but a time that requires proficiency in many fields to be successful. The genomics of plants are especially challenging due to genome size and complexity, but even these have begun to be analyzed by modern sequencing methods.

Table 2.1 Description of RNA-Seq reads obtained by sequencing cDNA with the Illumina technology (single reads) from major tissue and developmental stages of B. napus cv Darmor-bzh. Adapted from table S15 (Chalhoub et al. 2014).

Tissue Type   Nitrogen   Library Designation           Number of Reads (100bp)   Size (BP)      Avg. Read Length
Roots         =          AUP_AOSW_2_D09BTACXX.IND7     50164790                  5066643790     101
Roots         +          AUP_BOSW_3_D09BTACXX.IND7     41362592                  4177621792     101
Roots         -          AUP_COSW_4_D09BTACXX.IND7     37635594                  3801194994     101
Stem          +          AUP_DOSW_2_D09BTACXX.IND12    50477761                  5098253861     101
Stem          -          AUP_EOSW_3_D09BTACXX.IND12    52883824                  5341266224     101
Leaves        +          AUP_FOSW_4_D09BTACXX.IND6     41477433                  4189220733     101
Leaves        -          AUP_GOSW_5_D09BTACXX.IND6     35694972                  3605192172     101
Flower Buds   =          AUP_HOSW_5_D09BTACXX.IND12    41332645                  4174597145     101
Total                                                  351029611                 35453990711

Table 2.2 Summary of the draft genome of B. napus.

Summary of B. napus genome
Chromosomes/Contigs   41
Genes                 79498

Scaffold Designation   Genes    Scaffold Designation   Genes
chrA01                 2835     chrA01_random          96
chrA02                 2896     chrA02_random          46
chrA03                 4612     chrA03_random          168
chrA04                 2334     chrA04_random          90
chrA05                 3169     chrA05_random          65
chrA06                 3302     chrA06_random          166
chrA07                 2986     chrA07_random          121
chrA08                 1889     chrA08_random          69
chrA09                 4097     chrA09_random          200
chrA10                 2218     chrA10_random          53
chrC01                 3274     chrC01_random          42
chrC02                 1938     chrC02_random          30
chrC03                 4463     chrC03_random          256
chrC04                 3004     chrC04_random          150
chrC05                 2189     chrC05_random          6
chrC06                 2274     chrC06_random          143
chrC07                 2633     chrC07_random          74
chrC08                 2574     chrC08_random          37
chrC09                 2774     chrC09_random          150
                                chrAnn_random          5755
                                chrCnn_random          14325
                                chrUnn_random          1995
Anchored Genes         55461    Un-anchored Genes      24037
Percent Anchored       0.697640192   Percent Un-anchored   0.302359808

Table 2.3 Progression of RNA-seq reads through clipping, mapping, quality filtering of the mapped results, and calling with the custom Python scripts.

Reads in .fastq files                                  Reads
AUP_AOSW_2_D09BTACXX.IND7.clip.fastq                   49254235
AUP_BOSW_3_D09BTACXX.IND7.clip.fastq                   40781641
AUP_COSW_4_D09BTACXX.IND7.clip.fastq                   37439347
AUP_DOSW_2_D09BTACXX.IND12.clip.fastq                  50266377
AUP_EOSW_3_D09BTACXX.IND12.clip.fastq                  50244245
AUP_FOSW_4_D09BTACXX.IND6.clip.fastq                   41081857
AUP_GOSW_5_D09BTACXX.IND6.clip.fastq                   34488200
AUP_HOSW_5_D09BTACXX.IND12.clip.fastq                  39998707
Total                                                  343554609

Reads in SAM files                                     Reads
AUP_AOSW_2_gs_N1th8J64j-31n1Q.sam                      46328959
AUP_BOSW_3_gs_N1th8J64j-31n1Q.sam                      33820457
AUP_COSW_4_gs_N1th8J64j-31n1Q.sam                      31615568
AUP_DOSW_2_gs_N1th8J64j-31n1Q.sam                      48864309
AUP_EOSW_3_gs_N1th8J64j-31n1Q.sam                      48835898
AUP_FOSW_4_gs_N1th8J64j-31n1Q.sam                      39955046
AUP_GOSW_5_gs_N1th8J64j-31n1Q.sam                      33653952
AUP_HOSW_5_gs_N1th8J64j-31n1Q.sam                      38760793
Total                                                  321834982

Reads in parsed SAM files                              Reads
AUP_AOSW_2_gs_N1th8J64j-31n1Q_interest_sorted.sam      41075224
AUP_BOSW_3_gs_N1th8J64j-31n1Q_interest_sorted.sam      29932978
AUP_COSW_4_gs_N1th8J64j-31n1Q_interest_sorted.sam      27998369
AUP_DOSW_2_gs_N1th8J64j-31n1Q_interest_sorted.sam      43142554
AUP_EOSW_3_gs_N1th8J64j-31n1Q_interest_sorted.sam      41627667
AUP_FOSW_4_gs_N1th8J64j-31n1Q_interest_sorted.sam      35056280
AUP_GOSW_5_gs_N1th8J64j-31n1Q_interest_sorted.sam      26403662
AUP_HOSW_5_gs_N1th8J64j-31n1Q_interest_sorted.sam      34134370
Total                                                  279371104

Summary                                                Reads
Total reads before mapping                             343554609
Total reads before parsing                             321834982
Total reads before event calling                       279371104
Total reads called                                     237011415
Percent reads called (mapped/parsed)                   0.848374838
Percent reads called (raw)                             0.689879887

Table 2.4 Summary of alternative splicing mapping, event calling, and discovered events in both subgenomes of B. napus. Adapted from table S17 (Chalhoub et al. 2014).

Summary of Splice Events             Total        A Genome     C Genome     Unknown
Total Reads                          237011415    116140766    116634878    4235771
Total Genes with Expression          72755        34700        36431        1624
Junctions with Coverage              262990       129595       129254       4141
Intron Retention                     56372        27048        28424        900
Alternate Acceptors                  20612        9610         10630        372
Alternate Donors                     9573         4290         5143         140
Alternate Position                   2402         1105         1239         58
Exon Skips                           1723         802          890          31
Genes with event(s)                  35068        16877        17586        605
Percentage of Genes with event(s)    48.20%       48.64%       48.27%       37.25%

Table 2.5 Summary of annotated and verified transcripts and events in B. napus.

Summary of Annotation in B. napus
mRNA transcripts        99823
Constitutive form       35441
Intron retention        41298
Alternative acceptor    13852
Alternative donor       6467
Alternative position    1630
Exon skipping           1135

Table 2.6 Comparison of alternative splicing patterns observed between pairs of homoeologs in the An and Cn subgenomes of B. napus. Adapted from table S18 (Chalhoub et al. 2014).

Homeolog Event Conservation    Both Homeologs    Homeolog Specific    Percent Shared
Requiring exactly the same event
Intron Retention               9435              12585                42.85%
Alternate Donors               634               3286                 16.17%
Alternate Acceptors            1848              6772                 21.44%
Alternative Position           69                909                  7.06%
Exon Skips                     70                587                  10.65%
Total                          12056             24139                33.31%
Requiring only the same type of event at the equivalent junction
Intron Retention               9435              12585                42.85%
Alternate Donors               727               2854                 20.30%
Alternate Acceptors            2073              5482                 27.44%
Alternative Position           134               686                  16.34%
Exon Skips                     70                587                  10.65%
Total                          12439             22194                35.92%

Table 2.7 Coverage of introns by RNA-Seq data.

Coverage Threshold    Covered Introns    Uncovered Introns    Percentage of Uncovered Introns
100                   48974              7398                 13.124
99                    49309              7063                 12.529
95                    50538              5834                 10.349
90                    51652              4720                 8.373
80                    53125              3247                 5.760
70                    53997              2375                 4.213
60                    54557              1815                 3.220

Table 2.8 Event thresholds and rules.

Type of Event          Event Criterion                                      # Required    Logged As
Intron Retention       Read has 8+ base pairs lying in the intron           10            Intron retention anchored to junction
Alternative Donor      Read spans junction, has an alternate 5' site        2             AltD to junction, bases changed, new junction coordinates
Alternative Acceptor   Read spans junction, has an alternate 3' site        2             AltA to junction, bases changed, new junction coordinates
Complex Event          Read spans junction, with 5' and 3' sites changed    2             Complex, with changes in both sites logged, new junction coordinates
Exon Skipping          Read must skip at least one exon                     2             Exon skipping; new junction entirely

Table 2.9 Chi-squared test confirming the non-independence of conserved junctions.

Homeologous Junctions: 69604

Chi-Square Test Category    No A.S.    B. oleracea specific    B. rapa specific    Both Junctions    p-value
Requiring exactly the same event
Intron Retention*           47584      6767                    5818                9435              2.20E-016
Alternative Donors          66018      1813                    1473                634               2.20E-016
Alternative Acceptors       62049      3580                    3192                1848              2.20E-016
Alternative Position        68784      495                     414                 69                2.20E-016
Exon Skips                  N/A**      320                     267                 70                N/A**
Requiring only the same type of event at the equivalent junction
Intron Retention*           47584      6767                    5818                9435              2.20E-016
Alternative Donors          66018      1582                    1272                727               2.20E-016
Alternative Acceptors       62049      2912                    2570                2073              2.20E-016
Alternative Position        68784      376                     310                 134               2.20E-016

Figure 2.1 A typical FastQC inspection of read quality from the RNA-seq libraries. Base calling quality remains high throughout the length of the reads provided.

Figure 2.2 Paradigm for calling equivalent junctions. Rearrangements may have occurred between duplicated genes. To prevent erroneously comparing junctions that do not share homology, exons were BLAST searched against each other, and only junctions that were structurally conserved were further considered for investigation into alternative splicing, even if the exons themselves have been re-numbered.

Figure 2.3 AS event frequencies in B. napus. Intron retention is the most common type of alternative splicing discovered in B. napus, followed by alternate acceptor and donor, closely mirroring event frequencies found in the model plant Arabidopsis thaliana.

Figure 2.4 Graphs of intron coverage. Coverage may be an unrealistic standard to qualify potential events, as longer introns will require more reads to attain a similar coverage level. Shorter introns have a very bimodal distribution, as they can be completely covered by a few well placed reads, whereas larger ones are far more stochastic, with a downward coverage trend as length increases. Coverage can be said to be a stochastic function of read depth (which is tied to gene expression level), intron length, and AS event frequency.

Figure 2.5 Design schematic for Python algorithm as it processes individual RNA-seq reads when iterating through SAM (Sequence Alignment Map) files.

[Figure 2.5 flowchart. Node labels: Read in GFF; Initialize Genes; Read in SAM; Map Read to Gene (outcomes: Does not Map, Multiple Genes, Maps to One Gene); Map to Gene Features (outcomes: Harmonious with model, Discordant with model); Log Junctions; Log Features Hit; Call Event; Log changes; Save read information in memory; No more reads; Write Counts and Types of Mappings and Events to File.]

Chapter 3: Transcriptome Analysis Indicates Considerable Divergence in Alternative Splicing between Duplicated Genes in Arabidopsis thaliana¹

3.1 Introduction

Gene and genome duplications have created large numbers of duplicated genes, some of which have diverged in expression patterns and functions. All vertebrates have experienced at least two rounds of whole-genome duplication, whereas plants show a characteristic propensity for iterative rounds of polyploidy (reviewed in Blanc and Wolfe 2004a; Cui et al. 2006; Kasahara 2007; Jiao et al. 2011). New duplicates must remain functionally relevant or be deleted or pseudogenized.
Several models for duplicate gene retention and subsequent fates have been proposed, including genetic redundancy, gene dosage balance, genetic robustness, and divergence of protein sequence and expression patterns (reviewed in Sémon and Wolfe 2007; Han et al. 2009). Complementary degenerate mutations may knock out one or more functions or expression patterns in each duplicate, referred to as subfunctionalization, further specializing each duplicate to its now partitioned function (Force et al. 1999; reviewed in Conant and Wolfe 2008). Neofunctionalization is the generation of a new function or expression pattern in one duplicate whereas the other copy retains the ancestral expression pattern or function.

In plants there has been particular interest in characterizing the fates of genes duplicated by whole-genome duplication events. In Arabidopsis thaliana previous work defined and examined the α-whole-genome (WG) duplicates, which originated from a polyploidy event at the base of the Brassicaceae family (Blanc et al. 2003; Bowers et al. 2003). These duplicates are a discrete set of paralogs that originated at the same time, and some of them show expression profile divergence (Blanc and Wolfe 2004b; Casneuf et al. 2006; Liu et al. 2011). Expression patterns of similar sets of genes derived by whole-genome duplication have been characterized in other plants (Schnable et al. 2011; Renny-Byfield et al. 2014). Tandem duplicates contrast with α-WG duplicates in that they arise from small-scale duplications, often by unequal crossing over, and are formed at various times during the evolution of most plant lineages.

¹ A version of chapter 3 has been published. Tack DC, Pitchers WR, and Adams KL. 2014. Transcriptome Analysis Indicates Considerable Divergence in Alternative Splicing Between Duplicated Genes in Arabidopsis thaliana. Genetics, 198(4), 1473-1481.
Alternative splicing is integral to gene expression, as differential splicing of the primary mRNA transcript can alter protein functionality or transcript level. Much of the diversity of the eukaryotic proteome can be attributed to alternative splicing (Nilsen and Graveley 2010), where more transcripts and protein products exist than genes. The most prevalent class of alternative splicing in plants is intron retention (IR), where an intron is not spliced out of the transcript, whereas the most common type in animals is exon skipping (SKIP), where one or more modular exons are skipped to produce different transcript isoforms (reviewed in Reddy et al. 2013). Alternative donor (ALTD) and alternative acceptor (ALTA) events cause a change in the 5'- and 3'-exon boundaries, respectively, whereas an alternative position (ALTP) is a combination of both alternative donor and alternative acceptor. Alternative splicing (AS) affects >50% of genes in A. thaliana and Oryza sativa; exact percentages depend on whether all genes or only intron-containing genes are counted (Lu et al. 2010; Marquez et al. 2012). Environmental and biological stresses are often met with changes in splicing in plants (e.g., Filichkin et al. 2010; reviewed in Staiger and Brown 2012). Numerous examples exist of alternative splicing being tied to the circadian clock (Sanchez et al. 2010; Seo et al. 2012; reviewed in Filichkin and Mockler 2012), as well as studies highlighting its key role in expression-level regulation via nonsense-mediated decay (NMD) (Kalyna et al. 2011; Drechsel et al. 2013). For example, AtGRP8 autoregulates via AS-induced NMD during elevated protein levels (Schöning et al. 2008). While some alternatively spliced transcripts may simply be by-products of random interaction between splicing factors, more functionally characterized forms in plants are continuing to be found, mirroring the wealth of characterized AS isoforms found in animals.
A previous study used RT-PCR-based methods to attempt to ascertain how divergent or conserved the AS patterns are among paralogs in A. thaliana (Zhang et al. 2010). However, this approach was limited by being confined to a small set of known events for which there was prior annotation. With the advent of RNA-seq, which allows for both qualitative discovery (Filichkin et al. 2010; Marquez et al. 2012) and quantitative comparisons, it is possible to reevaluate the degree to which AS events have been conserved between duplicates. Studying conservation of alternative splicing in a single tissue type is thus a reflection of the regulatory divergence that has occurred between duplicates in the same trans environment. Alternative splicing is regulated by cis elements known as exon splicing enhancers (ESEs), exon splicing silencers (ESSs), intron splicing enhancers (ISEs), and intron splicing silencers (ISSs) (Chen and Manley 2009; Huelga et al. 2012; reviewed in Reddy et al. 2013). Both pair members containing the same AS events and expressing them at similar frequencies within a tissue implies a lack of such regulatory divergence in AS, whereas qualitative or quantitative variation in AS between paralogs in the same tissue would show they have experienced asymmetrical change in their regulation.

We analyzed AS patterns in two types of duplicated genes in A. thaliana, α-WG duplicates and tandem duplicates, in leaf tissue, using RNA-seq analysis to assess both qualitative and quantitative conservation. Our findings indicate considerable qualitative divergence in AS patterns between duplicates in leaves. Of those AS events that are qualitatively conserved, a large majority occur at different frequencies and show quantitative divergence.
To assess the relationship between gene duplication and nonsense-mediated decay associated alternative splicing events, we analyzed published RNA-seq data from NMD mutants, which indicated that most duplicates show AS-induced NMD asymmetry.

3.2 Materials and Methods

3.2.1 Plant Growth, RNA Extraction, and Library Preparation

A. thaliana (Col-0) was grown at 20° with a 16-hr/8-hr light/dark cycle and 70% humidity in a growth chamber. The first true leaves were harvested at 20 days after germination and flash frozen in liquid nitrogen before being stored at −80°. Three biological replicates were used, which consisted of pools of different plants from the same growth chamber. Total RNA was isolated from the first true leaves with the Ambion RNAqueous kit (AM1912) in conjunction with the Ambion Plant RNA Isolation Aid (AM9690). Extracted total RNA was DNase treated with the Ambion Turbo DNA-free kit (AM1907) and visualized on a 2% agarose gel to assess quality. Libraries were prepared with the Illumina TruSeq RNA Sample Prep Kit. The low-throughput (48 samples) protocol was followed, using 4 µg total RNA input for each sample and allowing fragmentation to proceed for 2 min.

3.2.2 Library Sequencing and Mapping/Processing

Libraries were sequenced to obtain 100-bp paired-end reads on an Illumina HiSeq 2000, multiplexing three libraries over 1.5 lanes. This yielded a total of 91,315,717, 90,873,817, and 168,071,194 paired-end reads per sample. The RNA-seq reads were mapped against the A. thaliana genome version Ensembl TAIR 10, downloaded as an iGenome from http://ccb.jhu.edu/software/tophat/igenomes.shtml, using GSNAP version 2014-01-21. The following command was used to map:

    gsnap -d thaliana --pairmax-rna=10000 --localsplicedist=10000 --clip-overlap -N 1 --nthreads=6 --quality-protocol=sanger --ordered --format=sam --nofails R1.fastq R2.fastq > mapped.sam
Mapped reads were processed with a set of custom Python scripts to ascertain which genes they originated from, as well as any splicing patterns, calling all splicing events based on the representative gene models. A total of 84,864,583, 79,167,068, and 150,297,611 paired-end reads were mapped. Scripts are available on request. The RNA-seq data have been deposited in NCBI's Gene Expression Omnibus (GEO) (Edgar et al. 2002) and are accessible through GEO Series accession no. GSE57579 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=cdkjkawipzmvdmp&acc=GSE57579).

3.2.3 Equivalent Junction and Event Calling

Exons from the representative models for α-WGD and tandem duplicate pair genes (Liu et al. 2011) were compared against each other, using BLASTn (-task dc-megablast). For each set of reciprocally matched and contiguous exons, an equivalent junction pairing was logged. This allowed investigation of events even if there has been a rearrangement or renumbering of exons between duplicates' models, such as exon loss or fission/fusion between paralogs at homologous junctions. Only events that occurred within these junction pairs were compared. For AS event calling, the minimum number of AS reads required was three, with at least one supporting read in each replicate as well as constitutive coverage of the junction. Overall, 95% of all events had five or more total AS reads and 85% of events had eight or more total AS reads. Analyses were repeated requiring five and eight reads, respectively (Appendix A, Tables A.1, A.2, and A.3), with almost no effect on the results.

3.2.4 Analyses of Leaf RNA-seq Data

Expression data were used to exclude from the data set any gene pairs with a member that had zero reads in any replicate. Gene pairs were also excluded if either member was part of the 5% of genes with the highest variance in number of reads among the three replicates.
This filter thus excludes from our analysis gene pairs for which the repeatability of the read counts was low. The AS data were then further filtered to remove AS events wherein the expression of one partner was represented by fewer than one read per replicate.

Qualitative conservation was assessed using a binary rule: the pattern of splicing was defined as conserved at a given junction if both paralogs had both constitutively and alternatively spliced reads mapped to the same equivalent junction, and as divergent if only one paralog had alternatively spliced reads. After counting qualitatively conserved junctions, the conserved junctions were tested for quantitative conservation. These tests were run as a logistic regression at each junction, using the model

$$\operatorname{logit}\big(\mathrm{pr(alt)}\big) \sim \beta_{\mathrm{paralog}},$$

where $\mathrm{pr(alt)}$ is the proportion of alternative splicing events and $\beta_{\mathrm{paralog}}$ is the paralog identity. Logistic regression was used because the proportion of alternative splicing events is by definition bounded between 0 and 1, and our outcome (difference or no difference between paralogs) is binary. Junctions were defined as quantitatively divergent if the model provided statistical support (i.e., a P-value <0.05) for a quantitative difference between the paralogs in the pattern of constitutively and alternatively spliced reads, and as quantitatively conserved otherwise (i.e., in the absence of statistical evidence for a difference between paralogs, a P-value ≥0.05, we assume conservation). The number and percentage of conserved junctions are reported in Table 3.1.
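The per-junction test for quantitative conservation can be sketched as follows. This is an illustrative stand-in, not the authors' R analysis: it assumes read counts are pooled across replicates and uses a likelihood-ratio test, which for a logistic regression with a single binary predictor (paralog identity) compares a model with separate alternative-splicing proportions per paralog against a model with one shared proportion. All function names are hypothetical.

```python
import math


def binom_loglik(alt, total, p):
    """Binomial log-likelihood kernel for `alt` alternative reads out of `total`."""
    loglik = 0.0
    if alt:
        loglik += alt * math.log(p)
    if total - alt:
        loglik += (total - alt) * math.log(1 - p)
    return loglik


def paralog_lrt(alt_a, const_a, alt_b, const_b):
    """Likelihood-ratio test for a difference in pr(alt) between two paralogs.

    Inputs are counts of alternatively and constitutively spliced reads at
    the equivalent junction of paralog A and paralog B. Returns the test
    statistic and a P-value from the chi-square distribution with 1 d.f.
    """
    n_a, n_b = alt_a + const_a, alt_b + const_b
    p_a, p_b = alt_a / n_a, alt_b / n_b
    p_pooled = (alt_a + alt_b) / (n_a + n_b)

    ll_full = binom_loglik(alt_a, n_a, p_a) + binom_loglik(alt_b, n_b, p_b)
    ll_null = binom_loglik(alt_a, n_a, p_pooled) + binom_loglik(alt_b, n_b, p_pooled)
    stat = max(0.0, 2 * (ll_full - ll_null))
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square survival, 1 d.f.
    return stat, p_value


def classify_junction(alt_a, const_a, alt_b, const_b, alpha=0.05):
    """Apply the rule from the text: P < alpha is divergent, otherwise conserved."""
    _, p_value = paralog_lrt(alt_a, const_a, alt_b, const_b)
    return "divergent" if p_value < alpha else "conserved"


if __name__ == "__main__":
    print(classify_junction(30, 70, 30, 70))
    print(classify_junction(10, 90, 40, 60))
```

A fitted generalized linear model would additionally allow replicate-level terms; the sketch keeps only the paralog contrast described in the text.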
To visualize the difference in levels of a particular AS event between paralogs (i.e., effect size), the difference in number of AS reads was expressed as fold difference after standardizing for the mapped expression level of each gene (Figure 3.1).

Hypotheses about gene ontology were addressed using the web-based Gene Ontology (GO) enrichment analysis and visualization tool (GOrilla) at http://cbl-gorilla.cs.technion.ac.il/ (Eden et al. 2007, 2009). Lists of genes that our analysis classified as conserved or divergent were input into the GOrilla web tool to search for enrichment in GO terms. To determine whether there was a difference in alternative splicing conservation rate between events located in genes' UTR regions vs. coding regions, the location of each event was used as a binary factor. Logistic regression was used to model the conservation status as a function of event location, and then a linear regression was used to model the size of difference in expression pattern between paralogs (fold change) as a function of event location.

3.2.5 Analyses of NMD Data

The occurrence of NMD was investigated using data from Drechsel et al. (2013). Genes were classified as NMD positive if they were reported as showing NMD via alternative splicing in one or more mutant conditions. Genes where no significant increase in frequency in a splicing event in mutant conditions was reported were classified as NMD negative. NMD presence/absence data were used to test for differences between singleton and duplicate genes in the percentage of NMD-positive genes, using a chi-square test.
These data were also used to quantify the percentages of tandem and α-WG duplicates where NMD status was asymmetric, i.e., where one paralog was NMD positive and the other was NMD negative.

All statistical analyses were conducted in R (version 3.0.2) and analysis scripts are archived along with our data at Dryad g7b25 and at https://github.com/willpitchers/petulant-dubstep.

3.2.6 Generation of Singletons and Duplicate Sets

Only terminal duplicate pairs without subsequent gene duplication were used to study alternative splicing. These were found by generating gene families following the analytical procedure described in Liu et al. (2011). First, A. thaliana gene families were obtained from PLAZA 1.0 (Proost et al. 2009) and a 50% consensus tree topology was obtained for each gene family based on 100 replicates of bootstrapping maximum-likelihood analyses by RAxML v.7.0.0 with the WAG amino acid substitution matrix and gamma-distributed rate variation (Stamatakis 2006). Then 2584 α-WG duplicate pairs identified by Blanc et al. (2003) and 1826 clusters of tandem duplicates identified by Liu et al. (2011) were used as the backbone to pull out the terminal duplicate pairs without subsequent gene duplications from each gene family consensus tree topology. A total of 1771 α-WG duplicate pairs and 1179 tandem duplicate pairs were used for subsequent analyses.

3.2.7 RT-PCR Confirmations

Primer sets were designed around 20 junctions chosen at random with one or more alternative events. Primers flanked each junction so that both the constitutive and the alternatively spliced product(s) of the junction would be amplified. Three complementary DNA (cDNA) pools were made via the Invitrogen (Carlsbad, CA) SuperScript First Strand Synthesis for RT-PCR Kit, using the same RNA samples used to prepare Illumina libraries. PCR was performed using FroggaBio Ultra Pure Taq PCR Master Mix with dye.
The PCR cycling conditions were 94° for 3 min; 30–35 cycles of 94° for 30 sec, 54° for 30 sec, and 72° for 30 sec; and 72° for 7 min. PCR products were visualized on 1.2% agarose gels. Primer sets are listed in Table 3.3.

3.3 Results

3.3.1 Conservation of Alternative Splicing Within Paralog Pairs

Three biological replicates of RNA-seq from leaves of A. thaliana were used to investigate the splicing patterns of two classes of paralogs, α-WG duplicates and tandem duplicates. Reads were mapped against the TAIR 10 assembly with GSNAP (Wu and Watanabe 2005) and interpreted with a suite of custom Python and R scripts (see Materials and Methods). Representative gene models were BLAST searched against each other to create a set of homologous junctions between paralog pairs where events could be directly compared. Only splicing events that were present in every replicate were accepted, resulting in 62,909 unique AS events between all genes. A total of 43,236 of these events were IR, 8817 were ALTA, 4986 were ALTD, and 1508 were ALTP class events. While we detected 995 SKIP events, as well as 3367 events in other complex categories, the difficulty of assigning each such event to one homologous junction shared between paralogs led us to exclude them from further analysis.

First, a qualitative analysis comparing simple presence/absence of splice events at homologous junctions (Table 3.1) revealed that 30% of events are conserved between α-WG duplicates and 33% of events are conserved between tandem duplicates. Intron retention is both the most common type of event and the most conserved between paralogs, with alternative acceptor being second in both regards. The rate of conservation for IR events was higher than those for all other classes of AS in both α-WG duplicates and tandem duplicates, and this difference had statistical support (α-WG duplicates P < 0.001, tandem duplicates P < 0.001, chi-square).
The complete list of conserved and divergent events is too large to fit inside an appendix, but is available for download online as an Excel file at the following URL: http://www.genetics.org/lookup/suppl/doi:10.1534/genetics.114.169466/-/DC1/genetics.114.169466-2.xls

Next, we tested for changes in the frequency of qualitatively conserved events between paralogs by modeling the alternative and constitutive read counts at homologous junctions, using binary logistic regression. This used data from all three replicates to provide robust statistical support for these quantitative changes. Because we used relative frequency (i.e., number of alternative reads compared to constitutive reads at each junction), differences in absolute expression of the gene will have minimal effects on this analysis. Any splice event-junction model that produced a P-value < 0.05, and thus had statistical support for a difference between frequencies of alternative vs. constitutive reads, was tallied as divergent. Quantitative conservation (or more precisely the absence of statistical evidence for quantitative divergence) was found to be 31% in α-WG duplicates and 41% in tandems (Table 3.1), with tandems having a significantly higher rate of quantitative conservation (P = 0.002, chi-square). However, the median effect size difference in the quantitative comparison is 4.1-fold for α-WG duplicates and 10.2-fold for tandems (P = 0.001, chi-square), again a significant difference. More of the events in tandem duplicates are quantitatively conserved than events in α-WG duplicates, but the events in tandems that are quantitatively divergent show a wider range of disparity on average than α-WG events that are quantitatively divergent. When the qualitative and quantitative results are combined, only a small fraction of the events are fully conserved (Table 3.1; Figure 3.2).
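The chi-square comparison of quantitative conservation between the two duplicate classes can be checked from the counts in Table 3.1. The sketch below is a minimal illustration in plain Python, assuming a Pearson chi-square test on a 2 × 2 table with Yates' continuity correction (the default behavior of R's chisq.test for 2 × 2 tables); the function name is hypothetical, and the text does not state whether a correction was applied.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Pearson chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Returns (statistic, P-value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a nonzero total")
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction, never pushing the difference below zero.
        diff = max(0.0, diff - n / 2.0)
    stat = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square with 1 d.f.
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Quantitative conservation (Table 3.1): conserved vs. divergent events.
stat, p = chi_square_2x2(295, 635,   # alpha-WG duplicates
                         130, 181)   # tandem duplicates
print(f"chi-square = {stat:.2f}, P = {p:.4f}")
```

With these counts the P-value falls between 0.001 and 0.003 with or without the correction, which agrees with the reported difference between tandem and α-WG duplicates.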
The α-WG duplicates have 9% conservation on both levels whereas the tandem duplicates again have a significantly higher rate of conservation (P = 0.0003, chi-square) at 14%. No significant differences in the rate of qualitative or quantitative conservation were found between events located in untranslated regions of transcripts (5'-UTR and 3'-UTR) and events found in coding sequences of transcripts (CDS). Additionally, there was no differential enrichment of GO categories or terms in genes with conserved AS events vs. genes with divergent AS, either qualitatively or quantitatively.

To compare RNA-seq data to RT-PCR data, as well as to validate our computational pipeline, 20 sets of primers were designed around a random set of junctions that contained one or more alternative splicing events. The vast majority (17/20, 85%) amplified the two or more expected fragments (Figure 3.3; Table 3.3) for the constitutive and alternative form(s), while the remaining 3 either failed to amplify or had nonspecific amplification. Events from these 3 junctions were validated by EST support in PlantGDB (Duvick et al. 2008) (http://www.plantgdb.org/ASIP/), resulting in support for all 22 events over the 20 random junctions.

3.3.2 CCA1 and LHY Have Functionally Characterized, Divergent Alternative Splicing

The qualitative and quantitative results were surveyed for genes with alternative splicing events that had been functionally characterized. The α-WG pair of late elongated hypocotyl (LHY; AT1G01060) and circadian clock-associated 1 (CCA1; AT2G46830) was found to display two types of asymmetric AS events: an alternative donor unique to the exon 4–5 junction in CCA1 and an intron retention event of intron 4 that is shared qualitatively, albeit at exceptionally low levels, in LHY (Figure 3.4). The exon 5–6 junction of LHY corresponds to the exon 4–5 junction of CCA1.
LHY was found to have 9 IR reads and 429 constitutive reads among all three replicates (~2% of reads contain the IR), while the corresponding junction in CCA1 was found to have 2186 intron retention reads, 1811 constitutive reads, and 114 alternative donor reads (~53% of reads contain the IR). Thus there has been a large quantitative change in IR levels as well as a qualitative change in the ALTD event. The functionality of the intron retention event has been detailed in Seo et al. (2012), who demonstrated that retention of the intron in CCA1 produces CCA1β, which binds to CCA1α or LHY, creating protein dimers with reduced DNA binding affinity and preventing formation of normal CCA1α homodimers, LHY homodimers, and CCA1α-LHY heterodimers. The alternative splicing of CCA1 into CCA1β is suppressed at lower temperatures, allowing CCA1α and LHY to function in coordinating the cold response. The intron retention event is clearly the ancestral state, as the same event was found in CCA1 orthologs in O. sativa, Brachypodium distachyon, and Populus trichocarpa (Filichkin et al. 2010), indicating that LHY has undergone an almost complete loss of the event in leaf tissue and that it was not a novel gain in CCA1 post-duplication.

3.3.3 Conservation of Alternative Splicing-Induced NMD within Paralog Pairs

To explore the relationship between gene duplication and NMD-associated alternative splicing events, we analyzed RNA-seq data from NMD mutants reported in Drechsel et al. (2013). All genes reported as showing NMD via alternative splicing in one or more mutant conditions were said to be NMD positive, whereas genes without a significant increase in frequency in a splicing event in mutant conditions were considered to not have AS-induced NMD. Genes were sorted by duplication status (singleton or duplicate, see Materials and Methods) to ask whether duplication status influenced NMD status.
Singleton genes had an overall chance of having an NMD event of 6.37%, whereas duplicates had a 7.08% chance, a nonsignificant difference (P = 0.1496, chi-square). However, this does not preclude AS-induced NMD from being a common mechanism of divergent evolution between duplicated genes. Comparing the NMD status of α-WG duplicate pairs, 207 pairs had one member with an AS event(s) associated with NMD whereas only 35 pairs had an event in both members (Table 3.2), indicating that 85% of pairs exhibited AS-induced NMD asymmetry. Tandem duplicate pairs show a similar pattern, with 89% of cases of AS-induced NMD being present in only one member of a pair (58 pairs asymmetric, 7 pairs symmetric). This complements the previous finding of low conservation of AS, indicating that much of the divergence may have implications in further diverging expression levels and profiles between paralogs.

3.4 Discussion

In this study we analyzed 3951 AS events in tandem and α-WG duplicate pairs, using RNA-seq data. We found that alternative splicing patterns are infrequently conserved between paralogs in A. thaliana leaves. The majority of alternative splicing events are conserved neither qualitatively nor quantitatively, pointing to alternative splicing as being a rapidly evolving aspect of gene expression after gene duplication. The amount of divergence and asymmetry of AS events between paralogs suggests post-duplication specialization of AS is generally favored over conservation. The differences between paralogs could be accounted for by one gene losing an ancestral AS event or one gene gaining a new AS event. Intron retention events show the most conservation, with 37–40% being qualitatively conserved, whereas alternative donors and acceptors are much less conserved at 5–15% (Table 3.1).
One hypothesis to explain this difference is that alternative donors and acceptors may be more prone to diverge because they require more specification; alternate exon boundaries have to be selected by specific splicing factors and specific binding motifs, whereas intron retention may be more robust to sequence changes. The tandem duplicates have a higher rate of quantitative conservation than α-WG duplicates (41% vs. 31%), which could be due to them being younger duplicates in general. Interestingly, the average fold difference between quantitative AS event levels is much higher in tandem duplicates (10.2-fold) than in α-WG duplicates (4.1-fold). Events in tandem duplicates tend to be either quantitatively conserved or present at very disparate frequencies, whereas quantitatively divergent AS events in α-WG duplicates differ less on average. Nonuniformity in age may help explain the more stochastic patterns of the tandem duplicates. The rate of event conservation in leaves was found to be ~37% for intron retention events in α-WG duplicates, which are the most common type of AS event, while a further refinement of this to include a quantitative dimension reduced this to 11%. The more nuanced use of both qualitative and quantitative conservation status hints at incomplete or partial regulatory divergence between the duplicates' alternative splicing patterns. Among the qualitatively conserved AS events, both paralogs keeping the event itself but expressing it at different frequencies is common. These data indicate that the event itself may be qualitatively conserved although the regulation of it may not be. From an evolutionary standpoint, while many forms are certainly lost, this still implies that as many as 37% of splice forms, in the case of intron retentions, are conserved between duplicates but may be present at different levels; thus, they have experienced quantitative divergence in regulation. Likewise, conserved AS events from Zhang et al. 
(2010) that varied in AS by organ type have also experienced divergence in AS regulation, perhaps in some cases analogous to neo- and subfunctionalization. Thus rewiring, or both paralogs keeping the event itself but expressing it either at different frequencies or in different tissues, seems common. Rather than an AS event being lost from one paralog, there is differential regulation of the event to allow paralog-specific specialization in terms of when, where, and how much the event is expressed; the AS event may still be useful in both paralogs, depending on how and when it is deployed. This implies that changes in cis elements are the primary mechanisms for divergence of AS patterns between paralogs and that most of the molecular evolution responsible is likely centered around ESEs, ESSs, ISEs, and ISSs to produce different effects in paralogs under the same trans environment. Changes in cis elements were implicated to account for most species-specific alternative splicing patterns in vertebrates (Barbosa-Morais et al. 2012), and in another study ~80% of non-exon-skipping events were attributable to divergent cis architecture in Drosophila (McManus et al. 2014). Collectively, most changes in alternative splicing are thus directed by changes in cis.

We have detected extensive qualitative and quantitative divergence in AS events between duplicates. In a concurrent study, Shen et al. (2014) identified AS events in soybean and compared the number of AS events present in genes duplicated by a paleopolyploidy event. They found that the number of AS events was different in the vast majority of duplicate pairs. Of those pairs where both genes had the same number of AS events, about two-thirds had divergence in the types of AS events. However, that study did not examine qualitative or quantitative conservation of individual AS events between duplicates, which was the focus of this study.
Thus the two studies give complementary perspectives on AS events in duplicated genes.

Divergence in expression patterns, commonly seen in duplicates (e.g., Blanc and Wolfe 2004b; Casneuf et al. 2006; Liu et al. 2011), may have an underpinning in alternative splicing in some cases. Our finding that 85% of α-WG pairs are NMD asymmetric suggests that some cases of expression divergence are due to NMD lowering the level of expression of one paralog. Complex changes in regulation, especially those that invoke alternative splicing, may play a role in coordinating differential expression of paralogs. While diverged, non-splicing-related cis elements differentially activate or repress paralogs to achieve differential expression, NMD via AS may also play a role. The evolution of an AS-induced NMD event to selectively shut down or reduce the level of one paralog in a given trans environment may be an alternative to mutations in cis-regulatory elements that change the transcription level. Alternatively, asymmetric NMD between duplicates in some cases might be due to one gene acquiring mutations that lead to AS due to a relaxation of selection, but the NMD is not involved in regulating protein levels (because transcript and protein levels do not always correlate well).

The case of CCA1 and LHY is a clear example of divergent AS between duplicated genes. While the IR event is ancestral, LHY maintains the event only at a very low frequency in leaf tissue, leaving two distinct possibilities. One possibility is that selection has not yet completely eliminated the event from the gene, i.e., incomplete loss. The second possibility is that the event is maintained in LHY due to selection for AS in another tissue or stress, and the small amount of AS detected in leaf is a consequence of nonspecific trans-environment interactions, i.e., incomplete specialization.
Specifically, in another tissue, intron retention in LHY may serve a functional purpose and occur at a higher rate, but the same cis elements that allow intron retention in another tissue produce an incidental amount in leaf tissue. In either case, the pattern CCA1/LHY displays is less drastic than full partitioning/subfunctionalization of ancestral splicing patterns (Lister et al. 2001; Cusack and Wolfe 2007; Marshall et al. 2013), where an ancestral gene with two splice forms becomes two specialized genes, each with only one of the two ancestral forms becoming the new constitutive form. Duplicated genes acting in concert and having diverged AS patterns to control the circadian clock highlight the potential reservoir of functionality that alternative splicing offers, as well as the specialization power that duplication allows. Although alternative splicing is often thought of as a response to stress, in the case of CCA1, it is the absence of AS that invokes a cold response pathway and the AS form that invokes normal homeostasis. LHY and CCA1 have several other known roles, including regulating SVP (short vegetative phase) (Fujiwara et al. 2008) and the C-repeat binding factor (CBF) cold-response pathway (Dong et al. 2011).

Although the functions of some AS events in plants have been characterized, the functions of the vast majority are unknown (reviewed in Reddy et al. 2013). Analyzing evolutionarily conserved AS events between species that are somewhat distantly related is a way of finding AS-creating isoforms with potentially important functions, compared with AS events with no function or ones created by splicing noise (e.g., Boue et al. 2003; Ast 2004; Sorek et al. 2004; Wang and Brendel 2006; Darracq and Adams 2013). Likewise, cases where AS is conserved after gene duplication, especially those in ancient duplicated genes, may suggest that the AS event is functional in both duplicates.
Thus the approach of comparing conservation of AS events in gene families may be a way of identifying AS events that are more likely to be functional.

Future studies may incorporate more tissues or even multiple cell types within a tissue and assay duplicate pairs to pry out more qualitative and quantitative changes in AS events between duplicates. These patterns could highlight fine-scale regulatory and cis-element divergence between duplicates. Additionally, investigating causal changes in splicing-related cis elements may provide a mechanistic example of diverged AS patterns between paralogs. Further functional characterization of alternative splicing in plants would undoubtedly lead to more interesting example cases like CCA1/LHY. Perhaps the most engaging prospect is the continued escalation of the known role of alternative splicing in all aspects of plant biology and that gene duplication and alternative splicing are indeed processes with a complex relationship.

We thank C. Grisdale for help with RT-PCR and providing feedback on analysis and methods, G. Baute and A. Hammel for providing feedback on analysis and methods, A. Gorski and H. Jhand for assistance with scripting and computation, A. Darracq for assistance with plant growth and computation, and S-L. Liu for generating the duplicate gene sets. 
This research was supported by a grant from the Natural Sciences and Engineering Research Council of Canada.

Table 3.1 Conservation of alternative splicing events in duplicated gene pairs

Duplicate class   Event class   Conserved   Divergent   % conservation

Qualitative conservation of alternative splicing events
α-WG duplicates   IR            881         1500        37.0
                  ALTA          37          346         9.7
                  ALTD          12          220         5.2
                  ALTP          0           35          0.0
                  Total         930         2101        30.7
Tandems           IR            279         411         40.4
                  ALTA          22          119         15.6
                  ALTD          8           62          11.4
                  ALTP          2           17          10.5
                  Total         311         609         33.8

Quantitative conservation of alternative splicing events
α-WG duplicates   IR            279         602         31.7
                  ALTA          14          23          37.8
                  ALTD          2           10          16.7
                  ALTP          0           0           0.0
                  Total         295         635         31.7
Tandems           IR            116         163         41.6
                  ALTA          8           14          36.4
                  ALTD          4           4           50.0
                  ALTP          2           0           100.0
                  Total         130         181         41.8

Overall conservation of alternative splicing events
α-WG duplicates   IR            279         2102        11.7
                  ALTA          14          369         3.7
                  ALTD          2           230         0.9
                  ALTP          0           35          0.0
                  Total         295         2736        9.7
Tandems           IR            116         610         16.8
                  ALTA          8           133         5.7
                  ALTD          4           66          5.7
                  ALTP          2           17          10.5
                  Total         130         790         14.1

Table 3.2 NMD status of paralog pairs

Duplicate class   Symmetric NMD   Asymmetric NMD   Neither NMD
α-WG              35              207              1529
Tandem            7               58               1114

Figure 3.1 Distribution of fold change differences for α-WG and tandem duplicates. Each point represents the difference in AS expression between a pair of paralogs. Boxplot overlays indicate means (thick lines), 25-75% quantile regions (boxes), and 95% quantile regions (whiskers). Tandem duplicate events that are not quantitatively conserved have a very wide distribution of fold changes, whereas α-WG duplicate events that are not quantitatively conserved have a narrower distribution.

Table 3.3 Primer sets for in-vitro testing of predicted splice events

Gene       Junction  Event class  Forward primer            Reverse primer            Confirmed?  Product 1 (bp)  Product 2 (bp)
AT1G01620  1,2       IR           GACAGCTTACGAGCCAAGAA      ATTGACAGTGATGGGAGTGAAG    Yes         174             334
AT3G20300  1,2       IR           CTCCTGAAGCATACCACCATATC   GTTTCGCTGCTCTTTCGTTTC     No          330             780
AT1G06640  2,3       IR           CTTACCGTCCTTCTACCAGATAAC  GGTCCGTACACTCTAGGATTTG    Yes         250             374
AT3G21360  2,3       IR           TGGACGTGGTTGGAAATCTAC     CGGTTTCTCGACTCGTCATATT    Yes         170             244
AT5G42850  2,3       IR           ACATTGAGCACGAAATCGTAAAG   CGCATCCTTGGAGAGTTGAT      No          300             374
AT1G26945  1,2       IR           CACCTTCTTCTCCACTCTCATTC   GGTCATCAACCTCTCTGTGTAAG   Yes         300             933
AT2G31360  4,5       IR           CCGAACAATGTACCACGAAATG    GTAGGAGCAGCATTGGAAGT      Yes         227             402
AT2G42810  5,7       SKIP         GGTAACTTCCTCTCCCTCAATTC   AGCAATCTCTGTGCCAGTATC     No          350             230
AT4G18593  1,2       ALTA/IR      AATCAGTGTAGCAGACACCATC    CTTCACCCTTTCCTGGTTCAT     Yes         200             704
AT4G24800  1,2       ALTA/IR      CTTCAGACTCGGGTTCTTCTTT    CATGAGACCTTCTGTGCTTCA     Yes         329             629
AT4G37540  1,2       IR           AGTTCCTGGTCCACAACATAC     CACCGTCTTCGTCGCTAAA       Yes         205             305
AT1G72170  1,2       IR           CCACCGGAATACGATGTGAA      CAATACCAATTCCAGCACCTAAAG  Yes         182             382
AT1G63880  1,2       IR           TCAATTCCCACCATACCATCAA    CAATGAAACTTGTGCACGTAAGA   Yes         207             457
AT3G54810  2,3       IR           CATAGGTGACTCCGACTCTTTC    GGAGGTCTTTCCACCTTAACA     Yes         167             467
AT5G04550  1,2       IR           CGCAACCTCAAACGCTAATAC     CCTCTCTCTCTCTCTCTCTCATC   Yes         320             620
AT1G66100  1,2       IR           CACCGCAAACAGAGGATACA      AGTGAGTCGACATGTCCTAGA     Yes         100             202
AT5G24165  1,2       IR           TCGTATCAAATTTCCTCAGAGACA  GAGATGCTTTCCCTCCATCAG     Yes         150             271
AT1G52347  1,2       IR           TTGACGCCGGTGATTGTT        GCTTTCCTCGGTCCTTGATAC     Yes         334             834
AT1G23030  1,2       IR           GGCCTCTCTACTCGACCTAAT     CTTTGTTTCTCCTCCTCCTCAC    Yes         553             743
AT5G52530  1,2       IR           CGAGCGGTGGATAGTGAAAT      CGCTTTGATTCTCCTGCTACTA    Yes         340             990

Figure 3.2 Graphical representation of conservation status of alternative splicing events between paralogs

Figure 3.3 RT-PCR results. Example genes include no. 3 (AT1G06640 junction 2-3), no. 11 (AT4G37540 junction 1-2), and no. 9 (AT4G18593 junction 1-2). Genes 3 and 11 show amplification of the constitutive form as well as a predicted intron retention event, while gene 9 shows amplification from the constitutive form, an intron retention event, and an alternative acceptor event.
Figure 3.4 Alternative splicing divergence between CCA1 and LHY paralogs. CCA1β is produced at high frequency from the retention of intron 4 in CCA1 under normal conditions, blocking CCA1α and LHY from triggering a cold response. LHY has an extremely low, perhaps vestigial, amount of intron retention at the equivalent junction. Cold conditions shift the splicing of CCA1 to cease production of the β form, allowing CCA1α and LHY to trigger a cold response. Only the junctions relevant to this alternative splicing event are shown for clarity.

Chapter 4: Transcriptomic Changes Accompanying Allopolyploidization are Variable between Resynthesized and Natural Lines of Brassica napus.

4.1 Introduction

The flowering plant lineage is marked by repeated and iterative episodes of polyploidy (Blanc and Wolfe 2004a, Cui et al. 2006, Jiao et al. 2011, Adams and Wendel 2005), with as many as 15% of angiosperm speciation events having an associated ploidy level change (Wood et al. 2009). Polyploids often exhibit characteristics beyond the range of their progenitors, allowing for rapid adaptation and colonization of new niches (Rieseberg et al. 2003, Ainouche et al. 2004, reviewed in Hegarty and Hiscock 2008, te Beest et al. 2011). Polyploidization may have allowed higher survival during the K-T event through the adaptability and stress resistance it can confer (Fawcett et al. 2009), since evidence suggests a role in the development of novel adaptive traits (Vanneste et al. 2014). Humans have exploited this evolutionary novelty, as many important and robust crop plants are recent polyploids, such as coffee, wheat, canola, cotton, maize, and tobacco. Paradoxically, polyploids have higher extinction and lower speciation rates than diploids (Mayrose et al. 2011, Arrigo and Barker 2012), indicating that polyploidy can be thought of as an evolutionary 'gamble'.
Each individual instance of allopolyploidy has unique changes, opening up variation and different attributes for selection to act on (Soltis et al. 2014). Allopolyploids arising from the same event may even present similar phenotypes, yet express divergent proteomes (Hu et al. 2015). While much attention is centered on unique changes found in new polyploids, much of the process is repeatable and non-stochastic (reviewed in Buggs et al. 2014).

The genetic and genomic mechanisms responsible for unleashing such phenotypic novelty range from genome level rearrangements and deletions to complex epigenetics (reviewed in Chen and Ni 2006, Doyle et al. 2008, Yoo et al. 2014). For example, as many as 30% of parental methylation patterns are disrupted by hybridization in Spartina hybrids (Salmon et al. 2005). Allopolyploid genomes are destabilized by an activation of transposable elements caused by a reduction in the concentration of siRNAs (Kenan-Eichler et al. 2011). The behavior of homeologs, or reunited orthologs in allopolyploids, has received much interest. In Gossypium allopolyploids, homeolog-specific expression biases and silencing have been observed in an organ-specific manner (Adams et al. 2003). Further studies estimate that as many as 40% of homeolog pairs are biased, and this expression divergence is driven largely by differences in cis (Chaudhary et al. 2009), as well as in trans (Yoo et al. 2013). Interestingly, younger allopolyploids tend to show reduced differential homeolog behavior (Yoo et al. 2013, Buggs et al. 2011) when compared to established allopolyploids, and even different directionality of transcriptome dominance (Yoo et al. 2013). Similar trends of biased homeolog expression have been shown among resynthesized Brassica napus (Higgins et al. 2012). However, perhaps unique to the B. 
napus system, ~50 independent resynthesized allopolyploids were used to show extreme variance in the amount of genome rearrangements and phenotypes that were possible outcomes from an identical allopolyploidization event from the same parents, showing the allopolyploidization process is indeed stochastic (Gaeta et al. 2007). Given the range of genomic, transcriptomic, and phenotypic variation generated by allopolyploidization, allopolyploids can be thought of as evolutionary 'wild cards'. Most will find themselves poorly or insufficiently adapted, but by exploring a broad phenotype space, some are likely to be extremely successful.

Alternative splicing modulates raw gene expression, as differential processing of the primary mRNA transcript can result in a different protein product, or act as another layer of expression regulation via nonsense mediated decay (NMD) (Drechsel et al. 2013, reviewed in Smith and Baker 2015). Alternative splicing can be tissue and stress specific, as different trans environments favor specific splice forms and splicing events may be a response to a particular stress. In Arabidopsis, whole genome duplicates are known to have divergent splicing patterns, both among and within tissues (Zhang et al. 2010, Tack et al. 2014), indicating that divergence in splicing tends to follow gene duplication brought on by polyploidy. Methylation patterns influence splicing (reviewed in Moar et al. 2015) but can be perturbed by hybridization and polyploidy, suggesting that many changes in splicing may occur at the onset of allopolyploidy due to changes in the methylome. Zhou et al. (2011) investigated changes in alternative splicing using resynthesized and natural Brassica napus, as well as their diploid progenitors, B. rapa and B. oleracea. They found frequent and often parallel losses of AS events following polyploidy, with each instance leading to some unique changes to alternative splicing patterns.
The scope of that study was focused on a few genes for which alternative events were known to occur in at least one of the parental species, but the results suggest that a whole-genome study of alternative splicing patterns in a polyploid genome would be likely to reveal more detail and potentially informative patterns.

In this study, we used RNA-seq to survey the transcriptomes of B. rapa, B. oleracea and three resynthesized allopolyploid B. napus progeny as well as natural B. napus. This offers the deepest sequencing yet of these transcriptomes in conjunction with each other, allowing an examination of the immediate effects of allopolyploidization as well as the transcriptome variation possible following a parallel event. This represents the first genome-wide study of alternative splicing in allopolyploid plants. It uses an entire allopolyploid system, where using B. napus and its diploid progenitors allows a comparison of the transcriptional changes occurring immediately upon the onset of allopolyploidization against one outcome thousands of years later in the natural polyploid.

4.2 Materials and Methods

4.2.1 Plant Material and RNA-seq

Seeds from the plant lines EL9450sp (polyploid Line 47, P1), EL9451sp (polyploid Line 48, P2) and EL6450 (polyploid Line 50, P3), BN (B. napus), BO (B. oleracea), and BR (B. rapa) were obtained from the Arabidopsis Biological Resource Center at Ohio State University and grown at 20°C and 50% humidity under 16 h/8 h light/dark cycles in a growth chamber. First true leaves of size 2.5 cm to 3 cm were harvested at the same time of day (1:00 PM) over a 7 day period and immediately frozen in liquid nitrogen. Total RNA was isolated from first true leaves with the Ambion RNAqueous kit (AM1912) in conjunction with the Ambion Plant RNA Isolation Aid (AM9690).
RNA from two to three leaves was pooled per sample to minimize variance, and extracted total RNA was DNase treated with the Ambion Turbo DNA-free kit (AM1907), then visualized on a 2% agarose gel to assess quality. RNA was sent to Genome Quebec for library preparation and sequencing. Data (Appendix B, table B.1) have been deposited into the GEO under accession #123456.

4.2.2 Mapping and Scripting

FastQC was used to assess the quality of the sequenced libraries before proceeding with mapping. Reads from parental species as well as the resynthesized hybrids were mapped to a concatenated hybrid genome consisting of both the B. rapa genome (Wang et al. 2011) and the B. oleracea genome (Parkin et al. 2014) to allow for mapping consistency. GSNAP version 2014-06-10 was used to map the reads before piping the resulting SAM files to a set of custom Python and R scripts (Tack et al. 2014), using the command:

gsnap -d napus_all --pairmax-rna=10000 --localsplicedist=10000 --clip-overlap -N 1 -n 1 -Q --nthreads=6 --quality-protocol=sanger --ordered --format=sam --nofails Index_1, Index_2 > mapped_Index

Mapping reads from the parental species onto the concatenated hybrid genome allowed us to empirically determine where cross mapping was likely to be occurring in the resynthesized allopolyploids, i.e., where reads preferentially map to the subgenome opposite to their origin. Both parental genomes were recently sequenced and have missing parts and incomplete annotation; they are also evolutionarily closely related and have many duplicated genes. We excluded from all analyses any gene for which 5% or more of its total parental expression originated from reads from the opposite subgenome, considering such genes unresolvable; we had high confidence that the remaining genes were accurately quantified (more than 95% of their reads probably originate from the correct genome).
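The cross-mapping filter described above can be sketched as follows. This is a minimal illustration, not the deposited scripts: the function name, the dictionary inputs, and the gene identifiers are hypothetical, and it assumes that "total parental expression" for a gene is the sum of reads from the matching parental library and reads from the other parental library that map to that gene. Genes with no parental reads are skipped here, which is also an assumption.

```python
def resolvable_genes(own_reads, cross_reads, max_cross_fraction=0.05):
    """Return genes whose cross-mapped share of parental reads is below the cutoff.

    own_reads:   gene -> reads from the parent whose subgenome carries the gene
    cross_reads: gene -> reads from the other parent that mapped to the gene
    """
    kept = []
    for gene, own in own_reads.items():
        cross = cross_reads.get(gene, 0)
        total = own + cross
        if total == 0:
            # Assumption: without parental expression the gene cannot be assessed.
            continue
        # Exclude genes where 5% or more of parental reads come from the other parent.
        if cross / total < max_cross_fraction:
            kept.append(gene)
    return kept


# Hypothetical counts for three genes.
own = {"geneA": 960, "geneB": 95, "geneC": 0}
cross = {"geneA": 40, "geneB": 5, "geneC": 0}
print(resolvable_genes(own, cross))
```

In this toy input, geneA has 4% cross-mapped reads and is kept, geneB sits exactly at the 5% threshold and is excluded, and geneC has no parental reads and is skipped.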
A total of 47,862 genes out of 100,245 were discarded because their genome of origin could not be resolved, leaving 52,383 genes for all analyses. For all analyses involving natural B. napus, only those genes in the parental genomes which had a known equivalent in B. napus (Chalhoub et al. 2014) were used. Scripts have been deposited in the DRYAD database under the accession #123456.

4.2.3 Statistical Methods

Expression was modeled using the ‘DESeq2’ and ‘edgeR’ packages (Anders et al. 2013; Love et al. 2014; Robinson et al. 2010) for R (version 3.1.3, 2015). Both packages model the abundance of raw reads using a negative binomial generalized linear model. Though they estimate dispersion parameters in different ways, the ratios of expression changes detected between the subgenomes were similar between the two approaches, although DESeq2 is more liberal with our dataset, suggesting a greater number of differentially expressed genes. We used the built-in functions from each package to make explicit contrasts between the allopolyploids and the (in vitro) mid-parent values, in each case classifying a gene as differentially expressed on the basis of an adjusted p-value < 0.05. In both edgeR and DESeq2 the p-value adjustment was performed using a Benjamini–Hochberg procedure, thus adjusted p-values at 0.05 are equivalent to a 5% false discovery rate.

In order to investigate changes in AS, a similar approach to Tack et al. (2014) was used, where counts of AS events were compared against all other behaviors at that junction to ascertain whether a change in ratio had occurred between treatments. Each set of alternative and constitutive junction counts was obtained through a set of custom Python scripts (Tack et al. 2014), in which each read that mapped to a gene was compared to the constitutive model, and all constitutive and alternative splicing patterns the read implied were logged and counted.
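For readers unfamiliar with the Benjamini–Hochberg adjustment mentioned above for the expression contrasts, the following Python sketch shows the standard step-up calculation. It is an illustration only: DESeq2 and edgeR perform this adjustment internally, and the function name and the raw p-values below are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five genes.
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

With these illustrative inputs, two raw p-values that are individually below 0.05 (0.039 and 0.041) no longer pass after adjustment, which is the behaviour that controls the false discovery rate at 5%.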
Levels of alternative splicing were compared using a generalized linear model fitted with a binomial distribution. For each gene with sufficient data, we ran the following model in R (version 3.2.2, implemented via the "Rpy" package version 2.6.2):

    fitter <- glm(cbind(alts, con) ~ treatment, data = alt_splice3, family = binomial)

The 'treatment' variable was set up such that the frequency of events in the parental species was compared to the frequency of events in either the synthetic or the natural allopolyploids. For testing alternative splicing between homeologs, the model was first run between the parental orthologs to determine whether there was a bias, then re-run for each pair of homeologs to see whether this bias had changed, using the same generalized linear model.

4.2.4 RT-PCR

Primer sets were designed around 28 junctions chosen at random with one or more alternative events. Primers flanked each junction so that both constitutive and alternatively spliced products were amplified. All complementary DNA (cDNA) pools were made via the Invitrogen (Carlsbad, CA) SuperScript First-Strand Synthesis for RT-PCR kit, using RNA samples from the same plants used to prepare Illumina libraries. PCR was performed using FroggaBio Ultra Pure Taq PCR Master Mix with dye. The PCR cycling conditions were 95°C for 2 min; 30-35 cycles of 95°C for 30 s, 57°C for 30 s and 72°C for 60 s; and 72°C for 5 min. PCR products were visualized on 2.0% agarose gels. Primer sets are listed in Appendix B, Table B.11. Primer set 9 was excluded after it was found to duplicate primer set 12, reducing the original 29 primer sets to 28.

4.3 Results

4.3.1 Global Changes in Expression of Genes upon Allopolyploidy

Three biological replicates of RNA from the first true leaves of B. rapa, B. oleracea, three independently resynthesized allopolyploids (hereafter referred to as P1, P2, and P3), and natural B. napus were prepared, sequenced, and the sequences were mapped (see methods, Appendix B, Table B.1).
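As a companion to the binomial model given in Section 4.2.3, the sketch below shows what such a model tests when 'treatment' has two levels. It is not the thesis code: it is a pure-Python likelihood-ratio test, whereas the test actually read from the R fit (for example a Wald test from the model summary) is not stated in the text, so that choice is an assumption. For a treatment-only model, pooling replicate counts within each group gives the same fitted frequencies and the same likelihood-ratio statistic as fitting the replicates as separate rows. The function name and read counts are hypothetical.

```python
import math


def binomial_lrt(alt_a, con_a, alt_b, con_b):
    """Likelihood-ratio test that two groups share one alternative-event frequency.

    alt_* and con_* are alternative and constitutive read counts at one junction.
    Returns (log odds ratio of group b relative to group a, p-value).
    """
    if min(alt_a, con_a, alt_b, con_b) <= 0:
        raise ValueError("all four read counts must be positive")

    def loglik(alt, con, p):
        # Binomial log-likelihood, omitting the constant binomial coefficient.
        return alt * math.log(p) + con * math.log(1.0 - p)

    p_a = alt_a / (alt_a + con_a)
    p_b = alt_b / (alt_b + con_b)
    p_pooled = (alt_a + alt_b) / (alt_a + con_a + alt_b + con_b)

    full = loglik(alt_a, con_a, p_a) + loglik(alt_b, con_b, p_b)
    null = loglik(alt_a, con_a, p_pooled) + loglik(alt_b, con_b, p_pooled)
    deviance = max(0.0, 2.0 * (full - null))

    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(deviance / 2.0))
    log_odds_ratio = math.log((alt_b * con_a) / (alt_a * con_b))
    return log_odds_ratio, p_value


# Hypothetical junction: 10% intron retention in the parent, 5% in a polyploid.
log_or, p = binomial_lrt(100, 900, 50, 950)
print(log_or, p)
```

A negative log odds ratio corresponds to a decrease in event frequency relative to the parental state, and a positive one to an increase. The sketch does not check for overdispersion, which the thesis treats as a reason for a model to fail.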
To account for the ambiguity in mapping to a hybrid genome with nested whole genome duplications (from the Brassica lineage-specific genome triplication event and the alpha genome doubling event at the base of the Brassicaceae) in its constituent sub-genomes, we applied a rather stringent filter on the data to prevent erroneous mapping and calling. Genes were discarded from all analyses if more than 5% of the total parental reads which mapped to them originated from the alternate sub-genome; thus a total of 47,862 genes out of 100,245 were discarded because their genome of origin could not be resolved, leaving 52,383 genes. To assess changes in expression level in resynthesized hybrids, a 50/50 mix of RNA from both parental species (in vitro midpoint) was used rather than combining independent RNA-seq data from both parents (in silico midpoint).

Changes in gene expression level upon allopolyploidization were investigated using the DESeq2 package (see methods) (Table 4.1) by comparing expression levels of the homeologous gene pairs in the polyploids to the expression levels in their diploid progenitors. In all synthetic polyploids, the AT (derived from B. rapa) subgenome showed significantly more genes with changes in their expression levels than the CT subgenome (χ², df = 1, p < 0.0001, AT P1 9.1%, P2 7.6%, P3 11.9% change vs CT 7.0%, 4.7%, 7.6% change). Both subgenomes in P2 showed significantly fewer changes than were seen in P1 and P3 (χ², df = 1, p < 0.0001), and P1 showed significantly fewer changes than P3 in AT (χ², df = 1, p < 0.0001) and in CT (χ², df = 1, p = 0.022), showing a small range in the amount of change between the resynthesized allopolyploids. The CT subgenome has fewer significant decreases in expression levels than the AT subgenome in P2 and P3 (χ², df = 1, p < 0.0001), but significantly more in P1 (χ², df = 1, p < 0.0001).
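The χ² comparisons with one degree of freedom reported here are tests on 2x2 tables (for example, changed versus unchanged genes in the AT versus CT subgenome). The following Python sketch shows the calculation under stated assumptions: the thesis does not say whether a continuity correction was applied, so it is optional in the code, and the gene totals in the example are hypothetical round numbers chosen only to reproduce the reported P1 percentages (9.1% vs 7.0%).

```python
import math


def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square test (df = 1) on the 2x2 table [[a, b], [c, d]].

    Returns (test statistic, p-value). Set yates=True for the continuity
    correction; whether the thesis used it is not stated.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2.0)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical totals of 10,000 genes per subgenome:
# AT: 910 changed, 9,090 unchanged; CT: 700 changed, 9,300 unchanged.
stat, p = chi_square_2x2(910, 9090, 700, 9300)
print(stat, p)
```

Because the real per-subgenome gene totals differ from these round numbers, the statistic printed here is illustrative and should not be read as the value behind any p-value in the text.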
P3 has significantly more decreases in expression levels than P1 and P2 in the AT sub-genome (χ², df = 1, p < 0.0001, P3 6.0% decreases vs P1 3.6% and P2 3.8% decreases), whereas P1 and P2 do not differ from each other (χ², df = 1, p = 0.1251). In the CT subgenome, P1 has significantly more decreases in expression levels than either P2 or P3 (χ², df = 1, p < 0.0001, p = 0.0059), and P3 has significantly more decreases than P2 (χ², df = 1, p < 0.0001).

Increases in expression levels are more frequent in the AT subgenome than in the CT subgenome in all resynthesized allopolyploids (χ², df = 1, p < 0.0001). In the AT subgenome, P1 and P3 do not significantly differ from each other (χ², df = 1, p = 0.09106) in the occurrence of expression level increases, while both have significantly more increases than P2 (χ², df = 1, p < 0.0001). In the CT subgenome, P1 and P2 do not significantly differ from each other in the number of increases in expression levels (χ², df = 1, p = 0.8103); however, both have significantly fewer increases than P3 (χ², df = 1, p < 0.0001). Finally, all three resynthesized allopolyploids shared 780 decreases (Table 4.2) (518 AT and 262 CT) and 690 increases (511 AT and 179 CT), highlighting that some changes are very repeatable in the allopolyploidization process. Except for P1 CT decreases and P3 CT increases, there were always more changes shared between two or more resynthesized polyploids than unique changes, despite a large number of unique changes. It would seem that the vast majority of changes are in fact part of a shared pool of potential or probable changes upon allopolyploidy, some of which are more probable than others.

4.3.2 Global Changes in B. napus vs. Resynthesized Allopolyploids

To compare the transcriptional changes immediately accompanying allopolyploidy to changes that have either been selected for or fixed over the course of the domestication of B. napus, we repeated the previous analysis with natural B. napus included. Genes for which no direct homolog between the B. 
napus assembly (Chalhoub et al. 2014) and the B. rapa and B. oleracea genomes could be identified were discarded, resulting in a smaller gene set. It was found (Table 4.3) that natural B. napus, which is estimated to be ~7500 years old (Chalhoub et al. 2014), is significantly different from any of the resynthesized allopolyploids in terms of expression levels. Overall, the natural B. napus had significantly more changes in expression levels from both subgenomes than any of the allopolyploids (Table 4.3); AT (natural 34.7% vs P1 8.5%, P2 6.8%, P3 11.3%) and CT (31.6% vs 7.3%, 4.5%, 7.1%) (χ², df = 1, p < 0.0001). In both subgenomes, the natural has more than three times the number of significant changes found in any of the synthetics, suggesting either that our panel of allopolyploids did not cover the entire range of possible immediate changes, or that the majority of changes build on the initial conditions after a polyploidy event, or some combination of these two. In every resynthesized allopolyploid and the natural, AT had more net changes (χ², df = 1, p < 0.0001). Investigating these changes further, both subgenomes of natural B. napus have significantly more decreases of expression than any of the resynthesized allopolyploids (χ², df = 1, p < 0.0001). However, the CT subgenome has more decreases than the AT subgenome in the natural (19.% decreases CT vs 15.1% decreases AT, χ², df = 1, p < 0.0001), as well as in P1 and P3 (P1 3.4% AT vs 5.5% CT, P3 5.8% AT vs 4.5% CT, χ², df = 1, p < 0.0001), but not in P2 (P2 3.3% AT vs 2.8% CT, χ², df = 1, p = 0.002133), where the AT subgenome is marginally yet significantly richer in decreases in expression levels.
Unsurprisingly, the natural has ~3x more increases in expression levels than any of the resynthesized allopolyploids (χ², df = 1, p < 0.0001), with the AT subgenome being significantly enriched compared to the CT subgenome in all resynthesized allopolyploids and the natural allopolyploid (natural AT 19.6% increases vs CT 11.9% increases, P1 5.0% vs 1.7%, P2 3.4% vs 1.6%, P3 5.4% vs 2.6%, χ², df = 1, p < 0.0001). We examined changes from the parental expression levels that were shared by all allopolyploids (Table 4.4), revealing a small subset of genes whose expression levels changed in the same direction in response to allopolyploidy in all plants tested.

4.3.3 Changes in Alternative Splicing Upon Allopolyploidization

We investigated the changes to splicing patterns accompanying allopolyploidization. A suite of custom Python scripts (Tack et al. 2014) was used to count the number of reads indicating constitutive or alternative splicing at each exon-exon junction. After filtering out events from genes which were not considered resolvable because of cross mapping, these counts of alternative and constitutive reads were used to run a generalized linear model at each exon-exon junction to test for a difference in the frequency of the alternative versus constitutive splicing events between the parental species and the allopolyploids at that junction. We considered IR (intron retention), ALTA (alternative acceptor), ALTD (alternative donor), and ALTP (alternative position) events, as these are the most common types in plants as well as being discretely quantifiable. To be considered for analysis, alternative events must have at least one genotype with 5 or more alternative reads per replicate, and have an alternative frequency between 5 and 50% of the total read pool for the junction, in order to remove situations where annotation is incorrect, or where the 'alternative' form is actually the constitutive form in leaf tissue.
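The two event-level criteria just described (read support per replicate and the 5 to 50% frequency window) can be expressed as a short filter. The Python sketch below is a hypothetical reading of those criteria, not the thesis scripts: in particular it assumes the alternative frequency is computed from reads pooled over all genotypes at the junction, which the text does not specify, and the function name, data layout, and counts are invented for illustration.

```python
def passes_event_filter(counts, min_alt_per_rep=5, min_freq=0.05, max_freq=0.50):
    """Decide whether an alternative splicing event is retained for modeling.

    counts maps a genotype name to a list of (alt_reads, constitutive_reads)
    tuples, one per biological replicate (an assumed layout).
    """
    # At least one genotype must show >= 5 alternative reads in every replicate.
    supported = any(
        all(alt >= min_alt_per_rep for alt, _ in reps)
        for reps in counts.values()
    )
    total_alt = sum(alt for reps in counts.values() for alt, _ in reps)
    total_con = sum(con for reps in counts.values() for _, con in reps)
    total = total_alt + total_con
    if total == 0:
        return False
    # Assumption: frequency is taken over reads pooled across all genotypes.
    freq = total_alt / total
    return supported and min_freq <= freq <= max_freq


# Hypothetical junctions with three replicates per genotype.
well_supported = {"BR": [(12, 80), (9, 75), (11, 90)],
                  "P1": [(3, 85), (6, 70), (4, 95)]}
poorly_supported = {"BR": [(4, 200), (6, 210), (7, 190)],
                    "P1": [(2, 200), (1, 220), (3, 180)]}
mostly_alternative = {"BR": [(90, 10), (85, 15), (95, 5)]}

print(passes_event_filter(well_supported))
print(passes_event_filter(poorly_supported))
print(passes_event_filter(mostly_alternative))
```

The third case illustrates the upper bound: an event seen in most reads is treated as the likely constitutive form in leaf tissue and is excluded.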
We further restricted our modeling to those junctions represented by at least 75 reads indicating the alternative event across all tested genotypes, and at least 300 reads indicating the constitutive event. A graphical example of the modeling process can be found in Figure 4.1. The overall density of the fold change has been plotted for models showing significant changes (Figure 4.2). We tested a total of 14,062 junction models for IR events in the AT subgenome and 13,051 models for IR events in the CT subgenome using data from all three synthetic polyploids and their parentals, of which 9,959 (70.8%) and 8,627 (66.1%) succeeded, respectively (Table 4.5). Overdispersion, an insufficient number of parental reads, or failure to meet criteria resulted in the model failing, whereas a significant change in IR event frequency required a genotype to produce a p-value < 0.05. The majority of supported models show no significant change of IR event frequency between the parental and allopolyploid levels; across the three polyploids an average of 77.0% of AT genome events and an average of 63.56% of CT genome events showed no significant change. In each polyploid, the CT genome IR event frequencies show significantly more change than the AT genome IR event frequencies (χ², df = 1, p < 0.0001, P1 32.9% change vs 20.8%, P2 32.8% change vs 20.7%, P3 43.5% change vs 27.1%). P3 has more change of IR event frequencies than either P1 or P2 in both AT and CT genomes (χ², df = 1, p < 0.0001, P3 27.1% changed AT frequencies vs P1 20.8% and P2 20.7%, P3 43.5% changed CT frequencies vs P1 32.9% and P2 32.8%).
P1 and P2 have the same ratio of IR event frequency change to conservation and are indistinguishable from each other in both AT and CT genomes in this regard (χ², df = 1, p = 0.75 for AT genome, p = 0.85 for CT genome).

Decrease in IR event frequency is the most common type of change in all resynthesized polyploids, with a total of 13583 cases of decreasing IR frequency across the three polyploids against only 2699 increases in frequency; AT events have an average of 4.9% increases and 18.0% decreases, and CT events have an average of 4.7% increases and 31.6% decreases, showing a negative trend for IR event frequency change. Most decreases in IR frequency are shared between two or more allopolyploids (table 3). Considering AT genome IR events, P1 has significantly more decreases than P2 (χ², df = 1, p < 0.0001, P1 16.3% decreases, P2 13.4%), and P3 has significantly more decreases than P1 or P2 (χ², df = 1, p < 0.0001, P3 24.3% decreases). When CT events are considered, P1 again has more decreases than P2 (χ², df = 1, p = 0.00143, P1 28.5% decreases, P2 26.4%), and P3 has significantly more decreases than either P1 or P2 (χ², df = 1, p < 0.0001, P3 40.0% decreases). Significantly more of the IR events show decreased frequency in the CT subgenome than in the AT subgenome in every polyploid (χ², df = 1, p < 0.0001, P1 28.5% CT decrease vs 16.3% AT decrease, P2 26.4% vs 13.4%, P3 40.0% vs 24.3%).

Increases of IR event frequencies are less common overall than decreases. Like decreases, the majority of them tend to be shared by two or more polyploids. Considering AT genome IR events, P2 has significantly more increases in IR frequency than either P1 or P3 (χ², df = 1, p < 0.0001, P2 7.2% increases vs P1 4.5% and P3 2.8%). P1 has significantly more increases in IR frequency than P3 (χ², df = 1, p < 0.0001, 4.5% increases vs 2.8%). When CT events are considered, this pattern repeats itself, with P2 having significantly more increases than P1 or P3 (χ², df = 1, p < 0.0001, P2 6.3% increases vs P1 4.3% and P3 3.5%).
In CT events, P1 and P3 are discernible from each other in terms of increases (χ², df = 1, p = 0.007018, P1 4.3% increases vs P3 3.5%). The subgenomes of each polyploid were compared to each other, revealing no subgenome bias in the amount of positive change between AT and CT in P1 (χ², df = 1, p = 0.5546, P1 AT = 4.5% increases vs P1 CT = 4.3% increases). In P3, the CT subgenome has slightly more increases than the AT sub-genome (χ², df = 1, p = 0.007765, P3 AT = 2.5% vs P3 CT = 3.5%), whereas in P2 the AT sub-genome has slightly more increases (χ², df = 1, p = 0.01929, P2 AT = 7.2% vs P2 CT = 6.3%).

Alternative acceptor (ALTA) events are the second most common type of AS event in Arabidopsis thaliana (Marquez et al. 2012, Filichkin et al. 2010), with our results extending this pattern to several other Brassicaceae. In the same manner, we tested 1894 junction models (Appendix B, Table B.2) in the AT subgenome and 1511 models in the CT subgenome, with 1328 (70.1%) and 1003 (66%) of models passing, respectively. Polyploidy appears to affect ALTA events much less severely than IR events; in every polyploid, in both subgenomes, there is less total change in ALTA frequencies than there is in IR frequencies (χ², df = 1, p < 0.0001, AT P1 20.8% change to 13.3%, AT P2 20.7% change to 14.0%, AT P3 27.1% change to 16.3%, CT P1 32.9% change to 21.1%, CT P2 32.8% change to 24.7%, CT P3 43.5% change to 26.1%). Overall, 85.4% of AT ALTA event frequencies did not change between the parental state and the polyploids, as opposed to 77.0% for IR events; 75.7% of CT ALTA event frequencies were conserved, while only 63.5% of CT IR event frequencies were conserved. As with IR events, the CT genome showed more change than the AT genome in every polyploid (χ², df = 1, p < 0.0001, P1 21.9% change vs 13.3%, P2 24.7% change vs 14.0%, P3 26.1% change vs 16.3%).
However, alternative acceptor events do not display the negative bias seen in the IR events; AT events have an average of 7.2% increases and 7.3% decreases, and CT events have an average of 14.2% increases and 10.0% decreases, showing rather balanced change compared to IR events. The CT subgenome was significantly enriched for decreases in event frequency compared to the AT subgenome in P1 and P2 (χ², df = 1, p = 0.04497, p = 0.004142, P1 CT 9.0% decreases vs P1 AT 6.6% decreases, P2 CT 10.2% vs P2 AT 6.7%) but not in P3 (χ², df = 1, p = 0.1171, P3 CT 10.7% decreases vs P3 AT 8.6%). In each polyploid, changes in CT events are significantly more likely to be positive than those in AT events (χ², df = 1, p < 0.0001, P1 CT 12.8% increases vs AT 6.6%, P2 CT 14.5% increases vs AT 7.3%, P3 CT 15.4% increases vs AT 7.6%). Next, we tested the ALTA events for the same patterns that occurred in the IR events; specifically, whether P3 showed the most decreases of frequency in both subgenomes, and whether P2 showed the most increases. P3 does not show significantly more decreases than P1 or P2 in the CT genome (χ², df = 1, p = 0.2491, p = 0.7644, P3 CT 10.7% decreases, P1 9.0%, P2 10.2%), or in the AT genome (χ², df = 1, p = 0.06577, p = 0.07801, P3 AT 8.6% decreases, P1 6.6%, P2 6.7%). The AT subgenome in P2 does not have significantly more positive change than the AT subgenome in P1 or P3, nor does P3 have more positive change than P1 (χ², df = 1, p = 0.5376, p = 0.7658, p = 0.3225, P2 AT 7.3% increases, P1 6.6% and P3 7.6%). In the CT subgenome events, P2 is similarly not enriched vs the other two polyploids, nor does P3 have more change than P1 (χ², df = 1, p = 0.3166, p = 0.6524, p = 0.1295, P2 CT 14.5% increases, P1 12.8% and P3 15.4%).

Alternative 5' splice sites, or alternative donors (ALTD), are the next most common type of alternative splicing event in the Brassicaceae.
Alternative donor event frequencies (Appendix B, Table B.3) are conserved much more highly than IR event frequencies in all polyploids, in both subgenomes (χ², df = 1, p = 0.0001, p = 0.0006, p < 0.0001, p = 0.0007, p = 0.016, p < 0.0001, P1 AT 20.8% IR change vs 15.8% ALTD change, P2 AT 20.7% vs 16.2%, P3 AT 27.1% vs 18.3%, P1 CT 32.9% vs 26.9%, P2 CT 32.8% vs 28.4%, P3 CT 43.5% vs 32.1%). Comparing the amount of frequency change in ALTD events to the change in ALTA events, no significant difference was found within the AT subgenome (χ², df = 1, p = 0.0911, p = 0.1624, p = 0.2216). Considering CT subgenome events, P1 and P3 had significantly more change in ALTD event frequency than in ALTA event frequency (χ², df = 1, p = 0.02002, p = 0.007697, P1 21.9% change in ALTA, 26.9% in ALTD, P3 26.1% change in ALTA, 32.1% in ALTD). Alternative acceptor and alternative donor events change in a similar pattern, with mixes of increases and decreases, rather than the overall trend of decreases seen in IR events. Only P3 was found to have significantly more change in ALTD levels than P1, and only in the CT subgenome events (χ², df = 1, p = 0.02852, P3 32.1% change to P1 26.9%). No polyploid was found to have significantly more decreases or increases than another polyploid between equivalent subgenomes. However, the CT subgenome has more change in ALTD frequency than the AT subgenome in every polyploid (χ², df = 1, p < 0.0001, P1 AT 15.8% change vs CT 26.9%, P2 AT 16.2% change vs CT 28.4%, P3 AT 18.3% change vs CT 32.1%).
Negative changes in ALTD event frequency are more common in the CT subgenome than in the AT subgenome (χ², df = 1, p < 0.0001, P1 CT 14.0% decrease vs AT 6.9%, P2 CT 14.8% decrease vs AT 8.2%, P3 CT 16.2% vs AT 7.9%), as are positive changes in ALTD frequency (χ², df = 1, p = 0.007958, p = 0.0001433, p = 0.0006677, P1 CT 12.8% vs AT 8.8%, P2 CT 13.6% vs AT 8.0%, P3 CT 15.8% vs AT 10.3%).

Finally, we tested alternative position (ALTP) events, which change both the 5' and 3' splice sites between two adjacent exons (Appendix B, Table B.4). We tested 1235 AT subgenome ALTP events and 731 CT subgenome ALTP events, of which 943 (76.3%) and 549 (74.0%) passed, respectively. Only the AT subgenome in P3 had significantly more total change than the AT subgenome in P1 (χ², df = 1, p = 0.0347, P3 change 17.9% vs P1 13.7%), while there were no other significant differences in the total amount of change between equivalent subgenomes among the polyploids. Within both AT and CT subgenome events, no polyploid had a significant enrichment of increases or decreases in ALTP frequency over another. Mirroring ALTA, ALTD, and IR events, the CT subgenome has significantly more change than the AT subgenome in all polyploids (χ², df = 1, p < 0.0001, p < 0.0001, p = 0.0008029, P1 23.4% change vs 13.7%, P2 28.1% change vs 15.5%, P3 26.3% change vs 17.9%). The CT subgenome is not enriched for decreases of ALTP frequency compared to the AT subgenome (χ², df = 1, p = 0.8829, p = 0.2916, p = 0.3968, P1 CT 6.0% decrease vs AT 5.6%, P2 CT 7.3% decrease vs AT 5.6%, P3 CT 6.2% decrease vs AT 7.7%); however, the CT subgenome is enriched in increases in ALTP frequency in all polyploids (χ², df = 1, p < 0.0001, P1 CT 17.4% increases vs AT 8.0%, P2 20.8% increases vs 9.8%, P3 20.1% increases vs 10.1%).

4.3.4 Changes in Alternative Splicing in B. napus

To compare the changes immediately accompanying allopolyploidization to longer term evolutionary changes, the alternative splicing patterns of natural B. 
napus were analyzed and compared to its diploid progenitors alongside the three synthetics. To ensure a robust assay, only events that occurred in the natural allopolyploid were considered, and then only when the gene had a known parental ortholog (Chalhoub et al. 2014) with its exon structure completely conserved. All exons and introns of both parental genomes were queried against all exons and introns of the B. napus genome using a discontiguous megablast, and a gene and all of its events were removed from further analysis if any consecutively numbered exon or intron did not produce a meaningful hit (e ≤ 0.0001) to its equivalently numbered exon or intron in B. napus. To allow for small inconsistencies or changes in genes, ALTA and ALTD events were allowed to be off by as much as 2 bp; that is, the addition to or subtraction from the exon that the event generates in the parental species was allowed to differ by up to 2 base pairs in the natural and was still considered the same event, whereas ALTP events were allowed up to 3 bp of difference at either end. All of these criteria together led to a considerable trimming of the available events, while ensuring the events surveyed were indeed parallel events.

When IR events are considered, it is immediately clear that natural B. napus is very different from the synthetics (Figure 4.3, Table 4.6, Appendix B, Tables B.5, B.6, B.7). In every case, there is significantly more change in IR event frequency in the AT subgenome of natural B. napus (N) than in any of the AT subgenomes of the three resynthesized allopolyploids (χ², df = 1, p < 0.0001, N 46.1% change vs P1 21.0%, P2 22.0%, P3 28.0%). When CT subgenome events are considered, the natural B. napus again has far more change than every synthetic (χ², df = 1, p < 0.0001, N 46.8% change vs P1 33.3%, P2 33.0%) except P3 (χ², df = 1, p = 0.1301, N 46.8% change vs P3 44.5%).
P3 and the natural have an equivalent proportion of decreases in IR event frequency in the AT subgenome (χ², df = 1, p = 0.4444, P3 25.5% decreases, N 24.6%), and both are enriched in decreases compared to P1 and P2 (χ², df = 1, p < 0.0001, P1 16.1% decreases, P2 14.0%), where P1 has more decreases in IR event frequency than P2 (χ², df = 1, p = 0.03699). In the CT subgenome, P3 has more decreases not only than the other resynthesized allopolyploids but also than natural B. napus (χ², df = 1, p < 0.0001, P3 40.4% decreases vs P1 28.3%, P2 25.4%, N 29.8%). The natural CT is essentially the same as P1 (χ², df = 1, p = 0.2684, P1 28% decreases vs N 29.8%), and both have more decreases in IR event frequency than P2 (χ², df = 1, p = 0.001, p = 0.03287, P2 25.4% decreases). However, in both subgenomes, the natural has significantly more increases in IR event frequency than any resynthesized allopolyploid (χ², df = 1, p < 0.0001, N 21.4% AT increases vs P1 4.9%, P2 8.0%, P3 2.4%, N 16.9% CT increases vs P1 5.0%, P2 7.5%, P3 4.1%), well exceeding P2, which is itself significantly enriched in increases vs P1 and P3 in both subgenomes (χ², df = 1, p < 0.0001, p = 0.0005). The AT subgenome in natural B. napus has significantly more increases in IR event frequency than the CT subgenome (χ², df = 1, p < 0.0001, N AT 21.4% increases vs N CT 16.9%). In P1 and P2 the subgenomes are indistinguishable (χ², df = 1, p = 0.9183, p = 0.56), yet in P3 the CT subgenome has significantly more increases (χ², df = 1, p = 0.001721, P3 CT 4.1% increases, AT 2.4% increases). All CT subgenomes, in resynthesized or natural polyploids, have more decreases than their corresponding AT subgenome (χ², df = 1, p < 0.0001, P1 CT 28.3% decreases vs AT 16.1%, P2 25.4% vs 14.0%, P3 40.4% vs 25.5%, N 29.9% vs 24.6%).

4.3.5 Gene Ontology for Alternative Splicing Categories

Genes with events which increased, decreased, or retained the parental AS event frequency in all three synthetic polyploids were used to run a gene ontology analysis (Appendix B, Table B.9) with the web based DAVID tool (Hwang et al. 
2009a, 2009b) using the ortholog from Arabidopsis thaliana (Chalhoub et al. 2014) as a proxy for function. Genes that had a decrease in IR event frequency in all polyploids are enriched in environmental response categories, abscisic acid response, and fatty acid biosynthetic processes. Environmental and stress response genes are often implicated as being AS rich; thus, while they are likely to be over-represented by association, these results highlight some of these changes as potentially altering the stress resilience profile of the new allopolyploids, wherein AS can function as a positive or negative regulator of stress responses. We examined the same categories with natural B. napus included, and found a similar trend: genes expressing IR events at or below the parental frequency show enrichment in environmental response genes. We also investigated genes for which the resynthesized polyploids showed no change from the parental frequency but natural B. napus did; genes with either increases or decreases in frequency in the natural but not in the resynthesized allopolyploids are enriched in metal binding, response to jasmonic acid stimulus, and fatty acid biosynthetic processes. Genes containing events which decreased in frequency in every synthetic polyploid but occurred at or above the parental frequency in the natural polyploid were investigated to elucidate some of the possible significance of so many events increasing in frequency in the natural polyploid. Genes with an event frequency decrease in every resynthesized allopolyploid and a frequency equal to the parental level in the natural allopolyploid show enrichment in light stimulus response, lipid biosynthetic processes, and response to abiotic stress.
While GO analysis is rather vague, these results, combined with the concerted study of specific genes where the presence or ratio of AS has a known phenotypic consequence, put a focus on AS changes as a significant driver of the variation and adaptations that polyploidy can unleash.

4.3.6 Experimental Verification of Alternative Splicing Events

To verify the Illumina sequencing results with a different technique, 28 RT-PCR primer sets were designed around random junctions containing one or more events (Appendix B, Table B.11, Figure 4.4), with 24 sets (85%) confirming the presence of one or more events predicted at that junction. Eight of the junctions contained multiple predicted events, raising the total number of events queried to 35, of which 30 (85%) were confirmed. Figure 7 displays some of the gel images where these events were resolved. More information can be found in materials and methods under RT-PCR.

4.3.7 Changes in Serine-Arginine Gene Alternative Splicing

Transcripts of serine-arginine (SR) proteins are extensively alternatively spliced, yet are themselves involved in the regulation and mediation of splicing (Palusa et al. 2007, reviewed in Black 2003, Busch and Hertel 2011). Orthologs of known SR genes in Arabidopsis (Chalhoub et al. 2014, TAIR) from both subgenomes were specifically investigated using our expression and splicing data. We found 52 events in SR genes which passed all criteria, of which 30 (57%) had a significant change in frequency in at least one polyploid, and 10 (19%) showed a significant change in all polyploids (Appendix B, Table B.10). Of the changes that were significant, 68% were decreases in frequency and 32% were increases in the frequency of the event. Bra010459 (RS40) shows a marked decrease in the frequency of the 1-2 IR event from the parental level of 52.7%, resulting in retention as low as ~25% in the most extreme polyploid.
Bo4g154780 (SRZ22a) shows a similar decrease in the 2-3 IR event, having a parental frequency of 17.2% and losing about 30% of its event frequency in the polyploids, down to ~12% alternative splicing. Bo6g092410 (SR 33) has a parental 2-3 IR event frequency of 31.1%, whereas in the polyploids it varies from ~40-49%. All three of these IR events would result in the inclusion of a premature termination codon (PTC).

4.3.8 Patterns of Splicing between Homeologs

Homeologous genes in B. napus derived from B. rapa and B. oleracea (Chalhoub et al. 2014) were BLAST searched against each other to find gene pairs with no rearrangements between exons, where every contiguously numbered exon matched. AS events occurring at equivalent junctions between homeologs were quantitatively compared; qualitative patterns of presence/absence were not analyzed. Polyploid B. napus has over three times the number of genes of A. thaliana (101,040 vs. 28,755), resulting in a lower depth of the transcriptome and difficulty in reliably ascertaining true qualitative presence/absence patterns from RNA-seq data alone. Our filter of requiring at least five reads in every replicate of at least one genotype to accept an event could lead to false qualitative differences where the event is present in both homeologs but falls below the threshold in one homeolog. The exact boundary between extreme quantitative and qualitative differences is difficult to establish without very high depth of RNA-seq data or an exhaustive panel of sequenced RT-PCR products, such as in Zhou et al. 2011. Thus we chose to investigate only quantitative AS differences between homeologs where both were above detection thresholds and where we were confident we had sufficient coverage and depth to compare their AS patterns. We analyzed 2196 IR events shared between B. rapa and B. 
oleracea, where 1083 of them occurred at a similar frequency in the native A and C genomes, and 1113 of them displayed a bias between parental genomes; 455 of these events were higher in the A genome and 658 of them were higher in the native C genome (Table 4.7). Equivalent events which had similar frequencies before allopolyploidization tend to retain this relationship upon allopolyploidization; across all three synthetics 73.8% (2398/3249) of such pairs maintained a non-biased relationship, whereas 513 (15.8%) of them gained AT subgenome bias (higher frequency of the event in the AT homeolog) and 338 (10.4%) gained CT subgenome bias. Significantly more unbiased events developed an AT bias than a CT bias in the allopolyploids (χ², df = 1, p < 0.0001). Next, we tested events which were A biased between the parents. Across all three synthetics, 732 (53.6%) of pairs retained this bias, whereas 594 (43.5%) pairs changed to non-biased, with the events having equivalent frequencies in the resynthesized plants. Interestingly, there were 39 (2.8%) cases across the three polyploids where the parental A bias changed to a CT bias, where the CT gene had a higher event frequency. Pairs which were C biased in the parentals display a similar pattern; 942 pairs (47.7%) retained this bias in the CT sub-genome, 954 pairs (48.3%) lost their bias, and 78 pairs (3.9%) changed to an AT sub-genome bias. Thus, A genome homeologs retained their parental biases more often than C genome homeologs in the allopolyploids (χ², df = 1, p = 0.0008, 53.6% A homeolog bias preserved vs 47.7%). We continued this analysis to include homeolog pairs in natural B. napus (Appendix B, Table B.8). This required both diploid versions of each gene to match their B. napus versions as well as each other, and required the event to pass the detection threshold in all four biological groups: both parentals, the synthetic polyploids, and the natural B. napus.
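The bias categories tallied above (unbiased, A biased, C biased, and the transitions between them) can be made concrete with a small classifier. The Python sketch below is illustrative only: it assumes a per-junction p-value from the homeolog comparison is already available, uses the 0.05 threshold applied elsewhere in this chapter, and runs on hypothetical frequencies and p-values rather than values from Table 4.7.

```python
from collections import Counter


def classify_bias(freq_a, freq_c, p_value, alpha=0.05):
    """Label a homeolog pair as 'A', 'C', or 'unbiased' at one junction."""
    if p_value >= alpha:
        return "unbiased"
    return "A" if freq_a > freq_c else "C"


def bias_transition(parent, polyploid):
    """Return (parental state, polyploid state) for one homeolog pair.

    Each argument is a tuple (freq_a, freq_c, p_value).
    """
    return classify_bias(*parent), classify_bias(*polyploid)


# Hypothetical pairs: (parental comparison, polyploid comparison).
pairs = [
    ((0.21, 0.14, 0.01), (0.19, 0.31, 0.02)),    # A biased, flips to C
    ((0.055, 0.063, 0.61), (0.04, 0.12, 0.01)),  # unbiased, gains C bias
    ((0.15, 0.16, 0.97), (0.17, 0.21, 0.40)),    # stays unbiased
]
tally = Counter(bias_transition(parent, poly) for parent, poly in pairs)
for (before, after), n in sorted(tally.items()):
    print(before, "->", after, n)
```

Dividing each tally by the number of pairs in its parental category gives percentages of the kind reported in this section, such as the share of unbiased pairs that remain unbiased.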
This considerably truncated the data set, but allowed an investigation of a total of 410 IR events, with 177 pairs not biased in the parental species and 233 which were biased (Appendix B, Table B.8). While the total ratio of parental non-biased pairs which stayed the same or developed a bias was essentially the same between any resynthesized polyploid and the natural (χ2, df=1, p>0.50), B. napus has significantly more unique A genome biases than any resynthesized polyploid (χ2, df=1, p=0.08, 53.8% unique vs P1 18.5%, P2 14.2%, P3 20.5%). Among parental A biased pairs, the natural B. napus has significantly fewer pairs that have retained A bias than any of the synthetics (χ2, df=1, p<0.05, P1 57.3%, P2 56.0%, P3 54.8% vs 36.5% N), while having more unique changes to a non-biased state than any resynthesized polyploid (χ2, df=1, p<0.05, 32.6% unique vs P1-P3 0%, 9.3%, and 8.8% unique). Among C biased pairs, natural B. napus has an astounding 22 unique pairs which flipped to A bias, significantly fewer pairs retaining C bias than any of the synthetics (χ2, df=1, p<0.05, 40.3% N vs 54.3%-56.9% for the polyploids), and more unique changes to unbiased than any resynthesized polyploid (χ2, df=1, p<0.05, 28.5% N vs 6.3%-12.3% unique for the polyploids).

The homeolog pair Bra007937/Bo2g068460 corresponds to AT1G70700, JASMONATE-ZIM-DOMAIN PROTEIN 9; the 5-6 junction was A biased in the parents (glm, p<0.05, A 21.1% vs C 13.6% IR), but P1 and P2 flipped this bias (glm, p<0.05, P1 AT 19.3% vs CT 30.8%, P2 AT 18.9% vs CT 27.8%) and P3 saw this junction become non-biased (glm, p=0.401, P3 AT 17.4% vs CT 20.6%). The 5-6 intron is also retained in Arabidopsis thaliana (TAIR, ASIP), and contains PTCs in every species. JAZ genes are frequently alternatively spliced, where the intron-retaining forms have reduced binding efficiency, functioning as dominant repressors of jasmonic acid signaling (Chung et al.
2010), though the 5-6 intron is not explicitly implicated as either being conserved in the JAZ family of genes or having this behavior in Arabidopsis (Chung et al. 2010). For Bra013145/Bo3g087110 (AT2G13540, ABA HYPERSENSITIVE 1), junction 9-10 IR is non-biased in the native genomes (glm, p=0.611, A 5.5% vs C 6.3% IR), but develops a CT bias in all resynthesized polyploids (glm, p<0.05, AT 3.8%, 3.6%, 4.2% vs CT 11.7%, 13.7%, 8.8% IR). For Bra035914/Bo9g014730 (AT5G61770, PETER PAN-LIKE PROTEIN), 5-6 IR gained AT bias in P2 only (glm, p<0.001, P2 AT 23.6% vs CT 5.8%), where the pair was non-biased in the parentals (glm, p=0.971, A 15.4% vs C 15.9% IR). For Bra029584/Bo5g143430 (AT3G06510, SENSITIVE TO FREEZING 2), 9-10 IR maintains a strong C bias in the parentals and in resynthesized polyploids P1 and P3, where the A/AT homeolog varies between 8-10% IR and the C/CT homeolog varies between 21-33% (glm, p<0.0001); P2 has this bias as well (5.0% vs 27.3%), but the replicates are inconsistent, which prevents statistical verification of this bias with the same test.

We also examined some cases where peculiar patterns between the resynthesized polyploids and the natural polyploid occurred. For Bra005131/Bo4g029730 (AT2G38170, RARE COLD INDUCIBLE 4), 2-3 IR shows C/CT bias in the parentals and all synthetics (glm, p<0.001, C/CT 20%-98% higher IR frequency), but shows A/AT bias in the natural (glm, p<0.05, AT IR 20.2%, CT IR 14.4%). For Bra029778/Bo5g138190 (AT3G09600, LHY-CCA1-LIKE5), IR 7-8 displays much the same pattern, where a C/CT bias in the parentals and all synthetics (glm, p<0.001, C/CT 202%-322% higher) turns to an A/AT bias in the natural (glm, p<0.001, AT 14.4% IR, CT 8.6%). For Bra007872/Bo2g064260 (AT1G69640, SPHINGOID BASE HYDROXYLASE 1), 1-2 IR shows C/CT bias in all but the natural, where it becomes AT biased (28.1% AT vs 21.8% CT, p<0.05).
Finally, for Bra027508/Bo9g059330 (AT5G43860, CHLOROPHYLLASE 2), 1-2 IR is A biased in the parents (glm, p<0.05, A 57.3% vs C 42.9%), is not significantly biased in any of the synthetics (glm, p=0.31, p=0.58, p=0.052; P1 41.5% vs 36.2%, P2 37.1% vs 39.4%, P3 40.9% vs 31.9%), and is CT biased in the natural (AT 12.7% vs CT 36.3%, p<0.0001).

4.4 Discussion

In light of many studies showing that aspects of polyploidy are repeatable (reviewed in Buggs et al. 2014), and those showing changes that are less repeatable or even unique to a particular instance of polyploidy (Hu et al. 2015, Gaeta et al. 2007, Zhou et al. 2011, reviewed in Soltis et al. 2014), the relative weights of these types of changes can be evaluated on their importance to the formation of new, successful polyploids. If common changes, changes that happen in most polyploids of a given heritage, were responsible for evolutionary success, then it would seem odd that successful polyploids are so rare. However, if rare and unique changes are responsible for success, then the pattern appears explained; only an exceptionally rare set of changes offers enough innovation to compete and persist. The vast majority of expression and splicing levels do not change, showing that while gene regulatory architecture is rather robust, change is all but inevitable during genome and transcriptome shock. Furthermore, most changes in both expression and alternative splicing are shared among our polyploids, showing that most changes occurring during allopolyploidization are repeatable. Another possibility is that successful polyploids are not dependent on unique changes giving them a selective advantage, but rather that the repeatable changes simply open up new avenues on which selection can act. The majority of changes in alternative splicing are consistently found in all polyploids, and thus are part of the repeatable aspects of allopolyploidization.
The AT subgenome consistently shows less change, and a majority of this change is negative for IR events in both subgenomes. A more complex and crowded trans-environment, with transcription and splicing factors from both subgenomes, could account for part of the loss of IR event frequency, as well as the changes in other event frequencies, as factors overcrowd or overload the native splicing regulation of genes. Splicing is coordinated between the trans environment and cis acting elements and other splicing factor binding sites, including ESE/ESS (exon splicing enhancers and silencers) and ISE/ISS (intron splicing enhancers and silencers), where competition due to trans effects could be effectively down-regulating the absence of splicing, i.e. intron retention events. Many SR genes are losing AS themselves and thus putatively produce more active splicing protein, which could also explain the loss of so much IR frequency. Other genes may have lost the ability to auto-regulate their own AS, where their expression level in the new polyploid is never sufficiently high to start a negative feedback loop, resulting in a decrease in detectable AS. The methylome is reshaped upon polyploidy (Salmon et al. 2004), and this can influence splicing patterns (reviewed in Moar et al. 2015).

Given the amount of change that is consistent in all the synthetics we tested, and the known rarity of polyploidy persisting, having a chance pre-adaptation to a new environment or stress by means of expression level, AS level, or AS-NMD mediated expression level would seem to be a potentially enabling factor. Enough rare, chance changes in the right direction among the background of expected and consistent changes accompanying polyploidy might just enable the protean adaptability characteristic of polyploidy. With so many stress and environment related genes appearing as changed in the GO analysis of AS containing genes, polyploidy could be thought of as a lock and key mechanism.
There are niches yet to be exploited, or potential to further dominate an existing niche, but polyploidization must be repeated millions of times before, by chance, a polyploid has exactly the right changes to be at once pre-adapted for the particular environment it finds itself in, finding the exact right fit for the environmental 'lock'.

Gene expression is an extremely complicated process with many positive and negative factors affecting the final amount of expression or splicing. The yin-yang model of alternative splicing (reviewed in Nilsen and Graveley 2010) attempts to describe the balance of splicing outcomes by the relative strength of negatively and positively (largely SR proteins) acting factors acting on a single splice site. In our study, we see a massive decrease in the frequency of IR events post allopolyploidization, suggesting that factors which favor splicing may be overexpressed or have lost some regulation, resulting in an 'overloaded' trans environment. In this environment, the cellular signal to splice out introns is sufficiently strong to overpower the retention of even normally retained introns. It is possible that there is simply an excess abundance of SR proteins; if SR proteins from both subgenomes are expressed in the resynthesized allopolyploids, retention events will be suppressed, giving the appearance of decrease or loss of events (Zhou et al. 2011), whereas the other classes of events, which are changes in splicing rather than a binary choice of splicing or not, will be less affected. It is curious that in autopolyploid watermelon (Saminathan et al. 2015) there appears to be the opposite trend, with an increase in alternative splicing, where there may be a dosage balance effect with two sets of the same SR genes rather than the transgressive effect we observe in this allopolyploid with two different sets of SR genes.
It is possible that a hybrid and polyploid transcriptome creates a more complex and crowded trans environment than a normal diploid. This could overload the cis regulatory circuitry of some genes and exceed the native splicing efficiency for which it is adapted. However, this 'over splicing' results in competition between splice site choices at junctions where there are alternative acceptor/donor sites, rather than just a saturation which highly suppresses the generation of un-spliced transcripts, i.e. intron retention. Changes in acceptor or donor event frequencies reflect a change in the preferred splicing site, whereas changes in IR frequency are a change in the frequency of splicing.

Intron retention seems to be the most affected of all types of alternative splicing tested, while all of the event types which change the splice site, rather than eliminating it, seem less affected, demonstrating remarkably similar patterns of change. Alternative splicing is partially regulated by cis-elements known as exon splicing enhancers and silencers and intron splicing enhancers and silencers. More transgressive behavior from the AT transcriptome on the CT subgenome may contribute to the elevated rate of change of CT events vs AT events, or trans acting factors from the AT transcriptome may be overpowering or out-competing factors from the CT transcriptome. Stable interacting partners and transcription factor and splicing factor stoichiometry are likely to be significantly disrupted, but the complexity of splicing and the number of changes that are occurring make it very difficult to point to a tractable source or pattern. The relative degree to which transgression versus loss of the native trans environment contributes to the changes in alternative splicing poses an interesting question, yet one that is unsolvable in this experiment.
Viewed in the light of the trans-environment, the overloaded and much more complex trans-environment may rob transcripts of the opportunity to leave the nucleus before being processed; competing signals from both transcriptomes may over-process and disrupt normal adaptive IR events. Such a global reduction in the frequency of retained introns may have profound influences on the end result of the transcriptome, as AS-NMD is thought to play a central role in the auto-regulation of the steady state transcriptome in Arabidopsis (Drechsel et al. 2013). The majority of events retaining their frequencies attests to the robustness and strength of cis-elements in controlling and regulating splicing.

Although the natural has a considerable number of decreases in intron retention in both subgenomes, this number is within the range found in the three resynthesized allopolyploids, whereas the number of increases is clearly outside the range of anything found in the three resynthesized allopolyploids. B. napus is expected to be about 7,500 years post allopolyploidization, though it is unclear if the natural strain of today started out with a more balanced set of changes, or if its particular parents were similar to the plants used. The resynthesized allopolyploids do have a good deal of variation in the proportion of changes they experience, so given a large amount of time, selection may have arrived at a state similar to the one we have today, or these changes may be part of the long term changes accompanying allopolyploidization. If the networks of splicing factors and transcription factors settle after transcriptome shock, or a less complex trans-environment were to manifest, the massive number of decreases seen in the IR events in the resynthesized allotetraploids may eventually give way to a fine tuning of the transcriptome and splicing patterns.
In the AT subgenome, we found 310 events which were equal to the parental levels in the resynthesized allotetraploids but raised in the natural, and 273 events which were similarly lowered in the natural. In the CT subgenome, we found 167 events which were the same in the parents and the resynthesized allotetraploids but raised in the natural, and 150 which were lowered in the natural. A study in tetraploid watermelon (Saminathan et al. 2015) found that AS rates increased overall in their autopolyploid system, whereas in our allopolyploid system this polyploidy effect is possibly suppressed by the effect of hybridization until later generations, as the natural B. napus used here is presumed to be at least 7,500 years old. Our B. napus has as many decreases as any of the resynthesized allotetraploids, but more increases.

It has been shown in Arabidopsis that alternatively spliced SR protein transcripts are frequently targeted by the NMD pathway (Paulsa and Reddy, 2010) by means of a PTC in the intron. Among the alternative splicing events we found in the SR genes in our system, 80% would result in the inclusion of a PTC. Since most of these changes are decreases in frequency of the alternative event, i.e. an event potentially targeted by NMD via a PTC, more functional SR protein could be produced in the allopolyploids compared to the parental state. SR proteins are required for the initiation of splicing by recognizing and recruiting other factors to splice sites. These potentially higher levels of functional SR proteins in the allopolyploids might explain some of the observed differences between overall IR event frequency change and ALTA, ALTD, and ALTP event frequency change. The largest amount of negative change in frequency was found in the IR events, whereas the rest were relatively balanced in increases and decreases.
The potentially greater ratio of functional SR proteins would allow fewer introns to remain un-spliced, or possibly compete against those factors that initiate intron retention, causing a net loss of IR frequency, whereas other types of alternative splicing where the splice site changes, rather than no splicing occurring at all, would be affected in a less predictable fashion. In addition to more genes being actively transcribed in the polyploids, it would seem probable that more are being fully spliced and therefore more likely to become active proteins in the polyploid state.

The ratio of splicing events is detectable in the transcriptome, and these event ratios are heavily implicated both in controlling final transcript level via NMD (Drechsel et al. 2013) and in altering the proteins themselves. Both of these modes of action can have outcomes on phenotype, demonstrating that changes in alternative splicing frequency are relevant in the context of polyploidy, with those specific to an instance of polyploidy perhaps being the most interesting. Transgressive phenotypes, which are characteristic of polyploids and have traits beyond the parental range, are the result of novel transcriptomes. Some of the novelty of new transcriptomes can potentially be attributed to the marked change in splicing ratios. Splicing ratios can have dramatic effects on the functionality of the downstream proteins (reviewed in Mastrangelo et al. 2012, Yang et al. 2014). CIRCADIAN CLOCK-ASSOCIATED1 (CCA1) regulates part of the circadian clock and temperature response in Arabidopsis via modulating the retention frequency of intron 4, where transcripts containing the intron function as a dominant-negative regulator of the fully spliced form. Mutants that only produce the fully spliced form have a higher resistance to cold (Seo et al. 2012).
This demonstrates a case where the frequency of AS confers a phenotype, particularly one that may be advantageous to a polyploid invading a new environment. In our polyploids, we found significant splicing frequency changes in orthologs of CCA1. In the A genome ortholog Bra004503, the retention frequency of intron 2 increased in P3 from a parental average of 12.5% AS to 16.6% AS, increasing retention by 32.8%. In the C genome ortholog Bo4g006930, the retention frequency of intron 1 decreased in all polyploids from a parental average of 37.7% to 27.1%, losing 28% of its AS frequency. LHY, which is the α-duplicate of CCA1, curiously shows the inverse AS pattern in Arabidopsis, where it constitutively splices out exon 5 (equivalent to CCA1 intron 4) at 20ºC, but includes it in response to cold, whereas CCA1 mostly includes its 4th exon at 20ºC (James et al. 2012) and splices it out at colder temperatures. In both cases the intron inclusion ratio allows the coordination of responses to cold and/or the regulation of the circadian clock. We detected an increase of an event equivalent to that of LHY in Bra033291, where intron 5 inclusion in every polyploid increased to between 1.6x and 2.2x the parental value. Bra033291 is improperly annotated in the B. rapa genome: we detected the event as a 3-4 event, or intron 3, but the B. rapa genome annotation is missing the first two exons which our RNA-Seq detects, i.e. it is equivalent to intron 5 in Arabidopsis, which was confirmed by BLAST. Essentially, the LHY ortholog is more frequently retaining an intron normally reserved for cold response at 20ºC, with the parental frequencies between 2.4% and 2.8% and polyploid frequencies as high as 5.8%. In Arabidopsis, the FCA gene's fully spliced protein product can influence its own abundance by causing alternative splicing of its transcript into an inactive form (Quesada 2003), where this auto-regulation keeps FCA levels from autonomously overcoming flowering repression.
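The percentage changes quoted for the CCA1 orthologs are relative to the parental frequency, not differences in percentage points. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the fold-change helper simply mirrors the convention used in Figures 4.2 and 4.3, where values below 1 denote a decrease from the parental level.

```python
def relative_change(parental, polyploid):
    """Percent change in event frequency relative to the parental frequency."""
    if parental <= 0:
        raise ValueError("parental frequency must be positive")
    return (polyploid - parental) / parental * 100.0

def fold_change(parental, polyploid):
    """Polyploid frequency as a multiple of the parental frequency."""
    if parental <= 0:
        raise ValueError("parental frequency must be positive")
    return polyploid / parental

# Frequencies (% IR) quoted in the text for the two CCA1 orthologs.
cca1_a = relative_change(12.5, 16.6)   # Bra004503, intron 2, P3
cca1_c = relative_change(37.7, 27.1)   # Bo4g006930, intron 1, all polyploids

print(round(cca1_a, 1))                     # 32.8
print(round(cca1_c, 1))                     # -28.1
print(round(fold_change(12.5, 16.6), 3))    # 1.328
```

A change from 12.5% to 16.6% is therefore a 4.1 percentage point increase but a 32.8% relative increase, which is the quantity reported throughout this section.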
Bra038446, an ortholog of FCA, has a significant decrease in P3 in retention of the 11th intron from a parental average of 22.3% to 11.1%, a reduction of 52% of intron retention frequency. The circadian clock gene AtGRP7 is able to auto-regulate its own ratio of AS, and therefore expression levels, by means of a protein-transcript feedback mechanism (Staiger et al. 2003). AtGRP8 acts in much the same way as AtGRP7 (Schöning et al. 2008). Viewed in this light, some genes may not reach a critical protein to transcript stoichiometry to effectively self-regulate in the allopolyploids, i.e. what is measured as a loss of AS frequency is the result of some proteins' overall lower abundance in the more complex allopolyploid cell environment, which leaves them unable to sufficiently self-inhibit. Bra011869 (GRP8) has a significant loss of IR frequency of intron 1 in P1 and P3, where the parental frequency is 17.7% and P1 and P3 are 14.6% and 14.8%, a ~17% reduction in IR frequency. Its paralog, Bra010693, has a significant decrease of IR frequency of intron 1 in all three allopolyploids, with a parental rate of 39.7% retention, while the polyploids have rates of 30.7%, 28.2%, and 30.9%, or a ~25% decrease, as well as an increase of a 1-2 ALTD event in P3, where the parental frequency was 8.5% and in P3 it was 13.7%, or an increase of 39.7%. RESISTANCE TO PSEUDOMONAS SYRINGAE4 (RPS4) requires variants of the full reading frame as well as a truncated frame generated by AS to be fully functional (Zhang et al. 2003) and alters the frequency of AS in response to disease exposure (Zhang and Grassman 2007). The N gene in tobacco confers resistance to the tobacco mosaic virus, and similarly to RPS4, both of its AS forms are required for full resistance, yet the AS ratios are changed in response to infection. This pattern is again repeated in Medicago, where a sufficient amount of the alternative form of RCT1 is required to confer resistance against the pathogen (Tang et al. 2013).
A truncated form of HOS1 is upregulated vs the longer, normal form in Arabidopsis in response to cold, where the shift of ratio may be adaptive in response to cold (Lee et al. 2012). The presence or absence of a specific splice event has been known to influence phenotype, yet more plant specific cases of a quantitative change in ratio of splicing, rather than a qualitative change, are being found and highlight one mechanism of potential phenotypic consequences as well as variety that allopolyploidy can release. The large amount of alternative splicing bias between homeologs turning into situations where the event is detected at similar levels in the allopolyploids may be indicative of evolution which had happened in trans; once inside the same trans-environment, the splicing frequencies of the junctions are affected roughly equally. Preservation of parental bias is best explained by preexisting cis-elements continuing to drive a higher amount of AS, even in a new and chaotic transcriptome. Transgressive effects may explain the cases where biases flip. Changes in cis elements were implicated to account for most species-specific alternative splicing patterns in vertebrates (Barbosa-Morais et al. 2012), and in another study ~80% of non-exon-skipping events were attributed to divergent cis architecture in Drosophila (McManus et al. 2014). Many of these explanations may be oversimplified in the wake of the many complicating factors which are part of transcriptome shock; however, in each polyploid, there were roughly twice as many cases of biased frequencies keeping their respective parental biases (cis likely) than there were cases of non-biased pairs developing a new bias (trans likely) (χ2, df=1, p<0.0001, 24.3%-28.0% of non-biased pairs developed a bias, 50.3%-55.6% of A biased AT pairs retained bias, and 45.5%-51.2% of C biased CT pairs retained bias), once again suggesting the importance of cis-elements as the more common and possibly stronger mechanism in AS divergence.
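The categories used in this comparison (bias retained, lost, gained, or flipped) can be made explicit with a small classifier. This is only an illustrative sketch: the function name, the state labels, and the category names are hypothetical, and the counts are the pooled totals from Table 4.7 rather than a reanalysis of the underlying data.

```python
from collections import Counter

def classify_transition(parental, polyploid):
    """Classify how homeolog bias changed between the parents and a polyploid.
    Each state is 'A' (AT > CT), 'C' (AT < CT), or 'equal' (AT = CT)."""
    states = {"A", "C", "equal"}
    if parental not in states or polyploid not in states:
        raise ValueError("state must be 'A', 'C', or 'equal'")
    if parental == polyploid:
        return "unbiased_kept" if parental == "equal" else "bias_retained"
    if parental == "equal":
        return "bias_gained"      # interpreted in the text as trans likely
    if polyploid == "equal":
        return "bias_lost"        # divergence that had happened in trans
    return "bias_flipped"         # possibly transgressive effects

# Pooled pair counts over P1-P3 (Table 4.7): (parental, polyploid) -> count
pooled = {
    ("equal", "A"): 513, ("equal", "C"): 338, ("equal", "equal"): 2398,
    ("A", "A"): 732, ("A", "C"): 39, ("A", "equal"): 594,
    ("C", "A"): 78, ("C", "C"): 942, ("C", "equal"): 954,
}

tally = Counter()
for (parental, polyploid), n in pooled.items():
    tally[classify_transition(parental, polyploid)] += n

for category, n in tally.most_common():
    print(category, n)
```

Keeping the classification separate from the counting makes it straightforward to apply the same rule to each polyploid individually or to the natural B. napus.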
Zhou et al. 2011 investigated changes in AS upon allopolyploidization using targeted RT-PCR of events known to have AS based on EST sequencing, using the same polyploid system. Our results are complementary but highlight the strengths and weaknesses of the experimental methods used in the respective studies. RT-PCR results are binary, i.e. presence or absence of an event is absolute, whereas Illumina sequencing is able to elucidate subtler quantitative differences. Differentiating between extreme loss of frequency and true event loss is much more difficult, and this is compounded by a large and complex transcriptome. RT-PCR may entirely miss an event when the event level is below the detection threshold, whereas a few reads may support an event in an RNA-Seq data set, making the line between extreme quantitative and qualitative loss less absolute than in RT-PCR. In Zhou et al., two lines of resynthesized polyploids were surveyed, revealing 21 events (26%) and 24 events (30%) out of 81 which qualitatively changed from their parental state in resynthesized B. napus allopolyploids, most of which were losses of IR events. Seventeen of these changes (75.5%) occurred in parallel. In our study we found that 23% of AT subgenome IR events and 37% of CT subgenome IR events underwent negative quantitative change, with 69.3% and 80.6% of AT and CT changes being shared with one or more other polyploids. Putatively, the same underlying mechanisms are contributing to both quantitative and qualitative decreases and losses. Zhou et al. predicted changes would be more common in natural B. napus than in the resynthesized allotetraploids, but found the natural to have only marginally fewer changes; 16 of 80 events (20%) were changed, but this is not significantly less than either of the change amounts seen in their two resynthesized allotetraploid lines (χ2, df=1, p=0.3570, p=0.5995).
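The detection rule described in section 4.3.8 (at least five reads in every replicate of at least one genotype) is what separates a supported event from one that merely has a few reads. A minimal sketch of that rule follows; the function name, the dictionary layout, and the example read counts are hypothetical and are not taken from the actual pipeline.

```python
def event_detected(read_counts, min_reads=5):
    """Return True if at least one genotype has >= min_reads supporting reads
    in every one of its replicates.

    read_counts maps a genotype name to a list of per-replicate read counts.
    """
    return any(
        len(replicates) > 0 and all(r >= min_reads for r in replicates)
        for replicates in read_counts.values()
    )

# Hypothetical read counts for one event across two genotypes.
supported = {"parent": [7, 9, 6], "P1": [2, 0, 1]}
unsupported = {"parent": [7, 4, 6], "P1": [5, 5, 4]}

print(event_detected(supported))     # True: the parent passes in all replicates
print(event_detected(unsupported))   # False: no genotype passes in all replicates
```

In the second example the event has reads in every sample yet is rejected, which illustrates how a low-frequency event can look like a qualitative absence under a fixed threshold.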
We found the natural to have significantly more changes from the parental state than any of the resynthesized polyploids. When IR events are considered, the number of decreases in the natural is always within the range of the amount of change that the three polyploids experience in both the AT and CT subgenomes. In this aspect, the studies agree entirely: Zhou et al. did not find a significant difference between the number of losses in the natural and the resynthesized allotetraploids, nor did we find that the natural experienced more negative frequency changes than the resynthesized allopolyploids. We found that the most change was in positive frequency change in the natural compared to the resynthesized allotetraploids, where the natural has over double the amount of positive frequency change of any polyploid, in either the AT or CT subgenome. This change would not be detectable without doing quantitative or semi-quantitative RT-PCR, which would be very difficult in a complex system. Rather, this finding adds another level of complexity to allopolyploidy, where event loss or frequency decrease may happen quickly as a hybridization effect, but events that are retained then go through a phase of quantitative fine tuning to modulate and regulate the larger transcriptome; in watermelon autopolyploids, the predominant direction of AS change was found to be positive (Saminathan et al. 2015).

Table 4.1 Expression changes between progenitor B. rapa and B. oleracea and three resynthesized allopolyploids.
Shared changes are between one or more allopolyploid.

Expression Changes

AT subgenome
              Increases       Decreases       No change
Polyploid   Unique  Shared  Unique  Shared  Unique  Shared
P1             432    1127     282     734     195   25403
P2             330     745     390     696     636   25376
P3             600    1053     928     782     227   24583
Total         1362    2925    1600    2212    1058   75362
Combined            4287            3812           76420

CT subgenome
              Increases       Decreases       No change
Polyploid   Unique  Shared  Unique  Shared  Unique  Shared
P1             130     307     673     606     108   22386
P2             153     276     341     372     407   22661
P3             378     324     542     604     115   22247
Total          661     907    1556    1582     630   67294
Combined            1568            3138           67924

AT + CT
              Increases       Decreases       No change
Polyploid   Unique   Total  Unique   Total  Unique   Total
P1             562    1434     955    1340     303   47789
P2             483    1021     731    1068    1043   48037
P3             978    1377    1470    1386     342   46830
Total         2023    3832    3156    3794    1688  142656
Combined            5855            6950          144344

Table 4.2 Conserved change between all the resynthesized allopolyploids.

Conserved Change
Subgenome     Plus   Minus    Same
AT             511     518   23218
CT             179     262   21038

Table 4.3 Expression changes between progenitor B. rapa and B. oleracea and three resynthesized allopolyploids and natural B. napus. Shared changes are between one or more allopolyploid and/or B. napus.

Expression Changes with Natural

AT subgenome
              Increases       Decreases       No change
Polyploid   Unique  Shared  Unique  Shared  Unique  Shared
P1             206    1054     217     649     106   22675
P2             166     688     232     614     259   22948
P3             305    1052     657     809     112   21972
N             3873    1012    2971     808     210   16033
Total         4550    3806    4077    2880     687   83628
Combined            8356            6957           84315

CT subgenome
              Increases       Decreases       No change
Polyploid   Unique  Shared  Unique  Shared  Unique  Shared
P1              78     236     305     676      31   16220
P2              81     207     199     303     142   16614
P3             203     269     252     539      46   16237
N             1898     190    2779     695     110   11874
Total         2260     902    3535    2213     329   60945
Combined            3162            5748           61274

AT + CT
              Increases       Decreases       No change
Polyploid   Unique   Total  Unique   Total  Unique   Total
P1             284    1290     522    1325     137   38895
P2             247     895     431     917     401   39562
P3             508    1321     909    1348     158   38209
N             5771    1202    5750    1503     320   27907
Total         6810    4708    7612    5093    1016  144573
Combined           11518           12705          145589

Table 4.4 Conserved change between all the resynthesized allopolyploids and natural B. napus.

Conserved Change
Subgenome     Plus   Minus    Same
AT             252     287   14230
CT              42     100   10764

Table 4.5 Intron retention events changing frequency after allopolyploidy. Shared changes only need be shared by 2 or more resynthesized allopolyploids.

Intron Retention Events

AT subgenome
              Increases       Decreases       No change
Polyploid   Unique  Shared  Unique  Shared  Unique  Shared
P1             172     281     376    1252     357    7521
P2             406     319     274    1063     496    7401
P3             100     187    1011    1410     276    6975
Total          678     787    1661    3725    1129   21897
Combined            1465            5386           23026

CT subgenome
              Increases       Decreases       No change
Polyploid   Unique  Shared  Unique  Shared  Unique  Shared
P1             125     251     306    2160     460    5325
P2             274     278     251    2027     550    5247
P3              89     217    1032    2421     218    4650
Total          488     746    1589    6608    1228   15222
Combined            1234            8197           16450

AT + CT
              Increases       Decreases       No change
Polyploid   Unique   Total  Unique   Total  Unique   Total
P1             297     532     682    3412     817   12846
P2             680     597     525    3090    1046   12648
P3             189     404    2043    3831     494   11625
Total         1166    1533    3250   10333    2357   37119
Combined            2699           13583           39476

Table 4.6 Intron retention events changing frequency after allopolyploidy with natural. Shared changes only need be shared by 2 or more resynthesized allopolyploids.

Intron Retention Events

AT subgenome
              Increases       Decreases       No change
Polyploid   Unique  Shared  Unique  Shared  Unique  Shared
P1              30      99      48     374      50    2019
P2              60     151      43     324      75    1967
P3              13      52     181     489      52    1833
N              423     140     303     342      75    1337
Total          526     442     575    1529     252    7156
Combined             968            2104            7408

CT subgenome
              Increases       Decreases       No change
Polyploid   Unique  Shared  Unique  Shared  Unique  Shared
P1              16      99      49     599      61    1463
P2              44     129      38     545      77    1454
P3              15      79     205     720      43    1225
N              257     131     166     517     106    1110
Total          332     438     458    2381     287    5252
Combined             770            2839            5539

AT + CT
              Increases       Decreases       No change
Polyploid   Unique   Total  Unique   Total  Unique   Total
P1              46     198      97     973     111    3482
P2             104     280      81     869     152    3421
P3              28     131     386    1209      95    3058
N              680     271     469     859     181    2447
Total          858     880    1033    3910     539   12408
Combined            1738            4943           12947

Table 4.7 Homeolog analysis. Shared changes only need be shared by 2 or more resynthesized allopolyploids.

Homeolog Pairs which were Equal in the parental species
                AT>CT           AT<CT           AT=CT
Polyploid   Unique  Shared  Unique  Shared  Unique  Shared
P1              48     113      38      65      56     763
P2              63     119      45      77      38     741
P3              50     120      43      70      44     756
Total          161     352     126     212     138    2260
Combined             513             338            2398
Fraction       0.1578947        0.104032

Homeolog Pairs which were A biased in the parental species
                AT>CT           AT<CT           AT=CT
Polyploid   Unique  Shared  Unique  Shared  Unique  Shared
P1              17     212       0       8      44     174
P2              26     227       6      11      26     159
P3              27     223       3      11      30     161
Total           70     662       9      30     100     494
Combined             732              39             594

Homeolog Pairs which were C biased in the parental species
                AT>CT           AT<CT           AT=CT
Polyploid   Unique  Shared  Unique  Shared  Unique  Shared
P1               8      12      30     270      49     289
P2              17      16      54     283      30     258
P3              13      12      38     267      49     279
Total           38      40     122     820     128     826
Combined              78             942             954

Figure 4.1 Examples of modeled events
The frequency of the event in polyploids is unchanged from the parent (grey), have the event at a significantly higher frequency than the parent (blue) or a significantly reduced frequency (red) (p < 0.05). Bo8g021910 is an ortholog of ALDEHYDE DEHYDROGENASE 4 (Arabidopsis thaliana). Bra033886 is an ortholog of AUXIN RESISTANT 2 (Arabidopsis thaliana).
[Figure 4.1 plot. Panels: Bo8g021910 1-2 IR Event (AT1G44170: ALDH4); Bra033886 2-3 IR Event (AT3G23050: AXR2). Y-axis: Percentage IR. Groups: Parental, Polyploid 3, Polyploid 2, Polyploid 1.]

Figure 4.2 Density of AS fold change, synthetics
The density of the AS fold changes of each polyploid/event type have been plotted on the left two columns, and the combined density of all three polyploids AT genome and CT genome have been plotted against each other in the right column. Below one designates a negative change in frequency from the parental levels of the event (0 to 1 fold of the parental value), whereas above one ( > 1 fold) indicates a positive change.
[Figure 4.2 plot. Panels by row: IR, ALTD, ALTA, and ALTP events; by column: AT events, CT events, AT vs CT events. X-axis: fold change (0 to 3). Y-axis: Density of AS Fold Change. Legend: Polyploid 1, Polyploid 2, Polyploid 3, Oleracea, Rapa.]

Figure 4.3 Density of AS fold change, natural
The density of the AS event fold changes of all three synthetic polyploids and natural B. napus for IR events are plotted in the left two columns. The far right column is the combination of both AT and CT genomes for IR events, with a composite of all the synthetics compared to the natural, which shows less negative change in IR event frequencies.
[Figure 4.3 plot. Panels: AT IR events; CT IR events; Synthetic vs. Natural. X-axis: fold change (0 to 3). Y-axis: Density of AS Fold Change. Legend: P1, P2, P3, Natural, Synthetic.]

Figure 4.4 Results of RT-PCR event confirmations. Negative (RT-) are not shown.
A) Primer 1, Bo8g045830, Junction 1-2, Constitutive, ALTD(+55bp), IR(+132bp)
B) Primer 2, Bra008524, Junction 1-2, Constitutive, ALTA(+35bp), IR (+109bp)
C) Primer 13, Bo5g127250, Junction 1-3, Skips Exon 2 (-69bp), includes exon 2
D) Primer 17, Bra005988, Junction 14-15, Constitutive, ALTD(-43bp), IR(+82)
[Figure 4.4 gel images: panels A to D; lanes Bo, Br, P1, P2, P3.]

Chapter 5: Conclusion

5.1 Summary

I have examined the evolution of alternative splicing and polyploidy on three distinct time scales: the paleopolyploid system of Arabidopsis thaliana (~23 mya); the evolutionarily recent allopolyploid Brassica napus (~7,500 y), using both the genome project and my own transcriptome analysis; and newly resynthesized allopolyploid Brassica napus (5 generations). At each evolutionary time scale, alternative splicing has undergone considerable change: Arabidopsis represents the most advanced and refined stage of a post-polyploidy scenario, and the newly resynthesized B. napus represents the least. From these systems we can begin to see general trends in the evolution of alternative splicing through the iterative expansions and subsequent diploidization that are characteristic of the flowering plant lineage. Following a polyploidy event, the frequency of alternative splicing is perturbed by genome and transcriptome shock, and the characterized gene expression and methylation changes accompanying polyploidization are reflected in changes in alternative splicing. These changes are erratic and not entirely repeatable, much like the other genomic, transcriptomic, and phenotypic changes (Gaeta et al. 2007), which either interact with or cause the alternative splicing changes. From our limited observations, it appears that selection sets in quite rapidly with regard to splicing and expression, as natural B. napus is quite distinguishable from resynthesized allopolyploid B. napus after only ~7,500 years. However, this has caveats: the polyploid has to be favored by selection, and we may not have seen the full range of potential transcriptional change in only three allopolyploids. 
As diploidization further sets in, splicing patterns of genes may more fully diverge and sub- or neofunctionalize, as we have observed in the case of CCA1/LHY, although we have seen that this is possible immediately after allopolyploidization (Zhou et al. 2011). At each time scale, alternative splicing is changing, be it quick, erratic change or a more gradual refining of gene expression and function. Many of these changes may be underpinned and driven by changes in cis-elements, as both duplicates inhabit the same trans-environment yet have diverged in their expression or splicing. In Arabidopsis, a large part of the splicing divergence we saw was attributed to cis divergence between the paralogs, which is also supported by studies in other systems (Barbosa-Morais et al. 2012, McManus et al. 2014). This is furthered by our study in B. napus, as events that were biased in the parental species tended to retain this bias in new hybrid allopolyploids and in natural B. napus. The best explanation is that their cis-elements are strong enough to reproduce the parental splicing patterns even in a new trans-environment. Thus the evolution of alternative splicing is largely dependent on changes to the cis-regulation that directs transcript processing, allowing new ways to evolve different splice forms, as well as new and different regulatory circuits and networks via AS-nonsense mediated decay.

Polyploidy itself remains only partially understood; the repeated iterations of auto- and allopolyploidization followed by diploidization over time are intriguing yet difficult to understand. All we can study in the immediate present are largely static and unchanging genomes. 
However chaotic polyploidy is, the ability of genes to escape adaptive constraint via duplication seems to be exceptionally important; duplication may break the stagnation of a diploid genome and allow evolution to be creative and fashion new functions, or repurpose old functions with the protean material that duplicated genes represent. Much time has been devoted to the evolution of duplicate genes, particularly where they diverge in protein function or expression patterns. Only recently has alternative splicing come to light as both a key biological process with far-reaching implications and an approachable topic on a genome-wide scale. While surely more interesting facets are yet to be discovered, I have completed some of the first large-scale evolutionary studies of alternative splicing in plants.

5.2 Future Directions

One aspect that would be exceptionally engaging to follow up on is a more detailed investigation of the SR proteins. The SR proteins are one of the major families of proteins that control and regulate alternative splicing, yet are highly alternatively spliced themselves. I would be very curious whether one could recapitulate the allopolyploidy syndrome of extreme loss of IR event frequency by selective genetic tinkering, either by upregulating the expression of SR proteins or by excising NMD-related introns, perhaps in Arabidopsis. While the large amount of data provided by Illumina studies is excellent for forming hypotheses and testing genome-wide trends, it cannot elucidate a causal mechanism for the patterns we see in resynthesized allopolyploids. I would want to test whether the changes to splicing are explainable by changes in SR protein levels alone, or whether a higher-order mechanism is in play during allopolyploidization. 
If one can recreate a similar pattern of change with an alteration of SR proteins alone, it may suggest that one major barrier to successful allopolyploid formation is a harmonization of the SR proteins, where part of transcriptome shock is having two or more sets of SR genes overload the transcriptome. I find this hypothesis intriguing, as it could explain a lot in terms of hybridization compatibility as well. It would even be interesting to alter the SR proteins of a newly resynthesized allopolyploid to see if parental patterns could be restored.

Another avenue of follow-up would be to include far more allopolyploids, but with much shallower sequencing. The goal of such a study would be not to gain a sense of the depth of the changes that can accompany allopolyploidization, but to capture more of the variability in allopolyploidization. With only three resynthesized allopolyploids, I feel that we are myopic in trying to see all of the potential changes that could be taking place, but because of data constraints one would have to be either extremely well funded or simply devote less sequencing per sample, if one wanted to sequence tens or hundreds of individual polyploids.

Finally, the last aspect that would be illuminating to pursue is an investigation into the cis-architecture underlying some of the splicing divergence between duplicated genes. Either as a series of case studies or perhaps using DNase-seq, investigating the exact cis-changes between duplicated genes with diverged alternative splicing could improve our understanding of how, and on what time scales, splicing divergence occurs. A detailed look into CCA1/LHY would be fascinating, as LHY has retained the IR event, but only significantly expresses it in response to cold, the inverse pattern of CCA1. In short, what is the genetic basis of some of the evolved changes between duplicates?

Bibliography

Adams KL, Cronn R, Percifield R, Wendel JF. 2003. 
Genes duplicated by polyploidy show unequal contributions to the transcriptome and organ-specific reciprocal silencing. Proc. Natl. Acad. Sci. U. S. A. 100:4649-4654.
Adams KL, Wendel JF. 2005. Polyploidy and genome evolution in plants. Curr. Opin. Plant Biol. 8:135-141.
Ainouche ML, Baumel A, Salmon A. 2004. Spartina anglica CE Hubbard: a natural model system for analysing early evolutionary changes that affect allopolyploid genomes. Biol. J. Linn. Soc. 82:475-484.
Altschul SF, Madden TL, Schaffer AA, Zhang JH, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
Anders S, McCarthy DJ, Chen Y, Okoniewski M, Smyth GK, Huber W, Robinson MD. 2013. Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nat. Protoc. 8:1765-1786.
Arrigo N, Barker MS. 2012. Rarely successful polyploids and their legacy in plant genomes. Curr. Opin. Plant Biol. 15:140-146.
Ast G. 2004. How did alternative splicing evolve? Nat. Rev. Genet. 5:773-782.
Barbosa-Morais NL, Irimia M, Pan Q, Xiong HY, Gueroussov S, Lee LJ, Slobodeniuc V, Kutter C, Watt S, Çolak R et al. (17 co-authors). 2012. The evolutionary landscape of alternative splicing in vertebrate species. Science 338:1587-1593.
Black DL. 2003. Mechanisms of alternative pre-messenger RNA splicing. Annu. Rev. Biochem. 72:291-336.
Blanc G, Hokamp K, Wolfe KH. 2003. A recent polyploidy superimposed on older large-scale duplications in the Arabidopsis genome. Genome Res. 13:137-144.
Blanc G, Wolfe KH. 2004a. Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell 16:1667-1678.
Blanc G, Wolfe KH. 2004b. Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell 16:1679-1691.
Boise LH, Gonzalez-Garcia M, Postema CE, Ding L, Lindsten T, Turka LA, Mao X, Nunez G, Thompson CB. 1993. bcl-x, a bcl-2-related gene that functions as a dominant regulator of apoptotic cell death. Cell 74:597-608.
Boue S, Letunic I, Bork P. 2003. Alternative splicing and evolution. Bioessays 25:1031-1034.
Bowers JE, Chapman BA, Rong J, Paterson AH. 2003. Unraveling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422:433-438.
Brummell DA, Chen RK, Harris JC, Zhang H, Hamiaux C, Kralicek AV, McKenzie MJ. 2011. Induction of vacuolar invertase inhibitor mRNA in potato tubers contributes to cold-induced sweetening resistance and includes spliced hybrid mRNA variants. J. Exp. Bot. 62:3519-3534.
Buggs RJ, Wendel JF, Doyle JJ, Soltis DE, Soltis PS, Coate JE. 2014. The legacy of diploid progenitors in allopolyploid gene expression patterns. Phil. Trans. R. Soc. B 369(1648), 20130354.
Buggs RJ, Miles N, Tate JA, Gao L, Wei W, Schnable PS, Barbazuk WB, Soltis PS, Soltis DE. 2011. Transcriptome shock generates evolutionary novelty in a newly formed, natural allopolyploid plant. Curr. Biol. 21:551-56.
Busch A, Hertel KJ. 2012. Evolution of SR protein and hnRNP splicing regulatory factors. Wiley Interdiscip. Rev. RNA 3:1-12.
Casneuf T, De Bodt S, Raes J, Maere S, Van de Peer Y. 2006. Nonrandom divergence of gene expression following gene and genome duplications in the flowering plant Arabidopsis thaliana. Genome Biol. 7:R13.
Chalhoub B, Denoeud F, Liu S, Parkin IA, Tang H, Wang X, Chiquet J, Belcram H, Tong C, Samans B et al. (82 co-authors). 2014. Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome. Science 345:950-953.
Chaudhary B, Flagel LE, Stupar RM, Udall JA, Verma N, Springer NM, Wendel J. 2009. Reciprocal silencing, transcriptional bias and functional divergence of homoeologs in polyploid cotton (Gossypium). Genetics 182:503-517.
Chen ZJ, Ni Z. 2006. Mechanisms of genomic rearrangements and gene expression changes in plant polyploids. Bioessays 28:240-252.
Chen M, Manley JL. 2009. Mechanisms of alternative splicing regulation: Insights from molecular and genomics approaches. Nat. Rev. Mol. Cell Biol. 10:741-754.
Cheng F, Wu J, Fang L, Sun S, Liu B, Lin K, Bonnema G, Wang X. 2012. Biased gene fractionation and dominant gene expression among the subgenomes of Brassica rapa. PLoS One 7:e36442.
Chung HS, Cook TF, DePew CL, Patel LC, Ogawa N, Kobayashi Y, Howe GA. 2010. Alternative splicing expands the repertoire of dominant JAZ repressors of jasmonate signaling. Plant J. 63:613-622.
Conant GC, Wolfe KH. 2008. Turning a hobby into a job: how duplicated genes find new functions. Nat. Rev. Genet. 9:938-950.
Comai L. 2005. The advantages and disadvantages of being polyploid. Nat. Rev. Genet. 6:836-846.
Coolon JD, McManus CJ, Stevenson KR, Graveley BR, Wittkopp PJ. 2014. Tempo and mode of regulatory evolution in Drosophila. Genome Res. 24:797-808.
Cui L, Wall PK, Leebens-Mack JH, Lindsay BG, Soltis DE, Doyle JJ, Soltis PS, Carlson JE, Arumuganathan K, Barakat A et al. (13 co-authors). 2006. Widespread genome duplications throughout the history of flowering plants. Genome Res. 16:738-749.
Cusack BP, Wolfe KH. 2007. When gene marriages don't work out: divorce by subfunctionalization. Trends Genet. 23:270-272.
Darracq A, Adams KL. 2013. Features of evolutionarily conserved alternative splicing events between Brassica and Arabidopsis. New Phytol. 199:252-263.
Dammann C, Ichida A, Hong B, Romanowsky SM, Hrabak EM, Harmon AC, Pickard BG, Harper JF. 2003. Subcellular targeting of nine calcium-dependent protein kinase isoforms from Arabidopsis. Plant Physiol. 132:1840-1848.
De Smet R, Adams KL, Vandepoele K, Van Montagu MC, Maere S, Van de Peer Y. 2013. Convergent gene loss following gene and genome duplications creates single-copy families in flowering plants. Proc. Natl. Acad. Sci. U. S. A. 110:2898-2903.
Dinesh-Kumar SP, Baker BJ. 2000. Alternatively spliced N resistance gene transcripts: their possible role in tobacco mosaic virus resistance. Proc. Natl. Acad. Sci. U. S. A. 97:1908-1913.
Dong MA, Farré EM, Thomashow MF. 2011. Circadian clock-associated 1 and late elongated hypocotyl regulate expression of the C-repeat binding factor (CBF) pathway in Arabidopsis. Proc. Natl. Acad. Sci. U. S. A. 108:7241-7246.
Doyle JJ, Flagel LE, Paterson AH, Rapp RA, Soltis DE, Soltis PS, Wendel JF. 2008. Evolutionary genetics of genome merger and doubling in plants. Annu. Rev. Genet. 42:443-461.
Drechsel G, Kahles A, Kesarwani AK, Stauffer E, Behr J, Drewe P, Rätsch G, Wachter A. 2013. Nonsense-mediated decay of alternative precursor mRNA splicing variants is a major determinant of the Arabidopsis steady state transcriptome. Plant Cell 10:3726-3742.
Duvick J, Fu A, Muppirala U, Sabharwal M, Wilkerson MD, Lawrence CJ, Lushbough C, Brendel V. 2008. PlantGDB: a resource for comparative plant genomics. Nucleic Acids Res. 36:959-965.
Eden E, Lipson D, Yogev S, Yakhini Z. 2007. Discovering motifs in ranked lists of DNA sequences. PLoS Comput. Biol. 3:e39.
Eden E, Navon R, Steinfeld I, Lipson D, Yakhini Z. 2009. GOrilla: A tool for discovery and visualization of enriched GO terms in ranked gene lists. BMC Bioinformatics 10:48.
Edgar R, Domrachev M, Lash AE. 2002. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 30:207-210.
English A, Patel K, Loraine A. 2010. Prevalence of alternative splicing choices in Arabidopsis thaliana. BMC Plant Biol. 10:102.
Fawcett JA, Maere S, Van de Peer Y. 2009. Plants with double genomes might have had a better chance to survive the Cretaceous–Tertiary extinction event. Proc. Natl. Acad. Sci. U. S. A. 106:5737-5742.
Filichkin SA, Priest HD, Givan SA, Shen R, Bryant DW, Fox SE, Wong WK, Mockler TC. 2010. Genome-wide mapping of alternative splicing in Arabidopsis thaliana. Genome Res. 20:45-58.
Filichkin SA, Mockler TC. 2012. Unproductive alternative splicing and nonsense mRNAs: a widespread phenomenon among plant circadian clock genes. Biol. Direct 7:20.
Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J. 1999. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531-1545.
Fujiwara S, Oda A, Yoshida R, Niinuma K, Miyata K, Tomozoe Y, Tajima T, Nakagawa M, Hayashi K, Coupland G et al. (11 co-authors). 2008. Circadian clock proteins LHY and CCA1 regulate SVP protein accumulation to control flowering in Arabidopsis. Plant Cell 20:2960-2971.
Gaeta RT, Pires JC, Iniguez-Luy F, Leon E, Osborn TC. 2007. Genomic changes in resynthesized Brassica napus and their effect on gene expression and phenotype. Plant Cell 19:3403-3417.
Gan X, Stegle O, Behr J, Steffen JG, Drewe P, Hildebrand KL, Lyngsoe R, Schultheiss SJ, Osborne EJ, Sreedharan VT et al. (25 co-authors). 2011. Multiple reference genomes and transcriptomes for Arabidopsis thaliana. Nature 477:419-423.
Graveley BR. 2001. Alternative splicing: increasing diversity in the proteomic world. Trends Genet. 17:100-107.
Han MV, Demuth JP, McGrath CL, Casola C, Hahn MW. 2009. Adaptive evolution of young gene duplicates in mammals. Genome Res. 5:859-67.
Hegarty MJ, Hiscock SJ. 2008. Genomic clues to the evolutionary success of polyploid plants. Curr. Biol. 18:R435-R444.
Higgins J, Magusin A, Trick M, Fraser F, Bancroft I. 2012. Use of mRNA-seq to discriminate contributions to the transcriptome from the constituent genomes of the polyploid crop species Brassica napus. BMC Genomics 13:247.
Hu G, Yoo MJ, Chen S, Wendel JF. 2015. Gene-expression novelty in allopolyploid cotton: a proteomic perspective. Genetics 200:91-104.
Huang DW, Sherman BT, Lempicki RA. 2008. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protoc. 4:44-57.
Huang DW, Sherman BT, Lempicki RA. 2009. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37:1-13.
Huelga SC, Vu AQ, Arnold JD, Liang TY, Liu PP, Yan BY, Donohue JP, Shiue L, Hoon S, Brenner S et al. (12 co-authors). 2012. Integrative genome-wide analysis reveals cooperative regulation of alternative splicing by hnRNP proteins. Cell Rep. 1:167-178.
James AB, Syed NH, Bordage S, Marshall J, Nimmo GA, Jenkins GI, Herzyk P, Brown J, Nimmo HG. 2012. Alternative splicing mediates responses of the Arabidopsis circadian clock to temperature changes. Plant Cell 24:961-981.
Jiao Y, Wickett NJ, Ayyampalayam S, Chanderbali AS, Landherr L, Ralph PE, Tomsho LP, Hu Y, Liang H, Soltis PS et al. (17 co-authors). 2011. Ancestral polyploidy in seed plants and angiosperms. Nature 473:97-100.
Kalyna M, Simpson CG, Syed NH, Lewandowska D, Marquez Y, Kusenda B, Marshall J, Fuller J, Cardle L, McNicol J et al. (13 co-authors). 2011. Alternative splicing and nonsense-mediated decay modulate expression of important regulatory genes in Arabidopsis. Nucleic Acids Res. 40:2454-2469.
Kasahara M. 2007. The 2R hypothesis: an update. Curr. Opin. Immunol. 19:547-552.
Kenan-Eichler M, Leshkowitz D, Tal L, Noor E, Melamed-Bessudo C, Feldman M, Levy AA. 2011. Wheat hybridization and polyploidization results in deregulation of small RNAs. Genetics 188:263-272.
Kissen R, Hyldbakk E, Wang CW, Sørmo CG, Rossiter JT, Bones AM. 2012. Ecotype dependent expression and alternative splicing of epithiospecifier protein (ESP) in Arabidopsis thaliana. Plant Mol. Biol. 78:361-375.
Kopelman NM, Lancet D, Yanai I. 2005. Alternative splicing and gene duplication are inversely correlated evolutionary mechanisms. Nat. Genet. 37:588-589.
Kriechbaumer V, Wang P, Hawes C, Abell B. 2012. Alternative splicing of the auxin biosynthesis gene YUCCA4 determines its subcellular compartmentation. Plant J. 70:292-302.
Lee J, Zhou J, Zheng X, Cho S, Moon H, Jen Loh T, Jo K, Shen H. 2012. Identification of a novel cis-element that regulates alternative splicing of Bcl-x pre-mRNA. Biochem. Bioph. Res. Co. 420:467-472.
Lister JA, Close J, Raible DW. 2001. Duplicate mitf genes in zebrafish: complementary expression and conservation of melanogenic potential. Dev. Biol. 237:333-344.
Liu SL, Adams KL. 2010. Dramatic change in function and expression pattern of a gene duplicated by polyploidy created a paternal effect gene in the Brassicaceae. Mol. Biol. Evol. 27:2817-2828.
Liu S, Baute GJ, Adams KL. 2011. Organ and cell type-specific complementary expression patterns and regulatory neofunctionalization between duplicated genes in Arabidopsis thaliana. Genome Biol. Evol. 3:1419-1436.
Love MI, Huber W, Anders S. 2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15:550.
Lu T, Lu G, Fan D, Zhu C, Li W, Zhao Q, Feng Q, Zhao Y, Guo Y, Li W et al. (12 co-authors). 2010. Function annotation of rice transcriptome at single nucleotide resolution by RNA-seq. Genome Res. 20:1238-1249.
Marshall AN, Montealegre MC, Jimenez-Lopez C, Lorenz MC, van Hoof A. 2013. Alternative splicing and subfunctionalization generates functional diversity in fungal proteomes. PLoS Genet. 9:e1003376.
Marquez Y, Brown JW, Simpson C, Barta A, Kalyna M. 2012. Transcriptome survey reveals increased complexity of the alternative splicing landscape in Arabidopsis. Genome Res. 22:1184-1195.
Mastrangelo AM, Marone D, Laidò G, De Leonardis AM, De Vita P. 2012. Alternative splicing: enhancing ability to cope with stress via transcriptome plasticity. Plant Science 185:40-49.
Mayrose I, Zhan SH, Rothfels CJ, Magnuson-Ford K, Barker MS, Rieseberg LH, Otto SP. 2011. Recently formed polyploid plants diversify at lower rates. Science 333:1257-1257.
McManus CJ, Coolon JD, Duff MO, Eipper-Mains J, Graveley BR, Wittkopp PJ. 2010. Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res. 20:816-825.
McManus CJ, Coolon JD, Eipper-Mains J, Wittkopp PJ, Graveley BR. 2014. Evolution of splicing regulatory networks in Drosophila. Genome Res. 24:786-796.
Maor GL, Yearim A, Ast G. 2015. The alternative role of DNA methylation in splicing regulation. Trends Genet. 31:274-280.
Nilsen TW, Graveley BR. 2010. Expansion of the eukaryotic proteome by alternative splicing. Nature 463:457-463.
Otto SP, Whitton J. 2000. Polyploid incidence and evolution. Annu. Rev. Genet. 34:401-437.
Palusa SG, Ali GS, Reddy AS. 2007. Alternative splicing of pre-mRNAs of Arabidopsis serine/arginine-rich proteins: regulation by hormones and stresses. Plant J. 49:1091-1107.
Palusa SG, Reddy AS. 2010. Extensive coupling of alternative splicing of pre-mRNAs of serine/arginine (SR) genes with nonsense-mediated decay. New Phytol. 185:83-89.
Parkin IA, Koh C, Tang H, Robinson SJ, Kagale S, Clarke WE, Town CD, Nixon J, Krishnakumar V, Bidwell SL et al. (31 co-authors). 2014. Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea. Genome Biol. 15:R77.
Paula D. 2011. A role for SR proteins in plant stress responses. Plant Signal. Behav. 6:49-54.
Proost S, Van Bel M, Sterck L, Billiau K, Van Parys T, Van de Peer Y, Vandepoele K. 2009. PLAZA: a comparative genomics resource to study gene and genome evolution in plants. Plant Cell 21:3718-3731.
Quesada V, Macknight R, Dean C, Simpson GG. 2003. Autoregulation of FCA pre-mRNA processing controls Arabidopsis flowering time. EMBO J. 22:3142-3152.
R Development CT. 2011. R: A Language and Environment for Statistical Computing, Reference Index Version 3.2.2.
Reddy SN, Marquez Y, Kalyna M, Barta A. 2013. Complexity of the alternative splicing landscape in plants. Plant Cell 25:3657-3683.
Renny-Byfield S, Gallagher JP, Grover CE, Szadkowski E, Page JT, Udall JA, Wang X, Paterson AH, Wendel JF. 2014. Ancient gene duplicates in Gossypium (Cotton) exhibit near-complete expression divergence. Genome Biol. Evol. 6:559-571.
Rieseberg LH, Raymond O, Rosenthal DM, Lai Z, Livingstone K, Nakazato T, Durphy JL, Schwarzbach AE, Donovan LA, Lexer C. 2003. Major ecological transitions in wild sunflowers facilitated by hybridization. Science 301:1211-1216.
Rieseberg LH, Willis JH. 2007. Plant speciation. Science 317:910-914.
Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26:139-140.
Roulin A, Auer PL, Libault M, Schlueter J, Farmer A, May G, Stacey G, Doerge RW, Jackson SA. 2013. The fate of duplicated genes in a polyploid plant genome. Plant J. 73:143-153.
Roux J, Robinson-Rechavi M. 2011. Age-dependent gain of alternative splice forms and biased duplication explain the relation between splicing and duplication. Genome Res. 21:357-363.
Salmon A, Ainouche ML, Wendel JF. 2005. Genetic and epigenetic consequences of recent hybridization and polyploidy in Spartina (Poaceae). Mol. Ecol. 14:1163-1175.
Saminathan T, Nimmakayala P, Manohar S, Malkaram S, Almeida A, Cantrel R, Tomason Y, Abburi L, Rahman MA, Vajja VG et al. (16 co-authors). 2015. Differential gene expression and alternative splicing between diploid and tetraploid watermelon. J. Exp. Bot. 66:1369-1385.
Sanchez SE, Petrillo E, Beckwith EJ, Zhang X, Rugnone ML, Hernando CE, Cuevas JC, Godoy Herz MA, Depetris-Chauvin A, Simpson CG et al. (17 co-authors). 2010. A methyl transferase links the circadian clock to the regulation of alternative splicing. Nature 468:112-116.
Schnable JC, Springer NM, Freeling M. 2011. Differentiation of the maize subgenomes by genome dominance and both ancient and ongoing gene loss. Proc. Natl. Acad. Sci. U. S. A. 108:4069-4074.
Schöning JC, Streitner C, Meyer IM, Gao Y, Staiger D. 2008. Reciprocal regulation of glycine-rich RNA-binding proteins via an interlocked feedback loop coupling alternative splicing to nonsense-mediated decay in Arabidopsis. Nucleic Acids Res. 36:6977-6987.
Sémon M, Wolfe KH. 2007. Consequences of genome duplication. Curr. Opin. Genet. Dev. 17:505-512.
Seo PJ, Park M, Lim M, Kim S, Lee M, Baldwin IT, Park C. 2012. A self-regulatory circuit of CIRCADIAN CLOCK-ASSOCIATED1 underlies the circadian clock regulation of temperature responses in Arabidopsis. Plant Cell 24:2427-2442.
Seoighe C, Gehring C. 2004. Genome duplication led to highly selective expansion of the Arabidopsis thaliana proteome. Trends Genet. 20:461-464.
Smith JE, Baker KE. 2015. Nonsense-mediated RNA decay: a switch and dial for regulating gene expression. Bioessays. Mar 27. doi: 10.1002/bies.201500007
Soltis PS, Liu X, Marchant DB, Visger CJ, Soltis DE. 2014. Polyploidy and novelty: Gottlieb's legacy. Phil. Trans. R. Soc. B 369(1648), 20130351.
Soltis DE, Visger CJ, Soltis PS. 2014. The polyploidy revolution then… and now: Stebbins revisited. Am. J. Bot. 101:1057-1078.
Soltis DE, Albert VA, Leebens-Mack J, Bell CD, Paterson AH, Zheng C, Sankoff D, dePamphilis CW, Wall PK, Soltis PS. 2009. Polyploidy and angiosperm diversification. Am. J. Bot. 96:336-348.
Sorek R, Shamir R, Ast G. 2004. How prevalent is functional alternative splicing in the human genome? Trends Genet. 20:68-71.
Staiger D, Zecca L, Kirk D, Apel K, Eckstein L. 2003. The circadian clock regulated RNA-binding protein AtGRP7 autoregulates its expression by influencing alternative splicing of its own pre-mRNA. Plant J. 33:361-371.
Staiger D, Brown JW. 2013. Alternative splicing at the intersection of biological timing, development, and stress responses. Plant Cell 25:3640-3656.
Stamatakis A. 2006. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688-2690.
Su Z, Wang J, Yu J, Huang X, Gu X. 2006. Evolution of alternative splicing after gene duplication. Genome Res. 16:182-189.
Syed NH, Kalyna M, Marquez Y, Barta A, Brown JW. 2012. Alternative splicing in plants coming of age. Trends Plant Sci. 17:616-623.
Tack DC, Pitchers WR, Adams KL. 2014. Transcriptome analysis indicates considerable divergence in alternative splicing between duplicated genes in Arabidopsis thaliana. Genetics 198:1473-1481.
Tang F, Yang S, Gao M, Zhu H. 2013. Alternative splicing is required for RCT1-mediated disease resistance in Medicago truncatula. Plant Mol. Biol. 82:367-374.
te Beest M, Le Roux JJ, Richardson DM, Brysting AK, Suda J, Kubesová, Pysek P. 2011. The more the better? The role of polyploidy in facilitating plant invasions. Ann. Bot. 109:19-45.
Tinti M, Johnson C, Toth R, Ferrier DE, MacKintosh C. 2012. Evolution of signal multiplexing by 14-3-3-binding 2R-ohnologue protein families in the vertebrates. Open Biology 2:120103.
Vanneste K, Maere S, Van De Peer Y. 2014. Tangled up in two: a burst of genome duplications at the end of the Cretaceous and the consequences for plant evolution. Philos. Trans. R. Soc. Lond. B Biol. Sci. 369:20130353.
Venables JP, Tazi J, Juge F. 2012. Regulated functional alternative splicing in Drosophila. Nucleic Acids Res. 40:1-10.
Wang BB, Brendel V. 2006. Genome wide comparative analysis of alternative splicing in plants. Proc. Natl. Acad. Sci. U. S. A. 103:7175-7180.
Wang X, Wang H, Wang J, Sun R, Wu J, Liu S, Bai Y, Mun JH, Bancroft I, Cheng F et al. (108 co-authors). 2011. The genome of the mesopolyploid crop species Brassica rapa. Nature Genetics 43:1035-1039.
Whitham S, Dinesh-Kumar SP, Choi D, Hehl R, Corr C, Baker B. 1994. The product of the tobacco mosaic virus resistance gene N: similarity to toll and the interleukin-1 receptor. Cell 78:1101-1115.
Wood TE, Takebayashi N, Barker MS, Mayrose I, Greenspoon PB, Rieseberg LH. 2009. The frequency of polyploid speciation in vascular plants. Proc. Natl. Acad. Sci. U. S. A. 106:13875-13879.
Wu TD, Watanabe CK. 2005. Gmap: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 21:1859-1875.
Yang S, Tang F, Zhu H. 2014. Alternative splicing in plant immunity. Int. J. Mol. Sci. 15:10424-10445.
Yoo MJ, Szadkowski E, Wendel JF. 2013. Homoeolog expression bias and expression level dominance in allopolyploid cotton. Heredity 110:171-180.
Yoo MJ, Liu X, Pires JC, Soltis PS, Soltis DE. 2014. Nonadditive gene expression in polyploids. Annu. Rev. Genet. 48:485-517.
Zhang XC, Gassmann W. 2003. RPS4-mediated disease resistance requires the combined presence of RPS4 transcripts with full-length and truncated open reading frames. Plant Cell 15:2333-2342.
Zhang XC, Gassmann W. 2007. Alternative splicing and mRNA levels of the disease resistance gene RPS4 are induced during defense responses. Plant Physiol. 145:1577-1587.
Zhang XN, Mount SM. 2009. Two alternatively spliced isoforms of the Arabidopsis SR45 protein have distinct roles during normal plant development. Plant Physiol. 150:1450-1458.
Zhang PG, Huang SZ, Pin A, Adams KL. 2010. Extensive divergence in alternative splicing patterns after gene and genome duplication during the evolutionary history of Arabidopsis. Mol. Biol. Evol. 27:1686-1697.
Zhou R, Moshgabadi N, Adams KL. 2011. Extensive changes to alternative splicing patterns following allopolyploidy in natural and resynthesized polyploids. Proc. Natl. Acad. Sci. U. S. A. 
108:16122-16127.

Appendices

Appendix A Supplementary tables and figures of chapter 3

Table A.1: Counts of conserved and divergent alternative splicing events using a total of 3 observations as a threshold.

Minimum number of reads - 3

Qualitative Conservation of Alternative Splicing Events
Event | Class | Conserved | Divergent | % Conservation
alpha WGDs | IR | 881 | 1500 | 37.0
alpha WGDs | ALTA | 37 | 346 | 9.7
alpha WGDs | ALTD | 12 | 220 | 5.2
alpha WGDs | ALTP | 0 | 35 | 0.0
alpha WGDs | total | 930 | 2101 | 30.7
tandem | IR | 279 | 411 | 40.4
tandem | ALTA | 22 | 119 | 15.6
tandem | ALTD | 8 | 62 | 11.4
tandem | ALTP | 2 | 17 | 10.5
tandem | total | 311 | 609 | 33.8

Quantitative Conservation of Alternative Splicing Events
Event | Class | Conserved | Divergent | % Conservation
alpha WGDs | IR | 279 | 602 | 31.7
alpha WGDs | ALTA | 14 | 23 | 37.8
alpha WGDs | ALTD | 2 | 10 | 16.7
alpha WGDs | ALTP | 0 | 0 | 0.0
alpha WGDs | total | 295 | 635 | 31.7
tandem | IR | 116 | 163 | 41.6
tandem | ALTA | 8 | 14 | 36.4
tandem | ALTD | 4 | 4 | 50.0
tandem | ALTP | 2 | 0 | 100.0
tandem | total | 130 | 181 | 41.8

Overall Conservation of Alternative Splicing Events
Event | Class | Conserved | Divergent | % Conservation
alpha WGDs | IR | 279 | 2102 | 11.7
alpha WGDs | ALTA | 14 | 369 | 3.7
alpha WGDs | ALTD | 2 | 230 | 0.9
alpha WGDs | ALTP | 0 | 59 | 0.0
alpha WGDs | total | 295 | 2736 | 9.7
tandem | IR | 116 | 610 | 16.8
tandem | ALTA | 8 | 133 | 5.7
tandem | ALTD | 4 | 66 | 5.7
tandem | ALTP | 2 | 17 | 10.5
tandem | total | 130 | 790 | 14.1

Table A.2: Counts of conserved and divergent alternative splicing events using a total of 5 observations as a threshold.

Minimum number of reads - 5

Qualitative Conservation of Alternative Splicing Events
Event | Class | Conserved | Divergent | % Conservation
alpha WGDs | IR | 886 | 1499 | 37.1
alpha WGDs | ALTA | 37 | 345 | 9.7
alpha WGDs | ALTD | 12 | 222 | 5.1
alpha WGDs | ALTP | 0 | 35 | 0.0
alpha WGDs | total | 935 | 2101 | 30.8
tandem | IR | 279 | 409 | 40.6
tandem | ALTA | 22 | 118 | 15.7
tandem | ALTD | 8 | 62 | 11.4
tandem | ALTP | 2 | 17 | 10.5
tandem | total | 311 | 606 | 33.9

Quantitative Conservation of Alternative Splicing Events
Event | Class | Conserved | Divergent | % Conservation
alpha WGDs | IR | 279 | 607 | 31.5
alpha WGDs | ALTA | 12 | 25 | 32.4
alpha WGDs | ALTD | 4 | 8 | 33.3
alpha WGDs | ALTP | 0 | 0 | 0.0
alpha WGDs | total | 295 | 640 | 31.6
tandem | IR | 116 | 163 | 41.6
tandem | ALTA | 8 | 14 | 36.4
tandem | ALTD | 4 | 4 | 50.0
tandem | ALTP | 2 | 0 | 100.0
tandem | total | 130 | 181 | 41.8

Overall Conservation of Alternative Splicing Events
Event | Class | Conserved | Divergent | % Conservation
alpha WGDs | IR | 279 | 2106 | 11.7
alpha WGDs | ALTA | 12 | 370 | 3.1
alpha WGDs | ALTD | 4 | 230 | 1.7
alpha WGDs | ALTP | 0 | 35 | 0.0
alpha WGDs | total | 295 | 2741 | 9.7
tandem | IR | 116 | 572 | 16.9
tandem | ALTA | 8 | 132 | 5.7
tandem | ALTD | 4 | 66 | 5.7
tandem | ALTP | 2 | 17 | 10.5
tandem | total | 130 | 787 | 14.2

Table A.3: Counts of conserved and divergent alternative splicing events using a total of 8 observations as a threshold.

Minimum number of reads - 8

Qualitative Conservation of Alternative Splicing Events
Event | Class | Conserved | Divergent | % Conservation
alpha WGDs | IR | 891 | 1497 | 37.3
alpha WGDs | ALTA | 37 | 344 | 9.7
alpha WGDs | ALTD | 12 | 223 | 5.1
alpha WGDs | ALTP | 0 | 35 | 0.0
alpha WGDs | total | 940 | 2099 | 30.9
tandem | IR | 279 | 407 | 40.7
tandem | ALTA | 22 | 117 | 15.8
tandem | ALTD | 8 | 62 | 11.4
tandem | ALTP | 2 | 17 | 10.5
tandem | total | 311 | 603 | 34.0

Quantitative Conservation of Alternative Splicing Events
Event | Class | Conserved | Divergent | % Conservation
alpha WGDs | IR | 289 | 602 | 32.4
alpha WGDs | ALTA | 6 | 31 | 16.2
alpha WGDs | ALTD | 3 | 9 | 25.0
alpha WGDs | ALTP | 0 | 0 | 0.0
alpha WGDs | total | 298 | 642 | 31.7
tandem | IR | 116 | 163 | 41.6
tandem | ALTA | 8 | 14 | 36.4
tandem | ALTD | 4 | 4 | 50.0
tandem | ALTP | 2 | 0 | 100.0
tandem | total | 130 | 181 | 41.8

Overall Conservation of Alternative Splicing Events
Event | Class | Conserved | Divergent | % Conservation
alpha WGDs | IR | 289 | 2099 | 12.1
alpha WGDs | ALTA | 6 | 375 | 1.6
alpha WGDs | ALTD | 3 | 232 | 1.3
alpha WGDs | ALTP | 0 | 35 | 0.0
alpha WGDs | total | 298 | 2741 | 9.8
tandem | IR | 116 | 570 | 16.9
tandem | ALTA | 8 | 131 | 5.8
tandem | ALTD | 4 | 66 | 5.7
tandem | ALTP | 2 | 17 | 10.5
tandem | total | 130 | 784 | 14.2

Appendix B Supplementary tables and figures of chapter 4

Table B.1: Summary of data and mapping
Data on all samples sequenced, including where they were mapped to.
Reads called indicates how many reads the custom Python script accepted as mapping to a known and annotated gene.

Sample | Genotype | Mapped To | Number of Reads | Number of Bases | Reads Mapped | Mapping Percent | Reads Called | Percent Called of Mapped | Mapped and Called
A | rapa | R+O | 61790558 | 12358111600 | 53092274 | 85.9229560607 | 35741021 | 0.6731868558 | 0.5784220463
B | rapa | R+O | 58951148 | 11790229600 | 51089327 | 86.6638373183 | 34792297 | 0.6810091078 | 0.5901886253
C | rapa | R+O | 63137624 | 12627524800 | 54893156 | 86.9420680132 | 37961371 | 0.6915501634 | 0.6012480134
D | oleracea | R+O | 57918018 | 11583603600 | 53303009 | 92.0318250531 | 38967091 | 0.7310486168 | 0.6727973841
E | oleracea | R+O | 71776687 | 14355337400 | 66008244 | 91.9633473749 | 48435644 | 0.7337817379 | 0.6748102486
F | oleracea | R+O | 95239158 | 19047831600 | 86833089 | 91.1737260424 | 59756961 | 0.6881819095 | 0.6274410889
G | P1 | R+O | 82772585 | 16554517000 | 73883269 | 89.2605552913 | 52356709 | 0.7086409374 | 0.6325368357
H | P1 | R+O | 106180805 | 21236161000 | 94564461 | 89.0598456096 | 65053336 | 0.6879258372 | 0.6126656885
I | P1 | R+O | 138829606 | 27765921200 | 125197499 | 90.1806917179 | 87780717 | 0.7011379437 | 0.6322910475
J | P2 | R+O | 122082888 | 24416577600 | 109082663 | 89.3513126918 | 76586218 | 0.7020934023 | 0.6273296713
K | P2 | R+O | 144004667 | 28800933400 | 129925211 | 90.2229168725 | 92062142 | 0.708577968 | 0.639299711
L | P2 | R+O | 126330367 | 25266073400 | 113629127 | 89.9460119513 | 80249700 | 0.7062423352 | 0.6352368152
M | P3 | R+O | 138725256 | 27745051200 | 124606284 | 89.8223492916 | 87432127 | 0.7016670764 | 0.6302538523
N | P3 | R+O | 133821552 | 26764310400 | 119750555 | 89.4852534665 | 84037619 | 0.7017722715 | 0.6279826959
O | P3 | R+O | 138712755 | 27742551000 | 123851065 | 89.2859960859 | 85455751 | 0.6899880191 | 0.6160626757
P | napus | napus | 131919274 | 26383854800 | 112166476 | 0.8502660195 | 96437125 | 0.8597678062 | 0.7310313503
Q | napus | napus | 140388522 | 28077704400 | 98387267 | 0.7008213036 | 84265847 | 0.8564710614 | 0.6002331658
R | napus | napus | 106818196 | 21363639200 | 93763885 | 0.8777894452 | 80470306 | 0.8582228221 | 0.7533389349
S | Synthetic Mix | R+O | 150276375 | 30055275000 | 136032725 | 90.5216971064 | 95895192 | 0.7049420792 | 0.6381255337
Total |  |  | 2069676041 |  | 1820059586 |  | 1323737174 |  | 

Table B.2: Alternative acceptor events changing frequency after allopolyploidy
Shared changes only need be shared by 2 or more resynthesized allopolyploids.

Alternative Acceptor Events

AT
Polyploid | Increases Unique | Increases Shared | Decreases Unique | Decreases Shared | No change Unique | No change Shared
P1 | 26 | 60 | 32 | 55 | 30 | 1096
P2 | 39 | 56 | 34 | 54 | 34 | 1082
P3 | 41 | 59 | 54 | 59 | 27 | 1059
Total | 106 | 175 | 120 | 168 | 91 | 3237
Category sums: Increases 281, Decreases 288, No change 3328

CT
Polyploid | Increases Unique | Increases Shared | Decreases Unique | Decreases Shared | No change Unique | No change Shared
P1 | 26 | 96 | 25 | 61 | 37 | 703
P2 | 37 | 101 | 35 | 62 | 30 | 683
P3 | 40 | 106 | 41 | 61 | 27 | 673
Total | 103 | 303 | 101 | 184 | 94 | 2059
Category sums: Increases 406, Decreases 285, No change 2153

AT + CT
Polyploid | Increases Unique | Increases Total | Decreases Unique | Decreases Total | No change Unique | No change Total
P1 | 52 | 156 | 57 | 116 | 67 | 1799
P2 | 76 | 157 | 69 | 116 | 64 | 1765
P3 | 81 | 165 | 95 | 120 | 54 | 1732
Total | 209 | 478 | 221 | 352 | 185 | 5296
Category sums: Increases 687, Decreases 573, No change 5481

Table B.3: Alternative donor events changing frequency after allopolyploidy
Shared changes only need be shared by 2 or more resynthesized allopolyploids.

Alternative Donor Events

AT
Polyploid | Increases Unique | Increases Shared | Decreases Unique | Decreases Shared | No change Unique | No change Shared
P1 | 32 | 62 | 28 | 46 | 36 | 855
P2 | 20 | 65 | 35 | 52 | 26 | 861
P3 | 46 | 64 | 28 | 56 | 23 | 842
Total | 98 | 191 | 91 | 154 | 85 | 2558
Category sums: Increases 289, Decreases 245, No change 2643

CT
Polyploid | Increases Unique | Increases Shared | Decreases Unique | Decreases Shared | No change Unique | No change Shared
P1 | 15 | 83 | 19 | 88 | 32 | 525
P2 | 32 | 72 | 28 | 85 | 47 | 498
P3 | 36 | 85 | 30 | 94 | 25 | 492
Total | 83 | 240 | 77 | 267 | 104 | 1515
Category sums: Increases 323, Decreases 344, No change 1619

AT + CT
Polyploid | Increases Unique | Increases Total | Decreases Unique | Decreases Total | No change Unique | No change Total
P1 | 47 | 145 | 47 | 134 | 68 | 1380
P2 | 52 | 137 | 63 | 137 | 73 | 1359
P3 | 82 | 149 | 58 | 150 | 48 | 1334
Total | 181 | 431 | 168 | 421 | 189 | 4073
Category sums: Increases 612, Decreases 589, No change 4262

Table B.4: Alternative position events changing frequency after allopolyploidy
Shared changes only need be shared by 2 or more resynthesized allopolyploids.

Alternative Position Events

AT
Polyploid | Increases Unique | Increases Shared | Decreases Unique | Decreases Shared | No change Unique | No change Shared
P1 | 20 | 37 | 18 | 22 | 26 | 585
P2 | 26 | 44 | 18 | 22 | 17 | 581
P3 | 30 | 42 | 28 | 27 | 14 | 567
Total | 76 | 123 | 64 | 71 | 57 | 1733
Category sums: Increases 199, Decreases 135, No change 1790

CT
Polyploid | Increases Unique | Increases Shared | Decreases Unique | Decreases Shared | No change Unique | No change Shared
P1 | 15 | 63 | 10 | 17 | 17 | 325
P2 | 21 | 72 | 17 | 16 | 9 | 312
P3 | 23 | 67 | 10 | 18 | 13 | 316
Total | 59 | 202 | 37 | 51 | 39 | 953
Category sums: Increases 261, Decreases 88, No change 992

AT + CT
Polyploid | Increases Unique | Increases Total | Decreases Unique | Decreases Total | No change Unique | No change Total
P1 | 35 | 100 | 28 | 39 | 43 | 910
P2 | 47 | 116 | 35 | 38 | 26 | 893
P3 | 53 | 109 | 38 | 45 | 27 | 883
Total | 135 | 325 | 101 | 122 | 96 | 2686
Category sums: Increases 460, Decreases 223, No change 2782

Table B.5: Alternative acceptor events changing frequency after allopolyploidy with natural
Shared changes only need be shared by 2 or more resynthesized allopolyploids.

Alternative Acceptor Events

AT
Polyploid | Increases Unique | Increases Shared | Decreases Unique | Decreases Shared | No change Unique | No change Shared
P1 | 1 | 12 | 5 | 12 | 4 | 166
P2 | 4 | 12 | 3 | 13 | 2 | 166
P3 | 3 | 11 | 9 | 10 | 3 | 164
N | 26 | 16 | 15 | 9 | 4 | 130
Total | 34 | 51 | 32 | 44 | 13 | 626
Category sums: Increases 85, Decreases 76, No change 639

CT
Polyploid | Increases Unique | Increases Shared | Decreases Unique | Decreases Shared | No change Unique | No change Shared
P1 | 3 | 13 | 4 | 12 | 3 | 144
P2 | 2 | 17 | 2 | 14 | 2 | 142
P3 | 5 | 14 | 12 | 14 | 3 | 131
N | 28 | 18 | 5 | 10 | 2 | 116
Total | 38 | 62 | 23 | 50 | 10 | 533
Category sums: Increases 100, Decreases 73, No change 543

AT + CT
Polyploid | Increases Unique | Increases Total | Decreases Unique | Decreases Total | No change Unique | No change Total
P1 | 4 | 25 | 9 | 24 | 7 | 310
P2 | 6 | 29 | 5 | 27 | 4 | 308
P3 | 8 | 25 | 21 | 24 | 6 | 295
N | 54 | 34 | 20 | 19 | 6 | 246
Total | 72 | 113 | 55 | 94 | 23 | 1159
Category sums: Increases 185, Decreases 149, No change 1182

Table B.6: Alternative donor events changing frequency after allopolyploidy with natural
Shared changes only need be shared by 2 or more resynthesized allopolyploids.

Alternative Donor Events

AT
Polyploid | Increases Unique | Increases Shared | Decreases Unique | Decreases Shared | No change Unique | No change Shared
P1 | 3 | 4 | 4 | 7 | 1 | 71
P2 | 1 | 4 | 1 | 5 | 1 | 78
P3 | 5 | 6 | 2 | 6 | 2 | 69
N | 14 | 5 | 7 | 6 | 2 | 56
Total | 23 | 19 | 14 | 24 | 6 | 274
Category sums: Increases 42, Decreases 38, No change 280

CT
Polyploid | Increases Unique | Increases Shared | Decreases Unique | Decreases Shared | No change Unique | No change Shared
P1 | 2 | 9 | 0 | 10 | 0 | 58
P2 | 2 | 9 | 0 | 8 | 2 | 58
P3 | 3 | 9 | 2 | 11 | 1 | 53
N | 9 | 8 | 3 | 7 | 3 | 49
Total | 16 | 35 | 5 | 36 | 6 | 218
Category sums: Increases 51, Decreases 41, No change 224

AT + CT
Polyploid | Increases Unique | Increases Total | Decreases Unique | Decreases Total | No change Unique | No change Total
P1 | 5 | 13 | 4 | 17 | 1 | 129
P2 | 3 | 13 | 1 | 13 | 3 | 136
P3 | 8 | 15 | 4 | 17 | 3 | 122
N | 23 | 13 | 10 | 13 | 5 | 105
Total | 39 | 54 | 19 | 60 | 12 | 492
Category sums: Increases 93, Decreases 79, No change 504

Table B.7: Alternative position events changing frequency after allopolyploidy with natural
Shared changes only need be shared by 2 or more resynthesized allopolyploids.

Alternative Position Events

AT
Polyploid | Increases Unique | Increases Shared | Decreases Unique | Decreases Shared | No change Unique | No change Shared
P1 | 0 | 2 | 0 | 1 | 0 | 9
P2 | 0 | 1 | 0 | 0 | 0 | 11
P3 | 0 | 2 | 0 | 1 | 0 | 9
N | 0 | 1 | 2 | 0 | 0 | 9
Total | 0 | 6 | 2 | 2 | 0 | 38
Category sums: Increases 6, Decreases 4, No change 38

CT
Polyploid | Increases Unique | Increases Shared | Decreases Unique | Decreases Shared | No change Unique | No change Shared
P1 | 1 | 1 | 0 | 0 | 0 | 10
P2 | 0 | 3 | 0 | 1 | 0 | 8
P3 | 0 | 2 | 0 | 0 | 0 | 10
N | 1 | 2 | 1 | 1 | 0 | 7
Total | 2 | 8 | 1 | 2 | 0 | 35
Category sums: Increases 10, Decreases 3, No change 35

AT + CT
Polyploid | Increases Unique | Increases Total | Decreases Unique | Decreases Total | No change Unique | No change Total
P1 | 1 | 3 | 0 | 1 | 0 | 19
P2 | 0 | 4 | 0 | 1 | 0 | 19
P3 | 0 | 4 | 0 | 1 | 0 | 19
N | 1 | 3 | 3 | 1 | 0 | 16
Total | 2 | 14 | 3 | 4 | 0 | 73
Category sums: Increases 16, Decreases 7, No change 73

Table B.8: Homeolog analysis repeated with the natural added in.
Shared changes only need be shared by 2 or more resynthesized allopolyploids.

With Natural Added

Homeolog Pairs which were Equal in the parental species
Polyploid | AT>CT Unique | AT>CT Shared | AT<CT Unique | AT<CT Shared | AT=CT Unique | AT=CT Shared
P1 | 5 | 22 | 7 | 20 | 8 | 115
P2 | 5 | 30 | 8 | 24 | 2 | 108
P3 | 7 | 27 | 6 | 25 | 2 | 110
N | 21 | 18 | 13 | 17 | 10 | 98
Total | 38 | 97 | 34 | 86 | 22 | 431
Category sums: AT>CT 135, AT<CT 120, AT=CT 453

Homeolog Pairs which were A biased in the parental species
Polyploid | AT>CT Unique | AT>CT Shared | AT<CT Unique | AT<CT Shared | AT=CT Unique | AT=CT Shared
P1 | 0 | 47 | 0 | 2 | 0 | 33
P2 | 2 | 44 | 2 | 2 | 3 | 29
P3 | 2 | 43 | 1 | 2 | 3 | 31
N | 6 | 24 | 6 | 0 | 15 | 31
Total | 10 | 158 | 9 | 6 | 21 | 124
Category sums: AT>CT 168, AT<CT 15, AT=CT 145

Homeolog Pairs which were C biased in the parental species
Polyploid | AT>CT Unique | AT>CT Shared | AT<CT Unique | AT<CT Shared | AT=CT Unique | AT=CT Shared
P1 | 1 | 3 | 3 | 79 | 5 | 60
P2 | 2 | 4 | 5 | 81 | 4 | 55
P3 | 1 | 3 | 1 | 81 | 8 | 57
N | 22 | 5 | 9 | 52 | 18 | 45
Total | 26 | 15 | 18 | 293 | 35 | 217
Category sums: AT>CT 41, AT<CT 311, AT=CT 252

Table B.9: Summary of GO analysis
In the synthetic (resynthesized) allopolyploids, genes with events of various types which always had a particular reaction to allopolyploidy were analyzed using the DAVID GO analysis suite, where an enrichment is highlighted in green with the number of genes in that category. A plus indicates that the event was a shared increase, where a minus indicates a shared decrease.
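When the naturals are included, more derivative categories appear, where an x to y type is where the event was unanimously one state in the resynthesized allopolyploids, and changed to a new state in the natural allopolyploid.

The first twelve data columns (+IR through =ALTP) belong to the Synthetics; the last seven (+IR through "- to +") belong to the Naturals.

GO Category | +IR | -IR | =IR | +ALTA | -ALTA | =ALTA | +ALTD | -ALTD | =ALTD | +ALTP | -ALTP | =ALTP | +IR | -IR | =IR | = to - | = to + | - to = | - to +
chloroplast | 53 | 469 | 743 | 18 | 28 | 158 | 18 | 21 | 89 | 20 | 0 | 72 | 15 | 99 | 175 | 73 | 65 | 49 | 22
plastid | 54 | 477 | 758 | 18 | 28 | 159 | 18 | 21 | 90 | 20 | 0 | 72 | 15 | 102 | 178 | 76 | 65 | 50 | 23
plastid part | 30 | 227 | 292 | 0 | 16 | 55 | 0 | 13 | 0 | 11 | 0 | 29 | 0 | 53 | 74 | 28 | 0 | 21 | 11
alternative splicing | 30 | 187 | 361 | 11 | 8 | 76 | 8 | 9 | 49 | 0 | 0 | 25 | 11 | 42 | 90 | 32 | 29 | 20 | 10
splice variant | 16 | 101 | 190 | 7 | 0 | 48 | 0 | 0 | 29 | 0 | 0 | 0 | 0 | 33 | 55 | 20 | 16 | 13 | 0
response to abiotic stimulus | 0 | 151 | 255 | 0 | 0 | 0 | 10 | 0 | 44 | 0 | 0 | 0 | 0 | 35 | 58 | 24 | 28 | 15 | 0
response to osmotic stress | 9 | 48 | 113 | 0 | 0 | 0 | 0 | 0 | 19 | 0 | 0 | 0 | 0 | 14 | 22 | 0 | 16 | 0 | 0
response to salt stress | 9 | 45 | 104 | 0 | 0 | 0 | 0 | 0 | 18 | 0 | 0 | 0 | 0 | 13 | 21 | 0 | 15 | 0 | 0
fatty acid biosynthetic process | 0 | 22 | 42 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 11 | 7 | 7 | 0 | 0
fruit development | 0 | 0 | 103 | 0 | 5 | 26 | 0 | 0 | 0 | 0 | 0 | 12 | 0 | 0 | 26 | 0 | 0 | 0 | 0
lipid biosynthetic process | 0 | 51 | 96 | 0 | 0 | 22 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 26 | 0 | 0 | 7 | 0
metal binding | 0 | 158 | 361 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 74 | 34 | 44 | 0 | 0
response to light stimulus | 0 | 56 | 89 | 0 | 0 | 21 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 24 | 0 | 0 | 9 | 0
response to temperature stimulus | 0 | 55 | 83 | 0 | 0 | 17 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 12 | 0 | 0 | 0 | 0 | 0
response to metal ion | 10 | 66 | 110 | 0 | 0 | 19 | 0 | 0 | 0 | 0 | 0 | 11 | 0 | 14 | 0 | 0 | 0 | 0 | 0
cellular response to stress | 0 | 44 | 96 | 0 | 0 | 20 | 0 | 0 | 0 | 0 | 0 | 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0
response to inorganic substance | 0 | 82 | 134 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 19 | 0 | 0 | 0 | 0 | 0
response to jasmonic acid stimulus | 6 | 0 | 38 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 6 | 7 | 0 | 0
response to abscisic acid stimulus | 0 | 34 | 61 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 0 | 0 | 0 | 0 | 0
RNA splicing | 0 | 15 | 29 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
transcription regulation | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 26 | 0 | 0 | 0 | 0 | 0
Genes in pool | 265 | 1766 | 3811 | 79 | 76 | 760 | 76 | 75 | 567 | 41 | 16 | 166 | 64 | 379 | 848 | 335 | 387 | 175 | 70

Table B.10: Changes in SR gene splicing levels
Changes in levels of splicing events in SR protein genes and details on PTC status.

Arabidopsis | SR Name | orthologs | Event | Junction | Invoke PTC? | P1 | P2 | P3 | Change | Change All | Sig. Changes | Positive | Negative | Base Percent (average)
AT1G09140 | SR30 | Bra018581 | IR | 9-10 | 1 | 0.74 | 0.81 | 0.75 | 0 | 0 | 0 | 0 | 0 | 42.4
AT1G09140 | SR30 | Bra018581 | ALTA | 9-10 | 1 | *1.51 | 1.00 | 1.06 | 1 | 0 | 1 | 1 | 0 | 13.7
AT1G09140 | SR30 | Bra018581 | ALTP | 9-10 | 0 | 1.02 | 0.88 | 0.99 | 0 | 0 | 0 | 0 | 0 | 25.3
AT1G09140 | SR30 | Bo5g009470 | IR | 7-8 | 1 | *0.39 | 1.31 | 0.99 | 1 | 0 | 1 | 0 | 1 | 8.0
AT1G09140 | SR30 | Bo5g009470 | IR | 10-11 | 1 | 0.50 | 0.58 | 0.61 | 0 | 0 | 0 | 0 | 0 | 4.2
AT3G49430 | SR34a | Bra017986 | IR | 2-3 | 1 | 0.65 | *0.56 | 0.78 | 1 | 0 | 1 | 0 | 1 | 7.8
AT3G49430 | SR34a | Bo3g109570 | IR | 2-3 | 1 | 0.89 | 1.19 | 0.87 | 0 | 0 | 0 | 0 | 0 | 10.2
AT3G49430 | SR34a | Bo3g109570 | IR | 9-10 | 1 | 0.70 | 0.83 | 0.68 | 0 | 0 | 0 | 0 | 0 | 10.0
AT4G02430 | SR34b | Bra036267 | IR | 2-3 | 1 | 0.91 | 0.80 | 0.97 | 0 | 0 | 0 | 0 | 0 | 8.7
AT4G02430 | SR34b | Bra036267 | IR | 3-4 | 0 | 0.72 | 0.62 | 0.66 | 0 | 0 | 0 | 0 | 0 | 9.1
AT4G02430 | SR34b | Bra036267 | IR | 9-10 | 1 | *0.79 | 0.94 | 0.96 | 1 | 0 | 1 | 0 | 1 | 52.0
AT4G02430 | SR34b | Bra036267 | ALTA | 9-10 | 1 | *1.35 | *1.42 | 1.17 | 1 | 0 | 2 | 2 | 0 | 9.1
AT4G02430 | SR34b | Bra036267 | ALTP | 9-10 | 1 | 0.85 | 0.88 | 1.13 | 0 | 0 | 0 | 0 | 0 | 10.2
AT4G25500 | RS40 | Bra010459 | IR | 1-2 | 1 | *0.60 | *0.71 | *0.51 | 1 | 1 | 3 | 0 | 3 | 52.7
AT4G25500 | RS40 | Bra013895 | IR | 1-2 | 1 | 0.93 | *0.85 | 0.89 | 1 | 0 | 1 | 0 | 1 | 66.4
AT4G25500 | RS40 | Bra013895 | ALTD | 1-2 | 1 | 1.06 | 1.08 | 1.05 | 0 | 0 | 0 | 0 | 0 | 5.3
AT4G25500 | RS40 | Bo1g040230 | ALTD | 1-2 | 1 | *0.67 | *0.83 | *0.57 | 1 | 1 | 3 | 0 | 3 | 12.7
AT5G52040 | RS41 | Bra029146 | ALTA | 2-3 | 0 | *2.63 | *1.7 | *2.02 | 1 | 1 | 3 | 3 | 0 | 4.1
AT5G52040 | RS41 | Bra029146 | IR | 3-4 | 1 | *0.49 | *0.70 | *0.68 | 1 | 1 | 3 | 0 | 3 | 21.5
AT5G52040 | RS41 | Bra029146 | ALTA | 3-4 | 1 | 0.94 | 1.14 | 0.93 | 0 | 0 | 0 | 0 | 0 | 7.2
AT5G52040 | RS41 | Bra029146 | ALTD | 3-4 | 1 | 0.93 | 1.32 | 1.25 | 0 | 0 | 0 | 0 | 0 | 4.1
AT5G52040 | RS41 | Bo3g023480 | IR | 1-2 | 1 | 1.09 | *1.47 | 0.89 | 1 | 0 | 1 | 1 | 0 | 53.8
AT5G52040 | RS41 | Bo3g023480 | ALTA | 1-2 | 1 | *2.05 | 1.22 | 1.26 | 1 | 0 | 1 | 1 | 0 | 3.7
AT5G52040 | RS41 | Bo3g023480 | ALTD | 1-2 | 1 | *1.69 | 1.27 | *1.60 | 1 | 0 | 2 | 2 | 0 | 8.0
AT2G24590 | SRZ22a | Bra032082 | IR | 2-3 | 0 | *0.70 | *0.66 | *0.61 | 1 | 1 | 3 | 0 | 3 | 15.6
AT2G24590 | SRZ22a | Bra032082 | IR | 3-4 | 1 | *0.51 | 1.28 | 0.63 | 1 | 0 | 1 | 0 | 1 | 4.0
AT2G24590 | SRZ22a | Bra032082 | ALTA | 4-5 | 0 | *1.81 | *1.95 | 1.28 | 1 | 0 | 2 | 3 | 0 | 2.7
AT2G24590 | SRZ22a | Bo4g154780 | IR | 2-3 | 1 | *0.73 | *0.70 | *0.70 | 1 | 1 | 3 | 0 | 3 | 17.2
AT3G53500 | RSZ32 | Bra007004 | IR | 1-2 | 1 | *0.71 | 0.83 | *0.67 | 1 | 0 | 2 | 0 | 3 | 84.0
AT3G53500 | RSZ32 | Bra007004 | IR | 2-3 | 1 | *0.67 | 0.88 | *0.69 | 1 | 0 | 2 | 0 | 2 | 50.8
AT3G53500 | RSZ32 | Bra007004 | ALTA | 1-2 | 1 | 0.94 | 1.26 | 1.01 | 0 | 0 | 0 | 0 | 0 | 4.7
AT2G37340 | RSZ33 | Bra023114 | IR | 2-3 | 1 | *0.72 | *0.63 | *0.72 | 1 | 1 | 3 | 0 | 3 | 48.7
AT2G37340 | RSZ33 | Bra023114 | ALTA | 2-3 | 1 | *0.22 | *0.42 | *0.55 | 1 | 1 | 3 | 0 | 3 | 7.5
AT2G37340 | RSZ33 | Bra023114 | IR | 3-4 | 1 | 0.54 | *0.38 | *0.46 | 1 | 0 | 2 | 0 | 2 | 16.6
AT1G55310 | SR33 | Bo6g092410 | IR | 2-3 | 1 | *1.56 | *1.29 | *1.58 | 1 | 1 | 3 | 3 | 0 | 31.1
AT1G55310 | SR33 | Bo6g092410 | ALTA | 2-3 | 0 | 1.01 | 0.94 | 0.98 | 0 | 0 | 0 | 0 | 0 | 6.4
AT1G55310 | SR33 | Bo6g092410 | ALTD | 2-3 | 0 | 0.73 | 0.72 | 0.86 | 0 | 0 | 0 | 0 | 0 | 13.0
AT3G55460 | SCL30 | Bra014741 | IR | 1-2 | 1 | 0.65 | 0.91 | *0.51 | 1 | 0 | 1 | 0 | 1 | 6.8
AT3G55460 | SCL30 | Bra014741 | IR | 2-3 | 1 | 0.81 | 0.74 | 1.19 | 0 | 0 | 0 | 0 | 0 | 4.6
AT3G55460 | SCL30 | Bra014741 | IR | 4-5 | 0 | 0.95 | 1.40 | 1.24 | 0 | 0 | 0 | 0 | 0 | 4.0
AT3G13570 | SCL30a | Bra039394 | IR | 2-3 | 1 | 0.85 | 0.88 | 0.92 | 0 | 0 | 0 | 0 | 0 | 27.1
AT3G13570 | SCL30a | Bo1g138240 | IR | 1-2 | 1 | 0.88 | 0.79 | 0.79 | 0 | 0 | 0 | 0 | 0 | 6.9
AT3G13570 | SCL30a | Bo1g138240 | IR | 2-3 | 1 | 1.16 | 1.11 | 1.08 | 0 | 0 | 0 | 0 | 0 | 26.9
AT3G13570 | SCL30a | Bo1g138240 | IR | 3-4 | 0 | 1.21 | 1.01 | *1.96 | 1 | 0 | 1 | 1 | 0 | 3.0
AT3G13570 | SCL30a | Bo1g138240 | ALTD | 2-3 | 1 | *1.62 | *1.45 | *1.44 | 1 | 1 | 3 | 3 | 0 | 5.1
AT3G13570 | SCL30a | Bo1g138240 | ALTA | 2-3 | 1 | 0.73 | 1.16 | 0.94 | 0 | 0 | 0 | 0 | 0 | 4.4
AT1G16610 | SR45 | Bra026037 | IR | 2-3 | 1 | 0.55 | 1.04 | 0.83 | 0 | 0 | 0 | 0 | 0 | 10.5
AT1G16610 | SR45 | Bra026037 | IR | 4-5 | 1 | 0.97 | 0.66 | *0.31 | 1 | 0 | 1 | 0 | 1 | 10.6
AT1G16610 | SR45 | Bo5g022030 | IR | 4-5 | 1 | *0.59 | 0.86 | *0.61 | 1 | 0 | 2 | 0 | 2 | 9.7
AT1G16610 | SR45 | Bo5g022030 | ALTA | 2-3 | 0 | 1.06 | 1.13 | 0.99 | 0 | 0 | 0 | 0 | 0 | 6.4
AT1G23860 | RSZ21 | Bra024624 | IR | 3-4 | 1 | *0.24 | *0.46 | 0.60 | 1 | 0 | 2 | 0 | 2 | 10.4
AT5G64200 | SC35 | Bra024276 | IR | 5-6 | 1 | 0.78 | *0.42 | 0.63 | 1 | 0 | 1 | 0 | 1 | 8.3

Column total, Invoke PTC?: 42

Summary | Count | Percent
Genes with change | 30 | 57.692307692
Genes with change in all | 10 | 19.230769231
Significant changes | 58 | 
Significant Decreases | 40 | 68.965517241
Significant Increases | 20 | 34.482758621

Table B.11: Summary of PCR verification
All primers used in this experiment.

Primer Name | species | gene | Junction | forward | reverse | Event 1 | Event 2 | Found 1 | Found 2 | Result | Events Possible | Events Confirmed
B.napus_00001 | oleracea | Bo8g045830 | 1,2 | TCCAACTATTCACAGCCATTGT | TCGCATTAAATCCTTCGTCTCTT | IR | ALTD | Y | Y | 1 | 2 | 2
B.napus_00002 | rapa | Bra008524 | 1,2 | AACAGAGAGAGCTGCACAATTC | TCGACGCTTTCCCACATTTC | IR | ALTA | Y | Y | 1 | 2 | 2
B.napus_00003 | oleracea | Bo2g120570 | 2,3 | TTTCAGGGTTCAGCTTCCAG | GAAAGAGTTCGTCACATTGGATTAC | IR | - | N | - | 0 | 1 | 0
B.napus_00004 | oleracea | Bo3g149340 | 2,3 | GACCTGTTATCCAAGTTCGTCTC | AGAAGCTTGATCAGGTTCATTAGT | IR | ALTA | Y | N | 1 | 2 | 1
B.napus_00005 | oleracea | Bo1g017930 | 4,5 | TCTACAGAGACTTTCCTCCTCAC | ATCACCGGAGACGGTACAT | IR | - | N | - | 1 | 1 | 1
B.napus_00006 | rapa | Bra011719 | 3,4 | CTTCCTCGTTGCCTTGTCTATC | CCGACAGTGGTTGAGTTCTATG | IR | - | N | - | 0 | 0 | 0
B.napus_00007 | rapa | Bra007675 | 7,8 | TGAGATCATAACCAAGGACAAAGA | CATCCCATCCAGAACTGCAA | IR | - | Y | - | 1 | 1 | 1
B.napus_00008 | oleracea | Bo4g191060 | 18,19 | CAATGGTGTTCTTGGCGATTT | CAGCTACGACATGAGAGTTCTG | IR | - | Y | - | 1 | 1 | 1
B.napus_00010 | oleracea | Bo9g004450 | 8,9 | CCCTGCAGCAAACATTGTTAC | CTTCCTCCGGGTACTTCTCT | IR | - | N | - | 1 | 1 | 1
B.napus_00011 | rapa | Bra007563 | 4,5 | GCATGATCAAGAAACCCATCAC | GCCACACTTCATAGCCTTCT | IR | - | Y | - | 1 | 1 | 1
B.napus_00012 | rapa | Bra039116 | 2,3 | GACAAGGAGCTGTCTCTTCTTT | GGCAAGTCTCTGTCTGGTTT | IR | - | Y | - | 1 | 1 | 1
B.napus_00013 | rapa | Bra026302 | 1,3 | CCACCAACAATAGCTCCTCTTC | ACCTCCTCCTCTATCACATCAC | ES | - | Y | - | 1 | 1 | 1
B.napus_00014 | oleracea | Bo3g073040 | 3,5 | GGCTGAGAATAGACACCTTTCA | TACCCTGTTGCTTCCACTAATAC | ES | - | N | - | 0 | 1 | 0
B.napus_00015 | oleracea | Bo5g127250 | 1,2 | AGATGAAGGAGCACAAGAAGAG | TGCAGAGCCGAATGCTAAA | IR | - | Y | - | 1 | 1 | 1
B.napus_00016 | rapa | Bra019963 | 5,6 | AGTGGTCCAGTTCTTGGTTTAG | GGCATAGTTTGTGATGCTCTTG | IR | ALTA | Y | Y | 1 | 2 | 2
B.napus_00017 | rapa | Bra005988 | 14,15 | AACATCTTATGCTTTGCTTCTAGTG | TGAAACGAGTAGCAGTCATCC | IR | ALTD | Y | Y | 1 | 2 | 2
B.napus_00018 | oleracea | Bo6g069160 | 5,6 | TTCCTGCGACATAAGAGATTGG | GACGAAGGGATCAAAGTGTAGG | IR | ALTA | N | Y | 1 | 2 | 1
B.napus_00019 | rapa | Bra011719 | 1,2 | GACTCAGGAAACCCAGAAGAAG | TACTCACGGAATCAACCGAAAG | IR | - | Y | - | 1 | 1 | 1
B.napus_00020 | oleracea | Bo3g068310 | 2,3 | CACACTCATCACCGGACTTT | ACAATGAGACCAATGGCTAGAG | IR | - | N | - | 0 | 1 | 0
B.napus_00021 | rapa | Bra016457 | 3,4 | GGAGGCTTTGGGAGAGTTTAC | ACCGATGAGTGTAACGAGATTG | IR | - | Y | - | 1 | 1 | 1
B.napus_00022 | rapa | Bra011985 | 4,5 | CACTGTGATTCCCGCCATTA | TGTACCCTAGCTCCCATTCA | IR | - | Y | - | 1 | 1 | 1
B.napus_00023 | oleracea | Bo8g092500 | 2,3 | AGTAGAGGATGGAGTGGAGTT | GGATTTGGGAGTAGGGTCTTATG | IR | - | Y | - | 1 | 1 | 1
B.napus_00024 | rapa | Bra005806 | 4,5 | ACCGTTGTGTGCGTAGTTAG | GCTTACCGAGGACGATCATTAC | IR | ALTA | Y | Y | 1 | 2 | 2
B.napus_00025 | oleracea | Bo3g064870 | 4,5 | TGTTTCGCTTCTACTGCTCAA | TCACCAACCAAACGGTATCC | IR | - | Y | - | 1 | 1 | 1
B.napus_00026 | oleracea | Bo3g081950 | 1,2 | TACCAGAGGAAGAAGGAGGAG | GGCGACGAGACACTGAAAT | IR | - | Y | - | 1 | 1 | 1
B.napus_00027 | oleracea | Bo4g173310 | 1,2 | GCGAAGAAGACGAAGGAAGTTA | CTGAATCGCGTATTGGCTAGAT | IR | - | Y | - | 1 | 1 | 1
B.napus_00028 | oleracea | Bo7g111250 | 1,2 | ACATTCGTTCTCCGTCTCTTG | GGTTCAGGTCTTCTCGTTTGA | IR | - | Y | - | 1 | 1 | 1
B.napus_00029 | rapa | Bra008524 | 1,2 | GAACAGAGAGAGCTGCACAA | GTCGCTTTCACTCATCACTCT | IR | ALTA | Y | Y | 1 | 2 | 2

Successful Primer Sets | 24
Primers Success Ratio | 0.8571428571
Total Events Found | 30
Total Events Surveyed | 35
Event Success Ratio | 0.8571428571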
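
The summary ratios that close Table B.11, and the "% Conservation" columns of Tables A.1 to A.3, appear to be simple proportions of the listed counts. The following is a minimal sketch of that arithmetic in Python, not the script used in the thesis. The function names are hypothetical, the count of 28 primer sets is taken from the number of rows in Table B.11 rather than stated in the text, and returning 0.0 for an empty class is an assumption chosen to match the "0 0 0.0" rows.

```python
def percent_conserved(conserved, divergent):
    """Percent of events conserved: 100 * conserved / (conserved + divergent).

    Assumption: classes with no events at all are reported as 0.0,
    matching the "ALTP 0 0 0.0" rows in Tables A.1 to A.3.
    """
    total = conserved + divergent
    if total == 0:
        return 0.0
    return 100.0 * conserved / total


def success_ratio(successes, attempts):
    """Fraction of attempts that succeeded (hypothetical helper)."""
    return successes / attempts


# Table B.11: 24 successful primer sets out of 28 rows (row count assumed),
# and 30 events found out of 35 events surveyed.
primer_ratio = success_ratio(24, 28)
event_ratio = success_ratio(30, 35)

# Table A.1, qualitative conservation, alpha WGDs, IR: 881 conserved, 1500 divergent.
ir_alpha_wgd = percent_conserved(881, 1500)

print(round(primer_ratio, 10), round(event_ratio, 10), round(ir_alpha_wgd, 1))
```

Under these assumptions both ratios round to the 0.8571428571 reported in Table B.11, and the IR value rounds to the 37.0 listed in Table A.1.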

