Open Collections

UBC Theses and Dissertations

Investigation of free U6 small nuclear ribonucleoprotein structure and function Dunn, Elizabeth Arlene 2014

INVESTIGATION OF FREE U6 SMALL NUCLEAR RIBONUCLEOPROTEIN STRUCTURE AND FUNCTION

by

Elizabeth Arlene Dunn
B.Sc., The University of Northern British Columbia, 2006
M.Sc., The University of Northern British Columbia, 2009

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in The Faculty of Graduate and Postdoctoral Studies (Biochemistry and Molecular Biology)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

August 2014

© Elizabeth Arlene Dunn, 2014

Abstract

Eukaryotes differ from other domains of life in many respects, but at the level of gene architecture, it is the presence of interrupting sequences in their genes that serves as the defining feature. These must be removed with absolute precision in order to maintain the reading frame during translation of the encoded protein; thus it is not surprising that errors in this process, known as precursor messenger RNA (pre-mRNA) splicing, have been linked to a wide variety of human diseases. U6 small nuclear ribonucleoprotein (snRNP) is an essential component of the spliceosome, the large RNA-protein complex that is responsible for catalyzing pre-mRNA splicing. Although U6 small nuclear RNA (snRNA) plays a critical role in catalyzing the splicing reactions, very little is known about the mechanism of converting catalytically inert U6 snRNA in free U6 snRNP into a catalytic component of the assembled spliceosome. Here I present a model for U6 snRNA secondary structure in free U6 snRNP that suggests that U6 becomes active for splicing through a mechanism that is dependent on its interaction with a second splicing factor, U4 snRNA. I propose that the U6 snRNP-associated protein, Prp24, is responsible for retrieving U6 snRNA from the disassembling spliceosome following splicing of a substrate, and then holds U6 snRNA in a conformation that masks catalytic sequences.
I provide evidence, both from the literature and from my own genetic analysis, that the first two RNA Recognition Motifs of Prp24 bind a region of U6 snRNA known as the telestem, presenting U6 in a manner that is favorable for interaction with U4 snRNA. As a step toward solving the crystal structure to test this model, I have developed a system for the simultaneous recombinant expression of all components of U6 snRNP from a single expression vector, followed by purification of the pre-formed complex under non-denaturing conditions. I have subjected these particles to low-resolution negative stain electron microscopy and have also obtained a small angle X-ray scattering model of a sub-complex of free U6 snRNP, the LSm complex. This work has laid the foundation for understanding the structure/function relationship for U6 snRNP.

Preface

A version of chapter one has been published as a splicing review book chapter in Fungal RNA Biology. I conducted an extensive literature review and was responsible for integrating and synthesizing the information. I wrote this chapter with guidance from Dr. Stephen Rader. The full citation is as follows: Dunn, EA and Rader, SD (2014) Pre-mRNA splicing and the spliceosome: assembly, catalysis, and fidelity. In: Fungal RNA Biology. Sesma, A and von der Haar, T. (eds) Springer-Verlag Heidelberg. 22-57.

Portions of chapter 2 have been adapted and expanded from a manuscript published in Biochemical Society Transactions. I carried out the literature review presented and I wrote the dissertation chapter and manuscript with guidance from Dr. Stephen Rader. The full citation is as follows: Dunn, EA and Rader, SD. (2010) Secondary structure of U6 small nuclear RNA: implications for spliceosome assembly. Biochem. Soc. Trans. 38, 1099-1104.

Content from the remaining chapters has not yet been published.
I performed all of the experiments shown in this work, including experimental design, execution, analysis, and interpretation of results, with the exception of the biophysical characterization of various complexes. Dr. Richard Fahlmann (U of Alberta) carried out the mass spectrometry used to identify all of the protein components of our purified complexes. Dr. Calvin Yip (UBC) collected the negative stain electron microscopy images of materials that I prepared and sent to him. Dr. Sean McKenna (U of Manitoba) and Dr. Trushar Patel (U of Manitoba) were responsible for both the data collection and ab initio modeling of samples that I prepared and sent to them for the small angle X-ray scattering experiments.

Copyrighted materials presented herein have been reproduced or adapted with permissions as indicated.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication
1 Introduction
  1.1 Precursor Messenger RNA Splicing
  1.2 Yeast as a Model Organism
  1.3 The Spliceosome
  1.4 Association of U1 snRNP with the Pre-mRNA Transcript
  1.5 Recognition of the Branch point
  1.6 Assembly of the U4/U6•U5 Triple snRNP
  1.7 Spliceosome Activation
  1.8 Catalytic Steps
  1.9 Spliceosome Remodeling Between Catalytic Steps I and II
  1.10 Spliceosome Disassembly
  1.11 Splicing Fidelity
  1.12 General Aims of this Dissertation
2 U6 snRNA in Free U6 snRNP and its Activation for Splicing
  2.1 The Karaduman Model of U6 snRNA Secondary Structure and its Implications
  2.2 U4/U6 di-snRNP Formation and the Role of Prp24
  2.3 The Dunn Model of U6 snRNP and a Role for U4 snRNA in Splicing
3 The U6 snRNA Telestem is a Binding Site for the U6 snRNP Protein Prp24
  3.1 Introduction
  3.2 Materials and Methods
    3.2.1 Construction of U6 Mutant Yeast Strains
    3.2.2 Yeast Growth Assays
    3.2.3 Analysis of RNA Levels by Solution Hybridization and Primer Extension
    3.2.4 5’-end Labeling of DNA probes
    3.2.5 In Silico Modeling of U6 snRNP
  3.3 Results
  3.4 Discussion
4 Co-expression and Co-purification of Free U6 snRNP
  4.1 Context of this Work in Light of a Recent LSm2-8 Crystal Structure
  4.2 Introduction
  4.3 Materials and Methods
    4.3.1 Preparation of Yeast Genomic DNA and Total RNA
    4.3.2 Construction of LSm Gene-Containing Co-expression Vectors
    4.3.3 Co-expression and Co-purification of the LSm complexes
    4.3.4 Expression and Purification of Tobacco Etch Virus Protease
    4.3.5 Biophysical Characterization of Purified LSm complexes
  4.4 Results
  4.5 Discussion
5 Concluding Remarks
6 Addendum
References

List of Tables

Table 3.1 DNA oligonucleotides used to make mutant U6 snRNA yeast strains
Table 3.2 Prp24 plasmids used in this work
Table 3.3 MC-Sym U6 snRNA structure ranking by scoring program
Table 4.1 DNA oligonucleotides used to amplify the genes used in this work, to sequence the genes, and to modify the pQlinkN vector

List of Figures

Figure 1.1 Eukaryotic Gene Architecture
Figure 1.2 Pre-spliceosome assembly
Figure 1.3 Triple snRNP assembly
Figure 1.4 Spliceosome activation
Figure 1.5 Catalysis
Figure 1.6 Spliceosomal remodeling between the chemical steps
Figure 1.7 Spliceosome disassembly mediated by Prp43 and the NTR complex
Figure 1.8 Kinetic proofreading of the first catalytic step of splicing
Figure 2.1 Models of S. cerevisiae U6 snRNA secondary structure in free U6 snRNP
Figure 2.2 U6 snRNA conformational rearrangements throughout the splicing cycle
Figure 2.3 Chemical modification data mapped to the Karaduman and Dunn models of U6 snRNA secondary structure
Figure 2.4 Raw U6 snRNA chemical modification data in the absence (left) and presence (right) of protein
Figure 2.5 Potential protein binding sites mapped to U6 snRNA secondary structure
Figure 2.6 The U6 snRNA retrieval model
Figure 3.1 Yeast plasmid shuffle to introduce mutant U6 snRNA
Figure 3.2 U6 snRNA telestem extension mutant doubling times
Figure 3.3 Free U4 snRNA levels in U6 snRNA telestem extension mutant yeast strains
Figure 3.4 Over-expression of mutant U6 snRNA
Figure 3.5 U6 snRNA G30U/G31U genetic combinations with Prp24 alleles
Figure 3.6 U6 snRNA secondary structure in free U6 snRNP
Figure 3.7 Three-dimensional model of free U6 snRNP
Figure 4.1 Construction of the LSm2-8 co-expression vector
Figure 4.2 Expression of the LSm proteins from pQlink
Figure 4.3 LSm2-8 purification
Figure 4.4 U6 snRNP, LSm1-7, and LSm2-8 purification
Figure 4.5 U6 snRNA transcription cassette
Figure 4.6 Negative stain electron microscopy of the LSm2-8 and U6 snRNP complexes
Figure 4.7 Small angle X-ray scattering envelope for yeast LSm2-8
Figure 4.8 The LSm2-8 crystal structure fits within the SAXS envelope with good agreement

List of Abbreviations

3’ ISL   3’ Intramolecular Stem/loop
3’ ss   3’ Splice Site
5’ ss   5’ Splice Site
AEX   Anion Exchange
ATP   Adenosine Triphosphate
BP   Branch Point
BSA   Bovine Serum Albumin
CAI   Codon Adaptation Index
CSM   Complete Supplemental Media
CTP   Cytidine Triphosphate
DNA   Deoxyribonucleic Acid
DTT   Dithiothreitol
EDTA   Ethylenediaminetetraacetic Acid
EM   Electron Microscopy
FOA   Fluoroorotic Acid
FPLC   Fast Performance Liquid Chromatography
GDP   Guanosine Diphosphate
GTP   Guanosine Triphosphate
HDV   Hepatitis Delta Virus
HH   Hammerhead Ribozyme
IMAC   Immobilized Metal Ion Affinity
IPTG   Isopropyl 1-thio-β-D-galactopyranoside
Kd   Dissociation Constant
LIC   Ligase Independent Cloning
LSm   Like-Sm
mRNA   Messenger Ribonucleic Acid
NMR   Nuclear Magnetic Resonance
NTC   Nineteen Complex
NTR   Nineteen Complex Related
OD   Optical Density
PCR   Polymerase Chain Reaction
PDB   Protein Data Bank
PEG   Polyethylene Glycol
pre-mRNA   Precursor Messenger Ribonucleic Acid
Prp   Pre-mRNA Processing
RNA   Ribonucleic Acid
RNAse   Ribonuclease
RNP   Ribonucleoprotein
RRM   Ribonucleic Acid Recognition Motif
RT-PCR   Reverse Transcription Polymerase Chain Reaction
SAXS   Small Angle X-Ray Scattering
SDS   Sodium Dodecyl Sulphate
SEC   Size Exclusion Chromatography
smFRET   Single Molecule Fluorescence Resonance Energy Transfer
snRNA   Small Nuclear Ribonucleic Acid
snRNP   Small Nuclear Ribonucleoprotein
TEV   Tobacco Etch Virus
tRNA   Transfer Ribonucleic Acid
UTP   Uridine Triphosphate
UV   Ultraviolet

Acknowledgements

First and foremost I would like to thank my primary supervisor, Dr. Stephen Rader, for bringing me into his lab and introducing me to the structural and mechanistic unknowns in the field of pre-mRNA splicing. Stephen has provided a training environment that is rich in curiosity and drive for understanding, which has played a major role in my development as an innovative, meticulous, and critical scientist.
I would also like to thank the members of my supervisory committee, Dr. Eric Jan, Dr. Natalie Strynadka and Dr. Vivien Measday for their critical assessment of my work, helpful questions, and guidance through my projects.

I would like to acknowledge Dr. Richard Fahlmann, Dr. Calvin Yip, Dr. Sean McKenna, and Dr. Trushar Patel for assisting in critical data collection, in particular with respect to the biophysical work presented throughout this dissertation.

I would like to thank past and current members of the Rader lab for their ongoing criticism of my ideas; these critiques have been invaluable in strengthening my work. In particular I would like to acknowledge Dr. Martha Stark, William Dunn, Tara Wong, Amy Hayduk, and Paul Kalke for asking the tough questions that forced me to revisit my ideas and to abandon some altogether. I would also like to thank Dr. Andrea Gorrell for her assistance in molecular modeling.

I would like to thank NSERC and UBC for granting me the financial support that was necessary to get me through this degree through a Post Graduate Scholarship and Four Year Fellowship respectively. I am also very grateful to UNBC for awarding me the UNBC scholars undergraduate tuition award, which was instrumental in sparking my interest in pursuing science as a career.

Most importantly, I would like to thank my husband, William Dunn, for his ongoing support and encouragement over the years, and my sons Link and Sly for forcing me to take a break from work every day. I would like to thank my parents, Rick and Kim Chester, for their encouragement from a very young age, and my siblings, Adrianne, Chris, and Jen, for keeping things interesting outside of the lab.

Dedication

To My Family

1 Introduction

1.1 Precursor Messenger RNA Splicing

Precursor messenger RNA (pre-mRNA) splicing was first reported in the late 1970s when a number of research groups observed a discrepancy between the length of a mature mRNA and its corresponding genomic DNA sequence when viewed under the electron microscope (Berget et al. 1977, Chow et al. 1977, Kinniburgh et al. 1978, Tilghman et al. 1978). Formation of what was termed an “R-loop”, the hybridization of the mature mRNA with the genomic DNA fragment encoding the gene (Thomas et al. 1976, White & Hogness 1977), revealed that regions of the DNA sequence were missing in the mature mRNA transcript. It is now widely accepted that this observation is the result of an RNA processing event that is unique to eukaryotic organisms. This processing, known as pre-mRNA splicing, is a critical step in gene expression that involves the removal of intervening regions of pre-mRNA sequence (introns) and joining of the coding regions (exons) into one continuous mRNA transcript. In addition to 5’-end capping and 3’-end polyadenylation, splicing of the transcript results in the formation of a mature mRNA that can then be exported into the cytoplasm where the sequence is translated into the encoded protein.

Intron-containing genes are prevalent across eukaryotes to varying degrees. Approximately 0.5% of the genes of the unicellular red alga, Cyanidioschyzon merolae, and less than 5% of the genes of the yeast Saccharomyces cerevisiae, contain introns, with only a few of those genes containing more than a single intron and no gene containing more than two (Matsuzaki et al. 2004, Spingola et al. 1999). In contrast, approximately 25% of the human genome (95% of the human transcriptome) contains introns, with each gene composed of an average of 8.8 exons and 7.8 introns (Venter et al. 2001, Sakharkar et al. 2004).
The presence of multiple exons in a pre-mRNA transcript allows for the selective inclusion/exclusion of each intron and exon to generate a variety of protein isoforms through a process known as alternative splicing. These isoforms can have related or completely different functions, such as those observed for various protein kinase C isoforms that function antagonistically in the assembly of tight junctions (Andreeva et al. 2006).

An extreme example of alternative splicing is found in the gene for the Drosophila melanogaster axon guidance receptor, Dscam, which contains 95 alternative exons, allowing for the potential expression of 38 016 different protein isoforms (Schmucker et al. 2000). While it is unlikely that all of these isoforms are produced, the alternative splicing of those that are is regulated by both the developmental stage of the organism and the tissue type in which the protein is being expressed (Celotto & Graveley 2001). Temporal and spatial regulation of alternative splicing is not unique to D. melanogaster. In humans, over 22 000 tissue-specific, alternatively spliced mRNAs have been identified through deep sequencing of 15 different tissue types, underscoring the importance of pre-mRNA splicing for proteome diversity in humans (Wang et al. 2008).

With the ability to generate so many different proteins from a single pre-mRNA transcript, it is not surprising that errors in pre-mRNA splicing have been linked to a wide variety of human diseases. As an example, consider the myelin proteolipid protein, PLP, which is the main component of myelin, the protective layer surrounding axons in the nervous system. The PLP gene contains an alternate 5’ splice site that allows for the production of two protein isoforms, PLP, which contains a signaling factor that is critical for maintenance of axonal integrity, and DM20, which lacks this functionality (Gudz et al. 2002).
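Returning briefly to the Dscam example above, the figure of 38 016 potential isoforms can be checked with simple arithmetic. The sketch below assumes that the 95 alternative exons fall into four clusters of mutually exclusive exons containing 12, 48, 33, and 2 alternatives, and that exactly one exon from each cluster is included in a given transcript. These cluster sizes are not stated in the text and are supplied here as an assumption; the names `ALTERNATIVE_EXONS_PER_CLUSTER` and `count_isoforms` are hypothetical and used only for illustration.

```python
from math import prod

# Assumption (not given in the text above): the 95 alternative exons of Dscam
# are organized into four clusters of mutually exclusive exons.
ALTERNATIVE_EXONS_PER_CLUSTER = {
    "exon 4": 12,
    "exon 6": 48,
    "exon 9": 33,
    "exon 17": 2,
}


def count_isoforms(cluster_sizes):
    """Count transcripts when exactly one exon is chosen from each cluster.

    The choices are independent, so the cluster sizes multiply.
    """
    return prod(cluster_sizes.values())


total_alternative_exons = sum(ALTERNATIVE_EXONS_PER_CLUSTER.values())
potential_isoforms = count_isoforms(ALTERNATIVE_EXONS_PER_CLUSTER)

print(total_alternative_exons)  # number of alternative exons across clusters
print(potential_isoforms)       # number of combinatorially possible isoforms
```

Under these assumptions the counts agree with the 95 alternative exons and 38 016 isoforms quoted from Schmucker et al. (2000). The calculation gives only an upper bound on combinatorial diversity and says nothing about which isoforms are actually produced.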
The PLP/DM20 ratio is tightly regulated throughout development and mutations that alter this ratio result in degenerative disorders (Hobson et al. 2006). This example, along with many others, demonstrates that a critical step in developing treatments for a variety of human diseases is to understand the detailed mechanism of pre-mRNA splicing. Only then can diseases that arise due to errors in pre-mRNA splicing be targeted therapeutically.

1.2 Yeast as a Model Organism

While only 5% of Saccharomyces cerevisiae genes contain an intron, the mRNAs derived from those genes represent approximately one third of the mRNA population in living cells, suggesting that splicing in these cells contributes substantially to the normal expression of yeast genes (Ares et al. 1999). These intron-containing genes tend to have only a single intron, bypassing the requirement for alternative splicing and regulatory factors found in most other eukaryotes. Consequently, the yeast spliceosome is much less complex and undergoes far simpler regulation, making yeast an excellent model organism to study splicing. Since splicing factors are highly conserved across eukaryotes, the structural and mechanistic insights gleaned from yeast are generally transferable to humans, shedding light on the wide variety of splicing anomalies that can lead to splicing-related diseases. Additionally, a number of genetic and biochemical tools can be employed in the yeast system that cannot be used in the mammalian system, allowing for a more targeted approach to addressing important questions. For these reasons, the work presented here utilizes S. cerevisiae as a model study system.

1.3 The Spliceosome

Eukaryotic pre-mRNA splicing is catalyzed by the spliceosome, a large complex of five small nuclear RNAs (snRNAs; U1, U2, U4, U5, and U6) and more than one hundred core proteins (Jurica & Moore 2003).
Many of these proteins associate specifically with an snRNA to form small nuclear ribonucleoprotein (snRNP) particles, while others associate with the spliceosome in an snRNA-independent manner that is often mediated through protein-protein interactions. Assembly of the spliceosome on a newly transcribed pre-mRNA substrate requires the addition of four major splicing subcomplexes: U1 snRNP, U2 snRNP, the pre-formed U4/U6•U5 triple snRNP, and the Prp19 Complex (Cheng & Abelson 1987, Hoskins et al. 2011). Through numerous structural rearrangements, the spliceosome interacts dynamically with the transcript, recognizing the splice site signals and positioning the pre-mRNA substrate in a favorable orientation for the splicing reactions to proceed (Brody & Abelson 1985, Grabowski et al. 1985).

Recent real-time kinetic analyses of spliceosome assembly using multi-wavelength fluorescence microscopy support a long-held view that spliceosome assembly occurs through the highly ordered association of subcomplexes with the transcript (Cheng & Abelson 1987, Hoskins et al. 2011). These studies show that commitment of the transcript to splicing increases as assembly progresses, and that association of the subcomplexes with the transcript is reversible (Hoskins et al. 2011). Notably, although higher order splicing complexes such as a penta-snRNP have been purified and characterized (Stevens et al. 2002), Crawford et al. (2013) find no evidence for the association of preformed complexes with the pre-mRNA. Thus, such higher-order complexes probably represent a stable association of the constituents already assembled on pre-mRNA substrates. Regardless of whether a bona fide penta-snRNP exists independent of substrate in vivo, such a species would presumably have to undergo the same conformational and compositional rearrangements outlined in the stepwise assembly model in order to ensure that all proofreading stages are passed so that high-fidelity splicing can be achieved.
The details of spliceosome assembly, activation, catalysis, and disassembly will be discussed here, focusing on the findings in the model organism S. cerevisiae unless otherwise stated.

1.4 Association of U1 snRNP with the Pre-mRNA Transcript

Spliceosome assembly is guided through the recognition of three key sequence elements in the pre-mRNA transcript: the 5’ splice site (5’ss), located at the boundary of the 3’ end of the 5’ exon and the intron, the branch point adenosine (BP), located approximately two-thirds of the way into the intron, and the 3’ splice site (3’ss), located at the boundary of the intron and the 5’ end of the 3’ exon (Fig 1.1). The sequences at these sites are generally well conserved from yeast to human, where in almost all cases the intron begins with a GT dinucleotide and ends with an AG dinucleotide (Spingola et al. 1999, Burset et al. 2000). Exceptions to this so-called GT-AG rule can be found both within and across species; for example, five S. cerevisiae and three S. pombe introns begin with the dinucleotide GC (Spingola et al. 1999, Wood et al. 2002). In addition, the sequence context at the splice sites and branch point is very important in budding yeast where the consensus sequences GTATGT, TACTAAC (where the 3’-most A is the branch nucleotide), and YAG (where Y is a pyrimidine), at the 5’ss, BP, and 3’ss respectively, are adhered to very closely in S. cerevisiae and other closely related fungal organisms (Bon et al. 2003, Neuvéglise et al. 2011).

Figure 1.1. Eukaryotic gene architecture. The 5’ splice site (5’ss), branch point sequence (BP, reactive adenosine underlined), and 3’ splice site consensus sequences are indicated within the intron located between the exons (boxed).

Assembly of the spliceosome begins with the association of U1 snRNP with the pre-mRNA transcript through a base-pairing interaction between the 5’ end of U1 snRNA and the 5’ss of the transcript (Fig. 1.2; Siliciano & Guthrie 1988, Crawford et al. 2013).
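As an aside, the three budding yeast consensus elements quoted above (GTATGT, TACTAAC, and YAG) lend themselves to a simple computational check. The following is a minimal sketch, assuming exact matches to the consensus, DNA rather than RNA letters, and that the branch nucleotide is the 3’-most A of TACTAAC. The helper `annotate_intron` and the example sequence are hypothetical and are not taken from any real gene or published tool.

```python
import re

# Consensus elements as quoted in the text (DNA alphabet).
FIVE_SS = "GTATGT"                  # 5' splice site, at the start of the intron
BRANCH_POINT = "TACTAAC"            # branch point sequence
THREE_SS = re.compile(r"[CT]AG$")   # YAG at the end of the intron (Y = C or T)

# Assumption: the branch nucleotide is the 3'-most A, i.e. offset 5 in TACTAAC.
BRANCH_A_OFFSET = 5


def annotate_intron(intron):
    """Locate the branch A in an intron that matches all three consensus elements.

    Returns (branch_a_index, nucleotides_downstream_of_branch_a), using a
    0-based index, or None if any element is missing. Exact matching only.
    """
    intron = intron.upper()
    if not intron.startswith(FIVE_SS):
        return None
    if THREE_SS.search(intron) is None:
        return None
    bp_start = intron.rfind(BRANCH_POINT)  # take the 3'-most occurrence
    if bp_start == -1:
        return None
    branch_a = bp_start + BRANCH_A_OFFSET
    downstream = len(intron) - branch_a - 1  # nucleotides from branch A to the 3'ss
    return branch_a, downstream


# Made-up intron: 5'ss + filler + branch point + pyrimidine-rich spacer + 3'ss.
example_intron = "GTATGT" + "A" * 20 + "TACTAAC" + "T" * 27 + "TAG"
print(annotate_intron(example_intron))
```

Exact matching is deliberately strict. It would reject the handful of introns that begin with GC rather than GT, as well as any intron with a degenerate branch point sequence, so a realistic search would need to tolerate mismatches.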
This association proceeds in the absence of ATP hydrolysis, and is dependent on the presence of an intact 5’ss that maintains base pairing at intron positions one and five, but not at position four (Fig. 1.2; Siliciano & Guthrie 1988, Crawford et al. 2013). Notably, even though the 5’ss consensus sequence in mammals is far more degenerate than in yeast, the first ten nucleotides of U1 snRNA are invariant across eukaryotes (Guthrie & Patterson 1988). In mammals, the site of cleavage at the 5’ss is determined by complementarity to U1 rather than by the intron sequence, with specific cleavage occurring opposite the C8-C9 nucleotides of U1 (Weber & Aebi 1988). This is not the case for S. cerevisiae, where authentic 5’ss cleavage appears to require a G at position five of the intron, along with the U1 snRNP specific proteins Nam8 and Luc7, which stabilize the 5’ss/U1 interaction through contacts with the intron and 5’ exon respectively (Siliciano & Guthrie 1988, Puig et al. 1999, Puig et al. 2007).

Figure 1.2. Pre-spliceosome assembly. ATP-independent addition of U1 snRNP and the BBP/Mud2 dimer to the 5’ss and BP respectively (top), followed by the ATP-dependent addition of U2 snRNP to the BP (bottom). Proteins are indicated by large rectangles; those unique to S. cerevisiae are indicated with an asterisk, while those that are absent are indicated with a dashed oval. Base pairing interactions between RNA nucleotides are indicated by vertical lines.

U1 snRNA is fairly well conserved across eukaryotes, consisting of an almost invariant short single stranded 5’ end, three stem loop structures (stems I, II, and III) that are closed by a long range interaction, a single stranded region containing the Sm protein binding site, and a terminal stem loop (stem IV) (Fig. 1.2; Guthrie & Patterson 1988). Stem III is highly divergent in the hemiascomycetous yeasts, ranging from a short stem of 14 nucleotides in Y.
lipolytica to a long un-branched stem of 104 nucleotides in the Candida species, and a long multi-branched stem loop in S. cerevisiae (Mitrovich & Guthrie 2007). Intriguingly, this large insertion, referred to as the U1 snRNA fungal domain (Guthrie & Patterson 1988), is accompanied by the presence of several yeast-specific U1 snRNP proteins. Prp42, which is thought to have arisen as a duplication of the yeast-specific protein Prp39 in a common ancestor of S. cerevisiae and C. albicans, might interact with the extended stem III, since a homolog in Y. lipolytica does not exist (Mitrovich & Guthrie 2007). Stem III of Y. lipolytica is more similar in size to most other eukaryotes that also lack a Prp42 homolog (Fabrizio et al. 2009). Likewise, the S. cerevisiae specific protein Snu56 might associate with U1 snRNA through its extended and branched Stem III (Mitrovich & Guthrie 2007).

Of the ten U1 snRNP specific proteins identified in S. cerevisiae, seven have mammalian homologs, although not all of the mammalian homologs associate specifically with U1 snRNP, and the mode of interaction between the protein and U1 snRNA has not necessarily been conserved (Mitrovich & Guthrie 2007, Fabrizio et al. 2009). For example, the human homolog of Mud1, U1A, interacts with human U1 snRNA through direct contacts between its amino-terminal RNA recognition motif (RRM) and the loop nucleotides of U1 stem loop II (Oubridge et al. 1994). While this binding interaction appears to be conserved in C. albicans, the nucleotide sequence has become quite degenerate in S. cerevisiae, accompanied by an insertion of thirty amino acids and degeneration of the surrounding amino acid sequence in the Mud1 RRM, suggesting that the mode of interaction between U1 snRNA and Mud1 in S. cerevisiae is different from in humans (Mitrovich & Guthrie 2007).
Interestingly, the opposite situation is observed for the human protein homolog of Snp1, U1-70K, which binds the loop residues of stem loop I in humans (Surowy et al. 1989). In this case, the S. cerevisiae interaction is invariant while C. albicans shows some sequence degeneration at the site of interaction (Mitrovich & Guthrie 2007).

In addition to recognizing the 5’ss of splicing substrates, U1 snRNP has been proposed to play a role in increasing splicing fidelity. Single molecule fluorescence resonance energy transfer (smFRET) experiments have revealed that the 5’ss and BP are held apart upon U1 snRNP association with the transcript, as demonstrated by a reduction in FRET efficiency upon U1 snRNP binding (Crawford et al. 2013). Furthermore, these sites remain separated during spliceosome assembly up to the point of spliceosome activation (Crawford et al. 2013). These authors propose that this additional role for U1 snRNP, to physically separate chemically reactive groups, is crucial to ensuring that splicing cannot occur until the spliceosome has assembled correctly. Indeed the U5 snRNP protein Prp28 has been shown to play a role in proofreading at the 5’ss, an event that would necessarily occur later in spliceosome assembly, i.e. once the triple-snRNP has assembled onto the splicing substrate, but before spliceosome activation (Yang et al. 2013).

1.5 Recognition of the Branch point

Prior to recruitment of the other major splicing complexes to the pre-mRNA substrate, the BBP-Mud2 heterodimer binds the BP sequence in an ATP-independent manner, making direct contacts with the pre-mRNA in this region (Fig. 1.2; Abovich et al. 1994, Wang et al. 2008). In most species, including most fungi, the homologous BBP-Mud2 complex contains a third protein, U2AF1, which interacts with the AG dinucleotide located at the 3’ss (Wu et al. 1999). Notably, U2AF1 is not present in S. cerevisiae. U2AF1 is highly conserved, when present, with the S.
pombe and human proteins showing 75% similarity (Käufer & Potashkin 2000). The presence of U2AF1 appears to correlate with a short distance between the BP and 3’ss (less than 15 nucleotides), suggesting that this heterotrimer is responsible for identifying both the BP and 3’ss (Neuvéglise et al. 2011). In species such as S. cerevisiae, where this distance is much longer (on average 30 nucleotides) and contains a conserved tract of polypyrimidines (PPT) near the 3’ss, association of the BBP-Mud2 complex is 3’ss independent, and interactions between the BBP-Mud2 complex and the BP and PPT appear to be stronger (Rymond & Rosbash 1985, Neuvéglise et al. 2011).

Commitment of a splicing substrate to the splicing pathway requires the stable association of U1 snRNP and the BBP-Mud2 complex at the 5’ss and BP regions respectively, although commitment complex formation is reversible (Legrain et al. 1985, Crawford et al. 2013). One of the key features of the commitment complex is the formation of a bridge connecting the 5’ss and BP through protein-protein contacts that involve a direct physical interaction between BBP and the U1 snRNP specific protein Prp40 (Abovich & Rosbash 1997, Schwer et al. 2013). The presence of homologs of these bridging proteins, Prp40, BBP, and Mud2, in S. pombe and humans suggests that the cross-talk between the 5’ and 3’ regions of the intron is important at very early stages of intron recognition and spliceosome assembly across eukaryotes (Käufer & Potashkin 2000). Once this network of contacts has been established, the assembling spliceosome is then ready to accept the U2 snRNP complex.

Stable association of U2 snRNP with the pre-mRNA to form the pre-spliceosome is the first ATP-dependent step in spliceosome assembly (Fig. 1.2; Crawford et al. 2013). Two different ATPases, Sub2 and Prp5, are required at this stage to allow direct base-pairing interactions between U2 snRNA and the intron to form (Parker et al.
1987, Kistler & Guthrie 2001, O’Day et al. 1996). Sub2 is thought to function in the removal of Mud2 and BBP from the pre-mRNA, exposing the BP region of the transcript, while Prp5 appears to play a role in U2 snRNA remodeling to make the U2 snRNA BP-binding sequence more accessible. Interestingly, association of the Prp9/Prp11/Prp21 complex (SF3a complex in humans) with U2 snRNA is required for Prp5 activity, and RNase H treatment of U2 snRNA has shown that prior to assembly onto the transcript, three different Prp9/Prp11/Prp21-dependent and Prp5-dependent U2 conformations exist (Wiest et al. 1996). Association of the Prp9/Prp11/Prp21 complex with U2 converts a more open U2 snRNA that is unable to form pre-spliceosomes into a more closed particle that becomes the Prp5 substrate (Wiest et al. 1996). Association of Prp5 and subsequent ATP hydrolysis then converts U2 snRNP into a second more open conformation that is competent for association with the BP of the pre-mRNA (Wiest et al. 1996).  U2 snRNA is composed of four stem loop structures, the first three of which are separated by short single stranded regions (Fig. 1.2; Guthrie & Patterson 1988). Like U1 snRNA, U2 snRNA is highly conserved across eukaryotes, except in S. cerevisiae where a large insertion of approximately 1kb, referred to as the U2 fungal domain, replaces the third stem loop (Guthrie & Patterson 1988). Surprisingly, the entire fungal domain can be deleted without affecting growth in yeast, and a yeast U2 snRNA deletion can be complemented with human U2 snRNA (Shuster & Guthrie 1988, Shuster & Guthrie 1990). In contrast to U1 snRNP, the U2 snRNP components are much more highly conserved, with eleven proteins specifically associating with U2 snRNA in both yeast and humans in addition to the Sm protein core (Fabrizio et al. 2009). Stable association of U2 snRNP with the pre-mRNA involves direct U2 snRNP protein contacts with the pre-mRNA upstream of the BP in addition to base pairing (Gozani et al. 
1996, Gozani et al. 1998).

1.6 Assembly of the U4/U6•U5 Triple snRNP

In contrast to U5 and U6 snRNPs, which can exist as free particles, U4 is almost always found in association with U6 in either the U4/U6 di-snRNP or in the U4/U6•U5 tri-snRNP (Fig. 1.3; Cheng & Abelson 1987, Fortner et al. 1994). This association involves an extensive base pairing interaction that spans U6 nucleotides 55-80 and U4 nucleotides 1-17 and 57-64, generating U4/U6 stem I and stem II (Fig. 1.3; Brow & Guthrie 1988). The U4/U6 di-snRNP protein complement is small, comprised only of the U6 associated LSm proteins, the U4 associated Sm proteins, and four proteins found in the di-snRNP that are not part of free U6 snRNP (Stevens et al. 1999, Fabrizio et al. 2009). It is not clear whether these four proteins, Prp31, Prp3, Prp4, and Snu13, associate with U4 snRNA in a free U4 snRNP particle since isolation and characterization of free U4 snRNP has not been possible due to its very low abundance. Alternatively, these proteins might recognize and bind the U4/U6 duplex at some point during U4/U6 di-snRNP formation.

Figure 1.3. Triple snRNP assembly. snRNA secondary structures are shown for U4 and U6 (top) and U5 (middle) with the stems and loops labeled as in the text. Associated proteins are shown in large rectangles.

The U6 snRNP-specific protein Prp24 is the only other protein found associated with U6 snRNA in free U6 snRNP, aside from the Lsm2-8 protein complex that binds the 3’ uridine-rich tail of U6 (Fig. 1.3; Stevens et al. 2001, Mayes et al. 1999). While Prp24 does not stably associate with the U4/U6 di-snRNP, the rate of base-pair formation between U4 and U6 snRNAs is greatly enhanced in its presence (Shannon & Guthrie 1991, Raghunathan & Guthrie 1998). It is not yet understood how Prp24 facilitates this interaction, but it is known to do so in an ATP-independent manner (Raghunathan & Guthrie 1998).
Surprisingly, the structure of yeast Prp24, which consists of four RRMs, is strikingly different from the mammalian homolog, SART3, which is three times as large and consists of only two RRMs and a long amino-terminal extension not present in S. cerevisiae (Bell et al. 2002, Rader & Guthrie 2002). The first two RRMs of Prp24 most closely resemble the SART3 RRMs, and in both yeast and humans have been shown to bind U6 snRNA with high affinity (Bell et al. 2002, Kwan & Brow 2005). The S. pombe homolog is similar in size to SART3, consisting of a large amino-terminal extension in addition to the four RRMs that are common among most other fungi (Rader & Guthrie 2002). Whether the additional RRMs in yeast Prp24 perform a similar function to the amino-terminal extension found in other homologs remains to be determined.  Unlike U1 and U2 snRNAs, neither fungal U4 nor U6 snRNAs deviate in size relative to other eukaryotes (Guthrie & Patterson 1988). Essentially all of the very little size variation in U6 snRNA is found in the 5’ stem loop where the length of the stem can vary by several base pairs (Brow & Guthrie 1988). In addition to the size conservation, U6 snRNA exhibits a striking level of primary sequence conservation with close to 80% sequence identity across the middle third of the RNA across eukaryotes (Brow & Guthrie 1988). This region of U6 engages in base pairing interactions with U4, and consequently it is not surprising that the corresponding region of U4 snRNA is highly conserved in primary sequence as well. Outside of this region, however, the primary sequence is quite degenerate (Guthrie & Patterson 1988). On the U4 side of the U4/U6 duplex, stems I and II are interrupted by a stem loop structure, the 5’ stem loop, which has been absolutely conserved in structure from yeast to humans even though the nucleotide sequence in the stem differs at almost every position (Fig. 1.3; Guthrie & Patterson 1988). 
This high level of phylogenetic co-variation argues for an important function for this structure, which has been shown to bind the protein Snu13 (Vidovic et al. 2000). U4 snRNA also contains a 3’ stem loop that varies substantially across eukaryotes, followed by the Sm protein-binding site and, in most eukaryotes excluding S. cerevisiae, a final stem loop structure (Guthrie & Patterson 1988).

In order for the U4/U6 di-snRNP to assemble onto the pre-mRNA, it must first associate with U5 snRNP to form the U4/U6•U5 tri-snRNP complex (Fig. 1.3). U5 snRNA can be divided into two major domains: the 5’ domain, which contains a complex stem loop structure, and the 3’ domain, which contains the single stranded Sm protein binding site followed by a 3’ stem loop that varies in both size and sequence (Fig. 1.3; Guthrie & Patterson 1988). In S. cerevisiae there are two functional forms of U5, a short and long form, with the short form terminating just prior to the 3’ stem loop structure (Patterson & Guthrie 1987). The 5’ stem loop is comprised of a long stem loop (loop 1) that is broken into three segments by two internal loops, IL1 and IL2, and a stem loop on the 5’ side of IL2 that is unique to S. cerevisiae (Guthrie & Patterson 1988). Loop 1, which makes direct contacts with the exon junction (Sontheimer & Steitz 1993, Newman et al. 1995), exhibits extreme sequence conservation such that nine of eleven nucleotides are invariant across eukaryotes. C. albicans is an exception, in which two additional nucleotides have been reported to deviate from the loop 1 consensus sequence (Mitrovich & Guthrie 2007).

Free U5 snRNA associates with eight different proteins, in addition to the heptameric Sm protein ring, to form the free U5 snRNP; all of them have a mammalian homolog (Stevens et al. 2001, Fabrizio et al. 2009). While Brr2, Prp8, and the only known spliceosomal GTPase, Snu114 (Fabrizio et al.
1997), are the key players in spliceosome activation, they also appear to play a major role in U5 snRNP formation and stability (Dix et al. 1998). Prp8 physically contacts U5 snRNA on both sides of IL1 and IL2, as well as loop 1 of the 5’ stem loop, while Snu114 contacts the 5’ side of IL2 (Dix et al. 1998). The stability of Prp8 depends on its ability to interact with Snu114, and stable interaction between these proteins requires the binding, but not hydrolysis, of GTP by Snu114 (Brenner & Guthrie 2006). Stable association of the GTP-bound Snu114/Prp8 dimer with U5 snRNA is required for stable association of Brr2, which interacts directly with Prp8, but not with U5 snRNA or Snu114 (Dix et al. 1998, Brenner & Guthrie 2006). Once fully formed, this free U5 snRNP particle then associates with the U4/U6 di-snRNP, and, upon addition of five other proteins, generates the U4/U6•U5 tri-snRNP complex (Fig. 1.3; Stevens et al. 2001).

1.7 Spliceosome Activation

Activation of the spliceosome requires large conformational and compositional changes that result in the loss of U1 and U4 snRNPs, the acquisition of the Prp19 Complex (NTC), and formation of the catalytic core of the spliceosome (Fig. 1.4; Cheng & Abelson 1987, Hoskins et al. 2011). Each of these occurrences is precisely regulated and has been described as an allosteric cascade, in which the execution of one event is dependent on a conformational change associated with the previous event (Brow 2002). In some cases this is the exchange of a base-pairing partner for a mutually exclusive partner, and in other cases it involves a structural rearrangement in a protein that changes the accessibility of an interaction domain. The key drivers of these rearrangements during spliceosome activation are the U5 snRNP associated DExD-box RNA helicase proteins Prp28, which acts on the 5’ss/U1 snRNA interaction, and Brr2, which acts on the U4/U6 di-snRNA interaction (Fig. 1.4; Raghunathan & Guthrie 1998b).
The U5 snRNP proteins Prp8 and Snu114 regulate the activity of both helicases through a delicate and finely tuned feedback system (Brenner & Guthrie 2005, Small et al. 2006).

Figure 1.4. Spliceosome activation. Triple snRNP addition to the pre-spliceosome (top) is followed by U1 and U4 dissociation and NTC addition (bottom). snRNP particles and protein complexes are indicated by large rectangles.

It is not yet clear how or what recruits the U4/U6•U5 tri-snRNP to the assembling spliceosome, although initial association of this complex probably involves protein-protein interactions between U1 and U5 snRNP proteins. Yeast two hybrid interactions have been reported between the U1 snRNP proteins Prp40 and Snp1 and the U5 snRNP proteins Prp8 and Brr2 respectively, suggesting that these interactions allow the tri-snRNP complex to dock with the pre-spliceosome (Abovich & Rosbash 1997, Fromont-Racine et al. 1997). Stable association of the tri-snRNP with the pre-spliceosome is guided largely by Prp8, which stabilizes an interaction between loop 1 of U5 snRNA and the exons (Dix et al. 1998). Prior to docking with the pre-spliceosome, Prp8 and Snu114 inhibit Prp28 and Brr2 helicase activity through a mechanism that is not well understood; however, contact between the tri-snRNP and U1 snRNP proteins upon docking induces a large structural rearrangement in the C-terminus of Prp8 that results in the activation of both helicases (Kuhn et al. 1999, Kuhn & Brow 2002, Brenner & Guthrie 2005).

The first major rearrangement during spliceosome activation is the exchange of U1 snRNA for U6 snRNA at the 5’ss, a process that requires ATP hydrolysis and Prp28 (Fig. 1.4; Staley & Guthrie 1999). Under normal wild type conditions, the yeast protein yU1C stabilizes the U1 snRNA/5’ss duplex. Mutations that alter either yU1C or U1 snRNA in the 5’ss binding region, however, render Prp28 dispensable for splicing, and, indeed, for cell viability (Chen et al. 2001).
Since these mutations act to destabilize the interaction between U1 snRNP and the 5’ss, Prp28 is thought to function as an antagonist to yU1C, destabilizing its interaction with the pre-mRNA to provide a more suitable environment for U6 snRNA binding to the 5’ss (Chen et al. 2001). Formation of the U6 snRNA/5’ss duplex promotes the complete dissociation of U1 snRNP from the 5’ss (Kuhn et al. 1999, Chen et al. 2001). Notably, extending the U1 snRNA/5’ss interaction by several base pairs inhibits the switch for U6 snRNA and stalls spliceosome assembly (Staley & Guthrie 1999). This inhibition can be reversed by lengthening the U6 snRNA/5’ss interaction by several base pairs, suggesting that U1 and U6 compete for binding to the 5’ss, resulting in an equilibrium between the two bound states (Staley & Guthrie 1999). Prp28 appears to play a role in proofreading the stability of the U6 snRNA/5’ss interaction, rejecting suboptimal 5’ss pre-mRNAs by sending them down a discard pathway (Yang et al. 2013).

A second major structural rearrangement during spliceosome activation is the disruption of the U4/U6 di-snRNP. Several lines of evidence suggest that unwinding of the U4/U6 duplex is tightly coupled to destabilization of U1 snRNP at the 5’ss. First, when U1 snRNA/5’ss unwinding is blocked by extending base pairing, U4/U6 duplex unwinding is also blocked (Staley & Guthrie 1999). Second, in the presence of a mutation that extends stem I of the U4/U6 di-snRNA to include the 5’ss binding region of U6 snRNA, U4/U6 unwinding is impeded and U1 snRNP is retained in a stalled spliceosome assembly intermediate (Li & Brow 1996, Kuhn et al. 1999). A mutation in Prp8 is capable of suppressing the conditional phenotype generated by the stem I-lengthening mutation, suggesting that U4/U6 unwinding is triggered by Prp8 only after stable association of U6 snRNA with the 5’ss (Kuhn et al. 1999, Staley & Guthrie 1999).
Such a system would ensure that catalytic structures do not form prior to correct identification of the 5’ss, ensuring splicing fidelity during first step catalysis (Staley & Guthrie 1999).  While Prp8 is involved in regulating U4/U6 unwinding, it is Brr2 that plays an active role in unwinding the duplex (Raghunathan & Guthrie 1998b). In the absence of ATP or in the presence of a mutation in the helicase domain of Brr2, U4/U6 unwinding is inhibited (Raghunathan & Guthrie 1998b, Maeder et al. 2009). Genetic studies have implicated Prp8 as a negative regulator of Brr2, and in recent years some of the details of the mechanism of regulation have begun to surface (Kuhn et al. 2002). Specifically, the RNase H-like domain of Prp8 interacts directly with U4 and U6 snRNAs in single-stranded regions adjacent to U4/U6 stem I, the same region of U4 that is required for loading Brr2 onto the duplex (Mozaffari-Jovin et al. 2012). Prp8 and Brr2 physically interact with the same region of U4 snRNA in a mutually exclusive manner, with Prp8 blocking U4/U6 unwinding by preventing Brr2 from binding (Mozaffari-Jovin et al. 2012). A high-resolution crystal structure has revealed that Prp8 further blocks Brr2 activity by inserting its C-terminal tail into the RNA binding tunnel of Brr2, inhibiting the ATP-dependent helicase activity of Brr2 (Mozaffari-Jovin et al. 2013).  Once Brr2 has loaded onto U4 snRNA, it translocates along U4 to unwind U4/U6 stem I (Hahn et al. 2012, Mozaffari-Jovin et al. 2013). It is not yet clear how stem II is unwound, since Brr2 would encounter the protein-bound U4 snRNA 5’ stem loop before reaching stem II. It is possible that Brr2 continues to translocate along U4 snRNA, displacing proteins as they are encountered, and finally unwinding stem II (Nielsen & Staley 2012). 
Alternatively, Brr2 might somehow jump the 5’ stem loop to immediately unwind stem II following stem I unwinding, or it might not be involved in stem II unwinding at all (Nielsen & Staley 2012). What is known is that following release of U4 snRNA, Brr2 activity must be turned off to allow formation of the catalytic center of the spliceosome. Snu114 appears to function as a regulator of Brr2, since Brr2 activity is repressed when Snu114 is bound to GDP (Small et al. 2006). Importantly, while GTP hydrolysis is not required for U4/U6 unwinding, it is required for U4 snRNA release from the assembling spliceosome (Bartels et al. 2003, Small et al. 2006). Thus the hydrolysis of GTP following U4/U6 unwinding might trigger the release of the destabilized U1 and U4 snRNPs from the assembled spliceosome, likely by influencing the physical contacts between proteins at the core of the spliceosome.

Following the release of U1 and U4, the NTC is recruited to stabilize the assembled spliceosome during spliceosome activation (Fig. 1.4; Chan et al. 2003). The NTC is composed of Prp19 and at least seven other Prp19-associated proteins, which assemble into the pre-formed NTC prior to association with the spliceosome (Chen & Cheng 2012). Binding of the NTC results in the destabilization of the LSm complex of proteins from the 3’ tail of U6 snRNA, allowing these U6 snRNA nucleotides to interact with an intronic region of the substrate near the 5’ss (Chan & Cheng 2005). Cross-links between U6 snRNA and the NTC component Cwc2 have led to the proposal that Cwc2 serves to link the NTC to the spliceosome (McGrail et al. 2009). Interestingly, Prp19 itself contains a ubiquitin ligase motif at its N-terminus, and might regulate aspects of the splicing cycle through its ability to add ubiquitin to various proteins (Ohi et al. 2003). Indeed, Prp19 has been shown to ubiquitinate the U4/U6-associated protein Prp3 in humans, influencing tri-snRNP stability (Song et al. 2010).
Further, ubiquitin is necessary for splicing in yeast, as inhibition of ubiquitin’s ability to interact with other proteins through ubiquitin mutation, or the presence of an inhibitory small molecule, reduces splicing by reducing tri-snRNP levels (Bellare et al. 2008).

The catalytic core of the spliceosome is formed through base pairing between the U2 and U6 snRNAs. Notably, these interactions are mutually exclusive with U4/U6 interactions, supporting the proposal that U4 snRNA acts as a negative regulator of U6 snRNA, masking U6 nucleotides so that catalytic features of the active site do not form prematurely (Brow & Guthrie 1989). Specifically, the stem I region of U6 base pairs to U2 snRNA to form U2/U6 helix I, and the stem II region of U6 folds back on itself to generate an intramolecular stem loop structure known as the 3’ ISL (Madhani & Guthrie 1992, Fortner et al. 1994). Interestingly, the C-terminal region of Prp8 influences U6 3’ ISL formation and/or stability, highlighting the importance of Prp8 throughout the splicing cycle (Kuhn et al. 2002). It is not yet clear whether U2/U6 helix I forms before, after, or at the same time as the U6 3’ ISL; however, unwinding of U4/U6 stem I prior to stem II suggests that correct association of U2 and U6 snRNAs might be a prerequisite to stem II unwinding and 3’ ISL formation. Once these catalytically important structures have formed, the spliceosome is considered to be fully activated and ready to splice a substrate.

1.8 Catalytic Steps

The splicing reaction consists of two sequential transesterification reactions separated by a period of spliceosomal remodeling. In the first reaction, the 2’ hydroxyl of a bulged adenosine found in the branch site consensus sequence of the intron reacts with the phosphodiester bond at the 5’ss (Fig. 1.5a; Padgett et al. 1984, Konarska et al. 1985).
This results in the formation of an unusual 2’-5’ phosphodiester linkage joining the 5’ end of the intron to the branch point adenosine, with concomitant liberation of the 5’ exon (Padgett et al. 1984, Konarska et al. 1985). In the second step, the 3’ hydroxyl group of the 5’ exon reacts with the phosphodiester bond at the 3’ss, joining the 5’ and 3’ exons through a standard 3’-5’ phosphodiester linkage, with concomitant release of the intron in the form of a lariat (Padgett et al. 1984, Konarska et al. 1985). Both chemical steps are inferred to proceed through an in-line SN2 nucleophilic reaction based on the inversion of stereochemistry observed at the chiral phosphates (Maschhoff & Padgett 1993, Moore & Sharp 1993). While the spliceosome is composed largely of proteins, a long-held view is that the splicing reactions might actually be catalyzed by the highly conserved snRNAs located at the catalytic core of the spliceosome (Madhani & Guthrie 1992).  A catalytic function for U6 snRNA has been suspected for decades, not only because of its high level of sequence and size conservation, but also because of mechanistic and structural similarities to group II self-splicing introns (Madhani & Guthrie 1992, Peebles et al. 1995). At the active site of the spliceosome, U6 snRNA adopts a conformation that resembles the active domain of group II introns through formation of U2/U6 helix I and the U6 3’ ISL (Fig. 1.5b). U2/U6 helix I contains an invariant AGC triad that is required for exon ligation (Madhani & Guthrie 1992, Fabrizio & Abelson 1992, Hilliker & Staley 2004, Lee et al. 2010). The AGC triad is also present in a base-paired structure in group II introns (Fig. 1.5c), and like the spliceosome, has a strict requirement for a purine at the second position (Peebles et al. 1995, Hilliker & Staley 2004). 
Interestingly, in both systems the AGC triad is less tolerant of mutation than the complementary nucleotides to which it is base paired, demonstrating an important role for these nucleotides in addition to base pairing (Madhani & Guthrie 1992, Peebles et al. 1995). Directly adjacent to U2/U6 helix I lies the 3’ ISL, which, like domain V of group II introns, contains a small internal bulge on the 3’ side of the stem (Fabrizio & Abelson 1992, Lee et al. 2010, Peebles et al. 1995).

Figure 1.5. Catalysis. (a) Nucleophilic reaction of the 2’-hydroxyl of the branch point adenosine (circled) with the 5’ splice site liberates the 5’ exon concomitantly with the lariat intron/3’ exon intermediate (middle). Nucleophilic reaction of the 3’-hydroxyl of the 5’ exon with the 3’ splice site results in the ligation of the exons and release of the intron in the form of a lariat (right). (b) RNA interactions between U6 (above), U2 (below), and pre-mRNA (left) at the first chemical step. Catalytic Mg2+ ion binding sites are indicated with an asterisk, and the AGC triad is marked with a bar. (c) Catalytic domain V of a group II intron. The AGC triad is marked with a bar.

Mechanistically, pre-mRNA splicing and group II self-splicing are identical: both are Mg2+-dependent processes that result in the removal of a lariat intron (Peebles et al. 1986, van der Veen et al. 1986, Cech 1986). Steitz and Steitz (1993) have proposed a two metal ion mechanism for these reactions in which one metal ion activates the sugar hydroxyl, while the other metal ion directly coordinates and stabilizes the oxyanion leaving group. To date, three Mg2+ ion binding sites have been identified in U6 snRNA: one in the AGC triad, one at position U80 in the internal loop of the 3’ ISL, and a third in the almost invariant ACAGAGA sequence, which base pairs to the 5’ splice site of the pre-mRNA transcript and is located 5’ of U2/U6 helix I (Fabrizio & Abelson 1992, Lee et al. 2010).
In order for these Mg2+ ions to work in concert during the splicing reactions, the 3’ ISL must be closely juxtaposed with the 5’ splice site of the pre-mRNA transcript. Chemical structure probing of assembled spliceosomes has shown that this is indeed the case, with all three of these Mg2+ binding sites located in close proximity to position ten of the intron prior to the first catalytic step (Rhode et al. 2006). This constrains the structure of the active core in three dimensions, placing all three Mg2+ ions close to the reactive groups for the first chemical step. Following the first reaction, the accessibility of the 3’ ISL changes, supporting the view that some level of spliceosomal remodeling occurs between the two splicing reactions (Rhode et al. 2006).

While the exact role of the Mg2+ ion that is coordinated at each site has not yet been elucidated, Yean et al. (2000) showed that substitution of a phosphorothioate at position U80 in the 3’ ISL yields fully assembled but catalytically inactive spliceosomes. Only in the presence of more thiophilic metal ions does splicing proceed, demonstrating that it is the splicing reaction, not spliceosome assembly, that requires this Mg2+ ion. While metal substitution fails to restore splicing in a phosphorothioate-substituted internal loop of a group II intron (Gordon & Piccirilli 2001), Tb3+ ion cleavage at this position suggests that this is indeed a site of metal ion coordination (Sigel et al. 2000). Notably, Fica et al. (2013) showed that U6 snRNA catalyzes both splicing reactions by positioning Mg2+ ions that are critical to stabilize the leaving groups, confirming Steitz and Steitz’s (1993) original proposal. Further, a reaction that resembles pre-mRNA splicing has been performed in the presence of Mg2+ ions in a protein-free system consisting of regions of U2 and U6 snRNA that make up the proposed catalytic domain (Valadkhan & Manley 2001).
Thus the spliceosome can be considered a metallo-enzyme in which U6 snRNA plays a key role in coordinating these metal ions.

In addition to metal ion coordination by snRNAs at the active site of the spliceosome, Mg2+ ions are coordinated by protein components, although a direct role in catalysis for these metal ions has not been shown. Prp8 is the largest spliceosomal protein (260 kDa), containing RNase H-like, endonuclease-like, and reverse transcriptase-like domains, none of which are catalytically active (Jackson et al. 1988, Pena et al. 2008, Dlakic & Mushegian 2011). Different first and second step conformations for Prp8 have been suspected for some time based on genetic findings (Collins & Guthrie 1999, Umen & Guthrie 1996), and recent structural work with human Prp8 has revealed a subtle difference in Prp8 conformation in which one state, an open form, allows Mg2+ ion coordination in the RNase H-like domain, while the other, the closed form, does not (Schellenberg et al. 2013). The Mg2+-bound open state functions during the second catalytic step, where Mg2+ ion coordination was shown to promote exon ligation (Schellenberg et al. 2013). Schellenberg et al. (2013) suggest that Prp8 might present its Mg2+ ion at the active site along with two other metal ions presented by the snRNAs to generate a three-metal spliceosomal active site as observed for other enzymes that catalyze phosphoryl transfer reactions. In contrast, Abelson (2013) favors a role for this Mg2+ ion in stabilizing the second step active site conformation rather than a direct role in catalysis, given that the RNase H-like domain of Prp8 is catalytically inert.
1.9 Spliceosome Remodeling Between Catalytic Steps I and II

A general theme in spliceosome remodeling between the catalytic steps is beginning to emerge in which the components that are required for each step are present throughout both splicing reactions, but with substantial ‘toggling’ of these components to generate the appropriate active site for each step. For example, U2 snRNA toggles between two mutually exclusive stem structures: stem IIa and stem IIc (Fig. 1.6a). Stem IIc is required for catalysis of both steps of the splicing reaction, while stem IIa is required for spliceosome assembly and substrate rearrangement between the two catalytic steps (Hilliker et al. 2007). Thus, U2 toggles between these two conformations to allow spliceosome assembly and catalysis to proceed, and there is evidence to suggest that the RNA-dependent helicase Prp16 plays a role in this interchange (Fig. 1.6b; Hilliker et al. 2007). Similar events have been reported for active site protein components in which the affinity for protein binding in the spliceosome toggles between low and high affinity states. For example, Prp16 and Slu7 bind the activated spliceosome through low affinity entry sites that are converted to high affinity binding sites following the first catalytic step, when the action of these proteins is required (Ohrt et al. 2013).

Like spliceosome assembly and activation, promotion of each splicing reaction requires several RNA-dependent ATPases, which probably facilitate the formation of the step one and step two active sites. Interaction of Prp2 with the intron prior to the first catalytic step is required for the splicing reactions to proceed (Fig. 1.6b), and, in addition to making direct contacts with the pre-mRNA, Prp2 interacts with the carboxyl-terminus of Brr2 (Liu & Cheng 2012). This interaction has been proposed to allow for recruitment of Prp2 to the pre-catalytic spliceosome (Liu & Cheng 2012).
Contact between Prp2 and Brr2 promotes the ATPase activity of Prp2, which results in the displacement of nine of eleven U2 snRNP-associated proteins (the SF3a and SF3b complexes in humans) through a mechanism that is not yet understood (Warkocki et al. 2009, Lardelli et al. 2010, Liu & Cheng 2012). The presence of the U2 snRNP-associated proteins at the BP region of the pre-mRNA may mask the reactive 2’-hydroxyl of the branch point adenosine until the spliceosome has correctly formed the step one active site. Removal of these proteins by Prp2 exposes the 2’-hydroxyl in a conformation that is compatible with in-line reaction with the phosphodiester bond at the 5’ss (Lardelli et al. 2010). Notably, these U2 snRNP proteins can be isolated in a particle containing U2 snRNA when purified spliceosomes are disassembled, suggesting that the U2 snRNP proteins, while displaced from the branch point for the first catalytic step, might remain loosely associated with the spliceosome throughout the splicing reactions (Fourmann et al. 2013).

Figure 1.6. Spliceosomal remodeling between the chemical steps. (a) Conformational toggling in U2 snRNA, showing the switch between stem-loop IIa (left) and the mutually exclusive stem IIc (right), which lengthens stem-loop IIa. (b) ATP hydrolases of the helicase family associated with the chemical steps of splicing and spliceosome disassembly.

Following the first catalytic step of splicing, the spliceosome re-positions the substrate for the second catalytic step, and the key driver of this remodeling event is the RNA-dependent ATPase, Prp16 (Fig. 1.6b; Schwer & Guthrie 1992). Prp16 is required specifically for the second catalytic step where it influences 3’ss cleavage and exon ligation; however, it has been shown to associate with the spliceosome in an ATP-independent manner prior to the first catalytic step to stabilize binding of the protein Cwc25 at the branch point (Schwer & Guthrie 1991, Tseng et al. 2011).
Following the first catalytic step, Prp16 functions in an ATP-dependent manner to displace Yju2 and Cwc25 to allow for the association of the second step splicing factors Slu7, Prp18, and Prp22 (Tseng et al. 2011). Notably, Cwc25 is not displaced by Prp16 alone, but requires the stable association of Slu7 and Prp18, which are required to dock the 3’ss into the step two active site of the spliceosome (Ohrt et al. 2013). Interestingly, exon ligation can occur in the absence of Slu7 and Prp18 when the distance between the branch point and 3’ss is short; however, both proteins are required when this distance is longer than seven nucleotides (Brys & Schwer 1996, Ohrt et al. 2013).  In a genetic study, Mefford & Staley (2009) showed that Prp16 acts to destabilize U2/U6 helix I after the first catalytic step. However, since helix I integrity is important for both catalytic steps, they proposed that helix I reforms prior to second step catalysis. This is reminiscent of a second region of U2 snRNA discussed previously that undergoes toggling between the stem IIa and stem IIc conformations throughout the splicing cycle (Hilliker et al. 2007). Following 5’ss cleavage, Prp16 has been proposed to disrupt the stem IIc catalytic conformation by destabilizing stem IIc itself, as well as to destabilize interactions that are mutually exclusive with stem IIa, thereby promoting stem IIa formation (Hilliker et al. 2007). While the specific Prp16 substrate has yet to be identified, it is tempting to speculate that Prp16’s role in displacing Yju2 and Cwc25 is an indirect consequence of Prp16 unwinding various U2 snRNA duplexes. Unwinding these structures would relax the catalytic core of the spliceosome, allowing for substrate re-positioning, while reformation of the snRNA structures would result in stable formation of the step-two active site.  
Once the splicing reactions have been completed, the mature mRNA product must be released from the spliceosome, and Prp22 is the RNA-dependent ATPase responsible for promoting this event (Fig. 1.6b; Company et al. 1991, Schwer & Gross 1998). Like many of the ATPases encountered so far, Prp22 performs both an ATP-independent and an ATP-dependent function in splicing. The ATP-independent function is not well characterized, but is only required when the distance between the BP and the 3’ss is greater than twenty nucleotides (Schwer & Gross 1998, Schwer 2008). This function is required prior to execution of the second step, when Prp22 has been proposed to act in concert with Slu7 and Prp18 to position the 3’ss and 3’-hydroxyl of the 5’-exon for catalysis (Schwer 2008). Site specific cross-links and RNase H protection of the mRNA downstream of the exon-exon junction in the presence of Prp22 suggest that a conformational rearrangement following the second catalytic step places Prp22 on the mRNA at this location (Schwer 2008). Prp22 then acts to unwind the mRNA/U5 snRNA duplex, releasing the mRNA from the spliceosome using the energy generated through ATP hydrolysis (Schwer & Gross 1998, Schwer 2008). Following mRNA release, Slu7, Prp18, and Prp22 dissociate from the spliceosome (James et al. 2002).

1.10 Spliceosome Disassembly

After a substrate has been spliced, the spliceosome undergoes disassembly, resulting in the separation of U2, U5, U6, the NTC, and the lariat intron, and thereby allowing the spliceosomal components to be recycled for subsequent rounds of splicing (Fig. 1.7; Tsai et al. 2005). The DExD/H box RNA helicase Prp43, which associates with Ntr1 and Ntr2 to form the NTR complex, is responsible for promoting spliceosome disassembly in an ATP-dependent manner following mRNA release (Arenas & Abelson 1997, Tsai et al. 2005).
Prp43 helicase activity is greatly enhanced through its interaction with Ntr1, demonstrating that Ntr1 is an accessory factor that is required to regulate Prp43 activity (Tanaka et al. 2007). Prp43 is recruited to the spliceosome through an interaction between Ntr2 and the U5 snRNP-component Brr2 (Tsai et al. 2007). Since Brr2 is present early in spliceosome assembly and throughout both catalytic steps, it is notable that binding of Ntr2 is competitively inhibited by the presence of Prp16 and Slu7, ensuring that spliceosome disassembly is not prematurely triggered through early association of the NTR complex with Brr2 (Chen et al. 2013).

Figure 1.7. Spliceosome disassembly mediated by Prp43 and the NTR complex. The excised intron is shown as a lariat with the BP adenosine circled. Proteins and snRNP particles are indicated by rectangles.

Whether or not Brr2 helicase activity is required during intron release and spliceosome disassembly is up for debate. Small et al. (2006) reported that in a GTP-bound state, Snu114 derepresses Brr2 activity after the second catalytic step, resulting in intron release and spliceosome disassembly in much the same way as observed for U4/U6 unwinding during spliceosome assembly. This model presents another example of toggling throughout the splicing cycle, whereby hydrolysis of GTP to GDP results in repression of Brr2 activity following U4/U6 unwinding; subsequent exchange of the GDP for a new GTP following the splicing reactions derepresses Brr2 to allow spliceosome disassembly. Indeed, the RNA-dependent ATPase activity of Brr2 is preferentially stimulated by annealed U2/U6, suggesting that the U2/U6 duplex could be a Brr2 substrate (Xu et al. 1996). However, Fourmann et al. (2013) recently showed that while Prp43 is necessary and sufficient for spliceosome disassembly, Brr2 is not required.
Since Brr2 activity is specifically dependent on ATP hydrolysis, the fact that spliceosome disassembly proceeded as efficiently in the presence of UTP, CTP and GTP as it did in the presence of ATP strengthens the argument that Brr2 activity is not required at this step (Fourmann et al. 2013).

The conflicting results reported by Small et al. (2006) and Fourmann et al. (2013) could reflect the different study systems used by the two groups. Fourmann et al. (2013) devised a purified splicing system with which stalled activated spliceosomes were isolated from an extract, followed by addition of recombinantly expressed and purified first and second step protein factors. The consequences of protein addition were then observed. In contrast, Small et al. (2006) used a tagged Prp43 to pull the Prp43-containing complex out of a whole cell extract where potential endogenous factors reside that might play a role in splicing but have not yet been identified. It is possible that Prp43 activity destabilizes the spliceosome substantially, enough so that in the purified system, Brr2 activity is dispensable. In the absence of Brr2 activity, for example in the presence of UTP, the workload for Prp43 might increase to complete disassembly. In the Small et al. (2006) complex, other factors might contribute to the stability of the disassembling spliceosome such that Brr2 activity is required for efficient disassembly. Further experimentation will be required to reconcile these differences.

1.11 Splicing Fidelity

Since pre-mRNA splicing involves the removal of intervening sequences and ligation of protein coding sequences to generate a continuous translation template, splicing must proceed with single nucleotide precision to avoid introducing nucleotide insertions or deletions that would result in the translation of frame-shifted, aberrant products. The spliceosome has evolved a number of proofreading mechanisms to ensure fidelity throughout assembly and catalysis.
In these, the spliceosome acts to promote splicing of optimal substrates, while antagonizing splicing of suboptimal substrates (Semlow & Staley 2012). One proofreading mechanism for which there is support in splicing is kinetic proofreading, originally described independently by Hopfield (1974) and Ninio (1975) in the translation field. In splicing, kinetic proofreading has been observed in early spliceosome assembly, where U2 snRNP association with the branch point, and exchange of U1 for U6 at the 5’ss, are proofread by Prp5 and Prp28 respectively. Other examples have been found through the first and second catalytic steps (Xu & Query 2007, Yang et al. 2013, Burgess & Guthrie 1993, Villa & Guthrie 2005, Mayas et al. 2006).

In the kinetic proofreading model, energy is expended to allow for inspection of the substrate before allowing the substrate to proceed down a productive pathway. Optimal substrates undergo reaction quickly, while the time required for reaction of suboptimal substrates is longer (Fig. 1.8). Several splicing ATPases, such as Prp16 and Prp22, have been implicated as ‘timers’ during these proofreading stages, in which splicing of optimal substrates proceeds more rapidly than the ATPase acts (Fig. 1.8a; Burgess & Guthrie 1993, Mayas et al. 2006). As a consequence, hydrolysis of ATP promotes a conformational change that shuffles the substrate down a productive pathway. However, when suboptimal substrates are encountered, ATP hydrolysis occurs more rapidly than the splicing reaction (Fig. 1.8b). This results in a conformational change in the spliceosome that promotes the rejection of the substrate through a discard pathway. Discrimination between fast and slow substrates may be based in part on the spliceosome’s ability to discriminate between substrates that are positioned correctly for the chemical steps and those that are not (Chua & Reed 1999).

Figure 1.8. Kinetic proofreading of the first catalytic step of splicing.
(a) An optimal substrate, in which reaction of the BP adenosine (circled) with the 5’ss is faster than ATP hydrolysis by Prp16, leading to dissociation of Cwc25 and Yju2 after the first chemical step has occurred. (b) A suboptimal pre-mRNA in which Prp16 ATP hydrolysis occurs before the first chemical step, leading to premature dissociation of Cwc25 and Yju2, and subsequent Prp43-mediated disassembly.

The role of Prp16 in kinetic proofreading during the first catalytic step has been well characterized and serves as an excellent example of proofreading by the spliceosome. Proofreading at this stage involves kinetic competition between the Prp16-dependent release of Cwc25 and the first transesterification reaction (Fig. 1.8a; Tseng et al. 2011). When the splicing machinery encounters an optimal substrate, the transesterification reaction proceeds more rapidly than the removal of Cwc25, and thus Cwc25 is displaced by Prp16 after the first catalytic step, thereby making way for second step splicing factors. In the case of a suboptimal substrate containing branch point mutations, however, ATP hydrolysis by Prp16 occurs more rapidly than the transesterification reaction, resulting in the premature release of Yju2 and Cwc25 from the spliceosome prior to completion of the first transesterification reaction (Fig. 1.8b; Tseng et al. 2011). Discard of the suboptimal substrate at this point involves the disassembly factor Prp43 (Koodathingal et al. 2013). In fact, Prp43-mediated spliceosome disassembly can be initiated after the action of Prp2, Prp16, or Prp22, following their dissociation from the spliceosome when a suboptimal substrate is encountered. This suggests that Prp43 plays a more general role in discarding suboptimal substrates throughout catalysis, in addition to disassembling the spliceosome following splicing of an optimal substrate (Chen et al. 2013).
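The kinetic competition described in this section can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that the first transesterification and Prp16 ATP hydrolysis each behave as a single first-order step with a constant rate, so that the fraction of complexes completing chemistry before the ATPase acts is k_chem / (k_chem + k_atpase). The function name and all rate values below are hypothetical and are not taken from the studies cited above.

```python
def fraction_reacting_first(k_chem: float, k_atpase: float) -> float:
    """Probability that chemistry wins a race against ATP hydrolysis.

    Illustrative assumption: both events are single first-order steps
    with constant rates, so the race is decided by the ratio of rates.
    """
    if k_chem < 0 or k_atpase < 0:
        raise ValueError("rates must be non-negative")
    total = k_chem + k_atpase
    if total == 0:
        raise ValueError("at least one rate must be positive")
    return k_chem / total


# Hypothetical rate constants in arbitrary units, not measured values.
K_ATPASE = 1.0
for label, k_chem in [("optimal", 10.0), ("suboptimal", 0.1)]:
    spliced = fraction_reacting_first(k_chem, K_ATPASE)
    print(f"{label}: {spliced:.2f} proceed, {1 - spliced:.2f} discarded")
```

Under these assumed rates, most optimal substrates complete the first step before the timer fires, whereas most suboptimal substrates are routed to discard. A real spliceosome likely involves multiple steps and reversible states, so this two-rate race should be read only as a conceptual aid.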
1.12 General Aims of this Dissertation

Over the last several decades much work has been completed to understand the chemical mechanism of the splicing reactions and the composition of the spliceosome, which is responsible for catalyzing these reactions. Despite this wealth of information, very little is known about the exact role of many splicing factors, and even less is known about the mechanisms through which these factors function. With recent advances in the technology used to study splicing, we now have an opportunity to investigate and explore questions that could not be addressed previously. For example, we are seeing a shift from analyzing bulk splicing in whole cell extracts to monitoring the fate of individual substrates by fluorescence microscopy. As a result of this transition, we are already beginning to understand the kinetics of individual steps, and the order of association and dissociation events, with greater depth. These types of inquiries, along with progress in atomic-resolution structure determination of splicing complexes, will lead to a better understanding of the intricate details of the splicing cycle.

A major step toward understanding how splicing factors function within higher order complexes, such as the free snRNP particles and spliceosome intermediates, will be the atomic resolution structure determination of these complexes. Solving these structures will shed light on the nature of the interactions that stabilize them and will lead to a better understanding of their mechanism of action. Unfortunately, such structural work is often limited by the inability to acquire sufficient quantities of high purity complexes through the reconstitution of recombinantly expressed splicing factors.
The work presented here aims to address this problem through the development of a recombinant expression system that allows for the simultaneous expression of multiple splicing factors from a single expression vector, followed by purification of the pre-formed complex under non-denaturing conditions. While the procedure can be easily adapted for other complexes, this work focuses on free U6 snRNP and its major subcomplex, the LSm complex. The structural work presented here is supplemented with genetic and biochemical findings to further strengthen a model for U6 snRNA secondary structure and its activation for splicing.   Chapter two of this dissertation begins with an in depth critical analysis of the literature pertaining to free U6 snRNP and U4/U6 di-snRNP formation, highlighting inconsistencies between the current model and observations from the literature. Next I present a model of U6 snRNA secondary structure in free U6 snRNP that is dramatically different from all models proposed previously, and I propose a mechanism for activation of U6 snRNA for incorporation into the assembling spliceosome that is dependent on its interaction with U4 snRNA. I also suggest that the U6 snRNP-specific protein, Prp24, functions in recycling U4/U6 di-snRNP by retrieving U6 snRNA from disassembling spliceosomes, holding U6 snRNA in a secondary structure that is favorable for interaction with U4 snRNA. In chapter three, a region of U6 snRNA known in the literature as the telestem was investigated through an exhaustive genetic mutation analysis in which doubling times and RNA levels were assessed in the presence of mutations that were predicted to affect telestem stability. I found that the lower portion of this structure, the telestem extension, probably does not form a stem as proposed, but serves as an important protein binding site within free U6 snRNP. 
I then modeled the secondary structure of U6 snRNA in silico, and, guided by observations from the literature and from our findings reported earlier in the chapter, built in the protein components of the snRNP. In chapter four I introduce a method for co-expression and co-purification of a pre-formed, recombinant U6 snRNP along with low-resolution structural characterization of these particles. Notably, these particles were identical to those that have been purified from yeast at the resolution obtained by electron microscopy, building our confidence that the purification scheme that I have developed here allows for the isolation of functional multi-protein particles. Finally, all of this work is brought together in a short general discussion to illustrate the progress that I have made in understanding the structure/function relationship of several splicing complexes: U6 snRNP, U4 snRNP, and the U4/U6 di-snRNP.

2 U6 snRNA in Free U6 snRNP and its Activation for Splicing

*A version of this chapter has been published in Fungal RNA Biology and has been adapted with permission (Dunn & Rader 2014).

2.1 The Karaduman Model of U6 snRNA Secondary Structure and its Implications

Since the discovery of yeast U6 snRNA in 1988, the model of its secondary structure in free U6 snRNP has undergone substantial revisions to arrive at what is now widely accepted in the literature and will be referred to here as the Karaduman model (Brow & Guthrie 1988, Karaduman et al. 2006). The Karaduman model is composed of two stems, the 5’ stem/loop and the 3’ intramolecular stem/loop (3’ ISL), which are separated by a largely unstructured region spanning nucleotides 30-62 (Fig 2.1; Karaduman et al. 2006). The 5’ stem/loop and 3’ ISL are phylogenetically conserved and have been proposed in all models of U6 snRNA secondary structure in all organisms for which the sequence is known (Epstein et al. 1990, Hamm & Mattaj 1989, Fortner et al.
1994, Brow & Vidaver 1995, Vidaver et al. 1999, Ryan et al. 2002, Karaduman et al. 2006). The region between these stems has changed considerably over the last three decades in an attempt to better reflect new data as it becomes available. This region was originally proposed to form the central stem/loop, a phylogenetically unproven feature that was later suggested to fold into a pseudoknot structure, followed by the telestem, and finally a slight variation of the telestem (Fortner et al. 1994, Vidaver et al. 1999, Ryan et al. 2002, Karaduman et al. 2006). The general trend over the years has been a decrease in proposed structure throughout this region, to the point where the Karaduman model now predicts a large asymmetric bulge composed of nucleotides 40-62 and 85-91 (Karaduman et al. 2006).  The 5’ stem/loop is a highly conserved structural element that varies both in size and sequence, and is responsible for essentially all of the difference in total length of U6 snRNA across species (Guthrie & Patterson 1988). While all other spliceosomal snRNAs are transcribed by RNA polymerase II and acquire a 5’ 7-methylguanosine cap structure shortly after transcription is initiated (Will & Lührmann 2001 and references therein), U6 snRNA is transcribed by RNA polymerase III and is not capped in this way, instead carrying a γ-monomethyl-phosphate 5’-end (Singh & Reddy 1989). Importantly, when the 5’ stem/loop is disrupted such that the terminal nucleotide is accessible, U6 snRNA undergoes posttranscriptional 5’ 7-methylguanosine capping within the nucleus (Kwan et al. 2000). However, neither the capping nor the disruption of the 5’ stem/loop show adverse effects on splicing as inferred from growth assays of U6 constructs bearing various 5’ stem/loop truncations or internal deletions that resulted in efficient capping (Kwan et al. 2000). 
Thus the 5’ stem/loop of U6 snRNA is dispensable for splicing and appears to instead be important in directing correct 5’-end formation during transcription and maturation of U6 snRNA in yeast, and will not be considered further here (Kwan et al. 2000).

Figure 2.1. Models of S. cerevisiae U6 snRNA secondary structure in free U6 snRNP. The Fortner model (top left; Fortner et al. 1994), the Vidaver model (top right; Vidaver et al. 1999), the Karaduman model (bottom left; Karaduman et al. 2006), and the Dunn model (bottom right; Dunn & Rader 2010). Structural elements key to each model are labeled. Base pairing in some models is intentionally not shown to reflect the original model.

The 3’ ISL is a critical component of the active spliceosome, coordinating an essential Mg2+ ion at position U80 that sits closely juxtaposed with the 5’ splice site within the active site of the spliceosome (Fig 2.2; Yean et al. 2000, Rhode et al. 2006, Fica et al. 2013). This structural element has also been proposed in all models of U6 snRNA secondary structure in free U6 snRNP prior to and including the Karaduman model. The implications of this are huge: since U6 snRNA must transit through the U4/U6 di-snRNP particle en route to the assembling spliceosome, the 3’ ISL in free U6 snRNP must unwind completely to form an extensive base pairing interaction with U4 snRNA (Fig 2.2; Fortner et al. 1994). Later, following U4 snRNA dissociation during spliceosome activation, these nucleotides are then proposed to regenerate the 3’ ISL for catalysis (Fig 2.2). Thus the Karaduman model, and all models previous to it, suggest that U6 snRNA undergoes a number of very large structural rearrangements throughout the splicing cycle in order to accommodate the wide variety of mutually exclusive interactions that involve U6 snRNA (Fortner et al. 1994).
These rearrangements have been described as a sequence of allosteric events that are part of a much larger cascade toward activation of the spliceosome (Brow 2002).

Figure 2.2. U6 snRNA conformational rearrangements throughout the splicing cycle. Free U6 snRNP containing the 3’ ISL (top), U4/U6 di-snRNP (middle), and the pre-catalytic spliceosome (bottom). The unwound 3’ ISL is indicated with a bar in U4/U6 di-snRNP. Position U80, which coordinates a catalytic Mg2+ ion, is indicated with an asterisk in the pre-catalytic spliceosome.

Experimental support for the 3’ ISL is plentiful; however, many of these data are ambiguous in that they cannot be definitively assigned to U6 snRNA in free U6 snRNP. For example, a large body of data has been acquired through in vivo genetic experiments that do not reveal the biochemical step (e.g. stage of splicing, transcription, translation) at which the mutation exerts its effect. The mutation could have disrupted critical interactions within free U6 snRNP, U4/U6 di-snRNP, U4/U6•U5 tri-snRNP, the active spliceosome, or some more transient intermediate particle, and so the assignment of genetic results to a source particle must be done with caution. As an example, consider the U6 snRNA mutation A62G, which was predicted to hyperstabilize the 3’ ISL, and which resulted in a cold sensitive growth phenotype at 18°C (Fortner et al. 1994). This growth phenotype was expected, because at depressed temperatures, unwinding of the hyperstabilized stem would be even less efficient than at 30°C, resulting in reduced levels of U4/U6 di-snRNP and presumably fewer functional spliceosomes. Fewer spliceosomes result in less splicing, which is expected to have a negative effect on the growth rate.

Consistent with this line of reasoning, the A62G mutation was found to produce less than half the wild type levels of U4/U6 di-snRNP (Fortner et al. 1994).
However, upon further inspection of the data, it became clear that the story was much more complex. Over-expression of U4 snRNA was expected to drive U4/U6 complex formation by mass action, restoring the U4/U6 di-snRNP levels and correcting the growth phenotype. However, when U4 snRNA was expressed to levels that were four-fold higher than wild type, U4/U6 complex was restored to at least wild type levels, but the cold sensitive growth phenotype was only partially rescued (Fortner et al. 1994). A second 3’ ISL hyperstabilizing mutation, A62U/C85A, generated a much stronger cold sensitive phenotype despite introducing a weaker hyperstabilizing base pair (i.e. U-A vs. G-C), and resulted in only a very slight reduction in U4/U6 di-snRNP levels (Fortner et al. 1994, Vidaver et al. 1999). Thus the 3’ ISL hyperstabilizing mutations that have been introduced have probably affected some process downstream of U4/U6 di-snRNP formation.

Perhaps the most striking 3’ ISL genetic results are those from human U6 snRNA, which strongly suggest that 3’ ISL formation is most critical in the assembled spliceosome (Wolff & Bindereif 1993). Like A62G, mutations that were expected to hyperstabilize the human 3’ ISL resulted in greatly reduced levels of U4/U6 di-snRNP, assembled spliceosomes, and splicing, consistent with an inability of the 3’ ISL to efficiently unwind to base pair with U4 (Wolff & Bindereif 1993). However, in addition to this, mutations that were predicted to destabilize the 3’ ISL and at the same time to either have no effect on or to increase the stability of U4/U6 di-snRNP, resulted in wild type levels of U4/U6 di-snRNP and assembled spliceosomes, but splicing was reduced to less than 10% (Wolff & Bindereif 1993). In a third category of mutation, a double point mutant was made that was predicted to not affect the stability of free U6, but was predicted to decrease the stability of U4/U6 di-snRNP (Wolff & Bindereif 1993).
This mutation resulted in wild type levels of U4/U6 di-snRNP and assembled spliceosomes; however, splicing was again reduced (Wolff & Bindereif 1993). Thus formation of the 3’ ISL is critical for splicing to occur, and while these data are not inconsistent with 3’ ISL formation in free U6 snRNP, they do strongly suggest that the observed phenotypes in both the yeast and mammalian systems might originate at a stage later than free U6 snRNP unwinding.  One possible explanation for the phenotypes observed for various U6 3’ ISL mutants is that the mutations exert their effect solely in the context of the active spliceosome, and not within free U6 snRNP. This would imply that the 3’ ISL does not form in free U6 snRNP, and consequently the interpretation of the observed phenotypes must be re-evaluated. Mutations that destabilize the 3’ ISL would impede its formation during spliceosome assembly, resulting in fewer splicing competent spliceosomes, and consequently low levels of splicing. However, as observed in the mammalian system (Wolff & Bindereif 1993), U4/U6 di-snRNP levels would be expected to remain comparable to wild type since the mutation affects a process downstream of U4/U6 formation. Mutations that increase the stability of the 3’ ISL would promote stem formation in active spliceosomes; however, unwinding of the stem during spliceosome remodeling between the two chemical steps, or during spliceosome disassembly might be impeded. This effect would be more pronounced at lower temperatures, hence the cold sensitive growth phenotype observed for the A62G and A62U/C85A yeast mutants. In this case, U6 snRNA would stall in late stages of splicing and therefore not be recycled for subsequent rounds of splicing. If U6 snRNA is not recycled, then U4/U6 levels would be expected to fall and free U4 would accumulate. 
Thus the growth phenotype and reduced levels of U4/U6 di-snRNP and splicing observed for the 3’ ISL mutations could arise strictly from complications within the spliceosome, and not within free U6 snRNP.

Other data, such as chemical structure probing, are fraught with ambiguity since this type of data indicates whether or not the Watson/Crick face of a single nucleotide is accessible for chemical modification, but does not definitively show that the nucleotide is base paired (since protection from modification could be due to the presence of a protein or to tertiary structure interactions), nor does it reveal the base-pairing partner or anything with regard to non-canonical base pair interactions. Chemical structure probing data from two different yeast studies both indicate that the stem nucleotides of the proposed 3’ ISL are strongly protected from chemical modification while the loop nucleotides are strongly accessible, consistent with 3’ ISL formation in free U6 snRNP (Fig 2.3, 2.4; Jandrositz & Guthrie 1995, Karaduman et al. 2006). One exception to this can be found at nucleotides C67 and A79, which have been proposed to form a mismatch base-pair and are both strongly modified within the stem (Jandrositz & Guthrie 1995, Karaduman et al. 2006). However, this pattern would be expected if the mismatch were mediated through non-Watson/Crick interactions between the two nucleotides, leaving the Watson/Crick positions vulnerable to chemical modification. Since these data are in good agreement with the 3’ ISL structure, alternative structures for this region of U6 snRNA within free U6 snRNP have not been appropriately explored.

Chemical modification data can also be difficult to interpret since different experimental systems can yield quite different modification patterns.
For example, Jandrositz and Guthrie (1995) conducted their modification experiments on U6 snRNP that had been partially purified by glycerol gradient centrifugation, allowing for separation of free U6 snRNP from U4/U6 di-snRNP and active spliceosome. Thus, the modification pattern obtained in this study should represent U6 snRNA found only in free U6 snRNP that has been isolated directly from yeast cells. On the other hand, Karaduman et al. (2006) tandem affinity purified free U6 snRNP from yeast, washing away anything that might still be present and influencing the experiment in the Jandrositz and Guthrie (1995) study. Further, to test naked U6 snRNA, Jandrositz and Guthrie probed U6 snRNA that had been phenol:chloroform extracted under non-denaturing conditions from their U6 snRNP population while Karaduman et al. (2006) probed in vitro transcribed U6 snRNA that had never come into contact with the protein components of the snRNP. If these proteins influence the U6 snRNA structure, then the in vitro transcribed RNA might not fold into the same structure as U6 would in the presence of protein. Consequently, the data that I have presented in figure 2.3 represent data that are consistent between the two studies while the raw data for each study can be found in figure 2.4.

Figure 2.3. Chemical modification data mapped to the Karaduman and Dunn models of U6 snRNA secondary structure. Chemical modification data that were consistent between two different studies, Jandrositz & Guthrie 1995 and Karaduman et al. 2006, were mapped to the Karaduman model (a) and the Dunn model (b). For each model, the data in the absence of protein (left) and the presence of protein (right) are shown, with solid circles highlighting nucleotides that were strongly modified and open circles highlighting nucleotides that were strongly protected from chemical modification. The boxed number beside each model indicates the number of data points that match the model.
The number in brackets for U6 snRNP takes into account four nucleotides that are expected to be single stranded, but protected due to the presence of protein.

Figure 2.4. Raw U6 snRNA chemical modification data in the absence (left) and presence (right) of protein. Data from Jandrositz & Guthrie (1995) and Karaduman et al. (2006) are indicated on the left and right of each grid respectively. Strongly modified nucleotides are shown in black, weakly modified in gray, strongly protected in white, and nucleotides for which there is no information are indicated by an ‘X’. The three columns in the middle of the grid represent the Vidaver et al. (1999) (V), Dunn (D), and Karaduman et al. (2006) (K) models along with their predicted modification (black) or protection (white) patterns.

The 3’ ISL is a very stable stem loop, with an experimentally determined melting temperature of more than 60°C in the absence of protein (Reiter et al. 2003). Structure probing of the sugar-phosphate backbone of U6 snRNA in free U6 snRNP suggests that the U6-snRNP-associated proteins, Prp24 and the LSm complex, do not bind to the region of U6 proposed to form the 3’ ISL since the entire stem is susceptible to cleavage by hydroxyl radicals (Fig 2.5; Ghetti et al. 1995, Karaduman et al. 2006). Thus the 3’ ISL would be expected to remain intact in free U6 snRNP without destabilization due to protein binding in that region, and therefore, the 60°C melting temperature likely describes the stability of this structure in free U6 snRNP accurately. It is puzzling then how the 3’ ISL unwinds during the transition from free U6 snRNP to U4/U6 di-snRNP when there is no requirement for an RNA helicase or energy generated from ATP hydrolysis. Indeed, free U4 and free U6 have been reported to anneal in the absence of protein factors, albeit less efficiently than in the presence of Prp24 and the LSm complex (Raghunathan & Guthrie 1998, Achsel et al. 1999).

Figure 2.5.
Potential protein binding sites mapped to U6 snRNA secondary structures. The Karaduman model (a) and the Dunn model (b). Putative Prp24 binding sites are highlighted with black boxes surrounding the nucleotides (Kwan & Brow 2005). Protection of U6 snRNA from hydroxyl radical cleavage in the presence of Prp24 is indicated by closed circles (Ghetti et al. 1995) and in the presence of Prp24 and the LSm complex by open circles (Karaduman et al. 2006). U6 snRNA cross-links to Prp24 are indicated with a closed star (Karaduman et al. 2006) and U6 snRNA cross-links to the LSm complex are indicated by an open star (Karaduman et al. 2008).

Following the addition of the triple snRNP to the assembling spliceosome, the spliceosome becomes functionally activated in part through the release of U4 snRNA, which allows the 3’ ISL to re-fold (Fig 2.2; Fortner et al. 1994). Although the U4/U6 duplex is not particularly strong compared to the 3’ ISL, with the melting temperature of the deproteinized U4/U6 di-snRNA experimentally determined to be ~ 53°C, unwinding of this duplex requires ATP hydrolysis by the ATP-dependent RNA helicase Brr2 (Brow & Guthrie 1988, Raghunathan & Guthrie 1998b). One could argue that the presence of protein stabilizes the U4/U6 di-snRNA interaction to the point that a helicase is required for RNA unwinding; however, there is currently no evidence to suggest that Brr2 functions in the displacement of proteins. The U4/U6 helix-unwinding requirements described here add further mystery to the U6 3’ ISL unwinding mechanism required for U4/U6 di-snRNP formation.

It is not clear what the functional significance of U4/U6 di-snRNP formation is; however, the universal conservation of this interaction across eukaryotes, and the energetic requirements for U4/U6 disassembly, argue that the interaction must be important, otherwise this step in spliceosome assembly would not have been so tightly retained throughout evolution.
The role for U4 snRNA itself is even less clear since U4 snRNA is an essential gene product, yet it leaves the spliceosome prior to catalysis (Yean & Lin 1991). Since interaction with U6 snRNA is required for incorporation of U6 into functional spliceosomes, U4 has been described as an anti-sense negative regulator of U6 (Brow & Guthrie 1989). Formation of U4/U6 stems I and II is thought to prevent premature formation of the catalytically important U2/U6 helix I and the U6 3’ ISL respectively, until the spliceosome has correctly assembled onto the pre-mRNA transcript (Fig 2.2; Brow & Guthrie 1989). However, this functional description for U4 snRNA is inadequate given that a free U6 snRNP population containing an intact 3’ ISL and a partially accessible 5’ splice site binding region is present in living cells, and a negative regulator of this particle, so far as we know, does not exist. The only proteins that associate with U6 snRNA in free U6 snRNP are Prp24 and the LSm complex (Stevens et al. 2001), neither of which displays the properties of a negative regulator; instead, both promote U4/U6 base pairing (Raghunathan & Guthrie 1998, Achsel et al. 1999, Licht et al. 2008). Despite this attempt to ascribe a function to U4 snRNA, its role in splicing has remained enigmatic since its discovery almost three decades ago.

2.2 U4/U6 di-snRNP Formation and the Role of Prp24

The U6 snRNP-specific protein Prp24 plays a key role in facilitating U4/U6 di-snRNP formation; however, the details of its function are not yet known (Raghunathan & Guthrie 1998). Prp24 is composed of four RNA recognition motifs (RRM), the first two of which are predicted to bind U6 snRNA with high affinity (Kwan & Brow 2005, Bae et al. 2007).
Little is known about the binding mode between U6 and Prp24; however, a triple alanine substitution of three highly conserved residues in the RNP-1 consensus sequence of RRM2 results in lethality, while the equivalent substitutions in RRM3 and RRM4 generate a temperature sensitive growth phenotype (Vidaver et al. 1999, Rader & Guthrie 2002). A triple alanine substitution in RRM1 results in wild type growth; however, a point mutation within the RNP-1 domain of RRM1 also confers a temperature sensitive phenotype (Vidaver et al. 1999, Kwan & Brow 2005). These observations are consistent with destabilization of an RNA-protein interaction and suggest that the canonical RNA binding surface of RRM2 is essential, while the binding surfaces of RRM3 and RRM4 are important, but not critical, and that of RRM1 is dispensable. An alternative explanation for these phenotypes is that the mutation could have simply destabilized the protein fold, as might be the case for the RRM1 point mutation, which generates a protein that is not soluble upon recombinant expression, and therefore would not be expected to fold correctly when expressed in yeast (Dunn and Rader, unpublished). In contrast, chemical shift perturbation analysis of RRM2 and the RRM2 triple alanine substitution suggests that, at least for this RRM, mutation has not affected the structure and temperature sensitivity probably results from a disruption of RNA binding (Martin-Tumasz et al. 2010). These genetic results do not rule out the possibility that the RRMs make some non-canonical contacts with the RNA, nor do they reveal which RNA, U4 or U6, each RRM binds.

Understanding how and where Prp24 interacts with U6 snRNA in free U6 snRNP is expected to provide invaluable insight into its mechanism. A number of partial Prp24 structures have been solved, leading to some understanding of the mode of RNA binding for each RRM.
These include a crystal structure of the first three RRMs (RRM123) resolved to 2.7 Å, as well as NMR solution structures of the first two RRMs (RRM12), RRM4, and RRM2 bound to a hexanucleotide sequence from U6 snRNA (Bae et al. 2007, Martin-Tumasz et al. 2010, Martin-Tumasz et al. 2011). RRM4 is considered to be an occluded RRM (oRRM), since two α-helices flanking the anti-parallel β-sheet were found to lie across it, rendering the β-sheet inaccessible for RNA binding (Martin-Tumasz et al. 2011). Consequently, if oRRM4 does contact RNA directly, it either does so in a non-canonical fashion, or Prp24 must undergo a conformational change in which the α-helices move aside prior to RNA binding. Since the oRRM4 α-helices generate a large electropositive surface and were found to rigidly bind the β-sheet through hydrophobic interactions, the former of these possibilities is more likely (Martin-Tumasz et al. 2011). In support of a non-specific RNA-protein interaction across the electropositive surface of oRRM4, fluorescence anisotropy experiments revealed a similar apparent binding constant when oRRM4 was allowed to interact with segments of three very different RNAs: U2, U4, and U6 (Martin-Tumasz et al. 2011).

The mode of interaction between RRM3 and its ligand is less clear, since the canonical RRM3 RNA binding surface was found to pack against RRM2 in the crystal structure (Bae et al. 2007). Residual dipolar coupling of an RRM2/RRM3 construct indicated that the orientation of bonds within each RRM in solution agreed well with those in the crystal structure (Martin-Tumasz et al. 2010). However, the orientation of bonds in each RRM relative to each other was quite different in solution compared to the crystal structure, suggesting that the packing observed in the crystal structure was probably an artifact of crystallization (Martin-Tumasz et al. 2010).
RRM3 was found to bind single stranded RNA non-specifically with very low affinity (Kd ~ 1 mM), and chemical shift perturbation in the presence of the U6 3’ ISL, mainly of loop residues outside of the anti-parallel β-sheet, suggested that the interaction between RRM3 and U6 is probably non-canonical (Martin-Tumasz et al. 2011). This finding is puzzling given the temperature sensitivity of the triple alanine substitution across the β-sheet, which suggested that a canonical interaction had been disrupted (Vidaver et al. 1999). Additionally, many trans-acting suppressor mutations for U4-G14C and U6-A62G cold sensitivity have been mapped to β-strands 1 and 3 of RRM3 (Shannon & Guthrie 1991, Vidaver et al. 1999). These genetic observations argue strongly for a canonical RNA binding interaction across RRM3, although it is possible that RRM3 does not bind RNA in free U6 snRNP, but instead might bind U4 snRNA to facilitate base pairing during U4/U6 di-snRNP formation.

The RNA binding interaction between U6 and Prp24 is best characterized for RRM2, whose solution structure has been solved in the presence of a hexanucleotide sequence (AGAGAU) that can be found within the asymmetric bulge of U6 snRNA (Martin-Tumasz et al. 2010). This region of U6 has been suspected to bind Prp24 for some time given that it was identified as a high affinity binding site for Prp24 by gel electrophoretic mobility shift assay; that it corresponds to the high affinity binding site identified in mammals; and that many of the nucleotides throughout this region are protected from chemical modification (Fig 2.3, 2.4, 2.5; Kwan & Brow 2005, Bell et al. 2002, Karaduman et al. 2006). Scaffold independent analysis, an NMR based approach to the unbiased identification of the RNA sequence specifically bound by an RNA binding protein, indicated that RRM2 of Prp24 binds a G at position one, an A at position two, a purine at position three, and any nucleotide at position four (Martin-Tumasz et al. 2010).
This is consistent with RRM2 binding positions G50A51G52A53 in the asymmetric bulge (Kwan & Brow 2005). RRM2 was found to make specific contacts with the first GA dinucleotide in this sequence, resulting in a hydrogen-bonding network with a total of four hydrogen bonds across the Watson/Crick face of the dinucleotide, consistent with the strong protection of G50 from chemical modification in free U6 snRNP (Fig 2.3, 2.4; Jandrositz & Guthrie 1995, Karaduman et al. 2006, Martin-Tumasz et al. 2010).

Since RRM2 immediately follows RRM1 in the Prp24 primary sequence, the position and orientation of RRM1 relative to RRM2 must be constrained in three-dimensional space. Indeed, both the NMR and crystal structures indicated that RRM1 and RRM2 are rigidly bound through stable inter-domain contacts (Bae et al. 2007, Martin-Tumasz et al. 2010). Consequently, RRM1 has been proposed to bind the region of U6 snRNA immediately 3’ of the AGAGAU site that is occupied by RRM2 (Martin-Tumasz et al. 2010). Consistent with such a proposal, the sugar-phosphate backbone of this sequence is largely protected from hydroxyl radical cleavage and contains a position, G55, which forms a UV cross-link to Prp24 (Fig 2.5; Karaduman et al. 2006). NMR mapping of the RNA binding region of RRM1, in the presence of either a 40 or 21 nucleotide RNA corresponding to the 40-60 nucleotide region of U6 snRNA, revealed that an electropositive patch consisting of the α-helical face and loops 3 and 5 of RRM1 serves as the RNA binding region (Bae et al. 2007). Like RRM3 and oRRM4, RRM1 is therefore expected to interact with RNA in a non-canonical fashion (Bae et al. 2007).

Given the proposed locations of Prp24 binding on U6 snRNA, a model for passive facilitation of U4/U6 base pair formation, mediated by Prp24, has been proposed (Martin-Tumasz et al. 2011).
In this model, RRMs 1 and 2 and RRMs 3 and 4 form two separate ‘match-maker’ domains respectively, each composed of an RRM that makes sequence specific contacts with single-stranded RNA, and a second RRM that opportunistically binds RNA non-canonically during helix breathing (Martin-Tumasz et al. 2010, Martin-Tumasz et al. 2011). The first match-maker platform, composed of RRM1 and RRM2, initially recognizes the RRM2 binding site in the U6 asymmetric bulge (Martin-Tumasz et al. 2010). Chemical modification of the protein-free U6 snRNA indicated that the proposed RRM1 binding site is base-paired in the naked RNA, but becomes accessible upon Prp24 binding (Fig 2.3, 2.4; Karaduman et al. 2006). This helix is short and very weak, likely undergoing substantial helical breathing (Martin-Tumasz et al. 2010). Upon binding of RRM2 to U6 snRNA, RRM1 is positioned in such a way as to ‘capture’ the RRM1 binding site in U6 as the helix breathes, and once captured, RRM1 presents these U6 nucleotides in a manner that is favorable for interaction with U4 snRNA to generate U4/U6 stem I (Martin-Tumasz et al. 2010).

The RRM3 and oRRM4 binding sites in U6 snRNA have not been strictly defined; however, Martin-Tumasz et al. (2011) provide chemical shift perturbation evidence to suggest that both domains interact with the U6 3’ ISL. Quite unexpectedly, a decrease in U6 fluorescence anisotropy and an increase in UV absorbance at 260 nm were observed upon addition of oRRM4 to a short RNA containing the 3’ ISL sequence (Martin-Tumasz et al. 2010). This observation suggested that oRRM4 was capable of destabilizing the lower half of the 3’ ISL in vitro, although the authors noted that oRRM4 might disrupt some other stem region within the context of full length U6 snRNA (Martin-Tumasz et al. 2011).
Disruption of the 3’ ISL was not observed in the presence of bovine serum albumin or Prp24-RRM3, indicating that the ability to unwind the 3’ ISL is a property unique to oRRM4 (Martin-Tumasz et al. 2011). In the match-maker model, RRM3 was proposed to interact with the 3’ ISL in a partially non-canonical fashion given the stronger chemical shift perturbation in the β2-β3 loop compared to the β-sheet in the presence of the 3’ ISL. oRRM4 then captures RNA at the base of the 3’ ISL during helix breathing, leading to disruption of the entire stem structure (Martin-Tumasz et al. 2011). U4/U6 base pair formation was predicted to occur concurrently with 3’ ISL unwinding as the energetic barrier of breaking a single bond while forming a new bond would be much lower than the energetic barrier to unwinding the entire 3’ ISL prior to annealing (Martin-Tumasz et al. 2010).

It is not clear whether the match-maker model is compatible with, or in contradiction to, a second model of U4/U6 formation known as the kissing loop model. In the kissing loop model, the loop nucleotides of the 3’ ISL are thought to make a ‘kissing loop’ interaction with the 5’ end of U4 snRNA, resulting in destabilization of the U6 snRNA structure in nearby regions (Karaduman et al. 2006). Propagation of U4/U6 base pairing to generate U4/U6 stem II was proposed to lead to further destabilization of the 3’ ISL, until the entire stem has unwound and base-paired to U4 snRNA (Karaduman et al. 2006). In the match-maker model, nucleation of U4/U6 takes place in the stem I region (Martin-Tumasz et al. 2010). Both of these models are consistent with chemical modification/interference studies in mammals, which identified two regions of U6 snRNA that were critical for U4/U6 nucleation: an invariant AGCA located in the U6 snRNA large asymmetric bulge, which is part of U4/U6 stem I, and the loop nucleotides of the 3’ ISL (Wolff & Bindereif 1993).
It is possible that U4/U6 base pair formation makes use of both nucleation sites simultaneously; however, the details of such a mechanism are not known.  The match-maker model has incorporated a number of key structural and biochemical observations from the literature; however, protein binding of the 3’ ISL is not supported by structure probing experiments of free U6 snRNP (Karaduman et al. 2006). The sugar-phosphate backbone of the 3’ ISL only becomes protected from hydroxyl radical cleavage when the LSm protein complex is not present, suggesting that in free U6 snRNP, the LSm complex plays an important role in regulating Prp24 stoichiometry (Karaduman et al. 2006). The additional protection observed in the absence of the LSm complex probably reflects non-specific binding of a second molecule of Prp24. Notably, what is referred to as the high affinity Prp24 binding site in U6 snRNA (nucleotides 49-58) was identified in the absence of the LSm complex and in the presence of various Prp24 and U6 snRNA truncations (Kwan & Brow 2005). In the presence of excess Prp24, multiple molecules of Prp24 were found to cooperatively bind a single molecule of U6 snRNA (Kwan & Brow 2005). Thus Prp24 appears to bind U6 snRNA somewhat promiscuously in the simplified in vitro system and consequently these findings should be interpreted with caution.   The match-maker model also fails to address other regions of U6 snRNA that have been shown to be critical for Prp24 binding (Karaduman et al. 2006, Ryan et al. 2002). The first sixty nucleotides of U6 snRNA show strong protection from hydroxyl radical cleavage, and within this region, UV cross-links to Prp24 have been mapped to nucleotides U28, U29, and U38 (Fig 2.4; Karaduman et al. 2006). These cross-links fall in or near a region of long-range RNA-RNA interaction referred to as the ‘telestem’ (Fig 2.1; Vidaver et al. 1999). 
Genetic support for the lower telestem is strong; however, genetic and biochemical experiments fail to support the presence of the upper telestem in the snRNP (Vidaver et al. 1999, Jandrositz & Guthrie 1995, Karaduman et al. 2006, Ryan et al. 2002). Instead, nucleotides 40-43 on the 5’ side of this stem are proposed to be single-stranded in more recent models of U6 secondary structure in free U6 snRNP, and chemical modification data have shown that these nucleotides undergo protein-dependent protection, suggesting that they serve as a single-stranded protein-binding site in free U6 snRNP (Fig 2.3, 2.4; Jandrositz & Guthrie 1995, Karaduman et al. 2006). Further, an A40-A42 polyG or polyU substitution resulted in severely reduced binding of Prp24, where only 16% and 2% of U6 snRNA co-immunoprecipitated with Prp24 in a U6 reconstitution assay for each substitution respectively (Ryan et al. 2002). Indeed, the Kd of A40G was reduced more than 10-fold compared to wild type U6 in an in vitro filter-binding assay (Ghetti et al. 1995). Thus this stretch of nucleotides appears to be critical for Prp24 binding.

Notably, all of the cis-acting suppressors of U6-A62G cold sensitivity that were found outside of the 3’ ISL mapped to the lower telestem and nucleotides 40-43 (Fortner et al. 1994). While the cold sensitive growth phenotype was rescued fully in these suppressor strains, there was only a minimal improvement in U4/U6 levels (Fortner et al. 1994). Further, a single point mutation in each of RRM2 (R158S) and RRM3 (F257I) of Prp24 was identified as a trans-acting suppressor of U6-A62G and A62U/C85A cold sensitivity (Vidaver et al. 1999). When the lower telestem was disrupted by mutation of either side of the stem, the temperature sensitive growth phenotype of R158S alone was exacerbated while regeneration of the telestem by compensatory base mutation reverted the phenotype (Vidaver et al. 1999).
Since R158S was found to interact with the lower telestem genetically while F257I did not, the mechanism of suppression of these two mutations is probably different (Vidaver et al. 1999). This observation suggests that one mechanism of A62G suppression involves stabilization of the lower telestem through binding of Prp24 (Vidaver et al. 1999). Thus nucleotides 40-43, which sit just downstream of the Prp24 UV cross-links, and the telestem are important Prp24 binding sites that were not considered in the match-maker model.

2.3 The Dunn Model of U6 snRNP and a Role for U4 snRNA in Splicing

Given the structural flexibility exhibited by U6 snRNA throughout the splicing cycle, and the fact that several segments of U6 snRNA contain very similar, if not identical sequences, I speculated that many of the observations discussed in the previous section are poorly explained simply because the secondary structure model of U6 snRNA in free U6 snRNP is incorrect. If this is the case, it is possible that the Prp24 binding sites discussed above have been incorrectly assigned as well. In an effort to address this, I developed a new model of U6 snRNA secondary structure, the Dunn model, that was based solely on base pairing potential and then I proceeded to evaluate the model using information that was available in the literature (Dunn & Rader 2010). Importantly, the Dunn model maintains critical structural elements for which there is strong experimental evidence, while eliminating features that are not well supported in the context of free U6 snRNP. Consequently, this model is the first since the discovery of U6 snRNA to suggest alternative base pairing of the 3’ ISL nucleotides, replacing the 3’ ISL with two stem/loops, A and B, that base pair with regions of U6 that lie outside of the U4/U6 interaction domain.
These structures eliminate the unprecedented large asymmetric bulge proposed in the Karaduman model, generating instead a well-known three-way helical junction motif (Fig 2.1; Dunn & Rader 2010).   Important structural features that have been maintained in the Dunn model include the lower telestem long range RNA-RNA interaction and the adjacent single-stranded nucleotides A40-C43, which were shown to be critical for Prp24 binding (Fig 2.3, 2.4, 2.5; Vidaver et al. 1999, Ryan et al. 2002, Karaduman et al. 2006, Dunn & Rader 2010). In addition, the yeast equivalent of the two sites that are important for U4/U6 nucleation in mammals, positions 59-62 and 72-75, have been maintained in the Dunn model as single-stranded regions, consistent with their strong chemical modification pattern in free U6 snRNP (Fig 2.3, 2.4; Wolff & Bindereif 1993, Jandrositz & Guthrie 1995, Karaduman et al. 2006). Despite the inclusion of these elements, the Dunn model differs substantially from the Karaduman model overall, although both models are supported to a similar extent by chemical modification data, highlighting the ambiguous nature of this structure probing technique (Fig 2.3, 2.4; Jandrositz & Guthrie 1995, Karaduman et al. 2006, Dunn & Rader 2010). The only structural similarities between the Karaduman and Dunn models are the lower telestem and a short extension to this structure for which there is currently no experimental information in the literature (Fig 2.1).  In addition to eliminating the 3’ ISL, the alternative base pairing proposed in the Dunn model sequesters an important sequence element in U6 snRNA, the ACAGAGA box. This sequence recognizes and base pairs with the 5’ splice site during early stages of spliceosome assembly (Fabrizio & Abelson 1990, Wassarman & Steitz 1992, Lesser & Guthrie 1993, Kandels-Lewis & Séraphin 1993, Wolff et al. 1994, Kim & Abelson 1996). 
Thus, in order for U6 to be activated for splicing, two critical elements in U6 must be sequentially unmasked: the ACAGAGA sequence for spliceosome assembly, and the 3’ ISL for catalysis. The Dunn model proposes that U4 snRNA plays an essential role in releasing these elements through a process that is dependent on formation of the inter-molecular interaction between U4 and U6 snRNAs (Fig 2.6; Dunn & Rader 2010). Thus U4 snRNA can be described as an allosteric activator of U6 snRNA, a role that is consistent with the observation that U4 is present in yeast extract in sub-stoichiometric quantities relative to the other spliceosomal snRNAs, and therefore is probably a limiting factor in spliceosome assembly (Cheng & Abelson 1987, Stevens et al. 2001).

Figure 2.6. The U6 snRNA retrieval model. Prp24 retrieves U6 snRNA from the disassembling spliceosome following splicing (1,2), holding U6 in a conformation that is favorable for interaction with U4 snRNA (3). U4 snRNA allosterically activates U6 through the base pair-mediated unmasking of the ACAGAGA sequence (4), and prevents premature formation of the 3’ ISL until U6 base pairs with the 5’ splice site of the pre-mRNA (5). Following stable association of U6 with the pre-mRNA, U4 dissociates, allowing the catalytically important 3’ ISL to fold and splicing to occur (6). The spliceosome disassembles and the cycle begins again. Regions of U4 and U6 that base pair are indicated by different gray scale shaded segments.

As an allosteric activator, U4 snRNA plays a critical role in priming U6 snRNA for catalysis in two temporally distinct stages. First, the 5’ splice site binding region of U6 must be released from stem/loop A in order for U6 to interact with the transcript during spliceosome assembly (Fig 2.6).
This first stage is achieved through formation of the U4/U6 di-snRNA base pairing interaction, where formation of the inter-molecular interaction is accompanied by disruption of the intra-molecular interactions in U6 stem/loops A and B. Stem/loops A and B are expected to be substantially less stable than the 3’ ISL (Dunn 2009), which is consistent with the low activation energy observed for di-snRNP formation. The U4/U6 interaction is therefore proposed to form through the same concerted mechanism proposed in the match-maker model to lower the energy barrier of di-snRNP formation, except that the intramolecular structures that are unwound are stem/loops A and B rather than the 3’ ISL (Fig 2.6). Indeed, strong chemical modification of nucleotides in the 5’ splice site binding region of U6 in U4/U6 di-snRNP, but not free U6 snRNP, demonstrates that the 5’ splice site interaction region does in fact become accessible and free to interact with the transcript during transition to the di-snRNP particle (Jandrositz & Guthrie 1995). In the presence of Prp24, the rate of U4/U6 annealing is greatly enhanced (Raghunathan & Guthrie 1998), presumably because Prp24 presents U6 to U4 in a conformation that is more favorable for interaction with U4.  Complete annealing between U4 and U6 results in formation of U4/U6 stems I and II, which now inhibit formation of the catalytically important U2/U6 helix I and U6 3’ ISL respectively (Fig 2.6; Brow & Guthrie 1989). Cross-links between U4 snRNA and the 5’ splice site in early stages of spliceosome assembly suggest that a proofreading mechanism is in place to ensure that correct base-pairing between U6 and the 5’ splice site has been established prior to dissociation of U4 from the assembling spliceosome (Johnson & Abelson 2001). The ATP-dependent RNA helicase responsible for mediating the switch of U1 for U6 snRNA at the 5’ splice site, Prp28, has been implicated in such proofreading (Staley & Guthrie 1999, Yang et al. 2013). 
Once the correct interactions have been detected at the 5’ splice site, U4 snRNA serves its second function: to activate U6 snRNA for splicing through its dissociation from the spliceosome. This releases the catalytically important U2/U6 helix I and U6 3’ ISL nucleotides so that U6 can now form the catalytic structures within the active site of the assembled spliceosome at an appropriate time during the splicing cycle. Following splicing, U6 is recycled back to a free U6 snRNP particle, and these critical regions of U6 snRNA are again sequestered through intramolecular base pairing within free U6 snRNP.

Very little is known about the release of U6 snRNA following splicing, especially with respect to how and when Prp24 and the LSm complex re-associate with U6 snRNA to form free U6 snRNP. One enticing model is that Prp24 retrieves U6 snRNA from the disassembling spliceosome. The 3’ ISL of the disassembling spliceosome might serve as the Prp24 RRM3 and oRRM4 substrate, allowing for disruption of this structure in order to return U6 snRNA to a catalytically inert particle (Fig 2.6). This could explain why the oRRM4 interaction is non-specific and why RRM3 binds its substrate with such low affinity (Martin-Tumasz et al. 2011) – the 3’ ISL/Prp24 interaction would be expected to be very short lived, simply serving to grab hold of U6 snRNA long enough for the RRM1 and RRM2 high affinity binding sites to become accessible and bound by Prp24. Either concomitantly with, or once the high affinity interaction has been established, RRM3 and oRRM4 would let go of their binding sites, allowing stem/loops A and B to form. RRM1 and RRM2 would then hold U6 snRNA in a conformation that is favorable for interaction with U4 snRNA and the splicing cycle would begin again with the release of Prp24 during U4/U6 di-snRNP formation.
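
The RRM2 sequence preference reported from scaffold independent analysis (a G at position one, an A at position two, a purine at position three, and any nucleotide at position four) can be written as a simple search pattern, which makes it easier to see how more than one stretch of U6 snRNA could satisfy it. The following Python sketch is illustrative only: the function name `find_rrm2_sites` and the regular expression are assumptions introduced here rather than anything used in the cited studies, and the only input shown is the AGAGAU hexanucleotide quoted earlier.

```python
import re

# Assumed encoding of the reported RRM2 preference:
# G, then A, then a purine (A or G), then any nucleotide.
# The lookahead wrapper lets overlapping candidate sites be reported.
RRM2_MOTIF = re.compile(r"(?=(GA[AG][ACGU]))")


def find_rrm2_sites(rna):
    """Return (1-based start, matched tetranucleotide) for each candidate site."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start() + 1, m.group(1)) for m in RRM2_MOTIF.finditer(rna)]


if __name__ == "__main__":
    # The hexanucleotide from the RRM2 solution structure
    print(find_rrm2_sites("AGAGAU"))
```

A pattern match of this kind only flags candidate sequences; it says nothing about whether a site is single-stranded or accessible to Prp24 in the snRNP.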
If the Dunn model of U6 snRNA secondary structure is correct, then how and where does Prp24 bind U6 snRNA with high affinity in free U6 snRNP to facilitate U4/U6 di-snRNP formation? The region of U6 that has been proposed to bind RRM2 in the asymmetric bulge in the Karaduman model resides in the stem of stem/loop A in the Dunn model and would not be available to make canonical contacts with RRM2. One intriguing possibility is that the telestem region of U6 snRNA binds Prp24 with high affinity, since an identical hexanucleotide sequence to the one present in the NMR solution structure can be found at nucleotides 95-100, which forms the 3’ side of the extension to the telestem proposed in the Karaduman and Dunn models. Notably, this sequence was identified as a Prp24 binding site by gel electrophoretic mobility shift assay, as were nucleotides 28-35, which form the 5’ side of the extension to the telestem (Kwan & Brow 2005). Thus the Dunn model predicts that the high affinity Prp24 binding site might actually reside in the telestem region of U6 snRNA in free U6 snRNP, consistent with the UV cross-links to Prp24 at positions U28, U29, and U38, as well as the telestem genetics and biochemistry discussed in the previous section (Fig 2.5, 2.6; Vidaver et al. 1999, Ryan et al. 2002, Karaduman et al. 2006). In order to explore this proposal further I must first establish whether or not the telestem extension forms a stem or remains unpaired, and secondly, whether the rigidly connected Prp24 RRMs 1 and 2 could interact with these regions of U6 snRNA in three-dimensional space without steric complications.

3 The U6 snRNA Telestem is a Binding Site for the U6 snRNP Protein Prp24

3.1 Introduction

The extended telestem is one structural feature that is common to both the Karaduman et al. (2006) and the Dunn and Rader (2010) models of U6 snRNA secondary structure in free U6 snRNP (Fig 2.1).
The upper portion of this stem, known as the lower telestem in the literature, has been supported by extensive genetic manipulations as well as by chemical structure probing and U6 reconstitution assays with mutant U6 snRNA (see chapter two; Vidaver et al. 1999, Jandrositz & Guthrie 1995, Karaduman et al. 2006, Ryan et al. 2002). In contrast, there is currently no evidence to support the existence of the lower extension of this stem. In chapter two I highlighted several observations found in the literature that suggested that this region of U6 snRNA might serve as an important protein-binding site. These include three different UV cross-links to Prp24 as well as strong protection from hydroxyl radical cleavage (Karaduman et al. 2006). Strong protection from chemical modification of nucleotides 29-32 on the 5’ side of this structure in the absence of protein, and their moderate to strong accessibility to modification in free U6 snRNP, suggest that if the stem does form, it is disrupted by binding of Prp24 in free U6 snRNP (Karaduman et al. 2006). Unfortunately there is no structure probing information available for nucleotides 98-101 on the 3’ side of the telestem extension since the probe for primer extension binds to this region of U6 snRNA; consequently these nucleotides are outside of the detection region.

If Prp24 does in fact bind this region of U6 snRNA, and if it does so in a canonical fashion through its RNA recognition motifs, then the nucleotides proposed to form the extended telestem structure must actually be single-stranded. The work presented here aimed to test the telestem extension through an exhaustive genetic analysis of these nucleotides. My genetic findings do not support base pairing between these two segments of U6 snRNA; however, the unexpected growth phenotypes observed indicated that this region of U6 was important for cell viability.
Analysis of the free U4, free U6, and U4/U6 di-snRNP levels in the mutant strains revealed that di-snRNP levels were greatly reduced when double mutations were present at position 30/31. These reduced di-snRNP levels did not arise due to impaired U4/U6 association, but rather as a result of reduced levels of free U6 snRNP. Genetic combinations of the U6-G30U/G31U mutation with various Prp24 alleles revealed a genetic interaction between this region of U6 snRNA and RRM1, but not the C-terminal portion of Prp24. Further, a mutation that would allow base pairing of nucleotides G96 and A97 at the base of the telestem resulted in lethality. Thus my findings presented here, combined with those from the literature, support a model in which nucleotides 28-33 and nucleotides 96-101 do not base pair in free U6 snRNP, but instead serve as the high affinity single-stranded binding site for Prp24 RRMs 1 and 2 respectively.

3.2 Materials and Methods

3.2.1 Construction of U6 Mutant Yeast Strains

Mutant U6 genes were created in the yeast shuffle plasmid pSX6T (pSE358 containing a genomic fragment of wild type SNR6 and a TRP1 marker). SNR6 is flanked by SphI and XhoI restriction enzyme cleavage sites at the 5’ and 3’ ends respectively, and contains an internal EcoNI restriction enzyme cleavage site at position 72. These sites were used to insert oligonucleotide cassettes containing the appropriate mutation and restriction sites (Table 3.1). All mutations 5’ of position 72 were made in the SphI/EcoNI cassette and all mutations 3’ of position 72 were made in the EcoNI/XhoI cassette. Ten micrograms of pSX6T was digested with ten units of the appropriate enzymes (NEB) and NEB buffer 4 in a total volume of 20 µL for three hours at 37°C. Digested plasmid backbone was gel purified in a 1% agarose TBE gel and was extracted from the gel using the Qiagen QIAquick Gel Extraction Kit.
Mutant oligonucleotide cassettes were constructed by annealing complementary oligonucleotides containing the appropriate mutation and restriction enzyme cleavage sites such that the annealed pair contained the sticky ends required for ligation into pSX6T. Oligonucleotides were mixed at equimolar concentration (1 µM each) in annealing buffer (10 mM Tris-Cl pH: 8, 50 mM NaCl, 1 mM EDTA) in a final volume of 100 µL. Mixtures were heated to 95°C for five minutes and then slowly cooled to room temperature over several hours. Annealed oligonucleotide cassettes were ligated into pSX6T in a 3:1 molar ratio of insert:backbone in a reaction containing 400 units of T4 DNA ligase (NEB) and T4 DNA ligase buffer (NEB) in a total reaction volume of ten microliters. Ligation reactions were incubated at room temperature for two hours followed by a ten-minute heat inactivation of the ligase at 65°C. Five microliters of the ligation reaction was transformed into 50 µL of RbCl competent DH5α cells. The cells were plated on selective media and incubated overnight at 37°C. All mutant plasmids were sequenced to confirm the presence of the desired mutation (UNBC Genetics Facility).

Table 3.1. DNA oligonucleotides used to make mutant U6 snRNA yeast strains. DNA sequences are given from 5’ to 3’. The first oligonucleotide listed for each gene is the forward primer and the second is the reverse primer.
oSDR#  U6 Mutation   Sequence
614    G30T/G31T     CATGTGTTCGCGAAGTAACCCTTCGTGGACATTTTTTCAATTTGAAACAATACAGAGATGATCAGCAGTTCCCCTGC
615    G30T/G31T     TGCAGGGGAACTGCTGATCATCTCTGTATTGTTTCAAATTGAAAAAATGTCCACGAAGGGTTACTTCGCGAACACATGCATG
616    G30C/G31C     CATGTGTTCGCGAAGTAACCCTTCGTGGACATTTCCTCAATTTGAAACAATACAGAGATGATCAGCAGTTCCCCTGC
617    G30C/G31C     TGCAGGGGAACTGCTGATCATCTCTGTATTGTTTCAAATTGAGGAAATGTCCACGAAGGGTTACTTCGCGAACACATGCATG
618    G30A/G31A     CATGTGTTCGCGAAGTAACCCTTCGTGGACATTTAATCAATTTGAAACAATACAGAGATGATCAGCAGTTCCCCTGC
619    G30A/G31A     TGCAGGGGAACTGCTGATCATCTCTGTATTGTTTCAAATTGATTAAATGTCCACGAAGGGTTACTTCGCGAACACATGCATG
620    T32A/C33G     CATGTGTTCGCGAAGTAACCCTTCGTGGACATTTGGAGAATTTGAAACAATACAGAGATGATCAGCAGTTCCCCTGC
621    T32A/C33G     TGCAGGGGAACTGCTGATCATCTCTGTATTGTTTCAAATTCTCCAAATGTCCACGAAGGGTTACTTCGCGAACACATGCATG
632    G96C/A97C     ATAAGGATGAACCGTTTTACAAACCGATTTATTTCGTTTTTTTTTTATCC
633    G96C/A97C     TCGAGGATAAAAAAAAAACGAAATAAATCGGTTTGTAAAACGGTTCATCCTTA
634    G96T/A97T     ATAAGGATGAACCGTTTTACAAATTGATTTATTTCGTTTTTTTTTTATCC
635    G96T/A97T     TCGAGGATAAAAAAAAAACGAAATAAATCAATTTGTAAAACGGTTCATCCTTA
638    T100C/T101C   ATAAGGATGAACCGTTTTACAAAGAGACCTATTTCGTTTTTTTTTTATCC
639    T100C/T101C   TCGAGGATAAAAAAAAAACGAAATAGGTCTCTTTGTAAAACGGTTCATCCTTA
640    T100G/T101G   ATAAGGATGAACCGTTTTACAAAGAGAGGTATTTCGTTTTTTTTTTATCC
641    T100G/T101G   TCGAGGATAAAAAAAAAACGAAATACCTCTCTTTGTAAAACGGTTCATCCTTA
642    T100A/T101A   ATAAGGATGAACCGTTTTACAAAGAGAAATATTTCGTTTTTTTTTTATCC
643    T100A/T101A   TCGAGGATAAAAAAAAAACGAAATATTTCTCTTTGTAAAACGGTTCATCCTTA

Mutant pSX6T was transformed into the yeast strain YHM1 (SNR6::LEU2; yCp50-SNR6(wt)-URA3; Madhani et al. 1990). Approximately 100 µL of yeast cells and 200-300 ng of pSX6T were mixed in 1 mL PEG-TEL (10 mM Tris-HCl pH: 7.5, 3.335 mM EDTA, 100 mM LiOAc, 40% PEG 3350) along with 40 ng of boiled sheared salmon sperm DNA. Reactions were left at room temperature overnight and then were heat shocked at 42°C for ten minutes. Cells were pelleted in a microcentrifuge with a 30 second spin at 3000 rpm.
The PEG-TEL was removed by gentle aspiration and the cells were gently resuspended in 200 µL TE (10 mM Tris-HCl pH: 7.5, 3.335 mM EDTA) and plated on CSM-trp plates. The plates were incubated at 30°C for three days. A single colony of each mutant was streaked onto a fresh CSM-trp plate and incubated at 30°C for three days. A single colony of each mutant was then streaked onto 5-FOA plates, and the mutants that grew on the plates were grown up overnight and made into glycerol stocks. To ensure that the wild type plasmid had been shuffled out of the strain, each mutant was subjected to PCR of the URA3 gene to confirm the absence of the plasmid.

Assessment of a lethal allele was carried out both in liquid media (CSM-trp) and on CSM-trp plates at 30°C and at 37°C. Lethality was tested at 37°C since the lethal mutation (G96U/A97U) that had been introduced was predicted to increase the stability of the stem. Thus, growth at an elevated temperature might have compensated for the hyperstabilization of the stem, allowing cells to grow and divide. Lethality was suspected when this double mutant grew well on CSM-trp and CSM-ura, but failed to grow on 5-FOA after many rounds of streaking to fresh CSM-trp plates, which should have encouraged loss of the URA3-marked plasmid carrying wild type SNR6. To confirm that the mutation is indeed lethal, the colonies were suspended in liquid 5-FOA and grown overnight at 30°C and 37°C. No growth was observed in liquid media at either temperature tested. To confirm lethality on plates, colonies containing the mutant and wild type bearing plasmids were grown up overnight in liquid YPD and the following morning one million cells were plated on 5-FOA plates in triplicate. When no colonies were observed on any of the plates, a lethal call was assigned to this double mutant.
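The requirement that each annealed oligonucleotide pair carry the sticky ends needed for ligation can be checked computationally. The following is a minimal sketch, not part of the original workflow: it aligns a forward oligonucleotide against the reverse complement of its partner and reports how far the reverse strand extends past each end of the forward strand. The function names and the `max_overhang` parameter are hypothetical; the sequences are oSDR 614/615 and 632/633 from Table 3.1, written in two pieces as they wrap in the table.

```python
from typing import Optional, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_offsets(forward: str, reverse: str,
                   max_overhang: int = 8) -> Optional[Tuple[int, int]]:
    """Align forward against the reverse complement of reverse.

    Returns (left, right): the number of nucleotides by which the reverse
    strand extends beyond the 5' end (left) and the 3' end (right) of the
    forward strand. A negative value means the forward strand extends
    instead. Returns None if no fully complementary register is found.
    """
    bottom = reverse_complement(reverse)  # bottom strand in top-strand orientation
    for d in range(-max_overhang, max_overhang + 1):
        start = max(0, -d)
        stop = min(len(forward), len(bottom) - d)
        if stop - start < len(forward) - max_overhang:
            continue  # overlap too short to be the intended duplex
        if forward[start:stop] == bottom[start + d:stop + d]:
            return d, len(bottom) - d - len(forward)
    return None


# SphI/EcoNI cassette (G30T/G31T) and EcoNI/XhoI cassette (G96C/A97C), Table 3.1
FWD_614 = ("CATGTGTTCGCGAAGTAACCCTTCGTGGACATTTTTTCAATTTGA"
           "AACAATACAGAGATGATCAGCAGTTCCCCTGC")
REV_615 = ("TGCAGGGGAACTGCTGATCATCTCTGTATTGTTTCAAATTGAAAA"
           "AATGTCCACGAAGGGTTACTTCGCGAACACATGCATG")
FWD_632 = ("ATAAGGATGAACCGTTTTACAAACCGATTTATTTCGTTTTTTTTT"
           "TATCC")
REV_633 = ("TCGAGGATAAAAAAAAAACGAAATAAATCGGTTTGTAAAACGGTT"
           "CATCCTTA")

if __name__ == "__main__":
    print(duplex_offsets(FWD_614, REV_615))
    print(duplex_offsets(FWD_632, REV_633))
```

Under these assumptions, the reverse oligonucleotide of the SphI/EcoNI cassette extends four nucleotides (CATG) past one end and one nucleotide past the other, and the EcoNI/XhoI cassette has a one-nucleotide forward extension and a four-nucleotide (TCGA) reverse extension, which is what would be expected for ends compatible with SphI, EcoNI, and XhoI digestion.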
For the genetic combination of U6 snRNA position 30/31 mutations with various Prp24 alleles, pSX6T-G30U/G31U and each Prp24 allele (Table 3.2) were transformed into the yeast strain LL200 (PRP24::ADE2, SNR6::LEU2; yCp50-SNR6(wt)-URA3; pUN50-PRP24(wt)-URA3; Vidaver et al. 1999) as above.

Table 3.2. Prp24 plasmids used in this work. All plasmids are centromeric and contain a HIS3 marker.

Plasmid   Backbone   Prp24 allele
pSR49     pRS313     Wild Type
pSR63     pPR113     K50E
pSR64     pPR113     N107S
pSR65     pPR113     S75A
pSR66     pPR113     L94P
pSR39     pPR113     Δ10
pSR53     pRS313     RRM3sub
pSR55     pRS313     F257I
pSR70     pSE362     RRM4sub
pSR87     pSE362     RRM2sub
pSR352    pRS313     F87A

3.2.2 Yeast Growth Assays

All mutant yeast strains were streaked from the glycerol stock to CSM-trp plates and incubated at 30°C for three days. A single colony of each was grown up in liquid YPD overnight on a shaker at 30°C until the OD600 was between 1.0 and 2.0. The starting OD600 was set to 1.0 in a final volume of 200 µL and a five-fold dilution series was generated in a 96-well plate. The dilution series were plated on YPD plates with a pinning tool and the plates were incubated at 16°C, 30°C, or 37°C. Pictures of the plates were taken every 24 hours.

For doubling time determination, each mutant was grown up overnight as above. This culture was used to inoculate three 25 mL aliquots of liquid YPD to a starting OD600 of 0.05, and each culture was incubated at 16°C, 30°C, or 37°C in a shaker incubator. The OD600 was recorded until it reached approximately 2.0. The OD600 was plotted against time to ensure that exponential growth of each strain had been achieved. The log10 OD600 versus time was then plotted to find the linear range, and the log2 OD600 of this range was then plotted against time to find the doubling time (the inverse of the slope).
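The doubling time calculation described above can be sketched as a least-squares fit of log2 OD600 against time over the linear range. This is an illustrative sketch only: the function name is hypothetical, the choice of ordinary least squares is an assumption (the fitting method is not stated in the text), and the readings in the example are synthetic rather than measured data.

```python
import math


def doubling_time(times_min, od600):
    """Estimate doubling time from readings taken in the exponential range.

    Fits log2(OD600) = slope * time + intercept by ordinary least squares.
    One doubling corresponds to an increase of 1 in log2(OD600), so the
    doubling time is the inverse of the slope (same units as times_min).
    """
    if len(times_min) != len(od600) or len(times_min) < 2:
        raise ValueError("need at least two paired time/OD600 readings")
    log_od = [math.log2(v) for v in od600]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(log_od) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, log_od))
    slope = sxy / sxx
    return 1.0 / slope


if __name__ == "__main__":
    # Synthetic culture starting at OD600 0.05 that doubles every 90 minutes
    times = [0, 90, 180, 270, 360]
    readings = [0.05 * 2 ** (t / 90) for t in times]
    print(round(doubling_time(times, readings), 1))
```

Only readings from the linear portion of the log plot should be passed in; including lag-phase or saturating points would inflate the estimated doubling time.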
3.2.3 Analysis of RNA Levels by Solution Hybridization and Primer Extension

Each mutant U6 snRNA strain was grown overnight at 30°C to an OD600 of less than 2.5 and was then diluted back to an OD600 of 0.2 in fresh YPD liquid media. Strains were then grown at 30°C for an additional four hours before shifting to 16°C, 30°C, or 37°C for two hours. The cells were then harvested and stored as pellets at -20°C for further processing. The cell pellets were re-suspended in 300 µL chilled, filtered RNA extraction buffer (50 mM Tris-HCl pH: 7.5, 100 mM NaCl, 10 mM EDTA). Two hundred microliters of 0.5 mm acid washed, baked Zirconia/Silica beads (Biospec Products Inc.) were added and the sample was vortexed for one minute on the maximum setting. The sample was placed on ice for five minutes and then vortexed for one minute as before. Three hundred microliters of chilled, filtered RNA extraction buffer, 60 µL 10% SDS, and 400 µL acid equilibrated Phenol:Chloroform (5:1, pH: 4.0) (Ambion) were added and the sample was vortexed on maximum speed for one minute prior to centrifugation in an Eppendorf 5415D centrifuge at 13 200 rpm for five minutes at 4°C. The aqueous phase was collected and Phenol:Chloroform extracted twice more before Chloroform back-extraction with an equal volume of Chloroform (Sigma). The aqueous phase was collected and 40 µL 3 M sodium acetate pH: 5.2 and one milliliter 100% ethanol were added. Tubes were inverted several times and then were placed at -80°C for at least 20 minutes. RNA was collected by spinning at 13 200 rpm for 20 minutes at 4°C in an Eppendorf 5415D centrifuge. The RNA pellet was washed once with 70% ethanol, left to air dry for 5-10 minutes, and then was re-suspended in 30 µL 10 mM Tris-HCl pH: 7.5.
Five micrograms of total RNA was incubated with 100 000 cpm/µL 5’-end radioactively labeled probe specific for U4 (U4 14B, 5’-AGGTATTCCAAAAATTCCCTAC-3’) or U6 snRNA (U6 6D, 5’-AAAACGAAATAAATCTCTTTG-3’) in a ten microliter reaction containing 150 mM NaCl, 50 mM Tris-HCl pH: 7.4, 1 mM EDTA. Samples were incubated at room temperature for 20 minutes prior to adding two microliters 6X Native Loading Buffer. Samples were then electrophoresed in a pre-run, pre-chilled 9% (29:1) non-denaturing polyacrylamide gel at 300 V for 3 hours at 4°C. Gels were exposed to an autoradiography screen overnight at -80°C. Screens were visualized in a Packard Instrument Co. Cyclone using the Optiquant version 4.0 software. Bands were quantified by densitometry and the percent free U4 was calculated by dividing the counts for free U4 by the sum of the counts for free U4 and U4/U6, and then multiplying by 100. Statistically significant differences in free U4 levels were determined by comparing each mutant to wild type in a two-tailed t-test at the 95% confidence level using GraphPad software.

U6 snRNA levels were quantified by subjecting total RNA to primer extension using a 5’-end labeled, radioactive probe targeted to the 3’-end of U6 snRNA. Five micrograms of total RNA was mixed with one microliter of ten millimolar dNTP mix, 100 000 counts of labeled U6 snRNA probe, and ten micrograms of E. coli tRNA in a total reaction volume of 12 µL. This mix was incubated at 65°C for six minutes, followed by a two-minute incubation on ice. Eight microliters of an enzyme mix (0.5 µL Affinity Script Reverse Transcriptase in 1X Affinity Script Buffer (Agilent Technologies), 0.5 µL SUPERase•In RNase Inhibitor (Ambion), 12.5 mM DTT, and dH2O to eight microliters) were added and the reactions were allowed to proceed at 42°C for 1.5 hours.
Reactions were stopped with stop buffer (150 µL TE pH: 7.5, 20 µL 3 M NaOAc pH: 5.2, 0.5 µL (20 mg/mL) glycogen, 500 µL 100% ethanol) and placed at -20°C for at least 20 minutes. The RNA was pelleted by spinning at 13 200 rpm for 20 minutes at 4°C in an Eppendorf 5415D centrifuge. The RNA pellet was washed once with 70% ethanol, left to air dry for 5-10 minutes, and then was re-suspended in eight microliters of 7 M Urea loading buffer. Samples were heated to 65°C for three minutes prior to loading on a pre-run 8% (19:1) polyacrylamide, 7 M urea gel, which was run in 1X TBE at 400 V for 1.5 hours. Gels were exposed to an autoradiography screen overnight at -80°C and were visualized in a Packard Instrument Co. Cyclone using the Optiquant version 4.0 software, where the bands were quantified by densitometry.

3.2.4 5’-end Labeling of DNA probes

In a total volume of 25 µL, 50 pmoles of DNA oligonucleotide (U4 14B or U6 6D; see above), purchased from Eurofins Genomics, was mixed with 2.5 µL 10X T4 Polynucleotide kinase buffer (NEB), 20 U of T4 Polynucleotide kinase (NEB), and 2.5 µL γ-32P-ATP (Perkin Elmer, 3000 Ci/mmol). Reactions were incubated at 37°C for one hour fifteen minutes and then the polynucleotide kinase was heat inactivated at 65°C for twenty minutes. Reactions were passed over a G25 spin column (Santa Cruz) to remove unincorporated nucleotides. One microliter of reaction was collected before and after the G25 spin column step and was counted in a scintillation counter to determine the efficiency of incorporation.

3.2.5 In Silico Modeling of U6 snRNP

In silico modeling of U6 snRNA secondary structure was performed using MC-Sym (Parisien & Major 2008). U6 nucleotides 35-96 were modeled as proposed in our secondary structure using dot-bracket notation. Seven structures were produced and these were subjected to energy minimization and scoring within the MC-Sym program.
Fragments of Prp24 were collected from the Protein Data Bank (PDB) and were modeled onto U6 snRNA using MacPyMOL (PyMOL Molecular Graphics System, 2008 DeLano Scientific LLC). PDB Accession Numbers used were: 2GHP, 2GO9, 2KH9, and 2L9W.

3.3 Results

To test whether formation of the telestem extension is important for splicing, I began with an exhaustive genetic analysis of this region of U6 snRNA (nucleotides 30-35 and 96-101). A number of mutations that were predicted to either hyperstabilize or destabilize the stem were constructed in a yeast shuttle vector containing a TRP1 marker and were subsequently introduced into a yeast strain containing a genomic deletion of SNR6, the gene encoding U6 snRNA (Fig 3.1). Since U6 snRNA is essential for cell viability, the genomic deletion was covered by a wild type copy of SNR6 carried on a URA3-marked plasmid. These transformations were plated on complete supplemental media lacking tryptophan (CSM-trp) to select for colonies that had taken up the mutant plasmid. Since there was no selective pressure to retain the URA3-marked plasmid carrying the wild type gene, cells were free to lose this plasmid. The URA3 gene encodes orotidine 5’-phosphate decarboxylase, which converts 5-fluoroorotic acid (5-FOA) into fluorodeoxyuridine, a compound that is toxic to the cells; 5-FOA can therefore be used as a final selection for colonies that have lost the URA3-marked plasmid, ensuring that the wild type copy of SNR6 is no longer present (Boeke et al. 1987). Upon successful exchange of the wild type plasmid for one carrying our mutation, growth was assayed at 16°C, 30°C, and 37°C (Fig 3.2).

Figure 3.1. Yeast plasmid shuffle to introduce mutant U6 snRNA. The parent yeast strain used in these genetic studies carried a chromosomal deletion of SNR6, replaced by the gene for LEU2. Wild type SNR6 encoded in a URA3-marked plasmid covered this deletion.
The TRP1-marked plasmid carrying the mutant SNR6 genes was introduced by transformation and was selected for on CSM-trp media. Subsequent selection on 5-FOA ensured that the URA3-marked plasmid had been lost, and then growth of the mutant strains was assayed at 16°C, 30°C, and 37°C.

While most of the extended telestem mutations yielded a growth phenotype similar to the wild type strain, there were several mutations that resulted in strong, but unexpected growth phenotypes. The mutant G96U/A97U was predicted to increase the stability of the stem by introducing Watson-Crick base-pairing with positions A34/A35 at the bulge in the stem that separates the telestem from the telestem extension (Fig 3.2). This double mutation was lethal, consistent with stem hyperstabilization; however, the equivalent double mutation from the other side of the bulge, A34U/A35C, exhibited only mild temperature sensitivity. Thus the lethality of G96U/A97U appears to reflect the importance of these nucleotides in a mechanism that is independent of positions 34/35. At position 100/101, the double mutation U100C/U101C exhibited a very mild cold sensitive growth phenotype, consistent with the predicted stem hyperstabilization; however, the double mutation U100G/U101G, which was predicted to destabilize the stem and was therefore expected to yield a temperature sensitive growth phenotype, exhibited an even stronger, but still cold sensitive, growth defect. Thus the observed growth phenotype generated by the position 100/101 mutants is inconsistent with stem formation. At position 30/31, the three mutations that were introduced were predicted to either not change the stem stability (G30A/G31A), or to destabilize the stem (G30U/G31U and G30C/G31C). I therefore expected to see a growth defect at elevated temperatures, since the stem would be even less stable or might not form at all, but instead I observed a severe growth defect at reduced temperatures.
Surprisingly, G30U/G31U was lethal at 16°C and this growth defect could not be rescued by compensatory base mutations at position 100/101 (data not shown). Together these genetic observations suggest that positions 96/97, 100/101, and 30/31 play an important role in splicing that does not involve formation of the telestem extension.

Under optimal growth conditions, U4 snRNA is only found base-paired to U6 snRNA in the di-snRNP particle (Cheng & Abelson 1987); thus accumulation of free U4 snRNA is generally indicative of a U4/U6 di-snRNP assembly defect (Fortner et al. 1994). To determine whether the mutations that I introduced into U6 snRNA caused a growth phenotype by influencing the ability of U6 to interact with U4 to form U4/U6 di-snRNP, I shifted the cell cultures to 16°C, 30°C, or 37°C for two hours following an initial period of growth at 30°C (Fig 3.3). I then extracted total RNA from these cells under non-denaturing conditions and probed this material for U4 snRNA by solution hybridization. I found that all mutations showed wild type levels of free U4 snRNA and U4/U6 di-snRNA, except for the three mutations that were made at position 30/31, as well as for U100G/U101G. These mutations resulted in statistically significant accumulation of free U4 snRNA at the 95% confidence level using a two-tailed t-test. Notably, although there was not a detectable defect in growth at elevated temperatures, free U4 accumulated to the highest level at 37°C compared to 16°C and 30°C for all three position 30/31 mutations. The accumulation of free U4 was most severe for the G30U/G31U mutant at all temperatures tested, consistent with the more severe cold sensitive growth observed for this mutation.
Although the position 30/31 U6 mutations resulted in high levels of free U4 snRNA, it is not immediately apparent how the mutations at this position could affect U4/U6 formation, at least in a direct manner, since the mutations reside outside of the U4/U6 interaction domain.

Figure 3.2. U6 snRNA telestem extension mutant doubling times. The proposed secondary structure including the telestem (above the bulge) and telestem extension (below the bulge) is shown to the left. Doubling time (in minutes) is plotted for each mutant indicated across the x-axis. Cells were grown overnight at 30°C and were then diluted back to an OD600 of 0.05 before incubating at the appropriate temperature. The ‘X’ indicates a lethal phenotype.

An alternative explanation for free U4 accumulation is that U6 levels might be reduced in these strains, depleting the free U6 snRNA levels to the point that not all of the U4 snRNA is able to associate with U6. Reduced levels of U6 snRNA might indicate that U6 snRNP is unstable, causing some amount of U6 snRNA to undergo degradation. When U6 from these strains was quantified by primer extension, the level of U6 in the position 30/31 mutant strains was approximately one third of wild type while all other mutations in the extended telestem resulted in levels comparable to wild type (data not shown). Primer extension was used for quantification of U6 snRNA as opposed to Northern blot for two reasons. First, primer extension is used extensively for quantification of U6 snRNA in the literature and was consequently used here in an effort to generate data that could be compared back to other U6 mutants in the literature without concern that differences in the data were due to the differences inherent to the quantification techniques.
Second, since I was expecting to see reduced levels of U6 snRNA in some of the mutant strains, I wanted to be sure that inefficient transfer of the RNA to the membrane in a Northern blot was not the source of any apparently reduced levels of U6 that might have been observed.

Figure 3.3. Free U4 snRNA levels in U6 snRNA telestem extension mutant yeast strains. The % free U4 snRNA was assessed by solution hybridization (top left) of total RNA extracted from each mutant yeast strain indicated, which had been temperature shifted to the appropriate temperature for two hours prior to RNA extraction. The U4 RNA levels were quantified and plotted to the right. The X denotes a lethal mutant from which RNA could not be extracted. The * denotes strains for which the U6 levels were approximately one-third of wild type by primer extension of total RNA (bottom left). Experiments were performed with biological triplicates. A62G is a control that is known to accumulate U4 snRNA as a result of a defect in U4/U6 di-snRNP assembly.

If the position 30/31 mutants cause U6 to be unstable in free U6 snRNP, then the U4 accumulation and cold sensitive growth exhibited by these mutations should be rescued by driving U4/U6 di-snRNP formation through mass action by over-expressing the mutant U6 snRNA. When each of the position 30/31 mutations was expressed from a 2µ plasmid, U6 levels were restored to approximately 80% of wild type, and the amount of free U4 was reduced substantially, but the cold sensitive growth phenotype was only partially rescued (Fig 3.4).

Figure 3.4. Over-expression of mutant U6 snRNA. Over-expression of mutant U6 snRNA restores U6 levels to at least 80% wild type, but only partially rescues the growth phenotypes (left), while substantially restoring U4/U6 di-snRNP levels (right). The X denotes a lethal phenotype.
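The percent free U4 values reported in Figure 3.3 follow the calculation given in section 3.2.3: counts for free U4 divided by the sum of the counts for free U4 and U4/U6, multiplied by 100. The sketch below illustrates that arithmetic together with a two-sample t statistic for triplicates. It is a hedged illustration only: the function names are hypothetical, the equal-variance (pooled) form of the t-test is an assumption because the text does not state which variant was used, and the band counts are invented placeholders rather than data from this work.

```python
import math


def percent_free_u4(free_u4_counts, di_snrnp_counts):
    """Percent of U4 signal in the free U4 band (free / (free + U4/U6) * 100)."""
    total = free_u4_counts + di_snrnp_counts
    if total <= 0:
        raise ValueError("total counts must be positive")
    return 100.0 * free_u4_counts / total


def pooled_t_statistic(sample_a, sample_b):
    """Student's t statistic for two independent samples (equal variances assumed)."""
    na, nb = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    std_err = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean_a - mean_b) / std_err


# Two-tailed critical value of t at the 95% level for 4 degrees of freedom
# (two groups of three biological replicates).
T_CRITICAL_DF4 = 2.776

if __name__ == "__main__":
    # Placeholder (free U4, U4/U6) band counts for three replicates each
    wild_type = [percent_free_u4(f, d) for f, d in [(100, 900), (110, 890), (120, 880)]]
    mutant = [percent_free_u4(f, d) for f, d in [(400, 600), (410, 590), (420, 580)]]
    t = pooled_t_statistic(mutant, wild_type)
    print(abs(t) > T_CRITICAL_DF4)
```

Comparing the absolute t statistic with the tabulated critical value is equivalent to asking whether the two-tailed p-value falls below 0.05; statistics software would normally report the p-value directly.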
U6 snRNA levels could be low in the position 30/31 mutant strains for one of two reasons: either transcription is reduced, resulting in lower total U6 snRNA levels, or U6 is transcribed well, but does not associate with Prp24 and the LSm2-8 proteins efficiently, causing U6 to be unstable in the cell. To address the possibility of snRNP instability, I turned to the U6 snRNP-associated protein, Prp24, which has been UV cross-linked to U6 snRNA in this region (Karaduman et al. 2006). To test whether the U6 snRNA mutations at position 30/31 result in the disruption of important protein-RNA contacts within the snRNP, I combined the G30U/G31U mutation with a variety of Prp24 alleles to look for either a synthetic lethal growth phenotype, or rescue of the cold sensitive growth phenotype (Fig 3.5). None of the alleles that I tested rescued growth at 16°C, but surprisingly, I only found a genetic interaction between G30U/G31U and mutations in the first RRM of Prp24. These ranged from synthetic lethal (F87A and L94P) to severe at all temperatures tested (K50E) and subtle at 37°C (1sub, S75A and N107S). Strains carrying the RRM3 mutations at our disposal are very sick on their own at 30°C and inviable at 37°C (Kwan & Brow 2005). Therefore, I do not believe that the inability to shuffle out the wild type genes when these mutations were combined with G30U/G31U reflects an authentic synthetic lethal interaction, but instead is a consequence of the severe phenotypes observed for each of the individual mutations (3sub and F257I). It should be noted that there is no growth at 37°C for the G30U/G31U/4sub combination; this is because 4sub itself is temperature sensitive (Rader & Guthrie 2002). When present alone, the RRM1 mutants K50E and L94P are cold sensitive while F87A is temperature sensitive (Raghunathan unpublished, Kwan & Brow 2005). Thus the results obtained here demonstrate that position 30/31 of U6 snRNA interacts genetically with RRM1 of Prp24.

Figure 3.5.
U6 snRNA G30U/G31U genetic combinations with Prp24 alleles. A single colony of each mutant yeast strain was grown overnight at 30°C. A five-fold dilution series was set up in a 96-well plate with the leftmost dot containing one OD600/mL of cells. Cells were transferred from the 96-well plate to a YPD plate with a pinning tool and were allowed to grow for two days at the indicated temperature. A schematic diagram of Prp24 is shown across the top.

In chapter two I proposed that high affinity binding between U6 snRNA and Prp24 in free U6 snRNP occurs through interactions between RRM1 and RRM2 of Prp24 and the telestem region of U6 snRNA. The genetic data presented here support this proposal in that RRM1 was found to interact genetically with U6 snRNA in the region of nucleotides 28-35. The lethal phenotype observed for G96U/A97U suggested that a critical interaction had been disrupted within the snRNP. Since I found no evidence that base pairing occurs at this position, I favor the idea that G96/A97 serves as a specific and critical protein-binding site in free U6 snRNP (see discussion). I feel confident in proposing that this site is bound by RRM2 of Prp24 since the sequence at position 95-100 is identical to the one used to solve the NMR RRM2/AGAGAU solution structure (Martin-Tumasz et al. 2010), demonstrating that Prp24 is in fact capable of binding this sequence. Thus, I propose that the telestem region of U6 snRNA is the high affinity binding site for Prp24 RRMs 1 and 2.

To assess the geometrical constraints and potential three-dimensional steric conflicts associated with such an arrangement, given the rigid orientation of RRM1 relative to RRM2 (Bae et al. 2007), I modeled our U6 snRNA secondary structure in three dimensions using an RNA modeling program called MC-Sym (Table 3.3; Figure 3.6). Next I built in Prp24 piece-by-piece using NMR solution structure and crystal structure fragments of Prp24 collected from the protein data bank.
The snRNA model presented here is one of seven generated by MC-Sym. Following energy minimization, this structure, #7, was ranked first by favorable entropy and second by three other scoring methods, with an energy of -36.63 kcal/mol (Table 3.3). With respect to the energy score, structure #7 ranked second to structure #5, which scored -37.87 kcal/mol. However, structure #5 was ranked second to last by two other scoring methods. Thus structure #7 was taken forward for modeling the interaction between RRMs 1 and 2 and the telestem region of U6 snRNA.

Table 3.3. MC-Sym U6 snRNA structure ranking by scoring program.

Ranking   Amber ‘99     Entropy       P-Score       FF-01 Score   Volume
1st       Structure 5   Structure 7   Structure 6   Structure 5   Structure 4
2nd       Structure 7   Structure 1   Structure 7   Structure 7   Structure 3
3rd       Structure 6   Structure 2   Structure 2   Structure 4   Structure 1
4th       Structure 1   Structure 4   Structure 4   Structure 1   Structure 6
5th       Structure 4   Structure 6   Structure 1   Structure 2   Structure 2
6th       Structure 3   Structure 5   Structure 5   Structure 6   Structure 7
7th       Structure 2   Structure 3   Structure 3   Structure 3   Structure 5

The three-dimensional model produced by MC-Sym predicted a very compact, somewhat entangled structure for U6 snRNA based on the secondary structure that I provided for the program. When nucleotides 30-34 and 96-100 were included in the MC-Sym modeling, the telestem extension was generated. Since base pairing between these nucleotides was not supported by my genetic results, and would prevent nucleotides G96 and A97 from interacting with Prp24-RRM2 in a canonical fashion, I re-submitted only nucleotides 35-96 for modeling by the program. The RRM2 binding site at nucleotides 96-100 was added to the model by superimposing the sugar-phosphate backbone of nucleotides 95 and 96 from the RRM2-AGAGAU NMR solution structure (PDB Accession Number 2KH9) with nucleotides 95 and 96 from our MC-Sym model (Fig 3.7).
With RRM2 built into the model, I then located RRM1 in three-dimensional space by superimposing RRM2 from the RRM1/2 NMR solution structure (PDB Accession Number 2GO9) with RRM2 from our model (Fig 3.7).

Figure 3.6. U6 snRNA secondary structure in free U6 snRNP. U6 snRNA from nucleotides 35-96 was modeled in MC-Sym and structure number 7 (shown here) was selected for further analysis. The Prp24-RRM2 binding site was added in MacPyMOL. The RNA is colored blue to red from the 5’ end to the 3’ end and the figure to the right is a 90° rotation of the one on the left. Key structural features are noted on each figure.

Given the position of RRM1 relative to the RRM2 binding site and position A35 in our three-dimensional model of free U6 snRNP, it is easy to speculate that nucleotides 30-34 lie across RRM1 in a non-canonical fashion. I have not modeled these nucleotides: several observations from the literature suggest that RNA binding by RRM1 indeed occurs non-canonically across the electropositive α-helical face and loops three and five of this domain (Bae et al. 2007), but I have no further information regarding how this interaction occurs. However, I do note that several of the RRM1 amino acids that genetically interact with G30U/G31U to generate varying degrees of temperature sensitivity are located in loops three (S75A) and five (N107S), and also in loop one (K50E). These amino acids all cluster in the same region in the three-dimensional structure shown in Figure 3.7 (top left panel). Notably, temperature sensitivity generally results from disruption of an interaction; thus my genetic results are consistent with disruption of an interaction between RRM1 of Prp24 and a region of U6 snRNA containing nucleotides 30 and 31, which are located next to nucleotides 28 and 29, which have been UV cross-linked to Prp24 (Karaduman et al. 2006).

In addition to RRMs 1 and 2, Prp24 has two other more C-terminal RRM domains, RRM3 and oRRM4.
While I do not have enough information at this time to definitively model these RRMs into place in free U6 snRNP, I have made an attempt to provisionally position them based on some relevant data from the literature (Fig 3.7). In particular, the spacer region between RRMs 2 and 3 is long enough to permit RRM3 to interact with nucleotides A40-C43 (Fig 3.6, 3.7). These nucleotides are strongly accessible to chemical modification in the absence of Prp24, but become strongly protected from modification and hydroxyl radical cleavage in the presence of Prp24 (Jandrositz & Guthrie 1995, Karaduman et al. 2006). In the context of free U6 snRNP, RRM3 might make canonical contacts with these nucleotides (see discussion). oRRM4 has not been meticulously positioned in our model since I have no information regarding its interaction region in U6 snRNA, but instead has been placed in a region that is protected from hydroxyl radical cleavage. The long spacer between RRMs 3 and 4 would allow for this orientation; however, placement of oRRM4 in our model should be considered highly speculative.

3.4 Discussion

The genetic results obtained for the mutations made in the telestem extension region of U6 snRNA were completely unexpected and were not consistent with stem formation. The G96U/A97U mutation was predicted to allow complete stem formation below the telestem structure, potentially generating an undisrupted ten base pair helix (Fig 3.2). This structure would be far stronger than the telestem alone, and thus the lethal phenotype was not too surprising since stem unwinding might have been completely inhibited. However, if increased stem stability were the sole cause of the lethal phenotype, then one would have expected to see the same phenotype when the mutation was made from the other side of the bulge. This was not the case. When the A34U/A35C mutation was introduced, allowing base pairing to G96/A97, only very mild temperature sensitivity was observed (Fig 3.2).
In fact, the temperature sensitivity was so mild that on plates, this mutant appeared to grow similarly to wild type, and the growth phenotype was only detected by determining the doubling time. Therefore the lethal phenotype exhibited by G96U/A97U cannot be attributed to increased telestem stability. Taken together, these observations suggest that position G96/A97 is important for cell viability through some mechanism that does not involve base pairing to position 34/35. One possible explanation for these observations is that this position in U6 snRNA serves as a protein-binding site in free U6 snRNP.

Figure 3.7. Three-dimensional model of free U6 snRNP. Fragments of Prp24 were modeled onto our MC-Sym model of U6 snRNA secondary structure by superimposing structures collected from the protein data bank. Shown here are RRMs 1 and 2 from PDB# 2GO9 (pink), RRM3 from PDB# 2GHP (green), and RRM4 from PDB# 2L9W (yellow). (top right) 90° rotation of the top left, (bottom left) 180° rotation of top left, (bottom right) 270° rotation of top left. Amino acids in RRM1 that are temperature sensitive when combined with U6-G30U/G31U are indicated in the top left panel. Nucleotides that are protected from hydroxyl radical cleavage are highlighted in blue.

Given the severity of the U6-G96U/A97U mutation, the NMR solution structure of RRM2 bound to a hexanucleotide sequence that is identical to the region including and surrounding G96/A97 (Martin-Tumasz et al. 2010), and the lethal phenotype of a triple alanine substitution across the RRM2 β-sheet (Vidaver et al. 1999), it seems reasonable to suggest that RRM2 of Prp24 might bind nucleotides G96 and A97 in free U6 snRNP. Notably, RRM2 is expected to make specific high affinity canonical contacts with U6 snRNA (Martin-Tumasz et al. 2010); thus a mutation in either of the interacting partners would be expected to be severe.
Indeed, the RRM2sub mutation was shown to completely abolish interaction with U6 snRNA by chemical shift perturbation, explaining the lethal phenotype (Martin-Tumasz et al. 2010). Scaffold independent analysis indicated that substitution of a U at either position in the GA dinucleotide bound by RRM2 is the least tolerated of the four nucleotides, and the solution structure revealed that this is because sequence specificity is derived from a network of four hydrogen bonds to the GA dinucleotide Watson-Crick face (Martin-Tumasz et al. 2010). Since uridine does not contain the amino groups required to make two of these specific hydrogen bonds, and its smaller size prevents uridine from also making the non-specific contacts through the sugar-phosphate backbone, binding to RRM2 would be greatly reduced or abolished altogether (Martin-Tumasz et al. 2010), hence the lethal phenotype observed for the G96U/A97U mutation.

As support for the proposal that RRM2 binds U6 snRNA at G96/A97, consider the RRM2 mutation R158S, which was originally isolated as a trans-acting suppressor of U6-A62G cold sensitivity (Vidaver et al. 1999). This mutation was temperature sensitive and in the presence of mutations that disrupt the lower telestem, this temperature sensitivity was exacerbated (Vidaver et al. 1999). However, when base pairing in the lower telestem was restored through compensatory base mutation, the growth phenotype reverted, suggesting that RRM2 is involved in stabilizing the lower telestem structure (Vidaver et al. 1999). The NMR solution structure revealed that R158 hydrogen bonds to the backbone phosphate of the G in the GA dinucleotide (Martin-Tumasz et al. 2010). The fact that the binding affinity of R158S was reduced five-fold when compared to wild type RRM2 indicates that this interaction between R158 and U6 snRNA is very important (Martin-Tumasz et al. 2010).
It is difficult to understand how the lower telestem could genetically interact with R158S if RRM2 binds U6 at position 49-54 as proposed by Martin-Tumasz et al. (2010), although I cannot rule out the possibility that the genetic results reflect some indirect interaction. However, in our model, R158 contacts the base of the telestem, and could therefore have a direct influence on the telestem region of U6 snRNA, resulting in the genetic interactions observed.

U6 snRNA exhibits incredible structural flexibility throughout the splicing cycle, allowing for mutually exclusive interactions to occur in a temporally appropriate manner. Thus it is possible that the G96U/A97U lethality arises at some point in the splicing cycle outside of free U6 snRNP. This region of U6 snRNA is not known to interact with any other proteins; however, it does base pair with U2 snRNA in U2/U6 helix II (Field & Friesen 1996). It is unlikely that this mutation is lethal due to disruption of helix II since helix II is not essential in yeast, and appears to be redundant with U2/U6 helix I, which was fully intact in my system (Field & Friesen 1996). Further, a nine-nucleotide mutation that completely disrupted U2/U6 helix II from the U2 side did not affect growth, while the equivalent mutation from the U6 side resulted in lethality that could not be rescued by compensatory mutations in U2 snRNA (Field & Friesen 1996). These observations argue that the helix II U6 nucleotides have an important function outside of helix II formation, consistent with my proposal that G96/A97 interacts directly with Prp24 in free U6 snRNP.

Interestingly, 3’-terminal deletions in U6 snRNA resulted in splicing levels that ranged from 4 to 40% of wild type, depending on the extent of truncation, in an in vitro U6 snRNA reconstitution experiment (Ryan et al. 2002). For all truncations tested, the ratio of lariat intron to mRNA indicated that there was a block in normal lariat intron degradation (Ryan et al. 2002).
This is consistent with the Prp24 U6 snRNA retrieval model proposed in chapter two. Recall that I proposed that Prp24 retrieves U6 snRNA from disassembling spliceosomes following splicing of a substrate. If retrieval of U6 is dependent on an interaction between Prp24 and some site within twenty to thirty nucleotides of the 3’ end of U6 snRNA, and this entire region of U6 has been deleted, then proper disassembly of the spliceosome might not be triggered. Thus splicing would stall at spliceosome disassembly, preventing ejection of the lariat intron for degradation. Further, subsequent rounds of splicing would be impeded due to inefficient recycling of the splicing factors, resulting in the reduced levels of splicing that were observed. In support of this line of reasoning, a U6 snRNA 86-112 polyC substitution resulted in less than 10%, and less than 5%, co-immunoprecipitation of U6 snRNA by antibodies to Prp24 and LSm4 respectively (Ryan et al. 2002). This suggests that this region of U6 snRNA is critical for high affinity binding between U6 snRNA and these proteins to regenerate free U6 snRNP. Indeed, the terminal uridine tract has been shown to be essential for association of the LSm complex with U6 snRNA (Vidal et al. 1999, Zhou et al. 2014). My genetic results in the 3’ end of U6 snRNA have identified a second critical interaction domain at nucleotides G96 and A97 that might be essential for association of Prp24 with U6 snRNA.

A second region of U6 snRNA that generated unexpected genetic phenotypes was position 30/31. The mutations that were introduced here were expected to decrease stem stability. Thus I expected to see a temperature sensitive phenotype for these mutations since the proposed stem would be even less stable at elevated temperatures. Instead I observed a cold sensitive phenotype for each of the three mutations tested, with G30U/G31U exhibiting the most dramatic growth phenotype (Fig 3.2).
This observation was not consistent with stem formation, and consequently I addressed the possibility that these mutations were somehow preventing efficient di-snRNP formation. While I did find that U4/U6 di-snRNP levels were reduced in these strains, as indicated by elevated levels of free U4 snRNA (Fig 3.3), this was not because of an inability of the two RNAs to base pair, but instead, was the result of greatly reduced U6 snRNA levels in these strains. Interestingly, the reduced levels of U6 and U4/U6 were just as pronounced at 37°C as they were at 16°C, although growth was not impeded at the elevated temperature (Fig 3.2, 3.3). This observation cannot be explained at this time; however, it is worth noting that other mutations in U6 snRNA, such as A62G, have also been reported to result in reduced levels of U4/U6 at all temperatures tested, but growth is only affected at reduced temperatures (Fig 3.3; Fortner et al. 1994). The difference is that A62G does not result in reduced levels of U6 snRNA. My U100G/U101G mutation provided yet a third unexplained phenotypic pattern, in that this mutation did not exhibit any severe growth phenotype, U6 snRNA levels were comparable to wild type, but U4/U6 levels were greatly reduced. Taken together, these observations suggest that mutations throughout U6 snRNA act through a variety of different mechanisms.

It is interesting to note that mutations that were introduced throughout the telestem extension region of U6 snRNA tended to generate a cold sensitive growth phenotype, although in many cases the cold sensitivity was quite weak. Cold sensitivity is most often thought to suggest that a particle has become hyperstabilized and therefore undergoes less efficient remodeling, preventing progression through the splicing cycle.
In this case, the cold sensitivity throughout the telestem extension of U6 snRNA might reflect destabilization of free U6 snRNP due to disruption of the RNA/protein interactions, preventing progression to the di-snRNP particle. While the mutations at position 30/31 might disrupt binding of Prp24 RRM1, mutations in the 3’ end of U6 snRNA would be expected to disrupt interactions with the LSm complex (Vidal et al. 1999). Both Prp24 and the LSm complex are necessary for efficient formation of the di-snRNP particle (Raghunathan & Guthrie 1998, Achsel et al. 1999). Indeed, the reduced di-snRNP levels of the U100G/U101G mutant, accompanied by wild type levels of U6 snRNA and only a mild cold sensitive growth phenotype, could be explained by decreased affinity for the LSm complex. Thus, destabilization of either the U6/Prp24 or U6/LSm complex interactions might generate a cold sensitive growth phenotype since formation of U4/U6 di-snRNP would be unfavorable, effectively hyperstabilizing the preceding step of the splicing cycle.

There are two possible explanations for reduced U6 snRNA levels in the position 30/31 mutations: either the mutation interferes with transcription, resulting in lower total U6 snRNA levels, or the mutation might disrupt important interactions between U6 snRNA and the snRNP-associated proteins, resulting in unstable U6 snRNA that is then degraded in the cell. The reduced U6 levels observed here probably result from a combination of both. As an RNA Polymerase III transcript, U6 snRNA contains an internal A block promoter element that spans nucleotides 21-31 (Eschenlauer et al. 1993). A triple mutation at position 29-31 was lethal and completely abolished transcription in vitro and in vivo (Eschenlauer et al. 1993). Thus it is not surprising that our position 30/31 mutations resulted in reduced U6 snRNA levels.
However, the reduced levels of U6 snRNA are probably not the cause of the cold sensitive growth phenotype since U6 snRNA levels were comparable at all temperatures tested for all three of the position 30/31 mutations, despite the more dramatic cold sensitivity exhibited by G30U/G31U. The position 30/31 nucleotides also appear to be critical for protein binding within free U6 snRNP, since the G30U/G31U mutation was lethal or severely temperature sensitive when combined with several mutations in RRM1 of Prp24 (Fig 3.5). This observation is consistent with disruption of an RNA-protein interaction. Indeed, UV cross-links have been identified between position 28-31 and both Prp24 and the LSm complex, indicating that the protein components of the snRNP physically interact with this region of U6 snRNA (Karaduman et al. 2006), hence the high conservation of this portion of U6 snRNA.

The NMR solution structure of the first two RRMs and the crystal structure of the first three RRMs of Prp24 revealed that interdomain contacts between RRMs 1 and 2 hold these domains in a rigid structure that forms a cleft in which the RNA can bind (Bae et al. 2007). Thus it was important for me to demonstrate that RRM1 could interact with U6 snRNA near nucleotides 30 and 31, and that RRM2 could interact with U6 snRNA nucleotides 96 and 97 in three-dimensional space without steric conflict. My in silico model demonstrates that such an arrangement is indeed possible (Fig 3.7). The position of RRM1 relative to RRM2 and U6 snRNA in our model is consistent with the finding that RRM1 contacts U6 snRNA in a non-canonical manner. Thus my three-dimensional model is a critical demonstration that U6 snRNA can fold into the structure that I have proposed, and that Prp24 can bind U6 in the telestem region in free U6 snRNP. Further, this model incorporates all of the data from the literature when the position of RRM3 is also considered.
RRM3 is modeled to interact with nucleotides A40-C43 in a canonical fashion, which is consistent with the strong accessibility to chemical modification in the absence of Prp24, strong protection from modification and hydroxyl radical cleavage in the presence of Prp24, and the severe temperature sensitivity of the RRM3sub mutation (Jandrositz & Guthrie 1995, Vidaver et al. 1999, Karaduman et al. 2006).

My model of free U6 snRNP goes one step further to explain the conflicting observations with respect to RNA binding by RRM3. Mutation of three highly conserved aromatic residues in the β-sheet that are expected to make specific contact with the RNA results in a severe temperature sensitive growth phenotype that is consistent with a canonical RNA-RRM3 interaction (Vidaver et al. 1999). However, chemical shift perturbation data in the presence of a short RNA containing the U6 3’ ISL indicated that RNA binding by RRM3 was primarily non-canonical (Martin-Tumasz et al. 2010, Martin-Tumasz et al. 2011). These NMR experiments utilized a fragment of U6 snRNA that did not include the region of U6 snRNA that I have proposed to be bound by Prp24 RRMs 1 and 2, and further, it was short enough that the only stable structure available to fold would have been the 3’ ISL. Both of these observations are consistent with the Prp24 U6 snRNA retrieval model: RRM3, along with oRRM4, initially disrupts the 3’ ISL in a disassembling spliceosome at a non-canonical site in the protein, but then binds U6 snRNA with high affinity at nucleotides A40-C43 to assist in stabilizing free U6 snRNP. Indeed, this region of U6 snRNA is critical for interaction with Prp24, since a polyG or polyU mutation at position 40-42 results in less than 20% co-immunoprecipitation of U6 snRNA with antibodies to Prp24, along with greatly reduced splicing levels (Ryan et al. 2002).
Together, RRM2 and RRM3 appear to make specific and tight contacts with the RNA to generate the high affinity binding that is observed between Prp24 and U6 snRNA (Vidaver et al. 1999, Kwan & Brow 2005).

To conclude, my genetic results failed to support my previous proposal that nucleotides 30-33 base pair with nucleotides 98-101 to form the telestem extension. However, I have now shown that the remainder of my previously proposed U6 snRNA secondary structure has the potential to fold in three dimensions, and that Prp24 would be able to bind the three sites that have been identified as major Prp24 binding sites (near G30/G31, A40-C43, and G96/A97) without steric conflict. Importantly, this three-dimensional structure supports a model for U6 snRNA retrieval from disassembling spliceosomes by Prp24 that was presented in chapter two. Briefly, RRMs 3 and 4 are responsible for destabilizing the U6 snRNA 3’ ISL while RRMs 1, 2, and 3 make strong contacts with the telestem region of U6 snRNA to stabilize free U6 snRNP for future interaction with U4 snRNA. There is still much work to do in order to complete the three-dimensional model presented here; however, the most convincing evidence would come from solving the atomic resolution structure of the entire free U6 snRNP.

4 Co-expression and Co-purification of Free U6 snRNP

4.1 Context of this Work in Light of a Recent LSm2-8 Crystal Structure

At the time that this work was initiated through to the point of obtaining a Small Angle X-ray Scattering (SAXS) structure of the LSm2-8 complex, I was unaware of any other group working on the recombinant yeast LSm complex expression and purification problem. In November 2013, a paper in which the X-ray crystal structure of LSm2-8 had been solved was published ahead of print in the journal Nature, using a method to purify the pre-formed recombinant complex under native conditions that was almost identical to the method developed here (Zhou et al. 2014).
In this chapter, I begin with an introduction highlighting the state of the field when this work began in September 2009 and will leave an analysis of the crystal structure for the discussion, where comparisons between our biophysical data and the crystal structure will be made.

4.2 Introduction

Free U6 snRNP has been successfully purified from yeast using a tandem affinity purification strategy in which U6 snRNA and the LSm proteins co-purify through their association with tagged Prp24 (Karaduman et al. 2006). These particles have been subjected to extensive structure probing, including chemical modification, hydroxyl radical foot-printing, and UV cross-linking, as well as to low resolution negative stain electron microscopy (EM) (Karaduman et al. 2006, Karaduman et al. 2008). While these studies provided some insight into the structure of free U6 snRNP, chemical structure probing data are ambiguous and map equally well to competing models of U6 snRNA secondary structure (Dunn and Rader 2010). Further, the components of the snRNP are difficult to discern by EM due to the low resolution of the imaging technique. Thus understanding how U6 snRNA folds within U6 snRNP requires solving the structure of the complex at higher resolution. A key step toward achieving this task is to develop an in vitro reconstitution system that allows for the assembly of a large quantity of sufficiently pure complex for biophysical characterization. However, a major challenge in achieving this goal has been the expression and purification of the LSm proteins, which, when expressed alone, tend to be very unstable (Zaric et al. 2005).

The LSm proteins are a group of small (10-15 kDa) proteins that share a common fold with the evolutionarily related Sm proteins (Achsel et al. 1999).
The Sm and LSm proteins are found in all domains of life, arranging themselves into homomeric (eubacteria and archaea) or heteromeric (eukaryotes) ring-like complexes that associate with RNA and function in a variety of RNA processes (Khusial et al. 2005 and references therein). The Sm proteins were the first of this group to be identified and have since been shown to interact with four of the snRNAs involved in splicing (U1, U2, U4, and U5) where they play a role in snRNP biogenesis (Tan & Kunkel 1966, reviewed in Will & Lührmann 2001). The LSm proteins arrange themselves into two distinct complexes that perform very different roles in eukaryotes. LSm1-7 localizes to the cytoplasm where it associates with the 3’ polyA region of mRNAs to promote their decapping and subsequent degradation by the 5’ to 3’ mRNA decay machinery (Ingelfinger et al. 2002). In contrast, LSm2-8 appears to differ from the LSm1-7 complex only by the exchange of LSm1 for LSm8, but it localizes to the nucleus where it binds the polyU tail of U6 snRNA and promotes U4/U6 base pairing during pre-mRNA splicing (Achsel et al. 1999, Licht et al. 2008).

The Sm and LSm proteins share a common fold known as the Sm-motif, which can be divided into a thirty-two amino acid Sm-1 domain located in the N-terminal portion of the protein, separated by an internal sequence of variable length and composition from a more C-terminal fourteen amino acid Sm-2 domain (Hermann et al. 1995). Together, the Sm motifs generate an oligonucleotide-binding-like barrel composed of five anti-parallel β strands with a helix stacked on top (Kambach et al. 1999). The Sm-1 motif encodes the α-helix and β1, β2, and β3 strands while the Sm-2 motif encodes the β4 and β5 strands (Kambach et al. 1999).
Crystal structures of the Sm protein complexes revealed that the β4 strand of one protein and the β5 strand of a second protein interact through main chain hydrogen bonding, which is stabilized by a cluster of hydrophobic amino acids and a salt bridge (Kambach et al. 1999). More recently, crystal structures of the Sm protein complex bound to U1 or U4 snRNA have shown that each Sm protein interacts in a one-to-one fashion with a single nucleotide of the Sm-binding motif located in the 3’ end of the spliceosomal snRNA, with additional snRNA-specific differences occurring outside of the Sm fold (Weber et al. 2010, Leung et al. 2011).

Despite the progress that has been made in understanding the structural details of Sm protein complex formation and its interaction with various snRNAs, attempts to work with the LSm complexes have been stifled by the inability to efficiently purify sufficient quantities of the complexes. Sequence similarity between the Sm and LSm proteins suggests that globally, the LSm protein complexes probably form very similar structures to that of the Sm protein complex. However, the distinct differences in cellular localization, RNA substrate binding, and RNA processing activity exhibited by the two LSm complexes suggest that there are subtle differences in their structures that lead to their unique functional properties (Zaric et al. 2005). Low-resolution electron microscope structures of the RNA-free human LSm2-8 complex, isolated from purified triple snRNP, have been obtained (Achsel et al. 1999); however, higher-resolution structural work on the LSm complexes awaits a method to effectively and efficiently purify the complexes in a recombinant expression system.

The individual Sm and LSm proteins have been reported to be very unstable when expressed alone (Kambach et al. 1999, Zaric et al. 2005), probably due to the hydrophobic nature of the β4 and β5 strands, which would be exposed to solvent in the absence of neighboring proteins.
One method that was successful in recombinantly expressing stable Sm proteins was the co-expression of pairs of proteins so that they effectively shielded the hydrophobic patches of each other from solvent (Kambach et al. 1999). This strategy was originally employed to solve the crystal structure of the Sm protein sub-complexes and was later adapted for the co-expression and purification of the human LSm subcomplexes (Zaric et al. 2005). Similar to the Sm proteins, Zaric et al. (2005) found that co-expressing the human LSm proteins greatly enhanced the solubility of each protein. For example, when expressed alone, LSm4 and LSm8 were largely insoluble, but when expressed together, the solubility increased 2.6 and 6.5 fold for each protein respectively. The same was observed for LSm5 and LSm6, for which the solubility increased 11.3 and 19.4 fold respectively. Interestingly, increasing the solubility of the protein through co-expression was not limited to co-expression of LSm proteins that physically interact within the fully formed complex. This was demonstrated by the 19.3-fold increase in solubility of LSm5 when co-expressed with LSm3, and the 25.5-fold increase in solubility of LSm6 when co-expressed with LSm7 (Zaric et al. 2005). Further, the co-expression pair appears to have less to do with increasing solubility than the order of the cistrons in the expression vector, since expression of the SmD2D1 heterodimer resulted in approximately 100-fold more protein than expression of the SmD1D2 heterodimer (Zaric & Kambach 2008).

With successful expression and purification of the LSm subcomplexes, Zaric et al. (2005) were able to then reconstitute the human LSm2-8 and LSm1-7 complexes, additionally demonstrating substrate specific binding by each complex in vitro.
While this reconstitution system paved the way for biochemical studies of the LSm complexes, it did not lend itself well to biophysical studies due to a high degree of heterogeneity of the final reconstituted product (Zaric et al. 2005). Further, the method used was quite inefficient, yielding only ~5 mg of reconstituted complex from several liters of bacterial culture expressing each LSm subcomplex (Zaric & Kambach 2008). Much of this inefficiency was a result of promiscuous interactions between LSm proteins within the subcomplexes, which allowed for association of the subcomplexes into higher order hexameric and octameric complexes. These higher order complexes must be disrupted to allow for the correct associations to take place when the purified subcomplexes are mixed together for reconstitution of the heteroheptamer (Zaric et al. 2005). It is this step that results in great loss, with many of the proteins becoming insoluble upon mixing under denaturing conditions. Further, the expression and purification itself is very time consuming, requiring very different expression conditions followed by several purification steps for each subcomplex: Immobilized Metal Affinity Chromatography (IMAC), affinity tag cleavage, IMAC, and finally Anion Exchange Chromatography (AEX). Once the subcomplexes have been mixed for reconstitution, further purification is required, alternating between gel filtration and AEX until a Gaussian peak is observed in the A280 trace, indicating that aberrantly small or large complexes have been removed (Zaric et al. 2005). However, this purification strategy was not effective in removing aberrant complexes of similar size, resulting in a heterogeneous product (Zaric et al. 2005).

The work presented here describes an innovative strategy to recombinantly express all seven LSm proteins from the same co-expression vector, followed by purification of the pre-formed LSm complex under non-denaturing conditions.
This method results in far greater yields of LSm complex of up to ~7 mg per liter of induced bacterial culture for the LSm2-8 complex. Further, the purification scheme is greatly simplified since an in vitro reconstitution step requiring protein unfolding is unnecessary. The complexes produced in this manner resemble those purified from human triple snRNP as well as those obtained by in vitro reconstitution when viewed under the electron microscope. Further, I have obtained a Small Angle X-ray Scattering (SAXS) envelope for the LSm2-8 complex, which has revealed the ~10 kDa LSm4 C-terminal extension that is not visible by EM. Lastly, I have extended this expression/purification technique to include Prp24 and U6 snRNA, resulting in the co-expression and purification of the entire U6 snRNP. Negative stain EM images have demonstrated that our recombinantly expressed and purified U6 snRNP is identical at this low resolution to the U6 snRNP purified from yeast.

4.3 Materials and Methods

4.3.1 Preparation of Yeast Genomic DNA and Total RNA

Yeast genomic DNA was prepared by growing a culture of yeast (strain W303) in five milliliters of liquid YPD at 37°C to an OD595 between 1.5 and 2.5. Cells were harvested and re-suspended in 200 µL lysis buffer (10 mM Tris-HCl pH: 8.0, 100 mM NaCl, 1 mM EDTA, 2% Triton X-100, 1% SDS) prior to addition of 200 µL Phenol:Chloroform (Sigma) and 200 µL 0.5 mm baked Zirconia/Silica beads (BioSpec Products Inc.). Samples were vortexed on high for 30 seconds and then placed on ice for 30 seconds, and this cycle was repeated over six minutes. The lysate was then centrifuged at 13 200 rpm in an Eppendorf 5415D centrifuge at 4°C for five minutes. The aqueous phase was collected and three microliters of 10 mg/mL RNase A (Sigma) was added prior to incubation at 37°C for ten minutes. Two hundred microliters TE (100 mM Tris-HCl, 10 mM EDTA, pH: 8.0) was added and the sample was extracted with 400 µL Phenol:Chloroform.
One milliliter 100% ethanol was added to the aqueous phase and the sample was placed at -80°C for at least 20 minutes. DNA was collected by spinning at 13 200 rpm for 20 minutes at 4°C in an Eppendorf 5415D centrifuge. The DNA pellet was washed once with 70% ethanol and re-suspended in 50 µL TE.

Yeast total RNA was prepared from yeast cells grown as above. One and a half milliliters of culture was harvested and the cells were re-suspended in 300 µL chilled, filtered RNA extraction buffer (50 mM Tris-HCl pH: 7.5, 100 mM NaCl, 10 mM EDTA). Two hundred microliters of 0.5 mm acid washed, baked Zirconia/Silica beads (BioSpec Products Inc.) were added and the sample was vortexed for one minute on the maximum setting. The sample was placed on ice for five minutes and then vortexed for one minute as before. Three hundred microliters of chilled, filtered RNA extraction buffer, 60 µL 10% SDS, and 400 µL acid equilibrated Phenol:Chloroform (5:1, pH: 4.0) (Ambion) were added and the sample was vortexed on maximum speed for one minute prior to centrifugation in an Eppendorf 5415D centrifuge at 13 200 rpm for five minutes at 4°C. The aqueous phase was collected and Phenol:Chloroform extracted twice more before back-extraction with an equal volume of Chloroform (Sigma). The aqueous phase was collected and 40 µL 3 M NaOAc pH: 5.2 and one milliliter 100% ethanol were added. Tubes were inverted several times and then were placed at -80°C for at least 20 minutes. RNA was collected by spinning at 13 200 rpm for 20 minutes at 4°C in an Eppendorf 5415D centrifuge. The RNA pellet was washed once with 70% ethanol, left to air dry for 5-10 minutes, and then was re-suspended in 30 µL 10 mM Tris-HCl pH: 7.5.
4.3.2 Construction of LSm Gene-Containing Co-expression Vectors  Polymerase Chain Reaction (PCR) was used to amplify the LSm genes from one microgram of yeast genomic DNA except for LSm2 and LSm7, which contain an intron, and were amplified from two micrograms of yeast total RNA by Reverse Transcription PCR (RT-PCR). Amplification was performed with gene-specific primers (prepared by Eurofins Genomics) that carried a BamHI restriction site in the forward primer and a NotI restriction site followed by a SalI restriction site in the reverse primer (Table 4.1). Amplified gene products were digested with BamHI and SalI and gel purified with the Qiagen QIAquick Gel Extraction Kit prior to cloning into the BamHI-SalI site of pUC19. Each LSm gene was sequenced by the UNBC Genetics Facility using primers oSDR422 and oSDR423 that annealed to pUC19 on either side of the LSm gene (Table 4.1). The LSm genes were then sub-cloned into the co-expression vector pQlinkH (Addgene plasmid 13667; Scheich et al. 2007) or pQlinkNmod using the BamHI and NotI restriction sites. pQlinkNmod is a modified version of pQlinkN (Addgene plasmid 13670; Scheich et al. 2007) in which the EcoRI-BamHI fragment was removed and replaced by the annealed oligonucleotides oSDR# and oSDR# (Table 4.1).  Co-expression vectors containing all of the LSm genes were constructed by ligase-independent cloning (LIC) using pQlinkH-LSm and pQlinkNmod-LSm vectors as described (Scheich et al. 2007). Briefly, 500 ng of plasmid DNA was digested with 5 U of SwaI or 5 U of PacI in a total volume of 10 µL in the presence of BSA. SwaI digests were incubated at 25°C for three hours while PacI digests were incubated at 37°C for three hours. The restriction enzymes were heat inactivated at 65°C for 20 minutes. 
The digests were then treated with 1.3 U of LIC qualified T4 DNA Polymerase (Novagen) for 30 minutes at 25°C in a 20 µL reaction containing 50 mM Tris-HCl pH: 8.0, 10 mM MgCl2, 5 µg/mL BSA, 5 mM DTT, and 2.5 mM dGTP (SwaI digests) or 2.5 mM dCTP (PacI digests). T4 DNA Polymerase was heat inactivated at 65°C for 20 minutes prior to mixing the two digested plasmids together. The mixes were heated to 65°C for five minutes and then slow cooled to room temperature. Two microliters of 25 mM EDTA was added and five microliters of the reaction was transformed into 50 µL of RbCl2 competent DH5α cells. Transformants were screened by colony PCR using the gene specific forward and reverse primers (Table 4.1) to ensure that the LSm genes from both the SwaI and the PacI digested plasmids were present.

Table 4.1. DNA Oligonucleotides used to amplify the genes used in this work, to sequence the genes, and to modify the pQlinkN vector. DNA sequences are given from 5’ to 3’. The first oligonucleotide listed for each gene is the forward primer and the second is the reverse primer.
Oligo     Target   Sequence (5’ to 3’)
oSDR790   LSm2     GGATATGGATCCATGCTTTTCTTCTCCTTTTTCAAGACTTTAG
oSDR791   LSm2     GGTCTTGTCGACGCGGCCGCTTATTTTCTTTCAGTCATTACCTCCCTTCTGG
oSDR792   LSm3     GGATATGGATCCATGGAGACACCTTTGGATTTATTGAAACTCAATCTCG
oSDR793   LSm3     GGTCTTGTCGACGCGGCCGCTATATCTCCACTGCGCCATCG
oSDR794   LSm4     GGATATGGATCCATGCTACCTTTATATCTTTTAACAAATGCGAAGGG
oSDR795   LSm4     GGTCTTGTCGACGCGGCCGCTTAAAATTCGACCTTTTGTGGAGAAGAGCTG
oSDR796   LSm5     GGATATGGATCCATGAGTCTACCGGAGATTTTGCCTTTGG
oSDR797   LSm5     GGTCTTGTCGACGCGGCCGCTTACAACGCCTCCGTAGGGGTC
oSDR798   LSm6     GGATATGGATCCATGTCCGGAAAAGCTTCTACAGAGGG
oSDR799   LSm6     GGTCTTGTCGACGCGGCCGCCTATATTTTTTGTTCACTGATATACATGACCTGC
oSDR800   LSm7     GGATATGGATCCATGCATCAGCAACACTCCAAATCAG
oSDR801   LSm7     GGTCTTGTCGACGCGGCCGCCTATTTTTGCATATATAGTACATCAGAACCTTCGG
oSDR802   LSm8     GGATATGGATCCATGTCAGCCACCTTGAAAGACTAC
oSDR803   LSm8     GGTCTTGTCGACGCGGCCGCTTATTTTGTCTTTGATTCGTACACTTTTTCCC
oSDR937   Prp24    GGATATGCATCCATGGAGTATGGACATCACGCTAGACCAG
oSDR938   Prp24    GGTCTTGTCGACGCGGCCGCCTACTCTCACCTAGAAACATCTTGCGAAAATCG
oSDR986   U6       CGAATAGAATTCGGAGCTTCGCGAACCTGATGAGTCC
oSDR987   U6       CGAATAAAGCTTTGGGGGAGCTAAGCGGG
oSDR804   pQlink   CACACAGAATTCATTAAAGAGGAGAAAGGATCCAGTCTTC
oSDR805   pQlink   GAAGACTGGATCCTTTCTCCTCTTTAATGAATTCTGTGTG
oSDR422   pUC19    GGATGTGCTGCAAGGCGATT
oSDR423   pUC19    GTTGTGTGGAATTGTGAGCGG

4.3.3 Co-expression and Co-purification of the LSm complexes

The LSm proteins were co-expressed in Rosetta pLysS [F- ompT hsdSB(rB- mB-) gal dcm (DE3) pLysSRARE (CamR)] cells grown in Luria Bertani media that was supplemented with 50 mg/mL Carbenicillin to select for the LSm-containing pQlink vector and 25 mg/mL Chloramphenicol to select for a plasmid carrying the accessory tRNAs for rare codon usage. A starter culture was grown overnight at 37°C, 200 rpm to an OD595 of less than 2 and this was used to inoculate a one liter culture. The one liter culture was grown at 37°C, 200 rpm until the OD595 reached ~ 0.6 – 0.8 and then the culture was induced with isopropyl 1-thio-β-D-galactopyranoside (IPTG) (Amresco) at a final concentration of one millimolar.
The cells were left in the shaker at 37°C for an additional four hours and were then harvested and stored at -80°C for future processing.

Each one liter cell pellet was resuspended in 50 mL of lysis buffer (20 mM HEPES-NaOH pH: 7.5, 500 mM NaCl, 5 mM β-Mercaptoethanol, 10 mM EDTA, and 20 mM Imidazole) containing two dissolved tablets of Roche Complete, EDTA-free Protease Inhibitor Cocktail. The cell lysate was sonicated ten times in 15 second bursts at 15-18 W with 15 second pauses on ice in between. Streptomycin sulphate (Sigma) was added at 1% w/v and the lysate was then centrifuged at 25 000 x g in a JA25.50 rotor (Beckman Coulter Avanti HP-20 XPI) for 30 minutes at 4°C. Soluble material was filtered through a 0.45 µm syringe filter followed by a 0.22 µm nylon syringe filter. Note that this filtration step is required for purification by FPLC, but can be left out in a batch bind purification.

One half milliliter of Roche Complete His-tag Purification Resin per one liter cell pellet was equilibrated with two column volumes of wash buffer (20 mM HEPES-NaOH pH: 7.5, 500 mM NaCl, 5 mM β-Mercaptoethanol, and 20 mM Imidazole) by resuspending the resin in wash buffer followed by a two minute spin at 700 x g at 4°C in the Beckman Coulter Allegra X-12R centrifuge. Excess buffer was removed and the filtered lysate was then incubated with the resin at 4°C with gentle rocking for 20 minutes. The mixture was then centrifuged as above and the unbound lysate was removed. The resin was washed ten times by resuspending it in two column volumes of wash buffer, centrifuging for two minutes at 700 x g at 4°C, and then removing the excess buffer. The LSm complex was eluted from the resin by repeating the wash steps ten times, but with one column volume of elution buffer (20 mM HEPES pH: 7.5, 500 mM NaCl, 5 mM β-Mercaptoethanol, and 250 mM Imidazole).
The eluted complex was quantified with a Nanodrop ND-1000 spectrophotometer and was then incubated with His6-TEV Protease (1:100 milligrams TEV Protease to milligrams LSm Complex) for 18 hours at room temperature in His column two wash buffer (20 mM HEPES-NaOH pH: 7.5, 500 mM NaCl, 5 mM β-Mercaptoethanol, and 60 mM Imidazole).

The dialyzed LSm complex was then passed over a 1 mL HisTrap column (GE Healthcare) using the AKTA FPLC System and Unicorn software version 5.01 to remove the His6-TEV Protease, cleaved His6-tag, and any remaining His-tagged LSm complex. In this case, the flow through was collected and then concentrated in a YM-30 centrifugal filter unit (Millipore) to approximately five milliliters (do not exceed 10 mg/mL) according to the manufacturer’s protocol. The concentrated flow through was then passed over a Tricorn High Performance Superdex 200 Gel Filtration Column (Amersham) in 1 mL samples. The peak fractions from each Superdex 200 run were pooled and the amount of LSm complex was quantified with the Nanodrop ND-1000 spectrophotometer. When a more concentrated sample was required, the gel filtration sample was concentrated in a YM-30 centrifugal filter unit (Millipore). The LSm proteins present in our sample were verified by mass spectrometry of the bands excised from a 12% SDS polyacrylamide gel by Dr. Richard Fahlmann at University of Alberta.

In addition to the LSm expression construct, an LSm2-8/Prp24 and a U6 snRNP expression construct that included LSm2-8, Prp24, and U6 snRNA were built. Prp24 was incorporated into these expression plasmids using the same strategy used to build the LSm construct (see above). To ensure homogeneous expression of U6 snRNA, the U6 snRNA gene was flanked by a gene encoding the hammerhead ribozyme at the 5’ end and an E. coli alanine tRNA at the 3’ end. This was incorporated into the expression construct as described previously.
The hammerhead ribozyme moiety cleaved itself from the 5’ end of U6 snRNA cotranscriptionally while the 3’ tRNA sequence was removed by endogenous RNase P following transcription. Primer pairs for amplification of Prp24 and the hammerhead/U6 snRNA/tRNA are shown in Table 4.1.

4.3.4 Expression and Purification of Tobacco Etch Virus Protease

The catalytic domain of Tobacco Etch Virus (TEV) Protease was recombinantly expressed in BL21 (DE3)-RIL from the expression vector pRK793 (Addgene plasmid 8827; Kapust et al. 2001). The TEV protease gene encoded on the plasmid included an N-terminal His-tag for purification along with the mutation S219V, which resulted in production of a far more stable and efficient protease (Kapust et al. 2001). One liter of cells was grown to an OD595 ~ 0.6 at 37°C and TEV protease expression was induced with 1 mM IPTG. Cells were shifted to 30°C for four hours and were then harvested and stored at -80°C. The cell lysate was prepared by resuspending the cell pellet in 10 mL of binding buffer (50 mM Tris-HCl pH: 8, 5 mM β-mercaptoethanol), followed by sonication ten times in 15 second bursts at 15-18 W with 15 second pauses on ice in between. Streptomycin sulphate (Sigma) was added at 1% w/v and the lysate was then centrifuged at 3000 rpm in a JA25.50 rotor (Beckman Coulter Avanti HP-20 XPI) at 4°C for one hour. The soluble material was collected and filtered through a 0.2 µm sterile nylon syringe filter. TEV protease was purified by batch binding to Ni2+-NTA resin (Thermo Scientific) according to the manufacturer’s protocol. It was then eluted from the resin in elution buffer (50 mM Tris-HCl pH: 8, 300 mM NaCl, 250 mM Imidazole) and dialyzed into 50 mM Tris-HCl pH: 8, 5 mM β-mercaptoethanol, 25% glycerol for storage at -80°C.

4.3.5 Biophysical Characterization of Purified LSm complexes

Protein complex samples were prepared as outlined above and were then shipped to Dr.
Calvin Yip at University of British Columbia, who obtained the negative stain EM images at a magnification of 98 000 times, and to Drs. Sean McKenna and Trushar Patel at University of Manitoba for SAXS data collection and ab initio modeling. Samples were supplied in 20 mM HEPES-NaOH pH: 7.5, 500 mM NaCl and 5 mM β-Mercaptoethanol. SAXS samples were subjected to dynamic light scattering before and after SAXS data collection to ensure that the X-rays did not damage the sample, and that the LSm complex was not self-associating into high order aggregates.

4.4 Results

In order to reconstitute the entire free U6 snRNP from in vitro transcribed U6 snRNA and recombinantly expressed protein components for biophysical characterization, I attempted to replicate the human LSm2-8 expression and purification scheme developed in the Kambach lab using the yeast versions of these genes. Unfortunately, the LSm4/8 proteins did not over-express to detectable levels in the soluble fraction following induction with 1 mM IPTG or lactose, regardless of the temperature of induction or the bacterial expression strain tested. Thus it was clear that a critical first step toward U6 snRNP reconstitution was to develop a new method to prepare sufficient quantities of high purity LSm complex. Given the Kambach lab’s success in increasing the solubility of the human LSm proteins through co-expression, I set out to co-express all seven yeast LSm proteins simultaneously with the goal of purifying a pre-formed LSm complex that could be isolated from crude lysate under non-denaturing conditions. This system was expected to increase the yield and purity of the complex since the inefficient denaturation/reconstitution step was unnecessary.

To co-express all seven LSm proteins, I chose to clone the genes into the pQlink expression vector given the ease with which an unlimited number of proteins could be cloned into the vector using a ligase independent cloning procedure (Scheich et al. 2007).
Each gene within the vector was flanked by its own transcriptional promoter (Ptac) and terminator (λ t0) sequences, and therefore the LSm genes were transcribed into individual mRNA transcripts as opposed to one polycistronic mRNA. Prior to introduction of the LSm genes, the vector itself was modified slightly to remove additional regions of sequence located between the ribosome-binding site and the BamHI restriction enzyme cleavage site located in the multiple cloning region of the vector (Fig 4.1). This region of sequence included a start codon and a short stretch of nucleotides that would be translated into additional amino acids at the N-terminus of each protein. To ensure that the additional residues on each protein did not interfere with subsequent biophysical experiments, I removed this stretch of sequence, generating a vector consisting of the optimal positioning of the ribosome binding site seven nucleotides upstream of the start codon for each gene. Preceding one of the seven LSm genes was a His6-tag followed by a Tobacco Etch Virus (TEV) protease cleavage site that left two amino acids at the N-terminus following cleavage. In this way, the proteins expressed from the pQlink vector would only be isolated during purification if they associated with the tagged LSm protein. The TEV protease cleavage site was incorporated to allow for removal of the tag during later stages of protein complex purification.   Each of the yeast LSm genes was amplified from yeast genomic DNA by PCR with gene specific primers, except for LSm2 and LSm7, which both contain an intron, and were therefore amplified from yeast total RNA by RT-PCR. After confirming the sequence in pUC19, the genes were sub-cloned into the modified pQlink vector and from there they were combined into a single expression vector using the ligase independent cloning procedure that was developed specifically for this expression system (Fig 4.1; Scheich et al. 2007). 
Briefly, this involves digesting a pQlink vector containing the first gene of interest with SwaI and a second pQlink vector containing a second gene of interest with PacI. The pQlink vector contains a unique SwaI restriction enzyme site 3’ of the transcriptional terminator for the gene of interest, while a PacI site appears 5’ of the transcriptional promoter for the gene of interest and 3’ of the SwaI site (Fig 4.1). Following restriction enzyme digest, both plasmids are incubated with T4 DNA Polymerase, which exhibits a 3’ to 5’ exonuclease activity in the absence of free nucleotides. The pQlink plasmids have been designed in such a way that T4 DNA Polymerase treatment in the presence of dGTP (SwaI) or dCTP (PacI) will generate a fourteen nucleotide overhang in the SwaI digested plasmid that is perfectly complementary to the fourteen nucleotide overhang generated in the PacI digested plasmid. Consequently, the digested, T4 DNA Polymerase treated plasmids can be annealed simply by mixing, heating, and slow cooling. The annealed plasmid is transformed into DH5α where it is replicated with high fidelity by the bacterial cellular machinery. It should be noted that T4 DNA Polymerase treatment destroys the restriction site that was cleaved; however, the sites that were not used remain intact, which means that the new vector containing both genes of interest contains a single SwaI site and two PacI sites just as the original vector did (Fig 4.1). Thus the procedure can be repeated to introduce, in theory, an unlimited number of genes into the same expression vector, although in practice there is probably some limitation that I have not yet encountered.

With all seven genes present in one vector (Fig 4.2), optimal expression conditions were explored.
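The chew-back step of the ligase independent cloning procedure described above can be illustrated with a small sketch. It models the T4 DNA Polymerase exonuclease as trimming a strand from its 3’ end until it reaches the first base matching the single supplied dNTP, and then checks that the two exposed overhangs are complementary. The sequences below are invented for illustration and are not the real pQlink sequences; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def chew_back(strand_5to3: str, dntp: str) -> tuple[str, str]:
    """Trim bases from the 3' end until the terminal base equals `dntp`.

    Returns (retained strand, removed 3' segment), both written 5' to 3'.
    """
    strand = strand_5to3.upper()
    end = len(strand)
    while end > 0 and strand[end - 1] != dntp:
        end -= 1
    return strand[:end], strand[end:]


# ASSUMPTION: toy sequences, each ending in a 14-nt stretch that lacks
# the base supplied as dNTP (G for the SwaI arm, C for the PacI arm).
swa_strand = "CCTAG" + "ACTTCATACTTACA"
pac_strand = "AAGTC" + "TGTAAGTATGAAGT"

swa_kept, swa_removed = chew_back(swa_strand, "G")
pac_kept, pac_removed = chew_back(pac_strand, "C")

# Removing a 3' segment exposes its reverse complement on the opposite
# strand, so the two overhangs can anneal when the removed segments are
# reverse complements of each other.
overhangs_anneal = swa_removed == reverse_complement(pac_removed)
print(len(swa_removed), len(pac_removed), overhangs_anneal)
```

The sketch ignores the end structure left by each restriction enzyme and the kinetics of the polymerase; it only shows why supplying a single nucleotide defines the overhang length.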
As a step toward choosing the most suitable bacterial strain for expression of the LSm proteins, I conducted a rare codon analysis that revealed that all of the yeast LSm proteins contain a high percentage of low frequency codons, ranging from 10% to as high as 20%. A codon adaptation index (CAI) of 1.0 is ideal while a CAI of > 0.80 is generally considered to be good. Given that the CAI of the yeast LSm proteins fell between 0.55 and 0.68, I chose to express the proteins in Rosetta pLysS, a bacterial strain that carries the rare codon tRNA genes on the same plasmid as the T7 lysozyme gene. The yeast LSm proteins over-expressed very well in Rosetta pLysS when induced with 1 mM IPTG at an OD595 between 0.5 and 0.8 (Fig 4.2). When cells were left to express overnight, LSm4 underwent substantial cleavage at 37°C, but much less so at 25°C. Expression for four hours at 37°C yielded a similar expression profile to that seen at 25°C overnight (i.e. more full length LSm4; Fig. 4.3), making this a flexible expression system for this set of proteins.

Since there were no previous reports of yeast LSm protein purification in the literature, it was not clear which LSm protein would be most suitable to carry the His6-tag. Two important criteria had to be met to achieve the goals of the purification system: the tag must be accessible in order to interact with the immobilized Ni2+ ions during the purification, and the TEV protease cleavage site must be accessible for removal of the tag following elution from the Ni2+ resin. Consequently I made several versions of the protein expression vector in which different LSm proteins bore the N-terminal tag and I tested these for both tag and TEV protease accessibility. In some cases the tag was not accessible at all (LSm4) while in other cases the tag was accessible for purification, but could not be efficiently cleaved (LSm5).
When the tag was placed at the N-terminus of LSm6, it was both accessible and could be efficiently cleaved by TEV protease. As a result, the tagged LSm6 construct was taken forward for optimization of purification.

Figure 4.1. Construction of the LSm2-8 co-expression vector. (a) The original pQlinkN plasmid sequence between the EcoRI and BamHI sites (top) and the modified vector sequence (bottom). (b) Construction of the final expression vector requires three rounds of SwaI/PacI digest, T4 DNA polymerase treatment, and annealing. (c) A close-up of the PacI-PacI fragment of the final vector construct showing various features of the plasmid as indicated by the key below the plasmid.

Figure 4.2. Expression of the LSm proteins from pQlink. (a) PCR products were separated in a 1% Agarose Ethidium Bromide gel to demonstrate the presence of each LSm gene amplified with gene specific primers from the same expression vector. (b) 12% SDS-PAGE gel showing expression of the LSm genes in Rosetta pLysS: (I) Insoluble material, (S) Soluble material, (H) His-spin column purified complex. Expression was carried out for 18 hours at the temperature indicated. The LSm proteins identified in each band by mass spectrometry are listed to the right.

The purification involved four main steps (Fig 4.3). First, the crude lysate collected from the Rosetta pLysS cells was passed over a Ni2+ resin, to which the tagged LSm6 bound. The expectation was that the remaining six LSm proteins would co-purify, either through direct physical contact with the tagged protein, or through indirect contact with the tagged protein via other members of the LSm complex. The bacterial proteins present in our expression system were then washed away and the bound proteins were eluted from the resin.
The second step was a dialysis that served two purposes: first, it removed the high concentration of imidazole that was present following elution of the complex from the Ni2+ resin (critical for step three), and second, TEV protease was introduced to cleave the tag from LSm6. By incorporating a His6-tag on the TEV protease, the cleaved LSm complex was then isolated from both the TEV protease and the cleaved tag by passing the dialyzed sample over a second Ni2+ resin. This time the tag and TEV protease bound to the Ni2+ resin while the cleaved LSm complex was collected in the flow through. A final polishing gel filtration step was included to ensure that no aberrantly formed complexes were present. Fractionation of the sample following gel filtration on a 12% SDS-PAGE gel yielded a number of bands that were subjected to mass spectrometry. This resulted in the unambiguous identification of all seven LSm proteins, and three LSm4 C-terminal truncation products. Cleavage of LSm4 could be greatly reduced by performing the induction at a lower temperature for a greater length of time (25°C overnight), or by inducing at 37°C for only four hours. Both conditions yielded similar amounts of protein at the end of the purification (~ 7 mg of protein per liter of culture).

Figure 4.3. LSm2-8 purification. (a) Four step purification scheme. (b) FPLC chromatograms showing the A280 trace for the first step (IMAC, top), third step (IMAC, middle), and fourth step (Gel Filtration, bottom). (c) A 12% SDS-PAGE gel run for 2.5 hours at 200V to assess the quality of the purification. (Pre) Pre-induction, (Post) Post-induction, (FT) First IMAC Flow Through, (E1) First IMAC Elution, (D) Dialyzed Sample, (FT2) Second IMAC Flow Through (from step 3), (SEC) Gel Filtration. (d) 12% SDS-PAGE gel run for two hours to assess the composition of each fraction across the main peak from the gel filtration A280 trace.
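The codon adaptation index (CAI) used earlier in this section to choose the expression strain is, in its usual definition, the geometric mean of the relative adaptiveness of each codon in a gene, where relative adaptiveness is the usage of a codon divided by the usage of the most common synonymous codon in a reference set. The sketch below illustrates that calculation only; the reference counts are invented toy numbers covering two amino acids, not real E. coli codon usage, and it is not the tool used for the analysis reported here.

```python
import math

# ASSUMPTION: hypothetical codon counts for a toy reference set.
REFERENCE_COUNTS = {
    "K": {"AAA": 75, "AAG": 25},
    "R": {"CGT": 60, "AGA": 6},
}


def relative_adaptiveness(reference: dict) -> dict:
    """w = count of codon / count of the most used synonymous codon."""
    weights = {}
    for counts in reference.values():
        best = max(counts.values())
        for codon, n in counts.items():
            weights[codon] = n / best
    return weights


def cai(coding_seq: str, weights: dict) -> float:
    """Geometric mean of w over the codons that have a weight."""
    seq = coding_seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


W = relative_adaptiveness(REFERENCE_COUNTS)
print(cai("AAACGT", W))  # only preferred codons
print(cai("AAGAGA", W))  # only rare codons, so a lower score
```

A gene built only from the preferred codons scores 1.0, and every rare codon pulls the geometric mean down, which is why values such as 0.55 to 0.68 point toward a host that supplies rare tRNAs.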
The C-terminal truncation of LSm4 was unexpected, and since I was aiming to characterize the entire LSm2-8 complex, both alone and within free U6 snRNP, great effort was put into adjusting the expression and purification conditions in order to minimize the extent of cleavage. As stated above, cleavage could be greatly reduced by either reducing the time or temperature of induction; however, there was still some cleavage present that varied from one purification to another. This variation appeared to arise following cell lysis. Further optimization of the purification conditions was therefore necessary, and indeed proved to be critical for the successful purification of mostly intact full length LSm4 (~ 90% or higher). These conditions included minimizing sample handling time between cell lysis and washing of the LSm2-8 complex-bound Ni2+ resin to reduce accessibility to endogenous proteases, including up to 10 mM EDTA in the lysis and wash buffer to chelate Mg2+ ions and so minimize protease activity, and ensuring that the cells and cell lysate were kept on ice and in chilled buffers at all times during cell harvest and lysis, again to minimize protease activity. To reduce sample handling time I turned to a batch-binding strategy for the first purification step and used a Ni2+ resin that is compatible with up to 10 mM EDTA.

Having successfully co-expressed and purified the LSm2-8 complex, I then repeated the procedure outlined above for the LSm1-7 complex and the entire U6 snRNP (Fig 4.4). Two surprising observations were made. First, the LSm4 C-terminus remains stably intact in both the LSm1-7 and U6 snRNP complexes, suggesting that the LSm4 C-terminus is stabilized by other proteins in these particles. In fact, purification of the LSm1-7 and U6 snRNP could be achieved by standard procedures using a Ni2+-NTA column in the absence of EDTA at room temperature without resulting in LSm4 C-terminal cleavage.
Thus the C-terminus of LSm4 is protected both inside the cells and within the crude cell lysate in these complexes. Second, the amount of LSm1-7 complex retrieved from the same volume of bacterial culture was substantially less than that of the LSm2-8 and U6 snRNP complexes, suggesting that expression of the LSm1-7 complex was toxic to the cells. Indeed, the optical density at 595 nm prior to and following induction with IPTG indicated that protein expression almost immediately halted cell division. For example, cells induced at an OD595 of ~ 0.6 did not reach an OD595 of more than 1.5 after a four hour induction at 37°C. Consequently, twice as much cell culture has to be induced to obtain approximately the same protein yield as observed for the LSm2-8 and U6 snRNP complexes.

Figure 4.4. U6 snRNP, LSm1-7, and LSm2-8 purification. 12% SDS-PAGE gels run at 200V for 2.5 hours showing the purity of various complexes throughout the purification procedure. U6 snRNP purification on the left: (Pre): Pre-induction, (Post): Post-induction, (Sol): Soluble Material, (FT): Flow Through, (E): Elution, (Dial): Dialyzed, (SEC E): Size Exclusion Chromatography Elution. Complexes isolated after the first IMAC step of the purification on the right. Protein band identities and the 10 kDa and 70 kDa markers are indicated in the middle. Note the three LSm4 C-terminal cleavage products in the LSm2-8 purification, but not the LSm1-7 or U6 snRNP purifications.

Depending on the downstream applications that the purified complex will be subjected to, tag removal might be unnecessary and the one-step purification is therefore more than adequate since the first batch-bind IMAC step is capable of removing essentially all unwanted proteins from the sample (Fig 4.4b).
However, if tag removal is important, for example to promote contacts within growing crystals for structure determination, then the subsequent IMAC step is required following tag-removal to separate tagged complexes from complexes that no longer bear the tag. The final gel filtration step is often unnecessary; however, it does serve an analytical purpose in that it allows for the identification of a heterogeneous sample as evidenced by multiple peaks or a non-Gaussian distribution across a single peak in the A280 trace. However, like the Kambach system, the gel filtration step here is unable to distinguish between correctly formed heteroheptameric complexes and aberrant complexes of similar size.

Recombinant expression of U6 snRNP was not as straightforward as the LSm complex purifications, since it was important to recombinantly express U6 snRNA in the presence of the LSm2-8 proteins and Prp24 to ensure that all components of the snRNP assembled into one pre-formed complex for purification. The expression vector utilized here contained a transcriptional promoter and terminator sequence flanking the gene of interest; however, there are other important sequence elements, such as the ribosome binding site, that are included for protein expression that are undesirable in the transcript when expressing an RNA. Further, even if the vector had been manipulated to remove these additional nucleotides from the RNA, the promoter and terminator are often ‘leaky’, resulting in the transcription of RNAs with heterogeneous ends. Since the 3’-end of U6 snRNA is critical for association of the LSm complex, it was essential to design a U6 snRNA construct that would result in a final RNA product with homogeneous ends bearing the appropriate functional group.

To address this issue, the gene encoding U6 snRNA was flanked by a 5’ hammerhead ribozyme (HH) sequence and a 3’ tRNA sequence (Fig 4.5). Transcription of this cassette was expected to yield an HHU6tRNA product.
The HH moiety folded co-transcriptionally and very efficiently cleaved itself from U6, generating a molecule of U6tRNA (Fig 4.5). This molecule was then recognized by the endogenous RNase P as a pre-tRNA in which U6 snRNA mimicked the 5’ leader sequence, and U6 snRNA was cleaved from the tRNA. Importantly, this cleavage resulted in the generation of the correct 3’-OH group required for association of the yeast LSm2-8 complex with U6 snRNA. This same construct can be used in vitro to generate a pool of U6 snRNA with homogeneous ends through the addition of in vitro transcribed M1 RNA from bacterial RNase P (Fig 4.5). Importantly, although we have been unable to detect full length U6 snRNA in the purified U6 snRNP, the fact that Prp24 has co-purified with the LSm complex indicates that U6 snRNA was expressed and bound by both Prp24 and the LSm proteins, since Prp24 does not co-purify in the absence of U6 snRNA. This result suggests that the entire U6 snRNP associates into a stable particle in vivo; however, the presence of nucleases within the cells results in degradation of accessible regions of the RNA molecule. The implications of this observation will be addressed in the discussion.

Negative stain EM of the LSm complexes, both purified from human triple-snRNP and reconstituted from the recombinantly expressed and purified human LSm subcomplexes, shows that the LSm proteins assemble into a torus; that is, they interact with each other in such a way as to create a hole in the center through which the RNA substrate interacts with the protein complex (Achsel et al. 1999, Zaric et al. 2005). Thus it was critical for me to show that the complex isolated in the co-expression/purification system developed here also forms these tori, as opposed to simply interacting in some more globular, and presumably non-functional, form. To address this, we subjected the purified LSm2-8 complex to negative stain EM at a magnification of 98 000 times (Fig 4.6).
Indeed, my purified complexes look identical by EM to the ones reported previously in the literature, increasing my confidence in the method that I developed for purification of pre-formed recombinant protein complexes. When more than 1500 of these complexes were binned and averaged, the central lumen was more pronounced, although the resolution around the edges was reduced compared to the individual particles observed in the raw image. At this resolution I was unable to determine how many protein components were present in each complex. Negative stain EM of purified U6 snRNP revealed that in the presence of Prp24, the LSm complex still assembled into a torus shaped complex.

Figure 4.5. U6 snRNA transcription cassette. (a) A schematic of the in vitro U6 snRNA transcription cassette including a T7 transcription promoter, a hammerhead ribozyme (HH), U6 snRNA, and the Alanyl tRNA. The in vivo construct is identical except that it is in pQlink rather than pUC19, and does not contain the T7 promoter. (b) In vitro transcription from the construct shown in (a). Products generated in the absence (-) and presence (+) of in vitro transcribed M1 RNA from RNase P were separated in an 8% polyacrylamide (7M Urea) gel.

Figure 4.6. Negative stain electron microscopy of the LSm2-8 and U6 snRNP complexes. The raw images of LSm2-8 (a) and U6 snRNP (c) reveal the torus shape of the LSm complex at a magnification of 98 000 times. U6 snRNP is indicated by the black arrows. (b) LSm2-8 complexes were binned into 50 different groups and averaged.

The highest resolution achieved so far for any of the LSm complexes is that obtained through negative stain EM. To improve the resolution of the yeast LSm2-8 complex, we subjected our purified particles to a solution-based biophysical technique, small angle X-ray scattering (SAXS) (Fig 4.7).
Two important properties of the sample must be considered before proceeding with this technique. First, it was important to determine whether or not the SAXS experiment damaged the sample. Second, since this is an averaging strategy, it was critical to ensure that the LSm complexes were not aggregating in solution in order to build confidence that the data that were collected represented individual LSm complexes and not higher order aggregates. Dynamic light scattering before and after data collection indicated that the X-rays did not damage the sample, since the volume distribution versus the hydrodynamic radius of the particles was essentially the same before and after the SAXS experiment (Fig 4.7a). Dynamic light scattering of samples of LSm complex ranging from 2 – 10 mg/mL indicated that aggregation was not a concern since the hydrodynamic radius of the particles did not change substantially as the concentration of sample increased (Fig 4.7b). If aggregation were occurring, then the hydrodynamic radius would have increased as the sample became more concentrated. Thus the LSm complex was a very well behaved SAXS sample.

Figure 4.7. Small angle X-ray scattering envelope for yeast LSm2-8. (a) Dynamic light scattering before (solid curve) and after (dashed curve) SAXS data collection, indicating that the sample suffered minimal damage throughout the experiment. (b) Dynamic light scattering as a function of concentration, showing that the sample was not aggregating. (c) Merged SAXS data from multiple concentrations of sample. (d) Paired distance distribution representing the merged data from (c). (e) Ab initio SAXS envelope constructed from the SAXS data. The figure to the right is rotated 90° to the top of the page relative to the figure on the left.
To generate ab initio models of the LSm complex, the SAXS data collected independently at multiple concentrations ranging from 5–10 mg/mL were merged in order to avoid analysis of a potential outlier at a single concentration (Fig 4.7c). The merged scattering data were then used to produce a paired distance distribution function, a histogram of the distances between all electron pairs in the sample, in order to estimate the SAXS envelope (Fig 4.7d). Given the torus shape of the LSm2-8 complex observed by EM, we initially expected to see a Gaussian paired distance distribution of our SAXS data. Instead we saw a mainly Gaussian distribution with a tail extending to larger distances, indicating that a small number of inter-atomic distance vectors were larger than those found in the main distribution (Fig 4.7d). However, this was actually consistent with the presence of the additional C-terminal tail on LSm4. That is, distances measured from the edge of this extension across the main core of the complex would be larger than those measured across only the main core, from one edge to the other.

The scattering plot can be used to calculate two parameters that are related to the size of the particle: the maximum particle dimension (Dmax) and the radius of gyration (r(G)). The Dmax of our data was 12.4 nm and the r(G) was 3.68 ± 0.02 nm. These parameters were used to build 16 different ab initio models of LSm2-8, which yielded a Dmax of 12.5 ± 0.03 nm and an r(G) of 3.67 ± 0.02 nm. The chi and normalized spatial discrepancy (NSD) values, which provide a measure of how closely the ab initio models resemble each other, were 1.12 and 0.53 ± 0.01, respectively, indicating that all of the models that were constructed were extremely similar to each other. One of these models is shown above (Fig 4.7e). In addition to what I assume to be the LSm4 extension, the central lumen of this complex was clearly visible.
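The relationship between particle shape, the paired distance distribution, Dmax and the radius of gyration can be illustrated with a small numerical sketch. The example below is not the analysis pipeline used in this work (experimental distributions are derived from the scattering curve, not from coordinates). It simply builds two hypothetical point models, a bare ring and the same ring with a short protruding tail, and computes the pairwise distances directly. All function names, point counts and dimensions are assumptions chosen for illustration, with lengths in arbitrary units.

```python
import numpy as np


def pair_distances(coords):
    """Return all unique pairwise distances for an (N, 3) array of points."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(coords), k=1)  # each pair counted once
    return dist[upper]


def radius_of_gyration(coords):
    """Root-mean-square distance of equally weighted points from their centroid."""
    centroid = coords.mean(axis=0)
    return float(np.sqrt(((coords - centroid) ** 2).sum(axis=1).mean()))


def toy_ring(n_points=70, radius=3.0):
    """Hypothetical ring of points in the xy-plane (a crude stand-in for a torus)."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    return np.column_stack(
        [radius * np.cos(angles), radius * np.sin(angles), np.zeros(n_points)]
    )


def toy_ring_with_tail(n_points=70, radius=3.0, tail_points=10, tail_length=5.0):
    """Same ring plus a straight tail pointing radially outward along +x."""
    ring = toy_ring(n_points, radius)
    step = tail_length / tail_points
    x = radius + np.linspace(step, tail_length, tail_points)
    tail = np.column_stack([x, np.zeros(tail_points), np.zeros(tail_points)])
    return np.vstack([ring, tail])


ring = toy_ring()
tailed = toy_ring_with_tail()

dist_ring = pair_distances(ring)
dist_tailed = pair_distances(tailed)

# Dmax is the largest pairwise distance in each model.
dmax_ring = float(dist_ring.max())
dmax_tailed = float(dist_tailed.max())

# Histogram of pairwise distances: a discrete analogue of the distribution.
bins = np.linspace(0.0, dmax_tailed, 23)
hist_ring, _ = np.histogram(dist_ring, bins=bins)
hist_tailed, _ = np.histogram(dist_tailed, bins=bins)

print("Dmax (ring, ring + tail):", round(dmax_ring, 2), round(dmax_tailed, 2))
print("Rg   (ring, ring + tail):",
      round(radius_of_gyration(ring), 2), round(radius_of_gyration(tailed), 2))
print("Pairs beyond the ring diameter:", int((dist_tailed > dmax_ring + 1e-9).sum()))
```

In this toy model only the pairs that involve a tail point can exceed the ring diameter, so the tail adds a sparsely populated shoulder at large distances and increases both Dmax and the radius of gyration. This is the same qualitative signature described above for the LSm4 extension.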
While SAXS has revealed some features of the complex that were not visible by EM (i.e. the LSm4 C-terminal tail), the details of each component of the complex and how they interact with each other were still not clear at this resolution. I therefore attempted to crystallize the LSm2-8 complex to obtain an atomic resolution structure, and while I was able to generate crystals under a variety of different conditions, the crystals diffracted to only very low resolution. In the meantime, another group published the crystal structure and I have consequently abandoned efforts on this front. The SAXS envelope that we have obtained contains a prominent region of electron density that was not observed in the crystal structure (Fig 4.8). In particular, this density was found near the protrusion that I have assumed to be the C-terminal extension to LSm4. When the crystal structure was modeled in the SAXS envelope as shown, the LSm3 protrusion fits well inside the electron density. The extra density shown near LSm4 could possibly be accounted for by both the C-terminal 94 amino acids of LSm4 and C-terminal 13 amino acids of LSm8 that were removed for crystallization (Zhou et al. 2014). Notably, these structures indicate that the C-terminal half of LSm4 extends out the side of the main torus shaped core, rather than sitting on top of it. However, the functional relevance of this arrangement is currently unknown. It is possible that this region of LSm4 interacts with Prp24 in free U6 snRNP to promote U4/U6 duplex formation.

Figure 4.8. The LSm2-8 crystal structure fits within the SAXS envelope with good agreement. The recently published LSm2-8 crystal structure (PDB Accession Number 4M78) is shown in cartoon inside the surface representation of the LSm2-8 SAXS envelope obtained in this work. LSm proteins are labeled.
4.5 Discussion

Our initial attempts to reconstitute the LSm complex from recombinantly expressed LSm protein sub-complexes using the method developed by the Kambach lab for the human LSm protein complexes failed since I was unable to obtain soluble protein for subcomplexes containing LSm4. Notably, a recent paper also working toward yeast LSm complex expression and purification reported that the same subcomplexes failed to be expressed in their system, which used a different set of expression vectors and bacterial expression strains (Zhou et al. 2014). Thus there appears to be some fundamental difference between the human and yeast proteins that prevents subcomplex expression of the yeast versions, despite the high degree of sequence and structural similarity between the human and yeast proteins. Notably, LSm4, which carries a C-terminal extension that essentially doubles the size of the protein compared to the other LSms, is the most divergent in primary sequence, with much of this difference occurring in the C-terminal half of the protein (Achsel et al. 1999, Reijns et al. 2008). The C-terminal extension is required for aggregation of LSm4 and any proteins associated with it in vivo to generate the cytoplasmic foci at which mRNA degradation occurs (Reijns et al. 2008). Perhaps the expression of the LSm4-containing subcomplexes resulted in heavy aggregation via this C-terminal extension and consequently very little soluble material was generated in the recombinant expression systems.

If the LSm4-containing subcomplexes aggregate so heavily in the bacterial expression system, then what prevents this aggregation from occurring when the entire complex is recombinantly expressed? To answer this, recall that the C-terminus of LSm4 is susceptible to cleavage in the LSm2-8 recombinant expression system, but is fully protected from cleavage in the LSm1-7 and U6 snRNP expression systems (Fig 4.4).
Moreover, with appropriate handling of the samples to minimize protease activity throughout the expression and purification of LSm2-8, cleavage of the C-terminal region of LSm4 can be greatly reduced. These observations suggest that the C-terminal portion of LSm4 is protected (i.e. buried) in the LSm1-7 and U6 snRNP complexes, but is somewhat exposed and therefore accessible to endogenous proteases when the LSm2-8 complex is expressed. However, in the subcomplex expression system, the C-terminus might be even more exposed, resulting in very heavy aggregation and essentially completely insoluble protein.

In an attempt to work around the insoluble subcomplex problem, I have now shown that the entire yeast LSm complex can be expressed from a single expression vector to yield a pre-formed complex that is readily purified under non-denaturing conditions. The purification scheme presented here is advantageous compared to the Kambach purification scheme for a number of reasons. First, it allows for the purification of a preformed protein complex under non-denaturing conditions, and consequently there is no need to unfold/refold the proteins for reconstitution of the complex. Second, the sample handling time has been greatly reduced, since many of the purification steps required in the Kambach purification are not necessary in our protocol. Third, the amount of reagents required has been greatly reduced, but at the same time I have increased the yield and purity substantially. Fourth, depending on the downstream applications, the one-step purification is often sufficient while the four-step purification can be employed when removal of the tag is desirable. However, above all of these improvements, the major advantage of the expression/purification system developed here is the co-expression of RNA to recombinantly generate preformed RNA-protein complexes that are suitable for biophysical studies.
When adapting this system for expression and purification of other RNA-protein complexes, there are a number of properties of the complex that need to be considered. Recombinant expression of the protein components should not require any special attention outside of the normal considerations for expressing each protein alone (i.e. using an intron-less gene for any protein that normally undergoes pre-mRNA splicing, assessing the level of rare codon usage and selecting an appropriate expression strain, etc.). However, care must be taken in designing the RNA transcriptional cassette. In the case of yeast U6 snRNA, I required a 3’-OH end in order to allow for efficient association of the LSm complex in vivo. Thus, I incorporated a tRNA sequence at the 3’ end of U6 and exploited the activity of endogenous RNAse P to create the correct end. If a different functional group is required at the 3’ end, various ribozyme sequences can replace the tRNA sequence used here. For example, the Hepatitis Delta Virus ribozyme (HDV) can be used to generate the 2’-3’-cyclic phosphate end found at the 3’ end of U6 snRNA in higher eukaryotes. However, in my experience with an in vitro transcription system utilizing the HDV, cleavage was quite inefficient and required coaxing through many rounds of heating and slow cooling to encourage the HDV to fold into a catalytically active RNA. Thus it is unclear how efficient this particular ribozyme would be in an in vivo expression system and consequently many different ribozymes would need to be tested.

The successful inclusion of an RNA in the recombinant expression/purification system is probably dependent on the global structure of the complex, and consequently this expression strategy might not be adaptable for all RNA-protein assemblies. The main reason for this is that the RNA is transcribed in vivo; thus, exposed or flexible regions of the RNA are highly susceptible to degradation by endogenous ribonucleases.
It is clear that all components of U6 snRNP have associated into the appropriate complex when expressed in E. coli, since in the absence of U6 snRNA, Prp24 does not co-purify with the LSm proteins in this expression/purification system. Thus, degradation of accessible regions of U6 snRNA probably occurs post-cell lysis when RNAses, in particular the endoribonuclease RNAse I, are released from the periplasmic space and come into contact with the RNA in the cell lysate (Neu & Heppel 1964, Meador et al. 1990). It might be possible to abolish U6 snRNA cleavage during cell lysis if an appropriate quantity of RNAse inhibitors is included in the cell lysis buffer.

RNAs that are highly structured or broadly protected by proteins would be ideal candidates for expression in this system; however, the structure of the RNA and its interactions with proteins are often not well defined prior to establishing a reconstitution system for high-resolution biophysical characterization. Having said this, degradation of exposed nucleic acid might be desirable to reduce flexibility within the complex to promote crystal contacts. Limited proteolysis of the purified complex could be employed to remove flexible regions of protein, resulting in crystals of the main RNA-protein complex core; that is, the major interactions that are essential to holding the RNA-protein complex together.

Despite the major advances that I have made in improving the method for obtaining LSm complexes and U6 snRNP for biophysical characterization, one limitation of our system is that I am unable to directly control the stoichiometry of the expressed proteins. In contrast to the Kambach expression vectors that produce polycistronic transcripts encoding up to three genes, all under the control of a single transcriptional promoter, our expression system generates monocistronic mRNAs from each gene.
I chose to use this system over a polycistronic mRNA since I was including seven to nine genes in the expression vector and I was concerned about the efficiency of translation of downstream genes. Indeed, expression of the second gene from a bicistronic mRNA is greatly reduced compared to expression of the gene more proximal to the promoter, and inclusion of a second promoter sequence 5’ of the second gene greatly increases the expression of the second gene (Rucker et al. 1997, Kim et al. 2004). A second limitation is that I have not yet identified a technique to confidently report that all LSm complexes obtained in the purification contain only one of each of the seven proteins. Since most of the LSm proteins are within ~5 kDa of each other with respect to size, complexes containing duplicates of one protein and missing another cannot be readily identified by size exclusion chromatography or electron microscopy.

Although I cannot definitively determine the protein stoichiometry of our purified complexes, the problem is addressed in three ways throughout the expression and purification procedure. First, higher abundance proteins that did not assemble into the LSm subcomplexes in the Kambach system were reported to precipitate into the insoluble fraction, essentially self-adjusting the stoichiometry of the soluble material (Zaric et al. 2005). Although I have not taken a systematic approach to confirm that the same is occurring in our system, analysis of the insoluble fraction suggests that it is likely to be the case since some of our LSm proteins are found in both the insoluble and soluble fractions, while others are found only in the soluble fraction (Fig 4.2). This suggests that the protein of lowest abundance sets the stoichiometry of the system, and all proteins found in excess of this will become insoluble.
It should be noted that this is probably a property that is specific to the LSm proteins and should not be generally expected of proteins that are soluble when expressed alone. Second, any soluble complexes and individual proteins that do not associate with LSm6 are left behind during the first purification step since they do not contain the His6-tag themselves. Third, following IMAC, it is possible that tagged free LSm6 might still be present. These proteins are removed during dialysis by choosing the appropriate molecular weight cut-off for the dialysis membrane, followed by size exclusion chromatography. In my experience with this expression/purification system, I no longer have aberrant complexes separating out during gel filtration, suggesting that the stoichiometry was properly adjusted in the cells. Since the crystal structure of the yeast LSm2-8 complex has recently been published using a co-purification strategy that was almost identical to the one developed here (Zhou et al. 2014), I can say with confidence that the stoichiometry is correct.

The RNA-free human LSm2-8 complex has been successfully isolated from purified human triple-snRNP and an electron microscope image of these has been obtained, revealing a torus shaped complex with a central lumen through which the RNA was expected to bind in a manner similar to the Sm protein-RNA interactions (Achsel et al. 1999). The same torus shape was observed in both the EM and SAXS structures obtained for our purified LSm2-8 complex, validating my co-expression/purification strategy as a method to prepare multi-component protein complexes for biophysical characterization. Our lower resolution SAXS envelope agrees well with a recently published crystal structure of the yeast LSm2-8 complex, indicating that the interactions observed between the protein components are not a result of crystal packing (Fig 4.6, Zhou et al. 2014).
In addition to validating the crystal structure as a true representation of the LSm2-8 complex, our structure contains a tail extending from one side of the complex that I assume to be the C-terminal extension of LSm4, which was not included for crystallization.

To conclude, I have presented a new strategy to recombinantly express and purify pre-formed multi-protein complexes for biochemical and biophysical characterization. With an expression/purification system in place for the LSm complexes and free U6 snRNP, I expect to gain great insight into the global structure of each complex by subjecting them to SAXS and X-ray crystallography. Biophysical characterization of each of the LSm complexes should reveal some of the structural features that differ between them and should therefore provide a glimpse into how they bind different substrates, localize to different cellular compartments, and how they function in different RNA processing activities. Structural work with free U6 snRNP will reveal how the LSm2-8 complex interacts with both Prp24 and U6 snRNA. Most importantly, high resolution structural work with our purified free U6 snRNP is required to test the model of U6 snRNA secondary structure and interaction with Prp24 proposed in chapters two and three of this work.

5  Concluding Remarks

U6 snRNA is a critical component of the spliceosome, coordinating catalytically important metal ions and making crucial contacts with many components of the spliceosome throughout the splicing cycle (Yean et al. 2000, Brow & Guthrie 1988, Madhani & Guthrie 1992). Its activation as a functional component of the spliceosome involves the exchange of mutually exclusive interactions through an intricate and finely tuned allosteric cascade (Brow 2002).
I have proposed a new model of U6 snRNA secondary structure in the catalytically inactive free U6 snRNP particle that extends this cascade to include a mechanism by which sequence elements that are important during early stages of spliceosome assembly, and later for catalysis, are unmasked at appropriate stages throughout splicing to ensure that aberrant splicing does not occur (Dunn & Rader 2010). Our model is unique in that it is the first to consider alternative base pairing interactions involving nucleotides that form the U6 3’ ISL. Further, I provided many observations from the literature, combined with the work presented here, that support a model of Prp24 interaction with U6 snRNA in which RRMs 1 and 2 make high affinity contacts with U6 snRNA in the lower telestem region.

My model of U6 snRNA secondary structure has the advantage of offering insight into the function of U4 snRNA and Prp24, both of which intimately interact with U6 snRNA outside of the catalytic spliceosome. In our model, Prp24 retrieves U6 snRNA from disassembling spliceosomes by disrupting the 3’ ISL via RRMs 3 and 4, and then binds the telestem region with high affinity through RRMs 1, 2, and 3. This holds U6 snRNA in a conformation that masks the sequence elements that are important for assembly and catalysis through intramolecular base pairing, and also presents U6 snRNA in a conformation that is favorable for interaction with U4 snRNA. As a limiting factor, U4 snRNA activates a sub-population of U6 molecules for incorporation into the assembling spliceosome by unmasking these sequence elements at the appropriate time throughout the splicing cycle. With release of U4 snRNA following spliceosome assembly, U6 folds into its catalytic form, allowing splicing of the substrate to occur. Following catalysis, the spliceosome undergoes disassembly and U6 is converted back to an inactive form as Prp24 retrieves it from the spliceosome, and the cycle starts again.
In order to validate the models that I have proposed, it is essential to solve the structure of free U6 snRNP at atomic resolution. A major obstacle to achieving this has been the recombinant expression and purification of the LSm proteins due to their instability when expressed alone. I have now developed a system to recombinantly co-express the protein components of the complex and to purify the pre-formed complex under non-denaturing conditions. Such a system is easily adaptable to other larger, multi-component complexes, and our lab has recently initiated a project to study all of the major splicing subcomplexes using this strategy. The real strength of this system is that it does not just apply to the splicing machinery, but can be adapted for any protein complex. Further, depending on the particle under study, our system can be used to include the nucleic acid components of the complex. I expect that this multi-component recombinant expression system will be used extensively to better understand the structure/function relationship of a wide variety of biological complexes.

I have now successfully purified both the LSm2-8 complex, which is involved in splicing in the nucleus, and the LSm1-7 complex, which is involved in mRNA decay in the cytoplasm. As I solve these structures at higher resolution, I will be able to identify the structural differences that lead to the distinct localization and activity exhibited by these two very similar complexes. Importantly, I have isolated these complexes in the presence of the LSm4 C-terminal extension, which was not included in the recent LSm2-8 crystal structure (Zhou et al. 2014). Solving the structure of the complete complex is essential to identifying the localization signal, since the only difference between these complexes is the exchange of LSm1 for LSm8.
Although LSm1 and LSm8 appear to act competitively to localize the complexes, neither of these proteins contains a domain that is sufficient on its own to direct localization (Spiller et al. 2007, Reijns et al. 2009). The C-terminal extension of LSm4 might hold the answer since LSm4 neighbors LSm1 and LSm8 in the respective complexes and this region of the protein has been shown to be critical for aggregation of the LSm1-7 complex to form the cytoplasmic foci at which mRNA decay takes place (Reijns et al. 2008).

One mechanism that might be employed for localization of the LSm complexes is post-translational modification of one or more proteins. Previous reports have highlighted the unusually high asparagine content found in the C-terminus of yeast LSm4 (Reijns et al. 2008). I have noted that of the 44 asparagine residues present throughout LSm4, 34 are found in the C-terminal 100 amino acids. Of these, five conform to the NXS/T (where X is not proline) consensus motif for N-linked glycosylation (Gavel & von Heijne 1990). This consensus sequence appears on average 4.5 times per S. cerevisiae protein, although many of those sites probably do not undergo glycosylation (Kung et al. 2009). In most other eukaryotes, this C-terminal extension is rich in arginine rather than asparagine, a second amino acid that can undergo post-translational modifications that have been shown to be important in a wide variety of cellular processes (Reijns et al. 2008, Wei et al. 2014 and references therein). Indeed, in vivo dimethylation of the C-terminal extension of human LSm4 was detected by mass spectrometry; however, the significance of these post-translational modifications is unknown (Brahms et al. 2001). It is therefore conceivable that yeast LSm4 also undergoes post-translational modification, which might be the signal that directs localization of the LSm1-7 and LSm2-8 complexes.
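The motif counting described above is simple to reproduce for any protein sequence. The following is a minimal sketch, assuming a one-letter amino acid string as input; the function names are hypothetical and the short test sequence is invented for illustration and is not the LSm4 sequence. It reports the positions of NXS/T motifs (X is any residue except proline), including overlapping ones, and the number of asparagines in a C-terminal window.

```python
import re

# N, then any residue except proline, then S or T.
# The lookahead lets overlapping motifs (e.g. "NNTS") each be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of the asparagine in each NXS/T motif."""
    sequence = sequence.upper()
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


def asparagine_distribution(sequence, window=100):
    """Return (total asparagines, asparagines in the C-terminal window)."""
    sequence = sequence.upper()
    return sequence.count("N"), sequence[-window:].count("N")


# Invented toy sequence: NGS is a motif, NPT is excluded because X is proline,
# and NNTS contains two overlapping motifs (NNT and NTS).
toy_sequence = "MNGSANPTNNTS"

positions = find_sequons(toy_sequence)
total_n, c_terminal_n = asparagine_distribution(toy_sequence, window=4)

print("Motif positions:", positions)
print("Asparagines (total, last 4 residues):", total_n, c_terminal_n)
```

A match to the consensus only identifies a candidate site. As noted above, many such sites probably do not undergo glycosylation, so counts of this kind indicate potential rather than demonstrated modification.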
The LSm proteins have been reported to shift from the nucleus to the cytoplasm when the cell becomes stressed, increasing the size and abundance of the cytoplasmic foci at which mRNA decay takes place (Spiller et al. 2007). Aside from LSm4, many other proteins that localize to the mRNA decay foci in the cytoplasm also contain an unusually high number of N-linked glycosylation motifs. For example, Dcp2, an mRNA decapping enzyme that is recruited by the LSm1-7 complex (Dunckley & Parker 1999), contains twenty of these sites. Pan2 and Pan3 of the mRNA polyA ribonuclease complex contain 10 and 16 of these motifs, respectively. Four of the seven proteins found in the Ccr4-Not complex contain eight or more N-linked glycosylation motifs, while the major 5’ to 3’ exonuclease, Xrn1, contains 13 of these motifs. How glycosylation would function to promote aggregation and formation of these cytoplasmic foci is not clear. Perhaps these proteins are normally glycosylated to prevent aggregation, but under stress, they are deglycosylated to promote formation of the mRNA decay centers. Alternatively, these proteins might not be glycosylated under normal conditions, but become glycosylated under stress to promote formation of the cytoplasmic foci. Thus glycosylation or deglycosylation might be the underlying mechanism to respond to cellular stress in order to promote mRNA decay. Our co-expression/purification system does not provide post-translational modification; however, solving the structure of these very similar complexes will reveal how the LSm4 C-terminus interacts with other proteins in the LSm complex, as well as with other proteins in higher order complexes containing the LSm proteins.

The work presented in this dissertation has paved the way for structural characterization of free U6 snRNP and other spliceosomal sub-complexes.
With the ability to purify the LSm complexes, I am now in a position to address the functional differences between the LSm1-7 and LSm2-8 complexes from a structural perspective. I am currently working toward obtaining a solution structure for both U6 snRNP and LSm1-7 that can be compared to the SAXS model obtained here for LSm2-8. However, the details of the structural differences between each of these complexes will probably require high resolution X-ray crystallography. The system that I have developed for co-expression and purification of these pre-formed complexes will be invaluable in providing the large amounts of high purity sample required to find optimal crystallization conditions. Most importantly, this work has outlined a simple approach to the recombinant expression and purification of any multi-protein complex, which can be easily adapted to biological molecules involved in other critical cellular processes.

6  Addendum

While this dissertation was undergoing external examination, a crystal structure of U6 snRNA bound by Prp24 was reported (Montemayor et al. 2014). Since the objective of this thesis was to work toward an atomic resolution structure of U6 snRNP, a discussion of this latest structural work and how it relates to this dissertation is warranted. Despite the fact that there are few similarities between the new structure and the model that I have proposed, I do not believe that the crystal structure has any significant impact on my work. Many of the data reported in the literature are incompatible with the crystal structure, and mutations that were required to generate the crystals were shown to not support formation of the telestem extension in the work presented in this dissertation. Here I will briefly describe the constructs used for crystallization as well as important structural features revealed by the crystal structure, resolved to 1.7 Å, and then I will highlight my concerns regarding this structural work.
In order to generate U6/Prp24 crystals, Montemayor et al. (2014) reconstituted the U6/Prp24 complex from recombinantly expressed and purified Prp24 and in vitro transcribed U6 snRNA. The Prp24 construct spanned amino acids 34-400 (of 444), which includes all four RRMs, and represents approximately 80% of full length Prp24. The U6 snRNA used in this study spanned nucleotides 30-101 (of 112) and contained three mutations that were included to stabilize the RNA: A62G, located in the 3’ ISL, and U100C and U101C, located at the base of the telestem extension. Reconstituted particles were purified over a MonoQ column to remove uncomplexed RNA and protein, and the purified complex was then crystallized by sitting-drop vapor diffusion.

Given the hyperstabilization of the 3’ ISL and telestem extension through mutation of the U6 snRNA sequence, it is not surprising to find these elements, along with the asymmetric bulge, in the crystal structure, which reveals a three-dimensional version of the structure originally proposed in two dimensions by Karaduman et al. (2006) (Fig. 2.1). The U6/Prp24 crystal structure has revealed an ‘interlocked topology’ in which Prp24 appears to form a ring around the asymmetric bulge in U6 snRNA (nucleotides 41-58), which itself is closed due to formation of the 3’ ISL (nucleotides 59-88) and telestem (nucleotides 30-40 and 91-101). RRMs 2, 3, and oRRM 4 were found to make extensive contact with U6 snRNA throughout the asymmetric bulge and telestem region, encompassing an area of ~2200 Å². In this structure, RRM 2 binds U6 nucleotides 46-58 and RRM 3 binds nucleotides 39-44 within the asymmetric bulge. oRRM 4 contacts both the 3’ ISL and the telestem through non-canonical interactions via its electropositive α-helices, and RRM 1 does not contact U6 within the crystallographic asymmetric unit at all.
Instead, RRM 1 forms an electropositive groove with RRMs 2 and 3, and this groove interacts with the 3’ ISL from a neighboring U6/Prp24 complex in the crystal.

One of the major concerns that I have with the crystal structure is that the RNA was stabilized by introducing three mutations into U6 snRNA, two of which, U100C and U101C, were not supported by the genetic results that I have reported in chapter three of this dissertation. Recall that double mutations in the telestem extension did not support formation of a stem structure, but instead revealed that both the 5’ and 3’ strands in this region of U6 snRNA play important roles that are independent of each other (see chapter three). Further, an extensive discussion regarding the third mutation, A62G, can be found in chapter two of this dissertation, where I argued for the existence of the 3’ ISL only in the activated spliceosome. In the presence of these three hyperstabilizing mutations, and in the absence of Prp24 and the LSm complex, the in vitro transcribed U6 snRNA would favorably generate these stem structures as a consequence of their high stability and the absence of anything that might normally act (directly or indirectly) to prevent formation of these structures.

Binding of Prp24 RRMs 1 and 2 to the 5’ and 3’ strands of U6 snRNA in the region surrounding the telestem, as proposed in my work, would be impeded in the presence of these stem structures, since formation of the telestem extension would occlude the Prp24 binding sites in the 30/31 and 96/97 regions of U6 snRNA. As a consequence of this, the recombinant Prp24 would have two options once mixed with this in vitro transcribed, mutated U6 snRNA: to bind a similar binding site in U6 snRNA that is accessible, or to not bind U6 snRNA at all.
Since there is an identical RRM 2 binding site in the asymmetric bulge (see chapter three), it is not surprising to find that RRM 2 has bound this site in the crystal structure, although the binding register has shifted slightly when compared to earlier structural work carried out by the same research group. In the earlier solution structure, RRM 2 was shown to bind the AGAGAU sequence from U6 snRNA through a network of specific interactions with the first GA dinucleotide of the sequence (Martin-Tumasz et al. 2010), while in this recent crystal structure, the binding register is shifted by one nucleotide (Montemayor et al. 2014). Montemayor et al. (2014) suggest that the shift is the result of the presence of all four RRMs in the more recent work, with each RRM influencing the overall position of Prp24 binding, providing more opportunity for the true protein/RNA interaction to be established in an artificial setting. An alternative explanation is that the presence of other domains in Prp24 has pulled RRM 2 out of the true register with this sequence of RNA because the protein is not binding U6 snRNA at the correct sites or in the correct orientation.

My second major concern regarding this recent crystal structure is that both Prp24 and U6 snRNA have been truncated, and the LSm protein complex is absent. The LSm complex, which binds the 3’ tail of U6 snRNA (missing in this structural work), interacts with the C-terminus of Prp24 (also missing in this work), to control Prp24 stoichiometry in free U6 snRNP (Achsel et al. 1999, Rader & Guthrie 2002, Kwan & Brow 2005, Karaduman et al. 2006). In the absence of the LSm complex, full length U6 snRNA bound by full length Prp24 results in complete protection of the U6 snRNA backbone from hydroxyl radical cleavage, while in its presence, protection from hydroxyl radical cleavage is only observed up to nucleotide sixty and again at position 81-83 (Karaduman et al. 2006).
Notably, a second molecule of full length Prp24 binds full length U6 snRNA at low concentrations of protein (< 200 nM) in electrophoretic mobility shift assays, and when twelve amino acids are removed from the C-terminus of Prp24, this second shift is already observed at the lowest Prp24 concentration tested (50 nM) (Kwan & Brow 2005). Additional truncation of Prp24 from the C-terminus results in the appearance of a third shifted band that is also visible at the lowest concentration of Prp24, with the third band representing the major shifted species at 400 nM Prp24 (Kwan & Brow 2005). Thus it is clear that there are elements in free U6 snRNP that function to guide Prp24 binding to U6 snRNA, and in the absence of these informative components of the snRNP, binding is likely to be promiscuous. Additional alteration of the RNA structure via introduction of the three hyperstabilizing mutations discussed above might have reduced this promiscuous binding to a single, non-native, but favorable interaction, generating a homogeneous pool of complexes that readily generated crystals.

To support the crystal structure biochemically, Montemayor et al. (2014) noted that their structure is consistent with chemical modification data from the literature. While this is true, I refer the reader back to chapter two of this dissertation, where the same chemical modification data are mapped to both the Karaduman model (identical to the crystal structure presented by Montemayor et al. 2014) and the Dunn model of U6 snRNA secondary structure. I highlighted the ambiguous nature of this structure probing strategy by showing that the data map equally well to both structures, despite their remarkable differences. Further, the authors attempted to convince readers of the validity of their structure by mapping trans-acting suppressors of A62G and A62U/C85A cold sensitivity onto Prp24 in their crystal structure, showing that the suppressors map to the RNA/protein interface.
Again, I refer the reader back to chapter two of this dissertation, where I highlighted the limitations of genetic work in that it is not always possible to assign a genetic observation to a particular molecular interaction at a particular time, especially in the case of U6 snRNA, which is a highly dynamic molecule that interacts with many binding partners throughout the splicing cycle.

As an alternative explanation for the localization of the trans-acting suppressor mutations, consider the U6 retrieval model that I put forward in chapter two of this dissertation. Recall that under this model, RRM3 and oRRM4 of Prp24 function to destabilize the 3’ ISL through non-canonical interactions with the stem in U6 snRNA that has been retrieved from disassembling spliceosomes. High affinity binding is then mediated through interactions between the first three RRMs of Prp24 and the telestem region of U6 snRNA, causing U6 snRNA to be held in a conformation that is favorable for interaction with U4 snRNA. Notably, fourteen of these suppressors map to RRM3 and thirteen map to oRRM4, while only three map to RRM2 and none of the identified suppressors of A62G or A62U/C85A cold sensitivity map to RRM1. If RRM3 and oRRM4 are responsible for destabilizing the 3’ ISL, then localization of the suppressor mutations to these domains in Prp24 is not surprising, especially since the suppressors map mainly to the sites of predicted non-canonical interaction with the 3’ ISL, namely loop 3 of RRM3 and the N-terminal α-helix of oRRM4. Thus these suppressors might highlight a more transient interaction between Prp24 and U6 snRNA that is required for 3’ ISL unwinding in disassembling spliceosomes, rather than a stable interaction interface within free U6 snRNP.
It is telling that the authors admit in their report that the mechanism of U4/U6 di-snRNP formation proposed in their paper is based on a hypothetical pathway intermediate rather than the structure that they have reported. This is because their proposed mechanism is incompatible with the structure presented. In light of this, and given the discussion above, the crystal structure reported by Montemayor et al. (2014) should be interpreted with caution. It remains for future work to reconcile the discrepancies between the Montemayor crystal structure and the genetic results and biochemical work presented in this dissertation.

References
Abelson J (2013) Toggling in the spliceosome. Nat Struct Mol Biol. 20: 645-647.
Abovich N, Liao XC, Rosbash M (1994) The yeast MUD2 protein: an interaction with PRP11 defines a bridge between commitment complexes and U2 snRNP addition. Genes Dev. 8: 843-854.
Abovich N, Rosbash M (1997) Cross-intron bridging interactions in the yeast commitment complex are conserved in mammals. Cell. 89: 403-412.
Achsel T, Brahms H, Kastner B, Bachi A, Wilm M, Lührmann R (1999) A doughnut-shaped heteromer of human Sm-like proteins binds to the 3’-end of U6 snRNA, thereby facilitating U4/U6 duplex formation in vitro. EMBO J. 18: 5789-5802.
Andreeva AY, Piontek J, Blasig IE, Utepbergenov D (2006) Assembly of tight junction is regulated by the antagonism of conventional and novel protein C isoforms. Int. J. Biochem. Cell Biol. 38: 222-233.
Arenas JE, Abelson J (1997) Prp43: An RNA helicase-like factor involved in spliceosome disassembly. Proc Natl Acad Sci USA. 94: 11798-11802.
Ares M Jr, Grate L, Pauling MH (1999) A handful of intron-containing genes produces the lion’s share of yeast mRNA. RNA. 5: 113-1139.
Bae E, Reiter NJ, Bingman CA, Kwan SS, Lee D, Phillips GN Jr, Butcher SE, Brow DA (2007) Structure and interactions of the first three RNA recognition motifs of splicing factor Prp24. J Mol Biol. 367: 1447-1458.
Bartels C, Urlaub H, Lührmann R, Fabrizio P (2003) Mutagenesis suggests several roles of Snu114p in pre-mRNA splicing. J Biol Chem. 278: 28324-28334.
Bell M, Schreiner S, Damianov A, Reddy R, Bindereif A (2002) p110, a novel human U6 snRNP protein and U4/U6 snRNP recycling factor. EMBO J. 21: 2724-2735.
Bellare P, Small EC, Huang X, Wohlschlegel JA, Staley JP, Sontheimer EJ (2008) A role for ubiquitin in the spliceosome assembly pathway. Nat Struct Mol Biol. 15: 444-451.
Berget SM, Moore C, Sharp PA (1977) Spliced segments at the 5’ terminus of Adenovirus 2 late mRNA. Proc Natl Acad Sci USA. 74: 3171-3175.
Boeke JD, Trueheart J, Natsoulis G, Fink GR (1987) 5-fluoroorotic acid as a selective agent in yeast molecular genetics. Methods Enzymol. 154: 164-175.
Bon E, Casaregola S, Blandin G, Llorente B, Neuvéglise C, Munsterkotter M, Guldener U, Mewes HW, Van Helden J, Dujon B, Gaillardin C (2003) Molecular evolution of eukaryotic genomes: hemiascomycetous yeast spliceosomal introns. Nucleic Acids Res. 31: 1121-1135.
Brahms H, Meheus L, de Brabandere V, Fischer U, Lührmann R (2001) Symmetrical dimethylation of arginine residues in spliceosomal Sm protein B/B’ and the Sm-like protein LSm4, and their interaction with the SMN protein. RNA. 7: 1531-1542.
Brenner TJ, Guthrie C (2005) Genetic analysis reveals a role for the C terminus of the Saccharomyces cerevisiae GTPase Snu114 during spliceosome activation. Genetics. 170: 1063-1080.
Brenner TJ, Guthrie C (2006) Assembly of Snu114 into U5 snRNP requires Prp8 and a functional GTPase domain. RNA. 12: 862-871.
Brody E, Abelson J (1985) The “Spliceosome”: Yeast pre-messenger RNA associates with a 40S complex in a splicing-dependent reaction. Science. 228: 963-967.
Brow DA (2002) Allosteric cascade of spliceosome activation. Annu. Rev. Genet. 36: 333-360.
Brow DA, Guthrie C (1988) Spliceosomal RNA U6 is remarkably conserved from yeast to mammals. Nature. 334: 213-218.
Brow DA, Guthrie C (1989) Splicing a spliceosomal RNA. Nature. 337: 14-15.
Brow DA, Vidaver RM (1995) An element in human U6 snRNA destabilizes the U4/U6 spliceosomal RNA complex. RNA. 1: 122-131.
Brys A, Schwer B (1996) Requirement for Slu7 in yeast pre-mRNA splicing is dictated by the distance between the branch point and 3’ splice site. RNA. 2: 707-717.
Burgess SM, Guthrie C (1993) A mechanism to enhance mRNA splicing fidelity: the RNA-dependent ATPase Prp16 governs usage of a discard pathway for aberrant lariat intermediates. Cell. 73: 1377-1391.
Burset M, Seledtsov IA, Solovyev VV (2000) Analysis of canonical and non-canonical splice sites in mammalian genomes. Nucleic Acids Res. 28: 4364-4375.
Cech TR (1986) The generality of self-splicing RNA: Relationship to nuclear mRNA splicing. Cell. 44: 207-210.
Celotto AM, Graveley BR (2001) Alternative splicing of the Drosophila Dscam pre-mRNA is both temporally and spatially regulated. Genetics. 159: 599-608.
Chan SP, Kao DI, Tsai WY, Cheng SC (2003) The Prp19p-associated complex in spliceosome activation. Science. 302: 279-282.
Chan SP, Cheng SC (2005) The Prp19-associated complex is required for specifying interactions of U5 and U6 with the pre-mRNA during spliceosome activation. J Biol Chem. 280: 31190-31199.
Chen HC, Cheng SC (2012) Functional roles of protein splicing factors. Biosci Rep. 32: 345-359.
Chen HC, Tseng CK, Tsai RT, Chung CS, Cheng SC (2013) Link of NTR-mediated spliceosome disassembly with DEAH-box ATPases Prp2, Prp16, and Prp22. Mol Cell Biol. 33: 514-525.
Chen JY, Stands L, Staley JP, Jackups RR Jr, Latus LJ, Chang TH (2001) Specific alterations of U1-C protein or U1 small nuclear RNA can eliminate the requirement of Prp28p, an essential DEAD box splicing factor. Mol Cell. 7: 227-232.
Cheng SC, Abelson J (1987) Spliceosome assembly in yeast. Genes Dev. 1: 1014-1027.
Chow LT, Gelinas RE, Broker TR, Roberts RJ (1977) An amazing sequence arrangement at the 5’ ends of Adenovirus 2 messenger RNA. Cell. 12: 1-8.
Chua K, Reed R (1999) The RNA splicing factor hSlu7 is required for correct 3’ splice site choice. Nature. 402: 207-210.
Collins CA, Guthrie C (1999) Allele-specific genetic interactions between Prp8 and RNA active site residues suggest a function for Prp8 at the catalytic core of the spliceosome. Genes Dev. 13: 1970-1982.
Company M, Arenas J, Abelson J (1991) Requirement of the RNA helicase-like Prp22 for release of messenger RNA from spliceosomes. Nature. 349: 487-493.
Crawford DJ, Hoskins AA, Friedman LJ, Gelles J, Moore MJ (2013) Single-molecule colocalization FRET evidence that spliceosome activation precedes stable approach of 5’ splice site and branch site. Proc Natl Acad Sci USA. 110: 6783-6788.
Dlakic M, Mushegian A (2011) Prp8, the pivotal protein of the spliceosomal catalytic center, evolved from a retroelement-encoded reverse transcriptase. RNA. 17: 799-808.
Dix I, Russell CS, O’Keefe RT, Newman AJ, Beggs JD (1998) Protein-RNA interactions in the U5 snRNP of Saccharomyces cerevisiae. RNA. 4: 1675-1686.
Dunckley T, Parker R (1999) The DCP2 protein is required for mRNA decapping in Saccharomyces cerevisiae and contains a functional MutT motif. EMBO J. 18: 5411-5422.
Dunn EA (2009) MSc Thesis: U6 snRNA secondary structure in free U6 snRNPs. UNBC.
Dunn EA, Rader SD (2010) Secondary structure of U6 small nuclear RNA: implications for spliceosome assembly. Biochem Soc Trans. 38: 1099-1104.
Dunn EA, Rader SD (2014) Pre-mRNA splicing and the spliceosome: assembly, catalysis, and fidelity. In: Fungal RNA Biology. Sesma, A and von der Haar, T. (eds) Springer-Verlag Heidelberg. 22-57.
Epstein P, Reddy R, Henning D, Busch H (1980) The nucleotide sequence of nuclear U6 (4.7S) RNA. J Biol Chem. 255: 8901-8906.
Eschenlauer JB, Kaiser MW, Gerlach VL, Brow DA (1993) Architecture of a yeast U6 RNA gene promoter. Mol Cell Biol. 13: 3015-3026.
Fabrizio P, Abelson J (1990) Two domains of yeast U6 small nuclear RNA required for both steps of nuclear precursor messenger RNA splicing. Science. 250: 404-409.
Fabrizio P, Abelson J (1992) Thiophosphates in yeast U6 snRNA specifically affect pre-mRNA splicing in vitro. Nucleic Acids Res. 20: 3659-3664.
Fabrizio P, Laggerbauer B, Lauber J, Lane WS, Lührmann R (1997) An evolutionarily conserved U5 snRNP-specific protein is a GTP-binding factor closely related to the ribosomal translocase EF-2. EMBO J. 16: 4092-4106.
Fabrizio P, Dannenberg J, Dube P, Kastner B, Stark H, Urlaub H, Lührmann R (2009) The evolutionarily conserved core design of the catalytic activation step of the yeast spliceosome. Mol Cell. 36: 593-608.
Fica SM, Tuttle N, Novak T, Li NS, Lu J, Koodathingal P, Dai Q, Staley JP, Piccirilli JA (2013) RNA catalyzes nuclear pre-mRNA splicing. Nature. 503: 229-234.
Field DJ, Friesen JD (1996) Functionally redundant interactions between U2 and U6 spliceosomal snRNAs. Genes Dev. 10: 489-501.
Fortner DM, Troy RG, Brow DA (1994) A stem/loop in U6 RNA defines a conformational switch required for pre-mRNA splicing. Genes Dev. 8: 221-233.
Fourmann JB, Schmitzová J, Christian H, Urlaub H, Ficner R, Boon KL, Fabrizio P, Lührmann R (2013) Dissection of the factor requirements for spliceosome disassembly and the elucidation of its dissociation products using a purified splicing system. Genes Dev. 27: 413-428.
Fromont-Racine M, Mayes AE, Brunet-Simon A, Rain JC, Colley A, Dix I, Decourty L, Joly N, Ricard F, Beggs JD, Legrain P (2000) Genome-wide protein interaction screens reveal functional networks involving Sm-like proteins. Yeast. 17: 95-110.
Gavel Y, von Heijne G (1990) Sequence differences between glycosylated and non-glycosylated Asn-X-Thr/Ser acceptor sites: implications for protein engineering. Protein Eng.
3: 433-442.
Ghetti A, Company M, Abelson J (1995) Specificity of Prp24 binding to RNA: A role for Prp24 in the dynamic interaction of U4 and U6 snRNAs. RNA. 1: 132-145.
Gordon PM, Piccirilli JA (2001) Metal ion coordination by the AGC triad in domain 5 contributes to group II intron catalysis. Nat Struct Biol. 8: 893-898.
Gozani O, Feld R, Reed R (1996) Evidence that sequence-independent binding of highly conserved U2 snRNP proteins upstream of the branch site is required for assembly of spliceosomal complex A. Genes Dev. 10: 233-243.
Gozani O, Potashkin J, Reed R (1998) A potential role for U2AF-SAP 155 interactions in recruiting U2 snRNP to the branch site. Mol Cell Biol. 18: 4752-4760.
Grabowski P, Seiler S, Sharp P (1985) A multicomponent complex is involved in the splicing of messenger RNA precursors. Cell. 42: 345-353.
Gudz TI, Schneider TE, Haas TA, Macklin WB (2002) Myelin proteolipid protein forms a complex with integrins and may participate in integrin receptor signaling in oligodendrocytes. J Neurosci. 22: 7398-7407.
Guthrie C, Patterson B (1988) Spliceosomal snRNAs. Annu Rev Genet. 22: 387-419.
Hahn D, Kudla G, Tollervey D, Beggs JD (2012) Brr2p-mediated conformational rearrangements in the spliceosome during activation and substrate repositioning. Genes Dev. 26: 2408-2421.
Hamm J, Mattaj IW (1989) An abundant U6 snRNA found in germ cells and embryos of Xenopus laevis. EMBO J. 8: 4179-4187.
Hermann H, Fabrizio P, Raker VA, Foulaki K, Hornig H, Brahms H, Lührmann R (1995) snRNP Sm proteins share two evolutionarily conserved sequence motifs which are involved in Sm protein-protein interactions. EMBO J. 14: 2076-2088.
Hilliker AK, Staley JP (2004) Multiple functions for the invariant AGC triad of U6 snRNA. RNA. 10: 921-928.
Hilliker AK, Mefford MA, Staley JP (2007) U2 toggles iteratively between the stem IIa and stem IIc conformations to promote pre-mRNA splicing. Genes Dev. 21: 821-834.
Hobson GM, Huang Z, Sperle K, Sistermans E, Rogan PK, Garbern JY, Kolodny E, Naidu S, Cambi F (2006) Splice-site contribution in alternative splicing of PLP and DM20: molecular studies in oligodendrocytes. Hum Mutat. 27: 69-77.
Hopfield JJ (1974) Kinetic proofreading: a new mechanism for reducing errors in biosynthetic processes requiring high specificity. Proc Natl Acad Sci USA. 71: 4135-4139.
Hoskins AA, Friedman LJ, Gallagher SS, Crawford DJ, Anderson EG, Wombacher R, Ramirez N, Cornish VW, Gelles J, Moore MJ (2011) Ordered and dynamic assembly of single spliceosomes. Science. 331: 1289-1295.
Ingelfinger D, Arndt-Jovin DJ, Lührmann R, Achsel T (2002) The human LSm1-7 proteins colocalize with the mRNA-degrading enzymes Dcp1/2 and Xrn1 in distinct cytoplasmic foci. RNA. 8: 1489-1501.
Jackson SP, Lossky M, Beggs JD (1988) Cloning of the RNA8 gene of Saccharomyces cerevisiae, detection of the RNA8 protein, and demonstration that it is essential for nuclear pre-mRNA splicing. Mol Cell Biol. 8: 1067-1075.
James SA, Turner W, Schwer B (2002) How Slu7 and Prp18 cooperate in the second step of yeast pre-mRNA splicing. RNA. 8: 1068-1077.
Jandrositz A, Guthrie C (1995) Evidence for a Prp24 binding site in U6 snRNA and in a putative intermediate in the annealing of U6 and U4 snRNAs. EMBO J. 14: 820-832.
Johnson TL, Abelson J (2001) Characterization of U4 and U6 interactions with the 5’ splice site using a S. cerevisiae in vitro trans-splicing system. Genes Dev. 15: 1957-1970.
Jurica MS, Moore MJ (2003) Pre-mRNA splicing: awash in a sea of proteins. Mol Cell. 12: 5-14.
Kambach C, Walke S, Young R, Avis JM, de la Fortelle E, Raker VA, Lührmann R, Li J, Nagai K (1999) Crystal structures of two Sm protein complexes and their implications for the assembly of the spliceosomal snRNPs. Cell. 96: 375-387.
Kandels-Lewis S, Séraphin B (1993) Roles of U6 snRNA in 5’ splice site selection. Science. 262: 2035-2039.
Kapust RB, Tözser J, Fox JD, Anderson DE, Cherry S, Copeland TD, Waugh DS (2001) Tobacco etch virus protease: mechanism of autolysis and rational design of stable mutants with wild-type catalytic proficiency. Protein Eng. 14: 993-1000.
Karaduman R, Fabrizio P, Hartmuth K, Urlaub H, Lührmann R (2006) RNA structure and RNA-protein interactions in purified yeast U6 snRNPs. J Mol Biol. 356: 1248-1262.
Karaduman R, Dube P, Stark H, Fabrizio P, Kastner B, Lührmann R (2008) Structure of yeast U6 snRNPs: Arrangement of Prp24p and LSm complex as revealed by electron microscopy. RNA. 14: 1-10.
Käufer NF, Potashkin J (2000) Analysis of the splicing machinery in fission yeast: A comparison with budding yeast and mammals. Nucleic Acids Res. 28: 3003-3010.
Khusial P, Plaag R, Zieve GW (2005) LSm proteins form heptameric rings that bind to RNA via repeating motifs. Trends Biochem Sci. 30: 522-528.
Kim CH, Abelson J (1996) Site-specific cross-links of yeast U6 snRNA to the pre-mRNA near the 5’ splice site. RNA. 2: 995-1010.
Kim KJ, Kim HE, Lee KH, Han W, Yi MJ, Jeong J, Oh BH (2004) Two-promoter vector is highly efficient for overproduction of protein complexes. Protein Sci. 13: 1698-1703.
Kinniburgh AJ, Mertz JE, Ross J (1978) The precursor of mouse β-globin messenger RNA contains two intervening RNA sequences. Cell. 14: 681-693.
Kistler AL, Guthrie C (2001) Deletion of MUD2, the yeast homolog of U2AF65, can bypass the requirement for Sub2, an essential spliceosomal ATPase. Genes Dev. 15: 42-49.
Konarska MM, Grabowski PJ, Padgett RA, Sharp PA (1985) Characterization of the branch site in lariat RNAs produced by splicing of mRNA precursors. Nature. 313: 552-557.
Koodathingal P, Novak T, Piccirilli JA, Staley JP (2013) The DEAH-box ATPase Prp16 and Prp43 cooperate to proofread 5’ splice site cleavage during pre-mRNA splicing. Mol Cell. 39: 385-395.
Kuhn AN, Li Z, Brow DA (1999) Splicing factor Prp8 governs U4/U6 RNA unwinding during activation of the spliceosome. Mol Cell. 3: 65-75.
Kuhn AN, Reichl EM, Brow DA (2002) Distinct domains of splicing factor Prp8 mediate different aspects of spliceosome activation. Proc Natl Acad Sci USA. 99: 9145-9149.
Kung LA, Tao SC, Qian J, Smith MG, Snyder M, Zhu H (2009) Global analysis of the glycoproteome in Saccharomyces cerevisiae reveals new roles for protein glycosylation in eukaryotes. Molecular Systems Biology. 5: 308-338.
Kwan S, Gerlach VL, Brow DA (2000) Disruption of the 5’ stem-loop of yeast U6 RNA induces trimethylguanosine capping of this RNA polymerase III transcript in vivo. RNA. 6: 1859-1869.
Kwan S, Brow DA (2005) The N- and C-terminal RNA recognition motifs of splicing factor Prp24 have distinct functions in U6 RNA binding. RNA. 11: 808-820.
Lardelli RM, Thompson JX, Yates JR III, Stevens SW (2010) Release of SF3 from the intron branch point activates the first catalytic step of pre-mRNA splicing. RNA. 16: 516-528.
Lee C, Jaladat Y, Mohammadi A, Sharifi A, Geisler S, Valadkhan S (2010) Metal binding and substrate positioning by evolutionarily invariant U6 sequences in catalytically active protein-free snRNAs. RNA. 16: 2226-2238.
Legrain P, Séraphin B, Rosbash M (1985) Early commitment of yeast pre-mRNA to the spliceosome pathway. Mol Cell Biol. 8: 3755-3760.
Lesser CF, Guthrie C (1993) Mutations in U6 snRNA that alter splice site specificity: implications for the active site. Science. 262: 1982-1988.
Leung AK, Nagai K, Li J (2011) Structure of the spliceosomal U4 snRNP core domain and its implications for snRNP biogenesis. Nature. 473: 536-539.
Li Z, Brow DA (1996) A spontaneous duplication in U6 spliceosomal RNA uncouples the early and late functions of the ACAGA element in vivo. RNA. 2: 879-894.
Licht K, Medenbach J, Lührmann R, Kambach C, Bindereif A (2008) 3’-cyclic phosphorylation of U6 snRNA leads to recruitment of recycling factor p110 through LSm proteins. RNA. 14: 1532-1538.
Liu HL, Cheng SC (2012) The interaction of Prp2 with a defined region of the intron is required for the first splicing reaction. Mol Cell Biol. 32: 5056-5066.
Madhani HD, Bordonné R, Guthrie C (1990) Multiple roles for U6 snRNA in the splicing pathway. Genes Dev. 4: 2264-2277.
Madhani HD, Guthrie C (1992) A novel base-pairing interaction between U2 and U6 snRNAs suggests a mechanism for the catalytic activation of the spliceosome. Cell. 71: 803-817.
Maeder C, Kutach AK, Guthrie C (2009) ATP-dependent unwinding of U4/U6 snRNAs by the Brr2 helicase requires the C terminus of Prp8. Nat Struct Mol Biol. 16: 42-48.
Martin-Tumasz S, Reiter NJ, Brow DA, Butcher SE (2010) Structure and functional implications of a complex containing a segment of U6 bound by a domain of Prp24. RNA. 16: 792-804.
Martin-Tumasz S, Richie AC, Clos LJ II, Brow DA, Butcher SE (2011) A novel occluded RNA recognition motif in Prp24 unwinds the U6 RNA internal stem loop. Nucleic Acids Res. 39: 7837-7847.
Maschhoff KL, Padgett RA (1993) The stereochemical course of the first step of pre-mRNA splicing. Nucleic Acids Res. 21: 5456-5462.
Matsuzaki M, Misumi O, Shin-I T, Maruyama S, Takahara M, Miyagishima S, Mori T, Nishida K, Yagisawa F, Nishida K, Yoshida Y, Nishimura Y, Nakao S, Kobayashi T, Momoyama Y, Higashiyama T, Minoda A, Sano M, Nomoto H, Oishi K, Hayashi H, Ohta F, Nishizaka S, Haga S, Miura S, Morishita T, Kabeya Y, Terasawa K, Suzuki Y, Ishii Y, Asakawa S, Takano H, Ohta N, Kuroiwa H, Tanaka K, Shimizu N, Sugano S, Sato N, Nozaki H, Ogasawara N, Kohara Y, Kuroiwa T (2004) Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature. 428: 653-657.
Mayas RM, Maita H, Staley JP (2006) Exon ligation is proofread by the DExD/H-box ATPase Prp22p.
Nat Struct Mol Biol. 13: 482-490.
Mayes AE, Verdone L, Legrain P, Beggs JD (1999) Characterization of Sm-like proteins in yeast and their association with U6 snRNA. EMBO J. 18: 4321-4331.
McGrail JC, Krause A, O’Keefe RT (2009) The RNA binding protein Cwc2 interacts directly with the U6 snRNA to link the nineteen complex to the spliceosome during pre-mRNA splicing. Nucleic Acids Res. 37: 4205-4217.
Meador J, Cannon B, Cannistraro VJ, Kennell D (1990) Purification and characterization of Escherichia coli RNAse I comparisons with RNAse M. FEBS. 187: 549-553.
Mefford MA, Staley JP (2009) Evidence that U2/U6 helix I promotes both catalytic steps of pre-mRNA splicing and rearranges in between these steps. RNA. 15: 1386-1397.
Mitrovich QM, Guthrie C (2007) Evolution of small nuclear RNAs in S. cerevisiae, C. albicans, and other hemiascomycetous yeasts. RNA. 13: 2066-2080.
Montemayor EJ, Curran EC, Liao HH, Andrews KL, Treba CN, Butcher SE, Brow DA (2014) Core structure of the U6 small nuclear ribonucleoprotein at 1.7-Å resolution. Nat Struct Mol Biol. 21: 544-551.
Moore MJ, Sharp PA (1993) Evidence for two active sites in the spliceosome provided by stereochemistry of pre-mRNA splicing. Nature. 365: 364-368.
Mozaffari-Jovin S, Santos KF, Hsiao HH, Will CL, Urlaub H, Wahl MC, Lührmann R (2012) The Prp8 RNase H-like domain inhibits Brr2-mediated U4/U6 snRNA unwinding by blocking Brr2 loading onto the U4 snRNA. Genes Dev. 26: 2422-2434.
Mozaffari-Jovin S, Wandersleben T, Santos KF, Will CL, Lührmann R, Wahl MC (2013) Inhibition of RNA helicase Brr2 by the C-terminal tail of the spliceosomal protein Prp8. Science. 341: 80-84.
Neu HC, Heppel LA (1964) The release of Ribonuclease into the medium when E. coli cells are converted to spheroplasts. Biochim & Biophys Res Commun. 14: 109-112.
Neuvéglise C, Marck C, Gaillardin C (2011) The intronome of budding yeasts. C. R. Biologies. 34: 662-670.
Newman AJ, Teigelkamp S, Beggs JD (1995) snRNA interactions at 5’ and 3’ splice sites monitored by photoactivated crosslinking in yeast spliceosomes. RNA. 1: 968-980.
Nielsen KH, Staley JP (2012) Spliceosome activation: U4 is the path, stem I is the goal, and Prp8 is the keeper. Let’s cheer for the ATPase Brr2! Genes Dev. 26: 2461-2467.
Ninio J (1975) Kinetic amplification of enzyme discrimination. Biochimie. 57: 587-595.
O’Day CL, Dalbadie-McFarland G, Abelson J (1996) The Saccharomyces cerevisiae Prp5 protein has RNA-dependent ATPase activity with specificity for U2 small nuclear RNA. J Biol Chem. 271: 33261-33267.
Ohi MD, Vander Kooi CW, Rosenberg JA, Chazin WJ, Gould KL (2003) Structural insights into the U-box, a domain associated with multi-ubiquitination. Nat Struct Biol. 10: 250-255.
Ohrt T, Odenwälder P, Dannenberg J, Prior M, Warkocki Z, Schmitzová J, Karaduman R, Gregor I, Enderlein J, Fabrizio P, Lührmann R (2013) Molecular dissection of step 2 catalysis of yeast pre-mRNA splicing investigated in a purified system. RNA. 19: 902-915.
Oubridge C, Ito N, Evans PR, Teo CH, Nagai K (1994) Crystal structure at 1.92Å resolution of the RNA-binding domain of the U1A spliceosomal protein complexed with an RNA hairpin. Nature. 372: 432-438.
Padgett RA, Konarska MM, Grabowski PJ, Hardy SF, Sharp PA (1984) Lariat RNAs as intermediates and products in the splicing of messenger RNA precursors. Science. 225: 898-903.
Parker R, Siliciano PG, Guthrie C (1987) Recognition of the TACTAAC box during mRNA splicing in yeast involves base pairing to the U2-like snRNA. Cell. 49: 229-239.
Parisien M, Major F (2008) The MC-Fold and MC-Sym pipeline infers RNA structure from sequence data. Nature. 452: 51-55.
Patterson B, Guthrie C (1987) An essential yeast snRNA with a U5-like domain is required for splicing in vivo. Cell. 49: 613-624.
Peebles CL, Perlman PS, Mecklenburg KL, Petrillo ML, Tabor JH, Jarrell KA, Cheng H (1986) A self-splicing RNA excises an intron lariat. Cell. 44: 213-223.
Peebles CL, Zhang M, Perlman PS, Franzen JS (1995) Catalytically critical nucleotides in domain 5 of a group II intron. Proc Natl Acad Sci USA. 92: 4422-4426.
Pena V, Rozov A, Fabrizio P, Lührmann R, Wahl MC (2008) Structure and function of an RNAse H domain at the heart of the spliceosome. EMBO J. 27: 2929-2940.
Puig O, Gottschalk A, Fabrizio P, Séraphin B (1999) Interaction of the U1 snRNP with nonconserved intronic sequences affects 5’ splice site selection. Genes Dev. 13: 569-580.
Puig O, Bragado-Nilsson E, Koski T, Séraphin B (2007) The U1 snRNP-associated factor Luc7p affects 5’ splice site selection in yeast and human. Nucleic Acids Res. 35: 5874-5885.
Rader SD, Guthrie C (2002) A conserved Lsm-interaction motif in Prp24 required for efficient U4/U6 di-snRNP formation. RNA. 8: 1378-1392.
Raghunathan PL, Guthrie C (1998) A spliceosomal recycling factor that reanneals U4 and U6 small nuclear ribonucleoprotein particles. Science. 279: 857-860.
Raghunathan PL, Guthrie C (1998b) RNA unwinding in U4/U6 snRNPs requires ATP hydrolysis and the DEIH-box splicing factor Brr2. Curr Biol. 8: 847-855.
Reijns MA, Alexander RD, Spiller MP, Beggs JD (2008) A role for Q/N-rich aggregation-prone regions in P-body localization. J Cell Sci. 121: 2463-2472.
Reijns MA, Auchynnikava T, Beggs JD (2009) Analysis of LSm1p and LSm8p domains in the cellular localization of LSm complexes in budding yeast. FEBS J. 276: 3602-3617.
Reiter NJ, Nikstad LJ, Allmann AM, Johnson RJ, Butcher SE (2003) Structure of the U6 RNA intramolecular stem-loop harboring an SP-phosphorothioate modification. RNA. 9: 533-542.
Rhode BM, Hartmuth K, Westhof E, Lührmann R (2006) Proximity of conserved U6 and U2 snRNA elements to the 5’ splice site region in activated spliceosomes. EMBO J. 25: 2475-2486.
Rucker P, Torti FM, Torti SV (1997) Recombinant ferritin: modulation of subunit stoichiometry in bacterial expression systems. Protein Eng. 10: 967-973.
Ryan DE, Stevens SW, Abelson J (2002) The 5’ and 3’ domains of yeast U6 snRNA: LSm proteins facilitate binding of Prp24 protein to the U6 telestem region. RNA. 8: 1011-1033.
Rymond BC, Rosbash M (1985) Cleavage of 5’ splice site and lariat formation are independent of 3’ splice site in yeast mRNA splicing. Nature. 317: 735-737.
Sakharkar MK, Chow VT, Kangueane P (2004) Distribution of exons and introns in the human genome. In Silico Biol. 4: 387-393.
Scheich C, Kümmel D, Soumailakakis D, Heinemann U, Büssow K (2007) Vectors for co-expression of an unrestricted number of proteins. Nucleic Acids Res. 35: e43 (doi:10.1093/nar/gkm067).
Schellenberg MJ, Wu T, Ritchie DB, Fica S, Staley JP, Atta KA, LaPointe P, MacMillan AM (2013) A conformational switch in Prp8 mediates metal ion coordination that promotes pre-mRNA exon ligation. Nat Struct Mol Biol. 20: 728-724.
Schmucker D, Clemens JC, Shu H, Worby CA, Xiao J, Muda M, Dixon JE, Zipursky SL (2000) Drosophila Dscam is an axon guidance receptor exhibiting extraordinary molecular diversity. Cell. 101: 671-684.
Schwer B, Guthrie C (1991) Prp16 is an RNA-dependent ATPase that interacts transiently with the spliceosome. Nature. 349: 494-499.
Schwer B, Guthrie C (1992) A conformational rearrangement in the spliceosome is dependent on Prp16 and ATP hydrolysis. EMBO J. 11: 5033-5039.
Schwer B, Gross CH (1998) Prp22, a DExD-box RNA helicase, plays two distinct roles in yeast pre-mRNA splicing. EMBO J. 17: 2086-2094.
Schwer B (2008) A conformational rearrangement in the spliceosome sets the stage for Prp22-dependent mRNA release. Mol Cell. 30: 743-754.
Schwer B, Chang J, Shuman S (2013) Structure-function analysis of the 5’ end of yeast U1 snRNA highlights genetic interactions with the Msl5•Mud2 branchpoint-binding complex and other spliceosome assembly factors.
Nucleic Acids Res. 41: 7485-7500.
Semlow DR, Staley JP (2012) Staying on message: Ensuring fidelity in pre-mRNA splicing. Trends Biochem Sci. 37: 263-273.
Shannon KW, Guthrie C (1991) Suppressors of a U4 snRNA mutation define a novel U6 snRNP protein with RNA-binding motifs. Genes Dev. 5: 773-785.
Shuster EO, Guthrie C (1988) Two conserved domains of yeast U2 snRNA are separated by 945 nonessential nucleotides. Cell. 55: 41-48.
Shuster EO, Guthrie C (1990) Human U2 snRNA can function in pre-mRNA splicing in yeast. Nature. 345: 270-273.
Sigel RK, Vaidya A, Pyle AM (2000) Metal ion binding sites in a group II intron core. Nat Struct Biol. 7: 1111-1116.
Siliciano PG, Guthrie C (1988) 5’ splice site selection in yeast: genetic alterations in base-pairing with U1 reveal additional requirements. Genes Dev. 2: 1258-1267.
Singh R, Reddy R (1989) γ-monomethyl phosphate: a cap structure in spliceosomal U6 small nuclear RNA. Proc Natl Acad Sci USA. 86: 8280-8283.
Small EC, Leggett SR, Winans AA, Staley JP (2006) The EF-G-like GTPase Snu114p regulates spliceosome dynamics mediated by Brr2p, a DExD/H box ATPase. Mol Cell. 23: 389-399.
Song EJ, Werner SL, Neubauer J, Stegmeier F, Aspden J, Rio D, Harper JW, Elledge SJ, Kirschner MW, Rape M (2010) The Prp19 complex and the Usp4Sart3 deubiquitinating enzyme control reversible ubiquitination at the spliceosome. Genes Dev. 24: 1434-1447.
Sontheimer EJ, Steitz JA (1993) The U5 and U6 small nuclear RNAs as active site components of the spliceosome. Science. 262: 1989-1996.
Spiller MP, Reijns MA, Beggs JD (2007) Requirements for nuclear localization of the LSm2-8p complex and competition between nuclear and cytoplasmic LSm complexes. J Cell Sci. 120: 4310-4320.
Spingola M, Grate L, Haussler D, Ares M Jr (1999) Genome-wide bioinformatic and molecular analysis of introns in Saccharomyces cerevisiae. RNA. 5: 221-234.
Staley JP, Guthrie C (1999) An RNA switch at the 5’ splice site requires ATP and the DEAD box protein Prp28p. Mol Cell. 3: 55-64.
Steitz TA, Steitz JA (1993) A general two-metal-ion mechanism for catalytic RNA. Proc Natl Acad Sci USA. 90: 6498-6502.
Stevens SW, Abelson J (1999) Purification of the yeast U4/U6•U5 small nuclear ribonucleoprotein particle and identification of its proteins. Proc Natl Acad Sci USA. 96: 7226-7231.
Stevens SW, Barta I, Ge HY, Moore RE, Young MK, Lee TD, Abelson J (2001) Biochemical and genetic analyses of the U5, U6, and U4/U6•U5 small nuclear ribonucleoproteins from Saccharomyces cerevisiae. RNA. 7: 1543-1553.
Stevens SW, Ryan DE, Ge HY, Moore RE, Young MK, Lee TD, Abelson J (2002) Composition and functional characterization of the yeast spliceosomal penta-snRNP. Mol Cell. 9: 31-44.
Surowy CS, van Santen VL, Scheib-Wixted SM, Spritz RA (1989) Direct, sequence specific binding of the human U1-70k ribonucleoprotein antigen protein to loop I of U1 small nuclear RNA. Mol Cell Biol. 9: 4179-4186.
Tan EM, Kunkel HG (1966) Characteristics of a soluble nuclear antigen precipitating with sera of patients with systemic lupus erythematosus. J Immunol. 96: 464-471.
Tanaka N, Aronova A, Schwer B (2007) Ntr1 activates the Prp43 helicase to trigger release of lariat-intron from the spliceosome. Genes Dev. 21: 2312-2325.
Tharun S, He W, Mayes AE, Lennertz P, Beggs JD, Parker R (2000) Yeast Sm-like proteins function in mRNA decapping and decay. Nature. 404: 515-518.
Thomas M, White RL, Davis RW (1976) Hybridization of RNA to double-stranded DNA: Formation of R-loops. Proc Natl Acad Sci USA. 73: 2294-2298.
Tilghman SM, Tiemeier DC, Seidman JG, Peterlin BM, Sullivan M, Maizel JV, Leder P (1978) Intervening sequence of DNA identified in the structural portion of a mouse β-globin gene. Proc Natl Acad Sci USA. 75: 725-729.
Tsai RT, Fu RH, Yeh FL, Tseng CK, Lin YC, Huang YH, Cheng SC (2005) Spliceosome disassembly catalyzed by Prp43 and its associated components Ntr1 and Ntr2. Genes Dev. 19: 2991-3003.
Tsai RT, Tseng CK, Lee PJ, Chen HC, Fu RH, Chang KJ, Yeh FL, Cheng SC (2007) Dynamic interactions of Ntr1-Ntr2 with Prp43 and with U5 govern the recruitment of Prp43 to mediate spliceosome disassembly. Mol Cell Biol. 27: 8027-8037.
Tseng CK, Liu HL, Cheng SC (2011) DEAH-box ATPase Prp16 has dual roles in remodeling of the spliceosome in catalytic steps. RNA. 17: 145-154.
Umen JG, Guthrie C (1996) Mutagenesis of the yeast gene PRP8 reveals domains governing the specificity and fidelity of 3’ splice site selection. Genetics. 143: 723-739.
Valadkhan S, Manley J (2001) Splicing-related catalysis by protein-free snRNAs. Nature. 413: 701-707.
Van der Veen R, Arnberg AC, van der Horst G, Bonen L, Tabak HF, Grivell LA (1986) Excised group II introns in yeast mitochondria are lariats and can be formed by self-splicing in vitro. Cell. 44: 225-234.
Venter J, Adams M, Myers E et al. (271 co-authors) (2001) The sequence of the human genome. Science. 291: 1304-1351.
Vidal VP, Verdone L, Mayes AE, Beggs JD (1999) Characterization of U6 snRNA-protein interactions. RNA. 5: 1470-1481.
Vidaver RM, Fortner DM, Loos-Austin LS, Brow DA (1999) Multiple functions of Saccharomyces cerevisiae splicing protein Prp24 in U6 RNA structural rearrangements. Genetics. 153: 1205-1218.
Vidovic I, Nottrott S, Hartmuth K, Lührmann R, Ficner R (2000) Crystal structure of the spliceosomal 15.5kD protein bound to a U4 snRNA fragment. Mol Cell. 6: 1331-1342.
Villa T, Guthrie C (2005) The Isy1p component of the NineTeen complex interacts with the ATPase Prp16p to regulate the fidelity of pre-mRNA splicing. Genes Dev. 19: 1894-1904.
Wassarman DA, Steitz JA (1992) Interaction of small nuclear RNAs with precursor messenger RNA during in vitro splicing. Science. 257: 1918-1925.
Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB (2008) Alternative isoform regulation in human tissue transcriptomes. Nature. 456: 470-476.
Wang Q, Zhang L, Lynn B, Rymond BC (2008) A BBP-Mud2p heterodimer mediates branch point recognition and influences splicing substrate abundance in budding yeast. Nucleic Acids Res. 36: 2787-2798.
Warkocki Z, Odenwälder P, Schmitzová J, Platzmann F, Stark H, Urlaub H, Ficner R, Fabrizio P, Lührmann R (2009) Reconstitution of both steps of Saccharomyces cerevisiae splicing with purified spliceosomal components. Nat Struct Mol Biol. 16: 1237-1243.
Weber G, Trowitzsch S, Kastner B, Lührmann R, Wahl MC (2010) Functional organization of the Sm core in the crystal structure of human U1 snRNP. EMBO J. 29: 4172-4184.
Weber S, Aebi M (1988) In vitro splicing of mRNA precursors: 5’ cleavage site can be predicted from the interaction between the 5’ splice site region and the 5’ terminus of U1 snRNA. Nucleic Acids Res. 16: 471-486.
Wei H, Mundade R, Lange KC, Lu T (2014) Protein arginine methylation of non-histone proteins and its role in diseases. Cell Cycle. 13: 32-41.
White RL, Hogness DS (1977) R loop mapping of the 18S and 28S sequence in the long and short repeating units of Drosophila melanogaster rDNA. Cell. 10: 177-192.
Wiest DK, O’Day CL, Abelson J (1996) In vitro studies of the Prp9•Prp11•Prp21 complex indicate a pathway for U2 small nuclear ribonucleoprotein activation. J Biol Chem. 271: 33268-33276.
Will CL, Lührmann R (2001) Spliceosomal UsnRNP biogenesis, structure and function. Curr Opin Cell Biol. 13: 290-301.
Wolff T, Bindereif A (1993) Conformational changes of U6 RNA during the spliceosome cycle: an intramolecular helix is essential both for initiating the U4-U6 interaction and for the first step of splicing. Genes Dev. 7: 1377-1389.
Wolff T, Menssen R, Hammel J, Bindereif A (1994) Splicing function of mammalian U6 small nuclear RNA: conserved positions in central domain and helix I are essential during the first and second step of pre-mRNA splicing. Proc Natl Acad Sci USA. 91: 903-907.
Wood V, Gwilliam R, Rajandream MA et al. (134 co-authors) (2002) The genome sequence of Schizosaccharomyces pombe. Nature. 415: 871-880.
Wu S, Romfo CM, Nilsen TW, Green MR (1999) Functional recognition of the 3’ splice site AG by the splicing factor U2AF35. Nature. 402: 832-835.
Xu D, Nouraini S, Field D, Tang SJ, Friesen JD (1996) An RNA-dependent ATPase associated with U2/U6 snRNAs in pre-mRNA splicing. Nature. 381: 709-713.
Xu YZ, Query CC (2007) Competition between the ATPase Prp5 and branch region U2 snRNA pairing modulates the fidelity of spliceosome assembly. Mol Cell. 28: 838-849.
Yang F, Wang XY, Zhang ZM, Pu J, Fan YJ, Zhou J, Query CC, Xu YZ (2013) Splicing proofreading at 5’ splice sites by ATPase Prp28p. Nucleic Acids Res. 41: 4660-4670.
Yean SL, Lin RJ (1991) U4 small nuclear RNA dissociates from yeast spliceosomes and does not participate in the subsequent splicing reaction. Mol Cell Biol. 11: 5571-5577.
Yean SL, Wuenschell G, Termini J, Lin RJ (2000) Metal-ion coordination by U6 small nuclear RNA contributes to catalysis in the spliceosome. Nature. 408: 881-884.
Zaric B, Chami M, Rémigy H, Engel A, Ballmer-Hofer K, Winkler FK, Kambach C (2005) Reconstitution of two recombinant LSm protein complexes reveals aspects of their architecture, assembly, and function. J Biol Chem. 280: 16066-16075.
Zaric BL, Kambach C (2008) Reconstitution of recombinant human LSm complexes for biochemical, biophysical, and cell biological studies. Methods Enzymol. 448: 57-74.
Zhou L, Hang J, Zhou Y, Wan R, Lu G, Yin P, Yan C, Shi Y (2014) Crystal structure of the Lsm complex bound to the 3’ end sequence of U6 small nuclear RNA. Nature. 506: 116-120.

