UBC Theses and Dissertations

Non-protein coding RNAs in the hyperthermophilic archaeon Sulfolobus solfataricus Auxilio, Maria 2006


Non-protein coding RNAs in the hyperthermophilic archaeon Sulfolobus solfataricus

by

MARIA AUXILIO ZAGO

B.Sc., Universidad de las Americas-Puebla, 1999

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES (Biochemistry and Molecular Biology)

THE UNIVERSITY OF BRITISH COLUMBIA

December 2006

© Maria Auxilio Zago, 2006

Abstract

The archaeal L7Ae protein is an integral component of three functionally distinct macromolecular ribonucleoprotein complexes: the 50S large ribosomal subunit, the C/D box modification particles and the H/ACA box particles. To better understand the function of the L7Ae protein and to investigate the diversity of RNAs specifically associated with this protein, immuno-affinity chromatography was used to isolate the sRNAs associated with L7Ae and RT-PCR was employed to construct a cDNA library. The isolated sRNAs were divided into different groups based on the presence of common known sequence and structural motifs and/or genomic location. Group one contained six RNAs that exhibited the features characteristic of the canonical C/D box archaeal sRNAs and one RNA representative of the archaeal H/ACA sRNA family. Group two contained fourteen sequences that were encoded either within, or overlapping the 5' end or 3' end of ORFs mostly coding for transposases. Interestingly, one of the clones in this group corresponded to the 5'-untranslated region (UTR) of L7Ae mRNA, indicating that L7Ae protein is able to interact with its own mRNA. The relevance of this interaction for the expression of L7Ae protein was further analyzed using a S. solfataricus in vitro translation system. Group three contained three sequences from intergenic regions.
Group four contained five antisense sequences complementary to the 5' end, 3' end or internal regions within annotated open-reading frames (ORFs) and two sequences antisense to bona fide C/D box sRNAs. Group five contained two sequences corresponding to internal regions of 7S RNA of the signal recognition particle (SRP). Additionally, with the aim of better understanding the versatility of L7Ae in its interaction with various sRNA substrates, we introduced mutations in a model C/D box sRNA and monitored the impact of mutation on the protein binding ability and on the methylation function of the sRNA. My data suggest that L7Ae protein might have an important regulatory role in archaeal cells, serving as a primary RNA-binding factor in various complexes with distinct functions. In addition, the results obtained in this study set the stage to further characterize all the sequences identified in this screening and to elucidate their function.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Co-authorship statement
1. Introduction
1.1 Non-protein-coding RNAs (ncRNAs)
1.2 Experimental approaches to identify ncRNAs
1.2.1 Computational approaches
1.2.2 Direct detection
1.3 Classes of ncRNAs
1.3.1 Modification-guide snoRNAs are divided in two families
1.3.2 MicroRNAs and small interfering RNAs
1.3.3 Other classes of ncRNAs
1.4 Archaeal modification guide sRNAs
1.4.1 Structure and function of archaeal C/D sRNP complex
1.4.2 Structure and function of archaeal H/ACA sRNP complex
1.5 The K-turn RNA motif
1.6 Project objectives
1.7 Chapter 1 figures
2. Materials and Methods
2.1 Materials
2.1.1 Prokaryotic strains and plasmids
2.1.2 Enzymes and chemicals
2.1.3 Oligonucleotides
2.2 Methods
2.2.1 Preparation of S. solfataricus cell extracts
2.2.2 Extraction of S. solfataricus genomic DNA
2.2.3 Extraction of S. solfataricus total RNA
2.2.4 Expression and purification of recombinant proteins in E. coli
2.2.5 Production of polyclonal antibodies
2.2.6 Immunoprecipitation of L7Ae-containing sRNP complexes from S. solfataricus cell extract
2.2.7 RT-PCR cloning
2.2.8 Cloning of the cDNA fragments
2.2.9 Screening of cDNA clones for non-coding RNAs
2.2.10 Toeprinting assay
2.2.11 In vitro methylation assay
2.2.12 In vitro translation assay
2.2.13 Standard molecular biology techniques
2.2.14 Gel electrophoresis
2.2.15 Bioinformatics tools
2.3 Chapter 2 Tables
3. Results
3.1 Construction of L7Ae-associated cDNA library
3.1.1 Immunoprecipitation of L7Ae-containing complexes
3.1.2 Construction and analysis of the cDNA library
3.1.3 Characterization of the sequences of the library
3.2 RNA sequences of the cDNA library
3.2.1 Group 1: Modification-guide sRNAs
3.2.2 Canonical C/D box sRNAs
3.2.3 Group 2: RNAs overlapping protein coding regions
3.2.4 Group 3: sRNAs encoded in intergenic regions
3.2.5 Group 4: Antisense RNAs
3.2.6 Group 5: Fragments of 7S RNA
3.2.7 Group 6: Fragments of rRNA and tRNA
3.3 Probing the structure and function of archaeal C/D box sRNP
3.3.1 Interaction of L7Ae with C/D box K-turn motifs
3.3.2 sR1 C/D box sRNA mutants
3.4 Regulation of gene expression: a new role for L7Ae protein?
3.4.1 L7Ae gene
3.4.2 L7Ae interacts with its own mRNA
3.4.3 Role of L7Ae in the regulation of gene expression
3.5 Chapter 3 Figures
3.6 Chapter 3 Tables
4. Discussion
4.1 The universe of the K-turn motif and novel ribonucleoproteins
4.1.1 L7Ae binding specificity
4.1.2 Novel L7Ae-containing RNPs
4.2 sRNAs containing the conserved C and D box sequence elements
4.2.1 In archaea C/D box sRNAs guide the methylation of tRNA targets
4.2.2 Atypical C/D box sRNAs
4.3 Other L7Ae-containing RNP complexes
4.4 Interaction of L7Ae with protein coding RNAs
4.4.1 How is the expression of L7Ae regulated in the cell?
4.4.2 Translation of leaderless mRNAs in S. solfataricus in vitro system
4.4.3 Interaction of L7Ae with the 5'-UTR of protein coding RNAs
4.4.4 Interaction of L7Ae with the 3'-UTR of protein coding regions
4.5 Novel antisense sRNAs as potential regulators
4.5.1 Antisense C/D box sRNAs
4.5.2 RNAs antisense to transposases
4.6 Assembly and function of C/D box RNP complexes
4.6.1 Circular permutation of sR1
4.6.2 Multiple and single nucleotide substitutions in the C'/D' motif
4.6.3 C/D box sRNAs with non-standard K-turn
4.7 Future Perspectives
4.7.1 More ncRNAs in S. solfataricus remain to be discovered
4.7.2 Functional characterization of the newly identified L7Ae-associated sRNAs
4.7.3 Identification and characterization of the components of novel L7Ae-containing RNP
4.8 Chapter 4 Figures
Bibliography

List of Tables

Table 2.1. Strain Genotype
Table 2.2. E. coli plasmid vectors
Table 2.3. Oligonucleotides used in this study
Table 3.1. Sequence of the cDNA clones
Table 3.2. Analysis of the L7Ae-associated RNAs isolated from S. solfataricus
Table 3.3. Functional homologs of S. acidocaldarius C/D box RNAs in the related genomes of S. solfataricus and S. tokodaii

List of Figures

Figure 1.1. Structural features of eukaryotic C/D box snoRNAs
Figure 1.2. Structural features of eukaryotic H/ACA box snoRNAs
Figure 1.3. Structural features of archaeal C/D box RNPs
Figure 1.4. Structural features of archaeal H/ACA box RNPs
Figure 1.5. Secondary and tertiary structure of the K-turn motif
Figure 3.1. Sucrose gradient sedimentation of particles containing the ribosomal protein L7Ae from Sulfolobus solfataricus cell extracts
Figure 3.2. Ribosomal protein L7Ae copurifies with the box C/D associated Fib and Nop5 proteins
Figure 3.3. sR101 - a new canonical C/D box sRNA
Figure 3.4. sR107 and sR108 - atypical C/D box RNAs
Figure 3.5. Structure and properties of sR107 RNA
Figure 3.6. sR108 sRNA
Figure 3.7. sR109 RNA - an H/ACA box pseudouridylation guide sRNA
Figure 3.8. Sense strand sRNAs overlapping the 5' end of annotated ORFs
Figure 3.9. Sense strand sRNAs overlapping the 3' end of annotated ORFs
Figure 3.10. sR125 RNA is an intermediate resulting from ligation of the processed pre-rRNA spacers
Figure 3.11. sR126 RNA
Figure 3.12. Antisense sRNAs
Figure 3.13. Secondary structure of 7S SRP RNA
Figure 3.14. 7S SRP RNA associates with the L7Ae protein
Figure 3.15. Sense-antisense 7S RNA interaction
Figure 3.16. Structure of the C/D box K-turn motif
Figure 3.17. Structure and methylation activity of sRNPs containing wild type and mutant sR1 sRNAs
Figure 3.18. Gel shift and methylation activity assays of wild-type sR1 and circularly permutated mutants
Figure 3.19. K-turn 38 and K-turn 15 sR1 mutants
Figure 3.20. L7Ae interacts with its own mRNA
Figure 3.21. In vitro translation of mRNA transcripts
Figure 3.22. Translation auto-repression assays with L7Ae protein
Figure 4.1. Structure comparison of L7Ae protein from different archaeal organisms and eukaryotic L7Ae-homologous proteins
Figure 4.2. The signal recognition particle ribonucleoprotein

List of Abbreviations

AdoMet    S-adenosyl-L-methionine
BSA    bovine serum albumin
bp    base pair
CMC    N-cyclohexyl-N'-β-(4-methylmorpholinium)ethylcarbodiimide p-tosylate
cpm    counts per minute
DEPC    diethylpyrocarbonate
dH2O    deionized water
DTT    dithiothreitol
EDTA    ethylenediaminetetraacetic acid
hr    hour(s)
IPTG    isopropyl-β-thiogalactopyranoside
kDa    kilodalton
LSU    large ribosomal subunit
min    minute
mRNA    messenger RNA
ncRNAs    non-coding RNAs
nt    nucleotide(s)
OD    optical density
ORF    open reading frame
PAGE    polyacrylamide gel electrophoresis
pCp    cytidine-5',3'-bisphosphate
PCR    polymerase chain reaction
Ψ    pseudouridine
rRNA    ribosomal RNA
rpm    revolutions per minute
S    Svedberg coefficient (unit of sedimentation)
Sac    Sulfolobus acidocaldarius
SD    Shine-Dalgarno sequence motif
SDS    sodium dodecyl sulphate
snoRNA    small nucleolar RNA
snoRNP    small nucleolar ribonucleoprotein particles
sRNA    small RNA
SRP    signal recognition particle
Sso    Sulfolobus solfataricus
Taq    Thermus aquaticus
TBE    Tris-Borate-EDTA solution
Tm    melting temperature
Tris    Tris(hydroxymethyl)aminomethane
tRNA    transfer RNA
U    unit
UTR    untranslated region
UV    ultraviolet
V    volt
v/v    volume per volume
W    watt
w/v    weight per volume

Acknowledgements

I would like to thank Patrick Dennis for giving me the opportunity to work in his lab and for introducing me to the study of the vast "RNA world". I would like to specially thank Arina Omer for her guidance, valuable discussions, technical advice and assistance throughout my research. I am especially grateful to George Mackie for his guidance and support during my research project. I would also like to thank the other members of my advisory committee, Natalie Strynadka and Patrick Keeling, for their helpful comments and suggestions. I would like to thank all the members of the former Dennis lab and Omer lab for their help and technical advice.
In particular, I want to thank Sonia Ziesche for her company and for the fruitful scientific discussions we had. Very special and warm thanks to Rodolfo Dominguez for all his loving support during challenging times.

Co-authorship statement

The results presented in this thesis were published in two research articles co-authored by Maria Zago, Arina Omer and Patrick Dennis. Sections 3.1 and 3.2 of the thesis were published in the paper "The expanding world of small RNAs in the hyperthermophilic archaeon Sulfolobus solfataricus" (published in Mol Microbiol. 2005 55(6): 1812-28), in which Maria Zago is the first author. Sections 3.3 and 4.6 were adapted from "Probing the structure and function of an archaeal C/D box methylation guide sRNA" (published in RNA 2006 12(9): 1708-20). Maria Zago contributed to the experimental work and to manuscript preparation.

1. Introduction

1.1 Non-protein-coding RNAs (ncRNAs)

The central dogma of biology states that genetic information normally flows from DNA to RNA to protein. For many years it was generally believed that the critical functions of the cell relied exclusively on proteins and that the complexity of an organism depended on its repertoire of protein-coding genes. Therefore, the study of the transcriptional activity of the genome was focused almost exclusively on the discovery and characterization of protein-coding genes. RNAs were considered accessory molecules, involved mainly in mediating the processes of transcription and translation. This simplistic view was first called into question by the discovery of introns and the associated finding of splice variants that allow the synthesis of more than one protein product from a single gene (reviewed in Hastings and Krainer, 2001).
In addition, the discovery of several cellular RNAs with catalytic properties, such as the RNA subunit of RNase P and spliceosomal RNAs, underlined the notion that the functions of RNA extend well beyond a transient role in ensuring the expression of protein-coding genes (reviewed in Levy and Ellington, 2001).

With the advent of the genomics era over the last decade the complete sequences for hundreds of microbial genomes have become available. In the effort to assign a function to this large informational output, it has become strikingly apparent that a substantial fraction of all genomes is devoted to making non-protein-coding RNAs or non-coding RNAs (ncRNAs) (Huttenhofer et al., 2002; Vitali et al., 2003; Vogel et al., 2003). Protein-coding genes are generally easy to identify by comparative genome analysis and/or computational gene prediction programs that are designed to identify open reading frames (ORFs), conserved promoter regions, polyadenylation signals and splice sites typically associated with protein-coding genes. In contrast, genes encoding ncRNAs are generally much more difficult to identify as they are typically short, have widely varying motifs, are often characterized more by their secondary structure than by their primary sequence and are not systematically processed (Eddy, 2001). ncRNA genes were also missed in genetic studies because of their small size and their resistance to frameshift and nonsense mutations, which apply only to protein-coding genes. Biochemical experiments also failed to identify ncRNAs because such studies were mostly designed to assay proteins. As a consequence, ncRNAs remained generally undiscovered until very recently, when relevant experimental approaches were developed (Eddy, 2002; Huttenhofer et al., 2002).

1.2 Experimental approaches to identify ncRNAs

Emerging evidence has indicated that many organisms contain a plethora of non-coding RNAs that are involved in a wide variety of cellular functions. These functions range from the well established roles in translation, ribosome assembly and intron splicing to the recently described involvement in the regulation of developmental genes, gene silencing, mRNA turnover and chromosomal architecture (reviewed in Eddy, 2002; Mattick, 2003). In recent years, new bioinformatical and experimental strategies have been used to identify a great number of novel ncRNA candidates in various model organisms from Escherichia coli to Homo sapiens (Argaman et al., 2001; Huttenhofer et al., 2001; Storz, 2002; Wassarman et al., 2001), demonstrating that the number of ncRNAs in genomes of model organisms is much higher than had been anticipated.

1.2.1 Computational approaches

Whole genome sequence analysis and annotation of both prokaryotic and eukaryotic organisms have generally failed to identify the portion of the genome devoted to the production of ncRNAs or to reveal their structural and functional diversity. However, several different computational approaches developed in the past few years have proved to be successful methods to predict the presence of new ncRNA genes. Approaches designed to detect sequence conservation in intergenic regions in the genomes of related species, alone and in combination with algorithms that search for conservation of secondary structure, have been very successful in predicting ncRNA genes in E. coli (Argaman et al., 2001), C. elegans (Grad et al., 2003), and Arabidopsis thaliana (Jones-Rhoades and Bartel, 2004). The presence of binding sites for specific DNA-binding proteins as well as promoter and terminator sequences in the intergenic regions were other criteria used to predict possible RNA genes in E. coli (Argaman et al., 2001; Chen et al., 2002) and S.
cerevisiae (Olivas et al., 1997). The detection of GC-rich regions in the AT-rich genomes of Methanococcus jannaschii and Pyrococcus furiosus led to the identification of novel ncRNAs (Klein et al., 2002). A few S. cerevisiae ncRNAs were also detected in unusually long intergenic regions (>2 kb) (Olivas et al., 1997).

Although the computational programs mentioned above led to the identification of ncRNA genes located in intergenic regions, these methods failed to identify ncRNAs encoded on the opposite strand of protein-coding genes, such as the cis-antisense ncRNAs. In addition, most of the computational approaches overlook ncRNAs that are specific to one species or are less well conserved, as they rely heavily on sequence conservation among different species (Storz et al., 2005). Finally, the ncRNAs predicted by computational approaches have to be validated experimentally.

1.2.2 Direct detection

As an alternative to these computational methods, a number of innovative biochemical approaches have been developed to identify small non-coding RNAs. Among the most productive has been the systematic cloning of cDNAs from the size-selected RNA fraction that is between 50 and 500 nt in length. In this method clones of known RNAs were identified by high-throughput filter hybridization and the clones showing hybridization signals below a specific threshold were considered novel and then sequenced. This strategy led to identification of several novel ncRNAs in E. coli (Vogel et al., 2003), A. fulgidus (Tang et al., 2002a), D. melanogaster (Yuan et al., 2003) and mouse (Huttenhofer et al., 2001). Similar cloning approaches were used to isolate microRNAs from organisms ranging from C. elegans (Lagos-Quintana et al., 2001) to mouse (Lagos-Quintana et al., 2002). In addition to the systematic cloning strategy, several ncRNAs have also been identified on the basis of their association with RNA-binding proteins.
In early screens, immunoprecipitation experiments with anti-Sm antibodies from patients with systemic lupus erythematosus led to the identification of the snRNAs (Lerner et al., 1981). More recently, antibodies against the general RNA-binding protein Hfq (Host factor for Qbeta phage RNA replication) and fibrillarin proteins were used to co-immunoprecipitate novel ncRNAs in E. coli (Zhang et al., 2002) and the archaeon S. acidocaldarius (see below) (Omer et al., 2000), respectively.

Additionally, microarrays have also been used to discover new ncRNAs or to study the expression of ncRNAs. In Bacteria, microarrays carrying probes from intergenic regions, in addition to the coding regions, were used to specifically analyze the transcriptional output from E. coli under different growth conditions (Wassarman et al., 2001). This global analysis of the E. coli transcriptome allowed the detection of several novel ncRNA candidates, including 5' and 3' UTR RNA fragments that accumulate independently after the processing of mRNA transcripts (Kawano et al., 2005). In eukaryotes, microarrays have been increasingly used to confirm global predictions of certain classes of eukaryotic ncRNAs as well as to study their expression profile in different tissues (Barad et al., 2004). Although microarrays have proven to be a very useful tool in the identification of novel ncRNAs, the results obtained by microarray experiments need to be experimentally verified by northern blot analysis to eliminate false positives (Huttenhofer and Vogel, 2006).

1.3 Classes of ncRNAs

RNAs can be divided into two distinct groups: messenger RNAs (mRNAs), which are translated into proteins, and the non-protein-coding RNAs (ncRNAs), which lack a protein-coding capacity and function at the RNA level. The ncRNAs function either on their own or in complex with bound proteins, with which they form RNP complexes.
The functions of these ncRNAs are often mediated by secondary or tertiary structure and/or intermolecular base pairing and carried out in association with specific proteins, within ribonucleoprotein (RNP) particles (Huttenhofer et al., 2005). Much of the recent research on ncRNAs has focused on the study of two large classes of ncRNAs: (i) small nucleolar RNAs (snoRNAs); and (ii) the microRNA (miRNA) and small interfering RNA (siRNA) families. Nevertheless, new ncRNA candidates have been identified that lack the characteristic features of any known ncRNA family.

1.3.1 Modification-guide snoRNAs are divided in two families

In eukaryotes, the largest number of ncRNAs is localized in the nucleolus and they are referred to as small nucleolar RNAs (snoRNAs). The vast majority of snoRNAs are involved in the posttranscriptional modification of ribosomal RNA (rRNA) precursors (Maden, 1990; Weinstein and Steitz, 1999), and these snoRNAs can be divided into two major families: the C/D box family, which guides the methylation of the 2'-O-ribose position, and the H/ACA box family, which guides the conversion of uridine to pseudouridine. Each snoRNA family associates with specific sets of proteins to form discrete small ribonucleoprotein particles (snoRNPs).

1.3.1.1 C/D box methylation guide snoRNAs

C/D box snoRNAs contain one pair of conserved sequence elements called box C (RUGAUGA) and box D (CUGA). Occasionally, a less well conserved pair of the C and D boxes is present and termed box C' and box D' (Bachellerie et al., 2002; Cavaille and Bachellerie, 1998; Kiss, 2001; Tycowski et al., 1996). Boxes C and D are located near the 5' and 3' ends of the RNA molecule, respectively, whereas the C' and D' boxes are located towards the center of the RNA (Figure 1.1). Immediately upstream of the D and the D' boxes are located the antisense guide elements.
These consist of a 10-20 nucleotide region with perfect complementarity to rRNA or snRNA targets. Methyl transfer is directed to the target nucleotide that base pairs with the fifth position in the guide upstream from the start of the D or D' box; this is known as the 'N plus five' rule (Kiss-Laszlo et al., 1996; Smith and Steitz, 1997; Tycowski et al., 1996). The C and D motifs are typically brought together by base pairing of the adjacent 4-5 nt present at the 5' and 3' ends of the snoRNA. The formation of this terminal stem structure is critical for snoRNA biogenesis, nucleolar localization and function (Cavaille et al., 1996). The juxtaposed C and D boxes form a secondary structure motif known as the kink-turn (K-turn) motif (see below; Watkins et al., 2000).

Eukaryotic methylation-guide snoRNAs associate with four common and evolutionarily conserved core proteins: fibrillarin (Nop1 in yeast), Nop56, Nop58 and 15.5kDa (Snu13 in yeast) (Venema and Tollervey, 1999; Watkins et al., 2000). Fibrillarin protein is the 2'-O-methyltransferase (Tollervey et al., 1993). It exhibits amino-acid sequence motifs characteristic of the S-adenosyl-methionine-dependent methyltransferases (Niewmierzycka and Clarke, 1999). Nop56 and Nop58 are paralogous proteins that share more than 78% amino acid sequence identity (Filippini et al., 2000; Newman et al., 2000). However, in vitro crosslinking experiments demonstrated that Nop58 and Nop56 are differentially bound to the box C/D and C'/D' motifs, respectively (Szewczak et al., 2002). The 15.5kDa/Snu13 protein specifically binds to the K-turn motifs formed by the C/D boxes and helps to recruit the other methylation-guide proteins into the complex (Watkins et al., 2000).
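
The 'N plus five' rule described above can be illustrated with a minimal Python sketch. It is not part of the original work: the function name, the fixed 10-nt guide length and the exact-match search for the target site are all assumptions made for illustration, and real guide searches would need to tolerate mismatches and variable guide lengths.

```python
import re

# Box consensus sequences as given in the text (R = A or G).
C_BOX = re.compile(r"[AG]UGAUGA")
D_BOX = "CUGA"

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]


def predict_methylation_site(snorna, target, guide_len=10):
    """Hypothetical helper: 0-based index in `target` of the nucleotide
    predicted to be methylated, or None if no site is found.

    Assumptions: the D box is the last CUGA in the snoRNA, a box C lies
    upstream of it, and the guide is the `guide_len` nucleotides
    immediately upstream of the D box, pairing perfectly with the target.
    """
    d_start = snorna.rfind(D_BOX)
    if d_start < guide_len or not C_BOX.search(snorna[:d_start]):
        return None
    guide = snorna[d_start - guide_len:d_start]
    site = target.find(reverse_complement(guide))
    if site == -1:
        return None
    # The guide nucleotide five positions upstream of the D box is guide[-5];
    # in the antiparallel duplex it pairs with the fifth target nucleotide
    # of the complementary stretch, i.e. offset 4 from its 5' end.
    return site + 4


# Toy sequences (invented for this example, not taken from the thesis).
snorna = "GUGAUGA" + "UUUU" + "ACGGAUCCAU" + "CUGA"
target = "GGGG" + "AUGGAUCCGU" + "CCCC"
print(predict_methylation_site(snorna, target))  # -> 8
```

The same offset logic would apply to a D' box and its upstream guide; only the choice of which CUGA occurrence to use would change.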

1.3.1.2 H/ACA box pseudouridylation-guide snoRNAs

The H/ACA snoRNAs contain two conserved boxes and share a common secondary structure consisting of two large hairpin domains linked by a single-stranded hinge region and followed by a short tail (Figure 1.2). The conserved H box (ANANNA, where N represents any nucleotide) is located in the hinge region and the ACA box is located in the tail, 3 nucleotides upstream from the 3' end of the snoRNA (Balakin et al., 1996; Ganot et al., 1997). The H box, the ACA box and the duplex hairpin structures are required for processing of snoRNA precursors, protein binding and nucleolar localization (Filipowicz and Pogacic, 2002; Kiss, 2001). The sequence antisense to rRNAs that surrounds the uridine target for modification is located in a large internal loop within each hairpin structure known as the pseudouridylation pocket. The uridine target for modification is left unpaired. Usually, the pseudouridylation pocket is located 14 nucleotides from the H box or ACA box (Ganot et al., 1997; Ni et al., 1997).

The pseudouridylation-guide snoRNAs are known to associate with four core proteins: Cbf5, Gar1, Nhp2 and Nop10 (Ganot et al., 1997; Henras et al., 1998; Lafontaine et al., 1998; Watkins et al., 1998). Cbf5 (centromere-binding factor 5) is the pseudouridine synthase within the H/ACA snoRNP complex (Lafontaine et al., 1998; Zebarjadian et al., 1999). Gar1 is a 25 kDa nucleolar protein that contains a glycine/arginine-rich domain common to other nucleolar proteins, including nucleolin and fibrillarin. Gar1 is required for pseudouridylation of yeast rRNA (Bousquet-Antonelli et al., 1997). Nhp2 protein exhibits significant similarity to the 15.5kDa proteins, the ribosomal L30 and archaeal L7Ae protein (see below) (Nottrott et al., 2002). The role of Nhp2 in the H/ACA assembly and function is not known.
Nop10 is a small nucleolar protein and it might have a role in the stabilization of the H/ACA snoRNP complexes (Henras et al., 1998).

1.3.2 MicroRNAs and small interfering RNAs

MicroRNAs (miRNAs) are involved in the control of developmental timing and/or tissue-specific functions (reviewed in Banerjee and Slack, 2002). The first miRNAs to be described corresponded to the products of the lin-4 and let-7 genes in Caenorhabditis elegans (Lee et al., 1993; Reinhart et al., 2000). These miRNAs are 22 and 21 nt in length and are processed from larger precursor RNAs. They inhibit translation through antisense interactions with the 3' untranslated region of target mRNAs. miRNAs are widely distributed in all organisms. The miRNA genes are found in introns as well as in intergenic clusters and contain short inverted repeats that form a double-stranded RNA (dsRNA) stem-loop structure of about 70 base pairs (bp) when transcribed (reviewed in Carrington and Ambros, 2003). Processing of this miRNA precursor produces the small 21-25 nucleotide effector molecules, which are usually derived from only one strand of the stem-loop structure. Some miRNAs have homologues in both vertebrates and invertebrates, although their small size often renders the criterion of conservation between species insufficient for the identification and isolation of new miRNAs. The number of miRNAs in human is thought to be 220-250 (Lim et al., 2003).

1.3.3 Other classes of ncRNAs

Numerous ncRNAs that do not belong to any of the known families of ncRNA (such as rRNA, tRNA, s(no)RNAs, gmRNAs, miRNAs, siRNA) have also been identified in experimental RNomics screens. The function of these ncRNAs remains largely unknown. The new classes of ncRNAs include RNAs antisense to mRNAs or ncRNAs, transcribed pseudogenes and riboswitches.
Of these new classes of ncRNAs, the antisense RNAs are the best represented. Recent evidence indicates that 12% of mammalian ncRNA transcription is antisense to other known genes. However, it is not known to what extent these antisense transcripts are functional (Kampa et al., 2004). Some antisense transcripts have been implicated in gene regulation at the level of transcriptional interference, RNAi or methylation modification, suggesting that these transcripts might have an important role in regulating gene expression (Lavorgna et al., 2004). Transcripts of pseudogenes have been detected in cDNA libraries from mouse (Zhang et al., 2004) and analysis of the human genome sequence has predicted the presence of approximately 20,000 pseudogenes (Yano et al., 2004). Riboswitches have been identified in bacteria. Riboswitches are cis-acting regulatory RNAs, located in the 5'-UTR of certain mRNAs, that function as direct receptors for intracellular metabolites. Ligand binding induces a conformational change in the riboswitch that alters gene expression (Mandal and Breaker, 2004).

1.4 Archaeal modification guide sRNAs

Seminal work over the last few years has revealed that the ribosomes of Sulfolobus solfataricus (and presumably other archaeal species) contain a large number of sites of 2'-O-ribose methyl modification and that these modifications are mediated by RNP complexes that are homologous to eukaryotic C/D box snoRNPs (Noon et al., 1998; Omer et al., 2000). The initial biochemical discovery of archaeal C/D box RNAs was made by sequencing entries in a S. acidocaldarius cDNA library that was prepared from small RNAs that were co-immunoprecipitated using antibodies against the archaeal Fib or Nop5 (also referred to as Nop56/58) protein (Omer et al., 2000).
Based on the presence of the four conserved box (C, D', C' and D) sequences and the spacing between the boxes in the cloned RNAs, search programs were designed to help identify the genes encoding these sRNAs in sequenced archaeal genomes. The results of these searches revealed that genes encoding C/D box sRNAs are abundant in archaeal genomes, particularly the genomes of organisms that grow at high temperatures (Gaspin et al., 2000; Omer et al., 2000). This correlation with growth temperature may mean (i) that the base pairing occurring between the guide regions of the C/D box RNAs and the nascent rRNA assists in the proper folding of the rRNA during assembly or (ii) that the deposition of methyl groups along the rRNA backbone contributes to the stabilization of higher-order structure of the RNA within the ribosome. In a number of instances, the presence of a methyl modification predicted by the complementarity between the sRNA guide and rRNA target sequences has been confirmed (Dennis et al., 2001; Omer et al., 2000).

1.4.1 Structure and function of archaeal C/D sRNP complex

The archaeal methylation-guide C/D box RNPs are slightly smaller than their eukaryotic counterparts, probably reflecting the size constraints in compact archaeal genomes. Archaeal C/D box RNPs consist of a single small RNA (sRNA) about 50-60 nt in length and two copies of each of three proteins: L7Ae (a homologue of the eukaryotic 15.5kDa protein), Fib and Nop5 (Omer et al., 2002; Rashid et al., 2003). The sRNAs are characterized by a bipartite structure that consists of moderately conserved C (UGAUGA) and D (CUGA) box sequence motifs located near the 5' and 3' ends of the molecules and the less well conserved D' and C' boxes located near the center of the molecule (Figure 1.3).
The C/D and C'/D' boxes form two separate K-turn motifs that are each stabilized by association with the L7Ae protein (Kuhn et al., 2002). The Fib protein component has a conserved methyltransferase fold and employs S-adenosyl-L-methionine (AdoMet) as the cofactor in the methyl transfer reaction. The Nop5 protein forms a heterodimer with the Fib protein and helps to anchor the catalytic subunit to the RNA complex (Aittaleb et al., 2003). Neither Fib, Nop5 nor the Fib-Nop5 heterodimer shows appreciable affinity for either the naked C/D box sRNA or the L7Ae protein (Omer et al., 2002). Nevertheless, upon binding of L7Ae protein to the C/D box sRNA, two copies of the Fib-Nop5 heterodimer rapidly assemble to produce a larger complex that is active in guide-directed methylation. These observations suggest that the binding of the L7Ae protein stabilizes the structure of the K-turn within the RNA, revealing features that can then be recognized by the Fib-Nop5 complex (Omer et al., 2002). Mutational analyses along with a crystal structure of the Fib-Nop5 heterodimer have provided additional clues relating to the overall structure of the fully assembled RNP complex. The binding of L7Ae protein stabilizes the highly kinked conformation of the C/D and C'/D' K-turn motifs. L7Ae binds only the loop of the K-turn, while stem II is required for binding of Nop5 and Fib (Aittaleb et al., 2003) (for the structure of the K-turn see Figure 1.5). The Fib-Nop5 structure suggests that stem II of the C/D box is bound by the positively charged surface of the C-terminal domain of Nop5. The interface between Fib and the N-terminal region of Nop5 is extensive, largely non-polar and exhibits surface-complementary features that interface the two proteins and stabilize the binding pocket of the AdoMet cofactor, probably by facilitating the conformational change at the Fib active site required for AdoMet binding (Aittaleb et al., 2003).
The coiled-coil domain in the center of the Nop5 protein mediates self-dimerization and optimally positions the two Fib-Nop5 heterodimers into the bipartite, fully assembled and active RNP complex (Figure 1.3; Aittaleb et al., 2003).

1.4.2 Structure and function of archaeal H/ACA sRNP complex

Archaea appear to contain only a few H/ACA RNPs and pseudouridylation modifications in rRNA are infrequent. The first identification of H/ACA RNAs in Archaea came from sequencing of the entries in a cDNA library prepared from small RNAs from A. fulgidus (Tang et al., 2002a). The low abundance of these H/ACA RNAs and their poorly conserved features have precluded the development of an effective H/ACA gene-finding program that can be widely applied to other archaeal genomes. The presence of pseudouridine modifications in rRNA at sites predicted from guide sequences of the cloned sRNAs has been confirmed by biochemical analysis (Rozhdestvensky et al., 2003; Tang et al., 2002a). Homologues of the eukaryotic H/ACA snoRNA-associated proteins are encoded in archaeal genomes: Cbf5 (the putative pseudouridine synthase), Gar1 and Nop10 (Bult et al., 1996; Watanabe and Gray, 2000). The eukaryotic H/ACA-associated protein Nhp2 is a member of the 15.5kDa family of proteins. In archaeal H/ACA sRNPs, this protein is replaced by the homologous L7Ae protein, which is also a component of the large ribosomal subunit and of C/D box RNPs (see above) (Rozhdestvensky et al., 2003). Analysis of archaeal H/ACA sRNAs reveals the presence of an RNA K-turn structural motif that serves as the binding site for the L7Ae protein (Figure 1.4; Rozhdestvensky et al., 2003). The secondary structure of archaeal H/ACA sRNAs consists of either one or three stem-loop structures with an ACA or AGA motif positioned downstream from the stem structure (Rozhdestvensky et al., 2003; Tang et al., 2002a).
Sequences in the large internal loop of each stem can form canonical bipartite guide duplexes of 9-13 bp around a target uridine in 16S rRNA or 23S rRNA. This pseudouridylation pocket is usually located 15-16 nt from the ACA or AGA box motif (Tang et al., 2002a). The K-turn motif in these sRNAs is invariably positioned 5-6 bp away from the base of the stem located above the pseudouridylation pocket (Rozhdestvensky et al., 2003; Figure 1.4). Each stem-loop structure of the H/ACA sRNAs appears to contain one copy of each of the four H/ACA-associated proteins (Tang et al., 2002a). Recent experiments reconstituting active H/ACA pseudouridylation complexes using protein components and Pf9 sRNA from P. furiosus (Baker et al., 2005) or Pab91 sRNA from P. abyssi (Charpentier et al., 2005) have provided valuable information about the structural organization of these RNPs in the third domain of life. Pf9 sRNA is 75 nt in length and contains a conserved ACA box located at the 3' base of the stem, a pseudouridylation pocket located within a central loop of the RNA, a K-turn motif near the top of the stem and a terminal loop with the conserved GAG sequence (Baker et al., 2005). Pab91 sRNA differs from Pf9 in that it is 68 nt long, the K-turn is located in the loop at the top of the hairpin and the conserved GAG sequence in the terminal loop is missing (Charpentier et al., 2005). The bipartite guide region of the H/ACA sRNAs base pairs with its target RNA, and the uridine to be modified and the nucleotide 3' to it are left unpaired in the pseudouridylation pocket. Gel-shift assays and protein-interaction experiments using wild-type or mutant Pf9 guide RNAs indicate that Cbf5 and L7Ae bind independently to the sRNA, whereas Gar1 and Nop10 do not interact directly with the sRNA, with L7Ae or with each other.
Rather, they assemble into the complex through their independent associations with Cbf5 (Baker et al., 2005). The interaction of Cbf5 with the guide H/ACA sRNA depends on the presence of the ACA box, the pseudouridylation pocket and, to some extent, on the GAG sequence in the terminal loop of the hairpin. After Cbf5 binds to the H/ACA sRNA, the Cbf5-sRNA complex independently recruits Gar1 and Nop10 into the complex. The complex remains inactive in pseudouridylation until L7Ae binds to the K-turn and introduces a sharp bend into the backbone of the RNA hairpin that is presumably required for activation of the archaeal H/ACA complex (Baker et al., 2005). Inclusion of a fragment of rRNA complementary to the bipartite guide with an appropriately positioned uridine results in target uridine isomerization. It has been suggested that Gar1 and Nop10 may be involved in interaction with or release of the rRNA substrate (Baker et al., 2005). The results obtained in the in vitro reconstitution study with Pab91 have provided a slightly modified view of the assembly and activity of the archaeal H/ACA box RNP complex. In contrast to the previous work, the results obtained by Charpentier et al. (2005) suggest that the ACA box is not required for activity or for Cbf5 binding and that the Cbf5-Nop10 pair is the minimal set of proteins needed for the formation of a particle active in pseudouridylation. In addition, Charpentier et al. (2005) suggest that L7Ae might play an important role in the assembly of the pseudouridylation complex by helping to stabilize the interaction between Cbf5 and the sRNA through protein-protein interactions or by folding the sRNA into a more favorable conformation.
Nevertheless, both studies indicate that maximal pseudouridylation activity occurs when all four proteins and the ACA box are present, and that efficient binding of the target RNA to the RNP complex requires a uridine at the site of pseudouridylation. The variations observed between the two studies may be the result of differences in the in vitro structure, assembly and activity of different archaeal box H/ACA sRNAs or in the sensitivities of the assays used in the two studies.

1.5 The K-turn RNA motif

The RNA kink-turn (or K-turn) was first observed at six separate 23S rRNA locations in the crystal structure of the archaeal 50S ribosomal subunit from Haloarcula marismortui (Ban et al., 2000). Later on, it was found that this RNA motif was not only present in rRNA, but also in mRNAs (Mao et al., 1999; Winkler et al., 2001), modification-guide RNAs (Kuhn et al., 2002; Rozhdestvensky et al., 2003; Watkins et al., 2000) and spliceosomal RNAs (Vidovic et al., 2000). The structure of the K-turn motif in all these different RNAs is highly similar and generally equivalent (Figure 1.5A). The K-turn is characterized by two short helices that are connected via an asymmetric internal loop that is partially closed by two adjacent sheared G:A base pairs. This motif is stabilized by A-minor interactions of the stacked adenosines of the G:A base pairs with one of the helices and by stacking of two of the nucleotides of the asymmetric loop on each of the two helices. The nucleotide adjacent to the second G:A base pair (usually U) protrudes from the loop as a consequence of a sharp 45° to 63° bend in the RNA backbone (Klein et al., 2001) (Figure 1.5B). Experiments with RNA fragments containing K-turn motif sequences showed that the K-turn motif exists in a dynamic equilibrium between an open structure, similar to a simple bulge bend, and a highly ordered and tightly kinked conformation (Goody et al., 2004; Matsumura et al., 2003).
In these experiments it was observed that the electrophoretic mobility of the K-turn-containing RNA fragments was strongly retarded when divalent metal ions were present in the solution, indicating a highly kinked structure of the RNA. However, in the absence of divalent ions the RNAs migrated at a speed that is typical of a normal 3-nt bulge, indicating a looser conformation of the RNA (Goody et al., 2004; Matsumura et al., 2003). In addition, Turner et al. (2005), using fluorescence resonance energy transfer, showed that the binding of L7Ae protein to an RNA fragment containing a K-turn motif induces the formation of the highly kinked conformation even in the absence of metal ions. The results of these experiments suggest that the tightly kinked conformation of the K-turn motif is stabilized either by the presence of divalent ions or by the binding of proteins. All of the K-turns identified so far, except K-turn 38, have a protein associated with them. In the 50S ribosomal subunit, five of the 23S rRNA K-turns interact with nine different ribosomal proteins (Ban et al., 2000; Klein et al., 2001). The K-turn motifs of the eukaryotic and archaeal C/D box RNAs interact with the 15.5kDa and L7Ae proteins, respectively (Kuhn et al., 2002; Watkins et al., 2000). The L7Ae protein also interacts with the K-turn motif of archaeal H/ACA box sRNAs (Rozhdestvensky et al., 2003), whereas the Nhp2 protein interacts with eukaryotic H/ACA snoRNAs (Henras et al., 2001). The 15.5kDa, L7Ae and Nhp2 proteins belong to the same family of RNA-binding proteins (Koonin et al., 1994). Other members of this family of proteins include the SBP2 protein, which binds to the RNA selenocysteine insertion sequence (SECIS) element of some mRNAs and is part of the machinery that mediates the translational insertion of selenocysteine into eukaryotic proteins (Copeland and Driscoll, 2001; Copeland et al., 2000), and the ribosomal protein L30e.
The observation that most of the K-turn motifs are associated with proteins, along with the structural dimorphism of the motif, suggests that this RNA motif acts as a protein-binding motif rather than as an organizing feature of RNA (Goody et al., 2004).

1.6 Project objectives

Recent evidence shows that in Archaea the ribosomal protein L7Ae is an integral component of three functionally distinct macromolecular ribonucleoprotein complexes: the 50S large ribosomal subunit, the C/D box modification particles and the H/ACA box particles. The ability of L7Ae protein to interact with functionally different RNAs suggested a multifunctional role for this protein in archaeal cells. In addition, the prevalence of the L7Ae binding motif in many distinct RNAs suggested that L7Ae might be a core protein component of different RNP particles and therefore might play an important role in archaeal cells by helping to coordinate the function of distinct RNPs. The main objective of this research was to identify and characterize ncRNAs that associate with L7Ae protein from the hyperthermophilic archaeon S. solfataricus to better understand the function of the L7Ae protein in the third domain of life. Furthermore, the identification of new L7Ae-associated ncRNAs represents the first step in the characterization of novel L7Ae-containing RNP complexes. Overall, this study will contribute to enhancing our understanding of the mechanisms employed by Archaea to regulate gene expression.

1.7 Chapter 1 figures

Figure 1.1. Structural features of eukaryotic C/D box snoRNAs. The C/D box snoRNAs contain two conserved sequence elements known as the C box (represented in yellow) and the D box (represented in blue) located at the 5' and 3' ends, respectively.
More imperfect copies of the conserved sequence elements, referred to as the C' (represented in yellow) and D' (represented in blue) boxes, are found towards the center of the RNA molecule. A target rRNA is shown base paired to the complementary antisense elements of the C/D snoRNA found upstream of the D and D' boxes. 2'-O-Ribose methylation (red circle) is directed to the nucleotide in the rRNA that participates in a Watson-Crick base pair with the 5th nucleotide upstream of the D or D' box.

Figure 1.2. Structural features of eukaryotic H/ACA box snoRNAs. The eukaryotic H/ACA snoRNAs consist of two hairpin structures connected by a single-stranded region, the H box (shown in yellow). A conserved ACA motif (shown in blue) is located at the 3' end of the snoRNA. The target rRNA is shown interacting with the two regions of hyphenated complementarity located within the internal loops of each of the hairpin motifs. The uridine (Ψ) targeted for modification and the nucleotide 3' from it (N) are left unpaired in the pseudouridylation pocket.

Figure 1.3. Structural features of archaeal C/D box RNPs. A model for the structure of a double-guide C/D box RNP complex is depicted. The conserved boxes C and C' are shown in yellow, whereas the D and D' boxes are in blue. The antisense sequence elements located 5' to the D and D' boxes are shown base pairing with their respective rRNA targets. 2'-O-Ribose methylation (red circle) is directed to the nucleotide in the rRNA that participates in a Watson-Crick base pair with the 5th nucleotide upstream of the D and D' boxes. L7Ae protein binds to the conserved C/D and C'/D' boxes and stabilizes the K-turn motif. The binding of L7Ae protein then recruits two copies of the Fib-Nop5 heterodimer onto the complex. Nop5 protein serves as a bridge between the C/D box sRNA and the catalytic Fib subunit.
The carboxyl terminus of Nop5 contacts the newly exposed determinants in the RNA, whereas its amino terminus forms a complementary surface with Fib protein and helps to position the AdoMet substrate. The coiled-coil domain of Nop5 in the center of the molecule allows for Nop5-Nop5 dimerization (Figure adapted from Dennis and Omer, 2006).

Figure 1.4. Structural features of archaeal H/ACA box RNPs. A model for the structure of the archaeal H/ACA box RNP complex is depicted. The hairpin structure of the H/ACA sRNA, with the conserved ACA motif at the 3' end and the pseudouridylation pocket located in the large internal loop, is shown. The K-turn motif located at the top of the stem is indicated by dashed lines. The GAG sequence in the terminal loop is depicted. Cbf5 and L7Ae bind directly and independently to the sRNA. The interaction of Cbf5 with the sRNA is dependent on the ACA motif, the pseudouridylation pocket and the GAG sequence in the terminal loop. Through protein-protein interactions Cbf5 recruits Nop10 and Gar1 onto the complex (Figure adapted from Dennis and Omer, 2006).

[Figure 1.5A panels: consensus K-turn; consensus C/D box K-turn; 23S rRNA KT-15 K-turn, H. marismortui; 23S rRNA KT-7 K-turn, H. marismortui]

Figure 1.5. Secondary and tertiary structure of the K-turn motif. (A) Secondary structure of different K-turn motifs. From left to right: the K-turn consensus sequence as indicated by Klein et al. (2001); the consensus sequence for the K-turn formed by the C and D boxes in the C/D box sRNAs (Omer et al., 2003); K-turn 15 of the 23S rRNA as determined in the crystal structure of H.
marismortui 50S ribosome subunit; K-turn 7 of the 23S rRNA of H. marismortui (Klein et al., 2001). The colors of the residues in the secondary structure diagram of K-turn 7 match the colors of the tertiary structure representation. (B) Tertiary structure of K-turn 7 of the 23S rRNA from H. marismortui (Klein et al., 2001). The protruding nucleotide and the sheared G:A base pairs are indicated by arrows. The highly kinked structure of this motif is stabilized by the stacking of the G:A base pairs, by an A-minor interaction between the A of the second G:A base pair (A98 in the figure) and the G-C base pair (G81-C93 in the figure) of stem I, and by stacking of the two 5'-most nucleotides of the asymmetric loop on each of the stems (G94 and A95 in the figure). As a consequence of the bend in the RNA backbone, the 3'-most nucleotide of the asymmetric loop protrudes into the solution.

2. Materials and Methods

2.1 Materials

2.1.1 Prokaryotic strains and plasmids

Sulfolobus solfataricus strain

Sulfolobus solfataricus strain ATCC 33092/DSM 1617/P2 (Zillig et al., 1980) was used to obtain genomic DNA, total RNA and ncRNAs.

E. coli strains

E. coli strains JM109 and NovaBlue were used for the propagation of plasmid vectors. Oneshot® competent cells (Invitrogen, Carlsbad, CA) were used in the construction of the cDNA library. BL21(DE3)pLysS and BL21(DE3)pLysE were used in the expression of the recombinant proteins. The genotype of each strain is listed in Table 2.1. The vectors used in the construction of the library and cloning of the recombinant proteins are listed in Table 2.2.

2.1.2 Enzymes and chemicals

Enzymes

All the enzymes used in this study were purchased from Invitrogen (Carlsbad, CA). Sequenase™ was obtained from USB/Amersham.

Chemicals

Chemicals were purchased from Fisher Scientific, Sigma-Aldrich or Pharmacia. Yeast extract,
Bactotryptone and BactoAgar were supplied by Difco Laboratories (Detroit, MI). All the labeled nucleotides were obtained from Perkin Elmer. Non-radioactive nucleotides were obtained from Amersham/Pharmacia and Invitrogen. The Sequenase Version 2.0 sequencing kit was purchased from USB/Amersham.

2.1.3 Oligonucleotides

All oligonucleotides were synthesized by Qiagen Oligo Division. Oligonucleotides used as DNA template for in vitro transcription were resuspended in TES buffer (10 mM Tris-HCl pH 8, 1 mM EDTA, 0.1 M NaCl). All other oligonucleotides were resuspended in dH2O. The DNA template for in vitro transcription was amplified by PCR from S. solfataricus genomic DNA using the appropriate set of primers. The cDNA clones corresponding to rRNA or tRNA fragments were directly transcribed from the T7 promoter present in the cloning plasmid. The sequences of the oligonucleotides used in this study are listed in Table 2.3.

2.2 Methods

2.2.1 Preparation of S. solfataricus cell extracts

2.2.1.1 Cell culturing and media

S. solfataricus cells (strain P2) were grown at 75-78°C in a 2-liter Perkin-Elmer culture vessel containing a complex medium (100 mM (NH4)2SO4, 10 mM MgCl2, 5 mM CaCl2, 20 mM KH2PO4·7H2O, 30 mM glutamic acid, 2% glucose, 1% yeast extract; adjusted to pH 3.4). Cells were harvested when the OD600 was between 0.4 and 0.6. Cells were rapidly cooled to 10°C and pelleted by centrifugation at 5,000 rpm in a Sorvall GS-3 rotor for 5 min. The cell paste was resuspended in buffer A (50 mM Tris-HCl pH 8, 10 mM MgCl2) to a concentration of approximately 0.25 g/ml and was either stored at -70°C or used to prepare cell-free extracts.

2.2.1.2 Preparation of S. solfataricus cell extracts

Aliquots of 7 ml of the concentrated cells were lysed using the freeze and thaw method.
The cells were frozen solid for 1 min in liquid nitrogen and thawed by transferring them into a lukewarm water bath for 9 min. This freeze and thaw cycle was repeated three times. Cell debris was then pelleted by centrifugation for 20 min at 9,000 rpm in a Sorvall SS-34 rotor.

2.2.1.3 Ammonium sulfate fractionation

Ammonium sulfate was added to the cell extract to a concentration of 20% w/v and the solution was gently agitated for 20 min at 4°C. More ammonium sulfate was added to the solution to achieve a final concentration of 40% w/v, followed by another 20 min incubation. After the last incubation, the extract was centrifuged at 9,000 rpm in a Sorvall SS-34 rotor for 10 min and the pellet was resuspended in 6 ml of buffer A. The ammonium sulfate fraction was then dialyzed against the same buffer overnight to remove residual ammonium sulfate and used for further purification.

2.2.1.4 Sucrose gradient fractionation

A sucrose gradient was created by layering 6 ml of buffer A containing 10, 15, 20, 25 and 30% w/v sucrose into Beckman 25x89 mm polyallomer tubes. The ammonium sulfate fraction was then layered on top of this 30 ml sucrose gradient and centrifuged in a Beckman L8-70 Ultracentrifuge using a Beckman SW27 rotor at 18,000 rpm for 16 hr at 10°C. Fractions (1.5 ml) were collected from top to bottom using an Auto Densi-Flow II (Buchler Instruments) probe connected to a Bio-Rad Model 2110 fraction collector. Aliquots of every second fraction between 2 and 20 were analyzed by Western blotting for the presence of L7Ae protein with polyclonal antibodies prepared against the recombinant protein (see section 2.2.5).

2.2.2 Extraction of S. solfataricus genomic DNA

S. solfataricus cells from 1.5 ml of culture were pelleted by centrifugation. The cells were then lysed by adding 400 µl of lysis buffer (40 mM Tris-HCl pH 8, 20 mM EDTA pH 8, and 10 mg/ml lysozyme) and incubated for 30 min at 37°C.
500 µl of 5% SDS containing 100 mg/ml of RNase A were added and the suspension was extracted twice with phenol/chloroform. The aqueous phase was transferred into a new microtube that contained 15 µl of 5 M NaCl and 1 ml of 95% ethanol. The suspension was gently mixed until the DNA formed threads. The precipitated DNA was pelleted by centrifugation, washed with 70% ethanol and dried under vacuum. The recovered genomic DNA was resuspended in 200 µl of TE buffer (10 mM Tris-HCl pH 7, 1 mM EDTA pH 8) and mixed overnight at 4°C. The concentration of the DNA preparation was determined by measuring the absorbance at 260 nm of different dilutions.

2.2.3 Extraction of S. solfataricus total RNA

For total RNA preparations, cells in mid-log phase were collected. Specifically, 10 ml of S. solfataricus culture was rapidly cooled to less than 10°C and sodium azide was added to a final concentration of 10 mM. The cells were pelleted in a Sorvall 5B Plus centrifuge at 7,000 × g for 5 min at 4°C and resuspended in 1 ml of ice-cold phosphate buffer (40 mM NH4Cl, 40 mM Na2HPO4, 20 mM KH2PO4, 50 mM NaCl, 10 mM NaN3). The suspension was transferred to a tube containing 1 ml of SDS lysis buffer (100 mM NaCl, 10 mM EDTA pH 8, 0.5% SDS) pre-equilibrated to 100°C in a boiling water bath. The RNA was extracted from the lysate by addition of an equal volume of phenol saturated with TE buffer. The aqueous phase was recovered and two more phenol extractions were performed. After the third extraction, the aqueous phase was extracted once with chloroform:isoamyl alcohol (24:1) and the RNA was precipitated by addition of one tenth volume of 3 M sodium acetate and 10 ml of 95% ethanol and incubated at -20°C overnight. The precipitated RNA was pelleted by centrifugation at 10,000 × g for 10 min at 4°C. The pellet was dried under vacuum and resuspended in 100-500 µl of TE buffer.
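The A260-based concentration estimates used for the DNA and RNA preparations can be sketched as follows. This is a minimal illustration that assumes the conventional conversion factors of 50 µg/ml per A260 unit for double-stranded DNA and 40 µg/ml per A260 unit for RNA at a 1 cm path length; these factors, the function names and the example readings are assumptions and are not given in the text.

```python
# Conventional conversion factors (µg/ml per A260 unit, 1 cm path length).
# Assumed textbook values; they are not stated in the source text.
FACTORS_UG_PER_ML = {"dsDNA": 50.0, "RNA": 40.0}


def concentration_ug_per_ml(a260, dilution_factor, kind):
    """Concentration of the undiluted stock from one A260 reading."""
    return a260 * dilution_factor * FACTORS_UG_PER_ML[kind]


def mean_concentration(readings, kind):
    """Average the estimates from several (A260, dilution factor) readings."""
    values = [concentration_ug_per_ml(a, d, kind) for a, d in readings]
    return sum(values) / len(values)


# Hypothetical readings: an RNA sample diluted 1:100, and a DNA sample
# measured at two different dilutions.
rna_stock = concentration_ug_per_ml(0.5, 100, "RNA")
dna_stock = mean_concentration([(0.2, 50), (0.1, 100)], "dsDNA")
print(rna_stock, dna_stock)
```

Averaging over several dilutions, as done for the genomic DNA, helps confirm that the readings fall in the linear range of the spectrophotometer.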
The concentration of the RNA was determined by measuring the A260 of an appropriate dilution. Aliquots of 10 µg of total RNA were transferred to Eppendorf tubes and stored at -70°C. The yield of S. solfataricus total RNA obtained by this method is between 1 and 3 µg/µl.

2.2.4 Expression and purification of recombinant proteins in E. coli

2.2.4.1 Cloning of S. solfataricus L7Ae, Nop5 and Fib genes

The gene encoding ribosomal protein L7Ae (accession number S75397, gi:7440709) was amplified by PCR using oligonucleotides AO66 and AO67, cloned between the NcoI and BamHI sites of a pET-3d vector and transformed into E. coli BL21(DE3)pLysE cells for overexpression. Nop5 (accession number AKK41215, gi:13814119) was amplified using oligonucleotides OSZ102 and OSZ103. The PCR fragments were cloned between the NcoI and EcoRI sites of pET28a and transformed into E. coli strain BL21(DE3). Fib (NP_342426, gi:15897821) was amplified by PCR with AO70 and AO71, cloned between the NcoI and BamHI sites of pET3d and transformed into E. coli strain BL21(DE3)pLysS.

2.2.4.2 Expression of the recombinant proteins

E. coli cultures transformed with plasmids encoding the S. solfataricus recombinant proteins were grown to an OD600 of 0.4-0.6 and synthesis of the recombinant proteins was induced by adding 0.5 mM IPTG (Invitrogen). Cells were grown overnight at room temperature, harvested by centrifugation (5,000 rpm for 5 min in a Sorvall GS-3 rotor), washed and resuspended in the appropriate buffer: L7Ae buffer (50 mM Bis-Tris pH 6.5, 50 mM NaCl), Nop5 buffer (50 mM Tris-HCl pH 7.5, 50 mM NaCl) or Fib buffer (50 mM Tris-HCl pH 8.5, 50 mM NaCl). Cells were disrupted by sonication and the cleared lysate was heated for 5 min at 65°C followed by centrifugation. The supernatant was recovered and loaded on a DEAE-Sepharose column for further purification by ion-exchange chromatography.
2.2.4.3 Purification of the recombinant proteins by ion exchange and size exclusion chromatography

The heated lysate fraction of L7Ae and Fib proteins was loaded on a 15 ml DEAE-Sepharose column (Pharmacia) equilibrated with the appropriate buffer. Fractions of 4 ml were collected. The recombinant proteins were recovered in the flow-through, concentrated (Centricon, Millipore) to a final volume of 500 µl and loaded on a 24 ml Superdex75 10/300 column (Amersham). Fractions of 500 µl were collected and the peak of the purified recombinant protein was detected by measuring the A280.

The Nop5 heat-soluble fraction was loaded on a 15 ml DEAE-Sepharose column (Pharmacia) equilibrated with Nop5 buffer. Fractions of 4 ml were collected. The bound Nop5 protein was eluted by a step gradient with increasing salt concentrations (NaCl concentrations: 200 mM, 400 mM and 600 mM). The majority of Nop5 protein eluted in the 400 mM NaCl fractions. The fractions containing Nop5 were pooled, concentrated to a final volume of 500 µl and further purified by size exclusion chromatography as indicated above. The fractions from the size exclusion chromatography containing the purified protein were pooled and the protein concentration was determined by a Bradford assay. The purified S. solfataricus recombinant proteins were frozen in liquid nitrogen and stored at -80°C.

2.2.4.4 Bradford assay

The concentration of the purified proteins was determined using the BioRad Microassay, which is based on the Bradford assay. The purified proteins were diluted in 800 µl of water and incubated with 200 µl of the concentrated dye reagent. The samples were incubated for 10 min and their absorbance at 595 nm was measured. A standard protein calibration curve was constructed using BSA.

2.2.5 Production of polyclonal antibodies

For production of L7Ae antibodies, the L7Ae recombinant protein was separated on a 12% SDS-gel and visualized by copper staining.
The gel slice containing the L7Ae protein was excised, lyophilized, ground to a fine powder in a mortar and mixed with 0.5 ml of PBS buffer (130 mM NaCl, 8 mM Na2HPO4 pH 7.2). The slurry was emulsified with an equal volume of complete (initial immunization) or incomplete (subsequent immunizations) Freund's adjuvant. The L7Ae emulsion was used for the immunization of a New Zealand White rabbit. During each immunization, 150 µg of the recombinant L7Ae was injected at ten-day intervals for a total of six times. After the fifth boost injection the animal was exsanguinated, the blood collected and the serum separated according to the procedure described by Harlow and Lane (1988). The sera were aliquoted in 1.5 ml fractions and stored at -20°C. For the immunoprecipitation experiments, the unpurified L7Ae anti-sera were used as the source of L7Ae antibodies.

2.2.6 Immunoprecipitation of L7Ae-containing sRNP complexes from S. solfataricus cell extract

Settled Protein A Sepharose beads (Amersham, Pharmacia) (250 µl per analytical experiment or 1,000 µl per preparative experiment) were equilibrated in RIPA buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 1% Nonidet P-40 (Particle Data Laboratories), 5 g/ml sodium deoxycholate and 0.5% SDS) and incubated with 20 or 100 µl of anti-L7Ae polyclonal antibodies for 2 hr at 25°C with gentle mixing. The beads were washed three times with RIPA buffer and, for the analytical experiment, incubated with 200 µl aliquots from every other gradient fraction (see section 2.2.1.4) for 12-14 hr at 4°C. The precipitates were washed with RIPA buffer as before and the RNAs were extracted with phenol/chloroform. For the preparative experiment, the recovered RNAs were visualized by 3'-end labeling with [32P]pCp (32P-cytidine-5'-3' bisphosphate) using T4 RNA ligase (New England BioLabs).
The standard reaction contained 1 µg RNA; 1X RNA ligase buffer (New England BioLabs); 10 µCi [32P]pCp (32P-cytidine-5',3'-bisphosphate); 10% v/v DMSO (Fisher); 12% v/v glycerol; 10 U T4 RNA ligase and 20 U RNase inhibitor. Ligation proceeded for 16-18 hr at 4°C. The labeled products were resuspended in 200 µl of water, extracted with two volumes of phenol:chloroform and centrifuged for 5 min at 12,000 rpm. The aqueous phase was transferred to a clean Eppendorf tube and the labeled products were precipitated with 3 volumes of 95% ethanol and one-tenth volume of 3 M sodium acetate. The samples were centrifuged at 4°C for 30 min. The pellet was washed with 70% ethanol and dried under vacuum. The precipitated RNAs were resuspended in 10 µl DEPC/sequencing dye (ratio 1:2) and separated on an 8% denaturing polyacrylamide gel. The RNA profile of the immunoprecipitated RNAs from each of the gradient fractions was visualized by autoradiography, and the gradient fractions containing an enriched population of small RNAs were pooled. The pooled gradient fractions were concentrated on centrifugal filters (Centricon, Millipore) and the small RNAs were co-immunoprecipitated and recovered as indicated above.

2.2.7 RT-PCR cloning

2.2.7.1 Ligation of oligonucleotide linker to immunoprecipitated RNAs

The modified DNA oligonucleotide AO30 (100 pmol) was used to anchor the purified RNAs at their 3' end. The modified AO30 is phosphorylated at the 5' end, and its 3' end is blocked with dideoxycytidine to prevent unwanted side reactions during ligation of the linker to the RNAs. The same basic procedure described in section 2.2.6 was employed to ligate this modified single-stranded DNA to the RNA molecules.

2.2.7.2 Reverse transcription of immunoprecipitated RNAs

Oligonucleotide AO31 (antisense of AO30) was annealed to the RNA-linker fragments in preparation for reverse transcription. The reaction conditions included an annealing step for 4.5 min at 60°C. A Thermoscript RT-PCR system (Invitrogen) was used to obtain the cDNAs. The reaction cocktail (1X reaction buffer; 2 mM dNTP; 15 U Thermoscript) was added to the annealing mixture and the reaction was incubated for 30 min at 55°C and terminated by increasing the temperature to 85°C for 5 min. The RNA template was hydrolyzed by the addition of 40 U RNase H and further incubation at 37°C for 20 min.

2.2.7.3 Poly A tailing of cDNA fragments

A series of dATPs were added to the 5' end of the cDNAs to facilitate cloning. The cDNAs were incubated with 17 µM dATP, 1X TdT buffer (Invitrogen) and 200 U of terminal deoxynucleotidyl transferase for 30 min at 37°C. Next, the extended cDNAs were extracted with phenol/chloroform, precipitated and dried as described in section 2.2.6. To remove excess primer, the DNA was separated on a 2% agarose gel in 0.5X TBE buffer (45 mM Tris-HCl pH 8; 45 mM boric acid; 2 mM EDTA) at a constant voltage of 100 V. The DNA fragments above the primer size (90-500 nt) were excised from the gel and recovered by electroelution as previously described by Sambrook et al. (1989).

2.2.8 Cloning of the cDNA fragments

The recovered cDNAs were amplified in a standard PCR reaction using the primers AO31 and AO32. The PCR products were ligated into the pCR2.1 vector (TOPO-TA cloning kit, Invitrogen) and transformed into One Shot® competent cells (Invitrogen) following the manufacturer's instructions.
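The relationship between the linker AO30 and the reverse-transcription primer AO31 (section 2.2.7.2) can be checked with a short script. This is only an illustrative sketch: the sequences are copied from Table 2.3, the 3' dideoxycytidine block on AO30 is left out, and the helper names `reverse_complement` and `anneals_to` are hypothetical, not part of any tool used in this work.

```python
# Sequences as listed in Table 2.3 (5' to 3'); the 3' ddC block on AO30 is omitted.
AO30 = "CTCGAGATCTGGATCCGGG"
AO31 = "CCCGGATCCAGATCTCGA"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def anneals_to(primer: str, template: str) -> bool:
    """True if the primer is perfectly complementary to a stretch of the template."""
    return primer.upper() in reverse_complement(template)


if __name__ == "__main__":
    print("Reverse complement of AO30:", reverse_complement(AO30))
    print("AO31 anneals to AO30:", anneals_to(AO31, AO30))
```

With the sequences as transcribed here, AO31 matches the reverse complement of AO30 except for the first nucleotide of the linker, which is consistent with its description as the antisense oligonucleotide.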
The transformed cells were plated on YT-agar plates (8 g bacto-tryptone, 5 g bacto-yeast extract, 2.5 g NaCl and 15 g agar in one liter of water, adjusted to pH 7) containing 50 µg/ml ampicillin and incubated at 37°C overnight. Individual colonies were picked and cultured in YT medium (8 g bacto-tryptone, 5 g bacto-yeast extract and 2.5 g NaCl in one liter of water, adjusted to pH 7) with ampicillin.

2.2.9 Screening of cDNA clones for non-coding RNAs

Small-scale alkaline lysis plasmid preparations were made from each culture and the presence of the insert was confirmed by restriction enzyme digestion with EcoRI. The clones containing inserts were sequenced using the M13 forward primer and the BigDye terminator cycle sequencing reaction kit. The cDNA sequences were analyzed using the Applied Biosystems Prism 377 DNA sequencer. The resulting sequences were scanned for the presence of small non-coding RNAs and compared by BLAST to known sequences in a non-redundant nucleotide database in an attempt to identify them (see section 2.2.15.2).

2.2.10 Toeprinting assay

Toeprinting experiments were carried out by incubating 200 pmol of unlabeled full-length 7S, 7S fragment (nucleotides 89-311) or 7S fragment (nucleotides 135-311) transcripts at 70°C for 10 min with increasing amounts of L7Ae protein (0.1, 0.2, 0.6 and 1.2 µmol). The resulting RNA-protein complexes were incubated with a radioactively labeled primer complementary to the 3' end of each of the RNA transcripts and used as templates in a reverse transcription reaction as described in section 2.2.13.2. The reverse transcription products were separated on a denaturing 6% polyacrylamide gel. A DNA sequence ladder was generated with the same primer as the one used in the reverse transcription reaction, in a sequencing reaction with the Sequenase™ sequencing kit (USB).
The experimental conditions used to map the location of the L7Ae-L7Ae mRNA interaction by toeprinting assay were as described above, except that 100 pmol of wild-type leader L7Ae mRNA and a primer complementary to positions 46-27 of the mRNA were used.

2.2.11 In vitro methylation assay

The in vitro methylation assays were performed as described by Omer et al. (2002). Equimolar amounts (6 pmol) of guide and target RNA were mixed in a 20 µl final volume in 25 mM phosphate buffer pH 7, 100 mM NaCl, denatured by incubating for 2 min at 95°C and renatured by cooling rapidly to 55°C. The RNAs were added at 0°C to 80 µl containing Fib, rNOP5 and L7Ae (6 pmol of each) and [methyl-3H] S-adenosyl methionine (300 pmol, 3.9 Ci/mmol, Amersham Pharmacia) in binding buffer A. The reaction mix was incubated at 70°C. Aliquots of 20 µl were removed after 2, 10, 15, 20 and 30 min, placed on ice and precipitated with 5% trichloroacetic acid. The precipitates were collected on 0.2 µm nitrocellulose filters (Millipore) and dried, and the radioactivity was measured by scintillation counting.

2.2.12 In vitro translation assay

A S. solfataricus translation system was prepared as described by Ruggero et al. (1993). In brief, 10 g of S. solfataricus cells were ground with 20 g of alumina and 2.5 µg/ml DNase I (RNase-free, Invitrogen) in a mortar and then resuspended in 50 ml of buffer A (20 mM Tris-HCl pH 7.4, 10 mM Mg acetate, 50 mM NH4Cl, 1 mM DTT). The extract was then centrifuged twice at 30,000 x g for 30 min at 4°C. The upper two-thirds of the supernatant was carefully removed (S30 fraction). One-third of the recovered S30 fraction was aliquoted and stored at -70°C. The remainder of the crude extract was used to prepare the total tRNA fraction. The S30 extract was centrifuged at 45,000 rpm for 2 hr at 4°C in a Beckman Ti50 rotor.
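Because the extract preparation above quotes one spin in units of g (30,000 x g) and the next in rpm (45,000 rpm), it can help to convert between the two. The sketch below uses the standard relation RCF = 1.118 x 10^-5 x r x rpm^2, with r in centimetres. The radius of 8.0 cm in the usage lines is an assumed, illustrative value only; the actual radius of the rotor used is not stated in the text and should be taken from the manufacturer's specification.

```python
def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given rotor speed and radius (cm)."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius (cm)."""
    return (rcf / (1.118e-5 * radius_cm)) ** 0.5


if __name__ == "__main__":
    assumed_radius_cm = 8.0  # hypothetical radius, for illustration only
    print(f"45,000 rpm at {assumed_radius_cm} cm: "
          f"{rcf_from_rpm(45000, assumed_radius_cm):,.0f} x g")
    print(f"30,000 x g at {assumed_radius_cm} cm: "
          f"{rpm_from_rcf(30000, assumed_radius_cm):,.0f} rpm")
```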
The recovered supernatant was diluted with an equal volume of 2X SSC (3 M NaCl, 0.3 M Na citrate, pH 7), extracted twice with phenol:chloroform and precipitated by adding 3 volumes of 95% ethanol. The pellet was dried, resuspended in 10 mM glycine pH 9 and incubated for 2 hr at 37°C. After incubation the total tRNA fraction was aliquoted and stored at -70°C.

In vitro translation was conducted by incubating the mRNAs (10 pmol) for 40 min at 73°C in a 25 µl reaction containing 10 mM KCl, 20 mM Tris-HCl pH 7, 18 mM Mg acetate, 7 mM mercaptoethanol, 3 mM ATP, 1 mM GTP, 5 µg of bulk tRNA, 20 µM [35S]-methionine (Amersham) and 5 µl of S. solfataricus S30 extract. To visualize the translation products, 15 µl of the reaction mixture were precipitated with 60 µl of acetone, incubated on ice for 15 min and centrifuged at 12,000 rpm for 10 min. The pellet was dried, resuspended in 20 µl of 1X SDS-loading buffer, electrophoresed on a 15% SDS-polyacrylamide gel and visualized by autoradiography as indicated in section 2.2.14.

The amount of incorporated [35S]-methionine was quantified by trichloroacetic acid (TCA) precipitation. For this, 10 µl of the translation reaction was added to 250 µl of 1 M NaOH and incubated at 37°C for 10 min. Treatment with NaOH deacylates charged tRNA and ensures that all the precipitated counts come from protein-incorporated radiolabel. One ml of an ice-cold 25% TCA solution containing 2% (w/v) casamino acids was added to the NaOH-treated sample. The mixture was vortexed and incubated on ice for 5 min. The precipitated protein was collected on nitrocellulose filters (0.20 µm, Sartorius) by vacuum and rinsed three times with 5 ml of 5% TCA. The filters were dried, immersed in 10 ml of scintillation fluid and counted in a Beckman L6000IC scintillation counter.
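Scintillation counts such as those from the TCA precipitation above, or from the methylation assay in section 2.2.11, are usually converted to molar amounts through the specific activity of the label. The following is a minimal sketch of that arithmetic, not the calculation actually used in this work. It assumes 1 Ci = 2.22 x 10^12 dpm, takes the counting efficiency as a user-supplied parameter (it is not given in the text) and ignores radioactive decay of the stock. The function name `pmol_from_cpm` is hypothetical.

```python
DPM_PER_CURIE = 2.22e12  # disintegrations per minute in one curie
PMOL_PER_MMOL = 1e9


def pmol_from_cpm(cpm: float,
                  specific_activity_ci_per_mmol: float,
                  counting_efficiency: float = 1.0) -> float:
    """Convert measured counts per minute to pmol of labeled compound.

    counting_efficiency is the fraction of disintegrations the counter
    detects (an assumed value; it must be determined for each instrument).
    """
    if not 0 < counting_efficiency <= 1:
        raise ValueError("counting_efficiency must be in (0, 1]")
    dpm = cpm / counting_efficiency
    dpm_per_pmol = specific_activity_ci_per_mmol * DPM_PER_CURIE / PMOL_PER_MMOL
    return dpm / dpm_per_pmol


if __name__ == "__main__":
    # 3.9 Ci/mmol is the specific activity quoted for [methyl-3H] SAM in 2.2.11;
    # the cpm value and the 50% efficiency are made-up illustration inputs.
    print(pmol_from_cpm(cpm=4000, specific_activity_ci_per_mmol=3.9,
                        counting_efficiency=0.5))
```

Keeping the efficiency as an explicit argument makes it harder to forget that raw cpm understate the true disintegration rate.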
To determine the stability of the mRNA transcripts in the translation reaction, 15 µl aliquots were withdrawn from the translation mixture at 0, 5, 10, 20 and 40 min. The samples were run on a denaturing 8% polyacrylamide gel, blotted onto a Hybond-N membrane and hybridized with the 32P-labeled aL7Toe oligonucleotide. The radioactive bands were visualized by autoradiography.

2.2.13 Standard molecular biology techniques

2.2.13.1 Polymerase chain reaction

Polymerase chain reaction was routinely used to amplify genes from S. solfataricus genomic DNA and to engineer different small RNA constructs. Reactions were performed in an Eppendorf Mastercycler® gradient thermocycler. The parameters of the reaction included a denaturation step at 95°C for 30 sec, an annealing step for 40 sec at a temperature that depended on the sequence of the primer, and an extension step at 72°C for 1 min. Taq DNA polymerase (Invitrogen) was used routinely; however, when a high yield was required, KOD HiFi DNA polymerase (Novagen) was used instead. The theoretical melting temperature (Tm, in °C) of the primers was calculated with the following equation (Sambrook et al., 1989):

Tm = 2 x [number of A+T] + 4 x [number of C+G]

2.2.13.2 Primer extensions

Primer extension assays were used to verify the expression and lengths of the predicted sRNAs in vivo. Oligonucleotide primers (Qiagen-Operon) were designed to anneal within the D box guide region of the sRNA and to extend into the C' box motif. For those clones that did not have typical C/D box sequence elements, the primers were designed to anneal within the 3' end of the sRNA. The primers were 5'-end labeled using T4 polynucleotide kinase (Invitrogen) and [γ-32P]ATP. The labeled primers were extended in vitro with MMLV reverse transcriptase using S. solfataricus total RNA as template.
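The melting-temperature rule quoted in section 2.2.13.1 (2°C per A or T, 4°C per G or C) is simple enough to sketch in a few lines. The function name `wallace_tm` is a hypothetical label chosen here; the example primer is aL7Toe as listed in Table 2.3. The rule is a rough estimate intended for short oligonucleotides and says nothing about salt or primer concentration.

```python
def wallace_tm(primer: str) -> int:
    """Estimate primer Tm (degrees C) as 2*(A+T) + 4*(G+C)."""
    seq = primer.upper().replace(" ", "")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    al7toe = "CTAGGTCTTGTGGTACTTCA"  # aL7Toe, Table 2.3
    print("aL7Toe estimated Tm:", wallace_tm(al7toe), "degrees C")
```

For primers that carry a T7 promoter extension, it is presumably the gene-specific portion alone that should be passed to the function, since only that part anneals in the first PCR cycles.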
Approximately 0.2-0.5 pmol of the labeled primers were incubated with 5-10 µg of S. solfataricus total RNA in 1X MMLV buffer without Mg (50 mM Tris pH 8.3; 75 mM KCl; 5 mM DTT) for 4.5 min at the melting temperature of the primer. An extension cocktail containing 1X MMLV buffer, 8 mM MgCl2, 2 mM dNTPs and 10 U/ml MMLV (Invitrogen) was added to the annealing mixture. The reactions were then incubated for 30 min at 42°C. The extension products were precipitated, dried, redissolved in 20 µl of gel loading buffer and visualized by autoradiography after electrophoresis on an 8% denaturing polyacrylamide gel. A sequence ladder, generated by sequencing the cDNA clone with the same primer used for the primer extension, was run alongside the extension products.

2.2.13.3 Northern blot analysis

For Northern blot analysis, 5 µg and 10 µg of S. solfataricus total RNA were separated on a 6% polyacrylamide-8 M urea gel. A labeled RNA marker was run alongside the RNA samples. The RNA was transferred to a Hybond-N membrane in 0.5X TBE buffer for 1 hr at 160 V. The blot was dried at 80°C, UV-irradiated for 2 min on a UV lamp table and equilibrated in 6X SSC for 2 min. Pre-hybridization was carried out at 37°C for 1-2 hr in 6X SSC, 10 mM EDTA, 5X Denhardt's solution (1% Ficoll, 1% polyvinylpyrrolidone and 1% bovine serum albumin), 0.5% SDS and 200 mg/ml salmon sperm DNA (Invitrogen). Next, 20 pmol of labeled probe were added to the pre-hybridization solution and incubation proceeded at 37°C overnight. After hybridization, the membrane was washed for 5 min at room temperature in 2X SSC, 0.5% SDS, followed by a brief rinse in 0.1X SSC at room temperature. The blot was visualized by autoradiography.
2.2.13.4 Gel retardation assays

For gel retardation assays, labeled RNA transcripts (0.3 pmol) were incubated in binding buffer A (25 mM phosphate buffer pH 7; 100 mM NaCl; 1 mM MgCl2) in the absence of protein or with increasing amounts of recombinant L7Ae protein (as indicated in the figure legends). The reactions were incubated at 70°C for 10 min and 1 µl of loading buffer (1% bromophenol blue, 10% glycerol) was added to the reaction (Omer et al., 2002). Protein-RNA complexes were resolved on non-denaturing 4% or 8% polyacrylamide gels containing 0.5X TBE buffer and visualized by autoradiography (see section 2.2.14.4).

2.2.13.5 Western blot

Proteins were separated on a 12% SDS gel (see section 2.2.14.3) and electroblotted onto an Immobilon P membrane (Millipore) by wet transfer using a Trans-Blot electrophoretic transfer cell (BioRad). The transfer was carried out at 160 V for 1 hr. The membrane was then rinsed with deionized water and incubated for 1 hr in blocking buffer (5% skim milk in PBS-T buffer: 1.4 mM KH2PO4, 8 mM Na2HPO4, 140 mM NaCl, 2.7 mM KCl and 0.1% Tween 20, adjusted to pH 7.3). The membrane was washed in PBS-T buffer for 15 min, followed by two 5 min washes. Next, the membrane was incubated with PBS-T buffer containing the primary antibody. The optimum dilution for L7Ae antiserum was 1/25,000. Nop5 antiserum was diluted 1/25,000 and Fib antiserum was diluted 1/8,000. After 1 hr of incubation with the primary antibody, the membrane was washed as before and incubated with a 1/35,000 dilution of the secondary antibody (donkey anti-rabbit IgG coupled to horseradish peroxidase, Amersham). The blot was washed and subjected to detection using the Amersham ECL detection kit.

2.2.14 Gel electrophoresis

2.2.14.1 Native gel

Agarose gels (1%-2%) were used for resolving DNA fragments longer than 100 base pairs.
Electrophoresis was performed in a Pharmacia GNA-200 gel electrophoresis apparatus in 0.5X TBE buffer (45 mM Tris-HCl pH 8; 45 mM boric acid; 2 mM EDTA) containing 10 µg/ml of ethidium bromide for visualization of the nucleic acids by UV light. Electrophoresis was performed at 100-200 V using a Pharmacia 500/400 power supply. Non-denaturing polyacrylamide gels (4%-10%) were used to resolve small fragments and for gel retardation experiments. Electrophoresis was performed in 0.5X TBE buffer using a BioRad Mini Protean II system. To visualize unlabeled nucleic acids, the gel was stained for 10 min in 0.5X TBE containing 10 µg/ml of ethidium bromide.

2.2.14.2 Denaturing urea gels

Polyacrylamide-TBE gels (6%-10%) containing 8 M urea were used to visualize the primer extension and sequencing products. Samples were resuspended in sequencing gel loading buffer (98% deionized formamide, 10 mM EDTA, 0.025% bromophenol blue and 0.025% xylene cyanol FF) and boiled for 2 min before loading on the gel. Electrophoresis was performed in 0.5X TBE buffer at 32 W constant power (1800-2000 V) using a Pharmacia ECPS 3000/150 power pack. After electrophoresis, gels were transferred to Whatman filter paper and dried for 30 min at 80°C on a BioRad gel dryer Model 583.

2.2.14.3 Denaturing SDS gels

Protein samples were separated in SDS-polyacrylamide gels. Gels were prepared according to the method of Laemmli using a 5% acrylamide stacking gel and an 8%-16% separating gel (Laemmli, 1970). Samples were resuspended in SDS-loading buffer (100 mM Tris pH 6.8, 200 mM DTT, 4% SDS, 0.2% bromophenol blue and 20% glycerol) and boiled for 2 min before loading. Electrophoresis was performed in 1X protein buffer (25 mM Tris pH 8.3, 250 mM glycine and 0.1% SDS) using a BioRad Mini Protean II system.
Unlabeled proteins were visualized by staining the gel with a Coomassie Brilliant Blue solution (25% w/v of Coomassie Brilliant Blue R250, 45% methanol and 10% acetic acid) for 30 min. Next, the gel was soaked in destaining solution (40% methanol and 10% acetic acid) for 30 min.

2.2.14.4 Visualization of radioactive gels

After electrophoresis, gels containing radioactive samples were dried using a BioRad gel dryer Model 583, exposed on a storage phosphor screen (Pharmacia) and visualized using the Typhoon 8600 phosphorimager (Molecular Dynamics) and ImageQuant image analysis software.

2.2.15 Bioinformatics tools

2.2.15.1 RNA secondary structure prediction

RNA secondary structure predictions were done using the program mfold (Zuker, 2003). This program predicts optimal and suboptimal secondary structures for an RNA or DNA molecule using the most recent energy minimization method designed by Zuker. The program can be accessed at: http://www.bioinfo.rpi.edu/applications/mfold/old/rna/form1.cgi.

2.2.15.2 BLAST search

BLAST (Basic Local Alignment Search Tool) was used to search for DNA sequence similarities in all available databases. BLAST uses a heuristic algorithm that seeks local as opposed to global alignments and is therefore able to detect relationships among sequences that share only isolated regions of similarity (Altschul et al., 1990). BLAST can be accessed at the NCBI (National Center for Biotechnology Information) web site at: http://www.ncbi.nlm.nih.gov/BLAST/.

2.3 Chapter 2 Tables

Table 2.1. Strain genotype
Strain | Genotype | Reference
JM109 | e14-(McrA-) recA1 endA1 gyrA96 thi-1 hsdR17(rK- mK+) supE44 relA1 Δ(lac-proAB) [F' traD36 proAB lacIqZΔM15] | (Yanisch-Perron et al., 1985)
NovaBlue | endA1 hsdR17(rK12- mK12+) supE44 thi-1 recA1 gyrA96 relA1 lac [F' proA+B+ lacIqZΔM15::Tn10 (TcR)] | Novagen
One Shot® | F- φ80dlacZΔM15 Δ(lacZYA-argF)U169 deoR recA1 endA1 hsdR17(rK- mK+) phoA supE44 λ- thi-1 gyrA96 relA1 | Invitrogen
BL21(DE3)pLysS | F- ompT hsdSB(rB- mB-) gal dcm (DE3) pLysS (CmR) | (Studier, 1991; Studier and Moffatt, 1986)
BL21(DE3)pLysE | F- ompT hsdSB(rB- mB-) gal dcm (DE3) pLysE (CmR) | (Studier, 1991; Studier and Moffatt, 1986)

Table 2.2. E. coli plasmid vectors
Vector | Resistance | Reference
TOPO | Ampicillin/Kanamycin | Invitrogen, Carlsbad, CA, USA
pET3d (expression vector) | Ampicillin | Novagen, Madison, WI, USA
pET14b (expression vector) | Ampicillin | Novagen, Madison, WI, USA
pET28a (expression vector) | Kanamycin | Novagen, Madison, WI, USA

Table 2.3. Oligonucleotides used in this study*.
Name | Sequence | Comments
aL7Toe | 5' CTAGGTCTTGTGGTACTTCA 3' | Complementary to nucleotides 46-26 of L7Ae.
AO30 | 5' CTCGAGATCTGGATCCGGGddC 3' | Primer linked to the 5' end of the immunoprecipitated sRNAs.
AO31 | 5' CCCGGATCCAGATCTCGA 3' | Complementary to AO30.
AO32 | 5' GCGAATCTTGCAG(T)30 3' | Primer used along with AO31 to amplify the cDNA fragments.
AO63.1 | 5' GTAATACGACTCACTATAGGGATAAGCCATGGGAG 3' | Forward primer to amplify the target fragment used in in vitro methylation assays with S. acidocaldarius sR1 sRNA.
AO65 | 5' TATTTAGGTGACACTATAGGTTAGCCACGTGTTACTCAGCC 3' | Reverse primer to amplify the target fragment used in in vitro methylation assays with S. acidocaldarius sR1 sRNA.
AO66 | 5' AGAATTCCCATGGACGCGATGTCAAAAGCTAG 3' | Forward primer to amplify the L7Ae gene. Contains an NcoI site for cloning in pET3d.
AO67 | 5' TTAGGATCCTTAACTTGAAGTTTTACCTTTAATC 3' | Reverse primer to amplify the L7Ae gene. Contains a BamHI site for cloning in pET3d.
AO70 | 5' AAAGATCTCCATGGCTGAAGTAATTACCGTAAAAC 3' | Forward primer to amplify the Sso aFib gene. Contains an NcoI site for cloning into pET3d.
AO71 | 5' TTAGGATCCCTACCCTTTATATTTGCTAAGAAC 3' | Reverse primer to amplify the Sso aFib gene. Contains a BamHI site for cloning in pET3d.
AO98 | 5' GTAATACGACTCACTATAGACAGATGAGCTTAACTCCCATGGTCTGATAGTTGATG 3' | Forward primer to amplify the CP1A sR1 mutant.
AO99 | 5' AATCGCTTTTTTAACTTCTCATCAACTATCAG 3' | Reverse primer to amplify the CP1A sR1 mutant.
AO104 | 5' TCGATGTAGCAAACCGCGGGG 3' | Primer to verify the Ψ2598. Complementary to nucleotides 2643-2623 of 23S rRNA.
AO107 | 5' GTAATACGACTCACTATAGTAAAAAAGCGATGGATGAGCTTAACTCCCATCGTCT 3' | Forward primer to amplify the CP2 sR1 mutant.
AO108 | 5' CTTCTCATCAACTATCAGACGATGGGAGTTAAGCTC 3' | Reverse primer to amplify the CP2 sR1 mutant.
AO126 | 5' GCTTGACGAGACCCTCCAAA 3' | Forward primer to amplify nucleotides 2623-2679 of S. solfataricus 23S to generate a DNA sequence ladder. Use with AO104.
AO129 | 5' TAATACGACTCACTATAGGGGATCTGGCGACACCCTTGGCCAACCCTAAATATTTAGGGATTC 3' | Forward primer to amplify sR109 with the mutated K-turn motif.
AO130 | 5' GCTGTGGGGATCTCTGGAGACCCTAATTCAGGGAATCCCTAAATATTTAGGGT 3' | Reverse primer to amplify sR109 with the mutated K-turn motif.
AO138 | 5' TAATACGACTCACTATAGGCTATCTGTAGGTCCCAGTGGA 3' | Primer to amplify large fragment of 7S RNA (nucleotides 89-311). Complementary to nucleotides 89-119.
AO141 | 5' TAATACGACTCACTATAGATGGGGCCAGACCCCCTACC 3' | Forward primer to amplify the antisense 7S RNA.
AO142 | 5' GGTCAGGGAGGGTGGGGGA 3' | Reverse primer to amplify the antisense 7S RNA.
AO152 | 5' TAATACGACTCACTATAGGACTGCGTCCCGAAACTAGAGGAGGATCCACGCGATGTCAAAAGCTAGTTATGTTAAG 3' | Forward primer to amplify the L7Ae D-5' UTR mutant.
AO170-2 | 5' TCACCCCAACAAGACAGTAA 3' | Complementary to nucleotides 962-943 of SSO1026 (use with 5-36B primer).
T7-GG | 5' TAATACGACTCACTATAGG 3' | T7 promoter sequence.
412-6A | 5' AATTATCAGCTTTTCCAGCGA 3' | Reverse primer to amplify sR101.
412-6B | 5' TAATACGACTCACTATAGGGATGATGAGAGGGTCCAA 3' | Forward primer to amplify sR101.
412-7A | 5' TATTATCAGTGAAGGACCACT 3' | Reverse primer to amplify sR102.
412-7B | 5' TAATACGACTCACTATAGGGGATGAGGATTACGGGAG 3' | Forward primer to amplify sR102.
4-18A | 5' CCAAAGTCGCGGACGTTCA 3' | Reverse primer to amplify sR103.
4-18B | 5' TAATACGACTCACTATAGGCTCAAGATGTGCGGTTTT 3' | Forward primer to amplify sR103.
5-109A | 5' CTGGGGTAACTGAGTTTCC 3' | Reverse primer to amplify sR104.
5-109B | 5' TAATACGACTCACTATAGGTGTGAATGATGAAAGTCA 3' | Forward primer to amplify sR104.
6-7A | 5' ACATGCGCGAACGCTGATT 3' | Reverse primer to amplify sR105.
6-7B | 5' TAATACGACTCACTATAGGGAGGATGACGAATCCGGG 3' | Forward primer to amplify sR105.
5-218A | 5' GATCAGTCCTCACACGCG 3' | Reverse primer to amplify sR106.
5-218B | 5' TAATACGACTCACTATAGGTTAGATGATGTGTGAACCCC 3' | Forward primer to amplify sR106.
5-59A | 5' GCTGGGGGTCGGACTTTC 3' | Reverse primer to amplify sR107.
5-59B | 5' TAATACGACTCACTATAGGAAGGGGTATGAGGACGAGG 3' | Forward primer to amplify sR107.
6-1A | 5' CACCCCTCTTGAGGGCTC 3' | Reverse primer to amplify sR108.
6-1B | 5' TAATACGACTCACTATAGGCCCCTCGACTAGCCCCAA 3' | Forward primer to amplify sR108.
4-5A | 5' GCTGTGGGGATCTCTGGA 3' | Reverse primer to identify sR109.
4-5B | 5' TAATACGACTCACTATAGGGGGGATCTGGCGACACCCTTGGA 3' | Forward primer to identify sR109.
5-36A | 5' CTGAGCTTCATTTGGCGC 3' | Reverse primer to amplify sR110.
5-36B | 5' TAATACGACTCACTATAGGGGGCTGATGACGCC 3' | Forward primer to amplify sR110.
4-20A | 5' GCGACTGGGGCCTCCATG 3' | Reverse primer to amplify sR111.
4-20B | 5' TAATACGACTCACTATAGGCAATTCGGACCGGAAGTTG 3' | Forward primer to amplify sR111.
5-91A | 5' CGGCGGCAGGAAGTCCC 3' | Reverse primer to amplify sR112.
5-91B | 5' TAATACGACTCACTATAGGATGGGGTATCTAGTCCCTTG 3' | Forward primer to amplify sR112.
5-80A | 5' TGGGCCACCCTCCTCTGG 3' | Reverse primer to amplify sR113.
5-80B | 5' TAATACGACTCACTATAGGCACCCTCCTCTGGTCA 3' | Forward primer to amplify sR113.
5-105B | 5' TAATACGACTCACTATAGGATTCTCTCGACTGCCCCTC 3' | Forward primer to amplify sR114.
412-28A | 5' TCATGTGCCCCGTTCGGC 3' | Reverse primer to amplify sR115.
412-28B | 5' TAATACGACTCACTATAGGATAAAACGAGTGAAAGGCTCA 3' | Forward primer to amplify sR115.
5-157A | 5' TCGGCGAACCATGTAGCC 3' | Reverse primer to amplify sR116.
5-157B | 5' TAATACGACTCACTATAGGTTTGTAACCCTTAAGACCTCG 3' | Forward primer to amplify sR116.
4-1A | 5' CCAAGGGCTGAGGATTGC 3' | Reverse primer to amplify sR117.
4-1B | 5' TAATACGACTCACTATAGGCTCAAGATGTGCGGTTTT 3' | Forward primer to amplify sR117.
5-18A | 5' AGAAGATTTGCGCCCCGTTT 3' | Reverse primer to amplify sR118.
5-18B | 5' TAATACGACTCACTATAGGGGGGGACGACACCCACGAC 3' | Forward primer to amplify sR118.
412-49A | 5' GTCTGCCTACCTTAAGGTG 3' | Reverse primer to amplify sR119.
412-49B | 5' TAATACGACTCACTATAGGTTAGGGTTTCGCTTCCGTG 3' | Forward primer to amplify sR119.
5-72A | 5' AGGAGGGAGCTTCCTGCT 3' | Reverse primer to amplify sR120.
5-72B | 5' TAATACGACTCACTATAGGAGTGCCTGTGGAGCTCTG 3' | Forward primer to amplify sR120.
5-132A | 5' GTCTGGAGCAGAGGCGTA 3' | Reverse primer to amplify sR121.
5-132B | 5' TAATACGACTCACTATAGGCCGAGGCACCTCGGTTAA 3' | Forward primer to amplify sR121.
5-106A | 5' AGGAGGGAGCTTCCTGCT 3' | Reverse primer to amplify sR122.
5-106B | 5' TAATACGACTCACTATAGGCTTAGGGCCTGTGGAGCT 3' | Forward primer to amplify sR122.
412-16A | 5' CATTTCCCCGCTCCCCGC 3' | Reverse primer to amplify sR123.
412-16B | 5' TAATACGACTCACTATAGGGGCCGGCGTGATTTTAC 3' | Forward primer to amplify sR123.
4-9A | 5' GTAATACGACCCTCAGCTGA 3' | Reverse primer to amplify sR124.
4-9B | 5' TAATACGACTCACTATAGGCACAGATGCGATTCTCCTC 3' | Forward primer to amplify sR124.
5-32A | 5' GAGGAGGCCTCAACTTCAC 3' | Reverse primer to amplify sR125.
5-32B | 5' TAATACGACTCACTATAGGAGCCATACTCAGCAGACAC 3' | Forward primer to amplify sR125.
5-92A | 5' GCGAGGGACTCCATGACTC 3' | Reverse primer to amplify sR126.
5-92B | 5' TAATACGACTCACTATAGGGAGGGACTTCAGCTCCGTT 3' | Forward primer to amplify sR126.
5-17A | 5' GCAGGGCGAACCTGAATGT 3' | Reverse primer to amplify sR127.
5-17B | 5' TAATACGACTCACTATAGGCGGACCTGGCGAAGTCTC 3' | Forward primer to amplify sR127.
5-42A | 5' GGGGGAACTCAAGTCCAG 3' | Reverse primer to amplify sR128.
5-42B | 5' TAATACGACTCACTATAGGGGGAACTCGCTAACGGTA 3' | Forward primer to amplify sR128.
5-70A | 5' GTCAAGCTTCTCTAACTTTCT 3' | Reverse primer to amplify sR129.
5-70B | 5' TAATACGACTCACTATAGGCTCACGTGGTGAAAACGAC 3' | Forward primer to amplify sR129.
5-87A | 5' GATCTTGGAAAAAACACCCAT 3' | Reverse primer to amplify sR130.
5-87B | 5' TAATACGACTCACTATAGGAGAATCCCATGAAGGTTGAG 3' | Forward primer to amplify sR130.
412-51A | 5' TCGGGGAGCGGGGAAATG 3' | Reverse primer to amplify sR131.
412-51B | 5' TAATACGACTCACTATAGGTCAGTTGCAGGGTAATTC 3' | Forward primer to amplify sR131.
6-18A | 5' GGACGGGGAAAGGGGAAT 3' | Reverse primer to amplify sR132.
5-165B | 5' TAATACGACTCACTATAGGTAGGGGAAAGGGGTCAGC 3' | Forward primer to amplify sR132.
5-29A | 5' TGGGGCCAGACCCCCTAC 3' | Primer to amplify 7S RNA. Complementary to nucleotides 311-292 of 7S RNA.
5-29B | 5' TAATACGACTCACTATAGGGCCCGGACACCAGCGTTCG 3' | Primer to amplify the full-length 7S RNA. Complementary to nucleotides 1-20 of 7S RNA (use with 5-29A).
5-144B | 5' TAATACGACTCACTATAGGCGGTCATGGGCTTTCT 3' | Primer to amplify short 7S RNA fragment (nucleotides 134-311). Complementary to nucleotides 134-149 of 7S RNA.
Kt-15 5' | 5' TAATACGACTCACTATAGGCCCAAACCAAACCCGCC 3' | Forward primer to amplify the L7Ae binding site on 23S rRNA. Contains nucleotides 261-279 of 23S rRNA.
Kt-15 3' | 5' CACGACATTCCCACCGACT 3' | Reverse primer to amplify the L7Ae binding site on 23S rRNA. Complementary to nucleotides 303-287.
OHE61 | 5' GTAATACGACTCACTATAGCAGTTGATGAGAAGTTAAAAAACCCCCCCCCCCGCT 3' | Forward primer to amplify the C'/D' poly C sR1 mutant.
OHE62 | 5' GTTATCAGACCATGGGAGTTAAGCGGGGGGGGGGGTTT 3' | Reverse primer to amplify the C'/D' poly C sR1 mutant.
OHE73 | 5' GTAATACGACTCACTATAGCAGTCCCCCCGAAGTTAAAAAAGCTT 3' | Forward primer to amplify the C/D poly C sR1 mutant.
OHE74 | 5' GTTACCCCACCATTCATCCATCGCTTTTTTAACTTC 3' | Reverse primer to amplify the C/D poly C sR1 mutant.
OHE75 | 5' GTAATACGACTCACTATAGCAGTTGATGAGAAGTTAAAAAACCCCTGGATGAGCT 3' | Forward primer to amplify the D' poly C sR1 mutant.
OHE76 | 5' GTTATCAGACCATGGGAGTTAAGCTCATCCAGGGGTTT 3' | Reverse primer to amplify the D' poly C sR1 mutant.
OHE77 | 5' GTAATACGACTCACTATAGCAGTTGATGAGAAGTTAAAAAAGCGACCCCCCCGCT 3' | Forward primer to amplify the C poly C sR1 mutant.
OHE78 | 5' GTTATCAGACCATGGGAGTTAAGCGGGGGGGTCGCTTT 3' | Reverse primer to amplify the C poly C sR1 mutant.
OHE79 | 5' GTAATACGACTCACTATAGAGTTGATGAGAAGTTAAAAAAGCGATGGAAGAGCT 3' | Forward primer to amplify the U31A poly C sR1 mutant.
OHE80 | 5' GTTATCAGACCATGGGAGTTAAGCTCTTCCATCGCTTT 3' | Reverse primer to amplify the U31A poly C sR1 mutant.
OHE100 | 5' GTAATACGACTCACTATAGCAGTTGATGAGAAGTTAAAAAAGCGATGGATCAGCT 3' | Forward primer to amplify the G32C poly C sR1 mutant.
OHE101 | 5' GTTATCAGACCATGGGAGTTAAGCTGATCCATCGCTTT 3' | Reverse primer to amplify the G32C poly C sR1 mutant.
OHE108 | 5' GTAATACGACTCACTATAGCCGTTTGACGAGAAGTTAAAAAAGCGATGGATGAGCT 3' | Forward primer to amplify the C/D K-turn 38 sR1 mutant.
OHE109 | 5' GTCCTCGGACCATGGGAGTTAAGCTCATCCATCGCTTTTTT 3' | Reverse primer to amplify the C/D K-turn 38 sR1 mutant.
OHE110 | 5' GTAATACGACTCACTATAGGAGTTGAUGAGAAGTTAAAAAACCGAGCGTTTGACG 3' | Forward primer to amplify the C'/D' K-turn 38 sR1 mutant.
OHE111 | 5' GTTATCAGACCATGGGTGTTAACGTCAAACGCTCGGTTTTTTA 3' | Reverse primer to amplify the C'/D' K-turn 38 sR1 mutant.
OHE130 | 5' GTAATACGACTCACTATAGGAATGTGGTGCAGTTAAAAAAGCGATGGATGAGCT 3' | Forward primer to amplify the C/D K-turn 15 sR1 mutant.
OHE131 | 5' GGGTTTGGACCATGGGAGTTAAGCTCATCCATCGCTTT 3' | Reverse primer to amplify the C/D K-turn 15 sR1 mutant.
OSZ102 | 5' CCGATATCCATGGTGAAAATATACCTAATTGA 3' | Complementary to the 5' end of the S. solfataricus Nop5 gene. Contains an NcoI site for cloning into pET28.
OSZ103 | 5' CCGAATTCTCACTTTCTTTTACCTCTTCTCT 3' | Reverse primer to amplify the Nop5 gene. Contains an EcoRI site for cloning into pET28.
* The T7 promoter sequence is underlined.

3. Results

3.1 Construction of L7Ae-associated cDNA library

3.1.1 Immunoprecipitation of L7Ae-containing complexes

In Archaea, the single multifunctional L7Ae protein is a component of the ribosome as well as a component of C/D box and H/ACA box RNP complexes. In all three of these distinct RNPs, L7Ae recognizes and binds to a K-turn motif (Klein et al., 2001; Kuhn et al., 2002; Rozhdestvensky et al., 2003). Experimental evidence indicates that K-turn motifs are not restricted to rRNA and modification-guide RNAs, but are common to many other types of RNA as well. This observation suggested that the L7Ae protein could interact with a diverse set of RNAs that contain the K-turn motif. In an attempt to expand the spectrum of verified ncRNAs that associate with the L7Ae protein, and to characterize the K-turn motif and the RNAs containing this motif in the third domain of life, a single cDNA library of L7Ae-associated sRNAs was constructed. For this, S. solfataricus cell-free extracts were fractionated on a 10-30% sucrose gradient, and gradient fractions were subjected to anti-L7Ae immuno-affinity chromatography as described in section 2.2.6. Western blot analyses of the distribution of L7Ae in the gradient showed the presence of two L7Ae peaks, one between fractions 6-8 and the other between fractions 12-18 (Figure 3.1A). The distribution pattern of the RNAs recovered by immuno-affinity chromatography corresponded well with the distribution of L7Ae in the gradient. For the recovered RNA, two peaks were observed, one between fractions 4-10 and the other between fractions 12-18 (Figure 3.1B). A similar RNA distribution pattern was observed previously using antibodies against Sac Fib and Nop5 (Omer et al., 2000). The immuno-purified complexes were highly enriched for small RNAs (sRNAs), which are generally 60 to 100 nucleotides in length and sediment as a broad peak through the entire gradient (Figure 3.1B).
These RNAs were detected neither in the aliquot immunoprecipitated with pre-immune serum nor in the aliquot containing unfractionated total RNA (data not shown). In the lower portion of the gradient, a substantial amount of RNA of around 120 nt was observed in the immuno-purified material. Because this corresponds to the size of the 5S rRNA, it was suspected that the anti-L7Ae antibodies were able to interact to some extent with this rRNA molecule. However, the 5S rRNA sequence was not recovered in the library, indicating that neither L7Ae nor the anti-L7Ae antibodies interact with this rRNA and that the 120 nt band might correspond to other RNA species. The distribution of L7Ae protein in the sucrose gradient was analyzed by western blotting with anti-L7Ae antibodies. This showed that L7Ae protein is present throughout the entire gradient, with a higher concentration in the bottom fractions (Figure 3.1A), indicating that the larger fraction of the cellular L7Ae protein associates with the 50S ribosomal subunit, but that a smaller fraction is not ribosome-associated and is instead present in RNP complexes.

3.1.2 Construction and analysis of the cDNA library

Gradient fractions 2-12 and 13-20 were pooled and the coprecipitated RNAs were extracted (the RNAs recovered from a single immunoprecipitation experiment were used to construct the cDNA library). The corresponding cDNAs were generated by RT-PCR, cloned into a TOPO vector and transformed into E. coli, as described in section 2.2.7. A total of 200 clones were obtained; of these, only 128 contained an insert. Sequence analysis of the insert-containing clones revealed 45 distinct sequences (Table 3.1 and Table 3.2). The sequence and structural characteristics of these cDNAs were much more diverse and heterogeneous than expected.
For convenience, the recovered cDNAs were divided into six groups based on sequence, structure, genomic location and possible function (Table 3.2). Thirteen of the clones proved to be fragments of either 7S SRP RNA or rRNA and tRNA and were categorized as groups five and six, respectively (see below).

3.1.3 Characterization of the sequences of the library

Based on the presence of common sequence and structural motifs and/or genomic location, the remaining 32 cDNA sequences were further divided into four groups. The cDNA sequences were characterized (i) for expression and length using northern hybridization and primer extension analysis, (ii) for the ability of the corresponding RNA to bind to the L7Ae protein in band-shift experiments, (iii) for phylogenetic conservation of the sequences in related archaeal genomes and (iv) in selected instances, for the ability of the RNAs to function as methylation guides. Five of the cDNA sequences (sR117, sR128, sR130, sR131 and sR132) could not be detected in total cellular RNA using either primer extension or northern hybridization, although they could be detected by RT-PCR, suggesting that they originate from low-abundance RNAs. In length, only nine of the cDNA clones correspond closely to the full length of the RNAs detected in vivo; most of these are members of the canonical C/D box family of sRNAs. The remaining cloned cDNAs appear to be fragments of longer transcripts and, based on the northern hybridization or primer extension results, are likely to represent partial degradation products derived from the longer (detectable) RNAs. At this stage we cannot exclude the possibility that these RNAs may represent processing products. A similar pattern of size reduction was observed by Tang et al. (2002b) in the set of cDNA clones recovered from an Archaeoglobus fulgidus cDNA library corresponding to the RNA fraction ranging from 50 to 500 nt.
Gel-electrophoresis retardation assays were used to analyze the affinity of the RNA derived from each of the cDNA clones for the L7Ae protein; 22 of the RNAs formed a detectable complex with L7Ae protein, whereas complex formation with the remaining ten could not be detected. Database searches using BLASTN revealed that seven of the 32 sRNA sequences had highly conserved homologs in other sequenced archaeal genomes. This reinforces the idea that these are authentic and functional small non-coding RNAs. These 32 sRNAs were divided into four additional groups (Groups 1-4; Table 3.2), as described below.

3.2 RNA sequences of the cDNA library

3.2.1 Group 1: Modification-guide sRNAs

3.2.1.1 L7Ae co-immunoprecipitates with methylation-guide sRNAs

In an attempt to characterize the protein composition of archaeal C/D box RNPs, extracts of S. solfataricus were subjected to anti-L7Ae, anti-Fib or anti-Nop5 immuno-affinity chromatography. The isolated complexes were separated by SDS-PAGE and analyzed for the presence of L7Ae by western blot analysis (Figure 3.2). All three immuno-purified complexes contained L7Ae protein, although the amount of L7Ae recovered from complexes prepared with either anti-Fib or anti-Nop5 appeared to be less than that recovered from complexes immunoprecipitated with anti-L7Ae. The observation that L7Ae can be coprecipitated with Fib and Nop5 is consistent with association of L7Ae with the methylation-guide sRNAs.

3.2.2 Canonical C/D box sRNAs

Six of the sequences recovered in the library display canonical features of archaeal C/D box ribose-methylation guides. Two other sequences contain degenerate C and D'-like elements separated by segments of unusual lengths. They have been generically termed "atypical C/D box sRNAs".
The lengths of the canonical C/D box RNAs sR101-sR106 agree well with the sizes of the transcripts expressed in vivo, whereas the atypical C/D box representatives appear to represent fragments of longer transcripts. Among the canonical C/D box RNAs, sR101, sR102 and sR104 correspond to sequences that have been identified computationally (Todd Lowe, Maria Zago and P. Dennis, unpublished results). The D or D' box guides of these three sRNAs are predicted to guide methyl transfer at positions U435 (D box, sR101) and A635 (D' box, sR102) in 16S rRNA and G811 (D' box, sR104) in 23S rRNA. Using the dNTP concentration-dependent pause primer extension assay, as described in Omer et al. (2000), we were able to confirm the presence of 2'-O-methyl ribose at position G811 in 23S rRNA (data not shown); however, using the same procedure, we could not detect the presence of methyl-ribose at either U435 or A635 in 16S rRNA. Although this assay is a quick and simple method to detect nucleotide modifications, it cannot unambiguously identify all ribose methylations because of interference from other neighboring modifications (e.g., base methylations) and/or strong secondary structure features. False negatives have also been observed, in which no pause was detected at sites of known ribose methylation (Maden et al., 1995). For unexplained reasons, a far lower primer extension success rate has been experienced with archaeal rRNA than previously with yeast rRNA (Omer et al., 2000).

Putative rRNA and tRNA targets for C/D box sRNA guides. To find putative rRNA or tRNA targets for the C/D box sR103, sR105 and sR106, the S. solfataricus genome was searched for antisense sequences able to form a minimum complementarity of 9 bp with the D or D' guide regions of the respective sRNAs. For this purpose, a genomic tool available at http://genome-tools.sourceforge.net/ was used.
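The antisense search described above can be illustrated with a minimal Python sketch. It is not the genomic tool used in this work: it reports only exact, ungapped Watson-Crick matches between the reverse complement of a guide and a target sequence, and the function names, the example guide and the example target are all hypothetical.

```python
"""Sketch: exact antisense matches between an sRNA guide and a target sequence."""

# Map each base to its Watson-Crick partner, written as DNA (U is read like T).
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA or RNA sequence, as DNA."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_hits(guide: str, target: str, min_len: int = 9):
    """List (target_start, length) for every maximal exact match of at least
    min_len nt between the reverse complement of guide and target.
    Positions are 0-based; the longest matches come first."""
    probe = reverse_complement(guide)
    target = target.upper().replace("U", "T")
    hits = []
    # prev[j] holds the length of the match ending at probe[i-2], target[j-1].
    prev = [0] * (len(target) + 1)
    for i in range(1, len(probe) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if probe[i - 1] != target[j - 1]:
                continue
            run = prev[j - 1] + 1
            curr[j] = run
            # Report a match only when it cannot be extended any further.
            at_edge = i == len(probe) or j == len(target)
            if run >= min_len and (at_edge or probe[i] != target[j]):
                hits.append((j - run, run))
        prev = curr
    return sorted(hits, key=lambda hit: -hit[1])


if __name__ == "__main__":
    guide = "GCAUGGAUCGA"  # hypothetical 11 nt guide, not a sequence from this work
    target = "AAAA" + reverse_complement(guide) + "GGGG"
    print(antisense_hits(guide, target))
```

The 9 nt default mirrors the minimum complementarity used for the search. The quadratic scan is adequate for individual rRNA or tRNA genes but would be slow on a whole genome, where an indexed search would be the more realistic choice.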
The genomic tool used for this search allows gaps or nucleotide mismatches in the short input sequence. The resulting hits were manually sorted after discarding the perfect match to the sRNA gene and then ranked according to the length of the perfect match. The longest match to the Sso sR106 D' guide element was eleven nucleotides within the S. solfataricus tRNA(Met), tRNA(Tyr) and tRNA(Phe); the predicted site of 2'-O-ribose methylation was G52. In S. tokodaii, the homologue of sR106 contains the exact same D' guide sequence but is predicted to guide methylation to position G52 within a somewhat different set of tRNAs: two tRNA(Met), two tRNA(Thr), tRNA(Val), tRNA(Trp) and tRNA(Phe). The D box guide of sR106 also exhibited a 9 nt complementary match to a genomic sequence corresponding to position C22 of another RNA (sR117) that was recovered in the library (see below). Using an in vitro methylation assay, I attempted to demonstrate the D box guide activity of sR106 against a transcript of sR117, without success (data not shown). I suspect that this failure may be related to the fact that both the guide and the target RNAs are able to bind the C/D box proteins and that this may interfere with proper guide-target complex formation and/or activity. No potential rRNA, tRNA or other non-coding RNA targets were identified for the D or D' guides of either sR103 or sR105; we speculate that these sRNAs might modify unknown RNAs, although at this stage an alternate chaperone function cannot be excluded. In these analyses, many of the guide sequences exhibited complementarity to protein-coding sequences. However, understanding the significance of these matches requires further investigation.

Assembly and methylation activity of RNP complexes.
Canonical C/D box sRNAs are known to assemble with the proteins L7Ae, Nop5 and Fib into functional RNPs that act as ribose-methylation machines (Omer et al., 2002). All six new canonical C/D box sRNAs were tested for their ability to assemble with the three methylation-guide core proteins and form RNP complexes able to direct methylation to RNA oligonucleotides complementary to one of the two guide regions. In Figure 3.3, one of the canonical C/D box RNAs recovered in the library (sR101) is shown as an example. In the predicted secondary structure of sR101 (Figure 3.3A), the C/D and C'/D' boxes form the two K-turn motifs of the RNA. Standard band-shift assays demonstrated that all the C/D box sRNAs recovered in the library form a complex with L7Ae (Figure 3.3B and data not shown) and assemble into higher-order complexes with the other two methylation-guide proteins (Figure 3.3C and data not shown). Moreover, the assembled C/D box sRNP complex was able to direct the methylation of a target RNA in an in vitro methylation assay (Figure 3.3D and data not shown).

3.2.2.1 Homologues of C/D box sRNA genes in other archaeal species

We next examined other sequenced archaeal genomes for the presence of sequences homologous to the box C/D RNAs sR101-sR106. For this purpose, we used routine BLASTN searches combined with the genomic tool mentioned above. To test the efficiency of the tool in identifying C/D box sRNA homologs, I first searched for homologs of the initial set of 29 sRNAs identified biochemically in S. acidocaldarius or in the related genomes of S. solfataricus and S. tokodaii. Homologous sRNAs are defined as those predicted to guide modification to the same position in a given RNA target in distinct archaeal species (Dennis et al., 2001).
Inspection of all the archaeal C/D box RNA sequences identified thus far has shown that in most cases the region of complementarity between the target and the guide extends over 9-10 bp. Based on this feature, the query sequence that I used in conjunction with the genomic tool corresponded to the 9-10 nt long guide sequence and the adjacent D or D' element as present in the S. acidocaldarius genome. The program output consisted of the genomic coordinates where a match was found. Knowing that archaeal C/D box RNAs have a dyad modular organization, I looked for the presence of additional box elements in the region surrounding the output coordinate. The analysis identified ten homologous groups of C/D box RNAs with members in two, three or four Sulfolobus species (Table 3.3). Of the ten groups, seven have representatives in S. solfataricus; two of these were in a group of 13 sRNA genes identified previously using an archaeal sRNA gene-finding program (Omer et al., 2000) and the remaining five are new genes. Thus, this strategy proved to be robust, as exemplified by the identification of five new sRNA genes in S. solfataricus and nine new sRNA genes in S. tokodaii. Using the same search algorithm or conventional BLASTN, I looked for homologs of the S. solfataricus sR101-sR106 RNAs in other archaeal genomes. No homologs were detected in other Sulfolobus species for sR101-sR104, whereas homologs were detected for sR105 and sR106 (Table 3.3). An inspection of the genomic environment of the sR106-encoding gene (previously annotated as sR18 in S. acidocaldarius; Omer et al., 2000) showed that in S. solfataricus and S. tokodaii, the 3' end of the small RNA is complementary (antisense) over 2 and 7 nt, respectively, to the 3' end of the mRNA encoding the thiamine biosynthetic enzyme, thiI. In S. acidocaldarius the sR18 gene is located in an intergenic region, between a signal anchor protein and a conserved archaeal protein of unknown function. Inspection of the alignment of sR105 with the homologous sequence from S. metallicus indicates that the most conserved region spans the D' guide; this suggests that the guide may be used to direct ribose methylation to an unidentified target RNA.

3.2.2.2 Atypical C/D box sRNAs

Assembly and methylation activity of the atypical C/D box RNPs. The sR107 and sR108 RNAs have some of the structural features of canonical C/D box sRNAs, but the spacing of the box sequences is atypical (Figure 3.4A and B, left panels). To obtain information on the possible function of these two atypical C/D box RNAs, I tested their ability to form complexes with the three C/D box binding proteins and to guide methylation in vitro to short oligonucleotide targets (Figure 3.4A and B, right panels). The RNA targets were designed to form a 10 nt perfect helix with the region upstream of the D or D-like element, with methylation expected to occur at the -5 position. Both RNAs were able to guide a low level of methyl transfer; the activities were only about 13% of that obtained with the control S. acidocaldarius sR1 RNA (2 pmol of product for the atypical sRNAs compared with 15 pmol of product for the canonical sR1 sRNA) (Figure 3.4C). These values are very close to the sensitivity of detection of the in vitro assay. The low level of activity associated with sR107 may have been due to the fact that only 7 nt are available in the loop to base pair with the target oligonucleotide. Alternatively, the reduced activity of the atypical sR107 may be the result of partial or inefficient assembly of this RNA into higher-order complexes, as suggested by the low amount of complexes II and III formed in the presence of the Nop5 and Fib proteins.
To test the first of these possibilities, we inserted the D box guide of sR1 into the corresponding position of sR107 (Figure 3.4D); this expands the loop and the potential for guide-target interaction from 7 to 9 nt, and the activity was enhanced four-fold (Figure 3.4C). These studies suggest that the atypical C/D box sRNAs can assemble into higher-order RNP complexes in the presence of L7Ae, Nop5 and Fib and that these complexes may possess the ability to direct methylation to appropriate target oligonucleotides. Alternatively, these atypical C/D box sRNAs may function in processes unrelated to nucleotide modification. In this instance, suboptimal recruitment of Nop5 and Fib may be the consequence of protein-protein interactions mediated by L7Ae only. Further work will be required to fully understand the structure, activity and function of these non-canonical C/D box RNAs.

Structure and properties of sR107 RNA. The sR107 coding region is present in two copies in the genome, located in intergenic regions, and has a homologue in S. tokodaii. Northern hybridization results indicate that the in vivo transcript containing the cDNA sequence is about 190 nt in length. Attempts to map the 5' end of the in vivo transcript by primer extension were unsuccessful. However, the correlation between the length of the conserved sequence and the in vivo transcript length suggests that the highly conserved region corresponds to the coding region of this transcript (Figure 3.5A). Interestingly, the sequence that corresponds to the proximal portion of the full-length transcript, spanning approximately 130 nt, has been recovered in two nearly identical forms (Sso-17 and Sso-109) in a second, more general library of ncRNAs from S. solfataricus constructed from a pool of size-fractionated but unselected RNAs (Tang et al., 2005).
The Sso-17 and Sso-109 clones are derived from two nearly identical genes and both are linked to the sequence that encodes sR107. Although to date there is no experimental evidence that indicates the function of the Sso-17-sR107 and Sso-109-sR107 fusion transcripts, it is interesting that these sRNAs exhibit extensive boxed complementarity to the 3'-UTR of a predicted transposase mRNA, which is encoded in a different region of the genome (Figure 3.5B). One of the boxed regions of complementarity corresponds to the D box guide. However, at present there is no evidence that this RNA functions as a methylation-guide sRNA. The duplicated genes that encode this ncRNA in S. solfataricus are phylogenetically conserved in a single copy in the genome of Sulfolobus tokodaii. This single-copy RNA displays a totally different pattern of complementarity to the 3'-UTR of a predicted ORF of unknown function (ST2013) (Figure 3.5C).

sR108 RNA. The sR108 RNA was derived from a 297 nt-long sequence that is repeated fourteen times in the S. solfataricus genome. This sequence contains an imperfect match to a number of other related genomic sequences. Thirteen copies of the repeated sequence are located in intergenic regions (Figure 3.6B). Northern hybridization and primer extension analyses indicate that the detectable in vivo transcript is transcribed from the distal 200 nt of the repeat sequence and that the cDNA sequence recovered in the library is derived from the middle of the in vivo transcript (Figure 3.6A). Two nearly perfect but truncated copies of the same repeat are found in the genome of S. tokodaii. The S. tokodaii sR108 copies are 145 nt long and overlap the 3' end of the coding regions of long hypothetical proteins (ST1091 and ST1938).
3.2.2.3 H/ACA sRNA

Typical eukaryotes, such as yeast and human, contain dozens of pseudouridine modifications in their large and small subunit rRNAs (Maden, 1990; Maden and Hughes, 1997). Most or all site-specific pseudouridine modifications are introduced during ribosome assembly and are directed to selected locations within the rRNAs by the guide function of the H/ACA family of snoRNAs (Ganot et al., 1997; Kiss-Laszlo et al., 1996; Ni et al., 1997; Weinstein and Steitz, 1999). Archaea and Bacteria typically contain fewer than a dozen pseudouridine modifications in their rRNA. In E. coli, all of these modifications appear to be enzyme-mediated and none is known to involve or require an RNA guide function (Massenet et al., 1999). In contrast, in at least one archaeal example (A. fulgidus), a cDNA library made from small RNAs captured four H/ACA-like sRNAs, and pseudouridine modifications were detected in the rRNA at three of the four sites predicted from the guide sequences in these RNAs (Tang et al., 2002a). The sR109 sequence from the S. solfataricus L7Ae library has several of the hallmark features of an H/ACA RNA: it has (i) the conserved ACA sequence at the 3' end, (ii) the sequence and structural features of a pseudouridylation pocket, including antisense guide elements that are predicted to target modification to position U2598 (U2457 in the E. coli numbering system) in 23S rRNA and (iii) a K-turn motif that is predicted to be bound by the L7Ae protein (Figure 3.7A). Pseudouridine modification at the predicted position is common and has been observed in organisms ranging from E. coli to humans (Massenet et al., 1999; Ofengand and Bakin, 1997). The ability of sR109 to bind to the L7Ae protein was confirmed by band-shift analysis (Figure 3.7C), and the presence of a pseudouridine modification at the predicted position U2598 was confirmed by CMC treatment and primer extension (Figure 3.7B).
A mutant sR109 sRNA, in which the two adjacent G:A base pairs of the predicted K-turn were replaced with C:C pairs, was constructed. Band-shift assays performed with the mutant sR109 sRNA demonstrated that this transcript is unable to form a complex with L7Ae protein, thus confirming that the predicted K-turn is indeed the L7Ae binding site within sR109 sRNA (Figure 3.7D and E).

3.2.3 Group 2: RNAs overlapping protein-coding regions

Fourteen of the sequences recovered in the library are encoded on the sense strand of annotated ORFs. Most of these sequences contain a functional K-turn motif and are able to bind the L7Ae protein. The binding motif can apparently be located anywhere along the length of the RNA: at the 5' end overlapping the initiation codon, in the middle and completely within the coding region, or at the end and extending into the 3' flanking region. Therefore, the sequences of this group have been subdivided into three categories: (i) RNAs overlapping the 5' end of ORFs, (ii) RNAs overlapping the 3' end of ORFs and (iii) RNAs encoded within ORFs.

3.2.3.1 RNA sequences overlapping the 5' end of annotated ORFs

Within the first subdivision, five clones (sR110-sR114) map to the region upstream of the initiation codon and extend 3' into the protein-coding region (Figure 3.8). In the case of sR110, the overlapping gene encodes subunit four of formate hydrogenlyase (hycD), whereas for sR111-sR113 the overlapping genes are transposases. Interestingly, sR114 overlaps the 5' region of the gene encoding L7Ae itself. The sR110-sR113 sequences are characterized by a long hairpin that is interrupted by bulged nucleotides or internal loops; the initiation codons are located either within the stem structure or at the 3' base of the stem (Figure 3.8).
All these RNAs except sR113 (data not shown) are able to form a stable complex with the L7Ae protein, and each of the four sequences contains an easily recognizable K-turn motif. Somewhat surprisingly, the sR110-sR112 transcripts can form a higher-order complex in the presence of all three C/D box binding proteins (Figure 3.8A-C, left panels). The 5' ends of the in vivo transcripts from which these cDNA fragments are derived were mapped by primer extension assays. The results indicate that the transcription start site of the mRNA from which sR110 is derived is 113 nt upstream of the A of the start codon and that the cloned fragment contains only the last 40 nt of the 5'-UTR of the mRNA. This indicates that the RNA was fragmented during the construction of the library and that only the RNA fragment that was tightly bound to L7Ae protein was recovered. An AT-rich promoter-like sequence has been identified about 27 nt 5' of the predicted transcription start site of this RNA. On the other hand, the sR111, sR112 and sR114 clones correspond to the immediate 5' ends of the respective mRNAs. The 5' regions flanking all three of these sequences contain AT-rich promoter-like elements centered about 25-30 nt upstream of the start of the cDNA sequences. In addition, a Shine-Dalgarno (SD) sequence motif could be identified at the appropriate distance (8-10 nt) for all the 5'-end-overlapping sequences. At present there are no direct experimental data to suggest the possible function of a K-turn motif at the 5' end of these mRNAs.

3.2.3.2 RNA sequences overlapping the 3' end of annotated ORFs

The second subgroup, representing sRNAs that overlap the 3' end of coding regions, contains three members (sR115-sR117).
The sR115 sRNA contains a region of secondary structure that overlaps the 3' end of a hypothetical transposase-related protein, with the predicted K-turn motif overlapping the termination codon of the mRNA (Figure 3.9A). The sR115 RNA was able to bind L7Ae with high affinity and to form, at somewhat reduced efficiency, a higher-order structure containing Nop5 and Fib (Figure 3.9A). Mutational analyses show that substitution of one of the K-turn G:A base pairs with a C:C pair abolishes the binding of L7Ae protein to the RNA (Figure 3.9B). Interestingly, a shorter sR115 sequence containing a 5' deletion to nucleotide 26 was also recovered in the library. Band-shift assays demonstrated that the truncated sR115 transcript, which is missing half of the nucleotides forming the predicted K-turn motif, is still able to interact with L7Ae protein. The sequence of the truncated sR115 was manually folded and a putative K-turn motif could be identified in the terminal loop of the RNA (Figure 3.9C, left panel). Band-shift assays performed with increasing amounts of L7Ae protein show that the two sR115 RNAs, the truncated one and the one containing the complete sequence, display different affinities for the L7Ae protein (Figure 3.9A and C). Primer extension analysis showed that the sR115 in vivo transcript is about 50 nt longer than the corresponding cDNA clone (Table 3.2). Informatics analysis identified homologs of the sR115 sequence in the crenarchaeon S. tokodaii and in the moderately thermophilic euryarchaeon Thermoplasma volcanium. Interestingly, in S. solfataricus and S. tokodaii, sR115 overlaps the gene encoding a hypothetical protein with homology to transposase 1974, whereas in T. volcanium the corresponding sRNA sequence lies in an unrelated non-protein-coding region (Figure 3.9D).
Inspection of the three sequences reveals extensive conservation over the region represented by the cDNA clone and the additional 50 nt of 5' flanking sequence. Although I noted the presence of a promoter-like sequence element (5'-TTTAAGT-3') centered about 28 nt upstream of the 5' end of the in vivo transcript, I cannot predict at this time whether the sRNA is generated by processing of the transposase mRNA or is independently transcribed from this putative internal promoter. Two additional sequences in the genome of S. solfataricus exhibit significant similarity to the sR115 gene [Score = 65.9 bits, Expect = 9e-09]. These sequences correspond to a non-coding region similar to that found in T. volcanium and to a truncated form of transposase 1974. All five of these RNA sequences appear to contain the C/D box-like motifs, as illustrated in Figure 3.9D. Within this subgroup, two sRNAs (sR116 and sR117) contain no K-turn-like motifs and are unable to bind the L7Ae protein. It is unclear why these sRNAs were recovered in our library. They might represent fragments of a larger RNA or part of a larger complex that binds the protein. Alternatively, these RNAs might represent non-specific sequences carried over with the pool of L7Ae-associated RNAs. It is interesting to note that sR113 and sR117 are derived from the same transposase ISC1476 mRNA and that neither fragment has a K-turn motif (Table 3.2). Moreover, band-shift assays performed with the full-length mRNA of transposase ISC1476 did not detect formation of an L7Ae-mRNA complex (data not shown).

3.2.3.3 RNA sequences overlapping internal regions of annotated ORFs

The third subgroup contains six sRNAs (sR118-sR123) whose sequences are contained entirely within the respective coding regions of predominantly transposase genes (Table 3.2).
Five of these RNAs contain recognizable K-turn motifs and are able to form a strong complex with the L7Ae protein and a somewhat weaker complex in the presence of all three C/D box binding proteins (data not shown). sR118 overlaps the coding region of transposase ISC1904. ISC1904 is a very abundant mobile element in the S. solfataricus genome, with 10 complete and four partial copies (Brugger et al., 2002). Seven sR118-related clones, differing by several base deletions and representing different isoforms of the transposase, were recovered in the library. Band-shift assays demonstrated that all the isoforms were able to interact with L7Ae protein (data not shown). In addition, partial fragments of sR118, which are missing the distal half of the RNA sequence, were also recovered. These truncated sR118 fragments were not able to bind L7Ae protein, suggesting that the L7Ae binding site, or at least part of it, is located in this region of the RNA molecule. BLAST searches revealed that a homologue of sR118 is present in the genomes of S. tokodaii and D. ambivalens. In the first instance, it overlaps the coding region of a transposase-related hypothetical protein; in the second, a putative transposase. A longer sR118-related transcript was also recovered in a second, more general library of ncRNAs from S. solfataricus constructed from a pool of size-fractionated but unselected RNAs (Tang et al., 2005). The clone isolated by Tang et al. contains 36 extra nt at each end of the RNA molecule and constitutes an isoform of transposase 1904 that was not recovered in the L7Ae cDNA library. sR122 and sR123 overlap different regions of an ORF coding for a small hypothetical protein, and both sequences are able to form a complex with L7Ae protein (data not shown).

3.2.4 Group 3: sRNAs encoded in intergenic regions

Three of the sRNA sequences (sR124-126) recovered in the library were encoded completely in intergenic regions. Only two of the sequences contained a K-turn motif and were able to form a stable complex with L7Ae protein.

3.2.4.1 Pre-rRNA spacer regions (sR125)

The 16S and 23S rRNAs are excised from the primary rRNA transcript by an archaeal splicing endonuclease that recognizes and cleaves a conserved secondary structure element, the bulge-helix-bulge (BHB) motif, located at the long processing stems surrounding 16S and 23S rRNAs in the pre-rRNA (Dennis et al., 1998). The BHB RNA structural motif is also present at the base of the anticodon loop (at exon-intron-exon junctions) in intron-containing tRNA transcripts. The BHB motif consists of two 3-nt bulges on opposite strands of an RNA, separated by a 4-bp helix. The splicing endonuclease cleaves at symmetrical positions within each of the 3-nt bulges present on the same minor groove face of the central 4-bp helix of the BHB motif, resulting in 2',3'-cyclic phosphates and 5'-OH ends (Dennis et al., 1998; Russell et al., 1999). In tRNA intron processing, the 5' and 3' halves of the tRNA are ligated together following removal of the intron (Daniels et al., 1985); the ligase enzyme has yet to be identified. Recently, Tang and coworkers (2002b) demonstrated that the pre-16S and pre-23S rRNA transcripts are ligated following cleavage within the BHB motif, resulting in circularized RNA species. The sequences forming the 5' external transcribed sequence (ETS), the internal transcribed sequence (ITS) and 3' ETS of the primary rRNA transcript are also ligated to form a long product. Interestingly, in the L7Ae library one of the recovered clones, sR125, contained the distal portion of the ITS, the ligation junction in the 23S rRNA processing stem and the 3' ETS.
The 5' ETS sequence and the proximal portion of the ITS are missing from this clone (Figure 3.10A). Band shift assays demonstrated that sR125 RNA is able to form a complex with L7Ae protein (Figure 3.10B; Tang et al., 2002b) and in accordance with this observation, a K-turn motif has been identified in an asymmetric loop at the base of the 23S processing stem. The K-turn motif at this position is also conserved in S. acidocaldarius and S. tokodaii (Tang et al., 2002b; Figure 3.10C and D). This RNA was not able to interact with the other two methylation guide proteins, Nop5 and aFib. Remarkably, an L7Ae-binding site is also present in the religated rRNA spacer transcript of the euryarchaeon A. fulgidus, but its location differs from the S. solfataricus transcript. In A. fulgidus the K-turn motif is found within the 5' ETS sequence, indicating that different strategies have been used in two distantly related Archaea to provide for an L7Ae binding site in this RNA. In addition, these observations suggest that L7Ae protein might help to organize a larger pre-rRNA processing complex by binding to the precursor rRNA.

3.2.4.2 sRNA genes encoded between annotated ORFs

The 56 nt long sR126 is positioned in the 87 nt-long intergenic space between two convergently transcribed ORFs. The sRNA has a long hairpin secondary structure containing a K-turn motif in the middle of the stem (Figure 3.11A). Band shift assays show that this RNA is able to bind the L7Ae protein (Figure 3.11B). A putative transcription promoter could not be identified at the appropriate distance and the 5' end of the RNA could not be mapped by primer extension experiments. Therefore, it is unclear if this sRNA is independently transcribed or cotranscribed with the upstream ORF and generated by mRNA processing.
The function of this sRNA could not be inferred since it lacks any sequence or structural motifs that would have allowed its assignment to any of the known classes of ncRNAs.

3.2.5 Group 4: Antisense RNAs

Seven of the cDNA sequences recovered in the library exhibit sequence complementarity to either protein-encoding mRNAs or other ncRNAs (Table 3.2). Therefore, the sequences have been divided into two subgroups: i) antisense to C/D box sRNAs and ii) antisense to annotated ORFs. In the first subgroup, the RNAs display a perfect full-length complementarity to canonical methylation guide sRNAs; in the second, they exhibit a perfect or interrupted complementarity to mRNAs.

3.2.5.1 Antisense to C/D box sRNAs

Two of the sequences in the library were complementary to canonical C/D box methylation-guide sRNAs (Figure 3.12A). Neither of these RNAs had a recognizable K-turn motif and neither was able to bind the L7Ae protein. We suspect that they appear in our library because they are, at some point, in complexes with the sense C/D box RNAs that are demonstrated substrates for L7Ae protein binding. sR132 was observed to be the antisense partner to the previously characterized C/D box sR4 RNA of S. solfataricus (Omer et al., 2000). sR4 is 55 nt long and its D' and D box guides are predicted to direct methylation to position G894 in 23S rRNA and to position C277 in 16S rRNA, respectively. Only the C277 ribose modification in 16S rRNA has been confirmed using the concentration-dependent-pause primer extension reaction (Omer et al., 2000). The 66 nt of sR132 span the whole sequence of sR4 and extend 11 nt into the 3' flanking sequence. sR132 is located in an intergenic region between two hypothetical proteins. Apparently, this gene is independently transcribed since a putative promoter could be identified 30 nt upstream from the start of the cDNA clone.
Similarly, a promoter could also be identified for the sR4 gene encoded on the complementary strand. The expression of sR132 was verified by northern blot analyses and the expression of sR4 was confirmed by primer extension in another study (Omer et al., 2000). The new C/D box sRNA that exhibits full-length antisense complementarity to sR133 has been designated sR134 for annotation purposes. Northern blot analyses showed that sR134 is 60 nt long in vivo (data not shown); however, the 5' and 3' ends of this RNA were not mapped experimentally. The D' and D box guide regions of this RNA do not exhibit complementarity to rRNA or tRNA sequences and therefore this sRNA has been designated as an orphan guide.

3.2.5.2 Antisense to annotated ORFs

Fragments of antisense RNA that are complementary to 5', 3' or internal regions of mRNAs that most often (but not exclusively) encode transposon-related proteins were recovered in the cDNA library. The synthesis of transposases is expected to be tightly regulated because of the genomic instability that results from uncontrolled transposition. The detected antisense RNAs are predicted to be important components in this regulation. The expression of all the antisense RNA fragments was confirmed either by northern hybridization or by RT-PCR. Two of the clones (sR127 and sR130) exhibit antisense complementarity to the beginning and end, respectively, of the same transposase ISC1439 mRNA. sR130 is complementary to the last 15 nt of the transposase gene and the first 49 nt of the 3'-UTR (Figure 3.12B). The fragment contains a K-turn motif in the region complementary to the 3'-UTR, whereas sR127 does not contain a K-turn motif and does not bind L7Ae protein (data not shown). Interestingly, only the clones that exhibit complementarity to the 3' end of mRNAs (sR130 and sR131) contain a recognizable K-turn and interact with L7Ae protein (Figure 3.12C and data not shown).
The presence of the non-L7Ae-binding antisense sequences in our library suggests that, in vivo, they i) may be fragments of larger antisense transcripts that do contain the motif and do bind the protein, or ii) are in complexes that contain the L7Ae protein. In E. coli, antisense sRNAs have been implicated in the regulation of translation; several mechanisms involving RNA-RNA or RNA-protein interactions result in inhibition or promotion of ribosome binding, ultimately changing translational efficiency (Wagner et al., 2002; Wassarman, 2002). The sR127 and sR128 antisense RNAs are complementary to the 5' end of a transposase and might participate in regulation of translation using a mechanism similar to the one identified in E. coli (Arini et al., 1997; Simmons and Kleckner, 1983). Moreover, it seems reasonable to suggest that in S. solfataricus, antisense RNAs might play an important role in the regulation of gene expression. In at least some cases, K-turn motifs and their interaction with L7Ae protein may be a critical component of this process.

3.2.6 Group 5: Fragments of 7S RNA

The 7S RNA (4.5S RNA in Bacteria) is a universally conserved component of the signal recognition particle (SRP) that functions in the translocation of membrane or secreted proteins across the prokaryotic plasma membrane or to the endoplasmic reticulum membrane (Eichler and Moll, 2001; Herskovits and Bibi, 2000; Keenan et al., 2001). The eukaryotic 7S RNA molecule consists of 300 nt that can form a highly base-paired secondary structure. A 7S RNA molecule has also been identified in Archaea and although the primary sequence conservation between the archaeal and eukaryotic 7S RNA is limited to a specific region of the molecule, the secondary structure of both RNAs is nearly identical (Kaine, 1990; Luehrsen et al., 1985).
The eukaryotic and archaeal 7S RNAs are both composed of seven helices, with the highest sequence conservation in the proximal and distal bulges of helix 8 and the tetra-nucleotide loop of helix 6 (Batey et al., 2000; Larsen and Zwieb, 1991) (Figure 3.13).

3.2.6.1 L7Ae interacts with full-length 7S RNA

The L7Ae library contained two fragments of 7S RNA: RNA1 (nucleotides 220-311), which was the most prevalent sequence in the library, and RNA2 (nucleotides 135-233) (Tables 3.1 and 3.2). Neither of these 7S RNA fragments was able to form a stable complex with L7Ae protein in a band shift assay (data not shown). Because RNA1 was so prevalent in the library, we wondered if the full-length 7S RNA might contain a K-turn in a region of the molecule not recovered within the two cloned sequences. A putative K-turn has been identified in the Alu domain that forms near the 5' and 3' ends of the human 7S RNA (Klein et al., 2001). Visual inspection of the S. solfataricus 7S structure failed to identify a K-turn at this position. However, we were able to identify a putative K-turn motif present in the large internal loop around nucleotides 110 and 250, as illustrated in Figure 3.14C. To determine if 7S RNA binds L7Ae and to locate the binding site, we carried out band shift assays with full-length RNA and with RNAs containing 5' deletions extending to nucleotide positions 88 and 134 (Figure 3.14A and B). The full-length RNA and the deletion to nucleotide 88 (which removes half of the Alu domain) form a stable complex with the protein, whereas the longer deletion that removes half of the predicted K-turn motif does not interact with the protein.

3.2.6.2 Mapping of L7Ae binding site on the 7S RNA molecule

To confirm the precise location of the L7Ae-7S RNA interaction, a modified toeprinting assay was used.
In toeprinting experiments, an oligonucleotide annealed to the RNA template downstream of the presumed protein binding site is extended with reverse transcriptase. When the enzyme encounters a bound protein, it prematurely falls off the RNA leaving a 'stop signal' (the toeprint), which defines the last nucleotide on the 3' side of the RNA to which the protein is bound. For the toeprinting experiments, the full-length 7S transcript and the two 7S fragments with 5' deletions were incubated with increasing amounts of L7Ae protein as indicated in section 2.2.8 and Figure 3.14D. A primer complementary to positions 311 to 292 in 7S RNA was used to generate the cDNA strand of each of the L7Ae-bound 7S RNAs by reverse transcriptase. A strong block to reverse transcription was observed at position A258 (within a few nucleotides of the predicted L7Ae binding motif) for the full-length 7S RNA and the 89-311 7S RNA fragment, but not for the 7S RNA fragment with the longest deletion that does not bind L7Ae (Figure 3.14D). These results suggest that the L7Ae protein may be a component of the archaeal signal recognition particle and have a direct role in the function of the RNP complex.

3.2.6.3 Antisense-7S RNA

In the S. solfataricus genome sequence database (http://www-archbac.u-psud.fr/projects/sulfolobus/) the annotation of the 7S RNA gene is incorrect and actually represents the antisense complement to the authentic 7S RNA gene (see http://psyche.uthct.edu/dbs/SRPDB/SRPDB.html). Because of this confusion, the presence of an anti-7S RNA was tested by RT-PCR amplification. Using this method, copies of both sense and antisense 7S RNA were detected in S. solfataricus total RNA preparations, but no amplification products were detected for the control sample which did not contain any RNA template (data not shown). Gel shift assays indicate that the antisense 7S RNA does not bind to L7Ae (data not shown).
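
The sense/antisense relationship discussed in this section can be illustrated computationally. The following is only a minimal Python sketch and was not part of the experimental work: the function names and the short toy sequence are hypothetical, and only standard Watson-Crick pairs are considered (G:U wobble pairs are ignored). A real check of the database annotation would use the annotated and the authentic 7S RNA gene sequences in place of the toy string.

```python
# Watson-Crick partners for RNA; G:U wobble pairing is deliberately ignored.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq):
    """Normalize a DNA or RNA string to upper-case RNA letters."""
    return seq.upper().replace("T", "U")


def reverse_complement(seq):
    """Return the antisense (reverse-complement) strand, written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(to_rna(seq)))


def is_perfect_antisense(candidate, target):
    """True if `candidate` is the exact full-length antisense of `target`."""
    return reverse_complement(candidate) == to_rna(target)


# Hypothetical toy sequence, NOT the S. solfataricus 7S RNA.
sense = "GGCUAAGCCAUG"
antisense = reverse_complement(sense)
print(antisense, is_perfect_antisense(antisense, sense))
```

Applying `reverse_complement` twice returns the original strand, which is a convenient sanity check when deciding which strand of an annotated gene corresponds to the authentic transcript.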
To test whether the antisense-7S has an effect on the binding of L7Ae to the sense 7S RNA, competition assays were performed (Figure 3.15). For this, constant amounts of L7Ae protein and 32P-labeled 7S RNA were incubated with increasing amounts of unlabeled sense 7S RNA (Figure 3.15A), antisense-7S RNA (Figure 3.15B) or a truncated 7S RNA lacking the L7Ae binding motif (nucleotides 135-311) (Figure 3.15C). Interestingly, addition of unlabeled 7S RNA or antisense-7S RNA to the binding reaction displaces the L7Ae protein from the 7S RNA. This effect is not observed with the truncated 7S RNA competitor. In addition, when a constant amount of labeled 7S RNA is incubated with increasing amounts of unlabeled antisense-7S RNA a retarded band, indicative of RNA-RNA complex formation, is observed (Figure 3.15D). These results indicate that the antisense-7S RNA is an efficient competitor able to displace the bound L7Ae protein from the sense transcript.

3.2.7 Group 6: Fragments of rRNA and tRNA

Eleven of the 45 sequences recovered from the library were placed in group 6. BLASTN searches against the whole S. solfataricus genome sequence revealed that four of the eleven sequences were fragments derived from the 5' and 3' ends of the 16S and 23S rRNAs (multiple clones of the four respective sequences were obtained), and the remaining seven were from different tRNAs (single or multiple clones; Table 3.2). The RNAs from all of the clones were examined for their ability to form a complex with the L7Ae protein in band shift experiments; none of the RNAs was able to form a stable complex (data not shown), even though one of the rRNA fragments (RNA5) is known to contain a K-turn (Klein et al., 2001). Visual examination of the other ten sequences failed to reveal the presence of a structural motif that resembled the K-turn.
Why were these rRNA and tRNA sequences prevalent in the library when they do not appear to contain a functional K-turn, and are unable to bind directly to the L7Ae protein? Several explanations are possible. For instance, in eukaryotic organisms, C/D box containing RNP complexes have been implicated in the essential endonucleolytic processing events at the 5' and 3' ends of the small and large subunit rRNA (Fatica and Tollervey, 2002). The rRNA sequences may have been recovered because they enter into a transiently stable, L7Ae-containing RNP processing complex at some point along the ribosome assembly pathway. Moreover, at least some of the recovered rRNA and tRNA fragments are targets for C/D box RNP-mediated methylation. The most thoroughly characterized example of this is within RNA3 from the 5' end of 16S rRNA. It contains the U52 target site that is methylated by the S. solfataricus sR1-containing RNP methylation guide complex. This may mean that in at least some instances, the transient complexes containing both the target rRNA or tRNA as well as the guide RNA can be co-precipitated with antibodies against the L7Ae protein. Alternatively, these RNAs that lack recognizable K-turns may simply represent non-specific carry over in the immunoprecipitation reactions.

3.3 Probing the structure and function of archaeal C/D box sRNP

3.3.1 Interaction of L7Ae with C/D box K-turn motifs

A major step toward understanding the methylation function of archaeal C/D box RNPs came with the reconstitution of active complexes using in vitro-transcribed sR1 RNA and purified recombinant proteins from S. acidocaldarius (Omer et al., 2002). The initial study demonstrated that the L7Ae protein binds directly to the C/D box RNA and nucleates the sequential addition of Nop5 and Fib to the complex.
Subsequent analyses of the archaeal C/D box RNPs revealed a symmetric distribution of proteins between the C/D and C'/D' motifs (Omer et al., 2006; Rashid et al., 2003; Tran et al., 2003). The C and D and the C' and D' sequence elements each fold into a K-turn motif. The K-turn motif formed by the C/D box exhibits the features of a canonical K-turn consisting of an asymmetric loop flanked by two short helical stems, stem I and II (Watkins et al., 2000; see Figure 3.16). In contrast, the C'/D'-associated K-turn consists of a terminal loop closed by only one stem, the equivalent of stem II. The crystal structures of L7Ae protein bound to two different C/D box RNA oligonucleotides have provided a better understanding of the interactions that allow the protein to recognize the two different K-turn motifs present in the C/D box sRNAs. The co-crystal structure of the Archaeoglobus fulgidus L7Ae bound to a C/D box RNA oligonucleotide reveals that the protein contacts the G residues present in the sheared G:A-G:A base pairs and three of the unpaired nucleotides located in the asymmetric loop, including a conserved U that protrudes from the RNA backbone into a hydrophobic pocket of the protein (Moore et al., 2004). A sixth contact between the L7Ae protein and the C/D box RNA occurs at one of the Us of the U-U pair at the base of stem II (Moore et al., 2004). This U-U mismatch pair is conserved in most eukaryotic and archaeal C/D box representatives and has been postulated to play an essential role in the assembly of higher order complexes (Figure 3.16) (Szewczak et al., 2005; Watkins et al., 2002; Watkins et al., 2000). The co-crystal structure of Methanococcus jannaschii L7Ae bound to a C'/D'-like K-turn has revealed that, despite the differences apparent between the internal (C'/D') and the terminal (C/D) canonical K-turn, the same nucleotide residues serve as a cue for protein recognition (Suryadi et al., 2005).
The sR1 RNA from S. acidocaldarius has been intensively studied in our lab and constitutes the model of the C/D box RNA family in Sulfolobus. We used site-directed mutagenesis of sR1 RNA to better understand i) the relationship between C/D box RNA structure and its guide function in the methylation reaction, and ii) the specificity of the L7Ae interaction with the K-turn motif.

3.3.2 sR1 C/D box sRNA mutants

The sR1 sRNA from S. acidocaldarius (Sac) exhibits the canonical features of the C/D box sRNA and guides the methyl modification to position U52 in the 16S rRNA in vivo (Figure 3.17A and 3.18A) (Omer et al., 2000). Moreover, it has been demonstrated that in vitro this sRNA assembles with the methylation-guide proteins (L7Ae, Nop5 and Fib) to form an active RNP complex and is able to accurately direct methylation of a complementary RNA target (Omer et al., 2002). Several mutants of Sac sR1 sRNA were constructed and tested for their ability to assemble into RNP complexes and to guide the methylation of a complementary fragment of rRNA containing position U52 (Omer et al., 2006).

Circular permutation of sR1. In circularly permuted RNAs the wild type 5' and 3' termini are connected and new extremities are created at alternative sites. Circular permutations have been used as a tool to probe the folding pathway and tertiary structure of various RNAs (Harris and Pace, 1995; Pan et al., 1999; Pan et al., 1991). Two circular permutation mutants of sR1 were constructed to examine the importance of the position and connectivity of the C/D and C'/D' boxes within the sRNA (Omer et al., 2006).
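
The effect of circular permutation on the linear order of the box elements can be sketched as a simple rotation. The snippet below is an illustrative Python sketch only; it assumes the wild-type box order 5'-C-D'-C'-D-3' (the canonical C/D box sRNA arrangement), treats each box as a single list element, and uses a hypothetical helper name. It reproduces the box orders reported for the two mutants, 5'-C'-D-C-D'-3' (CP1) and 5'-D'-C'-D-C-3' (CP2).

```python
def circular_permutation(elements, cut_after):
    """Join the original 5' and 3' ends, then reopen the circle.

    `cut_after` is the 0-based index of the element that becomes the new
    3' end; the element following it becomes the new 5' end.
    """
    k = (cut_after + 1) % len(elements)
    return elements[k:] + elements[:k]


# Assumed wild-type order of the sR1 box elements (5' to 3').
wild_type = ["C", "D'", "C'", "D"]

# Reopen between the 2nd and 3rd elements: box order reported for CP1.
cp1 = circular_permutation(wild_type, 1)

# Reopen between the 1st and 2nd elements: box order reported for CP2.
cp2 = circular_permutation(wild_type, 0)

print("CP1:", "-".join(cp1))
print("CP2:", "-".join(cp2))
```

Because the operation is a rotation, every permuted RNA keeps the same neighbours around the circle; only the position of the break, and hence which box pair lies at the termini, changes.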
In the first mutant, CP1, the 5' and 3' ends were relocated to the region connecting the D' and C' boxes (to give an sRNA where the order of the boxes was 5'-C'-D-C-D'-3'), whereas in the CP2 mutant the 5' and 3' ends were relocated to the connector region between the C and D' boxes (to give an sRNA where the order of the boxes was 5'-D'-C'-D-C-3') (Figure 3.17B and C). In the CP1 mutant the C/D and the C'/D' boxes form internal and terminal K-turn motifs respectively, whereas the CP2 mutant contains two internal K-turn motifs. Gel shift assays indicate that upon addition of L7Ae protein CP1 RNA readily accumulates into complex I (Figure 3.18B). In contrast, very little CP2 RNA assembles into complex I even at high protein concentrations (Figure 3.18C). However, in the presence of Nop5 and Fib both mutants assemble into higher-order complexes with an efficiency comparable to the control sR1 RNA (Figure 3.18B and C). These results suggest that for the CP2 mutant the free RNA might acquire structural conformations that are less able to bind L7Ae (Omer et al., 2006). In vitro methylation assays demonstrated that the CP1-containing RNP exhibits a slightly higher methylation activity (measured as initial rate or final level of incorporation) compared to the wild-type sR1-containing control (Figure 3.17B). In contrast, repositioning of the 5' and 3' termini between the C and D' boxes (CP2) appears to block completely the ability of the complex to direct methyl transfer (Figure 3.17C).

Multiple and single nucleotide substitution in the C/D and C'/D' motifs. The nucleotides forming the C/D or C'/D' boxes were partially or completely replaced with a poly-C tract in an attempt to determine the functional role that the C/D or C'/D' K-turn motif plays in the methylation activity of the RNP particle.
Gel shift assays showed that all mutants tested displayed qualitative L7Ae-Nop5 and L7Ae-Nop5-Fib binding properties indistinguishable from the wild-type sR1 RNA, although some heterogeneity was observed in the reactions containing only the L7Ae protein (i.e., relative mobility and amount of complex) (data not shown). It should be noted that RNAs that contain one mutant (misfolded) K-turn are not able to bind L7Ae efficiently at the aberrant site but maintain the capacity to assemble L7Ae with at least one Nop5-Fib heterodimer at the normal K-turn site to form asymmetric particles (Rashid et al., 2003; Omer et al., 2006). The stoichiometry of binding cannot be determined by gel shift assays; rather, binding can be assessed only qualitatively. All the tested poly-C-containing RNP complexes possess low residual methylation activity (Figure 3.17D, E, F and G). In addition, disruption of the K-turn associated with the guide function (box C/D K-turn and D guide function) is more detrimental to the activity of the complex than is disruption of the non-guiding K-turn motif (compare Figure 3.17F and G; Tran et al., 2003; Omer et al., 2006).

The crystal structures of L7Ae bound to different RNAs and mutagenesis experiments have shown that one of the most important features that contributes to K-turn stability and protein binding is the presence of the non-Watson-Crick G:A G:A base pairs. To probe the importance of these nucleotides, a G32C substitution that disrupts the second G:A base pair of the C'/D' box motif was introduced. Similar to what has been observed for the mutants containing multiple nucleotide substitutions, this single nucleotide mutation does not affect the ability of the RNA to assemble with L7Ae, Nop5 and Fib proteins and form a higher-order complex, but it has a detrimental effect on the methylation activity (Figure 3.17H).
In a similar manner the U31A substitution, which replaces the non-paired U nucleotide that protrudes out of the RNA helix and interacts with the L7Ae binding pocket, has a moderate negative effect on methylation activity with no apparent effect on protein assembly (Figure 3.17I). Taken together, these observations provide strong support for a symmetrical sRNP structure that contains two copies of each of the three proteins organized around the C/D and C'/D' K-turns as depicted in Figure 1.3. The integrity of both K-turns, particularly the one associated with guide function, is required for full activity of the complex.

Replacement of C/D and C'/D' box K-turns. The K-turn 38 motif was first identified in the crystal structure of the H. marismortui large ribosomal subunit and is the only K-turn motif in the 23S rRNA that does not associate with ribosomal proteins (Klein et al., 2001; Figure 3.19A). Further characterization of this unusual K-turn motif demonstrated that it requires only the presence of low levels of Mg2+ (10 mM) to form a stable self-folding RNA structure (Matsumura et al., 2003). In order to test the recently proposed chaperone function of L7Ae for facilitating K-turn formation in C/D box RNAs, three sR1 RNA mutants were constructed: in the first variant the C'/D' box elements were replaced with K-turn 38; in the second, the C/D box was replaced with K-turn 38; and in the third, both C/D and C'/D' boxes were substituted by K-turn 38 motifs. Standard gel shift assays demonstrated that all three K-turn-38-containing RNAs were able to form higher-order complexes in the presence of L7Ae, Nop5 and Fib proteins (Figure 3.19B and data not shown). Next, it was tested whether the stable self-folding K-turn 38 could substitute for the presence of L7Ae protein and assemble with Nop5 and Fib into higher-order complexes.
Interactions between the K-turn 38-containing RNAs and Nop5 and Fib proteins were not detected in the absence of L7Ae, either by band shift assays or by filter-binding assays, even when the concentration of Mg2+ was increased from 1 mM to 10 mM (data not shown). This result implies that the L7Ae protein not only acts as an RNA chaperone to mediate the folding of C/D box sRNA K-turns, but is also actively involved in nucleating the addition (possibly through direct protein-protein interactions) of the Nop5 and Fib proteins to these complexes. In vitro methylation assays demonstrated that none of the K-turn-38-containing RNAs was active in guiding methyl modification (Figure 3.17J and data not shown).

In an attempt to better understand the role of the L7Ae protein in the assembly and function of the methylation-guide RNPs, an sR1 RNA mutant in which the C/D box motif was replaced by the S. solfataricus K-turn 15 sequence was constructed. K-turn 15 is the L7Ae binding site in the 23S rRNA (Ban et al., 2001) and is unusual in that it contains a U:A:U base triplet in place of one of the two sheared G:A G:A base pairs (Figure 3.19A). Gel shift assays showed that the K-turn 15-containing RNA was able to assemble into higher-order complexes with the L7Ae, Nop5 and Fib proteins (Figure 3.19C). Interestingly, this complex was as active as the control sR1 in the in vitro methylation reaction and, unlike the wild type sR1, sensitive to Mg2+ concentration (Figure 3.17K and Figure 3.19D).

3.4 Regulation of gene expression: a new role for L7Ae protein?

3.4.1 L7Ae gene

The highly conserved L7Ae gene was first identified from the sequencing of the Methanococcus jannaschii genome and designated L7Ae because of its similarity to ribosomal proteins (Bult et al., 1996). Northern blot analyses and primer extension experiments indicate that in S. solfataricus the L7Ae gene is monocistronic and independently transcribed.
The transcription start site of the L7Ae gene is located 22 nt 5' from the A of the annotated start codon and a promoter-like sequence at the appropriate distance could be identified. In Archaea, similar to Bacteria, the mRNAs possess the SD sequence motif that enhances the efficiency of ribosome binding to the translation initiation regions of the RNA. Recent in silico genome-wide studies have shown that the initiation codons of highly expressed genes (e.g., ribosomal proteins) are preceded by an SD sequence at an optimum distance of 8-10 nt (Ma et al., 2002; Tolstrup et al., 2000). A putative SD sequence separated by the appropriate distance from the assigned AUG codon could not be identified for the L7Ae gene. More careful examination revealed the presence of an in-frame AUG triplet at codon position 4 in the proposed L7Ae reading frame. This in-frame AUG codon is preceded by a good SD sequence motif (5'-GAGG-3') at the correct separation distance (9 nt). In addition, N-terminal sequencing of S. acidocaldarius L7Ae protein and comparative sequence alignment of several archaeal L7Ae proteins indicate that the second AUG triplet, and not the first, is the authentic initiation codon and that the assignment of the initiation codon for the L7Ae ORF in the S. solfataricus database is incorrect. Throughout this work the second AUG triplet will be considered as the initiation codon of the L7Ae ORF.

3.4.2 L7Ae interacts with its own mRNA

One of the sequences recovered in the L7Ae-associated cDNA library, sR114, consists of the complete L7Ae 5'-UTR and the first two nucleotides of the initiation codon (Table 3.1). Gel shift assays demonstrated that L7Ae protein was able to interact and form a stable complex with this 32-nt long fragment (Figure 3.8B).
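
The start-codon reasoning of section 3.4.1 (an AUG is a plausible initiation codon when an SD motif lies 8-10 nt upstream) can be sketched as a simple sequence scan. This is a hedged Python illustration, not the analysis performed in this work: the function name is hypothetical, the spacing is assumed to be counted as the number of nucleotides between the last base of the SD motif and the A of the AUG (other conventions exist), and the toy transcript is invented for the example and is not the L7Ae mRNA sequence.

```python
def find_sd_linked_starts(mrna, sd="GAGG", min_gap=8, max_gap=10):
    """List AUG codons preceded by `sd` at an allowed spacing.

    Returns (aug_index, sd_index, gap) tuples with 0-based indices.
    `gap` counts the nucleotides strictly between the last base of the
    SD motif and the A of the AUG (an assumed convention).
    """
    mrna = mrna.upper().replace("T", "U")
    hits = []
    aug = mrna.find("AUG")
    while aug != -1:
        for gap in range(min_gap, max_gap + 1):
            sd_index = aug - gap - len(sd)
            if sd_index >= 0 and mrna[sd_index:sd_index + len(sd)] == sd:
                hits.append((aug, sd_index, gap))
        aug = mrna.find("AUG", aug + 1)
    return hits


# Hypothetical toy transcript: an SD motif, a first AUG lying too close
# to it, and a second in-frame AUG three codons downstream.
toy = "CC" + "GAGG" + "AUG" + "CCUCUC" + "AUG" + "GCC"

hits = find_sd_linked_starts(toy)
print(hits)
```

In the toy transcript only the second AUG is reported, because the first one directly follows the motif; this mirrors, in a simplified way, the argument that the downstream in-frame AUG is the better candidate for the authentic initiation codon.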
Moreover, when a transcript containing the leader L7Ae mRNA was incubated with increasing amounts of recombinant L7Ae protein, the formation of an RNA-protein complex was detected, indicating that L7Ae protein interacts with its own mRNA (Figure 3.20B). To determine if the putative L7Ae binding site identified in the 5'-UTR region of the L7Ae mRNA constitutes the only protein-binding site of this molecule, gel retardation assays were performed using two different L7Ae mRNA mutants: Δ5'-UTR and D-5'-UTR. In the first mutant, the complete sequence of the 5'-UTR was deleted; in the second, the K-turn motif in the 5'-UTR was disrupted by replacing the G:A base pairs of the motif by C:C pairs. Neither the Δ5'-UTR nor the D-5'-UTR transcript was able to form a complex with L7Ae protein, demonstrating that the only protein-binding site in the leader L7Ae mRNA is located in the 5'-UTR and that deletion or disruption of this motif abolishes the interaction between the protein and the mRNA (Figure 3.20B).

To map the location of the L7Ae-L7Ae mRNA interaction, a modified toeprinting assay was performed as described in section 2.2.10. Figure 3.20C illustrates the results of the toeprinting experiment. A toeprint signal at position -3 (one nucleotide 3' from the predicted L7Ae binding site) was detected when the leader L7Ae mRNA was pre-incubated with increasing amounts of L7Ae protein prior to reverse transcription (Figure 3.20C, lanes 3-7). In contrast, when no L7Ae protein was added to the reaction mix the toeprint signal was not observed and only the 5' end of the L7Ae mRNA was detected (Figure 3.20C, lanes 1 and 2). Taken together, these observations prompted us to hypothesize that the interaction between L7Ae protein and the 5'-UTR of its own mRNA might be part of a regulatory mechanism in which the expression of L7Ae protein is autoregulated by a feedback mechanism.
3.4.3 Role of L7Ae in the regulation of gene expression

3.4.3.1 Archaeal ribosomal proteins

The archaeal ribosomal proteins (r-proteins) are more similar in sequence to their eukaryotic homologues than to their bacterial homologues. In contrast, the organization of r-protein genes in archaeal genomes is more similar to that in bacterial genomes, in which about half of the r-protein genes are clustered in two large operons whose structure is largely conserved in most species. In Archaea, over one-third of the r-protein genes are included in a few large clusters closely resembling the bacterial spc and S10 operons in the type and order of genes (Ramirez et al., 1993). Information about the transcription patterns of the archaeal r-protein clusters is scarce, making it difficult to tell the extent to which they are organized into functional operons that may resemble the bacterial ones. Even less studied are the mechanisms employed by Archaea to regulate the synthesis of r-proteins.

3.4.3.2 In vitro translation of L7Ae protein

Experimental evidence has demonstrated the presence of an L7Ae-binding motif in the 5'-UTR of the L7Ae mRNA and of several other mRNAs, suggesting that this RNA motif may function as a regulatory element. In vivo genetic analyses to test the functional role of the interaction between L7Ae protein and its own and other mRNAs are not yet feasible, because transformation systems for hyperthermophilic Archaea such as S. solfataricus are not available. Nevertheless, a S. solfataricus cell-free in vitro translation system, previously used to faithfully translate mRNA transcripts at high temperatures, constitutes a powerful tool for studying the role of L7Ae protein in the regulation of gene expression (Ruggero et al., 1993). For the in vitro translation assays, the E. coli chloramphenicol acetyltransferase (CAT) mRNA was used as a control. The plasmid containing the E. coli CAT gene was obtained from Ambion.
The SD motif of the CAT mRNA (5'-GAGGA-3') is located 8 nt 5' of the initiation codon and exhibits perfect complementarity to the 3' end of S. solfataricus 16S rRNA. Band shift assays showed that there is no interaction between the L7Ae protein and the CAT mRNA (data not shown).

A S. solfataricus in vitro translation system supplemented with [35S]-methionine was programmed with either wild-type L7Ae mRNA or CAT mRNA transcript as indicated in section 2.2.12. The cell lysate was incubated for 10 min at 73°C prior to its addition to the translation mix to reduce the translation of endogenous mRNAs present in the lysate. Analysis of the translation products by SDS-gel electrophoresis revealed the presence of two bands, a 14 kDa band and a 25 kDa band, corresponding to the molecular weights of the L7Ae and CAT proteins, respectively (Figure 3.21A). These results show that both mRNA transcripts are translated efficiently in the S. solfataricus in vitro system.

The translation efficiency was determined by trichloroacetic acid (TCA) precipitation. Aliquots were withdrawn from the translation mixture at different time intervals, the translation products were precipitated and the incorporated radioactivity was measured as described in section 2.2.12. [35S]-methionine incorporation increased considerably upon the addition of leader L7Ae or CAT mRNA transcripts to the translation reaction, reaching maximal incorporation for both mRNA transcripts after 25 min of incubation (Figure 3.21B).
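The SD criterion applied to both the L7Ae and CAT transcripts (a GAGG-type motif located a set number of nucleotides 5' of the initiation codon) can be expressed as a simple sequence scan. The Python sketch below is illustrative only and is not the procedure used in this work. The function name, the toy leader sequence and the convention of counting the gap as the nucleotides lying between the 3' end of the motif and the A of the AUG are all assumptions.

```python
import re


def find_sd_start_pairs(rna, motif="GAGG", min_gap=8, max_gap=10):
    """Return (motif_index, aug_index, gap) for each AUG preceded by the motif.

    Assumption: 'gap' is the number of nucleotides between the last
    nucleotide of the motif and the A of the AUG codon. Indices are 0-based.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    # The zero-width lookahead reports every AUG, including overlapping ones.
    for match in re.finditer(r"(?=AUG)", rna):
        aug = match.start()
        for gap in range(min_gap, max_gap + 1):
            sd = aug - gap - len(motif)
            if sd >= 0 and rna[sd:sd + len(motif)] == motif:
                hits.append((sd, aug, gap))
    return hits


# Hypothetical toy leader, NOT the L7Ae or CAT sequence: the first AUG has
# no upstream motif, the second AUG has GAGG with a 9-nt gap.
TOY_LEADER = "UUAUGCCGAGGUUAACUCUUAUGGCAUAA"

if __name__ == "__main__":
    for sd, aug, gap in find_sd_start_pairs(TOY_LEADER):
        print(f"motif at {sd}, AUG at {aug}, gap {gap} nt")
```

With the toy sequence only the second AUG is reported, which mirrors the kind of evidence used to prefer one candidate initiation codon over another. The scan considers spacing only and ignores reading frame and RNA secondary structure.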
When the in vitro translation system was programmed with the leaderless L7Ae mRNA transcript, Δ5'-UTR, or with the L7Ae mRNA transcript carrying the disrupted L7Ae binding site, D-5'-UTR, no translation products were detected (data not shown), suggesting that the translational efficiency of these mRNAs is much lower than that of the leader-containing L7Ae mRNA. Indeed, Condo et al. (1999) demonstrated that leaderless mRNAs were less efficiently translated than leader-containing mRNAs in the S. solfataricus in vitro system. Northern hybridization analysis of samples withdrawn from the translation mixture at different times showed that all the L7Ae transcripts and the control transcript used in the in vitro translation experiments have essentially the same decay kinetics. All of them were fairly stable, as about 40% of the full-size molecules were still observed after 40 min of incubation (data not shown).

3.4.3.3 In vitro translation of other L7Ae-interacting mRNAs

Two sequences containing the 5' ends of the formate hydrogenlyase subunit 4 (HydD) and transposase 1190 mRNAs were recovered in the L7Ae-associated library. Band shift assays demonstrated that these two mRNAs are able to interact with L7Ae protein and that, as in the L7Ae mRNA, the L7Ae-binding site is located in the 5'-UTR of the mRNAs. Transcripts containing the complete 5'-UTR sequence and the protein-coding region of each gene were used as templates in in vitro translation reactions. No protein band corresponding to the translation product of either transcript was detected. This result could not be attributed to the stability of the transcripts, since northern blot analyses showed that both mRNA transcripts were still present in the reaction mix after 40 min of incubation.
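The stability figure quoted in section 3.4.3.2 (about 40% of full-size transcript remaining after 40 min) can be converted into an approximate half-life if single-exponential (first-order) decay is assumed. That assumption is not established by the data reported here, and the function below is a hypothetical helper, so the sketch should be read as a back-of-the-envelope estimate only.

```python
import math


def half_life_from_fraction(fraction_remaining, elapsed_min):
    """Estimate a half-life (min) from one time point.

    Assumption: first-order decay, N(t) = N0 * exp(-k * t), so that
    k = -ln(fraction) / t and t_half = ln(2) / k.
    """
    if not 0.0 < fraction_remaining < 1.0:
        raise ValueError("fraction_remaining must lie strictly between 0 and 1")
    if elapsed_min <= 0:
        raise ValueError("elapsed_min must be positive")
    decay_constant = -math.log(fraction_remaining) / elapsed_min
    return math.log(2) / decay_constant


if __name__ == "__main__":
    # About 40% of full-size molecules remaining after 40 min of incubation.
    estimate = half_life_from_fraction(0.40, 40.0)
    print(f"approximate half-life: {estimate:.1f} min")
```

Under this assumption the transcripts would have a half-life of roughly 30 min, which is longer than the 25 min needed to reach maximal [35S]-methionine incorporation and is consistent with the statement that transcript instability does not explain the absence of translation products.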
The secondary structure of the 5' end of the HydD and transposase 1190 mRNAs might provide an explanation for the lack of translation products for these transcripts, as translation efficiency is affected by the structure of the mRNA (Jacques and Dreyfus, 1990). The 5' end of both mRNAs exhibits a long and stable stem-loop structure (Figure 3.8A and B) in which the SD sequence and the initiation codon are sequestered within the stem and the L7Ae binding site is located at the base of this stem. It is possible that under the conditions used in the translation assay (with an unusually high Mg2+ concentration of 20 mM) this stem-loop structure is stabilized, preventing the ribosome from interacting with the mRNA and consequently affecting the synthesis of the protein. In addition, it has been demonstrated that at a concentration of 20 mM Mg2+ the kinked conformation of the K-turn motif is stabilized, causing a sharp bend in the RNA molecule (Matsumura et al., 2003).

3.4.3.4 In vitro auto-repression assays with recombinant L7Ae protein

The question of whether the ribosomal protein L7Ae uses a feedback inhibition mechanism to regulate its own expression was addressed by in vitro auto-repression assays. In these experiments, increasing amounts of recombinant L7Ae protein added to the translation reaction are expected to repress the synthesis of L7Ae protein in a dose-dependent manner. For these experiments, L7Ae protein was resuspended in a buffer containing 20 mM Tris pH 7.4 and 10 mM KCl instead of the sodium phosphate buffer, as it has been demonstrated that monovalent ions, and sodium in particular, inhibit translation in the S. solfataricus system by hindering the formation of the 70S monomers from the free subunits (Londei et al., 1986). L7Ae protein was added to the translation reaction containing the leader L7Ae mRNA or CAT mRNA transcripts in 1:0.5, 1:1 and 1:5 RNA/protein molar ratios (Figure 3.22A and B).
These experiments showed that addition of purified recombinant L7Ae protein in the µM range nonspecifically inhibited the in vitro translation reaction. Furthermore, this inhibitory effect was more prominent for the reactions containing L7Ae mRNA than for those containing CAT mRNA. Indeed, at 2 µM recombinant L7Ae protein the translation activity of the reaction with the L7Ae mRNA was similar to the background translation (Figure 3.22A, lanes 1 and 5), whereas at the same concentration of recombinant L7Ae protein some translation product was still detected in the reactions with CAT mRNA (Figure 3.22B, lanes 1 and 5). When both the L7Ae and the CAT mRNA transcripts were added to the same translation reaction, the inhibitory effect of L7Ae protein was more evident. In the translation reaction containing both transcripts, the synthesis of CAT protein was inhibited at a lower concentration of recombinant L7Ae protein than in the reaction containing CAT mRNA only, indicating that the newly synthesized L7Ae protein also contributes to the inhibition of translation (Figure 3.22C). This observation can also explain why translation inhibition is attained with lower concentrations of L7Ae protein in the reactions with L7Ae mRNA than in the reactions with CAT mRNA.

Prior to its addition to the translation mix, the recombinant L7Ae protein was tested in band shift assays with both the L7Ae and the CAT mRNAs, and no degradation of the transcripts was detected. Therefore the absence of translation products in the reactions containing recombinant L7Ae protein is due to inhibition of translation and not to degradation of the mRNA template by contaminants present in the L7Ae protein preparation. It is not clear why addition of recombinant L7Ae protein affected the protein-synthesizing activity of the system.
However, since L7Ae protein is a component of the ribosome, it is possible that an excess of this protein in the translation reaction interfered with the translational machinery. Since E. coli does not have an L7Ae protein homolog, I intended to use a heterologous in vitro transcription/translation system from E. coli (Ambion) for the auto-repression assays. However, when the in vitro transcription/translation system was programmed with the linearized pET14b plasmid containing the L7Ae gene, no translation products were detected, suggesting that the L7Ae mRNA was either not translated in the E. coli system or was quickly degraded (Figure 3.22D).

3.5 Chapter 3 Figures

Figure 3.1. Sucrose gradient sedimentation of particles containing the ribosomal protein L7Ae from Sulfolobus solfataricus cell extracts. A S. solfataricus lysate was layered onto a 10%-30% sucrose gradient and sedimented in a SW27 rotor (10°C, 18 K, 16 hr). Fractions (1.5 ml) were collected from top to bottom of the gradient. (A) Western blot analysis was performed on an aliquot from every second fraction between 2 and 18 using S. solfataricus anti-L7Ae antibodies. The position of the cellular L7Ae polypeptide throughout the gradient is indicated. (B) A second aliquot (200 µl) from every second fraction was subjected to immunoaffinity chromatography using S. solfataricus anti-L7Ae sera. The RNA present in the immunoprecipitates was recovered by phenol extraction, pCp end-labeled, separated on an 8% denaturing polyacrylamide gel and visualized by autoradiography as described in section 2.2.14. The sizes of the predominant RNA species are indicated on the right.

Figure 3.2.
Ribosomal protein L7Ae copurifies with the box C/D-associated Fib and Nop5 proteins. Cell extracts from S. solfataricus were subjected to immunoaffinity chromatography using S. solfataricus anti-Fib, S. acidocaldarius anti-Nop5 or S. solfataricus anti-L7Ae. Unfractionated cell extract or the eluted complexes (EC) were subjected to SDS gel electrophoresis and analyzed by western blotting with S. solfataricus anti-L7Ae as indicated in section 2.2.13.5. Unfractionated cell extract or the antibody used for affinity purification is indicated at the top of each lane.

Figure 3.3. sR101: a new canonical C/D box sRNA. (A) The predicted structure of sR101 showing the two putative K-turn motifs formed by the C and D, and C' and D', elements, and illustrating the complementarity between the D box guide and the target RNA transcript. The predicted site of methylation is indicated by the boxed U in the target RNA. Uniformly labeled sR101 was mixed with L7Ae (L) (B), or with L7Ae and Nop5 (LN) and L7Ae, Nop5 and Fib (LNF) (C) recombinant proteins. The reactions were incubated for 10 min at 70°C and the complexes were separated on a nondenaturing 6% polyacrylamide gel and visualized by autoradiography. The concentration of the individual proteins used in the assay is indicated above each autoradiograph. The positions of the input RNA and of the resulting protein/RNA complexes are shown on the right: Complex I, L7Ae-RNA; Complex II, L7Ae-Nop5-RNA; Complex III, L7Ae-Nop5-Fib-RNA.
The sR101 complex III RNP (D) was assayed for methylation activity (filled circles) as described in section 2.2.11 and Omer et al. (2002), using Sac sR1 complex III (filled diamonds) and sR1 complex II (minus Fib, open triangles) as controls.

Figure 3.4. sR107 and sR108: atypical C/D box RNAs. The predicted secondary structures (left panels) of the sR107 (A) and sR108 (B) sRNAs and the interaction of these RNAs with recombinant L7Ae (L), L7Ae+Nop5 (LN) and L7Ae+Nop5+Fib (LNF) (right panels), as assayed by the band-shift experiments described in section 2.2.13.4, are illustrated. The GA/GA interactions that form part of the K-turn motif are represented as dashed arcs in sR107, and the possible K-turns formed by the C/D-like motifs are boxed in sR108. The protein concentrations used in the band-shift assays are indicated above the autoradiograms, and the positions of the free RNAs and of the RNA/protein complexes are indicated on the right: Complex I, L7Ae-RNA; Complex II, L7Ae-Nop5-RNA; Complex III, L7Ae-Nop5-Fib-RNA. Methylation assays (C) were performed as described in section 2.2.11 using in vitro-assembled S. acidocaldarius (Sac) sR1 (filled diamonds), sR107 RNA (open squares) or sR108 (filled triangles) RNP complexes and suitable complementary target RNA oligonucleotides. A mutant sR107 (D) containing the D box guide region substituted with the guide sequence of the Sac sR1 RNA (italics) was constructed and tested for methylation activity (C, filled circles).

Figure 3.5. Structure and properties of sR107 RNA. (A) The last 60 nt (shown in grey) of a 190 nt-long fragment were recovered in the L7Ae library. This fragment possesses many of the canonical features of the C/D box sRNAs, including the conserved C, D, C' and D' boxes (black line).
The proximal 130 nt (shown in green) of this RNA were recovered as two nearly identical variants in a second library from S. solfataricus (Tang et al., 2005). (B) The RNA exhibits regions of imperfect complementarity (illustrated as green rectangles in (A) and as green boxes in (B)) to the 3'-UTR of the mRNA encoded by the SSO2103 gene (pink boxes). The D box region near the 3' end of the sRNA (grey box) exhibits a 9 nt complementarity to the mRNA. At present, there is no evidence to suggest that this RNA functions as a methylation-guide sRNA. The imperfect complementarity between the ncRNA and the mRNA is reminiscent of eukaryotic microRNAs that mediate translational inhibition. (C) A single copy of this ncRNA is found in an intergenic region of the S. tokodaii genome. In this organism, the distal part of the ncRNA (grey box) exhibits an extended and almost perfect complementarity to the 3'-UTR of a gene that encodes a protein of unknown function.

Figure 3.6. sR108 RNA. (A) The conserved repeated sequence (grey box), the length of the in vivo transcript relative to the cDNA clone (solid lines), and the presence of homologous DNA sequence in the genome of S. tokodaii (hashed box) for sR108 are shown.
(B) The genomic context of all fourteen sR108 gene copies in S. solfataricus is given (the schematic representation is not to scale). The sR108 cloned fragment is represented as a pink arrow. The grey arrows represent the flanking annotated genes. The numbers above the lines and at each side of the sRNA gene indicate the distances between the sRNA gene and its flanking genes; a number in parentheses indicates the length of overlap between the two genes. The gene ID number of each flanking ORF is indicated at the top of each grey arrow, whereas the name of the protein product is indicated below.

Figure 3.7. sR109 RNA: an H/ACA box pseudouridylation guide sRNA. (A) The predicted secondary structure of sR109 is indicated. The ACA motif is highlighted, the pseudouridylation pocket is indicated by circle arcs, the nucleotides predicted to form a duplex with the rRNA target are in grey and the predicted K-turn motif is boxed. (B) Pseudouridine at position U2598 in 23S rRNA was detected using the CMC primer extension assay as described in Bakin and Ofengand (1993). Total RNA was either CMC modified (+) or mock treated (-), partially hydrolyzed by alkaline treatment, and used as template for reverse transcription. Lanes A, G, C and T are a complementary DNA sequence ladder generated from the 23S rRNA gene using the same primer as in the primer extension assay. The position of the pseudouridylated nucleotide in the 23S RNA is indicated. The arrow indicates the pause site in the CMC treatment lanes.
The bands of the sequence ladder are retarded relative to the -/+ CMC lanes due to the presence of the 5' [32P]-phosphate group used to label the primer. (C) The gel-retardation analysis of the sR109 RNA/L7Ae interaction, performed as indicated in section 2.2.13.4, is illustrated. The protein concentrations used in the assays are indicated above the autoradiogram, and the positions of the free RNA and of the RNA/protein complex are indicated on the right. (D) Mutations were introduced into the putative K-turn of sR109 by placing C residues at the two GA:GA sheared base pairs; only the K-turn-forming region is shown. (E) Gel-retardation analysis showing the loss of L7Ae binding is illustrated.

Figure 3.8. Sense strand sRNAs overlapping the 5' end of annotated ORFs. The predicted structures of sense strand sRNAs sR110 (A), sR111 (B), sR112 (C) and sR114 (D) are illustrated. sR110 overlaps the distal part of the 5'-UTR of the mRNA encoding subunit four of the formate hydrogenlyase.
sR111 and sR112 overlap the 5'-UTR of transposase-encoding mRNAs, whereas sR114 overlaps the 5'-UTR of L7Ae mRNA. The initiation codons are highlighted, the Shine-Dalgarno motif is in grey, the K-turn motifs are boxed and the promoter-like sequences located upstream of the 5' end of the cDNA are indicated. Gel-shift assays performed in the presence of sR110-sR112 sRNA alone or with successive additions of L7Ae (L), Nop5 (LN) and Fib (LNF) recombinant proteins are shown in the left side panels. Gel-shift assays with sR114 sRNA were only performed with L7Ae protein. The concentration of the individual proteins used in the assay is indicated above each autoradiograph. The positions of the input RNA and of the resulting protein/RNA complexes are shown on the right.
Figure 3.9. Sense strand sRNAs overlapping the 3' end of annotated ORFs. (A) The predicted secondary structure of sR115 is depicted (right panel). The termination codon is highlighted, the K-turn motif is boxed and the terminator-like sequence located downstream of the 3' end of the cDNA is indicated. The ability of sR115 to interact with L7Ae protein and the other two methylation-guide proteins, Nop5 and Fib, was tested in band shift assays as described in section 2.2.13.4 (left panel). (B) A mutated sR115 RNA was tested for its ability to interact with L7Ae protein in standard band shift assays (left panel). In this RNA the nucleotides forming the G:A bp and the protruding U of the predicted K-turn motif were changed to Cs (right panel); only the structure of the mutated K-turn motif is shown. (C) The secondary structure of the truncated sR115 recovered in the library is shown (right panel). The K-turn motif at the terminal loop is boxed and the stop codon is highlighted. Band shift assays performed with this RNA and L7Ae protein are shown on the left. For the left panels of (A), (B) and (C), the concentration of the individual proteins used in the assays is indicated above each autoradiograph. The positions of the input RNA and of the resulting protein/RNA complexes are shown on the right. (D) DNA sequence alignment of the region spanning the sR107 cDNA with conserved regions present in the genomes of S. tokodaii and T. volcanium and with two less related regions in the genome of S. solfataricus. The size of the sequence used in the alignment corresponds most probably to the length of the encoded in vivo transcript (see text).
Gaps in the alignment are indicated by a dashed line; the 5' end of the sR107 cDNA is shown by a black arrowhead. In S. solfataricus the sR107 cDNA partially overlaps the annotated transposase 1974; a homologue of transposase 1974 is present in the S. tokodaii genome. In T. volcanium, a non-protein-coding intergenic region, designated Tvo INTG, exhibits a high degree of sequence conservation to sR107; two additional regions in S. solfataricus appear to be related to the sR107-encoding DNA: one encodes an inactivated form of transposase 1974 and the other is situated in a non-protein-coding region. The corresponding amino acid sequences of the S. solfataricus and S. tokodaii transposase 1974 homologs are shown above the DNA alignment; the translation frame of the inactivated form of transposase 1974 is indicated below the alignment. In the alignment, D'-like and D-like boxes are in bold and underlined.

Figure 3.10. sR125 RNA is an intermediate resulting from ligation of the processed pre-rRNA spacers. (A) sR125 RNA contains the ITS and the 3' ETS of the rRNA primary transcript.
The proposed secondary folding of this RNA is adapted from Tang et al. (2002b). The predicted K-turn formed by juxtaposition of the C- and D-like boxes is boxed. The position of the endonucleolytic cleavage that liberates the 23S rRNA subunit is indicated by an arrow. (B) Gel-shift assays with sR125 sRNA were performed as described in section 2.2.13.4 with increasing amounts of L7Ae protein. The concentration of the L7Ae protein used in the assay is indicated above the autoradiograph. The positions of the input RNA and of the resulting protein/RNA complexes are shown on the right. (C) Schematic representation of the S. acidocaldarius and (D) S. tokodaii 23S processing stems. The nucleotides forming the predicted K-turn motifs are boxed and the 23S rRNA sequence is represented by a loop at the top of the processing stem.

Figure 3.11. sR126 RNA. (A) The predicted secondary structure of sR126 is illustrated. The nucleotides forming the predicted K-turn motif are boxed. (B) Gel-shift assays with sR126 sRNA and increasing amounts of L7Ae protein were performed as indicated in section 2.2.13.4. The concentration of the L7Ae protein used in the assay is indicated above the autoradiograph. The positions of the input RNA and of the resulting protein/RNA complexes are indicated on the right.
Figure 3.12. Antisense sRNAs. (A) Two sRNAs, sR132 and sR133, are antisense to C/D box sRNAs. sR132 RNA displays full-length complementarity to the previously characterized sR4 RNA (Omer et al., 2000), whereas sR133 corresponds to an RNA antisense to a novel C/D box RNA, designated sR134. The RNA target of sR134 is not known. Both antisense sRNAs are longer than their cognate C/D box sRNAs. The C, C', D' and D boxes of the sense C/D box sRNAs are indicated. (B) sR130 sRNA exhibits perfect complementarity to the 3' end of transposase 1439. On the bottom panel, the sR130 sequence is aligned with its respective mRNA target. The protein-coding regions are shown in grey and the termination codon is highlighted. The nucleotides forming the K-turn motif are boxed.
On the top panel, the predicted secondary structure of sR130 is shown, with the single-stranded regions base pairing with the complementary regions of the mRNA target. The predicted K-turn motif is boxed. (C) The results of band shift assays performed with sR130 sRNA and increasing amounts of L7Ae protein are shown. The concentration of the L7Ae protein used in the assay is indicated above the autoradiograph. The positions of the input RNA and of the resulting protein/RNA complexes are indicated on the right.

Figure 3.13. Secondary structure of 7S SRP RNA. The nucleotide sequence and secondary structure of the S. solfataricus (A) and human (B) 7S SRP RNAs are shown. The different helices of the 7S RNA are indicated by numbers. The region of the S. solfataricus 7S RNA where the K-turn motif is located is boxed. The positions of hinge 1 and hinge 2 in the human 7S SRP RNA are boxed. Hinge 1 separates the S domain (helices 5, 6, 7 and 8) and the Alu domain (helices 2, 3 and 4) of 7S RNA and facilitates a major kink in the RNA backbone. The bend in the hinge 2 region allows the Alu domain to interact with the elongation factor site in the ribosome (Halic et al., 2001).
Figure 3.14. 7S SRP RNA associates with the L7Ae protein. (A) Line drawings of the predicted secondary structure of full-length 7S RNA (nucleotides 1-311) and of two variants with deletions at their 5' ends (nucleotides 89-311 and nucleotides 135-311). The position of a putative K-turn in the middle of helix 5 is indicated by the box and arrow in the first two structures; the motif is disrupted by the deletion in the third structure. (B) Each of the three RNAs was tested for binding to L7Ae using the gel-shift assay described in section 2.2.13.4. The concentration of L7Ae used in each assay is indicated above each autoradiograph. The positions of the input RNA and of the resulting protein/RNA complexes are shown on the right. (C) The predicted structure of the 7S RNA in the region of the putative K-turn motif is boxed. (D) A modified primer extension (toeprinting) assay was used to map the L7Ae binding site on the 7S RNA, as detailed in section 2.2.10. The RNA substrate and the concentration of recombinant L7Ae protein used in each reaction are indicated below and above the autoradiograph, respectively; a DNA sequencing ladder was generated in a standard dideoxy sequencing reaction with the same primer used for the primer extension. The arrow in (C) marks the position of the block to reverse transcription when L7Ae protein is bound to the 7S RNA.

[Figure 3.15, panel A: competition band-shift autoradiograph with L7Ae (400 nM), labeled 7S RNA and unlabeled 7S RNA at 50, 100, 200 and 500 nM. The image is not recoverable from the text extraction.]
[Figure 3.15, panels B-D: competition band-shift autoradiographs with L7Ae (400 nM), labeled 7S RNA and (B) unlabeled antisense 7S RNA at 50, 100, 200 and 500 nM, (C) unlabeled truncated 7S RNA at 50, 100, 200 and 500 nM, or (D) unlabeled antisense 7S RNA at 50, 150, 300 and 500 nM without protein. The images are not recoverable from the text extraction.]

Figure 3.15. Sense-antisense 7S RNA interaction. Uniformly labeled 7S RNA (50 nM) was mixed with 400 nM L7Ae protein and increasing amounts of unlabeled 7S RNA (A), unlabeled antisense 7S RNA (B) or a truncated 7S RNA containing nucleotides 135-311 (C), as indicated in section 2.2.13.4. The reactions were incubated for 10 min at 70°C, and the resulting complexes were separated on a non-denaturing 4.5% polyacrylamide gel and visualized by autoradiography. The amount of each unlabeled competitor used in the assay is indicated at the top of each autoradiogram. The positions of the input RNA and of the resulting RNA/RNA or RNA/protein complexes are shown on the right. (D) A constant amount of labeled 7S RNA was incubated with increasing amounts of antisense 7S RNA under the same conditions. The reaction products were separated on a 4% polyacrylamide gel and visualized by autoradiography. The positions of the unbound 7S RNA and of the RNA-RNA complex are shown on the right.

Figure 3.16. Structure of the C/D box K-turn motif. (A) Sequence and secondary structure of the S. acidocaldarius sR1 C/D box K-turn motif. The C and D boxes are highlighted and stem II is indicated. (B) Tertiary structure of the C/D box K-turn used to solve the crystal structure of the L7Ae-bound C/D box RNA (adapted from Dennis and Omer 2005).
The G:A base pairs are indicated by dashed lines and the protruding U nucleotide is shown.

[Figure 3.17: structures of wild-type and mutant sR1 RNAs with the corresponding methylation activities (axis label: % incorporation). The image is not recoverable from the text extraction.]

Figure 3.17. Structure and methylation activity of sRNPs containing wild-type and mutant sR1 sRNAs. The structures of the wild-type and mutant sR1 RNAs are shown on the left of the figure. Only the sequences of the C/D and C'/D' K-turns are shown. The 5' and 3' terminal nucleotides and the spacer regions are represented by thin solid lines. The D box guide sequence that is complementary to positions 58-49 of the 16S rRNA is represented as a thick solid line, and the A residue that base pairs with the U52 methylation site in rRNA is boxed. The mutated nucleotides of the C/D and/or C'/D' boxes are shown in white. (A) Wt, wild-type sR1; (B) CP1A, mutant in which the normal 5' and 3' ends of the sR1 RNA are joined between nucleotides A2 and U54 and new 3' and 5' ends are introduced between nucleotides U27 and A28. A G(28)A mutation was introduced to provide a base pair to close the terminal loop; (C) CP2, mutant in which the normal 5' and 3' ends of the sR1 RNA are joined between nucleotides A2 and U54 and new 3' and 5' ends are introduced between nucleotides G14 and U15; (D) CPOLYC, sR1 mutant in which the nucleotides between positions 31 and 33 in the C box were replaced by Cs; (E) D'POLYC, sR1 mutant in which the nucleotides in the D' box were replaced by Cs; (F) C'/D'POLYC, sR1 mutant in which the nucleotides between positions 23 and 33 were replaced by Cs; (G) C/DPOLYC, sR1 mutant in which all the nucleotides in the C and D boxes were replaced by Cs; (H) G32C, sR1 mutant containing a G(32)C mutation; (I) U31A, sR1 mutant containing a U(31)A mutation; (J) Kt38 IN C/D, sR1 mutant in which the nucleotides forming the C box were replaced by CCGUUUGAC and the nucleotides forming the D box were replaced by CGAGG, such that the C/D K-turn is replaced by
K-turn 38 from H. marismortui 23S rRNA (Klein et al., 2001); (K) Kt15 IN C/D, sR1 mutant in which the nucleotides forming the C box were replaced by GGAAUCCGUGG and those forming the D box were replaced by CCAAACCC, such that the C/D K-turn is replaced by K-turn 15 from S. solfataricus 23S rRNA. The unusual A:U:A base triple in the terminal K-turn 15 is indicated by a dashed line. Wild-type or mutant sR1 sRNAs were assembled into RNP complexes with L7Ae, Nop5 and Fib, and the methylation activity of each complex was tested in vitro using the standard reaction conditions described in section 2.2.11. The results of the in vitro methylation assay with each of the sR1 constructs are shown on the right side of the figure. The in vitro methylation reaction containing the Kt15 sRNA was assayed at a low (1 mM) and a high (10 mM) concentration of Mg2+.

[Figure 3.18: secondary structure models of wild-type sR1 (Sac sR1; C, D', C' and D boxes marked) and of the circularly permutated mutants, with band-shift autoradiographs (L7Ae at 0, 25, 50, 100, 200 and 400 nM; lanes L, LN and LNF; complexes I, II and III). The image is not recoverable from the text extraction.]

Figure 3.18. Gel shift and methylation activity assays of wild-type sR1 and circularly permutated mutants. The secondary structure models of the wild-type and circularly permutated sR1 mutants are depicted on the left side of each panel. The C, D', C' and D boxes are highlighted in grey and the two sheared G:A base pairs in each K-turn are indicated by dashed lines. The A residue that base pairs with the U52 methylation site in the target rRNA is boxed. For wild-type sR1, the target sequence (grey) is shown base paired to the guide region. Base pairing between the guide sequence of the mutant sR1 RNAs and the target sequence is not depicted.
Uniformly labeled wild-type sR1 RNA (A) or the circularly permutated sR1 RNAs CP1 (B) and CP2 (C) were incubated with L7Ae protein (top autoradiogram, complex I) or with L7Ae (L), L7Ae and Nop5 (LN; complex II), or L7Ae, Nop5 and Fib (LNF; complex III) (bottom autoradiogram), as described in section 2.2.13.4. The RNA/protein complexes were separated in 6% native polyacrylamide gels and visualized by autoradiography. (D) To determine the methylation activity of each sR1-containing RNP complex, 120 pmol of sRNA (wild-type sR1 or the CP1 or CP2 mutant) were incubated with 120 pmol of RNA target, 4 pmol of each protein and 60 pmol of [3H-methyl]SAM. In the control reaction (diamonds), fibrillarin protein was omitted. Aliquots of 20 µl were removed at 2, 5, 10, 20 and 30 min. The methylated target RNAs were precipitated with 5% TCA, collected on filters and dried. The incorporation of [3H] radioactivity was measured by scintillation counting.

Figure 3.19. K-turn 38 and K-turn 15 sR1 mutants. sR1 sRNA constructs in which the C and D boxes were replaced by the K-turn 38 sequence from H. marismortui 23S rRNA or by the K-turn 15 sequence from S. solfataricus 23S rRNA. (A) The sequence and secondary structure of the sR1 C/D box (left), K-turn 38 (center) and K-turn 15 (right) are illustrated. (B) A labeled RNA fragment containing H. marismortui 23S rRNA K-turn 38 in the C/D box position was incubated with 600 nM of L7Ae (L), L7Ae and Nop5 (LN) and L7Ae, Nop5 and Fib (LNF). The RNA-protein complexes were separated under non-denaturing conditions and visualized by autoradiography as indicated in section 2.2.14. (C) A labeled RNA fragment (nucleotides 261-303 of the 23S rRNA) containing S. solfataricus 23S rRNA K-turn 15 (left panel) was incubated with increasing amounts of L7Ae protein using the standard conditions indicated in section 2.2.13.4.
An sR1 mutant RNA containing the K-turn 15 sequence in the C/D box position was incubated with 600 nM of L7Ae (L), L7Ae and Nop5 (LN) and L7Ae, Nop5 and Fib (LNF) (right panel). The RNA-protein complexes were separated under non-denaturing conditions and visualized by autoradiography as indicated in section 2.2.14. (D) The methyl incorporation activity of the wild-type sR1 or mutant sR1-containing complexes was assayed using either low (1 mM) or high (10 mM) concentrations of Mg2+ in the reaction, as indicated in section 2.2.11.

[Figure 3.20, panels A-C: sequence of the L7Ae 5'-UTR with the SD motif and K-turn (KT) marked, band-shift autoradiographs of the L7Ae mRNA transcripts, and a toeprinting autoradiograph (L7Ae at 0, 100, 200, 400, 600 and 800 nM; lanes 1-7). The image is not recoverable from the text extraction.]

Figure 3.20. L7Ae interacts with its own mRNA. (A) The sequence of the L7Ae 5'-UTR is shown. The nucleotides forming the putative L7Ae binding site are boxed and the 3'-most nucleotide of the toeprinted fragment is indicated by an arrow. The SD sequence motif is underlined and the initiation codon is highlighted. (B) The wild-type L7Ae mRNA transcript (left), the leaderless L7Ae mRNA transcript, Δ5'-UTR (center), or the L7Ae mRNA transcript with the disrupted L7Ae binding site, D-5'-UTR, were incubated in binding buffer with increasing amounts of recombinant L7Ae protein for 10 min at 70°C as indicated in section 2.2.13.4. The reaction products were separated on a native polyacrylamide gel and visualized by autoradiography. The amount of recombinant protein added is indicated at the top of each autoradiogram. The positions of the unbound mRNA transcript and of the resulting RNA-protein complex are indicated at the right. (C) Total RNA (2 µg) (lane 1) or in vitro-transcribed leader L7Ae mRNA (100 nM) (lanes 2-7) were incubated with increasing amounts of recombinant L7Ae protein.
A labeled primer complementary to nucleotides 46-27 of the L7Ae mRNA was annealed to the reaction products and extended by reverse transcription. The position of the toeprint signal is indicated by an arrow and the 5' end of the L7Ae mRNA is indicated by a star. The sequence of the L7Ae 5'-UTR and initiation codon is shown on the left. The SD motif is underlined and the AUG initiation codon is highlighted. The amount of L7Ae protein added to each reaction is indicated at the top of the autoradiogram.

[Figure 3.21: (A) autoradiograph of in vitro translation products with a 30 kDa size marker; (B) time course of incorporation, x-axis: time (min). The image is not recoverable from the text extraction.]

Figure 3.21. In vitro translation of mRNA transcripts. (A) Ten pmol of leader L7Ae mRNA (lane 2) or CAT mRNA (lane 3) transcripts were added to the S. solfataricus translation system and the reactions were incubated for 40 min at 73°C. A reaction to which no mRNA was added (lane 1) was used as a control to determine the background level of translation. Fifteen µl of the translation products were separated on a 14% SDS-polyacrylamide gel and visualized by autoradiography as indicated in section 2.2.14. The bands corresponding to the L7Ae and CAT proteins are indicated by an arrow. (B) The kinetics of [35S]-Met incorporation in the in vitro translation system was monitored at different time intervals. Translation reactions containing E. coli CAT transcript (triangles), leader L7Ae mRNA transcript (squares) or no added transcript (diamonds) were supplemented with [35S]-Met and incubated at 73°C as described in section 2.2.12. Aliquots (15 µl) were withdrawn every 8 min for 50 min and the incorporated radioactivity was precipitated with 5% TCA, filtered and counted in a scintillation counter as indicated in section 2.2.12.
[Figure 3.22, panels A-D: autoradiographs of SDS gels of in vitro translation products programmed with L7Ae mRNA (0.4 µM) and/or CAT mRNA (0.4 µM) in the presence of increasing micromolar amounts of L7Ae protein; size markers at 30, 20, 14 and 6.5 kDa; the L7Ae and CAT bands are marked. The image is not recoverable from the text extraction.]

Figure 3.22. Translation auto-repression assays with L7Ae protein. A S. solfataricus in vitro translation system supplemented with [35S]-Met was programmed with 10 pmol of either (A) leader L7Ae mRNA (lanes 2-5) or (B) E. coli CAT mRNA (lanes 2-5). Increasing amounts of purified recombinant L7Ae protein, resuspended in the same buffer used for the in vitro translation reaction (see text and section 2.2.12), were added to the translation mix. The reactions were incubated at 73°C for 40 min and the translation products were precipitated and separated in either a 16% (L7Ae) or a 14% (CAT) SDS gel as described in section 2.2.14.3. (C) Ten pmol each of L7Ae and CAT mRNA were added to a 25 µl in vitro translation reaction mix containing increasing amounts of recombinant L7Ae protein (lanes 2-5). The samples were incubated for 40 min at 73°C and the translation products were precipitated and separated in a 14% SDS gel as described in section 2.2.14.3. (D) An E. coli in vitro transcription/translation system was programmed with 1 µg of linearized plasmid containing either the L7Ae gene (lane 2) or the E. coli CAT gene (lane 3). The transcription/translation system was incubated at 37°C for 1 hr as indicated by the manufacturer. One fifth of the translation reaction was loaded on a 14% SDS gel to visualize the translation products. The amounts of recombinant L7Ae protein added to each in vitro translation reaction are indicated at the top of each autoradiogram.
The bands corresponding to the translation products of the L7Ae and CAT mRNAs are indicated at the right of each autoradiogram. In all panels, lane 1 corresponds to a translation reaction to which no mRNA transcript was added. A 14C-labeled protein size marker is shown on the left of each autoradiogram.

Clone  Sequence
Group 1. sRNAs with conserved sequence elements
C/D box sRNAs
sR101  GATGAT6AGAGGGTCCAATGATTTGATGATTATCGCTGGAAAAGCTGATAATT
sR102  GGATGAGGATTACGGGAGTACAACTGAGGAGAATGATGTAGTGGTCCTTCACTGATAAAA
sR103  CTGATGCTGAGAAGGGCTCGGCTTGATCTGTGATGAACCCGAGCCATGCTAACTGAG
sR104  GTGAATGATGAAAGTCATGCCAATCCGAAAAGATGAGGTAAACTCTGGGGTAACTGAGTT
sR105  GAGGATGACGAATCCGGGAACTGATCAGTGAGGACATGCGCGAACGCTGATTCAT
sR106  TTAGATGATGTGTGAACCCCGGACTGATAAATGATGTGCGCGTGTGAGGACTGAJC
Atypical C/D box sRNAs
sR107  AAGGGGTATGAGGACGAGGGGTTGAATACAAATTCATGAAATCTCAATGAAAGTCCGACCCCCAGC
sR108  CCCTCGACTAGCCCCAATGAGTTGAGGGTGAGTGTGTATGAGCCCTCAAGAGGGGTGCCC
H/ACA sRNA
sR109  GGGGATCTGGCGACACCCTTGGAGAACCCTAAATATTTAGGGATTCGATGAATTAGGGTCTCCAGAGATCCCCACAGC
Group 2.
Sense RNAs
sR110  GGGCTGATGACGCCTACTTCACCACAAAAGGGIGGGAAGTATGGCGCCAAATGAAGCTCAG
sR111  CAATTCGGACCGGAAGTTGGGCTTGAAGCCCGAATAAGGGTGAGTACACAIfiGAGGCCCCAGTCGC
sR112  ATGGGGTATCTAGTCCCTTGACAGCCCCGAAGAAGGGTATTGTATGAGGGGGACTTCCTGCCG
sR113  GJCACCCTCCTCTGGTCAACTCTTAGGGGATGAGGAGCGGGAGCCGACTCCTACTCCCGCAATCCCAGAGGAGGGTG
sR114  GCGTGACGAAACTAGAGGATGAACGCGAT
RNAs overlapping the stop codon of ORFs
sR115  CTCTCGACTGCCCCTCAAATGAGGGATGTAAACCCGAACCGATGAGGGGAACCCTCGACCCTTCCAGGGTGGGGAGCTTCCG
sR116  ATAAAACGAGTGAAAGGCTCACAGCCGAACGGGGCACATGA
sR117  TGTAACCCTTAAGACCTCGGGAACCCTCGCCCTTTAGGCGGGGAGGAGGTCAGGTTTTGCGGGCTACATGGTTCGCCGA
RNAs encoded within the ORF
sR118  CTCAAGATGTGCGGTTTTCCTCACATCCGTGATATTCCGCGGGTGTGGGTTGGGGTTATTCCGCTAATGGGGCGGAGAGGGA
sR119  GGACGACACCCACGACCACAAGCTCTACGCTAGGGCCATGCCGGTCTCCAGAAACGGGGCGCAGATCTTCT
sR120  TTAGGGTTTCGCTTCCGTGCATACACTGACGAACAAACCCTTAGGGCGTTAAAGGCCCAGTTGAAGTTAGCGTGCAAAATCT
sR121  AGTGCCTGTGGAGCTCTGCCCTCTACCAGTACTTCGGTACTGGCATGGTAGAGCTGTGAAGCAGGAAGCTCCCTCCT
sR122  CCGAGGCACCTCGGTTAAAGCGAGGGTCTATGAGGTACGCCTCTGCTCCAGAC
sR123  CTTAGTGCCTGTGGAGCTCCACCCTCTACCAGTTCTCTGGCAAGGTGGGGCTGTGAAGCAGGAAGCTCCCTCCT
Table 3.1 Sequence of the cDNA clones
Group 3.
sRNAs in intergenic regions
sR124  GGCCTGCGTGATTTTACATTAATCCCCGTGTACCGGGGTAAGATGATCGCAGGAGGGTCACCCCCG
sR125  CACAGATGCGATICTCCTCTTGGCCGGAGGAGAATTAAGAGCAAAGGGCCTATGGGAGATACTCTCCCTAAGGGCTCGATGA
sR126  GCCATACTCAGCAGACACCCGCGTGTCCCGTGGGAGAGTGAAGTTGAGGCCTCCTC
Group 4. Antisense RNAs
sR127  GAGGGACrrCUCICCGrrrrGGGFUF/ITCHGGTITGyiAACGGAGTTATGGAGTCCCTCGC
sR128  C66KCTGGCG4ilGTCrC6GrTGilKTrC6CCGGrr74GT16147TCTUIGCTTCATACATTCAGGTTC6CCCT
sR129  GGGGGAACTCGCTAACGGGrACGGArcrGCCACCCrCGACGACCCGCGCGCTGGACrrGAGrTCCCCC
sR130  CTCACGTGGTGAAAACGACGTGCATAGTCTCCTCTGAACACCGAGAAAGJJAG4GUGCTTG4C
sR131  AGAATCCCATGAAGGTTGAGAATGAAGGTCGTGTCAGCGACCGACTArGAGiiGrCTTCATGGGrGTTTrrTCCAAGATC
Antisense of predicted C/D box sRNAs
sR132  GGTCAGTTGCAGGGTAATTCATCAGGGTTCAGTTCTGACCCCACTCATCATTTCCCCGCTCCCCGA
sR133  TAGGGGAAAGGGGTCAGCTCCCAAGCCGTTCATCCTGGCTCAGTTCTGGCGTGGCTCATCATTCCCCTTTCCCCGT
Group 5. Fragments of 7S RNA
RNA1  GCCTGGACGCCAGCGTTCGCTGGTCAACAGCCAGAGTGAAACTGGGGTAAACCTATAGATAGGTAGGCCATGGGGTAGGGGG
RNA2  TGGCGGTCATGGGCTTTCTCTCCGGATGGAGAGAAAGTATCATGATATGTGGGGGAATCGGCGAGGCCCGGAAGGGAGCAGC

Table 3.1.
Sequence of the cDNA clones recovered from L7Ae immunocomplexes in S. solfataricus. From a total of 128 insert-containing clones, 45 distinct sequences have been identified and organized into six major groups based on the presence of known sequence and/or structural motifs and on their genomic location. The rRNA and tRNA fragments obtained in the library are not shown. Conserved sequence elements corresponding to the C, D', C' and D boxes in C/D box RNAs and to the ACA box in H/ACA box RNAs are boxed. Guide regions with identifiable RNA targets are highlighted in gray. The C/D box representatives are aligned using the boxes as anchors; dashed lines correspond to gaps in the alignment. The region of sR118 complementary to the D box guide of sR106 is highlighted. cDNA sequence overlapping either the sense or the antisense (italics) strand of annotated ORFs is shown in bold. The positions of the initiation and termination codons are highlighted and underlined, respectively.

Name  Nr  Size  Northern/RT  PE  BS L7Ae  BS N,F  Orientation  RNA position relative to predicted ORF
1. sRNAs with conserved sequence/structure elements
C/D box sRNAs
sR101  1  53  52  +  <<<
sR102  1  60  60  -  +  >>  Overlapping 5nt
sR103  1  57  60, 87  +  +
sR104  1  60  65  65  +  +  >>  Overlapping 8nt
sR105  1  55  55  55  +  +  >  Complementary to 3' end of ORF (4nt)
sR106  1  56  65  ND  +  >  Complementary to 3' end of ORF (2nt)
Atypical C/D box sRNAs
sR107  4  66  190  ND  +  +  <<<  44nt/414nt
sR108  3  60  200  135  +  +  <<<  313nt/449nt
H/ACA box sRNAs
sR109  2  78  250  -  >>  4nt from downstream ORF
2.
Sense RNAs
RNAs overlapping the 5' end of ORFs
sR110  2  61  960  ND  +  -  <<  30nt
sR111  1  66  110  ND  +  +  >>  17nt
sR112  1  63  700  ND  +  +  <<  1nt
sR113  6  77  1400  -  -  ND  >>  77nt
sR114  1  33  300  -  +  ND  >>  33nt
RNAs overlapping the 3' end of ORFs
sR115  7  82  -  132  +  +  <<  45nt
sR116  1  41  90  ND  -  ND  <<  4nt
sR117  1  79  + RT  -  -  ND  >>  11nt
RNAs encoded within ORFs
sR118  12  124  124, 150, 230  124  +  +  >>  124nt
sR119  1  71  1000  +  +  <<  71nt
sR120  1  105  140  ND  -  ND  >>  105nt
sR121  3  77  300  ND  +  +  >>  77nt
sR122  1  53  250, 330  ND  +  +  >>  53nt
sR123  1  74  330  ND  +  +  >>  74nt
3. RNAs in intergenic regions
sR124  8  66  73  ND  -  ND  ><<  128nt/31nt
sR125  4  128  130, 220, 270  101, 136  +  +  >>>  66nt/18nt
sR126  2  56  56  -  +/-  >><  22nt/9nt
4. Antisense RNAs
sR127  4  65  75, 110, 220  ND  -  ND  <  42nt complementary to 5' end
sR128  7  75  + RT  -  -  ND  <  51nt complementary to 5' end
sR129  3  68  75, 100  54  -  ND  >  Complementary to nt 612-679
sR130  1  64  + RT  -  +  +  <  14nt complementary to 3' end
sR131  2  79  + RT  -  +  +  >  46nt complementary to 3' end
Antisense of predicted C/D box sRNAs
sR132  1  66  + RT  -  -  ND  ><<  1201nt/1 24nt
sR133  2  76  big  -  -  ND  >><  44nt/88nt
5. Fragments of 7S RNA
RNA1  19  92  300  ND  -  ND  7S rRNA nt 220-311
RNA2  1  100  300  100  -  ND  7S rRNA nt 135-233
6. Fragments of rRNA or tRNA
RNA3  3  74  -  ND  16S rRNA nt 2 to 74
RNA4  2  64  -  ND  16S rRNA nt 1435 to 1496
RNA5  2  81  ND  23S rRNA nt 72 to 152
RNA6  5  64  -  ND  23S rRNA nt 2985 to 3049
RNA7  2  69  -  ND  tRNA-Glu GAA nt 1-69
RNA8  2  80  -  ND  tRNA-Lys AAA nt 1-80
RNA9  1  75  -  ND  tRNA-Val GUA nt 1-75
RNA10  1  62  -  ND  tRNA-Arg AGA nt 18-79
RNA11  1  26  -  ND  tRNA-Gly nt 51-76
RNA12  1  59  -  ND  tRNA-Thr ACC nt 1-59
RNA13  1  68  -  ND  tRNA-Ser UCA nt 1-68
Table 3.2. Analysis of the L7Ae-associated RNAs isolated from S.
solfataricus
Genomic locus  RNA function / ORF annotation  Possible target  Conserved in other Archaea
tRNA-Gly/SSO0686; SSO0650; SSO02602/SSO2603; SSO0907; SSO2468 (<); SSO0333 (<)
16S U435, D box; 16S A635, D' box; No rRNA or tRNA target; 23S G811, D' box; No rRNA or tRNA target; G52 tRNA, D' box
Sme Sto, sR18 Sac
SSO1739/SSO8813; multiple locations
Sto; Sto
SSO1026; SSO1684; SSO0131; SSO0801; SSO0091
ORF annotation
Formate hydrogenlyase subunit 4 (hycD); Partial transposon ISC1190 (782nt); Conserved hypothetical protein (710nt); Transposase ISC1476 (1367nt); Ribosomal protein L7Ae
SSO1974  Conserved hypothetical protein (1148nt)
SSO2269  Conserved hypothetical protein (663nt)
SSO0801  Transposase ISC1476 (1367)  Sto; Tvo
SSO3172  Second ORF in transposon ISC1904 (521nt)
SSO0980  Transposase ISC1217 (1064nt)
SSO2811  Transposase ISC1316 (1235nt)
SSO2887  Conserved hypothetical protein (305nt)
SSO1736  Hypothetical protein (336nt)
SSO1736  Hypothetical protein (336nt)
SSO10340/SSO2405; ITS-ETS 16S/23S rRNA; SSO2359/SSO2360
SSO2120 (>); SSO2562 (>); SSO0980 (<); SSO1317 (>); SSO2830 (<)
Transposase ISC1439 (966nt); First ORF in transposon ISC1359; Transposase ISC1217 (1064nt); Transposase ISC1439 (966nt); Hypothetical protein (1638nt)
SSO0507/SSO0508; SSO0782/SSO0783
7S RNA Acc NB: X17239; Sso 7S RNA; 7S RNA; 7S RNA
Contains Um52 target of sR1; Contains KT-7

Table 3.2. Analysis of the L7Ae-associated RNAs isolated from S. solfataricus. Clones that are not fragments of 7S RNA, rRNA or tRNA are assigned sR numbers; the 7S RNA, rRNA and tRNA clones are numbered separately.
The column designations are as follows: Nr, the number of identical clones sharing all or part of the core sequence (the longest clone sequence is shown in Table 3.1); Size, length in nucleotides of the longest clone; Northern/RT, estimated size of the expressed RNA as determined by northern hybridization, or detection of the RNA in vivo by RT-PCR analysis; PE, length in nucleotides of the extension products generated using a primer complementary to the 3' terminal sequence of the cDNA clone; BS L7Ae, band-shift analysis using recombinant L7Ae protein; BS NF, band-shift analysis using recombinant L7Ae, Nop5 and Fib proteins; Orientation, orientation of the RNA-encoding sequence (underlined) relative to annotated adjacent genes; RNA position relative to predicted ORF, either the distance in nucleotides between either of the sRNA ends and the nearest ORF boundary, or the overlap with a predicted ORF boundary; Genomic locus, annotated coding genes flanking or overlapping the RNA; Predicted target, the predicted RNA target for the novel C/D box RNAs and for the H/ACA RNA; ORF annotation, the predicted or known function of the ORF overlapping the novel RNA; Conserved in other Archaea, presence of orthologous sequences in other archaeal genera as identified computationally. The meanings of the table entries are: ND, not determined; [-], not detected; [+], detected; [+/-], uncertain result; [>], 5'-3' gene orientation; [<], 3'-5' gene orientation; and [> or <], orientation of the sRNA sequence. The species abbreviations are as follows: Sme, Sulfolobus metallicus; Sto, Sulfolobus tokodaii; Tvo, Thermoplasma volcanium; Dam, Desulfurolobus ambivalens.
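The antisense relationships listed in the table under "Antisense of predicted C/D box sRNAs" can be checked directly from the clone sequences. The following is a minimal Python sketch and not part of the original analysis: the helper names `reverse_complement` and `antisense_offset` are illustrative assumptions, and the two sequences are copied from Figure 3.12 (sR134) and Table 3.1 (sR133) as they appear in the text.

```python
# Minimal sketch (an illustration, not the method used in the thesis):
# test whether one cloned sequence is antisense to another by
# reverse-complementing the sense strand and searching for it.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def antisense_offset(antisense: str, sense: str) -> int:
    """Return the 0-based index in `antisense` at which the reverse
    complement of `sense` begins, or -1 if it is not present."""
    return antisense.upper().find(reverse_complement(sense))


# sR134 (sense C/D box sRNA) as printed in Figure 3.12, written 5'->3'.
SR134 = "AGGGGAATGATGAGCCACGCCAGAACTGAGCCAGGATGAACGGCTTGGGAGCTGACCCCT"
# sR133 (antisense clone) as printed in Table 3.1, written 5'->3'.
SR133 = ("TAGGGGAAAGGGGTCAGCTCCCAAGCCGTTCATCCTGGCTCAGTTCTG"
         "GCGTGGCTCATCATTCCCCTTTCCCCGT")

offset = antisense_offset(SR133, SR134)
print(offset, len(SR134), len(SR133))  # prints: 8 60 76
```

With these inputs the reverse complement of the 60-nt sR134 sequence begins 8 nt into the 76-nt sR133 clone, which agrees with the observation in Figure 3.12 that the antisense RNA is longer than its cognate C/D box sRNA.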
[Table 3.3: alignment of C/D box sRNA sequences with columns for the C box, D' box, C' box and D box. Rows: Sac sR3, Sto sR3, Sso sR3, Ssh sR3, Sac sR7, Sto sR7, Sac sR9, Sso sR9B, Sac sR14, Sto sR14, Sac sR17, Sto sR17, Sso sR17, Sac sR18, Sto sR18, Sso sR18, Sac sR20, Sso sR20, Sto sR20, Sac sR22, Sto sR22, Sac sR26, Sto sR26, Sso sR26. The aligned sequences are not recoverable from the text extraction.]
Table 3.3. Functional homologs of S. acidocaldarius C/D box RNAs in the related genomes of S. solfataricus and S. tokodaii. The nomenclature of the small RNAs is based on the initial set of 28 sequences identified in S. acidocaldarius (Omer et al., 2000 and unpublished results). Sac, S. acidocaldarius; Sto, S. tokodaii; Sso, S. solfataricus; Ssh, S. shibatae. The C, D', C' and D elements are boxed and the guide region complementary to rRNA or tRNA is indicated by the broken underline. Sequence alignments were performed using the box elements as anchors; dashes indicate gaps in the alignments.
For Sso sR9, a B is used to distinguish this Sso sRNA from the one with the same identification number previously deposited in GenBank (Omer et al., 2000).

4. Discussion

Earlier studies indicated that the archaeal L7Ae protein is an integral component of three functionally distinct macromolecular ribonucleoprotein complexes: the 50S large ribosomal subunit, the C/D box modification particles and, as more recently discovered, the H/ACA box particles (Ban et al., 2000; Kuhn et al., 2002; Rozhdestvensky et al., 2003). L7Ae recognizes and binds to the K-turn motif present in all of these functionally different RNAs. To identify and characterize additional ncRNAs that may interact with L7Ae, a library of cDNA sequences corresponding to the S. solfataricus RNAs associated with the L7Ae protein was constructed by immunoprecipitation with anti-L7Ae polyclonal antibodies. A total of 45 different clones were obtained in the library. Six of these corresponded to new canonical C/D box representatives, one clone displayed the characteristics of an H/ACA box sRNA, eleven were fragments of rRNA and tRNA, and the remaining 28 clones represent novel classes of ncRNAs. Recognizable sequence and/or structural motifs could not be identified for these last 28 clones; however, most of them interacted efficiently with recombinant L7Ae in gel retardation assays, suggesting that these RNAs contain a functional K-turn binding motif. Based on this result, novel classes of ncRNAs were defined that function in association with the L7Ae protein but are distinct from the C/D and H/ACA box classes of modifying RNAs. The L7Ae protein seems to be intimately connected to the structure and function of many of these RNAs, and it might function as a primary RNA-binding factor in various complexes with distinct functions.
Moreover, the data obtained here set the stage for further investigations aimed at defining the function of these novel RNAs.

4.1 The universe of the K-turn motif and novel ribonucleoproteins

4.1.1 L7Ae binding specificity

L7Ae belongs to a family of RNA-binding proteins that includes the human 15.5kD spliceosomal protein, yeast ribosomal protein L30, the H/ACA snoRNP core protein Nhp2p, the selenocysteine insertion sequence binding protein SBP2, and many other proteins (Koonin et al., 1994). The common denominator of this family is the capacity to recognize and bind to an RNA motif known as the K-turn (Klein et al., 2001). This RNA motif was identified at several locations in the crystal structures of ribosomal subunits, in mRNA (Allmang et al., 2002; Mao et al., 1999; Winkler et al., 2001), modification-guide RNAs (Bortolin et al., 2003; Kuhn et al., 2002; Rozhdestvensky et al., 2003; Watkins et al., 2000), and spliceosomal RNAs (Vidovic et al., 2000). The K-turn is thus important in translation, RNA modification, spliceosome assembly, and the control of gene expression. Structural analyses have revealed that the free and RNA-bound L7Ae proteins exhibit a very similar conformation. In contrast, some notable differences in the structure of the K-turn RNA have been observed upon protein binding (Turner et al., 2005). K-turn motifs are often dimorphic, existing in both a tightly kinked form and a more loosely bent form, depending on the concentration of divalent metal ions (Goody et al., 2004; Matsumura et al., 2003; Turner et al., 2005). It therefore appears that L7Ae binds the loosely bent form of the K-turn and stabilizes the tightly kinked form in the presence of magnesium or other divalent ions (Turner et al., 2005).

The identification in S.
solfataricus of a large number of L7Ae RNA substrates containing K-turn motifs, along with previous findings showing that L7Ae binds to both rRNA and modification-guide sRNAs, indicates a broader specificity of the archaeal L7Ae protein compared to its eukaryotic counterparts, which appear to have become more specialized in both functionality and RNA target recognition. For example, the eukaryotic homolog of L7Ae, the 15.5kD/Snu13p protein, appears to bind exclusively box C/D sn(o)RNAs (but not the C'/D' box motifs in methylation-guide snoRNAs) and the 5'-terminal stem of the U4 snRNA (Henras et al., 1998; Watkins et al., 2002; Watkins et al., 1998). A structural mechanism in which the eukaryotic L7Ae homologs have a more rigid fold than the L7Ae protein may account for the differential K-turn specificity exhibited by the eukaryotic and archaeal proteins (Oruganti et al., 2005). Indeed, structural comparison of yeast Snu13p (the yeast homologue of the 15.5kD protein) and the L7Ae protein shows that, although the fold of both proteins is highly conserved (Figure 4.1A), one important structural difference between the eukaryotic and the archaeal protein is evident. A two amino-acid insertion in the loop connecting the β4-strand with the α6-helix in the RNA-binding surface of the Snu13 protein leads to the stabilization of the α2 helix. The α2 helix contains the largest number of residues contacting RNA and plays an important role in the binding ability of the protein (Moore et al., 2004). Stabilization of the α2 helix increases the rigidity of Snu13p and, therefore, narrows its RNA specificity. In contrast, L7Ae does not have this extended connecting loop, which leads to a more flexible α2 helix and thus allows a broader RNA specificity (Oruganti et al., 2005; Figure 4.1B).
The abundance of K-turn sequences in natural RNA and the flexible and multifunctional nature of the L7Ae protein both point to the important function of K-turns as architectural scaffolds for the assembly of higher-order complexes. Moreover, they suggest that the L7Ae protein fold and its K-turn RNA substrates are ancient structural motifs that have been maintained and have evolved specialized roles in a variety of RNA-related biological processes in both Archaea and Eukarya.

4.1.2 Novel L7Ae-containing RNPs

The L7Ae protein interacts with a diversity of RNAs and appears to be a component of functionally distinct RNP complexes. In all these RNPs the binding of L7Ae may induce specific tertiary structures in the RNAs that help recruit additional proteins into the complex (Moore et al., 2004; Turner et al., 2005). The fold induced in the RNA by the binding of L7Ae, along with the flexibility of the K-turn motif, suggests that the protein first explores the "adaptability" of the RNA before reaching the final stable protein-RNA conformation. This manner of binding might explain how L7Ae is able to select a specific K-turn motif during the assembly of different RNP complexes (Oruganti et al., 2005; Turner et al., 2005). The challenge for the future remains to elucidate the structure, composition and function of the novel L7Ae-containing RNPs and to determine whether L7Ae helps to coordinate the function of, and the cross-talk between, different RNP complexes.

4.2 sRNAs containing the conserved C and D box sequence elements

A previous study of the extent of rRNA 2'-O-methylation in the archaeon S. solfataricus revealed a number of modifications comparable to that present in eukaryotic rRNA (Noon et al., 1998). Additionally, it was found that archaea express three homologs of the four eukaryotic snoRNA-associated proteins that are required for the 2'-O-methylation reaction.
Together, these findings raised the possibility that archaea use a eukaryotic-like system to modify their rRNA. In contrast, a typical bacterium such as E. coli contains only four 2'-O-methylations and ten pseudouridylations. Each of these modifications appears to be catalyzed by a site-specific protein enzyme, a ribose methylase or pseudouridine synthase, without any RNA cofactor (Caldas et al., 2000; Ofengand and Rudd, 2000). By utilizing biochemical and computational methods, the presence of more than 50 distinct C/D box sRNAs has been demonstrated in most species of hyperthermophilic archaea (Gaspin et al., 2000; Omer et al., 2000; Tang et al., 2001a). The archaeal C/D box sRNAs guide methylation not only in rRNA, but also within various tRNAs (Dennis et al., 2001; Omer et al., 2000).

4.2.1 In archaea C/D box sRNAs guide the methylation of tRNA targets

More than 80 modified ribonucleosides have been identified in tRNAs (Rozenski et al., 1999) and, although the specific function of many of these modifications remains elusive, studies have generally indicated that they are required for optimal growth and translation. This implies that all modifications play important roles in the stabilization of the secondary and tertiary structure of the tRNA and/or its function in protein synthesis. Methylation is the simplest modification known. The 2'-O-methylation of the sugar stabilizes the ribose C3'-endo form (Kawai et al., 1992). The local conformational rigidity conferred by this modification may affect RNA stability and protein interactions. A global analysis of the guide sequences in the available C/D box sRNAs predicts that sRNAs target methylation to 21 different sites of pre-tRNAs (Dennis et al., 2001).
In all instances the predicted position of methylation corresponds to one of the 21 sites of documented ribose methyl modification in tRNA, and never to a position where ribose methyl modification has not been observed. In many instances, a single C/D box sRNA is predicted to guide methyl modification at the same position in different tRNAs. For example, three sRNAs from P. aerophilum (sR5, sR48 and sR34) appear to target methylation to the same G10 in a total of 16 different tRNAs (Dennis et al., 2001). In P. aerophilum the D' guide region of sR37 is predicted to target methylation at position G52 in eleven different tRNAs (Dennis et al., 2001). In addition, one of the canonical C/D box sRNAs recovered in the L7Ae library, sR106, is predicted to guide methyl modification at position G52 in three different tRNAs in S. solfataricus; likewise, the sR106 homologue in S. tokodaii is predicted to modify the same G52 position in seven different tRNAs. Interestingly, it has been demonstrated that C/D box sRNAs are able to guide methyl modification at the specific position in full-length tRNA targets (Ziesche et al., 2004). Indeed, Ziesche et al. (2004) observed that, when assembled into complexes with L7Ae, Nop5 and Fib, the S. solfataricus sR11 and sR14 were able to direct methylation to the predicted location in the tRNA and that mutational alteration of the sRNA guide or the tRNA target sequence at the site of modification abolished the activity of the complex. The importance of this reaction may be related to the fact that many archaea grow at high temperatures. It has been suggested that under these conditions the tRNA primary sequences may sometimes lack sufficient structure to be recognized by conventional methyltransferase enzymes.
The introduction of the methyl group by the sRNP guide complex, which presumably does not require tRNA structure, likely contributes to the stabilization and folding of the tRNA into the conventional tertiary structure (Dennis et al., 2001; Renalier et al., 2005).

4.2.2 Atypical C/D box sRNAs

Two of the clones recovered in the L7Ae library exhibit some of the structural features of the canonical C/D box sRNAs, but the connector regions between the C and D boxes have an unusual length. Gel retardation assays demonstrated that both C/D box-like sRNAs can form a stable complex with the L7Ae protein, although they exhibit only a weak affinity for the other two methylation guide proteins, Nop5 and Fib. Nonetheless, the resulting RNP complexes exhibit a low methylation activity comparable to the negative control. These results agree with the previous observation that the two juxtaposed K-turn motifs in the C/D box sRNA need to be spatially constrained with respect to each other in order for the methylation reaction to occur efficiently (Tran et al., 2005). Hence, it is possible that the two atypical sRNAs recovered in the L7Ae library do not function in ribose methylation and that their cellular role is yet to be determined. In vivo, the recovered C/D-like sRNAs are part of longer transcripts. Interestingly, one of the transcripts (sR107) exhibits several short regions of complementarity, including the region containing the D box sequence, with the 3'-UTR of a transposase mRNA. In eukaryotes, microRNAs (miRNAs) base pair with complementary sequence elements in the 3'-UTR of target mRNAs. The mechanism of gene silencing mediated by microRNAs begins with a long double-stranded RNA which is processed by Dicer into many ~22-nt siRNAs (Hammond et al., 2000).
The ~22-nt miRNA is then recognized by the PAZ domain of an Argonaute protein and incorporated into RISC (RNA-induced silencing complex), where the miRNA duplex is unwound to obtain the so-called guide strand that identifies target messages based on nearly perfect complementarity between the miRNA and the mRNA (Bartel, 2004). miRNAs are involved in translational repression, and recent experimental evidence indicates that miRNAs also induce mRNA destabilization. The hyphenated complementarity between the sR107 sRNA and the 3'-UTR of transposase 1225 is reminiscent of eukaryotic miRNAs that mediate mRNA silencing. This observation, along with the crystallization of the Argonaute protein (a signature component of the RNA interference effector complex) from the archaea P. furiosus (Song et al., 2004) and A. fulgidus (Parker et al., 2005), raises the interesting possibility that a homologous mechanism of gene silencing mediated by ncRNAs is used by Archaea to regulate gene expression and/or function.

4.3 Other L7Ae-containing RNP complexes

4.3.1 Pre-rRNA processing complex

Four rRNA fragments representing the 5' and 3' ends of 16S and 23S rRNAs were recovered in the L7Ae library. None of these fragments exhibits binding affinity for the L7Ae protein and only the fragment from the 5' end of 16S rRNA contains a known or predicted site of 2'-O-ribose methylation (position U52). Why were these specific RNA fragments recovered in the L7Ae library whereas no internal rRNA fragments were recovered? The answer to this question is likely to be related to the machinery that cleaves the precursor rRNA transcript to release the 16S and 23S rRNA sequences. In Sulfolobus, as in most other archaea, the pre-rRNA contains long bulge-helix-bulge (BHB) processing stems that surround the respective 16S and 23S rRNA sequences (Dennis et al., 1998).
The BHB RNA motif is found not only in the processing stems of archaeal pre-rRNA but also at the intron-exon junctions of archaeal intron-containing RNA transcripts. This motif is the substrate for an intron-excision endonuclease that is homologous to the eukaryotic tRNA intron-excision endonuclease (Thompson and Daniels, 1990). Another interesting sRNA recovered in the L7Ae library contained the ITS and 3'-ETS sequences ligated to each other at the site of endonuclease cleavage in the BHB motif (Tang et al., 2002b). In vivo the RNA is expected to be much longer and to contain the 5'-ETS region; however, this sequence was missing from the recovered fragment. Interestingly, the ITS spacer sequence contains a well-defined K-turn motif and exhibits high affinity for the L7Ae protein. Together these results suggest that a large processing complex forms on the pre-rRNA and contains the L7Ae protein, the intron-excision endonuclease, the yet-to-be-identified exon ligase and likely other components. Antibodies against the L7Ae protein apparently precipitated the complex, and fragments of the RNA near the center of the complex (i.e., the spacer sequences and the 5' and 3' ends of the mature rRNAs) were recovered in the L7Ae library. It should also be noted that in eukaryotes a large complex containing a number of different C/D box and H/ACA box snoRNPs is assembled and mediates essential endonucleolytic cleavage events within pre-rRNA (Granneman and Baserga, 2005).

4.3.2 The signal recognition particle (SRP)

Two fragments of the 7S RNA were recovered in the L7Ae library. Although these fragments display no affinity for the L7Ae protein, biochemical and mutational analyses demonstrated that the full-length 7S RNA contains a highly specific motif for L7Ae binding.
In eukaryotes, the SRP complex contains, in addition to the 7S RNA, six different proteins: SRP54, SRP19, SRP9, SRP14, SRP68 and SRP72 (Keenan et al., 2001). The SRP19 protein binds helices 6 and 8 of the 7S RNA and facilitates the binding of SRP54 to helix 8 (Figure 4.2A). SRP54 then recognizes and binds to the signal sequence of an emerging nascent polypeptide and interacts with the membrane-bound SRP receptor (Halic et al., 2004). SRP9 and SRP14 form a heterodimer that interacts with helices 3 and 4 of the 7S RNA and mediates the elongation arrest activity of the nascent polypeptide chain (Figure 4.2A; Halic et al., 2004). The SRP68 and SRP72 proteins first form a heterodimer and then bind to a large asymmetric loop in the middle of helix 5 of the 7S RNA (Figure 4.2A; Halic et al., 2004). Based on sequence and functional homology, only the SRP19 and SRP54 protein components of the eukaryotic SRP have been identified in archaea (Eichler and Moll, 2001) (Figure 4.2B). The recent three-dimensional reconstruction of a eukaryotic SRP within the functional context of the stalled ribosome structure shows that the SRP68/72 heterodimer binds to a region of the 7S RNA, around nucleotides 100-240, designated as hinge 1 (Figure 3.13 and Figure 4.2C) (Iakhiaeva et al., 2005). This region of the 7S RNA adopts a highly kinked conformation upon binding of the SRP68/72 heterodimer, allowing the SRP RNP complex to interact simultaneously with both the elongation-factor binding site and the exit tunnel on the surface of the large ribosomal subunit (Halic et al., 2004). Interestingly, mutational analyses have demonstrated that the K-turn motif responsible for L7Ae binding in S. solfataricus 7S RNA is located around nucleotides 110 and 250, which corresponds to the same region involved in the interaction of eukaryotic 7S RNA with the SRP68/72 heterodimer (Figure 4.2D) (Iakhiaeva et al., 2005).
Taken together, these observations suggest that in Sulfolobus the L7Ae protein might be the functional equivalent of the eukaryotic SRP68/72. The binding of the L7Ae protein to the K-turn motif in the Sulfolobus 7S RNA is expected to stabilize the tightly kinked structure in the RNA backbone, thereby eliminating the need for the SRP68/72 proteins to provide this function.

4.4 Interaction of L7Ae with protein coding RNAs

4.4.1 How is the expression of L7Ae regulated in the cell?

Prokaryotes regulate the level of expression of several essential proteins by a variety of mechanisms that influence translation initiation events. One such mechanism involves the binding of trans-acting proteins that allosterically control alternative structures within the mRNA leader sequence. mRNA-specific repressor proteins usually inhibit translation by competing with ribosomes for binding to the mRNA. In most cases, the protein binds close to or across the RBS (ribosome binding site) and sterically impedes ribosome entry (Jenner et al., 2005). Most of these mRNA-specific repressor proteins have as their primary function something other than regulating translation. This is important because regulation of translation requires controlled binding of the repressor protein, and control is sometimes achieved via competition between the mRNA and another substrate, such as tRNA or rRNA. The observation that the L7Ae mRNA contains a putative K-turn motif and that the protein can interact with its own mRNA led to the hypothesis that this protein, similar to other ribosomal proteins, autoregulates its expression by blocking the binding of the ribosome to the mRNA. This autogenous regulation provides an efficient mechanism by which the expression of the L7Ae protein is coordinated with the availability of rRNA.
The differences between the rRNA and the mRNA K-turn motifs likely determine the affinity of the protein for each RNA molecule. It is expected that L7Ae exhibits a higher affinity for its rRNA target than for its mRNA target, so that it binds to the rRNA first and only when the rRNA is saturated does it bind to the mRNA, shutting off any further unnecessary translation. Interestingly, a similar mechanism has been observed in the autogenous regulation of the E. coli L10 and the yeast L30e ribosomal proteins. The mRNAs of these two ribosomal proteins each contain a K-turn motif in the 5'-UTR, and binding of the proteins to the respective mRNA motif blocks any further translation (Klein et al., 2001; Mao et al., 1999; Vilardell and Warner, 1994; Yates et al., 1981). At present, there are no genetic systems for hyperthermophilic archaea that would have allowed me to test the proposed autoregulatory function of the L7Ae protein in vivo. Neither could this be verified using a S. solfataricus in vitro translation system, because the addition of recombinant L7Ae protein to the translation system proved to be detrimental to its activity. Notwithstanding, it is expected that the interaction of the L7Ae protein with its own mRNA interferes with the formation of the translation initiation complex. This hypothesis could be tested in vitro by performing a toeprinting assay in which the L7Ae mRNA leader is incubated with purified S. solfataricus 30S subunits in the presence and absence of recombinant L7Ae protein. If the binding of the L7Ae protein to the mRNA indeed hinders the formation of the initiation complex, then a toeprinting signal should be observed for the sample without recombinant L7Ae protein, but not for the sample containing the recombinant protein.

4.4.2 Translation of leaderless mRNAs in the S. solfataricus in vitro system

In order to test the role that the K-turn motif in the 5'-UTR of the L7Ae mRNA plays in the regulation of the expression of the L7Ae mRNA, two mutant L7Ae mRNAs were assayed in the S. solfataricus in vitro translation system. The Δ-5'-UTR mutant was missing the complete 5'-UTR sequence, whereas in the D-5'-UTR mutant the putative L7Ae binding site was disrupted. When these mutant mRNA transcripts were used to program the in vitro translation system, no translation products were detected. Condo et al. (1999) showed that a leader mRNA could still be translated, albeit at much lower levels, in the in vitro system after it was rendered leaderless. The different translational efficiencies of leader and leaderless mRNAs are due to the mechanism used by the mRNA to interact with the 30S ribosomal subunit. In the presence of an SD (Shine-Dalgarno) motif, the small ribosomal subunit interacts directly and strongly with the mRNA. However, for leaderless mRNAs the interaction between the mRNA and the 30S subunit requires the presence of the initiator methionyl-tRNA (met-tRNAi), suggesting that in the absence of the SD motif the codon-anticodon interaction is required for the recognition of the initiation site (Benelli et al., 2003). Since the interaction between the ribosome and the mRNA is stronger for leader mRNAs, they exhibit a higher translational efficiency than leaderless mRNAs. Therefore, it is possible that the translational efficiency of the leaderless L7Ae mRNA transcript is too low to be detected by SDS-PAGE. In addition, the mutations introduced in the D-5'-UTR mutant L7Ae mRNA might have favored the formation of a secondary structure in the 5'-UTR that is not present in the wild-type mRNA, thus inhibiting or greatly reducing the translational efficiency of the D-5'-UTR mutant mRNA.
4.4.3 Interaction of L7Ae with the 5'-UTR of protein coding RNAs

The L7Ae protein not only interacts with ncRNAs such as rRNA, C/D box sRNAs and H/ACA sRNAs, but also with full-length mRNAs. Indeed, five fragments of mRNA that contain stable K-turns within their 5'-UTR were recovered in the L7Ae library. The K-turn motifs in these sense-strand fragments are most often derived from regions of complex secondary structure that overlap the translation initiation codon and the RBS. The functional significance of the interaction between L7Ae and these mRNAs is not known at present. However, I speculate that the L7Ae protein participates in the post-transcriptional regulation of the mRNAs in question by blocking ribosome binding, resulting in the rapid degradation of mRNAs not engaged in translation. Alternatively, the binding of L7Ae may help to recruit other proteins or regulatory ncRNAs that might target the mRNA for translation inhibition and/or degradation.

4.4.4 Interaction of L7Ae with the 3'-UTR of protein coding regions

Three of the clones recovered in the library overlap the coding region and the 3'-UTR of hypothetical proteins. sR115 exhibits a hairpin structure and is able to interact with the L7Ae protein. The L7Ae binding site (K-turn structure), as confirmed by mutational analyses, overlaps the termination codon of the ORF. This finding suggests a role for the L7Ae protein in the translation termination event, either by promoting translational stop and mRNA release or, more intriguingly, by masking the termination codon, thereby preventing ribosome release. The second hypothesis has similarities to the mechanism used by eukaryotic organisms to incorporate the rare amino acid selenocysteine (Sec) into specific polypeptides.
The mRNAs coding for Sec-containing proteins contain a conserved secondary structure element known as the Sec-insertion sequence (SECIS) element in their 3'-UTRs (Berry et al., 1991; Zinoni et al., 1990). The SECIS element consists of a hairpin structure containing a K-turn motif (Allmang et al., 2002). The SECIS-binding protein, SBP2, recognizes and binds to this K-turn motif and acts as a platform to recruit EF-Sec and Sec-tRNA^Sec. Once the loaded SECIS complex associates with the ribosome, SBP2 is transiently displaced by the L30 ribosomal protein, which binds to the same K-turn motif as SBP2 (Chavatte et al., 2005). It has been suggested that the L30 protein anchors the SECIS complex to the ribosome and induces the conformational changes in the SECIS element required to deliver the Sec-tRNA^Sec to the A site of the ribosome (Chavatte et al., 2005). In bacteria, the cis-acting RNA structures used for Sec insertion are located immediately downstream of the UGA codon. Therefore, the need to use two different proteins, one to recognize and bind the SECIS element on the mRNA and another to recruit the elongation factor-tRNA^Sec complex and deliver the Sec-tRNA to the ribosome, is eliminated (Forchhammer et al., 1989; Leinfelder et al., 1988). Archaea, like eukaryotes, recode UGA from a distance, suggesting that they use separate factors for SECIS binding and tRNA delivery (Wilting et al., 1997). However, the search for archaeal SBP2 homologs has not been successful and the L30 ribosomal protein has no prokaryotic homologue. Nevertheless, the eukaryotic SBP2 and L30 proteins exhibit a similar structural topology and recognize the same RNA motif as the L7Ae protein (Allmang et al., 2002; Koonin et al., 1994). In addition, putative L7Ae binding motifs have been detected in the 3'-UTRs of different mRNAs in S. solfataricus.
Based on these observations, it is tempting to hypothesize that in Archaea the L7Ae protein might have evolved to accomplish Sec incorporation, enlarging the spectrum of functions associated with this protein.

4.5 Novel antisense sRNAs as potential regulators

The past few years have seen an explosion in the number of ncRNAs detected in the three domains of life. Most of the identified ncRNAs act by base-pairing with their target RNAs. In bacteria, antisense ncRNAs have been found to have a wide variety of biological functions including repression and activation of translation, as well as protection or degradation of mRNAs (Gottesman, 2004; Storz et al., 2004). In eukaryotes, small interfering RNAs (siRNAs) and microRNAs (miRNAs) have been implicated in the regulation of translation and mRNA stability (Bartel, 2004; Nelson et al., 2003). In Archaea, recent studies have revealed the presence of numerous ncRNAs in S. solfataricus that exhibit full-length or partial complementarity to different ORFs (Tang et al., 2005; Zago et al., 2005). These newly identified antisense RNAs are expected to play an important regulatory role in gene expression. In addition, some of these RNAs associate with L7Ae, suggesting that this protein might be required for the function and/or the regulation of the antisense RNAs.

4.5.1 Antisense C/D box sRNAs

Anti-C/D box RNAs that are transcribed in the opposite direction from the DNA that encodes the authentic C/D box sRNA and that lack binding affinity for the L7Ae protein have been recovered in the L7Ae library. At the moment one can only speculate about the possible mechanism of function of these antisense C/D box RNAs, as no comparable sequences have been reported in any other organism.
However, since the antisense C/D box RNAs do not bind the L7Ae protein, but the C/D box sRNAs do, the presence of the antisense sequences in our library can only be explained by a direct interaction between the sense and antisense RNA sequences within the L7Ae-containing particle. Moreover, the interaction of the antisense C/D box RNA with its cognate RNA may be involved in the regulation of the modification activity of the methylation-guide sRNP.

4.5.2 RNAs antisense to transposases

The S. solfataricus genome is exceptional in that it contains a large number of potentially mobile elements, including 201 copies of intact IS (insertion sequence) elements of at least 25 different types and more than 140 copies of MITEs (miniature inverted repeat transposable elements) (Brugger et al., 2002). These mobile elements constitute about 12% of the 3 Mb genome and, therefore, it is expected that the synthesis of transposases is tightly regulated because of the genomic instability that results from uncontrolled transposition. Four of the RNAs recovered in the L7Ae library are complementary to 5', 3' or internal regions of mRNAs that encode transposon-related proteins. In E. coli antisense RNAs are known to regulate the expression of the transposable elements Tn10 and Tn30. For Tn10, the antisense RNA is complementary to the ribosome binding site and start codon of the transposase mRNA. Experimental evidence shows that the interaction of the antisense RNA with the 5' end of the transposase mRNA inhibits the translation initiation step (Ma and Simons, 1990). For Tn30, an antisense RNA anneals to an internal region of the coding region and affects the translational elongation of the transposase mRNA (Arini et al., 1997). It is possible that the antisense RNAs detected in S. solfataricus employ a strategy similar to that used by E. coli to control the mobility of the multiple insertion elements at the translational level.
The results of a recent study of the genomic rearrangements mediated by mobile elements in S. solfataricus suggest that the low number of transpositions observed for ISC127, ISC1225, ISC1359 and ISC1439 might be attributed to the recently discovered antisense RNAs complementary to the mRNAs of these transposases (Redder and Garrett, 2006). The results obtained with the cDNA library show that the L7Ae protein has affinity for the 3'-antisense RNAs, but fails to interact with either the 5' or the internal antisense RNAs. Visual inspection of the 5' and internal clones revealed that these clones do not contain a recognizable K-turn motif. It is possible that some or all of the antisense clones that do not bind the L7Ae protein are fragments of larger transcripts that do contain the motif and do bind the protein. These results suggest that K-turn motifs and their interaction with the L7Ae protein may, in at least some cases, be involved in the antisense regulation of gene expression.

4.6 Assembly and function of C/D box RNP complexes

In an attempt to better understand the function of the L7Ae protein in the assembly and activity of the C/D box RNP and to determine the functional roles that the C/D and C'/D' K-turns play in the methylation reaction, several constructs of the S. acidocaldarius sR1 sRNA, containing rearrangements and nucleotide substitutions, were prepared. These sR1 RNA constructs were tested for their ability to assemble with the L7Ae, Nop5 and Fib proteins into an RNP complex and to function in the methyl transfer reaction. The results of this study reinforce the idea that L7Ae, in addition to acting as a chaperone for the formation of the RNA K-turn motifs, also plays a more important role in the reaction, possibly by mediating RNA-protein and protein-protein interactions within the dynamic RNP complex.
In addition, the experimental evidence obtained suggests that the internal C'/D' K-turn does not participate directly in the methyl transfer reaction; instead, it seems to play a structural role by facilitating the correct folding of the RNA guide.

4.6.1 Circular permutation of sR1

Circularly permuted sR1 sRNAs were constructed in an attempt to determine the functional role of the C/D and C'/D' K-turn motifs in methylation activity. Positioning the 5' and 3' termini of sR1 between the C' and D' boxes was found to stimulate methylation activity (CP1A mutant, Figure 3.17B and Figure 3.18D), whereas positioning the ends between the C and D' boxes completely abolished methylation function (CP2 mutant, Figure 3.16C and Figure 3.18D). These results indicate first that, in sR1, the C/D and C'/D' motifs can form functional K-turns in the absence of a terminal stem (canonical stem I) and second, that the methylation guide region in sR1 can be coupled to either a terminal or an internal K-turn structure. In contrast, when connectivity was disrupted by positioning the 5' and 3' ends between the D' and C boxes, as in the CP2 construct, methylation activity was abolished (Figure 3.17C). The importance of rigid connectivity in the linkers between the C and D' and the D' and C boxes has recently been demonstrated. The optimal separation of C and D' and the D' and C boxes in archaeal sRNAs was shown to be tightly clustered around a length of 12 nt (Tran et al., 2005). Symmetrical and asymmetrical increases or decreases in this optimal length resulted in a decrease in the methylation activity of the RNP complexes.
4.6.2 Multiple and single nucleotide substitutions in the C'/D' motif

The importance of the L7Ae-RNA interaction in the methylation function was assessed by replacing the C/D or C'/D' boxes by poly-C or by mutating the nucleotides of the K-turn that participate in protein binding, such as the G:A-G:A base pairs and the protruding U31 residue. Substitution by poly-C tracts in the first half of the C box, in the entire D' box, or in a combination of both reduces D-guide methylation activity to half, one third and less than one fifth, respectively, compared to the sR1 control (Figure 3.17D, E and F). The gradual loss of activity of the poly-C mutants (Figure 3.17D-G) correlates with a gradual reduction in the stability of stem II from five, to four, to three (interrupted by a C:C mismatch) Watson-Crick base pairs. This suggests that the stability of stem II, which normally seals the C'/D' K-turn structure, might modulate the activity of the D box-associated guide. Interestingly, the G32C mutation exerts a more pronounced effect on methylation activity than substitution of the first half of C by poly-C or the U31A mutation. This could be related to the stability of stem II, as the G32C mutant has four base pairs compared to five in the other two mutants. Taken together, these findings confirm the cooperative role of the internal C'/D' and terminal C/D K-turns and suggest a more prominent connection between the stability of the C'/D' stem II and the D-guide function than was previously imagined.

A recent study of the sequence connecting the C and D' box sequences of Pyrococcus sRNAs suggests that this terminal loop has an optimal length of 3 nt with a consensus of RNK (where R is A or G, N is any nucleotide and K is G, U or A) (Nolivos et al., 2005). The terminal K-loop in the S. acidocaldarius sR1 deviates from the pyrococcal consensus; it is 4 nt in length and has a UGGA sequence.
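The pyrococcal K-loop consensus described above can be written as a small check. This is a minimal sketch, assuming the nucleotide classes exactly as defined in the text (R = A or G, N = any nucleotide, K = G, U or A); the function name and the lookup table are hypothetical and are not part of the cited study.

```python
# Nucleotide classes as defined in the text for the RNK consensus.
CLASSES = {"R": "AG", "N": "ACGU", "K": "GUA"}


def matches_kloop_consensus(loop: str, consensus: str = "RNK") -> bool:
    """Return True if the loop has the consensus length and every position
    belongs to the nucleotide class required by the consensus."""
    loop = loop.upper()
    if len(loop) != len(consensus):
        return False
    return all(nt in CLASSES[cls] for nt, cls in zip(loop, consensus))


# The 4 nt loops mentioned in the text fail on length alone.
print(matches_kloop_consensus("UGGA"))  # sR1 terminal loop
print(matches_kloop_consensus("UAGU"))  # CP1A terminal loop
# An invented 3 nt loop that fits the consensus.
print(matches_kloop_consensus("GAG"))
```

Both sR1-derived loops fall outside the consensus in this check, which is consistent with the observation that the consensus describes Pyrococcus sRNAs rather than a strict requirement for methylation activity in Sulfolobus.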
Mutation of this loop sequence in sR1 to C residues (along with disruption of one of the sheared G:A base pairs; Figure 3.17D) has only a modest two-fold negative effect on the methylation activity of the complex. Moreover, the CP1A circular permutation construct also has a 4 nt terminal loop between the D and C boxes (UAGU) and functions better than wild-type sR1 in the methylation reaction (Figure 3.17B).

4.6.3 C/D box sRNAs with non-standard K-turn

To better define the specific features of the C/D and C'/D' K-turns and to assess the proposed chaperone function of the L7Ae protein, the C/D and C'/D' K-turn motifs in sR1 were replaced, one at a time or together, with the sequence corresponding to K-turn 38 or K-turn 15. The K-turn 38 sequence is unusual in that it is able to fold into a compact K-turn structure in the presence of 10 mM Mg2+ without the aid of a protein chaperone. The three mutant sR1 sRNAs containing the K-turn 38 motif in the C/D alone, C'/D' alone or C/D + C'/D' motifs were unable to associate with Nop5 and Fib in the absence of the L7Ae protein at either 1 or 10 mM Mg2+, and none of these mixtures was active in the methylation reaction. This result implies that the L7Ae protein acts not only as an RNA chaperone to mediate the folding of C/D box sRNA K-turns, but is also actively involved in nucleating the addition of the Nop5 and Fib proteins to these complexes, possibly through direct protein-protein interactions. Moreover, in the presence of L7Ae the three K-turn 38 mutants formed higher order complexes with the three proteins (Figure 3.19B), but none of the K-turn 38-containing complexes was active in methylation (Figure 3.17J and data not shown).
This might be explained by the inherent rigidity of the K-turn 38 structure, which does not allow the RNA to fold into the proper conformation such that the Nop5 and Fib proteins are in the correct orientation to direct the methyl transfer reaction. In contrast, K-turn 15 (the binding site of the L7Ae protein in the 50S ribosomal subunit) can fully replace the C/D motif in sR1, even though the structure of this K-turn is highly unusual in that an A:U:A base triple replaces one of the two sheared G:A base pairs in the canonical C/D motif (Figure 3.19). The results obtained in this study expand the repertoire of documented features associated with the guided ribose methylation function in Sulfolobus and raise the prospect that other K-turn-containing sRNAs that were recently identified in functional screens (and shown to interact efficiently with L7Ae, Nop5 and Fib) may possess bona fide methylation function.

4.7 Future Perspectives

In this work, novel small RNAs that associate with the multifunctional ribosomal protein L7Ae were identified and characterized in Sulfolobus solfataricus. Although the function of a number of these ncRNAs remains to be elucidated, this study indicates that L7Ae might be intimately connected to the structure and function of many of these, and possibly many other, ncRNAs. Nevertheless, many questions remain unanswered, including: (i) How many ncRNAs are encoded in the S. solfataricus chromosome? (ii) When are they expressed? (iii) What are the protein partners associated with these ncRNAs? (iv) What is the mechanism of action of the corresponding RNPs?

4.7.1 More ncRNAs in S. solfataricus remain to be discovered

The number of known ncRNAs and putative ncRNAs of unknown function has increased dramatically over the past few years. Even for well-studied organisms such as E. coli and S. cerevisiae it is not known if all the ncRNAs have been identified.
Recent experimental data from tiling microarray experiments and cDNA libraries suggest that the number of transcribed ncRNAs is greater than previously thought, ranging from tens to hundreds in bacteria and from hundreds to thousands in humans (Storz et al., 2005; Wassarman et al., 2001). The discovery of novel ncRNAs in S. solfataricus, in addition to the previously identified C/D box and H/ACA box sRNAs, reveals that archaea contain a plethora of ncRNAs and suggests that many other ncRNAs still await discovery. Although the approach used in this study proved to be a fruitful method to identify new ncRNAs associated with L7Ae protein, it is possible that not all the ncRNAs that associate in vivo with this protein were identified here. The complex secondary structure and/or the low abundance of some of the L7Ae-associated ncRNAs might have prevented us from cloning them in the library. In addition, Archaea may contain many other ncRNAs that do not associate with the L7Ae protein and which would not have been identified in this study. Furthermore, the expression pattern of some ncRNAs may vary under different growth conditions and at different growth stages. Therefore, new functional screens for hyperthermophilic archaea need to be performed in order to characterize the complete set of ncRNAs expressed in organisms of the third domain of life. Tiling microarray experiments could be used to identify new ncRNAs or to study their expression patterns. In addition, a new approach termed genomic SELEX could be used to identify new ncRNAs associated with a specific RNA binding protein (Huttenhofer and Vogel, 2006 and references therein).
The advantage of this method over the cDNA cloning strategy used in this study is that the screened RNA species are generated by in vitro transcription from all regions of the genome, and thus the results do not depend on isolating RNAs expressed at a particular growth stage or condition (Huttenhofer and Vogel, 2006). As new ncRNAs are identified, the challenge ahead will be to find out "what" they do and "how" they function.

4.7.2 Functional characterization of the newly identified L7Ae-associated sRNAs

The search for ncRNAs in Archaea is still in its infancy and although several studies (Gaspin et al., 2000; Omer et al., 2000; Tang et al., 2002a; Tang et al., 2005; Zago et al., 2005), including this one, have demonstrated the diversity of ncRNAs in different archaeal species, very little is known about their roles in the cell and their mechanism of action. Some of the novel ncRNAs exhibit complementarity to mRNAs or other RNA molecules, suggesting that they may function by base pairing with their target RNA. In addition, most of the antisense RNAs identified in this study are cis-acting RNAs (i.e., encoded at the same genomic locus as the RNA target, but on the opposite strand), suggesting that they might have evolved simultaneously with their target RNA. However, this tells very little about the role these cis-encoded ncRNAs play in the cell. Does the base pairing of the ncRNA with its cognate RNA affect translation or mRNA stability, or both? Is L7Ae protein required for the function of these ncRNAs or does it regulate the activity of the ncRNAs? In addition to antisense RNAs, 5'-UTRs and 3'-UTRs of protein-coding transcripts were also recovered in the L7Ae library. In a genome-wide search for conservation in intergenic regions in E. coli, many highly conserved 5'-UTRs were observed.
Moreover, it was shown that many of these 5'-UTRs were able to bind the bacterial Sm-like protein Hfq, suggesting that they might be involved in translational regulation (Wassarman et al., 2001; Zhang et al., 2003). Therefore, it is possible that in archaea, by analogy with the Hfq interactions in E. coli, the interaction between L7Ae and the 5'-UTRs of different mRNAs functions as a mechanism to regulate the expression of certain messages. Ultimately, the function of the newly identified ncRNAs and the significance of their association with L7Ae have to be tested in vivo by performing genetic experiments in which the ncRNA gene is eliminated from the genome or contains mutations that affect its base pairing with its target RNA. In addition, mutations that disrupt the L7Ae binding site in the ncRNA could be used to determine the role that this protein plays in the function of the ncRNA. Moreover, the expression or overexpression of a specific ncRNA followed by microarray analyses could help to identify potential mRNA targets if the ncRNA influences the abundance of its respective mRNA target(s) in the cell. Although genetic systems for S. solfataricus are not yet widely available, current efforts are directed at developing more efficient transformation and selection methods. Recently, a homologous recombination system for S. solfataricus that uses lacS as a marker gene has been used for gene-knockout experiments (Worthington et al., 2003). In addition, uracil-auxotrophic mutants have been isolated in S. solfataricus and they promise to be a useful tool in developing new selection systems that will allow us to study the mechanisms of gene regulation in vivo (Jonuscheit et al., 2003).

4.7.3 Identification and characterization of the components of novel L7Ae-containing RNP

The same approach used to identify novel ncRNAs in S.
solfataricus can be applied to identify other protein components associated with L7Ae-containing RNPs. Polyclonal L7Ae antibodies can be used to isolate the L7Ae-containing complexes, as before, and the proteins associated with these complexes can then be isolated and sequenced. Alternatively, one of the newly identified ncRNAs could be used as "bait" to identify other proteins associated with this RNA in cell extracts. For this purpose, the ncRNA could be "tagged" in an in vitro transcription reaction containing biotin-UTP. The biotinylated ncRNA is then incubated with cell extracts and purified using a streptavidin resin. Elucidation of the protein components of the different L7Ae-containing RNPs may hint at their functions, since the proteins of the complexes may contain domains with known catalytic activity. Moreover, this approach might shed some light on the mechanisms employed by L7Ae, and its associated ncRNAs, to coordinate the function of, and the cross-talk between, functionally distinct RNP complexes.

Final remarks: The discoveries made over the last 25 years demonstrate that RNAs are ubiquitous and pervasive, and capable of participating in a large number of important biological processes. In almost all cases the RNAs associate with proteins to form dynamic ribonucleoprotein complexes. The structure and function of these dynamic complexes depend on the ability of RNAs to form unique inter- and intramolecular secondary (complementary base pairing) and higher order structures that, in terms of dynamic flexibility, are generally far beyond the range that can be achieved by structurally more rigid protein complexes. The experimental evidence I obtained in this study suggests that our current understanding of the ncRNA world and the RNP particles in archaeal systems is only a small part of a much larger picture that is likely to penetrate into virtually every aspect of the physiology of these organisms.
4.8 Chapter 4 Figures

Figure 4.1. Structure comparison of L7Ae protein from different archaeal organisms and eukaryotic L7Ae-homologous proteins. (A) The crystal structure of L7Ae from three different archaea is superimposed on the crystal structures of its eukaryotic homologs Snu13 and 15.5kD. The loops connecting α-helix 6 with β-strand 4 [L(β4-α6)] and α-helix 2 with β-strand 2 [L(α2-β2)] are indicated. α2 indicates the position of α-helix 2, which plays an important role in binding K-turn RNAs. (B) The residues involved in RNA recognition and discrimination reside in the connecting loops L(β4-α6) and L(α2-β2). Loop L(β4-α6) in the eukaryotic Snu13 protein is two amino acids longer than the corresponding loop in the archaeal L7Ae. This allows β4 to move closer to β2, establishing the formation of H-bonds between the two β-strands. This stabilizes L(α2-β2) and subsequently α2 in the eukaryotic protein and leads to a more rigid structure. The shorter L(β4-α6) loop of L7Ae prevents the formation of the H-bonds between β4 and β2, which leads to a less stable L(α2-β2) and, eventually, α2 (figure adapted from Oruganti et al., 2005).

Figure 4.2. The signal recognition particle ribonucleoprotein. (A) Eukaryotic SRP RNP. The six proteins forming the eukaryotic SRP RNP particle are depicted bound to their respective helices in the secondary structure of 7S RNA. The 7S RNA helices are numbered according to the nomenclature of Larsen and Zwieb (1991). (B) Archaeal SRP RNP. The two protein components of archaeal SRP RNP identified to date, SRP54 and SRP19, are depicted as bound to helix 8 and helix 6 of the 7S RNA, respectively. The L7Ae protein is bound to helix 5 of 7S RNA. (C) The nucleotides forming the hinge 1 region of the eukaryotic 7S RNA are shown.
The SRP68/72 heterodimer binds to this region of the 7S RNA and induces a bend in the RNA backbone, such that the different components of the SRP RNP can interact with the signal sequence and the elongation factor site simultaneously (see text). (D) The nucleotides forming the L7Ae binding site on the 7S RNA, as determined by toeprinting assays and mutation experiments, are shown. The K-turn motif is boxed and the conserved G:A base pairs of the motif are shown in red. The position of the K-turn motif in the archaeal 7S RNA corresponds to the hinge 1 region of the eukaryotic 7S RNA, which is recognized by the SRP68/72 heterodimer, suggesting that L7Ae might be the functional homologue of the SRP68/72 heterodimer (see text).

Bibliography

Aittaleb, M., Rashid, R., Chen, Q., Palmer, J.R., Daniels, C.J. and Li, H. (2003) Structure and function of archaeal box C/D sRNP core proteins. Nat Struct Biol, 10, 256-263.
Allmang, C., Carbon, P. and Krol, A. (2002) The SBP2 and 15.5 kD/Snu13p proteins share the same RNA binding domain: identification of SBP2 amino acids important to SECIS RNA binding. RNA, 8, 1308-1318.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. (1990) Basic local alignment search tool. J Mol Biol, 215, 403-410.
Argaman, L., Hershberg, R., Vogel, J., Bejerano, G., Wagner, E.G., Margalit, H. and Altuvia, S. (2001) Novel small RNA-encoding genes in the intergenic regions of Escherichia coli. Curr Biol, 11, 941-950.
Arini, A., Keller, M.P. and Arber, W. (1997) An antisense RNA in IS30 regulates the translational expression of the transposase. Biol Chem, 378, 1421-1431.
Bachellerie, J.P., Cavaille, J. and Huttenhofer, A. (2002) The expanding snoRNA world. Biochimie, 84, 775-790.
Baker, D.L., Youssef, O.A., Chastkofsky, M.I., Dy, D.A., Terns, R.M. and Terns, M.P. (2005) RNA-guided RNA modification: functional organization of the archaeal H/ACA RNP.
Genes Dev, 19, 1238-1248.
Bakin, A. and Ofengand, J. (1993) Four newly located pseudouridylate residues in Escherichia coli 23S ribosomal RNA are all at the peptidyltransferase center: analysis by the application of a new sequencing technique. Biochemistry, 32, 9754-9762.
Balakin, A.G., Smith, L. and Fournier, M.J. (1996) The RNA world of the nucleolus: two major families of small RNAs defined by different box elements with related functions. Cell, 86, 823-834.
Ban, N., Nissen, P., Hansen, J., Moore, P.B. and Steitz, T.A. (2000) The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science, 289, 905-920.
Banerjee, D. and Slack, F. (2002) Control of developmental timing by small temporal RNAs: a paradigm for RNA-mediated regulation of gene expression. Bioessays, 24, 119-129.
Barad, O., Meiri, E., Avniel, A., Aharonov, R., Barzilai, A., Bentwich, I., Einav, U., Gilad, S., Hurban, P., Karov, Y., Lobenhofer, E.K., Sharon, E., Shiboleth, Y.M., Shtutman, M., Bentwich, Z. and Einat, P. (2004) MicroRNA expression detected by oligonucleotide microarrays: system establishment and expression profiling in human tissues. Genome Res, 14, 2486-2494.
Bartel, D.P. (2004) MicroRNAs: genomics, biogenesis, mechanism, and function. Cell, 116, 281-297.
Batey, R.T., Rambo, R.P., Lucast, L., Rha, B. and Doudna, J.A. (2000) Crystal structure of the ribonucleoprotein core of the signal recognition particle. Science, 287, 1232-1239.
Benelli, D., Maone, E. and Londei, P. (2003) Two different mechanisms for ribosome/mRNA interaction in archaeal translation initiation. Mol Microbiol, 50, 635-643.
Berry, M.J., Banu, L., Chen, Y.Y., Mandel, S.J., Kieffer, J.D., Harney, J.W. and Larsen, P.R. (1991) Recognition of UGA as a selenocysteine codon in type I deiodinase requires sequences in the 3' untranslated region. Nature, 353, 273-276.
Bortolin, M.L.
, Bachellerie, J.P. and Clouet-d'Orval, B. (2003) In vitro RNP assembly and methylation guide activity of an unusual box C/D RNA, cis-acting archaeal pre-tRNA(Trp). Nucleic Acids Res, 31, 6524-6535.
Bousquet-Antonelli, C., Henry, Y., Gelugne, J.P., Caizergues-Ferrer, M. and Kiss, T. (1997) A small nucleolar RNP protein is required for pseudouridylation of eukaryotic ribosomal RNAs. EMBO J, 16, 4770-4776.
Brugger, K., Redder, P., She, Q., Confalonieri, F., Zivanovic, Y. and Garrett, R.A. (2002) Mobile elements in archaeal genomes. FEMS Microbiol Lett, 206, 131-141.
Bult, C.J., White, O., Olsen, G.J., Zhou, L., Fleischmann, R.D., Sutton, G.G., Blake, J.A., FitzGerald, L.M., Clayton, R.A., Gocayne, J.D., Kerlavage, A.R., Dougherty, B.A., Tomb, J.F., Adams, M.D., Reich, C.I., Overbeek, R., Kirkness, E.F., Weinstock, K.G., Merrick, J.M., Glodek, A., Scott, J.L., Geoghagen, N.S. and Venter, J.C. (1996) Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science, 273, 1058-1073.
Caldas, T., Binet, E., Bouloc, P. and Richarme, G. (2000) Translational defects of Escherichia coli mutants deficient in the Um(2552) 23S ribosomal RNA methyltransferase RrmJ/FTSJ. Biochem Biophys Res Commun, 271, 714-718.
Carrington, J.C. and Ambros, V. (2003) Role of microRNAs in plant and animal development. Science, 301, 336-338.
Cavaille, J. and Bachellerie, J.P. (1998) SnoRNA-guided ribose methylation of rRNA: structural features of the guide RNA duplex influencing the extent of the reaction. Nucleic Acids Res, 26, 1576-1587.
Cavaille, J., Hadjiolov, A.A. and Bachellerie, J.P. (1996) Processing of mammalian rRNA precursors at the 3' end of 18S rRNA. Identification of cis-acting signals suggests the involvement of U13 small nucleolar RNA. Eur J Biochem, 242, 206-213.
Charpentier, B., Muller, S. and Branlant, C.
(2005) Reconstitution of archaeal H/ACA small ribonucleoprotein complexes active in pseudouridylation. Nucleic Acids Res, 33, 3133-3144.
Chavatte, L., Brown, B.A. and Driscoll, D.M. (2005) Ribosomal protein L30 is a component of the UGA-selenocysteine recoding machinery in eukaryotes. Nat Struct Mol Biol, 12, 408-416.
Chen, S., Lesnik, E.A., Hall, T.A., Sampath, R., Griffey, R.H., Ecker, D.J. and Blyn, L.B. (2002) A bioinformatics based approach to discover small RNA genes in the Escherichia coli genome. Biosystems, 65, 157-177.
Copeland, P.R. and Driscoll, D.M. (2001) RNA binding proteins and selenocysteine. Biofactors, 14, 11-16.
Copeland, P.R., Fletcher, J.E., Carlson, B.A., Hatfield, D.L. and Driscoll, D.M. (2000) A novel RNA binding protein, SBP2, is required for the translation of mammalian selenoprotein mRNAs. EMBO J, 19, 306-314.
Daniels, C.J., Gupta, R. and Doolittle, W.F. (1985) Transcription and excision of a large intron in the tRNA(Trp) gene of an archaebacterium, Halobacterium volcanii. J Biol Chem, 260, 3132-3134.
Dennis, P.P., Omer, A. and Lowe, T. (2001) A guided tour: small RNA function in Archaea. Mol Microbiol, 40, 509-519.
Dennis, P.P., Ziesche, S. and Mylvaganam, S. (1998) Transcription analysis of two disparate rRNA operons in the halophilic archaeon Haloarcula marismortui. J Bacteriol, 180, 4804-4813.
Eddy, S.R. (2001) Non-coding RNA genes and the modern RNA world. Nat Rev Genet, 2, 919-929.
Eddy, S.R. (2002) Computational genomics of noncoding RNA genes. Cell, 109, 137-140.
Eichler, J. and Moll, R. (2001) The signal recognition particle of Archaea. Trends Microbiol, 9, 130-136.
Fatica, A. and Tollervey, D. (2002) Making ribosomes. Curr Opin Cell Biol, 14, 313-318.
Filipowicz, W. and Pogacic, V. (2002) Biogenesis of small nucleolar ribonucleoproteins. Curr Opin Cell Biol, 14, 319-327.
Filippini, D., Bozzoni, I. and Caffarelli, E.
(2000) p62, a novel Xenopus laevis component of box C/D snoRNPs. RNA, 6, 391-401.
Forchhammer, K., Leinfelder, W. and Bock, A. (1989) Identification of a novel translation factor necessary for the incorporation of selenocysteine into protein. Nature, 342, 453-456.
Ganot, P., Bortolin, M.L. and Kiss, T. (1997) Site-specific pseudouridine formation in preribosomal RNA is guided by small nucleolar RNAs. Cell, 89, 799-809.
Gaspin, C., Cavaille, J., Erauso, G. and Bachellerie, J.P. (2000) Archaeal homologs of eukaryotic methylation guide small nucleolar RNAs: lessons from the Pyrococcus genomes. J Mol Biol, 297, 895-906.
Goody, T.A., Melcher, S.E., Norman, D.G. and Lilley, D.M. (2004) The kink-turn motif in RNA is dimorphic, and metal ion-dependent. RNA, 10, 254-264.
Gottesman, S. (2004) The small RNA regulators of Escherichia coli: roles and mechanisms. Annu Rev Microbiol, 58, 303-328.
Grad, Y., Aach, J., Hayes, G.D., Reinhart, B.J., Church, G.M., Ruvkun, G. and Kim, J. (2003) Computational and experimental identification of C. elegans microRNAs. Mol Cell, 11, 1253-1263.
Granneman, S. and Baserga, S.J. (2005) Crosstalk in gene expression: coupling and co-regulation of rDNA transcription, pre-ribosome assembly and pre-rRNA processing. Curr Opin Cell Biol, 17, 281-286.
Halic, M., Becker, T., Pool, M.R., Spahn, C.M., Grassucci, R.A., Frank, J. and Beckmann, R. (2004) Structure of the signal recognition particle interacting with the elongation-arrested ribosome. Nature, 427, 808-814.
Hammond, S.M., Bernstein, E., Beach, D. and Hannon, G.J. (2000) An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells. Nature, 404, 293-296.
Harlow, E. and Lane, D. (1988) Antibodies: a laboratory manual. Cold Spring Harbor Laboratory Press.
Harris, M.E. and Pace, N.R. (1995) Analysis of the tertiary structure of bacterial RNase P RNA. Mol Biol Rep, 22, 115-123.
Hastings, M.
L. and Krainer, A.R. (2001) Pre-mRNA splicing in the new millennium. Curr Opin Cell Biol, 13, 302-309.
Henras, A., Dez, C., Noaillac-Depeyre, J., Henry, Y. and Caizergues-Ferrer, M. (2001) Accumulation of H/ACA snoRNPs depends on the integrity of the conserved central domain of the RNA-binding protein Nhp2p. Nucleic Acids Res, 29, 2733-2746.
Henras, A., Henry, Y., Bousquet-Antonelli, C., Noaillac-Depeyre, J., Gelugne, J.P. and Caizergues-Ferrer, M. (1998) Nhp2p and Nop10p are essential for the function of H/ACA snoRNPs. EMBO J, 17, 7078-7090.
Herskovits, A.A. and Bibi, E. (2000) Association of Escherichia coli ribosomes with the inner membrane requires the signal recognition particle receptor but is independent of the signal recognition particle. Proc Natl Acad Sci USA, 97, 4621-4626.
Huttenhofer, A., Brosius, J. and Bachellerie, J.P. (2002) RNomics: identification and function of small, non-messenger RNAs. Curr Opin Chem Biol, 6, 835-843.
Huttenhofer, A., Kiefmann, M., Meier-Ewert, S., O'Brien, J., Lehrach, H., Bachellerie, J.P. and Brosius, J. (2001) RNomics: an experimental approach that identifies 201 candidates for novel, small, non-messenger RNAs in mouse. EMBO J, 20, 2943-2953.
Huttenhofer, A., Schattner, P. and Polacek, N. (2005) Non-coding RNAs: hope or hype? Trends Genet, 21, 289-297.
Huttenhofer, A. and Vogel, J. (2006) Experimental approaches to identify non-coding RNAs. Nucleic Acids Res, 34, 635-646.
Iakhiaeva, E., Yin, J. and Zwieb, C. (2005) Identification of an RNA-binding domain in human SRP72. J Mol Biol, 345, 659-666.
Jacques, N. and Dreyfus, M. (1990) Translation initiation in Escherichia coli: old and new questions. Mol Microbiol, 4, 1063-1067.
Jenner, L., Romby, P., Rees, B., Schulze-Briese, C., Springer, M., Ehresmann, C., Ehresmann, B., Moras, D., Yusupova, G. and Yusupov, M.
(2005) Translational operator of mRNA on the ribosome: how repressor proteins exclude ribosome binding. Science, 308, 120-123.
Jones-Rhoades, M.W. and Bartel, D.P. (2004) Computational identification of plant microRNAs and their targets, including a stress-induced miRNA. Mol Cell, 14, 787-799.
Jonuscheit, M., Martusewitsch, E., Stedman, K.M. and Schleper, C. (2003) A reporter gene system for the hyperthermophilic archaeon Sulfolobus solfataricus based on a selectable and integrative shuttle vector. Mol Microbiol, 48, 1241-1252.
Kaine, B.P. (1990) Structure of the archaebacterial 7S RNA molecule. Mol Gen Genet, 221, 315-321.
Kampa, D., Cheng, J., Kapranov, P., Yamanaka, M., Brubaker, S., Cawley, S., Drenkow, J., Piccolboni, A., Bekiranov, S., Helt, G., Tammana, H. and Gingeras, T.R. (2004) Novel RNAs identified from an in-depth analysis of the transcriptome of human chromosomes 21 and 22. Genome Res, 14, 331-342.
Kawai, G., Yamamoto, Y., Kamimura, T., Masegi, T., Sekine, M., Hata, T., Iimori, T., Watanabe, T., Miyazawa, T. and Yokoyama, S. (1992) Conformational rigidity of specific pyrimidine residues in tRNA arises from posttranscriptional modifications that enhance steric interaction between the base and the 2'-hydroxyl group. Biochemistry, 31, 1040-1046.
Kawano, M., Reynolds, A.A., Miranda-Rios, J. and Storz, G. (2005) Detection of 5'- and 3'-UTR-derived small RNAs and cis-encoded antisense RNAs in Escherichia coli. Nucleic Acids Res, 33, 1040-1050.
Keenan, R.J., Freymann, D.M., Stroud, R.M. and Walter, P. (2001) The signal recognition particle. Annu Rev Biochem, 70, 755-775.
Kiss-Laszlo, Z., Henry, Y., Bachellerie, J.P., Caizergues-Ferrer, M. and Kiss, T. (1996) Site-specific ribose methylation of preribosomal RNA: a novel function for small nucleolar RNAs. Cell, 85, 1077-1088.
Kiss, T. (2001) Small nucleolar RNA-guided post-transcriptional modification of cellular RNAs.
EMBO J, 20, 3617-3622.
Klein, D.J., Schmeing, T.M., Moore, P.B. and Steitz, T.A. (2001) The kink-turn: a new RNA secondary structure motif. EMBO J, 20, 4214-4221.
Klein, R.J., Misulovin, Z. and Eddy, S.R. (2002) Noncoding RNA genes identified in AT-rich hyperthermophiles. Proc Natl Acad Sci USA, 99, 7542-7547.
Koonin, E.V., Bork, P. and Sander, C. (1994) A novel RNA-binding motif in omnipotent suppressors of translation termination, ribosomal proteins and a ribosome modification enzyme? Nucleic Acids Res, 22, 2166-2167.
Kuhn, J.F., Tran, E.J. and Maxwell, E.S. (2002) Archaeal ribosomal protein L7 is a functional homolog of the eukaryotic 15.5kD/Snu13p snoRNP core protein. Nucleic Acids Res, 30, 931-941.
Laemmli, U.K. (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature, 227, 680-685.
Lafontaine, D.L., Bousquet-Antonelli, C., Henry, Y., Caizergues-Ferrer, M. and Tollervey, D. (1998) The box H + ACA snoRNAs carry Cbf5p, the putative rRNA pseudouridine synthase. Genes Dev, 12, 527-537.
Lagos-Quintana, M., Rauhut, R., Lendeckel, W. and Tuschl, T. (2001) Identification of novel genes coding for small expressed RNAs. Science, 294, 853-858.
Lagos-Quintana, M., Rauhut, R., Yalcin, A., Meyer, J., Lendeckel, W. and Tuschl, T. (2002) Identification of tissue-specific microRNAs from mouse. Curr Biol, 12, 735-739.
Larsen, N. and Zwieb, C. (1991) SRP-RNA sequence alignment and secondary structure. Nucleic Acids Res, 19, 209-215.
Lavorgna, G., Dahary, D., Lehner, B., Sorek, R., Sanderson, C.M. and Casari, G. (2004) In search of antisense. Trends Biochem Sci, 29, 88-94.
Lee, R.C., Feinbaum, R.L. and Ambros, V. (1993) The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell, 75, 843-854.
Leinfelder, W., Forchhammer, K., Zinoni, F., Sawers, G., Mandrand-Berthelot, M.A. and Bock, A.
(1988) Escherichia coli genes whose products are involved in selenium metabolism. J Bacteriol, 170, 540-546. Lerner, E . A . , Lerner, M . R . , Janeway, C . A . , Jr. and Steitz, J .A. (1981) Monoclonal antibodies to nucleic acid-containing cellular constituents: probes for molecular biology and autoimmune disease. Proc Natl Acad Sci USA, 78, 2737-2741. Levy, M . and Ellington, A . D . (2001) R N A world: catalysis abets binding, but not vice versa. Curr Biol, 11, R665-667. L i m , L .P . , Glasner, M . E . , Yekta, S., Burge, C B . and Barrel, D.P. (2003) Vertebrate mic roRNA genes. Science, 299, 1540. Londei, P., Teixido, J., Acca, M . , Cammarano, P. and Ami ls , R. (1986) Total reconstitution of active large ribosomal subunits of the thermoacidophilic archaebacterium Sulfolobus solfataricus. Nucleic Acids Res, 14, 2269-2285. Luehrsen, K . R . , Nicholson, D .E . , Jr. and Fox, G.E . (1985) Widespread distribution of a 7S R N A in archaebacteria. Curr Microbiol, 12, 69-72. M a , C. and Simons, R . W . (1990) The IS10 antisense R N A blocks ribosome binding at the transposase translation initiation site. EMBOJ, 9, 1267-1274. M a , J., Campbell, A . and Karl in, S. (2002) Correlations between Shine-Dalgarno sequences and gene features such as predicted expression levels and operon structures. J Bacteriol, 184, 5733-5745. Maden, B . E . (1990) The numerous modified nucleotides in eukaryotic ribosomal R N A . Prog Nucleic Acid Res Mol Biol, 39, 241-303. Maden, B . E . and Hughes, J . M . (1997) Eukaryotic ribosomal R N A : the recent excitement in the nucleotide modification problem. Chromosoma, 105, 391-400. Mandal, M . and Breaker, R.R. (2004) Gene regulation by riboswitches. Nat Rev Mol Cell Biol, 5,451-463. 181 Mao, FL, White, S.A. and Williamson, J.R. (1999) A n o v e l loop-loop recognition motif in the yeast ribosomal protein L30 autoregulatory R N A complex. Nat Struct Biol, 6 , 1139-1147. Massenet, S., Ansmant, I., Motorin, Y . and Branlant, C. 
(1999) The first determination of pseudouridine residues in 23S ribosomal R N A from hyperthermophilic Archaea Sulfolobus acidocaldarius. FEBS Lett, 4 6 2 , 94-100. Matsumura, S., Ikawa, Y . and Inoue, T. (2003) Biochemical characterization of the kink-turn R N A motif. Nucleic Acids Res, 3 1 , 5544-5551. Mattick, J.S. (2003) Challenging the dogma: the hidden layer of non-protein-coding R N A s in complex organisms. Bioessays, 2 5 , 930-939. Moore, T., Zhang, Y . , Fenley, M . O . and L i , H . (2004) Molecular basis of box C / D R N A -protein interactions; cocrystal structure of archaeal L 7 A e and.a box C / D R N A . Structure, 1 2 , 807-818. Nelson, P., Kiriakidou, M . , Sharma, A . , Maniataki, E . and Mourelatos, Z . (2003) The mic roRNA world: small is mighty. Trends Biochem Sci, 2 8 , 534-540. Newman, D.R. , Kuhn, J.F., Shanab, G . M . and Maxwel l , E.S. (2000) Box C /D snoRNA-associated proteins: two pairs of evolutionarily ancient proteins and possible links to replication and transcription. RNA, 6 , 861-879. N i , J., Tien, A . L . and Fournier, M . J . (1997) Small nucleolar R N A s direct site-specific synthesis of pseudouridine in ribosomal R N A . Cell, 8 9 , 565-573. Niewmierzycka, A . and Clarke, S. (1999) S-Adenosylmethionine-dependent methylation in Saccharomyces cerevisiae. Identification of a novel protein arginine methyltransferase. JBiol Chem, 2 7 4 , 814-824. Nolivos, S., Carpousis, A . J . and Clouet-d'Orval, B . (2005) The K-loop, a general feature of the Pyrococcus C / D guide R N A s , is an R N A structural motif related to the K-turn. Nucleic Acids Res, 3 3 , 6507-6514. Noon, K . R . , Bruenger, E . and McCloskey, J .A. (1998) Posttranscriptional modifications in 16S and 23 S r R N A s of the archaeal hyperthermophile Sulfolobus solfataricus. J Bacteribl, 1 8 0 , 2883-2888. Nottrott, S., Urlaub, H . and Luhrmann, R. (2002) Hierarchical, clustered protein interactions with U4/U6 snRNA: a biochemical role for U4/U6 proteins. 
EMBOJ, 2 1 , 5527-5538. Ofengand, J. and.Bakin, A . (1997) Mapping to nucleotide resolution of pseudouridine residues in large subunit ribosomal R N A s from representative eukaryotes, prokaryotes, archaebacteria, mitochondria and chloroplasts. J Mol Biol, 2 6 6 , 246-268. 182 Ofengand, J. and Rudd, k. (2000) Bacterial, archaeal and organellar R N A pseudourines and methylated nucleotides and their enzymes. In Garret, R., Douthwaite, S., Liljas, A . , Matheson, A . , Moore, P .B. and Noller, H . (eds.), Ribosome: Structure, Function, Antibiotics and cellular interactions. A S M Press, pp. 175-190. Olivas, W . M . , Muhlrad, D. and Parker, R. (1997) Analysis of the yeast genome: identification of new non-coding and small ORF-containing R N A s . Nucleic Acids Res, 25, 4619-4625. Omer, A . D . , Lowe, T . M . , Russell, A . G . , Ebhardt, H . , Eddy, S.R. and Dennis, P.P. (2000) Homologs of small nucleolar R N A s in Archaea. Science, 288, 517-522. Omer, A . D . , Zago, M . , Chang, A . and Dennis, P.P. (2006) Probing the structure and function of an archaeal C/D-box methylation guide s R N A . RNA, 12, 1708-1720. Omer, A . D . , Ziesche, S., Decatur, W . A . , Fournier, M . J . and Dennis, P.P. (2003) R N A -modifying machines in archaea! Mol Microbiol, 48, 617-629. Omer, A . D . , Ziesche, S., Ebhardt, H . and Dennis, P.P. (2002) In vitro reconstitution and activity of a C /D box methylation guide ribonucleoprotein complex. Proc Natl Acad Sci USA;99, 5289-5294. Oruganti, S., Zhang, Y . and L i , H . (2005) Structural comparison of yeast snoRNP and spliceosomal protein Snul3p with its homologs. Biochem Biophys Res Commun, 333, 550-554. Pan, T., Fang, X . and Sosnick, T. (1999) Pathway modulation, circular permutation and rapid R N A folding under kinetic control. J Mol Biol, 286, 721 -731. Pan, T., Gutell, R.R. and Uhlenbeck, O.C. (1991) Folding of circularly permuted transfer R N A s . Science, 254, 1361-1364. Parker, J.S., Roe, S . M . and Barford, D. 
(2005) Structural insights into m R N A recognition from a PIWI domain-siRNA guide complex. Nature, 434, 663-666. Ramirez, C , Kopke, A . K . E . , Yang, D .C . , Boeckh, T. and Matheson, A . T . (1993 ) The structure, function and evolution of archaeal ribosomes. In Kates, M . , Kushner, D.J . and Matheson, A . T . (eds.), The Biochemistry of Archaea (Archaebacteria). Elsevier, Amsterdam, pp. 439-466. Rashid, R., Aittaleb, M . , Chen, Q., Spiegel, K . , Demeler, B . and L i , H . (2003) Functional requirement for symmetric assembly of archaeal box C /D small ribonucleoprotein particles. J Mol Biol, 333, 295-306. Redder, P. and Garrett, R . A . (2006) Mutations and rearrangements in the genome of Sulfolobus solfataricus P2. J Bacteriol, 188, 4198-4206. 183 Reinhart, B .J . , Slack, F.J. , Basson, M . , Pasquinelli, A . E . , Bettinger, J . C , Rougvie, A . E . , Horvitz, H.R. and Ruvkun, G . (2000) The 21-nucleotide let-7 R N A regulates developmental timing in Caenorhabditis elegans. Nature, 4 0 3 , 901-906. Renalier, M . H . , Joseph, N . , Gaspin, C , Thebault, P. and Mougin, A . (2005) The Cm56 t R N A modification in archaea is catalyzed either by a specific 2'-0-methylase, or a C /D sRNP. Rna, 1 1 , 1051-1063. Rozenski, J., Crain, P.F. and McCloskey, J .A. (1999) The R N A Modification Database: 1999 update. Nucleic Acids Res, 2 7 , 196-197. Rozhdestvensky, T.S., Tang, T .H . , Tchirkova, I.V., Brosius, J., Bachellerie, J.P. and Huttenhofer, A . (2003) Binding of L 7 A e protein to the K-turn of archaeal snoRNAs: a shared R N A binding motif for C /D and H / A C A box snoRNAs in Archaea. Nucleic Acids Res, 3 1 , 869-877. Ruggero, D. , Creti, R. and Londei, P. (1993) In vitro translation of archaeal natural m R N A s at high temperature. FEMS Microbiology Letters, 1 0 7 , 89-94. Russell, A . G . , Ebhardt, H . and Dennis, P.P. 
(1999) Substrate requirements for a novel archaeal endonuclease that cleaves within the 5' external transcribed spacer of Sulfolobus acidocaldarius precursor r R N A . Genetics, 1 5 2 , 1373-1385. Sambrook, J.F., Fritsch, E.F. and Maniatis, T. (1989) In Molecular cloning: a laboratory manual Cold Spring Harbor Laboratory Press. Simmons, R . W . and Kleckner, N . (1983) Translational control of IS10 transposition. Cell, 3 4 , 683-691. Smith, C M . and Steitz, J .A. (1997) Sno storm in the nucleolus: new roles for myriad small RNPs . Cell, 8 9 , 669-672. Song, J.J., Smith, S.K., Harmon, G.J. and Joshua-Tor, L . (2004) Crystal structure of Argonaute and its implications for RISC slicer activity. Science, 3 0 5 , 1434-1437. Storz, G . (2002) A n expanding universe of noncoding R N A s . Science, 2 9 6 , 1260-1263. Storz, G . , Altuvia, S. and Wassarman, K . M . (2005) A n abundance of R N A regulators. Annu Rev Biochem, 7 4 , 199-217. Storz, G . , Opdyke, J .A. and Zhang, A . (2004) Controlling m R N A stability and translation with small, noncoding R N A s . Curr Opin Microbiol, 7 , 140-144. Studier, F .W. (1991) Use of bacteriophage T7 lysozyme to improve an inducible T7 expression system. J Mol Biol, 2 1 9 , 37-44. 184 Studier, F .W. and Moffatt, B . A . (1986) Use of bacteriophage T7 R N A polymerase to direct selective high-level expression of cloned genes. J Mol Biol, 1 8 9 , 113-130. Suryadi, J., Tran, E.J . , Maxwel l , E.S. and Brown, B . A . , 2nd. (2005) The crystal structure of the Methanocaldococcus jannaschii multifunctional L7Ae RNA-binding protein reveals an induced-fit interaction with the box C /D R N A s . Biochemistry, 44, 9657-9672. Szewczak, L . B . , DeGregorio, S.J., Strobel, S.A. and Steitz, J .A. (2002) Exclusive interaction of the 15.5 k D protein with the terminal box C / D motif of a methylation guide snoRNP. Chem Biol, 9 , 1095-1107. Szewczak, L . B . , Gabrielsen, J.S., Degregorio, S.J., Strobel, S.A. and Steitz, J .A. 
(2005) Molecular basis for R N A kink-turn recognition by the h l 5 . 5 K small R N P protein. Rna, 11, 1407-1419. Tang, T . H . , Bachellerie, J.P., Rozhdestvensky, T., Bortolin, M . L . , Huber, H . , Drungowski, M . , Elge, T., Brosius, J. and Huttenhofer, A . (2002a) Identification of 86 candidates for small non-messenger R N A s from the archaeon Archaeoglobus fulgidus. Proc Natl Acad Sci US A, 9 9 , 7536-7541. Tang, T .H . , Polacek, N . , Zywick i , M . , Huber, H . , Brugger, K . , Garrett, R., Bachellerie, J.P. and Huttenhofer, A . (2005) Identification of novel non-coding R N A s as potential antisense regulators in the archaeon Sulfolobus solfataricus. Mol Microbiol, 55, 469-481. Tang, T . H . , Rozhdestvensky, T.S., d'Orval, B . C . , Bortolin, M . L . , Huber, H . , Charpentier, B . , Branlant, C , Bachellerie, J.P., Brosius, J. and Huttenhofer, A . (2002b) RNomics in Archaea reveals a further link between splicing of archaeal introns and r R N A processing. Nucleic Acids Res, 30, 921-930. Thompson, L . D . and Daniels, C.J. (1990) Recognition of exon-intron boundaries by the Halobacterium volcanii t R N A intron endonuclease. J Biol Chem, 265, 18104-18111. Tollervey, D . , Lehtonen, H . , Jansen, R., Kern, H . and Hurt, E .C . (1993) Temperature-sensitive mutations demonstrate roles for yeast fibrillarin in pre-rRNA processing, pre-rRNA methylation, and ribosome assembly. Cell, 72, 443-457. Tolstrup, N . , Sensen, C .W. , Garrett, R . A . and Clausen, I.G. (2000) Two different and highly organized mechanisms of translation initiation in the archaeon Sulfolobus solfataricus. Extremophiles, 4, 175-179. Tran, E . , Zhang, X . , Lackey, L . and Maxwel l , E.S. (2005) Conserved spacing between the box C / D and C ' /D ' RNPs of the archaeal box C / D sRNP complex is required for efficient 2'-0-methylation of target R N A s . RNA, 11, 285-293. 185 Tran, E.J . , Zhang, X . and Maxwel l , E.S. 
(2003) Efficient R N A 2'-0-methylation requires juxtaposed and symmetrically assembled archaeal box C /D and C ' /D ' RNPs . EMBO J, 22, 3930-3940. Turner, B . , Melcher, S.E., Wilson, T.J., Norman, D . G . and Li l ley , D . M . (2005) Induced fit of R N A on binding the L 7 A e protein to the kink-turn motif. RNA, 11, 1192-1200. Tycowski, K . T . , Smith, C M . , Shu, M . D . and Steitz, J .A. (1996) A small nucleolar R N A requirement for site-specific ribose methylation of r R N A in Xenopus. Proc Natl Acad Sci USA, 93, 14480-14485. Venema, J. and Tollervey, D. (1999) Ribosome synthesis in Saccharomyces cerevisiae. Annu Rev Genet, 33, 261-311. Vidovic , I., Nottrott, S., Hartmuth, K . , Luhrmann, R. and Ficner, R. (2000) Crystal structure of the spliceosomal 15.5kD protein bound to a U4 s n R N A fragment. Mol Cell, 6, 1331-1342. Vilardell , J. and Warner, J.R. (1994) Regulation of splicing at an intermediate step in the formation of the spliceosome. Genes Dev, 8, 211-220. Vi ta l i , P., Royo, H . , Seitz, H . , Bachellerie, J.P.-, Huttenhofer, A . and Cavaille, J. (2003) Identification of 13 novel human modification guide R N A s . Nucleic Acids Res, 31, 6543-6551. Vogel , J., Battels, V . , Tang, T .H . , Churakov, G . , Slagter-Jager, J .G., Huttenhofer, A . and Wagner, E . G . (2003) RNomics in Escherichia coli detects new s R N A species and indicates parallel transcriptional output in bacteria. Nucleic Acids Res, 31, 6435-6443. Wagner, E . G . , Altuvia, S. and Romby, P. (2002) Antisense R N A s in bacteria and their genetic elements. Adv Genet, 46, 361-398. Wassarman, K . M . (2002) Small R N A s in bacteria: diverse regulators of gene expression in response to environmental changes. Cell, 109, 141-144. Wassarman, K . M . , Repoila, F., Rosenow, C , Storz, G. and Gottesman, S. (2001) Identification of novel small R N A s using comparative genomics and microarrays. Genes Dev, 15, 1637-1651. Watanabe, Y . and Gray, M . W . 
(2000) Evolutionary appearance of genes encoding proteins associated with box H / A C A snoRNAs: cbf5p in Euglena gracilis, an early diverging eukaryote, and candidate Gar lp and Nop 1 Op homologs in archaebacteria. Nucleic Acids Res, 28, 2342-2352. Watkins, N . J . , Dickmanns, A . and Luhrmann, R. (2002) Conserved stem II of the box C /D motif is essential for nucleolar localization and is required, along with the 15.5K protein, for the hierarchical assembly of the box C /D snoRNP. Mol Cell Biol, 22, 8342-8352. 186 Watkins, N . J . , Gottschalk, A . , Neubauer, G. , Kastner, B . , Fabrizio, P., Mann, M . and Luhrmann, R. (1998) Cbf5p, a potential pseudouridine synthase, and Nhp2p, a putative RNA-binding protein, are present together with Gar lp in all H B O X / A C A -motif snoRNPs and constitute a common bipartite structure. RNA, 4, 1549-1568. Watkins, N . J . , Segault, V . , Charpentier, B . , Nottrott, S., Fabrizio, P., Bachi, A . , W i l m , M . , Rosbash, M . , Branlant, C. and Luhrmann, R. (2000) A common core R N P structure shared between the small nucleoar box C/D RNPs and the spliceosomal U4 snRNP. Cell, 1 0 3 , 457-466. Weinstein, L . B . and Steitz, J .A. (1999) Guided tours: from precursor snoRNA to functional snoRNP. Curr Opin Cell Biol, 1 1 , 378-384. Wilt ing, R., Schorling, S., Persson, B . C . and Bock, A . (1997) Selenoprotein synthesis in archaea: identification of an m R N A element of Methanococcus jannaschii probably directing selenocysteine insertion. J Mol Biol, 2 6 6 , 637-641. Winkler, W . C . , Grundy, F.J. , Murphy, B . A . and Henkin, T . M . (2001) The G A motif: an R N A element common to bacterial antitermination systems, r R N A , and eukaryotic R N A s . RNA, 7, 1165-1172. Worthington, P., Hoang, V . , Perez-Pomares, F. and Blum, P. (2003) Targeted disruption of the alpha-amylase gene in the hyperthermophilic archaeon Sulfolobus solfataricus. J Bacteriol, 1 8 5 , 482-488. Yanisch-Perron, C , Vieira, J. and Messing, J. 
(1985) Improved M l 3 phage cloning vectors and host strains: nucleotide sequences of the M 1 3 m p l 8 and pUC19 vectors. Gene, 3 3 , 1 0 3 - 1 1 9 . Yano, Y . , Saito, R., Yoshida, N . , Yoshiki , A . , Wynshaw-Boris, A . , Tomita, M . and Hirotsune, S. (2004) A new role for expressed pseudogenes as n c R N A : regulation of m R N A stability of its homologous coding gene. J Mol Med, 8 2 , 414-422. Yates, J .L. , Dean, D. , Strycharz, W . A . and Nomura, M . (1981) E. coli ribosomal protein L10 inhibits translation of L10 and L7/L12 m R N A s by acting at a single site. Nature, 2 9 4 , 190-192. Yuan, G. , Klambt, C , Bachellerie, J.P., Brosius, J. and Huttenhofer, A . (2003) RNomics in Drosophila melanogaster: identification of 66 candidates for novel non-messenger R N A s . Nucleic Acids Res, 3 1 , 2495-2507. Zago, M . A . , Dennis, P.P. and Omer, A . D . (2005) The expanding world of small R N A s in the hyperthermophilic archaeon Sulfolobus solfataricus. Mol Microbiol, 5 5 , 1812-1828. Zebarjadian, Y . , K ing , T., Fournier, M . J . , Clarke, L . and Carbon, J. (1999) Point mutations in yeast C B F 5 can abolish in vivo pseudouridylation of r R N A . Mol Cell Biol, 1 9 , 7461-7472. 187 Zhang, A . , Wassarman, K . M . , Ortega, J., Steven, A . C . and Storz, G. (2002) The Sm-like Hfq protein increases OxyS R N A interaction with target m R N A s . Mol Cell, 9, 11-22. Zhang, A . , Wassarman, K . M . , Rosenow, C., Tjaden, B . C . , Storz, G . and Gottesman, S. (2003) Global analysis of small R N A and m R N A targets of Hfq. Mol Microbiol, 50, 1111-1124. Zhang, Z . , Carriero, N . and Gerstein, M . (2004) Comparative analysis of processed pseudogenes in the mouse and human genomes. Trends Genet, 20, 62-67'. Ziesche, S . M . , Omer, A . D . and Dennis, P.P. (2004) RNA-guided nucleotide modification of ribosomal and non-ribosomal R N A s in Archaea. Mol Microbiol, 54, 980-993. Z i l l i g , W. , Stetter, K . O . , Wunderl, S., Schulz, W., Priess, H . 
and Scholz, J. (1980) The Sulfolobus-"Caldariella" group: taxonomy on the basis of the structure of D N A -dependent R N A polymerase. Arch. Microbiol, 125, 259-269. Zinoni, F., Heider, J. and Bock, A . (1990) Features of the formate dehydrogenase m R N A necessary for decoding of the U G A codon as selenocysteine. Proc Natl Acad Sci U S A, 87, 4660-4664. Zuker, M . (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res, 31, 3406-3415. 
