Open Collections

UBC Theses and Dissertations


Characterization of the ribosomal RNA operons of Haloarcula marismortui. Mylvaganam, Shanthini. 1996



CHARACTERIZATION OF THE RIBOSOMAL RNA OPERONS OF HALOARCULA MARISMORTUI

by

SHANTHINI MYLVAGANAM

B.Sc. (Hons.), University of Jaffna, Sri Lanka, 1985
M.Sc., Bowling Green State University, U.S.A., 1990

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES
(Department of Biochemistry and Molecular Biology)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
March 1996
© S. Mylvaganam, 1996

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Biochemistry
The University of British Columbia
Vancouver, Canada

ABSTRACT

The genome of Haloarcula marismortui contains two ribosomal RNA operons, designated rrnA and rrnB (and possibly a third operon, designated rrnC). The characterization of the rrnA and rrnB operons is presented here. This characterization involved the analysis of primary and secondary structures and in vivo studies of the primary transcripts and processing intermediates. The gene orders of the rrnA and rrnB operons were found to be 5'-16S rRNA-tRNAAla-23S rRNA-5S rRNA-tRNACys-3' and 5'-16S rRNA-23S rRNA-5S rRNA-3', respectively.
Computing the substitution rates for the entire rrnA and rrnB operons demonstrated that the major differences are localized in the non-coding regions, that is, the 5'-flanking region of the 16S rRNA gene, the 16S-23S rRNA spacer and the 3'-flanking region of the 5S rRNA gene. The percentage similarities between the 16S, 23S and 5S rRNAs of rrnA and rrnB are 95%, 98.7% and 98.3%, respectively. A pairwise sequence comparison between the 23S rRNA sequence of the rrnC operon (Brombach et al., 1989) and those of the other two operons, rrnA and rrnB, revealed sequence similarities of 98.8% and 99.6%, respectively. The 5S rRNA sequence from the rrnC operon is identical to the rrnA sequence. The nucleotide substitutions within the 16S rRNA genes of the rrnA and rrnB operons are concentrated in three separate domains: 58-321, 508-823 and 986-1158. About 60% of the substitutions are concentrated within the 508-823 domain and are compensatory, affecting both components of the nucleotide base pairs within defined rRNA helices. Using nuclease S1 protection assays, it was shown that the 16S rRNAs from the rrnA and rrnB operons are both expressed and present in intact 70S ribosomes. A comparison of the 23S rRNAs from the rrnA and rrnB operons showed that the substitutions are located within the variable regions of domains I, III, IV and VI of the universal secondary structure of 23S rRNA. The 5S rRNA sequences of the two operons differ at two nucleotide positions in helix IV of the universal secondary structure of 5S rRNA.

The 5'-flanking region of the rrnA operon contains four tandem promoters, whereas that of the rrnB operon contains a single tandem promoter and a second promoter-like sequence. An internal promoter sequence is present within the 16S-23S spacer regions of all three operons from Ha. marismortui.
Putative secondary structures of the primary transcripts from the rrnA operon showed that the 16S and 23S rRNAs are surrounded by inverted repeat structures containing the "bulge-helix-bulge" motif, which is recognized by a processing endonuclease. In the case of the rrnB operon, the inverted repeat structure surrounding the 23S rRNA is identical to that of the rrnA operon and processing follows the same pathway. However, the inverted repeat structure surrounding the 16S rRNA from rrnB does not contain the "bulge-helix-bulge" motif and its processing follows a distinct pathway. Processing of the rrnB 16S rRNA occurs at a single position within the 5'-flanking region and at three positions within the 16S-23S spacer region. The nucleotides present at these cleavage sites and in their surrounding regions showed no sequence conservation. The 5S rRNA processing occurs close to or at its 5'- and 3'-ends. Two apparent termination sites were detected for the rrnA operon, both located downstream of the tRNACys gene.

Phylogenetic analyses of the 508-823 region of the 16S rRNA, the entire 16S rRNA genes, the 23S rRNA genes and the combination of 16S-23S-5S rRNA genes were performed using the PAUP program. The phylograms from the 16S rRNA, 23S rRNA and 16S-23S-5S rRNA sequences indicated that rrnA and rrnB group together and that the rate of divergence of the rrnB operon is higher than that of the rrnA operon. However, comparison of the 508-823 regions showed that the two operons do not group together and that rrnB is evolving more slowly than rrnA. The comparisons made between the two operons show that the rrnB operon differs from the rrnA operon in its gene order, rRNA sequences and 16S rRNA processing, and that it is probably also evolving at a different rate.
TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ACKNOWLEDGEMENTS
DEDICATION

CHAPTER 1: General Introduction
  1.1 The Evolutionary Record
  1.2 The RNA World and Beyond
  1.3 Gene Duplication and Formation of Gene Families
  1.4 Molecular Phylogeny and the Three Kingdom Classification
  1.5 Archaea
    1.5.1 Characteristic Features
    1.5.2 Halophiles
  1.6 Archaeal rRNA Operons
    1.6.1 Organization
    1.6.2 Transcription Signals
    1.6.3 Secondary Structural Organization of Transcripts
    1.6.4 Secondary Structures of Ribosomal RNAs
  1.7 Ribosomal RNA Heterogeneity
  1.8 Research Objectives

CHAPTER 2: Materials and Methods
  2.1 Materials
  2.2 Methods
    2.2.1 Media and Culture Conditions
    2.2.2 Isolation of Haloarcula marismortui Genomic DNA
    2.2.3 5' End-Labelling of Oligonucleotides
    2.2.4 Southern Blot Analysis
    2.2.5 Plasmids and Phage Preparations
    2.2.6 Gel Electrophoresis
      2.2.6.1 Native Gel Electrophoresis
      2.2.6.2 Denaturing Polyacrylamide Gel Electrophoresis
    2.2.7 Ligation
    2.2.8 Transformation
    2.2.9 Exonuclease III Deletions
    2.2.10 DNA Sequencing
    2.2.11 Isolation of Total RNA
    2.2.12 Isolation of RNAs from Mature Ribosomes
    2.2.13 5' End-Labelling of DNA Fragments
    2.2.14 3' End-Labelling of DNA Fragments
    2.2.15 Maxam and Gilbert Sequencing
    2.2.16 Transcript Mapping
      2.2.16.1 Nuclease S1 Protection Analysis of the Total RNA
      2.2.16.2 Nuclease Protection Analysis of the Active 70S Ribosomal RNAs
      2.2.16.3 Primer Extensions
    2.2.17 Sequence Alignment of the 16S 5'-Flanking Regions
    2.2.18 Molecular Phylogeny Method

CHAPTER 3: The Gene Organization, Sequence Heterogeneity, Expression and RNA Processing of the rrnA and rrnB Operons From the Halophilic Archaea Haloarcula marismortui
  3.1 Introduction
  3.2 Results and Discussion
    3.2.1 Number of Operons
    3.2.2 Operon Structures
      3.2.2.1 Primary Structure
      3.2.2.2 Secondary Structure
    3.2.3 The 16S Leader Region
      3.2.3.1 Sequence Alignment
      3.2.3.2 Promoters
      3.2.3.3 Primary Transcript Analysis of the 16S Leader Region
      3.2.3.4 The 16S Processing Pathways for the rrnA and rrnB Operons
    3.2.4 16S rRNA
      3.2.4.1 Primary Structural Analysis
      3.2.4.2 Secondary Structural Analysis
      3.2.4.3 Expression of rrnA and rrnB 16S rRNAs
    3.2.5 16S-23S Intergenic Spacer
      3.2.5.1 Primary Structural Analysis
      3.2.5.2 Primary Transcript Analysis of the rrnA 16S-23S Spacer
      3.2.5.3 Primary Transcript Analysis of the rrnB 16S-23S Spacer
      3.2.5.4 The Processing Pathways Within the 16S-23S Spacer
    3.2.6 23S rRNA
      3.2.6.1 Primary Structural Analysis
      3.2.6.2 Secondary Structural Analysis
    3.2.7 23S Distal Region
      3.2.7.1 23S-5S Intergenic Spacer
      3.2.7.2 5S RNA
      3.2.7.3 The 5S Distal Region
      3.2.7.4 Primary Transcript Analysis of the 23S Distal Region
  3.3 Summary

CHAPTER 4: Phylogenetic Implications of Sequence Diversity Between the Two Ribosomal RNA Operons, rrnA and rrnB, from Haloarcula marismortui
  4.1 Introduction
  4.2 Results and Discussion
    4.2.1 Phylogeny of the 508-823 Domains of the 16S rRNA Genes
    4.2.2 Phylogeny of the Complete 16S rRNA Genes
    4.1.3 Phylogeny of the 23S rRNA Genes
    4.1.4 Phylogeny of the 16S-23S-5S rRNA Genes
    4.1.5 Evolution of Ribosomal RNA Operons in Ha. marismortui

Future Research Prospects
References
Appendix

LIST OF TABLES

Table 1.1  Characteristics of the three domains.
Table 1.2  Number of rRNA operons in archaea.
Table 1.3  Summary of characteristic structural features in various 5S rRNAs.
Table 3.1  Summary of pulse field gel electrophoresis experiments using chromosomal DNA from different halophilic archaea, digested with different restriction enzymes and hybridized with 23S and 16S rRNA probes.
Table 3.2  The percentage similarities between the promoter sequences from the rrnA and rrnB operons of Ha. marismortui.
Table 3.3  Comparison of the nucleotide sequences of the 16S rRNA and the 508-823 domain from halophilic archaeal species.
Table 3.4  A comparison of the nucleotide sequences of the 23S rRNA from halophilic archaeal species.
Table A.1  Endonuclease digestion fragments smaller than 3.0 kb obtained from the rrnA and rrnB operons of Ha. marismortui (see Figure 3.1).
Table A.2  Oligonucleotides used for sequencing the Ha. marismortui rrnA and rrnB operons and for primer extension analysis of the Ha. marismortui ribosomal RNAs.
Table A.3  Plasmids and strains used for the characterization of the rrnA and rrnB operons in Ha. marismortui.

LIST OF FIGURES

Figure 1.1  A possible scheme for early evolution.
Figure 1.2  Compensatory base changes within rRNA secondary structures.
Figure 1.3  The universal phylogenetic tree.
Figure 1.4  The rooted evolutionary tree that relates all known groups of extant organisms.
Figure 1.5  Gene organization and structure of archaeal rRNA transcripts.
Figure 1.6  The secondary structures containing the recognition features for endonuclease activities that excise precursor 16S and 23S rRNAs from the primary transcripts.
Figure 1.7  Halobacterium cutirubrum 5S rRNA.
Figure 1.8  Functional regions in small subunit rRNA.
Figure 1.9  Functional regions in large subunit rRNA.
Figure 3.1  Restriction endonuclease maps of the rrnA and rrnB operons from Ha. marismortui.
Figure 3.2  Genomic Southern hybridization with oligonucleotide oPD 34.
Figure 3.3A  The complete sequence of the rrnA operon from Ha. marismortui is given from 5' to 3' direction.
Figure 3.3B  The complete sequence of the rrnB operon from Ha. marismortui is given from 5' to 3' direction.
Figure 3.4  The gene organization of the rrnA and rrnB operons from Ha. marismortui.
Figure 3.5  Identification of a tRNACys gene using Southern hybridization analysis.
Figure 3.6  Distribution of sequence differences of the pairwise comparisons along the length of the three rRNA operons.
Figure 3.7A  Primary transcript of the rrnA operon of Ha. marismortui.
Figure 3.7B  Primary transcript of the rrnB operon of Ha. marismortui.
Figure 3.7C  Primary transcript of the rRNA operon of Hb. cutirubrum.
Figure 3.8  A series of inverted repeat structures of the 16S 5'-flanking region of the rrnA operon containing the promoter motifs.
Figure 3.9  Alignment of the partial sequences that generate a helix B in the 16S leader regions of some archaeal organisms.
Figure 3.10  Alignment of the 5'-flanking sequences from the Ha. marismortui operons, rrnA and rrnB, and from the Hb. cutirubrum.
Figure 3.11  The alignment of promoters and putative promoter-like sequences from halophilic rRNA operons.
Figure 3.12  Nuclease S1 protection assays within the 5'-flanking regions of the 16S rRNA from the rrnA and rrnB operons of Ha. marismortui.
Figure 3.13  Primer extension analysis of the 5'-flanking regions of the rrnA and rrnB operons.
Figure 3.14  Nucleotide sequences and alignment of the 16S rRNA encoding genes.
Figure 3.15  Predicted secondary structure for the 508-823 domain of the rrnA and rrnB 16S rRNAs from Ha. marismortui.
Figure 3.16  Predicted secondary structure for the 986-1158 domain of rrnA and rrnB 16S rRNAs from Ha. marismortui.
Figure 3.17  Predicted secondary structure for the 58-321 domain of the rrnA and rrnB 16S rRNAs from Ha. marismortui.
Figure 3.18  Ribosomal RNA protection of end-labelled DNA fragments.
Figure 3.19  Alignment of the 16S-23S intergenic spacer sequences from the rrnA and rrnB operons of Ha. marismortui with the single operon from Hb. cutirubrum.
Figure 3.20  Comparison of the internal promoter sequences of the rrnA, rrnB, Hb. cutirubrum and Hc. morrhuae operons.
Figure 3.21  Nuclease S1 protection assay of the primary transcript products within the 16S-23S intergenic spacer region of the rrnA and rrnB operons using DNA probes labelled at their 3'-ends.
Figure 3.22  Nuclease S1 protection assay of the primary transcript products within the 16S-23S intergenic spacer region of the rrnA and rrnB operons using DNA probes labelled at their 5'-ends.
Figure 3.23  Primer extension analysis within the 16S-23S spacer regions of the rrnA and rrnB operons.
Figure 3.24  Nucleotide sequences and alignment of halophilic 23S rRNA encoding genes.
Figure 3.25  Histogram showing the distribution of nucleotide substitutions within the 23S rRNAs from Ha. marismortui, Hb. cutirubrum and Hc. morrhuae.
Figure 3.26  The predicted secondary structure of Domain I of 23S rRNA from Ha. marismortui.
Figure 3.27  Comparison of the 23S-5S intergenic sequences.
Figure 3.28  Comparison of the 5S rRNA genes.
Figure 3.29  The predicted secondary structure for the 5S rRNA from the rrnA operon of Ha. marismortui.
Figure 3.30  Comparison of the 5S distal regions. The 5S distal regions from the rrnA, rrnB, and rrnC operons and from Ha. marismortui.
Figure 3.31  Nuclease S1 mapping analysis of the 5S distal regions from the rrnA and rrnB operons.
Figure 4.1  Phylogeny obtained from the 508-823 domain of the 16S rRNA genes (Tree A), the complete 16S rRNA genes (Tree B), the complete 23S rRNA genes (Tree C), and the combination of 16S-23S-5S rRNAs (Tree D).
LIST OF ABBREVIATIONS

A  Adenosine
A260  Absorbance at 260 nm
AMV  Avian myeloblastosis virus
ATP  Adenosine-5'-triphosphate
bp  Base pair
BPB  Bromophenol blue
BSA  Bovine serum albumin
C  Cytosine
dATP  2'-deoxyadenosine-5'-triphosphate
dCTP  2'-deoxycytidine-5'-triphosphate
ddATP  2',3'-dideoxyadenosine-5'-triphosphate
ddCTP  2',3'-dideoxycytidine-5'-triphosphate
ddGTP  2',3'-dideoxyguanosine-5'-triphosphate
ddTTP  2',3'-dideoxythymidine-5'-triphosphate
dNTP  2'-deoxyribonucleotide-5'-triphosphate (dATP, dCTP, dGTP and dTTP)
DNA  Deoxyribonucleic acid
DTT  Dithiothreitol
EDTA  Ethylenediamine tetraacetic acid
FDM  Formamide dye mix
G  Guanosine
GTP  Guanosine-5'-triphosphate
IPTG  Isopropyl-β-D-thiogalactopyranoside
kbp  Kilobase pair
kd  Kilodaltons
LSU  Large subunit ribosomal RNA
mRNA  Messenger RNA
NTP  Ribonucleotide-5'-triphosphate (ATP, CTP, GTP and UTP)
PAGE  Polyacrylamide gel electrophoresis
PAUP  Phylogenetic analysis using parsimony
RFLP  Restriction fragment length polymorphism
RNA  Ribonucleic acid
RNase  Ribonuclease
RNP  Ribonuclear protein
S  Svedberg unit of sedimentation coefficient
SDS  Sodium dodecyl sulphate
SSC  Standard saline citrate
SSU  Small subunit ribosomal RNA
T  Thymidine
Tris  Trihydroxymethylaminomethane
tRNA  Transfer RNA
x-gal  5-bromo-4-chloro-3-indolyl-β-D-galactoside

ACKNOWLEDGEMENTS

I would like to thank Dr. Patrick Dennis for giving me the opportunity to work in his lab and for his guidance and encouragement throughout this study. I am also indebted to Drs. Peter Candido and Rosemary Redfield for their advice and support. I would also like to thank Dr. George Mackie for his critical reading of this thesis and his comments. A special thanks goes to Deidre de Jong-Wong for all her technical assistance, for proofreading my thesis several times without any hesitation and for all her help, support and friendship.
I thank all my colleagues, Phulgun Joshi, Lawrence Shimmin, Janet Chow, Janet Yee, Luc Bissonett, Peter Durovic, Simon Potter, Daiqing Liao and Josephile Yau, for their help, cooperation and friendship in the lab. I am also grateful to Dr. Lingwood and his students from the Hospital for Sick Children for their help during the final preparation of my thesis. Most of all, I am very grateful to Yogam Aunty, Yoga, Jeyanthy, Devini and the children, Amanda, Sandra and Melinda, for taking care of Meena and for their love and unconditional support during the past few years.

DEDICATION

To Meena, Saeyon, Myl and My Family

CHAPTER 1
General Introduction

1.1 The Evolutionary Record

It is widely acknowledged by scientists from various disciplines that our planet came into existence about 4.5 billion years ago (Walter, 1983). Furthermore, an evaluation of microbial fossil records suggests that the photosynthetic eubacteria emerged 3 to 4 billion years ago (Ernst, 1983). Recently, eleven taxa of cellularly preserved filamentous microbes, which are among the oldest fossils known, were discovered in a bedded chert unit of the Early Archaean Apex Basalt of northwestern Western Australia (Schopf, 1993). This prokaryotic assemblage establishes that trichomic cyanobacterium-like microorganisms were extant and morphologically diverse as early as 3.5 billion years ago. The records further suggest that oxygen-producing photoautotrophy may already have evolved at this early stage of biotic history (Schopf, 1993). On the basis of these findings, it is suggested that life emerged on our planet about 3.5 billion years ago or even earlier. Geological fossil records provide valuable information for reconstructing early evolutionary events.
However, for biologists, the components of the cell also reveal the evolutionary past, particularly the amino acid sequences of its proteins and the nucleotide sequences of its nucleic acids, DNA and RNA. These living records are potentially richer and more extensive than the fossil records. They also reach beyond the time of the oldest known fossils, back to the period when the common ancestor of all life forms existed. Current speculations on the nature of the early genetic information focus on RNA because of its unique physical, chemical and genetic properties. The inherent template properties of RNA enable it to self-replicate using a minus strand intermediate (Gilbert, 1986; Weiner and Maizels, 1987). Furthermore, the discovery of catalytic RNA, as illustrated by the self-splicing group I and II introns, the putative peptidyl transferase activity of the 23S RNA of the ribosome (Noller et al., 1992) and the catalytic activity of the RNA component of RNase P (Darr et al., 1992), suggests the possibility that early catalysis and later information-driven protein synthesis could have been carried out exclusively by RNA. It has been shown recently that a genetically engineered ribozyme is able to function effectively as both catalyst and template in self-copying reactions (Green and Szostak, 1992). The realization that RNA can serve as both genome and catalyst initiated extensive discussion on the role of RNA in the origin of life (Pace and Marsh, 1985; Sharp, 1985; Lewin, 1986), which in turn led to the coining of the phrase "the RNA world" (Gilbert, 1986).

1.2 The RNA World and Beyond

How protein synthesis would have evolved in an RNA world was addressed in the "genomic tag model" of Weiner and Maizels (1987, 1991). In this model, it was proposed that ancient linear RNA genomes possessed 3'-terminal tRNA-like structures, which they called genomic tags.
Like the 3'-terminal tRNA-like motifs of contemporary bacterial and plant RNA viruses (Rao et al., 1989) and possibly animal picornaviruses (Pilipenko et al., 1992), the genomic tag would also have served important roles, such as providing an initiation site for replication and functioning as a simple telomere. Transfer RNAs would have been derived from the 3'-terminal tRNA-like structures which tagged RNA genomes for replication by a replicase made of RNA. Activation of these tRNA-like structures with an amino acid could have been selected to facilitate replication. For example, a variant RNA replicase may have given rise to the first tRNA synthetase that could transfer an activated amino acid to a 3'-terminal tRNA-like structure (Weiner and Maizels, 1987). The demonstration of catalysis of a reaction at a carbon center by an RNA enzyme suggests that the RNA world could have expanded to include reactions of amino acids and other non-nucleic acid substrates prior to the involvement of proteins (Piccirilli et al., 1992). With tRNAs, mRNAs, and ribozymes functioning as tRNA synthetases and peptidyl transferases, it is possible that information-driven protein synthesis could have been carried out entirely by RNA. The first ribosomes, also referred to as protoribosomes, may have been peptide specific (Watson et al., 1987). The dependence of peptide-specific protoribosomes on built-in mRNAs would have significantly limited the variety of peptides that could be made. This limitation would then have provided the driving force for the evolution of a true ribosome capable of using separate mRNA molecules as templates for protein synthesis. The first ribosomal proteins were likely to be small peptides which interacted with rRNA to stabilize its structure and function (Weiner and Maizels, 1987; 1991).
Therefore, the crude translation machinery that gave birth to the RNP world, a world made of both RNA and protein, must have been radically simpler than the ribosome of today. Early in the evolution of the RNP world, very sophisticated catalytic reactions had to be carried out by RNA, with rudimentary proteins possibly assisting the RNA catalysts. As the RNP world became capable of translating longer mRNAs accurately, proteins became more important and began to assume some of the independent activities of RNA. Once the RNP world reached a certain level of sophistication, living systems became so complex that RNA was no longer a suitable genetic material. It is reasoned that RNA is far more vulnerable to "spontaneous" hydrolysis than DNA; the 2'-hydroxyl group of RNA could be involved in the catalytic cleavage of the adjacent phosphoester bond (Watson et al., 1987). This reaction is accelerated by the extreme conditions that might have been unavoidable on the primitive earth (e.g. high pH, high temperatures, and the presence of certain divalent cations from transition metals). The actual conversion from RNA to DNA genomes would have been a complicated process. However, there is a simple and attractive scenario for this conversion. Once the RNP world reached a certain level of complexity, the production of dNDPs by ribonucleoside diphosphate reductase might have enabled the preexisting replication enzymes to copy the RNA into DNA genomes and then to transcribe these DNA genomes back into functional RNA molecules. It is also possible that the enzymatic activities of ribonucleoside reductase (to synthesize DNA precursors), reverse transcriptase (to transcribe the preexisting RNA genomes into DNA), DNA polymerase (to replicate new DNA genomes), and DNA-dependent RNA polymerase (to transcribe DNA segments back into functional RNA molecules), all present in modern organisms, evolved simultaneously.
In the DNA world, right-handed double helical DNA serves as the genetic material and some ribonucleoprotein complexes, such as the ribosome and RNase P, have been retained as important cellular components. The progenote, the common ancestor of all modern life, is likely to have evolved in this DNA world. Figure 1.1 shows a possible scheme for early evolution (Darnell and Doolittle, 1986). After the discovery of split protein genes, it became clear that modern exons sometimes encode relatively large domains of protein structure (Gilbert, 1978). This observation gave rise to the hypothesis that the rate of acquisition of complexity (i.e., the creation of new genes with new functions) may be enhanced in the DNA world by exon shuffling. Such novel exon combinations can produce functional mRNAs because the junctions of the acquired units coincide precisely with the borders between exons and introns (Doolittle, 1985; Patthy, 1985). Other means of creating new genetic information, such as gene duplication, overlapping genes, alternative splicing, and gene sharing, are also important in evolution (Haldane, 1932; Ohno, 1970; Anderson et al., 1981; Perlman and Butow, 1989; Piatigorsky et al., 1988).

1.3 Gene Duplication and Formation of Gene Families

The importance of gene duplication in evolution was first noted by Haldane (1932) and Muller (1935). They suggested that a redundant duplicate of a gene, freed from its functional constraints, acquired divergent mutations and eventually emerged as a new gene with a distinct function. Although there are several other means by which a new gene can arise, gene duplication plays a major role in the evolution of new genes in contemporary biological systems (Ohno, 1970; Anderson et al., 1981; Perlman and Butow, 1989; Piatigorsky et al., 1988).
[Figure 1.1: flow diagram relating evolutionary stages (prebiotic synthesis, evolution of the progenote, cellular and organismal evolution) to molecular and cellular events, from the synthesis of essential building blocks through RNA genomes and the DNA-based progenote to the eubacteria, archaebacteria and eukaryotes]

Figure 1.1 A possible scheme for early evolution (modified from Darnell and Doolittle, 1986).

Two genes are said to be paralogous if they are derived from a duplication event within a genome, but orthologous if they are derived from replication followed by a speciation event. A complete gene duplication produces two identical copies.
How they will diverge varies from case to case. The copies may, for instance, retain the original function, enabling the organism to produce large quantities of the RNA or protein gene products. Alternatively, one of the copies may become a functionless pseudogene. Genetic novelties or new genes may emerge if one of the duplicates retains the original function whereas the other accumulates molecular changes such that, in time, it performs a rather different task. For example, the globin superfamily has experienced all of the evolutionary pathways that can occur in families consisting of repeated sequences. These pathways are: (i) retention of original function (e.g. the two identical copies of the α-gene in humans), (ii) acquisition of new function (e.g. the α- and β-genes), and (iii) loss of function (e.g. the pseudo α-gene or pseudo β-gene). Paralogous multigene families with invariant sequences, like those encoding rRNA or histones, are special cases of gene duplication in which functional uniformity rather than diversity of gene copies is needed by the organism (Ohta, 1991). For example, in E. coli, removal of one or two of the seven rRNA operons slightly reduces the doubling time of the organism and also causes increased expression of the remaining intact copies (Condon et al., 1993). Histone proteins are among the slowest evolving proteins known and their sequences are conserved across species. For example, histone 3 interacts directly with either the DNA or other core histones in the formation of nucleosomes. It is reasonable to assume that very few substitutions can occur without impeding the function of this protein. In addition, histone 3 must retain its strict compactness and its high alkalinity, which are necessary for interaction with the acidic DNA molecule. As a consequence, histone 3 is resistant to molecular changes.
Ribosomal RNA genes are essentially invariant within a species, but between species they show sequence uniformity only within the functionally essential regions of the rRNAs. Variable regions show different mutational rates across species. The primary nucleotide sequence of rRNA is essential for maintaining secondary and higher order structural features and for participation in mRNA recognition, tRNA binding (in the A, P, and E sites), ribosomal subunit association, proofreading, factor binding, antibiotic interactions, translational fidelity and suppression of frameshifting, termination and the peptidyl transferase function (for reviews, see Brimacombe et al., 1990; Noller, 1991). Changes in rRNA gene sequences that are fixed at all loci are usually considered to be nearly neutral or balanced by compensatory changes elsewhere in the molecule (for reviews, see Noller et al., 1990; Brimacombe et al., 1990). The phenomenon of compensatory mutation is a special case of molecular coevolution (or covariation; Dover and Flavell, 1984; Gerbi, 1985). Here a disrupting mutation in one strand of a stem is compensated by a complementary mutation in the other strand, which preserves base pairing at that particular position in the rRNA secondary structure (Figure 1.2). The non-canonical G•U and A∘G base pairs are likely to be the intermediate stages in the transitional and transversional compensatory mutation pathways, respectively. Mutations which cause unpaired bases may form unstable structures. Many secondary structural elements are capable of withstanding the introduction of bulges or non-canonical base pairs, either because they are not destabilized or because they are stabilized by other interactions such as RNA-protein or additional RNA-RNA interactions (Hancock and Dover, 1988; Brimacombe et al., 1990).
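The logic of compensatory substitution described above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function names `pair_type` and `classify_change` and the category labels are hypothetical and do not come from the thesis, and the sketch looks at one helix position at a time, ignoring helix stability and neighbouring pairs.

```python
# Illustrative sketch only: the names and labels below are assumptions,
# not the method used in the thesis.

# Unordered base pairs; frozenset("GC") is the set {"G", "C"}.
CANONICAL = {frozenset("AU"), frozenset("GC")}       # Watson-Crick pairs
NON_CANONICAL = {frozenset("GU"), frozenset("AG")}   # G.U and A.G intermediates


def pair_type(x: str, y: str) -> str:
    """Classify two opposing bases in a helix."""
    pair = frozenset((x.upper(), y.upper()))
    if pair in CANONICAL:
        return "canonical"
    if pair in NON_CANONICAL:
        return "non-canonical"
    return "unpaired"


def classify_change(ref: str, alt: str) -> str:
    """Compare the same helix position in two sequences.

    ref and alt are two-letter strings: the base on the 5' strand
    followed by its partner on the 3' strand, e.g. "GC".
    """
    if ref.upper() == alt.upper():
        return "unchanged"
    changed = sum(r != a for r, a in zip(ref.upper(), alt.upper()))
    kind = pair_type(alt[0], alt[1])
    if kind == "unpaired":
        return "disruptive"
    if kind == "non-canonical":
        return "intermediate"
    # A canonical pair is present in alt: both bases changed means a
    # compensatory double substitution; one base changed means a
    # non-canonical pair in ref was converted to a canonical one.
    return "compensatory" if changed == 2 else "pairing restored"


if __name__ == "__main__":
    # The transitional pathway G-C -> G.U -> A-U, one step at a time.
    for ref, alt in [("GC", "GU"), ("GU", "AU"), ("GC", "AU"), ("GC", "CC")]:
        print(ref, "->", alt, ":", classify_change(ref, alt))
```

Applied to aligned helix positions from two operons, a tally of the "compensatory" class would correspond to substitutions that affect both components of a base pair while preserving the helix.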
In general, it is thought that molecular interaction mechanisms, such as frequent unequal meiotic crossovers and gene conversions, which "correct" one gene sequence against another, are responsible for the concerted evolution of rRNA genes. Recombinant DNA studies with yeast rDNA have shown that frequent unequal crossovers do, in fact, occur within the rRNA genes. When the rate of occurrence of such interactions is high, a uniform gene family (homogeneity) is expected, whereas when it is low, the family acquires variability (heterogeneity) (Ohta, 1991).

[Figure 1.2 labels: in Hb. cutirubrum; in Hf. volcanii; G•U intermediate; A∘G intermediate]

Figure 1.2 Compensatory base changes within rRNA secondary structures. A secondary structure within the 16S rRNA from Hb. cutirubrum is illustrated and the mutational differences observed in the corresponding positions from Hf. volcanii are indicated by asterisks (*). Here, a mutation in one strand of a stem is compensated by a mutation in the complementary strand, which preserves base pairing at a particular position in the rRNA secondary structure. Canonical Watson-Crick base pairs (-) and non-canonical G•U and A∘G base pairs are indicated. The G•U and A∘G base pairs are considered to be the intermediate stages of the transitional and transversional compensatory mutations, respectively.

1.4 Molecular Phylogeny and the Three Kingdom Classification

Ribosomal RNA sequences are the most useful chronometers for defining the genealogical relationships between organisms (Woese, 1987). They are universally distributed and functionally conserved, and the divergence of their primary sequences exhibits relatively good clock-like behavior.
Since different positions in their sequences change at very different rates, most phylogenetic relationships can be ascertained. These molecules are abundant, easy to isolate and sequence, and, owing to their large size, usually provide a sufficient amount of data for statistically reliable analysis. Phylogenetic trees based on rRNA sequences show that life on this planet can be divided into three domains: the eubacteria (or the bacteria), the archaebacteria (or the archaea) and the eukaryotes (or the eucarya) (Pace et al., 1986; Woese and Olsen, 1986; Woese et al., 1990; Pace, 1991). Each of these domains contains two or more kingdoms; for example, the eukaryotes contain Animalia, Plantae, Fungi and a number of others yet to be defined. Figure 1.3 is a universal phylogenetic tree showing the relationships among the primary domains (Woese et al., 1990). The root of the tree is seen to separate the eubacteria from the other two primary groups, making the archaea and eukaryotes specific (but distant) relatives. The position of the root was determined by comparing pairs of paralogous gene sequences (elongation factors Tu and G, and the α- and β-subunits of ATPase) which are thought to have diverged by gene duplication before the divergence of the three primary domains (Iwabe et al., 1989).

[Figure 1.3 labels: Bacteria; Archaea; Eucarya]

Figure 1.3 The universal phylogenetic tree. The tree is in rooted form and shows the three domains (adapted from Woese and Pace, 1993). Branching order and branch lengths are based upon 23 different small subunit (16S and 16S-like) rRNA sequence comparisons. The position of the root was determined by comparing sequences of pairs of paralogous genes that diverged from each other before the three primary lineages emerged from their common ancestral condition (Iwabe et al., 1989). This rooting strategy (Schwartz and Dayhoff, 1978) uses one set of (aboriginally duplicated) genes as an outgroup for the other. The root of the universal tree is seen to lie between the bacteria on the one hand and the common lineage that the archaea and eukaryotes share on the other.

According to Woese et al. (1990), the archaea comprise a monophyletic group which can be divided into two major phylogenetic lineages, the euryarchaeota and the crenarchaeota (Woese, 1987; Achenbach-Richter and Woese, 1988). The euryarchaeota are phenotypically heterogeneous and comprise three methanogenic lineages, the extreme halophiles, sulfur-reducing species (the genus Archaeoglobus), and two types of thermophiles (the genus Thermoplasma and the Thermococcus-Pyrococcus group; Woese, 1987; Achenbach-Richter et al., 1987). Analysis of the small subunit rRNA sequences also indicates that halophilic archaea have emerged from the methanogen group (Woese, 1987). The crenarchaeota comprise all of the sulfur-dependent extreme thermophiles. This classification has been reinforced by analysis of the sequences of RNA polymerase subunits (Puhler et al., 1989b) and other biological molecules.

However, Lake (1988) has argued that archaea are paraphyletic and that they fall into three groups: extreme thermophiles or eocytes, methanogens, and extreme halophiles. The phylogenetic tree constructed by Lake from rRNA sequences is shown in Figure 1.4. According to Lake's tree (the eocyte tree), extreme thermophiles are more closely related to eukaryotes than to other archaeal groups, whereas methanogens and halophiles are closer to eubacteria. The root of the tree, based mainly on small-subunit rRNAs, was determined by evolutionary parsimony. The accuracy and reliability of various tree building procedures are continuously reevaluated (Woese and Olsen, 1986; Lake, 1988), and therefore it is difficult to determine which of the trees discussed above, if any, represents the correct phylogenetic tree.

[Figure 1.4 labels: Eubacteria; Halobacteria; Methanogens; Eocytes; Eukaryotes; scale bar: 10 transversions per 1000 nucleotides]

Figure 1.4 The rooted evolutionary tree that relates all known groups of extant organisms (adapted from Lake, 1988). The five groups are (from left to right) eubacteria, halobacteria, methanogens, eocytes and eukaryotes. The rate-independent technique of evolutionary parsimony (Lake, 1987a), in conjunction with the neighbourliness procedure (Fitch, 1981), was used to construct the unrooted tree. Branch lengths were determined by operator matrices (Lake, 1987b). Eubacteria and eukaryotes (and halobacteria to a lesser extent) are "fast clock" organisms; hence, the lengths of the three long branches have been shortened to fit into the figure. Rooting was based on the simultaneous application of two criteria, parsimony (Fitch, 1977) and the relative rate test (Wilson et al., 1977). The 5% significance level for the most parsimonious rooting is shown by the two horizontal arrows on either side of the root.

1.5 Archaea

1.5.1 Characteristic Features

Although archaea share some properties with the eubacteria and some other properties with eukaryotes, they are considered to be a distinct group (Woese et al., 1983; Lake, 1988; Table 1.1). Archaea are an interesting group from a biological point of view because many are adapted to extreme environments that are high in salt (halophiles), high in temperature (thermophiles and hyperthermophiles), extreme in pH (acidophiles and alkalophiles), or anaerobic (methanogens). The composition of the archaeal rRNAs probably represents the adaptive changes brought about by the extreme niches the organisms came to inhabit (Woese et al., 1984).
In extreme thermophiles, RNAs are rich in G-C base pairs (75-80%) and post-transcriptionally modified nucleotides are five times more prevalent than in the other archaeal organisms (Woese et al., 1984), which probably enhances their thermal stability. The extreme halophilic rRNAs are also rich in G-C base pairs. The methanogens have relatively high levels of guanosines and uridines (Østergaard et al., 1987), resulting in many G•U base pairs (21% in M. thermoautotrophicum 23S rRNA). The primary sequences of all archaeal 16S rRNAs are at least 70% similar, and this suggests that they diverged from a common ancestor between 2.0 and 3.5 billion years ago (Woese, 1987).

Table 1.1 Characteristics of the three domains (Woese et al., 1990). Some distinguishing features of the eubacteria, archaea and eukaryota are listed. Abbreviations are: Chloramphenicol (Cm); Anisomycin (Ani); Kanamycin (Kan); Pseudouracil (Ψ); α-amanitin (Ama) and Rifampicin (Rif). Superscripts S and R (written ^S and ^R) denote sensitive and resistant.

Characteristic | Eubacteria | Archaea | Eukaryota
Cellular organization | anucleate | anucleate | nucleated with organelles
Genome size (bp) | 5x10 to 5x10 | 1.5x10 to 5x10 | 5x10 to 10 (exponents as extracted, in order: 5, 6, 5, 7, 7, 11)
Membrane lipids | ester-linked, straight chain | ether-linked, branched chain | ester-linked, straight chain
Cell walls | peptidoglycan | various but not peptidoglycan | various or none
Ribosomes
(a) rRNA | 5S, 16S, 23S | 5S, 16S, 23S | 5S, 5.8S, 18S, 28S
(b) diphtheria toxin | insensitive | sensitive | sensitive
(c) antibiotic sensitivity | Cm^S Ani^R Kan^S | Cm^R Ani^S Kan^R | Cm^R Ani^S Kan^R
Transfer RNA
(a) TΨC loop | TΨCG | 1-methyl ΨΨCG | TΨCG
(b) 1-methyl adenine | absent | present | present
(c) 5'-end of initiator tRNA | 5'-monophosphate | 5'-triphosphate | 5'-monophosphate
(d) initiator amino acid | N-formyl methionine | methionine | methionine
RNA polymerase
(a) number of types | 1 | 1 | 3
(b) subunits | 5 | 6-13 | 12 or greater
(c) antibiotic sensitivity | Ama^R Rif^S | Ama^R Rif^R | Ama^S (Pol II), Ama^R (Pol I+III), Rif^R
mRNA | uncapped | uncapped | 7-methyl G cap and polyadenylation

1.5.2 Halophiles

The extreme halophiles are aerobic and require a minimum NaCl concentration of 1.5 M, with optimum growth usually in the range of 2.5 M to 4.5 M. Environments inhabited by halophilic archaea contain Na+, K+, Mg++ and Ca++ ions as the predominant cations; the relative concentrations of these ions fluctuate due to their different solubility properties, which vary with changing conditions. These organisms are known to concentrate predominantly K+ ions, to actively extrude Na+ ions, and to balance the overall intracellular cation concentration with that of the external environment (Christian and Waltho, 1962; Ginzburg et al., 1970; Lanyi and Silverman, 1972). Members of the genus Halobacterium produce a unique purple membrane containing the transmembrane protein bacteriorhodopsin, a light-driven proton pump that is used to generate ATP (Stockenius et al., 1979; Stockenius and Bogomolni, 1982). Halophiles exhibit various morphologies; they can be pleomorphic (Haloarcula marismortui) (Oren et al., 1988), box-shaped (Halobacterium sp. GN), rod-shaped (Halobacterium salinarium and its relatives including Hb. cutirubrum and Hb.
halobium), or disc-shaped (Haloferax volcanii) (Mullakhanbhai and Larsen, 1975). Recently, it was shown that two members of the genus Haloarcula, Ha. vallismortis and Ha. hispanica, display complex cellular morphogenesis under unusual growth conditions (Cline and Doolittle, 1992). Morphogenesis of these species appears to be closely related to that of the uncharacterized halophilic archaeal isolate Gp9 (Wais, 1985). These morphological types bear at least a superficial similarity to examples found within the methanogenic genus Methanosarcina (Boone and Mah, 1990). Genetic exchange has recently been demonstrated for the first time in archaea using auxotrophic mutants of Hf. volcanii, a genomically stable species that does not produce purple membrane (Mevarech and Wecyberger, 1985). It is believed that exchange proceeds by way of cell fusion to produce a transient diploid cell.

1.6 Archaeal rRNA Operons

1.6.1 Organization

The structural organization of archaeal rRNA operons exhibits considerable variability. The archaeal species characterized contain a low copy number, ranging from one to four operons per genome (Table 1.2). In most of the sulfur-dependent thermoacidophiles, the 5S gene is unlinked to the 16S-23S rRNA operon and transcribed separately (Kjems and Garrett, 1987; Reiter et al., 1987). Thermoplasma acidophilum is exceptional in that all three genes are unlinked and transcribed separately (Ree et al., 1989; Larsen et al., 1986). The rRNA operons from methanogens and halophiles share a common 5'-16S-23S-5S-3' gene organization that is reminiscent of the rRNA gene organization found in most eubacteria (Jarsch and Bock, 1985; Hui and Dennis, 1985). The primary rRNA operon transcripts also contain a tRNA(Ala) sequence in the 16S-23S intergenic spacer and a tRNA(Cys) sequence distal to the 5S gene.
These tRNA sequences are inefficiently processed from the primary transcript; as a consequence, their stoichiometry relative to ribosomes is considerably less than one (Chant and Dennis, 1986). Eubacterial rRNA operons also contain various tRNA genes in these locations (Brosius et al., 1981).

Table 1.2 Number of rRNA operons in archaea. The number of rRNA operons varies from one in sulfur-dependent thermoacidophiles (Bock et al., 1986), to one to four in methanogens (Neuman et al., 1983; Lechner et al., 1985), and one to probably four in halophiles (Sanz et al., 1988).

Organism | Number of rRNA operons
Halophiles
Halobacterium salinarium NCMB 777 | 1
Halococcus morrhuae NCMB 757 | 2
Haloarcula marismortui | 3
Haloarcula californiae ATCC 33799 | 4
Methanogens
Methanobacterium thermoautotrophicum | 2
Methanothermus fervidus | 2
Methanococcus vannielii | 4
Sulfur-dependent thermoacidophiles
Sulfolobus acidocaldarius | 1
Thermoproteus tenax | 1
Desulfurococcus mobilis | 1

1.6.2 Transcription Signals

The rRNA genes of halophiles and Desulfurococcus mobilis (a thermophilic organism) are generally expressed from multiple promoters located upstream of the 16S rRNA gene (Mankin et al., 1984; Dennis, 1985; Larsen et al., 1986). This complex and unusual promoter arrangement distinguishes halophiles from methanogens and thermophiles, which generally contain only a single promoter. In archaea, the 16S and 23S rRNAs are usually cotranscribed. In halophiles, the 5S rRNA, an intergenic tRNA (usually coding for alanine) and, in some cases, a distal tRNA gene (usually coding for cysteine) are also cotranscribed with the 16S and 23S rRNAs. A weak internal promoter within the 16S-23S spacer region of halophile operons has been reported. It is assumed that the internal promoter rectifies the stoichiometric imbalance between 16S and 23S rRNAs caused by premature transcriptional termination (Mankin et al., 1987; Garrett et al., 1993). In S. acidocaldarius, a cryptic promoter-like sequence in the 16S-23S spacer was reported, but transcripts initiated from this site could not be identified (Durovic and Dennis, 1994).

Transcription initiation has been studied in vitro using purified RNA polymerase and suitable promoter-containing template DNA (Zillig et al., 1985; Gropp et al., 1986; Reiter et al., 1990; Klenk et al., 1992). Characterization of the archaeal transcription signals was initially confined to nucleotide sequence comparisons. Such analyses reveal two conserved sequence elements, one located between positions -38 and -25 (distal promoter element) and the other located between positions -11 and -2 (proximal promoter element). The distal promoter element contains the TATA-like (also called "box A") motif, located approximately 26 nucleotides upstream of the transcription start sites of archaea. By site-specific mutagenesis, this element was shown to be important for the efficiency of transcription and start site selection (Reiter et al., 1990). In some halophiles, another conserved element [consensus T(C or T)CGA] occurs ten nucleotides upstream of the "box A" sequence (Chang et al., 1981; Das Sarma et al., 1984; Dennis, 1985; Hamilton and Reeve, 1985). The similarities in promoter structures parallel those found between the archaeal and eukaryotic RNA polymerases. In particular, the similarities found with polymerase II (Klenk et al., 1992; Huet et al., 1983; Berghofer et al., 1988; Leffers et al., 1989; Puhler et al., 1989a; Puhler et al., 1989b) suggest that the expression signals and transcribing enzymes are conserved between the two primary lineages.

Transcription termination sites are not as well characterized as the promoter sites. Transcription termination sites for the polypeptide-encoding genes of methanogens and for the bop gene of Hb. halobium have been determined by nuclease S1 protection assays (reviewed in Brown et al., 1989).
In most cases, termination occurs at sites that are similar to the rho-independent termination sites in E. coli. In the RNA transcripts, sequences that contain inverted repeat symmetry followed by poly dT regions are found downstream of many archaeal polypeptide-encoding genes. Transcription termination signals downstream of the 16S-23S rRNA and the 5S rRNA operons of D. mobilis are similar to those found in the 16S-23S rRNA operon of Thermoproteus tenax and the distal (of the two) rRNA operon of M. thermoautotrophicum. These terminations occur within, or very close to, the pyrimidine-rich regions which contain CTCCT or CTCCCT sequences (Kjems and Garrett, 1987; Klein et al., 1988). Termination of the transcripts of the sod (superoxide dismutase) family genes found in Halobacterium cutirubrum, Halobacterium sp. GRB, Hf. volcanii and Ha. marismortui, and of the ribosomal protein genes L11e and the tricistronic L1e-L10e-L11e present in Hb. cutirubrum, was located near or within T rich polypyrimidine tracts (May et al., 1987; Shimmin and Dennis, 1989; Joshi and Dennis, 1993). Termination of the rRNA operon of M. vannielii has been shown to occur at the beginning of a TTTTTAATTTT sequence located immediately downstream of the 5S rRNA-encoding sequence (Wich et al., 1986a; Wich et al., 1986b). In most of the characterized termination sites, longer T rich polypyrimidine tracts appear to result in more efficient termination. On the basis of the above examples, it is reasonable to suggest that the initial event of termination in archaea involves T rich polypyrimidine sequences. However, the mechanisms for this event have not been characterized in any of the cases described above. Helical structure in the RNA transcripts does not appear to be an absolute requirement, as it is missing from most termination sites.
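The association of termination with T rich polypyrimidine tracts suggests a simple way of scanning a sequence downstream of a gene for candidate termination regions. The Python sketch below is illustrative only and was not used in the studies cited above. The function name find_t_rich_tracts, the thresholds min_len and min_t_fraction, and the flanking bases in the example are assumptions; only the TTTTTAATTTT and CTCCT motifs are taken from the text.

```python
def find_t_rich_tracts(seq, min_len=5, min_t_fraction=0.6):
    """Return (start, end, tract) for uninterrupted runs of pyrimidines (C or T)
    in a DNA sequence that are at least min_len long and T rich.

    Coordinates are 0-based and end-exclusive. Both thresholds are arbitrary
    illustrative defaults, not experimentally derived values.
    """
    seq = seq.upper()
    tracts = []
    start = None
    # A trailing non-pyrimidine sentinel closes a run that reaches the end.
    for i, base in enumerate(seq + "N"):
        if base in "CT":
            if start is None:
                start = i
        elif start is not None:
            tract = seq[start:i]
            t_fraction = tract.count("T") / len(tract)
            if len(tract) >= min_len and t_fraction >= min_t_fraction:
                tracts.append((start, i, tract))
            start = None
    return tracts


if __name__ == "__main__":
    # Hypothetical downstream region containing the TTTTTAATTTT and CTCCT motifs.
    downstream = "GGATTTTTAATTTTGCACTCCTGA"
    print(find_t_rich_tracts(downstream))
    # [(3, 8, 'TTTTT')]
```

With the default thresholds only the first T run is reported; the CTCCT motif is pyrimidine rich but not T rich, so finding it would require lowering min_t_fraction. A scan of this kind identifies candidate sequences only and says nothing about the termination mechanism, which, as noted above, has not been characterized.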
1.6.3 Secondary Structural Organization of Transcripts

Primary transcripts of all the archaeal rRNA operons exhibit several partially conserved inverted repeats which presumably form stem-loop structures. Figure 1.5 shows putative secondary structures within the regions surrounding the 16S and 23S rRNAs in the primary transcripts of some archaea. Primary transcripts of methanogens and extreme thermophiles form simple structures, which generally include helices A, B, E, and F (Kjems and Garrett, 1990). However, transcripts of extreme halophiles, exemplified by Hb. cutirubrum, form the most complex structures. Three of the stem structures, helices C, D and G shown in Figure 1.5, are exclusive to extreme halophiles. The stem structures C and G, directly preceding the tRNAs and 5S RNAs respectively, may participate in their processing (Kjems and Garrett, 1990). Helix D, in the 16S-23S spacer, overlaps with an active promoter sequence of the DNA. All the putative helices shown in Figure 1.5 are specific to either extreme halophiles or archaea. These helices are defined by their location relative to the processing stems, the tRNA, and the 5S RNA sequences. The transcripts of Thermoproteus tenax and Pyrodictium occultum exhibit exceptionally short spacers lacking helices E and F. The T. tenax transcript forms a bifurcated processing stem that contains the two large rRNAs, where the 5'- and 3'-ends of the mature RNAs lie within the processing stem. This relatively simple structure may reflect the minimal function of the 16S-23S RNA spacer, permitting the rRNAs to associate with their respective ribosomal proteins and undergo maturation in vivo without mutual interference.

[Figure 1.5 panels: Sulfolobus acidocaldarius; Halobacterium halobium; T. tenax; M. autotrophicum; gene labels: 16S, 23S, 5S, tRNA(Ala), tRNA(Cys)]

Figure 1.5 Gene Organization and Structure of Archaeal rRNA Transcripts. In H. halobium and M.
autotrophicum, the organization of genes within the operons resembles that of eubacteria. The other two, sulfur-dependent thermoacidophile strains, represent the extremes of structural variability where the 5S genes are unlinked. In T. tenax, the 16S and 23S processing stems bifurcate from a common stem structure.

All of the long processing stems for the archaeal 16S and 23S rRNAs, with the exception of the 16S RNA stems of S. acidocaldarius and T. acidophilum, contain a well-defined structural motif which serves as the recognition site for the excision endonuclease. This motif consists of two three-base bulges on opposite strands which are precisely separated by four helical base pairs near the center of a much longer helical structure (Figure 1.6). The nucleotides in the bulges are usually A or U, and cleavage within the two bulges results in products with a 5'-hydroxyl and a 2',3'-cyclic phosphate (Thompson et al., 1989). Remarkably, this activity was first characterized as the excision endonuclease that removes intron sequences from the transcripts of archaeal tRNA and rRNA genes (Thompson and Daniels, 1988). The exon-intron junction in these cases forms the requisite "bulge-helix-bulge" motif. The mature 16S and 23S rRNAs are formed by subsequent trimming at the 5'- and 3'-ends.

The rRNA operons in E. coli are similar to those of archaea in that processing occurs within the long processing stems surrounding the 16S and 23S RNAs (Figure 1.6), but the enzymology of precursor processing is somewhat different. In E. coli and related eubacteria, the enzyme RNase III is responsible for these initial excisions. The enzyme recognizes duplex RNA of generally undefined sequence, often with one or more bulged or unpaired nucleotides near the site of cleavage. Cleavage occurs at staggered positions on each of the two strands and produces a 5'-phosphate and a 3'-hydroxyl product (Court, 1993).
Subsequent trimming at the 5'- and 3'-ends is required to produce the mature 16S and 23S rRNAs.

In the sulfur-dependent thermoacidophile S. acidocaldarius, the processing and maturation of 23S rRNA appear to follow the typical archaeal pathway, utilizing a "bulge-helix-bulge" motif within the 23S processing helix (three-base bulges on opposite strands, precisely separated by four helical base pairs) as the substrate for an excision endonuclease. The pathway for processing and maturation of 16S rRNA is distinct and does not involve the "bulge-helix-bulge" motif in the processing stem (within the 5'-flanking region, the bulge motif contains only two nucleotides). Instead, the transcript is cleaved at several novel positions in the 5'-leader and in the 3'-intercistronic sequence. In vitro processing studies showed that the 16S helix is neither used nor required for leader processing. The same studies also showed that the complete maturation of the 5'-end of 16S rRNA can occur in the absence of concomitant ribosome assembly and in the absence of all but the first 72 nucleotides of the 16S rRNA sequence (Durovic and Dennis, 1994).

[Figure 1.6 panels: E. coli rrnB (16S); E. coli rrnB (23S); Hvo tRNA; Hcu]

Figure 1.6 The secondary structures containing the recognition features for endonuclease activities that excise precursor 16S and 23S rRNAs from the primary transcripts. In eubacteria (e.g., E. coli), the processing enzyme RNase III recognizes duplex RNA of generally undefined sequence, often with one or more bulges or unpaired nucleotides near the site of cleavage. Cleavage sites are shown by arrows. The Hvo tRNA (tRNA intron from Hf. volcanii) and Hcu (the 16S processing stem from Hb. cutirubrum) structures represent the archaeal endonuclease recognition motif, which consists of two three-nucleotide bulges on opposite strands separated by four helical base pairs (Thompson et al., 1989). The two structures from E. coli rrnB represent the 16S and 23S processing stem structures.

1.6.4 Secondary Structures of Ribosomal RNAs

In terms of secondary structure, the various archaeal groups exhibit a mixture of unique, eubacterial and eukaryotic features built upon the same structural themes seen in all 5S rRNAs (Fox, 1985). The archaeal 5S rRNA models were categorized into four groups: I, II, III and IV. Group I contains members of the genera Methanobacterium, Methanobrevibacter and possibly Methanococcus. The extreme halophiles and Methanosarcina exhibit typical group II 5S rRNA (Figure 1.7). Group III includes Methanomicrobium and Methanospirillum, and Group IV includes Thermoplasma and Sulfolobus. Table 1.3 provides a summary of the structural features seen in specific archaeal strains, as well as in the eubacterial and eukaryotic cytoplasmic 5S rRNAs (Fox, 1985).

A detailed analysis of the secondary structure of the 16S-like rRNA from the three evolutionary lineages has shown that, in spite of the significant sequence variability, there exists a common secondary structure core which is responsible for its main functions in the ribosome (Mankin and Kopylov, 1981; Maly and Brimacombe, 1983; Woese, 1987). Using E. coli and Hf. volcanii as examples, it was shown that, despite the differences in their primary sequences, the secondary structures of the 16S rRNAs are similar to each other (Woese et al., 1983). In terms of primary structural homology, a comparison of the Hf. volcanii and E. coli 16S rRNA genes gives a value of 58%, which is indeed low compared to the 75% seen in the very distant E.
coli-Zea mays chloroplast (Fox, 1985). Figure 1.8 schematically summarizes the results of a structural comparison encompassing about 100 different small subunit rRNAs (SSU rRNAs) from a wide variety of organisms covering virtually the entire evolutionary spectrum (Raué et al., 1988).

Table 1.3 Summary of characteristic structural features in various 5S rRNAs. The 5S rRNAs are as follows: Eb, consensus eubacterial 5S rRNA (excluding organelles and mycoplasmas); 1, Methanobrevibacter ruminantium; 2, Methanobrevibacter smithii; 3, M. thermoautotrophicum; 4, Methanobacterium bryantii; 5, M. vannielii; 6, Hc. morrhuae; 7, Hb. cutirubrum; 8, Hf. volcanii; 9, Methanosarcina barkeri; 10, Methanogenium marisnigri; 11, Methanogenium cariaci; 12, Methanospirillum hungatei; 13, Methanomicrobium mobile; Ec, consensus eukaryotic 5S rRNA; 14, T. acidophilum; 15, S. acidocaldarius. Structural property refers to the distance between positions 66 and 78 in the eubacterial consensus sequence.

RNA source (columns): Eb | Group I: 1-5 | Group II: 6-9 | Group III: 10-13 | Group IV: 14, 15 | Ec
Structural property (rows):
Helix I looped-out base
Helix II extension
Helix III loop length: 13 for sixteen sources, 12 for one
CGAAC sequence
Extended helix V obviously feasible
Helices I/V or II/V may be coaxial
Helix IV looped-out base
Number of bases between main portion of helix II and beginning of helix IV: 13 17 17 17 16 16 16 16 16 16 16 16 16 16 15 14 15
Total number of bases in helix IV loop and stem: 19/20 20 16 19 18 16 21 21 21 21 21 21 21 21 21 21 21
[Presence (+) and absence (-) entries for the remaining rows are not reproduced; their column alignment is lost.]

Figure 1.7 Halobacterium cutirubrum 5S rRNA (Fox, 1985), a typical group II 5S rRNA. Helix I has a larger number of unpaired bases at its termini than is usual, but the molecule is totally eubacteria-like in its structural features through position 69. Helix IV has a four-base loop, as is typical of eukaryotes. The looped-out base in helix IV is followed by two base pairs instead of the three that characterize the eukaryotes in this region. This feature is best regarded as uniquely archaeal and the relevant position is accordingly numbered 84A, which distinguishes it from the eukaryotic analog, 89E.

[Figure 1.8 labels: tRNA binding; initiation; subunit association; elongation; UGA termination; decoding; initiation (anti-SD)]

Figure 1.8 Functional regions in small subunit rRNA (adapted from Raué et al., 1990). The figure shows a schematic representation of the SSU rRNA structure based on the secondary structural model for E. coli 16S rRNA. Variable expansion segments are numbered consecutively from the 5'-end. Filled circles indicate regions conserved in all classes of ribosomes; open circles correspond to regions conserved in all classes except the mitochondrial ones. Highly conserved nucleotides (present in >90% of all known SSU rRNA sequences) are indicated. Helices are numbered according to Raué et al. (1988). The functional regions are also labelled.

Although the sequences of the 23S rRNAs of eubacteria (E. coli) and archaea (Hb. halobium or Hb. cutirubrum) differ significantly, the major part of their secondary structures is highly conserved (Mankin and Kagramanova, 1986; Raué et al., 1990). Relative to eubacterial rRNAs, small inserts are often found in archaeal rRNAs at positions where larger inserts occur in eukaryotic rRNAs (Leffers et al., 1987). For example, in the 23S rRNA, two helices in domains I and II show archaeal-specific structural variations (Figure 1.9). In the archaea, the size of the inserts ranges from 50-58 nucleotides in domain I and 26-37 nucleotides in domain II, whereas in eukaryotes the lengths of the inserts range from 183-836 and 49-123 nucleotides, respectively. Kingdom-specific nucleotides have been identified that are associated with antibiotic binding sites at functional centers in 23S-like RNAs: at the elongation factor-dependent GTPase center (thiostrepton, domain II) they resemble bacteria, and in the peptidyl transferase center (erythromycin, domain V) and at the putative aminoacyl tRNA site (α-sarcin, domain VI) they resemble eukaryotic large subunit rRNAs (Raué et al., 1990; Jarsch and Bock, 1985; Leffers et al., 1987). Figure 1.9 shows a schematic representation of the LSU rRNA structure based on the secondary structure model for E.
coli 23S rRNA where the universally conserved regions, expansion segments and functional regions are indicated. In two thermophilic archaea, S. marinus and D. mobilis, introns have been discovered in the 23S rRNAs (Kjems and Garrett, 1987; Kjems and Garrett, 1990). These introns, like those of eukaryotes, are located within the functionally important domains IV and V of LSU rRNA. The similarity of the locations of the archaeal and eukaryotic LSU rRNA introns suggests that they are functionally related. In archaea, splicing of the rRNA introns occurs at a "bulge-helix-bulge" motif that forms at the exon-intron junctions (Kjems and Garrett, 1987). This structural feature, first characterized for the intron-containing tRNAs of extreme halophiles and thermophiles (Kjems and Garrett, 1990), also occurs in the processing stems of the archaeal 16S and 23S RNAs (see section 1.6.3).

1.7 Ribosomal RNA Heterogeneity

Distinct populations (or subpopulations) of ribosomes, i.e., ribosome heterogeneity in an organism, may result from a change in one of the ribosomal components during its life cycle. The presence or absence of a new ribosomal protein (r-protein) or rRNA, a unique covalent modification, or a stoichiometric difference in one cell type or developmental stage may all lead to ribosome heterogeneity in eukaryotic organisms (Ramagopal, 1992). Several examples of low level rRNA sequence heterogeneity in a variety of species have been reported. In E. coli, which has seven rRNA operons in its genome, eight sites of sequence heterogeneity have been observed in the analysis of bulk 23S rRNA (Branlant et al., 1981). Three rRNA operons from Rhodobacter sphaeroides have been sequenced; the three 16S genes are identical, whereas the 23S genes exhibit a single nucleotide substitution in one gene and three single nucleotide deletions in the second gene.
The 5S genes exhibit slightly higher heterogeneity; one 5S gene differs at four positions from the other two genes (Dryden and Kaplan, 1990).

Figure 1.9 Functional regions in large subunit (LSU) rRNA (adapted from Raué et al., 1990). The figure shows a schematic representation of the LSU rRNA structure based on the secondary structure model for E. coli 23S rRNA. Variable expansion segments are numbered consecutively from the 5'-end. Filled circles indicate universally conserved regions. Functional regions are also labelled in the diagram.

The mitochondrial genome of Tetrahymena contains two genes encoding large subunit rRNA; the two genes differ at five out of 2595 positions, and both genes are expressed (Heinonin et al., 1990). The nucleotide substitutions discussed above are not present in functionally important positions of the rRNA structures. These heterogeneous gene products are believed to be an intermediate stage in the process of concerted evolution (Hancock and Dover, 1990). A more interesting example has been described in the blood parasite Plasmodium berghei, where two types of 18S rRNA encoding genes, referred to as A- and C-genes (Gunderson et al., 1987), are expressed in gametocytes and sporozoites, respectively. The two A genes and two C genes differ at 72 positions out of 2059 (i.e., 96.5% identical). It has been suggested that this differential expression may play a role in the types of proteins synthesized during different developmental stages of the complex life cycle of Plasmodium berghei. It was shown by Walters et al. (1989) that the switch from A to C gene expression involves the control of rRNA processing. In gametocytes, precursor transcripts from C-type genes are not processed and ribosomes contain predominantly A-type RNA. In the zygote and the early ookinete, transcription and processing of the rRNA from C-type genes are accelerated.
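The identity values quoted for these heterogeneous gene pairs follow from simple arithmetic on the number of differing positions. The short Python sketch below is an illustration only (the function name is hypothetical and not part of the original work); it reproduces the figures for the Plasmodium berghei 18S genes and the Tetrahymena mitochondrial LSU genes.

```python
def percent_identity(differences: int, length: int) -> float:
    """Percent identity of two aligned genes, given the count of differing positions."""
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= differences <= length:
        raise ValueError("differences must lie between 0 and length")
    return 100.0 * (length - differences) / length


# Values taken from the text above.
plasmodium_18s = percent_identity(72, 2059)   # A-type versus C-type genes
tetrahymena_lsu = percent_identity(5, 2595)   # two mitochondrial LSU genes

print(f"P. berghei 18S: {plasmodium_18s:.1f}% identical")
print(f"Tetrahymena LSU: {tetrahymena_lsu:.1f}% identical")
```

Note that this measure treats every position equally; it says nothing about where in the secondary structure the differences fall.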
As C-type ribosomes accumulate, a defined and limited pattern of breakdown of the dominant A-type ribosomes occurs, during which conserved and functionally important sequences involved in the elongation and termination of translation are targeted. By the late oocyst stage, the A-type ribosomes have essentially been replaced by C-type ribosomes. Sequence comparison of the C- and A-type genes showed that the distribution of the differences between the two sequences is not random. There are only four differences in the 3'-domain, whereas considerable variation is found in the middle and 5'-domains of 18S rRNA. The secondary structural analysis suggests that these differences are restricted to regions that are otherwise phylogenetically variable (Gunderson et al., 1987; Woese et al., 193). In eukaryotes, several examples of ribosome heterogeneity caused by ribosomal proteins have also been reported (Ramagopal, 1992). Sanz et al. (1988) reported that the genome of Ha. marismortui contains three ribosomal RNA operons. However, Mevarech et al. (1989) showed that Ha. marismortui contains two unlinked rRNA operons that differ in several respects. These two operons, designated as rrnA and rrnB, were previously cloned as separate genomic restriction fragments. Preliminary characterization by RFLP (Restriction Fragment Length Polymorphism) indicated that the two copies of the 16S and 23S rRNA genes differ at a number of positions. Furthermore, analysis of the 5'-flanking regions indicated that the rrnB operon was unusual in that it contained only a single recognizable promoter and apparently lacked the consensus processing site within the 5'-portion of the putative 16S processing stem.

1.8 Research Objectives

To provide a more complete characterization of the ribosomal RNA operons, rrnA and rrnB, present in the genome of Ha. marismortui, several approaches were taken.
The genomic DNA derived from the progeny of a single cell was probed for the presence of both rrnA and rrnB operons. Then, the nucleotide sequences of the flanking, coding and intergenic spacer regions of the rrnA and rrnB operons were determined and analysed for the conservation of primary sequence and secondary structure. Using structural models, primary transcripts and processing intermediates derived from the two operons were characterized. The rrnA and rrnB gene sequences (rRNAs) were also compared to the corresponding gene sequences from a number of other halophilic archaeal species in order to understand the significance of their sequence divergence from each other and from other halophilic sequences. Using nuclease S1 protection assays, it was shown that the 16S rRNAs from the rrnA and rrnB operons were expressed and present in active 70S ribosomes. To determine the start and end sites of the transcripts and the processing intermediates, in vivo transcript analyses were performed using total RNAs from Ha. marismortui. Finally, the rRNA gene sequences from the rrnA and rrnB operons were aligned to the corresponding gene sequences from a number of other halophiles, a methanogen and a eubacterium (E. coli), and phylogenetic analysis was performed using PAUP (Phylogenetic Analysis Using Parsimony).

CHAPTER 2 Materials and Methods

2.1 Materials

Bacterial culture components (yeast extract, tryptone, casamino acids, and agar) were purchased from Difco Laboratories, ampicillin from Sigma Chemical Co., IPTG from GIBCO BRL, X-gal from Bethesda Research Laboratories or Biosynth AG., and D-glucose from BDH. E. coli strains JM101 and JM109 are available from Promega. Strain DH5α is available from Bethesda Research Laboratories. The vector pBR322 is available from New England Biolabs. The vector phages M13mp18 and M13mp19 are available from Pharmacia. The vectors pGEM-3Zf(+/-), pGEM-5Zf(+/-), and pGEM-7Zf(+/-), and the helper phage R408 were purchased from Promega.
Most of the restriction enzymes and DNA modifying enzymes were purchased from Pharmacia or New England Biolabs (NEB). Enzymes purchased from other sources are: T4 polynucleotide kinase from PL Biochemicals, Pharmacia; modified T7 DNA polymerase (Sequenase) and shrimp alkaline phosphatase (SAP) from United States Biochemical Corp. (USB); AMV reverse transcriptase from Boehringer Mannheim Canada (BMC); lysozyme and ribonuclease A from Sigma; exonuclease III from Promega; T4 DNA ligase and S1 nuclease from Pharmacia; and Klenow fragment from BMC. Deoxyribonucleoside triphosphates, dideoxyribonucleoside triphosphates, and α-phosphorothioate deoxynucleotide triphosphates were obtained from Pharmacia. Radioactive [α-32P]dNTPs, [γ-32P]NTPs and [α-35S]dNTPs were obtained from Dupont New England Nuclear (NEN) Research Products. The T7 and SP6 primers were purchased from NEB. Oligonucleotides used for sequencing, as probes for genes, or for primer extension analyses were obtained from Dr. Carl Woese's laboratory (University of Illinois, Urbana Champaign, USA) or synthesized by T. Atkinson (University of British Columbia) and Dr. Ivan Sadowski's laboratory (University of British Columbia). The synthesized oligonucleotides obtained from UBC were supplied as lyophilized crude powders. Crude oligonucleotides were purified by C18 Sep-Pak reverse phase chromatography and quantified by measuring the A260 (Atkinson and Smith, 1984). Acrylamide and N,N'-methylenebisacrylamide were purchased from BioRad Laboratories, genetic technology grade agarose was purchased from Schwartz/Mann Biotech, cesium chloride was purchased from Cabot chemicals and β-mercaptoethanol (β-ME) was purchased from Matheson Coleman & Bell. All other chemicals were purchased from either BDH, Fisher, or Sigma. Hybond-N nylon membranes and filters, and Hybond M & G paper were purchased from Amersham.
Films for autoradiography (X-Omat and XAR) were purchased from Kodak and films for visualization of ethidium bromide stained DNA were purchased from Polaroid.

2.2 Methods

2.2.1 Media and Culture Conditions

Haloarcula marismortui was grown at 42°C in a rich medium as described by Oren et al. (1988). The medium has the following composition (g/l): NaCl, 206; MgSO4·7H2O, 36; KCl, 0.373; CaCl2·2H2O, 0.5; MnCl2, 0.013 mg/l; and yeast extract, 5.0; pH 7.0. Plates for plating the cells were prepared by adding 15 g/l agar to the medium and autoclaving.

2.2.2 Isolation of Haloarcula marismortui Genomic DNA

For isolation of Ha. marismortui DNA, a 1 L culture was grown to an A600 of 1.0-1.5 and pelleted. The cells were washed in a solution containing 204 g of NaCl and 39.6 g of MgSO4·7H2O per liter. The washed bacteria were resuspended in 100 ml of 10 mM MgCl2-10 mM Tris hydrochloride (pH 7.5) and aliquoted into five tubes, and each tube was extracted twice with phenol. NaCl was added to a final concentration of 0.5 M and the solution was cooled on ice. 40 ml of 95% ethanol was added to each tube and the DNA was collected by spooling on a sealed Pasteur pipette. The DNA was washed twice with absolute ethanol, air dried, dissolved in TE (10 mM Tris-HCl (pH 7.5), 1 mM EDTA), and stored at -20°C.

2.2.3 5'-end Labelling of Oligonucleotides

Oligonucleotides (about 5 pmol) were 5'-end-labelled at 37°C for 40 minutes with 1 unit of T4 polynucleotide kinase and 50 µCi of 3000 Ci mmol⁻¹ [γ-32P]ATP in 20 µl of 1X kinase buffer (1X = 0.1 M Tris-HCl, pH 8.0, 5 mM DTT, 10 mM MgCl2). The reaction was stopped by adding 1 µl of 0.5 M EDTA (pH 8.0) and heating at 65°C for 5 minutes. Carrier tRNA (8 µg) and 80 µl of distilled water were added and the preparation was then precipitated twice with 2.5 volumes of 95% ethanol in the presence of 0.3 M NaOAc and dried.
Each sample was counted by Cerenkov counting and then dissolved in 10-20 µl TE buffer (10 mM Tris-HCl pH 7.5, 1 mM EDTA).

2.2.4 Southern Blot Analysis

DNAs were digested with various restriction endonucleases (as described in section 3.2) and separated by size on agarose gels alongside 32P-labelled size standards. The DNA was denatured in situ by soaking the gel for 30 minutes in denaturing solution (1.5 M NaCl, 0.5 M NaOH). The DNA was then transferred to a nylon membrane by blotting with transfer buffer (1.5 M NaCl, 0.25 M NaOH) for 12 hours. The membrane was washed in 2X SSPE (SSPE: 180 mM NaCl, 100 mM NaH2PO4, 10 mM EDTA (pH 7.7)) and dried at room temperature for an hour. The DNA was crosslinked to the membrane by a 2 minute exposure to UV light. The membranes were prehybridized for one hour in hybridizing solution (5X SSPE, 5X Denhardt's solution (Denhardt's solution: 0.02% w/v BSA, 0.02% w/v Ficoll, 0.02% w/v polyvinylpyrrolidone, 0.5% w/v SDS)) before the addition of the radioactive DNA probe. The hybridization was carried out at 37°C overnight and the membrane was then washed successively with low and medium stringency buffers.

2.2.5 Plasmids and Phage Preparations

Small-scale plasmid DNA purifications were done by using the "Magic™ Minipreps DNA Purification System" from Promega or by the alkaline lysis method (Maniatis et al., 1982). Large scale plasmid preparations were done by the methods described by Maniatis et al. (1982). The E. coli strains DH5α and JM101 were used for plasmid propagation and generation of single stranded phagemids, respectively (Dente and Cortese, 1983). The helper phage R408 was used in conjunction with the pGEM vectors (Promega). Dideoxy sequencing reactions were performed using kits (i.e., Sequenase) that employ modified versions of the procedure described by Sanger et al. (1977).
2.2.6 Gel Electrophoresis

2.2.6.1 Native Gel Electrophoresis

Restriction endonuclease digested DNA fragments were separated electrophoretically through agarose (genetic technology grade) slab gels (0.8% or 1%) or 5% nondenaturing polyacrylamide gels (acrylamide:N,N'-methylene bisacrylamide, ratio of 39:2). The buffer used for electrophoresis was 0.5X TBE buffer (45 mM Tris, 45 mM boric acid, 1 mM EDTA, pH 8.2) or 1X AGB buffer (for agarose only; 20 mM sodium acetate, 40 mM Tris, 1 mM EDTA, pH adjusted to 8.0 with glacial acetic acid) and the electrophoresis was performed at between 100 and 150 volts. In order to visualize the bands, the gels were run in the presence of 0.25 µg/ml ethidium bromide (for agarose only) or were stained with it after electrophoresis. The bands were cut out using a clean scalpel and the DNAs were extracted by using the Gene Clean Kit from Pharmacia (for agarose gels) or electroeluted against 0.5X TBE buffer (for acrylamide gels) and then ethanol precipitated.

2.2.6.2 Denaturing Polyacrylamide Gel Electrophoresis

The DNA products from sequencing reactions, nuclease S1 transcript mapping reactions, and from primer extensions were separated on 0.35 mm denaturing vertical polyacrylamide gels. The gels (ratio of acrylamide to N,N'-methylene bisacrylamide 39:2) were composed of the following: (i) acrylamide at a concentration of 6% or 8%, (ii) 0.5X TBE buffer, (iii) 8.3 M urea, and (iv) the reagents used for polymerization (260 µl (NH4)2S2O8 and 30 µl TEMED) for a total volume of 50 ml. Electrophoresis was carried out at 32 watts constant power. After electrophoresis, gels were dried on to Whatman 3 MM paper and exposed to Kodak x-ray film.

2.2.7 Ligation

For cohesive-end and blunt-end ligations, 40 fmoles of plasmid vector DNA and a 3-fold molar excess of insert DNA were ligated in 10 µl of reaction mix containing 1X ligase buffer (20 mM Tris-HCl, pH 7.6, 5 mM MgCl2, 5 mM DTT, 50 µg/µl BSA) and 0.1 Weiss unit of T4 DNA ligase.
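Because the ligation mix is specified in molar terms, the amounts have to be converted to mass before pipetting. The Python sketch below shows one way to do that conversion. It assumes an average mass of about 650 g/mol per base pair of double-stranded DNA (a common approximation, not a value given in this thesis), and the vector and insert lengths are hypothetical examples.

```python
# Assumption: average molar mass of one base pair of double-stranded DNA.
GRAMS_PER_MOL_PER_BP = 650.0


def fmol_to_ng(fmol: float, length_bp: int) -> float:
    """Mass in ng of `fmol` femtomoles of double-stranded DNA of `length_bp` base pairs."""
    # fmol * 1e-15 mol/fmol * (bp * g/mol/bp) * 1e9 ng/g  ==  fmol * bp * 650 * 1e-6
    return fmol * length_bp * GRAMS_PER_MOL_PER_BP * 1e-6


vector_fmol = 40.0              # from the ligation protocol
insert_fmol = 3 * vector_fmol   # 3-fold molar excess of insert

vector_bp = 3000   # hypothetical vector length
insert_bp = 1500   # hypothetical insert length

print(f"vector: {fmol_to_ng(vector_fmol, vector_bp):.0f} ng")
print(f"insert: {fmol_to_ng(insert_fmol, insert_bp):.0f} ng")
```

The same molar excess therefore corresponds to very different masses depending on insert length, which is why the protocol is stated in fmoles rather than nanograms.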
The incubation was carried out at 16°C overnight.

2.2.8 Transformation

E. coli competent cells were prepared for transformation by the CaCl2 method. The E. coli cells were grown in 100 ml YT medium to an optical density (A500) of about 0.5-0.8 (1 cm path length) and harvested by centrifugation at 4°C at 4000 rpm in the Sorvall centrifuge for 10 minutes. The cells were resuspended in 50 mM CaCl2 (0.5 x the volume of the original culture) and incubated on ice for 40 minutes. The cells were then centrifuged in the Sorvall centrifuge at 4°C at 4000 rpm for 10 minutes, resuspended in 2 ml of 50 mM CaCl2, 15% glycerol, and incubated on ice for 1 hour. The cells were either used directly or pipetted into small aliquots (100 µl), frozen quickly in dry ice, and stored at -70°C for later use. Competent cells were gently mixed with 2-4 fmoles of DNA, incubated on ice for 30 minutes, heat-shocked at 42°C for 45 seconds, incubated on ice for a further 2 minutes, and then directly plated on YT-agar containing ampicillin, X-gal and IPTG.

2.2.9 Exonuclease III Deletions

Bidirectional deletions of insert DNA were constructed in the plasmids pGEM 7+ and pGEM 3+ using exonuclease III (Henikoff, 1987). This technique was used in the sequencing of the 23S rRNAs from the rrnA and rrnB operons where XbaI (gives a 5'-cohesive end) and SphI (gives a 3'-protecting end) enzymes were used. A molar excess of exonuclease III digests DNA in the 3'- to 5'-direction in a time- and temperature-dependent manner. Nuclease S1 was used to digest the remaining single stranded DNA, Klenow fragment along with a mix of the four deoxyribonucleotides was used to fill in any recessed 3'-ends, and T4 DNA ligase was used to circularize the plasmid. The plasmids were transformed into DH5α strains. The colonies were picked randomly and grown in 5 ml of YT medium.
The plasmids were isolated and screened for insert size, and those that contained deletions were sequenced using sequencing kits (Sequenase) that employ a modified version of the dideoxy sequencing method (Sanger et al., 1977).

2.2.10 DNA Sequencing

Most of the 16S rRNA sequencing reactions were carried out using the primers obtained from Carl Woese's laboratory. The 23S rRNA genes, 5S rRNA genes, intergenic spacer regions, and the 5S 3'-flanking regions were sequenced by using either sequence specific primers or exonuclease III deleted template DNAs. The deleted DNAs were sequenced by the dideoxy chain termination method employing either T7 or SP6 primers. Both single- and double-stranded DNA molecules were employed as templates (Sanger et al., 1977; Zhang et al., 1988). For most of the double stranded sequencing, a 7-deaza-2'-deoxyguanosine 5'-triphosphate (c7dGTP) sequencing kit from Pharmacia was used.

2.2.11 Isolation of Total RNA

Total cellular Ha. marismortui RNA was isolated using the boiling SDS-lysis method described by Chant and Dennis (1986). Briefly, cells were rapidly cooled on ice, collected by centrifugation, resuspended, and lysed in an SDS-containing buffer (100°C, for 15-30 sec); RNA was extracted with phenol and precipitated with ethanol. DNA contaminants were removed by ultracentrifugation through a 5.7 M CsCl block gradient. The RNA was resuspended in TE buffer and stored at -20°C.

2.2.12 Isolation of RNAs From Mature Ribosomes

A one liter culture was grown to an A460 of 0.5 and pelleted. The cells were resuspended in 1 ml ribosomal preparation buffer (3.4 M KCl, 100 mM MgCl2, 6 mM 2-mercaptoethanol and 10 mM Tris-HCl, pH 7.6; Shevaik et al., 1985) and disrupted by passage through a French pressure cell. The lysate was centrifuged at 2500 rpm for 5 minutes and again centrifuged at 9500 rpm for 15 minutes to remove debris and unbroken cells.
The supernatant was layered on to a 5-30% sucrose density gradient in ribosomal preparation buffer and spun at 27,000 rpm using an SW27 rotor for 6 hours at 10°C. The fractions containing 30S, 50S, and 70S particles were collected, dialysed against ribosomal preparation buffer for 12 hours, extracted with phenol/CHCl3 three times in the presence of 1% SDS, ethanol precipitated with 2.5 volumes of 95% ethanol, and purified by CsCl centrifugation.

2.2.13 5'-End Labelling of DNA Fragments

Restriction DNA fragments containing 5'-overhang ends were dephosphorylated with SAP (Shrimp Alkaline Phosphatase) at 37°C in a solution containing 10 mM MgCl2, 20 mM Tris-HCl, pH 8.0, for 1 hour. The reaction mixture was then heated for 30 minutes at 65°C to denature the SAP. Once the mixture had cooled down to room temperature, Tris-HCl (pH 8.0), DTT, and spermidine were added to final concentrations of 30 mM, 5 mM, and 0.1 mM, respectively, and then 0.1 units of T4 polynucleotide kinase (T4 PNK) and 50 µCi [γ-32P]ATP were also added. The reaction mixture was incubated at 37°C for 40 minutes. The labelled fragments were precipitated using salt and 2 volumes of 95% ethanol and the radioactivity was measured by Cerenkov counting. If necessary (especially for Maxam and Gilbert sequencing), the fragments were then digested with appropriate restriction enzymes and purified by isolating the product fragments on a 5% nondenaturing gel. After the digestion with restriction enzyme, if the unwanted product fragment was comparatively small, the product fragments were ethanol precipitated after phenol/chloroform extraction and used directly for Maxam and Gilbert sequencing.

2.2.14 3'-End Labelling of DNA Fragments

Restriction DNA fragments containing recessed 3'-ends were end labelled using the Klenow fragment of the E. coli DNA polymerase I and the appropriate [α-32P]dNTP (specific activity of 3000 Ci/mmol, 10 mCi/ml).
Approximately 100-500 ng of the DNA fragments were labelled at the 3'-end in a solution containing 1X Klenow buffer (10 mM NaCl, 10 mM Tris-HCl, pH 7.5, 7 mM MgCl2), 15 µl of 3000 Ci/mmol [α-32P]dNTP and 5-7 units of Klenow at room temperature for 15 minutes.

2.2.15 Maxam and Gilbert Sequencing

Chemical sequencing of DNA fragments was carried out according to the method described by Maxam and Gilbert (1980). The DNA fragment was either labelled uniquely at the 3'-end by Klenow enzyme or labelled at both 5'-ends with T4 PNK and digested internally with a restriction endonuclease to yield product fragments with a uniquely labelled end. The labelled fragments were then purified electrophoretically through a polyacrylamide gel. The Cerenkov count was taken and the DNA was dissolved in dH2O to give 50K cpm/reaction. Since chemical sequencing was only performed to establish a size ladder with nucleotide precision for transcript mapping on DNA probes whose sequences had already been determined, only two sequencing reactions were required. End-labelled DNA was spotted on to strips of Hybond M&G filter for the G reaction and double the amount was used for the A+G reaction. The filters were rinsed twice with dH2O, once with 95% ethanol, and air dried for 3 minutes. The G reaction was performed by treating with 50 mM ammonium formate (pH 3.5) and 0.7% dimethyl sulfate for 40 seconds at room temperature. The A+G reaction was performed by treating with 66% formic acid for 10 minutes. The reactions were terminated by removing the strips from the reaction solutions, washing twice with dH2O, once with ethanol, and air drying. Cleavage and removal of the DNA from the filter was achieved by submersion of the filter in 100 µl of 10% piperidine and heating at 90°C for 30 minutes. After treating with piperidine, the strips were removed and the mixture was lyophilized.
The pellet was lyophilized twice from 40 µl of dH2O and the DNA was finally resuspended in 4 µl of dH2O.

2.2.16 Transcript Mapping

2.2.16.1 Nuclease S1 Protection Analysis of the Total RNA

Nuclease S1 protection of Ha. marismortui rRNA transcripts was performed as follows. Fragments of DNA, labelled at either the 5'- or the 3'-end, were ethanol precipitated in the presence of 1-2 µg of total RNA and resuspended in 4 µl of S1 hybridization buffer (40 mM PIPES (pH 6.8), 400 mM NaCl, 1 mM EDTA) and 16 µl of deionized formamide. The samples were denatured for 15 minutes at 80°C and hybridized to the RNA for three hours at temperatures optimized for G-C content (52°C-59°C). Unhybridized single stranded DNA and RNA were digested by the addition of 300 µl nuclease S1 digestion buffer containing 280 mM NaCl, 30 mM NaOAc (pH 4.4), 4.5 mM ZnCl2, 6 µg nonspecific single stranded DNA, and nuclease S1 (200-500 U/ml). The reaction mixture was incubated for 30 minutes at 37°C. The protected products were isopropanol precipitated, dried under vacuum, resuspended in 4 µl of dH2O, and then 4 µl of FDM was added. The samples were heat denatured and loaded on denaturing polyacrylamide gels alongside the DNA probe and the Maxam and Gilbert sequencing ladder.

2.2.16.2 Nuclease Protection Analysis of the Active 70S Ribosomal RNAs

Nuclease S1 protection experiments were carried out as described above. This experiment was performed to show that the 16S rRNAs derived from both operons (rrnA and rrnB) during exponential growth were present in the active 70S ribosomes (see section 2.2.12). The restriction fragments used for this experiment were the homologous but nonidentical 272 nucleotide long SacII-SmaI fragments (nucleotide positions 463-734 in 16S rRNA sequence) from the Ha. marismortui pHC8 (rrnA) and pHH10 (rrnB) clones and from the Hb. cutirubrum p4W clone. The fragments were 5'-end labelled at the SmaI site (nucleotide position 736) as described above.
Approximately 20-30 ng of the respective fragments (DNA excess; between 10^5 and 10^6 dpm per assay) were hybridized to 200 ng of total RNA (in the case of Hb. cutirubrum) or 50 ng of RNA isolated from 70S ribosomes (in the case of Ha. marismortui) in S1 hybridization buffer for three hours at temperatures between 50°C and 60°C. Hybrids were digested with 200-500 units/ml of nuclease S1 at 32-35°C for 30 minutes, and protected products were analysed for length by electrophoresis in 8% polyacrylamide gels. As a negative control, total tRNA from Saccharomyces cerevisiae was used in place of Ha. marismortui or Hb. cutirubrum RNA to protect the end labelled DNA fragments. In these experiments, the distribution and intensity of partial protection products were sensitive to (i) the hybridization temperature, (ii) the S1 concentration, and (iii) to a lesser extent, the digestion temperature. The above conditions were optimized in order to obtain unambiguous results.

2.2.16.3 Primer Extensions

Primer extension reactions with AMV reverse transcriptase were carried out according to the method described by the manufacturer. A sequence-specific oligonucleotide primer (~2 ng) labelled at the 5'-end was ethanol precipitated with ~5 µg of total RNA, resuspended in 10 µl reverse transcriptase annealing buffer, annealed for 5 minutes at 65°C, cooled slowly to the reaction temperature (37-45°C), and incubated for an additional hour. Primer extensions were carried out by the addition of 10 µl of reaction buffer (10 mM MgCl2, 1 mM dNTPs, and 10 mM β-mercaptoethanol), 5 units of AMV reverse transcriptase, and 5 units of RNase inhibitor, followed by incubation at the appropriate temperature (either 37°C or 42°C) for 30 minutes. After that time, more AMV reverse transcriptase was added and the mixture was incubated for an additional half an hour at 37°C, 42°C, or 52°C.
The reaction was stopped by treating the reaction mixture with 1 µl of RNase A (10 mg/ml), incubating at 37°C for 15 minutes, and then adding the stop mix (10 mM EDTA and 300 mM NaOAc). The products were ethanol precipitated, dried under vacuum, resuspended in 4 µl dH2O and 4 µl FDM, denatured, and run on denaturing polyacrylamide gels alongside a template sequencing ladder.

2.2.17 Sequence Alignment of the 16S 5'-Flanking Regions

The 5'-flanking sequences from rrnA, rrnB and the rRNA operon from Hb. cutirubrum were first aligned using the Clustal V method (Des Higgins, European Molecular Biology Laboratory, Germany). Although the stretch of nearly identical sequences from the three operons (Mevarech et al., 1989; Dennis, 1991) was aligned (Figure 3.10), the conserved sequence motifs within multiple promoter sequences were not recognized by the program. In the rrnA and Hb. cutirubrum operons, these nearly identical sequences extend beyond the primary processing sites at positions -86 and -110, respectively. The rrnB operon also contains the major portion of this conserved sequence between positions -94 and -145; however, it lacks the processing site located at the end of the sequence. In order to get the maximum alignment, the promoter sequences were then aligned manually. The rrnA promoter sequences P1, P2, P3 and P4 were aligned to the Hb. cutirubrum promoters P3, P4, P5 and P6, respectively. The rrnB promoter sequences P and Px were aligned to the Hb. cutirubrum promoters P5 and P2, respectively. When aligning the promoter sequences, the Box A and Box B sequences were aligned first and gaps were then introduced in order to get the maximum alignment.

2.2.18 Molecular Phylogeny Method

The phylogenetic method used in this study is maximum parsimony analysis, or PAUP analysis, with stepwise addition. The PAUP analysis package was created by David Swofford (1993).
The principle of this method is minimum evolution, involving the identification of a tree that requires the smallest number of evolutionary changes to explain the differences observed among the operational taxonomic units under study. Such trees are assumed to be closest to the true tree and are therefore preferred (Fitch, 1971). To use this method, the sequences are first aligned using the Clustal V method (packaged by Des Higgins, European Molecular Biology Laboratory, Heidelberg, Germany). Gaps are then introduced within the sequences in order to optimize the alignment. The gaps represent insertion or deletion events in the sequences that are assumed to have occurred at these sites since their divergence from the common ancestor. Next, "informative sites" on the sequence alignment are determined. A site is considered to be phylogenetically informative only if it favours some trees over others. The minimum number of substitutions at each informative site is then calculated and a tree describing these substitutions is assigned to this site. Finally, the incidence of each of the trees over all of the informative sites is calculated; the one that occurs most frequently is considered to be the most parsimonious and to best represent the evolutionary relationship of the sequences under consideration (Felsenstein, 1988). Since this is a statistical method, there are inevitable errors; for example, underestimation of distances between sequences due to multiple mutations at a single site. In order to estimate the uncertainty in the overall tree elucidation, the data are subjected to resampling analysis. The most widely used resampling method in phylogenetic analyses is "bootstrapping" (Felsenstein, 1988).
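The parsimony steps described above (finding informative sites and counting the minimum number of substitutions a tree requires) can be sketched in a few lines of Python. This is an illustration of the general technique only, not the PAUP implementation: it scores trees by total number of changes (tree length), which is the usual formulation of the criterion, and it uses the counting rule of Fitch (1971) on rooted binary trees. The taxon names, sequences and candidate trees are hypothetical.

```python
from collections import Counter


def is_informative(column: str) -> bool:
    """True if at least two different bases each occur in at least two sequences."""
    counts = Counter(base for base in column if base in "ACGTU")
    return sum(1 for n in counts.values() if n >= 2) >= 2


def fitch_changes(tree, states) -> int:
    """Minimum number of changes at one site on a rooted binary tree (Fitch, 1971).

    `tree` is a nested 2-tuple of taxon names; `states` maps taxon name -> base.
    """
    def walk(node):
        if isinstance(node, str):          # leaf: its state set is the observed base
            return {states[node]}, 0
        left_set, left_cost = walk(node[0])
        right_set, right_cost = walk(node[1])
        shared = left_set & right_set
        if shared:                         # intersection non-empty: no extra change
            return shared, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1

    return walk(tree)[1]


def tree_length(tree, alignment) -> int:
    """Total number of changes a tree requires over all alignment columns."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    return sum(
        fitch_changes(tree, {name: alignment[name][i] for name in names})
        for i in range(n_sites)
    )


# Hypothetical four-taxon alignment (names and sequences are made up).
alignment = {"A": "AAGCT", "B": "AAGTT", "C": "AGACT", "D": "AGATT"}
candidate_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]

columns = ["".join(seq[i] for seq in alignment.values()) for i in range(5)]
print([is_informative(col) for col in columns])
for tree in candidate_trees:
    print(tree, tree_length(tree, alignment))
```

With four taxa there are only three possible unrooted topologies, so all of them can be scored exhaustively; for larger data sets a heuristic such as stepwise addition is needed to search the tree space.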
Bootstrapping consists of resampling with replacement; randomly selected samples in the set (corresponding to sites in aligned sequences) are analysed and returned to the set so that some sites of the sequences under consideration are sampled more than once and some are never sampled. If such resampling is performed frequently enough, it is possible to determine the confidence limits (P values) and hence the significance of any inferred evolutionary branch.

CHAPTER 3 The Gene Organization, Sequence Heterogeneity, Expression and RNA Processing of the rrnA and rrnB Operons From the Halophilic Archaeon Haloarcula marismortui

3.1 Introduction

The initial report on ribosomal RNA heterogeneity in the multicopy rRNA operons from the genome of a halophilic archaeon, Ha. marismortui, was published by Mevarech et al. (1989). The observation of two nonidentical rRNA operons in the aforementioned report poses certain important questions: Why does a unicellular organism need two types of rRNA operons? Are the products of both rRNA genes from the two operons expressed and present in the active ribosomes? Are these operons expressed differentially at different stages of the growth cycle? How did the sequence heterogeneity originate and are there any selective forces which have allowed it to be maintained and propagated? Are there any unique mechanisms involved in the expression and processing of the rRNA genes from these operons? Are the two types of rRNA genes advantageous for the survival of Ha. marismortui? Is the presence of heterogeneous genes in this organism an adaptation to a changing environment? To address some of these questions, a number of experiments were performed. The total number of rRNA operons present in the genome of this organism was determined by Southern hybridization analysis of genomic DNA isolated from progeny of a single cell. Next, complete sequence analysis of the rrnA and rrnB operons was performed.
The two sequences were compared to ascertain the level of heterogeneity at the nucleotide level. Comparisons of the rRNA sequences from related halophiles were also performed to get more information about structurally and/or functionally important sites. The sequences of rRNA genes, spacers, and leaders were subsequently compared with related halophile sequences in order to identify conserved elements that are presumably important for function or expression. The primary transcription products of the rrnA and rrnB operons, and the intermediates produced by a variety of steps (processing and maturation of rRNAs) involved in the assembly of functional ribosomal particles, were studied using nuclease S1 mapping and primer extension analyses. A nuclease S1 protection assay was developed using the rRNAs isolated from active 70S ribosomes with two different DNA probes which are specific for the 16S rRNAs from the rrnA and rrnB operons. This assay was used to demonstrate that both operons are active and their product 16S rRNAs are assembled into active 30S ribosomal subunits.

3.2 Results and Discussion

3.2.1 Number of Operons

Mevarech et al. (1989) reported that Ha. marismortui contains two non-adjacent ribosomal RNA operons, designated as rrnA and rrnB, within its genome. In their analyses, Southern hybridization to a HindIII digest of total genomic DNA identified two bands of 20 kbp and 10 kbp in length. The 10-kbp HindIII-HindIII fragment containing rrnB and an 8-kbp HindIII-ClaI fragment containing rrnA (derived from the 20-kbp band) were cloned separately into the plasmid pBR322 (Mevarech et al., 1989). Since pBR322 is a low copy number plasmid, the 10-kbp HindIII-HindIII fragment and the 8.0-kbp HindIII-ClaI fragment were isolated and subcloned into pGEM 7+ vectors, and using these plasmid DNAs, restriction endonuclease digestions were performed in order to confirm that the two clones are from the rrnA and rrnB operons of Ha. marismortui.
The results are summarized in Figure 3.1 and the Appendix. The initial RFLP characterization of the rrnA (pHC8) and rrnB (pHH10) clones is indicated as A and B, respectively, in Figure 3.1. Digestion of the rrnA and rrnB operons with the same restriction enzymes gave different products [e.g., lanes C+X (A) and C+X (B)], suggesting the presence of polymorphic restriction sites in these operons. In Figure 3.1, most of the bands below 2.8 kbp match the restriction maps shown below (see Appendix); however, some of the restriction sites (for example, XhoI and PstI sites; see Appendix) were not detected in the restriction maps obtained by Mevarech et al. (1989). Also, some of the larger bands gave unexpected sizes. This may be due to overloading of the samples and incomplete digestion of the larger fragments. In addition, sequencing analysis of the 5'-flanking regions of the two operons was performed and the results were compared with the sequences obtained by Mevarech et al. (1989). These results confirmed that the two clones are HC8 (rrnA) and HH10 (rrnB).

Figure 3.1 Restriction endonuclease digestion of the pHC8 (rrnA) and pHH10 (rrnB) clones from Ha. marismortui. The 8.0-kbp HindIII-ClaI fragment from the rrnA operon (A) and the 10-kbp HindIII-HindIII fragment from the rrnB operon (B), subcloned into plasmid pGEM 7+, were used in this analysis. The following restriction enzymes were chosen so that they cut the plasmid at unique positions: ClaI (C), HindIII (H), EcoRI (E), KpnI (K), PstI (P), SmaI (S), XhoI (X) and SphI (Sph). The rrnA (A) and rrnB (B) operons were digested with similar enzymes and each pair of digestion products shows an RFLP [e.g., see H+E (A) and H+E (B)]. The fragments from λ DNA digested with PstI were used as a size marker.

To confirm the presence of the rrnA and rrnB operons in the genome of Ha. marismortui, a Southern hybridization analysis was performed (Figure 3.2). Chromosomal DNA was isolated from progeny of a single cell and digested singly with the following restriction enzymes: HindIII, SmaI, KpnI, PstI, EcoRI, and ClaI. For the double digests, HindIII was used along with one of SmaI, KpnI, PstI, EcoRI or ClaI. An oligonucleotide, oPD 34, which anneals to conserved nucleotides at positions 57-38 of the 16S rRNA genes (sequence obtained from Mevarech et al., 1989), was used as a probe in this analysis.

Figure 3.2 Genomic Southern hybridization with oligonucleotide oPD 34. (a) Genomic Ha. marismortui DNA isolated from progeny of a single cell was digested singly with the following restriction enzymes: HindIII, SmaI, PstI, KpnI, EcoRI and ClaI. For the double digests, HindIII was used along with one of SmaI, KpnI, PstI, EcoRI or ClaI. The digested DNAs were fractionated in a 1% agarose gel along with λ DNA digested with PstI as a size standard. An oligonucleotide, oPD 34, was used as a probe in this analysis (see section 2.2.4). The line diagrams below are the restriction maps of the cloned rrnA and rrnB operons showing the digestion sites of HindIII (H), EcoRI (E), SmaI (S), PstI (P), KpnI (K), XhoI (X), and ClaI (C) (a similar restriction map was obtained by Mevarech et al., 1989). (b) The autoradiogram showing the hybridization pattern of the DNA fragments from different digestion products of the genomic DNA. The lengths of fragments were estimated using the λ PstI fragments as standards. Size markers labelled on the figure: 11.5 kb, 5.1 kb, 2.8 kb, 2.5 kb, 2.1 kb, 2.0 kb, 1.7 kb, 1.2 kb, 1.1 kb and 805 bp.
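The fragment lengths quoted for Figure 3.2 were estimated against the λ PstI standards. One common way to carry out such a calibration is to interpolate log10(fragment size) against migration distance between neighbouring standards. The thesis does not state which procedure was actually used, so the Python sketch below is only an illustration. The standard sizes are the marker labels on Figure 3.2; the migration distances and the function name estimate_size_bp are hypothetical and were introduced here for the example.

```python
import math

# Sizes (bp) of the lambda/PstI standards labelled on Figure 3.2.
STANDARD_SIZES_BP = [11500, 5100, 2800, 2500, 2100, 2000, 1700, 1200, 1100, 805]

# HYPOTHETICAL migration distances (mm from the well), one per standard.
# Illustrative values only; they are not measurements from the thesis.
STANDARD_DISTANCES_MM = [12.0, 24.0, 35.0, 37.5, 41.0, 42.5, 46.0, 54.0, 56.0, 63.0]


def estimate_size_bp(distance_mm,
                     distances=STANDARD_DISTANCES_MM,
                     sizes=STANDARD_SIZES_BP):
    """Estimate a fragment size by piecewise-linear interpolation of
    log10(size) against migration distance between adjacent standards."""
    points = sorted(zip(distances, sizes))
    if not points[0][0] <= distance_mm <= points[-1][0]:
        raise ValueError("distance lies outside the range of the standards")
    for (d1, s1), (d2, s2) in zip(points, points[1:]):
        if d1 <= distance_mm <= d2:
            fraction = (distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + fraction * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise AssertionError("unreachable: distance was range-checked above")


if __name__ == "__main__":
    # A band running between the 2.8-kb and 2.5-kb standards (made-up distance).
    print(round(estimate_size_bp(36.0)), "bp")
```

Interpolating only between neighbouring standards avoids assuming that the whole gel follows a single straight line, which rarely holds for the largest fragments; it also means that bands running outside the range of the standards (such as the >11.5-kbp and >15-kbp bands mentioned in the text) cannot be sized and are rejected rather than extrapolated.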
The digested fragments were transferred on to a Hybond N membrane, hybridized overnight to the oligonucleotide probe at 37°C, and then washed with a low stringency buffer at room temperature. The hybridization pattern shown in Figure 3.2 indicates that in all digestions there are always two strong bands, which correspond to the rrnA and rrnB operons (this is based on the restriction maps of the two operons). In some lanes there is a third band of varying intensity, and in some lanes there are either two or four bands. Digestion with HindIII-SmaI, HindIII-ClaI, HindIII-PstI, SmaI and KpnI gave rise to three bands, suggesting that there are three rRNA operons present in Ha. marismortui genomic DNA. In lane 1, the probe hybridized to three HindIII-SmaI digestion products that correspond to a 2.8-kbp fragment from the rrnB operon, a 2.1-kbp fragment from the rrnA operon, and a 1.1-kbp fragment presumably from a third operon, rrnC. The 1.1-kbp band also appeared in lane 6, where the DNA was digested with SmaI alone. This clearly indicates that the 1.1-kbp fragment is bounded by SmaI sites at each end. In lane 2, the probe hybridized to three HindIII-PstI digestion products corresponding to a 6.0-kbp fragment from the rrnA operon, a 6.6-kbp fragment from the rrnB operon, and a 7.7-kbp fragment presumably from the rrnC operon. In lane 3, the probe hybridized to three HindIII-KpnI digestion products corresponding to a 5.8-kbp fragment from the rrnA operon, a 6.2-kbp fragment from the rrnB operon and an 8-kbp fragment presumably from the rrnC operon. In lane 4, the probe hybridized to four HindIII-EcoRI digestion products, corresponding to a 5.7-kbp fragment from the rrnB operon, a 5.3-kbp fragment from the rrnA operon, a 2.1-kbp fragment presumably from the rrnC operon, and a band of >11.5 kbp that may be due to partial digestion.
The 2.1-kbp fragment is a product of HindIII and EcoRI digestion because the band was not apparent in the single digestions (lanes 8 and 9). In lane 5, the probe hybridized to three HindIII-ClaI fragments, corresponding to a 7.75-kbp fragment from the rrnB operon, an 8.0-kbp fragment from the rrnA operon, and a third fragment which is >15 kbp in length. Digestion with ClaI alone indicates that this >15-kbp fragment may be a ClaI-ClaI digestion product (lane 11). Single digestions with SmaI and KpnI, shown in lanes 6 and 8 respectively (Figure 3.2b), yielded three bands. In lanes 7 and 9, where the DNA was digested with PstI or HindIII, only two bands are apparent; it is presumed that a third band is not visible because two of the three fragments were not resolved.

These results suggest that the presence of a third band in some digestions may have several explanations. First, a third operon, rrnC (see section 3.2.6), is present in the genome of Ha. marismortui. This observation is in agreement with the previous report by Sanz et al. (1988); they performed a Southern hybridization on restriction fragments separated by pulsed-field gel electrophoresis and observed three fragments that hybridized to ribosomal probes specific for 16S and 23S rRNAs (Table 3.1). Second, a recombination product of the rrnA and rrnB operons, present in a small proportion of the population, may represent the third band. The low-intensity bands observed in some lanes (e.g. in the HindIII-SmaI and SmaI digestions) may be an indication of this. Third, there may be some non-specific binding between the probe and a complementary sequence in some part of the genome that does not contain any rRNA operon. The lanes showing four bands may be due to partial digestion of the DNA or non-specific binding of the probe. The lanes showing only two bands may indicate that there are only two operons present in the genome or that two of the hybridizing fragments comigrate.
To confirm these assumptions, other probes have to be used and the third operon has to be isolated from the genomic DNA of Ha. marismortui and characterized. Mevarech et al. (1989) reported that only two operons are present in the Ha. marismortui genome. The experimental procedure employed by us and by Mevarech et al. (1989) was similar (the genomic DNA was digested with HindIII-ClaI). They misinterpreted the third band as a partial digestion product. The basis for their interpretation is attributed to the reduced intensity and coincident mobility of a third fragment produced by HindIII digestion. The third operon has been designated rrnC in this thesis and is assumed to contain the 23S and 5S gene sequences determined by Brombach et al. (1989). Confirmation of this assumption awaits the cloning and sequence characterization of the genomic rrnC locus.

Table 3.1 Summary of pulsed-field gel electrophoresis experiments using chromosomal DNA from different halophilic archaea, digested with different restriction enzymes and hybridized with 23S and 16S rRNA probes (adapted from the data of Sanz et al., 1988).

Microorganism                 Enzymes used (NotI, DraI, SfiI, BamHI)   No. of operons
Ha. californiae ATCC 33799    + + +                                    4
Hf. gibbonsii ATCC 33959      + + +                                    4
Hb. halobium NCMB 777         + + +                                    3
Ha. marismortui               + + +                                    3
Hc. morrhuae NCMB 757         + + + +                                  2
Hb. salinarium CCM 2148       + + + +                                  1

3.2.2 Operon Structures

3.2.2.1 Primary Structure

To study the degree of heterogeneity between the rrnA and rrnB operons, a complete sequence analysis was performed. Several strategies were used in generating the 6171-bp sequence from the rrnA operon (Figure 3.3A) and the 5947-bp sequence from the rrnB operon (Figure 3.3B). These strategies included the use of rRNA sequence-specific primers, and the use of M13, T7 and SP6 primers to sequence small fragments that had been sub-cloned into pGEM and M13 vectors.
Exonuclease III-generated deletions were also made and sequenced using T7 and SP6 primers. About 95% of the sequence was determined by sequencing double-stranded or single-stranded DNA in both directions and 5% in one direction only (within the 5'-flanking region of the 16S rRNA genes, the 23S-5S spacer and the 5S distal region). However, when there were uncertainties, the opposite single strand was sequenced to confirm that region. Special attention was paid to any site exhibiting polymorphism between operons. Figures 3.3A and 3.3B show the complete nucleotide sequences of the rrnA and rrnB operons, respectively. The gene orders and the unique polymorphic restriction sites are also depicted. As in other halophilic archaea, the 16S, 23S and 5S rRNA genes are linked in both operons. Sequence analysis indicates that the rrnA operon contains a tRNAAla gene within the 16S-23S intergenic spacer and a tRNACys gene located distal to the 5S gene. The rrnB operon possesses neither the tRNAAla gene within the intergenic spacer nor the tRNACys gene in the 388-bp region distal to the 5S gene.

Figure 3.3A The complete sequence of the rrnA operon from Ha. marismortui, given in the 5' to 3' direction. The 6171-nucleotide-long sequence (+ strand) comprises the following: 16S leader, 16S rRNA, spacer containing the tRNAAla sequence, 23S rRNA, 23S-5S spacer, 5S rRNA, and 5S distal region containing a tRNACys gene. The rRNA and tRNA gene sequences are underlined and some important restriction enzyme sites are indicated above the respective sequences.

Figure 3.3B The complete sequence of the rrnB operon from Ha. marismortui, given in the 5' to 3' direction. The 5947-nucleotide-long sequence (+ strand) comprises the following: 16S 5'-flanking region, 16S rRNA, 16S-23S spacer, 23S rRNA, 23S-5S spacer, 5S rRNA and 5S distal region. The rRNA gene sequences are underlined and some important restriction enzyme sites are indicated above the respective sequences.
A Southern hybridization analysis was performed to test for the presence of the tRNACys gene in the rrnB operon (Figure 3.4a). An oligonucleotide, oPD 45, specific for the tRNACys sequence in the rrnA operon was used as a probe in this analysis. As a control, a 1.0-kbp EcoRI-EcoRI fragment from the rrnA operon containing the tRNACys gene was used. The probe was hybridized to the DNAs at 42°C, washed first with a low stringency buffer at room temperature and then with a medium stringency buffer at 42°C. The autoradiogram showed that the probe hybridized only to the 1.0-kbp fragment from the rrnA operon (PD 1099) and not to the 1.8-kbp or 2.0-kbp fragments from the rrnB operon (PD 1022). This experiment clearly demonstrates that the rrnB operon does not contain a tRNACys gene sequence in the 5S distal region (Figure 3.4b).

Figure 3.4 Identification of a tRNACys gene using Southern hybridization analysis. Plasmids containing the 5S distal regions of the rrnA (PD 1099) and rrnB (PD 1022) operons were digested with restriction enzymes EcoRI and ClaI-KpnI, respectively, and probed with the [γ-32P]ATP-labelled oligonucleotide oPD 45, which is specific for the tRNACys gene (see section 2.2.4). (a) Gel showing the restriction enzyme digested fragments from plasmids PD 1099 and PD 1022. The fragments generated by digestion of λ DNA with PstI were used as a size standard for this analysis. The positions of the restriction enzyme sites indicated (ClaI, KpnI and EcoRI) are identical to those in the restriction maps in Figures 3.1 and 3.2. (b) Autoradiogram showing the 1.0-kb fragment hybridized to the probe.

Figure 3.5 The gene organization of the rrnA and rrnB operons from Ha. marismortui. The gene organization of the rrnA and rrnB operons was compared with that of a related halophile, Hb. cutirubrum (Hui and Dennis, 1985), and a eubacterium, E. coli (Brosius et al., 1981). The rRNA genes (16S, 23S and 5S) and the tRNA genes (tRNAAla and tRNACys) are shown as solid boxes, and the 5'-flanking regions, intergenic spacers and distal regions are blank. The small boxes within the 5'-flanking regions and spacers, indicated as IR, are the inverted repeat sequences surrounding the 16S and 23S rRNA genes. These sequences are usually involved in the processing of the respective rRNAs. The promoters are indicated as "P" in the 5'-flanking regions of all the operons. Internal promoters are indicated as Pi, within the 16S-23S rRNA spacer regions.

Sequencing and Southern hybridization analysis indicate that the gene orders of the rrnA and rrnB operons from Ha. marismortui are 5'-16S-tRNAAla-23S-5S-tRNACys-3' and 5'-16S-23S-5S-3', respectively (Figure 3.5). The sequence from the rrnC operon (Brombach et al., 1989) contains a portion of the rrnB-like 16S-23S intergenic spacer (254 nucleotides upstream of the 23S rRNA start site) and a 463-nucleotide-long rrnA-like 23S distal region (including the 5S rRNA gene). The 5S distal sequence is insufficient to determine whether the rrnC operon contains a tRNACys gene or not. The gene order of the rrnB operon of Ha. marismortui is unusual because it deviates from the general pattern, 5'-16S-tRNAAla-23S-5S-tRNACys-3', found in halophilic archaea, such as the rrnA operon of Ha. marismortui and the single rRNA operons from Halobacterium cutirubrum (Hui and Dennis, 1985; Figure 3.5), Halococcus morrhuae (Larsen et al., 1986) and one of the two rRNA operons from Haloferax volcanii (Gupta et al., 1983). The second operon from Hf. volcanii contains a tRNAAla gene within the 16S-23S intergenic spacer region but lacks the second tRNA gene, tRNACys, located downstream of the 5S gene (Daniels et al., 1986).
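The gene orders discussed in this section can be written down as ordered lists and compared with the general halophile pattern, 5'-16S-tRNAAla-23S-5S-tRNACys-3'. The Python sketch below is only an illustration of that comparison. The dictionary keys, gene labels and helper names (missing_genes, keeps_reference_order) are assumptions introduced here, and only operons whose gene order is spelled out in the text are included.

```python
# General gene order reported in the text for halophilic archaeal rRNA operons.
GENERAL_ORDER = ["16S", "tRNAAla", "23S", "5S", "tRNACys"]

# Gene orders as described in this section (5' to 3').
OPERONS = {
    "Ha. marismortui rrnA": ["16S", "tRNAAla", "23S", "5S", "tRNACys"],
    "Ha. marismortui rrnB": ["16S", "23S", "5S"],
    "Hb. cutirubrum": ["16S", "tRNAAla", "23S", "5S", "tRNACys"],
    "Hf. volcanii (second operon)": ["16S", "tRNAAla", "23S", "5S"],
}


def missing_genes(order, reference=GENERAL_ORDER):
    """Return the reference genes absent from an operon, in 5'-to-3' order."""
    return [gene for gene in reference if gene not in order]


def keeps_reference_order(order, reference=GENERAL_ORDER):
    """True if the operon's genes occur in the same relative order as the reference."""
    remaining = iter(reference)
    # 'gene in remaining' consumes the iterator up to the match, so each
    # later gene must be found further downstream in the reference.
    return all(gene in remaining for gene in order)


if __name__ == "__main__":
    for name, order in OPERONS.items():
        absent = missing_genes(order)
        print(f"{name}: 5'-{'-'.join(order)}-3'; missing: {', '.join(absent) or 'none'}")
```

Run as a script, this lists both tRNA genes as missing from rrnB and only tRNACys as missing from the second Hf. volcanii operon, while every operon keeps the rRNA genes in the same 16S-23S-5S order.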
To display the spectrum of the phylogenetic differences between halophiles and to identify functionally important elements in their molecular structures, the entire sequences of the rrnA and rrnB operons and of the single operon from Hb. cutirubrum were aligned pairwise and the nucleotide differences within ten-nucleotide intervals were recorded. Hb. cutirubrum was selected for the comparative analyses because it is a related organism and the sequence of the entire operon is available. Gaps (or deletions) were introduced to maintain alignment and were not considered as substitutions. Comparison of two DNA sequences usually cannot tell us whether a deletion occurred in one sequence or an insertion occurred in the other. Therefore, the outcomes of both types of events are collectively referred to as gaps. The distribution of substitutions along the length of the sequences in the three pairwise comparisons is illustrated by histograms A, B, and C in Figure 3.6. The two sequences which were compared are indicated in the upper right-hand corner of Figure 3.6. As a first approximation, regions containing more than five substitutions per increment were considered to be poorly conserved. In the rrnA-rrnB case, most of the substitutions were observed within the 5'-flanking, 3'-flanking and spacer regions, although a substantial number was observed within the rRNA gene sequences. Differences observed within the 16S rRNA gene sequences (positions 180 to 1652 in Figure 3.6) are less uniformly distributed than those in the surrounding regions. With a single exception at position 1216, they are confined to three intervals, bounded by nucleotides at positions 236-501, 688-1003 and 1166-1338 within the 16S rRNA genes. The 23S rRNA genes (positions 2140 to 5050 in Figure 3.6) show fewer substitutions than the 16S rRNA genes, and the substitutions are confined to four intervals, bounded by nucleotides at positions 70-365, 1320-1765, 2320-2412 and 2759-2769 of the 23S rRNA genes. The 5S rRNA genes show only two nucleotide substitutions. In the 3'-flanking region, the rrnA and rrnB operon sequences are identical for approximately 60 nucleotides downstream of the 5S rRNA genes; after that point they are totally divergent. The biological and phylogenetic significance of the substitutions within the 5'-flanking regions, the rRNA genes, the spacers and the distal regions will be analysed in the remainder of chapter 3 and in chapter 4.

Figure 3.6 Distribution of sequence differences of the pairwise comparisons along the length of the three rRNA operons. Pairwise sequence alignments were made between the following sets of operons: rrnA and rrnB of Ha. marismortui; rrnA of Ha. marismortui and Hb. cutirubrum; rrnB of Ha. marismortui and Hb. cutirubrum. Gaps were introduced in the alignment whenever needed. In each comparison, the nucleotide differences between the two sequences within ten-nucleotide intervals were recorded and plotted against nucleotide positions. The two rRNA operon sequences in each pair are indicated on the right. Arrows indicate the peptidyl transferase center region. Abbreviations are as follows: Hma rrnA and Hma rrnB indicate the rrnA and rrnB operons of Ha. marismortui, respectively, and Hcu indicates the single operon of Hb. cutirubrum. The bounds of the 16S, 23S and 5S rRNA sequences are indicated below the X axis.

Unlike the rrnA-rrnB case, the substitutions within the rrnA-Hb. cutirubrum and rrnB-Hb. cutirubrum comparisons are distributed throughout the entire operon. However, in the latter cases, the substitution rates are relatively higher within the regions surrounding the rRNA genes. Within the 16S-23S spacer regions, the rrnA and the operon from Hb.
cutirubrum show fewer substitutions than were observed between rrnB and the operon from Hb. cutirubrum, which is attributed to the presence of nearly identical tRNAAla gene sequences in the first pair of operons. Considering the histograms A, B and C shown in Figure 3.6, the region between positions 4600 and 4800 in the first two and the one between positions 4460 and 4530 in histogram C show very few substitutions. This highly conserved region corresponds to the peptidyl transferase center within domain V of the 23S rRNA.

3.2.2.2 Secondary Structure

Putative secondary structures of the primary transcripts from the rrnA, rrnB and Hb. cutirubrum operons were deduced (Figures 3.7A, 3.7B and 3.7C, respectively) from their primary sequences using the structural model of Kjems and Garrett (1987). Comparison of these structures reveals that the 16S and 23S rRNAs of the rrnA and Hb. cutirubrum operons and the 23S rRNA of the rrnB operon are surrounded by long processing stems, each containing the expected "bulge-helix-bulge" motif. The inverted repeat structure surrounding the 16S rRNA of the rrnB operon differs from the other processing stems in that it does not contain this motif. The spacer (tRNAAla) and the distal (tRNACys) tRNAs are present in the rrnA and Hb. cutirubrum operons and are absent from the rrnB operon. A number of other putative helices, designated A to I, are also present within the primary transcripts (see Figures 3.7A, 3.7B and 3.7C).

Helix A is irregular and can be formed in all archaeal transcripts except for those of D. mobilis and M. vannielii. Although there is no formal phylogenetic support or conserved base pairings among all archaea, the lower part of the helix was aligned for three methanogens and this yielded positive compensatory base changes for 4 base pairs (Kjems and Garrett, 1990). In the transcripts of the three halophilic operons (the rrnA operon of Ha. marismortui (this work) and the single operons of Hb. cutirubrum and Hc.
morrhuae (Kjems and Garrett, 1990; Garrett et al., 1993)), the helix A structures are conserved in that they contain the recognition elements for the most proximal promoter (Figures 3.7A and 3.7C). The AT-rich Box A promoter motifs lie in the terminal loops and the corresponding transcript start sites are located in the unstructured sequence between the helices. These structures can potentially form only in RNA transcripts initiated at a promoter located upstream of the repeat. Any transcripts initiated from this most proximal promoter would lack the conserved helix A at the 5'-end. In each case, extending upstream from helix A, somewhat similar inverted repeats overlap other upstream promoter sequences (Figure 3.8). These repeats vary in their primary sequences and predicted secondary structure in ways such as the length of the repeats and the number of nucleotides between the two halves of the repeat. The rrnB operon transcript appears to be unique, in that it contains a 16S processing site within helix A and the known rrnB promoter sequences are all located upstream of the helix A region.

Figure 3.7 Putative secondary structures from the rrnA, rrnB and Hb. cutirubrum operon transcripts.

Figure 3.7A The putative secondary structure of the primary transcript for the rrnA operon from Ha. marismortui. The two tRNAs, tRNAAla and tRNACys, and a number of putative helices, A to I, are present within the primary transcript and are named according to the nomenclature of Kjems and Garrett (1990). The processing and maturase sites of the 5'-flanking regions that were mapped by nuclease S1 protection assays and confirmed by primer extension analysis are indicated by dots (•). The arrows indicate the processing and maturase sites within the 16S-23S spacer and the tRNACys processing site (only if they were mapped at their nucleotide positions by sequencing). Other sites mapped by S1 nuclease protection assays to approximate nucleotide positions are shown by boxes. The processing and maturase sites are numbered from 1 to 15. The promoter P4 start site is indicated at position -170 in the 5'-flanking region. The box A sequence (UUAA) of the internal promoter, Pi, is shown in the loop of helix D.

Figure 3.7B The putative secondary structure of the primary transcript of the rrnB operon from Ha. marismortui. A number of putative helices, A to H, are shown. The processing and maturase sites mapped at the nucleotide positions are indicated by arrows within the 16S-23S spacer region and by dots (•) in the 5'-flanking regions. The approximate sites identified by nuclease S1 protection assays are indicated by boxes. Sites numbered from 1 to 12 are involved in the processing and maturation of the 16S, 23S and 5S rRNAs. The box A sequence of the putative promoter Pi is shown in the loop of helix D. Helices C1 and C2 replace helix C and the tRNAAla from the rrnA operon.

Figure 3.7C The putative secondary structure of the primary transcript of the Hb. cutirubrum operon (sequence obtained from Hui and Dennis, 1985). The processing and maturation sites mapped by nuclease S1 protection assays (Chant and Dennis, 1986) are shown by arrows. The box A sequence (TTAA) of the internal promoter is located within the loop of helix D.

Figure 3.8 A series of inverted repeat structures of the 16S 5'-flanking region of the rrnA operon containing the promoter motifs. The AT-rich Box A motifs invariably lie in the terminal loop regions whereas the putative transcript start site is located beyond the regions of inverted repeat symmetry. The nucleotide positions of the sequences are indicated. The P4 structure corresponds to helix A in Figure 3.7A.
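The arrangement described for Figure 3.8, an inverted repeat whose terminal loop carries the Box A motif, can also be searched for computationally. The following Python code is a minimal sketch under stated assumptions and is not the procedure used in this work: it accepts only perfect Watson-Crick complementarity between the two arms (the repeats discussed here are imperfect), and the function name, its parameters and the short test sequence are hypothetical.

```python
# Watson-Crick partners for DNA; any other character never pairs.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pairs(x, y):
    """Return True if bases x and y form a Watson-Crick pair."""
    return PAIR.get(x) == y


def find_hairpins_with_motif(seq, motif="TTAT", min_stem=4, max_loop=12):
    """Find perfect inverted repeats whose loop contains `motif`.

    Returns a list of (stem_start, stem_length, loop_sequence) tuples.
    Assumption: arms must pair perfectly; mismatches and bulges are not allowed.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for loop_start in range(min_stem, n - min_stem):
        for loop_len in range(len(motif), max_loop + 1):
            loop_end = loop_start + loop_len  # index of first base after the loop
            if loop_end + min_stem > n:
                break  # not enough sequence left for the 3' arm
            loop = seq[loop_start:loop_end]
            if motif not in loop:
                continue
            # Skip loops whose outermost bases pair: the same hairpin is
            # reported once, with the longest possible stem.
            if pairs(loop[0], loop[-1]):
                continue
            # Extend the stem outward from the loop, one base pair at a time.
            stem = 0
            while (loop_start - stem - 1 >= 0 and loop_end + stem < n
                   and pairs(seq[loop_start - stem - 1], seq[loop_end + stem])):
                stem += 1
            if stem >= min_stem:
                hits.append((loop_start - stem, stem, loop))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence: arms GGCAC / GTGCC around a TTAT loop.
    print(find_hairpins_with_motif("GGCACTTATGTGCC"))
```

Allowing a limited number of mismatches in the stem, or G-U pairs when the search is run on the transcript rather than the DNA, would bring the sketch closer to the imperfect repeats shown in the figure.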
In the case of the rrnA operon transcript, the P4 promoter sequence (see Figures 3.10 and 3.11) is located within helix A; the other three promoter motifs are within regions of inverted repeat symmetry (Figure 3.8). In the four helices depicting these repeats, the TTAT promoter elements are located within the terminal loop.

Helix B is a universal structure in archaea and is clearly evident in the rrnA and rrnB transcripts. This structure is defined by its position and by the specific sequences at the base of the helix (Figure 3.9). The appropriate sequences from some archaeal organisms are aligned in Figure 3.9, where it can be seen that several base pairs in each helix are supported by compensatory base changes. An internal loop, usually observed in helix B structures of halophilic and thermophilic archaea, is present in the rrnA and rrnB operon transcripts (Figures 3.7A, 3.7B and 3.7C). The sequences surrounding the helix B structures of rrnA and rrnB, 5'-AUGGA-helix-UGA-3', are identical in all the halophilic sequences known. These structures may constitute maturation signals or transcription termination-antitermination signals, or may have some other function (Mason et al., 1992; Nodwell and Greenblatt, 1993).

Figure 3.9 Alignment of the partial sequences that generate a helix B in the 16S leader regions of some archaeal organisms. Complementary sequences which generate the putative helices are indicated by arrows over the sequences. A + indicates that the putative complementary nucleotides in the helix are supported by compensatory base changes. The approximate start point of the 16S RNA processing stem is indicated by an arrow at the right. Consensus sequences common to at least 80% of the organisms are given on the last line. Abbreviations used are: Hma rrnA and Hma rrnB refer to the Ha. marismortui rrnA and rrnB operons; Hcu, Halobacterium cutirubrum; Hmo, Halococcus morrhuae; Sso, Sulfolobus solfataricus; Saci, Sulfolobus acidocaldarius; Tf, Thermofilum pendens; and Mva, Methanococcus vannielii.

In halophiles, helix C is always present 5' to the spacer tRNA (Figures 3.7A and 3.7C), suggesting that it may be related either to 5'-tRNA processing by RNase P or to 3'-end processing of 16S rRNA. The rrnB operon, which lacks the tRNAAla, contains two helices designated C1 and C2 at this position.

Helix D is only present in halophilic archaeal organisms, which contain an intergenic, putative promoter sequence. Within this structure, a TTAA motif (Box A) is located in the apical loop region. The number of base pairs involved in the formation of the helix structure varies from species to species. In the rrnA and rrnB operons from Ha. marismortui, the sequences between the TTAA motifs of the Pi and the 5'-ends of the 23S rRNAs are identical. However, the helix D structures are different in that the sequences upstream of the TTAA motifs are different. It was shown that these intergenic promoters found in the rrnA and rrnB operons are active (discussed under section 3.2.5.3). In the thermophilic archaeon, S.
acidocaldarius, an internal promoter-like sequence was observed within the 16S-23S spacer; however, activity was not detected (Durovic and Dennis, 1994).

Helix E, formed from a pair of perfect inverted repeats bordered by the sequences AUGCA and GAAG, is conserved in all halophiles. Helix F, defined as the helix which directly precedes the 23S precursor processing stem, is formed from a pair of long, imperfect, inverted repeats. The conservation of the sequence and secondary structure of helices E and F, along with the primary sequence conservation within their flanking sequences, has been observed in many species (Leffers et al., 1987).

Helix G is always observed in the 5'-flanking region of the 5S rRNA transcript and is assumed to be involved in the processing of 5S rRNA (Kjems and Garrett, 1987). Helix H, found downstream of the 5S rRNA transcript of the rrnA, rrnB and Hb. cutirubrum operons, is followed by poly dT stretches. These structures are similar to the rho-independent termination sites observed in E. coli. In the rrnA operon transcript, another helix structure, helix I, is located downstream of the tRNACys and is followed by a poly dT stretch. Nuclease S1 protection assays indicate that termination occurs at this site (section 3.2.7.4; Chant and Dennis, 1986). Therefore, it is suggested that the transcript reads through the poly dT stretch associated with helix H and the tRNACys and terminates at the poly dT stretch associated with helix I. In Hb. cutirubrum, the sequence beyond the tRNACys is not sufficient to determine whether the transcript can form a structure like helix I (Hui and Dennis, 1985).

3.2.3 The 16S Leader Region

3.2.3.1 Sequence Alignment

The sequences of the 5'-flanking regions of the rrnA (664 nucleotides) and rrnB (720 nucleotides) operons from Ha. marismortui were determined. The two sequences were aligned and compared with the Hb. cutirubrum sequence (714 nucleotides) as described in section 2.2.17 (Figure 3.10).
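Pairwise comparisons of aligned sequences of the kind used in this chapter reduce to two simple counts: the percentage of identical positions, and the number of substitutions within fixed intervals (ten nucleotides in section 3.2.2). The Python code below is a minimal sketch of both, not the software described in section 2.2.17. It assumes that the two sequences are supplied already aligned, with "-" marking gaps, that gapped columns are excluded from the identity calculation (the treatment of gaps in the reported percentages is not stated, so this is an assumption), and that intervals are counted over alignment columns. The function names and the example sequences are hypothetical.

```python
def percent_identity(aln_a, aln_b, gap="-"):
    """Percentage of identical bases over columns where neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == gap or b == gap:
            continue  # gaps are neither matches nor substitutions
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def substitutions_per_window(aln_a, aln_b, window=10, gap="-"):
    """Count substitutions (gaps excluded) in consecutive windows of alignment columns."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    a_up, b_up = aln_a.upper(), aln_b.upper()
    counts = []
    for start in range(0, len(a_up), window):
        columns = zip(a_up[start:start + window], b_up[start:start + window])
        counts.append(sum(1 for a, b in columns
                          if a != gap and b != gap and a != b))
    return counts


if __name__ == "__main__":
    # Hypothetical 15-column alignment with one substitution and two gaps.
    seq_a = "ACGTACGTAC-GTTA"
    seq_b = "ACGAACGTACCGT-A"
    print(round(percent_identity(seq_a, seq_b), 1))
    print(substitutions_per_window(seq_a, seq_b))
```

Under the threshold used for Figure 3.6, a window would be flagged as poorly conserved when its count exceeds five.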
The alignment shows that the rrnB operon differs from the rrnA and Hb. cutirubrum operons in several respects. The rrnA and Hb. cutirubrum operons contain multiple promoters which are active (this work; Mevarech et al., 1989; Chant and Dennis, 1986; Mankin and Kagramanova, 1986); however, only one major promoter activity (P) has been observed so far in the case of the rrnB operon (activity of the Px promoter was not investigated). The 75 to 80 nucleotides preceding and including the processing sites are about 90% identical in the case of the rrnA and Hb. cutirubrum operons. The rrnB operon contains the preceding sequence but lacks the processing site sequence (GTGACA sequence located within the "bulge-helix-bulge" motif of the secondary structure; Mevarech et al., 1989; Dennis, 1991). In the rrnB operon, the sequence associated with a putative precursor processing site, CGAGGCC, is located at position -172 within the helix A of the universal secondary structure. A related sequence, CGAAGCC, is present in the rrnA operon transcript at the corresponding position within the helix A but no cleavage has been detected at this site.

Figure 3.10 Alignment of the 5' flanking sequences from the Ha. marismortui operons, rrnA and rrnB, and from Hb. cutirubrum. The methods used in the alignment are described in section 2.2.17. The nucleotide identities shared between two sequences (rrnA-rrnB, top row and rrnA-Hcu, bottom row) are indicated by dots (•). The AT-rich putative promoter motifs (Ps), promoter start sites, and the processing sites are under and/or over-lined (see Figures 3.7A, 3.7B and 3.12). The nucleotide position at the end of each row is indicated on the right and the gaps are not included as nucleotides.

To analyze the substitution rates within the 5'-flanking regions of the rrnA, rrnB and Hb. cutirubrum operons, pairwise comparisons were made using the available sequences from each operon. When entire sequences were compared between the rrnA-rrnB 16S 5'-flanking regions (597 nucleotides), they show 49.2% identity; between position -1 and the P1 promoter sequence of the rrnA operon (position -484), they show 53.2% identity. In the rrnA-Hb.
cutirubrum comparison, the entire sequences from both operons show 55.8% identity; between position -1 and the P1 promoter sequence of the rrnA operon (position -484), they show 59.6% identity. In the rrnB-Hb. cutirubrum comparison, the 599 nucleotides show 51.4% identity; between position -1 and the Px sequence of the rrnB operon (position -500), they show 53.6% identity. These results indicate that the 5'-flanking region of the rrnA operon is more similar to the Hb. cutirubrum operon than to the rrnB operon; between positions -1 and -484 of the rrnA operon, rrnA and Hb. cutirubrum are almost 60% identical. Sequence comparisons also indicate that the rrnB sequence is more closely related to the Hb. cutirubrum sequence than to the rrnA sequence.

3.2.3.2 Promoters

The rrnA operon of Ha. marismortui contains four tandem promoter sequences, P1, P2, P3, and P4, located upstream of the 16S rRNA gene (see section 3.2.3.4, and Mevarech et al., 1989). In the rrnB operon a single promoter, P, and a promoter-like sequence, Px, were identified (see section 3.2.3.4, and Mevarech et al., 1989). All six promoter sequences from the rrnA and rrnB operons were aligned with promoters from the single rRNA operons of Hb. cutirubrum and Hc. morrhuae (Figure 3.11). Three regions with high nucleotide sequence similarity are apparent. The first is centered around the transcription initiation site; the majority of the transcripts initiate at a G residue and when the start nucleotide is not G, the promoter is generally low in efficiency (Dennis, 1985; Chant and Dennis, 1986; Mankin and Kagramanova, 1986). In Ha. marismortui, the weak P4 promoter from the rrnA operon and the putative Px promoter from rrnB start with a C residue; all the others start with G. The second conserved element is a well defined AT-rich box (Box A) 25-38 nucleotides upstream from the first transcribed nucleotide. This feature may be the equivalent of the eukaryotic TATA box, located about 25 nucleotides upstream of the transcription start site in RNA polymerase II promoters (Reiter et al., 1988). The distance between the two consensus promoter elements (Boxes A and B) is approximately the same in archaea and eukaryotes (Reiter et al., 1990). The third element is a TYCGAYG sequence at the -40 position from the start site which appears to be unique to halophiles (Mankin and Kagramanova, 1986).

Figure 3.11 The alignment of promoters and putative promoter-like sequences from halophilic rRNA operons. Seventeen rRNA promoter sequences from halobacterial species are summarized. Ha. marismortui (Hma) contains two operons, designated rrnA and rrnB, containing four and two tandem promoters respectively. Hb. cutirubrum (Hcu) has a single operon with eight tandem promoters. Hc. morrhuae (Hmo) has a single operon with three apparent tandem promoters. The start sites within the highly conserved Box B and the other conserved motifs, designated Box A (at position -30) and the halophilic-specific TYCGAYG motif (at position -40), are aligned. Consensus nucleotides within the rRNA promoters are also presented: Y, pyrimidine; R, purine; W, A or T; S, G or C. The relative strengths of the promoters based on the intensity of the bands from the S1 nuclease analysis (see section 3.2.3.4) are indicated by +, ++, +++ and ++++.

Pairwise alignments of the promoter sequences (80 nucleotides in length) from the rrnA and rrnB operons were performed and the percentage similarities from each comparison are summarized in Table 3.2. The percentage similarities given in Table 3.2 show that the promoters P2 and P3 from the rrnA operon are more closely related than the rest of the pairs. Among the four rrnA promoters, the weak promoter P4 is the least similar. The P and Px sequences from the rrnB operon are more similar to each other than they are to the four promoters from the rrnA operon. A pairwise comparison of the rrnA and rrnB promoter sequences with those of the rRNA promoters of Hb. cutirubrum and Hc.
morrhuae revealed that they can be grouped into three subsets (data not shown). The P1, P2 and P3 sequences from the rrnA operon are related to the P2, P3, P4 and P5 sequences from Hb. cutirubrum. The P4 of rrnA, POA and P6 of Hb. cutirubrum, and promoters P1, P2 and P3 of Hc. morrhuae are related and can be grouped as a subset. The P and Px sequences from rrnB are related to the POB and P1 promoters of Hb. cutirubrum.

Table 3.2 The percentage similarities between the promoter sequences from the rrnA and rrnB operons of Ha. marismortui. Pairwise alignments of the promoter sequences consisting of 80 nucleotides were compared and the percentage similarities between each pair were recorded. Promoters from the rrnA operon are indicated as rrnA P1, rrnA P2, rrnA P3 and rrnA P4, and from rrnB as rrnB P and rrnB Px.

            rrnA P1   rrnA P2   rrnA P3   rrnA P4   rrnB P
rrnA P2       66
rrnA P3       66        73
rrnA P4       45        56        57
rrnB P        54        38        50        52
rrnB Px       58        45        51        55        64

It is envisaged that the presence of multiple promoters in the halophiles described above provides the flexibility to maintain transcription under different salt conditions, ranging from 1.5 M to 5 M NaCl (Christian and Waltho, 1962; Ginzburg et al., 1990; Lanyi and Silverman, 1972). Perhaps the presence of template-specific multiple transcription factors, i.e., DNA-binding proteins which recognize short nucleotide sequences on different promoters, enables efficient transcription. An alternate possibility is that at different salt concentrations, the DNA-dependent RNA polymerases present in Ha. marismortui, Hb. cutirubrum and Hc. morrhuae respond differently in their DNA-binding ability. Louis and Fitt showed that the salt response of the DNA-dependent RNA polymerase of Hb. cutirubrum was dependent on the type of DNA template supplied (linear, circular and supercoiled DNAs; Louis and Fitt, 1971a; Louis and Fitt, 1971b; Louis and Fitt, 1972a; Louis and Fitt, 1972b).
Louis and Fitt indicated that the DNA-dependent RNA polymerase of Hb. cutirubrum is a dimer containing single alpha and beta subunits (Louis and Fitt, 1971b; Louis and Fitt, 1972a; Louis and Fitt, 1972b). However, Zillig et al. (1985) later showed that Hb. cutirubrum RNA polymerase contains eight components, A, B', B", C, D, H, I and J. Louis and Fitt had shown that the beta subunit of DNA-dependent RNA polymerase is responsible for chain initiation and that in a primed reaction, catalyzed by the alpha subunit, the template-dependent salt effects were no longer observed (Louis and Fitt, 1972b). These findings suggest that the template specificity of the RNA polymerase of Hb. cutirubrum might reside in a site which recognizes short DNA sequences. On the basis of the above discussion, an explanation for the presence of different promoter sequences would be that the beta subunit carries two or more initiation sites with different salt dependent properties. Furthermore, since there is only one active site in the beta subunit, the effect of salt concentrations could change the conformation of the protein, thereby resulting in the recognition of different nucleotide sequences.

With the exception of the rrnB operon of Ha. marismortui, all other halophilic promoter elements contain an inverted repeat sequence surrounding the Box A element (see section 3.2.2.2). The inverted repeat sequences containing all four promoter motifs from the rrnA operon are shown in Figure 3.8. In all four cases, the TTAT promoter element is present in the region between the repeats. There is little apparent conservation in either the sequence or lengths of the repeats or the distance separating the repeat sequences. The functional role of these structures was not investigated. Inverted repeat sequences within promoters were also observed in eukaryotic and eubacterial organisms (Vogeli et al., 1981; Orosz and Adhya, 1984; Majumdar and Adhya, 1984; Dunn et al., 1984).
Larsen and Weintraub (1982) have suggested that, in the chick α2 collagen gene promoter, the inverted repeat sequences might provide a recognition site for trans-regulatory proteins and that, once bound, these proteins would inhibit nucleosome formation in that region of DNA. Inverted repeat sequences were also observed in the E. coli promoters of the gal (Orosz and Adhya, 1984; Majumdar and Adhya, 1984) and ara (Dunn et al., 1984) genes. The significance of these inverted repeat sequences was not studied in detail.

3.2.3.3 Primary Transcript Analysis of the 16S Leader Region

Protection of end-labelled DNA fragments from S1 nuclease digestion by precursor and product rRNA transcripts was used to identify the positions of transcription initiation sites, processing sites, and 16S maturation sites within the 5'-flanking region of the two rRNA operons from Ha. marismortui. To investigate the 5'-end sites of the transcripts within the 5'-flanking region, restriction fragments were obtained from plasmids PD 927 (rrnB operon) and PD 928 (rrnA operon) digested with AflIII and used as probes. The enzyme AflIII cleaves at position 95 within the 16S gene of both operons, at position -544 in the 5'-flanking region of rrnA, and at position -331 in the 5'-flanking region of rrnB. The 638 nucleotide long AflIII-AflIII fragment from rrnA (Figure 3.12A) and the similar 425 nucleotide long fragment from rrnB (Figure 3.12B) were 5'-labelled on the minus strand at position 95 within the 16S gene, hybridized to Ha. marismortui total RNA, and digested with S1 nuclease. Protected DNA fragments were analyzed on an 8% polyacrylamide gel. With both probes, the shortest protected fragment had a size of about 95 nucleotides, corresponding to protection by the mature 5'-ends of the 16S rRNAs; the AflIII site is conserved within the 16S rRNA genes of both operons.
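The relation between the length of a protected fragment and the position of an RNA 5' end, which underlies the mapping results reported in this section, can be sketched as follows. This is only a minimal illustration: it assumes that the probe label sits at position 95 of the 16S gene, that positions are numbered without a position 0 (so that -1 is followed directly by +1), and that S1 nuclease trims the probe exactly at the RNA 5' end. The function name and constant are hypothetical and are not taken from the methods.

```python
# Hypothetical helper: expected S1-protected fragment length for an RNA
# whose 5' end lies at `five_prime_end`, numbered relative to the first
# nucleotide of the mature 16S rRNA (assumption: there is no position 0).
LABEL_POSITION = 95  # AflIII site within the 16S gene, where the probe is labelled


def protected_length(five_prime_end: int, label_position: int = LABEL_POSITION) -> int:
    if five_prime_end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if five_prime_end > 0:
        # 5' end inside the gene: count from that position to the label
        return label_position - five_prime_end + 1
    # 5' end upstream of the gene: upstream nucleotides plus those in the gene
    return label_position + abs(five_prime_end)


# Mature 5' end (position 1) and two upstream sites used as examples
for site in (1, -96, -172):
    print(site, protected_length(site))
```

Under these assumptions a 5' end at position 1 gives the 95 nucleotide fragment described above, and upstream ends give correspondingly longer fragments; real gels resolve these sizes only approximately.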
Multiple bands observed in this region may be due to imprecise maturation of the 5'-end of 16S rRNA or imprecise trimming of the protected DNA by S1 nuclease. With the rrnA probe, additional major protected fragments of approximately 191, 265, 376, 454, and 535 nucleotides were observed. The end of the 191 nucleotide fragment corresponds to the precursor processing site at position -96, and the ends of the other four fragments correspond to the transcription initiation sites, which mapped by S1 nuclease analysis at or near the G residues at positions -170, -281, -359, and -440, respectively. With the minus strand DNA 5'-labelled at the AflIII site on the 425 bp fragment from the rrnB operon, protected fragments of 95, 267, 349, and 425 nucleotides were observed. These products are believed to represent protection by (i) the mature 5'-end of the 16S rRNA; (ii) the precursor processing site (at position -172); (iii) the promoter start site (at position -254); and (iv) the full length product corresponding to a transcript initiated upstream of the AflIII site within the 5'-flanking region, respectively (Figure 3.12B). The minor band X may correspond to another processing site located upstream of the promoter P (since no putative promoter sequences are present within this region, this band is unlikely to be a promoter start site). The band Y corresponds to a contaminating DNA fragment from within the conserved region of the 16S genes, which ran close to the 425 nucleotide probe during isolation. Further purification of the probe was not needed since this fragment is identical in the rrnA and rrnB operons and would not affect the S1 analysis within the 5'-flanking region.

Figure 3.12 Nuclease S1 protection assays using probes from the 5'-flanking regions of the rrnA and rrnB operons of Ha. marismortui.

Figure 3.12A A line diagram showing the 5'-flanking region of the 16S rRNA gene of the rrnA operon (from plasmid PD 928). The two AflIII sites within the 16S rRNA gene and the 5'-flanking region were used in the isolation of the probe for the nuclease S1 protection assay (see section 2.2.16.1). The positions where the probes were labelled with γ-32P ATP are indicated by dots (•). The sizes of the protected fragments and their corresponding sites are shown below the operon structure.

Figure 3.12B A line diagram showing the 5'-flanking region of the 16S rRNA gene of the rrnB operon (from plasmid PD 927). The two AflIII sites within the 16S rRNA gene and the 5'-flanking region were used in the isolation of the probe for the nuclease S1 protection assay (see section 2.2.16.1). The positions where the probes were labelled with γ-32P ATP are indicated by dots (•). The sizes of the protected fragments and their corresponding sites are shown below the operon structure.

Figure 3.12C The autoradiogram showing the nuclease S1 protection assay products from the 5'-flanking region of the rrnA operon. Above each lane, the rrnA probe, S1 product (S1 nuclease digestion product of the annealed probe with total RNA), and marker (pBR322 plasmid digested with MspI and labelled at the 3'-end with α-32P dCTP) are indicated. The sizes of the protected fragments from the S1-treated DNA-RNA hybrids and of the markers are indicated on both sides of the autoradiogram.

Figure 3.12D The autoradiogram showing the nuclease S1 protection assay products from the DNA-RNA hybrid of the rrnB operon. Above each lane, the rrnB probe, S1 product (S1 nuclease digestion product of the annealed probe with total RNA), and marker (pBR322 plasmid digested with MspI and labelled at the 3'-end with α-32P dCTP) are indicated. The sizes of the protected fragments from the S1-treated DNA-RNA hybrids and of the markers are indicated on both sides of the autoradiogram.

Since S1 nuclease analyses are only semi-quantitative, the band intensities do not indicate the exact amounts of RNA present in the cells. These analyses are done mainly to detect the transcription initiation, termination or processing sites of the transcripts. In the case of rRNA transcripts, we would usually expect only about 1-2% of the rRNA to be in the form of very long precursors and most of the rRNA to be in the mature form (King and Schlessinger, 1983). However, this is not what we observed in the S1 nuclease analysis shown in Figure 3.12; the intensities of the bands representing the mature 16S rRNAs (Figures 3.12C and 3.12D) are only about 3 or 4 times stronger than those of the bands representing the processed 16S rRNAs. This reflects non-quantitative aspects of the S1 assays. First, the efficiency of hybridization between rRNA and the DNA probe may depend on the lengths of the rRNAs; longer RNAs may hybridize more efficiently than shorter RNAs and may displace shorter RNAs from the DNA probe. Second, the conditions used in the S1 analysis (especially temperature) may not have permitted the mature rRNA to hybridize efficiently.
Third, mature rRNAs may be folded into a secondary structural conformation and may not be readily available to hybridize with the DNA probe.

Primer extension analyses were performed in order to confirm the end sites of the products identified in the S1 nuclease assays (the approximate positions of the end sites had already been determined using DNA restriction fragment ladders, as shown in Figures 3.12A and 3.12B). The 267 and 349 nucleotide fragments observed in the S1 mapping analysis were confirmed and precisely positioned by primer extension analysis using an rrnB-specific oligonucleotide, oPD 48. This oligonucleotide is complementary to a sequence from -96 to -118 within the 5'-flanking region of the rrnB operon. The smallest extension product terminates at an A residue at position -172 and corresponds to the major processing site in the 5'-flanking region (Figure 3.13). The second product terminates at a G residue at position -253, which corresponds to the transcription initiation site from the promoter P (data not shown). The precise positions of the 95 nucleotide fragments observed in the S1 nuclease analysis (Figures 3.12C and 3.12D) were also confirmed by primer extension analysis using oPD 34 (complementary to positions 57 to 38 within the 16S genes of the rrnA and rrnB operons). The major extension products terminated at an A residue at position 1 of the 16S rRNA genes (data not shown).

3.2.3.4 The 16S rRNA Processing Pathways for the rrnA and rrnB Operons

In the rrnA operon, 16S rRNA processing occurs within the predicted "bulge-helix-bulge" motif (see sections 1.1.7.2 and 3.2.2.2) of the processing stem. This is similar to the recognition sites used by other archaeal organisms to excise precursor 16S and 23S rRNAs from the primary transcripts and to remove introns from the transcripts of intron-containing rRNA and tRNA genes.
Sequence comparison of the 5'-flanking sequences (Figure 3.10) and secondary structural analysis of the primary transcripts (Figure 3.7B) indicated that rrnB lacks the conserved sequence associated with the cleavage site and the "bulge-helix-bulge" motif within the processing stem. Subsequently, the S1 nuclease protection assay (Figure 3.12) and primer extension analysis (Figure 3.13) confirmed that the sequence and the secondary structure associated with the cleavage site in the rrnB operon are different. This suggests that in the rrnB operon a different endonuclease is involved in the processing of 16S rRNA from the primary transcript, and that the recognition feature for this enzyme is different from that of the endonuclease involved in the processing of the 16S rRNA from the rrnA operon. Since the processing site is located within helix A in the universal secondary structure for the rrnB transcript (Figure 3.7B), it is possible that the endonuclease recognition site is associated with some aspect of the helix A structure. The reason for two distinct 16S processing pathways in Ha. marismortui is not understood. It may be that the processing reactions involved are optimally active under different environmental conditions (e.g. different NaCl concentrations, ranging from 1.5 M to 5 M).

Figure 3.13. Primer extension analysis within the 5'-flanking region of the rrnB operon using oPD 48. The oligonucleotide oPD 48 is complementary to the sequence from -98 to -118 within the 5'-flanking region of the 16S rRNA gene of the rrnB operon. A sequencing ladder generated with the same primer is shown adjacent to the primer extension product. The arrow points to the nucleotide (position -172) at which the extension product terminates.
The poor quality of the reaction products from primer extensions is caused by poor hybridization of certain primers to the template DNA, and by inefficient extension and premature termination caused by the high GC content and higher order structure of the template DNA.

3.2.4 16S rRNA

3.2.4.1 Primary Structural Analysis

The complete nucleotide sequences of the rrnA and rrnB 16S rRNA genes have been determined; both are 1472 nucleotides in length. Comparison of the 16S rRNA gene sequences from the rrnA and rrnB operons showed that they differ at 74 nucleotide positions (Figure 3.14 and Table 3.3). To gain insight into the functional and evolutionary significance of these nucleotide differences, the two Ha. marismortui 16S rRNA sequences from rrnA and rrnB were compared to the single 16S rRNA sequences of closely related genera: Hb. cutirubrum, Hf. volcanii and Hc. morrhuae (Gupta et al., 1983; Leffers and Garrett, 1984; Hui and Dennis, 1985) (Figure 3.14). The 16S sequence of Hc. morrhuae is 1475 nucleotides long, whereas the others are 1472 nucleotides in length. The secondary structural model for 16S rRNAs was also considered when aligning the sequences (Woese and Pace, 1993). To maximize the nucleotide identities in the end-to-end alignment of the five sequences, nine gap positions had to be introduced. Results from the pairwise alignments are summarized in Table 3.3. The 16S rRNA sequences from the rrnA and rrnB operons of Ha. marismortui are 95% identical, whereas the remaining comparisons show substantially less identity, ranging from 86.8% to 90.2%. Intra- and interspecies comparisons show that substitutions caused by transitions outnumber transversions by about 2:1; in rRNA genes, the significance of this ratio has not been investigated. About 60% of these nucleotide substitutions are compensatory, affecting both components of a nucleotide base pair in a region of an RNA helical structure.
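As a rough check on how the percentage values relate to the substitution counts, similarity can be computed as one minus the fraction of differing positions. The sketch below assumes this simple definition (the text does not spell out the formula) and uses transition and transversion counts as listed in Table 3.3 for three of the complete 16S comparisons; the class and function names are illustrative only.

```python
# Illustrative only: percentage similarity from alignment length and
# substitution counts, assuming similarity = 1 - (Ts + Tv) / L.
from typing import NamedTuple


class Comparison(NamedTuple):
    length: int         # L, number of aligned positions compared
    transitions: int    # Ts
    transversions: int  # Tv


def percent_similarity(c: Comparison) -> float:
    total = c.transitions + c.transversions
    return round(100.0 * (1 - total / c.length), 1)


def ts_tv_ratio(c: Comparison) -> float:
    return c.transitions / c.transversions


# Counts as listed in Table 3.3 (complete 16S sequence)
comparisons = {
    "rrnA/rrnB": Comparison(1472, 54, 20),
    "rrnA/Hcu": Comparison(1471, 94, 50),
    "rrnB/Hmo": Comparison(1471, 128, 66),
}

for name, c in comparisons.items():
    print(f"{name}: {percent_similarity(c)}% similar, Ts:Tv = {ts_tv_ratio(c):.1f}")
```

With this definition the 74 differences over 1472 positions between rrnA and rrnB round to the 95% quoted above, and the other two pairs reproduce the extremes of the 86.8% to 90.2% range.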
All of the substitutions between rrnA and rrnB occur at positions that are recognized to be phylogenetically variable. These differences are not randomly distributed; with a single exception, all are confined to three domains, 58-321, 508-823, and 986-1158, within the universal secondary structure model for small subunit rRNA (Figures 3.15, 3.16 and 3.17; Woese et al., 1983). In contrast to the rrnA-rrnB case, the nucleotide differences in all interspecies pairwise alignments were not concentrated in specific domains but were more generally distributed throughout the 16S sequence (Figure 3.15). For the three interspecies pairwise comparisons between Hf. volcanii and Hb. cutirubrum, Hf. volcanii and Hc. morrhuae, and Hb. cutirubrum and Hc. morrhuae, the 508-823 domain exhibits essentially the same frequency of nucleotide substitutions as the entire 16S molecule (percentage similarities of 88.3%-88% versus 88.6%-85.6%; Table 3.3, A and B).

Figure 3.14 Nucleotide sequences and alignment of the 16S rRNA encoding genes. The complete nucleotide sequence of the Ha. marismortui 16S encoding gene from the rrnA operon is depicted (Hma rrnA). Below is the rrnB 16S rRNA encoding sequence (Hma rrnB); only nucleotides that differ from the rrnA sequence are indicated. Substitutions in the rrnB sequence that are compensatory, affecting both components of a base pair in a region of RNA secondary structure, are underlined. For comparison, the entire 16S rRNA sequences from Hb. cutirubrum (Hcu), Hf. volcanii (Hvo), and Hc. morrhuae (Hmo) are also included; again, only nucleotides that differ from the corresponding nucleotides in the rrnA sequence are indicated. Dashes (--) indicate nucleotides identical to the rrnA sequence; dots (•) indicate single nucleotide gaps in the sequence(s) required to maintain alignment.

Table 3.3 Comparison of the nucleotide sequences of the 16S rRNA and of the 508-823 domain from halophilic archaeal species. The abbreviations used are as follows: L is the length of the nucleotide sequences compared; Ts is the number of transitional substitutions; Tv is the number of transversional substitutions; Total is the total number of substitutions. Abbreviations of species names are as in the legend to Figure 3.14.

                      Nucleotide differences
Comparison       L      Ts     Tv    Total    Percentage similarity

A. Complete 16S sequence
rrnA/rrnB      1472     54     20      74        95.0%
rrnA/Hvo       1470    118     54     172        88.3%
rrnB/Hvo       1470    124     60     184        87.5%
rrnA/Hcu       1471     94     50     144        90.2%
rrnB/Hcu       1471    120     62     182        87.6%
rrnA/Hmo       1471    106     56     162        89.0%
rrnB/Hmo       1471    128     66     194        86.8%
Hvo/Hcu        1471    121     56     177        88.0%
Hvo/Hmo        1470    107     66     173        88.2%
Hcu/Hmo        1470    115     51     166        88.7%

B. 508-823 domain
rrnA/rrnB       316     34     17      51        83.9%
rrnA/Hvo        316     29     13      42        86.7%
rrnB/Hvo        316     34     16      50        84.2%
rrnA/Hcu        316     24      8      32        89.9%
rrnB/Hcu        316     40     19      59        81.3%
rrnA/Hmo        316     16     15      31        90.2%
rrnB/Hmo        316     31     26      57        82.0%
Hvo/Hcu         316     40      5      45        85.8%
Hvo/Hmo         316     29     11      40        87.3%
Hcu/Hmo         316     29      7      36        88.6%

3.2.4.2 Secondary Structural Analysis

Phylogenetic comparison of rRNAs from all three lineages provides support for the idea that, despite often considerable differences in their sequences, the secondary and probably the tertiary structures of both small subunit (SS) and large subunit (LS) rRNA have been conserved to a remarkable degree across the entire evolutionary spectrum (Raué et al., 1988). In E. coli, several structural models which predict higher order RNA-RNA and RNA-protein interactions within the 30S subunit of the ribosome have been proposed (Woese et al., 1983; Stern et al., 1988; Noller et al., 1990; Brimacombe et al., 1990; Oakes et al., 1990). Studies on rRNA-protein interactions using archaeal rRNAs and E. coli ribosomal proteins indicated functional conservation of at least some ribosomal protein binding sites from one primary kingdom to another (Zimmermann et al., 1980; Thurlow and Zimmermann, 1982; Leffers et al., 1988). Because of this evolutionary conservation, it seems likely that Ha. marismortui 30S subunits containing either the rrnA or the rrnB 16S rRNA sequence will retain the important structural and functional features necessary for protein synthesis (Woese and Pace, 1993).

Figure 3.15 Predicted secondary structure for the 508-823 domain of the rrnA and rrnB 16S rRNAs from Ha. marismortui. The secondary structure for the region bounded by nucleotide positions 508-823 for the rrnA sequence is illustrated. Mutational differences between rrnA and rrnB (*), normal Watson-Crick (-), G-U (•), and A-G (o) base pairs are indicated. The boxed nucleotides correspond to the positions in E. coli 16S rRNA protected from chemical modification by tRNA binding to the P site (Moazed and Noller, 1989). Regions that in E. coli 16S rRNA interact with the indicated small subunit ribosomal proteins S6, S8, S11, S15, S18, and S21 are also illustrated (Brimacombe et al., 1990).

The sequences and predicted structure for the more variable 508-823 domain (the central domain) are illustrated in Figure 3.15. Most of the nucleotide differences between the rrnA and rrnB sequences are located within certain helical regions of RNA secondary structure and are compensatory, affecting both components of a nucleotide base pair. The two helical regions bounded by nucleotides 526-588 and 768-800 exhibit many differences between rrnA and rrnB; however, most of the differences are compensatory. The base of the 526-588 helix corresponds to the region in E. coli 16S rRNA where protein S8 binds, and the apical portion of the 768-800 helix corresponds to the region where protein S18 binds. Another helical region, bounded by nucleotides 599-611 and 673-684, also exhibits many nucleotide substitutions. Ribosomal proteins S6, S11, S15, and S21 bind to sites within one or both of these helical regions between positions 594-690. The two interrupted helical regions bounded by nucleotides 612-672 and 708-749 are invariant except for a compensatory pair of substitutions occurring at positions 710 and 747. The apical loops of these helices correspond to the loops in E. coli 16S rRNA which contain nucleotides (boxed in Figure 3.15) protected from chemical modification by tRNA binding to the P site of the ribosome (Moazed and Noller, 1989).
In the 30S subunit, these two loops are in close proximity and form a ring-like structure that defines the platform on the top of the 30S subunit (Oakes et al., 1990). In E. coli, the sequence UUAGAU, in the apical loop of the second helix adjacent to the two tRNA P site nucleotides, has been proposed to interact with a complementary sequence in 23S rRNA during 70S particle formation (Tapprich et al., 1990). The homologous sequence, UUAGAU, is conserved in both the rrnA and rrnB 16S rRNAs (positions 727-732, Figure 3.15). The complementary hexanucleotide is also conserved in the rrnA, rrnB and rrnC 23S rRNAs between positions 2783 and 2788.

The 986-1158 region also contains 12 substitutions between the rrnA and rrnB operons; ten of these are compensatory. In E. coli, one of the key helices in this region, bounded by nucleotides 1118-1159, can be crosslinked to proteins S3, S9, and S10. The corresponding helix in the rrnA and rrnB operons of Ha. marismortui, bounded by nucleotides 1052-1106, contains four substitutions, and there are seven substitutions in the surrounding helices (Figure 3.16). In Ha. marismortui, although the substitutions do not fall directly within the regions corresponding to the binding sites of proteins S9 and S3 in E. coli, two substitutions are located within the region involved in binding of S10.

Figure 3.16 Predicted secondary structure for the 986-1158 domain of rrnA and rrnB 16S rRNAs from Ha. marismortui. A likely secondary structure for the region bounded by nucleotide positions 986 and 1158 for the rrnA sequence is illustrated.
Nucleotide substitutions between rrnA and rrnB are indicated by asterisks (*). Watson-Crick (-), G-U (•), and A-G (o) base pairs are also indicated. The helices or regions in Ha. marismortui that correspond to the regions that interact with the small subunit proteins S3, S9 and S10 in E. coli 16S rRNA are also indicated.

In E. coli, differences in sensitivity to chemical modification showed that the helix bounded by nucleotides 1048-1065 and 1191-1198 is crucial for subunit association in E. coli ribosomes. Either removal of this helix (Zwieb et al., 1986) or perturbation of its secondary structure (Meier et al., 1986) causes a severe reduction in the efficiency of formation of the 70S ribosome. In the two operons from Ha. marismortui, the corresponding helices are bounded by nucleotides 995-1006 and 1137-1152. Between the rrnA and rrnB sequences, a single nucleotide substitution, G1000 -> A, is located within this region; however, it does not disrupt the helical structure and probably does not affect the formation of 70S ribosomes.

The sequence and universal structure for the 58-321 domain from the rrnA operon are illustrated in Figure 3.17. Mutational differences between the rrnA and rrnB operons are indicated by asterisks (*). Four of the substitutions observed within this domain are located within the helix bounded by nucleotides 60-78, and two nucleotide substitutions are located within the helix bounded by nucleotides 157-189. Another helix, bounded by nucleotides 200-245 in Ha. marismortui, is identical in the rrnA and rrnB operons. In E. coli, the small subunit protein S17 binding site is located in the corresponding helix (Brimacombe et al., 1990).
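The notion of a compensatory substitution used throughout this section can be made concrete with a small sketch. It assumes that a base pair counts as intact if it is Watson-Crick or G-U, and that a difference is compensatory when both partners differ between the two sequences while pairing is retained. The example pairs are hypothetical and are not read from the figures, and the A-G pairs marked in the structure diagrams are deliberately left out of this simplified rule.

```python
# Illustrative classifier for a difference at one base-paired helix position.
# Assumption: only Watson-Crick and G-U pairs count as paired.
from typing import Tuple

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def is_paired(five_prime: str, three_prime: str) -> bool:
    return (five_prime, three_prime) in PAIRS


def classify(pair_a: Tuple[str, str], pair_b: Tuple[str, str]) -> str:
    """Compare the same helix position in two rRNAs (for example rrnA and rrnB)."""
    if pair_a == pair_b:
        return "identical"
    both_paired = is_paired(*pair_a) and is_paired(*pair_b)
    both_changed = pair_a[0] != pair_b[0] and pair_a[1] != pair_b[1]
    if both_paired and both_changed:
        return "compensatory"
    if both_paired:
        return "single change, pairing kept"
    return "unpaired in at least one sequence"


# Hypothetical examples
print(classify(("G", "C"), ("A", "U")))  # both partners differ, both pairs intact
print(classify(("G", "C"), ("G", "U")))  # one partner differs, G-U pair remains
print(classify(("G", "C"), ("A", "C")))  # A-C is not a pair under this rule
```

A stricter or looser pairing rule would shift some borderline positions between categories, so counts obtained this way should be read as approximate.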
3.2.4.3 Expression of rrnA and rrnB 16S rRNA

In prokaryotic organisms, three steps are required to produce functionally mature rRNAs from the rrn operon primary transcript: first, excision of monocistronic precursor RNAs from the polycistronic primary transcript; second, accurate recognition and trimming to remove extraneous nucleotides and form the mature 5'- and 3'-termini; and third, modification of the base or the ribose moiety at specific sites within the RNA chain. In E. coli, and presumably also in halobacteria, only 30S subunits that are correctly assembled and active in translation are capable of associating with 50S subunits to form functional 70S particles (Tapprich et al., 1990). The heterogeneity between the rrnA and rrnB 16S rRNAs from Ha. marismortui raises two important questions: are both operons actively transcribed, and are both 16S sequences assembled into active ribosomes? In this analysis, protection of end-labelled DNA fragments against nuclease S1 digestion was used to demonstrate that both rrnA and rrnB 16S rRNAs are present in intact 70S ribosomes (Mylvaganam and Dennis, 1990; see section 2.2.12). The RNA was obtained from 70S ribosomal particles isolated by sucrose density gradient sedimentation. The DNA probes were two homologous but nonidentical 272 bp SacII-SmaI fragments, isolated from the rrnA and rrnB operons, that overlap a portion of the 508-823 variable domain (nucleotides 462-734 in Figure 3.18). As a control, the homologous fragment from the clone p4W (PD 655), containing the single copy 16S rRNA gene of Hb. cutirubrum, was also used (Hui and Dennis, 1985).

Figure 3.17 Predicted secondary structure for the 58-321 domain of the rrnA and rrnB 16S rRNAs from Ha. marismortui. A likely secondary structure for the region bounded by nucleotides 58-321 for the rrnA sequence is illustrated. Nucleotide differences between rrnA and rrnB 16S rRNAs are indicated by asterisks (*). The region corresponding to the site in E. coli 16S rRNA that interacts with small subunit protein S17 is also shown.

In principle, a full length protection product is observed only if the DNA probe fragment is hybridized to an RNA with perfect sequence complementarity. During S1 digestion, partial protection products arise from cleavage of the labelled DNA probe within regions of RNA-DNA duplex that contain one or more mismatched base pairs. The thermostability of RNA-DNA duplexes with perfect sequence complementarity will be greater than that of duplexes containing a substantial number of mismatches. The hybridization temperatures and the concentration of RNA were optimized before performing the final analysis. At 59°C the probe hybridizes only to RNA with perfect sequence complementarity, whereas at 53°C the probe hybridizes to both perfectly and imperfectly complementary RNAs.

Nuclease S1 protection analysis using DNA fragments from both the rrnA and rrnB operons exhibited intense 272 nucleotide long, full length products resulting predominantly from protection by Ha. marismortui RNA with perfect sequence complementarity. These full length protection products were visible after hybridization at 53°C and 59°C (Figure 3.18, lanes 1, 2, 7, and 8). Several clusters of partial length protection products were also visible following hybridization at 53°C (compare lane 1 with 2, and 7 with 8).
These partial length products, 54-63, 149-153, 174-184 and 205-213 nucleotides in length, are produced by S1 cleavage of the probe in the vicinity of clustered nucleotide differences that exist between the rrnA and rrnB DNA sequences at positions 675-684, 583-587, 555-566 and 524-532, respectively. The partial products observed with the rrnA and rrnB probes are expected to be the same because the mispairings occur at the same positions. The lower bands of 54-63 nucleotides appeared at identical positions in both cases; however, the upper bands seem to have been digested at different rates in lanes 2 and 8. This may be due to artifactual effects in the S1 experiments. For example, the S1 nuclease digestion reactions were performed at 35°C for 30 minutes; if the reaction continues for more than 30 minutes, the S1 nuclease continues to digest the longer fragments into smaller fragments (data not shown). In lane 8, the bands of 149-153 nucleotides are much stronger than the bands of 174-184 nucleotides: nuclease S1 may also exhibit a sequence or contextual preference when cleaving the DNA strand of an RNA-DNA hybrid at the site of a mismatch.

Figure 3.18 Ribosomal RNA protection of end-labelled DNA fragments. The 272-nucleotide long SacII-SmaI fragments (nucleotide positions 463-734) were isolated from the Ha. marismortui pPD 928 (rrnA or HC8) and pPD 927 (rrnB or HH10) clones and from the clone p4W (PD 655) containing the single copy Hb. cutirubrum operon. The fragments were labelled at their 5'-ends in the minus strand (position 736) and hybridized to approximately 50 ng of Ha. marismortui RNA isolated from 70S ribosomes or to approximately 200 ng of total Hb. cutirubrum RNA (see section 2.1.2.16.2). For each lane (1-18), the source of RNA, the hybridization temperature, and the source of DNA are indicated. The S1 digestions are not limit (i.e. exhaustive) digestions, but are sufficient to trim the ends of RNA-DNA hybrids without cutting at every internal mismatch site. Exhaustive digestion produces additional artefacts. Abbreviations are as in the legend to Figure 3.14.

As a control, heterologous RNA from Hb. cutirubrum was hybridized to the Ha. marismortui rrnA and rrnB DNA probes at both 53°C and 59°C (lanes 3, 4, 4a, 9 and 10). At the higher temperature, little or no partial protection of the DNA probes was observed. At the lower temperature, however, partial protection products of 54-63, 65-78, 149-154, 174-184 and 205-213 nucleotides in length were clearly visible. Again, these products are produced by S1 cleavage of the DNA probes in the vicinity of clustered nucleotide differences that exist between the Hb. cutirubrum and both the rrnA and rrnB DNA sequences at nucleotide positions 675-684, 657-670, 583-587, 555-566 and 524-532. As a second control, the DNA probe from Hb. cutirubrum was hybridized to either homologous Hb. cutirubrum or heterologous Ha. marismortui RNA at 53°C and 59°C. With the homologous RNA, a full length product was visible at both temperatures, whereas with the heterologous RNA little or no full length protection product was observed (lanes 13, 14, 15 and 16). However, at 53°C, hybridization with heterologous RNA resulted in partial length products of 205-213 nucleotides. These correspond to S1 cleavage of the Hb. cutirubrum DNA probe in the vicinity of clustered nucleotide differences that exist between the Hb. cutirubrum DNA probe and both the rrnA and rrnB RNA sequences between nucleotide positions 524-532.
Finally, for all three DNA probes, yeast tRNA failed to produce either full length or partial length protection products regardless of hybridization temperature (lanes 5, 11 and 17 and unpublished results). These results clearly demonstrate that 16S rRNA sequences derived from both the rrnA and rrnB 16S rRNA genes of Ha. marismortui are present in 70S ribosome particles. Particles 70S in size are presumed to have initiated protein synthesis and to be involved in active elongation (Tapprich et al., 1990). At high stringency, both DNA probes exhibit substantial full length protection. At lower stringency, both DNA probes exhibit many of the partial length protection products expected from S1 cleavage within regions of RNA-DNA sequence noncomplementarity. Although the S1 nuclease digestion assays are only semi-quantitative, the relative intensities of the full length autoradiographic bands suggest that the rrnA and rrnB type 16S rRNAs are (to a first approximation) equally represented in 70S ribosomes. It is also possible that the 16S rRNA transcript of the rrnC operon, if present in the 70S ribosomes, hybridized to either the rrnA or rrnB probe and gave rise to partial or complete protection products in the above S1 nuclease assays. The presence of 16S rRNA from the rrnC operon in 70S ribosomes could be determined by sequencing the 16S rRNA gene of the rrnC operon and using an rrnC specific probe in S1 nuclease protection assays similar to those discussed above.

Attempts were also made to verify the presence of two heterologous RNAs in the 70S particles by using the RNA as template in a primer extension-reverse transcription sequencing reaction. It proved problematic to generate an accurate and reliable sequence, primarily because of nonspecific stops caused by RNA secondary structure and the high G+C content of the template RNA. Oren et al. (1988) used reverse transcription to determine a portion of the 16S rRNA sequence of Ha. 
marismortui without realizing that there is substantial sequence heterogeneity; consequently, their sequence contains a large number of uncertain or unidentified nucleotides. This example clearly illustrates a potential error that can be encountered when using reverse transcriptase sequencing of small subunit rRNA in phylogenetic and systematic studies.

In summary, when the nucleotide sequences of the 16S rRNAs from the rrnA and rrnB operons of Ha. marismortui were compared, several features became apparent. The two sequences differ at 74 of 1472 nucleotide positions; with a single exception, these differences are located within three domains, 56-321, 508-823 and 986-1158, of the universal secondary structure model for small subunit rRNA. Most of the differences are located within certain helical regions of RNA secondary structure and are compensatory, affecting both components of a nucleotide base pair. None of the 74 nucleotide differences between the rrnA and rrnB 16S sequences occurs at positions that have been identified as functionally important for interaction with tRNA, mRNA, or factors during the protein synthesis cycle. The substitutions do not alter the predicted secondary structures significantly or affect nucleotide sites proposed to be important for tertiary interactions. None of the substitutions occurs at nucleotide positions that tend to be highly conserved during evolution. Using nuclease S1 protection analysis, it was shown that both the rrnA and rrnB 16S rRNA sequences are expressed and present in active 70S ribosomes.

The distribution of nucleotide substitutions between the 16S rRNA sequences of rrnA and rrnB is similar to the distribution of differences observed between the two distinct small subunit rRNA genes of Plasmodium berghei (see section 1.8). In P. 
berghei, the two genes are differentially regulated; one is expressed in sporozoites found in the insect host (C type), whereas the other is expressed in the asexual stage found in the blood stream of mammals (A type). The A and C types differ by substitution at 72 of 2059 (3.5%) aligned nucleotide positions. As with the two Ha. marismortui sequences, the distribution of differences between the two small subunit sequences of P. berghei is not random. Virtually all the substitutions and gap differences fall within two domains of the universal secondary structure model for small subunit RNA (Gunderson et al., 1987; Woese et al., 1983). Interestingly, these two domains are the homologs of the 56-321 and 508-823 domains in the Ha. marismortui 16S rRNAs. The significance of the clustering of substitutions in both cases is not understood. Ha. marismortui does not go through different developmental stages; nonetheless, it is possible that the 16S sequences from the rrnA and rrnB operons exhibit differential activity under different conditions of growth.

Sequence comparisons indicate that the degree of identity between the 16S rRNA sequences of the rrnA and rrnB operons (95%) is substantially less than the 99% value expected for paralogous small subunit rRNA genes in the genomes of most organisms (Lewin, 1990). This indicates that one or more of the intracellular and intraspecies processes (selection, recombination, gene conversion, etc.) have been altered in Ha. marismortui in such a way as to allow the accumulation of nucleotide differences within the 16S genes of the rrnA and rrnB operons. Based on nucleotide sequence similarity alone, it would appear that the divergence of the rrnA and rrnB 16S rRNA sequences from each other is more recent than their divergence from the other three sequences (from Hb. cutirubrum, Hc. morrhuae and Hf. volcanii).
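As a rough check on the identity values and divergence estimates discussed in this section, the following sketch converts a pairwise identity into an approximate divergence time. It assumes the constant rate of about 1% substitutions per 50 Myr (Ochman and Wilson, 1987) adopted in this section, applies that rate directly to the observed pairwise difference, and ignores multiple substitutions at the same site. The function names are illustrative only and are not part of the original analysis.

```python
# Illustrative only: pairwise identity -> rough divergence time under a
# constant-rate (molecular clock) assumption.

# Assumed rate: about 1% substitutions per 50 Myr (Ochman and Wilson, 1987).
RATE_PERCENT_PER_MYR = 1.0 / 50.0


def percent_identity(differences: int, aligned_positions: int) -> float:
    """Percent of aligned positions at which two sequences agree."""
    if aligned_positions <= 0:
        raise ValueError("aligned_positions must be positive")
    return 100.0 * (aligned_positions - differences) / aligned_positions


def divergence_time_myr(identity_percent: float,
                        rate: float = RATE_PERCENT_PER_MYR) -> float:
    """Rough time (Myr) implied by a pairwise identity.

    Assumption: the rate applies to the observed pairwise difference and
    no correction is made for multiple substitutions at one site.
    """
    return (100.0 - identity_percent) / rate


if __name__ == "__main__":
    # 74 differences in 1472 positions between the rrnA and rrnB 16S genes.
    print(f"rrnA vs rrnB identity: {percent_identity(74, 1472):.1f}%")
    # Similarity range quoted for comparisons among the halophile sequences.
    for identity in (86.8, 90.2):
        print(f"{identity}% identity: about {divergence_time_myr(identity):.0f} Myr")
```

Under these assumptions, the 86.8%-90.2% similarity range corresponds to roughly 490-660 Myr, which is in line with the estimate of about 600 million years given in the text.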
Pairwise comparisons with these sequences show less similarity (86.8%-90.2%; Table 3.3), and suggest that the orthologous small subunit genes of the halophiles diverged from a common ancestral sequence within a relatively short period of evolutionary time (about 600 million years ago). However, within the 508-823 region, the divergence of rrnA from the Hb. cutirubrum, Hc. morrhuae and Hf. volcanii sequences is more recent (about 500-600 million years) than the divergence of rrnB from rrnA and from the other three halophilic sequences (about 800-900 million years). This calculation is based on the assumption that in archaea the mean rate of substitution is the same as the value estimated for the 16S rRNAs of eubacteria and the 18S rRNAs of eukaryotes (about 1% substitutions/50 Myr; Ochman and Wilson, 1987).

3.2.5 16S-23S Intergenic Spacer

3.2.5.1 Primary Structural Analysis

The nucleotide sequences of the 16S-23S intergenic spacer regions from the rrnA and rrnB operons of Ha. marismortui have been determined (Figure 3.19). The two spacer sequences differ in length by 20 nucleotides. The shorter rrnA spacer sequence, 385 nucleotides in length, contains a tRNAAla coding sequence, whereas the longer rrnB spacer, 405 nucleotides in length, contains no recognizable universal tRNA-like secondary structure (Figures 3.7 A and B). The 16S-23S spacer sequences from the rrnA and rrnB operons of Ha. marismortui and from Hb. cutirubrum (Hui and Dennis, 1985) were aligned using the Clustal V method (Des Higgins, European Molecular Biology Laboratory, Germany) and pairwise comparisons were made (data not shown). The rrnA and rrnB sequences were compared at 381 nucleotide positions and showed 79 substitutions (i.e., 79.3% identity). The two sequences are identical for 177 nucleotides between the conserved TTAA sequence specific for the Box A motif of the intergenic promoters and the 5'-end of the 23S genes.
Most of the substitutions between rrnA and rrnB were observed within the region corresponding to the 16S rRNA processing sites 4 and 6 of rrnB and the tRNAAla sequence of the rrnA operon. Substitutions caused by transitions (Ts) outnumber transversions (Tv) by 59 to 20. This 3:1 ratio is higher than the Ts/Tv ratio between the 16S rRNA sequences (about 2:1). The available 254 nucleotides of sequence in front of the 23S rRNA gene of the rrnC operon (Brombach et al., 1989) are identical to the sequence in the rrnB operon and identical for 177 nucleotides to that of the rrnA operon.

Comparison of the 16S-23S spacer sequences from the rrnA and Hb. cutirubrum operons showed 65% identity within 375 nucleotide positions. The regions comprising the 16S and 23S rRNA processing sites (or the inverted repeat sequences) and the tRNAAla gene show high levels of sequence conservation. Comparison of the rrnB and Hb. cutirubrum sequences showed 57.9% identity within 383 nucleotide positions. Except within the 23S inverted repeat sequences and the internal promoter regions, there are no highly conserved regions between the rrnB and Hb. cutirubrum spacer sequences.
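The identity and Ts/Tv figures quoted above follow from simple counting over an alignment. The sketch below shows one way such counts could be obtained from two aligned sequences; it is not the procedure used with Clustal V in this work. The function name, the toy sequences and the choice to skip gapped columns are assumptions made for illustration.

```python
# Illustrative sketch: count transitions and transversions between two
# aligned DNA sequences and compute percent identity.

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def compare_aligned(seq_a: str, seq_b: str) -> dict:
    """Compare two equal-length aligned sequences made of A, C, G, T and '-'.

    Columns with a gap in either sequence are skipped (one convention
    among several; an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    substitutions = transitions + transversions
    identity = 100.0 * (compared - substitutions) / compared if compared else 0.0
    return {"compared": compared, "transitions": transitions,
            "transversions": transversions, "identity": identity}


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences, not thesis data).
    print(compare_aligned("ACGT-ACGTA", "ATGTTACATC"))
    # Values reported in the text: 59 Ts + 20 Tv over 381 compared positions.
    ts, tv, positions = 59, 20, 381
    print(f"identity = {100.0 * (positions - ts - tv) / positions:.1f}%")
    print(f"Ts/Tv = {ts / tv:.2f}")
```

With the reported counts, 79 substitutions over 381 positions give 79.3% identity, and 59/20 gives a Ts/Tv ratio of 2.95, i.e. about 3:1.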
[Figure 3.19: aligned spacer sequences of Hcu, Hma rrnA and Hma rrnB, annotated with the 16S 3' processing site, the tRNAAla gene, the rrnB processing sites, the internal promoter and the 23S 5' processing site; alignment not reproduced.]

Figure 3.19 Alignment of the 16S-23S intergenic spacer sequences from the rrnA and rrnB operons of Ha. marismortui with the single operon from Hb. cutirubrum. The tRNAAla sequences from the rrnA and Hb. cutirubrum operons are indicated by asterisks over the sequences. The three nucleotides within the 16S and 23S inverted repeat structures (under- and overlined) are involved in the processing of 16S and 23S rRNAs from the rrnA and Hb. cutirubrum operons and processing of 23S rRNA from the rrnB operon. The processing sites 4, 5 and 6 observed in the rrnB operon are also under- and overlined (indicated by arrows). An internal promoter sequence is underlined and the start site is overlined within the sequence.
[Figure 3.20: two panels, "Internal Promoters" and "Promoters From the 16S Leader Region", comparing the -40, Box A and Box B regions and start sites for Hma rrnA, Hma rrnB, Hma rrnC, Hcu and Hmo with a consensus line; sequences not reproduced.]

Figure 3.20 Comparison of the internal promoter sequences of the rrnA, rrnB, Hb. cutirubrum and Hc. morrhuae operons. The intergenic promoter sequences from the halophiles are compared in the upper panel, and one promoter sequence from the 16S 5'-flanking region of each organism is compared in the lower panel. Start sites that have been mapped to the nucleotide are shown as the third nucleotide within Box B. The positions of the start sites relative to the 5'- or 3'-end of the 16S sequence are indicated on the right. This figure illustrates that the primary sequences of the Box A, Box B and -40 regions of the intergenic promoters and the 16S leader promoters from halophiles are conserved.

An intergenic promoter sequence (Pi) was found in the 16S-23S spacer region of the rrnA, rrnB and rrnC operons.
These Pi sequences show a high degree of identity to the Pi sequences from related halophiles and to the consensus motif of the promoter sequences present within the 16S 5'-flanking region (Figure 3.20).

3.2.5.2 Primary Transcript Analysis of the rrnA 16S-23S Spacer

Nuclease S1 digestion experiments were used to investigate transcription through the intergenic spacer. For the analysis of the rrnA operon, a 646 nucleotide long AvaI-AvaI fragment that contains the last 110 bp of the 16S gene, the 383 bp long 16S-23S intergenic spacer, and the first 153 bp of the 23S gene was isolated. The fragment was 3'-labelled on the minus strand at position 1362 within the 16S gene. The labelled fragment was hybridized to total Ha. marismortui RNA and treated with S1 nuclease. Maxam and Gilbert sequencing reactions for A and A+G were also performed with the same labelled fragment, and the reaction products were separated on a 6% polyacrylamide gel along with the protected fragments from the S1 nuclease assay (Figure 3.21A). In addition to full protection, a number of different partial protection products of 110, 180, 216, 246, 286, 460 and 493 nucleotides in length were observed. These products have 3'-ends near position 1472 of the 16S rRNA gene, positions 70, 106, 136, 176 and 350 within the spacer region, and position 1 of the 23S rRNA gene, respectively. They presumably correspond to protection by RNAs with 3'-ends generated by cutting the primary transcript (i) at the end of the 16S rRNA, (ii) at the 16S precursor processing site, (iii) at the 5'-end of the tRNAAla, (iv) at a site within the anticodon loop of the tRNAAla, (v) at the 3'-end of the tRNAAla, (vi) at the 23S precursor processing site, and (vii) at the beginning of the 23S sequence, respectively.

The 646 nucleotide long AvaI-AvaI fragment was also 5'-labelled on the plus strand at position 153 within the 23S gene. The labelled fragment was hybridized to total Ha. 
marismortui RNA and treated with S1 nuclease, and the reaction products were separated on a 6% polyacrylamide gel along with the probe and size standard markers (Figure 3.22 A).

Figure 3.21 Nuclease S1 protection assay of the primary transcript products within the 16S-23S intergenic spacer region of the rrnA and rrnB operons using DNA probes labelled at their 3'-ends. Figure 3.21A A line diagram showing the 16S-23S intergenic spacer region of the rrnA operon (from plasmid PD 926). The two AvaI sites within the 16S rRNA and 23S rRNA genes were used in the isolation of the probe for the nuclease S1 protection assay (see section 2.2.16.1). The positions at which the probes were labelled with α-32P dCTP are indicated by dots (•). The sizes of the protected fragments and their corresponding sites are shown below the operon structure. Figure 3.21B A line diagram showing the 16S-23S intergenic spacer region of the rrnB operon (from plasmid PD 929). The two AvaI sites within the 16S rRNA and 23S rRNA genes were used in the isolation of the probe for the nuclease S1 protection assay (see section 2.2.16.1). The positions at which the probes were labelled with α-32P dCTP are indicated by dots (•). The sizes of the protected fragments and their corresponding sites are shown below the operon structure. Figure 3.21C The autoradiogram showing the nuclease S1 protection assay products from the 16S-23S spacer region of the rrnA operon. Above each lane, the rrnA probe, S1 product (S1 nuclease digestion product of the annealed probe with total RNA), marker (pBR 322 plasmid digested with MspI and labelled at the 3'-end with α-32P dCTP) and the Maxam and Gilbert sequencing products (G and A+G) are indicated. The sizes of the protected fragments from the S1 treated DNA-RNA hybrids and the markers are shown on both sides of the autoradiogram.
Figure 3.21D The autoradiogram showing the nuclease S1 protection assay products from the 16S-23S spacer region of the rrnB operon. Above each lane, the rrnB probe, S1 product (S1 nuclease digestion product of the annealed probe with total RNA), marker (pBR 322 plasmid digested with MspI and labelled at the 3'-end with α-32P dCTP) and the Maxam and Gilbert sequencing products (G and A+G) are indicated. The sizes of the protected fragments from the S1 treated DNA-RNA hybrids and the markers are shown on both sides of the autoradiogram.

Figure 3.22 Nuclease S1 protection assay of the primary transcript products within the 16S-23S intergenic spacer region of the rrnA and rrnB operons using DNA probes labelled at their 5'-ends. Figure 3.22A A line diagram showing the 16S-23S intergenic spacer region of the rrnA operon (from plasmid PD 926). The two AvaI sites within the 16S rRNA and 23S rRNA genes were used in the isolation of the probe for the nuclease S1 protection assay (see section 2.2.16.1). The positions at which the probes were labelled with γ-32P ATP are indicated by dots (•). The sizes of the protected fragments and their corresponding sites are shown below the operon structure.
Figure 3.22B A line diagram showing the 16S-23S intergenic spacer region of the rrnB operon (from plasmid PD 929). The two AvaI sites within the 16S rRNA and 23S rRNA genes were used in the isolation of the probe for the nuclease S1 protection assay (see section 2.2.16.1). The positions at which the probes were labelled with γ-32P ATP are indicated by dots (•). The sizes of the protected fragments and their corresponding sites are shown below the operon structure. Figure 3.22C The autoradiogram showing the nuclease S1 protection assay products from the 16S-23S spacer region of the rrnA operon. Above each lane, the rrnA probe, S1 product (S1 nuclease digestion product of the annealed probe with total RNA) and markers (pBR 322 plasmid digested with MspI and labelled at the 3'-end with α-32P dCTP) are indicated. The sizes of the protected fragments from the S1 treated DNA-RNA hybrids and the markers are shown on both sides of the autoradiogram. Figure 3.22D The autoradiogram showing the nuclease S1 protection assay products from the 16S-23S spacer region of the rrnB operon. Above each lane, the rrnB probe, S1 product (S1 nuclease digestion product of the annealed probe with total RNA) and markers (pBR 322 plasmid digested with MspI and labelled at the 3'-end with α-32P dCTP) are indicated. The sizes of the protected fragments from the S1 treated DNA-RNA hybrids and the markers are shown on both sides of the autoradiogram.

In addition to full protection, a number of different partial protection products of 153, 186, 300, 330, 360, 403 and 423 nucleotides in length were observed. These products have 5'-ends near position 1 of the 23S rRNA gene and near positions 350, 236, 207, 176, 136, 105, and 70 within the spacer region, respectively.
They presumably correspond, respectively, to protection by RNAs with 5'-ends generated by cutting the primary transcript (i) at the beginning of the 23S gene, (ii) at the 23S precursor processing site, (iii) at the site of transcription initiation from the internal promoter, (iv) at the point where the rrnA and rrnB sequences diverge, (v) at the 3'-end of the tRNAAla sequence, (vi) at a site within the anticodon loop of the tRNAAla, (vii) at the 5'-end of the tRNAAla, and (viii) at the 16S rRNA precursor processing site.

Multiple transcript ends observed near the processing and end sites of the 16S rRNAs (Figures 3.22C and 3.22D) and 23S rRNAs (Figures 3.23C and 3.23D) may have been produced by endogenous nuclease activities acting on the ends of the RNAs, or alternatively may have been caused by inaccurate S1 trimming at the ends of RNA-DNA hybrids. Other minor bands, which are not labelled, could also be S1 artefacts or could represent unstable or abnormal products produced during the rRNA maturation process.

Since S1 nuclease analyses are only semi-quantitative, the band intensities do not reflect the exact amounts of the RNAs present in the cells. These analyses are done mainly to detect the transcription initiation, termination or processing sites of the transcripts. In the case of rRNA transcripts, only about 1-2% of the rRNA is usually expected to be in the form of very long precursors and processing intermediates; most of the rRNA is in the mature form (King and Schlessinger, 1983). However, this is not what was observed in the S1 nuclease analyses shown in Figures 3.21, 3.22 and 3.23; the intensities of the bands representing the mature 16S and 23S rRNAs are much lower than expected (Figures 3.21C, 3.21D, 3.22C and 3.22D). There are several explanations for this.
First, the efficiency of hybridization between rRNA and the DNA probe may depend on the length of the rRNA; longer RNAs hybridize more efficiently than shorter RNAs, and displacement may occur during a three hour hybridization. Second, the conditions used in the S1 analysis (especially temperature) may not have permitted the mature rRNA (smallest in size) to hybridize efficiently. Third, the majority of the mature rRNAs are folded into a secondary structural conformation and may not be readily available to hybridize with the DNA probe.

Primer extension analyses were performed in order to locate more precisely the end sites of the products identified in the S1 nuclease assays (the approximate positions of the end sites had already been determined using DNA ladders, as shown in Figures 3.21C, 3.21D, 3.22C and 3.22D). Using oPD 43 (positions 106-125 in rrnA; Figure 3.19), the processing site at the 3'-end of the tRNAAla was mapped to a U residue at position 176 (Figure 3.23B). There was very little extension product through this site because the tRNA structure represents an efficient block to reverse transcription. Primer extension analysis using oligonucleotide oPD 44 (positions 37-59 within the 23S rRNAs of rrnA and rrnB; Figure 3.24) showed that the 5'-ends of the two 23S rRNAs map to U residues at position 1 of the 23S rRNAs (Figure 3.23C). The 23S precursor processing sites were also mapped to a C residue within the bulge motif located at positions 350 and 370 (Figure 3.19) within the intergenic spacers of the rrnA and rrnB operons, respectively (Figure 3.23C). Since the first 70 nucleotides of the 23S rRNA genes and the 177 nucleotide upstream regions are identical in the rrnA and rrnB operons, the primer extension analysis using oPD 44 gave identical products corresponding to the 5'-end of 23S rRNA and the 5'-processing site.

Nuclease S1 digestion experiments within the 16S-23S spacer regions of the two rRNA operons of Ha. 
marismortui showed that the two internal promoters are active (section 3.2.5.3). Nonspecific premature transcription termination inside the ribosomal operons (i.e. between the 16S and 23S rRNA genes or within the 23S rRNA gene), if it occurs, would lead to a slight excess of 16S rRNA over the 23S and 5S rRNAs. The activity of the Pi promoters could produce additional 23S and 5S rRNA relative to 16S rRNA, compensating for such an imbalance and adjusting the cellular levels of the ribosomal RNAs.

Figure 3.23 Primer extension analysis within the 16S-23S spacer regions of the rrnA and rrnB operons. Figure 3.23A Primer extension analysis to map the processing site 6 (Figure 3.7B) within the spacer of the rrnB operon using oligonucleotide oPD 46. Sequencing ladders from the rrnA (PD 926) and rrnB (PD 929) operons using the same primer, oPD 46, were also generated in order to identify the nucleotide positions at which the extension products terminate. Figure 3.23B Primer extension analysis to map the 3'-processing site of the tRNAAla from the rrnA operon using oligonucleotide oPD 43. In order to identify the exact nucleotide position of the extension products, the spacer region of the rrnA operon (PD 926) was also sequenced using the same oligonucleotide, oPD 43. Figure 3.23C Primer extension analysis to map the 5'-end of 23S rRNA and the 5'-processing sites for the rrnA and rrnB operon transcripts. In order to identify the exact nucleotide position of the extension products, sequencing analyses were also carried out within the spacers of the rrnA (PD 926) and rrnB (PD 927) operons using oPD 44.

A number of conclusions concerning transcription and processing in the intergenic spacer of the rrnA operon can be drawn from the nuclease protection and primer extension analyses.
First, as is evident from the full length protection product, the 16S and 23S rRNA genes are co-transcribed. Second, the internal promoter is active and gives rise to transcripts with a 5'-end at position 236 within the intergenic spacer. Third, a 331 nucleotide long product was observed when using the 5'-end labelled probe. This was due to partial protection of the rrnA probe by transcripts derived from the rrnB operon; the sequences of the two operons start to diverge at positions corresponding to 207 and 228 within the spacers of the rrnA and rrnB operons, respectively (see Figure 3.19). Finally, all other end sites appear to be generated by endonuclease cleavage of the primary transcript, since both 3'- and 5'-ends can be mapped to each of the positions. There appears to be no predetermined order of cleavage events, since all possible products were observed with both 5'- and 3'-labelled probes. The only exception is the maturation of the 3'-end of 16S rRNA, where precursor excision occurs before maturation. It is uncertain whether maturation is the result of an endo- or exonuclease activity. In contrast, maturation at the 5'-end of 23S rRNA can sometimes occur before precursor excision, as is evident from the presence of a 496 nucleotide protection product (Figure 3.21 A and B).

Although there is no mandatory order for intergenic processing events, some sites appear to be cleaved earlier than others. These include the 16S precursor excision site and the tRNAAla 3'-end processing site. It is likely that the 23S precursor excision and maturation occur at a later time because the RNA polymerase must traverse the entire 23S gene before the second half of the inverted repeat structure is available to form the processing stem.
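The correspondence between protected-fragment lengths and 3'-end positions reported for the 3'-labelled rrnA probe in section 3.2.5.2 can be reproduced with simple arithmetic: the probe carries 110 nucleotides of 16S sequence downstream of the label (positions 1362 to 1472), so a fragment longer than 110 nucleotides ends that many nucleotides less into the spacer. The sketch below is only an illustration of this bookkeeping; the constant and function names are hypothetical, and the values are those quoted in the text.

```python
# Illustrative sketch: map an S1-protected fragment length from the
# 3'-labelled rrnA probe to the spacer position of the RNA 3'-end.

LABEL_POSITION_16S = 1362   # labelled position within the 16S gene (from the text)
END_OF_16S = 1472           # last nucleotide of the 16S gene (from the text)
PROBE_16S_LENGTH = END_OF_16S - LABEL_POSITION_16S  # 110 nt of 16S on the probe

# Fragment length -> spacer position, as reported in section 3.2.5.2.
REPORTED = {180: 70, 216: 106, 246: 136, 286: 176, 460: 350}


def spacer_position(fragment_length: int) -> int:
    """Spacer position of the 3'-end implied by a protected fragment.

    A result of 0 means the fragment ends at the last nucleotide of 16S.
    """
    return fragment_length - PROBE_16S_LENGTH


if __name__ == "__main__":
    for length, reported in REPORTED.items():
        print(length, spacer_position(length), reported)
```

By the same arithmetic, the 110 nucleotide product ends at the 3'-end of the 16S gene, and the 493 nucleotide product ends at spacer position 383, i.e. at the junction with position 1 of the 23S gene for the 383 bp spacer carried by this probe.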
3.2.5.3 Primary Transcript Analysis of the rrnB 16S-23S Spacer

For the nuclease S1 protection analysis of the 16S-23S spacer region of the rrnB operon, a 668 nucleotide long AvaI-AvaI fragment containing the last 110 bp of the 16S gene, the 405 bp long 16S-23S intergenic spacer, and the first 153 bp of the 23S gene was isolated. The fragment was 3'-labelled on the minus strand at position 1362, located within the 16S gene. The labelled fragment was hybridized with the total RNA of Ha. marismortui and then treated with S1 nuclease. The same labelled fragment was also used in the Maxam and Gilbert sequencing reactions A and A+G, and the products were separated on a 6% polyacrylamide gel, along with the protected fragments from the S1 nuclease assay (Figure 3.21D). In addition to full protection, a number of different partial protection products of about 110, 148-151, 282, 329, 482, and 515 nucleotides were observed. These partial protection products correspond respectively to protection by RNAs with 3'-ends generated by cutting the primary transcript (i) at the end of 16S rRNA, (ii) at the 16S rRNA precursor processing site 4, (iii) at the 16S rRNA precursor processing site 5, (iv) at the 16S rRNA precursor processing site 6, (v) at the 23S rRNA precursor processing site, and (vi) at the beginning of the 23S rRNA.

The 668 nucleotide long AvaI-AvaI fragment was also 5'-labelled at position 153 on the plus strand within the 23S gene. The labelled fragment was hybridized with the total RNA of Ha. marismortui and then treated with S1 nuclease, and the reaction products were separated on a 6% polyacrylamide gel, along with the probe and size standard markers (Figure 3.22D). In addition to full protection, a number of different partial protection products of about 153, 186, 300, 330, 341, 379, and 510 nucleotides in length were observed.
These correspond to protection by RNAs with 5'-ends generated by cleavage of the primary transcript (i) at the beginning of the 23S rRNA gene, (ii) at the 23S precursor processing site, (iii) at the site of transcription initiation from the internal promoter, (iv) at the point of sequence divergence between the rrnA and rrnB sequences, (v) at the 16S rRNA precursor processing site 6, (vi) at the 16S rRNA precursor processing site 5, and (vii) at the 16S rRNA precursor processing site 4.

Primer extension analyses were carried out in order to locate more precisely the positions of some of the end products obtained from the S1 nuclease analysis. In the rrnB operon, the precise mapping of the 16S precursor processing site (site 6 in Figure 3.7B) was performed by primer extension analysis using oligonucleotide oPD 46 (Figure 3.23A), which binds at positions 292-311 within the intergenic spacer of the rrnB operon (Figure 3.19). The extension product stopped at a U residue at position 219 (Figure 3.23A). Primer extension analysis using oligonucleotide oPD 44 (positions 37-59 within the 23S rRNAs of rrnA and rrnB; Figure 3.24) showed that the extension products for the two 23S rRNAs (rrnA and rrnB) stopped at U residues at position 1 of the 23S rRNAs (Figure 3.23C). The 23S precursor processing sites were also mapped to a C residue within the bulge motif located at positions 350 and 370 (Figure 3.19) within the intergenic spacers of the rrnA and rrnB operons, respectively (Figure 3.23C).

A number of conclusions concerning the transcription and processing of the intergenic spacer of the rrnB operon can be drawn from these nuclease S1 protection and primer extension analyses. First, as evidenced by the presence of the full length protection product, the 16S and the 23S rRNA genes are co-transcribed. Second, the internal promoter is active and gives rise to transcripts with a 5'-end at approximately position 260. Third, a 330 nucleotide long product was observed when using the 5'-end labelled probe.
This is likely due to partial protection of the rrnB probe by transcripts derived from the rrnA operon; the sequences of the two operons start to diverge at positions corresponding to 207 and 228 within the spacers of the rrnA and rrnB operons, respectively. Finally, all other end sites appear to be generated by endonuclease cleavage of the primary transcript, because both the 3'- and the 5'-ends can be mapped to each cleavage position. There is no mandatory order of cleavage events, since all products are observed with both 5'- and 3'-labelled probes. The only exception is the maturation of the 3'-end of 16S rRNA, where precursor excision precedes maturation. In the 16S-23S rRNA spacer of the rrnB operon, three endonuclease processing sites have been identified and labelled as sites 4, 5, and 6, located at positions 36 to 41, 172, and 219 (Figure 3.7B). Since the nucleotide sequences of the rrnA and rrnB spacer regions are identical within the 177 nucleotides immediately upstream of the 23S rRNA genes, the two 5'-labelled probes detected identical 23S intermediates and maturation products. In addition, the nuclease S1 protection assay showed that maturation of the 23S 5'-end can occur before the processing of primary transcripts; however, these intermediates are minor when compared to the other intermediates from the processing reactions. Since all protection products can be accounted for as endonuclease processing sites, internal promoter initiation sites or rrnA-rrnB sequence divergence sites, there is apparently no interference from transcripts derived from the rrnC operon intergenic spacer.

3.2.5.4 The Processing Pathways Within the 16S-23S Spacer

It is postulated that the "bulge-helix-bulge" motifs on both the 16S and 23S processing stems are the substrate for the processing endonuclease in archaeal organisms (Thompson and Daniels, 1988). The endonuclease activity is also present in Ha.
marismortui and is used to excise precursor 16S and 23S rRNAs from the rrnA operon transcript and precursor 23S rRNA from the rrnB operon transcript. The 16S processing stem in the rrnB operon transcript is different in that it lacks the "bulge-helix-bulge" motif, and precursor 16S rRNA is presumably excised by one or more alternative endonuclease activities.

The S1 nuclease and primer extension analyses have been used to locate endonuclease cleavage sites in the 5'-flanking region and the 16S-23S intergenic spacer of the rrnB operon transcript. Three unique sites were located in the intergenic spacer. These sites exhibit no apparent sequence or structural conservation, although the nucleotides involved in the cleavages are always either an A or a U and are associated with either a G • U or an A∘G base pair in the universal secondary structure for the primary transcript (Figure 3.7B). Comparison with the processing site from the 5'-leader also shows that the nucleotide involved in the cleavage is an A and is located adjacent to a G • U base pair in the universal secondary structure. These findings suggest that the presence of a G • U or an A∘G base pair adjacent to the cleavage site might be an important feature for processing by the endonuclease. It is uncertain whether all the cleavages observed in the processing of the 16S rRNA from the primary transcript of the rrnB operon are mediated by a single activity or whether more than one activity is involved.

3.2.6 23S rRNA

3.2.6.1 Primary Structural Analysis

The complete 2917 nucleotide long sequences of the rrnA and rrnB 23S rRNA genes have been determined. A third, 2917 nucleotide long 23S rRNA sequence from Ha. marismortui, different from the rrnA and rrnB operon sequences and designated rrnC, has been determined by Brombach et al. (1989).
It is uncertain if the Brombach sequence is entirely derived from rrnC or if it is a composite derived from the three different operons; here we assume that it is not a composite and represents the rrnC sequence. Comparisons of the 23S rRNA sequences indicate that 29 substitutions are unique to rrnA, 11 substitutions are unique to rrnB and only four substitutions are unique to rrnC (Figure 3.24). The three 23S Ha. marismortui sequences were compared to each other and with those of other related halophilic archaea, Hb. cutirubrum (2905 nucleotides) and Hc. morrhuae (2927 nucleotides). Alignment of the sequences was obtained by the Clustal V method (Figure 3.25). Secondary structural features were also taken into consideration to obtain the optimal alignment.

[Figure 3.24: sequence alignment of Hmo, Hcu, Hma rrnA, Hma rrnB and Hma rrnC over positions 1 to 2917 of the rrnA 23S rRNA gene]

Figure 3.24 Nucleotide Sequences and Alignment of halophilic 23S rRNA encoding genes. The complete nucleotide sequence of the Ha. marismortui 23S encoding gene from the rrnA operon is depicted (Hma rrnA). The sequence from Brombach et al. (1989) is assumed to be from the rrnC operon. The 23S rRNA encoding sequences from the rrnB and rrnC operons are aligned below the rrnA sequence; only the nucleotides that differ from the rrnA sequence are indicated. Substitutions that are compensatory, affecting both components of a base pair in regions of RNA secondary structure, are underlined. For comparison, the entire 23S rRNA sequences from Hb. cutirubrum (Hcu) and Hc. morrhuae (Hmo) are also included. Dashes (-) indicate nucleotides identical to the rrnA sequence; dots (•) indicate single nucleotide gaps in the sequence(s) required to maintain alignment. Pairwise analysis of the five sequences in this alignment is summarized in Table 3.4.
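The display convention described in the caption of Figure 3.24 (dashes for nucleotides identical to the rrnA reference, dots for single nucleotide gaps, and explicit letters only where a sequence differs) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used to prepare the figure: the function name render_differences is hypothetical, a plain "." stands in for the gap dot, and the two short toy sequences are invented for the example rather than taken from the alignment.

```python
def render_differences(reference: str, aligned: str) -> str:
    """Render `aligned` against `reference` in the Figure 3.24 notation.

    Both inputs must already be aligned to the same length, with '.'
    marking a single nucleotide gap. Identical positions become '-',
    gaps stay '.', and only differing nucleotides are written out.
    """
    if len(reference) != len(aligned):
        raise ValueError("sequences must be aligned to the same length")
    rendered = []
    for ref_base, base in zip(reference, aligned):
        if base == ".":
            rendered.append(".")   # gap introduced to maintain alignment
        elif base == ref_base:
            rendered.append("-")   # identical to the reference
        else:
            rendered.append(base)  # substitution: show the nucleotide
    return "".join(rendered)


# Hypothetical toy sequences (not from the thesis alignment).
toy_reference = "GGCTACTATG"
toy_other = "GGTTAC.ATA"
print(render_differences(toy_reference, toy_other))  # --T---.--A
```

Only the two substituted positions and the single gap remain visible in the rendered line, which is why the rrnB and rrnC rows of such an alignment consist mostly of dashes.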
The distribution of nucleotide substitutions along the length of the 23S rRNA genes from the rrnA, rrnB and rrnC operons of Ha. marismortui and the operons from the related halophiles Hb. cutirubrum and Hc. morrhuae is illustrated by the histograms in Figure 3.25. Comparison of the 23S rRNA gene sequences from the rrnA and rrnB operons of Ha. marismortui indicates that the sequences are 98.7% identical, with the differences concentrated in three intervals defined by positions 72 to 363, 984 to 1765 and 2322 to 2769. The rrnA-rrnC comparison indicates that the sequences are 98.8% identical and that the differences are concentrated within regions defined by positions 72 to 363, 984 to 1765 and position 2322. The sequences of rrnB and rrnC are almost identical (99.6% identical); the 15 nucleotide differences are located in regions defined by nucleotide positions 287 to 295, 984 to 1486 and 2413 to 2769. These results indicate that the distribution of nucleotide substitutions in the pairwise comparisons of the 23S rRNA sequences from the rrnA, rrnB and rrnC operons is similar to that in the comparison of the rrnA and rrnB 16S sequences, in that the differences are not randomly distributed but rather are concentrated in defined intervals.

[Figure 3.25: histograms of substitutions against nucleotide positions; panels include Hma rrnA/Hma rrnB, Hma rrnB/Hma rrnC and Hmo/Hcu]

Figure 3.25 Histogram showing the distribution of nucleotide substitutions within the 23S rRNAs from Ha. marismortui, Hb. cutirubrum and Hc. morrhuae. The number of nucleotide substitutions in every 10 nucleotide increment was determined for six pairwise sequence comparisons; the two sequences in each pair are indicated in the upper right hand corner. The number of nucleotide substitutions was plotted against the position in the nucleotide sequence, and the level of heterogeneity is indicated by the height of the vertical bars.

Table 3.4 A comparison of the nucleotide sequences of the 23S rRNA from halophilic archaeal species. The abbreviations used are as follows: L is the length of the nucleotide sequences compared; Ts is the number of transition substitutions; Tv is the number of transversion substitutions; Total is the total number of substitutions.

Complete 23S sequence

                        Nucleotide differences
Comparison     L       Ts     Tv     Total    Percentage similarity
rrnA/rrnB      2917    27     12     39       98.7%
rrnA/rrnC      2917    24     10     34       98.8%
rrnB/rrnC      2917    10     2      12       99.6%
rrnA/Hcu       2903    288    153    441      84.8%
rrnB/Hcu       2903    283    155    438      84.9%
rrnA/Hmo       2896    235    152    387      86.6%
rrnB/Hmo       2896    238    155    393      86.4%
Hcu/rrnC       2903    285    155    440      84.8%
Hmo/rrnC       2903    240    155    395      86.4%
Hcu/Hmo        2890    275    120    395      86.3%

Comparison of the rrnA and rrnB 23S sequences with the rrnC sequence indicates that the rrnC sequence contains 29 rrnA-specific substitutions (within nucleotide positions 984-1786 and 2413-2769) and 11 rrnB-specific substitutions (at position 2322 and within regions 72-363 and 1322-1564). Four substitutions (at positions 287, 288, 295 and 1163) are unique in that they are not present in rrnA or rrnB. These results indicate that in the rrnC sequence, except at position 2311, the rrnB-like substitutions are concentrated within the first 1786 nucleotides of the sequence and the rrnA-like substitutions are located only beyond position 984 of the rrnC 23S rRNA sequence. Both the rrnA- and rrnB-like substitutions are present between positions 984 and 1786. In addition, analysis of the 23S flanking sequences of rrnC indicates that the 16S-23S spacer appears identical to that of rrnB, whereas the 5S and distal regions appear identical to those of rrnA.
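The percentage similarities in Table 3.4 are consistent with a simple calculation in which the total number of substitutions (Ts + Tv) is divided by the compared length L and subtracted from 100%. The formula is not stated explicitly in the text, so the following Python sketch treats it as an assumption and checks it against the tabulated values; the names TABLE_3_4, percent_similarity and ts_tv_ratio are illustrative only.

```python
# Rows of Table 3.4: (comparison, L, Ts, Tv, Total, reported % similarity)
TABLE_3_4 = [
    ("rrnA/rrnB", 2917, 27, 12, 39, 98.7),
    ("rrnA/rrnC", 2917, 24, 10, 34, 98.8),
    ("rrnB/rrnC", 2917, 10, 2, 12, 99.6),
    ("rrnA/Hcu", 2903, 288, 153, 441, 84.8),
    ("rrnB/Hcu", 2903, 283, 155, 438, 84.9),
    ("rrnA/Hmo", 2896, 235, 152, 387, 86.6),
    ("rrnB/Hmo", 2896, 238, 155, 393, 86.4),
    ("Hcu/rrnC", 2903, 285, 155, 440, 84.8),
    ("Hmo/rrnC", 2903, 240, 155, 395, 86.4),
    ("Hcu/Hmo", 2890, 275, 120, 395, 86.3),
]


def percent_similarity(length: int, total: int) -> float:
    """Assumed formula: 100 * (1 - substitutions / compared length)."""
    return 100.0 * (1.0 - total / length)


def ts_tv_ratio(ts: int, tv: int) -> float:
    """Ratio of transitions to transversions for one comparison."""
    return ts / tv


for name, length, ts, tv, total, reported in TABLE_3_4:
    computed = percent_similarity(length, total)
    print(f"{name:10s} Ts+Tv={ts + tv:3d} "
          f"computed={computed:.1f}% reported={reported}% "
          f"Ts/Tv={ts_tv_ratio(ts, tv):.2f}")
```

Under this assumption every row of the table is reproduced to the reported precision, and Ts + Tv equals the Total column in each case, so the table is internally consistent.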
There are three possible explanations for the presence of rrnA- or rrnB-like sequences within the third (rrnC) 23S rRNA gene sequence determined by Brombach et al. (1989). First, the sequence is from the third operon, rrnC. Second, the third sequence might be a hybrid of the rrnA and rrnB operons produced by recombination within the identical domains of the rRNA gene sequences from the rrnA and rrnB operons and present in a small proportion of the bacterial population. The four unique sites in the rrnC sequence might then be sequencing errors. Third, the sequence obtained by Brombach et al. (1989) might be a combination of the rrnA, rrnB and rrnC operon sequences. The procedure used by Brombach et al. (1989) involved a DNA probe to identify the 23S rRNA gene (the probe binds at identical positions within the 3'-end of the 23S rRNA genes of all three operons), and the 23S rRNA gene was then sequenced in sections by chromosomal walking. Since neither the strategy used to isolate the DNA fragments for subcloning nor the manner in which the chromosomal walking was performed was described, it is not clear whether their clones were derived from all three operons. The resolution of this issue awaits the cloning and sequencing of the intact rrnC operon.

Pairwise comparisons of the 23S sequences from Hb. cutirubrum and Hc. morrhuae to any one of the three Ha. marismortui sequences show that they are only 84.8-86.6% identical. Unlike the intraspecies comparisons (between the three 23S sequences from Ha. marismortui), the nucleotide differences in all interspecies pairwise alignments are not concentrated in specific domains but rather are more generally distributed throughout the entire 23S sequence (Figure 3.25). These distribution patterns are similar to those in the interspecies comparison of the 16S sequences from halophiles. In both intra- and interspecies comparisons, transitions outnumber transversions by about two to one, and 60% of the substitutions are compensatory, affecting both components of a nucleotide base pair in regions of RNA helical structure.

The intraspecies comparisons of the 23S rRNA sequences of Ha. marismortui (rrnA-rrnB, rrnA-rrnC and rrnB-rrnC) gave 98.7%, 98.8% and 99.6% identity, respectively. These identities are substantially lower than expected for paralogous small or large subunit rRNA genes within the genome of most organisms (i.e., 0.1% divergence). However, based on the nucleotide similarity alone, it would appear that the divergence of the rrnA, rrnB and rrnC 23S rRNA sequences from each other is a recent event (i.e., in terms of evolutionary time). All pairwise interspecies comparisons show higher degrees of sequence divergence (between 86.6% and 84.8% identity) and suggest that the orthologous large subunit genes (23S rRNA) from the halophiles diverged from a common ancestral sequence within a short period of evolutionary time (about 600 million years ago). This situation is similar to the divergence of the small subunit genes (16S rRNAs) from the same halophilic organisms (see section 3.2.4.1 above).

3.2.6.2 Secondary Structural Analysis

Pairwise comparisons between the 23S rRNA sequences from the rrnA, rrnB and rrnC operons of Ha. marismortui indicate that the nucleotide substitutions between each pair are concentrated in four specific domains (I, III, IV and VI) of the universal secondary structural model for E. coli LSU rRNA (see Figure 1.10; Noller et al., 1981). Most of the substitutions are located within domain I (Figure 3.26). Detailed analysis of the nucleotide substitution patterns between the 23S rRNAs from the rrnA, rrnB and rrnC operons using secondary structural models from E. coli and Hb. cutirubrum also indicates that most of the substitutions are compensatory and are located within the variable regions of the secondary structures. None of the substitutions observed between rrnA and rrnB, rrnA and rrnC or rrnB and rrnC would affect any known tertiary interactions, protein binding sites, or functional domains such as the peptidyl transferase center, the GTPase center or the α-sarcin loop (Leffers et al., 1987; Raué et al., 1990).

[Figure 3.26: secondary structure diagram of Domain I of the 23S rRNA]

Figure 3.26 The predicted secondary structure of Domain I of 23S rRNA from Ha. marismortui. The secondary structure for Domain I of 23S rRNA from the rrnA operon, bounded by nucleotide positions 72 to 365, is illustrated. Mutational differences between rrnA and rrnB (*) and between rrnA or rrnB and rrnC (•) are indicated.

3.2.7 23S Distal Region

3.2.7.1 23S-5S Intergenic Spacer

The 139 nucleotide long 23S-5S intergenic spacer sequences from the rrnA, rrnB, and rrnC operons are identical and are 27 nucleotides longer than the corresponding sequence of Hb. cutirubrum. Alignment of the 23S-5S spacer sequence from Hb. cutirubrum with any one of the three 23S-5S spacer sequences from Ha. marismortui indicates that they are 62.5% identical (Figure 3.27).
The alignment studies also showed that there were two regions of higher sequence conservation: the 23S processing stem region, and a 19 nucleotide long conserved sequence that forms helix G in the universal secondary structure for the primary transcript (Figure 3.7). The sequences of the bulge motifs of the processing stem that are involved in the excision of 23S rRNAs in Hb. cutirubrum and Ha. marismortui are different; the former has TTT whereas the three operons of the latter have CAA. The complementary half of this inverted repeat is located in the 16S-23S intergenic spacer (see Figure 3.7). The conservation of the sequences involved in the formation of helix G within genera of halophilic archaea (Figures 3.27, 3.7A, 3.7B and 3.7C) implies that this sequence is of crucial importance (Kjems and Garrett, 1990). The presence of helix G at the 5'-region of the 5S rRNA might be an important element for the processing of 5S rRNA.

[Figure 3.27: sequence alignment of the 23S-5S spacers of Hma rrnA, Hma rrnB, Hma rrnC (139 nucleotides each) and Hcu (112 nucleotides), with the 23S 3' processing site and helix G marked]

Figure 3.27 Comparison of the 23S-5S intergenic sequences. The sequences from the rrnA, rrnB, and rrnC operons of Ha. marismortui and the single operon from Hb. cutirubrum are compared. Sequence similarity between two sequences is indicated by dots (•); to maintain the alignment, gaps or deletions (-) were introduced within the sequences. The distal portion of the 23S processing repeat is underlined; excision occurs by cleavage within the three bulge nucleotides (not underlined). Helix G is overlined.

[Figure 3.28: sequence alignment of the 5S rRNA genes of Hma rrnA, Hma rrnB, Hma rrnC and Hcu]

Figure 3.28 Comparison of the 5S rRNA genes. The 5S sequences from the rrnA, rrnB and rrnC operons of Ha. marismortui are compared with the 5S sequence from Hb. cutirubrum. Only nucleotides that differ from the rrnA sequence are indicated. Dashes (-) indicate nucleotides identical to the rrnA sequence; dots (•) indicate single nucleotide gaps in the sequence(s) required to maintain alignment.

3.2.7.2 5S RNA

The 5S rRNA sequences from the rrnA and rrnB operons have been determined and are 120 nucleotides in length. A comparison of the rrnA and rrnB 5S sequences with the rrnC sequence (Brombach et al., 1989) revealed that the 5S rRNA sequences from rrnA and rrnC are identical, whereas the rrnB sequence differs at two positions (Figure 3.28). The 5S rRNA sequence of Hb. cutirubrum differs from the Ha. marismortui sequence at 10 of 120 nucleotide positions (91.7% similarity).

[Figure 3.29: secondary structure diagram of the 5S rRNA, helices I to V]

Figure 3.29 The predicted secondary structure for the 5S rRNA from the rrnA operon of Ha. marismortui. Mutational differences between rrnA and rrnB (*), normal Watson-Crick (-), and G • U base pairs are indicated. This secondary structure shows the characteristic features of the group II archaeal (representing extremely halophilic) 5S rRNAs (discussed in section 1.7.4; Fox, 1985). The rrnC 5S sequence is identical to the rrnA 5S sequence.

All three 5S rRNA sequences from Ha. marismortui can conform to the expected group II archaeal (halophilic) structure (Figure 1.7). The two nucleotide substitutions observed between the rrnB and rrnA (or rrnC) 5S sequences are located within the helix V region of the secondary structural model for the 5S rRNA (Figure 3.29). One substitution, located at position 73 in the 5S rRNAs of the rrnA and rrnC operons, disrupts a G-C base pair (present in the rrnB operon) by a G->A substitution; however, the substitution at position 106 does not disrupt the helix structure (Figure 3.29). These two substitutions are located within the regions involved in the binding of the L25 protein in E. coli. Comparison of the rrnA, rrnB and rrnC 5S rRNA structures with the reported tertiary interaction sites in Hb. cutirubrum revealed that the two substitutions do not affect the nucleotides involved in any tertiary interactions. The characteristic structural features of group II 5S rRNAs include: no loop-out bases in helix I; an extended helix II; a helix III loop length of 13 nucleotides; a -CGAAC- sequence within the helix III loop; 16 nucleotides between the main portion of helix II and the beginning of helix IV; and 21 nucleotides in the helix IV stem and loop (Fox, 1985; Nazar, 1991).

3.2.7.3 The 5S Distal Region

With the exception of the first nucleotide, the sequences of the rrnA, rrnB and rrnC operons are identical for the first 61 nucleotides beyond the 5S gene sequence. At this point, the rrnB operon sequence diverges abruptly and exhibits no further similarity to the rrnA and rrnC sequences (Figures 3.6, 3.30).
The perfect identity between rrnA and rrnC continues to position 200 (the EcoRI site); there is no sequence available beyond this point for the rrnC operon (see section 3.2.2.1). The SS-tRNA^ys intergenic spacer in the rrnA operon is 239 nucleotides in length. Sequence and Southern hybridization analysis indicate that there is no tRNACys g ne in the distal region of rrnB (Figure 3.4). e  The 5S distal region of Hb.  cutirubrum contains a tRNACys gene, 110 nucleotides downstream of the 3'-end of the 5S rRNA gene (Hui and Dennis, 1985). The tRNACys sequences from the two organisms are 97.4% identical, whereas, there is very little homology in the immediate 5'-and 3'-flanking sequences. Predicted secondary structures for the primary transcripts of the rrnA, rrnB, rrnC and that of Hb. cutirubrum indicated that helix H structures present in the 5S distal regions, are similar to rho-independent termination signals in E. coli (see section 1.6.3). The helix H structures from the rrnA, rrnB and rrnC operons are identical and they are followed by T-rich sequences (Figures 3.7A, B and C). The helix H sequence of Hb. cutirubrum is also followed by T-rich sequences; however, the primary and secondary structures comprising helix H are not identical to those from the rrnA, rrnB or rrnC operons. 
In the rrnA operon, a helix I structure is present within the 3'-flanking region of the tRNACys gene. This structure (helix I), which is also followed by a T-rich sequence, acts as a termination site in the rrnA operon.

Figure 3.30 Comparison of the 5S distal regions. The 5S distal regions from the rrnA, rrnB, and rrnC operons of Ha. marismortui and the single operon from Hb. cutirubrum. The lengths of the rrnA, rrnB, rrnC and Hb. cutirubrum sequences are 480, 168, 200 and 324 nucleotides, respectively. The nucleotides that are identical between two sequences are indicated by dots (•). The tRNACys genes from rrnA and Hb. cutirubrum are also indicated (***). Helices H and I are underlined. The AvaI site, the EcoRI site and the sequence diverging point are marked.

3.2.7.4 Primary Transcript Analysis of the 23S Distal Region

Using nuclease S1 digestion experiments, the regions downstream of the 23S rRNAs of the rrnA and rrnB operons were examined to locate the 3'-processing sites, 3'-maturation sites, 5S maturation sites and termination signals. A 365 nucleotide long AvaI-AvaI fragment from the rrnA operon was used as a probe for the analysis of the processing and maturation sites of the 23S and 5S rRNAs from the rrnA and rrnB transcripts. This fragment contains two substitutions (between rrnA and rrnB) within the 5S genes and one substitution within the 5S distal region; however, none of these single nucleotide substitutions appears to affect the analysis. The 365 nucleotide fragment isolated from the rrnA operon contains the last 66 bp of the 23S gene, the 139 nucleotide long 23S-5S spacer, the 120 nucleotide long 5S rRNA gene and a 40 nucleotide long 5S distal region. The fragment was 3'-labelled with α-32P dTTP on the minus strand, at position 2852, within the 23S gene. The labelled fragment was hybridized to total Ha. marismortui RNA and treated with S1 nuclease, and the reaction products were separated on an 8% polyacrylamide gel along with the probe and size markers. In addition to full length protection, a number of partial protection products of approximately 65, 127, 204 and 328 nucleotides in length were observed (Figure 3.31A).
These products correspond respectively to protection by RNAs with 3'-ends generated by cleavage of the primary transcripts of the rrnA and rrnB operons at or near (i) the 3'-end of mature 23S rRNAs at position 2917 of the 23S rRNA genes, (ii) the precursor 23S rRNA processing sites at position 62 within the 23S-5S spacer regions, (iii) the 5'-end of the 5S rRNA gene at position 1 of the 5S rRNA genes, and (iv) the 3'-end of the 5S rRNA genes at position 124 of the 5S rRNA genes. Multiple bands below the 204 nucleotide band may be due to imprecise 5S processing, exonuclease activity at the 3'-transcript end or S1 nibbling at the end of the RNA-DNA hybrid. It is also possible that the 5S rRNAs are excised as precursors with a few extra nucleotides at the 5'- or 3'-ends and then trimmed to mature length. The presence of a full length product from the 368 nucleotide probe suggests that the transcripts exiting the 23S gene read through the distal 5S gene. This probe does not include the polypyrimidine sequences located about 50-60 nucleotides distal to the 5S gene of the rrnA and rrnB operons or the position at which the two sequences start to diverge. Therefore, these sites have not been studied.

Figure 3.31 Nuclease S1 mapping analysis of the 5S distal regions from the rrnA and rrnB operons.

Figure 3.31A The line diagram showing the 23S distal region of the rrnA operon. The 368 nucleotide long AvaI-AvaI fragment was used for the S1 nuclease analysis of the 23S distal regions of rrnA and rrnB transcripts. The AvaI sites within the 23S rRNA and 5S distal regions are at identical positions in both rrnA and rrnB operons. The 368 nucleotide AvaI-AvaI fragment from the rrnA operon was 3'-labelled on the minus strand with α-32P dCTP and used as a probe for the S1 nuclease analysis (Figure 3.31C). A second, 1000 nucleotide long EcoRI-EcoRI fragment, containing the tRNACys gene, was also isolated from the 5S distal region of the rrnA operon. This fragment was 3'-labelled on the minus strand with α-32P dATP and was used to study the processing sites of the tRNACys and the termination sites downstream of the tRNACys gene (Figure 3.31D). The lengths of the partial protection products and their corresponding cleavage positions are indicated below.

Figure 3.31B The line diagram showing the 23S distal region of the rrnB operon. The two AvaI sites shown in the diagram are located at positions identical to those in the rrnA operon, and this 368 nucleotide long AvaI-AvaI fragment from the rrnB operon is nearly identical to that of the rrnA operon (with only 3 substitutions). The partial protection products and their corresponding cleavage positions are shown below.

Figure 3.31C Autoradiogram showing the nuclease S1 protection assay products of the 23S distal regions from the rrnA and rrnB operons. Above each lane, the probe, S1 products, and size markers are indicated. The sizes of the markers and the S1 products are indicated on the right and left sides, respectively.

Figure 3.31D Autoradiogram showing the nuclease S1 protection assay products of the 5S distal regions from the rrnA operon. Above each lane, the probe, S1 products, and size markers are indicated. The sizes of the markers and the S1 products are indicated on the right and left sides, respectively.
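The expected lengths of the partial protection products can be checked against the coordinates quoted in the text. The following is a minimal sketch in Python, not part of the original analysis. The constant and function names are hypothetical, and it assumes that each protected length is simply the distance from the label at position 2852 to the mapped 3'-end. The probe components listed in the text sum to 365 nucleotides, whereas the figure legend gives 368; the sketch does not attempt to reconcile the two values.

```python
# Coordinates quoted in the text for the rrnA probe; all names are illustrative.
LABEL_POS_23S = 2852    # labelled position within the 23S gene
END_23S = 2917          # 3'-end of the mature 23S rRNA
SPACER_LEN = 139        # length of the 23S-5S spacer
PROCESSING_SITE = 62    # 23S processing site, counted within the spacer
END_5S = 124            # 3'-end position quoted for the 5S rRNA

# Probe components as listed in the text (nucleotides).
PROBE_COMPONENTS = {"23S": 66, "spacer": 139, "5S": 120, "5S distal": 40}


def expected_products():
    """Return the distance (nt) from the label to each mapped 3'-end."""
    to_23s_end = END_23S - LABEL_POS_23S
    to_5s_start = to_23s_end + SPACER_LEN
    return {
        "23S 3'-end": to_23s_end,
        "23S processing site": to_23s_end + PROCESSING_SITE,
        "5S 5'-end": to_5s_start,
        "5S 3'-end": to_5s_start + END_5S,
    }


if __name__ == "__main__":
    for site, length in expected_products().items():
        print(f"{site}: {length} nt")
    print("probe length from components:", sum(PROBE_COMPONENTS.values()))
```

Under these assumptions the computed distances (65, 127, 204 and 328 nucleotides) agree with the approximate sizes of the observed partial protection products.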
Since S1 nuclease analyses are only semi-quantitative, the band intensities do not reflect the exact amounts of RNA present in the cells. These analyses are done mainly to detect the transcription initiation, termination or processing sites of the transcripts. In the case of rRNA transcripts, about 1-2% of the rRNA is usually expected to be in the form of very long precursors and processing intermediates; most of the rRNA is in the mature form (King and Schlessinger, 1983). However, this is not what was observed in the S1 nuclease analysis shown in Figure 3.31C; the intensities of the bands representing the mature 23S rRNAs (64 nucleotides in length) are much lower than expected. There are several possible explanations for this. First, the efficiency of hybridization between rRNA and the DNA probe may depend on the lengths of the rRNAs; longer RNAs hybridize more efficiently than shorter ones and, during a three hour hybridization, displacement may occur. Second, the conditions used in the assay (especially temperature) permit longer RNAs to hybridize more efficiently. Third, the majority of the mature rRNAs are folded into secondary structures and may not be readily available to hybridize with the DNA probe.

Another S1 mapping analysis was performed to investigate the processing sites of the tRNACys and the termination signals within the 5S distal region of the rrnA operon. The 1000 nucleotide long EcoRI-EcoRI fragment contains a 45 nucleotide region upstream of the 5'-end of the tRNACys gene, the 74 nucleotide long tRNACys gene and about 881 nucleotides beyond the tRNACys gene. This fragment was 3'-labelled at position 203 distal to the 5S sequence, hybridized to total Ha. marismortui RNA, treated with S1 nuclease and separated on an 8% polyacrylamide gel along with the probe and size markers (Figure 3.31B). Three partial protection products of approximately 115, 165 and 280 nucleotides in length were observed.
These correspond respectively to protection by RNAs with 3'-ends (i) at the 3'-end of the tRNACys sequence at position 321; (ii) at the putative transcription termination site near position 368 distal to the 5S sequence; and (iii) at a second transcription termination site near position 473 distal to the 5S gene. By carrying out an S1 nuclease analysis using the 1000 nucleotide long EcoRI-EcoRI fragment, along with Maxam and Gilbert sequencing, it was possible to map the 3'-end of the tRNACys at the nucleotide level (data not shown). The two larger protection products are assumed to result from protection by terminated transcripts. Alternatively, one of them might represent the site of sequence divergence between the rrnA and rrnC operons.

A number of conclusions concerning transcription and processing in the 23S distal region can be drawn from these S1 nuclease analyses. In the rrnA operon, the observation that only the 3'-end of the tRNACys was apparent, and not the 5'-end, suggests that the transcript exiting the 5S gene reads through the tRNACys gene and terminates downstream of the gene. It also suggests that processing of the 3'-end of the tRNACys occurs before that of the 5'-end. In the rrnA operon, transcription appears to terminate at T-rich residues near positions 368 and 473, distal to the tRNACys sequence, which are preceded by a helix structure (position 368 is located near helix I in Figure 3.7A). This structure resembles the rho-independent termination signals in E. coli. The precise role of these conserved secondary structural features and their relationship to transcription termination remain to be established.

3.3 SUMMARY

Ha. marismortui contains two (or possibly three) rRNA operons in its genome. The rrnA and rrnB operons were initially identified and isolated by Mevarech et al. (1989) and cloned into plasmid pBR322. These two clones were fully characterized in this study.
In addition, genomic hybridization analysis using different restriction enzymes suggests that a third operon, designated rrnC, might be present. A 23S rRNA gene and flanking sequences from Ha. marismortui were published several years ago by Brombach et al. (1989). This sequence is different from the 23S rRNA sequences of the rrnA and rrnB operons determined in this study. Here it is assumed that the sequence published by Brombach et al. (1989) corresponds to the rrnC 23S rRNA gene and flanking regions, although it may be an artifactual composite of the rrnA, rrnB and rrnC sequences (see section 3.2.6.1 above). Clarification of this issue will require cloning and characterization of the complete rrnC operon.

Sequence and Southern hybridization analyses show that the gene order of the rrnA operon is 5'-16S-tRNAAla-23S-5S-tRNACys-3' and the gene order of the rrnB operon is 5'-16S-23S-5S-3'. Comparison of the nucleotide sequences from the two operons shows that most of the nucleotide substitutions are located within the noncoding flanking and spacer regions, although long portions of the 5'-flanking, 16S-23S spacer and 5S distal regions are virtually identical. It is possible that these conserved regions contain important motifs used for the regulation or processing of the RNA products. The two putative primary transcripts can be folded to form secondary structures containing processing stems that surround the 16S and 23S rRNAs, as well as seven other conserved regions of potential secondary structure (designated helices A to G; Kjems and Garrett, 1990).

The 16S 5'-flanking sequences of rrnA and rrnB differ in a number of substantive ways. The rrnA operon contains four promoters; primary transcript analysis shows that all are active in transcript initiation. The rrnB operon contains two promoter-like sequences but only one has been shown to be active.
In the rrnA transcript, the processing helices surrounding the 16S and 23S gene sequences contain the consensus "bulge-helix-bulge" motif that is recognized by an endonuclease and used to excise pre-16S and pre-23S rRNA from the primary transcripts. The transcript from rrnB contains the consensus 23S processing motif, but the 16S stem lacks the "bulge-helix-bulge" motif and processing presumably proceeds by a different pathway.

The 16S rRNAs from the rrnA and rrnB operons differ at 74 positions of the 1472 nucleotide sequence. The substitutions are not spread throughout the molecule but rather are confined within three different domains, defined by nucleotide positions 58-321, 508-823, and 986-1158. About two thirds of the substitutions are located within the 508-823 domain and most of them are compensatory, affecting both components of a nucleotide base pair within the molecule. None of the substitutions is believed to affect functionally important nucleotide positions. Using stringent and non-stringent nuclease S1 protection assays, it was shown that both of the 16S rRNAs are expressed and present in the active 70S ribosomes.

Within the 16S-23S spacer region, the rrnA operon contains a tRNAAla gene whereas the rrnB operon does not. Three unique processing sites were identified within the spacer region of the rrnB operon which are probably important for 16S excision and maturation. The intergenic spacers in both operons contain an active internal promoter. This promoter is located at the beginning of a 177 nucleotide long region immediately preceding the 23S genes, which is identical in the two operons.

The pairwise comparison of the 2917 nucleotide long 23S rRNA sequences showed 39 differences between rrnA and rrnB, 28 between rrnA and rrnC, and 16 between rrnB and rrnC. Most of the differences are compensatory. None of the substitutions appears to affect functionally important positions.
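The substitution counts above can be converted into percent identities with a one-line calculation. The sketch below is illustrative only; the function name `percent_identity` and the dictionary layout are assumptions, and the calculation treats every difference as a single substituted position in an ungapped comparison of the stated length.

```python
def percent_identity(differences: int, length: int) -> float:
    """Percentage of identical positions, given the number of differing positions."""
    if length <= 0 or not 0 <= differences <= length:
        raise ValueError("differences must lie between 0 and length")
    return 100.0 * (length - differences) / length


# (number of differences, sequence length) as quoted in the summary.
COMPARISONS = {
    "16S rrnA vs rrnB": (74, 1472),
    "23S rrnA vs rrnB": (39, 2917),
    "23S rrnA vs rrnC": (28, 2917),
    "23S rrnB vs rrnC": (16, 2917),
}

if __name__ == "__main__":
    for name, (diff, length) in COMPARISONS.items():
        print(f"{name}: {percent_identity(diff, length):.1f}% identical")
```

On these inputs the rrnA and rrnB sequences come out at about 95.0% identity for 16S rRNA and about 98.7% for 23S rRNA.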
The presence of identical flanking sequences suggests that excision and processing of 23S rRNA from the three operons proceed by a single common pathway. The intermediates in this pathway were identified using S1 nuclease protection assays.

The 139 nucleotide long 23S-5S spacer regions are identical in the rrnA, rrnB and rrnC operons. The 5S gene from rrnB differs from the rrnA and rrnC sequences at two positions, but both positions appear to be functionally unimportant. The rrnA operon contains a tRNACys gene located 239 nucleotides beyond the 5S gene. Transcripts that extend through the tRNA appear to terminate at T-rich pyrimidine tracts located in the 3'-flanking region. No termination signals have been identified for the rrnB operon.

CHAPTER 4 Phylogenetic Implications of Sequence Diversity Between the Two Ribosomal RNA Operons, rrnA and rrnB, from Haloarcula marismortui

4.1 Introduction

Studies on ribosomal RNAs from the rrnA and rrnB operons of Ha. marismortui are of particular interest because this is the first archaeal organism in which rRNA sequence heterogeneity has been documented. It is expected that some of this sequence information will help us understand the molecular adaptation of halophilic organisms to an extreme saline environment. These studies will also provide information for establishing the phylogenetic relationship between the Ha. marismortui rRNA operons and those of other related halophilic archaeal organisms. By using the rRNA gene sequence data from the two non-identical rRNA operons of Ha. marismortui, the phylogenetic implications of sequence diversity between the two operons, and how they are related to other halophiles, can be studied. To investigate the phylogenetic implications, the divergent sequences from the rrnA and rrnB operons of Ha. marismortui, the homologous sequences from related halophiles, and sequences from two outgroup species were aligned.
Four different comparisons were carried out for this phylogenetic analysis: the 508-823 domain of 16S rRNA, the entire 16S rRNA, the entire 23S rRNA and the combined 16S-23S-5S rRNA sequences. In principle, the trees generated from each of these sequences should be congruent and should reflect the phylogenetic relationships between the organisms from which the sequences were obtained. In practice, the trees are essentially congruent except for the 508-823 domain tree.

4.2 Results and Discussion

4.2.1 Phylogeny of the 508-823 Domains of the 16S rRNA Genes

The 508-823 domains of five halophilic 16S rRNA genes, from Ha. marismortui (both the rrnA and rrnB operons), Hf. volcanii, Hb. cutirubrum, and Hc. morrhuae, and from two outgroup species, M. vannielii and E. coli, were used in this study. The tree obtained is shown in Figure 4.1A. It indicates that all the halophiles group together with a consistency value of 1.00. Contrary to expectation, the rrnA and rrnB 508-823 domains do not group together; rrnA branches with Hb. cutirubrum and Hc. morrhuae, whereas rrnB branches before Hf. volcanii. It is also interesting to point out that Hb. cutirubrum and Hc. morrhuae prefer high salt concentration (3-5 M) whereas Hf. volcanii prefers moderate salt concentration (2.14 M). The salinity of hypersaline lakes (e.g. the Dead Sea) can vary due to seasonal changes (wet or dry periods; Edgerton and Brimacombe, 1981). Organisms that live in this environment must be able to adapt to fluctuating salinity conditions. Ha. marismortui has been able to adapt to a variety of lakes with different salt compositions. The mutational changes observed within the 508-823 domain of the 16S rRNA genes from the rrnA and rrnB operons may be important for the organism to survive in different saline environments. In this scenario, the rrnB operon, because of its similarity to Hf.
volcanii, is used to survive in moderate salt concentration whereas the rrnA operon, because of its similarity to Hb. cutirubrum and Hc. morrhuae, is used to survive in high salt concentration.

Figure 4.1 Phylogeny obtained from the 508-823 domain of the 16S rRNA genes (Tree A), the complete 16S rRNA genes (Tree B), the complete 23S rRNA genes (Tree C), and the combination of 16S-23S-5S rRNAs (Tree D). A parsimony analysis of aligned 16S and 23S sequences and 16S-23S-5S sequences was carried out using PAUP with the heuristic search option. The most parsimonious tree is illustrated along with the consistency index for each branch. The numbers preceding each branch are bootstrap consistency values and indicate the proportion of replications that group all the taxonomic units within the branch. The consistency index values for Trees A, B, C, and D are 0.833, 0.824, 0.909, and 0.91, respectively. The branches with consistency values less than 0.5 are not indicated.

4.2.2 Phylogeny of the Complete 16S rRNA Genes

The entire 16S rRNA sequences were used to analyse the phylogenetic implications of the sequence divergence observed between the rrnA and rrnB operons of Ha. marismortui. Again, the five halophilic 16S sequences form a coherent group with a consistency value of 1.00. In this tree, the rrnA and rrnB sequences form a coherent subgroup because the differences in the 508-823 region have been masked by the higher degree of similarity in the rest of the molecule. The rrnB sequence appears to be evolving more rapidly than the rrnA sequence. Presumably, this represents the accumulation of nucleotide substitutions, predominantly in the 508-823 domain, that make the domain more like that of Hf. volcanii. The two identical 16S genes from Hf. volcanii branch early and the single 16S genes from Hb. cutirubrum and Hc.
morrhuae branch later from the lineage leading to the paralogous 16S rRNA genes from Ha. marismortui. It is not known whether the 16S gene from the rrnC operon is identical to one of the other two or represents a third type of 16S gene that formed by a second paralogous sequence divergence event.

4.2.3 Phylogeny of the 23S rRNA Genes

The 23S rRNA gene sequence from Hf. volcanii is not available, whereas a third 23S rRNA gene sequence (rrnC) from Ha. marismortui is available for this comparison. The available 23S gene sequences were analysed by PAUP and the resulting tree is illustrated in Figure 4.1C. As with the 16S tree, the halophilic 23S sequences form a coherent group and the rrnA, rrnB, and rrnC sequences of Ha. marismortui form a coherent subgroup, with rrnB and rrnC on one branch and rrnA on the other. The Hb. cutirubrum (halobium) and Hc. morrhuae sequences are together on a separate branch.

4.2.4 Phylogeny of the 16S-23S-5S rRNA Genes

For statistical reasons, the resolving power of the PAUP analysis is related to the length of the sequences under comparison. Therefore, the available 16S, 23S and 5S sequences were combined for the rrnA and rrnB operons of Ha. marismortui and for the single operons from Hb. cutirubrum (halobium) and Hc. morrhuae. The outgroup sequences were again from M. vannielii and E. coli. The topology of this tree is identical to that of the 23S tree. Within the tree, the Ha. marismortui operons branch later than the Hb. cutirubrum and Hc. morrhuae operons; this may indicate that the Ha. marismortui rRNA operons are evolving more rapidly than the Hb. cutirubrum and Hc. morrhuae operons. Of the two operons from Ha. marismortui, rrnB appears to be evolving more rapidly than rrnA.

The genera of halophilic archaea considered here appear to have diverged from each other within a relatively short period of time (some 600 million years ago; Mylvaganam and Dennis, 1990; Ochman and Wilson, 1987).
This calculation is based on the assumption that in archaea, the mean rate of substitution is about 1% per 50 million years. Extensive analyses of both rRNA sequences and protein-encoding sequences have failed to produce congruent and reliable trees describing precise phylogenetic relationships and branching orders. Nonetheless, the results presented here indicate that the rrnA, rrnB and rrnC operons of Ha. marismortui, in spite of their sequence divergence, are more closely related to each other than they are to the sequences from three other halophilic species. The only exception to this is the 508-823 regions of rrnA and rrnB (see section 3.2.4.1).

4.2.5 Evolution of Ribosomal RNA Operons in Ha. marismortui

In the case of multicopy ribosomal RNA operons, it is highly unusual to observe nucleotide sequence differences above 0.1% within the genome of an organism. Sequence homogenization of these duplicated rRNA genes is believed to involve gene conversion and unequal crossing over in a process termed concerted evolution (see section 1.7; Li and Graur, 1991). In the extremely halophilic archaeon Ha. marismortui, the 16S, 23S and 5S rRNAs of the rrnA and rrnB operons are 95%, 98.7% and 98.4% identical, respectively. The numbers of nucleotide differences between these sequences are between 10 and 50 fold higher than what is normally observed for duplicated ribosomal RNA genes. There are at least three possible explanations for this unexpectedly high level of nucleotide sequence divergence: (i) a failure in the recombination system used to maintain sequence homogeneity, (ii) fusion to produce a chimeric genome containing a divergent rRNA operon, and (iii) selection-induced divergence within rRNA operons. Analysis of the positions of nucleotide difference between the 16S, 23S and 5S sequences from the rrnA, rrnB and rrnC operons indicates that all of the substitutions occur at positions that are normally phylogenetically variable.
In other words, none of the substitutions affects positions known to be important in the essential structure or functions of the RNA in the ribosome. It is therefore possible that the substitutions represent neutral mutations that have been fixed by random processes and that they have no physiological or biochemical significance. This does not explain why the differences are confined to certain regions and absent from other regions. However, one could imagine that gene conversion continues to operate over the regions of those genes that are homologous in sequence but no longer operates over the regions that have accumulated a high proportion of substitutions.

A second possible explanation is that there was a fusion between two organisms with divergent rRNA sequences and that their separate genomes were able to form a genetic chimera. One can imagine that gene conversion and recombination events might have homogenized parts of the ribosomal RNA operons whereas other parts have remained distinct. There is no direct evidence or precedent to support this suggestion. By sequencing the genome of Ha. marismortui (or the regions that are associated with the rRNA operons), it may be possible to identify more examples of duplicate genes with high sequence divergence. If there are such genes, this would support the theory of a cell fusion event.

A third possible explanation is that Ha. marismortui duplicated (into rrnA and rrnB) or triplicated (if the rrnC operon exists) its rRNA operons and suppressed homogenization in order to allow each operon to evolve independently. It is known that halophilic archaea balance the intracellular salinity with that of the external environment (which can vary between 1.5 and 5 M NaCl; Christian and Waltho, 1962; Ginzburg et al., 1970; Lanyi and Silverman, 1972). A single homogeneous ribosome population is unlikely to be able to maintain optimum function over this entire salinity range.
It may be that the observed sequence divergence produces ribosome populations with different salt optima. So far, no significant difference in the level of expression of the rrnA and rrnB operons has been detected under standard laboratory conditions (see section 2.2.1).

In summary, there clearly exists sequence heterogeneity between the three rRNA operons of Ha. marismortui. None of the differences appears to affect residues known to be critical to rRNA structure or function. At least two of the operons are known to be expressed under normal laboratory conditions. Three possible explanations for this divergence have been suggested; none of the three is compelling or completely satisfying.

Future Research Prospects

The studies in this thesis demonstrate that there are at least two and possibly three ribosomal RNA operons (rrnA, rrnB, and possibly rrnC) in the genome of Ha. marismortui. Only rrnA and rrnB have been characterized extensively. In order to understand the features of the rrnC operon (whether it exists, and whether it is rrnA-like, rrnB-like, a hybrid of rrnA and rrnB, or unique), the operon must be cloned and its entire sequence determined. First, using different 16S and 23S rRNA probes, the existence of this operon in the genomic DNA of Ha. marismortui must be established. If it exists, cloning and characterization of the entire rrnC operon must be performed in order to understand the origin and propagation of the rRNA operons of Ha. marismortui. In order to determine whether the rRNA genes from the rrnC operon are active, and to understand the metabolic pathways involved in the formation of functional rRNA molecules, nuclease S1 protection assays must be performed (see section 2.2.16).

It is known that halophilic archaea balance the salinity of the intracellular medium with that of the external environment (which can vary between 1.5 and 5 M NaCl; Christian and Waltho, 1962; Ginzburg et al., 1970; Lanyi and Silverman, 1972).
It is possible that under different external environments the expression of rRNA genes can be switched between different rRNA operons. In order to determine whether these operons are expressed under different extreme conditions, such as those encountered in halophilic biotopes, the organism should be grown under different salt concentrations (NaCl, KCl and MgCl2), and the expression of the rrnA, rrnB and, if present, rrnC operons under these varying conditions could be studied by nuclease S1 protection assays and/or primer extension analysis.

Since efficient transformation systems for Haloarcula species are now available (Cline and Doolittle, 1992), shuttle vectors can be used to investigate the molecular biology of Ha. marismortui. One or two of the rRNA operons of Ha. marismortui can be altered or removed from the genome using these vectors and the effects on the survival of the organism can be studied. With the altered genomic DNA of Ha. marismortui (containing only one rRNA operon), the effects of salt can also be studied.

REFERENCES

Achenbach-Richter, L., K. O. Stetter and C. R. Woese. 1987. A possible missing link among archaebacteria. Nature. 327:348 - 349.
Achenbach-Richter, L. and C. R. Woese. 1988. The ribosomal gene spacer region in archaebacteria. System. Appl. Microbiol. 10:211 - 214.
Anderson, S., A. T. Bankier, B. G. Barrell, M. H. L. de Bruijn, A. R. Coulson, J. Drouin, I. C. Eperon, D. P. Nierlich, B. A. Roe, F. Sanger, P. H. Schreier, A. J. H. Smith, R. Staden and I. G. Young. 1981. Sequence and organization of the human mitochondrial genome. Nature. 290:457 - 464.
Atkinson, T. and M. Smith. 1984. Solid phase synthesis of oligoribonucleotides by the phosphite-triester method. In Oligonucleotide Synthesis. A Practical Approach. Edited by Gait, M. J., IRL Press, Oxford/Washington, pp. 35 - 81.
Bayley, S. T. and D. J. Kushner. 1964. The ribosomes of the extremely halophilic bacterium Halobacterium cutirubrum. J. Mol. Biol.
9:654 - 669.
Bayley, S. T. and E. Griffiths. 1968. A cell-free amino acid incorporation system from an extremely halophilic bacterium. Biochem. 7:2249 - 2256.
Berghofer, B., L. Krockel, C. Kortner, M. Truss, J. Schallenberg and A. Klein. 1988. Relatedness of archaebacterial RNA polymerase core subunits to their eubacterial and eukaryotic equivalents. Nucleic Acids Res. 16:8113 - 8128.
Boone, D. R. and R. A. Mah. 1990. Family n. Methanosarcinaceae, genus Methanosarcina. In Bergey's Manual of Systematic Bacteriology. Edited by Staley, J. T., Bryant, M. P., Pfennig, N., and Holt, J. G., Williams and Wilkins, Baltimore, pp. 2198 - 2205.
Bock, A., H. Hummel, M. Jarsch and G. Wich. 1986. In Biology of Anaerobic Bacteria. Edited by Dubourguier H. D. et al. Elsevier Science Publishers, Amsterdam, pp. 206 - 226.
Branlant, C. A., M. A. Krol, A. Machatt, J. Pouyet, J.-P. Ebel, K. Edwards and H. Kossel. 1981. E. coli 23S rRNA heterogeneity. Nucleic Acids Res. 9:4303.
Brimacombe, R., B. Greuer, P. Mitchell, M. Osswald, J. Rinke-Appel, D. Schuler and K. Stade. 1990. Three-dimensional structure and function of Escherichia coli 16S and 23S rRNA as studied by cross-linking techniques. In The Ribosome: Structure, Function and Evolution. Edited by Hill, W. E., Dahlberg, A., Garrett, R. A., Moore, P. B., Schlessinger, D. and Warner, J. R., American Society for Microbiology, Washington, D.C., pp. 93 - 106.
Brombach, M., T. Specht, V. A. Erdmann and N. Ulbrich. 1989. Complete nucleotide sequence of a 23S ribosomal RNA gene from Halobacterium marismortui. Nucleic Acids Res. 17:8.
Brosius, J., T. J. Dull, D. D. Sleeter and H. F. Noller. 1981. Gene organization and primary structure of a rRNA operon from Escherichia coli. J. Mol. Biol. 148:107 - 127.
Brown, A. D. 1976. Microbial water stress. Bact. Rev. 40:803 - 846.
Brown, J. W., C. J. Daniels and J. N. Reeve. 1989. Gene structure, organization and expression in archaebacteria. CRC Crit. Rev. Microbiol. 16:287 - 338.
Chang, S., A.
Majumdar, R. Dunn, O. Makobe, U. R. Bhandary, H. G. Khorana, E. Ohtsuka, T. Tanaka, Y. Taniyama and M. Ikehara. 1981. Bacteriorhodopsin: partial sequence of mRNA provides amino acid sequence in the precursor region. Proc. Natl. Acad. Sci. USA. 78:3398 - 3402.
Chant, J. and P. P. Dennis. 1986. Archaebacteria: transcription and processing of ribosomal RNA sequences in Halobacterium cutirubrum. EMBO J. 5:1091 - 1097.
Christian, J. H. B. and J. A. Waltho. 1962. Solute concentrations within cells of halophilic and non halophilic bacteria. Biochem. Biophys. Acta. 65:506 - 508.
Christiansen, J. and R. A. Garrett. 1986. How do protein L18 and 5S RNA interact? In Structure, Function and Genetics of Ribosomes. Edited by Hardesty, B., and Kramer, G., Springer-Verlag, New York, pp. 733 - 748.
Cline, S. W. and W. F. Doolittle. 1987. Efficient transfection of the archaebacterium Halobacterium halobium. J. Bacteriol. 169:1341 - 1344.
Cline, S. W. and W. F. Doolittle. 1992. Transformation of members of the genus Haloarcula with shuttle vectors based on Halobacterium halobium and Haloferax volcanii plasmid replicons. J. Bacteriol. 174:1076 - 1080.
Condon, C., S. French, C. Squires and C. L. Squires. 1993. Depletion of functional ribosomal RNA operons in Escherichia coli causes increased expression of the remaining intact copies. EMBO J. 12:4305 - 4315.
Court, D. 1993. RNase III: a double strand RNA processing enzyme. In Control of mRNA Stability. Edited by Brawerman, G., and Belasco, J. New York: Academic Press.
Daniels, C. J., S. E. Douglas, H. Z. McKee and W. F. Doolittle. 1985. Archaebacterial tRNA genes: structure and intron processing. In Microbiology. Edited by L. Leive. American Society for Microbiology Publications, Washington, DC. pp. 349 - 355.
Darnell, J. E. and W. F. Doolittle. 1986. Speculations on the early course of evolution. Proc. Natl. Acad. Sci. USA. 83:1271 - 1275.
Darr, S. C., J. W. Brown and N. R. Pace. 1992. The varieties of ribonuclease P. TIBS. 17:178 - 182.
DasSarma, S., U. RajBhandary and H. G. Khorana. 1984. Bacterio-opsin mRNA in wild type and bacterio-opsin deficient Halobacterium halobium strains. Proc. Natl. Acad. Sci. USA. 81:125 - 129.
DasSarma, S., T. Damerval, J. G. Jones and N. T. Demarsac. 1987. A plasmid-encoded gas vesicle protein gene in a halophilic archaebacterium. Mol. Microbiol. 1:365 - 370.
Dennis, P. P. 1985. Multiple promoters for the transcription of ribosomal RNA gene cluster in Halobacterium cutirubrum. J. Mol. Biol. 186:457 - 461.
Dennis, P. P. 1991. The ribosomal RNA operons of halophilic archaebacteria. In General and Applied Aspects of Halophilic Microorganisms. Edited by Rodriguez-Valera, F., Plenum Press, New York, pp. 251 - 257.
Dente, L. and R. Cortese. 1983. pEMBL: a new family of single-stranded plasmids for sequencing DNA. Methods Enzymol. 155:111 - 118.
Doolittle, R. F. 1985. The genealogy of some recently evolved vertebrate proteins. Trends Biochem. Sci. 10:233 - 237.
Dover, G. A. 1982. Molecular drive: a cohesive mode of species evolution. Nature. 299:111 - 117.
Dover, G. A. and R. B. Flavell. 1984. Molecular coevolution: DNA divergence and the maintenance of function. Cell. 38:622 - 623.
Dover, G. A. 1987. DNA turnover and the molecular clock. J. Mol. Evol. 26:47 - 58.
Dryden, S. and S. Kaplan. 1990. Localization and structural analysis of the ribosomal RNA operons of Rhodobacter spheroides. Nucleic Acids Res. 18:7267 - 7277.
Dunn, T., S. Hahn, S. Ogden and R. Schleif. 1984. An operator at - 280 base pairs that is required for repression of araBAD operon promoter: addition of DNA helical turns between the operator and promoter cyclically hinders repression. Proc. Natl. Acad. Sci. USA. 81:5017 - 5020.
Durovic, P. and P. P. Dennis. 1994. Separate pathways for excision and processing of 16S and 23S rRNA from the primary rRNA operon transcript from the hyperthermophilic archaebacterium Sulfolobus acidocaldarius: similarity to eukaryotic processing. Mol. Microbiol.
13:229 - 242.
Edgerton, M. E. and P. Brimacombe. 1981. Thermodynamics of halobacterial environments. Can. J. Microbiol. 27:899 - 909.
Ellwood, M. and M. Nomura. 1982. Chromosomal locations of the genes for rRNA in Escherichia coli K-12. J. Bacteriol. 149:458 - 468.
Ernst, W. G. 1983. The early earth and the archaean rock record. In Earth's Earliest Biosphere: its origin and evolution. Edited by Schopf, J. W. Princeton University Press, Princeton, New Jersey, pp. 41 - 52.
Felsenstein, J. 1988. Phylogenies from molecular sequences: inference and reliability. Annu. Rev. Genet. 22:521 - 565.
Fitch, W. M. 1971. Toward defining the course of evolution: minimum change for a specified tree topology. Syst. Zool. 20:406 - 416.
Fitch, W. M. and J. S. Farris. 1974. Evolutionary trees with minimum nucleotide replacements from amino acid sequences. J. Mol. Evol. 3:263 - 278.
Fitch, W. 1977. On the problem of generating the most parsimonious tree. Am. Nat. 111:223 - 257.
Fitch, W. M. 1981. A non sequential method for constructing trees and hierarchical classifications. J. Mol. Evol. 18:30 - 37.
Fox, G. E., E. Stackebrandt, R. B. Hespell, J. Gibson, J. Maniloff, T. A. Dyer, R. S. Wolfe, W. E. Balch, R. Tanner, L. Magnum, L. B. Zablen, R. Blakemore, R. Gupta, L. Bonen, B. J. Lewis, D. A. Stahl, K. R. Luehrsen, K. N. Chen and C. R. Woese. 1980. The phylogeny of prokaryotes. Science, pp. 261 - 266.
Fox, G. E. 1985. The structure and evolution of archaebacterial rRNA. In The Bacteria VIII. Edited by Woese, C. R. and R. Wolfe, Academic Press, New York, pp. 257 - 310.
Garrett, R. A., S. Douthwaite and H. F. Noller. 1981. Structure and role of 5S RNA-protein complexes in protein biosynthesis. Trends Biochem. Sci. 6:137 - 139.
Garrett, R. A., J. Dalgaard, N. Larsen, J. Kjems and A. S. Mankin. 1991. Archaeal rRNA operons. Trends Biochem. Sci. 16:22 - 26.
Garrett, R. A., M. Aagaard, M. Anderson, J. Z. Dalgaard, J. Lykke-Andersen, H. N. Phan, S. Trevisanato, L. Østergaard, N. Larsen and H. Leffers. 1993.
Archaeal rRNA operons. Intron splicing and homing endonucleases, RNA polymerase operons and phylogeny. In Molecular Biology of Archaea. Edited by Pfeifer, F., P. Palm and K. H. Schleifer, Gustav Fisher Verlag GmbH and Co, Stuttgart, pp. 180 - 191.
Gerbi, S. A. 1985. Evolution of ribosomal DNA. In Molecular Evolutionary Genetics. Edited by McIntyre, R. J., Plenum Press, New York, pp. 419 - 517.
Gilbert, W. 1978. Why Genes in Pieces? Nature. 271:501.
Gilbert, W. 1986. The RNA World. Nature. 319:618.
Ginzburg, M., L. Sachs and B. Z. Ginzburg. 1970. Ion metabolism in a Halobacterium I. Influence of age of culture on intracellular concentrations. J. Gen. Physiol. 55:187 - 207.
Gogarten, J. P., H. Kibak, Dittrich, L. Taiz, E. J. Bowman, B. J. Bowman, M. F. Manolson, R. J. Poole, T. E. Date, T. Oshima, J. Konishi, K. Denda and M. Yoshida. 1989. Evolution of the vacuolar H+-ATPase: implication for the origin of eukaryotes. Proc. Natl. Acad. Sci. USA. 86:9355 - 9359.
Green, R. and Szostak. 1992. Selection of a ribozyme that functions as a superior template in a self-copying reaction. Science. 258:1910 - 1915.
Griffiths, E. and S. T. Bayley. 1969. Properties of transfer ribonucleic acid and aminoacyl transfer ribonucleic acid synthetases from the extreme halophile, Halobacterium cutirubrum. Biochem. 8:541 - 551.
Gropp, F., W. D. Reiter, A. Sentenac, W. Zillig, R. Schnabel, M. Thomm and K. O. Stetter. 1986. Homologies of components of DNA-dependent RNA polymerases of archaebacteria, eukaryotes and eubacteria. System. Appl. Microbiol. 7:95 - 101.
Gunderson, J. H., M. L. Sogin, G. Walters and T. F. McCutchan. 1987. Structurally distinct, stage specific ribosomes occur in Plasmodium. Science. 238:933 - 937.
Gupta, R., J. M. Lanter and C. R. Woese. 1983. Sequence of the 16S ribosomal RNA from Halobacterium volcanii, an archaebacterium. Science. 221:656 - 659.
Haldane, J. B. S. 1932. The Causes of Evolution. Longmans and Green, London.
Hamilton, P. T. and J. N. Reeve. 1985.
Structure of genes and an insertion element in the methane producing archaebacterium Methanobrevibacter smithii. Mol. Gen. Genet. 200:47 - 59.
Hancock, J. M. and G. A. Dover. 1988. Molecular coevolution among cryptically simple expansion segments in eukaryotic 26S/28S rRNA. Mol. Biol. Evol. 5:377 - 392.
Hancock, J. M. and G. A. Dover. 1990. Compensatory slippage in the evolution of ribosomal RNA genes. Nucleic Acids Res. 18:5949 - 5954.
Heinonin, T. V. K., M. N. Schnare and M. W. Gray. 1990. Sequence heterogeneity in the duplicated large subunit ribosomal RNA genes of Tetrahymena pyriformis mitochondrial DNA. J. Biol. Chem. 265:22336 - 22341.
Henikoff, S. 1987. Exonuclease III generated deletions for DNA sequence analysis. Promega Notes. Promega Corp. No. 8. pp. 1 - 3.
Huet, J., R. Schnabel, A. Sentenac and W. Zillig. 1983. Archaebacteria and eukaryotes possess DNA-dependent RNA polymerase of a common type. EMBO J. 2:1291 - 1294.
Hui, I. and P. P. Dennis. 1985. Characterization of the ribosomal RNA gene clusters in Halobacterium cutirubrum. J. Biol. Chem. 260:899 - 906.
Huber, P. W. and I. G. Wool. 1984. Nuclease protection analysis of ribonucleoprotein complexes: use of the cytotoxic ribonuclease alpha-sarcin to determine the binding sites for E. coli ribosomal proteins L5, L18 and L25 on 5S RNA. Proc. Natl. Acad. Sci. USA. 8:322 - 326.
Irani, M. H., L. Orosz and S. Adhya. 1983. A control element within a structural gene: the gal operon of Escherichia coli. Cell. 32:783 - 788.
Iwabe, N., K. Kuma, M. Hasegawa, S. Osawa and T. Miyata. 1989. Evolutionary relationship of archaebacteria, eubacteria and eukaryotes inferred from phylogenetic trees of duplicated genes. Proc. Natl. Acad. Sci. USA. 86:9355 - 9359.
Jarsch, M. and A. Bock. 1985. Sequence of the 16S ribosomal RNA gene from Methanococcus vanielli: evolutionary implications. System. Appl. Microbiol. 6:54 - 59.
Joshi, P. and P. P. Dennis. 1993.
Structure, function and evolution of the family of superoxide dismutase proteins from halophilic archaebacteria. J. Bacteriol. 175:1572 - 1579.
King, T. C. and D. Schlessinger. 1983. S1 nuclease mapping analysis of ribosomal RNA processing in wild-type and processing deficient Escherichia coli. J. Biol. Chem. 258:12034 - 12042.
Kjems, J. and R. A. Garrett. 1987. Novel expression of the rRNA genes in the extreme thermophile and archaebacterium Desulfurococcus mobilis. EMBO J. 6:3521 - 3530.
Kjems, J. and R. A. Garrett. 1990. Secondary structural elements exclusive to the sequence flanking ribosomal RNAs lends support to the monophyletic nature of the archaebacteria. J. Mol. Evol. 31:25 - 32.
Kjems, J., H. Leffers, T. Olesen, I. Holz and R. A. Garrett. 1990. Sequence, organization and transcription of the ribosomal RNA operon and the downstream tRNA and protein genes in the archaebacterium Thermofilum pendens. Syst. Appl. Microbiol. 13:117 - 127.
Kjems, J. and R. A. Garrett. 1991. Ribosomal RNA introns in archaea and evidence for RNA conformational changes associated with splicing. Proc. Natl. Acad. Sci. USA. 88:439 - 443.
Klein, A., R. Allmansberger, R. Bokranz, M. Knaub, S. B. Müller and E. Muth. 1988. Comparative analysis of genes encoding methyl coenzyme M reductase in methanogenic bacteria. Mol. Gen. Genet. 213:409 - 420.
Klenk, H. P., P. Palm, Lottspeich and W. Zillig. 1992. Component H of the DNA-dependent RNA polymerases of Archaea is homologous to a subunit shared by the three eukaryal nuclear RNA polymerases. Proc. Natl. Acad. Sci. USA. 89:407 - 410.
Lake, J. A. 1987a. A rate-independent technique for analysis of nucleic acid sequences: evolutionary parsimony. Mol. Biol. Evol. 4:167.
Lake, J. A. 1987b. Determining evolutionary distances from highly diverged nucleic acid sequences. J. Mol. Evol. 26:59 - 73.
Lake, J. A. 1988. Origin of the eukaryotic nucleus determined by rate-invariant analysis of rRNA sequences. Nature. 331:184 - 186.
Lanyi, J. K.
and M. P. Silverman. 1972. The state of binding of intracellular K+ in Halobacterium cutirubrum. Can. J. Microbiol. 18:993 - 995.
Lanyi, J. K. 1974. Salt-dependent properties of proteins from extremely halophilic bacteria. Bact. Rev. 38:272 - 290.
Lanyi, J. 1979. Physicochemical aspects of salt dependence in halobacteria. In Strategies of microbes in extreme environments. Edited by Shilo, M. Verlag Chemie: Weinheim, FRG. pp. 93 - 107.
Larsen, A. and H. Weintraub. 1982. An altered DNA conformation detected by S1 nuclease occurs at specific regions in active chicken globin chromatin. Cell. 29:609 - 622.
Larsen, N., H. Leffers, J. Kjems and R. A. Garrett. 1986. Evolutionary divergence between the ribosomal RNA operons of Halococcus morrhuae and Desulfurococcus mobilis. System. Appl. Microbiol. 7:49 - 57.
Lechner, K., G. Wich and A. Bock. 1985. The nucleotide sequence of the 16S rRNA gene and flanking regions from Methanobacterium formicum: the phylogenetic relationship between methanogenic and halophilic archaebacteria. System. Appl. Microbiol. 6:157 - 163.
Leffers, H. and R. A. Garrett. 1984. The nucleotide sequence of the 16S rRNA gene of the archaebacterium Halococcus morrhuae. EMBO J. 3:1613 - 1619.
Leffers, H., J. Kjems, L. Østergaard, N. Larsen and R. A. Garrett. 1987. Evolutionary relationships amongst archaebacteria: a comparative study of 23S rRNAs of a sulphur-dependent extreme thermophile, an extreme halophile and a thermophilic methanogen. J. Mol. Biol. 195:43 - 61.
Leffers, H., J. Egebjerg, A. Anderson, T. Christiensen and R. A. Garrett. 1988. Domain VI of Escherichia coli 23S ribosomal RNA-structure, assembly and function. J. Mol. Biol. 204:507 - 522.
Leffers, H., F. Gropp, F. Lottspeich, W. Zillig and R. A. Garrett. 1989. Sequence, organization, transcription and evolution of RNA polymerase subunit genes from the archaebacterial extreme halophiles Halobacterium halobium and Halococcus morrhuae. J. Mol. Biol. 207:1 - 19.
Lewin, B. 1990.
Genes for rRNA are repeated and are transcribed as a tandem unit. In Genes IV, Oxford University Press, New York and Cell Press, Cambridge, Mass. pp. 511 - 517.
Lewin, R. 1986. RNA catalysis gives fresh perspective on the origin of life. Science. 231:545 - 546.
Li, W. and D. Graur. 1991. Concerted evolution of multigene families. In Fundamentals of Molecular Evolution, Sinauer Associates, Inc. Publishers, Sunderland, Massachusetts, pp. 162 - 169.
Louis, B. G. and P. S. Fitt. 1971a. Nucleic acid enzymology of extremely halophilic bacteria: Halobacterium cutirubrum deoxyribonucleic acid polymerase. Biochem. J. 121:621 - 627.
Louis, B. G. and P. S. Fitt. 1971b. Halobacterium cutirubrum RNA polymerase: subunit composition and salt-dependent template specificity. FEBS Lett. 14:143 - 145.
Louis, B. G. and P. S. Fitt. 1972a. Isolation and properties of highly purified Halobacterium cutirubrum deoxyribonucleic acid-dependent ribonucleic acid polymerase. Biochem. J. 127:69 - 80.
Louis, B. G. and P. S. Fitt. 1972b. The role of Halobacterium cutirubrum deoxyribonucleic acid polymerase subunits in initiation and polymerization. Biochem. J. 127:81 - 86.
Maden, B. E., C. L. Dent, T. E. Farrell, J. Garde, F. McCallum and J. A. Wakeman. 1987. Human rRNA heterogeneity. Biochem. J. 246:519 - 527.
Majumdar, A. and S. Adhya. 1984. Demonstration of two operator elements in gal: in vitro repressor binding studies. Proc. Natl. Acad. Sci. USA. 81:6100 - 6104.
Maly, P. and R. Brimacombe. 1983. Refined secondary structure models for the 16S and 23S ribosomal RNAs of Escherichia coli. Nucl. Acids Res. 11:7263 - 7286.
Maniatis, T., E. F. Fritsch and J. Sambrook. 1982. Molecular cloning: a laboratory manual, Cold Spring Harbour Laboratories, Cold Spring Harbour, New York.
Mankin, A. S. and A. M. Kopylov. 1981. Secondary structure model for mitochondrial 12S rRNA: an example of economy in rRNA structure. Biochem Int. 3:587 - 593.
Mankin, A. S., N. L. Teterina, P. M. Rubtsov, L. A.
Baratova and V. K. Kagramanova. 1984. Putative promoter region of rRNA operon from archaebacterium Halobacterium halobium. Nucleic Acids Res. 12:6537 - 6546.
Mankin, A. S. and V. K. Kagramanova. 1986. Complete nucleotide sequence of the single ribosomal RNA operon of Halobacterium halobium: secondary structure of the archaebacterial 23S rRNA. Mol. Gen. Genet. 202:152 - 161.
Mankin, A. S., E. A. Skripkin and V. K. Kagramanova. 1987. A putative internal promoter in the 16S/23S intergenic spacer of the rRNA operon of archaebacteria and eubacteria. FEBS Lett. 219:269 - 273.
Mason, S. W., J. Li and J. Greenblatt. 1992. Host factor requirements for processive antitermination of transcription and suppression of pausing by the N protein of bacteriophage λ. J. Biol. Chem. 267:19418 - 19426.
Maxam, A. and W. Gilbert. 1980. Sequencing end-labelled DNA with base-specific chemical cleavages. Methods Enzymol. 65:499 - 560.
May, B. P., P. Tarn and P. P. Dennis. 1987. Superoxide dismutase from the extremely halophilic archaebacterium Halobacterium cutirubrum. J. Bacteriol. 169:1417 - 1422.
McCutchan, T. F., V. F. d. l. Cruz, A. A. Lai, J. H. Gunderson, H. J. Ellwood and M. L. Sogin. 1988. Primary sequence of two small subunit ribosomal RNA genes from Plasmodium falciparum. Mol. Biochem. Parasitol. 28:63 - 68.
Meier, N., H. U. Goringer, B. Kleuvers, U. Scheibe, J. Eberle, C. Szymkowiak, M. Zacharias and R. Wager. 1986. The importance of individual nucleotides for the structure and function of rRNA molecules in E. coli: a mutagenesis study. FEBS Lett. 204:89 - 95.
Mevarech, M. and R. Wercyberger. 1985. Genetic transfer in Halobacterium volcanii. J. Bacteriol. 162:461 - 463.
Mevarech, M., S. Hirsch-Twizer, S. Goldman, S. Yakobson, H. Eisenberg and P. P. Dennis. 1989. Isolation and characterization of the rRNA gene clusters of Halobacterium marismortui. J. Bacteriol. 171:3479 - 3485.
Moazed, D. and H. Noller. 1989. Intermediate states in the movement of transfer RNA in the ribosome.
Nature. 342:142 - 148.
Mullakhanbhai, M. F. and H. Larsen. 1975. Halobacterium volcanii spec. nov.; a dead sea Halobacterium with moderate salt requirement. Arch. Microbiol. 104:207 - 214.
Muller, H. J. 1935. The origination of chromatin deficiencies as minute deletions subject to insertion elsewhere. Genetics. 17:237 - 252.
Mylvaganam, S. and P. P. Dennis. 1992. Sequence heterogeneity between the two genes encoding 16S rRNA from the halophilic archaebacterium Haloarcula marismortui. Genetics. 130:399 - 410.
Nazar, R. N. 1991. Higher order structure of the ribosomal 5S RNA. J. Biol. Chem. 266:4562 - 4567.
Neumann, H., A. Gierl, J. J. Leibrock, D. Straiger and W. Zillig. 1983. Organizations for the genes for ribosomal RNA in archaebacteria. Mol. Gen. Genet. 192:66 - 72.
Nodwell, J. R. and J. Greenblatt. 1993. Recognition of box A antiterminator RNA by the E. coli antitermination factors Nus B and ribosomal protein S10. Cell. 72:261 - 268.
Noller, H. F., J. Kop, V. Wheaton, J. Brosius, R. R. Gutell, A. M. Kopylov, F. Dohme, W. Heir, D. A. Stahl, R. Gupta and C. R. Woese. 1981. Secondary structure model for 23S ribosomal RNA. Nucleic Acids Res. 9:6167 - 6189.
Noller, H. F., D. Moazed, S. Stern, T. Powers, P. Allen, J. Robertson, B. Weiser and K. Triman. 1990. Structure of rRNA and its functional interactions during translation. In The Ribosome: Structure, Function and Evolution. Edited by Hill, W. E., Dahlberg, A., Garrett, R. A., Moore, P. B., Schlessinger, D. and Warner, J. R., American Society for Microbiology, Washington, D.C., pp. 73 - 92.
Noller, H. F. 1991. Ribosomal RNA and translation. Annu. Rev. Biochem. 60:191 - 227.
Noller, H. F., V. Hoffarth and L. Zimniak. 1992. Unusual resistance of peptidyl transferase to protein extraction procedures. Science. 256:1416 - 1419.
Oakes, M. I., A. Scheinman, T. Acha, G. Shankwieler and J. Lake. 1990. Ribosome structure: three dimensional locations of rRNA and proteins. In The Ribosome: Structure, Function and Evolution.
Edited by Hill, W. E., Dahlberg, A., Garrett, R. A., Moore, P. B., Schlessinger, D. and Warner, J. R., American Society for Microbiology, Washington, D.C., pp. 180 - 193.
Ochman, H. and A. Wilson. 1987. Evolution in bacteria: evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26:74 - 86.
Ohno, S. 1970. Evolution by Gene Duplication. Springer-Verlag, Berlin.
Ohta, T. 1980. Evolution and Variation of Multigene Families. Springer-Verlag, Berlin.
Ohta, T. 1991. Multigene families and the evolution of complexity. J. Mol. Evol. 33:34 - 41.
Oren, A., P. P. Lau and G. E. Fox. 1988. The taxonomic status of Halobacterium marismortui from the dead sea; a comparison with Halobacterium vallismortis. System. Appl. Microbiol. 10:251 - 258.
Østergaard, L., N. Larsen, H. Leffers, J. Kjems and R. A. Garrett. 1987. A rRNA operon and its flanking region from the archaebacterium Methanobacterium thermoautotrophicum. Syst. Appl. Microbiol. 9:199 - 209.
Pace, N. and T. L. Marsh. 1985. RNA Catalysis and the origin of life. Origins of Life. 16:97 - 116.
Pace, N. R., G. J. Olsen and C. R. Woese. 1986. Ribosomal RNA phylogeny and the primary lines of evolutionary descent. Cell. 45:325 - 326.
Pace, N. R. 1991. Origin of life-facing up to the new physical setting. Cell. 65:531 - 533.
Patthy, L. 1985. Evolution of the proteases of blood coagulation and fibrinolysis by assembly from modules. Cell. 41:657 - 663.
Perlman, P. S. and R. A. Butow. 1989. Mobile introns and intron-encoded proteins. Science. 246:1106 - 1109.
Piatigorsky, J., W. E. O'Brien, B. L. Norman, K. Kalumuck, G. J. Wistow, T. Borras, J 1988. Gene sharing by delta-crystallin and arginosuccinate lyase. Proc. Natl. Acad. Sci. USA. 85:3479 - 3483.
Piccirilli, J. A., T. S. McConnell, A. J. Zaug, H. F. Noller and T. R. Cech. 1992. Aminoacyl esterase activity of Tetrahymena ribozyme. Science. 256:1420 - 1424.
Pieler, T. and J. Hamm. 1987.
The 5S gene internal control region is composed of three distinct sequence elements, organized as two functional domains of the variable spacing. Cell. 48:91 - 100.
Pilipenko, E. V., S. V. Maslova, A. N. Sinyakov and V. I. Agol. 1992. Towards identification of cis-acting elements involved in the replication of enterovirus and rhinovirus RNAs: a proposal for the existence of tRNA-like terminal structures. Nucleic Acids Res. 20:1739 - 1745.
Pühler, G., H. Leffers, F. Gropp, P. Palm, H. P. Klenk, F. Lottspeich, R. A. Garrett and W. Zillig. 1989a. Archaebacterial DNA-dependent RNA polymerases testify to the evolution of the eukaryotic nuclear genome. Proc. Natl. Acad. Sci. USA. 86:4569 - 4573.
Pühler, G., F. Lottspeich and W. Zillig. 1989b. Organization and nucleotide sequence of the genes encoding the large subunits A, B and C of the DNA-dependent RNA polymerase of the archaebacterium Sulfolobus acidocaldarius. Nucleic Acids Res. 17:4517 - 4523.
Ramagopal, S. 1992. Are eukaryotic ribosomes heterogeneous? Affirmations on the horizon. Biochem. Cell. Biol. 70:269 - 272.
Rao, A. L. N., T. W. Dreher, L. E. Marsh and T. C. Hall. 1989. Telomeric function of the tRNA-like structure of brome mosaic virus RNA. Proc. Natl. Acad. Sci. 86:5335 - 5339.
Raue', H. A., J. Klootwijk and W. Musters. 1988. Evolution conservation of structure and function of high molecular weight rRNA. Prog. Biophys. Mol. Biol. 51:77 - 129.
Raue', H. A., W. Musters, C. A. Rutgers, J. V. Riet and R. J. Planta. 1990. rRNA: from structure to function. In The Ribosome: Structure, Function and Evolution. Edited by Hill, W. E., Dahlberg, A., Garrett, R. A., Moore, P. B., Schlessinger, D. and Warner, J. R., American Society for Microbiology, Washington, D.C., pp. 217 - 235.
Ree, H. K., K. Cas, D. Thurlow and R. A. Zimmermann. 1989. Structure and organization of the 16S rRNA gene from the archaebacterium Thermoplasma acidophilum. Can. J. Microbiol. 35:124 - 133.
Reiter, W., P. Palm, W. Voos, J. Kanieki, B.
Grampp, W. Schulz and W. Zillig. 1987. Putative promoter elements for the ribosomal RNA genes of the thermoacidophilic archaebacterium Sulfolobus sp. strain B12. Nucleic Acids Res. 15:5581 - 5595.
Reiter, W. A., P. Palm and W. Zillig. 1988. Analysis of transcription in the archaebacterium Sulfolobus indicates that archaebacterial promoters are homologous to eukaryotic Pol II promoters. Nucl. Acids Res. 16:1 - 10.
Reiter, W. D., U. Hudepohl and W. Zillig. 1990. Mutational analysis of an archaebacterial promoter: essential role of a TATA box for transcription efficiency and start-site selection in vitro. Proc. Natl. Acad. Sci. USA. 87:9509 - 9513.
Rosenshine, I., R. Tchellet and M. Mevarech. 1989. The mechanism of DNA transfer in the mating system of an archaebacterium. Science. 245:1387 - 1389.
Sanger, F., S. Nicklen and A. R. Coulson. 1977. DNA sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci. USA. 74:5463 - 5467.
Sanz, J. L., I. Marin, L. Ramirez, J. P. Abad, C. L. Smith and R. Amils. 1988. Variable rRNA gene copies in extreme halobacteria. Nucleic Acids Res. 16:7827 - 7832.
Schopf, J. W. 1993. Microfossils of the early archean apex chert: new evidence of the antiquity of life. Science. 260:640 - 646.
Schwartz, R. M. and M. O. Dayhoff. 1978. Origins of prokaryotes, eukaryotes, mitochondria and chloroplasts. Science. 199:395 - 405.
Sharp, P. A. 1985. On the origin of RNA splicing and introns. Cell. 42:397 - 400.
Shevaick, A., H. S. Gewity, B. Hennemann, A. Yonath and H. G. Wittmann. 1985. Characterization and crystallization of ribosomal particles from Halobacterium marismortui. FEBS Lett. 184:68 - 71.
Shimmiri, L. and P. P. Dennis. 1989. Characterization of the L11, L1, L10 and L12 equivalent ribosomal protein gene cluster of the halophilic archaebacterium Halobacterium cutirubrum. EMBO J. 8:1225 - 1235.
Shine, J. and L. Dalgarno. 1974.
The 3'-terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proc. Natl. Acad. Sci. USA. 71:1342 - 1346.
Southern, E. M. 1975. Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98:503 - 517.
Stern, S., B. Weiser and H. F. Noller. 1988. Model for the three dimensional folding of 16S ribosomal RNA. J. Mol. Biol. 204:447 - 481.
Stockenius, W., R. H. Lozier and R. A. Bogomolini. 1979. Bacteriorhodopsin and the purple membrane of halobacteria. Biochem. Biophys. Acta. 505:215 - 298.
Stockenius, W. and R. A. Bogomolni. 1982. Bacteriorhodopsin and related pigments of halobacteria. Annual Rev. Biochem. 52:587 - 616.
Swofford, D. L. 1993. PAUP: Phylogenetic Analysis Using Parsimony version 3.1. Laboratory of Molecular Systematics, Smithsonian Institution, Washington, D.C.
Tapprich, W., H. U. Goringer, E. d. Slasio, C. Prescott and D. A. E. 1990. Studies of ribosomal function by mutagenesis of Escherichia coli rRNA. In The Ribosome: Structure, Function and Evolution. Edited by Hill, W. E., Dahlberg, A., Garrett, R. A., Moore, P. B., Schlessinger, D. and Warner, J. R., American Society for Microbiology, Washington, D.C., pp. 236 - 242.
Thompson, L. and C. Daniels. 1988. A tRNA(Trp) intron endonuclease from H. volcanii: Unique substrate recognition properties. J. Biol. Chem. 263:17951 - 17956.
Thompson, L. D., L. D. Brandon, D. T. Nieuwlandt and C. J. Daniels. 1989. Transfer RNA intron processing in the halophilic archaebacterium Halobacterium cutirubrum. J. Microbiol. 35:36 - 42.
Thurlow, D. L. and R. A. Zimmerman. 1982. Evolution of protein binding regions of archaebacteria, eubacteria and eukaryotic ribosomal RNAs. In Archaebacteria. Edited by Kandler, O., Gustav Fisher Verlag, Stuttgart, pp. 347 - 357.
Vogeli, G., H. Ohkubu, M. E. Sobel, Y. Yamada, I. Pastan and B. d. Crombrugghe. 1981. Structure of the promoter for chicken type I collagen gene. Proc. Natl.
Acad. Sci. USA. 78:5334 - 5438.
Wais, A. S. 1985. Cellular morphogenesis in a halophilic archaebacterium. Curr. Microbiol. 12:191 - 196.
Walter, M. R. 1983. Archaean stromatolites: evidence of the earth's earliest benthos. In Earth's Earliest Biosphere: its origin and evolution. Edited by Schopf, J. W. Princeton University Press, Princeton, New Jersey, pp. 187 - 213.
Walters, A. P., C. Syin and T. F. McCutchan. 1989. Developmental regulation of stage-specific, ribosome population in Plasmodium. Nature. 342:438 - 440.
Watson, J. D., N. H. Hopkins, J. W. Roberts, J. A. Steitz and A. M. Weiner. 1987. In Molecular Biology of the Gene. The Benjamin Cummings Publishing Company, Inc., Toronto, 4th edition, Chapter 28, pp. 1095 - 1130.
Wich, G., H. Hummel, M. Jarsch, U. Bar and A. Bock. 1986a. Transcription signals for stable RNA gene in Methanococcus. Nucleic Acids Res. 14:2459 - 2479.
Wich, G., L. Sibold and A. Bock. 1986b. Genes for tRNA and their putative expression signals in Methanococcus. System. Appl. Microbiol. 7:18 - 25.
Weiner, A. M. and N. Maizels. 1987. 3' Terminal tRNA-like structures tag genomic RNA molecules for replication: implications for the origin of protein synthesis. Proc. Natl. Acad. Sci. USA. 84:7383 - 7387.
Weiner, A. M. and N. Maizels. 1991. The genomic tag model for the origin of protein synthesis: further evidence from the molecular fossil record. In Evolution of Life: fossils, molecules and culture. Edited by Osawa, S. and Honjo, T., Springer-Verlag, Tokyo, pp. 51 - 66.
Wilson, A., S. Cartson and T. White. 1977. Biochemical evolution. Annu. Rev. Biochem. 46:573 - 639.
Wittmann-Liebold, B., A. Kopke, E. Ardnt, W. Kromer, T. Hetakeyhama and H. G. Wittmann. 1990. Sequence comparison and evolution of ribosomal proteins and their genes. In The Ribosome: Structure, Function and Evolution. Edited by Hill, W. E., Dahlberg, A., Garrett, R. A., Moore, P. B., Schlessinger, D. and Warner, J. R., American Society for Microbiology, Washington, D.C., pp.
598 - 616.
Woese, C. R., R. Gutell, R. Gupta and H. F. Noller. 1983. Detailed analysis of the higher-order structure of 16S-like ribosomal nucleic acids. Microbiol. Rev. 47:621 - 669.
Woese, C. R., R. Gupta, C. M. Hahn, W. Zillig and J. Tu. 1984. The phylogenetic relationships of three sulfur-dependent archaebacteria. System. Appl. Microbiol. 5:97 - 106.
Woese, C. R. and G. Olsen. 1986. Archaebacterial phylogeny: perspectives on the urkingdoms. System. Appl. Microbiol. 7:161 - 177.
Woese, C. R. 1987. Bacterial Evolution. Microbiol. Rev. 51:221 - 271.
Woese, C. R., O. Kandler and M. L. Wheelis. 1990. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc. Natl. Acad. Sci. USA. 87:4576 - 4580.
Woese, C. R. and N. R. Pace. 1993. Probing RNA structure, function and history by comparative analysis. In The RNA World. Edited by Gesteland, R. F. and Atkins, J. F., Cold Spring Harbour Laboratory Press, Cold Spring Harbour, pp. 91 - 117.
Wolffe, A. P. and D. D. Brown. 1988. Developmental regulation of two 5S ribosomal RNA genes. Science. 241:1626 - 1632.
Zhang, H., R. Scholl, J. Browse and C. Somerville. 1988. Double stranded DNA sequencing as a choice for DNA sequencing. Nucleic Acids Res. 16:1220.
Zillig, W. and K. O. Stetter. 1980. Genetics and evolution of RNA polymerase, tRNA and ribosomes. Edited by Osawa, S., Ozeki, H., Uchida, H. and Yura, T., University of Tokyo Press, Tokyo, pp. 101 - 126.
Zillig, W., K. O. Stetter, R. Schnabel, J. Madon and A. Gierl. 1982a. Transcription in archaebacterium. Zentralbl. Bakteriol. Microbiol. Hyg. Abstr. 1. Orig. C. 3:218 - 227.
Zillig, W., R. Schnabel, J. Tu and K. O. Stetter. 1982b. The phylogeny of archaebacterium, including novel anaerobic thermoacidophiles, in the light of RNA polymerase structure. Naturwissenschaften. 69:197 - 204.
Zillig, W., K. O. Stetter, R. Schnabel and M. Thomm. 1985. DNA-dependent RNA polymerases of the archaebacteria. In The Bacteria VIII. Edited by Woese, C.
R. and Wolfe, R., New York Academic Press, New York, pp. 499 - 523.
Zimmerman, R. A., O. L. Thurlow, R. S. Finn, T. L. Marsh and L. K. Ferrett. 1980. Conservation of specific protein-rRNA interaction in ribosome evolution. In Genetics and Evolution of RNA polymerase, tRNA and Ribosomes. Edited by Osawa, S., Ozeki, H., Uchida, H., and Yura, T., University of Tokyo Press, Tokyo, pp. 569 - 580.
Zweib, C., D. K. Jemiolo, W. F. Jacob, R. Wagner and A. E. Dahlberg. 1986. Characterization of a collection of deletion mutants at the 3'-end of 16S ribosomal RNA of Escherichia coli. Mol. Gen. Genet. 203:256 - 264.

APPENDIX

Table A.1 Endonuclease digestion fragments smaller than 3.0 kb obtained from the rrnA and rrnB operons of Ha. marismortui (see Figure 3.1). Comparison of these fragments with the restriction maps published by Mevarech et al. (1989) revealed discrepancies. The table below compares the maps published by Mevarech et al. (1989) with the fragments observed on the gel shown in Figure 3.1. The following are marked: fragments that appear on the gel but are not shown on the map (§); fragments shown on the map but not observed on the gel (*); restriction sites that are not indicated at the correct positions on the map (¶); and restriction sites present in the vector (#).

Restriction Enzymes  Fragments from the rrnA operon (A) in base pairs  Fragments from the rrnB operon (A) in base pairs
ClaI-EcoRI  870  2200  880  2400  1600 2700 (partial digestion product)
ClaI-KpnI  2300  450 (§)  2800 (*)  1700 1900
ClaI-PstI  400  1850 (¶)  420  2200 (¶ #)  600 (§)  3000 (#)  1500 (§) 2100
ClaI-SmaI  580  2300 (#)
ClaI-XhoI  900 (§)  <450 (*)  920 (§)  450  1150 (§)  780  1900 (§)  1600 (#)  2600
HindIII-EcoRI  900 -  2800 (#)  1500  2800 (#)
HindIII-KpnI  2800 (#)  300  2800 (#)
HindIII-PstI  400  2800 (#)  1200 (§)  1500 (§)  2100  2800 (#)
HindIII-XhoI  900 (§)  450  920 (§)  780  1150 (§)  1600 (#)  1900 (§)  2800 (#)  2800 (#)
HindIII-SphI  2800 (#)  2800 (#)

Table A.2 Oligonucleotides used for sequencing the Ha. marismortui rrnA and rrnB operons and for primer extension analysis of the Ha. marismortui ribosomal RNAs.

Name  Sequence (5'-3')  Size  Description
oPD33  GTCCGATTTAGCCATGCTAG  20mer  Hma rrnA and rrnB operons, forward, 16S rRNA at position 39-58.
oPD34  CTAGCATGGCTAAATCGGAC  20mer  Hma rrnA and rrnB operons, reverse, 16S rRNA at position 39-58.
oPD36  ATGCGGGGTTAGGCGGG  17mer  Hma rrnA operon, 5'-flanking of 16S rRNA at position -14 to -30.
oPD37  AGCGGCGGAAGTGGTTGC  18mer  Hma rrnB operon, 5'-flanking of 16S rRNA at position -10 to -30.
oPD38  CTG(CT)GGCTGGATCACCTCCT  20mer  Hma rrnA and rrnB operons, 16S rRNA at position 1453-1472.
oPD39  TTAAGTGTGGGACGGCG  17mer  Hma rrnA and rrnB operons, within the 16S-23S intergenic spacer at position 106-123 in rrnA and position 128-145 in rrnB.
oPD40  CCGCCATGTTCAGGAA  17mer  Hma rrnB operon, 5'-flanking of 16S rRNA, position -301 to -318.
oPD41  CTGTTACACTTAAGGGC  17mer  Hma rrnA operon, 5'-flanking of 16S rRNA, position -301 to -318.
oPD43  CCATCGCCGTCCCACACTTAA  21mer  Hma rrnA and rrnB intergenic spacer at position 292-311 of rrnB and position 106-125 of rrnA.
oPD44  GCTTGGCACGTCCTTCATCAG  21mer  Hma rrnA and rrnB 23S rRNA at positions 39-60.
oPD45  CCAAGGGCCGGATTTGAACC  20mer  tRNA-Cys reverse primer for Ha. marismortui and Hb. cutirubrum, position 73-53.
oPD46  GGCGAATCGACCCTTCCCAG  20mer  16S-23S intergenic spacer primer for mapping Pi and processing sites, reverse, position 290-271 in rrnA and 291-311 in rrnB.
oPD48  CTCACATCAGATTCCATCTT  20mer  Hma rrnB operon at 5'-flanking of 16S rRNA at position 98-118.
oPD49  CGCTCAGGTCGGCACAATCC  20mer  Hma rrnB operon at 5'-flanking of 16S rRNA at position -464 to -444.
oSHM1  GGGTGTGCGCGTCGAGG  17mer  Hma rrnA and rrnB 23S rRNA at position 2855-2862.
oSHM2  GAACCTCCAACTCCGT  17mer  Hma rrnA and rrnB 23S rRNA at position 1568-1585.
oSHM3  GGCGGGGGTAACTATGA  17mer  Hma rrnA and rrnB 23S rRNA at position 1942-1959.
oSHM4  AGCATAGGTAGGAGTCG  17mer  Hma rrnA and rrnB 23S rRNA at position 2151-2168.
oSHM5  AGGGAGTACTGGAGTGC  17mer  Hma rrnA and rrnB 5S rRNA at position 74-81.
oSHM6  GGTCGAAGGGGTTGGCG  17mer  Hma rrnA and rrnB 3'-flanking region of 5S RNAs at position 210-237.
oSHM7  TGGGTGTGTAATGGTGTCTG  20mer  Hma rrnA and rrnB 23S rRNA at position 2649-2668.
oSHM8  AACGAGGAACGCTGACG  17mer  Hma rrnA and rrnB 23S rRNA at position 2390-2407.
oSHM9  GAAAATCCTGGCCATAG  17mer  Hma rrnA and rrnB 23S rRNA at position 1347-1354.
oSHM10  GAACAACCCAGAGATAG  17mer  Hma rrnA and rrnB 23S rRNA at position 1075-1182.
oSHM11  GAAAGGCACGTGGAAGT  17mer  Hma rrnA and rrnB 23S rRNA at position 803-820.
oSHM12  GTAACCGCGAGTGAACG  17mer  Hma rrnA and rrnB 23S rRNA at position 212-229.
oCW37  GGTGGTGCATGGCCG  15mer  16S-like primer, forward, E. coli 1047-1061.
oCW38  GCATGGC(CT)G(CT)CGTCAG  15mer  16S-like primer, forward, E. coli 1053-1068.
oCW40  TGGGTCTCGCTCGTTG  16mer  16S-like primer, reverse, E. coli 1115-1100.
oCW40  TCTTAAGGTAGCGAA  15mer  23S-like primer, forward, E. coli 1922-1937.
oCW42  CCATTGTAGC(GC)CGCGTG  17mer  16S-like primer, reverse, E. coli 1242-1226.
oCW43  CGACCGCCCCAGTCAAACTG  20mer  23S-like primer, reverse, E. coli 2260-2280.
oCW45  ACACGCGTGCTACAAT  16mer  16S-like primer, forward, E. coli 1225-1240.
oCW46  ACGGGCGGTGTGT(GA)C  15mer  16S-like primer, reverse, E. coli 1406-1392.
oCW48  GGTTACCTTGTTACGACTT  19mer  16S-like primer, reverse, E. coli 1510-1492.
oCW54  GCTGAAAGCATCTAAG  16mer  23S-like primer, forward, E. coli 2744-2759.
oCW59  CGCCGGAAGGGCAAGGGTTCC  21mer  23S-like primer, forward, E. coli 1315-1335.
oCW70  GCTTTTCACGGGCCCC  16mer  23S-like primer, reverse, E. coli 1575-1560.
oCW76  GCCCAGTGCCGGTATGTG  18mer  23S-like primer, forward, E. coli 1835-1852.
oCW78  AGAGGGTGAAAGCCCCGT  18mer  23S-like primer, forward, E. coli 322-339.

Table A.3 Plasmids and strains used for the characterization of the rrnA and rrnB operons in Ha. marismortui.

Strain number  Size of the insert  Vector  Host  Description
pD926  2.0 kb  pGEM 7+  JM109  A SmaI-HindIII insert consisting of part of the 16S rRNA gene, the 16S-23S spacer and part of the 23S rRNA gene from the rrnA operon.
pD927  1.6 kb  pGEM 7+  JM109  A HindIII-SmaI insert consisting of the 5'-flanking region of 16S rRNA and part of the 16S rRNA gene from the rrnB operon.
pD928  2.6 kb  pGEM 7+  JM109  A HindIII-SmaI insert consisting of the 5'-flanking region of 16S rRNA and part of the 16S rRNA gene from the rrnA operon.
pD929  2.8 kb  pGEM 7+  JM109  A SmaI-HindIII insert consisting of part of the 16S rRNA gene, the 16S-23S spacer and part of the 23S rRNA gene from the rrnB operon.
pD1097  654 bp  pGEM 3+  DH5α  An AvaI-AvaI insert consisting of part of the 16S rRNA gene, the 16S-23S spacer and part of the 23S rRNA gene from the rrnA operon.
pD1098  669 bp  pGEM 3+  DH5α  An AvaI-AvaI insert consisting of part of the 16S rRNA gene, the 16S-23S spacer and part of the 23S rRNA gene from the rrnB operon.
pD1099  1.0 kb  pGEM 7+  DH5α  An EcoRI-EcoRI fragment consisting of the 3'-flanking region of 5S rRNA and the cysteine tRNA gene from the rrnA operon.
pD1100  1.7 kb  pGEM 7+  DH5α  An EcoRI-EcoRI fragment consisting of part of the 23S rRNA gene and the 23S-5S spacer region of the rrnA operon.
pD1021  8 kb  pBR322  JM101  A HindIII-ClaI fragment consisting of the entire rrnA operon.
pD1022  10 kb  pBR322  JM101  A HindIII-HindIII fragment consisting of the entire rrnB operon.
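
Table A.2 lists oPD33 (forward) and oPD34 (reverse) as covering the same 16S rRNA positions, 39-58, so each should be the reverse complement of the other. The short Python sketch below illustrates that consistency check. It is not part of the original work: the function name reverse_complement and the variable names are assumptions introduced here, and the two sequences are copied from Table A.2 with the extraction spaces removed.

```python
# Illustrative check only; names below are assumptions, not from the thesis.
# Maps each unambiguous DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'-3'."""
    cleaned = seq.replace(" ", "").upper()  # drop stray spaces, normalise case
    return cleaned.translate(COMPLEMENT)[::-1]


# Primer sequences as listed in Table A.2 (5'-3').
oPD33 = "GTCCGATTTAGCCATGCTAG"  # forward, 16S rRNA positions 39-58
oPD34 = "CTAGCATGGCTAAATCGGAC"  # reverse, 16S rRNA positions 39-58

if __name__ == "__main__":
    # True when the two primers anneal to opposite strands of the same site.
    print(reverse_complement(oPD33) == oPD34)
```

The sketch handles only the four unambiguous bases; degenerate positions written as (CT) or (GC) in Table A.2, as in oPD38, would need to be expanded before applying the same check.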
