UBC Theses and Dissertations


The organization, expression, function and evolution of some essential genes from the hyperthermophilic… Liao, Daiqing 1993

Full Text

THE ORGANIZATION, EXPRESSION, FUNCTION AND EVOLUTION OF SOME ESSENTIAL GENES FROM THE HYPERTHERMOPHILIC EUBACTERIUM THERMOTOGA MARITIMA

by

DAIQING LIAO

M.Sc., Peking University, 1986
B.Sc., Hunan University, 1983

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
DEPARTMENT OF BIOCHEMISTRY AND MOLECULAR BIOLOGY

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
JUNE, 1993
© Daiqing Liao, 1993

National Library of Canada
Acquisitions and Bibliographic Services Branch
395 Wellington Street, Ottawa, Ontario K1A 0N4

The author has granted an irrevocable non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of his/her thesis by any means and in any form or format, making this thesis available to interested persons.

The author retains ownership of the copyright in his/her thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without his/her permission.

ISBN 0-315-85403-0

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study.
I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

(Signature)

Department of Biochemistry
The University of British Columbia
Vancouver, Canada
Date: 1993

Abstract

The hyperthermophilic eubacterium Thermotoga maritima grows optimally at 80°C near marine geothermal locales. Phylogenetic analyses based on various molecular sequences indicate that T. maritima and other hyperthermophilic prokaryotes have very deep phylogenetic placements; i.e., they diverged early from the ancestor of living organisms. Thus, studies on the biochemistry and molecular biology of hyperthermophilic organisms such as T. maritima may shed light on the early evolution of life, as well as enhance our understanding of life at high temperature. In this study, a 5,800-base-pair DNA fragment from the chromosome of T. maritima was cloned and sequenced. This fragment encodes five tRNAs, the ribosomal protein L33, an integral membrane protein, SecE, which is probably involved in protein translocation, the transcription factor NusG, four large subunit ribosomal proteins (L11, L1, L10 and L12), and the N-terminus of the RNA polymerase β subunit. The transcriptional patterns of this gene cluster were analyzed using S1 nuclease protection and primer extension techniques. The tRNA genes and the protein-encoding genes are cotranscribed, except the β gene, which is transcribed separately.
The following regulatory sequence elements were identified in this cloned fragment: five promoters (P1 and P2 in front of the first and second methionine tRNAs, respectively, PL10 in the L1-L10 intergenic space, PL12 at the end of the L10 gene, and Pβ in the L12-β intergenic region), a transcription attenuator upstream of the L10 gene, a transcription terminator located between the L12 gene and the β subunit gene of the RNA polymerase, and an autogenous translational regulation site (the L1 binding site) located upstream of the L11 gene.

The transcription factor NusG encoded in this cluster exhibits 43% amino acid sequence identity when aligned to its E. coli counterpart; the alignment is interrupted by a 171-amino-acid-long insertion in the T. maritima protein. The T. maritima NusG was overexpressed in E. coli, and the recombinant NusG protein was purified. The NusG protein binds to DNA cooperatively, but nonspecifically. Two types of NusG-DNA complexes have been observed. The first type forms instantly and can be stained with ethidium bromide ("loose" complex); the second type forms more slowly, and is probably converted from the loose structure(s). The second type is probably more compact, as it cannot be stained with ethidium bromide ("tight" complex). This protein binds to both ds- and ssDNA, but preferentially to dsDNA in a mixture of both DNA molecules. About 40 and 60 NusG monomers per kilobase (pair) of ds- and ssDNA, respectively, are required to form cooperative NusG-DNA complexes. When a relatively large amount of NusG was added to an in vitro transcription assay, it appeared to selectively suppress aberrant transcription initiation and termination, while the production of specific transcripts was, at most, only marginally reduced.

Available sequences that correspond to the E. coli ribosomal proteins L11, L1, L10 and L12 from eubacteria, archaebacteria and eukaryotes have been aligned, and the alignments were subjected to quantitative phylogenetic analysis.
Eubacteria and eukaryotes each form a well-defined, coherent and non-overlapping group. Archaebacteria also form a coherent phylogenetic group by themselves, but the relationships between the major groups of archaebacteria and outgroups (eubacteria and eukaryotes) cannot be established unambiguously. On the other hand, T. maritima does not appear as the deepest branch within the eubacterial kingdom; however, this placement is less definitive.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Abbreviations
Acknowledgments

I. Introduction
  1.1 The origin of life
    1.1.1 Prebiotic synthesis of the building blocks of life
    1.1.2 The living world before the first cell
    1.1.3 The first cell
  1.2 Hyperthermophilic prokaryotes
    1.2.1 General properties of hyperthermophiles
    1.2.2 The molecular basis of thermophily
    1.2.3 Phylogeny
  1.3 The Ribosomes
  1.4 Transcription
  1.5 Objectives of this study

II. Materials and methods
  2.1 Materials
  2.2 Bacterial strains, plasmids, and media
  2.3 Molecular biological techniques
    2.3.1 Gel electrophoresis
    2.3.2 DNA restriction fragment preparation
    2.3.3 Ligation
    2.3.4 Transformation
    2.3.5 5' and 3' end-labeling of DNA fragments
    2.3.6 5' end-labeling of oligonucleotides
    2.3.7 Labeling DNA probes by random priming method
    2.3.8 Southern blot hybridization and cloning procedures
    2.3.9 DNA sequencing
    2.3.10 RNA transcript analysis
    2.3.11 Northern blotting
  2.4 Expression of T. maritima NusG in E. coli
  2.5 Purification of NusG
  2.6 Immunization procedure
  2.7 Enzyme-linked immunosorbent assay (ELISA) and Western blotting
  2.8 DNA band-shift assays
  2.9 Isolation of T. maritima ribosomes
  2.10 In vitro transcription
  2.11 Molecular sequences
  2.12 Sequence alignments
  2.13 Phylogenetic reconstruction

III. The organization and expression of essential transcription, translation component genes in the hyperthermophilic eubacterium Thermotoga maritima
  3.1 Introduction
  3.2 Results and discussion
    3.2.1 The tRNA gene cluster
    3.2.2 tRNA processing
    3.2.3 Characterization of transcripts derived from protein-encoding genes
    3.2.4 mRNA secondary structure and function
    3.2.5 Transcription and translation initiation signals
    3.2.6 Protein homologies
    3.2.7 Evolutionary implications
  3.3 Summary

IV. The functions of the T. maritima NusG protein: DNA-binding activity and its role in transcription
  4.1 Introduction
  4.2 Results
    4.2.1 Binding activity of NusG to duplex DNA
    4.2.2 Accessibility of the NusG-DNA complex to restriction by TaqI
    4.2.3 Binding activity of NusG to single-stranded DNA
    4.2.4 Competition between single-stranded and duplex DNA for NusG binding
    4.2.5 Role of NusG in transcription
    4.2.6 Association of NusG with ribosomes in T. maritima
  4.3 Discussion
  4.4 Summary

V. Molecular phylogenies based on the sequences of ribosomal proteins L11, L1, L10 and L12
  5.1 Introduction
  5.2 Results and Discussion
    5.2.1 Alignment and phylogeny of L11 proteins
    5.2.2 Alignment and phylogeny of L1 protein sequences
    5.2.3 The sequence alignments and phylogeny of L10 proteins
    5.2.4 The sequence alignments and phylogeny of L12 proteins
    5.2.5 Phylogenetic considerations
  5.3 Summary

VI. Conclusion and prospects

VII. References

List of Tables

Table 1 Taxonomy of hyperthermophilic prokaryotes
Table 2 E. coli strains used for cloning
Table 3 Oligonucleotides
Table 4 Duplex DNA fragments used for DNA band-shift assays
Table 5 Organisms and their abbreviations from which the sequences of the ribosomal proteins L11, L1, L10 and L12 are available
Table 6 T. maritima and E. coli protein homologies

List of Figures

Figure 1 A possible scheme for early evolution
Figure 2 A universal phylogenetic tree
Figure 3 Structure and organization of the L11, L1, L10 and L12 encoding regions from E. coli and T. maritima genomes
Figure 4 Nucleotide sequence of the T. maritima 5.8 kb EcoRI genomic fragment
Figure 5 Structure and processing of tRNAs
Figure 6 Mapping of transcript end sites in the tRNA-nusG region
Figure 7 Characterization of transcripts from the SecE, NusG and ribosomal protein encoding genes
Figure 8 Structure and features of E. coli and T. maritima RNA transcripts
Figure 9 Transcription initiation and translation initiation elements in T. maritima
Figure 10 Alignment of SecE protein sequences
Figure 11 Alignment of NusG protein sequences
Figure 12 Overexpression and purification of the T. maritima NusG protein
Figure 13 The binding properties of NusG to linear duplex DNA
Figure 14 The stoichiometry of NusG:duplex DNA complexes
Figure 15 Susceptibility of NusG:duplex DNA complexes to restriction by TaqI
Figure 16 The stoichiometry of NusG:single-stranded DNA complexes
Figure 17 Competition between single-stranded and duplex DNA for NusG binding
Figure 18 In vitro transcription of DNA template containing a promoter and a terminator
Figure 19 In vitro transcription of DNA template containing a promoter, a terminator, and an attenuator
Figure 20 Alignment of the amino acid sequences of ribosomal protein L11 family, and the phylogenetic tree based on this alignment
Figure 21 Alignment of the amino acid sequences of ribosomal protein L1 family, and the phylogenetic tree based on this alignment
Figure 22 Alignment of the amino acid sequences of ribosomal protein L10 family, and the phylogenetic tree based on this alignment
Figure 23 Alignment of the amino acid sequences of ribosomal protein L12 family
Figure 24 Phylogenetic tree inferred from the aligned L12 amino acid sequences
Abbreviations

A        Adenosine
A600     Absorbance at 600 nm
ATP      Adenosine 5'-triphosphate
bp       Base pair
BSA      Bovine serum albumin
C        Cytosine
dATP     2'-deoxyadenosine-5'-triphosphate
dCTP     2'-deoxycytidine-5'-triphosphate
ddATP    2',3'-dideoxyadenosine-5'-triphosphate
ddCTP    2',3'-dideoxycytidine-5'-triphosphate
ddGTP    2',3'-dideoxyguanosine-5'-triphosphate
ddTTP    2',3'-dideoxythymidine-5'-triphosphate
ddNTP    2',3'-dideoxyribonucleotide-5'-triphosphate (ddATP, ddCTP, ddGTP and ddTTP)
dITP     2'-deoxyinosine-5'-triphosphate
DNA      Deoxyribonucleic acid
DNase    Deoxyribonuclease
dNTP     2'-deoxyribonucleotide-5'-triphosphate (dATP, dCTP, dGTP and dTTP)
ds       Double-stranded
DTT      Dithiothreitol
dTTP     2'-deoxythymidine-5'-triphosphate
EDTA     Ethylenediamine tetraacetic acid
G        Guanosine
GTP      Guanosine-5'-triphosphate
IPTG     Isopropyl-β-D-thiogalactopyranoside
kbp      Kilobase pairs
kd       Kilodaltons
MOPS     Morpholinopropane sulfonic acid
mRNA     Messenger RNA
NTP      Ribonucleotide-5'-triphosphate (ATP, CTP, GTP and UTP)
PAGE     Polyacrylamide gel electrophoresis
PBS      Phosphate-buffered saline
PCR      Polymerase chain reaction
PIPES    1,4-piperazine-N,N'-bis[2-ethane sulfonic acid]
RNA      Ribonucleic acid
RNase    Ribonuclease
rpm      Revolutions per minute
rRNA     Ribosomal RNA
S        Svedberg unit of sedimentation coefficient
SDS      Sodium dodecyl sulphate
ss       Single-stranded
T        Thymidine
Tmax     Maximum growth temperature
Tris     Trihydroxymethylaminomethane
tRNA     Transfer RNA
X-gal    5-bromo-4-chloro-3-indolyl-β-D-galactoside

Acknowledgments

I would like to thank Dr. Patrick Dennis for providing me the opportunity to work in his lab, and for his guidance and encouragement throughout this study. I am also indebted to Drs. Philip Bragg and Ross MacGillivray for their advice and support. I thank all my fellow workers in the Dennis lab for their friendship and cooperation. I am especially grateful to Luc, Simon and Steve for critically reading this thesis.

Many scientists have provided generous assistance. Dr. Wolfgang Zillig sent us T. maritima cells and the plasmid pUC-TB4. Dr.
Peter Palm shared with us the purified DNA-dependent RNA polymerase from T. maritima. Dr. Jack Greenblatt provided purified E. coli NusG and antisera against it. Drs. Karl Stetter, Alap Subramanian and Roland Hartmann communicated manuscripts and data prior to publication. The Alfred Sloan Foundation awarded a fellowship to allow me to attend the third UCLA International School on Molecular Evolution.

Most of all, I thank my wife Lisa for her unconditional support and for enduring many lonely evenings and weekends with our daughter, Jennifer, while I was working in the lab. To them, Jennifer and Lisa, I am forever indebted, for making all this worthwhile. This thesis is dedicated to my parents, Mr. Wanjin Liao and Mrs. Liuliang Zhu.

I. Introduction

1.1 THE ORIGIN OF LIFE

1.1.1 Prebiotic synthesis of the building blocks of life

According to current thinking, the solar system formed about 4.6 billion years ago from a cloud of gas and interstellar dust. The Earth formed from one of the small planetesimals that condensed from the portion of the cloud that did not fall into the Sun during the genesis of the solar system. The primitive atmosphere of the Earth resulted from outgassing of various gases from the interior of the Earth. Methane (CH4), ammonia (NH3), water (H2O), and hydrogen (H2) were probably among the major constituents of the primitive atmosphere. Under the primitive Earth conditions (frequent lightning and volcanic eruptions), these gases reacted with each other, resulting in the prebiotic syntheses of amino acids, nucleic acid bases and sugars, the essential building blocks of life.
These small molecules were probably assembled into larger molecules, such as nucleic acid precursors, short peptides and nucleic acids, by condensation reactions (for recent reviews, see Weiner, 1987; Pace, 1991).

Laboratory experiments, which were thought to simulate primitive Earth conditions, demonstrated that many amino acids, hydrogen cyanide (HCN) and various aldehydes (RCHO) could be formed by refluxing a mixture of CH4, NH3, H2O and H2, and passing high voltage sparks through the gaseous phase. Further work showed that (i) refluxing a concentrated solution of HCN in ammonia led to the synthesis of the nucleic acid base adenine; (ii) uracil can also be synthesized from HCN; and (iii) polymerization of formaldehyde (HCHO) can give rise to various sugars (reviewed by Weiner, 1987).

The conditions and pathways of prebiotic syntheses remain controversial. First, the exact conditions prevailing on the primitive Earth are still a matter of speculation. A more recent model of the early atmosphere suggested that the primitive atmosphere contained primarily CO2, CO and N2; that the surface temperature under such an atmosphere would have been 85°C; and that CH4 and NH3 were probably scarce (reviewed by Kasting, 1993). When life originated about 3.5 billion years ago, the atmosphere was likely to have been weakly reducing (mainly CO2 and N2, with traces of CO, H2, and reduced sulfur gases). In such an atmosphere, formaldehyde could still have been synthesized efficiently (Pinto et al., 1980), but formation of hydrogen cyanide (HCN), which is essential for syntheses of amino acids and nucleotides, would have been much more difficult (reviewed by Kasting, 1993). Many current theories regarding the origin of life assume that the primitive atmosphere was reducing. Explaining how HCN could have formed is, therefore, one of the major hurdles for these theories.
However, prebiotic reactions could still have taken place in anaerobic locales, such as submarine hydrothermal vents, under oxidizing conditions (Wächtershäuser, 1992). Alternatively, the biological precursor molecules could have been introduced by impacting comets or other planetesimals (Chyba et al., 1990).

Secondly, many of the proposed prebiotic reaction pathways are mutually incompatible. For example, formation of various sugars requires high concentrations of formaldehyde in alkaline solution, but under such conditions, formaldehyde would react rapidly with the amino group of both nucleic acid bases and amino acids (Shapiro, 1988). Nonetheless, regional variations in conditions (temperature, pH, chemical composition, etc.) might have allowed incompatible reactions to occur in different locales and at different times on the primitive Earth.

1.1.2 The living world before the first cell

Prebiotic syntheses created a collection of basic building blocks of life and probably also short peptides and nucleic acids. Curiously, ribose is much more readily synthesized than deoxyribose under simulated prebiotic conditions (reviewed by Weiner, 1987). This may be the first indication of the primacy of RNA in the origin of life. The chemical properties of RNA seem to suit it to play central roles in the early history of life. First, it has inherent template properties, enabling it to self-replicate. Secondly, RNA can catalyze chemical reactions, as illustrated by the self-splicing group I and II introns, the RNA component of RNase P, and the peptidyl transferase activity of the ribosomal RNA (rRNA) component of ribosomes (Noller et al., 1992). Most recently, it was shown that an engineered ribozyme is able to function effectively both as a catalyst and a template in self-copying reactions (Green and Szostak, 1992).
These and other key properties of RNA have led to the hypothesis that life was based on RNA during early stages of evolution and that DNA supplanted RNA as the dominant genetic material "relatively late" in the history of life (Crick, 1969; Gilbert, 1986).

Naturally, because of the tremendous task of maintaining all metabolic reactions within a modern cell, it seems impossible to imagine that RNA catalysis alone would be enough to carry out the transition from an RNA world to a DNA world. Thus it seems likely that encoded protein synthesis evolved in an RNA world and preceded the advent of DNA. How protein synthesis could have evolved in an RNA world was addressed in the "genomic tag model" of Weiner and Maizels (1987 and 1991). In this model, it was proposed that tRNAs would have been derived from the 3' terminal structures that tagged RNA genomes for replication by a replicase made of RNA; charging of this tRNA-like structure with an amino acid could have been selected to facilitate replication, whereas a variant RNA replicase may have given rise to the first tRNA synthetase that can transfer a charged amino acid to a 3' terminal tRNA-like structure. Thus, this model makes it possible to select for two key components of the modern translation apparatus, tRNA and a tRNA synthetase, for reasons of replication. From this, it is not difficult to imagine any number of scenarios by which encoded protein synthesis might have arisen. In this way, the RNA world would gradually give rise to a ribonucleoprotein or RNP world early in evolution. In the RNP world, RNA might have retained essential catalytic activities, while proteins would play structural roles and enhance catalytic efficiency. Other proteins might have gained novel catalytic capabilities.
Recent experimental observations reveal that ribosomal RNA may catalyze peptide-bond formation during protein synthesis (Noller et al., 1992), and that an engineered Tetrahymena ribozyme can catalyze the hydrolysis of an aminoacyl ester bond (Piccirilli et al., 1992). These results seem to be consistent with the "genomic tag model" for the origin of protein synthesis.

The ribose 2'-OH group that aids RNA catalysis also renders the RNA chain particularly vulnerable to hydrolysis. As the complexity of the living systems increased, RNA became less suitable as the genetic material. Ribonucleoside diphosphate reductase was the key enzyme that made the transition to the DNA world possible. In the DNA world, right-handed double-helical DNA serves as the genetic material, and some ribonucleoprotein complexes have been retained as important cellular components, such as the ribosomes and RNase P.

1.1.3 The first cell

Quantitative phylogenetic analyses of the sequences of many proteins and nucleic acids (RNA and DNA), especially small subunit ribosomal RNA, have indicated that living organisms evolved from a common primordial ancestor (not necessarily a cellular entity), and subsequently divided into three primary kingdoms: the eubacteria, the archaebacteria and the eukaryotes (Pace et al., 1986; Woese and Olsen, 1986; Woese et al., 1990; Pace, 1991). (Woese et al. proposed that the three kingdoms should be renamed as three "domains": the Bacteria for eubacteria, the Archaea for archaebacteria, and the Eucarya for eukaryotes [Woese et al., 1990; Wheelis et al., 1992].) The common ancestor of all modern life, dubbed the progenote, may have come into being after a crude translation mechanism and a nucleic acid-based genetic system were devised. The macromolecules were most likely already enclosed within a lipid membrane at this stage of evolution (Woese, 1987).
Because the differences in cell architecture and many molecular characteristics are profound among the three primary kingdoms, and the fundamental cellular functions (translation, transcription, replication and so on) seem unique in each line of descent, it is plausible that the common ancestor that led to the ancestors of archaebacteria, eubacteria and eukaryotes was very different from any modern organism. Woese (1987) speculated that the progenote may have lacked most of the functions characteristic of the cells known today, and that its rudimentary machinery would have undergone significant refinement and augmentation in the descendant lineages.

Studies on microfossils of bacteria and stromatolites (laminated mounds with structures of a type often associated with mats of microorganisms) suggested that bacteria-like cells already existed 3.5 billion years ago (Walter et al., 1980; Schopf and Walter, 1983; Knoll and Barghoorn, 1985; Schopf, 1993). Photosynthetic eubacteria were among these ancient cells (Walter, 1983), which implies that the common eubacterial ancestor existed even earlier. Thus, the common ancestor would probably have evolved rather quickly, in less than 1 billion years, into the ancestors of modern organisms.

The question of the nature of the first cell may be the most difficult in biology. Nonetheless, insights into this have been emerging from evolutionary studies of modern organisms. Since archaebacterial rRNAs have, on average, evolved substantially more slowly than those of either eukaryotes or eubacteria, it was proposed that archaebacteria are more closely related to the common ancestor than the other two groups (Woese, 1987; Pace, 1991). A group of hyperthermophilic archaebacteria such as Pyrodictium have evolved particularly slowly; thus they may be the closest living relatives of the first organisms.
These hyperthermophiles share some unique properties, such as extremely high growth temperatures (about 100°C), utilization of geochemical energy sources (e.g. sulfur, molecular hydrogen), and fixation of CO2 (reviewed by Pace, 1991; see below).

Following this conjecture, it was further proposed recently that the first cells were probably chemoautotrophic, thriving in a thermophilic and anaerobic environment containing iron-sulfur compounds (pyrite) (Wächtershäuser, 1992). The sole reducing power for fixation of CO2 could have been provided by oxidative formation of pyrite (FeS2) from hydrogen sulfide (H2S) and ferrous (Fe2+) ion. Such an environment, in which water activity is low, would also allow the propagation and accumulation of biological compounds; thus, macromolecular chemistry could take place (reviewed by Pace, 1991).

In contrast, Cavalier-Smith (1991) argued that the first cells were probably gram-negative photoheterotrophic eubacteria with an outer membrane surrounding their plasma membrane. Archaebacteria and eukaryotes may have evolved from a Thermotoga-like eubacterium after it lost the murein-based cell wall. The archaebacterial ancestor evolved a new cell wall and isoprenoidal ether lipids, while the eukaryotic cells arose from this mutant eubacterium and evolved an internal cytoskeleton based on actin microtubules, which would allow the formation and evolution of endomembranes and nuclei.

It is highly unlikely that these debates about the phenotypes of the first cells can be settled soon. Nonetheless, it is now generally accepted that life on Earth evolved from a single ancestor, which is represented by the root in a universal phylogeny. The position of the root appears to fall closer to the eubacterial lineage, while the archaebacterial and eukaryotic lineages might have arisen by a second split after their common ancestor evolved from the primitive ancestor (Gogarten et al., 1989; Iwabe et al., 1989).
However, the position of the root and the origins of all three lineages, especially the eukaryotes, remain unresolved. It is possible that eukaryotes may have evolved from a branch within the archaebacterial lineages; one contention argues that "eocytes" (a group of hyperthermophilic archaebacteria, e.g. Sulfolobus) are the closest relatives of eukaryotes (Lake, 1988; Rivera and Lake, 1992). Figure 1 summarizes a possible scheme for early evolution.

1.2 HYPERTHERMOPHILIC PROKARYOTES

1.2.1 General properties of hyperthermophiles

Organisms with a maximal growth temperature (Tmax) greater than 50°C are defined as thermophiles (Brock, 1986), while organisms with an optimal growth temperature around 80°C are generally classified as hyperthermophiles (Kristjansson and Stetter, 1992). Thermophilic microorganisms have been recognized since the late nineteenth century. Early research on thermophiles mostly centered on several species of Bacillus and other thermophilic eubacteria which were isolated from a wide range of both thermophilic and mesophilic (temperature lower than 50°C) environments. These organisms can grow at temperatures as high as 70°C.

The recent discovery of many extremely thermophilic microorganisms with optimal growth temperatures above 80°C has generated much interest in the field of thermophilic research. Most of these extremely thermophilic organisms (or hyperthermophiles) were isolated from hot springs, solfataras, geothermal soils and submarine vents. Such hyperthermophiles are chiefly anaerobic chemolithoautotrophs and heterotrophs. Molecular hydrogen (H2) is often used as the main source of energy by some chemolithoautotrophs. It was discovered that most of the hyperthermophiles belong to the archaebacterial lineage.
Figure 1 A possible scheme for early evolution, tracing the stages from prebiotic synthesis through the RNA world, the RNP world and the DNA world to the progenote and the eubacterial, archaebacterial and eukaryotic lineages. (Modified after Darnell and Doolittle [1986].)

The hyperthermophiles in the archaebacterial lineage include some of the methanogens and sulfur-metabolizing thermophiles.
The extreme halophiles are usually mesophilic; only one species can grow at up to 55°C (Grant and Larsen, 1989). The hyperthermophilic methanogens such as Methanothermus fervidus are neutrophilic, strictly anaerobic autotrophs, growing on hydrogen and CO2 (Stetter et al., 1981). Sulfur-metabolizing archaebacteria are metabolically very diverse. There are thermoacidophiles, which are obligate or facultative aerobes, growing optimally at pH 2 to 3. The thermoneutrophiles are usually obligate anaerobes, and grow between pH 5.5 and 7.0; some of them are the most thermophilic organisms known (Tmax = 110°C) (Kristjansson and Stetter, 1992; Stetter, 1993). Sulfur-metabolizing hyperthermophiles have small, circular chromosomes. For example, the sizes of the circular chromosomes of Sulfolobus acidocaldarius and Thermococcus celer are 3.1 and 2.0 megabase pairs, respectively (Noll, 1989; Yamagishi and Oshima, 1990).

Seven species of hyperthermophilic eubacteria have been isolated (Huber and Stetter, 1992; Huber et al., 1992). Among them, six species are in the order Thermotogales. Some, such as Thermotoga neapolitana and Thermotoga thermarum, were isolated from terrestrial neutral hot springs, whilst other species thrive in marine hydrothermal fields. Thermotoga maritima was the first hyperthermophilic eubacterium to be isolated, originally from a geothermal marine sediment at Vulcano, Italy, and subsequently from many marine high-temperature ecosystems around the world (reviewed by Huber and Stetter, 1992). It is a fermentative heterotroph, grows anaerobically at temperatures up to 90°C, and utilizes a wide range of carbon sources, such as ribose, glucose, xylose, cellulose, starch and glycogen.

The organisms in the order Thermotogales are gram-negative, rod-shaped bacteria, which are about 2 to 5 µm in length and 0.5 to 0.6 µm in diameter. These eubacteria grow singly or in pairs; some form short chains and aggregates. Cells of both T. maritima and T.
neapolitana show a characteristic "toga," a sheath-like outer envelope, ballooning over the ends. The main protein constituent of the toga is a porin (reviewed by Huber and Stetter, 1992).

Members of Thermotogales show some distinctive biochemical characteristics. All of them contain rare long-chain dicarboxylic acids, and the species in the genus Thermotoga contain a novel ether-lipid, 15,16-dimethyl-30-glyceryloxytriacontanoic acid (De Rosa et al., 1988). Novel glycolipids were also identified in T. maritima cells (Monca et al., 1992). Both T. maritima and T. neapolitana are insensitive to the antibiotic rifampicin, although their RNA polymerase is of the eubacterial type (Huber et al., 1986; Huber and Stetter, 1992). The ribosomes of T. maritima are insensitive to the aminoglycoside antibiotics (streptomycin, kanamycin, gentamycin, neomycin and paromomycin) (Londei et al., 1988).

In addition, the hyperthermophilic eubacterium Aquifex pyrophilus was identified recently in a hot marine sediment. It is strictly chemolithoautotrophic, and uses H2, thiosulfate, and elemental sulfur as electron donors, and oxygen and nitrate as electron acceptors (Huber et al., 1992). The currently known hyperthermophilic eubacteria and archaebacteria are summarized in Table 1.

1.2.2 The molecular basis of thermophily

Contemporary mesophilic organisms cannot survive at high temperatures, because their proteins are denatured upon exposure to heat. The consequences of heat damage to proteins include the unfolding of tertiary structures. Chemical modifications, such as deamidation of asparagine and glutamine, hydrolysis of aspartic acid- and asparagine-containing peptide bonds, as well as destruction of cystine bonds, also occur (Hensel et al., 1992). Thus it is extraordinary that thermophilic microorganisms not only can survive, but can also grow and multiply at high temperatures. This has led to considerable interest in the molecular bases of thermophily. 
Initially, the protein sequences from thermophilic and mesophilic organisms were compared. Thermophilic bacilli were often used as models for many of these studies, which have now accumulated useful information that biochemists have used in attempts to modify proteins to improve their thermostability (Clarke et al., 1986).

Table 1  Taxonomy of hyperthermophilic prokaryotes

Order               Genus             Species                   Tmax  References
                                                                (°C)

I. Eubacteria
Thermotogales       Thermotoga        T. maritima               90    Huber et al. (1986)
                                      T. neapolitana            90    Windberger et al. (1989)
                                      T. thermarum              84    Windberger et al. (1989)
                    Thermosipho       T. africanus*             77    Huber et al. (1989b)
                    Fervidobacterium  F. nodosum                80    Patel et al. (1985)
                                      F. islandicum             80    Huber et al. (1990)
"Aquificiales"      Aquifex           A. pyrophilus             95    Huber et al. (1992b)

II. Archaebacteria
Sulfolobales        Sulfolobus        S. acidocaldarius         85    Brock et al. (1972)
                                      S. solfataricus           87    Zillig et al. (1980)
                                      S. shibatae               86    Grogan et al. (1990)
                                      S. metallicus*            75    Huber et al. (1991)
                                      S. thuringiensis          85    Stetter (1993)
                    Metallosphaera    M. sedula                 80    Huber et al. (1989a)
                    Acidianus         A. infernus               95    Segerer et al. (1986)
                                      A. brierleyi*             75    Brierley et al. (1983)
                    Desulfurolobus    D. ambivalens             95    Zillig et al. (1987)
                    Stygiolobus       S. azoricus               89    Segerer et al. (1991)
Thermoproteales     Thermoproteus     T. tenax                  97    Zillig et al. (1981)
                                      T. neutrophilus           97    Stetter (1986)
                                      T. uzoniensis             97    Bonch-Osmolovskaya et al. (1990)
                    Pyrobaculum       P. islandicum             103   Huber et al. (1987)
                                      P. organotrophum          103   Huber et al. (1987)
                    Thermofilum       T. pendens                95    Zillig et al. (1983)
                                      T. librum                 95    Stetter (1986)
Desulfurococcales   Desulfurococcus   D. mobilis                95    Zillig et al. (1982)
                                      D. mucosus                97    Zillig et al. (1982)
                                      D. saccharovorans         97    Stetter (1986)
                                      D. amylolyticus           97    Bonch-Osmolovskaya et al. (1985)
                    Staphylothermus   S. marinus                98    Fiala et al. (1986)
Pyrodictiales       Pyrodictium       P. occultum               110   Stetter et al. (1983)
                                      P. brockii                110   Stetter et al. (1983)
                                      P. abyssi                 110   Pley et al. (1991)
                    Hyperthermus      H. butylicus              108   Zillig et al. (1990)
                    Thermodiscus      T. maritimus              98    Stetter (1986)
Thermococcales      Thermococcus      T. celer                  93    Zillig et al. (1983)
                                      T. litoralis              98    Neuner et al. (1990)
                                      T. stetteri               98    Miroshnichenko et al. (1989)
                                      T. acidaminovorans        96    Stetter (1993)
                                      T. tadjuricus             94    Stetter (1993)
                    Pyrococcus        P. furiosus               103   Fiala et al. (1986)
                                      P. woesii                 103   Zillig et al. (1987)
"Archaeoglobales"   Archaeoglobus     A. fulgidus               92    Stetter et al. (1987)
                                      A. profundus              92    Burggraf et al. (1990a)
                                      A. lithotrophicus         100   Stetter (1993)
Methanobacteriales  Methanothermus    M. fervidus               97    Stetter et al. (1981)
                                      M. sociabilis             97    Lauerer et al. (1986)
Methanococcales     Methanococcus     M. thermolithotrophicus*  70    Huber et al. (1982)
                                      M. jannaschii             86    Jones et al. (1983)
                                      M. igneus                 91    Burggraf et al. (1990b)
"Methanopyrales"    Methanopyrus      M. kandleri               110   Kurr et al. (1991)

* extremely thermophilic organism (Tmax <80°C). (Table adapted after Stetter [1993].)

Amino acid replacements that make proteins thermostable are often subtle, and frequently involve a small number of residues. For example, glutamic acid and arginine are favored over aspartic acid and lysine, respectively; cysteine and asparagine are often avoided (Robson and Pain, 1971; Hensel et al., 1992), and high levels of glycine and proline residues (allowing proteins to fold tighter, and therefore more stable, loops that connect the secondary structural elements) were also observed in some thermostable proteins (Davies et al., 1991; Watanabe et al., 1991), but definitive rules are lacking (Wedler and Merkler, 1985). It is generally agreed that the thermostability of proteins is primarily dependent on the three-dimensional structure. Homologous proteins from thermophiles and mesophiles often share similar tertiary structures, but the thermostable proteins exhibit more extensive hydrogen bonding, hydrophobic interactions, and ionic bonding, as well as more stable α helices, from which helix-destabilizing residues are deleted (Perutz and Raidt, 1975; Davies et al., 1993). 
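The residue preferences mentioned earlier in this paragraph (glutamic acid over aspartic acid, arginine over lysine, and avoidance of cysteine and asparagine) can be tallied for any protein sequence. The following Python sketch is only illustrative: the function name composition_ratios and the toy sequence are hypothetical, and since definitive rules are lacking, such ratios describe a sequence rather than predict its thermostability.

```python
from collections import Counter


def composition_ratios(protein):
    """Tally simple composition indicators for a one-letter protein sequence.

    Returns Glu/Asp and Arg/Lys count ratios (None when the denominator
    is zero) and the combined fraction of Cys and Asn residues.
    """
    seq = protein.upper()
    counts = Counter(seq)

    def ratio(numerator, denominator):
        # Avoid division by zero when a residue is absent.
        if counts[denominator] == 0:
            return None
        return counts[numerator] / counts[denominator]

    cys_asn = (counts["C"] + counts["N"]) / len(seq) if seq else None
    return {
        "glu_over_asp": ratio("E", "D"),
        "arg_over_lys": ratio("R", "K"),
        "cys_asn_fraction": cys_asn,
    }


# Toy sequence (hypothetical, not taken from any real protein).
print(composition_ratios("EEDRRRKCNA"))
```

Comparing these values between homologous proteins from a thermophile and a mesophile would be one simple way to look for the trends described, bearing in mind that three-dimensional structure is considered the primary determinant.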
Comparative studies of three-dimensional structures among mesophilic and thermophilic proteins have yielded the best insights into the factors that control the thermal stability of proteins. Perutz and Raidt (1975) concluded that the enhanced thermal stability of ferredoxins and hemoglobins was conferred by additional salt bridges and/or hydrogen bonds. On the other hand, these stabilizing forces may make thermostable proteins less flexible. For example, D-glyceraldehyde-3-phosphate dehydrogenase (GAPDH) from T. maritima has a rigid structure at 25°C when compared to mesophilic GAPDH molecules. A temperature-dependent conformational change of the T. maritima enzyme has been detected, which is probably required to activate it to function at the physiological temperatures of T. maritima (Wrba et al., 1990; Rehaber and Jaenicke, 1992). A small structural reorganization in the glutamine synthetase of Bacillus stearothermophilus occurs well below the temperature of enzyme inactivation; such a change may also be required for the activity of the enzyme (Magsanaga and Nosoh, 1974).

Enzymes that do not have intrinsic thermostability at the optimal growth temperature of an organism may be stabilized by cell components such as membranes, cofactors and substrates, as well as a special intracellular micromolecular environment (reviewed by Sharp et al., 1992). Purified DNA polymerase from a Thermotoga species was moderately thermostable and exhibited a half-life of 3 min at 95°C, and 60 min at 50°C in the absence of substrates. Given the high growth temperature of this organism, it is probable that the substrates of the enzyme and other cellular components are needed to enhance its thermostability (Simpson et al., 1990). Binding of substrate and some metal ions has been shown to stabilize the glutamine synthetase from Bacillus caldolyticus at its physiological temperature (Wedler and Merkler, 1985; Merkler et al., 1988). A highly thermostable hydrogenase from T. 
maritima has been characterized, which shows some distinctive properties when compared with hydrogenases of mesophilic bacteria (Juszczak et al., 1991). For example, tungsten is needed for its activity and its iron-sulfur clusters may be different from those of mesophiles, although its amino acid composition is very similar to those of mesophilic hydrogenases. These chemical differences may account for their different thermostability. Other strategies, such as aggregation, are also employed for stabilization of proteins (Wedler and Hoffman, 1974).

Genetically, it has been demonstrated that DNA from thermophilic Bacillus strains can be used to transform mesophilic Bacillus subtilis, enabling it to grow at higher temperatures. Because the transformants can produce some thermostable variants of proteins that are normally thermolabile in the host cells, it was postulated that thermophily may be controlled by a small number of genes affecting translation, which leads to the thermostable phenotype (reviewed by Sharp et al., 1992). For thermophiles that can grow at both high and low temperatures (a temperature span greater than 40°C), it was proposed that two sets of genes for some key enzymes are probably present, one for low and the other for high temperature growth, and that their expression might be regulated by temperature (Wiegel, 1990).

The RNA components of the RNase P from thermophilic eubacteria, such as T. maritima, have remarkably high thermostability (Brown et al., 1993). As with thermostable proteins, the ability of the RNase P RNAs to resist heat denaturation is to a large extent inherent in their nucleotide sequences. 
Several strategies are employed to improve thermal stability of these RNAs: (i) increasing the number of hydrogen bonds in helices (higher G+C-content, more G-C base-pairs in the secondary structure); (ii) minimizing destabilizing elements in the secondary and tertiary structures (helices with fewer bulged nucleotides, minimal lengths of connections between helices); (iii) increasing Watson-Crick base pairs at the bases of stem-loops; and (iv) avoiding alternative foldings (fewer alternative structures near the minimal free energy state or short sequence length) (Brown et al., 1993).

1.2.3 Phylogeny

Biological and geological evidence seems to indicate that the ancestor of eubacteria may have been a hyperthermophile. First, phylogenetic analyses based on the sequences of small subunit ribosomal RNA, and subsequently various other molecules, such as translation elongation factors Tu and G, suggested that the deepest branches in the eubacterial kingdom are represented exclusively by slowly evolving hyperthermophiles including both T. maritima and A. pyrophilus (Achenbach-Richter et al., 1987; Bachleitner et al., 1989; Tiboni et al., 1991; Burggraf et al., 1992; Huber et al., 1992; Cousineau et al., 1992). Secondly, it has been proposed that the hydrosphere was far hotter three billion years ago than it is now (Ernst, 1983). As noted above, eubacteria already existed at that time; those ancient bacteria on stromatolites were probably Chloroflexus-like organisms, the photosynthetic deep-branching thermophiles belonging to the order green gliding bacteria (Walter, 1983; Woese, 1987).

Within the archaebacterial kingdom, hyperthermophiles also occupy very deep branches in the universal phylogenetic tree based on various analyses (Burggraf et al., 1992; Cousineau et al., 1992; Wheelis et al., 1992; Hasegawa et al., 1993). Therefore, it is plausible that the common ancestor of archaebacteria was also hyperthermophilic (Woese, 1987; Pace, 1991). 
Because of the proximity of the hyperthermophiles to the root of the universal phylogenetic tree, the common ancestor of all living organisms was probably thermophilic (Pace, 1991; Stetter, 1993). A universal phylogenetic tree including many hyperthermophiles is depicted in Figure 2.

[Figure 2. A universal phylogenetic tree. The hyperthermophilic organisms within the tree are indicated by heavy lines (figure adapted after Stetter [1993]).]

1.3 THE RIBOSOMES

Ribosomes are ribonucleoprotein complexes that use an mRNA template to align and polymerize amino acids into proteins. Each amino acid is carried on its cognate tRNA, which uses an anticodon to recognize a corresponding codon on the mRNA. Since proteins have more versatile catalytic power than RNA, the selective pressure for the early living systems to devise a mechanism to make protein must have been intense. As a result, the ribosomes likely arose in an RNA world (see discussion above). The primitive ribosome probably consisted mainly of RNA (Woese, 1980). The first ribosomal proteins were likely to have been small peptides that interacted with rRNA to stabilize the structure and function of the rRNA (Weiner and Maizels, 1987 and 1991). These ribosomes were probably error-prone, only able to make small polypeptides (Woese, 1987). Further evolution resulted in the development of a more efficient translational apparatus with many basic features of a modern ribosome, such as the arrangement as a two-subunit entity, by the progenote stage. 
After divergence of the three primary kingdoms, the efficiency and accuracy of ribosomes were further refined independently in each line of descent.

The modern ribosome is an exceedingly sophisticated molecular machine composed of a few species of RNA and over 50 different proteins: the eubacterial ribosome consists of 5S, 16S and 23S rRNA molecules and approximately 50 proteins; the eukaryotic counterpart is composed of 5S, 5.8S, 18S, and 28S rRNA molecules and about 75 proteins; the archaebacterial ribosome contains 5S, 16S, and 23S rRNA molecules and 50 to 65 proteins. The ribosomal components are encoded on many operons in the chromosomes of eubacteria and archaebacteria. There are seven rRNA operons, and approximately 20 operons for the 52 different ribosomal proteins in Escherichia coli; each ribosomal protein is encoded by a single-copy gene. The ribosomal protein genes are mostly organized in clusters of one or more transcriptional units that often contain additional genes whose products are involved in DNA replication (e.g. DNA primase), transcription (e.g. subunits of RNA polymerase), and translation (e.g. tRNA molecules, translational elongation factors Tu and G). In other eubacteria, ribosomal component genes are usually arranged in a similar fashion. Although there is only a single rRNA operon in the chromosome of T. maritima, the organization of 16S, tRNA, 23S, 5S and tRNA is the same as in E. coli. In eukaryotes, ribosomal proteins are encoded by single or multicopy genes of monocistronic transcriptional units which are rarely clustered within the genome (reviewed by Planta et al., 1986; Warner, 1989).

Archaebacterial genes encoding ribosomal components have been extensively studied (reviewed by Matheson et al., 1990; Wittmann-Liebold et al., 1990; Durovic et al., 1993a). The gene organization of ribosomal operons in archaebacteria often resembles that of corresponding operons in eubacteria. 
For example, the gene organizations of the L11, L1, L10 and L12 ribosomal protein clusters in some archaebacteria are identical to that of the rif operon in E. coli, and the gene clustering patterns corresponding to the E. coli spc, S10, and str operons are very similar from eubacteria to archaebacteria (Matheson et al., 1990; Wittmann-Liebold et al., 1990).

The E. coli rif operon, located at about 90 min in the chromosome, encodes four large subunit ribosomal proteins, i.e., L11, L1, L10 and L12, and two RNA polymerase subunits (β and β'). The ribosomal proteins encoded on this cluster are involved directly or indirectly in the GTPase activity on the large ribosomal subunit, and are required for the binding of extrinsic factors (e.g. EF-Tu and EF-G) to the ribosome (reviewed by Liljas, 1982). Since the homologous proteins in the GTPase domain have been identified in the eubacteria, the archaebacteria and the eukaryotes, this domain must have formed prior to the divergence of the three primary living kingdoms (reviewed by Shimmin et al., 1989; Shimmin, 1990). Many studies concerning the structure, function and evolution of these proteins have been reported (Liljas, 1982; Leijonmarck et al., 1987; Shimmin et al., 1989; Shimmin and Dennis, 1989; Matheson et al., 1990; Wittmann-Liebold et al., 1990; Shimmin, 1990; Köpke et al., 1992).

1.4 TRANSCRIPTION

Transcription is a cellular process in which RNA polymerase synthesizes RNA (rRNA, mRNA and tRNA) using a DNA template. RNA polymerase is one of the largest cellular protein complexes in the bacterial cell. In E. coli, the entire enzyme consists of four kinds of subunits with a composition of α2ββ'σ. Like nearly all biological polymerization reactions, transcription takes place in three steps: initiation, elongation, and termination. Synthesis of RNA starts at a specific region, called a promoter, on the template. The σ factors bind to RNA polymerase, enabling it to selectively initiate transcription at various kinds of promoters. 
The typical eubacterial promoter sequences show two common motifs on the 5' side (upstream) of the transcriptional start site. These are denoted as the -10 sequence (Pribnow box), which has a consensus of TATAAT, and the -35 sequence with a consensus sequence of TTGACA. The distance between these two conserved elements is crucial; it is almost always between 16 and 18 nucleotides (McClure, 1985).

Following initiation of transcription, the σ factor spontaneously dissociates from the RNA polymerase, and transcription enters the elongation step. In the elongation phase, RNA polymerase stays bound to its template until a termination signal is reached. A few cellular proteins, such as NusA, NusB, NusG and NusE (ribosomal protein S10), play important roles in regulating transcription elongation and termination in E. coli (Das, 1992; Roberts, 1993). The NusA protein reduces the rate of transcript elongation, enhances pausing by RNA polymerase at certain sites, and is important for transcription termination at other sites (Greenblatt, 1991; Das, 1992). The proteins NusB and S10 form heterodimers that interact specifically with a conserved sequence called box A within the leader region of ribosomal RNA transcripts (Mason et al., 1992). This interaction permits RNA polymerase to transcribe through Rho-dependent transcriptional terminators in the ribosomal RNA operons (Nodwell and Greenblatt, 1993). The E. coli NusG stimulates antitermination of transcription mediated by the N protein of bacteriophage λ along with purified NusA, NusB and NusE. The NusG protein stabilizes the transcriptional elongation-complex N-NusA-RNA polymerase when transcribing a nut-containing template, and also facilitates efficient Rho-dependent termination in vivo and in vitro (reviewed by Das, 1992; Sullivan and Gottesman, 1992; Whalen et al., 1992; Li et al., 1993). 
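As a rough illustration of the promoter architecture described earlier in this section (a -35 hexamer resembling TTGACA, a spacer of 16 to 18 nucleotides, and a -10 hexamer resembling TATAAT), the following Python sketch scans a DNA string for candidate sites. The function name find_promoters, the mismatch threshold and the test sequence are assumptions made for illustration only; they are not part of the methods used in this thesis, and real promoters often deviate further from the consensus.

```python
MINUS_35 = "TTGACA"  # consensus of the -35 sequence
MINUS_10 = "TATAAT"  # consensus of the -10 sequence (Pribnow box)


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, max_mismatches=1, spacers=(16, 17, 18)):
    """Return (start_of_-35, spacer_length, start_of_-10) for each candidate.

    A candidate is a hexamer within max_mismatches of TTGACA, followed
    after 16 to 18 nucleotides by a hexamer within max_mismatches of
    TATAAT. Positions are 0-based indices into seq.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatches:
            continue
        for gap in spacers:
            j = i + 6 + gap
            box = seq[j:j + 6]
            if len(box) == 6 and mismatches(box, MINUS_10) <= max_mismatches:
                hits.append((i, gap, j))
    return hits


# Artificial test sequence: exact consensus boxes separated by 17 nt.
example = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "CC"
print(find_promoters(example, max_mismatches=0))  # [(2, 17, 25)]
```

Raising max_mismatches makes the scan more permissive at the cost of more spurious hits; a position weight matrix would be the usual refinement, but it is beyond what the text describes.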
Finally, when the RNA polymerase reaches a stop signal, transcription is terminated and the σ factor rejoins the core of the enzyme (Platt, 1986), since it has a stronger intrinsic affinity for the free RNA polymerase than the elongation factor NusA (reviewed by Greenblatt, 1992).

In eukaryotic cells, transcription occurs in the nucleus, whereas translation takes place outside the nucleus. There are three types of eukaryotic RNA polymerases: RNA polymerase I (Pol I), RNA polymerase II (Pol II), and RNA polymerase III (Pol III). Pol I transcribes the tandem array of genes for 18S, 5.8S, and 28S rRNAs. Pol II synthesizes precursors of mRNA and several small RNA molecules, such as the U1 snRNA of the spliceosomes. Pol III makes 5S rRNA and all of the tRNA molecules, and most of the small RNAs. The sequence elements that control transcription and their positions in eukaryotic genes are complex. For the Pol II promoters, the TATA box is located about 30 nucleotides 5' to the start site of transcription initiation. Additional upstream activation sequences, such as the CAAT box and the GC box, which are located further upstream, improve transcription efficiency. Because of the diversity of the control sequences in the eukaryotic genes, many proteins, called transcription factors, are required to regulate transcription (Sawadogo and Sentenac, 1990; Conaway and Conaway, 1991).

1.5 OBJECTIVES OF THIS STUDY

Early-branching and slow-evolving organisms like T. maritima may have a higher likelihood of retaining ancestral characteristics than later-branching and more rapidly evolving organisms. Therefore, a detailed analysis of molecular sequences and biochemical data from hyperthermophilic organisms may reveal features of early evolution. The present investigation has the following objectives: (i) to provide basic data from T. 
maritima for evolutionary comparison studies; (ii) to give perspectives on evolution of translation and transcription apparatuses; and (iii) to investigate the regulation of gene expression in this hyperthermophilic organism. This thesis focuses on the biochemical and evolutionary analysis of a genomic region from T. maritima that corresponds to the E. coli rif region, which encodes several essential transcription and translation components including the GTPase-domain proteins of the ribosome. The thesis first deals with the cloning and sequencing of this region, and the genomic organization and the expression patterns of the genes located in this cloned fragment (Part III). The next part presents the function of the transcription factor NusG (Part IV). The thesis then discusses quantitative phylogenetic analysis based on the sequences of ribosomal proteins L11, L1, L10, and L12 (Part V). Each chapter begins with an introduction with expanded information particularly relevant to the content discussed, which is not detailed in the general introduction (Part I). Finally, a brief conclusion and some thoughts for the future are given.

II. Materials and methods

2.1 MATERIALS

Bacterial cell culture components: yeast extract, tryptone, and agar were purchased from Difco Laboratories; ampicillin was from Sigma Chemical Co. (Sigma); isopropyl-β-D-thiogalactopyranoside (IPTG) was from GIBCO Bethesda Research Laboratories Life Technologies (BRL); and 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-gal) was from BRL or Biosynth AG.

NTPs, dNTPs, and ddNTPs were obtained from either Pharmacia LKB Biotechnology Inc. (Pharmacia) or United States Biochemical Co. (USB, in the Sequenase kit, version 2). Radioactive [α-32P]NTPs, [α-32P]dNTPs, and [γ-32P]ATP were from Dupont NEN Research Products.

Most restriction enzymes, and DNA and RNA modifying enzymes, were purchased from Pharmacia or New England Biolabs (NEB). The E. 
coli DNA-dependent RNA polymerase was from Boehringer Mannheim; partially purified DNA-dependent RNA polymerase of Thermotoga maritima was provided by Dr. Peter Palm (Max Planck Institute, Martinsried, Germany). Modified T7 DNA polymerase (Sequenase) and shrimp alkaline phosphatase (SAP) were from USB. Proteinase K and Moloney murine leukemia virus reverse transcriptase (MMLV-RT) were from BRL. Exonuclease III and S1 nuclease were from Promega.

Acrylamide and N,N'-methylene-bis-acrylamide were purchased from Bio-Rad Laboratories (Bio-Rad); agarose (genetic technology grade) was from Schwartz/Mann Biotech. All other chemicals were purchased from Sigma and BDH. CM Sepharose CL-6B ion exchange resin was from Pharmacia.

Hybond-N nylon membrane was obtained from Amersham; nitrocellulose membrane was from Bio-Rad. Films for autoradiography (XRP-1 and XAR-5, for radioactive nucleic acid) and photography (for ethidium bromide stained DNA or RNA gels and Coomassie blue stained protein gels) were from Eastman Kodak and Polaroid, respectively.

All buffers containing Tris described below were brought to the appropriate pH by using concentrated HCl; the pH of EDTA stock solutions was adjusted with concentrated NaOH.

2.2 BACTERIAL STRAINS, PLASMIDS, AND MEDIA

The E. coli strains that were used for cloning are described in Table 2. YT (5 g yeast extract, 5 g NaCl, 8 g tryptone, per liter, pH adjusted to 7.5 with NaOH) was used for growing E. coli.

The T. maritima MSB8 and the recombinant plasmid pUC-TB4, which contains a portion of the T. maritima rpoB gene and about 1 kb of upstream sequence, were kindly provided by W. Zillig (Max Planck Institute, Martinsried, Germany). Plasmids of the pGEM series (Promega) and bacteriophage λgt10 were used for cloning and sequencing.

The T. maritima strain MSB8 was cultured at 75°C in MMS medium (Huber et al., 1986). 
It contains (per liter): 6.93 g NaCl; 1.75 g MgSO4·7H2O; 1.38 g MgCl2·6H2O; 0.16 g KCl; 25 mg NaBr; 7.5 mg H3BO3; 3.8 mg SrCl2·6H2O; 0.025 mg KI; 0.38 g CaCl2; 0.5 g KH2PO4; 0.5 g Na2S; 2 mg (NH4)2Ni(SO4)2; 15 ml trace minerals (Balch et al., 1979); 1 mg resazurin; 5 g starch; pH 6.5 (adjusted with H2SO4).

2.3 MOLECULAR BIOLOGICAL TECHNIQUES

General molecular biology experiments were performed according to protocols described in Sambrook et al. (1989) and the Promega Protocols and Applications Guide (Titus, 1991).

Table 2  E. coli strains used for cloning

Strain      Genotype
JM101       Δ(lac-proAB), supE, thi/F' lacIqZΔM15, traD36, proAB+
JM109       recA1, supE44, endA1, hsdR17, gyrA96, relA1, thi, Δ(lac-proAB)/F' traD36, proAB+ lacIq lacZΔM15
DH5αF'      Δ(lacZYA-argF), U169, endA1, recA1, hsdR17(rk- mk+), deoR, thi-1, supE44, λ-, gyrA96, relA1/F' φ80 dlacZΔM15
DH5α        Δ(lacZYA-argF), U169, endA1, recA1, hsdR17(rk- mk+), deoR, thi-1, supE44, λ-, gyrA96, relA1/F- φ80 dlacZΔM15
LE392       F-, hsdR514(rk- mk+), supE44, supF58, lacY1 or Δ(lacIZY)6, galK2, galT22, metB1, trpR55, λ-
BL21(DE3)   F-, ompT, rB- mB- (DE3 is a λ derivative that was inserted into the int gene of the chromosome of the host [BL21]. This λ fragment carries the T7 RNA polymerase gene under the control of the lacUV5 promoter.)

2.3.1 Gel electrophoresis

Agarose (genetic technology grade) slab gels (0.8% or 1%) and polyacrylamide gels were run in 0.5X TBE buffer (45 mM Tris, 45 mM boric acid, 1 mM EDTA, pH 8.2) at various voltages. The gels were run in the presence of 0.25 µg/ml ethidium bromide, or were stained with it after electrophoresis.

2.3.2 DNA restriction fragment preparation

Restricted DNA fragments were separated by gel electrophoresis; the bands were visualized by ethidium bromide staining or autoradiography (when the fragments were labeled with 32P), then excised from agarose or polyacrylamide gels. The DNA bands in agarose gels were isolated by either electroeluting into dialysis tubing, or using the Sephaglas BandPrep Kit (Pharmacia). 
The DNA bands in polyacrylamide gels were electroeluted into dialysis tubing (in 0.5X TBE or 1X AGB buffer [20 mM sodium acetate, 40 mM Tris, 1 mM EDTA, pH adjusted to 8.0 with glacial acetic acid]). The eluate was collected, extracted with phenol/chloroform (1:1 volume ratio), and precipitated with 2.5 volumes of 95% ethanol.

2.3.3 Ligation

For cohesive-end ligation, 40 fmol of plasmid vector DNA and a 1- to 3-fold molar excess of insert DNA were ligated in 10 µl of reaction mix containing 1X ligase buffer (20 mM Tris-HCl, pH 7.6, 5 mM MgCl2, 5 mM DTT, 50 µg/ml BSA) and 0.1 unit of T4 ligase (Weiss unit). The incubation was carried out at room temperature for 3 h. Blunt-end ligation was carried out with a molar ratio of vector to insert DNA of 1:4 in 1X ligase buffer with 0.5 unit of T4 ligase at room temperature for 3-5 h. One-third to one-half of the ligation mixture was used for transformation.

2.3.4 Transformation

The E. coli competent cells were prepared by the CaCl2 method for transformation. The E. coli cells were grown in YT medium to an A600 of about 0.4 (1 cm path length). The cells were then collected by centrifugation, resuspended in 50 mM CaCl2 (0.5 volume of the original culture), and incubated on ice for 40 min. The cells were then centrifuged, and resuspended in 50 mM CaCl2, 15% glycerol (in about 0.1 volume of the original culture). The cells were either used fresh or stored in small aliquots at -70°C for later use. Competent cells were gently mixed with 2-4 fmol of DNA, incubated on ice for 30 min, heat-shocked at 42°C for 45 sec, then incubated on ice for 2 min. One ml of YT medium was then added to the cells, which were then incubated with shaking at 37°C for 1 h. A 0.1 ml portion of the culture was directly plated on YT-agar containing appropriate antibiotics. 
The rest of the culture was centrifuged, resuspended in 0.2 ml of YT medium, and plated.

2.3.5 5' and 3' end-labeling of DNA fragments

Restriction DNA fragments containing recessed 3' ends were end-labeled using the Klenow fragment of E. coli DNA polymerase I and the appropriate [α-32P]dNTP (specific activity of 3000 Ci/mmol, 10 mCi/ml). For 5' end-labeling, the DNA fragments were dephosphorylated with SAP at 37°C for 1 h in a solution containing 10 mM MgCl2 and 20 mM Tris-HCl, pH 8.0. The reaction mixture was then heated for 30 min at 65°C to denature SAP. Once the mixture had cooled to room temperature, Tris-HCl (pH 8.0), DTT and spermidine were added to final concentrations of 30 mM, 5 mM, and 0.1 mM, respectively, along with T4 polynucleotide kinase (PNK) (0.1 unit) and 50 µCi [γ-32P]ATP. The mixture was incubated for 30 min at 37°C. The labeled fragments were precipitated twice with 2.5 volumes of 95% ethanol. Radioactivity was measured by Cerenkov counting.

2.3.6 5' end-labeling of oligonucleotides

Oligodeoxyribonucleotides (about 250 ng) were 5' end-labeled at 37°C for 40 min with 1 unit of T4 PNK and 50 µCi of [γ-32P]ATP in 20 µl of kinase buffer (0.1 M Tris-HCl, pH 8.0, 5 mM DTT, 10 mM MgCl2). The reaction was stopped by adding 1 µl of 0.5 M EDTA (pH 8.0), and heating at 65°C for 5 min. Carrier tRNA (8 µg) and distilled water (80 µl) were added. The labeled oligonucleotides were then precipitated twice with 2.5 volumes of 95% ethanol in the presence of 0.3 M sodium acetate, and redissolved in 20-50 µl TE buffer (10 mM Tris-HCl, pH 7.5, 1 mM EDTA).

2.3.7 Labeling DNA probes by random priming method (Feinberg and Vogelstein, 1984)

A solution containing about 0.1 pmol of DNA fragment (70 ng for a 1 kb fragment) was boiled with 5 µl of random hexadeoxyribonucleotides (about 50 A260/ml) for 5 min, then chilled on ice immediately. 
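The equivalence quoted above (about 0.1 pmol of a 1 kb fragment is roughly 70 ng) follows from the approximate molar mass of double-stranded DNA. The Python sketch below shows the arithmetic, assuming an average of about 660 g/mol per base pair; that figure is a common approximation and is not stated in this thesis, and the function names are hypothetical.

```python
# Assumed average molar mass of one base pair of double-stranded DNA (g/mol).
G_PER_MOL_PER_BP = 660.0


def dsdna_mass_ng(pmol, length_bp, g_per_mol_per_bp=G_PER_MOL_PER_BP):
    """Mass in ng of `pmol` picomoles of a dsDNA fragment of `length_bp`."""
    # pmol x (g/mol) gives picograms; divide by 1000 to obtain nanograms.
    return pmol * length_bp * g_per_mol_per_bp / 1000.0


def dsdna_pmol(mass_ng, length_bp, g_per_mol_per_bp=G_PER_MOL_PER_BP):
    """Amount in pmol corresponding to `mass_ng` nanograms of the fragment."""
    return mass_ng * 1000.0 / (length_bp * g_per_mol_per_bp)


# 0.1 pmol of a 1 kb fragment comes to about 66 ng under this assumption,
# consistent with the rounded figure of 70 ng given in the protocol.
print(round(dsdna_mass_ng(0.1, 1000), 1))
```

The same conversion applies to the femtomole quantities used for ligation and transformation once the length of the vector or insert is known.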
Five µl of 10X buffer (0.5 M Bis-tris-HCl, pH 6.6, 0.2 M NaCl, 50 mM MgCl2, 50 mM β-mercaptoethanol), 1 µl each of dGTP, dCTP, dTTP (1 mM), 5 µl of [α-32P]dATP (50 µCi) and 2 units of Klenow fragment were added, and the labeling reaction was allowed to take place at room temperature for 3 h. One µl of yeast tRNA (10 mg/ml) and 1 µl of EDTA (0.5 M, pH 8.0) were added to the reaction mixture, which was then heated at 65°C for 10 min. The labeled probe was precipitated twice with 2.5 volumes of 95% ethanol in the presence of 0.3 M sodium acetate. The pellet was dried, and the radioactivity was counted by the Cerenkov method.

2.3.8 Southern blot hybridization and cloning procedures

Genomic T. maritima DNA was isolated following a CsCl gradient centrifugation procedure (Sambrook et al., 1989). For Southern blotting, genomic DNA was digested with restriction enzymes, electrophoresed through a 0.7% agarose gel, transferred to Hybond-N membrane (Amersham), and probed with radioactive restriction fragments. The fragments were labeled with [α-32P]dATP using the random primer method (see above; Feinberg and Vogelstein, 1984). Genomic EcoRI fragments were size fractionated on a 5% polyacrylamide gel and, following electroelution, fragments of 4 to 7 kb were ligated with the two arms of λgt10. The ligated DNA was packaged with Packagene extract (Promega). The in vitro packaged phage was used to infect E. coli strain LE392. The phage library was screened by plaque hybridization with the 2.2 kb XbaI-EcoRI fragment from plasmid pUC-TB4. Phage DNA from positive plaques was digested with restriction enzymes. A 4.0 kb EcoRI-SacI fragment and a 2.2 kb XbaI-EcoRI fragment were subcloned into pGEM-7Zf(+) to yield pPD934 and pPD990. The two restriction fragments overlap for about 300 bp (between the XbaI and SacI sites).

2.3.9 DNA sequencing

Bidirectional deletions of insert DNA were constructed in plasmids pPD934 and pPD990 using exonuclease III (Henikoff, 1984). 
These deletions were used to sequence both strands of the two overlapping clones. The deletions were sequenced by the dideoxy chain termination method employing either pUC/M13 forward or reverse primers or T7 or SP6 primers. Both single- and double-stranded DNA molecules were employed as templates (Sanger et al., 1980; Zhang et al., 1988), and when necessary, 7-deaza-2'-deoxyguanosine 5'-triphosphate (c7dGTP) and 2'-deoxyinosine 5'-triphosphate (dITP) were used to resolve ambiguities caused by GC compression.

2.3.10  RNA transcript analysis

Total cellular RNA was isolated from exponential cultures of T. maritima using the boiling SDS lysis method (Dennis, 1985). Briefly, cells were rapidly cooled on ice, collected by centrifugation, resuspended, and lysed in an SDS-containing buffer (100°C, 15-30 sec); RNA was extracted with phenol, precipitated with ethanol, resuspended in TE (10 mM Tris-HCl, pH 8.0, 1 mM EDTA), and pelleted through a cushion of 5.7 M CsCl (Shimmin and Dennis, 1989). The RNA pellets were resuspended in TE and stored at -70°C until required.

Primer extension analysis was carried out essentially as described by Yang et al. (1990) with minor modifications. Briefly, RNA (5-10 µg) was precipitated with 5' end-labeled primer (1 ng; primers used are listed in Table 3), and resuspended in 20 µl of hybridization buffer (0.25 M KCl, 5 mM Tris-HCl, pH 8.3, and 0.5 mM EDTA, pH 8.0). The sample was then denatured at 85°C for 2 min, cooled gradually to 42°C, and then incubated at this temperature for at least 2 h.
Afterwards, the annealed sample was diluted to 125 µl in reaction mixture (50 mM Tris-HCl, pH 8.3, 3 mM MgCl2, 75 mM KCl, 10 mM DTT, 2 mM each of dATP, dGTP, dTTP and dCTP, 10-20 units of RNase inhibitor [Pharmacia] and 400 units of MMLV reverse transcriptase [BRL], all final concentrations). The extension reaction was allowed to take place at 42°C for 60 min. The extension products were precipitated, redissolved in gel loading buffer (98% formamide, 20 mM EDTA, 0.05% bromophenol blue and 0.05% xylene cyanol FF), and electrophoresed on an 8% polyacrylamide, 8 M urea gel. The sequencing ladders generated with the same 5' end-labeled primers were run alongside the samples on the same gel.

Table 3  Oligonucleotides

Designation  Sequence (5'-3')                                     Length  Positions¹              Transcripts or genes  Strand²

(A) Oligonucleotides for primer extension³
oD1          GTCAACCTCGAACATCTT                                   18      248-231                 tRNAmet1-tRNAmet2     antisense
oD2          AACAATACGGCCCACCAG                                   18      1160-1143               nusG                  antisense
oD3          GGCGGGTCCAACGGGTGG                                   18      2233-2216               L11                   antisense
oD4          GAAACCCAGGAAATCGG                                    17      3539-3522               L10                   antisense
oD5          CCCTCGTCCTCCTACCGC                                   18      4609-4592               β                     antisense
oD9          GTGTTTCCCGCCCCAGTATC                                 20      33-14⁴                  5S rRNA               antisense
oD10         CCGACCACCCGGTTATGAG                                  19      155-136                 tRNAmet1              antisense
oD11         ACGACACGGTGATTATGAA                                  19      338-319                 tRNAmet2              antisense
oD12         CACAACCTACTGATTACAAA                                 20      433-414                 tRNAthr               antisense
oD13         TCTGCCAGCGGATTTACAG                                  19      519-501                 tRNAtyr               antisense
oD14         CGCAACCACCGGTTTTGGAG                                 20      783-764                 tRNAtrp               antisense
oD15         TTTCTGTTGCCTGGTCAG                                   18      3464-3447               L10                   antisense
oD16         CGCTTCGATGATTTCATC                                   18      4026-4008               L12                   antisense

(B) Oligonucleotides for PCR⁵
oD6          TAGAATTCATGAAAAAAAAATGGTACGTGGTCGTTCAGACA            41      1512-1535⁶              nusG                  sense
oD7          GAGAATTCATGAAGAAAAAGTGGTAC                           26      1044-1061               nusG                  sense
oD8          ACAAGCTTTCACTCGATTTTCTCCAC                           26      2105-2088               nusG                  antisense
oD18         GTGAATTCTATGCTGACCAGGCAACAG                          27      3443-3461               L10                   sense
oD19         CAAAGCTTTCATTCAGATTTTTTCTC                           26      3983-3966               L10                   antisense
oD20         TTTCTAGAGTGAAAAAAGAATGTTGC                           26      4396-4413               β                     sense
oD21         AAAAGCTTTTGTACCACACTCTATTT                           26      4476-4459               β                     antisense
oD22         TGAGATCTCCGGAGGGTCCACAAAAAG                          27      3411-3393               ΔL10                  antisense
oD23         TAGGGCCCTTTTGTGGACCCTCCGGGG                          27      3395-3413               ΔL10                  sense
L1-5'        GGAGAATTCATGCCGAAGCACTCCAAG                          27      2602-2619               L1                    sense
L1-3'        TCGAAGCTTTTACTCTTTCAACAGACT                          27      3303-3286               L1                    antisense
L12-5'       TGTGAATTCATGACGATTGATGAAATC                          27      3999-4016               L12                   sense
L12-3'       AACAAGCTTTTACTTCAGTTCCACTTC                          27      4368-4385               L12                   antisense

(C) Oligonucleotide for site-directed mutagenesis
oD17         GTAGATTCTGATCTCTTTTCTTGTTGAAACTACCTCTTCAGGAATAACAAT  51      1178-1155 and 1715-1689 nusG                  antisense

1. The position corresponds to that in Figure 4.
2. Antisense indicates that the oligonucleotide is complementary to the coding strand or mRNA.
3. The primers oD9 to oD14 were used for Northern blot hybridization to detect tRNAs and 5S rRNA.
4. The position of the 5S rRNA corresponds to the mature form of the T. maritima 5S rRNA (Achenbach-Richter, personal communication).
5. The PCR primers have a 5' extension that contains a restriction endonuclease recognition sequence and two more bases at the 5' end; the position indicated does not include the extension. The restriction endonuclease recognition sequences are: GAATTC, EcoRI; AAGCTT, HindIII; TCTAGA, XbaI; GGGCCC, ApaI.
6. The sequence TGGTAC (coding for Trp and Tyr, respectively) was inserted in oD6; the positions indicated do not include these six bases.

The 3' and 5' ends of in vivo tRNA and mRNA transcripts were analyzed by S1 nuclease protection analysis as previously described (Favaloro et al., 1980; Dennis, 1985; Downing and Dennis, 1987). The appropriate DNA fragment was 5' end-labeled with T4 polynucleotide kinase and [γ-32P]ATP, or 3' end-labeled with the Klenow fragment of DNA polymerase I and the appropriate [α-32P]dNTP (see above). Total cellular RNA (5-10 µg) was hybridized with the end-labeled DNA fragment in 20 µl of hybridization buffer (80% formamide, 40 mM PIPES, pH 6.8 [the pH was adjusted with concentrated NaOH], 0.4 M NaCl, and 1 mM EDTA) at 50°C for 3 h.
Hybrids were treated with S1 nuclease (400 units/ml; 34°C for 30 min). The products were precipitated, resuspended, and electrophoresed on an 8% polyacrylamide, 8 M urea gel along with end-labeled DNA length markers. The length markers were generated by restriction enzyme digestion of plasmid DNA of known sequence; fragments were 3' end-labeled as described above.

2.3.11  Northern blotting

Ten to 20 µg of total cellular RNA were denatured at 70°C for 10 min in the denaturing buffer (50 mM MOPS, 1 mM EDTA, 0.66 M formaldehyde and 40% [v/v] formamide), electrophoresed through a 1.5% agarose horizontal slab gel made up in the denaturing buffer, and transferred to Hybond-N membranes by blotting with 20X SSC (3 M NaCl, 0.3 M trisodium citrate) or 20X SSPE (3.6 M NaCl, 0.2 M sodium phosphate, 2 mM EDTA, pH 7.7). The membranes were prehybridized for 1 to 5 h in hybridization solution (5X SSPE, 5X Denhardt's solution [100X Denhardt's solution contains 2% BSA, 2% Ficoll (Pharmacia) and 2% polyvinylpyrrolidone (Pharmacia)], and 0.5% SDS) without probe, and then hybridized at 48°C overnight in the same solution with radioactive DNA probes (generated by the random priming method, as described above). The membranes were then washed twice with 2X SSPE containing 0.1% SDS at room temperature for 15 min. Two subsequent washes were done with 1X SSPE, 0.1% SDS for 10 min, and 0.25X SSPE, 0.1% SDS for 10 min, respectively. The membranes were sealed in plastic bags, and exposed to Kodak XAR-5 film with an intensifying screen.

2.4 EXPRESSION OF T. MARITIMA NUSG IN E. COLI

The E. coli strains JM101, JM109 and BL21(DE3)pLysS (Studier et al., 1990) were used for cloning and expression of the NusG-encoding gene of T. maritima. The nusG gene was amplified by the polymerase chain reaction (PCR) using primers oD7 and oD8 (Table 3).
The 5' primer oD7 contains the first 6 codons of nusG and an eight-nucleotide extension at its 5' end containing the EcoRI recognition sequence upstream of the ATG initiation codon. The 3' primer oD8 is complementary to the last 6 codons of nusG, and the eight-nucleotide extension at the 5' end of oD8 has a HindIII recognition site and two extra bases at the 5' end. The PCR was carried out in a 100 µl mixture containing 10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.0 mM MgCl2, 50 µM dNTP, 100 µg/ml gelatin, 40 pmol of each primer, 2.5 units of Taq DNA polymerase (Pharmacia) and 10 pg of plasmid pPD990, which contains the T. maritima nusG (Part III; Liao and Dennis, 1992). The reaction cycles were: cycle 1, 90 sec at 94°C, 30 sec at 55°C, and 90 sec at 72°C; cycles 2 to 31, 15 sec at 94°C, 30 sec at 55°C, and 90 sec at 72°C; cycle 32, 5 min at 72°C. The amplified DNA fragment was gel purified, digested with EcoRI and HindIII, and cloned into the EcoRI and HindIII sites of pGEM-7Zf(+). The amplified sequences from several clones were then checked by DNA sequencing. The EcoRI-HindIII fragment with the correct NusG-coding sequence was cloned into the expression vector pKK223-3 (Pharmacia) to give plasmid pPD1077. To clone the NusG-coding sequence into the T7 expression vector pET-3a (Studier et al., 1990), pET-3a was cut with NdeI and the nusG-containing pGEM-7Zf(+) was cut with EcoRI, and these sites were rendered blunt by treatment with nuclease S1 and the Klenow fragment of E. coli DNA polymerase I. These plasmids were then cut with BamHI. The linearized vector pET-3a and the nusG-containing fragment were recovered from the gel and ligated together to yield plasmid pPD1078. The E. coli strains JM101 and JM109 were transformed with pPD1077, and BL21(DE3)pLysS was transformed with pPD1078. These transformants exhibited a high level of expression of the NusG protein of T. 
maritima.

2.5 PURIFICATION OF NUSG

The strain JM109 harboring pPD1077 or BL21(DE3)pLysS harboring pPD1078 was grown at 37°C in 2 liters of YT medium containing 100 µg/ml ampicillin. When the absorbance of the culture at 600 nm reached between 0.6 and 1.0, IPTG was added to each culture to a final concentration of 0.4 mM. Each culture was grown for an additional 3 h, and the cells were harvested at 4°C by centrifugation. The cells were washed with buffer A (50 mM Tris-HCl, pH 8.0, 0.35 M NaCl, 10 mM MgCl2 and 1 mM EDTA) and resuspended in 50 ml of buffer A. The cell suspensions were sonicated at 1 min intervals in an ice/water mixture for 8 min (the cells of BL21(DE3)pLysS/pPD1078 can also be lysed by a freezing-thawing cycle). The cell lysates were centrifuged at 27,000 × g for 25 min to remove cell debris. Then, 1 M NaCl was added to the cleared cell lysates to a final concentration of 0.5 M. Streptomycin sulfate (20%, w/v) was added slowly to the cell lysates with stirring on ice to a final concentration of 4%. The cell lysates were stirred for a further 15 min on ice. The solutions were then centrifuged at 18,800 × g for 10 min. The supernatants were heated at 75°C for 30 min, and centrifuged at 10,800 × g for 15 min. Solid ammonium sulfate was added to the supernatants to a final concentration of 24% (w/w), and the mixtures were stirred on ice for 20 min and then centrifuged at 15,900 × g for 15 min. The resulting pellet was dissolved in 15 ml of buffer B (25 mM sodium phosphate, pH 7.0, 50 mM NaCl), and dialyzed overnight at 4°C against the same buffer with several changes of buffer. After dialysis, the solution was clarified by centrifugation to remove insoluble materials. The cleared protein solution was applied to a column (2.0 × 15 cm) of CM Sepharose CL-6B (Pharmacia) which had been equilibrated with buffer B. The column was washed with 10 column volumes of buffer B, and eluted with a linear NaCl gradient of 50-300 mM in 25 mM sodium phosphate buffer, pH 7.0.
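The stock additions earlier in this procedure (1 M NaCl to a final 0.5 M; 20% streptomycin sulfate to a final 4%) follow from a simple mass balance. The sketch below is illustrative only: it assumes a cleared lysate of 50 ml that still carries the 0.35 M NaCl of buffer A and ignores any volume lost with the cell debris; the function name is hypothetical.

```python
def stock_volume(v_start, c_start, c_stock, c_final):
    """Volume of stock to add to v_start so that the mixture reaches c_final.

    Mass balance: c_start*v_start + c_stock*v_add = c_final*(v_start + v_add).
    Concentrations may be in any unit as long as all three share it.
    """
    if not (c_start < c_final < c_stock):
        raise ValueError("expected c_start < c_final < c_stock")
    return v_start * (c_final - c_start) / (c_stock - c_final)


# Assumption: 50 ml of lysate at 0.35 M NaCl (buffer A), adjusted with 1 M NaCl.
v_nacl = stock_volume(50.0, 0.35, 1.0, 0.5)

# Assumption: no streptomycin present initially; 20% (w/v) stock, 4% final.
v_strep = stock_volume(50.0 + v_nacl, 0.0, 20.0, 4.0)

print(round(v_nacl, 2), round(v_strep, 2))  # 15.0 16.25
```

Under these assumptions about 15 ml of 1 M NaCl and about 16 ml of the streptomycin sulfate stock would be needed; the actual volumes depend on the recovered lysate volume.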
The column profile was obtained by plotting the A280 of each fraction against fraction number. The fractions across the peak were analyzed by SDS-PAGE (Laemmli, 1970). The fractions containing NusG were pooled and concentrated 10- to 15-fold with a Centriprep-10 (Amicon, Beverly, MA). The concentrated NusG solution was dialyzed overnight at 4°C against buffer B with several changes of buffer. Unless otherwise specified, all the purification steps described above were carried out at room temperature. The protein concentration was determined using the BCA protein assay reagent and the procedure recommended by the manufacturer (Pierce, Rockford, IL).

2.6 IMMUNIZATION PROCEDURE

For primary immunization, four white rabbits were used; each was injected subcutaneously with 0.3 mg of the purified NusG in 0.3 ml of phosphate-buffered saline (PBS) emulsified in an equal volume of Freund's complete adjuvant. After 4 weeks, each rabbit was given an intramuscular booster injection of 0.2 mg of protein in PBS mixed with an equal volume of Freund's incomplete adjuvant. Two weeks later, the antisera were collected.

2.7 ENZYME-LINKED IMMUNOSORBENT ASSAY (ELISA) AND WESTERN BLOTTING

Antisera were tested for the presence of specific antibodies against NusG using ELISA. Purified NusG protein (30 ng) was adsorbed onto the wells of a microtiter plate (MicroTest III assay plate, Becton Dickinson, Oxnard, CA) by drying overnight at 37°C. The plate was then washed three times with 0.05% Tween 20 in PBS, and blocked for 1 h with 1% gelatin in PBS. Antisera were diluted 50-fold in 0.5% gelatin-PBS, added to the first well of each row, and serially diluted in each adjacent well. After incubation at room temperature for 1 h, secondary antibody (goat anti-rabbit immunoglobulin G-horseradish peroxidase conjugate [BRL]), diluted 3000-fold in 0.1% gelatin-PBS, was added to each well and incubated for 1 h. After washing with 0.05% Tween 20-PBS, 100 µl of substrate mixture was added to each well.
The substrate mixture contained 1 mg/ml of 2,2'-azino-bis(3-ethylbenzthiazoline-6-sulfonic acid) (Sigma) in substrate buffer (0.1 M Na2HPO4, 80 mM citric acid, pH 4.0, and 0.01% H2O2). After color development, the plate was read at 405 nm using a BIO TEK microplate reader (Mandel Scientific, Guelph, Ontario).

For Western blotting, the proteins were separated by SDS-PAGE and electrotransferred to a nitrocellulose membrane (Bio-Rad). NusG was detected with the antiserum raised against it (diluted 1:30,000 with 0.1% gelatin, 0.05% Tween-20 in TBS [20 mM Tris-HCl, 0.5 M NaCl, pH 7.5]). Membranes were blocked with 3% gelatin in TBS for 1 h, and washed twice with washing buffer (0.05% Tween 20 in TBS). The membranes were then incubated with primary antibodies for 2 to 16 h. After washing twice with washing buffer, the membranes were incubated for 1 h with secondary antibodies (goat anti-rabbit immunoglobulin G-alkaline phosphatase conjugate [BRL]), which were diluted 3,000-fold in 1% gelatin, 0.05% Tween-20 in TBS. The membranes were washed successively with washing buffer, TBS, and substrate buffer (0.1 M Tris-HCl, 0.1 M NaCl, 50 mM MgCl2, pH 9.5); the immunoreactive proteins were then detected by incubating the membranes in the development solution (0.225 mg/ml BCIP [5-bromo-4-chloro-3-indolyl-phosphate p-toluidine salt] and 0.22 mg/ml NBT [nitroblue tetrazolium chloride], which were first solubilized in dimethylformamide [BRL], then diluted in substrate buffer).

2.8 DNA BAND-SHIFT ASSAYS

Double-stranded plasmid DNA and single-stranded M13 derivatives were purified by ethidium bromide-CsCl ultracentrifugation (Sambrook et al., 1989). Plasmids were digested with restriction enzymes and linear fragments were isolated from agarose gels.
Table 4 lists the sizes and the origins of the duplex fragments used in the experiments described in Part IV.

For the band-shift assays, purified NusG protein was mixed with DNA and incubated at the stated temperatures for the indicated period of time. The standard assay was 65°C for 2 h. Unless otherwise indicated, a typical buffer was 33 mM NaCl and 17 mM sodium phosphate, pH 7.0. When necessary, binding reactions were terminated by freezing in a dry-ice ethanol bath and thawing at 0°C. The samples were mixed with 5 µl of electrophoresis loading solution (50% glycerol [v/v], 0.2% bromophenol blue, 0.2% xylene cyanol), and electrophoresed in either agarose or polyacrylamide gels in 0.5X TBE. The gels were stained with ethidium bromide. In some experiments, 3' or 5' end-labeled DNA fragments were employed; complexes were visualized by autoradiography of the electrophoresis gels.

Table 4  Duplex DNA fragments used for DNA band-shift assays

Fragment         Size (kb)  Source    Description                                        Reference
XbaI-HindIII     5.2        pUC-TB4   Sequence that encodes part of the β subunit        W. Zillig, personal
                                      of the T. maritima RNA polymerase                  communication
HindIII-HindIII  3.5        pGEM-L12  Linearized plasmid pGEM-7Zf(-) that contains       Liao, unpublished
                                      the T. maritima L12-encoding gene
HindIII-EcoRI    3.0        pGEM-L1   The larger HindIII-EcoRI fragment of the plasmid   Liao, unpublished
                                      pGEM-7Zf(-) containing the T. maritima
                                      L1-encoding gene
HindIII-EcoRI    0.7        pGEM-L1   The smaller HindIII-EcoRI fragment of the above    Liao, unpublished
                                      plasmid (the T. maritima L1-encoding sequence)

2.9 ISOLATION OF T. MARITIMA RIBOSOMES

The T. maritima cells were suspended gently in buffer I (10 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 6 mM β-mercaptoethanol, 30 mM NH4Cl), and broken by two passes through a French press. DNase I (RNase-free, 200 µg/ml) was added to the cell extract and incubated at 0°C for 10 min. The cell extract was then centrifuged at 15,000 × g for 30 min to remove cell debris. The supernatant was centrifuged twice for 5 h at 248,000 × g with a 50 Ti rotor.
The pellet was dissolved in buffer I, and the absorbance of the solution at 260 nm was measured. Two ml portions of the solution were layered on 6-30% (w/v) linear sucrose gradients (35 ml) made up in buffer I, and centrifuged at 48,000 × g for 15 h at 10°C. The gradients were fractionated and the absorbance of each fraction at 260 nm was monitored continuously. The fractions containing the 70S ribosomes, and the 50S and 30S ribosomal subunits, were pooled separately and dialyzed against buffer II (same as buffer I, but with only 0.3 mM MgCl2) at 4°C overnight with several changes of buffer II. The ribosomes and the subunits were recovered by adding 1 volume of 95% ethanol (the concentration of MgCl2 was raised to 10 mM before adding ethanol) and centrifuging at 10,800 × g for 20 min at 4°C. The pellets were dried and resuspended in a small volume of buffer II.

2.10 IN VITRO TRANSCRIPTION

Various DNA templates (about 0.1 pmol) containing a T. maritima promoter were incubated with T. maritima RNA polymerase (0.2 pmol, about 0.1 µg) in 5 µl of buffer containing 50 mM Tris-HCl (pH 9.0 at 25°C) and 50 mM NaCl at 75°C for 10 min to form the binary complex. The binary complex was then added to reaction mixture (preheated at 75°C for 1 min) containing 1 mM each of ATP, GTP and CTP, 0.4 mM UTP, 80 nM [α-32P]UTP (about 10 µCi), 6 mM MgCl2, 50 mM Tris-HCl (pH 9.0 at 25°C), 0.05 µg/µl BSA, 12.5 mM sodium phosphate (pH 7.0 at 25°C) and 37 mM NaCl (all at final concentration) in a total reaction volume of 20 µl, and incubated at 75°C for 20 min. When NusG was included in the transcription assay, different amounts of the protein (between 0.01 µg and 10 µg) were used. The transcription was stopped by adding 20 µl of stop buffer (0.6 M sodium acetate, 0.1 M EDTA, 0.2 mg/ml yeast RNA) and 60 µl of distilled water. The reaction mixture was extracted once with phenol/chloroform (1:1 v/v), and then the RNA was precipitated with 2.5 volumes of 95% ethanol.
The precipitated RNA was dried, and resuspended in 10 µl of RNA loading buffer (90% deionized formamide, 50 mM Tris-HCl, pH 8.0, 1 mM EDTA, 0.025% xylene cyanol, 0.025% bromophenol blue). The transcripts were analyzed by electrophoresis through an 8% polyacrylamide, 8 M urea gel.

2.11 MOLECULAR SEQUENCES

The molecular sequences (nucleotide sequences and/or amino acid sequences) for ribosomal proteins L11, L1, L10 and L12 were obtained from sequence data banks (EMBL, GenBank and Swiss-Prot data banks) associated with the GeneWorks package (IntelliGenetics, Inc., Mountain View, CA). Sequences unavailable from the data banks were obtained from the literature. The abbreviations used as organism identifiers in sequence alignments and phylogenetic trees and the reference for each sequence are listed in Table 5.

2.12 SEQUENCE ALIGNMENTS

The amino acid sequences of ribosomal proteins L11, L1, L10 and L12 from the eubacteria, the archaebacteria and the eukaryotes were aligned using the alignment algorithm in the GeneWorks package. The resulting alignments were visually inspected to minimize the alignment gaps and to maximize amino acid identities. In the cases of ribosomal proteins L10 and L12, the previous evolutionary models were consulted in order to preserve predicted structural features (Shimmin et al., 1989). The L12 alignments center on the conserved arginine-tryptophan residue at position 88. When required for analysis, nucleotide sequence alignments colinear with the depicted amino acid sequence alignments were used. The consensus of each sequence alignment was determined visually by a flexible majority rule, in which chemically similar amino acid residues at each alignment position were taken into consideration. For example, at position 279, in the five archaebacterial L10 proteins there are two Ds, one E, one K, and one T.
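The flexible majority rule can be sketched in code, using the five residues at position 279 listed above as input. The consensus in this work was determined visually, so the following is only an illustration; the similarity groups and the function names are assumptions and are not taken from the thesis.

```python
from collections import Counter

# Hypothetical groups of chemically similar residues (an assumption for this sketch).
SIMILAR_GROUPS = [
    frozenset("DE"), frozenset("KRH"), frozenset("ST"), frozenset("NQ"),
    frozenset("ILVM"), frozenset("FYW"), frozenset("AG"),
]


def group_of(residue):
    """Return the similarity group containing the residue, or a singleton group."""
    for group in SIMILAR_GROUPS:
        if residue in group:
            return group
    return frozenset(residue)


def flexible_consensus(column):
    """Pick the most frequent residue within the most populated similarity group."""
    residue_counts = Counter(column)
    group_counts = Counter()
    for residue, count in residue_counts.items():
        group_counts[group_of(residue)] += count
    best_group = max(group_counts, key=group_counts.get)
    members = {r: n for r, n in residue_counts.items() if r in best_group}
    return max(members, key=members.get)


# Position 279 of the archaebacterial L10 alignment: two D, one E, one K, one T.
print(flexible_consensus("DDEKT"))  # D
```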
Because of the chemical similarity between D and E, D was chosen as the consensus residue, even though it does not represent the majority residue at this position.

Table 5  Organisms and their abbreviations from which the sequences of the ribosomal proteins L11, L1, L10 and L12 are available

Organism                           Abbreviation  Proteins¹                           Reference

Eubacteria
Bacillus stearothermophilus        Bst           L1                                  Kimura et al. (1985)
                                                 L12                                 Garland et al. (1987)
Bacillus subtilis                  Bsu           L12                                 Itoh and Wittman (1979)
Desulfovibrio vulgaris             Dvu           L12                                 Itoh and Otaka (1984)
Escherichia coli                   Eco           L11, L1, L10, L12                   Post et al. (1979)
Haloanaerobium praevalens          Hpr           L12                                 Matheson et al. (1987)
Halophilic eubacterium NRCC 41227  Heu           L12                                 Falkenberg et al. (1985)
Micrococcus lysodeikticus          Mly           L12                                 Itoh (1981)
Proteus vulgaris                   Pvu           L11, L1                             Sor and Nomura (1987)
Rhodopseudomonas sphaeroides       Rsp           L12                                 Itoh and Higo (1983)
Serratia marcescens                Sma           L11, L1                             Sor and Nomura (1987)
Salmonella typhimurium             Sty           L10, L12                            Paton et al. (1990)
Spinacia oleracea (chloroplast)    Sol(c)        L12                                 Bartsch et al. (1982)
                                                 L11                                 Smooker et al. (1991)
Streptomyces griseus               Sgr           L12                                 Itoh (1982)
Streptomyces virginiae             Svi           L11                                 Okamoto et al. (1992)
Synechocystis sp. PCC 6803         Sec           L10, L12                            Sibold and Subramanian (1990)
Thermotoga maritima                Tma           L11, L1, L10, L12                   Liao and Dennis (1992)

Eukaryotes
Artemia salina                     Asa           L12II (eL12'), L12I (eL12)          Amons et al. (1979; 1982)
Dictyostelium discoideum           Ddi           L10 (P0)                            Prieto et al. (1991)
Drosophila melanogaster            Dme           L10 (P0)                            Kelley et al. (1989)
                                                 L12II (rp21C), L12I (rpA1)          Wigboldus (1987); Qian et al. (1987)
Gallus gallus                      Gga           L12II (P1)                          Ferro and Reinach (1988)
Homo sapiens                       Hsa           L10 (P0), L12II (P1), L12I (P2)     Rich and Steitz (1987)
Mus musculus                       Mmu           L10 (P0)                            Krowczynska et al. (1989)
Rattus norvegicus                  Rno           L10 (P0)                            Chan et al. (1989)
Rattus rattus                      Rra           L12II (P1), L12I (P2)               Wool et al. (1990)
                                                 L11 (L12)                           Suzuki et al. (1990)
Saccharomyces cerevisiae           Sce           L10 (P0), L12IA, L12IB,             Newton et al. (1990); Mitsui and Tsurugi (1988);
                                                 L12IIA, L12IIB                      Remacha et al. (1988)
                                                 L11 (L15)                           Pucciarelli et al. (1990)
Schizosaccharomyces pombe          Spo           L12I (A4), L12IB (A2),              Beltrame and Bianchi (1990)
                                                 L12II (A1), L12IIB (A3)
Trypanosoma cruzi                  Tcr           L12I (P2)                           Schijman et al. (1991)
Tetrahymena thermophila            Tth           L12II (L37)                         Hansen et al. (1991)

Archaebacteria
Halobacterium cutirubrum           Hcu           L11, L1, L10, L12                   Shimmin et al. (1989)
Halobacterium halobium             Hha           L11, L1, L10, L12                   Itoh (1988)
Haloarcula marismortui             Hma           L11, L1, L10, L12                   Arndt and Weigel (1990)
Haloferax volcanii                 Hvo           L11, L1, L10, L12                   Shimmin and Dennis (unpublished data)
Methanococcus vannielii            Mva           L1, L10, L12                        Baier et al. (1990)
Sulfolobus acidocaldarius          Sac           L12                                 Matheson et al. (1989)
Sulfolobus solfataricus²           Sso           L11, L1, L10, L12                   Ramirez et al. (1989)

1. The protein designations used in this thesis are based on the sequence similarity to the E. coli L11, L1, L10 and L12 proteins. The original nomenclatures are given in parentheses.
2. Recent data indicate that the organism used to clone these ribosomal protein genes was actually S. acidocaldarius and not S. solfataricus (Durovic, 1993b). Nonetheless, we have here retained the species designation of Ramirez et al. (1989).

2.13 PHYLOGENETIC RECONSTRUCTION

Parsimony analysis of the aligned amino acid sequences, using the heuristic and/or branch and bound tree search options, and bootstrap analysis were carried out using PAUP (Swofford, 1993). When the heuristic tree search option was used, random addition of sequences with 10 replications was used to generate the parsimony tree. For bootstrap analysis of the L12 alignments, random addition of sequences with one replication was used because of limitations in computing capacity. The tree bisection-reconnection (TBR) algorithm was used in the heuristic tree searches (Swofford, 1993). Distance matrix methods were also employed to construct distance matrix trees using the DNADIST, FITCH, KITSCH, and NEIGHBOR programs in the PHYLIP package (Felsenstein, 1991).

III. 
The organization and expression of essential transcription, translation component genes in the hyperthermophilic eubacterium Thermotoga maritima

3.1 INTRODUCTION

Living organisms derive from a common primordial ancestor and divide into three easily recognizable kingdoms or lineages: the eubacteria, the archaebacteria and the eukaryotes (Woese and Olsen, 1986; Woese et al., 1990). In spite of this superficial understanding, our knowledge of the molecular features of the common ancestor and of the precise origins of, and relationships between, the three surviving kingdoms or lineages remains obscure (Woese, 1987; Woese and Olsen, 1986). Using an elegant approach involving duplicated gene sequences, Iwabe et al. (1989) and Gogarten et al. (1989) suggested that the primordial ancestor, represented by the root of the universal phylogenetic tree, falls closest to the eubacterial domain and that the archaebacteria and eukaryotes were derived from a later splitting of a second independent lineage.

Thermotoga maritima is an anaerobic and extremely thermophilic eubacterium that has been isolated from geothermal ocean floor locales (Huber et al., 1986). Phylogenetic sequence analysis of 16S rRNA, and of elongation factors Tu and G, indicates that T. maritima is slowly evolving and is a representative of the deepest branches within the eubacterial lineage (Achenbach-Richter et al., 1987; Bachleitner et al., 1989; Tiboni et al., 1991). These features (deep branching and slow evolution) make it more likely that T. maritima has retained ancestral characteristics that might tend to be lost in later and more rapidly evolving branches. Characterization of the molecular features of T. maritima can potentially reveal information about the common ancestor and its relationship to eubacteria and possibly also to archaebacteria and eukaryotes.

In this study, we have chosen to characterize a segment of the T. 
maritima genome that encodes equivalents to the Escherichia coli L11, L1, L10 and L12 large subunit ribosomal proteins. The analysis of these particular genes and the proteins they encode is judicious for a number of reasons. Their functional activities in protein synthesis are universally conserved and have been well characterized, and for L10, L11 and L12, amino acid sequences are available from several representative species within each of the three kingdoms (for a review see Shimmin et al., 1989). In E. coli and other organisms, a single copy of L10 and four copies of L12 assemble along with L11 to form a distinct stalk on the 50S subunit of the ribosome; this complex functions in factor binding and GTPase activities during the protein synthesis cycle (Strycharz et al., 1978; Egebjerg et al., 1990; Ryan et al., 1991). Protein L1 binds to 23S rRNA to form a shoulder opposite the stalk on the 50S subunit and functions to stabilize peptidyl-tRNA binding to the P site of the ribosome (Lake and Strycharz, 1981; Draper, 1990).

The transcriptional and autogenous translational regulation of the L11, L1, L10 and L12 genes and proteins has been extensively studied in E. coli (Fiil et al., 1980; Baughman and Nomura, 1983; Christiansen et al., 1984; Lindahl and Zengel, 1986; Jinks-Robertson and Nomura, 1987; Downing and Dennis, 1987, 1991). The four ribosomal protein genes along with the RNA polymerase β and β' subunit genes form a complex operon that is transcribed from two promoters, PL11 and PL10. The L12-β intergenic space contains a transcription attenuator that plays an important role in regulating the expression of the β and β' RNA polymerase subunit genes. In addition, the proximal ribosomal protein transcripts contain two well-characterized sites used for autogenous translational regulation. The first is a mimic of the L1 binding site in 23S rRNA and is located immediately in front of the L11 translation initiation codon.
A deficiency in 23S rRNA production allows L1 protein to bind to the mRNA and block translation of the L11 and L1 cistrons. The long L1-L10 intergenic space contains a second control region which binds L10 (or L10-L12 complex); protein binding is believed to switch the conformation of the mRNA to a structure which exhibits greatly reduced translational efficiency.

The region upstream of the operon encoding ribosomal proteins and RNA polymerase in E. coli is occupied by four tRNA genes, tufB (one of two genes encoding the translation elongation factor Tu), and the short secE-nusG operon encoding two essential proteins involved respectively in protein export and in transcription termination-antitermination (An and Friesen, 1980; Schatz et al., 1990; Downing et al., 1990; Sullivan et al., 1992; Linn and Greenblatt, 1992). Our analysis indicates that T. maritima lacks a tufB gene in this region and that five tRNA genes, secE and nusG are cotranscribed with four genes for the ribosomal proteins L11, L1, L10 and L12. The downstream RNA polymerase genes are transcribed separately.

3.2 RESULTS AND DISCUSSION

The rif region of the E. coli chromosome contains a cluster of essential genes that encode components of the transcription-translation apparatus (Lindahl et al., 1975). Included are genes for the large ribosomal subunit proteins L11, L1, L10 and L12 and the β and β' subunits of RNA polymerase. To identify the region in the T. maritima genome that encodes the equivalent large subunit ribosomal proteins, genomic DNA was digested and probed by Southern hybridization with the 2.2 kb EcoRI fragment from E. coli (see Figure 3). At medium stringency, the probe hybridized to a single 5.8 kb EcoRI fragment.

The T. maritima 5.8 kb fragment was shown to contain a single XbaI site located 2.2 kb from one end. The same 2.2 kb XbaI-EcoRI fragment was identified as the terminal part of a larger 5.0 kb XbaI-HindIII genomic fragment present in the recombinant plasmid pUC-TB4.
This 5.0 kb insert fragment was known to encode the amino terminal portion of the RNA polymerase β subunit protein (W. Zillig, personal communication).

Figure 3  Structure and organization of the L11, L1, L10 and L12 encoding regions from the E. coli and T. maritima genomes
(Diagram not reproduced; nucleotide scale in kilobases.)
A. The structure of a 6-kb portion of the rif region at 89 min on the E. coli chromosome is depicted. Genes are shown as solid boxes and intergenic spaces are blank. The tRNA genes are identified as follows: TU and TT are non-identical tRNAthr genes; Y is a tRNAtyr gene and G is a tRNAgly gene. The 2.2-kb EcoRI fragment overlapping the L11, L1, L10 and L12 ribosomal protein genes was used to probe genomic T. maritima DNA.
B. The structure of the corresponding 5.8-kb portion of the T. maritima genome is depicted. The tRNA gene designations are: M1 and M2 are non-identical tRNAmet genes; T is a tRNAthr gene; Y is a tRNAtyr gene and W is a tRNAtrp gene. The secE and nusG genes, and the L33, L11, L1, L10 and L12 ribosomal protein genes, are the equivalents or homologues of the corresponding E. coli genes. Some restriction enzyme sites used for generating probes and their positions within the nucleotide sequence are: E, EcoRI (positions 1, 5783); C, ClaI (1356, 2007); D, DraI (3070); X, XbaI (3567); S, SacI (3858) and H, HindIII. Restriction fragments that have been cloned into λ or plasmid vectors are indicated.

These Southern hybridization experiments therefore suggested that some or all of the L11, L1, L10 and L12 equivalent ribosomal protein genes are, as in E. coli, located proximal to the β subunit gene of the RNA polymerase in the T. 
maritima genome.Using the 2.2 kb XbaI-EcoRI fragment from pUC-TB4 as a probe, we clonedthe genomic 5.8 kb EcoRI fragment in Xgt10, but we were unable to subclone thisEcoRI fragment into a number of different plasmid vectors. However, from therecombinant Xgt10, the 2.2 kb XbaI-EcoRI and the overlapping 4.0 kb EcoRI-SacIfragments were isolated and subcloned to give plasmids pPD934 and pPD990,respectively. The complete nucleotide sequences of the overlapping 4.0 kb EcoRI-SacI and 2.2 kb XbaI-EcoRI fragments yielded the sequence of the entire 5788nucleotide long genomic EcoRI fragment (Figure 4). The sequence contains fivetRNA genes, two short open reading frames encoding the equivalent ribosomalprotein L33 and the secE genes, respectively, a long open reading frame designatednusG, four genes encoding the equivalents of the E. coli L11, L1, L10 and L12 largeribosomal subunit proteins and as expected, the 5' portion of the open readingframe encoding the equivalent of the 0 subunit protein of the E. coli RNApolymerase.Comparison of the content and location of genes between E. coli and T.maritima reveals both similarities and differences (Post et al., 1979; Downing et al.,1990; An and Friesen, 1980). First, the tRNAthr and tRNAtYr genes of T. maritimahave the same anticodon as the thrT and tyrU genes of E. coli; the other tRNAgenes show no correspondence (Figure 3). Second, the ribosomal protein L33 geneis located between genes for tRNAtYr and tRNAtrP, whereas in E. coli, theequivalent gene (rpmG) is clustered with rpmB encoding ribosomal protein L28;this gene cluster is located near 80 minute in the E. coli chromosome, about 45 kb 49Figure 4^Nucleotide sequence of the T. maritima 5.8 kb EcoRI genomicfragmentThe sequence of the 5788-nucleotide-long EcoRI fragment from the genomeof T. maritima is illustrated. The five putative promoters P1, P2, PL10, PL12, and Ppare indicated above the sequence; the major start sites are denoted (•). 
Restrictionsites used for transcript mapping studies are indicated above the sequence. Theposition of the five tRNAs (o, anticodon• • •) and the predicted amino acidsequences of the proteins encoded by genes on the fragment are depicted below thenucleotide sequence. Translation initiation sequences complementary to the 3'end of 16S rRNA are underlined.50EcoRI^20^ 40^ 60^ 80^ P1 •^ 120GAATTCTCGGATATTTTACGAGCATTTCCTTGATGGGATCTTTCTTCATGCTGATCACACTCCTTGACAACGGGGTTTTGTTAGAATATAATCTGATAGCGGTGTGGGCTCGTAGCTCAG^tRNAmet1^00000000000000000MspI 140 Mspi 160 Aval 180 200 220 240TTGGCAGAGCGCCCGGCTCATAACCGGGTGGTCGGGGGTTCGAATCCTCCCGAGCCCACCAGTTCCTGAAGGAGAGCACGGCTCTCCTTATTATTTTAACACATCGTTCAAAGATGTTCG000000000000000000....0000000000000000000000000000000000000000260^ P2 •^ 300^ 320^ 340^ 360AGGTTGACAAAGAAAAGCTCTGATAGTAAAATTAATGAACGGTCTTGGGCGGCGTAGCTCAGCGGCGAGAGCGGGTGATTCATAATCACCGTGTCGTGGGTTCGAGTCCCACCGCCGCCAtRNAmet2^000000000000000000000000000000000...0000000000000000000000000000000000000380^ MspI 420^ 440^ 460^ AvalTAGGTCATCGGAAAGGAAATAGGGCCAGCGTAGCTCAACCGGTAGAGCGACTGATTTGTAATCAGTAGGTTGTGGGTTCGAGTCCCACCGCTGGCTCCAAAAGTATGTGGTGGGGTGccCtRNAthr 00000000000000000000000000000000000•••0000000000000000000000000000000000000000^00000000000000tRNAtyr500^ 520^ 540^ 560^ 580 600GAGTGGCCAAAGGGGGCGGACTGTAAATCCGCTGGCAGAATCTTCGGAGGTTCAAATCCTCCCCCCACCACCAGATTTTTTGAGAAAGGGTGGAAGATATGCGAGTGAAAGTGGCTCTGA0000000000000000000000...000000000000000000000000000000000000000000000000 L33:50aa;MW=5744; pI=9.9MRVKVAL620^ 640^ 660^ 680^ 700^ 720AATGTTCTCAGTGCGGTAACAAGAACTACTACACCACAAGGAACAAGGACAAAAGAGCAAAGCTCGAACTGAGAAAGTACTGCCCAAAGTGCAACGCCCACACGATTCATACCGAAACGAK CSQCGNKNYYTTRNKDKRAKLELRKYCPKCNAHTIHTET740^ MspI^MspI 780^ MspI^ 820^ 840AAGCGTAATCGCAGGGCCGTAGCTCAACTGGTAGAGCGCCGGTCTCCAAAACCGGTGGTTGCGGGTTCGAGTCCTGCCGGCCCTGCCATTTTTTGATCTGAGGGGGCATCGAGAATGGAGK A^00000000000000000000000000000000000-0000000000000000000000000000000000000000 SecE:65aa;MW=7314; pI=9.9 M EtRNAtrp860^ 880^ 900^ 920^ 940^ 
960AAACTCCGAAAGTTCTTCAGGGAAGTCATCGCCGAAGCAAAGAAAATTTCCTGGCCCTCCCGAAAGGAGTTGCTCACTTCTTTTGGTGTTGTTCTCGTGATACTCGCTGTTACAAGTGTTKLRKFFREVIAEAKKISWPSRKELLTSFGVVLVILAVTsV980 Aval 1000 1020 1040 1060 MspITATTTTTTTGTGCTTGATTTCATCTTCTCGGGAGTTGTGAGTGCGATTTTCAAAGCGCTGGGAATAGGATAAGGTGATAGGTGATGAAGAAAAAGTGGTACATAGTCCTTACTATGTCCGY F F V L D F I F S G V V S A I F K A L G I G NusC: 353aa; M K K K W Y I V L T M SMW=40329; pI=9.01100^ 1120^ 1140^ 1160^ 1180^ 1200GT- TACGAGGAAAAGGTTAAAGAAAATATCGAAAAGAAAGTCGAAGCCACCGGGATAAAAAATCTGGTGGGCCGTATTGTTATTCCTGAAGAGGTAGTTTTGGACGCCACCAGCCCTTCCGG YEEKVKENIEKKVEATGIKNLVGRIVIPEEVVLDATSPS1220^ 1240^ 1260^ 1280^ 1300^ 1320AGAGGCTCATACTTTCTCCGAAGGCCAAATTACACGTGAACAATGGAAAAGATGTTAACAAAGGGGATTTGATAGCTGAAGAACCTCCTATTTATGCTCGAAGAAGCGGTGTGATCGTTGE RLILSPKAKLHVNNGKDVNKGDLIAEEPPIYARRSGVIV1340^ ClaI^ 1380^ 1400^ 1420^ 1440ACGTGAAGAACGTCAGAAAGATTGTTGTGGAAACCATCGATAGGAAGTATACGAAGACGTATTACATTCCCGAGTCTGCGGGAATCGAGCCGGGTTTGAGGGTTGGAACGAAAGTGAAGCD VKNVRKIVVETIDRKYTKTYYIPESAGIEPGLRVGTKVK1460^ 1480^ 1500^ 1520^ 1540^ 1560AGGGACTGCCGCTTTCGAAAAACGAAGAGTACATCTGTGAACTGGATGGAAAGATCGTTGAGATAGAACGAATGAAAAAAGTGGTCGTTCAGACACCCGATGGTGAGCAGGACGTTTATTQ GLPLSKNEEYICELDGKIVEIERMKKVVVQTPDGEQDVY1580 1600 1620 1640 1660 1680ACATTCCTTTGGATGTTTTCGACAGGGATAGGATAAAAAAAGGAAAAGAAGTGAAACAGGGGGAAATGCTTGCGGAAGCCAGGAAGTTCTTCGCCAAGGTTTCGGGAAGAGTCGAAGTGGYIPLDVFDRDRIKKGKEVKQGEMLAEARKFFAKVSGRVEV1700^ 1720^ SmaI^ 1760^ 1780^ 1800TGGATTATTCAACAAGAAAAGAGATCAGAATCTACAAGACGAAAAGAAGAAAACTCTTCCCGGGTTATGTGTTCGTGGAAATGATCATGAACGATGAGGCCTACAATTTCGTTCGTTCCG^ DYSTRKEIRIYKTKRRKLFPGYVFVEMIMNDEAYNEVRS1820^ 1840^ 1860^ 1880^ 1900^ 1920TGCCATACGTTATGGGGTTTGTCAGTTCGGGAGGACAACCCGTTCCCGTAAAAGACAGAGAAATGAGACCTATTTTGAGACTCGCGGGCCTCGAAGAGTACGAAGAGAAGAAGAAACCTG^ PYVMGFVSSGGQPVPVKDREMRPILRLAGLEEYEEKKKP1940^ 1960^ 1980^ 2000^ClaI^2020^ 2040TGAAGGTCGAACTCGGTTTCAAGGTTGGAGACATGGTGAAGATAATAAGCGGTCCCTTCGAAGATTTTGCGGGTGTTATAAAGGAAATCGATCCAGAGAGACAGGAATTGAAAGTAAACG^ KVELGFKVGDMVKIISGPFEDFAGVIKEIDPERQELKVN2060^ 2080^ 2100^ 2120^ 
2140^ 2160TAACTATATTCGGACGTGAAACTCCTGTTGTTCTTCATGTTTCTGAAGTGGAGAAAATCGAGTGAGAAAACGTGGGAGGAGGAATCCGCACCACGCATAGGGACGTTCGAACATGGCGAA^ T I F G R E T P V V L H V S E V E K I E L11: 141aa; W=15089; p1=9.6^M A KFigure 4512180^ MspI^MspI^2220^ 2240^ 2260^ 2280GAAAGTAGCGGCTCAGATTAAATTACAACTGCCTGCCGGAAAAGCCACGcCGGCTCCACCCGTTGGACCCGCCTTGGGTCAGCACGGTGTTAACATCATGGAGTTTTGTAAAAGGTTCAAKVAAQIKLQLPAGKATPAPPVGPALGQHGVNIMEFCKRFN2300^ 2320^ 2340^ 2360^ 2380^ 2400TGCCGAAACAGCGGATAAAGGAGGCATGATACTTCCTGTTGTTATCACAGTGTACGAAGACAAGTCGTTCACTTTCATCATCAAAACACCACCTGCTTCCTTCCTTCTCAAGAAAGGAGCAETADKAGMILPVVITVYEDKSFTFIIKTPPASFLLKKAA2420^ 2440^ 2460^ 2480^ 2500^ 2520GGGTATAGAGAAGGGTTCTTCCGAGCCAAAAAGAAAGATAGTTGGAAAAGTTACCAGAAAACAGATTGAAGAAATAGCGAAXACAAAGATGCCAGATTTGAACCCAAACAGCTTGGAAGGG IEKGSSEPKRKIVGKVTRKQIEEIAKTKMPDLNANSLEA2540^ 2560^ 2580^ 2600^ 2620^ 2640AGCCATGAAGATCATTGAAGGAACCGCTAAGAGTATGGGAATAGAAGTAGTGGACTGATGTAACGGAAAGGAGGAGGCGCAATGCCGAAGCACTCCAAGAGGTATCTTGAAGCAAGGAAA^A M K I I E G T A K S M G I E V V D^LI: 233aa;^M P K H S K R Y L E A R KMW=25934; pI=9.52660^ 2680^ 2700^ 2720^ 2740^ 2760CTGGTGGACAGAACAAAGTACTACGATCTTGACGAAGCCATAGAACTCGTTAAAAAAACTGCCACGGCGAAATTCGATGAAACGATAGAACTCCACATTCAAACTGGAATAGACTACAGGL VDRTKYYDLDEAIELVKKTATAKFDETIELHIQTGIDYR2780^ 2800^ 2820^ 2840^ 2860^ 2880AAACCTGAACAGCACATCAGAGGAACGATCGTGCTTCCACACGGGACAGGTAAGGAAGTCAAGGTTCTGGTGTTTGCCAAAGGTGAAAAGGCAAAAGAGGCTTTGGAAGCGGGCGCGGATK PEQHIRGTIVLPHGTGKEVKVLVFAKGEKAKEALEAGAD2900^ 2920^ 2940^ 2960^ 2980^ 3000TACGTAGGAGCTGAGGATCTTGTAGAAAAAATAGAAAAAGAAGGTTTTCTCGATTTCGATGTGGCAATAGCCACACCTGATATGATGAGAATAATCGGAAGGCTCGGAAAGATTCTGGGAYVGAEDLVEKIEKEGFLDFDVAIATPDMMRIIGRLGKILG3020^ 3040^ 3060^Dral^3080^ 3100^ 3120CCAAGAGGTTTGATGCCATCGCCCAAATCTGGAACGGTGACTCAGGAAGTAGCAGAAGCGGTTAAAGAGTTTAAAAAAGGAAGAATCGAGGTCAGAACGGACAAAACTGGGAACATCCACP RGLMPSPKSGTVTQEVAEAVKEFKKGRIEVRTDKTGNIH3140^ 3160^ Fnu4HI^ 3200^ 3220^ 
3240ATACCCGTTGGTAAGAGGAGCTTCGATAACGAGAAACTGAAGGAAAACATAATCGCGGCAATAAAACAGATTATGCAGATGAAACCCGCAGGTGTGAAAGGACAGTTCATAAAAAAAGTGIPVGKRSEDNEKLKENIIAAIKQIMQMKPAGVKGQFIKKV3260 MspI^3280^ 3300^PL10•^3320^ 3340 Apal^3360GTTTTGGCTTCTACAATGGGACCCGGTATAAAATTGAATCTTCAGAGTCTGTTGAAAGAGTAAAGCAATCGAAAACTCAATAAGACGCCGTAGATGGCAGGGCCCGTGGGGTTAAAGATC^ LASTMGPGIKLNLQSLLKE3380^ 3400^ 3420^ 3440^ 3460^ 3480CTGCCGGAGGCGTCCCAGAAAGGTTCTATGACCTTTTTGTGGACCCTCCGGGGTCCACAAAATTTTTTTGGGAGGTGAATCCTTTGCTGACCAGGCAACAGAAAGAACTCATAGTTAAAGL10: 179aa;MW=20231; pI=9.1^M L T R Q Q K E L I V K3500^ 3520^ 3540^ 3560^XbaI^3580^ 3600AAATGAGTGAAATATTCAAAAAGACATCGCTGATACTCTTTGCCGATTTCCTGGGTTTCACGGTAGCTGATCTCACCGAGCTTCGTTCTAGATTGAGAGAAAAGTACGGAGATGGAGCAAE MSEIFKKTSLILFADFLGFTVADLTELRSRLREKYGDGA3620^ 3640^ 3660^ 3680^PvuII^3700^ 3720GGTTCAGGGTTGTGAAGAACACTCTCTTGAATCTCGCTCTCAAGAACGCTGAGTACGAAGGTTACGAAGAATTTCTCAAGGGACCCACAGCTGTACTCTACGTCACTGAAGGAGACCCTGRFRVVKNTLLNLALKNAEYEGYEEFLKGPTAVLYVTEGDP3740^ 3760^ 3780^ 3800^ 3820^ 3840TAGAAGCTGTCAAGATAATTTACAACTTTTACAAGGATAAGAAAGCGGATCTTTCGAGGCTCAAGGGTGGTTTCCTCGAAGGAAAGAAATTCACGGCAGAAGAAGTGGAAAACATTGCGA^ EAVKIIYNEYKDKKADLSRLKGGFLEGKKFTAREVENIASad^ 3880^ 3900MspI^3920^ 3940^ 3960AACTCCCATCCAAAGAAGAGCTCTACGCTATGCTCGTTGGTCGTGTGAAAGCTCCGATTACCGGTCTTGTGTTTGCATTGAGTGGTATTTTGAGGAATCTCGTGTATGTGCTCAATGCTAK LPSKEELYAMLVGRVKAPITGLVFALSGILRNLVYVLNAPL12 •^3980^ 4000^ 4020^ 4040^ 4060^ 4080TTAAAGAGAAAAAATCTGAATGATGGAGGTGTTTGAAGATGACGATTGATGAAATCATCGAAGCGATTGAGAAACTCACAGTTTCAGAGCTTGCAGAACTCGTGAAGAAGCTCGAAGACAI K E K K S E^L12: 128aa;^M T I D E I I E A I E K L T V S E L A E L V K K L E DMW=13457; pI=4.7Figure 4 (continued)524100^ 4120^ 4140^MspI^9160^ 9180^ 4200AATTTGGAGTGACTGCTGCTGCACCTGTGGCTGTCGCTGCTGCCCCAGTTGCTGGAGCAGCTGCCGGTGCCGCTCAGGAAGAAAAGACAGAGTTTGACGTCGTTTTGAAGAGCTTCGGCCK F G V T A A A P V A VA A A PV AGA A AGA A QE E K T E FDV V L K S F G9220^ 4290 Hinfl^4260^ 9280^ 9300^ 
4320AGAACAAGATTCAGGTCATCAAAGTTGTCAGGGAAATCACCGGACTCGGTCTCAAGGAAGCCAAAGACCTCGTCGAAAAAGCCGGTTCACCCGATGCAGTCATTAAGAGCGGTGTTTCCAQ N K I QVI K V V R E I T GL GL K EA K DL V EK A G S FDA VI KS G V S4390^ 9360^ 4380^ 4400^ 9920^ 9490AAGAAGAGGCAGAAGAGATCAAGAAGAAACTCGAAGAAGCTGGTGCTGAAGTGGAACTGAAGTAAATTTTCGTTTGTGAAAAAAGAATGTTGCAACCCTGTACCGCCTCTTGCCGGTACAK E EA E EIK K K L E E A G A EVEL K4460 PO •^•^9480^ 9500^ 9520^SmaI^9590^ 9560GGGTTTTTGTGTTTTAATAAATAGAGTGTGGTACAAACGTTCTTCCTCACCATGTTGTTTCCTTTCCGTTCGATCCAAGCAAAACCCGGGAGAGAAATCCTGGGAGTTTCTTATTCCACA9580^ 4600^ 9620^ 9640^ 9660^ 9680TTGAGAGGTGAGAAAATGAAAGAGATCTCTTGCGGTAGGAGGACGAGGGTTTCTTTCGGCAAGAGCCGAGAGCCCCTGCCAATTCCAGACCTCGTGGAGATCCAGAAGAGTTCCTACCGA3^MK EIS CGR R T R VS FGK SR E P L PIPDL V E I Q K S S YRHinfl^9700^ 4720^ 9740^ 9760^ 9780^ 4800ACATTCCTCGAAGAAGGTTTGCTTGAAGTCCTCAAGAAATTTTCTCCCATTTATTCGCAGGCGACCCGCTCAGATTTGAGAAAATCAGACAGAGGATTTGCTCTCGAGTTTGTTTCAACCN FLEEGL L E V L K K F S PI YSQA T RS DL R K^ RD GE A L E F VS T4820^ 9840^ 4860^ 9880 9900^ 4920AGAACTGGAGAACCTGCCATCCATCCCCTTGAATGTAAAGCGAAGGGTCTAACCTACAGTOTTCCGATATATGCGACGGCTCGCCTTACCGACATGAAAAGCGGTGAGATGAAGGAAGAAR T G E P A I D PL E C K A K GL T YS V PI Y A T AR L T D M K SGEMK EE4940^ 9960^ 4980^ 5000^ 5020^ 5090GAAGTGTTCCTTGGCTACATTCCCTACATGACGGATCGTGGAACGTTCATAATAAACGGAGCAGAAAGGGTTGTAGTCAATCAGATAGTGGTTTCCCCAGGGCTTTACTTCTCGTCTGAGE VFL GYI PYMT DR G T F I^ ANI N G ^ER V V VNQIV VS PGL YES SE5060^ 5080^ 5100 5120^ 5190^ 5160TACATAGACAGAGAAGAATACGGCGGGTACTTTCTCCCTTCTCGAGGTGCATGGCTCGAAGTCATCCTCGATCCCTACGATGGAGTTCTTTACGCGGGCCTTGACGGAAAGAAGGTCAACY IDR E EYGGY F L PSR GA WLE VILD PYDGVL YAGL DGK K V N5180^ 5200^ 5220^ 5290^ 5260^ 5280CTTTTCCTCTTTCTGAAAACGATCGGTTACGAAAAAGATGAGGATATCCTCTCCCTTTATCCCACCTATCTGGATGCCGACGATGAAGACAGTCTCCTGCTCCACGTGGGCTCCATTCTGL FL F L K TIDY EKDEDIL^ PT y LDADDEDSLL L HVGS IL5300^ 5320^ 5390^ 5360^ 5380^ 5900CTCGAAGACATCTACGATGGTGGCAGGAAGATCGCTGAAAAATGGGATATCCTGACCAAAGATCTCGCGGAAAGGATTCTGATGATAGATGACATAAATCAGATAAAAATAGTTCATCCAL EDI Y DGGR K 
I A E K WDIL TKDL A SRIL MIDDINQI K I V HP5920^ 5940^ 5460^ 5480^ 5500^ 5520ATAGCTCAAAATACATTTGAAAAGATGCTGGAAGTGGTGTCTTCCTCGAGCGAAGAGGGAGAGGAAGAAGAGGAAAAGACAAAGATTTACGGTTTAAACGAAGTCACCGTTGTGGACGCAI A Q N T F E K ML E VVS S SS EEGEEEE E K T K I Y GLNE V T V V DA5590^ 5560^ 5580^ 5600^ 5620^ 5640ATATCTGGAAATTTTCAGGAGATTGCGACCCGAAGAACTTCCAAGAATAAACGCGGCAAAAAGGTATCTGCACGACCTCTTCTTCAATCCGGAAAGGTACGATCTTTCCGAGGTGGGAAGAY L^ FR R L R PEEL PR INA AK R YLHDL F F NP ER YDL SEVGR5660^ 5680^ 5700^ 5720^ 5740^ 5760TACAAAGTCAACGAAAGACTCAGAAACGCTTACATCAGGTACCTCATAGAGGTTGAAGGGGAAGATCCCGAAGAGGCGAGGAAGAAGGTTTACAACGAAACTTCTCTCGTTCTGAAACCAYKVNERL RNA Y IR Y L IE V EGEDPEE AR K K V YNE T S L V L K P5780 EcoRICTTGATATAGTCCTCGCTTCCAGAATTCL DIV  L A S RFigure 4 (continued)53upstream of the L11, L1, L10, L12,13 and 13' gene cluster (course et al., 1986). Third,this region in T. maritima lacks genes or sequences related to tufB (EF-Tu) of E.co/i. The tufA gene of T. maritima was located and cloned from elsewhere in thegenome (Bachleitner et al., 1989). A second copy of this gene which would beequivalent to tufB of E. coli does not exist in the T. maritima genome. Fourth, theT. maritima secE gene encodes a polypeptide of 65 amino acid residues, whichshow significant sequence identity to the carboxyl-terminal region of the E. coliSecE protein. Fifth, the nusG equivalent gene is nearly twice as large as thecorresponding nusG of E. co/i; this is principally the result of an insertion of 513nucleotides after codon 45 of the T. maritima nusG gene (see below). Finally, thearrangement and approximate size of the L11, L1, L10 and L12 equivalentribosomal protein genes and the downstream RNA polymerase 13 subunit genesare well conserved between the two species.3.2.1 The tRNA gene clusterFour of the five tRNA genes, Metl,Thr,Tyr and Trp, encode full lengthmolecules that include the 3' terminal CCA acceptor sequence and all can befolded into the universal clover leaf structure (Figure 5). 
In contrast, the tRNAMet2 gene encodes a 73-nucleotide-long truncated tRNA. Although the sequence ends with a CCA 3' terminus, the two C's are buried as part of the seven base-pair acceptor stem in the cloverleaf structure. It seems likely that activation of this tRNA requires the post-transcriptional addition of the terminal CCA acceptor sequence by a nucleotidyl terminal transferase. Alternatively, the tRNAMet2 might exhibit an atypical folding pattern where the stem of the TΨC arm is contracted from five to two base-pairs; this would result in the expansion of the variable loop from five to eight nucleotides and extension of the 3' terminal GCCA sequence above the seven base-pair acceptor stem. In this alternative configuration, the acceptor stem would contain a C•A mismatch at position 6. There is no indication as to which, if either, of the two methionine tRNAs might serve as the initiator in the translation initiation process.

Figure 5 Structure and processing of tRNAs
The structures of the five tRNAs encoded on the 5.8 kb EcoRI fragment (Met1, Met2, Thr, Tyr and Trp) are depicted. The shaded nucleotides are present in the primary transcript but removed during tRNA processing and maturation. The tRNAMet2 is unusual; it either requires CCA addition for activation or it has a very unusual structure with a mismatch (C•A) at position six in the acceptor stem. Both possibilities are illustrated.

The analysis of transcripts derived from the region containing the five tRNA genes was both complex and difficult because of rapid endonuclease cleavage at a large number of processing sites. Nonetheless, by using both S1 nuclease protection and primer extension analysis, it has been possible to identify (i) putative transcription initiation sites, (ii) regions of the primary transcript that represent major processing intermediates, and (iii) processing sites that generate mature tRNA 5' or 3' ends. The oligonucleotide primers used for primer extension are complementary either to coding sequences or to intergenic noncoding sequences. Generally, the coding region primers utilize the primary transcript or processing intermediates as template much more efficiently than mature tRNA molecules; presumably, this reflects the inability of the reverse transcriptase to denature secondary or tertiary structure or to read through modified tRNA bases. Nuclease S1 protection assays using 5' or 3' end-labeled DNA fragments as probes detect both precursor and mature RNA transcripts.

The conclusions from these experiments are summarized in Figure 6A, and some of the experimental results are illustrated in Figures 6B and C. To summarize, primary transcripts appear to be initiated from two putative promoters, P1 located in front of the tRNAMet1 gene and P2 located in front of the tRNAMet2 gene. No transcripts could be detected from the region in front of the putative P1 promoter.
Transcripts initiated from these promoter sites appear to extend through the distal tRNATrp gene and on into the secE and nusG genes; rapid endonuclease processing results in the removal of the tRNA sequences from the extended leader region of these transcripts.

3.2.2 tRNA processing

The results of S1 nuclease protection experiments using the 5' end-labeled 173-nucleotide-long EcoRI-AvaI fragment, the 308-nucleotide-long AvaI fragment, the 508-nucleotide-long AvaI fragment and the 280-nucleotide-long MspI fragment are illustrated in Figure 6B. The two protected products of the 173-nucleotide-long probe (Figure 6B i) are 72 and 66 nucleotides in length and correspond respectively to protection by (i) the primary transcript initiated at the P1 promoter and (ii) the transcript that has been processed to generate the mature 5' end of tRNAMet1. Clearly, the amount of mature tRNA detected is much greater than the amount of primary transcript. The two most visible protection products of the 308-nucleotide-long probe are about 200 and 300 nucleotides in length (Figure 6B ii). These correspond respectively to protection by (i) the trailer sequence that is liberated following cleavage of the primary transcript at or near the 3' end of the tRNAMet1 sequence, and (ii) primary transcripts or processed intermediates with a 5' end at or near the beginning of the tRNAMet2 sequence. The three major protection products obtained with the 508-nucleotide-long probe were about 435, 260 and 180 nucleotides in length (Figure 6B iii). These correspond respectively to protection by trailer sequences liberated by processing at or near (i) the 3' end of the tRNATyr gene, (ii) the 5' end of the tRNATrp gene, and (iii) the 3' end of the tRNATrp gene. Finally, using the 5' end-labeled 280-nucleotide-long MspI fragment as probe, it was possible to demonstrate that transcripts exiting the tRNATrp gene are extended well into the secE and nusG genes.
The observed products, 280 and 270 nucleotides in length, resulted respectively from (i) full-length protection of the probe by the primary transcript, and (ii) partial protection by the trailer liberated following processing at the 3' end of the tRNATrp sequence (Figure 6B iv). No other abundant transcripts with either 3' or 5' ends within the tRNATrp-secE and secE-nusG intergenic spaces were detected. Thus, the five tRNAs are processed from the leader region of the mRNA that extends into the nusG gene. These results were confirmed and extended using the corresponding 3' end-labeled AvaI fragments and other 5' and 3' end-labeled MspI fragments as probes in S1 nuclease protection assays (data not shown).

Some of the 5' transcript ends detected by S1 nuclease protection were confirmed and precisely positioned using primer extension analysis (Figure 6C). The primer oD10 is complementary to a sequence within the tRNAMet1 gene (Table 3). The major extension product terminates at the G residue at position 101, five nucleotides in front of the tRNAMet1 gene (Figure 6C i). This is the position where transcripts are initiated from the putative P1 promoter. Less abundant products with end sites at nucleotides 98, 99 and 106 were also apparent. The first two positions probably correspond to minor transcription initiation sites and the third corresponds to the 5' end of the mature tRNAMet1. It is likely that this oligonucleotide primes more efficiently on precursor than on mature tRNA.

The second primer, oD1, is complementary to a region within the primary transcript between the mature tRNAMet1 and tRNAMet2 sequences (Table 3). Four extension products with end sites at nucleotide positions 182, 164, 134 and 101 were detected (Figure 6C ii).
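The cDNA lengths implied by these four stop positions follow from simple coordinate arithmetic. The sketch below is illustrative only: it assumes that the 5' end of primer oD1 pairs with position 248 (the Figure 6 legend gives the primer as complementary to positions 248-231) and that both end coordinates are counted inclusively. The function and variable names are hypothetical, and the lengths are computed from the coordinates rather than read from the autoradiogram.

```python
# Illustrative coordinate arithmetic for primer extension products.
# Assumption: the primer's 5' end pairs with (+)-strand position 248,
# and product length counts both end positions inclusively.

OD1_FIVE_PRIME = 248                  # from the legend: oD1 spans 248-231
STOP_POSITIONS = [182, 164, 134, 101]  # end sites reported for oD1


def extension_product_length(primer_five_prime: int, stop: int) -> int:
    """Return the cDNA length for a product running from the primer's
    5' end down to the (+)-strand position where extension stopped."""
    if stop > primer_five_prime:
        raise ValueError("stop must lie at or upstream of the primer 5' end")
    return primer_five_prime - stop + 1


product_lengths = {
    stop: extension_product_length(OD1_FIVE_PRIME, stop)
    for stop in STOP_POSITIONS
}

for stop, length in product_lengths.items():
    print(f"stop at {stop}: {length} nt")
```

Under these assumptions the products would be 67, 85, 115 and 148 nucleotides long, the longest one reaching the putative P1 start site.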
The product with an end at position 182 most probably corresponds to priming on the trailer intermediate released following endonuclease cleavage at or immediately adjacent to the 3' end of the tRNAMet1. This result implies that endonuclease incision occurs precisely at the end of the mature tRNA sequence and that no extensive exonuclease trimming is required to produce the mature 3' tRNA end. Alternatively, the product may be due to termination of extension caused by secondary structure of the tRNA within the primary transcript. The next two products, corresponding to reverse transcription stops at positions 164 and 134 within the tRNAMet1 structural sequence, are presumably caused by impediments to elongation; the first is in the TΨC loop and the second is near the base of the descending portion of the anticodon stem. The longest product has an end corresponding to the transcription initiation site of the putative P1 promoter at position 101. The absence of detectable product with an end site corresponding to the 5' end of the mature tRNA (position 106) may indicate that 3' end processing normally precedes 5' end processing.

Figure 6 Mapping of transcript end sites in the tRNA-nusG region
(A) A detailed genetic map of the 1.2-kb region is illustrated with the five tRNA genes and the nusG gene indicated. The positions of the two putative promoters, P1 and P2, are indicated along with restriction sites used for making S1 nuclease protection probes: E, EcoRI; M, MspI; A, AvaI. The primary transcript is depicted below the map: •, putative transcription start sites; separate symbols mark, respectively, the positions of detectable 5' and 3' ends generated during the excision and processing of the tRNA sequences.
(B) The structures of four 5' end-labeled DNA fragments used as the probes in S1 protection assays are illustrated as rectangles on the left. Below each are the major protection products illustrated as lines (i-iv). The positions in the nucleotide sequence (from Figure 4) used for end-labeling at the ends of the minus strand DNA probes and the ends of the protected products are indicated in parentheses. The length of the protected products in nucleotides (n) corresponds to the visible autoradiographic bands. The autoradiograms are illustrated at the right: S, molecular length standard; T, S1 protection using T. maritima RNA. For clarity, the controls using E. coli RNA and the DNA probe alone without RNA are not shown.
(C) The autoradiograms of the primer extension experiments are illustrated. The primers used were (a) oD10, complementary to position 155-136 within the tRNAMet1 gene, (b) oD1, complementary to position 248-231 in the Met1-Met2 intergenic space and (c) oD12, complementary to position 433-417 within the tRNAMet2 gene (Table 3). The major extension stops are indicated and their positions within the complementary DNA (+) strand nucleotide sequence are illustrated: •, strong stop; o, weak stop. The ladder (G, A, T, C) depicts the DNA (-) strand sequence; PE designates the lane containing the primer extension products.

The third primer, oD11, is complementary to a sequence within the tRNAMet2 (Figure 6C iii). Two extension products with ends at positions 279 and 281 were evident; these correspond to the 5' end sites of transcripts initiated at the putative P2 promoter immediately in front of the tRNAMet2 gene. By using other primers, it has been possible to show that endonuclease processing at the 3' ends of the tRNAThr and tRNATyr appears to occur immediately adjacent to the CCA terminal sequence; extension products resulting from priming of the Thr and Tyr trailer sequences exhibited stops at nucleotide positions 460 and 554, respectively (data not shown).
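The statement that these trailer 5' ends lie immediately after a CCA terminus can be checked against the sequence itself. The sketch below is a hedged illustration: the three 20-nucleotide windows were transcribed by hand from the Figure 4 sequence, so transcription errors are possible; the window start coordinates and the helper name are assumptions introduced here; and position 182 for the tRNAMet1 trailer is taken from the oD1 extension result reported earlier in this section.

```python
# Consistency check: are the three bases preceding each reported trailer
# 5' end equal to "CCA"?
# Assumption: each window is (start coordinate, sequence, trailer 5' end),
# hand-transcribed from the Figure 4 sequence with 1-based coordinates.

WINDOWS = {
    "tRNAMet1": (171, "CGAGCCCACCAGTTCCTGAA", 182),
    "tRNAThr": (451, "CTGGCTCCAAAAGTATGTGG", 460),
    "tRNATyr": (541, "CCCCCCACCACCAGATTTTT", 554),
}


def bases_before(window_start: int, window_seq: str, position: int, n: int = 3) -> str:
    """Return the n bases immediately 5' of `position`, where
    `window_seq` begins at 1-based coordinate `window_start`."""
    index = position - window_start  # 0-based index of `position` in window
    if index < n or index > len(window_seq):
        raise ValueError("position is not covered by the window")
    return window_seq[index - n:index]


results = {
    name: bases_before(start, seq, trailer_end)
    for name, (start, seq, trailer_end) in WINDOWS.items()
}

for name, triplet in results.items():
    print(f"{name}: bases before trailer end = {triplet}")
```

With the windows as transcribed, all three trailer ends are preceded by CCA, in line with cleavage directly after the acceptor terminus.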
In S1 nuclease protection experiments, 3' end sites were detected in approximately the same positions.

3.2.3 Characterization of transcripts derived from protein-encoding genes

Transcripts entering the secE and nusG genes were efficiently extended through the L11 and L1 ribosomal protein genes and into the L1-L10 intergenic space (Figure 7A). Both nuclease S1 and primer extension assays failed to reveal significant levels of transcripts with either 5' or 3' ends in or between these genes (data not shown); this implies that the region between nucleotides 820 and 3300 is devoid of internal promoters, terminators and major mRNA processing sites (that upon cleavage produce transiently stable intermediates) and that the secE, nusG, L11 and L1 cistrons are sequestered on a large polycistronic mRNA.

In contrast, both read-through transcripts and transcripts with 3' or 5' ends within the L1-L10 intergenic space have been identified (Figures 7B i and ii and C i). The 3' and 5' ends were not generated by an endonuclease cleavage event because the 3' transcript end site at position 3426 is located 112 nucleotides downstream from the 5' transcript end site at position 3314. The 5' transcript end probably results from transcription initiation at a putative internal promoter, PL10, used to augment the expression of the downstream L10 and L12 genes. The 3' end site at position 3426 presumably results from transcript attenuation. The end site is located within a poly T stretch and is preceded by overlapping sequences with inverted repeat symmetry. The results from S1 protection experiments indicate that this structure mediates the termination of about 50% of the mRNA transcripts during exponential phase growth. Together, the L1-L10 intergenic promoter and attenuator elements probably play an important role in modulating expression of the downstream genes (see below).

Figure 7 Characterization of transcripts from the protein-encoding genes
(A) The genetic map illustrates the positions of the protein-encoding genes (solid boxes). Restriction sites used to generate S1 probes are: M, MspI; F, Fnu4HI; X, XbaI; P, PvuII; and H, HinfI. The vertical arrows indicate the positions of putative regulatory signals on the DNA (or mRNA below): P, promoter; A, attenuator; T, terminator.
(B) The structures of several 5' and 3' end-labeled DNA probe fragments used in S1 nuclease protection assays are illustrated as rectangles. Under each probe are the protection products (lines). Nucleotide positions corresponding to 5' or 3' sites of end-labeling on the minus DNA strand (in parentheses) and protected fragment lengths in nucleotides (n) for each of the probes and the corresponding protection products are indicated. The autoradiograms are illustrated below: S, molecular length standard; T, S1 protection using T. maritima RNA. For clarity, the controls using E. coli RNA and the end-labeled DNA probe alone are not illustrated. The probes used are as follows: (i) 3' labeled Fnu4HI-PvuII; (ii) 5' labeled XbaI-Fnu4HI; (iii) 3' labeled MspI-MspI; (iv) 5' labeled MspI-MspI; (v) 3' labeled HinfI-HinfI; (vi) 5' labeled HinfI-HinfI.
(C) Primer extension was used to locate the transcription initiation sites for the putative L10 (i), L12 (ii) and β (iii) promoters. Positions of major (•) stops on the (+) DNA strand sequence are indicated. The primers used were (i) oD15, complementary to position 3464-3447, (ii) oD16, complementary to position 4026-4008, and (iii) oD5, complementary to position 4609-4592 (Table 3). The ladder (G, A, T, C) depicts the DNA (-) strand sequence, and PE depicts the products of the primer extension reaction.
(D) Total RNA was separated by electrophoresis and probed with the XbaI-SmaI fragment (nucleotide position 3567-4526) spanning the L10 and L12 genes. The probe hybridized to 0.4 and 1.0 kb RNA and to larger RNA molecules.

The L10-L12 intergenic space is only 19 nucleotides in length. In the ribosomes of E. coli and other organisms, the L12 protein is present in four copies per 50S subunit, whereas all other proteins including L11, L1 and L10 are stoichiometric and present in single copy (Dennis, 1974; Subramanian, 1975; Hardy, 1975). In E. coli, this four-fold excess of L12 is achieved through an ill-defined translational control mechanism (Downing and Dennis, 1987; Petersen, 1990). In contrast to E. coli, a major 5' transcript end was mapped near the end of the L10 gene (position 3972-3974) and probably results from a transcription initiation event from a promoter element buried within the T. maritima L10 gene (Figure 7B iv and 7C ii). Transcripts from this promoter represent between one-third and one-half of the total L12 mRNA.

Analysis of the L12-β intergenic space indicates that few if any transcripts exiting the L12 gene are extended into the downstream RNA polymerase β subunit gene (Figures 7B v and vi, and C iii). Rather, the transcripts are efficiently terminated within a poly T stretch centered around position 4446 that is preceded by a region of inverted repeat symmetry. Expression of the RNA polymerase β subunit gene requires transcript reinitiation at a downstream promoter. The 5' ends of these reinitiated transcripts have been located by primer extension assays at positions 4566 and 4571.

The above nuclease protection and primer extension results suggest the presence of two internal promoters within the ribosomal protein gene cluster; these are used to augment the expression of downstream genes. One of these, PL10, is located in the L1-L10 intergenic space and the other, PL12, is located immediately in front of the L10-L12 intergenic space. The presence of an efficient transcription terminator immediately after the L12 gene results in the production of mono-, bi- and hexacistronic transcripts containing the L12 cistron with lengths of about 400, 1000 and greater than 3500 nucleotides, respectively.
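The approximate transcript sizes quoted above can be related to the mapped end sites by coordinate arithmetic. The following sketch is only a rough consistency check under stated assumptions: the 5' ends are taken as position 3972 for PL12 and position 3314 for PL10, the common 3' end as position 4446 (the centre of the poly T stretch), and position 820 is used as a nominal 5' boundary for the processed secE-to-L12 transcript because the text gives 820 as the start of the region lacking internal end sites. The dictionary keys and the function name are hypothetical.

```python
# Rough consistency check of transcript sizes from mapped coordinates.
# Assumptions: inclusive coordinates; 4446 is used as the shared 3' end;
# 820 is a nominal 5' boundary after tRNA processing, not a mapped site.

TERMINATOR_POSITION = 4446

FIVE_PRIME_ENDS = {
    "monocistronic (from PL12)": 3972,
    "bicistronic (from PL10)": 3314,
    "hexacistronic (processed leader)": 820,
}


def transcript_length(five_prime: int, three_prime: int) -> int:
    """Length in nucleotides of a transcript spanning both end positions."""
    if three_prime < five_prime:
        raise ValueError("3' end must not precede the 5' end")
    return three_prime - five_prime + 1


transcript_lengths = {
    name: transcript_length(start, TERMINATOR_POSITION)
    for name, start in FIVE_PRIME_ENDS.items()
}

for name, length in transcript_lengths.items():
    print(f"{name}: {length} nt")
```

Under these assumptions the computed values are 475, 1133 and 3627 nucleotides, which are of the same order as the stated sizes; exact agreement is not expected, since the stated sizes are approximate.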
By probing northern RNA blots with a probe containing the L10 and L12 genes, the 400- and 1000-nucleotide-long transcripts along with a heterogeneous large transcript were identified (Figure 7D).

3.2.4 mRNA secondary structure and function

Figure 8 summarizes and contrasts the transcription patterns and regulatory features of the secE, nusG, L11, L1, L10, L12 regions of the E. coli and T. maritima genomes. In E. coli, there are three non-overlapping transcription units: the tRNA-tufB operon, the secE-nusG operon, and the L11, L1, L10, L12, β, β' operon (An and Friesen, 1980; Downing and Dennis, 1987; Downing et al., 1990). Because of internal promoters, terminators and attenuators, a number of different primary transcripts are produced from the operon of the ribosomal proteins and RNA polymerase. In addition, the transcripts from all three operons contain potential endonuclease cleavage sites which further increase the number of detectable mRNA species.

The ribosomal protein mRNAs of E. coli possess well-characterized translational control elements (reviewed by Lindahl and Zengel, 1986; Jinks-Robertson and Nomura, 1987). The site controlling L11 and L1 synthesis is a mimic of the 23S rRNA binding site for protein L1 and is located immediately preceding the L11 cistron on the mRNA (Baughman and Nomura, 1983). Because of translational coupling, once L1 binds to the mRNA, translation of the downstream L1 cistron is also blocked (Thomas and Nomura, 1987).

Figure 8 Structure and features of E. coli and T. maritima RNA transcripts
(A) The genomic maps and transcripts produced for E. coli (top) and T. maritima (bottom) are illustrated against a nucleotide scale in kilobases. The positions of promoters (P), terminators (T), and attenuators (A) are indicated. Where transcription termination is substantially less than 100% at terminators or attenuators, the (percent) read-through is indicated. The 5' transcript ends resulting from initiation events are indicated along with sites of mRNA processing by known endonucleases (R3, RNaseIII; RE, RNaseE). Protein binding (PB) autogenous translational control sites are boxed on the mRNAs.
(B) Regions of RNA secondary structure that presumably serve a regulatory function in the T. maritima mRNA are illustrated. For comparison, the putative L1 protein binding site in the T. maritima 23S rRNA is presented. Also illustrated for comparison are a portion of the L10 autogenous translational control region and the β attenuator structure from the E. coli RNA transcripts; mutational substitutions resulting in L10, L12 translation defective or translation constitutive phenotypes are indicated. The designation (T) is represented by (U) in RNA transcripts.

The E. coli L10 regulatory site located in the middle of the long L1-L10 intercistronic space is more complex and less well understood (Fiil et al., 1980; Johnsen et al., 1982; Christiansen et al., 1984). The L10 protein (or an L10-(L12)4 complex) has been shown to bind to a segment of the mRNA about 100 nucleotides in length near the middle of the intercistronic space. A region of interrupted inverted-repeat symmetry immediately adjacent to the L10 protein binding site was shown to be extremely sensitive to nucleotide substitution (see Figure 8), and is believed to be a crucial component in the on/off switching of mRNA translation; both translation defective and translation constitutive mutants have been characterized (Fiil et al., 1980; Christiansen et al., 1984).

In T. maritima, the secE and nusG genes are cotranscribed with the ribosomal protein genes, and the tRNAs are processed from the leader region of this polycistronic mRNA transcript.
The distal RNA polymerase subunit genes would appear to form a separate operon, to be transcribed from a promoter, Pβ, located in the L12-β intergenic space.

Potential sites related to those cleaved by RNaseIII and RNaseE in E. coli (Arraiano et al., 1988; King and Schlessinger, 1987) have not yet been identified in the T. maritima transcripts containing secE, nusG and ribosomal protein genes. It is possible, nonetheless, that such endonuclease sites do occur and are used to trigger rapid degradation of mRNA sequences. If the products formed by an endonuclease cleavage are rapidly degraded, they would escape detection in the nuclease protection and primer extension assays used here. In many of our S1 protection experiments, autoradiographic bands of low intensity are apparent. These bands, representing minor protection products with 5' and 3' ends falling within generally nondescript sequences, probably represent transiently-stable degradation intermediates and have for simplicity not been emphasized in this study.

Examination of the T. maritima nucleotide sequence of the secE, nusG, L11, L1, L10, L12, and β genes and intergenic spaces has revealed a number of potentially important regions that could form regulatory structures within an mRNA transcript (Figure 8B). The first is in the short nusG-L11 intergenic space and forms a bipartite helical structure immediately preceding the L11 translation initiation site. The region exhibits primary sequence and secondary structural similarity to the L1 binding site within the 23S rRNA (Achenbach-Richter, personal communication). By analogy with E. coli, this site is probably used to mediate translational regulation by protein L1.

A direct comparison of the L1-L10 intergenic spaces of E. coli and T. maritima failed to reveal any nucleotide sequence similarity. 
The transcription termination signal that has been identified is referred to as an attenuator because (i) it functions at about 50% efficiency during exponential phase growth, and (ii) it possesses structural features which suggest that the termination frequency can be modulated. The second inverted-repeat within this element is characteristic of eubacterial Rho-independent terminators; when this structure is allowed to form in the nascent transcript, termination would occur. The first inverted-repeat within this element exhibits structural and possibly functional similarity to an E. coli repeat implicated in the on/off switching of mRNA translation (Fiil et al., 1980; Christiansen et al., 1984; see Figure 8B). If the on/off switch exists in T. maritima, it would appear to control transcript termination by either allowing or preventing the formation of the second terminator hairpin in the mRNA.

The efficient terminator in the L12-β intergenic space was identified and defined by nuclease S1 protection assays. It is a typical eubacterial Rho-independent terminator, consisting of a single inverted-repeat followed by a stretch of T residues. In E. coli, the analogous structure is more complex and functions as an attenuator, terminating approximately 80% of the transcripts that exit from the L12 gene and extend into the intergenic space (Downing and Dennis, 1991). Recent in vitro transcription experiments suggest that antitermination at this attenuator may be stimulated by E. coli NusA and NusG proteins (Linn and Greenblatt, 1992). The termination frequency at this E. coli attenuator is adjustable and functions to control the level of production of the β and β' subunits of RNA polymerase. In T. 
maritima, it seems that transcription of the upstream ribosomal protein genes is dissociated from that of the downstream β and β' RNA polymerase subunit genes; transcription of the RNA polymerase β gene requires initiation at the Pβ promoter, which partially overlaps with the termination sequence of the ribosomal protein operon.

3.2.5 Transcription and translation initiation signals

A total of five putative transcription initiation sites were located by primer extension and nuclease protection assays within the tRNA, secE, nusG, ribosomal protein and RNA polymerase gene cluster. The sequences preceding these five sites were examined in order to identify conserved features which might constitute elements of a T. maritima promoter (Figure 9A). The following features emerge: (i) All promoters exhibit one or more start sites located within a region up to eight nucleotides in length. (ii) Centered about ten nucleotides upstream from the major start site is an AT-rich sequence that probably corresponds to the E. coli Pribnow box sequence (TATAAT); the T. maritima consensus derived here is TAWAAT (W, nucleotide A or T; Hoopes and McClure, 1987). (iii) A second conserved element possibly corresponding to the E. coli -35 sequence (TTGACA) was also identified; the T. maritima consensus was TTGAC(A/G). In the E. coli promoter, the spacing between the -10 and -35 elements is critical and usually limited to a range between 16 and 19 nucleotides. In T. maritima, the spacing appears to be more variable, ranging from 18 to 25 nucleotides. The functional significance of either of these elements and their spacing relative to the transcription start site in the T. maritima promoter will be established only by a more detailed in vitro or genetic analysis.

The designation of open reading frames and the location of the translation initiation codons were based upon two criteria. The first is alignment of predicted amino acid sequences with the sequences of known E. coli proteins. 
The second is the proximity of a potential translation initiation codon to a sequence exhibiting complementarity to the 3' end of 16S rRNA (Gold and Stormo, 1987). All the protein-encoding genes analyzed in this study contain a five to nine nucleotide long region of 16S rRNA-mRNA complementarity (Figure 9B). For seven of the genes, L33, secE, nusG, L11, L1, L12 and β, ATG is used as the translation initiation codon; the L10 gene apparently uses the unusual codon, TTG. It is uncertain whether or not this unusual initiation codon plays a role in translational regulation of the L10 cistron.

3.2.6 Protein homologies

The amino acid sequences of the L33, SecE, NusG, L11, L1, L10 and L12 proteins predicted from the nucleotide sequence of the T. maritima 5.8 kb genomic fragment were aligned to the E. coli protein sequences, and the sequence identity of each protein pair was calculated (Table 6). This comparison indicates quite clearly that the predicted T. maritima proteins are the homologues of the corresponding E. coli proteins; amino acid identities range from 32 to 66 percent for the seven protein pairs. The protein pairs exhibiting the largest proportion of amino acid replacements are often also affected more frequently by insertion/deletion events.

A  PROMOTER ALIGNMENTS

            -35                                       -10     START SITE
P1         ..CC TTGAC AACGGGGTTTTGTTAGAA TATAAT CTGATAGC G GTGTG..          (99)
P2         ..GG TTGAC AAAGAAAAGCTCTGATAG TAAAAT TAATGAAC G GTCTT..          (279)
PL10       ..AA TTGAA TCTTCAGAGTCTGTTGAAAGAG TAAAGC AATCG A AAACT..         (3312)
PL12       ..TT TTGAG GAATCTCGTGTGTGTGCTCAATGC TATTAA AGAGAAA A AATCT..     (3972)
Pβ1        ..TC TTGCC GGTACAGGGTTTTTGTGT TTTAAT AAATAGA G TGTGG..           (4466)
Pβ2        ..TC TTGCC GGTACAGGGTTTTTGTGTTTTA ATAAAT AGAGTGTG G TACAA..      (4471)
CONSENSUS  ..   TTGAC    18-24 NUC    TAWAAT   5-8 NUC   G

B  RIBOSOMAL BINDING SITES

        RIBOSOME BINDING SITE, SPACER, TRANSLATION INITIATION CODON     POSITION
L33     ..TG AGAAAGGG TGGAAGAT ATG..      (579)
secE    ..CT GAGGGGG CATCGAGAA ATG..      (835)
nusG    ..AT AAGGAGGTG ATG..              (1044)
L11     ..AG GGAGGT TCGAAC ATG..          (2153)
L1      ..CG GAAAGGAGG AGGCGCA ATG..      (2602)
L10     ..TG GGAGGTGA ATCCTT TTG..        (3444)
L12     ..AT GGAGGTG TTTGAAG ATG..        (3999)
β       ..GA GAGGTGA GAAA ATG..           (4577)
        HO-TCTTTCCTCCACTAGG....           16S rRNA

Figure 9 Transcription initiation and translation initiation elements in T. maritima
A. The nucleotide sequences (DNA) overlapping the putative transcription initiation sites are listed. The promoter designations are presented on the left and the major start nucleotide is given on the right. The sequences are aligned at the start site, the -10 element and the -35 element. The two equally intense start sites of the Pβ promoter appear to use a common -35 element and overlapping -10 elements; both possibilities are listed. W stands for either nucleotide A or T.
B. The nucleotide sequences (DNA) overlapping the translation initiation sites of the L33, secE, nusG, L11, L1, L10, L12 and β genes are aligned at the translation initiation codon. The upstream sequences are spaced such that they align in a complementary way with the sequence of the 3'-end of 16S rRNA presented at the bottom (illustrated is the DNA sequence of the rRNA gene). The complementary nucleotides equivalent to the E. coli Shine-Dalgarno ribosome binding sequences are overlined.

Table 6 T. maritima and E. coli protein homologies

Protein   Common positions (1)   Internal gaps (2)   Identical amino acids (3)
L33        48                    4                    20 (42%)
SecE       63                    0                    20 (32%)
NusG      175                    5                    75 (43%)
L11       141                    1                    87 (62%)
L1        232                    1                   116 (50%)
L10       147                    8                    65 (44%)
L12       119                    4                    78 (66%)

1  The common positions are the number of positions where the two protein sequences each contain an amino acid in the alignment generated almost entirely by the alignment algorithm in the Gene Works® package using the default parameters. The alignments of T. maritima and E. coli SecE and NusG proteins were first generated by the computer program, then visually adjusted to maximize the amino acid identity. The NusG alignment results in an insertion of 171 amino acid residues after position 45 in the T. 
maritima NusG sequence.
2  Internal gaps are the number of gaps required in one or the other sequence to maintain maximal alignment. Presumably, these are the result of deletion or insertion mutations during evolution.
3  Identical amino acids are the number of common positions where both sequences contain the same amino acid residue. The percent identical amino acids is indicated in parentheses.

In E. coli, the secE gene is located 5' to the nusG gene (Downing et al., 1990). The E. coli secE encodes an essential integral membrane protein with three membrane-spanning stretches, which plays an important role in protein export (Schatz et al., 1989). In T. maritima, a short open reading frame immediately preceding nusG encodes a polypeptide of 65 amino acid residues, of which 36 are hydrophobic. This polypeptide can be aligned to the carboxyl terminus of the E. coli SecE protein without alignment gap and with an amino acid identity of 32% (Figure 10). Thus, this polypeptide is likely the homologue of SecE in T. maritima. Presumably the N-terminal 28 amino acids are present on the cytoplasmic side of the plasma membrane; the central stretch of 19 amino acids spans the membrane bilayer, and the C-terminal 18 amino acids are localized on the periplasmic side (Schatz et al., 1989; Figure 10).

Interestingly, short open reading frames that encode polypeptides of 60 and 84 amino acid residues were also found in the published sequences containing the nusG gene from Thermus thermophilus (Heinrich et al., 1992) and Streptomyces virginiae (Okamoto et al., 1992), respectively. The two proteins can also be aligned with the T. maritima and E. coli SecE proteins (Figure 10), with 22% (T. maritima-T. thermophilus), 23% (E. coli-T. thermophilus), 27% (S. virginiae-T. thermophilus), 31% (T. maritima-S. virginiae), and 25% (E. coli-S. virginiae) amino acid identity, respectively. Taken together, these data indicate that T. maritima and T. 
thermophilus have very small SecE proteins, whereas the S. virginiae SecE protein has a highly charged N-terminal extension which does not show significant sequence identity to any part of the N-terminus of the elongated E. coli SecE. All three SecE proteins probably have only one membrane-spanning segment (Figure 10). The alignment shows a few highly conserved residues, including a tryptophan residue at alignment position 107 (Figure 10). If the T. maritima SecE is ancestral, the N-terminal extension in the E. coli SecE may be a recent innovation. Genetic and biochemical experiments showed that the carboxyl-terminal region containing the third membrane-spanning segment is sufficient for E. coli SecE function (Schatz et al., 1991; Nishiyama et al., 1992).

Figure 10 Alignment of the SecE protein sequences
The sequences of the SecE proteins from T. maritima (Tma-SecE), E. coli (Eco-SecE), S. virginiae (Svi-SecE) and T. thermophilus (Tth-SecE) are aligned. The residues that are invariant in all four sequences are boxed; the regions with conservative substitutions are indicated with "*." The transmembrane region is marked by a heavy overline.

The NusG protein of E. 
coli is essential for cell viability, and was shown to assemble into an RNA polymerase elongation complex and to participate in transcription termination-antitermination related activities (Swindle et al., 1988; Downing et al., 1990; Linn and Greenblatt, 1992; Sullivan et al., 1992; Li et al., 1992 and 1993). The E. coli protein is relatively small (Mr 20,508), whereas the T. maritima protein is much larger (Mr 40,329). The increase in size results almost entirely from a single large insertion of 171 amino acid residues after position 45. An extensive search of the protein data base has failed to reveal any other protein sequence highly related to the sequence of the inserted region. The NusG sequences from Synechocystis sp. PCC 6803 (Schimidt and Subramanian, personal communication) and Thermus thermophilus (Heinrich et al., 1992) have been determined; like E. coli they lack this insertion. The NusG protein from Streptomyces virginiae has 319 amino acid residues (Mr 34,676; Okamoto et al., 1992); surprisingly, this increase in size is due to an extension of about 110 amino acid residues at its amino terminus, and this extension has a high proportion of acidic residues (24 Glu residues and 15 Asp residues). The carboxyl terminus is similar to other NusG proteins (Figure 11). The NusG protein from S. virginiae was shown to function as a butyrolactone autoregulator receptor that may switch on expression of genes for antibiotic production and/or cytodifferentiation. An alignment of the five NusG protein sequences is illustrated in Figure 11. Antibodies prepared against the T. maritima NusG protein were shown to cross-react specifically with proteins of the expected sizes in extracts of E. coli and T. thermophilus (T. Heinrich and R. Hartmann, personal communication). The T. maritima NusG has been expressed in E. coli; the purified recombinant NusG was shown to have DNA-binding activity (see Part V). 
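
As a quick plausibility check on the statement that the size increase comes almost entirely from the 171-residue insertion, the figures quoted in this thesis can be compared directly. The sketch below is illustrative only: the variable names are hypothetical, the residue counts of 181 and 353 are those given for the two proteins in Part IV, and the expectation of roughly 110 daltons per residue is a general rule of thumb rather than a value from this work.

```python
# Values quoted in the text for the E. coli and T. maritima NusG proteins.
ECO_MR = 20508       # E. coli NusG molecular mass
TMA_MR = 40329       # T. maritima NusG molecular mass
ECO_LENGTH = 181     # residues, as quoted in Part IV
TMA_LENGTH = 353     # residues, as quoted in Part IV
INSERT_LENGTH = 171  # residues inserted after position 45

# Extra mass attributed to each inserted residue if the insertion
# accounts for the whole mass difference between the two proteins.
mass_per_inserted_residue = (TMA_MR - ECO_MR) / INSERT_LENGTH

# Length difference between the two proteins, for comparison with 171.
length_difference = TMA_LENGTH - ECO_LENGTH

if __name__ == "__main__":
    print(f"mass per inserted residue: {mass_per_inserted_residue:.1f} Da")
    print(f"length difference: {length_difference} residues")
```

The mass difference works out to about 116 daltons per inserted residue, close to a typical average residue mass, and the length difference of 172 residues is within one residue of the insertion size.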
Figure 11 Alignment of the NusG protein sequences
The sequences of the NusG proteins from T. maritima (Tma-NusG), E. coli (Eco-NusG), Synechocystis sp. PCC 6803 (Sec-NusG), S. virginiae (Svi-NusG) and T. thermophilus (Tth-NusG) are aligned. The residues that are conserved in either all five sequences or four of the five sequences and a conservative substitution in the fifth sequence are boxed. The N-terminal extension of about 110 amino acid residues in the NusG protein from S. virginiae, which does not show significant sequence identity to any part of the alignment, is omitted in the alignment. The NusG sequence of Synechocystis sp. PCC 6803 was kindly provided by Dr. Subramanian before publication.

The ribosomal protein L33 gene was identified between the tRNAtyr and tRNAtrp genes (positions 579 to 728, see Figure 4). The T. maritima L33 has 50 amino acid residues, and is basic (pI=9.9). This protein shows sequence identities of 42% to the E. coli ribosomal protein L33 (Table 6) and 47% to the L33 of tobacco chloroplast.

The utilization of codons at the 1150 positions in the L33, secE, nusG, L11, L1, L10 and L12 genes was analyzed. As previously observed, there are few restrictions on codon usage, although TTA and CTA (leucine), and CGC, CGA and CGG (arginine) codons are either not used or used infrequently (Tiboni et al., 1991). In general, this pattern is similar to that used by E. coli; the major difference is that E. coli prefers CGT and CGC, whereas T. maritima prefers the unrelated but synonymous AGA and AGG triplets to encode arginine.

3.2.7 Evolutionary implications

In archaebacteria and eukaryotes, the L10 and L12 equivalent proteins exhibit features which clearly distinguish them from the equivalent eubacterial proteins (Shimmin et al., 1989; Newton et al., 1990). 
In all these respects, the T. maritima L10 and L12 proteins are typically eubacterial. If the ancestor of T. maritima branched from other eubacteria very near the position of the universal common ancestor (i.e., the root of the universal tree), it would suggest that the primordial state of the L10 and L12 proteins was eubacterial, and that the domain rearrangements and duplications now apparent in the archaebacterial-eukaryotic proteins occurred early in the lineage leading to these now well separated domains. This proposal differs from a previous model which suggested that the archaebacterial-eukaryotic L10 and L12 structures were primordial and that the rearrangements occurred within the lineage leading to eubacteria (Shimmin et al., 1989).

3.3 SUMMARY

A 5788-nucleotide-long EcoRI fragment from the genome of T. maritima, identified by cross-hybridization to the L11, L1, L10 and L12 ribosomal protein gene sequences from E. coli, was cloned and sequenced. This fragment encodes five tRNAs (tRNAmet1, anticodon complementary to AUG; tRNAmet2, AUG; tRNAthr, ACA; tRNAtyr, UAC; tRNAtrp, UGG), a membrane protein SecE, which is putatively involved in the protein translocation process, the transcription termination-antitermination factor NusG, the five 50S subunit ribosomal proteins L33, L11, L1, L10 and L12, and the amino-terminal portion of the RNA polymerase β subunit. The five tRNA genes, the L33, secE, nusG genes, and the L11, L1, L10 and L12 genes form a complex transcription unit. Transcripts appear to be initiated from an upstream promoter, P1, located in front of the tRNAmet1 gene and from three internal promoters: P2 is located immediately in front of the tRNAmet2 gene; PL10 is near the beginning of the L1-L10 intergenic space, and PL12 is at the end of the L10 gene sequence. The tRNA sequences are excised from the leader regions of the P1 and P2 initiated transcripts. 
Three putative but potentially important regulatory sequences were identified within this operon: an L1 translational control site, a transcription attenuator, and a strong Rho-independent terminator. The strong terminator located distal to the L12 gene overlaps a fifth promoter, Pβ, which is used to initiate transcripts of the downstream RNA polymerase β subunit gene. The T. maritima secE encodes a protein with 65 amino acids, which can be aligned to the C-terminus of the elongated E. coli SecE protein (127 amino acids and three membrane-spanning segments). In the alignment, the amino acid identity is 32%. A sequence analysis indicates that the small SecE of T. maritima has one putative transmembrane segment in the central region of the protein. The T. maritima NusG protein exhibits 43% amino acid sequence identity when aligned to the E. coli counterpart; the alignment is interrupted by a 171 amino acid long insertion into the T. maritima protein after residue 45 that is absent in the E. coli and other eubacterial NusG proteins.

IV. The functions of the T. maritima NusG protein: DNA-binding activity and its role in transcription

4.1 INTRODUCTION

A group of proteins called Nus factors endow the DNA-dependent RNA polymerase of Escherichia coli with the ability to read through both factor-dependent and factor-independent transcription termination signals on the template DNA. These Nus factors function in a multimeric complex with the RNA polymerase that is assembled shortly after transcription initiation (for review, see Das, 1992; Roberts, 1993).

Two of the factors, NusB and NusE (ribosomal protein S10), bind to RNA polymerase as well as a box A RNA sequence and tether the transcript at this site to the elongating RNA polymerase (Nodwell and Greenblatt, 1993). NusA binds directly to the RNA polymerase and slows transcription elongation, causing the RNA polymerase to pause at specific sites in many transcriptional units (Yager and von Hippel, 1987). 
This factor is of fundamental importance in facilitating the interaction with the λ antitermination protein N during bacteriophage propagation (Whalen et al., 1988; Das, 1992). The N protein binds to a helical box B sequence that is adjacent to box A on λ transcripts. Together, these RNA motifs constitute λ nut sites (N utilization sites); the interactions between nut, N, NusA, NusB and NusE strengthen the tethering of the nascent transcript to the elongating polymerase and allow the complex to read through potential termination sites on the DNA template (Rosenberg et al., 1978; Friedman and Olson, 1983; Mason and Greenblatt, 1991; Das, 1992; Nodwell and Greenblatt, 1993).

The NusG protein also binds to the elongating RNA polymerase and is believed to be essential for Rho-dependent transcription termination events (Sullivan and Gottesman, 1992; Li et al., 1992, 1993). The termination factor Rho is an RNA-dependent ATPase with RNA helicase activity (Brennan et al., 1987); if unimpeded, it causes destabilization of the RNA-DNA duplex within the transcription bubble at the site of elongation and triggers the release of the nascent transcript when polymerase is paused at a Rho-dependent termination site (Morgan et al., 1983, 1984). The activity of Rho is mediated through an interaction with NusG in the elongation complex (Li et al., 1993). During the transcription of λ DNA, the interaction of Rho and NusG is effectively blocked by the NusA-N protein association (Li et al., 1993). The in vivo depletion of NusG has been shown to result in the suppression of termination at a number of different Rho-dependent termination signals; other potentially deleterious effects of NusG depletion have not been observed (Sullivan and Gottesman, 1992; Sullivan et al., 1992). Other experiments suggest that NusG may also play a role in the regulation of termination at an attenuator site located upstream of genes encoding the β and β' subunits of RNA polymerase in E. coli (Linn and Greenblatt, 1992).

The E. 
coli gene encoding the 181 amino acid long NusG protein is essential for cell viability and is located in the secE-nusG bicistronic operon (Downing et al., 1990). The secE gene encodes a transmembrane protein that is an essential component of the protein translocation system (Schatz et al., 1989). This operon is part of a larger cluster of essential genes that encode the translation factor EF-TuB, the ribosomal proteins L11, L1, L10 and L12, and the β and β' subunits of RNA polymerase (Post et al., 1979; An and Friesen, 1980).

We recently cloned and characterized a complex transcription unit containing the nusG gene from the hyperthermophilic eubacterium Thermotoga maritima (Part III; Liao and Dennis, 1992). The operon contains five tRNA genes and the ribosomal protein L33 gene in the 5' mRNA leader, a diminutive version of secE and an enlarged version of nusG, and in the distal position, the rplKAJL genes encoding ribosomal proteins L11, L1, L10 and L12. The NusG of T. maritima has 353 amino acid residues with a molecular weight of 40 kilodaltons; within this sequence there is a large 171 amino acid long insertion after residue 45 that is absent in the corresponding NusG homologs from E. coli and other eubacteria. We have observed that the T. maritima NusG protein has a generalized DNA-binding activity. In this part, the characterization of this activity and the role of NusG in transcription are described.

4.2 RESULTS

The nusG coding region was amplified with PCR from the T. maritima genomic clone pPD990 using oligonucleotide primers containing EcoRI and HindIII sites. The amplified fragment was cloned, sequenced to verify amplification fidelity, and finally recloned into the expression vectors pKK223-3 and pET-3a to give pPD1077 and pPD1078, respectively. In plasmid pPD1077, the T. maritima nusG gene is under the control of the tac promoter and in plasmid pPD1078, the gene is under the control of a T7 RNA polymerase promoter. In the E. 
coli strains harboring these expression plasmids, synthesis of the T. maritima NusG protein was tightly regulated and high level expression was dependent on IPTG induction.

Figure 12 Overexpression and purification of the T. maritima NusG protein
(A) Lysates of JM109 (lane 1) and JM109/pPD1077 grown in the presence of IPTG (lane 2) were electrophoresed on a 12% polyacrylamide-SDS gel. (B) Aliquots of material at various stages of NusG purification are visualized on a 16% polyacrylamide-SDS gel. The lanes are: (1) supernatant after heating cell extract to 75°C for 30 min and centrifugation; (2) CM Sepharose CL-6B column flow through; (3,4) column washes with buffer B; (5-16) column fractions eluted with a linear gradient (50-300 mM) of NaCl; not all the column fractions are shown. For each gel, a molecular weight standard (M) was included. The size of the protein standards is in kilodaltons (kd).

The expression of the T. maritima NusG protein from plasmid pPD1077 induced by the addition of IPTG to mid-log phase cells is illustrated in Figure 12A. To purify the protein, the cell lysate was heated to 75°C for 30 min and the heat-labile E. coli proteins were removed by centrifugation. The protein was further purified by adsorption onto a CM Sepharose column and eluted with a linear gradient of NaCl. The eluted protein was virtually homogeneous as judged by SDS-PAGE (Figure 12B). Using polyclonal antibodies produced against the purified recombinant protein, we have shown by Western blotting that the single cross-reacting protein in T. maritima cell extracts has the same size, as judged by electrophoretic mobility, as the recombinant protein produced in E. coli. This indicates that the length of the designated T. 
maritima nusG open reading frame is correct and that the encoded protein is nearly twice the size of the homologous NusG protein of E. coli (20 kilodaltons). Antibodies raised against the E. coli protein have been shown to cross-react with the authentic and the recombinant T. maritima NusG proteins (data not shown).

4.2.1 Binding activity of NusG to duplex DNA

The highly purified recombinant NusG protein of T. maritima binds to double-stranded DNA non-specifically. The salt and temperature dependencies and the time-course of binding were examined using a 5.2 kb linear DNA fragment (Figure 13). These reactions contained about 0.03 pmoles of DNA fragment and 12 pmoles of protein; this corresponds to about 80 molecules of NusG monomer per kb of duplex DNA. At salt concentrations between 5 and 300 mM NaCl, at incubation temperatures between 0° and 80°C, and at times greater than 30 sec, NusG forms a complex with DNA that is retained at the origin and unable to penetrate the agarose gel matrix.

Two types of complexes were observed based on ethidium bromide accessibility. At both low (5-20 mM) and high salt concentrations (160-300 mM) the complexes were stained with ethidium bromide, whereas at intermediate salt concentrations (40-100 mM) the complexes were not; these complexes are referred to as "loose" and "tight," respectively. In 50 mM salt, tight complexes are formed only at temperatures above 37°C. Even at 65°C, the initial protein-DNA complex found is loose and the transition to tight occurs only after about 30 min. The "tight" complex can be visualized at the origin by autoradiography when labeled DNA fragments are used (unpublished data).

Figure 13 The binding properties of NusG protein to linear duplex DNA
The salt and temperature dependence and the kinetics of binding of NusG to a 5.2 kb linear fragment of duplex DNA were investigated. Panels: (A) NaCl (mM); (B) Temp (°C); (C) Time (min). Typically, a 15 µl reaction mixture contained 0.03 pmoles of DNA fragment and 12 pmoles of NusG monomer. For the salt-dependence assays (A), the buffer was 5 mM sodium phosphate, pH 7.0, and the incubations were for two hours at 65°C. For the temperature-dependence experiments (B), the reaction buffer contained 33 mM NaCl and 17 mM sodium phosphate, pH 7.0, and incubation was for 2 hours at the indicated temperatures. For the time kinetics assays (C), the reaction buffer was 33 mM NaCl and 17 mM sodium phosphate, pH 7.0, and the incubation temperature was 65°C for the indicated times. Incubations were terminated by freezing samples in an ethanol-dry ice bath. The samples were thawed, mixed with loading solution, and immediately electrophoresed on a 1.0% agarose gel and stained with ethidium bromide. The molecular length standard (MLS, bacteriophage λ DNA digested by restriction enzyme PstI [λ PstI], and by EcoRI and HindIII [λ E-H], respectively) and control DNA (CONT, the position is denoted with an arrow) lanes are indicated.

We have investigated the stoichiometry of NusG binding to duplex DNA under the following conditions. The buffer was 33 mM NaCl, 17 mM sodium phosphate, pH 7.0, and the incubation conditions were for two hours at 65°C. For these studies we used either a long DNA fragment (5.2 kb), a short fragment (0.7 kb) or a mixture of a long and a short fragment (3.0 kb and 0.7 kb). The results obtained with the mixture of the two fragments are illustrated in Figure 14. In the first experiment, a constant amount of DNA (150 ng; 0.06 pmoles of each fragment) was titrated with increasing amounts of NusG protein (1.25 to 25 pmoles). At protein amounts above 5 pmoles per assay, both of the input DNA fragments are fully sequestered at the electrophoretic origin in the tight complexes which are inaccessible to ethidium bromide staining. 
At protein concentrations below 2.5 pmoles of protein per assay, there is no evidence of any protein-DNA interaction. These results suggest that protein binding to DNA is cooperative, and that the critical protein concentration necessary for complete complex formation in this experiment was about 10 pmoles per assay; this corresponds to 44 molecules of NusG protein per kb of DNA. In separate experiments using the 5.2 kb or the 0.7 kb fragment, we estimated that the critical amount of protein required for complete complex formation was 33 and 41 molecules per kb of DNA, respectively (data not shown). These and numerous other experiments indicate that the protein has no sequence specificity and exhibits no discernible preference for long versus short DNA fragments.

In the second experiment (Figure 14B), a constant amount of NusG protein (500 ng; 12.5 pmoles) was titrated against increasing amounts of an equimolar mixture of the 3.0 and 0.7 kb DNA fragments (0.02 to 0.32 pmoles of each fragment). At DNA fragment concentrations below 0.08 pmoles per assay, virtually all the DNA is sequestered in a tight complex and retained at the electrophoretic origin. Again, this corresponds to about 42 molecules of NusG monomer per kb of target DNA. At higher DNA concentrations, fragments not complexed with the protein are visible. In experiments using a 5.2 or a 0.7 kb fragment, we estimated that the target DNA is totally sequestered when the amount of protein exceeds 21 monomers per kb of DNA.

Figure 14 The stoichiometry of NusG:duplex DNA complexes
[Panels: A, NusG (pmol); B, dsDNA (pmol)]
Assays were carried out in 15 µl volumes containing 33 mM NaCl and 17 mM sodium phosphate, pH 7.0, and incubations were for two hours at 65°C. In (A), the amount of DNA was constant (0.06 pmoles each of a 0.7 and a 3.0 kb fragment) and the amount of NusG protein was increased from 1.25 to 25 pmoles.
In (B), the amount of NusG protein was constant (12.5 pmoles) and the amount of DNA was increased from 0.02 to 0.32 pmoles of each fragment. The reactions were electrophoresed through a 1.0% agarose gel and stained with ethidium bromide. The molecular length standards (MLS, λ PstI and λ E-H) and control DNA (CONT; 0.4 pmoles of each fragment) are indicated. The arrows indicate the positions of the 3.0 kb and 0.7 kb DNA fragments.

4.2.2 Accessibility of the NusG-DNA complex to restriction by TaqI

The state of the complex between NusG protein and duplex DNA was further investigated using accessibility to restriction endonuclease digestion. The restriction enzyme TaqI has a temperature optimum at 65°C and can efficiently cleave DNA over a wide range of salt concentrations. In this assay, we incubated duplicate samples of linear DNA at 65°C in buffers containing between 10 and 150 mM NaCl in the absence or presence of NusG protein (80 monomers per kb of DNA). After two hours, TaqI endonuclease was added and digestion was continued for 1.5 hours. One of the duplicates was electrophoresed directly on a standard agarose gel and the other was mixed with a small amount of SDS (0.2% final concentration) and run on a gel containing 0.2% SDS (Figures 15A and B). This concentration of SDS is sufficient to disrupt the NusG-DNA complex so that the intact fragment or TaqI digestion products can be electrophoresed into the gel and visualized.

In the native gel, it is apparent that the DNA fragment in samples not containing NusG protein is efficiently restricted by TaqI at salt concentrations below 100 mM NaCl. At 150 mM NaCl, TaqI cleaves less efficiently and restriction of the DNA is incomplete. In samples containing NusG protein and run on the native gel, the DNA is retained as a complex in the well because the number of monomers of protein per kilobase of DNA exceeds the critical value of 20-50.
When the NusG-DNA complexes that had been digested with TaqI were dissociated with 0.2% SDS and run in an SDS gel, it can be seen that the DNA is protected from TaqI restriction by NusG binding at salt concentrations below 50 mM NaCl. At salt concentrations above 50 mM NaCl, TaqI can digest the DNA within the NusG-DNA complex nearly to completion. At 150 mM NaCl, TaqI is able to digest DNA in the complex much more efficiently than it digests free DNA. This result suggests that at salt concentrations below 50 mM, the DNA-protein complex is either more compact or more static and therefore resistant or inaccessible to digestion by TaqI endonuclease. We cannot explain the activation of TaqI nuclease by NusG in high salt buffer.

4.2.3 Binding activity of NusG to single-stranded DNA

Next, we tested whether NusG can bind to single-stranded as well as duplex DNA. We found that when we mixed NusG (2 µg; 50 pmoles) with single-stranded M13 DNA (200 ng; 0.085 pmoles) to give a ratio of about 80 monomers of protein per kb of single-stranded DNA, a complex that was retained in the well of the electrophoresis gel formed within 30 sec at 65°C. When incubated less than 30 min, the complexes were visible with ethidium bromide staining; with incubation times greater than 30 min, the complexes could no longer be stained. The complexes were at least partially dissociated by incubation with SDS (>0.1%) (data not shown).

The stoichiometry of the interaction between NusG and single-stranded DNA was examined by titrating a constant amount of single-stranded M13 DNA (0.084 pmoles) with various amounts of NusG (1.25 to 160 pmoles per assay). At protein concentrations above 20 pmoles, all the DNA was sequestered in complexes and retained at the electrophoretic origin. The protein concentration required to achieve complete complex formation was 40 pmoles; this corresponds to about 66 monomers of NusG per thousand bases of single-stranded DNA.
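The ratios of NusG monomers per thousand bases quoted above follow from dividing the amount of protein by the amount of DNA weighted by its length. A minimal sketch of that arithmetic is given below; the function name and argument layout are hypothetical, and the 7.2 kb length of M13 single-stranded DNA and the 0.084 pmole amount are taken from the competition experiment and the Figure 16 legend, respectively.

```python
# Sketch of the stoichiometry arithmetic (function name is hypothetical).


def monomers_per_kb(protein_pmol, fragments):
    """Return protein monomers per kb (or per thousand bases) of DNA.

    fragments is a list of (amount_pmol, length_kb) pairs, so that a
    mixture of fragments can be handled as well as a single species.
    """
    # Total amount of DNA expressed as pmol x kb.
    total_kb_pmol = sum(pmol * kb for pmol, kb in fragments)
    if total_kb_pmol <= 0:
        raise ValueError("total amount of DNA must be positive")
    return protein_pmol / total_kb_pmol


if __name__ == "__main__":
    # 40 pmoles of NusG with 0.084 pmoles of 7.2 kb single-stranded M13 DNA.
    print(round(monomers_per_kb(40, [(0.084, 7.2)])))
    # 50 pmoles of NusG with 0.085 pmoles of the same DNA.
    print(round(monomers_per_kb(50, [(0.085, 7.2)])))
```

The first case gives about 66 monomers per thousand bases and the second about 82, which agrees with the values of 66 and "about 80" stated in the text.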
In the reciprocal experiment, a constant amount of NusG (25 pmoles) was titrated with increasing amounts of single-stranded DNA. At DNA concentrations of 0.063 pmoles per assay and below, virtually all the DNA was sequestered; this corresponds to greater than 54 NusG monomers per thousand bases of DNA (Figure 16).

Figure 15 Susceptibility of NusG:duplex DNA complexes to restriction by TaqI
[Panels: A, native gel; B, 0.2% SDS gel. Lanes: NaCl (mM) 10, 25, 50, 75, 100, 150, each without (-) and with (+) NusG]
In duplicate experiments, NusG protein (1 µg; 25 pmoles) was mixed with a 3.5 kb linear DNA fragment (0.09 pmoles) in 5 mM sodium phosphate buffer (pH 7.0) containing between 10 and 150 mM NaCl. The sample volume was 19.5 µl and the first incubation was for two hours at 65°C. At the end of the first incubation, a 0.5 µl mixture containing 1 unit of TaqI endonuclease and sufficient Mg2+ to bring the final concentration to 3.0 mM was added to each sample. The second incubation was at 65°C for 1.5 hours. One sample from each duplicate was mixed with standard loading solution and run on a standard agarose gel (1.0%) (A), and the other was mixed with loading solution containing SDS (0.2% final concentration) and run on an SDS-containing 1.0% agarose gel (0.2% final concentration) (B). The arrow indicates the position of the control 3.5 kb linear DNA fragment. The 3.5 kb DNA fragment is cleaved by TaqI to produce a 2.0 kb fragment (o) and a number of other smaller subfragments.

Figure 16 The stoichiometry of NusG:single-stranded DNA complexes
[Panels: A, NusG (pmol); B, ssDNA (pmol)]
Assays were carried out in 15 µl volumes containing 33 mM NaCl and 17 mM sodium phosphate, pH 7.0, and incubations were for two hours at 65°C. In (A), the amount of DNA was constant (0.084 pmoles) and the amount of NusG was increased from 1.25 to 160 pmoles.
In (B), the amount of NusG protein was constant (25 pmoles) and the amount of DNA was increased from 0.021 to 0.168 pmoles. The position of free DNA is indicated by the arrow. The lane designated MLS is a molecular length standard (λ PstI).

4.2.4 Competition between single-stranded and duplex DNA for NusG binding

The results above suggest that somewhat more NusG protein is required to fully complex single-stranded DNA (>50 monomers per thousand bases) than to fully complex duplex DNA (<50 monomers per thousand base pairs). Furthermore, the transition between free DNA and complex occurred over a narrow concentration range for duplex DNA and over a broader concentration range for single-stranded DNA (Figures 14 and 16). In fact, at intermediate NusG concentrations, a thin trail of ethidium bromide stainable material is apparent between the band of free DNA and the electrophoretic origin (Figure 16). This material presumably represents intermediates or incomplete complexes.

We then investigated the preference of NusG for single-stranded or duplex DNA in a competition experiment, where equimolar amounts of a 3.5 kb duplex DNA fragment and a 7.2 kb single-stranded M13 DNA (0.085 pmoles of each) were mixed together and titrated against increasing amounts of NusG protein (1.25-50 pmoles). At a protein concentration above 10 pmoles per assay, all of the 3.5 kb duplex DNA was sequestered in tight complexes and retained at the electrophoretic origin (Figure 17). At the same time, at least some single-stranded DNA remained visible in the uncomplexed or partially complexed state at a protein concentration of 40 pmoles per assay (Figure 17). These results indicate that NusG binds preferentially and more efficiently to duplex DNA.

4.2.5 Role of NusG in transcription

Since NusG functions as a transcription factor in E. coli, and the T. maritima NusG binds to DNA, the next question we asked concerned the specific role of T. maritima NusG in transcription.
Figure 17 Competition between single-stranded and duplex DNA for NusG binding
Assays were carried out in 15 µl volumes containing 33 mM NaCl and 17 mM sodium phosphate, pH 7.0, and incubations were for two hours at 65°C. Each assay contained 0.085 pmoles of a 3.5 kb duplex DNA fragment and 0.085 pmoles of single-stranded M13 DNA. The amount of NusG protein was increased from 1.25 pmoles to 50 pmoles per assay. After incubation, the samples were electrophoresed through a 1.0% agarose gel and stained with ethidium bromide. Lane designations are MLS, molecular length standard (λ PstI); SS + DS, single-stranded plus duplex DNA without protein; DS, duplex DNA without protein; SS, single-stranded DNA without protein.

A number of DNA templates were constructed for in vitro transcription assays. A 170 bp EcoRI-AvaI fragment containing the promoter P1 (see Part III) was isolated and cloned into plasmid pGEM3-Zf(+) between the EcoRI and AvaI sites of the multiple cloning site. A 93 bp fragment containing the terminator To was obtained by PCR amplification using the 5' primer oD20 and the 3' primer oD21 (Table 3). The product was cloned into the XbaI and HindIII sites downstream of the P1 promoter fragment. (Primers oD20 and oD21 have XbaI and HindIII recognition sequences, respectively.) The resulting 271 bp EcoRI-HindIII fragment containing the promoter P1 and the terminator To was used for the in vitro transcription assay with partially purified DNA-dependent RNA polymerase from T. maritima; the results of this experiment are shown in Figure 18. The major bands correspond to (i) a 135-nucleotide-long transcript initiated at or near the mapped starting site (nucleotide 101, Figure 4) and terminated at To (denoted as Term), and (ii) the 170-nucleotide-long transcripts initiated at the same starting site and extended to the end of the linearized template (denoted as RT).
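The expected sizes of the two major transcripts can be checked against the template geometry. The sketch below is illustrative: the function names are hypothetical, and it assumes that the start site numbering (nucleotide 101) can be counted from the EcoRI end of the 271 bp template, which the text does not state explicitly.

```python
# Sketch of the transcript-length arithmetic. The coordinate convention
# (position 1 = first nucleotide at the EcoRI end) is an assumption.


def runoff_length(template_len, start_pos):
    """Length of a transcript that runs from start_pos to the template end."""
    if not 1 <= start_pos <= template_len:
        raise ValueError("start_pos must lie within the template")
    return template_len - start_pos + 1


def termination_position(start_pos, transcript_len):
    """Template position of the last nucleotide of a terminated transcript."""
    if transcript_len < 1:
        raise ValueError("transcript_len must be at least 1")
    return start_pos + transcript_len - 1


if __name__ == "__main__":
    print(runoff_length(271, 101))         # read-through (RT) transcript
    print(termination_position(101, 135))  # 3' end of the Term transcript
```

Under this assumption the read-through product would be 171 nucleotides, consistent with the band of about 170 nucleotides, and the 135-nucleotide transcript would end near position 235, inside the 93 bp terminator fragment at the downstream end of the template.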
In the absence (lane 1) and presence of small amounts of NusG (lanes 2-4, Figure 18), other RNA species are easily recognizable, and probably result from non-specific initiation and termination events. However, at a relatively higher NusG concentration (12.5 µM, lane 5), only specific transcripts can be seen (Term and RT); the intensity of these bands appears to be reduced. These results also indicate that about 30% of transcripts escape termination. In contrast, in vivo transcript mapping suggested that the terminator To was nearly 100% efficient in eliciting termination of transcription from upstream promoters (Part III; Liao and Dennis, 1992). This discrepancy suggests that additional protein factor(s) may be required for efficient transcription termination. Alternatively, the sequence context of the terminator, the tertiary structure of the genome, or the intracellular environment may be important for efficient termination.

In the second set of in vitro transcription experiments, a potential transcription attenuator (AL10) normally located between the L1 and L10 genes (the L10 attenuator, see Part III) was integrated into the DNA templates. On the first such template, the L10 attenuator (A) was arranged between the promoter P1 and

Figure 18 In vitro transcription of DNA template containing a promoter and a terminator
The promoter P1 (P) and terminator To (T) that were identified in the cloned gene cluster from the T. maritima genome (Part III) were fused together (details in the text), and a 271 nucleotide long DNA fragment was used for an in vitro transcription assay with purified RNA polymerase from T. maritima. Major transcripts are indicated (RT and Term), which are also depicted below the template with lines, and the 5' end of the transcripts is marked with a solid circle (•). Lane M shows DNA size markers (3' end-labeled restriction fragments); the sizes are (from top, bp): 190, 147 and 110. The concentrations of T.
maritima NusG are (µM): 0 (lane 1), 0.0125 (lane 2), 0.125 (lane 3), 1.25 (lane 4) and 12.5 (lane 5); these correspond to approximately 0, 0.6, 6, 60, and 600 monomers of NusG per template molecule.

the terminator To (Figure 19). Specific transcripts generated on this template were those initiated at the promoter and terminated at AL10 (Att, about 160-nucleotide-long), terminated at To (Term, about 370-nucleotide-long), or extended to the end of the linearized template (RT, about 410-nucleotide-long). Non-specific transcripts were also apparent (lanes 1-4, Figure 19). Addition of NusG had a similar effect on transcription of this template; at a relatively high concentration (12.5 µM, lane 5 of Figure 19), non-specific transcripts were greatly suppressed, whereas the specific transcripts (RT, Term and Att) were largely unaffected.

To check whether NusG binds to the DNA template under the in vitro transcription assay condition, DNA band-shift assays using this template were conducted. DNA-NusG complexes that were retained at the origin of electrophoresis were observed when the protein was added in amounts corresponding to those used for lanes 4 and 5 in the transcription assay (1.25 and 12.5 µM, respectively). No complexes were detected at lower protein concentrations. Thus, the NusG protein binds to the template DNA cooperatively. However, more monomers of NusG are required to form the protein-DNA complexes under the in vitro transcription assay condition than under the conditions described above (about 200 monomers of NusG per thousand base pairs of template DNA; data not shown).

To verify that Att (about 160-nucleotide-long) corresponds to transcripts that are terminated at AL10, two additional templates were used (Figure 19). Both contain the promoter P1 followed by AL10, and the distance between the two sequence elements was kept constant (same as on template 1, Figure 19). Templates 2 and 3 are 410 and 760 bp long, respectively.
The major transcripts are the read-through products (indicated by arrows), and the transcripts ended at the AL10 site (Att, Figure 19).

Figure 19 In vitro transcription of DNA templates containing a promoter, an attenuator, and a terminator
Templates that contain the promoter P1 (P), the attenuator AL10 (A) and the terminator To (T) (Temp 1), and that carry P1 (P) and AL10 (A) (Temp 2 and Temp 3) were used for in vitro transcription assays with purified RNA polymerase from T. maritima. (All three sequence elements were from the cloned gene cluster from the T. maritima genome [see Part III].) Major transcripts are illustrated (RT, Term, and Att), which are also indicated below each template with lines, and the 5' end of the transcripts is marked with a solid circle (•). Lanes 1 to 5 show transcripts synthesized on template 1, and the lanes marked Temp 2 and Temp 3 show the RNA products generated on templates 2 and 3, respectively. Arrows point to the read-through products (RT) in lanes Temp 2 and Temp 3. Lane M shows DNA size markers (3' end-labeled restriction fragments); the sizes are (from top, bp): 765, 577, 489, 457, 404, 360, 328, 281, 255, 240, 190, and 147. The concentrations of T. maritima NusG are (µM): 0 (lane 1), 0.0125 (lane 2), 0.125 (lane 3), 1.25 (lane 4) and 12.5 (lane 5); these correspond to approximately 0, 1, 10, 100, and 1,000 monomers of NusG per template molecule (0.5 kb).

4.2.6 Association of NusG with ribosomes in T. maritima

To determine the possible location of NusG within the T. maritima cell, polyclonal antibodies against purified NusG protein were used to detect whether it is associated with the ribosomes of T. maritima. The 70S ribosomes and the 50S and 30S ribosomal subunits from T. maritima cells were purified by sucrose gradient centrifugation. Western blotting analysis detected a protein band with a molecular weight of about 40 kilodaltons in both supernatant and pellet fractions after centrifugation at 248,000 x g.
The pellet corresponds to the cellular particles larger than 20S. Thus, the NusG protein seems to be associated with the ribosomes in the T. maritima cell. About 25%-50% of NusG partitioned with the pellet. The pellet was then dissolved and centrifuged through a sucrose gradient. The fractions corresponding to the 70S particle and the 50S and 30S ribosomal subunits were collected and pooled. Western blotting analysis of these fractions indicated that NusG protein was present in all fractions. However, the amounts of NusG diminished in the 50S and 30S subunit fractions after another round of sucrose gradient centrifugation, which suggests that NusG is not an integral part of the ribosome (data not shown).

4.3 DISCUSSION

There is strong biochemical and genetic evidence to suggest that in E. coli, Nus factors form a multimeric complex with core RNA polymerase during transcriptional elongation (for reviews, see Das, 1992; Roberts, 1993). The complex is endowed with the ability to read through both factor-dependent and factor-independent transcription-termination signals. NusG is an important component of this complex; it binds to and interacts with both core RNA polymerase and Rho factor. The interaction with Rho is necessary for Rho-mediated termination (Mason and Greenblatt, 1992; Sullivan and Gottesman, 1992; Li et al., 1992, 1993).

In E. coli, the nusG gene is essential for cell viability; the in vivo depletion of NusG appears to affect only Rho-mediated termination (Downing et al., 1990; Sullivan and Gottesman, 1992). The gene encoding NusG is located in the secE-nusG bicistronic operon and is adjacent to the rplKAJL-rpoBC operon. Nuclease S1 protection assays indicate that the secE-nusG transcripts are about five to ten-fold less abundant than the ribosomal protein encoding transcripts (Downing et al., 1990; Downing and Dennis, 1991). Using a quantitative Western blotting assay, Li et al. (1993) estimated that there are about ten thousand copies of NusG per cell.
This value is about equal to the number of core RNA polymerase molecules and about five-fold below the number of ribosomes per cell (Bremer and Dennis, 1987).

A NusG homologous gene, designated vbrA, has recently been cloned and characterized from Streptomyces virginiae (Okamoto et al., 1992). Unlike other NusG homologs, the S. virginiae protein contains an extra 125 amino acids at its amino terminus. The protein was first characterized because of its ability to bind butyrolactone, an autoregulator that triggers secondary metabolism and the production of the antibiotic virginiamycin in this organism. The unique amino terminal domain has a high proportion of acidic residues (35%) and contains four copies of the tetrapeptide Glu-Glu-Ala-Ala. It was suggested but not demonstrated that this acidic domain, absent from other NusG proteins, is the butyrolactone receptor domain. If this is correct, it suggests that the protein not only senses the presence of the butyrolactone but also activates genes responsible for secondary metabolism and antibiotic production. The gene activation may be mediated through the termination-antitermination activity of NusG. If so, it substantiates the idea that the NusG protein is an important element in determining the termination properties of the transcription complex. Alternatively, this protein might have either a specific or general DNA-binding activity and thereby either directly or indirectly influence transcription initiation.

The T. maritima protein contains a 171-amino-acid-long insertion after residue 45 that is not present in any of the other eubacterial NusG proteins that have been examined to date. There is some indication that this insertion may be the result of a partial gene duplication, but the evidence for this is not compelling and the situation is clearly more complicated (unpublished observations).
The T. maritima NusG contains a relatively high content of charged amino acids (121 of 353 residues), but is only moderately basic (57 acidic and 64 basic residues; pI = 9.0) (Liao and Dennis, 1992). The charged residues are evenly distributed along the length of the molecule.

The relationship between this 171 amino acid long insertion and the DNA-binding activity of the protein remains uncertain. Three approaches have been used with limited success to address this issue. First, the insert sequence as well as the entire sequence were compared to all entries in the protein sequence data base. The T. maritima NusG exhibited homology only to other eubacterial NusG proteins, whereas the insert failed to exhibit significant sequence identity to any other sequence, including proteins with known DNA-binding activity. Second, we constructed a deletion version of the T. maritima NusG that is missing the entire 171 amino acid insertion. When expressed, the deletion protein was detectable by Western blotting but failed to accumulate in appreciable amounts and therefore could not be purified (unpublished results). Finally, we examined the E. coli NusG protein for its ability to bind DNA; no complexes were detected at 37°C using the same conditions employed here for the T. maritima protein. These results indicate that the T. maritima NusG has a DNA-binding activity not present in the homologous E. coli protein; the activity may or may not be located within or related to the insertion unique to this protein.

Our results indicate that the protein binds cooperatively to DNA in vitro; the stoichiometry is about 20-40 monomers per thousand base pairs of DNA. In T. maritima, the secE and nusG genes are cotranscribed with the downstream ribosomal protein genes (Part III; Liao and Dennis, 1992). If the nusG cistron is efficiently translated, the stoichiometry of NusG and ribosomes should be about 1:1, and the protein is likely to be in substantial excess over the amount of core RNA polymerase.
Thus, it is possible that the amount of the protein may be five to ten-fold higher than in E. coli. If the genome size and composition of a T. maritima cell are similar to those of E. coli, there may be as many as 50,000 copies of NusG per cell; this would correspond to about five monomers per thousand base pairs of DNA.

Two different low molecular weight histone-like proteins associated with the nucleoid of E. coli have been purified and characterized (reviewed by Drlica and Rouvière-Yaniv, 1987). The first, protein HU, is highly basic, resembling eukaryotic histones, and plays an important role in maintaining chromosome structure and superhelicity (Rouvière-Yaniv et al., 1979; Broyles and Pettijohn, 1986). The HU protein is very abundant (with 20,000 to 100,000 copies per cell; Broyles and Pettijohn, 1986). It binds to DNA more tightly than to RNA, as well as to ssDNA more strongly than to dsDNA; moreover, it has the propensity to associate with the ribosome (Dijk et al., 1983), but the biological significance of these properties is not clear (Gualerzi et al., 1986). Interestingly, we found that the T. maritima NusG also seems to associate with ribosomes and to bind to RNA (data not shown). However, interactions between ribosomes and cellular proteins appear to be very complicated and the nature of these interactions is not well understood.

The second, protein H1, is neutral rather than basic and binds as a dimer once every 400 base pairs to duplex DNA. The abundance of H1 in the bacterial nucleoid is second only to that of HU (estimated at about 20,000 copies per cell) (for review, see Higgins et al., 1990). Mutations in the osmZ gene encoding the H1 protein are pleiotropic and influence the expression of many genes in a non-uniform fashion, presumably by affecting superhelicity and overall topology of the nucleoid (Hulton et al., 1990).
Both HU and H1 generally inhibit transcription by E. coli RNA polymerase; however, they can also enhance transcription of certain templates (Drlica and Rouvière-Yaniv, 1987; Higgins et al., 1990). At the present time, it is still uncertain whether the NusG protein of T. maritima, in addition to being a transcription factor, also plays a role similar to HU and H1 in maintaining the structure of the nucleoid in this hyperthermophilic eubacterium.

The T. maritima NusG seems to play an important role in transcription; at a protein concentration of about 2,000 monomers per thousand base pairs of template DNA, the generally high background of aberrant initiation and termination was essentially eliminated, and at the same time, specific transcription initiation and termination were maintained. It is possible that at high temperature, the RNA polymerase is very active; aberrant initiation and termination events are frequent, which would result in high level synthesis of useless transcripts. To prevent this, the NusG protein may have acquired new functions that make RNA polymerase more faithful. NusG may stay bound to the DNA template, the RNA transcripts, and even the RNA polymerase throughout the transcription process. These interactions would prevent premature release of the transcripts and dissociation of the RNA polymerase from the template. On the other hand, these interactions may inevitably slow down RNA synthesis.

4.4 SUMMARY

The NusG protein of T. maritima has 353 amino acid residues with an apparent molecular weight of 40 kilodaltons, twice as large as the E. coli counterpart, due to a large insertion in the central part of the sequence. It is a basic protein (pI = 9.0). The T. maritima NusG was expressed in E. coli and the recombinant NusG was purified. The purified NusG has non-specific DNA binding activity; it binds DNA cooperatively. We estimated by DNA band-shift assays that about 40 NusG monomers per kilobase pair of dsDNA are needed to form NusG-dsDNA complexes.
The number of NusG monomers required per kilobase of ssDNA is higher (about 60) for the formation of NusG-ssDNA complexes. Two types of NusG-DNA complexes have been observed: the first type forms instantly and can be stained with ethidium bromide ("loose" complex); the second type forms more slowly and is probably converted from the structure(s) of the first type of complex. The second type may be more compact, since it cannot be stained with ethidium bromide ("tight" complex). The protein binds to both ds- and ssDNA, but preferentially to dsDNA in a mixture of both DNA molecules. It seems to play important roles in transcription in T. maritima. In vitro transcription assays using purified DNA-dependent RNA polymerase from T. maritima suggested that at relatively high concentration, NusG suppresses aberrant transcription; however, the production of specific transcripts is largely unaffected.

The sequence elements identified in the cloned gene cluster are functional in the in vitro transcription system. The promoter P1 was used as the promoter element. The transcripts synthesized under the control of this promoter are consistent with the mapped transcripts in vivo. Transcription attenuation was also observed at the attenuator AL10. Transcription termination at the terminator To is faithful, but transcription did not stop fully at this site in vitro, whereas it functions at nearly 100% efficiency in vivo.

V. Molecular phylogenies based on the sequences of ribosomal proteins L11, L1, L10 and L12

5.1 INTRODUCTION

Ribosomes are subcellular particles that play a structural and functional role in the template directed synthesis of protein. Ribosomes were already present in the common primordial ancestor, and their basic structural and functional features have been preserved in all its diverse descendants.
As a result, the macromolecular components of the ribosome, especially the small subunit ribosomal RNA, have been useful chronometers to measure evolutionary relationships among extant organisms (Pace et al., 1986; Pace, 1991).

In the E. coli ribosome, a pentameric complex, consisting of four copies of protein L12 and a single copy of protein L10, binds cooperatively along with another protein, L11, to a region in 23S rRNA between nucleotides 1030 and 1120 (Dijk et al., 1979; Egebjerg et al., 1990; Ryan et al., 1991). This interaction produces a distinct and easily recognizable stalk on the large ribosomal subunit. This structure is essential for the binding of the extrinsic factors EF-Tu and EF-G and participates in conformational rearrangements of the ribosome that are accompanied by the hydrolysis of GTP (reviewed by Liljas, 1982; Shimmin et al., 1989). Quaternary complexes similar to the E. coli (L12)4-L10-L11-rRNA complex are structurally and functionally conserved in the ribosomes of archaebacteria and eukaryotes (Beauclerk et al., 1985; El-Baradi et al., 1987; Uchiumi et al., 1987; Cassiano et al., 1990). A fourth protein, L1, binds to large subunit RNA between nucleotides 2100 and 2200 (Branlant et al., 1981). It functions to stabilize peptidyl-tRNA binding to the ribosome P site and participates indirectly in the factor dependent GTP hydrolysis (Subramanian [...]; [...], [19]83).

In E. coli, the genes encoding L11, L1, L10 and L12 form a complex transcription unit that also contains the genes for the two large subunits of RNA polymerase. It was somewhat surprising to find that the clustering of the genes encoding these four ribosomal proteins was conserved not just in eubacteria but also in a range of distantly related archaebacterial species including Halobacterium cutirubrum (Shimmin and Dennis, 1989), Haloferax volcanii (Shimmin and Dennis, unpublished results), Haloarcula marismortui (Arndt and Weigel, 1990), and Sulfolobus solfataricus (Ramirez et al., 1989).
In eukaryotes, these genes are not linked (Newton et al., 1990) and the L12 gene has undergone a very ancient duplication that possibly predates the earliest eukaryotic organism.

In this part, available L11, L1, L10 and L12 gene and protein sequences from eubacterial, archaebacterial and eukaryotic organisms are aligned and analyzed. We observed that for each of the gene-protein analyses, there is strong coherence for grouping organisms into the three primary kingdoms (or domains): eubacteria, archaebacteria and eukaryotes. That is, the gene or protein sequences of organisms from within any one of the three domains are more closely related to each other than they are to sequences from the other two domains. The patterns of divergence for the L11, L10 and L12 proteins between eubacteria, archaebacteria and eukaryotes are surprisingly dissimilar considering their intimate physiological interactions on the ribosome.

5.2 RESULTS AND DISCUSSION

5.2.1 Alignment and phylogeny of L11 proteins

There are five eubacterial sequences and one chloroplast sequence, which is encoded by the nuclear genome, available for ribosomal protein L11. They align from end to end with only two gaps in the alignment at positions 2-5 and 53 (Figure 20). The high degree of amino acid sequence identity among these six sequences clearly suggests that the chloroplast sequence is of eubacterial origin. The three available archaebacterial L11 protein sequences can be easily accommodated to this alignment. The archaebacterial proteins retain seven of the eight proline residues that are conserved in the eubacterial alignment at positions 24, 26, 27, 30, 60, 79 and 98; an eighth proline at position 80 has been replaced only in the S. solfataricus sequence. The archaebacterial L11 proteins are further characterized by a shorter amino terminus and by a 25-32 amino acid long extension at the carboxyl terminus when compared to the eubacterial L11 sequences.

The proteins designated "L15" from S.
cerevisiae (Pucciarelli et al., 1990) and"L12" from R. rattus (Suzuki et al., 1990) are homologs. They align end-to-endwithout gaps and are identical at 115 of the 165 positions. Based upon (i)immunological cross-reactivity (Juan-Vidales, et al., 1983), (ii) a limited degree ofamino acid sequence similarity, and (iii) a common binding site within mouse 28SrRNA (El-Baradi et al., 1987), these eukaryotic proteins have been implicated ashomologues of the Lii protein of E. coli. The eukaryotic L11 sequences can beaccommodated in the alignment by the inclusion of only two additional internalgaps (positions 66 and 77). Of the seven positions where proline is conserved in thearchaebacterial and eubacterial proteins, only two (positions 30 and 79) are retainedin the eukaryotic proteins (Figure 20A).The phylogenetic relationships between the eleven L11 protein sequenceswere analyzed using PAUP (Figure 20B). The eubacteria were contained within awell-defined domain. The location and branching order of three species within thisdomain, Streptomyces virginiae, spinach chloroplast and T. maritima, are notrigorously defined. The two eukaryotic L11 sequences form another well definedbranch that originates from the S. solfataricus linea:e within II • •group. If the ancestral root of the tree is located near or within the eubacterial109domain (below the position of the arrow in Figure 20B), then the archaebacteriawould appear to be monophyletic but not holophyletic. (A group of taxa are said tobe holophyletic only when they not only are descended from a single ancestralspecies, but represent all the descendants of that ancestor. In the tree shown inFigure 20B, archaebacteria share a common ancestor with eukaryotes; thus bystrictest definition, archaebacteria are only considered being a monophyletic group.)However, bootstrap analysis indicates that the positioning of S. solfataricus relativeto eukaryotes is tenuous. 
In the DNA parsimony tree (and in all other distance method trees), the archaebacteria are both monophyletic and holophyletic; the bootstrap confidence for this arrangement was 0.82.

5.2.2 Alignment and phylogeny of L1 protein sequences

There are five eubacterial and six archaebacterial L1 equivalent protein sequences available (Figure 21A). Although the proportion of conserved amino acid residues within the L1 family is relatively high, the alignment is interrupted by gaps at approximately 15 different positions. Many of these gaps, particularly the five gaps located beyond amino acid position 125, clearly differentiate the archaebacterial proteins from the eubacterial proteins. Deletion-insertion events are generally rare and their co-occurrence in multiple sequence alignments is a strong indication of common ancestry.

In E. coli, protein L1 binds to nucleotides 2100-2200 of the 23S rRNA (Branlant et al., 1981). The sequence and secondary structure of this binding domain within large subunit rRNA of archaebacteria and eukaryotes are highly conserved and the E. coli protein can protect these sites in vitro from ribonuclease digestion (Zimmerman et al., 1980; Gourse et al., 1981). In E. coli, protein L1 is also an autogenous regulator of translation of the mRNA containing the L11, L1, L10 and L12 cistrons. A region within the leader of the mRNA exhibits primary sequence and secondary structural similarity to the authentic L1 binding domain in

Figure 20. Alignment of the amino acid sequences of the ribosomal protein L11 family, and phylogenetic tree based on this alignment

A. The L11 proteins are from five eubacteria, one chloroplast, three archaebacteria, and two eukaryotes. The leader peptide required for import of the chloroplast L11 protein into the organelle is not included in the alignment. The numbers indicate the common alignment positions. The species names and their abbreviations are listed in Table 5 (pp. 41-42).

B.
A parsimony analysis of aligned sequences was carried out using PAUP withbranch and bound search option, and the parsimonious tree is illustrated. Thereare 130 informative sites; all of them are included in the parsimony analysis. Theconsistency index for the tree is 0.886. The numbers indicate percent confirmationof grouping of species to the right of the node by bootstrapping analysis with 2000replications. The arrow indicates that possible root of the universal tree is locatedbetween the eubacteria and archaebacteria-eukaryotes.B111A10^20^30^40^50^60^70^80^90^100SceL11^MPPKFDPNEVKYLYLRAVGGEVGASAALAPKIGPLGLSPKKVGEDIAKATKE-FKGIKVYVQLKI-QNRQ-AAASV-VPSASSLVITALKEPPRDRKKDKRraL11^MPPKFDPNEIKVVYLRCTGGEVGATSALAPKIGPLGLSPKKVGDDIAKATGD-WKGLRITVKLTI-QNRQ-AQIEV-VPSASALIIKALKEPPRDRKKQKSsoLll^MPTKT^ IKIMVEGGSAKPGPPLGPTLSQLGLNVQEVVKKINDVTAQ-FKGMSVPVTIEIDSSTKKYDIKVGVPTTTSLLLKAINAQEPSGDPAHHcuLll^MAE T IEVLVAGGQADPGPPLGPELGPTPVDVQAVVQEINDQTEA-FDGTEVPVTIEYEDDGS-FSIEVGVPPTAALVKDEAGFDTGSGEPQEHmaL11^MAG-T^ IEVLVPGGEANPGPPLGPELGPTPVDVQAVVQEINDQTAA-FDGTEVPVTVKYDDDGS-FEIEVGVPPTAELIKDEAGFETGSGEPQESol(c)L11 
KA----KKVIGVIKLALEAGKATPAPPVGPALGSKGVNIMAFCKDYNARTAD-KPGEVIPVEITVEDDKS-FTFILKTPPASVLLLKASGAEKGSKDPQMEcoL11^MA----KKVQAYVKLQVAAGMANPSPPVGPALGQQGVNIMEECKAFNAKTDSIEKGLPIPVVITVYADRS-FTFVTKTPPAAVLLKKAAGIKSGSGKPNKSmaLll^MA----KKVQAYVKLQVAAGMANPSPPVGPALGQQGVNIMEECKAFNAKTDSIEKGLPIPVVITVYSDRS-FTFVTKTPPAAVLLKKAAGIKSGSGKPNKPvuLll^MA----KKVQAYIKLQVSAGMANPSPPVGPALGQQGVNIMEFCKAFNAKTESVEKGLPIPVVITVIADRS-FTFVTKTPPAAVLLKKAAGVKSGSGKPNKSviLll^MPPK-KKKVTGLIKLQIKAGAANPAPPVGPALGQHGVNIMEECKAYNAATES-QRGMVVPVEITVYDDRS-FTFITKTPPAARLILKNAGIEKGSGEPHKTmaL11^MA----KKVAAQIKLQLPAGKATPAPPVGPALGQHGVNIMEECKRFNAETAD-KAGMILPVVITVYEDKS-FTFIIKTPPASELLKKAAGIEKGSSEPKR110^120^130^140^150^160^170^180SceL11^NVKHSGNIQLDEIIEIARQMRDKSFGRTLASVTKEILGTAQSVGCRVDFKNPHDIIEGINAGEIEIPENPraL11^NIKHNGNITFDEIVNIARQMRHRSLARELSGTIKEILGTAQSVGCNVDGRHPHDIIDDINSGAVECPAS^SsoL11^---KIGNLDLEQIADIAIKKKPQLSAKTLTAAIKSLLGTARSIGITVEGKDPKDVIKEIDQGKYNDLLTNYEQKWNE-AEGHcuL11^---FVADLSIEQLKTIAEQKKPDLLAYDARNAAKEVAGTCASLGVTIEGEDARTFNERVDDGDYDDVLGD^ELAAAHmaL11^---FVADLSVDQVKQIAEQKHPDLLSYDLTNAAKEVVGTCTSLGVTIEGENPREFKERIDAGEYDDVFAA^E AQASol(c)L11 --EKVGKITIDQLRGIATEKLPDLNCTTIESAMRIIAGTAANMGIDID---PPILVKKKKEVIFEcoL11^--DKVGKISRAQLQEIAQTKAADMTGADIEAMTRSIEGTARSMGLVVED^SmaL11^--DKVGKVTRAQVREIAETKAADMTGSDVEAMTRSIEGTARSMGLVVEDPvuLll^--EKVGKITSAQVREIAETKAADLTGADVEAMMRSIAGTARSMGLVVEDSviLll^--TKVAKLTAAQVKEIAELKMPDLNANDIDAAVKIIAGTARSMGVTVEG^TmaLll^--KIVGKVTRKQIEEIAKTKMPDLNANSLEAAMKIIEGTAKSMGIEVVD112Figure 21^Alignment of the amino acid sequences of the ribosomal protein L1family, and the phylogenetic tree based on this alignmentA. The Ll proteins from five eubacteria, and six archaebacteria are aligned. Thenumbers indicate the common alignment positions. The species names and theirabbreviations are listed in Table 5 (pp. 41-42).B. A parsimony analysis of the L1 sequences was carried out using PAUP withheuristic and branch and bound tree search options, and one of the two shortesttrees found with 627 steps is depicted. 
The other tree differs only in the position ofS. solfataricus, which is indicated by a dash line; the branch length of thisalternative lineage is arbitrary. There are 176 informative sites; all of them areincluded in the parsimony analysis. The consistency index for the shortest tree is0.900. The numbers indicate percent confirmation of grouping of species to theright of the node by bootstrapping analysis with 2000 replications. The arrowindicates that possible root of the universal tree is located between the eubacteriaand archaebacteria.113A^10^20^30^40^50^60^70^80^90^100HcuLl MADNDIE-EAVAR-ALEDAPQR NFRETVDLAVNLRDLDLNDPSQRVDEGVVLPSGTGQETQIVVFADGETAV-RADDVADDVLDEHhaLl MADNDIE-EAVAR-ALEDAPQR NFRETVDLAVNLRDLDLNDPSQRVDEGVVLPSGTGQETQIVVFADGETAV-RADDVADDVLDEHmaLl MADQEIE-NAVSR-ALEDAPER^NFRETVDLAVNLRDLDLNDPSNRVDESVVLPAGTGQETTIVVFAEGETAL-RAEEVADDVLDEHvoLi MAD-TIV-DAVSR-ALDEAPGR NFRETVDLAVNLRDLDLNDPSKRVDESIVLPSGTGQDTQIVVFATGETP---AEDAADEVLGPMvaLl MDSAQIQ-KAVKE-ARTRKPR-NFTQSVDLIV^NFTQSVDLIVNLKELDLTRPENRLKEQIVLPSGKGKDTKIAVIAKGDLAA-QAAEMGLTVIRQSsoLl MKKVLAD-KESLIEALK----LALSTEYNV KR^NFTQSVEIILTFKGIDMKKGDLKLREIVPLPKQPSKAKRVLVVPSFEQLEYAKKASPNVVITRBstL1 MPKVDKKYLEALK-LVDRSKAYPIAQAIEIVKKTNVAKFDATVEVAFRL-GVDPKKACQQIRGAVVLPHGTGKVARVLVFAKGEKAK-EAEAAGADYVG-EcoLl MAKLTKRMRVI-REKVDATKQYDINEAIALLKELATAKFVESVDVAVNL-GIDARKSDQNVRGATVLPHGTGRSVRVAVETQGANAE-AAKAAGAELVG-SmaL1 MAKLTKRMRVI-RDKVDATKQYDITEAIALLKELATAKFVESVDVAVNL-GIDARKSDQNVRGATVLPHGTGRSVRVAVFTQGANAE-AAKAAGAELVG-PvuLl MAKLTKRMRNI-REKVEVTKQYEIAEAVALLKELATAKFVESVDVAVNL-GIDARKSDQNVRGATVLPHGTGRSVRVAVFAQGANAE-AAKEAGAELVG-TmaLl MPKHSKRYLEA-RKLVDRTKYYDLDEAIELVKKTATAKFDETIELHIQT-GIDYRKPEQHIRGTIVLPHGTGKEVKVLVFAKGEKAK-EALEAGADYVG-^110^120^130^140^150^160^170^180^190^200HcuLl DDLSDLADDTDAAKDLADETDFFVAE^APMMQDIVGALGQVLGPRGKMPTPLQPDD--DVVDTVNRMKNT-VQIRSRDRRTFHTRVGAEDMSAEDIHhaLl DDLSDLADDTDAAKDLADETDFFVAE^APMMQDIVGALGQVLGPRGKMPTPLQPDD--DVVDTVNRMKNT-VQIRSRDRRTFHTRVGAEDMSAEDIHmaLl 
DELEELGGDDDAAKDLADDTDFFIAE^KGLMQDIGRYLGTVLGPRGKMPEPLDPDD--DVVEVIERMKNT-VQLRSGERRTFHTRVGAEDMSAENIHvoLl DELEDFGDDTDAAKDLADETDFFVAE^AGLMQDIGRYLGTVLGPRGKMPTPLQPAD--DVVETVNRMYNT-VQLRTRDRRTFHTRVGEDDMTPDEIMvaLl EELEELGKNKKAAKRIANEHGFFIAQ^ADMMPLVGKSLGPVLGPRGKMPTPLPGNA--NLAPLVARFKKT-VAINTRDKSLFQVYIGTEAMSDEEISsoLl EELQKLQGQKRPVKKLAIQNEWFLIN^QESMALAGRILGPALGPRGKFPTPLPNTA--DISEYINRFKRS-VIVKTKDQPQVQVFIGTEDMKPEDLBstL1 D TEY^ INK --IQQGWFDFDVVVATPDMMGEVGK-LGRIIGPKGLMPNPKTGTVTFDVAKAVQEIKAGKVEYRVDKAGNIHVPIGKVSFDMEKLEcoLl --MEDL^ADQ - IKKGEMNFDVVIASPDAMRVVGQ-LGQVLGPRGLMPNPKVGTVTPNVAEAVKNAKAGQVRYRNDKNGIIHTTIGKVDFDADKLSmaLl --MEDL^AEQ --IKKGEMNFDVVIASPDAMRVVGQ-LGQISGPRGLMPNPKVGTVTPNVAEAVKNAKAGQVRYRNDKNGIIHTTIGKVDFDADKLPvuLl --MDDL^AAK- VKAGEMDFDVVIASPDAMRVVGQ-LGQILGPRGLMPNPKVGTVTPNVAEAVKNAKAGQVRYRNDKNGIIHTTIGKVVSTKHKLTmaL1 --AEDL^VEK-I-EKEGFLDFDVAIATPDMMRIIGR-LGKILGPRGLMPSPKSGTVTQEVAEAVKEFKKGRIEVRTDKTGNIHIPVGKRSFDNEKL^210^220^230^240HcuLl ASNIDVIMRRLHANLEKGP--LNVDSVYVKTTMGPAVEVA ^HhaLl ASNIDVIMRRLHANLEKGP--LNVDSVYVKTTMGPAVEVA ^HmaLl ADNIDVILRRLHADLEKGP--LNIDTVYVKTTMGPAMEVA ^HvoL1 ARTSNVIVRRLEATLEKGP--LNIDSVYVKTTMGPSVEVPA ^MvaLl AANAEAILNVVAKKYEKGL--YHVKSAFTKLTMGAAAPISK ^SsoLl AENAIAVLNAIENKA-KVE--TNLRNIYVKTTMGKAVKVKRABstL1 KENFAAVYEAIIKAKPAAAKGTYVKNVTITSTMGPGIKVDPTTV-AVAQEcoLl KENLEALLVALKKAKPTQAKGVYIKKVSISTTMGAGVAVDQAGLSASVNSmaLl KENLEALLVALKKAKPSQAKGMYIKKVSLSTTMGAGVAIDQSGLSAAANPvuLl KENLEALLVALKKAKPSAAKGVYIKKVSLSTTMGAGVAIDQASLSATV-TmaLl KENIIAAIKQIMQMKPAGVKGQFIKKVVLASTMGPGIKLNLQSL-LK-EB SsoL153100— — — — SsoL1MvaL165^HhaL11001 HcuL1^ HmaL1^HvoL1EcoL1SmaL1100— PvuLl^ BstL169^TuaLl 100Figure 2111423S rRNA. Any deficiency in the production of rRNA results in L1 proteinaccumulation; the excess protein binds to the structural mimic on the mRNA andprevents translation of the L11 and L1 cistrons (Dean and Nomura, 1980; Yates andNomura, 1981; Baughman and Nomura, 1981; Thomas and Nomura, 1987; Kearneyand Nomura, 1987). 
Similar mimics of the L1 rRNA binding site have been identified in the mRNAs of other eubacterial, as well as halophilic and methanogenic archaebacterial species (Sor and Nomura, 1987; Shimmin and Dennis, 1989; Baier et al., 1990; Liao and Dennis, 1992). Thus, both structural and regulatory features of the L1 family of proteins are conserved within eubacteria and at least some groups of archaebacteria. The eukaryotic homologue to protein L1 has not been identified.

The PAUP analysis of the L1 protein sequences produced two equally parsimonious trees that group eubacteria and archaebacteria in separate and well-resolved domains. The two trees differ only in their placement of S. solfataricus; in the first case it branches separately and somewhat closer to eubacteria (solid branch position in Figure 21B; 53% bootstrap confirmation), and in the second case it branches with M. vannielii (stippled branch) and separately from the halophilic L1 sequences. Distance and DNA parsimony methods position S. solfataricus and M. vannielii together although the grouping is tenuous.

5.2.3 The sequence alignments and phylogeny of L10 proteins

Between eubacteria and archaebacteria, the L10 proteins are in general less conserved than are the L11 and L1 proteins. However, because of domain conservation within L10 proteins, a reasonable alignment can be achieved with little difficulty. By using L10 sequences from the archaebacterial species H. cutirubrum and S. solfataricus as "bridges," Shimmin et al. (1989) demonstrated that the eukaryotic "P0" proteins are actually homologues of the bacterial L10 proteins.

The sequence alignment of the L10 protein family from four eubacteria, five archaebacteria, and six eukaryotes is illustrated in Figure 22A. Amino acid identity among all the L10 proteins is highest within the amino terminal 121 residues.
The most conspicuous feature is the presence of several highly conserved basic residues at alignment positions 17 (lys), 51 (arg), 68 (lys or arg), 74 (lys or arg), and 121 (lys). There are also many positions in this region which have a high incidence of hydrophobic residues. These features suggest that secondary structures in this domain may be highly similar if not identical and that this domain may be involved in rRNA binding (Gudkov et al., 1980; Pettersson, 1979; Mitsui et al., 1989). It is difficult to align with certainty the carboxyl domain of the eubacterial L10 sequences beyond position 121 with the eukaryotic and archaebacterial sequences. Nonetheless, the sequence RNLVYVLNAI of T. maritima L10 near the carboxyl end is highly similar to the archaebacterial sequences around position 240 (e.g. RNL-SV-NAA in H. cutirubrum; Figure 22C). This sequence was used as a starting point to achieve the depicted alignment between positions 173 and 248.

The archaebacterial and eukaryotic proteins exhibit a carboxyl terminal extension of approximately 80-100 residues that is clearly not present in the eubacterial protein. This extension is characterized in part by a cluster of charged amino acids (approximately positions 320-359). In the eukaryotic proteins, this charged region is preceded by an alanine-proline rich region that is either shortened in, or absent from, the archaebacterial proteins. It has been suggested that these features are a result of a partial duplication of the L12 gene that has been fused to the end of the L10 gene (Shimmin et al., 1989). Within any species of archaebacteria or eukaryote, substantial sequence identity is always apparent between the carboxyl termini of the respective L10 and L12 proteins.
For example, the identical sequences at the carboxyl terminus of the L10 and L12 proteins from S. solfataricus are ••QAAEKKEEKKEEEKK ,LSSLFG and from human are ••KEESEESD(D/E)DMGFGLFD.

The carboxyl terminal four to six amino acid residues of the four eubacterial L10 proteins contain a high proportion of charged acidic or basic residues. This region is possibly the functional analog to the region of high charge density within archaebacterial and eukaryotic L10 proteins. In the depicted alignment these residues are somewhat arbitrarily placed at positions 343-348.

The analysis of the L10 protein sequences by PAUP yields six equally parsimonious tree configurations. These six trees divide into the two types designated tree 1 and tree 2 in Figure 22D. The L10 proteins from human, rat and mouse are identical except for a few conservative amino acid replacements and a single deletion in the rat protein at position 324. The three subtypes within the type 1 and type 2 trees result from the rearrangement of these closely related mammalian L10 sequences.

The type 1 and type 2 trees differ from each other in two respects: the first is the branching order within the eubacterial domain and the second is the positioning of S. solfataricus. In the type 1 tree, Synechocystis is the deepest branch within the eubacteria, and the eukaryotes branch from the S. solfataricus lineage within the archaebacterial group. In the type 2 tree, Synechocystis and T. maritima group together within the eubacteria and the eukaryotes branch from the methanogen-halophile lineage within the archaebacterial group.
Neither of these two positions for the origin of the eukaryotic domain is supported by bootstrapping. And again, if the root of the tree is within the eubacterial domain (below the position of the arrow in Figure 22D), the archaebacteria appear monophyletic but not holophyletic.

Some regions of the L10 protein alignment are less certain than others. When positions 249-369, representing the region of uncertain alignment, were excluded from parsimony analysis, the shortest trees found exhibited a topology identical to the two types of tree illustrated in Figure 22D. When only alignment positions 1 to 121 were used for parsimony analysis, the branch pattern within the

Figure 22. Alignment of amino acid sequences of the ribosomal protein L10 family, and the phylogenetic tree based on the L10 alignment

A. The L10 ribosomal proteins from four eubacteria, five archaebacteria and six eukaryotes are aligned. In the eukaryotes, these proteins were previously designated "P0." The species names and their abbreviations are as in Table 5 (pp. 41-42).

B. The consensus of the alignment was generated manually by majority rule. When a majority is not evident at an alignment position, chemically similar amino acid residues were considered to determine the consensus. A question mark (?) indicates that there is no simple consensus at such positions.

C. Alignment of the L10 sequences of T. maritima and H. cutirubrum at positions 239-248.

D. A parsimony analysis of the aligned sequences was carried out using PAUP with the branch and bound tree search option, and two of the six equally short trees are illustrated (Tree 1 and Tree 2). Four other alternative trees arise because of regrouping among the three mammalian species (Hsa, Rno and Mmu). There are 289 informative sites; all were included in the parsimony analysis. The consistency indices for the two trees are 0.849. The arrow indicates that a possible root of the universal tree is located between the lineages of eubacteria and archaebacteria-eukaryotes.
The numbers indicate percent confirmation of grouping of species tothe right of the node by bootstrapping analysis with 1000 replications.118A 10^20^30^40^50^60^70^80^90^100DdiL10 MSGAGS-K RKKLFIEKATKLETTYDKMIVAEADFVGSSQLQKIRKSIR-- --GIGAVLMGKKTMIRKVIRDLAD--SKPELDALNTYLKQNTCDmeL10 M--VRENKAA^WKAQYFIKVVELFDEFPKCFIVGADNVGSKQMQNIRTSLR-- --GLAVVLMGKNTMMRKAIRGHLE--NNPQLEKLLPHIKGNVGHsaL10 M--PREDRAT^WKSNYFLKIIQLLDDYPKGFIVGADNVGSKQMQQIRMSLR-- --GKAVVLMGKNTMMRKAIRGHLE--NNPALEKLLPHIRGNVGMmuL10 M--PREDRAT^WKSNYFLKIIQLLDDYPKCFIVGADNVGSKQMQQIRMSLR-- --GKAVVLMGKNTMMRKAIRGHLE--NNPALEKLLPHIRGNVGRnoL10 M--PREDRAT^WKSNYFLKIIQLLDDYPKGFIVGADNVGSKQMQQIRMSLR-- --GKAVVLMGKNTMNRKAIRGHLE--NNPALEKLLPHIRGNVGSceL10 MGGIRE-K KAEYFAKLREYLEEYKSLFVVGVDNVSSQQMHEVRKELR-- --GRAVVLMGKNTMVRRAIRGELS--DLPDFEKLLPFVKGNVGHcuL10 M-SAEEQRTTEEVPEWKRQEVAELVDLLETYDSVGVVNVTGIPSKQLQDMRRGLH----GQAAVRMSRNTLLVRALEEAGD^GLDTLTEYVEGEVGHhaL10 M-SAEEQRTTEEVPEWKRQEVAELVDLLETYDSVGVVNVTUIPSKQLQDMRRGLH----GQAALRMSRNTLLVRALEEAGD^GLDTLTEYVEGEVGHmaL10 M-SAESERKTETIPEWKQEEVDAIVEMIESYESVGVVNIAGIPSRQLQDMRRDLH----GTAELRVSANTLLERALDDVDD^GLEDLNGYITGQVGMvaL10 MIDAKSEHK---IAPWKIEEVNALKELLKSANVIALIDMMEVPAVQLQEIRDKIR----DQMTLKMSRNTLIKRAVEEVAEETGNPEFAKLVDYLDKGAASsoL10 M-IGLAVTTTKKIAKWKVDEVAELTEKLKTHKTIIIANIEGFPADKLHEIRKKLR----GKADIKVTKNNLENIALKNAGY ^DTKLFESYLTGPNAEcoL10 MALNLQDSecL10 MGRTRENstyLio MALNLQDTmaL10 M-LTRQQKQAIVAEVSEVAKGALSAVVADSRGVTVDKMTELRKAGRE -AGVYMRVVRNTLLRRAVEGT^PFECLKDAFVGPTLKATVISDVQELFQDAQMTVIIDYQGLWAEITDLRNRLRP- LGGTCKIAKNTLVRRALAGQ-E^AWSPMEEFLTGTTAKQAIVAEVSEVAKGALSAVVADSRGVTVDKMTELRKAGRE- AGVYMRVVRNTLLRRVVEGT QFECLKDTFVGPTLKELIVKEMSEIFKKTSLILFADFLGETVADLTELRSRLREKYGDGARFRVVRNTLLRRAVENA^EYEGYEEFLKGPTA110^120^130^140^150^160^170^180^190^200DdiL10 IIECKDNIAEVXRVINTQ--RVGAPAKAGVFAPNDVIIPAGPTGMEPTQ-TSFLQDLKIATKINRGQIDIVNEVEIIKTGQKVGASEATLLQKLNIKPFTDmeL10 FVFTKGDLAEVRDKLLES--KVRAPARPGAIAPLHVIIPAQNTGLGPEK-TSFFQALSIPTKISKGTIEIINDVPILKPGDKVGASEATILNMLNISPFSHsaL10 
FVFTKEDLTEIRDMLLAN--KVPAAARAGAIAPCEVTVPAQNTGLGPEK-TSFFQALGITTKISRGTIEILSDVQLIKTODKVGASEATLINMLNISPFSMmuL10 FVFTKEDLTEIRDMLLAN--KVPAAARAGAIAPCEVTVPAQNTGLGPEK-TSFFQALGITTKISRGTIEILSDVQLIKTGDKVGASEATLLNMLNISPFSRnoL10 FVFTKEDLTEIRDMLLAN--KVPAAARAGAIAPCEVTVPAQNTGLGPEK-TSFFQALGITTKISRGTIEILSDVQLIKTGDKVGASEATLLNMLNISPFSSceL10 FVFTNEPLTEIKNVIVSN--RVAAPARAGAVAPEDIWVRAVNTGMEPGK-TSFFQALGVPTKIARGTIEIVSDVKVVDAGNKVGQSEASLLNLLNISPFTHcuL10 LVATNDNPFGLYQQLENS--KTPAPINAGEVAPNDIVVPEGDTGIDPGPFVGELQTIGANARIQEGSIQVLDDSVVTEEGETVSDDVSNVLSELGIEPKEHhaL10 LVATNDNPFGLYQQLENS--KTPAPINAGEVAPNDIVVPEGDTGIDPGPFVGELQTIGANARIQEGSIQVLDDSVVTEEGETVSDDVSNVLSELGIEPKEHmaL10 LIGTDDNPFSLFQELEAS--KTPAPIGAGEVAPNDIVIPEGDTGVDPGPFVGELQSVGADARIQEGSIQVLSDSTVLDTGEEVSQELSNVLNELGIEPKEMvaL10 IVVTEMNPFKLEKTLEES--KSPAPIKGGAIAPCDIEVKSGSTGMPPGPFLSELKAVGIPAAIDKGKIGIKEDKVVAKEGDVISPKLAVVLSALGIKPVTSsoL10 FIFTDTNPFELQLFLSKF--KLKRYALPGDKADEEVVVPAGDTGIAAGPMLSVEGKLKIKTKVQDGKIHILQDTTVAKPGDEIPADIVPILQKLGIMPVY^EcoLlO IAYSMEHP-GAAARLFKEFAK^ ANAKFEVKAAAFEGELIPASQIDRL-^SecL10 ILVLKEDL-GGAIKAYKKFQK DTK- KTELRGGVLEGKSLTQADVEAI-^StyL10 IAYSMEHP-GAAARLFKEFAK ANAKFEVKAAAFEGELIPASQIDRL-^TmaL10 VLYVTEGDPVEAVKIIYNFYK^ DKKADLSRLKGGFLEGKKFTAEEVENI-DdiL10DmeLiOHsaL10MmuL10RnoL10SceL10210^220^230^240^250^260^270^280YGLEPKIIYDAGACYSPS---ISEEDLINKFKQGIFNIAAI-SL-EIGYPTVASIPHSVMNAFKNLLAISFETSYTFD ^YCLIVNQVYDSGSIFSPEILDIKPEDLRAKFQWVANLAAV-CL-SVGYPTIASAPHSIANGEKNLLAIAATTEVEFK ^FGLVIQQVFDNGSIYNPEVLDITEETLHSRFLEGVRNVASV-CL-QIGYPTVASVPHSIINGYKRVLALSVETDYTFP ^FGLIIQQVFDNGSIYNPEVLDITEQALHSRFLEGVRNVASV-CL-QIGYPTVASVPHSIINGYKRVLALSVETEYTFP ^FGLIIQQVFDNGSIYSPEVLDITEQALHTRFLEGVANVASV-CL-QIGYPTVASVPHSIINGYKRVLALSVETDYTFP ^FGLTVVQVYDNGQVFPSSILDITDEELVSHFVSAVSTIASI-SL-AIGYPTLPSVGHTLINNYKDLLAVAIAASYHYP ^ 290^300AAEKFKSAAA-AAEATTIKEY---IKLAEKVKAF---LALTEKVKAF---LALAEKVKAF---LAEIEDLVDR---IEHcuL10 VGLDLRGVESEGVLFTPEELEIDVDEYRADIQSAAASARNL-SV-NAAYPTERTAPDLIAKGRGEAKSLGLQASVESPDLADDLVSKADAQVRALAAQIDHhaL10 
VGLDLRGVESEGVLFTPEELEIDVDEYRADIQSAAASARNL-SV-NAAYPTERTAPDLIAKGRGEAKSLGLQASVESPDLADDLVSKADAQVRALAAQIDHmaL10 VGLDLRAVFADGVLFEPEELELDIDEYRSDIQAAAGRAFNL-SV-NADYPTATTAPTMLQSDRGNAKSLALQAAIEDPEVVPDLVSKADAQVRALASQIDMvaL10 VGLNVLGVYEEGVIYTSDVLRIDEEEFLGKLQKAYTNAFNL-SV-NAVIPTSATIETIVQKAFNDAKAVSVESAFITEKTADAILGKAHAQMIAVA-KLASsoL10 VKLNIKIAYDNGVIIPGDKLSINLDDYTNEIRKAHINAFAV-AT-EIAYPEPKVLE--FTATKAMRNALALASEIGYIWETAQAVETKAVMKAYAAVASEcoL10 ATL---PTYEE-AIAR-LMATMKEASAGKLVRTLA-AVRD A^SecL10 GDL---PSKEQ-LMGQ-IAGGIN-ALATKIALGIKEVPASVARGLQHVStyL10 ATL---PTYEE-AIAR-LMATMKEASAGKLVRTLA-AVRD A TmaL10 AKL---PSKEE-LYAM-LVGRVK-APITGLVFALSGILRNLVYVLNAI310^320^330^340^350^360DdiL10 PVR AAP^SAAAPRAAA -KKVVVE- EKK^ EESD DDMGMG-LFD-DmeL10 DPSKFAAAA----SASAAPAAGGATEKKEEA^KKPESESEEED^ DDMGFG-LFD-HsaL10 DPSAFVAAAPVAAATTAAPAAAAAPA-KVEA^K- -EESEESD EDMGFG-LFD-MmuL10 DPSAFAAAAPAAAATTAAPAAAAAPA-KAEA^ K -EESEESD^EDMGFG-LFD-RnoL10 DPSAFAAAAPLAAATTLAPAAAA PA KVEA^K -EESEESD EDMGFG-LFD-SceL10 NPEKYAAA^APAATSAASGDAAPA- -EEAAAEEEEESD^ DDMGFG-LFD-HcuL10 DEDALPEELQDVDAPAAPAGGEADTTADEQS-DETQASE-ADDADDSADDDDDDDGNAGAEGLGEMFG-HhaL10 DEDALPEELQDVDAPAAPAGGEADTTADEQS-DETQASE-ADDADDSDDDDDDDDGNAGAEGLGEMFG-HmaL10 DEEALPEELQGVEADVATEEPTDDQDDDTASEDDADADDAAEEADDDDDDDED AGDALGAMF--MvaL10 GDEALDDDLKEQISSSAVVATEEAP-KAETKKE- -EKK^EEAA^ PAAGLGLLF--SsoL10 ISGKV--DLGVQIQAQPQVSEQAA-EKKEEKKEE--EKKGP--SEEEI GGGLSSLFGGEcoL10SecL10StyLlOaL10KEAADDKE^ KEAAKEKKSEFigure 2275^ DmeL10HsaL10100 MmuLlORnoL10^ SceL10DdiL10rcuL10100 HhaL10^ HmaL10MvaL109010059^ SsoL10lEcoL10L StyL1099TmaL1010067SecL10DdiL10DmeLlO75[HsaL10100 MmuL10RnoL10^ SceL10^ SsoL10rcuL10100 HhaL10^ HmaL10995910067MvaL10tEcoL10100 StyL10^ TmaL10SecL10r9957119B CONSENSUS^10^20^30^40^50^60^70^80^90^100Eukary M?GPREDRAT WKSNYFLKIIQLLDDYPKCFIVGADNVGSKQMQQIRMSLR----GKAVVLMGKNTMMRKAIRGHLE--NNPALEKLLPHIRGNVGArchae MISAE?ERTTEEIPEWK?EEVAELVELLETYDSVGVVNI?GIPSKQLQDMRR?LH----GQA?LRMSRNTLL?RALEEAGDETGNPGLD?L?EY1EGEVGEubact MALTRQD 
KQAIvAEvsEvfKGALsAvvADsRGvTvAKmTELRI(RLREKYGAGVYmRvvRNTLLRRAvEGT -E ?FECLEEFLVGPTA^110^120^130^140^150^160^170^180^190^200Eukary FVFTKEDLTEIRDMLLAN --KVPAAARAGAIAPCEVTVPAQNTGLGPEK-TSFFQALGITTKISRGTIEILSDVQLIKTGDKVGASEATLLNMLNISPFSArchae LV?TDDNPF?LFQQLENSKLKTPA?INAGEVAPNDIVVPEGDTGIDPGPFVGELQTVGANARIQEGSIQVLDDSVV?EEGE?VSDDLSNVLSELGIEFTEEubact ILYSMEHPPGAAAKLFKEFAK DKANAKFELKGGALEGKLITASQVERI-^210^220^230^240^250^260^270^280^290^300Eukary FGLIIQQVFDNGSIYSPEVLDITEEDLHSRFLEGVANVASV-CL-QIGYPTVASVPHSIINGYKRVLALSVETEYTFP LAEKVKAFAA-LAArchae VGLDLRGVF?EGVLFTPEELEIDVDEYR?DIQ?AA??AFNL-SV-NAAYPT?RTAPTLI?K7RGEAKSL?LQA7IESPDLADDLVSKADAQVRALAAQIDEubact ATLPS---YEE-LIAR-LMGTMKEASATKLVRTLA?AVRDLAYVLNAI 310^320^330^340^350^360Eukary DPSAFAAAAPLAAATTAAPAAAAAPAKKVEA? EKK??EESEESD EDMGFG-LFD-Archae DEEALPEELQDVDA??A????EAD???DEQSKDETQA?E?ADDADD?DDDDDDDDGNAGAEGLG?MFGGEubact  KEKEAE C240HcuL10 RNL-SV-NAATmaL10 RNLVYVLNAIDTree 1^ Tree 2Figure 22 (continued)120eukaryotic lineage was not well defined and branching within the archaebacterialgroup was reorganized: halophiles were closer to eukaryotes, M. vannielli wascloser to eubacteria, and S. solfataricus was between the two (data not shown).5.2.4 The sequence alignments and phylogeny of L12 proteinsIn spite of the major structural discontinuity that occurs between eubacterialL12 sequences and archaebacterial-eukaryotic L12 sequences, biochemical andgenetic evidence strongly suggests that all L12 proteins are homologous. First, theorganization of the genes encoding ribosomal proteins L11, L1, L10 and L12 ismaintained in organisms as divergent as eubacteria and archaebacteria; the L12gene is always located at the end of the L11, L1, L10, L12 tetragenic cluster. Second,ribosomes from all organisms contain multiple copies of the L12 protein. As agroup, these L12 proteins are very acidic, alanine and proline rich, and similar insize, ranging between about 110-120 amino acids in length. 
Four copies of the L12 protein along with a single copy of L10 form a distinct stalk on the large ribosome subunit that functions in factor-dependent GTP hydrolysis and mediates structural rearrangements of the ribosome during the protein synthesis cycle. Third, the pentameric complex L10-(L12)4 from E. coli recognizes a similar site on the archaebacterial ribosome (Stöffler-Meilicke and Stöffler, 1991). Furthermore, E. coli L12 can form an active hybrid with yeast core ribosomes from which the acidic proteins have been removed (Sanchez-Madrid et al., 1981).

In eukaryotic organisms, there are two distinct L12 proteins that have been described. These have been designated as type I and type II (or "P2" and "P1," respectively; Amons et al., 1979 and 1982; Rich and Steitz, 1987; Shimmin et al., 1989; Newton et al., 1990). In the yeast lineage that includes S. cerevisiae and S. pombe, each of the two genes has been reduplicated to give types IA, IB, IIA and IIB (Newton et al., 1990; Beltrame and Bianchi, 1990).

The alignment of twelve eubacterial and one chloroplast, seven archaebacterial, and nine type I and ten type II eukaryotic proteins of the L12 family is illustrated in Figure 23A. All but one of the eukaryotic type II proteins contain a conserved tryptophan at position 88; this aligns to a conserved arginine in the type I, the archaebacterial and the eubacterial L12 proteins. It is interesting and perhaps significant that the extension at the amino terminus of type II proteins shows some sequence similarity to the amino terminus of the eubacterial L12 proteins (alignment positions 1-18). Another salient feature of all L12 proteins, especially the archaebacterial and eukaryotic proteins, is the highly charged carboxyl terminus. The alignment reflects this feature.
The two large alignment gaps near the C-terminus within the eubacterial L12 sequences are located within the loops connecting β sheet [B] and α helix [C], and α helix [C] and β sheet [C], respectively (according to the crystal structure of the C terminal domain of E. coli L12 protein; Leijonmarck et al., 1980). Consequently, deletions (or insertions) in these regions could be accommodated without dramatically altering the overall protein structure. In eukaryotic and archaebacterial species, the L12 carboxyl terminal sequences are preceded by an alanine-proline rich region and exhibit substantial similarity to the carboxyl terminus of protein L10 (see above). Eubacterial L12 proteins have a similar alanine-proline rich region, but it is located more proximally to the amino-terminus in the protein at positions 39-60. In all the proteins, these alanine-proline regions are believed to be highly flexible and to serve as "hinges" between two distinct domains (Leijonmarck et al., 1980 and 1987; Shimmin et al., 1989). The relocation of this hinge to a more amino terminal position in eubacterial L12 proteins cannot be easily explained. Recent biochemical studies on the S. solfataricus L12 protein have concluded that the amino and carboxyl terminal domains of the protein are functionally equivalent to the corresponding amino and [...] supports a colinear alignment. To simplify visualization and comparison, a

Figure 23. Alignment of the amino acid sequences of the ribosomal protein L12 family

A. The L12 equivalent protein sequences from thirteen eubacteria, seven archaebacteria, and nineteen eukaryotes were aligned. The eukaryotic proteins divide into two types designated as type I and type II. The species names and their abbreviations are as in Table 5 (pp. 41-42).

B. The consensus of the alignment was generated manually by majority rule. When a majority was not evident at an alignment position, chemically similar amino acid residues were considered to determine the consensus. A question mark (?)
indicates that there was no simple consensus at such positions.

[Figure 23. Alignment of the L12 protein sequences. (A) Individual eukaryotic type II, eukaryotic type I, archaebacterial and eubacterial sequences. (B) Consensus sequences of the four groups (EukaryII, EukaryI, Archae, Eubact).]

The consensus of the eukaryotic type I and II, the archaebacterial and the eubacterial L12 proteins are aligned in Figure 23B.

It should be stressed here that in any alignment (and in particular this L12 alignment) the assumption of common ancestry of each amino acid at a given alignment position is less than certain. That is, alignments simply reflect a guess, hopefully a best guess, of common ancestry at every position.

The phylogenetic relationships among the L12 family of protein sequences were determined using parsimony (Figure 24) and distance matrix methods (not shown). Because of the uncertainty in generating a reliable alignment between eubacterial and archaebacterial-eukaryotic L12 sequences, we first determined the phylogenies of eubacteria, archaebacteria and eukaryotes separately, and then for comparison we determined the "universal" phylogeny. In general, the branch patterns within the eukaryotic, archaebacterial and eubacterial groups were essentially identical in the "universal" tree and the three individual trees. The universal parsimony tree (shown in Figure 24) and a Fitch-Margoliash distance tree (not shown) both indicated that the eubacterial sequences form a single coherent group that is confirmed by bootstrap analysis. However, the branching order within this group is not substantiated by bootstrap analysis.

The archaebacterial L12 sequences also appear to form a coherent group that is both mono- and holophyletic.
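The idea behind a bootstrap confirmation value for a grouping can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PAUP parsimony procedure used for the trees in this chapter: alignment columns are resampled with replacement, and a grouping is counted as recovered in a replicate when every within-group p-distance is smaller than every distance from a group member to a non-member. That recovery criterion, the function names and the toy sequences are all hypothetical stand-ins chosen to keep the example short.

```python
import random


def resample_columns(alignment, rng):
    """Build one bootstrap pseudo-replicate by drawing columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    columns = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][c] for c in columns) for n in names}


def p_distance(a, b):
    """Fraction of differing positions, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def group_recovered(alignment, group):
    """Stand-in criterion (an assumption, not parsimony): the group counts as
    recovered when all within-group distances are smaller than all distances
    between a group member and a non-member."""
    members = [n for n in alignment if n in group]
    others = [n for n in alignment if n not in group]
    if len(members) < 2 or not others:
        raise ValueError("need at least two group members and one outsider")
    within = [p_distance(alignment[a], alignment[b])
              for i, a in enumerate(members) for b in members[i + 1:]]
    between = [p_distance(alignment[a], alignment[b])
               for a in members for b in others]
    return max(within) < min(between)


def bootstrap_support(alignment, group, replicates=100, seed=0):
    """Percentage of pseudo-replicates in which the group is recovered."""
    rng = random.Random(seed)
    hits = sum(group_recovered(resample_columns(alignment, rng), group)
               for _ in range(replicates))
    return 100.0 * hits / replicates


# Artificial toy alignment (not real L12 data): two sequences per group.
toy = {
    "g1": "AAAAAAAA",
    "g2": "AAAAAAAT",
    "h1": "CCCCCCCC",
    "h2": "CCCCCCCG",
}
support = bootstrap_support(toy, {"g1", "g2"}, replicates=100, seed=1)
```

In a real analysis each pseudo-replicate would be passed through the same tree search as the original alignment, and the support for a node would be the percentage of replicate trees that contain it.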
By bootstrap resampling, the confirmation of this grouping was 57% for the protein alignment and 58% for the corresponding nucleic acid alignment analyzed by PAUP (data not shown). In contrast, the eukaryotic L12 sequences clearly resolve into two groups corresponding to the type I and type II proteins. This distinct division implies that the duplication of the L12 gene occurred very early in the eukaryotic lineage.

Figure 24. Phylogenetic tree inferred from the aligned L12 amino acid sequences. The phylogenetic tree was constructed by using PAUP with the heuristic tree search option. Illustrated is the majority rule consensus of the 14 equally shortest trees. There are 147 informative sites in the alignment; all of them were used for parsimony analysis. The consistency index is 0.598. When the first 18 alignment positions and the flexible hinge regions (positions 43 to 74 for eubacteria and 119 to 146 for archaebacteria and eukaryotes) were excluded from analysis, 20 shortest trees were found; the majority rule consensus of these trees has essentially the same topology as the tree shown here. The numbers refer to the percent confirmation of grouping of the species to the right of the node by bootstrap analysis with 100 replications. Only values greater than 50% are labeled. The abbreviations of the species names are as in Table 5 (pp. 41-42). The numbers indicate percent confirmation of grouping of species to the right of the node by bootstrapping analysis with 1000 replications. The arrow indicates that a possible root of the universal tree is located between the lineages of eubacteria and archaebacteria-eukaryotes.

5.2.5 Phylogenetic considerations

The alignment and phylogenetic analysis presented above using L11, L1, L10 and L12 protein sequences generally support the concept that organisms divide into three distinct and well defined groups: eubacteria, archaebacteria and eukaryotes. The ribosomal protein sequences from member species within a group are in most cases more similar to each other based on amino acid identity than to the sequences from species outside the group. Furthermore, numerous deletions, insertions or structural rearrangements in these ribosomal protein sequences confirm this three part delineation and demarcation.

If the root in these ribosomal protein based trees is near or within the eubacterial domain, then it is clear that the archaebacteria appear monophyletic, originating from a common ancestor that is distinct from eubacteria. The origin of the eukaryotes is more problematic. They appear to originate as a distinct branch either outside of the archaebacterial group as suggested by the L12 protein phylogeny or alternatively from within the archaebacterial group as suggested by the L11 and L10 protein phylogenies.

Although ribosomal proteins at first glance might be considered good candidates for phylogenetic analysis, in reality there are some inherent problems. First, they are relatively small proteins and second, divergence and structural rearrangements often make alignments difficult and ambiguous.
Because of these limitations, the origin of the eukaryotic lineage either from within or outside of the archaebacterial group cannot be statistically substantiated.

Phylogenetic analyses of rRNA sequences and of translational elongation factor Tu and G sequences suggest that the hyperthermophilic eubacterium T. maritima is a representative of deep branching lineages within the eubacterial group (Achenbach-Richter et al., 1987; Bachleitner et al., 1989; Tiboni et al., 1991). Representatives of deep branching lineages within the archaebacteria are also hyperthermophilic. This has led to the suggestion that the ancestor of eubacteria and archaebacteria (i.e., the common ancestor represented as the root of the universal tree) was hyperthermophilic (Achenbach-Richter et al., 1987; Pace, 1991; Burggraf et al., 1992; Stetter, 1993). This would place the position of the root either deep within the eubacterial or archaebacterial groups or somewhere between the two groups. This situation may be clarified by previous analyses of translational elongation factors and subunits of ATPase which have placed the root somewhere between eubacteria and archaebacteria (Iwabe et al., 1989; Gogarten et al., 1989).

In contrast to the rRNA and the elongation factor Tu and G based phylogenetic analyses (Achenbach-Richter et al., 1987; Bachleitner et al., 1989; Tiboni et al., 1991), our analysis using L11, L1, L10 and L12 ribosomal protein sequences is less definitive in the placement of T. maritima within the eubacteria. The resolution of our trees is limited by the relatively small size of these proteins and in some cases by the limited number of sequences available for analysis. The tree for the L12 protein, containing thirteen eubacterial sequences, is virtually devoid of resolution that is confirmable by bootstrap analysis. In the L11 tree, the mesophile S. virginiae appears to branch more deeply than T. maritima.
These observations seem to suggest that different molecules, although they are all components of the protein synthesis apparatus, can diverge to some extent independently and give rise to incongruent phylogenies. The "true" organismal phylogeny will hopefully become apparent from a consensus of molecular phylogenies.

Lake et al. have suggested that the eukaryotic lineage arose as a branch from the sulfur-metabolizing thermophilic lineage (i.e., the "eocytes" or Crenarchaeota) within the archaebacterial group (Lake, 1988; Rivera and Lake, 1992). Other analyses indicate that the eukaryotic lineage originated outside of the archaebacterial kingdoms (Pace et al., 1986; Woese et al., 1990). Our data neither confirm nor refute either of these two positionings. However, our analysis clearly highlights the major discontinuity that separates archaebacterial and eukaryotic sequences. The sequence (amino acid identity) and structure (deletions, insertions and rearrangements) of ribosomal proteins from organisms within a group (i.e., eubacteria, archaebacteria or eukaryotes) are clearly more similar to each other than to the sequence and structure of the proteins from organisms outside the group.

5.3 SUMMARY

Available sequences that correspond to the E. coli ribosomal proteins L11, L1, L10 and L12 from eubacteria, archaebacteria and eukaryotes have been aligned. The alignments were analyzed qualitatively for shared structural features and for conservation of deletions or insertions. The alignments were further subjected to quantitative phylogenetic analysis. Eubacteria and eukaryotes each form well-defined, coherent and non-overlapping groups, and the holophyly of these two groups is supported by bootstrap resampling. Archaebacteria also form a coherent phylogenetic group by themselves, but the relationships between the major groups of archaebacteria (extreme halophiles, methanogens, and sulfur-metabolizing thermophiles) and outgroups (eubacteria and eukaryotes) cannot be established in this study.
In particular, the positioning of S. solfataricus (a sulfur-metabolizing hyperthermophile) is conflicting in various trees, and remains unresolved in any tree based on these ribosomal protein sequences. T. maritima does not appear as the deepest branch in any of the presented trees, which may indicate that the evolutionary rate of the ribosomal protein genes is different from that of the other genes, such as rRNA genes. However, the phylogenetic placement of T. maritima in these trees is less definitive, probably due to the relatively short sequences of these proteins.

The degree of diversity of the four proteins between the three groups is not uniform. L11 is the most conserved protein of these four ribosomal proteins; thus, the alignment of the L11 family is less ambiguous. For the L12 proteins and the L10 proteins, the archaebacterial and eukaryotic proteins are more similar to each other, whereas the eubacterial proteins are very different. In eukaryotes there are paralogous genes that encode type I and type II L12 proteins; for some features, the type I protein is more similar to the archaebacterial L12 than is the type II protein. The eukaryotic L1 equivalent protein has yet to be identified. These data indicate that the evolutionary divergence of even closely associated components of the translation apparatus can be remarkably dissimilar, especially when compared between the eubacteria, archaebacteria and eukaryotes.

VI. Conclusion and prospects

Eubacteria represent a large collection of diverse microbial species that utilize a wide variety of metabolic strategies to exploit different ecological habitats. Most of our understanding of the molecular biology and biochemistry of this group of organisms comes from the study of E. coli, a mesophilic and facultatively anaerobic heterotroph.
Our knowledge of eubacteria will be greatly enhanced by characterization of other species that are more representative of the range of eubacterial biochemistry.

Hyperthermophilic eubacteria are examples of the diversity of this microbial world. These organisms can grow at unusually high temperatures and the importance of their biochemical characterization has been widely appreciated by the scientific community and the biotechnology industry (Pace, 1991; Kristjansson and Stetter, 1992). Because the deep branches in a universal phylogenetic tree are exclusively occupied by the hyperthermophiles, it is plausible that the last common cellular ancestor of all life was a hyperthermophile (Pace, 1991; Stetter, 1993). Thus, detailed study of these organisms could potentially reveal ancestral characteristics that may have already been lost in other species, as well as the biochemical mechanisms for high temperature growth. Furthermore, the thermostable enzymes serve as better biocatalysts in industry, and have opened new possibilities in biotechnology, such as the well-known PCR technique. Proteins from thermophilic sources will, therefore, continue to be exploited in the future for technological purposes.

The results presented in this thesis provide a perspective on the biochemistry and molecular biology of the hyperthermophilic eubacterium T. maritima. The genomic organization of the secE, nusG, ribosomal protein genes L11, L1, L10 and L12, and the β subunit gene of the RNA polymerase in T. maritima is the same as that in E. coli and other eubacteria. Thus, it is reasonable to assume that this arrangement already existed in the ancestor of eubacteria. However, the expression patterns of these genes in T. maritima are very different from those in E. coli. The E. coli rif region is divided into three transcriptional units, whereas most of the genes in the T. maritima fragment are cotranscribed. The control mechanisms for gene expression are also different.
The regulation of L10 and L12 expression in E. coli is through a complicated mechanism at the translational level, while the expression of these two genes in T. maritima appears to be modulated through a mechanism of transcriptional attenuation. On the other hand, some common strategies for control of gene expression are probably used. For example, both the E. coli rif operon and the cloned T. maritima fragment contain the L1 autogenous regulatory sites. While the primary structures of the ribosomal proteins L11, L1, L10, and L12 are largely preserved, the sequences of the T. maritima SecE and NusG are very dissimilar to their respective counterparts in E. coli. Furthermore, the transcription factor NusG in T. maritima functions as a DNA-binding protein, which may be very important for regulation of gene expression in a hyperthermophilic environment, whereas the E. coli protein apparently lacks DNA-binding activity. These features exemplify the common and different biochemical characteristics within the domain of eubacteria. Common properties may be inherited from their ancestor, whereas different characteristics may be necessary for adaptation to diverse ecological niches.

In summary, our understanding of the microbial world and of the fundamentals of life and its evolution can be enriched by further biochemical characterization of hyperthermophiles; future research in this field will be both rewarding and merited.

VII. References

Achenbach-Richter, L., Gupta, R., Stetter, K. O., and Woese, C. R. (1987) Were the original eubacteria thermophiles? System. Appl. Microbiol., 9: 34-39.
Amons, R., Pluijms, W., Kriek, J., and Möller, W. (1982) The primary structure of proteins eL12'/eL12'-P from the large subunit of Artemia salina ribosomes. FEBS Lett., 146: 143-147.
Amons, R., Pluijms, W., and Möller, W. (1979) The primary structure of ribosomal eL12/eL12-P from Artemia salina 80S ribosomes. FEBS Lett., 104: 85-89.
An, G., and Friesen, J.
(1980) The nucleotide sequence of tufB and four nearby tRNA structural genes of Escherichia coli. Gene, 12: 33-39.
Arndt, E., and Weigel, C. (1990) Nucleotide sequence of the genes encoding the L11, L1, L10 and L12 equivalent ribosomal proteins from the archaebacterium Halobacterium marismortui. Nucleic Acids Res., 18: 1285.
Arraiano, C., Yancey, S., and Kushner, S. (1988) Stabilization of discrete mRNA breakdown products in ams rnb multiple mutants of Escherichia coli K-12. J. Bacteriol., 170: 4625-4633.
Bachleitner, M., Ludwig, W., Stetter, K. O., and Schleifer, K. H. (1989) Nucleotide sequence of the gene coding for the elongation factor Tu from extremely thermophilic eubacterium Thermotoga str operon. FEBS Lett., 57: 115-120.
Baier, G., Piendl, W., Redl, B., and Stoeffler, G. (1990) Structure, organization and evolution of the L1 equivalent ribosomal protein gene of the archaebacterium Methanococcus vannielii. Nucleic Acids Res., 18: 719-724.
Balch, W. E., Fox, G. E., Magrum, L. J., Woese, C. R., and Wolfe, R. S. (1979) Methanogens: reevaluation of a unique biological group. Microbiol. Rev., 43: 260-296.
Bartsch, M., Kimura, M., and Subramanian, A. R. (1982) Purification, primary structure, and homology relationships of a chloroplast ribosomal protein. Proc. Natl. Acad. Sci. USA, 79: 6871-6875.
Baughman, G., and Nomura, M. (1983) Localization of the target site for translational regulation of the L11 operon and direct evidence for translational coupling in Escherichia coli. Cell, 34: 979-988.
Beauclerk, A. A. D., Hummel, H., Holmes, D. J., Böck, A., and Cundliffe, E. (1985) Studies of the GTPase domain of archaebacterial ribosomes. Eur. J. Biochem., 151: 242-255.
Beltrame, M., and Bianchi, M. E. (1990) A gene family for acidic ribosomal proteins in Schizosaccharomyces pombe: two essential and two non-essential genes. Mol. Cell. Biol., 10: 2341-2348.
Bonch-Osmolovskaya, E. A., Miroshnichenko, M. L., Kostrikina, N. A., Chernych, N. A., and Zavarzin, G. A. (1990) Thermoproteus uzoniensis sp.
nov., a new extremely thermophilic archaebacterium from Kamchatka continental hot springs. Arch. Microbiol., 154: 556-559.
Bonch-Osmolovskaya, E. A., Slesarev, A. I., Miroshnichenko, M. L., Svetlichnaya, T. P., and Alexeyev, V. A. (1988) Characteristics of Desulfurococcus amylolyticus n. sp. - a new extremely thermophilic archaebacterium isolated from thermal springs of Kamchatka and Kunashir island. Mikrobiologia, 57: 78-85.
Bremer, B., and Dennis, P. (1987) Modulation of chemical composition and other parameters of cell by growth rate. In Escherichia coli and Salmonella typhimurium: cellular and molecular biology (Neidhardt, F. C., Ingraham, J. L., Low, K. B., Schaechter, M., and Umbarger, E., eds.), pp. 1527-1542. American Society for Microbiology, Washington, DC.
Brennan, C. A., Dombroski, A. J., and Platt, T. (1987) Transcription termination factor ρ is an RNA-DNA helicase. Cell, 48: 945-952.
Brierley, C. L., and Brierley, J. A. (1973) A chemoautotrophic and thermophilic microorganism isolated from an acid hot spring. Can. J. Microbiol., 19: 183-188.
Brock, T. D. (1986) Introduction: an overview of the thermophiles. In Thermophiles: General Molecular and Applied Microbiology (Brock, T. D. ed.), pp. 1. John Wiley & Sons, New York.
Brock, T. D., Brock, K. M., Belly, R. T., and Weiss, R. L. (1972) Sulfolobus, a new genus of sulfur oxidizing bacteria living at low pH and high temperature. Arch. Microbiol., 84: 54-68.
Brown, J. W., Haas, E. S., and Pace, N. R. (1993) Characterization of ribonuclease P RNAs from thermophilic bacteria. Nucleic Acids Res., 21: 671-679.
Broyles, S., and Pettijohn, D. E. (1986) Interaction of the Escherichia coli HU protein with DNA. Evidence for formation of nucleosome-like structures with altered DNA helical pitch. J. Mol. Biol., 187: 47-61.
Burggraf, S., Olsen, G. J., Stetter, K. O., and Woese, C. R. (1992) A phylogenetic analysis of Aquifex pyrophilus. System. Appl. Microbiol., 15: 352-356.
Burggraf, S., Jannasch, H. W., Nicolaus, B., and Stetter, K. O.
(1990a) Archaeoglobus profundus sp. nov., represents a new species within the sulfur-reducing archaebacteria. System. Appl. Microbiol., 13: 24-28.
Burggraf, S., Fricke, H., Neuner, A., Kristjansson, J., Rouvier, P., Mandelco, L., Woese, C. R., and Stetter, K. O. (1990b) Methanococcus igneus sp. nov., a novel hyperthermophilic methanogen from shallow submarine hydrothermal system. System. Appl. Microbiol., 13: 263-269.
Casiano, C., Matheson, A. T., and Traut, R. R. (1990) Occurrence in the archaebacterium Sulfolobus solfataricus of a ribosomal protein complex corresponding to Escherichia coli (L7/L12)4•L10 and eukaryotic (P1)2/(P2)2•P0. J. Biol. Chem., 265: 18757-18761.
Cavalier-Smith, T. (1991) The evolution of cells. In Evolution of Life: Fossils, Molecules and Culture (Osawa, S. and Honjo, T. eds.), pp. 271-304, Springer-Verlag, Tokyo.
Chan, Y.-L., Paz, V., and Wool, I. G. (1989) The primary sequence of rat acidic ribosomal phosphoprotein P0. EMBL accession number X15096.
Christiansen, T., Johnsen, M., Fiil, N., and Friesen, J. (1984) RNA secondary structure and translation inhibition: analysis of mutants in the rplJ leader. EMBO J., 3: 1609-1612.
Chyba, C. F., Thomas, P. J., Brookshaw, L., and Sagan, C. (1990) Cometary delivery of organic molecules to the early Earth. Science, 249: 366-373.
Clarke, A. R., Wigley, D. A., Chia, W. N., Barstow, D., Atkinson, T., and Holbrook, J. (1986) Site-directed mutagenesis reveals role of mobile arginine residue in lactate dehydrogenase catalysis. Nature, 324: 699-702.
Conaway, J. W., and Conaway, R. C. (1991) Initiation of eukaryotic messenger RNA synthesis. J. Biol. Chem., 266: 17721-17724.
Cousineau, B., Cerpa, C., Lefebvre, J., and Cedergren, R. (1992) The sequence of the gene encoding elongation factor Tu from Chlamydia trachomatis compared with those of other organisms. Gene, 120: 33-41.
Crick, F. H. C. (1968) The origin of the genetic code. J. Mol. Biol., 38: 367-379.
Das, A.
(1992) How the phage lambda N gene product suppresses transcription termination: communication of RNA polymerase with regulatory proteins mediated by signals in the nascent RNA. J. Bacteriol., 174: 6711-6716.
Davies, G. J., Gamblin, S. J., Littlechild, J. A., and Watson, H. C. (1993) The structure of a thermally stable 3-phosphoglycerate kinase and a comparison with its mesophilic equivalent. Proteins, 15: 383-389.
Davies, G. J., Littlechild, J. A., Watson, H. C., and Hall, L. (1991) Sequence and expression of the gene encoding 3-phosphoglycerate kinase from Bacillus stearothermophilus. Gene, 109: 39-45.
Dean, D., and Nomura, M. (1980) Feedback regulation of ribosomal protein gene expression in Escherichia coli. Proc. Natl. Acad. Sci. USA, 77: 3590-3594.
Dennis, P. (1974) In vitro stability, maturation and relative differential synthesis rates of individual ribosomal proteins in Escherichia coli B/r. J. Mol. Biol., 88: 25-41.
Dennis, P. (1985) Multiple promoters for transcription of ribosomal RNA gene cluster in Halobacterium cutirubrum. J. Mol. Biol., 186: 457-461.
De Rosa, M., Gambacorta, A., Huber, R., Lanzotti, V., Nicolaus, B., Stetter, K. O., and Trincone, A. (1988) Lipid structure in Thermotoga maritima. J. Chem. Soc. Chem. Commun., 300.
Dijk, J., White, S. W., Wilson, K. S., and Appelt, K. (1983) On the DNA binding protein II from Bacillus stearothermophilus. J. Biol. Chem., 258: 4003-4006.
Dijk, J., Garrett, R. A., and Müller, R. (1979) Studies on the binding of the ribosomal protein complex L7/L12-L10 and protein L11 to the end one third of the 23S RNA: a functional center of the 50S subunit. Nucleic Acids Res., 6: 2717-2729.
Downing, W., and Dennis, P. (1987) Transcription products from the rplKAJL-rpoBC gene cluster. J. Mol. Biol., 194: 609-620.
Downing, W., Sullivan, S., Gottesman, M., and Dennis, P. (1990) Sequence and transcription pattern of the essential Escherichia coli secE-nusG operon. J. Bacteriol., 172: 1621-1627.
Downing, W., and Dennis, P.
(1991) RNA polymerase activity may regulate transcription initiation and attenuation in the rplKAJLrpoBC operon in Escherichia coli. J. Biol. Chem., 266: 1304-1311.
Draper, D. (1990) Structure and function of ribosomal protein-RNA complexes: thermodynamic studies. In The Ribosome: structure, function and evolution (Hill, W. E., Dahlberg, A., Garrett, R., Moore, P., Schlessinger, D. and Warner, J. eds.), pp. 160-167, American Society for Microbiology, Washington, DC.
Drlica, K. and Rouviere-Yaniv, J. (1987) Histone-like proteins of bacteria. Microbiol. Rev., 51: 301-319.
Durovic, P., Liao, D., Mylvaganam, S., and Dennis, P. P. (1993a) The Evolution of Ribosomal Protein and Ribosomal RNA Operons: Coding Sequences, Regulatory Mechanisms and Processing Pathways. In The Translational Apparatus (Nierhaus, N. R., Subramanian, A. R., Erdmann, V. A., Franceschi, F., and Wittmann-Liebold, B., eds.), (in press), Plenum, New York and London.
Durovic, P. (1993b) Characterization of a novel pathway for ribosomal RNA maturation in Sulfolobus acidocaldarius. Ph.D. thesis, University of British Columbia, Vancouver, B.C., Canada.
Egebjerg, J., Douthwaite, S., Liljas, A., and Garrett, R. (1990) Characterization of the binding sites of protein L11 and the L10·(L12)4 pentameric complex in the GTPase domain of 23 S ribosomal RNA from Escherichia coli. J. Mol. Biol., 213: 275-288.
El-Baradi, T. A. L., de Regt, C. H. F., Einerhand, S. W. C., Teixido, J., Planta, R. J., Ballesta, J. P. G., and Raue, H. A. (1987) Ribosomal proteins EL11 from Escherichia coli and L15 from Saccharomyces cerevisiae bind to the same site in both yeast 26S and mouse 28S rRNA. J. Mol. Biol., 195: 909-917.
Ernst, W. G. (1983) The early Earth and archean rock record. In Earth's Earliest Biosphere: its origin and evolution (Schopf, J. W. ed.), pp. 41-52. Princeton University Press, Princeton, NJ.
Falkenberg, P., Yaguchi, M., Roy, C., Zurker, M., and Matheson, A. T.
(1985) The primary structure of the ribosomal A-protein (L12) from the moderate halophile NRCC 41227. Biochem. Cell Biol., 64: 675-680.
Favaloro, J. R., Treisman, R., and Kamen, R. (1980) Transcription maps of polyoma virus-specific RNA: analysis by two-dimensional nuclease S1 mapping. Methods Enzymol., 65: 718-749.
Felsenstein, J. (1991) PHYLIP (Phylogeny Inference Package, version 3.4) (University of Washington).
Feinberg, A. P., and Vogelstein, B. (1983) A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem., 132: 6-13.
Ferro, J. A. and Reinach, F. C. (1988) The complete sequence of chicken-muscle cDNA encoding the acidic ribosomal protein P1. Eur. J. Biochem., 177: 513-516.
Fiala, G., and Stetter, K. O. (1986) Pyrococcus furiosus sp. nov. representing a novel genus of marine heterotrophic archaebacteria growing optimally at 100°C. Arch. Microbiol., 145: 56-61.
Fiala, G., Stetter, K. O., Jannasch, H. W., Langworthy, T. A. and Madon, J. (1986) Staphylothermus marinus sp. nov. represents a novel genus of extremely thermophilic [...] System. Appl. Microbiol., 8: 106-113.
Fiil, N., Friesen, J., Downing, W., and Dennis, P. (1980) Post-transcriptional regulatory mutants in a ribosomal protein-RNA polymerase operon of E. coli. Cell, 19: 837-844.
Friedman, D. I., and Olson, E. R. (1983) Evidence that a nucleotide sequence, "box A", is involved in the action of the NusA protein. Cell, 34: 143-149.
Gabowski, D. T., Pieper, R. O., Futscher, B. W., Deutsch, W. A., Erickson, L. C., and Kelley, M. R. (1992) Expression of ribosomal protein P0 is induced by antitumor agents and increased in Mer- human tumor cell lines. Carcinogenesis, 13: 259-263.
Garland, W. G., Louie, K. A., Matheson, A. T., and Liljas, A. (1987) The complete amino acid sequence of the ribosomal 'A' protein (L12) from Bacillus stearothermophilus. FEBS Lett., 220: 43-46.
Gilbert, W. (1986) The RNA world. Nature, 319: 618.
Gogarten, J.
P., Kibak, H., Dittrich, P., Taiz, L., Bowman, E. J., Bowman, B. J., Manolson, M. F., Poole, R. J., Date, T., Oshima, T., Konishi, J., Denda, K., and Yoshida, M. (1989) Evolution of the vacuolar H+-ATPase: Implications for the origin of eukaryotes. Proc. Natl. Acad. Sci. USA, 86: 6661-6665.
Gold, L., and Stormo, G. (1987) Translational Initiation. In Escherichia coli and Salmonella typhimurium (Neidhardt, F., Ingraham, J., Low, K., Magasanik, B., Schaechter, M. and Umbarger, H., eds.), pp. 1302-1307, American Society for Microbiology, Washington, DC.
Gourse, R. L., Sharrock, R. A., and Nomura, M. (1986) Control of ribosome synthesis in Escherichia coli. In Structure, Function, and Genetics of Ribosomes (Hardesty, B., and Kramer, G. eds.), pp. 766-788, Springer-Verlag, New York, NY.
Gourse, R. L., Thurlow, D. L., Gerbi, S. A., and Zimmerman, R. A. (1981) Specific binding of a prokaryotic ribosomal protein to a eukaryotic ribosomal RNA: implications for evolution and autoregulation. Proc. Natl. Acad. Sci. USA, 78: 2722-2726.
Grant, W. D., and Larsen, H. (1989) Ex[...] Halobacteriales, ord. nov. In Bergey's Manual of Systematic Bacteriology (Staley, J. T., Bryant, M. P., Pfennig, N., and Holt, J. G., eds.), 3: 2216, Williams and Wilkins, Baltimore.
Green, R., and Szostak, J. W. (1992) Selection of a ribozyme that functions as a superior template in self-copying reaction. Science, 258: 1910-1915.
Greenblatt, J. (1991) RNA polymerase-associated transcription factors. Trends Biochem. Sci., 16: 408-411.
Grogan, D., Palm, P., and Zillig, W. (1990) Isolate B12, which harbours a virus-like element, represents a new species of archaebacterial genus Sulfolobus, Sulfolobus shibatae, sp. nov. Arch. Microbiol., 154: 594-599.
Gualerzi, C. O., Losso, M. A., Lammi, M., Friedrich, K., Pawlik, R. T., Canonaco, M. A., Gianfranceschi, G., Pingoud, A., and Pon, C. L. (1986) Proteins from the prokaryotic nucleoid.
Structural and functional characterization of the Escherichia coli DNA-binding proteins NS (HU) and H-NS. In Bacterial Chromatin (Gualerzi, C. O., and Pon, C. L. eds.) pp. 101-134. Springer-Verlag, Berlin.
Gudkov, A. T., Tumanova, L. G., Gongadze, G. H., and Bushow, V. N. (1980) Role of different regions of ribosomal proteins L7 and L10 in their complex formation and in the interaction with the ribosomal 50S subunit. FEBS Lett., 109: 34-38.
Hansen, T. S., Andreasen, P. H., Dreisig, H., Hojrup, P., Nielsen, H., Engberg, J., and Kristiansen, K. (1991) Tetrahymena thermophila acidic ribosomal protein L37 contains an archaebacterial type C-terminus. Gene, 105: 143-150.
Hasegawa, M., Hashimoto, T., and Adachi, J. (1993) Origin and evolution of eukaryotes as inferred from protein sequence data. In The Origin and Evolution of Prokaryotic and Eukaryotic Cells (Hartman, H., and Matsuno, K., eds.), World Sci. Publ. (in press).
Hardy, S. (1975) The stoichiometry of the ribosomal proteins in Escherichia coli. Mol. Gen. Genet., 140: 253-274.
Heinrich, T., Schröder, W., Erdmann, V. A., and Hartmann, R. K. (1992) Identification of the gene encoding transcription factor NusG of Thermus thermophilus. J. Bacteriol., 174: 7859-7863.
Henikoff, S. (1984) Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene, 28: 351-359.
Hensel, R., Jakob, I., Scheer, H., and Lottspeich, F. (1992) Proteins from hyperthermophilic archaea: stability towards covalent modification of the peptide chain. In The archaebacteria: biochemistry and biotechnology, Biochem. Soc. Symp. (Danson, M. J., Hough, D. W., and Lunt, G. G. eds.), 58: 127-133, Portland Press, London.
Higgins, C., Hinton, J., Hutton, J., Owen-Hughes, T., Pavitt, G., and Seirafi, A. (1990) Protein H1: a role for chromatin structure in the regulation of bacterial gene expression and virulence? Mol. Microbiol., 4: 2007-2012.
Hoopes, B., and McClure, W. (1987) Strategies in regulation of transcription initiation.
In Escherichia coli and Salmonella typhimurium (Neidhardt, F., Ingraham, J., Low, K., Magasanik, B., Schaechter, M. and Umbarger, H. eds.), pp. 1231-1240, American Society for Microbiology, Washington, DC.
Huber, G., and Stetter, K. O. (1991) Sulfolobus metallicus, sp. nov., a novel strictly chemolithoautotrophic thermophilic archaeal species of metal-mobilizers. System. Appl. Microbiol., 14: 372-378.
Huber, G., Spinnler, C., Gambacorta, A., and Stetter, K. O. (1989a) Metallosphaera sedula gen. and sp. nov. represents a new genus of aerobic, metal-mobilizing, thermoacidophilic archaebacteria. System. Appl. Microbiol., 12: 38-47.
Huber, H., Thomm, M., König, H., Thies, G., and Stetter, K. O. (1982) Methanococcus thermolithotrophicus, a novel thermophilic lithotrophic methanogen. Arch. Microbiol., 132: 47-50.
Huber, R., and Stetter, K. O. (1992a) The Thermophiles: hyperthermophilic and extremely thermophilic bacteria. In Thermophilic Bacteria (Kristjansson, J. K. ed.), pp. 185-194, CRC Press Inc., Boca Raton, FL.
Huber, R., Wilharm, T., Huber, D., Trincone, A., Burggraf, S., König, H., Rachel, R., Rockinger, I., Fricke, H., and Stetter, K. O. (1992b) Aquifex pyrophilus gen. nov., sp. nov., represents a novel group of marine hyperthermophilic hydrogen-oxidizing bacteria. System. Appl. Microbiol., 15: 340-351.
Huber, R., Woese, C. R., Langworthy, T. A., Kristjansson, J. K., and Stetter, K. O. (1990) Fervidobacterium islandicum sp. nov., a new extremely thermophilic eubacterium belonging to the "Thermotogales". Arch. Microbiol., 154: 105-111.
Huber, R., Woese, C. R., Langworthy, T. A., Fricke, H., and Stetter, K. O. (1989b) Thermosipho africanus gen. nov., represents a new genus of thermophilic eubacteria within the "Thermotogales". System. Appl. Microbiol., 12: 32-37.
Huber, R., Kristjansson, J. K., and Stetter, K. O. (1987) Pyrobaculum gen. nov., a new genus of neutrophilic, rod-shaped archaebacteria from continental solfataras growing optimally at 100°C. Arch.
Microbiol., 149: 95-101.
Huber, R., Langworthy, T. A., König, H., Thomm, M. M., Woese, C. R., Sleytr, U. B., and Stetter, K. O. (1986) Thermotoga maritima sp. nov. represents a new genus of unique extremely thermophilic eubacteria growing up to 90°C. Arch. Microbiol., 144: 324-333.
Hulton, C., Seirafi, A., Hinton, J., Sidebotham, J., Waddell, L., Pavitt, G., Owen-Hughes, T., Spassky, A., Buc, H., and Higgins, C. (1990) Histone-like protein H1 (H-NS), DNA supercoiling, and gene expression in bacteria. Cell, 63: 631-642.
Itoh, T. (1988) Complete nucleotide sequence of the ribosomal "A" protein operon from the archaebacterium Halobacterium halobium. Eur. J. Biochem., 176: 297-303.
Itoh, T., and Otaka, E. (1984) Complete amino-acid sequence of an L7/L12-type ribosomal protein from Desulfovibrio vulgaris Miyazaki. Biochim. Biophys. Acta, 789: 229-233.
Itoh, T., and Higo, K. I. (1983) Complete amino acid sequence of an L7/L12-type ribosomal protein from Rhodopseudomonas spheroides. Biochim. Biophys. Acta, 744: 105-109.
Itoh, T., Sugiyama, M., and Higo, K. I. (1982) The primary structure of an acidic ribosomal protein from Streptomyces griseus. Biochim. Biophys. Acta, 701: 164-172.
Itoh, T. (1981) Primary structure of an acidic ribosomal protein from Micrococcus lysodeikticus. FEBS Lett., 127: 67-70.
Itoh, T., and Wittmann-Liebold, B. (1978) The primary structure of Bacillus subtilis acidic ribosomal protein B-L9 and its comparison with Escherichia coli proteins L7/L12. FEBS Lett., 96: 392-394.
Iwabe, N., Kuma, K., Hasegawa, M., Osawa, S., and Miyata, T. (1989) Evolutionary relationship of archaebacteria, eubacteria and eukaryotes inferred from phylogenetic trees of duplicated genes. Proc. Natl. Acad. Sci. USA, 86: 9355-9359.
Jinks-Robertson, S., and Nomura, M. (1987) Ribosomes and tRNA. In Escherichia coli and Salmonella typhimurium (Neidhardt, F., Ingraham, J., Low, K., Magasanik, B., Schaechter, M. and Umbarger, H. eds.), pp.
1358-1385, American Society for Microbiology, Washington, DC.
Johnsen, M., Christiansen, T., Dennis, P., and Fiil, N. (1982) Autogenous control: Ribosomal protein L10-L12 complex binds to the leader sequence of its mRNA. EMBO J., 1: 999-1004.
Jones, W. J., Leigh, J. A., Mayer, F., Woese, C. R., and Wolfe, R. S. (1983) Methanococcus jannaschii sp. nov., an extremely thermophilic methanogen from submarine hydrothermal vent. Arch. Microbiol., 136: 254-261.
Joyce, G. F. (1989) RNA evolution and the origin of life. Nature, 338: 217-224.
Juan-Vidales, F., Sanchez-Madrid, F., Saenz-Robles, M. T., and Ballesta, J. P. G. (1983) Purification and characterization of two ribosomal proteins of Saccharomyces cerevisiae, homologies with proteins from eukaryotic and with bacterial protein ECL11. Eur. J. Biochem., 136: 275-281.
Juszczak, A., Aono, S., and Adams, M. W. (1991) The extremely thermophilic eubacterium, Thermotoga maritima, contains a novel iron-dehydrogenase whose cellular activity is dependent upon tungsten. J. Biol. Chem., 266: 13834-13841.
Kasting, J. F. (1993) Earth's early atmosphere. Science, 259: 920-926.
Kelly, M. R., Venugopal, S., Harless, J., and Deutsch, W. A. (1989) Antibody to a human DNA repair protein allows for cloning of a Drosophila cDNA that encodes an apurinic endonuclease. Mol. Cell. Biol., 9: 965-973.
Kearney, K. R., and Nomura, M. (1987) Secondary structure of the autoregulatory mRNA binding site of ribosomal protein L1. Mol. Gen. Genet., 210: 609-618.
Kimura, M., Kimura, J., and Ashman, K. (1985) The complete primary structure of ribosomal proteins L1, L14, L15, L23, L24 and L29 from Bacillus stearothermophilus. Eur. J. Biochem., 150: 491-497.
King, T., and Schlessinger, D. (1987) Processing of RNA transcripts. In Escherichia coli and Salmonella typhimurium (Neidhardt, F., Ingraham, J., Low, K., Magasanik, B., Schaechter, M. and Umbarger, H. eds.), pp. 703-718. American Society for Microbiology, Washington, DC.
Knoll, A. H., and Barghoorn, E. S.
(1985) Archean microfossils showing cell division from the Swaziland system of South Africa. Science, 198: 396-398.
Köpke, A. K. E., Leggatt, P. A., and Matheson, A. T. (1992) Structure function relationships in the ribosomal stalk proteins of archaebacteria. J. Biol. Chem., 267: 1382-1390.
Kristjansson, J. K., and Stetter, K. O. (1992) Thermophilic bacteria. In Thermophilic Bacteria (Kristjansson, J. K. ed.), pp. 1-18, CRC Press Inc., Boca Raton, FL.
Krowczynska, A. M., Coutts, M., Makrides, S., and Brawerman, G. (1989) The mouse homologue of the human acidic ribosomal protein P0: a highly conserved polypeptide that is under translational control. Nucleic Acids Res., 17: 6408-6408.
Kurr, M., Huber, R., König, H., Jannasch, H. W., Fricke, H., Trincone, A., Kristjansson, J. K., and Stetter, K. O. (1991) Methanopyrus kandleri, gen. and sp. nov. represents a novel group of hyperthermophilic methanogens, growing at 110°C. Arch. Microbiol., 156: 239-247.
Laemmli, U. K. (1970) Cleavage of structural proteins during the assembly of the head of the bacteriophage T4. Nature, 227: 680-685.
Lake, J. A. (1990) Origin of the eucaryotic nucleus: rRNA sequences genotypically relate eocytes and eucaryotes. In The Ribosome: Structure, Function and Evolution (Hill, W. E., Dahlberg, A., Garrett, R. A., Moore, P. B., Schlessinger, D. and Warner, J. R. eds.), pp. 579-588, American Society for Microbiology, Washington, DC.
Lake, J. A. (1988) Origin of the nucleus determined by rate-invariant analysis of rRNA sequences. Nature, 331: 184-186.
Lake, J. M., and Strycharz, W. A. (1981) Ribosomal proteins L1, L17, L27 localized at single sites on the large subunit by immune electron microscopy. J. Mol. Biol., 153: 979-992.
Lauerer, G., Kristjansson, J. K., Langworthy, T. A., König, H., and Stetter, K. O. (1986) Methanothermus sociabilis sp. nov., a second species within the Methanothermaceae growing at 97°C. System. Appl. Microbiol., 8: 100-105.
Leijonmarck, M., Liljas, A., and Subramanian, A. R.
(1984) Computed spatial homology between the L12 protein of chloroplast ribosome and 1.7 Å structure of Escherichia coli L12 domain. Biochem. Int., 8: 69-76.
Leijonmarck, M., Eriksson, S., and Liljas, A. (1980) Crystal structure of a ribosomal component at 2.6 Å resolution. Nature, 286: 824-826.
Leijonmarck, M., and Liljas, A. (1987) Structure of the C-terminal domain of the ribosomal protein L7/L12 from Escherichia coli at 1.7 Å. J. Mol. Biol., 195: 555-580.
Li, J., Mason, S. W., and Greenblatt, J. (1993) Elongation factor NusG interacts with termination factor ρ to regulate termination and antitermination of transcription. Genes Dev., 7: 161-172.
Li, J., Horwitz, R., McCrachen, S., and Greenblatt, J. (1992) NusG, a new Escherichia coli elongation factor involved in transcriptional antitermination by N protein of phage λ. J. Biol. Chem., 267: 6012-6019.
Liao, D., and Dennis, P. P. (1992) The organization and expression of essential translation, transcription component genes in the extremely thermophilic eubacterium Thermotoga maritima. J. Biol. Chem., 267: 22787-22797.
Liljas, A. (1982) Structural studies of ribosomes. Prog. Biophys. Mol. Biol., 40: 161-228.
Lindahl, L., Jaskunas, S., Dennis, P., and Nomura, M. (1975) Cluster of genes in Escherichia coli for ribosomal proteins, ribosomal RNA and RNA polymerase. Proc. Natl. Acad. Sci. USA, 72: 2743-2747.
Lindahl, L., and Zengel, J. (1986) Ribosomal genes in Escherichia coli. Ann. Rev. Genet., 20: 297-326.
Linn, T., and Greenblatt, J. (1992) The NusA and NusG proteins of Escherichia coli increase the in vitro read through frequency of a transcriptional attenuator preceding the gene for β subunit of RNA polymerase. J. Biol. Chem., 267: 1449-1454.
Loomis, W. F., and Smith, D. W. (1990) Molecular phylogeny of Dictyostelium discoideum by protein sequence comparison. Proc. Natl. Acad. Sci. USA, 87: 9093-9097.
Londei, P., Altamura, S., Huber, R., Stetter, K. O., and Cammarano, P.
(1988) Ribosomes of the extremely thermophilic eubacterium Thermotoga maritima are uniquely insensitive to the miscoding-inducing action of aminoglycoside antibiotics. J. Bacteriol., 170: 4353-4360.
Magsanaga, A., and Nosoh, Y. (1974) Conformational change with temperature and thermostability of glutamine synthetase from Bacillus stearothermophilus. Biochim. Biophys. Acta, 365: 208-211.
Manca, M. C., Nicolaus, B., Lanzotti, V., Trincone, A., Gambacorta, A., Peter-Katalinic, J., Egge, H., Huber, R., and Stetter, K. O. (1992) Glycolipids from Thermotoga maritima, a hyperthermophilic microorganism belonging to bacterial domain. Biochim. Biophys. Acta, 1124: 249-252.
Marquis, D. M., Fahnestock, S. R., Henderson, E., Woo, D., Schwinge, S., Clark, M. W., and Lake, J. A. (1981) The L7/L12 stalk, a conserved feature of the prokaryotic ribosome, is attached to the large subunit through its N-terminus. J. Mol. Biol., 150: 121-132.
Mason, S. W., Li, J., and Greenblatt, J. (1992) Direct interaction between two Escherichia coli transcription antitermination factors, NusG and ribosomal protein S10. J. Mol. Biol., 223: 55-66.
Mason, S. W., and Greenblatt, J. (1991) Assembly of transcription elongation complexes containing the N-protein of phage λ and Escherichia coli elongation factors NusA, NusB, NusG, and S10. Genes Dev., 5: 1504-1512.
Matheson, A. T., Auer, J., Ramirez, C., and Böck, A. (1990) Structure and evolution of archaebacterial ribosomal proteins. In The Ribosome: Structure, Function and Evolution (Hill, W. E., Dahlberg, A., Garrett, R. A., Moore, P. B., Schlessinger, D. and Warner, J. R. eds.), pp. 617-635, American Society for Microbiology, Washington, DC.
Matheson, A. T., Louie, A. K., and Böck, A. (1988) The complete amino acid sequence of the ribosomal A protein (L12) from the archaebacterium Sulfolobus acidocaldarius. FEBS Lett., 231: 331-335.
Matheson, A. T., Louie, K. A., Tak, B. D., and Zuker, M.
(1987) The primary structure of ribosomal A-protein (L12) from the halophilic eubacterium Haloanaerobium praevalens. Biochimie, 69: 1013-1020.
McCarroll, R., Olsen, G. J., Stahl, Y. D., Woese, C. R., and Sogin, M. L. (1983) Nucleotide sequence of the Dictyostelium discoideum small-subunit ribosomal ribonucleic acid inferred from the gene sequence: evolutionary implications. Biochemistry, 22: 5858-5868.
McClure, W. R. (1985) Mechanism and control of transcription initiation in prokaryotes. Annu. Rev. Biochem., 54: 171-204.
Merkler, D. J., Srikumar, K., Marches-Ragona, S. P., and Wedler, F. C. (1988) Aggregation and thermoinactivation of glutamine synthetase from an extreme thermophilic B. caldolyticus. Biochim. Biophys. Acta, 952: 101-114.
Miroshnichenko, M. L., Bonch-Osmolovskaya, E. A., Neuner, A., Kostrikina, N. A., Chernych, N. A., and Alekseev, V. A. (1989) Thermococcus stetteri sp. nov., a new extremely thermophilic marine sulfur-metabolizing archaebacterium. System. Appl. Microbiol., 12: 257-262.
Mitsui, K., Nakogawa, T., and Tsurugi, K. (1989) The gene and the primary structure of acidic ribosomal protein A0 from yeast Saccharomyces cerevisiae which show partial homology to bacterial ribosomal protein L10. J. Biochem., 160: 223-227.
Mitsui, K., and Tsurugi, K. (1988) cDNA and deduced amino acid sequence of acidic ribosomal protein A0 from Saccharomyces cerevisiae. Nucleic Acids Res., 16: 3573-3573.
Morgan, W. D., Bear, D. G., and Von Hippel, P. H. (1984) Specificity of release by Escherichia coli transcription termination factor Rho of nascent mRNA transcripts initiated at the λ pR promoter. J. Biol. Chem., 259: 8664-8671.
Morgan, W. D., Bear, D. G., and Von Hippel, P. H. (1983) ρ-dependent termination of transcription. 1. Identification and characterization of termination sites for transcription from the bacteriophage λ pR promoter. J. Biol. Chem., 258: 9553-9564.
Neuner, A., Jannasch, H. W., Belkin, S., and Stetter, K. O. (1990) Thermococcus litoralis sp.
nov.: a new species of extremely thermophilic marine archaebacteria. Arch. Microbiol., 153: 205-207.
Newton, C. H., Shimmin, L. C., Yee, J., and Dennis, P. P. (1990) A family of genes encode the multiple forms of the Saccharomyces cerevisiae ribosomal proteins equivalent to the Escherichia coli L12 protein and a single form to L10-equivalent ribosomal protein. J. Bacteriol., 172: 579-588.
Nishiyama, K., Mizushima, S., and Tokuda, H. (1992) The carboxyl-terminal region of SecE interacts with SecY and is functional in the reconstitution of protein translocation activity in Escherichia coli. J. Biol. Chem., 267: 7170-7176.
Nodwell, J. R., and Greenblatt, J. (1993) Recognition of box A antiterminator RNA by the E. coli antitermination factor NusB and ribosomal protein S10. Cell, 72: 261-268.
Noll, K. M. (1989) Chromosome map of the thermophilic archaebacterium Thermococcus celer. J. Bacteriol., 171: 6720-6726.
Noller, H. F., Hoffarth, V., and Zimniak, L. (1992) Unusual resistance of peptidyl transferase to protein extraction procedures. Science, 256: 1416-1419.
Okamoto, S., Nihira, T., Kataoka, H., Suziki, A., and Yamada, Y. (1992) Purification and molecular cloning of a butyrolactone autoregulator receptor from Streptomyces virginiae. J. Biol. Chem., 267: 1093-1098.
Pace, N. R. (1991) Origin of life-facing up to the new physical setting. Cell, 65: 531-533.
Pace, N. R., Olsen, G. J., and Woese, C. R. (1986) Ribosomal RNA phylogeny and the primary lines of evolutionary descent. Cell, 45: 325-326.
Patel, B. K. C., Morgan, H. W., and Daniel, R. M. (1985) Fervidobacterium nodosum gen. nov. and spec. nov., a new chemoorganotrophic caldo-active, anaerobic bacterium. Arch. Microbiol., 141: 63-69.
Paton, E. B., Woodmaska, M. I., Kroupskaya, I. V., Zhyvoloup, A. N., and Matsuka, G. K. (1990) Evidence for the ability of L10 ribosomal proteins of Salmonella typhimurium and Klebsiella pneumoniae to regulate rplJL gene expression in Escherichia coli. FEBS Lett., 265: 129-132.
Paton, E. B., Woodmaska, M.
I., Kroupskaya, I. V., Zhyvoloup, A. N., and Matsuka, G. K. (1990) Evidence for the ability of L10 ribosomal proteins of Salmonella typhimurium and Klebsiella pneumoniae to regulate rplJL gene expression in Escherichia coli. FEBS Lett., 265: 129-132.
Paton, E. B., Zolotukhiu, S. B., Woodmaska, M. I., Kroupskaya, I. V., and Zhyvoloup, A. N. (1990) The nucleotide sequence of gene rplJ encoding ribosomal protein L10 of Salmonella typhimurium. Nucleic Acids Res., 18: 2824-2824.
Pettersson, I. (1979) Studies on the RNA and protein binding sites of the E. coli ribosome protein L10. Nucleic Acids Res., 6: 2637-2646.
Petersen, C. (1990) Escherichia coli ribosomal protein L10 is rapidly degraded when synthesized in excess of ribosomal protein L7/L12. J. Bacteriol., 172: 431-436.
Perutz, M. F., and Raidt, H. (1975) Stereochemical basis of heat stability in bacterial ferredoxins and in haemoglobins A2. Nature, 255: 256-259.
Piccirilli, J. A., McConnell, T. S., Zaug, A. J., Noller, H. F., and Cech, T. R. (1992) Aminoacyl esterase activity of Tetrahymena ribozyme. Science, 256: 1420-1424.
Pinto, J. P., Gladstone, G. R., and Yung, Y. L. (1980) Photochemical production of formaldehyde in Earth's primitive atmosphere. Science, 210: 183-185.
Planta, R., Mager, W., Leer, R., Wondt, L., Raue, H., and El-Baradi, T. (1986) Structure and expression of ribosomal proteins in yeast. In Structure, Function and Genetics of Ribosomes (Hardesty, B., and Kramer, G., eds.), pp. 699-718. Springer-Verlag, New York, NY.
Platt, T. (1986) Transcription termination and the regulation of gene expression. Annu. Rev. Biochem., 55: 339-372.
Pley, U., Schipka, J., Gambacorta, A., Jannasch, H. W., Fricke, H., Rachel, R., and Stetter, K. O. (1991) Pyrodictium abyssi sp. nov. represents a novel heterotrophic marine archaeal hyperthermophile growing at 110°C. System. Appl. Microbiol., 14: 245-253.
Portier, C., Dondon, L., Grunberg-Manago, M., and Régnier, P.
(1987) The first step in the functional inactivation of Escherichia coli polynucleotide phosphorylase messenger is a ribonuclease III processing at the 5' end. EMBO J., 6: 2165-2170.
Post, L. E., Strycharz, G. D., Nomura, M., Lewis, H., and Dennis, P. P. (1979) Nucleotide sequence of the ribosomal protein gene cluster adjacent to the gene for RNA polymerase subunit in Escherichia coli. Proc. Natl. Acad. Sci. USA, 76: 1697-1701.
Prieto, J., Candel, E., and Coloma, A. (1991) Nucleotide sequence of a cDNA encoding ribosomal protein P0 in Dictyostelium discoideum. Nucleic Acids Res., 19: 1342.
Pucciarelli, M. G., Remacha, M., Vilella, M. D., and Ballesta, J. P. G. (1990) The 26S rRNA binding ribosomal protein equivalent to bacterial protein L11 is encoded by unspliced duplicated genes in Saccharomyces cerevisiae. Nucleic Acids Res., 18: 4409-4416.
Qian, S., Zhang, J.-Y., Kay, M. A., and Jacobs-Lorena, M. (1987) Structural analysis of the Drosophila rpA1 gene, a member of the eukaryotic "A" type ribosomal protein family. Nucleic Acids Res., 15: 987-1003.
Ramirez, C., Shimmin, L. C., Newton, C. H., Matheson, A. T., and Dennis, P. P. (1989) Structure and evolution of the L11, L1, L10 and L12 equivalent ribosomal proteins in eubacteria, archaebacteria and eukaryotes. Can. J. Microbiol., 35: 234-244.
Rehaber, V., and Jaenicke, R. (1992) Stability and reconstitution of D-glyceraldehyde-3-phosphate dehydrogenase from the hyperthermophilic eubacterium Thermotoga maritima. J. Biol. Chem., 267: 10999-11006.
Remacha, M., Saenz-Robles, M. T., Vilella, M. D., and Ballesta, J. P. G. (1988) Independent genes coding for three acidic proteins of the large ribosomal subunit from Saccharomyces cerevisiae. J. Biol. Chem., 263: 9094-9101.
Rich, B. E., and Steitz, J. A. (1987) Human acidic ribosomal phosphoproteins P0, P1, and P2: analysis of cDNA clones, in vitro synthesis, and assembly. Mol. Cell. Biol., 7: 4065-4074.
Rivera, M. C., and Lake, J. A. (1992)
Evidence that eukaryotes and eocyte prokaryotes are immediate relatives. Science, 257: 74-76.
Robson, E., and Pain, R. H. (1971) Analysis of code relating sequence to conformation in proteins: possible implications for the mechanism of formation of helical regions. J. Mol. Biol., 58: 237-259.
Roberts, J. W. (1993) RNA and protein elements of E. coli and λ transcription antitermination complexes. Cell, 72: 653-655.
Rosenberg, M., Court, D., Shimatake, H., Brady, C., and Wulff, D. L. (1978) The relationship between function and DNA sequence in an intercistronic regulatory region of phage λ. Nature, 272: 414-422.
Rouvière-Yaniv, J., Yaniv, M., and Germond, J.-E. (1979) E. coli DNA binding protein HU forms nucleosome-like structure with circular double-stranded DNA. Cell, 17: 265-274.
Ryan, P. C., Lu, M., and Draper, D. E. (1991) Recognition of highly conserved GTPase center of 23 S ribosomal RNA by ribosomal protein L11 and the antibiotic thiostrepton. J. Mol. Biol., 221: 1257-1268.
Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular cloning: a laboratory manual, Cold Spring Harbor Laboratory, NY.
Sanger, F., Coulson, A. R., Barrell, B. G., Smith, A. J. H., and Roe, B. A. (1980) Cloning in single stranded bacteriophage as an aid to rapid DNA sequencing. J. Mol. Biol., 143: 161-178.
Sanchez-Madrid, F., Vidales, F. J., and Ballesta, J. P. G. (1981) Functional role of acidic ribosomal proteins. Interchangeability of proteins from bacterial and eucaryotic cells. Biochemistry, 20: 3263-3266.
Sander, G. (1983) Ribosomal protein L1 from Escherichia coli. Its role in the binding of tRNA to the ribosome and in elongation factor G-dependent hydrolysis. J. Biol. Chem., 258: 10098-10102.
Sawadogo, M., and Sentenac, A. (1990) RNA polymerase B (II) and general transcription factors. Annu. Rev. Biochem., 59: 711-754.
Schatz, P., Bieker, K., Ottemann, K., Silhavy, T., and Beckwith, J. (1991) One of the three transmembrane stretches is sufficient for functioning of the SecE protein, a component of the E.
coli secretion machinery. EMBO J., 10: 1749-1757.
Schatz, P., Riggs, P., Jacq, A., Fath, M., and Beckwith, J. (1989) The secE gene encodes an integral membrane protein required for protein export in E. coli. Genes Dev., 3: 1035-1044.
Schijman, A. G., Dusetti, N. J., Vazquez, M. P., Lafton, S., Levy-Yeyati, P., and Levin, M. J. (1990) Nucleotide cDNA and complete deduced amino acid sequence of a Trypanosoma cruzi ribosomal P protein (P-JL5). Nucleic Acids Res., 18: 3399.
Schopf, J. W. (1993) Microfossils of the early Archean Apex chert: new evidence of the antiquity of life. Science, 260: 640-646.
Schopf, J. W., and Walter, M. R. (1983) Archean microfossils: new evidence of ancient microbes. In Earth's Earliest Biosphere: its origin and evolution (Schopf, J. W. ed.), pp. 214-239, Princeton University Press, Princeton, NJ.
Segerer, A. H., Trincone, A., Gahrtz, M., and Stetter, K. O. (1991) Stygiolobus azoricus gen. nov., sp. nov., represents a novel genus of anaerobic, extremely thermophilic archaebacteria of the order Sulfolobales. Int. J. Syst. Bacteriol., 41: 495-501.
Segerer, A., Neuner, A., Kristjansson, J. K., and Stetter, K. O. (1986) Acidianus brierleyi comb. nov.: facultatively aerobic, extremely acidophilic thermophilic sulfur-metabolizing archaebacteria. Int. J. Syst. Bacteriol., 36: 559-564.
Shapiro, R. (1988) Prebiotic ribose synthesis: a critical analysis. Origin of Life & Evolution of Biosphere, 18: 71-85.
Sharp, R. J., Riley, P. W., and White, D. (1992) Heterotrophic thermophilic Bacilli. In Thermophilic Bacteria (Kristjansson, J. K., ed.), pp. 19-50. CRC Press Inc., Boca Raton, FL.
Shimmin, L. C. (1990) An archaebacterial ribosomal protein gene cluster. Ph.D. thesis, University of British Columbia, Vancouver, B.C., Canada.
Shimmin, L. C., and Dennis, P. P. (1989) Characterization of the L11, L1, L10 and L12 equivalent ribosomal protein gene cluster of the halophilic archaebacterium Halobacterium cutirubrum. EMBO J., 8: 1225-1235.
Shimmin, L. C., Ramirez, C., Matheson, A.
T., and Dennis, P. P. (1989) Sequence alignment and evolutionary comparison of the L10 equivalent and L12 equivalent ribosomal protein from archaebacteria, eubacteria and eukaryotes. J. Mol. Evol., 29: 448-462.
Sibold, C., and Subramanian, A. R. (1990) Cloning and characterization of the genes for ribosomal proteins L10 and L12 from Synechocystis sp. PCC 6803: Comparison of gene clustering pattern and protein sequence homology between cyanobacteria and chloroplasts. Biochim. Biophys. Acta, 1050: 61-68.
Simpson, H. D., Coolbear, T., Vermute, M., and Daniel, R. M. (1990) Purification and some properties of a thermostable DNA polymerase from a Thermotoga species. Biochem. Cell Biol., 68: 1292-1296.
Smooker, P. M., Schmidt, J., and Subramanian, A. R. (1991) The nuclear:organelle distribution of chloroplast ribosomal protein genes. Features of a cDNA clone encoding precursor of L11. Biochimie, 73: 845-851.
Sor, F., and Nomura, M. (1987) Cloning and DNA sequence determination of the L11 ribosomal protein of Serratia marcescens and Proteus vulgaris. Translational feedback regulation of Escherichia coli L11 operon by heterologous L11 proteins. Mol. Gen. Genet.
Stetter, K. O. (1993) Life at the upper temperature border. In Colloque interdisciplinaire du comite national de la recherche scientifique, Frontiers of Life, Le Bloris Proceedings (Than Thanh Van, J. T., Mounolou, J. C., Schneider, J., and McKay, C. eds.), pp. 195-219, C55, Editions Frontiers, Gif-sur-Yvette.
Stetter, K. O. (1988) Archaeoglobus fulgidus gen. nov., sp. nov.: a new taxon of extremely thermophilic archaebacteria. System. Appl. Microbiol., 10: 172-173.
Stetter, K. O., Lauerer, G., Thomm, M., and Neuner, A. (1987) Isolation of extremely thermophilic sulfate reducers: evidence for a novel branch of archaebacteria. Science, 236: 822-824.
Stetter, K. O. (1986) Diversity of extremely thermophilic archaebacteria. In Thermophiles: General, molecular and applied microbiology (Brock, T. D. ed.) pp. 40-74, John Wiley & Sons, New York, NY.
Stetter, K.
O., König, H., and Stackebrandt, E. (1983) Pyrodictium gen. nov., a new genus of submarine disc-shaped sulphur reducing archaebacteria growing optimally at 105°C. System. Appl. Microbiol., 4: 535-551.
Stetter, K. O., Thomm, M., Winter, J., Wildgruber, G., Huber, M., Zillig, W., Janecovic, D., König, H., Palm, P., and Wunderl, S. (1981a) Methanothermus fervidus, sp. nov., a novel extremely thermophilic methanogen isolated from Icelandic hot spring. Zentralbl. Bakteriol. Hyg., Abstr. 1 Orig. C2, 166.
Stetter, K. O., Thomm, M., Winter, J., Wildgruber, G., Huber, H., Zillig, W., Janecovic, D., König, H., Palm, P., and Wunderl, S. (1981b) Methanothermus fervidus, sp. nov., a novel extremely thermophilic methanogen isolated from Icelandic hot spring. Zentralbl. Bakteriol. Hyg., Abstr. 1 Orig. C2, 166-178.
Stöffler-Meilicke, M., and Stöffler, G. (1991) The binding site of ribosomal protein L10 in eubacteria and archaebacteria is conserved: recognition of chimeric 50S subunit. Biochimie, 73: 797-804.
Strycharz, W. A., Nomura, M., and Lake, J. A. (1978) Ribosomal proteins L7/L12 localized at a single region of the large subunit by immune electron microscopy. J. Mol. Biol., 126: 123-140.
Studier, F. W., Rosenberg, A. H., Dunn, J. J., and Dubendorff, J. W. (1990) Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol., 185: 60-89.
Subramanian, A. R. (1975) Copies of protein L7 and L12 and heterogeneity of the large subunit of Escherichia coli ribosome. J. Mol. Biol., 95: 1-8.
Subramanian, A. R., and Dabbs, E. R. (1980) Functional studies on ribosomes lacking protein L1 from mutant E. coli. Eur. J. Biochem., 112: 425-430.
Sullivan, S., and Gottesman, M. E. (1992) Requirement for E. coli NusG protein in factor-dependent transcription termination. Cell, 68: 989-994.
Sullivan, S., Ward, D., and Gottesman, M. (1992) Effect of Escherichia coli nusG function on N-mediated transcription antitermination. J. Bacteriol., 174: 1339-1344.
Suzuki, K., Olvera, J., and Wool, I. G.
(1990) The primary structure of rat ribosomal protein L12. Biochem. Biophys. Res. Commun., 172: 35-41.
Swindle, J., Zylicz, M., Georgopoulos, C., Li, J., and Greenblatt, J. (1988) Purification and properties of the NusG protein of Escherichia coli. J. Biol. Chem., 263: 10229-10235.
Swofford, D. L. (1993) PAUP Version 3.1 (Laboratory of Molecular Systematics, Smithsonian Institution, Washington, DC).
Thomas, M., and Nomura, M. (1987) Translational regulation of the L11 ribosomal protein operon of Escherichia coli: mutations that define the target site for repression by L1. Nucleic Acids Res., 15: 3085-3096.
Tiboni, O., Cantoni, R., Creti, R., Cammarano, P., and Sanangelantoni, A. M. (1991) Phylogenetic depth of Thermotoga maritima inferred from analysis of the fus gene: amino acid sequence of elongation factor G and organization of the Thermotoga str operon. J. Mol. Evol., 33: 142-151.
Titus, D. E. (1991) Promega Protocols and Application Guide, Second Edition, Promega Corporation, Madison, WI.
Uchiumi, T., Wahba, A. J., and Traut, R. R. (1987) Topography and stoichiometry of acidic proteins in large ribosomal subunits from Artemia salina as determined by crosslinking. Proc. Natl. Acad. Sci. USA, 84: 5580-5584.
Wächtershäuser, G. (1992) Groundworks for an evolutionary biochemistry: the iron-sulphur world. Prog. Biophys. Mol. Biol., 58: 85-201.
Walter, M. R., Buick, R., and Dunlop, J. S. R. (1980) Stromatolites 3,400-3,500 myr old from North Pole area, Western Australia. Nature, 284: 443-445.
Walter, M. R. (1983) Archean stromatolites: evidence of the Earth's earliest benthos. In Earth's Earliest Biosphere: its origin and evolution (Schopf, J. W. ed.), pp. 187-213, Princeton University Press, Princeton, NJ.
Warner, J. R. (1989) Synthesis of ribosomes in Saccharomyces cerevisiae. Microbiol. Rev., 53: 256-271.
Watanabe, K., Chishiro, K., Kitamura, K., and Suzuki, Y.
(1991) Proline residues responsible for thermostability occur with high frequency in the loop regions of an extremely thermostable oligo-1,6-glucosidase from Bacillus thermoglucosidasius KP1006. J. Biol. Chem., 266: 24287-24294.
Wedler, F. C., and Hoffman, F. M. (1974) Glutamine synthetase of Bacillus stearothermophilus. II. Regulation and thermostability. Biochemistry, 13: 3215-3221.
Wedler, F. C., and Merkler, D. J. (1985) Thermostabilization of B. caldolyticus glutamine synthetase by intrinsic and extrinsic factors. In Curr. Top. Cell Regul., 26: 263-280.
Weiner, A. M., and Maizels, N. (1991) The genomic tag model for the origin of protein synthesis: Further evidence from the molecular fossil record. In Evolution of Life: Fossils, Molecules and Culture (Osawa, S. and Honjo, T. eds.), pp. 51-66. Springer-Verlag, Tokyo.
Weiner, A. M., and Maizels, N. (1987) 3' terminal tRNA-like structures tag genomic RNA molecules for replication: Implications for the origin of protein synthesis. Proc. Natl. Acad. Sci. USA, 84: 7383-7387.
Weiner, A. M. (1987) The origin of life. In Molecular Biology of the Gene (Watson, J. D., Roberts, J. W., Steitz, J. A., and Weiner, A. M. eds.), pp. 1098-1160. Benjamin Cummings, Menlo Park, CA.
Whalen, W., Wolska, K., Devito, J., and Das, A. (1992) NusG is a multifunctional transcription factor that positively regulates both termination and antitermination. Abstract of the Cold Spring Harbor Conference on Bacteria and Phages, August, 1992, pp. 17.
Whalen, W., and Das, A. (1988) NusA protein is necessary and sufficient in vitro for phage λ N gene product to suppress a Rho-independent terminator placed downstream of nutL. Proc. Natl. Acad. Sci. USA, 85: 2494-2498.
Wheelis, M. L., Kandler, O., and Woese, C. R. (1992) On the nature of global classification. Proc. Natl. Acad. Sci. USA, 89: 2930-2934.
Wiegel, J. (1990) Temperature spans for growth: hypothesis and discussion. FEMS Microbiol. Rev., 75: 155-170.
Wigboldus, J. D.
(1987) cDNA and deduced amino acid sequence of Drosophila rp21C, another "A" type ribosomal protein. Nucleic Acids Res., 15: 10064.
Windberger, E., Huber, R., Trincone, A., Fricke, H., and Stetter, K. O. (1989) Thermotoga thermarum sp. nov. and Thermotoga neapolitana occurring in African continental solfataric spring. Arch. Microbiol., 151: 506-512.
Wittmann-Liebold, B., Köpke, A. K. E., Arndt, E., Kromer, W., Hatakeyama, T., and Wittmann, H.-G. (1990) Sequence comparison of ribosomal proteins and their genes. In The Ribosome: Structure, Function and Evolution (Hill, W. E., Dahlberg, A., Garrett, R. A., Moore, P. B., Schlessinger, D. and Warner, J. R. eds.), pp. 598-616. American Society for Microbiology, Washington, DC.
Woese, C., Kandler, O., and Wheelis, M. (1990) Towards a natural system of organisms: Proposal for the domains Archaea, Bacteria, and Eucarya. Proc. Natl. Acad. Sci. USA, 87: 4576-4579.
Woese, C. R. (1987) Bacterial evolution. Microbiol. Rev., 51: 221-270.
Woese, C. R., and Olsen, G. (1986) Archaebacterial phylogeny: perspectives on the urkingdoms. System. Appl. Microbiol., 7: 161-177.
Woese, C. R. (1980) Just so stories and Rube Goldberg machines: speculations on the origin of the protein synthesis machinery. In Ribosomes: Structure, Function, and Genetics (Chamblis, G., Craven, G. R., Davies, J., Davis, K., Kahan, L. and Nomura, M. eds.), pp. 357-373. University Park Press, Baltimore.
Wool, I. G., Endo, Y., Chan, Y.-L., and Glück, A. (1990) Structure, function and evolution of mammalian ribosomes. In The Ribosome: Structure, Function and Evolution (Hill, W. E., Dahlberg, A., Garrett, R. A., Moore, P. B., Schlessinger, D. and Warner, J. R. eds.), pp. 203-214, American Society for Microbiology, Washington, DC.
Wrba, A., Jaenicke, R., Huber, R., and Stetter, K. O. (1990) Lactate dehydrogenase from the extremely thermophilic Thermotoga maritima. Eur. J. Biochem., 188: 195-201.
Wrba, A., Schweiger, A., Schultes, V., Jaenicke, R., and Zavodszy, P.
(1990) Extremelythermostable D-glyceraldehyde-3-phosphate dehydrogenase from EubacteriaThermotoga maritima . Biochemistry, 29: 7584-7592.Yager, T. D., and von Hippel, P. H. (1987) Transcript elongation and termination inEscherichia coli. In Escherichia coli and Salmonella typhimurium: cellular andmolecular biology (Neidhardt, F. C., Ingraham, J. L., Low, K. B., Schaecher, M.,and Umbarger, E., eds.), pp. 1241-1275. American Society for Microbiology,Washington, DC.Yamagichi, A., and Oshima, T. (1990) Circular chromosomal DNA in the sulfur-dependent archaebacterium Sulfolobus acidocaldarius. Nucleic Acids Res., 18:1133-1136.Yates, J. L., and Nomura, M. (1981) Feedback regulation of ribosomal proteinsynthesis in E. coli: Localization of the mRNA target sites for repressor action ofribosomal protein L11. Cell, 24: 243-249.Yang, X-Y. H., Schulz, H., Elzinda, M., and Yang, S.-Y. (1991) Nucleotide sequence ofthe promoter and fadB gene of the fadBA operon and primary structure ofmultifunctional fatty acid oxidation protein from Escherichia coli. Biochemistry,30: 6788-6795. Zhang, H., Scholl, • • •.• IP^P. 41 • • '^•^•sequencing as a choice for DNA sequencing. Nucleic Acids Res.,16: 1220.160Zillig, W., Holz, I., Janekovic, D., Klenk, H. P., Imsel, E., Trent, J., Wunderl, S.,Forjaz, V. H., Coutinho, R., and Ferreira, T. (1990) Hyperthermus butylicus, ahyperthermophilic sulfur-reducing archaebacterium that ferments peptides. j.Bacteriol., 172: 3959-3965.Zillig, W., Holz, I., Klenk, H. P., Trent, J., Wunderl, S., Janecovic, D., Erwin, J., andHaas, B. (1987) Pyrpcoccus woeseii, sp. nov., an ultra-thermophilic marinearchaebacterium representing a novel order, Thermococcals. System. Appl.Microbiol., 9: 62-70.Zillig, W., Yeates, S., Holz, I., Bock, A., Gropp, F., and Simon, G. (1986)Desulfurolobus ambivalens gen. nov., sp. nov., an autotrophic archaebacteriumfacultatively oxidizing or reducing sulfur. System. Appl. 
Microbiol., 8: 197-203.Zillig, W., Gierl, A., Schreiber, G., Wunderl, S., Janecovic, D., Stetter, K. 0., andKlenk, H. P (1983) The archaebacterium Thermophilum pendens represents anovel genus of thermophilic, anaerobic sulfur spring Thermoproteales. System.Appl. Microbiol., 4: 79-87.Zillig, W., Stetter, K. 0., Prangishvilli, D., Schafer, W., Wunderl, S., Janecovic, D.,Holz, I., and Palm, P. (1982) Desulfurococcaceae, the second family of theextremely thermophilic, anaerobic, sulfur-respiring Thermoproteales. Zentralbl.Bakteriol. Hyg., Abstr. 1 Orig. C3, 304-317.Zillig, W., Stetter, K. 0., Schafer, W., Janecovic, D., Wunderl, S., Holz, I., and Palm,P. (1981) Thermoproteals: a novel type of extremely thermophilic anaerobicarchaebacteria isolated from Icelandic solfataras. Zentralbl. Bakteriol. Hyg.,Abstr.. 1 Orig. C2, 205-227.Zillig, W., Stetter, K. 0., Wunderl, S., Schulz, W., Priess, H., and Scholz, I. (1980) TheSulfolobus- "caldariella" group: taxonomy on the basis of structure of DNA-dependent RNA polymerases. Arch. Microbiol., 125: 259-269.Zimmerman, R. A., Thurlow, D. L., Finn, R. S., March, T. L., and Ferrett, L. K. (1980)Conservation of specific protein RNA interactions in ribosomal evolution. InGenetics and Evolution of RNA polymerase, tRNA and ribosomes (Osasa, S.,Ozeki, H., Uchida, H. and Yura, T. eds.), pp. 569-584, University of Tokyo Press,Tokyo.
