UBC Theses and Dissertations

Analysis of the s-layer transporter mechanism and smooth lipopolysaccharide synthesis in caulobacter… Awram, Peter Alan 2000

Analysis of the S-layer Transporter Mechanism and Smooth Lipopolysaccharide Synthesis in Caulobacter crescentus

by Peter Alan Awram

B.Sc., The University of British Columbia, 1992

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (Department of Microbiology and Immunology)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
November 1999
© Peter Alan Awram, 1999

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia, Vancouver, Canada

Abstract

C. crescentus is a Gram-negative bacterium that possesses an hexagonal array called the S-layer that covers the entire outer surface of the bacterium. This array is composed of an estimated 60 000 copies of the 98 kDa protein RsaA. RsaA secretion is directed by a C-terminal secretion signal located in the last 82 amino acids of the protein. Once RsaA is secreted from the cell, it assembles into the S-layer and attaches to the outer membrane via a specific species of smooth lipopolysaccharide (S-LPS). The mechanisms required for the secretion of RsaA and the synthesis of the S-LPS were examined in this thesis.

Tn5 mutagenesis of wildtype C.
crescentus demonstrated the presence of two genes, rsaD and rsaE, 3' of the rsaA gene that were required for transport of RsaA. These genes were isolated and are capable of complementing the Tn5 mutations 3' of rsaA in trans. The protein products of rsaD and rsaE belong to the type I secretion family that uses three components, an ATP-Binding Cassette transporter (RsaD), a Membrane Fusion Protein (RsaE) and an outer membrane protein (OMP), to secrete proteins through both membranes of Gram-negative bacteria. The OMP of the Rsa system, RsaF, was found by screening the partial Caulobacter genome sequence for sequence identity to other type I OMPs. The gene for RsaF is found 5 kb 3' of rsaE. Deletion of the N-terminus or C-terminus of RsaF prevents the Rsa secretion mechanism from functioning.

The secretion of the S-layer subunits in a number of other Caulobacter species was also examined. A partial ORF from FWC27 with 44.6% identity to RsaA was isolated. In addition, the ABC-transporter components from FWC6, FWC8 and FWC39 were isolated. These components were >95% identical to RsaD. These results were used to explore the evolutionary relationships between the different Caulobacter species.

Eighteen Tn5 mutations resulting in the inability of the S-layer to attach to the surface of the bacterium were also isolated. Southern blot analysis demonstrated that twelve of these insertions were linked to the Rsa transporters. The Tn5 insertion points were isolated and sequenced, allowing identification of several putative genes involved in S-LPS synthesis from the Caulobacter genome sequence. A total of twelve open reading frames (ORFs) were found by Tn5 mapping and two more were found 3' of rsaE. Six of these putative genes may code for proteins involved in the synthesis of sugar residues, including five that make perosamine.
Five of the genes appear to encode glycosyltransferases involved in forming the linkages between sugar residues in the O-antigen. One of the genes appears to encode a repressor, while the remaining genes are unidentified. These data suggest that the major component of the O-antigen is perosamine and that a number of different linkages are made between the perosamine residues.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
List of Abbreviations
List of Species Abbreviations
Acknowledgements
CHAPTER 1  Introduction
CHAPTER 2  Materials and Methods
CHAPTER 3  Secretion of RsaA
CHAPTER 4  Identification of the Outer Membrane Protein Component of the RsaA Transport Complex
CHAPTER 5  Identification of the S-layer Subunit and Transporter genes in Freshwater Caulobacter species
CHAPTER 6  Identification of Genes involved in the Synthesis of the O-antigen of C. crescentus
CHAPTER 7  Conclusion and Future Considerations
Bibliography
Appendix 1  RAT fragment - rsaADEF and lpsABCDEF
Appendix 2  ATC15252 S-layer subunit and transporter genes
Appendix 2  Sequences of lpsGHIJK, orf1 and orf2

List of Tables

Table 2-1  Strains and Plasmids used in this study
Table 2-2  Primers used for PCR for this report
Table 5-1  Differences between the Rsa genes found in lab strains
Table 5-2  FWC species secreting alkaline protease
Table 5-3  Southern Blot Banding patterns of different FWC species
Table 5-4  BLAST alignment of RsaA with itself
Table 6-1  Southern blot of Shedder mutants using EcoRI
Table 6-2  Southern blot of Shedder mutants using SstI
Table 6-3  List of shedder mutants
Table 6-4  Deduced proteins involved in O-antigen synthesis
Table 6-5  Characteristics of the S-LPS synthesis genes

List of Figures

Figure 1-1  Shed S-layer from C. crescentus
Figure 1-2  Developmental cycle of C. crescentus
Figure 1-3  3-Dimensional reconstruction of the S-layer
Figure 1-4  Type I secretion system
Figure 2-1  Plasmids containing NA1000 chromosomal DNA
Figure 3-1  Colony Immunoblot
Figure 3-2  S-layer negative Tn5 insertions
Figure 3-3  Complementation of Tn5 mutants with rsaA
Figure 3-4  Genes 3' of rsaA
Figure 3-5  ClustalW alignment of ABC-transporters
Figure 3-6  ClustalW alignment of MFPs
Figure 3-7  Complementation of transport deficient mutants
Figure 3-8  Expression of prtB in C. crescentus
Figure 4-1  Alignment of OMP components
Figure 4-2  The two possible OMPs
Figure 4-3  Comparison of possible Rsa OMP components
Figure 4-4  DNA surrounding rsaA
Figure 4-5  OMPs similar to Rsa(973)
Figure 4-6  AprA secretion from C. crescentus
Figure 5-1  Alignment of partial FWC RsaD genes
Figure 5-2  ClustalW alignment of FWC 27
Figure 5-3  Dendrogram of FWC species
Figure 6-1  Shed S-layer from C. crescentus
Figure 6-2  Colony Immunoblot
Figure 6-3  S-LPS of shedding Tn5 mutants
Figure 6-4  S-LPS synthesis genes linked to rsaA
Figure 6-5  Genes interrupted by Tn5 insertions in shedder mutants
Figure 6-6  ClustalW alignment LpsB
Figure 6-7  ClustalW Alignment of LpsD and LpsE
Figure 6-8  ClustalW alignment of LpsF
Figure 6-9  ClustalW alignment of LpsJ
Figure 6-10  ClustalW alignment of LpsK
Figure 6-11  Perosamine Synthesis Pathway

List of Abbreviations

ABC  ATP-Binding Cassette
ATP  adenosine triphosphate
BLAST  Basic Local Alignment Search Tool
C-terminus  carboxy terminus
DNA  deoxyribonucleic acid
EDTA  ethylene diaminetetra-acetic acid
EGTA  ethylene glycol-bis(β-aminoethyl ether) N,N,N',N'-tetraacetic acid
G+C  guanosine and cytosine content of DNA
FWC  freshwater Caulobacter
HCl  hydrochloric acid
KDO  ketodeoxy octulosonic acid
kDa  kilodalton
Km  kanamycin
LPS  lipopolysaccharide
min  minute
MFP  membrane fusion protein
mg  milligram
ml  millilitre
µl  microlitre
µg  microgram
NaCl  sodium chloride
NeuNAc  N-acetyl neuraminic acid (sialic acid)
NAD  nicotinamide adenine dinucleotide
NMR  nuclear magnetic resonance
N-terminus  amino terminus
NTG  1-methyl-3-nitro-1-nitrosoguanidine
O-antigen  antigenic determinant found on the outside of cell consisting of repeating units of oligosaccharides
ORF  open reading frame
OMP  outer membrane protein
PAGE  polyacrylamide gel electrophoresis
PCR  polymerase chain reaction
PYE  peptone yeast extract
RNA  ribonucleic acid
S-layer  surface layer
S-LPS  smooth lipopolysaccharide of C. crescentus
SDS  sodium dodecyl sulphate
Sm  streptomycin
Tc  tetracycline
Tm  melting temperature of two strands of DNA
TIGR  The Institute for Genomic Research
Tris  Tris (hydroxymethyl) methylamine
UV  ultra violet light

List of Species Abbreviations

B. pertussis  Bordetella pertussis
B. melitensis  Brucella melitensis
C. crescentus  Caulobacter crescentus
C. fetus  Campylobacter fetus
C. jejuni  Campylobacter jejuni
E. chrysanthemi  Erwinia chrysanthemi
E. coli  Escherichia coli
P. aeruginosa  Pseudomonas aeruginosa
S. enterica  Salmonella enterica
S. marcescens  Serratia marcescens
V. cholerae  Vibrio cholerae
R. meliloti  Rhizobium meliloti
R. leguminosarum  Rhizobium leguminosarum

Acknowledgements

I would like to acknowledge myself for persevering throughout this process. I would further like to thank my wonderful girlfriend Bianca Kuipers for being supportive during this time. I would like to thank John Nomellini and Stephen Walker for helpful suggestions and thoughtful insights and the occasional gel along the way. I would also like to thank all my 'partners in pain' that started with me.

Chapter 1
Introduction

This thesis focuses on the secretion and attachment of the S-layer of Caulobacter crescentus. S-layers are not well understood and have not been studied extensively even though they are found on a wide range of prokaryotes (Messner and Sleytr, 1992; Sleytr et al., 1993; Sleytr and Sara, 1997). Consequently, there is a need for basic research to describe these structures. Despite this lack of study, some research has been done on the commercial aspects of S-layers (Sleytr et al., 1997a). The research presented here is applicable to both of these areas. It is of general interest to know the methods of secretion and attachment of the S-layer and this information can also be applied to the commercial aspects of S-layers.

Evidence is presented that the S-layer subunit of C. crescentus is secreted by a type I secretion mechanism and that the S-layer subunits of a number of other Caulobacter species are probably secreted by an almost identical type I mechanism. Also presented are several putative proteins involved in the synthesis of the O-antigen that support the predicted composition of the O-antigen as being a polymer of a 4,6-dideoxy-4-amino-hexose with complex linkages (Walker et al., 1994; Smit unpublished).
Furthermore, these data suggest that the 4,6-dideoxy-4-amino-hexose is perosamine and that a number of glycosyltransferases provide complex linkages between the perosamine residues.

The S-layer of C. crescentus can be used as a biotechnology vehicle.

The S-layer is a 2-dimensional array made from approximately 60 000 copies of the protein RsaA (Smit et al., 1981). This layer covers the entire outer surface of the bacterium and makes up about 10% of the cell's protein. Therefore, RsaA must be secreted, passing through both membranes, from the Gram-negative cell. An uncleaved C-terminal secretion signal directs this secretion of RsaA (Bingle et al., 1999; Bingle et al., 1996; Bingle et al., 1997b; Bingle and Smit, 1994). Once secreted, the S-layer is attached to the outer membrane via the smooth lipopolysaccharide (S-LPS) (Walker et al., 1994). If the S-LPS is disrupted or absent, the S-layer detaches from the membrane and aggregates into particles that are up to 90% pure RsaA, making it easy to collect large amounts of relatively pure protein (Fig. 1-1). It has been found that the N-terminus of RsaA contains the attachment domain and that a C-terminal Ca2+-binding domain is responsible for aggregation of the protein (Bingle et al., 1997b).

Figure 1-1. Shed S-layer from C. crescentus. EM photograph of S-layer shed from a strain with defective S-LPS. (Photo courtesy John Smit)

To produce recombinant proteins it is desirable to produce large quantities that are easily isolated from the rest of the cellular protein. The properties of the C. crescentus S-layer and secretion apparatus allow this. The C-terminal secretion signal and Ca2+-binding domain can be fused to a desired protein, and recombinant proteins can then be secreted from C. crescentus by the RsaA secretion signal. The proteins aggregate together in the medium where they can be filtered away from the cells.
This process has been shown to be viable and recombinant proteins have been expressed and purified from C. crescentus (Bingle et al., 1997a).

S-layers also have other uses, such as the expression of epitopes in S-layers to be used for recombinant vaccines. Another aspect that is being examined is the use of the regular arrays formed by the S-layer as templates for the deposition of metal or silicon atoms to allow creation of circuitry finer than is allowed by current integrated circuit etching technology. It would also be possible to use the arrays as surface supports to which biologically active molecules could be attached (Sara and Sleytr, 1996a; Sara and Sleytr, 1996b; Sleytr et al., 1997a; Sleytr et al., 1997b; Sleytr and Sara, 1997). Obviously, all these uses could be applied to the S-layer of C. crescentus.

To increase the utility of C. crescentus S-layers for such applications it is vital to understand how the RsaA protein is secreted and attached to the surface. For example, it is necessary to understand the conformation of the protein when it is passing through the secretion apparatus. This will determine what kind of foreign proteins or epitopes can be secreted and are capable of forming aggregates using the RsaA secretion pathway. To answer some of these questions this thesis examines the RsaA secretion and S-LPS synthesis pathways.

C. crescentus is a Gram-negative, motile eubacterium found in soil and aquatic environments including drinking water.

The non-pathogenic bacterium derives its name from the crescent shape of the cells. C. crescentus undergoes a dimorphic developmental life cycle (for reviews see Brun et al., 1994; Gober and Marques, 1995; Poindexter, 1981; Shapiro, 1976; Shapiro and Losick, 1997) during which it switches between a motile (swarmer) phase and a sessile stalked phase (Fig. 1-2). In both phases the bacterium is completely covered by the S-layer (Smit et al., 1981). In the swarmer phase the cell expresses a single flagellum, pili and a holdfast (an adhesin) at one pole. When the cell differentiates into the stalked form, it loses the flagellum and a stalk (containing no cytoplasm) grows out from the cell envelope, keeping the holdfast on its tip. Stalked cells divide and produce a swarmer cell with the flagellum being created at the pole furthest from the stalked cell. Most of the current research on C. crescentus focuses on the developmental process resulting in these two different forms and the development of the flagellum (Brun et al., 1994; Roberts et al., 1996; Shapiro and Losick, 1997).

Figure 1-2. Developmental cycle of C. crescentus. Sessile cells attached to the surface via the holdfast bud off swarmer cells which move to a new location where they lose their flagellum and grow a stalk to attach to the surface again. (Figure courtesy Ian Bosdet.)

S-layers are two-dimensional arrays that cover the outside surface of many prokaryotes.

C. crescentus is one of many species of bacteria covered with a crystalline protein surface layer (S-layer) (Boot and Pouwels, 1996; Sleytr and Messner, 1983; Sleytr and Sara, 1997; Smit et al., 1981). Thousands of copies of nearly always a single protein or glycoprotein self-assemble into a crystalline-like lattice (Sleytr and Messner, 1983). The S-layers described so far have subunits ranging in size from 30 to 220 kDa (Messner and Sleytr, 1992). Although a large number of bacteria have been found to have S-layers, enteric bacteria, the most studied, lack them, and consequently S-layers have not been studied much (Hovmoller et al., 1988; Sleytr and Messner, 1988). For reviews on S-layers see Beveridge et al., 1997; Sleytr, 1992; Sleytr and Messner, 1983.
S-layers typically make up 10% of the protein in a cell and thus represent a large energy expenditure by the cell (Sleytr and Messner, 1983). Many bacteria have been found to lose their S-layers when there is no environmental pressure for maintenance, such as during sub-culturing in the laboratory, showing that S-layers are not essential for growth (Blaser et al., 1985; Borinski and Holt, 1990; Luckevich and Beveridge, 1989; Stewart and Beveridge, 1980). Considering the energy expenditure, the function of the S-layer must be required for survival in the normal environment of the bacterium. It is presumed that most S-layers have a protective barrier role because the pore-like structures formed by the layer likely act as molecular sieves and prevent the entry of molecules larger than the pore, such as proteases and lytic enzymes (Sleytr and Messner, 1983), as shown in several cases (Koval and Hynes, 1991; Sleytr, 1976). In addition, some infectious bacteria use their S-layers to adhere to and invade the cells of other organisms (Blaser et al., 1988; Messner and Sleytr, 1992; Munn et al., 1982). It has been demonstrated that the S-layer of C. crescentus protects it from a Bdellovibrio-like organism (Koval and Hynes, 1991), but the S-layer also acts as a receptor for the bacteriophage φCr30 (Edwards and Smit, 1991), showing that the S-layer also allows C. crescentus to be infected by a parasite.

S-layers have common features, such as an acidic pI, an absence of cysteine residues and a high number of hydroxylated amino acids. Subunits are held together and to the surface by noncovalent (hydrophobic, ionic, hydrogen or polar) bonds (Koval and Murray, 1984; Messner and Sleytr, 1992; Sleytr and Messner, 1983). Despite these similarities, there is very little sequence similarity among S-layer proteins (Gilchrist et al., 1992; Messner and Sleytr, 1992), suggesting that S-layers may have arisen by convergent evolution.

The S-layer of C.
crescentus is composed of the protein RsaA.

Six copies of RsaA form a ring-like subunit (Fig. 1-3) that interconnects with other subunits to form a two-dimensional hexagonal array (Smit et al., 1992). The gene for RsaA has been cloned (Smit and Agabian, 1984) and sequenced (Gilchrist et al., 1992). N-terminal protein sequencing of the mature RsaA polypeptide has shown that only the initial N-formyl methionine is cleaved, leaving a mature polypeptide of 1025 residues with a molecular weight of 98 kDa (Fisher et al., 1988; Gilchrist et al., 1992). The S-layer is anchored to the cell surface via a noncovalent interaction between the N-terminus of the protein and a specific smooth LPS in the outer membrane (Walker et al., 1994). Ca2+ is required for the proper crystallization of RsaA into the S-layer and its removal using EGTA disrupts S-layer structure (Nomellini et al., 1997; Walker et al., 1994).

Figure 1-3. 3-Dimensional reconstruction of the S-layer. The arrow indicates a single C-shaped RsaA monomer. (Figure from Smit et al., 1992).

RsaA is a true secreted protein.

RsaA must pass through both the inner and outer membranes to form the S-layer on the outer surface of the bacterium. As there is a large amount of RsaA (10 to 12% of the cellular protein), an efficient secretion system or a large number of transport complexes are required to secrete the protein during the 105 min generation time. Linker mutagenesis of RsaA has shown that the extreme N-terminus is required for surface attachment while the C-terminus is required for secretion. Further, deletion and hybrid protein analyses have indicated that secretion of RsaA relies on an uncleaved C-terminal secretion signal located within the last 82 amino acids of the RsaA protein (Bingle et al., 1999; Bingle et al., 1996; Bingle et al., 1997a; Bingle et al., 1997b; Bingle and Smit, 1994).
The presence of an uncleaved C-terminal secretion signal usually indicates secretion by a type I system (Binet et al., 1997; Salmond and Reeves, 1993) rather than a type II, III or IV system. Most Gram-positive bacterial S-layers have been shown to use the General Secretion Pathway (GSP) or Sec-dependent pathway (Pugsley, 1993) for export (Messner and Sleytr, 1992; Sleytr and Messner, 1988; Sleytr et al., 1993; Sleytr and Sara, 1997), whereas S-layer proteins in Gram-negative bacteria are secreted using a type II system (Boot and Pouwels, 1996), which also employs the GSP to transport the S-layer subunit across the inner membrane. Recently, it has been shown that the S-layer of Campylobacter fetus is secreted by a type I mechanism (Thompson et al., 1998) and an S-layer-like protein in Serratia marcescens with significant similarity to RsaA has been shown to use a type I secretion mechanism (Kawai et al., 1998).

In addition to the secretion signal, the C-terminal portion of RsaA also contains repeats of a glycine and aspartic acid-rich region which are thought to bind calcium ions (Gilchrist et al., 1992) and result in the aggregation of free RsaA in the medium. Such Ca2+-binding motifs are found in most proteins secreted by type I systems (Binet et al., 1997) and consist of a glycine/aspartate rich GGXGXD motif that repeats 4-36 times (Welch, 1991). C. crescentus has two groups of three repeats separated by 12-16 residues containing this motif. Interestingly, there are no obvious repeat regions in the S-layer of C. fetus (Thompson et al., 1998). It has been suggested that these motifs are important for the proper presentation of the secretion signal to the ABC transporter (Duong et al., 1996; Letoffe and Wandersman, 1992; Sutton et al., 1996).
Thus, in the case of RsaA, the glycine and aspartate rich repeats may function (along with Ca2+) both in maintaining the crystalline structure of the S-layer and in the secretion of the S-layer protein itself.

There are four described Gram-negative bacterial transport systems.

These systems have been named type I through type IV. The type I system requires 3 proteins that are thought to form a pore through the inner and outer membranes allowing the protein to be secreted. This is the method by which RsaA is secreted and it is discussed in depth below.

Type II systems use the GSP for export across the inner membrane and then use a complex of 12-14 proteins for secretion to the outside of the bacterium. The secretion substrates contain classical Sec-dependent N-terminal signal sequences that direct transport across the inner membrane by the Sec pathway (Pugsley, 1993). Proteins are transported across the cytoplasmic membrane in an unfolded state and then fold in the periplasm. This folding is necessary as the components for secretion seem to recognize the secondary or tertiary structure of the substrate as no sequence similarity has been found (Lu and Lory, 1996). Both ATP hydrolysis as well as proton motive forces appear to be required for secretion of the substrate (Feng et al., 1997; Letellier et al., 1997). For a review of type II secretion systems see Russel, 1998.

The auto-secreting proteins, such as the IgA proteases, like the type II secreted proteins, use the GSP to cross the inner membrane. These proteins have an N-terminal signal sequence and a C-terminal pro-sequence. They are exported across the cytoplasmic membrane by the Sec-dependent pathway in the usual manner with cleavage of the N-terminal signal sequence. The pro-sequence then forms a pore in the outer membrane through which the rest of the protein passes.
Once the protein is outside, autocatalytic cleavage of the pro-sequence occurs, releasing the protease from the cell (Pohlner et al., 1987).

Type III secretion has only been found in pathogens and is used to deliver bacterial proteins into the host cytoplasm to alter the host's metabolism to the advantage of the bacterium. Type III systems are the most complex of the secretion systems, involving more than 20 proteins. The proteins form a needle-like structure that spans the inner and outer membrane (Kubori et al., 1998). Before secretion can occur, the bacterium must make contact with the host cell. Secretion seems to be directed by the mRNA. It is thought that the mRNA forms a hairpin loop that obscures the translation start signal until the 5' region of the mRNA interacts with the secretion apparatus (Anderson and Schneewind, 1997). A signal recognition protein may mediate this process. Therefore, secretion is coupled with translation. ATP hydrolysis appears to be required for secretion, as components of type III systems are capable of hydrolyzing ATP in vitro (Eichelberg et al., 1994). The substrate may then pass through the needle structure to the outside of the cell, though this has not been proven. For reviews of type III secretion see Anderson and Schneewind, 1999; Galan and Collmer, 1999.

Type IV secretion systems have only recently been discovered and are not well understood. This transport pathway, like the type III, has so far been found exclusively in pathogens. The type IV system seems to have been designed to transport DNA, though the Bordetella pertussis Ptl system only transports proteins (Weiss et al., 1993). There are at least 9 proteins involved in the transport process and their functions are not well understood. There are usually two proteins containing nucleotide binding motifs that appear to be the transporting units that hydrolyze ATP to effect transport.
It is not known if the substrate is transported in a one step process where the substrate bypasses the periplasm or a two step process where the substrate is first transported to the periplasm and then a second transport process secretes the protein. For a review of type IV secretion see Burns, 1999.

RsaA is secreted by a type I mechanism.

The goal of this thesis was to elucidate the secretion mechanism of RsaA. Initial indications suggested that it was a type I secretion mechanism (i.e., a C-terminal secretion signal and the presence of glycine/aspartate rich repeats) and data are presented here directly demonstrating that RsaA is secreted by a type I mechanism. Figure 1-4 shows the predicted structure of the C. crescentus membrane and also serves as a general model of a type I mechanism. The best described type I secretion systems are those required for the secretion of Escherichia coli α-hemolysin (HlyA), Erwinia chrysanthemi metalloproteases (PrtB) and Pseudomonas aeruginosa alkaline protease (AprA) (Binet et al., 1997; Salmond and Reeves, 1993).

A type I secretion apparatus requires three components (Delepelaire and Wandersman, 1991). One component, the ABC transporter, is embedded in the inner membrane and contains an ATP-binding cassette (ABC). It has been shown that this component recognizes the C-terminal signal sequence of the substrate protein and hydrolyzes ATP during the transport process (Binet and Wandersman, 1995; Koronakis et al., 1993). Another component, the membrane fusion protein (MFP), is anchored in the inner membrane and appears to span the periplasm (Dinh et al., 1994). The remaining component is an outer membrane protein (OMP) that has been shown to interact with the MFP. It is thought that these three components form a channel that extends from the cytoplasm through the two membranes to the outside of the cell (Akatsuka et al., 1997; Hwang et al., 1997). The substrate may pass through this channel (probably in an unfolded state) to the outside of the cell.

Figure 1-4. Type I secretion system. Diagram of the hypothetical membrane architecture of C. crescentus showing the predicted type I secretion mechanism of RsaA. (Labelled components: S-layer, RsaA, outer membrane, inner membrane, ABC-transporter.)

In many cases, the genes for all three transport components are found immediately adjacent to the substrate gene(s) (Duong et al., 1992; Letoffe et al., 1990). In other type I systems, only the ABC-transporter and MFP genes are next to the substrate gene (Letoffe et al., 1994b; Mackman et al., 1985). The Rsa genes are organized like the latter and the OMP gene is not adjacent to the ABC-transporter and MFP. Recently, it was determined that the OMP gene is only separated from the MFP gene by five ORFs and a distance of 5 kb in the Rsa system. There are also instances where the substrate gene is separate from the secretion genes (Finnie et al., 1998; Scheu et al., 1992).

As shown in Figure 1-4, from analysis of the ABC transporters it is thought that the protein components work in multimers of at least 2. Some members of the ABC-transporter family, such as P-glycoprotein, contain two almost identical domains in tandem, each with its own membrane spanning and ABC region (Sheps et al., 1996). Association of two ABC transporters has been shown for monomeric ABC-transporters (Davidson and Nikaido, 1991). The proteins may work in pairs so that one ATP is hydrolyzed for transport and a second ATP is hydrolyzed to return the complex to the resting conformation. It is also possible that the proteins work in tandem and small sequential conformational changes in each separate protein push the proteins along (Welsh, 1998).
Recent work indicated that while the ABC-transporters may work as a dimer, the MFP may work as a hexamer and the OMP as a trimer (Holland, 1999; Koronakis et al., 1997).

The ABC-transporter family is very large and the type I secretion systems make up only a small portion. They are found in all forms of life and are sufficient to transport a substrate across a single membrane. There is significant sequence similarity among the ABC-transporters, even between eukaryotic and prokaryotic genes. The eukaryotic P-glycoprotein shares close to 50% conserved amino acids with many of the bacterial ABC-transporters such as HlyB and PrtD over the entire length of the protein (Croop, 1998; Sheps et al., 1996). Mammalian P-glycoproteins actually have more sequence identity to these prokaryotic transporters than to proteins considered to belong to the P-glycoprotein family. ABC-transporters are also involved in the import of substrates, such as the Mal transporter where maltose is transported across the inner membrane (for reviews see Boos and Shuman, 1998; Ehrmann et al., 1998; Nikaido, 1994).

The basic monomeric ABC-transporter consists of 2 domains. One domain, usually N-terminal and consisting of six to eight membrane spanning segments, recognizes the substrate and forms the pore through the membrane. The other domain contains the ABC region, which provides the energy for transport from the hydrolysis of ATP. The ABC domain is highly conserved and consists of about 215 amino acids and within this region there are four distinct motifs.
Like all ATPases, ABC-transporters contain Walker A or P-loop (consensus GXXGXGK[ST])¹ and Walker B (hhhhD)¹ motifs, which are directly involved in ATP binding and hydrolysis (Walker et al., 1984), but they are immediately followed by a specific ABC-transporter motif (LSGGQ[QRK]QR)¹ (Bairoch, 1992; Gorbalenya and Koonin, 1990), which is thought to be involved in energy transduction (Hyde et al., 1990). A fourth motif has recently been identified in a majority of E. coli and Saccharomyces cerevisiae ABC-transporters (Decottignies and Goffeau, 1997; Linton and Higgins, 1998). This fourth motif is hhhhH¹ followed by a charged residue and is found approximately 30 amino acids C-terminal of the aspartic acid in the Walker B motif.

¹ X denotes any amino acid; h denotes a hydrophobic amino acid; brackets indicate alternative amino acids at a single position.

To date, no one has been able to make a three-dimensional crystal of a complete ABC-transporter from which the structure could be determined. However, the ABC domain has been crystallized from two proteins (Armstrong et al., 1999; Hung et al., 1998), showing that the ABC domain forms an L shape with two arms: arm 1 binds ATP and arm 2 interacts with the membrane-spanning domain. It is thought that hydrolysis of ATP causes a conformational change in arm 2 which transfers the energy to the membrane-spanning domain, possibly through the ABC-transporter motif found at the end of arm 2, and that the conformational change in the membrane-spanning domain results in transport of the substrate (Welsh, 1998).

The MFP is characterised by a single hydrophobic transmembrane domain in the N-terminus that sits in the inner membrane. A hydrophilic domain spans the periplasm and the C-terminus consists of beta sheet that may interact with the outer membrane component (Dinh et al., 1994).
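The consensus patterns quoted above for the ABC domain (Walker A, Walker B and the ABC-transporter motif) can be written as regular expressions, which gives a quick way to scan a predicted protein for candidate motifs. The following is a minimal sketch only. The set of residues treated as hydrophobic ("h") is an assumption, since the text does not list them; the names `MOTIFS` and `find_motifs` are hypothetical; and the test string is an invented toy sequence, not a real protein.

```python
import re

# Assumption: residues counted as "hydrophobic" (h); the text does not define the set.
HYDROPHOBIC = "[AVLIMFWC]"

# Consensus patterns from the text, translated to regex syntax (X -> ".").
MOTIFS = {
    "Walker A": r"G..G.GK[ST]",          # GXXGXGK[ST]
    "ABC signature": r"LSGGQ[QRK]QR",    # LSGGQ[QRK]QR
    "Walker B": HYDROPHOBIC + r"{4}D",   # hhhhD
}

def find_motifs(seq):
    """Return {motif name: [(1-based start, matched text), ...]} for one sequence."""
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [(m.start() + 1, m.group()) for m in re.finditer(pattern, seq)]
    return hits

# Invented toy sequence for illustration only (not a real ABC-transporter).
toy = "MKT" + "GPSGSGKS" + "TLLR" + "LSGGQRQR" + "AA" + "LLILD" + "EPT"

for name, found in find_motifs(toy).items():
    print(name, found)
```

Short patterns such as hhhhD match by chance in many proteins, so a hit from a scan like this is only a starting point; the order and spacing of the motifs within the C-terminal domain would still need to be checked against an alignment.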
The MFP family contains the conserved motif [LIVM]XXG[LM]XXX[STGAV]X[LIVMT]X[LIVMT][GE]X[KR]X[LIVMFYW][LIVMFYW]X[LIVMFYW][LIVMFYW][LIVMFYW] (PROSITE: PDOC00469)¹.

The OMP sits in the outer membrane and interacts with the MFP. Of the known OMPs only TolC, from the α-hemolysin transporter, has been studied extensively. It has been found that three smooth LPS synthesis genes are required for secretion of α-hemolysin. It is likely that the smooth LPS is required for proper insertion of TolC in the membrane (Stanley et al., 1993; Wandersman and Letoffe, 1993). Two-dimensional crystals of TolC have been examined using electron microscopy and show that TolC forms a trimer. It also appears that a portion of the C-terminus is located in the periplasm (Koronakis et al., 1997). TolC contains a centrally located sequence of 44 amino acids that is highly similar to a sequence in HlyD (the MFP); these sequences are required for transport and can be interchanged and still allow transport (Schulein et al., 1994). Thus, TolC is thought to provide the essential function of linking the transporter complex to the external environment.

While members of the ABC-transporter family secrete a huge range of substrates, from Ca²⁺ ions to cancer drugs to proteins, the type I secretion subfamily has been found to secrete only proteins. The specific features required for secretion of a protein by a type I system are not known, except that the secretion signal is located in approximately the last 60 amino acids of the C-terminus of the protein (Mackman et al., 1985). As little as 15 amino acids of the C-terminus of the protease PrtG from E. chrysanthemi still allows secretion, although this is only 1% as efficient. It was found that substrates can be secreted by closely related type I systems (Binet and Wandersman, 1996; Letoffe et al., 1994a; Letoffe et al., 1994b), but only if there is more than 25% amino acid identity between the ABC-transporters of the systems (Delepelaire and Wandersman, 1990; Fath et al., 1991). No sequence similarity is found among the secretion signals of the different substrate proteins; however, in the proteases, lipases and NodO a conserved motif of a negatively charged amino acid followed by several hydrophobic amino acids has been found at the end of the C-terminus (Binet et al., 1997). The C-terminal signal sequence of α-hemolysin was extensively mutagenized, but few individual amino acids were found to affect secretion (Kenny et al., 1992). Because of this lack of sequence similarity and the failure to identify important residues, it is thought that the secretion signal relies on secondary structure to initiate transport. NMR and circular dichroism studies of the C-terminus of PrtG, HasA (the heme acquisition protein from Serratia marcescens), HlyA (the hemolysin from E. coli) and LktA (the leukotoxin from Pasteurella haemolytica) have shown that there are two α-helices in the C-terminus (Wolff et al., 1997; Wolff et al., 1994; Yin et al., 1995). Mutation of these α-helical regions in HlyA and LktA showed that the secretion signal appears to bind to a pocket in the ABC-transporter and induce a conformational change that causes transport to occur (Zhang et al., 1998).

Presented in this thesis is evidence that all three components of a type I secretion system are present in C. crescentus and that these components are required for the secretion of RsaA. They have greatest similarity to the protease type I secretion systems from P. aeruginosa and E. chrysanthemi, and the proteases from these systems can be secreted by the Rsa system.
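The 25% amino acid identity threshold mentioned above is a pairwise measure taken from an alignment of two ABC-transporters. The sketch below shows one way such a figure could be computed from two sequences that have already been aligned. It is an illustration under stated assumptions: identity is counted over columns where neither sequence has a gap (other conventions exist and the cited studies may have used a different one), the function name `percent_identity` is hypothetical, and the two aligned strings are invented toy data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over aligned columns that contain no gap.

    Assumption: gapped columns are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0

# Invented, pre-aligned toy sequences (not real transporter sequences).
seq_a = "ACDEFGHIKL"
seq_b = "ACDQFG-IKV"

identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f}% identity; above 25% threshold: {identity > 25}")
```

The choice of denominator matters most for short or heavily gapped alignments, so any comparison against a published threshold should use the same convention as the original report.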
The S-layer subunits from other Caulobacter species appear to be secreted by type I systems. Several FWC species with S-layers have been isolated from a wide range of aquatic sources (MacRae and Smit, 1991; Walker et al., 1992). The subunits of these S-layers react with anti-RsaA antibody and their smooth LPS reacts with antibody raised against the smooth LPS of NA1000. The S-layer subunits from these FWC species range in size from 100 to 193 kDa and can be removed from the bacterium's surface using low pH or EGTA (Walker et al., 1992). Portions of the genome of the FWC species with S-layers hybridize to the rsaA gene while the genomes of FWC species without S-layers do not (MacRae and Smit, 1991). It is shown in Ch. 5 that the protease AprA from P. aeruginosa was expressed and secreted in some of these FWC species. These facts suggest that type I secretion mechanisms secrete the S-layer subunits in the FWC species. Since the FWC species secrete S-layer subunits varying widely in size, it is desirable to compare the S-layer subunits and their corresponding secretion systems in order to determine how the mechanisms work, which parts of the protein are essential for secretion and which parts provide specificity. With these goals in mind, procedures are reported here for the characterisation of the S-layer subunit, ABC-transporter and MFP genes from various FWC species.

The S-layer is attached to the surface of C. crescentus using a species of smooth LPS. The outer membrane of Gram-negative bacteria contains phospholipids, proteins and LPS (Nikaido and Vaara, 1985). In many cases, including C. crescentus, there is also an extracellular polysaccharide (EPS) (Ravenscroft et al., 1991); the S-layer is external to all of these molecules (although the EPS may pass through the S-layer).
Smooth LPS is a major component of the outer membrane of Gram-negative bacteria and consists of three regions. The lipid A moiety is the endotoxic part of LPS and is anchored in the outer leaflet of the outer membrane. The core, a branched-chain oligosaccharide linked to ketodeoxyoctulosonic acid (KDO), is attached to the lipid A molecule. Extending from the core is the O-antigen, which contains a repeating linkage of oligosaccharides (Schnaitman and Klena, 1993). It has been found in C. crescentus that the S-LPS anchors the S-layer to the cell surface via a noncovalent interaction with the N-terminus of RsaA. Immunolabelling showed that the S-LPS is completely occluded by the S-layer (Walker et al., 1994). Isolation and characterization of the S-LPS showed that the core sugars and fatty acids are identical to those of the rough LPS and that the O-antigen is of a homogeneous length, unlike the variable-length S-LPS found in many enteric bacteria. Previous reports (Walker et al., 1994) indicated that the O-antigen was composed of a 4,6-dideoxy-4-amino-hexose, a 3,6-dideoxy-3-amino-hexose and glycerol, but recent results (Smit, unpublished) indicate that glycerol is a contaminant of the S-LPS isolation procedure, and that the 3,6-dideoxy-3-amino-hexose assignment is likely due to a co-purifying polymer. Therefore, it seems possible that the O-antigen is composed solely of a 4,6-dideoxy-4-amino-hexose. Anomeric traces found by analysis of proton NMR spectra indicate that the linkages between the 4,6-dideoxy-4-amino-hexose residues are not identical, implying the involvement of a larger number of glycosyltransferases than needed for a simple polymer with only one kind of linkage.

These data correlate with the information presented in this thesis. I have found a number of S-LPS synthesis genes, indicating that C. crescentus may make perosamine, a 4,6-dideoxy-4-amino-hexose, and that perosamine is likely a component of the S-LPS.
A number of glycosyltransferases were also found, as would be expected considering that several transferases would be required to produce the different linkages that give rise to the different anomeric proton traces found by proton NMR.

Evidence is presented in this thesis demonstrating how RsaA is secreted and how the S-LPS, involved in attachment of the S-layer, is synthesized. Three genes encoding the ABC-transporter, MFP and OMP of a type I secretion system required for secretion of RsaA in C. crescentus are described. A type I secretion system is also required for secretion of the S-layer subunits of other FWC species. The genes required for the secretion of RsaA and the synthesis of S-LPS are linked, leading to the discovery of a number of putative genes involved in the synthesis of the S-LPS required for S-layer attachment. Additional genes involved in synthesis of the S-LPS were discovered by Tn5 mutagenesis.

Chapter 2
Materials and Methods

Strains, plasmids and growth conditions. All strains, libraries and plasmids used in this study are listed in Table 2-1. Plasmids with NA1000 DNA inserts are listed in Figure 2-1. The E. coli strains DH5α, JM109 or RB404 were used for all E. coli cloning manipulations. E. coli was grown at 37°C in Luria broth (1% tryptone, 0.5% NaCl, 0.5% yeast extract), with 1.2% agar for plates. C. crescentus strains were grown at 30°C in PYE medium (0.2% peptone, 0.1% yeast extract, 0.1% CaCl₂, 0.2% MgSO₄, with 1.2% agar for plates). When appropriate, ampicillin was used at 100 μg/ml, streptomycin at 50 μg/ml and kanamycin at 50 μg/ml in both C. crescentus and E. coli; tetracycline was used at 0.5 μg/ml and 10 μg/ml, and chloramphenicol at 2 μg/ml and 20 μg/ml, in C. crescentus and E. coli, respectively.

Recombinant DNA manipulations. Standard methods of DNA manipulation and isolation were used (Sambrook et al., 1989). Electroporation of C.
crescentus was performed as previously described (Gilchrist and Smit, 1991). Southern blot hybridizations were done according to the membrane manufacturer's instructions (Amersham Hybond-N). Southern blot analysis allowing up to 30% mismatch between the probe and chromosomal DNA was performed in an identical manner except that the hybridization step was performed at 50°C instead of 65°C. Blots were washed twice for 15 min at room temperature with 2X SSPE (0.18 M NaCl, 0.01 M NaPO₄, 0.001 M EDTA, pH 8.0), 0.1% SDS, and once for 15 min at 50°C with 1X SSPE, 0.1% SDS. Radiolabeled probes were made by nick translation with DNase/DNA Pol according to the manufacturer's instructions (GIBCO/BRL). Chromosomal DNA was isolated as previously described (Yun et al., 1994).

PCR products were generated using the primers listed in Table 2-2. PCR was performed using Taq polymerase (BRL), following the manufacturer's suggested protocols. Annealing temperatures (Ta) 2°C below the melting temperature (Tm) of the

Table 2-1. Strains and plasmids used in this study
Relevant characteristics Bacterial strains E. coli JM109 RB404 DH5α C. crescentus NA1000 recA1, endA1, gyrA96, thi, hsdR17, supE44, relA1, Δ(lac-proAB), F', [traD36, proAB, lacIq, lacZΔM15] F- dam-3, dam-6, metB1, galK2, galT22, lacY1, thi-1, tonA31, tsx-78, mtl-1, supE44 recA1, endA1, gyrA96, thi, hsdR17, supE44, relA1, Δ(lacZYA-argF)U196 λ (φ80lacZΔM15) JS1001 Apʳ, syn-1000.
Variant of wild-type strain CB15, ATCC 19089, that synchronizes well S-LPS mutant of NA1000, sheds S-layer into medium  JS1003  NA1000 with rsaA interrupted with KSAC Km'cassette  JS3001 JS4000 Plasmids pBBRlMCS pBBRlAprF pBBRlPrtF pBBR3 pBBR3AprA pBBR3PrtB pBBR3AprA:pRAT5 pBBR3PrtB:pRAT5 pBBR3AprA: pCR2.1FUSall pBBR5 pBSKS+ pBSKS-gccl984 pCR2.1 pCR2.1FHSall pCR2.1FHXmal pCR2.1rsaF(1984) pJUEK72 pRATl pRAT4AH pRAT4AH: pBBR5 pRAT5 pRAT5 : pRK415 pRAT HI (B/E) pRK415 pRK415 rsaA APK pRUWSOO pRUW500: pRK415 pSUP2021 pTZ18UB:«aAAP pTZ18R and pTZ18U pTZ19U pTZ18U(CHE) pTZ19UASSm pTZ18RapM  pTZ19UASSmANACRsaF(973) pTZ19UASSm973circ pTZ18U(CHE)ANACRsaF(1984) pUC8 pUC9 ^oAANAC pUC8 neoR pTZ18R aprA : pRK415 Libraries NA1000 cosmid  S-LPS mutant of ATCC 15252, sheds Slayer into medium S-layer negative, derivative of ATCC 15252  Reference or Source  (Yanisch-Perron et al, 1985) (Brent and Ptashne, 1980)~ Life Technologies  (Edwards and Smit, 1991) (Edwards and Smit, 1991)  Cm', broad host range vector (Kovach etal, 1994) EcoRl-BamHl fragment containing aprF from pJUEK72 in pBBRlMCS this study HindW-Pstl fragment containing prtF from pRUWinh4 in pBBRlMCS this study Sm', broad host range vector this study aprA*, aprA cloned into pBBR3 using EcoRl and Pst 1 this study prtB*, prtB cloned into pBBR3 using this study aprA*, rsaD*, rsaE*, Ap , Sm , pBBR3AprA fused with pRAT5 at the Xbal this study site prtB*, rsaD*, rsaE*, Ap , Sm , pBBR3PrtB fused with pRAT5 at the Xbal site this study aprA*, rsaF*, pBBR3AprA fused with pCR2.1Fl ISall at the Xbal site this study r  r  r  r  Tc , broad host range, broad host range ColEl cloning vector, lacZ, Ap 736 bp PCR product containing valyl tRNA synthetase made using the primers gccl984-1407 and gccl984-I2143 and T-tailed into pBSKS Km', Ap , commercial T-tail cloning vector PCR product generated using Tn5 and Tn5Sall primers from ligation of FllTn5 chromosomal DNA cut with Sail in pCR2.1 PCR product generated using Tn5 and Tn5Xma primers 
from ligation of F l lTn5 chromosomal DNA cut with Xmal in pCR2.1 2.1kb PCR product generated using primers gccl984-28 and gccl984-I2310 aprD*, aprE*, aprF*, aprA*, aprl* rsaA*, rsaD*, rsaE*, Ap rsaA*, rsaD*, rsaE*, Ap , rsaA is under control of a lacZ promoter rsaA*, rsaD*, rsaE*, Ap , Tc , pBBR5 was fused with pRAT4AH at the Sstl site rsaD*, rsaE*, Ap' rsaD*, rsaE*, Ap , Tc', pRK415 was fused with pRAT5 at the Sstl site BamHl/ EcoRl fragment from pRATl cloned into pTZ18U lacZ*,Tc', broad host range rsaA under control of lacZ promoter in pRK415 prtB*, Ap' r  r  r  r  r  r  r  r  prtB*, Tc pRK415 was fused with pRUW500 at the Pstl site carries Tn5, unable to replicate in C. crescentus The wildtypepromoter ofrsaA has been replaced with a lacZ promoter Ap',ColEl cloning vector A phagemid version of pUC18 or pUC19 Cm', Ap' gene of pTZ18U replaced with Cm' gene Sm', Sm' gene inserted into Seal site in Ap' gene of pTZ19U aprA*, Ap' The EcoRl-Bglll fragment from pJUEK72 containing aprA was inserted into the EcoRl-BamUl sites of pTZ18R internal Kpnl-Pstl fragment of rsaF(913) in pTZ19UASSm r  this study Stratagene this study Invitrogen this study this study this study (Guzzo et al., 1990) this study this study this study this study this study (Keen et al, 1988) mis study (Delepelaire and Wandersman, 1990) this study (Simon et al, 1983) (Bingle et al, 1997) (Mead et al, 1986) this study this study this study this study  recircularized plasmid isolated from BamHl digestion of NA1000::pTZ19UASSmANAC-RsaF(973) internal Pvull-Stul fragment of r.s«F(1984) in Smal site of pTZ18U(CHE)  this study  ColEl cloning vector, lacZ, Ap'  (Vieira and Messing, 1982) (Bingle etal, 1996; Bingle and Smit, 1994) this study this study  rsaA missing the extreme N-terminus and C-terminus Hindlll-BamHl from Tn5 containing neomycin resistance gene in pUC8 aprA*, Tc' pRK415 was fused with pTZ18R at the BamHl site 1000 cosmids containing 20 - 25 Kb of NA1000 DNA  this study  (Alley etal, 1991)  17  
primers were used. Extension times (te) were based on 60 sec/1000 bp of DNA. General PCR parameters were 95°C for 30 sec, Ta for 30 sec, and 72°C for te. The vector pBSKS+ was cut at the EcoRV site and T-tailed (Holton and Graham, 1991), and the PCR product was ligated into this vector.

Cloning of chromosomal DNA adjacent to Tn5 insertions. Chromosomal DNA of the Tn5 mutant was cut with BamHI, SalI or XmaI. BamHI fragments were cloned directly into the BamHI site of pTZ18 vectors. A second method used for isolating the chromosomal DNA adjacent to the Tn5 insertions involved an inverse PCR method developed by V. Martin (Martin and Mohn, 1999).
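The cycling rules given above (annealing 2°C below the primer melting temperature, extension at 60 sec per 1000 bp) can be turned into a small helper for working out the parameters of a given reaction. This is a minimal sketch: the function name `pcr_cycle` is hypothetical, the melting temperature in the example is an invented value, and how Tm itself was estimated is not stated in the text, so it is simply taken as an input.

```python
def pcr_cycle(primer_tm_c, product_bp):
    """Return the three steps of one cycle as (temperature in °C, duration in sec).

    Rules taken from the text: Ta = Tm - 2 °C; te = 60 sec per 1000 bp.
    """
    anneal_c = primer_tm_c - 2             # annealing temperature, Ta
    extend_sec = 60 * product_bp / 1000    # extension time, te
    return [
        (95, 30),          # denaturation
        (anneal_c, 30),    # annealing
        (72, extend_sec),  # extension
    ]

# Example: a 2.1 kb product with a hypothetical primer Tm of 62 °C.
for temp_c, seconds in pcr_cycle(62, 2100):
    print(f"{temp_c} °C for {seconds:g} sec")
```

When the two primers of a pair have different melting temperatures, a common choice is to pass the lower of the two, although the text does not say which value was used.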
PCR product  forward primer name- sequence (5'-3')  reverse primer name- sequence (5'-3')  RAT5  RsaD-A-CGGAATCGCGCTACGCGCTGG  RsaE1-GGGAGCTCGAAGGGTCCTGA  Degenerate primers for RsaF search  F60(GC)CG(GC)(AGT)(GC)(GTC)(GC)(GC)(GC) (CT)T(CG)CT(CG)CC(CG)CAGCT(CG)G FB110CT(GC)(CA)(GC)CAG(AC)C(GC)AC)T(GC)T TCGAC  IF340GCCGCC(CG)(CGT)(TAG)(GA)(TA)A(GC)A (GT)(GC)GG(GC)AG(GC)(TCG)(TA)(CG)T IFB415CTG(TC)TC(GC)GC(GC)(AT)(CT)(GC)AG(G C)ACGTC  Inverse PCR to obtain chromosomal DNA next to Tn5 insertion  Tn5 universal GGTTCCGTTCAGGACGGCTAC  TnSXma 1 -AGGCAGCAGCTGAACCAA Tn5Sal1-ATGCCTGCAAGCAATTCG  Degenerate primers for amplification of internal portion of RsaD homologues in FWC species  RD43BTA(TC)ATGCT(GC)CAGGT(GC)TAT(GC)AC CGIG  IRD477BC(GC)A(GT)(GC)CGCTG(GC)CGCTGGCC GC  Unsuccessful PCR  RsaF140-GCGGTCGAGCAGGGGGTGCT  RsaFIEND-ACGAATCCTTGCGCGCCTTGG  Amplification of pUC  TZ1920-  TZI1060-  type vectors  GAGGCCTAGTACTCTGTCAGACCAAGTTT ACTCATA  GAGGCCTACTCTTCCI I I I ICAATATTATT GAA  Amplification of  Gcc1984-28-  G c d 984-11200-  gcc1984 (numbers  CGCTCTACACCGGCGGTCGCGCCAGCGC  correspond to bp in  G c d 984-1407-  GGAGCTCTGGCGCCCCACCAGGGACGC GTAGAACG  contig)  GCCGGAACCCGAACCTGAACCGGTGTCG  G c d 984-12143GTGGTCGGTGCCCGGCAGCCACAGGG  Amplification of gcc973 (numbers correspond to bp in contig)  Gcc973-1600-  Gcc973-I2310GCTGGCGCCCCACCAGGGACGCGTAGA ACG  of reaF(1984)  GGAATCCATGTCACATGGGAAGAGACGG TCCGCCGT
Table 2-2. Primers used for PCR for this report.

Construction of plasmid vectors that replicate in C. crescentus. The plasmid pBBR5 was constructed from the plasmids pBBR1MCS (Kovach et al., 1994) and pHP45Ω-Tc (Fellay et al., 1987). The Ω-Tc fragment from pHP45Ω-Tc was removed using HindIII and the ends were blunted using T4 polymerase. A 0.3 kbp portion of the Cmʳ gene was removed from pBBR1MCS by cutting with DraI and replaced with the blunted Ω-Tc fragment, producing a Tcʳ broad host range vector that replicates in C. crescentus.
The plasmid pBBR3 was constructed in an identical manner except that the plasmid pHP45Ω-Sm (Fellay et al., 1987) was used to provide a Smʳ marker. Both these plasmids were constructed by John Nomellini.

Construction of vectors that replicate only in E. coli. The vector pTZ18U(CHE) was constructed by amplification of all of pTZ18U except the Apʳ gene, using the primers TZ1920 and TZI1060, which were designed with StuI sites. The PCR product was cut with StuI and a Cmʳ gene (Morales et al., 1991) with blunt ends was inserted into the site.

Tn5 mutagenesis. Tn5 mutagenesis was accomplished using the narrow host range (ColE1 replicon) plasmid pSUP2021 (Simon et al., 1983), which is not maintained in C. crescentus. The plasmid was introduced by electroporation, and 20,000 colonies that were streptomycin and kanamycin resistant were pooled and frozen at -70°C; aliquots were used for subsequent screening. Southern blot analysis of chromosomal DNA isolated from the Tn5 library was used to assess the randomness of insertions. Hybridization with a Tn5 probe, pUC8neoR, indicated that while there were some hot spots of Tn5 integration, the Tn5 insertions were randomly distributed throughout the chromosome (data not shown).

SDS-PAGE and Western blot analysis. Proteins and S-LPS were isolated from C. crescentus as previously described (Walker et al., 1992; Walker et al., 1994). SDS-polyacrylamide gel electrophoresis (PAGE) and Western immunoblot analysis were performed as previously described (Walker et al., 1992). After transfer of proteins to nitrocellulose, the blots were probed with polyclonal antibody, and antibody binding was visualized using goat anti-rabbit serum coupled to horseradish peroxidase and colour-forming reagents (Smit and Agabian, 1984).

To detect C. crescentus whole cells synthesizing an S-layer, a colony blot assay was used (Bingle et al., 1997a).
Briefly, cell material was transferred to nitrocellulose by pressing the membrane onto the surface of an agar plate. The membrane was air dried for 10 to 15 min, washed in a blocking solution (3% skim milk powder, 20 mM Tris (pH 8.0), 0.9% NaCl) with vigorous agitation on a rotary shaker and then processed in the standard fashion (Bingle et al., 1997a).

Surface protein from C. crescentus cells was extracted using pH 2.0 HEPES buffer as described by Walker et al. (1992). To compare the amounts of surface protein extracted from different mutants, equal amounts of cells growing at log phase were harvested and equal amounts of the protein extract were loaded on the protein gel. SDS-PAGE and Western blotting were performed according to standard procedures (Sambrook et al., 1989).

Isolation of cosmids containing rsaA, rsaD and rsaE. The NA1000 and JS4000 cosmid libraries were probed with radiolabeled rsaA, using the plasmid pUC9 rsaAΔNΔC. Five cosmids were isolated from the NA1000 library and four from the JS4000 library. Southern blot analysis of the cosmids hybridizing to the probe was used to determine which cosmids contained DNA 3' of rsaA. An 11.7 kb SstI-EcoRI fragment containing rsaA plus 7.3 kb of 3' DNA was isolated from one of the NA1000 cosmids and cloned into the SstI-EcoRI site of pBSKS+; the resulting plasmid was named pRAT1. The 3' end of the cloned fragment consisted of 15 bp of pLAFR5 DNA containing Sau3A, SmaI and EcoRI sites. BamHI fragments from the NA1000 cosmid were subcloned into the BamHI site of vector pTZ18R for sequencing. The 3' end fragment was subcloned using BamHI and EcoRI into pTZ18R. The 5' end fragment was subcloned using SstI-HindIII into pTZ18R. A cosmid containing the rsaA, rsaD and rsaE genes was isolated from the JS4000 cosmid library and pieces were subcloned as BamHI fragments in pTZ18U for sequencing.
HindIII/BamHI fragments containing the rsaA gene were cloned directly from the genomes of JS4000 and JS3001 by isolating bands of the correct size from an agarose gel and ligating them to pUC8. Colonies were probed with rsaA from NA1000 to identify plasmids containing the correct insert. These clones were subcloned in three pieces, as HindIII/ClaI, ClaI/EcoRV and EcoRV/BamHI fragments, into pUC-type vectors. ClaI sites for cloning were generated in the vector by cutting with BamHI and filling in the 5' overhangs with Klenow fragment. Ligation of the blunt ends then produces a ClaI site.

Isolation of FWC S-layer subunit genes. FWC27 chromosomal DNA was digested with BamHI and PstI. The digested DNA was ligated to a pTZ19U vector also digested with BamHI and PstI. A portion of the ligation mixture was electroporated into E. coli JM109 and allowed to incubate at 37°C for 1 hour in 1 ml of Luria broth. The mixture was divided evenly, spread on 10 agarose plates and incubated overnight. The colonies were adsorbed to sterile filter paper (Whatman 541). The colonies were then lysed by soaking the filter paper in 0.5 M NaOH for 5 min. The filter paper was neutralized by soaking it twice in 1 M Tris-HCl (pH 7.0) for 5 min. The filter was then soaked in 0.5 M Tris-HCl (pH 7.0), 1.5 M NaCl. Finally, the filter was washed with 70% EtOH and baked at 80°C for 2 hours. The filters were then probed with pUC8neoR using the Southern blot hybridization procedure allowing 30% mismatch (see above).

Nucleotide sequencing and sequence analysis. Sequencing was performed on a DNA sequencer (Applied Biosystems model 373). After use of universal primers, additional sequence was obtained by "walking along" the DNA using 15-20 bp primers based on the acquired sequence. DNA was sequenced in both directions for all original sequence; thereafter, DNA was sequenced in both directions only when ambiguities were found.
Nucleotide and amino acid sequence data were analyzed using Geneworks and MacVector software (Oxford Molecular Group) and the NCBI BLAST e-mail server using the BLAST algorithm (Altschul et al., 1990). Primers were designed with the help of MacVector and Amplify 1.2 (Engles, 1993). Protein alignments were generated using the ClustalW algorithm as implemented by the MacVector software using the default settings. The sequences for NA1000 rsaADEF and lpsABCDEF were submitted to GenBank and can be accessed as AF06235. The sequences for JS3001 rsaA and JS4000 rsaADE can be accessed using the accession numbers AF193063 and AF193064. Preliminary sequence data of the C. crescentus genome was obtained from The Institute for Genomic Research through the website at http://www.tigr.org. Signal peptide predictions were made using the SignalP web server (http://www.cbs.dtu.dk/services/SignalP/) (Nielsen et al., 1997).

Chapter 3
Secretion of RsaA

Introduction

The major purpose of my thesis was to elucidate the transport pathway of RsaA. The strain NA1000 was chosen for these studies because rsaA had originally been isolated from NA1000 and it is this gene that has been sequenced and used for all recombinant manipulations in the Smit Lab. In addition, a number of useful mutants, with and without S-layers, have been derived from NA1000. The lack of a cleaved secretion signal, the presence of calcium repeats, the absence of a periplasmic intermediate and the presence of a C-terminal secretion signal indicated that RsaA was probably transported using a type I secretion system (Bingle et al., 1999; Bingle et al., 1996; Bingle et al., 1997a; Bingle et al., 1997b; Bingle and Smit, 1994), in which case other proteins would be required for secretion.

Results and Discussion

C. crescentus was screened for genes involved in the secretion of the S-layer subunit, RsaA.
Since a type I secretion system uses three main proteins to form the transport mechanism, it was necessary to devise a method for finding the genes coding for the components by screening for the loss of RsaA secretion. Unfortunately, there is no easy method to detect the presence of RsaA on the exterior of a colony, unlike α-hemolysin or the metalloproteases, which can be detected using blood or skim milk plates (Mackman et al., 1985; Wandersman et al., 1987). Previous research had shown that the lytic phage φCR30 could only infect C. crescentus when an S-layer was present (Edwards and Smit, 1991). This phage was isolated using the strain CB15BE, a derivative of ATCC 19089, as is NA1000. When the phage was used to lyse NA1000 cells with an S-layer using an moi of 10⁴, it was found that spontaneous mutants occurred at a high frequency of approximately 10⁻⁵. When these mutants were examined, it was found that approximately 15% had lost their S-layer while the remaining 85% still retained their S-layer and were susceptible to re-infection. Evidently, the phage was not lysing all the bacteria with an S-layer, since these bacteria still behaved like the wildtype strain. In the bacteria that no longer had an S-layer, RsaA secretion was restored when the rsaA gene was expressed from a plasmid (data not shown). It seems that the rsaA gene is a more likely target for mutation when selection pressure against the S-layer is applied. This is in agreement with the observation that many bacteria lose their S-layers during sub-culturing in the laboratory environment. This method was discarded in favour of a colony immunoblot assay, which was much more labour intensive but did not have a high background.

For the colony immunoblot assay, two polyclonal primary antibodies were used: α-RsaA (Walker et al., 1992) and α-S-LPS (Walker et al., 1994).
α-RsaA reacts to RsaA and α-S-LPS reacts to the smooth LPS required for the anchoring of the S-layer to the surface of the bacterium (Walker et al., 1994). When α-RsaA was used, colonies with an S-layer reacted with the antibody and appeared as a spot on the blot (Fig. 3-1). It was also found that a 'halo' could be detected around colonies when the S-layer could not anchor to the cells (e.g., cells with a defective S-LPS). The halo occurred when shed S-layer diffused away from the colony and was detected by α-RsaA as a ring around the colony (Fig. 3-1). When α-S-LPS was used, the antibody reacted to exposed S-LPS only when the cells of a colony lacked an S-layer; the S-layer blocks the binding of α-S-LPS. RsaA appears to be completely degraded when it is not secreted (Bingle et al., 1996; Bingle and Smit, 1994); therefore, cell lysis during this procedure and release of unsecreted RsaA was not a concern. Using this method, it was possible to differentiate between cells secreting RsaA, cells secreting and shedding S-layer and cells without an S-layer.

Figure 3-1. Colony immunoblot. Example of an immunoblot against colonies using α-RsaA demonstrating the different phenotypes exhibited. Colonies shown: NA1000 (wildtype), JS1003 (S-layer negative) and JS1001 (S-LPS negative).

Identification of Tn5 mutants lacking an S-layer. A pooled NA1000 Tn5 library was screened for S-layer negative mutants using the Western colony immunoblot assay. In total, 9,000 colonies from the pooled Tn5 mutant library were screened using α-S-LPS antibody and 22,000 colonies were screened using α-RsaA. Eighteen Tn5 S-layer negative mutants were found. SDS-PAGE and Western blot analysis of whole cell lysates and culture supernatants confirmed that no S-layer was found in or on the cells or in the culture supernatant of these mutants (data not shown).
One mutant, B12, was found on further examination to have an S-layer and was kept for use as a random Tn5 mutation control. Twenty-six Tn5 mutants with a shedding phenotype were also isolated during the screening and are described in Ch. 6.

Identification of Tn5 mutants defective in RsaA secretion. Several possible Tn5 insertion events, in addition to those in secretion genes, could result in an S-layer negative phenotype. To eliminate Tn5 insertions in the rsaA gene, Southern blot analysis was performed on the S-layer negative mutants. Eleven of the mutants contained Tn5 insertions in rsaA and were not further characterised. Five mutants, B5, B9, B13, B15 and B17, contained insertions in the DNA immediately 3' of rsaA, and one mutant, B2, had a Tn5 integration elsewhere on the chromosome (Fig. 3-2). These six mutants represented possible RsaA translocator mutants.

Figure 3-2. S-layer negative Tn5 insertions. Graphical representation of the positions of Tn5 insertions from mutants that no longer secreted RsaA. B = BamHI, H = HindIII, S = SstI. Triangles indicate Tn5 insertion points.

To determine whether the loss of S-layer was caused by a mutation affecting regulation of the gene, rsaA was expressed in the mutants under the control of a lacZ promoter, using the plasmid pRK415 rsaAΔPK. This construct restored RsaA production in JS1003 and B1, mutants with an interrupted rsaA gene, although wildtype RsaA expression levels were not reached. None of the five mutants with a Tn5 insertion in the DNA immediately 3' of rsaA secreted RsaA when rsaA was expressed in trans in this manner (Fig. 3-3). In addition, the one mutant (B2) in which the Tn5 insertion was not adjacent to the rsaA gene also produced an S-layer when complemented with the plasmid pRK415 rsaAΔPK. This indicates that the B2 insertion was not in a gene involved in RsaA secretion.
B2 may have an interruption in a gene responsible for regulation of RsaA production or, possibly, the Tn5 insertion mutation does not eliminate secretion and a second mutation in rsaA was responsible for the loss of secretion.

Figure 3-3. Complementation of Tn5 mutants with rsaA. Protein was extracted from the surface of the Tn5 mutants and JS1003 carrying the plasmid pRK415rsaAΔPK, which expresses RsaA under control of the lac promoter, and from wildtype and rsaA knockout mutants that did not contain any plasmid, to demonstrate differences in expression. Equal amounts of surface extracts were loaded on the gel and a Western blot was performed using polyclonal antibody against RsaA. Lanes 2 through 10 are surface extractions from cells containing the plasmid pRK415rsaAΔPK, indicated by (ΔPK). The lanes are as follows: 1, purified RsaA; 2, JS1003(ΔPK); 3, B9(ΔPK); 4, B13(ΔPK); 5, B1(ΔPK) (a Tn5 insertion in rsaA); 6, B5(ΔPK); 7, B15(ΔPK); 8, B17(ΔPK); 9, B2(ΔPK); 10, B12(ΔPK) (a random Tn5 insertion); 11, JS1003 (rsaA-); 12, NA1000 (wildtype). The arrow indicates wildtype RsaA.

Isolation and sequencing of DNA near rsaA. A previously constructed cosmid library was used to isolate an 11.8 kb DNA fragment containing rsaA plus 7.3 kb of 3' DNA. This fragment was cloned into pBSKS+, forming the plasmid pRAT1, and sequenced to search for translocator genes. An open reading frame (ORF) was found 5' of rsaA, confirming earlier results (Fisher et al., 1988), and 5 ORFs were found 3' of rsaA (Fig. 3-4).
A search of sequence databases showed that there were two ORFs immediately 3' of rsaA that encoded proteins with significant similarity to the ABC transporter and membrane fusion proteins (MFP) of two type I secretion systems: the alkaline protease transport system of P. aeruginosa (Guzzo et al., 1990) and the metalloprotease transport system of E. chrysanthemi (Letoffe and Wandersman, 1992) (Figs. 3-5, 3-6). The first ORF was 1734 bp long and started 246 bp after the termination codon of rsaA. This ORF was predicted to code for a 578 amino acid protein with a predicted molecular weight of 62.0 kDa and pI of 9.02. Alignments of the predicted amino acid sequence show that the putative protein is 46% identical and 69% similar to AprD from P. aeruginosa and 33% identical and 62% similar to PrtD from E. chrysanthemi. The gene was designated rsaD because of this similarity (Fig. 3-5).

Figure 3-4. Genes 3' of rsaA. Graphic showing the ORFs found after sequencing the plasmid pRAT1 (rsaA, rsaD, rsaE and LPS synthesis genes). B = BamHI, E = EcoRI, H = HindIII, S = SstI. Scale bar: 1 kb.

RsaD exhibits several N-terminal hydrophobic domains that may be transmembrane regions and a possible ATP binding site in the C-terminal half of the protein. The predicted protein contains Walker A, Walker B, and ABC signature motifs as well as the newly discovered E. coli motif (hhhhH). These motifs are highlighted in Fig. 3-5.
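As an illustration of how motifs such as those named above can be located in a predicted protein, the following is a minimal sketch and not the procedure used in this work. The regular expressions are commonly cited consensus forms and are assumptions here, the hydrophobic class 'h' is approximated by a fixed residue set, and TOY_PROTEIN is a made-up sequence rather than RsaD.

```python
import re

# Assumed consensus patterns for illustration only; published definitions vary.
# 'h' (hydrophobic residue) is approximated by the class below.
HYDROPHOBIC = "[AVLIMFWC]"
MOTIFS = {
    "Walker A": r"[AG].{4}GK[ST]",       # nucleotide-binding P-loop
    "ABC signature": r"LSGGQ",           # ABC transporter signature
    "Walker B": HYDROPHOBIC + r"{4}D",   # four hydrophobic residues, then Asp
}


def find_motifs(protein):
    """Return {motif name: [(1-based start position, matched text), ...]}."""
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [(m.start() + 1, m.group())
                      for m in re.finditer(pattern, protein)]
    return hits


# Hypothetical toy sequence (not RsaD) containing one copy of each motif.
TOY_PROTEIN = "MKTAGPSGSGKSTLARALSGGQRQRIALARALLVLDEPNS"

if __name__ == "__main__":
    for motif, found in find_motifs(TOY_PROTEIN).items():
        print(motif, found)
```

A scan of this kind only flags candidate sites; the assignments in Fig. 3-5 rest on the alignment with the other ABC transporters.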
[Figure 3-5: ClustalW formatted alignment of RsaD, AprD, PrtD, HasD and LipB; alignment not recoverable from the extracted text.]

Figure 3-5. ClustalW alignment of ABC-transporters. Alignment of RsaD with AprD (Accession number CAA05795), PrtD (AAB03671), HasD (CAA57069) and LipB (BAA08631), which are the most closely related ABC transporters. The green box surrounds the Walker A motif, the blue box surrounds the Walker B motif, the red box surrounds the ABC motif and the yellow box surrounds the fourth ABC transporter motif recently discovered in most E. coli ABC transporters.

[Figure 3-6: ClustalW formatted alignment of RsaE, AprE, PrtE, HasE and LipC; alignment not recoverable from the extracted text.]

Figure 3-6. ClustalW alignment of MFPs. Alignment of RsaE with AprE (Accession number CAA45856), PrtE (CAA37343), HasE (CAA57067) and LipC (BAA08632), which are the most closely related MFPs.

RsaD was predicted to have an insertion signal sequence consistent with insertion of the RsaD protein in the cytoplasmic membrane. The second ORF started 68 bp after rsaD, contained 1308 bp and encoded a protein of 436 residues with a predicted molecular weight of 48.4 kDa and pI of 6.59. Alignment of the predicted protein shows that the sequence is 28% identical and 50% similar to AprE from P. aeruginosa and 29% identical and 52% similar to PrtE from E. chrysanthemi. The gene was designated rsaE because of this similarity (Fig. 3-6). The deduced protein sequence of rsaE was predicted to have a typical N-terminal insertion signal sequence that would direct it to the inner membrane.
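The figures quoted above (ORF length, residue count, percent identity and percent similarity) can be related with a short sketch. This is an illustration under stated assumptions, not the method used here: the similarity groups are a simple hypothetical grouping, alignment programs differ in their substitution scoring and in the denominator they use, and the aligned fragments in the usage example are made up.

```python
# Hypothetical residue groups treated as "similar"; real programs use
# substitution matrices and may give different percentages.
SIMILAR_GROUPS = ["AVLIM", "FWY", "KRH", "DE", "ST", "NQ"]


def orf_to_residues(orf_length_bp):
    """Number of residues encoded by an ORF whose length excludes the stop codon."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3


def identity_and_similarity(aligned_a, aligned_b):
    """Percent identity and similarity over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gap columns (an assumption about the denominator)
        compared += 1
        if x == y:
            identical += 1
            similar += 1
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    print(orf_to_residues(1734), orf_to_residues(1308))
    print(identity_and_similarity("LSGGQRQR", "LSGGQKQK"))
```

The first call reproduces the residue counts stated in the text for the two ORFs (1734 bp and 1308 bp); the second only demonstrates the counting rule on a toy pair.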
Possible ribosome binding sites were found 7 bp and 8 bp upstream of the ATG initiation codon for rsaD and rsaE, respectively. There was no indication of a promoter immediately 5' of either rsaD or rsaE, but there was a putative rho-independent terminator immediately after the stop codon of rsaE, suggesting that they may be part of a polycistron which includes rsaA. It has been found in the type I secretion systems secreting E. coli α-hemolysin and E. chrysanthemi metalloprotease that the genes are part of an operon consisting of the substrate and the transport genes. It seems likely that transcription of the Rsa genes is similar.

Three more ORFs were found 3' of rsaE. None of these ORFs encoded proteins similar to the third component of type I secretion systems. Instead, these ORFs encoded proteins similar to those involved in synthesis of perosamine, a dideoxyaminohexose (see Ch. 6).

The chromosomal DNA near the B1, B2, B5, B9, B13, B15 and B17 Tn5 insertions was isolated and sequenced to determine the Tn5 insertion point. It was found that the B1 Tn5 interrupts rsaA, as expected from the Southern blot analysis. B5 and B13 are identical insertions interrupting the N-terminus of RsaD while B17 is located 22 amino acids from the C-terminus. B9 and B15 are Tn5 insertions in rsaE. The sequence interrupted by the B2 Tn5 insertion has no sequence similarity to any known proteins.

Complementation of the secretion-defective Tn5 mutants. To demonstrate that the Tn5 insertions were responsible for the secretion defect, the mutations were complemented in trans. First, the cosmid 17A7, containing the entire Rsa locus, was introduced into the mutants. All attempts at complementation using this cosmid were unsuccessful, including an attempt to restore RsaA production in JS1003 (which contains an inactivated rsaA gene).
Since RsaA production in JS1003 can be restored with other plasmids containing rsaA, it is believed that expression of the genes was too low for complementation. A PCR product containing the genes rsaD and rsaE was generated and cloned into a suitable expression vector; the result was named pRAT5:pRK415 (see Ch. 2). This plasmid was introduced into the Tn5 mutants B15 and B17. With this plasmid, mutant B17 secreted RsaA while the B15 mutant did not (Fig. 3-7A).

To address the problems with B15 complementation, a new tetracycline-resistant (Tc-resistant) broad host range vector, pBBR5, was constructed. It was hoped that this vector would have a higher copy number and expression of the Rsa genes that would alleviate the problems encountered when using pRK415 or pLAFR5 (the cosmid vector). In the resulting constructs a lac promoter is used for transcription of the rsaD and rsaE genes in pRAT5:pBBR5 and the rsaA, rsaD and rsaE genes in pRAT4AH:pBBR5. When pRAT5:pBBR5 was introduced into the mutants B1, B5, B9, B15 and B17, Western blot analysis showed that the mutants with defective rsaD or rsaE genes expressed RsaA on the surface while the rsaA mutant B1 did not (Fig. 3-7B). When pRAT4AH:pBBR5 was expressed in the same mutants, RsaA was only found on the surface of the B1 and B17 mutants (data not shown). The ability to complement the Tn5 insertions in rsaD and rsaE using pRAT5:pBBR5 expressing rsaD and rsaE in trans indicates that these genes are responsible for the secretion of RsaA.

Figure 3-7. Complementation of transport deficient mutants using rsaD and rsaE. Westerns of surface extracted protein using anti-S antibody. A) Lanes are as follows: 1, B17 (DE); 2, B15 (DE); 3, B1 (DE); 4, B17 (17A7); 5, B15 (17A7); 6, JS1003; 7, NA1000. (DE) indicates that the cells carried the plasmid pRAT5:pRK415 containing the genes rsaD and rsaE. (17A7) indicates that the cells carry the cosmid 17A7 containing the entire Rsa operon. Equal amounts of surface extract were loaded in all lanes. The arrow indicates full length RsaA. B) Lanes are as follows: 1, B1 (DE); 2, B5 (DE); 3, B9 (DE); 4, B15 (DE); 5, B17 (DE); 6, NA1000. (DE) indicates that the cells carry the plasmid pRAT5:pBBR5 expressing the genes rsaD and rsaE. Equal amounts of surface extract were loaded in all lanes except (6), where there was only one quarter of the amount loaded in the other lanes. The arrow indicates full length RsaA.

The lack of complementation in some cases was probably the result of lower expression of the Rsa genes. It was necessary to use Tc to maintain the vectors as Tn5 confers kanamycin and streptomycin resistance, but C. crescentus does not tolerate Tc well. When cells carrying the Tc resistance marker are exposed to even low levels of Tc (0.5 ng/ml), they appear anomalous by microscopy. The cells are often severely elongated and there are few motile cells. It was difficult to grow cultures carrying Tc-resistant plasmids with the Rsa genes to densities high enough to extract sufficient protein to be seen on the Western blot. It seems probable that the Tc was causing membrane abnormalities and that these factors contributed to lower expression of the Rsa genes with all the plasmids.

The cosmid 17A7 only has 1-2 copies per cell and, similarly, pRAT5:pRK415 would be maintained at 2-3 plasmids per cell (Keen et al., 1988). Preliminary experiments with pBBR5 suggest that it has a much higher copy number than either pLAFR5 or pRK415 based vectors, which would result in higher expression of any genes that pBBR5 carries (data not shown).
Expression levels would also be affected by the promoter transcribing the genes. The lac promoter transcribes at higher levels than the wildtype rsaA promoter (Yap et al., 1994). In addition, in the cosmid and pRAT4AH:pBBR5, rsaD and rsaE are either transcribed by their wildtype promoter or as part of the rsaA transcript as described above. In either case, a lesser amount of transcript would be produced than from the lacZ promoter of pRAT5:pBBR5. These data suggest why the complementation occurred only in some cases. The plasmid pRAT5:pBBR5 (strong promoter and high copy number) produced the highest levels of RsaD and RsaE, allowing full complementation of all the transport mutants, while the cosmid 17A7 (weaker promoter and low copy number) produced the lowest levels and could not complement any of the mutants. The plasmids pRAT5:pRK415 (strong promoter and low copy number) and pRAT4AH:pBBR5 (weak promoter and high copy number) probably make an intermediate amount of protein that is only enough to complement the mutant B17. This mutant may differ from the others because the Tn5 insertion is only 22 amino acids from the C-terminus [...]

[...] interrupted (Fig. 3-8). Smaller zones of clearing are seen around the wildtype strain, NA1000, and the S-layer producing B12 (representing a random Tn5 insertion unrelated to secretion), as compared to JS1003 or B1, where the rsaA gene has been interrupted, suggesting that there was competition between RsaA and PrtB for the secretion machinery, further supporting the supposition that RsaD and RsaE are parts of a type I secretion mechanism. Identical results were found when aprA was expressed in the Tn5 mutants (data not shown).

Figure 3-8. Expression of prtB in C. crescentus. PrtB was expressed in all the strains on plates containing 1% skim milk. Halos around colonies indicate that active PrtB is being secreted. Note that NA1000 and B12 cells are producing RsaA as well as PrtB and the halos surrounding these colonies are smaller. B12 represents a random Tn5 mutant control. Strains shown: JS1003 (rsaA-), NA1000 (wildtype), B17 (rsaD), B15 (rsaE), B9 (rsaE), B2, B12 and B1 (rsaA).

Summary

Analysis of the region 3' of rsaA revealed the presence of two genes (rsaD and rsaE) encoding proteins with significant sequence similarity to components of the type I secretion systems used by P. aeruginosa and E. chrysanthemi to secrete two different extracellular proteases (Duong et al., 1992; Wandersman et al., 1990). Because interruption of rsaD and rsaE eliminated secretion of RsaA and the defects could be restored by complementation, it was apparent that their gene products make up part of the RsaA translocator machinery.

When these results were reported (Awram and Smit, 1998), it was the first example of an S-layer that is secreted using a type I secretion system. Before then, S-layers had only been found to be secreted by a type II system (Messner and Sleytr, 1992; Sleytr et al., 1993). It is now known that a protein with amino acid sequence similarity to RsaA is secreted by the S. marcescens type I secretion system (Kawai et al., 1998). In addition, the C. fetus S-layer protein is secreted by a type I secretion mechanism. The C. fetus S-layer shares several features in common with that of C. crescentus. It is produced by a free-living Gram-negative bacterium, is hexagonally-packed, anchors to the cell surface via its N-terminus to a particular species of LPS (Bingle et al., 1997b; Dworkin et al., 1995; Walker et al., 1992) and so far has the greatest similarity of any S-layer protein to RsaA (Gilchrist et al., 1992).

The genes for the ABC transporter and the MFP components of type I secretion systems are generally found in an operon that includes the transported protein (Binet et al., 1997; Salmond and Reeves, 1993). In this respect, the organization of the rsaA, rsaD and rsaE genes was not surprising. In contrast, the gene encoding the outer membrane protein component of type I secretion systems may or may not be closely linked to the other secretion genes. The third component of the Rsa transporter has now been found 5 kb 3' of rsaE and is described in Ch. 4.

A potential Rho-independent terminator sequence is located after the rsaA coding region (Gilchrist et al., 1992). This predicted terminator results in a predicted transcript that matched closely the size of a transcript found using Northern blot analysis (Fisher et al., 1988).
In this study, no obvious indications of a promoter were found immediately 5' of either the rsaD or rsaE genes, suggesting that transcription of rsaD and rsaE is similar to transcription of the hlyA, hlyB and hlyD genes of E. coli, where a similar Rho-independent terminator is found after the hlyA gene and terminates most transcripts at this point. An anti-terminator, RfaH, prevents termination and when it does, a larger transcript including the hlyB and hlyD genes is made (Leeds and Welch, 1996). This transcript is difficult to detect because it has a short half-life, and an analogous transcript in C. crescentus may have been missed in the Northern blot analysis. Transcription of the E. chrysanthemi protease secretion genes appears to be accomplished by a similar method (Letoffe et al., 1990) and it is postulated that the same is true for the Rsa operon.

A transcription pattern like this may account for the reduced expression found in the JS1003 and B1 mutants when they are complemented with rsaA. The kanamycin fragment interrupting rsaA in JS1003 does not have a transcription terminator and transcription may continue through to the end of rsaE, resulting in a transcript 1.5 kb longer than the wildtype, which would likely be more unstable and result in fewer transport complexes. In B1, it is likely that rsaD and rsaE are transcribed off one of the Tn5 promoters, resulting in decreased amounts of transcript and, in turn, transport complexes.

Type I secretion systems can be grouped into families. The RTX toxins, such as α-hemolysin (E. coli) and leukotoxin (P. hemolytica), comprise one family while extracellular proteases (e.g. AprA, PrtB) and lipase from S. marcescens constitute another (Binet et al., 1997). Within the families there is high sequence similarity and functional secretion mechanisms can be constructed using components from the different members without a dramatic drop in protein transport.
Because it has been demonstrated that AprA and PrtB proteins can be secreted from C. crescentus in active form and there is higher sequence similarity between these proteins than with RTX toxins, RsaA can presumably be grouped with the protease family of type I secretion systems.

Chapter 4
Identification of the Outer Membrane Protein Component of the RsaA Transport Complex

Introduction

The gene encoding the OMP component of the RsaA secretion machinery proved difficult to isolate since it was not found immediately 3' of the MFP, as in many other type I systems. This difficulty has also been found with most of the other type I secretion systems where the OMP is separated from the rest of the transporter complex. In fact, the OMP has only been found in 2 other cases of this type: TolC, required for transport of α-hemolysin in E. coli (Wandersman and Delepelaire, 1990), and HasF, part of the heme transporter in S. marcescens (Binet and Wandersman, 1996). In both of these cases the experimenters had simple, efficient screens to look for mutants.

Several different strategies were considered to find the OMP component. As none of the original S-layer negative Tn5 mutants interrupted the OMP, and considering the number of mutants screened, it was believed that the NA1000 Tn5 library did not contain the mutant. The Tn5 library may not have been complete or a Tn5 insertion in the OMP may have been lethal. If a Tn5 insertion was lethal there was no further point in screening another Tn5 library. It seemed possible that a point mutant with reduced secretion, but not having a lethal phenotype, could be constructed. Since a UV/NTG point mutant library had been previously made by others, it was decided that this library could be screened for an OMP mutant.

Alternatively, a functional type I system could be reconstructed, as was done in E. coli using hasDE, the ABC-transporter and MFP genes from S. marcescens, and the OMP gene, tolC (Binet and Wandersman, 1996). This secretion apparatus was capable of secreting the S. marcescens heme-acquisition protein, HasA, as well as AprA and PrtB. The S. marcescens OMP gene, hasF, was then isolated by expressing a protease along with hasDE in an E. coli tolC mutant along with a plasmid library of S. marcescens chromosomal DNA, and screening for the presence of protease secretion on skim milk plates. It was hoped that a similar method would be capable of identifying the Rsa OMP gene.

A third option for finding the OMP was to screen by similarity to OMP components from other bacteria. There are two ways to approach this. One method is to search the genome of C. crescentus for DNA fragments hybridizing to the genes from OMP components. The other is to compare the sequences of different OMP components to find regions of similarity and design primers with degenerate sequences for PCR amplification of a portion of the OMP DNA sequence that can be used to isolate the complete gene by hybridization.

All of these approaches were attempted and are summarized below, but none worked. The OMP gene was eventually found using the partial C. crescentus genome sequence provided by The Institute for Genomic Research (TIGR). Two partial ORFs with similarity to OMP components from other bacteria were found in this sequence data and this information was used to devise strategies to clone the complete sequence and to test which of the two ORFs was a legitimate OMP gene involved in the secretion of RsaA.

Results and Discussion

Screening libraries for OMP mutants defective in secretion. Since the original immunoblot assay was very labour intensive, attempts were made to develop a new screening method for finding secretion deficient mutants.
The proteases, AprA and PrtB, are secreted by type I transporters and can be secreted by the Rsa secretion machinery, allowing skim milk plates to be used for rapid screening. Therefore, vectors carrying these genes were designed for screening the libraries. The plasmid pBBR3AprA:pRAT5 was constructed and consists of the aprA gene and the rsaDE genes under the control of separate lacZ promoters. The plasmid pBBR3PrtB:pRAT5 is identical to pBBR3AprA:pRAT5 except the aprA gene is replaced with prtB. When these plasmids were introduced into the UV/NTG mutant library, no secretion of AprA or PrtB was observed. The rsaDE genes had originally been included in the plasmid to exclude rsaDE mutants from being found during the screening process, but since the plasmid did not work the approach was dropped.

When the plasmids pBBR3AprA and pBBR3PrtB were used to express their respective proteases in the UV/NTG mutant library, a large number of colonies failed to show secretion of the proteases. When some of these colonies were examined, it was found that they were still capable of secretion of RsaA. This was an unexpected result, as expression of the proteases in NA1000 results in protease secretion from >99.9% of colonies. It was concluded that these proteases are not tolerated well by C. crescentus and could not be used as a screen. In agreement with this was the observation that C. crescentus colonies expressing the proteases could not be sub-cultured after growing for 5 days, while normally C. crescentus can be sub-cultured even after several weeks. It appeared that the proteases were killing the bacteria (see Ch. 5 for further discussion about protease expression in Caulobacter species). Without a rapid screening method, it was decided to drop screening of mutant libraries in favour of the other approaches.

Searching for the OMP using complementation systems.
If a complementation system was going to succeed in finding the OMP component, it was necessary to determine if a functional system could be constructed using the C. crescentus transporter components. In many other type I systems the components can be interchanged with components from other bacterial systems and allow heterologous secretion. To determine if the Rsa system would work in a similar manner, plasmids expressing RsaD and RsaE were expressed in bacterial hosts along with OMP components from several different bacterial systems.

The plasmids pBBR3AprA:pRAT5, pBBR3PrtB:pRAT5 and pRAT4AH were constructed and express either a protease or rsaA along with rsaD and rsaE. These plasmids were introduced into E. coli tolC+ alone or with either of the plasmids pBBR1AprF and pBBR1PrtF, which express OMP components from the Apr and Prt systems. None of these strains secreted either the protease or RsaA (data not shown). Since E. coli is an enteric microorganism and C. crescentus is a free-living groundwater bacterium, their outer membranes are quite different. It is possible that the Rsa transport complex was unable to assemble in the membrane of E. coli.

Rhizobium meliloti and Rhizobium leguminosarum are groundwater bacteria living in environments similar to C. crescentus and likely have a membrane resembling that of C. crescentus. In addition, the type I secretion systems, Nod and Prs, with similarity to the Rsa secretion machinery have been found in R. leguminosarum (Finnie et al., 1998; Scheu et al., 1992). In R. leguminosarum, as in the Rsa system, the OMP gene of the Prs secretion system has not been found close to the other transport genes and is expected to be elsewhere on the chromosome and could possibly complement the Rsa machinery. With this in mind, pBBR3AprA:pRAT5, pBBR3PrtB:pRAT5 and pRAT4AH were expressed in R. meliloti and R. leguminosarum. Again, none of the constructs expressed the proteases or RsaA. Further experiments were tried by introducing pBBR3AprA:pRAT5, pBBR3PrtB:pRAT5 and pRAT4AH along with pBBR1AprF and pBBR1PrtF, in various combinations, in the Rhizobium species. In no case was secretion of RsaA or the protease found (data not shown).

Sequence similarity to other OMP genes was used to search for the Rsa OMP gene. Southern blots of C. crescentus chromosomal DNA were probed with the OMP genes aprF and prtF under conditions allowing 30% mismatch. No hybridization of these probes to C. crescentus DNA was found (data not shown), demonstrating that this method could not be used.

A sequence alignment of OMP components revealed areas of sequence identity among the different proteins. The protein sequences of the OMPs from a number of closely related type I transport systems (with OMP genes that are both linked and unlinked to the other transporter genes) were aligned (Fig. 4-1). The OMP HasF was given the highest priority in the comparison because it is from the type I system with an unlinked OMP gene most closely related to the Rsa system. Areas of significant homology were examined for the purpose of designing degenerate primers to amplify a portion of the OMP gene using PCR. Four areas, shown in Fig. 4-1, were chosen for making primers. The primers were designed by taking the consensus amino acid sequence and using the codon preferences of C. crescentus to determine the DNA sequence. The design process was governed by the suggestions in Colnaghi et al., 1996; Maser and Kaminsky, 1998; and Tobin et al., 1997.

Figure 4-1. Alignment of OMP components. Arrows are placed above regions of similarity that were used to design degenerate primers. The arrows are colour coded according to the primer they were used to create (see legend). Legend: F60, FB110, IF340, IFB415.
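The trade-off behind degenerate primer design from a consensus peptide can be made concrete with a small sketch. It counts how many distinct DNA sequences encode a peptide under the standard genetic code, and how that number shrinks if only codons ending in G or C are allowed. The G/C restriction is a crude stand-in for a codon preference in a G+C-rich genome and is an assumption for illustration; it is not the codon preference table used in this work, and the peptide in the usage example is arbitrary.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codons_for(amino_acid, allowed_third_bases=None):
    """Codons encoding one residue, optionally restricted by third base."""
    return [codon for codon, aa in CODON_TABLE.items()
            if aa == amino_acid
            and (allowed_third_bases is None or codon[2] in allowed_third_bases)]


def degeneracy(peptide, allowed_third_bases=None):
    """Number of distinct DNA sequences that encode the peptide."""
    total = 1
    for amino_acid in peptide:
        total *= len(codons_for(amino_acid, allowed_third_bases))
    return total


if __name__ == "__main__":
    peptide = "LSGGQ"  # arbitrary example peptide, not one of the primer sites
    print(degeneracy(peptide))                            # all synonymous codons
    print(degeneracy(peptide, allowed_third_bases="GC"))  # assumed G/C preference
```

Lower degeneracy means fewer primer species in the pool, which is the reason a codon preference is applied when back-translating the consensus sequence.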
A variety of conditions, as well as different combinations of the primers, were used to amplify fragments from NA1000 chromosomal DNA (see Ch. 2). When the PCR conditions resulted in a product, multiple bands were always seen. Three DNA fragments of the expected size were gel purified and cloned. Sequencing of these products revealed similarity to 23S RNA, poly(3-hydroxybutyrate) biosynthesis genes and NADH dehydrogenase genes. The primers appeared to be amplifying undesired DNA sequences and as a result these experiments were abandoned.

Two candidates for the Rsa OMP gene were identified in the preliminary Caulobacter genome data.

As all other attempts had failed to identify the OMP gene, contact was made with The Institute for Genome Research (TIGR), who provided preliminary sequence data from the Caulobacter genome. FASTA searches (Pearson et al., 1997) of this database produced two contigs with similarity to known OMP components. Contig gcc_973 contains an ORF coding for the first 225 amino acids of a possible OMP component with a G+C content of 65.3%. Examination of the DNA 5' of this ORF revealed that this ORF is 5 kb 3' of the rsaE gene and that there are 5 intervening ORFs that likely code for S-LPS synthesis proteins (Fig. 4-2). This ORF has been designated rsaF(973). The deduced amino acid sequence of rsaF(973) had greatest similarity to TolC, with 26.1% identity and 52.2% similarity over the 225 amino acids coded by gcc_973 (Fig. 4-3). Contig gcc_1984 has a G+C content of 67% and contains an ORF coding for the last 384 amino acids of a possible OMP. This ORF has been designated rsaF(1984). 3' of rsaF(1984) is an ORF coding for valyl tRNA synthetase (Fig. 4-2). The coding sequence of rsaF(1984) had greatest similarity to the HasF OMP, with 26.8% identity and 48.5% similarity (Fig. 4-3). The G+C content of these two ORFs is comparable to C. crescentus's 67%, suggesting that neither is a recent genetic acquisition. These two contigs overlap with 59.6% identity over a region of 344 bp, indicating that they are not part of the same ORF but suggesting that one arose by gene duplication of the other (Fig. 4-3).

Figure 4-2. The two possible OMPs, rsaF(973) and rsaF(1984). A) The figure shows that rsaF(973) is located 5 kb downstream of rsaE. B) rsaF(1984) is located adjacent to a gene coding for valyl tRNA synthetase. The location of the gcc contigs is shown with black bars. B-BamHI, C-ClaI, S-SstI. (Map labels: Substrate, Transporters, S-LPS synthesis, rsaF?, valyl tRNA synthetase; scale in kb.)

Once sequence was available it was assumed that it would be relatively simple to obtain both complete genes. This did not prove to be the case. Using these sequences, primers were designed to amplify portions of rsaF(973) and rsaF(1984) that could then be used as probes to isolate the complete genes. These primers had melting temperatures (Tm) between 58°C and 62°C and did not appear to have any hairpin loops or secondary priming sites when analyzed using primer analysis and design programs. Primers of this size and Tm have been used routinely for PCR amplification of C. crescentus DNA with excellent results. These primers produced products of the expected size, but when cloned and sequenced the products were identical to the C. crescentus DNA gyrase and glutamate permease genes. Suspecting that there might be something peculiar about the structure of the DNA around the rsaF genes, it was decided to attempt to isolate the DNA of the adjacent regions.
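The G+C content, percent identity and primer Tm figures quoted above are simple to compute, and a minimal sketch may help readers reproduce that kind of number. The functions below are illustrative only. The Tm formula is a widely quoted approximation that ignores salt and primer concentration, and it is not claimed to be the method used by the primer design programs in this work. The gap-handling convention in `percent_identity` is likewise an assumption.

```python
def gc_content(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap are skipped. This is one
    common convention, assumed here; other tools count gaps differently.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != gap and b != gap
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def basic_tm(primer):
    """Rough melting temperature (deg C) for primers longer than 13 nt.

    Approximation: Tm = 64.9 + 41 * (G + C - 16.4) / N.
    """
    primer = primer.upper()
    n = len(primer)
    if n <= 13:
        raise ValueError("approximation intended for primers longer than 13 nt")
    gc = sum(base in "GC" for base in primer)
    return 64.9 + 41.0 * (gc - 16.4) / n
```

Under this approximation a 26-base primer needs roughly 20 G or C bases to reach a Tm near 70°C, which is in keeping with a genome of about 67% G+C.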
Since the start of rsaF(973) is found in the genome 1.5 kb 3' of sequences cloned into pRAT1, a 2 kb BamHI-EcoRI fragment was sub-cloned from pRAT1 and designated pRAT HI (B/E). To amplify a fragment of DNA close to the rsaF(1984) gene, new primers were made to amplify a 736 bp region 3' of rsaF(1984). These primers were designed with a Tm of 70°C and were 26-28 bp long. PCR using these primers produced a product of the expected size that was successfully cloned, and the resulting plasmid was called pBSKS-gcc1984. When sequenced, the product proved to be the correct fragment.

A. BlastX comparison of gcc_973

    Sequences producing High-scoring Segment Pairs                       High Score   Smallest Probability P(N)
 1. gi|72556    outer membrane protein tolC E. coli                      92           4.0e-11
 2. gi|3080540  (D49826) LipD [Serratia marcescens]                      115          7.4e-07
 3. gi|4826418  (Y19002) PrtF protein [Erwinia amylovora]                115          1.0e-06
 4. gi|281563   agglutination protein - Pseudomonas putida               61           3.4e-05

B. BlastX comparison of gcc_1984

    Sequences producing High-scoring Segment Pairs                       High Score   Smallest Probability P(N)
 1. gi|1405817  (X98513) HasF ABC exporter outer membrane .              154          1.0e-23
 2. gi|135980   OUTER MEMBRANE PROTEIN TOLC PRECURSOR E. coli            159          1.2e-23
 3. gi|3080540  (D49826) LipD [Serratia marcescens]                      126          8.3e-23
 4. gi|4826418  (Y19002) PrtF protein [Erwinia amylovora]                111          4.2e-21

C. Overlap of gcc_973 and gcc_1984.

gcc_973   CAGACCTCGACCCTCTCTCTGAGCCAGAGCCTCTACACCAACGGTCGTTTCTCGGCCCGC
gcc_1984                              CGCTCTACACCGGCGGTCGCGCCAGCGCGGGC

gcc_973   CTGGCGGGTGTCGAGGCGCAGATCAAGGCCGCGCGCGAGAACCTGCGCCGCATCGAGATG
gcc_1984  GTCAGCCCCGCTGAAGCCGACGTGCTGTCTGCGCGGGAAGGTCTTCGCGCGGTCGAGCAG

gcc_973   GACCTGCTGGTCCGCGTGACCAACGCCTATATCTCGGTGCGCCGCGACCGCGAGATCCTG
gcc_1984  GGGGTGCTGGTCAGCGTCGTCCAGGCCTATGTCGACGTGCGCCGAGACCAGGAACGCCTG

gcc_973   CGGATCAGCCAAGG-CGGTGAAGCCTGGCTGCAGAAGCAATTGAAGGACACCGAGGACAA
gcc_1984  CGCATC-GCCAAGGAAAACGTCGCGGTTCTGCAGCGCCAGCTCGAAGAATCGAACGCTCG

gcc_973   GTACAGCGTCCGTCAGGTGACCTTGACCGACGTGCAGCAGGCCAAGGCCCGCCTGGCGTC
gcc_1984  CTTCGACGTGGGTGAGATCACCCGGACGGACGTCGCCCAGTCTCAGGCGCGCTTGGCTTC

gcc_973   GGCCAGCACTCAGGTGGCGAACGCCCAGGCGCAGCTGAATGTCAGCGTAGCGTTCTACGC
gcc_1984  GGCCAAGGCCAGCCTGTCGGGCGCCCAGGCCCAGTTGGAAGTCAGCCGCGCCTCCTACGC

gcc_973   GTCCCTGGTGGGGCGCCAGCCGGAGAC
gcc_1984  TGCGGTGGTCGGTCAAACGCCCGGCGAACTGGCTCCCGAGCCGAGCTTGGCCGGACTGCT

Figure 4-3. Comparison of possible Rsa OMP components. A) Closest similar proteins to the ORF from gcc_973. B) Closest similar proteins to the ORF from gcc_1984. C) Comparison of gcc_973 to gcc_1984. Note that the P(N) numbers are higher for gcc_1984 than gcc_973 because the gcc_1984 contig has a larger portion of the ORF.

The NA1000 cosmid library was probed with pRAT HI (B/E) and pBSKS-gcc1984. A number of cosmids hybridized to pRAT HI (B/E), but all proved to contain only DNA 5' of rsaF(973) and it was concluded that rsaF(973) was not located within the NA1000 cosmid library. The cosmid 7A22 hybridized to pBSKS-gcc1984. Southern blots of the cosmid showed that pBSKS-gcc1984 hybridized to a 5.5 kb BamHI band.
Several attempts were made to subclone this fragment and, while the surrounding fragments could be cloned, it was not possible to subclone the fragment containing rsaF(1984).

Yet another approach was taken to isolate the rsaF genes. The plasmids pRAT HI (B/E) and pBSKS-gcc1984 do not replicate in C. crescentus and so could be forced to integrate into the genome by homologous recombination. The plasmid pBSKS-gcc1984 was not successfully integrated into the chromosome, but pRAT HI (B/E) was, giving NA1000::pRAT HI (B/E). Chromosomal DNA from NA1000::pRAT HI (B/E) was partially digested with BamHI and ligated under conditions promoting the circularization of the DNA fragments. The ligation mix was electroporated into E. coli and plated on selective medium, which allowed only the growth of cells carrying the plasmid pRAT HI (B/E) together with chromosomal DNA adjacent to the integration points that had circularized during the ligation. The 14 kb plasmid pTZ19UASSm973Bcirc was isolated in this manner. Restriction mapping and Southern blotting of this plasmid showed that the insert consisted of DNA from 2.5 kb 5' to 5.5 kb 3' of rsaF(973). Fragments of this plasmid were sub-cloned and sequenced, including a fragment containing the N-terminus of RsaF(973), but it proved impossible to subclone and sequence the entire rsaF(973) from this plasmid. This is not the first example of DNA from C. crescentus that has proved impossible to subclone. A 6.6 kb fragment, containing the holdfast genes involved in C. crescentus attachment, has proven resistant to the subcloning efforts of several graduate students and postdoctoral fellows (Smit, unpublished).

Fortuitously, one of the shedder Tn5 mutants, F11 (see Ch. 6), contains a Tn5 insertion 400 bp 5' of the rsaF(973) ORF. Using primers that hybridize to the Tn5 it was possible to use an inverse PCR method (Martin and Mohn, 1999) to isolate and clone two fragments of DNA containing rsaF(973).
Plasmid pCR2.1F11SalI contains the DNA from the F11 Tn5 insertion to the SalI site 1.1 kb 3' of rsaF(973). The other, pCR2.1F11XmaI, contains the DNA from the F11 Tn5 insertion to the XmaI site 2.0 kb 3' of rsaF(973). Again, both of these clones proved difficult to isolate. Large amounts of PCR product were obtained from the PCR reaction, but cloning of these fragments only produced one clone of pCR2.1F11SalI and two clones of pCR2.1F11XmaI. Usually when cloning products in this manner a minimum of 50 clones and as many as 300 clones can be expected. E. coli carrying these plasmids grow slowly and appear distended and malformed when observed by phase contrast light microscopy. It is possible that the inserts in these plasmids are not identical to wildtype NA1000 chromosomal DNA sequences, but contain mutations generated by inaccuracies in the Taq polymerase amplification. It may be that the majority of the PCR product is lethal when introduced into E. coli, and that only the fraction of the product carrying mutations in rsaF(973) that make it less toxic could be cloned in E. coli.

The sequence of the insert from pCR2.1F11SalI was assembled together with sequence from the plasmid pTZ19UASSm973Bcirc and the TIGR genome (Fig. 4-4, Appendix I). The RsaF(973) sequence from pCR2.1F11SalI showed considerable similarity to other OMPs. The highest degree of sequence similarity was to E. coli TolC, with 25.2% identity and 48.6% similar amino acids. The OMPs AprF and PrtF from P. aeruginosa and E. chrysanthemi were not as similar (Fig. 4-5). Analysis of the sequence of RsaF(973) revealed the presence of a predicted signal sequence encompassing the first 32 amino acids and the presence of β-strands capable of forming a β-barrel structure typical of outer membrane proteins.

Comparison of RsaF(973) to the protein databases

     Document ID   Accession    Protein   Species                        High Score   Smallest Probability P(N)
  1. gi|3860786    (AJ235270)   TolC      Rickettsia prowazekii          160          5.7e-23
  2. gi|882565     (028377)     n/a       Escherichia coli               103          5.1e-17
  3. gi|135980     (X54049)     TolC      Escherichia coli               103          6.9e-17
  4. gi|3080540    (D49826)     LipD      Serratia marcescens            115          1.9e-16
  5. gi|2495191    (U25178)     TolC      Salmonella enteritidis         90           1.4e-14
  6. gi|4826418    (Y19002)     PrtF      Erwinia amylovora              115          3.2e-13
  7. gi|281563     (M64540)     n/a       Pseudomonas putida             99           1.3e-11
  8. gi|72556      (X00016)     TolC      Escherichia coli (partial)     92           1.4e-11
  9. gi|1405817    (X98513)     HasF      Serratia marcescens            90           3.4e-11
 10. gi|4838370    (AF121772)   NatC      Neisseria meningitidis         111          3.2e-10
 11. gi|4115627    (AB015053)   PrtF      Pseudomonas fluorescens        92           1.0e-09
 12. gi|117799     (X14199)     CyaE      Bordetella pertussis           87           1.9e-09
 13. gi|3493599    (AF064762)   ZapD      Proteus mirabilis              94           5.9e-09
 14. gi|4063019    (AF083061)   TliF      Pseudomonas fluorescens        85           1.1e-08
 15. gi|2983554    (AE000721)   n/a       Aquifex aeolicus               108          1.6e-08
 16. gi|416635     (X64558)     aprF      Pseudomonas aeruginosa         86           5.3e-08
 17. gi|5759289    (AF175720)   n/a       Porphyromonas gingivalis       66           6.7e-06
 18. gi|5759287    (AF175719)   n/a       Porphyromonas gingivalis       83           0.00017
 19. gi|1653357    (D90913)     n/a       Synechocystis sp.              70           0.00018
 20. gi|3646415    (AJ007827)   EprF      Pseudomonas tolaasii           78           0.00024
 21. gi|3184190    (AB011381)   OprM      Pseudomonas aeruginosa         74           0.00035
 22. gi|5091481    (AF031417)   TtgC      Pseudomonas putida             66           0.00043
 23. gi|3914250    (L23839)     OprK      Pseudomonas aeruginosa         74           0.0011
 24. gi|95600      (S12527)     PrtF      Erwinia chrysanthemi           80           0.0015

Figure 4-5. BLASTX search showing OMPs similar to RsaF(973). Lines 1 and 2 are predicted from ORF found in genome sequences. OMP from type I systems with the greatest similarity to RsaD and RsaE are underlined. The P(N) value gives the probability of the match arising by chance.

Was either of RsaF(973) or RsaF(1984) the OMP component involved in secretion of RsaA?

Sequence similarity was not enough to show that either or both of the genes coded for the OMP. One approach to determining this was to construct knockout mutants of these ORFs and determine whether this prevented secretion. The plasmids pTZ19UASSmANAC-RsaF(973) and pTZ18U(CHE)ANAC-RsaF(1984) were constructed to perform the required integration events. Both plasmids consisted of internal portions of the respective genes without the N-terminal and C-terminal regions. These constructs required only a single recombination event to accomplish the knockout. A single cross-over would produce two copies of the gene, one with an N-terminal deletion and one with a C-terminal deletion, neither of which would be expected to function. To make pTZ18U(CHE)ANAC-RsaF(1984), it was still necessary to generate a PCR product containing the coding sequence of rsaF(1984). New primers were created using the primer selection methods provided by the MacVector software. The resulting primers were 26 and 28 bp long and had a Tm of 71-73°C. Once again the PCR process proved difficult. A PCR product could not be generated at any annealing temperature higher than 55°C, considerably lower than the predicted Tm. When a product was generated, contaminating bands were always present and could not be eliminated by changes in the PCR reaction conditions. Instead, the band of the expected size was gel purified and cloned, giving the plasmid pCR2.1rsaF(1984), which was then used for constructing the deletion clone pTZ18U(CHE)ANAC-RsaF(1984).

The plasmids pTZ19UASSmANAC-RsaF(973) and pTZ18U(CHE)ANAC-RsaF(1984) were electroporated into the strains NA1000 and JS4000. JS4000 is a strain of C. crescentus that cannot make RsaA, but has functional rsaDE genes virtually identical to those of NA1000 (see Ch. 5). Knockouts were only obtained in the strain JS4000 and not NA1000, resulting in the mutants JS4000rsaF(973) and JS4000rsaF(1984). When AprA was expressed in these mutants, AprA was not secreted by JS4000rsaF(973), but was by JS4000rsaF(1984) (Fig. 4-6). From these data it was concluded that RsaF(973) is the OMP of the RsaA secretion system.

Figure 4-6. AprA secretion from C. crescentus. AprA was expressed in all bacteria using pBBR3AprA on skim milk plates. Zones of clearing around the colonies indicate secretion of AprA. Deletion of rsaF(973) interrupts secretion of AprA while interruption of rsaF(1984) does not interrupt secretion. (Panels: NA1000 (wildtype); JS4000 (S-layer neg.); JS4000 rsaF(973); JS4000 rsaF(1984).)

To confirm that RsaF(973) was required for secretion, the clone pBBR3AprA:pCR2.1F11SalI, expressing AprA and RsaF(973), was created. This construct could not be made in E. coli. This may be because both of the separate plasmids were toxic, but sublethal. Together the toxic effects may be lethal. The plasmid was obtained by introducing the ligation mix directly into the knockout strain of RsaF(973). No AprA is secreted from this construct, as the plasmid pBBR3AprA:pCR2.1F11SalI was unable to complement the knockout. Despite this, it is still believed that RsaF(973) is the OMP of the RsaA secretion system.

Summary

This portion of the project was exceptionally arduous because the rsaF genes appeared to be toxic in E. coli. This would explain much of the difficulty encountered, such as why the NA1000 cosmid library did not contain rsaF(973), why the TIGR genome sequence database does not contain a complete rsaF gene sequence, and why it proved difficult to isolate the genes.
The lack of colonies resulting from the cloning of the rsaF(973) PCR products also suggests a toxic effect. All other attempts to isolate the rsaF genes on a fragment of DNA smaller than 7 kb failed, presumably because the smaller inserts were lethal. This suggests that the rsaF genes are lethal to E. coli and that the clones obtained contain mutations that make the insert less toxic. As mentioned above, this presumed toxicity may explain why the partial TIGR genome sequence contained only partial ORFs of the rsaF genes. Other analysis of the TIGR sequence suggests that greater than 80% of the C. crescentus genome is represented (see Ch. 6).

Given the likelihood of such mutations, the sequence reported here for rsaF(973) may differ from the wildtype sequence. Such a mutant rsaF(973) gene in the plasmid pCR2.1F11SalI may not produce a protein that functions correctly. This would explain why this plasmid was tolerated in E. coli while other constructs appeared to be lethal, and would explain why the plasmid pBBR3AprA:pCR2.1F11SalI failed to complement the RsaF(973) knockout. It is unlikely that the phenotype of the rsaF(973) knockout is caused by a polar mutation because the gene 3' of rsaF(973) is transcribed in the opposite orientation. Even given the failure to complement the knockout, the results presented here indicate that RsaF(973) is the OMP required for secretion of RsaA.

The function of rsaF(1984) is not known. The entire ORF was never cloned and sequenced, so it was not possible to determine whether an entire ORF coding for an OMP exists. The sequence identity between the two rsaF ORFs suggests that one may be a gene duplication of the other and that rsaF(1984) is no longer functional. Another possibility is that there is a second type I secretion system in C. crescentus (though it is not known what it might transport) that uses RsaF(1984) as the OMP component. Determining the function of rsaF(1984) represents a future project.
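The logic of the single cross-over knockout used in this chapter can be made concrete with a small model. The sketch below is an illustration and does not describe the actual constructs. It treats a gene as the interval from base 0 to its length and shows which two gene copies result when a plasmid carrying a fragment of that gene integrates by one recombination event. The gene length and fragment coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneCopy:
    """One copy of the gene left in the chromosome after integration."""
    start: int        # first base of the gene present in this copy
    end: int          # one past the last base of the gene present
    gene_length: int  # length of the intact gene

    @property
    def has_n_terminus(self):
        return self.start == 0

    @property
    def has_c_terminus(self):
        return self.end == self.gene_length

    @property
    def is_intact(self):
        return self.has_n_terminus and self.has_c_terminus


def single_crossover(gene_length, frag_start, frag_end):
    """Model integration of a plasmid carrying gene bases frag_start..frag_end.

    One cross-over within the cloned fragment duplicates it: the upstream
    copy runs from the gene start to the fragment end, and the downstream
    copy runs from the fragment start to the gene end.
    """
    if not 0 <= frag_start < frag_end <= gene_length:
        raise ValueError("fragment must lie within the gene")
    upstream = GeneCopy(0, frag_end, gene_length)
    downstream = GeneCopy(frag_start, gene_length, gene_length)
    return upstream, downstream


if __name__ == "__main__":
    # Hypothetical 1500 bp gene and an internal 300..1200 fragment.
    for copy in single_crossover(1500, 300, 1200):
        print(copy, "intact" if copy.is_intact else "truncated")
```

In this model an internal fragment leaves one copy lacking the C-terminus and one lacking the N-terminus. A fragment that reaches either end of the gene leaves one intact copy, which is why the constructs needed to lack both termini.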
Chapter 5
Identification of the S-layer subunit and transporter genes in Freshwater Caulobacter species

Introduction

The Smit laboratory strain culture collection contains numerous strains that have been isolated from locales around the world and are designated FWC (freshwater Caulobacter) species (MacRae and Smit, 1991). Analysis of these FWC species showed that not all have an S-layer (Walker et al., 1992). There seems to be a geographical as well as evolutionary distinction between these species (Abraham et al., 1999; MacRae and Smit, 1991). No FWC with an S-layer has been found in Europe, though admittedly only a small fraction of the FWC species were isolated from European sources, while FWC species with and without S-layers were found in North America. The evolutionary relationships between the different FWC species have recently been examined by 16S rDNA sequencing, profiling of restriction fragments of 16S-23S rDNA interspacer regions, lipid analysis, immunological profiling and salt tolerance characteristics to organize the taxonomy of 76 different strains (Abraham et al., 1999). It was demonstrated that all of the FWC species with S-layers are much more closely related to one another than to the species without S-layers, and the non-S-layer FWC species have been reclassified as the genus Brevundimonas instead of Caulobacter. Therefore S-layers are a characteristic of Caulobacter species.

The S-layers of the Caulobacter species have been previously examined. The S-layer subunits range in size from 100 kDa (comparable to NA1000) to 193 kDa and can be removed by a low pH or EGTA extraction method. All the putative S-layer proteins react with antibody raised against RsaA (though most often to a lesser extent) and most also produce a polysaccharide that reacts with antibody against the S-LPS responsible for attachment of the S-layer in NA1000 (Walker et al., 1992).
It was also shown that these FWC species will hybridize with an rsaA probe under conditions that would allow up to 30% mismatch (MacRae and Smit, 1991). This suggests that the S-layer subunits on these other FWC species are similar to RsaA and may also be secreted by a type I secretion mechanism.

Two strains have been used predominantly for the examination of the S-layer in C. crescentus. NA1000 is a variant of the ATCC 19089 strain, whose genome is being sequenced by TIGR. It is from NA1000 that the rsaA gene and rsaD and rsaE, the genes responsible for secretion of RsaA, were isolated (see Ch. 3). The second strain used in the Smit lab is JS4000, a lab variant of the ATCC 15252 strain that spontaneously lost its S-layer during culturing and is being used for expression of recombinant proteins secreted using the NA1000 rsaA gene. The S-layer gene from JS4000 has been cloned and expressed in E. coli, where it produces a 40,000 molecular weight protein in inclusion bodies (Bingle et al., 1999). ATCC 15252 has an S-layer gene that appears to be identical to RsaA as determined by size and antibody reactivity, yet other characteristics of the bacterium (e.g., cell appearance, growth rates), 16S rRNA sequencing (Stahl et al., 1992) and RFLP mapping of the genome (B. Ely, pers. comm.) showed that it is different from NA1000.

Preliminary investigations of these S-layers, begun in order to determine the differences between the S-layer subunits and their associated transport systems, are presented here and have now been taken over by Mihai Iuga. It is hoped that analysis of these other S-layer systems will provide insight into the transport mechanisms by showing what changes in the transporters are required to transport the different sized subunits.

Results and Discussion

The S-layer subunit, ABC-transporter and Membrane Forming Unit proteins of JS4000 and NA1000 Caulobacter species are virtually identical.
The S-layer genes from both JS4000 and JS3001, a shedding derivative of ATCC 15252, were cloned and sequenced (see Ch. 2) and have few differences when compared to the sequence of the NA1000 rsaA. In a few places the guanosine (G) and cytosine (C) residues are reversed (i.e., GC instead of CG), but these are in regions of high G+C content and appear to be errors in the original sequencing of rsaA (Gilchrist et al., 1992), as the partial Caulobacter genome sequence from TIGR supports my sequencing results. The sequence for NA1000 was amended accordingly. The error in the JS4000 sequence that truncates the S-layer protein consists of a guanosine base that has been deleted from codon 357, which causes a termination codon to be read at codon 359. These differences are listed in Table 5-1. The rsaD and rsaE genes from JS4000 have been isolated from a cosmid library (see Ch. 2) and were sequenced. These genes are almost identical to the NA1000 genes. The differences between the strains are summarized in Table 5-1.

                            ATCC 19089     ATCC 15252
Protein   Position          NA1000         JS4000          JS3001
RsaA      aa 358-359-360    Gln-Asn-Leu    Gln-Thr-None    Gln-Asn-Leu
RsaA      aa 475            Val            Ile             Val
RsaA      aa 860            Thr            Ser             Thr
RsaD      aa 298            Asn            Thr             ND
RsaE      aa 131-132        Ser-Gln        Arg-Leu         ND

Table 5-1. Differences between the Rsa genes found in common lab strains. Deduced amino acid sequence differences between the RsaA, RsaD and RsaE proteins of three lab strains of C. crescentus. ND - not determined.

The S-layers of FWC species are probably transported by a type I secretion system.

The alkaline protease AprA from P. aeruginosa is secreted by the RsaA secretion machinery (see Ch. 3). AprA was successfully secreted in selected strains covering the range of S-layer subunit sizes, demonstrating that these strains also had type I secretion mechanisms (Table 5-2). AprA secretion varied among the FWC species.
While in NA1000 all the colonies containing the aprA gene secreted AprA, not all FWC colonies did. While some species (e.g., FWC 19) showed full penetrance (all colonies expressed AprA), in other FWC species as few as 10% of the colonies secreted AprA (e.g., FWC 32). It is not known why only some colonies secreted AprA. P. aeruginosa also expresses an inhibitor that binds to AprA and prevents proteolytic activity inside the cell. As the inhibitor is not expressed with aprA in the FWC species, AprA may have a toxic effect on Caulobacter cells and there may be selective pressure to eliminate it from the cells. Cells not secreting AprA may have found a way to prevent expression of the gene. NA1000 and some of the FWC species may be better able to tolerate the toxicity than other species.

Species    AprA secretion   Penetrance* (%)   Subunit size
NA1000     ++               >99.9             98 kDa
JS4000     ++               >99.9             98 kDa
FWC 8      ++               80                122 kDa
FWC 9      +                >99.9             133 kDa
FWC 17     +                78                106 kDa
FWC 19     +                >99.9             108 kDa
FWC 28     +                45                106 kDa
FWC 32     +                10                133 kDa
FWC 39     +                80                193 kDa
FWC 42     +                10                181 kDa

Table 5-2. FWC species secreting alkaline protease. ++ represents 70-100% of the NA1000 secretion level, + represents 20-69% of the NA1000 secretion level. * penetrance was the number of colonies expressing AprA when the aprA gene was expressed.

FWC species with similar subunit sizes have similar Southern blot banding patterns.

To further characterise the FWC species, Southern blot analysis was performed using probes to rsaA and rsaDE. These blots were performed under conditions that would allow up to 30% mismatch. The results are summarized in Table 5-3.

Caulobacter      Subunit size   Fragment size when probed           Fragment size when probed
species          (kDa)          with rsaD and rsaE (enzyme 1)       with rsaA (enzyme 1)
NA1000 JS3000    98             >20 kb (EcoRI), 7.1 kb (HindIII)    1.1 kb (HindIII)
FWC 17           106            3.5 kb (EcoRI), 5 kb (HindIII)      4.3 kb (EcoRI)
FWC 18           131            ND 2                                7.0 kb (BamHI)
FWC 19           108            3.5 kb (EcoRI)                      4.4 kb (EcoRI)
FWC 28           106            3.5 kb (EcoRI)                      4.3 kb (EcoRI)
FWC 31           106            3.5 kb (EcoRI)                      4.3 kb (EcoRI)
FWC 42           181            10 kb (EcoRI)                       8.0 kb (EcoRI)

Table 5-3. Comparison of Southern blot banding patterns of different FWC species. Chromosomal digests with the enzyme specified were probed with either rsaA or rsaDE.
1 Enzyme that chromosomal DNA was cut with for Southern blot analysis
2 Not Determined

Analysis of the Southern blot data suggests that the S-layer subunits and transporters can be grouped according to size. All of the FWC species with subunits ranging from 106-108 kDa have identical Southern banding patterns, while all the other FWC species with different subunit sizes have different banding patterns. The ability of the rsaDE genes to hybridize to the chromosome of the differing FWC species suggests that the S-layer subunit is being secreted by a type I transporter. With this in mind, methods were devised for isolating the genes involved.

The ABC-transporter subunits were isolated from several different FWC species.

Of the 3 transporter components, the ABC transporter shows the most significant sequence identity among different type I systems. Using the sequence identity between the ABC-transporters aprD (P. aeruginosa), prtD (E. chrysanthemi) and rsaD (NA1000), degenerate primers were designed to amplify a central portion of the ABC transporter using PCR. Using these primers it was possible to amplify, clone and sequence fragments of the ABC transporter from FWC6, FWC8 and FWC39. PCR products were not successfully generated from FWC17, FWC26, FWC28, FWC29 and FWC41. Multiple bands were generated from FWC27 and FWC42, but I was unable to clone any of the fragments.

Figure 5-1. ClustalW alignment of partial RsaD genes from FWC species (NA1000, JS4000, FWC8, FWC6 and FWC39). Identical residues have dark shading. Similar residues are shaded lightly. The line underneath the alignment is the consensus sequence.

The PCR strategy necessarily selects for ABC-transporters most closely related to the NA1000 gene. This suggests that even though the subunit of FWC6 is 181 kDa and that of FWC39 is 193 kDa, the transporters are still closely related to those of FWC8, with a subunit of 122 kDa, and NA1000, with a subunit of 98 kDa, and this was confirmed by sequencing (Fig. 5-1). Curiously, FWC species with small subunit sizes close to that of NA1000 failed to generate PCR products, suggesting that the sequences of their ABC-transporters have diverged more from the NA1000 sequence. Analysis of the sequence showed little division between the FWC species according to size.
In some places along the deduced protein sequence, the transporters of smaller subunits are more similar to one another than to the transporters of larger subunits, while in others the sequences of transporters of differing sizes are more similar to one another (Fig. 5-1). A method for screening the chromosomes of FWC species for the S-layer subunit and S-layer transport genes was devised (see Ch. 2). Using this method, part of the S-layer subunit gene for FWC27 was isolated. FWC27 has an S-layer subunit size of 145 kDa. Comparison of the sequence to NA1000 reveals that there is a considerable difference in the sequence of these proteins (Fig. 5-2). A BLAST alignment of the RsaA and FWC27 sequences (Altschul et al., 1990) shows that the proteins are 44.6% identical and 61.5% similar over 130 amino acids.

[Figure 5-2. ClustalW alignment of FWC27 with the first 200 amino acids of RsaA. Identical residues have dark shading. Similar residues have light shading. Identical and similar residues are boxed. The line underneath the sequences is the consensus sequence. The alignment itself was not recoverable from the extracted text.]

The sequence of RsaA contains repeating amino acid sequence elements. Sequence analysis of RsaA has revealed that portions of the sequence exhibit considerable sequence similarity to other portions of the molecule. Table 5-4 shows the similarity of the Ca2+ binding domain of RsaA to sequences closer to the N-terminal. These similar units do not appear to be uniform in size and appear to consist of 60 to 90 amino acid segments, but the exact sizes have not been determined. These segments may represent a complete structural domain (i.e., α-helix or β-strand) that is replicated along the length of the protein, but further analysis is required to confirm this. As Table 5-4A shows, the alignments of RsaA along different portions of itself can result in as much as 28% identical amino acids. Furthermore, the Expect numbers, representing the number of matches of this quality expected to occur by chance in a random sequence database of the current size, are very small. Table 5-4B shows the other hits in the database to the same portion of RsaA. The Sap proteins from C. fetus are the S-layer proteins with the greatest identity to RsaA. HlyA from Aquifex aeolicus and the hypothetical protein from Rhodobacter capsulatus both contain the calcium binding motifs found in proteins secreted by type I systems, leading to higher identity. As the Expect numbers show, the identity of RsaA along itself is greater than what would be found by chance in the sequence database. This repetitive nature is also seen at the DNA level (data not shown). It must be taken into account that the nature of the RsaA composition (26% threonine and serine) leads to a higher number of repetitive sequences occurring than would be expected by chance.
This explains why a low Expect number occurs with alignments to a membrane glycoprotein from Equine herpesvirus, which also contains a high number of threonine and serine residues. It is only at Expect numbers of 1.8e-08, much higher than the best Expect number of 6e-14 of RsaA to itself, that random proteins begin to show identity. Overall, the repetitive nature found here is higher than could be expected by chance and suggests that RsaA evolved by duplicating structural portions of the molecule to form a larger protein.

Table 5-4. BLAST alignment of RsaA with itself.

A

pir||A48995 paracrystalline surface layer protein RsaA - Caulobacter crescentus
Length = 1026

Score = 573 bits (1461), Expect = e-163
Identities = 300/300 (100%), Positives = 300/300 (100%)
Query: 1-300; Sbjct: 721-1020

Score = 78.4 bits (190), Expect = 6e-14
Identities = 85/318 (26%), Positives = 133/318 (41%), Gaps = 37/318 (11%)
Query: 2-297; Sbjct: 360-662

Score = 66.3 bits (159), Expect = 3e-10
Identities = 94/361 (26%), Positives = 143/361 (39%), Gaps = 80/361 (22%)
Query: 4-299; Sbjct: 409-752

Score = 66.0 bits (158), Expect = 3e-10
Identities = 85/301 (28%), Positives = 121/301 (39%), Gaps = 46/301 (15%)
Query: 2-299; Sbjct: 172-427

Score = 62.8 bits (150), Expect = 3e-09
Identities = 77/293 (26%), Positives = 125/293 (42%), Gaps = 38/293 (12%)
Query: 12-300; Sbjct: 230-488

[The residue-by-residue Query/Sbjct alignment lines were not recoverable from the extracted text; only the alignment statistics and ranges are given.]

B

Sequences producing High-scoring Segment Pairs:

No.  gi        Sequence                                          High Score  Smallest Probability P(N)  N
1.   477427    RsaA - Caulobacter crescentus                     1461        1e-187                     1
2.   2120535   SapB - Campylobacter fetus                        154         5e-17                      9
3.   2120536   SapA - Campylobacter fetus                        108         1e-11                      1
4.   2114323   membrane glycoprotein - Equine herpesvirus 1      153         5e-11                      1
5.   94640     SapA - Campylobacter fetus                        100         4e-10                      1
6.   2983562   HlyA - Aquifex aeolicus hemolysin protein         132         9e-09                      9
7.   2114321   membrane glycoprotein - Equine herpesvirus 1      130         8e-08                      1
8.   3128319   hypothetical protein - Rhodobacter capsulatus     98          3e-08                      4
9.   2606019   envelope glycoprotein - Equine herpesvirus 4      127         7e-08                      4
10.  4063042   glycoprotein - Cryptosporidium parvum             125         7e-08                      8
11.  790694    epimerase - Azotobacter vinelandii                111         1e-07                      4
12.  3128317   hypothetical protein - Rhodobacter capsulatus     102         4e-06                      1
13.  790692    epimerase - Azotobacter vinelandii                109         4e-06                      1

A) Portions of the sequence of RsaA exhibit considerable sequence similarity to other portions of the molecule. Query represents the 300 amino acid segment of RsaA from 721-1020. Sbjct represents the entire sequence of RsaA. Numbers alongside the sequence indicate amino acid positions. The line between the Query and Sbjct lines indicates identical amino acids with the appropriate letter code and similar amino acids with a '+'. Identities refers to the number of identical amino acids shared between the sequences. Positives refers to the combined number of identical and similar amino acids shared between the sequences. Expect gives the possibility of the sequence alignment occurring by chance considering the current size of the sequence databases. B) Result of BLAST search showing the closest matches to the amino acids 721-1020. P(N) numbers are almost identical to Expect numbers for Expect numbers < 0.001 (Altschul et al., 1990).

Phylogenetic analysis of the FWC species has shown that the FWC species can be divided into five branches.
Analysis of the phylogenetic study of Abraham et al. (1999) shows that there are two branches, B and D, of the Caulobacter phylogenetic tree that contain species with only small (100-108 kDa) S-layer subunits (Fig. 5-3). FWC19, FWC28 and FWC31 belong to one of these branches and FWC17 belongs to the other. These are the four strains with identical Southern blot banding patterns (Table 5-3), suggesting that the S-layers and associated transporters of these two branches are more closely related to each other than to those of the other three branches. The three other branches show no correlation between subunit size and evolutionary distance, as they have S-layer subunit sizes ranging from small (102 kDa) to large (193 kDa). In addition, the species FWC6, FWC8 and FWC39, whose ABC-transporter genes proved easiest to amplify by degenerate PCR, all belong to different branches. This may simply reflect the conserved nature of the ABC-transporter. It may be that the larger S-layers evolved separately from one another, and the similarities between ABC-transporters transporting large subunits (but not found in ABC-transporters transporting small subunits) may represent convergent evolution required to accommodate secretion of a larger subunit.

[Figure 5-3. Dendrogram derived from Caulobacter glycolipid content (adapted from Abraham et al., 1999). The FWC species have been organized into 5 groups (A to E) with a linkage difference of more than 0.05. * species examined in this study. Numbers in brackets refer to the size of the S-layer subunit in kDa. Horizontal axis: Linkage Distance, 0.00 to 0.15. Leaves, top to bottom: FWC without S-layers; FWC34 (110); FWC41 (137); FWC11 (108); FWC45 (140); *FWC42 (181); FWC26 (140); FWC32 (133); FWC16 (151); FWC7 (177); FWC29 (124); *FWC6 (181); *FWC28 (106); FWC22 (107); *FWC31 (106); FWC20 (108); FWC25 (105); *FWC19 (108); *FWC8 (122); FWC33 (110); FWC35 (102); FWC44 (106); *FWC17 (106); *FWC39 (193); FWC9 (133); FWC12 (133); *JS3000 (98); *NA1000 (98). The tree topology and the group boundaries were not recoverable from the extracted text.]

Summary

The evolutionary relationships of the S-layer subunits and associated transporters of the different FWC species have been examined here. These results are still preliminary and more work needs to be done to substantiate these conclusions. While keeping this in mind, I will hypothesize on the evolutionary relationships that the data presented here suggest. The repetitive nature of RsaA suggests how the different sizes of S-layers could have arisen among the different FWC species. The larger S-layer subunits from such strains as FWC39 and FWC41 may be even more repetitive, which would account for their greater bulk. Larger S-layer subunits might arise from a duplication of DNA within the gene for the subunit. The phylogenetic analysis of the FWC species by Abraham and colleagues shows little evolutionary relatedness with regard to S-layer subunit size (Fig. 5-3). While groups B and D contain only smaller S-layer subunits, other groups contain a range of sizes. The most pronounced difference in subunit size is found in group E between the species with the largest (FWC39) and the smallest (NA1000/JS3000) subunits, yet the bacteria are very closely related according to glycolipid content. Thus, it seems that the large S-layer subunits arose independently. The identical amino acid changes seen in the ABC-transporters with large S-layer subunits suggest that these amino acids may be required changes for transporting a subunit of a large size. Further work on analyzing these differences is required before anything conclusive can be determined, and is of great interest since this information would help determine the factors that must be considered when designing recombinant proteins for secretion.
In reviewing all current data, I hypothesize that the progenitor of the six branches of FWC species had a small (106-108 kDa) S-layer subunit and the two branches consisting solely of small S-layer subunits represent FWC that are most closely related to the progenitor. The S-layer subunits of the FWC species in the other four branches may have altered their sizes more recently. The repetitive nature of the S-layer sequence may have assisted in the duplication of sequence segments by allowing slippage during gene replication to create larger S-layer subunits. Smaller subunits such as the 98 kDa NA1000 subunit may have resulted from deletion of repeated units. It may be that to accommodate the different sized subunits, the ABC-transporter components must be changed at specific residues to allow secretion of larger subunits. If convergent evolution resulted in the similarities found between the large subunit transporters here, then these similarities will indicate what portions of the protein are involved in transport of the larger subunit. I believe that the analysis of the S-layer subunits and transporters in this manner will allow a much greater understanding of the type I secretion systems.

Chapter 6
Identification of genes involved in the synthesis of the O-Antigen of C. crescentus

Introduction

The S-LPS of C. crescentus is responsible for attachment of the S-layer to the surface of the bacterium. Disruption of proper O-antigen formation in the S-LPS causes the RsaA molecules to slough off or 'shed' from the surface and assemble into sheets (Fig. 6-1). The S-LPS has been isolated and analyzed from S-layer negative NA1000 mutants (Walker et al., 1994) and has the same core and lipid composition as the rough LPS (Ravenscroft et al., 1992). Further analysis of the O-antigen (Smit, unpublished) has revealed that the O-antigen of the S-LPS appears to be composed of a homopolymer of a 4,6-dideoxy-4-amino-hexose.
Mass spectrometry indicates that the O-antigen has a mass consistent with forty of these hexose units. This homopolymer is unusual in that a number of different anomeric proton signals can be found when it is analyzed by proton NMR, suggesting that the individual sugar units may not all be linked in the same manner. Presented in this report is evidence that this 4,6-dideoxy-4-amino-hexose is, most likely, the sugar perosamine. Perosamine is not commonly found in the O-antigen and only a few species, including Vibrio cholerae, Brucella melitensis and E. coli O157, contain perosamine residues (Stroeher et al., 1995; Wang and Reeves, 1998). In addition, a number of glycosyltransferases have been found which may be the basis for the different linkages making up the homopolymeric O-antigen.

[Figure 6-1. Shed S-layer from C. crescentus. EM photograph of S-layer shed from a strain with defective S-LPS. (Photo courtesy John Smit)]

[Figure 6-2. Colony immunoblot. Example of an immunoblot demonstrating the different phenotypes exhibited by mutants. Panel labels: S-layer negative; S-LPS negative.]

Results and Discussion

Several Tn5 mutants producing altered S-LPS were found. The screen used to detect transport deficient mutants also detected S-LPS mutants in the NA1000 Tn5 library. On plates, these mutants exhibit a 'halo' of RsaA protein diffusing out from the colonies that can be easily distinguished with an immunoblot from bacterial colonies not shedding the S-layer (Fig. 6-2). This method was used to isolate a total of 26 'shedders' with altered S-LPS from the NA1000 Tn5 library (Fig. 6-3).

[Figure 6-3. S-LPS of shedding Tn5 mutants. Silver stained polyacrylamide gel of S-LPS extracts from representative NA1000 shedder Tn5 mutants. NA1000 shows the wildtype form of S-LPS. JS100 is a spontaneous shedder mutant with a defective S-LPS. The large dark band at the bottom is the rough LPS.]

Southern blot analysis of these mutants has shown that mutants F1-F22 consisted of 16 different Tn5 insertions (data not shown). Further Southern blot characterisation of the mutants showed that F8 was not a proper Tn5 insertion, since the banding pattern was incorrect when probed with Tn5. Southern blots probed with the coding sequence of rsaA showed that the rsaA band in the mutant F21 was not the same as wildtype. This suggested that the Tn5 mutation did not result in the shedding phenotype, but that a second mutation resulting in a deletion of the rsaA gene was responsible instead (data not shown). To further characterise these mutants, Southern blot analysis using EcoRI and SstI was performed on the chromosomal DNA of these mutants. Neither of these enzymes cuts Tn5, so they can be used to determine whether the Tn5 insertions are linked. The Southern blots were probed with a portion of the Tn5 and the banding patterns have been summarized in Tables 6-1 and 6-2. The results showed that the majority of these mutants have identical banding patterns (groups C and I) and are linked. Of the remaining mutants, F10 and F22 appear to be linked, while F3 and F9 are not linked to any of the others (Tables 6-1 and 6-2). Four of these mutants (F23-F26) were isolated at a later date and were not characterised by Southern blot.

Southern blot analysis of chromosomal DNA digested using EcoRI
Groups (band size): Group A (8.1 kb), Group B (15 kb), Group C (23 kb), Group D (30 kb), Group E (35 kb)
Mutants scored: F1, F2, F3, F4, F6, F9, F10, F11, F12, F14, F15, F19, F20, F22
[The column positions of the X marks assigning each mutant to a group were lost in extraction.]

Table 6-1. Compilation of Southern blot data from EcoRI digestion of shedder mutant chromosomal DNA. EcoRI does not cut Tn5. The Southern blots were probed with a fragment of Tn5. Mutants are grouped according to the band size seen on the Southern blots.
Southern blot analysis of chromosomal DNA digested using SstI
Groups (band size): Group F (9.3 kb), Group G (14 kb), Group H (18 kb), Group I (20 kb), Group J (21 kb), Group K (23 kb)
Mutants scored: F1, F2, F3, F4, F6, F9, F10, F11, F12, F14, F15, F19, F20, F22
[The column positions of the X marks assigning each mutant to a group were lost in extraction.]

Table 6-2. Compilation of Southern blot data from SstI digestion of shedder mutant chromosomal DNA. SstI does not cut Tn5. The Southern blots were probed with a fragment of Tn5. Mutants are grouped according to the band size seen on the Southern blot.

Half of the Tn5 and associated chromosomal DNA from a representative of each of these 16 groups and F23-F26 was cloned by one of two methods. The majority of Tn5 insertions were cloned by cutting the chromosomal DNA with BamHI. This cuts the Tn5 in half, but leaves the kanamycin resistance gene intact. This DNA was ligated into a pUC-based vector and selected on kanamycin. This gives an insert with Tn5 sequences on one side and chromosomal DNA on the other. A few mutants proved resistant to this technique and were cloned using an inverse PCR method developed by V. Martin (Martin and Mohn, 1999). Sequencing off the end of the Tn5 revealed the insertion site of the Tn5 and this sequence was used to search the partial TIGR C. crescentus genome library for the DNA surrounding the Tn5 insertion site. All of the Tn5 insertion sites were found in the partial genome sequence. Open reading frames (ORFs) were determined using the sequence from the partial genome and analyzed for C. crescentus codon preference. These ORFs were used to search the known protein databases for similar proteins using the BLAST algorithm (Altschul et al., 1990). The genes² interrupted by the Tn5 insertions were characterised using this data (Table 6-3).
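The ORF-calling step described in the preceding paragraph can be illustrated with a short Python sketch. This is not the procedure or software used in this work: the function names (`find_orfs`, `gc3`), the minimum ORF length, the set of accepted start codons and the toy sequence are all hypothetical, and third-position G+C content is used only as a simple stand-in for a codon-preference analysis, on the assumption that coding frames in a high G+C genome are enriched for G or C at the third codon position.

```python
from typing import List, Tuple

STOPS = {"TAA", "TAG", "TGA"}
# Assumption: ATG plus the common bacterial alternative starts GTG and TTG.
STARTS = {"ATG", "GTG", "TTG"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30) -> List[Tuple[str, int, str]]:
    """Scan all six reading frames for start-to-stop ORFs.

    Returns (strand, frame, orf_sequence) tuples. The ORF runs from the
    first start codon after the previous stop through the next in-frame
    stop codon; min_codons counts codons before the stop.
    """
    orfs = []
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in STARTS:
                    start = i  # keep the first (longest) start
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, s[start:i + 3]))
                    start = None
    return orfs


def gc3(orf: str) -> float:
    """Fraction of G or C at the third position of each codon."""
    thirds = orf[2::3]
    return sum(base in "GC" for base in thirds) / len(thirds)


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the C. crescentus genome.
    toy = "CC" + "ATG" + "GCC" * 5 + "TAA" + "GG"
    for strand, frame, orf in find_orfs(toy, min_codons=3):
        print(strand, frame, len(orf) // 3, round(gc3(orf), 2))
```

In a sketch like this, candidate ORFs with a third-position G+C fraction well above the genome average would be the ones worth translating and submitting to a BLAST search; any real threshold would have to be chosen from known C. crescentus genes.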
Tn5 mutant group   Similarity to known proteins                                   Location*   ORF designation
F1, F7             regulator and transcription repressor LacI                     gcc 433     lpsI
F2                 perosamine synthetase, RfbE - V. cholerae                      RATI        lpsC
F3                 nucleotide sugar epimerase/dehydratase                         gcc 1444    lpsK
F4, F5             similarity to mannosyl transferase WbaZ - E. coli              RATI        lpsD
F6                 methyl-accepting chemotaxis receptor                           gcc 648     orf1
F9, F13, F17       phosphomannomutase, RfbB - V. cholerae                         gcc 227     lpsG
F10                none - downstream of kpsT-like ORF (O-antigen transporter)     gcc 279     orf2
F11                similarity to mannosyl transferase (rfb region)                gcc 973     lpsE
F12                similarity to mannosyl transferase WbaZ from E. coli           RATI        lpsD
F14, F16           mannose-6-phosphate isomerase                                  gcc 506     lpsH
F15, F18           similarity to mannosyl transferase WbaZ from E. coli           RATI        lpsD
F19                similarity to mannosyl transferase WbaZ from E. coli           RATI        lpsD
F20                similarity to mannosyl transferases                            gcc 395     lpsF
F22                none - downstream of kpsT-like ORF (O-antigen transporter)     gcc 1290    orf2
F23                phosphomannomutase                                             gcc 227     lpsG
F24                galactosyl-1-phosphate transferase, WlaH - C. jejuni           gcc 2537    lpsJ
F25                mannose-6-phosphate isomerase                                  gcc 506     lpsH
F26                rhamnosyl transferase                                          gcc 2218    lpsL

Table 6-3. List of shedder mutants. ORFs with similarity to sugar modification enzymes have been given an lps designation. * Location gives either the contig (gcc) found in the partial Caulobacter genome or shows that the gene was found in the RATI fragment 3' of rsaE and had been sequenced while looking for the third translocator protein, RsaF.

² For clarity the ORFs will be referred to as genes and the corresponding deduced protein sequences as proteins even though it is acknowledged that neither assumption has been proven.

The S-LPS synthesis genes are genetically linked to the RsaA transport genes. Analysis of the DNA sequence around the rsaA transporter complex (see Ch. 3 and Ch. 4) revealed five ORFs between rsaE and rsaF(973), and one ORF 3' of rsaF(973), with coding sequences having significant similarity to S-LPS synthesis enzymes. The first ORF encoded a protein with similarity to GDP-D-mannose dehydratase (Currie et al., 1995; Stroeher et al., 1995), the second ORF encoded a protein with similarity to UDP-N-acetylglucosamine acyltransferases (Canter Cremers et al., 1989; Vuorio et al., 1994) and the third protein had similarity to perosamine synthetase (Bik et al., 1996; Stroeher et al., 1995). The fourth and fifth proteins have similarities to mannosyltransferases (Drummelsmith and Whitfield, 1999; Rocchetta et al., 1998). These five ORFs have been designated lpsA, lpsB, lpsC, lpsD and lpsE (Fig. 6-4). Another ORF, lpsF, was found 3' of rsaF(973), and also had similarity to glycosyl transferases (Kido et al., 1998). Since the S-LPS is required for attachment of the S-layer, it is not that surprising that some of the genes involved in S-LPS synthesis are physically near rsaA and the transport genes. Smooth LPS genes have also been implicated in the proper formation of the transport complex in some type I secretion systems (Wandersman and Letoffe, 1993). It is thought that smooth LPS is required for proper insertion of the OMP into the outer membrane. Sequencing of the Tn5 insertions in the shedders has shown that F2 is located within lpsC and the four different insertions F4, F12, F15 and F19 are located within lpsD. The presence of four different Tn5 mutations in lpsD suggests that the Tn5 mutations are the cause of the shedding phenotype and that this gene plays a role in S-LPS synthesis. In addition, F11 is found in lpsE and F20 is found in lpsF. Most of the remaining Tn5 insertions are also in genes that have similarity to smooth LPS synthesis genes (Fig. 6-5, Table 6-4). Two of these insertions interrupt genes with similarity to glycosyltransferases.
Four Tn5 insertions are found in genes that have been implicated in pathways for the production of GDP-4-keto-6-deoxy-D-mannose, a precursor of GDP-L-fucose and GDP-perosamine. One insertion appears in a gene with similarity to transcription regulators. Two other insertions are in unknown genes.

[Figures 6-4 and 6-5. Rotated figure pages; the figure content and captions were not recoverable from the extracted text.]

Caulobacter protein / Similar protein   Organism                                Function                                   Identity/Similarity (%)   Accession

LpsA
  GCA        Pseudomonas aeruginosa                  GDP-mannose dehydratase                    65.2/88.6   Q51366
  RfbB       Synechocystis species                   GDP-mannose dehydratase                    55.2/83.4   P72586
  GMD        Escherichia coli                        GDP-mannose dehydratase                    55.7/85.0   P32054
  GMD        Escherichia coli O157                   GDP-mannose dehydratase                    55.7/84.9   O85339
LpsB
  YvfD       Bacillus subtilis                       Serine O-acetyltransferase                 47.2/83.1   P71063
  WlaI       Campylobacter jejuni                    Serine O-acetyltransferase                 37.9/83.4   O86157
  NeuD       Escherichia coli                        acetyltransferase                          32.4/77.2   Q46674
  WbdR       Escherichia coli O157                   N-acetyltransferase                        30.3/72.2   O85344
LpsC
  SpsC       Synechocystis species                   Spore coat polysaccharide synthesis        50.0/86.1   P73981
  Mth334     Methanobacterium thermoautotrophicum    Perosamine synthetase                      46.4/82.4   O26434
  RfbE       Escherichia coli O157                   Perosamine synthetase                      45.4/82.4   O07894
  RfbE       Vibrio cholerae                         Perosamine synthetase                      42.3/80.1   Q06953
LpsD
  WbaZ-1     Archaeoglobus fulgidus                  Mannosyl transferase                       24.3/69.8   O30192
  Mth332     Methanobacterium thermoautotrophicum    LPS biosynthesis                           24.5/68.6   O26432
  ORF18.9    Salmonella enterica                     Mannosyl transferase                       19.6/62.0   Q00483
  ExpE4      Sinorhizobium meliloti                  [function not legible]                     25.0/40.7   P96434
LpsE
  ORF18.9    Salmonella enterica                     Mannosyl transferase                       26.5/89.7   Q00483
  WbaZ-2     Archaeoglobus fulgidus                  Mannosyl transferase                       24.5/64.6   O29649
  WbaZ-1     Methanobacterium thermoautotrophicum    Mannosyl transferase                       24.2/66.5   O30192
  WbdA       Escherichia coli                        Mannosyl transferase                       19.4/66.2   O66234
LpsF
  AF0617     Archaeoglobus fulgidus                  LPS biosynthesis protein                   24.8/69.9   O29638
  Mth370     Methanobacterium thermoautotrophicum    LPS biosynthesis protein, RfbU-like        29.0/65.7   O26470
LpsG
  AlgC       Pseudomonas aeruginosa                  phosphomannomutase                         36.0/57.4   P26276
  PGM        Neisseria gonorrhoeae                   phosphomannomutase                         32.9/50.6   P40390
  PmmA       Mycobacterium                           phosphomannomutase                         38.0/54.2   O86374
  PGM        Neisseria meningitidis                  phosphomannomutase                         35.0/53.5   P40391
LpsH
  XanB       Xanthomonas campestris                  Phosphomannose isomerase                   38.3/71.7   P29956
  ManC       Yersinia enterocolitica                 Mannose-1-phosphate guanyltransferase      33.2/64.0   Q56874
  RfbM       Escherichia coli                        Mannose-1-phosphate guanyltransferase      32.6/65.9   Q59427
LpsI
  CcpA       Bacillus megaterium                     Catabolite control protein                 34.9/74.2   P46828
  CcpA       Bacillus subtilis                       Catabolite control protein                 33.1/74.5   P25144
  DegA       Bacillus subtilis                       Degradation activator                      33.1/74.9   P37947
  LacI       Bacillus subtilis                       LacI repressor-like protein                30.0/72.9   O34396
LpsJ
  LpsB1      Rhizobium etli                          galactosyltransferase                      59.7/71.0   O34301
  CapM       Staphylococcus aureus                   unknown                                    45.7/79.6   P95706
  RfbW       Vibrio cholerae                         galactosyltransferase                      47.2/79.8   Q56624
  PssA       Rhizobium leguminosarum                 galactosyltransferase                      34.6/69.2   Q52856
LpsK*
  WlaL       Campylobacter jejuni                    amino sugar epimerase                      43.8/79.6   O86159
  BplL       Bordetella pertussis                    LPS biosynthesis                           31.0/64.4   Q45387
  LpsB2      Rhizobium etli                          dTDP-glucose 4,6-dehydratase               25.9/39.4   O34302
  CAPD       Bacillus subtilis                       unknown                                    26.5/69.9   P72370
LpsL
  CPS23FV    Streptococcus pneumoniae                Rhamnosyltransferase                       29.8/51.7   O86159
  CPS23FI    Streptococcus pneumoniae                LPS biosynthesis                           29.8/51.7   AAC69532
  ORF51x5    Vibrio anguillarum                      unknown                                    26.7/45.0   O31012

Table 6-4. Deduced proteins involved in O-antigen synthesis and their homologues. BLAST and FASTA alignments were used to determine identity and similarity. Percentage similarity represents identical amino acids and conserved substitutions. * incomplete ORF.

As shown by Southern blotting, the Tn5 insertions F1, F2, F4, F6, F11, F12, F14, F15, F19 and F20 are linked. Figure 6-4 shows that the Tn5 insertions F2, F4, F11, F12, F15 and F20 are linked to the RsaA transporter genes. F1, F6 and F14 must be linked as well, but it was not possible to construct the DNA sequence of this linkage. In addition, of the four mutants not characterised by Southern analysis, F23 is in the same ORF as F9, and F25 is in the same ORF as F14. The other two mutants, F24 and F26, were not obviously linked to any of the other insertions.

Analysis and proposed function of individual proteins involved in S-LPS synthesis. A total of 14 ORFs associated with the formation of the S-LPS were found (Table 6-4). Four of these ORFs are incomplete. A summary of the characteristics of these ORFs is listed in Table 6-5. All of the ORFs start with an ATG codon except lpsH, which starts with a TTG. Sequence similarity and codon preference indicate that the TTG is the most probable start codon for lpsH. Using the C. crescentus promoter consensus for biosynthetic genes (Malakooti et al., 1995), possible promoters were found 31 bp and 99 bp 5' of lpsG, 52 bp 5' of lpsH, 204 bp 5' of lpsI, 154 bp 5' of lpsJ and 63 bp 5' of lpsK.
In some clusters of smooth LPS genes the G+C content of the individual clusters differs from the G+C content of the bacterium as a whole, suggesting recent acquisition of the genes (Fallarino et al., 1997; Fry et al., 1998; Stroeher et al., 1995). The G+C content of these ORFs is consistent with the average C. crescentus content of 67%.

ORF     Size (aa)  Predicted mass (kDa)  pI    G+C %
lpsA    325        36.3                  6.2   65.1
lpsB    215        21.4                  8.5   69.3
lpsC    346        37.8                  5.9   63.1
lpsD    346        39.1                  5.7   65.2
lpsE    345        38.2                  5.8   65.8
lpsF    430        47.0                  7.5   69.1
lpsG    >469       ND                    5.0   65.8
lpsH    434        45.5                  4.6   67.4
lpsI    356        38.7                  6.3   65.4
lpsJ    187        20.5                  10.5  66.0
lpsK    >459       ND                    10.4  69.8
lpsL*   >336       ND                    5.5   68.7
orf1*   >352       ND                    6.4   73.8
orf2    316        34.1                  10.2  72.5

[Translation start column: sequences illegible in the extracted text]

Table 6-5. Characteristics of the putative S-LPS synthesis genes. Start codons are in bold. Putative Shine-Dalgarno sequences are underlined. * - incomplete ORF. ND - not determined because ORF is incomplete.

LpsA resembles GDP-mannose 4,6-dehydratases. The start codon for lpsA is 143 bp 3' of rsaE. No promoter matching the consensus sequence was found upstream of lpsA, although one would be expected if there is a terminator after rsaE (see Ch. 3). The LpsA sequence has up to 65.2% identity and 88.6% similarity over its entire length to GDP-mannose 4,6-dehydratases from P. aeruginosa and E. coli (Table 6-4). These enzymes convert GDP-mannose to GDP-4-keto-6-deoxymannose (Stevenson et al., 1996) as part of biosynthetic pathways for polysaccharides. One example of this is the synthesis of perosamine in V. cholerae and E. coli O157. The significant similarity to GDP-mannose 4,6-dehydratases suggests that this is also the function of LpsA, although no Tn5 insertion was found in the gene.

LpsB is similar to N-acetyltransferases.
The gene lpsB follows lpsA by 2 bp, suggesting that these genes are transcriptionally coupled. The protein encoded by the gene shows significant similarity to WlaI from C. jejuni and NeuD from E. coli (Table 6-4). WlaI is involved in the synthesis of the O-antigen (Fry et al., 1998), while the function of NeuD is not clear, although it is thought to be involved in NeuNAc transfer (Annunziato et al., 1995). These proteins also show some similarity to the LpxA proteins from E. coli and S. enterica. The LpxA proteins are UDP-N-acetylglucosamine O-acetyltransferases that are involved in the first step of Lipid A biosynthesis and have 24 to 26 unique hexapeptide motifs starting with an isoleucine, leucine or valine residue, often followed by a glycine (Vaara, 1992; Vuorio et al., 1994). LpsB, WlaI and NeuD contain several of these hexapeptide repeats (Fig. 6-6). The protein WbdR from E. coli O157 also contains these hexapeptide repeats and has 72.2% sequence similarity to LpsB. WbdR is thought to be an N-acetyltransferase that converts GDP-perosamine to GDP-N-acetylperosamine (Wang and Reeves, 1998). Since the data in this chapter suggest that the genes involved in perosamine synthesis in E. coli O157 are also present in C. crescentus, LpsB may acetylate GDP-perosamine like WbdR.

[Figure 6-6: sequence alignment of LpsB, WlaI and NeuD]

Figure 6-6. ClustalW alignment of LpsB. Alignment of LpsB with WlaI from C. jejuni (Accession CAA72358) and NeuD from E. coli (ACC43301). Asterisks mark the hexapeptide motifs found in glycosyl transferase. Identical and similar residues are boxed.

LpsC appears to be a perosamine synthetase. The gene encoding LpsC starts 74 bp 3' of lpsB, but no promoter sequence was found between lpsB and lpsC. LpsC has considerable identity over its entire length to the rfbE and per gene products that are thought to synthesize perosamine (Table 6-4). These proteins likely catalyze the conversion of GDP-4-keto-6-deoxy-D-mannose to GDP-perosamine (4-amino-4,6-dideoxymannose) in V. cholerae and E. coli O157 (Stroeher et al., 1995; Wang and Reeves, 1998) and show similarity to two classes of pyridoxal-binding proteins involved in the synthesis of amino sugars similar to perosamine. The perosamine synthetic pathway has not been proven chemically, but the proteins suspected in the synthesis of perosamine are the only highly similar proteins involved in O-antigen synthesis found in common between Vibrio cholerae and E. coli O157, supporting these predictions (Wang and Reeves, 1998). Based on the similarity to these genes, it is likely that LpsC is a perosamine synthetase.
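The identity and similarity percentages quoted throughout this chapter come from BLAST and FASTA alignments. As a rough illustration of what the two figures measure, the following Python sketch scores a pairwise alignment that has already been gapped. The function name, the conservative-substitution groups and the example fragments are assumptions made for illustration only; BLAST and FASTA score substitutions with matrices rather than fixed groups, so this sketch will not reproduce the values in Table 6-4.

```python
# Illustrative only: these groups are an assumed, simplified definition of a
# "conserved substitution"; they are not the scoring used by BLAST or FASTA.
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("NQ"),    # amide
    set("ST"),    # hydroxyl
    set("AG"),    # small
]


def percent_identity_similarity(aligned_a, aligned_b, gap="-"):
    """Return (identity %, similarity %) over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are skipped in this sketch
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # similarity counts identical residues as well
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared


# Hypothetical aligned fragments, not taken from any protein in this chapter.
identity, similarity = percent_identity_similarity("MKVLA-G", "MRILAEG")
print(f"{identity:.1f}% identical, {similarity:.1f}% similar")
```

Because similarity includes the identical positions, the second figure can never be lower than the first, which is the pattern seen in every identity/similarity pair reported in Table 6-4.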
[Figure 6-7: sequence alignment of LpsD, LpsE and WbaZ proteins]

Figure 6-7. ClustalW alignment of LpsD and LpsE with WbaZ genes from E. coli (Accession AAD21571) and S. enterica (X61917) and WbaZ homologues from A. fulgidus (AAB91187). Identical and similar residues are boxed. Identical residues have dark shading. Similar residues have light shading. The consensus sequence is located below the alignment.

LpsD and LpsE resemble glycosyltransferases.
The gene for LpsD follows lpsC by 6 bp and the gene for LpsE follows lpsD by 13 bp, suggesting that all three genes are part of a polycistron. Both LpsD and LpsE have significant similarity to the WbaZ proteins (Fig. 6-7). These proteins also have similarity to the RfbU-related proteins, but size and amino acid similarity indicate that the WbaZ-like proteins are a separate family. WbaZ is a known mannosyltransferase in S. enterica (Liu et al., 1993). It seems likely that LpsD and LpsE function to link perosamine monomers to the O-antigen, with each providing a different form of linkage.

LpsF is similar to perosamine transferases. The gene for LpsF is separated from lpsABCDE by rsaF and is transcribed in the opposite orientation. LpsF, like LpsD and LpsE, appears to be a mannosyltransferase, but has greater similarity to the RfbU family. The similarity to mannosyltransferases is much less than that seen with LpsD and LpsE, but it does have significant similarity to the C-terminus of the E. coli mannosyltransferases WbdB and WbdA (Kido et al., 1998; Sugiyama et al., 1998) and to RfbU from V. cholerae (Wang and Reeves, 1998). RfbU from V. cholerae is known to transfer a perosamine residue onto the growing O-antigen chain. These proteins contain a signature motif that is also found in LpsF (Fig. 6-8). This motif consists of the sequence EX[XF]GXXXXE[AG] with a serine preceding the motif by 3 to 5 residues (Geremia et al., 1996; Rocchetta et al., 1998). Again, it seems likely that LpsF acts to add perosamine residues onto the O-antigen.
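The signature motif described for LpsF can be expressed as a pattern search. The Python sketch below is a minimal illustration under two stated assumptions: the position written [XF] is treated as unconstrained, and "preceding the motif by 3 to 5 residues" is read as a serine located 3, 4 or 5 positions upstream of the first glutamate. The function name and the test sequence are hypothetical and are not taken from the alignment in Fig. 6-8.

```python
import re

# E-X-[X/F]-G-X(4)-E-[A/G]; the third position is unconstrained (often F),
# so it is written as "." here. The lookahead allows overlapping matches.
MOTIF = re.compile(r"(?=(E..G.{4}E[AG]))")


def find_transferase_motif(seq):
    """Return (motif_start, motif, serine_offset) for each motif with an upstream serine.

    Positions are 0-based. serine_offset is how many positions upstream of the
    first glutamate the serine lies (assumed reading: 3, 4 or 5).
    """
    seq = seq.upper()
    hits = []
    for match in MOTIF.finditer(seq):
        start = match.start(1)
        for offset in (3, 4, 5):
            pos = start - offset
            if pos >= 0 and seq[pos] == "S":
                hits.append((start, match.group(1), offset))
                break  # report the nearest qualifying serine only
    return hits


# Hypothetical sequence built to contain one motif with a serine 4 positions upstream.
print(find_transferase_motif("MKTSHLREGYGLLLAEAIW"))
```

A search of this kind only flags candidates; whether a hit corresponds to a functional transferase site would still need to be judged from the full alignment.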
[Figure 6-8: sequence alignment of LpsF with RfbU, WbdA and WbdB]

Figure 6-8. ClustalW alignment of LpsF with a number of known mannosyl transferases. The mannosyl transferase motif is boxed. The conserved serine is marked with *. RfbU - Vibrio cholerae (Accession Y07788), RfbU - E. coli (BAA31838), WbdA, WbdB - E. coli (D43637). Identical and similar residues are boxed. Identical residues have dark shading. Similar residues have light shading. The consensus sequence is located below the alignment.

LpsG is similar to phosphomannomutases. Two Tn5 insertion mutants had interrupted lpsG genes. The lpsG gene does not appear to be linked to any of the other lps genes (Table 6-1 and Table 6-2). This protein has very high identity along its entire length to a number of phosphomannomutase enzymes, suggesting that this is the function of LpsG (Table 6-4). Phosphomannomutase converts mannose-6-phosphate to mannose-1-phosphate and is one of the enzymes implicated in perosamine synthesis (Stroeher et al., 1995; Wang and Reeves, 1998).

LpsH may have a dual function as a phosphomannoisomerase and mannose-1-phosphate guanyltransferase. Two shedder mutants have Tn5 insertions within lpsH that result in loss of proper O-antigen production.
It was not possible to link this gene with the RsaA transport genes using the TIGR Caulobacter genome sequence, but Southern analysis showed that lpsH is linked (Table 6-1 and Table 6-2). LpsH has significant identity over its entire length to a large family of enzymes that have dual functions as a phosphomannoisomerase and mannose-1-phosphate guanyltransferase (Table 6-4). Both functions are required for the synthesis of perosamine (Stroeher et al., 1995) and are probably also performed by LpsH in C. crescentus. In E. coli O157 these functions are split between the manA and manC genes (Wang and Reeves, 1998).

LpsI has similarity to the LacI repressor family. The Tn5 insertion in mutant F1 interrupts lpsI. Southern blot analysis indicated that this insertion is linked to the Rsa locus. This insertion has a different phenotype from every other shedder Tn5 insertion. Analysis of the O-antigen by SDS-PAGE and silver staining reveals that a lower amount of O-antigen is produced by this mutant. Analysis of LpsI indicates that the highest degree of identity is with CcpA, the catabolite control protein in Bacillus subtilis. CcpA represses carbohydrate utilization enzymes such as α-amylase and acetyl coenzyme A synthetase and has a positive regulatory effect on excess carbon excretion proteins such as acetate kinase (Henkin et al., 1991). Lower sequence identity is found to a number of LacI repressor-like proteins (Table 6-4). Analysis of the genes adjacent to lpsI revealed the presence of analogues of glucokinase, 6-phosphogluconate dehydratase and glucose-6-phosphate-1-dehydrogenase, enzymes involved in basic metabolic pathways. This positioning suggests that LpsI may regulate the transcription of these genes. If LpsI has a repressor effect on these enzymes it could slow the production of O-antigen, as glucose-6-phosphate would tend not to be shunted into the perosamine synthetic pathway.
Instead, it would be used for energy production in central metabolism.

LpsJ is similar to galactosyl transferases. The Tn5 insertion F24 interrupts a gene with sequence similarity to several galactosyl transferases (Fig. 6-9). These enzymes appear to transfer the first sugar residue (usually a galactose) to undecaprenol phosphate, the lipid precursor. RfbW is one of these enzymes and its sequence is 47.2% identical and 79.8% similar to LpsJ over 144 amino acids. RfbW is involved in the synthesis of the perosamine homopolymer making up the O-antigen of V. cholerae O1 (Fallarino et al., 1997), suggesting that RfbW may transfer the first perosamine to the lipid precursor. In C. crescentus, LpsJ may initiate the formation of the O-antigen by attaching the first sugar residue (probably a perosamine) to the undecaprenol phosphate.

[Figure 6-9: sequence alignment of LpsJ with RfbW, LpsB1, WlaH and WlbG]

Figure 6-9. ClustalW alignment of LpsJ with putative galactosyltransferases. RfbW - V. cholerae (Accession Y07788), LpsB1 - R. etli (U56723), WlaH - C. jejuni (CAA72357), WlbG - Bordetella pertussis (X90711). Identical and similar residues are boxed. Identical residues have dark shading. Similar residues have light shading. The consensus sequence is located below the alignment.

LpsK has sequence similarity to amino sugar synthesis enzymes. The mutant F3 has an interruption in lpsK. It was only possible to determine the sequence for the 5' end of lpsK from the TIGR genome. The partial sequence of LpsK is similar to a number of large proteins, usually consisting of over 600 amino acids, suggesting that approximately 150 amino acids are missing from the C-terminus of the LpsK coding sequence (Fig. 6-10). There is still considerable similarity, especially in the middle of the protein, to WlaL, RfbV and WlbL from C. jejuni, V. cholerae O1 and B. pertussis, respectively. These proteins contain 5 hydrophobic, predicted transmembrane domains in the N-terminus. The central portion contains an NAD-binding site and is homologous to UDP-glucose-4-epimerases. Two motifs have been implicated in binding of NAD in these proteins, GXGXXG and GAGGSIG (Fallarino et al., 1997). As seen in Fig. 6-10, the second motif is found in all the proteins, but the first only occurs in RfbV and WlbL, suggesting that not all members of this family contain this motif.
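The presence or absence of the two NAD-binding motifs can be checked with a simple scan. The sketch below is an illustration only: the function name and the toy sequence are hypothetical, X is taken to mean any residue, and the result says nothing about the actual LpsK, RfbV, WlaL or WlbL sequences, which are not reproduced here.

```python
import re

# Motifs as quoted in the text from Fallarino et al. (1997); X is any residue.
# Lookaheads are used so that overlapping occurrences are all reported.
NAD_MOTIFS = {
    "GXGXXG": re.compile(r"(?=G.G..G)"),
    "GAGGSIG": re.compile(r"(?=GAGGSIG)"),
}


def find_nad_motifs(seq):
    """Map each motif name to the 0-based start positions where it occurs in seq."""
    seq = seq.upper()
    return {
        name: [match.start() for match in pattern.finditer(seq)]
        for name, pattern in NAD_MOTIFS.items()
    }


# Hypothetical sequence containing one copy of each motif.
toy_sequence = "MKGAGGSIGSELAAGIGTTGK"
print(find_nad_motifs(toy_sequence))
```

An empty list for a motif corresponds to the situation described for the first motif, which is reported in only some members of the family.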
The C-terminal 300 amino acids of these proteins have identity with dTDP-glucose 4,6-dehydratases (Bechthold et al., 1995; Linton et al., 1995). These proteins are usually associated with synthesizing amino 6-deoxy and dideoxy sugars involved in LPS synthesis or extracellular polysaccharides and probably perform multiple functions to account for the 3 domains. LpsK was not found linked to the other O-antigen synthesis genes. This may indicate that LpsK is involved in the synthesis of a core sugar, possibly the terminal core sugar. Interruption of this gene may prevent attachment of the O-antigen to the core, resulting in the observed shedding phenotype.

[Figure 6-10: sequence alignment of LpsK with RfbV, WlaL and WlbL]

Figure 6-10. ClustalW alignment of LpsK. The first NAD binding motif is underlined. The second NAD motif is boxed. Only RfbV and WlbL contain the first motif. Only a partial sequence of LpsK has been deduced and the alignment is truncated after the LpsK sequence. RfbV - V. cholerae (Accession Y07788), WlaL - C. jejuni (CAA72360), WlbL - B. pertussis (X90711).

LpsL may be a glycosyltransferase. The mutant F26 has an insertion in lpsL. This gene is 5' to an ORF with similarity to exsG, which was implicated in extracellular polysaccharide synthesis (Becker et al., 1995). The LpsL amino acid sequence is 29.8% identical and 51.7% similar over a range of 87 amino acids to a putative rhamnosyl transferase in Streptococcus pneumoniae (Table 6-4). Rhamnose is a 6-deoxy derivative of mannose, as is perosamine, suggesting that LpsL may be another perosamine transferase.

The functions of some of the Tn5-interrupted genes are still unidentified. The Tn5 insertions F22 and F10 interrupt an ORF with no identity to any known protein. However, 5' of this ORF is an ORF corresponding to an ABC-2 transporter. These transporters are known to transport extracellular polysaccharides and O-antigens through the cytoplasmic membrane (Whitfield, 1995). Unlike the ABC transporters of the type I secretion systems, the ABC and transmembrane domains consist of separate proteins. It is possible that the ORF interrupted by F10 and F22 represents the transmembrane protein part of the ABC-2 transporter, but hydropathy analysis does not suggest that this protein contains transmembrane segments. The ABC-2 transporters are often found adjacent to genes involved in polysaccharide synthesis; it may therefore be that the ORF interrupted by the F10 and F22 mutants is also involved in polysaccharide synthesis. The Tn5 insertion F6 interrupts orf1, which has similarity to a chemotaxis receptor (Ward et al., 1995).
CheY, a chemotaxis regulator, is found linked to a number of O-antigen synthesis genes with similarity to lpsJ, lpsB, lpsC and lpsK in C. jejuni. It may be that the genes involved in chemotaxis are found close to the O-antigen synthesis genes in C. crescentus and that the F6 insertion has a polar effect on downstream S-LPS genes. It is also possible that this ORF has nothing to do with LPS synthesis and that the Tn5 insertion does not cause the shedding phenotype; instead, a second mutation may cause the altered phenotype.

Summary

As stated at the beginning of the chapter, it seems likely that the S-LPS of C. crescentus is a fixed-length homopolymer of approximately forty 4,6-dideoxy-4-amino-hexose residues. Proton NMR anomeric traces suggest that the linkages between the hexose residues may not all be identical. Several of the genes discussed in this chapter are similar to genes involved in the synthesis of perosamine in V. cholerae and E. coli O157 (Stroeher et al., 1995; Wang and Reeves, 1998), and as perosamine is a 4,6-dideoxy-4-amino-hexose, it seems likely that the O-antigen of C. crescentus consists of perosamine residues. All of the enzymes responsible for perosamine synthesis can be found in the lps genes listed above. Four enzymes are involved in converting fructose-6-phosphate to perosamine (Fig. 6-11). The first enzyme in the pathway described by Stroeher et al. (1995) is a phosphomannoisomerase, RfbA. Mutants F25 and F14 are located in lpsH, whose product has significant similarity to RfbA. The second step in the pathway is performed by the enzyme RfbB, a phosphomannomutase. Two Tn5 mutants, F9 and F23, are in the gene for LpsG, an enzyme with considerable similarity to RfbB. The third step in the pathway is again catalyzed by RfbA, acting this time as a mannose-1-phosphate guanyltransferase. RfbD, a GDP-mannose 4,6-dehydratase, catalyses the fourth reaction. No Tn5 insertion has been found in a gene with similarity to RfbD, but the coding sequence of the C.
crescentus gene immediately 3' of rsaE, lpsA, shows considerable similarity to RfbD. The last step of the process requires RfbE, the perosamine synthetase. LpsC presumably fulfills this role in C. crescentus, and the shedding mutant F2 has a Tn5 insertion in the lpsC gene.

[Figure 6-11: proposed perosamine synthesis pathway]

Two more genes need to be considered as part of the perosamine pathway in C. crescentus (Fig. 6-11). Bacteria using the Embden-Meyerhof-Parnas pathway require phosphoglucoisomerase as part of the pathway leading into the bottom half of glycolysis, but C. crescentus uses the Entner-Doudoroff glycolytic pathway instead (Riley and Kolodziej, 1976) and as such would not normally be expected to have the enzyme phosphoglucoisomerase for converting glucose-6-phosphate to fructose-6-phosphate. However, C. crescentus requires phosphoglucoisomerase if it makes perosamine by the pathway described here (Fig. 6-11). None of the Tn5 hits were found in such a gene, so the TIGR Caulobacter genome was searched for a phosphoglucoisomerase analogue and one was found in contig gcc_2205. A second enzyme, glucokinase, is required for converting glucose to glucose-6-phosphate. A glucokinase analogue was found next to the F1 Tn5 insertion in the potential repressor lpsI. From the position of lpsI it may be deduced that LpsI has a regulatory effect on the synthesis of glucokinase.
Interruption of lpsI by the F1 insertion may alter the expression of glucokinase, which in turn would affect perosamine synthesis, resulting in the phenotype seen in the F1 mutant (less O-antigen). These data suggest that C. crescentus contains all the genes necessary for the synthesis of perosamine. Furthermore, 5 separate Tn5 insertions in 3 of the ORFs cause loss of O-antigen synthesis, strengthening the argument that perosamine makes up the O-antigen of the S-LPS.

Six of the Tn5 insertions appear to be in glycosyltransferases (lpsD, lpsE, lpsF, lpsJ and lpsL) (Fig. 6-4). This would be expected, since proton NMR suggests there are a number of different linkages between the sugars in the O-antigen. The similarities of LpsJ to galactosyltransferases, which transfer the initial sugar to the lipid precursor, suggest that LpsJ may carry out the first addition of a sugar to the undecaprenol phosphate. The S-LPS chemical composition suggests that this first sugar is a perosamine, but it is possible that it is galactose. Galactose is found in the core, and it is possible that traces found during analysis of the O-antigen would be attributed to contamination from the core. LpsK may be involved in the synthesis of a sugar residue. As all the enzymes for the synthesis of perosamine are accounted for in the other lps genes, LpsK may synthesize an unidentified sugar in the O-antigen (possibly an initial galactose linked by LpsJ) or a sugar in the LPS core.

O-antigens are elongated at either the reducing terminus or the non-reducing terminus. If the O-antigen elongates at the reducing terminus, individual sugars are 'flipped' across the cytoplasmic membrane by a flippase enzyme and the O-antigen is assembled in the periplasm.
If synthesis of the O-antigen occurs at the non-reducing terminus, the chain elongates in the cytoplasm and an ABC-2 transporter is required to transport the O-antigen chain across the cytoplasmic membrane (Whitfield, 1995). If the ABC-2 transporter upstream of the F10 and F22 insertions is involved in the transport of the O-antigen, it suggests that the O-antigen is elongated by polymerization at the non-reducing terminus. The O-antigen would then be transported through the cytoplasmic membrane by the ABC-2 transporter, after which it would be transferred to the lipid A core.

While it has not been proven that any of the ORFs listed here are required for O-antigen synthesis, the presence of multiple Tn5 insertions in some of the ORFs confirms that the Tn5 is responsible for causing the defective S-LPS phenotype and that the interrupted ORF is very likely a gene involved in making the S-LPS.

Chapter 7
Conclusions and Future Considerations

The attachment and secretion of the S-layer appear to be linked, although RsaA can be secreted even when the S-LPS is defective and the S-layer cannot attach to the surface. While searching for the secretion components, genes involved in the synthesis and assembly of the S-LPS were found linked to the transport complex. In prokaryotes, genetic linkage often implies linkage of function. In this case, the most obvious link is that the S-LPS is required for attachment of the S-layer. Since C. crescentus is a non-pathogenic bacterium, the only apparent function for the S-LPS is to allow attachment of the S-layer to the outer membrane. As such, it seems likely that the bacterium coordinates production of the S-layer and S-LPS and that clustering of the genes allows better control. Similar linkages between the S-LPS and S-layer translocation have been found in Acinetobacter sp. and Aeromonas salmonicida (Belland and Trust, 1985; Thorne et
A linkage between type I secretion systems and S - L P S has also been found. Three genes involved in the synthesis of the smooth L P S have also been implicated in the secretion of a-hemolysin from E. coli (Stanley et al., 1993; Wandersman and Letoffe, 1993). It is suspected that these genes are required for the proper insertion of the O M P component in the outer membrane.  RsaA is secreted by a type I secretion mechanism. All three main components of this system have been found and all are linked to the rsaA gene although the O M P gene is separated from the others by 5 kb. These genes are similar to a number of other type I secretion mechanisms. The highest similarity was found to systems secreting proteases and lipases from P. aeruginosa, S. marcescens.  E. chrysanthemi  and  The identity between these systems is high enough that the  proteases, AprA and PrtB, were successfully secreted by the R s a A  transport  machinery. The genetic arrangement of the R s a A transporter genes is unusual. Typically, either all three genes are on either side of the substrate gene or the O M P gene is unlinked to the rest of the genes. In the RsaA transport system, 5 genes are found between the M F P and the O M P , an arrangement that has not been found  93  before. These 5 genes appear to be required for the synthesis of the O-antigen. Another unusual finding was the presence of a homologous O R F of the O M P component found elsewhere in the genome. This homologue has 60% identity to rsaF, but is not required for the secretion of RsaA. The function of this homologue remains to be discovered or even if the gene produces a functional protein. RsaA accounts for a large portion of the cellular protein (10 to 12%). A s far as can be determined, the RsaA secretion machinery secretes a larger fraction of total cell protein than any other known type I secretion mechanism. 
This high level of protein production is apparently necessary to keep the cell completely covered with S-layer at all times and is similar to the levels noted for other bacterial S-layer proteins (Messner and Sleytr, 1992). This means that the RsaA secretion machinery is more efficient than that of other type I secretion systems, that a larger number of transport complexes exist in the membranes, or that both factors apply. This question is an important one to answer from a fundamental research perspective, to address such things as what makes a secretion apparatus more efficient. It is also important because some current research is evaluating the potential of the S-layer protein secretion system for the secretion of heterologous proteins and peptides in a biotechnological context (Bingle et al., 1997a; Bingle et al., 1997b), where increased levels of secretion have obvious utility. Now that the genes involved in the transport of RsaA have been discovered, it will be possible to address such issues. For example, gene duplications of the transporter genes can be made to see if more copies of the transporter components increase secretion. In addition, with the genes in hand it will be possible to produce and isolate the individual components and make antisera against them. Antibodies can then be used to assess the amount of each protein present in the cell.

Most of the genes involved in O-antigen synthesis are linked to the transporter genes. In addition to the O-antigen synthesis genes mentioned above, a number of other genes involved in O-antigen synthesis have been found by Tn5 mutagenesis. While the linkage pattern of these genes was not as obvious, Southern blot analysis showed that the majority of the Tn5 insertions found were linked to the transporter genes as well. However, it was not demonstrated that all of the Tn5 insertions were
As the Southern analysis of the Tn5 insertions used only two restriction enzymes, further analysis may prove that these other genes are also linked. Usually, all the genes involved in the synthesis of the O-antigen are linked on a 20-30 kb fragment of DNA. Sequencing further, past lpsF, should reveal other genes involved in O-antigen synthesis, possibly including genes not found here by Tn5 mutagenesis.

Perosamine appears to be the major component of the O-antigen. Analysis of the O-antigen showed that it is composed of a 4,6-dideoxy-4-amino-hexose, of which perosamine is an example. It was shown in this report that all the genes required for the synthesis of perosamine are found in the genome of C. crescentus. Furthermore, three of these genes were disrupted by transposon mutagenesis, leading to an altered O-antigen. It is reasonable to conclude from these data that perosamine is the 4,6-dideoxy-4-amino-hexose seen in the chemical analysis of the O-antigen.

Several glycosyltransferases are involved in the synthesis of the O-antigen. NMR analysis of the O-antigen revealed a number of different anomeric proton signals, suggesting that there are several different linkages between the sugar residues. This implies the presence of multiple glycosyltransferases to produce these linkages. A number of Tn5 insertions altering the O-antigen were found in genes with similarity to mannosyltransferases. Since perosamine is a derivative of mannose, the transferases are probably highly similar, as has been found with the perosamine transporter, RfbV from E. coli O157 (see Ch. 6). One Tn5 insertion interrupts a gene with similarity to galactosyltransferases that transfer the first sugar residue to the lipid precursor of the O-antigen. It may be that this enzyme, LpsJ, transfers a galactose to the lipid precursor as a first step in the growing O-antigen.
Alternatively, since perosamine is an isomer of galactose, a perosamine may be the first residue of the O-antigen chain. Galactose may have been missed in the analysis of the O-antigen since it is also found in the core, and a slightly increased level, relative to other core sugars, would have gone unnoticed.

Several other genes involved in the proper formation of the smooth LPS have also been found. One, lpsK, may be involved in synthesis of a core or O-antigen sugar. Another, lpsI, appears to code for a transcriptional repressor that affects smooth LPS production. Tn5 insertions interrupting O-antigen synthesis were found in two ORFs with no similarity to any known proteins. Two of these insertions are 3' of an ORF coding for an ABC-2 transporter. ABC-2 transporters export O-antigens and extracellular polysaccharides. If this is the ABC-2 transporter that exports the O-antigen, it suggests that the O-antigen is synthesized in the cytoplasm by addition of sugar residues to the non-reducing terminus. The information provided here should assist in determining the correct structure of the S-LPS and may also allow the attachment site(s) of the O-antigen to RsaA to be determined. A number of possibilities present themselves for future steps in the analysis of the S-LPS. The first obvious step is to isolate the DNA containing the genes lpsGHIJKL and determine how closely they are linked. Sequencing of this DNA may reveal other genes involved in O-antigen synthesis and possibly synthesis of the core (for example, LpsK may be involved in synthesis of a core sugar and the DNA surrounding its gene may contain the remaining synthesis genes). The other obvious experiment is to knock out lpsA and lpsB and confirm that they are involved in the synthesis of the O-antigen. There may be more genes involved in the synthesis of the O-antigen that were not found when screening the Tn5 library.
For example, interruption of O-antigen synthesis genes that did not result in complete detachment of the S-layer may have been missed by the screen. An example of this might be enzymes involved in the transfer of sugar residues that are not involved in the attachment process. The S-layer lies very close to the outer membrane of the bacterium, as seen in electron micrographs (Smit et al., 1981; Smit et al., 1984). If the O-antigen consisted of a single chain, it would be 40 residues long, long enough to span the distance between the S-layer and outer membrane numerous times. This suggests that either the S-layer attaches to several points along the chain (Fig. 1-4) or the O-antigen has multiple branches. Selective mutation of the various transferases, or use of the Tn5 mutants, should allow one to determine which of these possibilities is correct by analyzing the different sized O-antigens that are produced.

Summary

RsaA, the S-layer subunit of C. crescentus, is transported by a type I secretion system involving three proteins: an ABC transporter, a periplasm-spanning membrane fusion protein (MFP) and an outer membrane protein (OMP). It was shown that a number of other FWC species also contain type I secretion systems that probably secrete the S-layer subunit. The evolutionary relationships of these type I secretion systems and the S-layer subunit genes were examined.

A number of genes involved in the synthesis of the smooth LPS were found. Some of these genes code for enzymes involved in the synthesis of perosamine, the likely major component of the O-antigen. Other genes code for the glycosyltransferases that link the sugar residues of the O-antigen to each other.

Bibliography

Abraham, W., Strompl, C., Meyer, H., Lindholst, S., Moore, E.R.B., Christ, R., Vancanneyt, M., Tindall, B.J., Bennasar, A., Smit, J. and Tesar, M. (1999) Phylogeny and polyphasic taxonomy of Caulobacter species. Proposal of Maricaulis gen. nov. with Maricaulis maris (Poindexter) comb. nov. as the type species and emended description of the genera Brevundimonas and Caulobacter. Int. J. Syst. Bacteriol., 49, 1053-1073.
Akatsuka, H., Binet, R., Kawai, E., Wandersman, C. and Omori, K. (1997) Lipase secretion by bacterial hybrid ATP-binding cassette exporters: molecular recognition of the LipBCD, PrtDEF, and HasDEF exporters. J. Bacteriol., 179, 4754-4760.
Alley, M.R., Gomes, S.L., Alexander, W. and Shapiro, L. (1991) Genetic analysis of a temporally transcribed chemotaxis gene cluster in Caulobacter crescentus. Genetics, 129, 333-341.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403-410.
Anderson, D.M. and Schneewind, O. (1997) A mRNA signal for the type III secretion of Yop proteins by Yersinia enterocolitica. Science, 278, 1140-1143.
Anderson, D.M. and Schneewind, O. (1999) Type III machines of Gram-negative pathogens: injecting virulence factors into host cells and more. Curr. Opin. Microbiol., 2, 18-24.
Annunziato, P.W., Wright, L.F., Vann, W.F. and Silver, R.P. (1995) Nucleotide sequence and genetic analysis of the neuD and neuB genes in region 2 of the polysialic acid gene cluster of Escherichia coli K1. J. Bacteriol., 177, 312-319.
Armstrong, S., Zhang, H., Tabernero, L., Hermodson, M. and Stauffacher, C. (1999) Powering the ABC transporter: The crystallographic structure of the ATP-binding cassette, RbsA. ATP-Binding Cassette Transporters: From Multidrug Resistance to Genetic Disease. FEBS Advanced Lecture Course, Gosau, Austria, p. 3.
Awram, P. and Smit, J. (1998) The Caulobacter crescentus paracrystalline S-layer protein is secreted by an ABC transporter (type I) secretion apparatus. J. Bacteriol., 180, 3062-3069.
Bairoch, A. (1992) PROSITE: a dictionary of sites and patterns in proteins. Nuc.
Acids Res., 20 Suppl, 2013-2018.
Bechthold, A., Sohng, J.K., Smith, T.M., Chu, X. and Floss, H.G. (1995) Identification of Streptomyces violaceoruber Tu22 genes involved in the biosynthesis of granaticin. Mol. Gen. Genet., 248, 610-620.
Becker, A., Kuster, H., Niehaus, K. and Puhler, A. (1995) Extension of the Rhizobium meliloti succinoglycan biosynthesis gene cluster: identification of the exsA gene encoding an ABC transporter protein, and the exsB gene which probably codes for a regulator of succinoglycan biosynthesis. Mol. Gen. Genet., 249, 487-497.
Belland, R.J. and Trust, T.J. (1985) Synthesis, export, and assembly of Aeromonas salmonicida A-layer analyzed by transposon mutagenesis. J. Bacteriol., 163, 877-881.
Beveridge, T.J., Pouwels, P.H., Sara, M., Kotiranta, A., Lounatmaa, K., Kari, K., Kerosuo, E., Haapasalo, M., Egelseer, E.M., Schocher, I., Sleytr, U.B., Morelli, L., Callegari, M.L., Nomellini, J.F., Bingle, W.H., Smit, J., Leibovitz, E., Lemaire, M., Miras, I., Salamitou, S., Beguin, P., Ohayon, H., Gounon, P., Matuschek, M. and Koval, S.F. (1997) Functions of S-layers. FEMS Microbiol. Rev., 20, 99-149.
Bik, E.M., Bunschoten, A.E., Willems, R.J., Chang, A.C. and Mooi, F.R. (1996) Genetic organization and functional analysis of the otn DNA essential for cell-wall polysaccharide synthesis in Vibrio cholerae O139. Mol. Microbiol., 20, 799-811.
Binet, R., Letoffe, S., Ghigo, J.M., Delepelaire, P. and Wandersman, C. (1997) Protein secretion by Gram-negative bacterial ABC exporters - a review. Gene, 192, 7-11.
Binet, R. and Wandersman, C. (1995) Protein secretion by hybrid bacterial ABC transporters: specific functions of the membrane ATPase and the membrane fusion protein. EMBO J., 14, 2298-2306.
Binet, R. and Wandersman, C. (1996) Cloning of the Serratia marcescens hasF gene encoding the Has ABC exporter outer membrane component: a TolC analogue. Mol. Microbiol., 22, 265-273.
Bingle, W.H., Awram, P., Nomellini, J.F. and Smit, J. (1999) The Secretion Signal of C. crescentus S-layer Protein is Located Within the C-Terminal 82 Amino Acids of the Molecule. Submitted, J. Bacteriol.
Bingle, W.H., Le, K.D. and Smit, J. (1996) The extreme N-terminus of the Caulobacter crescentus surface-layer protein directs export of passenger proteins from the cytoplasm but is not required for secretion of the native protein. Can. J. Microbiol., 42, 672-684.
Bingle, W.H., Nomellini, J.F. and Smit, J. (1997a) Cell-surface display of a Pseudomonas aeruginosa strain K pilin peptide within the paracrystalline S-layer of Caulobacter crescentus. Mol. Microbiol., 26, 277-288.
Bingle, W.H., Nomellini, J.F. and Smit, J. (1997b) Linker mutagenesis of the Caulobacter crescentus S-layer protein: toward a definition of an N-terminal anchoring region and a C-terminal secretion signal and the potential for heterologous protein secretion. J. Bacteriol., 179, 601-611.
Bingle, W.H. and Smit, J. (1994) Alkaline phosphatase and a cellulase reporter protein are not exported from the cytoplasm when fused to large N-terminal portions of the Caulobacter crescentus surface (S)-layer protein. Can. J. Microbiol., 40, 777-782.
Blaser, M.J., Smith, P.F. and Kohler, P.F. (1985) Susceptibility of Campylobacter isolates to the bactericidal activity of human serum. J. Infect. Dis., 151, 227-235.
Blaser, M.J., Smith, P.F., Repine, J.E. and Joiner, K.A. (1988) Pathogenesis of Campylobacter fetus infections. Failure of encapsulated Campylobacter fetus to bind C3b explains serum and phagocytosis resistance. J. Clin. Invest., 81, 1434-1444.
Boos, W. and Shuman, H. (1998) Maltose/maltodextrin system of Escherichia coli: transport, metabolism, and regulation. Microbiol. Mol. Biol. Rev., 62, 204-229.
Boot, H.J. and Pouwels, P.H. (1996) Expression, secretion and antigenic variation of bacterial S-layer proteins. Mol. Microbiol., 21, 1117-1123.
Borinski, R.
and Holt, S.C. (1990) Surface characteristics of Wolinella recta ATCC 33238 and human clinical isolates: correlation of structure with function. Infect. Immun., 58, 2770-2776.
Brent, R. and Ptashne, M. (1980) The lexA gene product represses its own promoter. Proc. Natl. Acad. Sci. U.S.A., 77, 1932-1936.
Brun, Y.V., Marczynski, G. and Shapiro, L. (1994) The expression of asymmetry during Caulobacter cell differentiation. Annu. Rev. Biochem., 63, 419-450.
Burns, D.L. (1999) Biochemistry of type IV secretion. Curr. Opin. Microbiol., 2, 25-29.
Canter Cremers, H., Spaink, H.P., Wijfjes, A.H., Pees, E., Wijffelman, C.A., Okker, R.J. and Lugtenberg, B.J. (1989) Additional nodulation genes on the Sym plasmid of Rhizobium leguminosarum biovar viciae. Plant Mol. Biol., 13, 163-174.
Colnaghi, R., Pagani, S., Kennedy, C. and Drummond, M. (1996) Cloning, sequence analysis and overexpression of the rhodanese gene of Azotobacter vinelandii. Eur. J. Biochem., 236, 240-248.
Croop, J.M. (1998) Evolutionary relationships among ABC transporters. Methods Enzymol., 292, 101-116.
Currie, H.L., Lightfoot, J. and Lam, J.S. (1995) Prevalence of gca, a gene involved in synthesis of A-band common antigen polysaccharide in Pseudomonas aeruginosa. Clinical and Diagnostic Lab. Immun., 2, 554-562.
Davidson, A.L. and Nikaido, H. (1991) Purification and characterization of the membrane-associated components of the maltose transport system from Escherichia coli. J. Biol. Chem., 266, 8946-8951.
Decottignies, A. and Goffeau, A. (1997) Complete inventory of the yeast ABC proteins. Nature Genet., 15, 137-145.
Delepelaire, P. and Wandersman, C. (1990) Protein secretion in gram-negative bacteria. The extracellular metalloprotease B from Erwinia chrysanthemi contains a C-terminal secretion signal analogous to that of Escherichia coli alpha-hemolysin. J. Biol. Chem., 265, 17118-17125.
Delepelaire, P. and Wandersman, C.
(1991) Characterization, localization and transmembrane organization of the three proteins PrtD, PrtE and PrtF necessary for protease secretion by the gram-negative bacterium Erwinia chrysanthemi. Mol. Microbiol., 5, 2427-2434.
Dinh, T., Paulsen, I.T. and Saier, M.H., Jr. (1994) A family of extracytoplasmic proteins that allow transport of large molecules across the outer membranes of gram-negative bacteria. J. Bacteriol., 176, 3825-3831.
Drummelsmith, J. and Whitfield, C. (1999) Gene products required for surface expression of the capsular form of the group 1 K antigen in Escherichia coli (O9a:K30). Mol. Microbiol., 31, 1321-1332.
Duong, F., Lazdunski, A., Cami, B. and Murgier, M. (1992) Sequence of a cluster of genes controlling synthesis and secretion of alkaline protease in Pseudomonas aeruginosa: relationships to other secretory pathways. Gene, 121, 47-54.
Duong, F., Lazdunski, A. and Murgier, M. (1996) Protein secretion by heterologous bacterial ABC-transporters: the C-terminal secretion signal of the secreted protein confers high recognition specificity. Mol. Microbiol., 21, 459-470.
Dworkin, J., Tummuru, M.K.R. and Blaser, M.J. (1995) A lipopolysaccharide-binding domain of the Campylobacter fetus S-layer protein resides within the conserved N-terminus of a family of silent and divergent homologs. J. Bacteriol., 177, 1734-1741.
Edwards, P. and Smit, J. (1991) A transducing bacteriophage for Caulobacter crescentus uses the paracrystalline surface layer protein as a receptor. J. Bacteriol., 173, 5568-5572.
Ehrmann, M., Ehrle, R., Hofmann, E., Boos, W. and Schlosser, A. (1998) The ABC maltose transporter. Mol. Microbiol., 29, 685-694.
Eichelberg, K., Ginocchio, C.C. and Galan, J.E. (1994) Molecular and functional characterization of the Salmonella typhimurium invasion genes invB and invC: homology of InvC to the F0F1 ATPase family of proteins. J. Bacteriol., 176, 4501-4510.
Fallarino, A., Mavrangelos, C., Stroeher, U.H.
and Manning, P.A. (1997) Identification of additional genes required for O-antigen biosynthesis in Vibrio cholerae O1. J. Bacteriol., 179, 2147-2153.
Fath, M.J., Skvirsky, R.C. and Kolter, R. (1991) Functional complementation between bacterial MDR-like export systems: colicin V, alpha-hemolysin, and Erwinia protease. J. Bacteriol., 173, 7549-7556.
Fellay, R., Frey, J. and Krisch, H. (1987) Interposon mutagenesis of soil and water bacteria: a family of DNA fragments designed for in vitro insertional mutagenesis of gram-negative bacteria. Gene, 52, 147-154.
Feng, J.N., Russel, M. and Model, P. (1997) A permeabilized cell system that assembles filamentous bacteriophage. Proc. Natl. Acad. Sci. U.S.A., 94, 4068-4073.
Finnie, C., Zorreguieta, A., Hartley, N.M. and Downie, J.A. (1998) Characterization of Rhizobium leguminosarum exopolysaccharide glycanases that are secreted via a type I exporter and have a novel heptapeptide repeat motif. J. Bacteriol., 180, 1691-1699.
Fisher, J.A., Smit, J. and Agabian, N. (1988) Transcriptional analysis of the major surface array gene of Caulobacter crescentus. J. Bacteriol., 170, 4706-4713.
Fry, B.N., Korolik, V., ten Brinke, J.A., Pennings, M.T., Zalm, R., Teunis, B.J., Coloe, P.J. and van der Zeijst, B.A. (1998) The lipopolysaccharide biosynthesis locus of Campylobacter jejuni 81116. Microbiology, 144, 2049-2061.
Galan, J.E. and Collmer, A. (1999) Type III secretion machines: bacterial devices for protein delivery into host cells. Science, 284, 1322-1328.
Geremia, R.A., Petroni, E.A., Ielpi, L. and Henrissat, B. (1996) Towards a classification of glycosyltransferases based on amino acid sequence similarities: prokaryotic alpha-mannosyltransferases. Biochem. J., 318, 133-138.
Gilchrist, A., Fisher, J.A. and Smit, J. (1992) Nucleotide sequence analysis of the gene encoding the Caulobacter crescentus paracrystalline surface layer protein. Can. J. Microbiol., 38, 193-202.
Gilchrist, A. and Smit, J.
(1991) Transformation of freshwater and marine caulobacters by electroporation. J. Bacteriol., 173, 921-925.
Gober, J.W. and Marques, M.V. (1995) Regulation of cellular differentiation in Caulobacter crescentus. Microbiol. Rev., 59, 31-47.
Gorbalenya, A.E. and Koonin, E.V. (1990) Superfamily of UvrA-related NTP-binding proteins. Implications for rational classification of recombination/repair systems. J. Mol. Biol., 213, 583-591.
Guzzo, J., Murgier, M., Filloux, A. and Lazdunski, A. (1990) Cloning of the Pseudomonas aeruginosa alkaline protease gene and secretion of the protease into the medium by Escherichia coli. J. Bacteriol., 172, 942-948.
Henkin, T.M., Grundy, F.J., Nicholson, W.L. and Chambliss, G.H. (1991) Catabolite repression of alpha-amylase gene expression in Bacillus subtilis involves a trans-acting gene product homologous to the Escherichia coli lacI and galR repressors. Mol. Microbiol., 5, 575-584.
Holland, I.B. (1999) personal communication.
Holton, T.A. and Graham, M.W. (1991) A simple and efficient method for direct cloning of PCR products using ddT-tailed vectors. Nuc. Acids Res., 19, 1156.
Hovmoller, S., Sjogren, A. and Wang, D.N. (1988) The structure of crystalline bacterial surface layers. Prog. Biophys. Mol. Biol., 51, 131-163.
Hung, L.W., Wang, I.X., Nikaido, K., Liu, P.Q., Ames, G.F. and Kim, S.H. (1998) Crystal structure of the ATP-binding subunit of an ABC transporter. Nature, 396, 703-707.
Hwang, J., Zhong, X. and Tai, P.C. (1997) Interactions of dedicated export membrane proteins of the colicin V secretion system: CvaA, a member of the membrane fusion protein family, interacts with CvaB and TolC. J. Bacteriol., 179, 6264-6270.
Hyde, S.C., Emsley, P., Hartshorn, M.J., Mimmack, M.M., Gileadi, U., Pearce, S.R., Gallagher, M.P., Gill, D.R., Hubbard, R.E. and Higgins, C.F.
(1990) Structural model of ATP-binding proteins associated with cystic fibrosis, multidrug resistance and bacterial transport. Nature, 346, 362-365.
Kawai, E., Akatsuka, H., Idei, A., Shibatani, T. and Omori, K. (1998) Serratia marcescens S-layer protein is secreted extracellularly via an ATP-binding cassette exporter, the Lip system. Mol. Microbiol., 27, 941-952.
Keen, N.T., Tamaki, S., Kobayashi, D. and Trollinger, D. (1988) Improved broad-host-range plasmids for DNA cloning in gram-negative bacteria. Gene, 70, 191-197.
Kenny, B., Taylor, S. and Holland, I.B. (1992) Identification of individual amino acids required for secretion within the haemolysin (HlyA) C-terminal targeting region. Mol. Microbiol., 6, 1477-1489.
Kido, N., Sugiyama, T., Yokochi, T., Kobayashi, H. and Okawa, Y. (1998) Synthesis of Escherichia coli O9a polysaccharide requires the participation of two domains of WbdA, a mannosyltransferase encoded within the wb* gene cluster. Mol. Microbiol., 27, 1213-1221.
Koronakis, V., Hughes, C. and Koronakis, E. (1993) ATPase activity and ATP/ADP-induced conformational change in the soluble domain of the bacterial protein translocator HlyB. Mol. Microbiol., 8, 1163-1175.
Koronakis, V., Li, J., Koronakis, E. and Stauffer, K. (1997) Structure of TolC, the outer membrane component of the bacterial type I efflux system, derived from two-dimensional crystals. Mol. Microbiol., 23, 617-626.
Kovach, M.E., Phillips, R.W., Elzer, P.H., Roop, R.M. and Peterson, K.M. (1994) pBBR1MCS: a broad-host-range cloning vector. Biotechniques, 16, 800-802.
Koval, S.F. and Hynes, S.H. (1991) Effect of paracrystalline protein surface layers on predation by Bdellovibrio bacteriovorus. J. Bacteriol., 173, 2244-2249.
Koval, S.F. and Murray, R.G. (1984) The isolation of surface array proteins from bacteria. Can. J. Biochem. Cell Biol., 62, 1181-1189.
Kubori, T., Matsushima, Y., Nakamura, D., Uralil, J., Lara-Tejero, M., Sukhan, A., Galan, J.E. and Aizawa, S.I. (1998) Supramolecular structure of the Salmonella typhimurium type III protein secretion system. Science, 280, 602-605.
Leeds, J.A. and Welch, R.A. (1996) RfaH enhances elongation of Escherichia coli hlyCABD mRNA. J. Bacteriol., 178, 1850-1857.
Letellier, L., Howard, S.P. and Buckley, J.T. (1997) Studies on the energetics of proaerolysin secretion across the outer membrane of Aeromonas species. Evidence for a requirement for both the protonmotive force and ATP. J. Biol. Chem., 272, 11109-11113.
Letoffe, S., Delepelaire, P. and Wandersman, C. (1990) Protease secretion by Erwinia chrysanthemi: The specific secretion functions are analogous to those of Escherichia coli alpha-haemolysin. EMBO J., 9, 1375-1382.
Letoffe, S., Ghigo, J.M. and Wandersman, C. (1994a) Iron acquisition from heme and hemoglobin by a Serratia marcescens extracellular protein. Proc. Natl. Acad. Sci. U.S.A., 91, 9876-9880.
Letoffe, S., Ghigo, J.M. and Wandersman, C. (1994b) Secretion of the Serratia marcescens HasA protein by an ABC transporter. J. Bacteriol., 176, 5372-5377.
Letoffe, S. and Wandersman, C. (1992) Secretion of CyaA-PrtB and HlyA-PrtB fusion proteins in Escherichia coli: Involvement of the glycine-rich repeat domain of Erwinia chrysanthemi protease B. J. Bacteriol., 174, 4920-4927.
Linton, K.J. and Higgins, C.F. (1998) The Escherichia coli ATP-binding cassette (ABC) proteins. Mol. Microbiol., 28, 5-13.
Linton, K.J., Jarvis, B.W. and Hutchinson, C.R. (1995) Cloning of the genes encoding thymidine diphosphoglucose 4,6-dehydratase and thymidine diphospho-4-keto-6-deoxyglucose 3,5-epimerase from the erythromycin-producing Saccharopolyspora erythraea. Gene, 153, 33-40.
Liu, D., Haase, A.M., Lindqvist, L., Lindberg, A.A. and Reeves, P.R.
(1993) Glycosyl transferases of O-antigen biosynthesis in Salmonella enterica: identification and characterization of transferase genes of groups B, C2, and E1. J. Bacteriol., 175, 3408-3413.
Lu, H.M. and Lory, S. (1996) A specific targeting domain in mature exotoxin A is required for its extracellular secretion from Pseudomonas aeruginosa. EMBO J., 15, 429-436.
Luckevich, M.D. and Beveridge, T.J. (1989) Characterization of a dynamic S layer on Bacillus thuringiensis. J. Bacteriol., 171, 6656-6667.
Mackman, N., Nicaud, J.M., Gray, L. and Holland, I.B. (1985) Identification of polypeptides required for the export of haemolysin 2001 from E. coli. Mol. Gen. Genet., 201, 529-536.
MacRae, J.D. and Smit, J. (1991) Characterization of caulobacters isolated from wastewater treatment systems. Appl. Environ. Microbiol., 57, 751-758.
Malakooti, J., Wang, S.P. and Ely, B. (1995) A consensus promoter sequence for Caulobacter crescentus genes involved in biosynthetic and housekeeping functions. J. Bacteriol., 177, 4372-4376.
Martin, V.J. and Mohn, W.W. (1999) An alternative inverse PCR (IPCR) method to amplify DNA sequences flanking Tn5 transposon insertions. J. Microbiol. Methods, 35, 163-166.
Maser, P. and Kaminsky, R. (1998) Identification of three ABC transporter genes in Trypanosoma brucei spp. Parasitol. Res., 84, 106-111.
Mead, D.A., Szczesna-Skorupa, E. and Kemper, B. (1986) Single-stranded DNA 'blue' T7 promoter plasmids: a versatile tandem promoter system for cloning and protein engineering. Protein Eng., 1, 67-74.
Messner, P. and Sleytr, U.B. (1992) Crystalline bacterial cell-surface layers. In Rose, A.H. and Tempest, D.W. (eds.), Advances in Microbial Physiology. Academic Press, London, Vol. 33, pp. 213-275.
Morales, V.M., Backman, A. and Bagdasarian, M. (1991) A series of wide-host-range low-copy-number vectors that allow direct screening for recombinants. Gene, 97, 39-47.
Munn, C.
B., Ishiguro, E.E., Kay, W.W. and Trust, T.J. (1982) Role of surface components in serum resistance of virulent Aeromonas salmonicida. Infect. Immun., 36, 1069-1075.
Nielsen, H., Engelbrecht, J., Brunak, S. and von Heijne, G. (1997) Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng., 10, 1-6.
Nikaido, H. (1994) Maltose transport system of Escherichia coli: an ABC-type transporter. FEBS Lett., 346, 55-58.
Nikaido, H. and Vaara, M. (1985) Molecular basis of bacterial outer membrane permeability. Microbiol. Rev., 49, 1-32.
Nomellini, J.F., Kupcu, S., Sleytr, U.B. and Smit, J. (1997) Factors controlling in vitro recrystallization of the Caulobacter crescentus paracrystalline S-layer. J. Bacteriol., 179, 6349-6354.
Pearson, W.R., Wood, T., Zhang, Z. and Miller, W. (1997) Comparison of DNA sequences with protein sequences. Genomics, 46, 24-36.
Pohlner, J., Halter, R., Beyreuther, K. and Meyer, T.F. (1987) Gene structure and extracellular secretion of Neisseria gonorrhoeae IgA protease. Nature, 325, 458-462.
Poindexter, J.S. (1981) The caulobacters: ubiquitous unusual bacteria. Microbiol. Rev., 45, 123-179.
Pugsley, A.P. (1993) The complete general secretory pathway in gram-negative bacteria. Microbiol. Rev., 57, 50-108.
Ravenscroft, N., Walker, S.G., Dutton, G.G. and Smit, J. (1991) Identification, isolation, and structural studies of extracellular polysaccharides produced by Caulobacter crescentus. J. Bacteriol., 173, 5677-5684.
Ravenscroft, N., Walker, S.G., Dutton, G.S. and Smit, J.K. (1992) Identification, isolation, and structural studies of the outer membrane lipopolysaccharide of Caulobacter crescentus. J. Bacteriol., 174, 7595-7605.
Riley, R.G. and Kolodziej, B.J. (1976) Pathway of glucose catabolism in Caulobacter crescentus. Microbios, 16, 219-226.
Roberts, R.C., Mohr, C.D. and Shapiro, L.
(1996) Developmental programs in bacteria. Curr. Top. Dev. Biol., 34, 207-257.
Rocchetta, H.L., Burrows, L.L., Pacan, J.C. and Lam, J.S. (1998) Three rhamnosyltransferases responsible for assembly of the A-band D-rhamnan polysaccharide in Pseudomonas aeruginosa: a fourth transferase, WbpL, is required for the initiation of both A-band and B-band lipopolysaccharide synthesis [published erratum appears in Mol Microbiol 1998 Dec;30(5):1131]. Mol. Microbiol., 28, 1103-1119.
Russel, M. (1998) Macromolecular assembly and secretion across the bacterial cell envelope: type II protein secretion systems. J. Mol. Biol., 279, 485-499.
Salmond, G.P. and Reeves, P.J. (1993) Membrane traffic wardens and protein secretion in gram-negative bacteria. Trends in Biochem. Sci., 18, 7-12.
Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Sara, M. and Sleytr, U.B. (1996a) Biotechnology and biomimetic with crystalline bacterial cell surface layers (S-layers). Micron, 27, 141-156.
Sara, M. and Sleytr, U.B. (1996b) Crystalline bacterial cell surface layers (S-layers): from cell structure to biomimetics. Prog. Biophys. Mol. Biol., 65, 83-111.
Scheu, A.K., Economou, A., Hong, G.F., Ghelani, S., Johnston, A.W. and Downie, J.A. (1992) Secretion of the Rhizobium leguminosarum nodulation protein NodO by haemolysin-type systems. Mol. Microbiol., 6, 231-238.
Schnaitman, C.A. and Klena, J.D. (1993) Genetics of lipopolysaccharide biosynthesis in enteric bacteria. Microbiol. Rev., 57, 655-682.
Schulein, R., Gentschev, I., Schlor, S., Gross, R. and Goebel, W. (1994) Identification and characterization of two functional domains of the hemolysin translocator protein HlyD. Mol. Gen. Genet., 245, 203-211.
Shapiro, L. (1976) Differentiation in the Caulobacter cell cycle. Annu. Rev. Microbiol., 30, 377-407.
Shapiro, L.
and Losick, R. (1997) Protein localization and cell fate in bacteria. Science, 276, 712-718.
Sheps, J.A., Zhang, F. and Ling, V. (1996) Phylogenetic Analysis of Members of the ABC transporter superfamily. In Rothman, S.R. (ed.) Membrane Protein Transport. JAI Press, Greenwich, Connecticut, Vol. 3, p. 81.
Simon, R., Priefer, U. and Puhler, A. (1983) A broad host range mobilization system for in vivo genetic engineering: transposon mutagenesis in Gram negative bacteria. Bio/technology, 1, 784-790.
Sleytr, U.B. (1976) Self-assembly of the hexagonally and tetragonally arranged subunits of bacterial surface layers and their reattachment to cell walls. J. Ultrastruct. Res., 55, 360-377.
Sleytr, U.B., Bayley, H., Sara, M., Breitwieser, A., Kupcu, S., Mader, C., Weigert, S., Unger, F.M., Messner, P., Jahn-Schmid, B., Schuster, B., Pum, D., Douglas, K., Clark, N.A., Moore, J.T., Winningham, T.A., Levy, S., Frithsen, I., Pankovc, J., Beale, P., Gillis, H.P., Choutov, D.A. and Martin, K.P. (1997a) Applications of S layers. FEMS Microbiol. Rev., 20, 151-175.
Sleytr, U.B. and Messner, P. (1983) Crystalline surface layers on bacteria. Annu. Rev. Microbiol., 37, 311-339.
Sleytr, U.B. and Messner, P. (1988) Crystalline surface layers in procaryotes. J. Bacteriol., 170, 2891-2897.
Sleytr, U.B., Messner, P., Pum, D. and Sara, M. (1993) Crystalline bacterial cell surface layers. Mol. Microbiol., 10, 911-916.
Sleytr, U.B., Pum, D. and Sara, M. (1997b) Advances in S-layer nanotechnology and biomimetics. Adv. Biophys., 34, 71-79.
Sleytr, U.B. and Sara, M. (1997) Bacterial and archaeal S-layer proteins: structure-function relationships and their biotechnological applications. Trends Biotechnol., 15, 20-26.
Smit, J. and Agabian, N. (1984) Cloning of the major protein of the Caulobacter crescentus periodic surface layer: detection and characterization of the cloned peptide by protein expression assays. J. Bacteriol., 160, 1137-1145.
Smit, J., Engelhardt, H., Volker, S., Smith, S.H. and Baumeister, W. (1992) The S layer of Caulobacter crescentus: three-dimensional image reconstruction and structure analysis by electron microscopy. J. Bacteriol., 174, 6527-6538.
Smit, J., Grano, D.A., Glaeser, R.M. and Agabian, N. (1981) Periodic surface array in Caulobacter crescentus: fine structure and chemical analysis. J. Bacteriol., 146, 1135-1150.
Stahl, D.A., Key, R., Flesher, B. and Smit, J. (1992) The phylogeny of marine and freshwater caulobacters reflects their habitat. J. Bacteriol., 174, 2193-2198.
Stanley, P.L., Diaz, P., Bailey, M.J., Gygi, D., Juarez, A. and Hughes, C. (1993) Loss of activity in the secreted form of Escherichia coli haemolysin caused by an rfaP lesion in core lipopolysaccharide assembly. Mol. Microbiol., 10, 781-787.
Stevenson, G., Andrianopoulos, K., Hobbs, M. and Reeves, P.R. (1996) Organization of the Escherichia coli K-12 gene cluster responsible for production of the extracellular polysaccharide colanic acid. J. Bacteriol., 178, 4885-4893.
Stewart, M. and Beveridge, T.J. (1980) Structure of the regular surface layer of Sporosarcina ureae. J. Bacteriol., 142, 302-309.
Stroeher, U.H., Karageorgos, L.E., Brown, M.H., Morona, R. and Manning, P.A. (1995) A putative pathway for perosamine biosynthesis is the first function encoded within the rfb region of Vibrio cholerae O1. Gene, 166, 33-42.
Sugiyama, T., Kido, N., Kato, Y., Koide, N., Yoshida, T. and Yokochi, T. (1998) Generation of Escherichia coli O9a serotype, a subtype of E. coli O9, by transfer of the wb* gene cluster of Klebsiella O3 into E. coli via recombination. J. Bacteriol., 180, 2775-2778.
Sutton, J.M., Peart, J., Dean, G. and Downie, J.A. (1996) Analysis of the C-terminal secretion signal of the Rhizobium leguminosarum nodulation protein NodO; a potential system for the secretion of heterologous proteins during nodule invasion. Mol.
Plant Microbe Interact., 9, 671-680. Thompson, S.A., Shedd, O.L., Ray, K.C., Beins, M.H., Jorgensen, J . P . and Blaser, M.J. (1998) Campylobacter  fetus surface layer proteins are transported by a type I  secretion system. J. Bacteriol., 180, 6450-6458. Thome, K.J., Oliver, R.C. and Glauert, A . M . (1976) Synthesis and turnover of the regularly arranged surface protein of Acinetobacter  sp. relative to the other  components of the cell envelope. J. Bacteriol., 127, 440-450. Tobin, M.B., Peery, R.B. and Skatrud, P.L. (1997) G e n e s encoding multiple drug resistance-like proteins in Aspergillus  fumigatus and Aspergillus  flavus. Gene,  200,11-23.  110  Vaara, M. (1992) Eight bacterial proteins, including UDP-N-acetylglucosamine acyltransferase (LpxA) and three other transferases of Escherichia  coli, consist of  a six-residue periodicity theme. FEMS Microbiol. Lett., 76, 249-254. Vieira, J . and Messing, J . (1982) The pUC plasmids, an M13mp7-derived system for insertion mutagenesis and sequencing with synthetic universal primers. Gene, 19, 259-268. Vuorio, R., Harkonen, T., Tolvanen, M. and Vaara, M. (1994) The novel hexapeptide motif found in the acyltransferases LpxA and LpxD of lipid A biosynthesis is conserved in various bacteria. FEBS Letters, 337, 289-292. Walker, J . E . , Saraste, M. and Gay, N.J. (1984) The unc operon. Nucleotide sequence, regulation and structure of A T P - synthase. Biochim.  Biophys.  Acta.,  768, 164-200. Walker,  S . G . , Karunaratne,  D.N., Ravenscroft,  Characterization of mutants of Caulobacter  N. and  crescentus  Smit,  J.  (1994)  defective in surface  attachment of the paracrystalline surface layer. J. Bacteriol., 176, 6312-6323. Walker, S . G . , Smith, S . H . and Smit, J . (1992) Isolation and comparison of the paracrystalline surface layer proteins of freshwater caulobacters. J.  Bacteriol.,  174, 1783-1792. Wandersman, C , Delepelaire, P. and Letoffe, S. 
(1990) Secretion processing and activation of Erwinia chrysanthemi proteases. Biochimie, 72, 143-146. Wandersman, C . and Letoffe, S. (1993) Involvement of lipopolysaccharide in the secretion of Escherichia  coli alpha-haemolysin and Erwinia  chrysanthemi  proteases. Mol. Microbiol., 7, 141-150. Wang, L. and Reeves, P.R. (1998) Organization of Escherichia  coli 0 1 5 7 O antigen  gene cluster and identification of its specific genes. Infect. Immun., 66, 35453551. Ward, M.J., Bell, A.W., Hamblin, P.A., Packer, H.L. and Armitage, J . P . (1995) Identification of a chemotaxis operon with two cheY genes in  Rhodobacter  sphaeroides. Mol. Microbiol., 17, 357-366. Weiss, A.A., Johnson, F.D. and Burns, D.L. (1993) Molecular characterization of an operon required for pertussis toxin secretion. Proc. Natl. Acad. Sci. U. S. A., 90, 2970-2974.  Ill  Welch, R.A. (1991) Pore-forming cytolysins of gram-negative bacteria. Mol. Microbiol., 5, 521-528. Welsh, M.J. (1998) The A B C of a versatile engine. Nature, 396, 623-624. Whitfield, C . (1995) Biosynthesis of lipopolysaccharide O antigens.  Trends  Microbiol., 3, 178-185. Wolff, N., Delepelaire, P., Ghigo, J . M . and Delepierre, M. (1997) Spectroscopic studies of the C-terminal secretion signal of the Serratia marcescens  haem  acquisition protein (HasA) in various membrane-mimetic environments. Eur. J. of Biochem., 243, 400-407. Wolff, N., Ghigo, J . M . , Delepelaire, P., Wandersman, C. and Delepierre, M. (1994) C-terminal secretion signal of an Erwinia chrysanthemi  protease secreted by a  signal peptide-independent pathway: proton N M R and C D conformational studies in membrane-mimetic environments. Biochemistry, 33, 6792-6801. Yanisch-Perron, C , Vieira, J . and Messing, J . (1985) Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene, 33, 103-119. Yap, W.H., Thanabalu, T. and Porter, A . G . 
(1994) Influence of transcriptional and translational control sequences on the expression of foreign genes in Caulobacter crescentus. J. Bacteriol., 176, 2603-2610.
Yin, Y., Zhang, F., Ling, V. and Arrowsmith, C.H. (1995) Structural analysis and comparison of the C-terminal transport signal domains of hemolysin A and leukotoxin A. FEBS Lett., 366, 1-5.
Yun, C., Ely, B. and Smit, J. (1994) Identification of genes affecting production of the adhesive holdfast of a marine caulobacter. J. Bacteriol., 176, 796-803.
Zhang, F., Sheps, J.A. and Ling, V. (1998) Structure-function analysis of hemolysin B. Methods Enzymol., 292, 51-66.

Appendix 1 RAT fragment - rsaADE, lpsABCDE, rsaF, lpsF

LOCUS       NA1000RATX   16458 bp    DNA             BCT       07-OCT-1999
DEFINITION  Caulobacter crescentus sstl, S-layer subunit (rsaA),
            ABC-transporter (rsaD), Membrane Forming Unit (rsaE), putative
            GDP-mannose-4,6-dehydratase (LpsA), putative acetyltransferase
            (LpsB), putative perosamine synthetase (LpsC), putative
            mannosyltransferase (LpsD), putative mannosyltransferase (LpsE),
            Outer membrane protein (rsaF), and putative perosamine
            transferase (LpsE) genes, complete cds.
ACCESSION   NA1000RATX
VERSION
KEYWORDS
SOURCE      Caulobacter crescentus.
  ORGANISM  Caulobacter crescentus
            Bacteria; Proteobacteria; alpha subdivision; Caulobacter group;
            Caulobacter.
REFERENCE   1  (bases 1230 to 2387)
  AUTHORS   Fisher,J.A., Smit,J. and Agabian,N.
  TITLE     Transcriptional analysis of the major surface array gene of
            Caulobacter crescentus
  JOURNAL   J. Bacteriol. 170 (10), 4706-4713 (1988)
  MEDLINE   89008089
REFERENCE   2  (bases 1336 to 4645)
  AUTHORS   Gilchrist,A., Fisher,J.A. and Smit,J.
  TITLE     Nucleotide sequence analysis of the gene encoding the
            Caulobacter crescentus paracrystalline surface layer protein
  JOURNAL   Can. J. Microbiol. 38 (3), 193-202 (1992)
  MEDLINE   93007489
REFERENCE   3  (bases 1 to 16458)
  AUTHORS   Awram,P. and Smit,J.
  TITLE     The Caulobacter crescentus paracrystalline S-layer protein is
            secreted by an ABC transporter (type I) secretion apparatus
  JOURNAL   J. Bacteriol. 180 (12), 3062-3069 (1998)
  MEDLINE   98292737
REFERENCE   4  (bases 1 to 16458)
  AUTHORS   Awram,P.A. and Smit,J.K.
  TITLE     Identification of Genes involved in the Synthesis of the Smooth
            Lipopolysaccharide
  JOURNAL   Unpublished
REFERENCE   5  (bases 1 to 16458)
  AUTHORS   Awram,P.A.
  TITLE     Direct Submission
  JOURNAL   Submitted (07-OCT-1999) Microbiology and Immunology, University
            of British Columbia, 300-6174 University Blvd, Vancouver, BC
            V6T 1Z3, Canada
FEATURES             Location/Qualifiers
     source          1..16458
                     /organism="Caulobacter crescentus"
                     /strain="NA1000"
     gene            complement(227..799)
                     /gene="sstl"
                     /note="unknown"
     CDS             complement(227..799)
                     /gene="sstl"
                     /note="unknown"
                     /codon_start=1
                     /transl_table=11
                     /product="Sstl"
                     /translation="MAAQVLSFFQRSPRYAPQPADWSQQELAEFYRVESALIRAGIRV
                     GTDRGLSDENEPWFVFYRADDGEVVIHFARIDGEYLIAGPAYEEIARGFDFTSLVRNL
                     VARHPLIRRSDSGSNLSVHPAALLVAVVGTAFFKTGEARAAETGQSNATSGHNRPVLL
                     SSSSNASLNDRCRAGRLPAARLCLGATAGQ"
     gene            1443..4523
                     /gene="rsaA"
     CDS             1443..4523
                     /gene="rsaA"
                     /citation=[1]
                     /citation=[2]
                     /codon_start=1
                     /transl_table=11
                     /product="S-layer subunit"
                     /translation="MAYTTAQLVTAYTNANLGKAPDAATTLTLDAYATQTQTGGLSDA
                     AALTNTLKLVNSTTAVAIQTYQFFTGVAPSAAGLDFLVDSTTNTNDLNDAYYSKFAQE
                     NRFINFSINLATGAGAGATAFAAAYTGVSYAQTVATAYDKIIGNAVATAAGVDVAAAV
                     AFLSRQANIDYLTAFVRANTPFTAAADIDLAVKAALIGTILNAATVSGIGGYATATAA
                     MINDLSDGALSTDNAAGVNLFTAYPSSGVSGSTLSLTTGTDTLTGTANNDTFVAGEVA
                     GAATLTVGDTLSGGAGTDVLNWVQAAAVTALPTGVTISGIETMNVTSGAAITLNTSSG
                     VTGLTALNTNTSGAAQTVTAGAGQNLTATTAAQAANNVAVDGGANVTVASTGVTSGTT
                     TVGANSAASGTVSVSVANSSTTTTGAIAVTGGTAVTVAQTAGNAVNTTLTQADVTVTG
                     NSSTTAVTVTQTAAATAGATVAGRVNGAVTITDSAAASATTAGKIATVTLGSFGAATI
                     DSSALTTVNLSGTGTSLGIGRGALTATPTANTLTLNVNGLTTTGAITDSEAAADDGFT
                     TINIAGSTASSTIASLVAADATTLNISGDARVTITSHTAAALTGITVTNSVGATLGAE
                     LATGLVFTGGAGADSILLGATTKAIVMGAGDDTVTVSSATLGAGGSVNGGDGTDVLVA
                     NVNGSSFSADPAFGGFETLRVAGAAAQGSHNANGFTALQLGATAGATTFTNVAVNVGL
                     TVLAAPTGTTTVTLANATGTSDVFNLTLSSSAALAAGTVALAGVETVNIAATDTNTTA
                     HVDTLTLQATSAKSIVVTGNAGLNLTNTGNTAVTSFDASAVTGTGSAVTFVSANTTVG
                     EVVTIRGGAGADSLTGSATANDTIIGGAGADTLVYTGGTDTFTGGTGADIFDINAIGT
                     STAFVTITDAAVGDKLDLVGISTNGAIADGAFGAAVTLGAAATLAQYLDAAAAGDGSG
                     TSVAKWFQFGGDTYVVVDSSAGATFVSGADAVIKLTGLVTLTTSAFATEVLTLA"
     gene            4766..6502
                     /gene="rsaD"
     CDS             4766..6502
                     /gene="rsaD"
                     /note="ABC-transporter of RsaA type I secretion system"
                     /codon_start=1
                     /transl_table=11
                     /product="ABC-transporter"
                     /translation="MFKRSGAKPTILDQAVLVARPAVITAMVFSFFINILALVSPLYM
                     LQVYDRVLTSRNVSTLIVLTVICVFLFLVYGLLEALRTQVLVRGGLKFDGVARDPIFK
                     SVLDSTLSRKGIGGQAFRDMDQVREFMTGGLIAFCDAPWTPVFVIVSWMLHPFFGILA
                     IIACIIIFGLAVMNDNATKNPIQMATMASIAAQNDAGSTLRNAEVMKAMGMWGGLQAR
                     WRARRDEQVAWQAAASDAGGAVMSGIKVFRNIVQTLILGGGAYLAIDGKISAGAMIAG
                     SILVGRALAPIEGAVGQWKNYIGARGAWDRLQTMLREEKSADDHMPLPEPRGVLSAEA
                     ASILPPGAQQPTMRQASFRIDAGAAVALVGPSAAGKSSLLRGIVGVWPCAAGVIRLDG
                     YDIKQWDPEKLGRHVGYLPQDIELFSGTVAQNIARFTEFESQEVIEAATLAGVHEMIQ
                     SLPMGYDTAIGEGGASLSGGQRQRLALARAVFRMPALLVLDEPNASLDQVGEVALMEA
                     MKRLKAAKRTVIFATHKVNLLAQADYIMVINQGVISDFGERDPMLAKLTGAAPPQTPP
                     PTPPPAPLQRVQ"
     gene            6570..7880
                     /gene="rsaE"
     CDS             6570..7880
                     /gene="rsaE"
                     /note="MFP of RsaA type I secretion system"
                     /codon_start=1
                     /transl_table=11
                     /product="Membrane Forming Unit"
                     /translation="MKPPKIQRPTDNFQAVARIGYGIIALTFVGLLGWAAFAPLDSAV
                     IANGVVSAEGNRKTVQHLEGGMLAKILVREGEKVKAGQVLFELDPTQANAAAGITRNQ
                     YVALKAMEARLLAERDQRPSISFPADLTSQRADPMVARAIADEQAQFTERRQTIQGQV
                     DLMNAQRLQYQSEIEGIDRQTQGLKDQLGFIEDELIDLRKLYDKGLVPRPRLLALERE
                     QASLSGSIGRLTADRSKAVQGASDTQLKVRQIKQEFFEQVSQSITETRVRLAEVTEKE
                     VVASDAQKRIKIVSPVNGTAQNLRFFTEGAVVRAAEPLVDIAPEDEAFVIQAHFQPTD
                     VDNVHMGMVTEVRLPAFHSREIPILNGTIQSLSQDRISDPQNKLDYFLGIVRVDVKQL
                     PPHLRGRVTAGMPAQVIVPTGERTVLQYLFSPLRDTLRTTMREE"
     gene            8020..8997
                     /gene="LpsA"
     CDS             8020..8997
                     /gene="LpsA"
                     /codon_start=1
                     /transl_table=11
                     /product="putative GDP-mannose-4,6-dehydratase"
                     /translation="MAKTALITGVTGQDGAYLAKLLLEKGYTVHGMLRRSASADVIGD
                     RLRWIGVYDDIQFELGDLLDEGGLARLMRRLQPDEVYNLAAQSFVGASWDQPHLTGSV
                     TGLGTTNMLEAVRLECPQARFYQASSSEMYGLVQHPIQSETTPFYPRSPYAVAKLYAH
                     WMTVNYRESFGLHASAGILFNHESPLRGIEFVTRKVTDAVAAIKLGQQKTVDLGNLDA
                     KRDWGHAKDYVEAMWLMLQQETPDDYVVATGKTWTVRQMCEVAFAHVGLNYQDHVTIN
                     PKFLRPAEVDLLLGDPAKAKAKLGWEPKTTMQQMIAEMVDADIARRSRN"
     gene            8997..9644
                     /gene="LpsB"
     CDS             8997..9644
                     /gene="LpsB"
                     /codon_start=1
                     /transl_table=11
                     /product="putative acetyltransferase"
                     /translation="MSASLAIGGVVIIGGGGHAKVVIESLRACGETVAAIVDADPTRR
                     AVLGVPVVGDDLALPMLREQGLSRLFVAIGDNRLRQKLGRKARDHGFSLVNAIHPSAV
                     VSPSVRLGEGVAVMAGVAINADSWIGDLAIINTGAVVDHDCRLGAACHLGPASALAGG
                     VSVGERAFLGVGARVIPGVTIGADTIVGAGGVVVRDLPDSVLAIGVPAKIKGDRS"
     gene            9716..10756
                     /gene="LpsC"
     CDS             9716..10756
                     /gene="LpsC"
                     /codon_start=1
                     /transl_table=11
                     /product="putative perosamine synthetase"
                     /translation="MDTTWISSVGRFIVEFEKAFADYCGVKHAIACNNGTTALHLALV
                     AMGIGPGDEVIVPSLTYIASANSVTYCGATPVLVDNDPRTFNLDAAKLEALITPRTKA
                     IMPVHLYGQICDMDPILEVARRHNLLVIEDAAEAVGATYRGKKSGSLGDCATFSFFGN
                     KIITTGEGGMITTNDDDLAAKMRLLRGQGMDPNRRYWFPIVGFNYRMTNIQAAIGLAQ
                     LERVDEHLAARERVVGWYEQKLARLGNRVTKPHVALTGRHVFWMYTVRLGEGLSTTRD
                     QVIKDLDALGIESRPVFHPMHIMPPYAHLATDDLKIAEACGVDGLNLPTHAGLTEADI
                     DRVIAALDQVLV"
     gene            10760..11797
                     /gene="LpsD"
     CDS             10760..11797
                     /gene="LpsD"
                     /codon_start=1
                     /transl_table=11
                     /product="putative mannosyltransferase"
                     /translation="MRIVLLSSIVPFINGGARFIVEWLEEKLIEAGHEVERFYLPFVD
                     DPNEILHQIAAWRLMDLTQWCDRVICFRPPAYVVDHPNKVLWFIHHIRTFYDLWDTPY
                     RGMPDDAQHRAIRDNLRALDTQAISEARAVFTNSQVVADRLKAFNGLDATPLYPPIYQ
                     PERFSHTGYGDEIVAISRLEPHKRQALMIEAMQYVKSGVKLRLAGTASSAEYGRQLVK
                     MTHDLGVADRVILEDRWISEDEKADMLKQALAVAYLPKDEDSYGYPSLEGAHARKPVI
                     TTTDSGGVLELVEHGRNGLISAPDPRALAEQFDRLHADKAATAKMGTASLNRLAEMKI
                     DWSTVVERLTS"
     gene            11808..12845
                     /gene="LpsE"
     CDS             11808..12845
                     /gene="LpsE"
                     /codon_start=1
                     /transl_table=11
                     /product="putative mannosyltransferase"
                     /translation="MKVLVVNNAAPFQRGGAEELADHLVRRLNATPGVQSELVRVPFT
                     WEPAERLIEEMLISKGMRLYNVDRVIGLKFPAYLIPHHQKVLWLLHQFRQAYDLSEAG
                     QSHLDFDDTGRAVKAAIRAADNACFAECRKIYCNSPVTQNRLMKFNGVASQVLYPPLN
                     DGELFTGGEHGDYVFAGGRVAAGKRQHLLIEALALLPGSLRLVIAGPPENQAYADRLT
                     KLVEDLDLKDRVELRFGFHPREDIARWANGALICAYLPFDEDSVGYVTMEAFAAGKAV
                     LTVTDSGGLLEIVSADTGAVAEPTPQALAEALDRLTSDKARAISLGDAARRLWRDKNV
                     TWEETVRRLLD"
     gene            12902..14485
                     /gene="rsaF"
     CDS             12902..14485
                     /gene="rsaF"
                     /note="OMP of RsaA type I secretion system"
                     /codon_start=1
                     /transl_table=11
                     /product="Outer membrane protein"
                     /translation="MRVLSKVLSVRTSLIALAMAMAVVGRADLAHAETLAEAITAAYQ
                     SNPNIQAQRAAMRALDENYTQARSAYGLQASASVAEVYGWSKGVNAKNGVEAASQTST
                     LSLSQSLYTNGRFSARLAGVEAQIKAARENLRRIEMDLLVRVTNAYISVRRDREILRI
                     SQGGEAWLQKQLKDTEDKYSVRQVTLTDVQQAKARLASASTQVANAQAQLNVSVAFYA
                     SLVGRQPETLKPEPDIDGLPTTLDEAFNQAEQANPVLLAAGYTEKASRAGVAEARAQR
                     LFSVGARADYRNGSSTPYYARGGLREDTVNASITLTQPLFTSGQLNASVRQSIEENNR
                     DKLLMEDARRSMVLSVSQYWDSLVAARKSLVSLEEEMKANTIAFYGVREEERFALRST
                     IEVLNAQAELQNAQINFVRGRANEYVGRLHLLAQVGTLEVGNLAPGVQPYDPERNFRK
                     VRYRGALPTELIIGTFDKIALPLEPKKPAPGDTSPIRPPSSELPARPVSADKVTPPAS
                     MNDLPALTDDTPVQTAPRN"
     gene            complement(14591..15880)
                     /gene="LpsE"
     CDS             complement(14591..15880)
                     /gene="LpsE"
                     /codon_start=1
                     /transl_table=11
                     /product="putative perosamine transferase"
                     /translation="MTSRLLEIWRRLPTPIRRSAHVVAGAPRAALEALDKALAEHRHR
                     SAERTALARARRRAGPRGLSPTLPVTVIGFHSAVHGLGEGARMLARGFGDMGLGVRAL
                     DLSASVGFAAEIAPAYSSPDPDERGVTISHINPPELLRWARETEGRFLEGRRHIGYWA
                     WELEEVPSDWLPAFDFVDEVWTPSAFAADAIRRVAPRGVKVTPVPYPLYLNPRPQADR
                     QRFGLQDDRVVVLMAFDLRSTAQRKNPDAALRAFRDATVKATRPATLVCKVVGADLYP
                     ETFQALAAEVADDPSIRLLTDNLSAQDMAALTASSDIVLSLHRSEGYGLLLAEAIWLG
                     KPTLATGWSSNVEFMDPASSQFVDYRLVPVEGDGVIYRAGRWADADVGDAAEKLARMI
                     SDDAWRNTLAAATARNGHVSFNRDAWVAMTSARLPLT"
BASE COUNT     2845 a   5542 c   5354 g   2717 t
ORIGIN
        1 gagctcaccg cgtgaaccgg cgtgttgtcg acggtctgag tcggcggggg gaggcgcgct
       61 ggccccgccg ccaaaaccac tgatgccggc cggcaaggcc gaggcgtcgc tgaagtcgag
      121 agccgcagcg gcggcggtca ggcccgagcg cacggccggg tccaggctcg cggcgtccac
      181 acggaagtcc gaggccagca gggcggcggc cagaaccagc accgcctcat tggccggcag
      241 ttgcaccgag gcataggcgg gctgctggaa gccgaccggc gcgacaccgg tcgttcaggc
      301 tggcgttcga gctgctgctg agcaggacgg ggcgattgtg gcccgaggtc gcattgctct
      361 ggcc
g g t c t c g g c g g c g c g c g c t t c g c c g g t c t t g a a g a a g g c g g t g c c g a c c a c g g c c a 421 c c a g c a g g g c g g c g g g g t g a a c c g a c a g g t t g c t g c c g c t g t c g g a g c g g c g g a t c a g c g 481 g a t g g c g g g c g a c g a g g t t g c g c a c c a g g c t g g t g a a g t c g a a g c c g c g c g c a a t c t c t t 541 c a t a g g c g g g g c c g g c g a t c a g a t a c t c g c c g t c g a t a c g c g c a a a g t g g a t c a c c a c c t 601 c g c c g t c g t c c g c t c g g t a g a a g a c g a a c c a g g g t t c g t t c t c g t c g c t c a g g c c g c g a t 661 c a g t g c c g a c g c g g a t g c c c g c g c g g a t c a g g g c g c t c t c g a c a c g g t a g a a c t c g g c c a 721 g c t c c t g c t g g c t c c a g t c c g c c g g t t g c g g c g c g t a g c g c g g c g a g c g c tgaaagaacg 781 a c a g g a c c t g t g c g g c c a t a c g g c g g a a g c t t c c c c a a g c c t a g g t g a a a a g c c g a c c c c 841 c c g t c g g c c c a a a c a c g c t a g c a g a c g a c a c g a t a a c c g a a c t a g t c t t c g c t g a a c a g g 901 a t c t c g t a g g t g a t c g g a t c a t a g a a g c g g a g c a g t t c g c g c a c g a a c g t c t t c t c c g a g 961 a c c t c c a g g g c c t t g g c c c a g t c g c g g t a a c g g t c c g g c g g a a t g c g g c c g c g a c c c g t c 1021 t c c a g c t g a g a g a t g a a g g t g t a a t a a t c a g c g c c g a c c t t a g c g g c c a g c t g g c g t t g c 1081 g a c a g g c c g g c g g c c t c g c g c a t c t c c t t c a g c c a g c g g c c a c c t t c g c g g c g g a g g t c t 1141 t g c a c c t c c g a g g c g c t g c g g c g t t g c g g g t t a c c a t a c a t t a t a a a g c c t c g c g c g t t g 1201 a c c g a g g g c a g g a g c g c g g g c g c g c t c a c t c a c c c g c c a g g t g a a c a g t c t t t a t a t a t a 12 61 g c g c t t t t c g g c g g g g g g t a c a a g g a a c g c t a t a t a g g a a t t t g c t g t a c c g g t t a g a a a 1321 a a t g c t g t a c c c c 
t g a a a t t c g g c t a t t g t c g a c g t a t g a c g t t t g c t c t a t a g c c a t c g 1381 c t g c t c c c a t g c g c g c c a c t c g g t c g c a g g g g g t g t g g g a t t t t t t t t g g g a g a c a a t c c 1441 t c a t g g c c t a t a c g a c g g c c c a g t t g g t g a c t g c g t a c a c c a a c g c c a a c c t c g g c a a g g 1501 c g c c t g a c g c c g c c a c c a c g c t g a c g c t c g a c g c g t a c g c g a c t c a a a c c c a g a c g g g c g 1561 g c c t c t c g g a c g c c g c t g c g c t g a c c a a c a c c c t g a a g c t g g t c a a c a g c a c g a c g g c t g 1621 t t g c c a t c c a g a c c t a c c a g t t c t t c a c c g g c g t t g c c c c g t c g g c c g c t g g t c t g g a c t 1681 t c c t g g t c g a c t c g a c c a c c a a c a c c a a c g a c c t g a a c g a c g c g t a c t a c t c g a a g t t c g 1741 c t c a g g a a a a c c g c t t c a t c a a c t t c t c g a t c a a c c t g g c c a c g g g c g c c g g c g c c g g c g 1801 c g a c g g c t t t c g c c g c c g c c t a c a c g g g c g t t t c g t a c g c c c a g a c g g t c g c c a c c g c c t 18 61 a t g a c a a g a t c a t c g g c a a c g c c g t c g c g a c c g c c g c t g g c g t c g a c g t c g c g g c c g c c g 1921 t g g c t t t c c t g a g c c g c c a g g c c a a c a t c g a c t a c c t g a c c g c c t t c g t g c g c g c c a a c a 1981 c g c c g t t c a c g g c c g c t g c c g a c a t c g a t c t g g c c g t c a a g g c c g c c c t g a t c g g c a c c a 2041 t c c t g a a c g c c g c c a c g g t g t c g g g c a t c g g t g g t t a c g c g a c c g c c a c g g c c g c g a t g a 2101 t c a a c g a c c t g t c g g a c g g c g c c c t g t c g a c c g a c a a c g c g g c t g g c g t g a a c c t g t t c a 2161 c c g c c t a t c c g t c g t c g g g c g t g t c g g g t t c g a c c c t c t c g c t g a c c a c c g g c a c c g a c a 2221 c c c t g a c g g g c a c c g c c a a c a a c g a c a c g t t c g t t g c g g g t g a a g t c g c c g g c g c t g c g a 2281 c c c t g a c c g t t g 
g c g a c a c c c t g a g c g g c g g t g c t g g c a c c g a c g t c c t g a a c t g g g t g c 2341 a a g c t g c t g c g g t t a c g g c t c t g c c g a c c g g c g t g a c g a t c t c g g g c a t c g a a a c g a t g a 2401 a c g t g a c g t c g g g c g c t g c g a t c a c c c t g a a c a c g t c t t c g g g c g t g a c g g g t c t g a c c g 2461 c c c t g a a c a c c a a c a c c a g c g g c g c g g c t c a a a c c g t c a c c g c c g g c g c t g g c c a g a a c c 2521 t g a c c g c c a c g a c c g c c g c t c a a g c c g c g a a c a a c g t c g c c g t c g a c g g c g g c g c c a a c g 2581 t c a c c g t c g c c t c g a c g g g c g t g a c c t c g g g c a c g a c c a c g g t c g g c g c c a a c t c g g c c g 2 641 c t t c g g g c a c c g t g t c g g t g a g c g t c g c g a a c t c g a g c a c g a c c a c c a c g g g c g c t a t c g 2701 c c g t g a c c g g t g g t a c g g c c g t g a c c g t g g c t c a a a c g g c c g g c a a c g c c g t g a a c a c c a 2761 c g t t g a c g c a a g c c g a c g t g a c c g t g a c c g g t a a c t c c a g c a c c a c g g c c g t g a c g g t c a 2821 c c c a a a c c g c c g c c g c c a c c g c c g g c g c t a c g g t c g c c g g t c g c g t c a a c g g c g c t g t g a 2881 c g a t c a c c g a c t c t g c c g c c g c c t c g g c c a c g a c c g c c g g c a a g a t c g c c a c g g t c a c c c 2941 t g g g c a g c t t c g g c g c c g c c a c g a t c g a c t c g a g c g c t c t g a c g a c c g t c a a c c t g t c g g 3001 g c a c g g g c a c c t c g c t c g g c a t c g g c c g c g g c g c t c t g a c c g c c a c g c c g a c c g c c a a c a 3061 c c c t g a c c c t g a a c g t c a a t g g t c t g a c g a c g a c c g g c g c g a t c a c g g a c t c g g a a g c g g 3121 c t g c t g a c g a t g g t t t c a c c a c c a t c a a c a t c g c t g g t t c g a c c g c c t c t t c g a c g a t c g 3181 c c a g c c t g g t g g c c g c c g a c g c g a c g a c c c t g a a c a t c t c g g g c g a c g c t c g c g t c a c g a  3241 3301 3361 3421 3481 
3541 3601 3661 3721 3781 3841 3901 3961 4021 4081 4141 4201 4261 4321 4381 4441 4501 4561 4621 4681 4741 4801 4861 4921 4981 5041 5101 5161 5221 5281 5341 5401 54 61 5521 5581 5641 5701 5761 5821 5881 5941 6001 6061 6121 6181 6241 6301 6361 6421 6481 6541 6601  tcacctcgca ccctcggcgc tcctgctggg tcagctcggc tggtggccaa ccctccgcgt tgcaactggg tgaccgttct cctcggacgt cgctggctgg tcgacacgct gtctgaacct ccggcacggg cgatccgcgg tcatcggtgg gtggcacggg cgatcaccga ctatcgctga agtacctgga agttcggcgg gcgctgacgc ccgaagtcct actaagagac gccggccttg ccgccacaat tcgcgcggcg cgaccaggcc catcaacatt gaccagccgc ggtctacggc cgacggcgtg gggcatcggc cctgatcgcc gcacccgttc gatgaacgac ccagaacgac gggcggcctg cgccagcgac gaccctgatc gatgatcgcc tcagtggaag cgaggaaaag cgaagccgcc ccgtatcgac gctgctgcgc ctacgacatc ggacatcgag atcgcaggaa gccgatgggc ccagcgcctg gccgaacgcc taaggccgct cgactacatc gctggccaag gccgttgcag cagaacgcgc acaacttcca  caccgctgcc cgaactggcg cgccacgacc gaccctgggc cgtcaacggt cgctggcgcg cgcgacggcg ggcggctccg gttcaacctg cgtcgagacg gacgctgcaa gaccaacacc ctcggctgtg cggcgctggc cgctggcgct cgcggatatc cgccgctgtc cggcgccttc cgctgctgct cgacacctat ggtgatcaag gacgctcgcc cccgtcttcc cctagttccg ttcgtggtcg tcctgaaggc gtgctggtcg ctggccctgg aacgtttcga ctgctcgagg gcccgggatc ggccaggcgt ttctgcgatg ttcggcatcc aacgccacca gccggttcca caagcccgct gccggcggcg ctgggcggcg ggctcgatcc aattatatcg agcgccgacg tcgatcctgc gccggcgccg ggtatcgtcg aagcagtggg ctgttctcgg gtcatcgagg tatgacacgg gccctggcgc agcctcgacc aagcgcacgg atggtgatca ctgaccgggg cgcgtccagt ccatcaggct ggctgtggcc  gccctgacgg accggtctgg aaggcgatcg gctggtggtt tcgtcgttca gcggctcaag ggtgcgacga accggtacga accctgtcgt gtgaacatcg gccacctcgg ggcaacacgg accttcgtgt gccgactcgc gacaccctgg ttcgatatca ggcgacaagc ggcgctgcgg gccggcgacg gtcgtcgttg ctgaccggtc taagcgaacg gaaagggagg gtggctatga agacggcgcc tcacaatgtt cccgcccggc tcagcccgct ccctgatcgt cgctgcgcac cgatcttcaa tccgcgacat cgccctggac tggcgatcat agaacccgat ccctgcgcaa ggcgcgcgcg cggtgatgtc gcgcctatct tggtcggccg gcgcgcgcgg accacatgcc cgccgggcgc 
cggtggccct gcgtctggcc atcccgagaa gcaccgtcgc ccgcgaccct cgatcggcga gtgcggtgtt aggtgggcga tgatcttcgc accagggtgt ctgcgccgcc aagcgccttc tgaatctcaa cgtatcggct  gcatcacggt tcttcacggg tcatgggcgc cggtcaacgg gcgctgaccc gctcgcacaa ccttcaccaa cgaccgtgac cctcggccgc ccgccaccga ccaagtcgat ctgtcaccag cggccaacac tgaccggttc tctacaccgg acgctatcgg tcgacctcgt tcaccctggg gcagcggcac acagctcggc tggtcacgct tctgatcctc cggggtcttt tttagcggga ttagttgtta caagcgcagc ggtgatcacc gtacatgctg gttgacggtc ccaggtgctg gtcggtgctg ggaccaggtc gccggtgttc cgcctgtatc ccagatggcc cgccgaggtc ccgcgacgag gggcatcaag ggccatcgac cgccctggcg cgcctgggat gctgcccgag gcaacagccg tgtcggtccc ctgcgcggcg gctgggtcgc ccagaacatc ggcgggcgtg gggcggcgcc ccgcatgccg agtggcgctg cacccacaag gatcagcgac ccagacgccg gtcagtccgc tgaagccccc acggcatcat  gaccaacagc gttggtgcga cggcgctggc gctgactcga cggcgacgac accgtcaccg cggcgacggc a c c g a c g t t c ggccttcggc ggcttcgaaa cgccaacggc t t c a c g g c t c cgttgcggtg aatgtcggcc cctggccaac gccacgggca tctggccgct ggtacggttg caccaacacg accgctcacg cgtggtgacg ggcaacgccg cttcgacgcc agcgccgtca cacggtgggt gaagtcgtca ggccaccgcc aatgacacca cggtacggac a c c t t c a c g g cacctcgacc gctttcgtga c g g c a t c t c g acgaacggcg cgctgctgcg accctggctc ctcggttgcc aagtggttcc tggcgcgacc t t c g t c a g c g gaccacctcg gccttcgcca gcctaggcga ggatcgctag cttatgggcg ctacgcgctg ctggggggct t g c t c a c t t t ctgtacatgg ccgcgtcggt ggcgcgaagc c g a c g a t c c t gccatggtct tcagcttctt caggtctatg accgcgtgct atctgcgtct tcctgttcct gtgcgcggcg g t c t g a a g t t gactccacgc tcagccgcaa cgagagttca tgaccggcgg gtcatcgtct cgtggatgct atcatcttcg gcctggccgt accatggcct cgatcgccgc atgaaggcca tgggcatgtg caggtggcct ggcaggccgc gtgttccgca acatcgtcca ggcaagatct cggccggcgc cccatcgagg gcgccgtggg cgcctgcaga ccatgctgcg ccgcgcggcg t g c t g t c g g c accatgcgcc aggccagctt agcgcggcgg g c a a g t c c t c ggcgtcatcc gcctcgacgg cacgtcggct acctgccgca gcccgcttca ccgagttcga cacgagatga tccagagcct t c g c t g t c c g gcggccagcg gccctgctgg tgctggacga atggaagcga 
tgaagcggct gtgaacctgt tggcccaggc tttggcgaac gcgacccgat ccgccgacgc cgccgcccgc ctctcccttc ctggccgttt caagatccag cgtccgacgg cgccctgacc tttgtcggtc  6661 6721 6781 6841 6901 6961 7021 7081 7141 7201 72 61 7321 7381 7441 7501 7561 7 621 7681 7741 7801 7861 7921 7981 8041 8101 8161 8221 8281 8341 8401 8461 8521 8581 8641 8701 8761 8821 8881 8941 9001 9061 9121 9181 9241 9301 9361 9421 9481 9541 9601 9661 9721 9781 9841 9901 9961 10021  tgttgggctg ccgccgaggg tggtccgcga aggccaacgc cgcgcctgct gccagcgcgc agcgtcgcca agagcgagat tcgaggacga gtctgctggc cagaccgctc agcaggagtt aggtgaccga ccgtcaacgg ccgagccgct agccgaccga tccactcgcg tttccgatcc agctgccgcc cgaccggcga ccacgatgcg tgggcggcgc gtcaggaccc accggtgtga accgtccacg tggatcggcg ctggcgcgcc ttcgtcggcg accaacatgc tcgtccgaaa ccccgctcgc gagagctttg ggcatcgagt caaaagaccg tatgtcgagg accggcaaga aactatcagg ctgctgggcg caacagatga gcgcttccct tcatcgagag cgcggcgcgc gcgagcaggg tgggccgcaa tcgtttcgcc acgctgacag actgccgcct ccgtgggaga gcgccgacac ttgcgatcgg ttccgtcgcc cacgacctgg ctactgtggc cctggtggcg cgcctcggcc gcggaccttc gatcatgccc  ggccgcgttc taatcgcaag aggcgagaag cgccgccggc ggccgagcgc cgatccgatg gacgatccag cgagggcatc gctgatcgac cctggagcgc caaggccgtc cttcgagcag gaaggaggtc cacggcgcag ggtcgacatc tgtggacaat ggaaatcccg gcagaacaag gcatctgcgc gcgcaccgtg cgaggagtag gggcgagtta ttcgttgtta ccggtcagga gcatgctgcg tctatgacga tgatgcggcg cctcgtggga tcgaagccgt tgtacggtct cctatgcggt gcctgcacgc tcgtgacccg tcgatctggg ccatgtggct cctggaccgt accacgtgac atccggccaa tcgccgaaat cgccatcggg cctgcgggcc ggtgttgggc gctgtccaga ggcgcgcgac tagcgtacgt ctggatcggc gggcgcggcc gcgggctttt gatcgtcggc cgtgccggcc gcgccgcgcc atctcgtcgg gtcaagcacg atggggatcg aattcagtca aacctggacg gtgcacctct  gccccgctcg accgtgcagc gtgaaggccg atcacccgca gaccagcgtc gtcgcccgcg ggccaggtcg gaccgtcaga ctgcgtaagc gagcaggcct cagggcgcct gtcagccaga gtcgcctccg aacctgcgct gcgcccgagg gtccatatgg atcctgaacg ctcgactact ggcagggtca ctgcagtacc ggcaaaggtt aattcgccgg ctggagtcag cggggcgtac tcgctcggcc catccagttc cctgcagccg ccagccgcac 
gcgtctggaa ggtgcagcac ggccaagctc ctcggccggc caaggtcacc caatctcgac gatgctgcag gcgccagatg gatcaatccg ggccaaggcc ggtcgacgcc ggcgtcgtca tgcggtgaga gttccggtag ctgttcgtgg cacggctttt ctgggcgagg gacctggcga tgccacctgg ctcggtgtcg gccggcggtg aagatcaaag tcgacggcaa tcggacgctt cgatcgcctg gacccggcga cctattgcgg ccgcgaagtt acggtcagat  acagcgcggt acctcgaagg gccaggtgct accagtacgt cgtccatcag ccatcgccga acctgatgaa cccagggcct tctatgacaa cgctgtcggg ctgacaccca gcatcaccga acgcccagaa tcttcaccga acgaggcctt gcatggtcac gcacgatcca tcctcgggat ccgccggcat tgttctcgcc tcaagggcct cgctgctttc cggatacgca ctcgccaagc tcggccgatg gagctgggcg gacgaggtct ctgacgggct tgcccgcagg ccgatccagt tacgcccact atcctgttca gacgcggtgg gccaagcgcg caggagacgc tgcgaagtgg aagttcctgc aagctcggct gacatcgcgc tcatcggcgg cggtggcggc tgggcgatga cgatcggcga cgctggtcaa gggttgcggt tcatcaacac gacccgcctc gcgcccgggt tcgtcgtgcg gagaccgttc cgaacgtgac catcgttgag caacaacggt cgaggtgatc cgcgacgcct ggaggcgctg ttgcgacatg  gatcgccaac ggcgtcgtct cggcatgctg gccaagatcc g t t c g a g c t g gacccgaccc g g c t t t g a a g gccatggaag cttccccgcc gacctgacca cgaacaggcc c a g t t c a c t g cgcccagcgt ttgcagtatc gaaggaccaa c t c g g c t t c a gggcctggtg ccccggccgc ctcgatcggc cgtctgaccg gctcaaggtt cgccagatca gacccgggtt cgcctggccg gcggatcaag a t c g t g t c g c gggcgctgtc gttcgcgccg cgtgatccag gcgcatttcc cgaagttcgg ctgccggcct g t c g c t g t c g caggaccgca cgtgcgcgtg gacgtcaagc gccggcccag gtgatcgtgc gctgcgagac accctgcgca gatttccaaa gcttttcgga cattcgcggg caatagtgta tggcgaaaac g g c t t t g a t c tgctgctgga gaagggttac tgatcggcga ccgcctgcgc a c c t c t t g g a cgagggcggt acaacctggc ggcccagagc cggtgacggg cctgggcacg cgcggttcta tcaggcctcg cggagacgac g c c g t t c t a t ggatgacggt gaactatcgc accacgagag cccgctgcgc cggccatcaa gctgggtcag actggggtca cgccaaggac cggacgacta tgtggtcgcg ccttcgccca tgtcggcctg gtccggcgga agtggacctg gggaacccaa gacgaccatg ggcgctcgcg caactgatga cggcggccac gccaaggtgg c a t t g t c g a t gcggatccga cctggcgctg ccgatgcttc caaccggctg cgccagaagc cgccatccat 
ccctctgccg gatggccggc gtcgcgatca cggcgctgtt gtcgaccatg ggccctggcc ggcggcgtat catacctggc gtcacgatcg cgaccttccg gactcggtcc gtgagtgacc tgccgcgcat tatgtactcg aatgcatgga ttcgaaaagg ccttcgccga acgaccgcct tgcacctggc gttccgagcc tgacctatat gtgctggtcg acaacgatcc ataacgccgc gcacgaaggc gatccgatcc tcgaagttgc  10081 10141 10201 102 61 10321 10381 10441 10501 10561 10621 10681 10741 10801 10861 10921 10981 11041 11101 11161 11221 11281 11341 11401 11461 11521 11581 11641 11701 11761 11821 11881 11941 12001 12061 12121 12181 12241 12301 12361 12421 12481 12541 12 601 12661 12721 12781 12841 12901 12961 13021 13081 13141 13201 13261 13321 13381 13441  tcgcaggeat gggcaagaag catcaccacc gcgcttgctg cttcaattac cgacgaacac cctgggcaat gtacactgtg cgacgcgttg tgcgcatctg cctgccgacc tcaggtgttg cggcggcgcg ggtcgagcgg cgcctggcgg ggcctatgtg ctacgacctg ccgcgacaat caactcccag gtatccgccg ggccatct'cg cgtgaagagc acagctggtc ctggatcagc gcccaaggac ggtgatcacc cttgatcagc caaggctgcg cgactggagc tcgtcaacaa tccgccgcct gggagccggc atgtggaccg tgctgtggct atctggactt cctgcttcgc tgaagttcaa tcaccggcgg gccagcacct ccggaccgcc atctgaagga gggccaacgg tcacgatgga tgctggagat ccgaggcgct cgcgcaggct attaagccac aatgcgagtg catggcggtc cgcagcctat cgagaactac ggtctatggc ctcgaccctc gggtgtcgag gctggtccgc cagccaaggc cgtccgtcag  aacctgctcg tcaggctcgc ggcgagggcg cgaggccagg cggatgacca ctggccgcgc cgggtcacca cgcctgggcg ggcattgaga gccacggatg cacgcggggc gtctagccga cgcttcatcg ttctacctgc ctgatggacc gtggaccatc tgggacacgc ctccgcgcgc gtggtggccg atctatcagc cggctggagc ggcgtgaagc aagatgaccc gaggacgaga gaggacagct acgaccgact gccccggacc acagccaaga accgtcgtgg cgccgcgccg gaacgccacg cgagcgtctg ggtcattggc gctgcaccag cgacgacacg cgagtgccgc cggggtcgcc cgagcatggc gttgattgag ggagaaccag tcgcgtcgag ggccctgatc ggccttcgcc cgtcagcgcg tgatcgtttg atggcgcgac aaacattggg ctgtcgaaag gtcggtcgcg cagagcaatc acccaggccc tggtccaagg tctctgagcc gcgcagatca gtgaccaacg ggtgaagcct gtgaccttga  tgatcgagga tgggcgactg ggatgatcac gcatggatcc acatccaggc gcgaaagggt agccccatgt agggcctttc gccgtccggt 
atctgaagat tgactgaagc tgcgcatcgt tcgagtggct cgtttgtcga tgacccagtg cgaacaaggt cctatcgcgg tcgacaccca accgcttgaa ccgagcgctt cgcacaagcg tgcgcctggc acgacctggg aggccgatat atggctatcc ccggcggggt cgcgcgcgct tggggaccgc agcgcctgac ttccaacgcg cccggcgtcc atcgaggaga ctcaaatttc ttccgtcagg ggcagggcgg aagatctact agccaggtgc gactatgtct gccctagcct gcctatgccg ctgcggtttg tgcgcctatc gcaggcaagg gataccggtg acctcggaca aagaatgtca ttgaagacca ttctgtccgt ctgatctcgc cgaatattca gttcggccta gcgtcaacgc agagcctcta aggccgcgcg cctatatctc ggctgcagaa ccgacgtgca  tgcggccgag cgccaccttc caccaatgat caaccgccgc cgcgatcggt cgtgggctgg cgcgctgacc caccacgcgc gttccacccg cgccgaagcc cgatatcgac cctgctgtcc cgaggagaag cgatccgaac gtgcgaccgg cttgtggttc catgcctgac ggcgatttcg ggcgttcaac ttcccatacc tcaggccctg gggcacggcg cgtcgccgac gctgaaacag ttcgctggag gctggaactg ggccgagcag ctcgctgaac ctcatgagaa gcggcgccga agtccgagct tgctgatctc cggcctatct cctacgacct tgaaggcggc gcaactcgcc tctatccgcc tcgcgggcgg tgctgcccgg accgcctgac gcttccatcc tgccctttga ccgtgctgac cggtcgccga aggcgcgggc catgggaaga cgttaagacg gcgaacgtct ccacgccgag ggcccaacgc tgggctgcaa caagaacggc caccaacggt cgagaacctg ggtgcgccgc gcaattgaag gcaggccaag  gcggtgggcg agcttcttcg gatgacctgg tactggtttc ctggcgcagc tacgagcaga ggtcgccacg gatcaggtga atgcacatca tgcggggtcg cgtgtcatcg tcgatcgtgc ctgatcgagg gagatcctgc gtgatctgct atccaccaca gacgcgcagc gaagcccgcg ggcctggacg ggctatggcg atgatcgagg tccagcgccg cgggtcattc gctctggccg ggcgctcacg gtcgagcatg ttcgaccgcc cgtctggccg cgcccgcatg ggagctggcc ggtgcgcgtg caaggggatg gatcccgcat gtccgaagcg gatccgcgcg cgtcacccag gctgaacgac ccgggtcgcg cagtctgcgg caagctggtc gcgcgaggac cgaggatagt cgtgaccgac gcccacgccg gatatcgctg gacggtccgc gggtcggcta ctgatcgcct accttggccg gccgccatgc gccagcgcct gtcgaggccg cgtttctcgg cgccgcatcg gaccgcgaga gacaccgagg gcccgcctgg  cgacctaccg gcaacaagat cggccaagat cgatcgtcgg tggagcgggt agctggcgcg tgttctggat tcaaggatct tgccgcccta acggcttgaa cggcgctcga cgttcatcaa ccggccacga accagatcgc tccggccgcc tccgcacctt accgggccat cggtgttcac ccacgccgct 
acgagatcgt ccatgcagta agtatggtcg tcgaggatcg tggcctatct cccgcaagcc gccgcaacgg tgcacgctga agatgaagat aaggttctgg gaccatctgg cccttcacct cggctctaca caccaaaagg ggccagagcc gccgacaacg aaccgcctga ggtgagctgt gcgggcaagc ctggtgatcg gaggatctgg atcgcccgtt gtaggttacg tccggcggcc caagccctgg ggcgacgcgg cgtcttcttg cagtctagga tggccatggc aggcgatcac gcgcgctgga cggtcgctga ccagccagac cccgcctggc agatggacct tcctgcggat acaagtacag cgtcggccag  13501 13561 13621 13681 13741 13801 13861 13921 13981 14041 14101 14161 14221 14281 14341 14401 14461 14521 14581 14641 14701 14761 14821 14881 14941 15001 15061 15121 15181 15241 15301 15361 15421 15481 15541 15601 15661 15721 15781 15841 15901 15961 16021 16081 16141 16201 16261 16321 16381 16441  cactcaggtg ggtggggcgc cctcgacgag caccgagaag cggcgcgcgc gcgcgaggac gctgaacgcc agacgcacgt gcggaagtcg ggtgcgcgaa cgaattgcag gctccatctt gccctacgat gctgatcatc gggggacacc cgacaaggtg ggtccagacc atcaaagggg ggagaacgct tgaacgagac cggagatcat cggcgcgata tggacgccgg cgagccatat cgatgtcgct gccggatgga ggtccgcgcc cccgaaaggc aggccatcag ggcgcggatt cgcgacggat aggcggggag ggcggccctc tgatgtgcga tctcggcggc cgccaaaccc agccgatgac cgcgcgccaa gcgcctccag tgggcaggcg acgcatgggc cgcgtcaggc gaggggggaa gtttggctat ccgccacgct ctccatccac acaggcggcc cgaggccttc ccacggcatc tcgcaagctc  gcgaacgccc cagccggaga gcgttcaatc gcctctcgcg gcggactatc accgtcaacg tcggtgcggc cgcagcatgg ctggtcagcc gaagagcgtt aacgcccaga ctggcgcagg cctgagcgta ggaaccttcg tcgccgatcc acgccgccgg gcgccccgca ttctatttca ctaggtcagg gtgaccgttc ccgcgccagc gatgacgcca gtccatgaac cgcttcggcg gctggcggtg cggatcgtcg gacgacttta gcgcagcgcc cacgacgacc gagatagagc agcgtcagcg ccagtcggac caggaaacgg gatcgtgacc gaagccgacg gcgcgccagc cgtcaccggg ggcggttctt agcggcgcgc gcgccatatc gaccgtatag gatcttgtca tacgtgtcga tatggcgagt cccaagatcg gccgcagccc ttcgccgcag gccgagttct tggacctgga cgcccggg  aggcgcagct cgctgaagcc aggccgaaca ccggcgtcgc gcaatggctc cctcgatcac agtcgatcga tcctgagcgt tcgaagagga tcgcgcttcg tcaatttcgt tcggcacgct acttcaggaa acaagatcgc ggccgccgtc cgtcgatgaa 
actagagccc gaagtttaga ggcagccgcg cgcgccgtcg ttttcggcgg tcgccctcca tccacgttcg agcagcaggc agggcggcca gccacctcgg cagacaaggg gcgtcgggat cggtcgtcct gggtagggga gcgaaggcgg gggacctctt ccttcggtct ccgcgttcgt gacgccgaca atccgtgcgc agcgtgggtg tccgcactcc ggggcgccgg tcaagcaggc ccgttcaggc tgcgaaaggt ctgagtggag tgaacccgct agaacgcctg agccggggat agatggtccg gtaatcgcaa tctccgacca  gaatgtcagc tgaacccgat agccaatccg cgaggcgcgg cagcacgccg cctgacccag ggagaacaac ctcgcagtac aatgaaggcc cagcacgatc ccgcgggcgc tgaggtcggc ggtccggtac cttgccgctc gagcgaactg cgatctgccc tttccgatcg acgttcttcg cgctcgtcat cggcggccaa cgtcgcccac cgggaaccag atgaccagcc cgtaaccctc tgtcctgcgc cggcgagcgc ttgcgggacg tcttgcgctg gcaggccaaa ccggcgtaac agggggtcca ccagctccca cccgcgccca cgggatccgg ggtccaaggc cttcgcccag acaaacccct ggtgtcggtg cgacgacatg gtgaggtcat cgagccgaca ttgcatgcag cgccggctat ccgctgccgc cgagctcggg caactggtac tctgtcgggc ggacctgccg aaatcgccac  gtagcgttct attgacggcc gtcctgctgg gcccagcgcc tactacgcgc ccgctgttca cgcgacaagc tgggacagcc aacacgatcg gaagtgctga gccaacgagt aatctcgctc cgcggcgctt gagcccaaga ccggccaggc gccctgaccg gatcgcctca gtcaccgttt cgccacccag ggtgttgcgc atcagcgtcg ccgatagtcg tgtcgccagg ggaccggtgc cgacaggttg ttgaaaggtc tgtcgccttg ggcggtcgaa gcgctggcga ctttacgccc gacctcgtcg ggcccagtag tcgcagaagc tgaagaatag gcggacccca accgtgcacc gggaccggcg ctcggccaag cgccgagcgc gggcgatgca ccagcaaagc caatggcggc gtcacggacg cttccgctcc tttggccagg ggaaccgact gcggaggcca gacttcgact gtgctggtcg  acgcgtccct tgcctacaac cggcgggcta tgttctcggt gtggcggtct ccagcggtca tgctgatgga tggtggccgc cctt.ctatgg acgcccaagc atgtcggtcg ccggcgtcca tgccgacgga agccggcgcc ctgtttcggc acgacacgcc atccgatcgg cgcacggatc gcgtcgcggt caggcgtcgt gcccagcgcc acgaactggc gtgggcttgc agggacagca tccgtcagaa tcgggataga actgtggcgt cgcagatcga tcggcctgcg cgcggcgcca acgaaatcga ccgatgtgcc tctgggggat gccggggcga aggcccatgt gcgctatgga cgtctgcgcg gccttgtcca cggatgggcg ggcgggcgag tgcgcggggc ccactgcgcc tcaactatac tgacggtcgg gtctttccgt tcaacccctc agctctacga tcatcgggct acttcattcg  Appendix 2 
ATC15252 S-layer subunit and transporter genes

LOCUS       JS3001A19    4255 bp    mRNA    BCT    07-OCT-1999
DEFINITION  Caulobacter crescentus S-layer subunit (rsaA) and ABC-transporter
            (rsaD(partial)) mRNAs, complete cds.
ACCESSION   JS3001A19
VERSION
KEYWORDS
SOURCE      Caulobacter crescentus.
  ORGANISM  Caulobacter crescentus
            Bacteria; Proteobacteria; alpha subdivision; Caulobacter group;
            Caulobacter.
REFERENCE   1 (bases 1 to 4255)
  AUTHORS   Bingle,W.H., Awram,P.A., Nomellini,J.F. and Smit,J.K.
  TITLE     The Secretion Signal of C. crescentus S-layer Protein is Located in
            the C-terminal 82 Amino Acids of the Molecule
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 4255)
  AUTHORS   Bingle,W.H., Awram,P.A., Nomellini,J.F. and Smit,J.K.
  TITLE     Direct Submission
  JOURNAL   Submitted (07-OCT-1999) Microbiology and Immunology, University of
            British Columbia, 300-6174 University Blvd,
V a n c o u v e r , BC V6T 1Z3, Canada FEATURES Location/Qualifiers source 1..4255 /organism="Caulobacter crescentus" /strain="JS3001" gene 637..3717 /gene="rsaA" CDS 637..3717 /gene="rsaA" /codon_start=l /transl_table=ll /product="S-layer subunit" /translation="MAYTTAQLVTAYTNANLGKAPDAATTLTLDAYATQTQTGGLSDA AALTNTLKLVNSTTAVAIQTYQFFTGVAPSAAGLDFLVDSTTNTNDLNDAYYSKFAQE NRFINFSINLATGAGAGATAFAAAYTGVSYAQTVATAYDKIIGNAVATAAGVDVAAAV AFLSRQANIDYLTAFVRANTPFTAAADIDLAVKAALIGTILNAATVSGIGGYATATAA MINDLSDGALSTDNAAGVNLFTAYPSSGVSGSTLSLTTGTDTLTGTANNDTFVAGEVA GAATLTVGDTLSGGAGTDVLNWVQAAAVTALPTGVTISGIETMNVTSGAAITLNTSSG VTGLTALNTNTSGAAQTVTAGAGQNLTATTAAQAANNVAVDGGANVTVASTGVTSGTT TVGANSAASGTVSVSVANSSTTTTGAIAVTGGTAVTVAQTAGNAVNTTLTQADVTVTG NSSTTAVTVTQTAAATAGATVAGRVNGAVTITDSAAASATTAGKIATVTLGSFGAATI DSSALTTVNLSGTGTSLGIGRGALTATPTANTLTLNVNGLTTTGAITDSEAAADDGFT TINIAGSTASSTIASLVAADATTLNISGDARVTITSHTAAALTGITVTNSVGATLGAE LATGLVFTGGAGADSILLGATTKAIVMGAGDDTVTVSSATLGAGGSVNGGDGTDVLVA NVNGSSFSADPAFGGFETLRVAGAAAQGSHNANGFTALQLGATAGATTFTNVAVNVGL TVLAAPTGTTTVTLANATGTSDVFNLTLSSSAALAAGTVALAGVETVNIAATDTNTTA HVDTLTLQATSAKSIVVTGNAGLNLTNTGNTAVTSFDASAVTGTGSAVTFVSANTTVG EVVTIRGGAGADSLTGSATANDTIIGGAGADTLVYTGGTDTFTGGTGADIFDINAIGT STAFVTITDAAVGDKLDLVGISTNGAIADGAFGAAVTLGAAATLAQYLDAAAAGDGSG TSVAKWFQFGGDTYVVVDSSAGATFVSGADAVIKLTGLVTLTTSAFATEVLTLA"  122  gene CDS  3960..4253 /gene="rsaD(partial) " 3960..4253 /gene="rsaD(partial) " /codon_start=l / transl_table=ll /product="ABC-transporter" /translation="MFKRSGAKPTILDQAVLVARPAVITAMVFSFFINILALVSPLYM LQVYDRVLTSRNVSTLIVLTVICVFLFLVYGLLEALRTQVLVRGGLKFDGVARD" 722 a 1512 c 1296 g 725 t  BASE COUNT ORIGIN 1 aagcttcccc 61 g a c a c g a t a a 121 g c g g a g c a g t 181 g t a a c g g t c c 241 a t c a g c g c c g 301 c t t c a g c c a g 361 c g g g t t a c c a 421 c a c t c a c c c g 481 a c g c t a t a t a 541 t t g t c g a c g t 601 c a g g g g g t g t 661 g t g a c t g c g t 721 c t c g a c g c g t 781 a a c a c c c t g a 841 a c c g g c g t t g 901 a a c g a c c t g a 961 t c g a t 
c a a c c 1021 g g c g t t t c g t 1081 g c g a c c g c c g 1141 a t c g a c t a c c 1201 g a t c t g g c c g 1261 a t c g g t g g t t 1321 t c g a c c g a c a 1381 g g t t c g a c c c 1441 a c g t t c g t t g 1501 g g c g g t g c t g 1561 a c c g g c g t g a 1621 c t g a a c a c g t 1681 g c t c a a a c c g 1741 g c g a a c a a c g 1801 t c g g g c a c g a 1861 g c g a a c t c g a 1921 g t g g c t c a a a 1981 a c c g g t a a c t 2041 g c t a c g g t c g 2101 g c c a c g a c c g 2161 g a c t c g a g c g 2221 c g c g g c g c t c 2281 a c g a c g a c c g 2341 a a c a t c g c t g 2401 a c c c t g a a c a 2461 a c g g g c a t c a 2521 c t g g t c t t c a 2581 a t c g t c a t g g 2641 g g t t c g g t c a 2701 t t c a g c g c t g  aagcctaggt ccgaactagt tcgcgcacga ggcggaatgc accttagcgg cggccacctt tacattataa ccaggtgaac ggaatttgct atgacgtttg gggatttttt acaccaacgc acgcgactca agctggtcaa ccccgtcggc acgacgcgta tggccacggg acgcccagac ctggcgtcga tgaccgcctt tcaaggccgc acgcgaccgc acgcggctgg tctcgctgac cgggtgaagt gcaccgacgt cgatctcggg cttcgggcgt tcaccgccgg tcgccgtcga ccacggtcgg gcacgaccac cggccggcaa ccagcaccac ccggtcgcgt ccggcaagat ctctgacgac tgaccgccac gcgcgatcac gttcgaccgc tctcgggcga cggtgaccaa cgggcggcgc gcgccggcga acggcggcga acccggcctt  gaaaagccga cttcgctgaa acgtcttctc ggccgcgacc ccagctggcg cgcggcggag agcctcgcgc agtctttata gtaccggtta ctctatagcc ttgggagaca caacctcggc aacccagacg cagcacgacg cgctggtctg ctactcgaag cgccggcgcc ggtcgccacc cgtcgcggcc cgtgcgcgcc cctgatcggc cacggccgcg cgtgaacctg caccggcacc cgccggcgct cctgaactgg catcgaaacg gacgggtctg cgctggccag cggcggcgcc cgccaactcg cacgggcgct cgccgtgaac ggccgtgacg caacggcgct cgccacggtc cgtcaacctg gccgaccgcc ggactcggaa ctcttcgacg cgctcgcgtc cagcgttggt tggcgctgac cgacaccgtc cggcaccgac cggcggcttc  ccccccgtcg caggatctcg cgagacctcc cgtctccagc ttgcgacagg gtcttgcacc gttgaccgag tatagcgctt gaaaaatgct atcgctgctc atcctcatgg aaggcgcctg ggcggcctct gctgttgcca gacttcctgg ttcgctcagg ggcgcgacgg gcctatgaca gccgtggctt aacacgccgt accatcctga atgatcaacg 
ttcaccgcct gacaccctga gcgaccctga gtgcaagctg atgaacgtga accgccctga aacctgaccg aacgtcaccg gccgcttcgg atcgccgtga accacgttga gtcacccaaa gtgacgatca accctgggca tcgggcacgg aacaccctga gcggctgctg atcgccagcc acgatcacct gcgaccctcg tcgatcctgc accgtcagct gttctggtgg gaaaccctcc  gcccaaacac taggtgatcg agggccttgg tgagagatga ccggcggcct tccgaggcgc ggcaggagcg ttcggcgggg gtacccctga ccatgcgcgc cctatacgac acgccgccac cggacgccgc tccagaccta tcgactcgac aaaaccgctt ctttcgccgc agatcatcgg tcctgagccg tcacggccgc acgccgccac acctgtcgga atccgtcgtc cgggcaccgc ccgttggcga ctgcggttac cgtcgggcgc acaccaacac ccacgaccgc tcgcctcgac gcaccgtgtc ccggtggtac cgcaagccga ccgccgccgc ccgactctgc gcttcggcgc gcacctcgct ccctgaacgt acgatggttt tggtggccgc cgcacaccgc gcgccgaact tgggcgccac cggcgaccct ccaacgtcaa gcgtcgctgg  gctagcagac gatcatagaa cccagtcgcg aggtgtaata cgcgcatctc tgcggcgttg cgggcgcgct ggtacaagga aattcggcta cactcggtcg ggcccagttg cacgctgacg tgcgctgacc ccagttcttc caccaacacc catcaacttc cgcctacacg caacgccgtc ccaggccaac tgccgacatc ggtgtcgggc cggcgccctg gggcgtgtcg caacaacgac caccctgagc ggctctgccg tgcgatcacc cagcggcgcg cgctcaagcc gggcgtgacc ggtgagcgtc ggccgtgacc cgtgaccgtg caccgccggc cgccgcctcg cgccacgatc cggcatcggc caatggtctg caccaccatc cgacgcgacg tgccgccctg ggcgaccggt gaccaaggcg gggcgctggt cggttcgtcg cgcggcggct  123  2761 2821 2881 2941 3001 3061 3121 3181 3241 3301 3361 3421 3481 3541 3601 3661 3721 3781 3841 3901 3961 4021 4081 4141 4201  caaggctcgc acgaccttca acgacgaccg tcgtcctcgg atcgccgcca tcggccaagt acggctgtca gtgtcggcca tcgctgaccg ctggtctaca atcaacgcta aagctcgacc gcggtcaccc gacggcagcg gttgacagct ggtctggtca aacgtctgat gaggcggggt atgatttagc cgccttagtt tgttcaagcg cggcggtgat cgctgtacat tcgtgttgac gcacccaggt  acaacgccaa ccaacgttgc tgaccctggc ccgctctggc ccgacaccaa cgatcgtggt ccagcttcga acaccacggt gttcggccac ccggcggtac tcggcacctc tcgtcggcat tgggcgctgc gcacctcggt cggctggcgc cgctgaccac cctcgcctag ctttcttatg gggactgggg gttactgtac cagcggcgcg caccgccatg gctgcaggtc ggtcatctgc gctggtgcgc  cggcttcacg ggtgaatgtc caacgccacg 
cgctggtacg cacgaccgct gacgggcaac cgccagcgcc gggtgaagtc cgccaatgac ggacaccttc gaccgctttc ctcgacgaac tgcgaccctg tgccaagtgg gaccttcgtc ctcggccttc gcgaggatcg ggcgctacgc ggcttgctca atggccgcgt aagccgacga gtcttcagct tatgaccgcg gtcttcctgt ggcggtctga  gctctgcaac ggcctgaccg ggcacctcgg gttgcgctgg cacgtcgaca gccggtctga gtcaccggca gtcacgatcc accatcatcg acgggtggca gtgacgatca ggcgctatcg gctcagtacc ttccagttcg agcggcgctg gccaccgaag ctagactaag gctggccggc ctttccgcca cggttcgcgc tcctcgacca tcttcatcaa tgctgaccag tcctggtcta agttcgacgg  tgggcgcgac ttctggcggc acgtgttcaa ctggcgtcga cgctgacgct acctgaccaa cgggctcggc gcggcggcgc gtggcgctgg cgggcgcgga ccgacgccgc ctgacggcgc tggacgctgc gcggcgacac acgcggtgat tcctgacgct agaccccgtc cttgcctagt caatttcgtg ggcgtcctga ggccgtgctg cattctggcc ccgcaacgtt cggcctgctc cgtggcccgg  ggcgggtgcg tccgaccggt cctgaccctg gacggtgaac gcaagccacc caccggcaac tgtgaccttc tggcgccgac cgctgacacc tatcttcgat tgtcggcgac cttcggcgct tgctgccggc ctatgtcgtc caagctgacc cgcctaagcg ttccgaaagg tccggtggct gtcgagacgg aggctcacaa gtcgcccgcc ctggtcagcc tcgaccctga gaggcgctgc gatcc  //  LOCUS DEFINITION  ACCESSION VERSION KEYWORDS SOURCE ORGANISM  REFERENCE AUTHORS TITLE JOURNAL REFERENCE AUTHORS TITLE JOURNAL  FEATURES source  gene CDS  JS4000RAT1 7493 bp DNA BCT 07-OCT-1999 Caulobacter crescentus S-layer subunit (rsaA(truncated)), A B C - t r a n s p o r t e r ( r s a D ) , a n d Membrane F o r m i n g U n i t ( r s a E ) g e n e s , complete c d s . JS4000RAT1  Caulobacter crescentus. Caulobacter crescentus B a c t e r i a ; P r o t e o b a c t e r i a ; alpha s u b d i v i s i o n ; Caulobacter group; Caulobacter. 1 ( b a s e s 1 t o 7493) B i n g l e , W . H . , Awram,P.A., N o m e l l i n i , J . F . a n d S m i t , J . K . The S e c r e t i o n S i g n a l o f t h e C. c r e s c e n t u s S - l a y e r p r o t e i n i s L o c a t e d w i t h i n t h e C - T e r m i n a l 82 Amino A c i d s o f t h e M o l e c u l e Unpublished 2 ( b a s e s 1 t o 7493) B i n g l e , W . H . 
, Awram,P.A., Nomellini,J.F. and Smit,J.K.
  TITLE     Direct Submission
  JOURNAL   Submitted (07-OCT-1999) Microbiology and Immunology, University of
            British Columbia, 300-6174 University Blvd, Vancouver, BC V6T 1Z3,
            Canada
FEATURES             Location/Qualifiers
     source          1..7493
                     /organism="Caulobacter crescentus"
                     /strain="JS4000"
     gene            637..1716
                     /gene="rsaA(truncated)"
     CDS             637..1716
                     /gene="rsaA(truncated)"
                     /note="The RsaA protein is truncated because of a deleted
                     G base pair. A stop codon results after translation of 359
                     amino acids."
                     /codon_start=1
                     /transl_table=11
                     /product="S-layer subunit"
                     /translation="MAYTTAQLVTAYTNANLGKAPDAATTLTLDAYATQTQTGGLSDA
                     AALTNTLKLVNSTTAVAIQTYQFFTGVAPSAAGLDFLVDSTTNTNDLNDAYYSKFAQE
                     NRFINFSINLATGAGAGATAFAAAYTGVSYAQTVATAYDKIIGNAVATAAGVDVAAAV
                     AFLSRQANIDYLTAFVRANTPFTAAADIDLAVKAALIGTILNAATVSGIGGYATATAA
                     MINDLSDGALSTDNAAGVNLFTAYPSSGVSGSTLSLTTGTDTLTGTANNDTFVAGEVA
                     GAATLTVGDTLSGGAGTDVLNWVQAAAVTALPTGVTISGIETMNVTSGAAITLNTSSG
                     VTGLTALNTNTSGAAQTVTAGAGQT"
     gene            3959..5695
                     /gene="rsaD"
     CDS             3959..5695
                     /gene="rsaD"
                     /codon_start=1
                     /transl_table=11
                     /product="ABC-transporter"
                     /translation="MFKRSGAKPTILDQAVLVARPAVITAMVFSFFINILALVSPLYM
                     LQVYDRVLTSRNVSTLIVLTVICVFLFLVYGLLEALRTQVLVRGGLKFDGVARDPIFK
                     SVLDSTLSRKGIGGQAFRDMDQVREFMTGGLIAFCDAPWTPVFVIVSWMLHPFFGILA
                     IIACIIIFGLAVMNDNATKNPIQMATMASIAAQNDAGSTLRNAEVMKAMGMWGGLQAR
                     WRARRDEQVAWQAAASDAGGAVMSGIKVFRNIVQTLILGGGAYLAIDGKISAGAMIAG
                     SILVGRALAPIEGAVGQWKTYIGARGAWDRLQTMLREEKSADDHMPLPEPRGVLSAEA
                     ASILPPGAQQPTMRQASFRIDAGAAVALVGPSAAGKSSLLRGIVGVWPCAAGVIRLDG
                     YDIKQWDPEKLGRHVGYLPQDIELFSGTVAQNIARFTEFESQEVIEAATLAGVHEMIQ
                     SLPMGYDTAIGEGGASLSGGQRQRLALARAVFRMPALLVLDEPNASLDQVGEVALMEA
                     MKRLKAAKRTVIFATHKVNLLAQADYIMVINQGVISDFGERDPMLAKLTGAAPPQTPP
                     PTPPPAPLQRVQ"
     gene            5763..7073
                     /gene="rsaE"
     CDS             5763..7073
                     /gene="rsaE"
                     /codon_start=1
                     /transl_table=11
                     /product="Membrane Forming Unit"
                     /translation="MKPPKIQRPTDNFQAVARIGYGIIALTFVGLLGWAAFAPLDSAV
                     IANGVVSAEGNRKTVQHLEGGMLAKILVREGEKVKAGQVLFELDPTQANAAAGITRNQ
                     YVALKAMEARLLAERDQRPSISFPADLTRLRADPMVARAIADEQAQFTERRQTIQGQV
                     DLMNAQRLQYQSEIEGIDRQTQGLKDQLGFIEDELIDLRKLYDKGLVPRPRLLALERE
                     QASLSGSIGRLTADRSKAVQGASDTQLKVRQIKQEFFEQVSQSITETRVRLAEVTEKE
                     VVASDAQKRIKIVSPVNGTAQNLRFFTEGAVVRAAEPLVDIAPEDEAFVIQAHFQPTD
                     VDNVHMGMVTEVRLPAFHSREIPILNGTIQSLSQDRISDPQNKLDYFLGIVRVDVKQL
                     PPHLRGRVTAGMPAQVIVPTGERTVLQYLFSPLRDTLRTTMREE"
BASE COUNT     1261 a   2627 c   2358 g   1247 t
ORIGIN
        1 aagcttcccc aagcctaggt gaaaagccga ccccccgtcg gcccaaacac gctagcagac
       61 gacacgataa ccgaactagt cttcgctgaa caggatctcg taggtgatcg gatcatagaa
      121 gcggagcagt tcgcgcacga acgtcttctc cgagacctcc agggccttgg cccagtcgcg
      181 gtaacggtcc ggcggaatgc ggccgcgacc cgtctccagc tgagagatga aggtgtaata
      241 atcagcgccg accttagcgg ccagctggcg ttgcgacagg ccggcggcct cgcgcatctc
      301 cttcagccag cggccacctt cgcggcggag gtcttgcacc tccgaggcgc tgcggcgttg
      361 cgggttacca tacattataa agcctcgcgc gttgaccgag ggcaggagcg cgggcgcgct
      421 cactcacccg ccaggtgaac agtctttata tatagcgctt ttcggcgggg ggtacaagga
481 541 601 661 721 781 841
901 961 1021 1081 1141 1201 12 61 1321 1381 1441 1501 1561 1621 1681 1741 1801 1861 1921 1981 2041 2101 2161 2221 2281 2341 2401 2461 2521 2581 2641 2701 2761 2821 2881 2941 3001 3061 3121 3181 3241 3301 3361 3421 3481 3541 3601 3661 3721 3781 3841  acgctatata ttgtcgacgt cagggggtgt gtgactgcgt ctcgacgcgt aacaccctga accggcgttg aacgacctga tcgatcaacc ggcgtttcgt gcgaccgccg atcgactacc gatctggccg atcggtggtt tcgaccgaca ggttcgaccc acgttcgttg ggcggtgctg accggcgtga ctgaacacgt gctcaaaccg cgaacaacgt cgggcacgac cgaactcgag tggctcaaac ccggtaactc ctacggtcgc ccacgaccgc actcgagcgc gcggcgctct cgacgaccgg acatcgctgg ccctgaacat cgggcatcac tggtcttcac tcgtcatggg gttcggtcaa tcagcgctga aaggctcgca cgaccttcac cgacgaccgt cgtcctcggc tcgccgccac cggccaagtc cggctgtcac tgtcggccaa cgctgaccgg tggtctacac tcaacgctat agctcgacct cggtcaccct acggcagcgg ttgacagctc gtctggtcac acgtctgatc aggcggggtc tgatttagcg  ggaatttgct atgacgtttg gggatttttt acaccaacgc acgcgactca agctggtcaa ccccgtcggc acgacgcgta tggccacggg acgcccagac ctggcgtcga tgaccgcctt tcaaggccgc acgcgaccgc acgcggctgg tctcgctgac cgggtgaagt gcaccgacgt cgatctcggg cttcgggcgt tcaccgccgg cgccgtcgac cacggtcggc cacgaccacc ggccggcaac cagcaccacg cggtcgcgtc cggcaagatc tctgacgacc gaccgccacg cgcgatcacg ttcgaccgcc ctcgggcgac ggtgaccaac gggcggcgct cgccggcgac cggcggcgac cccggccttc caacgccaac caacgttgcg gaccctggcc cgctctggcc cgacaccaac gatcgtggtg cagcttcgac caccacggtg ttcggccacc cggcggtacg cggcacctcg cgtcggcatc gggcgctgct cacctcggtt ggctggcgcg gctgaccacc ctcgcctagg tttcttatgg ggactggggg  gtaccggtta ctctatagcc ttgggagaca caacctcggc aacccagacg cagcacgacg cgctggtctg ctactcgaag cgccggcgcc ggtcgccacc cgtcgcggcc cgtgcgcgcc cctgatcggc cacggccgcg cgtgaacctg caccggcacc cgccggcgct cctgaactgg catcgaaacg gacgggtctg cgctggccaa ggcggcgcca gccaactcgg acgggcgcta gccgtgaaca gccgtgacgg aacggcgctg gccacggtca gtcaacctgt ccgaccgcca gactcggaag tcttcgacga gctcgcgtca agcgttggtg ggcgctgact gacaccgtca ggcaccgacg ggcggcttcg ggcttcacgg gtgaatgtcg aacgccacgg gctggtacgg acgaccgctc acgggcaacg 
gccagcgccg ggtgaagtcg gccaatgaca gacaccttca accgctttcg tcgacgaacg gcgaccctgg gccaagtggt accttcgtca tcggccttcg cgaggatcgc gcgctacgcg gcttgctcac  gaaaaatgct atcgctgctc atcctcatgg aaggcgcctg ggcggcctct gctgttgcca gacttcctgg ttcgctcagg ggcgcgacgg gcctatgaca gccgtggctt aacacgccgt accatcctga atgatcaacg ttcaccgcct gacaccctga gcgaccctga gtgcaagctg atgaacgtga accgccctga acctgaccgc acgtcaccgt ccgcttcggg tcgccgtgac ccacgttgac tcacccaaac tgacgatcac ccctgggcag cgggcacggg acaccctgac cggctgctga tcgccagcct cgatcacctc cgaccctcgg cgatcctgct ccgtcagctc ttctggtggc aaaccctccg ctctgcaact gcctgaccgt gcacctcgga ttgcgctggc acgtcgacac ccggtctgaa tcaccggcac tcacgatccg ccatcatcgg cgggtggcac tgacgatcac gcgctatcgc ctcagtacct tccagttcgg gcggcgctga ccaccgaagt tagactaaga ctggccggcc tttccgccac  gtacccctga aattcggcta ccatgcgcgc cactcggtcg cctatacgac ggcccagttg acgccgccac cacgctgacg cggacgccgc t g c g c t g a c c tccagaccta ccagttcttc tcgactcgac caccaacacc aaaaccgctt catcaacttc ctttcgccgc cgcctacacg agatcatcgg caacgccgtc t c c t g a g c c g ccaggccaac tcacggccgc tgccgacatc acgccgccac ggtgtcgggc acctgtcgga cggcgccctg atccgtcgtc gggcgtgtcg cgggcaccgc caacaacgac ccgttggcga caccctgagc ctgcggttac ggctctgccg cgtcgggcgc tgcgatcacc acaccaacac cagcggcgcg cacgaccgcc gctcaagccg cgcctcgacg ggcgtgacct caccgtgtcg gtgagcgtcg cggtggtacg gccgtgaccg gcaagccgac gtgaccgtga cgccgccgcc accgccggcg cgactctgcc gccgcctcgg cttcggcgcc gccacgatcg c a c c t c g c t c ggcatcggcc cctgaacgtc aatggtctga cgatggtttc accaccatca ggtggccgcc gacgcgacga gcacaccgct gccgccctga cgccgaactg gcgaccggtc gggcgccacg accaaggcga ggcgaccctg ggcgctggtg caacgtcaac ggttcgtcgt cgtcgctggc gcggcggctc gggcgcgacg gcgggtgcga tctggcggct ccgaccggta cgtgttcaac ctgaccctgt tggcgtcgag acggtgaaca gctgacgctg caagccacct cctgaccaac accggcaaca gggctcggct gtgaccttcg cggcggcgct ggcgccgact tggcgctggc gctgacaccc gggcgcggat a t c t t c g a t a cgacgccgct gtcggcgaca tgacggcgcc t t c g g c g c t g ggacgctgct gctgccggcg cggcgacacc t a t g t c g t c g cgcggtgatc aagctgaccg 
cctgacgctc gcctaagcga g a c c c c g t c t tccgaaaggg ttgcctagtt ccggtggcta a a t t t c g t g g tcgagacggc  3901 3961 4021 4081 4141 4201 4261 4321 4381 4441 4501 4561 4621 4681 4741 4801 4861 4921 4981 5041 5101 5161 5221 5281 5341 5401 54 61 5521 5581 5641 5701 57 61 5821 5881 5941 6001 6061 6121 6181 6241 6301 6361 6421 6481 6541 6601 6661 6721 6781 6841 6901 6961 7021 7081 7141 7201 7261  gccttagttg gttcaagcgc ggcggtgatc gctgtacatg cgtgttgacg cacccaggtg caagtcggtg catggaccag gacgccggtg catcgcctgc gatccagatg caacgccgag gcgccgcgac gtcgggcatc tctggccatc ccgcgccctg cggcgcctgg gccgctgccc cgcgcaacag ccttgtcggt gccctgcgcg gaagctgggt cgcccagaac cctggcgggc cgagggcggc gttccgcatg cgaagtggcg cgccacccac tgtgatcagc gccccagacg ttcgtctgtc caatgaagcc gctacggcat tcgacagcgc agcacctcga ccggccaggt gcaaccagta gtccgtccat gcgccatcgc tcgacctgat agacccaggg agctctatga cctcgctgtc cctctgacac agagcatcac ccgacgccca gcttcttcac aggacgaggc tgggcatggt acggcacgat acttcctcgg tcaccgccgg acctgttctc gtttcaaggg cggcgctgct cagcggatac tacctcgcca  ttactgtaca agcggcgcga accgccatgg ctgcaggtct gtcatctgcg ctggtgcgcg ctggactcca gtccgagagt ttcgtcatcg attatcatct gccaccatgg gtcatgaagg gagcaggtgg aaggtgttcc gacggcaaga gcgcccatcg gatcgtctgc gagccgcgcg ccgaccatgc cccagcgcgg gccggcgtca cgccacgtcg atcgcccgct gtgcacgaga gcctcgctgt ccggccctgc ctgatggaag aaggtgaacc gactttggcg ccgccgccga cgcctctccc ccccaagatc catcgccctg ggtgatcgcc aggcggcatg gctgttcgag tgtggcgttg cagcttcccc cgacgaacag gaacgcccag cctgaaggac caagggcctg gggctcgatc ccagctcaag cgagacccgg gaagcggatc cgagggcgct cttcgtgatc caccgaagtt ccagtctctg gatcgtgcgc catgccggcc gccgctgcga cctgatttcc ttccattcgc gcatggcgaa agctgctgct  tggccgcgtc agccgacgat tcttcagctt atgaccgcgt tcttcctgtt gcggtctgaa cgctcagccg tcatgaccgg tctcgtggat tcggcctggc cctcgatcgc ccatgggcat cctggcaggc gcaacatcgt tctcggccgg agggcgcggt agaccatgct gcgtgctgtc gccaggccag cgggcaagtc tccgcctcga gctacctgcc tcaccgagtt tgatccagag ccggcggcca tggtgctgga cgatgaagcg tgttggccca aacgcgaccc cgccgccgcc ttcctggccg 
cagcgtccga acctttgtcg aacggcgtcg ctggccaaga ctggacccga aaggccatgg gccgacctga gcccagttca cgtttgcagt caactcggct gtgccccggc ggccgtctga gttcgccaga gttcgcctgg aagatcgtgt gtcgttcgcg caggcgcact cggctgccgg tcgcaggacc gtggacgtca caggtgatcg gacaccctgc aaagcttttc gggcaatagt aacggctttg ggagaagggt  ggttcgcgcg cctcgaccag cttcatcaac gctgaccagc cctggtctac gttcgacggc caagggcatc cggcctgatc gctgcacccg cgtgatgaac cgcccagaac gtggggcggc cgccgccagc ccagaccctg cgcgatgatc gggccagtgg gcgcgaggaa ggccgaagcc cttccgcatc ctcgctgctg cggctacgac gcaggacatc cgagtcgcag cctgccgatg gcgccagcgc cgagccgaac gctcaaggcc ggccgactac gatgctggcc cgcgccgttg tttcagaacg cggacaactt gtctgttggg tctccgccga tcctggtccg cccaggccaa aagcgcgcct cccgcctgcg ctgagcgtcg atcagagcga tcatcgagga cgcgtctgct ccgcagaccg tcaagcagga ccgaggtgac cgccggtcaa ccgccgagcc tccagccgac ccttccactc gcatttccga agcagctgcc tgccgaccgg gcaccacgat ggatgggcgg gtagtcagga atcaccggtg tacaccgtcc  gcgtcctgaa gccgtgctgg attctggccc cgcaacgttt ggcctgctcg gtggcccggg ggcggccagg gccttctgcg ttcttcggca gacaacgcca gacgccggtt ctgcaagccc gacgccggcg atcctgggcg gccggctcga aagacctata aagagcgccg gcctcgatcc gacgccggcg cgcggcatcg atcaagcagt gagctgttct gaagtcatcg ggctatgata ctggccctgg gccagcctcg gccaagcgca atcatggtga aagctgaccg cagcgcgtcc cgcccatcag ccaggctgtg ctgggccgcg gggtaatcgc cgaaggcgag cgccgccgcc gctggccgag cgccgatccg ccagacgatc gatcgagggc cgagctgatc ggccctggag ctccaaggcc gttcttcgag cgagaaggag cgggacggcg gctggtcgac cgatgtggac gcgggaaatc tccgcagaac gccgcatctg cgagcgcacc gcgcgaggag cgcgggcgag cccttcgttg tgaccggtca acggcatgct  ggctcacaat tcgcccgccc tggtcagccc cgaccctgat aggcgctgcg atccgatctt cgttccgcga atgcgccctg tcctggcgat ccaagaaccc ccaccctgcg gctggcgcgc gcgcggtgat gcggcgccta tcctggtcgg tcggcgcgcg acgaccacat tgccgccggg ccgcggtggc tcggcgtctg gggatcccga cgggcaccgt aggccgcgac cggcgatcgg cccgcgcggt accaggtggg cggtgatctt tcaaccaggg gggctgcgcc agtaagcgcc gcttgaatct gcccgtatcg ttcgccccgc aagaccgtgc aaggtgaagg ggcatcaccc cgcgaccagc atggtcgccc cagggccagg atcgaccgtc gacctgcgta 
cgcgagcagg gtccagggcg caggtcagcc gtcgtcgcct cagaacctgc atcgcgcccg aatgtccata ccgatcctga aagctcgact cgcggcaggg gtgctgcagt taggtcaaag ctaaattcgc ttactggagt ggacggggcg gcgtcgctcg  7321 7381 7441  gcctcggccg atgtgatcgg cgaccgcctg cgctggatcg gcgtctatga ttcgagctgg gcgacctctt ggacgagggc ggtctggcgc gcctgatgcg ccggatgagg tctacaacct ggcggcccag agcttcgtcg gcgcctcgtg  cgacatccag gcgcctgcag gga

Appendix 3 Sequences of lpsGHIJK, orf1 and orf2

LOCUS       gcc227    4883 bp    mRNA    BCT    15-OCT-1999
DEFINITION  gcc227.
ACCESSION   gcc227
VERSION
KEYWORDS
SOURCE      Caulobacter crescentus.
  ORGANISM  Caulobacter crescentus
            Bacteria; Proteobacteria; alpha subdivision; Caulobacter group;
            Caulobacter.
REFERENCE   1 (bases 1 to 4883)
  AUTHORS   Awram,P.A.
  TITLE     Analysis of the S-layer Transporter Mechanism and Smooth
            Lipopolysaccharide Synthesis in Caulobacter crescentus
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 4883)
  AUTHORS   Awram,P.A.
  TITLE     Direct Submission
  JOURNAL   Submitted (15-OCT-1999) UBC
FEATURES             Location/Qualifiers
     source          1..4883
                     /organism="Caulobacter crescentus"
                     /strain="NA1000"
     gene            complement(1..1242)
                     /gene="orf3"
     CDS             complement(1..1242)
                     /gene="orf3"
                     /codon_start=1
                     /product="putative glycolipid transporter"
                     /translation="MSAAASTPQEYKRLTQYEVDVICAKHDRLWSARMGGARAVFAFC
                     DLSGLSVPGRNLCDADFTGAILVGCDLRKAKLDNANFYGADLQGADLTDASLRRADLR
                     GSSLRGANLTGADMFEADLREGTIAAADRKEGYRVIEPTQREAFAAGANLSGANLERS
                     RLSGIVATKADFSDAILKDAKLVRANLKQANFNGANLAGADLSGANLAGADLRNAVLV
                     GAKTLSWNVNDTNMDGALTDKPSGTSVSDLPYEQMIADHARWIETGGGEGKPSVFDKA
                     DLRNLRSVRGFNLTALSAKGSVFYGLDMEGVQMQGAQLDGADLRACNLRRADLRGARL
                     KGAKLTGADLRDAQLGPLLIAADRLLPVDLTGAILTNADLARADLRQARMAGADVSRA
                     NFTGAQLRDLDLTGAIRLAARG"
     gene            complement(1335..2048)
                     /gene="orf4"
     CDS             complement(1335..2048)
                     /gene="orf4"
                     /codon_start=1
                     /product="putative phosphoglycerate mutase"
                     /translation="MPTLVLLRHGQSQWNLENRFTGWVDVDLTAEGEAQARKGGELIA
                     AAGIEIDRLFTSVQTRAIRTGNLALDAAKQSFVPVTKDWRLNERHYGGLTGLNKAETA
                     EKHGVEQVTIWRRSYDIPPPELAPGGEYDFSKDRRYKGASLPSTESLATTLVRVLPYW
                     ESDIAPHLKAGETVLIAAHGNSLRAIVKHLFNVPDDQIVGVEIPTGNPLVIDLDAALK
                     PTGARYLDDSRAEALPKVG"
     gene            2421..3377
                     /gene="orf5"
     CDS             2421..3377
                     /gene="orf5"
                     /codon_start=1
                     /product="putative sugar phosphate isomerase KpsF-like"
                     /translation="MSAFNAVQVGRRVLAVEADALRVLADSLGEAFANAVETIFNAKG
                     RVVCTGMGKSGHVARKIAATLASTGTQAMFVHPAEASHGDLGMIGPDDVVLALSKSGA
                     GRELADTLAYAKRFSIPLIAMTAVADSPLGQAGDILLLLPDAPEGTAEVNAPTTSTTL
                     QIALGDAIAVALLERRGFTASDFRVFHPGGKLGAMLRTVGDLMHGADELPLVAADAAM
                     PDALLVMSEKRFGAVGVVDNAGHLAGLITXGDLRRHMDGLLTHTAGEVMTHAPLTIGP
                     GALAAEALKVMNERRITVLFVVERERPVGILHVHDLLRAGVI"
     gene            3477..4883
                     /gene="lpsg"
     CDS             3477..4883
                     /gene="lpsg"
                     /codon_start=1
                     /product="phosphomannomutase"
                     /translation="MFSSPRADLVPNTAAYENEALVKATGFREYDARWLFGPEINLLG
                     VQALGLGLGTYIHELGQSKIVVGHDFRSYSTSIKNALILGLISAGCEVHDIGLALSPT
                     AYFAQFDLDIPCVAMVTASHNENGWTGVKMGAQKPLTFGPDEMSRLKAIVLNAEFVER
                     DGGKLIRVQGEAQRYIDDVAKRASVTRPLKVIAACGNGTAGAFVVEALQKMGVAEWP
                     MDTDLDFTFPKYNPNPEDAEMLHAMADAVRETGADLAFGFDGDGDRCGVVDDEGEEIF
ADKIGLMLARDLAPLHPGAXFVVXVKSTGLYATDPILAQHGCKVIYWKTGHSYIKRKS AELGALAGFEKSGHFFMNGELGYGYDCGLTAAAAILAMLDRNPGVKLSDMRKALPVAF TSLTMSPHCGDEVKYGVVADVVKEYEDLFAAGGSILGRKITEVITVNGVRVHLEDGSW VLVRAS SNKPEVVVVVE S S" 807 a 1647 c 1587 g 837 t 5 others  BASE COUNT ORIGIN 1 gccgcgcgcc 61 a a a g t t g g c g 121 g a g a t c g g c g 181 g a t g a g c a g c 241 c a a c c g c g c g 301 c a a c t g c g c g 361 c g c c g a c a g g 421 g t c g a a c a c c 481 g a t c a t c t g c 541 g c c g t c c a t a 601 g t t g c g c a g a 661 g c c g t t g a a a 721 g t c a c t g a a g 781 g c c c g a c a g g 841 a c c t t c c t t g 901 g g c g c c g g t c 961 g g c g t c g g t c 1021 g g c c t t t c g c 1081 g c g c c c c g g c 1141 c a t c c g c g c c 1201 c a g g c g c t t g 1261 a g t t a a c g g g 1321 a a g c t t t c g g 1381 c g c g c g c c g g 1441 a t c t c c a c a c 1501 a g c g a g t t g c 1561 t c g c t t t c c c 1621 a g g c t t g c g c 1681 t c c g g c g g c g 1741 t c g g c g g t c t 1801 c a g t c c t t g g  gccagcctga cgcgagacat ttggtaagaa ggccccagct cctcgcaggt ccctgcatct gccgtgagat gagggcttgc tcatacggca ttggtgtcgt tcagcgccgg ttggcctgtt tccgccttcg ttcgcgccgg cggtcggcgg aggttggcgc aagtcggcgc aggtcacagc acggagaggc gaccacagac tattcttgtg acaccttgcg cgttttaacc tcggcttcaa cgacgatctg cgtgggcggc agtagggcag ccttgtagcg ggatatcgta cagccttgtt tcacggggac  tggcgccggt ccgcgccggc tcgctcccgt gtgcgtcacg cggcgcggcg gcacgccttc tgaagcctcg cctcgccgcc ggtcagagac tgacgttcca cgaggttagc tgaggttggc tcgcgacgat ccgcgaaggc cggcgatcgt cgcgcagact cttgcagatc cgacgagaat cagacaggtc ggtcatgctt gcgtgctggc cagagcaaaa aactttggga cgcagcatcc gtcgtccggc gatcagcaca aacgcgaacc gcgatccttg cgaacggcgc caggccggtc gaagctctgc  cagatccagg cattcgggcc cagatcgacc caggtccgct caggttgcag catgtcgagg gacagatcgc gccggtctcg gcttgtgccc ggacagggtc gcccgacaga ccgaaccagc gcccgacagg ttcgcgttgg gccctcccgc ggatccgcgc cgcgccatag ggcgccggtg gcagaacgcg ggcgcagata ggcggcgctc ttaacgccgt agcgcctcag aggtcgatca acgttgaaca gtctcacccg agcgtcgtcg ctgaagtcgt 
cagatggtca agcccgccat ttggcggcgt  tcgcgtagct tgccgcaggt ggcagcaagc cccgtcagct gctcgcagat ccgtagaaca agattccgca atccatcgtg gacggcttgt ttggccccaa tcggcccccg ttggcgtcct cgcgaacgct gtcggttcga agatcggcct aggtcagccc aagttggcgt aaatcggcgt aaaaccgcgc acgtccactt atggctcgct gtttccaaca cccgactgtc ccagcggatt gatgcttgac ccttgagatg ccaggctttc actcgccgcc cctgctcgac agtggcgctc ccagcgccag  gcgcgccggt cggcgcgcgc ggtcggctgc tggcgccctt cggcgccatc ccgacccctt gatcggcctt cgtggtcggc cggtcaacgc ccagcacggc ccaggttggc tcaggatagc cgaggttggc tgacgcgata cgaacatgtc gccgcaacga tgtccagctt cgcagagatt gcgcgccccc cgtactgcgt gggtatcccg ccttccgctt gtccaggtag gccggtcgga gatggcgcgc cggagcgatg cgtgctgggc cggcgccagc gccgtgcttt gttcaggcgc attgcctgtg  130  1861 1921 1981 2041 2101 2161 2221 2281 2341 2401 2461 2521 2581 2 641 2701 2761 2821 2881 2941 3001 3061 3121 3181 3241 3301 3361 3421 3481 3541 3601 3661 3721 3781 3841 3901 3961 4021 4081 4141 4201 42 61 4321 4381 4441 4501 4561 4621 4 681 4741 4801 4861  cggatggcgc atcagctcgc caaccggtga gtcggcatcg cttcagcgtc ccaggccgcc ggactcatcg cgctcctggg attcagcggg gccatggatc tcgccgtaga atgcggtcga cggggcatgt tcgtccaccc ttctggcgct agcgcttctc cgggcgacat cgaccacgtc agcggcgcgg ctatgctgcg ccgacgccgc gcgtcgttga acatggatgg tcggccccgg tgcttttcgt gcgcgggtgt gtcgtgccgc tctcctcgcc tcaaggcgac tcctgggcgt cgaagatcgt tcctggggct ccgcctattt acaacgaaaa ccgacgagat gcggcaagct gcgccagcgt ccttcgtggt acctcgactt cgatggctga gcgaccgctg tgatgctggc agtcgacggg actggaagac gcttcgaaaa gcctgaccgc cggacatgcg gtgacgaggt ccgccggcgg gggtgcacct tcgtcgtcgt  LOCUS DEFINITION  gcc506 gcc506.  
gggtctgaac cgaggtgaac cgcccttccg ggcctgagct agcggttttc caggttccac g g c t t c c t t c ggagatcagg aagcgccgaa gcgaccaagg gcgcgtgcta taggcgagcc ccgccgcgct gatcgccctc gccccttcgg ccacacgccc aaaaggactc ggcgcgccgt agctttcgca atgagcgcct agccgatgcg ctgcgcgtcc g a c g a t c t t c aacgccaagg ggcgcggaag a t c g c c g c c a cgccgaagcc tcgcacggcg g t c c a a g t c g ggcgccggcc gatcccgctg atcgccatga cctgctgctg ctccccgacg gaccaccctg cagatcgcgc cttcaccgcc agcgacttcc cacggtcggc gacctgatgc catgcccgac gctttactgg taacgcgggt cacctggccg gctgctgacc cacaccgccg cgccctggcg gctgaagcgc cgtcgagcgc gaacgccccg gatctaggtc acatcgaaac gtccgcacgg ctaaagccat ccgcgccgat ctggttccga g g g c t t t c g c gagtacgacg gcaggccctg ggcctgggtc ggtcggccat gacttccgct gatcagcgcc ggctgcgagg cgcccagttc gacctcgaca cggctggacc ggcgtgaaga gagccgcctc aaggccatcg g a t c c g c g t g cagggcgagg cacccgtccc ctgaaggtga cgaggccctg cagaagatgg caccttcccc aagtacaatc c g c t g t c c g t gagacgggcg c g g t g t g g t c gatgacgagg gcgcgacctg gccccgctac cctmtacgcc accgatccga cggccacagc tacatcaagc gagcggccac t t c t t c a t g a cgccgcggcc a t c c t g g c c a caaggccctg ccggtggcct gaagtacggc g t c g t c g c c g t t c g a t c c t g ggccgcaaga ggaggacggc t c c t g g g t c c g g t c g a a a g c age  8012  bp  mRNA  aageggtega tcgccctcag tggctttggc gaaatgtcag acccgatcgg atgectgate tctctcgtct gttcagcaga egggacgagg teaaegctgt tcgcagactc gtcgcgtcgt ccctcgcttc acctgggcat gcgaactggc ccgccgtggc cgcccgaggg tgggegaege gcgtcttcca acggcgccga teatgagega gcttgatcac gcgaggtcat tgaaggttat teggcattet tttgeaaaac accatctcaa atacggccgc cgcgctggct tgggaaccta egtattcgae tgcacgacat tcccgtgcgt tgggcgccca tgetgaaege cccagcgcta tcgccgcctg gtgtcgctga ccaaccccga cggacctggc gcgaggagat atccgggcgc tcctggccca geaagagege aeggegaget tgctggaccg tcacctccct acgtggtgaa tcaccgaggt tggtccgcgc  t c t c g a t g c c ggccgcagcg cggtgaggtc cacatcaacc catggcgcag caggacgagc ggcgggctaa ggccagcccg cccggtgcgc gccccctccc gcatcttcat gectctgatg ggccccaggg ccagggcgac ccccggagat gaaggccaag 
eggegaaaaa g g c t g t c g a g tcaggtgggc cgccgcgtcc getgggegag g c c t t c g c c a ctgcacgggc atgggcaagt gaccggcacc caggegatgt gategggeca gacgacgtgg cgacaccctg gcctacgcta cgacagcccg cteggtcagg gaeggeggaa g t g a a c g c c c gatcgctgtg getctgetgg ccccggcggc aagctcggcg tgagcttccc ctggtcgccg a a a g c g t t t c ggegeggtcggkacggtgat ctgcgtcgac gacgcacgct cccctgacca gaacgagegg c g g a t c a c c g acatgtgcac gacctgcttc c t t g t c a t g c gaacgegcta ctgaagegag c c t t c a a t g t ctacgaaaac gaagccctgg gtttgggccg gagatcaatc tatccacgaa ctgggccaat ctcgatcaag aacgccctga tggcctggcc ctgtcgccca cjgccatggtc acggccagcc gaagccgctg accttcggcc c g a g t t c g t c gagegegatg tatcgacgac gtggccaagc cggcaacggc acggccggcg ggtcgtgccg atggacaccg agacgecgag a t g c t g c a c g g t t e g g c t t e gaeggegacg c t t c g c c g a c aagateggee greyttegtc gtgratgtga gcacggctgc aaggtgatct ggagctgggc gccctggccg gggctatggc tacgactgeg caatcccggc gtgaagctgt gaccatgagc ccgcactgcg ggaatacgag g a c c t g t t c g gatcaeggtc aacggcgtgc c t c g t c c a a c aageccgagg  BCT  15-OCT-1999  ACCESSION VERSION KEYWORDS SOURCE ORGANISM  gcc506  Caulobacter crescentus. Caulobacter crescentus B a c t e r i a ; P r o t e o b a c t e r i a ; alpha s u b d i v i s i o n ; C a u l o b a c t e r group; Caulobacter. 1 ( b a s e s 1 t o 8012) REFERENCE Awram,P.A. AUTHORS A n a l y s i s o f t h e S - l a y e r T r a n s p o r t e r M e c h a n i s m a n d Smooth TITLE Lipopolysaccharide Synthesis i n Caulobacter crescentus Unpublished JOURNAL 2 ( b a s e s 1 t o 8012) REFERENCE Awram,P.A. 
AUTHORS
  TITLE     Direct Submission
  JOURNAL   Submitted (15-OCT-1999) UBC
FEATURES             Location/Qualifiers
     source          1..8012
                     /organism="Caulobacter crescentus"
                     /strain="NA1000"
     gene            complement(1845..3860)
                     /gene="orf6"
     CDS             complement(1845..3860)
                     /gene="orf6"
                     /codon_start=1
                     /product="putative transketolase"
                     /translation="MRVRPSRSPAKHIKTEAPMPVSPIKMADAIRVLSMDAVHKAKSG
                     HQGMPMGMADVATVLWGKFLKFDASKPDWADRDRFVLSAGHGSMLLYSLLHLTGFKAM
                     TMKEIENFRQWGALTPGHPEVHHTPGVETTTGPLGQGLATAVGMAMAEAHLAARYGSD
                     LVDHRTWVIAGDGCLMEGVSHEAISIAGRLKLSKLTVLFDDNNTTIDGVATIAETGDQ
                     VARFKAAGWAVKVVDGHDHGKIAAALRWATKQDRPTMIACKTLISKGAGPKEGDPHSH
                     GYTLFDNEIAASRVAMGWDAAPFTVPDDIAKAWKSVGRRGAKVRKAWEAKLAASPKGA
                     DFTRAMKGELPANAFEALDAHIAKALETKPVNATRVHSGSALEHLIPAIPEMIGGSAD
                     LTGSNNTLVKGMGAFDAPGYEGRYVHYGVREFGMAAAMNGMALHGGIIPYSGTFLAFA
                     DYSRAAIRLGALMEARVVHVMTHDSIGLGEDGPTHQPVEHVASLRAIPNLLVFRPADA
                     VEAAECWKAALQHQRTPSVMTLSRQKTPHVRTQGGDLSAKGAYELLAAEGGEAQVTIF
                     ASGTEVGVAVAARDILQAKGKPTRVVSTPCWELFDQQPAAYQAAVIGKAPVRVAVEAG
                     VKMGWERFIGENGKFIGMKGFGASAPFERLYKEFGITAEAVAEAALA"
     gene            4281..6041
                     /gene="orf7"
     CDS             4281..6041
                     /gene="orf7"
                     /codon_start=1
                     /product="putative NH(3)-dependent NAD(+) synthetase"
                     /translation="MIVVGGPLRDAGRLYNTAIVIQGGKVLGVVPKSFLPNYREFYER
                     RWFTPGAGLTGKTLTLAGQTVPFGTDILFRGEGVAPFTVGVEICEDVWTPTPPSTAQA
                     LAGAEILLNLSASNITIGKSETRRLLCASQSSRMIAAYVYSAAGAGESSTDLAWDGHV
                     DIHEMGALLAETPRFSTGPAWTFADVDVQRLRQERMRVGSFGDAMALSPASTPFRIVP
                     FAFDAPEGDLALARPIERFPFTPSDPARLRENCYEAYNIQVQGLARRLEASGLKKLVI
                     GISGGLDSTQALLVAAKAMDQLGLPRSNILAYTLPGFATSDRTKSNAWALMKAMAVTA
                     AELDIRPAATQMLKDLDHPFGRGEAVYDVTFENVQAGLRTDYLFRLANHNAALVVGTG
                     DLSELALGWCTYGVGDHMSHYNPNCGAPKTLIQHLIRFVAHSGDVGAETTALLDDILA
                     TEISPELVPGEAVQATESFVGPYALQDFNLYYMTRYGMAPSKIAFLAWSAWHDADQGG
                     WPVGLPDNARRAYDLPEIKRWLELFLKRFFANQFKRSAVPNGPKISSGGALSPRGDWR
                     MPSDATADAWLAELRTNAPI"
     gene            6121..7446
                     /gene="lpsH"
     CDS             6121..7446
                     /gene="lpsH"
                     /codon_start=1
                     /product="putative mannose-6-phosphate isomerase"
                     /translation="VWGQDLAAIYPVILCGGSGTRLWPASRSDHPKQFLKLVSDRSSF
                     QETVLRVKDIPGVAEVVVVTGEAMVGFVSEQTAEIGAWATILVEPEARDSAPAVAAAA
                     AYVEAQDPAGVVLMLAADHHIAQPEIFQQAALTATKAAEQGYIVTFGVQPTVPATGFG
                     YIRPGAPLLDGSVREVAAFVEKPDQATAERYLLEGYLWNSGNFAFQAATLLGEFETFE
                     PSVAAAAKACVAGLQLEAGIGRLDREAFAQAKKISLDYAIMERTQKAAVAPAAFAWSD
                     LGAWDAIWEASTRDGDGNAQTGDVDLHGSSNVLVRSTGPYVGVIGVNDIVVVAEPDAV
                     LVCHRKDSQAVKTLVDGLKAKGRSIASRKSASPNGTETLVSTDGFDVELRRVPAGETL
                     MLPVSTLQVLEGVIEMDGDVYAAGAIIALDDSVQARAIGAATLLVTKPR"
BASE COUNT     1292 a   2635 c   2741 g   1342 t      2 others
ORIGIN
1 gcttgaagct 61 a g g g c g t t g g 121 a g a t a g c c g a 181 a t g a c g a c g t 241 g c g t c g g a c g 301 a g g g c g a t g a 361 t g a a g c g c c a 421 a c c c a g c g c a 481 c g c a g a t a g g 541 t c a g c g c t c a 601 t a a c c c t a g t 661 g t g t c c g g c g 721 a t a a a a t c c c 781 g g c t g g g t c c 841 t c g t a c g g c c 901 c g c c g c g c c c 961 g t c a a g c g c c 1021 c g t c a c c g c c 1081 g a t a g t t c t g 1141 c g a g c g a c g g 1201 a a t c c a g c a c 1261 a g c t g a g g a t 1321 t g a a g a a a a c 1381 g c c g c g c c a c 1441
a t c t c c g c c g 1501 g g t c a g g g g t 1561 c g a t c g g a t c 1621 g c g a c g a c t c 1681 a g g g t c c c c a 1741 g g c g a c c a t c 1801 g c g g t c g g g g 1861 t c g g c c a c g g 1921 g c g c c g a a g c 1981 a t c t t g a c g c 2041 t a g g c g g c g g 2101 t t g c c c t t g g 2161 g c g a a g a t c g 2221 g c c g a c a a g t 2281 a c c g a c g g c g 2341 t c g g c c g g a c 2401 g g c t g g t g g g 2461 a c g c g g g c c t 2521 a g g a a g g t g c  tgcgcgaggc ccaggggtgc cgtacggcgt tcatcagcgc cgaacagggg cggagaagcg gccaggtctc agccaacgac tattgacgcg tgggcgctcc cacgaggggc cgcaccggaa aataatgccg acgccctcca gagcraggaa tcgacgggcg caaccctcgg acgccgtcag cacgcccgaa gatcaggcgt ctggtacaca cgccgagatg gaattgaatg gcgcgcgctg ctccttgggg tcgattcccc gcccatcgga gccacagcct tcgctgtggg aggtgtttct gcggagctcc cttcagcggt ccttcatgcc ccgcttcgac gctgctgatc cctgcaggat tcacctgcgc cgccgccctg tgcgttgatg ggaagaccag tgggaccgtc ccatcagggc ccgaataggg  gaccgccacg atcgggattg cgctgcggcg cccggccagc aatcgccagg gcgccagcga caggaactgg gaggtgcagg tcgcgaggcg gccccccccc taagaggaaa gatttttgcg acccgcgcta gttcgagacg tcttcagggt gactggcgac cggcgtctcg ccaagaagag atgatcagca tggaacgggc gcggtgttgc acgatggccc atggcggtca gcaagcaagg ccgtagctca tcggctccac ctccgtccgc gggttcgagg gcaggcatgg gtttcggacc ggactccacc gatgccgaac gatgaacttg ggcgacgcgc gaacagctcc gtcgcgcgcg ctcgccgccc ggtgcggacg ttgcagggcc caggttcgga ttcgccgaga gcccaggcgg gatgatcccg  gccgcgatga gcctgccggg atggtcgctg agcatgccgc aatccgaaga tcagggctgc atcgcggtgt gtcgcgcaga acagcgcgtt aaggagatcg tgaaagacca cgttactgcc gttcggcggc ctttagccgc ctttgccaga cctcaccttc cccgcccctc cgatcagcgt gaacaacgat gaagcaccgg ggcggttgat accaaaggag tgggtttccc agccgcttga gatggtagag caaggacctc attttcttgc aacagcggcg ggccagggtc gcgctatcga cccgcgcagc tctttataca ccgttctcgc accggggcct cagcagggcg gcgacggcga tcggccgcca tgcggggtct gccttccagc atggcgcgca ccgatggagt atggccgcgc ccgtgcaggg  gcaggagcgc gccaaaccac ccacgatggg gatgaccagg agcgaaggcc gccgtggacg ccaaggtcgg cgcgaaccgg cggcggaggt 
tagaaagggc tctgcggtcg cttgatgagc agagcgctcc ccagatccaa tcgcaaaggt cgagctgagg ccaggtcaat atccttcgtc cagagcaatc ccttacccaa gcgtttcgtc aacggcttgg attggggcag gttttggggc ccgaacgcgc gtgcgacatc g c g g t t t c g c cagccgccag aaggcgatga gggccgcatt agatagaggc ggccgtcggc ggggtcaggc gcaccacgag g c c t t g a a g c gcaggcagca cgggctgatg tcgacgccgc tccggtcacg cgatccagga cacgtcgaac gcgaccagcc gctgagcagg ccgccgagga gtccggactt tccgcttctt cgagcgcgct ttggggccgt c g c c t c g t t c gcaatgagga c c t t a g c c t c gatcacgacc ctaacgtcgc gtgcgcccct cctcgccggc aagatcggcc aggcgcgagc c t g c t c t g t a tccgccctgg cccttattct gatttcaggc cagagccgct ggcgctcgaa cggagccgag caatgaagcg ctcccagccc tgccgatgac ggcggcctgg tggagaccac gcgggtcggc cgccgacctc ggtgcccgag gcagctcgta ggcgcccttg tctggcgcga cagggtcatc actcggcggc ctcgacggca agctggcaac gtgctcgacc cgtgggtcat cacgtggacg g g c t g t a g t c ggcgaaggcc ccatgccgtt catggccgcg  133  2581 2641 2701 2761 2821 2881 2941 3001 3061 3121 3181 3241 3301 3361 3421 3481 3541 3601 3661 3721 3781 3841 3901 3961 4021 4081 4141 4201 4261 4321 4381 4441 4501 4561 4621 4681 4741 4801 4861 4921 4981 5041 5101 5161 5221 5281 5341 5401 54 61 5521 5581 5641 5701 57 61 5821 5881 5941  gccatgccga aacgcgccca atcatctcgg gcgttgaccg ttcgccggca agcttggcct ttggcgatgt gcggcgatct cccgcgccct gcccagcgca cagccggcgg tcgatggtgg gcgatgctga caggtgcggt gccatgccga ggcgtgtgat atctccttca gagccgtgac tcgaatttca ccctggtggc gccatcttga cgcgaggggc cgtgcggaac cagacttatg tcttctcgcc tggcggatcc agggcgtggc ttctgcagca ccagcgcagg tctacaatac tcctgcccaa caggcaagac ggggcgaggg cgaccccgcc ccagcaacat cgcggatgat tggcctggga ggttttcgac agcggatgcg tccggatcgt tcgaacgctt cctacaacat tcgtcatcgg ccatggacca cgacgtccga ccgccgagct tcgggcgcgg ccgactatct tgtcggagct accccaactg cgggtgacgt cgccggagct ccctgcagga ccttcctggc ccgacaacgc tgaagcggtt cgtcgggcgg  actcacgcac tgcccttgac ggatcgccgg gcttggtctc gctcgccctt cccaggcctt cgtcgggcac cgttgtcgaa tcgagatcag gggccgcagc ccttgaagcg tgttgttgtc tcgcctcatg ggtcgacgag cggcggtggc gcacttccgg 
tcgtcatggc cggccgacag ggaacttgcc cggacttcgc tgggcgaaac gcactcgcat gcctgtcgtg ctcagtttca ctaccgtcac cgccgccaat tgtggtcgtg agaggcgttg cctggcgccg cgcgatcgtc ctatcgcgag cctgaccctg cgtcgccccg cagcaccgcc caccatcggc cgcggcctat cggccatgtc gggcccggcc cgtcggcagc tccgttcgcc tcccttcacg ccaggtccag tatttccggc gctgggcctg tcgcaccaag cgatatccgg cgaggcggtc gttccgtctg ggcgctgggc cggtgcgccc cggcgccgag ggtgcccggc cttcaatctc ctggagcgcc tcgccgcgcc cttcgccaac cgcgttgtcg  gccatagtgg cagggtgttg gatcaggtgc cagggccttg catggcgcgg gcggaccttg ggtgaagggc cagggtgtag cgtcttgcac gatcttgccg cgcgacctgg gtcgaagagg gctgacgccc gtcagagccg caggccctgg gtggcccggg cttgaagccg cacgaagcgg ccataggacc cttgtgcacg gggcatgggg gggaaccccc gacgctatgc gcataacccg ggtttcgtcc gctcagaacg ttcccggaac ctggacgcgg atgatcgtgg atccagggcg ttctacgagc gccggccaga ttcacggtgg caggccttgg aagtccgaaa gtctattcgg gatattcacg tggaccttcg ttcggcgacg tttgacgcgc ccgtccgacc ggcctggcgc gggctcgact ccgcgcagca tccaacgcct cccgcagcga tatgacgtca gccaaccaca tggtgcacct aagacgctga accacggctc gaggcggttc tactacatga tggcatgacg tacgacctgc cagttcaagc ccgcgggggg  acatagcggc ttcgagccgg tccagggccg gcgatgtggg gtgaagtcgg gcgccgcgac gcagcgtccc ccgtggctgt gcgatcatgg tggtcgtggc tcgccggtct accgtcagct tccatcaggc tagcgggcgg cccagcggac gtcagcgccc gtcagatgca tcgcggtcgg gtcgccacgt gcgtccatgg gcttccgtct gggggtcaac aacatcgtcc gaaaggccgc gggtcgcgac tcgtggctct tggggctgac ttgaggccgc tcggaggtcc gcaaggtgct gtcgctggtt ccgttccgtt gcgtcgagat cgggggccga cgcggcgtct cggccggcgc agatgggcgc ccgatgtgga ccatggcgtt ccgagggcga cagccaggct ggcgcctcga ccacccaggc acatcctggc gggcgctgat cccagatgct ccttcgagaa acgccgccct acggcgtcgg tccagcacct tgctggacga aggcgaccga cccgctacgg ccgaccaggg ctgagatcaa gctcggctgt actggcgcat  c t t c g t a g c c gggcgcgtcg tcaggtcggc cgagccgccg agccggagtg gacgcgggtg cgtccagcgc ctcgaaggcg c c c c c t t g g g cgaggcggcc ggccgacgct cttccaggcc agcccatggc cacgcgcgag gggggtcgcc t t c c t t g g g g tcgggcggtc ctgcttggtg cgtcgacgac cttgaccgcc cggcgatggt ggccaccccg t c g a g a g c t t 
caggcggccg atccgtcgcc ggcgatcacc ccaggtgcgc ctcggccatg cggtcgtggt ctcgacgccg cccactgacg gaagttctcg gcagggaata gagcagcatc cccagtcagg cttagacgcg cggccatgcc catcggcatg agaggacgcg g a t c g c g t c g t t a t a t g t t t tgcagggctg ccgcgcaggg cggctaaggc g t c g g c g t t a taggtggagg tcccttgggt agtccgtcgt c g c c g t t c c g aaggtcaagc ggcccgcgag gcccatgcgg gggctacacg atcgacgacc gatcgccacc ctgaccgagg gctgcgcgac gcaggccgcc gggcgtggtc ccgaaaagct cacgccgggc gccggcttga cgggaccgac a t t c t g t t c c ctgcgaggat gtctggaccc gatcctgctg aacctgtcgg gctctgcgcc agccagtcgt gggcgagagc t c g a c c g a c c g c t g c t c g c c gagaccccgc cgtccagcgc cttcggcagg atcgccggcc tcgaccccgt c c t g g c g c t g gcccggccga gcgcgagaac tgctacgagg ggcttcgggt ctcaagaagc t c t g c t g g t g gcggccaagg ctacactctg ccgggctttg gaaggcgatg gccgtcaccg caaggacctc gaccacccgt tgtgcaggcc ggcctgcgaa g g t c g t c g g c acgggggacc cgaccacatg agccactaca gatccgcttc gtggcccatt catcctcgcg accgagatct gagcttcgtc ggcccctacg catggcgccg tccaagatcg cggctggccc gtcggcctgc gcgctggctg gagctgttcc acccaacggg ccgaaaatct gccgtcggat gcgacagccg  6001 6061 6121 6181 6241 6301 6361 6421 6481 6541 6601 6661 6721 6781 6841 6901 6961 7021 7081 7141 7201 7261 7321 7381 7441 7501 7561 7621 7681 7741 7801 7861 7921 7981  atgcctggct ctctgtttgc gtgtggggac cgcctctggc cggtcctcct gtcgtcgtga gcctgggcca gcggcggcct caccacatcg gagcagggct tatatccgcc gagaagcccg ggcaatttcg gtcgccgccg ctggatcgcg cgcacccaga gacgcgatct gacttgcacg ggggtcaacg gacagccagg tcgcgcaaga gtggagttgc gtgctggaag ttggacgact cgttgatcac ggacgggctc tcgccgtgcg gcgcttggcc acagtcgggc ctcgcgatag tagcgtcggc gatgccccgg agaactcacg tctcgtcggn  ggcggaactg ctttaagcag aagacttggc ccgcatcgcg tccaggagac ccggcgaggc caatcctggt atgtcgaggc cccagcccga atatcgtcac ctggcgcgcc accaggcgac cgttccaggc ccgccaaggc aggccttcgc aggccgctgt gggaggcctc gctcgtccaa acatcgtcgt cggtgaagac gcgcctcgcc gtcgcgtacc gcgtgatcga cggttcaggc ccggtccatc ttcgataaag cagataggcc ggtgatctcc gccgtcgtcg gtcaacgccg gcggccggca ctcgcagatc 
gatctggccg cagggtgcgc  cgcacaaatg tcgacgcaat tgcgatctat gagcgaccat tgtcctgcgg gatggtcggg cgaacccgag ccaggatccg aatcttccag gttcggggtt gcttctggat cgccgagcgc ggcgaccttg gtgcgtggcc ccaggccaag cgcccctgcg cacccgcgac tgttctggtg cgtggccgag cctggtcgat gaacgggacc ggcgggagag gatggacggc tcgggcgatc tccaggatcg gtttcgtcgg atcaggccct gccgccagca tgggtcgaga tgggtctcgg agctggcccc cgacccgcga ctgggcgcgt cagatcgggt  cgccgatttg agcacccgat ccggtaatcc cccaaacagt gtgaaggata tttgtgtccg gctcgcgaca gccggcgtcg caggccgccc cagccgacgg ggttcggtgc tatcttctgg ctgggcgagt ggcctgcagc aagatctcgc gcgttcgcct ggcgacggta cgctcgacgg cccgacgcgg ggcctgaagg gagaccctgg accttgatgc gacgtctatg ggggcggcga ccaaggcgat gctggtactt cagcggcggc cggccgcctt aatcatcgag cgtcatcgat agcgcatcag cgcctggcgc gaatgaacct cc  aggaaaactc aaggggcgaa tgtgtggcgg tccttaaact ttccgggtgt agcagaccgc gcgcgccggc tgttgatgct tcaccgccac tcccggcgac gtgaggtcgc aaggctatct tcgagacctt tggaggccgg tcgactacgc ggtcggacct acgcccagac gtccctatgt tgctggtctg ccaagggccg tctcgaccga tgccggtatc ctgcgggcgc ccttgctggt atggtagaac gtcgcgccag ggcggccatg gatccgctcg cagggcgttg catgcgcaag cagccagccc ggggttccag ggagagcgcc  ttcgttacag gactaagact ctcgggcacc cgtgagcgat ggccgaggtg cgagatcggc cgtggcggcg ggccgccgac taaggcggcc cggctttggt cgccttcgtc ctggaacagc tgaaccgtcg catcggccgc catcatggag tggggcctgg gggcgacgtc cggcgtgatc ccatcgcaag ctccatcgcc cggcttcgat gacgcttcag gatcatcgcc cacgaagccg gagctggccg aggccgggga tcccagtagc gtttggggcc atcgccacgc gccgccgccg cattcgaact tcgaggtcga agttcggcga  //  LOCUS DEFINITION ACCESSION VERSION KEYWORDS SOURCE ORGANISM  gcc433 gcc433. gcc433  9041 bp  mRNA  BCT  15-OCT-1999  Caulobacter crescentus. Caulobacter crescentus Bacteria; Proteobacteria; alpha subdivision; Caulobacter Caulobacter. 1 ( b a s e s 1 t o 9041) REFERENCE Awram,P.A. 
AUTHORS A n a l y s i s o f t h e S - l a y e r T r a n s p o r t e r M e c h a n i s m a n d Smooth TITLE Lipopolysaccharide Synthesis i n Caulobacter crescentus JOURNAL Unpublished REFERENCE •2 ( b a s e s 1 t o 9041) Awram,P.A. AUTHORS D i r e c t Submission TITLE S u b m i t t e d (15-OCT-1999) UBC JOURNAL FEATURES Location/Qualifiers source 1..9041  group;  gene CDS  gene CDS  gene CDS  gene CDS  /organism="Caulobacter crescentus" /strain="NA1000" 913..2295 /gene="orf8" 913..2295 /gene="orf8" /codon_start=l / p r o d u c t = " p u t a t i v e Glucose-6-Phosphate 1-Dehydrogenase" /translation="MLLPSLYFLELDRLLPHDLRIIGVARADHDAASYKALVREQLGK RATVEEAVWNRLAARLDYVPANITSEEDTKKLAERIGAHGTLVIFFSLSPSLYGPACQ ALQAAGLTGPNTRLILEKPLGRDLESSKATNAAVAAVVDESQVFRIDHYLGKETVQNL TALRFANVLFEPLWDRSTIDHVQITIAETEKVGDRWPYYDEYGALRDMVQNHMLQLLC .LVAMEAPSGFDPDAVRDEKVKVLRSLRPFTKETVAHDTVRGQYVAGVVEGGARAGYVE EVGKPTKTETFVAMKVAIDNWRWDGVPFFLRTGKNLPDRRTQIVVQFKPLPHNIFGPA TDGELCANRLVIDLQPDEDISLTIMNKRPGLSDEGMRLQSLPLSLSFGQTGGRRRIAY EKLFVDAFRGDRTLFVRRDEVEQAWRFIDGVSAAWEEASIEPAHYAAGTWGPQSAQGL ISPGGRAWKA" 2298..2996 /gene="orf9" 2298.-2996 /gene="orf9" /codon_start=l /product="putative 6-phosphogluconolactonase" /translation="MPFTPIKLEAFGSREDLYDAAASVLVGALTTAVARHGRVGFAAT GGTTPAPVYDRMATMTAPWDKVTVTLTDERFVPATDASSNEGLVRRHLLVGEAAKASF APLFFDGVSHDESARKAEAGVNAATPFGVVLLGVGPDGHFASLFPGNPMLDQGLDLAT DRSVLAVPPSDPAPDLPRLSLTLAALTRTDLIVLLVTGAAKKALLDGDVDPALPVAAI LKQDRAKVRILWAE" 2997..4811 /gene="orflO" 2997..4811 /gene="orf10" /codon_start=l /product="putative phosphogluconate dehydratase" /translation^IAMSLNPVIADVTARIVARSKDSRAAYLANMDRAIENQPGRAKL SCANWAHAFAASPGVDKLRALDPNAPNIGIVSAYNDMLSAHQPLEAYPALIKDAARDV GATAQFAGGVPAMCDGVTQGRPGMELSLFSRDVIAMATAVALTHDAFDSALYLGVCDK IVPGLVIGALTFSHLPALFVPAGPMTSGLPNSEKARIRALYAEGKVGREELLAAESAS YHGPGTCTFYGTANTNQMLMELMGFHLPGSAFVHPNTPLREALVKESARRVAAVTNKG NEFIPVGRMIDEKSFVNGVVGLMATGGSTNLALHIIAMAAAAGVQLTLEDLDDISKAT PLLARVYPNGSADVNHFQAAGGMAFVIRELLKAGLVHEDVQTIAGAGLSLYAKEPVLE 
                     DGMLTWRDGAHESLDPAIVRPVSDPFSKEGGLRLMAGNLGRGVMKISAVKPEHHVIEA
                     PCAVFQEQEDFIAAFKRGELDRDVVVVVRFQGPSANGMPELHNLSPSISVLLDRGHKV
                     ALVTDGRMSGASGKTPAAIHVTPEAAKGGPLAYVQDGDVIRVNAETGELKIMVDEATL
                     LARTPANVPASKPGFGRELFGWMRSGVGAADAGASVFA"
     gene            5856..6926
                     /gene="lpsl"
     CDS             5856..6926
                     /gene="lpsl"
                     /codon_start=1
                     /product="putative repressor similar to LacI"
                     /translation="MAKYSPKRANRTGEGRKLSAKVTIHDVARESGVSIKTVSRVLNR
                     EPNVKADTRDRVQAAVAALHYRPNISARSLAGAKAYLIGVFFDNPSPGYVTDVQLGAI
                     ARCRQEGFHLIVEPIDSTADVEDQVAPMLTTLRMDGVILTPPLSDHPVVLAALEREGV
                     AYVRIAPGDDFDRAPWVSMDDRLAAYEMTKHLVDLGHKDIAFIVGHPDHGASHRRHQG
                     FLDAMRDSGLRVRDDRVAQGWFSFRSGFEAAEKLLGGADRPTAIFASNDDMALGVMAV
                     ANRLRLDVPTQLSVAGFDDTPGAKITWPQLTTVRQPIHAMAGAAADMLMQGVEREEGA
                     PPPSRLLDFELVVRESTGPASH"
     gene            7224..9041
                     /gene="orf11"
     CDS             7224..9041
                     /gene="orf11"
                     /codon_start=1
                     /product="putative 1,4-B-D-glucan glucohydrolase"
                     /translation="MLPRRFAFASALALTIACGSAGVVLAQTPPNATANPAVWPMSAS
                     PAAITDAKTEAFIAQLMSRMTVEEKVAQTIQADGASITPEELKKYRLGSVLVGGNSAP
                     DGNDRASPQRWIEWIRAFRAAALDKRGDRQEIPIIFGVDAVHGHNNVVGATIFPHNVG
                     LGAAHEPDLIRRIGEVTAKEMAATGADWTFGPTVAVPRDSRWGRAYEGYGENPEIVKA
                     YSGPMTLGLQGALEAGKPLAAGRVAGSAKHFLADGGTENGRDQGDAKISEADLVRLHN
                     AGYPPAIEAGILSVMVSFSSWNGVKHTGNKSLLTDVLKERMGFEGFVVGDWNAHGQVE
                     GCSNTSCAQAYNAGMDMMMAPDSWKGLYDNTLAQVKAGQIPMARIDDAVRRILRVKVK
                     AGLFEDKRPLEGKLELLGAPEHRAVAREAVRKSLVLLKNEGVLPLKSSARVLVAGDGA
                     DDIGKASGGWTLTWQGTGNKNSDFPHGQSIYAGVAEAVKAGGGSAELSVSGDFKQKPD
                     VAIVVFGENPYAEFQGDITSIEYQAGDKRDLALLKKLKAAGIPVVSVFLSGRPLWTNP
                     ELNASDAFVAAWLPGSEGGGVADVLVGDKAGKPRHDFQGK"
BASE COUNT     1492 a   3016 c   3052 g   1480 t      1 others
ORIGIN
1 ggacgcgctg 61 c g a g c g t c g t 121 g g a t c t g g t c 181 g a a g c a a c g c 241 c a g g a c a g g c 301 c t a c a g t c t c 361 c a t c g a g a c a 421 c g c g g t g t c c 481 g a t c g g t c g c 541 g g g c g g c t a g 601 a a g g t t c g c c 661 c c c g g g a a t c 721 g t t t g g c c t a 781 g c c t t t c c c g 841 a a c a a c g a c g 901 t t g g c c c t c c 961 c a c g a t c t g c 1021 c t g g t c c g c g 1081 g c c g c g c g c c 1141
g c c g a a c g g a 1201 t a c g g c c c g g 1261 a t c c t c g a a a 1321 g c c g c t g t g g 1381 g t c c a g a a c c 1441 a g c a c g a t c g 1501 c c c t a c t a c g 1561 c t g t g t c t g g 1621 a a g g t c a a g g 1681 g t g c g t g g c c 1741 g a a g t g g g c a 1801 t g g c g t t g g g 1861 a c c c a g a t c g 1921 g g c g a g c t g t 1981 a c g a t c a t g a 2041 c t g t c g c t g t 2101 g t c g a c g c c t 2161 t g g c g c t t c a  aacaccatca ctggctaccg gaaaccgaca gcctggccgg ggggcgctca gccgatgacg gccagcaacc gcgtccgact ctcaaggaac cgctcgctgc tgaaagcgcg cctcaaacag gttagagccc cgcaggcatg tcggcgagaa ggatgctgct ggatcattgg agcaactggg tcgactacgt tcggtgcgca cttgccaggc agccgcttgg tcgacgagag tgacggccct accatgtgca acgaatacgg tcgccatgga tgctgcgctc agtacgtcgc agcccaccaa acggcgtgcc tcgtccagtt gcgccaaccg acaagcgtcc cgtttggcca tccgcggcga tcgacggcgt  acgcgacccg cgcgcgccct ccagccgtga ccgccggcgc gcgcggccga acgcggcgct ctgaaggtct ttggacgggt gtttccggac ggtgatcgca ctcaagaggc ttcctctgaa cgccgggcgc tcatgacaac cggccgcgaa gccttctctg cgtcgcccgg caagcgcgcg gcctgcgaac tggcacgctg tttgcaggcc ccgcgatctc ccaagtgttc gcgcttcgcc gatcaccatc cgcgctgcgg agcgccctca cctgcggccc cggtgtggtc gaccgagact gttcttcctg caagcctttg cctagtcatc gggtctctcg gaccggcggg ccgtacgctg ctcggcggcc  cacgacgatc gatgggtctg cgggcaggag gctctatgag ggaggcgagg cggtcgcctg tcgcgtagcg ttccgccgac aggacaaccg ggttcgcgcg tttatggaag accggattgg cgatccgagg gttgtcatag gtcttggtgc tatttcctgg gccgaccatg acagtggagg atcaccagtg gtcatcttct gccggcctga gaaagctcca cgcatcgacc aacgtgctgt gccgagaccg gacatggtgc ggcttcgatc ttcaccaagg gagggcggcg ttcgtggcca cgcaccggca ccgcacaaca gacctgcagc gacgagggca cgccgtcgca ttcgtgcgtc tgggaagagg  ctgccgcccg ggacgctatg atccgcggcg cggtccctgg ttgctgcgcg cgcgcccgct ctacagggca aacgaggcct gccgggtcac cgttagaagg gcagggcgtt tcttatggcg gcggcgaagc ggtggacgac tgctgggcgg agctcgaccg acgcggccag aggcggtttg aggaagacac tctcgctgtc cggggcccaa aggccaccaa actatctggg tcgagcccct aaaaggtcgg agaaccacat ccgatgcggt agaccgtggc cgcgcgctgg tgaaggtcgc 
agaacctgcc tcttcggtcc cggacgaaga tgcgactgca tcgcttacga gcgatgaggt ccagtatcga  ccctgaacgc acgcggcgct agatcgcttg gcgaccgctt cctcggtggc ggtcgggctt tgtcgatggg tcaatggctg ccgcccgcgc gcggcagggc tccgaatggc ggcgccgggc gcggtcgctt attggctaag agcgggcgat actgctgccg ctacaaggcg gaatcgcctc caagaagctg gcccagcctc cacgcgcttg cgccgccgtc caaggaaacc gtgggatcgc cgaccgctgg gctgcaactg gcgcgacgag ccacgacacc ctatgtcgag gatcgacaac ggaccgccgc ggcgaccgat catctcgctg gtcgctgccg aaagctgttc cgagcaggcc accggcgcac  137  2221 2281 2341 2401 2461 2521 2581 2641 2701 2761 2821 2881 2941 3001 3061 3121 3181 3241 3301 3361 3421 3481 3541 3601 3661 3721 3781 3841 3901 3961 4021 4081 4141 4201 42 61 4321 4381 4441 4501 4561 4 621 4 681 4741 4801 4861 4921 4 981 5041 5101 5161 5221 5281 5341 5401 54 61 5521 5581  tatgcggcgg gcctggaagg aggacctcta gtcacggcag gcatggcgac ttgttcccgc gcgaggcggc gcgcgcgcaa gcgtggggcc gtctggacct acctcccacg tggtcaccgg tcgccgccat ccatgagcct acagccgcgc ccaagctgtc tccgtgctct tgtcagccca tgggcgcgac gccgtcccgg tggccctgac tgccgggcct ccggcccgat ccgagggcaa cgggcacctg gcttccattt tcaaggaatc tcggccggat gcggctcgac tgaccctcga cgaacggttc gtgagctgct tgtcgctgta ctcacgagag gcggcctgcg agcccgagca tcgccgcttt ggccgtccgc tggatcgcgg agacgcccgc tccaggacgg acgaggcgac gccgggaact tctttgcttg cggcgacatc gcgcctgatc cgaggagtat cgggccgatc cggcctgcgc ggcgctggcc gtcgggggag tgtccgtcgc gccggtcgat tcgggtgtcg ggcggccgct cgtagagggc aacggcgggc  gcacctgggg cctgagcatg tgacgcggcc ggtcggcttc catgaccgcc caccgacgcc caaggcctcg ggccgaggcg ggatgggcat cgccaccgac cctgagcctg cgcggccaag tctgaaacag gaatcccgtc ggcctatctc ctgcgccaac ggatccgaac ccagccgctg cgcccagttc catggagctg ccatgacgcc ggtgatcggc gacctcgggc ggtcggtcgt caccttctat gcctggctcg cgcccgccgc gatcgacgag caacctggcg agacctggac ggccgacgtg gaaggcgggt cgcgaaggaa tctggatccc cctgatggcg ccacgtgatc caagcgcggc caacggcatg tcacaaggtg cgccatccac cgatgtgatc cctgctcgcc gtttggatgg aggaagcgct ggcggtacga gagccgacgg ctccgcaagg gaccacggtc cgcgcaggcg gcgccgcgcg ggcgatctgg catggccagg gacgtcgaga 
gtcgagcgga gaagggcgcg tgcgccgaca gacatcgctc  accgcagtcc gcccagggcc cccttcacgc ccatcaagct gcctcggttc tggtcggcgc gccgccaccg gcggcacgac ccctgggaca aggtcacggt agcagcaatg agggtctggt ttcgcgccgc tgttcttcga ggcgtcaatg ccgccacccc ttcgcttcgc tgtttccggg cgttcggtgc tggccgtgcc accctggccg ccctgacccg a a a g c t t t g t tggacggcga gaccgcgcca aggtccgcat atcgccgacg tcaccgcccg gccaacatgg atcgggcgat tgggcccacg ccttcgccgc gcgccgaaca t c g g c a t c g t gaagcctatc ccgcgctgat gccggcgggg t g c c g g c c a t t c g c t g t t c t cgcgcgacgt ttcgactcgg cgctgtatct gcactgacct tcagccatct ctgcccaaca gcgagaaggc gaggaactgc tggcggccga ggcacggcca acaccaacca gccttcgtcc atcccaacac gtggctgcgg tgaccaacaa aagtcgttcg tcaacggcgt ctgcacatca tcgccatggc gatatctcca,aggccacgcc a a c c a c t t c c aggccgccgg ctagtgcacg aagacgtcca c c g g t g c t c g aggacggcat gccatcgtgc ggccggtctc ggcaatctgg gccgcggcgt gaggcgccgt g c g c c g t g t t gagctggatc gcgacgtggt cctgaactgc ataacctgtc g c c c t g g t c a ccgacggccg gtgacgccgg aagcggccaa cgcgtcaatg ccgagaccgg cggacccccg cgaacgtccc atgcggtcgg gggtcggcgc aggtcatgga cggcaatcac acgcccgctt cgccctggtc cctatagggg cgaggactac tcggtgtcaa gcatcctgac aggtccacat gaccaatctg g t t t t c g g a a cgccaagctg t t g g c c c t a a ggacctgcgc cgatcctggg tccaggcacc agatcccgct ggccaccgag tcgaggtgct ccgcgccctg tcctgtcggg tcccggcatg gtgtcgaggc gctgaccgcc gcctggcgac ggtgaaccgt t g a c c t t g g g cgcacgcggc  tgatctcgcc cgaagcattt tttgacgacg gccggcgccg cacgctcacc gcgtcgccac cggcgtgagc gttcggcgtc caatccgatg gcccagcgat caccgacctg cgttgatccg cctctgggcg gatcgtggcg cgagaaccag ctcgccgggc ctcggcctat caaggacgcc gtgcgacggt gatcgccatg gggcgtctgc gcccgccctg ccgcatccgc gagcgccagc gatgctgatg gccgctgcgt gggcaatgaa ggtcgggttg cgccgctgcg gctgctggcg cggcatggct gacgatcgcg gctgacctgg cgacccgttc gatgaagatc ccaggaacag cgtggtggtc gccgtcgatc catgtccggc gggcgggccg ggaactgaag ggcgtccaag ggccgacgcc agcggcgggc gagttcgacg ggcacggccg caggcggtgg gactggcgga atcaacgact cagatcggcg ggcttcggcg ggtggtcacg acccggcgcc gaggacctcc 
aagcagatca ttctgcgcca ggtgttttca  cggcggccga gggtcccgcg gcggtcgctc gtctatgacc gacgagcgct ctgctcgtgg cacgacgaga gttctcctgg ctggatcagg cccgcgccgg atcgtgctgc gccctgccgg gagtagatcg cgcagcaagg ccggggcgcg gtcgacaagc aatgacatgc gcccgggacg gtcacccagg gcgaccgccg gacaagatcg ttcgtgcccg gcgctctacg tatcatggcc gagctgatgg gaggccctgg ttcatcccgg atggcgaccg ggcgtgcaac cgcgtctatc ttcgtgatcc ggcgccggcc cgtgacggcg agcaaggaag tcggccgtga gaagacttca cgcttccagg tcggtgttgc gcctctggca ctggcctatg atcatggtgg ccgggctttg ggcgcctccg tcggcctcgt gtcaggaccc aggacgccat tcgctgtggc tctccgagga tcaccgccca aattgccgac tcgcaggcct tcgccttcgc tggacggcgg atgtggatct ccgagcgggc tcctgggctc tcgccggcgg  5641 5701 57 61 5821 5881 5941 6001 6061 6121 6181 6241 6301 6361 6421 6481 6541 6601 6661 6721 6781 6841 6901 6961 7021 7081 7141 7201 7261 7321 7381 7441 7501 7561 7621 7681 7741 7801 7861 7921 7981 8041 8101 8161 8221 8281 8341 8401 8461 8521 8581 8641 8701 8761 8821 8881 8941 9001  catcgcacca cgcatcatcg acattctgga caaggggcgt c t g t c c g g c t t c a c c c g t t c caccgccctg atcggtgcgg cggtggcgct g t c g t t t t c t tggggctcgc catgacagcc cgaatcggac cggcgaggga c g g a a a t t g a gcgagagcgg a g t c t c g a t c a a g a c c g t c t aggctgacac ccgtgatcgc gtgcaggccg t c t c g g c c c g t a g c c t g g c g ggggcgaagg ccagccccgg ctacgtcacc gatgtgcagc ggttccatct gatcgtcgag ccgatcgact cgatgctgac gacgttgcgc atggacggcg c g g t c g t t c t g g c g g c g c t t gagcgggaag acgatttcga ccgtgcgccg tgggtcagca c c a a g c a t c t g g t c g a t c t g ggccacaagg acggcgcttc gcaccggcgt catcaggggt g t g t t c g t g a t g a t c g t g t g gcgcagggct ccgagaagct gctgggcggc gcggatcgac tggcgctggg cgtcatggcg gtcgccaatc c a g t c g c c g g c t t c g a c g a c acgccgggag t t c g c c a a c c gatccacgcc atggccggag agcgggaaga gggcgcgccc c c g c c a t c g c agtccaccgg cccagcgtcg cactgacgag t a t t c c t c t t gacagcgctg tccgatcgac cgaaaaggcg gcgtcaggag gaaacgccat cgccctgtgc gcacgatcac tcccctccaa cgcggacacc ccgcaagcgc cgcgccgagc ggacgcccct 
tcaaggatcg cccatgctgc ccctgacgat cgcctgtgga tcggccggcg ctgcaaaccc cgccgtttgg ccgatgtcgg ccgaggcctt catcgcccag ctgatgagcc ccatccaggc cgatggcgcc tcgatcacgc c g g t g c t g g t cggcggcaac tcagcgccgg ggatcgaatg gatccgcgcc ttccgcgcgg aaatcccgat catcttcggc gtcgacgccg c g a t c t t c c c gcacaatgtc ggcctgggcg tcggcgaggt gaccgctaag gaaatggccg cggtcgccgt gcctcgcgat tcacgctggg cggagatcgt gaaggcctat tcgggcccga ccggcaagcc gctggcggcc ggccgcgtgg gtggcaccga gaatggccgc gaccagggcg g t c t g c a c a a cgccggctac ccgccggcga c g t t c t c c a g ctggaacggg gtcaagcaca tgaaggagcg c a t g g g c t t t gagggcttcg tcgagggctg cagcaacacc agctgcgccc t g g c t c c c g a cagctggaag ggcctytacg a g a t c c c c a t ggcgcggatc gacgatgccg c c g g c c t g t t cgaggacaag c g g c c t t t g g agcaccgggc cgtggcgcgc gaggcggtgc g c g t g c t g c c gctgaagagc t c g g c t c g t g ttggcaaggc ctcgggcggt tggaccctga acttcccgca cggccagtcg atctatgcag gcagcgcgga a c t g t c g g t t t c g g g c g a t t tgttcggcga gaacccctac gccgagttcc ctggcgacaa gcgtgacctg gcgctgctga t g t c g g t g t t cctgagcggc cggcccctgt c c t t t g t c g c ggcgtggctg cccggctcgg gcgacaaggc gggtaagccg cgccacgact  gaagagcccg gatcccgacg cacgccggag ttgtcatggc gcgccaaagt cgcgtgtcct cggtagctgc cctatctgat tcggcgccat cgaccgccga tgatcctgac gggtggccta tggatgatcg acatcgcctt tcctcgatgc ggttttcgtt cgacggcgat gcttgcggct cgaagataac cggccgccga ggctcctgga tgcgcgattg gtgagatcgc gacctcgcct gcggacctga gcctttcctt cgcgccgttt tggtcctggc ctagtccagc ggatgaccgt ccgaggaact acggcaatga ccgcgctgga tgcatggtca cagcgcacga ccaccggggc gccgcgccta tgaccctggg cgggctcggc acgcgaagat ttgaagccgg ccggcaacaa tcgtcggcga aggcttataa acaacacctt ttcgccgcat agggcaagct gcaaatcgct tgctggtcgc cctggcaggg gcgtcgcgga tcaagcagaa agggcgacat agaagctcaa ggaccaaccc agggcggcgg tccagggcaa  ttccgcgagc cacgtgatcc ggccgtgcgg taagtactcg cacgatccac gaatcgcgag gctgcactat cggcgttttc cgcccgttgc tgtcgaggat cccgccgctc tgtgcgcatc gctggccgcc tattgtaggg aatgcgcgac tcgctcgggc cttcgcctcg tgacgttcct ctggcctcag catgctgatg 
cttcgaactc gcaaggtggt atagatcaag ggccagtcca agccgcttat taaacaccgc cgccttcgct ccagacgccg cgccatcacc cgaggagaag gaagaagtac ccgcgccagc caagcgcggc caacaacgtc gcccgacctg ggactggacc tgagggctat gctgcagggg caagcacttc ctccgaggcc catcctgtcg aagcctgctg ctggaacgcc cgccggcatg ggcgcaggtg cctgcgagtc ggagctcctc ggtgctgctg cggagacggc caccggcaac ggccgtgaaa gcccgacgtg caccagcatc ggctgcgggc cgaactcaac cgtggccgac g  gcttcgacag tgcatccgca cggtgtcgta ccgaagcgag gacgtggccc cccaacgtca cgccccaata ttcgacaacc cggcaggaag caggtcgcgc agcgatcatc gccccaggcg tacgagatga caccccgacc agcggcctgc ttcgaggcgg aacgatgaca actcaactgt ctcaccacgg caaggcgtcg gtcgtgcggg accgcaatca acagtcgcca gcggctgtcg cttcgctcgg tccgcccgac tccgccctgg ccgaacgcca gacgccaaga gtcgcccaga cggttgggat ccgcagcgct gaccggcagg gtgggcgcca atccgtcgta tttggtccga ggcgagaatc gcgctggaag ctcgccgatg gatctggtgc gtgatggtct accgacgtgc cacggccagg gacatgatga aaggccgggc aaggtcaagg ggcgcgcctg aagaacgaag gccgacgaca aagaacagcg gccggcggcg gcgatcgttg gagtatcagg attccggtgg gcgtccgacg gttctggtcg  //  LOCUS DEFINITION ACCESSION VERSION KEYWORDS SOURCE ORGANISM  gcc2537 gcc2537. gcc2537  1177 bp  mRNA  BCT  15-OCT-1999  Caulobacter crescentus. Caulobacter crescentus B a c t e r i a ; P r o t e o b a c t e r i a ; alpha s u b d i v i s i o n ; C a u l o b a c t e r group; Caulobacter. REFERENCE 1 ( b a s e s 1 t o 1177) AUTHORS Awram,P.A. TITLE A n a l y s i s o f t h e S - l a y e r T r a n s p o r t e r M e c h a n i s m a n d Smooth Lipopolysaccharide Synthesis i n Caulobacter crescentus JOURNAL Unpublished REFERENCE 2 ( b a s e s 1 t o 1177) AUTHORS Awram,P.A. 
  TITLE     Direct Submission
  JOURNAL   Submitted (15-OCT-1999) UBC
FEATURES             Location/Qualifiers
     source          1..1177
                     /organism="Caulobacter crescentus"
                     /strain="NA1000"
     gene            complement(1..1177)
                     /gene="lpsj"
     CDS             complement(441..998)
                     /gene="lpsj"
                     /codon_start=1
                     /product="putative galactosyl-1-phosphate transferase"
                     /translation="TGDAPCGQHRDDGADQDPQIEQKRAPAQIFGVKGDLVGDRQLVS
                     PIDLRPPRHAGTQGVNAGCTARGDQVILIEQSRPGSDQAHVTDEHAPELGQLIETELA
                     HQAADRRQPLIRIVEKVGGHLGRIDAHGAKLRHRKQRRGAPHALRPVETRPRRSQPHK
                     RPQHGQRQDQDRQPDRCSQHIKHTLH"
BASE COUNT      169 a    387 c    416 g    205 t
ORIGIN
        1 gccgaagccg acggcaattg gagttggcgt ggtctcctga cgggcatcga gagcagctcg
       61 ccgggcgatg cggggctggg tgagaagctc gccgcagaga tcaccgccat gggcatcgac
      121 ccccacgctc tgttgccgcg cggacggatc gacgagatcg ccgccgcgct ccagacgcgt
      181 gacggagagg gcgcgcgaga ggtggtcaag cgcctggcgc cggccgcgat ccgtcgcttg
      241 gtccgtcgcc tgttctccga cgcgacgctg aagggcaaca ccgagcgctt cctcaagcgc
      301 tatgcgggca tgatcgacga ggcggccggc caggatcgcg aaggcttcct gctcgccgcc
      361 ctgctgtcct ccgacgctgg gcgggcctat ctgctgctcg acgcggcgag cggcgatctg
      421 gcctaggccg agccggctga atgaagcgta tgtttgatgt gctggctgca gcgatcgggc
      481 tggcggtcct ggtcctgccg ctggccgtgc tgtgggcgct tgtgcggctg acttcgccgg
      541 ggcctggtct ctactggtcg cagcgcgtgg ggcgcgcctc ggcgctgttt ccgatgccga
      601 agtttcgcac catgcgcatc gatacgcccg aggtggccac ccaccttctc gacaatcctg
      661 atcaatggct gacgccgatc ggcggcctga tgcgcaagct cagtctcgat gagttgcccc
      721 agctctggag cgtgctcgtc ggtcacatga gcctggtcgg accccggccg gctctgttca
      781 atcaggatga cttgatcgcc gcgcgccgtg cagccggcgt tgacgccttg cgtcccggcg
      841 tgacggggtg ggcgcagatc aatgggcgag acgagttgtc gatcgccgac aaggtcgccc
      901 ttgacgccga atatctgcgc cggcgctcgc ttctgttcga tctgcgggtc ctggtcagca
      961 ccgtcatccc ggtgctgacc gcacggggcg tcacccgtta gccgagatcg gcgtaggcct
     1021 tttccagcgc cgaaacgaag cggggatcgc cggccagcga cgccgggaac accgagtcca
     1081 gggccaggaa ggcggcgacg tctgcgctgg cctctcccgt ggccgccagg cccagagcca
     1141 ggaggcgatc ttgcagcggg tcgacgatgg cttcacc
//
LOCUS       gcc1444      2031 bp    mRNA            BCT       15-OCT-1999
DEFINITION  gcc1444.
ACCESSION   gcc1444
VERSION
KEYWORDS
SOURCE      Caulobacter crescentus.
  ORGANISM  Caulobacter crescentus
            Bacteria; Proteobacteria; alpha subdivision; Caulobacter group;
            Caulobacter.
REFERENCE   1  (bases 1 to 2031)
  AUTHORS   Awram,P.A.
  TITLE     Analysis of the S-layer Transporter Mechanism and Smooth
            Lipopolysaccharide Synthesis in Caulobacter crescentus
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2031)
  AUTHORS   Awram,P.A.
TITLE D i r e c t Submission JOURNAL S u b m i t t e d (15-OCT-1999) UBC FEATURES Location/Qualifiers source 1..2031 /organism="Caulobacter crescentus" /strain="NA1000" gene 3..569 /gene="orf15" CDS 3..569 /gene="orf15" /codon_start=l / p r o d u c t = " p u t a t i v e molybdenum c o f a c t o r b i o s y n t h e s i s protein" /translation="GEAIRLSPQGDDAQAIASAVSPAPVDVIVTIGGASVGDHDLVKP ALRTLGLALSVETVAVRPGKPTWSGRLPDGRRWGLPGNPASALVCAELFLRPLLAAL TGAAPDIRLIPAGLAAPLPAGGPREHWMRAALSTDPDGRVVATPFPDQDSSLVSVFAR ADALLRRRPGAPPAATGEWDVLPLRRG" gene 658..2031 /gene="IpsK" CDS 658..2031 /gene="lpsK" /codon_start=l /product="putative n u c l e o t i d e sugar epimerase/dehydratase protein" /translation="MGHAGKIATHVLLAFVALLAGRYLVIDMPFTRDTLLQATLYGLA AFIVELAFRVERAPWRFVSATDHLRLLRSAVLTAAAFLVITRLTHPGIDGGLRTVAGA ALIQAALLSALRVIRRSLHERMLLDSVLRLGPASMHPALPRLLIIGSASEAEAFLRAP AGLGERYAPIGVVSPLDRETGDELRGVCVLGSIADFDSVLARLRDSGLSPAAILFLTD SAMSTFGAERLGRLKTEGVRLLRRHGVVEMGAAANTPQLREISIEELLSRPPVRLDPE PVRALVSGRRVLVTGAGGSIGSELCRQIAASGCAHLTMVDASEYNLFHIEREIAERHP LLSRREALCDVRDAARVQRVFTEMKPDIIFHAAALKHVTLVENHPCEGVRTNVLGTRN VAVAAKACGAAHLALISTDKAVAPTSVMGAAKRVAEAVARQYGGGGDMRVSIVRFGNV LGSAGSVV" BASE COUNT 255 a 722 c 703 g 350 t 1 others ORIGIN 1 gcggtgaggc g a t c c g a c t t tccccgcagg gcgacgacgc ccaggcgatc gccagcgccg 61 t t t c g c c c g c g c c c g t c g a c g t c a t c g t c a c g a t c g g c g g c g c c t c g g t c g g c g a c c a t g 121 a c c t g g t c a a a c c c g c a c t c c g a a c g c t g g g c c t t g c g c t t t c g g t c g a g a c g g t c g c c g  141  181 241 301 361 421 481 541 601 661 721 781 841 901 961 1021 1081 1141 1201 1261 1321 1381 1441 1501 1561 1621 1681 1741 1801 1861 1921 1981  tgcgccccgg tgccaggaaa cggctctcac ttccggcggg ggcgagtcgt gcgccgatgc tcgatgttct cccggatttg gggcatgcag cgctatctcg ggcctcgcag gtctcggcca ctggtcatta gcggccctga gagcgaatgc ccgcgcctgc gggcttggcg gatgaactgc cgtctgcgcg agcaccttcg cgccacggcg atcgaggaac gtgtccggtc cgtcagatcg ctgttccaca ctctgcgacg atcatcttcc ggcgtccgca 
gcggcgcatc gcggccaagc gtcagcatcg  caagccgacc cccggcctcg gggcgcggcg cggaccgcgg cgcgacaccc tctgctacgg gccgctccgg agttcgccgg gaaagatcgc tcatcgacat cattcatcgt ccgaccacct cccgcctgac tccaggcggc tgctcgattc tgatcatcgg aacgttacgc gcggcgtctg acagcggcct gcgccgagcg tggtcgagat tcttgagccg gacgggtgct ccgccagcgg tcgagcgcga tccgcgacgc acgctgcggc ccaatgtgct tcgccttgat gtgtcgccga tgcgctttgg  tggagcgggc gcgctggtgt ccggatatcc gagcattgga ttccccgatc cgacggcctg cgcggctgaa gcgtgacccg gacccacgtt gccgttcacg ggagttggct gcgacttctc ccatccaggc gctgctgtcg ggtgctgcgc ctcggcctcc cccgatcggc cgttctgggc gtcgccggcc tctgggccgc gggcgcggcg gccgcctgtc ggtgacaggc ctgcgcccat gatcgccgag cgcccgcgtc gctgaagcat gggcacccgc ctcgacggac ggccgtggcg caatgtgctg  ggttgccgga gcgcggaact gcctcattcc tgcgcgccgc aggattcctc gcgcgccccc accgcgacgg accttcaccg ctgctggcct cgggacacgc ttccgggtgg cgctcggccg atcgacggtg gcgctgcggg cttggscccg gaggccgaag gtggtctcgc tcgatcgccg gcgatcctgt ttgaagacgg gccaacaccc cgactggatc gcggggggca ctgaccatgg cggcacccgc cagcgtgtct gtcacgctgg aacgtggccg aaggccgtcg cgtcagtacg ggctcggccg  cggtcgccgc cttcctgcgg cgcgggcttg gctgtcgacg tctggtcagc tgcggcgacg catagaattg cttcagaggt tcgtggccct tgcttcaggc agcgggcccc tcctgacggc gcctgcgcac tgatccggcg cctcgatgca ccttcctgcg cgctcgaccg atttcgacag tcctcaccga aaggcgtgcg cccagctgcg cagagccggt gcatcggttc tcgacgcctc tcctctcgcg tcacggagat tggagaacca tcgccgccaa cgccgaccag gcggcggcgg gatcggtcgt  gtggtgggtc cctctgctgg gccgccccgc gatccggacg gtgttcgcgc ggcgaggttg acgtgctaag tcgtttcatg gctggccggt gaccctgtac gtggcgcttc ggcggcgttc cgtggccggc gagcctgcat tccggcgctg cgcgccggcc cgagaccggc cgtgctggcc cagcgcgatg cctgctgcgc cgagatcagc tcgcgcgctg cgagctctgc cgaatacaac tcgtgaggcg gaagccggac cccctgcgag ggcctgcggc cgtgatgggc cgacatgcgg a  //  LOCUS DEFINITION ACCESSION VERSION KEYWORDS SOURCE ORGANISM  gcc2218 gcc2218. gcc2218  2142  bp  mRNA  BCT  15-OCT-1999  Caulobacter crescentus. 
Caulobacter crescentus B a c t e r i a ; P r o t e o b a c t e r i a ; alpha s u b d i v i s i o n ; C a u l o b a c t e r group; Caulobacter. REFERENCE 1 ( b a s e s 1 t o 2142) AUTHORS Awram,P.A. TITLE A n a l y s i s o f t h e S - l a y e r T r a n s p o r t e r M e c h a n i s m a n d Smooth Lipopolysaccharide Synthesis i n Caulobacter crescentus Unpublished JOURNAL REFERENCE 2 ( b a s e s 1 t o 2142) Awram,P.A. AUTHORS D i r e c t Submission TITLE S u b m i t t e d (15-OCT-1999) UBC JOURNAL FEATURES Location/Qualifiers 1. .2142 source /organism="Caulobacter crescentus" /strain="NA1000" complement(3..719) gene  142  CDS  gene CDS  /gene="orf14" complement(3..719) /gene="orf14" /codon_start=l /product="unknown e x s G - l i k e " /trans1ation="SCGQAHAFGERRAQREDQARGREVEHRLAAEVPRQALMHHLGAE PVARGRPRQGGPALFAPDQGQKPRSGRLVDVPFDRDPALGGREGAMARGVGDQLVDGH VHRHRRLGAEGDGRALDLEPSRNVVGEGRQGALQNLLQLRTSPGAAGQQLVRLRERQD PALEDVCKGLGRRGRAQGLAGDRLHDRQGVLHAVIQLAQXEVTVLERGGEVVIETPAL QGRGRGARDHLQLAQHLGRRI" 1134..2138 /gene="IpsL" 1134..2138 /gene="IpsL" /codon_start=l /product="putative glycosyltransferase" /translation="EPEFRRIDGEGGVVPARHGLARLGPVHGDRQLLAELHGVDGERL VGLGFDRIEPGLAVGGQDVADVVAAEVADPRLHLHQPVQRHHLAALRREQGVVQARAE LAGHQTRPREETRLLVPAQGLGAHQGEAGLLGQALDLGLVFELAEGLGRAHAAPADEQ HVGVTALGEAQGLALKVPAQEVVEHIVAVQHVRQQGRADLDGFVARVVGDDLLVGQHH PHDARHIAGGGPGQQQRRGDPRRVQGADHEQLRVHLGQPVTPDEVEVGDHEGPVEPVR HRRVEIPGLARNVVLVPVLDVIGIGALGVDELVVQLQVRAGAFGHHAFGHQQVDIGRV E" 324 a 723 c 757 g 337 t 1 others  BASE COUNT ORIGIN 1 atgatccgcc 61 c c c t g g a g a g 121 t g c t g c g c g a 181 g c c a g a c c c t 241 t c c t g g c g c t 301 t g g a g g a g a t 361 a g g t c c a g g g 421 t c c a c g a g c t 481 t c g c g a t c g a 541 g c g a a c a g g g 601 t g g t g c a t c a 661 c g g g c c t g g t 721 g a c c g c c c g g 781 g c t c g g c g a c 841 c t g g c t g c g c 901 g a t g g t g t t t 961 c g g c t a t g g c 1021 c a a c g c c g g c 1081 c t g g g a t g g g 1141 a g t t c c g g c g 1201 t g g g 
c c c g g t 1261 g g c t g g t t g g 1321 t a g c c g a c g t 1381 a g c g a c a c c a 1441 t c g c c g g c c a 1501 t c g g g g c g c a 1561 t c g a g c t c g c 1621 g c g t c a c c g c 1681 t g g a a c a c a t 1741 t c g t c g c g c g 1801 g c c a c a t a g c 1861 a g g g c g c g g a  gcccaaggtg cgggcgtctc gctggatcac gcgcacgacc ctcgcaggcg tctgcaacgc cccggccgtc gatcgccaac atggaacgtc cgggcccgcc gggcctggcg cttcacgctg gtcatgatcg atggggtgtg gatcatccct ccggtcgccg gacctgccgc gcgctgcgcc atcatccaaa tatcgacggg ccacggcgat cctggggttc cgtggcggcg cctggcagcc ccagacgcgc ccagggtgag ggaaggactg ccttggtgaa agttgcggtt ggtcgtaggt cgggggcggt tcacgaacag  ctgcgcaagc aatgactacc cgcgtgaaga ccgtcgcccg cacgagttgc accctggcga gccttcagcg gccgcgcgcc gacaaggcga ctgtcggggc cgggacctcg cgcgcgccgc tcgaggatga aggtggccgg cgcccgacgg aacgcctgcg gcgcgggctt tggccgtcga gcgttcaaaa gaagggggcg cgccagcttc gatcgaatag gaagttgccg ctgcggcgag ccgcgagaag gccggcctcc ggccgtgcgc gctcaaggcc cagcatgtgc gacgatctcc ccggggcagc cttagggtac  tgcaggtggt tcgccgcctc acaccctggc aggcctttgc tgacccgccg ccttcgccga ccgagacggc atggcgccct ccgctccggg cgcccacgcg gcggccaggc tctccgaacg ggccctggtg ctcgttcggc cgcggtgctg cgagcagggc cgagtcggtg gcgcttccgg gggcgctgga tcgtcccagc tggcggaact agccgggcct atccgcgact agcagggcgt agacgcgcct tgggccaggc acgccgcccc tcgcgctcaa ggcagcaggg tggtcgggca agcagcggcg atctcggtca  cgcgcgcgcc gctcgagcac gatcgtgcag agacgtcttc cgcctgggga cgacgttgcg ggtgtcggtg ctcgaccccc gcttctgacc caacgggttc ggtgctcgac catgagcctg gccatgatgg gccgtcgacg gacgtcaata gtgccgttcg caggtgctgg atcggctaga tgtctgatct ccgccatggc ccacggggtc ggccgtcggc gcacctccac tgtgcaggcc gctggtccca cctcgatctc ggctgacgaa ggttccagcg ccgcgccgat gcatcatccg cggtgatccc gccagtgacc  acggcctcgg ggtcacttcm tcgatcaccc gagagccgga cgtccgcaac ggacggttcg cacatggcca caaggccggg ctgatctggc ggctccaaga ttcgcgccca gccgcatgaa tcgaggacat ccgccctggc tcggcggcga tcttcgccac ccaagccgat accccttggc ttagagcctg cttgcgcgcc gatggtgaac gggcaggatg cagccagttc cgagccgaac gcccagggtc ggcctcgttt cagcatgtcg 
caggaagtcg ctcgacggct cacgatgcgc cgccgcgttc ccagacgaag  143  1921 1981 2041 2101  LOCUS DEFINITION ACCESSION VERSION KEYWORDS SOURCE ORGANISM  ttgaagtggg caggcctggc tcggcgtaga cgtttggcca  gcc648 g e c 648. gcc648  tgatcacgaa gcgcaatgtc cgagctggtc ccagcaggtc  2699 bp  ggcccggtcg gtgctggtac gtccagcttc gacatcgggc  mRNA  agccggtgcg tcaccggcgc cagtgctgga cgtaataggg aggtgcgcgc c g g a g c c t t c gcgtggagat eg  BCT  gtcgaaatcc ataggcgccc ggccaccacg  15-OCT-1999  Caulobacter crescentus. Caulobacter crescentus B a c t e r i a ; P r o t e o b a c t e r i a ; alpha s u b d i v i s i o n ; C a u l o b a c t e r group; Caulobacter. REFERENCE 1 ( b a s e s 1 t o 2699) AUTHORS Awram,P.A. TITLE A n a l y s i s o f t h e S - l a y e r T r a n s p o r t e r M e c h a n i s m a n d Smooth Lipopolysaccharide Synthesis i n Caulobacter crescentus JOURNAL Unpublished REFERENCE 2 ( b a s e s 1 t o 2699) AUTHORS Awram,P.A. TITLE D i r e c t Submission JOURNAL S u b m i t t e d (15-OCT-1999) UBC FEATURES Location/Qualifiers source 1. 
.2699 /organism="Caulobacter crescentus" /strain="NA1000" 1..1056 gene /gene="orf1" 1..1056 CDS /gene="orf1" /codon_start=l /product="putative chemotaxis r e c e p t o r p r o t e i n " /translation="RPVIAPGRTDDQDQVITVLSEQFKALAAGDLTARVDVVFSERYG HVRDEFNAAMTKLGQVMDEISMAAGGLGESSDEVARVSQHLSRGAGRQALDLHGARAA LQKVGAAAGRGVDGLRRVTEAAAGLRIDAASARRSVREAVGSIAEVEQSALRISQAAA LFDEVAQQANVLSLIADVEGARGGEGXGPFQAVAADKMRVLAERASGAAREIKGVTAA NSAQVSRCARLMDAASASFGGMASRITQIDGLVSGLAKSAQEQAHGLRAVDEAVDRAD DIAQTHADQVDEAAAVTGRLIEEAESLIQAASPFRAHVVSRPASRPEPARAGHHAPAG NAVARAHARIAAYARPR" 1060..2577 gene /gene="orf1" 1060..2577 CDS /gene="orf1" /codon_start=l /product="putative hippurate hydrolase p r o t e i n " /translation="MLCHPGKRVALVRDPGAASAALPQSLGPGSTPGFRRGSAGMTQD ISVRGGGGGEHVRRSCDSRNPRPSMKSLFAASALALLIATAAQAGPLNVPATQKVISA QLDRDYPALEALYKDIHAHPELGFQEVETAKKLAAQMRALGFTVTEGVGKTGVVAVLK NGEGPKVLIRTELDGLPMQEKSGLAWASQATATWNGEKVFVAHACGHDIHMAAWVGAA RQLVAMKAKWKGTLVFVAQPSEETVRGARAMLDDGLWDKIGGKPDYGFALHVGSGPXG EVYYKAGVLTSTSDGLDITFNGRGGHGSMPSATIDPVLMAARFTVDVQSVISREKDPS AFGVVTVGSIQAGSAGNIIPDKARVRGTIRTQDNAVREKILDGVRRTVKAVTDMAGAP PADLKLTPGGKMVVNDAALTDRTAVVFKAAFGARAVAQDKPGSASEDYSEFVLAGVPS VYFAIGGSDPAELAKAKAEGREPPVNHSPYFAPVAEPTIRTGVEAMTLAVLNVLK"  144  BASE COUNT 396 a 935 c ORIGIN 1 cgccccgtga tcgcgccggg 61 g a g c a g t t c a a g g c c c t g g c 121 g a g c g c t a t g g c c a c g t c c g •181 a t g g a c g a g a t c t c c a t g g c 241 g t c t c g c a g c a t c t g t c g c g 301 g c g g c g c t g c a g a a g g t g g g 361 a c c g a a g c c g c c g c c g g c c t 421 g c g g t g g g g t c g a t c g c g g a 481 c t g t t c g a c g a g g t g g c t c a 541 g c g c g g g g c g g r g a g g g c g m 601 c t g g c c g a g c g g g c c t c g g g 661 g c g c a g g t c t c g c g g t g c g c 721 g c g t c c a g g a t c a c c c a g a t 781 c a g g c c c a t g g c c t g c g c g c 841 a c c c a t g c c g a c c a g g t c g a 901 g a g a g c c t g a t c c a g g c c g c 961 c g g c c c g a a c c a g c c c g c g c 1021 c a c g c c c 
g c a t c g c c g c c t a 1081 c g t g t a g c g c t t g t c c g g g a 1141 c c c g g c t c t a c c c c c g g c t t 1201 a g g g g t g g c g g a g g t g g c g a 1261 t c c a t g a a g t c c c t g t t c g c 1321 g c c g g g c c g t t g a a c g t g c c 1381 t a t c c g g c g c t g g a g g c g c t 1441 g a g g t c g a g a c c g c c a a g a a 1501 g a g g g c g t c g g c a a g a c c g g 1561 c t g a t c c g c a c c g a g c t g g a 1621 a g t c a g g c g a c c g c c a c c t g 1681 g a c a t c c a c a t g g c c g c c t g 1741 t g g a a g g g c a c g c t g g t t t t 1801 g c c a t g c t g g a c g a c g g t c t 18 61 c t g c a c g t c g g t t c g g g t c c 1921 a c c t c g g a t g g c c t g g a c a t 1981 g c c a c c a t c g a c c c g g t g c t 2041 a g c c g c g a g a a g g a c c c g t c 2101 a g c g c c g g t a a c a t c a t c c c 2161 a a c g c c g t g c g c g a g a a g a t 2221 a t g g c c g g c g c c c c g c c c g c 2281 g a t g c g g c c c t g a c c g a t c g 2341 g t g g c g c a g g a c a a g c c g g g 2401 g t g c c g t c g g t c t a c t t c g c 2461 g c c g a a g g c c g t g a g c c g c c 2521 a c g a t c c g c a c g g g g g t g g a 2581 t t c t c c c c t t g c g g g a g a a g 2641 c c g c g c g a c c c c t c a a c c g a  984 g . 
ccgcaccgac ggccggcgac cgacgagttc ggctggcggg cggcgcgggg cgcggccgcc gcgcatcgac ggtcgagcag gcaggccaat ggggcccttc cgcggcgcga gcggctgatg cgacggtctg cgtcgacgag cgaggccgcg cagtcctttc cggccatcac tgcgcgaccc cccaggggcg tcgccggggt gcatgtgcgt cgcctcggct cgccacgcag gtacaaggac gctggccgcg cgtggtggcg cggcctgccg gaacggcgag ggtgggtgcg cgtggcccag gtgggacaag gkccggcgag caccttcaac gatggccgcc ggccttcggc cgacaaggcc cctcgacggc cgacctgaaa cacggcggtg ctcggcgtcc catcggtggc ggtcaaccac ggcgatgacc gtgtcgccgg cccgctacgc  381 t  3 others  gatcaggatc ctgaccgccc aacgcggcga ctgggcgagt cgtcaggcct gggcggggcg gccgccagcg agcgccttgc gtcctgtcct caggccgtcg gagatcaagg gacgccgcct gtgtcgggcc gcggtggacc gcggtcaccg cgcgcccatg gcgcccgccg cgctagggga gcaagcgcgg tcggccggga cgatcctgcg ctcgccctgc aaggtgatca atccacgccc cagatgcggg gtgctgaaga atgcaggaaa aaggtcttcg gcccgccagc ccctcggagg atcggcggca gtctattaca ggccggggcg cgcttcaccg gtggtgacgg cgggtgcgcg gtgcgccgca ctgaccccgg gtgttcaagg gaggactatt tcggaccccg tcgccgtact ctggcggtgc aggcgacgga gggccaccct  aggtgatcac gcgtcgatgt tgaccaagct cttcggacga tggatctgca tggacgggct cccgccgttc gcatcagcca tgatcgccga ccgctgacaa gcgtgacggc cggcctcgtt tggccaagtc gggccgacga gccgcttgat tggtttcgcg gcaacgccgt tgctgtgtca cgctcccgca tgacacagga actcacgaaa tgatcgccac gcgcccagct accccgagct cgctgggctt acggcgaggg agtcgggcct tcgcccatgc tggtggcgat agacggttcg agcccgacta aggccggcgt ggcacggctc tcgacgtgca tcggctcgat gcacgatccg cggtgaaggc gcggcaagat ccgccttcgg cggaattcgt ccgagctcgc tcgcgcccgt tgaatgtgtt tgaggggttt ctcccgcaaa  cgtgctgtcc ggtgttcagc gggccaggtc ggtggcgcgc cggtgcgcgg gcgccgcgtc ggtgcgtgag ggccgccgcc cgtcgagggc gatgcgcgtc cgccaattcg cggcggcatg cgcccaggag tatcgcccag cgaggaggcc cccggcgtcg ggcccgcgcc tcccggaaag gtccctgggt tatttctgtc cccgagaccc cgccgcccag cgaccgcgac cggcttccag caccgtcacc ccccaaggtg ggcctgggcc ctgcggccac gaaggccaaa cggggcccgc cggctttgcg cctgacctcg gatgccctcg gagcgtgatc ccaggcgggc cacccaggac ggtgaccgac ggtggtcaat ggcccgcgcc gctggccggc caaggccaag ggccgagccg gaagtgaccc ctcggcctcg gggagaagg  
//
LOCUS       gcc1290      2109 bp    mRNA            BCT       15-OCT-1999
DEFINITION  gcc1290.
ACCESSION   gcc1290
VERSION
KEYWORDS
SOURCE      Caulobacter crescentus.
  ORGANISM  Caulobacter crescentus
            Bacteria; Proteobacteria; alpha subdivision; Caulobacter group;
            Caulobacter.
REFERENCE   1  (bases 1 to 2109)
  AUTHORS   Awram,P.A.
  TITLE     Analysis of the S-layer Transporter Mechanism and Smooth
            Lipopolysaccharide Synthesis in Caulobacter crescentus
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2109)
  AUTHORS   Awram,P.A.
  TITLE     Direct Submission
  JOURNAL   Submitted (15-OCT-1999) UBC
FEATURES             Location/Qualifiers
     source          1..2109
                     /organism="Caulobacter crescentus"
                     /strain="NA1000"
     gene            complement(822..1769)
                     /gene="orf12"
     CDS             complement(822..1769)
                     /gene="orf12"
                     /codon_start=1
                     /product="Unknown interrupts O-antigen synthesis"
                     /translation="MSRLPPGLKTGRDVSVTGVDAAGRTVLTARDGDPQMVWTPTREE
                     RKALRAAKAVRIDVKLEAVEGKLVGPALYADWGDGFSEDSYARLKAGPDGWFASLPAR
                     SFQLNGVRLDPSEGACAFTVEALTVTRIGDLGRDPRGLRGAAIQALKPMLGPLRGPAG
                     AAWRRGRALLAKGRVARPAGRDEGAVGATYAHAIAVSRNLRSPHYAAPIAAPITLPAE
                     APKVVAFYLPQFHPFPENDTWWGKGFTEWTNVSKAQPQFLGHYQPRLPADLGFYDLVS
                     ARCWPSRWTWPRARASTPSASTTTGSPESAFWNGRWICS"
     gene            complement(1766..2107)
                     /gene="orf13"
     CDS             complement(1766..2107)
                     /gene="orf13"
                     /codon_start=1
                     /product="polysaccharide transporter kpsT-like"
                     /translation="PAELSGLGDFLHLPVRTYSAGMLARLMFTVATVFEADILVLDEW
                     LSAGDAAFVQKAAQRMHRMVEDAKIVVMATHDHDLVQRVCNRVCELQGGKIXFLGSXE
                     DWLAYRETQAA"
BASE COUNT      310 a
719 c 767 g 311 t 2 others ORIGIN 1 gtcggcgcgc ttggcgaact g g c t c t g t g c ctcggcgatg atcggatggg c g t t g g t c a g 61 g c g c g g c a g c c a g g c g c t g a g c g c g g t g c g g g t g g c g t g c a g a t a g c c g t ggccgaacca 121 c c g g t c g g g c t c g a g a t a g g c g c c t t c g c c c c a c t c g t t c c a g g c g t t g a cgaacaccag 181 c g c c t c g c c c t t g g g a t g c c g g g c c t c g g c g t g c t t c a g c g c c c c c g a c a gccagccgaa 241 a t a g c t t t c c g g a t c g g c g t t g t g g a a c g c c a c c c c g g c c c a g g g c t t g c g g g c c t g g t t 301 g t c c c a g c c g g g c a t g a c g c c g g g c a c g a a g g c g g c c g g a a c c t g c t c c a g c t c a t c c a g 361 c t t g t g g c g g g c c a c g g c g g g a t a g t c a t a g a c c t t g c c g g t g a a g c c g g cgtgcagcgg 421 c g t g a c c c g g t t g g t g a t c t c g c c c t c g a c a a t g g c g t g c g g c g g g a a g t c g a c g a t g c c 481 g t c g a a g c c g t g g c c g g c a t a g t c c t g g a a g c c g a a g g c g g t g g t g c a c a gcaggtgcag 541 c t c g c c c a g c c c c a t g g c c c g c g c c t g a t c g c g c c a g c g c t c a g t g g t g g c c t t g g c g t c 601 g g g c a g g a t c t c g g g g c g g t a c a g c a a t a g g a g c g g c t t g c c g c t c a c c c g c a g a t a g c g 661 c g g g t c g c g c a t g t a g c g c g c c a g g t c c t c g a a c a c c g c g c g g t c g t c c t g c g g c g a g t g 721 c t c c t g g c c c a t c a g g a t g t c g c t c t c g t c g c c g t c c c a g c g g c g g g t c c a g t t c t c a t t 781 g g c c c a g c a g a g g g c g a a g g g c a g g t c c a g g c t c g g a t c g t t c a g g a a c a g a t c c a g c g g 841 c c g t t c c a g a a g g c g c t t t c c g g c g a a c c a g t a g t a g t g g a a g c a g a a g g c g t g g a c g c c 901 c g c g c c c t t g g c c a g g t c c a c c t g c t g g g c c a g c a c c t c g c g c t g a c c a g g t c g t a g a a g 961 c c c a g a t c c g c c g g c a g g c g c g g c t g g t a g t 
g a c c c a g g a a c t g c g g c t g g g c c t t g g a g 1021 a c g t t g g t c c a c t c g g t g a a g c c c t t g c c c c a c c a g g t g t c a t t c t c c g g g a a c g g a t g a 1081 a a c t g c g g c a g g t a g a a g g c c a c c a c c t t g g g c g c t t c g g c c g g c a g g g t gatcggggcg  146  1141 1201 12 61 1321 1381 1441 1501 1561 1621 1681 1741 1801 1861 1921 1981 2041 2101  gcgatcgggg taggtcgcgc agcagcgccc ggcttcagcg atacgggtga cgcacgccgt gccttcaggc gggccgacca gcccgcagcg cgggcggtca gtcttcaggc cctcgktcga cccgctggac tccggtgcat cgtccagcac gcataccggc cggcggggg  cggcgtagtg cgaccgcgcc gcccgcgtcg cctggatcgc ccgtgagggc tcagttggaa gggcgtagga gcttgccctc ccttgcgctc ggaccgtgcg cgggcggcag gccgaggaag caggtcatgg ccgctgggcg caggatgtcg cgaataggtg  cgggctgcgc ctcgtcgcgt ccaggccgcg cgcgccgcgc ctcgaccgtg actgcgcgcc atcctcggaa gaccgcctca ttcgcgggtg cccggccgca gcggctcatg gsgatctttc tcgtgggtgg gccttctgca gcctcgaaca cgcaccggca  aggttgcggg cctgccggac ccggcggggc aggccgcgcg aaggcgcagg ggcagcgagg aagccgtcgc agcttgacgt ggcgtccaga tcgacgccgg cggcctgggt cgccctgcag ccatcaccac cgaaggcggc cggtggccac ggtgcagaaa  agaccgcgat gcgccacgcg cgcgcaaggg gatcgcggcc cgccctcgga cgaaccagcc cccagtcggc cgatccgcac ccatctgcgg tgaccgacac ttcgcggtag ttcgcagacg gatcttggcg gtcgccggcg cgtgaacatc gtcgcccagg  cgcgtgggcg tcccttggcc gcccagcatc gagatccccg cgggtccagc gtccggaccg gtagagcgcg cgccttggcc gtcgccgtcg gtcgcggccg gccagccagt cggttgcaga tcctcgacca ctgagccact aggcgcgcca cccgataact  //  LOCUS DEFINITION ACCESSION VERSION KEYWORDS SOURCE ORGANISM  g c c 2205 g c c 2205. g c c 2205  2365 b p  mRNA  BCT  15-OCT-1999  Caulobacter crescentus. Caulobacter crescentus B a c t e r i a ; P r o t e o b a c t e r i a ; alpha s u b d i v i s i o n ; C a u l o b a c t e r group; Caulobacter. REFERENCE 1 ( b a s e s 1 t o 2365) AUTHORS Awram,P.A. 
TITLE A n a l y s i s o f t h e S - l a y e r T r a n s p o r t e r M e c h a n i s m a n d Smooth Lipopolysaccharide Synthesis i nCaulobacter crescentus JOURNAL Unpublished REFERENCE 2 ( b a s e s 1 t o 2365) AUTHORS Awram,P.A. TITLE D i r e c t Submission JOURNAL S u b m i t t e d (15-OCT-1999) UBC FEATURES Location/Qualifiers source 1..2365 /organism="Caulobacter crescentus" /strain="NA1000" gene complement(2..550) /gene="orf16" CDS complement(2..550) /gene="orf16" /codon_start=l / p r o d u c t = " p u t a t i v e HOMODA h y d r o l a s e p r o t e i n " /translation="MRGLTISGVFAVLVLTASLAQAGEVTVDGRKVAYREWGGGERTL VMVSGLGDGAETFETVGPRLAQGWRVIAYDRAGYGGSADDPRVHDAERAEAELKGLLA ALKVRKPVLLGHSLGGVFAAHFAARNPGEVTGLVLEETRPTGFTAACKAKRMRGCAFP PLLKYAFPPGGRREVETLDRIER" gene complement(559..2178) /gene="pgi" CDS complement(559..2178) /gene="pgi" /codon_start=l  147  /product="putative phosphoglucoisomerase" /translation="MADLDAAWTRLEAAAKAAGDKRIVEFFDAEPGRLDALTLDVAGL HLDLSKQAWDEAGLEAALDLAHAADVEGARARMFDGEAINSSEGRAVLHTXLRAPAGA DVKALGQPVMAEVDAVRQRMKAFAQXVRSGAIKGATGKPFKAILHIGIGGSDLGPRLL WDALRPVKPSIDLRFVANVDGAEFALTTADMDPEETLVMVVSKTFTTQETMANAGAAR AWLVAALGEQGANQHLAAISTALDKTAAFGVPDDRVFGFWDWVGGRYSLWSSVSLSVA VAAGWDAFQGFLDGGAAMDEHFRTAPLEQNAPVLVALAQIFNRNGLDRRARSVVPYSH RLRRLAAFLQQLEMESNGKSVGPDGQPAKRGTATVVFGDEGTNVQHAYFQCMHQGTDI TPMELIGVAKSDEGPAGMHEKLLSNLLAQAEAFMVGRTTDDVVAELTAKGVSDAEIAT LAPQRTFAGNRPSTLVLLDRLTPQTFGALIALYEHKTFVEGVIWGINSFDQWGVELGK VMANRILPELESGASGQHDPSTAGLIQRLKR" 367 a 849 c 783 g 364 t 2 others  BASE COUNT ORIGIN 1 gccgctcaat acggtccagc 61 t g a g c a g c g g c g g g a a c g c g 121 c g g t c g g c c g g g t t t c c t c c 181 a g t g g g c g g c g a a c a c g c c g 241 a c g c c g c c a g c a g c c c c t t c 301 c a t c g g c g c t g c c g c c a t a g 361 g c c g g g g g c c g a c c g t c t c g 421 g g g t c c g c t c g c c a c c g c c c 481 c g c c g g c c t g c g c c a a c g a g 541 g c c c t c g c a t g g g a t c g c c t 601 a t g c t g g c c c g a a g c g c c g c 661 c a g c 
t c g a c g c c c c a c t g g t 721 c t t g t g c t c a t a g a g g g c g a 781 c a c c a g g g t c g a g g g c c g g t 841 g g c g t c a g a g a c g c c c t t g g 901 c a t g a a g g c c t c g g c c t g g g 961 g c c t t c g t c c g a c t t g g c g a 1021 c a t g c a c t g g a a a t a g g c g t 1081 c g t g c c g c g c t t g g c c g g c t 1141 c a g c t g c t g g a g g a a g g c g g 1201 g g c c c g g c g g t c c a g g c c g t 12 61 a t t c t g c t c c a g c g g g g c g g 1321 a c c c t g g a a c g c g t c c c a g c 1381 c g a a t a g c g g c c g c c g a c c c 1441 g g c g g c g g t c t t g t c c a g c g 1501 g c c t a g g g c c g c c a c c a g c c 1561 c g t g a a g g t c t t g g a g a c c a 1621 c a g g g c g a a c t c g g c g c c g t 1681 c g g t c g c a g g g c g t c c c a c a 1741 c a g g a t c g c c t t g a a c g g c t 1801 c g c g a a a g c c t t c a t c c g c t 1861 g g c c t t g a c g t c c g c t c c c g 1921 g g a c g a a t t g a t c g c c t c g c 1981 g t g g g c c a g a t c g a g c g c g g 2041 c a g g t g c a g g c c g g c g a c g t 2101 g a a c t c g a c g a t a c g c t t g t 2161 g g c g t c g a g a t c g g c c a t g t 2221 g c t t a t c a a a c a g g c g c c g c 2281 c g a t g t c c t g t g a c g c c g t c 2341 a g c a g a c g a c c g g c c c c g c c  gtttcgacct cacccgcgca agcacgaggc cccagcgaat agctcagcct cccgcccggt aacgtctcgg cattcgcgat gccgtcagca agcgcttcag tctccagctc cgaagctgtt tcagggcgcc tgccggcgaa ccgtgagctc ccaagaggtt cgccgatcag gctgaacatt gcccgtcggg ccaggcggcg tgcgattgaa tgcggaagtg ccgcggcgac agtcccagaa cggtcgagat aggcccgcgc ccatgaccag cgacattggc gcaggcgtgg tgccggtcgc ggcggacggc ccggagcgcg cgtcgaacat cctcgagacc ccagggtcag cgcccgcagc cgtcctcaca cgcgtcatgg gccttttccg ctcgc  cgcggcgccc tacgcttagc ccgtgacctc gccccagaag ccgcccgctc cataggcgat ccccgtcgcc aagcgacctt ccaggacggc gcgctggatc cggcaggatg gatcccccag gaaggtctgg agttcgctgc ggccacgaca cgagagcagc ctccatcggc ggtgccttcg gccgaccgac caggcggtgc gatctgggcc ctcatccatg ggccaccgaa cccgaacacg ggcggccaga 
cgcgccggcg ggtctcttcc gacgaagcgc gcccaggtcg gcccttgatc atcgacctcg caggnccgta gcgggcccgg cgcctcatcc agcgtcaagg cttggcggcg ggtttggtaa cagaccaatg ccatgctctg  gcccggcgga a a c g c g t a t t c t t a c a g g c c gcagtgaagc t c c c g g a t t g cgggccgcga cacgggcttt cgcaccttca ggcgtcgtgc acacgcggat gacgcgccag ccctgggcca aagaccgctg accatcacca gcgcccgtcg accgtcacct gaaaacgccg c t g a t c g t c a aaccctgcgg tcgaagggtc cggttcgcca tcaccttgcc a t c a c g c c c t cgacgaaggt ggcgtcaggc ggtcgaggag ggggccaggg t g g c g a t t t c tcgtccgtgg tccgcccgac t t c t c g t g c a tgccggccgg gtgatgtcgg tcccctggtg tcgccgaaca ccaccgtggc ttgccgttgc tctccatctc gagtacggca cgaccgagcg agggccacca gcaccggcgc gccgcgccgc cgtccaggaa aggctgaccg acgaccacag c g a t c g t c c g gcacgccgaa tgctgattgg ccccctgctc ttggccatgg tctcctgggt g g g t c c a t g t cggcggtggt. aggtcgatcg acggcttcac ctgccgccga tgccgatgtg gcccccgaac gcacggmctg gccatgaccg gctggcccag tgcagcacag c c c g g c c t t c gcgccctcga catcggccgc caggcctgct tggagagatc cgtcccggct cggcgtcgaa g c t t c c a g g c gggtccaggc a t c g c t g t t t acggacccgg a c g t t t t t t g cgggagcccc gatggcgtcg ttcaatccgg  148  
