Open Collections

UBC Theses and Dissertations

Structural analysis of an enterohemorrhagic Escherichia coli metalloprotease effector Yu, Angel Chia-yu 2012

Structural analysis of an enterohemorrhagic Escherichia coli metalloprotease effector

by Angel Yu

B.Sc., The University of British Columbia, 2006

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (Biochemistry and Molecular Biology)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

July 2012

© Angel Yu, 2012

Abstract

Mucins are proteins that contain dense clusters of α-O-GalNAc-linked carbohydrate chains and are the major component of the mucosal barrier that lines the mammalian gastrointestinal tract from mouth to gut. A critical biological function of mucins is to protect the underlying epithelial cells from infection. Enterohemorrhagic Escherichia coli O157:H7 (EHEC), a bacterial pathogen that causes severe food- and water-borne disease, is capable of breaching this barrier and adhering to intestinal epithelial cells during infection.

StcE (secreted protease of C1-esterase inhibitor) is a ~100 kDa zinc metalloprotease virulence factor that is secreted by EHEC and plays a pivotal role in remodelling the mucosal lining during EHEC pathogenesis. StcE also dampens the host immune response by targeting the mucin-like region of C1-INH, a key complement regulator of innate immunity. To obtain further mechanistic insight into StcE function, I have determined the crystal structure of the full-length protease to 2.5 Å resolution. This structure shows that StcE adopts a dynamic, multi-domain architecture featuring an unusually large substrate-binding cleft. Electrostatic surface analysis reveals a prominent polarized charge distribution highly suggestive of an electrostatic role in substrate targeting. The observation of key conserved motifs in the active site allows us to propose the structural basis for the specific recognition of α-O-glycan-containing substrates, whose glycosylation has been confirmed by glycan array screening to be of the mucin type.
Complementary biochemical analysis employing domain variants of StcE further extends our understanding of the substrate binding stoichiometry and distinct substrate specificity of this important virulence-associated metalloprotease.

Preface

Parts of this thesis have either been published or are soon due for publication in peer-reviewed journals.

A manuscript describing the characterization of enterohemorrhagic Escherichia coli StcE has been accepted for publication in Structure [Yu, A.C.Y., Worrall, L.J. and Strynadka, N.C.J. Structural insight into the bacterial mucinase StcE essential to adhesion and immune evasion during enterohemorrhagic E. coli infection. Structure.]. I was responsible for all aspects of this project, including design of the expression constructs, purification of StcE domain variants and the serpin-bound complex, conducting the proteolytic activity assay, performing the sucrose gradient experiment, analyzing the static light scattering results, obtaining diffracting crystals of StcE, refining the structural model and interpreting the structure. I conceived the ideas for the manuscript and I was responsible for the large majority of the writing. Dr. Liam Worrall, a current postdoctoral fellow in the Strynadka lab, provided assistance with the data collection and structure determination, and Dr. Thomas Spreter, a former postdoctoral fellow, helped with the static light scattering experiments. Mammalian glycan array screening was performed at the Consortium for Functional Glycomics. This work forms the majority of Chapter 3.

Work describing the characterization of the periplasmic domain of PrgH appeared in Nature Structural and Molecular Biology [Spreter, T., Yip, C.K., Sanowar, S., Andre, I., Kimbrough, T.G., Vuckovic, M., Pfuetzner, R.A., Deng, W., Yu, A.C., Finlay, B.B., Baker, D., Miller, S.I. and Strynadka, N.C. (2009) A conserved structural motif mediates formation of the periplasmic rings in the type III secretion system. Nature Structural and Molecular Biology 16, 468-476.]. I was involved in the purification and crystallization of various constructs of the PrgH periplasmic domain. I refined the crystal growth conditions and obtained diffracting crystals of PrgH(170-362). The structure was solved by Dr. Calvin Yip, a former PhD student in the lab. Our collaborators from Dr. Brett Finlay’s and Dr. Samuel Miller’s groups performed the biochemical characterization of the relevant type III secretion mutants. Ingemar Andre from Dr. David Baker’s lab and Dr. Thomas Spreter from the Strynadka group conducted the molecular modeling and analyzed the docking results. The PrgH structure was published along with that of the EscC secretin, another basal body component of the type III secretion system, in 2009. This work is located in Appendix A.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication
Chapter 1: Introduction
Mucosal immunology
Mucin-type glycoproteins
Role of mucin-type glycoproteins in host defense
Pathogens and the host mucosal barrier
Complement systems of innate immunity
Complement activation pathways
Complement evasion by pathogens
Acquisition of complement regulators
Complement inhibition by serpins
Pathogenesis of enterohemorrhagic E. coli
The locus of enterocyte effacement (LEE) pathogenicity island
Characteristics of zinc metalloprotease effectors
Classification of zinc metalloproteases
Catalytic mechanism
Enterohemorrhagic E. coli StcE
Role in bacterial intimate adherence
Evasion of host immune defence
Chapter 2: Materials and methods
Molecular cloning
Protein expression and purification
Multiangle light scattering
Proteolytic assay
Limited proteolysis
Inductively coupled plasma mass spectrometry analysis
Crystallization and data collection
Structure determination and refinement
Peptide docking
Sucrose density gradient centrifugation
Circular dichroism spectroscopy
Glycan array screening
Microcalorimetry
Enzymatic deglycosylation of StcEΔ35E447D/C1-INH
In vitro chemical cross-linking of StcEΔ35E447D/C1-INH
Chapter 3: Results
Characterization of full-length StcE
Recombinant expression and purification of StcE
Oligomerization state of proteolytic active StcE
Characterization of StcE/C1-INH interactions
Static light scattering analysis of the metalloprotease-serpin complex
Isothermal titration calorimetry analysis of C1-INH binding
Proteolytic specificity of EHEC StcE
Mammalian glycan array screening
Isothermal titration calorimetry of glycan and C1-INH peptide ligands
Limited proteolysis of StcE
Surface entropy reduction engineering of StcE
Structural analysis of StcE
Overall architecture
Overall electrostatic surface features
Metalloprotease domain
Accessible substrate-binding cleft
Conserved active site features
StcE proteolytic specificity
Potential exosite function of the insertion domain
Cell surface binding features
Chemical cross-linking of StcE with C1-INH
Characterization of the StcE insertion domain
Purification and structural determination of the StcE insertion domain
Implications for substrate binding
Orientation of StcE(132-251) in full-length StcE
Electrostatic surface features of StcE(132-251)
StcE insertion domain is dynamic
Chapter 4: Discussion
Insights from the StcE structure
Open conformation of the active site and overall electrostatics
Subsite specificity
Recognition of mucin-type O-glycan ligands
Insight on catalytic mechanism
Implications of the multi-domain architecture
Future directions
Characterization StcE-substrate interactions using recombinant C1-INH
Kinetic analysis of StcE mutants
Baculovirus-insect cell expression of C1-INH
Expression of C1-INH in mammalian hosts
De-glycosylation strategies
Electron microscopy of StcEΔ35E447D/C1-INH
Cell-based assays and In vivo characterization of StcE mutants
Protein translocation assays
In vivo competitive index assay
Inhibitor screening for EHEC StcE
Bibliography
Appendices
Appendix A The virulence-associated type III secretion system
The type III secretion apparatus
Assembly
Extracellular components: needle, needle extension and translocon
Outer membrane structure
Inner membrane structure
Structural characterization of PrgH periplasmic domain
Implications of PrgH periplasmic domain structure
Structure of MxiG cytoplasmic domain
Export apparatus
Appendix B References

List of Tables

Table 3.1 Data collection and refinement statistics for SeMet-labeled crystals of StcEΔ35K318A/K320A/E321A/E447D
Table 3.2 Data collection and refinement statistics for StcE(132-251)
Table A.1 Components of the bacterial virulence-associated type III secretion systems

List of Figures

Figure 1.1 Mucosal barrier of the human gastrointestinal tract
Figure 1.2 Chemical structure of mucin-type O-glycosylation
Figure 1.3 Activation and regulation of human complement pathways
Figure 1.4 Sequence features of the C1-INH mucin-like domain
Figure 1.5 The serpin mechanism of irreversible protease inhibition
Figure 1.6 Intimate adherence of enterohemorrhagic E. coli to host cells
Figure 1.7 Classification of zinc metalloprotease
Figure 1.8 Architecture of gluzincin and metzincin active sites
Figure 1.9 Hydrolytic mechanism of zinc metalloproteases
Figure 1.10 Nomenclature of protease-substrate interactions
Figure 1.11 Structure of thermolysin in complex with the inhibitor N-phosphoryl-L-leucinamide
Figure 3.1 Recombinant expression of a stable, mature form of StcE
Figure 3.2 Recombinant StcE is proteolytically active as a monomer
Figure 3.3 StcEΔ35E447D binds human C1-INH in a 1:1 ratio
Figure 3.4 Isothermal titration calorimetry of StcEΔ35E447D binding to human C1-INH
Figure 3.5 Identification of the StcE cleavage site in C1-INH
Figure 3.6 Mammalian glycan array screening of StcEΔ35E447D
Figure 3.7 Isothermal titration calorimetry of StcEΔ35E447D binding to the O-linked glycan, and C1-INH peptide ligands
Figure 3.8 Predicted domain boundaries of the conserved M66 metalloprotease domain in EHEC StcE
Figure 3.9 Limited proteolysis of StcEΔ35E447D
Figure 3.10 Overall architecture of StcE
Figure 3.11 Sequence alignment of select StcE homologues
Figure 3.12 Surface charge asymmetry in Macrobdella decora intramolecular trans-sialidase
Figure 3.13 StcE metalloprotease domain
Figure 3.14 Fluorescence scan of StcE quadruple mutant (K318A/K320A/E321A/E447D) crystals
Figure 3.15 StcE active site
Figure 3.16 Biochemical characterization of StcE domain variants
Figure 3.17 Isothermal titration calorimetry of StcEΔ35E447D to calcium
Figure 3.18 Sucrose density gradient analysis of StcE domain variants
Figure 3.19 Chemical cross-linking of the StcEΔ35E447D/C1-INH complex
Figure 3.20 StcE insertion domain is an independently folded module
Figure 3.21 Relative orientation of the proposed exosite domain in StcE
Figure 4.1 Regulation of zinc metalloprotease activity by the cysteine switch mechanism
Figure 4.2 Structure of the CD43 mucin glycopeptide STTAV
Figure 4.3 Conservation of surface features in the StcE active site
Figure A.1 Structural overview of the type III secretion apparatus
Figure A.2 Model for the step-wise assembly of the T3SS
Figure A.3 Domain organization of T3SS outer and inner membrane ring components
Figure A.4 PrgH(170-362) shares a conserved ring-building motif in EscJ(21-190)
Figure A.5 Basal body components of the T3SS
Figure A.6 MxiG(1-126) adopts a FHA fold

List of Abbreviations

3D    three-dimensional
C-ring    cytoplasmic ring
DMSO    dimethylsulfoxide
DSP    dithiobis[succinimidylpropionate]
DSS    disuccinimidyl suberate
DTT    dithiothreitol
EC    enzyme commission
ECP    E. coli common pilus
EDTA    ethylenediaminetetraacetic acid
EGS    ethylene glycol bis[succinimidylsuccinate]
EHEC    enterohemorrhagic Escherichia coli
EPEC    enteropathogenic Escherichia coli
EM    electron microscopy
FACS    fluorescence activated cell sorting
FHA    forkhead-associated
GB3    globotriaosylceramide
GFP    green fluorescent protein
HEK    human embryonic kidney
HEPES    hydroxyethyl piperazineethanesulfonic acid
IPTG    isopropyl-β-D-thiogalactopyranoside
ITC    isothermal titration calorimetry
JNK    c-Jun N-terminal kinase
LB    Luria Bertani
LEE    Locus of Enterocyte Effacement
MG    mucin-type glycoproteins
N-WASP    neural Wiskott-Aldrich syndrome protein
NF-κB    nuclear factor-κB
OMP    outer membrane protein
PCP    procollagen C-proteinase
PDB    Protein Data Bank
PNGaseF    peptide N-glycosidase F
Rmsd    root mean square deviation
SAD    single anomalous diffraction
SDS-PAGE    sodium dodecyl sulfate-polyacrylamide gel electrophoresis
SeMet    selenomethionine
Stx    shiga toxin
T2SS    type II secretion system
T3SS    type III secretion system
Tccp    Tir cytoskeleton-coupling protein
Tir    translocated intimin receptor
TLP    thermolysin-like proteases
WT    wild-type

Acknowledgements

First and foremost, I would like to thank my family for their patience and unconditional support over the years. I would like to thank my supervisor Dr. Natalie Strynadka for giving me the chance to pursue the StcE project, which, although challenging, has motivated me to learn more about science. I am grateful for her valuable advice on research and writing, as well as the wealth of resources available in the lab. I would also like to thank my supervisory committee members Dr. Lawrence McIntosh and Dr. Ross MacGillivray for the helpful discussions and proposed research directions. I would like to thank past and present members of the Strynadka lab for sharing their expertise and support over the years, especially Liza de Castro and Marija Vuckovic for always being so helpful; Dr. Raz Zarivach, Dr. Igor D’Angelo, Dr. Andrew Lovering and Dr. Liam Worrall for teaching me crystallography; Dr. Ho Jun Lee and Dr. Calvin Yip for their technical advice; and Dr. Emilie Lameignere and Dr. Susan Safadi for troubleshooting the isothermal titration calorimetry studies. I am also very grateful for the scholarship support from the Natural Sciences and Engineering Research Council and the Michael Smith Foundation for Health Research, and to all my committee members for the valuable reference letters that accompanied the award applications.

Dedication

To my family for their continuous support and encouragement.

Chapter 1: Introduction

1.1 Mucosal immunology

The mucosal lining of the gastrointestinal tract presents a large interface for encounters with potential pathogens in the lumen. The lining must be permeable enough to allow nutrient uptake while still providing a barrier that fends off infection. Specialized intestinal goblet cells produce mucin glycoproteins that coat the epithelial surface, forming a thick, protective mucus layer together with other defense molecules, including antimicrobial peptides and immunoglobulins. The mucus consists of two layers: an inner glycocalyx of transmembrane mucins on the apical side of epithelial cells (100-150 µm) and a thick outer layer of secreted mucins measuring from 300 µm in the stomach to 700 µm in the large intestine (Linden et al., 2008).
The inner layer is normally devoid of microorganisms, while anaerobic commensals and other bacteria reside in the outer layer, with the colon being the most densely populated organ (10¹⁰-10¹² bacteria per gram of luminal content compared to 10⁵-10⁷ in the small intestine) (Figure 1.1).

1.1.1 Mucin-type glycoproteins

Mucin-type glycoproteins (MG) are the major constituent of the mucosal barrier. These densely glycosylated proteins are characterized by Pro/Thr/Ser (PTS)-rich domains, in which the hydroxyls of Ser or Thr can be covalently modified in the Golgi by various O-linked glycans through the linking sugar α-N-acetylgalactosamine (GalNAc) (Figure 1.2) (Hang and Bertozzi, 2005). This post-translational modification, referred to as mucin-type glycosylation, is the most common form of O-glycosylation in higher eukaryotes.

Figure 1.1 Mucosal barrier of the human gastrointestinal tract. Colonic epithelial cells are covered by a thick mucus barrier consisting of a sterile inner layer and a thick outer layer that is populated with microorganisms (note that the density of bacteria is higher in the large intestine than in other digestive organs due to acidity in the stomach and the high concentration of bile secretions in the small intestine). Mucin-type glycoproteins on the surface of intestinal microvilli are primarily synthesized by goblet cells and they adopt a “bottle brush”-like structure, with large numbers of O-glycans projecting from the peptide backbone. The figure is derived from McGuckin et al. (2011). Reprinted with permission of Nature Reviews Microbiology.

Figure 1.2 Chemical structure of mucin-type O-glycosylation. Mucin-type O-glycosylation is represented by GalNAcα-threonine and the core I motif by Galβ1-3GalNAcα-threonine.

The close spatial arrangement of the sugars projecting from the mucin peptide dictates many biophysical properties of the MG and modulates their functions.
The O-linked glycans are clustered along the PTS tandem repeats and cause the glycoproteins to assume a “bottle-brush”-like, extended structure, with dimensions 2.5 to 3 times greater than unglycosylated, denatured globular proteins of similar residue number (Jentoft, 1990). Steric exclusion and electrostatic repulsion from the dense anionic O-glycans also limit access to the peptide backbone of mucin-like domains. This shielding confers protease resistance to the glycoproteins (Van den Steen et al., 1998) and allows them to function as a protective barrier. A subset of MG can oligomerize through intermolecular disulfide bonds and further assemble into higher-order structures. Upon hydration after being secreted into the lumen, the MG polymers expand 100-1,000-fold in volume to give rise to a viscous gel (Tam and Verdugo, 1981), which can agglutinate bacteria as shown by salivary MG in the oral cavity.

1.1.2 Role of mucin-type glycoproteins in host defense

Saliva represents one of the host’s first defences against enteric pathogens such as Escherichia coli O157:H7. This particular serotype belongs to the pathovar of enterohemorrhagic E. coli (EHEC), which is often associated with food-borne infections and severe gastrointestinal disease (Croxen and Finlay, 2010). Oligosaccharides attached to secreted salivary mucins can serve as receptors for bacterial adhesins to mediate cell-binding and form bacteria-protein aggregates that are subsequently engulfed by macrophages and neutrophils (Prakobphol et al., 1999) or removed from the respiratory tract by mucociliary action.

In addition to the secreted mucins, membrane-anchored surface MGs also play an important role in mitigating interactions with pathogens that reach the normally sterile cell surface.
The sheer size of the membrane-spanning MG and the associated O-linked glycans can block access to the cell surface and inhibit bacterial adhesion through steric hindrance (Linden et al., 2009) (Figure 1.1). These sugar chains often mimic carbohydrate ligands found in glycolipids of the underlying mucosal membrane that are necessary for bacterial recognition and attachment. Once they are bound by the bacteria, the binding interaction as well as endogenous proteases can stimulate shedding of the extracellular O-glycosylated domain and this, in turn, facilitates the removal of the associated microorganism from the cell surface in a mechanism described as a “releasable decoy” (McGuckin et al., 2011).

1.1.3 Pathogens and the host mucosal barrier

Though the mucosal lining is effective in keeping microorganisms at bay, certain pathogens have evolved strategies to cross this barrier and reach the normally sterile apical cell surface to initiate infection. These include evasion of the mucus barrier, expression of degradative enzymes, and use of motility systems to facilitate entry and disrupt epithelial cell integrity (McGuckin et al., 2011). For instance, some pathogens can bypass the thick mucus layer and enter via M (microfold) cells. Unlike most cell types of the intestinal epithelium, M cells are not covered by a thick mucus layer (due to the lack of goblet cells in the vicinity) and are able to deliver microorganisms and antigens in the gut lumen to leukocytes across the epithelial barrier, thereby stimulating the immune response (Neutra et al., 1999). Many bacterial pathogens such as Salmonella typhimurium, Shigella flexneri and Yersinia enterocolitica, which all utilize a type III secretion system (T3SS) to inject virulence effector proteins into infected host cells, take advantage of this unprotected surface to invade the host cells (Vazquez-Torres and Fang, 2000).

Bacterial surface structures such as flagella, which are widespread among enteric pathogens, also play a key role in mediating access to the host epithelial cells (Yonekura et al., 2003). Flagella provide a propulsive force that is necessary to direct bacterial movement through the viscous glycocalyx and mucus layer to allow chemotaxis and confer motility during encounters at the host-pathogen interface. Besides using flagella-mediated motility to penetrate the mucus, other bacteria secrete additional enzymes that can break down the highly O-glycosylated mucins, and thus reduce their viscosity and ability to aggregate microorganisms. These enzymes include glycosidases that cleave surface carbohydrates potentially serving as receptor decoys for bacterial adhesins (and also make the underlying mucin peptide backbone more susceptible to proteolysis), as well as mucinase enzymes (Silva et al., 2003). Mucinases are proteolytic enzymes that specifically degrade mucin-type glycoprotein targets. In particular, the ability of EHEC to overcome mucosal defenses and successfully colonize the host has been attributed to the host cell surface-clearing mucinase activity of StcE (Grys et al., 2005), the focus of this thesis (its contribution to EHEC pathogenesis will be discussed in greater detail in later sections). Once microorganisms gain access to the host epithelium, they can further promote infection by releasing toxins that target intercellular tight junctions. These seals between adjoining cells maintain the barrier function of the gut epithelium, and the effector protein EspF (E. coli secreted protein F) has been shown to disrupt their integrity in vivo in studies using Citrobacter rodentium, a related mouse pathogen that causes attaching-and-effacing lesions in infected host cells similar to those caused by the human-specific EHEC and enteropathogenic E. coli (Guttman et al., 2006).
The resulting changes in the localization of the tight junction proteins claudin 1, 3 and 5 led to abnormal ion secretion and increased water influx to the intestinal lumen, which resembles the diarrheal phenotype observed in EHEC infections.

1.2 Complement systems of innate immunity

Among the body’s first lines of defense against pathogens is the complement system. It can distinguish self from non-self and efficiently targets foreign antigens for removal either by phagocytosis or direct cell lysis. In order to survive within the host environment, successful pathogens have evolved various strategies to either delay or escape complement-mediated killing by targeting specific components of this defense system. Complement is a critical part of the innate immune system and consists of more than 30 proteins that exist both in cell surface-bound forms and in the fluid phase in human plasma and tissue fluid (Walport, 2001). They sequentially regulate the activities of one another through controlled proteolytic cleavage to bring about attack on the pathogen surface.

1.2.1 Complement activation pathways

The complement system can be activated through three distinct pathways: the classical, alternative and lectin-binding pathways (Lambris et al., 2008) (Figure 1.3). Each utilizes a different set of recognition molecules to detect foreign structures. The classical pathway consists of proteins C1 to C9, and binding of antibodies to microbial surface structures can stimulate this branch of the complement system through the recruitment of C1q of the C1 complex to the antibody-antigen complex. When the globular recognition domains of C1q bind to IgG or IgM, this triggers the onset of the proteolytic cascade as inactive zymogens of the C1r and C1s serine proteases undergo auto-activation to form the functional enzymes (Figure 1.3). The mature C1r and C1s proteases then cleave C4 to generate C4a and C4b (Bally et al., 2009).
Next, C2 associates with C4b bound to the cell surface, where it is also cleaved by the C1s protease to release C2b and generate C4b2a, the C3 convertase complex. The C3 convertase enzyme hydrolyzes C3 to C3a and C3b, which covalently attaches to and tags the target cell surface for destruction, and this represents the key point of convergence for all three pathways of the complement system.

The formation of C3 convertase through the lectin pathway is analogous to the classical pathway except that mannan-binding lectins (MBL) recognize specific sugars on microbial surfaces instead of antibody-antigen complexes. Moreover, the initial proteolytic activation of complement substrates is mediated by MBL-associated serine proteases 1 and 2 (MASP-1 and MASP-2) (Figure 1.3). Alternatively, C3 can spontaneously hydrolyze to form C3b and initiate the alternative pathway. Factor B of the alternative pathway binds directly to C3b to create the C3 convertase equivalent, C3bBb, and this effectively amplifies the downstream effects of the classical and lectin pathways to generate more C3b opsonins available for surface deposition (Figure 1.3). The opsonins serve to tag foreign cell surfaces to stimulate the complement system and to facilitate their recognition by phagocytic cells to promote their removal.

Figure 1.3 Activation and regulation of human complement pathways. Proteolytic activation of the classical, lectin and alternative pathways results in the formation of C3 convertase, which cleaves C3 into C3b. Increased C3b deposition on the target cell surface leads to the assembly of C5 convertase and eventually, the lytic membrane attack complex (MAC). C1-INH (C1-esterase inhibitor) recruited to the surface can suppress complement activation via its serine protease inhibitor activity and by displacing factor B from C3 convertase (C3bBb) of the alternative pathway.
As C3b accumulates on foreign cell surfaces after initiation of the proteolytic cascades, this can result in direct cell lysis and phagocytosis of opsonized particles. Covalently bound C3b leads to the formation of C5 convertase that cleaves C5 to C5a and C5b, and C5b, in turn, provides the nucleation point for the assembly of a multi-component membrane-attack complex (C5b and C6 to C9) that punctures the microbial membrane to cause cell lysis (Figure 1.3). C3b also promotes the clearance of opsonized cells via direct interaction with complement receptors on phagocytes (Wiesmann et al., 2006). These specialized immune cells process and present engulfed foreign antigens to effector T and B cells, thereby also stimulating the adaptive immune system. Furthermore, complement fragments C3a and C5a released during proteolytic activation act as anaphylatoxins and induce a range of pro-inflammatory and chemotactic responses, including the recruitment of leukocytes to the site of infection, which all contribute to the action of complement on invading pathogens. As such, the complex series of protein interactions triggered by the onset of complement cascades is highly efficient in protecting the host from pathogens. On the other hand, several modulators also exist to regulate the activities of complement proteins to prevent excessive immune activation and to inhibit components that have inadvertently deposited on normal host cells to minimize self-injury. Although these host defense molecules are discriminative as to their timing and site of activation, many are susceptible to manipulation by virulence proteins that thwart their functions to the advantage of the pathogen (Serruto et al., 2010).

1.2.2 Complement evasion by pathogens

The ability to avoid the multiple attacks of the immune system is often key to the virulence mechanism of pathogenic microorganisms.
They have developed various strategies during the course of evolution to suppress the activities of host complement proteins, and the mechanisms they use to achieve this include: proteolytic degradation of complement proteins, inhibition of their activities and recruitment of complement regulators or expression of their structural mimics (Lambris et al., 2008).

Acquisition of complement regulators

One of the most adopted and efficient immune evasion strategies is to take advantage of available host resources and hijack endogenous regulators of complement activation. By recruiting complement modulators to the pathogen surface, these proteins trick the host to identify the foreign cell as “self” and thus spare the pathogen of complement-mediated killing through their protective functions. Specifically, the virulence factor StcE secreted by enterohemorrhagic E. coli binds the host protein C1-esterase inhibitor (C1-INH) and localizes the complement modulator to the cell surface (Figure 1.3) (Lathem et al., 2004; Lathem et al., 2002). C1-INH is a member of the serpin (serine protease inhibitor) family, which includes other plasma serpins antithrombin, α-1-antitrypsin and plasminogen activator inhibitor (Pike et al., 2002). C1-INH has a two-domain structure: a heavily glycosylated, mucin-type amino-terminus (Figure 1.4) and a globular C-terminal serpin domain that functions as an essential regulator of a number of serine proteases in the complement pathways (Beinrohr et al., 2007).

Figure 1.4 Sequence features of the C1-INH mucin-like domain. Confirmed N- and O-glycosylation sites are denoted by filled triangles and circles, respectively. Potential O-glycosylated residues are marked by open circles. Dashed line underlines the signal sequence. Residue numbering is indicated above the protein sequence.
Although sequence identity is modest among serpins, their structures have a similar overall architecture and these proteins irreversibly inactivate their cognate serine proteases through a common “conformational trap” mechanism (see the section below for mechanistic details) (Carrell and Travis, 1985).

Complement inhibition by serpins

The serpin fold consists of nine α-helices and three β-sheets (labeled A-C), in which a flexible reactive center loop (RCL) of ~20 residues is fully exposed to the solvent and potential protease targets (Figure 1.5). The conformation of native serpins exists in a metastable or high-energy state and it is the tendency toward a more thermodynamically stable state that provides the driving force for the formation of inhibitory serpin-protease complexes (Huntington, 2006). The flexible serpin RCL acts as bait and its sequence is complementary to the substrate binding pockets of the associated protease (nomenclature for numbering the RCL residues is consistent with that for the protease substrate whose specificity they mimic, where P1 denotes the residue N-terminal to the scissile bond and P1’ the residue C-terminal to the cleavage site). Upon recognition by the protease, the serpin is cleaved between P1 and P1’ and undergoes a dramatic conformational change in which the RCL inserts into β-sheet A to result in a hyperstable serpin conformation (Figure 1.5). As the protease inhibitor is covalently linked through residue P1 of the RCL to the catalytic serine of the protease in the acyl-enzyme intermediate, the serpin drags its target along with it while structural rearrangements in its core β-sheet take place. This distorts the catalytic triad in the serine protease active site and induces overall disorder in the protease structure (Figure 1.5) (Huntington et al., 2000).
Deformation of the catalytic site prevents deacylation, which normally occurs after the formation of the acyl-enzyme intermediate to release the protease from the complex, leaving the protease trapped in an inactive, disordered state that is susceptible to degradation (Huntington, 2006) (Figure 1.5).

Figure 1.5 The serpin mechanism of irreversible protease inhibition. The native serpin (PDB 1QLP) is shown on the left (gray) with its serine protease target (PDB 2PTN, colored in cyan and red) in a docked orientation. The covalent serpin-protease complex is shown on the right. Cleavage of the reactive center loop (RCL, colored in yellow) by the serine protease results in its insertion into β-sheet A (in green) of the serpin. This is accompanied by a large conformational change that displaces the protease to the opposite pole of the serpin, resulting in disruption of the catalytic serine triad (PDB 1EZX) (Huntington et al., 2000).

The serine protease inhibitor C1-INH plays an important role in various physiological processes that involve tightly regulated proteolytic pathways, including blood coagulation, fibrinolysis, inflammation and the complement cascades (Davis et al., 2008). In particular, the serpin controls the very first steps leading to full complement activation. C1-INH is the only known inhibitor of C1r and C1s serine proteases in the classical pathway and of their counterparts in the lectin pathway, MASP-1 and MASP-2 (mannose-binding lectin-associated serine protease) (Figure 1.3) (Rooijakkers and van Strijp, 2007). Inhibition of these serine proteases by the serpin prevents activation of the respective cascades and dampens the innate immune response. C1-INH also down-regulates the alternative pathway but in a manner that is independent of protease inhibition.
It competes with factor B for binding to C3b (Jiang et al., 2001) and this interferes with the assembly of a functional alternative pathway convertase (C3bBb) required for activating downstream components of the complement cascade (Figure 1.3). Thus, interaction of C1-INH with its complement substrates exerts a profound effect on complement activation.

EHEC StcE traps C1-INH at the cell surface through joint interactions with the serpin and the cell (Lathem et al., 2004) and its ability to exploit the complement regulator to escape the host immune response is key to the virulence of this bacterial pathogen.

1.3 Pathogenesis of enterohemorrhagic E. coli

Escherichia coli is a group of highly diverse Gram-negative bacteria. Some exist as harmless commensals in a symbiotic relationship with their mammalian hosts while others have evolved to become pathogens through the horizontal transfer of virulence genes that improve their fitness in specific niches. There are eight E. coli pathotypes characterized to date and they can be classified as either diarrheagenic E. coli, which causes enteric disease, or extraintestinal E. coli, which is associated with urinary tract infections, sepsis or meningitis (Croxen and Finlay, 2010). This thesis focuses on the enteric pathogen enterohemorrhagic E. coli (EHEC) and more specifically, the role of the virulence factor StcE in EHEC pathogenesis.

EHEC was first identified as a human pathogen in 1982 (Nataro and Kaper, 1998) and continues to be a major cause of outbreaks of gastrointestinal diseases worldwide. Cattle are asymptomatic carriers of EHEC and infections often occur as a result of consuming food or water that has been contaminated with the bacteria.
EHEC O157:H7 is the most prevalent serotype associated with outbreaks in developed countries such as the United States, Canada, Japan and the United Kingdom, with recent statistics from the Food and Drug Administration reporting a fatality rate as high as 50% in the elderly (Kaper et al., 2004). EHEC is highly infectious and ingestion of as few as ~100 cells of the Gram-negative bacteria can result in infection (Kaper et al., 2004). The bacteria typically colonize epithelial cells in the mammalian colon and cause bloody diarrhea and haemolytic uremic syndrome (HUS), which leads to the destruction of red blood cells and in some cases, kidney failure and death. The hallmark of EHEC infection is the presence of the phage-encoded Shiga toxin (Stx). There are two subgroups of Stx, Stx1 and Stx2, and they are members of the AB5 toxin family. Stx consists of a pentamer of the B subunit that binds to the glycolipid receptor Gb3 (globotriaosylceramide) present on the surface of epithelial cells in the human intestine and kidney, and an enzymatically active A subunit that cleaves ribosomal RNA to inhibit protein synthesis and cause cell death (Fraser et al., 2004; Nataro and Kaper, 1998). The enzymatic action of the toxin is believed to contribute to the pathologies associated with HUS, such as inflammation and damage in renal endothelial cells (Kaper et al., 2004).

1.3.1 The locus of enterocyte effacement (LEE) pathogenicity island

Besides Stx, a 35-kb genetic segment also integrated into most EHEC genomes is the LEE (locus of enterocyte effacement) pathogenicity island that confers the characteristic attaching-and-effacing (A/E) phenotype to the bacteria. EHEC belongs to a family of A/E bacterial pathogens that also includes enteropathogenic E. coli (EPEC) and the mouse pathogen, Citrobacter rodentium.
The A/E pathogens adhere closely to local epithelial cells, destroy (efface) microvilli of the host intestine, and subvert the host actin cytoskeleton from an extracellular position to form localized actin pedestals beneath the attached bacteria (Figure 1.6). Their ability to induce the A/E phenotype is primarily dependent on the LEE pathogenicity island that encodes a type III secretion system (T3SS) and its translocated effectors (McDaniel et al., 1995). T3SS is a conserved protein secretion complex found in a wide variety of Gram-negative pathogens, including Escherichia, Yersinia, Salmonella, Shigella, Pseudomonas and Xanthomonas (Cornelis, 2010). These pathogens use this needle-like machinery, which forms a continuous conduit across the bacterial envelope to the target cell membrane, to inject virulence-associated effector proteins directly into the host cytoplasm (Figure 1.6; structural organization of the T3SS is discussed in the Appendix) (Worrall et al., 2011). The translocated effectors often mimic the structure and function of eukaryotic host proteins to modulate key signalling pathways and defence mechanisms to the benefit of the pathogen.

In order for EHEC to cause disease, successful colonization of host tissues must take place prior to infection. The E. coli common pilus (ECP) mediates initial attachment of EHEC to host epithelial cells (Rendon et al., 2007). Scanning electron microscopy analysis of ECP revealed surface ultrastructures that resemble flexible fibers and they stabilize bacterial interaction with the host cell membrane by forming physical bridges between neighbouring bacteria at the contact interface and by promoting direct binding to the host surface.

Figure 1.6 Intimate adherence of enterohemorrhagic E. coli to host cells. 1. Initial adhesion of EHEC to gut epithelial cells is mediated by surface pili (shown as black sticks). 2.
Effector proteins translocated through the type III secretion system induce attaching and effacing lesions that are characterized by actin pedestal formation. 3. Subversion of host cell cytoskeleton is mediated by Tir cytoskeleton-coupling protein (TccP), which binds to Tir through the host protein insulin receptor tyrosine kinase substrate (IRTKS). TccP interacts with N-WASP to stimulate actin recruitment via the ARP2/3 complex. Right: Electron micrograph of the attaching and effacing phenotype showing EHEC on top of a raised actin pedestal structure. Histopathology image reprinted with permission of Nature Reviews Microbiology (Croxen and Finlay, 2010).

More importantly, intimate attachment of EHEC to gut epithelial cells requires interaction between the LEE-encoded adhesin intimin in the bacterial outer membrane and Tir (translocated intimin receptor) (DeVinney et al., 2001) (Figure 1.6). EHEC senses environmental stimuli such as calcium levels and mammalian host cell surface and translocates Tir through the T3SS into the target cell, where Tir integrates into the plasma membrane and acts as a receptor for intimin (Deng et al., 2005). The cytoplasmic domain of Tir recruits TccP (Tir cytoskeleton-coupling protein), a bacterial effector that is also delivered into the host cytoplasm by the T3SS, to remodel the cytoskeletal network for enhanced bacterial attachment (Garmendia et al., 2004). TccP relieves auto-inhibition of the host protein N-WASP (neural Wiskott-Aldrich syndrome protein) by mimicking and thereby disrupting binding interactions of the auto-inhibitory structural motif and this allows N-WASP to activate the Arp2/3 complex (actin-related protein) to initiate actin nucleation and polymerization directly beneath the adhering bacteria (Cheng et al., 2008).
1.4 Characteristics of zinc metalloprotease effectors

The ability of EHEC to colonize and persist in the host cell environment is mediated by the concerted action of the effector proteins, which often target conserved aspects of eukaryotic cellular processes involved in cell signaling, cytoskeleton remodeling and stimulation of the host immune response (Spears et al., 2006). Bioinformatics and biochemical analyses of the medically relevant Sakai strain of EHEC O157:H7 identified a broad range of effectors, totaling more than 40 proteins that are encoded by or outside the LEE pathogenicity island (Tobe et al., 2006). Notably, many are localized within lambdoid prophages, suggesting that they may have been acquired through separate horizontal gene transfer events during the evolution of the pathogen. Among those that are associated with mobile genetic elements are several zinc metalloprotease virulence factors. Recently, EPEC homologues of NleC and NleD (non-LEE encoded effectors C and D) have been shown to cleave the transcription factor NF-κB (nuclear factor-κB) and JNK (c-Jun N-terminal kinase), respectively, in a zinc-dependent manner, to suppress the cognate signaling pathways that trigger the host inflammatory response (Baruch et al., 2011). Similarly, StcE, a zinc metalloprotease effector encoded by the pO157 virulence plasmid, can block immune activation by cleaving the complement regulator C1-INH (Lathem et al., 2004). StcE-mediated proteolytic activity also interferes with the ability of neutrophils to respond to bacterial infections through chemotaxis by cleaving specific immuno-modulatory proteins (CD43 and CD45) on the surface of the immune cells (Szabady et al., 2009). These proteases hydrolyze the target peptide bonds within their substrates in a mechanism requiring a catalytic metal ion.
As such, they belong to the EC (enzyme commission) subclass 3.4.24 and represent one of the major groups of enzymes in the MEROPS peptidase database, which distinguishes proteases by their catalytic types as metallo (M), aspartic (A), cysteine (C), serine (S) or threonine (T) (Rawlings et al., 2010).

1.4.1 Classification of zinc metalloproteases

In addition to the catalytic mechanism, MEROPS further classifies proteases according to a hierarchical system in which homologous members are grouped into families, which in turn are grouped into clans (Rawlings et al., 2010). Each family is denoted by a letter representing the catalytic type of the constituent enzymes followed by a unique number, and all members must share amino acid sequence similarity at least in the protease domain. A clan contains one or more families that appear to share the same evolutionary origin although they do not necessarily display significant homology at the protein sequence level. The evolutionary relationship is often drawn from similarities in their tertiary structures or, in the absence of available structures, from the order of catalytic residues and neighboring sequence motifs that are unique to a given clan. Each clan is identified by two letters, with the first representing the catalytic mechanism of the families it contains (Figure 1.7). In particular, Bacillus thermoproteolyticus thermolysin, one of the best studied enzymes and the first zinc metalloprotease described at a high-resolution structural level, belongs to the M4 family of clan MA (Figure 1.7) (Matthews et al., 1972).

Figure 1.7 Classification of zinc metalloproteases. Classification of representative zinc metalloprotease families and their zinc-binding motifs. Residues involved in catalysis are in bold and zinc-coordinating ligands are shown in italic bold letters. A more comprehensive list of known zinc metalloprotease families (37 unique clans identified to date) is available in the MEROPS database.
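The naming scheme described above (a catalytic-type letter plus a number for a family, two letters for a clan) can be illustrated with a small parser. This is only a sketch: the function name `parse_merops_id` and the returned dictionary layout are hypothetical, only the five catalytic types named in the text are recognized, and subclan labels such as MA(E) are deliberately not handled.

```python
import re

# Catalytic-type letters exactly as listed in the text; the real database
# may define further types, which this sketch does not attempt to cover.
CATALYTIC_TYPES = {
    "M": "metallo",
    "A": "aspartic",
    "C": "cysteine",
    "S": "serine",
    "T": "threonine",
}


def parse_merops_id(identifier):
    """Interpret a family ID (e.g. 'M66') or a clan ID (e.g. 'MA').

    Hypothetical helper: returns a dict describing the level and the
    catalytic type encoded by the first letter.
    """
    family = re.fullmatch(r"([A-Z])(\d+)", identifier)
    if family and family.group(1) in CATALYTIC_TYPES:
        return {
            "level": "family",
            "catalytic_type": CATALYTIC_TYPES[family.group(1)],
            "number": int(family.group(2)),
        }
    clan = re.fullmatch(r"([A-Z])([A-Z])", identifier)
    if clan and clan.group(1) in CATALYTIC_TYPES:
        return {
            "level": "clan",
            "catalytic_type": CATALYTIC_TYPES[clan.group(1)],
            "clan_letter": clan.group(2),
        }
    raise ValueError(f"not a recognized family or clan identifier: {identifier!r}")


# Thermolysin's family and clan, as given in the text.
print(parse_merops_id("M4"))
print(parse_merops_id("MA"))
```

Read this way, both M4 and MA resolve to the metallo catalytic type, consistent with the placement of thermolysin in family M4 of clan MA.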
Clan MA is by far the largest metalloprotease class in the MEROPS database. It encompasses an extensive list of protease families (37 to date), which may be further divided into subclans due to evolutionary divergence in their zinc-binding motifs (Figure 1.7). The active site of zinc metalloproteases from clan MA is characterized by a central substrate-binding cleft with specificity pockets that are suited for accommodating the amino acid side chains within the target recognition sequence. An active site α-helix containing the conserved metal-binding consensus sequence, HEXXH, contributes two His side chains for coordinating the catalytic zinc ion. Due to the involvement of this active site motif in zinc binding, members of the clan MA are also broadly referred to as “zincins” (Figure 1.7). A turn at the end of the active site helix leads to the third proteinaceous zinc ligand and this structural configuration is necessary for bringing the residue into the appropriate conformation for metal binding. Depending on the identity and position of the third zinc ligand, the zincins may be further classified into three distinct subclans: gluzincins, aspzincins and metzincins (Figure 1.7). The catalytic environment of gluzincins is typified by thermolysin, in which the conserved NEXXSD motif embedded along an α-helix is observed to approach the active site below the HEXXH-containing helix to present a glutamate (E) as the third zinc ligand (Figure 1.8) (Holmes and Matthews, 1982). Therefore, the gluzincins are additionally categorized as the MA(E) subclan (Figure 1.7). A similar active site architecture is also described for the aspzincin structure of deuterolysin, except that the zinc ion is bound within hydrogen bonding distance to an aspartate residue that is found in the GTXDXXYG motif C-terminal to the two His zinc ligands (Figure 1.7) (McAuley et al., 2001).
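As an illustration of how the consensus motifs quoted above might be located in a protein sequence, the sketch below translates HEXXH, NEXXSD and GTXDXXYG into regular expressions, reading X as any residue. The function name `find_motifs` and the toy sequence are hypothetical, and a motif match alone is not evidence of zinc binding; it only flags candidate positions for closer inspection.

```python
import re

# Consensus motifs quoted in the text, with X read as "any residue".
# Expressing them as regular expressions is an illustrative assumption.
ZINCIN_MOTIFS = {
    "HEXXH": "HE..H",
    "NEXXSD": "NE..SD",
    "GTXDXXYG": "GT.D..YG",
}


def find_motifs(sequence):
    """Return {motif name: [0-based start positions]} for motifs that occur."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in ZINCIN_MOTIFS.items():
        # A zero-width lookahead reports overlapping occurrences as well.
        positions = [m.start() for m in re.finditer(f"(?={pattern})", seq)]
        if positions:
            hits[name] = positions
    return hits


# Hypothetical toy sequence, not taken from any real protease.
toy_sequence = "MKAAHELGHAAAANEAASDAAA"
print(find_motifs(toy_sequence))  # {'HEXXH': [4], 'NEXXSD': [13]}
```

A sequence carrying both HEXXH and NEXXSD, as the toy example does, would be a candidate for the gluzincin pattern described for thermolysin.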
On the other hand, the metzincins feature a more extended metal-binding motif HEXXHXXGXX(H/D), in which a third histidine or aspartate provides the third zinc ligand (Figure 1.8). More specifically, a solvent molecule, along with the Nε atoms of the three conserved His side chains within the consensus motif (or an aspartate Oδ atom if the acidic residue replaces the third His ligand), coordinates the catalytic zinc ion in a tetrahedral geometry, with optimal bond angles of ~109.5° at the metal center and binding distances in a narrow range of ~2.0 to 2.1 Å (Harding, 2006). Located directly beneath the catalytic zinc ion and downstream from the metzincin metal-binding helix motif is a conserved 1,4-β-turn containing an invariant methionine residue, with its sulfur lone electron pairs oriented 5 to 6 Å away from the metal (Figure 1.8) (Gomis-Ruth, 2009). The conserved Met has been shown to be important for the structural integrity of the serralysin family of metzincins as mutations at this position led to a sub-optimal zinc-binding geometry, particularly in the conformations or side chain dihedral angles of the metal-liganding His residues, and this correlated with the decreased activity of the Met-turn mutants (Oberholzer et al., 2009). This “Met-turn” is a distinguishing feature of the metzincins, which are grouped accordingly as the MA(M) subclan (Figure 1.7). A preliminary survey of sequences containing such conserved motifs has identified EHEC StcE and related proteins as a new family of the metzincin clan (Gomis-Ruth, 2003). They are designated as the M66 family of zinc metalloproteases (Rawlings et al., 2010), which also includes ToxR-activated gene A (TagA) from the Vibrio cholerae pathogenicity island. No structural information for any member of this family was available prior to the work presented in this thesis.

Figure 1.8 Architecture of gluzincin and metzincin active sites.
Shown are the catalytic sites of the gluzincin thermolysin (8TLN) (Holland et al., 1992) (A) and the metzincin StcEΔ35E447D (3UJZ) (B), overlaid with the sigma-A weighted 2Fo-Fc map contoured at 1σ. The active site residues are highlighted in stick form. Note the tetrahedral coordination geometry of the zinc cation (red sphere). The glutamic acid general base in thermolysin (Glu143) is in proximity to abstract a proton from the nucleophilic water (beige sphere). The general base catalyst is replaced with an aspartic acid residue in StcEΔ35E447D. Hydrogen bonds are shown as black dashed lines.

In addition to the zincins, which include StcE and its related homologues, there are additional zinc metalloprotease families that do not possess the common HEXXH motif. Of note, the MC, MD and ME clans from the MEROPS database feature the unique metal-binding motifs, HXXE…H, HXH…H and the inverted consensus HXXEH…E, respectively (Figure 1.7) (Hooper, 1994).

1.4.2 Catalytic mechanism

Zinc metalloproteases are characterized by the requirement for a zinc cation in the active site to hydrolyze target peptide bonds. Much insight has been gained through structural studies investigating the zincin thermolysin and its interactions with a series of transition state analogue inhibitors (Holmes and Matthews, 1981; Monzingo and Matthews, 1984; Tronrud et al., 1986). The metal ion assists in the catalytic process in a variety of ways. It polarizes the zinc-bound solvent to depress its pKa value, making the water more acidic to facilitate nucleophilic attack on the susceptible peptide (Figure 1.9). The nucleophilicity of the catalytic solvent can be further enhanced by the glutamic acid residue within the zincin motif HEXXH. Upon substrate binding, the zinc-bound water is displaced toward the glutamate, which acts as a base catalyst to abstract a proton from the water to generate a reactive hydroxide ion.
In an addition step, the lone pair of electrons on the deprotonated solvent then mediates nucleophilic attack on the carbonyl carbon of the scissile peptide bond. During this process, not only does the zinc cation polarize the carbonyl of the substrate to render it more electrophilic, it also acts as a penta-coordinated Lewis acid in a subsequent step to stabilize the negative charge on the sp3-hybridized oxyanionic transition state (Figure 1.9). In an elimination step, the tetrahedral intermediate is converted to hydrolysis products as the proton accepted by the Glu general base is transferred to the leaving amide nitrogen, completing the cleavage reaction (Figure 1.9).

Figure 1.9 Hydrolytic mechanism of zinc metalloproteases. The carbonyl carbon of the scissile peptide bond is polarized by the catalytic zinc cation. The general base abstracts a proton from the catalytic solvent to facilitate the nucleophilic attack onto the carbonyl carbon of the substrate (addition). The carboxyanionic tetrahedral transition state is stabilized by the positively charged zinc. The general base shuttles a proton to the leaving amide-nitrogen to generate the cleavage products.

The essential nature of the zinc coordination geometry in metalloprotease-mediated catalysis has been examined through experiments where the native zinc ion in thermolysin was systematically replaced with various transition metals (Holland et al., 1995; Holmquist and Vallee, 1976). It was shown that substitutions of zinc with Mg2+, Cr2+, Ni2+, Cu2+, Mo2+, Pb2+, Hg2+, Cd2+ and Pr2+ barely restored the proteolytic activity to the wild-type (WT) level. In particular, high-resolution structures of the Cd2+- and Fe2+-substituted enzymes indicate that the reduced catalytic efficiency is likely due to their inability to adopt, as the native zinc ion does, tetrahedral and penta-coordinate geometries in the Michaelis complex and transition state, respectively.
Cd2+ and Fe2+ prefer a hexa-coordinate octahedral geometry, which distorts the thermolysin active site and causes conformational changes that deviate from the ideal positions for the glutamic acid general base and other neighboring residues (Holland et al., 1995). Thus, the zinc cation plays a key role in catalyzing substrate hydrolysis.

According to the Schechter and Berger nomenclature, substrate residues on the amino side of the scissile bond are successively referred to as the non-prime residues P1, P2, etc., and those on the carboxyl side as prime residues P1’, P2’, etc. (Figure 1.10) (Schechter and Berger, 1967). Along the substrate-binding groove, complementary protease subsites S1 and S1’ accommodate the corresponding P1 and P1’ residues of the bound peptide, and this set of enzyme-substrate interactions dictates the specificity of the protease.

Figure 1.10 Nomenclature of protease-substrate interactions. The Schechter and Berger convention designates residues N-terminal to the scissile peptide bond as the non-prime side (P) and C-terminal residues as prime side (P’). The complementary protease subsites are referred to as S and S’.

As observed in the crystal structure of B. thermoproteolyticus thermolysin, the active site features a prominent hydrophobic S1’ pocket (lined by Phe130, Leu133, Val139 and Leu202) which influences the substrate specificity of the metalloprotease (Figure 1.11) (de Kreij et al., 2001; Matthews, 1988). The size and non-polar nature of the thermolysin S1’ subsite is found to be optimal for accommodating Leu at the P1’ position. This knowledge about the protease subsite preferences, coupled with the established details on its zinc-mediated catalytic mechanism, has been exploited to develop potent inhibitors of the metalloprotease.
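The Schechter and Berger numbering can be made concrete with a short sketch that labels each residue of a peptide relative to a chosen scissile bond. The function name `label_subsites`, its indexing convention and the example peptide are hypothetical and chosen only for illustration; the peptide simply places Leu at P1’ to echo the thermolysin preference noted above.

```python
def label_subsites(peptide, p1_prime_index):
    """Label residues as P3, P2, P1, P1', P2', ... around a scissile bond.

    p1_prime_index is the 0-based position of the P1' residue, so the
    cleaved bond lies between peptide[p1_prime_index - 1] (P1) and
    peptide[p1_prime_index] (P1').
    """
    if not 0 < p1_prime_index < len(peptide):
        raise ValueError("the scissile bond must lie between two residues")
    labels = []
    for i, residue in enumerate(peptide):
        if i < p1_prime_index:
            # Non-prime side: numbering increases toward the N-terminus.
            labels.append((f"P{p1_prime_index - i}", residue))
        else:
            # Prime side: numbering increases toward the C-terminus.
            labels.append((f"P{i - p1_prime_index + 1}'", residue))
    return labels


# Hypothetical hexapeptide cleaved between Phe (P1) and Leu (P1').
for label, residue in label_subsites("GAFLAG", 3):
    print(label, residue)
```

Each label maps onto the matching protease subsite (P1 onto S1, P1' onto S1', and so on), which is how the text relates substrate positions to the binding groove.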
In the high-resolution structure of thermolysin in complex with the phosphoramidate inhibitor N-phosphoryl-L-leucinamide (P-Leu-NH2), the conformation of the inhibitor is observed to mimic the tetrahedral transition state intermediate in which PO1 (phosphoryl oxygen atom 1) of the phosphoramidate group coordinates the zinc with a binding distance of 2.1 Å, displacing the catalytic solvent to prevent substrate hydrolysis (Figure 1.11). The leucyl substituent of the inhibitor projects into the hydrophobic S1’ pocket as initially predicted from the structure of the apo-enzyme (Figure 1.11) (Tronrud et al., 1986). These structure-guided studies are invaluable in designing inhibitors against metalloprotease drug targets and may be conducted to obtain insights about the specificity and inhibitory profile of the virulence-associated StcE protease.

Figure 1.11 Structure of thermolysin in complex with the inhibitor N-phosphoryl-L-leucinamide. The phosphoramidate inhibitor (yellow with the phosphorus atom in orange) is bound in the thermolysin active site in a conformation that mimics the tetrahedral transition state intermediate. It displaces the catalytic solvent to coordinate the zinc cation and its leucyl substituent is nestled in the S1’ pocket formed by Phe130, Leu133, Val139 and Leu202. Hydrogen bonds are shown as black dashed lines (2TMN) (Tronrud et al., 1986).

1.5 Enterohemorrhagic E. coli StcE

1.5.1 Role in bacterial intimate adherence

EHEC has the ability to penetrate the mucosal lining and adhere intimately to eukaryotic host cells but the molecular mechanisms of this process, which is required for both colonization and infection, are only partially understood. The zinc metalloprotease StcE, which EHEC secretes through the type II general secretory pathway during infection, has been implicated in assisting the penetration step (Lathem et al., 2002; Paton and Paton, 2002).
Its expression from the pO157 virulence plasmid is activated by Ler, a global regulator of various genes essential for EHEC pathogenesis, including the LEE (locus of enterocyte effacement)-encoded type III secretion system and its translocated effectors (Lathem et al., 2002). Indeed, Grys et al. demonstrated that StcE plays a critical role in colonization by unmasking the host cell surface through proteolytic cleavage to allow intimate adherence of EHEC to gut epithelial cells (Grys et al., 2005). The decreased virulence of the Aeromonas hydrophila tagA mutant (64% similarity with StcE) in a mouse model of infection also supports the hypothesized function of StcE as a host surface-clearing mucinase (Pillai et al., 2006). It was further concluded that colonization defects of an EHEC type II secretion knockout might be related to StcE deficiency (Ho et al., 2008).

The role of StcE in virulence is attributed to its zinc metalloprotease-mediated mucinase activity toward specific mucin-type glycoproteins. It is capable of degrading and reducing the viscosity of the mucus layer via cleavage of mucin 7 and glycoprotein 340 present in saliva and other tissues (Grys et al., 2005). These proteins form part of the host innate immune system by functioning as receptor decoys for microbial adhesins to facilitate the clearance of bacteria-protein aggregates from the host. Thus, destruction of these defence molecules by StcE would allow EHEC to overcome the mucosal barrier for necessary contact with the epithelium to establish infection.

1.5.2 Evasion of host immune defence

In addition to its role in mediating host cell colonization, StcE is believed to contribute to immune evasion. Once attached to colon epithelial cells, EHEC secretes various effector proteins that lead to lesions in the endothelium and intestinal lining, and the infiltration of leukocytes and complement proteins into the gut lumen (Nataro and Kaper, 1998).
StcE specifically targets the CD45 and CD43 membrane-bound glycoproteins, which are found exclusively on the leukocyte cell surface (Szabady et al., 2009). Both host sialoglycoproteins carry the mucin-type glycosylation required for StcE recognition (Figure 1.2), and their cleavage provides a further competitive advantage to the pathogen by allowing it to modulate the host immune system. CD43, also referred to as leukosialin, contains a large extracellular domain that is heavily O-glycosylated. The high density of negatively charged O-linked glycans causes the protein to adopt an extended conformation and project 45 nm from the cell surface to mediate cell signalling events and regulate cell-cell contacts (Cyster et al., 1991a). During cell migration, CD43 is clustered at the trailing end of polarized leukocytes. The so-called anti-adhesive force afforded by the sialoglycoprotein clusters is believed to propel cell movement (Seveau et al., 2000). StcE has been demonstrated to specifically target the mucin-like extracellular domain of CD43, and the loss of these sialoglycoproteins due to StcE proteolytic activity prevents immune cells from moving to the sites of infection and mounting the appropriate inflammatory response. Indeed, StcE-treated neutrophils exhibited poor chemotactic mobility (Szabady et al., 2009). On the other hand, the effect of the StcE-mediated cleavage of the protein tyrosine phosphatase CD45 is less clear. It has been proposed that the proteolytically modified phosphatase may dimerize and that this enhances the neutrophil oxidative burst to interfere with the normal immune response, although the exact mechanism is not yet fully understood (Szabady et al., 2011).

Notably, StcE also targets the complement system.
As briefly mentioned above, StcE can inhibit proteolytic activation of the complement cascades by potentiating the serine protease inhibitory activity of human C1-esterase inhibitor, the principal regulator of initiation proteases in the pathways (Figure 1.3) (Lathem et al., 2004; Lathem et al., 2002). As shown by FACS (fluorescence-activated cell sorting) experiments, this is mediated by the ability of StcE to recruit C1-INH to the cell surface through an as yet unknown surface-binding region, as the serpin has no inherent cell-binding properties (Lathem et al., 2004). To further probe the StcE/C1-INH interaction, a recombinant C1-INH containing only the serpin domain without the N-terminal mucin-like region was expressed and treated with StcE. The metalloprotease left the C1-INH fragment intact, which is consistent with its requirement for specific O-glycosylated motifs for target recognition (Lathem et al., 2004). Taken together, these results suggest that StcE bridges the complement regulator to the cell membrane, where the C-terminal serpin domain is then in proximity to inactivate downstream components of the complement cascades. As a result of the inhibition, serum-sensitive E. coli K-12 became more resistant to complement-mediated cell lysis in the presence of StcE-treated C1-INH (Lathem et al., 2004).

StcE is a member of the M66 zinc metalloprotease family, none of whose members have been characterized at the structural level (Rawlings et al., 2010). Although it is inferred that StcE binds to specific O-glycosylated regions of its substrates (Lathem et al., 2004; Szabady et al., 2009), the mode of interaction and details of its specificity requirements are unknown.
To obtain mechanistic insight into StcE function and to understand more fully how it subverts multiple mucosal defenses and the complement system of innate immunity, we have carried out X-ray crystallographic analysis of StcE and characterized its interaction with its substrate C1-INH using complementary biochemical and biophysical approaches. This work represents the first crystal structure reported for a mucin-type O-glycoprotein-specific protease.

Chapter 2: Materials and methods

2.1 Molecular cloning

The gene encoding Escherichia coli O157:H7 stcE residues 36 to 989 devoid of the signal sequence (accession code: O82882) was amplified from genomic DNA using the primers 5’-CCGCAGGCTAGCGCTGATAATAATTCAGCC-3’ and 5’-CCGCAGCTCGAGTTATTTATATACAACCCTCATTGACC-3’. The PCR fragment was ligated into the NheI and XhoI sites of the pET28b expression vector (Novagen). The resulting construct, pET28bStcEΔ35, served as a template for all subsequent DNA manipulations to generate additional StcE variants. Site-directed mutagenesis to introduce the catalytically inactive E447D mutation was performed with the primer pair 5’-GGGAATGAGTTCAGTCATGACGTTGGTCATAATTATGG-3’ and 5’-CCATAATTATGACCAACGTCATGACTGAACTCATTCCC-3’ by following the QuikChange protocol (Stratagene).

The surface entropy reduction mutant K318A/K320A/E321A was generated through the same procedure with the primers 5’-CGGGATCGCTTTGATTTTGCCGCAGACGCAGCAGCACATAGGG-3’ and 5’-CCCTATGTGCTGCTGCGTCTGCGGCAAAATCAAAGCGATCCCG-3’. In designing the truncation mutant devoid of the StcE insertion domain (Gly150-Phe243), the domain boundaries were chosen according to secondary structure predictions, as well as observed features in the crystal structure of the surface entropy reduction mutant, StcEΔ35K318A/K320A/E321A/E447D, to minimize perturbations to the overall structure and stability.
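Site-directed mutagenesis primer pairs of the kind quoted above are normally designed as exact reverse complements of one another. The following minimal sketch checks that property for the E447D and K318A/K320A/E321A pairs as printed in this section; the helper names are illustrative and not part of any protocol used in this work.

```python
# Translation table mapping each base to its Watson-Crick partner
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_complementary_pair(forward, reverse):
    """True if two primers (both written 5' to 3') anneal end to end."""
    return reverse_complement(forward) == reverse.upper()


# Primer pairs copied from the cloning section above
PRIMER_PAIRS = {
    "E447D": (
        "GGGAATGAGTTCAGTCATGACGTTGGTCATAATTATGG",
        "CCATAATTATGACCAACGTCATGACTGAACTCATTCCC",
    ),
    "K318A/K320A/E321A": (
        "CGGGATCGCTTTGATTTTGCCGCAGACGCAGCAGCACATAGGG",
        "CCCTATGTGCTGCTGCGTCTGCGGCAAAATCAAAGCGATCCCG",
    ),
}

for name, (fwd, rev) in PRIMER_PAIRS.items():
    print(name, "pair is complementary:", is_complementary_pair(fwd, rev))
```

A check of this kind is a quick way to catch transcription errors in primer sequences before ordering oligonucleotides.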
DNA sequences corresponding to residues Gly150-Phe243 were removed by QuikChange using the primer pair 5’-GGATGGTGTTCCGGAAGGTCGCTCCGGTGAACTGGAG-3’ and 5’-CTCCAGTTCACCGGAGCGACCTTCCGGAACACCATCC-3’. For the expression of the insertion domain on its own, the region encoding residues His132-Asn251 of StcE was amplified with the primers 5’-CCGCAGGCTAGCCATCTGGATGGTGTTCCGGAAG-3’ and 5’-CCGCAGCTCGAGTTAATTATTCTCCAGTTCACCGGAG-3’, and the PCR product was ligated into the NheI and XhoI sites of the pET28b expression vector (Novagen). The integrity and identity of all expression constructs were confirmed by DNA sequencing.

2.2 Protein expression and purification

E. coli BL21 (λDE3) transformed with plasmid constructs of StcE variants was grown to mid-exponential phase at 37°C in Luria-Bertani (LB) medium containing 50 µg.mL-1 kanamycin. Protein expression was induced with 0.3 mM isopropyl-β-D-thiogalactopyranoside (IPTG). After overnight induction at 20°C, cells were harvested, disrupted by a pressurized homogenizer (Avestin) in lysis buffer (20 mM HEPES, pH 7.5 and 500 mM NaCl) and centrifuged at 25,000 x g for 45 minutes. The clarified lysate containing the His-tagged target protein was loaded onto zinc-chelating Sepharose and eluted with imidazole (20 mM HEPES, pH 7.5, 150 mM NaCl and 100 mM imidazole). The low concentration of imidazole used for the elution was critical, as StcE was prone to precipitation at high imidazole concentrations. The protein was further purified sequentially by MonoQ anion-exchange and Superdex-200 HR 10/30 columns (GE Healthcare) to >95% purity as judged by SDS-PAGE (sodium dodecyl sulfate-polyacrylamide gel electrophoresis). The affinity tag was cleaved with thrombin prior to size exclusion chromatography and the protein was kept in the storage buffer (20 mM HEPES, pH 7.5 and 150 mM NaCl).
For the isolation of the StcEΔ35E447D/human C1-INH complex, the His-tagged catalytically inactive metalloprotease was incubated with the serpin (commercially available through Molecular Innovations) overnight to allow complex formation. The mixture was loaded onto zinc-chelating Sepharose to remove free C1-INH and subsequently eluted with buffer containing 100 mM imidazole. The affinity tag was removed with thrombin overnight and the cleaved sample was further purified on a Superdex-200 HR 10/30 column to allow separation of the metalloprotease-serpin complex from any unbound protease.

2.3 Multiangle light scattering

Purified protein samples (0.5-1 mg.mL-1) were injected onto a Superdex-200 HR 10/30 column equilibrated with buffer (20 mM HEPES, pH 7.5 and 150 mM NaCl) at room temperature and connected in line to a miniDAWN multiangle light scattering instrument with an interferometric refractometer (Wyatt Technologies). For data analysis, the associated ASTRA software package was used to estimate molecular mass using the Debye fit method.

2.4 Proteolytic assay

StcE variants were incubated with human C1-INH in a 1:20 (wt/wt) ratio in 20 mM HEPES, pH 7.5 and 150 mM NaCl at 37°C. Aliquots (10 µL) were removed from the 100 µL total reaction volume at 0, 15, 45, 150, 240 and 360 min, 23 hr and 48 hr, and the reaction was terminated with EDTA (final concentration 50 mM). Bands corresponding to C1-INH were visualized using Pro-Q Emerald 300 glycoprotein stain (Molecular Probes). For N-terminal sequencing of the C1-INH fragment, the protein bands were transferred to PVDF membrane and stained with Coomassie blue. The band of interest was excised from the membrane and analyzed by N-terminal sequencing facilities at Iowa State University and the Hospital for Sick Children in Toronto.
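The 1:20 (wt/wt) enzyme-to-substrate ratio used in the proteolytic assay can be converted into an approximate molar ratio. The sketch below assumes molecular masses of ~100 kDa for StcE and ~73 kDa for C1-INH, the approximate values estimated by light scattering in this work; the function name is illustrative and the result is only as precise as those mass estimates.

```python
def substrate_per_enzyme(w_enzyme, w_substrate, mass_enzyme_kda, mass_substrate_kda):
    """Moles of substrate per mole of enzyme for a given weight ratio."""
    mol_enzyme = w_enzyme / mass_enzyme_kda          # relative moles of enzyme
    mol_substrate = w_substrate / mass_substrate_kda  # relative moles of substrate
    return mol_substrate / mol_enzyme


# Assumed masses: StcE ~100 kDa, C1-INH ~73 kDa (approximate estimates)
ratio = substrate_per_enzyme(1, 20, 100.0, 73.0)
print(f"1:20 (wt/wt) corresponds to roughly 1:{ratio:.0f} StcE:C1-INH (mol/mol)")
```

Under these assumptions the substrate is in roughly 27-fold molar excess over the protease, so complete conversion requires multiple catalytic turnovers.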
2.5 Limited proteolysis

Purified StcE was mixed in a 1:1000 (wt/wt) ratio with proteases of different cleavage specificities, namely chymotrypsin (preference for aromatic residues at the P1 position), trypsin (basic residues at P1) and subtilisin (large, uncharged residues at P1). The reaction mixtures were incubated at room temperature for 15 min, 2.5 hr and 15 hr before they were terminated and analyzed by SDS-PAGE.

2.6 Inductively coupled plasma mass spectrometry analysis

The metal content of StcEΔ35K318A/K320A/E321A/E447D was analyzed by inductively coupled plasma mass spectrometry with a PerkinElmer NexION 300 instrument that was calibrated with the Instrument Calibration Standard 2 (SPEX CertiPrep). The metals examined include zinc, magnesium, manganese, iron, cobalt, nickel and copper.

2.7 Crystallization and data collection

Initial needle clusters were obtained from the catalytically inactive StcEΔ35E447D. We tried various methods to improve the diffraction quality of the crystals, such as seeding, iterative screening and chemical modification, but to no avail. Eventually, the surface entropy reduction approach yielded larger crystals of the triple (E447D/K736A/E737A) and quadruple (K318A/K320A/E321A/E447D) mutants. Crystals of the quadruple mutant exhibited superior diffraction properties (~3.5 Å at home source) compared to those of the triple mutant (maximum resolution of ~7 Å) and were grown by the hanging-drop vapor diffusion method at 18 °C. The drops consisted of a 1:1 ratio of StcEΔ35K318A/K320A/E321A/E447D (7 mg.mL-1) and the optimized precipitant mixture (0.2 M MgCl2, 8% PEG 8000 and 0.1 M MES, pH 6.5). Single crystals were cryoprotected by brief soaking in the mother liquor supplemented with 25% glycerol and flash-frozen in liquid nitrogen.
Since the majority of the quadruple mutant crystals showed severe anisotropy in the diffraction, extensive manual screening was required before a reasonably well-diffracting selenomethionine (SeMet)-labeled crystal was identified. Diffraction data were collected at the Canadian Light Source 08ID-1 beamline, which is equipped with a double crystal monochromator, vertical focusing mirror and Rayonix MX300 CCD X-ray detector at the end station. A redundant SAD dataset collected at the selenium peak (0.97907 Å) for phasing was processed using MOSFLM (Battye et al., 2011) and SCALA (1994).

Crystals of the StcE insertion domain (His132-Asn251) were grown at 18 °C using the vapor diffusion setup by mixing equal volumes of the protein (20 mg.mL-1) and reservoir solution (26% PEG 2000MME and 0.1 M trisodium citrate, pH 5.5). The crystals were derivatized by brief soaking in 0.5 M NaI for halide phasing and flash-cooled in the same reservoir solution supplemented with 15% glycerol. Diffraction data containing anomalous signals from the iodides were collected using in-house copper radiation at the CuKα wavelength (1.54 Å) and recorded on the mar345 X-ray image plate detector. Data were reduced and scaled following the same procedure as for the aforementioned quadruple mutant data set.

2.8 Structure determination and refinement

Initial experimental phases were obtained with SHARP, as run through the automatic structure solution program autoSHARP (Vonrhein et al., 2007), which located 10 out of a possible 17 selenium sites. Subsequent solvent flipping with SOLOMON from autoSHARP resulted in a reasonably interpretable map for the core of StcE. The model was built manually using Coot (Emsley et al., 2010). Iterative rounds of refinement were performed using REFMAC (Murshudov et al., 1997), with parameters describing TLS (translation/libration/screw) motion (Painter and Merritt, 2006), and BUSTER (Blanc et al., 2004) to improve modeling of the poorly resolved regions.
The TLS parameters were generated by the TLS motion determination server, which modeled the flexible StcE structure as contiguous groups of atoms undergoing rigid-body motion. The optimal TLS groups, as judged by the diminishing net TLS residual, are composed of: residues 39-115, 116-139 and 248-310, 311-411, 412-512, 513-519, 520-685, 686-800 and 801-898. The high Wilson B (61 Å²) and overall refined B (83 Å²) are perhaps not surprising, given the modular and dynamic nature of StcE. Initial iodide phases for domain INS were also obtained using autoSHARP. Following similar procedures for model building and refinement, its orientation in the overall StcE structure was determined by spherically averaged phased translation using SeMet phases from the quadruple mutant. We compared the relative orientation between M-SD1 and M-SD2 among StcE homologues by measuring the angle between structurally equivalent residues (Cα of 3UJZ, R372/H446/P535; 1IAG, S71/H142/D180; 2ERO, T267/H335/D373; 2DW2, A263/H333/N371). Structure figures were prepared with PyMol. The ConSurf server was used to identify functionally and structurally important residues (Glaser et al., 2003; Landau et al., 2005).

2.9 Peptide docking

Molecular docking was performed with AutoDock Vina, using default values for the docking parameter file (Trott and Olson, 2010). The program merged non-polar hydrogen atoms by adding the charge of each to the carbon atom to which it is bonded and assigned charges to the atoms in the coordinate files. A grid box of 22 x 22 x 18 grid points with 1 Å spacing covering the immediate active site region was defined as the search space. Docking results were manually inspected for structural and chemical complementarity. One of the evaluation criteria was that the distance between zinc and the P1 carbonyl oxygen be less than 3 Å.

2.10 Sucrose density gradient centrifugation

Cell pellets collected from 200 mL cultures of E.
coli BL21 (λDE3) expressing StcE variants were lysed by a pressurized homogenizer in buffer A (20 mM Tris, pH 7.5). Cell debris was removed by a low-speed spin at 10,000 x g (JA10 rotor, Beckman), followed by a high-speed spin at 140,000 x g (Ti60 rotor, Beckman) to isolate the total membranes. The membrane pellet was washed in buffer A, centrifuged again and homogenized in 2 mL buffer A. Membrane fragments were then layered onto a sucrose step gradient consisting of 0.8 mL 55%, 2 mL 50%, 2 mL 45%, 2 mL 35% and 0.8 mL 30% sucrose and centrifuged in a SW41 swinging-bucket rotor (Beckman) at 180,000 x g for 16 hr.

2.11 Circular dichroism spectroscopy

CD spectra of 0.25 mg.mL-1 WT StcE and the mutant devoid of the insertion element were recorded in a Jasco J-810 spectropolarimeter at 25°C. All spectra were the average of 4 scans. The optical path length was 0.1 cm.

2.12 Glycan array screening

The glycan array binding analysis was performed at the Consortium for Functional Glycomics. StcEΔ35E447D, fluorescently labeled with the Alexa Fluor® 488 dye (Molecular Probes), was screened with version 4.2 of the printed mammalian glycan microarray containing 511 ligands in replicates of 6. The highest and lowest fluorescence readings were excluded from statistical analysis of the data in an attempt to eliminate false hits.

2.13 Microcalorimetry

Isothermal titration calorimetry experiments were conducted using a MicroCal ITC200 instrument (GE Healthcare). All titrations were performed in 20 mM HEPES, pH 7.5 and 150 mM NaCl at 25°C. For the binding assay involving StcEΔ35E447D and human C1-INH, 2 µL aliquots of 175 µM C1-INH were injected at 3 min intervals into the calorimetric cell containing 15 µM of the inactive metalloprotease. Control experiments that measured the heats of dilution were carried out by injecting C1-INH into the reaction buffer and by injecting the same buffer into StcEΔ35E447D in separate titrations.
Both control reactions yielded insignificant heats of dilution. For the metal and small molecule ligand binding studies, syringe solutions containing 10 mM Ca2+ and 2 mM of the di-sialylated core I tetrasaccharide, Neu5Acα2-3Galβ1-3(Neu5Acα2-6)GalNAcα-threonine (kindly provided by Dr. James Paulson at the Scripps Research Institute), were injected into StcEΔ35E447D at concentrations of 1 mM and 100 µM, respectively. All titrations were performed in duplicate, or more times if the results of the first two trials were inconsistent. A representative titration curve is shown for each set of binding experiments described in the Results section.

2.14 Enzymatic deglycosylation of StcEΔ35E447D/C1-INH

The purified complex containing glycosylated human C1-INH was enzymatically deglycosylated using peptide N-glycosidase F (PNGaseF; New England Biolabs) in 20 mM Tris, pH 7.5, 50 mM NaCl at a final concentration of 1000 units.mL-1 overnight at room temperature. To improve the efficiency of carbohydrate cleavage, the enzymatically treated sample was further deglycosylated using 2500 units of PNGaseF in the presence of the chaotrope urea (final concentration of 1.5 M) with 5% glycerol added as a stabilizing agent. After overnight incubation at room temperature, the complex was buffer-exchanged on a Superdex-200 HR 10/30 column into 20 mM HEPES, pH 7.5, 150 mM NaCl and 5% glycerol prior to crystallization experiments.

2.15 In vitro chemical cross-linking of StcEΔ35E447D/C1-INH

The cross-linking agents disuccinimidyl suberate (DSS), dithiobis[succinimidylpropionate] (DSP) and ethylene glycolbis[succinimidylsuccinate] (EGS) (Pierce) were dissolved in DMSO (dimethylsulfoxide), and increasing volumes were added to the purified complex, resulting in a range of final cross-linker concentrations (0.3 mM to 2 mM).
The reactions were incubated in 20 mM HEPES, pH 7.5, 100 mM NaCl at room temperature for 30 min before they were quenched with 1 M Tris, pH 7.5 (final concentration of 50 mM) and analyzed by SDS-PAGE.

Chapter 3: Results

3.1 Characterization of full-length StcE

3.1.1 Recombinant expression and purification of StcE

The Welch group at the University of Wisconsin has published many of the significant findings in the literature related to StcE. In terms of biochemical characterization of the enzyme, they expressed StcE as a fusion to a chitin-binding domain and an intein protease (Grys et al., 2006). The chitin-binding domain allows purification using a chitin affinity column, while the addition of free cysteine or the reducing agent DTT induces autoproteolytic activity of the intein protease to release StcE from the fusion domains. Slight degradation was observed in the purified sample (Figure 3.1A) and the two breakdown products were identified by N-terminal sequence analysis as coming from StcE.

Figure 3.1 Recombinant expression of a stable, mature form of StcE. (A) Purification of the intein-StcEΔ35 fusion. Coomassie blue staining revealed the presence of StcE degradation products (~50 kDa and ~60 kDa) in addition to the expected ~100 kDa native StcE band after the induction of intein autoproteolytic cleavage. (B) Purification of His-tagged StcEΔ35. The purified protein remained stable after the removal of the affinity tag and this expression vector was used for the construction of all subsequent StcE domain variants.

Upon closer inspection, there are six cysteine residues in the StcE sequence and the disulfide bond prediction program diANNA suggests four of them could potentially form disulfide pairs (Ferre and Clote, 2006). Disulfides are sensitive to reducing conditions and therefore StcE may become less stable and more susceptible to degradation when exposed to DTT.
Although the degradation was minor, sample homogeneity is often critical to the formation of an ordered crystal lattice for structural studies. For this reason, an N-terminal His-tagged construct was designed to circumvent the use of reducing agents.

StcE contains an N-terminal signal sequence that is recognized by the Sec protein translocation machinery, and the protein is exported via the type II secretion system (T2SS) to the extracellular space during EHEC infection (Lathem et al., 2002; Paton and Paton, 2002). A His-tagged, mature form of StcE devoid of its secretion signal was expressed and purified sequentially by zinc-chelating affinity Sepharose, anion exchange and size exclusion chromatography to >95% homogeneity as judged by SDS-PAGE (Figure 3.1B) (the construct spans residues 36-989 and is referred to hereafter as full-length StcE or StcEΔ35). The concentrated protein appeared stable and remained intact over an extended period at 4°C (~1 month). The same purification scheme was employed for all StcE variants used in subsequent crystallization trials, biophysical analysis, array study and biochemical experiments.

3.1.2 Oligomerization state of proteolytically active StcE

StcE contains the extended zinc-binding motif HExxHxxGxxH found in zinc metalloproteases and has been shown to cleave a number of heavily O-glycosylated substrates, namely C1-INH, mucin 7, glycoprotein 340, CD43 and CD45 (Grys et al., 2005; Lathem et al., 2002; Szabady et al., 2009). To determine whether the recombinant StcE is functionally active, a proteolytic assay using human C1-INH was performed. C1-INH was chosen as the model substrate because it is commercially available and is more amenable to biochemical characterization than other substrates that contain membrane-associated regions, such as CD43 and CD45. Cleavage of C1-INH by the full-length protease resulted in a fragment which migrated as a ~65 kDa species on an SDS-PAGE gel (Figure 3.2A).
The size of the proteolytic product is consistent with that reported previously in a study where C1-INH was first identified as a physiological substrate of StcE (Lathem et al., 2002).

Figure 3.2 Recombinant StcE is proteolytically active as a monomer. (A) Time-dependent cleavage of C1-INH by WT StcE and the catalytic mutant E447D. Purified recombinant StcEΔ35 degraded C1-INH into a ~65 kDa fragment while the E447D mutant was inactive. Protein bands corresponding to the mucin-like glycoprotein substrate C1-INH were specifically visualized by Pro-Q Emerald 300 glycoprotein stain. Time was measured in minutes unless otherwise indicated (hour, hr). The lane representing the cleavage of C1-INH by StcEΔ35E447D was taken from the image of a separate SDS-PAGE gel. (B) Multiangle light scattering analysis of StcEΔ35 is shown as the molar mass versus volume plot (represented by the thick horizontal red line) overlaid with the gel filtration elution profile (represented by the thin red curve). The intensity of laser light scattered by the particles in solution is proportional to the molar mass and registered as the light scattering signal. The red horizontal line indicates a uniform particle size distribution of the StcEΔ35 sample. The molecular mass calculated across the StcE elution peak from the gel filtration column is consistent with that of a monomeric species (~100 kDa).

To determine whether oligomerization is involved in its function, StcE was subjected to static light scattering analysis, which is a noninvasive technique to measure the molecular mass of macromolecules in solution based on their interaction with laser light. The result showed that the full-length active protease exists as a monomer in solution. The particle distribution of the sample is monodisperse, with an estimated molecular mass of ~100 kDa that is close to the expected size (Figure 3.2B).
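The reasoning behind the monomer assignment can be sketched as a simple comparison of the mass measured by light scattering with the expected mass of one polypeptide chain. The function below is illustrative only; the ~100 kDa values are the approximate figures quoted above, and the 200 kDa case is a hypothetical dimer included for contrast.

```python
def oligomeric_state(observed_kda, monomer_kda):
    """Estimate the number of subunits from an observed molecular mass.

    Returns the nearest whole number of subunits (at least 1) together
    with the raw observed/monomer mass ratio.
    """
    if observed_kda <= 0 or monomer_kda <= 0:
        raise ValueError("masses must be positive")
    ratio = observed_kda / monomer_kda
    return max(1, round(ratio)), ratio


# Approximate values quoted in the text: observed ~100 kDa, expected ~100 kDa
n_subunits, mass_ratio = oligomeric_state(100.0, 100.0)
print(f"subunits: {n_subunits} (observed/expected = {mass_ratio:.2f})")

# Hypothetical dimer for comparison
print(oligomeric_state(200.0, 100.0)[0])
```

A ratio close to an integer, combined with a flat molar mass trace across the elution peak, is what supports describing the sample as a monodisperse monomer.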
3.1.3 Characterization of StcE/C1-INH interactions

Static light scattering analysis of the metalloprotease-serpin complex

StcE is believed to attenuate complement-mediated cell lysis by localizing the host protein C1-INH to the cell surface, where the immunomodulating serpin is in proximity to inhibit nearby serine proteases responsible for complement activation (Figure 1.3). Flow cytometry experiments revealed that while C1-INH by itself has no affinity for cell surfaces, StcE binds directly to cells in a specific, saturable manner (Lathem et al., 2004). This suggests that the bacterial effector likely serves as a physical bridge between the host serpin and the cell periphery as it down-regulates the host immune response. At saturating concentrations of the metalloprotease, approximately 1.8 x 10^6 molecules of StcE are present on a given cell and an estimated 2.25 x 10^6 molecules of C1-INH are bound proportionally (Lathem et al., 2004). This translates into a binding ratio slightly greater than 1:1 (about 1.25 molecules of C1-INH per StcE) based on the FACS (fluorescence-activated cell sorting) results. To further define the binding stoichiometry and better understand the mode of interaction between StcE and its substrate C1-INH, light scattering experiments were performed on the StcE/C1-INH complex and its individual components to determine their oligomerization states.

Residue E447 of the H446ExxHxxGxxH456 motif is predicted to be the general base that abstracts a proton from the catalytic water to assist nucleophilic attack on the peptide substrate (Lathem et al., 2002). Therefore, E447 was mutated to the shorter D447 to prevent hydrolysis of C1-INH and to trap the substrate in a metalloprotease-serpin complex. In agreement with previous mutagenesis studies on StcE (Lathem et al., 2002), our E447D mutant is proteolytically inactive toward C1-INH (Figure 3.2A).
The serpin was then incubated with the inactive, His-tagged protease and the mixture was purified on a zinc-chelating column to remove any unbound serpin. Size exclusion chromatography followed as an additional purification step to separate any unbound protease, and a stable complex of StcEΔ35E447D/C1-INH was generated. Static light scattering analyses of the protease/serpin complex, C1-INH alone, and StcE on its own yielded molecular mass estimates of ~170, ~73 and ~100 kDa, respectively (Figure 3.3). This corresponds to a binding stoichiometry of 1:1 between the bacterial effector and its host target.

Figure 3.3 StcEΔ35E447D binds human C1-INH in a 1:1 ratio. Multiangle light scattering analysis of (A) StcEΔ35E447D, (B) C1-INH and (C) StcEΔ35E447D/C1-INH complex is shown as an overlay with the respective gel filtration elution profile. The shift in molecular weight suggests a 1:1 binding stoichiometry. The blue trace corresponds to the UV absorption detector output and red to the light scattering signal. Peaks selected for the molecular weight estimate are delineated within dark lines. The extra peaks preceding the elution peaks of C1-INH in (B) and the protease-serpin complex in (C) correspond to large molecular weight species eluting in the exclusion volume of the Superose 6 gel filtration column. The relative amount of these high MW species was negligible, as indicated by the low UV absorbance at the corresponding elution positions.

Isothermal titration calorimetry analysis of C1-INH binding

To better understand the energetics associated with the recognition of C1-INH by StcE, isothermal titration calorimetry (ITC) was used to study this protein-protein interaction. During biomolecular binding events, energy is released or absorbed.
ITC is a thermodynamic technique that directly measures the heat evolved during these interactions to determine the enthalpy of binding (ΔH), binding constant (Ka), reaction stoichiometry (n), and entropy change (ΔS) that are unique to the systems being studied (Velazquez-Campoy et al., 2004). To quantify these thermodynamic parameters, equivalent amounts of a ligand are titrated against its binding partner at regular time intervals, with the heat change coupled to each ligand injection proportional to the extent of binding (Figure 3.4A). As the system moves toward saturation with more and more binding sites occupied by the ligand, the heat signal decreases. The change in heat eventually reaches a plateau where only the heat of dilution due to excess ligand is detected (Figure 3.4A).

To characterize the StcE/C1-INH complex, the serpin substrate was successively titrated against the inactive metalloprotease. Interactions between the two proteins were detected by ITC (the instrument is sensitive to sub-millimolar to nanomolar binding affinity), and the reaction was largely exothermic (Figure 3.4B). A binding curve was constructed by plotting the heat released during each injection against the corresponding molar ratio of C1-INH to StcE. The inflection point of this curve indicates the molar ratio at which the available StcE binding sites are occupied by C1-INH. This suggests a binding ratio of 1:1, confirming the stoichiometry obtained from the static light scattering approach (Figures 3.3 and 3.4B).

Figure 3.4 Isothermal titration calorimetry of StcEΔ35E447D binding to human C1-INH. (A) Example of a typical ITC experiment. Upper panel: raw data of an ITC isotherm illustrating an exothermic binding reaction. Lower panel: integration of the data with the binding curve fitted to a single site model. The enthalpy of binding (ΔH) is determined as the change in integrated heat as the system reaches saturation.
The reaction stoichiometry (N) is indicated by the molar ratio at the inflection point of the binding curve and the association constant (Ka) by the slope. (B) Binding curve for the StcE/C1-INH titrations reflects an exothermic reaction. The anomalous pattern observed during the early injections cannot be sufficiently fitted to calculate the binding constants.

The ITC curve was also fitted with the one-site binding model to derive the related changes in enthalpy and entropy, which reflect the strength of binding between StcE and C1-INH, as well as the thermodynamic forces driving formation of the complex. Unfortunately, the complex pattern of the StcE/C1-INH isotherm is not well described by a simple one-site binding model or others currently available in the data analysis software. Therefore, accurate values for ΔH and ΔS were not determined. This likely reflects the complex binding mode exhibited by the interaction between StcE and its substrates, as inferred from the structure of the protease (details will be discussed in later sections describing the crystal structure).

3.1.4 Proteolytic specificity of EHEC StcE

A previous truncation study showed that recombinant C1-INH containing the C-terminal serpin domain alone, without the N-terminal mucin-like region, has no binding affinity for StcE (Lathem et al., 2004). This is consistent with the finding that StcE does not proteolytically inactivate native C1-INH to disrupt its serine protease inhibitor function. Rather, it interacts with the mucin-type N-terminus of C1-INH (Figure 1.2). Similar binding patterns were also observed for the extracellular mucin-like domains of the StcE targets CD43 and CD45. Western blotting and FACS experiments both showed that StcE treatment of these transmembrane glycoproteins left their non-glycosylated domains intact while proteolytic cleavage occurred within the mucin-like regions (Szabady et al., 2009).
Although StcE has been shown to specifically require interaction with O-glycosylated regions of its substrates, little is known about its substrate recognition sequence.

To obtain better insight into substrate specificity, C1-INH was incubated in the presence of WT StcE and the major cleavage product, which has an apparent molecular weight (60-65 kDa) comparable to that reported in the literature (Lathem et al., 2002), was subjected to N-terminal sequence analysis to identify the cleavage site. The sample was analyzed by two independent protein sequencing facilities. Both reports identified SPLQP as the dominant prime-side residues (P1’-P5’).

The sequenced peptide closely matches S113PTQP117 in the published sequence of human C1-INH, except for a T115L substitution (Figure 3.5). The mutation likely reflects a sequence variant of C1-INH. The residues identified by N-terminal sequencing correspond to those within the N-terminal O-glycosylated domain, in agreement with the finding that the StcE-bound serpin remains active, capable of inhibiting downstream serine proteases to block complement activation (Lathem et al., 2004). Now that the cleavage sequence in C1-INH has been mapped to the mucin-type N-terminus, we generated a sequence alignment comparing the glycosylated extracellular domains of C1-INH, CD43 and CD45 in an attempt to derive a potential recognition sequence for substrate cleavage. However, no clear consensus was readily identified between the P4-P4’ positions of the aligned substrates (L109PTD/SPLQ116 in C1-INH, where / indicates the cleavage site). The difficulty may be partly due to the presence of multiple PTS-rich repeats found in mucin-type sequences (Szabady et al., 2011).

Figure 3.5 Identification of the StcE cleavage site in C1-INH. Confirmed N- and O-glycosylation sites are denoted by filled triangles and circles, respectively. Potential O-glycosylated residues are marked by open circles.
Boxed residues indicate P1 and P1’ positions of the StcE cleavage site as determined by N-terminal sequencing analysis. Dashed line underlines the signal sequence. Residue numbering is indicated above the protein sequence.

3.1.5 Mammalian glycan array screening

The host substrates targeted by StcE are mucin-type proteins rich in O-linked glycans. As such, the molecular determinants responsible for the protease/substrate interactions likely involve carbohydrates. In large, multi-domain carbohydrate-active enzymes, accessory carbohydrate-binding modules often play a critical role in catalysis by holding the catalytic domain in proximity to the substrate (Boraston et al., 2004), although position-specific iterative BLAST searches identified no obvious sequence similarity to such modules in the ~100 kDa StcE (Altschul et al., 1997). To identify potential carbohydrate structural motifs that StcE recognizes, we subjected a catalytically inactive version of the full-length protease (StcEΔ35E447D) to mammalian glycan array analysis at the Consortium for Functional Glycomics. These microarrays contain diverse N-linked and O-linked glycans to enable efficient screening for binding. They have been used successfully to compare the relative affinities for different ligands and to identify the glycan specificity of various microbial virulence factors that recognize mammalian glycoconjugates, most notably the viral hemagglutinin receptors (Xu et al., 2012).

StcE was screened against 511 ligands contained in version 4.2 of the printed mammalian glycan microarray and the results revealed binding to core I O-linked glycans. Specifically, the top ligand identified in the screening was the di-sialylated core I tetrasaccharide, Neu5Acα2-3Galβ1-3(Neu5Acα2-6)GalNAcα-threonine (Figure 3.6A). Core I sugars are defined by the attachment of Galβ1-3GalNAc- to the hydroxyls of serine or threonine residues via an α-glycosidic linkage to the GalNAc moiety (Figure 3.6A) (Van den Steen et al., 1998).
They represent the most common type of O-GalNAc glycans and may be elaborated with additional sugars in highly regulated reactions catalyzed by glycosyltransferases in the Golgi (Varki et al., 2009). Post-translational modification of folded proteins with O-GalNAc, the basic constituent of core I, is also classified as glycosylation of the mucin-type (Figure 3.6A), and binding of the microarray ligand Neu5Acα2-3Galβ1-3(Neu5Acα2-6)GalNAcα-threonine containing this carbohydrate motif is thus consistent with the known substrate preference of StcE for mucin-type glycoproteins.

Figure 3.6 Mammalian glycan array screening of StcEΔ35E447D. (A) Chemical structure of the di-sialylated core I tetrasaccharide, Neu5Acα2-3Galβ1-3(Neu5Acα2-6)GalNAcα-threonine, which bound StcE with the highest relative affinity. The core I O-glycan motif, Galβ1-3GalNAcα-threonine, is outlined within the black dashed line, including the underlying mucin-type structure, GalNAcα-threonine. (B) The strength of binding for each glycan array target represented on the X-axis is proportional to the measured fluorescent intensity (shown on the Y-axis). Note the preferential binding of StcE to the O-glycan in (A) over a similar ligand substituted with the aliphatic linker -CH2CH2CH2CH2-. The error bar associated with each fluorescence signal measurement refers to the standard error of the mean.

Earlier studies that probed the carbohydrate content of C1-INH also identified sialic acids (Neu5Ac) attached in similar glycosidic linkages and Galβ1-3GalNAc in the serpin using Maackia amurensis, Sambucus nigra and peanut agglutinins, respectively (Schoenberger, 1992). This lends support to the biological relevance of the glycan microarray findings.
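The ranking behind a plot such as Figure 3.6B can be sketched in a few lines of Python. This is only an illustration, assuming that each printed glycan has several replicate fluorescence readings; the ligand labels and intensity values below are hypothetical placeholders, not data from the screen, and the processing actually applied by the array facility is not described in this section.

```python
import math
from statistics import mean, stdev


def summarize(replicates):
    """Return (mean, standard error of the mean) for one glycan's replicate spots."""
    m = mean(replicates)
    sem = stdev(replicates) / math.sqrt(len(replicates))
    return m, sem


def rank_glycans(readings):
    """Sort glycans from strongest to weakest mean fluorescence."""
    summary = {name: summarize(vals) for name, vals in readings.items()}
    return sorted(summary.items(), key=lambda kv: kv[1][0], reverse=True)


# Hypothetical relative fluorescence units (illustration only)
readings = {
    "di-sialylated core I-Thr": [900.0, 1100.0, 1000.0, 1000.0],
    "di-sialylated core I-linker": [200.0, 240.0, 220.0, 220.0],
    "unrelated N-glycan": [40.0, 60.0, 50.0, 50.0],
}

ranked = rank_glycans(readings)
for name, (avg, sem) in ranked:
    print(f"{name}: {avg:.0f} +/- {sem:.1f}")
```

The standard error of the mean is the quantity shown as error bars in the figure; comparing the means relative to their errors is what supports a statement of "preferential" binding to one ligand over another.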
3.1.6 Isothermal titration calorimetry of glycan and C1-INH peptide ligands

Having identified the di-sialylated core I tetrasaccharide Neu5Acα2-3Galβ1-3(Neu5Acα2-6)GalNAcα-threonine in the fluorescent screening of StcE for mammalian glycan binding (Figure 3.6), we next tried to determine the strength of the interaction by isothermal titration calorimetry. Unlike the exothermic reaction observed upon binding of StcE to full-length human C1-INH (Figure 3.4B), no significant amount of thermal energy, other than the background heats of dilution, was evolved after the addition of the synthetic glycan (Figure 3.7A). This likely indicates a weak interaction (or none) that is below the detection limit of the instrument (reliable determination of the dissociation constant requires 10⁻⁸ M < Kd < 10⁻⁴ M (Velazquez-Campoy et al., 2004)). Similar titration experiments with the synthetic non-glycosylated C1-INH octapeptide, L109PTDSPLQ116 (based on the sequence identified by N-terminal sequencing), also yielded a negligible amount of heat (Figure 3.7B). The fact that we were unable to detect any peptide-bound species suggests that productive binding may require more extensive subsite interactions with the substrate, involving both the peptide and carbohydrate moieties (see Discussion).

Figure 3.7 Isothermal titration calorimetry of StcEΔ35E447D binding to the O-linked glycan and C1-INH peptide ligands. An insignificant amount of thermal energy was generated from the titrations of StcE with the di-sialylated core I tetrasaccharide, Neu5Acα2-3Galβ1-3(Neu5Acα2-6)GalNAcα-threonine (A), and with the non-glycosylated C1-INH octapeptide, L109PTDSPLQ116 (B), as shown by their respective raw ITC isotherms in the upper panels and integrated heats below. These results indicate negligible binding of StcE to the synthetic ligands.
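To illustrate why a weak ligand produces almost no signal in this kind of experiment, the standard 1:1 binding equilibrium can be simulated. The sketch below is a simplified Python model, not the fitting routine of the instrument software: it ignores heats of dilution and the volume displaced from the cell, and every numerical value (cell and syringe concentrations, volumes, ΔH, Kd) is a hypothetical choice made for illustration.

```python
import math


def complex_conc(m_total, l_total, kd):
    """Concentration (M) of 1:1 complex from total concentrations and Kd.

    Solves [ML]^2 - (Mt + Lt + Kd)[ML] + Mt*Lt = 0 for the physical root.
    """
    b = m_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * m_total * l_total)) / 2.0


def injection_heats(m_cell, l_syringe, kd, dh, v_cell, v_inj, n_inj):
    """Heat released per injection (units of dh times mol).

    Simplification: cell volume and protein concentration are held constant.
    """
    heats = []
    previous = 0.0
    for i in range(1, n_inj + 1):
        l_total = l_syringe * i * v_inj / v_cell  # ligand accumulated in the cell
        current = complex_conc(m_cell, l_total, kd)
        heats.append(dh * v_cell * (current - previous))
        previous = current
    return heats


# Hypothetical experiment: 20 uM protein in a 1.4 mL cell, 200 uM ligand, 10 uL injections
params = dict(m_cell=20e-6, l_syringe=200e-6, dh=-10_000.0,  # dh in cal/mol
              v_cell=1.4e-3, v_inj=10e-6, n_inj=25)

tight = injection_heats(kd=1e-7, **params)  # inside the measurable window
weak = injection_heats(kd=1e-2, **params)   # far weaker than the window

print(f"first injection, Kd = 1e-7 M: {tight[0] * 1e6:.2f} ucal")
print(f"first injection, Kd = 1e-2 M: {weak[0] * 1e6:.4f} ucal")
```

With these assumed values the tight binder releases most of its heat in the early injections and then saturates, whereas the weak binder yields heats several hundred times smaller, which in practice would be indistinguishable from the dilution background.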
3.1.7 Limited proteolysis of StcE

StcE has a calculated molecular weight of ~100 kDa while the size of most metalloproteases usually ranges from 25 to 60 kDa (Grys et al., 2006). It is likely that in addition to the protease domain, StcE contains non-catalytic modules that impart its unique function as a cell surface-binding, O-glycoprotein specific mucinase. The EHEC virulence effector has no significant sequence homology to entries in the PDB and the only putative conserved domain detected by sequence analysis is the peptidase module (annotated as residues 249-550) found in all members of the M66 metalloprotease family (Rawlings et al., 2010). To obtain more detailed mechanistic insight on StcE function and understand its domain organization, one of our aims is to determine the structure of the full-length protease.

Designing expression constructs containing discrete domains of the full-length target is a common strategy used in structural biology research, as smaller functional domains may be more stable and amenable to structure determination (Derewenda, 2010). Both full-length StcE and the conserved metalloprotease domain (residues 249-550) were cloned (Figure 3.8); however, the catalytic domain could not be expressed as a stable, independently folded module. In an attempt to delineate domain boundaries of the large metalloprotease, StcE was subjected to limited proteolysis using three different proteases. StcE has been reported to be fairly robust and retained the ability to cleave C1-INH after incubation with trypsin, chymotrypsin, neutrophil elastase and Pseudomonas aeruginosa elastase (Grys et al., 2006). Thus, limited proteolytic treatment of the bacterial effector should not interfere with the characterization of its functionally relevant domains.
The relatively stable nature of StcE may have physiologically important implications, as it likely remains active while it encounters various host and microbial proteases during passage through the human digestive tract and in the colon, where EHEC colonizes. Results of treating purified StcE with a minimal amount of subtilisin, trypsin and chymotrypsin showed that none of the proteases generated stable fragments corresponding to distinct domains that could then be expressed individually. Instead, a laddering pattern of proteolytic fragments was observed (Figure 3.9). This indicates the presence of various highly flexible regions that are solvent-exposed and accessible to proteases, suggesting that the StcE structure may be quite dynamic.

Figure 3.8 Predicted domain boundaries of the conserved M66 metalloprotease domain in EHEC StcE. EHEC StcE contains a putative metalloprotease domain as identified by a preliminary sequence alignment of the M66 family of proteins. The predicted span of the catalytic domain, depicted by the pink bar below the sequence alignment, is annotated in the Conserved Domain Database as residues Glu249-Val550. The conserved region contains the extended metzincin zinc-binding motif H446ExxHxxGxxH456 and the catalytic residues therein are denoted by red star symbols. Secondary structure representation of EHEC StcE (derived from our crystal structure, 3UJZ) is displayed above the alignment and the consensus sequence below. Sequences that share >65% similarity are in red and boxed in blue. Conserved residues are shaded in red. GenBank accession codes for the StcE-like proteins are as follows: ABD59014 refers to Aeromonas hydrophila; ABO91303, A. salmonicida; CAP16261, Escherichia coli O55:H7; ADT96714, Shewanella baltica OS678; AEB48888, A. veronii B565; EEZ40955, Photobacterium damselae CIP102761; EEZ85628, Vibrio harveyi 1DA3; EDL70231, V. harveyi HYϕ1; EDL69213, V. cholera HY ϕ1.

Figure 3.9 Limited proteolysis of StcEΔ35E447D.
Purified StcE was proteolyzed with 1:1000 (wt/wt) dilutions of chymotrypsin (C), subtilisin (S) and trypsin (T) at room temperature for 15 min, 2.5 hr and 15 hr before the reactions were terminated and analyzed by SDS-PAGE. The negative control (U), in which StcE was incubated under the same conditions in the absence of additional proteases, showed that the low molecular weight bands were generated as a result of the proteolytic treatment rather than sample breakdown. The positions of the molecular weight markers are indicated on the left side of the SDS-PAGE gel.

3.1.8 Surface entropy reduction engineering of StcE

Initial high-throughput crystallization screening of full-length, wild-type StcE utilizing sparse matrix screens under sitting-drop vapor diffusion did not yield any promising lead conditions. Reductive methylation of surface-exposed lysine residues is a method routinely used to promote crystallization and improve the diffraction of protein crystals (Kim et al., 2008). The technique is based on the rationale that flexible amino acid side chains, such as those of lysine residues on the protein surface, have high conformational entropy that is unfavorable for crystal formation. Methylation of their basic side chains, in turn, is believed to reduce charge repulsion and create thermodynamically favorable protein interfaces for crystal assembly. Unfortunately, the methylated wild-type StcE, in which available surface accessible Lys were chemically modified as deduced by mass spectrometry, remained recalcitrant to crystallization (its increased molecular mass in the MALDI-TOF (matrix-assisted laser desorption/ionization-time of flight) spectrum suggests that approximately 35 of the 45 lysine residues in StcEΔ35 were methylated).

Crystallization screening of the StcEΔ35E447D catalytic mutant was also conducted in parallel in the hope that it would be more stable and thus might crystallize more readily.
Previous inductively coupled plasma mass spectrometry revealed a higher zinc content that was closer to the expected 1:1 molar ratio for the mutant than the wild-type (Grys et al., 2006), suggesting that StcEΔ35E447D indeed may be more stable. We tried to supplement the WT StcE preparation with additional zinc but the resulting sample was prone to precipitation. Although initial fine needle clusters of StcEΔ35E447D were obtained, their limited size and poor diffraction properties made them unsuitable for structure determination.

Surface entropy reduction (SER) mutagenesis is another rational protein engineering method aimed at improving crystallization (Derewenda, 2011). The SER concept is similar to the aforementioned reductive methylation approach but differs in that it does not indiscriminately modify all surface accessible lysine residues. Rather, the method involves selective mutation of predicted surface Lys and Glu containing sequence clusters to more compact alanines to promote crystal contacts.

Applying this strategy to the catalytically inactive construct StcEΔ35E447D, four SER mutants were generated: K318A/K320A/E321A, K482A/K483A, K736A/E737A and K803A/E804A/E806A. The SER mutations do not overlap with known active site residues in the conserved zinc-binding sequence, H446ExxHxxGxxH456, and all four variants were purified to homogeneity using the same procedure as WT StcE. The quadruple mutants K318A/K320A/E321A/E447D and E447D/K803A/E804A/E806A yielded crystals of different space groups, although only K318A/K320A/E321A/E447D diffracted to sufficient resolution (better than 3 Å). Furthermore, obtaining quality diffraction data was hampered by the severe anisotropic diffraction displayed by the majority of the surface-engineered crystals. Diffraction anisotropy refers to inconsistencies in the diffraction quality (in terms of resolution limit, for instance) as a function of the direction of the crystal.
It often arises because crystal packing between adjacent unit cells is more ordered in one direction than in another. After extensive crystal screening by optimizing the growth and cryoprotectant conditions, and manually testing and evaluating the diffraction quality of numerous crystals at the synchrotron, the StcE structure was eventually solved using single-wavelength anomalous dispersion (SAD) data collected from crystals of selenomethionine (SeMet)-incorporated protein of K318A/K320A/E321A/E447D, which were obtained under 0.2 M MgCl2, 8% PEG 8000 and 0.1 M MES, pH 6.5 (Table 3.1). Initial selenium SAD phases were obtained with SHARP, run through the automatic structure solution program autoSHARP (Vonrhein et al., 2007), which located 10 out of a possible 17 selenium sites. The overall figure of merit for phasing was 0.39. Density modification with SOLOMON from autoSHARP resulted in a reasonably interpretable map for the core of StcE. The model was built manually using Coot (Emsley et al., 2010). Refinement with TLS (translation/libration/screw) in REFMAC (Murshudov et al., 1997) and BUSTER (Blanc et al., 2004) was necessary to describe the motion inherent in the multi-domain protease and this resulted in improvement in the crystallographic residuals R and Rfree.
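As a quick consistency check on the crystal parameters reported in Table 3.1, the Matthews coefficient and an approximate solvent content can be estimated from the unit cell. The Python sketch below makes several assumptions that are not stated in the text: a molecular weight of 100 kDa (the rounded value quoted for StcE, not the exact mass of the crystallized construct), eight asymmetric units per cell for space group C2221, and the commonly used 1.23/V_M approximation for the protein fraction.

```python
def matthews_coefficient(a, b, c, z, mol_per_au, mw_da):
    """V_M in cubic angstroms per dalton for a cell with all angles at 90 degrees."""
    cell_volume = a * b * c
    return cell_volume / (z * mol_per_au * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M using the 1.23/V_M estimate."""
    return 1.0 - 1.23 / vm


# Cell dimensions and molecules per asymmetric unit from Table 3.1.
# z = 8 (C2221) and mw_da = 100,000 are assumptions made for this sketch.
vm = matthews_coefficient(76.53, 186.00, 188.76, z=8, mol_per_au=1, mw_da=100_000)
solvent = solvent_fraction(vm)

# Redundancy should equal total reflections divided by unique reflections.
redundancy = 275104 / 47052

print(f"V_M = {vm:.2f} A^3/Da, solvent = {solvent:.0%}, redundancy = {redundancy:.1f}")
```

Under these assumptions the crystal would have a fairly high solvent content, and the ratio of total to unique reflections agrees with the redundancy listed in the table.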
Table 3.1 Data collection and refinement statistics for SeMet-labeled crystals of StcEΔ35K318A/K320A/E321A/E447D

Data collection
  Space group                 C2221
  Mol/au                      1
  Cell dimensions
    a, b, c (Å)               76.53, 186.00, 188.76
    α, β, γ (°)               90.00, 90.00, 90.00
  Resolution (Å)              2.5
  Total reflections           275104
  Unique reflections          47052
  Rpim^a                      5.5 (36.9)^b
  I/σI                        9.2 (1.8)
  Completeness (%)            100 (100)
  Redundancy                  5.8 (5.9)

Refinement
  Rwork/Rfree                 0.21/0.25
  Average B-factors (Å²)
    Protein                   83
    Ion                       53
    Water                     63 (175 waters)
  R.m.s. deviations
    Bond length (Å)           0.010
    Bond angles (°)           1.200
  Ramachandran plot
    Favored (%)               95
    Disallowed (%)            0.4

^a Multiplicity-weighted Rmerge (Diederichs and Karplus, 1997).
^b Values in parentheses are for the highest-resolution shell.

3.1.9 Structural analysis of StcE

Overall architecture

The crystal structure of apo-StcE was refined to a resolution of 2.5 Å. The overall structure of the ~100 kDa protease reveals a dynamic, modular architecture with three distinct globular domains, which we designate as IG, M (composed of M-SD1 and M-SD2) and C in sequential order (Figure 3.10A-C). These three domains are packed against each other and this results in an overall T-shape, with IG and M forming the arms and C the body. The N-terminal domain IG adopts an immunoglobulin-like fold that consists of complementary discontinuous segments spanning residues 57-122 and 257-287. Within domain IG, there is a sequence-variable region with weak electron density, which we designate as INS (insertion), and this region (His132-Arg252) inserts between strands 5 and 6 of the immunoglobulin β-sandwich (Figures 3.9A-C and 3.10, a sequence alignment of M66 metalloproteases).

Figure 3.10 Overall architecture of StcE. (A) Domain boundaries observed in the StcE crystal structure.
The two subdomains of the metalloprotease module (M-SD1 and M-SD2) are colored in green, immunoglobulin-like domain (IG) in orange, insertion region (INS) in yellow, C-terminal domain (C) in purple and remaining disordered regions (D) in blue. SS stands for signal sequence. (B) Secondary structure and surface representation of StcE. The color scheme matches that in (A). Catalytic residues are shown in stick form, zinc and nucleophilic water as red and beige spheres, respectively. (C) 90° rotation of the molecule in (B). (D) Electrostatic surface representation of StcE in the same orientation as in (C), colored from -4 kT/e (red, negative) to +4 kT/e (blue, positive) using APBS (Baker et al., 2001). (E) and (F) The molecule in (D) is rotated 90° and 180°, respectively.

Immediately following domain IG is the metalloprotease or catalytic module, which is referred to as domain M (Glu296-Val550). Domain M consists of two subdomains: subdomain 1 (M-SD1) and subdomain 2 (M-SD2). Overall, domain M adopts a characteristic mixed α+β fold and contains a conserved Met-turn (StcE Met518), features that place it within the metzincin superfamily of zinc metalloproteases (Rawlings et al., 2010). Metzincins encompass several related families, including the structurally characterized astacins, ADAMs (a disintegrin and metalloprotease), matrixins, serralysins, snapalysins, leishmanolysins and pappalysins (Gomis-Ruth, 2009), as well as additional new ones that have been assigned as metzincins based on primary sequence data, such as the StcE-like family of enzymes.

Figure 3.11 Sequence alignment of select StcE homologues. The alignment reveals strong sequence conservation in the metalloprotease domain. Secondary structure representation of EHEC StcE is displayed above the alignment and the consensus sequence below. Domain boundaries observed in the StcE structure are shaded according to the color schemes in Figures 3.8 and 3.10A.
Note that the predicted sequence coverage of the conserved metalloprotease module (pink bars) differs from the actual domain boundaries observed in the crystal structure (green bars). Sequences that share >65% similarity are in red and boxed in blue. Conserved residues are shaded in red. Star symbols denote the catalytic residues. Sequences corresponding to putative chitin-binding modules are boxed in purple and conserved residues involved in carbohydrate interactions are also highlighted in purple. GenBank accession codes for the StcE-like proteins are as follows: ABD59014 refers to Aeromonas hydrophila; ABO91303, A. salmonicida; CAP16261, Escherichia coli O55:H7; ADT96714, Shewanella baltica OS678; AEB48888, A. veronii B565; EEZ40955, Photobacterium damselae CIP102761; EEZ85628, Vibrio harveyi 1DA3; EDL70231, V. harveyi HYϕ1; EDL69213, V. cholera HY ϕ1.

In the StcE crystal structure, the metalloprotease domain appears to form close interactions with the neighboring non-catalytic modules. M-SD2, which comprises part of the catalytic core, is cradled by domain IG and visible portions of INS, resulting in a large buried surface area of ~3955 Å² (Figure 3.10B-C). Similarly, the partially resolved region 679-798 (D2) docks against M-SD1 on the opposite face of the active site. Viewing toward the active site, the D2 region and domains M and IG are nearly coplanar, with a long axis measuring ~80 Å in total (Figure 3.10B-C). Extending in opposite directions away from this axis are disordered structural elements, INS and D1 (D1 refers to Arg578-Arg676 and its location is judged by residual density and weak anomalous signals likely reflecting SeMet594, 598 and 608 in this region). Taken together, this unique arrangement of distinct, non-catalytic domains observed around the metalloprotease domain likely allows for additional regulatory functions including substrate recognition.
Although these accessory domains do not participate directly in substrate cleavage, they often modulate substrate specificity, as shown by those of the related metzincin procollagen C-proteinase (PCP) in restricting PCP’s substrate profile (Wermter et al., 2007). Finally, the chain terminates in domain C (Glu804-Lys898), which is tethered to D2 through a short linker and adopts a β/γ-crystallin fold. This domain is located ~180° from the active site and makes minimal contact with the rest of the protein (Figures 3.9A and C).

Overall electrostatic surface features

The electrostatic surface potential of StcE calculated based on the crystal structure reveals a highly polarized surface charge distribution for the metalloprotease (Figures 3.9D-F). Specifically, a large electronegative surface exists on one face of the molecule, contiguous along the three domains IG, M and C. In contrast, the opposite face is primarily electropositive and hydrophobic. This unusual asymmetric charge distribution likely plays a role in substrate targeting at the host-pathogen interface. This is reminiscent of the electrostatic features utilized by the carbohydrate-binding sialidase/trans-sialidase family. Structures of Micromonospora viridifaciens sialidase and Macrobdella decora intramolecular trans-sialidase display a prominent negatively charged surface that is located on the reverse side of the sugar-binding active site (Gaskell et al., 1995; Luo et al., 1998). Charge repulsion from the electronegative face of the enzymes is believed to orient the more electropositive substrate-binding site toward anionic glycoconjugate targets (Figure 3.12). At the same time, the highly acidic face also promotes reversible binding to the negatively charged sialoglycoconjugates and this serves to increase the overall efficiency of the enzymes (Luo et al., 1998).
It is conceivable that StcE operates through a similar electrostatic mechanism to direct the protease active site toward mucin-type glycoprotein substrates (Figure 3.10D-F).

Figure 3.12 Surface charge asymmetry in Macrobdella decora intramolecular trans-sialidase. (A) View of the active site face of the trans-sialidase (PDB 1SLL) with the electrostatic surface potential colored from -4 kT/e (red, negative) to +4 kT/e (blue, positive). The catalytic and lectin-like domains, together with the positively charged groove in domain II, are implicated in carbohydrate recognition. (B) 180° rotation of the molecule in (A) reveals that the opposite face is primarily acidic.

Metalloprotease domain

The StcE structure we have determined is the first reported for the M66 metalloprotease family. StcE possesses the extended zinc-binding motif HExxHxxGxxH within its catalytic domain (Glu296-Val550), whose boundaries were ambiguous from sequence analysis and are now clearly defined by our crystal structure (Figure 3.10). Confirming its tentative assignment as a new metzincin family (Gomis-Ruth, 2003), domain M features a variation of the metzincin fold. Notably, a central substrate-binding cleft containing the active site separates this domain into two subdomains M-SD1 and M-SD2 (Figures 3.9B-C). M-SD1 consists of a twisted β-sheet (order β6, β2, β1, β3, β5, β4) packed on both sides by 3 helices (H1-H3). A sharp turn introduced by the conserved Gly453 in the zinc-binding motif along the active site helix (H3) then leads to M-SD2, which, after a series of convoluted loops, terminates in H4. In addition to this topology typical of metzincins, StcE contains unique structural elements that might have evolved to allow this protease to cleave densely O-glycosylated substrates.
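The extended zinc-binding motif HExxHxxGxxH, where x is any residue, can be located in a protein sequence with a simple pattern search. The following Python sketch is only an illustration of that idea; the sequence used is a made-up toy string, not the StcE sequence, and the function name and residue-numbering offset are hypothetical conveniences.

```python
import re

# H-E-x-x-H-x-x-G-x-x-H, with "." standing for any single residue
METZINCIN_MOTIF = re.compile(r"HE..H..G..H")


def find_motif(sequence, first_residue_number=1):
    """Return (residue number of the first His, matched text) for each occurrence."""
    return [
        (match.start() + first_residue_number, match.group())
        for match in METZINCIN_MOTIF.finditer(sequence)
    ]


# Toy sequence invented for this example (not a real protein)
toy_sequence = "MKTAYLLHEAGHALGLKHSNDPQ"
hits = find_motif(toy_sequence)
print(hits)
```

The motif spans eleven residues, which is consistent with the numbering H446ExxHxxGxxH456 used for StcE: the three zinc-liganding histidines fall at the first, fifth and eleventh positions, and the conserved glycine at the eighth.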
Accessible substrate-binding cleft

In related single-domain zinc metalloproteases, namely thermolysin and astacin, ligand binding triggers closure of the ligand-free, open conformation via subdomain motion and reorients active site residues for catalysis (Grams et al., 1996; Holland et al., 1992). By overlaying our structure with structurally homologous metzincins, we found that the StcE active site cleft appears to have a wider angular opening than others in their analogous “open” conformations (Figure 3.13A). The relative orientation of M-SD1 and M-SD2 is ~10° wider in StcE (see Section 2.8 of Materials and Methods for details on how the angle was determined), as is the spacing between corresponding elements known to mediate binding of primed-side residues of the substrate via β-sheet-like hydrogen bonds. These include structural elements connecting S3 and S4, which exhibit unusually high curvature in StcE, and a variable loop joining the conserved methionine (StcE Met518) and the C-terminal H4 in M-SD2 (Figure 3.13B) (Gomis-Ruth, 2009).

Figure 3.13 StcE metalloprotease domain. (A) Superposition of the StcE metalloprotease domain, in green, with related metzincins (yellow, PDB 1IAG with a Z-score of 6.4; wheat, 2ERO, 6.5; blue, 2DW2, 6.8) along the active site helix. The wider angular opening between M-SD1 and M-SD2 in StcE is suitable for accommodating large, densely O-glycosylated substrates. The catalytic zinc ion is shown as a red sphere. (B) Structural elements known to mediate substrate binding in metzincins are highlighted in green. The active site flap in red is unique to StcE. Note the high curvature in the strand preceding S4. Catalytic residues are shown in stick form, zinc as a red sphere and nucleophilic water as a beige sphere.
Furthermore, the M-SD2 variable region (Leu454-Thr534) is considerably more elaborate (in organization and the number of residues) than those typically observed in metzincins, forming a large, concave surface toward the catalytic center (Figures 3.9C and 3.12A). Collectively, these features may have evolved to facilitate approach by the expanded dimensions and “bottle brush”-like structures of densely glycosylated mucin-like substrates, including C1-INH, CD43 and CD45 (Cyster et al., 1991b; Hang and Bertozzi, 2005; Holmes, 2006; Perkins et al., 1990; Van den Steen et al., 1998).

Conserved active site features

The zinc cation is essential for polarizing the carbonyl of the target scissile bond to catalyze nucleophilic attack by the nearby water nucleophile, as well as for stabilizing the resulting oxyanionic transition state (Figure 1.9). To verify the presence of zinc in StcE, the metal content of the purified K318A/K320A/E321A/E447D variant was measured by inductively coupled plasma mass spectrometry. The metals analyzed included zinc, magnesium, manganese, iron, cobalt, nickel and copper. The molar ratio of zinc to protein was 0.9, close to the expected equimolar ratio. The amount of other species was negligible. Fluorescence scans of the StcE crystals conducted at the synchrotron revealed a characteristic absorption near the zinc edge (theoretical absorption edge energy of ~9659 eV) (Figure 3.14), further confirming the identity of the catalytic metal in the sample.

Figure 3.14 Fluorescence scan of StcE quadruple mutant (K318A/K320A/E321A/E447D) crystals. Anomalous scattering occurred near the theoretical K edge of zinc (actual value measured was ~9687 eV). The dispersive (f’) and absorption (f’’) terms of anomalous scattering are shown by the black and red peaks, respectively.
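For readers more used to thinking in X-ray wavelengths than energies, the edge energies quoted above can be converted with the relation λ = hc/E. The short Python sketch below uses the rounded constant hc ≈ 12398.42 eV·Å; the function name is chosen for illustration, and the two energies are the theoretical and measured values given in the text and in the caption of Figure 3.14.

```python
HC_EV_ANGSTROM = 12398.42  # Planck constant times speed of light, in eV·Å (rounded)


def energy_to_wavelength(energy_ev):
    """Convert photon energy in eV to wavelength in angstroms."""
    return HC_EV_ANGSTROM / energy_ev


theoretical_edge = energy_to_wavelength(9659.0)  # theoretical zinc K edge
measured_peak = energy_to_wavelength(9687.0)     # value measured in the scan

print(f"{theoretical_edge:.4f} A (theoretical), {measured_peak:.4f} A (measured)")
```

Both values lie close to 1.28 Å, so the difference of roughly 28 eV between the theoretical edge and the measured value corresponds to a shift of only a few thousandths of an angstrom in wavelength.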
Superposition of the StcE active site with structurally homologous metzincins reveals a number of conserved catalytic residues in highly similar positions, despite the limited sequence identity among these proteases. The essential zinc ion is coordinated in the characteristic tetrahedral geometry by His446 and His450 of the active site helix (H3), His456 and a water molecule (Figure 3.15A). Met518 in the conserved Met-turn, a signature of metzincins, forms the hydrophobic base of the active site. In our crystal structure, Asp447 has been substituted for the Glu general base in WT StcE to facilitate ordered crystal formation. Being one methylene unit shorter, Asp447 is ~5 Å away from the observed nucleophilic water, consistent with the inability of this mutant to hydrolyze substrates (Lathem et al., 2002).

In all metzincin clan members structurally characterized to date, S4 in the twisted β-sheet forms the upper rim of the active site cleft and binds primarily to the non-primed end of the substrate in an antiparallel fashion (Gomis-Ruth, 2009). In StcE, a stretch of conserved glycines, from Gly426 to Gly432 with an internal Ser428, occurs along S4 (Figure 3.15A). The poly-Gly motif likely confers plasticity to the active site cleft, allowing StcE to cleave substrates with somewhat lax specificity preferences. This structural feature may explain why it has been difficult to identify a consensus sequence among the glycosylated PTS-rich regions of StcE substrates (Szabady et al., 2011). Due to intramolecular interactions between the α-O-linked GalNAc and the peptide core (Coltart et al., 2002), the O-glycan clusters constrain the mucin-like peptide chain into a specific, extended conformation (Jentoft, 1990). High conformational freedom of the poly-Gly backbone in the StcE active site likely facilitates recognition of the O-glycan induced conformation.
Indeed, the phi and psi angles of the well-ordered Gly426 and Gly430 lie in less favorable regions of the Ramachandran plot.

Figure 3.15 StcE active site. (A) Conserved structural motifs potentially involved in substrate binding and transition state stabilization surround the catalytic residues. Tetrahedral coordination of the zinc ion (red sphere) by the catalytic solvent (beige sphere) and ligand side chains is shown by orange dashed lines. A dashed black line delineates the proposed S1 pocket. Dashed green lines represent hydrogen bonds. An identical front view of the active site is shown in secondary structure representation for comparison in Figure 3.13B. (B) Docking of the C1-INH tripeptide in the StcE active site. Conformations of the P2-P1’ residues (T111DS113) of C1-INH appear to be structurally compatible with neighboring residues in the corresponding StcE subsites. The carbonyl oxygen of the scissile peptide bond is within binding distance of the catalytic zinc ion (orange sphere) as shown by the yellow dashed line. The StcE surface is colored by residue type, with cyan being polar; blue, basic; red, acidic; pink, aromatic and grey, hydrophobic. Dashed green line outlines the proposed sugar-binding site. The dimension of the active site near the scissile bond is approximated by the black dashed line.

Immediately upstream of H2 is a flap containing the strictly conserved Trp366 and His367 (Figure 3.15A). Their side chains project into and constrict the active site, providing some level of subsite specificity. Importantly, the backbone orientation of this flap is held in position via hydrogen bonding with several invariant residues, namely Gly426 of the poly-Gly strand, Arg372 and His425 side chains, which are further stabilized by the phenyl ring of Tyr418 (Figure 3.15A). By virtue of its proximity to the active site, His367 may be involved in stabilizing the carboxyanionic transition state in addition to substrate binding.
Tyr457, located downstream of the zinc ligand His456, could potentially fulfill a similar role (Figure 3.15A). The residue is strongly though not strictly conserved among M66 members (Figure 3.11). It may undergo a “tyrosine switch” or shift in position associated with substrate and/or transition state stabilization analogous to other family-specific non-zinc binding tyrosines in this position (Gomis-Ruth, 2003).

StcE proteolytic specificity

To obtain further insight into residues that could potentially contribute to substrate recognition, we next sought to verify whether the cleavage specificity we determined previously is in agreement with the structural features observed in the active site. Briefly, the sequence of the StcE-treated C1-INH fragment closely matches S113PTQP117 found in the N-terminal O-glycosylated region of the serpin, except for the T115L substitution (Figure 3.5). On the N-terminal side of the identified cleavage site is Asp112 of C1-INH, which would be positioned as the P1 residue, and indeed, it appears that its side chain could conceivably fit in a shallow pocket created by the invariant His367, Arg372 and Lys377 in the StcE active site (Figure 3.15A). We attempted to fit the C1-INH tripeptide spanning P2-P1’ (T111DS113) into the StcE active site by docking (Trott and Olson, 2010), and the accepted, energy-minimized conformation of the docked ligand shows a reasonable binding mode with respect to the surrounding catalytic residues (Figure 3.15B) (see Discussion for specific details). Therefore, the structural data are consistent with the cleavage specificity identified by N-terminal sequence analysis of the proteolyzed substrate.

Potential exosite function of the insertion domain

Curiously, the poorly resolved density corresponding to the INS region (His132-Arg252) appears to approach the active site (Figure 3.10B).
We generated a mutant of StcE with this sequence-variable region truncated (ΔINS) (Figure 3.10) to test its involvement in substrate binding. StcEΔINS exhibited decreased proteolytic activity toward C1-INH compared to WT StcE (Figure 3.16A) (the circular dichroism spectrum confirmed no large perturbation to the overall structure due to the mutation, as shown in Figure 3.16B-C). This suggests the insertion may contribute to substrate binding and specificity, although we cannot exclude the possibility of additional sequence determinants in the absence of a complete structure.

Figure 3.16 Biochemical characterization of StcE domain variants. (A) Time-dependent proteolytic cleavage of C1-INH by WT StcE, C-terminal truncation (ΔC) and insertion deletion (ΔINS) mutants. The ΔINS mutant degraded C1-INH into the expected ~65 kDa fragment less efficiently than WT StcE and the ΔC mutant, which exhibited activity comparable to that of the wild type. Protein bands corresponding to the mucin-like glycoprotein substrate C1-INH were specifically visualized by Pro-Q Emerald 300 glycoprotein stain. Time was measured in minutes unless otherwise indicated (hour, hr). Relevant lanes were assembled from multiple SDS-PAGE gels. (B) Circular dichroism analysis of WT StcE and ΔINS. The deletion mutant retained secondary structure elements; its impaired proteolytic efficiency was not due to misfolding. (C) Gel filtration elution profiles of monomeric WT StcE (blue) and ΔINS (red). SDS-PAGE analysis of respective peak fractions is shown in inset.

Sequences corresponding to region INS have a propensity to form primarily β-strands according to secondary structure prediction (Cole et al., 2008). The fact that we do not observe well-ordered density for INS in our model suggests this region may adopt multiple conformations that may contribute to the ability of StcE to capture a variety of complex substrates at the cell surface.

Cell surface binding features

StcE protects serum-sensitive E.
coli from complement-mediated lysis by recruiting C1-INH to opsonized, or antibody-coated, cells, effectively increasing C1-INH local concentrations (Lathem et al., 2004; Pillai et al., 2006). The metalloprotease promotes interaction of the serpin with its serine protease targets to potentiate its inhibitory activity on complement. However, the enhanced association of the serpin/serine protease complex does not occur in solution and relies on the presence of cell surfaces as assembly platforms (Lathem et al., 2004). C1-INH has no inherent binding affinity for cell surfaces and flow cytometry analysis indicated that its recruitment to the cell periphery requires direct interaction between the cells and StcE (Lathem et al., 2004). StcE mediates this process in a cleavage-independent manner although the mechanism of cell surface localization is unclear.

A structural homology search by DALI (Holm and Rosenstrom, 2010), which compares the query protein with entries in the Protein Data Bank on a scale where a Z-score > 2.0 indicates possible homology, suggests the StcE C-terminal domain may, in part, mediate such an interaction. The βγ-crystallin fold domain shows the highest similarity with protein S (PDB 1NPS, Z-score of 9.2 and root mean square deviation or rmsd of 2.4Å over 77 aligned Cα), a surface coat protein that assembles around myxospores of Myxococcus xanthus during periods of cellular stress (Inouye et al., 1979). Previous electron microscopy studies on the cell adhesion protein showed that it could spontaneously associate with protein S-deficient spores upon the removal of salts from the medium or with the addition of Ca2+ (Inouye et al., 1979). The requirement for low salt and calcium suggests a mainly charge-mediated interaction for protein association.
However, StcE lacks the conserved Ca2+-binding motif (D/N-X-X-S) present in some members of the βγ-crystallin superfamily (Aravind et al., 2009), and there is also no structural evidence of Ca2+ in the electron density map. We experimentally verified these observations with isothermal titration calorimetry, which detected no appreciable binding of the divalent cation to StcE (Figure 3.17); in addition, calcium was not present in significant proportional amounts in samples previously analyzed by inductively coupled plasma mass spectrometry (Grys et al., 2006).

Figure 3.17 Isothermal titration calorimetry of StcEΔ35E447D to calcium. An insignificant amount of thermal energy was generated from the titrations of StcE with divalent calcium ions, as shown by the peaks in the raw ITC isotherm in the upper panel and the integrated heats below. The magnitude of the peaks reflects the heat of dilution background, indicating negligible binding of StcE to calcium.

To evaluate the ability of the C-terminal domain to associate peripherally with the cell surface, we compared the cellular localization of WT StcE and its truncation mutant (ΔC) in an equilibrium sucrose density gradient experiment. WT StcE co-fractionated with the outer membrane protein (OMP) control ZirT (Gal-Mor et al., 2008) while significantly less of ΔC remained in the corresponding fractions (Figure 3.18A-B). Indeed, the calculated electrostatic surface of StcE reveals electropositive clusters and hydrophobic ridges in domain C that may provide complementary charges for surface association (Figure 3.10E). Interestingly, the solvent-exposed side chains of Tyr828, Trp854, Tyr859, Arg861, Lys889, and Arg891 in these regions form a complementary interface along an acidic crevice that is found on a symmetry-related molecule in the crystal contacts (Figure 3.18C). We also examined the ability of StcEΔC to degrade C1-INH to verify its structural integrity.
Proteolytic efficiency of the mutant is comparable to the WT and not compromised by truncation of domain C, indicating that the putative cell surface-binding domain is dispensable for substrate binding (Figure 3.16A).

Figure 3.18 Sucrose density gradient analysis of StcE domain variants. (A) Cellular localization analysis of StcE domain variants using increasing concentrations of sucrose (30% to 55%), as denoted by the gradient above the SDS-PAGE gel. Sucrose gradient fractionation of E. coli total membranes revealed co-localization of WT StcE (A) with the outer membrane protein control ZirT (B). Truncation of the C-terminal domain (ΔC) resulted in a reduced amount of StcEΔC in the equivalent OMP fractions. The decrease was unlikely due to poor expression of StcEΔC as similar levels of WT StcE and the mutant were present in their respective total lysates in lanes 1 and 2. The positions of molecular weight markers are indicated on the left side of the SDS-PAGE gel. (C) Select hydrophobic and basic residues on the solvent accessible surface of the C-terminal domain contact a prominent acidic crevice on a symmetry-related StcE molecule.

Chemical cross-linking of StcE with C1-INH

The structure of apo-StcE provides valuable insight into its subsite specificity and how its multi-domain architecture and unusual active site may accommodate O-glycosylated substrates. Despite the amount of knowledge that was gleaned from the StcE structure, static light scattering analysis and biochemical characterization of its truncation mutants, questions about its molecular interactions with the proteolytic substrates remain. As a complementary approach to study the detailed interactions between StcE and C1-INH while we pursued the high-resolution co-crystal structure of the complex, we tried to map the contact interface by chemical cross-linking in combination with mass spectrometry analysis.
This approach has been used successfully to identify sites of direct interaction in various protein assemblies, including the structurally complex type III secretion needle apparatus (Sanowar et al., 2010). Specifically, subunit interactions deduced from the cross-linking data have led to new insights about the spatial organization of the transmembrane components of the needle complex.

To begin to probe the interaction between StcE and its substrate C1-INH, we isolated a complex of the catalytically inactive protease with the serpin. We chose the homobifunctional cross-linkers DSS, DSP and EGS (Figure 3.19A), which specifically react with primary amines, as a number of lysine side chains are accessible for conjugation near the active center.

Although DSS, DSP and EGS have the same chemical reactivity, their spacer chain lengths are different (11.4, 12 and 16.1 Å, respectively). This unique property may be exploited to probe different distance spheres and used to define distance restraints for the protease/substrate interactions in developing a model of the complex (Petrotchenko and Borchers, 2010).

SDS-PAGE analysis of the cross-linking reaction products revealed pronounced smearing of high molecular weight species (Figure 3.19B). The diffuse migration pattern was probably due to the heterogeneity of human C1-INH and the presence of variably cross-linked species. Notably, increasing concentrations of the cross-linkers did not result in greater amounts of productively linked complexes (as evident in the faint high molecular weight bands obtained at all concentrations of DSS and DSP) (Figure 3.19B). The low efficiency of the cross-linking reaction was likely caused by steric hindrance from the oligosaccharide chains on the serpin, which limited access to the binding interface.
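The distance-restraint idea described above for the three spacer arms can be illustrated with a minimal sketch. It assumes, for simplicity, that a cross-link is feasible when the straight-line distance between two lysine Nζ atoms does not exceed the spacer arm length; the function name, the `slack` parameter and the coordinates are hypothetical and are not taken from the StcE model or from any cross-linking software.

```python
import math

# Spacer arm lengths (in Å) as quoted in the text for the three cross-linkers.
SPACER_ARM = {"DSS": 11.4, "DSP": 12.0, "EGS": 16.1}


def compatible_linkers(n_zeta_a, n_zeta_b, slack=0.0):
    """Return the N-N distance and the linkers whose spacer arm can span it.

    n_zeta_a, n_zeta_b: (x, y, z) coordinates of two lysine side-chain nitrogens.
    slack: hypothetical allowance (Å) for side-chain flexibility; 0 by default.
    """
    distance = math.dist(n_zeta_a, n_zeta_b)
    linkers = [name for name, arm in SPACER_ARM.items() if distance <= arm + slack]
    return distance, linkers


# Hypothetical coordinates, for illustration only.
lys_protease = (0.0, 0.0, 0.0)
lys_serpin = (9.0, 8.0, 6.0)
distance, linkers = compatible_linkers(lys_protease, lys_serpin)
print(f"{distance:.1f} Å apart; spanned by: {linkers}")
```

In this simplified picture, a lysine pair linked by EGS but not by DSS or DSP would be bracketed between roughly 12 and 16 Å, which is how a set of linkers of different lengths can narrow a distance restraint.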
Although a cross-linked species migrating slower than the ~180 kDa marker might reflect the 1:1 StcE/C1-INH complex, its low abundance and the inability to isolate a discrete band prevented further characterization by mass spectrometric mapping.

Figure 3.19 Chemical cross-linking of the StcEΔ35E447D/C1-INH complex. (A) Chemical structures of the cross-linkers DSS, DSP and EGS. The specific spacer arm lengths are indicated. (B) The purified complex was subjected to cross-linking using DSS, DSP and EGS at increasing concentrations (0.3 mM to 2 mM for lanes 1 to 3). A minor species with an apparent molecular mass slightly greater than 180 kDa potentially corresponds to the 1:1 metalloprotease/serpin complex. The control reaction (lane 1) in which StcEΔ35E447D alone was incubated with the cross-linker showed that there is no non-specific interaction between StcE monomers.

3.2 Characterization of the StcE insertion domain

3.2.1 Purification and structural determination of the StcE insertion domain

The multi-domain arrangement of StcE, as revealed by the crystal structure, is highly dynamic. In addition to the ordered domains IG, M and C, it possesses flexible regions with poorly resolved electron density in the current model. These dynamic regions lack sequence identity with proteins of known function; thus sequence-based prediction offers little insight into their roles in facilitating StcE mucinase activity. Preliminary evidence from the substrate proteolytic assay, which demonstrated a reduction in the catalytic efficiency of the StcEΔINS mutant, indicates that the sequence-variable insertion domain potentially acts as an exosite for substrate binding and specificity. To explore this hypothesis further, a construct defining region INS (residues H132-N251) was made based on visual inspection of the model and sequence alignment of StcE homologues (Figures 3.9B and 3.10).
A His-tagged construct expressing region INS alone (StcE-INS) was purified using a zinc-chelating column and size exclusion chromatography to >95% purity (Figure 3.20A). A sufficient amount of the protein was obtained to further characterize the function of the insertion region, starting with high-throughput crystallization screening of the purified sample.

StcE(132-251) crystallized readily and subsequent rounds of optimization resulted in crystals that diffracted to 1.6Å on a rotating anode Rigaku 007 X-ray source. Quick cryo-soaking of the crystals with a cryoprotectant containing NaI allowed incorporation of halide ions for phasing and the structure was solved by the SIRAS technique (single isomorphous replacement with anomalous signal).

Domain INS folds as a mixed, eight-stranded β-sandwich. The compact structure shares a similar topology with the T4 bacteriophage baseplate structural protein gp9 (Z-score of 6.2) and contains an additional β-strand followed by a perpendicular arrangement of two α-helices in its N-terminal region (Figure 3.20B-C and Table 3.2).

Figure 3.20 StcE insertion domain is an independently folded module. (A) Purification of the insertion domain, StcE-INS. A construct encoding residues 132-251 of StcE was expressed separately and the purified protein appeared to be stable in the absence of other StcE domains. (B) Difference Patterson map illustrating the positions of strong iodide peaks. The map shown here represents the section where w = 0.167. The coordinates of each peak represent the vector difference between a given pair of iodides incorporated into the crystal. The calculated iodide positions allow the determination of the heavy atom substructure for phasing. (C) StcE-INS is compactly folded as a mixed, eight-stranded β-sandwich with a topological arrangement similar to that of the T4 bacteriophage baseplate structural protein gp9 (Z-score of 6.2).
The anomalous difference map, contoured at 5σ level, shows the positions of multiple iodides (red spheres) bound to the protein.

Table 3.2 Data collection and refinement statistics for StcE(132-251).

Data collection
  Space group                  H3
  Mol/au                       1
  Cell dimensions
    a, b, c (Å)                56.0, 56.0, 96.9
    α, β, γ (°)                90.0, 90.0, 120.0
  Resolution (Å)               1.6
  Total reflections            80354
  Unique reflections           14661
  Rpim (a)                     2.1 (11.3) (b)
  I/σI                         9.2 (1.8)
  Completeness (%)             98.9 (92.8)
  Redundancy                   5.5 (5.1)
Refinement
  Rwork/Rfree                  0.18/0.21
  Average B-factors (Å2)
    Protein                    18.0
    Ion                        21.7 (7 iodides)
    Water                      (175) 26.0 (103 waters)
  R.m.s deviations
    Bond length (Å)            0.020
    Bond angles (°)            1.486
  Ramachandran plot
    Favored (%)                98
    Disallowed (%)             0

(a) Multiplicity-weighted Rmerge (Diederichs and Karplus, 1997).
(b) Values in parentheses are for the highest-resolution shell.

3.2.2 Implications for substrate binding

Orientation of StcE(132-251) in full-length StcE

To obtain a more complete description of the StcE structure and function, and to better understand the relative organization of the individual domains, region INS was used as a search model and positioned in the context of the available atomic coordinates for full-length StcE by molecular replacement using the spherically averaged phased translation function (correct placement was confirmed by the anomalous signal corresponding to SeMet205 in the region) (Figures 3.21A-B). A long flexible linker (His132-Lys150) N-terminal to domain INS and a shorter loop segment (Gly246-Arg252) continuous with its C-terminus join the insertion domain to complementary parts of the neighboring domain IG (Figures 3.9B-C and 3.21C).

Figure 3.21 Relative orientation of the proposed exosite domain in StcE. Sigma-A weighted 2Fo-Fc maps of the INS region in the context of the full-length StcE structure (A) and as an isolated domain (B), contoured at 1σ.
The disordered electron density of the mobile INS domain in the former contrasts with the well-ordered density observed in the latter. The map is centered at Met205 of the INS domain. (C) Position of domain INS in the overall StcE structure, shown as a front view of the active site. The conserved Trp189, Arg191 and Tyr214 potentially involved in carbohydrate binding are located on the protein surface and shown in yellow stick form. (D) Electrostatic surface potential of the molecule in (C), colored from -4 kT/e (red, negative) to +4 kT/e (blue, positive). The conserved cluster corresponding to solvent-accessible Trp189, Arg191 and Tyr214 is shaded in yellow. (E) 180° rotation of the molecule in (D) reveals oppositely charged surfaces on opposing sides of domain INS.

Domain INS pivots from the body of the metalloprotease through a hydrophobic core of conserved residues that are contributed by the aforementioned linkers and the inter-domain junction between regions IG and M (namely Phe44, Ile253, Tyr255, Trp476 and Phe489) (Figure 3.21C). Although the small, globular INS module forms minimal contact with the rest of the ~100 kDa protease, its tilted orientation allows close approach to the substrate-binding cleft located between M-SD1 and M-SD2 of the metalloprotease domain (Figure 3.21C). This is in agreement with our initial speculation that the residual density for this region in the full-length StcE crystals seems to lie proximal to the active site. Based on the specific conformation domain INS adopts in our crystal structure, its distance to the catalytic center, which is measured from a point parallel to the zinc cation, is approximately 35Å.

Electrostatic surface features of StcE(132-251)

Electrostatic surface analysis performed on the isolated insertion domain revealed an extensive distribution of positively charged patches, including a distinct basic cavity between domains M and INS, as well as a smaller acidic surface on the opposite side (Figure 3.21D-E).
Closer inspection of the inter-domain cavity identified Arg223, Lys224 and Arg244 as responsible for the high density of positive charges.

Moreover, a conserved cluster of basic and aromatic residues (Trp189, Arg191 and Tyr214) is situated in a path that is continuous with the protease active site (Figure 3.21D). Aromatic side chains often engage in (CH)/π stacking interactions with planar sugar rings to mediate protein-carbohydrate recognition (Boraston et al., 2004). Surface exposure of these hydrophobic residues (Trp189 and Tyr214 in domain INS), which are typically buried in the interior of proteins for stability, may render their side chains accessible to specific carbohydrate determinants in glycoprotein targets of StcE. A high-resolution structure of StcE in complex with its substrate, for example, C1-INH, will shed light on how the unique electrostatic surface features observed in this particular module contribute to binding of mucin-type substrates in greater detail.

StcE insertion domain is dynamic

Domain INS adopts a discrete fold as an independent module. The compact structure determined for the domain on its own contrasts with the poorly resolved density that was observed in the context of full-length StcE. This suggests that the region is highly mobile, rather than disordered. Indeed, domain INS is flexibly linked via a long linker (His132-Lys150) as an insertion to domain IG, which juxtaposes the mostly basic face of the former toward the StcE active site that processes negatively charged glycoprotein substrates. The lack of intramolecular contacts between the long linker and surrounding residues suggests flexibility and this likely allows the proposed exosite domain to exist in various conformations to present a potential protein-binding interface relative to its respective targets.
Chapter 4: Discussion

4.1 Insights from the StcE structure

StcE-like enzymes comprise the M66 metalloprotease family, for which only EHEC StcE, its Shigella boydii 13 counterpart and TagA orthologues from A. hydrophila and V. cholerae have been characterized biochemically (Pillai et al., 2006; Szabady et al., 2011; Walters et al., 2012), while no members have been characterized structurally. All the orthologues have been shown to remodel host cell surface proteins to mediate bacterial attachment during infection. As such, the StcE structure we solved represents a prototype from this family. It provides valuable insight into the molecular features that mediate the critical mucinase role that these proteins play in overcoming the host innate immune response to bacterial infection.

4.1.1 Open conformation of the active site and overall electrostatics

Upon examining the StcE metalloprotease domain, we note that the angular opening of the active site groove is ~10° wider compared to other structurally homologous metzincins (Figure 3.13A). It is accessible to solvent and potential substrates, and is not occluded by a propeptide. As a means to regulate the hydrolytic activity, some proteases are synthesized as zymogens and subsequently converted to the mature, active forms through limited proteolysis of the prosegments. In particular, the metzincin family of matrix metalloproteases (MMP) features a globular prodomain in which a “cysteine-switch” or “Velcro” sequence motif is observed to occupy the substrate binding cleft. In the crystal structure of proMMP-2, a cysteine residue within the conserved motif, P100RCGVPD106, replaces the catalytic water and coordinates the zinc cation via its Sγ atom to keep the enzyme in a latent state (Figure 4.1). Hydrogen bonding interactions between Arg101 and Asp106 therein hold the regulatory “cysteine-switch” in the required conformation to orient Cys102 for a tetrahedral coordination sphere (Morgunova et al., 1999).
A similar “Velcro” mechanism controlling enzyme activation has also been suggested for other metzincin families, including the adamalysins (Grams et al., 1993), leishmanolysins (Gomis-Ruth, 2003) and pappalysins (Tallant et al., 2006).

Figure 4.1 Regulation of zinc metalloprotease activity by the cysteine switch mechanism. Interactions between Arg101 and Asp106 maintain the specific conformation of the cysteine-switch sequence, P100RCGVPD106, which blocks access to the active site. Cys102 displaces the catalytic solvent from the proMMP-2 active site and Asn104 occupies the S1 subsite to prevent substrate binding and catalysis. Hydrogen bonds are denoted by black dashed lines.

Besides the signal sequence at the amino-terminus of StcE that is cleaved by type I signal peptidase after its Sec-dependent transport across the bacterial cytoplasmic membrane, the metalloprotease does not appear to possess a prodomain and is secreted into the extracellular space as the mature active form (Lathem et al., 2002). We suggest the unusually open dimensions of the StcE substrate binding cleft may reflect its unique ability to efficiently process large O-glycosylated substrates. In addition, the observed surface charge asymmetry in StcE likely has an impact on how this protease approaches its substrates. α-O-glycan clusters found along PTS-tandem repeats of mucin-type host glycoproteins impart an overall negative charge to the molecules (Van den Steen et al., 1998). Of note, the predominantly acidic face of StcE may provide a charge repulsion mechanism that suitably orients the more electropositive target-binding site toward surface localized substrates on the host cell (Figure 3.10D-F), a strategy also adopted by members of the unrelated carbohydrate-binding sialidase/trans-sialidase family.
We suggest these properties might have evolved to allow StcE to overcome the electrostatic shielding and general protease resistance afforded by O-glycosylated proteins (Garner et al., 2001). Measuring ~8Å in width at the narrowest constriction, the StcE active site cleft widens further away from the catalytic center on both the primed and non-primed ends (Figure 3.15B). This allows the protease to form optimal interactions at the scissile peptide bond while minimizing steric clashes with potential carbohydrate moieties linked to peptide residues nearby. These overall features, together with conserved structural motifs identified in the catalytic domain, allow us to describe a probable mechanism for the proteolytic specificity of StcE toward mucin-type substrates.

4.1.2 Subsite specificity

When StcE is in the open conformation, O-glycoprotein substrates such as C1 esterase inhibitor (C1-INH) can access the clamp-like opening between M-SD1 and M-SD2, and be captured by StcE. A basic S1 specificity pocket formed by the conserved His367, Arg372 and Lys377 indicates a preference for acidic residues in this subsite of the enzyme active site (Figure 3.15A). This is supported by the N-terminal sequence of the StcE-treated C1-INH fragment, which indicates that the cleavage occurred C-terminal to Asp112. As shown by the interactions simulated by our characterized C1-INH peptide spanning P2-P1’ (T111DS113), fitting of Asp112 at P1 is sterically and electrostatically compatible with the immediate active site environment (Figure 3.15B). The bulky side chains of Trp366 and His367 in the conserved G364GWHSG369 flap partially shield the active site from bulk solvent, increasing the local effective charge, particularly for electrostatic interactions in the S1 subsite, as well as for the nearby catalytic zinc ion (Figure 3.15A).
The enhanced charge on the zinc allows the divalent cation to effectively polarize the nucleophilic water and carbonyl of the target peptide bond to catalyze substrate hydrolysis.

Sequence analysis of C1-INH identified 14 potential O-glycosylation sites (Bock et al., 1986), including Thr111, which would fit as P2 (Figure 3.5). Although identity of the sugar moiety attached to Thr111 has yet to be verified experimentally, aromatic side chains of the nearby Tyr404 and Phe544 can potentially mediate carbohydrate binding through stacking interactions, a common theme in sugar binding (Figure 3.15B).

4.1.3 Recognition of mucin-type O-glycan ligands

Our mammalian glycan microarray analysis of StcE identified the di-sialylated core I O-glycan Neu5Acα2-3Galβ1-3(Neu5Acα2-6)GalNAcα-threonine as the highest affinity ligand (Figure 3.5). Previous lectin binding assays (employing agglutinins from Sambucus nigra and Maackia amurensis and peanut agglutinin) have also confirmed the presence of similarly linked carbohydrate structures on C1-INH (Schoenberger, 1992). Interestingly, substitution of the threonine in this glycoconjugate with the extended linker, -CH2CH2CH2CH2-, abrogated binding (Figure 3.6B). This suggests that ligand recognition by the StcE active site may involve both the peptide and sugar moieties of the glycoprotein substrate. Welch et al. conducted a similar glycan array study on StcE previously and obtained different results that were reported to be inconclusive. We acknowledge that we used the full-length protease for our screen instead of a truncated version used in the other study and hypothesize that ligand binding requires cooperative interaction between the different StcE modules observed in our structure. Interestingly, Pic (protease involved in colonization), a virulence factor expressed by Shigella flexneri 2a, uropathogenic and enteroaggregative E.
coli, was recently shown to also require the recognition of an O-glycosylated Ser or Thr in its substrates for cleavage (Ruiz-Perez et al., 2011). A number of parallels can be drawn between Pic and EHEC StcE, such as their preference for mucin-type glycoprotein substrates, as well as their ability to inhibit the mobilization of immune cells during infection through their cleavage of select leukocyte surface proteins. The fact that these effectors from various pathogenic microorganisms all exploit the mucin-type glycan motif to exert their effect on host proteins suggests that this unique recognition property may be an important virulence determinant.

Moreover, conservation of the poly-Gly motif in StcE, as well as Gly520 and Gly521 on the rims of the active site cleft, indicates that conformational flexibility is essential for target recognition and catalysis (Figure 3.15A). The glycine backbone can assume a wider range of torsional angles without undue strain, allowing the active site to adapt to and recognize specific α-O-glycan induced conformations in mucin-type glycoprotein substrates. Indeed, evidence from high-resolution NMR analysis of mucin glycopeptides based on the StcE substrate CD43 revealed a stable conformation arising from the interaction of α-O-GalNAc with adjoining peptide functional groups (Coltart et al., 2002). Specifically, the GalNAc amide proton hydrogen-bonds to the glycosidic oxygen and the backbone carbonyl of the attached amino acid, while the hydrophobic face of the N-acetyl methyl forms medium-range interactions with methyl groups of the peptide side chain two positions downstream from the site of O-glycosylation (Figure 4.2) (Hashimoto et al., 2011). These intramolecular interactions constrain the underlying mucin-type peptide backbone into a defined, extended conformation.

Figure 4.2 Structure of the CD43 mucin glycopeptide STTAV.
Shown here is a representative structure from the ensemble of the 59 best models calculated for the O-glycosylated peptide, STTAV (PDB 1KYJ; Coltart et al., 2002). The peptide is colored in green; α-O-linked GalNAc residues on Ser1, Thr2 and Thr3 are colored in cyan, yellow and magenta, respectively. Representative intramolecular interactions between GalNAc and the attached amino acid residue are shown for Thr3. Specifically, the GalNAc amide proton forms hydrogen bonds (in black dashed lines) with the glycosidic oxygen and the backbone carbonyl of Thr3, while the hydrophobic face of the N-acetyl methyl group forms medium-range interactions with the non-polar side chain of Val5. The backbone of the mucin-type glycopeptides adopts a defined conformation as a result of these interactions with the core carbohydrates.

4.1.4 Insight on catalytic mechanism

While the electrostatic profile and open conformation of the StcE active site facilitate substrate approach, we speculate that substrate binding could trigger domain closure through induced-fit, leading to favorable protease-substrate interactions. Mutagenesis studies on the related thermolysin-like proteases (TLP) suggest that conserved glycines are involved in hinge-bending motions that are associated with closure and opening of the binding cleft during catalysis (Veltman et al., 1998). Interestingly, Gly441 of StcE lies in a structurally analogous position at the N-terminal end of the interdomain active site helix as the conserved Gly136 of TLP (Figure 3.11). Kinetic analysis of the G136A mutant of TLP showed decreased activity toward peptide substrates, supporting the importance of the conserved glycine residue in catalysis. In the same vein, Gly441 in StcE may mediate a similar hinge-like movement that optimally orients the metalloprotease substrate binding cleft and the target peptide toward one another.
Orientation of the docked C1-esterase inhibitor peptide in the StcE active site, with the P1 carbonyl within binding distance of the metal, allows for nucleophilic attack by the zinc-bound water on the re-face of the scissile peptide bond (Figure 3.15B). This is consistent with the catalytic register and substrate-binding mode of similar metzincins (Gomis-Ruth, 2009). His367 and/or the non-liganding Tyr457 could potentially stabilize the transient negative charge on the tetrahedral intermediate, contributing to the charge neutralization effect of the catalytic zinc cation. This likely requires reorientation of the Tyr457-containing loop and rotation of the residue side chain to bring the phenolic hydroxyl into proximity, in a “tyrosine switch” motion observed in the related snapalysins (Kurisu et al., 1997). After hydrolysis, the cleaved target is liberated from the active site and the protease is regenerated.

4.1.5 Implications of the multi-domain architecture

Despite extensive conserved surface features in the active site (Figure 4.3) and their shared specificity toward mucin-type glycoproteins, the substrate profiles of StcE homologues characterized to date are not identical. In particular, the Vibrio pathogenicity island-encoded virulence factor TagA, which shares 37% sequence identity with StcE, cleaves CD43 less efficiently than StcE and is inactive toward C1-INH (Szabady et al., 2011). This suggests that sequence divergence between these proteins may be responsible for differences in substrate preference. We hypothesize that a dynamic, sequence-variable region (INS, His132-Asn251) within the N-terminal IG domain, which likely arose from a gene insertion event, contributes to substrate specificity.
Proximity of the observed termini of INS (P137 and L248) and the interspersed residual (but poorly ordered) density adjacent to the active site would present this variable region as a possible exosite for specifying binding partners, as in the metalloprotease RVV-X (Takeda et al., 2007). This is supported by our observed decrease in cleavage efficiency of ΔINS toward C1-INH. We speculate that StcE has evolved this modular architecture to fine-tune its substrate binding. Although this insertion element is present in most StcE-like proteins in the M66 family, select members lack it (for instance, EEZ85628 and EDL70231 from the marine bacteria Vibrio harveyi 1DA3 and HYϕ1, respectively) and instead feature tandem chitin/cellulose-binding domains in the C-termini, as annotated in the Entrez protein sequence database (Figure 3.11). The variation in domain composition and organization likely reflects adaptation to changing substrate motifs in different host environments (Pruzzo et al., 2008). Truncation mutants and chimeric proteins designed based on the StcE model will provide insight in that regard.

Figure 4.3 Conservation of surface features in the StcE active site
Structure-based sequence alignment of StcE homologues is mapped onto the StcE structure, shown as a front view of the active site. The molecule is colored according to the degree of evolutionary conservation as indicated. Conserved residues along the unique active site flap (Tyr366 and His367), in the S1 subsite (Arg372 and Lys377) and those involved in zinc coordination (His446, His450 and His456) are labeled. The catalytic zinc ion is shown as a red sphere.

Nevertheless, the StcE structure reveals how the M66 peptidase family can acquire non-catalytic domains potentially conferring distinct substrate specificities while preserving essential features of its catalytic scaffold.

The overall conformation of StcE and local flexibility in the active site appear to have evolved to specifically recognize O-glycan induced conformations in mucin-type glycoproteins. The findings provide a structural basis for understanding how bacterial virulence factors can breach the mucosal barrier and disarm key host immune proteins, and specifically identify StcE as a highly attractive, accessible antimicrobial target for blocking enterohemorrhagic virulence and disease.

4.2 Future directions

4.2.1 Characterization of StcE-substrate interactions using recombinant C1-INH

Human C1-INH is the principal regulator of complement proteases in the classical and mannose-binding lectin pathways, and is specifically targeted by the metalloprotease StcE during EHEC infection. The crystal structure of apo-StcE we determined has allowed us to propose how the bacterial virulence factor, a prototypical member of a unique O-glycoprotein-specific protease family, may irreversibly modify key mammalian mucin-type substrates involved in host defense. Although its active site features, which include an unusually wide substrate-binding cleft and a conserved flexible poly-Gly motif, explain how the protease may accommodate densely glycosylated proteins without introducing steric hindrance, a co-crystal structure of StcE bound to its substrate, in particular C1-INH, will help elucidate the molecular determinants responsible for target recognition and specificity in greater detail. Our attempts to obtain a high-resolution structure of the StcEΔ35E447D/C1-INH complex were unsuccessful. It is likely that chemical and structural heterogeneity of the glycans on the native serpin impeded ordered crystal lattice formation. Although enzymatic deglycosylation of the complex was employed, the high cost of the purified C1-INH (obtained commercially) prohibited extensive screening with multiple glycosidases that have different substrate specificities.
Recombinant expression of C1-INH will thus provide a consistent supply of the protein for future structural studies.

Kinetic analysis of StcE mutants

A reliable supply of recombinant C1-INH will also be useful for the kinetic characterization of StcE mutants that are designed to probe the structural features necessary for proteolysis. The candidates of particular interest for kinetic studies are Trp366 and His367 in the active site flap, the Tyr457 switch, Gly441 in the proposed hinge region, as well as residues in the poly-Gly motif (Figure 3.15A). The substrate may be conjugated to a fluorescent reporting group to enable more accurate and sensitive kinetic measurements. The recombinant system will also permit the expression of C1-INH variants that have different glycosylation patterns through site-directed mutagenesis of the N- and/or O-glycosylation sequence motifs. This will allow us to investigate the effect of glycosylation on StcE target recognition.

Baculovirus-insect cell expression of C1-INH

Large-scale production of soluble mammalian glycoproteins is challenging, as the correct transfer and processing of glycans are often required for folding, stability and biological activity. Most of these targets are not amenable to bacterial expression, as the popular E. coli host lacks the necessary glycosylation machinery. Baculovirus-insect cells and mammalian cell lines have the proper eukaryotic protein processing capabilities, and an increasing number of structures deposited in the PDB report the use of these expression systems (Nettleship et al., 2010). Notably, successful application of these technologies has led to structures of the G-protein coupled receptors (Cherezov et al., 2007).

Human C1-INH is extensively glycosylated during post-translational modification.
A total of six N-linked sites have been identified, with three mapping to each of the C-terminal serpin domain and the unique mucin-like region in the N-terminus (Figure 3.5). Fourteen potential O-glycosylation sites exist, seven of which have been confirmed experimentally to reside in the N-terminal domain (Bock et al., 1986). A previous study has explored the use of the baculovirus-insect cell system for expressing the mature C1-INH (Wolff et al., 2001). The recombinant C1-INH was found to be less active than the serpin from human plasma and had an apparent molecular mass ~25 kDa smaller than that of the native protein. Capillary electrophoresis analysis of the N-glycans released from recombinant C1-INH revealed carbohydrate chains of the high-mannose type (Wolff et al., 2001), rather than the chemically more complex sialylated bi-, tri- and tetra-antennary species detected by NMR analysis of the endogenous glycans (Perkins et al., 1990; Strecker et al., 1985). The lack of terminal sialylation in insect cells and the conversion of N-glycan intermediates to insect-specific paucimannose products are known limitations of the insect protein processing pathway, which could affect protein folding, stability and function (Altmann et al., 1995; Marchal et al., 2001), although recent engineering efforts have tried to address this issue. The reduced molecular mass of recombinant C1-INH suggests that insect cells produce under-glycosylated C1-INH, although the specific O-glycan content was not analyzed (Wolff et al., 2001). The structural integrity of the O-glycans required for StcE recognition needs to be verified for the recombinant C1-INH, as the protease targets specific O-glycosylated motifs in its cognate substrates to exert an immunomodulatory effect on the host (Grys et al., 2005; Lathem et al., 2002; Szabady et al., 2009).
We can test the biological activity of the non-native serpin in a proteolytic assay to confirm whether StcE can recognize and process the recombinant substrate.

Expression of C1-INH in mammalian hosts

Previous studies have also evaluated the potential of expressing recombinant C1-INH in the mammary gland of transgenic rabbits, although this mammalian system, whose glycoprotein processing pathway has yet to be fully characterized, also failed to generate the authentic glycosylation patterns (Koles et al., 2004). Transient expression in HEK293 (human embryonic kidney) cells has proven to be an effective strategy for producing high yields of secreted, biologically active mammalian proteins. The technology has been adopted by major structural genomics initiatives as a complementary approach to the insect cell system, and developments in the expression protocols have enabled high-throughput screening of multiple constructs in parallel (Aricescu et al., 2006; Lee et al., 2009b). Established procedures for maintaining the tissue culture and transfecting the cells for optimal transient expression are available for small-scale test expression and can be readily adapted for large-scale production. Western blots can be performed to confirm the presence of the target protein secreted into the conditioned media before purification.

De-glycosylation strategies

Another challenge associated with the structural characterization of human glycoprotein targets is that high conformational freedom and heterogeneity of the carbohydrate moieties are unfavorable for crystallization. To circumvent this problem, we can introduce systematic mutations to the N-glycosylation consensus sequence N-X-(T/S) and/or select O-glycosylation sites known to be modified in C1-INH by replacing Asn with Asp, Thr with Val or Ser with Ala to obtain a homogeneous population of the glycoprotein.
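The sequon-targeted mutagenesis described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and the toy sequence are hypothetical (the toy sequence is not the C1-INH sequence), and the exclusion of Pro at the X position is a commonly used convention assumed here rather than something stated in the text, which gives the motif simply as N-X-(T/S).

```python
import re

# N-X-(S/T) sequon. Excluding Pro at X is an assumption (common convention);
# the lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def asn_to_asp_mutations(seq):
    """Return (1-based position, mutation label) for each sequon Asn."""
    hits = []
    for match in SEQUON.finditer(seq):
        pos = match.start() + 1          # convert to 1-based residue numbering
        hits.append((pos, f"N{pos}D"))   # Asn -> Asp, as proposed in the text
    return hits


# Hypothetical toy sequence for illustration only.
toy_sequence = "MANSTQNPSANGT"
candidates = asn_to_asp_mutations(toy_sequence)
for position, label in candidates:
    print(position, label)
```

In practice the candidate list would be cross-checked against the experimentally confirmed glycosylation sites before designing primers, since not every sequon is necessarily occupied.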
Besides possible changes in protein folding and stability, the C1-INH point mutants will need to be assessed for their ability to interact with StcE in proteolytic assays, although N-glycans are not expected to have a bearing on binding. Alternatively, enzymatic removal of N-linked carbohydrates can be achieved by using PNGaseF, which has broad substrate specificity and cleaves between the modified Asn side chain and the innermost GlcNAc, leaving no residual sugar attached to the peptide (Lee et al., 2009a). The EndoF series of enzymes can also complement the action of PNGaseF to trim diverse glycans between the two GlcNAc residues in the N-glycan core. In cases where local protein and/or glycan structures sterically hinder access to the N-glycosidic bond, sequential or simultaneous treatment of the glycoprotein with glycosidases of different specificities, such as neuraminidase (to remove terminal sialic acids), may be required. The addition of chaotropes, such as urea, at non-denaturing concentrations may also help to alter local structures at the glycosylation sites to make them more accessible to the enzymes.

4.2.2 Electron microscopy of StcEΔ35E447D/C1-INH

In parallel with the attempt to characterize the metalloprotease-serpin interactions by X-ray crystallography, we can also examine the spatial organization of the complex by electron microscopy (EM). As a complementary approach to crystallography, EM is a useful tool to study protein structures, especially those of large macromolecular complexes. For protein targets that are less amenable to ordered crystal lattice assembly, EM, which requires substantially less sample for imaging, is an excellent technique for visualizing the overall, low-resolution features of these biological molecules.
In combination with the crystal structure of StcE that we determined and the available atomic coordinates for the C1-INH C-terminal serpin domain, an EM three-dimensional (3D) reconstruction of the complex will provide additional insight into how the protease engages the mucin-like regions of its host targets.

Previous EM and neutron scattering experiments on the isolated C1-INH revealed a highly elongated molecule measuring ~18 nm in length (Odermatt et al., 1981; Perkins et al., 1990). Electron micrographs of C1-INH clearly delineated a two-domain structure with a lollipop-like, head-and-tail arrangement (Odermatt et al., 1981). The carbohydrate-rich N-terminal mucin-like region extends in a rod-like conformation and is continuous with the long axis of the globular C-terminal serpin domain. The diameter of the rod-like domain was estimated to be ~2 nm and that of the catalytic domain ~4 nm (Odermatt et al., 1981). To understand more fully how StcE may remodel this two-domain structure, negative stain EM analysis of the serpin-bound complex can be performed. Preparation of the specimen by negative staining in a layer of heavy atom solution is rapid, and the method produces high image contrast that allows finer features of the particle to be discerned from the background (Ohi et al., 2004). As particles in their native states may have preferred orientations or adopt different conformations, the images will be sorted into distinct classes and averaged separately to enhance the signal-to-noise ratio. Different views of the particles in the 2D class averages will then be integrated to calculate a 3D reconstruction that is representative of the StcEΔ35E447D/C1-INH complex in its native state.

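The reason class averaging improves the signal-to-noise ratio can be illustrated with a toy calculation. The sketch below is a deliberate simplification and not an EM processing pipeline: it assumes the images in a class are already perfectly aligned, uses a hypothetical one-dimensional "particle" profile, and adds independent Gaussian noise with a fixed random seed. Under these assumptions, averaging N images reduces the noise roughly in proportion to the square root of N.

```python
import random
import statistics


def average_images(images):
    """Pixel-wise mean of equally sized images (flat lists of floats)."""
    n = len(images)
    return [sum(pixels) / n for pixels in zip(*images)]


def noise_sd(image, truth):
    """Standard deviation of the residual between an image and the noise-free signal."""
    return statistics.pstdev([a - b for a, b in zip(image, truth)])


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical noise-free 1D profile standing in for a particle projection.
truth = [1.0 if 20 <= i < 40 else 0.0 for i in range(64)]

# One class of 100 pre-aligned noisy copies (noise SD = 1.0, an arbitrary choice).
noisy = [[p + rng.gauss(0.0, 1.0) for p in truth] for _ in range(100)]

single_sd = noise_sd(noisy[0], truth)
avg_sd = noise_sd(average_images(noisy), truth)
print(f"noise SD, single image:  {single_sd:.2f}")
print(f"noise SD, class average: {avg_sd:.2f}")
```

Real micrographs additionally require particle picking, alignment and classification before averaging, which dedicated EM software handles.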
Since the EM map of human C1-INH clearly shows a head-and-tail architecture for the molecule, difference mapping or comparison of the class averages of the StcEΔ35E447D/C1-INH complex with those of the serpin will allow us to locate StcE and note any structural rearrangements that occur as a result of the interaction. To begin to piece together a 3D model of the fully assembled complex, we could use the “head” region of C1-INH as a guide to fit the crystal structure of the serpin domain into its corresponding globular density (PDB 2OAY; Beinrohr et al., 2007). Simultaneous molecular docking of the StcE crystal structure we have in hand into the EM density map will further refine the model to suggest the approximate orientations of its constituent domains relative to C1-INH and their involvement in binding the N-terminal mucin-like region, for which no high-resolution structural information is currently available.

To experimentally validate the 3D composite model, we can selectively label StcE and/or C1-INH with a terminal GFP (green fluorescent protein) tag to identify the location of the fusion protein. Proximity of the GFP density to its predicted position within the EM density map will thus lend support to the structure suggested by the molecular docking. The fusion-tagging technique and the conceptually similar antibody labeling approach have been used successfully to verify the structural composition of large macromolecular complexes obtained through a combination of EM and molecular docking experiments (Lees et al., 2010).

4.2.3 Cell-based assays and in vivo characterization of StcE mutants

In addition to the proposed structural studies to investigate the StcEΔ35E447D/C1-INH interaction, characterization of StcE mutants in an in vivo or cellular context will help to address fundamental questions about its function in EHEC virulence.
Previous studies on cultured epithelial cells suggested that StcE plays a critical role in intimate adherence of EHEC to host cells, a step that is a prerequisite for successful colonization and infection. Indeed, the stcE deletion strain showed a colonization-defective phenotype, as the mutant was noticeably less effective in inducing host actin polymerization beneath the adhering bacteria (Grys et al., 2005). Tir (translocated intimin receptor) is directly implicated in this process (DeVinney et al., 2001). Specifically, T3SS-dependent translocation of Tir from the bacterial cytosol to the eukaryotic cell first needs to occur before the EHEC effector can remodel the actin network to establish close host-pathogen contact and disrupt important cellular processes in concert with a wide range of virulence factors that are also injected by the secretion apparatus (Dean and Kenny, 2009). The reduction in actin pedestal formation by the stcE knockout strain thus suggests that the mutant is less efficient in delivering Tir into the host cell and, in general, impaired in T3SS-dependent protein translocation. This is consistent with the belief that StcE acts during the critical first steps of host-pathogen encounters as a mucinase that degrades mucin-type glycoproteins in the mucosal barrier to assist type III secretion of bacterial effectors (Grys et al., 2005).

Protein translocation assays

Secretion assays measuring the amount of bacterial virulence proteins released into the media under T3SS-inducing conditions are routinely used to judge the secretion competencies of genetic mutants of the pathogens that utilize this protein transport pathway (Li et al., 2000). A reduction in protein secretion as a result of the gene mutation is often taken as an indication that the encoded product is involved in mediating type III secretion. However, this assay is more appropriate for analyzing the effect of assembly defects in the secretion apparatus itself.
EHEC carrying the stcE deletion can assemble a functional type III injectisome and is expected to be able to secrete effectors into the media. Rather, it is likely that this mutant cannot efficiently deliver virulence proteins into the host cell via T3SS, as it cannot sufficiently overcome the protective mucus layer to colonize and form a continuous protein translocation conduit between the host and bacterial cytosol. Therefore, to test the role of StcE in T3SS, translocation assays are better suited for monitoring the unidirectional transfer of type III-associated bacterial proteins into eukaryotic cells.

Established procedures for studying protein translocation have been applied successfully to analyze effector transport, discover novel virulence proteins, measure the translocation kinetics of effectors, as well as determine their hierarchy with respect to their roles in infection (Mills et al., 2008; Tobe et al., 2006). To examine the effect of stcE deletion on T3SS-dependent protein translocation, we can use two independent approaches to compare the translocation efficiencies of WT EHEC and the mutant. A reporter system based on translational fusions of TEM-1 β-lactamase to EHEC effectors such as Tir and Map (mitochondrial-associated protein) can be constructed to monitor protein translocation into eukaryotic cells (Charpentier and Oswald, 2004). Infected cells are first loaded with the fluorescent substrate CCF2, and we can quantify the extent of its cleavage by the β-lactamase, which reflects the translocation efficiency of the fusion proteins into the host cytoplasm. Another useful quantitative assay is the translocation of adenylate cyclase fusions of the effector proteins.
An increase in host intracellular cAMP levels due to the enzymatic conversion of ATP is an indicator of unobstructed protein translocation across the eukaryotic plasma membrane, which, in turn, allows the accumulation of the chimeric proteins in the host (Sory and Cornelis, 1994). To confirm that the observed translocation phenotypes are dependent on a functional T3SS, similar assays using a type III secretion-deficient mutant (ΔescN, which lacks the essential ATPase driving the energy-dependent translocation process) that carries the same set of reporter fusion plasmids will also need to be performed as negative controls (Zarivach et al., 2007). If the ability of EHEC to efficiently transfer Tir and other associated effectors requires the clearance of the host mucus barrier by StcE, then the stcE deletion mutant is expected to be less efficient in mediating T3SS-dependent protein translocation and in colonizing the mucin-rich epithelial layer.

In vivo competitive index assay

We could further verify the hypothesis that StcE plays a key role in mediating type III secretion and the related virulence by examining the in vivo phenotype of its deletion mutant. The mouse-adapted pathogen Citrobacter rodentium shares many virulence traits with EHEC, such as the ability to attach to host intestinal epithelial cells, efface their microvilli, induce actin pedestals and cause diarrheal symptoms (Vallance et al., 2003). It is commonly used as a model for studying EHEC pathogenesis as it also expresses a similar set of translocated effectors. However, the C. rodentium genome does not encode stcE. The infant rabbit model is another experimentally validated alternative, as the system was successfully adopted for characterizing the colonization defect of a type II secretion knockout of EHEC that is deficient in StcE (Ho et al., 2008).
In addition, infant rabbits infected with EHEC display intestinal and renal pathologies that mimic those described for humans (Garcia et al., 2006).

To investigate the contribution of StcE to virulence, an stcE-deleted EHEC strain can be used in an infant rabbit model of infection to analyze the in vivo outcomes. We can compare its colonization efficiency and histopathological effects on the host with those observed for the WT strain to determine whether there is any attenuation in virulence due to StcE deficiency. Complementation of the ΔstcE strain with a WT copy of the gene can be carried out to ensure that the phenotypic differences are not caused by a polar effect from the deletion. To further characterize the ΔstcE mutant using a more sensitive quantitative approach, a mixed infection assay combining equal amounts of the mutant and WT EHEC can be performed to determine the competitive index of ΔstcE during disease progression in infant rabbits. This assay provides a reliable measure of the specific contribution of a gene to virulence by comparing the colonization ratios of the mutant and WT bacteria within the same host, and has been routinely used to study other EHEC effectors (Sham et al., 2011). The competitive index is calculated as the colonization ratio (mutant:WT) of the output population divided by that of the input. If WT EHEC significantly outcompetes the ΔstcE strain, it suggests that the mutant is attenuated in virulence and that StcE plays an important role in bacterial survival within the host. As StcE is involved in both early and late stages of EHEC pathogenesis, identifying specific inhibitors against the metalloprotease may be an effective therapeutic strategy for EHEC-related disease.

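The competitive index defined above can be written out as a short calculation. The following is a minimal sketch with a hypothetical function name and made-up colony counts chosen purely for illustration; they are not experimental data.

```python
def competitive_index(mutant_in, wt_in, mutant_out, wt_out):
    """Competitive index = (mutant:WT ratio recovered) / (mutant:WT ratio inoculated)."""
    if mutant_in <= 0 or wt_in <= 0 or wt_out <= 0:
        raise ValueError("input counts and WT output count must be positive")
    input_ratio = mutant_in / wt_in
    output_ratio = mutant_out / wt_out
    return output_ratio / input_ratio


# Hypothetical colony-forming unit counts for a 1:1 mixed inoculum.
ci = competitive_index(mutant_in=5e7, wt_in=5e7, mutant_out=2e5, wt_out=2e6)
print(f"competitive index = {ci:.2f}")
```

A value below 1 indicates that the mutant was recovered less efficiently than the WT strain from the same host, whereas a value near 1 indicates no competitive defect. Normalizing by the input ratio corrects for any deviation of the inoculum from an exact 1:1 mixture.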
4.2.4 Inhibitor screening for EHEC StcE

StcE is secreted to the extracellular milieu upon upregulation of its expression by Ler, a global regulator of various genes essential for EHEC pathogenesis, including the type III secretion system and its translocated effectors (Lathem et al., 2002). Its highly accessible localization and important role in disease, combined with the fact that there are no known mammalian homologues of the metalloprotease, suggest that StcE is an attractive drug target for blocking EHEC T3SS-mediated virulence. Since StcE is not required for general bacterial growth, drug resistance against the protease will also be less likely to occur. High-throughput screening of small molecule libraries has made significant progress in identifying broad-spectrum inhibitors that target T3SS activity. In particular, 2-imino-5-arylidene thiazolidinone of the thiazolidinone class was found to inhibit not only T3SS in Salmonella and Yersinia, but also T2SS from Pseudomonas aeruginosa, by specifically targeting the outer membrane secretin protein common to these Gram-negative pathogens (Felise et al., 2008) (see Appendix A for the secretin’s role in the structural assembly of the virulence-associated secretion systems). A similar high-throughput approach may be adopted for the screening of specific inhibitors for StcE, and their effect on T3SS-mediated virulence can be evaluated by performing effector protein translocation assays (as described in the section on protein translocation assays) of EHEC cells grown in the presence of the different compounds. It is envisaged that inhibitors that specifically neutralize StcE activity will enhance the arsenal of antivirulence drugs available for fighting EHEC-related infections.

Bibliography

Altmann, F., Schwihla, H., Staudacher, E., Glossl, J., and Marz, L. (1995). Insect cells contain an unusual, membrane-bound beta-N-acetylglucosaminidase probably involved in the processing of protein N-glycans. J Biol Chem 270, 17344-17349.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389-3402.
Aravind, P., Mishra, A., Suman, S.K., Jobby, M.K., Sankaranarayanan, R., and Sharma, Y. (2009). The betagamma-crystallin superfamily contains a universal motif for binding calcium. Biochemistry 48, 12180-12190.
Aricescu, A.R., Assenberg, R., Bill, R.M., Busso, D., Chang, V.T., Davis, S.J., Dubrovsky, A., Gustafsson, L., Hedfalk, K., Heinemann, U., et al. (2006). Eukaryotic expression: developments for structural proteomics. Acta Crystallogr D Biol Crystallogr 62, 1114-1124.
Baker, N.A., Sept, D., Joseph, S., Holst, M.J., and McCammon, J.A. (2001). Electrostatics of nanosystems: application to microtubules and the ribosome. Proc Natl Acad Sci U S A 98, 10037-10041.
Bally, I., Rossi, V., Lunardi, T., Thielens, N.M., Gaboriaud, C., and Arlaud, G.J. (2009). Identification of the C1q-binding Sites of Human C1r and C1s: a refined three-dimensional model of the C1 complex of complement. J Biol Chem 284, 19340-19348.
Baruch, K., Gur-Arie, L., Nadler, C., Koby, S., Yerushalmi, G., Ben-Neriah, Y., Yogev, O., Shaulian, E., Guttman, C., Zarivach, R., et al. (2011). Metalloprotease type III effectors that specifically cleave JNK and NF-kappaB. EMBO J 30, 221-231.
Battye, T.G., Kontogiannis, L., Johnson, O., Powell, H.R., and Leslie, A.G. (2011). iMOSFLM: a new graphical interface for diffraction-image processing with MOSFLM. Acta Crystallogr D Biol Crystallogr 67, 271-281.
Beinrohr, L., Harmat, V., Dobo, J., Lorincz, Z., Gal, P., and Zavodszky, P. (2007). C1 inhibitor serpin domain structure reveals the likely mechanism of heparin potentiation and conformational disease. J Biol Chem 282, 21100-21109.
Blanc, E., Roversi, P., Vonrhein, C., Flensburg, C., Lea, S.M., and Bricogne, G. (2004).
Refinement of severely incomplete structures with maximum likelihood in BUSTER-TNT. Acta Crystallogr D Biol Crystallogr 60, 2210-2221.
Bock, S.C., Skriver, K., Nielsen, E., Thogersen, H.C., Wiman, B., Donaldson, V.H., Eddy, R.L., Marrinan, J., Radziejewska, E., Huber, R., et al. (1986). Human C1 inhibitor: primary structure, cDNA cloning, and chromosomal localization. Biochemistry 25, 4292-4301.
Boraston, A.B., Bolam, D.N., Gilbert, H.J., and Davies, G.J. (2004). Carbohydrate-binding modules: fine-tuning polysaccharide recognition. Biochem J 382, 769-781.
Charpentier, X., and Oswald, E. (2004). Identification of the secretion and translocation domain of the enteropathogenic and enterohemorrhagic Escherichia coli effector Cif, using TEM-1 beta-lactamase as a new fluorescence-based reporter. J Bacteriol 186, 5486-5495.
Cheng, H.C., Skehan, B.M., Campellone, K.G., Leong, J.M., and Rosen, M.K. (2008). Structural mechanism of WASP activation by the enterohaemorrhagic E. coli effector EspF(U). Nature 454, 1009-1013.
Cherezov, V., Rosenbaum, D.M., Hanson, M.A., Rasmussen, S.G., Thian, F.S., Kobilka, T.S., Choi, H.J., Kuhn, P., Weis, W.I., Kobilka, B.K., et al. (2007). High-resolution crystal structure of an engineered human beta2-adrenergic G protein-coupled receptor. Science 318, 1258-1265.
Cole, C., Barber, J.D., and Barton, G.J. (2008). The Jpred 3 secondary structure prediction server. Nucleic Acids Res 36, W197-201.
Collaborative Computational Project, Number 4. (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallogr D Biol Crystallogr 50, 760-763.
Coltart, D.M., Royyuru, A.K., Williams, L.J., Glunz, P.W., Sames, D., Kuduk, S.D., Schwarz, J.B., Chen, X.T., Danishefsky, S.J., and Live, D.H. (2002). Principles of mucin architecture: structural studies on synthetic glycopeptides bearing clustered mono-, di-, tri-, and hexasaccharide glycodomains. J Am Chem Soc 124, 9833-9844.
Cornelis, G.R. (2010).
The type III secretion injectisome, a complex nanomachine for intracellular 'toxin' delivery. Biol Chem 391, 745-751.
Croxen, M.A., and Finlay, B.B. (2010). Molecular mechanisms of Escherichia coli pathogenicity. Nat Rev Microbiol 8, 26-38.
Cyster, J.G., Shotton, D.M., and Williams, A.F. (1991a). The Dimensions of the Lymphocyte-T Glycoprotein Leukosialin and Identification of Linear Protein Epitopes That Can Be Modified by Glycosylation. EMBO J 10, 893-902.
Cyster, J.G., Shotton, D.M., and Williams, A.F. (1991b). The dimensions of the T lymphocyte glycoprotein leukosialin and identification of linear protein epitopes that can be modified by glycosylation. EMBO J 10, 893-902.
Davis, A.E., 3rd, Mejia, P., and Lu, F. (2008). Biological activities of C1 inhibitor. Mol Immunol 45, 4057-4063.
de Kreij, A., van den Burg, B., Veltman, O.R., Vriend, G., Venema, G., and Eijsink, V.G. (2001). The effect of changing the hydrophobic S1' subsite of thermolysin-like proteases on substrate specificity. Eur J Biochem 268, 4985-4991.
Dean, P., and Kenny, B. (2009). The effector repertoire of enteropathogenic E. coli: ganging up on the host cell. Curr Opin Microbiol 12, 101-109.
Deng, W., Li, Y., Hardwidge, P.R., Frey, E.A., Pfuetzner, R.A., Lee, S., Gruenheid, S., Strynakda, N.C., Puente, J.L., and Finlay, B.B. (2005). Regulation of type III secretion hierarchy of translocators and effectors in attaching and effacing bacterial pathogens. Infect Immun 73, 2135-2146.
Derewenda, Z.S. (2010). Application of protein engineering to enhance crystallizability and improve crystal properties. Acta Crystallogr D Biol Crystallogr 66, 604-615.
Derewenda, Z.S. (2011). It's all in the crystals. Acta Crystallogr D Biol Crystallogr 67, 243-248.
DeVinney, R., Puente, J.L., Gauthier, A., Goosney, D., and Finlay, B.B. (2001). Enterohaemorrhagic and enteropathogenic Escherichia coli use a different Tir-based mechanism for pedestal formation. Mol Microbiol 41, 1445-1458.
Diederichs, K.,
and Karplus, P.A. (1997). Improved R-factors for diffraction data analysis in macromolecular crystallography. Nat Struct Biol 4, 269-275.
Emsley, P., Lohkamp, B., Scott, W.G., and Cowtan, K. (2010). Features and development of Coot. Acta Crystallogr D Biol Crystallogr 66, 486-501.
Felise, H.B., Nguyen, H.V., Pfuetzner, R.A., Barry, K.C., Jackson, S.R., Blanc, M.P., Bronstein, P.A., Kline, T., and Miller, S.I. (2008). An inhibitor of gram-negative bacterial virulence protein secretion. Cell Host Microbe 4, 325-336.
Ferre, F., and Clote, P. (2006). DiANNA 1.1: an extension of the DiANNA web server for ternary cysteine classification. Nucleic Acids Res 34, W182-185.
Fraser, M.E., Fujinaga, M., Cherney, M.M., Melton-Celsa, A.R., Twiddy, E.M., O'Brien, A.D., and James, M.N. (2004). Structure of shiga toxin type 2 (Stx2) from Escherichia coli O157:H7. J Biol Chem 279, 27511-27517.
Gal-Mor, O., Gibson, D.L., Baluta, D., Vallance, B.A., and Finlay, B.B. (2008). A novel secretion pathway of Salmonella enterica acts as an antivirulence modulator during salmonellosis. PLoS Pathog 4, e1000036.
Garcia, A., Bosques, C.J., Wishnok, J.S., Feng, Y., Karalius, B.J., Butterton, J.R., Schauer, D.B., Rogers, A.B., and Fox, J.G. (2006). Renal injury is a consistent finding in Dutch Belted rabbits experimentally infected with enterohemorrhagic Escherichia coli. J Infect Dis 193, 1125-1134.
Garmendia, J., Phillips, A.D., Carlier, M.F., Chong, Y., Schuller, S., Marches, O., Dahan, S., Oswald, E., Shaw, R.K., Knutton, S., et al. (2004). TccP is an enterohaemorrhagic Escherichia coli O157:H7 type III effector protein that couples Tir to the actin-cytoskeleton. Cell Microbiol 6, 1167-1183.
Garner, B., Merry, A.H., Royle, L., Harvey, D.J., Rudd, P.M., and Thillet, J. (2001). Structural elucidation of the N- and O-glycans of human apolipoprotein(a): role of O-glycans in conferring protease resistance. J Biol Chem 276, 22200-22208.
Gaskell, A., Crennell, S., and Taylor, G. (1995).
The three domains of a bacterial sialidase: a beta-propeller, an immunoglobulin module and a galactose-binding jelly-roll. Structure 3, 1197-1205.
Glaser, F., Pupko, T., Paz, I., Bell, R.E., Bechor-Shental, D., Martz, E., and Ben-Tal, N. (2003). ConSurf: identification of functional regions in proteins by surface-mapping of phylogenetic information. Bioinformatics 19, 163-164.
Gomis-Ruth, F.X. (2003). Structural aspects of the metzincin clan of metalloendopeptidases. Mol Biotechnol 24, 157-202.
Gomis-Ruth, F.X. (2009). Catalytic domain architecture of metzincin metalloproteases. J Biol Chem 284, 15353-15357.
Grams, F., Dive, V., Yiotakis, A., Yiallouros, I., Vassiliou, S., Zwilling, R., Bode, W., and Stocker, W. (1996). Structure of astacin with a transition-state analogue inhibitor. Nat Struct Biol 3, 671-675.
Grams, F., Huber, R., Kress, L.F., Moroder, L., and Bode, W. (1993). Activation of snake venom metalloproteinases by a cysteine switch-like mechanism. FEBS Lett 335, 76-80.
Grys, T.E., Siegel, M.B., Lathem, W.W., and Welch, R.A. (2005). The StcE protease contributes to intimate adherence of enterohemorrhagic Escherichia coli O157:H7 to host cells. Infect Immun 73, 1295-1303.
Grys, T.E., Walters, L.L., and Welch, R.A. (2006). Characterization of the StcE protease activity of Escherichia coli O157:H7. J Bacteriol 188, 4646-4653.
Guttman, J.A., Li, Y., Wickham, M.E., Deng, W., Vogl, A.W., and Finlay, B.B. (2006). Attaching and effacing pathogen-induced tight junction disruption in vivo. Cell Microbiol 8, 634-645.
Hang, H.C., and Bertozzi, C.R. (2005). The chemistry and biology of mucin-type O-linked glycosylation. Bioorg Med Chem 13, 5021-5034.
Harding, M.M. (2006). Small revisions to predicted distances around metal sites in proteins. Acta Crystallogr D Biol Crystallogr 62, 678-682.
Hashimoto, R., Fujitani, N., Takegawa, Y., Kurogochi, M., Matsushita, T., Naruchi, K., Ohyabu, N., Hinou, H., Gao, X.D., Manri, N., et al. (2011).
An efficient approach for the characterization of mucin-type glycopeptides: the effect of O-glycosylation on the conformation of synthetic mucin peptides. Chemistry 17, 2393-2404.
Ho, T.D., Davis, B.M., Ritchie, J.M., and Waldor, M.K. (2008). Type 2 secretion promotes enterohemorrhagic Escherichia coli adherence and intestinal colonization. Infect Immun 76, 1858-1865.
Holland, D.R., Hausrath, A.C., Juers, D., and Matthews, B.W. (1995). Structural analysis of zinc substitutions in the active site of thermolysin. Protein Sci 4, 1955-1965.
Holland, D.R., Tronrud, D.E., Pley, H.W., Flaherty, K.M., Stark, W., Jansonius, J.N., McKay, D.B., and Matthews, B.W. (1992). Structural comparison suggests that thermolysin and related neutral proteases undergo hinge-bending motion during catalysis. Biochemistry 31, 11310-11316.
Holm, L., and Rosenstrom, P. (2010). Dali server: conservation mapping in 3D. Nucleic Acids Res 38, W545-549.
Holmes, M.A., and Matthews, B.W. (1981). Binding of hydroxamic acid inhibitors to crystalline thermolysin suggests a pentacoordinate zinc intermediate in catalysis. Biochemistry 20, 6912-6920.
Holmes, M.A., and Matthews, B.W. (1982). Structure of thermolysin refined at 1.6 Å resolution. J Mol Biol 160, 623-639.
Holmes, N. (2006). CD45: all is not yet crystal clear. Immunology 117, 145-155.
Holmquist, B., and Vallee, B.L. (1976). Esterase activity of zinc neutral proteases. Biochemistry 15, 101-107.
Hooper, N.M. (1994). Families of zinc metalloproteases. FEBS Lett 354, 1-6.
Huntington, J.A. (2006). Shape-shifting serpins--advantages of a mobile mechanism. Trends Biochem Sci 31, 427-435.
Huntington, J.A., Read, R.J., and Carrell, R.W. (2000). Structure of a serpin-protease complex shows inhibition by deformation. Nature 407, 923-926.
Inouye, M., Inouye, S., and Zusman, D.R. (1979). Biosynthesis and self-assembly of protein S, a development-specific protein of Myxococcus xanthus. Proc Natl Acad Sci U S A 76, 209-213.
Jentoft, N. (1990).
Why are proteins O-glycosylated? Trends Biochem Sci 15, 291-294.
Jiang, H., Wagner, E., Zhang, H., and Frank, M.M. (2001). Complement 1 inhibitor is a regulator of the alternative complement pathway. J Exp Med 194, 1609-1616.
Kaper, J.B., Nataro, J.P., and Mobley, H.L. (2004). Pathogenic Escherichia coli. Nat Rev Microbiol 2, 123-140.
Kim, Y., Quartey, P., Li, H., Volkart, L., Hatzos, C., Chang, C., Nocek, B., Cuff, M., Osipiuk, J., Tan, K., et al. (2008). Large-scale evaluation of protein reductive methylation for improving protein crystallization. Nat Methods 5, 853-854.
Koles, K., van Berkel, P.H., Pieper, F.R., Nuijens, J.H., Mannesse, M.L., Vliegenthart, J.F., and Kamerling, J.P. (2004). N- and O-glycans of recombinant human C1 inhibitor expressed in the milk of transgenic rabbits. Glycobiology 14, 51-64.
Kurisu, G., Kinoshita, T., Sugimoto, A., Nagara, A., Kai, Y., Kasai, N., and Harada, S. (1997). Structure of the zinc endoprotease from Streptomyces caespitosus. J Biochem 121, 304-308.
Lambris, J.D., Ricklin, D., and Geisbrecht, B.V. (2008). Complement evasion by human pathogens. Nat Rev Microbiol 6, 132-142.
Landau, M., Mayrose, I., Rosenberg, Y., Glaser, F., Martz, E., Pupko, T., and Ben-Tal, N. (2005). ConSurf 2005: the projection of evolutionary conservation scores of residues on protein structures. Nucleic Acids Res 33, W299-302.
Lathem, W.W., Bergsbaken, T., and Welch, R.A. (2004). Potentiation of C1 esterase inhibitor by StcE, a metalloprotease secreted by Escherichia coli O157:H7. J Exp Med 199, 1077-1087.
Lathem, W.W., Grys, T.E., Witowski, S.E., Torres, A.G., Kaper, J.B., Tarr, P.I., and Welch, R.A. (2002). StcE, a metalloprotease secreted by Escherichia coli O157:H7, specifically cleaves C1 esterase inhibitor. Mol Microbiol 45, 277-288.
Lee, J.E., Fusco, M.L., Abelson, D.M., Hessell, A.J., Burton, D.R., and Saphire, E.O. (2009a). Techniques and tactics used in determining the structure of the trimeric ebolavirus glycoprotein.
Acta Crystallogr D Biol Crystallogr 65, 1162-1180.
Lee, J.E., Fusco, M.L., and Saphire, E.O. (2009b). An efficient platform for screening expression and crystallization of glycoproteins produced in human cells. Nat Protoc 4, 592-604.
Lees, J.A., Yip, C.K., Walz, T., and Hughson, F.M. (2010). Molecular organization of the COG vesicle tethering complex. Nat Struct Mol Biol 17, 1292-1297.
Li, Y., Frey, E., Mackenzie, A.M., and Finlay, B.B. (2000). Human response to Escherichia coli O157:H7 infection: antibodies to secreted virulence factors. Infect Immun 68, 5090-5095.
Linden, S.K., Sheng, Y.H., Every, A.L., Miles, K.M., Skoog, E.C., Florin, T.H., Sutton, P., and McGuckin, M.A. (2009). MUC1 limits Helicobacter pylori infection both by steric hindrance and by acting as a releasable decoy. PLoS Pathog 5, e1000617.
Linden, S.K., Sutton, P., Karlsson, N.G., Korolik, V., and McGuckin, M.A. (2008). Mucins in the mucosal barrier to infection. Mucosal Immunol 1, 183-197.
Luo, Y., Li, S.C., Chou, M.Y., Li, Y.T., and Luo, M. (1998). The crystal structure of an intramolecular trans-sialidase with a NeuAc alpha2-->3Gal specificity. Structure 6, 521-530.
Marchal, I., Jarvis, D.L., Cacan, R., and Verbert, A. (2001). Glycoproteins from insect cells: sialylated or not? Biol Chem 382, 151-159.
Matthews, B.W. (1988). Structural Basis of the Action of Thermolysin and Related Zinc Peptidases. Accounts Chem Res 21, 333-340.
Matthews, B.W., Jansonius, J.N., Colman, P.M., Schoenborn, B.P., and Dupourque, D. (1972). Three-dimensional structure of thermolysin. Nat New Biol 238, 37-41.
McAuley, K.E., Jia-Xing, Y., Dodson, E.J., Lehmbeck, J., Ostergaard, P.R., and Wilson, K.S. (2001). A quick solution: ab initio structure determination of a 19 kDa metalloproteinase using ACORN. Acta Crystallogr D Biol Crystallogr 57, 1571-1578.
McDaniel, T.K., Jarvis, K.G., Donnenberg, M.S., and Kaper, J.B. (1995).
A genetic locus of enterocyte effacement conserved among diverse enterobacterial pathogens. Proc Natl Acad Sci U S A 92, 1664-1668.
McGuckin, M.A., Linden, S.K., Sutton, P., and Florin, T.H. (2011). Mucin dynamics and enteric pathogens. Nat Rev Microbiol 9, 265-278.
Mills, E., Baruch, K., Charpentier, X., Kobi, S., and Rosenshine, I. (2008). Real-time analysis of effector translocation by the type III secretion system of enteropathogenic Escherichia coli. Cell Host Microbe 3, 104-113.
Monzingo, A.F., and Matthews, B.W. (1984). Binding of N-carboxymethyl dipeptide inhibitors to thermolysin determined by X-ray crystallography: a novel class of transition-state analogues for zinc peptidases. Biochemistry 23, 5724-5729.
Morgunova, E., Tuuttila, A., Bergmann, U., Isupov, M., Lindqvist, Y., Schneider, G., and Tryggvason, K. (1999). Structure of human pro-matrix metalloproteinase-2: activation mechanism revealed. Science 284, 1667-1670.
Murshudov, G.N., Vagin, A.A., and Dodson, E.J. (1997). Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr D Biol Crystallogr 53, 240-255.
Nataro, J.P., and Kaper, J.B. (1998). Diarrheagenic Escherichia coli. Clin Microbiol Rev 11, 142-201.
Nettleship, J.E., Assenberg, R., Diprose, J.M., Rahman-Huq, N., and Owens, R.J. (2010). Recent advances in the production of proteins in insect and mammalian cells for structural biology. J Struct Biol 172, 55-65.
Neutra, M.R., Mantis, N.J., Frey, A., and Giannasca, P.J. (1999). The composition and function of M cell apical membranes: implications for microbial pathogenesis. Semin Immunol 11, 171-181.
Oberholzer, A.E., Bumann, M., Hege, T., Russo, S., and Baumann, U. (2009). Metzincin's canonical methionine is responsible for the structural integrity of the zinc-binding site. Biol Chem 390, 875-881.
Odermatt, E., Berger, H., and Sano, Y. (1981). Size and shape of human C1-inhibitor. FEBS Lett 131, 283-285.
Ohi, M., Li, Y., Cheng, Y., and Walz, T. (2004).
Negative Staining and Image Classification - Powerful Tools in Modern Electron Microscopy. Biol Proced Online 6, 23-34.
Painter, J., and Merritt, E.A. (2006). Optimal description of a protein structure in terms of multiple groups undergoing TLS motion. Acta Crystallogr D Biol Crystallogr 62, 439-450.
Paton, A.W., and Paton, J.C. (2002). Reactivity of convalescent-phase hemolytic-uremic syndrome patient sera with the megaplasmid-encoded TagA protein of Shiga toxigenic Escherichia coli O157. J Clin Microbiol 40, 1395-1399.
Perkins, S.J., Smith, K.F., Amatayakul, S., Ashford, D., Rademacher, T.W., Dwek, R.A., Lachmann, P.J., and Harrison, R.A. (1990). Two-domain structure of the native and reactive centre cleaved forms of C1 inhibitor of human complement by neutron scattering. J Mol Biol 214, 751-763.
Petrotchenko, E.V., and Borchers, C.H. (2010). Crosslinking combined with mass spectrometry for structural proteomics. Mass Spectrom Rev 29, 862-876.
Pike, R.N., Bottomley, S.P., Irving, J.A., Bird, P.I., and Whisstock, J.C. (2002). Serpins: finely balanced conformational traps. IUBMB Life 54, 1-7.
Pillai, L., Sha, J., Erova, T.E., Fadl, A.A., Khajanchi, B.K., and Chopra, A.K. (2006). Molecular and functional characterization of a ToxR-regulated lipoprotein from a clinical isolate of Aeromonas hydrophila. Infect Immun 74, 3742-3755.
Prakobphol, A., Tangemann, K., Rosen, S.D., Hoover, C.I., Leffler, H., and Fisher, S.J. (1999). Separate oligosaccharide determinants mediate interactions of the low-molecular-weight salivary mucin with neutrophils and bacteria. Biochemistry 38, 6817-6825.
Pruzzo, C., Vezzulli, L., and Colwell, R.R. (2008). Global impact of Vibrio cholerae interactions with chitin. Environ Microbiol 10, 1400-1410.
Rawlings, N.D., Barrett, A.J., and Bateman, A. (2010). MEROPS: the peptidase database. Nucleic Acids Res 38, D227-233.
Rendon, M.A., Saldana, Z., Erdem, A.L., Monteiro-Neto, V., Vazquez, A., Kaper, J.B., Puente, J.L., and Giron, J.A. (2007).
Commensal and pathogenic Escherichia coli use a common pilus adherence factor for epithelial cell colonization. Proc Natl Acad Sci U S A 104, 10637-10642.
Rooijakkers, S.H., and van Strijp, J.A. (2007). Bacterial complement evasion. Mol Immunol 44, 23-32.
Ruiz-Perez, F., Wahid, R., Faherty, C.S., Kolappaswamy, K., Rodriguez, L., Santiago, A., Murphy, E., Cross, A., Sztein, M.B., and Nataro, J.P. (2011). Serine protease autotransporters from Shigella flexneri and pathogenic Escherichia coli target a broad range of leukocyte glycoproteins. Proc Natl Acad Sci U S A 108, 12881-12886.
Sanowar, S., Singh, P., Pfuetzner, R.A., Andre, I., Zheng, H., Spreter, T., Strynadka, N.C., Gonen, T., Baker, D., Goodlett, D.R., et al. (2010). Interactions of the transmembrane polymeric rings of the Salmonella enterica serovar Typhimurium type III secretion system. MBio 1, e00158-10.
Schechter, I., and Berger, A. (1967). On the size of the active site in proteases. I. Papain. Biochem Biophys Res Commun 27, 157-162.
Schoenberger, O.L. (1992). Characterization of carbohydrate chains of C1-inhibitor and of desialylated C1-inhibitor. FEBS Lett 314, 430-434.
Serruto, D., Rappuoli, R., Scarselli, M., Gros, P., and van Strijp, J.A. (2010). Molecular mechanisms of complement evasion: learning from staphylococci and meningococci. Nat Rev Microbiol 8, 393-399.
Seveau, S., Keller, H., Maxfield, F.R., Piller, F., and Halbwachs-Mecarelli, L. (2000). Neutrophil polarity and locomotion are associated with surface redistribution of leukosialin (CD43), an antiadhesive membrane molecule. Blood 95, 2462-2470.
Sham, H.P., Shames, S.R., Croxen, M.A., Ma, C., Chan, J.M., Khan, M.A., Wickham, M.E., Deng, W., Finlay, B.B., and Vallance, B.A. (2011). Attaching and effacing bacterial effector NleC suppresses epithelial inflammatory responses by inhibiting NF-kappaB and p38 mitogen-activated protein kinase activation. Infect Immun 79, 3552-3562.
Silva, A.J., Pham, K., and Benitez, J.A. (2003).
Haemagglutinin/protease expression and mucin gel penetration in El Tor biotype Vibrio cholerae. Microbiology 149, 1883-1891.
Sory, M.P., and Cornelis, G.R. (1994). Translocation of a hybrid YopE-adenylate cyclase from Yersinia enterocolitica into HeLa cells. Mol Microbiol 14, 583-594.
Spears, K.J., Roe, A.J., and Gally, D.L. (2006). A comparison of enteropathogenic and enterohaemorrhagic Escherichia coli pathogenesis. FEMS Microbiol Lett 255, 187-202.
Strecker, G., Ollier-Hartmann, M.P., van Halbeek, H., Vliegenthart, J.F., Montreuil, J., and Hartmann, L. (1985). [Primary structure of the glycan chains of normal C 1 esterase inhibitor (C 1-INH) after NMR analysis at 400 MHz]. C R Acad Sci III 301, 571-576.
Szabady, R.L., Lokuta, M.A., Walters, K.B., Huttenlocher, A., and Welch, R.A. (2009). Modulation of neutrophil function by a secreted mucinase of Escherichia coli O157:H7. PLoS Pathog 5, e1000320.
Szabady, R.L., Yanta, J.H., Halladin, D.K., Schofield, M.J., and Welch, R.A. (2011). TagA is a secreted protease of Vibrio cholerae that specifically cleaves mucin glycoproteins. Microbiology 157, 516-525.
Takeda, S., Igarashi, T., and Mori, H. (2007). Crystal structure of RVV-X: an example of evolutionary gain of specificity by ADAM proteinases. FEBS Lett 581, 5859-5864.
Tallant, C., Garcia-Castellanos, R., Seco, J., Baumann, U., and Gomis-Ruth, F.X. (2006). Molecular analysis of ulilysin, the structural prototype of a new family of metzincin metalloproteases. J Biol Chem 281, 17920-17928.
Tam, P.Y., and Verdugo, P. (1981). Control of mucus hydration as a Donnan equilibrium process. Nature 292, 340-342.
Tobe, T., Beatson, S.A., Taniguchi, H., Abe, H., Bailey, C.M., Fivian, A., Younis, R., Matthews, S., Marches, O., Frankel, G., et al. (2006). An extensive repertoire of type III secretion effectors in Escherichia coli O157 and the role of lambdoid phages in their dissemination. Proc Natl Acad Sci U S A 103, 14941-14946.
Tronrud, D.E., Monzingo, A.F., and Matthews, B.W. (1986). Crystallographic structural analysis of phosphoramidates as inhibitors and transition-state analogs of thermolysin. Eur J Biochem 157, 261-268.
Trott, O., and Olson, A.J. (2010). AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem 31, 455-461.
Vallance, B.A., Deng, W., Jacobson, K., and Finlay, B.B. (2003). Host susceptibility to the attaching and effacing bacterial pathogen Citrobacter rodentium. Infect Immun 71, 3443-3453.
Van den Steen, P., Rudd, P.M., Dwek, R.A., and Opdenakker, G. (1998). Concepts and principles of O-linked glycosylation. Crit Rev Biochem Mol Biol 33, 151-208.
Vazquez-Torres, A., and Fang, F.C. (2000). Cellular routes of invasion by enteropathogens. Curr Opin Microbiol 3, 54-59.
Velazquez-Campoy, A., Ohtaka, H., Nezami, A., Muzammil, S., and Freire, E. (2004). Isothermal titration calorimetry. Curr Protoc Cell Biol Chapter 17, Unit 17 18.
Veltman, O.R., Eijsink, V.G., Vriend, G., de Kreij, A., Venema, G., and Van den Burg, B. (1998). Probing catalytic hinge bending motions in thermolysin-like proteases by glycine --> alanine mutations. Biochemistry 37, 5305-5311.
Vonrhein, C., Blanc, E., Roversi, P., and Bricogne, G. (2007). Automated structure solution with autoSHARP. Methods Mol Biol 364, 215-230.
Walport, M.J. (2001). Complement. First of two parts. N Engl J Med 344, 1058-1066.
Walters, L.L., Raterman, E.L., Grys, T.E., and Welch, R.A. (2012). Atypical Shigella boydii 13 encodes virulence factors seen in attaching and effacing Escherichia coli. FEMS Microbiol Lett 328, 20-25.
Wermter, C., Howel, M., Hintze, V., Bombosch, B., Aufenvenne, K., Yiallouros, I., and Stocker, W. (2007). The protease domain of procollagen C-proteinase (BMP1) lacks substrate selectivity, which is conferred by non-proteolytic domains. Biol Chem 388, 513-521.
Wiesmann, C., Katschke, K.J., Yin, J., Helmy, K.Y., Steffek, M., Fairbrother, W.J., McCallum, S.A., Embuscado, L., DeForge, L., Hass, P.E., et al. (2006). Structure of C3b in complex with CRIg gives insights into regulation of complement activation. Nature 444, 217-220.
Wolff, M.W., Zhang, F., Roberg, J.J., Caldwell, E.E., Kaul, P.R., Serrahn, J.N., Murhammer, D.W., Linhardt, R.J., and Weiler, J.M. (2001). Expression of C1 esterase inhibitor by the baculovirus expression vector system: preparation, purification, and characterization. Protein Expr Purif 22, 414-421.
Worrall, L.J., Lameignere, E., and Strynadka, N.C. (2011). Structural overview of the bacterial injectisome. Curr Opin Microbiol 14, 3-8.
Xu, R., McBride, R., Nycholat, C.M., Paulson, J.C., and Wilson, I.A. (2012). Structural Characterization of the Hemagglutinin Receptor Specificity from the 2009 H1N1 Influenza Pandemic. J Virol 86, 982-990.
Yonekura, K., Maki-Yonekura, S., and Namba, K. (2003). Complete atomic model of the bacterial flagellar filament by electron cryomicroscopy. Nature 424, 643-650.
Zarivach, R., Vuckovic, M., Deng, W., Finlay, B.B., and Strynadka, N.C. (2007). Structural analysis of a prototypical ATPase from the type III secretion system. Nat Struct Mol Biol 14, 131-137.

Appendices

Appendix A  The virulence-associated type III secretion system

Evolutionarily related to the flagellum, the type III secretion system (T3SS) is a surface organelle associated with the bacterial envelope that mediates the delivery of bacterial virulence proteins into host cells. The translocated effector proteins often disrupt key eukaryotic cellular processes to allow intimate bacterial attachment and subsequent infection (Galan and Wolf-Watz, 2006).
As such, the T3SS is essential to the virulence of microorganisms that utilize this protein export pathway, and sequencing of microbial genomes revealed that it is widespread among Gram-negative pathogens, including Salmonella typhimurium, Shigella flexneri, Yersinia enterocolitica, enteropathogenic and enterohemorrhagic E. coli (EPEC and EHEC, respectively), Chlamydia and Pseudomonas aeruginosa (Pallen et al., 2005).

A.1 The type III secretion apparatus

The type III secretion apparatus is structurally and functionally conserved across species. It is composed of approximately twenty different proteins that assemble into a macromolecular complex anchored in the bacterial envelope (Table A.1 and Figure A.1). It spans the bacterial inner and outer membranes and the peptidoglycan layer, and reaches the eukaryotic plasma membrane at its distal end, providing a continuous passage for the transfer of bacterial effectors directly into the host cytoplasm (Moraes et al., 2008). Indeed, electron microscopy imaging of purified secretion complexes from S. typhimurium, S. flexneri and EPEC revealed a syringe-like structure with a multi-ring base and a hollow needle-like extension that is poised to function as an “injectisome” (Sani et al., 2007; Schraidt and Marlovits, 2011; Sekiya et al., 2001). The needle is further stabilized by an inner rod; together they form a 20-30 Å-wide channel that traverses the body of the base to serve as a conduit for protein translocation (Marlovits et al., 2004). The diameter of this channel dictates that protein substrates destined for type III secretion must be at least partially unfolded, as it cannot accommodate the dimensions of native globular proteins.
It is believed that dedicated type III chaperone proteins interact with the N-terminal chaperone-binding domains of their cognate substrates and keep them in a non-globular, secretion-competent state to facilitate their translocation through the needle complex (Galan and Wolf-Watz, 2006).

Table A.1 Components of the bacterial virulence-associated type III secretion systems

Figure A.1 Structural overview of the type III secretion apparatus. High-resolution structures of the membrane ring components (EscC, PrgH and EscJ) and the needle (MxiH) are modeled into the EM maps of the Salmonella needle complex and Shigella needle, respectively, based on the orientations determined in the modeling studies by Spreter et al. (2009) and Dean and Kenny (2009). Positions of the remaining structures are for illustration only. The injectisome traverses the bacterial inner membrane (IM), peptidoglycan (PG), outer membrane (OM), extracellular space and the host membrane (HM). Nomenclature for the proteins from different species is listed in Table A.1. The figure is adapted from Worrall et al. (2011). Reprinted with permission of Current Opinion in Microbiology.

A.2 Assembly

Previous genetic and biochemical studies have shown that assembly of the needle complex proceeds in a hierarchical manner, starting with the insertion of multimeric rings in the outer and inner membranes (Figure A.2) (Diepold et al., 2010; Sukhan et al., 2001). More specifically, recent fluorescent labeling experiments suggest that assembly occurs through two independent pathways, starting with the construction of separate outer membrane and inner membrane platforms, which merge via a periplasmic intermediate (Figure A.2) (Diepold et al., 2010; Diepold et al., 2011). The Sec protein transport pathway, which involves the recognition of N-terminal signal sequences, is required for this process, as well as for the integration of the export apparatus in the inner membrane.
Figure A.2 Model for the step-wise assembly of the T3SS. The OM branch starts with the assembly of the YscC secretin ring, followed by YscD and YscJ to form the structural foundation of the needle complex. The IM branch begins with the association of the export apparatus components YscR, YscS and YscT, which induce YscV oligomerization and the attachment of YscJ. YscJ is believed to be the intermediate that links the OM and IM platforms. Recruitment of the ATPase-C ring complex completes the export apparatus, rendering the nascent T3SS competent for exporting the needle subunits.

The subsequent addition of the ATPase-C ring complex to the cytoplasmic face of the secretion apparatus completes the export apparatus. At this stage, the T3SS becomes competent to export the inner rod and needle subunits, which polymerize in the extracellular space (Figure A.2). Once the needle complex is fully assembled, the secretion apparatus undergoes substrate specificity switching and is ready to secrete the pore-forming translocators and effector proteins upon contact with the host cell membrane (Hakansson et al., 1996).

A.3 Extracellular components: needle, needle extension and translocon

The major extracellular component of the type III secretion apparatus is the needle, which is a helical polymer composed of ~100-150 copies of a small protein from the PrgI/YscF/MxiH family (from Salmonella typhimurium, Yersinia spp. and Shigella spp., respectively, as shown in Table A.1) (Cordes et al., 2003). High-resolution structures of the needle subunit from several homologues reveal a conserved architecture: a central coiled-coil helical bundle and a flexible C-terminal helix that is involved in subunit polymerization, as suggested by molecular docking of MxiH into the electron microscopy (EM) reconstruction of the Shigella needle complex (Deane et al., 2006).
Mapping of the needle mutants onto the model of the intact filament channel indicates that changes in the subunit interface can lead to aberrant secretion phenotypes (Blocker et al., 2008). This is consistent with the idea that the signal of host cell contact is transmitted through subtle conformational changes along the needle to regulate the translocation of effectors. The length of the needle is tightly controlled within each species. The current “molecular ruler/tape-measure” model holds that proteins of the InvJ/YscP/Spa32 family determine the appropriate needle length through full extension of their unstructured “ruler” domains (Journet et al., 2003). In support of this model, a more recent study identified an inverse relationship between needle length and the amount of secondary structure, specifically the α-helical content, in the needle-length control protein (Journet et al., 2003; Wagner et al., 2009).

At the distal end of the needle complex is the active tip complex. It is assembled through the sequential recruitment of a hydrophilic needle tip protein and two hydrophobic translocators, triggered by the sensing of small molecules present in the host environment, such as bile salts and membrane lipids, respectively (Epler et al., 2009; Stensrud et al., 2008). The tip protein acts as an extension of the needle and serves as a platform for translocon assembly. Hetero-oligomerization of the SipB/YopB/IpaB and SipC/YopD/IpaC families of translocator proteins generates a pore-forming complex, which inserts into the host plasma membrane to initiate the process of protein translocation. Recent crystal structures of the N-terminal translocator fragments from Salmonella SipB and Shigella IpaB revealed extended coiled-coil motifs that resemble those identified in the needle and tip proteins (Barta et al., 2012).
This highlights the role of intramolecular coiled-coils as potential structural scaffolds that facilitate the oligomerization of the translocon above the needle-tip complex, although the exact stoichiometry of the two translocator proteins remains unclear. Translocon proteins isolated from the membranes of sheep red blood cells that had been infected with live Yersinia enterocolitica were observed to form a multimeric 500- to 700-kDa complex, but the composition of the oligomeric species could not be determined (Montagner et al., 2011).

A.4 Outer membrane structure

The core of the T3SS that provides a structural foundation for anchoring the extracellular needle is the basal body. It consists mainly of three proteins that oligomerize into ring-like structures in the bacterial envelope: the InvG/YscC/MxiD family of outer membrane secretins, and PrgK/YscJ/MxiJ and PrgH/YscD/MxiG in the inner membrane (Figure A.2 and Table A.1).

The sole outer membrane component of the T3SS is the homomultimeric secretin ring, which functions as a gateway for the passage of proteins across the bacterial outer membrane. Comparison of the EM reconstructions of the Salmonella base substructure and the fully assembled needle complex revealed that, prior to the deployment of the needle subunits, the channel of the secretin ring is sealed by a “septum” (Marlovits et al., 2004). Conformational changes in the outer membrane protein, along with overall structural remodeling of the secretion apparatus, are required to open the central pore and transform it into a structural platform for the assembly of the inner rod and needle. Proteins of the secretin family share a common modular architecture with a C-terminal “secretin homology domain” and a less conserved N-terminal domain (Figure A.3).
EM studies on PulD, the prototypical secretin of the type II general secretory pathway, defined the approximate localization of the two domains, suggesting that a protease-resistant C-terminal region is anchored in the outer membrane while the more variable N-terminal domain resides in the periplasm to confer specific interactions with other periplasmic or inner membrane protein components (Chami et al., 2005).

Figure A.3 Domain organization of T3SS outer and inner membrane ring components. Shown are the domain boundaries of representative members from the InvG/YscC/MxiD, PrgK/YscJ/MxiJ and PrgH/YscD/MxiG families. Enteropathogenic E. coli (EPEC) EscC contains an N-terminal periplasmic domain and a highly conserved protease-resistant C-terminal domain that is embedded in the outer membrane. EPEC EscJ is attached to the inner membrane via N-terminal lipidation and a C-terminal transmembrane (TM) helix. Salmonella PrgH contains a predicted transmembrane region that anchors the N-terminal cytoplasmic domain and C-terminal periplasmic region to the inner membrane.

A.5 Inner membrane structure

Genetic and biochemical analyses of the S. typhimurium T3SS revealed that PrgK and PrgH constitute the major inner membrane components of the needle complex (Figure A.2 and Table A.1) (Kimbrough and Miller, 2000; Sukhan et al., 2001). They can oligomerize into ring-shaped complexes independently of other T3SS components, suggesting that they serve as a structural platform to anchor the outer membrane rings and the export apparatus to enable secretion. Interestingly, recent fluorescent labeling studies and complementary pull-down assays of T3SS mutants have indicated that YscJ, the PrgK homologue from Yersinia, may act as an adaptor to link the outer and inner membrane substructures of the secretion apparatus (Diepold et al., 2011).
YscJ belongs to the highly conserved PrgK/YscJ/MxiJ family of lipoproteins, which attach to the outer leaflet of the inner membrane through lipidation of a conserved N-terminal cysteine residue and a C-terminal transmembrane helix (Figure A.3). As revealed by the crystal structure of EscJ, the PrgK homologue from EPEC, the lipoprotein has the propensity to form a 24-subunit symmetrical ring structure, which molecular modeling and docking show to be suitably localized to the periplasmic side of the inner membrane rings in the context of the EM density map for the homologous S. typhimurium T3SS (Yip et al., 2005).

The second inner membrane component of the needle complex is the PrgH/YscD/MxiG family of bitopic membrane proteins (Figure A.2 and Table A.1). Members of this family consist of an N-terminal cytoplasmic domain and a larger C-terminal periplasmic domain that is tethered to the inner membrane through a transmembrane helix (Figure A.3). Bioinformatic analysis predicted that the N-terminal regions of members from the PrgH family fold as forkhead-associated (FHA) domains, which are often found in regulatory proteins involved in signal transduction (Pallen et al., 2002), suggesting that PrgH may act to relay environmental signals across the bacterial membranes to control type III secretion. This may be mediated in part by the periplasmic domain of PrgH, which is proposed to form a ring structure that surrounds PrgK in the inner membrane (Yip et al., 2005), although its precise topological arrangement and localization within the intact needle complex were unclear at the time I undertook the PrgH project. To gain further insight into the structure and adaptor function of PrgH, I overexpressed and purified its major periplasmic domain, which encompasses residues 170-362. I eventually obtained diffracting crystals of PrgH(170-362), and the structure was solved by the SAD technique and refined to 2.3 Å by Dr. Calvin Yip.
The modular domain topology of the periplasmic domain of PrgH resembles that observed in EscJ, the other essential inner membrane ring component of the type III basal body. The striking structural similarities between the two ring-forming proteins led us to propose that the oligomerization of PrgH in the inner membrane may be mediated by a conserved ring-building motif that is also present in EscJ (Spreter et al., 2009).

A.6 Structural characterization of PrgH periplasmic domain

The crystal structure of PrgH(170-362) reveals a modular domain architecture (Spreter et al., 2009). It consists of three topologically similar α/β domains (domains I, II and III) that are spatially arranged to adopt an overall “boot-shaped” structure (Figure A.4A). Each domain contains a central three-stranded β-sheet, flanked on one side by two α-helices of similar length to the β-strands.

Figure A.4 PrgH(170-362) shares a conserved ring-building motif with EscJ(21-190). (A) Overall architecture of PrgH(170-362). The monomer consists of three topologically similar α/β domains (I, II and III). The N- and C-termini are denoted by “N” and “C”, respectively. (B) Superposition of domain II of PrgH(170-362) onto EscJ(21-190) (colored in green) reveals a conserved α/β module, which has been observed to mediate ring formation in EscJ.

A structural homology search using the Dali server, which compares the query with all entries in the Protein Data Bank (PDB) (Holm and Rosenstrom, 2010), revealed surprising structural similarities between EscJ and the PrgH periplasmic domain (Figure A.4B). Although the two inner membrane proteins lack sequence identity, discrete domains of PrgH(170-362) and the periplasmic region of EscJ, EscJ(21-190), can be superimposed onto one another with a Z-score of 6.3 (similarities with a Z-score below 2 are considered insignificant).
Given that this set of unique topological features mediates inter-subunit interactions in EscJ, a conserved structural motif may be involved in promoting the oligomerization of PrgH upon its integration into the inner membrane.

A.7 Implications of PrgH periplasmic domain structure

In an attempt to better define the localization and topology of PrgH relative to other basal body components in the intact T3SS assembly, the crystal structure of PrgH(170-362) was docked into the EM density map of the S. typhimurium needle complex (Spreter et al., 2009). The resulting PrgH ring model suitably accounted for the density regions directly adjacent to EscJ(21-190), in agreement with its expected periplasmic location and the buried nature of EscJ in the fully assembled secretion apparatus (Figure A.5) (Yip et al., 2005). The docked orientation was later shown to be consistent with the experimentally validated position, where the nanogold-labeled PrgH C-terminus was observed to occupy the region on the periplasmic face of the inner membrane (Schraidt et al., 2010).

Figure A.5 Basal body components of the T3SS. Modeling of the crystal structures of EscC(21-174), PrgH(170-362) and EscJ(21-190) into the Salmonella typhimurium EM density map (PDB codes: 3GR5, 3GR0 and 1YJ7, respectively). The boxed area defines the C-terminal protease-resistant “secretin homology domain” as identified for the type II secretin PulD. The proposed position of PrgH(170-362) is suitable for forming a ring around the oligomeric EscJ(21-190). Note that EscC(21-174) docks into a region that reaches into the periplasmic space. The figure is adapted from Spreter et al., 2009.

Interestingly, the crystal structure of the N-terminal periplasmic domain of EscC, EscC(21-174), also reveals the characteristic α/β topology conserved in EscJ(21-190) and PrgH(170-362), suggesting that a common ring-building motif which drives the assembly of the EscJ ring may also mediate EscC oligomerization (Figure A.4).
Based on this assumption, a ring model for EscC(21-174) was generated and its positioning in the EM reconstruction of the S. typhimurium injectisome was guided by data from limited proteolysis and surface biotinylation experiments (Spreter et al., 2009). The predicted position reaches further into the periplasmic region of the basal body than previously thought, suggesting a close contact interface between the secretin and the inner membrane components (Figure A.5). This provides the first insight into how the outer and inner membrane rings may connect with one another after their integration into the membranes. In support of this finding, recent EM and cross-linking experiments have identified specific interactions between the periplasmic domains of the outer and inner membrane components, namely the N-terminal domain of the S. typhimurium secretin InvG and the C-terminal domain of PrgH (Sanowar et al., 2010; Schraidt et al., 2010).

A.8 Structure of MxiG cytoplasmic domain

While the C-terminal domain of PrgH is engaged in intimate contact with the periplasmic regions of the outer membrane secretin InvG and PrgK, which is anchored in the inner membrane, its N-terminal cytoplasmic domain is predicted to fold as an FHA domain, a phosphothreonine recognition module often found in regulatory proteins involved in cell signaling (Pallen et al., 2002). Indeed, recent high-resolution structures of this domain from the Shigella flexneri homologue reveal that MxiG(1-126) adopts an FHA fold (Figure A.6) (Barison et al., 2012; McDowell et al., 2011). The cytoplasmic domain of the inner membrane protein MxiG was shown to interact with Spa33, the Shigella T3SS C-ring (cytoplasmic ring) that has been proposed to act as a sorting platform to establish the type III secretion hierarchy through differential recognition of the various chaperone-effector complexes (Lara-Tejero et al., 2011).
Mutations of the conserved residues involved in phosphothreonine binding (MxiG Ser63 and Arg39) reduced T3SS activity and impaired epithelial cell invasion by S. flexneri, suggesting that the PrgH/YscD/MxiG family of inner membrane proteins may function as a receptor that senses environmental stimuli and transmits the signal through its periplasmic, transmembrane and cytoplasmic regions to contribute to the regulation of T3SS (Barison et al., 2012).

Figure A.6 MxiG(1-126) adopts an FHA fold. Secondary structure representation of the forkhead-associated fold adopted by MxiG(1-126), with the conserved phosphothreonine binding residues (Arg39 and Ser63) highlighted in ball-and-stick form (PDB 4A4Y; Barison et al., 2012). The N- and C-termini are denoted as “N” and “C.”

A.9 Export apparatus

At the base of the type III secretion apparatus is a group of highly conserved integral membrane proteins (YscR, YscS, YscT, YscU and YscV from Yersinia) that together form the core of the export apparatus, which is thought to act as a gated docking platform to control the passage of substrate proteins across the inner membrane to the needle-like channel (Figure A.2 and Table A.1). As visualized by EM analysis of the purified S. typhimurium needle complex, they form a defined “cup-like” substructure within the inner membrane rings of the basal body (Wagner et al., 2010). Co-immunoprecipitation and fluorescent labeling studies have shown that, besides acting as a portal for protein translocation, a subset of the export apparatus components likely serves as a nucleation point for the sequential assembly of the inner membrane components of the needle complex. Starting with the subunit association between proteins from the SpaP/YscR/Spa24, SpaQ/YscS/Spa9 and SpaR/YscT/Spa29 families, the complex induces the oligomerization of InvA/YscV/MxiA in the inner membrane.
Specifically, the stable anchoring of the YscV oligomer in the Yersinia injectisome was found to require the attachment of the YscJ inner membrane ring, which is believed to fuse the outer and inner membrane substructures of the secretion apparatus via its periplasmic domain, as both assembly intermediates can recruit YscJ (Figure A.2) (Diepold et al., 2011). Subsequent peripheral association of the ATPase-C ring complex on the basal face of the needle complex completes the export apparatus, which becomes poised to harness the energy from ATP hydrolysis and the proton motive force to dissociate chaperone-effector complexes to prime the substrates for translocation (Akeda and Galan, 2005; Paul et al., 2008).

Another integral component of the export apparatus that regulates type III secretion is the SpaS/YscU/Spa40 family of proteins (Table A.1). It has been proposed to act as a “molecular substrate specificity switch” that controls the hierarchical delivery of the needle and translocator proteins, as well as downstream effectors, through an intein-like auto-cleavage mechanism (Figure A.2). As revealed by crystal structures of several homologues from this family, the post-translational cleavage results in changes in the conformation of a highly conserved cytoplasmic loop and the surrounding electrostatic features, which potentially provide a unique binding interface for the recognition of the translocon components to allow their timely secretion (Sorg et al., 2007; Zarivach et al., 2008).

Appendix B References

Akeda, Y., and Galan, J.E. (2005). Chaperone release and unfolding of substrates in type III secretion. Nature 437, 911-915.
Barison, N., Lambers, J., Hurwitz, R., and Kolbe, M. (2012). Interaction of MxiG with the cytosolic complex of the type III secretion system controls Shigella virulence. FASEB J.
Blocker, A.J., Deane, J.E., Veenendaal, A.K., Roversi, P., Hodgkinson, J.L., Johnson, S., and Lea, S.M. (2008).
What's the point of the type III secretion system needle? Proc Natl Acad Sci U S A 105, 6507-6513.
Chami, M., Guilvout, I., Gregorini, M., Remigy, H.W., Muller, S.A., Valerio, M., Engel, A., Pugsley, A.P., and Bayan, N. (2005). Structural insights into the secretin PulD and its trypsin-resistant core. J Biol Chem 280, 37732-37741.
Cordes, F.S., Komoriya, K., Larquet, E., Yang, S., Egelman, E.H., Blocker, A., and Lea, S.M. (2003). Helical structure of the needle of the type III secretion system of Shigella flexneri. J Biol Chem 278, 17103-17107.
Deane, J.E., Roversi, P., Cordes, F.S., Johnson, S., Kenjale, R., Daniell, S., Booy, F., Picking, W.D., Picking, W.L., Blocker, A.J., et al. (2006). Molecular model of a type III secretion system needle: Implications for host-cell sensing. Proc Natl Acad Sci U S A 103, 12529-12533.
Diepold, A., Amstutz, M., Abel, S., Sorg, I., Jenal, U., and Cornelis, G.R. (2010). Deciphering the assembly of the Yersinia type III secretion injectisome. EMBO J 29, 1928-1940.
Diepold, A., Wiesand, U., and Cornelis, G.R. (2011). The assembly of the export apparatus (YscR,S,T,U,V) of the Yersinia type III secretion apparatus occurs independently of other structural components and involves the formation of an YscV oligomer. Mol Microbiol 82, 502-514.
Epler, C.R., Dickenson, N.E., Olive, A.J., Picking, W.L., and Picking, W.D. (2009). Liposomes recruit IpaC to the Shigella flexneri type III secretion apparatus needle as a final step in secretion induction. Infect Immun 77, 2754-2761.
Galan, J.E., and Wolf-Watz, H. (2006). Protein delivery into eukaryotic cells by type III secretion machines. Nature 444, 567-573.
Hakansson, S., Schesser, K., Persson, C., Galyov, E.E., Rosqvist, R., Homble, F., and Wolf-Watz, H. (1996). The YopB protein of Yersinia pseudotuberculosis is essential for the translocation of Yop effector proteins across the target cell plasma membrane and displays a contact-dependent membrane disrupting activity.
EMBO J 15, 5812-5823.
Holm, L., and Rosenstrom, P. (2010). Dali server: conservation mapping in 3D. Nucleic Acids Res 38, W545-549.
Journet, L., Agrain, C., Broz, P., and Cornelis, G.R. (2003). The needle length of bacterial injectisomes is determined by a molecular ruler. Science 302, 1757-1760.
Kimbrough, T.G., and Miller, S.I. (2000). Contribution of Salmonella typhimurium type III secretion components to needle complex formation. Proc Natl Acad Sci U S A 97, 11008-11013.
Lara-Tejero, M., Kato, J., Wagner, S., Liu, X., and Galan, J.E. (2011). A sorting platform determines the order of protein secretion in bacterial type III systems. Science 331, 1188-1191.
Marlovits, T.C., Kubori, T., Sukhan, A., Thomas, D.R., Galan, J.E., and Unger, V.M. (2004). Structural insights into the assembly of the type III secretion needle complex. Science 306, 1040-1042.
McDowell, M.A., Johnson, S., Deane, J.E., Cheung, M., Roehrich, A.D., Blocker, A.J., McDonnell, J.M., and Lea, S.M. (2011). Structural and functional studies on the N-terminal domain of the Shigella type III secretion protein MxiG. J Biol Chem 286, 30606-30614.
Montagner, C., Arquint, C., and Cornelis, G.R. (2011). Translocators YopB and YopD from Yersinia enterocolitica form a multimeric integral membrane complex in eukaryotic cell membranes. J Bacteriol 193, 6923-6928.
Moraes, T.F., Spreter, T., and Strynadka, N.C. (2008). Piecing together the type III injectisome of bacterial pathogens. Curr Opin Struct Biol 18, 258-266.
Pallen, M., Chaudhuri, R., and Khan, A. (2002). Bacterial FHA domains: neglected players in the phospho-threonine signalling game? Trends Microbiol 10, 556-563.
Pallen, M.J., Beatson, S.A., and Bailey, C.M. (2005). Bioinformatics, genomics and evolution of non-flagellar type-III secretion systems: a Darwinian perspective. FEMS Microbiol Rev 29, 201-229.
Paul, K., Erhardt, M., Hirano, T., Blair, D.F., and Hughes, K.T. (2008). Energy source of flagellar type III secretion. Nature 451, 489-492.
Sani, M., Allaoui, A., Fusetti, F., Oostergetel, G.T., Keegstra, W., and Boekema, E.J. (2007). Structural organization of the needle complex of the type III secretion apparatus of Shigella flexneri. Micron 38, 291-301.
Sanowar, S., Singh, P., Pfuetzner, R.A., Andre, I., Zheng, H., Spreter, T., Strynadka, N.C., Gonen, T., Baker, D., Goodlett, D.R., et al. (2010). Interactions of the transmembrane polymeric rings of the Salmonella enterica serovar Typhimurium type III secretion system. MBio 1, e00158-10.
Schraidt, O., Lefebre, M.D., Brunner, M.J., Schmied, W.H., Schmidt, A., Radics, J., Mechtler, K., Galan, J.E., and Marlovits, T.C. (2010). Topology and organization of the Salmonella typhimurium type III secretion needle complex components. PLoS Pathog 6, e1000824.
Schraidt, O., and Marlovits, T.C. (2011). Three-dimensional model of Salmonella's needle complex at subnanometer resolution. Science 331, 1192-1195.
Sekiya, K., Ohishi, M., Ogino, T., Tamano, K., Sasakawa, C., and Abe, A. (2001). Supermolecular structure of the enteropathogenic Escherichia coli type III secretion system and its direct interaction with the EspA-sheath-like structure. Proc Natl Acad Sci U S A 98, 11638-11643.
Sorg, I., Wagner, S., Amstutz, M., Muller, S.A., Broz, P., Lussi, Y., Engel, A., and Cornelis, G.R. (2007). YscU recognizes translocators as export substrates of the Yersinia injectisome. EMBO J 26, 3015-3024.
Spreter, T., Yip, C.K., Sanowar, S., Andre, I., Kimbrough, T.G., Vuckovic, M., Pfuetzner, R.A., Deng, W., Yu, A.C., Finlay, B.B., et al. (2009). A conserved structural motif mediates formation of the periplasmic rings in the type III secretion system. Nat Struct Mol Biol 16, 468-476.
Stensrud, K.F., Adam, P.R., La Mar, C.D., Olive, A.J., Lushington, G.H., Sudharsan, R., Shelton, N.L., Givens, R.S., Picking, W.L., and Picking, W.D. (2008).
Deoxycholate interacts with IpaD of Shigella flexneri in inducing the recruitment of IpaB to the type III secretion apparatus needle tip. J Biol Chem 283, 18646-18654.
Sukhan, A., Kubori, T., Wilson, J., and Galan, J.E. (2001). Genetic analysis of assembly of the Salmonella enterica serovar Typhimurium type III secretion-associated needle complex. J Bacteriol 183, 1159-1167.
Wagner, S., Konigsmaier, L., Lara-Tejero, M., Lefebre, M., Marlovits, T.C., and Galan, J.E. (2010). Organization and coordinated assembly of the type III secretion export apparatus. Proc Natl Acad Sci U S A 107, 17745-17750.
Wagner, S., Sorg, I., Degiacomi, M., Journet, L., Dal Peraro, M., and Cornelis, G.R. (2009). The helical content of the YscP molecular ruler determines the length of the Yersinia injectisome. Mol Microbiol 71, 692-701.
Yip, C.K., Kimbrough, T.G., Felise, H.B., Vuckovic, M., Thomas, N.A., Pfuetzner, R.A., Frey, E.A., Finlay, B.B., Miller, S.I., and Strynadka, N.C. (2005). Structural characterization of the molecular platform for type III secretion system assembly. Nature 435, 702-707.
Zarivach, R., Deng, W., Vuckovic, M., Felise, H.B., Nguyen, H.V., Miller, S.I., Finlay, B.B., and Strynadka, N.C. (2008). Structural analysis of the essential self-cleaving type III secretion proteins EscU and SpaS. Nature 453, 124-127.

