UBC Theses and Dissertations


Discovery and engineering of carbohydrate-modifying enzymes using targeted high-throughput approaches Mehr, Kevin 2016

DISCOVERY AND ENGINEERING OF CARBOHYDRATE-MODIFYING ENZYMES USING TARGETED HIGH-THROUGHPUT APPROACHES

by

Kevin Mehr

B.Sc.(Honours), Thompson Rivers University, 2008

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Genome Science and Technology)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

December 2016

© Kevin Mehr, 2016

Abstract

Carbohydrate-modifying enzymes can be used for the enzymatic synthesis or cleavage of complex glycans. In this study, various high-throughput assays were assessed as methods for the discovery of glycoside hydrolases (GHs) and glycosyltransferases (GTs) within environmental samples. Fluorescence-activated cell sorting (FACS) was evaluated as an approach for the functional enrichment of sialyltransferase (ST) genes in environmental samples. A model ST from Campylobacter jejuni, CstI, was successfully enriched from a mixture of genomic DNA (proof-of-principle), but active STs could not be isolated from a set of metagenomic libraries. The same FACS screen was applied to the directed evolution of the multifunctional sialyltransferase from Pasteurella multocida, PmST1, in order to isolate mutants with reduced sialidase activity and improved synthetic efficiency over the wild-type enzyme. Sialidase activity (kcat/KM) of the best mutants was reduced approximately 2-fold, with an approximately 2.5-fold increase in the sialyltransferase activity (kcat/KM). Despite these improvements, the maximum product yield of the mutants did not increase appreciably. While engineering PmST1, a study of the sialidase and trans-sialidase mechanisms of PmST1 and other STs from glycosyltransferase family 80 was also undertaken. The mechanisms of both these activities were found to follow a reversible sialylation path, differing from that previously proposed in the literature.
A high-throughput plate-based assay was also evaluated as a functional screen for the identification of blood antigen-cleaving enzymes within the human gut microbiome. One GH enzyme from Bacteroides vulgatus (BvGH109) was found to be capable of converting the blood type A antigen into blood type O, offering a new enzyme for the engineering of universal donor blood. Two enzymes with α-N-acetylgalactosaminidase activity were also isolated and determined to represent a new sub-family of the GH family 31.  iii  Preface  All of the research presented in this thesis was designed by the author in collaboration with Dr. Stephen Withers. All research experiments and analysis of the research data were performed by the author.  A version of chapter 5 has been published. Kevin Mehr and Stephen G. Withers (2015) Mechanisms of the sialidase and trans-sialidase activities of bacterial sialyltransferases from glycosyltransferase family 80. Glycobiology. 26: 353-359. I conducted all of the experiments and authored the manuscript in collaboration with Dr. Withers.  The work presented in Chapter 3 required approval from UBC’s Research Ethics Board for Human Ethics. The certificate number of the approved application is H15-02967.    iv  Table of Contents  Abstract .......................................................................................................................................... ii Preface ........................................................................................................................................... iii Table of Contents ......................................................................................................................... iv List of Tables ............................................................................................................................... xii List of Figures ............................................................................................................................. 
xiii List of Abbreviations ................................................................................................................ xvii Acknowledgements .................................................................................................................. xviii Dedication ................................................................................................................................... xix Chapter 1: Introduction ................................................................................................................1 1.1 Carbohydrates in Nature ................................................................................................. 1 1.2 Carbohydrate-Active Enzymes ....................................................................................... 2 1.2.1 Overview of types ....................................................................................................... 2 1.2.2 GHs ............................................................................................................................. 4 1.2.3 GTs .............................................................................................................................. 9 1.3 Metagenomics ............................................................................................................... 12 1.3.1 A Brief History of Metagenomics............................................................................. 12 1.3.2 Sequence-Based Metagenomics................................................................................ 13 1.3.3 Functional Metagenomics ......................................................................................... 14 1.4 High-Throughput Biology ............................................................................................ 16 1.4.1 General Overview ..................................................................................................... 
16 1.4.2 High-Throughput Screening of CAZymes................................................................ 17 v  1.5 Aims of Thesis .............................................................................................................. 18 1.5.1 Research Questions ................................................................................................... 18 1.5.2 Hypotheses and Specific Aims ................................................................................. 19 1.5.2.1 Hypothesis (1) ................................................................................................... 19 1.5.2.2 Hypothesis (2) ................................................................................................... 20 1.5.2.3 Hypothesis (3) ................................................................................................... 20 Chapter 2: Functional Metagenomic Screening for Glycosyltransferases .............................22 2.1 Introduction ................................................................................................................... 22 2.1.1 Sialic Acid and Sialyltransferases ............................................................................. 22 2.1.2 Glycosyltransferase Screening .................................................................................. 24 2.1.3 Fluorescence Activated Cell Sorting (FACS) ........................................................... 26 2.1.4 Fluorescent Encapsulation of ST Activity ................................................................ 27 2.2 Rationale and Research Goals ...................................................................................... 28 2.3 Development of Screen (Proof-of-Principle) ................................................................ 29 2.3.1 Screening Strain ........................................................................................................ 
29 2.3.2 Expression Vector ..................................................................................................... 29 2.3.3 Library Construction (Campylobacter jejuni Genomic Library) .............................. 33 2.3.4 FACS Enrichment ..................................................................................................... 33 2.3.5 Optimizing Enrichment Efficiency ........................................................................... 35 2.3.6 The Effect of Library Complexity on Enrichment Efficiency Determined by qPCR41 2.4 Functional Metagenomic Screening ............................................................................. 43 2.4.1 Sampling ................................................................................................................... 43 2.4.2 Initial Construction of Libraries................................................................................ 44 vi  2.4.3 FACS Enrichments ................................................................................................... 44 2.4.4 Secondary Screening ................................................................................................. 45 2.4.5 Additional Library Constructions and FACS Enrichments ...................................... 46 2.4.6 Additional Secondary Screening............................................................................... 47 2.5 Limit of FACS Enrichments ......................................................................................... 47 2.5.1 Construction of Dosed Libraries ............................................................................... 47 2.5.2 Determination of Limit of Enrichment ..................................................................... 48 2.6 Conclusions and Future Work ...................................................................................... 
52 Chapter 3: Functional Metagenomic Screening of the Human Gut Microbiome .................53 3.1 Introduction ................................................................................................................... 53 3.1.1 Functional Metagenomic Screening of GHs ............................................................. 53 3.1.2 Blood Antigen-Cleaving Enzymes ........................................................................... 54 3.1.3 The Human Gut as a CAZyme Source ..................................................................... 55 3.2 Rationale and Research Goals ...................................................................................... 56 3.3 Development of Screen ................................................................................................. 57 3.3.1 Screening Approach .................................................................................................. 57 3.3.2 Screening Strain Background Activity ..................................................................... 58 3.3.3 384-Well Screen Validation ...................................................................................... 60 3.4 Library Construction ..................................................................................................... 62 3.5 Plate Screening of Gut Microbiome Libraries .............................................................. 64 3.6 Initial Hit Characterization ........................................................................................... 66 3.6.1 Hit Isolation and Storage .......................................................................................... 66 3.6.2 Determination of Hydrolytic Activity on Blood Type A Antigen ............................ 67 vii  3.6.2.1 N-Acetylgalactosaminidase Coupled Assay on Blood Type A Antigen .......... 67 3.6.2.2 Coupled Assay Results ..................................................................................... 
68 3.6.3 Sequence Analysis .................................................................................................... 69 3.7 Sub-Cloning, Expression, and Activity Verification .................................................... 73 3.8 Characterization of BpGH31(E2) and BcGH31(P19) ................................................... 76 3.8.1 Substrate Specificity ................................................................................................. 76 3.8.2 Classification............................................................................................................. 79 3.8.3 Kinetics ..................................................................................................................... 80 3.8.4 pH Optima ................................................................................................................. 80 3.8.5 Stereochemical Outcome of Hydrolysis ................................................................... 81 3.8.6 Potential Function in Nature ..................................................................................... 83 3.9 Characterization of BvGH109 (P5) ............................................................................... 85 3.9.1 Mechanism ................................................................................................................ 85 3.9.2 Kinetics ..................................................................................................................... 86 3.10 Conclusion and Future Work ........................................................................................ 87 Chapter 4: Directed Evolution of the Sialyltransferase PmST1 ..............................................89 4.1 Introduction ................................................................................................................... 89 4.1.1 α-2,3/2,6 Sialyltransferase from Pasteurella multocida (PmST1) ........................... 
89 4.2 Rationale and Research Goals ...................................................................................... 91 4.3 Mutagenesis and Library Construction ......................................................................... 92 4.4 FACS Sorting ................................................................................................................ 92 4.5 Enriched Mutant Analysis............................................................................................. 94 4.6 Mutagenesis (Round 2) ................................................................................................. 97 viii  4.7 FACS Sorting (Round 2) .............................................................................................. 98 4.8 Enriched Mutant Analysis (Round 2) ........................................................................... 99 4.9 Sub-Cloning, Expression, and Purification ................................................................. 102 4.10 Characterization .......................................................................................................... 103 4.10.1 Kinetics (Coupled Assay) ................................................................................... 103 4.10.2 Development of HPAE-PAD Method ................................................................ 103 4.10.3 Sialyltransferase Kinetics.................................................................................... 105 4.10.4 Sialidase Kinetics ................................................................................................ 106 4.10.5 Synthetic Competency ........................................................................................ 106 4.11 Conclusions ................................................................................................................. 
109 Chapter 5: Mechanisms of the Sialidase and Trans-Sialidase Activities of Bacterial Glycosyltransferases from Family GT80 .................................................................................111 5.1 Introduction ................................................................................................................. 111 5.2 Rationale and Research Goals .................................................................................... 114 5.3 Effect of CMP on the Sialidase Rate of PmST1 ......................................................... 114 5.4 Determination of Kinetic Competence for Reverse Sialylation by PmST1 ............... 117 5.5 Detection of CMP-Neu5Ac Formation by Coupled Enzyme Assay .......................... 118 5.6 Demonstration of CMP Contamination in Trans-Sialidase Reactions ....................... 125 5.7 Discussion ................................................................................................................... 129 5.8 Conclusion .................................................................................................................. 130 Chapter 6: Conclusion ...............................................................................................................132 Chapter 7: Experimental ...........................................................................................................136 7.1 Common Methods ....................................................................................................... 136 ix  7.1.1 DNA Sequencing .................................................................................................... 136 7.1.2 Protein Structure Analysis ...................................................................................... 136 7.1.3 Primer List .............................................................................................................. 
136 7.2 Functional Metagenomic Screening of Glycosyltransferases ..................................... 138 7.2.1 Materials ................................................................................................................. 138 7.2.2 Construction of pDUAL and pDUAL2................................................................... 138 7.2.3 Verification of pDUAL and pDUAL2 .................................................................... 138 7.2.4 Campylobacter jejuni Genomic Library Construction ........................................... 139 7.2.5 Screening of Campylobacter Genomic DNA Libraries by FACS .......................... 140 7.2.6 Quantification of CstI by Colony PCR ................................................................... 141 7.2.7 Verification of CstI by Activity Assay (in vitro) .................................................... 142 7.2.8 Quantification of CstI by Quantitative PCT (qPCR) .............................................. 143 7.2.9 Sampling ................................................................................................................. 143 7.2.10 Environmental Library Construction .................................................................. 144 7.2.11 FACS of Environmental Libraries ...................................................................... 145 7.2.12 Secondary Screening of Environmental Libraries .............................................. 145 7.2.13 Determination of FACS Limit of Enrichment .................................................... 146 7.3 Functional Metagenomic Screening of the Human Gut Microbiome ......................... 147 7.3.1 Background Activity of Screening Strain ............................................................... 147 7.3.2 384-Well Screen Validation .................................................................................... 
148 7.3.3 Sampling and DNA extraction ................................................................................ 148 7.3.4 Small-Insert Library Construction .......................................................................... 150 7.3.4.1 Acoustic Shearing and Blunt-End Ligation .................................................... 150 x  7.3.4.2 Restriction Enzyme Digestion and Ligation ................................................... 150 7.3.4.3 Library Picking and Storage ........................................................................... 151 7.3.5 Screening................................................................................................................. 151 7.3.6 N-Acetylgalactosaminidase Coupled Assay on Blood Type A Antigen ................ 152 7.3.7 Hit Sequence Analysis ............................................................................................ 153 7.3.8 Expression and Purification of Screening Hits ....................................................... 153 7.3.9 BpGH31(E2) and BcGH31(P19) Substrate Specificity Assay................................ 155 7.3.10 BpGH31(E2) and BcGH31(P19) Competition Assay ......................................... 155 7.3.11 Estimation of BcGH31(P19) and BpGH31(E2) pH Optima ............................... 155 7.3.12 Determination of α-N-acetylgalactosaminidase Stereochemical Outcome of Hydrolysis by BpGH31(E2) and BcGH31(P19) ................................................................. 156 7.3.13 Kinetics ............................................................................................................... 156 7.3.13.1 BpGH31(E2) and BcGH31(P19)................................................................. 156 7.3.13.2 BvGH109 and EmGH109 ............................................................................ 157 7.4 Directed Evolution of the Sialyltransferase PmST1 ................................................... 
158 7.4.1 Protein Sequences ................................................................................................... 158 7.4.2 Mutagenesis and Library Construction ................................................................... 158 7.4.3 FACS Sorting .......................................................................................................... 160 7.4.4 Activity Verification (in vitro) ................................................................................ 161 7.4.5 Mutant Analysis (in vivo)........................................................................................ 162 7.4.6 Cloning and Protein Purification ............................................................................ 163 7.4.7 HPAE-PAD Optimization ....................................................................................... 163 7.4.8 Kinetics ................................................................................................................... 164 xi  7.5 Mechanisms of the Sialidase and Trans-Sialidase Activities of Bacterial Glycosyltransferases from the Family GT80 .......................................................................... 165 7.5.1 Materials ................................................................................................................. 165 7.5.2 Protein Sequences ................................................................................................... 166 7.5.3 Cloning and Protein Purification ............................................................................ 167 7.5.4 Effect of CMP on Sialidase Rate of PmST1 ........................................................... 168 7.5.5 Donor Hydrolysis Kinetics ..................................................................................... 169 7.5.6 Coupled Enzyme Assays and CMP Removal ......................................................... 
170 7.5.7 Synthesis of α-2,6-Sialyllactose .............................................................................. 171 Bibliography ...............................................................................................................................172 Appendices ..................................................................................................................................187 Appendix A - FACS instrument setup .................................................................................... 187 Appendix B - TLC Data .......................................................................................................... 188 Appendix C – NMR data ........................................................................................................ 189  xii  List of Tables Table 2.1 - Estimation of CstI gene abundance in Campylobacter sp. genomic library by colony PCR ....................................................................................................................................... 35 Table 3.1 - Background activity of screening strain. .................................................................... 59 Table 3.2 - Testing substrate specificity of BpGH31 and BcGH31. ............................................. 79 Table 3.3 - Kinetic data for BpGH31 and BcGH31 ...................................................................... 80 Table 3.4 - Kinetic comparison of BvGH109 and EmGH109 on the MU-Type2Atetra substrate . 87 Table 4.1 - Summary of PmST1 mutations following screening.................................................. 97 Table 4.2 - Summary of PmST1 mutations present in top fluorescing cells (Round 2) ............. 100 Table 4.3 - Summary of sialyltransferase kinetic data for PmST1 mutants. .............................. 105 Table 4.4 - Summary of sialidase kinetics for PmST1 mutants .................................................. 
106 Table 4.5 - Synthetic competency of PmST1 mutants ................................................................ 109 Table 5.1 - The effect of CMP on the sialidase rates of PmST1 acting on α-2,3-Neu5Ac-Gal-pNP............................................................................................................................................. 115  xiii  List of Figures Figure 1.1 - Summary of retaining mechanism for a β-glycosidase. .............................................. 5 Figure 1.2 - Summary of inverting mechanism for β-glycosidase. ................................................ 6 Figure 1.3 - Summary of NAD-dependent hydrolysis mechanism................................................. 7 Figure 1.4 - Catalytic mechanisms of GH bond formation. ............................................................ 9 Figure 1.5 - Summary of glycosyltransferase mechanisms. ......................................................... 11 Figure 2.1 - Glycosyltransferase-mediated encapsulation of fluorescence in vivo. ...................... 28 Figure 2.2 - Verification of bi-directional expression systems.. ................................................... 32 Figure 2.3 - FACS histograms of Campylobacter jejuni genomic library. ................................... 34 Figure 2.4 - FACS histograms of Campylobacter jejuni genomic library using retransformation method................................................................................................................................... 38 Figure 2.5 - FACS histograms of Campylobacter jejuni genomic library using solid media method................................................................................................................................... 39 Figure 2.6 - Comparison of fluorescence uniformity between liquid and solid media growth/expression methods .................................................................................................. 
40 Figure 2.7 – qPCR Measurement of CstI Gene Abundance though FACS Sorting. .................... 42 Figure 2.8 – Relative fluorescence of metagenomic libraries through three rounds of FACS enrichment............................................................................................................................. 45 Figure 2.9 – Analysis of seawater metagenomic library clones for sialyltransferase activity. ..... 46 Figure 2.10 - Relative fluorescence of dosed libraries through three rounds of FACS enrichment................................................................................................................................................ 49 Figure 2.11 - Summary of qPCR data for dosed libraries. ............................................................ 50 Figure 2.12 - Summary of relative CstI gene abundance in library through FACS enrichment. . 51 xiv  Figure 3.1 - Blood group antigens and screening substrates. ....................................................... 58 Figure 3.2 - Plate screen validation............................................................................................... 61 Figure 3.3 - Production of small-insert libraries. Purified metagenomic DNA was partially digested with Sau3AI and size-selected on agarose gel (Left), cloned into the small high copy expression vector pHSG396 (Middle), then transformed into ReplicatorFOS E. coli cells for plating and library storage (Right). ......................................................................... 64 Figure 3.4 - Type O library plate screen results. ........................................................................... 65 Figure 3.5 - Type A library plate screen ....................................................................................... 
66 Figure 3.6 – Overview of N-acetylgalactosaminidase coupled assay on blood antigen A substrate................................................................................................................................................ 68 Figure 3.7 - Type A antigen coupled assay results. ...................................................................... 69 Figure 3.8 – Sequence identity of screening hits to annotated GH31 genes from GenPept. ........ 71 Figure 3.9 - Phylogenetic tree analysis of characterized GH31 enzymes..................................... 72 Figure 3.10 - SDS-PAGE of purified hits. .................................................................................... 74 Figure 3.11 - SDS-PAGE protein expression test for BvGH109 (P5) .......................................... 76 Figure 3.12 - Inhibitory effect of 2F-pNP-β-Gal on alpha-N-acetylgalactosaminidase activity. . 78 Figure 3.13 - pH activity profiles of BpGH31(E2) and BcGH31(P19) ........................................ 81 Figure 3.14 – Determination of hydrolysis anomeric stereochemical outcome 1H NMR.. .......... 82 Figure 3.15 - Depiction of an N- and O-glycosylated glycan ....................................................... 83 Figure 3.16 - Genes surrounding BcGH31(P19) in the Bacteroides caccae reference genome. .. 84 Figure 4.1 - Structure of PmST1 enzyme from Pasteurella multocida bound with CMP............ 89 Figure 4.2 - FACS sorting data summary ..................................................................................... 94 Figure 4.3 – Post-FACS results.. .................................................................................................. 96 xv  Figure 4.4 - FACS sorting data summary (Round 2). ................................................................... 98 Figure 4.5 - Fluorescence retention following FACS (Round 2).................................................. 
99 Figure 4.6 - Structure of PmST1 with location of mutations ...................................................... 102 Figure 4.7 - High-performance anion-exchange chromatography with pulsed amperometric detection (HPAE-PAD) for sialyltransfer and sialidase rate determination. ...................... 104 Figure 4.8 - Comparison of sialyltransferase product yields. reaction, flash-frozen, and submitted for analysis to HPAE-PAD to determine the product concentration.. ................................ 108 Figure 5.1 - Mechanism of the trans-sialidase (TcTs) from Trypanozoma cruzi. ...................... 112 Figure 5.2 - General reaction schemes for sialidase/trans-sialidase activities of sialyltransferases............................................................................................................................................. 113 Figure 5.3 - Coupled assay for the detection of sialidase activity. ............................................. 115 Figure 5.4 - The activation of PmST1 sialidase activity is specific to CMP. ............................. 116 Figure 5.5 - Initial rates of Neu5Ac release from 2,3-Neu5Ac-Gal-pNP by PmST1. ................ 117 Figure 5.6 - TLC data for the coupled enzyme assay of PmST1 (coupled assay results). ......... 120 Figure 5.7 - Detection of CMP-Neu5Ac formation by bacterial sialyltransferases (reversible sialylation ............................................................................................................................ 121 Figure 5.8 - TLC data for the coupled enzyme assay of Pd2,6ST (all results).. ......................... 122 Figure 5.9 - TLC data for the coupled enzyme assay of Psp2,6ST (coupled assay results). ...... 123 Figure 5.10 - TLC data for the coupled enzyme assay of Psp2,6ST (negative controls). .......... 
Figure 5.11 - TLC data for the effect of CMP removal on sialidase/trans-sialidase activity (PmST1)
Figure 5.12 - TLC data for the effect of CMP removal on sialidase/trans-sialidase activity (Pd2,6ST)
Figure 5.13 - TLC data for the effect of CMP removal on sialidase/trans-sialidase activity (Psp2,6ST)
Figure 5.14 - Summary of the effect of CMP removal on trans-sialidase activity

List of Abbreviations

CMP: cytidine 5'-monophosphate
CMP-Neu5Ac: cytidine 5'-monophosphate-N-acetylneuraminic acid
CSTI: monofunctional α-2,3-sialyltransferase from Campylobacter jejuni
CSTII: multifunctional α-2,3/2,8-sialyltransferase from Campylobacter jejuni
GT: glycosyltransferase
HPAE-PAD: high-performance anion-exchange chromatography with pulsed amperometric detection
HEPES: 4-(2-hydroxyethyl)-1-piperazine ethanesulfonic acid
Lac*: β-lactose-C2-BODIPY
LC: liquid chromatography
MES: 2-(N-morpholino)ethanesulfonic acid
Neu5Ac: N-acetylneuraminic acid
Pd2,6ST: α-2,6-sialyltransferase from Photobacterium damselae
PmST1: α-2,3-sialyltransferase from Pasteurella multocida
pNP: para-nitrophenol
Psp2,6ST: α-2,6-sialyltransferase from Photobacterium sp. JT-ISH-224
Sia: sialic acid
ST: sialyltransferase
TLC: thin-layer chromatography

Acknowledgements

Tremendous gratitude to my intrepid supervisor, Dr. Steve Withers, for his endless patience, wisdom, and support. Thanks to all of my wonderful lab mates, especially Zach Armstrong and Emily Kwan, for being such great colleagues over these past few years.
I thank my wonderful wife for all of her love and support throughout this challenging endeavor. Special thanks to my friends and family, whose love and support kept me smiling.

Dedication

Dedicated to my wonderful family, and science (I suppose).

Chapter 1: Introduction

1.1 Carbohydrates in Nature

Sugars are everywhere in nature, forming the foundation of all life on Earth. Photosynthesis, the most important biochemical process on our planet, is performed by plants, algae and cyanobacteria, which use energy from the sun to combine carbon dioxide and water into sugars.1 Much of the sugar produced in these organisms ends up as cellulose or starch, larger polymers of the individual sugar molecule (monosaccharide) glucose. These sugar polymers are called polysaccharides, or more generally referred to as glycans or carbohydrates. The popular term “sugar” is often used interchangeably with “carbohydrate,” although carbohydrates are better defined in chemical terms as any compound containing carbon, hydrogen and oxygen and having twice as many hydrogen atoms as oxygen (C[H2O]n).2 Large polymers such as starch, cellulose, and glycogen are used in metabolic pathways to provide animals with energy,3 as well as to provide plants, and materials such as wood, with structural support. Starch in particular is utilized by humans, who can break it down into glucose, the principal fuel for our metabolic pathways. The metabolism of almost all living things generates energy through the breakdown of glucose into carbon dioxide and water, which can then be reformed into sugars (fixation) through photosynthesis. The four fundamental classes of macromolecules that make up living systems are proteins, lipids, nucleic acids, and polysaccharides. Therefore, understanding the structures and functions of carbohydrates is crucial to understanding biology. Carbohydrates, or glycans, are ubiquitous in nature and function in many more roles than energy production.
The cell membranes of all living cells are coated with glycans or have glycans acting as essential constituents of cell walls. They play critical roles in molecular recognition, cell signalling, immunity, and inflammation. One striking example of this is how glycosylated cell surface molecules define the ABO blood groups of humans. This determines the blood type of individuals, as well as how their immune systems will react to blood of alternate types.4 The attachment of carbohydrates to proteins also affects the amount of time they circulate in blood, affecting the efficiency of therapeutic compounds. The manipulation and study of carbohydrates and their roles in biology requires understanding of the enzymes that construct the various carbohydrate-containing molecules in nature.

1.2 Carbohydrate-Active Enzymes

1.2.1 Overview of types

Carbohydrate-active enzymes (CAZymes) include any enzymes which degrade or create glycosidic bonds. It has been estimated that CAZymes make up 1-3% of all proteins encoded by the genomes of most organisms.5 CAZymes include glycoside hydrolases (GHs), glycosyltransferases (GTs), polysaccharide lyases (PLs) and carbohydrate esterases (CEs). Sequence and structural information for these enzymes is held in the Carbohydrate-Active Enzyme database (CAZy), built and maintained by the Universite de Provence, Marseille, France (www.cazy.org). All of the enzymes within CAZy are organized into distinct families and classes based on sequence and structural similarity. Currently, the database includes information on over 270 families of GHs, GTs, PLs, and CEs. Enzymatic formation and cleavage of the glycosidic linkages between sugars and other groups can occur by hydrolysis to release a free sugar (GHs), by transfer of an activated donor sugar to an acceptor to form a new glycoside (GTs), or by elimination to produce unsaturated sugar products (PLs).
Additionally, enzymatic cleavage can occur by phosphorolysis to give a sugar-1-phosphate (phosphorylases are classified with GHs or GTs depending on structure and mechanism). The de-O- or de-N-acylation of substituted sugars is performed by CEs.

For well over a century there has been a great effort to synthesize carbohydrates in the laboratory. While chemical approaches to carbohydrate synthesis have been developed extensively, the construction of complex carbohydrates and glycoconjugates in the laboratory remains a challenging endeavor. Chemical synthesis of carbohydrates requires a regioselective reaction at a particular position of the sugar unit: the hydroxyl group at that position must be distinguished from all the other hydroxyl groups in the structure, which have similar properties. Additionally, the linkage formation between sugars must proceed in a stereoselective manner, since each linkage can form as either of two stereoisomers, of which only one will be preferred. Many carbohydrates are found linked to proteins and lipids in nature, further complicating chemical synthesis. In light of these difficulties for chemical synthesis of carbohydrates, researchers have turned to biological synthesis, in which naturally-occurring enzymes are used either in vivo or in vitro to synthesize complex carbohydrates. Reactions such as these performed in vitro are also referred to as chemoenzymatic syntheses, and offer a straightforward approach to obtaining desired carbohydrates. Currently, GHs and GTs are the major classes of biocatalysts available for the biological synthesis of polysaccharides and oligosaccharides.6 The widespread availability of these enzymes, as well as their efficient synthetic capabilities, has led to their extensive use both academically and commercially. As such, these enzymes are the focus of the work presented in this thesis.
1.2.2 GHs

Glycoside hydrolases (GHs), or glycosidases, are present within almost all living organisms with the minor exceptions of some Archaea and a few unicellular parasitic eukaryotes.5,7 These enzymes catalyze a diverse set of reactions, with most being divided into two groups based on the resulting anomeric stereochemistry of their products. GHs are classified mechanistically as either inverting or retaining enzymes. Retaining GHs catalyze hydrolysis with a net retention of configuration at the anomeric centre. The catalytic mechanism of retaining GHs was first proposed by Koshland over 60 years ago.8 The mechanism occurs as a double displacement with two steps: glycosylation and deglycosylation (Figure 1.1). In the first step, the enzyme is glycosylated by the concerted action of the carboxylates of two residues, Asp or Glu, that are found on opposite sides of the enzyme active site. One of these residues functions as a nucleophile, attacking the anomeric carbon in order to displace the aglycon. In concert with the nucleophilic attack, the other residue acts as an acid catalyst, donating a proton to the glycosidic oxygen as the bond is cleaved. This step is referred to as the glycosylation step and leads to the formation of a covalently linked glycosyl-enzyme intermediate that has an anomeric configuration opposite to that of the starting material. The second step of this reaction, the deglycosylation step, involves the hydrolytic breakdown of the glycosyl-enzyme intermediate.9 The carboxylate that first functioned as an acid catalyst now acts as a base by abstracting a proton from the incoming nucleophile, usually a water molecule. Simultaneously, the water molecule attacks the carbohydrate-enzyme linkage in the reverse of the first step. Both steps pass through an oxocarbenium ion-like transition state. At the end of the reaction, a hemi-acetal is formed with the same anomeric configuration as the starting material (retaining).
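The stereochemical bookkeeping behind the terms "retaining" and "inverting" can be sketched in a few lines of Python. This is only an illustrative model, not part of the thesis work: the function name `anomeric_outcome` and the string labels are assumptions introduced here, and the model treats every displacement at the anomeric centre as a simple inversion, ignoring all other chemistry.

```python
def anomeric_outcome(start: str, displacements: int) -> str:
    """Return the anomeric configuration after a number of displacement steps.

    Simplifying assumption: each displacement at the anomeric centre
    inverts the configuration (alpha <-> beta).
    """
    flipped = {"alpha": "beta", "beta": "alpha"}
    if start not in flipped:
        raise ValueError("start must be 'alpha' or 'beta'")
    config = start
    for _ in range(displacements):
        config = flipped[config]
    return config


# Retaining GH acting on a beta-glycoside: glycosylation (first displacement)
# gives the covalent glycosyl-enzyme intermediate, and deglycosylation
# (second displacement) gives the product.
retaining_intermediate = anomeric_outcome("beta", 1)
retaining_product = anomeric_outcome("beta", 2)

# Inverting GH acting on the same substrate: one displacement by water.
inverting_product = anomeric_outcome("beta", 1)

print(retaining_intermediate, retaining_product, inverting_product)
```

Under this simplified model, two successive inversions return the starting configuration, which is why the double-displacement mechanism gives net retention while a single displacement gives inversion.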
Figure 1.1 - Summary of retaining mechanism for a β-glycosidase.

In contrast to the retaining mechanism, the inverting mechanism (Figure 1.2) is completed in a single step and does not involve the formation of any covalent enzyme intermediates during catalysis. Inverting GHs are proposed to act via a single-displacement mechanism in which one of the two carboxylate residues present in the active site protonates the glycosidic oxygen atom while the other coordinates the nucleophile (a water molecule) to assist its deprotonation, thereby completing the hydrolysis reaction.10 This single-step reaction also passes through an oxocarbenium ion-like transition state, as shown in Figure 1.2.

Figure 1.2 - Summary of inverting mechanism for β-glycosidase.

While these are the principal mechanisms employed by GHs, some other mechanisms are also found, including the interesting NAD-dependent hydrolysis mechanism employed by enzymes in GH families 4 and 109. In this mechanism, a cofactor (NAD+) is positioned to remove the hydride from carbon C3 of the substrate and thus oxidise C3 to a ketone (Figure 1.3).11,12 The mechanism then proceeds through a number of anionic transition states with elimination and redox steps. Following oxidation of the hydroxyl at C3, the acidity of the C2 proton is increased such that an E1cB elimination can occur, aided by an enzymatic base. Concurrently, an enzymatic acid donates a proton to the glycosidic oxygen as the bond is cleaved. The α,β-unsaturated intermediate that forms then undergoes a 1,4-Michael-like addition of water at the anomeric carbon, with C2 re-protonated with the help of an enzymatic acid. Finally, the ketone at C3 is reduced by NADH to generate the free sugar product. A hemi-acetal is formed with the same anomeric configuration as the starting material (retaining).
Despite glycosidic bond cleavage occurring through an elimination reaction, the overall reaction is that of hydrolysis.

Figure 1.3 - Summary of NAD-dependent hydrolysis mechanism.

GHs can also be classified based on their ability to cleave a substrate at the end of a polysaccharide chain (usually the non-reducing end) or in the middle of the chain. The terms exo- and endo- refer to these abilities, respectively. GHs, and all CAZymes, are also classified based on the specific chemical reaction they perform using an Enzyme Commission (EC) number. These numbers can only be applied to enzymes for which the function has been characterized biochemically. Also, while enzymes might share a common mechanism, they may not share the same EC number if they are using different substrates. Ultimately, the classification used to describe a GH depends on the context in which it is being used or studied.

In addition to their ability to hydrolyze glycosidic linkages, GHs can be used under the right conditions for the reverse reaction and formation of bonds. This is called transglycosylation13 and generally requires high concentrations of substrate. The catalytic mechanism of transglycosylation is depicted in Figure 1.4A. As in the previously described GH mechanisms, the first step leads to the departure of the aglycon group and the formation of a covalent glycosyl-enzyme intermediate. The second step of the reaction involves the attack of the carbohydrate-enzyme linkage by another sugar molecule, and proton transfer from the sugar to the active site acid/base carboxylate. This transglycosylation reaction employs inexpensive simple sugars as glycoside donor molecules, leading to industrial interest in employing these enzymes for biotechnological synthesis.14 Unfortunately, the yields for these transglycosylation reactions are typically low because the product formed can be readily hydrolyzed.
Attempts can be made to displace the equilibrium of the glycosidic bond formation using high substrate concentrations or by using activated glycosyl donors such as aryl glycosides, but the transglycosylation reaction by GHs is still limited in efficiency.15 To overcome limitations of transglycosylation by native GHs, mutations can be made within the active site to enhance the rate of transglycosylation. The classic example of this is the glycosynthases. The competing hydrolytic activity has been removed in these enzymes through the mutation of the catalytic nucleophile into small non-nucleophilic residues such as alanine, glycine, or serine. These enzymes are still able to catalyze transglycosylation to acceptor molecules by using glycosyl fluorides of opposite anomeric configuration to the natural substrate as donors.16,17 This mechanism is depicted in Figure 1.4B. The first glycosynthase enzyme was reported in 1998 by Withers and colleagues.18 Since then, many other glycosynthases with distinct substrate specificities, varying reaction mechanisms, and high yields have been developed.16,19,20 As useful as GHs can be for either the targeted hydrolysis or synthesis of glycosidic bonds, enzymes with the desired stereo- and regio-specificities must first be isolated and characterized. Hence, the discovery and characterization of new GHs or GHs with desired specificities is an important endeavor for developing efficient chemoenzymatic processes.

Figure 1.4 - Catalytic mechanisms of GH bond formation.
1.2.3 GTs

Glycosyltransferases (GTs) are responsible for building complex glycoconjugates through the regio- and stereo-specific transfer of sugars to a wide range of acceptor molecules including glycans, peptides, and lipids.21 GTs can utilize a range of donor substrates, including nucleotide mono- or di-phosphate sugars and lipid phosphosugars.22 GT enzymes using nucleotide phosphosugars are referred to as Leloir GTs, and enzymes using non-nucleotide donors such as lipid phosphosugars or sugar-1-phosphates are referred to as non-Leloir GTs. All GTs catalyze the transfer of the glycosyl donor to a nucleophilic group such as an alcohol. Generally, the mechanism of GTs involves a concerted loss of the nucleotide phosphate or lipid phosphosugar and coupling to an appropriate acceptor molecule. This reaction can result in a product with either an inversion or retention of anomeric stereochemistry. Most GTs require a divalent cation (Mg2+ or Mn2+) to stabilize the leaving nucleotide phosphate, but some enzymes are metal-independent. The mechanisms of inverting and retaining GTs are summarized in Figure 1.5. The mechanism of inverting GTs proceeds through a single nucleophilic step enabled by an enzymatic base. For retaining GTs, an SNi process as depicted in Figure 1.5B is supported by most evidence.23 A mechanism similar to the double-displacement Koshland mechanism of GHs has also been proposed (Figure 1.5C), but attempts to verify the characteristic glycosyl-enzyme intermediate for wild-type GTs have consistently met with failure. Along with mechanistic classification, GTs have been classified into families and classes according to amino acid sequence similarities as part of the CAZy database. Currently, there are over 260,000 GT sequences divided into 100 families in the database.
Like GHs and glycosynthases, GTs can be utilized for the chemoenzymatic synthesis of a wide range of oligosaccharides and glycoconjugates.20 Similarly though, many GT enzymes with desired specificities are not readily available to researchers, indicating the need for discovery and characterization of new GTs from natural sources.

Figure 1.5 - Summary of glycosyltransferase mechanisms.

1.3 Metagenomics

1.3.1 A Brief History of Metagenomics

Microorganisms comprise the most abundant and diverse group of living things found on Earth. Capable of thriving in the most challenging environments, microbes have developed an enormous variety of enzymatic activities and metabolic pathways in order to survive. These organisms also produce a vast array of complex primary and secondary metabolites, many of which have been harnessed for industrial, pharmaceutical, and bioenergy processes.24,25 Historically, the analysis of a microbe would require obtaining it in pure culture, but these standard culturing techniques allow for less than 1% of the diverse bacteria from environmental samples to be grown, isolated, and studied.26 In light of these cultivation challenges, a complementary technique was developed in which the bacteria in environmental samples could be investigated in a culture-independent fashion. The prospect of analyzing natural microbial populations independently from culture by studying ribosomal RNA (rRNA) was first suggested by Pace et al. in 1985.27 This was further developed by Carl Woese and George E. Fox in 1990,28 who famously led the way in using 16S rRNA to determine bacterial phylogenies. Found within all bacteria and archaea is a section of DNA coding for the 16S rRNA, a component of the 30S small subunit of prokaryotic ribosomes. This gene evolved slowly, making it ideal for phylogenetic and evolutionary analysis of bacteria.
This method of bacterial classification was first implemented for metagenomics in 1991 by Schmidt et al.29, who obtained DNA from a marine picoplankton community and analyzed the 16S rRNA present in the sample. This launched the field of metagenomics, leading to numerous studies of bacterial diversity in environmental samples30-39. Advances in sequencing technology have moved metagenomics beyond simple 16S rRNA analysis to complete sequencing of all DNA within an environmental sample. One famous example of this was the sequencing of the Sargasso Sea metagenome in 2004 by Venter et al.40 Along with sequence-based metagenomic studies, activity-based or “functional” analysis of metagenomic libraries has grown in frequency, allowing for the expression and characterization of genes within environmental DNA libraries. Most metagenomic studies can therefore be categorized as either sequence-based or activity-based (functional) metagenomics.

1.3.2 Sequence-Based Metagenomics

Sequence-based metagenomics relies on next-generation sequencing (NGS) technologies to analyze the genetic makeup of microbial communities. NGS is a broad term encompassing a number of modern sequencing technologies that allow cheap and efficient DNA sequencing compared to the traditional Sanger sequencing approach. The most common forms of NGS are Illumina sequencing, Roche 454 sequencing, Ion Torrent or Ion Proton sequencing, and SOLiD (Sequencing by Oligonucleotide Ligation and Detection). As previously mentioned, early studies with these technologies involved initial amplification of genes of interest (e.g. the 16S rRNA gene) by PCR. This reduced the sequencing burden while still allowing for complex analyses of environmental microbial communities. The compiled 16S rRNA sequence information from a sample allows for a taxonomic profile to be developed. This provides information on the biodiversity of that environment, answering the question of which species are present.
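As a rough illustration of what "developing a taxonomic profile" from 16S rRNA data means, the following Python sketch tallies per-read taxon assignments into relative abundances. It is a minimal sketch under stated assumptions: the function name `taxonomic_profile`, the placeholder labels `taxon_A` to `taxon_C`, and the eight example reads are all hypothetical, and the step that assigns each read to a taxon (normally done by comparison against a reference database) is assumed to have already happened.

```python
from collections import Counter


def taxonomic_profile(assignments):
    """Convert a list of per-read taxon labels into relative abundances."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads were assigned to any taxon")
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical taxon assignments for eight 16S rRNA reads from one sample.
reads = [
    "taxon_A", "taxon_A", "taxon_B", "taxon_C",
    "taxon_A", "taxon_C", "taxon_B", "taxon_A",
]

profile = taxonomic_profile(reads)
for taxon, fraction in sorted(profile.items()):
    print(f"{taxon}: {fraction:.2f}")
```

The resulting dictionary answers the "which species are present, and in what proportion" question for the sample, which is the information a 16S-based survey provides.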
As the capacity of NGS technologies increased and the costs decreased, most DNA present within an environmental sample could be rapidly sequenced. This has been referred to as “full shotgun metagenomics.”41 Once those genomes in the sample present in sufficient quantity to allow for sequencing coverage are sequenced, the compiled sequence information provides not just which species are present, but also functional information based on the composition of genes within each organism. This can subsequently be associated with the environmental data from the sample site under investigation. Many different environments have been analyzed by sequence-based metagenomics, including the human gut42-50, seawater40, soil32,34,39,51, and extreme environments such as high temperature52, high salt53, acidic54-56, low oxygen57,58, high sulfur59,60, and heavy metal.61,62 While much about a given community can be elucidated from sequence information, genes with novel activities or activities of industrial interest are not apparent from sequence analysis and require an activity-based approach for discovery.

1.3.3 Functional Metagenomics

Most activity-based approaches to enzyme discovery from environmental DNA samples are referred to as functional metagenomics. This approach puts metagenomic gene sequences from uncultured microbes into expression vectors, which upon subsequent expression produce novel proteins inside the host cells.63 The presence of the novel proteins can be verified by screening the metagenomic clones for those displaying a desired biological activity (activity-based screening).
Over the past two decades many novel small molecules and enzymes have been recovered from metagenomic libraries, including lipases64-66, esterases67-69, proteases70,71, laccases72,73, agarases74, amidases75, alcohol oxidoreductases76, antibiotics77-79, DNA polymerases80, Na+/H+ antiporters81, cellulases and xylanases82-86, and phytases.87 Early activity-based screens of metagenomic libraries suffered from low sensitivity and low throughput88,89, emphasizing the need to develop high-throughput functional screening methods. Methods such as SIGEX (substrate-induced gene expression)90 have helped accelerate the isolation of some novel biocatalysts over the past 8 years. Despite recent improvements, the major limitation on functional screening for desired enzymes is the availability of high-throughput enzyme activity assays and bacterial hosts for heterologous gene expression (traditionally just E. coli). Alternate host systems such as Streptomyces spp., Thermus thermophilus, and Sulfolobus solfataricus79,91,92 have been developed, expanding the choice of host and enzyme assay system. Expressing cloned genes of metagenomic origin in heterologous hosts enables researchers to access the tremendous genetic potential in a microbial community without knowing anything about the original gene sequence, the structure and composition of the desired protein, or the origin of the microbe. Additionally, activity-based assays for enzyme discovery do not rely on correct annotation of environmental DNA sequences to assign function to an isolated gene, and they allow the discovery of new classes of enzymes with given functions which may have little sequence similarity to previously-characterized genes. If the enzyme to be isolated is intended for commercial or industrial application, the activity assay can be designed to identify functional enzymes working under desired conditions.
For the development of commercial biocatalysts, known enzymes are sometimes further subjected to techniques like rational protein design or directed evolution.16,17,93-96 Functional metagenomics allows researchers to rapidly access the natural diversity of unknown enzymes that have adapted to a wide range of different environmental conditions. The same assay can then also be used with mutant libraries of discovered enzymes to select improved variants (directed evolution) that are ideal for commercial/industrial application. Metagenomic data sets are becoming increasingly complex and comprehensive, with information about novel genes/ORFs/operons from diverse environments having accumulated. There is a strong need to focus more on validating these novel genes/ORFs of metagenomic origin by activity-based characterization. Developing more activity-based screening methods, as well as scaling up the throughput of the available approaches, is key to identifying novel enzymes in metagenomic samples and validating predicted genes already available from sequencing studies.

1.4 High-Throughput Biology

1.4.1 General Overview

High-throughput biology combines the use of automated equipment with cell and molecular biology techniques to answer biological questions not readily answerable through small-scale approaches. The essential idea is to take methods normally performed on their own and perform a very large number of them rapidly without negatively impacting the method quality. This type of approach has most widely been applied in drug discovery, which uses robotics, data processing, liquid handling instruments, and sensitive detectors to quickly conduct millions of chemical or pharmacological tests.97 High-throughput assays are typically performed on multi-well microplates. The size and format of these plates vary, with the most common sizes being 24-well, 96-well, 384-well, and 1536-well. Some plates have even been manufactured with 3456 and 9600 wells.
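To give a feel for how plate format affects the scale of a screen, the short Python sketch below computes how many plates are needed to array a clone library one clone per well. The plate sizes are the common formats listed above; everything else is an assumption made for illustration, including the function name `plates_needed`, the optional `controls_per_plate` parameter, and the example library size of 100,000 clones.

```python
# Common microplate formats named in the text (wells per plate).
PLATE_FORMATS = (24, 96, 384, 1536)


def plates_needed(library_size, wells_per_plate, controls_per_plate=0):
    """Number of plates required to array a library at one clone per well.

    controls_per_plate is a hypothetical allowance for wells reserved
    for positive and negative controls.
    """
    usable_wells = wells_per_plate - controls_per_plate
    if usable_wells <= 0:
        raise ValueError("no usable wells left after reserving controls")
    # Integer ceiling division avoids floating-point rounding issues.
    return (library_size + usable_wells - 1) // usable_wells


library_size = 100_000  # hypothetical library size, for illustration only
plate_counts = {wells: plates_needed(library_size, wells) for wells in PLATE_FORMATS}

for wells, count in plate_counts.items():
    print(f"{wells}-well format: {count} plates")
```

The design choice here is deliberately simple: the plate count scales inversely with well density, which is the main reason denser formats are attractive when reagents and handling time per plate are the limiting costs.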
The proven utility of high-throughput approaches using microplates, especially in drug screening, provides a strong incentive for the development of more complex high-throughput assays involving enzymes. The complexity of the assay is a major determinant of how scalable the assay is to high-throughput formats, particularly if specialized or expensive reaction materials are needed for particular enzymatic reactions. One approach to high-throughput assays involving enzymatic activity, such as for enzyme directed evolution, uses droplet-based microfluidics in which drops of aqueous fluid separated by oil replace the wells of microplates.98 Analysis and sorting of droplets containing ‘hits’ is performed while the droplets are flowing quickly through tiny channels, with up to 1 × 10^7 droplets being analyzed each hour.99 Using this droplet-based system, the volumes of reactions are substantially lower, decreasing the required amount of research materials as well as the cost, labour, and time required for analysis. Development of these high-throughput assays for complex enzymatic reactions represents an opportunity for classical biochemistry and molecular biology labs to greatly expand their ability to discover new enzymes of interest and engineer improved enzymes for sought-after applications. Applying these approaches to CAZymes in particular offers both a challenge and an opportunity to researchers in the field of carbohydrates.

1.4.2 High-Throughput Screening of CAZymes

With the advent of high-throughput screening techniques and instruments, researchers have been able to develop approaches to screen for GH or GT activities in high-throughput. Screening for GH activity on microtiter plates is possible through the development of chromogenic or fluorogenic substrates, which release a detectable reporter upon cleavage by the enzyme. This approach is readily adaptable to screening of metagenomic libraries for GH activities, such as with functional metagenomics.
Additionally, this approach can be applied to selection of an improved enzyme variant, such as with directed evolution studies. The best example of high-throughput screening for the discovery of GH genes has been the search for biomass-degrading enzymes from environmental DNA libraries for use in biofuel production.100-103 Other GH functionalities are also targets for discovery, such as the screening and isolation of blood antigen-cleaving enzymes for the production of universal donor blood.104 A researcher need only develop suitable substrates to search for the desired GH functionality. The production of sensitive fluorogenic GH substrates has allowed for high-throughput screens of environmental libraries for numerous GH activities not previously screened for en masse.105 The use of highly sensitive fluorogenic substrates has also improved the throughput of plate-based environmental screens for biomass-degrading enzymes such as cellulases, with rapid screens of large 384-well plate libraries >10^5 in size.85 High-throughput screening approaches have also been developed for GT activities. While many high-throughput assays for general GT activities exist22,106-109, they have yet to be applied to environmental libraries for functional metagenomics. Most recently, high-throughput screens for GTs have been developed for directed evolution. Using flow cytometry and sorting of cells within microdroplets by fluorescence, improved mutants of two GTs could be rapidly selected from large DNA libraries (10^6-10^7 clones).95,110 This approach applied a method of ultra-high-throughput screening (>10^6 clones per day) which was superior to high-throughput plate screening, while still sensitive enough to select for a specified GT activity. These approaches, both for GH and GT activity screening, could be applied to environmental DNA libraries for gene discovery.
The further development of high-throughput assays for GH or GT activities of interest within metagenomic libraries would provide a means to isolate and characterize novel GH and GT enzymes with highly specialized activities of interest.   1.5 Aims of Thesis 1.5.1 Research Questions The primary aim of this thesis work is to develop high-throughput strategies for the discovery of GH and GT enzymes in environmental DNA libraries. The secondary aim is to apply those same strategies to the directed evolution of a GT enzyme in order to enhance a desired activity. The work presented in this thesis will attempt to answer the following questions: (1) What high-throughput screening approaches can be applied to functional metagenomic screens for 19  glycosyltransferases? (2) Are blood antigen-cleaving enzymes present within human gut microbes and can they be isolated using a high-throughput functional screening approach? (3) Can the synthetic capabilities of a multifunctional glycosyltransferase (sialyltransferase) be improved by directed evolution using a high-throughput screening approach?   1.5.2 Hypotheses and Specific Aims 1.5.2.1 Hypothesis (1) Glycosyltransferase (GT) enzymes can be isolated from diverse metagenomes using a functional enrichment and screening strategy. Encapsulation of fluorescent reaction products can allow cells expressing a specific enzyme activity to be sorted from a diverse population by fluorescence-activated cell sorting (FACS).   Specific Aims: GT genes will be enriched from a metagenomic library in a high-throughput manner using multiple rounds of FACS to select only cells harboring GT activity. Individual clones containing the gene of interest will be validated through a secondary assay to confirm the formation of glycosylated products. Genes will be identified and classified by sequencing, isolated, and functionally characterized. Specific aims of testing this hypothesis include the following: 1. 
Develop a product encapsulation technique, whereby a modified fluorescent acceptor molecule (lactose-BODIPY) is retained within a cell if GT activity exists within the cell to modify it.
2. Optimize a fluorescence-activated cell sorting (FACS) strategy to permit efficient enrichment of cells harboring genes for GT activity.
3. Develop and optimize a secondary assay to validate individual colonies for activity, identifying the formation of desired product.
4. Construct metagenomic libraries and perform screening.
5. Isolate gene sequences of interest for classification (phylogenetic and CAZy database) and functional characterization.

1.5.2.2 Hypothesis (2)

High-throughput plate screening can be used to discover novel blood antigen-cleaving enzymes within the human gut microbiome.

Specific Aims: The specific aims of testing this hypothesis include the following:

1. Build metagenomic libraries from gut bacteria of individuals with varying blood types.
2. Screen for active enzymes using highly sensitive fluorescent reporters (methylumbelliferone, MU) in a high-throughput format (384-well plate).
3. Discover bacterial α-N-acetylgalactosaminidases and α-galactosidases which cleave the terminal residues of the blood group antigens A and B, respectively.
4. Characterize enzymes from this metagenomic library in relation to the only other currently available enzymes with this blood antigen-cleaving utility.

1.5.2.3 Hypothesis (3)

A multifunctional glycosyltransferase (GT) enzyme exhibiting strong product degradation can be engineered for decreased product degradation using an ultra-high-throughput fluorescence-activated cell sorting (FACS) strategy.

Specific Aims: The specific aims of testing this hypothesis include the following:

1. Develop mutant libraries of a GT (sialyltransferase) and select for an improved mutant using FACS.
2. Develop an analogous method for measuring the sialyltransfer and sialidase reaction rates of a sialyltransferase.
3.
Demonstrate the improved catalytic efficiency of the mutant over the wild-type enzyme on natural substrates.

Chapter 2: Functional Metagenomic Screening for Glycosyltransferases

2.1 Introduction

2.1.1 Sialic Acid and Sialyltransferases

Sialic acid is a term given to a family of naturally occurring nine-carbon keto sugars derived from 2-keto-3-deoxy-5-acetamido-D-glycero-D-galacto-nonulosonic acid (N-acetylneuraminic acid [Neu5Ac]).111 Consisting of over 50 known members, sialic acids are typically the terminating sugar on branches of N-glycans, O-glycans, and gangliosides. The carboxylic acid (pKa 2.6) of sialic acids is deprotonated at physiological pH, conferring a net negative charge on these molecules. At the C5 position, the amino group may bear either an acetyl group, as in the most common form (Neu5Ac), or a glycolyl group, as in the next most common form (Neu5Gc). Additional modifications of the hydroxyl groups with methyl, sulfate, phosphate, acetyl, and lactyl groups have been found and lend these compounds their diversity in nature. Sialic acids are bound to the non-reducing ends of glycan chains, usually to a galactose residue, in α(2,3) or α(2,6) linkages.
They can also be bound through α(2,8) linkages to other sialic acids to form long chains known as polysialic acid.112

Originally found in human brain glycolipids and salivary mucins, sialic acids have now been found to be widely distributed in animal tissues and to a lesser degree in plants, fungi, yeasts, and bacteria, potentially owing to a late evolutionary appearance.113 In eukaryotes, sialic acids are involved in mediating cell-cell recognition, stabilizing cell membranes and glycoconjugates with charge-charge repulsion, regulating transmembrane receptor function, controlling the half-lives of circulating glycoproteins and cells, and acting as chemical messengers.114-116 Sialic acid plays an important role in the brains of humans, where it is involved in synaptogenesis and neuronal development.112 Varied sialylation of proteins and cell surfaces has been observed in disease states such as leukemia (granulocytes)117, nephropathy (IgA1)118, and Salla disease (neural cells).119 Additionally, the attachment of all influenza A virus strains to cells requires the presence of sialic acids. These influenza viruses vary in their affinity for the different forms of sialic acid, which determines the animal species that can be infected.120 Along with viruses, bacteria utilize sialic acid for pathogenesis. Sialic acids are the only nine-carbon sugars found in prokaryotes, and are generally found only in those closely associating with higher organisms.111 Many pathogenic bacteria use sialic acids derived from hosts as a source of carbon, nitrogen, energy, and amino sugars for cell wall synthesis, while others have developed their own de novo pathway for biosynthesis which differs from the eukaryotic method.121 Bacteria can modify their coat surfaces with sialic acids as a means to mask their presence in the host and avert innate host immunity.122 Microbial sialic acid metabolism is well established as a determinant of virulence in a number of diseases.
Overall, the importance of sialic acids in human health has driven interest in understanding enzymes that transfer sialic acids to other molecules. The attachment of these terminal sugars is facilitated by sialyltransferases (STs), which transfer Neu5Ac from CMP-Neu5Ac to a range of acceptor molecules. Of the 94 GT families mentioned previously, STs are separated into families 29, 38, 42, 52, and 80 in the CAZy database. Family 29 contains all of the viral and eukaryotic STs, which share some amino acid sequence identity. Bacterial STs, which share little similarity to mammalian enzymes, display more sequence diversity and have been grouped into families 38, 42, 52, and 80.

Bacterial STs were initially identified as genes that contributed to the pathogenicity of infective organisms such as Neisseria meningitidis, Campylobacter jejuni, and Haemophilus influenzae.123-125 STs can be used to help synthesize complex glycoconjugates such as sialylated oligosaccharides126, ganglioside mimics127, and sialyl-Lewis antigens127, which are necessary for studying the roles of glycosylation in diseases such as those just mentioned. Bacterial STs are more readily applied than eukaryotic STs for these syntheses in vitro, owing to their heterologous expression in E. coli and lack of post-translational modifications. Much effort has been put into harnessing the few well-characterized bacterial STs for synthesis of natural and unnatural Sia analogues that can be used to study and perturb natural systems.128 Sialylated glycoconjugates have become increasingly valued as therapeutics, owing to the increased circulatory lifespan of heavily sialylated glycoproteins.116,129 Some examples include the therapeutics erythropoietin (EPO) and immunoglobulin G (IgG), which treat anemia and inflammatory disease, respectively.130,131 Bacterial STs can be isolated and improved through engineering to enhance catalytic activities132, and as such are valuable targets for discovery.
2.1.2 Glycosyltransferase Screening

While glycosyltransferases (GTs) are valuable enzymes for the synthesis of complex biomolecules, there are limited methods to screen for them in any high-throughput fashion. Unlike glycoside hydrolases, which can be screened rapidly through the use of fluorescent reporters released upon cleavage of fluorogenic target substrates, GTs require assays in which the addition of monosaccharide units to an existing acceptor can be monitored. This presents a very challenging case for the discovery of GTs from metagenomic libraries, where large library size (>10⁶) and diversity require higher-throughput approaches. Additionally, rapidly screening this type of library requires a robust assay capable of analyzing individual cells or cell lysates. There have been some successful functional assays (low-throughput) of smaller genomic libraries to isolate GT enzymes of interest, but these do not have the capacity for application to metagenomic libraries.123 One traditional high-throughput plate assay couples the release of nucleoside phosphates to the oxidation of NADH (λ = 340 nm, ε = 6.22 mM⁻¹ cm⁻¹) through the action of nucleoside monophosphate kinase (in the case of CMP), pyruvate kinase (PK), and lactate dehydrogenase.133 This method has challenges in its use with cell lysate, as free phosphate is present and may interfere with the reaction. Some recent approaches are more robust in that they have been successfully used with cell lysates. One recent plate-based assay for GTs involves using phosphatases to monitor inorganic phosphate release from UDP or CMP donor sugars following transfer.
Briefly, the release of UDP or CMP from the donor sugar is coupled to a phosphatase, ammonium molybdate, and malachite-based reagents which react with free inorganic phosphate to form a green product detected colorimetrically.22 In another more recent and robust method, a xanthene-based Zn(II) complex acts as a chemosensor by exhibiting a large fluorescence enhancement upon binding with nucleoside diphosphate (NDP), and is used to measure the release of nucleotide diphosphosugar accompanying glycosyltransferase reactions.108 While these newer approaches allow for plate-based set-ups and thus higher throughput, they are cumbersome in that they would require the lysis of cells and addition of the lysate to a separate assay plate. While possible, these approaches would be less than ideal for a large metagenomic library requiring 384-well plate usage, for which these methods have not been validated (96-well only). A better method would not require the separate lysis of cells and then addition of lysate to assay plates. As a result of this lack of a high-throughput functional assay, most GTs derived from metagenomic studies have been identified by sequence information, then directly cloned, expressed, and tested for activity.134-136 While no researcher would debate the efficacy of using sequence similarity and conserved domain information to identify genes of interest for study, alternative functional approaches are needed to rapidly isolate glycosyltransferases of interest capable of soluble expression in E. coli. Additionally, sequence-based enzyme discovery methods may only identify well-studied and characterized enzyme classes, which may not present any novel function or express well as a recombinant enzyme. Any functional approach allowing for either ultra-high-throughput screening of large metagenomic libraries or enrichment of GTs within that library prior to low-throughput validation would be highly valuable in GT discovery and utilization for glycan synthesis.
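As an illustration of the arithmetic behind the NADH-coupled plate assay described earlier in this section, the minimal Python sketch below converts a measured decrease in absorbance at 340 nm into a transfer rate using the Beer-Lambert law and the extinction coefficient quoted above. The function name, the 1 cm default path length, and the `nadh_per_transfer` stoichiometry parameter are assumptions introduced here for illustration; they are not taken from the cited method, and the appropriate stoichiometry depends on the coupling enzymes used.

```python
# Illustrative sketch only: converts an A340 slope into a reaction rate
# using the Beer-Lambert law (A = epsilon * c * l).

NADH_EXTINCTION_MM_CM = 6.22  # mM^-1 cm^-1 at 340 nm, as quoted in the text


def transfer_rate_um_per_min(a340_slope_per_min, path_length_cm=1.0,
                             nadh_per_transfer=1.0):
    """Return the glycosyl transfer rate in micromolar per minute.

    a340_slope_per_min: change in absorbance at 340 nm per minute
        (negative while NADH is being consumed).
    path_length_cm: optical path length; 1.0 cm is an assumed default and
        is usually shorter in a microplate well.
    nadh_per_transfer: assumed number of NADH molecules oxidized per
        transfer event; it depends on the coupling enzymes used.
    """
    if path_length_cm <= 0 or nadh_per_transfer <= 0:
        raise ValueError("path length and stoichiometry must be positive")
    # Rate of NADH consumption in mM per minute
    nadh_rate_mm = abs(a340_slope_per_min) / (NADH_EXTINCTION_MM_CM * path_length_cm)
    # Convert mM to micromolar and correct for the coupling stoichiometry
    return nadh_rate_mm * 1000.0 / nadh_per_transfer


if __name__ == "__main__":
    # Hypothetical reading: A340 falls by 0.0622 per minute in a 1 cm cuvette.
    print(transfer_rate_um_per_min(-0.0622))
```

Background NADH oxidation in a crude lysate would add to the measured slope, so in practice a no-acceptor control slope would need to be subtracted before applying this conversion.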
2.1.3 Fluorescence Activated Cell Sorting (FACS)

Flow cytometry is a technology used to analyse individual cells in a fluid as they pass through one or more lasers. Attributes such as cell size, cell granularity, or fluorescence can be rapidly measured. The fluorescent staining of cellular components or the addition of fluorescent compounds allows for the differentiation of cells containing an attribute or enzymatic activity of interest. Using fluorescence to separate a subpopulation of desired cells is referred to as Fluorescence Activated Cell Sorting (FACS). An overview of the FACS instrument setup is presented in Appendix A. FACS can be used to rapidly assess very large libraries (>10⁶) and sort cells from these libraries at rates of up to 10⁸/hour. If a specific enzymatic activity could be coupled to the development of a fluorescent signal within individual bacterial cells, this technology could feasibly be used to detect and isolate that activity within metagenomic libraries. Multiple rounds of FACS selection and regrowth of sorted cells would facilitate the enrichment of desired genes from the library. Individual clones from the resulting library could then be easily tested for the desired activity using well-established lower-throughput assays. In this way FACS offers a valuable opportunity to isolate enzymes of interest from environmental libraries, such as GTs for use in glycan synthesis.

2.1.4 Fluorescent Encapsulation of ST Activity

FACS has previously been applied within the Withers group to the directed evolution of glycosyltransferases.95 In this approach, FACS was used to select improved variants of glycosyltransferase enzymes based on increases in fluorescence of the screening cells. A fluorescent encapsulation method was employed in which a cell-permeable acceptor sugar conjugated to a fluorophore (BODIPY) would be trapped within the cell if an active glycosyltransferase transferred another sugar to it (Figure 2.1A).
The addition of another monosaccharide to the fluorescent acceptor sugar, β-lactose-C2-BODIPY (Figure 2.1B), rendered the sugar too large to escape the cell. After washing of the cells, FACS could then be used to select for enzyme variants which yielded more product within the cell and hence increased fluorescence over the wild-type enzyme. This methodology was applied successfully to the directed evolution of both a sialyltransferase (CstII) and a galactosyltransferase (CgtB).110 This same technology could be applied to the detection of any active glycosyltransferases within a metagenomic library. As discussed previously, bacterial sialyltransferases (STs) are valuable targets for isolation based on their use in the synthesis of sialylated glycans. Using this same methodology, STs within metagenomic libraries could be rapidly enriched by FACS prior to isolation and activity verification.

2.2 Rationale and Research Goals

Both chemical and enzymatic processes have been developed for glycan synthesis, but enzymatic methods are often advantageous owing to both their efficiency and their stringent regio- and stereochemical control.137 Enzymes are particularly preferred over chemical approaches for large-scale syntheses in which vast amounts of expensive chemical reagents may be lost while obtaining marginal product yields. However, the limited availability of suitable GTs for specific glycoside bond formations restricts the favoured application of enzymes. Thus, technologies to enable the discovery of GTs are anticipated to greatly augment the utility of GTs in this regard. Metagenomic libraries present an opportunity for enzyme discovery from bacteria not yet studied using conventional isolation methods. GTs, and in particular STs, are very valuable targets for enzyme discovery as a class of enzymes which can be utilized for synthesizing biologically relevant compounds.
The goals of this research are to (1) demonstrate that FACS technology can enrich sialyltransferases from genomic libraries and (2) isolate active sialyltransferase genes from within metagenomic libraries using FACS.

Figure 2.1 - Glycosyltransferase-mediated encapsulation of fluorescence in vivo. (A) Upon modification by the addition of another sugar, a fluorescent sugar becomes trapped in the cell, allowing for FACS. (B) The fluorescent acceptor sugar used in this study.

2.3 Development of Screen (Proof-of-Principle)

2.3.1 Screening Strain

The strain used in this study was the E. coli strain JM107(∆nanA). This strain had previously been optimized for assaying ST activity through deletion of the sialic acid aldolase gene (nanA).125 Deletion of this gene halts sialic acid breakdown by the organism, which results in an increased sialyltransferase product yield. Additionally, this strain is lacZ- so as not to degrade the β-lactose-C2-BODIPY used in the assay. It also contains a constitutively expressed, endogenous sialic acid transporter (NanT) which ensures sialic acid uptake during the assay. This strain was used in all sialyltransferase screening work, including that discussed later in Chapter 4.

2.3.2 Expression Vector

When conducting a functional metagenomic screen, the choice of expression system is very important. Metagenomic DNA can be expressed either as large inserts (>25 kb) within a fosmid expression vector, or as small inserts (<10 kb) within a plasmid vector. The major advantage of screening with a large-insert library is a much more diverse library, with more DNA from the metagenome present in each clone screened. When the number of clones that can be screened is limited (e.g. in a plate-based screen with limits on physical screening capacity), large-insert fosmid libraries provide more metagenomic DNA per clone.
Additionally, when screening for complete sets of genes working together to form complex natural products, all required genes are unlikely to be fully expressed from metagenomic segments of DNA less than 10 kb in size. The major disadvantage of large-insert libraries, though, is their reliance on heterologous expression of genes which may have promoter systems not compatible with or optimal for E. coli. Creation of large-insert (>25 kb) libraries requires the use of a special fosmid system which can be packaged into phage particles and transfected into the expression strain. The expression strain must allow for this phage transfection as well as regulate the fosmid copy number. If heterologous expression of the metagenomic DNA does occur, it may be too low for detection of activity by the screen. Alternatively, if the activity sought is likely to arise from a single enzyme, then a small-insert library offers some advantages. Library production is much simpler, requiring only ligation of the metagenomic DNA into the expression vector and direct transformation into screening cells. The small vector size allows for efficient transformation and large library sizes. The screening strain does not require any special machinery to allow for transformation or replication of the vector once present. Additionally, transcription of the metagenomic DNA is not completely reliant on internal transcription machinery but instead also uses the E. coli-compatible lac promoter present on the expression vector. This will generally produce higher levels of expression as a result of both higher plasmid copy number and more efficient transcription machinery.63 As the intention of this study was to identify activity from individual genes (STs), and everything required was readily available within our lab, a small-insert expression system was utilized in this study. The small-insert expression vector chosen for this study was a vector based on pUC18.
pUC vectors are very small expression vectors with high copy number, allowing for simple recombination and expression of foreign DNA. Expression of heterologous genes within a small-insert library can be further improved through the use of a bi-directional promoter, as the orientation of metagenomic DNA relative to the lac promoter on the vector cannot be controlled upon library creation. As shown previously138, use of a bi-directional promoter system enhanced the hit rate when screening a small-insert metagenomic library. Previously constructed bi-directional expression vectors were not available commercially or from the labs in which they were developed. Additionally, this screen required that the expression vector provide ampicillin resistance, which differed from those vector systems previously developed. A new bi-directional expression system was therefore constructed specifically for this screen. A second lac promoter was sub-cloned into pUC18 in opposite orientation to the original promoter to allow for bi-directional transcription of inserted DNA. This new vector was named pDUAL (Figure 2.2A). The bi-directional expression was verified by cloning the bacterial xylanase gene Bcx into the vector in both orientations and testing for activity on the colorimetric xylanase substrate pNP-β-xylobioside. If active Bcx protein were produced, the substrate would be cleaved to release free pNP, which could be observed as the reaction turning yellow. Activity was seen for the Bcx gene cloned in both orientations, validating the bi-directional expression of the vector (Figure 2.2C). This vector was used for all initial proof-of-principle screens. Following the initial proof-of-principle screens, a slightly improved version of pDUAL, dubbed pDUAL2, was created. In this version, both the lac promoter and multiple cloning site (MCS) were sub-cloned into pUC19 in opposite orientation to the pre-existing promoter and MCS (Figure 2.2B).
This vector was validated by sub-cloning the gene encoding jellyfish green fluorescent protein (GFP) into the vector in both orientations and visually assessing fluorescence (Figure 2.2D). Expression of the GFP gene occurred in both orientations, validating this updated version of pDUAL. This version was used for all subsequent metagenomic library screens and the determinations of the limits of enrichment for FACS (Sections 2.4 and 2.5).

Figure 2.2 - Verification of bi-directional expression systems. (A) Vector map of bi-directional expression system pDUAL. (B) Vector map of improved expression system pDUAL2. (C) Assay of pDUAL with Bcx gene cloned into vector in both forward and reverse directions. Following growth and induction with IPTG, a small amount of lysate was added to reaction mixtures containing the colorimetric substrate pNP-β-xylobioside. The image was taken after reaction for 2 hours at 37°C. (D) Assay of pDUAL2 with green fluorescent protein (GFP) cloned into vector in both forward and reverse directions. Following growth and induction with IPTG, cell cultures were analyzed for green fluorescence under UV light (365 nm).

2.3.3 Library Construction (Campylobacter jejuni Genomic Library)

To validate the utility of this screening system for enriching active ST genes from diverse background DNA, a genomic library of Campylobacter jejuni strain 81176 was constructed in the pDUAL vector and transformed into the JM107∆nanA host strain. Briefly, Campylobacter jejuni strain 81176 DNA was isolated and broken into small fragments by sonication (acoustic shearing). The fragments were separated on a gel, DNA of the appropriate size (2-8 kb) was selected, and this DNA was end-repaired and ligated into pDUAL prior to transformation into JM107∆nanA cells. This strain of C. jejuni contains a well-characterized sialyltransferase, CstI, from CAZy family GT42.
STs from this family utilize β-lactosides as acceptors, ensuring that the β-lactose-C2-BODIPY used in the screen would be utilized if CstI were present. Cells expressing CstI fluoresce approximately two orders of magnitude above cells harboring only an empty plasmid when assayed with the screening substrates and analyzed by FACS. The ST activity of CstI protein on the acceptor β-lactose-C2-BODIPY had also been verified in vitro prior to beginning this project. This gene should therefore be enriched through multiple rounds of FACS using the fluorescence encapsulation strategy discussed previously. The constructed genomic library of C. jejuni DNA included 5.0 × 10⁶ clones with an average insert size of 4 kb, providing an over 99.9% probability of full genome coverage according to the Clarke and Carbon formula139 for genome coverage by a recombinant library.

2.3.4 FACS Enrichment

The C. jejuni genomic library (5.0 × 10⁶ clones) was submitted to 3 rounds of FACS sorting. This entailed growing the library in minimal media (M9) containing glucose (0.4%), inducing gene expression overnight with IPTG, incubating the library with reagents for sialyltransferase activity, then washing the cells and submitting them for sorting by FACS. Briefly, the cells were treated with sialic acid (Neu5Ac) along with the fluorescent acceptor sugar (β-lactose-C2-BODIPY), then washed extensively to remove any unreacted fluorescent sugars and submitted to FACS. Cells were first sorted according to forward and side scatter (FSC and SSC) to select for single bacterial cells. This is important as many cells clumped together may produce higher fluorescence, resulting in the selection of false positives. Cells were then sorted for green fluorescence (excitation 488 nm/emission 530 nm and excitation 488 nm/emission 610 nm) using the FITC channel to capture cells in which an active sialyltransferase had transferred sialic acid to the β-lactose-C2-BODIPY acceptor, effectively trapping it in the cell.
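Returning briefly to the library size reported in Section 2.3.3, the coverage estimate can be reproduced with the Clarke and Carbon relationship P = 1 - (1 - f)^N, where f is the fraction of the genome carried by one insert and N is the number of clones. The Python sketch below is a minimal illustration under stated assumptions: the genome size of roughly 1.6 Mb is an approximate value assumed here rather than one reported in this chapter, and the function names are hypothetical.

```python
import math


def coverage_probability(n_clones, insert_bp, genome_bp):
    """Probability that a given sequence is present in a library of n_clones."""
    fraction = insert_bp / genome_bp  # f: share of the genome per insert
    if not 0.0 < fraction < 1.0:
        raise ValueError("insert must be smaller than the genome")
    # P = 1 - (1 - f)^N, evaluated in log space to stay stable for large N
    return -math.expm1(n_clones * math.log1p(-fraction))


def clones_required(probability, insert_bp, genome_bp):
    """Smallest N giving at least the requested coverage probability."""
    fraction = insert_bp / genome_bp
    if not 0.0 < fraction < 1.0 or not 0.0 < probability < 1.0:
        raise ValueError("fraction and probability must lie between 0 and 1")
    # N = ln(1 - P) / ln(1 - f), rounded up to a whole clone
    return math.ceil(math.log(1.0 - probability) / math.log1p(-fraction))


if __name__ == "__main__":
    GENOME_BP = 1.6e6  # assumed approximate genome size, not from the text
    INSERT_BP = 4000   # average insert size reported for the library
    print(clones_required(0.999, INSERT_BP, GENOME_BP))
    print(coverage_probability(5.0e6, INSERT_BP, GENOME_BP))
```

Under these assumptions, only a few thousand clones are needed for 99.9% coverage, so a library of 5.0 × 10⁶ clones exceeds the requirement by a wide margin. The formula treats inserts as randomly distributed and does not account for whether a complete gene lies within a single insert or is oriented for expression, so the functional coverage is likely to be lower than this estimate.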
Sorted cells were transferred to liquid growth media (LB) overnight (a step often termed ‘amplification’) prior to the next round of FACS selection. Three rounds of growth, assay, and FACS selection were completed prior to determining the level of CstI gene enrichment. The FACS histogram data are summarized for all three rounds in Figure 2.3.

Figure 2.3 - FACS histograms of Campylobacter jejuni genomic library. Following treatment with Neu5Ac and β-lactose-C2-BODIPY, the genomic library was submitted to FACS for sorting of cells based on fluorescence (488 nm/530 nm and 488 nm/610 nm). A gate was set (shown in red) above which cells were sorted for growth and further screening by FACS. The proportion of cells above the gate during each of the three rounds of FACS is indicated above.

An increase in the proportion of cells above the fluorescence cut-off ‘gate’ (seen as a shift to the right on the FACS histograms) indicated that an increased proportion of cells contained an active sialyltransferase. PCR was used to verify the retained presence of the CstI gene in the initial library, as well as in libraries following FACS sorting (data not shown). This was further validated by investigating the presence of CstI in individual clones before and after the various rounds of FACS selection.

2.3.5 Optimizing Enrichment Efficiency

In order to better quantify the level of enrichment of the CstI gene by FACS, colony-PCR was first employed, in which a set of randomly picked colonies from the genomic library was tested for the presence of the CstI gene (Table 2.1). Colonies were picked from the initial library prior to any enrichment and following each of the three rounds of FACS selection.
The colony-PCR results suggested a very poor level of CstI gene enrichment by FACS, with only approximately 4% of cells containing the CstI gene in the final population. This is far below both what one might expect following multiple rounds of FACS (>90%) and what would be required for a feasible gene enrichment from a metagenomic library.

Table 2.1 - Estimation of CstI gene abundance in Campylobacter sp. genomic library by colony PCR

Screening Phase of           # of Colonies Containing    CstI Gene Abundance
Campylobacter sp. Library    CstI Gene                   in Population (%)
Pre-FACS                     0/24                        ≥ 0%
Post-FACS Round 1            1/28                        ~ 3.6%
Post-FACS Round 2            1/28                        ~ 3.6%
Post-FACS Round 3            2/50                        ~ 4.0%

Additionally, the lysates of the two positive clones (containing the CstI gene according to colony-PCR) from the Post-FACS Round 3 library were tested for sialyltransferase activity against two negative clones (not having the CstI gene according to colony-PCR) from the same set of clones. Surprisingly, along with the positive clones, one of the negative clones exhibited sialyltransferase activity, indicating that there may have been false negatives in the colony-PCR experiment. Cell lysates from an additional 18 colonies from the same set of clones were tested for sialyltransferase activity and one more active clone was identified. These active clones were verified by sequencing as having the CstI gene, bringing the number of clones containing the CstI gene (following 3 rounds of FACS) from 2/50 up to at least 4/50 (8%). These results clearly indicated that the colony-PCR data had not been accurate. This may have been a result of the initial cell lysis step of the PCR reaction (3 minutes at 95°C) inactivating a substantial portion of the DNA polymerase. For future experiments this was adjusted by lysing the colonies prior to the addition of the DNA polymerase.
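Apart from false negatives, the small number of colonies sampled in Table 2.1 also limits how precisely gene abundance can be estimated. As a hedged illustration, the Python sketch below computes an approximate 95% Wilson score interval for each count in the table. This interval was not part of the original analysis; the function name and the choice of the Wilson method are assumptions made here for illustration only.

```python
import math


def wilson_interval(hits, n, z=1.96):
    """Approximate Wilson score interval for a binomial proportion.

    hits: number of colonies positive for the gene
    n: number of colonies tested
    z: normal quantile (1.96 corresponds to roughly 95% coverage)
    """
    if n <= 0 or not 0 <= hits <= n:
        raise ValueError("require n > 0 and 0 <= hits <= n")
    p = hits / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Colony counts as reported in Table 2.1 (positive colonies, colonies tested)
TABLE_2_1 = {
    "Pre-FACS": (0, 24),
    "Post-FACS Round 1": (1, 28),
    "Post-FACS Round 2": (1, 28),
    "Post-FACS Round 3": (2, 50),
}

if __name__ == "__main__":
    for phase, (hits, n) in TABLE_2_1.items():
        low, high = wilson_interval(hits, n)
        print(f"{phase}: {hits}/{n}, interval {low:.1%} to {high:.1%}")
```

With samples of 24 to 50 colonies, the intervals for successive rounds overlap heavily, which is consistent with the need for a more sensitive quantification method.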
Regardless, at this point colony-PCR appeared to be a poor method for accurately quantifying the enrichment of CstI through multiple rounds of FACS sorting, and a more accurate and sensitive method was required. While the enrichment of the CstI gene through FACS had been better than originally determined, it was still very poor and improvements were necessary prior to any screening of metagenomic libraries.

One possible cause of poor enrichment efficiency following multiple rounds of FACS selection was that background cells, which may retain fluorescence for some reason other than containing the activity of interest, are somewhat enriched through FACS. This had been a consideration in the previous use of FACS for performing directed evolution of a glycosyltransferase.95 The solution for those experiments had been to obtain DNA from the sorted cells following FACS and re-transform it into fresh screening cells so that no accumulation of false-positive background cells could occur. In order to test if this would improve enrichment efficiency of the CstI gene through FACS, the same Campylobacter sp. genomic library used previously was subjected to three rounds of growth, assay and FACS selection. In this iteration, following each round of FACS, the DNA from the sorted cells was obtained and re-transformed into fresh screening cells (JM107∆nanA:pACYC18[SiaB]) prior to the next round of screening. There was no evident increase in overall fluorescence retention in the population as seen in the FACS histogram data for this method (Figure 2.4), indicating that this method would not be an improvement over the original strategy. The enrichment efficiency was quantitatively determined by picking 40 random colonies following the third round of FACS sorting and testing for both sialyltransferase activity (in vitro) and the presence of the CstI gene by sequencing. None of the selected clones contained the CstI gene or exhibited sialyltransferase activity.
The sequencing data indicated that the re-transformation step inadvertently biased the library to contain plasmids with the smallest genomic DNA inserts (500-1000 bp). This was likely because smaller plasmids have a much higher transformation efficiency than larger plasmids, heavily biasing the re-transformation step towards smaller plasmids. So while this method of re-transforming the genomic library DNA between each round of FACS is a good control for projects in which all of the plasmids are the same size (e.g. directed evolution), it is a very poor method for situations in which a variety of insert sizes are present, such as with metagenomic libraries.

Figure 2.4 - FACS histograms of Campylobacter jejuni genomic library using retransformation method. Following treatment with Neu5Ac and β-lactose-C2-BODIPY, the genomic library was submitted to FACS for sorting of cells based on fluorescence (488 nm/530 nm and 488 nm/610 nm). A gate was set (shown in red) above which cells were sorted for growth and further screening by FACS. Following each round, DNA from the library was isolated and retransformed into fresh screening cells prior to the next round of FACS.

Another consideration for improving enrichment efficiency is the potential loss of enrichment during both the growth/amplification and growth/induction steps between each round of FACS. While amplification of the selected clones is required following FACS in order to perform another round of selection, there may be growth bias in which background cells or cells not containing the gene of interest grow more rapidly or efficiently than active clones. In liquid media there is no way to normalize this growth between different clones. This may lead to a decrease in the abundance of active genes, counteracting the enrichment just performed by FACS.
Indeed, this has been documented in screens requiring a growth or amplification step prior to subsequent rounds, particularly with phage display.140,141 In order to minimize this higher variation of growth in liquid culture, library cells were first grown on solid agar plates. Following this, cells were transferred to liquid minimal media containing glucose (0.4%) for the growth/induction step prior to FACS sorting. In this way the growth of each clone is somewhat limited to the general size of colonies formed on solid media. Following each round of FACS, the sorted cells were plated and grown overnight prior to the next round of induction and assay by FACS. Prior to FACS, the cells were washed off the plate, grown and induced in liquid minimal media, and an aliquot was then used for assay and FACS sorting. A substantial shift in the fluorescence of the cell populations can be seen in the FACS histogram data (Figure 2.5), indicating an increase in the proportion of ST-containing cells.

Figure 2.5 - FACS histograms of Campylobacter jejuni genomic library using solid media method. Following treatment with Neu5Ac and β-lactose-C2-BODIPY, the genomic library was submitted to FACS for sorting of cells based on fluorescence (488 nm/530 nm and 488 nm/610 nm). A gate was set (shown in red) above which cells were sorted for growth and further screening by FACS. Following each round, cells were regrown on solid media prior to expression and the subsequent rounds of FACS.

Following the three rounds of FACS sorting, 40 colonies were randomly picked and assayed for sialyltransferase activity (in vitro). Ten of the colonies were active, indicating that cells containing the CstI gene had been enriched to 25% in this method, the best result to date. Limiting the growth bias between each round of FACS sorting clearly had a positive effect on enrichment efficiency.
In order to limit this growth bias further, cells were both grown and induced on solid media plates, completely removing any liquid growth steps. A control experiment was conducted in which positive control cells containing a sialyltransferase (CstI) were grown, induced, and assayed by either the liquid media growth/induction method or the solid media growth/induction method. Following incubation with the screening substrates and washing, the two samples were analyzed by flow cytometry (Figure 2.6).

Figure 2.6 - Comparison of fluorescence uniformity between liquid and solid media growth/expression methods. While present in each histogram, the sorting gates (indicated in red) were not utilized for these control samples.

It was evident from the flow cytometry data that the solid media growth/induction method worked as well as the liquid growth/induction method, and even better in terms of uniformity of fluorescence in the assayed cells. As can be seen in the histograms, the solid media growth/induction method resulted in a very even distribution of fluorescent cells, indicating that most of the cells had expressed CstI and that the enzyme had modified Lactose-C2-BODIPY. The liquid media growth/induction method was also successful, but the distribution of cells was much less symmetrical, indicating that a larger portion of cells had not expressed CstI, had died, or exhibited less enzymatic activity. Despite the concern that growth/induction in rich media might negatively impact sialic acid uptake or enhance background fluorescence, sufficient pre-washing in minimal media prior to adding the reaction substrates appeared to mitigate these risks. Using this method, growth bias was mostly eliminated, which resulted in further improved enrichment of cells containing CstI.
This was determined by another set of enrichment experiments in which negative control screening cells (containing no active sialyltransferase) were dosed with various amounts (0.01%, 0.1%, and 1%) of positive control cells (containing CstI gene) and submitted to multiple rounds of FACS enrichment. Following 3 rounds of FACS enrichment, the proportion of cells containing CstI in all three samples had increased substantially (>75%), as determined by picking random colonies and testing for the presence of the CstI gene by PCR. Providing the most efficient enrichment so far, this method was used for all of the following screening experiments. Additionally, a more accurate determination of the rate of enrichment using qPCR had subsequently been developed and could be used following optimization of the enrichment method.

2.3.6 The Effect of Library Complexity on Enrichment Efficiency Determined by qPCR

The efficiency of enrichment using the optimized solid media growth/induction method could be more accurately quantified using qPCR. Specialized primers and probes (Primetime®) were designed for the CstI gene in order to quantify the number of gene copies present within DNA samples taken before and after each round of FACS enrichment. Two sets of samples were created in which the background populations were either simple (basic negative control cells containing no active sialyltransferase) or complex (Cellulomonas flavigena genomic DNA library containing no active sialyltransferases). By quantitatively following the enrichment of the positive-control gene (CstI) through multiple rounds of FACS, enrichment efficiency could be monitored as well as the number of FACS rounds necessary for maximum enrichment. Also, by testing the enrichment of CstI against a background of diverse genes (Cellulomonas flavigena genome) the experiment would more closely mirror the screening of a metagenomic library.
For each set of samples (simple and complex), the initial abundance of cells containing CstI varied (0.01%, 0.1%, and 1%). This set of six samples was submitted to 6 rounds of FACS enrichment along with a negative control. Following growth/induction overnight and washing of the cells off the solid media, a DNA miniprep was obtained from an aliquot of the cells and analyzed by qPCR to determine the amount of CstI gene relative to the total amount of DNA present. This relative abundance of CstI DNA in each sample was determined as the proportion of CstI DNA (ng) to the total amount of DNA (ng) in each miniprep, and was obtained following each round of FACS and plotted (Figure 2.7).

Figure 2.7 - qPCR Measurement of CstI Gene Abundance through FACS Sorting. The relative abundance of CstI (ng CstI DNA / ng total DNA) in the (1a) simple library and (2a) complex library through multiple rounds of FACS enrichment as measured by qPCR. FACS histograms showing the fluorescence of the (1b) simple library and (2b) complex library populations through the first three rounds of FACS enrichment. The percentage of cells in the population that fluoresce above the sorting cut-off is shown in red.
[Simple library = Empty screening cells spiked with small amount of CstI-containing cells]
[Complex library = Screening cells containing a random genomic library (no sialyltransferases) spiked with small amount of CstI-containing cells]

When comparing the apparent rate of enrichment between the simple and complex samples, it is evident that enrichment occurs more slowly when a complex set of background genes is present. While the simple samples showed very strong enrichment following just one or two rounds of FACS, the complex samples required at least two rounds of FACS before any measurable increase in positive cells could be seen. Consistently though, maximum enrichment for almost all samples occurred following 3 rounds of FACS.
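As a hedged illustration of the abundance measure described above (ng CstI DNA per ng total DNA in each miniprep), the following Python sketch computes the relative abundance for each round and the fold change relative to the unsorted library. The function names and all DNA quantities are hypothetical placeholders, not values from this study.

```python
from typing import List, Sequence, Tuple


def relative_abundance(target_ng: float, total_ng: float) -> float:
    """Return the mass fraction of target DNA (ng target / ng total)."""
    if total_ng <= 0:
        raise ValueError("total_ng must be positive")
    return target_ng / total_ng


def fold_enrichment(abundances: Sequence[float]) -> List[float]:
    """Return each abundance divided by the round-0 (pre-sort) abundance."""
    baseline = abundances[0]
    if baseline <= 0:
        raise ValueError("baseline abundance must be positive")
    return [a / baseline for a in abundances]


# Hypothetical (target ng, total ng) pairs for round 0 (pre-sort) to round 3.
# These numbers are invented for illustration only.
minipreps: List[Tuple[float, float]] = [
    (0.002, 20.0),
    (0.03, 20.0),
    (0.4, 20.0),
    (1.5, 20.0),
]

abundances = [relative_abundance(t, total) for t, total in minipreps]
folds = fold_enrichment(abundances)

for round_number, (a, f) in enumerate(zip(abundances, folds)):
    print(f"round {round_number}: abundance = {a:.2e}, fold = {f:.1f}")
```

Note that this ratio is a DNA mass fraction, so it tracks the trend in gene abundance between rounds rather than giving a direct count of positive cells.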
It is clear from the qPCR data for both the simple and complex samples that more than 3 rounds of FACS enrichment did not improve the abundance of ‘hits’ in the sample. In fact, it appears that additional rounds of enrichment have a negative impact on the abundance of CstI. This may be the result of more highly-fluorescent background cells eventually being enriched through FACS, similar to what was discussed previously when DNA is not transformed into fresh cells following each round of FACS. It therefore appeared that 3 rounds of FACS enrichment were sufficient for increasing the abundance of positive genes in the sample without false-positive background cells becoming substantially enriched. Overall, it was validated that a sialyltransferase gene could be enriched from libraries through an activity-based FACS screen. Following this successful proof-of-principle, the screen was applied to a number of metagenomic samples with the goal of isolating sialyltransferase genes.

2.4 Functional Metagenomic Screening

2.4.1 Sampling

As many of the bacterial STs discovered to date had originated from pathogenic bacteria and marine samples125,142,143, it was posited that these types of sources might be good starting points for sampling. Three samples were obtained to satisfy this criterion. Both solid sludge and raw wastewater samples were obtained from the UBC wastewater treatment facility. Also, seawater was obtained locally at Jericho Beach in Vancouver. Following the screening of these initial samples (discussed below), additional fecal samples from chicken, pig, and goat were obtained from Southlands Farms in Vancouver. These samples were considered likely to contain Campylobacter DNA, improving the likelihood of active ST genes (such as CstI) being enriched through the screen and validating the screen for metagenomic libraries.
2.4.2 Initial Construction of Libraries

The sea and wastewater libraries were constructed similarly to the Campylobacter jejuni genomic libraries used for the proof-of-principle screens. Samples were simultaneously lysed and the DNA sheared into small fragments using a bead-beating method. Following lysis and DNA purification, small DNA fragments (2-10 kb) were directly isolated and used to construct small-insert libraries with pDUAL2 in the screening strain JM107∆nanA:pACYC18(SiaB). The sizes of the resulting libraries from the solid sludge, raw wastewater, and seawater samples were 3.3 × 10^6, 2.4 × 10^6, and 4.4 × 10^6 clones, respectively.

2.4.3 FACS Enrichments

Each of the solid sludge, raw wastewater, and seawater libraries was subjected to three rounds of FACS enrichment using the optimized solid media growth/induction method. The FACS histogram data for these three sets of screens are summarized in Figure 2.8. According to the data, only the seawater library cells showed a small level of increased fluorescence retention through the three rounds of FACS. Due to a lack of perceptible fluorescence increase in the solid sludge and raw wastewater libraries, only the seawater library was submitted for secondary screening to determine if any ST genes had been enriched.

Figure 2.8 - Relative fluorescence of metagenomic libraries through three rounds of FACS enrichment. The solid sludge (left), raw wastewater (middle), and seawater (right) metagenomic libraries were sorted based on fluorescence (488 nm/530 nm) for three rounds of FACS. Only cells above the sorting gate (yellow) were selected for subsequent rounds of FACS.

2.4.4 Secondary Screening

From the seawater metagenomic library, 15 random colonies were picked from the enriched population following three rounds of FACS. Those 15 colonies were separately grown, induced, and assayed with the screening substrates.
Following washing, the lysates of the cells were analyzed by thin-layer chromatography (TLC) under UV light for sialyltransferase activity. There was no apparent sialyltransferase product (Figure 2.9). The small increase in the fluorescent population seen in the FACS histogram data appears to have been non-specific background enrichment as opposed to the enrichment of ST activity.

Figure 2.9 - Analysis of seawater metagenomic library clones for sialyltransferase activity. Following incubation with Neu5Ac and Lactose-C2-BODIPY, 15 random colonies were lysed and the lysate analyzed by thin-layer chromatography (TLC) under UV. The expected location of the product spot (α-2,3-Neu5Ac-Lactose-C2-BODIPY) is indicated above.

2.4.5 Additional Library Constructions and FACS Enrichments

As a result of no ST gene hits being identified in the first set of metagenomic libraries, it was reasoned that metagenomic libraries containing a higher proportion of pathogenic Campylobacter jejuni might have a higher level of success, given that our positive control gene CstI had been derived from this source. Animal feces, particularly chicken, has often been associated with the presence of Campylobacter.144,145 Fecal samples from chicken, pig, and goat were obtained from Southlands Farms and used to produce new small-insert metagenomic libraries for screening. Not enough DNA was obtained from the pig sample following lysis and purification, but 2-10 kb metagenomic libraries were constructed from the chicken and goat samples at sizes of 1.0 × 10^6 and 1.1 × 10^6 clones, respectively. These two libraries were each submitted to three rounds of FACS enrichment as previously described. Disappointingly, there was no apparent increase in the fluorescence retention of either library.
2.4.6 Additional Secondary Screening

In order to ensure that no ST gene hits were present in either the chicken or goat fecal samples, 24 colonies from each library (post-FACS round 3) were picked and tested for in vivo formation of α-2,3/2,6-Neu5Ac-Lactose-C2-BODIPY (Appendix B). As suspected based on the lack of increased fluorescence shift in the FACS data, none of the colonies demonstrated sialyltransferase activity. It was somewhat surprising that no ST genes could be enriched even from samples likely to contain pathogenic Campylobacter bacteria. Even if present, however, Campylobacter DNA would make up a very small amount relative to the more abundant commensal gut microbiota. This brought into question the sensitivity of the FACS system to enrich this gene class from a potentially very low starting amount. In order to determine the viability of the screen it was necessary to determine the limit of enrichment, or the minimum starting concentration of ST-containing cells in a library that could be enriched through FACS.

2.5 Limit of FACS Enrichments

2.5.1 Construction of Dosed Libraries

A negative control library, consisting of a Cellulomonas flavigena genomic DNA library containing no active sialyltransferases, was dosed with varying amounts of positive control cells (containing the sialyltransferase CstI). The proportion of positive control cells in the sample was estimated by determining the actual number of CstI-containing cells added to the sample (determined by plate titre) relative to the number of background negative control cells (estimated by OD600). Six samples were initially constructed, including a negative control sample with no positive-control cells added. Following estimation of the positive cell proportions in the samples it was found that two of the samples had no positive cells present (too much dilution) and the other samples contained 0.00004%, 0.00013%, and 0.006%.
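The dosing estimate described above (positive cells counted by plate titre, background cells estimated from OD600) can be sketched as follows. This is a minimal illustration under stated assumptions: the OD600-to-cell conversion factor of 8 × 10^8 cells per mL per OD unit is a commonly quoted rule of thumb for E. coli and is not taken from this work, and the colony counts, dilutions, and volumes are hypothetical.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Plate titre: colony count scaled by the fold dilution and plated volume."""
    if plated_volume_ml <= 0:
        raise ValueError("plated_volume_ml must be positive")
    return colonies * dilution_factor / plated_volume_ml


def cells_from_od600(od600: float, volume_ml: float,
                     cells_per_ml_per_od: float = 8e8) -> float:
    """Estimate cell number from optical density.

    cells_per_ml_per_od is an ASSUMED conversion factor; it should be
    calibrated for the actual strain and spectrophotometer.
    """
    return od600 * cells_per_ml_per_od * volume_ml


def spike_percent(positive_cells: float, background_cells: float) -> float:
    """Percentage of positive cells in the mixed population.

    At the very low doses used here, including or excluding the positive
    cells in the denominator makes no appreciable difference.
    """
    total = positive_cells + background_cells
    if total <= 0:
        raise ValueError("total cell number must be positive")
    return 100.0 * positive_cells / total


# Hypothetical example: 60 colonies from 0.1 mL of a 1e5-fold dilution,
# with 0.01 mL of the undiluted positive culture added to the library.
positive_titre = cfu_per_ml(colonies=60, dilution_factor=1e5, plated_volume_ml=0.1)
positive_cells = positive_titre * 0.01

# Hypothetical background: 12.5 mL of library culture at OD600 = 1.0.
background_cells = cells_from_od600(od600=1.0, volume_ml=12.5)

percent = spike_percent(positive_cells, background_cells)
print(f"estimated starting abundance: {percent:.5f}%")
```

Because the two cell counts come from different measurements (viable colonies versus optical density), the resulting percentage is best read as an order-of-magnitude estimate.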
2.5.2 Determination of Limit of Enrichment

The remaining four samples consisted of the negative control sample and three spiked samples. These samples were subjected to three rounds of FACS enrichment as before, with a miniprep of DNA obtained before and after each round of enrichment. As indicated in the FACS histogram data (Figure 2.10), there is a visible fluorescence shift in both the negative control and spiked samples. From these data it is clear that the fluorescent population shifts in the FACS histogram data are not indicative of ST gene enrichment. This may be from a non-specific background shift in the population, or an unidentified activity modifying the Lactose-C2-BODIPY acceptor for cell encapsulation. This second option seems quite unlikely though, as in previous experiments testing the lysate of cells following numerous rounds of FACS enrichment, no apparent modifications of Lactose-C2-BODIPY could be seen in the TLC data. It is difficult, therefore, to pinpoint the reason for this general shift in the fluorescence retention of the background cells, but increases in background fluorescence would likely inhibit the overall enrichment efficiency. Regardless, qPCR still proved an accurate method for quantifying the CstI gene enrichment for each sample. The relative abundance of the CstI gene in each sample through each round of FACS was quantified by qPCR and the results are summarized in Figure 2.11. Only the sample with the highest starting abundance of CstI-containing cells (0.006%) was enriched for CstI through FACS. From these data it appeared that a starting abundance of 0.006% (hit-containing cells within the library) was sufficient for an active ST gene to be enriched by FACS, whereas 0.00013% was not.

Figure 2.10 - Relative fluorescence of dosed libraries through three rounds of FACS enrichment. A negative control library (A) was tested, as well as dosed libraries with 0.00004% (B), 0.00013% (C), and 0.006% (D) of positive control cells.
Following treatment with Neu5Ac and β-Lactose-C2-BODIPY, these libraries were submitted to FACS for sorting of cells based on fluorescence (488 nm/530 nm and 488 nm/610 nm). A gate was set (shown in red) above which cells were sorted for growth and further screening by FACS. Following each round, cells were regrown and expressed on solid media prior to subsequent rounds of FACS. While all four libraries indicated an increase in cells containing CstI according to a shift in their fluorescence, only that of sample D correlated to actual increases in CstI abundance.

With only a single sample providing an estimation of the limit of enrichment, it was difficult to assign an accurate value to this limit. A second set of experiments was set up with dosages of positive cells closer to the previously determined value. This experiment was performed with spiked samples containing positive control cells in the amounts 0.00002%, 0.0002%, 0.002%, and 0.02%. Along with a negative control, the spiked samples were again subjected to three rounds of FACS enrichment, after which the proportion of CstI-containing cells within the resulting populations was determined by qPCR (Figure 2.11).

Figure 2.11 - Summary of qPCR data for dosed libraries. *Enrichment not detected.

The two samples with the higher starting concentrations of positive cells (0.02% and 0.002%) showed CstI enrichment through three rounds of FACS. Following FACS, 40 colonies from each of these enriched samples were picked and the actual number of CstI-containing cells determined by colony-PCR and sequencing. The qPCR data for the 0.02% and 0.002% samples correlated to 36/40 (90%) and 27/40 (68%) of the final population cells containing CstI, verifying sufficient enrichment. From these data, a slightly lower limit of detection was estimated to be 0.002%.
In a metagenomic library this would equate to a limit of detection/enrichment of 1 in 50,000 cells (or 1 in 5.0 × 10^4).

Figure 2.12 - Summary of relative CstI gene abundance in library through FACS enrichment.

Disappointingly, this limit was fairly poor relative to the complex and large metagenomic libraries being screened (10^5-10^6 clones). This also likely accounted for why no ST genes could be enriched from the sampled metagenomic libraries, even those presumed to contain Campylobacter species of bacteria. If the initial abundance of cells containing ST genes was not in excess of 1 in 50,000, it would have been very unlikely that these cells would be enriched through FACS at this level of sensitivity. The initial proof-of-principle experiments all proved successful because even the lowest starting concentration of positive cells (0.01%) was five-fold higher than the limit of enrichment.

2.6 Conclusions and Future Work

Through method optimization, it was determined that FACS could be used as an ultra-high throughput method for enriching ST genes from genomic libraries. Enrichment efficiency could be substantially improved by simultaneously growing and inducing cells on solid media, effectively minimizing growth bias for the amplification step between selections. Unfortunately this method was not successful on large, diverse metagenomic libraries due to the sensitivity of enrichment being too low relative to the abundance of ST-containing cells in the initial library population. Genes of interest must be more abundant initially for reliable enrichment using this method. This method may become more viable if the sensitivity of FACS sorting were improved, potentially through decreasing the selection of false positives by using stained live cells or applying more stringent SSC and FSC parameters when gating cells.
Alternatively, enrichment viability might be improved by increasing the initial abundance of genes-of-interest through adding substrates (e.g. Neu5Ac) to the environmental sample in order to boost the number of organisms containing genes for these compounds prior to sampling. This would be somewhat similar to the substrate-induced gene expression method (SIGEX)90 of identifying catabolic genes associated with specific nutrients, but functionally assaying the library to directly isolate the desired genes. Given these limitations, pursuing further development of this FACS-based approach to GT detection appeared an uncertain course of action. As such, a more direct plate-based functional metagenomic screen was developed to target a different carbohydrate-modifying activity. This work is discussed in Chapter 3. Additionally, to capitalize on the insights gained from this work and obtain a bacterial sialyltransferase of value for glycan synthesis, the directed evolution of a sialyltransferase was conducted using FACS. This work is discussed in Chapter 4.

Chapter 3: Functional Metagenomic Screening of the Human Gut Microbiome

3.1 Introduction

3.1.1 Functional Metagenomic Screening of GHs

Screening for the hydrolytic activity of GHs is readily achieved using synthetic, chromogenic glycosides whose cleavage results in the release of a chromophore or fluorophore that can be detected by a plate reader or even directly on growth media (agar). Reporter stains or additives such as Congo Red and Fast Blue RR have traditionally been used on agar plates along with carbohydrate sources to screen for GH activities such as β-glucanases84,146 or cellulases82,147, but these methods lack sensitivity, resulting in low hit rates. More sensitive fluorogenic reporters such as 4-methylumbelliferone can be used in plate assays (96-well or 384-well), but the challenge in this approach is scaling up the methodology to accommodate large libraries, such as those from metagenomic studies (>10^6 clones).
Recently, metagenomic libraries (10^4-10^5 clones) have successfully been screened in high-throughput (96-well or 384-well) assays using substrates with fluorogenic reporters (e.g. 4-methylumbelliferone) to discover novel GH genes.85,105,148 Regular screening of very large plate libraries (>10^6 clones) is challenging in terms of time, cost, substrate availability, and plate storage capacities. Screening of more manageable library sizes (10^4-10^5 clones) may be worthwhile though, since use of more sensitive assays substantially increases hit rates. Using fluorogenic reporters to screen plate-based metagenomic libraries for enzymes of interest is a rapid, economical approach that can be applied to almost any GH activity for which the fluorescent substrate is available. Scaling up of screens of large libraries has been demonstrated very recently using droplet-based assays in which enzymatic reactions are carried out in picoliter droplets, and bacterial enzymes of interest are then isolated through droplet sorting.149,150 This new approach offers an economical means of screening very large libraries, although it requires substantial time for initial development and screen optimization relative to plate-based approaches. Ultimately the choice of screen type will depend heavily on the time and resources available to the researcher conducting the screen.

3.1.2 Blood Antigen-Cleaving Enzymes

GHs capable of cleaving blood antigens are good candidates for functional metagenomic screening. The human blood group system consists of a large and diverse group of antigens. The most well-studied, clinically relevant, and immunodominant blood group is the ABO system. This system consists of carbohydrate-based A, B, and H antigens displayed on the red blood cell (RBC) surface.4,151 These antigens are also displayed on platelets, endothelial cells, and epithelial cells.152 Several acute and chronic medical conditions can be treated by blood transfusion.
Some of these include massive blood loss from trauma or surgery, cancer therapies, or chronic blood transfusions for diseases like sickle cell disease. Accidental mismatching of RBCs for blood groups during blood transfusion or any organ transplantation can lead to serious or fatal immune reactions.153-156 Many blood transfusions rely on the use of type O blood, as this blood type will not normally elicit any immune response. While any individual can receive blood of type O there is often a shortage of this “universal donor blood.” Novel methods to make existing type A or B blood transfusible to non-matching recipients have been developed, including methods to mask the immunogenicity of RBCs with polymeric systems157-161 and the enzymatic removal of terminal α-N-acetylgalactosamine (GalNAc) and α-galactose residues from type A and B antigens to convert them to type O.104,162-169 This use of glycosidases that are specific for the removal of α-1,3-linked GalNAc and α-1,3-linked Gal residues present on the termini of the A and B antigens was first pioneered by Goldstein in 1984.170 Unfortunately the efficiencies of the enzymes were too poor for safe or economical use, needing 1 gram of enzyme per unit of blood. Over the past 15 years a small number of screening studies have been carried out to discover more efficient bacterial enzymes capable of converting the blood antigens. To date the most efficient enzymes found were from screens in 2005168 and 2007104, which identified enzymes that are capable of cleaving either the terminal trisaccharide of the blood type A or B antigens or the terminal α-Gal/GalNAc residues, respectively. Despite these discoveries, issues still exist with linkage diversity and conversion efficiencies, particularly with the type A antigen. More efficient enzymes must be both discovered and engineered for improved activity in order for this conversion approach to become economical and consistently effective.
High-throughput functional metagenomics provides a means for this enzyme discovery. Highly sensitive fluorogenic substrates specific to these blood antigen-cleaving activities could be applied to carefully chosen metagenomic libraries to isolate these potentially valuable antigen-processing enzymes.

3.1.3 The Human Gut as a CAZyme Source

Blood cell antigens are not only displayed on RBCs, but also on other cells within the body. In particular, they are highly displayed on most intestinal epithelial cells171-173. As a nutrient-rich environment, the gut offers bacteria a large breadth of cell-displayed sugars for nutrition. It seems reasonable to suggest that the human gut microbiome may possess hydrolytic enzymes capable of targeting the blood antigens, specifically the terminal α-1,3-linked GalNAc and α-1,3-linked Gal residues. Some of the putative protein sequences within the CAZy database for blood type A (GH109) and blood type B (GH110) cleaving enzymes originate from known commensal gut bacteria such as Bacteroides sp. Additionally, a blood type B cleaving enzyme was recently cloned and expressed from the genome of the commensal gut bacterium Bifidobacterium bifidum163. A substantial volume of metagenomic sequence data has been collected from bacteria within the human gut46,174-176, but analysis of the functional roles of gut bacteria has largely focused on their roles in human dietary and disease processes.177-181 The human gut microbiome may be a valuable source of enzymes that are capable of modifying biologically relevant glycans for health and research purposes. Specifically, this environment may already contain novel blood antigen-modifying enzymes more efficient than those currently available or predicted by sequencing, enzymes which can be identified by functional metagenomic screening.
3.2 Rationale and Research Goals

Bacterial blood antigen-cleaving enzymes are good candidates for functional metagenomic screening, based on their potential for use in engineering universal donor blood. Functional metagenomic screening is well suited to the discovery of GH activities, such as the exoglycosidase activity of these desired enzymes. The human gut offers a rich environment for bacterial degradation of human glycans, including the type A and type B blood antigens. The goals of this research are to (1) create metagenomic libraries of the human gut microbiome, (2) screen these libraries for α-galactosidase and α-N-acetylgalactosaminidase enzymes specific to blood group antigens using a targeted, high-throughput approach, and (3) determine the function of any discovered enzymes relative to those currently available.

3.3 Development of Screen

3.3.1 Screening Approach

In order to identify α-galactosidases and α-N-acetylgalactosaminidases that are specific to the blood group antigens A and B, the terminal trisaccharide of the full blood group antigen (Figure 3.1A) must be present on the screening substrate. While available within our lab, this complex substrate (Figure 3.1B) would be rapidly consumed by the screening of even a single 60 plate, 384-well metagenomic library. In order to conduct the screen more economically, simple, activated aryl monosaccharide substrates would first be used to screen for enzymes having any α-galactosidase or α-N-acetylgalactosaminidase activity. These simple substrates would contain the highly fluorescent 4-methylumbelliferyl reporter (Figure 3.1B) to allow for the detection of enzymes having even low activity on the substrate. Any initial hits would then be tested on the more complex blood antigen mimicking substrates containing the terminal trisaccharide necessary to verify selectivity.
While this approach runs the risk that simple substrates would not be used by some enzyme classes, experience has shown that such aryl glycosides are cleaved by most related glycosidases. Libraries would be constructed in the E. coli screening strain ReplicatorFOS (similar to EPI300), which is capable of accommodating either large- or small-insert libraries. Ideally a large insert library would be constructed to maximize screening coverage (more tested DNA per clone, >25 kb vs <10 kb), with the option of constructing small insert libraries if any challenges arise.

Figure 3.1 - Blood group antigens and screening substrates. (A) Chemical structures of the ABO blood group system antigens. (B) The blood type A antigen mimicking substrate MU-Type 2Atetra (left) along with the two simple screening substrates α-GalNAc-MU (Type A) and α-Gal-MU (Type B) (right).

3.3.2 Screening Strain Background Activity

The first requirement of the screening strain for assay viability was the determination of any background α-galactosidase or α-N-acetylgalactosaminidase activity. If any background activity existed, this would obstruct the ability to screen using the α-Gal-MU and α-GalNAc-MU substrates. Both clarified supernatant and unclarified supernatant (including cell debris) following lysis of the screening strain ReplicatorFOS were added to an assay mixture containing 200 µM concentrations of either α-Gal-MU or α-GalNAc-MU and the fluorescence monitored for 1 hour at 37°C. The cells also contained an empty expression vector (pHSG396) to ensure that the β-galactosidase expressed by most expression vectors for blue/white screening did not produce undesirable background activity. Wells containing positive control enzymes were also included for each substrate. For the α-GalNAc-MU substrate, the enzyme Elizabethkingia meningoseptica GH109 (EmGH109) was used. This enzyme is known to cleave the blood type A antigen as well as α-GalNAc-MU, albeit slowly.
For the α-Gal-MU substrate, the enzyme Bacteroides fragilis GH110 (BfGH110) was used. This enzyme is known to cleave the blood type B antigen as well as α-Gal-MU, albeit slowly. These control enzymes, though less active on the simple substrates to be used in the primary screen, were more reflective of the activity expected from enzymes specific to the complex blood group antigens. The activities of all samples are summarized in Table 3.1. When testing clarified lysates (centrifuged lysate containing no cell debris) of all three controls, the screening strain exhibited no activity on either substrate, as desired. Unfortunately when unclarified cell lysates (un-centrifuged lysate containing cell debris and cell wall-bound proteins) were tested, the screening strain exhibited strong activity against the α-Gal-MU substrate. An α-galactosidase produced within the screening strain was likely responsible. As a result, α-Gal-MU had to be excluded from the screen until this background activity could be abolished. Fortunately there was no background activity against the α-GalNAc-MU substrate, which was only cleaved in the presence of EmGH109. Following this, it was necessary to validate this screening system for a 384-well plate format.

Table 3.1 - Background activity of screening strain.

Sample                                   α-Gal-MU   α-GalNAc-MU
Clarified Cell Lysate:
  ReplicatorFOS + pHSG396(empty)           (-)         (-)
  ReplicatorFOS + pHSG396(EmGH109)         (-)         (+)
  ReplicatorFOS + pHSG396(BfGH110)         (+)         (-)
Unclarified* Cell Lysate:
  ReplicatorFOS + pHSG396(empty)           (+)         (-)
  ReplicatorFOS + pHSG396(EmGH109)         (+)         (+)
  ReplicatorFOS + pHSG396(BfGH110)         (+)         (-)
*Unclarified cell lysate contained cell debris following lysis.

3.3.3 384-Well Screen Validation

In order to validate the screen for 384-well format, a quantitative assessment of the screen quality was conducted. The most common way to do this is the calculation of Z-factor (Z’) for the screen.
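For reference, Z’ is conventionally computed from the means and standard deviations of the positive and negative control wells as Z’ = 1 - 3(SD_pos + SD_neg) / |mean_pos - mean_neg|. The following Python sketch is a minimal illustration of that calculation, together with the quality bands used in this section; the function names and the fluorescence readings are hypothetical and are not data from this screen.

```python
from statistics import mean, stdev
from typing import Sequence


def z_prime(positive: Sequence[float], negative: Sequence[float]) -> float:
    """Z' = 1 - 3 * (SD_pos + SD_neg) / |mean_pos - mean_neg|.

    Uses the sample standard deviation of each control group.
    """
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / separation


def classify(z: float) -> str:
    """Map Z' onto the quality bands described in this section."""
    if z < 0:
        return "poor"
    if z < 0.5:
        return "satisfactory"
    return "very good"


# Hypothetical fluorescence readings (arbitrary units), for illustration only.
negative_wells = [10000, 10200, 9800, 10100, 9900]
positive_wells = [15000, 15400, 14600, 15200, 14800]

z = z_prime(positive_wells, negative_wells)
print(f"Z' = {z:.3f} ({classify(z)})")
```

Because Z’ penalizes the spread of both control groups, a slowly reacting positive control with a modest signal window can give a low but still usable value even when the two populations are visibly separated.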
The Z-factor is a measure of statistical effect size that has been proposed for use in high-throughput screening to assess whether the response of ‘hits’ is large enough to warrant further investigation. This value can be estimated for a given screen by assessing the response of positive controls over a background of negative controls. For this screen, control cells were grown and induced for protein expression (IPTG) overnight, then a lysis/assay mixture containing the screening substrate α-GalNAc-MU (50 µM) was added. Following incubation at 37°C for 1, 3, and 20 hours, the fluorescence (365/440 nm) of the wells was measured by plate reader. Five rows (120 wells) of the plate were dosed with negative control cells (ReplicatorFOS+pHSG396[empty]) while two rows (48 wells) were dosed with positive control cells (ReplicatorFOS+pHSG396[EmGH109]). A significant increase in the fluorescence of the positive control cells was evident following 20 hours of incubation (Figure 3.2). This slow response (>20 hours) was expected for the positive control enzymes, which are known to react poorly with simple monosaccharide substrates compared to trisaccharide substrates, which better mimic their natural blood antigen targets. From these results, a Z’ value of 0.25 was determined. When assessing plate screens, Z’ values of less than 0 indicate a poor screen, values between 0 and 0.5 indicate a satisfactory assay, and values from 0.5 to 1.0 indicate a very good screen. Based on this Z’, the screen was sufficient to move forward with screening metagenomic libraries. This control experiment also closely mirrored the likely response from a low-activity hit within a real microbiome library, which might not completely cleave all substrate in the reaction assay.

Figure 3.2 - Plate screen validation. Following 20 hours of incubation at 37°C, the fluorescence of each well in the plate was measured (365/435 nm).
All of the negative controls were clustered in the first 120 wells while all of the positive control wells were clustered in the remaining 48 wells. Any wells in which cells did not grow were excluded from analysis.
[Figure 3.2 plot: α-GalNAc test screen (384-well); x-axis: wells; y-axis: fluorescence 365/435 nm (ABU); reference lines at the mean, 3 SD, and 10 SD; positive controls marked; Z-factor (Z’) = 0.251.]

3.4 Library Construction
With the aim of identifying enzymes from bacteria in the human gut that might cleave the terminal sugar on blood group antigen A (α-N-acetylgalactosaminidases), fecal samples from two participants were obtained. The blood types of the participants were O and A, respectively. It was difficult to predict whether any blood type A-cleaving enzymes could be derived from a blood type O individual. While a blood type O individual would not display any type A antigens on the epithelial cells of his/her intestinal tract, the genes coding for the ability to cleave this antigen might still be conserved within the commensal gut bacteria. The presence of type A antigens in the gut lining of a blood type A individual would potentially increase the proportion of these same genes, increasing the likelihood of hits from this individual. Either way, it seemed prudent to include multiple samples, and this additional type (blood type O) was readily available. DNA was extracted from these samples using a methodology similar to that for extracting DNA from soil (chemical lysis followed by chloroform/ethanol treatment). As discussed in the previous chapter, metagenomic libraries can be produced as either large-insert (>25 kb DNA in fosmid) or small-insert (<10 kb DNA in plasmid) constructs. Initial attempts focused on producing a large-insert fosmid library, as this type of library would allow a larger diversity of metagenomic DNA to be screened (i.e. more metagenomic DNA present in each clone that is screened).
Three attempts were made to produce fosmid libraries using the CopyControl fosmid system for heterologous expression in ReplicatorFOS cells. Briefly, following DNA extraction the DNA was further purified by ultracentrifugation in a cesium-chloride gradient overnight, removed from the gradient, and applied to pulsed-field gel electrophoresis, and DNA of the correct size (35-45 kb) was isolated from the gel prior to ligation within the fosmid expression system. No clones were obtained in any of these attempts. Following failed attempts by other lab members to produce libraries from this source, it has since been inferred that metagenomic DNA from fecal samples requires numerous rounds of purification in order to remove contaminants that severely hinder ligation and the formation of fosmid libraries. As a result, library construction shifted to the creation of small-insert (<10 kb) libraries from both samples. Two methods were applied for the creation of the small-insert libraries. First, metagenomic DNA was sheared by sonication and subjected to gel electrophoresis, and DNA in the size range of 5-10 kb was excised from the gel. This was end-repaired and subjected to blunt-end ligation into the small-insert expression vector pHSG396 previously digested with SmaI. Following this ligation, the DNA was transformed into the screening strain ReplicatorFOS. The library size was quite poor (approximately 10^3) despite multiple attempts, indicating very poor ligation efficiency using blunt-end cloning. In order to form a library of 60 plates (384-well), a standard plate library size based on screening time and storage capacity, the library yield would need to be in excess of 2.4 × 10^4. In order to obtain a sufficient library size, the DNA was instead subjected to a partial digest with the restriction enzyme Sau3AI. The DNA was subsequently subjected to gel electrophoresis and DNA in the size range of 3-10 kb was excised from the gel.
This was then ligated into pHSG396 that had been digested with BamHI (Figure 3.3). The resulting constructs were transformed into cells of the screening strain ReplicatorFOS and plated. The plated colonies were picked by robotic colony picker into 384-well plates for growth and storage at -80°C. Two metagenomic libraries were created from the participants of blood type O and blood type A (8.8 × 10^3 and 2.3 × 10^4, respectively). While these libraries were relatively small, they were carried forward for screening in order to determine whether any hits could be identified before spending time and resources scaling up the screen size.

Figure 3.3 - Production of small-insert libraries. Purified metagenomic DNA was partially digested with Sau3AI and size-selected on agarose gel (Left), cloned into the small high-copy expression vector pHSG396 (Middle), then transformed into ReplicatorFOS E. coli cells for plating and library storage (Right).

3.5 Plate Screening of Gut Microbiome Libraries
The two libraries (hereafter referred to as Type O and Type A) were used to inoculate fresh autoinduction growth media in 384-well plates, and the plates were grown for 20 hours at 37°C. Due to the time required to screen each library, the Type O library was screened and analyzed first, followed by the Type A library. Following sufficient growth, the lysis/assay mixture containing α-GalNAc-MU (50 µM) was added to the Type O library plates using an automated filling device (QFill) and incubation commenced at 37°C. The fluorescence (365/440 nm) of each well was read using a plate reader following 22 hours of incubation. The fluorescence data for all Type O library plates (28 plates) were compiled for analysis (Figure 3.4). All of the wells fluorescing above 10 standard deviations (SD) from the mean were transferred to a secondary hit plate for validation.
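The hit-calling rule described above can be sketched as follows. This is a minimal illustration that assumes the mean and standard deviation are computed over all compiled wells, including any hits. The function name `call_hits` and the example readings are hypothetical and are not data from these libraries.

```python
from statistics import mean, stdev


def call_hits(readings, n_sd=10.0):
    """Return indices of wells reading above mean + n_sd * SD.

    Assumption: the mean and SD are taken over every compiled well,
    hits included, which makes the cutoff conservative.
    """
    cutoff = mean(readings) + n_sd * stdev(readings)
    return [i for i, value in enumerate(readings) if value > cutoff]


# Hypothetical readings: 200 background wells plus one bright well.
readings = [100.0, 102.0] * 100 + [500.0]
print(call_hits(readings))
```

Because a bright well inflates the SD used for its own cutoff, a rule of this kind only works when hits are rare relative to the number of wells compiled.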
Of the 4 hits identified in the initial screen, 3 retained activity in the secondary screen and were carried forward for further analysis. From the data it was also evident that an edge effect had occurred. Edge effects arise from the increased evaporation of water and media in the wells closest to the perimeter of the plate, leading to decreased cell growth and activity in those wells. The Type A library plates were then assayed in the same way as the Type O library, except that the plates were grown and incubated in a high (>90%) humidity chamber to reduce edge effects. The fluorescence data for all Type A library plates (60 plates) were compiled and analyzed in the same way as for the Type O library (Figure 3.5). Edge effects were largely eliminated in the Type A library screen, and three hits (above 10 standard deviations from the mean) were transferred to a secondary hit plate for validation. All three hits retained activity in the secondary screen and were carried forward for analysis.

Figure 3.4 - Type O library plate screen results. Following incubation of the assay plates, the fluorescence (365/440 nm) was determined and hits identified as any well fluorescing above 10 SD.
[Figure 3.4 plot: Type O library screen results; x-axis: well; y-axis: fluorescence (ABU); 10 SD threshold marked.]

Figure 3.5 - Type A library plate screen. Following incubation of the assay plates, the fluorescence (365/440 nm) was determined and hits identified as any well fluorescing above 10 SD.

3.6 Initial Hit Characterization
3.6.1 Hit Isolation and Storage
The wells from the storage library corresponding to the six hits identified through primary screening (3 from each library) were re-streaked on fresh agar plates and a single colony of each was picked for storage and downstream analysis. The activity of each colony against the α-GalNAc-MU substrate was re-verified in vitro (lysate) prior to storage.
These six hits were initially referred to as E2 (Type O), E12 (Type O), F1 (Type O), P5 (Type A), P19 (Type A), and P24 (Type A).
[Figure 3.5 plot: Type A library; x-axis: well; y-axis: fluorescence (ABU); 10 SD threshold marked.]

3.6.2 Determination of Hydrolytic Activity on Blood Type A Antigen
3.6.2.1 N-Acetylgalactosaminidase Coupled Assay on Blood Type A Antigen
As the overall intent of screening for α-N-acetylgalactosaminidase activity was to identify enzymes capable of cleaving α-GalNAc from the type A antigen displayed on red blood cells, the hits were subjected to a specialized secondary assay. This assay utilized a tetrasaccharide substrate that mimics the blood type A antigen and is coupled to the reporter 4-methylumbelliferone (MU). This complex substrate requires the coupled action of an α-fucosidase (AfcA), a β-galactosidase (BgaA), and a β-hexosaminidase (SpHex) in order for the reporter to be released following cleavage of the terminal α-GalNAc residue (Figure 3.6). These enzymes were expressed and purified prior to performing the coupled assay with the screening hits.

Figure 3.6 - Overview of N-acetylgalactosaminidase coupled assay on blood antigen A substrate. The stable substrate (MU-Type2Atetra) mimics the terminal tetrasaccharide of the blood type A antigen. Three coupled enzymes will only break down the internal trisaccharide and release the reporter (MU) once the terminal α-GalNAc residue is released by the action of an exo-α-N-acetylgalactosaminidase. The release of reporter (MU) can be monitored continuously as a change in fluorescence (365/440 nm).

3.6.2.2 Coupled Assay Results
The six hits were separately grown and expression-induced, then lysed, and the lysate was applied to an assay mix containing the tetrasaccharide substrate MU-Type2Atetra (50 µM) previously synthesized by Dr. David Kwan [182] within our research group.
Along with controls, including cells expressing the positive control enzyme EmGH109, the hits were continuously monitored on plate for an increase in fluorescence (365/440 nm) corresponding to release of the MU reporter (Figure 3.7). The hit P5 from the blood type A library cleaved the tetrasaccharide substrate, as do the known blood type A-cleaving enzymes. None of the other hits demonstrated any blood antigen-cleaving activity above the negative controls.

Figure 3.7 - Type A antigen coupled assay results. Fluorescence (365/440 nm) was continuously monitored for approximately 3 hours to determine enzymatic activity against the type A antigen substrate MU-Type2Atetra. Traces shown: (+) control (EmGH109), P5, E2, E12, F1, P19, P24, and (-) control (no enzyme).

Following activity verification on both the primary and secondary screening substrates, all six hits were submitted for sequence analysis to identify the specific genes responsible for activity.

3.6.3 Sequence Analysis
DNA from each hit identified in primary screening was isolated and submitted for sequence analysis. The hits were first submitted for end-sequencing (Sanger) to determine whether any were identical sequences from within each metagenomic library. This might occur if a particular gene is present within the library in reasonably high abundance and is subsequently isolated multiple times within a screen. Three of the hits were duplicates, two from the Type O library and one from the Type A library, resulting in 3 unique hits overall. These remaining hits were E2 (Type O), P5 (Type A), and P19 (Type A). This allowed the overall hit rate of the plate assay to be calculated as 3 in 33,000, or approximately 0.01%, from the tested libraries.
This was one to two orders of magnitude lower than what has been seen in a small-insert library for a more common glycosidase activity such as cellulase activity [101]. As the libraries were both small-insert (<10 kb), the full sequence of the metagenomic inserts could be determined simply by primer walking. This entailed sequencing approximately 800-1000 bp from each end of each insert using sequencing primer sites within the expression vector (pHSG396), then designing primers based on the determined sequence and re-sequencing with those to obtain more sequence information. Once the full insert sequences were mapped, the data were analyzed using open reading frame (ORF) prediction and BLASTX (NCBI) [183] to determine the most likely protein-coding regions within the DNA. For all hits, only a single ORF over 1 kb in length was found, indicating a likely candidate gene responsible for activity. BLASTX searches of the NCBI non-redundant (nr) database, which use a translated nucleotide query, provided the most closely matching protein sequences for the DNA sequence of interest. All three of the identified ORFs were protein-coding regions with proposed glycosidase activities. The hits E2 and P19 both matched (≥97% sequence identity over full length) putative α-xylosidases from Bacteroides plebeius (WP_007559952) [50] and Bacteroides caccae (WP_005682123) [50], respectively (Figure 3.8).

Figure 3.8 - Sequence identity of screening hits to annotated GH31 genes from GenPept.

The proposed α-xylosidase activity was based on the presence of a conserved catalytic domain from the glycoside hydrolase family 31 (GH31). This family contains members with α-glucosidase, α-galactosidase, α-mannosidase, α-xylosidase, and α-glucan lyase activities. Surprisingly, there were no reported members of GH31 exhibiting α-N-acetylgalactosaminidase activity. This activity has only been reported in families GH27, GH36, and GH109.
Despite this, it has recently been reported that members of GH31 share similar structural characteristics with members of GH27 and GH36, and fall into Clan D of glycoside hydrolases [184]. The hits E2 and P19, hereafter referred to as BpGH31 and BcGH31, shared no significant (>30%) sequence similarity with known members of that family beyond the conserved catalytic domain. As such, these two genes potentially formed a new sub-family of GH31, but all GH activities and substrate preferences would have to be confirmed first. These two genes were also aligned to other characterized GH31 enzymes for analysis of their phylogenetic relationship (Figure 3.9).

Figure 3.9 - Phylogenetic tree analysis of characterized GH31 enzymes.

It should also be noted that the BpGH31 and BcGH31 sequences within my hits represented only the N-terminal 73% and 69% of the putative full gene sequences, respectively (Figure 3.8). Both hits were missing the remaining C-terminal regions of the putative full genes. The missing C-terminal region of both proteins appeared to contain two additional domains not necessary for hydrolytic activity. The first was an F5/8 Type C domain (or C2-like domain), named after its presence at the C-terminus of blood coagulation factors V and VIII. This is a major protein domain of many blood coagulation factors and is believed to promote binding to anionic phospholipids on the surface of endothelial cells and platelets [185]. Similar domains are also found on other extracellular and membrane-bound proteins. The second missing domain was a cohesin domain, which interacts with a counterpart dockerin domain found on other proteins, often to form a multi-enzyme complex of hydrolytic enzymes. A classic example of this type of multi-enzyme complex is the cellulosome, which works as a complex of binding and hydrolytic sub-units to attach to a substrate (e.g. cellulose) and degrade it.
While often found in anaerobic bacteria such as these two Bacteroides sp., the cohesin domain is usually part of the scaffold protein, or “scaffoldin,” which is not hydrolytic itself but coordinates a set of hydrolytic enzymes containing dockerin domains. Neither of these domains would be necessary for catalytic activity, allowing the truncated proteins to be active on the screening substrate. The other hit, P5 (462 aa), matched (100% sequence identity, 89% coverage) a Bacteroides vulgatus α-N-acetylgalactosaminidase (520 aa) from GH family 109 (GH109), hereafter referred to as BvGH109. As with the hits E2 and P19, a small portion of the C-terminus of the full gene was missing from the hit identified through screening. This uncharacterized gene was a putative α-N-acetylgalactosaminidase proposed to act on the terminal α-GalNAc of the blood type A antigen based on sequence similarity to the previously characterized gene EmGH109 from the same family. This suggested that BvGH109 would indeed act on the blood type A antigen as hoped, and agreed with the functional assay data showing type A antigen-cleaving activity by the lysate of the hit P5 (containing BvGH109). If active once isolated and expressed, this would be the first bacterial enzyme with this activity to be isolated and characterized since the discovery of the E. meningoseptica enzyme (EmGH109) in 2007 [104], though the family contains 283 uncharacterized sequences at this time.

3.7 Sub-Cloning, Expression, and Activity Verification
In order to verify the identified ORFs as the genes of interest, all three (BpGH31-E2, BcGH31-P19, and BvGH109-P5) were sub-cloned into the expression vector pET29a, which provides a C-terminal His6-tag, and transformed into BL21(DE3) cells for expression and purification. From their initial induction profiles (Figure 3.10) it appeared that only the BpGH31(E2) ORF expressed very well, although expression of the BcGH31(P19) ORF was also visible on the gel.
No band showing expression of the BvGH109 (P5) ORF was detectable. Following initial Ni2+ column purification, BpGH31(E2) and BcGH31(P19) were obtained at purities of approximately 90% and 60%, respectively (Figure 3.10). BcGH31(P19) had to be re-purified on Ni2+ resin to improve purity to approximately 90%, although the subsequent yield of BcGH31(P19) was very low (< 1 mg). No soluble BvGH109 (P5) could be expressed or purified in this form. Both purified proteins were active when tested against the screening substrate α-GalNAc-MU, indicating that the identified ORFs were indeed the regions within the metagenomic inserts responsible for activity. Both BpGH31(E2) and BcGH31(P19) were then carried forward for further characterization.

Figure 3.10 - SDS-PAGE of purified hits. Following a single purification on Ni2+ resin, the approximate purity and yield of the enzymes was determined by SDS-PAGE.

With no apparent expression of the BvGH109 ORF in either its partial gene form or as a C-terminal His-tagged protein, further steps were needed to enable soluble expression. First, the full-length version of the gene, including the sequence for the missing 58 amino acids at the C-terminus, was assembled by constructing the missing gene sequence and appending it to the truncated gene. This was accomplished by ordering the codon-optimized region of missing amino acids (GENEBLOCKS, BioBasic Inc.) along with some overlapping region. Overlap-extension PCR was then performed to obtain a full-length gene product. The full gene was sub-cloned into pET16b, an alternative expression vector providing an N-terminal His10-tag. The full-length BvGH109 gene contained an N-terminal 28 amino acid signal peptide sequence, which often has a negative effect on soluble expression. A truncated version of the gene with the signal peptide sequence removed (tBvGH109∆2-28) was created and sub-cloned into pET16b for testing in parallel with the full sequence.
Both versions of the gene were induced in BL21(DE3) cells and the induction profile of each was obtained (Figure 3.11). The version of the protein with the signal peptide removed clearly expressed far better than the full-length enzyme, although lysates from both versions were active on the α-GalNAc-MU and MU-Type2Atetra substrates. As a result, only the truncated version of the gene (hereafter referred to as tBvGH109) was carried forward for further characterization.

Figure 3.11 - SDS-PAGE protein expression test for BvGH109 (P5). A comparison of induction was performed between the full-length BvGH109 (P5) gene and its truncated form following the removal of an N-terminal signal peptide sequence.

3.8 Characterization of BpGH31(E2) and BcGH31(P19)
3.8.1 Substrate Specificity
Based on sequence similarity to proteins from GH family 31, the substrate specificities of these two enzymes were tested using either o-nitrophenyl or methylumbelliferyl glycosides of α-N-acetylgalactosamine, α-galactose, α-glucose, α-mannose, α-xylose, β-galactose, and β-glucose. The results are summarized in Table 3.2. Apart from α-GalNAc-MU, the enzymes did not display activity on any of the substrates except for a small amount of activity on β-Gal-oNP. When BL21(DE3) cells are used for expression, a small amount of activity on β-Gal-oNP is often attributed to background β-galactosidase produced by the expression strain (a lacZ+ strain) and carried through purification. No activity was seen in the negative control consisting of lysate from the expression strain (BL21) carried through the same purification steps as the enzymes BpGH31(E2) and BcGH31(P19). This suggests that the activity is specific to the enzymes in question and not a background activity carried through purification. This is not definitive, though, as these results did not preclude the possibility of β-galactosidase co-eluting with the enzymes in question during purification.
In order to determine whether this activity was background or specific to the enzymes in question, each enzyme (20 µg) was pre-incubated at 37°C with or without 2,4-DNP-2F-β-Gal (2.5 mM), an inactivator of β-galactosidase activity. Following this incubation, aliquots of the reaction were transferred to a second assay mixture containing either α-GalNAc-MU or β-Gal-MU (100 µM final concentration). These reactions were then monitored on a plate reader (365/440 nm). Presumably, if BpGH31(E2) and BcGH31(P19) have both β-galactosidase and α-N-acetylgalactosaminidase activity, then incubation with an inactivator of β-galactosidase activity should have an inhibitory effect on α-N-acetylgalactosaminidase activity through active site competition, reducing hydrolysis of the α-GalNAc-MU substrate. Based on a qualitative assessment of the results (Figure 3.12), it appeared that the β-galactosidase activity seen previously was the product of another enzyme (background), as hydrolysis of α-GalNAc-MU was not affected by pre-incubation with 2,4-DNP-2F-β-Gal.

Figure 3.12 - Inhibitory effect of 2,4-DNP-2F-β-Gal on α-N-acetylgalactosaminidase activity of BpGH31 (E2)

It is difficult to know whether there might be additional activities on structures more complex than the simple monosaccharide substrates tested here. Further study of these enzymes with a broad selection of substrates is warranted based on these results. Ultimately, it appeared that these enzymes differed in their preferred substrate from all known GH31 enzymes, exhibiting a new activity for the family.

Table 3.2 - Testing substrate specificity of BpGH31(E2) and BcGH31(P19). Each substrate containing a 2-nitrophenyl moiety (10 mM) was tested at pH 7.5 and the reaction monitored continuously at 405 nm. Assays with MU-containing substrates were monitored by fluorescence (365/440 nm).
Substrate       BpGH31(E2)    BcGH31(P19)    (-) BL21(DE3) lysate
α-GalNAc-MU         +              +                  -
α-Gal-oNP           -              -                  -
α-Glc-oNP           -              -                  -
α-Man-oNP           -              -                  -
α-Xyl-oNP           -              -                  -
β-Gal-oNP           -              -                  -
β-Glc-oNP           -              -                  -

3.8.2 Classification
As mentioned before, the two gene sequences for BpGH31(E2) and BcGH31(P19) shared a conserved domain for hydrolytic activity with members of the glycoside hydrolase family 31 (GH31). There was little similarity to any GH31 proteins beyond this region, indicating that these proteins may form a new sub-family within GH31. Interestingly, there are no known members of this family that exhibit α-N-acetylgalactosaminidase activity, which would make these enzymes the first of this GH family to have such activity. To date, this activity has only been found in bacteria within GH families 36 and 109. It also appears within GH family 27 for eukaryotes (humans and fungi). This furthers the argument that these newly isolated enzymes would constitute a new sub-family within GH family 31. This inference was confirmed through correspondence with Dr. Bernard Henrissat (curator of the CAZy database).

3.8.3 Kinetics
Michaelis-Menten kinetic parameters for the hydrolysis of α-GalNAc-MU by BpGH31(E2) and BcGH31(P19) were determined at 25°C and pH 7.5 (Table 3.3). As expected based on their high protein sequence similarity (67%) and shared hydrolytic domain sequence, the Michaelis-Menten parameters (KM and kcat) of both enzymes are almost identical.

Table 3.3 - Michaelis-Menten parameters for cleavage of α-GalNAc-MU by BpGH31(E2) and BcGH31(P19) at pH 7.5 (25°C).
Enzyme         KM (µM)      kcat (s^-1)     kcat/KM (M^-1 s^-1)
BpGH31(E2)     126 ± 16     0.68 ± 0.04     5.4 × 10^3
BcGH31(P19)    111 ± 6      0.50 ± 0.01     4.4 × 10^3

3.8.4 pH Optima
An estimate of the pH optima for the two GH31 enzymes was determined using a stopped assay.
Both enzymes were incubated with the α-GalNAc-MU substrate (50 µM) for 15 minutes at room temperature in buffers of varying pH (3.0-9.0), then stopped/quenched in glycine buffer (pH 10) prior to reading the fluorescence (365/440 nm). The amount of hydrolysis following 15 minutes of incubation (measured by MU release) was determined for each enzyme under varying pH conditions and compared in order to estimate the pH optimum. Both enzymes were active within a pH range of 5.0-8.0, with their highest activities at pH 6.0-6.5 (Figure 3.13). This is consistent with many bacterial enzymes functioning in the gut (caecum) [186].

Figure 3.13 - pH activity profiles of BpGH31(E2) and BcGH31(P19). Hydrolysis of α-GalNAc-MU (50 µM) was performed by each enzyme in varying pH environments for 15 minutes (RT), an aliquot of the reaction was quenched in 1 M glycine (pH 10.4), then the resulting fluorescence (365/440 nm) was determined by plate reader (Biotek Synergy). Michaelis-Menten kinetic parameter (kcat/KM) values were determined in Grafit and plotted against the tested pH values.

3.8.5 Stereochemical Outcome of Hydrolysis
In order to determine whether the enzymes function through an inverting or retaining hydrolytic mechanism, cleavage of α-GalNAc-MU by BpGH31(E2) and BcGH31(P19) was monitored by 1H NMR (300 MHz, Bruker). The 1H NMR spectrum was first obtained for a standard of equilibrated α/β-GalNAc (10 mM) dissolved in D2O. Peaks for the anomeric hydrogens of both the α-GalNAc and β-GalNAc in the solution were readily discernible in the resulting spectrum at 5.2 ppm and 4.6 ppm, respectively. Either BpGH31(E2) or BcGH31(P19) (100 µg or 8 µg, respectively) was added to a reaction mixture containing 50 mM HEPES buffer (pH 7.5) and α-GalNAc-MU (5 mM). Following addition of the enzyme and mixing, the sample was immediately submitted for monitoring by 1H NMR (300 MHz, Bruker). The first spectrum could be obtained following 5 minutes of reaction.
In the reaction by BpGH31(E2), hydrolysis product was observable following 5 minutes of reaction. Within the resulting 1H NMR spectra of the reaction, only the single peak for the anomeric hydrogen of an α-GalNAc product was observed at 5.2 ppm, indicating a net retention of stereochemical configuration (Figure 3.14). In the hydrolysis reaction with BcGH31(P19), no hydrolysis product peaks could be observed following 5 or 10 minutes of reaction. This was likely the result of using substantially less enzyme in the reaction (8 µg as opposed to 100 µg); this amount constituted the entire stock of purified enzyme, which had exhibited low purification yields. Based on the sequence identity between the two enzymes (67%) it is reasonable to hypothesize that BcGH31(P19) follows the same retaining mechanism as BpGH31(E2), although this will need to be verified. The stereochemical outcome of the hydrolysis mechanism for these enzymes is consistent with that of the only other known bacterial α-N-acetylgalactosaminidases, from families GH36 and GH109. A retaining mechanism is also observed for all GH31 enzymes, although these act on different substrates.

Figure 3.14 - Determination of the anomeric stereochemical outcome of hydrolysis by 1H NMR. The hydrolysis of α-GalNAc-MU (5 mM) by BpGH31(E2) at room temperature was monitored by 1H NMR. A 1H NMR spectrum of the reaction was immediately obtained (right) and the anomeric stereochemistry of the hydrolysis product determined. Stereochemistry was determined by observation of the anomeric hydrogen peak appearing at either 5.2 ppm (α-GalNAc) or 4.6 ppm (β-GalNAc). Only α-GalNAc product was observed following rapid hydrolysis of the starting material, indicating a net retention of stereochemistry (retaining mechanism) by BpGH31(E2).
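Returning briefly to the kinetic parameters in Table 3.3, the reported catalytic efficiencies can be checked directly from the tabulated KM and kcat values. The sketch below does this and also evaluates the Michaelis-Menten rate law for a given substrate concentration. The function names are hypothetical, and no curve fitting is attempted.

```python
def catalytic_efficiency(kcat_per_s, km_micromolar):
    """Return kcat/KM in M^-1 s^-1 from kcat (s^-1) and KM (µM)."""
    return kcat_per_s / (km_micromolar * 1e-6)


def mm_rate(substrate_um, kcat_per_s, km_um, enzyme_um):
    """Michaelis-Menten rate in µM/s: v = kcat [E] [S] / (KM + [S])."""
    return kcat_per_s * enzyme_um * substrate_um / (km_um + substrate_um)


# kcat (s^-1) and KM (µM) as reported in Table 3.3 (25°C, pH 7.5).
table_3_3 = {"BpGH31(E2)": (0.68, 126.0), "BcGH31(P19)": (0.50, 111.0)}
for name, (kcat, km) in table_3_3.items():
    print(f"{name}: kcat/KM = {catalytic_efficiency(kcat, km):.2e} M^-1 s^-1")
```

The recomputed values, roughly 5.4 × 10^3 and 4.5 × 10^3 M^-1 s^-1, agree with Table 3.3 to within what rounding of the tabulated KM and kcat could plausibly explain.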
3.8.6 Potential Function in Nature
The context in which the BpGH31 and BcGH31 enzymes work in nature is unknown at this stage, but an examination of where α-N-acetylgalactosamine is most often found provides some clues. Many mammalian proteins are glycosylated at either an asparagine (N-linked) or a serine/threonine residue (O-linked). There are two dominant forms of O-glycosylation, O-GalNAc glycosylation and O-GlcNAc modification. In O-GalNAc glycans, often referred to as mucin-type O-glycans, the first sugar residue in the chain is an α-linked N-acetylgalactosamine (Figure 3.15). Unlike other monosaccharides, which appear quite ubiquitously in nature, α-linked N-acetylgalactosamine appears almost exclusively in the sugar chains of O-glycosylated mammalian proteins. Plants and fungi do not appear to have this modification, and favour alternate forms of O-glycosylation.

Figure 3.15 - Depiction of an N- and O-glycosylated glycan. Each O-glycosylation carbohydrate chain begins with an α-N-acetylgalactosamine linked to a serine/threonine residue. (Adapted with permission, www.neb.com/applications/glycosylation)

Heavily O-glycosylated (O-GalNAc) mucins are found on the surface of epithelial cells, which line the entire intestinal tract. This makes the gut a very O-GalNAc-rich environment for commensal bacteria capable of O-glycan degradation. Given the abundance in the gut of O-GalNAc glycans containing α-linked N-acetylgalactosamine, it seems likely that the BpGH31 and BcGH31 enzymes function as part of a set of hydrolytic enzymes capable of O-glycan chain degradation. A cursory analysis of the genes surrounding BcGH31 within the sequenced genome of Bacteroides caccae points to BcGH31 acting along with a set of other carbohydrate-degrading enzymes for the utilization of carbohydrates bound to glycosylated proteins within the gut (Figure 3.16).
The neighboring putative GH88 protein may play a role in degrading glycosaminoglycans (GAGs) present on the intestinal mucosal lining by cleaving glucuronic acid (GlcU) units. The presence of SusD is indicative of this bacterial species utilizing large carbohydrate chains for nutrition187-189, and SusD has been found to specifically aid the organism in binding glycan chains released from mucins.190 The presence of a SusC/RagA protein, which acts as part of a transporter complex that imports large degradation products of proteins (e.g. RagA) or carbohydrates (e.g. SusC), is also indicative of large carbohydrate chain utilization. This would be further corroborated by the presence of β-galactosidases and α-fucosidases, which would be required for the removal of other residues present in O-GalNAc chains. In fact, there appear to be three nearby β-galactosidases (two putative GH2 genes and one putative GH35 gene) and one α-fucosidase (GH29) further upstream of the GH31 gene. While these other genes, as well as a nearby carbohydrate-binding module (CBM 32), are fairly diverse in the carbohydrates they target for hydrolysis, their close proximity could indicate an overall function of carbohydrate utilization from mucins by BpGH31 and BcGH31 and their neighbouring genes. Analysis of this gene set in the CAZy Polysaccharide Utilization Loci (PUL) Database (PULDB) indicates similarity to other predicted PULs from Bacteroides caccae ATCC 43185, Bacteroides sp. 2_2_4, Bacteroides sp. D20, and Bacteroides uniformis ATCC 8492, which contain a combination of GH31, GH29, and GH2 enzymes. There was no similarity to known mucin-degrading PULs, but more work will be required to conclusively determine if BpGH31 and BcGH31 act on the O-GalNAc residue of O-GalNAc glycans (mucins).

Figure 3.16 - Genes surrounding BcGH31(P19) in the Bacteroides caccae reference genome.
3.9 Characterization of BvGH109 (P5)

3.9.1 Mechanism

Following purification, the truncated form of BvGH109, tBvGH109, was further characterized. Based on its conserved GH109 hydrolytic domain, tBvGH109 was presumed to follow the same hydrolytic mechanism as EmGH109. This hydrolysis mechanism is unusual in that it involves an NAD+ cofactor, as determined by Liu and colleagues in 2007.104 None of the assays performed with purified tBvGH109 or EmGH109 required the addition of NAD+ to function, and the addition of NAD+ to the reactions did not alter or enhance the reactions. Additionally, both enzymes were active without the addition of any metals (e.g. Mg2+ or Mn2+). Further experiments will be needed to verify the mechanism of BvGH109 in relation to other characterized GH109 enzymes. Any difference in the mechanism of BvGH109 would indicate a potential new family of blood antigen-cleaving enzymes, either distinct from GH109 or forming a new sub-family.

3.9.2 Kinetics

In order to determine how well tBvGH109 cleaves blood antigen A compared to the previously characterized EmGH109, Michaelis-Menten kinetic parameters were obtained for the two enzymes using the blood type A mimicking substrate MU-Type2Atetra. As in the previous assays using this tetrasaccharide material, the hydrolysis by each GH109 enzyme was coupled to hydrolysis of the remaining trisaccharide and MU release by the enzymes AfcA, BgaA, and SpHex (see section 3.6.2.1). Sufficient quantities of the coupling enzymes were used to ensure that either tBvGH109 or EmGH109 was the rate-limiting enzyme. Using this coupled enzyme assay, the KM and kcat were determined for each enzyme at a pH of 7.5 (Table 3.4). At the pH tested, tBvGH109 was approximately 11 times less efficient than the previously characterized EmGH109 on the MU-Type2Atetra substrate.
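The efficiency comparison above can be reproduced approximately from the rounded parameters reported in Table 3.4. The following is a minimal sketch, not part of the original analysis; the function name is illustrative, and the only inputs are the published KM and kcat values.

```python
# Catalytic efficiency (kcat/KM) recomputed from the rounded Table 3.4 values.
# Function and variable names are illustrative, not taken from the thesis.

def catalytic_efficiency(kcat_per_min: float, km_mM: float) -> float:
    """Return kcat/KM in M^-1 min^-1 given kcat (min^-1) and KM (mM)."""
    km_M = km_mM / 1000.0  # convert mM to M
    return kcat_per_min / km_M

bv = catalytic_efficiency(kcat_per_min=3.4, km_mM=1.5)   # BvGH109
em = catalytic_efficiency(kcat_per_min=10.0, km_mM=0.4)  # EmGH109 (NagA)
fold = em / bv  # how many times more efficient EmGH109 is

print(f"BvGH109: {bv:.2e} M^-1 min^-1")
print(f"EmGH109: {em:.2e} M^-1 min^-1")
print(f"EmGH109 / BvGH109: {fold:.1f}-fold")
```

With the rounded inputs, the ratio comes out close to 11. The EmGH109 efficiency computed this way is slightly lower than the tabulated 2.7 x 10^4 M^-1 min^-1, presumably because the table was derived from unrounded fitted parameters.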
While comparatively less efficient than EmGH109 under the conditions tested, tBvGH109 may still be quite useful in the attempt to obtain better blood type A-cleaving enzymes, as it expressed very well in E. coli in its truncated form. Additionally, it provides a second enzyme capable of cleaving the type A antigen as a possible template for directed evolution.

Table 3.4 - Kinetic comparison of BvGH109 and EmGH109 on the MU-Type2Atetra substrate.
Enzyme            KM (mM)      kcat (min⁻¹)    kcat/KM (M⁻¹ min⁻¹)
BvGH109           1.5 ± 0.3    3.4 ± 0.3       2.3 x 10³
EmGH109 (NagA)    0.4 ± 0.1    10 ± 0.8        2.7 x 10⁴

3.10 Conclusion and Future Work

In conclusion, two metagenomic libraries were created from two human gut (fecal) samples and successfully screened for α-N-acetylgalactosaminidase activity. Three bacterial α-N-acetylgalactosaminidases were identified in the screen. Two novel GH31 enzymes, BpGH31 and BcGH31, were isolated and characterized. Testing of these enzymes on a diverse set of substrates showed a departure from the current activities found in the GH31 enzyme family. These were the first GH31 enzymes to exhibit α-N-acetylgalactosaminidase activity (EC 3.2.1.49), forming a new GH31 sub-family. These enzymes are hypothesized to act on O-GalNAc glycans (mucins) displayed by epithelial cells within the intestinal tract, and further experiments will be required to determine if this is the case. The function of the C-terminal regions of these enzymes is also as yet unknown, and further work will be needed to assess their roles. The third α-N-acetylgalactosaminidase identified in the screen, BvGH109, exhibited hydrolytic activity against the blood type A antigen substrate, MU-Type2Atetra. A full version of the enzyme was produced, including a missing C-terminal segment. A truncated version of this enzyme (N-terminal signal peptide deletion), tBvGH109, was found to express very well in E. coli.
The catalytic efficiency (kcat/KM) of this enzyme against the blood type A antigen substrate (MU-Type2Atetra) was compared with that of the previously characterized GH109 enzyme EmGH109. Under the conditions tested, tBvGH109 was approximately 11 times less efficient than EmGH109. More experiments will need to be conducted in order to test the pH optimum of this enzyme, determine its efficiency relative to EmGH109 under different conditions, determine its activity against blood type A antigens with varying linkages (e.g. Type 1A, Type 3A, Type 4A), and determine its effectiveness on actual samples of blood (Type A). Despite the work still needed to better understand this enzyme, this is the first bacterial enzyme with exo-N-acetylgalactosaminidase activity on blood type A antigen to be discovered since 2007 (ref. 104), and offers a new tool for the production of universal donor blood.

Chapter 4: Directed Evolution of the Sialyltransferase PmST1

4.1 Introduction

4.1.1 α-2,3/2,6 Sialyltransferase from Pasteurella multocida (PmST1)

As discussed previously in Chapter 2, STs can be used to help synthesize complex glycoconjugates such as sialylated oligosaccharides126, ganglioside mimics127, and sialyl-Lewis antigens127, which are necessary for studying the roles of glycosylation in diseases. Bacterial STs are more readily applied than eukaryotic STs for these syntheses in vitro, owing to their more facile heterologous expression in E. coli and lack of post-translational modifications. Enzyme engineering through directed evolution allows for the improvement and optimization of enzymes for a desired purpose such as synthesis. Bacterial STs from family GT80 are strong candidates for enzyme engineering as a result of their strong expression in E.
coli, rapid reaction rates, and broad substrate specificities.23,191 One GT80 enzyme which has garnered interest as an engineering target is PmST1, a multifunctional α-2,3/2,6 sialyltransferase from Pasteurella multocida (Figure 4.1). PmST1 belongs to the GT-B structural superfamily, with its structure consisting of two separate Rossmann domains with the active site located in the deep cleft between the two domains.192 Most STs, including PmST1, are metal-ion-independent inverting enzymes, with catalysis believed to follow a single displacement mechanism involving nucleophilic attack of the OH group of the acceptor on the anomeric center of the donor sugar.23,193 PmST1 uses this inverting mechanism to transfer Neu5Ac from CMP-Neu5Ac (β-linked) to galactoside acceptors (α-linked product). A catalytic residue serves as a general base to deprotonate the reactive oxygen of the acceptor. Inverting reactions occur with the formation of an oxocarbenium-ion transition state taking place in parallel with the departure of the nucleotide leaving group.194 In PmST1, the catalytic base is an aspartate residue (D141), with a conserved histidine residue (H311) helping to stabilize the phosphate leaving group.192 PmST1 exhibits not only α-2,3/2,6-sialyltransferase activity, but also α-2,3/2,6-sialidase and α-2,3/2,6-trans-sialidase activities on the sialogalactosides it forms. Along with multifunctionality, PmST1 also demonstrates a broad substrate specificity.

Figure 4.1 - Structure of PmST1 enzyme from Pasteurella multocida bound with CMP (PDB entry: 2C84).
Previous work has shown that PmST1 is capable of utilizing the LewisX epitope (Lex) as an acceptor to synthesize a broad set of sialyl-LewisX (SLex) products with multiple natural sialic acid forms such as N-acetylneuraminic acid (Neu5Ac), N-glycolylneuraminic acid (Neu5Gc), 2-keto-3-deoxy-D-glycero-D-galacto-nonulosonic acid (Kdn), as well as 9-O-acetylated Neu5Ac and Neu5Gc.132 In addition, PmST1 was also able to produce SLex containing non-natural sialic acid forms, including those with an N-azidoacetyl group or an azido group at the C-5 or C-9 positions. Despite this seemingly powerful synthetic capability, syntheses utilizing PmST1 must be tightly monitored over time so that the reaction is halted when the maximum yield is reached, before the product is rapidly degraded. Larger amounts of valuable starting material (CMP-Neu5Ac) may also be required for these reactions, as PmST1 will hydrolyze this donor substrate to a significant degree as the reaction proceeds (donor hydrolysis). PmST1, along with other GT80 enzymes, has been used extensively for making natural and unnatural sialylated products126,128,132,195-206, with efforts to engineer improved mutants focused on decreasing product degradation.132,201,202 While improvements have been demonstrated through decreasing sialidase or donor hydrolysis activity, engineering was limited in part by the unavailability of high-throughput screening approaches for ST activity. Only small, site-directed mutagenesis libraries could be assessed with the available low-throughput assays, severely limiting the accessible protein sequence space for this enzyme.

4.2 Rationale and Research Goals

PmST1 has strong utility in synthesizing sialylated glycan structures, as shown by its promiscuous substrate specificity and rapid reaction rates.
If product degradation, the major drawback of this enzyme, could be reduced or removed completely, PmST1 would become a robust and valuable enzyme for sialylating glycans. Specifically, reactions with this enzyme would require less monitoring (reduced product degradation), use less of the valuable starting material (reduced donor hydrolysis), and provide higher product yields. The goal of this research is to decrease product degradation by PmST1 without adversely affecting its rapid reaction rates. Overall, this will improve the synthetic capability of PmST1. Improved mutants of PmST1 can be developed by directed evolution (DE) using a targeted high-throughput FACS approach. This will be accomplished by using the FACS method previously discussed in Chapter 2 to screen large (>10⁷) mutagenic libraries of PmST1 for variants with improved synthetic competency. This differs from previous efforts to engineer an improved PmST1, which used small targeted libraries (<10⁵) limited to mutations near the active site of the enzyme and low-throughput assays to determine activity of the variants.132,202 By using a targeted FACS-based approach, very large (>10⁷) mutagenic libraries of PmST1 can be rapidly assessed for beneficial mutations.

4.3 Mutagenesis and Library Construction

To create a library of PmST1 mutants, error-prone PCR was first used. The innate error rate of Taq polymerase is enhanced through the addition of Mn2+ ions, which help stabilize mismatched DNA base-pairs. Varying the amount of Mn2+ in the reaction results in varying frequencies of mutation within the gene. Various concentrations of Mn2+ were used (50 µM, 100 µM, and 250 µM) to obtain libraries of the PmST1 gene varying in mutation frequency. The resulting libraries, referred to as low, medium, and high, contained varying mutation frequencies (2.2, 2.7, and 6.4 mutations/kb) and were all carried forward for screening.
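The per-kilobase mutation frequencies above can be converted into an expected number of mutations per gene, which is the quantity usually quoted when a directed evolution target such as 1-8 mutations per gene is discussed. The sketch below is illustrative only: the gene length of 1200 bp is an assumed round number (the amplicon length is not stated here), and the Poisson estimate of unmutated clones assumes mutations occur independently and uniformly along the gene.

```python
import math

# Assumed gene length for illustration only; not stated in the text.
GENE_LENGTH_BP = 1200

def mutations_per_gene(mutations_per_kb: float,
                       gene_length_bp: int = GENE_LENGTH_BP) -> float:
    """Expected number of mutations per gene copy."""
    return mutations_per_kb * gene_length_bp / 1000.0

def fraction_unmutated(mean_mutations: float) -> float:
    """Poisson probability that a clone carries zero mutations
    (assumes independent, uniformly distributed mutations)."""
    return math.exp(-mean_mutations)

# Mutation frequencies (mutations/kb) reported for the three libraries.
libraries = {"low": 2.2, "medium": 2.7, "high": 6.4}

for name, per_kb in libraries.items():
    mean = mutations_per_gene(per_kb)
    print(f"{name}: {mean:.2f} mutations/gene, "
          f"{fraction_unmutated(mean):.1%} of clones expected unmutated")
```

Under the assumed gene length, all three libraries fall inside the 1-8 mutations per gene window, with the high library near its upper end.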
Generally, the desired mutation rate for the first round of a directed evolution experiment is 1-8 mutations per gene. All gene libraries were cloned into pUC18 for expression in the screening strain JM107∆nanA:pACYC18(siaB). This was the same strain used previously for assaying sialyltransferase activity (discussed in Chapter 2), having an inactivated sialic acid aldolase (nanA) and the additional vector pACYC18 for expression of the sialic acid synthetase SiaB. Prior to screening, the titers of the resulting three libraries (low, medium, and high) were determined to be 1.2 x 10⁷, 2.1 x 10⁷, and 1.8 x 10⁷ clones, respectively.

4.4 FACS Sorting

For the first round of FACS selection, the libraries were grown overnight in growth/induction media containing IPTG to induce expression of both the PmST1 mutants and the sialic acid synthetase (SiaB) required to produce the ST donor substrate (CMP-Neu5Ac). Following incubation with sialic acid (Neu5Ac) and the reporter substrates, the cells were washed extensively, diluted, and submitted to FACS. The mutant libraries were subjected to three rounds of FACS sorting (Figure 4.2), in which the optimized amplification method discussed in Chapter 2 (re-growth and induction on plate following each round of FACS) was used. Additionally, a two-substrate selection method (β-lactose-C2-BODIPY and β-lactose-MU) was applied in order to avoid the potential enrichment of BODIPY- or MU-binding specificities in the PmST1 variants, a problem in previous studies.110 Selection of the top mutants was based on retention of fluorescence for both the MU (355/460 nm) and BODIPY (488/530 nm) substrates, with the top 1% of cells isolated each round. Following two rounds of FACS, all three libraries were pooled for the final round of FACS. A visible increase in retention of both the sialylated β-lactose-C2-BODIPY and β-lactose-MU substrates was evident from the FACS histogram data for all three libraries (Figure 4.2).
This can be seen as an upward, rightward shift of the population as the amount of BODIPY (upward) and MU (rightward) substrates in the cells increases with improved enzyme activity.

Figure 4.2 - FACS sorting data summary. Three mutagenic libraries of PmST1 were submitted to FACS sorting. Cells were assessed for both green (488/530 nm) and blue (355/460 nm) fluorescence to select the top 1% of cells (highlighted in the red box) retaining the α-2,3-Neu5Ac-Lactose-C2-BODIPY and α-2,3-Neu5Ac-Lactose-MU products, respectively. Selected cells were regrown, assayed, and submitted to the following round of FACS sorting. All libraries were pooled prior to the third round of sorting.

4.5 Enriched Mutant Analysis

Following FACS, lysate from the resulting pool of mutants was tested for ST activity on both assay substrates (Figure 4.3A). ST activity was present for both substrates, demonstrating that ST activity had not been lost through FACS enrichment. A representative sample of 94 colonies was randomly picked, grown, and stored at -80°C for further study. 32 colonies from the resulting library were then randomly picked and assayed for fluorescence retention via ST activity with the β-lactose-C2-BODIPY substrate. Each colony was grown/induced overnight, then incubated with sialic acid (Neu5Ac) and β-lactose-C2-BODIPY using the same assay parameters used for the PmST1 mutant libraries prior to FACS sorting. Following extensive washing, each sample was visually assayed for fluorescence retention in microcentrifuge tubes (Figure 4.3B). In order to minimize variations in fluorescence values resulting from variations in cell densities (OD600), each sample was diluted to approximately equal cell density (OD600) prior to transferring and reading the fluorescence (485/530 nm) on plate (Figure 4.3C). The top variants were then determined from these values.

Figure 4.3 - Post-FACS results.
(A) Lysate from the pool of selected cells following three rounds of FACS sorting was tested in vitro for sialyltransferase activity on the β-lactose-C2-BODIPY (green fluorescence) and β-lactose-MU (blue fluorescence) acceptor substrates to verify activity was not lost through FACS selection. (B) Randomly picked colonies (n = 32) from the resulting pool of variants were visually assayed for sialyltransferase activity in vivo. (C) The same colonies were then transferred to plate (96-well) and quantitatively assayed for fluorescence (485/530 nm).

DNA minipreps from the ten clones with the highest fluorescence retention were prepared and sequenced. The mutations present in the top ten clones are summarized in Table 4.1. DNA from these hits was used as the template for the following round of mutagenesis and screening. Any highly fluorescent clones containing no mutations (wild-type) were excluded from use in the mutagenesis template (e.g. Clone #16).

Table 4.1 - Summary of PmST1 mutations following screening.

4.6 Mutagenesis (Round 2)

Error-prone PCR was performed on a mixture of DNA from the ten top hits from the first round of screening. Similar to the previous mutagenesis, two PmST1 libraries with low and high rates of mutation (0.7 mutations/kb and 2.3 mutations/kb) were created. For this round of mutagenesis, the error-prone Mutazyme II polymerase was used to produce both libraries. By pooling the DNA of all top variants, mutagenesis would allow for the addition of new mutations while retaining all of the potentially beneficial mutations enriched in the first round.
Table 4.1 (contents) - Clones (of 1-32): 1, 6, 10, 11, 18, 22, 27, 30, 31, 32. Mutations: K196N, Q293L, E316K, I404F, T47M, E250D, L395I, N154K, I263V, F337S, K390T, K390E, L121S, F359S, I70F, S113A, E206D, N41D, L120V, S113T, L172F, I328N, L395S.

4.7 FACS Sorting (Round 2)

The two libraries were subjected to enrichment of the top mutants by FACS selection (3 rounds). Following the first two rounds, the libraries were pooled for the third and final round of FACS selection. As shown in Figure 4.4, an increase in the overall fluorescence retention by the libraries through three rounds of FACS selection was detectable (seen as a rightward, upward shift in populations). The top 1% of cells were sorted through each round to select for the mutants with highest ST activity. As before, 94 random colonies from the resulting library were picked as a representative sample and stored at -80°C for future analysis.

Figure 4.4 - FACS sorting data summary (Round 2). Two mutagenic libraries of PmST1 were submitted to three rounds of FACS sorting. Cells were assessed for both green (488/530 nm) and blue (355/460 nm) fluorescence to select the top 1% of cells retaining the α-2,3-Neu5Ac-Lactose-C2-BODIPY and α-2,3-Neu5Ac-Lactose-MU products, respectively. Selected cells (sorting gates not shown) were regrown, assayed, and submitted to the following round of FACS sorting. Both libraries were pooled prior to the third round of sorting.
Improvements in activity were assessed as increases in fluorescent product formation and encapsulation (visible in histograms as upward, rightward shifts in population).

4.8 Enriched Mutant Analysis (Round 2)

Following FACS, 24 random colonies were picked, grown, and assayed for fluorescence retention in vivo via ST activity with the β-lactose-C2-BODIPY substrate. Each colony was grown/induced overnight, then incubated with sialic acid (Neu5Ac) and β-lactose-C2-BODIPY using the same assay parameters used following the first round of directed evolution. Following extensive washing, samples were again diluted to equivalent cell densities (OD600), transferred to plate, and the fluorescence (485/530 nm) of each clone quantified (Figure 4.5). DNA minipreps from all 24 samples were then isolated and sequenced. A summary of all mutations present within the samples following the second round of directed evolution is shown in Table 4.2.

Figure 4.5 - Fluorescence retention following FACS (Round 2). Randomly picked colonies were grown, induced, then incubated with Neu5Ac (5 mM) and β-lactose-C2-BODIPY (500 µM). Following extensive washing with PBS, the fluorescence (485/530 nm) of each sample was determined by plate reader. The red-dashed line denotes the fluorescence value of wild-type PmST1. [Plot axes: Colony # (x) versus Fluorescence 485/530 nm (ABU) (y).]

Table 4.2 - Summary of PmST1 mutations present in top fluorescing cells (Round 2). Mutations present in ≥2 of the resulting samples are denoted in red.

The most prominent mutation, E250D, was present in most (12/13) of the top fluorescing cells. Additionally, the mutations N41D, L121S, M225I, E316K, L395I, and I404F were present in multiple (≥2/13) members of the top fluorescing samples. All of these mutations were present in the mutants selected through the first round of screening, with the exception of M225I, which only appeared during the second round.
Analysis of the mutations enriched in the second round of directed evolution indicates that most of the top variants contained multiple mutations from the first round which had not initially been present together. These variants, which appear to have amassed multiple mutations enriched in the first round, likely developed as a result of aberrant PCR. Chimeras such as these variants are commonly formed as a result of performing multi-template PCR.207,208 Despite this, these accumulations appear to have been beneficial, as the chimeras were enriched preferentially over their parent sequences. From this same analysis it appears unlikely that the L395I or I404F mutations have any significant effect on PmST1. These mutations were only present in variants also containing E250D or E316K, which both demonstrate improvements in the absence of L395I or I404F. These two pairs of mutations were already partnered within the template DNA used to make the second mutagenic libraries and were likely carried forward together. Interestingly, the mutation M225I appeared spontaneously during the second round of selection. In all of the top variants, this mutation is present along with other mutations from the first round of screening. It is difficult, then, to infer whether this mutation is separately beneficial to PmST1 activity or whether it requires the presence of other mutations to elicit any positive effect. Within the structure of PmST1, none of the potentially beneficial mutations, except for N41D, are located in direct proximity to the active site (Figure 4.6).

Table 4.2 (contents)
Top Cells         PmST1 Mutation Summary
1                 N41D, M225I, E250D, E316K, I404F
3, 11, 13, 15     M225I, E250D, L395I
7                 S113A, L121S, E250D, E316K, I404F
10                L121S, E250D, E316K, I404F
14                N41D, L121S, K196N, M225I, E250D, L395I
16, 17, 22, 23    E250D, E316K
24                N75S, E205D, Q258P

Mutation    Abundance
N41D        2/13
N75S        1/13
S113A       1/13
L121S       3/13
K196N       1/13
E205D       1/13
M225I       5/13
E250D       12/13
Q258P       1/13
E316K       6/13
L395I       5/13
I404F       3/13
The mutation E316K is in close proximity to the catalytically important residue H311, which is believed to aid in stabilizing the phosphate group of CMP in the donor sugar. Additionally, the mutation L121S is in close proximity to D141, which acts as the general base in the sialyltransferase reaction. All other mutations, E250D in particular, are not in obvious proximity to any catalytically important residues and may be beneficial to PmST1 with regard to expression or thermostability/folding, or may confer only minor activity improvements. Further characterization of the top hits would be required to better understand any effects these mutations had on the activity of PmST1.

Figure 4.6 - Structure of PmST1 with location of mutations

4.9 Sub-Cloning, Expression, and Purification

In order to characterize the PmST1 variants identified through FACS selection in relation to the wild-type enzyme, the variants needed to be expressed and purified for kinetic analysis. PmST1 variants representative of the accumulated mutations (or varying combinations thereof) were sub-cloned into pET29a for expression in BL21(DE3) and purification on Ni2+ resin. The purified enzymes were carried forward to kinetic characterization of their sialyltransferase and sialidase activities relative to the wild-type enzyme. The overall synthetic competency would be quantified as the ratio of sialyltransfer activity (kcat/KM) to sialidase activity (kcat/KM). In order to determine this competency, it was necessary to be able to quantitate all of the sialyltransferase-related reactions (sialyltransfer, donor hydrolysis, sialidase). Different methods were examined to achieve this, as discussed further in the next section.

4.10 Characterization

4.10.1 Kinetics (Coupled Assay)

Traditionally, sialyltransferase rates have been determined using one of two methods.
The first method is a stopped assay in which the amount of product formed is detected by an analytical instrument (HPLC or CE) following a set incubation time. This method requires optimization to ensure that the initial rates are being measured within the chosen assay parameters. The second, generally preferred, method is a continuous coupled assay. Briefly, the release of CMP from the CMP-Neu5Ac donor sugar is coupled to the oxidation of NADH (λ = 340 nm, ε = 6.22 mM⁻¹ cm⁻¹) through the action of nucleoside monophosphate kinase, pyruvate kinase (PK), and lactate dehydrogenase.107 This method had been regularly used within our lab and the required coupled assay components were readily available. Unfortunately, this method is most suitable for Leloir GTs and STs with little to no donor hydrolysis. Due to the high rate of donor hydrolysis by PmST1, this assay could not consistently give reliable results and another method was required.

4.10.2 Development of HPAE-PAD Method

One of the major disadvantages of the existing stopped assays for sialyltransferase kinetics was the required use of acceptor sugars with chromophores (e.g. fluorescent derivatives) in order to improve the sensitivity of detection. While functional, this does not provide accurate kinetic data for the enzyme acting on natural non-derivatized substrates. Additionally, this approach requires a separation step (i.e. chromatography) coupled to fluorescence detection, which was not easily accessible to this researcher. In order to detect α-2,3/2,6-Neu5Ac-lactose production (sialyltransfer) or Neu5Ac release (sialidase) with high sensitivity, a stopped assay method using high-performance anion-exchange chromatography with pulsed-amperometric detection (HPAE-PAD) was developed. Sialyltransfer, donor hydrolysis, and sialidase reaction rates could all be determined using this method, with the reliable detection of products as low as 1 µM in concentration.
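For the continuous coupled assay described in section 4.10.1, the conversion from an absorbance trace to a reaction rate follows the Beer-Lambert law with the NADH extinction coefficient at 340 nm (6.22 mM⁻¹ cm⁻¹). The sketch below is a minimal illustration under stated assumptions: the function names, the 1 cm path length default, and the example slope are hypothetical, and the number of NADH molecules oxidized per CMP released depends on the coupling system, so it is left as a required argument rather than assumed.

```python
# Beer-Lambert conversion for an NADH-coupled assay read at 340 nm.
EPSILON_NADH_MM_CM = 6.22  # mM^-1 cm^-1 at 340 nm

def nadh_rate_uM_per_min(delta_a340_per_min: float,
                         path_length_cm: float = 1.0) -> float:
    """Convert the slope of A340 versus time into NADH consumed (µM/min).

    The sign of the slope is ignored because NADH oxidation lowers A340.
    """
    rate_mM = abs(delta_a340_per_min) / (EPSILON_NADH_MM_CM * path_length_cm)
    return rate_mM * 1000.0  # mM -> µM

def cmp_release_rate_uM_per_min(delta_a340_per_min: float,
                                nadh_per_cmp: float,
                                path_length_cm: float = 1.0) -> float:
    """Rate of CMP release, given the NADH:CMP stoichiometry of the
    coupling system (supplied by the user; not assumed here)."""
    return nadh_rate_uM_per_min(delta_a340_per_min, path_length_cm) / nadh_per_cmp

example_slope = -0.0622  # hypothetical change in A340 per minute
print(f"NADH oxidized: {nadh_rate_uM_per_min(example_slope):.1f} µM/min")
```

Because the assay reports on CMP release, any donor hydrolysis contributes to the measured slope alongside productive transfer, which is consistent with the limitation noted for PmST1.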
The HPAE-PAD system also allowed for the quantitation of all reactions without chromophores, providing a simplified approach over methods requiring more complex materials and detection systems. Specifically, the Dionex HPAE-PAD system was used for this work. Briefly, following sialyltransfer or sialidase reactions, aliquots of the reaction mixture were transferred to sample vials, diluted, flash frozen, then analyzed by HPAE-PAD on the Dionex instrument. Separation was optimized to ensure sufficient resolution between the assay components Neu5Ac, lactose, and Neu5Ac-lactose (Figure 4.7). Peak areas for each component were converted to substrate concentration for rate determination. For sialyltransfer rates, the increases in Neu5Ac-lactose peak areas over time were used for calculations. For sialidase rates and donor hydrolysis rates, the changes in Neu5Ac peak areas (converted to substrate concentration) over time were used.

Figure 4.7 - High-performance anion-exchange chromatography with pulsed amperometric detection (HPAE-PAD) for sialyltransfer and sialidase rate determination. A representative reaction is shown in which various time points of the reaction are overlaid and changes in [Neu5Ac] and [Neu5Ac-Lactose] were detectable.

4.10.3 Sialyltransferase Kinetics

Michaelis-Menten kinetic parameters were determined for the top PmST1 mutants that expressed well enough to be purified in sufficient quantities. The top mutants PmST1_1(N41D, M225I, E250D, E316K, I404F), PmST1_7(S113A, L121S, E250D, E316K, I404F), and PmST1_10(L121S, E250D, E316K, I404F) were characterized, along with the negative control enzymes wild-type PmST1 and PmST1_6(I70F, L121S, F359S), which showed activity similar to wild-type. The kinetic data are summarized in Table 4.3.
The best mutants PmST1_7 and PmST1_10, as indicated by the fluorescence retention assay, showed increases of 1.7- and 2.6-fold in their catalytic efficiencies (kcat/KM) for α-2,3-sialyltransferase activity over the wild-type enzyme. The mutant PmST1_1, conversely, had a decreased catalytic efficiency relative to the wild-type enzyme.

Table 4.3 - Summary of sialyltransferase kinetic data for PmST1 mutants.
PmST1 clone    KM (mM)      kcat (s⁻¹)    kcat/KM (s⁻¹ mM⁻¹)
Acceptor: Lactose-MU
wt (lit.)      1.4 ± 0.2    47 ± 4        34 ± 8
Acceptor: β-Lactose
wt             1.2 ± 0.2    170 ± 15      144 ± 48
#1             3.4 ± 0.5    210 ± 19      62 ± 20
#6             7.7 ± 1.5    220 ± 20      29 ± 15
#7             0.9 ± 0.2    220 ± 17      250 ± 63
#10            0.4 ± 0.3    150 ± 9       380 ± 33

4.10.4 Sialidase Kinetics

Sialidase kinetics were also determined for the same PmST1 mutants. It should be noted that the addition of CMP dramatically increases the sialidase rate of PmST1, so CMP (300 µM) was included in every sialidase reaction in order to minimize any differences which may have arisen as a result of varying trace amounts of CMP in the assay. At this concentration (over 2 orders of magnitude above the KM of PmST1 for CMP), any additional effect of trace CMP on sialidase rates was assumed to be negligible. The effect of CMP on sialidase rates will be discussed in more detail in Chapter 5. The kinetic data for the sialidase activity of the mutants are summarized in Table 4.4. All four of the assayed mutants showed decreased α-2,3-sialidase catalytic efficiencies (kcat/KM) relative to the wild-type enzyme. For a more accurate determination of enzyme improvement, though, the change in the ratio of sialyltransfer activity (kcat/KM) to sialidase activity (kcat/KM) was determined for each mutant. This ratio was termed synthetic competency.

Table 4.4 - Summary of sialidase kinetics for PmST1 mutants.
PmST1        KM (mM)    kcat (s⁻¹)    kcat/KM (s⁻¹ mM⁻¹)
Substrate: α-2,3-Neu5Ac-Lactose-MU (pH 5.0, no added CMP)
wt (lit.)    24         230           9.5
Substrate: α-2,3-Neu5Ac-Lactose (pH 7.5, CMP added)
wt           1.3        56.2          43.3
1            2.4        29.4          12.2
6            2.9        11.9          4.1
7            2.2        52.3          23.8
10           2.0        54.2          27.1

4.10.5 Synthetic Competency

Following determination of both sialyltransferase and sialidase rates for the PmST1 mutants found through FACS screening, the synthetic competency of each enzyme was quantitatively assessed as the ratio of sialyltransferase to sialidase catalytic efficiencies (kcat/KM). These overall values, as well as the values in relation to the wild-type enzyme, are summarized in Table 4.5. All four mutants showed increased synthetic competency, including PmST1_06, which had shown only activity similar to wild-type in the in vivo plate assay. This might explain the increased in vivo formation of sialyltransferase product detected in the FACS screening. The two top mutants, PmST1_07 and PmST1_10, showed the highest improvement in synthetic competency values, with increases of 3.1-fold and 4.2-fold over the wild-type, respectively. To determine if this measured improvement in sialyltransferase efficiency over sialidase efficiency translated to an appreciable improvement in sialyltransferase product yield, the wild-type enzyme was tested in vitro against the mutant PmST1_07. The formation of sialyltransferase product (α-2,3-Neu5Ac-lactose) by both wild-type PmST1 and PmST1_07 was monitored by HPAE-PAD until product formation reached a maximum and then for some time afterwards. Reactions were carried out at 37°C in HEPES buffer (5 mM, pH 7.5) with 1 mM of CMP-Neu5Ac, 50 mM of lactose, and 50 ng of each enzyme. There was no appreciable increase in the maximum product yield for the PmST1_07 mutant over the wild-type enzyme (Figure 4.8).
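The synthetic competency ratios discussed in this section can be recomputed from the catalytic efficiencies reported in Tables 4.3 and 4.4. The sketch below is illustrative; the dictionary keys and function name are not from the thesis, and only the rounded published kcat/KM values are used as inputs.

```python
# kcat/KM values (s^-1 mM^-1) as reported in Tables 4.3 and 4.4
# (β-lactose acceptor; α-2,3-Neu5Ac-lactose substrate with CMP added).
sialyltransfer = {"wt": 144.0, "1": 62.0, "6": 29.0, "7": 250.0, "10": 380.0}
sialidase = {"wt": 43.3, "1": 12.2, "6": 4.1, "7": 23.8, "10": 27.1}

def synthetic_competency(st_efficiency: float, sialidase_efficiency: float) -> float:
    """Ratio of sialyltransfer kcat/KM to sialidase kcat/KM."""
    return st_efficiency / sialidase_efficiency

sc = {clone: synthetic_competency(sialyltransfer[clone], sialidase[clone])
      for clone in sialyltransfer}
relative = {clone: value / sc["wt"] for clone, value in sc.items()}

for clone in sc:
    print(f"PmST1 {clone}: SC = {sc[clone]:.1f}, "
          f"{relative[clone]:.2f}-fold relative to wt")
```

The recomputed values agree with Table 4.5 to within rounding; small differences (for example for clone 7) presumably reflect the use of unrounded kinetic parameters in the original calculation.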
Disappointingly, the quantitative increases in the synthetic competencies of the top PmST1 mutants appear to have been insufficient to translate into substantial improvements in the synthetic yield of the enzymes. The synthetic competency of a previously engineered PmST1 mutant (N144D)132 showing reduced sialidase activity and increased product yields following site-directed mutagenesis was estimated for comparison. That mutant exhibited a synthetic competency ratio of 1118 (313-fold improved over the wild-type enzyme). This value could be used as a baseline for what would define a successful improvement in synthetic competency for my library. Unfortunately, even the top mutants PmST1_07 and PmST1_10 did not come close to achieving this level of success.

Figure 4.8 - Comparison of sialyltransferase product yields. [Bar chart: maximum product yield, as [α-2,3-Neu5Ac-lactose] (µM), for PmST1 wt and PmST1_07.] The formation of sialyltransferase product (α-2,3-Neu5Ac-lactose) by PmST1 wild-type and the mutant PmST1_07 was monitored by HPAE-PAD over an extended incubation time until a maximum product yield had been reached. Each enzyme (50 ng) was incubated in HEPES buffer (5 mM, pH 7.5) with CMP-Neu5Ac (1 mM) and lactose (50 mM) at 37°C. At various time points, aliquots were removed from each reaction, flash-frozen, and submitted for analysis by HPAE-PAD to determine the product concentration. The highest concentration of product detected during the reaction is presented for each enzyme.

Table 4.5 - Synthetic competency of PmST1 mutants.
Synthetic Competency (SC) = Sialyltransfer (kcat/KM) / Sialidase (kcat/KM) Ratio

PmST1   Mutations                                  SC      Relative to wt
wt                                                 3.3     1.00
1       (N41D, M225I, E250D, E316K, I404F)         5.0     1.51
6       (I70F, L121S, F359S)                       7.0     2.10
7       (S114A, L121S, E250D, E316K, I404F)        10.3    3.10
10      (L121S, E250D, L395I, I404F)               14.0    4.21

Considering these results, some aspects of the screen design likely contributed to the minimal improvements seen over two rounds of mutagenesis and FACS sorting. While allowing for high-throughput assessment of large mutant libraries, the FACS approach may have had limitations that restricted the level of enzyme improvement ultimately achieved. The assay incubation timeframes (30-45 minutes at 37°C) used throughout the screen may not have been conducive to obtaining the best PmST1 variants, although this is difficult to know for reactions occurring in vivo. Varying expression levels between mutants may have affected the amount of product formed within the assay timeframe. Also, maximum in vivo product formation might have been affected by variations in sialic acid and lactose uptake, despite the concentrations of substrates added externally. The inability to control some of these parameters more tightly with the FACS method may have hampered engineering efforts.

4.11 Conclusions

In conclusion, large mutant libraries (>10⁷) of the ST gene PmST1 were successfully created and subjected to two rounds of mutagenesis and FACS sorting to enrich improved variants. While recurring mutations (N41D, L121S, M225I, E250D, E316K, L395I, I404F) were evident in the resulting pool of variants, substantial improvements in synthetic efficiency (higher yields and decreased product degradation) were not obtained. Although the mutations were originally to be assessed individually following characterization of the top hits, the limited enzyme improvements rendered this task futile.
Although the desired improvements were not obtained, valuable insights into the possible sialidase and trans-sialidase mechanism of PmST1 were gained. The large effect of free CMP on sialidase rates indicated a reversible sialylation mechanism previously discounted in the literature. These effects prompted a substantial evaluation of the sialidase and trans-sialidase mechanism of PmST1 and other GT80 enzymes. This work is discussed in Chapter 5.

Chapter 5: Mechanisms of the Sialidase and Trans-Sialidase Activities of Bacterial Glycosyltransferases from Family GT80

5.1 Introduction

STs from family GT80 have proved to be highly suitable for synthesis and enzyme engineering as a result of their good expression, rapid reaction rates, and broad substrate specificities. Work with STs in this family has led to an expanded availability of diverse sialylated compounds.199,200,205,209 Sialyltransferases from family GT80 have regularly been classified as multifunctional, owing to an array of detected activities: not only the expected α-2,3/2,6-sialyltransfer from CMP-Neu5Ac to galactoside acceptors, but also α-2,3/2,6-sialidase and α-2,3/2,6-trans-sialidase reactions of the formed sialogalactosides.

There is some debate about the mechanisms of the sialidase and trans-sialidase activities exhibited by these enzymes. In particular, the mechanism of the trans-sialidase activity has been proposed to be similar to that of the well-studied trans-sialidase from Trypanosoma cruzi (TcTS), a member of Glycoside Hydrolase (GH) family 33.203 In the two-step double displacement mechanism of TcTS (Figure 5.1), a sialic acid is first displaced from a sialoglycoconjugate with formation of a covalent sialyl-enzyme intermediate.
Following this, the sialic acid is transferred to an alternate acceptor sugar (β-galactoside), or to water if no suitable acceptor is present.210,211 Thus, the only difference between a sialidase and a trans-sialidase is the nature of the acceptor (Figure 5.2A). STs from GT80 share no sequence homology with TcTS (or any GH33 sialidase or trans-sialidase). In addition, there is no structural evidence to suggest that STs from GT80 contain the active site components, such as a nucleophilic tyrosine, that are necessary for a TcTS-like trans-sialidase mechanism.142,192,212,213 Further, structural studies make it clear that no separate trans-sialidase site exists. Since the trans-sialylation occurs with retention of configuration, an intermediate of some sort is most likely formed with an exogenous nucleophile, and the most probable candidate for that nucleophile is CMP, for both the sialidase and trans-sialidase activities. Even trace amounts of CMP could effect a reverse sialyl transfer reaction, reforming CMP-Neu5Ac from free CMP and a sialoglycoconjugate. The reformed donor (CMP-Neu5Ac) is then available for transfer to an alternate acceptor (trans-sialidase) or can be hydrolysed (known as donor hydrolysis) to release free Neu5Ac into solution (Figure 5.2B). Each step would occur with net inversion of configuration, as shown for donor hydrolysis,214 and thus the overall reaction would occur with net retention. For at least one ST from GT80 it has been found that the addition of CMP enhances sialidase and trans-sialidase rates,203 lending support to this hypothesis, though it was claimed that CMP was not essential.

Figure 5.1 - Mechanism of the trans-sialidase (TcTS) from Trypanosoma cruzi.

Figure 5.2 - General reaction schemes for sialidase/trans-sialidase activities of sialyltransferases. The enzymatic cleavage of sialic acids from sialylated compounds can occur by (A) direct hydrolysis or (B) reversible sialylation followed by donor hydrolysis.
Reaction schemes for coupled enzyme assays used to indirectly determine the formation of CMP-Neu5Ac in a sialidase reaction by the α-2,3-sialyltransferase PmST1 (C) or the α-2,6-sialyltransferases Psp2,6ST/Pd2,6ST (D). Formation of the alternate products (shown in bold) requires the presence of CMP-Neu5Ac.

5.2 Rationale and Research Goals

Reverse glycosyltransferase activity is a simple consequence of equilibrium activity, and has been exploited with numerous GTs, including STs, in alternate modes of glycoside formation.215-219 This simple explanation for the trans-sialidase activities of STs from GT80 has largely been ignored for two main reasons: (1) sialidase and trans-sialidase activities are present in the absence of added CMP and (2) CMP-Sia has not been detected in sialidase and trans-sialidase reactions. The goals of this research are to (1) demonstrate the formation of CMP-Neu5Ac in sialidase and trans-sialidase reactions by GT80 enzymes and (2) demonstrate CMP contamination as the cause of incorrect CMP-independent sialidase and trans-sialidase mechanism proposals. This will be accomplished using three representative STs from GT80: PmST1 (α-2,3-sialyltransferase from Pasteurella multocida), Pd2,6ST (α-2,6-sialyltransferase from Photobacterium damselae) and Psp2,6ST (α-2,6-sialyltransferase from Photobacterium sp. JT-ISH-224). Phosphatase treatment will be used to actively remove CMP contamination, and the effects of CMP on sialidase/trans-sialidase activities will be investigated.

5.3 Effect of CMP on the Sialidase Rate of PmST1

It has been found previously that CMP enhances the rate at which at least one ST from GT80 cleaves sialic acid from sialoglycoconjugates.203 In order to further examine the role of CMP in the sialidase activity of STs from GT80, the rates of hydrolysis of Neu5Ac from α-2,3-Neu5Ac-Gal-β-pNP by the α-2,3-sialyltransferase from Pasteurella multocida, PmST1,205 were measured using a coupled assay with excess β-galactosidase from E.
coli (Figure 5.3). Upon removal of Neu5Ac, the Gal-β-pNP is rapidly degraded by β-galactosidase to release pNP, which is monitored over time by UV-Vis at 405 nm. The sialidase rates were calculated from the rates of pNP release. In the absence of added CMP under our reaction conditions, only a very low rate of Neu5Ac release was measured (Table 5.1). Complete removal of any contaminating CMP by treatment of the enzyme stock with phosphatase resulted in no measurable sialidase activity, as discussed in more detail later. Addition of even small amounts of CMP (10 µM, 100 µM, or 1000 µM) under the same reaction conditions increased the sialidase rate 80-fold or more (Table 5.1). The effect appears to be specific to CMP, since the activity could not be stimulated in a similar fashion upon addition of phosphate, cytidine, phosphate plus cytidine, AMP or pyrophosphate (Figure 5.4).

Table 5.1 - The effect of CMP on the sialidase rates of PmST1 acting on α-2,3-Neu5Ac-Gal-pNP.

[CMP] (added)    Sialidase rate (µmol Neu5Ac released min⁻¹ mg protein⁻¹)    Relative sialidase rate
0 µM             6.0 × 10⁻⁶                                                  1
10 µM            4.8 × 10⁻⁴                                                  80
100 µM           5.3 × 10⁻⁴                                                  88
1000 µM          5.6 × 10⁻⁴                                                  93

Figure 5.3 - Coupled assay for the detection of sialidase activity.

Figure 5.4 - The activation of PmST1 sialidase activity is specific to CMP. The effect of other potential activators (Cytidine, Phosphate, Cytidine + Phosphate, Pyrophosphate, and AMP) was determined for the sialidase reaction of PmST1 (350 nM) reacted with α-2,3-Neu5Ac-Gal-pNP (100 µM) and β-galactosidase (30 units). Each reaction was continuously measured at OD405 nm. Unlike CMP, the alternate activators had no stimulating effect on sialidase activity.
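The relative rates in Table 5.1 and the degree of saturation by CMP can be checked with a short calculation. The sketch below is illustrative only: it assumes a simple hyperbolic (Michaelis-Menten-type) dependence of the sialidase rate on CMP concentration, takes the KM of roughly 0.82 µM estimated for CMP in this chapter as an input, and uses function and variable names that are assumptions rather than anything from the original analysis.

```python
# Sialidase rates from Table 5.1 (umol Neu5Ac released min^-1 mg protein^-1),
# keyed by the concentration of added CMP in micromolar.
RATES_BY_CMP_UM = {0: 6.0e-6, 10: 4.8e-4, 100: 5.3e-4, 1000: 5.6e-4}

# KM of PmST1 for CMP in micromolar (the ~820 nM estimate from this chapter).
KM_CMP_UM = 0.82


def relative_rates(rates, baseline_conc=0):
    """Express each rate as a multiple of the rate with no added CMP."""
    baseline = rates[baseline_conc]
    return {conc: rate / baseline for conc, rate in rates.items()}


def fractional_saturation(conc_um, km_um):
    """v/Vmax under an assumed hyperbolic dependence on activator concentration."""
    return conc_um / (km_um + conc_um)


if __name__ == "__main__":
    for conc, fold in relative_rates(RATES_BY_CMP_UM).items():
        sat = fractional_saturation(conc, KM_CMP_UM)
        print(f"{conc:>5} uM CMP: {fold:5.1f}-fold, predicted v/Vmax = {sat:.3f}")
    # The 300 uM CMP used in the Chapter 4 sialidase assays, for comparison.
    print(f"300 uM CMP: predicted v/Vmax = {fractional_saturation(300, KM_CMP_UM):.4f}")
```

Under this assumption the enzyme is already more than 90% saturated at 10 µM CMP, which is consistent with the small additional increase in rate observed between 10 µM and 1000 µM.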
Attempts to determine the KM value of CMP for this reaction proved challenging as the KM value was clearly very low, and CMP concentrations much below 1 µM could not be tested without approaching the concentration of enzyme needed in the assay (350 nM), thereby violating Michaelis-Menten conditions. Using the same coupled reaction with a fixed concentration of α-2,3-Neu5Ac-Gal-pNP (100 µM), the KM of PmST1 for CMP was estimated at 820 ± 100 nM (Figure 5.5). This extremely tight binding of CMP to the active site of PmST1 and its unique activating effect on PmST1 sialidase activity are consistent with an active role of CMP in the sialidase mechanism.

Figure 5.5 - Initial rates of Neu5Ac release from α-2,3-Neu5Ac-Gal-pNP by PmST1. PmST1 (350 nM) was reacted with α-2,3-Neu5Ac-Gal-pNP (100 µM), β-galactosidase (30 units) and varying concentrations of CMP. Kinetic data were plotted with Grafit for determination of KM. The KM of PmST1 for CMP was estimated to be 820 ± 95 nM.

5.4 Determination of Kinetic Competence for Reverse Sialylation by PmST1

A necessary condition for the sialidase mechanism proposed in this study (Figure 5.2B) to be feasible is that the product hydrolysis rate (determined as the rate of Neu5Ac release from α-2,3-sialyllactose) must be equal to or lower than the CMP-Neu5Ac hydrolysis rate. To confirm that this condition is met, the rates of Neu5Ac release from CMP-Neu5Ac (donor hydrolysis) and α-2,3-sialyllactose (product hydrolysis) were determined under near maximum rate (Vmax) conditions (50 mM HEPES, pH 7.5) for PmST1 using HPAEC-PAD. For donor hydrolysis, a concentration of CMP-Neu5Ac well above the KM of the enzyme was chosen to ensure a near maximum rate of donor hydrolysis (5 mM CMP-Neu5Ac). Similarly, high concentrations of α-2,3-sialyllactose (20 mM) and CMP (1 mM) were used in the product hydrolysis reaction to ensure near maximum rates for a relevant comparison.
The rates of Neu5Ac release from CMP-Neu5Ac (donor hydrolysis) and α-2,3-sialyllactose (product hydrolysis) were found to be 5.5 × 10⁻⁴ µmol min⁻¹ mg⁻¹ and 4.5 × 10⁻⁴ µmol min⁻¹ mg⁻¹, respectively, showing that the proposed reversible sialylation mechanism is indeed a feasible mechanism for sialoglycoconjugate hydrolysis.

5.5 Detection of CMP-Neu5Ac Formation by Coupled Enzyme Assay

The simplest validation of the proposed mechanism would be the direct detection of CMP-Neu5Ac during a sialidase reaction. Unsurprisingly, given its very low KM, CMP-Neu5Ac could not be directly detected in sialidase reactions with PmST1 (data not shown). As an alternative, a coupled enzyme assay was set up to indirectly detect the formation of CMP-Neu5Ac in solution. Three representative bacterial ST enzymes from family GT80 were selected for this test: PmST1 (α-2,3-sialyltransferase from Pasteurella multocida), Pd2,6ST (α-2,6-sialyltransferase from Photobacterium damselae) and Psp2,6ST (α-2,6-sialyltransferase from Photobacterium sp. JT-ISH-224). All of these enzymes have been found to have sialidase activity towards their respective sialyltransfer products.203,205,220 Generally, the coupled reaction involved performing a sialidase reaction with either an α-2,3- or α-2,6-specific ST and its corresponding sialyl lactoside, and introducing into the reaction an alternate acceptor and a second ST that is capable of forming a unique product with the alternate acceptor if CMP-Neu5Ac is present. Specifically, the coupled assays entailed reacting the α-2,3-sialyltransferase PmST1 and Pd2,6ST/Psp2,6ST with the sialyl donors α-2,3-Neu5Ac-Lactose or α-2,6-Neu5Ac-Lactose, respectively. This was done both in the absence and presence of CMP.
In addition, an alternate acceptor, the fluorescently labelled Lactose-C2-BODIPY (Lac*), was included along with excess amounts of a second sialyltransferase that would utilise any formed CMP-Neu5Ac to produce an alternate transfer product with Lac*. This formed product would be distinct from any product formed by the initial ST enzyme under the same reaction conditions and would thereby provide proof that CMP-Neu5Ac had been formed in situ. The general reaction schemes for both coupled assays can be found in Figure 5.2C and Figure 5.2D, respectively. In order to detect the formation of alternate sialyltransfer products (α-2,3/2,6-Neu5Ac-Lac*), reactions were monitored by TLC under UV (365 nm) using elution conditions with which the alternative sialyltransfer product could be easily separated and detected. Standards of α-2,3-Neu5Ac-Lac* and α-2,6-Neu5Ac-Lac* were included on each TLC plate to ensure accurate product identification. In the first coupled assay to test CMP-Neu5Ac formation by PmST1, the addition of CMP and a second ST (Psp2,6ST in excess) should result in the formation of α-2,6-Neu5Ac-Lac* if CMP-Neu5Ac is formed in the reaction by PmST1 (Figure 5.6).

Figure 5.6 - TLC data for the coupled enzyme assay of PmST1 (coupled assay results). Aliquots of reactions containing PmST1, α-2,3-Neu5Ac-Lactose, Lactose-BODIPY (Lac*), Psp2,6ST, and the absence or presence of CMP (added) were taken at various time points for spotting on TLC. TLC was viewed under UV (365 nm) to detect products containing Lac*. Under these conditions, no α-2,6-Neu5Ac-Lac* trans-sialidase product was formed by Psp2,6ST alone (left lanes). The formation of α-2,6-Neu5Ac-Lac* trans-sialidase product (right lanes) only occurs in the presence of both enzymes (PmST1 and Psp2,6ST) and CMP. The formation of the alternate trans-sialidase product by Psp2,6ST, α-2,6-Neu5Ac-Lac*, only could have occurred if CMP-Neu5Ac was formed in the sialidase reaction of PmST1 + 2,3-Sia-Lac + CMP.
Once available, CMP-Neu5Ac can be transferred by Psp2,6ST to Lac* to form α-2,6-Neu5Ac-Lac* as seen in the data.

In the presence of a sialyl donor (α-2,3-Neu5Ac-Lactose) and CMP, the formation of α-2,6-Neu5Ac-Lac* clearly occurs over the course of the reaction (Figure 5.6), and could be detected by UV (365 nm) as early as 30 mins into the reaction following separation on TLC plates. These results are summarized in Figure 5.7A.

Figure 5.7 - Detection of CMP-Neu5Ac formation by bacterial sialyltransferases (reversible sialylation). Using a coupled enzyme assay, the formation of CMP-Neu5Ac could be indirectly determined through the formation of an alternate sialyltransfer product in sialidase reactions by (A) the α-2,3-sialyltransferase PmST1, (B) the α-2,6-sialyltransferase Pd2,6ST, and (C) the α-2,6-sialyltransferase Psp2,6ST. In case A (PmST1 as primary enzyme), α-2,6-Neu5Ac-Lactose-C2-BODIPY (shown as α-2,6-Neu5Ac-Lac*) is formed by a second enzyme (Psp2,6ST) in the reaction. In cases B and C (Pd2,6ST and Psp2,6ST as primary enzymes), α-2,3-Neu5Ac-Lactose-BODIPY (shown as α-2,3-Neu5Ac-Lac*) is formed by a second enzyme (the α-2,3-sialyltransferase CstII) in the reactions. Representative spot densities (AlphaImager) under UV illumination for the indicated products were chosen following TLC analysis of the reactions.

Control reactions in which either CMP or one enzyme was omitted resulted in no detectable α-2,6-Neu5Ac-Lac* formation. As expected for PmST1, a trans-sialylation product, α-2,3-Neu5Ac-Lac*, was also formed in the coupled reaction, but no α-2,6-Neu5Ac-Lac* was formed independently by PmST1 under these reaction conditions (Supplemental Figure S5). In the next set of coupled assays, both Pd2,6ST and Psp2,6ST (separately) were reacted with α-2,6-Neu5Ac-Lactose.
In the presence of the alternate acceptor (Lac*), CMP, and a second sialyltransferase CstII (α-2,3/2,8-sialyltransferase from Campylobacter jejuni), CMP-Neu5Ac formation by Pd2,6ST and Psp2,6ST could be determined indirectly through the formation of α-2,3-Neu5Ac-Lac* by CstII (Figure 5.8 and Figure 5.9). The data for these experiments are summarized in Figure 5.7B and Figure 5.7C.

Figure 5.8 - TLC data for the coupled enzyme assay of Pd2,6ST (all results). Aliquots of reactions containing Pd2,6ST, α-2,6-Neu5Ac-Lactose, Lactose-C2-BODIPY (Lac*), and the absence or presence of CMP (added) were taken at various time points for spotting on TLC (left). Under these conditions, the expected trans-sialidase product of Pd2,6ST, α-2,6-Neu5Ac-Lac*, is formed only when CMP was added to the reaction. None of the alternate trans-sialidase product, α-2,3-Neu5Ac-Lac*, is formed by Pd2,6ST alone (left lanes). Aliquots of reactions containing Pd2,6ST, α-2,6-Neu5Ac-Lactose, Lactose-C2-BODIPY (Lac*), CstII, and the absence or presence of CMP (added) were taken at various time points for spotting on TLC (right lanes). TLC was viewed under UV (365 nm) to detect products containing Lac*. Under these conditions, the formation of α-2,3-Neu5Ac-Lac* trans-sialidase product only occurs in the presence of both enzymes (Pd2,6ST and CstII) and CMP. The formation of the alternate trans-sialidase product by CstII, α-2,3-Neu5Ac-Lac*, only could have occurred if CMP-Neu5Ac was formed in the sialidase reaction of Pd2,6ST + 2,6-Sia-Lac + CMP. Once available, CMP-Neu5Ac is transferred by CstII to Lac* to form α-2,3-Neu5Ac-Lac*.

Figure 5.9 - TLC data for the coupled enzyme assay of Psp2,6ST (coupled assay results). Aliquots of reactions containing Psp2,6ST, α-2,6-Neu5Ac-Lactose, Lactose-C2-BODIPY (Lac*), CstII, and the absence or presence of CMP (added) were taken at various time points for spotting on TLC. TLC was viewed under UV (365 nm) to detect products containing Lac*.
Under these conditions, the formation of α-2,3-Neu5Ac-Lac* trans-sialidase product only occurs in the presence of both enzymes (Psp2,6ST and CstII) and CMP. The formation of the alternate trans-sialidase product by CstII, α-2,3-Neu5Ac-Lac*, only could have occurred if CMP-Neu5Ac was formed in the sialidase reaction of Psp2,6ST + 2,6-Sia-Lac + CMP. Once available, CMP-Neu5Ac is transferred by CstII to Lac* to form α-2,3-Neu5Ac-Lac*.

Figure 5.10 - TLC data for the coupled enzyme assay of Psp2,6ST (negative controls). Aliquots of reactions containing Psp2,6ST, α-2,6-Neu5Ac-Lactose, Lactose-C2-BODIPY (Lac*), and the absence or presence of CMP (added) were taken at various time points for spotting on TLC. TLC was viewed under UV (365 nm) to detect products containing Lac*. Under these conditions, the expected trans-sialidase product of Psp2,6ST, α-2,6-Neu5Ac-Lac*, is formed only when CMP was added to the reaction. No alternate trans-sialidase product, α-2,3-Neu5Ac-Lac*, is formed by Psp2,6ST alone.

Again, no detectable α-2,3-Neu5Ac-Lac* product was formed independently by Pd2,6ST/Psp2,6ST or CstII, with or without the addition of CMP (Figure 5.8 and Figure 5.10). As expected for Pd2,6ST and Psp2,6ST, each enzyme formed some α-2,6-Neu5Ac-Lac* trans-sialylation product. It is clear from these results that CMP-Neu5Ac is formed by PmST1, Pd2,6ST and Psp2,6ST as part of the mechanism by which Neu5Ac is cleaved from their respective ST products. The formed CMP-Neu5Ac is then either rapidly transferred back to the initial acceptor (sialyltransfer), transferred to an alternate acceptor such as Lac* (trans-sialidase), or broken down by hydrolysis to release free Neu5Ac into solution (sialidase).
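The logic of the coupled assay and its controls can be summarized as a small Boolean model. This is only a sketch of the reasoning under the reversible sialylation hypothesis, not a kinetic simulation; the class and function names are hypothetical and were chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """Components present in one coupled-assay reaction (hypothetical model)."""
    primary_st: bool        # ST under test, e.g. PmST1 with alpha-2,3-Neu5Ac-Lactose
    secondary_st: bool      # reporter ST with a different linkage specificity
    cmp: bool               # CMP available (added, or carried over as contamination)
    sialoside: bool = True  # sialyl-lactose substrate present
    acceptor: bool = True   # labelled acceptor (Lac*) present


def cmp_neu5ac_formed(r: Reaction) -> bool:
    """Reverse sialylation needs the primary ST, its sialoside, and CMP."""
    return r.primary_st and r.sialoside and r.cmp


def alternate_product_expected(r: Reaction) -> bool:
    """Reporter-linkage product on Lac* requires CMP-Neu5Ac and the second ST."""
    return cmp_neu5ac_formed(r) and r.secondary_st and r.acceptor


def native_trans_product_expected(r: Reaction) -> bool:
    """Primary-linkage product on Lac* requires only CMP-Neu5Ac and Lac*."""
    return cmp_neu5ac_formed(r) and r.acceptor


if __name__ == "__main__":
    cases = {
        "both STs + CMP": Reaction(True, True, True),
        "both STs, no CMP": Reaction(True, True, False),
        "primary ST only + CMP": Reaction(True, False, True),
        "secondary ST only + CMP": Reaction(False, True, True),
    }
    for label, reaction in cases.items():
        print(f"{label:<24} alternate: {alternate_product_expected(reaction)!s:<5} "
              f"native: {native_trans_product_expected(reaction)}")
```

In this model the alternate product appears only when both enzymes and CMP are present, which is the pattern reported for all three primary enzymes in the TLC data above.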
5.6 Demonstration of CMP Contamination in Trans-Sialidase Reactions

Following some of the coupled enzyme assays, small amounts of trans-sialidase products produced by the initial enzyme (PmST1, Pd2,6ST, or Psp2,6ST) could be detected in the absence of added CMP, as expected based on the previously determined trans-sialidase activities of these enzymes. When using α-2,3-Neu5Ac-Lactose as a sialyl donor and Lactose-C2-BODIPY (Lac*) as an alternate acceptor, PmST1 formed detectable amounts of α-2,3-Neu5Ac-Lac* in both the coupled assay reaction and control reactions containing PmST1 alone. Pd2,6ST and Psp2,6ST, utilizing α-2,6-Neu5Ac-Lactose as a sialyl donor and Lac* as an alternate acceptor, formed α-2,6-Neu5Ac-Lac*. In order to determine whether this was due to contaminating CMP (i.e. CMP that may have been carried through enzyme purification), each of the three ST enzymes was pretreated with alkaline phosphatase (bovine) for 30 mins at 37°C and the trans-sialidase reaction repeated. The trans-sialidase products α-2,3-Neu5Ac-Lac* (PmST1) and α-2,6-Neu5Ac-Lac* (Pd2,6ST/Psp2,6ST) could not be detected in phosphatase-treated reactions (Figure 5.11, Figure 5.12, and Figure 5.13). These results are summarized in Figure 5.14. Addition of CMP to the phosphatase-treated samples yielded trans-sialidase products in the case of all three enzymes, indicating that phosphatase treatment had no detrimental effect on enzyme activity (data not shown). In the case of PmST1 trans-sialidase reactions, phosphatase treatment of the sialyl donor (α-2,3-Neu5Ac-Lactose) stock (obtained from a commercial source) was also required to remove all trans-sialidase activity.

Figure 5.11 - TLC data for the effect of CMP removal on sialidase/trans-sialidase activity (PmST1).
Aliquots of reactions containing untreated or treated (pre-treated with phosphatase for 30 mins at 37°C) PmST1, α-2,3-Neu5Ac-Lactose, Lactose-C2-BODIPY (Lac*), and in the absence or presence of untreated/treated Psp2,6ST were taken at various time points for spotting on TLC. TLC was viewed under UV (365 nm) to detect products containing Lac*. Under these conditions, the trans-sialidase products α-2,3-Neu5Ac-Lac* and α-2,6-Neu5Ac-Lac* formed only when the enzyme stocks were not treated with phosphatase. Phosphatase pre-treatment of the enzyme stocks resulted in no detectable trans-sialidase products. This indicates that CMP must have been present in the reaction mixtures already (contamination).

Figure 5.12 - TLC data for the effect of CMP removal on sialidase/trans-sialidase activity (Pd2,6ST). Aliquots of reactions containing untreated or treated (pre-treated with phosphatase for 30 mins at 37°C) Pd2,6ST, α-2,6-Neu5Ac-Lactose, Lactose-C2-BODIPY (Lac*), and in the absence or presence of untreated/treated CstII were taken at various time points for spotting on TLC. TLC was viewed under UV (365 nm) to detect products containing Lac*. Under these conditions, the trans-sialidase products α-2,3-Neu5Ac-Lac* and α-2,6-Neu5Ac-Lac* formed only when the enzyme stocks were not treated with phosphatase. Phosphatase pre-treatment of the enzyme stocks resulted in no detectable trans-sialidase products. This indicates that CMP must have been present in the reaction mixtures already (contamination).

Figure 5.13 - TLC data for the effect of CMP removal on sialidase/trans-sialidase activity (Psp2,6ST). Aliquots of reactions containing untreated or treated (pre-treated with phosphatase for 30 mins at 37°C) Psp2,6ST, α-2,6-Neu5Ac-Lactose, Lactose-C2-BODIPY (Lac*), and in the absence or presence of untreated/treated CstII were taken at various time points for spotting on TLC. TLC was viewed under UV (365 nm) to detect products containing Lac*.
Under these conditions, the trans-sialidase products α-2,3-Neu5Ac-Lac* and α-2,6-Neu5Ac-Lac* formed only when the enzyme stocks were not treated with phosphatase. Phosphatase pre-treatment of the enzyme stocks resulted in no detectable trans-sialidase products. This indicates that CMP must have been present in the reaction mixtures already (contamination).

Figure 5.14 - Summary of the effect of CMP removal on trans-sialidase activity. Trans-sialidase reactions of PmST1, Pd2,6ST, and Psp2,6ST were performed and then analyzed by TLC under UV. Spot densities of the indicated trans-sialidase products following reaction (120 mins) are shown for reactions with (+) and without (-) phosphatase pre-treatment (30 mins at 37°C) of the enzymes. Detectable trans-sialidase activity was completely removed for all sialyltransferases following pre-treatment of the enzyme mixture with phosphatase to remove contaminating CMP.

5.7 Discussion

The most logical mechanism for GT80-catalyzed sialoglycoconjugate hydrolysis and trans-sialylation has always been one in which CMP present in the reaction mixture permits reverse sialylation to generate a small amount of CMP-Neu5Ac. This in turn can be hydrolysed or can serve as a donor for transglycosylation reactions. However, the observation of transglycosylation in the absence of added CMP,203 and the inability to detect CMP-Neu5Ac in enzymatic reaction mixtures containing CMP and a sialoglycoconjugate, had led to doubts about this pathway. The demonstration, with three different GT80 sialyltransferases, that incubation of the enzyme with a phosphatase prior to reaction ablates both sialidase and trans-sialidase activity reopens this question. The further demonstration that addition of CMP to reaction mixtures containing a sialoglycoconjugate stimulates sialidase activity in a saturable, concentration-dependent manner, with a KM value (for PmST1) of 820 nM, lends further support to the notion of reverse sialylation.
The CMP-Neu5Ac “intermediate” meets the criterion of kinetic competence, since it was shown that the rate constants for hydrolysis of CMP-Neu5Ac itself and of sialyl-Gal-pNP are the same within experimental error. Interestingly, similar kcat values for PmST1-catalyzed donor hydrolysis and sialidase activity of 27 ± 1 s⁻¹ and 23 s⁻¹, respectively, were also reported previously,202 although the reaction conditions were different (pH 8.5 vs. pH 5.5, respectively). The authors also noted parallel drastic reductions in the sialidase and donor hydrolysis rates upon mutation. These results align well with the findings in this study. The very tight binding of CMP is presumably responsible for the “carry over” of CMP into reaction mixtures with the enzyme, where only catalytic amounts are needed. This tight binding also explains why the CMP-Neu5Ac could not be detected directly, since only small amounts would be formed and would be tightly associated with the enzyme. However, CMP-Neu5Ac formation could be detected indirectly, by use of pairs of linkage-specific STs and a single sialoglycoconjugate (e.g. 2,6-), in conjunction with a BODIPY-labelled acceptor and added CMP. Only in the presence of both STs could the alternative transglycosylation product (e.g. 2,3-) be detected, clearly showing that CMP-Neu5Ac must have been formed in the reaction mixtures.

5.8 Conclusion

The presumed absence of CMP (i.e. no CMP added to the reaction) in sialidase and trans-sialidase reactions has led to the mistaken assumption of CMP-independent sialidase and trans-sialidase mechanisms for STs from GT80. The results of this research unequivocally support a reversible sialylation mechanism for STs from GT80 and present important considerations for researchers conducting syntheses with these enzymes. Addition of a phosphatase during enzymatic sialylation reactions to remove product CMP and thereby minimise inhibition is fairly common (but not universal) practice.
These results suggest that substantial quantities of phosphatase should in fact be employed to degrade all CMP as it is formed; in this way the troublesome side reactions of product hydrolysis and transglycosylation can be largely avoided. Alternatively, syntheses relying on trans-sialidase activity would be enhanced by the addition of CMP.

Chapter 6: Conclusion

The work presented in this thesis aimed to test hypotheses regarding the application of high-throughput technologies to carbohydrate-modifying enzyme discovery and engineering. My first hypothesis was that GT enzymes, STs in particular, could be isolated from metagenomic libraries using the ultra-high-throughput technology of FACS. Through proof-of-principle work I was able to show that ST genes could be enriched from mixed genomic libraries using FACS. I was unable, however, to show that ST genes could be isolated from actual metagenomic libraries. Further analysis made it clear that the enrichment strategy had limitations, specifically regarding the starting proportion of cells containing ST genes required for enrichment to occur. The gene class in question, STs, was too rare within actual metagenomic samples for sufficient enrichment to occur by FACS. For the method to be viable for functional metagenomic screening, either a more abundant class of GT genes would have to be targeted or some type of gene enrichment would need to occur prior to FACS enrichment. Additionally, the expression of GTs is generally poor in E. coli due to the common presence of membrane-associated regions which impede soluble expression. This means that many full GT gene sequences found in metagenomic libraries may not be functionally expressed without removal of membrane-associated regions or signal peptide sequences. While this means that any enzyme isolated by screening in E.
coli is already suited for expression in bacteria and downstream applications, the screen is limited in its overall sensitivity for detecting all GTs with a targeted activity. The screen developed in this work fails to provide a functional screening alternative for GT gene discovery at this time, especially in comparison to sequence-based discovery strategies (in silico) in which all putative gene sequences within a metagenome could be expressed and tested.

Despite the limitations of the work within the larger field of CAZyme gene discovery, some technical findings offer applications of this research to future work using FACS enrichment strategies for either gene discovery or enzyme engineering. Numerous methodologies regarding optimization of the enrichment rate through multiple rounds of FACS were tested, providing an optimized growth and induction strategy for cells between each round of FACS sorting. Growing sorted libraries on solid media in order to minimize growth bias within the library population, as well as inducing gene expression on the same plate to minimize transfer loss, provided superior enrichment rates over alternate strategies.

I was able to apply the insights from this work to the testing of my third hypothesis, in which I posited that the sialidase activity of a multifunctional ST could be substantially reduced or removed completely through directed evolution using the developed ultra-high-throughput FACS strategy. Using FACS, I successfully enriched improved mutants of the PmST1 enzyme. While I proved my hypothesis in that an improved mutant was developed using FACS, the amount of improvement as measured by synthetic competency of the enzyme was insubstantial when compared to another improved mutant of the same enzyme (PmST1). Given this, it is unlikely that this enzyme would have any use within the broader field of sialylated glycan synthesis.
In addition to the FACS work for directed evolution, I also developed a new technique using HPAE-PAD to simultaneously determine the sialyltransferase, sialidase, and donor hydrolysis rates of a ST in order to characterize my mutants. Due to the challenging nature of the enzyme having three activities, an analogous technique using a single instrument was required for accurate comparisons. This approach also allows for the determination of Michaelis-Menten kinetic parameters of GTs without any derivatized substrates. This technique could be applied to the kinetic characterization of many other GT enzymes, especially when parameters for multiple reactions need to be determined in a consistent fashion. Additionally, while the directed evolution of PmST1 did not yield impactful improvements, new research questions arose as a result of the work. An observation that sialidase rates of PmST1 were highly enhanced by the presence of CMP led to valuable insights into the possible sialidase and trans-sialidase mechanism of the enzyme and others within its GT family. The large effect of free CMP on sialidase rates indicated a reversible sialylation mechanism that had been previously discounted in the literature. The presumed absence of CMP (i.e. no CMP added to the reaction) in sialidase and trans-sialidase reactions had led to the mistaken assumption of CMP-independent sialidase and trans-sialidase mechanisms for STs from GT family 80. Through a coupled-enzyme assay which proved the formation of CMP-SA during sialidase and trans-sialidase reactions, I showed that ST enzymes from this family perform reversible sialylation followed by rapid donor hydrolysis (sialidase) or sialyltransfer (trans-sialidase). The results of this research unequivocally support a reversible sialylation mechanism for STs from GT80 and present important considerations for researchers conducting syntheses with these enzymes. 
My results suggest that substantial quantities of phosphatase should in fact be employed to degrade all CMP as it is formed in order to avoid side reactions of product hydrolysis and unwanted transglycosylation. Alternatively, syntheses relying on trans-sialidase activity would be enhanced by the addition of CMP to the reaction mixture.

Finally, my second hypothesis was that high-throughput plate screening could be used for the discovery of blood antigen-cleaving enzymes within the human gut microbiome. I developed metagenomic libraries from the gut microbiomes of individuals of blood type A and O. Within the metagenomic library of the blood type A individual, I successfully isolated a functional GH109 enzyme from Bacteroides vulgatus capable of cleaving the terminal α-GalNAc residue of the blood type A antigen. This represents a contribution to the field, as this is the first functional bacterial enzyme characterized since 2007 that is capable of this activity at neutral pH. Additionally, two enzymes from the family GH31 were isolated and characterized. These were found to exhibit a new activity for that family, thereby forming a new sub-family within GH31. This also constituted a measurable contribution to the field of carbohydrate enzymology, as these novel enzymes offer a glimpse into possible sugar foraging within the human gut environment by bacteria. While more work is required to determine the exact function of these newly-discovered enzymes, the possibility that these enzymes cleave the core α-GalNAc residue of O-GalNAcylated proteins (mucins) in the gut is exciting. Ultimately, it is clear from the presented work that the coupling of specialized carbohydrate substrates with high-throughput screening approaches is a successful strategy for both the discovery and engineering of carbohydrate-modifying enzymes. 
Further improvement and development of these methods have the potential to expand our toolbox of useful GH and GT enzymes, providing both researchers and industry with enzymes for the study, synthesis, and breakdown of complex carbohydrate-containing molecules.

Chapter 7: Experimental

7.1 Common Methods

7.1.1 DNA Sequencing
All DNA sequencing was performed using the Sanger sequencing method (Genewiz). Analysis of sequencing data was performed in the BioEdit software.

7.1.2 Protein Structure Analysis
All protein structure analysis was performed using the PyMOL software package. All figures including protein structures were prepared using this software.

7.1.3 Primer List
CstI_EcoRI_F: GCC GAA TTC ATG AAA AAA GTT ATT ATT GCT GGA
CstI_BamHI_R: GCC GGA TCC TTA TTT TCC TTT GAA ATA ATG CTT
CstI(Primetime_qPCR)_F: ATT ATT GCT GGA AAT GGA CCA AGT TTA AAA G
CstI(Primetime_qPCR)_R: TTT TCC ATA AGC CTC ACT AGA AGG TAT GAG T
CstI_NdeI_F: GAG CAT ATG TCC GAT ATA GTG TGA GCG
CstI_XhoI_R: GCC CTC GAG TTT TCC TTT GAA ATA ATG C
CstI_XhoI_R_stop: GCC CTC GAG TTA TTT TCC TTT GAA ATA ATG C
CstI_F_qPCR (1st attempt): ATT ACC TCA GGG GTC TAT ATG TGT G
CstI_R_qPCR (1st attempt): TAT GTC CGA TAT AGT GTG AGC GAT C
CstI_Campy81176_F: ATT GCT GGA AAT GGA CCA AGT TT
CstI_Campy81176_R: ATG TCC GAT ATA GTG TGA GCG AT
E2_NdeI_F: GAG CAT ATG AAG CTA AAG TAT TAT TC
E2_XhoI_R: GAG CTC GAG ATC TGA TTT AGT AGT AGC CGT CAC
E12orf_NdeI_F: GAG CAT ATG TCA TTT GTT GCA GAA AAA GTG GTG ATG G
E12orf_XhoI_R: GAG CTC GAG GAT CGT TTC TCC CGG CGA TTC TTC
P19orf_NdeI_F: GAG CAT ATG ATA AAA AGC ATG AAA AAA AGG
P19orf_XhoI_R: GAG CTC GAG GGG ATC GGA CTT GGT CGC AGC ACC
P5_NdeI_F_FUll GENE: GAG CAT ATG AGA ACC TTT AAA TCA TTG
P5_NdeI_F_27aaNtermDel: GAG CAT ATG CAA ACC GTA AGT TCA GGA GAT TCC TGG
P5_XhoI_Stop_R: GAG CTC GAG TTA TTT TGC TTC TTT AGC CCA TTC TTT CGC
P5overlapextensionPCR_XhoI_R: GAG CTC GAG TTT TGC TTC TTT AGC CCA TTC TTT CGC
P5 Geneblocks fragment Sequence:
CTGAACTGGGAGCTATCTCTATGGATAACGGCTGTGCGGCAGTAGCTTTTCCAGACTTTACGCGGGGAGAGTGGAATGTTACCAAAGGTTATAAACACGCCTATGCGTCTCCGGAAGACGAGAACGCGAGTATGGAAAAAGCCAAGGCGTTTACCGCCAAACTGAAAGAACAGGGTGCGAAAGAATGGGCTAAAGAAGCAAAACTCGAGCTC
BfGH109_BamHI_F: GAT GGA TCC ATG AAA GAT ACT GTC TAC GTC ACC G
BfGH109_EcoRI_R: GAT GAA TTC TTA TTT GAT CGT TGT TTT CAG G
EmGH110A_HindIII_F: GCC AAG CTT ATG CCT AAA AAG GTA AGA ATA GC
EmGH110A_SacI_R: GAT GAG CTC TTA GTA GTC GTC ATT TAT TGC
PmST1_NdeI_F: GCC CAT ATG TCA AAA ACA ATC ACG CTG TAT TTA G
PmST1_XhoI_R: GCC CTC GAG CAA CTG TTT TAA ACT GTC CC
PmST1_EcoRI_F: GCC GAA TTC ATG TCA AAA ACA ATC ACG CTG
PmST1_XhoI_R(stop): GCC CTC GAG TTA CAA CTG TTT TAA ACT GTC CC
PmST1_HindIII_R: GCC AAG CTT ACA ACT GTT TTA AAC TGT CCC

7.2 Functional Metagenomic Screening of Glycosyltransferases

7.2.1 Materials
The E. coli strain JM107(∆nanA) and the plasmids pACYC18(SiaB) and pCW(CstI/SiaB) were generous gifts from the lab of Dr. Warren Wakarchuk. All other strains and plasmids were readily available within the Withers lab.

7.2.2 Construction of pDUAL and pDUAL2
The lac promoter region of pUC18 was PCR-amplified with flanking HindIII and SalI restriction sites. Attempts to ligate the lac promoter region into HindIII/SalI-digested pUC18 in opposite orientation to the existing lac promoter consistently yielded no clones. Subsequently, the single-stranded oligonucleotide for the lac promoter region was ordered and hybridized into the HindIII/SalI-digested pUC18 vector successfully. The same hybridization approach was applied for pDUAL2, in which the lac promoter and multiple-cloning site (MCS) were included in the ssDNA oligonucleotide ordered and subsequently hybridized into pUC19.

7.2.3 Verification of pDUAL and pDUAL2
The gene Bcx was sub-cloned into pDUAL using the XbaI restriction site. Following cloning, numerous colonies were picked, the plasmid DNA miniprepped, and the DNA sequenced using the M13R primer site to determine the orientation of Bcx. 
A representative clone with Bcx in both forward (in line with original lac promoter) and reverse (in line with secondary lac promoter unique to pDUAL) orientations was picked and grown overnight (37°C with 250 RPM shaking) in 2-mL aliquots of LB media supplemented with 0.2 mM IPTG to ensure induction of protein expression. A clone containing an empty pDUAL plasmid was also grown in parallel as a negative control. The following day the cultures were pelleted and then re-suspended in 1X Bugbuster (100 µL) for 20 minutes at room temperature. The mixtures were spun down (12000 RPM, 5 minutes) and 5 µL of each lysate added to a 50 µL assay mixture containing 50 mM MES buffer (pH 6.0), 50 mM NaCl, and 1 mM β-1,4-xylose-pNP. The reaction was performed at 37°C and monitored visually for 2 hours. A positive control assay was also performed in which 1 µL of purified Bcx protein was added to the assay mixture. In order to verify the dual orientation activity of pDUAL2, the jellyfish gene for green fluorescence, gfp, was sub-cloned into pDUAL2 using the SmaI restriction site. Following cloning, numerous colonies were picked, the plasmid DNA miniprepped, and the DNA sequenced using the M13R primer site to determine the orientation of gfp. Four representative clones with gfp in both forward (in line with original lac promoter) and reverse (in line with secondary lac promoter) orientations were picked and grown overnight (37°C with 250 RPM shaking) in 2-mL aliquots of LB media supplemented with 0.2 mM IPTG to ensure induction of protein expression. A clone containing an empty pDUAL2 plasmid was also grown in parallel as a negative control. Following growth overnight, the culture tubes were analyzed under UV (365 nm) for green fluorescence.

7.2.4 Campylobacter jejuni Genomic Library Construction
Cell pellets from the Campylobacter jejuni strains 81-176 and 11678 were generously provided by Dr. Erin Gaynor following the overnight growth of 5-mL cultures. 
Genomic DNA was extracted and purified using the GenElute Bacterial Genomic DNA kit from Sigma. The genomic DNA (diluted to 500 µL in TE buffer) was subjected to sonication in a microcentrifuge tube (1.5 mL) using the Branson Sonifier S-250 (Analog). Sonication was performed with a 1/8” tip at a duty cycle of 10% with a varying number of pulses. For Campylobacter genomic DNA, 25 pulses were sufficient to achieve an average fragment size of 3-5 kb. Following sonication, the DNA was concentrated by PCR purification (Qiagen kit), eluted in 20 µL of elution buffer and run on a 1% agarose gel. Following visualization of the fragments (smear) on the gel, DNA of the size 2-8 kb was excised from the gel and purified using the Qiagen DNA Gel Extraction kit. The DNA was end-repaired using the Fast DNA End Repair kit from ThermoFisher Scientific, and then ligated into pDUAL previously digested with SmaI and dephosphorylated with shrimp alkaline phosphatase (SAP). Following ligation the DNA was purified using a PCR Purification kit (Qiagen), eluted in 20 µL of H2O, and then 5 µL aliquots transformed into 80 µL cell aliquots of the screening strain JM107(∆nanA):pACYC18(SiaB) by electroporation (1 mm cuvette, 1.6 kV, 200 Ω, 25 µF), yielding time constants between 4.5 and 5.0 milliseconds. Cells were immediately recovered in 1 mL of SOC media for 1 hour (37°C) then transferred to a larger flask containing 25-50 mL of LB media. Small aliquots of the recovery mixture were plated to determine library titre. The larger cultures were grown overnight (37°C, 250 RPM) and then aliquots (with 1/10 volume of DMSO) were stored at -80°C prior to screening. Genomic DNA from the bacterium Cellulomonas flavigena was available within the lab, having been isolated and purified previously by another member using the same method of bacterial genomic DNA extraction and purification.
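As a rough check on the electroporation settings given above, the expected pulse time constant can be estimated from the capacitance and the resistance through which the capacitor discharges. The following is a minimal sketch, assuming a simple exponential (RC) discharge in which the 200 Ω setting acts in parallel with the sample; the function name and the sample resistance are hypothetical illustrations, not values measured in this work.

```python
def time_constant_ms(capacitance_uf, parallel_ohm, sample_ohm=None):
    """Estimate the RC time constant (ms) of an exponential-decay pulse.

    Assumption: the instrument's parallel resistor and the sample form a
    simple parallel resistance; real instruments may deviate from this.
    """
    if sample_ohm is None:
        # Sample resistance taken as effectively infinite.
        resistance = parallel_ohm
    else:
        resistance = (parallel_ohm * sample_ohm) / (parallel_ohm + sample_ohm)
    # ohm * farad = seconds; convert microfarads to farads and seconds to ms.
    return resistance * capacitance_uf * 1e-6 * 1e3


# Settings quoted in the text: 200 ohm, 25 microfarad.
ideal_ms = time_constant_ms(25, 200)  # 200 ohm x 25 uF = 5 ms upper bound

# Hypothetical sample resistance of 2000 ohm (assumed for illustration only).
with_sample_ms = time_constant_ms(25, 200, sample_ohm=2000)
```

Under these assumptions the time constant cannot exceed 5 ms, and a finite sample resistance lowers it slightly, which is the behaviour expected for a low-conductivity DNA sample.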
7.2.5 Screening of Campylobacter Genomic DNA Libraries by FACS
Following overnight growth in rich media (LB), library cells were diluted 1/50 in mineral culture media and grown at 37°C until an OD600 of 0.6 was reached. At this point 0.5 mM IPTG was added and the cells were transferred to 20°C and grown overnight. Cells (1 mL) were spun down and re-suspended in 50 μL of M9 media supplemented with 5 mM Neu5Ac and 0.5 mM of β-lactose-C2-BODIPY. Following 1 hour of incubation, cells were spun down and excess acceptor sugar and Neu5Ac were removed by aspiration. The cells were re-suspended in LB media and transferred to 37°C for 10 min, centrifuged and washed three times with PBS before resuspension in 3 mL of PBS. The cells were visually analysed for fluorescence and taken to the FACS for further analysis and sorting. Cells were diluted in PBS and run in either the FACS Aria (Becton-Dickinson) or Influx (Becton-Dickinson) flow cytometer using PBS as sheath fluid. The threshold for event detection was set to forward and side scattering (FSC and SSC). The average sort rate was ~5000 events per second, using a 70 μm nozzle, with excitation by an argon-ion laser (488 nm) and measurement of emissions passing the 530 nm and 610 nm (FITC) band-pass filters for the BODIPY emission. Cells were sorted into Eppendorf tubes containing 200 μL LB medium. Pools of sorted positives were either transferred to larger flasks containing 25-50 mL liquid LB media or plated on LB agar plates containing ampicillin and chloramphenicol. Cultures and/or plates were allowed to grow overnight at 37°C then transferred to M9 media (1/50) for growth, induction, and activity assay prior to FACS (same as above). In later experiments, as discussed in the main text, the sorted cells were plated on LB agar plates containing carbenicillin (50 µg/mL), chloramphenicol (20 µg/mL), and IPTG (0.5 mM). Following growth, cells were removed from the agar plates, washed twice in M9 media (1 mL), and assayed (as above) prior to FACS. 
FACS data were processed using the FlowJo software (Tree Star). For control reactions and sorting profiles, JM107(∆nanA):pACYC18(SiaB)+pDUAL(empty) was used as a negative control while JM107(∆nanA):pCW(CstII/SiaB) was used as a positive control.

7.2.6 Quantification of CstI by Colony PCR
PCR primers for the CstI gene were ordered from Integrated DNA Technologies (IDT). Freshly grown colonies were picked with sterile toothpicks and the cell debris transferred to 50 µL PCR reaction mixtures. Following a high temperature (95°C) incubation for 3 minutes to ensure cell lysis, a traditional PCR protocol was run. Products were run on a gel to visually verify the presence of the CstI product band. Following poor results, an improved colony PCR method was used in which the cell debris was first diluted in water (25 µL) and lysed by incubation at 95°C (5 minutes), then a 2 µL aliquot of this mixture added to a 50 µL PCR reaction mixture for the remaining PCR reaction. Extended incubation of the Taq polymerase in the first method appeared to substantially reduce DNA polymerase activity, yielding many false negative results.

7.2.7 Verification of CstI by Activity Assay (in vitro)
Picked colonies were used to inoculate 2-4 mL aliquots of LB media containing carbenicillin (50 µg/mL), chloramphenicol (20 µg/mL), and IPTG (0.5 mM). Cultures were grown/induced overnight at 37°C (250 RPM shaking). The following day, cultures were pelleted, then re-suspended in 50 µL Bugbuster (1X) for 20 minutes at room temperature. The cell mixtures were spun down (13000 RPM, 5 minutes) and aliquots (5 µL) of the lysate transferred to a sialyltransferase reaction mixture (50 µL) containing 50 mM PO4 buffer (pH 7.5), 1 mM CMP-Neu5Ac, and 0.2-0.5 mM β-lactose-C2-BODIPY. The reaction was incubated at 37°C for 20-60 minutes prior to analysis by TLC. 
Samples were spotted (0.5 µL) on TLC Silica gel 60 F254 plates (Merck) and separated with a solvent mixture of ethyl acetate, methanol, water, and acetic acid (6:2:1:0.1). Following separation, plates were dried and viewed under UV (365 nm) using an AlphaImager (Innotec) instrument. Under UV, reaction components containing the fluorophore BODIPY were easily detectable. The presence or absence of the sialyltransferase product α-2,3-Neu5Ac-Lactose-C2-BODIPY was easily visualized against a standard.

7.2.8 Quantification of CstI by Quantitative PCR (qPCR)
The PrimeTime® qPCR assay (IDT) was used to quantify the amount of CstI DNA through each round of FACS. Two PCR primers specific to a 500 bp region of the CstI gene were used along with a PrimeTime® qPCR probe which bound within that region and contained a 5’HEX fluorophore, 3’IBFQ quencher, and internal ZEN quencher. Reactions (20 µL) were performed with 2X Taqman® qPCR Master Mix (10 µL), 10X PrimeTime® assay mix (2 µL) containing the CstI probe, forward and reverse PCR primers (1 µL each), AmpliTaq Gold DNA Polymerase (1 µL), and the diluted sample DNA in the remaining 5 µL. PCR cycle steps were as follows: 50°C (5 mins) and 95°C (10 mins) pre-incubation, followed by 40 cycles of 95°C (0:15) and 60°C (1:00) and a final extension at 72°C for 5 minutes. Measurement of fluorescence for each qPCR reaction was obtained with the Bio-Rad Chromo4 instrument. Following either the initial growth step in liquid media (pre-FACS) or subsequent amplification steps on solid agar (post-FACS rounds 1, 2, or later), aliquots of the homogenized samples were lysed and the plasmid DNA miniprepped (Qiagen DNA Extraction Miniprep Kit). The total concentration of the DNA miniprep was determined by fluorometric quantitation using the Qubit fluorometer (Thermo Fisher). Dilutions of each sample were then subjected to qPCR analysis using the protocol described above and the amount of CstI DNA determined against a standard curve. 
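The standard-curve step can be sketched as follows. This is a minimal illustration, assuming the usual linear relationship between quantification cycle (Cq) and log10 of template quantity; the function names, the standards, and the sample values are all hypothetical and are not data from this work. Dividing the interpolated CstI quantity by the total DNA loaded gives a relative abundance.

```python
import math


def fit_standard_curve(quantities_ng, cq_values):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_cq(cq, slope, intercept):
    """Invert the standard curve to recover template quantity (ng)."""
    return 10 ** ((cq - intercept) / slope)


def relative_abundance(target_ng, total_ng):
    """Fraction of the total DNA accounted for by the target."""
    return target_ng / total_ng


# Hypothetical 10-fold dilution series of a CstI standard (illustrative only).
standards_ng = [10.0, 1.0, 0.1, 0.01]
standards_cq = [15.0, 18.3, 21.6, 24.9]
slope, intercept = fit_standard_curve(standards_ng, standards_cq)

# Hypothetical sample: Cq of 20.0 measured from 5 ng of total miniprep DNA.
sample_ng = quantity_from_cq(20.0, slope, intercept)
abundance = relative_abundance(sample_ng, total_ng=5.0)
```

A slope near -3.3 cycles per decade corresponds to roughly 100% amplification efficiency, which is a common sanity check on such curves.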
The abundance of CstI in the total sample was then determined as the concentration of CstI gene relative to the total DNA concentration and plotted.

7.2.9 Sampling
Solid sludge and raw wastewater samples were obtained from the UBC Wastewater Treatment Facility. Three 50-mL samples for each of the solid sludge and raw wastewater were collected and immediately frozen at -80°C prior to DNA extraction. The raw wastewater samples were lightly centrifuged (4000 RPM, 5 minutes) to condense the solid particulate prior to DNA extraction. Seawater (2 L) was collected from Jericho Beach in Vancouver, BC, Canada in a large plastic bottle. The sample was filtered through a Whatman filter disc (0.2 µm) to collect all organic material. The filter paper was cut into small pieces, re-suspended in TE buffer, and stored at -80°C prior to DNA extraction. The chicken, pig, and goat feces were collected from the Southland Farms located in Vancouver, BC, Canada. Samples were placed into 50-mL Falcon tubes and flash-frozen in dry ice and ethanol. Samples were kept frozen at -80°C until DNA extraction.

7.2.10 Environmental Library Construction
For each environmental sample, approximately 0.3–0.5 g of frozen solid material was re-suspended in 5 mL of cold KH2PO4 buffer (50 mM, pH 7.5). This was then split into 800-µL samples (x8) in microcentrifuge tubes (1.5 mL). Proteinase K was added (10 µL of 20 mg/mL) along with 50 µL of SDS (10%). The samples were mixed well and incubated for 1 hour at 55°C. Each sample was then transferred to impact-resistant bead beating tubes (2 mL) containing 350 µL of 450-500 µm glass spheres and a single 3 mm glass bead for general homogenization. Each sample was subjected to bead-beating to homogenize the sample, lyse all microorganism cells, and extract DNA. Bead beating was performed using a FastPrep®-24 Instrument (MP Biomedicals). Beating times were optimized to obtain an average sheared DNA size of approximately 3 kb. 
Samples were processed for 45 seconds at a beating speed of 6 m/s. Following beating, each sample was centrifuged (13,000 RPM, 5 minutes, 4°C), and the lysate transferred to a fresh tube. DNA was extracted from lysates using phenol-chloroform (2x phenol-chloroform, 1x chloroform), then the DNA precipitated from the aqueous portion using sodium acetate (pH 5.2, 0.3 M final concentration) and ethanol. Precipitation was left overnight (4°C) and the samples spun down in the morning (13,000 RPM, 20 mins). Pellets were washed with fresh 70% ethanol, spun down again (13,000 RPM, 15 mins), and the remaining pellet (once dried) re-suspended in 100 µL EB buffer. Samples were then run on a 1% agarose gel at 100 V for 45 minutes (gel electrophoresis) and the DNA of size 2-10 kb excised. DNA was purified from gel slices using the Qiagen DNA Gel Extraction Kit. All resulting samples (for each environmental sample) were pooled and concentrated into a final 30 µL sample by PCR purification (Qiagen PCR Purification Kit). These samples were end-repaired using the Fast DNA End Repair Kit (Thermo Fisher Scientific). The resulting DNA (2-10 kb) was then ligated into the pDUAL2 vector, previously digested with SmaI and dephosphorylated with FastAP (Thermo Fisher Scientific), and transformed into electrocompetent cells of the screening strain JM107(∆nanA):pACYC18(SiaB). Dilutions of each library were plated to determine library sizes. Following cell rescue in SOC media (1 mL), cells were grown overnight in 25 mL of LB media and aliquots (1 mL) stored at -80°C (1/10 volume DMSO) prior to FACS screening.

7.2.11 FACS of Environmental Libraries
All rounds of FACS sorting and enrichment were performed using the optimized method of growth/induction on solid media described in section 7.2.5.
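The library-size determination from plated dilutions in section 7.2.10 amounts to a colony-count back-calculation, which can be sketched as follows. This is a minimal illustration; the function name and all numbers are hypothetical, and the calculation assumes that every colony arises from a single independent transformant in the recovery mixture.

```python
def library_size(colonies, dilution_factor, volume_plated_ul, total_volume_ul):
    """Estimate total transformants in the recovery mixture.

    colonies          colonies counted on the plate
    dilution_factor   fold dilution of the plated aliquot (e.g. 100 for 1/100)
    volume_plated_ul  volume of diluted cells spread on the plate (µL)
    total_volume_ul   total volume of the recovery mixture (µL)
    """
    cfu_per_ul = colonies * dilution_factor / volume_plated_ul
    return cfu_per_ul * total_volume_ul


# Hypothetical plate count: 150 colonies from 100 µL of a 1/100 dilution,
# taken from a 1 mL (1000 µL) SOC recovery.
estimated_clones = library_size(150, 100, 100, 1000)
```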
7.2.12 Secondary Screening of Environmental Libraries
All secondary screening of the environmental libraries (seawater, chicken, goat) was performed using the in vitro activity assay described in section 7.2.7.

7.2.13 Determination of FACS Limit of Enrichment
The negative control library was constructed from Cellulomonas flavigena genomic DNA. A sample of genomic DNA (1 µg) was diluted to 500 µL in EB buffer and acoustically sheared by sonication using a Branson Sonifier (1/8” tip, 10% duty cycle, 25 pulses). The resulting DNA was separated by gel electrophoresis (1% agarose, 100 V, 45 minutes), then the DNA of size 2-8 kb excised from the gel and purified (Qiagen DNA Gel Purification Kit). Following purification, the DNA was end-repaired using the Fast DNA End Repair Kit (Thermo Fisher Scientific). This DNA was then ligated into pDUAL2 (SmaI digested and dephosphorylated) and transformed into electrocompetent cells of the screening strain JM107(∆nanA):pACYC18(SiaB). Small aliquots (1 mL) of the resulting library (1 x 10^6) were stored at -80°C as frozen stocks (1/10 volume of DMSO). Aliquots of this library were used as the negative control cells for the limit of enrichment studies in order to simulate enrichment from a metagenomic library. The positive control cells used in the limit of enrichment experiments were JM107(∆nanA):pCW(CstI/SiaB) and were available within the Withers lab as a previous, generous gift from Dr. Warren Wakarchuk. Dosed libraries were constructed as 200 µL aliquots consisting of negative control cells (concentration determined by OD600, 1 OD600 = 2.5 x 10^8 cells) and various dilutions of positive control cells (concentrations determined by plate titre). The dosed library aliquots were then plated (15 x 15 cm agar plates) and grown/induced overnight on LBCarb50Cm20+IPTG 0.2 mM. All FACS sorting/enrichment of the dosed libraries was carried out using the previously discussed growth/induction on solid media method. 
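The dosing arithmetic for such libraries can be sketched as follows. This is a minimal illustration, assuming the OD600 conversion quoted above applies per millilitre of culture (the per-volume basis is an assumption); the function names, the OD reading, and the target fraction are hypothetical.

```python
CELLS_PER_ML_PER_OD = 2.5e8  # conversion quoted in the text; per-mL basis assumed


def cells_from_od(od600, volume_ml):
    """Estimate the number of cells in a culture aliquot from its OD600."""
    return od600 * CELLS_PER_ML_PER_OD * volume_ml


def positives_needed(negative_cells, target_fraction):
    """Positive cells to add so they make up target_fraction of the mix.

    Solves fraction = p / (p + n) for p.
    """
    return negative_cells * target_fraction / (1.0 - target_fraction)


# Hypothetical dosing: 200 µL of negative control cells at OD600 = 2.0,
# spiked so that positives make up 1 in 10,000 cells.
negatives = cells_from_od(2.0, 0.2)
positives = positives_needed(negatives, 1e-4)
achieved_fraction = positives / (positives + negatives)
```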
Prior to each round of FACS sorting/enrichment, a small aliquot (1 mL) of the grown cells was isolated and the DNA miniprepped for subsequent analysis by qPCR for CstI abundance determination. The method of qPCR analysis was discussed previously in section 7.2.8.

7.3 Functional Metagenomic Screening of the Human Gut Microbiome

7.3.1 Background Activity of Screening Strain
Small cultures (2 mL) of the E. coli strain ReplicatorFOS (Lucigen) were grown/induced overnight in LB (IPTG 0.2 mM) at 37°C with shaking (250 RPM). Cultures of ReplicatorFOS cells also containing the empty plasmid vector pHSG396, and the vector pHSG396 containing the genes EmGH109 and BfGH110, were also grown/induced in LB (chloramphenicol 25 µg/mL + IPTG 0.5 mM) in parallel. Following growth/induction, 1-mL aliquots of each culture were pelleted (5000 RPM, 5 minutes), decanted, and re-suspended in 1X Bugbuster (50 µL) for 20 minutes at room temperature. The samples were then spun down (13000 RPM, 5 minutes, 4°C) and kept on ice. Assays (40 µL) were performed in 384-well plate (COSTAR) wells. Assay mixtures included 50 mM HEPES buffer, pH 7.5, α-Gal-MU/α-GalNAc-MU (200 µM final), and cell lysate (4 µL). Both cell lysate from the soluble extract (no cell debris), referred to as pure lysate, and soluble extract including cell debris (referred to as impure lysate), were tested for each culture. Assays were run at 37°C and continuously monitored for one hour by plate reader (Opticon) for fluorescence increase (365/440 nm). Background activity of the screening strain on either of the α-Gal-MU or α-GalNAc-MU substrates was also tested in the intended plate screen format. Small 40-µL cultures of ReplicatorFOS, ReplicatorFOS (pHSG396[empty]), ReplicatorFOS (pHSG396[EmGH109]), and ReplicatorFOS (pHSG396[BfGH110]) were grown/induced overnight at 37°C in LB (chloramphenicol 25 µg/mL + IPTG 0.5 mM) on plate (384-well) without shaking. 
In the morning, a lysis/assay mixture (40 µL) containing 50 mM NaPO4 buffer (pH 7.5), 1% Triton X-100, and either α-Gal-MU or α-GalNAc-MU (100 µM final concentration) was added to each well and the reaction then monitored continuously at 37°C for fluorescence (365/440 nm) increase.

7.3.2 384-Well Screen Validation
384-well plate screening for activity against the α-GalNAc-MU substrate was validated by control experiment and determination of Z-factor for high-throughput screen suitability. A single plate was set up with 5 rows (120 wells) containing negative control cells (ReplicatorFOS + pHSG396[empty]) and two rows containing positive control cells (ReplicatorFOS + pHSG396[EmGH109]). Each well contained a 40 µL culture (LB + chloramphenicol 25 µg/mL + IPTG 0.5 mM) grown overnight at 37°C without shaking. Following growth/induction, 40 µL of a lysis/assay mixture (see above) was added to each well and the plate incubated at 37°C. The fluorescence (365/440 nm) of each well was measured immediately upon lysis (time = 0 hrs) as well as following 1, 6, and 20 hours of incubation. The Z-factor for the assay following 20 hours was calculated in Excel using the following formula:

Z-factor = 1 − 3(σp + σn) / |μp − μn|

where Z-factor is defined by the means (μ) and the standard deviations (σ) of the positive (p) and negative (n) controls.

7.3.3 Sampling and DNA extraction
Human fecal samples were collected in a small sterile plastic receptacle, from which approximately 3 g aliquots were transferred to individual 50-mL Falcon tubes (x6) and either stored at -80°C or immediately subjected to DNA extraction. This portion of the sampling was completed by the study participant privately before handing the sample over to the study researcher for the DNA extraction and following steps. DNA extraction was performed using a chemical lysis technique. 
For each 3 g sample, 1.5 mL of denaturing buffer (10 mM TRIS.HCl pH 7.0, 4 M guanidinium isothiocyanate, 1 mM EDTA, 1% β-mercaptoethanol) and 9 mL of extraction buffer (100 mM sodium phosphate - pH 7.0, 100 mM TRIS.HCl – pH 7.0, 100 mM EDTA – pH 8.0, 1.5 M NaCl, 1% CTAB, 2% SDS) were added to the sample. Two small glass beads (3 mm) were added to each sample tube to aid in homogenization, and then each sample was mixed vigorously by vortexing. Samples were then incubated at 60°C for 40 minutes with light rotation (tube on its side on shaker platform rotating at 100 RPM) to avoid sample settling. Following incubation each sample was spun down (1800 RPM, 10 minutes) and the supernatant transferred to a new 50-mL Falcon tube. A second extraction step was then performed by adding 5 mL of extraction buffer to the sample, homogenizing, incubating (60°C, 30 minutes), spinning the sample down, and transferring the additional supernatant to the new tube. An equivalent volume of ice-cold chloroform was added to the supernatant and the tube shaken lightly (tube in ice on shaking platform at 100 RPM) for at least 10 minutes. The sample was then spun down (1800 RPM, 10 minutes) and the clear, aqueous layer transferred to 2-mL microcentrifuge tubes in aliquots of 1 mL. To each tube, isopropanol (0.6 equivalent to sample volume) and ammonium acetate (1/10 total volume of 3 M stock, pH 5.2, to a final concentration of 0.3 M) were added. Tubes were mixed, then spun down (15000 RPM, 20 minutes, 4°C) and decanted (carefully so as not to disturb the DNA pellet). Each sample was washed with 1 mL of fresh 70% ethanol then spun down and decanted as before. Each pellet was allowed to air-dry for 5-10 minutes before the addition of 150 µL of resuspension buffer (TE buffer, pH 8.0). Samples were then stored overnight at 4°C to allow for sufficient resuspension of the pellet and then pooled together prior to library construction. 
At this point DNA concentration and yield were also determined by absorbance at 260 nm relative to a water standard. Approximately 200 µg of DNA was recovered from each sample (3 g of starting material).

7.3.4 Small-Insert Library Construction

7.3.4.1 Acoustic Shearing and Blunt-End Ligation
Purified metagenomic DNA (1 µg) was diluted to 500 µL in EB buffer and acoustically sheared by sonication using a Branson Sonifier (1/8” tip, 10% duty cycle, 25 pulses). The resulting DNA was separated by gel electrophoresis (1% agarose, 100 V, 45 minutes), then the DNA of size 2-8 kb excised from the gel and purified (Qiagen DNA Gel Purification Kit). Following purification, the DNA was end-repaired using the Fast DNA End Repair Kit (Thermo Fisher Scientific). This DNA was then ligated into pHSG396 (SmaI digested and dephosphorylated) and transformed into electrocompetent cells of the screening strain ReplicatorFOS. Electroporation (1 mm cuvette, 1.6 kV, 200 Ω, 25 µF) yielded time constants between 4.5 and 5.0 milliseconds. Cells were immediately recovered in 1 mL of SOC media for 1 hour (37°C) and dilutions plated on LB agar (chloramphenicol 25 µg/mL) to determine the library titre.

7.3.4.2 Restriction Enzyme Digestion and Ligation
Purified metagenomic DNA (1 µg) was digested with the restriction enzyme Sau3AI (0.5 units) at 37°C for 30 minutes. The resulting DNA was separated on a 1% agarose gel by electrophoresis (100 V, 40 mins). DNA of the size 3-10 kb was excised from the gel and purified using the Qiagen DNA Gel Extraction Kit. This insert DNA was then ligated (T4 DNA Ligase) with the vector pHSG396 which had been digested previously with BamHI and dephosphorylated with the alkaline phosphatase FastAP (ThermoFisher). The insert DNA (3-10 kb) was ligated with the vector (pHSG396) at a vector-to-insert ratio of approximately 1:3. 
The resulting ligation was purified using the Qiagen PCR Purification Kit, concentrated by eluting (EB Buffer) into a smaller volume (20 µL), and stored at -20°C prior to transformation. An aliquot of the ligation product (5 µL) was transformed into ReplicatorFOS cells (80 µL) by electroporation (1 mm cuvette, output voltage = 1.6 kV, capacitance = 25 µF, pulse time = 4-5 ms) and immediately recovered in SOC media (1 mL) at 37°C for 60 mins. A very small amount (2 µL) of the cells was used to plate 0X, 100X, and 10000X dilutions on LBCm25 and determine a library titre. DMSO (1/10 volume) was added to the library and the library stored at -80°C for later use.

7.3.4.3 Library Picking and Storage
Dilutions of the libraries were plated and grown on solid LB (chloramphenicol 25 µg/mL + IPTG 0.2 mM) overnight at 37°C such that each plate contained 150-400 colonies. Colonies were picked using the QPix II (Genetix) colony picker and transferred to 384-well plates containing LB (chloramphenicol 25 µg/mL + 10% glycerol v/v) in each well (80 µL). The plates were incubated at 37°C for 20 hours, then stored at -80°C.

7.3.5 Screening
Library plates were thawed at room temperature, then replicated to 384-well plates containing growth/induction media (LB + chloramphenicol 25 µg/mL + IPTG 0.2 mM) in each well (40 µL). Plates were incubated at 37°C for 20 hours. Evaporation was minimized by incubating plates within a humid chamber (>90% humidity). Following growth/induction, 40 µL of a lysis/assay mixture (see section 7.3.1) was added to each well and the plate incubated at 37°C. All plates were robotically filled with media or lysis/assay mixture using a QFill instrument (Genetix). Following 22 hours of incubation, the fluorescence (365/440 nm) of each plate was determined by plate reader (Opticon). All data were compiled and analyzed in Excel. Hits were identified as those wells fluorescing at least 10 standard deviations (SDs) above the mean. 
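The two plate statistics used in sections 7.3.2 and 7.3.5 can be sketched as follows. This is a minimal illustration rather than the Excel workflow used in this work: the Z-factor uses the standard definition based on the absolute difference of the control means, and the hit cutoff assumes the mean and SD are taken over all wells of a plate, which the text does not specify. Function names and all fluorescence values are hypothetical.

```python
from statistics import mean, stdev


def z_factor(positive, negative):
    """Z-factor from positive- and negative-control well readings."""
    spread = 3.0 * (stdev(positive) + stdev(negative))
    window = abs(mean(positive) - mean(negative))
    return 1.0 - spread / window


def call_hits(readings, n_sd=10.0):
    """Return wells reading at least n_sd SDs above the plate mean.

    Assumption: mean and SD are computed over every well on the plate.
    """
    values = list(readings.values())
    cutoff = mean(values) + n_sd * stdev(values)
    return [well for well, value in readings.items() if value >= cutoff]


# Hypothetical control readings (arbitrary fluorescence units).
positive_controls = [1000, 1020, 980, 1010, 990]
negative_controls = [100, 105, 95, 102, 98]
z = z_factor(positive_controls, negative_controls)

# Hypothetical 384-well plate: 383 background wells and one strong well.
plate = {f"W{i}": 100.0 + (i % 5) for i in range(383)}
plate["HIT"] = 5000.0
hits = call_hits(plate)
```

Because a single extreme well inflates the plate-wide SD, a 10-SD cutoff computed this way is conservative; computing the statistics from negative-control wells only would be an alternative design choice.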
Wells in the library storage plates corresponding to hits (>10 SD above mean) were picked and re-tested in duplicate on 384-well plates to validate α-N-acetylgalactosaminidase activity. Frozen stocks were made for each of the hits validated through secondary screening.

7.3.6 N-Acetylgalactosaminidase Coupled Assay on Blood Type A Antigen

Small LB cultures (2 mL) containing chloramphenicol (20 µg/mL) and IPTG (0.2 mM) were inoculated with the hits isolated from screening and grown overnight at 37°C with shaking (250 RPM). The following day, cultures were spun down (5000 RPM, 5 minutes), and the pellet re-suspended in 1X Bugbuster reagent for 20 minutes at room temperature. The lysis mixture was spun down (13200 RPM, 5 minutes, 4°C), and the clarified lysate kept on ice. Lysate from each sample was added to wells (384-well plate) containing a buffer, a coupled enzyme mixture, and reaction substrate. The α-N-acetylgalactosaminidase assays (50 µL total volume) were performed at 25°C in 100 mM HEPES buffer (pH 7.5) with 35 µM MU-Type2Atetra, 0.05 mg/mL each of SpHex, BgaA, and AfcA, and lysate from the induced overnight cultures. Fluorescence (365/440 nm) was monitored continuously for 2 hours using a BioTek Synergy HT fluorescence plate reader.

7.3.7 Hit Sequence Analysis

Small LB cultures of each hit (3 mL) were grown overnight. Plasmid DNA was isolated for each hit from these cultures using the Qiagen DNA Miniprep Kit. DNA was submitted for Sanger sequencing using primers flanking each metagenomic insert (M13F and M13R). For inserts over 1 kb in length, additional primers were designed within each determined sequence. Primer walking was conducted until the entire insert sequence had been identified. Preliminary sequence analysis was conducted using BLASTX (NCBI) to identify possible genes of interest within each insert sequence. Sequences were then analyzed using the Open Reading Frame (ORF) Finder program (NCBI) to identify protein coding regions.
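As a rough illustration of what an ORF search does, the sketch below scans all six reading frames of an insert for ATG-to-stop ORFs and returns the longest one. It is not the NCBI ORF Finder algorithm: it assumes ATG as the only start codon and the standard stop codons, ignores alternative genetic codes, and all function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def orfs_in_strand(seq):
    """Yield (start, end) of ATG...stop ORFs in the three frames of one strand."""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                yield start, i + 3             # include the stop codon
                start = None

def longest_orf(seq):
    """Return the longest ORF (as DNA, 5'->3') found on either strand, or ''."""
    seq = seq.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for start, end in orfs_in_strand(strand):
            if end - start > len(best):
                best = strand[start:end]
    return best

# Hypothetical toy insert with one short ORF in the third forward frame.
example_orf = longest_orf("CCATGAAATTTGGGTAACC")
```

Selecting the longest ORF per insert mirrors the targeting rule used here, but a long ORF is only a candidate; homology evidence such as the BLASTX results is still needed to decide whether it encodes the enzyme of interest.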
The largest protein coding regions were targeted for cloning and expression.

7.3.8 Expression and Purification of Screening Hits

All ORFs identified from hit sequence analysis (one for each hit) were initially sub-cloned into pET29a (C-terminal 6xHIS tag) and purified from BL21(DE3) cells. Briefly, 1 L of 2xYT media was inoculated with 1 mL of an overnight culture of BL21(DE3)-pET29a(ORF) cells and grown at 37°C with shaking (250 RPM) until an OD600 of 0.8 was reached. The culture was induced (0.5 mM IPTG) overnight at 16°C with shaking (250 RPM). Following induction, the cells were pelleted (5000 RPM, 15 minutes at 4°C) and re-suspended in 20 mL of lysis buffer (50 mM HEPES, 200 mM NaCl, pH 7.5). The cells were lysed by sonication (3 minutes, 30% amplitude, QSonica Sonicator, 1/8” microtip) and spun down (15,000 RPM, 20 minutes), and the lysate was filtered (0.2 µm filter) and applied to a 12 mL gravity column containing 2 mL of Ni2+ resin. The column was washed with 20 volumes of wash buffer (50 mM HEPES, 200 mM NaCl, 20 mM imidazole, pH 7.5). Bound protein was eluted with 5 volumes of elution buffer (50 mM HEPES, 200 mM NaCl, 250 mM imidazole, pH 7.5), then buffer-exchanged with storage buffer (50 mM sodium phosphate, 200 mM NaCl, pH 7.0) using an Amicon Ultra 15 filter (30 kDa) and stored at 4°C prior to use. Resulting protein purity was assessed by SDS-PAGE and concentrations determined by A280.

The hit BvGH109 ORF was extended to include 58 amino acids missing from its C-terminus relative to its full putative protein sequence. This was achieved by ordering the remaining DNA sequence (150 bp) as a codon-optimized DNA fragment (Genewiz, Geneblocks) containing a stop codon and XhoI restriction site.
The ordered sequence is listed below (XhoI restriction site: ctcgag):

CTGAACTGGGAGCTATCTCTATGGATAACGGCTGTGCGGCAGTAGCTTTTCCAGACTTTACGCGGGGAGAGTGGAATGTTACCAAAGGTTATAAACACGCCTATGCGTCTCCGGAAGACGAGAACGCGAGTATGGAAAAAGCCAAGGCGTTTACCGCCAAACTGAAAGAACAGGGTGCGAAAGAATGGGCTAAAGAAGCAAAActcgagctc

Overlap-extension PCR was performed with the BvGH109 ORF fragment along with the additional fragment. The resulting full BvGH109 sequence was sub-cloned into the expression vector pET16b using the restriction sites NdeI and XhoI. Additionally, an N-terminally truncated form of the full BvGH109 sequence (referred to as tBvGH109), in which the first 24 amino acids (signal peptide sequence) were removed, was constructed and sub-cloned into pET16b. These N-terminal histidine-tagged (x10) proteins were expressed and purified under the same conditions as the pET29a constructs, except that the Ni2+ columns used were pre-packed cartridges (1 mL fast-flow), wash buffer contained 50 mM imidazole, elution buffer contained 350 mM imidazole, and protein was eluted under a continuous gradient (1 mL/min flow rate) using an AKTA HPLC instrument.

7.3.9 BpGH31(E2) and BcGH31(P19) Substrate Specificity Assay

The substrate specificity assay was performed on a 384-well plate (clear) at 37°C and monitored continuously at 405 nm by plate reader (Biotek Synergy). Each well contained 50 µL of an assay mixture consisting of 50 mM phosphate buffer, pH 7.5 (40 µL), 10 mM of either oNP-α-Gal, oNP-α-Glc, oNP-α-Xyl, oNP-α-Man, oNP-β-Gal, or oNP-β-Glc (final concentrations), and 20 µg of purified BcGH31 or BpGH31. In the case of negative controls, the purified enzyme was substituted with either water or lysate from Ni2+-purified BL21(DE3) cells.

7.3.10 BpGH31(E2) and BcGH31(P19) Competition Assay

In microcentrifuge tubes, BcGH31 and BpGH31 (20 µg) were separately incubated with 2F-pNP-β-Gal (1 mM final concentration) for 10 minutes at room temperature.
Each reaction (50 µL) was performed in 50 mM phosphate buffer (pH 7.0). Negative controls without 2F-pNP-β-Gal were incubated under identical conditions. Following incubation, a 5 µL aliquot of each reaction was transferred to wells (384-well plate) containing an assay mixture (45 µL) composed of buffer (50 mM phosphate, pH 7.5) and the substrate α-GalNAc-MU (100 µM final concentration). All reactions were then incubated at 37°C and the fluorescence (365/440 nm) monitored continuously for 20 minutes or until fluorescence exceeded the limit of detection for the plate reader (Biotek Synergy).

7.3.11 Estimation of BcGH31(P19) and BpGH31(E2) pH Optima

The pH optimum for each enzyme was determined by a stopped assay. The reactions were performed at room temperature for 15 minutes. Each reaction (30 µL) contained enzyme (100 nM final concentration), α-GalNAc-MU (50 µM final concentration), and either citrate buffer (100 mM, pH 3-5.5) or sodium phosphate buffer (100 mM, pH 6-9). Following incubation, a 5 µL aliquot of each reaction was quenched by transfer to plate wells (384-well plate) containing 1 M glycine (pH 10.4). The fluorescence (365/440 nm) of each reaction was then measured by plate reader (Biotek Synergy). Negative control reactions were carried out in parallel with no enzyme present in the assay. Fluorescence values were analyzed in Excel. The baseline fluorescence values from the negative control reactions were subtracted from those performed with enzyme and the resulting values plotted for each enzyme.

7.3.12 Determination of α-N-acetylgalactosaminidase Stereochemical Outcome of Hydrolysis by BpGH31(E2) and BcGH31(P19)

α-N-Acetylgalactosaminidase reactions (500 µL) were set up in NMR tubes (Sigma) prior to analysis. Each 500 µL reaction contained 50 mM HEPES buffer (pH 7.5), α-GalNAc-MU (5 mM final concentration), and either 100 µg or 8 µg (entire stock) of purified BpGH31(E2) and BcGH31(P19), respectively.
All buffer salts and substrates were lyophilized and re-suspended in D2O. The enzyme was buffer-exchanged into D2O prior to addition to the reaction mixture. Upon enzyme addition and mixing, the reaction-containing NMR tube was immediately submitted to 1H NMR analysis for monitoring. 1H NMR spectra were recorded following 5 minutes of reaction (D2O, 298 K, 400 MHz Bruker INV).

7.3.13 Kinetics

7.3.13.1 BpGH31(E2) and BcGH31(P19)

Michaelis-Menten kinetic parameters were determined for the substrate α-GalNAc-MU at pH 7.5 and 25°C. Reactions (total volume of 40 µL) were performed in plate wells (384-well plate), with the fluorescence (365/440 nm) signal resulting from MU release by hydrolysis monitored by plate reader (Biotek Synergy). Reactions were performed in duplicate, with each reaction consisting of 50 mM HEPES buffer (pH 7.5), varying concentrations of α-GalNAc-MU (5 µM, 10 µM, 25 µM, 50 µM, 100 µM, 250 µM, or 500 µM), and enzyme (BpGH31 [E2], 53 nM or BcGH31 [P19], 20 nM). Initial rates (RFU s⁻¹) were determined within the plate reader (Synergy) software, and converted from fluorescence to concentration (M s⁻¹) using MU standard concentration curves determined under identical reaction conditions (HEPES buffer, pH 7.5, Detector Gain 80). Initial reaction rates (V0) were plotted in Grafit 7.0 to determine the apparent Michaelis-Menten kinetic parameters.

7.3.13.2 BvGH109 and EmGH109

Michaelis-Menten kinetic parameters were determined for the substrate Type 2A-MUtetra at pH 7.5 and 25°C. Reactions (total volume of 50 µL) were performed in plate wells (384-well plate), with the fluorescence (365/440 nm) signal resulting from MU release by hydrolysis monitored by plate reader (Biotek Synergy).
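For both sets of enzymes, the fitting step amounts to a least-squares fit of initial rates to v = Vmax[S]/(KM + [S]). The fits in this work were done in Grafit; the sketch below is a stand-in that uses only the Python standard library. It profiles out Vmax analytically for each trial KM and then runs a golden-section search over ln(KM), which assumes the residual sum of squares has a single minimum within the search range. The function name, search bounds, and example data are hypothetical.

```python
import math

def fit_michaelis_menten(s, v, iters=200):
    """Least-squares fit of v = Vmax*S/(Km + S); returns (Vmax, Km)."""
    def vmax_for(km):
        # For a fixed Km the model is linear in Vmax, so Vmax has a closed form.
        x = [si / (km + si) for si in s]
        return sum(vi * xi for vi, xi in zip(v, x)) / sum(xi * xi for xi in x)

    def sse(log_km):
        km = math.exp(log_km)
        vmax = vmax_for(km)
        return sum((vi - vmax * si / (km + si)) ** 2 for si, vi in zip(s, v))

    # Golden-section search over ln(Km); the bounds are an arbitrary assumption.
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(min(s) / 1000), math.log(max(s) * 1000)
    c, d = b - phi * (b - a), a + phi * (b - a)
    for _ in range(iters):
        if sse(c) < sse(d):
            b, d = d, c
            c = b - phi * (b - a)
        else:
            a, c = c, d
            d = a + phi * (b - a)
    km = math.exp((a + b) / 2)
    return vmax_for(km), km

# Hypothetical noise-free rates at the substrate concentrations used in 7.3.13.1 (µM).
substrate = [5, 10, 25, 50, 100, 250, 500]
rates = [2.0 * si / (50.0 + si) for si in substrate]
vmax_fit, km_fit = fit_michaelis_menten(substrate, rates)
kcat_fit = vmax_fit / 0.053   # kcat = Vmax / [E], with [E] = 53 nM expressed in µM
```

Dividing Vmax by the enzyme concentration gives kcat only when both are in matching concentration units, which is why the fluorescence rates are first converted using the MU standard curves.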
Reactions were performed as coupled assays in 100 mM HEPES buffer (pH 7.5) with 0.1 mg/mL each of SpHex, BgaA, and AfcA, varying concentrations of Type 2A-MUtetra (0.33 mM, 0.67 mM, 1 mM, 2 mM, 3 mM, 4 mM, 5 mM, 6.7 mM), and α-N-acetylgalactosaminidase enzyme (BvGH109, 350 nM or EmGH109, 24 nM). Initial rates (RFU s⁻¹) were determined within the plate reader (Synergy) software, and converted from fluorescence to concentration (M s⁻¹) using MU standard concentration curves determined under identical reaction conditions (HEPES buffer, pH 7.5, Detector Gain 60). Initial reaction rates (V0) were plotted in Grafit 7.0 to determine the apparent Michaelis-Menten kinetic parameters.

7.4 Directed Evolution of the Sialyltransferase PmST1

7.4.1 Protein Sequences

The gene used in this study as a template for mutagenesis was an N-terminal 2-25 AA truncation of a protein homologue of Pm0188 (D105N, R135Q, G295E) initially referred to as tPm0188Ph. The removal of an N-terminal signal peptide sequence had previously been shown to improve expression and solubility of the protein.205 This version of the gene is most commonly referred to as PmST1, and that name is used throughout this dissertation.

PmST1 (tPm0188Ph) – 397 amino acids (46 kDa)
N-terminal 2-25 AA truncation of protein homologue of Pm0188 (D105N, R135Q, G295E) with C-terminal His6-tag205
MSKTITLYLDPASLPALNQLMDFTQNNEDKTHPRIFGLSRFKIPDNIITQYQNIHFVELKDNRPTEALFTILDQYPGNIELNIHLNIAHSVQLIRPILAYRFKHLDRVSIQQLNLYDDGSMEYVDLEKEENKDISAEIKQAEKQLSHYLLTGKIKFDNPTIARYVWQSAFPVKYHFLSTDYFEKAEFLQPLKEYLAENYQKMDWTAYQQLTPEQQAFYLTLVGFNDEVKQSLEVQQAKFIFTGTTTWEGNTDVREYYAQQQLNLLNHFTQAEGDLFIGDHYKIYFKGHPRGGEINDYILNNAKNITNIPANISFEVLMMTGLLPDKVGGVASSLYFSLPKEKISHIIFTSNKQVKSKEDALNNPYVKVMRRLGIIDESQVIFWDSLKQLLEHHHHHH*

7.4.2 Mutagenesis and Library Construction

For the first round of directed evolution, error-prone PCR (epPCR) was performed to generate mutations within the PmST1 gene sequence.
Reaction mixtures contained 10X Taq polymerase PCR buffer (5 µL), dNTP mixture containing 2 mM dCTP, 2 mM dTTP, 5 mM dATP and 5 mM dGTP (5 µL), 200 nM each of forward and reverse primers for PmST1 with HindIII and EcoRI restriction sites (5 µL each), 1 ng of template DNA (2 µL), 3 mM MgCl2 (2 µL), Taq polymerase (1 µL), and 50, 100, or 250 µM MnCl2. Final reaction volumes were 50 µL. PCR conditions were as follows: 95°C for 30 seconds, then 30 cycles of 95°C for 30 seconds, 55°C for 30 seconds, and 72°C for 1 minute, followed by a final extension at 72°C for 10 minutes. Following PCR, products were separated by gel electrophoresis, excised and purified from agarose (Qiagen DNA Gel Purification Kit), digested with HindIII and EcoRI restriction enzymes at 37°C for 30 minutes (Fermentas FastDigest Enzymes), purified by spin column (Qiagen PCR Purification Kit), and ligated into pUC18 (T4 DNA Ligase, Fermentas). The pUC18 had previously been digested with HindIII and EcoRI restriction enzymes at 37°C for 30 minutes (Fermentas FastDigest Enzymes) and dephosphorylated with shrimp alkaline phosphatase (FastAP, Thermo Scientific). The ligation products were cleaned by spin column (Qiagen PCR Purification Kit) and transformed into JM107∆nanA:pACYC18(siaB) cells by electroporation (1 mm cuvette, 1.6 kV, 200 Ω, 25 µF), yielding time constants between 4.5 and 5.0 ms. Cells were immediately recovered in 1 mL of SOC media for 1 hour (37°C). Small aliquots of the recovery mixture were plated to determine mutation frequencies. Approximately 20-30 colonies from each mutagenic library were picked and submitted for sequencing to determine the frequency of mutations resulting from mutagenesis. The second round of mutagenesis and library construction was identical to the first with the following differences: PCR was performed with Mutazyme II using the GeneMorph Random Mutagenesis Kit (Agilent Technologies).
PCR reaction mixtures (50 µL) contained 42 µL of water, 5 µL of 10X Mutazyme II reaction buffer, 1 µL of 40 mM dNTP mix (200 µM each final), 1 µL of primer mix (250 ng/µL of each HindIII/EcoRI PmST1 primer), 1 µL of Mutazyme II DNA polymerase (2.5 U/µL), and 1 µL template DNA (50 ng each or 200 ng each of pUC18 vectors containing top PmST1 variants determined in the first round of directed evolution). PCR products were processed as before prior to determination of the resulting mutation frequencies. The day prior to FACS sorting, the ligation products of each mutagenic PmST1 library were transformed into JM107∆nanA:pACYC18(siaB) cells in triplicate and recovered in 1 mL SOC media (1 hour at 37°C); the recovery media was pooled for each library, and aliquots were plated and grown overnight to determine library titres. The remaining recovery media was transferred to 20 mL of LB culture (50 µg/mL carbenicillin) and grown overnight (37°C with shaking at 250 RPM). Small aliquots (1 mL) of the resulting culture were treated with DMSO (1/10 of volume) and stored at -80°C for future use. A 200 µL aliquot of each library was used to inoculate 10 mL of M9 minimal media (1/50 dilution) containing glucose (0.4%). These cultures were grown at 37°C to OD600 = 0.8, induced with IPTG (0.75 mM), then left to express overnight at 20°C with shaking (250 RPM). Samples were submitted to FACS sorting the next day.

7.4.3 FACS Sorting

Cells from the induced cultures (1 mL) were spun down and re-suspended in 50 µL of M9 media supplemented with 5 mM Neu5Ac and 0.3 mM each of lactose-C2-BODIPY and lactose-MU. Following 30 minutes of incubation, cells were spun down and excess acceptor sugars and Neu5Ac were removed by aspiration. The cells were re-suspended in LB media and incubated at 37°C for 10 minutes, centrifuged, and washed three times with PBS before resuspension in 3 mL of PBS. The cells were visually analysed for fluorescence and taken to the FACS for further analysis and sorting.
Cells were diluted in PBS and run in either the FACS Aria (Becton-Dickinson) or Influx (Becton-Dickinson) flow cytometer using PBS as sheath fluid. The threshold for event detection was set to forward and side scattering (FSC and SSC) in order to select healthy, viable, single cells within droplets. The average sort rate was ~5000 events per second, using a 70 µm nozzle, excitation with argon-ion (488 nm) and 405 nm lasers, and measurement of emissions passing the 530 ± 20 nm (FITC) band-pass filter for the BODIPY emission and the 450 nm (violet 1) filter for the coumarin (MU) emission. Cells were sorted into Eppendorf tubes containing 200 µL LB medium. Pools of sorted positives were plated on LB agar plates supplemented with ampicillin (100 µg/mL), chloramphenicol (20 µg/mL), and IPTG (0.1 mM). Colonies were allowed to grow overnight at 37°C. Following growth, cells were removed from the agar plates (5 mL M9 media), and an aliquot (500 µL) was washed twice in M9 media (1 mL) and assayed (as above) prior to the next round of FACS sorting. Libraries were pooled prior to the third round of FACS sorting. FACS data were processed using the FlowJo software (Tree Star).

7.4.4 Activity Verification (in vitro)

Following FACS sorting in both rounds of directed evolution, pools of sorted positives were plated on LB agar plates supplemented with ampicillin (100 µg/mL), chloramphenicol (20 µg/mL), and IPTG (0.1 mM) and grown overnight (37°C). The following day, 94 colonies were randomly picked using the QPix II XT (Genetix) colony-picking robot and transferred to a 96-well plate containing 100 µL of LB media supplemented with carbenicillin (100 µg/mL) and glycerol (10% v/v). This plate was incubated overnight (37°C) and stored at -80°C. All the remaining cells were washed off the agar plate as described above.
Following the first round of directed evolution, a 1 mL aliquot of cells was spun down (5000 RPM, 5 minutes) and decanted, then re-suspended in 1X Bugbuster Protein Extraction Solution (50 µL) and incubated for 20 minutes (room temperature). The resulting lysate was spun down (13200 RPM, 5 minutes at 4°C) to remove cell debris. The lysate was then tested for sialyltransferase activity in the following assay mixture: 20 µL buffer (50 mM HEPES, pH 7.5), 1 µL lactose-C2-BODIPY (5 mM), 1 µL lactose-MU (5 mM), 3 µL lysate, 5 µL CMP-Neu5Ac (50 mM). Small aliquots (0.5 µL) of the reaction (incubated at 37°C) were removed at various time points (10, 30, and 60 minutes). Samples were spotted (0.5 µL) on TLC Silica gel 60 F254 plates (Merck) and separated with a solvent mixture of ethyl acetate, methanol, water, and acetic acid (6:2:1:0.1). Following separation, plates were dried and viewed under UV (365 nm) using an AlphaImager instrument. Under UV, reaction components containing the fluorophore BODIPY or MU were easily detectable.

7.4.5 Mutant Analysis (in vivo)

From the frozen stock (94 colonies) of sorted cells following the first round of directed evolution (three rounds of FACS sorting), 32 colonies were used to inoculate (in duplicate) a 96-well plate with 200 µL of LB media in each well supplemented with chloramphenicol (20 µg/mL), carbenicillin (50 µg/mL), and IPTG (0.2 mM). Following overnight growth at 37°C, cultures (one of each duplicate) were transferred to 1.5 mL microcentrifuge tubes. Cells were spun down (4000 RPM, 5 minutes), decanted, and washed once with M9 minimal media (1 mL). Cells were spun down again, decanted, and re-suspended in assay mixture (50 µL) containing buffer (50 mM HEPES, pH 7.5), Neu5Ac (5 mM), and lactose-BODIPY (0.5 mM). Samples were incubated for 45 minutes at 37°C, washed 3 times with PBS (1 mL), and finally re-suspended in 100 µL of PBS for analysis.
All samples (in centrifuge tubes) were first viewed under UV (365 nm) using the SafeImager. The OD600 of each sample was then determined by plate reader (Opticon), and each sample diluted with PBS to normalize cell densities. The fluorescence of each sample (485/530 nm) was then quantified by plate reader (Opticon). Following the second round of directed evolution, all 24 picked colonies were tested in the same way as those following the first round.

7.4.6 Cloning and Protein Purification

PmST1 wt and top PmST1 variants were sub-cloned into pET29a (C-terminal 6xHIS tag) and purified from BL21(DE3) cells. Briefly, 1 L of 2xYT media was inoculated with 1 mL of an overnight culture of BL21(DE3)-pET29a(PmST1*) cells and grown at 37°C with shaking (250 RPM) until an OD600 of 0.8 was reached. The culture was induced (0.5 mM IPTG) overnight at 16°C with shaking (250 RPM). Following induction, the cells were pelleted (5000 RPM, 15 minutes at 4°C) and re-suspended in 20 mL of lysis buffer (50 mM HEPES, 200 mM NaCl, pH 7.5). The cells were lysed by sonication (3 minutes, 30% amplitude, QSonica Sonicator, 1/8” microtip) and spun down (15,000 RPM, 20 minutes), and the lysate was filtered (0.2 µm filter) and applied to a 12 mL gravity column containing 2 mL of Ni2+ resin. The column was washed with 20 volumes of wash buffer (50 mM HEPES, 200 mM NaCl, 10 mM imidazole, pH 7.5). Bound protein was eluted with 5 volumes of elution buffer (50 mM HEPES, 200 mM NaCl, 250 mM imidazole, pH 7.5), then buffer-exchanged with storage buffer (50 mM HEPES, 200 mM NaCl, pH 7.0) using an Amicon Ultra 15 filter (30 kDa) and stored at 4°C prior to use. Resulting protein purity was assessed by SDS-PAGE and concentrations determined by A280.

7.4.7 HPAE-PAD Optimization

All standards and reaction samples were separated and analyzed on a Dionex HPAE-PAD instrument.
Separation was obtained on a CarboPAC PA200 (150 mm) column with guard column, and detection was achieved using a disposable gold on polytetrafluoroethylene (PTFE) electrode and a four-potential waveform. The separation conditions were as follows: 100 mM sodium hydroxide and a sodium acetate gradient from 70 to 300 mM over the first 10 minutes of the separation. The eluent was held at the final gradient conditions for 1 minute and then returned to the starting conditions over the next minute. The flow rate was 1.0 mL/min and an injection was made every 27 minutes. The column and amperometry cell were housed in a chromatography oven set at room temperature and the injection volume was 20 µL. Standard curves relating peak area to concentration were obtained for Neu5Ac, lactose, and α-2,3-Neu5Ac-lactose. Each standard concentration point was determined in duplicate.

7.4.8 Kinetics

Michaelis-Menten kinetic parameters were determined using a stopped assay with HPAE-PAD. For sialyltransfer reactions, assays (300 µL) were set up with the following components: 230 µL of buffer (5 mM HEPES, pH 7.5), 30 µL CMP-Neu5Ac (50 mM), 10 µL of enzyme (varying concentrations), and 30 µL lactose (varying concentrations). Buffer and lactose were pre-warmed to 37°C prior to addition of CMP-Neu5Ac and enzyme. At various time points, a 30 µL aliquot of the reaction was transferred to a 2 mL clear glass vial (MS certified with 9 mm screw top and PTFE/silicone pre-slit septa, Shimadzu) containing 270 µL of water (18 MΩ) and immediately flash-frozen in a dry-ice/EtOH bath. Following freezing, samples were stored at -20°C. For sialidase reactions, assays (300 µL) were set up with the following components: 230 µL of buffer (5 mM HEPES, pH 7.5), 30 µL CMP (3 mM), 30 µL α-2,3-Neu5Ac-lactose (varying concentrations), and 10 µL of enzyme (varying concentrations). As for the sialyltransfer reactions, aliquots were removed, diluted, and flash-frozen prior to analysis by HPAE-PAD.
Samples were submitted to HPAE-PAD analysis and kept at 4°C to minimize any continuation of reactions. Chromatograms were analyzed using the Chromeleon software (Dionex), and peak areas converted to concentrations using standard curves. These values were then plotted in Excel and reaction rates determined. Initial reaction rates were plotted in the Grafit 7 software and the Michaelis-Menten kinetic parameters KM and kcat determined for sialyltransfer and sialidase reactions. Sialyltransfer reaction rates were determined as the increase in α-2,3-Neu5Ac-lactose concentration over time. Sialidase rates were determined as the increase in free Neu5Ac concentration over time. Measures of synthetic competency were determined from these values.

7.5 Mechanisms of the Sialidase and Trans-Sialidase Activities of Bacterial Glycosyltransferases from the Family GT80

7.5.1 Materials

Plasmids containing the genes PmST1205 and Psp2,6ST were generously provided by the lab of Dr. Warren Wakarchuk. PmST1 was sub-cloned into pET29a while Psp2,6ST was maintained on pCW as a fusion with maltose-binding protein. The plasmid pTXB1 containing the Pd2,6ST gene was generously provided by the lab of Dr. Chun-Cheng Lin. CstII, on the plasmid pET28a, was available within our lab from previous work.221 α-2,3-Sialyllactose was purchased from CarboSynth. α-2,6-Sialyllactose was synthesized for this project. α-2,3- and α-2,6-Neu5Ac-Lactose-C2-BODIPY for use as standards for TLC analysis were a gift from Dr. Hongming Chen. The compounds CMP, CMP-Neu5Ac, AMP, cytidine, sodium phosphate (monobasic), and pyrophosphate were purchased from Sigma.
166  7.5.2 Protein Sequences PmST1 (tPm0188Ph) – 397 amino acids (46 kDa) N-terminal 2-25 AA truncation of protein homologue of Pm0188 (D105N, R135Q, G295E) with C-terminal His6-tag205 MSKTITLYLDPASLPALNQLMDFTQNNEDKTHPRIFGLSRFKIPDNIITQYQNIHFVELKDNRPTEALFTILDQYPGNIELNIHLNIAHSVQLIRPILAYRFKHLDRVSIQQLNLYDDGSMEYVDLEKEENKDISAEIKQAEKQLSHYLLTGKIKFDNPTIARYVWQSAFPVKYHFLSTDYFEKAEFLQPLKEYLAENYQKMDWTAYQQLTPEQQAFYLTLVGFNDEVKQSLEVQQAKFIFTGTTTWEGNTDVREYYAQQQLNLLNHFTQAEGDLFIGDHYKIYFKGHPRGGEINDYILNNAKNITNIPANISFEVLMMTGLLPDKVGGVASSLYFSLPKEKISHIIFTSNKQVKSKEDALNNPYVKVMRRLGIIDESQVIFWDSLKQLLEHHHHHH*  Pd2,6ST (∆N15C178) – 488 amino acids (55 kDa) MCNSDNTSLKETVSSNSADVVETETYQLTPIDAPSSFLSHSWEQTCGTPILNESDKQAISFDFVAPELKQDEKYCFTFKGITGDHRYITNTTLTVVAPTLEVYIDHASLPSLQQLIHIIQAKDEYPSNQRFVSWKRVTVDADNANKLNIHTYPLKGNNTSPEMVAAIDEYAQSKNRLNIEFYTNTAHVFNNLPPIIQPLYNNEKVKISHISLYDDGSSEYVSLYQWKDTPNKIETLEGEVSLLANYLAGTSPDAPKGMGNRYNWHKLYDTDYYFLREDYLDVEANLHDLRDYLGSSAKQMPWDEFAKLSDSQQTLFLDIVGFDKEQLQQQYSQSPLPNFIFTGTTTWAGGETKEYYAQQQVNVINNAINETSPYYLGKDYDLFFKGHPAGGVINDIILGSFPDMINIPAKISFEVLMMTDMLPDTVAGIASSLYFTIPADKVNFIVFTSSDTITDREEALKSPLVQVMLTLGIVKEKDVLFWALEGSSC (C-terminal chitin-binding module removed upon intein cleavage)  Psp2,6ST (∆N109)  N-terminal Mal-E (Maltose-binding protein) fusion with thrombin cleavage/linker to N-terminal truncated (2-109 AA) Pd2,6ST gene – 808 amino acids (102 kDa) MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTWDAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEPYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAEAAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKELAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIPQMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITKLVPRGSMKNFLLLTLILLTACNNSEENTQSIIKNDINKTIIDEEYVNLEPINQSNISFTKHSWVQTCGTQQLLTEQ 
NKESISLSVVAPRLDDDEKYCFDFNGVSNKGEKYITKVTLNVVAPSLEVYVDHASLPTLQQLMDIIKSEEENPTAQRYIAWGRIVPTDEQMKELNITSFALINNHTPADLVQEIVKQAQTKHRLNVKLSSNTAHSFDNLVPILKELNSFNNVTVTNIDLYDDGSAEYVNLYNWRDTLNKTDNLKIGKDYLEDVINGINEDTSNTGTSSVYNWQKLYPANYHFLRKDYLTLEPSLHELRDYIGDSLKQMQWDGFKKFNSKQQELFLSIVNFDKQKLQNEYNSSNLPNFVFTGTTVWAGNHEREYYAKQQINVINNAINESSPHYLGNSYDLFFKGHPGGGIINTLIMQNYPSMVDIPSKISFEVLMMTDMLPDAVAGIASSLYFTIPAEKIKFIVFTSTETITDRETALRSPLVQVMIKLGIVKEENVLFWADLPNCETGVCIAV*

7.5.3 Cloning and Protein Purification

The truncated (∆N2-109) α-2,6-sialyltransferase from Photobacterium sp. JT-ISH-224 (Genbank BAF92026) was cloned into the vector pCW as a fusion with MalE (maltose-binding protein). Briefly, 1 L of 2xYT media was inoculated with 1 mL of an overnight culture of BL21(DE3)-pCW(MalE-Psp2,6ST) cells and grown at 37°C with shaking (250 RPM) until an OD600 of 0.8 was reached. The culture was induced (0.5 mM IPTG) overnight at 16°C with shaking (250 RPM). Following induction, the cells were pelleted (5000 RPM, 15 minutes at 4°C) and re-suspended in 20 mL of lysis buffer (50 mM HEPES, 200 mM NaCl, pH 7.5, 1 mM EDTA, 5 mM β-mercaptoethanol). The cells were lysed by sonication (3 minutes, 30% amplitude, QSonica Sonicator, 1/8” microtip) and spun down (15,000 RPM, 20 minutes), and the lysate was filtered (0.2 µm filter) and applied to a 12 mL gravity column containing 2 mL of amylose resin. The column was washed with 20 volumes of wash buffer (50 mM HEPES, 200 mM NaCl, pH 7.5). Bound protein was eluted with 5 volumes of elution buffer (50 mM HEPES, 200 mM NaCl, 10 mM maltose, pH 7.5), then buffer-exchanged with storage buffer (50 mM MES, 200 mM NaCl, pH 6.5) using an Amicon Ultra 15 filter (30 kDa) and stored at 4°C prior to use.

The α-2,6-sialyltransferase from Pasteurella damselae was cloned into pTXB1 (chitin-binding domain fusion with intein linker) and purified in the same way as Psp2,6ST with the following differences.
An alternate lysis buffer was used (50 mM HEPES, pH 8.0, 500 mM NaCl, 0.1% Triton X-100, 0.1 mM EDTA). Column purification was performed using a 50 mL gravity column containing 6 mL of chitin resin. The column was kept at 4°C for all steps. The column was washed with 20 volumes of wash buffer (50 mM HEPES, pH 8.0, 500 mM NaCl). The column was treated with 1 volume of elution buffer (50 mM HEPES, 500 mM NaCl, pH 8.0, 50 mM DTT) and the effluent reloaded. The column was closed and left to incubate at 4°C overnight (16 hours). The purified protein was eluted using 5 volumes of wash buffer. The effluent was buffer-exchanged with storage buffer (50 mM MES, 200 mM NaCl, pH 6.5) using an Amicon Ultra 15 filter (10 kDa) and stored at 4°C prior to use.

Most commonly referred to as PmST1, the truncated (∆2-25) homologue of pm0188, the α-2,3-sialyltransferase from Pasteurella multocida (tPm0188Ph), was sub-cloned into pET29a (C-terminal 6xHIS tag) and purified from BL21(DE3) cells under the same conditions as Psp2,6ST with the following differences. Following expression and lysis, purification was performed on a 12 mL gravity column with 2 mL of Ni2+ resin.

CstII (C-terminal 6xHIS tag) was purified under identical conditions to PmST1.

7.5.4 Effect of CMP on Sialidase Rate of PmST1

Sialidase rates of PmST1 were determined in 96-well plates by continuous UV-Vis measurement of pNP release (405 nm) from α-2,3-Sia-Gal-pNP by PmST1 and β-galactosidase (E. coli) in excess. Reactions (150 µL) were carried out and measured (405 nm) at 37°C for 2 hours using a Synergy H1 plate reader. Specifically, each reaction contained PmST1 (1 µM), α-2,3-Sia-Gal-pNP (300 µM), and β-galactosidase (30 units) in HEPES buffer (50 mM, pH 7.5). Reactions both without and with the addition of CMP (10 µM, 100 µM, 1000 µM) were performed for rate comparison. Initial rates (V0) were calculated in the Gen5 software from the resulting reaction curves.
The same reactions were carried out without CMP, but alternatively with either PO4 (1 mM), cytidine (1 mM), PO4 and cytidine (1 mM each), AMP (1 mM), or pyrophosphate (1 mM), and the sialidase rates determined from the resulting reaction curves.

Under similar conditions (150 µL reactions in 96-well plate), PmST1 (350 nM) was reacted with α-2,3-Sia-Gal-pNP (100 µM) and varying amounts of CMP (1 µM, 2.5 µM, 5 µM, 10 µM, 25 µM, 50 µM) to determine the KM of PmST1 for CMP in the sialidase reaction. Initial rates were calculated as previously stated and fitted with Grafit to determine kinetic values.

7.5.5 Donor Hydrolysis Kinetics

To measure a representative donor hydrolysis rate of PmST1, that is, the rate of hydrolysis of CMP-Neu5Ac by PmST1, a reaction was set up (200 µL) containing PmST1 (20 µg) and CMP-Neu5Ac (5 mM) in HEPES buffer (10 mM, pH 7.5) and run at 37°C. Aliquots (40 µL) were removed at 10 minutes and 20 minutes, diluted 10X in water, and immediately flash-frozen in a dry ice/EtOH bath. Samples were kept frozen until analysed on a Dionex HPAE-PAD instrument (4°C) to determine the amount of free Neu5Ac released. Analysis was performed using a CarboPac PA200 (3x250 mm) analytical column. Samples were eluted with a solution of 50 mM sodium acetate/100 mM NaOH at a flow rate of 0.5 mL/min. Rates were calculated from the amount of Neu5Ac at these time points following subtraction of the rate of spontaneous hydrolysis of CMP-Neu5Ac. To measure corresponding sialidase rates of PmST1, another reaction (200 µL) was set up containing PmST1 (20 µg), CMP (1 mM), and α-2,3-sialyllactose (20 mM) in HEPES buffer (10 mM, pH 7.5) and run at 37°C. Aliquots (40 µL) were removed, diluted, and flash-frozen (as before) prior to analysis by HPAE-PAD. Sialidase rates were calculated from free Neu5Ac concentrations at the indicated time points (as with donor hydrolysis rate determination).
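The rate calculation described for donor hydrolysis can be sketched as follows. This is an illustration only: it assumes that rates were taken as the slope of free Neu5Ac against time, that spontaneous hydrolysis was measured in a parallel no-enzyme control sampled at the same time points, and that measured concentrations were multiplied by the 10X dilution factor. The function names and all numerical values are hypothetical.

```python
def slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den

def enzymatic_rate(times_min, neu5ac_enzyme, neu5ac_control, dilution=10.0):
    """Enzyme-dependent Neu5Ac release rate (concentration units per minute).

    Inputs are Neu5Ac concentrations measured in the diluted samples; the
    no-enzyme control slope (spontaneous CMP-Neu5Ac hydrolysis) is subtracted.
    """
    return dilution * (slope(times_min, neu5ac_enzyme) - slope(times_min, neu5ac_control))

# Hypothetical Neu5Ac readings (µM in the diluted samples) at 10 and 20 minutes.
times = [10, 20]
rate = enzymatic_rate(times, [12.0, 22.0], [1.0, 1.5])   # µM per minute in the reaction
```

With only two time points, the slope is simply the difference in concentration divided by the difference in time, so the result is sensitive to error in either measurement.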
7.5.6 Coupled Enzyme Assays and CMP Removal

In order to indirectly determine CMP-Neu5Ac formation by PmST1, a set of reactions (20 µL) including various combinations of PmST1 (90 nM), α-2,3-Neu5Ac-Lactose (20 mM), CMP (3 mM), Lactose-C2-BODIPY (Lac*; 500 µM), and Psp2,6ST in excess (1 µM) of PmST1 were set up at 37°C. Reactions set up included controls with only PmST1 or Pd2,6ST, α-2,3-Neu5Ac-Lactose, and Lac* with or without the addition of CMP. The coupled reactions with both enzymes (with and without CMP) were also performed. For some reactions, a 10X dilution of the protein stock solution (20 µL) was treated with 10 units of alkaline phosphatase (bovine, Sigma) for 30 minutes at 37°C prior to final dilution in reaction mixtures. When enzyme stocks were treated with phosphatase, untreated samples of enzyme were heat treated (30 minutes at 37°C) in parallel as controls to test the effect of incubation. Aliquots (2 µL) of these reactions were removed at various time points and immediately frozen at -80°C prior to analysis by TLC. Samples were spotted (0.5 µL) on TLC Silica gel 60 F254 plates (Merck) and separated with a solvent mixture of ethyl acetate, methanol, water, and acetic acid (6:2:1:0.1). Following separation, plates were dried and viewed under UV (365 nm) using an AlphaImager instrument. Under UV, only reaction components containing the fluorophore BODIPY were easily detectable. AlphaImager 1-D Analysis software was used to determine the spot density of product spots for comparisons between reactions. α-2,3/2,6-Sia-Lac* standards were included on each plate to verify product spot identification. Representative sets of densitometry data points at the points of maximum spot densities corresponding to reaction products of interest in the various control and coupled enzyme reactions were compiled and used for the summary data in Figure 5.7 and Figure 5.14.
Reactions with either Pd2,6ST or Psp2,6ST were carried out as above with the following differences. Concentrations of 70 nM and 100 nM were used for Pd2,6ST and Psp2,6ST, respectively. The second enzyme utilized for coupled assays was CstII (30 µM). The sialyl donor used was α-2,6-Neu5Ac-Lactose (5 mM).

7.5.7 Synthesis of α-2,6-Sialyllactose

α-2,6-Sialyllactose (Neu5Acα2,6Lac) was synthesized as follows. Preparative-scale synthesis in a 600 µL reaction was carried out at 37°C using Psp2,6ST (10 µg) at pH 6.0 in MES buffer (20 mM) containing lactose (15 µmol) and CMP-Neu5Ac (17 µmol). The reaction was monitored by thin-layer chromatography (TLC) on Silica gel 60 plates (EMD) at various time intervals. Samples spotted on TLC plates were developed using a mixture of butanol:acetic acid:water (3:1:1). After 2 hours, another 10 µg of enzyme was added. After 4 hours, the reaction had plateaued, so the mixture was prepared for purification. Protein was removed by centrifugal ultrafiltration using an Amicon unit (10 kDa cutoff) and the filtrate was collected (700 µL). The product was first purified by size-exclusion chromatography (P2 column, 120 mL, 0.3 mL/min) with water as the eluent. Fractions containing the product were pooled and freeze-dried. Next, the sample was re-suspended in acetonitrile (200 µL) and purified by HPLC (Zorbax SAX column, 1 mL/min) to remove any residual lactose and CMP. For HPLC, an elution buffer of 40 mM NH4OAc (pH 5.75) with 5% acetonitrile completely separated CMP and CMP-Neu5Ac from the desired product. The final yield of product was 7.4 mg, and NMR (Appendix C) and ESI-Mass Spec data for the obtained product were in agreement with those reported previously.204

Bibliography

1. Sinnott ML. Carbohydrate Chemistry and Biochemistry. Royal Society of Chemistry, editor. Cambridge, UK: RSC Publishing; 2007. 2. Ferrier R. Carbohydrate Chemistry. Cambridge, UK: RSC Publishing; 2000. 3. Stick RV. Carbohydrates: The Sweet Molecules of Life. 
London, UK: Academic Press; 2001. 4. Daniels G, Reid ME. Blood groups: the past 50 years. Transfusion 2010;50(2):281-289. 5. Coutinho PM, Deleury E, Davies GJ, Henrissat B. An evolving hierarchical family classification for glycosyltransferases. Journal of Molecular Biology 2003;328(2):307-317. 6. Henrissat B, Sulzenbacher G, Bourne Y. Glycosyltransferases, glycoside hydrolases: surprise, surprise! Current Opinion in Structural Biology 2008;18(5):527-533. 7. Lombard V, Ramulu HG, Drula E, Coutinho PM, Henrissat B. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Research 2014;42(D1):D490-D495. 8. Koshland DE. Stereochemistry and the Mechanism of Enzymatic Reactions. Biological Reviews of the Cambridge Philosophical Society 1953;28(4):416-436. 9. Lairson LL, Withers SG. Mechanistic analogies amongst carbohydrate modifying enzymes. Chemical Communications 2004(20):2243-2248. 10. Vasella A, Davies GJ, Bohm M. Glycosidase mechanisms. Current Opinion in Chemical Biology 2002;6(5):619-629. 11. Rajan SS, Yang XJ, Collart F, Yip VLY, Withers SG, Varrot A, Thompson J, Davies GJ, Anderson WF. Novel catalytic mechanism of glycoside hydrolysis based on the structure of an NAD+/Mn2+-dependent phospho-alpha-glucosidase from Bacillus subtilis. Structure 2004;12(9):1619-1629. 12. Yip VLY, Varrot A, Davies GJ, Rajan SS, Yang XJ, Thompson J, Anderson WF, Withers SG. An unusual mechanism of glycoside hydrolysis involving redox and elimination steps by a family 4 beta-glycosidase from Thermotoga maritima. Journal of the American Chemical Society 2004;126(27):8354-8355. 13. Bras NF, Fernandes PA, Ramos MJ. QM/MM Studies on the beta-Galactosidase Catalytic Mechanism: Hydrolysis and Transglycosylation Reactions. Journal of Chemical Theory and Computation 2010;6(2):421-433. 14. Faijes M, Planas A. In vitro synthesis of artificial polysaccharides by glycosidases and glycosynthases. Carbohydrate Research 2007;342(12-13):1581-1594. 15. Bojarova P, Kren V. 
Glycosidases: a key to tailored carbohydrates. Trends in Biotechnology 2009;27(4):199-209. 16. Shaikh FA, Withers SG. Teaching old enzymes new tricks: engineering and evolution of glycosidases and glycosyl transferases for improved glycoside synthesis. Biochemistry and Cell Biology-Biochimie Et Biologie Cellulaire 2008;86(2):169-177. 17. Hancock SM, Vaughan MD, Withers SG. Engineering of glycosidases and glycosyltransferases. Current Opinion in Chemical Biology 2006;10(5):509-519. 18. Mackenzie LF, Wang QP, Warren RAJ, Withers SG. Glycosynthases: Mutant glycosidases for oligosaccharide synthesis. Journal of the American Chemical Society 1998;120(22):5583-5584. 19. Filice M, Marciello M. Enzymatic Synthesis of Oligosaccharides: A Powerful Tool for a Sweet Challenge. Current Organic Chemistry 2013;17(7):701-718. 20. Boltje TJ, Buskas T, Boons GJ. Opportunities and challenges in synthetic oligosaccharide and glycoconjugate research. Nature Chemistry 2009;1(8):611-622. 21. Chang A, Singh S, Phillips GN, Thorson JS. Glycosyltransferase structural biology and its role in the design of catalysts for glycosylation. Current Opinion in Biotechnology 2011;22(6):800-808. 22. Wu ZL, Ethen CM, Prather B, Machacek M, Jiang WP. Universal phosphatase-coupled glycosyltransferase assay. Glycobiology 2011;21(6):727-733. 23. Lairson LL, Henrissat B, Davies GJ, Withers SG. Glycosyltransferases: Structures, functions, and mechanisms. Annual Review of Biochemistry. Volume 77, Annual Review of Biochemistry. Palo Alto: Annual Reviews; 2008. p 521-555. 24. Schloss PD, Handelsman J. Biotechnological prospects from metagenomics. Current Opinion in Biotechnology 2003;14(3):303-310. 25. Coughlan LM, Cotter PD, Hill C, Alvarez-Ordonez A. Biotechnological applications of functional metagenomics in the food and pharmaceutical industries. Frontiers in Microbiology 2015;6. 26. Thomas T, Gilbert J, Meyer F. Metagenomics - a guide from sampling to data analysis. 
Microbial informatics and experimentation 2012;2(1):3-3. 27. Pace NR, Stahl DA, Lane DJ, Olsen GJ. The analysis of natural microbial-populations by ribosomal-RNA sequences. Advances in Microbial Ecology 1986;9:1-55. 28. Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms - Proposal for the domains of Archaea, Bacteria, and Eukarya. Proceedings of the National Academy of Sciences of the United States of America 1990;87(12):4576-4579. 29. Schmidt TM, Delong EF, Pace NR. Analysis of a marine picoplankton community by 16S ribosomal-RNA gene cloning and sequencing. Journal of Bacteriology 1991;173(14):4371-4378. 30. Beja O, Suzuki MT, Heidelberg JF, Nelson WC, Preston CM, Hamada T, Eisen JA, Fraser CM, DeLong EF. Unsuspected diversity among marine aerobic anoxygenic phototrophs. Nature 2002;415(6872):630-633. 31. Hugenholtz P. Exploring prokaryotic diversity in the genomic era. Genome Biology 2002;3(2). 32. Rondon MR, August PR, Bettermann AD, Brady SF, Grossman TH, Liles MR, Loiacono KA, Lynch BA, MacNeil IA, Minor C and others. Cloning the soil metagenome: a strategy for accessing the genetic and functional diversity of uncultured microorganisms. Applied and Environmental Microbiology 2000;66(6):2541-2547. 33. Barns SM, Takala SL, Kuske CR. Wide distribution and diversity of members of the bacterial kingdom Acidobacterium in the environment. Applied and Environmental Microbiology 1999;65(4):1731-1737. 34. Rondon MR, Goodman RM, Handelsman J. The Earth's bounty: assessing and accessing soil microbial diversity. Trends in Biotechnology 1999;17(10):403-409. 35. Sandler SJ, Hugenholtz P, Schleper C, DeLong EF, Pace NR, Clark AJ. Diversity of radA genes from cultured and uncultured Archaea: Comparative analysis of putative RadA proteins and their use as a phylogenetic marker. Journal of Bacteriology 1999;181(3):907-915. 36. Head IM, Saunders JR, Pickup RW. 
Microbial evolution, diversity, and ecology: A decade of ribosomal RNA analysis of uncultivated microorganisms. Microbial Ecology 1998;35(1):1-21. 37. Hugenholtz P, Goebel BM, Pace NR. Impact of culture-independent studies on the emerging phylogenetic view of bacterial diversity. Journal of Bacteriology 1998;180(18):4765-4774. 38. Pace NR. A molecular view of microbial diversity and the biosphere. Science 1997;276(5313):734-740. 39. Torsvik V, Goksoyr J, Daae FL. High diversity of DNA in soil bacteria. Applied and Environmental Microbiology 1990;56(3):782-787. 40. Venter JC, Remington K, Heidelberg JF, Halpern AL, Rusch D, Eisen JA, Wu DY, Paulsen I, Nelson KE, Nelson W and others. Environmental genome shotgun sequencing of the Sargasso Sea. Science 2004;304(5667):66-74. 41. Xia LC, Cram JA, Chen T, Fuhrman JA, Sun FZ. Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads. Plos One 2011;6(12). 42. Breitbart M, Hewson I, Felts B, Mahaffy JM, Nulton J, Salamon P, Rohwer F. Metagenomic analyses of an uncultured viral community from human feces. Journal of Bacteriology 2003;185(20):6220-6223. 43. D'Argenio V, Casaburi G, Precone V, Salvatore F. Comparative Metagenomic Analysis of Human Gut Microbiome Composition Using Two Different Bioinformatic Pipelines. Biomed Research International 2014. 44. Li JH, Jia HJ, Cai XH, Zhong HZ, Feng Q, Sunagawa S, Arumugam M, Kultima JR, Prifti E, Nielsen T and others. An integrated catalog of reference genes in the human gut microbiome. Nature Biotechnology 2014;32(8):834-841. 45. Lan YM, Kriete A, Rosen GL. Selecting age-related functional characteristics in the human gut microbiome. Microbiome 2013;1. 46. Rampelli S, Candela M, Turroni S, Biagi E, Collino S, Franceschi C, O'Toole PW, Brigidi P. Functional metagenomic profiling of intestinal microbiome in extreme ageing. Aging-Us 2013;5(12):902-912. 47. Bapteste E, Bicep C, Lopez P. 
Evolution of genetic diversity using networks: the human gut microbiome as a case study. Clinical Microbiology and Infection 2012;18:40-43. 48. Arumugam M, Raes J, Pelletier E, Le Paslier D, Yamada T, Mende DR, Fernandes GR, Tap J, Bruls T, Batto JM and others. Enterotypes of the human gut microbiome. Nature 2011;473(7346):174-180. 49. Ellrott K, Jaroszewski L, Li WZ, Wooley JC, Godzik A. Expansion of the Protein Repertoire in Newly Explored Environments: Human Gut Microbiome Specific Protein Families. Plos Computational Biology 2010;6(6). 50. Peterson J, Garges S, Giovanni M, McInnes P, Wang L, Schloss JA, Bonazzi V, McEwen JE, Wetterstrand KA, Deal C and others. The NIH Human Microbiome Project. Genome Research 2009;19(12):2317-2323. 51. Sait M, Hugenholtz P, Janssen PH. Cultivation of globally distributed soil bacteria from phylogenetic lineages previously only detected in cultivation-independent surveys. Environmental Microbiology 2002;4(11):654-666. 52. Bradford MA, Davies CA, Frey SD, Maddox TR, Melillo JM, Mohan JE, Reynolds JF, Treseder KK, Wallenstein MD. Thermal adaptation of soil microbial respiration to elevated temperature. Ecology Letters 2008;11(12):1316-1327. 53. Xiong J, Liu Y, Lin X, Zhang H, Zeng J, Hou J, Yang Y, Yao T, Knight R, Chu H. Geographic distance and pH drive bacterial distribution in alkaline lake sediments across Tibetan Plateau. Environmental Microbiology 2012;14(9):2457-2466. 54. Garcia-Moyano A, Gonzalez-Toril E, Aguilera A, Amils R. Comparative microbial ecology study of the sediments and the water column of the Rio Tinto, an extreme acidic environment. Fems Microbiology Ecology 2012;81(2):303-314. 55. Johnson DB. Geomicrobiology of extremely acidic subsurface environments. Fems Microbiology Ecology 2012;81(1):2-12. 56. Radajewski S, Webster G, Reay DS, Morris SA, Ineson P, Nedwell DB, Prosser JI, Murrell JC. Identification of active methylotroph populations in an acidic forest soil by stable-isotope probing. 
Microbiology-Sgm 2002;148:2331-2342. 57. Stevens H, Ulloa O. Bacterial diversity in the oxygen minimum zone of the eastern tropical South Pacific. Environmental Microbiology 2008;10(5):1244-1259. 58. Bryant JA, Stewart FJ, Eppley JM, DeLong EF. Microbial community phylogenetic and trait diversity declines with depth in a marine oxygen minimum zone. Ecology 2012;93(7):1659-1673. 59. Tang K, Liu KS, Jiao NZ, Zhang Y, Chen CTA. Functional Metagenomic Investigations of Microbial Communities in a Shallow-Sea Hydrothermal System. Plos One 2013;8(8). 60. Benson CA, Bizzoco RW, Lipson DA, Kelley ST. Microbial diversity in nonsulfur, sulfur and iron geothermal steam vents. Fems Microbiology Ecology 2011;76(1):74-88. 61. Golebiewski M, Deja-Sikora E, Cichosz M, Tretyn A, Wrobel B. 16S rDNA Pyrosequencing Analysis of Bacterial Community in Heavy Metals Polluted Soils. Microbial Ecology 2014;67(3):635-647. 62. Chodak M, Golebiewski M, Morawska-Ploskonka J, Kuduk K, Niklinska M. Diversity of microorganisms from forest soils differently polluted with heavy metals. Applied Soil Ecology 2013;64:7-14. 63. Uchiyama T, Miyazaki K. Functional metagenomics for enzyme discovery: challenges to efficient screening. Current Opinion in Biotechnology 2009;20(6):616-622. 64. Nacke H, Will C, Herzog S, Nowka B, Engelhaupt M, Daniel R. Identification of novel lipolytic genes and gene families by screening of metagenomic libraries derived from soil samples of the German Biodiversity Exploratories. Fems Microbiology Ecology 2011;78(1):188-201. 65. Bunterngsook B, Kanokratana P, Thongaram T, Tanapongpipat S, Uengwetwanit T, Rachdawong S, Vichitsoonthonkul T, Eurwilaichitr L. Identification and Characterization of Lipolytic Enzymes from a Peat-Swamp Forest Soil Metagenome. Bioscience Biotechnology and Biochemistry 2010;74(9):1848-1854. 66. Hu YF, Fu CZ, Huang YP, Yin YS, Cheng G, Lei F, Lu N, Li J, Ashforth EJ, Zhang LX and others. 
Novel lipolytic genes from the microbial metagenomic library of the South China Sea marine sediment. Fems Microbiology Ecology 2010;72(2):228-237. 67. Jiang XW, Xu XW, Huo YY, Wu YH, Zhu XF, Zhang XQ, Wu M. Identification and characterization of novel esterases from a deep-sea sediment metagenome. Archives of Microbiology 2012;194(3):207-214. 68. Jimenez DJ, Montana JS, Alvarez D, Baena S. A novel cold active esterase derived from Colombian high Andean forest soil metagenome. World Journal of Microbiology & Biotechnology 2012;28(1):361-370. 69. Lee MH, Hong KS, Malhotra S, Park JH, Hwang EC, Choi HK, Kim YS, Tao WX, Lee SW. A new esterase EstD2 isolated from plant rhizosphere soil metagenome. Applied Microbiology and Biotechnology 2010;88(5):1125-1134. 70. Pushpam PL, Rajesh T, Gunasekaran P. Identification and characterization of alkaline serine protease from goat skin surface metagenome. Amb Express 2011;1. 71. Neveu J, Regeard C, Dubow MS. Isolation and characterization of two serine proteases from metagenomic libraries of the Gobi and Death Valley deserts. Applied Microbiology and Biotechnology 2011;91(3):635-644. 72. Ye M, Li G, Liang WQ, Liu YH. Molecular cloning and characterization of a novel metagenome-derived multicopper oxidase with alkaline laccase activity and highly soluble expression. Applied Microbiology and Biotechnology 2010;87(3):1023-1031. 73. Beloqui A, Pita M, Polaina J, Martinez-Arias A, Golyshina OV, Zumarraga M, Yakimov MM, Garcia-Arellano H, Alcalde M, Fernandez VM and others. Novel polyphenol oxidase mined from a metagenome expression library of bovine rumen - Biochemical properties, structural analysis, and phylogenetic relationships. Journal of Biological Chemistry 2006;281(32):22933-22942. 74. Voget S, Leggewie C, Uesbeck A, Raasch C, Jaeger KE, Streit WR. Prospecting for novel biocatalysts in a soil metagenome. Applied and Environmental Microbiology 2003;69(10):6235-6242. 75. Gabor EM, de Vries EJ, Janssen DB. 
Construction, characterization, and use of small-insert gene banks of DNA isolated from soil and enrichment cultures for the recovery of novel amidases. Environmental Microbiology 2004;6(9):948-958. 76. Knietsch A, Waschkowitz T, Bowien S, Henne A, Daniel R. Construction and screening of metagenomic libraries derived from enrichment cultures: Generation of a gene bank for genes conferring alcohol oxidoreductase activity on Escherichia coli. Applied and Environmental Microbiology 2003;69(3):1408-1416. 77. Brady SF, Clardy J. Palmitoylputrescine, an antibiotic isolated from the heterologous expression of DNA extracted from bromeliad tank water. Journal of Natural Products 2004;67(8):1283-1286. 78. Lim HK, Chung EJ, Kim JC, Choi GJ, Jang KS, Chung YR, Cho KY, Lee SW. Characterization of a forest soil metagenome clone that confers indirubin and indigo production on Escherichia coli. Applied and Environmental Microbiology 2005;71(12):7768-7777. 79. Gillespie DE, Brady SF, Bettermann AD, Cianciotto NP, Liles MR, Rondon MR, Clardy J, Goodman RM, Handelsman J. Isolation of antibiotics turbomycin A and B from a metagenomic library of soil microbial DNA. Applied and Environmental Microbiology 2002;68(9):4301-4306. 80. Moser MJ, DiFrancesco RA, Gowda K, Klingele AJ, Sugar DR, Stocki S, Mead DA, Schoenfeld TW. Thermostable DNA Polymerase from a Viral Metagenome Is a Potent RT-PCR Enzyme. Plos One 2012;7(6). 81. Wang XY, Xu F, Chen SF. Metagenomic cloning and characterization of Na+/H+ antiporter genes taken from sediments in Chaerhan Salt Lake in China. Biotechnology Letters 2013;35(4):619-624. 82. Lee C-M, Lee Y-S, Seo S-H, Yoon S-H, Kim S-J, Hahn B-S, Sim J-S, Koo B-S. Screening and Characterization of a Novel Cellulase Gene from the Gut Microflora of Hermetia illucens Using Metagenomic Library. Journal of Microbiology and Biotechnology 2014;24(9):1196-1206. 83. Nimchua T, Thongaram T, Uengwetwanit T, Pongpattanakitshote S, Eurwilaichitr L. 
Metagenomic Analysis of Novel Lignocellulose-Degrading Enzymes from Higher Termite Guts Inhabiting Microbes. Journal of Microbiology and Biotechnology 2012;22(4):462-469. 84. Liu JA, Liu WD, Zhao XL, Shen WJ, Cao H, Cui ZL. Cloning and functional characterization of a novel endo-beta-1,4-glucanase gene from a soil-derived metagenomic library. Applied Microbiology and Biotechnology 2011;89(4):1083-1092. 85. Mewis K, Taupp M, Hallam SJ. A high throughput screen for biomining cellulase activity from metagenomic libraries. Journal of visualized experiments : JoVE 2011(48). 86. Warnecke F, Luginbuhl P, Ivanova N, Ghassemian M, Richardson TH, Stege JT, Cayouette M, McHardy AC, Djordjevic G, Aboushadi N and others. Metagenomic and functional analysis of hindgut microbiota of a wood-feeding higher termite. Nature 2007;450(7169):560-U17. 87. Tan H, Mooij MJ, Barret M, Hegarty PM, Harrington C, Dobson ADW, O'Gara F. Identification of Novel Phytase Genes from an Agricultural Soil-Derived Metagenome. Journal of Microbiology and Biotechnology 2014;24(1):113-118. 88. Litthauer D, Abbai NS, Piater LA, van Heerden E. Pitfalls using tributyrin agar screening to detect lipolytic activity in metagenomic studies. African Journal of Biotechnology 2010;9(27):4282-4285. 89. Jones BV, Sun F, Marchesi R. Using skimmed milk agar to functionally screen a gut metagenomic library for proteases may lead to false positives. Letters in Applied Microbiology 2007;45(4):418-420. 90. Uchiyama T, Watanabe K. Substrate-induced gene expression (SIGEX) screening of metagenome libraries. Nature Protocols 2008;3(7):1202-1212. 91. Cohen LJ, Kang HS, Chu J, Huang YH, Gordon EA, Reddy BVB, Ternei MA, Craig JW, Brady SF. Functional metagenomic discovery of bacterial effectors in the human microbiome and isolation of commendamide, a GPCR G2A/132 agonist. Proceedings of the National Academy of Sciences of the United States of America 2015;112(35):E4825-E4834. 92. Brady SF, Chao CJ, Handelsman J, Clardy J. 
Cloning and heterologous expression of a natural product biosynthetic gene cluster from eDNA. Organic Letters 2001;3(13):1981-1984. 93. Kittl R, Withers SG. New approaches to enzymatic glycoside synthesis through directed evolution. Carbohydrate Research 2010;345(10):1272-1279. 94. Hancock SM, Rich JR, Caines MEC, Strynadka NCJ, Withers SG. Designer enzymes for glycosphingolipid synthesis by directed evolution. Nature Chemical Biology 2009;5(7):508-514. 95. Aharoni A, Thieme K, Chiu CPC, Buchini S, Lairson LL, Chen HM, Strynadka NCJ, Wakarchuk WW, Withers SG. High-throughput screening methodology for the directed evolution of glycosyltransferases. Nature Methods 2006;3(8):609-614. 96. Lairson LL, Watts AG, Wakarchuk WW, Withers SG. Using substrate engineering to harness enzymatic promiscuity and expand biological catalysis. Nature Chemical Biology 2006;2(12):724-728. 97. Taly V, Kelly BT, Griffiths AD. Droplets as microreactors for high-throughput biology. Chembiochem 2007;8(3):263-272. 98. Baret JC, Taly V, Ryckelynck M, Merten CA, Griffiths AD. Droplets and emulsions: very high-throughput screening in biology. M S-Medecine Sciences 2009;25(6-7):627-632. 99. Schonbrun E, Abate AR, Steinvurzel PE, Weitz DA, Crozier KB. High-throughput fluorescence detection using an integrated zone-plate array. Lab on a Chip 2010;10(7):852-856. 100. Song LT, Laguerre S, Dumon C, Bozonnet S, O'Donohue MJ. A high-throughput screening system for the evaluation of biomass-hydrolyzing glycoside hydrolases. Bioresource Technology 2010;101(21):8237-8243. 101. Maruthamuthu M, Jimenez DJ, Stevens P, van Elsas JD. A multi-substrate approach for functional metagenomics-based screening for (hemi)cellulases in two wheat straw-degrading microbial consortia unveils novel thermoalkaliphilic enzymes. Bmc Genomics 2016;17. 102. White BA, Lamed R, Bayer EA, Flint HJ. Biomass Utilization by Gut Microbiomes. Annual Review of Microbiology, Vol 68 2014;68:279-296. 103. 
Sommer MOA, Church GM, Dantas G. A functional metagenomic approach for expanding the synthetic biology toolbox for biomass conversion. Molecular Systems Biology 2010;6. 104. Liu QP, Sulzenbacher G, Yuan H, Bennett EP, Pietz G, Saunders K, Spence J, Nudelman E, Levery SB, White T and others. Bacterial glycosidases for the production of universal red blood cells. Nature Biotechnology 2007;25(4):454-464. 105. Chen H-M, Armstrong Z, Hallam SJ, Withers SG. Synthesis and evaluation of a series of 6-chloro-4-methylumbelliferyl glycosides as fluorogenic reagents for screening metagenomic libraries for glycosidase activity. Carbohydrate Research 2016;421:33-39. 106. Kumagai K, Kojima H, Okabe T, Nagano T. Development of a highly sensitive, high-throughput assay for glycosyltransferases using enzyme-coupled fluorescence detection. Analytical Biochemistry 2014;447:146-155. 107. Jiang WP, Ethen C, Prather B, Machacek M, Wu ZL. Universal Phosphatase-Coupled Glycosyltransferase Assay. Faseb Journal 2011;25. 108. Lee HS, Thorson JS. Development of a universal glycosyltransferase assay amenable to high-throughput formats. Analytical Biochemistry 2011;418(1):85-88. 109. Moretti R, Thorson JS. A comparison of sugar indicators enables a universal high-throughput sugar-1-phosphate nucleotidyltransferase assay. Analytical Biochemistry 2008;377(2):251-258. 110. Yang GY, Rich JR, Gilbert M, Wakarchuk WW, Feng Y, Withers SG. Fluorescence Activated Cell Sorting as a General Ultra-High-Throughput Screening Method for Directed Evolution of Glycosyltransferases. Journal of the American Chemical Society 2010;132(30):10570-10577. 111. Vimr ER, Kalivoda KA, Deszo EL, Steenbergen SM. Diversity of microbial sialic acid metabolism. Microbiology and Molecular Biology Reviews 2004;68(1):132-+. 112. Angata K, Fukuda M. Polysialyltransferases: major players in polysialic acid synthesis on the neural cell adhesion molecule. Biochimie 2003;85(1-2):195-206. 113. Li YH, Chen X. 
Sialic acid metabolism and sialyltransferases: natural functions and applications. Applied Microbiology and Biotechnology 2012;94(4):887-905. 114. Schauer R. Achievements and challenges of sialic acid research. Glycoconjugate Journal 2000;17(7-9):485-499. 115. Pshezhetsky AV, Ashmarina LI. Desialylation of surface receptors as a new dimension in cell signaling. Biochemistry-Moscow 2013;78(7):736-745. 116. Bork K, Horstkorte R, Weidemann W. Increasing the Sialylation of Therapeutic Glycoproteins: The Potential of the Sialic Acid Biosynthetic Pathway. Journal of Pharmaceutical Sciences 2009;98(10):3499-3508. 117. Baker MA, Taub RN, Whelton CH, Hindenburg A. Aberrant sialylation of granulocyte membranes in chronic myelogenous leukemia. Blood 1984;63(5):1194-1197. 118. Ding JX, Xu LX, Lv JC, Zhao MH, Zhang H, Wang HY. Aberrant sialylation of serum IgA1 was associated with prognosis of patients with IgA nephropathy. Clinical Immunology 2007;125(3):268-274. 119. Varki A. Sialic acids in human health and disease. Trends in Molecular Medicine 2008;14(8):351-360. 120. Ando T, Ando H, Kiso M. Sialic acid and glycobiology: A chemical approach. Trends in Glycoscience and Glycotechnology 2001;13(74):573-586. 121. Yamamoto K. Biological Analysis of the Microbial Metabolism of Hetero-Oligosaccharides in Application to Glycotechnology. Bioscience Biotechnology and Biochemistry 2012;76(10):1815-1827. 122. Frasch AC, Paris G, Cremona ML, Amaya MF, Buschiazzo A, Alzari PM. Structural features conferring trans-sialidase activity to Trypanosoma sialidases. Faseb Journal 2001;15(5):A864-A864. 123. Hood DW, Cox AD, Gilbert M, Makepeace K, Walsh S, Deadman ME, Cody A, Martin A, Mansson M, Schweda EKH and others. Identification of a lipopolysaccharide alpha-2,3-sialyltransferase from Haemophilus influenzae. Molecular Microbiology 2001;39(2):341-350. 124. Gilbert M, Brisson JR, Karwaski MF, Michniewicz J, Cunningham AM, Wu YY, Young NM, Wakarchuk WW. 
Biosynthesis of ganglioside mimics in Campylobacter jejuni OH4384 - Identification of the glycosyltransferase genes, enzymatic synthesis of model compounds, and characterization of nanomole amounts by 600-MHz H-1 and C-13 NMR analysis. Journal of Biological Chemistry 2000;275(6):3896-3906. 125. Gilbert M, Watson DC, Cunningham AM, Jennings MP, Young NM, Wakarchuk WW. Cloning of the lipooligosaccharide alpha-2,3-sialyltransferase from the bacterial pathogens Neisseria meningitidis and Neisseria gonorrhoeae. Journal of Biological Chemistry 1996;271(45):28271-28276. 126. Gilbert M, Bayer R, Cunningham AM, Defrees S, Gao YH, Watson DC, Young NM, Wakarchuk WW. The synthesis of sialylated oligosaccharides using a CMP-Neu5Ac synthetase/sialyltransferase fusion. Nature Biotechnology 1998;16(8):769-772. 127. Koeller KM, Wong CH. Chemoenzymatic synthesis of sialyl-trimeric-Lewis x. Chemistry-a European Journal 2000;6(7):1243-1251. 128. Watson DC, Leclerc S, Wakarchuk WW, Young NM. Enzymatic synthesis and properties of glycoconjugates with legionaminic acid as a replacement for neuraminic acid. Glycobiology 2011;21(1):99-108. 129. Dube DH, Bertozzi CR. Glycans in cancer and inflammation. Potential for therapeutics and diagnostics. Nature Reviews Drug Discovery 2005;4(6):477-488. 130. Jeong YT, Choi O, Son YD, Yeol-Park S, Kim JH. Enhanced sialylation of recombinant erythropoietin in genetically engineered Chinese-hamster ovary cells. Biotechnology and Applied Biochemistry 2009;52:283-291. 131. Dimitrov JD, Bayry J, Siberil S, Kaveri SV. Sialylated therapeutic IgG: a sweet remedy for inflammatory diseases? Nephrology Dialysis Transplantation 2007;22(5):1301-1304. 132. Sugiarto G, Lau K, Qu JY, Li YH, Lim S, Mu S, Ames JB, Fisher AJ, Chen X. A Sialyltransferase Mutant with Decreased Donor Hydrolysis and Reduced Sialidase Activities for Directly Sialylating Lewis(X). Acs Chemical Biology 2012;7(7):1232-1240. 133. 
Gosselin S, Alhussaini M, Streiff MB, Takabayashi K, Palcic MM. A continuous spectrophotometric assay for glycosyltransferases. Analytical Biochemistry 1994;220(1):92-97. 134. Palcic MM, Sujino K. Assays for glycosyltransferases. Trends in Glycoscience and Glycotechnology 2001;13(72):361-370. 135. Hansen SF, Bettler E, Wimmerova M, Imberty A, Lerouxel O, Breton C. Combination of Several Bioinformatics Approaches for the Identification of New Putative Glycosyltransferases in Arabidopsis. Journal of Proteome Research 2009;8(2):743-753. 136. Hansen SF, Bettler E, Rinnan A, Engelsen SB, Breton C. Exploring genomes for glycosyltransferases. Molecular Biosystems 2010;6(10):1773-1781. 137. Palcic MM. Glycosyltransferases as biocatalysts. Current Opinion in Chemical Biology 2011;15(2):226-233. 138. Lammle K, Zipper H, Breuer M, Hauer B, Buta C, Brunner H, Rupp S. Identification of novel enzymes with different hydrolytic activities by metagenome expression cloning. Journal of Biotechnology 2007;127(4):575-592. 139. Clarke L, Carbon J. Colony bank containing synthetic Col EI hybrid plasmids representative of entire Escherichia-coli genome. Cell 1976;9(1):91-99. 140. Derda R, Tang SKY, Li SC, Ng S, Matochko W, Jafari MR. Diversity of Phage-Displayed Libraries of Peptides during Panning and Amplification. Molecules 2011;16(2):1776-1803. 141. Derda R, Tang SKY, Whitesides GM. Uniform Amplification of Phage with Different Growth Characteristics in Individual Compartments Consisting of Monodisperse Droplets. Angewandte Chemie-International Edition 2010;49(31):5301-5304. 142. Huynh N, Li YH, Yu H, Huang SS, Lau K, Chen X, Fisher AJ. Crystal structures of sialyltransferase from Photobacterium damselae. Febs Letters 2014;588(24):4720-4729. 143. Tsuji S. Molecular cloning and functional analysis of sialyltransferases. Journal of Biochemistry 1996;120(1):1-13. 144. Saito S, Yatsuyanagi J, Harata S, Ito Y, Shinagawa K, Suzuki N, Amano K, Enomoto K. 
Campylobacter jejuni isolated from retail poultry meat, bovine feces and bile, and human diarrheal samples in Japan: Comparison of serotypes and genotypes. Fems Immunology and Medical Microbiology 2005;45(2):311-319. 145. Keramas G, Bang DD, Lund M, Madsen M, Bunkenborg H, Telleman P, Christensen CBV. Use of culture, PCR analysis, and DNA microarrays for detection of Campylobacter jejuni and Campylobacter coli from chicken feces. Journal of Clinical Microbiology 2004;42(9):3985-3991. 146. Walter J, Mangold M, Tannock GW. Construction, analysis, and beta-glucanase screening of a bacterial artificial chromosome library from the large-bowel microbiota of mice. Applied and Environmental Microbiology 2005;71(5):2347-2354. 147. Kim D, Kim S-N, Baik KS, Park SC, Lim CH, Kim J-O, Shin T-S, Oh M-J, Seong CN. Screening and Characterization of a Cellulase Gene from the Gut Microflora of Abalone Using Metagenomic Library. Journal of Microbiology 2011;49(1):141-145. 148. Nyyssonen M, Tran HM, Karaoz U, Weihe C, Hadi MZ, Martiny JBH, Martiny AC, Brodie EL. Coupled high-throughput functional screening and next generation sequencing for identification of plant polymer decomposing enzymes in metagenomic libraries. Frontiers in Microbiology 2013;4. 149. Colin PY, Kintses B, Gielen F, Miton CM, Fischer G, Mohamed MF, Hyvonen M, Morgavi DP, Janssen DB, Hollfelder F. Ultrahigh-throughput discovery of promiscuous enzymes by picodroplet functional metagenomics. Nature Communications 2015;6. 150. Hosokawa M, Hoshino Y, Nishikawa Y, Hirose T, Yoon DH, Mori T, Sekiguchi T, Shoji S, Takeyama H. Droplet-based microfluidics for high-throughput screening of a metagenomic library for isolation of microbial enzymes. Biosensors & Bioelectronics 2015;67:379-385. 151. Yamamoto F, Clausen H, White T, Marken J, Hakomori SI. Molecular genetic basis of the histo-blood group ABO system. Nature 1990;345(6272):229-233. 152. Landsteiner K. Reprint from: Wien Klin Wochenschr (1901) 14/46 : 1132-1134. 
Wiener Klinische Wochenschrift 2001;113(20-21):768-769. 153. Hernandez TI, Lacasa RC, Chulilla JAM, Martin MG. Cytokines and immunomodulation in the blood transfusion: a continuing challenge. Medicina Clinica 1999;113(19):758-758. 154. Blumberg N, Heal JM. Immunomodulation by blood transfusion: An evolving scientific and clinical challenge. American Journal of Medicine 1996;101(3):299-308. 155. Quintiliani L, Buzzonetti A, Digirolamo M, Iudicone P, Guglielmetti M, Martini F, Scocchera R, Terlizzi F, Lapponi P, Giuliani E. Effects of blood-transfusion on the immune responsiveness and survival of cancer patients - A prospective study. Transfusion 1991;31(8):713-718. 156. Blumberg N, Peck K, Ross K, Avila E. Immune response to chronic red blood cell transfusion. Vox Sanguinis 1983;44(4):212-217. 157. Wang B, Wang GC, Zhao BJ, Chen JJ, Zhang XY, Tang RK. Antigenically shielded universal red blood cells by polydopamine-based cell surface engineering. Chemical Science 2014;5(9):3463-3468. 158. Narain R, Wang Y, Ahmed M, Lai BFL, Kizhakkedathu JN. Blood Components Interactions to Ionic and Nonionic Glyconanogels. Biomacromolecules 2015;16(9):2990-2997. 159. Chapanian R, Kwan DH, Constantinescu I, Shaikh FA, Rossi NAA, Withers SG, Kizhakkedathu JN. Enhancement of biological reactions on cell surfaces via macromolecular crowding. Nature Communications 2014;5. 160. ul-haq MI, Lai BFL, Kizhakkedathu JN. Hybrid Polyglycerols with Long Blood Circulation: Synthesis, Biocompatibility, and Biodistribution. Macromolecular Bioscience 2014;14(10):1469-1482. 161. Yu K, Mei Y, Hadjesfandiari N, Kizhakkedathu JN. Engineering biomaterials surfaces to modulate the host response. Colloids and Surfaces B-Biointerfaces 2014;124:69-79. 162. Gao HW, Li SB, Tan YX, Ji SP, Wang YL, Bao GQ, Xu LJ, Gong F. Application of alpha-N-acetylgalactosaminidase and alpha-galactosidase in AB to O Red Blood Cells Conversion. Artificial Cells Nanomedicine and Biotechnology 2013;41(1):32-36. 163. 
Wakinaka T, Kiyohara M, Kurihara S, Hirata A, Chaiwangsri T, Ohnuma T, Fukamizo T, Katayama T, Ashida H, Yamamoto K. Bifidobacterial alpha-galactosidase with unique carbohydrate-binding module specifically acts on blood group B antigen. Glycobiology 2013;23(2):232-240. 164. Sulzenbacher G, Liu QP, Bennett EP, Levery SB, Bourne Y, Ponchel G, Clausen H, Henrissat B. A novel alpha-N-acetylgalactosaminidase family with an NAD(+)-dependent catalytic mechanism suitable for enzymatic removal of blood group A antigens. Biocatalysis and Biotransformation 2010;28(1):22-32. 165. Kobayashi T, Liu D, Ogawa H, Miwa Y, Nagasaka T, Maruyama S, Li YT, Onishi A, Iwamoto M, Kuzuya T and others. Removal of blood group A/B antigen in organs by ex vivo and in vivo administration of endo-beta-galactosidase (ABase) for ABO-incompatible transplantation. Transplant Immunology 2009;20(3):132-138. 166. Shaikh FA, Randriantsoa M, Withers SG. Mechanistic Analysis of the Blood Group Antigen-Cleaving endo-beta-Galactosidase from Clostridium perfringens. Biochemistry 2009;48(35):8396-8404. 167. Sulzenbacher G, Bourne Y, Henrissat B. Glycosidases for the production of universal blood. M S-Medecine Sciences 2007;23(8-9):703-705. 168. Anderson KM, Ashida H, Maskos K, Dell A, Li SC, Li YT. A clostridial endo-beta-galactosidase that cleaves both blood group A and B glycotopes. Journal of Biological Chemistry 2005;280(9):7720-7728. 169. Olsson ML, Hill CA, de la Vega H, Liu QYP, Stroud MR, Valdinocci J, Moon S, Clausen H, Kruskall MS. Universal red blood cells - enzymatic conversion of blood group A and B antigens. Transfusion Clinique Et Biologique 2004;11(1):33-39. 170. Goldstein J, Siviglia G, Hurst R, Lenny L, Reich L. Group B erythrocytes enzymatically converted to group O survive normally in A, B, and O individuals. Science 1982;215(4529):168-170. 171. Ruas-Madiedo P, Gueimonde M, Fernandez-Garcia M, Reyes-Gavilan CGD, Margolles A. 
Mucin degradation by Bifidobacterium strains isolated from the human intestinal microbiota. Applied and Environmental Microbiology 2008;74(6):1936-1940. 172. Robbe C, Capon C, Coddeville B, Michalski JC. Structural diversity and specific distribution of O-glycans in normal human mucins along the intestinal tract. Biochemical Journal 2004;384:307-316. 173. Prakobphol A, Leffler H, Fisher SJ. The high-molecular weight human mucin is the primary salivary carrier of ABH, LE(A), and LE(B) blood-group antigens. Critical Reviews in Oral Biology & Medicine 1993;4(3-4):325-333. 174. Lee J, Lee HT, Hong WY, Jang E, Kim J. FCMM: A comparative metagenomic approach for functional characterization of multiple metagenome samples. Journal of Microbiological Methods 2015;115:121-128. 175. Yoon SS, Kim EK, Lee WJ. Functional genomic and metagenomic approaches to understanding gut microbiota-animal mutualism. Current Opinion in Microbiology 2015;24:38-46. 176. Wang J, Linnenbrink M, Kunzel S, Fernandes R, Nadeau MJ, Rosenstiel P, Baines JF. Dietary history contributes to enterotype-like clustering and functional metagenomic content in the intestinal microbiome of wild mice. Proceedings of the National Academy of Sciences of the United States of America 2014;111(26):E2703-E2710. 177. Aron-Wisnewsky J, Clement K. The gut microbiome, diet, and links to cardiometabolic and chronic disorders. Nature Reviews Nephrology 2016;12(3):169-181. 178. Cani PD, Everard A. Talking microbes: When gut bacteria interact with diet and host organs. Molecular Nutrition & Food Research 2016;60(1):58-66. 179. Gerard P. Gut microbiota and obesity. Cellular and Molecular Life Sciences 2016;73(1):147-162. 180. Videhult FK, West CE. Nutrition, gut microbiota and child health outcomes. Current Opinion in Clinical Nutrition and Metabolic Care 2016;19(3):208-213. 181. Cresci GA, Bawden E. Gut Microbiome: What We Do and Don't Know. Nutrition in Clinical Practice 2015;30(6):734-746. 182. 
Kwan DH, Ernst S, Kotzler MP, Withers SG. Chemoenzymatic Synthesis of a Type 2 Blood Group A Tetrasaccharide and Development of High-throughput Assays Enables a Platform for Screening Blood Group Antigen-cleaving Enzymes. Glycobiology 2015;25(8):806-811. 183. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic Local Alignment Search Tool. Journal of Molecular Biology 1990;215(3):403-410. 184. Ernst HA, Lo Leggio L, Willemoes M, Leonard G, Blum P, Larsen S. Structure of the Sulfolobus solfataricus alpha-glucosidase: Implications for domain conservation and substrate recognition in GH31. Journal of Molecular Biology 2006;358(4):1106-1124. 185. Fuentes-Prior P, Fujikawa K, Pratt KP. New insights into binding interfaces of coagulation factors V and VIII and their homologues - Lessons from high resolution crystal structures. Current Protein & Peptide Science 2002;3(3):313-339. 186. Martens EC, Roth R, Heuser JE, Gordon JI. Coordinate Regulation of Glycan Degradation and Polysaccharide Capsule Biosynthesis by a Prominent Human Gut Symbiont. Journal of Biological Chemistry 2009;284(27):18445-18457. 187. Mackenzie AK, Naas AE, Kracun SK, Schuckel J, Fangel JU, Agger JW, Willats WGT, Eijsink VGH, Pope PB. A Polysaccharide Utilization Locus from an Uncultured Bacteroidetes Phylotype Suggests Ecological Adaptation and Substrate Versatility. Applied and Environmental Microbiology 2015;81(1):187-195. 188. Bolam DN, Koropatkin NM. Glycan recognition by the Bacteroidetes Sus-like systems. Current Opinion in Structural Biology 2012;22(5):563-569. 189. Bakolitsa C, Xu QP, Rife CL, Abdubek P, Astakhova T, Axelrod HL, Carlton D, Chen C, Chiu HJ, Clayton T and others. Structure of BT_3984, a member of the SusD/RagB family of nutrient-binding molecules. Acta Crystallographica Section F-Structural Biology and Crystallization Communications 2010;66:1274-1280. 190. Koropatkin N, Martens EC, Gordon JI, Smith TJ. 
Structure of a SusD Homologue, BT1043, Involved in Mucin O-Glycan Utilization in a Prominent Human Gut Symbiont. Biochemistry 2009;48(7):1532-1542. 191. Wang YJ, Xing GW. Progress in Structure, Function and Catalytic Reactions of Sialidase and Sialyltransferase. Chinese Journal of Organic Chemistry 2011;31(8):1157-1168. 192. Kim DU, Yoo JH, Lee YJ, Kim KS, Cho HS. Structural analysis of sialyltransferase PM0188 from Pasteurella multocida complexed with donor analogue and acceptor sugar. Bmb Reports 2008;41(1):48-54. 193. Chiu CPC, Lairson LL, Gilbert M, Wakarchuk WW, Withers SG, Strynadka NCJ. Structural analysis of the alpha-2,3-sialyltransferase cst-i from Campylobacter jejuni in apo and substrate-analogue bound forms. Biochemistry 2007;46(24):7196-7204. 194. Audry M, Jeanneau C, Imberty A, Harduin-Lepers A, Delannoy P, Breton C. Current trends in the structure-activity relationships of sialyltransferases. Glycobiology 2011;21(6):716-726. 195. Schmolzer K, Czabany T, Luley-Goedl C, Pavkov-Keller T, Ribitsch D, Schwab H, Gruber K, Weber H, Nidetzky B. Complete switch from alpha-2,3-to alpha-2,6-regioselectivity in Pasteurella dagmatis beta-D-galactoside sialyltransferase by active-site redesign. Chemical Communications 2015;51(15):3083-3086. 196. Choi YH, Kim JH, Park JH, Lee N, Kim DH, Jang KS, Park IH, Kim BG. Protein engineering of alpha 2,3/2,6-sialyltransferase to improve the yield and productivity of in vitro sialyllactose synthesis. Glycobiology 2014;24(2):159-169. 197. Guo Y, Jers C, Meyer AS, Arnous A, Li HY, Kirpekar F, Mikkelsen JD. A Pasteurella multocida sialyltransferase displaying dual trans-sialidase activities for production of 3 '-sialyl and 6 '-sialyl glycans. Journal of Biotechnology 2014;170:60-67. 198. Guo Y, Jers C, Meyer AS, Li HY, Kirpekar F, Mikkelsen JD. Modulating the regioselectivity of a Pasteurella multocida sialyltransferase for biocatalytic production of 3 '- and 6 '-sialyllactose. Enzyme and Microbial Technology 2015;78:54-62. 
199. Watson DC, Wakarchuk WW, Leclerc S, Schur MJ, Schoenhofen IC, Young NM, Gilbert M. Sialyltransferases with enhanced legionaminic acid transferase activity for the preparation of analogs of sialoglycoconjugates. Glycobiology 2015;25(7):767-773. 200. Malekan H, Fung G, Thon V, Khedri Z, Yu H, Qu JY, Li YH, Ding L, Lam KS, Chen X. One-pot multi-enzyme (OPME) chemoenzymatic synthesis of sialyl-Tn-MUC1 and sialyl-T-MUC1 glycopeptides containing natural or non-natural sialic acid. Bioorganic & Medicinal Chemistry 2013;21(16):4778-4785. 201. Sugiarto G, Lau K, Yu H, Vuong S, Thon V, Li YH, Huang SS, Chen X. Cloning and characterization of a viral alpha 2-3-sialyltransferase (vST3Gal-I) for the synthesis of sialyl Lewis(x). Glycobiology 2011;21(3):387-396. 202. Sugiarto G, Lau K, Li YH, Khedri Z, Yu H, Le DT, Chen X. Decreasing the sialidase activity of multifunctional Pasteurella multocida alpha 2-3-sialyltransferase 1 (PmST1) by site-directed mutagenesis. Molecular Biosystems 2011;7(11):3021-3027. 203. Cheng JS, Huang SS, Yu H, Li YH, Lau K, Chen X. Trans-sialidase activity of Photobacterium damsela alpha 2,6-sialyltransferase and its application in the synthesis of sialosides. Glycobiology 2010;20(2):260-268. 204. Drouillard S, Mine T, Kajiwara H, Yamamoto T, Samain E. Efficient synthesis of 6 '-sialyllactose, 6,6 '-disialyllactose, and 6 '-KDO-lactose by metabolically engineered E. coli expressing a multifunctional sialyltransferase from the Photobacterium sp JT-ISH-224. Carbohydrate Research 2010;345(10):1394-1399. 205. Yu H, Chokhawala H, Karpel R, Wu BY, Zhang JB, Zhang YX, Jia Q, Chen X. A multifunctional Pasteurella multocida sialyltransferase: A powerful tool for the synthesis of sialoside libraries. Journal of the American Chemical Society 2005;127(50):17618-17619. 206. Monaco L, Marc A, EonDuval A, Acerbis G, Distefano G, Lamotte D, Engasser JM, Soria M, Jenkins N. 
Genetic engineering of alpha 2,6-sialyltransferase in recombinant CHO cells and its effects on the sialylation of recombinant interferon-gamma. Cytotechnology 1996;22(1-3):197-203. 207. Kalle E, Kubista M, Rensing C. Multi-template polymerase chain reaction. Biomolecular Detection and Quantification 2014;2:11-29. 208. Kanagawa T. Bias and artifacts in multitemplate polymerase chain reactions (PCR). Journal of Bioscience and Bioengineering 2003;96(4):317-323. 209. Yu CC, Withers SG. Recent Developments in Enzymatic Synthesis of Modified Sialic Acid Derivatives. Advanced Synthesis & Catalysis 2015;357(8):1633-1654. 210. Amaya MF, Watts AG, Damager I, Wehenkel A, Nguyen T, Buschiazzo A, Paris G, Frasch AC, Withers SG, Alzari PM. Structural insights into the catalytic mechanism of Trypanosoma cruzi trans-sialidase. Structure 2004;12(5):775-784. 211. Damager I, Buchini S, Amaya MF, Buschiazzo A, Alzari P, Frasch AC, Watts A, Withers SG. Kinetic and mechanistic analysis of Trypanosoma cruzi trans-sialidase reveals a classical ping-pong mechanism with acid/base catalysis. Biochemistry 2008;47(11):3507-3512. 212. Kakuta Y, Okino N, Kajiwara H, Ichikawa M, Takakura Y, Ito M, Yamamoto T. Crystal structure of Vibrionaceae Photobacterium sp JT-ISH-224 alpha 2,6-sialyltransferase in a ternary complex with donor product CMP and acceptor substrate lactose: catalytic mechanism and substrate recognition. Glycobiology 2008;18(1):66-73. 213. Ni LS, Chokhawala HA, Cao HZ, Henning R, Ng L, Huang SS, Yu H, Chen X, Fisher AJ. Crystal structures of Pasteurella multocida sialyltransferase complexes with acceptor and donor analogues reveal substrate binding sites and catalytic mechanism. Biochemistry 2007;46(21):6288-6298. 214. Schwarz A, Brecker L, Nidetzky B. Acid-base catalysis in Leuconostoc mesenteroides sucrose phosphorylase probed by site-directed mutagenesis and detailed kinetic comparison of wild-type and Glu(237) -> Gln mutant enzymes. Biochemical Journal 2007;403:441-449. 
215. Gantt RW, Peltier-Pain P, Cournoyer WJ, Thorson JS. Using simple donors to drive the equilibria of glycosyltransferase-catalyzed reactions. Nature Chemical Biology 2011;7(10):685-691. 216. Lairson LL, Wakarchuk WW, Withers SG. Alternative donor substrates for inverting and retaining glycosyltransferases. Chemical Communications 2007(4):365-367. 217. Zhang CS, Griffith BR, Fu Q, Albermann C, Fu X, Lee IK, Li LJ, Thorson JS. Exploiting the reversibility of natural product glycosyltransferase-catalyzed reactions. Science 2006;313(5791):1291-1294. 218. Chandrasekaran EV, Xue J, Xia J, Locke RD, Matta KL, Neelamegham S. Reversible sialylation: Synthesis of cytidine 5 '-monophospho-N-acetylneuraminic acid from cytidine 5 '-monophosphate with alpha 2,3-sialyl O-glycan-, glycolipid-, and macromolecule-based donors yields diverse sialylated products. Biochemistry 2008;47(1):320-330. 219. Lougheed B, Ly HD, Wakarchuk WW, Withers SG. Glycosyl fluorides can function as substrates for nucleotide phosphosugar-dependent glycosyltransferases. Journal of Biological Chemistry 1999;274(53):37717-37722. 220. Okino N, Kakuta Y, Kajiwara H, Ichikawa M, Takakura Y, Ito M, Yamamoto T. Purification, crystallization and preliminary crystallographic characterization of the alpha 2,6-sialyltransferase from Photobacterium sp JT-ISH-224. Acta Crystallographica Section F-Structural Biology and Crystallization Communications 2007;63:662-664. 221. Chiu CPC, Watts AG, Lairson LL, Gilbert M, Lim D, Wakarchuk WW, Withers SG, Strynadka NCJ. Structural analysis of the sialyltransferase CstII from Campylobacter jejuni in complex with a substrate analog. Nature Structural & Molecular Biology 2004;11(2):163-170.
Appendices
Appendix A - FACS instrument setup (BD Influx)
Appendix B - TLC Data: in vivo assay of avian and goat metagenomic library samples (x24 each) following FACS.
Appendix C – NMR data
NMR Data – α-2,6-sialyllactose
1H NMR (300 MHz, D2O, 293 K): δ 5.25 (d, 0.4H, H-1α), 4.69 (d, 0.6H, H-1β), 4.45 (d, 1H, H-1′), 3.33 (m, 0.6H, H-2β), 2.74 (dd, 1H, H-3″eq), 2.06 (s, 3H, Ac), 1.77 (dd, 1H, H-3″ax).
